Add wdrvScreenshot() to capture the screen to PNG via stb_image_write.h, reading the framebuffer (or DDI bitmap fallback) and VGA DAC palette. Convert demo.c to non-interactive mode with automatic screenshots after each demo (DEMO01-15.PNG) and no keypress waits, plus per-driver DOSBox-X configs for automated testing.

Set a standard Windows 3.1 256-color palette (8R x 8G x 4B color cube with 20 static system colors) to ensure consistent output across drivers.

Fix wdrvSetPalette to also program the VGA DAC directly, since VBESVGA's SetPalette DDI updates its internal color table but not the hardware. Detect DAC width via VBE 4F08 (S3TRIO=6-bit, VBESVGA=8-bit) and use correct shift in both DAC writes and reads; this fixes dark display on VBESVGA where 6-bit values in 8-bit DAC produced 1/4 brightness.

Fix S3 dispYOffset: extend PDEVICE deHeight by the offset so the driver's internal clipping allows the full 600-row logical screen, rather than incorrectly reducing dpVertRes to 590.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Win31drv Project Memory
Build Environment
- DJGPP cross-compiler: ~/djgpp/djgpp/bin/i586-pc-msdosdjgpp-gcc (GCC 12.2.0)
- DJGPP binutils need libfl.so.2: stored in tools/lib/ (Makefiles set LD_LIBRARY_PATH)
- CWSDPMI zip stored in tools/cwsdpmi.zip (extracted to bin/ during build)
- DOSBox-X: flatpak run com.dosbox_x.DOSBox-X (installed as user flatpak)
- CWSDPMI.EXE in bin/ directory for DPMI support under DOSBox-X
- Config: dosbox-x.conf with S3 Trio64 machine type, 64MB RAM
Project Structure
windriver/
├── Makefile # Top-level: builds demo, calls win31drv/Makefile
├── demo.c # Demo program
├── dosbox-x.conf # DOSBox-X config (S3 SVGA)
├── obj/ # Demo object files
├── bin/ # Executables + CWSDPMI.EXE
└── win31drv/ # Library
    ├── Makefile # Builds libwindrv.a
    ├── obj/ # Library objects
    ├── neload.c/h # NE format loader
    ├── neformat.h # NE structures
    ├── thunk.c/h # 32→16 bit thunking
    ├── windrv.c/h # Main API
    ├── winstub.c/h # Windows API stubs
    ├── winddi.h # DDI structures
    └── wintypes.h # Win16 types
DJGPP Portability Notes
- uint32_t is unsigned long (not unsigned int) in DJGPP: use PRIu32/PRIX32 from <inttypes.h>
- Always include <stdarg.h> explicitly for va_list/va_start/va_end
- Headers must be self-contained (include their own dependencies)
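The portability rules above can be illustrated with a small sketch. The helper names formatMessage and formatSegmentInfo are hypothetical and are not part of the project; the point is only the explicit <stdarg.h> include and the PRIX32/PRIu32 macros.

```c
#include <inttypes.h>  /* PRIu32, PRIX32 */
#include <stdarg.h>    /* va_list, va_start, va_end: included explicitly */
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: printf-style formatting into a caller buffer. */
static int formatMessage(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    int     written;

    va_start(args, fmt);
    written = vsnprintf(buf, size, fmt, args);
    va_end(args);
    return written;
}

/* Hypothetical helper: formats two uint32_t values portably.
 * A bare "%u" or "%X" would mismatch on targets where uint32_t
 * is unsigned long, as the notes say is the case in DJGPP. */
static int formatSegmentInfo(char *buf, size_t size,
                             uint32_t base, uint32_t length)
{
    return formatMessage(buf, size,
                         "base=0x%08" PRIX32 " len=%" PRIu32,
                         base, length);
}
```

Calling formatSegmentInfo(buf, sizeof buf, 0xA0000, 65536) should fill buf with "base=0x000A0000 len=65536" on any conforming compiler, regardless of how uint32_t is defined.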
Thunking Architecture Notes
- SS == DS == DGROUP: Win3.x drivers assume SS == DS == DGROUP. VBESVGA's BBLT.ASM does PrestoChangeoSelector(SS, WorkSelector) to create a code alias for compiled blit code. thunkCall16 uses dgroupSel as SS (SP=0xFFF0) when available. Without this, the code alias has the wrong base and the CPU executes garbage.
- Register corruption with -O2 inlining: When demo.c's demoDrawing is inlined into main, DJGPP GCC 12.2.0 mishandles callee-saved registers across thunk calls in long functions. Fix: __attribute__((noinline)) on demoDrawing. Symptom: handle pointer corrupted to a ColorInfo return value (e.g. 0xFF0001F6) between Demo 2 and Demo 3.
DOSBox-X Driver Notes
- waitForEngine(): GP_STAT port 0x9AE8 bit 9 polling, S3 only (gIsS3 guard)
- S3 detection: Probe CR30 chip ID register. S3 chips: 0x81-0xE1. ET4000: 0x00. Only apply S3-specific setup (cursor disable, dispYOffset, setDisplayStart) when isS3=true AND driver is not VGA-class (1bpp/4planes).
- Pattern scratch artifact: S3 driver writes 8x8 dithered brush pattern to VRAM at fixed position (~(144,1)-(151,8)) during accelerated pattern fills. Fixed by shifting CRTC display start down 10 scanlines (dispYOffset) so the scratch area is off-screen. All drawing Y coordinates are offset by dispYOffset. The full dpVertRes (600) is reported and usable; the shift just consumes slightly more VRAM.
- S3TRIO BitBlt source corruption: S3TRIO's accelerated BitBlt corrupts source VRAM during source-dependent ROP operations (SRCINVERT, NOTSRCCOPY, SRCAND, SRCPAINT). In Windows 3.1, GDI uses intermediate off-screen bitmaps. Our direct DDI calls must work around this by redrawing source rects after the ROP, or using separate source areas. Off-screen VRAM (y >= screenH) is NOT usable because the driver clips to screen dimensions.
- -fno-gcse required for windrv.c: With -O2 GCSE, stack layout causes issues during 16-bit driver calls. Only windrv.c needs this. See WINDRV_CFLAGS in win31drv/Makefile.
- Output DDI (polylines/rectangles) requires a physical pen from RealizeObject, not a raw LogPen16T. The pen must be in DGROUP (same as brush, drawMode, PDEVICE).
- Curve primitives removed: Ellipses, polygons, roundrects, arcs, and pies were removed because DIB engine drivers (VBESVGA/ET4000) hang (expect GDI curve decomposition callback) and S3TRIO only partially renders them. Only polyline is reliable via Output DDI.
- OS_RECTANGLE crashes DIB engine drivers: VBESVGA/ET4000 Output(OS_RECTANGLE) crashes like curve primitives. wdrvRectangleEx draws rectangles as two 3-point polylines instead.
- Output DDI lpClipRect: DIB engine drivers (VBESVGA) dereference lpClipRect unconditionally in polyline paths too. Always pass a valid clip rect (0,0,0x7FFF,0x7FFF).
- wdrvUnloadDriver does NOT auto-call Disable; caller must handle text mode restore
- sleep() hangs under DOSBox-X because BIOS timer ticks don't advance without I/O
- Debug output: -d flag enables verbose logging in neload, winstub, thunk, and windrv
- SetPalette DDI vs VGA DAC: VBESVGA's SetPalette DDI updates the driver's internal color table (used by ColorInfo for RGB→index matching) but does NOT program the VGA DAC hardware. In real Windows 3.1, GDI programs the DAC separately. wdrvSetPalette works around this by also writing DAC registers directly (ports 0x3C8/0x3C9) after the DDI call. This is idempotent on drivers like S3TRIO that already program the DAC.
- DAC width: S3TRIO uses 6-bit DAC (values 0-63), VBESVGA uses 8-bit DAC (values 0-255). Detected at Enable via VBE 4F08 subfunc 01. Both wdrvSetPalette (port write) and readDacPalette (screenshot) use the detected width for correct shift amounts. Wrong shift causes dark display (6-bit values in 8-bit DAC = 1/4 brightness).
- Known issue: mode mismatch HW=800x600 vs GDIINFO=640x480
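The DAC width note can be sketched as a pair of shift helpers. The function names dacFromRgb8 and rgb8FromDac are hypothetical; the actual code in wdrvSetPalette and readDacPalette may be structured differently. The sketch assumes dacBits is either 6 or 8, as reported by the VBE query.

```c
#include <stdint.h>

/* Scale an 8-bit color component down to the DAC's native width
 * before writing it to port 0x3C9. dacBits is assumed to be 6 or 8. */
static uint8_t dacFromRgb8(uint8_t value, int dacBits)
{
    return (uint8_t)(value >> (8 - dacBits));
}

/* Scale a component read back from the DAC up to 8 bits,
 * e.g. when building a screenshot palette. */
static uint8_t rgb8FromDac(uint8_t dacValue, int dacBits)
{
    return (uint8_t)(dacValue << (8 - dacBits));
}
```

With a 6-bit DAC, full white (255) is written as 63. Writing that same 63 to an 8-bit DAC without adjusting the shift gives roughly a quarter of full intensity, which matches the dark display symptom described above.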
DGROUP Stack Management
- VGA.DRV ships with DGROUP[0x0A]=0xFFFF (stack bottom = top of segment → no stack). Its BitBlt prolog calls a stack check function at 0x18CA that compares available stack against [SS:0x0A]. With 0xFFFF, ALL functions fail immediately (return 0).
- Fix: patch [0x0A] to objBase after extending DGROUP. Only done when original = 0xFFFF.
- S3TRIO.DRV and VBESVGA.DRV have [0x0A]=0x0000 — no patching needed.
- Do NOT unconditionally overwrite DGROUP offsets 0x00-0x0F — VBESVGA.DRV stores driver-specific data there (0x030A at offset 0, 0x01 at offset 4).
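A minimal sketch of the conditional patch described above, operating on an in-memory copy of DGROUP. The names patchStackBottom, readWord, and writeWord are hypothetical, and objBase stands for whatever value the loader computes after extending the segment.

```c
#include <stdint.h>

#define DGROUP_STACK_BOTTOM_OFF 0x0A    /* offset named in the notes */
#define STACK_BOTTOM_UNSET      0xFFFF  /* value VGA.DRV ships with */

/* Little-endian 16-bit accessors for a raw segment image. */
static uint16_t readWord(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static void writeWord(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)(v >> 8);
}

/* Patch the stack-bottom word only when the driver shipped 0xFFFF.
 * Every other byte in DGROUP 0x00-0x0F is left untouched, since some
 * drivers keep their own data there. Returns 1 if patched, else 0. */
static int patchStackBottom(uint8_t *dgroup, uint16_t objBase)
{
    if (readWord(dgroup + DGROUP_STACK_BOTTOM_OFF) != STACK_BOTTOM_UNSET) {
        return 0;
    }
    writeWord(dgroup + DGROUP_STACK_BOTTOM_OFF, objBase);
    return 1;
}
```

Checking the original value first keeps the fix specific to VGA.DRV and leaves S3TRIO.DRV and VBESVGA.DRV (which have 0x0000 at that offset) unchanged.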
BitBlt Source Device Rules
- For pattern-only ROPs (PATCOPY=0xF0, BLACKNESS=0x00, WHITENESS=0xFF, etc.), lpSrcDev must be NULL (0:0) per DDI spec. VGA.DRV rejects non-NULL source for pattern-only ROPs. S3TRIO.DRV tolerates it but correct behavior is NULL.
- Source dependency check: ropNeedsSrc = (((rop8 >> 2) ^ rop8) & 0x33) != 0
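The source dependency check can be wrapped in a function as sketched below. This assumes rop8 is the ROP index held in bits 16-23 of the 32-bit ROP3 code (as in the standard constants such as SRCCOPY = 0x00CC0020); the function name and signature are illustrative only.

```c
#include <stdint.h>

/* Returns nonzero if the ROP's result depends on the source operand.
 * In the 8-bit ROP truth table, bit 1 of the index selects the source
 * value, so entries {2,3,6,7} are the source=1 counterparts of entries
 * {0,1,4,5}. Shifting right by 2 lines the two groups up; any
 * difference under mask 0x33 means the source matters. */
static int ropNeedsSrc(uint32_t rop3)
{
    uint8_t rop8 = (uint8_t)(rop3 >> 16);

    return (((rop8 >> 2) ^ rop8) & 0x33) != 0;
}
```

A caller could use the result to decide whether to pass lpSrcDev as NULL: for PATCOPY (0x00F00021) the function returns 0, while for SRCCOPY (0x00CC0020) it returns nonzero.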
ExtTextOut DDI Notes
- Font format: VBESVGA.DRV requires .FNT v3 (fsVersion=0x0300) when BigFontFlags is set (386 protected mode with WF_PMODE + WF_CPU386). v2 (0x0200) is rejected at runtime.
- v3 char table: at file offset 0x94 (fs30CharOffset), 6-byte entries {WORD width, DWORD offset}. The DWORD offset is absolute from the font segment base. Use FntCharEntry30T.
- Bitmap layout: per-character contiguous, column-major byte order. For each character, all pixHeight rows of byte-column 0 come first, then all rows of byte-column 1, etc. Address formula: (byteCol * pixHeight) + row. For 8px-wide chars (1 byte column), this is identical to row-major. VGA BIOS 8x16 font is already in this format, so no transpose is needed.
- lpClipRect must NOT be NULL: VBESVGA's get_clip unconditionally dereferences lpClipRect (STRBLT.ASM:1008 "We assume that we will never get passed a null rectangle"). Pass a RECT covering the full screen (0, 0, 0x7FFF, 0x7FFF).
- lpTextXForm: declared but never read by VBESVGA — pass NULL.
- lp_font offset: passed as fontSel:0x42 (points to fsType within the .FNT block).
- Return value: DX bit 15 = error. AX=0 is NOT necessarily failure.
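The column-major bitmap layout from the notes above can be illustrated with a pixel lookup. glyphPixel is a hypothetical helper, and the sketch assumes the most significant bit of each byte is the leftmost pixel.

```c
#include <stdint.h>

/* Test one pixel of a glyph stored in column-major byte order:
 * all pixHeight rows of byte-column 0, then all rows of
 * byte-column 1, and so on. glyph points at the first byte of this
 * character's bitmap. Assumes MSB = leftmost pixel in each byte. */
static int glyphPixel(const uint8_t *glyph, int pixHeight, int x, int y)
{
    int     byteCol = x / 8;
    uint8_t bits    = glyph[(byteCol * pixHeight) + y];

    return (bits >> (7 - (x % 8))) & 1;
}
```

For an 8-pixel-wide glyph byteCol is always 0, so the index reduces to the row number, which is why the VGA BIOS 8x16 font needs no conversion.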
INT 10h ES Translation
- Different INT 10h function families use different ES:offset registers: VBE 4Fxx → ES:DI, AH=10h (palette) → ES:DX, AH=11h (fonts) → ES:BP, AH=1Bh → ES:DI
- Only specific AL subfunctions use ES as a buffer pointer; most don't
- Copy sizes must be exact (17 bytes for palette, CX*3 for DAC blocks, etc.)
- Copy direction matters: "Set" = copy-in only, "Read/Get" = copy-out only
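As a rough sketch, the register mapping in the first bullet could be expressed as a lookup keyed on AH. The enum and function names are hypothetical. A real translator would also have to check AL, because only specific subfunctions actually use ES as a buffer pointer, and would need the exact copy size and direction for each one.

```c
#include <stdint.h>

/* Which register pairs with ES to form the buffer pointer. */
typedef enum {
    ES_PTR_NONE,
    ES_PTR_DI,
    ES_PTR_DX,
    ES_PTR_BP
} EsPtrRegT;

/* Map an INT 10h function family (AH) to the offset register used
 * with ES. This covers only the families listed in the notes and
 * deliberately ignores the AL subfunction check. */
static EsPtrRegT int10EsPointerReg(uint8_t ah)
{
    switch (ah) {
        case 0x4F: return ES_PTR_DI;  /* VBE 4Fxx */
        case 0x10: return ES_PTR_DX;  /* palette */
        case 0x11: return ES_PTR_BP;  /* fonts */
        case 0x1B: return ES_PTR_DI;  /* AH=1Bh */
        default:   return ES_PTR_NONE;
    }
}
```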
WINFLAGS Handling
- WF_80x87 NOT used: We don't save/restore FPU state across thunk boundaries
- VGA-class drivers need WF_STANDARD: VGA.DRV's physical_enable hangs in Enhanced mode (polls VDD that doesn't exist). Auto-detected after Enable(style=1) returns 1bpp/4planes GDIINFO → repatch __WINFLAGS in all segments (0x0025→0x0015).
- SVGA drivers (S3TRIO, VBESVGA) use WF_ENHANCED normally
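The 0x0025→0x0015 repatch corresponds to swapping the Enhanced-mode bit for the Standard-mode bit. A minimal sketch, using the WF_* values from the Win16 headers; winFlagsForDriver is a hypothetical name, and the real code patches __WINFLAGS in every loaded segment rather than returning a value.

```c
#include <stdint.h>

/* WINFLAGS bits as defined in the Win16 windows.h */
#define WF_PMODE    0x0001
#define WF_CPU386   0x0004
#define WF_STANDARD 0x0010
#define WF_ENHANCED 0x0020

/* Choose the flags a driver should see, based on the GDIINFO that
 * Enable(style=1) returned. VGA-class drivers (1bpp, 4 planes) get
 * Standard mode; everything else keeps the flags unchanged. */
static uint16_t winFlagsForDriver(uint16_t flags, int bitsPixel, int planes)
{
    if (bitsPixel == 1 && planes == 4) {
        flags = (uint16_t)((flags & ~WF_ENHANCED) | WF_STANDARD);
    }
    return flags;
}
```

Starting from WF_PMODE | WF_CPU386 | WF_ENHANCED (0x0025), a VGA-class driver ends up with 0x0015, while an 8bpp single-plane driver keeps 0x0025.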
ET4000 Driver Notes
- ET4000.DRV from Win 3.x distribution is SZDD-compressed; decompress with msexpand (rename to .DR_ first, output is .DR without the V, so rename to .DRV)
- DOSBox-X machine type: svga_et4000 for ET4000 hardware emulation
- ET4000 is 640x480 8bpp, software-rendered (no accelerator engine in DOSBox-X)
- CR30=0x00 on ET4000 → isS3=false → no S3 engine wait, no display start shift
- GetPixel breaks ScanLR on ET4000: The Pixel DDI with color=-1 (get mode) leaves VGA hardware state (likely GR5 read mode) that causes ScanLR to not match the pixel just read. GetPixel itself returns correct values; only the ScanLR interaction is broken. Flood fill software path avoids GetPixel entirely, using ScanLR for all pixel-color queries.
Font Loading Notes
- .FON files are NE containers with RT_FONT (type 8) resources; each resource is raw .FNT data
- All Win 3.x .FON files contain v2 fonts (0x0200); VBESVGA.DRV requires v3 (0x0300)
- v2→v3 conversion: insert 30-byte extension at 0x76, expand 4-byte char table to 6-byte entries
- v2 char table offsets are absolute from segment base, not relative to fsBitsOffset. Correct v3 offset = v2offset + shift (where shift = newBitmapOff - origBitsOff)
- wdrvLoadFontFon(path, index) loads from .FON; wdrvLoadFontFnt(path) loads raw .FNT
- wdrvLoadFontBuiltin() returns the VGA ROM 8x16 singleton; must NOT be passed to wdrvUnloadFont
- wdrvLoadFontTtf(path, pointSize) loads TrueType via stb_truetype, rasterizes to 1-bit v3 FNT
- wdrvExtTextOut takes a WdrvFontT font parameter (NULL = built-in)
- Available test fonts in fon/: COURE.FON (8x13, 9x16, 12x20), SSERIFE.FON, SERIFE.FON, VGASYS.FON, etc.
- Available TTF fonts in ttf/: LIBMONO.TTF, LIBSANS.TTF, LIBSERIF.TTF (Liberation family)
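The char table part of the v2→v3 conversion can be sketched as follows. It assumes v2 entries are {WORD width, WORD offset} and v3 entries are {WORD width, DWORD offset}, both little-endian, with shift computed as described above. expandCharTable is a hypothetical name, and inserting the header extension is not shown.

```c
#include <stddef.h>
#include <stdint.h>

/* Expand a v2 char table (4-byte entries) into a v3 char table
 * (6-byte entries), adding shift to every bitmap offset.
 * v3Table must have room for entries * 6 bytes. */
static void expandCharTable(const uint8_t *v2Table, uint8_t *v3Table,
                            size_t entries, uint32_t shift)
{
    for (size_t i = 0; i < entries; i++) {
        const uint8_t *src    = v2Table + i * 4;
        uint8_t       *dst    = v3Table + i * 6;
        uint32_t       offset = (uint32_t)(src[2] | (src[3] << 8)) + shift;

        dst[0] = src[0];                      /* width, low byte  */
        dst[1] = src[1];                      /* width, high byte */
        dst[2] = (uint8_t)(offset & 0xFF);    /* 32-bit offset, LE */
        dst[3] = (uint8_t)((offset >> 8) & 0xFF);
        dst[4] = (uint8_t)((offset >> 16) & 0xFF);
        dst[5] = (uint8_t)((offset >> 24) & 0xFF);
    }
}
```

Because the v2 offsets are already absolute from the segment base, the only adjustment is the constant shift introduced by the larger header and char table.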
Current Demo Status
- S3TRIO.DRV, VBESVGA.DRV, VGA.DRV, ET4000.DRV all work: Load → Enable → Draw → Disable → Unload
- Demo 1: Fill rectangles (BitBlt) — works
- Demo 2: Pixel patterns (Pixel) — works
- Demo 3: Lines/starburst (Output/Polyline) — works
- Demo 4: Screen-to-screen blit (BitBlt SRCCOPY) — works
- Demo 5: ExtTextOut text rendering — works (VBESVGA.DRV)
- Demo 7: TrueType font rendering at multiple sizes — works
- Demo 8: Color text showcase (fg/bg colors, opaque/transparent, palette grid) — works
- Demo 9: ROP3 operations (DSTINVERT, SRCINVERT, NOTSRCCOPY, SRCAND, SRCPAINT) — works
- Demo 10: ScanLR + Flood Fill (FB or software fallback) — works on all drivers
- Demo 11: Text measurement (GetCharWidth DDI, wdrvMeasureText) — works
- Demo 12: Styled pen lines (software Bresenham, all drivers) — works
- Demo 13: Pixel buffer blit (FB or software fallback) — works on all drivers
- Demo 14: Hardware cursor (arrow, crosshair, ibeam, hand; animated circle) — works
- Demo 15: Screen save/restore (screen-to-screen BitBlt stash/restore) — works
- VGA.DRV: 640x480 4-plane 16-color mode; limited color palette but functional
- ET4000.DRV: 640x480 8bpp on svga_et4000; software-only, no hw acceleration
- Drivers stored in drivers/ directory, copied to bin/ during build
Software Rendering Fallbacks
- Styled pens: Always software-rendered via Bresenham + Pixel DDI (PS_SOLID uses HW Output DDI). S3TRIO silently accepts styled pens but doesn't render them; software path gives identical output everywhere.
- Flood fill: Uses direct FB when available, falls back to ScanLR DDI + FillRect. Cannot use GetPixel: on ET4000, the Pixel DDI in get mode (color=-1) leaves VGA hardware state (likely GR5 read mode) that causes the subsequent ScanLR to not match the pixel just read. Seed color is determined by probing ScanLR with each palette index until a match is found.
- Pixel blit: Uses direct FB memcpy when available, falls back to per-pixel SetPixel with PALETTEINDEX.
- All four drivers (S3TRIO, VBESVGA, ET4000, VGA) now have identical feature sets.
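For illustration, a software styled line along the lines of the Bresenham fallback might look like the sketch below. This is not the project's implementation: the 16-bit pattern mask, the PlotFnT callback (standing in for a Pixel DDI call), and the function names are all assumptions.

```c
#include <stdint.h>
#include <stdlib.h>

/* Callback that draws one pixel; a stand-in for the Pixel DDI. */
typedef void (*PlotFnT)(int x, int y, void *ctx);

/* Example callback that only counts how many pixels were drawn. */
static void countPlot(int x, int y, void *ctx)
{
    (void)x;
    (void)y;
    (*(int *)ctx)++;
}

/* Bresenham line where a repeating 16-bit pattern decides which
 * steps are drawn. Bit 15 applies to the first pixel; a set bit
 * means "draw". 0xFFFF gives a solid line. */
static void styledLine(int x0, int y0, int x1, int y1,
                       uint16_t pattern, PlotFnT plot, void *ctx)
{
    int dx   = abs(x1 - x0);
    int dy   = -abs(y1 - y0);
    int sx   = (x0 < x1) ? 1 : -1;
    int sy   = (y0 < y1) ? 1 : -1;
    int err  = dx + dy;
    int step = 0;

    for (;;) {
        if (pattern & (0x8000u >> (step & 15))) {
            plot(x0, y0, ctx);
        }
        step++;
        if (x0 == x1 && y0 == y1) {
            break;
        }
        int e2 = 2 * err;
        if (e2 >= dy) { err += dy; x0 += sx; }
        if (e2 <= dx) { err += dx; y0 += sy; }
    }
}
```

Routing every pixel through one callback is what makes the output identical across drivers: the driver only ever sees individual pixel writes, never a styled pen.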
New API Functions (added with demos 9-15)
- wdrvScanLR(handle, x, y, color, style): ScanLR DDI wrapper (ordinal 12)
- wdrvFloodFill(handle, x, y, fillColor): scanline flood fill (FB or software)
- wdrvGetCharWidths(handle, font, firstChar, lastChar, widths): GetCharWidth DDI
- wdrvMeasureText(handle, font, text, length): sum char widths for string
- wdrvPolylineEx(handle, points, count, color, penStyle): polyline with pen style
- wdrvRectangleEx(handle, x, y, w, h, color, penStyle): rectangle with pen style
- wdrvBlitPixels(handle, x, y, w, h, pixels, srcPitch): pixel blit (FB or software)
- wdrvBlitBmp(handle, x, y, bmpPath, setPalette): load+display 8bpp BMP
- wdrvSetCursor(handle, shape): built-in cursor shapes (arrow/crosshair/ibeam/hand/none)
- wdrvSetCursorCustom(handle, hotX, hotY, andMask, xorMask): custom 32x32 mono cursor
- wdrvMoveCursor(handle, x, y): move hardware cursor
- wdrvCreateBitmap(handle, width, height): CreateBitmap DDI
- wdrvDeleteBitmap(handle, bitmap): DeleteBitmap DDI
- wdrvBitmapSetPixels/GetPixels(handle, bitmap, data, size): BitmapBits DDI
- wdrvBitBltFromBitmap/ToBitmap(handle, bitmap, ...): BitBlt with bitmap PDEVICE
- wdrvScreenshot(handle, filename): capture screen to PNG (FB or DDI fallback)