# Win31drv Project Memory

## Build Environment
- DJGPP cross-compiler: `~/djgpp/djgpp/bin/i586-pc-msdosdjgpp-gcc` (GCC 12.2.0)
- DJGPP binutils need `libfl.so.2`: stored in `tools/lib/` (Makefiles set LD_LIBRARY_PATH)
- CWSDPMI zip stored in `tools/cwsdpmi.zip` (extracted to bin/ during build)
- DOSBox-X: `flatpak run com.dosbox_x.DOSBox-X` (installed as user flatpak)
- CWSDPMI.EXE in `bin/` directory for DPMI support under DOSBox-X
- Config: `dosbox-x.conf` with S3 Trio64 machine type, 64MB RAM

## Project Structure
```
windriver/
├── Makefile              # Top-level: builds demo, calls win31drv/Makefile
├── demo.c                # Demo program
├── dosbox-x.conf         # DOSBox-X config (S3 SVGA)
├── obj/                  # Demo object files
├── bin/                  # Executables + CWSDPMI.EXE
└── win31drv/             # Library
    ├── Makefile          # Builds libwindrv.a
    ├── obj/              # Library objects
    ├── neload.c/h        # NE format loader
    ├── neformat.h        # NE structures
    ├── thunk.c/h         # 32→16 bit thunking
    ├── windrv.c/h        # Main API
    ├── winstub.c/h       # Windows API stubs
    ├── winddi.h          # DDI structures
    └── wintypes.h        # Win16 types
```

## DJGPP Portability Notes
- `uint32_t` is `unsigned long` (not `unsigned int`) in DJGPP — use `PRIu32`/`PRIX32` from `<inttypes.h>`
- Always include `<stdarg.h>` explicitly for `va_list`/`va_start`/`va_end`
- Headers must be self-contained (include their own dependencies)

## Thunking Architecture Notes
- **SS == DS == DGROUP**: Win3.x drivers assume SS == DS == DGROUP. VBESVGA's BBLT.ASM does `PrestoChangeoSelector(SS, WorkSelector)` to create a code alias for compiled blit code. thunkCall16 uses dgroupSel as SS (SP=0xFFF0) when available. Without this, the code alias has the wrong base and the CPU executes garbage.
- **Register corruption with -O2 inlining**: When demo.c's demoDrawing is inlined into main, DJGPP GCC 12.2.0 mishandles callee-saved registers across thunk calls in long functions. Fix: `__attribute__((noinline))` on demoDrawing. Symptom: handle pointer corrupted to a ColorInfo return value (e.g. 0xFF0001F6) between Demo 2 and Demo 3.
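
The two DJGPP portability notes above (explicit `<stdarg.h>`, and `PRIu32`/`PRIX32` for `uint32_t`) can be sketched together as a small formatting helper. This is a minimal illustration, not code from the project: `dbgFormat` and `describeHandle` are hypothetical names invented for this example.

```c
#include <inttypes.h>   /* PRIu32, PRIX32: correct whether uint32_t is long or int */
#include <stdarg.h>     /* va_list, va_start, va_end: included explicitly */
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: printf-style formatting into a caller-supplied buffer.
 * Returns whatever vsnprintf returns. */
static int dbgFormat(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    int     written;

    va_start(args, fmt);
    written = vsnprintf(buf, size, fmt, args);
    va_end(args);
    return written;
}

/* Hypothetical helper: formats two 32-bit values without assuming that
 * uint32_t matches %u or %lu. The PRI macros expand to the right length
 * modifier for the target (e.g. "lX" where uint32_t is unsigned long). */
static int describeHandle(char *buf, size_t size, uint32_t handle, uint32_t count)
{
    return dbgFormat(buf, size,
                     "handle=0x%08" PRIX32 " count=%" PRIu32,
                     handle, count);
}
```

Writing `"%08X"` instead of `"%08" PRIX32` compiles with a format warning under DJGPP; the macro form stays warning-free on both DJGPP and hosts where `uint32_t` is `unsigned int`.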

## DOSBox-X Driver Notes
- `waitForEngine()`: GP_STAT port 0x9AE8 bit 9 polling — S3 only (gIsS3 guard)
- **S3 detection**: Probe CR30 chip ID register. S3 chips: 0x81-0xE1. ET4000: 0x00. Only apply S3-specific setup (cursor disable, dispYOffset, setDisplayStart) when isS3=true AND driver is not VGA-class (1bpp/4planes).
- **Pattern scratch artifact**: S3 driver writes 8x8 dithered brush pattern to VRAM at fixed position (~(144,1)-(151,8)) during accelerated pattern fills. Fixed by shifting CRTC display start down 10 scanlines (`dispYOffset`) so the scratch area is off-screen. All drawing Y coordinates are offset by dispYOffset. The full dpVertRes (600) is reported and usable — the shift just consumes slightly more VRAM.
- **S3TRIO BitBlt source corruption**: S3TRIO's accelerated BitBlt corrupts source VRAM during source-dependent ROP operations (SRCINVERT, NOTSRCCOPY, SRCAND, SRCPAINT). In Windows 3.1, GDI uses intermediate off-screen bitmaps. Our direct DDI calls must work around this by redrawing source rects after the ROP, or using separate source areas. Off-screen VRAM (y >= screenH) is NOT usable — the driver clips to screen dimensions.
- **`-fno-gcse` required for windrv.c**: With -O2 GCSE, stack layout causes issues during 16-bit driver calls. Only windrv.c needs this. See `WINDRV_CFLAGS` in win31drv/Makefile.
- Output DDI (polylines/rectangles) requires a **physical pen** from RealizeObject, not a raw LogPen16T. The pen must be in DGROUP (same as brush, drawMode, PDEVICE).
- **Curve primitives removed**: Ellipses, polygons, roundrects, arcs, and pies were removed because DIB engine drivers (VBESVGA/ET4000) hang (expect GDI curve decomposition callback) and S3TRIO only partially renders them. Only polyline is reliable via Output DDI.
- **OS_RECTANGLE crashes DIB engine drivers**: VBESVGA/ET4000 Output(OS_RECTANGLE) crashes like curve primitives. `wdrvRectangleEx` draws rectangles as two 3-point polylines instead.
- **Output DDI lpClipRect**: DIB engine drivers (VBESVGA) dereference lpClipRect unconditionally in polyline paths too. Always pass a valid clip rect (0,0,0x7FFF,0x7FFF).
- `wdrvUnloadDriver` does NOT auto-call Disable — caller must handle text mode restore
- `sleep()` hangs under DOSBox-X because BIOS timer ticks don't advance without I/O
- Debug output: `-d` flag enables verbose logging in neload, winstub, thunk, and windrv
- **SetPalette DDI vs VGA DAC**: VBESVGA's SetPalette DDI updates the driver's internal color table (used by ColorInfo for RGB→index matching) but does NOT program the VGA DAC hardware. In real Windows 3.1, GDI programs the DAC separately. `wdrvSetPalette` works around this by also writing DAC registers directly (ports 0x3C8/0x3C9) after the DDI call. This is idempotent on drivers like S3TRIO that already program the DAC.
- **DAC width**: S3TRIO uses 6-bit DAC (values 0-63), VBESVGA uses 8-bit DAC (values 0-255). Detected at Enable via VBE 4F08 subfunc 01. Both `wdrvSetPalette` (port write) and `readDacPalette` (screenshot) use the detected width for correct shift amounts. Wrong shift causes dark display (6-bit values in 8-bit DAC = 1/4 brightness).
- Known issue: mode mismatch HW=800x600 vs GDIINFO=640x480

## DGROUP Stack Management
- VGA.DRV ships with DGROUP[0x0A]=0xFFFF (stack bottom = top of segment → no stack). Its BitBlt prolog calls a stack check function at 0x18CA that compares available stack against [SS:0x0A]. With 0xFFFF, ALL functions fail immediately (return 0).
- Fix: patch [0x0A] to objBase after extending DGROUP. Only done when original = 0xFFFF.
- S3TRIO.DRV and VBESVGA.DRV have [0x0A]=0x0000 — no patching needed.
- **Do NOT unconditionally overwrite DGROUP offsets 0x00-0x0F** — VBESVGA.DRV stores driver-specific data there (0x030A at offset 0, 0x01 at offset 4).

## BitBlt Source Device Rules
- For pattern-only ROPs (PATCOPY=0xF0, BLACKNESS=0x00, WHITENESS=0xFF, etc.), lpSrcDev must be NULL (0:0) per DDI spec.
  VGA.DRV rejects non-NULL source for pattern-only ROPs. S3TRIO.DRV tolerates it but correct behavior is NULL.
- Source dependency check: `ropNeedsSrc = (((rop8 >> 2) ^ rop8) & 0x33) != 0`

## ExtTextOut DDI Notes
- **Font format**: VBESVGA.DRV requires .FNT v3 (fsVersion=0x0300) when BigFontFlags is set (386 protected mode with WF_PMODE + WF_CPU386). v2 (0x0200) is rejected at runtime.
- **v3 char table**: at file offset 0x94 (fs30CharOffset), 6-byte entries {WORD width, DWORD offset}. The DWORD offset is absolute from the font segment base. Use FntCharEntry30T.
- **Bitmap layout**: per-character contiguous, **column-major** byte order. For each character, all pixHeight rows of byte-column 0 come first, then all rows of byte-column 1, etc. Address formula: `(byteCol * pixHeight) + row`. For 8px-wide chars (1 byte column), this is identical to row-major. VGA BIOS 8x16 font is already in this format — no transpose needed.
- **lpClipRect must NOT be NULL**: VBESVGA's get_clip unconditionally dereferences lpClipRect (STRBLT.ASM:1008 "We assume that we will never get passed a null rectangle"). Pass a RECT covering the full screen (0, 0, 0x7FFF, 0x7FFF).
- **lpTextXForm**: declared but never read by VBESVGA — pass NULL.
- **lp_font offset**: passed as fontSel:0x42 (points to fsType within the .FNT block).
- **Return value**: DX bit 15 = error. AX=0 is NOT necessarily failure.

## INT 10h ES Translation
- Different INT 10h function families use different ES:offset registers:
  VBE 4Fxx → ES:DI, AH=10h (palette) → ES:DX, AH=11h (fonts) → ES:BP, AH=1Bh → ES:DI
- Only specific AL subfunctions use ES as a buffer pointer; most don't
- Copy sizes must be exact (17 bytes for palette, CX*3 for DAC blocks, etc.)
- Copy direction matters: "Set" = copy-in only, "Read/Get" = copy-out only

## WINFLAGS Handling
- **WF_80x87 NOT used**: We don't save/restore FPU state across thunk boundaries
- **VGA-class drivers need WF_STANDARD**: VGA.DRV's physical_enable hangs in Enhanced mode (polls VDD that doesn't exist). Auto-detected after Enable(style=1) returns 1bpp/4planes GDIINFO → repatch __WINFLAGS in all segments (0x0025→0x0015).
- SVGA drivers (S3TRIO, VBESVGA) use WF_ENHANCED normally

## ET4000 Driver Notes
- ET4000.DRV from Win 3.x distribution is SZDD-compressed; decompress with `msexpand` (rename to .DR_ first, output is .DR without the V — rename to .DRV)
- DOSBox-X machine type: `svga_et4000` for ET4000 hardware emulation
- ET4000 is 640x480 8bpp, software-rendered (no accelerator engine in DOSBox-X)
- CR30=0x00 on ET4000 → isS3=false → no S3 engine wait, no display start shift
- **GetPixel breaks ScanLR on ET4000**: The Pixel DDI with color=-1 (get mode) leaves VGA hardware state (likely GR5 read mode) that causes ScanLR to not match the pixel just read. GetPixel itself returns correct values; only the ScanLR interaction is broken. Flood fill software path avoids GetPixel entirely, using ScanLR for all pixel-color queries.

## Font Loading Notes
- .FON files are NE containers with RT_FONT (type 8) resources; each resource is raw .FNT data
- All Win 3.x .FON files contain v2 fonts (0x0200); VBESVGA.DRV requires v3 (0x0300)
- v2→v3 conversion: insert 30-byte extension at 0x76, expand 4-byte char table to 6-byte entries
- **v2 char table offsets are absolute from segment base**, not relative to fsBitsOffset.
  Correct v3 offset = v2offset + shift (where shift = newBitmapOff - origBitsOff)
- `wdrvLoadFontFon(path, index)` loads from .FON; `wdrvLoadFontFnt(path)` loads raw .FNT
- `wdrvLoadFontBuiltin()` returns the VGA ROM 8x16 singleton; must NOT be passed to wdrvUnloadFont
- `wdrvLoadFontTtf(path, pointSize)` loads TrueType via stb_truetype, rasterizes to 1-bit v3 FNT
- `wdrvExtTextOut` takes a `WdrvFontT font` parameter (NULL = built-in)
- Available test fonts in `fon/`: COURE.FON (8x13, 9x16, 12x20), SSERIFE.FON, SERIFE.FON, VGASYS.FON, etc.
- Available TTF fonts in `ttf/`: LIBMONO.TTF, LIBSANS.TTF, LIBSERIF.TTF (Liberation family)

## Current Demo Status
- S3TRIO.DRV, VBESVGA.DRV, VGA.DRV, ET4000.DRV all work: Load → Enable → Draw → Disable → Unload
- Demo 1: Fill rectangles (BitBlt) — works
- Demo 2: Pixel patterns (Pixel) — works
- Demo 3: Lines/starburst (Output/Polyline) — works
- Demo 4: Screen-to-screen blit (BitBlt SRCCOPY) — works
- Demo 5: ExtTextOut text rendering — works (VBESVGA.DRV)
- Demo 7: TrueType font rendering at multiple sizes — works
- Demo 8: Color text showcase (fg/bg colors, opaque/transparent, palette grid) — works
- Demo 9: ROP3 operations (DSTINVERT, SRCINVERT, NOTSRCCOPY, SRCAND, SRCPAINT) — works
- Demo 10: ScanLR + Flood Fill (FB or software fallback) — works on all drivers
- Demo 11: Text measurement (GetCharWidth DDI, wdrvMeasureText) — works
- Demo 12: Styled pen lines (software Bresenham, all drivers) — works
- Demo 13: Pixel buffer blit (FB or software fallback) — works on all drivers
- Demo 14: Hardware cursor (arrow, crosshair, ibeam, hand; animated circle) — works
- Demo 15: Screen save/restore (screen-to-screen BitBlt stash/restore) — works
- VGA.DRV: 640x480 4-plane 16-color mode; limited color palette but functional
- ET4000.DRV: 640x480 8bpp on svga_et4000; software-only, no hw acceleration
- Drivers stored in `drivers/` directory, copied to `bin/` during build

## Software Rendering Fallbacks
- **Styled pens**: Always
  software-rendered via Bresenham + Pixel DDI (PS_SOLID uses HW Output DDI). S3TRIO silently accepts styled pens but doesn't render them; software path gives identical output everywhere.
- **Flood fill**: Uses direct FB when available, falls back to ScanLR DDI + FillRect. Cannot use GetPixel — ET4000 DIB engine's Pixel DDI (color=-1) corrupts VRAM, causing subsequent ScanLR to not match the read pixel. Seed color is determined by probing ScanLR with each palette index until a match is found.
- **Pixel blit**: Uses direct FB memcpy when available, falls back to per-pixel SetPixel with PALETTEINDEX.
- All four drivers (S3TRIO, VBESVGA, ET4000, VGA) now have identical feature sets.

## New API Functions (added with demos 9-15)
- `wdrvScanLR(handle, x, y, color, style)` — ScanLR DDI wrapper (ordinal 12)
- `wdrvFloodFill(handle, x, y, fillColor)` — scanline flood fill (FB or software)
- `wdrvGetCharWidths(handle, font, firstChar, lastChar, widths)` — GetCharWidth DDI
- `wdrvMeasureText(handle, font, text, length)` — sum char widths for string
- `wdrvPolylineEx(handle, points, count, color, penStyle)` — polyline with pen style
- `wdrvRectangleEx(handle, x, y, w, h, color, penStyle)` — rectangle with pen style
- `wdrvBlitPixels(handle, x, y, w, h, pixels, srcPitch)` — pixel blit (FB or software)
- `wdrvBlitBmp(handle, x, y, bmpPath, setPalette)` — load+display 8bpp BMP
- `wdrvSetCursor(handle, shape)` — built-in cursor shapes (arrow/crosshair/ibeam/hand/none)
- `wdrvSetCursorCustom(handle, hotX, hotY, andMask, xorMask)` — custom 32x32 mono cursor
- `wdrvMoveCursor(handle, x, y)` — move hardware cursor
- `wdrvCreateBitmap(handle, width, height)` — CreateBitmap DDI
- `wdrvDeleteBitmap(handle, bitmap)` — DeleteBitmap DDI
- `wdrvBitmapSetPixels/GetPixels(handle, bitmap, data, size)` — BitmapBits DDI
- `wdrvBitBltFromBitmap/ToBitmap(handle, bitmap, ...)` — BitBlt with bitmap PDEVICE
- `wdrvScreenshot(handle, filename)` — capture screen to PNG (FB or DDI fallback)
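
The source dependency check from the BitBlt Source Device Rules section decides whether lpSrcDev must be NULL. The sketch below wraps that expression in a function. It is illustrative only: the function names are hypothetical, and the only assumption about ROP encoding is the standard one, where the 8-bit ROP index is the high byte of the 32-bit ROP3 code (e.g. SRCCOPY = 0x00CC0020).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper wrapping the check quoted in the notes.
 * In the 8-bit ROP index, bits 2,3,6,7 are the results for S=1 and
 * bits 0,1,4,5 are the results for S=0 (same P and D). Shifting right by 2
 * lines each S=1 bit up with its S=0 partner; if any pair differs, the
 * result depends on the source. */
static bool ropNeedsSrc(uint8_t rop8)
{
    return (((rop8 >> 2) ^ rop8) & 0x33) != 0;
}

/* Hypothetical convenience wrapper: extracts the 8-bit index from a full
 * 32-bit ROP3 code before applying the check. */
static bool rop3NeedsSrc(uint32_t rop3)
{
    return ropNeedsSrc((uint8_t)(rop3 >> 16));
}
```

A caller would pass a NULL source device whenever the check returns false (PATCOPY, BLACKNESS, WHITENESS, DSTINVERT) and a real source only when it returns true (SRCCOPY, SRCINVERT, NOTSRCCOPY, SRCAND, SRCPAINT).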