WinDriver/CLAUDE.md
Scott Duensing 5527130145 Add wdrvScreenshot, auto-demo mode, palette fixes, and DAC width detection
Add wdrvScreenshot() to capture the screen to PNG via stb_image_write.h,
reading the framebuffer (or DDI bitmap fallback) and VGA DAC palette.

Convert demo.c to non-interactive mode with automatic screenshots after
each demo (DEMO01-15.PNG) and no keypress waits, plus per-driver
DOSBox-X configs for automated testing.

Set a standard Windows 3.1 256-color palette (8R x 8G x 4B color cube
with 20 static system colors) to ensure consistent output across drivers.

Fix wdrvSetPalette to also program the VGA DAC directly, since VBESVGA's
SetPalette DDI updates its internal color table but not the hardware.

Detect DAC width via VBE 4F08 (S3TRIO=6-bit, VBESVGA=8-bit) and use
correct shift in both DAC writes and reads — fixes dark display on
VBESVGA where 6-bit values in 8-bit DAC produced 1/4 brightness.

Fix S3 dispYOffset: extend PDEVICE deHeight by the offset so the
driver's internal clipping allows the full 600-row logical screen,
rather than incorrectly reducing dpVertRes to 590.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 18:55:57 -06:00


# Win31drv Project Memory
## Build Environment
- DJGPP cross-compiler: `~/djgpp/djgpp/bin/i586-pc-msdosdjgpp-gcc` (GCC 12.2.0)
- DJGPP binutils need `libfl.so.2`: stored in `tools/lib/` (Makefiles set LD_LIBRARY_PATH)
- CWSDPMI zip stored in `tools/cwsdpmi.zip` (extracted to bin/ during build)
- DOSBox-X: `flatpak run com.dosbox_x.DOSBox-X` (installed as user flatpak)
- CWSDPMI.EXE in `bin/` directory for DPMI support under DOSBox-X
- Config: `dosbox-x.conf` with S3 Trio64 machine type, 64MB RAM
## Project Structure
```
windriver/
├── Makefile # Top-level: builds demo, calls win31drv/Makefile
├── demo.c # Demo program
├── dosbox-x.conf # DOSBox-X config (S3 SVGA)
├── obj/ # Demo object files
├── bin/ # Executables + CWSDPMI.EXE
└── win31drv/ # Library
    ├── Makefile # Builds libwindrv.a
    ├── obj/ # Library objects
    ├── neload.c/h # NE format loader
    ├── neformat.h # NE structures
    ├── thunk.c/h # 32→16 bit thunking
    ├── windrv.c/h # Main API
    ├── winstub.c/h # Windows API stubs
    ├── winddi.h # DDI structures
    └── wintypes.h # Win16 types
```
## DJGPP Portability Notes
- `uint32_t` is `unsigned long` (not `unsigned int`) in DJGPP — use `PRIu32`/`PRIX32` from `<inttypes.h>`
- Always include `<stdarg.h>` explicitly for `va_list`/`va_start`/`va_end`
- Headers must be self-contained (include their own dependencies)
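
The first two notes can be illustrated with a minimal sketch. `logFormat` and `formatDword` are hypothetical helper names, not part of the library; the point is only the `PRIu32`/`PRIX32` macros and the explicit `<stdarg.h>` include.

```c
#include <inttypes.h>
#include <stdarg.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical printf-style helper. <stdarg.h> is included explicitly
 * rather than relying on another header to pull it in. */
int logFormat(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    int     written;

    va_start(args, fmt);
    written = vsnprintf(buf, size, fmt, args);
    va_end(args);
    return written;
}

/* uint32_t is unsigned long under DJGPP, so "%u"/"%X" would be wrong.
 * The <inttypes.h> macros expand to the right length modifier. */
int formatDword(char *buf, size_t size, uint32_t value)
{
    return logFormat(buf, size, "%" PRIu32 " (0x%08" PRIX32 ")", value, value);
}
```

The same format string compiles cleanly on hosts where `uint32_t` is `unsigned int`, which keeps shared code warning-free on both sides.
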
## Thunking Architecture Notes
- **SS == DS == DGROUP**: Win3.x drivers assume SS == DS == DGROUP. VBESVGA's BBLT.ASM does
`PrestoChangeoSelector(SS, WorkSelector)` to create a code alias for compiled blit code.
thunkCall16 uses dgroupSel as SS (SP=0xFFF0) when available. Without this, the code alias
has the wrong base and the CPU executes garbage.
- **Register corruption with -O2 inlining**: When demo.c's demoDrawing is inlined into main,
DJGPP GCC 12.2.0 mishandles callee-saved registers across thunk calls in long functions.
Fix: `__attribute__((noinline))` on demoDrawing. Symptom: handle pointer corrupted to
a ColorInfo return value (e.g. 0xFF0001F6) between Demo 2 and Demo 3.
## DOSBox-X Driver Notes
- `waitForEngine()`: GP_STAT port 0x9AE8 bit 9 polling — S3 only (gIsS3 guard)
- **S3 detection**: Probe CR30 chip ID register. S3 chips: 0x81-0xE1. ET4000: 0x00.
Only apply S3-specific setup (cursor disable, dispYOffset, setDisplayStart) when isS3=true
AND driver is not VGA-class (1bpp/4planes).
- **Pattern scratch artifact**: S3 driver writes 8x8 dithered brush pattern to VRAM at fixed
position (~(144,1)-(151,8)) during accelerated pattern fills. Fixed by shifting CRTC display
start down 10 scanlines (`dispYOffset`) so the scratch area is off-screen. All drawing Y
coordinates are offset by dispYOffset. The full dpVertRes (600) is reported and usable —
the shift just consumes slightly more VRAM.
- **S3TRIO BitBlt source corruption**: S3TRIO's accelerated BitBlt corrupts source VRAM
during source-dependent ROP operations (SRCINVERT, NOTSRCCOPY, SRCAND, SRCPAINT).
In Windows 3.1, GDI uses intermediate off-screen bitmaps. Our direct DDI calls must
work around this by redrawing source rects after the ROP, or using separate source areas.
Off-screen VRAM (y >= screenH) is NOT usable — the driver clips to screen dimensions.
- **`-fno-gcse` required for windrv.c**: With -O2 GCSE, stack layout causes issues during
16-bit driver calls. Only windrv.c needs this. See `WINDRV_CFLAGS` in win31drv/Makefile.
- Output DDI (polylines/rectangles) requires a **physical pen** from RealizeObject, not a
raw LogPen16T. The pen must be in DGROUP (same as brush, drawMode, PDEVICE).
- **Curve primitives removed**: Ellipses, polygons, roundrects, arcs, and pies were removed
because DIB engine drivers (VBESVGA/ET4000) hang (they expect a GDI curve-decomposition callback)
and S3TRIO only partially renders them. Only polyline is reliable via Output DDI.
- **OS_RECTANGLE crashes DIB engine drivers**: VBESVGA/ET4000 Output(OS_RECTANGLE) crashes
like curve primitives. `wdrvRectangleEx` draws rectangles as two 3-point polylines instead.
- **Output DDI lpClipRect**: DIB engine drivers (VBESVGA) dereference lpClipRect
unconditionally in polyline paths too. Always pass a valid clip rect (0,0,0x7FFF,0x7FFF).
- `wdrvUnloadDriver` does NOT auto-call Disable — caller must handle text mode restore
- `sleep()` hangs under DOSBox-X because BIOS timer ticks don't advance without I/O
- Debug output: `-d` flag enables verbose logging in neload, winstub, thunk, and windrv
- **SetPalette DDI vs VGA DAC**: VBESVGA's SetPalette DDI updates the driver's internal
color table (used by ColorInfo for RGB→index matching) but does NOT program the VGA
DAC hardware. In real Windows 3.1, GDI programs the DAC separately. `wdrvSetPalette`
works around this by also writing DAC registers directly (ports 0x3C8/0x3C9) after
the DDI call. The extra write is redundant but harmless on drivers like S3TRIO that already program the DAC.
- **DAC width**: S3TRIO uses 6-bit DAC (values 0-63), VBESVGA uses 8-bit DAC (values
0-255). Detected at Enable via VBE 4F08 subfunc 01. Both `wdrvSetPalette` (port write)
and `readDacPalette` (screenshot) use the detected width for correct shift amounts.
Wrong shift causes dark display (6-bit values in 8-bit DAC = 1/4 brightness).
- Known issue: mode mismatch HW=800x600 vs GDIINFO=640x480
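
Two of the notes above (S3 detection and DAC width) reduce to small pure functions. The sketch below is illustrative only: all four function names are hypothetical, and the port I/O and VBE 4F08 call that would supply `cr30` and `dacBits` are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* CR30 chip ID range treated as S3 in the notes above (ET4000 reads 0x00). */
bool isS3ChipId(uint8_t cr30)
{
    return cr30 >= 0x81 && cr30 <= 0xE1;
}

/* S3-specific setup applies only on S3 hardware with a driver that is
 * not VGA-class (VGA-class = 1 bit per pixel, 4 planes). */
bool needsS3Setup(uint8_t cr30, int bitsPixel, int planes)
{
    bool vgaClass = (bitsPixel == 1 && planes == 4);
    return isS3ChipId(cr30) && !vgaClass;
}

/* Scale an 8-bit color component to the value written to port 0x3C9.
 * dacBits is assumed to be 6 or 8. */
uint8_t dacEncode(uint8_t component8, int dacBits)
{
    return (uint8_t)(component8 >> (8 - dacBits));
}

/* Scale a value read back from port 0x3C9 to 8 bits. */
uint8_t dacDecode(uint8_t dacValue, int dacBits)
{
    uint8_t mask = (uint8_t)((1u << dacBits) - 1u);
    return (uint8_t)((dacValue & mask) << (8 - dacBits));
}
```

Writing `dacEncode(255, 6)` (63) into an 8-bit DAC is the dark-display case described above: 63 out of 255 is roughly a quarter of full brightness.
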
## DGROUP Stack Management
- VGA.DRV ships with DGROUP[0x0A]=0xFFFF (stack bottom = top of segment → no stack).
Its BitBlt prolog calls a stack check function at 0x18CA that compares available stack
against [SS:0x0A]. With 0xFFFF, ALL functions fail immediately (return 0).
- Fix: patch [0x0A] to objBase after extending DGROUP. Only done when original = 0xFFFF.
- S3TRIO.DRV and VBESVGA.DRV have [0x0A]=0x0000 — no patching needed.
- **Do NOT unconditionally overwrite DGROUP offsets 0x00-0x0F** — VBESVGA.DRV stores
driver-specific data there (0x030A at offset 0, 0x01 at offset 4).
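
A minimal sketch of the conditional patch, assuming DGROUP is available as a flat byte buffer. `patchStackBottom` is a hypothetical name, and the real code may access the segment differently.

```c
#include <stdbool.h>
#include <stdint.h>

#define STACK_BOTTOM_OFF 0x0A  /* offset of the stack-bottom word in DGROUP */

/* Patch the stack-bottom word only when the driver shipped it as 0xFFFF.
 * Any other value (e.g. 0x0000 on S3TRIO/VBESVGA) is left untouched, and
 * no other byte in 0x00-0x0F is written. Returns true if a patch was made. */
bool patchStackBottom(uint8_t *dgroup, uint16_t objBase)
{
    uint16_t current = (uint16_t)(dgroup[STACK_BOTTOM_OFF] |
                                  (dgroup[STACK_BOTTOM_OFF + 1] << 8));

    if (current != 0xFFFF) {
        return false;
    }
    dgroup[STACK_BOTTOM_OFF]     = (uint8_t)(objBase & 0xFF);
    dgroup[STACK_BOTTOM_OFF + 1] = (uint8_t)(objBase >> 8);
    return true;
}
```
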
## BitBlt Source Device Rules
- For pattern-only ROPs (PATCOPY=0xF0, BLACKNESS=0x00, WHITENESS=0xFF, etc.),
lpSrcDev must be NULL (0:0) per DDI spec. VGA.DRV rejects non-NULL source for
pattern-only ROPs. S3TRIO.DRV tolerates it but correct behavior is NULL.
- Source dependency check: `ropNeedsSrc = (((rop8 >> 2) ^ rop8) & 0x33) != 0`
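
The check works because bits 2, 3, 6, and 7 of the ROP byte are the results for S=1, and bits 0, 1, 4, and 5 are the matching results for S=0. If the two halves agree, the source has no effect. A small sketch follows. The function names are hypothetical, and `ropNeedsPat` is the analogous pattern test, which the notes above do not mention.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when the ROP3 result depends on the source operand. */
bool ropNeedsSrc(uint8_t rop8)
{
    return (((rop8 >> 2) ^ rop8) & 0x33) != 0;
}

/* Analogous test for the pattern operand (not from the notes above):
 * compares the P=1 nibble against the P=0 nibble. */
bool ropNeedsPat(uint8_t rop8)
{
    return (((rop8 >> 4) ^ rop8) & 0x0F) != 0;
}
```

With this, a caller can pass a NULL `lpSrcDev` whenever `ropNeedsSrc` is false, which covers PATCOPY (0xF0), BLACKNESS (0x00), WHITENESS (0xFF), and DSTINVERT (0x55).
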
## ExtTextOut DDI Notes
- **Font format**: VBESVGA.DRV requires .FNT v3 (fsVersion=0x0300) when BigFontFlags is set
(386 protected mode with WF_PMODE + WF_CPU386). v2 (0x0200) is rejected at runtime.
- **v3 char table**: at file offset 0x94 (fs30CharOffset), 6-byte entries {WORD width, DWORD offset}.
The DWORD offset is absolute from the font segment base. Use FntCharEntry30T.
- **Bitmap layout**: per-character contiguous, **column-major** byte order. For each character,
all pixHeight rows of byte-column 0 come first, then all rows of byte-column 1, etc.
Address formula: `(byteCol * pixHeight) + row`. For 8px-wide chars (1 byte column), this
is identical to row-major. VGA BIOS 8x16 font is already in this format — no transpose needed.
- **lpClipRect must NOT be NULL**: VBESVGA's get_clip unconditionally dereferences lpClipRect
(STRBLT.ASM:1008 "We assume that we will never get passed a null rectangle"). Pass a RECT
covering the full screen (0, 0, 0x7FFF, 0x7FFF).
- **lpTextXForm**: declared but never read by VBESVGA — pass NULL.
- **lp_font offset**: passed as fontSel:0x42 (points to fsType within the .FNT block).
- **Return value**: DX bit 15 = error. AX=0 is NOT necessarily failure.
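
The column-major address formula and the return-value rule can be sketched as follows. The function names are hypothetical, and the MSB-first bit order within each byte (leftmost pixel in bit 7) is an assumption based on the usual .FNT raster layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Byte offset inside one character's bitmap: all rows of byte-column 0,
 * then all rows of byte-column 1, and so on. */
uint32_t glyphByteOffset(uint32_t byteCol, uint32_t pixHeight, uint32_t row)
{
    return (byteCol * pixHeight) + row;
}

/* Test one pixel of a glyph. Assumes bit 7 is the leftmost pixel. */
bool glyphPixel(const uint8_t *glyph, uint32_t pixHeight, uint32_t x, uint32_t y)
{
    uint8_t bits = glyph[glyphByteOffset(x / 8, pixHeight, y)];
    return (bits & (0x80u >> (x % 8))) != 0;
}

/* ExtTextOut returns a DWORD in DX:AX; only DX bit 15 signals an error,
 * so a zero AX alone must not be treated as failure. */
bool extTextOutFailed(uint32_t dxax)
{
    return (dxax & 0x80000000u) != 0;
}
```

For a 16-pixel-wide, 2-row glyph stored as `{0x80, 0x00, 0x01, 0x00}`, the byte for pixel (15, 0) is at offset `1 * 2 + 0 = 2`, not offset 1 as it would be in row-major order.
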
## INT 10h ES Translation
- Different INT 10h function families use different ES:offset registers:
VBE 4Fxx → ES:DI, AH=10h (palette) → ES:DX, AH=11h (fonts) → ES:BP, AH=1Bh → ES:DI
- Only specific AL subfunctions use ES as a buffer pointer; most don't
- Copy sizes must be exact (17 bytes for palette, CX*3 for DAC blocks, etc.)
- Copy direction matters: "Set" = copy-in only, "Read/Get" = copy-out only
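
As an illustration of these rules for the AH=10h palette family, the sketch below maps a few subfunctions to their buffer register, copy direction, and exact size. The type and function names are hypothetical. The subfunction numbers (1002h, 1009h, 1012h, 1017h) come from the standard VGA BIOS interface rather than from the notes above, and the real translation layer may cover more cases.

```c
#include <stdint.h>

typedef enum { ES_NONE, ES_DI, ES_DX, ES_BP } EsPointerT;
typedef enum { COPY_NONE, COPY_IN, COPY_OUT } CopyDirT;

typedef struct {
    EsPointerT pointer;    /* register paired with ES */
    CopyDirT   direction;  /* COPY_IN = caller to BIOS, COPY_OUT = BIOS to caller */
    uint32_t   bytes;      /* exact number of bytes to copy */
} EsBufferT;

/* Describe the ES buffer used by an INT 10h AH=10h call, if any. */
EsBufferT int10PaletteBuffer(uint16_t ax, uint16_t cx)
{
    EsBufferT buf = { ES_NONE, COPY_NONE, 0 };

    switch (ax) {
        case 0x1002:  /* set all palette registers + overscan */
            buf = (EsBufferT){ ES_DX, COPY_IN, 17 };
            break;
        case 0x1009:  /* read all palette registers + overscan */
            buf = (EsBufferT){ ES_DX, COPY_OUT, 17 };
            break;
        case 0x1012:  /* set block of DAC registers, CX entries */
            buf = (EsBufferT){ ES_DX, COPY_IN, (uint32_t)cx * 3u };
            break;
        case 0x1017:  /* read block of DAC registers, CX entries */
            buf = (EsBufferT){ ES_DX, COPY_OUT, (uint32_t)cx * 3u };
            break;
        default:      /* most subfunctions carry no ES buffer */
            break;
    }
    return buf;
}
```
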
## WINFLAGS Handling
- **WF_80x87 NOT used**: We don't save/restore FPU state across thunk boundaries
- **VGA-class drivers need WF_STANDARD**: VGA.DRV's physical_enable hangs in Enhanced
mode (polls VDD that doesn't exist). Auto-detected after Enable(style=1) returns
1bpp/4planes GDIINFO → repatch __WINFLAGS in all segments (0x0025→0x0015).
- SVGA drivers (S3TRIO, VBESVGA) use WF_ENHANCED normally
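
The 0x0025→0x0015 repatch is just a swap of the mode bit, as this sketch shows. The `WF_*` values are the standard Win16 `GetWinFlags` constants. `winFlagsToStandard` is a hypothetical name, and locating `__WINFLAGS` references in the loaded segments is out of scope here.

```c
#include <stdint.h>

#define WF_PMODE    0x0001
#define WF_CPU386   0x0004
#define WF_STANDARD 0x0010
#define WF_ENHANCED 0x0020

/* Default flags for SVGA drivers: 386 protected mode, Enhanced. */
#define WINFLAGS_DEFAULT (WF_PMODE | WF_CPU386 | WF_ENHANCED)  /* 0x0025 */

/* Replace WF_ENHANCED with WF_STANDARD, leaving all other bits alone. */
uint16_t winFlagsToStandard(uint16_t flags)
{
    return (uint16_t)((flags & ~WF_ENHANCED) | WF_STANDARD);
}
```
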
## ET4000 Driver Notes
- ET4000.DRV from Win 3.x distribution is SZDD-compressed; decompress with `msexpand`
(rename to .DR_ first, output is .DR without the V — rename to .DRV)
- DOSBox-X machine type: `svga_et4000` for ET4000 hardware emulation
- ET4000 is 640x480 8bpp, software-rendered (no accelerator engine in DOSBox-X)
- CR30=0x00 on ET4000 → isS3=false → no S3 engine wait, no display start shift
- **GetPixel breaks ScanLR on ET4000**: The Pixel DDI with color=-1 (get mode) leaves
VGA hardware state (likely GR5 read mode) that causes ScanLR to not match the pixel
just read. GetPixel itself returns correct values; only the ScanLR interaction is
broken. Flood fill software path avoids GetPixel entirely, using ScanLR for all
pixel-color queries.
## Font Loading Notes
- .FON files are NE containers with RT_FONT (type 8) resources; each resource is raw .FNT data
- All Win 3.x .FON files contain v2 fonts (0x0200); VBESVGA.DRV requires v3 (0x0300)
- v2→v3 conversion: insert 30-byte extension at 0x76, expand 4-byte char table to 6-byte entries
- **v2 char table offsets are absolute from segment base**, not relative to fsBitsOffset.
Correct v3 offset = v2offset + shift (where shift = newBitmapOff - origBitsOff)
- `wdrvLoadFontFon(path, index)` loads from .FON; `wdrvLoadFontFnt(path)` loads raw .FNT
- `wdrvLoadFontBuiltin()` returns the VGA ROM 8x16 singleton; must NOT be passed to wdrvUnloadFont
- `wdrvLoadFontTtf(path, pointSize)` loads TrueType via stb_truetype, rasterizes to 1-bit v3 FNT
- `wdrvExtTextOut` takes a `WdrvFontT font` parameter (NULL = built-in)
- Available test fonts in `fon/`: COURE.FON (8x13, 9x16, 12x20), SSERIFE.FON, SERIFE.FON, VGASYS.FON, etc.
- Available TTF fonts in `ttf/`: LIBMONO.TTF, LIBSANS.TTF, LIBSERIF.TTF (Liberation family)
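
The char table step of the v2→v3 conversion can be sketched as below. `fntExpandCharTable` is a hypothetical name, and the caller is assumed to have already computed `shift` (newBitmapOff - origBitsOff) and the entry count. Bytes are read and written explicitly as little-endian so the sketch does not depend on struct packing.

```c
#include <stddef.h>
#include <stdint.h>

static uint16_t readLe16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static void writeLe16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)(v >> 8);
}

static void writeLe32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)((v >> 8) & 0xFF);
    p[2] = (uint8_t)((v >> 16) & 0xFF);
    p[3] = (uint8_t)((v >> 24) & 0xFF);
}

/* Expand v2 entries {WORD width, WORD offset} into v3 entries
 * {WORD width, DWORD offset}. v2 offsets are absolute from the segment
 * base, so each one moves by the same shift. */
void fntExpandCharTable(const uint8_t *v2Table, uint8_t *v3Table,
                        size_t entries, uint32_t shift)
{
    for (size_t i = 0; i < entries; i++) {
        uint16_t width  = readLe16(v2Table + i * 4);
        uint16_t offset = readLe16(v2Table + i * 4 + 2);

        writeLe16(v3Table + i * 6, width);
        writeLe32(v3Table + i * 6 + 2, (uint32_t)offset + shift);
    }
}
```
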
## Current Demo Status
- S3TRIO.DRV, VBESVGA.DRV, VGA.DRV, ET4000.DRV all work: Load → Enable → Draw → Disable → Unload
- Demo 1: Fill rectangles (BitBlt) — works
- Demo 2: Pixel patterns (Pixel) — works
- Demo 3: Lines/starburst (Output/Polyline) — works
- Demo 4: Screen-to-screen blit (BitBlt SRCCOPY) — works
- Demo 5: ExtTextOut text rendering — works (VBESVGA.DRV)
- Demo 7: TrueType font rendering at multiple sizes — works
- Demo 8: Color text showcase (fg/bg colors, opaque/transparent, palette grid) — works
- Demo 9: ROP3 operations (DSTINVERT, SRCINVERT, NOTSRCCOPY, SRCAND, SRCPAINT) — works
- Demo 10: ScanLR + Flood Fill (FB or software fallback) — works on all drivers
- Demo 11: Text measurement (GetCharWidth DDI, wdrvMeasureText) — works
- Demo 12: Styled pen lines (software Bresenham, all drivers) — works
- Demo 13: Pixel buffer blit (FB or software fallback) — works on all drivers
- Demo 14: Hardware cursor (arrow, crosshair, ibeam, hand; animated circle) — works
- Demo 15: Screen save/restore (screen-to-screen BitBlt stash/restore) — works
- VGA.DRV: 640x480 4-plane 16-color mode; limited color palette but functional
- ET4000.DRV: 640x480 8bpp on svga_et4000; software-only, no hw acceleration
- Drivers stored in `drivers/` directory, copied to `bin/` during build
## Software Rendering Fallbacks
- **Styled pens**: Always software-rendered via Bresenham + Pixel DDI (PS_SOLID uses HW Output DDI).
S3TRIO silently accepts styled pens but doesn't render them; software path gives identical output everywhere.
- **Flood fill**: Uses direct FB when available, falls back to ScanLR DDI + FillRect.
Cannot use GetPixel: on ET4000 the Pixel DDI (color=-1) leaves VGA hardware state (likely
GR5 read mode) that makes the subsequent ScanLR fail to match the pixel just read (see
ET4000 Driver Notes). Seed color is determined by
probing ScanLR with each palette index until a match is found.
- **Pixel blit**: Uses direct FB memcpy when available, falls back to per-pixel SetPixel with PALETTEINDEX.
- All four drivers (S3TRIO, VBESVGA, ET4000, VGA) now have identical feature sets.
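
The styled-pen fallback can be sketched as a Bresenham walk gated by a style mask. Everything here is an assumption for illustration: the 16-bit repeating mask, the callback type, and the function names are hypothetical, and the library's actual dash patterns and its Pixel DDI wrapper are not shown.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-pixel callback; the real code would call the Pixel DDI. */
typedef void (*PlotFnT)(void *ctx, int x, int y);

/* Walk the line with Bresenham's algorithm and plot only the steps whose
 * bit is set in the 16-bit style mask (bit 15 = first step, repeating). */
void styledLine(int x0, int y0, int x1, int y1,
                uint16_t pattern, PlotFnT plot, void *ctx)
{
    int      dx   = abs(x1 - x0);
    int      dy   = -abs(y1 - y0);
    int      sx   = (x0 < x1) ? 1 : -1;
    int      sy   = (y0 < y1) ? 1 : -1;
    int      err  = dx + dy;
    unsigned step = 0;

    for (;;) {
        if (pattern & (0x8000u >> (step & 15u))) {
            plot(ctx, x0, y0);
        }
        step++;
        if (x0 == x1 && y0 == y1) {
            break;
        }
        int e2 = 2 * err;
        if (e2 >= dy) { err += dy; x0 += sx; }
        if (e2 <= dx) { err += dx; y0 += sy; }
    }
}

/* Example callback that only counts plotted pixels. */
void countPlot(void *ctx, int x, int y)
{
    (void)x;
    (void)y;
    (*(int *)ctx)++;
}
```

Because every pixel goes through the same callback, the output does not depend on whether a given driver honors styled pens, which matches the note above about identical output on all drivers.
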
## New API Functions (added with demos 9-15)
- `wdrvScanLR(handle, x, y, color, style)` — ScanLR DDI wrapper (ordinal 12)
- `wdrvFloodFill(handle, x, y, fillColor)` — scanline flood fill (FB or software)
- `wdrvGetCharWidths(handle, font, firstChar, lastChar, widths)` — GetCharWidth DDI
- `wdrvMeasureText(handle, font, text, length)` — sum char widths for string
- `wdrvPolylineEx(handle, points, count, color, penStyle)` — polyline with pen style
- `wdrvRectangleEx(handle, x, y, w, h, color, penStyle)` — rectangle with pen style
- `wdrvBlitPixels(handle, x, y, w, h, pixels, srcPitch)` — pixel blit (FB or software)
- `wdrvBlitBmp(handle, x, y, bmpPath, setPalette)` — load+display 8bpp BMP
- `wdrvSetCursor(handle, shape)` — built-in cursor shapes (arrow/crosshair/ibeam/hand/none)
- `wdrvSetCursorCustom(handle, hotX, hotY, andMask, xorMask)` — custom 32x32 mono cursor
- `wdrvMoveCursor(handle, x, y)` — move hardware cursor
- `wdrvCreateBitmap(handle, width, height)` — CreateBitmap DDI
- `wdrvDeleteBitmap(handle, bitmap)` — DeleteBitmap DDI
- `wdrvBitmapSetPixels/GetPixels(handle, bitmap, data, size)` — BitmapBits DDI
- `wdrvBitBltFromBitmap/ToBitmap(handle, bitmap, ...)` — BitBlt with bitmap PDEVICE
- `wdrvScreenshot(handle, filename)` — capture screen to PNG (FB or DDI fallback)