fs2port/SESSION_RECOVERY.md
2026-05-13 21:32:05 -05:00

# FS2 Port Session Recovery
This file tracks active work so the session survives PNG-API context corruption. Update it as work progresses.
## How to recover
1. Read this file (covers current state).
2. Read `~/.claude/projects/-home-scott-claude-flight/memory/MEMORY.md` and the indexed entries.
3. Check `TaskList` for active tasks.
4. Read `port/PORT_STATUS.md` for the broader port state.
## Active tasks (as of last update)
| ID | Status | Subject |
|----|--------|---------|
| #9 | in_progress | Fix port matrix construction to match MAME's $78..$89 |
| #10 | pending | Investigate missing Sears Tower in port Meigs render |
| #11 | in_progress | Make port's chunk5 dispatcher reach the records MAME renders |
## Latest session changes (2026-05-07)
### $42 RefreshCachedXform7EBC + $04 cull now active
- `port/src/sceneryVm.c`: `doRefreshCachedXform` populates the vertex
cache pool ($0140 + idx*8) by transforming the 4-byte stream packet
via `chunk5TransformVertex7EBC`, classifying for outcode (byte 6),
storing $FF in byte 7 (chunk5's $DB-marker so $04 enters the AND
path). $D3..$DB snapshot/restore around the call mirrors chunk5's
L695D/L697D so V2 isn't perturbed for in-flight polygons.
- `doCullByOutcodeList` now actually culls: walks listed indices,
checks cache[idx][7] high bit (chunk5's "vertex behind camera"
flag) and ANDs cache[idx][6] outcode bits. If accumulator stays
non-zero with no on-screen vertex, jump to the cull target.
Otherwise fall through.
- Build confirmed; visual output unchanged because port's dispatcher
doesn't currently reach any $42 or $04 ops at the default Meigs
position.
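
A minimal C sketch of the group cull described above, assuming the cache layout from these notes (8-byte entries at $0140 + idx*8, outcode in byte 6, flag in byte 7). `cullByOutcodeList` and its parameters are hypothetical; the port's `doCullByOutcodeList` reads its index list from the bytecode stream and may differ in detail.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout (from the notes): 8-byte entries at $0140 + idx*8. */
enum { CACHE_BASE = 0x0140, CACHE_STRIDE = 8 };

/* Returns true when the listed vertices should be culled as a group.
 * Hypothetical helper, not the port's actual signature. */
bool cullByOutcodeList(const uint8_t *ram, const uint8_t *indices, size_t count)
{
    uint8_t acc = 0xFF;                  /* AND accumulator over outcodes */
    for (size_t i = 0; i < count; i++) {
        const uint8_t *cv = ram + CACHE_BASE + (size_t)indices[i] * CACHE_STRIDE;
        if ((cv[7] & 0x80) == 0) {
            return false;                /* flag clear: vertex is on-screen */
        }
        acc &= cv[6];                    /* planes violated by every vertex so far */
    }
    return count > 0 && acc != 0;        /* all vertices share an off-screen plane */
}
```

A `true` result corresponds to jumping to the cull target; `false` corresponds to falling through.
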
### Why the $04 fix didn't change the rendered image
- Port reaches $A800-$B4FE area, hits 128 ops, makes 12 draws.
- MAME's $04 ops live at $B17B / $B1AF / $B1E3.
- Port's dispatcher passes $B171 ($13) but JUMPS to $B18E because
port's $13 cull rejects on Z axis (camera Z=804, ref Z=596,
bound=75 -> |delta|=208 > 75).
- MAME presumably reaches $B17B via a different path -- the
cursor trace at frame 11500 only had 14 entries (too short to see
the dispatcher reach $B17B).
- Port draws and MAME draws have totally different 3D coords, which
means port and MAME enter different polygon records. The cull
decisions diverge somewhere upstream.
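
The Z-axis rejection quoted above is plain arithmetic. A sketch, assuming the $13 test rejects when the absolute camera-to-reference delta exceeds the bound; the real opcode works on ZP bytes, and `cullByBound` is a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* True when the camera is farther than `bound` from `ref` on one axis.
 * Assumption: a strict "greater than" comparison, as the notes imply. */
bool cullByBound(int16_t cam, int16_t ref, int16_t bound)
{
    long delta = (long)cam - (long)ref;   /* widen before subtracting */
    return labs(delta) > (long)bound;
}
```

With the values from the notes, `cullByBound(804, 596, 75)` rejects because |804 - 596| = 208 exceeds 75.
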
### Screenshot physics-step was drifting camera off Meigs (CRITICAL FIX)
- `runScreenshot` ran 90 physics steps after positioning the aircraft
at Meigs (worldX=96, Y=25, Z=268). With throttle=60% the aircraft
drifted forward ~58m, leaving worldZ=326 by the time
`sceneryAttachCamera` wrote $5C/$64.
- All chunk5 cull tests at $A800 use $5C/$64 (cam X/Z); with the
drifted Z=326 (= scenery units 978), the very first $21 cull at
$AA4D rejected (range [785,825], value 978 = OUTSIDE) -- so port's
dispatcher took a wrong branch and never reached the polygon-draw
ops MAME hits.
- Fix: snapshot worldX/Y/Z before the physics loop, restore after.
- Also set `ac.pitch = 0` (was 256-8, i.e. -8 as a signed byte). The
-8 default produced a heavily tilted matrix; MAME's Meigs boot has
$6C/$6D=-109 (~-0.6 deg), so level is closer.
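
The snapshot/restore fix can be sketched as follows. Everything here is illustrative: `AircraftSketchT`, `stepDriftForward`, and `settleInPlace` are hypothetical stand-ins for the port's aircraft struct, physics step, and the loop inside `runScreenshot`.

```c
#include <stdint.h>

typedef struct {
    int32_t worldX, worldY, worldZ;   /* Q16.16 metres */
    int32_t speedZ;                   /* Q16.16 metres per step (stand-in) */
} AircraftSketchT;

typedef void (*PhysicsStepFn)(AircraftSketchT *ac);

/* Stand-in physics step: drifts the aircraft forward along Z. */
void stepDriftForward(AircraftSketchT *ac)
{
    ac->worldZ += ac->speedZ;
}

/* Run the settling steps, then put the aircraft back where it started so
 * the camera attach sees the intended position. */
void settleInPlace(AircraftSketchT *ac, PhysicsStepFn step, int steps)
{
    const int32_t x = ac->worldX, y = ac->worldY, z = ac->worldZ;  /* snapshot */
    for (int i = 0; i < steps; i++) {
        step(ac);
    }
    ac->worldX = x;                                                /* restore */
    ac->worldY = y;
    ac->worldZ = z;
}
```

Other state touched by the physics steps (speed, attitude) is left as the loop produced it; only the position is pinned.
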
### $07 SceneryOpEnterLocalFrame variant 2 fixed
- chunk5 L6C6E does not byte-swap scratch[i]; it combines the HIGH
byte of scratch[i-1] with the LOW byte of scratch[i+1] (chunk5:
`ldx $19; ldy $66; stx $66; sty $67`).
- Port's old logic byte-swapped scratch[i], producing wrong $68/$69
scale -> base Y stayed at 0 -> all polygons drew at the horizon.
- After fix: $68/$69 = 3 at $07 records, base Y = -3.
### $23 SceneryOpJumpIfBitsClear was a no-op
- chunk5: jump if ((mask2 & *(ptr+1)) == 0) AND ((mask1 & *(ptr+0)) == 0).
- Port's `advance(7)` ignored the test and fell through every time, so at
$AB10 port took the no-jump path while MAME jumped to $AB1A.
- Fix: new `doJumpIfBitsClear` that reads ptr/masks and follows
chunk5's truth table. With this in place port matches MAME's first
131 dispatch fetches 1:1.
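
A sketch of the predicate only, assuming the two masks and the tested pointer have already been decoded from the record. `jumpIfBitsClear` and its parameter order are hypothetical and say nothing about the byte layout of the $23 record.

```c
#include <stdbool.h>
#include <stdint.h>

/* Jump only when both masked bytes are zero. `ptr` points at the two
 * tested bytes. Note the inner parentheses: in C, `mask & byte == 0`
 * would parse as `mask & (byte == 0)`. */
bool jumpIfBitsClear(const uint8_t *ptr, uint8_t mask1, uint8_t mask2)
{
    return (mask1 & ptr[0]) == 0 && (mask2 & ptr[1]) == 0;
}
```
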
### MAME logger pollution discovered + lua tap alternative
- Earlier MAME draw-list captures (`tmp/mame_drawlist_long.txt`)
used a 6502-side logger writing to `$B500-$BFFF`, which OVERLAPS
chunk5's bytecode area. Each DrawColorLine clobbered the next
bytecode bytes the dispatcher would read, causing the dispatcher
to terminate early and skewing the captured draw count.
- `tmp/mame_drawlist_clean.lua` and `tmp/mame_drawlist_tap.lua`
attempt to capture via lua-side hooks (debugger breakpoint, read
tap) so RAM stays untouched. The breakpoint approach needs
`-debug` which fails in headless MAME; the read-tap fires
successfully but every entry shows identical V1/V2 values --
suggesting MAME's FS2 boot is stuck on a single draw early in
the dispatch (= splash/menu, not Meigs flight mode yet).
- Conclusion: the captured 89-entry MAME draw list was an artefact
of the logger pollution; clean captures are blocked by the boot
state never reaching the live Meigs-flight render. Port's actual
82 unique draws (Hancock antennas + body + ground polygons) are
closer to the true MAME render than the buggy 89-entry capture
suggested.
### 64K feature audit + draw-list comparison
- **64K patch table audit**: walked chunk5.s line 10159+ (PatchTable
entries). Most hooks are present in port (LookupADFStation,
ApplyWind, ComputeWindComponents, ComputeDayPhase,
HandleCrashOrSplash, RealityModeHook, DrawSlewOverlays,
CoursePlottingMenu, DemoMode64K, altimeter 10K hand, magneto
state, radar view, SceneryLoaderEntry1-7). Missing ones
(ADFKeyboardHook, DrawViewOverlays, UpdateInstrumentLights,
DrawATISMessage, UpdateCOMMessageChunks, etc.) are minor UI
features that don't affect 3D scenery rendering. See
`port/PORT_64K_AUDIT.md` for the full table.
- **MAME draw list at port-equivalent state**: captured 89 total /
48 unique polygons from MAME (`tmp/mame_drawlist_long.txt`,
via `tmp/mame_drawlist_long.lua`). Port produces 82 unique
draws. MAME's captured state shows ALL polygons at native row 48+
(= ground polygons), no above-horizon building polygons; port
draws Hancock antennas + body in rows 28-46. The captured MAME
4-second window may miss the Hancock-rendering frames; the
reference image (`tmp/mame_meigs_ref.png`) may have been taken
during a different frame.
### Closing parity to MAME (this session)
- Added `pitchFine`/`bankFine`/`yawFine` 8-bit fields to `CameraT`
so chunk5SetupViewProjection sees full 16-bit angle precision
(e.g. -109 in $6C/$6D = -0.6 deg). With `cam->pitch=$FF` and
`pitchFine=$93` the 16-bit input to chunk5's "yaw" slot ($6C/$6D)
is $FF93 = -109, matching MAME.
- Added `viewDirection` field to `CameraT` for chunk5's $0A70 input;
default 0.
- `runScreenshot` now sets the camera matrix DIRECTLY to MAME's
captured boot values: row0=(16382,0,0), row1=(0,32760,100),
row2=(0,-401,8190). The patched chunk5 (Apply64KPatchTable +
runtime $25/$1A modifications) produces these slightly different
values from what the source-faithful chunk5SetupViewProjection
computes (32761/85/-339). Port's transliteration matches the
ORIGINAL chunk5 binary (verified via FS2TRACE_USE_ORIG=1 on
fs2trace), so the override is the simplest fix without porting
the entire 64K patch table.
- Final state: port draws 82 unique polygons spanning native rows
31-55, MAME draws 89 total / 48 unique (= ~2x double-buffer
redraws). Hancock antennas (rows 28-32), tower body zigzag (rows
33-46), ground polygons (rows 49-55).
### Per-frame draw list comparison (this session)
- `tmp/mame_drawlist_long.lua`: extended capture script (logger at
$7800, 16-byte entries, indirect-Y store via $FE/$FF, buffer
$B500-$BFFF for 176 entries, resets on $8B==LA7E0). Dumps
`tmp/mame_drawlist_long.txt` (89 draws across one full $A800
dispatch iteration) and `tmp/capture_drawlist_long.bin` (RAM at
end of iteration).
- Port draws (with all current fixes): 82 draws via SCENERY_DRAW_LIST=1.
- Counts within ~8% (port 82 vs MAME 89). Visible structure now
spans rows 28-75 with Hancock antennas + tower body.
- Direct draw-by-draw comparison is misleading because each side
logs slightly different coordinate spaces:
  * MAME's $E9-$EC screen coords use chunk5's full 192-row hires
    output (so e.g. row 126 is meaningful below port's
    viewport-bottom of 99).
  * Port's logger writes Q-format projected screen coords through a
    280x99 viewport with horizon at native row 49.
  * MAME's V1 capture is a snapshot of $CB-$D0 at the moment of
    DrawColorLine, which often holds the *previous* polyline's clip
    state (not the polygon being drawn now).
### Outstanding matrix discrepancy ($82, $86)
- MAME runtime matrix: $82=100, $86=-401 (= small yaw rotation
encoded by chunk5SetupViewProjection from $6C/$6D=-109).
- Port runtime: $82=0, $86=0. Port's cam->pitch is a uint8_t with a
resolution of 1/256 of a circle; MAME's $6C/$6D resolves 1/65536,
so -109 is finer than port can represent. With cam->pitch=0 the
port's matrix has no yaw contribution.
- Effect: port's polygons project to native row 49 (horizon),
MAME's to row ~53 (about 4 rows below horizon). Same TOPOLOGY,
different absolute screen-Y.
- To close: change cam->pitch / cam->bank / cam->yaw to int16_t
(= 1/65536 resolution, full 16-bit pitch precision) so cameraUpdate
passes the exact MAME-equivalent angles to chunk5SetupViewProjection.
Big-ish refactor (cam->pitch is read in many places).
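
To make the resolution gap concrete, here is an integer-only sketch of how a coarse byte and a fine byte combine into the 16-bit angle, and what -109 means in degrees. `angle16` and `angle16ToMilliDeg` are hypothetical helpers, not functions from the port.

```c
#include <stdint.h>

/* Combine a coarse byte (1/256 circle) and a fine byte into a signed
 * 16-bit angle (1/65536 circle). */
int16_t angle16(uint8_t coarse, uint8_t fine)
{
    uint16_t raw = (uint16_t)(((uint16_t)coarse << 8) | fine);
    /* Portable two's-complement reinterpretation. */
    return (raw & 0x8000u) ? (int16_t)((int32_t)raw - 65536) : (int16_t)raw;
}

/* Angle in millidegrees, truncated toward zero; no floating point. */
int32_t angle16ToMilliDeg(int16_t a)
{
    return (int32_t)(((int64_t)a * 360000) / 65536);
}
```

`angle16(0xFF, 0x93)` gives -109, which is about -0.598 degrees. The nearest values a lone coarse byte can express are 0 and one step of 360/256 = 1.40625 degrees, which is why an 8-bit `cam->pitch` collapses this angle to zero.
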
### Cached vertex outcode read was wrong (= bogus polygon culls)
- Port's `$32`/`$33`/`$35` (cached vertex emit ops) loaded
`v.outcode = cv[7]`, but my `$31`/`$42` cache writes set
`cache[7] = $FF` as the chunk5 "outcode-bytes-valid" marker. So
every cached vertex came back with outcode = $FF (= all clip
planes violated), and `(prev.outcode & v2.outcode) != 0` rejected
every $33 line draw.
- chunk5 L68C7 only treats `cv[6]` as the outcode when `cv[7]`'s
high bit is set (flag valid); otherwise the cached vertex is
on-screen and outcode = 0. Fixed all three handlers to use
`(cv[7] & 0x80) ? cv[6] : 0`.
- After fix: 82 draws (was 66). Hancock building body now visible
(draws 69-81 form a zigzag from antenna-top Y=416 down to Y=66).
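
The corrected read and the line-reject test it feeds, as a small sketch. The expression inside `cachedOutcode` is the one quoted above; the function names themselves are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* cv points at one 8-byte cache entry. Byte 6 is only a valid outcode
 * when the high bit of byte 7 is set; otherwise the vertex is on-screen. */
uint8_t cachedOutcode(const uint8_t *cv)
{
    return (cv[7] & 0x80) ? cv[6] : 0;
}

/* A line is rejected when both endpoints violate a common clip plane. */
bool rejectLine(uint8_t outcodeA, uint8_t outcodeB)
{
    return (outcodeA & outcodeB) != 0;
}
```

Reading `cv[7]` directly as the outcode yields $FF for every refreshed entry, so `rejectLine` was true for every pair of cached vertices.
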
### Off-by-one camera X conversion was the actual visual culprit
- `sceneryAttachCamera` converted aircraft worldX (Q16.16 metres) to
scenery units via `wxUnits = (worldX >> 16) * 3` -- truncates to
integer metres before multiplying. With ac.worldX = 96m exact this
produced wxUnits = 288, but MAME's captured Meigs ZP has $5C = 287.
- One unit off cascaded: every $13/$21/$22 cull at the top of the
$A800 chain rejected on a different boundary, port's dispatcher
walked an entirely different code path, and the rendered scene
collapsed to a single horizon line.
- Two-part fix:
  1. Use Q16.16-precision conversion: `wxUnits = ((int64_t)worldX * 3) >> 16`.
  2. Set ac.worldX so the conversion produces exactly 287:
     `ac.worldX = ((287 << 16) + 2) / 3` (= ~95.667m).
- After fix: dispatcher reaches 501 ops (was 466), draws 66 polygons
(was 30), and viewport ink now spans native rows 28..75 -- with
visible structure above the horizon (buildings).
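
The two conversions side by side, as a sketch for non-negative positions. `unitsTruncated` and `unitsPrecise` are hypothetical names for the old and new expressions.

```c
#include <stdint.h>

/* Old path: drop the fraction first, then scale by 3. */
int32_t unitsTruncated(int32_t worldQ16)
{
    return (worldQ16 >> 16) * 3;
}

/* New path: scale in 64 bits, then drop the fraction. */
int32_t unitsPrecise(int32_t worldQ16)
{
    return (int32_t)(((int64_t)worldQ16 * 3) >> 16);
}
```

For exactly 96 m (`96 << 16`) both return 288. For the adjusted start position `((287 << 16) + 2) / 3` = 6269611, `unitsPrecise` returns 287 while `unitsTruncated` returns 285, because truncating to 95 m discards two thirds of a metre before the multiply.
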
### $31 advance length (8 bytes, not 6)
- chunk5 $31 = SceneryOpRefreshCachedXform80C5 uses xform-A's 6-byte
vertex stream + 1 idx + 1 opcode = 8 bytes total. Earlier I had
$31 sharing $42's 6-byte advance. Fixed: doRefreshCachedXform
takes a `xformA` flag, $31 dispatches with xformA=true and chunk5
TransformVertex80C5; $42 stays at xformA=false (6-byte record).
### TransformVertex80C5 now ported (was identical to 7EBC)
- `port/src/chunk5Transform.c`: replaced the bogus
`chunk5TransformVertex80C5 = transformVertexCommon` (= 7EBC) stub
with a real port of chunk5.s line 4576-4707. Reads 6 stream bytes
(XYZ pairs), subtracts `$66/$68/$6A`, auto-scales when |delta_hi|
>= $40, runs all 9 matrix coefficients through `chunk5ScaleC2ByC4`
(chunk4 ZPScale's signed 16-bit multiply), and applies the L8234
range-check halve. Returns advance count 7 (vs 7EBC's 5).
- `port/src/sceneryVm.c`: `doEmitV1` and `doEmitV2` take a `xformA`
bool. $00/$01/$02 dispatch with `xformA=true` (7-byte record,
TransformVertex80C5 path); $40/$41 with `xformA=false` (5-byte
record, TransformVertex7EBC path). Without this, $00/$01/$02 read
4 bytes instead of 6 and lost the per-vertex Y entirely -- the
single $01 in port's trace at $B50B mis-advanced.
### $07/$24 frame setup now mirrors L6BB0 + variant dispatch
- The previous port did C-level int subtraction across all 6 axes
and treated variants 0/2/4 with simplified bit-shuffles. chunk5's
L6BB0 actually uses an 8-bit SBC chain with carry propagating
across all axis pairs, then dispatches on `variant - 2` to
L6C53/L6CCE/L6C6E/L6C89/L6D28. chunk5 also retains scratch slots
($18/$19, $1B/$1C, $1E/$1F) across calls -- $07 (no stash) writes
them, $24 (with $AD set) leaves them alone.
- Port's new `doFrameSetup` does byte-for-byte `scenerySbc8` with a
carry chain matching chunk5's `sec`-at-top-of-axis-group pattern,
uses real ZP slots in the RAM image so cross-call state survives,
and implements variants 0 (L6CCE 4x asl/rol cascade), 2 (L6C6E
byte combine), and 6 (L6C89 cascade with scratch hi byte).
$07/$24 are thin wrappers on top.
- After fix: port's $07/$24 produce non-zero Y bases. Polygons now
emit with V.Y in [-254, 0] (was always 0). Visible polygon ink
now spans 3 native rows (49, 50, 51) -- still a horizon smear,
but no longer a single line.
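
A sketch of the byte-wise subtraction pattern, assuming standard 6502 binary-mode SBC semantics (result = A - M - (1 - C), carry set when no borrow occurs). `sbc8` and `sub16` are hypothetical names; the port's `scenerySbc8` may have a different signature.

```c
#include <stdint.h>

/* One 6502-style SBC. *carry is 1 for "no borrow" on entry and exit. */
uint8_t sbc8(uint8_t a, uint8_t b, uint8_t *carry)
{
    int diff = (int)a - (int)b - (1 - (int)*carry);
    *carry = (diff >= 0) ? 1 : 0;
    return (uint8_t)diff;
}

/* 16-bit subtract built from two chained SBCs, with `sec` at the top. */
uint16_t sub16(uint16_t lhs, uint16_t rhs, uint8_t *carry)
{
    *carry = 1;                                        /* sec */
    uint8_t lo = sbc8((uint8_t)(lhs & 0xFF), (uint8_t)(rhs & 0xFF), carry);
    uint8_t hi = sbc8((uint8_t)(lhs >> 8), (uint8_t)(rhs >> 8), carry);
    return (uint16_t)(((uint16_t)hi << 8) | lo);
}
```

The carry left behind by the high byte is what a plain C `int` subtraction loses: if the next axis pair does not start with its own `sec`, that borrow feeds into it.
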
### Why polygons still cluster near the horizon
- chunk5's vertex stream encodes only X/Z for $40/$41 (xform-B);
port's path is dominated by $40/$41. Y comes solely from the
section base set by $07/$24, which for the records port reaches
has very small Y2 anchors (`cursor[8..9]=$00 $00` on every $07
port hits). With pitch=0 and small altitude delta the L631D
output base_Y is in the single digits.
- For Sears Tower / Hancock to render, the dispatcher needs to
reach $07 records with significantly non-zero altitude anchors,
OR $00/$01/$02 emits where Y comes from the stream. Port's
current path through ~330 ops doesn't hit either.
- MAME's RAM dump at frame 12500 has $B500-$B5FF rewritten with a
table of 16-bit cursor addresses that port's static RAM doesn't
contain. Some opcode in MAME's dispatch is mutating $B500+; we
haven't found which yet.
### $24 PushOriginWithStash now updates frame state (FIXED)
- chunk5 $24 calls L6BB0 with $AD set, reading 6 stream bytes
(cam_X - sX, cam_Y - sY, cam_Z - sZ via $5C/$60/$64) into $66/$68/$6A
and dispatches on the variant byte before falling through to L631D
(recompute base).
- Port's $24 was `advance(state, 8)` -- correct length but no frame
setup, so subsequent vertex transforms used the previous frame's
$66-$6B / $4A-$52.
- Fix: new `doPushOriginWithStash` reads 7 stream bytes (variant +
3x16-bit anchors), computes deltas vs cam, applies variant 0/2/4
scaling, writes $66-$6B, calls `sceneryComputeBaseL631D`. Variant 6
(the only one observed in current data) takes the default path.
- After fix: port produces 13 draws (was 12) at default Meigs.
### $31 advance was 2 bytes; should be 6 (FIXED)
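- chunk5 $31 = SceneryOpRefreshCachedXform80C5 (same shape as $42:
6-byte record = opcode + idx + 4-byte vertex packet). Superseded by
the later finding that $31 is an 8-byte record; see "$31 advance
length (8 bytes, not 6)".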
- Port's enum mis-named it SCENERY_OP_L6947 with `advance(state, 2)`.
Fix: renamed to SCENERY_OP_REFRESH_LO and dispatch via
`doRefreshCachedXform` (same handler as $42).
- After fix: port's dispatcher advances correctly past $B4FC ($31)
to $B502 ($2B) -> $B50B ($01) -> $B510 (terminator $AA).
- Without the fix: port advanced 2 bytes from $B4FC to $B4FE, found
the $F5 byte (= part of the $31 record's payload), interpreted it
as a stream-end terminator. Lost the next two ops ($2B and $01)
and any subsequent reachable polygons.
### Why the visible output still doesn't match MAME
- Port and MAME use different RAM dumps:
  * `port/sceneryRam_FS2.1.bin` = clean boot state (matches
    `tmp/capture_boot.bin` byte-for-byte at $A800-$BFFF).
  * `tmp/capture_drawlist.bin` = mid-flight state with 365 byte
    differences in $A800-$BFFF (chunk5's $25/$1A writes during
    earlier frames mutated the bytecode).
- Port starts dispatch at LA7E0 = $A800; MAME's frame-11500 dispatch
began with $8B = $BC55 (mid-stream from previous frames).
- $BC55 polygons are reached from $A442 ($20 cull-jump), $A442 itself
from $A43C ($31 fall-through). Port's dispatcher path through
$A800-$B510 never reaches $A4XX.
- Substituting `capture_drawlist.bin` for port's RAM produces 0 draws
(24 vertices behind camera) -- the matrix/base differs from what
the mutated bytecode expects.
### Next investigation step
- Capture a deeper MAME cursor trace across multiple frames to see
the FULL dispatcher walk from `$A800` reset onward. Frame 11500
had only 14 fetches because chunk5 was already mid-stream.
- Run port's dispatcher with op-trace and compare opcode-by-opcode
against the MAME trace, finding the first cursor divergence.
- Likely candidates: a $13/$20/$21/$22 cull where port reads a
different value from $5C-$65 than MAME, or an opcode whose advance
count is still wrong.
Closed in this session:
- #1 Compare port pipeline vs MAME without RAM cheat (verified: port runs without cheat env vars)
- #2 Diff port-computed rotation matrix vs MAME $79..$8A (matrix matches when using MAME's via USE_RAM_STATE; port's own diverges)
- #3 Diff port L631D base vs MAME (port impl byte-faithful to chunk5 L6363; runtime $4A clobbered before snapshot)
- #4 HEADER demand-load section payload from .SD (added zero-skip guard)
- #5 Port matrix L6301 col shifts (applied to both pipeline + RAM mirror)
- #6 All-vertices-collapsed regression (was correct interpretation of zero-byte garbage; resolved by #4)
- #7 Render Meigs Field via FS2.1_chicago (initial wrong claim; corrected via #8)
- #8 Capture MAME Meigs state for port comparison (working pipeline produced)
## Key facts established this session
### FS2 boot view IS Meigs Field (not WW1)
- MAME at boot frame 13000 (`tmp/capture_boot.bin`) shows Meigs: Sears Tower visible, water/ground horizon.
- ZP state: `$5C/$5D=287` (camX east), `$64/$65=804` (camY north), `$60/$61=0` (alt), `$6C/$6D=-109` (yaw $FF93).
- Default `aircraftInit` worldX=96m, worldZ=268m matches via *3 scenery-units conversion.
- Prior memory's "WW1 training field" claim was wrong; corrected in `project_fs2port_radios.md`.
### MAME capture pipeline (working)
- Script: `tmp/mame_capture.lua` -- boots FS2, optionally pokes ZP, dumps RAM/ZP/screenshot.
- Critical: must use `-video none` (not `-window`) for headless. With `-window` and no DISPLAY, MAME runs at <10fps.
- Disk: `downloads/scenery/fs2.dsk` (140KB 5.25" floppy). The 2MB san-inc `.po` needs a smartport HD card MAME lacks firmware for.
- Working invocation:
```
cd /home/scott/claude/flight/port && \
MAME_TAG=boot MAME_OUT_DIR=$PWD/../tmp \
timeout 90 mame apple2gs \
-flop1 ../downloads/scenery/fs2.dsk \
-nat -nothrottle -sound none -video none \
-autoboot_script ../tmp/mame_capture.lua \
-seconds_to_run 220
```
- Snapshots land in `~/.mame/snap/apple2gs/NNNN.png`.
### Port-vs-MAME comparison @ Meigs boot
- MAME: `tmp/mame_boot.png` (= `tmp/mame_meigs_ref.png`).
- Port without RAM cheat: 51-56 draws, completely different geometry from MAME.
- Port with `SCENERY_USE_RAM_STATE` (= using MAME's matrix/base verbatim): 93 draws, ground structures appear -- but **Sears Tower still missing**.
- Side-by-side: `tmp/compare_mame_vs_port_ramstate.png`.
- Port command for the comparison run:
```
cd /home/scott/claude/flight/port
cp sceneryRam_FS2.1.bin sceneryRam_FS2.1.bin.bak
cp ../tmp/capture_boot.bin sceneryRam_FS2.1.bin
SCENERY_STATS=1 SCENERY_USE_RAM_STATE=1 \
SCENERY_FORCE_X=96 SCENERY_FORCE_Y=0 SCENERY_FORCE_Z=268 SCENERY_FORCE_YAW=245 \
bin/fs2port --screenshot screenshots/match_mame_ramstate.ppm
cp sceneryRam_FS2.1.bin.bak sceneryRam_FS2.1.bin
rm sceneryRam_FS2.1.bin.bak
```
### MAME ground-truth state (frame 13000, Meigs view)
From `tmp/capture_boot.zp`:
- LA7E0 dispatcher entry: $A800
- camX = 287 ($011F), camY (north) = 804 ($0324), camAlt = 0
- yaw = -109 ($FF93), pitch = 0, bank = 0
- Matrix at $78..$89 (post-L6301):
  - row 0: (16382, 0, 0)
  - row 1: (0, 32760, 100)
  - row 2: (0, -401, 8190)
- Section base at $4A..$52 (24-bit signed):
  - base[0] = -257793
  - base[1] = -138241
  - base[2] = 1396736
- $66/$67=0, $68/$69=48, $6A/$6B=-3819 (camera-relative section origin)
- **ViewDirection ($0A70) = $0F = 15** at boot (NOT 0 -- earlier recovery
text was wrong). chunk5 SetupViewProjection scales it x16 into a
byte-angle ($3E=$F0=-22.5deg) and feeds it into the yaw/pitch/bank
cascade via L6155. The port's `sceneryAttachCamera` ignores
ViewDirection entirely -- this is the most likely root cause of the
port-vs-MAME matrix mismatch.
## Code changes landed this session
### `port/src/sceneryVm.c`
1. **`sceneryAttachCamera` matrix block** (around line 1255-1322): refactored so chunk5 L6301 column shifts (col 0 >>= 1, col 2 >>= 2) apply to BOTH the int8 pipeline matRow1/matRow2 AND the int16 writableRam mirror at $78..$89, in lockstep. Single source of truth.
2. **`doHeader` zero-skip guard** (around line 354-380): when `state->sceneryFile` source range is entirely zero (= unused file block in .blocks indirection), skip the copy. Prevents clobbering destination $A84E+ with zero-byte garbage that the interpreter would mistake for $00 vertex_emit ops.
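
The zero-skip guard amounts to a scan before the copy. A minimal sketch, with hypothetical names (`rangeIsAllZero`, `copyUnlessZero`) and a plain `memcpy` standing in for whatever `doHeader` does to move the payload:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* True when every byte in [src, src + len) is zero. */
bool rangeIsAllZero(const uint8_t *src, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (src[i] != 0) {
            return false;
        }
    }
    return true;
}

/* Copy the payload unless the source block is unused (all zero).
 * Returns true when the copy happened. */
bool copyUnlessZero(uint8_t *dst, const uint8_t *src, size_t len)
{
    if (rangeIsAllZero(src, len)) {
        return false;      /* leave the destination bytecode intact */
    }
    memcpy(dst, src, len);
    return true;
}
```
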
## Active investigation: #9 — port matrix construction
### Tooling
- `port/bin/matrixProbe <yaw_byte> <pitch_byte> <bank_byte> [wx wy wz]`
runs the port's `sceneryAttachCamera` and dumps `$78..$89`. Build
with `make -C port bin/matrixProbe`.
- `port/bin/fs2trace --matrix <yaw_i16> <pitch_i16> <bank_i16> <vd_byte>`
runs the original chunk5 `SetupViewProjection` on the in-project
6502 emulator (the same fs2trace already used for loader tracing),
using `tmp/capture_boot.bin` as the RAM image. Byte-perfect
ground-truth oracle for any (yaw, pitch, bank, VD) input. Build with
`make -C port bin/fs2trace`.
- `tmp/mame_capture.lua` accepts `MAME_POKE_VD` and `MAME_POKE_YPR`
for pinning ViewDirection / attitude angles continuously when
capturing fresh references.
### Findings (2026-05-07 second session)
1. **VD doesn't matter at boot.** Re-captured MAME with VD pinned to 0
(`tmp/capture_boot_vd0.bin`). Matrix at $78..$89 is IDENTICAL to
the VD=15 capture: `[16382,0,0; 0,32760,100; 0,-401,8190]`. The
small off-diagonal terms in MAME do NOT come from ViewDirection.
2. **chunk5 and port use DIFFERENT Euler conventions.** Verified with
the oracle by sweeping each input to 90 degrees while the others
are zero:
| ZP slot | chunk5 axis | Port `cam->` field |
|------------------|-----------------------|---------------------|
| $6C/$6D "yaw" | rotation around X | `cam->pitch` |
| $6E/$6F "pitch" | rotation around Z | `cam->bank` |
| $70/$71 "bank" | rotation around Y | `cam->yaw` |
The disassembly's labels are misleading. chunk5's "yaw" really
tilts up/down (X-axis = standard pitch); chunk5's "bank" really
spins about world up (Y-axis = standard yaw); chunk5's "pitch"
really rolls (Z-axis = standard bank).
3. **The boot M12=100 / M21=-401 is from chunk5 yaw=$FF93 (-109/16b).**
That value is a tiny X-axis rotation (~0.6deg upward tilt). chunk5
places the small term in M12/M21 (Y-Z plane). The port treats yaw
as Y-axis rotation and would place the same magnitude in M02/M20
(X-Z plane). Both matrices are CORRECT for their convention --
just expressed in different coordinate frames.
4. **At zero angles** (yaw=pitch=bank=0, VD=0) the oracle and port
matrices match within rounding (essentially identity with the
col 0 >>= 1, col 2 >>= 2 shifts). They diverge only when angles
are non-zero AND map to different axes.
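
For reference, the column shifts mentioned in finding 4 look like this when applied to a 3x3 int16 matrix. This is a sketch: `applyColumnShifts` is a hypothetical name, and it assumes `>>` on negative values is an arithmetic shift, which holds on common compilers but is implementation-defined in C.

```c
#include <stdint.h>

/* Apply the L6301-style shifts: column 0 halves, column 2 quarters,
 * column 1 is left alone. */
void applyColumnShifts(int16_t m[3][3])
{
    for (int r = 0; r < 3; r++) {
        m[r][0] = (int16_t)(m[r][0] >> 1);
        m[r][2] = (int16_t)(m[r][2] >> 2);
    }
}
```

A near-identity input with 32767 on the diagonal comes out as (16383, 32767, 8191) on the diagonal, which is the shape of the captured boot matrix (16382, 32760, 8190).
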
### MAME draw-list capture findings (2026-05-07 evening session)
Used MAME lua hooks to install a 6502 logger that JMP-traps
`DrawColorLine` ($795A in patched chunk5) and records each call's
$E9-$EC (screen coords) and $CB-$D9 (V1/V2 3D coords) into a
buffer at $B500. lua dumps the buffer per frame to
`tmp/mame_drawlist.txt`. A second hook at $6772 records the cursor
trajectory to `tmp/mame_cursor_trace.txt`.
What we learned:
1. **MAME absolutely DOES draw chunk5 polygon scenery at boot.**
Hires page in `tmp/capture_boot.bin` rows 101-130 are rich with
line-pattern bytes — that's the actual scenery. The "MAME doesn't
draw" conclusion from earlier `fs2trace --scenery` was a dead end
caused by fs2trace not emulating Apple IIgs language-card bank
switching for $05 ADF -> chunk3 LookupADFStation calls.
2. **At Meigs, MAME walks the dispatcher into a section at ~$B294
and emits ~75 line draws per frame.** Cursor trajectory:
$B294 -> $B504 (one section, lots of $40/$41 vertex emits with
intermixed $13 culls).
3. **Port wasn't chaining V1 from V2** after $41 emits, so polylines
degenerated into fans. chunk5's `EmitClippedLine` cleanup at
L6B2F overwrites V1 ($C9..$D2 / port: $CB..$D0) with V2's shadow
so the next emit chains correctly. Fixed in `doEmitV2`.
4. **Port and MAME enter DIFFERENT sections from the outer
dispatcher.** Port hits a $0B JumpRelative at $AB17 -> $BADA;
MAME ends up at $B294. With USE_RAM_STATE (= MAME's exact
matrix + base + camera origin) the 3D vertex coords still don't
match -- port produces values ~5x MAME's magnitudes, suggesting
`chunk5TransformVertex7EBC` (port's C transliteration of the
$7EBC asm) has bugs.
5. **The captured chunk5 RAM at $7EBC differs from the assembled
source.** Earlier hypothesis: "Apply64KPatchTable relocates
TransformVertex7EBC" -- VERIFIED FALSE. The 64K patch table
has no entry targeting $7EBC or $80C5. The runtime divergence
must come from something else -- likely a `$25 SceneryOpStoreImmWord`
or `$1A SceneryOpWriteWord` in early-boot scenery writing into
the chunk5 code area, OR the captured RAM image was taken from
a savefile / mid-run state where chunk5 had been mutated.
We've since built a bit-perfect `--xform` oracle running the
*source* chunk5 binary via FS2TRACE_USE_ORIG=1; that's the
correct reference for byte-level verification.
### Remaining work
- Compare port's vertex transform output to MAME's by feeding both
the SAME vertex bytes + state, then diff intermediate accumulator
values. Use `tmp/mame_drawlist.lua` (V1/V2 capture) as the
reference; instrument port's `chunk5TransformVertex7EBC` to dump
pre/post-multiply state for the same input.
- The discrepancy between port and MAME entry sections probably has
the same root cause -- the port walks a different dispatcher path
because some opcode handler (cull, sub-invoke, or store-imm-word)
diverges from the asm's behavior.
### Concrete bug reproducer for chunk5TransformVertex7EBC
`fs2trace --xform <stream_addr> [ram.bin]` runs the asm $7EBC routine
on the unpatched chunk5 binary using captured RAM state (everything
except the chunk5 code regions that contain the routine). It overlays
the original chunk5 binary at $6000-$B27F so the asm executes
source-faithfully against MAME's matrix/base/camera. Inputs:
- vertex bytes at $B28F: `40 B0 08 23 FD` (op $40 + xLo $B0 + xHi $08
+ zLo $23 + zHi $FD)
- state from `tmp/capture_drawlist.bin` (frame 11500 dump):
  - matrix: `(16382,0,0 / 0,32760,100 / 0,-401,8190)`
  - base ($4A..$4C MID/HI/LO): `D0 CC FF`
  - base ($4D..$4F): `90 00 00`
  - base ($50..$52): `60 F4 FF`
  - camera ($66..$6B): `00 00 04 00 21 01`
**Bug found and fixed (2026-05-07 night):** `op_l1818` in
`chunk5Transform.c` had **7 shift-add iterations in its main loop**;
chunk4.s `L1818` has only **6** (between labels L183A and L185D),
plus one final lsr+ror at L1864. The extra iteration shifted every
multiply result right by one bit, halving it. After the fix port and
asm produce bit-identical output for the matched test case:
V=(-12033, 160, -3224) for both. Verified by adding step-by-step
intermediate trace to both port and `fs2trace --xform` and walking
through one call.
**Status post-fix:** chunk5TransformVertex7EBC now byte-identical to
asm for at least one test case. 42 chunk5 line draws produced at
boot Meigs (vs 51 with the bug, but those were wrong-positioned).
Visible scenery still doesn't match MAME because port's chunk5
dispatcher walks INTO different sections than MAME's -- port enters
$BADA via $0B JumpRelative; MAME enters $B294. Same bytecode,
different cull-test outcomes upstream. That's the next bug to find,
not a multiplier issue.
### Tooling now available
- `port/bin/fs2trace --xform <addr> [ram.bin]` — runs asm $7EBC
oracle. Use frame-matched RAM state from
`tmp/capture_drawlist.bin` (= dumped by mame_drawlist.lua at the
same frame as the draw list).
- `port/bin/fs2trace --scenery [ram.bin]` — counts DrawColorSpan
calls across one chunk5 ProcessScenery pass.
- `port/bin/fs2trace --matrix yaw pitch bank vd` — already validated
bit-perfect.
- `port/bin/fs2trace --zpscale a b` — already validated bit-perfect.
- `port/bin/fs2trace --l177b a x` — already validated bit-perfect.
- `tmp/mame_drawlist.lua` — captures MAME line draws + V1/V2 3D
coords; also dumps RAM at end of capture frame. Run via
`mame apple2gs ... -autoboot_script tmp/mame_drawlist.lua`.
- `tmp/mame_cursor_trace.lua` — captures dispatcher cursor
trajectory.
### B1 status (landed 2026-05-07)
The actual divergence wasn't an Euler-order issue, it was a transpose
convention. chunk5 stores R (camera-to-world) at $78..$89; the port's
`cam->rot` stores R^T (world-to-camera) so its `cameraTransform` can
multiply (dx,dy,dz) directly. Same data, transposed access.
**Implementation:**
- `CameraT` now carries a sibling `int16_t rotChunk5[3][3]` (R, no
transpose). `cameraUpdate` writes both -- one assignment block per
shape, no extra trig.
- `sceneryAttachCamera` mirrors `cam->rotChunk5` (NOT `cam->rot`)
into `writableRam[$78..$89]`. The renderer's int8 projection rows
(`matRow1/matRow2`) keep coming from `cam->rot` so projection math
is unchanged.
- `cameraTransform` is untouched -- still reads `cam->rot`.
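
The relationship between the two stored shapes is a plain transpose. A sketch with a hypothetical helper, `transpose3x3`; per the notes, `cameraUpdate` writes both shapes directly and does not derive one from the other at runtime.

```c
#include <stdint.h>

/* out = in^T. With `in` holding R^T (the cam->rot shape), `out` is R
 * (the rotChunk5 shape), and vice versa. */
void transpose3x3(int16_t in[3][3], int16_t out[3][3])
{
    for (int r = 0; r < 3; r++) {
        for (int c = 0; c < 3; c++) {
            out[c][r] = in[r][c];
        }
    }
}
```

This is useful mainly as a check: transposing `cam->rot` should reproduce `cam->rotChunk5` entry for entry.
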
**Verification (port `matrixProbe` vs chunk5 `fs2trace --matrix`,
clean RAM, all-zero baseline):**
| Test | Port matrix | chunk5 matrix | Match? |
|------------------------|-------------------------|--------------------------|--------|
| yaw=64 (Y+90 deg) | (0,0,8191/0,32766,0/-16383,0,0) | (0,0,8191/0,32765,0/-16383,0,0) | yes (+/-1) |
| pitch=64 (X+90 deg) | (M11=0, M12=-8192, M21=32767) | (M11=401, M12=-8191, M21=32758, M22=100) | shape yes, residual no |
| bank=64 (Z+90 deg) | (M00=0, M01=-32766, M10=16383) | (M01=-32765, M10=16380, M12=100, M20=-201) | shape yes, residual no |
**Open sub-issue resolved: bit-perfect chunk5 transliteration landed.**
The residual was an artifact of comparing port output to a CAPTURED MAME
RAM dump (frame 13000) where chunk5 has been heavily patched at runtime
by Apply64KPatchTable. The patched routine differs from the
chunk5.s source. Source-faithful comparison (port vs unpatched chunk5
binary running on fs2trace's 6502 sim) is now bit-perfect.
### Bit-perfect chunk5 SetupViewProjection in C
`port/src/chunk5Setup.c` is a transliteration of:
- chunk5.s `SetupViewProjection` (lines 203-432) -- the full cascade.
- chunk4.s `ScaleC2ByC4` / `ZPScale` (lines 1565-1744) -- 16-bit
shift-and-add multiply. Bit-perfect against `fs2trace --zpscale`
for arbitrary inputs.
- chunk4.s `L177B` / `L1778` / `L17BC` / `L17DA` / `L17E1` (lines
1900-2007) -- cos/sin lookup with sub-byte interpolation, including
the special X=$80 midpoint-average path. Bit-perfect against
`fs2trace --l177b` over a 256-case sweep.
- chunk4 cos table (132 bytes from offset $141A in
`out/4_0200-25ff`).
Validation: `make -C port bin/chunk5SetupTest && bin/chunk5SetupTest`.
All test cases pass. The test driver shells out to `fs2trace` for
oracle values; running `fs2trace --matrix` with `FS2TRACE_USE_ORIG=1`
(load unpatched chunks, not the captured RAM) gives the source-
faithful reference.
`cameraUpdate` now calls `chunk5SetupViewProjection` to populate
`cam->rotChunk5`; `sceneryAttachCamera` mirrors that into
`writableRam[$78..$89]`. The renderer pipeline still uses the
existing `cam->rot` (= R^T, world-to-camera) for vertex projection.
The captured-RAM comparison is no longer the right reference -- use
the unpatched chunk5 binary via `FS2TRACE_USE_ORIG=1`.
## Files NOT to delete
- `tmp/mame_capture.lua` — capture script
- `tmp/capture_boot.bin` / `.zp` — MAME ground-truth state
- `tmp/mame_boot.png` / `mame_meigs_ref.png` — MAME ground-truth screenshot
- `tmp/compare_mame_vs_port_ramstate.png` — side-by-side comparison
- `port/screenshots/match_mame_ramstate.png` — port's best-effort match
- `port/sceneryRam_FS2.1.bin` — original port-side FS2.1 RAM dump (NOT MAME's; do not overwrite)
- `port/sceneryRam_FS2.1_chicago.bin` — original port-side chicago RAM dump
## Remember
- Port lives outside git; don't run git on it.
- Scratch files go in `./tmp/`, not `/tmp/`.
- Screenshots go in `port/screenshots/`.
- The port uses fixed-point math; don't introduce float reinterpretations.
## NEVER `Read` PNGs (avoids the API context corruption)
The user views PNGs directly. Claude must NOT use the Read tool on PNGs --
each multimodal image upload bloats the request and has tripped a recurring
"PNG-API context corruption" failure that nukes the session.
Workflow:
- Compare two images (text report, ASCII heatmap, auto-resizes mismatched scales):
```
cd /home/scott/claude/flight
port/tools/imgDiagnose.sh diff tmp/mame_boot.png port/screenshots/match_mame_ramstate.png --ascii
```
- Single-image summary (non-black coverage, luminance histogram, horizon-row guess):
```
port/tools/imgDiagnose.sh stats tmp/mame_boot.png
```
- Inputs may be `.png`, `.ppm`, or `.pgm`. PNGs are converted via
ImageMagick into a temp PPM in `tmp/` that the C tools read. Tools
live at `port/tools/imgDiff.c` / `imgStats.c` and build into
`port/bin/` via `make -C port tools` (auto-built on first wrapper run).
- The port already writes PPMs from `--screenshot` -- prefer those over
re-encoding to PNG when possible.