fs2port/SESSION_RECOVERY.md
2026-05-13 21:32:05 -05:00

FS2 Port Session Recovery

This file tracks active work so the session survives PNG-API context corruption. Update it as work progresses.

How to recover

  1. Read this file (covers current state).
  2. Read ~/.claude/projects/-home-scott-claude-flight/memory/MEMORY.md and the indexed entries.
  3. Check TaskList for active tasks.
  4. Read port/PORT_STATUS.md for the broader port state.

Active tasks (as of last update)

ID   Status       Subject
#9   in_progress  Fix port matrix construction to match MAME's $78..$89
#10  pending      Investigate missing Sears Tower in port Meigs render
#11  in_progress  Make port's chunk5 dispatcher reach the records MAME renders

Latest session changes (2026-05-07)

$42 RefreshCachedXform7EBC + $04 cull now active

  • port/src/sceneryVm.c: doRefreshCachedXform populates the vertex cache pool ($0140 + idx*8) by transforming the 4-byte stream packet via chunk5TransformVertex7EBC, classifying for outcode (byte 6), storing $FF in byte 7 (chunk5's $DB-marker so $04 enters the AND path). $D3..$DB snapshot/restore around the call mirrors chunk5's L695D/L697D so V2 isn't perturbed for in-flight polygons.
  • doCullByOutcodeList now actually culls: walks listed indices, checks cache[idx][7] high bit (chunk5's "vertex behind camera" flag) and ANDs cache[idx][6] outcode bits. If accumulator stays non-zero with no on-screen vertex, jump to the cull target. Otherwise fall through.
  • Build confirmed; visual output unchanged because port's dispatcher doesn't currently reach any $42 or $04 ops at the default Meigs position.
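A minimal sketch of the $04 cull decision described above, assuming the 8-byte cache entry layout quoted in this section (byte 6 = outcode, byte 7 = flag). The function name, the enum names and the flat `cache` pointer are hypothetical; the real doCullByOutcodeList reads its index list from the bytecode stream.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: 8 bytes per cached vertex, as in the $0140 + idx*8 pool. */
enum { CACHE_STRIDE = 8, CACHE_OUTCODE = 6, CACHE_FLAG = 7 };

/* Returns true when the op should jump to its cull target: every listed
 * vertex has its flag bit set and all of them share at least one outcode bit. */
bool cullByOutcodeList(const uint8_t *cache, const uint8_t *indices, size_t count) {
    uint8_t acc = 0xFF;
    for (size_t i = 0; i < count; i++) {
        const uint8_t *cv = cache + (size_t)indices[i] * CACHE_STRIDE;
        if ((cv[CACHE_FLAG] & 0x80) == 0) {
            return false;              /* unflagged vertex: treat as on-screen, fall through */
        }
        acc &= cv[CACHE_OUTCODE];      /* keep only the planes all vertices violate */
    }
    return count > 0 && acc != 0;
}
```

If any vertex clears the shared bit, or any vertex is unflagged, the op falls through rather than culling.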

Why the $04 fix didn't change the rendered image

  • Port reaches $A800-$B4FE area, hits 128 ops, makes 12 draws.
  • MAME's $04 ops live at $B17B / $B1AF / $B1E3.
  • Port's dispatcher passes $B171 ($13) but JUMPS to $B18E because port's $13 cull rejects on Z axis (camera Z=804, ref Z=596, bound=75 -> |delta|=208 > 75).
  • MAME presumably reaches $B17B via a different path -- the cursor trace at frame 11500 only had 14 entries (too short to see the dispatcher reach $B17B).
  • Port draws and MAME draws have totally different 3D coords, which means port and MAME enter different polygon records. The cull decisions diverge somewhere upstream.

Screenshot physics-step was drifting camera off Meigs (CRITICAL FIX)

  • runScreenshot ran 90 physics steps after positioning the aircraft at Meigs (worldX=96, Y=25, Z=268). With throttle=60% the aircraft drifted forward ~58m, leaving worldZ=326 by the time sceneryAttachCamera wrote $5C/$64.
  • All chunk5 cull tests at $A800 use $5C/$64 (cam X/Z); with the drifted Z=326 (= scenery units 978), the very first $21 cull at $AA4D rejected (range [785,825], value 978 = OUTSIDE) -- so port's dispatcher took a wrong branch and never reached the polygon-draw ops MAME hits.
  • Fix: snapshot worldX/Y/Z before the physics loop, restore after.
  • ac.pitch = 0 (was 256-8 = 248, i.e. -8 as a signed byte). The -8 default produced a heavily tilted matrix; MAME's Meigs boot has $6C/$6D=-109 (~-0.6 deg), so a level attitude is closer.
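The snapshot/restore fix and the $21 range check can be sketched as below. This is an illustration only: AircraftSketchT, sketchPhysicsStep and the other names are hypothetical stand-ins, positions are plain integer metres rather than the port's Q16.16, and the drift model is made up.

```c
#include <stdint.h>

typedef struct { int32_t worldX, worldY, worldZ; int throttlePct; } AircraftSketchT;

/* Stand-in for the real physics step: just moves the aircraft forward. */
void sketchPhysicsStep(AircraftSketchT *ac) {
    ac->worldZ += ac->throttlePct / 60;
}

/* Run the settle loop but pin the position, as the fix above describes. */
void settleWithoutDrift(AircraftSketchT *ac, int steps) {
    const int32_t x = ac->worldX, y = ac->worldY, z = ac->worldZ;
    for (int i = 0; i < steps; i++) {
        sketchPhysicsStep(ac);
    }
    ac->worldX = x;
    ac->worldY = y;
    ac->worldZ = z;
}

/* The $21 cull range quoted above, in scenery units (metres * 3). */
int insideCullRange(int32_t metres) {
    const int32_t units = metres * 3;
    return units >= 785 && units <= 825;
}
```

With Z=268 the converted value is 804 and lies inside [785,825]; with the drifted Z=326 it is 978 and falls outside, which is the wrong branch described above.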

$07 SceneryOpEnterLocalFrame variant 2 fixed

  • chunk5 L6C6E does not byte-swap scratch[i]; it combines the HIGH byte of scratch[i-1] with the LOW byte of scratch[i+1] (chunk5: ldx $19; ldy $66; stx $66; sty $67).
  • Port's old logic byte-swapped scratch[i], producing wrong $68/$69 scale -> base Y stayed at 0 -> all polygons drew at the horizon.
  • After fix: $68/$69 = 3 at $07 records, base Y = -3.

$23 SceneryOpJumpIfBitsClear was a no-op

  • chunk5: jump if ((mask2 & *(ptr+1)) == 0) AND ((mask1 & *(ptr+0)) == 0).
  • Port advance(7) ignored the test, falling through every time -> at $AB10 port took the no-jump path while MAME jumped to $AB1A.
  • Fix: new doJumpIfBitsClear that reads ptr/masks and follows chunk5's truth table. With this in place port matches MAME's first 131 dispatch fetches 1:1.
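The $23 truth table is small enough to sketch directly. The signature below is an assumption (the real doJumpIfBitsClear decodes ptr and the masks from the record itself); note the explicit parentheses, since in C `==` binds tighter than `&`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when the op should take its jump: both masked bytes are zero.
 * `ram` is assumed to be a full 64 KiB image so ptr+1 can wrap safely. */
bool jumpIfBitsClear(const uint8_t *ram, uint16_t ptr, uint8_t mask1, uint8_t mask2) {
    const uint8_t lo = ram[ptr];
    const uint8_t hi = ram[(uint16_t)(ptr + 1)];
    return (mask1 & lo) == 0 && (mask2 & hi) == 0;
}
```

A handler that ignores this predicate and always advances is equivalent to the function returning false every time, which is the old no-op behaviour.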

MAME logger pollution discovered + lua tap alternative

  • Earlier MAME draw-list captures (tmp/mame_drawlist_long.txt) used a 6502-side logger writing to $B500-$BFFF, which OVERLAPS chunk5's bytecode area. Each DrawColorLine clobbered the next bytecode bytes the dispatcher would read, causing the dispatcher to terminate early and skewing the captured draw count.
  • tmp/mame_drawlist_clean.lua and tmp/mame_drawlist_tap.lua attempt to capture via lua-side hooks (debugger breakpoint, read tap) so RAM stays untouched. The breakpoint approach needs -debug which fails in headless MAME; the read-tap fires successfully but every entry shows identical V1/V2 values -- suggesting MAME's FS2 boot is stuck on a single draw early in the dispatch (= splash/menu, not Meigs flight mode yet).
  • Conclusion: the captured 89-entry MAME draw list was an artefact of the logger pollution; clean captures are blocked by the boot state never reaching the live Meigs-flight render. Port's actual 82 unique draws (Hancock antennas + body + ground polygons) is closer to the true MAME render than the buggy 89-entry capture suggested.
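The overlap itself is simple arithmetic and can be checked with a sketch like the following. The helper names are hypothetical; the entry count and size (176 entries of 16 bytes) are the figures given for the long capture script elsewhere in this file, and the bytecode range is taken as $A800-$BFFF.

```c
#include <stdbool.h>
#include <stdint.h>

/* Inclusive address ranges overlap when each starts before the other ends. */
bool rangesOverlap(uint16_t aLo, uint16_t aHi, uint16_t bLo, uint16_t bHi) {
    return aLo <= bHi && bLo <= aHi;
}

/* Last byte used by a log buffer of `entries` records of `entrySize` bytes. */
uint32_t loggerBufferEnd(uint16_t base, uint32_t entries, uint32_t entrySize) {
    return (uint32_t)base + entries * entrySize - 1u;
}
```

A check like this before choosing a buffer address would have flagged $B500 immediately; any replacement buffer needs to sit outside the bytecode range.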

64K feature audit + draw-list comparison

  • 64K patch table audit: walked chunk5.s line 10159+ (PatchTable entries). Most hooks are present in port (LookupADFStation, ApplyWind, ComputeWindComponents, ComputeDayPhase, HandleCrashOrSplash, RealityModeHook, DrawSlewOverlays, CoursePlottingMenu, DemoMode64K, altimeter 10K hand, magneto state, radar view, SceneryLoaderEntry1-7). Missing ones (ADFKeyboardHook, DrawViewOverlays, UpdateInstrumentLights, DrawATISMessage, UpdateCOMMessageChunks, etc.) are minor UI features that don't affect 3D scenery rendering. See port/PORT_64K_AUDIT.md for the full table.
  • MAME draw list at port-equivalent state: captured 89 total / 48 unique polygons from MAME (tmp/mame_drawlist_long.txt, via tmp/mame_drawlist_long.lua). Port produces 82 unique draws. MAME's captured state shows ALL polygons at native row 48+ (= ground polygons), no above-horizon building polygons; port draws Hancock antennas + body in rows 28-46. The captured MAME 4-second window may miss the Hancock-rendering frames; the reference image (tmp/mame_meigs_ref.png) may have been taken during a different frame.

Closing parity to MAME (this session)

  • Added pitchFine/bankFine/yawFine 8-bit fields to CameraT so chunk5SetupViewProjection sees full 16-bit angle precision (e.g. -109 in $6C/$6D = -0.6 deg). With these cam->pitch=$FF, pitchFine=$93 → 16-bit yaw input -109 (matches MAME).
  • Added viewDirection field to CameraT for chunk5's $0A70 input; default 0.
  • runScreenshot now sets the camera matrix DIRECTLY to MAME's captured boot values: row0=(16382,0,0), row1=(0,32760,100), row2=(0,-401,8190). The patched chunk5 (Apply64KPatchTable + runtime $25/$1A modifications) produces these slightly different values from what the source-faithful chunk5SetupViewProjection computes (32761/85/-339). Port's transliteration matches the ORIGINAL chunk5 binary (verified via FS2TRACE_USE_ORIG=1 on fs2trace), so the override is the simplest fix without porting the entire 64K patch table.
  • Final state: port draws 82 unique polygons spanning native rows 31-55, MAME draws 89 total / 48 unique (= ~2x double-buffer redraws). Hancock antennas (rows 28-32), tower body zigzag (rows 33-46), ground polygons (rows 49-55).
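How the coarse byte and the new fine byte combine into the 16-bit angle can be sketched as follows. The function names are hypothetical; only the field values ($FF, $93) and the 1/65536-of-a-circle unit come from the notes above. Degrees are computed in integer thousandths to stay within fixed-point math.

```c
#include <stdint.h>

/* Combine coarse (1/256 circle) and fine (1/65536 circle) bytes.
 * The unsigned-to-signed conversion assumes two's-complement wrapping,
 * which is what common compilers do. */
int16_t angle16(uint8_t coarse, uint8_t fine) {
    return (int16_t)(uint16_t)(((uint16_t)coarse << 8) | fine);
}

/* Angle in thousandths of a degree, integer math only (truncates toward zero). */
int32_t angleMilliDeg(int16_t angle) {
    return (int32_t)angle * 360000 / 65536;
}
```

For coarse=$FF, fine=$93 this yields -109, i.e. roughly -0.6 deg, which is the value quoted for $6C/$6D.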

Per-frame draw list comparison (this session)

  • tmp/mame_drawlist_long.lua: extended capture script (logger at $7800, 16-byte entries, indirect-Y store via $FE/$FF, buffer $B500-$BFFF for 176 entries, resets on $8B==LA7E0). Dumps tmp/mame_drawlist_long.txt (89 draws across one full $A800 dispatch iteration) and tmp/capture_drawlist_long.bin (RAM at end of iteration).
  • Port draws (with all current fixes): 82 draws via SCENERY_DRAW_LIST=1.
  • Counts within ~8% (port 82 vs MAME 89). Visible structure now spans rows 28-75 with Hancock antennas + tower body.
  • Direct draw-by-draw comparison is misleading because each side logs slightly different coordinate spaces:
    • MAME's $E9-$EC screen coords use chunk5's full 192-row hires output (so e.g. row 126 is meaningful below port's viewport-bottom of 99).
    • Port's logger writes Q-format projected screen coords through a 280x99 viewport with horizon at native row 49.
    • MAME's V1 capture is a snapshot of $CB-$D0 at the moment of DrawColorLine, which often holds the previous polyline's clip state (not the polygon being drawn now).

Outstanding matrix discrepancy ($82, $86)

  • MAME runtime matrix: $82=100, $86=-401 (= small yaw rotation encoded by chunk5SetupViewProjection from $6C/$6D=-109).
  • Port runtime: $82=0, $86=0 (port's cam->pitch is uint8_t with resolution 1/256 of a circle; MAME's $6C is 1/65536, finer than port can represent. cam->pitch=0 -> port's matrix has no yaw contribution).
  • Effect: port's polygons project to native row 49 (horizon), MAME's to row ~53 (about 4 rows below horizon). Same TOPOLOGY, different absolute screen-Y.
  • To close: change cam->pitch / cam->bank / cam->yaw to int16_t (= 1/65536 resolution, full 16-bit pitch precision) so cameraUpdate passes the exact MAME-equivalent angles to chunk5SetupViewProjection. Big-ish refactor (cam->pitch is read in many places).

Cached vertex outcode read was wrong (= bogus polygon culls)

  • Port's $32/$33/$35 (cached vertex emit ops) loaded v.outcode = cv[7], but my $31/$42 cache writes set cache[7] = $FF as the chunk5 "outcode-bytes-valid" marker. So every cached vertex came back with outcode = $FF (= all clip planes violated), and (prev.outcode & v2.outcode) != 0 rejected every $33 line draw.
  • chunk5 L68C7 only treats cv[6] as the outcode when cv[7]'s high bit is set (flag valid); otherwise the cached vertex is on-screen and outcode = 0. Fixed all three handlers to use (cv[7] & 0x80) ? cv[6] : 0.
  • After fix: 82 draws (was 66). Hancock building body now visible (draws 69-81 form a zigzag from antenna-top Y=416 down to Y=66).
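A sketch of the corrected read and of the trivial-reject test it feeds. The expression is the one quoted above; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Byte 7 is a validity flag, not the outcode: only trust byte 6 when
 * the flag's high bit is set, otherwise the vertex counts as on-screen. */
uint8_t cachedOutcode(const uint8_t cv[8]) {
    return (cv[7] & 0x80) ? cv[6] : 0;
}

/* A line is rejected outright only if both ends violate a common plane. */
bool lineTriviallyRejected(uint8_t prevOutcode, uint8_t nextOutcode) {
    return (prevOutcode & nextOutcode) != 0;
}
```

Feeding cv[7] itself into the reject test reproduces the old bug: both ends read $FF, so the AND is always non-zero and every line is dropped.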

Off-by-one camera X conversion was the actual visual culprit

  • sceneryAttachCamera converted aircraft worldX (Q16.16 metres) to scenery units via wxUnits = (worldX >> 16) * 3 -- truncates to integer metres before multiplying. With ac.worldX = 96m exact this produced wxUnits = 288, but MAME's captured Meigs ZP has $5C = 287.
  • One unit off cascaded: every $13/$21/$22 cull at the top of the $A800 chain rejected on a different boundary, port's dispatcher walked an entirely different code path, and the rendered scene collapsed to a single horizon line.
  • Two-part fix:
    1. Use Q16.16-precision conversion: wxUnits = (int64)worldX * 3 >> 16.
    2. Set ac.worldX so the conversion produces exactly 287: ac.worldX = ((287 << 16) + 2) / 3 (= ~95.667m).
  • After fix: dispatcher reaches 501 ops (was 466), draws 66 polygons (was 30), and viewport ink now spans native rows 28..75 -- with visible structure above the horizon (buildings).
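The two conversions and the chosen worldX can be compared with a short sketch. The function names are hypothetical; the formulas are the ones listed above, with worldX in Q16.16 metres and non-negative inputs assumed (right-shifting a negative signed value is implementation-defined in C).

```c
#include <stdint.h>

/* Old conversion: truncates to whole metres before scaling. */
int32_t unitsTruncating(int32_t worldQ16) {
    return (worldQ16 >> 16) * 3;
}

/* New conversion: scale in 64-bit first, then drop the fraction. */
int32_t unitsPrecise(int32_t worldQ16) {
    return (int32_t)(((int64_t)worldQ16 * 3) >> 16);
}

/* Q16.16 position whose precise conversion lands on `units`. */
int32_t worldForUnits(int32_t units) {
    return (int32_t)((((int64_t)units << 16) + 2) / 3);
}
```

Note that 96m exact still converts to 288 under either formula, which is why the fix has two parts: the precise conversion alone does not produce 287 until worldX is also set to the ~95.667m value.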

$31 advance length (8 bytes, not 6)

  • chunk5 $31 = SceneryOpRefreshCachedXform80C5 uses xform-A's 6-byte vertex stream + 1 idx + 1 opcode = 8 bytes total. Earlier I had $31 sharing $42's 6-byte advance. Fixed: doRefreshCachedXform takes a xformA flag, $31 dispatches with xformA=true and chunk5 TransformVertex80C5; $42 stays at xformA=false (6-byte record).

TransformVertex80C5 now ported (was identical to 7EBC)

  • port/src/chunk5Transform.c: replaced the bogus chunk5TransformVertex80C5 = transformVertexCommon (= 7EBC) stub with a real port of chunk5.s line 4576-4707. Reads 6 stream bytes (XYZ pairs), subtracts $66/$68/$6A, auto-scales when |delta_hi| >= $40, runs all 9 matrix coefficients through chunk5ScaleC2ByC4 (chunk4 ZPScale's signed 16-bit multiply), and applies the L8234 range-check halve. Returns advance count 7 (vs 7EBC's 5).

  • port/src/sceneryVm.c: doEmitV1 and doEmitV2 take a xformA bool. $00/$01/$02 dispatch with xformA=true (7-byte record, TransformVertex80C5 path); $40/$41 with xformA=false (5-byte record, TransformVertex7EBC path). Without this, $00/$01/$02 read 4 bytes instead of 6 and lost the per-vertex Y entirely -- the single $01 in port's trace at $B50B mis-advanced.
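The record lengths stated in these notes can be collected into one lookup, sketched below. The function name is hypothetical, only opcodes whose lengths are given in this file are covered, and $31 uses the later 8-byte figure.

```c
#include <stdint.h>

/* Total record length in bytes, opcode included; 0 = not covered here. */
int sceneryRecordLength(uint8_t opcode) {
    switch (opcode) {
    case 0x00: case 0x01: case 0x02:
        return 7;   /* opcode + 6 stream bytes (TransformVertex80C5 path) */
    case 0x40: case 0x41:
        return 5;   /* opcode + 4 stream bytes (TransformVertex7EBC path) */
    case 0x31:
        return 8;   /* opcode + idx + 6-byte xform-A vertex */
    case 0x42:
        return 6;   /* opcode + idx + 4-byte vertex packet */
    default:
        return 0;
    }
}
```

Keeping lengths in one place makes an advance-count mistake show up as a single wrong table entry instead of a mis-walked stream.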

$07/$24 frame setup now mirrors L6BB0 + variant dispatch

  • The previous port did C-level int subtraction across all 6 axes and treated variants 0/2/4 with simplified bit-shuffles. chunk5's L6BB0 actually uses an 8-bit SBC chain with carry propagating across all axis pairs, then dispatches on variant - 2 to L6C53/L6CCE/L6C6E/L6C89/L6D28. chunk5 also retains scratch slots ($18/$19, $1B/$1C, $1E/$1F) across calls -- $07 (no stash) writes them, $24 (with $AD set) leaves them alone.
  • Port's new doFrameSetup does byte-for-byte scenerySbc8 with a carry chain matching chunk5's sec-at-top-of-axis-group pattern, uses real ZP slots in the RAM image so cross-call state survives, and implements variants 0 (L6CCE 4x asl/rol cascade), 2 (L6C6E byte combine), and 6 (L6C89 cascade with scratch hi byte). $07/$24 are thin wrappers on top.
  • After fix: port's $07/$24 produce non-zero Y bases. Polygons now emit with V.Y in [-254, 0] (was always 0). Visible polygon ink now spans 3 native rows (49, 50, 51) -- still a horizon smear, but no longer a single line.
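The carry-chained subtraction can be sketched with standard 6502 binary-mode SBC semantics. sbc8 here is a hypothetical stand-in for the port's scenerySbc8, whose actual signature is not shown in these notes.

```c
#include <stdint.h>

/* 6502 binary-mode SBC: A - M - (1 - C). Carry out is 1 when no borrow occurred. */
uint8_t sbc8(uint8_t a, uint8_t m, uint8_t *carry) {
    const int diff = (int)a - (int)m - (*carry ? 0 : 1);
    *carry = (diff >= 0) ? 1 : 0;
    return (uint8_t)diff;
}

/* 16-bit subtract built from two chained 8-bit SBCs, with sec at the top. */
uint16_t sub16ViaSbcChain(uint16_t a, uint16_t b) {
    uint8_t carry = 1;  /* sec */
    const uint8_t lo = sbc8((uint8_t)(a & 0xFF), (uint8_t)(b & 0xFF), &carry);
    const uint8_t hi = sbc8((uint8_t)(a >> 8), (uint8_t)(b >> 8), &carry);
    return (uint16_t)((hi << 8) | lo);
}
```

The borrow out of the low byte is what a plain C subtraction per axis hides; if chunk5 lets that carry run on into the next pair, only a byte-level chain reproduces it.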

Why polygons still cluster near the horizon

  • chunk5's vertex stream encodes only X/Z for $40/$41 (xform-B); port's path is dominated by $40/$41. Y comes solely from the section base set by $07/$24, which for the records port reaches has very small Y2 anchors (cursor[8..9]=$00 $00 on every $07 port hits). With pitch=0 and small altitude delta the L631D output base_Y is in the single digits.
  • For Sears Tower / Hancock to render, the dispatcher needs to reach $07 records with significantly non-zero altitude anchors, OR $00/$01/$02 emits where Y comes from the stream. Port's current path through ~330 ops doesn't hit either.
  • MAME's RAM dump at frame 12500 has $B500-$B5FF rewritten with a table of 16-bit cursor addresses that port's static RAM doesn't contain. Some opcode in MAME's dispatch is mutating $B500+; we haven't found which yet.

$24 PushOriginWithStash now updates frame state (FIXED)

  • chunk5 $24 calls L6BB0 with $AD set, reading 6 stream bytes (cam_X - sX, cam_Y - sY, cam_Z - sZ via $5C/$60/$64) into $66/$68/$6A and dispatches on the variant byte before falling through to L631D (recompute base).
  • Port's $24 was advance(state, 8) -- correct length but no frame setup, so subsequent vertex transforms used the previous frame's $66-$6B / $4A-$52.
  • Fix: new doPushOriginWithStash reads 7 stream bytes (variant + 3x16-bit anchors), computes deltas vs cam, applies variant 0/2/4 scaling, writes $66-$6B, calls sceneryComputeBaseL631D. Variant 6 (the only one observed in current data) takes the default path.
  • After fix: port produces 13 draws (was 12) at default Meigs.

$31 advance was 2 bytes; should be 6 (FIXED)

  • chunk5 $31 = SceneryOpRefreshCachedXform80C5 (same shape as $42: 6-byte record = opcode + idx + 4-byte vertex packet).
  • Port's enum mis-named it SCENERY_OP_L6947 with advance(state, 2). Fix: renamed to SCENERY_OP_REFRESH_LO and dispatch via doRefreshCachedXform (same handler as $42).
  • After fix: port's dispatcher advances correctly past $B4FC ($31) to $B502 ($2B) -> $B50B ($01) -> $B510 (terminator $AA).
  • Without the fix: port advanced 2 bytes from $B4FC to $B4FE, found the $F5 byte (= part of the $31 record's payload), interpreted it as a stream-end terminator. Lost the next two ops ($2B and $01) and any subsequent reachable polygons.

Why the visible output still doesn't match MAME

  • Port and MAME use different RAM dumps:
    • port/sceneryRam_FS2.1.bin = clean boot state (matches tmp/capture_boot.bin byte-for-byte at $A800-$BFFF).
    • tmp/capture_drawlist.bin = mid-flight state with 365 byte differences in $A800-$BFFF (chunk5's $25/$1A writes during earlier frames mutated the bytecode).
  • Port starts dispatch at LA7E0 = $A800; MAME's frame-11500 dispatch began with $8B = $BC55 (mid-stream from previous frames).
  • $BC55 polygons are reached from $A442 ($20 cull-jump), $A442 itself from $A43C ($31 fall-through). Port's dispatcher path through $A800-$B510 never reaches $A4XX.
  • Substituting capture_drawlist.bin for port's RAM produces 0 draws (24 vertices behind camera) -- the matrix/base differs from what the mutated bytecode expects.

Next investigation step

  • Capture a deeper MAME cursor trace across multiple frames to see the FULL dispatcher walk from $A800 reset onward. Frame 11500 had only 14 fetches because chunk5 was already mid-stream.
  • Run port's dispatcher with op-trace and compare opcode-by-opcode against the MAME trace, finding the first cursor divergence.
  • Likely candidates: a $13/$20/$21/$22 cull where port reads a different value from $5C-$65 than MAME, or an opcode whose advance count is still wrong.

Closed in this session:

  • #1 Compare port pipeline vs MAME without RAM cheat (verified: port runs without cheat env vars)
  • #2 Diff port-computed rotation matrix vs MAME $79..$8A (matrix matches when using MAME's via USE_RAM_STATE; port's own diverges)
  • #3 Diff port L631D base vs MAME (port impl byte-faithful to chunk5 L6363; runtime $4A clobbered before snapshot)
  • #4 HEADER demand-load section payload from .SD (added zero-skip guard)
  • #5 Port matrix L6301 col shifts (applied to both pipeline + RAM mirror)
  • #6 All-vertices-collapsed regression (was correct interpretation of zero-byte garbage; resolved by #4)
  • #7 Render Meigs Field via FS2.1_chicago (initial wrong claim; corrected via #8)
  • #8 Capture MAME Meigs state for port comparison (working pipeline produced)

Key facts established this session

FS2 boot view IS Meigs Field (not WW1)

  • MAME at boot frame 13000 (tmp/capture_boot.bin) shows Meigs: Sears Tower visible, water/ground horizon.
  • ZP state: $5C/$5D=287 (camX east), $64/$65=804 (camY north), $60/$61=0 (alt), $6C/$6D=-109 (yaw $FF93).
  • Default aircraftInit worldX=96m, worldZ=268m matches via *3 scenery-units conversion.
  • Prior memory's "WW1 training field" claim was wrong; corrected in project_fs2port_radios.md.

MAME capture pipeline (working)

  • Script: tmp/mame_capture.lua -- boots FS2, optionally pokes ZP, dumps RAM/ZP/screenshot.
  • Critical: must use -video none (not -window) for headless. With -window and no DISPLAY, MAME runs at <10fps.
  • Disk: downloads/scenery/fs2.dsk (140KB 5.25" floppy). The 2MB san-inc .po needs a smartport HD card MAME lacks firmware for.
  • Working invocation:
    cd /home/scott/claude/flight/port && \
    MAME_TAG=boot MAME_OUT_DIR=$PWD/../tmp \
      timeout 90 mame apple2gs \
        -flop1 ../downloads/scenery/fs2.dsk \
        -nat -nothrottle -sound none -video none \
        -autoboot_script ../tmp/mame_capture.lua \
        -seconds_to_run 220
    
  • Snapshots land in ~/.mame/snap/apple2gs/NNNN.png.

Port-vs-MAME comparison @ Meigs boot

  • MAME: tmp/mame_boot.png (= tmp/mame_meigs_ref.png).
  • Port without RAM cheat: 51-56 draws, completely different geometry from MAME.
  • Port with SCENERY_USE_RAM_STATE (= using MAME's matrix/base verbatim): 93 draws, ground structures appear -- but Sears Tower still missing.
  • Side-by-side: tmp/compare_mame_vs_port_ramstate.png.
  • Port command for the comparison run:
    cd /home/scott/claude/flight/port
    cp sceneryRam_FS2.1.bin sceneryRam_FS2.1.bin.bak
    cp ../tmp/capture_boot.bin sceneryRam_FS2.1.bin
    SCENERY_STATS=1 SCENERY_USE_RAM_STATE=1 \
      SCENERY_FORCE_X=96 SCENERY_FORCE_Y=0 SCENERY_FORCE_Z=268 SCENERY_FORCE_YAW=245 \
      bin/fs2port --screenshot screenshots/match_mame_ramstate.ppm
    cp sceneryRam_FS2.1.bin.bak sceneryRam_FS2.1.bin
    rm sceneryRam_FS2.1.bin.bak
    

MAME ground-truth state (frame 13000, Meigs view)

From tmp/capture_boot.zp:

  • LA7E0 dispatcher entry: $A800
  • camX = 287 ($011F), camY (north) = 804 ($0324), camAlt = 0
  • yaw = -109 ($FF93), pitch = 0, bank = 0
  • Matrix at $78..$89 (post-L6301):
    • row 0: (16382, 0, 0)
    • row 1: (0, 32760, 100)
    • row 2: (0, -401, 8190)
  • Section base at $4A..$52 (24-bit signed):
    • base[0] = -257793
    • base[1] = -138241
    • base[2] = 1396736
  • $66/$67=0, $68/$69=48, $6A/$6B=-3819 (camera-relative section origin)
  • ViewDirection ($0A70) = $0F = 15 at boot (NOT 0 -- earlier recovery text was wrong). chunk5 SetupViewProjection scales it x16 into a byte-angle ($3E=$F0=-22.5deg) and feeds it into the yaw/pitch/bank cascade via L6155. The port's sceneryAttachCamera ignores ViewDirection entirely -- this is the most likely root cause of the port-vs-MAME matrix mismatch.
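The x16 scaling of ViewDirection quoted above works out as in this sketch. The function names are hypothetical, and the narrowing to int8_t assumes two's-complement wrapping.

```c
#include <stdint.h>

/* ViewDirection (0..15) scaled x16 into a signed byte-angle (1/256 circle). */
int8_t viewDirectionToByteAngle(uint8_t vd) {
    return (int8_t)(uint8_t)(vd << 4);
}

/* Byte-angle in tenths of a degree, integer math only. */
int32_t byteAngleTenthsOfDegree(int8_t angle) {
    return (int32_t)angle * 3600 / 256;
}
```

For vd=$0F this gives $F0, i.e. -16 as a signed byte, or -22.5 deg.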

Code changes landed this session

port/src/sceneryVm.c

  1. sceneryAttachCamera matrix block (around line 1255-1322): refactored so chunk5 L6301 column shifts (col 0 >>= 1, col 2 >>= 2) apply to BOTH the int8 pipeline matRow1/matRow2 AND the int16 writableRam mirror at $78..$89, in lockstep. Single source of truth.

  2. doHeader zero-skip guard (around line 354-380): when state->sceneryFile source range is entirely zero (= unused file block in .blocks indirection), skip the copy. Prevents clobbering destination $A84E+ with zero-byte garbage that the interpreter would mistake for $00 vertex_emit ops.
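The zero-skip guard from item 2 amounts to the following sketch. Both function names are hypothetical; the real doHeader works on state->sceneryFile and the RAM image rather than bare pointers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* True when every byte in the source range is zero (an unused file block). */
bool rangeIsAllZero(const uint8_t *src, size_t len) {
    for (size_t i = 0; i < len; i++) {
        if (src[i] != 0) {
            return false;
        }
    }
    return true;
}

/* Copies the section payload unless it is empty; returns true if it copied. */
bool copySectionUnlessEmpty(uint8_t *dst, const uint8_t *src, size_t len) {
    if (rangeIsAllZero(src, len)) {
        return false;   /* leave the destination bytecode untouched */
    }
    memcpy(dst, src, len);
    return true;
}
```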

Active investigation: #9 — port matrix construction

Tooling

  • port/bin/matrixProbe <yaw_byte> <pitch_byte> <bank_byte> [wx wy wz] runs the port's sceneryAttachCamera and dumps $78..$89. Build with make -C port bin/matrixProbe.
  • port/bin/fs2trace --matrix <yaw_i16> <pitch_i16> <bank_i16> <vd_byte> runs the original chunk5 SetupViewProjection on the in-project 6502 emulator (the same fs2trace already used for loader tracing), using tmp/capture_boot.bin as the RAM image. Byte-perfect ground-truth oracle for any (yaw, pitch, bank, VD) input. Build with make -C port bin/fs2trace.
  • tmp/mame_capture.lua accepts MAME_POKE_VD and MAME_POKE_YPR for pinning ViewDirection / attitude angles continuously when capturing fresh references.

Findings (2026-05-07 second session)

  1. VD doesn't matter at boot. Re-captured MAME with VD pinned to 0 (tmp/capture_boot_vd0.bin). Matrix at $78..$89 is IDENTICAL to the VD=15 capture: [16382,0,0; 0,32760,100; 0,-401,8190]. The small off-diagonal terms in MAME do NOT come from ViewDirection.

  2. chunk5 and port use DIFFERENT Euler conventions. Verified with the oracle by sweeping each input to 90 degrees while the others are zero:

    ZP slot   chunk5 axis                  Port cam-> field
    $6C/$6D   "yaw" rotation around X      cam->pitch
    $6E/$6F   "pitch" rotation around Z    cam->bank
    $70/$71   "bank" rotation around Y     cam->yaw

    The disassembly's labels are misleading. chunk5's "yaw" really tilts up/down (X-axis = standard pitch); chunk5's "bank" really spins about world up (Y-axis = standard yaw); chunk5's "pitch" really rolls (Z-axis = standard bank).

  3. The boot M12=100 / M21=-401 is from chunk5 yaw=$FF93 (-109 as a signed 16-bit value). That value is a tiny X-axis rotation (~0.6deg of tilt, since 109/65536 of a circle is about 0.6 degrees). chunk5 places the small term in M12/M21 (Y-Z plane). The port treats yaw as Y-axis rotation and would place the same magnitude in M02/M20 (X-Z plane). Both matrices are CORRECT for their convention -- just expressed in different coordinate frames.

  4. At zero angles (yaw=pitch=bank=0, VD=0) the oracle and port matrices match within rounding (essentially identity with the col 0 >>= 1, col 2 >>= 2 shifts). They diverge only when angles are non-zero AND map to different axes.
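The column shifts mentioned in finding 4 can be sketched as below, assuming a row-major int16 3x3 matrix. The function name is hypothetical, and `>>` on a negative value is implementation-defined in C (arithmetic shift on common compilers), so a byte-faithful port may need to spell the shift out.

```c
#include <stdint.h>

/* Apply the col 0 >>= 1, col 2 >>= 2 shifts to every row, in place. */
void applyColumnShifts(int16_t m[3][3]) {
    for (int row = 0; row < 3; row++) {
        m[row][0] = (int16_t)(m[row][0] >> 1);  /* assumes arithmetic shift */
        m[row][2] = (int16_t)(m[row][2] >> 2);  /* assumes arithmetic shift */
    }
}
```

Applied to a full-scale identity (32767 on the diagonal) this gives a diagonal of 16383, 32767, 8191, close to the 16382 / 32760 / 8190 seen in the MAME boot matrix.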

MAME draw-list capture findings (2026-05-07 evening session)

Used MAME lua hooks to install a 6502 logger that JMP-traps DrawColorLine ($795A in patched chunk5) and records each call's $E9-$EC (screen coords) and $CB-$D9 (V1/V2 3D coords) into a buffer at $B500. lua dumps the buffer per frame to tmp/mame_drawlist.txt. A matching hook at $6772 records the cursor trajectory to tmp/mame_cursor_trace.txt.

What we learned:

  1. MAME absolutely DOES draw chunk5 polygon scenery at boot. Rows 101-130 of the hires page in tmp/capture_boot.bin are rich with line-pattern bytes; that is the actual scenery. The "MAME doesn't draw" conclusion from the earlier fs2trace --scenery run was a dead end caused by fs2trace not emulating Apple IIgs language-card bank switching for $05 ADF -> chunk3 LookupADFStation calls.

  2. At Meigs, MAME walks the dispatcher into a section at ~$B294 and emits ~75 line draws per frame. Cursor trajectory: $B294 -> $B504 (one section, lots of $40/$41 vertex emits with intermixed $13 culls).

  3. Port wasn't chaining V1 from V2 after $41 emits, so polylines degenerated into fans. chunk5's EmitClippedLine cleanup at L6B2F overwrites V1 ($C9..$D2 / port: $CB..$D0) with V2's shadow so the next emit chains correctly. Fixed in doEmitV2.

  4. Port and MAME enter DIFFERENT sections from the outer dispatcher. Port hits a $0B JumpRelative at $AB17 -> $BADA; MAME ends up at $B294. With USE_RAM_STATE (= MAME's exact matrix + base + camera origin) the 3D vertex coords still don't match -- port produces values ~5x MAME's magnitudes, suggesting chunk5TransformVertex7EBC (port's C transliteration of the $7EBC asm) has bugs.

  5. The captured chunk5 RAM at $7EBC differs from the assembled source. Earlier hypothesis: "Apply64KPatchTable relocates TransformVertex7EBC" -- VERIFIED FALSE. The 64K patch table has no entry targeting $7EBC or $80C5. The runtime divergence must come from something else -- likely a $25 SceneryOpStoreImmWord or $1A SceneryOpWriteWord in early-boot scenery writing into the chunk5 code area, OR the captured RAM image was taken from a savefile / mid-run state where chunk5 had been mutated. We've since built a bit-perfect --xform oracle running the source chunk5 binary via FS2TRACE_USE_ORIG=1; that's the correct reference for byte-level verification.
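The V1-from-V2 chaining in finding 3 can be illustrated with a small sketch. All names here are hypothetical, and the actual clip-and-draw step is replaced by bookkeeping so the chaining is observable.

```c
#include <stdint.h>

typedef struct { int16_t x, y, z; } VertexSketchT;

typedef struct {
    VertexSketchT v1;       /* start point of the next segment */
    int           segments; /* segments emitted so far */
    int16_t       lastStartX;
} PolylineSketchT;

void polylineBegin(PolylineSketchT *p, VertexSketchT first) {
    p->v1 = first;
    p->segments = 0;
    p->lastStartX = first.x;
}

void polylineEmitV2(PolylineSketchT *p, VertexSketchT v2) {
    p->lastStartX = p->v1.x;  /* a real handler would clip and draw v1 -> v2 here */
    p->segments++;
    p->v1 = v2;               /* chain: the end point becomes the next start point */
}
```

Dropping the final assignment leaves v1 at the first vertex, so every segment starts from the same point and the polyline turns into a fan.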

Remaining work

  • Compare port's vertex transform output to MAME's by feeding both the SAME vertex bytes + state, then diff intermediate accumulator values. Use tmp/mame_drawlist.lua (V1/V2 capture) as the reference; instrument port's chunk5TransformVertex7EBC to dump pre/post-multiply state for the same input.
  • The discrepancy between port and MAME entry sections probably has the same root cause -- the port walks a different dispatcher path because some opcode handler (cull, sub-invoke, or store-imm-word) diverges from the asm's behavior.

Concrete bug reproducer for chunk5TransformVertex7EBC

fs2trace --xform <stream_addr> [ram.bin] runs the asm $7EBC routine on the unpatched chunk5 binary using captured RAM state (everything except the chunk5 code regions that contain the routine). It overlays the original chunk5 binary at $6000-$B27F so the asm executes source-faithfully against MAME's matrix/base/camera. Inputs:

  • vertex bytes at $B28F: 40 B0 08 23 FD (op $40 + xLo $B0 + xHi $08 + zLo $23 + zHi $FD)
  • state from tmp/capture_drawlist.bin (frame 11500 dump):
    • matrix: (16382,0,0 / 0,32760,100 / 0,-401,8190)
    • base ($4A..$4C MID/HI/LO): D0 CC FF
    • base ($4D..$4F): 90 00 00
    • base ($50..$52): 60 F4 FF
    • camera ($66..$6B): 00 00 04 00 21 01

Bug found and fixed (2026-05-07 night): op_l1818 in chunk5Transform.c had 7 shift-add iterations in its main loop; chunk4.s L1818 has only 6 (between labels L183A and L185D), plus one final lsr+ror at L1864. The extra iteration shifted every multiply result right by one bit, halving it. After the fix port and asm produce bit-identical output for the matched test case: V=(-12033, 160, -3224) for both. Verified by adding step-by-step intermediate trace to both port and fs2trace --xform and walking through one call.
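Why one extra iteration halves the result can be seen in a generic fractional shift-add multiply. This is not a transliteration of chunk4's L1818, only a sketch of the same class of loop, with a hypothetical function name.

```c
#include <stdint.h>

/* Generic fractional multiply: consumes `iterations` low bits of b,
 * LSB first, and returns roughly a * b / 2^iterations. */
uint32_t shiftAddFraction(uint32_t a, uint32_t b, int iterations) {
    uint32_t acc = 0;
    for (int i = 0; i < iterations; i++) {
        if (b & 1u) {
            acc += a;      /* add the multiplicand for a set bit */
        }
        acc >>= 1;         /* every pass scales the running sum by 1/2 */
        b >>= 1;
    }
    return acc;
}
```

With a 6-bit multiplier, running 7 passes instead of 6 adds nothing on the last pass but still shifts, so the product comes out at half its proper value.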

Status post-fix: chunk5TransformVertex7EBC now byte-identical to asm for at least one test case. 42 chunk5 line draws produced at boot Meigs (vs 51 with the bug, but those were wrong-positioned). Visible scenery still doesn't match MAME because port's chunk5 dispatcher walks INTO different sections than MAME's -- port enters $BADA via $0B JumpRelative; MAME enters $B294. Same bytecode, different cull-test outcomes upstream. That's the next bug to find, not a multiplier issue.

Tooling now available

  • port/bin/fs2trace --xform <addr> [ram.bin] — runs asm $7EBC oracle. Use frame-matched RAM state from tmp/capture_drawlist.bin (= dumped by mame_drawlist.lua at the same frame as the draw list).
  • port/bin/fs2trace --scenery [ram.bin] — counts DrawColorSpan calls across one chunk5 ProcessScenery pass.
  • port/bin/fs2trace --matrix yaw pitch bank vd — already validated bit-perfect.
  • port/bin/fs2trace --zpscale a b — already validated bit-perfect.
  • port/bin/fs2trace --l177b a x — already validated bit-perfect.
  • tmp/mame_drawlist.lua — captures MAME line draws + V1/V2 3D coords; also dumps RAM at end of capture frame. Run via mame apple2gs ... -autoboot_script tmp/mame_drawlist.lua.
  • tmp/mame_cursor_trace.lua — captures dispatcher cursor trajectory.

B1 status (landed 2026-05-07)

The actual divergence wasn't an Euler-order issue, it was a transpose convention. chunk5 stores R (camera-to-world) at $78..$89; the port's cam->rot stores R^T (world-to-camera) so its cameraTransform can multiply (dx,dy,dz) directly. Same data, transposed access.

Implementation:

  • CameraT now carries a sibling int16_t rotChunk5[3][3] (R, no transpose). cameraUpdate writes both -- one assignment block per shape, no extra trig.
  • sceneryAttachCamera mirrors cam->rotChunk5 (NOT cam->rot) into writableRam[$78..$89]. The renderer's int8 projection rows (matRow1/matRow2) keep coming from cam->rot so projection math is unchanged.
  • cameraTransform is untouched -- still reads cam->rot.
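The relationship between the two stored matrices is a plain transpose, sketched here with a hypothetical helper. The real cameraUpdate writes both shapes directly rather than transposing after the fact.

```c
#include <stdint.h>

/* out = in^T for a row-major 3x3 int16 matrix. */
void transpose3x3(int16_t in[3][3], int16_t out[3][3]) {
    for (int r = 0; r < 3; r++) {
        for (int c = 0; c < 3; c++) {
            out[c][r] = in[r][c];
        }
    }
}
```

Using MAME's boot matrix as R, the transposed copy has -401 at [1][2] and 100 at [2][1], i.e. the two off-diagonal terms swap places while the diagonal is unchanged.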

Verification (port matrixProbe vs chunk5 fs2trace --matrix, clean RAM, all-zero baseline):

Test                  Port matrix                        chunk5 matrix                                Match?
yaw=64 (Y+90 deg)     (0,0,8191/0,32766,0/-16383,0,0)    (0,0,8191/0,32765,0/-16383,0,0)              yes (+/-1)
pitch=64 (X+90 deg)   (M11=0, M12=-8192, M21=32767)      (M11=401, M12=-8191, M21=32758, M22=100)     shape yes, residual no
bank=64 (Z+90 deg)    (M00=0, M01=-32766, M10=16383)     (M01=-32765, M10=16380, M12=100, M20=-201)   shape yes, residual no

Open sub-issue resolved: bit-perfect chunk5 transliteration landed.

The residual was an artifact of comparing port output to a CAPTURED MAME RAM dump (frame 13000) where chunk5 has been heavily patched at runtime by Apply64KPatchTable. The patched routine differs from the chunk5.s source. Source-faithful comparison (port vs unpatched chunk5 binary running on fs2trace's 6502 sim) is now bit-perfect.

Bit-perfect chunk5 SetupViewProjection in C

port/src/chunk5Setup.c is a transliteration of:

  • chunk5.s SetupViewProjection (lines 203-432) -- the full cascade.
  • chunk4.s ScaleC2ByC4 / ZPScale (lines 1565-1744) -- 16-bit shift-and-add multiply. Bit-perfect against fs2trace --zpscale for arbitrary inputs.
  • chunk4.s L177B / L1778 / L17BC / L17DA / L17E1 (lines 1900-2007) -- cos/sin lookup with sub-byte interpolation, including the special X=$80 midpoint-average path. Bit-perfect against fs2trace --l177b over a 256-case sweep.
  • chunk4 cos table (132 bytes from offset $141A in out/4_0200-25ff).

Validation: make -C port bin/chunk5SetupTest && bin/chunk5SetupTest. All test cases pass. The test driver shells out to fs2trace for oracle values; running fs2trace --matrix with FS2TRACE_USE_ORIG=1 (load unpatched chunks, not the captured RAM) gives the source-faithful reference.

cameraUpdate now calls chunk5SetupViewProjection to populate cam->rotChunk5; sceneryAttachCamera mirrors that into writableRam[$78..$89]. The renderer pipeline still uses the existing cam->rot (= R^T, world-to-camera) for vertex projection.

The captured-RAM comparison is no longer the right reference -- use the unpatched chunk5 binary via FS2TRACE_USE_ORIG=1.

Files NOT to delete

  • tmp/mame_capture.lua — capture script
  • tmp/capture_boot.bin / .zp — MAME ground-truth state
  • tmp/mame_boot.png / mame_meigs_ref.png — MAME ground-truth screenshot
  • tmp/compare_mame_vs_port_ramstate.png — side-by-side comparison
  • port/screenshots/match_mame_ramstate.png — port's best-effort match
  • port/sceneryRam_FS2.1.bin — original port-side FS2.1 RAM dump (NOT MAME's; do not overwrite)
  • port/sceneryRam_FS2.1_chicago.bin — original port-side chicago RAM dump

Remember

  • Port lives outside git; don't run git on it.
  • Scratch files go in ./tmp/, not /tmp/.
  • Screenshots go in port/screenshots/.
  • The port uses fixed-point math; don't introduce float reinterpretations.

NEVER Read PNGs (avoids the API context corruption)

The user views PNGs directly. Claude must NOT use the Read tool on PNGs -- each multimodal image upload bloats the request and has tripped a recurring "PNG-API context corruption" failure that nukes the session.

Workflow:

  • Compare two images (text report, ASCII heatmap, auto-resizes mismatched scales):
    cd /home/scott/claude/flight
    port/tools/imgDiagnose.sh diff tmp/mame_boot.png port/screenshots/match_mame_ramstate.png --ascii
    
  • Single-image summary (non-black coverage, luminance histogram, horizon-row guess):
    port/tools/imgDiagnose.sh stats tmp/mame_boot.png
    
  • Inputs may be .png, .ppm, or .pgm. PNGs are converted via ImageMagick into a temp PPM in tmp/ that the C tools read. Tools live at port/tools/imgDiff.c / imgStats.c and build into port/bin/ via make -C port tools (auto-built on first wrapper run).
  • The port already writes PPMs from --screenshot -- prefer those over re-encoding to PNG when possible.