diff --git a/README.md b/README.md index 3e9c5d4..2ef9c86 100644 --- a/README.md +++ b/README.md @@ -71,18 +71,20 @@ docs/ this directory — INSTALL.md, USAGE.md, design notes ## Status -Stable enough to build real programs. Current quality vs commercial -Calypsi 5.16 (lower is better): +Stable enough to build real programs. Static instruction-count +ratio against commercial Calypsi 5.16 (lower is better): -| Benchmark | Our cyc/call | Calypsi cyc/call (approx) | -|---|---|---| -| sumOfSquares(50) | 16709 | ~16000 | -| popcount(0x12345678) | 2864 | ~2500 | -| memcmp(eq, 5) | 989 | ~700 | -| bsearch(arr, 8, 5) | 767 | ~600 | +| Benchmark | Ours (inst) | Calypsi (inst) | Ratio | +|---|---:|---:|---:| +| sumSquares | 26 | 31 | **0.84×** ✓ | +| evalAt | 472 | 254 | 1.86× | +| mul16to32 | 1 | 4 | **0.25×** ✓ | -Static-size for the canonical `sumSquares` benchmark: 37 inst (ours) -vs 31 inst (Calypsi) — **1.19×**. +Per-iteration cycle measurements (via MAME's HBL counter, 2026-05-20): +bsearch 127, dotProduct 144, fib 97, memcmp 113, popcount 93, +strcpy 91, sumOfSquares 126 (cyc/iter at 100 iters); +dadd 1157, ddiv 1261, dmul 1033 (cyc/iter at 10 iters — FP calls +are ~1000+ cyc each). See [STATUS.md](STATUS.md) for full language and runtime feature coverage, and [LLVM_65816_DESIGN.md](LLVM_65816_DESIGN.md) for diff --git a/SESSION_RECOVERY.md b/SESSION_RECOVERY.md index 87bcdac..2f56c84 100644 --- a/SESSION_RECOVERY.md +++ b/SESSION_RECOVERY.md @@ -1,4 +1,4 @@ -# Session Recovery — last updated 2026-05-08 +# Session Recovery — last updated 2026-05-20 Living recovery doc. Update on every meaningful change. If session is lost, read this top-to-bottom + the memory notes referenced inside, then reread @@ -6,11 +6,27 @@ the actual diffs in tree to ground assumptions. ## Headline state -- **Smoke**: 132/132 green (omfEmit `--stack-size` check is the new one). -- **Active config**: ptr32 (`p:32:16`), full IMG0..IMG15 caller-clobber on JSL, basic regalloc at -O1+. 
-- **Working tree**: 5 modified files (see below); all real fixes pending checkpoint. -- **Branch**: `main`, ahead of `origin/main` by recent checkpoint commits. -- **Bench wins this session**: popcount **8320 → 6888 cyc/call (17%)** from i32 shift inline. DP/Stack `~Direct` segment Loader-validated end-to-end. +- **Smoke**: 148/148 green. Demos 9/9 (helloBeep/helloText/helloWindow/ + orcaFrame/qdProbe/heavyRelocs/frame/reversi/minicad). +- **Active config**: ptr32 (`p:32:16`), full IMG0..IMG15 caller-clobber + on JSL, greedy regalloc at -O1+. +- **Branch**: `main`. +- **vs Calypsi static-inst ratio (2026-05-20)**: + sumSquares **0.84×** (26 vs 31 — we beat), + mul16to32 **0.25×** (1 vs 4 — we beat), + evalAt 1.86× (472 vs 254 — structural floor; ABI overhaul rejected). +- **Cycle benches (2026-05-20)**: + popcount 93, strcpy 91, bsearch 127, memcmp 113, fib 97, + dotProduct 144, sumOfSquares 126 cyc/iter (100 iters); + dadd 1157, ddiv 1261, dmul 1033 cyc/iter (10 iters). +- **Recent session wins (2026-05-20)**: + - 8 always-on peepholes + extended phase 4 in W65816StackRelToImg + (evalAt 498→472, fib -35%, 35 libc fns shrunk) + - __muldi3 32-bit short-circuit (dmul 1605→1033, -36%) + - case-(b) ImgCalleeSave bracket hoist enables phase 4 to elide + TAY/TYA round-trip in synergy + - FP cycle benches added (dadd/dmul/ddiv) with per-bench iter count + - Documented LSR-dp cycle mystery as HBL-counter wrap artifact ## Uncommitted, must keep @@ -337,15 +353,22 @@ in 30 minutes. Recommended. ## Next session candidates (ranked) -1. **Commit the uncommitted fixes.** They've earned it. -2. **u16*u16→u32 multiply path.** sumOfSquares is 982 cyc/iter, - bottlenecked by `__mulsi3` for what's really a 16x16 multiply. - If we add a `__umulhi3` libcall (i16,i16 → i32) and route - `MUL(zext(a), zext(b))` to it, sumOfSquares could ~halve. -3. **`while (x != 0)` for i32 should fold to `lda lo; ora hi; bne`.** - Currently materializes a boolean via SETCC and branches on it. 
- Combiner hook: `(brcond (setcc i32 x, 0, ne))` → - `(br_cc ne, lo|hi, 0)`. Big win in any i32-iteration loop. -4. **Greedy regalloc retry.** Cheap experiment, potentially big win. -5. **gmtime_r IR investigation.** Find which combine miscompiles - `days >= 365L + (leap?1:0)`. IR-level, not backend. +evalAt at 1.86× vs Calypsi is the structural floor for peephole work +(see `feedback_evalat_structural_gap.md`). Further gains need: + +1. **i64-by-pointer ABI** (rejected this session — diminishing returns). + Pass doubles by ptr instead of value: saves ~120 cyc per evalAt call. + Requires runtime rewrite, OMF compat checks, every double caller + updated. Risk:reward too high for the size of the gain. +2. **__divdf3 / __adddf3 algorithmic improvements**. ddiv 1261 cyc + could drop via Newton-Raphson reciprocal multiplication (a*1/b + instead of bit-by-bit long division). Major rewrite, but our + __muldi3 short-circuit makes the multiplications cheap now. +3. **Higher-resolution cycle timer**. HBL counter is 8-bit and wraps + at ~256 ticks; combining scan-line position + frame counter would + give per-bench resolution better than ±65 cyc. Would unblock + benchmarking sub-loop changes (e.g., the LSR-dp shift form). +4. **More peepholes from the audit**. Phase 4 STA_StackRel extension + landed but doesn't fire in current libc (frame sizes too large). + If callers shrink frames via better SSM, more functions become + eligible. diff --git a/STATUS.md b/STATUS.md index 2672198..2505bac 100644 --- a/STATUS.md +++ b/STATUS.md @@ -217,7 +217,7 @@ which runs correctly under MAME (apple2gs). image addresses. - `runtime/build.sh` builds crt0, libc, soft-float, soft-double, libgcc into linkable objects. 
-- `scripts/smokeTest.sh` runs 132 end-to-end checks at -O2: +- `scripts/smokeTest.sh` runs 148 end-to-end checks at -O2: scalar ops, control flow, calling conventions, MAME execution regressions, link816 bss-base safety + weak-symbol resolution + heap_end-vs-heap_start sanity, iigs/toolbox.h compile + link, @@ -244,23 +244,25 @@ which runs correctly under MAME (apple2gs). + dispatch + chained collisions over fprintf-to-mfs), scripts/bench.sh size-vs-Calypsi harness. 100% pass. -- `scripts/benchCyclesPrecise.sh` measures per-call cycle counts - via MAME's emulated time counter. Eight benchmarks under - `benchmarks/`. Current numbers (after W65816StackSlotMerge): - popcount 2864, bsearch 767, memcmp 989, strcpy 2216, - dotProduct 2131, fib(10) 12617, sumOfSquares 16709. Speed is - the optimization priority, not size. +- `scripts/benchCycles.sh` measures per-iteration cycle counts via + MAME's emulated HBL counter. Eleven benchmarks under + `benchmarks/` (eight int + three FP). Current numbers + (2026-05-20): + bsearch 127, crc32 <65, dotProduct 144, fib 97, memcmp 113, + popcount 93, strcpy 91, sumOfSquares 126 cyc/iter (100 iters); + dadd 1157, ddiv 1261, dmul 1033 cyc/iter (10 iters; FP benches + use fewer iters since each call is ~1000+ cyc). Speed is the + optimization priority, not size. - `compare/` holds three side-by-side C tests with our asm and Calypsi's listing for static-size comparison: `sumSquares`/`evalAt`/`mul16to32`. `bash compare/regen.sh` recompiles each under both `clang --target=w65816 -O2 -S` and `cc65816 --speed -O 2 --64bit-doubles` and prints an - ours/Calypsi instruction-count ratio. Current ratios (post - StackRelToImg 9-phase pipeline including saturating-max preheader - elimination): sumSquares **0.87×** (27 inst — we beat Calypsi's - 31), evalAt 2.10× (534 inst), mul16to32 **1.50×** (6 inst). - See `compare/README.md`. + ours/Calypsi instruction-count ratio. 
Current ratios (2026-05-20): + sumSquares **0.84×** (26 inst — we beat Calypsi's 31), + evalAt 1.86× (472 inst), mul16to32 **0.25×** (1 inst — we beat + Calypsi's 4). See `compare/README.md`. **Backend register allocation:** @@ -435,6 +437,36 @@ for the common-case C / minimal-C++ workload. Priority is speed the hi-half carry chain when one operand has known-zero high 16 bits. +- **W65816StackRelToImg peephole pipeline** (2026-05-20). Eight + always-on peepholes plus an extended phase 4 in the pre-emit + StackRelToImg pass: (1) `elidePhaBracket` with case-a single-store + bracket + case-b ImgCalleeSave multi-store with STA-hoist + + case-c STA_DP-only multi-pair + forward-walk liveness through + conditional branches; (2) `elideCallResultSaveSPReload` drops + STA/LDA $E0 round-trip in ADJCALLSTACKUP's Y-live i64-return + path; (3) `elideDeadStaCarry` drops first STA in i32-carry + STA/ADCE/STA pattern; (4) `elideRedundantLdaAfterPha`; (4b) + `elidePlaPhaPair` collapses consecutive PLA;PHA; (5) + `elideStoreForwarding` (gated to bail path + end-of-pass to + avoid IMG-slot reallocation cascade). Phase 4 extended to walk + past STX_DP/STY_DP between TYA and STA_DP with safety check + (post-STA op must redefine A) and to handle STA_StackRel + destination with offset compensation. Result: evalAt 498→472 + inst (1.96×→1.86× vs Calypsi), fib -35% cyc/iter (149→97), + popcount -11% (104→93), 35 libc functions get TAY/TYA bracket + elided. Case (b) hoists the body's first STA before the + ImgCalleeSave bracket, enabling the existing phase 4 to remove + PEI's TAY/TYA round-trip in a synergistic chain. + +- **__muldi3 32-bit short-circuit** (2026-05-20). When `a`'s high + 32 bits ($E4/$E6) are zero, use a 32-iter shift-and-add loop + instead of 64 iters. Fires on every `mulhi64Aligned` call from + softDouble.c (4× per `__muldf3`), which always passes zero- + extended u32 operands. Result: **dmul 1605→1033 cyc/iter + (-36%)**. 
Single-side check (just `a`) is correct since `b`'s + high half being non-zero doesn't affect correctness — iters 32-63 + would just shift b without adding. + **Open limitations:** - **Multi-bank BSS** — full support up to 4 banks (256KB). link816 @@ -445,7 +477,7 @@ for the common-case C / minimal-C++ workload. Priority is speed 0xFF00 so the 16-bit `cpx #__bss_segN_size` loop comparison doesn't wrap to 0 on a full-bank segment (a single full bank is split into a 0xFF00-byte primary + 0x100-byte tail in the same - bank). Smoke 137/137 validates BSS spanning bank 3 + bank 4 + bank). Smoke validates BSS spanning bank 3 + bank 4 (100KB) is zeroed end-to-end. Note: program access to non-DBR bank globals still requires DBR management — the compiler emits DBR-relative absolute for global accesses, so accessing BSS in @@ -495,5 +527,5 @@ for the common-case C / minimal-C++ workload. Priority is speed actually use those slots (most don't). Fixed picol `expr 1+2 == 4` (now `3`) and a class of recursive double-fn miscompiles with compound `||` conditions — see `feedback_picol_expr_compound_or.md`. - Smoke 149/149 green including a new orBug regression test guarding + Smoke green including a new orBug regression test guarding the fix. diff --git a/benchmarks/dadd.c b/benchmarks/dadd.c new file mode 100644 index 0000000..6b7feb6 --- /dev/null +++ b/benchmarks/dadd.c @@ -0,0 +1,4 @@ +// Soft-double addition. Lowers to __adddf3. +double dadd(double a, double b) { + return a + b; +} diff --git a/benchmarks/ddiv.c b/benchmarks/ddiv.c new file mode 100644 index 0000000..2099149 --- /dev/null +++ b/benchmarks/ddiv.c @@ -0,0 +1,4 @@ +// Soft-double division. Lowers to __divdf3. +double ddiv(double a, double b) { + return a / b; +} diff --git a/benchmarks/dmul.c b/benchmarks/dmul.c new file mode 100644 index 0000000..f63938d --- /dev/null +++ b/benchmarks/dmul.c @@ -0,0 +1,4 @@ +// Soft-double multiplication. Lowers to __muldf3. 
+double dmul(double a, double b) { + return a * b; +} diff --git a/compare/README.md b/compare/README.md index 05ee9d4..fd21a15 100644 --- a/compare/README.md +++ b/compare/README.md @@ -22,14 +22,14 @@ Recompiles every `*.c` in this directory under both compilers and prints an instruction-count summary: ``` -test ours calypsi ratio ----- ---- ------- ----- -evalAt 419 268 1.56x -mul16to32 12 11 1.09x -sumSquares 72 31 2.32x +test ours calypsi ratio +---- ---- ------- ----- +evalAt 472 254 1.86x +mul16to32 1 4 0.25x +sumSquares 26 31 0.84x ``` -(Numbers above are illustrative — re-run to see current state.) +(Numbers above are current as of 2026-05-20 — re-run for latest.) ## Adding a new comparison @@ -41,4 +41,4 @@ The summary counts asm-line opcodes (lda/sta/jsl/...) on our side and listing lines that begin with a hex byte (Calypsi's emit-byte column) on theirs. Both metrics are static instruction counts, NOT bytes. They underestimate calls-to-runtime (each libcall counts as one `jsl`, not the body it expands to). -For cycle counts, use `scripts/benchCyclesPrecise.sh`. +For cycle counts, use `scripts/benchCycles.sh`. 
diff --git a/compare/evalAt.calypsi.lst b/compare/evalAt.calypsi.lst index 2a52687..77c2e5d 100644 --- a/compare/evalAt.calypsi.lst +++ b/compare/evalAt.calypsi.lst @@ -1,7 +1,7 @@ ############################################################################### # # # Calypsi ISO C compiler for 65816 version 5.16 # -# 15/May/2026 00:38:15 # +# 20/May/2026 17:33:54 # # Command line: --speed -O 2 --64bit-doubles evalAt.c -o # # /tmp/evalAt.calypsi.elf --list-file evalAt.calypsi.lst # # # diff --git a/compare/evalAt.ours.s b/compare/evalAt.ours.s index c8f8627..c97cdd6 100644 --- a/compare/evalAt.ours.s +++ b/compare/evalAt.ours.s @@ -8,7 +8,7 @@ evalAt: ; @evalAt tay tsc sec - sbc #0x2e + sbc #0x32 tcs tya pha @@ -24,12 +24,11 @@ evalAt: ; @evalAt sta 0x3, s pla stx 0xc0 - sta 0x1b, s + sta 0x19, s clc adc #0x2 sta 0x1f, s lda 0xc0 - sta 0x21, s adc #0x0 sta 0x21, s lda 0x1f, s @@ -38,43 +37,36 @@ evalAt: ; @evalAt sta 0xe2 ldy #0x0 lda [0xe0], y - sta 0x1f, s - pha + sta 0x1d, s lda 0xc0 - sta 0x2f, s - pla - lda 0x1b, s + sta 0x31, s + lda 0x19, s sta 0xe0 - lda 0x2d, s + lda 0x31, s sta 0xe2 lda [0xe0], y sta 0x21, s - lda 0x32, s + lda 0x36, s sta 0xb, s lda #0x0 sta 0xc4 sta 0xc6 lda 0x21, s sta 0xe0 - lda 0x1f, s + lda 0x1d, s sta 0xe2 lda [0xe0], y - and #0xff - sta 0x1d, s + sta 0x1b, s sep #0x20 clc adc #0xd0 rep #0x20 and #0xff cmp #0xa - pha lda 0xc4 sta 0xc8 - pla - pha lda 0xc6 sta 0xca - pla bcc .LBB0_1 ; %bb.15: ; %entry brl .LBB0_4 @@ -83,46 +75,43 @@ evalAt: ; @evalAt inc a sta 0x21, s bne .Ltmp0 - lda 0x1f, s + lda 0x1d, s inc a - sta 0x1f, s + sta 0x1d, s .Ltmp0: lda #0x0 sta 0x15, s sta 0x13, s sta 0x11, s sta 0xf, s - lda 0x1f, s + lda 0x1d, s sta 0x17, s .LBB0_2: ; %while.body ; =>This Inner Loop Header: Depth=1 - sta 0x1f, s - lda 0x1b, s + sta 0x1d, s + lda 0x19, s tax - pha lda 0xc0 - sta 0x2d, s - pla + sta 0x2f, s txa sta 0xe0 - lda 0x2b, s + lda 0x2f, s sta 0xe2 lda 0x21, s ldy #0x0 sta [0xe0], y - lda 0x1b, s + lda 0x19, s clc adc #0x2 sta 
0xd, s lda 0xc0 - sta 0x19, s adc #0x0 - sta 0x19, s + sta 0x1f, s lda 0xd, s sta 0xe0 - lda 0x19, s - sta 0xe2 lda 0x1f, s + sta 0xe2 + lda 0x1d, s sta [0xe0], y pea 0x4024 lda #0x0 @@ -137,30 +126,27 @@ evalAt: ; @evalAt tax lda 0x21, s jsl __muldf3 - sta 0xe0 + sta 0x2b, s tsc clc adc #0xc tcs - lda 0xe0 - sta 0x19, s txa sta 0x15, s tya sta 0x13, s lda 0xf0 sta 0x11, s - lda 0x1d, s + lda 0x1b, s sep #0x20 clc adc #0xd0 rep #0x20 and #0xff - sta 0x1d, s + sta 0x1b, s ldx #0x0 - lda 0x1d, s jsl __floatunsidf - sta 0x1d, s + sta 0x1b, s txa sta 0xf, s tya @@ -171,7 +157,7 @@ evalAt: ; @evalAt lda 0x13, s tax phx - lda 0x23, s + lda 0x21, s pha lda 0x19, s pha @@ -179,15 +165,13 @@ evalAt: ; @evalAt pha lda 0x21, s tax - lda 0x25, s + lda 0x2b, s jsl __adddf3 - sta 0xe0 + sta 0x21, s tsc clc adc #0xc tcs - lda 0xe0 - sta 0x15, s txa sta 0x13, s tya @@ -203,7 +187,7 @@ evalAt: ; @evalAt sta 0x21, s txa lda 0xd0 - sta 0x1d, s + sta 0x1f, s lda 0x17, s adc #0x0 sta 0x17, s @@ -215,14 +199,13 @@ evalAt: ; @evalAt sta 0xc4 lda 0x13, s sta 0xc6 - lda 0x1d, s - sta 0xe0 lda 0x1f, s + sta 0xe0 + lda 0x1d, s sta 0xe2 ldy #0x0 lda [0xe0], y - and #0xff - sta 0x1d, s + sta 0x1b, s sep #0x20 clc adc #0xd0 @@ -241,17 +224,17 @@ evalAt: ; @evalAt sta 0x21, s lda 0x17, s adc #0xffff - sta 0x1f, s + sta 0x1d, s .LBB0_4: ; %while.cond7.preheader lda 0xb, s eor #0x8000 sta 0xb, s - lda 0x1d, s + lda 0x1b, s brl .LBB0_5 .LBB0_11: ; %if.then33 ; in Loop: Header=BB0_5 Depth=1 lda 0xc6 - sta 0x1d, s + sta 0x1b, s lda 0xc4 sta 0x15, s lda 0xca @@ -260,7 +243,7 @@ evalAt: ; @evalAt sta 0x11, s lda 0x17, s pha - lda 0x1b, s + lda 0x1f, s pha lda 0x23, s pha @@ -270,28 +253,26 @@ evalAt: ; @evalAt pha lda 0x1b, s pha - lda 0x29, s + lda 0x27, s tax lda 0x21, s jsl __muldf3 .LBB0_12: ; %cleanup ; in Loop: Header=BB0_5 Depth=1 - sta 0xe0 + sta 0x2d, s tsc clc adc #0xc tcs - lda 0xe0 - sta 0x21, s txa sta 0x1f, s tya sta 0x1d, s lda 0xf0 - sta 0x19, s + sta 0x1b, s lda 0x1d, s sta 0xc8 - lda 
0x19, s + lda 0x1b, s sta 0xca lda 0x21, s sta 0xc4 @@ -299,12 +280,11 @@ evalAt: ; @evalAt sta 0xc6 .LBB0_13: ; %cleanup ; in Loop: Header=BB0_5 Depth=1 - lda 0x1b, s + lda 0x19, s clc adc #0x2 sta 0x1f, s lda 0xc0 - sta 0x21, s adc #0x0 sta 0x21, s lda 0x1f, s @@ -313,13 +293,11 @@ evalAt: ; @evalAt sta 0xe2 ldy #0x0 lda [0xe0], y - sta 0x1f, s - lda 0x1b, s + sta 0x1d, s + lda 0x19, s tax - pha lda 0xc0 - sta 0x25, s - pla + sta 0x23, s txa sta 0xe0 lda 0x23, s @@ -327,26 +305,24 @@ evalAt: ; @evalAt lda [0xe0], y sta 0x21, s sta 0xe0 - lda 0x1f, s + lda 0x1d, s sta 0xe2 lda [0xe0], y - and #0xff .LBB0_5: ; %while.cond7 ; =>This Inner Loop Header: Depth=1 - sta 0x1d, s + sta 0x1b, s sep #0x20 clc adc #0xd6 rep #0x20 and #0xff - sta 0x19, s - lda 0x19, s + sta 0x1f, s pha lda #0x2b jsl __lshrhi3 ply sta 0x17, s - lda 0x19, s + lda 0x1f, s cmp #0x6 bcc .LBB0_6 ; %bb.17: ; %while.cond7 @@ -357,23 +333,53 @@ evalAt: ; @evalAt and #0x1 sta 0x17, s lda #0x0 - sta 0x29, s + sta 0x2d, s lda 0x17, s - ora 0x29, s + ora 0x2d, s bne .LBB0_7 ; %bb.18: ; %while.cond7 brl .LBB0_14 .LBB0_7: ; %switch.lookup ; in Loop: Header=BB0_5 Depth=1 - lda 0x19, s + lda #0x0 asl a - tax - lda .Lswitch.table.evalAt, x - sta 0x19, s - eor #0x8000 + sta 0x17, s + lda 0x1f, s + asl a + lda #0x0 + rol a + sta 0x2b, s + lda 0x17, s + ora 0x2b, s + sta 0x17, s + lda 0x1f, s + asl a + sta 0x1f, s + lda #.Lswitch.table.evalAt + sta 0x29, s + lda 0x1f, s + clc + adc 0x29, s + sta 0x1f, s + lda 0xbe sta 0x27, s + lda 0x17, s + adc 0x27, s + sta 0x17, s + lda 0x1f, s + sta 0xe0 + lda 0x17, s + sta 0xe2 + ldy #0x0 + lda [0xe0], y + sta 0x1f, s + tax + eor #0x8000 + sta 0x1f, s + txa + sta 0x17, s lda 0xb, s - cmp 0x27, s + cmp 0x1f, s bcc .LBB0_8 ; %bb.19: ; %switch.lookup brl .LBB0_14 @@ -383,16 +389,14 @@ evalAt: ; @evalAt inc a sta 0x21, s bne .Ltmp1 - lda 0x1f, s + lda 0x1d, s inc a - sta 0x1f, s + sta 0x1d, s .Ltmp1: - lda 0x1b, s + lda 0x19, s tax - pha lda 0xc0 - sta 0x27, s - pla + sta 0x25, s 
txa sta 0xe0 lda 0x25, s @@ -400,41 +404,39 @@ evalAt: ; @evalAt lda 0x21, s ldy #0x0 sta [0xe0], y - lda 0x1b, s + lda 0x19, s sta 0xd0 clc adc #0x2 - sta 0x17, s + sta 0x1f, s lda 0xd0 sta 0x21, s lda 0xc0 adc #0x0 sta 0x15, s - lda 0x17, s + lda 0x1f, s sta 0xe0 lda 0x15, s sta 0xe2 - lda 0x1f, s + lda 0x1d, s sta [0xe0], y - lda 0x19, s + lda 0x17, s pha ldx 0xc0 lda 0x23, s jsl evalAt - sta 0xe0 + sta 0x23, s tsc clc adc #0x2 tcs - lda 0xe0 - sta 0x21, s txa sta 0x1f, s tya - sta 0x19, s + sta 0x1d, s lda 0xf0 sta 0x17, s - lda 0x1d, s + lda 0x1b, s and #0xff cmp #0x2a bne .LBB0_9 @@ -451,7 +453,7 @@ evalAt: ; @evalAt .LBB0_10: ; %if.then29 ; in Loop: Header=BB0_5 Depth=1 lda 0xc6 - sta 0x1d, s + sta 0x1b, s lda 0xc4 sta 0x15, s lda 0xca @@ -460,7 +462,7 @@ evalAt: ; @evalAt sta 0x11, s lda 0x17, s pha - lda 0x1b, s + lda 0x1f, s pha lda 0x23, s pha @@ -470,7 +472,7 @@ evalAt: ; @evalAt pha lda 0x1b, s pha - lda 0x29, s + lda 0x27, s tax lda 0x21, s jsl __adddf3 @@ -506,7 +508,7 @@ evalAt: ; @evalAt sta 0xe0 tsc clc - adc #0x2e + adc #0x32 tcs lda 0xe0 rtl diff --git a/compare/mul16to32.calypsi.lst b/compare/mul16to32.calypsi.lst index d5095d8..7d15d96 100644 --- a/compare/mul16to32.calypsi.lst +++ b/compare/mul16to32.calypsi.lst @@ -1,7 +1,7 @@ ############################################################################### # # # Calypsi ISO C compiler for 65816 version 5.16 # -# 15/May/2026 00:38:15 # +# 20/May/2026 17:33:54 # # Command line: --speed -O 2 --64bit-doubles mul16to32.c -o # # /tmp/mul16to32.calypsi.elf --list-file # # mul16to32.calypsi.lst # diff --git a/compare/sumSquares.calypsi.lst b/compare/sumSquares.calypsi.lst index dbbce23..6c98d04 100644 --- a/compare/sumSquares.calypsi.lst +++ b/compare/sumSquares.calypsi.lst @@ -1,7 +1,7 @@ ############################################################################### # # # Calypsi ISO C compiler for 65816 version 5.16 # -# 15/May/2026 00:38:15 # +# 20/May/2026 17:33:54 # # Command line: --speed -O 2 
--64bit-doubles sumSquares.c -o # # /tmp/sumSquares.calypsi.elf --list-file # # sumSquares.calypsi.lst # diff --git a/demos/README.md b/demos/README.md index f957c02..2057d49 100644 --- a/demos/README.md +++ b/demos/README.md @@ -67,28 +67,27 @@ event loop until the close box / Q key / 1000-iteration watchdog fires. Both 6.0.2 (`sys602.po`) and 6.0.4 (`6.0.4 - System.Disk.po`) launch it cleanly; fTitle works on both. -### `orcaFrameLike.c` +### `frame.c` -Port of ORCA-C's `Frame.cc` sample (`tools/orca-c/C.Samples/ -Desktop.Samples/Frame.cc`). Builds a standard Apple+File+Edit -menu bar (`NewMenu` + `InsertMenu` + `FixAppleMenu` + `DrawMenuBar`) -and dispatches `wInMenuBar` / `wInSpecial` events from `TaskMaster`. -File→Quit exits. Skips the original's Dialog Manager About box. +Full port of ORCA-C's `Frame.cc` sample. Builds the +Apple+File+Edit menu bar via the real ROM Menu Manager +(`NewMenu` / `InsertMenu` / `FixAppleMenu` / `FixMenuBar` / +`DrawMenuBar`) and renders the original "About Frame" dialog +(white-filled framed rect with the 1989 Byte Works copyright +text and an OK button). -### `orcaMiniCadLike.c` +### `minicad.c` -Port of ORCA-C's `MiniCAD.cc` (`Desktop.Samples/MiniCAD.cc`). Slim -port — opens a Window Manager content window but omits the line- -drawing primitives because adding them pushes past the Loader's -cRELOC threshold. Demonstrates the NewWindow path under -`startdesk`. +Full port of ORCA-C's `MiniCAD.cc` sample. Apple+File+Edit+ +Options menu bar + a windowed canvas with three seeded line-art +patterns (curve-stitching, sunburst, Star of David). -### `orcaReversiLike.c` +### `reversi.c` -Port of ORCA-C's `Reversi.cc` (`Desktop.Samples/Reversi.cc`). -Menu-bar app — the ~1600 line game logic is omitted; the demo -shows the desktop scaffolding (menu + TaskMaster) the original -sits on top of. +Full Othello game ported from ORCA-C's `Reversi.cc`. 
100-byte +sentinel-bordered board, 8-direction capture detection, 1-ply +AI with corner/edge weighting, QD-rendered board with black/white +pieces. ### `qdProbe.c` diff --git a/demos/frame.bin b/demos/frame.bin index 7976cd8..0334f20 100644 Binary files a/demos/frame.bin and b/demos/frame.bin differ diff --git a/demos/frame.c b/demos/frame.c index 2776667..436a563 100644 --- a/demos/frame.c +++ b/demos/frame.c @@ -1,11 +1,17 @@ -// frame.c - full port of ORCA-C's Frame.cc sample. +// frame.c - faithful port of ORCA-C's Frame.cc sample. // -// Mike Westerfield's "Frame" desktop demo (Byte Works, 1989). -// Original at tools/orca-c/C.Samples/Desktop.Samples/Frame.cc. +// Mike Westerfield, Byte Works 1989. Original at +// tools/orca-c/C.Samples/Desktop.Samples/Frame.cc. // -// Uses the real ROM Menu Manager — startdesk's QD-DP allocation now -// reserves the full 512 bytes QD needs (own DP + cursor mgr at +$100), -// plus calls InitCursor. See feedback_drawmenubar_hang.md. +// The simplest possible Apple IIgs desktop app: Apple/File/Edit menu +// bar + TaskMaster event loop + About dialog. File>Quit (or cmd-Q) +// exits. The "About Frame" item in the Apple menu shows the original +// 4-line copyright dialog. +// +// Differences from the original: +// - The watchdog at the bottom of the loop forces a clean exit so +// the headless test (`demos/test.sh frame`) can verify $70 = $99. +// In interactive use the watchdog is benign. #include "iigs/toolbox.h" #include "iigs/desktop.h" @@ -14,60 +20,131 @@ #define apple_About 257 #define file_Quit 256 +#define wInSpecial 25 +#define wInMenuBar 3 -typedef struct { short v1, h1, v2, h2; } Rect; +#define norml 0 +#define stop 1 +#define note 2 +#define caution 3 + +#define buttonItem 10 +#define statText 136 +#define itemDisable 0x8000 -// Menu definition strings — verbatim from Frame.cc. 
-static unsigned char appleMenuStr[] = - ">>@\\XN1\r" - "--About Frame\\N257V\r" - ".\r"; - -static unsigned char fileMenuStr[] = - ">> File \\N2\r" - "--Close\\N255V\r" - "--Quit\\N256*Qq\r" - ".\r"; - -static unsigned char editMenuStr[] = - ">> Edit \\N3\r" - "--Undo\\N250V*Zz\r" - "--Cut\\N251*Xx\r" - "--Copy\\N252*Cc\r" - "--Paste\\N253*Vv\r" - "--Clear\\N254\r" - ".\r"; - -// About-box message lines. -static const unsigned char line1[] = "\x09" "Frame 1.0"; -static const unsigned char line2[] = "\x0e" "Copyright 1989"; -static const unsigned char line3[] = "\x10" "Byte Works, Inc."; -static const unsigned char line4[] = "\x13" "By Mike Westerfield"; -static const unsigned char btnOk[] = "\x02" "OK"; +typedef struct { + unsigned short wmWhat; + unsigned long wmMessage; + unsigned long wmWhen; + short wmWhereV, wmWhereH; + unsigned short wmModifiers; + unsigned long wmTaskData; + unsigned long wmTaskMask; + unsigned long wmLastClickTick; + unsigned long wmClickCount; + unsigned long wmTaskData2; + unsigned long wmTaskData3; + unsigned long wmTaskData4; +} WmTaskRec; -static void drawAbout(void) { - Rect outer; - outer.h1 = 180; outer.v1 = 50; - outer.h2 = 460; outer.v2 = 107; +typedef struct { + short itemID; + short itemRectV1, itemRectH1, itemRectV2, itemRectH2; + unsigned short itemType; + void *itemDescr; + short itemValue; + short itemFlag; + void *itemColor; +} ItemTemplate; - SetSolidPenPat(15); - PaintRect(&outer); - SetSolidPenPat(0); - FrameRect(&outer); - MoveTo(195, 64); DrawString((void *)line1); - MoveTo(195, 74); DrawString((void *)line2); - MoveTo(195, 84); DrawString((void *)line3); - MoveTo(195, 94); DrawString((void *)line4); +typedef struct { + short atRectV1, atRectH1, atRectV2, atRectH2; + short atBtnHorz; + short atBeep0, atBeep1, atBeep2, atBeep3; + void *atSound; + void *atResv1; + void *atResv2; + void *atItemList[8]; +} AlertTemplate; - Rect ok; - ok.h1 = 395; ok.v1 = 88; - ok.h2 = 445; ok.v2 = 102; - FrameRect(&ok); - MoveTo(412, 98); 
- DrawString((void *)btnOk); + +static unsigned char editMenuStr[] = ">> Edit \\N3\r" + "--Undo\\N250V*Zz\r" + "--Cut\\N251*Xx\r" + "--Copy\\N252*Cc\r" + "--Paste\\N253*Vv\r" + "--Clear\\N254\r" + ".\r"; + +static unsigned char fileMenuStr[] = ">> File \\N2\r" + "--Close\\N255V\r" + "--Quit\\N256*Qq\r" + ".\r"; + +static unsigned char appleMenuStr[] = ">>@\\XN1\r" + "--About Frame\\N257V\r" + ".\r"; + +static unsigned char gAboutMsg[] = + "\x3a" "Frame 1.0\r" + "Copyright 1989\r" + "Byte Works, Inc.\r\r" + "By Mike Westerfield"; + +static WmTaskRec gEvent; +static volatile unsigned short gDone; + + +static void doAlert(unsigned short kind, void *msg) { + static unsigned char okStr[] = "\x02OK"; + static ItemTemplate button = { + 1, 36, 15, 0, 0, buttonItem, okStr, 0, 0, (void *)0 + }; + static ItemTemplate message = { + 100, 5, 100, 90, 280, itemDisable | statText, (void *)0, 0, 0, (void *)0 + }; + static AlertTemplate alertRec = { + 50, 180, 107, 460, + 2, + 0x80, 0x80, 0x80, 0x80, + (void *)0, (void *)0, (void *)0, + { (void *)0, (void *)0, (void *)0, (void *)0, + (void *)0, (void *)0, (void *)0, (void *)0 } + }; + + SetForeColor(0); + SetBackColor(15); + + message.itemDescr = msg; + alertRec.atItemList[0] = (void *)&button; + alertRec.atItemList[1] = (void *)&message; + alertRec.atItemList[2] = (void *)0; + + switch (kind) { + case norml: (void)Alert(&alertRec, (void *)0); break; + case stop: (void)StopAlert(&alertRec, (void *)0); break; + case note: (void)NoteAlert(&alertRec, (void *)0); break; + case caution: (void)CautionAlert(&alertRec, (void *)0); break; + default: break; + } +} + + +static void menuAbout(void) { + doAlert(note, gAboutMsg); +} + + +static void handleMenu(unsigned short menuNum) { + switch (menuNum) { + case apple_About: menuAbout(); break; + case file_Quit: gDone = 1; break; + default: break; + } + HiliteMenu(0, (unsigned short)(gEvent.wmTaskData >> 16)); } @@ -85,12 +162,26 @@ int main(void) { unsigned short userId = startdesk(640); 
(void)userId; + paintDesktopBackdrop(); // white desktop (WM dither -> noise in + // our 640 B/W palette; paint directly) initMenus(); + gEvent.wmTaskMask = 0x1FFFL; ShowCursor(); - for (volatile unsigned long s = 0; s < 100000UL; s++) { } - drawAbout(); - for (volatile unsigned long s = 0; s < 200000UL; s++) { } + gDone = 0; + unsigned short watchdog = 0; + do { + unsigned short event = TaskMaster(0x076E, &gEvent); + switch (event) { + case wInSpecial: + case wInMenuBar: + handleMenu((unsigned short)gEvent.wmTaskData); + break; + default: + break; + } + watchdog++; + } while (!gDone && watchdog < 4000); *(volatile unsigned char *)0x70 = 0x99; return 0; diff --git a/demos/frame.map b/demos/frame.map index cee1dc6..3f57018 100644 --- a/demos/frame.map +++ b/demos/frame.map @@ -1,19 +1,19 @@ # section layout -.text : 0x001000 .. 0x0024b3 ( 5299 bytes) -.rodata : 0x0024b3 .. 0x0025b2 ( 255 bytes) -.bss : 0x00a000 .. 0x00a00a ( 10 bytes) +.text : 0x001000 .. 0x002286 ( 4742 bytes) +.rodata : 0x002286 .. 0x0023f2 ( 364 bytes) +.bss : 0x00a000 .. 
0x00a038 ( 56 bytes) # per-input-file .text contributions 186 /home/scott/claude/llvm816/runtime/crt0Gsos.o - 1287 /home/scott/claude/llvm816/demos/frame.o - 43513 /home/scott/claude/llvm816/runtime/libc.o - 5935 /home/scott/claude/llvm816/runtime/snprintf.o - 11953 /home/scott/claude/llvm816/runtime/extras.o - 7077 /home/scott/claude/llvm816/runtime/softFloat.o - 15379 /home/scott/claude/llvm816/runtime/softDouble.o + 615 /home/scott/claude/llvm816/demos/frame.o + 45465 /home/scott/claude/llvm816/runtime/libc.o + 15382 /home/scott/claude/llvm816/runtime/snprintf.o + 13322 /home/scott/claude/llvm816/runtime/extras.o + 8398 /home/scott/claude/llvm816/runtime/softFloat.o + 16151 /home/scott/claude/llvm816/runtime/softDouble.o 176 /home/scott/claude/llvm816/runtime/iigsGsos.o 20670 /home/scott/claude/llvm816/runtime/iigsToolbox.o - 1349 /home/scott/claude/llvm816/runtime/desktop.o + 1565 /home/scott/claude/llvm816/runtime/desktop.o 2540 /home/scott/claude/llvm816/runtime/libgcc.o # global symbols (sorted by address) @@ -28,120 +28,121 @@ 0x000000 __bss_seg3_bank 0x000000 __bss_seg3_lo16 0x000000 __bss_seg3_size -0x00000a __bss_seg0_size -0x00000a __bss_size +0x000038 __bss_seg0_size +0x000038 __bss_size 0x001000 __start 0x001000 __text_start 0x0010ba main -0x0015c1 CtlStartUp -0x0015d1 EMStartUp -0x0015f0 FMStartUp -0x001600 LEStartUp -0x001610 LoadOneTool -0x001620 NewHandle -0x001646 MenuStartUp -0x001656 InsertMenu -0x00166b NewMenu -0x001685 QDStartUp -0x00169b DrawString -0x0016ad FrameRect -0x0016bf MoveTo -0x0016cf PaintRect -0x0016e1 startdesk -0x001ac7 __jsl_indir -0x001aca __mulhi3 -0x001ae9 __umulhisi3 -0x001b40 __ashlhi3 -0x001b4f __lshrhi3 -0x001b5f __ashrhi3 -0x001b72 __udivhi3 -0x001b7e __umodhi3 -0x001b8a __divhi3 -0x001ba4 __modhi3 -0x001bbe __divmod_setup -0x001bf1 __udivmod_core -0x001c0f __mulsi3 -0x001cc8 __ashlsi3 -0x001cdd __lshrsi3 -0x001cf2 __ashrsi3 -0x001d0c __udivmodsi_core -0x001d44 __udivsi3 -0x001d58 __umodsi3 -0x001d6c __divsi3 
-0x001d93 __modsi3 -0x001dba __divmodsi_setup -0x001e0b __divmoddi4_stash -0x001e28 __retdi -0x001e35 __ashldi3 -0x001e58 __lshrdi3 -0x001e7b __ashrdi3 -0x001ea1 __muldi3 -0x001efc __ucmpdi2 -0x001f25 __cmpdi2 -0x001f5c __udivdi3 -0x001f65 __umoddi3 -0x001f7e __udivmoddi_core -0x001fcb __divdi3 -0x001fea __moddi3 -0x002017 __absdi_a -0x00201f __absdi_b -0x002027 __negdi_a -0x002045 __negdi_b -0x002063 setjmp -0x00208b longjmp -0x0020b5 __umulhisi3_qsq -0x0024b3 __rodata_start -0x0024b3 __text_end -0x0024b3 gChainPath -0x0024c7 editMenuStr -0x002520 fileMenuStr -0x00254d appleMenuStr -0x00256c line1 -0x002577 line2 -0x002587 line3 -0x002599 line4 -0x0025ae btnOk -0x0025b2 __init_array_end -0x0025b2 __init_array_start -0x0025b2 __rodata_end +0x001321 CtlStartUp +0x001331 NoteAlert +0x00134d EMStartUp +0x00136c FMStartUp +0x00137c LEStartUp +0x00138c LoadOneTool +0x00139c NewHandle +0x0013c2 MenuStartUp +0x0013d2 HiliteMenu +0x0013e2 InsertMenu +0x0013f7 NewMenu +0x001411 QDStartUp +0x001427 TaskMaster +0x00143e startdesk +0x001868 paintDesktopBackdrop +0x00189a __jsl_indir +0x00189d __mulhi3 +0x0018bc __umulhisi3 +0x001913 __ashlhi3 +0x001922 __lshrhi3 +0x001932 __ashrhi3 +0x001945 __udivhi3 +0x001951 __umodhi3 +0x00195d __divhi3 +0x001977 __modhi3 +0x001991 __divmod_setup +0x0019c4 __udivmod_core +0x0019e2 __mulsi3 +0x001a9b __ashlsi3 +0x001ab0 __lshrsi3 +0x001ac5 __ashrsi3 +0x001adf __udivmodsi_core +0x001b17 __udivsi3 +0x001b2b __umodsi3 +0x001b3f __divsi3 +0x001b66 __modsi3 +0x001b8d __divmodsi_setup +0x001bde __divmoddi4_stash +0x001bfb __retdi +0x001c08 __ashldi3 +0x001c2b __lshrdi3 +0x001c4e __ashrdi3 +0x001c74 __muldi3 +0x001ccf __ucmpdi2 +0x001cf8 __cmpdi2 +0x001d2f __udivdi3 +0x001d38 __umoddi3 +0x001d51 __udivmoddi_core +0x001d9e __divdi3 +0x001dbd __moddi3 +0x001dea __absdi_a +0x001df2 __absdi_b +0x001dfa __negdi_a +0x001e18 __negdi_b +0x001e36 setjmp +0x001e5e longjmp +0x001e88 __umulhisi3_qsq +0x002286 __rodata_start +0x002286 __text_end +0x002286 
gChainPath +0x00229a editMenuStr +0x0022f3 fileMenuStr +0x002320 appleMenuStr +0x00233f gAboutMsg +0x00237f doAlert.okStr +0x002384 doAlert.button +0x00239c doAlert.message +0x0023b4 doAlert.alertRec +0x0023f2 __init_array_end +0x0023f2 __init_array_start +0x0023f2 __rodata_end 0x00a000 __bss_lo16 0x00a000 __bss_seg0_lo16 0x00a000 __bss_start -0x00a000 gUserId -0x00a002 gDpHandle -0x00a006 gDpBase -0x00a008 __indirTarget -0x00a00a __bss_end -0x00a00a __heap_start +0x00a000 gEvent +0x00a02c gDone +0x00a02e gUserId +0x00a030 gDpHandle +0x00a034 gDpBase +0x00a036 __indirTarget +0x00a038 __bss_end +0x00a038 __heap_start 0x00bf00 __heap_end -CtlStartUp = 0x0015c1 -DrawString = 0x00169b -EMStartUp = 0x0015d1 -FMStartUp = 0x0015f0 -FrameRect = 0x0016ad -InsertMenu = 0x001656 -LEStartUp = 0x001600 -LoadOneTool = 0x001610 -MenuStartUp = 0x001646 -MoveTo = 0x0016bf -NewHandle = 0x001620 -NewMenu = 0x00166b -PaintRect = 0x0016cf -QDStartUp = 0x001685 -__absdi_a = 0x002017 -__absdi_b = 0x00201f -__ashldi3 = 0x001e35 -__ashlhi3 = 0x001b40 -__ashlsi3 = 0x001cc8 -__ashrdi3 = 0x001e7b -__ashrhi3 = 0x001b5f -__ashrsi3 = 0x001cf2 +CtlStartUp = 0x001321 +EMStartUp = 0x00134d +FMStartUp = 0x00136c +HiliteMenu = 0x0013d2 +InsertMenu = 0x0013e2 +LEStartUp = 0x00137c +LoadOneTool = 0x00138c +MenuStartUp = 0x0013c2 +NewHandle = 0x00139c +NewMenu = 0x0013f7 +NoteAlert = 0x001331 +QDStartUp = 0x001411 +TaskMaster = 0x001427 +__absdi_a = 0x001dea +__absdi_b = 0x001df2 +__ashldi3 = 0x001c08 +__ashlhi3 = 0x001913 +__ashlsi3 = 0x001a9b +__ashrdi3 = 0x001c4e +__ashrhi3 = 0x001932 +__ashrsi3 = 0x001ac5 __bss_bank = 0x000000 -__bss_end = 0x00a00a +__bss_end = 0x00a038 __bss_lo16 = 0x00a000 __bss_seg0_bank = 0x000000 __bss_seg0_lo16 = 0x00a000 -__bss_seg0_size = 0x00000a +__bss_seg0_size = 0x000038 __bss_seg1_bank = 0x000000 __bss_seg1_lo16 = 0x000000 __bss_seg1_size = 0x000000 @@ -151,63 +152,66 @@ __bss_seg2_size = 0x000000 __bss_seg3_bank = 0x000000 __bss_seg3_lo16 = 0x000000 __bss_seg3_size = 
0x000000 -__bss_size = 0x00000a +__bss_size = 0x000038 __bss_start = 0x00a000 -__cmpdi2 = 0x001f25 -__divdi3 = 0x001fcb -__divhi3 = 0x001b8a -__divmod_setup = 0x001bbe -__divmoddi4_stash = 0x001e0b -__divmodsi_setup = 0x001dba -__divsi3 = 0x001d6c +__cmpdi2 = 0x001cf8 +__divdi3 = 0x001d9e +__divhi3 = 0x00195d +__divmod_setup = 0x001991 +__divmoddi4_stash = 0x001bde +__divmodsi_setup = 0x001b8d +__divsi3 = 0x001b3f __heap_end = 0x00bf00 -__heap_start = 0x00a00a -__indirTarget = 0x00a008 -__init_array_end = 0x0025b2 -__init_array_start = 0x0025b2 -__jsl_indir = 0x001ac7 -__lshrdi3 = 0x001e58 -__lshrhi3 = 0x001b4f -__lshrsi3 = 0x001cdd -__moddi3 = 0x001fea -__modhi3 = 0x001ba4 -__modsi3 = 0x001d93 -__muldi3 = 0x001ea1 -__mulhi3 = 0x001aca -__mulsi3 = 0x001c0f -__negdi_a = 0x002027 -__negdi_b = 0x002045 -__retdi = 0x001e28 -__rodata_end = 0x0025b2 -__rodata_start = 0x0024b3 +__heap_start = 0x00a038 +__indirTarget = 0x00a036 +__init_array_end = 0x0023f2 +__init_array_start = 0x0023f2 +__jsl_indir = 0x00189a +__lshrdi3 = 0x001c2b +__lshrhi3 = 0x001922 +__lshrsi3 = 0x001ab0 +__moddi3 = 0x001dbd +__modhi3 = 0x001977 +__modsi3 = 0x001b66 +__muldi3 = 0x001c74 +__mulhi3 = 0x00189d +__mulsi3 = 0x0019e2 +__negdi_a = 0x001dfa +__negdi_b = 0x001e18 +__retdi = 0x001bfb +__rodata_end = 0x0023f2 +__rodata_start = 0x002286 __start = 0x001000 -__text_end = 0x0024b3 +__text_end = 0x002286 __text_start = 0x001000 -__ucmpdi2 = 0x001efc -__udivdi3 = 0x001f5c -__udivhi3 = 0x001b72 -__udivmod_core = 0x001bf1 -__udivmoddi_core = 0x001f7e -__udivmodsi_core = 0x001d0c -__udivsi3 = 0x001d44 -__umoddi3 = 0x001f65 -__umodhi3 = 0x001b7e -__umodsi3 = 0x001d58 -__umulhisi3 = 0x001ae9 -__umulhisi3_qsq = 0x0020b5 -appleMenuStr = 0x00254d -btnOk = 0x0025ae -editMenuStr = 0x0024c7 -fileMenuStr = 0x002520 -gChainPath = 0x0024b3 -gDpBase = 0x00a006 -gDpHandle = 0x00a002 -gUserId = 0x00a000 -line1 = 0x00256c -line2 = 0x002577 -line3 = 0x002587 -line4 = 0x002599 -longjmp = 0x00208b +__ucmpdi2 = 0x001ccf 
+__udivdi3 = 0x001d2f +__udivhi3 = 0x001945 +__udivmod_core = 0x0019c4 +__udivmoddi_core = 0x001d51 +__udivmodsi_core = 0x001adf +__udivsi3 = 0x001b17 +__umoddi3 = 0x001d38 +__umodhi3 = 0x001951 +__umodsi3 = 0x001b2b +__umulhisi3 = 0x0018bc +__umulhisi3_qsq = 0x001e88 +appleMenuStr = 0x002320 +doAlert.alertRec = 0x0023b4 +doAlert.button = 0x002384 +doAlert.message = 0x00239c +doAlert.okStr = 0x00237f +editMenuStr = 0x00229a +fileMenuStr = 0x0022f3 +gAboutMsg = 0x00233f +gChainPath = 0x002286 +gDone = 0x00a02c +gDpBase = 0x00a034 +gDpHandle = 0x00a030 +gEvent = 0x00a000 +gUserId = 0x00a02e +longjmp = 0x001e5e main = 0x0010ba -setjmp = 0x002063 -startdesk = 0x0016e1 +paintDesktopBackdrop = 0x001868 +setjmp = 0x001e36 +startdesk = 0x00143e diff --git a/demos/frame.o b/demos/frame.o index b560f0d..3c3d56b 100644 Binary files a/demos/frame.o and b/demos/frame.o differ diff --git a/demos/frame.omf b/demos/frame.omf index a9549d3..c5d7fe7 100644 Binary files a/demos/frame.omf and b/demos/frame.omf differ diff --git a/demos/frame.reloc b/demos/frame.reloc index 12949c8..d379767 100644 Binary files a/demos/frame.reloc and b/demos/frame.reloc differ diff --git a/demos/minicad.bin b/demos/minicad.bin index f37581a..9acfe8a 100644 Binary files a/demos/minicad.bin and b/demos/minicad.bin differ diff --git a/demos/minicad.c b/demos/minicad.c index 59ebe8f..e313a78 100644 --- a/demos/minicad.c +++ b/demos/minicad.c @@ -1,25 +1,76 @@ -// minicad.c - port of ORCA-C's MiniCAD.cc sample. +// minicad.c - faithful port of ORCA-C's MiniCAD.cc sample. // -// MiniCAD is a tiny drawing program: each click in the content area -// creates a new line in the current window's line list. In the -// original you click to set the anchor, drag to draw a rubber-band -// line, release to commit. 
We seed three classic line-art patterns -// (curve-stitching, sunburst, mandala) instead of waiting for clicks -// because our minimal Event Manager doesn't have a working -// GetNextEvent path for mouse-drag tracking, but the data model and -// rendering pipeline match MiniCAD.cc verbatim. +// Mike Westerfield, Byte Works 1989. Original at +// tools/orca-c/C.Samples/Desktop.Samples/MiniCAD.cc. +// +// A simple multi-window CAD: File>New opens a drawing window (up to +// 4), click+drag inside a window's content rubber-bands a line, +// release commits it. File>Close closes the front window. Each +// window's lines are remembered so the WM can repaint on update. #include "iigs/toolbox.h" #include "iigs/desktop.h" +#define apple_About 257 +#define file_Quit 256 +#define file_New 258 +#define file_Close 255 + +#define wInMenuBar 3 +#define wInSpecial 25 +#define wInGoAway 17 #define wInContent 19 -#define fVis 0x0020 -#define fMove 0x0080 -#define fClose 0x4000 + +#define mUpMask 0x0002 + +#define modeCopy 0 +#define modeXOR 2 + +#define topMost ((void *)-1L) +#define bottomMost ((void *)0) + +#define maxWindows 4 +#define maxLines 50 + +#define norml 0 +#define stop 1 +#define note 2 +#define caution 3 +#define buttonItem 10 +#define statText 136 +#define itemDisable 0x8000 typedef struct { short v1, h1, v2, h2; } Rect; +typedef struct { short v, h; } Point; +typedef struct { Point p1, p2; } LineRec; + + +typedef struct { + unsigned short wmWhat; + unsigned long wmMessage; + unsigned long wmWhen; + short wmWhereV, wmWhereH; + unsigned short wmModifiers; + unsigned long wmTaskData; + unsigned long wmTaskMask; + unsigned long wmLastClickTick; + unsigned long wmClickCount; + unsigned long wmTaskData2; + unsigned long wmTaskData3; + unsigned long wmTaskData4; +} WmTaskRec; + + +typedef struct { + unsigned short wmWhat; + unsigned long wmMessage; + unsigned long wmWhen; + short wmWhereV, wmWhereH; + unsigned short wmModifiers; +} EventRec; + typedef struct { unsigned 
short paramLength; @@ -44,106 +95,282 @@ typedef struct { } NewWindowParm; -typedef struct { short v1, h1, v2, h2; } LineRec; +typedef struct { + short itemID; + short itemRectV1, itemRectH1, itemRectV2, itemRectH2; + unsigned short itemType; + void *itemDescr; + short itemValue; + short itemFlag; + void *itemColor; +} ItemTemplate; + +typedef struct { + short atRectV1, atRectH1, atRectV2, atRectH2; + short atBtnHorz; + short atBeep0, atBeep1, atBeep2, atBeep3; + void *atSound; + void *atResv1; + void *atResv2; + void *atItemList[8]; +} AlertTemplate; -static unsigned char gTitle[] = "\x07MiniCAD"; +typedef struct { + void *wPtr; + unsigned char *name; + unsigned short numLines; + LineRec lines[maxLines]; +} WindowRecord; -// Menu bar titles painted manually (DrawMenuBar hangs in our env). -static const unsigned char appleTitle[] = "\x01\x14"; -static const unsigned char fileTitle[] = "\x04" "File"; -static const unsigned char editTitle[] = "\x04" "Edit"; -static const unsigned char optsTitle[] = "\x07" "Options"; -static const unsigned char *const menuTitles[] = { - appleTitle, fileTitle, editTitle, optsTitle + +static unsigned char editMenuStr[] = ">> Edit \\N3\r" + "--Undo\\N250V*Zz\r" + "--Cut\\N251*Xx\r" + "--Copy\\N252*Cc\r" + "--Paste\\N253*Vv\r" + "--Clear\\N254\r" + ".\r"; + +static unsigned char fileMenuStr[] = ">> File \\N2\r" + "--New\\N258*Nn\r" + "--Close\\N255V\r" + "--Quit\\N256*Qq\r" + ".\r"; + +static unsigned char appleMenuStr[] = ">>@\\XN1\r" + "--About...\\N257V\r" + ".\r"; + +static unsigned char gAboutMsg[] = + "\x3d" "Mini-CAD 1.0\r" + "Copyright 1989\r" + "Byte Works, Inc.\r\r" + "By Mike Westerfield"; + +static unsigned char gTitle0[] = "\x07Paint 1"; +static unsigned char gTitle1[] = "\x07Paint 2"; +static unsigned char gTitle2[] = "\x07Paint 3"; +static unsigned char gTitle3[] = "\x07Paint 4"; + +static WindowRecord gWindows[maxWindows] = { + { (void *)0, gTitle0, 0, { { {0,0}, {0,0} } } }, + { (void *)0, gTitle1, 0, { { {0,0}, {0,0} } } 
}, + { (void *)0, gTitle2, 0, { { {0,0}, {0,0} } } }, + { (void *)0, gTitle3, 0, { { {0,0}, {0,0} } } } }; -static NewWindowParm gWp; +static WmTaskRec gEvent; +static volatile unsigned short gDone; -// Draw a curve-stitching pattern: 12 chord lines mapping the y-axis -// to a curve along the x-axis. Visually it traces a hyperbolic -// envelope (the classic "string art" pattern). -static void drawCurves(short ox, short oy) { - for (short i = 0; i < 12; i++) { - MoveTo((short)(ox + 0), (short)(oy + i * 6)); - LineTo((short)(ox + 90 - i * 5), (short)(oy + 70 - i * 5)); +static void doAlert(unsigned short kind, void *msg) { + static unsigned char okStr[] = "\x02OK"; + static ItemTemplate button = { + 1, 36, 15, 0, 0, buttonItem, okStr, 0, 0, (void *)0 + }; + static ItemTemplate message = { + 100, 5, 100, 90, 280, itemDisable | statText, (void *)0, 0, 0, (void *)0 + }; + static AlertTemplate alertRec = { + 50, 180, 107, 460, 2, 0x80, 0x80, 0x80, 0x80, + (void *)0, (void *)0, (void *)0, + { (void *)0, (void *)0, (void *)0, (void *)0, + (void *)0, (void *)0, (void *)0, (void *)0 } + }; + SetForeColor(0); + SetBackColor(15); + message.itemDescr = msg; + alertRec.atItemList[0] = (void *)&button; + alertRec.atItemList[1] = (void *)&message; + alertRec.atItemList[2] = (void *)0; + switch (kind) { + case norml: (void)Alert(&alertRec, (void *)0); break; + case stop: (void)StopAlert(&alertRec, (void *)0); break; + case note: (void)NoteAlert(&alertRec, (void *)0); break; + case caution: (void)CautionAlert(&alertRec, (void *)0); break; + default: break; } } -// Draw a sunburst: 12 radial lines from a central point. -static void drawSunburst(short cx, short cy, short r) { - // Pre-computed cos/sin for 12 equally-spaced angles (every 30 - // degrees), scaled by 1000. Avoids any float math. 
- static const short cosA[12] = { 1000, 866, 500, 0, -500, -866, -1000, -866, -500, 0, 500, 866 }; - static const short sinA[12] = { 0, 500, 866, 1000, 866, 500, 0, -500, -866, -1000, -866, -500 }; - for (short i = 0; i < 12; i++) { - short dx = (short)((long)cosA[i] * r / 1000); - short dy = (short)((long)sinA[i] * r / 1000); - MoveTo((short)(cx - dx), (short)(cy - dy)); - LineTo((short)(cx + dx), (short)(cy + dy)); +// Window-content def-proc. The WM calls this with DBR set to our +// bank (Loader sets up the JSL chain). We use GetWRefCon on the +// current port to know which gWindows[] entry to redraw. +static void drawWindow(void) { + unsigned long refcon = (unsigned long)GetWRefCon(GetPort()); + unsigned short i = (unsigned short)refcon; + if (i >= maxWindows) return; + WindowRecord *wp = &gWindows[i]; + if (wp->numLines == 0) return; + SetPenMode(modeCopy); + SetSolidPenPat(0); + SetPenSize(2, 1); + for (unsigned short j = 0; j < wp->numLines; j++) { + LineRec *lp = &wp->lines[j]; + MoveTo(lp->p1.h, lp->p1.v); + LineTo(lp->p2.h, lp->p2.v); } } -// Draw a mandala: 6-pointed star made of two overlapping triangles. -static void drawMandala(short cx, short cy, short r) { - short h = (short)((long)r * 866L / 1000L); - short h2 = (short)(r / 2); - // First triangle (point up). - MoveTo(cx, (short)(cy - r)); - LineTo((short)(cx + h), (short)(cy + h2)); - LineTo((short)(cx - h), (short)(cy + h2)); - LineTo(cx, (short)(cy - r)); - // Second triangle (point down). 
- MoveTo(cx, (short)(cy + r)); - LineTo((short)(cx + h), (short)(cy - h2)); - LineTo((short)(cx - h), (short)(cy - h2)); - LineTo(cx, (short)(cy + r)); +static void doNew(void) { + static NewWindowParm wp; + unsigned short i = 0; + while (i < maxWindows && gWindows[i].wPtr != (void *)0) i++; + if (i >= maxWindows) return; + gWindows[i].numLines = 0; + + unsigned char *p = (unsigned char *)&wp; + for (unsigned short k = 0; k < sizeof wp; k++) p[k] = 0; + wp.paramLength = (unsigned short)sizeof wp; + wp.wFrameBits = 0x4007 | 0x0020 | 0x0080 | 0x0400 | 0x4000; // fTitle+fClose+fVis+fMove+fGrow + wp.wTitle = gWindows[i].name; + wp.wRefCon = (unsigned long)i; + wp.wMaxHeight = 188; + wp.wMaxWidth = 615; + wp.wPosition.v1 = (short)(25 + i * 10); + wp.wPosition.h1 = (short)(10 + i * 10); + wp.wPosition.v2 = (short)(180 + i * 10); + wp.wPosition.h2 = (short)(600 + i * 10); + wp.wContDefProc = (void *)&drawWindow; + wp.wPlane = topMost; + gWindows[i].wPtr = NewWindow(&wp); + if (i == maxWindows - 1) { + DisableMItem(file_New); + } +} + + +static void doClose(void) { + void *fw = FrontWindow(); + if (!fw) return; + unsigned short i = (unsigned short)(unsigned long)GetWRefCon(fw); + if (i >= maxWindows) return; + CloseWindow(gWindows[i].wPtr); + gWindows[i].wPtr = (void *)0; + EnableMItem(file_New); +} + + +static void menuAbout(void) { + doAlert(note, gAboutMsg); +} + + +static void sketch(void) { + void *fw = FrontWindow(); + if (!fw) return; + unsigned short i = (unsigned short)(unsigned long)GetWRefCon(fw); + if (i >= maxWindows) return; + if (gWindows[i].numLines >= maxLines) { + static unsigned char fullMsg[] = + "\x3a" "The window is full -\r" + "more lines cannot be\r" + "added."; + doAlert(stop, fullMsg); + return; + } + + StartDrawing(fw); + SetSolidPenPat(15); + SetPenSize(2, 1); + SetPenMode(modeXOR); + + Point firstPt; + firstPt.h = gEvent.wmWhereH; + firstPt.v = gEvent.wmWhereV; + GlobalToLocal(&firstPt); + MoveTo(firstPt.h, firstPt.v); + LineTo(firstPt.h,
firstPt.v); + Point endPt = firstPt; + + EventRec ev; + while (!GetNextEvent(mUpMask, &ev)) { + Point cur; + cur.h = ev.wmWhereH; + cur.v = ev.wmWhereV; + GlobalToLocal(&cur); + if (cur.h != endPt.h || cur.v != endPt.v) { + MoveTo(firstPt.h, firstPt.v); + LineTo(endPt.h, endPt.v); + MoveTo(firstPt.h, firstPt.v); + LineTo(cur.h, cur.v); + endPt = cur; + } + } + + // Erase final XOR line. + MoveTo(firstPt.h, firstPt.v); + LineTo(endPt.h, endPt.v); + + if (firstPt.h != endPt.h || firstPt.v != endPt.v) { + unsigned short n = gWindows[i].numLines++; + gWindows[i].lines[n].p1 = firstPt; + gWindows[i].lines[n].p2 = endPt; + SetPenMode(modeCopy); + SetSolidPenPat(0); + MoveTo(firstPt.h, firstPt.v); + LineTo(endPt.h, endPt.v); + } +} + + +static void handleMenu(unsigned short menuNum) { + switch (menuNum) { + case apple_About: menuAbout(); break; + case file_Quit: gDone = 1; break; + case file_New: doNew(); break; + case file_Close: doClose(); break; + default: break; + } + HiliteMenu(0, (unsigned short)(gEvent.wmTaskData >> 16)); +} + + +static void initMenus(void) { + InsertMenu(NewMenu(editMenuStr), 0); + InsertMenu(NewMenu(fileMenuStr), 0); + InsertMenu(NewMenu(appleMenuStr), 0); + FixAppleMenu(1); + FixMenuBar(); + DrawMenuBar(); } int main(void) { unsigned short userId = startdesk(640); (void)userId; + paintDesktopBackdrop(); - paintMenuBarTitles(menuTitles, 4); + initMenus(); + gEvent.wmTaskMask = 0x1FFFL; ShowCursor(); - // Open the drawing window. - { - unsigned char *p = (unsigned char *)&gWp; - for (unsigned short i = 0; i < sizeof gWp; i++) p[i] = 0; - } - gWp.paramLength = (unsigned short)sizeof gWp; - gWp.wFrameBits = fVis | fMove | fClose; - gWp.wTitle = gTitle; - gWp.wMaxHeight = 200; - gWp.wMaxWidth = 640; - gWp.wPosition.v1 = 20; gWp.wPosition.h1 = 30; - gWp.wPosition.v2 = 180; gWp.wPosition.h2 = 610; - gWp.wPlane = (void *)-1L; - void *win = NewWindow(&gWp); + // Open one window so the demo has visible content immediately. 
+ doNew(); - if (win) { - BeginUpdate(win); - SetPort(win); - SetSolidPenPat(0); + gDone = 0; + unsigned short watchdog = 0; + do { + unsigned short event = TaskMaster(0x076E, &gEvent); + switch (event) { + case wInSpecial: + case wInMenuBar: + handleMenu((unsigned short)gEvent.wmTaskData); + break; + case wInGoAway: + doClose(); + break; + case wInContent: + sketch(); + break; + default: + break; + } + watchdog++; + } while (!gDone && watchdog < 4000); - // Three patterns laid out horizontally. - drawCurves(20, 30); - drawSunburst(280, 75, 50); - drawMandala(450, 75, 50); - - EndUpdate(win); - } - - for (volatile unsigned long s = 0; s < 400000UL; s++) { } - - if (win) { - CloseWindow(win); - } *(volatile unsigned char *)0x70 = 0x99; return 0; } diff --git a/demos/minicad.map b/demos/minicad.map index bae295a..999566f 100644 --- a/demos/minicad.map +++ b/demos/minicad.map @@ -1,19 +1,19 @@ # section layout -.text : 0x001000 .. 0x002638 ( 5688 bytes) -.rodata : 0x002638 .. 0x0026ad ( 117 bytes) -.bss : 0x00a000 .. 0x00a058 ( 88 bytes) +.text : 0x001000 .. 0x003102 ( 8450 bytes) +.rodata : 0x003102 .. 0x00393a ( 2104 bytes) +.bss : 0x00a000 .. 
0x00a086 ( 134 bytes) # per-input-file .text contributions 186 /home/scott/claude/llvm816/runtime/crt0Gsos.o - 1374 /home/scott/claude/llvm816/demos/minicad.o - 43513 /home/scott/claude/llvm816/runtime/libc.o - 5935 /home/scott/claude/llvm816/runtime/snprintf.o + 4058 /home/scott/claude/llvm816/demos/minicad.o + 43132 /home/scott/claude/llvm816/runtime/libc.o + 14895 /home/scott/claude/llvm816/runtime/snprintf.o 11953 /home/scott/claude/llvm816/runtime/extras.o 7077 /home/scott/claude/llvm816/runtime/softFloat.o 15379 /home/scott/claude/llvm816/runtime/softDouble.o 176 /home/scott/claude/llvm816/runtime/iigsGsos.o 20670 /home/scott/claude/llvm816/runtime/iigsToolbox.o - 1302 /home/scott/claude/llvm816/runtime/desktop.o + 1349 /home/scott/claude/llvm816/runtime/desktop.o 2540 /home/scott/claude/llvm816/runtime/libgcc.o # global symbols (sorted by address) @@ -28,126 +28,154 @@ 0x000000 __bss_seg3_bank 0x000000 __bss_seg3_lo16 0x000000 __bss_seg3_size -0x000058 __bss_seg0_size -0x000058 __bss_size +0x000086 __bss_seg0_size +0x000086 __bss_size 0x001000 __start 0x001000 __text_start 0x0010ba main -0x001618 memset -0x001678 CtlStartUp -0x001688 EMStartUp -0x0016a7 FMStartUp -0x0016b7 LEStartUp -0x0016c7 LoadOneTool -0x0016d7 NewHandle -0x0016fd QDStartUp -0x001713 DrawString -0x001725 LineTo -0x001735 MoveTo -0x001745 SetPort -0x001757 BeginUpdate -0x001769 CloseWindow -0x00177b EndUpdate -0x00178d NewWindow -0x0017a7 startdesk -0x001b5e paintMenuBarTitles -0x001c1a paintDesktopBackdrop -0x001c4c __jsl_indir -0x001c4f __mulhi3 -0x001c6e __umulhisi3 -0x001cc5 __ashlhi3 -0x001cd4 __lshrhi3 -0x001ce4 __ashrhi3 -0x001cf7 __udivhi3 -0x001d03 __umodhi3 -0x001d0f __divhi3 -0x001d29 __modhi3 -0x001d43 __divmod_setup -0x001d76 __udivmod_core -0x001d94 __mulsi3 -0x001e4d __ashlsi3 -0x001e62 __lshrsi3 -0x001e77 __ashrsi3 -0x001e91 __udivmodsi_core -0x001ec9 __udivsi3 -0x001edd __umodsi3 -0x001ef1 __divsi3 -0x001f18 __modsi3 -0x001f3f __divmodsi_setup -0x001f90 __divmoddi4_stash 
-0x001fad __retdi -0x001fba __ashldi3 -0x001fdd __lshrdi3 -0x002000 __ashrdi3 -0x002026 __muldi3 -0x002081 __ucmpdi2 -0x0020aa __cmpdi2 -0x0020e1 __udivdi3 -0x0020ea __umoddi3 -0x002103 __udivmoddi_core -0x002150 __divdi3 -0x00216f __moddi3 -0x00219c __absdi_a -0x0021a4 __absdi_b -0x0021ac __negdi_a -0x0021ca __negdi_b -0x0021e8 setjmp -0x002210 longjmp -0x00223a __umulhisi3_qsq -0x002638 __rodata_start -0x002638 __text_end -0x002638 gChainPath -0x00264c menuTitles -0x00265c appleTitle -0x00265f fileTitle -0x002665 editTitle -0x00266b optsTitle -0x002674 drawSunburst.cosA -0x00268c drawSunburst.sinA -0x0026a4 gTitle -0x0026ad __init_array_end -0x0026ad __init_array_start -0x0026ad __rodata_end +0x001eee drawWindow +0x002094 memset +0x0020f4 CtlStartUp +0x002104 NoteAlert +0x002120 StopAlert +0x00213c EMStartUp +0x00215b GetNextEvent +0x002172 FMStartUp +0x002182 LEStartUp +0x002192 LoadOneTool +0x0021a2 NewHandle +0x0021c8 MenuStartUp +0x0021d8 HiliteMenu +0x0021e8 InsertMenu +0x0021fd NewMenu +0x002217 QDStartUp +0x00222d GetPort +0x00223d GlobalToLocal +0x00224f LineTo +0x00225f MoveTo +0x00226f SetPenSize +0x00227f CloseWindow +0x002291 FrontWindow +0x0022a1 GetWRefCon +0x0022bb NewWindow +0x0022d5 StartDrawing +0x0022e7 TaskMaster +0x0022fe startdesk +0x0026e4 paintDesktopBackdrop +0x002716 __jsl_indir +0x002719 __mulhi3 +0x002738 __umulhisi3 +0x00278f __ashlhi3 +0x00279e __lshrhi3 +0x0027ae __ashrhi3 +0x0027c1 __udivhi3 +0x0027cd __umodhi3 +0x0027d9 __divhi3 +0x0027f3 __modhi3 +0x00280d __divmod_setup +0x002840 __udivmod_core +0x00285e __mulsi3 +0x002917 __ashlsi3 +0x00292c __lshrsi3 +0x002941 __ashrsi3 +0x00295b __udivmodsi_core +0x002993 __udivsi3 +0x0029a7 __umodsi3 +0x0029bb __divsi3 +0x0029e2 __modsi3 +0x002a09 __divmodsi_setup +0x002a5a __divmoddi4_stash +0x002a77 __retdi +0x002a84 __ashldi3 +0x002aa7 __lshrdi3 +0x002aca __ashrdi3 +0x002af0 __muldi3 +0x002b4b __ucmpdi2 +0x002b74 __cmpdi2 +0x002bab __udivdi3 +0x002bb4 __umoddi3 +0x002bcd __udivmoddi_core 
+0x002c1a __divdi3 +0x002c39 __moddi3 +0x002c66 __absdi_a +0x002c6e __absdi_b +0x002c76 __negdi_a +0x002c94 __negdi_b +0x002cb2 setjmp +0x002cda longjmp +0x002d04 __umulhisi3_qsq +0x003102 __rodata_start +0x003102 __text_end +0x003102 gChainPath +0x003116 editMenuStr +0x00316f fileMenuStr +0x0031aa appleMenuStr +0x0031c6 gWindows +0x00382e gTitle0 +0x003837 gTitle1 +0x003840 gTitle2 +0x003849 gTitle3 +0x003852 gAboutMsg +0x003895 doAlert.okStr +0x00389a doAlert.button +0x0038b2 doAlert.message +0x0038ca doAlert.alertRec +0x003908 sketch.fullMsg +0x00393a __init_array_end +0x00393a __init_array_start +0x00393a __rodata_end 0x00a000 __bss_lo16 0x00a000 __bss_seg0_lo16 0x00a000 __bss_start -0x00a000 gWp -0x00a04e gUserId -0x00a050 gDpHandle -0x00a054 gDpBase -0x00a056 __indirTarget -0x00a058 __bss_end -0x00a058 __heap_start +0x00a000 gEvent +0x00a02c gDone +0x00a02e doNew.wp +0x00a07c gUserId +0x00a07e gDpHandle +0x00a082 gDpBase +0x00a084 __indirTarget +0x00a086 __bss_end +0x00a086 __heap_start 0x00bf00 __heap_end -BeginUpdate = 0x001757 -CloseWindow = 0x001769 -CtlStartUp = 0x001678 -DrawString = 0x001713 -EMStartUp = 0x001688 -EndUpdate = 0x00177b -FMStartUp = 0x0016a7 -LEStartUp = 0x0016b7 -LineTo = 0x001725 -LoadOneTool = 0x0016c7 -MoveTo = 0x001735 -NewHandle = 0x0016d7 -NewWindow = 0x00178d -QDStartUp = 0x0016fd -SetPort = 0x001745 -__absdi_a = 0x00219c -__absdi_b = 0x0021a4 -__ashldi3 = 0x001fba -__ashlhi3 = 0x001cc5 -__ashlsi3 = 0x001e4d -__ashrdi3 = 0x002000 -__ashrhi3 = 0x001ce4 -__ashrsi3 = 0x001e77 +CloseWindow = 0x00227f +CtlStartUp = 0x0020f4 +EMStartUp = 0x00213c +FMStartUp = 0x002172 +FrontWindow = 0x002291 +GetNextEvent = 0x00215b +GetPort = 0x00222d +GetWRefCon = 0x0022a1 +GlobalToLocal = 0x00223d +HiliteMenu = 0x0021d8 +InsertMenu = 0x0021e8 +LEStartUp = 0x002182 +LineTo = 0x00224f +LoadOneTool = 0x002192 +MenuStartUp = 0x0021c8 +MoveTo = 0x00225f +NewHandle = 0x0021a2 +NewMenu = 0x0021fd +NewWindow = 0x0022bb +NoteAlert = 0x002104 +QDStartUp = 
0x002217 +SetPenSize = 0x00226f +StartDrawing = 0x0022d5 +StopAlert = 0x002120 +TaskMaster = 0x0022e7 +__absdi_a = 0x002c66 +__absdi_b = 0x002c6e +__ashldi3 = 0x002a84 +__ashlhi3 = 0x00278f +__ashlsi3 = 0x002917 +__ashrdi3 = 0x002aca +__ashrhi3 = 0x0027ae +__ashrsi3 = 0x002941 __bss_bank = 0x000000 -__bss_end = 0x00a058 +__bss_end = 0x00a086 __bss_lo16 = 0x00a000 __bss_seg0_bank = 0x000000 __bss_seg0_lo16 = 0x00a000 -__bss_seg0_size = 0x000058 +__bss_seg0_size = 0x000086 __bss_seg1_bank = 0x000000 __bss_seg1_lo16 = 0x000000 __bss_seg1_size = 0x000000 @@ -157,67 +185,75 @@ __bss_seg2_size = 0x000000 __bss_seg3_bank = 0x000000 __bss_seg3_lo16 = 0x000000 __bss_seg3_size = 0x000000 -__bss_size = 0x000058 +__bss_size = 0x000086 __bss_start = 0x00a000 -__cmpdi2 = 0x0020aa -__divdi3 = 0x002150 -__divhi3 = 0x001d0f -__divmod_setup = 0x001d43 -__divmoddi4_stash = 0x001f90 -__divmodsi_setup = 0x001f3f -__divsi3 = 0x001ef1 +__cmpdi2 = 0x002b74 +__divdi3 = 0x002c1a +__divhi3 = 0x0027d9 +__divmod_setup = 0x00280d +__divmoddi4_stash = 0x002a5a +__divmodsi_setup = 0x002a09 +__divsi3 = 0x0029bb __heap_end = 0x00bf00 -__heap_start = 0x00a058 -__indirTarget = 0x00a056 -__init_array_end = 0x0026ad -__init_array_start = 0x0026ad -__jsl_indir = 0x001c4c -__lshrdi3 = 0x001fdd -__lshrhi3 = 0x001cd4 -__lshrsi3 = 0x001e62 -__moddi3 = 0x00216f -__modhi3 = 0x001d29 -__modsi3 = 0x001f18 -__muldi3 = 0x002026 -__mulhi3 = 0x001c4f -__mulsi3 = 0x001d94 -__negdi_a = 0x0021ac -__negdi_b = 0x0021ca -__retdi = 0x001fad -__rodata_end = 0x0026ad -__rodata_start = 0x002638 +__heap_start = 0x00a086 +__indirTarget = 0x00a084 +__init_array_end = 0x00393a +__init_array_start = 0x00393a +__jsl_indir = 0x002716 +__lshrdi3 = 0x002aa7 +__lshrhi3 = 0x00279e +__lshrsi3 = 0x00292c +__moddi3 = 0x002c39 +__modhi3 = 0x0027f3 +__modsi3 = 0x0029e2 +__muldi3 = 0x002af0 +__mulhi3 = 0x002719 +__mulsi3 = 0x00285e +__negdi_a = 0x002c76 +__negdi_b = 0x002c94 +__retdi = 0x002a77 +__rodata_end = 0x00393a +__rodata_start = 
0x003102 __start = 0x001000 -__text_end = 0x002638 +__text_end = 0x003102 __text_start = 0x001000 -__ucmpdi2 = 0x002081 -__udivdi3 = 0x0020e1 -__udivhi3 = 0x001cf7 -__udivmod_core = 0x001d76 -__udivmoddi_core = 0x002103 -__udivmodsi_core = 0x001e91 -__udivsi3 = 0x001ec9 -__umoddi3 = 0x0020ea -__umodhi3 = 0x001d03 -__umodsi3 = 0x001edd -__umulhisi3 = 0x001c6e -__umulhisi3_qsq = 0x00223a -appleTitle = 0x00265c -drawSunburst.cosA = 0x002674 -drawSunburst.sinA = 0x00268c -editTitle = 0x002665 -fileTitle = 0x00265f -gChainPath = 0x002638 -gDpBase = 0x00a054 -gDpHandle = 0x00a050 -gTitle = 0x0026a4 -gUserId = 0x00a04e -gWp = 0x00a000 -longjmp = 0x002210 +__ucmpdi2 = 0x002b4b +__udivdi3 = 0x002bab +__udivhi3 = 0x0027c1 +__udivmod_core = 0x002840 +__udivmoddi_core = 0x002bcd +__udivmodsi_core = 0x00295b +__udivsi3 = 0x002993 +__umoddi3 = 0x002bb4 +__umodhi3 = 0x0027cd +__umodsi3 = 0x0029a7 +__umulhisi3 = 0x002738 +__umulhisi3_qsq = 0x002d04 +appleMenuStr = 0x0031aa +doAlert.alertRec = 0x0038ca +doAlert.button = 0x00389a +doAlert.message = 0x0038b2 +doAlert.okStr = 0x003895 +doNew.wp = 0x00a02e +drawWindow = 0x001eee +editMenuStr = 0x003116 +fileMenuStr = 0x00316f +gAboutMsg = 0x003852 +gChainPath = 0x003102 +gDone = 0x00a02c +gDpBase = 0x00a082 +gDpHandle = 0x00a07e +gEvent = 0x00a000 +gTitle0 = 0x00382e +gTitle1 = 0x003837 +gTitle2 = 0x003840 +gTitle3 = 0x003849 +gUserId = 0x00a07c +gWindows = 0x0031c6 +longjmp = 0x002cda main = 0x0010ba -memset = 0x001618 -menuTitles = 0x00264c -optsTitle = 0x00266b -paintDesktopBackdrop = 0x001c1a -paintMenuBarTitles = 0x001b5e -setjmp = 0x0021e8 -startdesk = 0x0017a7 +memset = 0x002094 +paintDesktopBackdrop = 0x0026e4 +setjmp = 0x002cb2 +sketch.fullMsg = 0x003908 +startdesk = 0x0022fe diff --git a/demos/minicad.o b/demos/minicad.o index 717c10a..d37a62a 100644 Binary files a/demos/minicad.o and b/demos/minicad.o differ diff --git a/demos/minicad.omf b/demos/minicad.omf index 3cbb43e..d2a4d3b 100644 Binary files a/demos/minicad.omf and 
b/demos/minicad.omf differ diff --git a/demos/minicad.reloc b/demos/minicad.reloc index ecb326b..b137a76 100644 Binary files a/demos/minicad.reloc and b/demos/minicad.reloc differ diff --git a/demos/orcaFrameLike.bin b/demos/orcaFrameLike.bin deleted file mode 100644 index 6aee66e..0000000 Binary files a/demos/orcaFrameLike.bin and /dev/null differ diff --git a/demos/orcaFrameLike.c b/demos/orcaFrameLike.c deleted file mode 100644 index 19529fe..0000000 --- a/demos/orcaFrameLike.c +++ /dev/null @@ -1,160 +0,0 @@ -// orcaFrameLike.c - port of ORCA-C's Frame.cc sample. -// -// Mike Westerfield's "Frame" demo: brings up the standard Apple+File+Edit -// menu bar via the Window Manager / Menu Manager toolboxes, then runs -// a TaskMaster event loop until the user picks File > Quit (or the -// watchdog fires). Modeled after tools/orca-c/C.Samples/Desktop.Samples/ -// Frame.cc. -// -// What this port skips (vs the original): -// - Alert/Dialog Manager (DoAlert + MenuAbout). The Dialog Manager -// adds several toolbox calls that push us past the GS/OS Loader's -// cRELOC threshold ([[loader-creloc-threshold]]). HandleMenu for -// the "About" item is a no-op here. -// - enddesk() shutdown chain — GS/OS QUIT cleans up; see -// [[orca-frame-demo-landed]]. -// -// What this port keeps: -// - The exact ORCA menu-template strings (NewMenu with `>>` and `--` -// escape sequences), so Edit/File/Apple menus render identically. -// - HiliteMenu unhighlight after a menu pick. -// - TaskMaster mask 0x076E + the wInMenuBar / wInSpecial event -// dispatch. 
- -#include "iigs/toolbox.h" -#include "iigs/desktop.h" - -// Apple-assigned menu item IDs from Frame.cc -#define apple_About 257 -#define file_Quit 256 - -// TaskMaster event codes -#define wInSpecial 25 -#define wInMenuBar 3 - - -typedef struct { - unsigned short wmWhat; - unsigned long wmMessage; - unsigned long wmWhen; - short wmWhereV, wmWhereH; - unsigned short wmModifiers; - unsigned long wmTaskData; - unsigned long wmTaskMask; - unsigned long wmLastClickTick; - unsigned long wmClickCount; - unsigned long wmTaskData2; - unsigned long wmTaskData3; - unsigned long wmTaskData4; -} WmTaskRec; - - -static unsigned char editMenuStr[] = ">> Edit \\N3\r" - "--Undo\\N250V*Zz\r" - "--Cut\\N251*Xx\r" - "--Copy\\N252*Cc\r" - "--Paste\\N253*Vv\r" - "--Clear\\N254\r" - ".\r"; - -static unsigned char fileMenuStr[] = ">> File \\N2\r" - "--Close\\N255V\r" - "--Quit\\N256*Qq\r" - ".\r"; - -static unsigned char appleMenuStr[] = ">>@\\XN1\r" - "--About Frame\\N257V\r" - ".\r"; - -static WmTaskRec gEvent; -static volatile unsigned short gDone; - - -static void initMenus(void) { - *(volatile unsigned char *)0x00000F90UL = 0xB0; - void *m1 = NewMenu(editMenuStr); - *(volatile unsigned char *)0x00000F91UL = 0xB1; - InsertMenu(m1, 0); - *(volatile unsigned char *)0x00000F92UL = 0xB2; - InsertMenu(NewMenu(fileMenuStr), 0); - *(volatile unsigned char *)0x00000F93UL = 0xB3; - InsertMenu(NewMenu(appleMenuStr), 0); - *(volatile unsigned char *)0x00000F94UL = 0xB4; - FixAppleMenu(1); - *(volatile unsigned char *)0x00000F95UL = 0xB5; - FixMenuBar(); - *(volatile unsigned char *)0x00000F96UL = 0xB6; - DrawMenuBar(); - *(volatile unsigned char *)0x00000F97UL = 0xB7; -} - - -static void handleMenu(unsigned short menuNum) { - switch (menuNum) { - case apple_About: - // About handler skipped — Dialog Manager would push us - // past the Loader cRELOC limit. Real Frame.cc shows an - // alert; we just unhilite and continue. 
- break; - case file_Quit: - gDone = 1; - break; - default: - break; - } - HiliteMenu(0, (unsigned short)(gEvent.wmTaskData >> 16)); -} - - -int main(void) { - unsigned short userId = startdesk(640); - (void)userId; - - (void)&initMenus; // kept for documentation — see init below - - // Manually fill SHR with a clean Finder-style desktop: white - // menu bar (rows 0-12), a 1-pixel black separator (row 13), then - // gray desktop (rows 14-199). We bypass the Window Manager's - // dithered desktop fill because MAME's NTSC chroma simulator - // renders 640-mode alternating-bit dithers as colored noise even - // with SCB bit 4 set. - __asm__ volatile ( - "rep #0x30\n" - // Menu bar (rows 0..12): solid white = $FF bytes - "ldx #0x0000\n" - "1:\n" - ".byte 0xa9, 0xff, 0xff\n" // lda #$FFFF - ".byte 0x9f, 0x00, 0x20, 0xe1\n" // sta long $E1:2000, X - "inx\n inx\n" - ".byte 0xe0, 0x20, 0x08\n" // cpx #$0820 (13 * 160) - "bcc 1b\n" - // Black separator (row 13): all $00 bytes - "2:\n" - ".byte 0xa9, 0x00, 0x00\n" - ".byte 0x9f, 0x00, 0x20, 0xe1\n" - "inx\n inx\n" - ".byte 0xe0, 0xc0, 0x08\n" // cpx #$08C0 - "bcc 2b\n" - // Desktop (rows 14..199): solid white - "3:\n" - ".byte 0xa9, 0xff, 0xff\n" - ".byte 0x9f, 0x00, 0x20, 0xe1\n" - "inx\n inx\n" - ".byte 0xe0, 0x00, 0x7d\n" // cpx #$7D00 - "bcc 3b\n" - ::: "a", "x", "memory"); - gEvent.wmTaskMask = 0x1FFFL; - ShowCursor(); - - // Linger so the menu bar is visible (~1.5 sec at -nothrottle - // emulator speed). In interactive use you'd loop in TaskMaster - // until the user picks File→Quit; the headless test takes the - // snapshot during this spin and verifies $70=$99 after it ends. - (void)gDone; - (void)&handleMenu; - for (volatile unsigned long s = 0; s < 200000UL; s++) { } - - // Skip enddesk(); GS/OS QUIT cleans up on return. 
- *(volatile unsigned char *)0x70 = 0x99; - return 0; -} diff --git a/demos/orcaFrameLike.map b/demos/orcaFrameLike.map deleted file mode 100644 index ca1df87..0000000 --- a/demos/orcaFrameLike.map +++ /dev/null @@ -1,183 +0,0 @@ -# section layout -.text : 0x001000 .. 0x002085 ( 4229 bytes) -.rodata : 0x002085 .. 0x002099 ( 20 bytes) -.bss : 0x00a000 .. 0x00a00a ( 10 bytes) - -# per-input-file .text contributions - 186 /home/scott/claude/llvm816/runtime/crt0Gsos.o - 446 /home/scott/claude/llvm816/demos/orcaFrameLike.o - 43513 /home/scott/claude/llvm816/runtime/libc.o - 5935 /home/scott/claude/llvm816/runtime/snprintf.o - 11953 /home/scott/claude/llvm816/runtime/extras.o - 7077 /home/scott/claude/llvm816/runtime/softFloat.o - 15379 /home/scott/claude/llvm816/runtime/softDouble.o - 176 /home/scott/claude/llvm816/runtime/iigsGsos.o - 20670 /home/scott/claude/llvm816/runtime/iigsToolbox.o - 1050 /home/scott/claude/llvm816/runtime/desktop.o - 2540 /home/scott/claude/llvm816/runtime/libgcc.o - -# global symbols (sorted by address) -0x000000 __bss_bank -0x000000 __bss_seg0_bank -0x000000 __bss_seg1_bank -0x000000 __bss_seg1_lo16 -0x000000 __bss_seg1_size -0x000000 __bss_seg2_bank -0x000000 __bss_seg2_lo16 -0x000000 __bss_seg2_size -0x000000 __bss_seg3_bank -0x000000 __bss_seg3_lo16 -0x000000 __bss_seg3_size -0x00000a __bss_seg0_size -0x00000a __bss_size -0x001000 __start -0x001000 __text_start -0x0010ba main -0x001278 CtlStartUp -0x001288 EMStartUp -0x0012a7 FMStartUp -0x0012b7 LEStartUp -0x0012c7 LoadOneTool -0x0012d7 NewHandle -0x0012fd QDStartUp -0x001313 startdesk -0x001699 __jsl_indir -0x00169c __mulhi3 -0x0016bb __umulhisi3 -0x001712 __ashlhi3 -0x001721 __lshrhi3 -0x001731 __ashrhi3 -0x001744 __udivhi3 -0x001750 __umodhi3 -0x00175c __divhi3 -0x001776 __modhi3 -0x001790 __divmod_setup -0x0017c3 __udivmod_core -0x0017e1 __mulsi3 -0x00189a __ashlsi3 -0x0018af __lshrsi3 -0x0018c4 __ashrsi3 -0x0018de __udivmodsi_core -0x001916 __udivsi3 -0x00192a __umodsi3 -0x00193e 
__divsi3 -0x001965 __modsi3 -0x00198c __divmodsi_setup -0x0019dd __divmoddi4_stash -0x0019fa __retdi -0x001a07 __ashldi3 -0x001a2a __lshrdi3 -0x001a4d __ashrdi3 -0x001a73 __muldi3 -0x001ace __ucmpdi2 -0x001af7 __cmpdi2 -0x001b2e __udivdi3 -0x001b37 __umoddi3 -0x001b50 __udivmoddi_core -0x001b9d __divdi3 -0x001bbc __moddi3 -0x001be9 __absdi_a -0x001bf1 __absdi_b -0x001bf9 __negdi_a -0x001c17 __negdi_b -0x001c35 setjmp -0x001c5d longjmp -0x001c87 __umulhisi3_qsq -0x002085 __rodata_start -0x002085 __text_end -0x002085 gChainPath -0x002099 __init_array_end -0x002099 __init_array_start -0x002099 __rodata_end -0x00a000 __bss_lo16 -0x00a000 __bss_seg0_lo16 -0x00a000 __bss_start -0x00a000 gDone -0x00a002 gUserId -0x00a004 gDpHandle -0x00a008 __indirTarget -0x00a00a __bss_end -0x00a00a __heap_start -0x00bf00 __heap_end -CtlStartUp = 0x001278 -EMStartUp = 0x001288 -FMStartUp = 0x0012a7 -LEStartUp = 0x0012b7 -LoadOneTool = 0x0012c7 -NewHandle = 0x0012d7 -QDStartUp = 0x0012fd -__absdi_a = 0x001be9 -__absdi_b = 0x001bf1 -__ashldi3 = 0x001a07 -__ashlhi3 = 0x001712 -__ashlsi3 = 0x00189a -__ashrdi3 = 0x001a4d -__ashrhi3 = 0x001731 -__ashrsi3 = 0x0018c4 -__bss_bank = 0x000000 -__bss_end = 0x00a00a -__bss_lo16 = 0x00a000 -__bss_seg0_bank = 0x000000 -__bss_seg0_lo16 = 0x00a000 -__bss_seg0_size = 0x00000a -__bss_seg1_bank = 0x000000 -__bss_seg1_lo16 = 0x000000 -__bss_seg1_size = 0x000000 -__bss_seg2_bank = 0x000000 -__bss_seg2_lo16 = 0x000000 -__bss_seg2_size = 0x000000 -__bss_seg3_bank = 0x000000 -__bss_seg3_lo16 = 0x000000 -__bss_seg3_size = 0x000000 -__bss_size = 0x00000a -__bss_start = 0x00a000 -__cmpdi2 = 0x001af7 -__divdi3 = 0x001b9d -__divhi3 = 0x00175c -__divmod_setup = 0x001790 -__divmoddi4_stash = 0x0019dd -__divmodsi_setup = 0x00198c -__divsi3 = 0x00193e -__heap_end = 0x00bf00 -__heap_start = 0x00a00a -__indirTarget = 0x00a008 -__init_array_end = 0x002099 -__init_array_start = 0x002099 -__jsl_indir = 0x001699 -__lshrdi3 = 0x001a2a -__lshrhi3 = 0x001721 -__lshrsi3 = 0x0018af 
-__moddi3 = 0x001bbc -__modhi3 = 0x001776 -__modsi3 = 0x001965 -__muldi3 = 0x001a73 -__mulhi3 = 0x00169c -__mulsi3 = 0x0017e1 -__negdi_a = 0x001bf9 -__negdi_b = 0x001c17 -__retdi = 0x0019fa -__rodata_end = 0x002099 -__rodata_start = 0x002085 -__start = 0x001000 -__text_end = 0x002085 -__text_start = 0x001000 -__ucmpdi2 = 0x001ace -__udivdi3 = 0x001b2e -__udivhi3 = 0x001744 -__udivmod_core = 0x0017c3 -__udivmoddi_core = 0x001b50 -__udivmodsi_core = 0x0018de -__udivsi3 = 0x001916 -__umoddi3 = 0x001b37 -__umodhi3 = 0x001750 -__umodsi3 = 0x00192a -__umulhisi3 = 0x0016bb -__umulhisi3_qsq = 0x001c87 -gChainPath = 0x002085 -gDone = 0x00a000 -gDpHandle = 0x00a004 -gUserId = 0x00a002 -longjmp = 0x001c5d -main = 0x0010ba -setjmp = 0x001c35 -startdesk = 0x001313 diff --git a/demos/orcaFrameLike.o b/demos/orcaFrameLike.o deleted file mode 100644 index d24b875..0000000 Binary files a/demos/orcaFrameLike.o and /dev/null differ diff --git a/demos/orcaFrameLike.omf b/demos/orcaFrameLike.omf deleted file mode 100644 index 78ab4cd..0000000 Binary files a/demos/orcaFrameLike.omf and /dev/null differ diff --git a/demos/orcaFrameLike.reloc b/demos/orcaFrameLike.reloc deleted file mode 100644 index 074de1e..0000000 Binary files a/demos/orcaFrameLike.reloc and /dev/null differ diff --git a/demos/orcaMiniCadLike.bin b/demos/orcaMiniCadLike.bin deleted file mode 100644 index 95e1f9d..0000000 Binary files a/demos/orcaMiniCadLike.bin and /dev/null differ diff --git a/demos/orcaMiniCadLike.c b/demos/orcaMiniCadLike.c deleted file mode 100644 index 66b35a3..0000000 --- a/demos/orcaMiniCadLike.c +++ /dev/null @@ -1,155 +0,0 @@ -// orcaMiniCadLike.c - port of ORCA-C's MiniCAD.cc sample. -// -// Mike Westerfield's "MiniCAD" — drawing program with a Window -// Manager content window. Original at tools/orca-c/C.Samples/ -// Desktop.Samples/MiniCAD.cc. -// -// Architecture (preserves the original's WM event flow): -// - startdesk(640) brings up the full toolset. 
-// - NewWindow opens a content window. -// - TaskMaster event loop dispatches wInContent and wInGoAway. -// - Each wInContent click draws one line segment in the window -// via BeginUpdate/EndUpdate (so the WM's update region is -// properly managed — drawing OUTSIDE the WM update flow makes -// TaskMaster hang on subsequent calls). -// -// What this port skips (would push past GS/OS Loader's reloc cap): -// - Menu bar (Apple/File/Edit) — kept for orcaFrameLike. -// - Alert/Dialog Manager About box. - -#include "iigs/toolbox.h" -#include "iigs/desktop.h" - -#define wInContent 19 -#define wInGoAway 17 -#define keyDownEvt 3 - -#define fTitle 0x0001 -#define fVis 0x0020 -#define fMove 0x0080 -#define fGrow 0x0400 -#define fClose 0x4000 - - -typedef struct { short v1, h1, v2, h2; } Rect; - -typedef struct { - unsigned short paramLength; - unsigned short wFrameBits; - void *wTitle; - unsigned long wRefCon; - Rect wZoom; - void *wColor; - short wYOrigin, wXOrigin; - short wDataH, wDataV; - short wMaxHeight, wMaxWidth; - short wScrollVer, wScrollHor; - short wPageVer, wPageHor; - unsigned long wInfoRefCon; - short wInfoHeight; - void *wFrameDefProc; - void *wInfoDefProc; - void *wContDefProc; - Rect wPosition; - void *wPlane; - void *wStorage; -} NewWindowParm; - -typedef struct { - unsigned short wmWhat; - unsigned long wmMessage; - unsigned long wmWhen; - short wmWhereV, wmWhereH; - unsigned short wmModifiers; - unsigned long wmTaskData; - unsigned long wmTaskMask; - unsigned long wmLastClickTick; - unsigned long wmClickCount; - unsigned long wmTaskData2; - unsigned long wmTaskData3; - unsigned long wmTaskData4; -} WmTaskRec; - - -static unsigned char gTitle[] = "\x07MiniCAD"; -static NewWindowParm gWp; -static WmTaskRec gEv; - - -int main(void) { - unsigned short userId = startdesk(640); - (void)userId; - - // Paint a clean Finder-style backdrop (white menu bar + black - // separator + white desktop) directly into SHR, bypassing the - // WM's dithered desktop fill 
(MAME NTSC-chroma simulator renders - // 640-mode dithers as colored noise). See orcaFrameLike.c. - __asm__ volatile ( - "rep #0x30\n" - "ldx #0x0000\n" - "1:\n" - ".byte 0xa9, 0xff, 0xff\n" - ".byte 0x9f, 0x00, 0x20, 0xe1\n" - "inx\n inx\n" - ".byte 0xe0, 0x20, 0x08\n" - "bcc 1b\n" - "2:\n" - ".byte 0xa9, 0x00, 0x00\n" - ".byte 0x9f, 0x00, 0x20, 0xe1\n" - "inx\n inx\n" - ".byte 0xe0, 0xc0, 0x08\n" - "bcc 2b\n" - "3:\n" - ".byte 0xa9, 0xff, 0xff\n" - ".byte 0x9f, 0x00, 0x20, 0xe1\n" - "inx\n inx\n" - ".byte 0xe0, 0x00, 0x7d\n" - "bcc 3b\n" - ::: "a", "x", "memory"); - - ShowCursor(); - - // Open a drawing window. - { - unsigned char *p = (unsigned char *)&gWp; - for (unsigned short i = 0; i < sizeof gWp; i++) p[i] = 0; - } - gWp.paramLength = (unsigned short)sizeof gWp; - gWp.wFrameBits = fVis | fMove | fClose; - gWp.wTitle = gTitle; - gWp.wMaxHeight = 200; - gWp.wMaxWidth = 640; - gWp.wPosition.v1 = 20; gWp.wPosition.h1 = 20; - gWp.wPosition.v2 = 160; gWp.wPosition.h2 = 620; - gWp.wPlane = (void *)-1L; - - void *win = NewWindow(&gWp); - if (win) { - // Draw inside BeginUpdate / EndUpdate so the WM accepts the - // content area as painted. Without this the WM keeps the - // region dirty and tries to invoke our NULL wContDefProc on - // every TaskMaster iteration. - BeginUpdate(win); - SetPort(win); - // A small line-art demo — proves QD pen / MoveTo / LineTo - // flow lands pixels inside the window's content area. - for (short i = 0; i < 12; i++) { - MoveTo(40, (short)(30 + i * 8)); - LineTo((short)(50 + i * 40), (short)(120 - i * 6)); - } - EndUpdate(win); - } - - // Linger so the rendered window is visible for ~1 second in - // interactive use and any timed screenshot. No TaskMaster loop - // here — see [[orca-demos-landed]] memory for the WM-update - // gotcha that hangs TaskMaster after we draw. 
- (void)gEv; - for (volatile unsigned long s = 0; s < 500000UL; s++) { } - - if (win) { - CloseWindow(win); - } - *(volatile unsigned char *)0x70 = 0x99; - return 0; -} diff --git a/demos/orcaMiniCadLike.map b/demos/orcaMiniCadLike.map deleted file mode 100644 index e4d6c53..0000000 --- a/demos/orcaMiniCadLike.map +++ /dev/null @@ -1,201 +0,0 @@ -# section layout -.text : 0x001000 .. 0x00227e ( 4734 bytes) -.rodata : 0x00227e .. 0x00229b ( 29 bytes) -.bss : 0x00a000 .. 0x00a056 ( 86 bytes) - -# per-input-file .text contributions - 186 /home/scott/claude/llvm816/runtime/crt0Gsos.o - 725 /home/scott/claude/llvm816/demos/orcaMiniCadLike.o - 43513 /home/scott/claude/llvm816/runtime/libc.o - 5935 /home/scott/claude/llvm816/runtime/snprintf.o - 11953 /home/scott/claude/llvm816/runtime/extras.o - 7077 /home/scott/claude/llvm816/runtime/softFloat.o - 15379 /home/scott/claude/llvm816/runtime/softDouble.o - 176 /home/scott/claude/llvm816/runtime/iigsGsos.o - 20670 /home/scott/claude/llvm816/runtime/iigsToolbox.o - 1050 /home/scott/claude/llvm816/runtime/desktop.o - 2540 /home/scott/claude/llvm816/runtime/libgcc.o - -# global symbols (sorted by address) -0x000000 __bss_bank -0x000000 __bss_seg0_bank -0x000000 __bss_seg1_bank -0x000000 __bss_seg1_lo16 -0x000000 __bss_seg1_size -0x000000 __bss_seg2_bank -0x000000 __bss_seg2_lo16 -0x000000 __bss_seg2_size -0x000000 __bss_seg3_bank -0x000000 __bss_seg3_lo16 -0x000000 __bss_seg3_size -0x000056 __bss_seg0_size -0x000056 __bss_size -0x001000 __start -0x001000 __text_start -0x0010ba main -0x00138f memset -0x0013ef CtlStartUp -0x0013ff EMStartUp -0x00141e FMStartUp -0x00142e LEStartUp -0x00143e LoadOneTool -0x00144e NewHandle -0x001474 QDStartUp -0x00148a LineTo -0x00149a MoveTo -0x0014aa SetPort -0x0014bc BeginUpdate -0x0014ce CloseWindow -0x0014e0 EndUpdate -0x0014f2 NewWindow -0x00150c startdesk -0x001892 __jsl_indir -0x001895 __mulhi3 -0x0018b4 __umulhisi3 -0x00190b __ashlhi3 -0x00191a __lshrhi3 -0x00192a __ashrhi3 -0x00193d 
__udivhi3 -0x001949 __umodhi3 -0x001955 __divhi3 -0x00196f __modhi3 -0x001989 __divmod_setup -0x0019bc __udivmod_core -0x0019da __mulsi3 -0x001a93 __ashlsi3 -0x001aa8 __lshrsi3 -0x001abd __ashrsi3 -0x001ad7 __udivmodsi_core -0x001b0f __udivsi3 -0x001b23 __umodsi3 -0x001b37 __divsi3 -0x001b5e __modsi3 -0x001b85 __divmodsi_setup -0x001bd6 __divmoddi4_stash -0x001bf3 __retdi -0x001c00 __ashldi3 -0x001c23 __lshrdi3 -0x001c46 __ashrdi3 -0x001c6c __muldi3 -0x001cc7 __ucmpdi2 -0x001cf0 __cmpdi2 -0x001d27 __udivdi3 -0x001d30 __umoddi3 -0x001d49 __udivmoddi_core -0x001d96 __divdi3 -0x001db5 __moddi3 -0x001de2 __absdi_a -0x001dea __absdi_b -0x001df2 __negdi_a -0x001e10 __negdi_b -0x001e2e setjmp -0x001e56 longjmp -0x001e80 __umulhisi3_qsq -0x00227e __rodata_start -0x00227e __text_end -0x00227e gChainPath -0x002292 gTitle -0x00229b __init_array_end -0x00229b __init_array_start -0x00229b __rodata_end -0x00a000 __bss_lo16 -0x00a000 __bss_seg0_lo16 -0x00a000 __bss_start -0x00a000 gWp -0x00a04e gUserId -0x00a050 gDpHandle -0x00a054 __indirTarget -0x00a056 __bss_end -0x00a056 __heap_start -0x00bf00 __heap_end -BeginUpdate = 0x0014bc -CloseWindow = 0x0014ce -CtlStartUp = 0x0013ef -EMStartUp = 0x0013ff -EndUpdate = 0x0014e0 -FMStartUp = 0x00141e -LEStartUp = 0x00142e -LineTo = 0x00148a -LoadOneTool = 0x00143e -MoveTo = 0x00149a -NewHandle = 0x00144e -NewWindow = 0x0014f2 -QDStartUp = 0x001474 -SetPort = 0x0014aa -__absdi_a = 0x001de2 -__absdi_b = 0x001dea -__ashldi3 = 0x001c00 -__ashlhi3 = 0x00190b -__ashlsi3 = 0x001a93 -__ashrdi3 = 0x001c46 -__ashrhi3 = 0x00192a -__ashrsi3 = 0x001abd -__bss_bank = 0x000000 -__bss_end = 0x00a056 -__bss_lo16 = 0x00a000 -__bss_seg0_bank = 0x000000 -__bss_seg0_lo16 = 0x00a000 -__bss_seg0_size = 0x000056 -__bss_seg1_bank = 0x000000 -__bss_seg1_lo16 = 0x000000 -__bss_seg1_size = 0x000000 -__bss_seg2_bank = 0x000000 -__bss_seg2_lo16 = 0x000000 -__bss_seg2_size = 0x000000 -__bss_seg3_bank = 0x000000 -__bss_seg3_lo16 = 0x000000 -__bss_seg3_size = 0x000000 
-__bss_size = 0x000056 -__bss_start = 0x00a000 -__cmpdi2 = 0x001cf0 -__divdi3 = 0x001d96 -__divhi3 = 0x001955 -__divmod_setup = 0x001989 -__divmoddi4_stash = 0x001bd6 -__divmodsi_setup = 0x001b85 -__divsi3 = 0x001b37 -__heap_end = 0x00bf00 -__heap_start = 0x00a056 -__indirTarget = 0x00a054 -__init_array_end = 0x00229b -__init_array_start = 0x00229b -__jsl_indir = 0x001892 -__lshrdi3 = 0x001c23 -__lshrhi3 = 0x00191a -__lshrsi3 = 0x001aa8 -__moddi3 = 0x001db5 -__modhi3 = 0x00196f -__modsi3 = 0x001b5e -__muldi3 = 0x001c6c -__mulhi3 = 0x001895 -__mulsi3 = 0x0019da -__negdi_a = 0x001df2 -__negdi_b = 0x001e10 -__retdi = 0x001bf3 -__rodata_end = 0x00229b -__rodata_start = 0x00227e -__start = 0x001000 -__text_end = 0x00227e -__text_start = 0x001000 -__ucmpdi2 = 0x001cc7 -__udivdi3 = 0x001d27 -__udivhi3 = 0x00193d -__udivmod_core = 0x0019bc -__udivmoddi_core = 0x001d49 -__udivmodsi_core = 0x001ad7 -__udivsi3 = 0x001b0f -__umoddi3 = 0x001d30 -__umodhi3 = 0x001949 -__umodsi3 = 0x001b23 -__umulhisi3 = 0x0018b4 -__umulhisi3_qsq = 0x001e80 -gChainPath = 0x00227e -gDpHandle = 0x00a050 -gTitle = 0x002292 -gUserId = 0x00a04e -gWp = 0x00a000 -longjmp = 0x001e56 -main = 0x0010ba -memset = 0x00138f -setjmp = 0x001e2e -startdesk = 0x00150c diff --git a/demos/orcaMiniCadLike.o b/demos/orcaMiniCadLike.o deleted file mode 100644 index a72980a..0000000 Binary files a/demos/orcaMiniCadLike.o and /dev/null differ diff --git a/demos/orcaMiniCadLike.omf b/demos/orcaMiniCadLike.omf deleted file mode 100644 index 789963e..0000000 Binary files a/demos/orcaMiniCadLike.omf and /dev/null differ diff --git a/demos/orcaMiniCadLike.reloc b/demos/orcaMiniCadLike.reloc deleted file mode 100644 index f924495..0000000 Binary files a/demos/orcaMiniCadLike.reloc and /dev/null differ diff --git a/demos/orcaReversiLike.bin b/demos/orcaReversiLike.bin deleted file mode 100644 index 6aee66e..0000000 Binary files a/demos/orcaReversiLike.bin and /dev/null differ diff --git a/demos/orcaReversiLike.c 
b/demos/orcaReversiLike.c deleted file mode 100644 index 0951e7f..0000000 --- a/demos/orcaReversiLike.c +++ /dev/null @@ -1,136 +0,0 @@ -// orcaReversiLike.c - port of ORCA-C's Reversi.cc sample. -// -// Mike Westerfield's "Reversi" is a full Othello game running under -// the Apple IIgs Window Manager (~1600 lines of game + UI). This -// port keeps the desktop scaffolding (startdesk + menu bar + -// TaskMaster) but stops short of the game logic itself — the IIgs -// Loader's silent rejection of OMFs past a complex cRELOC/byte-count -// threshold ([[loader-creloc-threshold]]) doesn't leave room for the -// full game in a single segment. Original at tools/orca-c/C.Samples/ -// Desktop.Samples/Reversi.cc. -// -// What this port keeps: -// - Full toolset init via startdesk(640). -// - Apple/File/Edit menu bar (NewMenu strings derived from -// Reversi.cc). -// - TaskMaster event loop with menu / wInGoAway dispatch. -// -// What this port skips: -// - The game itself (board, moves, AI, scoring). -// - QDAuxStartUp / SetPenMode / DrawControls / etc. -// - Alert/Dialog Manager. - -#include "iigs/toolbox.h" -#include "iigs/desktop.h" - -#define apple_About 257 -#define file_New 258 -#define file_Close 259 -#define file_Quit 256 - -#define wInSpecial 25 -#define wInMenuBar 3 -#define wInGoAway 17 - - -typedef struct { - unsigned short wmWhat; - unsigned long wmMessage; - unsigned long wmWhen; - short wmWhereV, wmWhereH; - unsigned short wmModifiers; - unsigned long wmTaskData; - unsigned long wmTaskMask; - unsigned long wmLastClickTick; - unsigned long wmClickCount; - unsigned long wmTaskData2; - unsigned long wmTaskData3; - unsigned long wmTaskData4; -} WmTaskRec; - - -// Menu templates per Reversi.cc style — same Apple/File/Edit -// scaffolding any IIgs WM app needs. 
-static unsigned char editMenuStr[] = ">> Edit \\N3\r" - "--Undo\\N250V*Zz\r" - "--Cut\\N251*Xx\r" - "--Copy\\N252*Cc\r" - "--Paste\\N253*Vv\r" - "--Clear\\N254\r" - ".\r"; - -static unsigned char fileMenuStr[] = ">> File \\N2\r" - "--New Game\\N258*Nn\r" - "--Close\\N259V\r" - "--Quit\\N256*Qq\r" - ".\r"; - -static unsigned char appleMenuStr[] = ">>@\\XN1\r" - "--About Reversi\\N257V\r" - ".\r"; - -static volatile unsigned short gDone; - - -static void initMenus(void) { - InsertMenu(NewMenu(editMenuStr), 0); - InsertMenu(NewMenu(fileMenuStr), 0); - InsertMenu(NewMenu(appleMenuStr), 0); - FixAppleMenu(1); - FixMenuBar(); - DrawMenuBar(); -} - - -static void handleMenu(unsigned short menuNum, unsigned long taskData) { - switch (menuNum) { - case file_Quit: - gDone = 1; - break; - default: - break; - } - HiliteMenu(0, (unsigned short)(taskData >> 16)); -} - - -int main(void) { - unsigned short userId = startdesk(640); - (void)userId; - - (void)&initMenus; - - // Manually paint Finder-style desktop: white menu bar (rows 0-12), - // 1-pixel black separator (row 13), white desktop (rows 14-199). - // See orcaFrameLike.c for the WM-vs-MAME-NTSC rationale. 
- __asm__ volatile ( - "rep #0x30\n" - "ldx #0x0000\n" - "1:\n" - ".byte 0xa9, 0xff, 0xff\n" - ".byte 0x9f, 0x00, 0x20, 0xe1\n" - "inx\n inx\n" - ".byte 0xe0, 0x20, 0x08\n" - "bcc 1b\n" - "2:\n" - ".byte 0xa9, 0x00, 0x00\n" - ".byte 0x9f, 0x00, 0x20, 0xe1\n" - "inx\n inx\n" - ".byte 0xe0, 0xc0, 0x08\n" - "bcc 2b\n" - "3:\n" - ".byte 0xa9, 0xff, 0xff\n" - ".byte 0x9f, 0x00, 0x20, 0xe1\n" - "inx\n inx\n" - ".byte 0xe0, 0x00, 0x7d\n" - "bcc 3b\n" - ::: "a", "x", "memory"); - ShowCursor(); - - (void)gDone; - (void)&handleMenu; - for (volatile unsigned long s = 0; s < 200000UL; s++) { } - - *(volatile unsigned char *)0x70 = 0x99; - return 0; -} diff --git a/demos/orcaReversiLike.map b/demos/orcaReversiLike.map deleted file mode 100644 index 22d3a8a..0000000 --- a/demos/orcaReversiLike.map +++ /dev/null @@ -1,183 +0,0 @@ -# section layout -.text : 0x001000 .. 0x002085 ( 4229 bytes) -.rodata : 0x002085 .. 0x002099 ( 20 bytes) -.bss : 0x00a000 .. 0x00a00a ( 10 bytes) - -# per-input-file .text contributions - 186 /home/scott/claude/llvm816/runtime/crt0Gsos.o - 446 /home/scott/claude/llvm816/demos/orcaReversiLike.o - 43513 /home/scott/claude/llvm816/runtime/libc.o - 5935 /home/scott/claude/llvm816/runtime/snprintf.o - 11953 /home/scott/claude/llvm816/runtime/extras.o - 7077 /home/scott/claude/llvm816/runtime/softFloat.o - 15379 /home/scott/claude/llvm816/runtime/softDouble.o - 176 /home/scott/claude/llvm816/runtime/iigsGsos.o - 20670 /home/scott/claude/llvm816/runtime/iigsToolbox.o - 1050 /home/scott/claude/llvm816/runtime/desktop.o - 2540 /home/scott/claude/llvm816/runtime/libgcc.o - -# global symbols (sorted by address) -0x000000 __bss_bank -0x000000 __bss_seg0_bank -0x000000 __bss_seg1_bank -0x000000 __bss_seg1_lo16 -0x000000 __bss_seg1_size -0x000000 __bss_seg2_bank -0x000000 __bss_seg2_lo16 -0x000000 __bss_seg2_size -0x000000 __bss_seg3_bank -0x000000 __bss_seg3_lo16 -0x000000 __bss_seg3_size -0x00000a __bss_seg0_size -0x00000a __bss_size -0x001000 __start -0x001000 
__text_start -0x0010ba main -0x001278 CtlStartUp -0x001288 EMStartUp -0x0012a7 FMStartUp -0x0012b7 LEStartUp -0x0012c7 LoadOneTool -0x0012d7 NewHandle -0x0012fd QDStartUp -0x001313 startdesk -0x001699 __jsl_indir -0x00169c __mulhi3 -0x0016bb __umulhisi3 -0x001712 __ashlhi3 -0x001721 __lshrhi3 -0x001731 __ashrhi3 -0x001744 __udivhi3 -0x001750 __umodhi3 -0x00175c __divhi3 -0x001776 __modhi3 -0x001790 __divmod_setup -0x0017c3 __udivmod_core -0x0017e1 __mulsi3 -0x00189a __ashlsi3 -0x0018af __lshrsi3 -0x0018c4 __ashrsi3 -0x0018de __udivmodsi_core -0x001916 __udivsi3 -0x00192a __umodsi3 -0x00193e __divsi3 -0x001965 __modsi3 -0x00198c __divmodsi_setup -0x0019dd __divmoddi4_stash -0x0019fa __retdi -0x001a07 __ashldi3 -0x001a2a __lshrdi3 -0x001a4d __ashrdi3 -0x001a73 __muldi3 -0x001ace __ucmpdi2 -0x001af7 __cmpdi2 -0x001b2e __udivdi3 -0x001b37 __umoddi3 -0x001b50 __udivmoddi_core -0x001b9d __divdi3 -0x001bbc __moddi3 -0x001be9 __absdi_a -0x001bf1 __absdi_b -0x001bf9 __negdi_a -0x001c17 __negdi_b -0x001c35 setjmp -0x001c5d longjmp -0x001c87 __umulhisi3_qsq -0x002085 __rodata_start -0x002085 __text_end -0x002085 gChainPath -0x002099 __init_array_end -0x002099 __init_array_start -0x002099 __rodata_end -0x00a000 __bss_lo16 -0x00a000 __bss_seg0_lo16 -0x00a000 __bss_start -0x00a000 gDone -0x00a002 gUserId -0x00a004 gDpHandle -0x00a008 __indirTarget -0x00a00a __bss_end -0x00a00a __heap_start -0x00bf00 __heap_end -CtlStartUp = 0x001278 -EMStartUp = 0x001288 -FMStartUp = 0x0012a7 -LEStartUp = 0x0012b7 -LoadOneTool = 0x0012c7 -NewHandle = 0x0012d7 -QDStartUp = 0x0012fd -__absdi_a = 0x001be9 -__absdi_b = 0x001bf1 -__ashldi3 = 0x001a07 -__ashlhi3 = 0x001712 -__ashlsi3 = 0x00189a -__ashrdi3 = 0x001a4d -__ashrhi3 = 0x001731 -__ashrsi3 = 0x0018c4 -__bss_bank = 0x000000 -__bss_end = 0x00a00a -__bss_lo16 = 0x00a000 -__bss_seg0_bank = 0x000000 -__bss_seg0_lo16 = 0x00a000 -__bss_seg0_size = 0x00000a -__bss_seg1_bank = 0x000000 -__bss_seg1_lo16 = 0x000000 -__bss_seg1_size = 0x000000 
-__bss_seg2_bank = 0x000000 -__bss_seg2_lo16 = 0x000000 -__bss_seg2_size = 0x000000 -__bss_seg3_bank = 0x000000 -__bss_seg3_lo16 = 0x000000 -__bss_seg3_size = 0x000000 -__bss_size = 0x00000a -__bss_start = 0x00a000 -__cmpdi2 = 0x001af7 -__divdi3 = 0x001b9d -__divhi3 = 0x00175c -__divmod_setup = 0x001790 -__divmoddi4_stash = 0x0019dd -__divmodsi_setup = 0x00198c -__divsi3 = 0x00193e -__heap_end = 0x00bf00 -__heap_start = 0x00a00a -__indirTarget = 0x00a008 -__init_array_end = 0x002099 -__init_array_start = 0x002099 -__jsl_indir = 0x001699 -__lshrdi3 = 0x001a2a -__lshrhi3 = 0x001721 -__lshrsi3 = 0x0018af -__moddi3 = 0x001bbc -__modhi3 = 0x001776 -__modsi3 = 0x001965 -__muldi3 = 0x001a73 -__mulhi3 = 0x00169c -__mulsi3 = 0x0017e1 -__negdi_a = 0x001bf9 -__negdi_b = 0x001c17 -__retdi = 0x0019fa -__rodata_end = 0x002099 -__rodata_start = 0x002085 -__start = 0x001000 -__text_end = 0x002085 -__text_start = 0x001000 -__ucmpdi2 = 0x001ace -__udivdi3 = 0x001b2e -__udivhi3 = 0x001744 -__udivmod_core = 0x0017c3 -__udivmoddi_core = 0x001b50 -__udivmodsi_core = 0x0018de -__udivsi3 = 0x001916 -__umoddi3 = 0x001b37 -__umodhi3 = 0x001750 -__umodsi3 = 0x00192a -__umulhisi3 = 0x0016bb -__umulhisi3_qsq = 0x001c87 -gChainPath = 0x002085 -gDone = 0x00a000 -gDpHandle = 0x00a004 -gUserId = 0x00a002 -longjmp = 0x001c5d -main = 0x0010ba -setjmp = 0x001c35 -startdesk = 0x001313 diff --git a/demos/orcaReversiLike.o b/demos/orcaReversiLike.o deleted file mode 100644 index 2381823..0000000 Binary files a/demos/orcaReversiLike.o and /dev/null differ diff --git a/demos/orcaReversiLike.omf b/demos/orcaReversiLike.omf deleted file mode 100644 index 2fa2899..0000000 Binary files a/demos/orcaReversiLike.omf and /dev/null differ diff --git a/demos/orcaReversiLike.reloc b/demos/orcaReversiLike.reloc deleted file mode 100644 index 074de1e..0000000 Binary files a/demos/orcaReversiLike.reloc and /dev/null differ diff --git a/demos/qdProbe.bin b/demos/qdProbe.bin index c19bf35..8256fd4 100644 Binary files 
a/demos/qdProbe.bin and b/demos/qdProbe.bin differ diff --git a/demos/qdProbe.c b/demos/qdProbe.c index 45660be..f8c47e7 100644 --- a/demos/qdProbe.c +++ b/demos/qdProbe.c @@ -45,19 +45,32 @@ int main(void) { *(volatile unsigned char *)0x80 = 0xA1; unsigned short userId = MMStartUp(); + // QD needs $200 bytes (own DP + cursor mgr at +$100), EM at +$200. + // masterSCB = $90 (640 mode, color burst OFF) avoids the NTSC chroma + // simulator turning the WM's dithered desktop pattern into red/green + // noise. See runtime/src/desktop.c for the full layout. void *dpH = NewHandle(0x400UL, userId, 0xC015, (void *)0); unsigned short dp = blockAddrLo(dpH); *(volatile unsigned char *)0x81 = 0xA2; - QDStartUp(dp, 0x80, 640, userId); + QDStartUp(dp, 0x90, 640, userId); *(volatile unsigned char *)0x82 = 0xA3; + // Match runtime/src/desktop.c's palette setup so the WM's dithered + // desktop fill renders as a clean B/W stipple instead of chroma. + for (unsigned short p = 0; p < 16; p++) { + volatile unsigned short *pal = + (volatile unsigned short *)(0xE19E00UL + (unsigned long)p * 32UL); + for (unsigned short k = 0; k < 16; k++) { + pal[k] = (k & 1) ? 0x0FFF : 0x0000; + } + } // SHR row 1 marker: 'After QDStartUp' { volatile unsigned char *shr = (volatile unsigned char *)(0xE12000UL + 160); for (unsigned short i = 0; i < 160; i++) shr[i] = 0x55; } - EMStartUp((unsigned short)(dp + 0x100), 20, 0, 0, 639, 199, userId); + EMStartUp((unsigned short)(dp + 0x200), 20, 0, 0, 639, 199, userId); *(volatile unsigned char *)0x83 = 0xA4; SchStartUp(); @@ -75,10 +88,9 @@ int main(void) { RefreshDesktop((void *)0); *(volatile unsigned char *)0x87 = 0xA8; - // Spin to let the WM emit any deferred paint. - for (unsigned long i = 0; i < 200000UL; i++) { - __asm__ volatile ("nop"); - } + // Spin to let the WM emit any deferred paint AND give snapshot + // tools time to capture the post-paint state. 
+ for (volatile unsigned long s = 0; s < 300000UL; s++) { } *(volatile unsigned char *)0x86 = 0xA7; *(volatile unsigned char *)0x70 = 0x99; diff --git a/demos/qdProbe.map b/demos/qdProbe.map index e649d11..0978ede 100644 --- a/demos/qdProbe.map +++ b/demos/qdProbe.map @@ -1,11 +1,11 @@ # section layout -.text : 0x001000 .. 0x001d0c ( 3340 bytes) -.rodata : 0x001d0c .. 0x001d20 ( 20 bytes) +.text : 0x001000 .. 0x001ffe ( 4094 bytes) +.rodata : 0x001ffe .. 0x002012 ( 20 bytes) .bss : 0x00a000 .. 0x00a002 ( 2 bytes) # per-input-file .text contributions 186 /home/scott/claude/llvm816/runtime/crt0Gsos.o - 505 /home/scott/claude/llvm816/demos/qdProbe.o + 1259 /home/scott/claude/llvm816/demos/qdProbe.o 43513 /home/scott/claude/llvm816/runtime/libc.o 5935 /home/scott/claude/llvm816/runtime/snprintf.o 11953 /home/scott/claude/llvm816/runtime/extras.o @@ -13,7 +13,7 @@ 15379 /home/scott/claude/llvm816/runtime/softDouble.o 176 /home/scott/claude/llvm816/runtime/iigsGsos.o 20670 /home/scott/claude/llvm816/runtime/iigsToolbox.o - 1050 /home/scott/claude/llvm816/runtime/desktop.o + 1349 /home/scott/claude/llvm816/runtime/desktop.o 2540 /home/scott/claude/llvm816/runtime/libgcc.o # global symbols (sorted by address) @@ -33,58 +33,58 @@ 0x001000 __start 0x001000 __text_start 0x0010ba main -0x0012b3 EMStartUp -0x0012d2 NewHandle -0x0012f8 QDStartUp -0x00130e RefreshDesktop -0x001320 __jsl_indir -0x001323 __mulhi3 -0x001342 __umulhisi3 -0x001399 __ashlhi3 -0x0013a8 __lshrhi3 -0x0013b8 __ashrhi3 -0x0013cb __udivhi3 -0x0013d7 __umodhi3 -0x0013e3 __divhi3 -0x0013fd __modhi3 -0x001417 __divmod_setup -0x00144a __udivmod_core -0x001468 __mulsi3 -0x001521 __ashlsi3 -0x001536 __lshrsi3 -0x00154b __ashrsi3 -0x001565 __udivmodsi_core -0x00159d __udivsi3 -0x0015b1 __umodsi3 -0x0015c5 __divsi3 -0x0015ec __modsi3 -0x001613 __divmodsi_setup -0x001664 __divmoddi4_stash -0x001681 __retdi -0x00168e __ashldi3 -0x0016b1 __lshrdi3 -0x0016d4 __ashrdi3 -0x0016fa __muldi3 -0x001755 __ucmpdi2 -0x00177e 
__cmpdi2 -0x0017b5 __udivdi3 -0x0017be __umoddi3 -0x0017d7 __udivmoddi_core -0x001824 __divdi3 -0x001843 __moddi3 -0x001870 __absdi_a -0x001878 __absdi_b -0x001880 __negdi_a -0x00189e __negdi_b -0x0018bc setjmp -0x0018e4 longjmp -0x00190e __umulhisi3_qsq -0x001d0c __rodata_start -0x001d0c __text_end -0x001d0c gChainPath -0x001d20 __init_array_end -0x001d20 __init_array_start -0x001d20 __rodata_end +0x0015a5 EMStartUp +0x0015c4 NewHandle +0x0015ea QDStartUp +0x001600 RefreshDesktop +0x001612 __jsl_indir +0x001615 __mulhi3 +0x001634 __umulhisi3 +0x00168b __ashlhi3 +0x00169a __lshrhi3 +0x0016aa __ashrhi3 +0x0016bd __udivhi3 +0x0016c9 __umodhi3 +0x0016d5 __divhi3 +0x0016ef __modhi3 +0x001709 __divmod_setup +0x00173c __udivmod_core +0x00175a __mulsi3 +0x001813 __ashlsi3 +0x001828 __lshrsi3 +0x00183d __ashrsi3 +0x001857 __udivmodsi_core +0x00188f __udivsi3 +0x0018a3 __umodsi3 +0x0018b7 __divsi3 +0x0018de __modsi3 +0x001905 __divmodsi_setup +0x001956 __divmoddi4_stash +0x001973 __retdi +0x001980 __ashldi3 +0x0019a3 __lshrdi3 +0x0019c6 __ashrdi3 +0x0019ec __muldi3 +0x001a47 __ucmpdi2 +0x001a70 __cmpdi2 +0x001aa7 __udivdi3 +0x001ab0 __umoddi3 +0x001ac9 __udivmoddi_core +0x001b16 __divdi3 +0x001b35 __moddi3 +0x001b62 __absdi_a +0x001b6a __absdi_b +0x001b72 __negdi_a +0x001b90 __negdi_b +0x001bae setjmp +0x001bd6 longjmp +0x001c00 __umulhisi3_qsq +0x001ffe __rodata_start +0x001ffe __text_end +0x001ffe gChainPath +0x002012 __init_array_end +0x002012 __init_array_start +0x002012 __rodata_end 0x00a000 __bss_lo16 0x00a000 __bss_seg0_lo16 0x00a000 __bss_start @@ -92,18 +92,18 @@ 0x00a002 __bss_end 0x00a002 __heap_start 0x00bf00 __heap_end -EMStartUp = 0x0012b3 -NewHandle = 0x0012d2 -QDStartUp = 0x0012f8 -RefreshDesktop = 0x00130e -__absdi_a = 0x001870 -__absdi_b = 0x001878 -__ashldi3 = 0x00168e -__ashlhi3 = 0x001399 -__ashlsi3 = 0x001521 -__ashrdi3 = 0x0016d4 -__ashrhi3 = 0x0013b8 -__ashrsi3 = 0x00154b +EMStartUp = 0x0015a5 +NewHandle = 0x0015c4 +QDStartUp = 0x0015ea 
+RefreshDesktop = 0x001600 +__absdi_a = 0x001b62 +__absdi_b = 0x001b6a +__ashldi3 = 0x001980 +__ashlhi3 = 0x00168b +__ashlsi3 = 0x001813 +__ashrdi3 = 0x0019c6 +__ashrhi3 = 0x0016aa +__ashrsi3 = 0x00183d __bss_bank = 0x000000 __bss_end = 0x00a002 __bss_lo16 = 0x00a000 @@ -121,49 +121,49 @@ __bss_seg3_lo16 = 0x000000 __bss_seg3_size = 0x000000 __bss_size = 0x000002 __bss_start = 0x00a000 -__cmpdi2 = 0x00177e -__divdi3 = 0x001824 -__divhi3 = 0x0013e3 -__divmod_setup = 0x001417 -__divmoddi4_stash = 0x001664 -__divmodsi_setup = 0x001613 -__divsi3 = 0x0015c5 +__cmpdi2 = 0x001a70 +__divdi3 = 0x001b16 +__divhi3 = 0x0016d5 +__divmod_setup = 0x001709 +__divmoddi4_stash = 0x001956 +__divmodsi_setup = 0x001905 +__divsi3 = 0x0018b7 __heap_end = 0x00bf00 __heap_start = 0x00a002 __indirTarget = 0x00a000 -__init_array_end = 0x001d20 -__init_array_start = 0x001d20 -__jsl_indir = 0x001320 -__lshrdi3 = 0x0016b1 -__lshrhi3 = 0x0013a8 -__lshrsi3 = 0x001536 -__moddi3 = 0x001843 -__modhi3 = 0x0013fd -__modsi3 = 0x0015ec -__muldi3 = 0x0016fa -__mulhi3 = 0x001323 -__mulsi3 = 0x001468 -__negdi_a = 0x001880 -__negdi_b = 0x00189e -__retdi = 0x001681 -__rodata_end = 0x001d20 -__rodata_start = 0x001d0c +__init_array_end = 0x002012 +__init_array_start = 0x002012 +__jsl_indir = 0x001612 +__lshrdi3 = 0x0019a3 +__lshrhi3 = 0x00169a +__lshrsi3 = 0x001828 +__moddi3 = 0x001b35 +__modhi3 = 0x0016ef +__modsi3 = 0x0018de +__muldi3 = 0x0019ec +__mulhi3 = 0x001615 +__mulsi3 = 0x00175a +__negdi_a = 0x001b72 +__negdi_b = 0x001b90 +__retdi = 0x001973 +__rodata_end = 0x002012 +__rodata_start = 0x001ffe __start = 0x001000 -__text_end = 0x001d0c +__text_end = 0x001ffe __text_start = 0x001000 -__ucmpdi2 = 0x001755 -__udivdi3 = 0x0017b5 -__udivhi3 = 0x0013cb -__udivmod_core = 0x00144a -__udivmoddi_core = 0x0017d7 -__udivmodsi_core = 0x001565 -__udivsi3 = 0x00159d -__umoddi3 = 0x0017be -__umodhi3 = 0x0013d7 -__umodsi3 = 0x0015b1 -__umulhisi3 = 0x001342 -__umulhisi3_qsq = 0x00190e -gChainPath = 0x001d0c -longjmp = 
0x0018e4 +__ucmpdi2 = 0x001a47 +__udivdi3 = 0x001aa7 +__udivhi3 = 0x0016bd +__udivmod_core = 0x00173c +__udivmoddi_core = 0x001ac9 +__udivmodsi_core = 0x001857 +__udivsi3 = 0x00188f +__umoddi3 = 0x001ab0 +__umodhi3 = 0x0016c9 +__umodsi3 = 0x0018a3 +__umulhisi3 = 0x001634 +__umulhisi3_qsq = 0x001c00 +gChainPath = 0x001ffe +longjmp = 0x001bd6 main = 0x0010ba -setjmp = 0x0018bc +setjmp = 0x001bae diff --git a/demos/qdProbe.o b/demos/qdProbe.o index 404b69c..7110f4b 100644 Binary files a/demos/qdProbe.o and b/demos/qdProbe.o differ diff --git a/demos/qdProbe.omf b/demos/qdProbe.omf index c8b7403..a3a511d 100644 Binary files a/demos/qdProbe.omf and b/demos/qdProbe.omf differ diff --git a/demos/qdProbe.reloc b/demos/qdProbe.reloc index d89364f..5795d4c 100644 Binary files a/demos/qdProbe.reloc and b/demos/qdProbe.reloc differ diff --git a/demos/reversi.bin b/demos/reversi.bin index 9d1209a..6184995 100644 Binary files a/demos/reversi.bin and b/demos/reversi.bin differ diff --git a/demos/reversi.c b/demos/reversi.c index 3c37df6..d29bb66 100644 --- a/demos/reversi.c +++ b/demos/reversi.c @@ -1,47 +1,88 @@ -// reversi.c - port of ORCA-C's Reversi.cc sample. +// reversi.c - faithful port of ORCA-C's Reversi.cc sample. // -// Othello/Reversi game. Click an empty square to place a black piece; -// the computer plays white and responds via a minimax search. Game -// continues until neither side has a legal move. +// Mike Westerfield / Barbara Allred, Byte Works 1989. Original at +// tools/orca-c/C.Samples/Desktop.Samples/Reversi.cc. // -// Modeled after Mike Westerfield's Reversi.cc. Game logic -// (GetMoves, MakeMove, Score) translates from the ORCA-C source; -// drawing uses QD's PaintRect / PaintOval / FillRect directly. 
-// -// Visible elements: -// - White menu bar (painted manually — MenuStartUp hangs in our -// current toolset environment) -// - 8x8 board in green, white grid lines, black/white piece discs -// - Score / turn-indicator in the menu bar area -// -// Build: bash demos/build.sh reversi -// Run: bash demos/launch.sh reversi +// Full Othello game: Apple/File/Edit/Level/Options menus, board / +// scores / moves windows, alpha-beta search up to 8 ply, edge-scoring +// heuristics, click-to-play, computer auto-replies as the opposite +// color. Compared to ORCA's: stdio printf to the moves window is +// replaced with DrawString calls (we don't have a windowed stdio +// hook); SelfPlay still works. #include "iigs/toolbox.h" #include "iigs/desktop.h" -#define wInContent 19 +#include + + +#define squareWidth 52 +#define squareHeight 20 + +#define blank 0 +#define blackPiece 1 +#define whitePiece 2 +#define border 3 + +#define apple_AboutReversi 257 +#define file_NewGame 258 +#define file_Quit 259 + +#define edit_UndoLastMove 270 + +#define level_1Ply 262 +#define level_2Ply 263 +#define level_3Ply 264 +#define level_4Ply 265 +#define level_5Ply 266 +#define level_6Ply 267 +#define level_7Ply 268 +#define level_8Ply 269 + +#define options_SelfPlay 280 +#define options_ComputerPlaysWhite 281 +#define options_Pass 282 +#define options_ShowScoreWindow 283 +#define options_ShowMovesWindow 284 + + +#define wInMenuBar 3 +#define wInSpecial 25 #define wInGoAway 17 -#define keyDownEvt 3 +#define wInContent 19 +#define inUpdate 6 -#define fVis 0x0020 -#define fMove 0x0080 -#define fClose 0x4000 +#define norml 0 +#define stop 1 +#define note 2 +#define caution 3 -// Piece-color constants (mirrors Reversi.cc). -#define BLANK 0 -#define BLACK 1 -#define WHITE 2 -#define BORDER 3 +#define buttonItem 10 +#define statText 136 +#define itemDisable 0x8000 -// Square dimensions (board is centred in window, 8 * 32 = 256 wide). 
-#define SQ 32 -#define BOARD_PX (8 * SQ) -#define BOARD_X 32 -#define BOARD_Y 32 +#define topMost ((void *)-1L) typedef struct { short v1, h1, v2, h2; } Rect; +typedef struct { short v, h; } Point; + + +typedef struct { + unsigned short wmWhat; + unsigned long wmMessage; + unsigned long wmWhen; + short wmWhereV, wmWhereH; + unsigned short wmModifiers; + unsigned long wmTaskData; + unsigned long wmTaskMask; + unsigned long wmLastClickTick; + unsigned long wmClickCount; + unsigned long wmTaskData2; + unsigned long wmTaskData3; + unsigned long wmTaskData4; +} WmTaskRec; + typedef struct { unsigned short paramLength; @@ -65,291 +106,729 @@ typedef struct { void *wStorage; } NewWindowParm; + typedef struct { - unsigned short wmWhat; - unsigned long wmMessage; - unsigned long wmWhen; - short wmWhereV, wmWhereH; - unsigned short wmModifiers; - unsigned long wmTaskData; - unsigned long wmTaskMask; - unsigned long wmLastClickTick; - unsigned long wmClickCount; - unsigned long wmTaskData2; - unsigned long wmTaskData3; - unsigned long wmTaskData4; -} WmTaskRec; + short itemID; + short itemRectV1, itemRectH1, itemRectV2, itemRectH2; + unsigned short itemType; + void *itemDescr; + short itemValue; + short itemFlag; + void *itemColor; +} ItemTemplate; + +typedef struct { + short atRectV1, atRectH1, atRectV2, atRectH2; + short atBtnHorz; + short atBeep0, atBeep1, atBeep2, atBeep3; + void *atSound; + void *atResv1; + void *atResv2; + void *atItemList[8]; +} AlertTemplate; -// Game state. ORCA-C uses index = row*10 + col with rows/cols 1..8 -// (10..88 valid, with sentinel BORDER at row/col 0 and 9). Keep the -// same convention so the directional displacement table works. 
+typedef struct { + short num; + unsigned char moves[60]; +} MoveList; + + +static short gPly = 1; +static short gColor = whitePiece; +static short gCurrentColor; +static short gMovesMade; +static short gMoves[64]; + static unsigned char gBoard[100]; +static short gMovesLeft; +static short gSelfPlay; +static short gShowScoreWindow = 1; +static short gShowMovesWindow = 1; -// 8 direction displacements: NW, N, NE, W, E, SW, S, SE. -// Inline-accessed via a function to avoid any indexed-global codegen -// quirk on i16 negative immediates. -static short dispOf(short d) { - switch (d) { - case 0: return -11; - case 1: return -10; - case 2: return -9; - case 3: return -1; - case 4: return 1; - case 5: return 9; - case 6: return 10; - case 7: return 11; +static const short gDisp[8] = { 9, 10, 11, -1, 1, -9, -10, -11 }; + + +// Compact piece-square table: just one phase, much smaller than the +// original's 300-entry / 3-phase bSc. Heavy edge-corner weighting +// keeps the play reasonably strong while staying well under the OMF +// cRELOC budget. 
+static const short gSqScore[100] = { + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 0, 500, -20, 100, 50, 50, 100, -20, 500, 0, + 0, -20,-250, -2, -2, -2, -2,-250, -20, 0, + 0, 100, -2, 30, 10, 10, 30, -2, 100, 0, + 0, 50, -2, 10, 2, 2, 10, -2, 50, 0, + 0, 50, -2, 10, 2, 2, 10, -2, 50, 0, + 0, 100, -2, 30, 10, 10, 30, -2, 100, 0, + 0, -20,-250, -2, -2, -2, -2,-250, -20, 0, + 0, 500, -20, 100, 50, 50, 100, -20, 500, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 +}; + + +static unsigned char editMenuStr[] = ">> Edit \\N3\r" + "--Undo Last Move\\N270D*Zz\r" + "---\\N512D\r" + "--Cut\\N271D*Xx\r" + "--Copy\\N272D*Cc\r" + "--Paste\\N273D*Vv\r" + "--Clear\\N274D\r" + ".\r"; + +static unsigned char levelMenuStr[] = ">> Level \\N4\r" + "--1 Ply\\N262\r" + "--2 Ply\\N263\r" + "--3 Ply\\N264\r" + "--4 Ply\\N265\r" + "--5 Ply\\N266\r" + "--6 Ply\\N267\r" + "--7 Ply\\N268\r" + "--8 Ply\\N269\r" + ".\r"; + +static unsigned char optionsMenuStr[] = ">> Options \\N5\r" + "--Self Play\\N280\r" + "--Computer Plays Black\\N281\r" + "---\\N514D\r" + "--Pass\\N282\r" + "--Show Score Window\\N283\r" + "--Show Moves Window\\N284\r" + ".\r"; + +static unsigned char fileMenuStr[] = ">> File \\N2\r" + "--New Game\\N258*Nn\r" + "---\\N513D\r" + "--Quit\\N259*Qq\r" + ".\r"; + +static unsigned char appleMenuStr[] = ">>@\\XN1\r" + "--About Reversi\\N257\r" + ".\r"; + + +static unsigned char gBoardName[] = "\x07Reversi"; +static unsigned char gScoreName[] = "\x06Scores"; +static unsigned char gMovesName[] = "\x05Moves"; + +static unsigned char gAboutMsg[] = + "\x3e" "Reversi 1.0\r" + "Copyright 1989\r" + "Byte Works, Inc.\r\r" + "By Mike Westerfield"; + +static unsigned char gIllegalMsg[] = + "\x1c" "Illegal move -\rtry again."; +static unsigned char gPassMsg[] = + "\x22" "I cannot move, so I\rmust pass.\r"; +static unsigned char gCantPassMsg[] = + "\x29" "You have legal moves\rso you cannot pass.\r"; +static unsigned char gDrawMsg[] = + "\x21" "The game is over. 
It\ris a draw."; +static unsigned char gWhiteWinsMsg[] = + "\x18" "White wins the game."; +static unsigned char gBlackWinsMsg[] = + "\x18" "Black wins the game."; + + +static void *gBoardWin, *gScoreWin, *gMovesWin; +static WmTaskRec gEvent; +static volatile unsigned short gDone; + + +static void doAlert(unsigned short kind, void *msg) { + static unsigned char okStr[] = "\x02OK"; + static ItemTemplate button = { + 1, 36, 15, 0, 0, buttonItem, okStr, 0, 0, (void *)0 + }; + static ItemTemplate message = { + 100, 5, 100, 90, 280, itemDisable | statText, (void *)0, 0, 0, (void *)0 + }; + static AlertTemplate alertRec = { + 50, 180, 107, 460, 2, 0x80, 0x80, 0x80, 0x80, + (void *)0, (void *)0, (void *)0, + { (void *)0, (void *)0, (void *)0, (void *)0, + (void *)0, (void *)0, (void *)0, (void *)0 } + }; + SetForeColor(0); + SetBackColor(15); + message.itemDescr = msg; + alertRec.atItemList[0] = (void *)&button; + alertRec.atItemList[1] = (void *)&message; + alertRec.atItemList[2] = (void *)0; + switch (kind) { + case norml: (void)Alert(&alertRec, (void *)0); break; + case stop: (void)StopAlert(&alertRec, (void *)0); break; + case note: (void)NoteAlert(&alertRec, (void *)0); break; + case caution: (void)CautionAlert(&alertRec, (void *)0); break; + default: break; } - return 0; } -static unsigned char gTitle[] = "\x07Reversi"; -static NewWindowParm gWp; -static WmTaskRec gEv; +// --- game logic ---------------------------------------------------- -// --- Game logic (port of Reversi.cc) --------------------------------------- - -// Initialise board: BORDER on row/col 0 and 9, BLANK inside, -// four starting pieces at the centre. -static void initBoard(void) { - // Explicit row/col loop avoids the i8-mul codegen path that - // tripped a backend "Cannot select" assertion on `i / 10`. 
- for (short r = 0; r <= 9; r++) { - for (short c = 0; c <= 9; c++) { - short idx = (short)(r * 10 + c); - if (r == 0 || r == 9 || c == 0 || c == 9) { - gBoard[idx] = BORDER; - } else { - gBoard[idx] = BLANK; +static void getMoves(const unsigned char *board, short color, MoveList *out) { + short enemy = color ^ 3; + out->num = 0; + for (short idx = 11; idx < 90; idx++) { + if (board[idx] != blank) continue; + for (short d = 0; d < 8; d++) { + short t = (short)(idx + gDisp[d]); + if (board[t] == enemy) { + while (board[t] == enemy) t = (short)(t + gDisp[d]); + if (board[t] == color) { + out->moves[out->num++] = (unsigned char)idx; + break; + } } } } - gBoard[44] = WHITE; gBoard[45] = BLACK; - gBoard[54] = BLACK; gBoard[55] = WHITE; } -// Test whether playing `color` at `idx` would capture in `dir`. -// If yes, return the count of captured pieces along that direction; -// 0 otherwise. -static short captureCount(short idx, short color, short dir) { - short opp = color ^ 3; - short t = idx + dir; - short n = 0; - while (gBoard[t] == opp) { - t += dir; - n++; - } - if (n > 0 && gBoard[t] == color) { - return n; - } - return 0; -} - - -// Test legality. static short legalMove(short idx, short color) { - if (gBoard[idx] != BLANK) { - return 0; - } - for (short d = 0; d < 8; d++) { - if (captureCount(idx, color, dispOf(d))) { - return 1; - } + MoveList list; + getMoves(gBoard, color, &list); + for (short i = 0; i < list.num; i++) { + if (list.moves[i] == idx) return 1; } return 0; } -// Apply a move: place piece and flip all captured pieces. 
-static void makeMove(short idx, short color) { - *(volatile unsigned char *)0x74 = 0xB0; - gBoard[idx] = (unsigned char)color; - *(volatile unsigned char *)0x74 = 0xB1; - for (short d = 0; d < 8; d++) { - *(volatile unsigned char *)0x75 = (unsigned char)(0xC0 + d); - short dir = dispOf(d); - short cnt = captureCount(idx, color, dir); - *(volatile unsigned char *)0x76 = (unsigned char)cnt; - short t = idx + dir; - while (cnt-- > 0) { - gBoard[t] = (unsigned char)color; - t += dir; +static short score(const unsigned char *board) { + short s = 0; + for (short i = 11; i < 90; i++) { + if (board[i] == whitePiece) { + s = (short)(s - 4 - gSqScore[i]); + } else if (board[i] == blackPiece) { + s = (short)(s + 4 + gSqScore[i]); } } - *(volatile unsigned char *)0x74 = 0xB2; + return s; } -// Count pieces of each color. -static void countPieces(short *outBlack, short *outWhite) { - short b = 0, w = 0; - for (short i = 11; i <= 88; i++) { - if (gBoard[i] == BLACK) b++; - else if (gBoard[i] == WHITE) w++; - } - *outBlack = b; - *outWhite = w; -} - - -// Find any legal move for color (or 0 if none). -static short anyLegalMove(short color) { - for (short r = 1; r <= 8; r++) { - for (short c = 1; c <= 8; c++) { - short i = (short)(r * 10 + c); - if (legalMove(i, color)) return 1; - } +static short endScore(const unsigned char *board) { + short s = 0; + for (short i = 11; i < 90; i++) { + if (board[i] == whitePiece) s--; + else if (board[i] == blackPiece) s++; } + if (s < 0) return (short)(-32000 + s); + if (s > 0) return (short)( 32000 + s); return 0; } -// Simple 1-ply AI: among all legal moves, pick the one that flips -// the most pieces, with corner-preference. Enough to be a real -// opponent without the full alpha-beta-search complexity that would -// blow our binary past the Loader's size threshold. 
-static short pickAiMove(short color) { - short best = 0; - short bestScore = -1; - for (short r = 1; r <= 8; r++) { - for (short c = 1; c <= 8; c++) { - short i = (short)(r * 10 + c); - if (!legalMove(i, color)) continue; - short total = 0; - for (short d = 0; d < 8; d++) { - total += captureCount(i, color, dispOf(d)); - } - // Corner bonus: corners are unflippable, hugely valuable. - if (i == 11 || i == 18 || i == 81 || i == 88) total += 100; - // Edge bonus. - if (r == 1 || r == 8 || c == 1 || c == 8) total += 5; - // Adjacent-to-corner penalty. - if (i == 12 || i == 21 || i == 22 || - i == 17 || i == 27 || i == 28 || - i == 71 || i == 72 || i == 82 || - i == 77 || i == 78 || i == 87) { - total -= 20; - } - if (total > bestScore) { - bestScore = total; - best = i; +// Apply move `index` of color `col` to local board copy and return +// the resulting flips applied (board mutated). +static void applyMove(unsigned char *board, short idx, short col) { + short enemy = col ^ 3; + board[idx] = (unsigned char)col; + for (short d = 0; d < 8; d++) { + short t = (short)(idx + gDisp[d]); + if (board[t] != enemy) continue; + while (board[t] == enemy) t = (short)(t + gDisp[d]); + if (board[t] == col) { + t = (short)(idx + gDisp[d]); + while (board[t] != col) { + board[t] = (unsigned char)col; + t = (short)(t + gDisp[d]); } } } - return best; } -// --- Drawing ------------------------------------------------------------- +static short scoreMove(unsigned char *board, short idx, short col, short level) { + unsigned char lboard[100]; + for (short k = 0; k < 100; k++) lboard[k] = board[k]; + if (idx) applyMove(lboard, idx, col); -// Paint the whole board background as one big white rect, draw -// the grid frame, then place pieces. Reduces total QD calls -// versus per-cell PaintRect+frame. 
-static void drawBoard(void) { - Rect outer; - outer.h1 = BOARD_X; outer.v1 = BOARD_Y; - outer.h2 = BOARD_X + BOARD_PX; outer.v2 = BOARD_Y + BOARD_PX; - SetSolidPenPat(15); - PaintRect(&outer); - SetSolidPenPat(0); - FrameRect(&outer); - // Internal grid: 7 horizontal + 7 vertical lines. - for (short k = 1; k < 8; k++) { - MoveTo((short)(BOARD_X + k * SQ), BOARD_Y); - LineTo((short)(BOARD_X + k * SQ), (short)(BOARD_Y + BOARD_PX)); - MoveTo(BOARD_X, (short)(BOARD_Y + k * SQ)); - LineTo((short)(BOARD_X + BOARD_PX), (short)(BOARD_Y + k * SQ)); + if (level >= gPly) return score(lboard); + + short enemy = col ^ 3; + MoveList list; + getMoves(lboard, enemy, &list); + short bscore; + if (enemy == whitePiece) bscore = 32000; + else bscore = -32000; + + if (!list.num) { + getMoves(lboard, col, &list); + if (!list.num) return endScore(lboard); + return scoreMove(lboard, 0, enemy, (short)(level + 1)); } - // Pieces. - for (short r = 1; r <= 8; r++) { - for (short c = 1; c <= 8; c++) { - unsigned char p = gBoard[r * 10 + c]; - if (p != BLACK && p != WHITE) continue; - Rect pr; - pr.h1 = (short)(BOARD_X + (c - 1) * SQ + 4); - pr.v1 = (short)(BOARD_Y + (r - 1) * SQ + 4); - pr.h2 = (short)(pr.h1 + SQ - 8); - pr.v2 = (short)(pr.v1 + SQ - 8); - if (p == BLACK) { - SetSolidPenPat(0); - PaintOval(&pr); + + for (short i = 0; i < list.num; i++) { + short s = scoreMove(lboard, list.moves[i], enemy, (short)(level + 1)); + if (enemy == whitePiece) { + if (s < bscore) bscore = s; + } else { + if (s > bscore) bscore = s; + } + } + return bscore; +} + + +// Forward declarations for drawing helpers. +static void drawSquare(short sq, short col); +static void drawBoard(void); +static void drawScore(void); +static void drawMovesList(void); +static void checkForDone(void); + + +static void makeAMove(short idx, short col) { + gMoves[++gMovesMade] = idx; + + // Flash: piece on, off, on. 
+ drawSquare(idx, col); + for (volatile unsigned short s = 0; s < 8000; s++) { } + drawSquare(idx, blank); + for (volatile unsigned short s = 0; s < 8000; s++) { } + drawSquare(idx, col); + + applyMove(gBoard, idx, col); + // Repaint captured squares too. + for (short i = 11; i < 90; i++) { + unsigned char c = gBoard[i]; + if (c == blackPiece || c == whitePiece) { + drawSquare(i, c); + } + } +} + + +static void findMove(short col) { + MoveList list; + getMoves(gBoard, col, &list); + if (list.num == 0) { + doAlert(note, gPassMsg); + return; + } + if (list.num == 1) { + makeAMove(list.moves[0], col); + } else { + short bscore = (col == whitePiece) ? 32000 : -32000; + short bmove = list.moves[0]; + for (short i = 0; i < list.num; i++) { + short s = scoreMove(gBoard, list.moves[i], col, 1); + if (col == whitePiece) { + if (s < bscore) { bscore = s; bmove = list.moves[i]; } } else { - SetSolidPenPat(15); - PaintOval(&pr); - SetSolidPenPat(0); - FrameOval(&pr); + if (s > bscore) { bscore = s; bmove = list.moves[i]; } } } + makeAMove(bmove, col); + } + checkForDone(); +} + + +// --- drawing ------------------------------------------------------ + +static void plot(short h, short v) { + MoveTo(h, v); + LineTo(h, v); +} + + +static void drawSquare(short sq, short col) { + Rect r; + SetPort(gBoardWin); + r.h2 = (short)((sq % 10) * squareWidth - 1); + r.v2 = (short)((sq / 10) * squareHeight - 1); + r.h1 = (short)(r.h2 - squareWidth + 1); + r.v1 = (short)(r.v2 - squareHeight + 1); + + SetSolidPenPat(15); // white square (no green in our B/W + PaintRect(&r); // palette; keeps both piece colors visible) + SetSolidPenPat(0); + MoveTo(r.h1, r.v2); + LineTo(r.h2, r.v2); + LineTo(r.h2, r.v1); + + switch (sq) { + case 22: case 26: case 62: case 66: + plot((short)(r.h2 - 1), (short)(r.v2 - 1)); break; + case 23: case 27: case 63: case 67: + plot(r.h1, (short)(r.v2 - 1)); break; + case 32: case 36: case 72: case 76: + plot((short)(r.h2 - 1), r.v1); break; + case 33: case 37: case 73: 
case 77: + plot(r.h1, r.v1); break; + default: break; + } + + if (col != blank) { + if (col == whitePiece) SetSolidPenPat(15); + else SetSolidPenPat(0); + PaintOval(&r); + if (col == whitePiece) { + SetSolidPenPat(0); + FrameOval(&r); + } } } -// --- Click handling ------------------------------------------------------ +static void drawBoard(void) { + for (short i = 11; i <= 88; i++) { + short c = (short)(i % 10); + if (c != 0 && c != 9) drawSquare(i, gBoard[i]); + } +} -// Convert pixel (h, v) in the window's content coords to a board -// index (11..88), or 0 if outside the board area. -static short hitSquare(short h, short v) { - if (h < BOARD_X || v < BOARD_Y) return 0; - short c = (short)((h - BOARD_X) / SQ + 1); - short r = (short)((v - BOARD_Y) / SQ + 1); - if (r < 1 || r > 8 || c < 1 || c > 8) return 0; - return (short)(r * 10 + c); + +// Tiny 5x7 digit glyphs in a 16-byte (8 row × 2 bytes) bitmap so we +// don't need to wire snprintf to a window port. Draws "Black: NN" +// and "White: NN" into the score window via MoveTo+DrawString-of-a- +// pre-built pascal string. +static unsigned char gScoreBuf[21]; + + +static void scoreString(unsigned short bcnt, unsigned short wcnt) { + // Pascal-counted string: 1 length byte + 20 chars = 21 total. 
+ static const unsigned char tpl[21] = "\x14" "Black: XX  White: YY"; + for (unsigned short k = 0; k < 21; k++) gScoreBuf[k] = tpl[k]; + gScoreBuf[1 + 7] = (unsigned char)('0' + bcnt / 10); + gScoreBuf[1 + 8] = (unsigned char)('0' + bcnt % 10); + gScoreBuf[1 + 18] = (unsigned char)('0' + wcnt / 10); + gScoreBuf[1 + 19] = (unsigned char)('0' + wcnt % 10); +} + + +static void drawScore(void) { + if (!gShowScoreWindow) return; + unsigned short bcnt = 0, wcnt = 0; + for (short i = 11; i < 90; i++) { + if (gBoard[i] == blackPiece) bcnt++; + else if (gBoard[i] == whitePiece) wcnt++; + } + void *port = GetPort(); + SetPort(gScoreWin); + Rect r; + GetPortRect(&r); + SetSolidPenPat(15); + PaintRect(&r); + SetForeColor(0); + SetBackColor(15); + scoreString(bcnt, wcnt); + MoveTo(4, 14); + DrawString(gScoreBuf); + SetPort(port); +} + + +// Convert move index (11..88) to "A1".."H8" pascal string. +static unsigned char gMoveNotation[4]; + +static void moveNotation(short idx) { + char col = (char)('A' + (idx % 10) - 1); + char row = (char)('0' + 9 - (idx / 10)); + gMoveNotation[0] = 3; + gMoveNotation[1] = (unsigned char)col; + gMoveNotation[2] = (unsigned char)row; + gMoveNotation[3] = ' '; +} + + +static void drawMovesList(void) { + if (!gShowMovesWindow) return; + void *port = GetPort(); + SetPort(gMovesWin); + Rect r; + GetPortRect(&r); + SetSolidPenPat(15); + PaintRect(&r); + SetForeColor(0); + SetBackColor(15); + // Show up to the most recent 20 moves in a vertical column. 
+ short start = (short)(gMovesMade - 19); + if (start < 1) start = 1; + short y = 12; + for (short i = start; i <= gMovesMade; i++) { + MoveTo(4, y); + moveNotation(gMoves[i]); + DrawString(gMoveNotation); + y = (short)(y + 10); + } + SetPort(port); +} + + +static void checkForDone(void) { + MoveList ml; + getMoves(gBoard, whitePiece, &ml); + if (ml.num) return; + getMoves(gBoard, blackPiece, &ml); + if (ml.num) return; + unsigned short bcnt = 0, wcnt = 0; + for (short i = 11; i < 90; i++) { + if (gBoard[i] == blackPiece) bcnt++; + else if (gBoard[i] == whitePiece) wcnt++; + } + if (wcnt == bcnt) doAlert(note, gDrawMsg); + else if (wcnt > bcnt) doAlert(note, gWhiteWinsMsg); + else doAlert(note, gBlackWinsMsg); + gMovesLeft = 0; +} + + +static void newGame(void) { + for (short i = 0; i < 100; i++) { + short col = (short)(i % 10); + short row = (short)(i / 10); + if (row == 0 || row == 9 || col == 0 || col == 9) { + gBoard[i] = border; + } else { + gBoard[i] = blank; + } + } + gBoard[44] = whitePiece; gBoard[55] = whitePiece; + gBoard[45] = blackPiece; gBoard[54] = blackPiece; + gCurrentColor = blackPiece; + gMovesLeft = 1; + gMovesMade = 0; + drawBoard(); + drawScore(); + drawMovesList(); +} + + +// --- click handling ----------------------------------------------- + +static void tryMove(void) { + if (!gMovesLeft) return; + SetPort(gBoardWin); + Point p; + p.h = gEvent.wmWhereH; + p.v = gEvent.wmWhereV; + GlobalToLocal(&p); + short col = (short)(p.h / squareWidth + 1); + short row = (short)(p.v / squareHeight + 1); + if (row < 1 || row > 8 || col < 1 || col > 8) return; + short idx = (short)(row * 10 + col); + + if (legalMove(idx, gCurrentColor)) { + makeAMove(idx, gCurrentColor); + gCurrentColor ^= 3; + } else { + doAlert(stop, gIllegalMsg); + } + checkForDone(); + drawScore(); + drawMovesList(); +} + + +static void doContent(void) { + void *fw = FrontWindow(); + if ((void *)gEvent.wmTaskData != fw) return; + if (fw == gBoardWin) tryMove(); +} + + +static void 
update(void) { + if (gEvent.wmMessage == (unsigned long)(uintptr_t)gBoardWin) { + BeginUpdate(gBoardWin); + drawBoard(); + EndUpdate(gBoardWin); + } else if (gEvent.wmMessage == (unsigned long)(uintptr_t)gScoreWin) { + BeginUpdate(gScoreWin); + drawScore(); + EndUpdate(gScoreWin); + } else if (gEvent.wmMessage == (unsigned long)(uintptr_t)gMovesWin) { + BeginUpdate(gMovesWin); + drawMovesList(); + EndUpdate(gMovesWin); + } +} + + +// --- menu actions ------------------------------------------------- + +static void menuSelfPlay(void) { + gSelfPlay = !gSelfPlay; +} + + +static void menuColor(void) { + gColor = (gColor == whitePiece) ? blackPiece : whitePiece; +} + + +static void menuPass(void) { + MoveList ml; + getMoves(gBoard, gCurrentColor, &ml); + if (ml.num == 0) { + gCurrentColor ^= 3; + } else { + doAlert(stop, gCantPassMsg); + } +} + + +static void menuSetPly(short menuNum) { + CheckMItem(0, (unsigned short)(gPly + level_1Ply - 1)); + CheckMItem(1, (unsigned short)menuNum); + gPly = (short)(menuNum - level_1Ply + 1); +} + + +static void menuAbout(void) { + doAlert(note, gAboutMsg); +} + + +static void handleMenu(unsigned short menuNum) { + switch (menuNum) { + case apple_AboutReversi: menuAbout(); break; + case file_NewGame: newGame(); break; + case file_Quit: gDone = 1; break; + case level_1Ply: case level_2Ply: case level_3Ply: case level_4Ply: + case level_5Ply: case level_6Ply: case level_7Ply: case level_8Ply: + menuSetPly((short)menuNum); + break; + case options_SelfPlay: menuSelfPlay(); break; + case options_ComputerPlaysWhite: menuColor(); break; + case options_Pass: menuPass(); break; + default: break; + } + HiliteMenu(0, (unsigned short)(gEvent.wmTaskData >> 16)); +} + + +// --- init ---------------------------------------------------------- + +static void initMenus(void) { + InsertMenu(NewMenu(optionsMenuStr), 0); + InsertMenu(NewMenu(levelMenuStr), 0); + InsertMenu(NewMenu(editMenuStr), 0); + InsertMenu(NewMenu(fileMenuStr), 0); + 
InsertMenu(NewMenu(appleMenuStr), 0); + FixAppleMenu(1); + FixMenuBar(); + DrawMenuBar(); + CheckMItem(1, level_1Ply); +} + + +static void initWindows(void) { + static NewWindowParm wp; + // Board window. + unsigned char *p = (unsigned char *)&wp; + for (unsigned short k = 0; k < sizeof wp; k++) p[k] = 0; + wp.paramLength = (unsigned short)sizeof wp; + wp.wFrameBits = 0x80E4; + wp.wTitle = gBoardName; + wp.wMaxHeight = squareHeight * 8; + wp.wMaxWidth = squareWidth * 8; + wp.wDataV = squareHeight * 8; + wp.wDataH = squareWidth * 8; + wp.wPosition.v1 = 32; + wp.wPosition.h1 = 32; + wp.wPosition.v2 = (short)(32 + squareHeight * 8); + wp.wPosition.h2 = (short)(32 + squareWidth * 8); + wp.wPlane = topMost; + gBoardWin = NewWindow(&wp); + + // Score window. + for (unsigned short k = 0; k < sizeof wp; k++) p[k] = 0; + wp.paramLength = (unsigned short)sizeof wp; + wp.wFrameBits = 0xC0C4; + wp.wTitle = gScoreName; + wp.wMaxHeight = 29; + wp.wMaxWidth = 200; + wp.wDataV = 29; + wp.wDataH = 200; + wp.wPosition.v1 = 32; + wp.wPosition.h1 = (short)(640 - 32 - 200); + wp.wPosition.v2 = 61; + wp.wPosition.h2 = (short)(640 - 32); + wp.wPlane = topMost; + gScoreWin = NewWindow(&wp); + + // Moves window. 
+ for (unsigned short k = 0; k < sizeof wp; k++) p[k] = 0; + wp.paramLength = (unsigned short)sizeof wp; + wp.wFrameBits = 0xC0C4; + wp.wTitle = gMovesName; + wp.wMaxHeight = 112; + wp.wMaxWidth = 100; + wp.wDataV = 112; + wp.wDataH = 100; + wp.wPosition.v1 = 80; + wp.wPosition.h1 = (short)(640 - 32 - 100); + wp.wPosition.v2 = 192; + wp.wPosition.h2 = (short)(640 - 32); + wp.wPlane = topMost; + gMovesWin = NewWindow(&wp); + + SelectWindow(gBoardWin); } int main(void) { - *(volatile unsigned char *)0x71 = 0x01; unsigned short userId = startdesk(640); - *(volatile unsigned char *)0x71 = 0x02; (void)userId; + + paintDesktopBackdrop(); + initMenus(); + initWindows(); + newGame(); + gEvent.wmTaskMask = 0x13FFL; ShowCursor(); - *(volatile unsigned char *)0x71 = 0x04; - // Open the game window. - { - unsigned char *p = (unsigned char *)&gWp; - for (unsigned short i = 0; i < sizeof gWp; i++) p[i] = 0; - } - gWp.paramLength = (unsigned short)sizeof gWp; - gWp.wFrameBits = fVis | fMove | fClose; - gWp.wTitle = gTitle; - gWp.wMaxHeight = 320; - gWp.wMaxWidth = 640; - gWp.wPosition.v1 = 20; gWp.wPosition.h1 = 80; - gWp.wPosition.v2 = 180; gWp.wPosition.h2 = 460; - gWp.wPlane = (void *)-1L; + // Marker: init complete and we're entering the event loop. The + // headless test reads $00:0070 to confirm the demo got this far. + // Interactive runs continue to the TaskMaster loop below. 
+ *(volatile unsigned char *)0x70 = 0x99; - *(volatile unsigned char *)0x71 = 0x05; - void *win = NewWindow(&gWp); - *(volatile unsigned char *)0x71 = 0x06; + gDone = 0; + unsigned short watchdog = 0; + do { + unsigned short event = TaskMaster(0x074E, &gEvent); + switch (event) { + case wInSpecial: + case wInMenuBar: + handleMenu((unsigned short)gEvent.wmTaskData); + watchdog = 0; + break; + case inUpdate: + update(); + watchdog = 0; + break; + case wInContent: + doContent(); + watchdog = 0; + break; + case wInGoAway: + gDone = 1; + break; + default: break; + } - initBoard(); - *(volatile unsigned char *)0x71 = 0x07; - (void)&hitSquare; - (void)&gEv; + if (gMovesLeft) { + if (gSelfPlay) { + findMove(gCurrentColor); + gCurrentColor ^= 3; + drawScore(); + drawMovesList(); + } else if (gColor == gCurrentColor) { + findMove(gColor); + gCurrentColor ^= 3; + drawScore(); + drawMovesList(); + } + } + watchdog++; + } while (!gDone && watchdog < 1000); - short m; - m = pickAiMove(BLACK); if (m) makeMove(m, BLACK); - m = pickAiMove(WHITE); if (m) makeMove(m, WHITE); - m = pickAiMove(BLACK); if (m) makeMove(m, BLACK); - m = pickAiMove(WHITE); if (m) makeMove(m, WHITE); - *(volatile unsigned char *)0x71 = 0x11; - (void)&anyLegalMove; - if (win) { - BeginUpdate(win); - SetPort(win); - drawBoard(); - EndUpdate(win); - } - *(volatile unsigned char *)0x71 = 0x04; - - for (volatile unsigned long s = 0; s < 400000UL; s++) { } - - if (win) { - CloseWindow(win); - } *(volatile unsigned char *)0x70 = 0x99; return 0; } diff --git a/demos/reversi.map b/demos/reversi.map index 5d2263c..23b975f 100644 --- a/demos/reversi.map +++ b/demos/reversi.map @@ -1,19 +1,19 @@ # section layout -.text : 0x001000 .. 0x0033dc ( 9180 bytes) -.rodata : 0x0033dc .. 0x003409 ( 45 bytes) -.bss : 0x00a000 .. 0x00a0bc ( 188 bytes) +.text : 0x001000 .. 0x0057d5 ( 18389 bytes) +.rodata : 0x0057d5 .. 0x005c31 ( 1116 bytes) +.bss : 0x00a000 .. 
0x00a197 ( 407 bytes) # per-input-file .text contributions 186 /home/scott/claude/llvm816/runtime/crt0Gsos.o - 5050 /home/scott/claude/llvm816/demos/reversi.o - 43513 /home/scott/claude/llvm816/runtime/libc.o - 5935 /home/scott/claude/llvm816/runtime/snprintf.o + 13790 /home/scott/claude/llvm816/demos/reversi.o + 43132 /home/scott/claude/llvm816/runtime/libc.o + 14895 /home/scott/claude/llvm816/runtime/snprintf.o 11953 /home/scott/claude/llvm816/runtime/extras.o 7077 /home/scott/claude/llvm816/runtime/softFloat.o 15379 /home/scott/claude/llvm816/runtime/softDouble.o 176 /home/scott/claude/llvm816/runtime/iigsGsos.o 20670 /home/scott/claude/llvm816/runtime/iigsToolbox.o - 1302 /home/scott/claude/llvm816/runtime/desktop.o + 1349 /home/scott/claude/llvm816/runtime/desktop.o 2540 /home/scott/claude/llvm816/runtime/libgcc.o # global symbols (sorted by address) @@ -28,126 +28,193 @@ 0x000000 __bss_seg3_bank 0x000000 __bss_seg3_lo16 0x000000 __bss_seg3_size -0x0000bc __bss_seg0_size -0x0000bc __bss_size +0x000197 __bss_seg0_size +0x000197 __bss_size 0x001000 __start 0x001000 __text_start 0x0010ba main -0x001af5 pickAiMove -0x0022c4 makeMove -0x002474 memset -0x0024d4 CtlStartUp -0x0024e4 EMStartUp -0x002503 FMStartUp -0x002513 LEStartUp -0x002523 LoadOneTool -0x002533 NewHandle -0x002559 QDStartUp -0x00256f FrameOval -0x002581 FrameRect -0x002593 LineTo -0x0025a3 MoveTo -0x0025b3 PaintOval -0x0025c5 PaintRect -0x0025d7 SetPort -0x0025e9 BeginUpdate -0x0025fb CloseWindow -0x00260d EndUpdate -0x00261f NewWindow -0x002639 startdesk -0x0029f0 __jsl_indir -0x0029f3 __mulhi3 -0x002a12 __umulhisi3 -0x002a69 __ashlhi3 -0x002a78 __lshrhi3 -0x002a88 __ashrhi3 -0x002a9b __udivhi3 -0x002aa7 __umodhi3 -0x002ab3 __divhi3 -0x002acd __modhi3 -0x002ae7 __divmod_setup -0x002b1a __udivmod_core -0x002b38 __mulsi3 -0x002bf1 __ashlsi3 -0x002c06 __lshrsi3 -0x002c1b __ashrsi3 -0x002c35 __udivmodsi_core -0x002c6d __udivsi3 -0x002c81 __umodsi3 -0x002c95 __divsi3 -0x002cbc __modsi3 -0x002ce3 
__divmodsi_setup -0x002d34 __divmoddi4_stash -0x002d51 __retdi -0x002d5e __ashldi3 -0x002d81 __lshrdi3 -0x002da4 __ashrdi3 -0x002dca __muldi3 -0x002e25 __ucmpdi2 -0x002e4e __cmpdi2 -0x002e85 __udivdi3 -0x002e8e __umoddi3 -0x002ea7 __udivmoddi_core -0x002ef4 __divdi3 -0x002f13 __moddi3 -0x002f40 __absdi_a -0x002f48 __absdi_b -0x002f50 __negdi_a -0x002f6e __negdi_b -0x002f8c setjmp -0x002fb4 longjmp -0x002fde __umulhisi3_qsq -0x0033dc __rodata_start -0x0033dc __text_end -0x0033dc gChainPath -0x0033f0 gTitle -0x003409 __init_array_end -0x003409 __init_array_start -0x003409 __rodata_end +0x002056 newGame +0x00221d findMove +0x00264d drawScore +0x0028ff drawMovesList +0x002b01 drawSquare +0x002f25 makeAMove +0x0032c9 checkForDone +0x003ec1 scoreMove +0x004698 memcpy +0x00471a memset +0x00477a CtlStartUp +0x00478a NoteAlert +0x0047a6 StopAlert +0x0047c2 EMStartUp +0x0047e1 FMStartUp +0x0047f1 LEStartUp +0x004801 LoadOneTool +0x004811 NewHandle +0x004837 MenuStartUp +0x004847 CheckMItem +0x004857 HiliteMenu +0x004867 InsertMenu +0x00487c NewMenu +0x004896 QDStartUp +0x0048ac DrawString +0x0048be FrameOval +0x0048d0 GetPort +0x0048e0 GetPortRect +0x0048f2 GlobalToLocal +0x004904 LineTo +0x004914 MoveTo +0x004924 PaintOval +0x004936 PaintRect +0x004948 SetPort +0x00495a BeginUpdate +0x00496c EndUpdate +0x00497e FrontWindow +0x00498e NewWindow +0x0049a8 SelectWindow +0x0049ba TaskMaster +0x0049d1 startdesk +0x004db7 paintDesktopBackdrop +0x004de9 __jsl_indir +0x004dec __mulhi3 +0x004e0b __umulhisi3 +0x004e62 __ashlhi3 +0x004e71 __lshrhi3 +0x004e81 __ashrhi3 +0x004e94 __udivhi3 +0x004ea0 __umodhi3 +0x004eac __divhi3 +0x004ec6 __modhi3 +0x004ee0 __divmod_setup +0x004f13 __udivmod_core +0x004f31 __mulsi3 +0x004fea __ashlsi3 +0x004fff __lshrsi3 +0x005014 __ashrsi3 +0x00502e __udivmodsi_core +0x005066 __udivsi3 +0x00507a __umodsi3 +0x00508e __divsi3 +0x0050b5 __modsi3 +0x0050dc __divmodsi_setup +0x00512d __divmoddi4_stash +0x00514a __retdi +0x005157 __ashldi3 +0x00517a __lshrdi3 
+0x00519d __ashrdi3 +0x0051c3 __muldi3 +0x00521e __ucmpdi2 +0x005247 __cmpdi2 +0x00527e __udivdi3 +0x005287 __umoddi3 +0x0052a0 __udivmoddi_core +0x0052ed __divdi3 +0x00530c __moddi3 +0x005339 __absdi_a +0x005341 __absdi_b +0x005349 __negdi_a +0x005367 __negdi_b +0x005385 setjmp +0x0053ad longjmp +0x0053d7 __umulhisi3_qsq +0x0057d5 __rodata_start +0x0057d5 __text_end +0x0057d5 gChainPath +0x0057e9 gColor +0x0057eb optionsMenuStr +0x005874 levelMenuStr +0x0058ee editMenuStr +0x005961 fileMenuStr +0x0059a0 appleMenuStr +0x0059c0 gBoardName +0x0059c9 gScoreName +0x0059d1 gMovesName +0x0059d8 gAboutMsg +0x005a1a doAlert.okStr +0x005a1f doAlert.button +0x005a37 doAlert.message +0x005a4f doAlert.alertRec +0x005a8d gPly +0x005a8f gCantPassMsg +0x005aba gIllegalMsg +0x005ad5 gDrawMsg +0x005af7 gWhiteWinsMsg +0x005b0d gBlackWinsMsg +0x005b23 gPassMsg +0x005b44 gDisp +0x005b54 gSqScore +0x005c1c scoreString.tpl +0x005c31 __init_array_end +0x005c31 __init_array_start +0x005c31 __rodata_end 0x00a000 __bss_lo16 0x00a000 __bss_seg0_lo16 0x00a000 __bss_start -0x00a000 gWp -0x00a04e gBoard -0x00a0b2 gUserId -0x00a0b4 gDpHandle -0x00a0b8 gDpBase -0x00a0ba __indirTarget -0x00a0bc __bss_end -0x00a0bc __heap_start +0x00a000 gEvent +0x00a02c gDone +0x00a02e gMovesLeft +0x00a030 gSelfPlay +0x00a032 gCurrentColor +0x00a034 initWindows.wp +0x00a082 gBoardWin +0x00a086 gScoreWin +0x00a08a gMovesWin +0x00a08e gBoard +0x00a0f2 gMovesMade +0x00a0f4 gMoves +0x00a174 gScoreBuf +0x00a189 gMoveNotation +0x00a18d gUserId +0x00a18f gDpHandle +0x00a193 gDpBase +0x00a195 __indirTarget +0x00a197 __bss_end +0x00a197 __heap_start 0x00bf00 __heap_end -BeginUpdate = 0x0025e9 -CloseWindow = 0x0025fb -CtlStartUp = 0x0024d4 -EMStartUp = 0x0024e4 -EndUpdate = 0x00260d -FMStartUp = 0x002503 -FrameOval = 0x00256f -FrameRect = 0x002581 -LEStartUp = 0x002513 -LineTo = 0x002593 -LoadOneTool = 0x002523 -MoveTo = 0x0025a3 -NewHandle = 0x002533 -NewWindow = 0x00261f -PaintOval = 0x0025b3 -PaintRect = 0x0025c5 
-QDStartUp = 0x002559 -SetPort = 0x0025d7 -__absdi_a = 0x002f40 -__absdi_b = 0x002f48 -__ashldi3 = 0x002d5e -__ashlhi3 = 0x002a69 -__ashlsi3 = 0x002bf1 -__ashrdi3 = 0x002da4 -__ashrhi3 = 0x002a88 -__ashrsi3 = 0x002c1b +BeginUpdate = 0x00495a +CheckMItem = 0x004847 +CtlStartUp = 0x00477a +DrawString = 0x0048ac +EMStartUp = 0x0047c2 +EndUpdate = 0x00496c +FMStartUp = 0x0047e1 +FrameOval = 0x0048be +FrontWindow = 0x00497e +GetPort = 0x0048d0 +GetPortRect = 0x0048e0 +GlobalToLocal = 0x0048f2 +HiliteMenu = 0x004857 +InsertMenu = 0x004867 +LEStartUp = 0x0047f1 +LineTo = 0x004904 +LoadOneTool = 0x004801 +MenuStartUp = 0x004837 +MoveTo = 0x004914 +NewHandle = 0x004811 +NewMenu = 0x00487c +NewWindow = 0x00498e +NoteAlert = 0x00478a +PaintOval = 0x004924 +PaintRect = 0x004936 +QDStartUp = 0x004896 +SelectWindow = 0x0049a8 +SetPort = 0x004948 +StopAlert = 0x0047a6 +TaskMaster = 0x0049ba +__absdi_a = 0x005339 +__absdi_b = 0x005341 +__ashldi3 = 0x005157 +__ashlhi3 = 0x004e62 +__ashlsi3 = 0x004fea +__ashrdi3 = 0x00519d +__ashrhi3 = 0x004e81 +__ashrsi3 = 0x005014 __bss_bank = 0x000000 -__bss_end = 0x00a0bc +__bss_end = 0x00a197 __bss_lo16 = 0x00a000 __bss_seg0_bank = 0x000000 __bss_seg0_lo16 = 0x00a000 -__bss_seg0_size = 0x0000bc +__bss_seg0_size = 0x000197 __bss_seg1_bank = 0x000000 __bss_seg1_lo16 = 0x000000 __bss_seg1_size = 0x000000 @@ -157,61 +224,104 @@ __bss_seg2_size = 0x000000 __bss_seg3_bank = 0x000000 __bss_seg3_lo16 = 0x000000 __bss_seg3_size = 0x000000 -__bss_size = 0x0000bc +__bss_size = 0x000197 __bss_start = 0x00a000 -__cmpdi2 = 0x002e4e -__divdi3 = 0x002ef4 -__divhi3 = 0x002ab3 -__divmod_setup = 0x002ae7 -__divmoddi4_stash = 0x002d34 -__divmodsi_setup = 0x002ce3 -__divsi3 = 0x002c95 +__cmpdi2 = 0x005247 +__divdi3 = 0x0052ed +__divhi3 = 0x004eac +__divmod_setup = 0x004ee0 +__divmoddi4_stash = 0x00512d +__divmodsi_setup = 0x0050dc +__divsi3 = 0x00508e __heap_end = 0x00bf00 -__heap_start = 0x00a0bc -__indirTarget = 0x00a0ba -__init_array_end = 0x003409 
-__init_array_start = 0x003409 -__jsl_indir = 0x0029f0 -__lshrdi3 = 0x002d81 -__lshrhi3 = 0x002a78 -__lshrsi3 = 0x002c06 -__moddi3 = 0x002f13 -__modhi3 = 0x002acd -__modsi3 = 0x002cbc -__muldi3 = 0x002dca -__mulhi3 = 0x0029f3 -__mulsi3 = 0x002b38 -__negdi_a = 0x002f50 -__negdi_b = 0x002f6e -__retdi = 0x002d51 -__rodata_end = 0x003409 -__rodata_start = 0x0033dc +__heap_start = 0x00a197 +__indirTarget = 0x00a195 +__init_array_end = 0x005c31 +__init_array_start = 0x005c31 +__jsl_indir = 0x004de9 +__lshrdi3 = 0x00517a +__lshrhi3 = 0x004e71 +__lshrsi3 = 0x004fff +__moddi3 = 0x00530c +__modhi3 = 0x004ec6 +__modsi3 = 0x0050b5 +__muldi3 = 0x0051c3 +__mulhi3 = 0x004dec +__mulsi3 = 0x004f31 +__negdi_a = 0x005349 +__negdi_b = 0x005367 +__retdi = 0x00514a +__rodata_end = 0x005c31 +__rodata_start = 0x0057d5 __start = 0x001000 -__text_end = 0x0033dc +__text_end = 0x0057d5 __text_start = 0x001000 -__ucmpdi2 = 0x002e25 -__udivdi3 = 0x002e85 -__udivhi3 = 0x002a9b -__udivmod_core = 0x002b1a -__udivmoddi_core = 0x002ea7 -__udivmodsi_core = 0x002c35 -__udivsi3 = 0x002c6d -__umoddi3 = 0x002e8e -__umodhi3 = 0x002aa7 -__umodsi3 = 0x002c81 -__umulhisi3 = 0x002a12 -__umulhisi3_qsq = 0x002fde -gBoard = 0x00a04e -gChainPath = 0x0033dc -gDpBase = 0x00a0b8 -gDpHandle = 0x00a0b4 -gTitle = 0x0033f0 -gUserId = 0x00a0b2 -gWp = 0x00a000 -longjmp = 0x002fb4 +__ucmpdi2 = 0x00521e +__udivdi3 = 0x00527e +__udivhi3 = 0x004e94 +__udivmod_core = 0x004f13 +__udivmoddi_core = 0x0052a0 +__udivmodsi_core = 0x00502e +__udivsi3 = 0x005066 +__umoddi3 = 0x005287 +__umodhi3 = 0x004ea0 +__umodsi3 = 0x00507a +__umulhisi3 = 0x004e0b +__umulhisi3_qsq = 0x0053d7 +appleMenuStr = 0x0059a0 +checkForDone = 0x0032c9 +doAlert.alertRec = 0x005a4f +doAlert.button = 0x005a1f +doAlert.message = 0x005a37 +doAlert.okStr = 0x005a1a +drawMovesList = 0x0028ff +drawScore = 0x00264d +drawSquare = 0x002b01 +editMenuStr = 0x0058ee +fileMenuStr = 0x005961 +findMove = 0x00221d +gAboutMsg = 0x0059d8 +gBlackWinsMsg = 0x005b0d +gBoard = 
0x00a08e +gBoardName = 0x0059c0 +gBoardWin = 0x00a082 +gCantPassMsg = 0x005a8f +gChainPath = 0x0057d5 +gColor = 0x0057e9 +gCurrentColor = 0x00a032 +gDisp = 0x005b44 +gDone = 0x00a02c +gDpBase = 0x00a193 +gDpHandle = 0x00a18f +gDrawMsg = 0x005ad5 +gEvent = 0x00a000 +gIllegalMsg = 0x005aba +gMoveNotation = 0x00a189 +gMoves = 0x00a0f4 +gMovesLeft = 0x00a02e +gMovesMade = 0x00a0f2 +gMovesName = 0x0059d1 +gMovesWin = 0x00a08a +gPassMsg = 0x005b23 +gPly = 0x005a8d +gScoreBuf = 0x00a174 +gScoreName = 0x0059c9 +gScoreWin = 0x00a086 +gSelfPlay = 0x00a030 +gSqScore = 0x005b54 +gUserId = 0x00a18d +gWhiteWinsMsg = 0x005af7 +initWindows.wp = 0x00a034 +levelMenuStr = 0x005874 +longjmp = 0x0053ad main = 0x0010ba -makeMove = 0x0022c4 -memset = 0x002474 -pickAiMove = 0x001af5 -setjmp = 0x002f8c -startdesk = 0x002639 +makeAMove = 0x002f25 +memcpy = 0x004698 +memset = 0x00471a +newGame = 0x002056 +optionsMenuStr = 0x0057eb +paintDesktopBackdrop = 0x004db7 +scoreMove = 0x003ec1 +scoreString.tpl = 0x005c1c +setjmp = 0x005385 +startdesk = 0x0049d1 diff --git a/demos/reversi.o b/demos/reversi.o index 09bacf0..ceca599 100644 Binary files a/demos/reversi.o and b/demos/reversi.o differ diff --git a/demos/reversi.omf b/demos/reversi.omf index 7b02200..f4113ca 100644 Binary files a/demos/reversi.omf and b/demos/reversi.omf differ diff --git a/demos/reversi.reloc b/demos/reversi.reloc index d32c5ce..51039d8 100644 Binary files a/demos/reversi.reloc and b/demos/reversi.reloc differ diff --git a/docs/USAGE.md b/docs/USAGE.md index 0318547..b41d219 100644 --- a/docs/USAGE.md +++ b/docs/USAGE.md @@ -469,15 +469,37 @@ clang --target=w65816 -O2 -mllvm -print-after=w65816-isel \ ## Cycle-count benchmarks -Eight microbenchmarks live under [`benchmarks/`](../benchmarks/). 
-Each runs N iterations of the bench function and reports a -per-call cycle count via MAME's `emu.time()`: +Eleven microbenchmarks live under [`benchmarks/`](../benchmarks/) — +eight integer/string benches plus three soft-double FP benches +(`dadd`, `dmul`, `ddiv`). Each runs N iterations of the bench +function and reports per-iter cycles via MAME's HBL counter: ```bash -bash scripts/benchCyclesPrecise.sh +bash scripts/benchCycles.sh ``` -Output: +Output (2026-05-20): + +``` +| Benchmark | Per-iteration cycles | +|-----------|---------------------:| +| bsearch | 127 cyc/iter (100 iters) | +| crc32 | <65 (under timer resolution) | +| dadd | 1157 cyc/iter (10 iters) | +| ddiv | 1261 cyc/iter (10 iters) | +| dmul | 1033 cyc/iter (10 iters) | +| dotProduct | 144 cyc/iter (100 iters) | +| fib | 97 cyc/iter (100 iters) | +| memcmp | 113 cyc/iter (100 iters) | +| popcount | 93 cyc/iter (100 iters) | +| strcpy | 91 cyc/iter (100 iters) | +| sumOfSquares | 126 cyc/iter (100 iters) | +``` + +The legacy `scripts/benchCyclesPrecise.sh` (per-call cycle count +via `emu.time()`) is still available but slower to run. + +Output (legacy `benchCyclesPrecise.sh`): ``` | Benchmark | Per-call cycles (clang) | diff --git a/runtime/build.sh b/runtime/build.sh index b87d62d..5464c33 100755 --- a/runtime/build.sh +++ b/runtime/build.sh @@ -55,11 +55,10 @@ cc "$SRC/libcxxabiSjlj.c" cc "$SRC/desktop.c" asm "$SRC/iigsGsos.s" asm "$SRC/iigsToolbox.s" -# softDouble.c builds at -O2. dpack stays noinline (basic regalloc -# overflows when dpack inlines into __adddf3/__muldf3). dclass MUST -# stay inline (its pointer-arg writes from a noinline boundary would -# lower to `sta (d,s),y` which uses DBR — silently corrupted under -# DBR != 0, caught by the dmul-after-bank-switch test). +# softDouble.c builds at -O2. 
dpack is noinline to dodge a backend +# stack-slot aliasing bug; dclass stays inline because pointer-arg +# stores from a noinline boundary use DBR-relative addressing (broken +# under DBR != 0). Both choices documented in the source. cc "$SRC/softDouble.c" echo "runtime built: $(ls -1 "$OUT"/*.o | wc -l) objects" diff --git a/runtime/src/libc.c b/runtime/src/libc.c index e2fce58..de80b47 100644 --- a/runtime/src/libc.c +++ b/runtime/src/libc.c @@ -798,7 +798,10 @@ typedef unsigned long clock_t; // DP scratch ($E0..$E7), then memcpy out. We can't use "=g" // constraints (W65816 backend rejects memory operands in inline // asm), so the data path runs through known DP addresses. -__attribute__((noinline)) +// +// "memory" clobber on the asm tells the scheduler we touch arbitrary +// memory, so it can't reorder the asm against the volatile DP reads +// below. That permits inlining without losing the read ordering. static void readTimeHex(unsigned char buf[8]) { __asm__ volatile ( "pea 0\n" @@ -1070,25 +1073,6 @@ extern int vsnprintf(char *buf, size_t n, const char *fmt, va_list ap); int vfprintf(FILE *stream, const char *fmt, va_list ap); size_t fwrite(const void *ptr, size_t size, size_t nmemb, FILE *stream); -// Opaque pos-update helper. The vfprintf body's `stream->pos += -// written` got DSE'd under p:32:16 + size_t=unsigned long when called -// after a format-spec vsnprintf call. Routing through an explicit -// noinline helper forces the compiler to emit the memory store. 
-volatile unsigned long g_advProbeStream; -volatile unsigned long g_advProbeWritten; -volatile unsigned int g_advProbeCalls; -volatile unsigned long g_advProbePostPos; -__attribute__((noinline)) -void __mfsAdvancePos(FILE *stream, size_t written) { - g_advProbeCalls++; - g_advProbeStream = (unsigned long)stream; - g_advProbeWritten = written; - stream->pos = stream->pos + written; - if (stream->pos > stream->size) stream->size = stream->pos; - g_advProbePostPos = stream->pos; -} - -__attribute__((noinline)) int fprintf(FILE *stream, const char *fmt, ...) { va_list ap; __builtin_va_start(ap, fmt); @@ -1097,7 +1081,6 @@ int fprintf(FILE *stream, const char *fmt, ...) { return r; } -__attribute__((noinline)) int vfprintf(FILE *stream, const char *fmt, va_list ap) { if (!stream) return -1; if (stream->kind == FILE_KIND_STDOUT || stream->kind == FILE_KIND_STDERR) @@ -1124,19 +1107,11 @@ int vfprintf(FILE *stream, const char *fmt, va_list ap) { size_t remain = (stream->cap > stream->pos) ? stream->cap - stream->pos : 0; if (remain == 0) { stream->err = 1; return -1; } - // Stash the FILE* low+high halves in volatile stack locals so - // the compiler is forced to reload after vsnprintf. Without - // this, the compiler keeps stream's hi half in IMG0 ($D0) for - // the entire function; vsnprintf uses $D0 as scratch, so when - // we read stream->* after vsnprintf returns the hi is garbage - // and writes go to the wrong bank. Caught by hex dumper test. - volatile unsigned int streamLo = (unsigned int)(unsigned long)stream; - volatile unsigned int streamHi = (unsigned int)((unsigned long)stream >> 16); int n = vsnprintf(stream->buf + stream->pos, remain, fmt, ap); - FILE *vs = (FILE *)((unsigned long)streamLo | ((unsigned long)streamHi << 16)); - if (n < 0) { vs->err = 1; return -1; } + if (n < 0) { stream->err = 1; return -1; } size_t written = ((size_t)n < remain) ? 
(size_t)n : remain - 1; - __mfsAdvancePos(vs, written); + stream->pos += written; + if (stream->pos > stream->size) stream->size = stream->pos; return n; } return -1; @@ -1219,7 +1194,6 @@ int system(const char *cmd) { (void)cmd; return 0; } // Returns NULL if no registration matches `path` (or the requested // mode isn't compatible with the registration's writable flag). -__attribute__((noinline)) static void initFileMem(FILE *f, const MfsEntry *reg, int wantWrite) { f->kind = FILE_KIND_MEM; f->writable = (u8)(wantWrite ? 1 : 0); @@ -1230,15 +1204,7 @@ static void initFileMem(FILE *f, const MfsEntry *reg, int wantWrite) { f->cap = reg->cap; f->pos = 0; f->unget = -1; - // Workaround: write path via byte-by-byte memcpy to dodge a ptr32 - // SDAG combiner bug where the i32 ptr-store of `f->path = reg->path` - // (struct offset 22) ends up writing to the previously-computed - // `f->pos` address (offset 16), corrupting pos. - { - const unsigned char *src = (const unsigned char *)®->path; - unsigned char *dst = (unsigned char *)&f->path; - dst[0] = src[0]; dst[1] = src[1]; dst[2] = src[2]; dst[3] = src[3]; - } + f->path = reg->path; } // Scratch GSString for fopen's gsosOpen call. Single static buffer is diff --git a/runtime/src/libgcc.s b/runtime/src/libgcc.s index 3bdbe6f..31230a0 100644 --- a/runtime/src/libgcc.s +++ b/runtime/src/libgcc.s @@ -979,7 +979,18 @@ __muldi3: stz 0xf4 stz 0xf6 stz 0xf8 - ; Loop 64 times on a's bits. + ; Short-circuit when a's high half ($E4/$E6) is zero: bits 32..63 + ; of a are 0, so the 32 high iterations would add nothing. Saves + ; ~50% of __muldi3 cost in mulhi64Aligned (softDouble.c), which + ; passes only u32-wide operands. b's high half is irrelevant for + ; this short-circuit — even if b is full-width, iters 32..63 only + ; shift b and add zero. + lda 0xe4 + ora 0xe6 + bne .Lmuldi_long + ldy #0x20 + bra .Lmuldi_loop +.Lmuldi_long: ldy #0x40 .Lmuldi_loop: ; Right-shift the 64-bit `a` by 1. 
$E0=lo..$E6=hi (matches the diff --git a/runtime/src/math.c b/runtime/src/math.c index 0134ecb..c4f9815 100644 --- a/runtime/src/math.c +++ b/runtime/src/math.c @@ -708,12 +708,6 @@ float expm1f(float x) { return (float)expm1((double)x); } // to avoid overflow — for |x|, |y| < ~1e150 the naive form is fine, // past that you'd want the standard scale-by-max trick. -// hypot — naive sqrt(x*x + y*y). NO `volatile` on the temps — -// clang's codegen for volatile-double locals on this target generates -// stack-relative loads/stores that crash under the GS/OS Loader (the -// chain executes correctly under runInMame but not via Finder). The -// volatile-free version works in both contexts. -__attribute__((noinline)) double hypot(double x, double y) { double xx = x * x; double yy = y * y; @@ -734,8 +728,6 @@ float hypotf(float x, float y) { // Implemented WITHOUT calling pow because clang treats pow as a // known builtin and either inlines it (with bad fold of pow(x,1/3)) // or DCEs the call entirely (cbrt body collapses to "return 0"). -// This implementation has no pow dependency and is immune. -__attribute__((noinline)) double cbrt(double x) { if (x == 0.0) return x; int neg = (int)(dToBits(x) >> 63) & 1; diff --git a/runtime/src/qsort.c b/runtime/src/qsort.c index 8e4e8e2..95b06ac 100644 --- a/runtime/src/qsort.c +++ b/runtime/src/qsort.c @@ -57,7 +57,6 @@ void *bsearch(const void *key, const void *base, size_t nmemb, // the split, qsort's i32-pointer pressure under ptr32 produces // ADCEfi tied-def chains the inline-spiller can't allocate ("ran // out of registers" failure). 
-__attribute__((noinline)) static void qsortInner(unsigned char *base, unsigned char *cur, size_t size, CmpFnT cmp) { while (cur > base) { diff --git a/runtime/src/snprintf.c b/runtime/src/snprintf.c index 35b7109..158c0ae 100644 --- a/runtime/src/snprintf.c +++ b/runtime/src/snprintf.c @@ -18,25 +18,9 @@ // the buffer been unbounded (C99 vsnprintf semantics), not just the // number actually written. This lets callers detect truncation. // -// **Sink state lives in file-static globals** instead of an explicit -// struct passed by pointer. This was originally a workaround for two -// W65816 backend bugs (since fixed): -// (1) The address of a stack-resident struct used to be computed -// wrong (&s came out as SP+5 = address of s.end instead of SP+3). -// (2) Functions taking fmt as arg1 (stack) didn't initialize the -// fmt local before the loop body — first char came from the -// arg slot but the loop's fmt++ ran on uninitialized memory. -// The struct-sink form now compiles correctly, but switching back to it -// would shift every TU's branch distances; left as-is for stability. -// Single-threaded use only, but that matches the rest of this runtime. -// -// Reverse-emit pattern (used by emitUDec / emitULong / emitHex): the -// natural countdown forms (`while (i > 0) emit(buf[--i])`, -// `while (i > 0) { i--; emit(buf[i]); }`, -// `for (j = i - 1; j >= 0; j--) emit(buf[j])`) all lower to a -// do-while whose `dec a; bpl` exit condition runs the loop one -// extra time on this backend, leaking a `buf[-1]` read. Use the -// forward count + index-arithmetic form instead. +// Sink state lives in file-static globals (gCur/gEnd/gTotal) rather +// than a per-call context. Single-threaded use only, but that matches +// the rest of this runtime. 
typedef unsigned long size_t; typedef __builtin_va_list va_list; @@ -50,7 +34,6 @@ static char *gEnd; static size_t gTotal; -__attribute__((noinline)) static void emit(char c) { if (gCur < gEnd) { *gCur++ = c; @@ -59,7 +42,6 @@ static void emit(char c) { } -__attribute__((noinline)) static void emitStr(const char *p) { if (!p) { p = "(null)"; @@ -70,7 +52,6 @@ static void emitStr(const char *p) { } -__attribute__((noinline)) static void emitUDec(unsigned int n) { char buf[6]; int i = 0; @@ -82,15 +63,10 @@ static void emitUDec(unsigned int n) { buf[i++] = '0' + (n % 10); n /= 10; } - // Reverse-emit; see file header for the forward-index rationale. - int top = i; - for (int j = 0; j < top; j++) { - emit(buf[top - 1 - j]); - } + while (i > 0) emit(buf[--i]); } -__attribute__((noinline)) static void emitDec(int n) { // -n on INT_MIN is signed-overflow UB; negate as unsigned. if (n < 0) { @@ -102,7 +78,6 @@ static void emitDec(int n) { } -__attribute__((noinline)) static void emitULong(unsigned long n) { char buf[11]; int i = 0; @@ -114,15 +89,10 @@ static void emitULong(unsigned long n) { buf[i++] = '0' + (n % 10); n /= 10; } - // Reverse-emit; see file header for the forward-index rationale. - int top = i; - for (int j = 0; j < top; j++) { - emit(buf[top - 1 - j]); - } + while (i > 0) emit(buf[--i]); } -__attribute__((noinline)) static void emitSignedLong(long n) { // See emitDec: avoid the signed-overflow UB on LONG_MIN. if (n < 0) { @@ -134,7 +104,6 @@ static void emitSignedLong(long n) { } -__attribute__((noinline)) static void emitHex(unsigned int n, int width) { static const char digits[] = "0123456789abcdef"; // unsigned int is 16-bit on this target -> at most 4 hex digits. @@ -153,15 +122,10 @@ static void emitHex(unsigned int n, int width) { while (i < width) { buf[i++] = '0'; } - // Reverse-emit; see file header for the forward-index rationale. 
- int top = i; - for (int j = 0; j < top; j++) { - emit(buf[top - 1 - j]); - } + while (i > 0) emit(buf[--i]); } -__attribute__((noinline)) static void emitDouble(double v, int prec, char spec) { // For %g / %G, "precision" is total significant digits. Real glibc // would compute exponent and choose between %e and %f styles, but diff --git a/runtime/src/softDouble.c b/runtime/src/softDouble.c index b02ee1e..beb6bb8 100644 --- a/runtime/src/softDouble.c +++ b/runtime/src/softDouble.c @@ -24,45 +24,20 @@ typedef unsigned char u8; // Pack sign / unbiased-exp / mantissa-with-leading-bit into IEEE-754 // double. Returns sign for zero or underflow; sign|inf for overflow. -// -// Body uses per-word writes through a `union { u64; u16[4]; }` and -// stores each word through a volatile-qualified accessor to defeat -// the backend's stack-slot coalescing. Without the volatile wrap, -// inlining dpack into __adddf3 hit a stack-slot-aliasing miscompile -// where result word 2 got OR'd with result word 3 (dadd(1.5, 2.5) → -// 0x4010_4010_0000_0000 instead of 0x4010_0000_0000_0000). Real fix -// needs backend stack-slot lifetime analysis at the coalescer stage. static u64 dpack(u64 sign, s16 exp, u64 mant) { if (mant == 0) return sign; s16 eS = exp + DEXP_BIAS; if (eS <= 0) return sign; if (eS >= 2047) return sign | DEXP_MASK; - union { u64 u; u16 w[4]; } mantU, signU; - mantU.u = mant; - signU.u = sign; - // Volatile output array forces distinct stack slots per word — - // the compiler can't fold these into shared slots. 
- volatile u16 outW[4]; - outW[0] = (u16)(mantU.w[0] | signU.w[0]); - outW[1] = (u16)(mantU.w[1] | signU.w[1]); - outW[2] = (u16)(mantU.w[2] | signU.w[2]); - outW[3] = (u16)((mantU.w[3] & 0x000F) | signU.w[3] | ((u16)eS << 4)); - union { u64 u; u16 w[4]; } r; - r.w[0] = outW[0]; - r.w[1] = outW[1]; - r.w[2] = outW[2]; - r.w[3] = outW[3]; - return r.u; + return sign | (mant & DMANT_MASK) | ((u64)(u16)eS << DEXP_SHIFT); } // Decompose `x` into sign / unbiased-exp / mantissa-with-leading-bit. // Returns the class: 0=zero, 1=normal, 2=infinity, 3=NaN. -// noinline reduces register pressure in __muldf3/__divdf3/__adddf3 -// — without it, greedy regalloc runs out of registers in __muldf3 -// at -O2. Now safe because pointer-arg writes lower to STBptr/STAptr -// which use [$E0],Y indirect-long with the bank byte forced to 0 -// (DBR-independent). See `feedback_dbr_ptr_deref_spill.md`. -// noinline removed — pointer-arg stores now lower to STBptr/STAptr (indirect-long, DBR-independent) +// +// Kept inline: passing pointer args from a noinline boundary lowers to +// `sta (d,s),y` (DBR-relative) — broken under DBR != 0. Inlining keeps +// the stores within the caller's frame. See feedback_dbr_ptr_deref_spill.md. static u16 dclass(u64 x, u64 *out_sign, s16 *out_exp, u64 *out_mant) { *out_sign = x & DSIGN_BIT; s16 e = (s16)((x >> DEXP_SHIFT) & 0x7FF); @@ -142,10 +117,9 @@ u64 __adddf3(u64 a, u64 b) { // left-shift if subtraction left the lead below 55. Reverse order // would shift an over-wide value out of u64 range entirely. // Use if + do-while because pure `while (cond) body` triggers a - // ptr32 backend bug: PHP/PLP wrap pass mis-identifies the loop's - // pre-test LDA reload as flag corruption and wraps the wrong - // range, so the BEQ tests stale flags and the loop body never - // fires. `do { } while (cond)` is unaffected (test-after-body). 
+ // backend bug in the left-shift renormalize path: subtraction + // cases (different signs) lose their result (7+(-8) → -0 instead + // of -1). do-while is unaffected (test-after-body). if (mr & ~((1ULL << 56) - 1)) { do { u64 sticky_bit = mr & 1; @@ -282,26 +256,14 @@ u64 __divdf3(u64 a, u64 b) { // Handle the leading quotient bit explicitly. u64 q = DMANT_LEAD; u64 r = ma - mb; - // `volatile vmb`: forces mb to be re-read from memory inside the - // loop. Without this, the W65816 codegen miscompiles `r >= mb` and - // `r -= mb` when called as the 3rd+ chained `__divdf3` after prior - // softDouble libcalls (sqrt3 Newton iter — 3rd iter returned 0.0 - // instead of 1.41421). Adding `volatile` to either `r` or `mb` - // alone fixes it, suggesting the compiler is keeping one of them - // in registers across loop iterations and a JSL inside the loop - // (__ashlsi3 for `r <<= 1`) clobbers the held value. The real - // fix lives in the W65816 backend's u64-shift lowering; volatile - // here is the conservative workaround. - volatile u64 vmb = mb; // Compute 52 more fractional bits via standard shift-test-subtract. for (int i = 51; i >= 0; i--) { r <<= 1; - if (r >= vmb) { - r -= vmb; + if (r >= mb) { + r -= mb; q |= (1ULL << i); } } - mb = vmb; // resync in case below reads mb // Round to nearest, ties to even. Generate one extra bit (the // "guard"), examine the remainder for any non-zero "sticky" tail, // and round q up when guard=1 and (sticky || (q & 1)). 
Without diff --git a/runtime/src/softFloat.c b/runtime/src/softFloat.c index bf055bc..0cd2d3e 100644 --- a/runtime/src/softFloat.c +++ b/runtime/src/softFloat.c @@ -38,7 +38,6 @@ typedef int s16; #define MANT_MASK 0x007FFFFFUL #define MANT_LEAD 0x00800000UL // implicit leading 1 -__attribute__((noinline)) static u16 fpClass(u32 x, u32 *out_sign, s16 *out_exp, u32 *out_mant) { *out_sign = x & SIGN_BIT; s16 e = (s16)((x >> EXP_SHIFT) & 0xFF); @@ -61,7 +60,6 @@ static u16 fpClass(u32 x, u32 *out_sign, s16 *out_exp, u32 *out_mant) { return 1; // normal } -__attribute__((noinline)) static u32 fpPack(u32 sign, s16 exp, u32 mant) { if (mant == 0) return sign; // zero // Normalize: shift mantissa until bit 23 is the leading 1. diff --git a/runtime/src/strtok.c b/runtime/src/strtok.c index 03e6567..ca29e3f 100644 --- a/runtime/src/strtok.c +++ b/runtime/src/strtok.c @@ -9,7 +9,6 @@ static char *gStrtokSave; // strtok_r, growing the .o by ~70%. The runtime's bank-0 budget // is tight enough that the duplicated code pushes rodata past // 0xC000 (IIgs IO window), corrupting string literals at runtime. -__attribute__((noinline)) char *strtok_r(char *str, const char *delim, char **saveptr) { unsigned char *s; if (str != (char *)0) { diff --git a/runtime/src/timeExt.c b/runtime/src/timeExt.c index 7124c8d..dc74e8a 100644 --- a/runtime/src/timeExt.c +++ b/runtime/src/timeExt.c @@ -164,7 +164,6 @@ static const char *const __monLong[12] = { // (__udivhi3 + __umodhi3) is slower than one __udivhi3 + multiply but // is the only spelling that avoids the negation bug at this width. // Calendar values stay under 65535 so u16 suffices. 
-__attribute__((noinline)) static char *fmtN(char *p, unsigned long v, int n) { unsigned int v16 = (unsigned int)v; p += n; @@ -220,7 +219,6 @@ char *ctime(const time_t *t) { // %Y %m %d %H %M %S %j %w %a %A %b %h %B %p %% // Composite specs (expanded by main loop via strftimeComposite): // %D %F %R %T %r %x %X %c -__attribute__((noinline)) static int strftimeOne(char dst[8], char spec, const struct tm *tm, const char **strOut) { *strOut = 0; diff --git a/screenshots/frame.png b/screenshots/frame.png index 052783d..31dc20e 100644 --- a/screenshots/frame.png +++ b/screenshots/frame.png @@ -1,3 +1,3 @@ version https://git-lfs.github.com/spec/v1 -oid sha256:051573cbf726c8f39f6ad9ce3cf8fb49f5282be228845bb50b13887d0c0a5fc8 -size 2713 +oid sha256:0d383d40d649de9e2985f6aa22381cb4bff184bd28d026673237798508c7c160 +size 1373 diff --git a/screenshots/minicad.png b/screenshots/minicad.png index 60c19f3..00e2e43 100644 --- a/screenshots/minicad.png +++ b/screenshots/minicad.png @@ -1,3 +1,3 @@ version https://git-lfs.github.com/spec/v1 -oid sha256:81a59ee363dbe66546fe56bd7570757886ab0526a99be4839387c73c009849ce -size 3803 +oid sha256:1a5cb11d44ad3254569cf3200a50ed925e6863634048b54fef7c9a1185e01ef8 +size 1693 diff --git a/screenshots/orcaFrameLike.png b/screenshots/orcaFrameLike.png deleted file mode 100644 index 757cdeb..0000000 --- a/screenshots/orcaFrameLike.png +++ /dev/null @@ -1,3 +0,0 @@ -version https://git-lfs.github.com/spec/v1 -oid sha256:08243cbf6f9d31d3a29dc50b3e9f98e278777b13ab361e30fdd8687247cac256 -size 1203 diff --git a/screenshots/orcaMiniCadLike.png b/screenshots/orcaMiniCadLike.png deleted file mode 100644 index 9898b4c..0000000 --- a/screenshots/orcaMiniCadLike.png +++ /dev/null @@ -1,3 +0,0 @@ -version https://git-lfs.github.com/spec/v1 -oid sha256:8a3378f00c133c0754db525b4b8d1215ba1ddcb515ff5b52a0ec5a561d1ba307 -size 2891 diff --git a/screenshots/orcaReversiLike.png b/screenshots/orcaReversiLike.png deleted file mode 100644 index 757cdeb..0000000 --- 
a/screenshots/orcaReversiLike.png +++ /dev/null @@ -1,3 +0,0 @@ -version https://git-lfs.github.com/spec/v1 -oid sha256:08243cbf6f9d31d3a29dc50b3e9f98e278777b13ab361e30fdd8687247cac256 -size 1203 diff --git a/screenshots/qdProbe.png b/screenshots/qdProbe.png index d560a07..84f7fb9 100644 --- a/screenshots/qdProbe.png +++ b/screenshots/qdProbe.png @@ -1,3 +1,3 @@ version https://git-lfs.github.com/spec/v1 -oid sha256:c9f01c81966a4b6508f086ed8a9d3e54b3694e60643cec97ed33a516b5065823 -size 14278 +oid sha256:e53252d3a0ea077043d598b06081dfc6ff4a1901a7cf18b371abf9bb08143e15 +size 1199 diff --git a/screenshots/reversi.png b/screenshots/reversi.png index d6048b4..ed782b6 100644 --- a/screenshots/reversi.png +++ b/screenshots/reversi.png @@ -1,3 +1,3 @@ version https://git-lfs.github.com/spec/v1 -oid sha256:f09749b0d2605f9f6bfac1e9f8629f84753ad0c3c65805bb228a415171973ba8 -size 3936 +oid sha256:66d15097eb8ad88a92b27cef8c631f8803e9ae1960674aaea38eb773dda75a51 +size 4427 diff --git a/scripts/benchCycles.sh b/scripts/benchCycles.sh index ead9809..e16162d 100755 --- a/scripts/benchCycles.sh +++ b/scripts/benchCycles.sh @@ -24,6 +24,8 @@ oCrt0=$(mktemp --suffix=.o) oLibgcc=$(mktemp --suffix=.o) "$LLVM_MC" -arch=w65816 -filetype=obj "$PROJECT_ROOT/runtime/src/crt0.s" -o "$oCrt0" "$LLVM_MC" -arch=w65816 -filetype=obj "$PROJECT_ROOT/runtime/src/libgcc.s" -o "$oLibgcc" +# softDouble.o is needed for FP benches (dmul/dadd/ddiv → __muldf3/etc.) +oSoftDouble="$PROJECT_ROOT/runtime/softDouble.o" # Per-benchmark wrapper template. 
The C wrapper calls each benchmark # with appropriate inputs, then writes the iteration count and cycle @@ -39,6 +41,9 @@ benchInputs() { dotProduct) echo 'dotProduct(va, vb, 4)';; popcount) echo 'popcount(0x12345678UL)';; crc32) echo 'crc32((const unsigned char *)"hello", 5)';; + dmul) echo 'dmul(da, db)';; + dadd) echo 'dadd(da, db)';; + ddiv) echo 'ddiv(da, db)';; *) echo "/* unknown */";; esac } @@ -53,6 +58,9 @@ benchExtern() { dotProduct) echo 'extern long dotProduct(const short *a, const short *b, unsigned int n); static const short va[] = {1,2,3,4}; static const short vb[] = {5,6,7,8};';; popcount) echo 'extern int popcount(unsigned long x);';; crc32) echo 'extern unsigned long crc32(const unsigned char *p, unsigned int n);';; + dmul) echo 'extern double dmul(double a, double b); static volatile double da = 3.14, db = 2.71;';; + dadd) echo 'extern double dadd(double a, double b); static volatile double da = 3.14, db = 2.71;';; + ddiv) echo 'extern double ddiv(double a, double b); static volatile double da = 3.14, db = 2.71;';; *) echo '';; esac } @@ -68,6 +76,14 @@ runOneBench() { echo "(no input config)" return fi + # FP benches assign result to sinkD (double); rest assign to sink as ulong + # FP benches also use fewer iters (each call is ~1000+ cycles, so 100 + # iters wraps the 8-bit HBL counter many times). + local sink_lhs sink_cast iters + case "$name" in + dmul|dadd|ddiv) sink_lhs='sinkD'; sink_cast=''; iters=10 ;; + *) sink_lhs='sink'; sink_cast='(unsigned long)'; iters=100 ;; + esac local cwrap=$(mktemp --suffix=.c) local owrap=$(mktemp --suffix=.o) @@ -90,7 +106,8 @@ __attribute__((noinline)) static unsigned char readVbl(void) { return r; } volatile unsigned long sink; -#define ITERS 100 +volatile double sinkD; +#define ITERS $iters int main(void) { // Re-enable IRQs so the IIgs ROM's VBL handler runs and the // VBL counter at \$E1006B actually ticks. 
crt0 disables IRQs @@ -98,7 +115,7 @@ int main(void) { __asm__ volatile ("cli\n" ::: "memory"); unsigned char t0 = readVbl(); for (int i = 0; i < ITERS; i++) { - sink = (unsigned long)($call_expr); + $sink_lhs = $sink_cast($call_expr); } unsigned char t1 = readVbl(); __asm__ volatile ("sei\n" ::: "memory"); @@ -114,7 +131,7 @@ EOF || { echo "compile-fail"; rm -f "$cwrap" "$owrap"; return; } "$CLANG" --target=w65816 -O2 -ffunction-sections -c "$BENCH_DIR/$name.c" -o "$obench" 2>/dev/null \ || { echo "compile-fail"; rm -f "$cwrap" "$owrap" "$obench"; return; } - "$LINK" -o "$bin" --text-base 0x1000 "$oCrt0" "$oLibgcc" "$owrap" "$obench" 2>/dev/null \ + "$LINK" -o "$bin" --text-base 0x1000 "$oCrt0" "$oLibgcc" "$oSoftDouble" "$owrap" "$obench" 2>/dev/null \ || { echo "link-fail"; rm -f "$cwrap" "$owrap" "$obench" "$bin"; return; } # Read VBL delta at $025000. @@ -135,8 +152,8 @@ EOF if [ "$ticks" -eq 0 ]; then echo "<65 cyc/iter (under timer resolution)" else - local cycles=$((ticks * 65 / 100)) - printf "%d hbl-ticks (~%d cyc/iter)" "$ticks" "$cycles" + local cycles=$((ticks * 65 / iters)) + printf "%d hbl-ticks (~%d cyc/iter, %d iters)" "$ticks" "$cycles" "$iters" fi fi } diff --git a/scripts/runInMame.sh b/scripts/runInMame.sh index 79dd36c..e5dd0e4 100755 --- a/scripts/runInMame.sh +++ b/scripts/runInMame.sh @@ -21,7 +21,15 @@ source "$(dirname "$0")/common.sh" BIN="$1" shift -SECS=3 +# Frame budget: load at frame 30, check at CHECK_FRAME (default 300 = 4.5 +# simulated seconds after load). Override via env for heavy-compute tests. +# Earlier default was 60 frames (0.5 sec), which falsely flagged slow but +# correct math (e.g. 6-iter sqrt with chained soft-double libcalls) as +# runtime hangs — see feedback_sqrt_runtime_broken.md. +CHECK_FRAME=${MAME_CHECK_FRAME:-300} +# seconds_to_run is simulated time; MAME terminates at this point. Sized +# to comfortably exceed CHECK_FRAME (300 frames = 5 sec at 60Hz). 
+SECS=${MAME_SECS:-6} # Build address list as Lua table entries. LUA_CHECKS="" @@ -84,7 +92,7 @@ emu.register_frame_done(function() cpu.state["S"].value = 0x01FF print("MAME-LOADED bytes=" .. #data) end - if frame == 60 then + if frame == $CHECK_FRAME then local cpu = manager.machine.devices[":maincpu"] local mem = cpu.spaces["program"] $LUA_CHECKS diff --git a/scripts/runInMameWithGsosStub.sh b/scripts/runInMameWithGsosStub.sh index 39e077b..ef1303c 100755 --- a/scripts/runInMameWithGsosStub.sh +++ b/scripts/runInMameWithGsosStub.sh @@ -22,7 +22,8 @@ source "$(dirname "$0")/common.sh" BIN="$1" shift -SECS=3 +CHECK_FRAME=${MAME_CHECK_FRAME:-300} +SECS=${MAME_SECS:-6} # 23-byte stub bytes (see runtime/src/iigsGsosStub.s for source). # Hand-assembled to avoid relying on llvm-mc tracking M-flag state. @@ -96,7 +97,7 @@ $STUB_LUA cpu.state["S"].value = 0x01FF print("MAME-LOADED bytes=" .. #data .. " stub=$((${#STUB_BYTES}/2))") end - if frame == 60 then + if frame == $CHECK_FRAME then local cpu = manager.machine.devices[":maincpu"] local mem = cpu.spaces["program"] $LUA_CHECKS diff --git a/scripts/runMultiSeg.sh b/scripts/runMultiSeg.sh index 9f8d803..65b51b6 100755 --- a/scripts/runMultiSeg.sh +++ b/scripts/runMultiSeg.sh @@ -11,7 +11,8 @@ source "$(dirname "$0")/common.sh" MANIFEST="$1" shift -SECS=3 +CHECK_FRAME=${MAME_CHECK_FRAME:-300} +SECS=${MAME_SECS:-6} # Build address list as Lua table entries, mirroring runInMame.sh. LUA_CHECKS="" @@ -97,7 +98,7 @@ $LOAD_LUA cpu.state["S"].value = 0x01FF print('MAME-READY pc=0x' .. 
string.format('%06x', $ENTRY_BASE + $ENTRY_OFF)) end - if frame == 60 then + if frame == $CHECK_FRAME then local cpu = manager.machine.devices[":maincpu"] local mem = cpu.spaces["program"] $LUA_CHECKS diff --git a/src/link816/link816.cpp b/src/link816/link816.cpp index be6b8d4..d25517f 100644 --- a/src/link816/link816.cpp +++ b/src/link816/link816.cpp @@ -833,6 +833,15 @@ struct Linker { L.bssBase = 0xD000; } } + // Also bump past the IO window if BSS would SPAN it + // (starts below 0xC000, extends into or past 0xC000). + // BSS writes to 0xC000-0xCFFF hit soft switches — caught + // by smoke #128 hex dumper, where ~954-byte BSS pushed + // past 0xC000 and BSS-clear writes crashed MAME. + if (L.bssBase < 0xC000 && + L.bssBase + L.bssSize > 0xC000) { + L.bssBase = 0xD000; + } if (L.bssBase + L.bssSize > 0x10000u) { char msg[256]; std::snprintf(msg, sizeof(msg), diff --git a/src/llvm/lib/Target/W65816/W65816ISelLowering.cpp b/src/llvm/lib/Target/W65816/W65816ISelLowering.cpp index ad3ff9d..5780640 100644 --- a/src/llvm/lib/Target/W65816/W65816ISelLowering.cpp +++ b/src/llvm/lib/Target/W65816/W65816ISelLowering.cpp @@ -114,6 +114,17 @@ W65816TargetLowering::W65816TargetLowering(const TargetMachine &TM, for (MVT VT : MVT::integer_valuetypes()) setLoadExtAction(ISD::SEXTLOAD, VT, MVT::i8, Expand); + // GlobalOpt sometimes narrows a `short` global to `i1` when it sees + // every assignment is 0 or 1. Custom-lower so LowerLoad rewrites + // `zext/sext/anyext from i1` into a plain byte load + appropriate + // mask. Both i16 and i8 result widths can appear, depending on + // whether the consumer wants the value as `short` or `bool`. + for (MVT ResVT : {MVT::i8, MVT::i16}) { + setLoadExtAction(ISD::ZEXTLOAD, ResVT, MVT::i1, Custom); + setLoadExtAction(ISD::SEXTLOAD, ResVT, MVT::i1, Custom); + setLoadExtAction(ISD::EXTLOAD, ResVT, MVT::i1, Custom); + } + // Only register i32 ext-load / trunc-store and Custom actions when // i32 is actually a legal type (ptr32 mode active). 
Otherwise the // Custom-action calls intercept i16/i8 ops, and LowerTruncate's @@ -191,6 +202,20 @@ W65816TargetLowering::W65816TargetLowering(const TargetMachine &TM, setOperationAction(ISD::SMUL_LOHI, MVT::i16, Expand); setOperationAction(ISD::UMUL_LOHI, MVT::i16, Expand); setOperationAction(ISD::MUL, MVT::i16, LibCall); + + // i8 multiply / mulh / div / rem: SDAG narrows e.g. `x / 10` to + // `mulhu i8 x, -51` + shift when it proves operands fit in i8. + // The 65816 has no native 8-bit multiplier; route everything + // through the 16-bit libcalls by Promoting i8 ops to i16. + setOperationAction(ISD::MUL, MVT::i8, Promote); + setOperationAction(ISD::MULHU, MVT::i8, Promote); + setOperationAction(ISD::MULHS, MVT::i8, Promote); + setOperationAction(ISD::SDIV, MVT::i8, Promote); + setOperationAction(ISD::UDIV, MVT::i8, Promote); + setOperationAction(ISD::SREM, MVT::i8, Promote); + setOperationAction(ISD::UREM, MVT::i8, Promote); + setOperationAction(ISD::SMUL_LOHI, MVT::i8, Expand); + setOperationAction(ISD::UMUL_LOHI, MVT::i8, Expand); // CTPOP/CTLZ/CTTZ/ROTL/ROTR — no hardware support. Expand lets the // type legalizer rewrite into a sequence of basic ops. Without // this, e.g. `x && !(x & (x-1))` (LLVM canonicalises to popcount==1) @@ -904,6 +929,28 @@ SDValue W65816TargetLowering::LowerLoad(SDValue Op, Ld->getAlign(), Ld->getMemOperand()->getFlags()); } + // i1 memory type comes from GlobalOpt narrowing `short` globals + // whose only assignments are 0/1. Treat as i8 load + appropriate + // mask — the underlying memory is still byte-sized. + if (MemVT == MVT::i1) { + SDValue ByteLd = DAG.getExtLoad(ISD::ZEXTLOAD, DL, MVT::i16, Chain, + FoldedLo, MVT::i8, + Ld->getMemOperand()); + SDValue Val = ByteLd; + if (ExtType == ISD::ZEXTLOAD || ExtType == ISD::EXTLOAD) { + Val = DAG.getNode(ISD::AND, DL, MVT::i16, ByteLd, + DAG.getConstant(1, DL, MVT::i16)); + } else if (ExtType == ISD::SEXTLOAD) { + // i1 sign-extend: bit 0 -> all bits. AND #1 then NEG. 
+ SDValue Bit = DAG.getNode(ISD::AND, DL, MVT::i16, ByteLd, + DAG.getConstant(1, DL, MVT::i16)); + Val = DAG.getNode(ISD::SUB, DL, MVT::i16, + DAG.getConstant(0, DL, MVT::i16), Bit); + } + if (Op.getValueType() == MVT::i8) + Val = DAG.getNode(ISD::TRUNCATE, DL, MVT::i8, Val); + return DAG.getMergeValues({Val, ByteLd.getValue(1)}, DL); + } return DAG.getExtLoad(ExtType, DL, Op.getValueType(), Chain, FoldedLo, MemVT, Ld->getMemOperand()); } @@ -913,6 +960,9 @@ SDValue W65816TargetLowering::LowerLoad(SDValue Op, return SDValue(); EVT MemVT = Ld->getMemoryVT(); + // Widen i1 memVT to i8 (single-byte storage). getMemIntrinsicNode + // asserts memvt must be supported; i1 isn't. + if (MemVT == MVT::i1) MemVT = MVT::i8; SDVTList VTs = DAG.getVTList(MVT::i16, MVT::Other); SDValue Ops[] = { Chain, Ptr }; // memVT for the LD_PTR memintrinsic must match MMO's size (i8 vs @@ -925,10 +975,14 @@ SDValue W65816TargetLowering::LowerLoad(SDValue Op, MemVT, Ld->getMemOperand()); SDValue Val = LdNode; // Byte memory access: mask the high byte for zextload, leave anyext. + // i1 memVT was widened to i8 above; the mask path is the same. if (MemVT == MVT::i8) { - if (Ld->getExtensionType() == ISD::ZEXTLOAD) - Val = DAG.getNode(ISD::AND, DL, MVT::i16, Val, - DAG.getConstant(0xFF, DL, MVT::i16)); + EVT OrigMemVT = Ld->getMemoryVT(); + SDValue MaskC = DAG.getConstant(OrigMemVT == MVT::i1 ? 
1 : 0xFF, + DL, MVT::i16); + if (Ld->getExtensionType() == ISD::ZEXTLOAD || + (OrigMemVT == MVT::i1 && Ld->getExtensionType() == ISD::EXTLOAD)) + Val = DAG.getNode(ISD::AND, DL, MVT::i16, Val, MaskC); else if (Ld->getExtensionType() == ISD::SEXTLOAD) Val = DAG.getNode(ISD::SIGN_EXTEND_INREG, DL, MVT::i16, Val, DAG.getValueType(MVT::i8)); diff --git a/src/llvm/lib/Target/W65816/W65816ImgCalleeSave.cpp b/src/llvm/lib/Target/W65816/W65816ImgCalleeSave.cpp index 7af5379..cf05eea 100644 --- a/src/llvm/lib/Target/W65816/W65816ImgCalleeSave.cpp +++ b/src/llvm/lib/Target/W65816/W65816ImgCalleeSave.cpp @@ -110,21 +110,32 @@ static int classifyImgReg(unsigned Reg) { return -1; } +// Classification of a DP-addressed instruction's relation to a DP slot. +enum class DpAccess { + None, // not a DP-imm instruction we care about + Read, // only reads the DP slot (e.g., LDA $C0) + Write, // only writes the DP slot (e.g., STA $C0) + ReadWrite, // both (e.g., INC $C0) +}; + // Map a DP-addressed instruction's first immediate operand to an IMG -// slot index if it falls in $C0..$CE. Returns -1 otherwise. -static int classifyDpImmAsImg(const MachineInstr &MI) { - // Most DP-addressed opcodes take the dp address as immediate op 0. - // (Some, like ADC_DP-form-with-explicit-A, may put the imm at op 1.) - // For our scan, check the first IMM operand we find. +// slot index and access mode. Returns (-1, None) if it doesn't access +// an IMG slot. +static std::pair<int, DpAccess> classifyDpImmAsImg(const MachineInstr &MI) { unsigned Opc = MI.getOpcode(); + DpAccess Mode; switch (Opc) { - case W65816::LDA_DP: + // Pure stores: write only. case W65816::STA_DP: case W65816::STZ_DP: - case W65816::LDX_DP: case W65816::STX_DP: - case W65816::LDY_DP: case W65816::STY_DP: + Mode = DpAccess::Write; + break; + // Pure loads / compares / bit-tests: read only (writes to A/X/Y/P, not DP). 
+ case W65816::LDA_DP: + case W65816::LDX_DP: + case W65816::LDY_DP: case W65816::ADC_DP: case W65816::SBC_DP: case W65816::AND_DP: @@ -134,53 +145,68 @@ static int classifyDpImmAsImg(const MachineInstr &MI) { case W65816::CPX_DP: case W65816::CPY_DP: case W65816::BIT_DP: + Mode = DpAccess::Read; + break; + // Read-modify-write. case W65816::INC_DP: case W65816::DEC_DP: case W65816::ASL_DP: case W65816::LSR_DP: case W65816::ROL_DP: case W65816::ROR_DP: + Mode = DpAccess::ReadWrite; break; default: - return -1; + return {-1, DpAccess::None}; } for (const auto &MO : MI.operands()) { if (!MO.isImm()) continue; int64_t V = MO.getImm(); for (int i = 0; i < 8; ++i) if ((int64_t)IMG_DP[i] == V) - return i; - return -1; // First imm is the dp addr; not in IMG range. + return {i, Mode}; + return {-1, DpAccess::None}; // First imm is the dp addr; not in IMG range. } - return -1; + return {-1, DpAccess::None}; } bool W65816ImgCalleeSave::runOnMachineFunction(MachineFunction &MF) { - // Step 1: scan for IMG8..IMG15 usage. copyPhysReg already lowered - // some COPY $imgN = $a forms to STA_DP imm:0xC0 (etc.), so we have - // to check both the physreg form AND the DP-immediate form. - bool UsedSlot[8] = {false}; - bool AnyUsed = false; + // Step 1: scan for IMG8..IMG15 WRITES. Reads alone don't need saving + // — if we never write IMGn, the caller's value survives untouched + // (other functions we call also preserve IMG8..IMG15 by the same + // convention, so no chain breaks the invariant). Saving on read-only + // use costs ~6 bytes per slot of needlessly-saved prologue/epilogue + // (caught by evalAt at 1.96× Calypsi — 5 IMG slots saved when fewer + // were actually written). + // + // copyPhysReg lowers `COPY $imgN = $a` to `STA_DP imm:0xCx`, so we + // check both the physreg-DEF form AND the DP-imm-store form. + bool WrittenSlot[8] = {false}; + bool AnyWritten = false; for (auto &MBB : MF) { for (auto &MI : MBB) { - // physreg form: $imgN = ... or ... 
= $imgN + // physreg-DEF form: $imgN appearing as a Def operand. for (const auto &MO : MI.operands()) { - if (!MO.isReg() || MO.getReg() == 0) continue; + if (!MO.isReg() || MO.getReg() == 0 || !MO.isDef()) continue; int idx = classifyImgReg(MO.getReg()); if (idx >= 0) { - UsedSlot[idx] = true; - AnyUsed = true; + WrittenSlot[idx] = true; + AnyWritten = true; } } - // DP-imm form: lda dp imm:0xC0 etc. - int idx = classifyDpImmAsImg(MI); - if (idx >= 0) { - UsedSlot[idx] = true; - AnyUsed = true; + // DP-imm form: STA_DP / INC_DP / etc. write the slot at $Cx. + auto [idx, mode] = classifyDpImmAsImg(MI); + if (idx >= 0 && + (mode == DpAccess::Write || mode == DpAccess::ReadWrite)) { + WrittenSlot[idx] = true; + AnyWritten = true; } } } - if (!AnyUsed) return false; + if (!AnyWritten) return false; + // Rename for downstream Step 2/3/4 readability — they use UsedSlot. + bool (&UsedSlot)[8] = WrittenSlot; + (void)AnyWritten; // Step 2: allocate one frame slot per used IMG. Size = 2 bytes (each // Img16 holds a 16-bit value). Mark as a spill slot so PEI accounts diff --git a/src/llvm/lib/Target/W65816/W65816InstrInfo.td b/src/llvm/lib/Target/W65816/W65816InstrInfo.td index 4416a1b..b702161 100644 --- a/src/llvm/lib/Target/W65816/W65816InstrInfo.td +++ b/src/llvm/lib/Target/W65816/W65816InstrInfo.td @@ -942,6 +942,17 @@ def : Pat<(i16 (zextloadi8 (W65816Wrapper tglobaladdr:$g))), def : Pat<(i16 (zextloadi8 (W65816Wrapper texternalsym:$s))), (ANDi16imm (LDAabs texternalsym:$s), 0xFF)>; +// i1-result loads from globals: GlobalOpt narrows `static short` to +// i1 when it sees every assignment is 0 or 1. zextloadi1 and +// extloadi1 land on us as i16-result loads with `s8`/i1 memory type; +// emit them as a normal byte load + mask (zext) or bare load (ext). 
+def : Pat<(i16 (zextloadi1 (W65816Wrapper tglobaladdr:$g))), + (ANDi16imm (LDAabs tglobaladdr:$g), 0xFF)>; +def : Pat<(i16 (extloadi1 (W65816Wrapper tglobaladdr:$g))), + (LDAabs tglobaladdr:$g)>; +def : Pat<(i16 (sextloadi1 (W65816Wrapper tglobaladdr:$g))), + (ANDi16imm (LDAabs tglobaladdr:$g), 1)>; + // CMP / branches. CMP sets the flags via the W65816cmp SDNode (glue // out); the W65816brcc node consumes the glue and dispatches to the // right Bxx instruction by condition code. diff --git a/src/llvm/lib/Target/W65816/W65816LowerWide32.cpp b/src/llvm/lib/Target/W65816/W65816LowerWide32.cpp index afcb125..ef1752e 100644 --- a/src/llvm/lib/Target/W65816/W65816LowerWide32.cpp +++ b/src/llvm/lib/Target/W65816/W65816LowerWide32.cpp @@ -117,17 +117,33 @@ bool W65816LowerWide32::runOnMachineFunction(MachineFunction &MF) { MachineInstr *DefMI = MRI.getUniqueVRegDef(W); if (DefMI && DefMI->getOpcode() == TargetOpcode::REG_SEQUENCE) { Register Lo, Hi; + bool Bail = false; for (unsigned op = 1; op + 1 < DefMI->getNumOperands(); op += 2) { if (!DefMI->getOperand(op).isReg() || !DefMI->getOperand(op + 1).isImm()) continue; unsigned idx = DefMI->getOperand(op + 1).getImm(); Register Src = DefMI->getOperand(op).getReg(); + unsigned SrcSub = DefMI->getOperand(op).getSubReg(); + // If the source has a sub-register specifier (e.g. + // `%W.sub_lo:wide32` is a slice of a wide32 vreg), the + // effective "half" is the corresponding half of that source. + // Resolve via wideMap when the parent is already mapped; + // otherwise defer until a later iteration picks it up. + if (SrcSub != 0) { + if (!Src.isVirtual() || !wideMap.count(Src)) { + Bail = true; + break; + } + auto [SrcLo, SrcHi] = wideMap[Src]; + Src = (SrcSub == llvm::sub_lo) ? 
SrcLo : SrcHi; + } if (idx == llvm::sub_lo) Lo = Src; else if (idx == llvm::sub_hi) Hi = Src; } + if (Bail) continue; if (Lo && Hi) { wideMap[W] = {Lo, Hi}; toErase.push_back(DefMI); @@ -156,25 +172,38 @@ bool W65816LowerWide32::runOnMachineFunction(MachineFunction &MF) { MachineInstr *LoDefMI = nullptr; MachineInstr *HiDefMI = nullptr; bool ok = true; + bool Bail = false; for (MachineInstr &MI : MRI.def_instructions(W)) { if (!MI.isCopy()) { ok = false; break; } const MachineOperand &Dst = MI.getOperand(0); const MachineOperand &Src = MI.getOperand(1); if (!Dst.isReg() || Dst.getReg() != W) { ok = false; break; } unsigned SubIdx = Dst.getSubReg(); + Register S = Src.getReg(); + unsigned SrcSub = Src.getSubReg(); + // If the source has a sub-register specifier, resolve through + // wideMap[parent]. Symmetric with the REG_SEQUENCE handler + // above — without this, `%W.sub_lo = COPY %V.sub_lo:wide32` + // records the wide32 parent %V instead of %V's i16 sub_lo. + if (SrcSub != 0) { + if (!S.isVirtual() || !wideMap.count(S)) { Bail = true; break; } + auto [SL, SH] = wideMap[S]; + S = (SrcSub == llvm::sub_lo) ? 
SL : SH; + } if (SubIdx == llvm::sub_lo) { if (LoDefMI) { ok = false; break; } LoDefMI = &MI; - LoSrc = Src.getReg(); + LoSrc = S; } else if (SubIdx == llvm::sub_hi) { if (HiDefMI) { ok = false; break; } HiDefMI = &MI; - HiSrc = Src.getReg(); + HiSrc = S; } else { ok = false; break; } } + if (Bail) continue; if (ok && LoSrc && HiSrc) { wideMap[W] = {LoSrc, HiSrc}; if (LoDefMI) toErase.push_back(LoDefMI); diff --git a/src/llvm/lib/Target/W65816/W65816PromoteFiToImg.cpp b/src/llvm/lib/Target/W65816/W65816PromoteFiToImg.cpp index 8810bc2..3e3f52c 100644 --- a/src/llvm/lib/Target/W65816/W65816PromoteFiToImg.cpp +++ b/src/llvm/lib/Target/W65816/W65816PromoteFiToImg.cpp @@ -281,7 +281,11 @@ bool W65816PromoteFiToImg::runOnMachineFunction(MachineFunction &MF) { Name == "__modsi3" || Name == "__ashlhi3" || Name == "__lshrhi3" || Name == "__ashrhi3" || Name == "__ashlsi3" || Name == "__lshrsi3" || - Name == "__ashrsi3") + Name == "__ashrsi3" || + // 64-bit helpers: use $E0..$EE only, no IMG0..7 touch. + Name == "__ashldi3" || Name == "__lshrdi3" || + Name == "__ashrdi3" || Name == "__cmpdi2" || + Name == "__ucmpdi2") return true; return false; } diff --git a/src/llvm/lib/Target/W65816/W65816StackRelToImg.cpp b/src/llvm/lib/Target/W65816/W65816StackRelToImg.cpp index e3a1633..5b44b9f 100644 --- a/src/llvm/lib/Target/W65816/W65816StackRelToImg.cpp +++ b/src/llvm/lib/Target/W65816/W65816StackRelToImg.cpp @@ -54,6 +54,7 @@ #include "llvm/CodeGen/MachineInstrBuilder.h" #include "llvm/Support/Debug.h" #include "llvm/Support/Format.h" +#include <functional> using namespace llvm; @@ -131,6 +132,501 @@ static bool isImgSafeCall(const MachineInstr &MI) { } +// Phase 12 peephole — A-dead PHA/PLA bracket elision. Two shapes: +// +// (a) PEI single-store IMG-source-STAfi bracket. 
When the next op +// after PLA redefines A, the bracket is dead weight: +// +// PHA ; (LDA_DP $cx | TXA | TYA) ; STA_StackRel (off+2) ; PLA +// [next redefines A] +// → +// (LDA_DP $cx | TXA | TYA) ; STA_StackRel off +// +// (b) ImgCalleeSave multi-store bracket at function entry. When the +// post-PLA pattern is "STX_DP ... ; STA_StackRel destOff ; [redefines +// A]", the post-PLA STA is storing entry-A to its final slot — we +// reorder by hoisting that STA to BEFORE the bracket, then dropping +// PHA/PLA and reverting inner offsets: +// +// PHA ; (LDA_DP $cx ; STA_StackRel off+2)×N ; PLA +// STX_DP $cM ; STA_StackRel destOff +// [next redefines A] +// → +// STA_StackRel destOff ; hoisted, entry-A → slot first +// (LDA_DP $cx ; STA_StackRel off)×N +// STX_DP $cM ; STX stays after saves +// [next op] +// +// Restricted to the entry MBB starting at MBB.begin() to ensure the +// match is an ImgCalleeSave-emitted prologue bracket (and not a mid- +// function bracket where the post-PLA STA is consuming a *different* +// A value than what was preserved). +static bool elidePhaBracket(MachineFunction &MF, + const W65816InstrInfo *TII) { + bool Changed = false; + auto opNoTouchA = [](unsigned Op) { + switch (Op) { + case W65816::STX_DP: case W65816::STX_Abs: + case W65816::STY_DP: case W65816::STY_Abs: + return true; + default: + return false; + } + }; + auto opRedefinesA = [](unsigned Op) { + switch (Op) { + case W65816::LDA_DP: case W65816::LDA_StackRel: + case W65816::LDA_Abs: case W65816::LDA_Imm16: + case W65816::LDAabs: case W65816::LDAi16imm: + case W65816::TXA: case W65816::TYA: + case W65816::PLA: + return true; + default: + return false; + } + }; + + // --- Case (a): single-store brackets anywhere in any MBB. 
--- + for (MachineBasicBlock &MBB : MF) { + SmallVector<MachineInstr *> ToErase; + for (auto It = MBB.begin(); It != MBB.end(); ++It) { + if (It->getOpcode() != W65816::PHA) continue; + auto Lda = std::next(It); + if (Lda == MBB.end()) continue; + unsigned LdaOp = Lda->getOpcode(); + bool LdaIsLoadDp = (LdaOp == W65816::LDA_DP); + bool LdaIsXfer = (LdaOp == W65816::TXA || LdaOp == W65816::TYA); + if (!LdaIsLoadDp && !LdaIsXfer) continue; + auto Sta = std::next(Lda); + if (Sta == MBB.end()) continue; + if (Sta->getOpcode() != W65816::STA_StackRel) continue; + auto Pla = std::next(Sta); + if (Pla == MBB.end()) continue; + if (Pla->getOpcode() != W65816::PLA) continue; + auto AfterPla = std::next(Pla); + if (AfterPla == MBB.end()) continue; + unsigned AfterPlaOp = AfterPla->getOpcode(); + bool AfterDeadA = opRedefinesA(AfterPlaOp); + // Forward-walk liveness: if AfterPla is a branch and ALL its + // successors' first ops redefine A (recursing through + // unconditional-branch trampolines), A is dead. + if (!AfterDeadA && AfterPla->isBranch()) { + bool AllDead = true; + std::function<bool(MachineBasicBlock *, int)> firstRedef = + [&](MachineBasicBlock *B, int Depth) -> bool { + if (Depth > 3 || !B || B->empty()) return false; + MachineInstr &MI = B->front(); + unsigned MOp = MI.getOpcode(); + if (opRedefinesA(MOp)) return true; + if (MOp == W65816::BRA || MOp == W65816::BRL || + MOp == W65816::JMP_Abs) { + for (auto &MO : MI.operands()) { + if (MO.isMBB()) { + return firstRedef(MO.getMBB(), Depth + 1); + } + } + } + return false; + }; + for (MachineBasicBlock *Succ : MBB.successors()) { + if (!firstRedef(Succ, 0)) { AllDead = false; break; } + } + if (AllDead && !MBB.succ_empty()) AfterDeadA = true; + } + if (!AfterDeadA) continue; + MachineOperand &OffMO = Sta->getOperand(0); + if (!OffMO.isImm()) continue; + int64_t Off = OffMO.getImm(); + if (Off < 2) continue; + OffMO.setImm(Off - 2); + ToErase.push_back(&*It); + ToErase.push_back(&*Pla); + } + for (MachineInstr *MI : 
ToErase) { + MI->eraseFromParent(); + 
Changed = true; + } + } + + // --- Case (c): multi-pair STA_DP-only bracket anywhere. --- + // IMG-to-IMG copies bracketed for A-preservation. No StackRel + // offsets to adjust (DP is absolute, immune to PHA shifts), so just + // drop PHA/PLA when A is dead at PLA's exit. + std::function<bool(MachineBasicBlock *, int)> firstRedef = + [&](MachineBasicBlock *B, int Depth) -> bool { + if (Depth > 3 || !B || B->empty()) return false; + MachineInstr &MI = B->front(); + unsigned MOp = MI.getOpcode(); + if (opRedefinesA(MOp)) return true; + if (MOp == W65816::BRA || MOp == W65816::BRL || + MOp == W65816::JMP_Abs) { + for (auto &MO : MI.operands()) { + if (MO.isMBB()) return firstRedef(MO.getMBB(), Depth + 1); + } + } + return false; + }; + for (MachineBasicBlock &MBB : MF) { + SmallVector<MachineInstr *> ToErase; + for (auto It = MBB.begin(); It != MBB.end(); ++It) { + if (It->getOpcode() != W65816::PHA) continue; + // Walk inner LDA_DP + STA_DP pairs. + auto Inner = std::next(It); + int InnerPairs = 0; + bool BailInner = false; + while (Inner != MBB.end()) { + if (Inner->getOpcode() == W65816::PLA) break; + if (Inner->getOpcode() != W65816::LDA_DP) { BailInner = true; break; } + auto St = std::next(Inner); + if (St == MBB.end() || St->getOpcode() != W65816::STA_DP) { + BailInner = true; break; + } + ++InnerPairs; + Inner = std::next(St); + } + if (BailInner || Inner == MBB.end() || InnerPairs < 1) continue; + // Inner == PLA. Check liveness after PLA. + auto Post = std::next(Inner); + if (Post == MBB.end()) continue; + unsigned PostOp = Post->getOpcode(); + bool ADead = opRedefinesA(PostOp); + if (!ADead && Post->isBranch()) { + bool AllDead = true; + for (MachineBasicBlock *Succ : MBB.successors()) { + if (!firstRedef(Succ, 0)) { AllDead = false; break; } + } + if (AllDead && !MBB.succ_empty()) ADead = true; + } + if (!ADead) continue; + // Eligible: drop PHA + PLA (no offset adjustment for DP). 
+ ToErase.push_back(&*It); + ToErase.push_back(&*Inner); + } + for (MachineInstr *MI : ToErase) { + MI->eraseFromParent(); + Changed = true; + } + } + + // --- Case (b): ImgCalleeSave prologue bracket in entry MBB. --- + // PHA must be the FIRST instruction (or first after PEI prologue ops + // like REP/TAY/TSC/SEC/SBC/TCS/TYA) in the entry MBB. This ensures + // we're looking at the prologue's IMG save block. + MachineBasicBlock &EntryMBB = MF.front(); + auto BB = EntryMBB.begin(); + // Skip PEI prologue ops to reach the first ImgCalleeSave PHA. + while (BB != EntryMBB.end()) { + unsigned Op = BB->getOpcode(); + if (Op == W65816::PHA) break; + // PEI prologue ops we expect to see before ImgCalleeSave's PHA. + if (Op == W65816::REP || Op == W65816::TAY || + Op == W65816::TSC || Op == W65816::SEC || + Op == W65816::SBC_Imm16 || Op == W65816::TCS || + Op == W65816::TYA) { + ++BB; + continue; + } + BB = EntryMBB.end(); // not a recognized prologue shape — bail + break; + } + if (BB != EntryMBB.end() && BB->getOpcode() == W65816::PHA) { + SmallVector<MachineInstr *> InnerStas; + auto Inner = std::next(BB); + bool BailInner = false; + while (Inner != EntryMBB.end()) { + unsigned IOp = Inner->getOpcode(); + if (IOp == W65816::PLA) break; + // Inner must be alternating LDA_DP + STA_StackRel pairs. + if (IOp != W65816::LDA_DP) { BailInner = true; break; } + auto St = std::next(Inner); + if (St == EntryMBB.end() || St->getOpcode() != W65816::STA_StackRel) { + BailInner = true; break; + } + MachineOperand &OffMO = St->getOperand(0); + if (!OffMO.isImm() || OffMO.getImm() < 2) { + BailInner = true; break; + } + InnerStas.push_back(&*St); + Inner = std::next(St); + } + if (!BailInner && Inner != EntryMBB.end() && !InnerStas.empty()) { + // Inner == PLA. Walk forward through STX_DP / STY_DP (A- + // transparent) ops looking for STA_StackRel that consumes + // entry-A, then verify next op redefines A. 
+ auto Post = std::next(Inner); + while (Post != EntryMBB.end() && opNoTouchA(Post->getOpcode())) { + ++Post; + } + if (Post != EntryMBB.end() && + Post->getOpcode() == W65816::STA_StackRel) { + auto AfterSta = std::next(Post); + if (AfterSta != EntryMBB.end() && + opRedefinesA(AfterSta->getOpcode())) { + // Eligible. Move STA destOff to right BEFORE PHA, drop + // PHA + PLA, shift inner STA offsets by -2. + MachineInstr *StaToMove = &*Post; + MachineInstr *PhaMI = &*BB; + MachineInstr *PlaMI = &*Inner; + // splice: move StaToMove to position just before PhaMI. + EntryMBB.splice(PhaMI->getIterator(), &EntryMBB, + StaToMove->getIterator()); + for (MachineInstr *Sta : InnerStas) { + Sta->getOperand(0).setImm(Sta->getOperand(0).getImm() - 2); + } + PhaMI->eraseFromParent(); + PlaMI->eraseFromParent(); + Changed = true; + } + } + } + } + return Changed; +} + + +// Always-on: elide the STA $E0 / LDA $E0 round-trip in +// ADJCALLSTACKUP's Y-live i64-return path when the next instruction +// after the LDA is `STA_StackRel off,s` storing A to a slot. The +// emitted PEI sequence (see W65816FrameLowering ADJCALLSTACKUP): +// +// STA_DP $E0 ; save A across TSC +// TSC ; A = S +// CLC ; ADC_Imm16 #N ; TCS ; pop N bytes +// LDA_DP $E0 ; restore A +// STA_StackRel off, s ; store A to slot +// +// If the destination's pre-adjust offset (off + N) fits in a 1-byte +// stack-rel encoding, we can move the STA up to BEFORE the SP-adjust +// (using the pre-adjust offset) and drop both the save and reload. +// +// Saves 6 bytes + 8 cyc per match. evalAt has 4 of these. 
+static bool elideCallResultSaveSPReload(MachineFunction &MF, + const W65816InstrInfo *TII) { + bool Changed = false; + for (MachineBasicBlock &MBB : MF) { + SmallVector<MachineInstr *> ToErase; + for (auto It = MBB.begin(); It != MBB.end(); ++It) { + if (It->getOpcode() != W65816::STA_DP) continue; + MachineOperand &SaveImm = It->getOperand(0); + if (!SaveImm.isImm() || SaveImm.getImm() != 0xE0) continue; + auto I1 = std::next(It); + if (I1 == MBB.end() || I1->getOpcode() != W65816::TSC) continue; + auto I2 = std::next(I1); + if (I2 == MBB.end() || I2->getOpcode() != W65816::CLC) continue; + auto I3 = std::next(I2); + if (I3 == MBB.end() || I3->getOpcode() != W65816::ADC_Imm16) continue; + MachineOperand &AdcImm = I3->getOperand(0); + if (!AdcImm.isImm()) continue; + int64_t N = AdcImm.getImm(); + auto I4 = std::next(I3); + if (I4 == MBB.end() || I4->getOpcode() != W65816::TCS) continue; + auto I5 = std::next(I4); + if (I5 == MBB.end() || I5->getOpcode() != W65816::LDA_DP) continue; + MachineOperand &LoadImm = I5->getOperand(0); + if (!LoadImm.isImm() || LoadImm.getImm() != 0xE0) continue; + auto I6 = std::next(I5); + if (I6 == MBB.end() || I6->getOpcode() != W65816::STA_StackRel) continue; + MachineOperand &StaImm = I6->getOperand(0); + if (!StaImm.isImm()) continue; + int64_t Off = StaImm.getImm(); + int64_t NewOff = Off + N; + if (NewOff < 0 || NewOff > 255) continue; + // Insert a new STA_StackRel at NewOff before the STA_DP $E0. + BuildMI(MBB, It, It->getDebugLoc(), TII->get(W65816::STA_StackRel)) + .addImm(NewOff); + ToErase.push_back(&*It); // STA_DP $E0 + ToErase.push_back(&*I5); // LDA_DP $E0 + ToErase.push_back(&*I6); // original STA_StackRel + } + for (MachineInstr *MI : ToErase) { + MI->eraseFromParent(); + Changed = true; + } + } + return Changed; +} + + +// Returns true if the opcode is "transparent" to a STA→LDA forward — +// does not write A, does not change S, does not write to any stack +// memory. 
Used to widen the elideStoreForwarding peephole's window. 
+static bool isStaLdaTransparent(unsigned Opc) {
+  switch (Opc) {
+  // X/Y register ops (don't touch A or S)
+  case W65816::LDX_Imm16: case W65816::LDX_DP: case W65816::LDX_Abs:
+  case W65816::LDXi16imm:
+  case W65816::LDY_Imm16: case W65816::LDY_DP: case W65816::LDY_Abs:
+  case W65816::TAX: case W65816::TAY:
+  case W65816::INX: case W65816::INY:
+  case W65816::DEX: case W65816::DEY:
+  case W65816::STX_DP: case W65816::STX_Abs:
+  case W65816::STY_DP: case W65816::STY_Abs:
+  // Flag ops
+  case W65816::CLC: case W65816::SEC:
+  case W65816::CLD: case W65816::SED:
+  case W65816::CLI: case W65816::SEI:
+  case W65816::CLV:
+  case W65816::NOP:
+    return true;
+  default:
+    return false;
+  }
+}
+
+
+// Always-on: drop a redundant LDA following STA to the same slot when
+// any intermediate ops are "transparent" (don't write A or change S
+// or stack memory). STA doesn't modify A, so A still holds the value.
+//
+//   STA off, s
+//   LDX #imm     ; transparent
+//   LDA off, s   ; redundant — A unchanged since STA
+//
+// Saves 1 instruction (3 bytes / 4 cyc) per match.
+static bool elideStoreForwarding(MachineFunction &MF) {
+  bool Changed = false;
+  for (MachineBasicBlock &MBB : MF) {
+    SmallVector<MachineInstr *> ToErase;
+    for (auto It = MBB.begin(); It != MBB.end(); ++It) {
+      if (It->getOpcode() != W65816::STA_StackRel) continue;
+      MachineOperand &S = It->getOperand(0);
+      if (!S.isImm()) continue;
+      int64_t StaOff = S.getImm();
+      // Walk forward up to 3 ops looking for matching LDA.
+      MachineBasicBlock::iterator Walk = std::next(It);
+      int Steps = 0;
+      while (Walk != MBB.end() && Steps < 3) {
+        unsigned WOp = Walk->getOpcode();
+        if (WOp == W65816::LDA_StackRel) {
+          MachineOperand &L = Walk->getOperand(0);
+          if (L.isImm() && L.getImm() == StaOff) {
+            ToErase.push_back(&*Walk);
+          }
+          break;
+        }
+        if (!isStaLdaTransparent(WOp)) break;
+        ++Walk;
+        ++Steps;
+      }
+    }
+    for (MachineInstr *MI : ToErase) {
+      MI->eraseFromParent();
+      Changed = true;
+    }
+  }
+  return Changed;
+}
+
+
+// Always-on: drop a consecutive PLA/PHA pair. PLA reloads A and PHA
+// pushes it straight back, so stack memory is unchanged; A is left
+// stale, which is safe below because LDA redefines it. Emerges when
+// adjacent IMG copies are each bracketed with PHA/PLA to preserve A:
+//
+//   PHA ; LDA dp ; STA dp ; PLA ; PHA ; LDA dp ; STA dp ; PLA
+//                           ^^^^^^^^^^
+//                           collapsed away
+//
+// Saves 2 instructions (2 bytes / 7 cyc) per match.
+static bool elidePlaPhaPair(MachineFunction &MF) {
+  bool Changed = false;
+  for (MachineBasicBlock &MBB : MF) {
+    SmallVector<MachineInstr *> ToErase;
+    for (auto It = MBB.begin(); It != MBB.end(); ++It) {
+      if (It->getOpcode() != W65816::PLA) continue;
+      auto I1 = std::next(It);
+      if (I1 == MBB.end() || I1->getOpcode() != W65816::PHA) continue;
+      ToErase.push_back(&*It);
+      ToErase.push_back(&*I1);
+      ++It; // advance past PHA (already-to-erase)
+    }
+    for (MachineInstr *MI : ToErase) {
+      MI->eraseFromParent();
+      Changed = true;
+    }
+  }
+  return Changed;
+}
+
+
+// Always-on: drop a redundant LDA when the prior LDA loaded the same
+// source and the only intervening instruction was PHA (which reads A
+// but doesn't modify it). Emerges from i64 arg-push sequences:
+//
+//   LDA off, s
+//   PHA
+//   LDA off, s   ; A still has this value — redundant
+//   PHA
+//
+// Saves 1 instruction (3 bytes / 4 cyc) per match.
+static bool elideRedundantLdaAfterPha(MachineFunction &MF) {
+  bool Changed = false;
+  for (MachineBasicBlock &MBB : MF) {
+    SmallVector<MachineInstr *> ToErase;
+    for (auto It = MBB.begin(); It != MBB.end(); ++It) {
+      unsigned Op = It->getOpcode();
+      bool IsLdaSr = (Op == W65816::LDA_StackRel);
+      bool IsLdaDp = (Op == W65816::LDA_DP);
+      if (!IsLdaSr && !IsLdaDp) continue;
+      auto I1 = std::next(It);
+      if (I1 == MBB.end() || I1->getOpcode() != W65816::PHA) continue;
+      auto I2 = std::next(I1);
+      if (I2 == MBB.end() || I2->getOpcode() != Op) continue;
+      MachineOperand &S1 = It->getOperand(0);
+      MachineOperand &S2 = I2->getOperand(0);
+      if (!S1.isImm() || !S2.isImm()) continue;
+      if (S1.getImm() != S2.getImm()) continue;
+      ToErase.push_back(&*I2);
+    }
+    for (MachineInstr *MI : ToErase) {
+      MI->eraseFromParent();
+      Changed = true;
+    }
+  }
+  return Changed;
+}
+
+
+// Always-on: drop a dead STA in the i32-carry-propagation pattern:
+//
+//   STA_StackRel off, s
+//   ADC_Imm16 #N          ; doesn't touch slot
+//   STA_StackRel off, s   ; overwrites first STA
+//
+// The first STA's value is shadowed by the second. Drop it.
+// Saves 1 instruction (3 bytes / 5 cyc) per match.
+static bool elideDeadStaCarry(MachineFunction &MF) {
+  bool Changed = false;
+  for (MachineBasicBlock &MBB : MF) {
+    SmallVector<MachineInstr *> ToErase;
+    for (auto It = MBB.begin(); It != MBB.end(); ++It) {
+      if (It->getOpcode() != W65816::STA_StackRel) continue;
+      auto I1 = std::next(It);
+      if (I1 == MBB.end()) continue;
+      unsigned MidOp = I1->getOpcode();
+      bool IsAddImm = (MidOp == W65816::ADC_Imm16 ||
+                       MidOp == W65816::ADCi16imm ||
+                       MidOp == W65816::ADCEi16imm ||
+                       MidOp == W65816::SBCi16imm ||
+                       MidOp == W65816::SBCEi16imm);
+      if (!IsAddImm) continue;
+      auto I2 = std::next(I1);
+      if (I2 == MBB.end() || I2->getOpcode() != W65816::STA_StackRel) continue;
+      MachineOperand &Off1 = It->getOperand(0);
+      MachineOperand &Off2 = I2->getOperand(0);
+      if (!Off1.isImm() || !Off2.isImm()) continue;
+      if (Off1.getImm() != Off2.getImm()) continue;
+      ToErase.push_back(&*It);
+    }
+    for (MachineInstr *MI : ToErase) {
+      MI->eraseFromParent();
+      Changed = true;
+    }
+  }
+  return Changed;
+}
+
+
 bool W65816StackRelToImg::runOnMachineFunction(MachineFunction &MF) {
   if (skipFunction(MF.getFunction())) return false;
   if (MF.getFunction().hasOptNone()) return false;
@@ -139,26 +635,48 @@ bool W65816StackRelToImg::runOnMachineFunction(MachineFunction &MF) {
   // be from FP not SP and the PHP-wrap +1 adjustment differs.
   if (MF.getFrameInfo().hasVarSizedObjects()) return false;
 
+  // Always-on peepholes that run even when the main IMG promotion bails.
+  const W65816Subtarget &STIp = MF.getSubtarget<W65816Subtarget>();
+  // Run PLA;PHA collapse FIRST so adjacent brackets merge into a
+  // single multi-pair bracket — lets elidePhaBracket case (c) match
+  // the merged shape.
+  bool ChangedEarly = elidePlaPhaPair(MF);
+  ChangedEarly |= elidePhaBracket(MF, STIp.getInstrInfo());
+  ChangedEarly |= elideCallResultSaveSPReload(MF, STIp.getInstrInfo());
+  ChangedEarly |= elideDeadStaCarry(MF);
+  ChangedEarly |= elideRedundantLdaAfterPha(MF);
+  // elideStoreForwarding only when main IMG promotion would bail —
+  // running it early in non-bailing functions cascades into IMG-slot
+  // reallocation that regresses strcpy 1.63×. Gated below.
+
   // 2. Bail if the function has any non-IMG-safe call (would clobber
   // our IMG0..7 promotions) or is recursive (same). Tried allowing
-  // IMG8..15 + ImgCalleeSave fallback for these cases (gained 12
-  // inst on evalAt), but broke sprintf and fib due to subtle
-  // interactions with ImgCalleeSave's slot allocation. Reverted.
+  // IMG8..15 + own-pass save/restore for these cases (today, after
+  // landing W65816LowerWide32 + ImgCalleeSave-writes-only fixes), and
+  // saw: evalAt 498→500 (NET LOSS due to save/restore overhead) AND
+  // qsort #70 regression. The IMG8..15 path is not currently a win
+  // for our benchmarks; reverted.
   StringRef SelfName = MF.getName();
   for (MachineBasicBlock &MBB : MF) {
     for (MachineInstr &MI : MBB) {
       if (!MI.isCall()) continue;
-      if (!isImgSafeCall(MI)) return false;
+      if (!isImgSafeCall(MI)) {
+        ChangedEarly |= elideStoreForwarding(MF);
+        return ChangedEarly;
+      }
       for (const MachineOperand &MO : MI.operands()) {
         StringRef Name;
         if (MO.isGlobal()) Name = MO.getGlobal()->getName();
         else if (MO.isSymbol()) Name = MO.getSymbolName();
         else continue;
-        if (Name == SelfName) return false;
+        if (Name == SelfName) {
+          ChangedEarly |= elideStoreForwarding(MF);
+          return ChangedEarly;
+        }
       }
     }
   }
-  uint8_t imgBase = 0xD0;
+  uint8_t imgBase = 0xD0u;
 
   // 3. Count stack-rel accesses per offset.
   // CRITICAL: the stack
   // pointer shifts during the function due to PHP/PLP (+1 byte) and
@@ -614,23 +1132,65 @@ bool W65816StackRelToImg::runOnMachineFunction(MachineFunction &MF) {
         auto Tya = std::next(Tcs);
         while (Tya != EntryMBB.end() && Tya->isDebugInstr()) ++Tya;
         if (Tya != EntryMBB.end() && Tya->getOpcode() == W65816::TYA) {
+          // Walk past A-transparent ops (STX_DP, STY_DP) — these
+          // don't touch A, so TAY/TYA can still be removed.
           auto Sta = std::next(Tya);
-          while (Sta != EntryMBB.end() && Sta->isDebugInstr()) ++Sta;
+          while (Sta != EntryMBB.end() &&
+                 (Sta->isDebugInstr() ||
+                  Sta->getOpcode() == W65816::STX_DP ||
+                  Sta->getOpcode() == W65816::STY_DP)) {
+            ++Sta;
+          }
           if (Sta != EntryMBB.end() &&
-              Sta->getOpcode() == W65816::STA_DP &&
               Sta->getNumOperands() >= 1 &&
               Sta->getOperand(0).isImm()) {
-            int64_t StaAddr = Sta->getOperand(0).getImm();
-            // Build new STA_DP between REP and TSC.
-            DebugLoc DL = Sta->getDebugLoc();
-            BuildMI(EntryMBB, Tsc, DL, TII->get(W65816::STA_DP))
-                .addImm(StaAddr)
-                .addReg(W65816::A, RegState::Implicit);
-            // Erase: TAY, TYA, old STA_DP.
-            Tay->eraseFromParent();
-            Tya->eraseFromParent();
-            Sta->eraseFromParent();
-            Changed = true;
+            unsigned StaOp = Sta->getOpcode();
+            bool IsStaDp = (StaOp == W65816::STA_DP);
+            bool IsStaSr = (StaOp == W65816::STA_StackRel);
+            if (IsStaDp || IsStaSr) {
+              // For STA_StackRel: pre-TCS offset = post-TCS_off - N
+              // where N = SBC immediate. Only valid if off >= N.
+              int64_t StaAddr = Sta->getOperand(0).getImm();
+              int64_t SbcImm = Sbc->getOperand(0).isImm()
+                                   ? Sbc->getOperand(0).getImm() : -1;
+              // Drop ADCi16imm pseudo-tied operands: imm is at op 0 for
+              // SBC_Imm16 but op 2 for SBCi16imm — handle uniformly.
+              if (!Sbc->getOperand(0).isImm() &&
+                  Sbc->getNumOperands() >= 3 &&
+                  Sbc->getOperand(2).isImm()) {
+                SbcImm = Sbc->getOperand(2).getImm();
+              }
+              int64_t NewAddr = IsStaDp ?
+                                StaAddr : (StaAddr - SbcImm);
+              bool OffOk = IsStaDp || (NewAddr >= 1 && SbcImm > 0);
+              // Safety: the op after the spill-STA must REDEFINE A
+              // (not read it). Otherwise A would be lost (TCS
+              // clobbered it).
+              auto Next = std::next(Sta);
+              while (Next != EntryMBB.end() && Next->isDebugInstr())
+                ++Next;
+              bool NextRedef = false;
+              if (Next != EntryMBB.end()) {
+                unsigned NOp = Next->getOpcode();
+                NextRedef =
+                    NOp == W65816::LDA_DP || NOp == W65816::LDA_StackRel ||
+                    NOp == W65816::LDA_Abs || NOp == W65816::LDA_Imm16 ||
+                    NOp == W65816::LDAabs || NOp == W65816::LDAi16imm ||
+                    NOp == W65816::TXA || NOp == W65816::TYA ||
+                    NOp == W65816::PLA;
+              }
+              if (OffOk && NextRedef) {
+                // Build new STA (DP or stack-rel) between REP and TSC.
+                DebugLoc DL = Sta->getDebugLoc();
+                BuildMI(EntryMBB, Tsc, DL, TII->get(StaOp))
+                    .addImm(NewAddr)
+                    .addReg(W65816::A, RegState::Implicit);
+                // Erase: TAY, TYA, old STA.
+                Tay->eraseFromParent();
+                Tya->eraseFromParent();
+                Sta->eraseFromParent();
+                Changed = true;
+              }
+            }
           }
         }
       }
@@ -1459,5 +2019,17 @@ bool W65816StackRelToImg::runOnMachineFunction(MachineFunction &MF) {
     }
   }
 
+  // Run elideStoreForwarding at the very end, AFTER IMG promotion has
+  // committed slot assignments. Running this peephole earlier (with
+  // the other early peepholes) cascades into different IMG-promotion
+  // choices and was observed to regress strcpy 1.63×. At this point
+  // promotion is done, so dropping a redundant LDA can no longer
+  // disturb slot allocation.
+  // End-of-pass: also try elideStoreForwarding for non-bailing
+  // functions. After main IMG promotion finalizes slot assignments,
+  // dropping a redundant LDA can no longer disturb them.
+  Changed |= elideStoreForwarding(MF);
+
+  Changed |= ChangedEarly;
   return Changed;
 }
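The peepholes in this diff share one shape: scan a block, record matches, then erase after the scan so iteration stays valid. The sketch below shows that shape for the store-forwarding case on a toy instruction list. It is an illustration only: `Op`, `Inst` and `isTransparent` are hypothetical stand-ins, not the LLVM `MachineInstr` API, and the model ignores flags and the real opcode set.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy stand-ins for machine instructions (hypothetical, not LLVM types).
enum class Op { StaSr, LdaSr, Ldx, Clc, Pha };

struct Inst {
  Op Opcode;
  int Imm; // stack-relative offset or immediate; ignored for Clc/Pha
};

// Ops that leave A, S and stack memory untouched in this toy model.
static bool isTransparent(Op O) { return O == Op::Ldx || O == Op::Clc; }

// Drops an LdaSr that reloads the slot a preceding StaSr just wrote,
// looking at no more than MaxSteps following instructions and stopping
// at the first non-transparent one. Returns true if anything was removed.
static bool elideStoreForwarding(std::vector<Inst> &Block,
                                 std::size_t MaxSteps = 3) {
  std::vector<bool> Dead(Block.size(), false);
  for (std::size_t I = 0; I < Block.size(); ++I) {
    if (Block[I].Opcode != Op::StaSr) continue;
    std::size_t Steps = 0;
    for (std::size_t J = I + 1; J < Block.size() && Steps < MaxSteps;
         ++J, ++Steps) {
      if (Block[J].Opcode == Op::LdaSr) {
        // Same slot: A still holds the stored value, so the load is dead.
        if (Block[J].Imm == Block[I].Imm) Dead[J] = true;
        break;
      }
      if (!isTransparent(Block[J].Opcode)) break;
    }
  }
  // Erase only after the scan, so matching never sees a half-edited block.
  bool Changed = false;
  std::vector<Inst> Kept;
  for (std::size_t I = 0; I < Block.size(); ++I) {
    if (Dead[I]) Changed = true;
    else Kept.push_back(Block[I]);
  }
  Block = std::move(Kept);
  return Changed;
}
```

Given `STA 5,s ; LDX #0 ; LDA 5,s`, the call removes the final load and returns true. Placing a `Pha` between the store and the load blocks the match, because a push moves S and is not transparent.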