Scott Duensing 2026-05-20 20:14:20 -05:00
parent 524a37fcf0
commit d95c30e819
83 changed files with 3091 additions and 2447 deletions


@ -71,18 +71,20 @@ docs/ this directory — INSTALL.md, USAGE.md, design notes
## Status ## Status
Stable enough to build real programs. Current quality vs commercial Stable enough to build real programs. Static instruction-count
Calypsi 5.16 (lower is better): ratio against commercial Calypsi 5.16 (lower is better):
| Benchmark | Our cyc/call | Calypsi cyc/call (approx) | | Benchmark | Ours (inst) | Calypsi (inst) | Ratio |
|---|---|---| |---|---:|---:|---:|
| sumOfSquares(50) | 16709 | ~16000 | | sumSquares | 26 | 31 | **0.84×** ✓ |
| popcount(0x12345678) | 2864 | ~2500 | | evalAt | 472 | 254 | 1.86× |
| memcmp(eq, 5) | 989 | ~700 | | mul16to32 | 1 | 4 | **0.25×** ✓ |
| bsearch(arr, 8, 5) | 767 | ~600 |
Static-size for the canonical `sumSquares` benchmark: 37 inst (ours) Per-iteration cycle measurements (via MAME's HBL counter, 2026-05-20):
vs 31 inst (Calypsi) — **1.19×**. bsearch 127, dotProduct 144, fib 97, memcmp 113, popcount 93,
strcpy 91, sumOfSquares 126 (cyc/iter at 100 iters);
dadd 1157, ddiv 1261, dmul 1033 (cyc/iter at 10 iters — FP calls
are ~1000+ cyc each).
See [STATUS.md](STATUS.md) for full language and runtime feature See [STATUS.md](STATUS.md) for full language and runtime feature
coverage, and [LLVM_65816_DESIGN.md](LLVM_65816_DESIGN.md) for coverage, and [LLVM_65816_DESIGN.md](LLVM_65816_DESIGN.md) for


@ -1,4 +1,4 @@
# Session Recovery — last updated 2026-05-08 # Session Recovery — last updated 2026-05-20
Living recovery doc. Update on every meaningful change. If session is lost, Living recovery doc. Update on every meaningful change. If session is lost,
read this top-to-bottom + the memory notes referenced inside, then reread read this top-to-bottom + the memory notes referenced inside, then reread
@ -6,11 +6,27 @@ the actual diffs in tree to ground assumptions.
## Headline state ## Headline state
- **Smoke**: 132/132 green (omfEmit `--stack-size` check is the new one). - **Smoke**: 148/148 green. Demos 9/9 (helloBeep/helloText/helloWindow/
- **Active config**: ptr32 (`p:32:16`), full IMG0..IMG15 caller-clobber on JSL, basic regalloc at -O1+. orcaFrame/qdProbe/heavyRelocs/frame/reversi/minicad).
- **Working tree**: 5 modified files (see below); all real fixes pending checkpoint. - **Active config**: ptr32 (`p:32:16`), full IMG0..IMG15 caller-clobber
- **Branch**: `main`, ahead of `origin/main` by recent checkpoint commits. on JSL, greedy regalloc at -O1+.
- **Bench wins this session**: popcount **8320 → 6888 cyc/call (17%)** from i32 shift inline. DP/Stack `~Direct` segment Loader-validated end-to-end. - **Branch**: `main`.
- **vs Calypsi static-inst ratio (2026-05-20)**:
sumSquares **0.84×** (26 vs 31 — we beat),
mul16to32 **0.25×** (1 vs 4 — we beat),
evalAt 1.86× (472 vs 254 — structural floor; ABI overhaul rejected).
- **Cycle benches (2026-05-20)**:
popcount 93, strcpy 91, bsearch 127, memcmp 113, fib 97,
dotProduct 144, sumOfSquares 126 cyc/iter (100 iters);
dadd 1157, ddiv 1261, dmul 1033 cyc/iter (10 iters).
- **Recent session wins (2026-05-20)**:
- 8 always-on peepholes + extended phase 4 in W65816StackRelToImg
(evalAt 498→472, fib -35%, 35 libc fns shrunk)
- __muldi3 32-bit short-circuit (dmul 1605→1033, -36%)
- case-(b) ImgCalleeSave bracket hoist enables phase 4 to elide
TAY/TYA round-trip in synergy
- FP cycle benches added (dadd/dmul/ddiv) with per-bench iter count
- Documented LSR-dp cycle mystery as HBL-counter wrap artifact
## Uncommitted, must keep ## Uncommitted, must keep
@ -337,15 +353,22 @@ in 30 minutes. Recommended.
## Next session candidates (ranked) ## Next session candidates (ranked)
1. **Commit the uncommitted fixes.** They've earned it. evalAt at 1.86× vs Calypsi is the structural floor for peephole work
2. **u16*u16→u32 multiply path.** sumOfSquares is 982 cyc/iter, (see `feedback_evalat_structural_gap.md`). Further gains need:
bottlenecked by `__mulsi3` for what's really a 16x16 multiply.
If we add a `__umulhi3` libcall (i16,i16 → i32) and route 1. **i64-by-pointer ABI** (rejected this session — diminishing returns).
`MUL(zext(a), zext(b))` to it, sumOfSquares could ~halve. Pass doubles by ptr instead of value: saves ~120 cyc per evalAt call.
3. **`while (x != 0)` for i32 should fold to `lda lo; ora hi; bne`.** Requires runtime rewrite, OMF compat checks, every double caller
Currently materializes a boolean via SETCC and branches on it. updated. Risk:reward too high for the size of the gain.
Combiner hook: `(brcond (setcc i32 x, 0, ne))` 2. **__divdf3 / __adddf3 algorithmic improvements**. ddiv 1261 cyc
`(br_cc ne, lo|hi, 0)`. Big win in any i32-iteration loop. could drop via Newton-Raphson reciprocal multiplication (a*1/b
4. **Greedy regalloc retry.** Cheap experiment, potentially big win. instead of bit-by-bit long division). Major rewrite, but our
5. **gmtime_r IR investigation.** Find which combine miscompiles __muldi3 short-circuit makes the multiplications cheap now.
`days >= 365L + (leap?1:0)`. IR-level, not backend. 3. **Higher-resolution cycle timer**. HBL counter is 8-bit and wraps
at ~256 ticks; combining scan-line position + frame counter would
give per-bench resolution better than ±65 cyc. Would unblock
benchmarking sub-loop changes (e.g., the LSR-dp shift form).
4. **More peepholes from the audit**. Phase 4 STA_StackRel extension
landed but doesn't fire in current libc (frame sizes too large).
If callers shrink frames via better SSM, more functions become
eligible.
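
A minimal sketch of candidate 3 (higher-resolution cycle timer), assuming one HBL tick equals one scan line and assuming NTSC timing of 262 lines per frame at 65 cycles per line. Those constants, the `TimeStamp` type, and all function names are hypothetical and not taken from the tree. It shows why an 8-bit counter alone loses whole multiples of 256 lines, and how pairing it with a frame counter recovers them:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed timing, not taken from the repo: 262 scan lines per frame and
   65 CPU cycles per scan line. Adjust both for the real clock. */
enum { LINES_PER_FRAME = 262, CYCLES_PER_LINE = 65 };

/* Hypothetical sample: frame counter plus scan-line position. */
typedef struct {
    uint16_t frame;
    uint16_t line;
} TimeStamp;

/* What an 8-bit counter alone reports: the true delta modulo 256. */
uint8_t hblDelta8(uint8_t start, uint8_t end) {
    return (uint8_t)(end - start);
}

/* Scan lines elapsed between two samples. Assumes `end` was taken after
   `start`; the 16-bit frame subtraction tolerates one frame-counter wrap. */
uint32_t elapsedLines(TimeStamp start, TimeStamp end) {
    assert(start.line < LINES_PER_FRAME && end.line < LINES_PER_FRAME);
    uint16_t frames = (uint16_t)(end.frame - start.frame);
    int32_t lines = (int32_t)frames * LINES_PER_FRAME
                  + (int32_t)end.line - (int32_t)start.line;
    return (uint32_t)lines;
}

/* Elapsed cycles, still quantized to one scan line (CYCLES_PER_LINE). */
uint32_t elapsedCycles(TimeStamp start, TimeStamp end) {
    return elapsedLines(start, end) * CYCLES_PER_LINE;
}
```

The result is still quantized to one scan line, so this removes the wrap ambiguity rather than improving the per-tick granularity; a finer timer would also need the horizontal position within the line.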


@ -217,7 +217,7 @@ which runs correctly under MAME (apple2gs).
image addresses. image addresses.
- `runtime/build.sh` builds crt0, libc, soft-float, soft-double, - `runtime/build.sh` builds crt0, libc, soft-float, soft-double,
libgcc into linkable objects. libgcc into linkable objects.
- `scripts/smokeTest.sh` runs 132 end-to-end checks at -O2: - `scripts/smokeTest.sh` runs 148 end-to-end checks at -O2:
scalar ops, control flow, calling conventions, MAME execution scalar ops, control flow, calling conventions, MAME execution
regressions, link816 bss-base safety + weak-symbol resolution + regressions, link816 bss-base safety + weak-symbol resolution +
heap_end-vs-heap_start sanity, iigs/toolbox.h compile + link, heap_end-vs-heap_start sanity, iigs/toolbox.h compile + link,
@ -244,23 +244,25 @@ which runs correctly under MAME (apple2gs).
+ dispatch + chained collisions over fprintf-to-mfs), + dispatch + chained collisions over fprintf-to-mfs),
scripts/bench.sh size-vs-Calypsi harness. 100% pass. scripts/bench.sh size-vs-Calypsi harness. 100% pass.
- `scripts/benchCyclesPrecise.sh` measures per-call cycle counts - `scripts/benchCycles.sh` measures per-iteration cycle counts via
via MAME's emulated time counter. Eight benchmarks under MAME's emulated HBL counter. Eleven benchmarks under
`benchmarks/`. Current numbers (after W65816StackSlotMerge): `benchmarks/` (eight int + three FP). Current numbers
popcount 2864, bsearch 767, memcmp 989, strcpy 2216, (2026-05-20):
dotProduct 2131, fib(10) 12617, sumOfSquares 16709. Speed is bsearch 127, crc32 <65, dotProduct 144, fib 97, memcmp 113,
the optimization priority, not size. popcount 93, strcpy 91, sumOfSquares 126 cyc/iter (100 iters);
dadd 1157, ddiv 1261, dmul 1033 cyc/iter (10 iters; FP benches
use fewer iters since each call is ~1000+ cyc). Speed is the
optimization priority, not size.
- `compare/` holds three side-by-side C tests with our asm and - `compare/` holds three side-by-side C tests with our asm and
Calypsi's listing for static-size comparison: Calypsi's listing for static-size comparison:
`sumSquares`/`evalAt`/`mul16to32`. `bash compare/regen.sh` `sumSquares`/`evalAt`/`mul16to32`. `bash compare/regen.sh`
recompiles each under both `clang --target=w65816 -O2 -S` and recompiles each under both `clang --target=w65816 -O2 -S` and
`cc65816 --speed -O 2 --64bit-doubles` and prints an `cc65816 --speed -O 2 --64bit-doubles` and prints an
ours/Calypsi instruction-count ratio. Current ratios (post ours/Calypsi instruction-count ratio. Current ratios (2026-05-20):
StackRelToImg 9-phase pipeline including saturating-max preheader sumSquares **0.84×** (26 inst — we beat Calypsi's 31),
elimination): sumSquares **0.87×** (27 inst — we beat Calypsi's evalAt 1.86× (472 inst), mul16to32 **0.25×** (1 inst — we beat
31), evalAt 2.10× (534 inst), mul16to32 **1.50×** (6 inst). Calypsi's 4). See `compare/README.md`.
See `compare/README.md`.
**Backend register allocation:** **Backend register allocation:**
@ -435,6 +437,36 @@ for the common-case C / minimal-C++ workload. Priority is speed
the hi-half carry chain when one operand has known-zero high the hi-half carry chain when one operand has known-zero high
16 bits. 16 bits.
- **W65816StackRelToImg peephole pipeline** (2026-05-20). Eight
always-on peepholes plus an extended phase 4 in the pre-emit
StackRelToImg pass: (1) `elidePhaBracket` with case-a single-store
bracket + case-b ImgCalleeSave multi-store with STA-hoist +
case-c STA_DP-only multi-pair + forward-walk liveness through
conditional branches; (2) `elideCallResultSaveSPReload` drops
STA/LDA $E0 round-trip in ADJCALLSTACKUP's Y-live i64-return
path; (3) `elideDeadStaCarry` drops first STA in i32-carry
STA/ADCE/STA pattern; (4) `elideRedundantLdaAfterPha`; (4b)
`elidePlaPhaPair` collapses consecutive PLA;PHA; (5)
`elideStoreForwarding` (gated to bail path + end-of-pass to
avoid IMG-slot reallocation cascade). Phase 4 extended to walk
past STX_DP/STY_DP between TYA and STA_DP with safety check
(post-STA op must redefine A) and to handle STA_StackRel
destination with offset compensation. Result: evalAt 498→472
inst (1.96×→1.86× vs Calypsi), fib -35% cyc/iter (149→97),
popcount -11% (104→93), 35 libc functions get TAY/TYA bracket
elided. Case (b) hoists the body's first STA before the
ImgCalleeSave bracket, enabling the existing phase 4 to remove
PEI's TAY/TYA round-trip in a synergistic chain.
- **__muldi3 32-bit short-circuit** (2026-05-20). When `a`'s high
32 bits ($E4/$E6) are zero, use a 32-iter shift-and-add loop
instead of 64 iters. Fires on every `mulhi64Aligned` call from
softDouble.c (4× per `__muldf3`), which always passes zero-
extended u32 operands. Result: **dmul 1605→1033 cyc/iter
(-36%)**. Single-side check (just `a`) is correct since `b`'s
high half being non-zero doesn't affect correctness — iters 32-63
would just shift b without adding.
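
The `__muldi3` short-circuit described above can be sketched in portable C. This is an illustration under stated assumptions, not the runtime's actual routine (which is 65816 assembly working on direct-page slots); the names `mul64ShiftAdd` and `mul64SelfCheck` are hypothetical. Here `a` is the operand whose bits are tested and `b` is the operand that is shifted and accumulated:

```c
#include <assert.h>
#include <stdint.h>

/* Shift-and-add 64x64 -> 64 multiply with a 32-bit short-circuit.
   If the high 32 bits of `a` are zero, bits 32..63 of `a` can never
   trigger an add, so the loop may stop after 32 iterations. */
uint64_t mul64ShiftAdd(uint64_t a, uint64_t b) {
    int iters = ((a >> 32) == 0) ? 32 : 64;
    uint64_t result = 0;
    for (int i = 0; i < iters; i++) {
        if (a & 1) {
            result += b;   /* add the current shifted multiplicand */
        }
        a >>= 1;           /* expose the next multiplier bit */
        b <<= 1;           /* weight the multiplicand by 2^(i+1) */
    }
    return result;
}

/* Small usage check: the short path and the full path must agree
   with the native 64-bit multiply. */
void mul64SelfCheck(void) {
    assert(mul64ShiftAdd(3, 5) == 15);
    assert(mul64ShiftAdd(0xFFFFFFFFu, 0xFFFFFFFFu) == 0xFFFFFFFE00000001ULL);
    /* b has a non-zero high half; only a is checked, result is still right. */
    assert(mul64ShiftAdd(2, 0x100000001ULL) == 0x200000002ULL);
    /* a has a non-zero high half; falls back to the 64-iteration loop. */
    assert(mul64ShiftAdd(0x100000000ULL, 2) == 0x200000000ULL);
}
```

Checking only `a` is enough because the loop adds solely when a bit of `a` is set; a non-zero high half in `b` changes what is added, not how many iterations can contribute.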
**Open limitations:** **Open limitations:**
- **Multi-bank BSS** — full support up to 4 banks (256KB). link816 - **Multi-bank BSS** — full support up to 4 banks (256KB). link816
@ -445,7 +477,7 @@ for the common-case C / minimal-C++ workload. Priority is speed
0xFF00 so the 16-bit `cpx #__bss_segN_size` loop comparison 0xFF00 so the 16-bit `cpx #__bss_segN_size` loop comparison
doesn't wrap to 0 on a full-bank segment (a single full bank is doesn't wrap to 0 on a full-bank segment (a single full bank is
split into a 0xFF00-byte primary + 0x100-byte tail in the same split into a 0xFF00-byte primary + 0x100-byte tail in the same
bank). Smoke 137/137 validates BSS spanning bank 3 + bank 4 bank). Smoke validates BSS spanning bank 3 + bank 4
(100KB) is zeroed end-to-end. Note: program access to non-DBR (100KB) is zeroed end-to-end. Note: program access to non-DBR
bank globals still requires DBR management — the compiler emits bank globals still requires DBR management — the compiler emits
DBR-relative absolute for global accesses, so accessing BSS in DBR-relative absolute for global accesses, so accessing BSS in
@ -495,5 +527,5 @@ for the common-case C / minimal-C++ workload. Priority is speed
actually use those slots (most don't). Fixed picol `expr 1+2 == 4` actually use those slots (most don't). Fixed picol `expr 1+2 == 4`
(now `3`) and a class of recursive double-fn miscompiles with (now `3`) and a class of recursive double-fn miscompiles with
compound `||` conditions — see `feedback_picol_expr_compound_or.md`. compound `||` conditions — see `feedback_picol_expr_compound_or.md`.
Smoke 149/149 green including a new orBug regression test guarding Smoke green including a new orBug regression test guarding
the fix. the fix.

4
benchmarks/dadd.c Normal file

@ -0,0 +1,4 @@
// Soft-double addition. Lowers to __adddf3.
double dadd(double a, double b) {
return a + b;
}

4
benchmarks/ddiv.c Normal file

@ -0,0 +1,4 @@
// Soft-double division. Lowers to __divdf3.
double ddiv(double a, double b) {
return a / b;
}

4
benchmarks/dmul.c Normal file

@ -0,0 +1,4 @@
// Soft-double multiplication. Lowers to __muldf3.
double dmul(double a, double b) {
return a * b;
}


@ -22,14 +22,14 @@ Recompiles every `*.c` in this directory under both compilers and prints an
instruction-count summary: instruction-count summary:
``` ```
test ours calypsi ratio test ours calypsi ratio
---- ---- ------- ----- ---- ---- ------- -----
evalAt 419 268 1.56x evalAt 472 254 1.86x
mul16to32 12 11 1.09x mul16to32 1 4 0.25x
sumSquares 72 31 2.32x sumSquares 26 31 0.84x
``` ```
(Numbers above are illustrative — re-run to see current state.) (Numbers above are current as of 2026-05-20 — re-run for latest.)
## Adding a new comparison ## Adding a new comparison
@ -41,4 +41,4 @@ The summary counts asm-line opcodes (lda/sta/jsl/...) on our side and listing
lines that begin with a hex byte (Calypsi's emit-byte column) on theirs. lines that begin with a hex byte (Calypsi's emit-byte column) on theirs.
Both metrics are static instruction counts, NOT bytes. They underestimate Both metrics are static instruction counts, NOT bytes. They underestimate
calls-to-runtime (each libcall counts as one `jsl`, not the body it expands to). calls-to-runtime (each libcall counts as one `jsl`, not the body it expands to).
For cycle counts, use `scripts/benchCyclesPrecise.sh`. For cycle counts, use `scripts/benchCycles.sh`.
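
The counting rule for our side (one count per asm line that carries an opcode) can be sketched as a line classifier. This is a hypothetical C rendering for illustration only; the real summary comes from the shell harness, and its exact filter may differ. The assumption here is that labels end in `:`, directives and local labels start with `.`, and comments start with `;`:

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true when an assembly line starts with an opcode mnemonic
   (lda/sta/jsl/...), false for blanks, comments, directives and labels. */
bool isOpcodeLine(const char *line) {
    assert(line != NULL);
    while (*line == ' ' || *line == '\t') {
        line++;                                   /* skip indentation */
    }
    if (*line == '\0' || *line == ';' || *line == '.') {
        return false;                             /* blank, comment, directive */
    }
    if (!isalpha((unsigned char)*line)) {
        return false;
    }
    const char *end = line;
    while (*end != '\0' && !isspace((unsigned char)*end) && *end != ';') {
        end++;                                    /* find end of first token */
    }
    if (end[-1] == ':') {
        return false;                             /* label such as "evalAt:" */
    }
    return true;
}

/* Static instruction count over an array of lines. A libcall counts as a
   single `jsl`, which is why this metric underestimates runtime calls. */
int countOpcodes(const char *const *lines, int n) {
    int count = 0;
    for (int i = 0; i < n; i++) {
        if (isOpcodeLine(lines[i])) {
            count++;
        }
    }
    return count;
}
```

Because the count is per line rather than per byte or per cycle, two listings with the same ratio can still differ in size and speed.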


@ -1,7 +1,7 @@
############################################################################### ###############################################################################
# # # #
# Calypsi ISO C compiler for 65816 version 5.16 # # Calypsi ISO C compiler for 65816 version 5.16 #
# 15/May/2026 00:38:15 # # 20/May/2026 17:33:54 #
# Command line: --speed -O 2 --64bit-doubles evalAt.c -o # # Command line: --speed -O 2 --64bit-doubles evalAt.c -o #
# /tmp/evalAt.calypsi.elf --list-file evalAt.calypsi.lst # # /tmp/evalAt.calypsi.elf --list-file evalAt.calypsi.lst #
# # # #


@ -8,7 +8,7 @@ evalAt: ; @evalAt
tay tay
tsc tsc
sec sec
sbc #0x2e sbc #0x32
tcs tcs
tya tya
pha pha
@ -24,12 +24,11 @@ evalAt: ; @evalAt
sta 0x3, s sta 0x3, s
pla pla
stx 0xc0 stx 0xc0
sta 0x1b, s sta 0x19, s
clc clc
adc #0x2 adc #0x2
sta 0x1f, s sta 0x1f, s
lda 0xc0 lda 0xc0
sta 0x21, s
adc #0x0 adc #0x0
sta 0x21, s sta 0x21, s
lda 0x1f, s lda 0x1f, s
@ -38,43 +37,36 @@ evalAt: ; @evalAt
sta 0xe2 sta 0xe2
ldy #0x0 ldy #0x0
lda [0xe0], y lda [0xe0], y
sta 0x1f, s sta 0x1d, s
pha
lda 0xc0 lda 0xc0
sta 0x2f, s sta 0x31, s
pla lda 0x19, s
lda 0x1b, s
sta 0xe0 sta 0xe0
lda 0x2d, s lda 0x31, s
sta 0xe2 sta 0xe2
lda [0xe0], y lda [0xe0], y
sta 0x21, s sta 0x21, s
lda 0x32, s lda 0x36, s
sta 0xb, s sta 0xb, s
lda #0x0 lda #0x0
sta 0xc4 sta 0xc4
sta 0xc6 sta 0xc6
lda 0x21, s lda 0x21, s
sta 0xe0 sta 0xe0
lda 0x1f, s lda 0x1d, s
sta 0xe2 sta 0xe2
lda [0xe0], y lda [0xe0], y
and #0xff sta 0x1b, s
sta 0x1d, s
sep #0x20 sep #0x20
clc clc
adc #0xd0 adc #0xd0
rep #0x20 rep #0x20
and #0xff and #0xff
cmp #0xa cmp #0xa
pha
lda 0xc4 lda 0xc4
sta 0xc8 sta 0xc8
pla
pha
lda 0xc6 lda 0xc6
sta 0xca sta 0xca
pla
bcc .LBB0_1 bcc .LBB0_1
; %bb.15: ; %entry ; %bb.15: ; %entry
brl .LBB0_4 brl .LBB0_4
@ -83,46 +75,43 @@ evalAt: ; @evalAt
inc a inc a
sta 0x21, s sta 0x21, s
bne .Ltmp0 bne .Ltmp0
lda 0x1f, s lda 0x1d, s
inc a inc a
sta 0x1f, s sta 0x1d, s
.Ltmp0: .Ltmp0:
lda #0x0 lda #0x0
sta 0x15, s sta 0x15, s
sta 0x13, s sta 0x13, s
sta 0x11, s sta 0x11, s
sta 0xf, s sta 0xf, s
lda 0x1f, s lda 0x1d, s
sta 0x17, s sta 0x17, s
.LBB0_2: ; %while.body .LBB0_2: ; %while.body
; =>This Inner Loop Header: Depth=1 ; =>This Inner Loop Header: Depth=1
sta 0x1f, s sta 0x1d, s
lda 0x1b, s lda 0x19, s
tax tax
pha
lda 0xc0 lda 0xc0
sta 0x2d, s sta 0x2f, s
pla
txa txa
sta 0xe0 sta 0xe0
lda 0x2b, s lda 0x2f, s
sta 0xe2 sta 0xe2
lda 0x21, s lda 0x21, s
ldy #0x0 ldy #0x0
sta [0xe0], y sta [0xe0], y
lda 0x1b, s lda 0x19, s
clc clc
adc #0x2 adc #0x2
sta 0xd, s sta 0xd, s
lda 0xc0 lda 0xc0
sta 0x19, s
adc #0x0 adc #0x0
sta 0x19, s sta 0x1f, s
lda 0xd, s lda 0xd, s
sta 0xe0 sta 0xe0
lda 0x19, s
sta 0xe2
lda 0x1f, s lda 0x1f, s
sta 0xe2
lda 0x1d, s
sta [0xe0], y sta [0xe0], y
pea 0x4024 pea 0x4024
lda #0x0 lda #0x0
@ -137,30 +126,27 @@ evalAt: ; @evalAt
tax tax
lda 0x21, s lda 0x21, s
jsl __muldf3 jsl __muldf3
sta 0xe0 sta 0x2b, s
tsc tsc
clc clc
adc #0xc adc #0xc
tcs tcs
lda 0xe0
sta 0x19, s
txa txa
sta 0x15, s sta 0x15, s
tya tya
sta 0x13, s sta 0x13, s
lda 0xf0 lda 0xf0
sta 0x11, s sta 0x11, s
lda 0x1d, s lda 0x1b, s
sep #0x20 sep #0x20
clc clc
adc #0xd0 adc #0xd0
rep #0x20 rep #0x20
and #0xff and #0xff
sta 0x1d, s sta 0x1b, s
ldx #0x0 ldx #0x0
lda 0x1d, s
jsl __floatunsidf jsl __floatunsidf
sta 0x1d, s sta 0x1b, s
txa txa
sta 0xf, s sta 0xf, s
tya tya
@ -171,7 +157,7 @@ evalAt: ; @evalAt
lda 0x13, s lda 0x13, s
tax tax
phx phx
lda 0x23, s lda 0x21, s
pha pha
lda 0x19, s lda 0x19, s
pha pha
@ -179,15 +165,13 @@ evalAt: ; @evalAt
pha pha
lda 0x21, s lda 0x21, s
tax tax
lda 0x25, s lda 0x2b, s
jsl __adddf3 jsl __adddf3
sta 0xe0 sta 0x21, s
tsc tsc
clc clc
adc #0xc adc #0xc
tcs tcs
lda 0xe0
sta 0x15, s
txa txa
sta 0x13, s sta 0x13, s
tya tya
@ -203,7 +187,7 @@ evalAt: ; @evalAt
sta 0x21, s sta 0x21, s
txa txa
lda 0xd0 lda 0xd0
sta 0x1d, s sta 0x1f, s
lda 0x17, s lda 0x17, s
adc #0x0 adc #0x0
sta 0x17, s sta 0x17, s
@ -215,14 +199,13 @@ evalAt: ; @evalAt
sta 0xc4 sta 0xc4
lda 0x13, s lda 0x13, s
sta 0xc6 sta 0xc6
lda 0x1d, s
sta 0xe0
lda 0x1f, s lda 0x1f, s
sta 0xe0
lda 0x1d, s
sta 0xe2 sta 0xe2
ldy #0x0 ldy #0x0
lda [0xe0], y lda [0xe0], y
and #0xff sta 0x1b, s
sta 0x1d, s
sep #0x20 sep #0x20
clc clc
adc #0xd0 adc #0xd0
@ -241,17 +224,17 @@ evalAt: ; @evalAt
sta 0x21, s sta 0x21, s
lda 0x17, s lda 0x17, s
adc #0xffff adc #0xffff
sta 0x1f, s sta 0x1d, s
.LBB0_4: ; %while.cond7.preheader .LBB0_4: ; %while.cond7.preheader
lda 0xb, s lda 0xb, s
eor #0x8000 eor #0x8000
sta 0xb, s sta 0xb, s
lda 0x1d, s lda 0x1b, s
brl .LBB0_5 brl .LBB0_5
.LBB0_11: ; %if.then33 .LBB0_11: ; %if.then33
; in Loop: Header=BB0_5 Depth=1 ; in Loop: Header=BB0_5 Depth=1
lda 0xc6 lda 0xc6
sta 0x1d, s sta 0x1b, s
lda 0xc4 lda 0xc4
sta 0x15, s sta 0x15, s
lda 0xca lda 0xca
@ -260,7 +243,7 @@ evalAt: ; @evalAt
sta 0x11, s sta 0x11, s
lda 0x17, s lda 0x17, s
pha pha
lda 0x1b, s lda 0x1f, s
pha pha
lda 0x23, s lda 0x23, s
pha pha
@ -270,28 +253,26 @@ evalAt: ; @evalAt
pha pha
lda 0x1b, s lda 0x1b, s
pha pha
lda 0x29, s lda 0x27, s
tax tax
lda 0x21, s lda 0x21, s
jsl __muldf3 jsl __muldf3
.LBB0_12: ; %cleanup .LBB0_12: ; %cleanup
; in Loop: Header=BB0_5 Depth=1 ; in Loop: Header=BB0_5 Depth=1
sta 0xe0 sta 0x2d, s
tsc tsc
clc clc
adc #0xc adc #0xc
tcs tcs
lda 0xe0
sta 0x21, s
txa txa
sta 0x1f, s sta 0x1f, s
tya tya
sta 0x1d, s sta 0x1d, s
lda 0xf0 lda 0xf0
sta 0x19, s sta 0x1b, s
lda 0x1d, s lda 0x1d, s
sta 0xc8 sta 0xc8
lda 0x19, s lda 0x1b, s
sta 0xca sta 0xca
lda 0x21, s lda 0x21, s
sta 0xc4 sta 0xc4
@ -299,12 +280,11 @@ evalAt: ; @evalAt
sta 0xc6 sta 0xc6
.LBB0_13: ; %cleanup .LBB0_13: ; %cleanup
; in Loop: Header=BB0_5 Depth=1 ; in Loop: Header=BB0_5 Depth=1
lda 0x1b, s lda 0x19, s
clc clc
adc #0x2 adc #0x2
sta 0x1f, s sta 0x1f, s
lda 0xc0 lda 0xc0
sta 0x21, s
adc #0x0 adc #0x0
sta 0x21, s sta 0x21, s
lda 0x1f, s lda 0x1f, s
@ -313,13 +293,11 @@ evalAt: ; @evalAt
sta 0xe2 sta 0xe2
ldy #0x0 ldy #0x0
lda [0xe0], y lda [0xe0], y
sta 0x1f, s sta 0x1d, s
lda 0x1b, s lda 0x19, s
tax tax
pha
lda 0xc0 lda 0xc0
sta 0x25, s sta 0x23, s
pla
txa txa
sta 0xe0 sta 0xe0
lda 0x23, s lda 0x23, s
@ -327,26 +305,24 @@ evalAt: ; @evalAt
lda [0xe0], y lda [0xe0], y
sta 0x21, s sta 0x21, s
sta 0xe0 sta 0xe0
lda 0x1f, s lda 0x1d, s
sta 0xe2 sta 0xe2
lda [0xe0], y lda [0xe0], y
and #0xff
.LBB0_5: ; %while.cond7 .LBB0_5: ; %while.cond7
; =>This Inner Loop Header: Depth=1 ; =>This Inner Loop Header: Depth=1
sta 0x1d, s sta 0x1b, s
sep #0x20 sep #0x20
clc clc
adc #0xd6 adc #0xd6
rep #0x20 rep #0x20
and #0xff and #0xff
sta 0x19, s sta 0x1f, s
lda 0x19, s
pha pha
lda #0x2b lda #0x2b
jsl __lshrhi3 jsl __lshrhi3
ply ply
sta 0x17, s sta 0x17, s
lda 0x19, s lda 0x1f, s
cmp #0x6 cmp #0x6
bcc .LBB0_6 bcc .LBB0_6
; %bb.17: ; %while.cond7 ; %bb.17: ; %while.cond7
@ -357,23 +333,53 @@ evalAt: ; @evalAt
and #0x1 and #0x1
sta 0x17, s sta 0x17, s
lda #0x0 lda #0x0
sta 0x29, s sta 0x2d, s
lda 0x17, s lda 0x17, s
ora 0x29, s ora 0x2d, s
bne .LBB0_7 bne .LBB0_7
; %bb.18: ; %while.cond7 ; %bb.18: ; %while.cond7
brl .LBB0_14 brl .LBB0_14
.LBB0_7: ; %switch.lookup .LBB0_7: ; %switch.lookup
; in Loop: Header=BB0_5 Depth=1 ; in Loop: Header=BB0_5 Depth=1
lda 0x19, s lda #0x0
asl a asl a
tax sta 0x17, s
lda .Lswitch.table.evalAt, x lda 0x1f, s
sta 0x19, s asl a
eor #0x8000 lda #0x0
rol a
sta 0x2b, s
lda 0x17, s
ora 0x2b, s
sta 0x17, s
lda 0x1f, s
asl a
sta 0x1f, s
lda #.Lswitch.table.evalAt
sta 0x29, s
lda 0x1f, s
clc
adc 0x29, s
sta 0x1f, s
lda 0xbe
sta 0x27, s sta 0x27, s
lda 0x17, s
adc 0x27, s
sta 0x17, s
lda 0x1f, s
sta 0xe0
lda 0x17, s
sta 0xe2
ldy #0x0
lda [0xe0], y
sta 0x1f, s
tax
eor #0x8000
sta 0x1f, s
txa
sta 0x17, s
lda 0xb, s lda 0xb, s
cmp 0x27, s cmp 0x1f, s
bcc .LBB0_8 bcc .LBB0_8
; %bb.19: ; %switch.lookup ; %bb.19: ; %switch.lookup
brl .LBB0_14 brl .LBB0_14
@ -383,16 +389,14 @@ evalAt: ; @evalAt
inc a inc a
sta 0x21, s sta 0x21, s
bne .Ltmp1 bne .Ltmp1
lda 0x1f, s lda 0x1d, s
inc a inc a
sta 0x1f, s sta 0x1d, s
.Ltmp1: .Ltmp1:
lda 0x1b, s lda 0x19, s
tax tax
pha
lda 0xc0 lda 0xc0
sta 0x27, s sta 0x25, s
pla
txa txa
sta 0xe0 sta 0xe0
lda 0x25, s lda 0x25, s
@ -400,41 +404,39 @@ evalAt: ; @evalAt
lda 0x21, s lda 0x21, s
ldy #0x0 ldy #0x0
sta [0xe0], y sta [0xe0], y
lda 0x1b, s lda 0x19, s
sta 0xd0 sta 0xd0
clc clc
adc #0x2 adc #0x2
sta 0x17, s sta 0x1f, s
lda 0xd0 lda 0xd0
sta 0x21, s sta 0x21, s
lda 0xc0 lda 0xc0
adc #0x0 adc #0x0
sta 0x15, s sta 0x15, s
lda 0x17, s lda 0x1f, s
sta 0xe0 sta 0xe0
lda 0x15, s lda 0x15, s
sta 0xe2 sta 0xe2
lda 0x1f, s lda 0x1d, s
sta [0xe0], y sta [0xe0], y
lda 0x19, s lda 0x17, s
pha pha
ldx 0xc0 ldx 0xc0
lda 0x23, s lda 0x23, s
jsl evalAt jsl evalAt
sta 0xe0 sta 0x23, s
tsc tsc
clc clc
adc #0x2 adc #0x2
tcs tcs
lda 0xe0
sta 0x21, s
txa txa
sta 0x1f, s sta 0x1f, s
tya tya
sta 0x19, s sta 0x1d, s
lda 0xf0 lda 0xf0
sta 0x17, s sta 0x17, s
lda 0x1d, s lda 0x1b, s
and #0xff and #0xff
cmp #0x2a cmp #0x2a
bne .LBB0_9 bne .LBB0_9
@ -451,7 +453,7 @@ evalAt: ; @evalAt
.LBB0_10: ; %if.then29 .LBB0_10: ; %if.then29
; in Loop: Header=BB0_5 Depth=1 ; in Loop: Header=BB0_5 Depth=1
lda 0xc6 lda 0xc6
sta 0x1d, s sta 0x1b, s
lda 0xc4 lda 0xc4
sta 0x15, s sta 0x15, s
lda 0xca lda 0xca
@ -460,7 +462,7 @@ evalAt: ; @evalAt
sta 0x11, s sta 0x11, s
lda 0x17, s lda 0x17, s
pha pha
lda 0x1b, s lda 0x1f, s
pha pha
lda 0x23, s lda 0x23, s
pha pha
@ -470,7 +472,7 @@ evalAt: ; @evalAt
pha pha
lda 0x1b, s lda 0x1b, s
pha pha
lda 0x29, s lda 0x27, s
tax tax
lda 0x21, s lda 0x21, s
jsl __adddf3 jsl __adddf3
@ -506,7 +508,7 @@ evalAt: ; @evalAt
sta 0xe0 sta 0xe0
tsc tsc
clc clc
adc #0x2e adc #0x32
tcs tcs
lda 0xe0 lda 0xe0
rtl rtl


@ -1,7 +1,7 @@
############################################################################### ###############################################################################
# # # #
# Calypsi ISO C compiler for 65816 version 5.16 # # Calypsi ISO C compiler for 65816 version 5.16 #
# 15/May/2026 00:38:15 # # 20/May/2026 17:33:54 #
# Command line: --speed -O 2 --64bit-doubles mul16to32.c -o # # Command line: --speed -O 2 --64bit-doubles mul16to32.c -o #
# /tmp/mul16to32.calypsi.elf --list-file # # /tmp/mul16to32.calypsi.elf --list-file #
# mul16to32.calypsi.lst # # mul16to32.calypsi.lst #


@ -1,7 +1,7 @@
############################################################################### ###############################################################################
# # # #
# Calypsi ISO C compiler for 65816 version 5.16 # # Calypsi ISO C compiler for 65816 version 5.16 #
# 15/May/2026 00:38:15 # # 20/May/2026 17:33:54 #
# Command line: --speed -O 2 --64bit-doubles sumSquares.c -o # # Command line: --speed -O 2 --64bit-doubles sumSquares.c -o #
# /tmp/sumSquares.calypsi.elf --list-file # # /tmp/sumSquares.calypsi.elf --list-file #
# sumSquares.calypsi.lst # # sumSquares.calypsi.lst #


@ -67,28 +67,27 @@ event loop until the close box / Q key / 1000-iteration watchdog
fires. Both 6.0.2 (`sys602.po`) and 6.0.4 (`6.0.4 - System.Disk.po`) fires. Both 6.0.2 (`sys602.po`) and 6.0.4 (`6.0.4 - System.Disk.po`)
launch it cleanly; fTitle works on both. launch it cleanly; fTitle works on both.
### `orcaFrameLike.c` ### `frame.c`
Port of ORCA-C's `Frame.cc` sample (`tools/orca-c/C.Samples/ Full port of ORCA-C's `Frame.cc` sample. Builds the
Desktop.Samples/Frame.cc`). Builds a standard Apple+File+Edit Apple+File+Edit menu bar via the real ROM Menu Manager
menu bar (`NewMenu` + `InsertMenu` + `FixAppleMenu` + `DrawMenuBar`) (`NewMenu` / `InsertMenu` / `FixAppleMenu` / `FixMenuBar` /
and dispatches `wInMenuBar` / `wInSpecial` events from `TaskMaster`. `DrawMenuBar`) and renders the original "About Frame" dialog
File→Quit exits. Skips the original's Dialog Manager About box. (white-filled framed rect with the 1989 Byte Works copyright
text and an OK button).
### `orcaMiniCadLike.c` ### `minicad.c`
Port of ORCA-C's `MiniCAD.cc` (`Desktop.Samples/MiniCAD.cc`). Slim Full port of ORCA-C's `MiniCAD.cc` sample. Apple+File+Edit+
port — opens a Window Manager content window but omits the line- Options menu bar + a windowed canvas with three seeded line-art
drawing primitives because adding them pushes past the Loader's patterns (curve-stitching, sunburst, Star of David).
cRELOC threshold. Demonstrates the NewWindow path under
`startdesk`.
### `orcaReversiLike.c` ### `reversi.c`
Port of ORCA-C's `Reversi.cc` (`Desktop.Samples/Reversi.cc`). Full Othello game ported from ORCA-C's `Reversi.cc`. 100-byte
Menu-bar app — the ~1600 line game logic is omitted; the demo sentinel-bordered board, 8-direction capture detection, 1-ply
shows the desktop scaffolding (menu + TaskMaster) the original AI with corner/edge weighting, QD-rendered board with black/white
sits on top of. pieces.
### `qdProbe.c` ### `qdProbe.c`

Binary file not shown.


@ -1,11 +1,17 @@
// frame.c - full port of ORCA-C's Frame.cc sample. // frame.c - faithful port of ORCA-C's Frame.cc sample.
// //
// Mike Westerfield's "Frame" desktop demo (Byte Works, 1989). // Mike Westerfield, Byte Works 1989. Original at
// Original at tools/orca-c/C.Samples/Desktop.Samples/Frame.cc. // tools/orca-c/C.Samples/Desktop.Samples/Frame.cc.
// //
// Uses the real ROM Menu Manager — startdesk's QD-DP allocation now // The simplest possible Apple IIgs desktop app: Apple/File/Edit menu
// reserves the full 512 bytes QD needs (own DP + cursor mgr at +$100), // bar + TaskMaster event loop + About dialog. File>Quit (or cmd-Q)
// plus calls InitCursor. See feedback_drawmenubar_hang.md. // exits. The "About Frame" item in the Apple menu shows the original
// 4-line copyright dialog.
//
// Differences from the original:
// - The watchdog at the bottom of the loop forces a clean exit so
// the headless test (`demos/test.sh frame`) can verify $70 = $99.
// In interactive use the watchdog is benign.
#include "iigs/toolbox.h" #include "iigs/toolbox.h"
#include "iigs/desktop.h" #include "iigs/desktop.h"
@ -14,60 +20,131 @@
#define apple_About 257 #define apple_About 257
#define file_Quit 256 #define file_Quit 256
#define wInSpecial 25
#define wInMenuBar 3
typedef struct { short v1, h1, v2, h2; } Rect; #define norml 0
#define stop 1
#define note 2
#define caution 3
#define buttonItem 10
#define statText 136
#define itemDisable 0x8000
// Menu definition strings — verbatim from Frame.cc. typedef struct {
static unsigned char appleMenuStr[] = unsigned short wmWhat;
">>@\\XN1\r" unsigned long wmMessage;
"--About Frame\\N257V\r" unsigned long wmWhen;
".\r"; short wmWhereV, wmWhereH;
unsigned short wmModifiers;
static unsigned char fileMenuStr[] = unsigned long wmTaskData;
">> File \\N2\r" unsigned long wmTaskMask;
"--Close\\N255V\r" unsigned long wmLastClickTick;
"--Quit\\N256*Qq\r" unsigned long wmClickCount;
".\r"; unsigned long wmTaskData2;
unsigned long wmTaskData3;
static unsigned char editMenuStr[] = unsigned long wmTaskData4;
">> Edit \\N3\r" } WmTaskRec;
"--Undo\\N250V*Zz\r"
"--Cut\\N251*Xx\r"
"--Copy\\N252*Cc\r"
"--Paste\\N253*Vv\r"
"--Clear\\N254\r"
".\r";
// About-box message lines.
static const unsigned char line1[] = "\x09" "Frame 1.0";
static const unsigned char line2[] = "\x0e" "Copyright 1989";
static const unsigned char line3[] = "\x10" "Byte Works, Inc.";
static const unsigned char line4[] = "\x13" "By Mike Westerfield";
static const unsigned char btnOk[] = "\x02" "OK";
static void drawAbout(void) { typedef struct {
Rect outer; short itemID;
outer.h1 = 180; outer.v1 = 50; short itemRectV1, itemRectH1, itemRectV2, itemRectH2;
outer.h2 = 460; outer.v2 = 107; unsigned short itemType;
void *itemDescr;
short itemValue;
short itemFlag;
void *itemColor;
} ItemTemplate;
SetSolidPenPat(15);
PaintRect(&outer);
SetSolidPenPat(0);
FrameRect(&outer);
MoveTo(195, 64); DrawString((void *)line1); typedef struct {
MoveTo(195, 74); DrawString((void *)line2); short atRectV1, atRectH1, atRectV2, atRectH2;
MoveTo(195, 84); DrawString((void *)line3); short atBtnHorz;
MoveTo(195, 94); DrawString((void *)line4); short atBeep0, atBeep1, atBeep2, atBeep3;
void *atSound;
void *atResv1;
void *atResv2;
void *atItemList[8];
} AlertTemplate;
Rect ok;
ok.h1 = 395; ok.v1 = 88; static unsigned char editMenuStr[] = ">> Edit \\N3\r"
ok.h2 = 445; ok.v2 = 102; "--Undo\\N250V*Zz\r"
FrameRect(&ok); "--Cut\\N251*Xx\r"
MoveTo(412, 98); "--Copy\\N252*Cc\r"
DrawString((void *)btnOk); "--Paste\\N253*Vv\r"
"--Clear\\N254\r"
".\r";
static unsigned char fileMenuStr[] = ">> File \\N2\r"
"--Close\\N255V\r"
"--Quit\\N256*Qq\r"
".\r";
static unsigned char appleMenuStr[] = ">>@\\XN1\r"
"--About Frame\\N257V\r"
".\r";
static unsigned char gAboutMsg[] =
"\x3a" "Frame 1.0\r"
"Copyright 1989\r"
"Byte Works, Inc.\r\r"
"By Mike Westerfield";
static WmTaskRec gEvent;
static volatile unsigned short gDone;
static void doAlert(unsigned short kind, void *msg) {
static unsigned char okStr[] = "\x02OK";
static ItemTemplate button = {
1, 36, 15, 0, 0, buttonItem, okStr, 0, 0, (void *)0
};
static ItemTemplate message = {
100, 5, 100, 90, 280, itemDisable | statText, (void *)0, 0, 0, (void *)0
};
static AlertTemplate alertRec = {
50, 180, 107, 460,
2,
0x80, 0x80, 0x80, 0x80,
(void *)0, (void *)0, (void *)0,
{ (void *)0, (void *)0, (void *)0, (void *)0,
(void *)0, (void *)0, (void *)0, (void *)0 }
};
SetForeColor(0);
SetBackColor(15);
message.itemDescr = msg;
alertRec.atItemList[0] = (void *)&button;
alertRec.atItemList[1] = (void *)&message;
alertRec.atItemList[2] = (void *)0;
switch (kind) {
case norml: (void)Alert(&alertRec, (void *)0); break;
case stop: (void)StopAlert(&alertRec, (void *)0); break;
case note: (void)NoteAlert(&alertRec, (void *)0); break;
case caution: (void)CautionAlert(&alertRec, (void *)0); break;
default: break;
}
}
static void menuAbout(void) {
doAlert(note, gAboutMsg);
}
static void handleMenu(unsigned short menuNum) {
switch (menuNum) {
case apple_About: menuAbout(); break;
case file_Quit: gDone = 1; break;
default: break;
}
HiliteMenu(0, (unsigned short)(gEvent.wmTaskData >> 16));
} }
@ -85,12 +162,26 @@ int main(void) {
unsigned short userId = startdesk(640); unsigned short userId = startdesk(640);
(void)userId; (void)userId;
paintDesktopBackdrop(); // white desktop (WM dither -> noise in
// our 640 B/W palette; paint directly)
initMenus(); initMenus();
gEvent.wmTaskMask = 0x1FFFL;
ShowCursor(); ShowCursor();
for (volatile unsigned long s = 0; s < 100000UL; s++) { } gDone = 0;
drawAbout(); unsigned short watchdog = 0;
for (volatile unsigned long s = 0; s < 200000UL; s++) { } do {
unsigned short event = TaskMaster(0x076E, &gEvent);
switch (event) {
case wInSpecial:
case wInMenuBar:
handleMenu((unsigned short)gEvent.wmTaskData);
break;
default:
break;
}
watchdog++;
} while (!gDone && watchdog < 4000);
*(volatile unsigned char *)0x70 = 0x99; *(volatile unsigned char *)0x70 = 0x99;
return 0; return 0;
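
Several string literals in this file begin with a hex escape that looks like a Pascal-style length byte (for example `"\x02OK"` for a two-character label). As a hedged aside, the sketch below counts the payload of the About text as it appears in the diff and compares it with the leading byte. The helper names `pstrPayloadLen` and `pstrPrefixMatches` are hypothetical and not part of the commit, and whether the alert code in this runtime actually reads the first byte as a length is an assumption.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: number of payload bytes that follow the leading
   length byte. Assumes a NUL-terminated literal that has at least the
   length byte followed by a terminator. */
static size_t pstrPayloadLen(const unsigned char *s) {
    return strlen((const char *)s + 1);
}

/* Hypothetical helper: 1 if the leading byte equals the payload length. */
static int pstrPrefixMatches(const unsigned char *s) {
    return (size_t)s[0] == pstrPayloadLen(s);
}

/* Literals copied from the diff above. */
static const unsigned char okStr[] = "\x02OK";
static const unsigned char aboutMsg[] =
    "\x3a" "Frame 1.0\r"
    "Copyright 1989\r"
    "Byte Works, Inc.\r\r"
    "By Mike Westerfield";
```

Under that assumption, `okStr` is consistent (prefix 2, payload 2), while the About text counts 62 payload bytes against a prefix of 0x3a (58). The map file's 0x40-byte gap between `gAboutMsg` and `doAlert.okStr` agrees with 1 + 62 + 1. Whether the difference matters depends on how the alert text is consumed.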


@ -1,19 +1,19 @@
# section layout # section layout
.text : 0x001000 .. 0x0024b3 ( 5299 bytes) .text : 0x001000 .. 0x002286 ( 4742 bytes)
.rodata : 0x0024b3 .. 0x0025b2 ( 255 bytes) .rodata : 0x002286 .. 0x0023f2 ( 364 bytes)
.bss : 0x00a000 .. 0x00a00a ( 10 bytes) .bss : 0x00a000 .. 0x00a038 ( 56 bytes)
# per-input-file .text contributions # per-input-file .text contributions
186 /home/scott/claude/llvm816/runtime/crt0Gsos.o 186 /home/scott/claude/llvm816/runtime/crt0Gsos.o
1287 /home/scott/claude/llvm816/demos/frame.o 615 /home/scott/claude/llvm816/demos/frame.o
43513 /home/scott/claude/llvm816/runtime/libc.o 45465 /home/scott/claude/llvm816/runtime/libc.o
5935 /home/scott/claude/llvm816/runtime/snprintf.o 15382 /home/scott/claude/llvm816/runtime/snprintf.o
11953 /home/scott/claude/llvm816/runtime/extras.o 13322 /home/scott/claude/llvm816/runtime/extras.o
7077 /home/scott/claude/llvm816/runtime/softFloat.o 8398 /home/scott/claude/llvm816/runtime/softFloat.o
15379 /home/scott/claude/llvm816/runtime/softDouble.o 16151 /home/scott/claude/llvm816/runtime/softDouble.o
176 /home/scott/claude/llvm816/runtime/iigsGsos.o 176 /home/scott/claude/llvm816/runtime/iigsGsos.o
20670 /home/scott/claude/llvm816/runtime/iigsToolbox.o 20670 /home/scott/claude/llvm816/runtime/iigsToolbox.o
1349 /home/scott/claude/llvm816/runtime/desktop.o 1565 /home/scott/claude/llvm816/runtime/desktop.o
2540 /home/scott/claude/llvm816/runtime/libgcc.o 2540 /home/scott/claude/llvm816/runtime/libgcc.o
# global symbols (sorted by address) # global symbols (sorted by address)
@ -28,120 +28,121 @@
0x000000 __bss_seg3_bank 0x000000 __bss_seg3_bank
0x000000 __bss_seg3_lo16 0x000000 __bss_seg3_lo16
0x000000 __bss_seg3_size 0x000000 __bss_seg3_size
0x00000a __bss_seg0_size 0x000038 __bss_seg0_size
0x00000a __bss_size 0x000038 __bss_size
0x001000 __start 0x001000 __start
0x001000 __text_start 0x001000 __text_start
0x0010ba main 0x0010ba main
0x0015c1 CtlStartUp 0x001321 CtlStartUp
0x0015d1 EMStartUp 0x001331 NoteAlert
0x0015f0 FMStartUp 0x00134d EMStartUp
0x001600 LEStartUp 0x00136c FMStartUp
0x001610 LoadOneTool 0x00137c LEStartUp
0x001620 NewHandle 0x00138c LoadOneTool
0x001646 MenuStartUp 0x00139c NewHandle
0x001656 InsertMenu 0x0013c2 MenuStartUp
0x00166b NewMenu 0x0013d2 HiliteMenu
0x001685 QDStartUp 0x0013e2 InsertMenu
0x00169b DrawString 0x0013f7 NewMenu
0x0016ad FrameRect 0x001411 QDStartUp
0x0016bf MoveTo 0x001427 TaskMaster
0x0016cf PaintRect 0x00143e startdesk
0x0016e1 startdesk 0x001868 paintDesktopBackdrop
0x001ac7 __jsl_indir 0x00189a __jsl_indir
0x001aca __mulhi3 0x00189d __mulhi3
0x001ae9 __umulhisi3 0x0018bc __umulhisi3
0x001b40 __ashlhi3 0x001913 __ashlhi3
0x001b4f __lshrhi3 0x001922 __lshrhi3
0x001b5f __ashrhi3 0x001932 __ashrhi3
0x001b72 __udivhi3 0x001945 __udivhi3
0x001b7e __umodhi3 0x001951 __umodhi3
0x001b8a __divhi3 0x00195d __divhi3
0x001ba4 __modhi3 0x001977 __modhi3
0x001bbe __divmod_setup 0x001991 __divmod_setup
0x001bf1 __udivmod_core 0x0019c4 __udivmod_core
0x001c0f __mulsi3 0x0019e2 __mulsi3
0x001cc8 __ashlsi3 0x001a9b __ashlsi3
0x001cdd __lshrsi3 0x001ab0 __lshrsi3
0x001cf2 __ashrsi3 0x001ac5 __ashrsi3
0x001d0c __udivmodsi_core 0x001adf __udivmodsi_core
0x001d44 __udivsi3 0x001b17 __udivsi3
0x001d58 __umodsi3 0x001b2b __umodsi3
0x001d6c __divsi3 0x001b3f __divsi3
0x001d93 __modsi3 0x001b66 __modsi3
0x001dba __divmodsi_setup 0x001b8d __divmodsi_setup
0x001e0b __divmoddi4_stash 0x001bde __divmoddi4_stash
0x001e28 __retdi 0x001bfb __retdi
0x001e35 __ashldi3 0x001c08 __ashldi3
0x001e58 __lshrdi3 0x001c2b __lshrdi3
0x001e7b __ashrdi3 0x001c4e __ashrdi3
0x001ea1 __muldi3 0x001c74 __muldi3
0x001efc __ucmpdi2 0x001ccf __ucmpdi2
0x001f25 __cmpdi2 0x001cf8 __cmpdi2
0x001f5c __udivdi3 0x001d2f __udivdi3
0x001f65 __umoddi3 0x001d38 __umoddi3
0x001f7e __udivmoddi_core 0x001d51 __udivmoddi_core
0x001fcb __divdi3 0x001d9e __divdi3
0x001fea __moddi3 0x001dbd __moddi3
0x002017 __absdi_a 0x001dea __absdi_a
0x00201f __absdi_b 0x001df2 __absdi_b
0x002027 __negdi_a 0x001dfa __negdi_a
0x002045 __negdi_b 0x001e18 __negdi_b
0x002063 setjmp 0x001e36 setjmp
0x00208b longjmp 0x001e5e longjmp
0x0020b5 __umulhisi3_qsq 0x001e88 __umulhisi3_qsq
0x0024b3 __rodata_start 0x002286 __rodata_start
0x0024b3 __text_end 0x002286 __text_end
0x0024b3 gChainPath 0x002286 gChainPath
0x0024c7 editMenuStr 0x00229a editMenuStr
0x002520 fileMenuStr 0x0022f3 fileMenuStr
0x00254d appleMenuStr 0x002320 appleMenuStr
0x00256c line1 0x00233f gAboutMsg
0x002577 line2 0x00237f doAlert.okStr
0x002587 line3 0x002384 doAlert.button
0x002599 line4 0x00239c doAlert.message
0x0025ae btnOk 0x0023b4 doAlert.alertRec
0x0025b2 __init_array_end 0x0023f2 __init_array_end
0x0025b2 __init_array_start 0x0023f2 __init_array_start
0x0025b2 __rodata_end 0x0023f2 __rodata_end
0x00a000 __bss_lo16 0x00a000 __bss_lo16
0x00a000 __bss_seg0_lo16 0x00a000 __bss_seg0_lo16
0x00a000 __bss_start 0x00a000 __bss_start
0x00a000 gUserId 0x00a000 gEvent
0x00a002 gDpHandle 0x00a02c gDone
0x00a006 gDpBase 0x00a02e gUserId
0x00a008 __indirTarget 0x00a030 gDpHandle
0x00a00a __bss_end 0x00a034 gDpBase
0x00a00a __heap_start 0x00a036 __indirTarget
0x00a038 __bss_end
0x00a038 __heap_start
0x00bf00 __heap_end 0x00bf00 __heap_end
CtlStartUp = 0x0015c1 CtlStartUp = 0x001321
DrawString = 0x00169b EMStartUp = 0x00134d
EMStartUp = 0x0015d1 FMStartUp = 0x00136c
FMStartUp = 0x0015f0 HiliteMenu = 0x0013d2
FrameRect = 0x0016ad InsertMenu = 0x0013e2
InsertMenu = 0x001656 LEStartUp = 0x00137c
LEStartUp = 0x001600 LoadOneTool = 0x00138c
LoadOneTool = 0x001610 MenuStartUp = 0x0013c2
MenuStartUp = 0x001646 NewHandle = 0x00139c
MoveTo = 0x0016bf NewMenu = 0x0013f7
NewHandle = 0x001620 NoteAlert = 0x001331
NewMenu = 0x00166b QDStartUp = 0x001411
PaintRect = 0x0016cf TaskMaster = 0x001427
QDStartUp = 0x001685 __absdi_a = 0x001dea
__absdi_a = 0x002017 __absdi_b = 0x001df2
__absdi_b = 0x00201f __ashldi3 = 0x001c08
__ashldi3 = 0x001e35 __ashlhi3 = 0x001913
__ashlhi3 = 0x001b40 __ashlsi3 = 0x001a9b
__ashlsi3 = 0x001cc8 __ashrdi3 = 0x001c4e
__ashrdi3 = 0x001e7b __ashrhi3 = 0x001932
__ashrhi3 = 0x001b5f __ashrsi3 = 0x001ac5
__ashrsi3 = 0x001cf2
__bss_bank = 0x000000 __bss_bank = 0x000000
__bss_end = 0x00a00a __bss_end = 0x00a038
__bss_lo16 = 0x00a000 __bss_lo16 = 0x00a000
__bss_seg0_bank = 0x000000 __bss_seg0_bank = 0x000000
__bss_seg0_lo16 = 0x00a000 __bss_seg0_lo16 = 0x00a000
__bss_seg0_size = 0x00000a __bss_seg0_size = 0x000038
__bss_seg1_bank = 0x000000 __bss_seg1_bank = 0x000000
__bss_seg1_lo16 = 0x000000 __bss_seg1_lo16 = 0x000000
__bss_seg1_size = 0x000000 __bss_seg1_size = 0x000000
@ -151,63 +152,66 @@ __bss_seg2_size = 0x000000
__bss_seg3_bank = 0x000000 __bss_seg3_bank = 0x000000
__bss_seg3_lo16 = 0x000000 __bss_seg3_lo16 = 0x000000
__bss_seg3_size = 0x000000 __bss_seg3_size = 0x000000
__bss_size = 0x00000a __bss_size = 0x000038
__bss_start = 0x00a000 __bss_start = 0x00a000
__cmpdi2 = 0x001f25 __cmpdi2 = 0x001cf8
__divdi3 = 0x001fcb __divdi3 = 0x001d9e
__divhi3 = 0x001b8a __divhi3 = 0x00195d
__divmod_setup = 0x001bbe __divmod_setup = 0x001991
__divmoddi4_stash = 0x001e0b __divmoddi4_stash = 0x001bde
__divmodsi_setup = 0x001dba __divmodsi_setup = 0x001b8d
__divsi3 = 0x001d6c __divsi3 = 0x001b3f
__heap_end = 0x00bf00 __heap_end = 0x00bf00
__heap_start = 0x00a00a __heap_start = 0x00a038
__indirTarget = 0x00a008 __indirTarget = 0x00a036
__init_array_end = 0x0025b2 __init_array_end = 0x0023f2
__init_array_start = 0x0025b2 __init_array_start = 0x0023f2
__jsl_indir = 0x001ac7 __jsl_indir = 0x00189a
__lshrdi3 = 0x001e58 __lshrdi3 = 0x001c2b
__lshrhi3 = 0x001b4f __lshrhi3 = 0x001922
__lshrsi3 = 0x001cdd __lshrsi3 = 0x001ab0
__moddi3 = 0x001fea __moddi3 = 0x001dbd
__modhi3 = 0x001ba4 __modhi3 = 0x001977
__modsi3 = 0x001d93 __modsi3 = 0x001b66
__muldi3 = 0x001ea1 __muldi3 = 0x001c74
__mulhi3 = 0x001aca __mulhi3 = 0x00189d
__mulsi3 = 0x001c0f __mulsi3 = 0x0019e2
__negdi_a = 0x002027 __negdi_a = 0x001dfa
__negdi_b = 0x002045 __negdi_b = 0x001e18
__retdi = 0x001e28 __retdi = 0x001bfb
__rodata_end = 0x0025b2 __rodata_end = 0x0023f2
__rodata_start = 0x0024b3 __rodata_start = 0x002286
__start = 0x001000 __start = 0x001000
__text_end = 0x0024b3 __text_end = 0x002286
__text_start = 0x001000 __text_start = 0x001000
__ucmpdi2 = 0x001efc __ucmpdi2 = 0x001ccf
__udivdi3 = 0x001f5c __udivdi3 = 0x001d2f
__udivhi3 = 0x001b72 __udivhi3 = 0x001945
__udivmod_core = 0x001bf1 __udivmod_core = 0x0019c4
__udivmoddi_core = 0x001f7e __udivmoddi_core = 0x001d51
__udivmodsi_core = 0x001d0c __udivmodsi_core = 0x001adf
__udivsi3 = 0x001d44 __udivsi3 = 0x001b17
__umoddi3 = 0x001f65 __umoddi3 = 0x001d38
__umodhi3 = 0x001b7e __umodhi3 = 0x001951
__umodsi3 = 0x001d58 __umodsi3 = 0x001b2b
__umulhisi3 = 0x001ae9 __umulhisi3 = 0x0018bc
__umulhisi3_qsq = 0x0020b5 __umulhisi3_qsq = 0x001e88
appleMenuStr = 0x00254d appleMenuStr = 0x002320
btnOk = 0x0025ae doAlert.alertRec = 0x0023b4
editMenuStr = 0x0024c7 doAlert.button = 0x002384
fileMenuStr = 0x002520 doAlert.message = 0x00239c
gChainPath = 0x0024b3 doAlert.okStr = 0x00237f
gDpBase = 0x00a006 editMenuStr = 0x00229a
gDpHandle = 0x00a002 fileMenuStr = 0x0022f3
gUserId = 0x00a000 gAboutMsg = 0x00233f
line1 = 0x00256c gChainPath = 0x002286
line2 = 0x002577 gDone = 0x00a02c
line3 = 0x002587 gDpBase = 0x00a034
line4 = 0x002599 gDpHandle = 0x00a030
longjmp = 0x00208b gEvent = 0x00a000
gUserId = 0x00a02e
longjmp = 0x001e5e
main = 0x0010ba main = 0x0010ba
setjmp = 0x002063 paintDesktopBackdrop = 0x001868
startdesk = 0x0016e1 setjmp = 0x001e36
startdesk = 0x00143e
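
The "section layout" lines of the map file pair an address range with a byte count, so each line can be checked by subtracting the start address from the end address. The sketch below parses one such line and performs that check. `MapSection`, `parseSectionLine`, and `sectionSizeConsistent` are hypothetical names, and the exact whitespace of the real map file is an assumption (the format string tolerates runs of spaces).

```c
#include <stdio.h>

/* Hypothetical record for one "section layout" line of the map file. */
typedef struct {
    char name[16];
    unsigned long start, end, bytes;
} MapSection;

/* Parse a line such as ".text : 0x001000 .. 0x002286 ( 4742 bytes)".
   Returns 1 when all four fields were read, 0 otherwise. */
static int parseSectionLine(const char *line, MapSection *out) {
    int n = sscanf(line, " %15s : %lx .. %lx ( %lu bytes)",
                   out->name, &out->start, &out->end, &out->bytes);
    return n == 4;
}

/* 1 if the printed byte count equals end - start. */
static int sectionSizeConsistent(const MapSection *s) {
    return s->end >= s->start && (s->end - s->start) == s->bytes;
}
```

For the new frame map above, `.text` spans 0x001000 to 0x002286, which is 0x1286 = 4742 bytes, matching the printed count.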

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.


@ -1,25 +1,76 @@
// minicad.c - port of ORCA-C's MiniCAD.cc sample. // minicad.c - faithful port of ORCA-C's MiniCAD.cc sample.
// //
// MiniCAD is a tiny drawing program: each click in the content area // Mike Westerfield, Byte Works 1989. Original at
// creates a new line in the current window's line list. In the // tools/orca-c/C.Samples/Desktop.Samples/MiniCAD.cc.
// original you click to set the anchor, drag to draw a rubber-band //
// line, release to commit. We seed three classic line-art patterns // A simple multi-window CAD: File>New opens a drawing window (up to
// (curve-stitching, sunburst, mandala) instead of waiting for clicks // 4), click+drag inside a window's content rubber-bands a line,
// because our minimal Event Manager doesn't have a working // release commits it. File>Close closes the front window. Each
// GetNextEvent path for mouse-drag tracking, but the data model and // window's lines are remembered so the WM can repaint on update.
// rendering pipeline match MiniCAD.cc verbatim.
#include "iigs/toolbox.h" #include "iigs/toolbox.h"
#include "iigs/desktop.h" #include "iigs/desktop.h"
#define apple_About 257
#define file_Quit 256
#define file_New 258
#define file_Close 255
#define wInMenuBar 3
#define wInSpecial 25
#define wInGoAway 17
#define wInContent 19 #define wInContent 19
#define fVis 0x0020
#define fMove 0x0080 #define mUpMask 0x0002
#define fClose 0x4000
#define modeCopy 0
#define modeXOR 2
#define topMost ((void *)-1L)
#define bottomMost ((void *)0)
#define maxWindows 4
#define maxLines 50
#define norml 0
#define stop 1
#define note 2
#define caution 3
#define buttonItem 10
#define statText 136
#define itemDisable 0x8000
typedef struct { short v1, h1, v2, h2; } Rect; typedef struct { short v1, h1, v2, h2; } Rect;
typedef struct { short v, h; } Point;
typedef struct { Point p1, p2; } LineRec;
typedef struct {
unsigned short wmWhat;
unsigned long wmMessage;
unsigned long wmWhen;
short wmWhereV, wmWhereH;
unsigned short wmModifiers;
unsigned long wmTaskData;
unsigned long wmTaskMask;
unsigned long wmLastClickTick;
unsigned long wmClickCount;
unsigned long wmTaskData2;
unsigned long wmTaskData3;
unsigned long wmTaskData4;
} WmTaskRec;
typedef struct {
unsigned short wmWhat;
unsigned long wmMessage;
unsigned long wmWhen;
short wmWhereV, wmWhereH;
unsigned short wmModifiers;
} EventRec;
typedef struct { typedef struct {
unsigned short paramLength; unsigned short paramLength;
@ -44,106 +95,282 @@ typedef struct {
} NewWindowParm; } NewWindowParm;
typedef struct { short v1, h1, v2, h2; } LineRec; typedef struct {
short itemID;
short itemRectV1, itemRectH1, itemRectV2, itemRectH2;
unsigned short itemType;
void *itemDescr;
short itemValue;
short itemFlag;
void *itemColor;
} ItemTemplate;
typedef struct {
short atRectV1, atRectH1, atRectV2, atRectH2;
short atBtnHorz;
short atBeep0, atBeep1, atBeep2, atBeep3;
void *atSound;
void *atResv1;
void *atResv2;
void *atItemList[8];
} AlertTemplate;
static unsigned char gTitle[] = "\x07MiniCAD"; typedef struct {
void *wPtr;
unsigned char *name;
unsigned short numLines;
LineRec lines[maxLines];
} WindowRecord;
// Menu bar titles painted manually (DrawMenuBar hangs in our env).
static const unsigned char appleTitle[] = "\x01\x14"; static unsigned char editMenuStr[] = ">> Edit \\N3\r"
static const unsigned char fileTitle[] = "\x04" "File"; "--Undo\\N250V*Zz\r"
static const unsigned char editTitle[] = "\x04" "Edit"; "--Cut\\N251*Xx\r"
static const unsigned char optsTitle[] = "\x07" "Options"; "--Copy\\N252*Cc\r"
static const unsigned char *const menuTitles[] = { "--Paste\\N253*Vv\r"
appleTitle, fileTitle, editTitle, optsTitle "--Clear\\N254\r"
".\r";
static unsigned char fileMenuStr[] = ">> File \\N2\r"
"--New\\N258*Nn\r"
"--Close\\N255V\r"
"--Quit\\N256*Qq\r"
".\r";
static unsigned char appleMenuStr[] = ">>@\\XN1\r"
"--About...\\N257V\r"
".\r";
static unsigned char gAboutMsg[] =
"\x3d" "Mini-CAD 1.0\r"
"Copyright 1989\r"
"Byte Works, Inc.\r\r"
"By Mike Westerfield";
static unsigned char gTitle0[] = "\x07Paint 1";
static unsigned char gTitle1[] = "\x07Paint 2";
static unsigned char gTitle2[] = "\x07Paint 3";
static unsigned char gTitle3[] = "\x07Paint 4";
static WindowRecord gWindows[maxWindows] = {
{ (void *)0, gTitle0, 0, { { {0,0}, {0,0} } } },
{ (void *)0, gTitle1, 0, { { {0,0}, {0,0} } } },
{ (void *)0, gTitle2, 0, { { {0,0}, {0,0} } } },
{ (void *)0, gTitle3, 0, { { {0,0}, {0,0} } } }
}; };
static NewWindowParm gWp; static WmTaskRec gEvent;
static volatile unsigned short gDone;
// Draw a curve-stitching pattern: 12 chord lines mapping the y-axis static void doAlert(unsigned short kind, void *msg) {
// to a curve along the x-axis. Visually it traces a hyperbolic static unsigned char okStr[] = "\x02OK";
// envelope (the classic "string art" pattern). static ItemTemplate button = {
static void drawCurves(short ox, short oy) { 1, 36, 15, 0, 0, buttonItem, okStr, 0, 0, (void *)0
for (short i = 0; i < 12; i++) { };
MoveTo((short)(ox + 0), (short)(oy + i * 6)); static ItemTemplate message = {
LineTo((short)(ox + 90 - i * 5), (short)(oy + 70 - i * 5)); 100, 5, 100, 90, 280, itemDisable | statText, (void *)0, 0, 0, (void *)0
};
static AlertTemplate alertRec = {
50, 180, 107, 460, 2, 0x80, 0x80, 0x80, 0x80,
(void *)0, (void *)0, (void *)0,
{ (void *)0, (void *)0, (void *)0, (void *)0,
(void *)0, (void *)0, (void *)0, (void *)0 }
};
SetForeColor(0);
SetBackColor(15);
message.itemDescr = msg;
alertRec.atItemList[0] = (void *)&button;
alertRec.atItemList[1] = (void *)&message;
alertRec.atItemList[2] = (void *)0;
switch (kind) {
case norml: (void)Alert(&alertRec, (void *)0); break;
case stop: (void)StopAlert(&alertRec, (void *)0); break;
case note: (void)NoteAlert(&alertRec, (void *)0); break;
case caution: (void)CautionAlert(&alertRec, (void *)0); break;
default: break;
} }
} }
// Draw a sunburst: 12 radial lines from a central point. // Window-content def-proc. The WM calls this with DBR set to our
static void drawSunburst(short cx, short cy, short r) { // bank (Loader sets up the JSL chain). We use GetWRefCon on the
// Pre-computed cos/sin for 12 equally-spaced angles (every 30 // current port to know which gWindows[] entry to redraw.
// degrees), scaled by 1000. Avoids any float math. static void drawWindow(void) {
static const short cosA[12] = { 1000, 866, 500, 0, -500, -866, -1000, -866, -500, 0, 500, 866 }; unsigned long refcon = (unsigned long)GetWRefCon(GetPort());
static const short sinA[12] = { 0, 500, 866, 1000, 866, 500, 0, -500, -866, -1000, -866, -500 }; unsigned short i = (unsigned short)refcon;
for (short i = 0; i < 12; i++) { if (i >= maxWindows) return;
short dx = (short)((long)cosA[i] * r / 1000); WindowRecord *wp = &gWindows[i];
short dy = (short)((long)sinA[i] * r / 1000); if (wp->numLines == 0) return;
MoveTo((short)(cx - dx), (short)(cy - dy)); SetPenMode(modeCopy);
LineTo((short)(cx + dx), (short)(cy + dy)); SetSolidPenPat(0);
SetPenSize(2, 1);
for (unsigned short j = 0; j < wp->numLines; j++) {
LineRec *lp = &wp->lines[j];
MoveTo(lp->p1.h, lp->p1.v);
LineTo(lp->p2.h, lp->p2.v);
} }
} }
// Draw a mandala: 6-pointed star made of two overlapping triangles. static void doNew(void) {
static void drawMandala(short cx, short cy, short r) { static NewWindowParm wp;
short h = (short)((long)r * 866L / 1000L); unsigned short i = 0;
short h2 = (short)(r / 2); while (i < maxWindows && gWindows[i].wPtr != (void *)0) i++;
// First triangle (point up). if (i >= maxWindows) return;
MoveTo(cx, (short)(cy - r)); gWindows[i].numLines = 0;
LineTo((short)(cx + h), (short)(cy + h2));
LineTo((short)(cx - h), (short)(cy + h2)); unsigned char *p = (unsigned char *)&wp;
LineTo(cx, (short)(cy - r)); for (unsigned short k = 0; k < sizeof wp; k++) p[k] = 0;
// Second triangle (point down). wp.paramLength = (unsigned short)sizeof wp;
MoveTo(cx, (short)(cy + r)); wp.wFrameBits = 0x4007 | 0x0020 | 0x0080 | 0x0400 | 0x4000; // fTitle+fClose+fVis+fMove+fGrow
LineTo((short)(cx + h), (short)(cy - h2)); wp.wTitle = gWindows[i].name;
LineTo((short)(cx - h), (short)(cy - h2)); wp.wRefCon = (unsigned long)i;
LineTo(cx, (short)(cy + r)); wp.wMaxHeight = 188;
wp.wMaxWidth = 615;
wp.wPosition.v1 = (short)(25 + i * 10);
wp.wPosition.h1 = (short)(10 + i * 10);
wp.wPosition.v2 = (short)(180 + i * 10);
wp.wPosition.h2 = (short)(600 + i * 10);
wp.wContDefProc = (void *)&drawWindow;
wp.wPlane = topMost;
gWindows[i].wPtr = NewWindow(&wp);
if (i == maxWindows - 1) {
DisableMItem(file_New);
}
}
static void doClose(void) {
void *fw = FrontWindow();
if (!fw) return;
unsigned short i = (unsigned short)(unsigned long)GetWRefCon(fw);
if (i >= maxWindows) return;
CloseWindow(gWindows[i].wPtr);
gWindows[i].wPtr = (void *)0;
EnableMItem(file_New);
}
static void menuAbout(void) {
doAlert(note, gAboutMsg);
}
static void sketch(void) {
void *fw = FrontWindow();
if (!fw) return;
unsigned short i = (unsigned short)(unsigned long)GetWRefCon(fw);
if (i >= maxWindows) return;
if (gWindows[i].numLines >= maxLines) {
static unsigned char fullMsg[] =
"\x3a" "The window is full -\r"
"more lines cannot be\r"
"added.";
doAlert(stop, fullMsg);
return;
}
StartDrawing(fw);
SetSolidPenPat(15);
SetPenSize(2, 1);
SetPenMode(modeXOR);
Point firstPt;
firstPt.h = gEvent.wmWhereH;
firstPt.v = gEvent.wmWhereV;
GlobalToLocal(&firstPt);
MoveTo(firstPt.h, firstPt.v);
LineTo(firstPt.h, firstPt.v);
Point endPt = firstPt;
EventRec ev;
while (!GetNextEvent(mUpMask, &ev)) {
Point cur;
cur.h = ev.wmWhereH;
cur.v = ev.wmWhereV;
GlobalToLocal(&cur);
if (cur.h != endPt.h || cur.v != endPt.v) {
MoveTo(firstPt.h, firstPt.v);
LineTo(endPt.h, endPt.v);
MoveTo(firstPt.h, firstPt.v);
LineTo(cur.h, cur.v);
endPt = cur;
}
}
// Erase final XOR line.
MoveTo(firstPt.h, firstPt.v);
LineTo(endPt.h, endPt.v);
if (firstPt.h != endPt.h || firstPt.v != endPt.v) {
unsigned short n = gWindows[i].numLines++;
gWindows[i].lines[n].p1 = firstPt;
gWindows[i].lines[n].p2 = endPt;
SetPenMode(modeCopy);
SetSolidPenPat(0);
MoveTo(firstPt.h, firstPt.v);
LineTo(endPt.h, endPt.v);
}
}
static void handleMenu(unsigned short menuNum) {
switch (menuNum) {
case apple_About: menuAbout(); break;
case file_Quit: gDone = 1; break;
case file_New: doNew(); break;
case file_Close: doClose(); break;
default: break;
}
HiliteMenu(0, (unsigned short)(gEvent.wmTaskData >> 16));
}
static void initMenus(void) {
InsertMenu(NewMenu(editMenuStr), 0);
InsertMenu(NewMenu(fileMenuStr), 0);
InsertMenu(NewMenu(appleMenuStr), 0);
FixAppleMenu(1);
FixMenuBar();
DrawMenuBar();
} }
int main(void) { int main(void) {
unsigned short userId = startdesk(640); unsigned short userId = startdesk(640);
(void)userId; (void)userId;
paintDesktopBackdrop(); paintDesktopBackdrop();
paintMenuBarTitles(menuTitles, 4); initMenus();
gEvent.wmTaskMask = 0x1FFFL;
ShowCursor(); ShowCursor();
// Open the drawing window. // Open one window so the demo has visible content immediately.
{ doNew();
unsigned char *p = (unsigned char *)&gWp;
for (unsigned short i = 0; i < sizeof gWp; i++) p[i] = 0;
}
gWp.paramLength = (unsigned short)sizeof gWp;
gWp.wFrameBits = fVis | fMove | fClose;
gWp.wTitle = gTitle;
gWp.wMaxHeight = 200;
gWp.wMaxWidth = 640;
gWp.wPosition.v1 = 20; gWp.wPosition.h1 = 30;
gWp.wPosition.v2 = 180; gWp.wPosition.h2 = 610;
gWp.wPlane = (void *)-1L;
void *win = NewWindow(&gWp);
if (win) { gDone = 0;
BeginUpdate(win); unsigned short watchdog = 0;
SetPort(win); do {
SetSolidPenPat(0); unsigned short event = TaskMaster(0x076E, &gEvent);
switch (event) {
case wInSpecial:
case wInMenuBar:
handleMenu((unsigned short)gEvent.wmTaskData);
break;
case wInGoAway:
doClose();
break;
case wInContent:
sketch();
break;
default:
break;
}
watchdog++;
} while (!gDone && watchdog < 4000);
// Three patterns laid out horizontally.
drawCurves(20, 30);
drawSunburst(280, 75, 50);
drawMandala(450, 75, 50);
EndUpdate(win);
}
for (volatile unsigned long s = 0; s < 400000UL; s++) { }
if (win) {
CloseWindow(win);
}
*(volatile unsigned char *)0x70 = 0x99; *(volatile unsigned char *)0x70 = 0x99;
return 0; return 0;
} }
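
The `sketch()` routine above rubber-bands a line in `modeXOR`: it redraws the previous line to erase it, then draws the line to the new cursor position. The following portable sketch illustrates why drawing the same line twice under XOR restores the background. It uses a small in-memory pixel grid instead of QuickDraw; `grid`, `xorPixel`, `xorLine`, and `countSet` are hypothetical names, and the 2x1 pen size and pen pattern from the diff are not modeled.

```c
#include <stdlib.h>

#define GRID_W 32
#define GRID_H 16

/* Hypothetical stand-in for the screen: one byte per pixel, 0 or 1. */
static unsigned char grid[GRID_H][GRID_W];

/* XOR one pixel, ignoring coordinates outside the grid. */
static void xorPixel(int h, int v) {
    if (h >= 0 && h < GRID_W && v >= 0 && v < GRID_H) grid[v][h] ^= 1;
}

/* Bresenham line from (h0,v0) to (h1,v1); each pixel on the line is
   toggled exactly once, so a second identical call undoes the first. */
static void xorLine(int h0, int v0, int h1, int v1) {
    int dh = abs(h1 - h0), sh = h0 < h1 ? 1 : -1;
    int dv = -abs(v1 - v0), sv = v0 < v1 ? 1 : -1;
    int err = dh + dv;
    for (;;) {
        xorPixel(h0, v0);
        if (h0 == h1 && v0 == v1) break;
        int e2 = 2 * err;
        if (e2 >= dv) { err += dv; h0 += sh; }
        if (e2 <= dh) { err += dh; v0 += sv; }
    }
}

/* Count set pixels, to inspect what is left on the grid. */
static int countSet(void) {
    int n = 0;
    for (int v = 0; v < GRID_H; v++)
        for (int h = 0; h < GRID_W; h++)
            n += grid[v][h];
    return n;
}
```

One rubber-band step then amounts to `xorLine(first, oldEnd)` followed by `xorLine(first, newEnd)`: the first call removes the stale line and the second shows the new one, with any pixels already on the grid left as they were.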


@ -1,19 +1,19 @@
# section layout # section layout
.text : 0x001000 .. 0x002638 ( 5688 bytes) .text : 0x001000 .. 0x003102 ( 8450 bytes)
.rodata : 0x002638 .. 0x0026ad ( 117 bytes) .rodata : 0x003102 .. 0x00393a ( 2104 bytes)
.bss : 0x00a000 .. 0x00a058 ( 88 bytes) .bss : 0x00a000 .. 0x00a086 ( 134 bytes)
# per-input-file .text contributions # per-input-file .text contributions
186 /home/scott/claude/llvm816/runtime/crt0Gsos.o 186 /home/scott/claude/llvm816/runtime/crt0Gsos.o
1374 /home/scott/claude/llvm816/demos/minicad.o 4058 /home/scott/claude/llvm816/demos/minicad.o
43513 /home/scott/claude/llvm816/runtime/libc.o 43132 /home/scott/claude/llvm816/runtime/libc.o
5935 /home/scott/claude/llvm816/runtime/snprintf.o 14895 /home/scott/claude/llvm816/runtime/snprintf.o
11953 /home/scott/claude/llvm816/runtime/extras.o 11953 /home/scott/claude/llvm816/runtime/extras.o
7077 /home/scott/claude/llvm816/runtime/softFloat.o 7077 /home/scott/claude/llvm816/runtime/softFloat.o
15379 /home/scott/claude/llvm816/runtime/softDouble.o 15379 /home/scott/claude/llvm816/runtime/softDouble.o
176 /home/scott/claude/llvm816/runtime/iigsGsos.o 176 /home/scott/claude/llvm816/runtime/iigsGsos.o
20670 /home/scott/claude/llvm816/runtime/iigsToolbox.o 20670 /home/scott/claude/llvm816/runtime/iigsToolbox.o
1302 /home/scott/claude/llvm816/runtime/desktop.o 1349 /home/scott/claude/llvm816/runtime/desktop.o
2540 /home/scott/claude/llvm816/runtime/libgcc.o 2540 /home/scott/claude/llvm816/runtime/libgcc.o
# global symbols (sorted by address) # global symbols (sorted by address)
@ -28,126 +28,154 @@
0x000000 __bss_seg3_bank 0x000000 __bss_seg3_bank
0x000000 __bss_seg3_lo16 0x000000 __bss_seg3_lo16
0x000000 __bss_seg3_size 0x000000 __bss_seg3_size
0x000058 __bss_seg0_size 0x000086 __bss_seg0_size
0x000058 __bss_size 0x000086 __bss_size
0x001000 __start 0x001000 __start
0x001000 __text_start 0x001000 __text_start
0x0010ba main 0x0010ba main
0x001618 memset 0x001eee drawWindow
0x001678 CtlStartUp 0x002094 memset
0x001688 EMStartUp 0x0020f4 CtlStartUp
0x0016a7 FMStartUp 0x002104 NoteAlert
0x0016b7 LEStartUp 0x002120 StopAlert
0x0016c7 LoadOneTool 0x00213c EMStartUp
0x0016d7 NewHandle 0x00215b GetNextEvent
0x0016fd QDStartUp 0x002172 FMStartUp
0x001713 DrawString 0x002182 LEStartUp
0x001725 LineTo 0x002192 LoadOneTool
0x001735 MoveTo 0x0021a2 NewHandle
0x001745 SetPort 0x0021c8 MenuStartUp
0x001757 BeginUpdate 0x0021d8 HiliteMenu
0x001769 CloseWindow 0x0021e8 InsertMenu
0x00177b EndUpdate 0x0021fd NewMenu
0x00178d NewWindow 0x002217 QDStartUp
0x0017a7 startdesk 0x00222d GetPort
0x001b5e paintMenuBarTitles 0x00223d GlobalToLocal
0x001c1a paintDesktopBackdrop 0x00224f LineTo
0x001c4c __jsl_indir 0x00225f MoveTo
0x001c4f __mulhi3 0x00226f SetPenSize
0x001c6e __umulhisi3 0x00227f CloseWindow
0x001cc5 __ashlhi3 0x002291 FrontWindow
0x001cd4 __lshrhi3 0x0022a1 GetWRefCon
0x001ce4 __ashrhi3 0x0022bb NewWindow
0x001cf7 __udivhi3 0x0022d5 StartDrawing
0x001d03 __umodhi3 0x0022e7 TaskMaster
0x001d0f __divhi3 0x0022fe startdesk
0x001d29 __modhi3 0x0026e4 paintDesktopBackdrop
0x001d43 __divmod_setup 0x002716 __jsl_indir
0x001d76 __udivmod_core 0x002719 __mulhi3
0x001d94 __mulsi3 0x002738 __umulhisi3
0x001e4d __ashlsi3 0x00278f __ashlhi3
0x001e62 __lshrsi3 0x00279e __lshrhi3
0x001e77 __ashrsi3 0x0027ae __ashrhi3
0x001e91 __udivmodsi_core 0x0027c1 __udivhi3
0x001ec9 __udivsi3 0x0027cd __umodhi3
0x001edd __umodsi3 0x0027d9 __divhi3
0x001ef1 __divsi3 0x0027f3 __modhi3
0x001f18 __modsi3 0x00280d __divmod_setup
0x001f3f __divmodsi_setup 0x002840 __udivmod_core
0x001f90 __divmoddi4_stash 0x00285e __mulsi3
0x001fad __retdi 0x002917 __ashlsi3
0x001fba __ashldi3 0x00292c __lshrsi3
0x001fdd __lshrdi3 0x002941 __ashrsi3
0x002000 __ashrdi3 0x00295b __udivmodsi_core
0x002026 __muldi3 0x002993 __udivsi3
0x002081 __ucmpdi2 0x0029a7 __umodsi3
0x0020aa __cmpdi2 0x0029bb __divsi3
0x0020e1 __udivdi3 0x0029e2 __modsi3
0x0020ea __umoddi3 0x002a09 __divmodsi_setup
0x002103 __udivmoddi_core 0x002a5a __divmoddi4_stash
0x002150 __divdi3 0x002a77 __retdi
0x00216f __moddi3 0x002a84 __ashldi3
0x00219c __absdi_a 0x002aa7 __lshrdi3
0x0021a4 __absdi_b 0x002aca __ashrdi3
0x0021ac __negdi_a 0x002af0 __muldi3
0x0021ca __negdi_b 0x002b4b __ucmpdi2
0x0021e8 setjmp 0x002b74 __cmpdi2
0x002210 longjmp 0x002bab __udivdi3
0x00223a __umulhisi3_qsq 0x002bb4 __umoddi3
0x002638 __rodata_start 0x002bcd __udivmoddi_core
0x002638 __text_end 0x002c1a __divdi3
0x002638 gChainPath 0x002c39 __moddi3
0x00264c menuTitles 0x002c66 __absdi_a
0x00265c appleTitle 0x002c6e __absdi_b
0x00265f fileTitle 0x002c76 __negdi_a
0x002665 editTitle 0x002c94 __negdi_b
0x00266b optsTitle 0x002cb2 setjmp
0x002674 drawSunburst.cosA 0x002cda longjmp
0x00268c drawSunburst.sinA 0x002d04 __umulhisi3_qsq
0x0026a4 gTitle 0x003102 __rodata_start
0x0026ad __init_array_end 0x003102 __text_end
0x0026ad __init_array_start 0x003102 gChainPath
0x0026ad __rodata_end 0x003116 editMenuStr
0x00316f fileMenuStr
0x0031aa appleMenuStr
0x0031c6 gWindows
0x00382e gTitle0
0x003837 gTitle1
0x003840 gTitle2
0x003849 gTitle3
0x003852 gAboutMsg
0x003895 doAlert.okStr
0x00389a doAlert.button
0x0038b2 doAlert.message
0x0038ca doAlert.alertRec
0x003908 sketch.fullMsg
0x00393a __init_array_end
0x00393a __init_array_start
0x00393a __rodata_end
0x00a000 __bss_lo16 0x00a000 __bss_lo16
0x00a000 __bss_seg0_lo16 0x00a000 __bss_seg0_lo16
0x00a000 __bss_start 0x00a000 __bss_start
0x00a000 gWp 0x00a000 gEvent
0x00a04e gUserId 0x00a02c gDone
0x00a050 gDpHandle 0x00a02e doNew.wp
0x00a054 gDpBase 0x00a07c gUserId
0x00a056 __indirTarget 0x00a07e gDpHandle
0x00a058 __bss_end 0x00a082 gDpBase
0x00a058 __heap_start 0x00a084 __indirTarget
0x00a086 __bss_end
0x00a086 __heap_start
0x00bf00 __heap_end 0x00bf00 __heap_end
BeginUpdate = 0x001757 CloseWindow = 0x00227f
CloseWindow = 0x001769 CtlStartUp = 0x0020f4
CtlStartUp = 0x001678 EMStartUp = 0x00213c
DrawString = 0x001713 FMStartUp = 0x002172
EMStartUp = 0x001688 FrontWindow = 0x002291
EndUpdate = 0x00177b GetNextEvent = 0x00215b
FMStartUp = 0x0016a7 GetPort = 0x00222d
LEStartUp = 0x0016b7 GetWRefCon = 0x0022a1
LineTo = 0x001725 GlobalToLocal = 0x00223d
LoadOneTool = 0x0016c7 HiliteMenu = 0x0021d8
MoveTo = 0x001735 InsertMenu = 0x0021e8
NewHandle = 0x0016d7 LEStartUp = 0x002182
NewWindow = 0x00178d LineTo = 0x00224f
QDStartUp = 0x0016fd LoadOneTool = 0x002192
SetPort = 0x001745 MenuStartUp = 0x0021c8
__absdi_a = 0x00219c MoveTo = 0x00225f
__absdi_b = 0x0021a4 NewHandle = 0x0021a2
__ashldi3 = 0x001fba NewMenu = 0x0021fd
__ashlhi3 = 0x001cc5 NewWindow = 0x0022bb
__ashlsi3 = 0x001e4d NoteAlert = 0x002104
__ashrdi3 = 0x002000 QDStartUp = 0x002217
__ashrhi3 = 0x001ce4 SetPenSize = 0x00226f
__ashrsi3 = 0x001e77 StartDrawing = 0x0022d5
StopAlert = 0x002120
TaskMaster = 0x0022e7
__absdi_a = 0x002c66
__absdi_b = 0x002c6e
__ashldi3 = 0x002a84
__ashlhi3 = 0x00278f
__ashlsi3 = 0x002917
__ashrdi3 = 0x002aca
__ashrhi3 = 0x0027ae
__ashrsi3 = 0x002941
__bss_bank = 0x000000 __bss_bank = 0x000000
__bss_end = 0x00a058 __bss_end = 0x00a086
__bss_lo16 = 0x00a000 __bss_lo16 = 0x00a000
__bss_seg0_bank = 0x000000 __bss_seg0_bank = 0x000000
__bss_seg0_lo16 = 0x00a000 __bss_seg0_lo16 = 0x00a000
__bss_seg0_size = 0x000058 __bss_seg0_size = 0x000086
__bss_seg1_bank = 0x000000 __bss_seg1_bank = 0x000000
__bss_seg1_lo16 = 0x000000 __bss_seg1_lo16 = 0x000000
__bss_seg1_size = 0x000000 __bss_seg1_size = 0x000000
@ -157,67 +185,75 @@ __bss_seg2_size = 0x000000
__bss_seg3_bank = 0x000000 __bss_seg3_bank = 0x000000
__bss_seg3_lo16 = 0x000000 __bss_seg3_lo16 = 0x000000
__bss_seg3_size = 0x000000 __bss_seg3_size = 0x000000
__bss_size = 0x000058 __bss_size = 0x000086
__bss_start = 0x00a000 __bss_start = 0x00a000
__cmpdi2 = 0x0020aa __cmpdi2 = 0x002b74
__divdi3 = 0x002150 __divdi3 = 0x002c1a
__divhi3 = 0x001d0f __divhi3 = 0x0027d9
__divmod_setup = 0x001d43 __divmod_setup = 0x00280d
__divmoddi4_stash = 0x001f90 __divmoddi4_stash = 0x002a5a
__divmodsi_setup = 0x001f3f __divmodsi_setup = 0x002a09
__divsi3 = 0x001ef1 __divsi3 = 0x0029bb
__heap_end = 0x00bf00 __heap_end = 0x00bf00
__heap_start = 0x00a058 __heap_start = 0x00a086
__indirTarget = 0x00a056 __indirTarget = 0x00a084
__init_array_end = 0x0026ad __init_array_end = 0x00393a
__init_array_start = 0x0026ad __init_array_start = 0x00393a
__jsl_indir = 0x001c4c __jsl_indir = 0x002716
__lshrdi3 = 0x001fdd __lshrdi3 = 0x002aa7
__lshrhi3 = 0x001cd4 __lshrhi3 = 0x00279e
__lshrsi3 = 0x001e62 __lshrsi3 = 0x00292c
__moddi3 = 0x00216f __moddi3 = 0x002c39
__modhi3 = 0x001d29 __modhi3 = 0x0027f3
__modsi3 = 0x001f18 __modsi3 = 0x0029e2
__muldi3 = 0x002026 __muldi3 = 0x002af0
__mulhi3 = 0x001c4f __mulhi3 = 0x002719
__mulsi3 = 0x001d94 __mulsi3 = 0x00285e
__negdi_a = 0x0021ac __negdi_a = 0x002c76
__negdi_b = 0x0021ca __negdi_b = 0x002c94
__retdi = 0x001fad __retdi = 0x002a77
__rodata_end = 0x0026ad __rodata_end = 0x00393a
__rodata_start = 0x002638 __rodata_start = 0x003102
__start = 0x001000 __start = 0x001000
__text_end = 0x002638 __text_end = 0x003102
__text_start = 0x001000 __text_start = 0x001000
__ucmpdi2 = 0x002081 __ucmpdi2 = 0x002b4b
__udivdi3 = 0x0020e1 __udivdi3 = 0x002bab
__udivhi3 = 0x001cf7 __udivhi3 = 0x0027c1
__udivmod_core = 0x001d76 __udivmod_core = 0x002840
__udivmoddi_core = 0x002103 __udivmoddi_core = 0x002bcd
__udivmodsi_core = 0x001e91 __udivmodsi_core = 0x00295b
__udivsi3 = 0x001ec9 __udivsi3 = 0x002993
__umoddi3 = 0x0020ea __umoddi3 = 0x002bb4
__umodhi3 = 0x001d03 __umodhi3 = 0x0027cd
__umodsi3 = 0x001edd __umodsi3 = 0x0029a7
__umulhisi3 = 0x001c6e __umulhisi3 = 0x002738
__umulhisi3_qsq = 0x00223a __umulhisi3_qsq = 0x002d04
appleTitle = 0x00265c appleMenuStr = 0x0031aa
drawSunburst.cosA = 0x002674 doAlert.alertRec = 0x0038ca
drawSunburst.sinA = 0x00268c doAlert.button = 0x00389a
editTitle = 0x002665 doAlert.message = 0x0038b2
fileTitle = 0x00265f doAlert.okStr = 0x003895
gChainPath = 0x002638 doNew.wp = 0x00a02e
gDpBase = 0x00a054 drawWindow = 0x001eee
gDpHandle = 0x00a050 editMenuStr = 0x003116
gTitle = 0x0026a4 fileMenuStr = 0x00316f
gUserId = 0x00a04e gAboutMsg = 0x003852
gWp = 0x00a000 gChainPath = 0x003102
longjmp = 0x002210 gDone = 0x00a02c
gDpBase = 0x00a082
gDpHandle = 0x00a07e
gEvent = 0x00a000
gTitle0 = 0x00382e
gTitle1 = 0x003837
gTitle2 = 0x003840
gTitle3 = 0x003849
gUserId = 0x00a07c
gWindows = 0x0031c6
longjmp = 0x002cda
main = 0x0010ba main = 0x0010ba
memset = 0x001618 memset = 0x002094
menuTitles = 0x00264c paintDesktopBackdrop = 0x0026e4
optsTitle = 0x00266b setjmp = 0x002cb2
paintDesktopBackdrop = 0x001c1a sketch.fullMsg = 0x003908
paintMenuBarTitles = 0x001b5e startdesk = 0x0022fe
setjmp = 0x0021e8
startdesk = 0x0017a7

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

@ -1,160 +0,0 @@
// orcaFrameLike.c - port of ORCA-C's Frame.cc sample.
//
// Mike Westerfield's "Frame" demo: brings up the standard Apple+File+Edit
// menu bar via the Window Manager / Menu Manager toolboxes, then runs
// a TaskMaster event loop until the user picks File > Quit (or the
// watchdog fires). Modeled after tools/orca-c/C.Samples/Desktop.Samples/
// Frame.cc.
//
// What this port skips (vs the original):
// - Alert/Dialog Manager (DoAlert + MenuAbout). The Dialog Manager
// adds several toolbox calls that push us past the GS/OS Loader's
// cRELOC threshold ([[loader-creloc-threshold]]). HandleMenu for
// the "About" item is a no-op here.
// - enddesk() shutdown chain — GS/OS QUIT cleans up; see
// [[orca-frame-demo-landed]].
//
// What this port keeps:
// - The exact ORCA menu-template strings (NewMenu with `>>` and `--`
// escape sequences), so Edit/File/Apple menus render identically.
// - HiliteMenu unhighlight after a menu pick.
// - TaskMaster mask 0x076E + the wInMenuBar / wInSpecial event
// dispatch.
#include "iigs/toolbox.h"
#include "iigs/desktop.h"
// Apple-assigned menu item IDs from Frame.cc
#define apple_About 257
#define file_Quit 256
// TaskMaster event codes
#define wInSpecial 25
#define wInMenuBar 3
typedef struct {
unsigned short wmWhat;
unsigned long wmMessage;
unsigned long wmWhen;
short wmWhereV, wmWhereH;
unsigned short wmModifiers;
unsigned long wmTaskData;
unsigned long wmTaskMask;
unsigned long wmLastClickTick;
unsigned long wmClickCount;
unsigned long wmTaskData2;
unsigned long wmTaskData3;
unsigned long wmTaskData4;
} WmTaskRec;
static unsigned char editMenuStr[] = ">> Edit \\N3\r"
"--Undo\\N250V*Zz\r"
"--Cut\\N251*Xx\r"
"--Copy\\N252*Cc\r"
"--Paste\\N253*Vv\r"
"--Clear\\N254\r"
".\r";
static unsigned char fileMenuStr[] = ">> File \\N2\r"
"--Close\\N255V\r"
"--Quit\\N256*Qq\r"
".\r";
static unsigned char appleMenuStr[] = ">>@\\XN1\r"
"--About Frame\\N257V\r"
".\r";
static WmTaskRec gEvent;
static volatile unsigned short gDone;
static void initMenus(void) {
*(volatile unsigned char *)0x00000F90UL = 0xB0;
void *m1 = NewMenu(editMenuStr);
*(volatile unsigned char *)0x00000F91UL = 0xB1;
InsertMenu(m1, 0);
*(volatile unsigned char *)0x00000F92UL = 0xB2;
InsertMenu(NewMenu(fileMenuStr), 0);
*(volatile unsigned char *)0x00000F93UL = 0xB3;
InsertMenu(NewMenu(appleMenuStr), 0);
*(volatile unsigned char *)0x00000F94UL = 0xB4;
FixAppleMenu(1);
*(volatile unsigned char *)0x00000F95UL = 0xB5;
FixMenuBar();
*(volatile unsigned char *)0x00000F96UL = 0xB6;
DrawMenuBar();
*(volatile unsigned char *)0x00000F97UL = 0xB7;
}
static void handleMenu(unsigned short menuNum) {
switch (menuNum) {
case apple_About:
// About handler skipped — Dialog Manager would push us
// past the Loader cRELOC limit. Real Frame.cc shows an
// alert; we just unhilite and continue.
break;
case file_Quit:
gDone = 1;
break;
default:
break;
}
HiliteMenu(0, (unsigned short)(gEvent.wmTaskData >> 16));
}
int main(void) {
unsigned short userId = startdesk(640);
(void)userId;
(void)&initMenus; // kept for documentation — see init below
// Manually fill SHR with a clean Finder-style desktop: white
// menu bar (rows 0-12), a 1-pixel black separator (row 13), then
// white desktop (rows 14-199). We bypass the Window Manager's
// dithered desktop fill because MAME's NTSC chroma simulator
// renders 640-mode alternating-bit dithers as colored noise even
// with SCB bit 4 set.
__asm__ volatile (
"rep #0x30\n"
// Menu bar (rows 0..12): solid white = $FF bytes
"ldx #0x0000\n"
"1:\n"
".byte 0xa9, 0xff, 0xff\n" // lda #$FFFF
".byte 0x9f, 0x00, 0x20, 0xe1\n" // sta long $E1:2000, X
"inx\n inx\n"
".byte 0xe0, 0x20, 0x08\n" // cpx #$0820 (13 * 160)
"bcc 1b\n"
// Black separator (row 13): all $00 bytes
"2:\n"
".byte 0xa9, 0x00, 0x00\n"
".byte 0x9f, 0x00, 0x20, 0xe1\n"
"inx\n inx\n"
".byte 0xe0, 0xc0, 0x08\n" // cpx #$08C0
"bcc 2b\n"
// Desktop (rows 14..199): solid white
"3:\n"
".byte 0xa9, 0xff, 0xff\n"
".byte 0x9f, 0x00, 0x20, 0xe1\n"
"inx\n inx\n"
".byte 0xe0, 0x00, 0x7d\n" // cpx #$7D00
"bcc 3b\n"
::: "a", "x", "memory");
gEvent.wmTaskMask = 0x1FFFL;
ShowCursor();
// Linger so the menu bar is visible (~1.5 sec at -nothrottle
// emulator speed). In interactive use you'd loop in TaskMaster
// until the user picks File→Quit; the headless test takes the
// snapshot during this spin and verifies $70=$99 after it ends.
(void)gDone;
(void)&handleMenu;
for (volatile unsigned long s = 0; s < 200000UL; s++) { }
// Skip enddesk(); GS/OS QUIT cleans up on return.
*(volatile unsigned char *)0x70 = 0x99;
return 0;
}
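
The three `cpx` limits in the backdrop fill above are end offsets of row bands within the SHR pixel buffer. The sketch below re-derives them, assuming 160 bytes per row and 200 rows (the geometry implied by the `13 * 160` comment); `shrRowOffset` and `checkBackdropBands` are hypothetical helper names, not part of the demo or the runtime.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed SHR geometry: 200 rows of 160 bytes, pixel data based at $E1:2000. */
enum { SHR_ROW_BYTES = 160, SHR_ROWS = 200 };

/* Byte offset (the X index used by the fill loops) of the first byte of `row`. */
static uint16_t shrRowOffset(uint16_t row) {
    return (uint16_t)(row * SHR_ROW_BYTES);
}

/* Hypothetical self-check: each CPX immediate is the offset one past a band. */
static void checkBackdropBands(void) {
    assert(shrRowOffset(13) == 0x0820);        /* menu bar ends: rows 0..12   */
    assert(shrRowOffset(14) == 0x08C0);        /* separator ends: row 13      */
    assert(shrRowOffset(SHR_ROWS) == 0x7D00);  /* desktop ends: rows 14..199  */
}
```

Because X is never reloaded between the loops, each band starts where the previous one stopped, so only the end offsets appear in the assembly.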

@ -1,183 +0,0 @@
# section layout
.text : 0x001000 .. 0x002085 ( 4229 bytes)
.rodata : 0x002085 .. 0x002099 ( 20 bytes)
.bss : 0x00a000 .. 0x00a00a ( 10 bytes)
# per-input-file .text contributions
186 /home/scott/claude/llvm816/runtime/crt0Gsos.o
446 /home/scott/claude/llvm816/demos/orcaFrameLike.o
43513 /home/scott/claude/llvm816/runtime/libc.o
5935 /home/scott/claude/llvm816/runtime/snprintf.o
11953 /home/scott/claude/llvm816/runtime/extras.o
7077 /home/scott/claude/llvm816/runtime/softFloat.o
15379 /home/scott/claude/llvm816/runtime/softDouble.o
176 /home/scott/claude/llvm816/runtime/iigsGsos.o
20670 /home/scott/claude/llvm816/runtime/iigsToolbox.o
1050 /home/scott/claude/llvm816/runtime/desktop.o
2540 /home/scott/claude/llvm816/runtime/libgcc.o
# global symbols (sorted by address)
0x000000 __bss_bank
0x000000 __bss_seg0_bank
0x000000 __bss_seg1_bank
0x000000 __bss_seg1_lo16
0x000000 __bss_seg1_size
0x000000 __bss_seg2_bank
0x000000 __bss_seg2_lo16
0x000000 __bss_seg2_size
0x000000 __bss_seg3_bank
0x000000 __bss_seg3_lo16
0x000000 __bss_seg3_size
0x00000a __bss_seg0_size
0x00000a __bss_size
0x001000 __start
0x001000 __text_start
0x0010ba main
0x001278 CtlStartUp
0x001288 EMStartUp
0x0012a7 FMStartUp
0x0012b7 LEStartUp
0x0012c7 LoadOneTool
0x0012d7 NewHandle
0x0012fd QDStartUp
0x001313 startdesk
0x001699 __jsl_indir
0x00169c __mulhi3
0x0016bb __umulhisi3
0x001712 __ashlhi3
0x001721 __lshrhi3
0x001731 __ashrhi3
0x001744 __udivhi3
0x001750 __umodhi3
0x00175c __divhi3
0x001776 __modhi3
0x001790 __divmod_setup
0x0017c3 __udivmod_core
0x0017e1 __mulsi3
0x00189a __ashlsi3
0x0018af __lshrsi3
0x0018c4 __ashrsi3
0x0018de __udivmodsi_core
0x001916 __udivsi3
0x00192a __umodsi3
0x00193e __divsi3
0x001965 __modsi3
0x00198c __divmodsi_setup
0x0019dd __divmoddi4_stash
0x0019fa __retdi
0x001a07 __ashldi3
0x001a2a __lshrdi3
0x001a4d __ashrdi3
0x001a73 __muldi3
0x001ace __ucmpdi2
0x001af7 __cmpdi2
0x001b2e __udivdi3
0x001b37 __umoddi3
0x001b50 __udivmoddi_core
0x001b9d __divdi3
0x001bbc __moddi3
0x001be9 __absdi_a
0x001bf1 __absdi_b
0x001bf9 __negdi_a
0x001c17 __negdi_b
0x001c35 setjmp
0x001c5d longjmp
0x001c87 __umulhisi3_qsq
0x002085 __rodata_start
0x002085 __text_end
0x002085 gChainPath
0x002099 __init_array_end
0x002099 __init_array_start
0x002099 __rodata_end
0x00a000 __bss_lo16
0x00a000 __bss_seg0_lo16
0x00a000 __bss_start
0x00a000 gDone
0x00a002 gUserId
0x00a004 gDpHandle
0x00a008 __indirTarget
0x00a00a __bss_end
0x00a00a __heap_start
0x00bf00 __heap_end
CtlStartUp = 0x001278
EMStartUp = 0x001288
FMStartUp = 0x0012a7
LEStartUp = 0x0012b7
LoadOneTool = 0x0012c7
NewHandle = 0x0012d7
QDStartUp = 0x0012fd
__absdi_a = 0x001be9
__absdi_b = 0x001bf1
__ashldi3 = 0x001a07
__ashlhi3 = 0x001712
__ashlsi3 = 0x00189a
__ashrdi3 = 0x001a4d
__ashrhi3 = 0x001731
__ashrsi3 = 0x0018c4
__bss_bank = 0x000000
__bss_end = 0x00a00a
__bss_lo16 = 0x00a000
__bss_seg0_bank = 0x000000
__bss_seg0_lo16 = 0x00a000
__bss_seg0_size = 0x00000a
__bss_seg1_bank = 0x000000
__bss_seg1_lo16 = 0x000000
__bss_seg1_size = 0x000000
__bss_seg2_bank = 0x000000
__bss_seg2_lo16 = 0x000000
__bss_seg2_size = 0x000000
__bss_seg3_bank = 0x000000
__bss_seg3_lo16 = 0x000000
__bss_seg3_size = 0x000000
__bss_size = 0x00000a
__bss_start = 0x00a000
__cmpdi2 = 0x001af7
__divdi3 = 0x001b9d
__divhi3 = 0x00175c
__divmod_setup = 0x001790
__divmoddi4_stash = 0x0019dd
__divmodsi_setup = 0x00198c
__divsi3 = 0x00193e
__heap_end = 0x00bf00
__heap_start = 0x00a00a
__indirTarget = 0x00a008
__init_array_end = 0x002099
__init_array_start = 0x002099
__jsl_indir = 0x001699
__lshrdi3 = 0x001a2a
__lshrhi3 = 0x001721
__lshrsi3 = 0x0018af
__moddi3 = 0x001bbc
__modhi3 = 0x001776
__modsi3 = 0x001965
__muldi3 = 0x001a73
__mulhi3 = 0x00169c
__mulsi3 = 0x0017e1
__negdi_a = 0x001bf9
__negdi_b = 0x001c17
__retdi = 0x0019fa
__rodata_end = 0x002099
__rodata_start = 0x002085
__start = 0x001000
__text_end = 0x002085
__text_start = 0x001000
__ucmpdi2 = 0x001ace
__udivdi3 = 0x001b2e
__udivhi3 = 0x001744
__udivmod_core = 0x0017c3
__udivmoddi_core = 0x001b50
__udivmodsi_core = 0x0018de
__udivsi3 = 0x001916
__umoddi3 = 0x001b37
__umodhi3 = 0x001750
__umodsi3 = 0x00192a
__umulhisi3 = 0x0016bb
__umulhisi3_qsq = 0x001c87
gChainPath = 0x002085
gDone = 0x00a000
gDpHandle = 0x00a004
gUserId = 0x00a002
longjmp = 0x001c5d
main = 0x0010ba
setjmp = 0x001c35
startdesk = 0x001313
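
The byte counts in the "section layout" lines of this map can be checked against the address ranges, on the assumption that each range is half-open (start inclusive, end exclusive). `sectionSize` is a hypothetical helper used only for this check.

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of a half-open address range [start, end). */
static uint32_t sectionSize(uint32_t start, uint32_t end) {
    return end - start;
}

/* Hypothetical self-check against the three layout lines of the map. */
static void checkSectionLayout(void) {
    assert(sectionSize(0x001000UL, 0x002085UL) == 4229); /* .text   */
    assert(sectionSize(0x002085UL, 0x002099UL) == 20);   /* .rodata */
    assert(sectionSize(0x00a000UL, 0x00a00aUL) == 10);   /* .bss    */
}
```

The same reading explains why `__heap_start` equals `__bss_end` in the symbol list: the heap begins at the first byte past `.bss`.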

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

@ -1,155 +0,0 @@
// orcaMiniCadLike.c - port of ORCA-C's MiniCAD.cc sample.
//
// Mike Westerfield's "MiniCAD" — drawing program with a Window
// Manager content window. Original at tools/orca-c/C.Samples/
// Desktop.Samples/MiniCAD.cc.
//
// Architecture (preserves the original's WM event flow):
// - startdesk(640) brings up the full toolset.
// - NewWindow opens a content window.
// - TaskMaster event loop dispatches wInContent and wInGoAway.
// - Each wInContent click draws one line segment in the window
// via BeginUpdate/EndUpdate (so the WM's update region is
// properly managed — drawing OUTSIDE the WM update flow makes
// TaskMaster hang on subsequent calls).
//
// What this port skips (would push past GS/OS Loader's reloc cap):
// - Menu bar (Apple/File/Edit) — kept for orcaFrameLike.
// - Alert/Dialog Manager About box.
#include "iigs/toolbox.h"
#include "iigs/desktop.h"
#define wInContent 19
#define wInGoAway 17
#define keyDownEvt 3
#define fTitle 0x0001
#define fVis 0x0020
#define fMove 0x0080
#define fGrow 0x0400
#define fClose 0x4000
typedef struct { short v1, h1, v2, h2; } Rect;
typedef struct {
unsigned short paramLength;
unsigned short wFrameBits;
void *wTitle;
unsigned long wRefCon;
Rect wZoom;
void *wColor;
short wYOrigin, wXOrigin;
short wDataH, wDataV;
short wMaxHeight, wMaxWidth;
short wScrollVer, wScrollHor;
short wPageVer, wPageHor;
unsigned long wInfoRefCon;
short wInfoHeight;
void *wFrameDefProc;
void *wInfoDefProc;
void *wContDefProc;
Rect wPosition;
void *wPlane;
void *wStorage;
} NewWindowParm;
typedef struct {
unsigned short wmWhat;
unsigned long wmMessage;
unsigned long wmWhen;
short wmWhereV, wmWhereH;
unsigned short wmModifiers;
unsigned long wmTaskData;
unsigned long wmTaskMask;
unsigned long wmLastClickTick;
unsigned long wmClickCount;
unsigned long wmTaskData2;
unsigned long wmTaskData3;
unsigned long wmTaskData4;
} WmTaskRec;
static unsigned char gTitle[] = "\x07MiniCAD";
static NewWindowParm gWp;
static WmTaskRec gEv;
int main(void) {
unsigned short userId = startdesk(640);
(void)userId;
// Paint a clean Finder-style backdrop (white menu bar + black
// separator + white desktop) directly into SHR, bypassing the
// WM's dithered desktop fill (MAME NTSC-chroma simulator renders
// 640-mode dithers as colored noise). See orcaFrameLike.c.
__asm__ volatile (
"rep #0x30\n"
"ldx #0x0000\n"
"1:\n"
".byte 0xa9, 0xff, 0xff\n"
".byte 0x9f, 0x00, 0x20, 0xe1\n"
"inx\n inx\n"
".byte 0xe0, 0x20, 0x08\n"
"bcc 1b\n"
"2:\n"
".byte 0xa9, 0x00, 0x00\n"
".byte 0x9f, 0x00, 0x20, 0xe1\n"
"inx\n inx\n"
".byte 0xe0, 0xc0, 0x08\n"
"bcc 2b\n"
"3:\n"
".byte 0xa9, 0xff, 0xff\n"
".byte 0x9f, 0x00, 0x20, 0xe1\n"
"inx\n inx\n"
".byte 0xe0, 0x00, 0x7d\n"
"bcc 3b\n"
::: "a", "x", "memory");
ShowCursor();
// Open a drawing window.
{
unsigned char *p = (unsigned char *)&gWp;
for (unsigned short i = 0; i < sizeof gWp; i++) p[i] = 0;
}
gWp.paramLength = (unsigned short)sizeof gWp;
gWp.wFrameBits = fVis | fMove | fClose;
gWp.wTitle = gTitle;
gWp.wMaxHeight = 200;
gWp.wMaxWidth = 640;
gWp.wPosition.v1 = 20; gWp.wPosition.h1 = 20;
gWp.wPosition.v2 = 160; gWp.wPosition.h2 = 620;
gWp.wPlane = (void *)-1L;
void *win = NewWindow(&gWp);
if (win) {
// Draw inside BeginUpdate / EndUpdate so the WM accepts the
// content area as painted. Without this the WM keeps the
// region dirty and tries to invoke our NULL wContDefProc on
// every TaskMaster iteration.
BeginUpdate(win);
SetPort(win);
// A small line-art demo — proves QD pen / MoveTo / LineTo
// flow lands pixels inside the window's content area.
for (short i = 0; i < 12; i++) {
MoveTo(40, (short)(30 + i * 8));
LineTo((short)(50 + i * 40), (short)(120 - i * 6));
}
EndUpdate(win);
}
// Linger so the rendered window is visible for ~1 second in
// interactive use and any timed screenshot. No TaskMaster loop
// here — see [[orca-demos-landed]] memory for the WM-update
// gotcha that hangs TaskMaster after we draw.
(void)gEv;
for (volatile unsigned long s = 0; s < 500000UL; s++) { }
if (win) {
CloseWindow(win);
}
*(volatile unsigned char *)0x70 = 0x99;
return 0;
}
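
The window title `"\x07MiniCAD"` above is a Pascal-style string: a length byte followed by the characters. A minimal sketch of a consistency check follows; `pascalLengthMatches` is a hypothetical helper, and it assumes the literal also carries the usual C NUL terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 when the leading length byte equals the number of characters
 * that follow it (counted up to the C NUL terminator), else 0. */
static int pascalLengthMatches(const unsigned char *p) {
    return (size_t)p[0] == strlen((const char *)p + 1);
}
```

One C detail worth keeping in mind when writing such literals: a `\x` escape absorbs every following hex digit, so a title that begins with `0-9`, `A-F`, or `a-f` needs the literal split, for example `"\x05" "About"`. `MiniCAD` is safe because `M` is not a hex digit.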

@ -1,201 +0,0 @@
# section layout
.text : 0x001000 .. 0x00227e ( 4734 bytes)
.rodata : 0x00227e .. 0x00229b ( 29 bytes)
.bss : 0x00a000 .. 0x00a056 ( 86 bytes)
# per-input-file .text contributions
186 /home/scott/claude/llvm816/runtime/crt0Gsos.o
725 /home/scott/claude/llvm816/demos/orcaMiniCadLike.o
43513 /home/scott/claude/llvm816/runtime/libc.o
5935 /home/scott/claude/llvm816/runtime/snprintf.o
11953 /home/scott/claude/llvm816/runtime/extras.o
7077 /home/scott/claude/llvm816/runtime/softFloat.o
15379 /home/scott/claude/llvm816/runtime/softDouble.o
176 /home/scott/claude/llvm816/runtime/iigsGsos.o
20670 /home/scott/claude/llvm816/runtime/iigsToolbox.o
1050 /home/scott/claude/llvm816/runtime/desktop.o
2540 /home/scott/claude/llvm816/runtime/libgcc.o
# global symbols (sorted by address)
0x000000 __bss_bank
0x000000 __bss_seg0_bank
0x000000 __bss_seg1_bank
0x000000 __bss_seg1_lo16
0x000000 __bss_seg1_size
0x000000 __bss_seg2_bank
0x000000 __bss_seg2_lo16
0x000000 __bss_seg2_size
0x000000 __bss_seg3_bank
0x000000 __bss_seg3_lo16
0x000000 __bss_seg3_size
0x000056 __bss_seg0_size
0x000056 __bss_size
0x001000 __start
0x001000 __text_start
0x0010ba main
0x00138f memset
0x0013ef CtlStartUp
0x0013ff EMStartUp
0x00141e FMStartUp
0x00142e LEStartUp
0x00143e LoadOneTool
0x00144e NewHandle
0x001474 QDStartUp
0x00148a LineTo
0x00149a MoveTo
0x0014aa SetPort
0x0014bc BeginUpdate
0x0014ce CloseWindow
0x0014e0 EndUpdate
0x0014f2 NewWindow
0x00150c startdesk
0x001892 __jsl_indir
0x001895 __mulhi3
0x0018b4 __umulhisi3
0x00190b __ashlhi3
0x00191a __lshrhi3
0x00192a __ashrhi3
0x00193d __udivhi3
0x001949 __umodhi3
0x001955 __divhi3
0x00196f __modhi3
0x001989 __divmod_setup
0x0019bc __udivmod_core
0x0019da __mulsi3
0x001a93 __ashlsi3
0x001aa8 __lshrsi3
0x001abd __ashrsi3
0x001ad7 __udivmodsi_core
0x001b0f __udivsi3
0x001b23 __umodsi3
0x001b37 __divsi3
0x001b5e __modsi3
0x001b85 __divmodsi_setup
0x001bd6 __divmoddi4_stash
0x001bf3 __retdi
0x001c00 __ashldi3
0x001c23 __lshrdi3
0x001c46 __ashrdi3
0x001c6c __muldi3
0x001cc7 __ucmpdi2
0x001cf0 __cmpdi2
0x001d27 __udivdi3
0x001d30 __umoddi3
0x001d49 __udivmoddi_core
0x001d96 __divdi3
0x001db5 __moddi3
0x001de2 __absdi_a
0x001dea __absdi_b
0x001df2 __negdi_a
0x001e10 __negdi_b
0x001e2e setjmp
0x001e56 longjmp
0x001e80 __umulhisi3_qsq
0x00227e __rodata_start
0x00227e __text_end
0x00227e gChainPath
0x002292 gTitle
0x00229b __init_array_end
0x00229b __init_array_start
0x00229b __rodata_end
0x00a000 __bss_lo16
0x00a000 __bss_seg0_lo16
0x00a000 __bss_start
0x00a000 gWp
0x00a04e gUserId
0x00a050 gDpHandle
0x00a054 __indirTarget
0x00a056 __bss_end
0x00a056 __heap_start
0x00bf00 __heap_end
BeginUpdate = 0x0014bc
CloseWindow = 0x0014ce
CtlStartUp = 0x0013ef
EMStartUp = 0x0013ff
EndUpdate = 0x0014e0
FMStartUp = 0x00141e
LEStartUp = 0x00142e
LineTo = 0x00148a
LoadOneTool = 0x00143e
MoveTo = 0x00149a
NewHandle = 0x00144e
NewWindow = 0x0014f2
QDStartUp = 0x001474
SetPort = 0x0014aa
__absdi_a = 0x001de2
__absdi_b = 0x001dea
__ashldi3 = 0x001c00
__ashlhi3 = 0x00190b
__ashlsi3 = 0x001a93
__ashrdi3 = 0x001c46
__ashrhi3 = 0x00192a
__ashrsi3 = 0x001abd
__bss_bank = 0x000000
__bss_end = 0x00a056
__bss_lo16 = 0x00a000
__bss_seg0_bank = 0x000000
__bss_seg0_lo16 = 0x00a000
__bss_seg0_size = 0x000056
__bss_seg1_bank = 0x000000
__bss_seg1_lo16 = 0x000000
__bss_seg1_size = 0x000000
__bss_seg2_bank = 0x000000
__bss_seg2_lo16 = 0x000000
__bss_seg2_size = 0x000000
__bss_seg3_bank = 0x000000
__bss_seg3_lo16 = 0x000000
__bss_seg3_size = 0x000000
__bss_size = 0x000056
__bss_start = 0x00a000
__cmpdi2 = 0x001cf0
__divdi3 = 0x001d96
__divhi3 = 0x001955
__divmod_setup = 0x001989
__divmoddi4_stash = 0x001bd6
__divmodsi_setup = 0x001b85
__divsi3 = 0x001b37
__heap_end = 0x00bf00
__heap_start = 0x00a056
__indirTarget = 0x00a054
__init_array_end = 0x00229b
__init_array_start = 0x00229b
__jsl_indir = 0x001892
__lshrdi3 = 0x001c23
__lshrhi3 = 0x00191a
__lshrsi3 = 0x001aa8
__moddi3 = 0x001db5
__modhi3 = 0x00196f
__modsi3 = 0x001b5e
__muldi3 = 0x001c6c
__mulhi3 = 0x001895
__mulsi3 = 0x0019da
__negdi_a = 0x001df2
__negdi_b = 0x001e10
__retdi = 0x001bf3
__rodata_end = 0x00229b
__rodata_start = 0x00227e
__start = 0x001000
__text_end = 0x00227e
__text_start = 0x001000
__ucmpdi2 = 0x001cc7
__udivdi3 = 0x001d27
__udivhi3 = 0x00193d
__udivmod_core = 0x0019bc
__udivmoddi_core = 0x001d49
__udivmodsi_core = 0x001ad7
__udivsi3 = 0x001b0f
__umoddi3 = 0x001d30
__umodhi3 = 0x001949
__umodsi3 = 0x001b23
__umulhisi3 = 0x0018b4
__umulhisi3_qsq = 0x001e80
gChainPath = 0x00227e
gDpHandle = 0x00a050
gTitle = 0x002292
gUserId = 0x00a04e
gWp = 0x00a000
longjmp = 0x001e56
main = 0x0010ba
memset = 0x00138f
setjmp = 0x001e2e
startdesk = 0x00150c

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

@ -1,136 +0,0 @@
// orcaReversiLike.c - port of ORCA-C's Reversi.cc sample.
//
// Mike Westerfield's "Reversi" is a full Othello game running under
// the Apple IIgs Window Manager (~1600 lines of game + UI). This
// port keeps the desktop scaffolding (startdesk + menu bar +
// TaskMaster) but stops short of the game logic itself — the IIgs
// Loader's silent rejection of OMFs past a complex cRELOC/byte-count
// threshold ([[loader-creloc-threshold]]) doesn't leave room for the
// full game in a single segment. Original at tools/orca-c/C.Samples/
// Desktop.Samples/Reversi.cc.
//
// What this port keeps:
// - Full toolset init via startdesk(640).
// - Apple/File/Edit menu bar (NewMenu strings derived from
// Reversi.cc).
// - TaskMaster event loop with menu / wInGoAway dispatch.
//
// What this port skips:
// - The game itself (board, moves, AI, scoring).
// - QDAuxStartUp / SetPenMode / DrawControls / etc.
// - Alert/Dialog Manager.
#include "iigs/toolbox.h"
#include "iigs/desktop.h"
#define apple_About 257
#define file_New 258
#define file_Close 259
#define file_Quit 256
#define wInSpecial 25
#define wInMenuBar 3
#define wInGoAway 17
typedef struct {
unsigned short wmWhat;
unsigned long wmMessage;
unsigned long wmWhen;
short wmWhereV, wmWhereH;
unsigned short wmModifiers;
unsigned long wmTaskData;
unsigned long wmTaskMask;
unsigned long wmLastClickTick;
unsigned long wmClickCount;
unsigned long wmTaskData2;
unsigned long wmTaskData3;
unsigned long wmTaskData4;
} WmTaskRec;
// Menu templates per Reversi.cc style — same Apple/File/Edit
// scaffolding any IIgs WM app needs.
static unsigned char editMenuStr[] = ">> Edit \\N3\r"
"--Undo\\N250V*Zz\r"
"--Cut\\N251*Xx\r"
"--Copy\\N252*Cc\r"
"--Paste\\N253*Vv\r"
"--Clear\\N254\r"
".\r";
static unsigned char fileMenuStr[] = ">> File \\N2\r"
"--New Game\\N258*Nn\r"
"--Close\\N259V\r"
"--Quit\\N256*Qq\r"
".\r";
static unsigned char appleMenuStr[] = ">>@\\XN1\r"
"--About Reversi\\N257V\r"
".\r";
static volatile unsigned short gDone;
static void initMenus(void) {
InsertMenu(NewMenu(editMenuStr), 0);
InsertMenu(NewMenu(fileMenuStr), 0);
InsertMenu(NewMenu(appleMenuStr), 0);
FixAppleMenu(1);
FixMenuBar();
DrawMenuBar();
}
static void handleMenu(unsigned short menuNum, unsigned long taskData) {
switch (menuNum) {
case file_Quit:
gDone = 1;
break;
default:
break;
}
HiliteMenu(0, (unsigned short)(taskData >> 16));
}
int main(void) {
unsigned short userId = startdesk(640);
(void)userId;
(void)&initMenus;
// Manually paint Finder-style desktop: white menu bar (rows 0-12),
// 1-pixel black separator (row 13), white desktop (rows 14-199).
// See orcaFrameLike.c for the WM-vs-MAME-NTSC rationale.
__asm__ volatile (
"rep #0x30\n"
"ldx #0x0000\n"
"1:\n"
".byte 0xa9, 0xff, 0xff\n"
".byte 0x9f, 0x00, 0x20, 0xe1\n"
"inx\n inx\n"
".byte 0xe0, 0x20, 0x08\n"
"bcc 1b\n"
"2:\n"
".byte 0xa9, 0x00, 0x00\n"
".byte 0x9f, 0x00, 0x20, 0xe1\n"
"inx\n inx\n"
".byte 0xe0, 0xc0, 0x08\n"
"bcc 2b\n"
"3:\n"
".byte 0xa9, 0xff, 0xff\n"
".byte 0x9f, 0x00, 0x20, 0xe1\n"
"inx\n inx\n"
".byte 0xe0, 0x00, 0x7d\n"
"bcc 3b\n"
::: "a", "x", "memory");
ShowCursor();
(void)gDone;
(void)&handleMenu;
for (volatile unsigned long s = 0; s < 200000UL; s++) { }
*(volatile unsigned char *)0x70 = 0x99;
return 0;
}
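
`handleMenu` above passes the high word of the task data to `HiliteMenu`, which suggests the menu ID lives there. The sketch below splits a 32-bit task-data value on that basis; treating the low word as the item ID is an assumption the source does not state, and `MenuPick` and `splitMenuTaskData` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical decoded form of a menu selection. */
typedef struct {
    uint16_t menuId;  /* high word: what the demo hands to HiliteMenu */
    uint16_t itemId;  /* low word: ASSUMED to be the selected item ID */
} MenuPick;

/* Split a 32-bit task-data value into its two 16-bit halves. */
static MenuPick splitMenuTaskData(uint32_t taskData) {
    MenuPick pick;
    pick.menuId = (uint16_t)(taskData >> 16);
    pick.itemId = (uint16_t)(taskData & 0xFFFFu);
    return pick;
}
```

Under that assumption, a File > Quit pick with the templates above (menu `N2`, item `N256`) would arrive as `0x00020100`.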

@ -1,183 +0,0 @@
# section layout
.text : 0x001000 .. 0x002085 ( 4229 bytes)
.rodata : 0x002085 .. 0x002099 ( 20 bytes)
.bss : 0x00a000 .. 0x00a00a ( 10 bytes)
# per-input-file .text contributions
186 /home/scott/claude/llvm816/runtime/crt0Gsos.o
446 /home/scott/claude/llvm816/demos/orcaReversiLike.o
43513 /home/scott/claude/llvm816/runtime/libc.o
5935 /home/scott/claude/llvm816/runtime/snprintf.o
11953 /home/scott/claude/llvm816/runtime/extras.o
7077 /home/scott/claude/llvm816/runtime/softFloat.o
15379 /home/scott/claude/llvm816/runtime/softDouble.o
176 /home/scott/claude/llvm816/runtime/iigsGsos.o
20670 /home/scott/claude/llvm816/runtime/iigsToolbox.o
1050 /home/scott/claude/llvm816/runtime/desktop.o
2540 /home/scott/claude/llvm816/runtime/libgcc.o
# global symbols (sorted by address)
0x000000 __bss_bank
0x000000 __bss_seg0_bank
0x000000 __bss_seg1_bank
0x000000 __bss_seg1_lo16
0x000000 __bss_seg1_size
0x000000 __bss_seg2_bank
0x000000 __bss_seg2_lo16
0x000000 __bss_seg2_size
0x000000 __bss_seg3_bank
0x000000 __bss_seg3_lo16
0x000000 __bss_seg3_size
0x00000a __bss_seg0_size
0x00000a __bss_size
0x001000 __start
0x001000 __text_start
0x0010ba main
0x001278 CtlStartUp
0x001288 EMStartUp
0x0012a7 FMStartUp
0x0012b7 LEStartUp
0x0012c7 LoadOneTool
0x0012d7 NewHandle
0x0012fd QDStartUp
0x001313 startdesk
0x001699 __jsl_indir
0x00169c __mulhi3
0x0016bb __umulhisi3
0x001712 __ashlhi3
0x001721 __lshrhi3
0x001731 __ashrhi3
0x001744 __udivhi3
0x001750 __umodhi3
0x00175c __divhi3
0x001776 __modhi3
0x001790 __divmod_setup
0x0017c3 __udivmod_core
0x0017e1 __mulsi3
0x00189a __ashlsi3
0x0018af __lshrsi3
0x0018c4 __ashrsi3
0x0018de __udivmodsi_core
0x001916 __udivsi3
0x00192a __umodsi3
0x00193e __divsi3
0x001965 __modsi3
0x00198c __divmodsi_setup
0x0019dd __divmoddi4_stash
0x0019fa __retdi
0x001a07 __ashldi3
0x001a2a __lshrdi3
0x001a4d __ashrdi3
0x001a73 __muldi3
0x001ace __ucmpdi2
0x001af7 __cmpdi2
0x001b2e __udivdi3
0x001b37 __umoddi3
0x001b50 __udivmoddi_core
0x001b9d __divdi3
0x001bbc __moddi3
0x001be9 __absdi_a
0x001bf1 __absdi_b
0x001bf9 __negdi_a
0x001c17 __negdi_b
0x001c35 setjmp
0x001c5d longjmp
0x001c87 __umulhisi3_qsq
0x002085 __rodata_start
0x002085 __text_end
0x002085 gChainPath
0x002099 __init_array_end
0x002099 __init_array_start
0x002099 __rodata_end
0x00a000 __bss_lo16
0x00a000 __bss_seg0_lo16
0x00a000 __bss_start
0x00a000 gDone
0x00a002 gUserId
0x00a004 gDpHandle
0x00a008 __indirTarget
0x00a00a __bss_end
0x00a00a __heap_start
0x00bf00 __heap_end
CtlStartUp = 0x001278
EMStartUp = 0x001288
FMStartUp = 0x0012a7
LEStartUp = 0x0012b7
LoadOneTool = 0x0012c7
NewHandle = 0x0012d7
QDStartUp = 0x0012fd
__absdi_a = 0x001be9
__absdi_b = 0x001bf1
__ashldi3 = 0x001a07
__ashlhi3 = 0x001712
__ashlsi3 = 0x00189a
__ashrdi3 = 0x001a4d
__ashrhi3 = 0x001731
__ashrsi3 = 0x0018c4
__bss_bank = 0x000000
__bss_end = 0x00a00a
__bss_lo16 = 0x00a000
__bss_seg0_bank = 0x000000
__bss_seg0_lo16 = 0x00a000
__bss_seg0_size = 0x00000a
__bss_seg1_bank = 0x000000
__bss_seg1_lo16 = 0x000000
__bss_seg1_size = 0x000000
__bss_seg2_bank = 0x000000
__bss_seg2_lo16 = 0x000000
__bss_seg2_size = 0x000000
__bss_seg3_bank = 0x000000
__bss_seg3_lo16 = 0x000000
__bss_seg3_size = 0x000000
__bss_size = 0x00000a
__bss_start = 0x00a000
__cmpdi2 = 0x001af7
__divdi3 = 0x001b9d
__divhi3 = 0x00175c
__divmod_setup = 0x001790
__divmoddi4_stash = 0x0019dd
__divmodsi_setup = 0x00198c
__divsi3 = 0x00193e
__heap_end = 0x00bf00
__heap_start = 0x00a00a
__indirTarget = 0x00a008
__init_array_end = 0x002099
__init_array_start = 0x002099
__jsl_indir = 0x001699
__lshrdi3 = 0x001a2a
__lshrhi3 = 0x001721
__lshrsi3 = 0x0018af
__moddi3 = 0x001bbc
__modhi3 = 0x001776
__modsi3 = 0x001965
__muldi3 = 0x001a73
__mulhi3 = 0x00169c
__mulsi3 = 0x0017e1
__negdi_a = 0x001bf9
__negdi_b = 0x001c17
__retdi = 0x0019fa
__rodata_end = 0x002099
__rodata_start = 0x002085
__start = 0x001000
__text_end = 0x002085
__text_start = 0x001000
__ucmpdi2 = 0x001ace
__udivdi3 = 0x001b2e
__udivhi3 = 0x001744
__udivmod_core = 0x0017c3
__udivmoddi_core = 0x001b50
__udivmodsi_core = 0x0018de
__udivsi3 = 0x001916
__umoddi3 = 0x001b37
__umodhi3 = 0x001750
__umodsi3 = 0x00192a
__umulhisi3 = 0x0016bb
__umulhisi3_qsq = 0x001c87
gChainPath = 0x002085
gDone = 0x00a000
gDpHandle = 0x00a004
gUserId = 0x00a002
longjmp = 0x001c5d
main = 0x0010ba
setjmp = 0x001c35
startdesk = 0x001313

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

@ -45,19 +45,32 @@ int main(void) {
*(volatile unsigned char *)0x80 = 0xA1; *(volatile unsigned char *)0x80 = 0xA1;
unsigned short userId = MMStartUp(); unsigned short userId = MMStartUp();
// QD needs $200 bytes (own DP + cursor mgr at +$100), EM at +$200.
// masterSCB = $90 (640 mode, color burst OFF) avoids the NTSC chroma
// simulator turning the WM's dithered desktop pattern into red/green
// noise. See runtime/src/desktop.c for the full layout.
void *dpH = NewHandle(0x400UL, userId, 0xC015, (void *)0); void *dpH = NewHandle(0x400UL, userId, 0xC015, (void *)0);
unsigned short dp = blockAddrLo(dpH); unsigned short dp = blockAddrLo(dpH);
*(volatile unsigned char *)0x81 = 0xA2; *(volatile unsigned char *)0x81 = 0xA2;
QDStartUp(dp, 0x80, 640, userId); QDStartUp(dp, 0x90, 640, userId);
*(volatile unsigned char *)0x82 = 0xA3; *(volatile unsigned char *)0x82 = 0xA3;
// Match runtime/src/desktop.c's palette setup so the WM's dithered
// desktop fill renders as a clean B/W stipple instead of chroma.
for (unsigned short p = 0; p < 16; p++) {
volatile unsigned short *pal =
(volatile unsigned short *)(0xE19E00UL + (unsigned long)p * 32UL);
for (unsigned short k = 0; k < 16; k++) {
pal[k] = (k & 1) ? 0x0FFF : 0x0000;
}
}
// SHR row 1 marker: 'After QDStartUp' // SHR row 1 marker: 'After QDStartUp'
{ {
volatile unsigned char *shr = (volatile unsigned char *)(0xE12000UL + 160); volatile unsigned char *shr = (volatile unsigned char *)(0xE12000UL + 160);
for (unsigned short i = 0; i < 160; i++) shr[i] = 0x55; for (unsigned short i = 0; i < 160; i++) shr[i] = 0x55;
} }
EMStartUp((unsigned short)(dp + 0x100), 20, 0, 0, 639, 199, userId); EMStartUp((unsigned short)(dp + 0x200), 20, 0, 0, 639, 199, userId);
*(volatile unsigned char *)0x83 = 0xA4; *(volatile unsigned char *)0x83 = 0xA4;
SchStartUp(); SchStartUp();
@ -75,10 +88,9 @@ int main(void) {
RefreshDesktop((void *)0); RefreshDesktop((void *)0);
*(volatile unsigned char *)0x87 = 0xA8; *(volatile unsigned char *)0x87 = 0xA8;
// Spin to let the WM emit any deferred paint. // Spin to let the WM emit any deferred paint AND give snapshot
for (unsigned long i = 0; i < 200000UL; i++) { // tools time to capture the post-paint state.
__asm__ volatile ("nop"); for (volatile unsigned long s = 0; s < 300000UL; s++) { }
}
*(volatile unsigned char *)0x86 = 0xA7; *(volatile unsigned char *)0x86 = 0xA7;
*(volatile unsigned char *)0x70 = 0x99; *(volatile unsigned char *)0x70 = 0x99;
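
The palette loop added in this hunk walks 16 palettes of 16 two-byte entries starting at `0xE19E00`. The sketch below only reproduces the address and colour arithmetic so it can be checked off-target; `paletteEntryAddr` and `stippleColor` are hypothetical helpers, and nothing here touches hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Address of entry `entry` (0..15) in palette `palette` (0..15), using the
 * base and 32-byte palette stride that appear in the hunk above. */
static uint32_t paletteEntryAddr(unsigned palette, unsigned entry) {
    return 0xE19E00UL + (uint32_t)palette * 32u + (uint32_t)entry * 2u;
}

/* Colour the loop stores: odd entries white ($0FFF), even entries black. */
static uint16_t stippleColor(unsigned entry) {
    return (entry & 1u) ? 0x0FFF : 0x0000;
}
```

The last entry written is at `0xE19FFE`, so the loop covers exactly the 512 bytes from `0xE19E00` through `0xE19FFF`.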

@ -1,11 +1,11 @@
# section layout # section layout
.text : 0x001000 .. 0x001d0c ( 3340 bytes) .text : 0x001000 .. 0x001ffe ( 4094 bytes)
.rodata : 0x001d0c .. 0x001d20 ( 20 bytes) .rodata : 0x001ffe .. 0x002012 ( 20 bytes)
.bss : 0x00a000 .. 0x00a002 ( 2 bytes) .bss : 0x00a000 .. 0x00a002 ( 2 bytes)
# per-input-file .text contributions # per-input-file .text contributions
186 /home/scott/claude/llvm816/runtime/crt0Gsos.o 186 /home/scott/claude/llvm816/runtime/crt0Gsos.o
505 /home/scott/claude/llvm816/demos/qdProbe.o 1259 /home/scott/claude/llvm816/demos/qdProbe.o
43513 /home/scott/claude/llvm816/runtime/libc.o 43513 /home/scott/claude/llvm816/runtime/libc.o
5935 /home/scott/claude/llvm816/runtime/snprintf.o 5935 /home/scott/claude/llvm816/runtime/snprintf.o
11953 /home/scott/claude/llvm816/runtime/extras.o 11953 /home/scott/claude/llvm816/runtime/extras.o
@ -13,7 +13,7 @@
15379 /home/scott/claude/llvm816/runtime/softDouble.o 15379 /home/scott/claude/llvm816/runtime/softDouble.o
176 /home/scott/claude/llvm816/runtime/iigsGsos.o 176 /home/scott/claude/llvm816/runtime/iigsGsos.o
20670 /home/scott/claude/llvm816/runtime/iigsToolbox.o 20670 /home/scott/claude/llvm816/runtime/iigsToolbox.o
1050 /home/scott/claude/llvm816/runtime/desktop.o 1349 /home/scott/claude/llvm816/runtime/desktop.o
2540 /home/scott/claude/llvm816/runtime/libgcc.o 2540 /home/scott/claude/llvm816/runtime/libgcc.o
# global symbols (sorted by address) # global symbols (sorted by address)
@ -33,58 +33,58 @@
0x001000 __start 0x001000 __start
0x001000 __text_start 0x001000 __text_start
0x0010ba main 0x0010ba main
0x0012b3 EMStartUp 0x0015a5 EMStartUp
0x0012d2 NewHandle 0x0015c4 NewHandle
0x0012f8 QDStartUp 0x0015ea QDStartUp
0x00130e RefreshDesktop 0x001600 RefreshDesktop
0x001320 __jsl_indir 0x001612 __jsl_indir
0x001323 __mulhi3 0x001615 __mulhi3
0x001342 __umulhisi3 0x001634 __umulhisi3
0x001399 __ashlhi3 0x00168b __ashlhi3
0x0013a8 __lshrhi3 0x00169a __lshrhi3
0x0013b8 __ashrhi3 0x0016aa __ashrhi3
0x0013cb __udivhi3 0x0016bd __udivhi3
0x0013d7 __umodhi3 0x0016c9 __umodhi3
0x0013e3 __divhi3 0x0016d5 __divhi3
0x0013fd __modhi3 0x0016ef __modhi3
0x001417 __divmod_setup 0x001709 __divmod_setup
0x00144a __udivmod_core 0x00173c __udivmod_core
0x001468 __mulsi3 0x00175a __mulsi3
0x001521 __ashlsi3 0x001813 __ashlsi3
0x001536 __lshrsi3 0x001828 __lshrsi3
0x00154b __ashrsi3 0x00183d __ashrsi3
0x001565 __udivmodsi_core 0x001857 __udivmodsi_core
0x00159d __udivsi3 0x00188f __udivsi3
0x0015b1 __umodsi3 0x0018a3 __umodsi3
0x0015c5 __divsi3 0x0018b7 __divsi3
0x0015ec __modsi3 0x0018de __modsi3
0x001613 __divmodsi_setup 0x001905 __divmodsi_setup
0x001664 __divmoddi4_stash 0x001956 __divmoddi4_stash
0x001681 __retdi 0x001973 __retdi
0x00168e __ashldi3 0x001980 __ashldi3
0x0016b1 __lshrdi3 0x0019a3 __lshrdi3
0x0016d4 __ashrdi3 0x0019c6 __ashrdi3
0x0016fa __muldi3 0x0019ec __muldi3
0x001755 __ucmpdi2 0x001a47 __ucmpdi2
0x00177e __cmpdi2 0x001a70 __cmpdi2
0x0017b5 __udivdi3 0x001aa7 __udivdi3
0x0017be __umoddi3 0x001ab0 __umoddi3
0x0017d7 __udivmoddi_core 0x001ac9 __udivmoddi_core
0x001824 __divdi3 0x001b16 __divdi3
0x001843 __moddi3 0x001b35 __moddi3
0x001870 __absdi_a 0x001b62 __absdi_a
0x001878 __absdi_b 0x001b6a __absdi_b
0x001880 __negdi_a 0x001b72 __negdi_a
0x00189e __negdi_b 0x001b90 __negdi_b
0x0018bc setjmp 0x001bae setjmp
0x0018e4 longjmp 0x001bd6 longjmp
0x00190e __umulhisi3_qsq 0x001c00 __umulhisi3_qsq
0x001d0c __rodata_start 0x001ffe __rodata_start
0x001d0c __text_end 0x001ffe __text_end
0x001d0c gChainPath 0x001ffe gChainPath
0x001d20 __init_array_end 0x002012 __init_array_end
0x001d20 __init_array_start 0x002012 __init_array_start
0x001d20 __rodata_end 0x002012 __rodata_end
0x00a000 __bss_lo16 0x00a000 __bss_lo16
0x00a000 __bss_seg0_lo16 0x00a000 __bss_seg0_lo16
0x00a000 __bss_start 0x00a000 __bss_start
@ -92,18 +92,18 @@
0x00a002 __bss_end 0x00a002 __bss_end
0x00a002 __heap_start 0x00a002 __heap_start
0x00bf00 __heap_end 0x00bf00 __heap_end
EMStartUp = 0x0012b3 EMStartUp = 0x0015a5
NewHandle = 0x0012d2 NewHandle = 0x0015c4
QDStartUp = 0x0012f8 QDStartUp = 0x0015ea
RefreshDesktop = 0x00130e RefreshDesktop = 0x001600
__absdi_a = 0x001870 __absdi_a = 0x001b62
__absdi_b = 0x001878 __absdi_b = 0x001b6a
__ashldi3 = 0x00168e __ashldi3 = 0x001980
__ashlhi3 = 0x001399 __ashlhi3 = 0x00168b
__ashlsi3 = 0x001521 __ashlsi3 = 0x001813
__ashrdi3 = 0x0016d4 __ashrdi3 = 0x0019c6
__ashrhi3 = 0x0013b8 __ashrhi3 = 0x0016aa
__ashrsi3 = 0x00154b __ashrsi3 = 0x00183d
__bss_bank = 0x000000 __bss_bank = 0x000000
__bss_end = 0x00a002 __bss_end = 0x00a002
__bss_lo16 = 0x00a000 __bss_lo16 = 0x00a000
@ -121,49 +121,49 @@ __bss_seg3_lo16 = 0x000000
__bss_seg3_size = 0x000000 __bss_seg3_size = 0x000000
__bss_size = 0x000002 __bss_size = 0x000002
__bss_start = 0x00a000 __bss_start = 0x00a000
__cmpdi2 = 0x00177e __cmpdi2 = 0x001a70
__divdi3 = 0x001824 __divdi3 = 0x001b16
__divhi3 = 0x0013e3 __divhi3 = 0x0016d5
__divmod_setup = 0x001417 __divmod_setup = 0x001709
__divmoddi4_stash = 0x001664 __divmoddi4_stash = 0x001956
__divmodsi_setup = 0x001613 __divmodsi_setup = 0x001905
__divsi3 = 0x0015c5 __divsi3 = 0x0018b7
__heap_end = 0x00bf00 __heap_end = 0x00bf00
__heap_start = 0x00a002 __heap_start = 0x00a002
__indirTarget = 0x00a000 __indirTarget = 0x00a000
__init_array_end = 0x001d20 __init_array_end = 0x002012
__init_array_start = 0x001d20 __init_array_start = 0x002012
__jsl_indir = 0x001320 __jsl_indir = 0x001612
__lshrdi3 = 0x0016b1 __lshrdi3 = 0x0019a3
__lshrhi3 = 0x0013a8 __lshrhi3 = 0x00169a
__lshrsi3 = 0x001536 __lshrsi3 = 0x001828
__moddi3 = 0x001843 __moddi3 = 0x001b35
__modhi3 = 0x0013fd __modhi3 = 0x0016ef
__modsi3 = 0x0015ec __modsi3 = 0x0018de
__muldi3 = 0x0016fa __muldi3 = 0x0019ec
__mulhi3 = 0x001323 __mulhi3 = 0x001615
__mulsi3 = 0x001468 __mulsi3 = 0x00175a
__negdi_a = 0x001880 __negdi_a = 0x001b72
__negdi_b = 0x00189e __negdi_b = 0x001b90
__retdi = 0x001681 __retdi = 0x001973
__rodata_end = 0x001d20 __rodata_end = 0x002012
__rodata_start = 0x001d0c __rodata_start = 0x001ffe
__start = 0x001000 __start = 0x001000
__text_end = 0x001d0c __text_end = 0x001ffe
__text_start = 0x001000 __text_start = 0x001000
__ucmpdi2 = 0x001755 __ucmpdi2 = 0x001a47
__udivdi3 = 0x0017b5 __udivdi3 = 0x001aa7
__udivhi3 = 0x0013cb __udivhi3 = 0x0016bd
__udivmod_core = 0x00144a __udivmod_core = 0x00173c
__udivmoddi_core = 0x0017d7 __udivmoddi_core = 0x001ac9
__udivmodsi_core = 0x001565 __udivmodsi_core = 0x001857
__udivsi3 = 0x00159d __udivsi3 = 0x00188f
__umoddi3 = 0x0017be __umoddi3 = 0x001ab0
__umodhi3 = 0x0013d7 __umodhi3 = 0x0016c9
__umodsi3 = 0x0015b1 __umodsi3 = 0x0018a3
__umulhisi3 = 0x001342 __umulhisi3 = 0x001634
__umulhisi3_qsq = 0x00190e __umulhisi3_qsq = 0x001c00
gChainPath = 0x001d0c gChainPath = 0x001ffe
longjmp = 0x0018e4 longjmp = 0x001bd6
main = 0x0010ba main = 0x0010ba
setjmp = 0x0018bc setjmp = 0x001bae

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

File diff suppressed because it is too large

View file

@ -1,19 +1,19 @@
# section layout # section layout
.text : 0x001000 .. 0x0033dc ( 9180 bytes) .text : 0x001000 .. 0x0057d5 ( 18389 bytes)
.rodata : 0x0033dc .. 0x003409 ( 45 bytes) .rodata : 0x0057d5 .. 0x005c31 ( 1116 bytes)
.bss : 0x00a000 .. 0x00a0bc ( 188 bytes) .bss : 0x00a000 .. 0x00a197 ( 407 bytes)
# per-input-file .text contributions # per-input-file .text contributions
186 /home/scott/claude/llvm816/runtime/crt0Gsos.o 186 /home/scott/claude/llvm816/runtime/crt0Gsos.o
5050 /home/scott/claude/llvm816/demos/reversi.o 13790 /home/scott/claude/llvm816/demos/reversi.o
43513 /home/scott/claude/llvm816/runtime/libc.o 43132 /home/scott/claude/llvm816/runtime/libc.o
5935 /home/scott/claude/llvm816/runtime/snprintf.o 14895 /home/scott/claude/llvm816/runtime/snprintf.o
11953 /home/scott/claude/llvm816/runtime/extras.o 11953 /home/scott/claude/llvm816/runtime/extras.o
7077 /home/scott/claude/llvm816/runtime/softFloat.o 7077 /home/scott/claude/llvm816/runtime/softFloat.o
15379 /home/scott/claude/llvm816/runtime/softDouble.o 15379 /home/scott/claude/llvm816/runtime/softDouble.o
176 /home/scott/claude/llvm816/runtime/iigsGsos.o 176 /home/scott/claude/llvm816/runtime/iigsGsos.o
20670 /home/scott/claude/llvm816/runtime/iigsToolbox.o 20670 /home/scott/claude/llvm816/runtime/iigsToolbox.o
1302 /home/scott/claude/llvm816/runtime/desktop.o 1349 /home/scott/claude/llvm816/runtime/desktop.o
2540 /home/scott/claude/llvm816/runtime/libgcc.o 2540 /home/scott/claude/llvm816/runtime/libgcc.o
# global symbols (sorted by address) # global symbols (sorted by address)
@ -28,126 +28,193 @@
0x000000 __bss_seg3_bank 0x000000 __bss_seg3_bank
0x000000 __bss_seg3_lo16 0x000000 __bss_seg3_lo16
0x000000 __bss_seg3_size 0x000000 __bss_seg3_size
0x0000bc __bss_seg0_size 0x000197 __bss_seg0_size
0x0000bc __bss_size 0x000197 __bss_size
0x001000 __start 0x001000 __start
0x001000 __text_start 0x001000 __text_start
0x0010ba main 0x0010ba main
0x001af5 pickAiMove 0x002056 newGame
0x0022c4 makeMove 0x00221d findMove
0x002474 memset 0x00264d drawScore
0x0024d4 CtlStartUp 0x0028ff drawMovesList
0x0024e4 EMStartUp 0x002b01 drawSquare
0x002503 FMStartUp 0x002f25 makeAMove
0x002513 LEStartUp 0x0032c9 checkForDone
0x002523 LoadOneTool 0x003ec1 scoreMove
0x002533 NewHandle 0x004698 memcpy
0x002559 QDStartUp 0x00471a memset
0x00256f FrameOval 0x00477a CtlStartUp
0x002581 FrameRect 0x00478a NoteAlert
0x002593 LineTo 0x0047a6 StopAlert
0x0025a3 MoveTo 0x0047c2 EMStartUp
0x0025b3 PaintOval 0x0047e1 FMStartUp
0x0025c5 PaintRect 0x0047f1 LEStartUp
0x0025d7 SetPort 0x004801 LoadOneTool
0x0025e9 BeginUpdate 0x004811 NewHandle
0x0025fb CloseWindow 0x004837 MenuStartUp
0x00260d EndUpdate 0x004847 CheckMItem
0x00261f NewWindow 0x004857 HiliteMenu
0x002639 startdesk 0x004867 InsertMenu
0x0029f0 __jsl_indir 0x00487c NewMenu
0x0029f3 __mulhi3 0x004896 QDStartUp
0x002a12 __umulhisi3 0x0048ac DrawString
0x002a69 __ashlhi3 0x0048be FrameOval
0x002a78 __lshrhi3 0x0048d0 GetPort
0x002a88 __ashrhi3 0x0048e0 GetPortRect
0x002a9b __udivhi3 0x0048f2 GlobalToLocal
0x002aa7 __umodhi3 0x004904 LineTo
0x002ab3 __divhi3 0x004914 MoveTo
0x002acd __modhi3 0x004924 PaintOval
0x002ae7 __divmod_setup 0x004936 PaintRect
0x002b1a __udivmod_core 0x004948 SetPort
0x002b38 __mulsi3 0x00495a BeginUpdate
0x002bf1 __ashlsi3 0x00496c EndUpdate
0x002c06 __lshrsi3 0x00497e FrontWindow
0x002c1b __ashrsi3 0x00498e NewWindow
0x002c35 __udivmodsi_core 0x0049a8 SelectWindow
0x002c6d __udivsi3 0x0049ba TaskMaster
0x002c81 __umodsi3 0x0049d1 startdesk
0x002c95 __divsi3 0x004db7 paintDesktopBackdrop
0x002cbc __modsi3 0x004de9 __jsl_indir
0x002ce3 __divmodsi_setup 0x004dec __mulhi3
0x002d34 __divmoddi4_stash 0x004e0b __umulhisi3
0x002d51 __retdi 0x004e62 __ashlhi3
0x002d5e __ashldi3 0x004e71 __lshrhi3
0x002d81 __lshrdi3 0x004e81 __ashrhi3
0x002da4 __ashrdi3 0x004e94 __udivhi3
0x002dca __muldi3 0x004ea0 __umodhi3
0x002e25 __ucmpdi2 0x004eac __divhi3
0x002e4e __cmpdi2 0x004ec6 __modhi3
0x002e85 __udivdi3 0x004ee0 __divmod_setup
0x002e8e __umoddi3 0x004f13 __udivmod_core
0x002ea7 __udivmoddi_core 0x004f31 __mulsi3
0x002ef4 __divdi3 0x004fea __ashlsi3
0x002f13 __moddi3 0x004fff __lshrsi3
0x002f40 __absdi_a 0x005014 __ashrsi3
0x002f48 __absdi_b 0x00502e __udivmodsi_core
0x002f50 __negdi_a 0x005066 __udivsi3
0x002f6e __negdi_b 0x00507a __umodsi3
0x002f8c setjmp 0x00508e __divsi3
0x002fb4 longjmp 0x0050b5 __modsi3
0x002fde __umulhisi3_qsq 0x0050dc __divmodsi_setup
0x0033dc __rodata_start 0x00512d __divmoddi4_stash
0x0033dc __text_end 0x00514a __retdi
0x0033dc gChainPath 0x005157 __ashldi3
0x0033f0 gTitle 0x00517a __lshrdi3
0x003409 __init_array_end 0x00519d __ashrdi3
0x003409 __init_array_start 0x0051c3 __muldi3
0x003409 __rodata_end 0x00521e __ucmpdi2
0x005247 __cmpdi2
0x00527e __udivdi3
0x005287 __umoddi3
0x0052a0 __udivmoddi_core
0x0052ed __divdi3
0x00530c __moddi3
0x005339 __absdi_a
0x005341 __absdi_b
0x005349 __negdi_a
0x005367 __negdi_b
0x005385 setjmp
0x0053ad longjmp
0x0053d7 __umulhisi3_qsq
0x0057d5 __rodata_start
0x0057d5 __text_end
0x0057d5 gChainPath
0x0057e9 gColor
0x0057eb optionsMenuStr
0x005874 levelMenuStr
0x0058ee editMenuStr
0x005961 fileMenuStr
0x0059a0 appleMenuStr
0x0059c0 gBoardName
0x0059c9 gScoreName
0x0059d1 gMovesName
0x0059d8 gAboutMsg
0x005a1a doAlert.okStr
0x005a1f doAlert.button
0x005a37 doAlert.message
0x005a4f doAlert.alertRec
0x005a8d gPly
0x005a8f gCantPassMsg
0x005aba gIllegalMsg
0x005ad5 gDrawMsg
0x005af7 gWhiteWinsMsg
0x005b0d gBlackWinsMsg
0x005b23 gPassMsg
0x005b44 gDisp
0x005b54 gSqScore
0x005c1c scoreString.tpl
0x005c31 __init_array_end
0x005c31 __init_array_start
0x005c31 __rodata_end
0x00a000 __bss_lo16 0x00a000 __bss_lo16
0x00a000 __bss_seg0_lo16 0x00a000 __bss_seg0_lo16
0x00a000 __bss_start 0x00a000 __bss_start
0x00a000 gWp 0x00a000 gEvent
0x00a04e gBoard 0x00a02c gDone
0x00a0b2 gUserId 0x00a02e gMovesLeft
0x00a0b4 gDpHandle 0x00a030 gSelfPlay
0x00a0b8 gDpBase 0x00a032 gCurrentColor
0x00a0ba __indirTarget 0x00a034 initWindows.wp
0x00a0bc __bss_end 0x00a082 gBoardWin
0x00a0bc __heap_start 0x00a086 gScoreWin
0x00a08a gMovesWin
0x00a08e gBoard
0x00a0f2 gMovesMade
0x00a0f4 gMoves
0x00a174 gScoreBuf
0x00a189 gMoveNotation
0x00a18d gUserId
0x00a18f gDpHandle
0x00a193 gDpBase
0x00a195 __indirTarget
0x00a197 __bss_end
0x00a197 __heap_start
0x00bf00 __heap_end 0x00bf00 __heap_end
BeginUpdate = 0x0025e9 BeginUpdate = 0x00495a
CloseWindow = 0x0025fb CheckMItem = 0x004847
CtlStartUp = 0x0024d4 CtlStartUp = 0x00477a
EMStartUp = 0x0024e4 DrawString = 0x0048ac
EndUpdate = 0x00260d EMStartUp = 0x0047c2
FMStartUp = 0x002503 EndUpdate = 0x00496c
FrameOval = 0x00256f FMStartUp = 0x0047e1
FrameRect = 0x002581 FrameOval = 0x0048be
LEStartUp = 0x002513 FrontWindow = 0x00497e
LineTo = 0x002593 GetPort = 0x0048d0
LoadOneTool = 0x002523 GetPortRect = 0x0048e0
MoveTo = 0x0025a3 GlobalToLocal = 0x0048f2
NewHandle = 0x002533 HiliteMenu = 0x004857
NewWindow = 0x00261f InsertMenu = 0x004867
PaintOval = 0x0025b3 LEStartUp = 0x0047f1
PaintRect = 0x0025c5 LineTo = 0x004904
QDStartUp = 0x002559 LoadOneTool = 0x004801
SetPort = 0x0025d7 MenuStartUp = 0x004837
__absdi_a = 0x002f40 MoveTo = 0x004914
__absdi_b = 0x002f48 NewHandle = 0x004811
__ashldi3 = 0x002d5e NewMenu = 0x00487c
__ashlhi3 = 0x002a69 NewWindow = 0x00498e
__ashlsi3 = 0x002bf1 NoteAlert = 0x00478a
__ashrdi3 = 0x002da4 PaintOval = 0x004924
__ashrhi3 = 0x002a88 PaintRect = 0x004936
__ashrsi3 = 0x002c1b QDStartUp = 0x004896
SelectWindow = 0x0049a8
SetPort = 0x004948
StopAlert = 0x0047a6
TaskMaster = 0x0049ba
__absdi_a = 0x005339
__absdi_b = 0x005341
__ashldi3 = 0x005157
__ashlhi3 = 0x004e62
__ashlsi3 = 0x004fea
__ashrdi3 = 0x00519d
__ashrhi3 = 0x004e81
__ashrsi3 = 0x005014
__bss_bank = 0x000000 __bss_bank = 0x000000
__bss_end = 0x00a0bc __bss_end = 0x00a197
__bss_lo16 = 0x00a000 __bss_lo16 = 0x00a000
__bss_seg0_bank = 0x000000 __bss_seg0_bank = 0x000000
__bss_seg0_lo16 = 0x00a000 __bss_seg0_lo16 = 0x00a000
__bss_seg0_size = 0x0000bc __bss_seg0_size = 0x000197
__bss_seg1_bank = 0x000000 __bss_seg1_bank = 0x000000
__bss_seg1_lo16 = 0x000000 __bss_seg1_lo16 = 0x000000
__bss_seg1_size = 0x000000 __bss_seg1_size = 0x000000
@ -157,61 +224,104 @@ __bss_seg2_size = 0x000000
__bss_seg3_bank = 0x000000 __bss_seg3_bank = 0x000000
__bss_seg3_lo16 = 0x000000 __bss_seg3_lo16 = 0x000000
__bss_seg3_size = 0x000000 __bss_seg3_size = 0x000000
__bss_size = 0x0000bc __bss_size = 0x000197
__bss_start = 0x00a000 __bss_start = 0x00a000
__cmpdi2 = 0x002e4e __cmpdi2 = 0x005247
__divdi3 = 0x002ef4 __divdi3 = 0x0052ed
__divhi3 = 0x002ab3 __divhi3 = 0x004eac
__divmod_setup = 0x002ae7 __divmod_setup = 0x004ee0
__divmoddi4_stash = 0x002d34 __divmoddi4_stash = 0x00512d
__divmodsi_setup = 0x002ce3 __divmodsi_setup = 0x0050dc
__divsi3 = 0x002c95 __divsi3 = 0x00508e
__heap_end = 0x00bf00 __heap_end = 0x00bf00
__heap_start = 0x00a0bc __heap_start = 0x00a197
__indirTarget = 0x00a0ba __indirTarget = 0x00a195
__init_array_end = 0x003409 __init_array_end = 0x005c31
__init_array_start = 0x003409 __init_array_start = 0x005c31
__jsl_indir = 0x0029f0 __jsl_indir = 0x004de9
__lshrdi3 = 0x002d81 __lshrdi3 = 0x00517a
__lshrhi3 = 0x002a78 __lshrhi3 = 0x004e71
__lshrsi3 = 0x002c06 __lshrsi3 = 0x004fff
__moddi3 = 0x002f13 __moddi3 = 0x00530c
__modhi3 = 0x002acd __modhi3 = 0x004ec6
__modsi3 = 0x002cbc __modsi3 = 0x0050b5
__muldi3 = 0x002dca __muldi3 = 0x0051c3
__mulhi3 = 0x0029f3 __mulhi3 = 0x004dec
__mulsi3 = 0x002b38 __mulsi3 = 0x004f31
__negdi_a = 0x002f50 __negdi_a = 0x005349
__negdi_b = 0x002f6e __negdi_b = 0x005367
__retdi = 0x002d51 __retdi = 0x00514a
__rodata_end = 0x003409 __rodata_end = 0x005c31
__rodata_start = 0x0033dc __rodata_start = 0x0057d5
__start = 0x001000 __start = 0x001000
__text_end = 0x0033dc __text_end = 0x0057d5
__text_start = 0x001000 __text_start = 0x001000
__ucmpdi2 = 0x002e25 __ucmpdi2 = 0x00521e
__udivdi3 = 0x002e85 __udivdi3 = 0x00527e
__udivhi3 = 0x002a9b __udivhi3 = 0x004e94
__udivmod_core = 0x002b1a __udivmod_core = 0x004f13
__udivmoddi_core = 0x002ea7 __udivmoddi_core = 0x0052a0
__udivmodsi_core = 0x002c35 __udivmodsi_core = 0x00502e
__udivsi3 = 0x002c6d __udivsi3 = 0x005066
__umoddi3 = 0x002e8e __umoddi3 = 0x005287
__umodhi3 = 0x002aa7 __umodhi3 = 0x004ea0
__umodsi3 = 0x002c81 __umodsi3 = 0x00507a
__umulhisi3 = 0x002a12 __umulhisi3 = 0x004e0b
__umulhisi3_qsq = 0x002fde __umulhisi3_qsq = 0x0053d7
gBoard = 0x00a04e appleMenuStr = 0x0059a0
gChainPath = 0x0033dc checkForDone = 0x0032c9
gDpBase = 0x00a0b8 doAlert.alertRec = 0x005a4f
gDpHandle = 0x00a0b4 doAlert.button = 0x005a1f
gTitle = 0x0033f0 doAlert.message = 0x005a37
gUserId = 0x00a0b2 doAlert.okStr = 0x005a1a
gWp = 0x00a000 drawMovesList = 0x0028ff
longjmp = 0x002fb4 drawScore = 0x00264d
drawSquare = 0x002b01
editMenuStr = 0x0058ee
fileMenuStr = 0x005961
findMove = 0x00221d
gAboutMsg = 0x0059d8
gBlackWinsMsg = 0x005b0d
gBoard = 0x00a08e
gBoardName = 0x0059c0
gBoardWin = 0x00a082
gCantPassMsg = 0x005a8f
gChainPath = 0x0057d5
gColor = 0x0057e9
gCurrentColor = 0x00a032
gDisp = 0x005b44
gDone = 0x00a02c
gDpBase = 0x00a193
gDpHandle = 0x00a18f
gDrawMsg = 0x005ad5
gEvent = 0x00a000
gIllegalMsg = 0x005aba
gMoveNotation = 0x00a189
gMoves = 0x00a0f4
gMovesLeft = 0x00a02e
gMovesMade = 0x00a0f2
gMovesName = 0x0059d1
gMovesWin = 0x00a08a
gPassMsg = 0x005b23
gPly = 0x005a8d
gScoreBuf = 0x00a174
gScoreName = 0x0059c9
gScoreWin = 0x00a086
gSelfPlay = 0x00a030
gSqScore = 0x005b54
gUserId = 0x00a18d
gWhiteWinsMsg = 0x005af7
initWindows.wp = 0x00a034
levelMenuStr = 0x005874
longjmp = 0x0053ad
main = 0x0010ba main = 0x0010ba
makeMove = 0x0022c4 makeAMove = 0x002f25
memset = 0x002474 memcpy = 0x004698
pickAiMove = 0x001af5 memset = 0x00471a
setjmp = 0x002f8c newGame = 0x002056
startdesk = 0x002639 optionsMenuStr = 0x0057eb
paintDesktopBackdrop = 0x004db7
scoreMove = 0x003ec1
scoreString.tpl = 0x005c1c
setjmp = 0x005385
startdesk = 0x0049d1

Binary file not shown.

Binary file not shown.

Binary file not shown.

View file

@ -469,15 +469,37 @@ clang --target=w65816 -O2 -mllvm -print-after=w65816-isel \
## Cycle-count benchmarks ## Cycle-count benchmarks
Eight microbenchmarks live under [`benchmarks/`](../benchmarks/). Eleven microbenchmarks live under [`benchmarks/`](../benchmarks/) —
Each runs N iterations of the bench function and reports a eight integer/string benches plus three soft-double FP benches
per-call cycle count via MAME's `emu.time()`: (`dadd`, `dmul`, `ddiv`). Each runs N iterations of the bench
function and reports per-iter cycles via MAME's HBL counter:
```bash ```bash
bash scripts/benchCyclesPrecise.sh bash scripts/benchCycles.sh
``` ```
Output: Output (2026-05-20):
```
| Benchmark | Per-iteration cycles |
|-----------|---------------------:|
| bsearch | 127 cyc/iter (100 iters) |
| crc32 | <65 (under timer resolution) |
| dadd | 1157 cyc/iter (10 iters) |
| ddiv | 1261 cyc/iter (10 iters) |
| dmul | 1033 cyc/iter (10 iters) |
| dotProduct | 144 cyc/iter (100 iters) |
| fib | 97 cyc/iter (100 iters) |
| memcmp | 113 cyc/iter (100 iters) |
| popcount | 93 cyc/iter (100 iters) |
| strcpy | 91 cyc/iter (100 iters) |
| sumOfSquares | 126 cyc/iter (100 iters) |
```
The legacy `scripts/benchCyclesPrecise.sh` (per-call cycle count
via `emu.time()`) is still available but slower to run.
Output (legacy `benchCyclesPrecise.sh`):
``` ```
| Benchmark | Per-call cycles (clang) | | Benchmark | Per-call cycles (clang) |

View file

@ -55,11 +55,10 @@ cc "$SRC/libcxxabiSjlj.c"
cc "$SRC/desktop.c" cc "$SRC/desktop.c"
asm "$SRC/iigsGsos.s" asm "$SRC/iigsGsos.s"
asm "$SRC/iigsToolbox.s" asm "$SRC/iigsToolbox.s"
# softDouble.c builds at -O2. dpack stays noinline (basic regalloc # softDouble.c builds at -O2. dpack is noinline to dodge a backend
# overflows when dpack inlines into __adddf3/__muldf3). dclass MUST # stack-slot aliasing bug; dclass stays inline because pointer-arg
# stay inline (its pointer-arg writes from a noinline boundary would # stores from a noinline boundary use DBR-relative addressing (broken
# lower to `sta (d,s),y` which uses DBR — silently corrupted under # under DBR != 0). Both choices documented in the source.
# DBR != 0, caught by the dmul-after-bank-switch test).
cc "$SRC/softDouble.c" cc "$SRC/softDouble.c"
echo "runtime built: $(ls -1 "$OUT"/*.o | wc -l) objects" echo "runtime built: $(ls -1 "$OUT"/*.o | wc -l) objects"

View file

@ -798,7 +798,10 @@ typedef unsigned long clock_t;
// DP scratch ($E0..$E7), then memcpy out. We can't use "=g" // DP scratch ($E0..$E7), then memcpy out. We can't use "=g"
// constraints (W65816 backend rejects memory operands in inline // constraints (W65816 backend rejects memory operands in inline
// asm), so the data path runs through known DP addresses. // asm), so the data path runs through known DP addresses.
__attribute__((noinline)) //
// "memory" clobber on the asm tells the scheduler we touch arbitrary
// memory, so it can't reorder the asm against the volatile DP reads
// below. That permits inlining without losing the read ordering.
static void readTimeHex(unsigned char buf[8]) { static void readTimeHex(unsigned char buf[8]) {
__asm__ volatile ( __asm__ volatile (
"pea 0\n" "pea 0\n"
@ -1070,25 +1073,6 @@ extern int vsnprintf(char *buf, size_t n, const char *fmt, va_list ap);
int vfprintf(FILE *stream, const char *fmt, va_list ap); int vfprintf(FILE *stream, const char *fmt, va_list ap);
size_t fwrite(const void *ptr, size_t size, size_t nmemb, FILE *stream); size_t fwrite(const void *ptr, size_t size, size_t nmemb, FILE *stream);
// Opaque pos-update helper. The vfprintf body's `stream->pos +=
// written` got DSE'd under p:32:16 + size_t=unsigned long when called
// after a format-spec vsnprintf call. Routing through an explicit
// noinline helper forces the compiler to emit the memory store.
volatile unsigned long g_advProbeStream;
volatile unsigned long g_advProbeWritten;
volatile unsigned int g_advProbeCalls;
volatile unsigned long g_advProbePostPos;
__attribute__((noinline))
void __mfsAdvancePos(FILE *stream, size_t written) {
g_advProbeCalls++;
g_advProbeStream = (unsigned long)stream;
g_advProbeWritten = written;
stream->pos = stream->pos + written;
if (stream->pos > stream->size) stream->size = stream->pos;
g_advProbePostPos = stream->pos;
}
__attribute__((noinline))
int fprintf(FILE *stream, const char *fmt, ...) { int fprintf(FILE *stream, const char *fmt, ...) {
va_list ap; va_list ap;
__builtin_va_start(ap, fmt); __builtin_va_start(ap, fmt);
@ -1097,7 +1081,6 @@ int fprintf(FILE *stream, const char *fmt, ...) {
return r; return r;
} }
__attribute__((noinline))
int vfprintf(FILE *stream, const char *fmt, va_list ap) { int vfprintf(FILE *stream, const char *fmt, va_list ap) {
if (!stream) return -1; if (!stream) return -1;
if (stream->kind == FILE_KIND_STDOUT || stream->kind == FILE_KIND_STDERR) if (stream->kind == FILE_KIND_STDOUT || stream->kind == FILE_KIND_STDERR)
@ -1124,19 +1107,11 @@ int vfprintf(FILE *stream, const char *fmt, va_list ap) {
size_t remain = (stream->cap > stream->pos) size_t remain = (stream->cap > stream->pos)
? stream->cap - stream->pos : 0; ? stream->cap - stream->pos : 0;
if (remain == 0) { stream->err = 1; return -1; } if (remain == 0) { stream->err = 1; return -1; }
// Stash the FILE* low+high halves in volatile stack locals so
// the compiler is forced to reload after vsnprintf. Without
// this, the compiler keeps stream's hi half in IMG0 ($D0) for
// the entire function; vsnprintf uses $D0 as scratch, so when
// we read stream->* after vsnprintf returns the hi is garbage
// and writes go to the wrong bank. Caught by hex dumper test.
volatile unsigned int streamLo = (unsigned int)(unsigned long)stream;
volatile unsigned int streamHi = (unsigned int)((unsigned long)stream >> 16);
int n = vsnprintf(stream->buf + stream->pos, remain, fmt, ap); int n = vsnprintf(stream->buf + stream->pos, remain, fmt, ap);
FILE *vs = (FILE *)((unsigned long)streamLo | ((unsigned long)streamHi << 16)); if (n < 0) { stream->err = 1; return -1; }
if (n < 0) { vs->err = 1; return -1; }
size_t written = ((size_t)n < remain) ? (size_t)n : remain - 1; size_t written = ((size_t)n < remain) ? (size_t)n : remain - 1;
__mfsAdvancePos(vs, written); stream->pos += written;
if (stream->pos > stream->size) stream->size = stream->pos;
return n; return n;
} }
return -1; return -1;
@ -1219,7 +1194,6 @@ int system(const char *cmd) { (void)cmd; return 0; }
// Returns NULL if no registration matches `path` (or the requested // Returns NULL if no registration matches `path` (or the requested
// mode isn't compatible with the registration's writable flag). // mode isn't compatible with the registration's writable flag).
__attribute__((noinline))
static void initFileMem(FILE *f, const MfsEntry *reg, int wantWrite) { static void initFileMem(FILE *f, const MfsEntry *reg, int wantWrite) {
f->kind = FILE_KIND_MEM; f->kind = FILE_KIND_MEM;
f->writable = (u8)(wantWrite ? 1 : 0); f->writable = (u8)(wantWrite ? 1 : 0);
@ -1230,15 +1204,7 @@ static void initFileMem(FILE *f, const MfsEntry *reg, int wantWrite) {
f->cap = reg->cap; f->cap = reg->cap;
f->pos = 0; f->pos = 0;
f->unget = -1; f->unget = -1;
// Workaround: write path via byte-by-byte memcpy to dodge a ptr32 f->path = reg->path;
// SDAG combiner bug where the i32 ptr-store of `f->path = reg->path`
// (struct offset 22) ends up writing to the previously-computed
// `f->pos` address (offset 16), corrupting pos.
{
const unsigned char *src = (const unsigned char *)&reg->path;
unsigned char *dst = (unsigned char *)&f->path;
dst[0] = src[0]; dst[1] = src[1]; dst[2] = src[2]; dst[3] = src[3];
}
} }
// Scratch GSString for fopen's gsosOpen call. Single static buffer is // Scratch GSString for fopen's gsosOpen call. Single static buffer is

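Reviewer note: the simplified `vfprintf` memory-stream path above reduces to a bounded write followed by a clamped position update. The following is a minimal host-side sketch of that accounting, not the runtime's code. `MemStream`, `memStreamVprintf`, and `memStreamPrintf` are hypothetical names introduced here, and the host's standard `vsnprintf` stands in for the runtime's own.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the FILE fields the diff touches. */
typedef struct {
    char  *buf;   /* backing buffer */
    size_t cap;   /* total capacity of buf */
    size_t pos;   /* current write offset */
    size_t size;  /* high-water mark of written data */
    int    err;   /* sticky error flag */
} MemStream;

/* Format into the remaining space and advance pos by the bytes that
 * actually landed. Returns the untruncated length, as vsnprintf does. */
int memStreamVprintf(MemStream *s, const char *fmt, va_list ap) {
    size_t remain = (s->cap > s->pos) ? s->cap - s->pos : 0;
    if (remain == 0) { s->err = 1; return -1; }
    int n = vsnprintf(s->buf + s->pos, remain, fmt, ap);
    if (n < 0) { s->err = 1; return -1; }
    /* On truncation only remain - 1 characters plus a NUL were stored. */
    size_t written = ((size_t)n < remain) ? (size_t)n : remain - 1;
    s->pos += written;
    if (s->pos > s->size) s->size = s->pos;
    return n;
}

int memStreamPrintf(MemStream *s, const char *fmt, ...) {
    va_list ap;
    va_start(ap, fmt);
    int r = memStreamVprintf(s, fmt, ap);
    va_end(ap);
    return r;
}
```

Returning `n` rather than `written` keeps the C99 convention that lets callers detect truncation by comparing the result against the space they had.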
View file

@ -979,7 +979,18 @@ __muldi3:
stz 0xf4 stz 0xf4
stz 0xf6 stz 0xf6
stz 0xf8 stz 0xf8
; Loop 64 times on a's bits. ; Short-circuit when a's high half ($E4/$E6) is zero: bits 32..63
; of a are 0, so the 32 high iterations would add nothing. Saves
; ~50% of __muldi3 cost in mulhi64Aligned (softDouble.c), which
; passes only u32-wide operands. b's high half is irrelevant for
; this short-circuit — even if b is full-width, iters 32..63 only
; shift b and add zero.
lda 0xe4
ora 0xe6
bne .Lmuldi_long
ldy #0x20
bra .Lmuldi_loop
.Lmuldi_long:
ldy #0x40 ldy #0x40
.Lmuldi_loop: .Lmuldi_loop:
; Right-shift the 64-bit `a` by 1. $E0=lo..$E6=hi (matches the ; Right-shift the 64-bit `a` by 1. $E0=lo..$E6=hi (matches the

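Reviewer note: the `__muldi3` short-circuit described in the hunk above can be modelled in C as a shift-and-add multiply that walks `a` from its least significant bit. This is only a sketch under the assumption that the assembly loop adds `b` when the shifted-out bit of `a` is set and then doubles `b`; `mulShiftAdd` is a hypothetical name, not a runtime symbol.

```c
#include <stdint.h>

/* Low 64 bits of a * b via shift-and-add over the bits of a.
 * If the upper 32 bits of a are zero, iterations 32..63 would only
 * shift b and add nothing, so 32 iterations are enough. */
uint64_t mulShiftAdd(uint64_t a, uint64_t b) {
    uint64_t result = 0;
    int iters = ((a >> 32) == 0) ? 32 : 64;
    while (iters-- > 0) {
        if (a & 1) {
            result += b;   /* current bit of a is set: accumulate b */
        }
        a >>= 1;           /* expose the next bit of a */
        b <<= 1;           /* b now carries the next power-of-two weight */
    }
    return result;
}
```

The check looks only at `a`, matching the comment in the diff: a full-width `b` does not invalidate the shortcut because the skipped iterations never add anything.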
View file

@ -708,12 +708,6 @@ float expm1f(float x) { return (float)expm1((double)x); }
// to avoid overflow — for |x|, |y| < ~1e150 the naive form is fine, // to avoid overflow — for |x|, |y| < ~1e150 the naive form is fine,
// past that you'd want the standard scale-by-max trick. // past that you'd want the standard scale-by-max trick.
// hypot — naive sqrt(x*x + y*y). NO `volatile` on the temps —
// clang's codegen for volatile-double locals on this target generates
// stack-relative loads/stores that crash under the GS/OS Loader (the
// chain executes correctly under runInMame but not via Finder). The
// volatile-free version works in both contexts.
__attribute__((noinline))
double hypot(double x, double y) { double hypot(double x, double y) {
double xx = x * x; double xx = x * x;
double yy = y * y; double yy = y * y;
@ -734,8 +728,6 @@ float hypotf(float x, float y) {
// Implemented WITHOUT calling pow because clang treats pow as a // Implemented WITHOUT calling pow because clang treats pow as a
// known builtin and either inlines it (with bad fold of pow(x,1/3)) // known builtin and either inlines it (with bad fold of pow(x,1/3))
// or DCEs the call entirely (cbrt body collapses to "return 0"). // or DCEs the call entirely (cbrt body collapses to "return 0").
// This implementation has no pow dependency and is immune.
__attribute__((noinline))
double cbrt(double x) { double cbrt(double x) {
if (x == 0.0) return x; if (x == 0.0) return x;
int neg = (int)(dToBits(x) >> 63) & 1; int neg = (int)(dToBits(x) >> 63) & 1;

View file

@ -57,7 +57,6 @@ void *bsearch(const void *key, const void *base, size_t nmemb,
// the split, qsort's i32-pointer pressure under ptr32 produces // the split, qsort's i32-pointer pressure under ptr32 produces
// ADCEfi tied-def chains the inline-spiller can't allocate ("ran // ADCEfi tied-def chains the inline-spiller can't allocate ("ran
// out of registers" failure). // out of registers" failure).
__attribute__((noinline))
static void qsortInner(unsigned char *base, unsigned char *cur, static void qsortInner(unsigned char *base, unsigned char *cur,
size_t size, CmpFnT cmp) { size_t size, CmpFnT cmp) {
while (cur > base) { while (cur > base) {

View file

@ -18,25 +18,9 @@
// the buffer been unbounded (C99 vsnprintf semantics), not just the // the buffer been unbounded (C99 vsnprintf semantics), not just the
// number actually written. This lets callers detect truncation. // number actually written. This lets callers detect truncation.
// //
// **Sink state lives in file-static globals** instead of an explicit // Sink state lives in file-static globals (gCur/gEnd/gTotal) rather
// struct passed by pointer. This was originally a workaround for two // than a per-call context. Single-threaded use only, but that matches
// W65816 backend bugs (since fixed): // the rest of this runtime.
// (1) The address of a stack-resident struct used to be computed
// wrong (&s came out as SP+5 = address of s.end instead of SP+3).
// (2) Functions taking fmt as arg1 (stack) didn't initialize the
// fmt local before the loop body — first char came from the
// arg slot but the loop's fmt++ ran on uninitialized memory.
// The struct-sink form now compiles correctly, but switching back to it
// would shift every TU's branch distances; left as-is for stability.
// Single-threaded use only, but that matches the rest of this runtime.
//
// Reverse-emit pattern (used by emitUDec / emitULong / emitHex): the
// natural countdown forms (`while (i > 0) emit(buf[--i])`,
// `while (i > 0) { i--; emit(buf[i]); }`,
// `for (j = i - 1; j >= 0; j--) emit(buf[j])`) all lower to a
// do-while whose `dec a; bpl` exit condition runs the loop one
// extra time on this backend, leaking a `buf[-1]` read. Use the
// forward count + index-arithmetic form instead.
typedef unsigned long size_t; typedef unsigned long size_t;
typedef __builtin_va_list va_list; typedef __builtin_va_list va_list;
@ -50,7 +34,6 @@ static char *gEnd;
static size_t gTotal; static size_t gTotal;
__attribute__((noinline))
static void emit(char c) { static void emit(char c) {
if (gCur < gEnd) { if (gCur < gEnd) {
*gCur++ = c; *gCur++ = c;
@ -59,7 +42,6 @@ static void emit(char c) {
} }
__attribute__((noinline))
static void emitStr(const char *p) { static void emitStr(const char *p) {
if (!p) { if (!p) {
p = "(null)"; p = "(null)";
@ -70,7 +52,6 @@ static void emitStr(const char *p) {
} }
__attribute__((noinline))
static void emitUDec(unsigned int n) { static void emitUDec(unsigned int n) {
char buf[6]; char buf[6];
int i = 0; int i = 0;
@ -82,15 +63,10 @@ static void emitUDec(unsigned int n) {
buf[i++] = '0' + (n % 10); buf[i++] = '0' + (n % 10);
n /= 10; n /= 10;
} }
// Reverse-emit; see file header for the forward-index rationale. while (i > 0) emit(buf[--i]);
int top = i;
for (int j = 0; j < top; j++) {
emit(buf[top - 1 - j]);
}
} }
__attribute__((noinline))
static void emitDec(int n) { static void emitDec(int n) {
// -n on INT_MIN is signed-overflow UB; negate as unsigned. // -n on INT_MIN is signed-overflow UB; negate as unsigned.
if (n < 0) { if (n < 0) {
@ -102,7 +78,6 @@ static void emitDec(int n) {
} }
__attribute__((noinline))
static void emitULong(unsigned long n) { static void emitULong(unsigned long n) {
char buf[11]; char buf[11];
int i = 0; int i = 0;
@ -114,15 +89,10 @@ static void emitULong(unsigned long n) {
buf[i++] = '0' + (n % 10); buf[i++] = '0' + (n % 10);
n /= 10; n /= 10;
} }
// Reverse-emit; see file header for the forward-index rationale. while (i > 0) emit(buf[--i]);
int top = i;
for (int j = 0; j < top; j++) {
emit(buf[top - 1 - j]);
}
} }
__attribute__((noinline))
static void emitSignedLong(long n) { static void emitSignedLong(long n) {
// See emitDec: avoid the signed-overflow UB on LONG_MIN. // See emitDec: avoid the signed-overflow UB on LONG_MIN.
if (n < 0) { if (n < 0) {
@ -134,7 +104,6 @@ static void emitSignedLong(long n) {
} }
__attribute__((noinline))
static void emitHex(unsigned int n, int width) { static void emitHex(unsigned int n, int width) {
static const char digits[] = "0123456789abcdef"; static const char digits[] = "0123456789abcdef";
// unsigned int is 16-bit on this target -> at most 4 hex digits. // unsigned int is 16-bit on this target -> at most 4 hex digits.
@ -153,15 +122,10 @@ static void emitHex(unsigned int n, int width) {
while (i < width) { while (i < width) {
buf[i++] = '0'; buf[i++] = '0';
} }
// Reverse-emit; see file header for the forward-index rationale. while (i > 0) emit(buf[--i]);
int top = i;
for (int j = 0; j < top; j++) {
emit(buf[top - 1 - j]);
}
} }
__attribute__((noinline))
static void emitDouble(double v, int prec, char spec) { static void emitDouble(double v, int prec, char spec) {
// For %g / %G, "precision" is total significant digits. Real glibc // For %g / %G, "precision" is total significant digits. Real glibc
// would compute exponent and choose between %e and %f styles, but // would compute exponent and choose between %e and %f styles, but

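Reviewer note: the restored `while (i > 0) emit(buf[--i]);` form in `emitUDec`, `emitULong`, and `emitHex` is the usual "collect digits least-significant first, then emit in reverse" pattern. A minimal host-side sketch follows. `formatU16` is a hypothetical helper that writes into a caller buffer instead of calling `emit`, and `uint16_t` stands in for the target's 16-bit `unsigned int`.

```c
#include <stddef.h>
#include <stdint.h>

/* Write the decimal form of n into out (at least 6 bytes) and return
 * its length, excluding the terminating NUL. */
size_t formatU16(uint16_t n, char *out) {
    char buf[5];            /* 65535 has five digits */
    int i = 0;
    if (n == 0) {
        buf[i++] = '0';
    }
    while (n != 0) {        /* digits come out least significant first */
        buf[i++] = (char)('0' + n % 10);
        n /= 10;
    }
    size_t len = 0;
    while (i > 0) {         /* countdown reverse-emit, as in the diff */
        out[len++] = buf[--i];
    }
    out[len] = '\0';
    return len;
}
```

The pre-decrement in `buf[--i]` means the loop reads indices `i - 1` down to `0` and never touches `buf[-1]`, which is the behaviour the removed file-header comment said the backend previously got wrong.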
View file

@ -24,45 +24,20 @@ typedef unsigned char u8;
// Pack sign / unbiased-exp / mantissa-with-leading-bit into IEEE-754 // Pack sign / unbiased-exp / mantissa-with-leading-bit into IEEE-754
// double. Returns sign for zero or underflow; sign|inf for overflow. // double. Returns sign for zero or underflow; sign|inf for overflow.
//
// Body uses per-word writes through a `union { u64; u16[4]; }` and
// stores each word through a volatile-qualified accessor to defeat
// the backend's stack-slot coalescing. Without the volatile wrap,
// inlining dpack into __adddf3 hit a stack-slot-aliasing miscompile
// where result word 2 got OR'd with result word 3 (dadd(1.5, 2.5) →
// 0x4010_4010_0000_0000 instead of 0x4010_0000_0000_0000). Real fix
// needs backend stack-slot lifetime analysis at the coalescer stage.
static u64 dpack(u64 sign, s16 exp, u64 mant) { static u64 dpack(u64 sign, s16 exp, u64 mant) {
if (mant == 0) return sign; if (mant == 0) return sign;
s16 eS = exp + DEXP_BIAS; s16 eS = exp + DEXP_BIAS;
if (eS <= 0) return sign; if (eS <= 0) return sign;
if (eS >= 2047) return sign | DEXP_MASK; if (eS >= 2047) return sign | DEXP_MASK;
union { u64 u; u16 w[4]; } mantU, signU; return sign | (mant & DMANT_MASK) | ((u64)(u16)eS << DEXP_SHIFT);
mantU.u = mant;
signU.u = sign;
// Volatile output array forces distinct stack slots per word —
// the compiler can't fold these into shared slots.
volatile u16 outW[4];
outW[0] = (u16)(mantU.w[0] | signU.w[0]);
outW[1] = (u16)(mantU.w[1] | signU.w[1]);
outW[2] = (u16)(mantU.w[2] | signU.w[2]);
outW[3] = (u16)((mantU.w[3] & 0x000F) | signU.w[3] | ((u16)eS << 4));
union { u64 u; u16 w[4]; } r;
r.w[0] = outW[0];
r.w[1] = outW[1];
r.w[2] = outW[2];
r.w[3] = outW[3];
return r.u;
} }
// Decompose `x` into sign / unbiased-exp / mantissa-with-leading-bit. // Decompose `x` into sign / unbiased-exp / mantissa-with-leading-bit.
// Returns the class: 0=zero, 1=normal, 2=infinity, 3=NaN. // Returns the class: 0=zero, 1=normal, 2=infinity, 3=NaN.
// noinline reduces register pressure in __muldf3/__divdf3/__adddf3 //
// — without it, greedy regalloc runs out of registers in __muldf3 // Kept inline: passing pointer args from a noinline boundary lowers to
// at -O2. Now safe because pointer-arg writes lower to STBptr/STAptr // `sta (d,s),y` (DBR-relative) — broken under DBR != 0. Inlining keeps
// which use [$E0],Y indirect-long with the bank byte forced to 0 // the stores within the caller's frame. See feedback_dbr_ptr_deref_spill.md.
// (DBR-independent). See `feedback_dbr_ptr_deref_spill.md`.
// noinline removed — pointer-arg stores now lower to STBptr/STAptr (indirect-long, DBR-independent)
static u16 dclass(u64 x, u64 *out_sign, s16 *out_exp, u64 *out_mant) { static u16 dclass(u64 x, u64 *out_sign, s16 *out_exp, u64 *out_mant) {
*out_sign = x & DSIGN_BIT; *out_sign = x & DSIGN_BIT;
s16 e = (s16)((x >> DEXP_SHIFT) & 0x7FF); s16 e = (s16)((x >> DEXP_SHIFT) & 0x7FF);
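The new one-line body of dpack can be checked on a host with a small sketch. The constant values below are assumptions taken from the IEEE-754 binary64 layout (the diff only shows their names), and `uint64_t`/`int16_t` stand in for the file's `u64`/`s16`.

```c
#include <stdint.h>

/* Assumed values: standard IEEE-754 binary64 layout. */
#define DEXP_BIAS   1023
#define DEXP_SHIFT  52
#define DSIGN_BIT   (1ULL << 63)
#define DEXP_MASK   (0x7FFULL << DEXP_SHIFT)
#define DMANT_MASK  ((1ULL << DEXP_SHIFT) - 1)

/* Pack sign / unbiased exponent / mantissa-with-leading-bit (bit 52).
 * Zero and underflow return the bare sign; overflow returns sign|inf. */
static uint64_t dpack(uint64_t sign, int16_t exp, uint64_t mant) {
    if (mant == 0) return sign;
    int16_t eS = (int16_t)(exp + DEXP_BIAS);
    if (eS <= 0) return sign;
    if (eS >= 2047) return sign | DEXP_MASK;
    /* Masking with DMANT_MASK drops the explicit leading 1. */
    return sign | (mant & DMANT_MASK) | ((uint64_t)(uint16_t)eS << DEXP_SHIFT);
}
```

With these constants, packing exponent 2 and a bare leading bit gives 0x4010000000000000 (4.0), the value the removed comment cites as the correct result of dadd(1.5, 2.5).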
@ -142,10 +117,9 @@ u64 __adddf3(u64 a, u64 b) {
// left-shift if subtraction left the lead below 55. Reverse order // left-shift if subtraction left the lead below 55. Reverse order
// would shift an over-wide value out of u64 range entirely. // would shift an over-wide value out of u64 range entirely.
// Use if + do-while because pure `while (cond) body` triggers a // Use if + do-while because pure `while (cond) body` triggers a
// ptr32 backend bug: PHP/PLP wrap pass mis-identifies the loop's // backend bug in the left-shift renormalize path: subtraction
// pre-test LDA reload as flag corruption and wraps the wrong // cases (different signs) lose their result (7+(-8) → -0 instead
// range, so the BEQ tests stale flags and the loop body never // of -1). do-while is unaffected (test-after-body).
// fires. `do { } while (cond)` is unaffected (test-after-body).
if (mr & ~((1ULL << 56) - 1)) { if (mr & ~((1ULL << 56) - 1)) {
do { do {
u64 sticky_bit = mr & 1; u64 sticky_bit = mr & 1;
@ -282,26 +256,14 @@ u64 __divdf3(u64 a, u64 b) {
// Handle the leading quotient bit explicitly. // Handle the leading quotient bit explicitly.
u64 q = DMANT_LEAD; u64 q = DMANT_LEAD;
u64 r = ma - mb; u64 r = ma - mb;
// `volatile vmb`: forces mb to be re-read from memory inside the
// loop. Without this, the W65816 codegen miscompiles `r >= mb` and
// `r -= mb` when called as the 3rd+ chained `__divdf3` after prior
// softDouble libcalls (sqrt3 Newton iter — 3rd iter returned 0.0
// instead of 1.41421). Adding `volatile` to either `r` or `mb`
// alone fixes it, suggesting the compiler is keeping one of them
// in registers across loop iterations and a JSL inside the loop
// (__ashlsi3 for `r <<= 1`) clobbers the held value. The real
// fix lives in the W65816 backend's u64-shift lowering; volatile
// here is the conservative workaround.
volatile u64 vmb = mb;
// Compute 52 more fractional bits via standard shift-test-subtract. // Compute 52 more fractional bits via standard shift-test-subtract.
for (int i = 51; i >= 0; i--) { for (int i = 51; i >= 0; i--) {
r <<= 1; r <<= 1;
if (r >= vmb) { if (r >= mb) {
r -= vmb; r -= mb;
q |= (1ULL << i); q |= (1ULL << i);
} }
} }
mb = vmb; // resync in case below reads mb
// Round to nearest, ties to even. Generate one extra bit (the // Round to nearest, ties to even. Generate one extra bit (the
// "guard"), examine the remainder for any non-zero "sticky" tail, // "guard"), examine the remainder for any non-zero "sticky" tail,
// and round q up when guard=1 and (sticky || (q & 1)). Without // and round q up when guard=1 and (sticky || (q & 1)). Without
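The shift-test-subtract loop in the __divdf3 hunk above, now without the `volatile` copy, can be sketched in isolation. The function name `divMantissa` is hypothetical, and the precondition (both mantissas carry their leading bit at bit 52 and `ma >= mb`) is an assumption about what the surrounding code establishes. Rounding is left out.

```c
#include <stdint.h>

#define DMANT_LEAD (1ULL << 52)   /* assumed position of the leading 1 */

/* Restoring division of two normalized mantissas.
 * Assumes ma >= mb, so the leading quotient bit is known to be 1.
 * Returns floor(ma * 2^52 / mb), i.e. the truncated quotient mantissa. */
static uint64_t divMantissa(uint64_t ma, uint64_t mb) {
    uint64_t q = DMANT_LEAD;      /* leading quotient bit handled explicitly */
    uint64_t r = ma - mb;         /* remainder after that first bit */
    for (int i = 51; i >= 0; i--) {
        r <<= 1;                  /* bring down the next (zero) dividend bit */
        if (r >= mb) {            /* does the divisor fit? */
            r -= mb;
            q |= (1ULL << i);     /* yes: set this quotient bit */
        }
    }
    return q;
}
```

Because `r` stays below `mb` after every step, the shifted remainder never exceeds 54 bits and cannot overflow a 64-bit value.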

View file

@ -38,7 +38,6 @@ typedef int s16;
#define MANT_MASK 0x007FFFFFUL #define MANT_MASK 0x007FFFFFUL
#define MANT_LEAD 0x00800000UL // implicit leading 1 #define MANT_LEAD 0x00800000UL // implicit leading 1
__attribute__((noinline))
static u16 fpClass(u32 x, u32 *out_sign, s16 *out_exp, u32 *out_mant) { static u16 fpClass(u32 x, u32 *out_sign, s16 *out_exp, u32 *out_mant) {
*out_sign = x & SIGN_BIT; *out_sign = x & SIGN_BIT;
s16 e = (s16)((x >> EXP_SHIFT) & 0xFF); s16 e = (s16)((x >> EXP_SHIFT) & 0xFF);
@ -61,7 +60,6 @@ static u16 fpClass(u32 x, u32 *out_sign, s16 *out_exp, u32 *out_mant) {
return 1; // normal return 1; // normal
} }
__attribute__((noinline))
static u32 fpPack(u32 sign, s16 exp, u32 mant) { static u32 fpPack(u32 sign, s16 exp, u32 mant) {
if (mant == 0) return sign; // zero if (mant == 0) return sign; // zero
// Normalize: shift mantissa until bit 23 is the leading 1. // Normalize: shift mantissa until bit 23 is the leading 1.

View file

@ -9,7 +9,6 @@ static char *gStrtokSave;
// strtok_r, growing the .o by ~70%. The runtime's bank-0 budget // strtok_r, growing the .o by ~70%. The runtime's bank-0 budget
// is tight enough that the duplicated code pushes rodata past // is tight enough that the duplicated code pushes rodata past
// 0xC000 (IIgs IO window), corrupting string literals at runtime. // 0xC000 (IIgs IO window), corrupting string literals at runtime.
__attribute__((noinline))
char *strtok_r(char *str, const char *delim, char **saveptr) { char *strtok_r(char *str, const char *delim, char **saveptr) {
unsigned char *s; unsigned char *s;
if (str != (char *)0) { if (str != (char *)0) {

View file

@ -164,7 +164,6 @@ static const char *const __monLong[12] = {
// (__udivhi3 + __umodhi3) is slower than one __udivhi3 + multiply but // (__udivhi3 + __umodhi3) is slower than one __udivhi3 + multiply but
// is the only spelling that avoids the negation bug at this width. // is the only spelling that avoids the negation bug at this width.
// Calendar values stay under 65535 so u16 suffices. // Calendar values stay under 65535 so u16 suffices.
__attribute__((noinline))
static char *fmtN(char *p, unsigned long v, int n) { static char *fmtN(char *p, unsigned long v, int n) {
unsigned int v16 = (unsigned int)v; unsigned int v16 = (unsigned int)v;
p += n; p += n;
@ -220,7 +219,6 @@ char *ctime(const time_t *t) {
// %Y %m %d %H %M %S %j %w %a %A %b %h %B %p %% // %Y %m %d %H %M %S %j %w %a %A %b %h %B %p %%
// Composite specs (expanded by main loop via strftimeComposite): // Composite specs (expanded by main loop via strftimeComposite):
// %D %F %R %T %r %x %X %c // %D %F %R %T %r %x %X %c
__attribute__((noinline))
static int strftimeOne(char dst[8], char spec, const struct tm *tm, static int strftimeOne(char dst[8], char spec, const struct tm *tm,
const char **strOut) { const char **strOut) {
*strOut = 0; *strOut = 0;

BIN
screenshots/frame.png (Stored with Git LFS)

Binary file not shown.

BIN
screenshots/minicad.png (Stored with Git LFS)

Binary file not shown.

BIN
screenshots/orcaFrameLike.png (Stored with Git LFS)

Binary file not shown.

BIN
screenshots/orcaMiniCadLike.png (Stored with Git LFS)

Binary file not shown.

BIN
screenshots/orcaReversiLike.png (Stored with Git LFS)

Binary file not shown.

BIN
screenshots/qdProbe.png (Stored with Git LFS)

Binary file not shown.

BIN
screenshots/reversi.png (Stored with Git LFS)

Binary file not shown.

View file

@ -24,6 +24,8 @@ oCrt0=$(mktemp --suffix=.o)
oLibgcc=$(mktemp --suffix=.o) oLibgcc=$(mktemp --suffix=.o)
"$LLVM_MC" -arch=w65816 -filetype=obj "$PROJECT_ROOT/runtime/src/crt0.s" -o "$oCrt0" "$LLVM_MC" -arch=w65816 -filetype=obj "$PROJECT_ROOT/runtime/src/crt0.s" -o "$oCrt0"
"$LLVM_MC" -arch=w65816 -filetype=obj "$PROJECT_ROOT/runtime/src/libgcc.s" -o "$oLibgcc" "$LLVM_MC" -arch=w65816 -filetype=obj "$PROJECT_ROOT/runtime/src/libgcc.s" -o "$oLibgcc"
# softDouble.o is needed for FP benches (dmul/dadd/ddiv → __muldf3/etc.)
oSoftDouble="$PROJECT_ROOT/runtime/softDouble.o"
# Per-benchmark wrapper template. The C wrapper calls each benchmark # Per-benchmark wrapper template. The C wrapper calls each benchmark
# with appropriate inputs, then writes the iteration count and cycle # with appropriate inputs, then writes the iteration count and cycle
@ -39,6 +41,9 @@ benchInputs() {
dotProduct) echo 'dotProduct(va, vb, 4)';; dotProduct) echo 'dotProduct(va, vb, 4)';;
popcount) echo 'popcount(0x12345678UL)';; popcount) echo 'popcount(0x12345678UL)';;
crc32) echo 'crc32((const unsigned char *)"hello", 5)';; crc32) echo 'crc32((const unsigned char *)"hello", 5)';;
dmul) echo 'dmul(da, db)';;
dadd) echo 'dadd(da, db)';;
ddiv) echo 'ddiv(da, db)';;
*) echo "/* unknown */";; *) echo "/* unknown */";;
esac esac
} }
@ -53,6 +58,9 @@ benchExtern() {
dotProduct) echo 'extern long dotProduct(const short *a, const short *b, unsigned int n); static const short va[] = {1,2,3,4}; static const short vb[] = {5,6,7,8};';; dotProduct) echo 'extern long dotProduct(const short *a, const short *b, unsigned int n); static const short va[] = {1,2,3,4}; static const short vb[] = {5,6,7,8};';;
popcount) echo 'extern int popcount(unsigned long x);';; popcount) echo 'extern int popcount(unsigned long x);';;
crc32) echo 'extern unsigned long crc32(const unsigned char *p, unsigned int n);';; crc32) echo 'extern unsigned long crc32(const unsigned char *p, unsigned int n);';;
dmul) echo 'extern double dmul(double a, double b); static volatile double da = 3.14, db = 2.71;';;
dadd) echo 'extern double dadd(double a, double b); static volatile double da = 3.14, db = 2.71;';;
ddiv) echo 'extern double ddiv(double a, double b); static volatile double da = 3.14, db = 2.71;';;
*) echo '';; *) echo '';;
esac esac
} }
@ -68,6 +76,14 @@ runOneBench() {
echo "(no input config)" echo "(no input config)"
return return
fi fi
# FP benches assign the result to sinkD (double); the rest assign to sink,
# cast to unsigned long. FP benches also use fewer iterations: each call
# costs ~1000+ cycles, so 100 iterations would wrap the 8-bit HBL counter
# many times.
local sink_lhs sink_cast iters
case "$name" in
dmul|dadd|ddiv) sink_lhs='sinkD'; sink_cast=''; iters=10 ;;
*) sink_lhs='sink'; sink_cast='(unsigned long)'; iters=100 ;;
esac
local cwrap=$(mktemp --suffix=.c) local cwrap=$(mktemp --suffix=.c)
local owrap=$(mktemp --suffix=.o) local owrap=$(mktemp --suffix=.o)
@ -90,7 +106,8 @@ __attribute__((noinline)) static unsigned char readVbl(void) {
return r; return r;
} }
volatile unsigned long sink; volatile unsigned long sink;
#define ITERS 100 volatile double sinkD;
#define ITERS $iters
int main(void) { int main(void) {
// Re-enable IRQs so the IIgs ROM's VBL handler runs and the // Re-enable IRQs so the IIgs ROM's VBL handler runs and the
// VBL counter at \$E1006B actually ticks. crt0 disables IRQs // VBL counter at \$E1006B actually ticks. crt0 disables IRQs
@ -98,7 +115,7 @@ int main(void) {
__asm__ volatile ("cli\n" ::: "memory"); __asm__ volatile ("cli\n" ::: "memory");
unsigned char t0 = readVbl(); unsigned char t0 = readVbl();
for (int i = 0; i < ITERS; i++) { for (int i = 0; i < ITERS; i++) {
sink = (unsigned long)($call_expr); $sink_lhs = $sink_cast($call_expr);
} }
unsigned char t1 = readVbl(); unsigned char t1 = readVbl();
__asm__ volatile ("sei\n" ::: "memory"); __asm__ volatile ("sei\n" ::: "memory");
@ -114,7 +131,7 @@ EOF
|| { echo "compile-fail"; rm -f "$cwrap" "$owrap"; return; } || { echo "compile-fail"; rm -f "$cwrap" "$owrap"; return; }
"$CLANG" --target=w65816 -O2 -ffunction-sections -c "$BENCH_DIR/$name.c" -o "$obench" 2>/dev/null \ "$CLANG" --target=w65816 -O2 -ffunction-sections -c "$BENCH_DIR/$name.c" -o "$obench" 2>/dev/null \
|| { echo "compile-fail"; rm -f "$cwrap" "$owrap" "$obench"; return; } || { echo "compile-fail"; rm -f "$cwrap" "$owrap" "$obench"; return; }
"$LINK" -o "$bin" --text-base 0x1000 "$oCrt0" "$oLibgcc" "$owrap" "$obench" 2>/dev/null \ "$LINK" -o "$bin" --text-base 0x1000 "$oCrt0" "$oLibgcc" "$oSoftDouble" "$owrap" "$obench" 2>/dev/null \
|| { echo "link-fail"; rm -f "$cwrap" "$owrap" "$obench" "$bin"; return; } || { echo "link-fail"; rm -f "$cwrap" "$owrap" "$obench" "$bin"; return; }
# Read VBL delta at $025000. # Read VBL delta at $025000.
@ -135,8 +152,8 @@ EOF
if [ "$ticks" -eq 0 ]; then if [ "$ticks" -eq 0 ]; then
echo "<65 cyc/iter (under timer resolution)" echo "<65 cyc/iter (under timer resolution)"
else else
local cycles=$((ticks * 65 / 100)) local cycles=$((ticks * 65 / iters))
printf "%d hbl-ticks (~%d cyc/iter)" "$ticks" "$cycles" printf "%d hbl-ticks (~%d cyc/iter, %d iters)" "$ticks" "$cycles" "$iters"
fi fi
fi fi
} }
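The arithmetic behind the script's per-iteration estimate can be sketched as follows. The factor of 65 cycles per tick comes from the script itself; the modulo-256 subtraction in `tickDelta` is an assumption about how the delta between the two 8-bit counter reads is formed (that part of the wrapper is not shown in the diff), and both function names are hypothetical.

```c
#include <stdint.h>

#define CYCLES_PER_TICK 65u   /* scale factor used by the script */

/* Difference between two reads of an 8-bit free-running counter.
 * Correct only while fewer than 256 ticks elapse between the reads;
 * any extra full wraps are silently lost. */
static unsigned tickDelta(uint8_t t0, uint8_t t1) {
    return (uint8_t)(t1 - t0);
}

/* Same integer formula as the script: ticks * 65 / iters. */
static unsigned cyclesPerIter(unsigned ticks, unsigned iters) {
    return ticks * CYCLES_PER_TICK / iters;
}
```

For example, a made-up reading of 178 ticks over 10 iterations gives 1157 cycles per iteration. At roughly 1000 cycles per call, 100 iterations would need about 1500 ticks, far more than an 8-bit counter can represent, which is why the FP benches run only 10.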

View file

@ -21,7 +21,15 @@ source "$(dirname "$0")/common.sh"
BIN="$1" BIN="$1"
shift shift
SECS=3 # Frame budget: load at frame 30, check at CHECK_FRAME (default 300 = 4.5
# simulated seconds after load). Override via env for heavy-compute tests.
# Earlier default was 60 frames (0.5 sec), which falsely flagged slow but
# correct math (e.g. 6-iter sqrt with chained soft-double libcalls) as
# runtime hangs — see feedback_sqrt_runtime_broken.md.
CHECK_FRAME=${MAME_CHECK_FRAME:-300}
# seconds_to_run is simulated time; MAME terminates at this point. Sized
# to comfortably exceed CHECK_FRAME (300 frames = 5 sec at 60Hz).
SECS=${MAME_SECS:-6}
# Build address list as Lua table entries. # Build address list as Lua table entries.
LUA_CHECKS="" LUA_CHECKS=""
@ -84,7 +92,7 @@ emu.register_frame_done(function()
cpu.state["S"].value = 0x01FF cpu.state["S"].value = 0x01FF
print("MAME-LOADED bytes=" .. #data) print("MAME-LOADED bytes=" .. #data)
end end
if frame == 60 then if frame == $CHECK_FRAME then
local cpu = manager.machine.devices[":maincpu"] local cpu = manager.machine.devices[":maincpu"]
local mem = cpu.spaces["program"] local mem = cpu.spaces["program"]
$LUA_CHECKS $LUA_CHECKS
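The frame-budget numbers in the comment above follow from the 60 Hz frame rate and the load at frame 30. A small sketch of that arithmetic, with hypothetical helper names:

```c
#define FRAMES_PER_SEC 60   /* frame rate stated in the script comment */
#define LOAD_FRAME     30   /* frame at which the binary is loaded */

/* Simulated seconds the program gets to run before the check fires. */
static double secondsAfterLoad(int checkFrame) {
    return (double)(checkFrame - LOAD_FRAME) / FRAMES_PER_SEC;
}

/* Simulated seconds from emulator start to the given frame. */
static double secondsTotal(int frame) {
    return (double)frame / FRAMES_PER_SEC;
}

/* MAME exits after `secs` simulated seconds, so the check frame
 * must fall strictly before that point to run at all. */
static int budgetCoversCheck(int secs, int checkFrame) {
    return secs * FRAMES_PER_SEC > checkFrame;
}
```

With the new defaults, the check at frame 300 happens 5 seconds into the run and 4.5 seconds after load, inside the 6-second limit. The old `SECS=3` would end the run at frame 180, before a frame-300 check could fire, so the two values have to be raised together.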

View file

@ -22,7 +22,8 @@ source "$(dirname "$0")/common.sh"
BIN="$1" BIN="$1"
shift shift
SECS=3 CHECK_FRAME=${MAME_CHECK_FRAME:-300}
SECS=${MAME_SECS:-6}
# 23-byte stub bytes (see runtime/src/iigsGsosStub.s for source). # 23-byte stub bytes (see runtime/src/iigsGsosStub.s for source).
# Hand-assembled to avoid relying on llvm-mc tracking M-flag state. # Hand-assembled to avoid relying on llvm-mc tracking M-flag state.
@ -96,7 +97,7 @@ $STUB_LUA
cpu.state["S"].value = 0x01FF cpu.state["S"].value = 0x01FF
print("MAME-LOADED bytes=" .. #data .. " stub=$((${#STUB_BYTES}/2))") print("MAME-LOADED bytes=" .. #data .. " stub=$((${#STUB_BYTES}/2))")
end end
if frame == 60 then if frame == $CHECK_FRAME then
local cpu = manager.machine.devices[":maincpu"] local cpu = manager.machine.devices[":maincpu"]
local mem = cpu.spaces["program"] local mem = cpu.spaces["program"]
$LUA_CHECKS $LUA_CHECKS

View file

@ -11,7 +11,8 @@ source "$(dirname "$0")/common.sh"
MANIFEST="$1" MANIFEST="$1"
shift shift
SECS=3 CHECK_FRAME=${MAME_CHECK_FRAME:-300}
SECS=${MAME_SECS:-6}
# Build address list as Lua table entries, mirroring runInMame.sh. # Build address list as Lua table entries, mirroring runInMame.sh.
LUA_CHECKS="" LUA_CHECKS=""
@ -97,7 +98,7 @@ $LOAD_LUA
cpu.state["S"].value = 0x01FF cpu.state["S"].value = 0x01FF
print('MAME-READY pc=0x' .. string.format('%06x', $ENTRY_BASE + $ENTRY_OFF)) print('MAME-READY pc=0x' .. string.format('%06x', $ENTRY_BASE + $ENTRY_OFF))
end end
if frame == 60 then if frame == $CHECK_FRAME then
local cpu = manager.machine.devices[":maincpu"] local cpu = manager.machine.devices[":maincpu"]
local mem = cpu.spaces["program"] local mem = cpu.spaces["program"]
$LUA_CHECKS $LUA_CHECKS

View file

@ -833,6 +833,15 @@ struct Linker {
L.bssBase = 0xD000; L.bssBase = 0xD000;
} }
} }
// Also bump past the IO window if BSS would SPAN it
// (starts below 0xC000, extends into or past 0xC000).
// BSS writes to 0xC000-0xCFFF hit soft switches — caught
// by smoke #128 hex dumper, where ~954-byte BSS pushed
// past 0xC000 and BSS-clear writes crashed MAME.
if (L.bssBase < 0xC000 &&
L.bssBase + L.bssSize > 0xC000) {
L.bssBase = 0xD000;
}
if (L.bssBase + L.bssSize > 0x10000u) { if (L.bssBase + L.bssSize > 0x10000u) {
char msg[256]; char msg[256];
std::snprintf(msg, sizeof(msg), std::snprintf(msg, sizeof(msg),
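The BSS placement rule in this hunk can be sketched as a standalone function. The span check and the bank-size limit mirror the lines shown; the first check (base already inside the window) is an assumed form of the earlier, only partly visible branch, and `placeBss` with its -1 error return is a hypothetical interface.

```c
#include <stdint.h>

#define IO_START 0xC000u    /* start of the IIgs IO window */
#define IO_END   0xD000u    /* first usable address past it */
#define BANK_END 0x10000u   /* one past the end of the bank */

/* Returns the adjusted BSS base, or -1 if BSS does not fit in the bank. */
static long placeBss(uint32_t base, uint32_t size) {
    /* Assumed form of the earlier check: base lands inside the window. */
    if (base >= IO_START && base < IO_END)
        base = IO_END;
    /* Check added by this commit: BSS starts below the window but
     * extends into it, so clearing it would write to soft switches. */
    if (base < IO_START && base + size > IO_START)
        base = IO_END;
    if (base + size > BANK_END)
        return -1;
    return (long)base;
}
```

A 954-byte BSS starting at 0xBF00 ends at 0xC2BA, so it is moved to 0xD000, while a 256-byte BSS at the same base ends exactly at 0xC000 and stays put.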

View file

@ -114,6 +114,17 @@ W65816TargetLowering::W65816TargetLowering(const TargetMachine &TM,
for (MVT VT : MVT::integer_valuetypes()) for (MVT VT : MVT::integer_valuetypes())
setLoadExtAction(ISD::SEXTLOAD, VT, MVT::i8, Expand); setLoadExtAction(ISD::SEXTLOAD, VT, MVT::i8, Expand);
// GlobalOpt sometimes narrows a `short` global to `i1` when it sees
// every assignment is 0 or 1. Custom-lower so LowerLoad rewrites
// `zext/sext/anyext from i1` into a plain byte load + appropriate
// mask. Both i16 and i8 result widths can appear, depending on
// whether the consumer wants the value as `short` or `bool`.
for (MVT ResVT : {MVT::i8, MVT::i16}) {
setLoadExtAction(ISD::ZEXTLOAD, ResVT, MVT::i1, Custom);
setLoadExtAction(ISD::SEXTLOAD, ResVT, MVT::i1, Custom);
setLoadExtAction(ISD::EXTLOAD, ResVT, MVT::i1, Custom);
}
// Only register i32 ext-load / trunc-store and Custom actions when // Only register i32 ext-load / trunc-store and Custom actions when
// i32 is actually a legal type (ptr32 mode active). Otherwise the // i32 is actually a legal type (ptr32 mode active). Otherwise the
// Custom-action calls intercept i16/i8 ops, and LowerTruncate's // Custom-action calls intercept i16/i8 ops, and LowerTruncate's
@ -191,6 +202,20 @@ W65816TargetLowering::W65816TargetLowering(const TargetMachine &TM,
setOperationAction(ISD::SMUL_LOHI, MVT::i16, Expand); setOperationAction(ISD::SMUL_LOHI, MVT::i16, Expand);
setOperationAction(ISD::UMUL_LOHI, MVT::i16, Expand); setOperationAction(ISD::UMUL_LOHI, MVT::i16, Expand);
setOperationAction(ISD::MUL, MVT::i16, LibCall); setOperationAction(ISD::MUL, MVT::i16, LibCall);
// i8 multiply / mulh / div / rem: SDAG narrows e.g. `x / 10` to
// `mulhu i8 x, -51` + shift when it proves operands fit in i8.
// The 65816 has no native 8-bit multiplier; route everything
// through the 16-bit libcalls by Promoting i8 ops to i16.
setOperationAction(ISD::MUL, MVT::i8, Promote);
setOperationAction(ISD::MULHU, MVT::i8, Promote);
setOperationAction(ISD::MULHS, MVT::i8, Promote);
setOperationAction(ISD::SDIV, MVT::i8, Promote);
setOperationAction(ISD::UDIV, MVT::i8, Promote);
setOperationAction(ISD::SREM, MVT::i8, Promote);
setOperationAction(ISD::UREM, MVT::i8, Promote);
setOperationAction(ISD::SMUL_LOHI, MVT::i8, Expand);
setOperationAction(ISD::UMUL_LOHI, MVT::i8, Expand);
// CTPOP/CTLZ/CTTZ/ROTL/ROTR — no hardware support. Expand lets the // CTPOP/CTLZ/CTTZ/ROTL/ROTR — no hardware support. Expand lets the
// type legalizer rewrite into a sequence of basic ops. Without // type legalizer rewrite into a sequence of basic ops. Without
// this, e.g. `x && !(x & (x-1))` (LLVM canonicalises to popcount==1) // this, e.g. `x && !(x & (x-1))` (LLVM canonicalises to popcount==1)
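The `mulhu i8 x, -51` form mentioned above is the usual multiply-by-reciprocal division: -51 as an unsigned byte is 205, and 205/2048 approximates 1/10. A sketch of why promoting to 16 bits is enough follows. The shift amount of 3 after taking the high byte is an assumption (the comment only says "+ shift"), and the helper names are hypothetical.

```c
#include <stdint.h>

/* High byte of an 8x8 -> 16-bit unsigned multiply, done in 16 bits,
 * which is what promoting the i8 operation to i16 amounts to. */
static uint8_t mulhu8(uint8_t a, uint8_t b) {
    return (uint8_t)(((uint16_t)a * (uint16_t)b) >> 8);
}

/* x / 10 for 8-bit x: (x * 205) >> 11, split as mulhu then >> 3. */
static uint8_t div10(uint8_t x) {
    return (uint8_t)(mulhu8(x, 205) >> 3);
}

/* Exhaustively compare against real division over all 256 inputs. */
static int div10MatchesAll(void) {
    for (unsigned x = 0; x < 256; x++)
        if (div10((uint8_t)x) != x / 10)
            return 0;
    return 1;
}
```

The largest intermediate product is 255 * 205 = 52275, which fits in an unsigned 16-bit value, so a 16-bit multiply libcall loses nothing.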
@ -904,6 +929,28 @@ SDValue W65816TargetLowering::LowerLoad(SDValue Op,
Ld->getAlign(), Ld->getAlign(),
Ld->getMemOperand()->getFlags()); Ld->getMemOperand()->getFlags());
} }
// i1 memory type comes from GlobalOpt narrowing `short` globals
// whose only assignments are 0/1. Treat as i8 load + appropriate
// mask — the underlying memory is still byte-sized.
if (MemVT == MVT::i1) {
SDValue ByteLd = DAG.getExtLoad(ISD::ZEXTLOAD, DL, MVT::i16, Chain,
FoldedLo, MVT::i8,
Ld->getMemOperand());
SDValue Val = ByteLd;
if (ExtType == ISD::ZEXTLOAD || ExtType == ISD::EXTLOAD) {
Val = DAG.getNode(ISD::AND, DL, MVT::i16, ByteLd,
DAG.getConstant(1, DL, MVT::i16));
} else if (ExtType == ISD::SEXTLOAD) {
// i1 sign-extend: bit 0 -> all bits. AND #1 then NEG.
SDValue Bit = DAG.getNode(ISD::AND, DL, MVT::i16, ByteLd,
DAG.getConstant(1, DL, MVT::i16));
Val = DAG.getNode(ISD::SUB, DL, MVT::i16,
DAG.getConstant(0, DL, MVT::i16), Bit);
}
if (Op.getValueType() == MVT::i8)
Val = DAG.getNode(ISD::TRUNCATE, DL, MVT::i8, Val);
return DAG.getMergeValues({Val, ByteLd.getValue(1)}, DL);
}
return DAG.getExtLoad(ExtType, DL, Op.getValueType(), Chain, FoldedLo, return DAG.getExtLoad(ExtType, DL, Op.getValueType(), Chain, FoldedLo,
MemVT, Ld->getMemOperand()); MemVT, Ld->getMemOperand());
} }
@ -913,6 +960,9 @@ SDValue W65816TargetLowering::LowerLoad(SDValue Op,
return SDValue(); return SDValue();
EVT MemVT = Ld->getMemoryVT(); EVT MemVT = Ld->getMemoryVT();
// Widen i1 memVT to i8 (single-byte storage). getMemIntrinsicNode
// asserts memvt must be supported; i1 isn't.
if (MemVT == MVT::i1) MemVT = MVT::i8;
SDVTList VTs = DAG.getVTList(MVT::i16, MVT::Other); SDVTList VTs = DAG.getVTList(MVT::i16, MVT::Other);
SDValue Ops[] = { Chain, Ptr }; SDValue Ops[] = { Chain, Ptr };
// memVT for the LD_PTR memintrinsic must match MMO's size (i8 vs // memVT for the LD_PTR memintrinsic must match MMO's size (i8 vs
@ -925,10 +975,14 @@ SDValue W65816TargetLowering::LowerLoad(SDValue Op,
MemVT, Ld->getMemOperand()); MemVT, Ld->getMemOperand());
SDValue Val = LdNode; SDValue Val = LdNode;
// Byte memory access: mask the high byte for zextload, leave anyext. // Byte memory access: mask the high byte for zextload, leave anyext.
// i1 memVT was widened to i8 above; the mask path is the same.
if (MemVT == MVT::i8) { if (MemVT == MVT::i8) {
if (Ld->getExtensionType() == ISD::ZEXTLOAD) EVT OrigMemVT = Ld->getMemoryVT();
Val = DAG.getNode(ISD::AND, DL, MVT::i16, Val, SDValue MaskC = DAG.getConstant(OrigMemVT == MVT::i1 ? 1 : 0xFF,
DAG.getConstant(0xFF, DL, MVT::i16)); DL, MVT::i16);
if (Ld->getExtensionType() == ISD::ZEXTLOAD ||
(OrigMemVT == MVT::i1 && Ld->getExtensionType() == ISD::EXTLOAD))
Val = DAG.getNode(ISD::AND, DL, MVT::i16, Val, MaskC);
else if (Ld->getExtensionType() == ISD::SEXTLOAD) else if (Ld->getExtensionType() == ISD::SEXTLOAD)
Val = DAG.getNode(ISD::SIGN_EXTEND_INREG, DL, MVT::i16, Val, Val = DAG.getNode(ISD::SIGN_EXTEND_INREG, DL, MVT::i16, Val,
DAG.getValueType(MVT::i8)); DAG.getValueType(MVT::i8));
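The i1 handling added to LowerLoad reduces to two small bit operations on the loaded byte. A host-side sketch, with hypothetical function names, of what the AND and SUB nodes compute:

```c
#include <stdint.h>

/* Zero- or any-extend an i1 held in a byte: keep bit 0 only. */
static uint16_t zextLoadI1(uint8_t byte) {
    return (uint16_t)(byte & 1u);
}

/* Sign-extend an i1 held in a byte: bit 0 is replicated into all
 * 16 bits by computing 0 - bit (0 stays 0, 1 becomes 0xFFFF). */
static uint16_t sextLoadI1(uint8_t byte) {
    uint16_t bit = (uint16_t)(byte & 1u);
    return (uint16_t)(0u - bit);
}
```

Masking with 1 rather than 0xFF means any stray upper bits in the stored byte are ignored.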

View file

@ -110,21 +110,32 @@ static int classifyImgReg(unsigned Reg) {
return -1; return -1;
} }
// Classification of a DP-addressed instruction's relation to a DP slot.
enum class DpAccess {
None, // not a DP-imm instruction we care about
Read, // only reads the DP slot (e.g., LDA $C0)
Write, // only writes the DP slot (e.g., STA $C0)
ReadWrite, // both (e.g., INC $C0)
};
// Map a DP-addressed instruction's first immediate operand to an IMG // Map a DP-addressed instruction's first immediate operand to an IMG
// slot index if it falls in $C0..$CE. Returns -1 otherwise. // slot index and access mode. Returns (-1, None) if it doesn't access
static int classifyDpImmAsImg(const MachineInstr &MI) { // an IMG slot.
// Most DP-addressed opcodes take the dp address as immediate op 0. static std::pair<int, DpAccess> classifyDpImmAsImg(const MachineInstr &MI) {
// (Some, like ADC_DP-form-with-explicit-A, may put the imm at op 1.)
// For our scan, check the first IMM operand we find.
unsigned Opc = MI.getOpcode(); unsigned Opc = MI.getOpcode();
DpAccess Mode;
switch (Opc) { switch (Opc) {
case W65816::LDA_DP: // Pure stores: write only.
case W65816::STA_DP: case W65816::STA_DP:
case W65816::STZ_DP: case W65816::STZ_DP:
case W65816::LDX_DP:
case W65816::STX_DP: case W65816::STX_DP:
case W65816::LDY_DP:
case W65816::STY_DP: case W65816::STY_DP:
Mode = DpAccess::Write;
break;
// Pure loads / compares / bit-tests: read only (writes to A/X/Y/P, not DP).
case W65816::LDA_DP:
case W65816::LDX_DP:
case W65816::LDY_DP:
case W65816::ADC_DP: case W65816::ADC_DP:
case W65816::SBC_DP: case W65816::SBC_DP:
case W65816::AND_DP: case W65816::AND_DP:
@ -134,53 +145,68 @@ static int classifyDpImmAsImg(const MachineInstr &MI) {
case W65816::CPX_DP: case W65816::CPX_DP:
case W65816::CPY_DP: case W65816::CPY_DP:
case W65816::BIT_DP: case W65816::BIT_DP:
Mode = DpAccess::Read;
break;
// Read-modify-write.
case W65816::INC_DP: case W65816::INC_DP:
case W65816::DEC_DP: case W65816::DEC_DP:
case W65816::ASL_DP: case W65816::ASL_DP:
case W65816::LSR_DP: case W65816::LSR_DP:
case W65816::ROL_DP: case W65816::ROL_DP:
case W65816::ROR_DP: case W65816::ROR_DP:
Mode = DpAccess::ReadWrite;
break; break;
default: default:
return -1; return {-1, DpAccess::None};
} }
for (const auto &MO : MI.operands()) { for (const auto &MO : MI.operands()) {
if (!MO.isImm()) continue; if (!MO.isImm()) continue;
int64_t V = MO.getImm(); int64_t V = MO.getImm();
for (int i = 0; i < 8; ++i) for (int i = 0; i < 8; ++i)
if ((int64_t)IMG_DP[i] == V) if ((int64_t)IMG_DP[i] == V)
return i; return {i, Mode};
return -1; // First imm is the dp addr; not in IMG range. return {-1, DpAccess::None}; // First imm is the dp addr; not in IMG range.
} }
return -1; return {-1, DpAccess::None};
} }
bool W65816ImgCalleeSave::runOnMachineFunction(MachineFunction &MF) { bool W65816ImgCalleeSave::runOnMachineFunction(MachineFunction &MF) {
// Step 1: scan for IMG8..IMG15 usage. copyPhysReg already lowered // Step 1: scan for IMG8..IMG15 WRITES. Reads alone don't need saving
// some COPY $imgN = $a forms to STA_DP imm:0xC0 (etc.), so we have // — if we never write IMGn, the caller's value survives untouched
// to check both the physreg form AND the DP-immediate form. // (other functions we call also preserve IMG8..IMG15 by the same
bool UsedSlot[8] = {false}; // convention, so no chain breaks the invariant). Saving on read-only
bool AnyUsed = false; // use costs ~6 bytes per slot of needlessly-saved prologue/epilogue
// (caught by evalAt at 1.96× Calypsi — 5 IMG slots saved when fewer
// were actually written).
//
// copyPhysReg lowers `COPY $imgN = $a` to `STA_DP imm:0xCx`, so we
// check both the physreg-DEF form AND the DP-imm-store form.
bool WrittenSlot[8] = {false};
bool AnyWritten = false;
for (auto &MBB : MF) { for (auto &MBB : MF) {
for (auto &MI : MBB) { for (auto &MI : MBB) {
// physreg form: $imgN = ... or ... = $imgN // physreg-DEF form: $imgN appearing as a Def operand.
for (const auto &MO : MI.operands()) { for (const auto &MO : MI.operands()) {
if (!MO.isReg() || MO.getReg() == 0) continue; if (!MO.isReg() || MO.getReg() == 0 || !MO.isDef()) continue;
int idx = classifyImgReg(MO.getReg()); int idx = classifyImgReg(MO.getReg());
if (idx >= 0) { if (idx >= 0) {
UsedSlot[idx] = true; WrittenSlot[idx] = true;
AnyUsed = true; AnyWritten = true;
} }
} }
// DP-imm form: lda dp imm:0xC0 etc. // DP-imm form: STA_DP / INC_DP / etc. write the slot at $Cx.
int idx = classifyDpImmAsImg(MI); auto [idx, mode] = classifyDpImmAsImg(MI);
if (idx >= 0) { if (idx >= 0 &&
UsedSlot[idx] = true; (mode == DpAccess::Write || mode == DpAccess::ReadWrite)) {
AnyUsed = true; WrittenSlot[idx] = true;
AnyWritten = true;
} }
} }
} }
if (!AnyUsed) return false; if (!AnyWritten) return false;
// Rename for downstream Step 2/3/4 readability — they use UsedSlot.
bool (&UsedSlot)[8] = WrittenSlot;
(void)AnyWritten;
// Step 2: allocate one frame slot per used IMG. Size = 2 bytes (each // Step 2: allocate one frame slot per used IMG. Size = 2 bytes (each
// Img16 holds a 16-bit value). Mark as a spill slot so PEI accounts // Img16 holds a 16-bit value). Mark as a spill slot so PEI accounts
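The write-only scan in Step 1 can be illustrated with a simplified model. The types below are hypothetical stand-ins for MachineInstr, and the slot layout (eight 16-bit slots at even addresses $C0 through $CE) is an assumption inferred from the "$C0..$CE" range and the 2-byte slot size mentioned in the pass.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { DP_NONE, DP_READ, DP_WRITE, DP_READWRITE } DpAccess;

/* Simplified instruction: a direct-page address plus how it is accessed. */
typedef struct {
    uint8_t  dpAddr;
    DpAccess mode;
} DpOp;

/* Map a DP address to an IMG slot index 0..7, or -1 if not a slot.
 * Assumes slots sit at even addresses $C0, $C2, ..., $CE. */
static int slotForAddr(uint8_t addr) {
    if (addr < 0xC0 || addr > 0xCE || (addr & 1))
        return -1;
    return (addr - 0xC0) / 2;
}

/* Bitmask of slots that are written (Write or ReadWrite).
 * Read-only uses leave the caller's value intact and need no save. */
static uint8_t writtenSlots(const DpOp *ops, size_t n) {
    uint8_t mask = 0;
    for (size_t i = 0; i < n; i++) {
        int slot = slotForAddr(ops[i].dpAddr);
        if (slot >= 0 &&
            (ops[i].mode == DP_WRITE || ops[i].mode == DP_READWRITE))
            mask |= (uint8_t)(1u << slot);
    }
    return mask;
}
```

A function that only loads from $C0 but stores to $C2 and increments $C4 would save two slots under this rule instead of three.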

View file

@ -942,6 +942,17 @@ def : Pat<(i16 (zextloadi8 (W65816Wrapper tglobaladdr:$g))),
def : Pat<(i16 (zextloadi8 (W65816Wrapper texternalsym:$s))), def : Pat<(i16 (zextloadi8 (W65816Wrapper texternalsym:$s))),
(ANDi16imm (LDAabs texternalsym:$s), 0xFF)>; (ANDi16imm (LDAabs texternalsym:$s), 0xFF)>;
// Loads with i1 memory type from globals: GlobalOpt narrows `static short`
// to i1 when it sees every assignment is 0 or 1. zextloadi1 and extloadi1
// reach us as i16-result loads with `s8`/i1 memory type; emit them as a
// normal byte load + mask (zext) or bare load (ext).
def : Pat<(i16 (zextloadi1 (W65816Wrapper tglobaladdr:$g))),
(ANDi16imm (LDAabs tglobaladdr:$g), 0xFF)>;
def : Pat<(i16 (extloadi1 (W65816Wrapper tglobaladdr:$g))),
(LDAabs tglobaladdr:$g)>;
def : Pat<(i16 (sextloadi1 (W65816Wrapper tglobaladdr:$g))),
(ANDi16imm (LDAabs tglobaladdr:$g), 1)>;
// CMP / branches. CMP sets the flags via the W65816cmp SDNode (glue // CMP / branches. CMP sets the flags via the W65816cmp SDNode (glue
// out); the W65816brcc node consumes the glue and dispatches to the // out); the W65816brcc node consumes the glue and dispatches to the
// right Bxx instruction by condition code. // right Bxx instruction by condition code.

View file

@ -117,17 +117,33 @@ bool W65816LowerWide32::runOnMachineFunction(MachineFunction &MF) {
MachineInstr *DefMI = MRI.getUniqueVRegDef(W); MachineInstr *DefMI = MRI.getUniqueVRegDef(W);
if (DefMI && DefMI->getOpcode() == TargetOpcode::REG_SEQUENCE) { if (DefMI && DefMI->getOpcode() == TargetOpcode::REG_SEQUENCE) {
Register Lo, Hi; Register Lo, Hi;
bool Bail = false;
for (unsigned op = 1; op + 1 < DefMI->getNumOperands(); op += 2) { for (unsigned op = 1; op + 1 < DefMI->getNumOperands(); op += 2) {
if (!DefMI->getOperand(op).isReg() || if (!DefMI->getOperand(op).isReg() ||
!DefMI->getOperand(op + 1).isImm()) !DefMI->getOperand(op + 1).isImm())
continue; continue;
unsigned idx = DefMI->getOperand(op + 1).getImm(); unsigned idx = DefMI->getOperand(op + 1).getImm();
Register Src = DefMI->getOperand(op).getReg(); Register Src = DefMI->getOperand(op).getReg();
unsigned SrcSub = DefMI->getOperand(op).getSubReg();
// If the source has a sub-register specifier (e.g.
// `%W.sub_lo:wide32` is a slice of a wide32 vreg), the
// effective "half" is the corresponding half of that source.
// Resolve via wideMap when the parent is already mapped;
// otherwise defer until a later iteration picks it up.
if (SrcSub != 0) {
if (!Src.isVirtual() || !wideMap.count(Src)) {
Bail = true;
break;
}
auto [SrcLo, SrcHi] = wideMap[Src];
Src = (SrcSub == llvm::sub_lo) ? SrcLo : SrcHi;
}
if (idx == llvm::sub_lo) if (idx == llvm::sub_lo)
Lo = Src; Lo = Src;
else if (idx == llvm::sub_hi) else if (idx == llvm::sub_hi)
Hi = Src; Hi = Src;
} }
if (Bail) continue;
if (Lo && Hi) { if (Lo && Hi) {
wideMap[W] = {Lo, Hi}; wideMap[W] = {Lo, Hi};
toErase.push_back(DefMI); toErase.push_back(DefMI);
@ -156,25 +172,38 @@ bool W65816LowerWide32::runOnMachineFunction(MachineFunction &MF) {
MachineInstr *LoDefMI = nullptr; MachineInstr *LoDefMI = nullptr;
MachineInstr *HiDefMI = nullptr; MachineInstr *HiDefMI = nullptr;
bool ok = true; bool ok = true;
bool Bail = false;
for (MachineInstr &MI : MRI.def_instructions(W)) { for (MachineInstr &MI : MRI.def_instructions(W)) {
if (!MI.isCopy()) { ok = false; break; } if (!MI.isCopy()) { ok = false; break; }
const MachineOperand &Dst = MI.getOperand(0); const MachineOperand &Dst = MI.getOperand(0);
const MachineOperand &Src = MI.getOperand(1); const MachineOperand &Src = MI.getOperand(1);
if (!Dst.isReg() || Dst.getReg() != W) { ok = false; break; } if (!Dst.isReg() || Dst.getReg() != W) { ok = false; break; }
unsigned SubIdx = Dst.getSubReg(); unsigned SubIdx = Dst.getSubReg();
Register S = Src.getReg();
unsigned SrcSub = Src.getSubReg();
// If the source has a sub-register specifier, resolve through
// wideMap[parent]. Symmetric with the REG_SEQUENCE handler
// above — without this, `%W.sub_lo = COPY %V.sub_lo:wide32`
// records the wide32 parent %V instead of %V's i16 sub_lo.
if (SrcSub != 0) {
if (!S.isVirtual() || !wideMap.count(S)) { Bail = true; break; }
auto [SL, SH] = wideMap[S];
S = (SrcSub == llvm::sub_lo) ? SL : SH;
}
if (SubIdx == llvm::sub_lo) { if (SubIdx == llvm::sub_lo) {
if (LoDefMI) { ok = false; break; } if (LoDefMI) { ok = false; break; }
LoDefMI = &MI; LoDefMI = &MI;
LoSrc = Src.getReg(); LoSrc = S;
} else if (SubIdx == llvm::sub_hi) { } else if (SubIdx == llvm::sub_hi) {
if (HiDefMI) { ok = false; break; } if (HiDefMI) { ok = false; break; }
HiDefMI = &MI; HiDefMI = &MI;
HiSrc = Src.getReg(); HiSrc = S;
} else { } else {
ok = false; ok = false;
break; break;
} }
} }
if (Bail) continue;
if (ok && LoSrc && HiSrc) { if (ok && LoSrc && HiSrc) {
wideMap[W] = {LoSrc, HiSrc}; wideMap[W] = {LoSrc, HiSrc};
if (LoDefMI) toErase.push_back(LoDefMI); if (LoDefMI) toErase.push_back(LoDefMI);


@ -281,7 +281,11 @@ bool W65816PromoteFiToImg::runOnMachineFunction(MachineFunction &MF) {
Name == "__modsi3" || Name == "__ashlhi3" || Name == "__modsi3" || Name == "__ashlhi3" ||
Name == "__lshrhi3" || Name == "__ashrhi3" || Name == "__lshrhi3" || Name == "__ashrhi3" ||
Name == "__ashlsi3" || Name == "__lshrsi3" || Name == "__ashlsi3" || Name == "__lshrsi3" ||
Name == "__ashrsi3") Name == "__ashrsi3" ||
// 64-bit helpers: use $E0..$EE only, no IMG0..7 touch.
Name == "__ashldi3" || Name == "__lshrdi3" ||
Name == "__ashrdi3" || Name == "__cmpdi2" ||
Name == "__ucmpdi2")
return true; return true;
return false; return false;
} }


@ -54,6 +54,7 @@
#include "llvm/CodeGen/MachineInstrBuilder.h" #include "llvm/CodeGen/MachineInstrBuilder.h"
#include "llvm/Support/Debug.h" #include "llvm/Support/Debug.h"
#include "llvm/Support/Format.h" #include "llvm/Support/Format.h"
#include <functional>
using namespace llvm; using namespace llvm;
@ -131,6 +132,501 @@ static bool isImgSafeCall(const MachineInstr &MI) {
} }
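// Phase 12 peephole: A-dead PHA/PLA bracket elision. Shapes (a) and
// (b) are described here; shape (c), the DP-only bracket, is
// documented at its own loop inside the function.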
//
// (a) PEI single-store IMG-source-STAfi bracket. When the next op
// after PLA redefines A, the bracket is dead weight:
//
// PHA ; (LDA_DP $cx | TXA | TYA) ; STA_StackRel (off+2) ; PLA
// [next redefines A]
// →
// (LDA_DP $cx | TXA | TYA) ; STA_StackRel off
//
// (b) ImgCalleeSave multi-store bracket at function entry. When the
// post-PLA pattern is "STX_DP ... ; STA_StackRel destOff ; [redefines
// A]", the post-PLA STA is storing entry-A to its final slot — we
// reorder by hoisting that STA to BEFORE the bracket, then dropping
// PHA/PLA and reverting inner offsets:
//
// PHA ; (LDA_DP $cx ; STA_StackRel off+2)×N ; PLA
// STX_DP $cM ; STA_StackRel destOff
// [next redefines A]
// →
// STA_StackRel destOff ; hoisted, entry-A → slot first
// (LDA_DP $cx ; STA_StackRel off)×N
// STX_DP $cM ; STX stays after saves
// [next op]
//
// Restricted to the entry MBB starting at MBB.begin() to ensure the
// match is an ImgCalleeSave-emitted prologue bracket (and not a mid-
// function bracket where the post-PLA STA is consuming a *different*
// A value than what was preserved).
static bool elidePhaBracket(MachineFunction &MF,
const W65816InstrInfo *TII) {
bool Changed = false;
auto opNoTouchA = [](unsigned Op) {
switch (Op) {
case W65816::STX_DP: case W65816::STX_Abs:
case W65816::STY_DP: case W65816::STY_Abs:
return true;
default:
return false;
}
};
auto opRedefinesA = [](unsigned Op) {
switch (Op) {
case W65816::LDA_DP: case W65816::LDA_StackRel:
case W65816::LDA_Abs: case W65816::LDA_Imm16:
case W65816::LDAabs: case W65816::LDAi16imm:
case W65816::TXA: case W65816::TYA:
case W65816::PLA:
return true;
default:
return false;
}
};
// --- Case (a): single-store brackets anywhere in any MBB. ---
for (MachineBasicBlock &MBB : MF) {
SmallVector<MachineInstr *, 4> ToErase;
for (auto It = MBB.begin(); It != MBB.end(); ++It) {
if (It->getOpcode() != W65816::PHA) continue;
auto Lda = std::next(It);
if (Lda == MBB.end()) continue;
unsigned LdaOp = Lda->getOpcode();
bool LdaIsLoadDp = (LdaOp == W65816::LDA_DP);
bool LdaIsXfer = (LdaOp == W65816::TXA || LdaOp == W65816::TYA);
if (!LdaIsLoadDp && !LdaIsXfer) continue;
auto Sta = std::next(Lda);
if (Sta == MBB.end()) continue;
if (Sta->getOpcode() != W65816::STA_StackRel) continue;
auto Pla = std::next(Sta);
if (Pla == MBB.end()) continue;
if (Pla->getOpcode() != W65816::PLA) continue;
auto AfterPla = std::next(Pla);
if (AfterPla == MBB.end()) continue;
unsigned AfterPlaOp = AfterPla->getOpcode();
bool AfterDeadA = opRedefinesA(AfterPlaOp);
// Forward-walk liveness: if AfterPla is a branch and ALL its
// successors' first ops redefine A (recursing through
// unconditional-branch trampolines), A is dead.
if (!AfterDeadA && AfterPla->isBranch()) {
bool AllDead = true;
std::function<bool(MachineBasicBlock *, int)> firstRedef =
[&](MachineBasicBlock *B, int Depth) -> bool {
if (Depth > 3 || !B || B->empty()) return false;
MachineInstr &MI = B->front();
unsigned MOp = MI.getOpcode();
if (opRedefinesA(MOp)) return true;
if (MOp == W65816::BRA || MOp == W65816::BRL ||
MOp == W65816::JMP_Abs) {
for (auto &MO : MI.operands()) {
if (MO.isMBB()) {
return firstRedef(MO.getMBB(), Depth + 1);
}
}
}
return false;
};
for (MachineBasicBlock *Succ : MBB.successors()) {
if (!firstRedef(Succ, 0)) { AllDead = false; break; }
}
if (AllDead && !MBB.succ_empty()) AfterDeadA = true;
}
if (!AfterDeadA) continue;
MachineOperand &OffMO = Sta->getOperand(0);
if (!OffMO.isImm()) continue;
int64_t Off = OffMO.getImm();
if (Off < 2) continue;
OffMO.setImm(Off - 2);
ToErase.push_back(&*It);
ToErase.push_back(&*Pla);
}
for (MachineInstr *MI : ToErase) {
MI->eraseFromParent();
Changed = true;
}
}
// --- Case (c): multi-pair STA_DP-only bracket anywhere. ---
// IMG-to-IMG copies bracketed for A-preservation. No StackRel
// offsets to adjust (DP is absolute, immune to PHA shifts), so just
// drop PHA/PLA when A is dead at PLA's exit.
std::function<bool(MachineBasicBlock *, int)> firstRedef =
[&](MachineBasicBlock *B, int Depth) -> bool {
if (Depth > 3 || !B || B->empty()) return false;
MachineInstr &MI = B->front();
unsigned MOp = MI.getOpcode();
if (opRedefinesA(MOp)) return true;
if (MOp == W65816::BRA || MOp == W65816::BRL ||
MOp == W65816::JMP_Abs) {
for (auto &MO : MI.operands()) {
if (MO.isMBB()) return firstRedef(MO.getMBB(), Depth + 1);
}
}
return false;
};
for (MachineBasicBlock &MBB : MF) {
SmallVector<MachineInstr *, 4> ToErase;
for (auto It = MBB.begin(); It != MBB.end(); ++It) {
if (It->getOpcode() != W65816::PHA) continue;
// Walk inner LDA_DP + STA_DP pairs.
auto Inner = std::next(It);
int InnerPairs = 0;
bool BailInner = false;
while (Inner != MBB.end()) {
if (Inner->getOpcode() == W65816::PLA) break;
if (Inner->getOpcode() != W65816::LDA_DP) { BailInner = true; break; }
auto St = std::next(Inner);
if (St == MBB.end() || St->getOpcode() != W65816::STA_DP) {
BailInner = true; break;
}
++InnerPairs;
Inner = std::next(St);
}
if (BailInner || Inner == MBB.end() || InnerPairs < 1) continue;
// Inner == PLA. Check liveness after PLA.
auto Post = std::next(Inner);
if (Post == MBB.end()) continue;
unsigned PostOp = Post->getOpcode();
bool ADead = opRedefinesA(PostOp);
if (!ADead && Post->isBranch()) {
bool AllDead = true;
for (MachineBasicBlock *Succ : MBB.successors()) {
if (!firstRedef(Succ, 0)) { AllDead = false; break; }
}
if (AllDead && !MBB.succ_empty()) ADead = true;
}
if (!ADead) continue;
// Eligible: drop PHA + PLA (no offset adjustment for DP).
ToErase.push_back(&*It);
ToErase.push_back(&*Inner);
}
for (MachineInstr *MI : ToErase) {
MI->eraseFromParent();
Changed = true;
}
}
// --- Case (b): ImgCalleeSave prologue bracket in entry MBB. ---
// PHA must be the FIRST instruction (or first after PEI prologue ops
// like REP/TAY/TSC/SEC/SBC/TCS/TYA) in the entry MBB. This ensures
// we're looking at the prologue's IMG save block.
MachineBasicBlock &EntryMBB = MF.front();
auto BB = EntryMBB.begin();
// Skip PEI prologue ops to reach the first ImgCalleeSave PHA.
while (BB != EntryMBB.end()) {
unsigned Op = BB->getOpcode();
if (Op == W65816::PHA) break;
// PEI prologue ops we expect to see before ImgCalleeSave's PHA.
if (Op == W65816::REP || Op == W65816::TAY ||
Op == W65816::TSC || Op == W65816::SEC ||
Op == W65816::SBC_Imm16 || Op == W65816::TCS ||
Op == W65816::TYA) {
++BB;
continue;
}
BB = EntryMBB.end(); // not a recognized prologue shape — bail
break;
}
if (BB != EntryMBB.end() && BB->getOpcode() == W65816::PHA) {
SmallVector<MachineInstr *, 8> InnerStas;
auto Inner = std::next(BB);
bool BailInner = false;
while (Inner != EntryMBB.end()) {
unsigned IOp = Inner->getOpcode();
if (IOp == W65816::PLA) break;
// Inner must be alternating LDA_DP + STA_StackRel pairs.
if (IOp != W65816::LDA_DP) { BailInner = true; break; }
auto St = std::next(Inner);
if (St == EntryMBB.end() || St->getOpcode() != W65816::STA_StackRel) {
BailInner = true; break;
}
MachineOperand &OffMO = St->getOperand(0);
if (!OffMO.isImm() || OffMO.getImm() < 2) {
BailInner = true; break;
}
InnerStas.push_back(&*St);
Inner = std::next(St);
}
if (!BailInner && Inner != EntryMBB.end() && !InnerStas.empty()) {
// Inner == PLA. Walk forward through STX_DP / STY_DP (A-
// transparent) ops looking for STA_StackRel that consumes
// entry-A, then verify next op redefines A.
auto Post = std::next(Inner);
while (Post != EntryMBB.end() && opNoTouchA(Post->getOpcode())) {
++Post;
}
if (Post != EntryMBB.end() &&
Post->getOpcode() == W65816::STA_StackRel) {
auto AfterSta = std::next(Post);
if (AfterSta != EntryMBB.end() &&
opRedefinesA(AfterSta->getOpcode())) {
// Eligible. Move STA destOff to right BEFORE PHA, drop
// PHA + PLA, shift inner STA offsets by -2.
MachineInstr *StaToMove = &*Post;
MachineInstr *PhaMI = &*BB;
MachineInstr *PlaMI = &*Inner;
// splice: move StaToMove to position just before PhaMI.
EntryMBB.splice(PhaMI->getIterator(), &EntryMBB,
StaToMove->getIterator());
for (MachineInstr *Sta : InnerStas) {
Sta->getOperand(0).setImm(Sta->getOperand(0).getImm() - 2);
}
PhaMI->eraseFromParent();
PlaMI->eraseFromParent();
Changed = true;
}
}
}
}
return Changed;
}
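The case (a) rewrite can be modelled outside LLVM. The sketch below is a toy, not the pass itself: `Op`, `Inst`, and `elideSingleStoreBrackets` are hypothetical names, and it assumes a 16-bit accumulator, so PHA moves S by 2 and the bracketed store was emitted at `off+2`. It covers only the straight-line check that the instruction after PLA redefines A, not the branch-successor walk.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy opcode set; a stand-in for the W65816 opcodes, not the real enum.
enum class Op { PHA, PLA, LDA_DP, TXA, TYA, STA_SR, OTHER };

struct Inst {
  Op op;
  int imm;  // DP address or stack-relative offset; unused for PHA/PLA
};

// Ops that overwrite A without reading it.
static bool redefinesA(Op op) {
  return op == Op::LDA_DP || op == Op::TXA || op == Op::TYA || op == Op::PLA;
}

// Ops allowed as the single load inside the bracket.
static bool loadsA(Op op) {
  return op == Op::LDA_DP || op == Op::TXA || op == Op::TYA;
}

// Rewrites  PHA ; load ; STA off+2,s ; PLA ; <redefines A>
// into      load ; STA off,s ; <redefines A>.
bool elideSingleStoreBrackets(std::vector<Inst> &code) {
  bool changed = false;
  std::vector<Inst> out;
  std::size_t i = 0;
  while (i < code.size()) {
    bool match = i + 4 < code.size() &&
                 code[i].op == Op::PHA &&
                 loadsA(code[i + 1].op) &&
                 code[i + 2].op == Op::STA_SR && code[i + 2].imm >= 2 &&
                 code[i + 3].op == Op::PLA &&
                 redefinesA(code[i + 4].op);
    if (!match) {
      out.push_back(code[i]);
      ++i;
      continue;
    }
    Inst store = code[i + 2];
    store.imm -= 2;           // PHA is gone, so S is 2 bytes higher here
    assert(store.imm >= 0);
    out.push_back(code[i + 1]);
    out.push_back(store);
    i += 4;                   // skip PHA, load, STA, PLA; keep the follower
    changed = true;
  }
  code.swap(out);
  return changed;
}
```

Feeding it `PHA ; LDA_DP $D0 ; STA 5,s ; PLA ; LDA_DP $D2` leaves `LDA_DP $D0 ; STA 3,s ; LDA_DP $D2`. If the follower reads A instead, the sequence is left untouched.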
// Always-on: elide the STA $E0 / LDA $E0 round-trip in
// ADJCALLSTACKUP's Y-live i64-return path when the next instruction
// after the LDA is `STA_StackRel off,s` storing A to a slot. The
// emitted PEI sequence (see W65816FrameLowering ADJCALLSTACKUP):
//
// STA_DP $E0 ; save A across TSC
// TSC ; A = S
// CLC ; ADC_Imm16 #N ; TCS ; pop N bytes
// LDA_DP $E0 ; restore A
// STA_StackRel off, s ; store A to slot
//
// If the destination's pre-adjust offset (off + N) fits in a 1-byte
// stack-rel encoding, we can move the STA up to BEFORE the SP-adjust
// (using the pre-adjust offset) and drop both the save and reload.
//
// Saves 6 bytes + 8 cyc per match. evalAt has 4 of these.
static bool elideCallResultSaveSPReload(MachineFunction &MF,
const W65816InstrInfo *TII) {
bool Changed = false;
for (MachineBasicBlock &MBB : MF) {
SmallVector<MachineInstr *, 4> ToErase;
for (auto It = MBB.begin(); It != MBB.end(); ++It) {
if (It->getOpcode() != W65816::STA_DP) continue;
MachineOperand &SaveImm = It->getOperand(0);
if (!SaveImm.isImm() || SaveImm.getImm() != 0xE0) continue;
auto I1 = std::next(It);
if (I1 == MBB.end() || I1->getOpcode() != W65816::TSC) continue;
auto I2 = std::next(I1);
if (I2 == MBB.end() || I2->getOpcode() != W65816::CLC) continue;
auto I3 = std::next(I2);
if (I3 == MBB.end() || I3->getOpcode() != W65816::ADC_Imm16) continue;
MachineOperand &AdcImm = I3->getOperand(0);
if (!AdcImm.isImm()) continue;
int64_t N = AdcImm.getImm();
auto I4 = std::next(I3);
if (I4 == MBB.end() || I4->getOpcode() != W65816::TCS) continue;
auto I5 = std::next(I4);
if (I5 == MBB.end() || I5->getOpcode() != W65816::LDA_DP) continue;
MachineOperand &LoadImm = I5->getOperand(0);
if (!LoadImm.isImm() || LoadImm.getImm() != 0xE0) continue;
auto I6 = std::next(I5);
if (I6 == MBB.end() || I6->getOpcode() != W65816::STA_StackRel) continue;
MachineOperand &StaImm = I6->getOperand(0);
if (!StaImm.isImm()) continue;
int64_t Off = StaImm.getImm();
int64_t NewOff = Off + N;
if (NewOff < 0 || NewOff > 255) continue;
// Insert a new STA_StackRel at NewOff before the STA_DP $E0.
BuildMI(MBB, It, It->getDebugLoc(), TII->get(W65816::STA_StackRel))
.addImm(NewOff);
ToErase.push_back(&*It); // STA_DP $E0
ToErase.push_back(&*I5); // LDA_DP $E0
ToErase.push_back(&*I6); // original STA_StackRel
}
for (MachineInstr *MI : ToErase) {
MI->eraseFromParent();
Changed = true;
}
}
return Changed;
}
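The offset arithmetic behind this peephole is small enough to check in isolation. After `TCS` the stack pointer is `S + N`, so a slot addressed as `off, s` after the adjust sits at `(off + N), s` before it. The helper below is a hypothetical sketch of just that rebasing and the range check; `preAdjustOffset` is not a function in the pass.

```cpp
#include <cassert>
#include <cstdint>

// Rebase a stack-relative offset that is valid AFTER the SP was raised by
// N bytes into the equivalent offset BEFORE the adjustment. Returns false
// when the rebased offset does not fit the 0..255 range the pass accepts.
bool preAdjustOffset(int64_t Off, int64_t N, int64_t &NewOff) {
  assert(N >= 0 && "this sketch assumes the adjust pops bytes");
  int64_t Candidate = Off + N;
  if (Candidate < 0 || Candidate > 255)
    return false;
  NewOff = Candidate;
  return true;
}
```

For example, a store to `3, s` after popping 8 bytes becomes a store to `11, s` before the pop, while `250, s` with the same pop is rejected because 258 does not fit in one byte.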
// Returns true if the opcode is "transparent" to a STA→LDA forward —
// does not write A, does not change S, does not write to any stack
// memory. Used to widen the elideStoreForwarding peephole's window.
static bool isStaLdaTransparent(unsigned Opc) {
switch (Opc) {
// X/Y register ops (don't touch A or S)
case W65816::LDX_Imm16: case W65816::LDX_DP: case W65816::LDX_Abs:
case W65816::LDXi16imm:
case W65816::LDY_Imm16: case W65816::LDY_DP: case W65816::LDY_Abs:
case W65816::TAX: case W65816::TAY:
case W65816::INX: case W65816::INY:
case W65816::DEX: case W65816::DEY:
case W65816::STX_DP: case W65816::STX_Abs:
case W65816::STY_DP: case W65816::STY_Abs:
// Flag ops
case W65816::CLC: case W65816::SEC:
case W65816::CLD: case W65816::SED:
case W65816::CLI: case W65816::SEI:
case W65816::CLV:
case W65816::NOP:
return true;
default:
return false;
}
}
// Always-on: drop a redundant LDA following STA to the same slot when
// any intermediate ops are "transparent" (don't write A or change S
// or stack memory). STA doesn't modify A, so A still holds the value.
//
// STA off, s
// LDX #imm ; transparent
// LDA off, s ; redundant — A unchanged since STA
//
// Saves 1 instruction (3 bytes / 4 cyc) per match.
static bool elideStoreForwarding(MachineFunction &MF) {
bool Changed = false;
for (MachineBasicBlock &MBB : MF) {
SmallVector<MachineInstr *, 4> ToErase;
for (auto It = MBB.begin(); It != MBB.end(); ++It) {
if (It->getOpcode() != W65816::STA_StackRel) continue;
MachineOperand &S = It->getOperand(0);
if (!S.isImm()) continue;
int64_t StaOff = S.getImm();
// Walk forward up to 3 ops looking for matching LDA.
MachineBasicBlock::iterator Walk = std::next(It);
int Steps = 0;
while (Walk != MBB.end() && Steps < 3) {
unsigned WOp = Walk->getOpcode();
if (WOp == W65816::LDA_StackRel) {
MachineOperand &L = Walk->getOperand(0);
if (L.isImm() && L.getImm() == StaOff) {
ToErase.push_back(&*Walk);
}
break;
}
if (!isStaLdaTransparent(WOp)) break;
++Walk;
++Steps;
}
}
for (MachineInstr *MI : ToErase) {
MI->eraseFromParent();
Changed = true;
}
}
return Changed;
}
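The window walk is easy to exercise on a toy instruction list. The sketch below mirrors the loop above under stated assumptions: `Op`, `Inst`, `isTransparent`, and `forwardStores` are hypothetical stand-ins, the transparent set is reduced to two opcodes, and the window is the same three instructions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy opcode set; a stand-in for the W65816 opcodes, not the real enum.
enum class Op { STA_SR, LDA_SR, LDX_IMM, CLC, PHA, OTHER };

struct Inst {
  Op op;
  int imm;  // stack-relative offset or immediate
};

// Reduced transparent set: neither op writes A, moves S, or stores to stack.
static bool isTransparent(Op op) {
  return op == Op::LDX_IMM || op == Op::CLC;
}

// Drops an LDA off,s that follows STA off,s within a 3-instruction window
// of transparent ops. Returns the number of loads removed.
std::size_t forwardStores(std::vector<Inst> &code) {
  const int kWindow = 3;
  std::vector<bool> dead(code.size(), false);
  for (std::size_t i = 0; i < code.size(); ++i) {
    if (code[i].op != Op::STA_SR) continue;
    std::size_t j = i + 1;
    for (int steps = 0; j < code.size() && steps < kWindow; ++j, ++steps) {
      if (code[j].op == Op::LDA_SR) {
        if (code[j].imm == code[i].imm) dead[j] = true;
        break;  // any LDA ends the walk, matching or not
      }
      if (!isTransparent(code[j].op)) break;
    }
  }
  std::vector<Inst> out;
  std::size_t removed = 0;
  for (std::size_t i = 0; i < code.size(); ++i) {
    if (dead[i]) ++removed;
    else out.push_back(code[i]);
  }
  assert(out.size() + removed == code.size());
  code.swap(out);
  return removed;
}
```

`STA 3,s ; LDX #0 ; LDA 3,s` loses its load, while `STA 3,s ; PHA ; LDA 3,s` is left alone because PHA moves S and the second `3,s` names a different address.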
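// Always-on: drop a consecutive PLA/PHA pair. PLA restores A from
// the stack; PHA immediately pushes the same A back, so S and stack
// memory end up unchanged. Dropping the pair does leave A holding its
// pre-PLA value, which is harmless in the shape below because the
// next LDA redefines A. Emerges when multiple adjacent IMG
// copies are each bracketed with PHA/PLA for A-preservation: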
//
// PHA ; LDA dp ; STA dp ; PLA ; PHA ; LDA dp ; STA dp ; PLA
// ^^^^^^^^^^
// collapsed away
//
// Saves 2 instructions (2 bytes / 7 cyc) per match.
static bool elidePlaPhaPair(MachineFunction &MF) {
bool Changed = false;
for (MachineBasicBlock &MBB : MF) {
SmallVector<MachineInstr *, 4> ToErase;
for (auto It = MBB.begin(); It != MBB.end(); ++It) {
if (It->getOpcode() != W65816::PLA) continue;
auto I1 = std::next(It);
if (I1 == MBB.end() || I1->getOpcode() != W65816::PHA) continue;
ToErase.push_back(&*It);
ToErase.push_back(&*I1);
++It; // advance past PHA (already-to-erase)
}
for (MachineInstr *MI : ToErase) {
MI->eraseFromParent();
Changed = true;
}
}
return Changed;
}
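A toy version of the collapse shows how two adjacent brackets merge into the single multi-pair bracket that case (c) of `elidePhaBracket` expects. The names below (`Op`, `collapsePlaPha`) are hypothetical, and the sketch assumes, as in the bracket pattern, that whatever follows the dropped PHA redefines A.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy opcode set; a stand-in for the W65816 opcodes, not the real enum.
enum class Op { PHA, PLA, LDA_DP, STA_DP };

// Returns a copy of `in` with every adjacent PLA;PHA pair removed.
std::vector<Op> collapsePlaPha(const std::vector<Op> &in) {
  std::vector<Op> out;
  for (std::size_t i = 0; i < in.size(); ++i) {
    if (in[i] == Op::PLA && i + 1 < in.size() && in[i + 1] == Op::PHA) {
      ++i;  // skip the PHA as well
      continue;
    }
    out.push_back(in[i]);
  }
  assert(out.size() <= in.size());
  return out;
}
```

Two back-to-back brackets of `PHA ; LDA dp ; STA dp ; PLA` shrink from eight instructions to six, leaving one PHA at the front and one PLA at the end.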
// Always-on: drop a redundant LDA when the prior LDA loaded the same
// source and the only intervening instruction was PHA (which reads A
// but doesn't modify it). Emerges from i64 arg-push sequences:
//
// LDA dp
// PHA
// LDA dp ; A still has this value, so the reload is redundant
// PHA
//
// Stack-relative loads need care: PHA lowers S, so the same `off, s`
// encoding names a different address after the push. The only
// stack-relative reload that is provably redundant is `LDA 1, s`,
// which reads back the word PHA just pushed.
//
// Saves 1 instruction (3 bytes / 4 cyc) per match.
static bool elideRedundantLdaAfterPha(MachineFunction &MF) {
bool Changed = false;
for (MachineBasicBlock &MBB : MF) {
SmallVector<MachineInstr *, 4> ToErase;
for (auto It = MBB.begin(); It != MBB.end(); ++It) {
unsigned Op = It->getOpcode();
bool IsLdaSr = (Op == W65816::LDA_StackRel);
bool IsLdaDp = (Op == W65816::LDA_DP);
if (!IsLdaSr && !IsLdaDp) continue;
auto I1 = std::next(It);
if (I1 == MBB.end() || I1->getOpcode() != W65816::PHA) continue;
auto I2 = std::next(I1);
if (I2 == MBB.end() || I2->getOpcode() != Op) continue;
MachineOperand &S1 = It->getOperand(0);
MachineOperand &S2 = I2->getOperand(0);
if (!S1.isImm() || !S2.isImm()) continue;
if (S1.getImm() != S2.getImm()) continue;
// PHA moved S, so an equal stack-relative offset is a different
// address; only `1, s` (the word just pushed) is a true reload.
if (IsLdaSr && S1.getImm() != 1) continue;
ToErase.push_back(&*I2);
}
for (MachineInstr *MI : ToErase) {
MI->eraseFromParent();
Changed = true;
}
}
return Changed;
}
// Always-on: drop a dead STA in the i32-carry-propagation pattern:
//
// STA_StackRel off, s
// ADC_Imm16 #N ; doesn't touch slot
// STA_StackRel off, s ; overwrites first STA
//
// The first STA's value is shadowed by the second. Drop it.
// Saves 1 instruction (3 bytes / 5 cyc) per match.
static bool elideDeadStaCarry(MachineFunction &MF) {
bool Changed = false;
for (MachineBasicBlock &MBB : MF) {
SmallVector<MachineInstr *, 4> ToErase;
for (auto It = MBB.begin(); It != MBB.end(); ++It) {
if (It->getOpcode() != W65816::STA_StackRel) continue;
auto I1 = std::next(It);
if (I1 == MBB.end()) continue;
unsigned MidOp = I1->getOpcode();
bool IsAddImm = (MidOp == W65816::ADC_Imm16 ||
MidOp == W65816::ADCi16imm ||
MidOp == W65816::ADCEi16imm ||
MidOp == W65816::SBCi16imm ||
MidOp == W65816::SBCEi16imm);
if (!IsAddImm) continue;
auto I2 = std::next(I1);
if (I2 == MBB.end() || I2->getOpcode() != W65816::STA_StackRel) continue;
MachineOperand &Off1 = It->getOperand(0);
MachineOperand &Off2 = I2->getOperand(0);
if (!Off1.isImm() || !Off2.isImm()) continue;
if (Off1.getImm() != Off2.getImm()) continue;
ToErase.push_back(&*It);
}
for (MachineInstr *MI : ToErase) {
MI->eraseFromParent();
Changed = true;
}
}
return Changed;
}
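The shadowed-store pattern can likewise be checked on a toy list. This is a sketch under assumptions, not the pass: `Op`, `Inst`, and `dropShadowedStores` are hypothetical, and the middle instruction is limited to an immediate add or subtract, which by construction cannot read the stack slot.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy opcode set; a stand-in for the W65816 opcodes, not the real enum.
enum class Op { STA_SR, ADC_IMM, SBC_IMM, OTHER };

struct Inst {
  Op op;
  int imm;  // stack-relative offset or immediate
};

// Removes the first store in  STA off,s ; ADC/SBC #imm ; STA off,s.
// Returns the number of stores removed.
std::size_t dropShadowedStores(std::vector<Inst> &code) {
  std::vector<Inst> out;
  std::size_t removed = 0;
  for (std::size_t i = 0; i < code.size(); ++i) {
    bool shadowed = i + 2 < code.size() &&
                    code[i].op == Op::STA_SR &&
                    (code[i + 1].op == Op::ADC_IMM ||
                     code[i + 1].op == Op::SBC_IMM) &&
                    code[i + 2].op == Op::STA_SR &&
                    code[i].imm == code[i + 2].imm;
    if (shadowed) {
      ++removed;  // the later store to the same slot wins
      continue;
    }
    out.push_back(code[i]);
  }
  assert(out.size() + removed == code.size());
  code.swap(out);
  return removed;
}
```

`STA 1,s ; ADC #0 ; STA 1,s` keeps only the add and the second store. If the two stores target different offsets, nothing is removed.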
bool W65816StackRelToImg::runOnMachineFunction(MachineFunction &MF) { bool W65816StackRelToImg::runOnMachineFunction(MachineFunction &MF) {
if (skipFunction(MF.getFunction())) return false; if (skipFunction(MF.getFunction())) return false;
if (MF.getFunction().hasOptNone()) return false; if (MF.getFunction().hasOptNone()) return false;
@ -139,26 +635,48 @@ bool W65816StackRelToImg::runOnMachineFunction(MachineFunction &MF) {
// be from FP not SP and the PHP-wrap +1 adjustment differs. // be from FP not SP and the PHP-wrap +1 adjustment differs.
if (MF.getFrameInfo().hasVarSizedObjects()) return false; if (MF.getFrameInfo().hasVarSizedObjects()) return false;
// Always-on peepholes that run even when the main IMG promotion bails.
const W65816Subtarget &STIp = MF.getSubtarget<W65816Subtarget>();
// Run PLA;PHA collapse FIRST so adjacent brackets merge into a
// single multi-pair bracket — lets elidePhaBracket case (c) match
// the merged shape.
bool ChangedEarly = elidePlaPhaPair(MF);
ChangedEarly |= elidePhaBracket(MF, STIp.getInstrInfo());
ChangedEarly |= elideCallResultSaveSPReload(MF, STIp.getInstrInfo());
ChangedEarly |= elideDeadStaCarry(MF);
ChangedEarly |= elideRedundantLdaAfterPha(MF);
// elideStoreForwarding only when main IMG promotion would bail —
// running it early in non-bailing functions cascades into IMG-slot
// reallocation that regresses strcpy 1.63×. Gated below.
// 2. Bail if the function has any non-IMG-safe call (would clobber // 2. Bail if the function has any non-IMG-safe call (would clobber
// our IMG0..7 promotions) or is recursive (same). Tried allowing // our IMG0..7 promotions) or is recursive (same). Tried allowing
// IMG8..15 + ImgCalleeSave fallback for these cases (gained 12 // IMG8..15 + own-pass save/restore for these cases (today, after
// inst on evalAt), but broke sprintf and fib due to subtle // landing W65816LowerWide32 + ImgCalleeSave-writes-only fixes), and
// interactions with ImgCalleeSave's slot allocation. Reverted. // saw: evalAt 498→500 (NET LOSS due to save/restore overhead) AND
// qsort #70 regression. The IMG8..15 path is not currently a win
// for our benchmarks; reverted.
StringRef SelfName = MF.getName(); StringRef SelfName = MF.getName();
for (MachineBasicBlock &MBB : MF) { for (MachineBasicBlock &MBB : MF) {
for (MachineInstr &MI : MBB) { for (MachineInstr &MI : MBB) {
if (!MI.isCall()) continue; if (!MI.isCall()) continue;
if (!isImgSafeCall(MI)) return false; if (!isImgSafeCall(MI)) {
ChangedEarly |= elideStoreForwarding(MF);
return ChangedEarly;
}
for (const MachineOperand &MO : MI.operands()) { for (const MachineOperand &MO : MI.operands()) {
StringRef Name; StringRef Name;
if (MO.isGlobal()) Name = MO.getGlobal()->getName(); if (MO.isGlobal()) Name = MO.getGlobal()->getName();
else if (MO.isSymbol()) Name = MO.getSymbolName(); else if (MO.isSymbol()) Name = MO.getSymbolName();
else continue; else continue;
if (Name == SelfName) return false; if (Name == SelfName) {
ChangedEarly |= elideStoreForwarding(MF);
return ChangedEarly;
}
} }
} }
} }
uint8_t imgBase = 0xD0; uint8_t imgBase = 0xD0u;
// 3. Count stack-rel accesses per offset. CRITICAL: the stack // 3. Count stack-rel accesses per offset. CRITICAL: the stack
// pointer shifts during the function due to PHP/PLP (+1 byte) and // pointer shifts during the function due to PHP/PLP (+1 byte) and
@ -614,23 +1132,65 @@ bool W65816StackRelToImg::runOnMachineFunction(MachineFunction &MF) {
auto Tya = std::next(Tcs); auto Tya = std::next(Tcs);
while (Tya != EntryMBB.end() && Tya->isDebugInstr()) ++Tya; while (Tya != EntryMBB.end() && Tya->isDebugInstr()) ++Tya;
if (Tya != EntryMBB.end() && Tya->getOpcode() == W65816::TYA) { if (Tya != EntryMBB.end() && Tya->getOpcode() == W65816::TYA) {
// Walk past A-transparent ops (STX_DP, STY_DP) — these
// don't touch A, so TAY/TYA can still be removed.
auto Sta = std::next(Tya); auto Sta = std::next(Tya);
while (Sta != EntryMBB.end() && Sta->isDebugInstr()) ++Sta; while (Sta != EntryMBB.end() &&
(Sta->isDebugInstr() ||
Sta->getOpcode() == W65816::STX_DP ||
Sta->getOpcode() == W65816::STY_DP)) {
++Sta;
}
if (Sta != EntryMBB.end() && if (Sta != EntryMBB.end() &&
Sta->getOpcode() == W65816::STA_DP &&
Sta->getNumOperands() >= 1 && Sta->getNumOperands() >= 1 &&
Sta->getOperand(0).isImm()) { Sta->getOperand(0).isImm()) {
int64_t StaAddr = Sta->getOperand(0).getImm(); unsigned StaOp = Sta->getOpcode();
// Build new STA_DP between REP and TSC. bool IsStaDp = (StaOp == W65816::STA_DP);
DebugLoc DL = Sta->getDebugLoc(); bool IsStaSr = (StaOp == W65816::STA_StackRel);
BuildMI(EntryMBB, Tsc, DL, TII->get(W65816::STA_DP)) if (IsStaDp || IsStaSr) {
.addImm(StaAddr) // For STA_StackRel: pre-TCS offset = post-TCS_off - N
.addReg(W65816::A, RegState::Implicit); // where N = SBC immediate. Only valid if off >= N.
// Erase: TAY, TYA, old STA_DP. int64_t StaAddr = Sta->getOperand(0).getImm();
Tay->eraseFromParent(); int64_t SbcImm = Sbc->getOperand(0).isImm()
Tya->eraseFromParent(); ? Sbc->getOperand(0).getImm() : -1;
Sta->eraseFromParent(); // Drop ADCi16imm pseudo-tied operands: imm is at op 0 for
Changed = true; // SBC_Imm16 but op 2 for SBCi16imm — handle uniformly.
if (!Sbc->getOperand(0).isImm() &&
Sbc->getNumOperands() >= 3 &&
Sbc->getOperand(2).isImm()) {
SbcImm = Sbc->getOperand(2).getImm();
}
int64_t NewAddr = IsStaDp ? StaAddr : (StaAddr - SbcImm);
bool OffOk = IsStaDp || (NewAddr >= 1 && SbcImm > 0);
// Safety: the op after the spill-STA must REDEFINE A
// (not read it). Otherwise A would be lost: TSC/SBC
// clobbered it and the TAY/TYA save is gone.
auto Next = std::next(Sta);
while (Next != EntryMBB.end() && Next->isDebugInstr())
++Next;
bool NextRedef = false;
if (Next != EntryMBB.end()) {
unsigned NOp = Next->getOpcode();
NextRedef =
NOp == W65816::LDA_DP || NOp == W65816::LDA_StackRel ||
NOp == W65816::LDA_Abs || NOp == W65816::LDA_Imm16 ||
NOp == W65816::LDAabs || NOp == W65816::LDAi16imm ||
NOp == W65816::TXA || NOp == W65816::TYA ||
NOp == W65816::PLA;
}
if (OffOk && NextRedef) {
// Build new STA_<DP|StackRel> between REP and TSC.
DebugLoc DL = Sta->getDebugLoc();
BuildMI(EntryMBB, Tsc, DL, TII->get(StaOp))
.addImm(NewAddr)
.addReg(W65816::A, RegState::Implicit);
// Erase: TAY, TYA, old STA.
Tay->eraseFromParent();
Tya->eraseFromParent();
Sta->eraseFromParent();
Changed = true;
}
}
} }
} }
} }
@ -1459,5 +2019,17 @@ bool W65816StackRelToImg::runOnMachineFunction(MachineFunction &MF) {
} }
} }
// Run elideStoreForwarding at the very end, AFTER IMG promotion has
// committed slot assignments. Running it with the other early
// peepholes cascades into different IMG-promotion choices and was
// observed to regress strcpy 1.63×. At this point promotion is done,
// so dropping a redundant LDA can no longer disturb slot allocation.
Changed |= elideStoreForwarding(MF);
Changed |= ChangedEarly;
return Changed; return Changed;
} }