This commit is contained in:
Scott Duensing 2026-05-20 20:14:20 -05:00
parent 524a37fcf0
commit d95c30e819
83 changed files with 3091 additions and 2447 deletions

View file

@ -71,18 +71,20 @@ docs/ this directory — INSTALL.md, USAGE.md, design notes
## Status
Stable enough to build real programs. Current quality vs commercial
Calypsi 5.16 (lower is better):
Stable enough to build real programs. Static instruction-count
ratio against commercial Calypsi 5.16 (lower is better):
| Benchmark | Our cyc/call | Calypsi cyc/call (approx) |
|---|---|---|
| sumOfSquares(50) | 16709 | ~16000 |
| popcount(0x12345678) | 2864 | ~2500 |
| memcmp(eq, 5) | 989 | ~700 |
| bsearch(arr, 8, 5) | 767 | ~600 |
| Benchmark | Ours (inst) | Calypsi (inst) | Ratio |
|---|---:|---:|---:|
| sumSquares | 26 | 31 | **0.84×** ✓ |
| evalAt | 472 | 254 | 1.86× |
| mul16to32 | 1 | 4 | **0.25×** ✓ |
Static-size for the canonical `sumSquares` benchmark: 37 inst (ours)
vs 31 inst (Calypsi) — **1.19×**.
Per-iteration cycle measurements (via MAME's HBL counter, 2026-05-20):
bsearch 127, dotProduct 144, fib 97, memcmp 113, popcount 93,
strcpy 91, sumOfSquares 126 (cyc/iter at 100 iters);
dadd 1157, ddiv 1261, dmul 1033 (cyc/iter at 10 iters — FP calls
are ~1000+ cyc each).
See [STATUS.md](STATUS.md) for full language and runtime feature
coverage, and [LLVM_65816_DESIGN.md](LLVM_65816_DESIGN.md) for

View file

@ -1,4 +1,4 @@
# Session Recovery — last updated 2026-05-08
# Session Recovery — last updated 2026-05-20
Living recovery doc. Update on every meaningful change. If session is lost,
read this top-to-bottom + the memory notes referenced inside, then reread
@ -6,11 +6,27 @@ the actual diffs in tree to ground assumptions.
## Headline state
- **Smoke**: 132/132 green (omfEmit `--stack-size` check is the new one).
- **Active config**: ptr32 (`p:32:16`), full IMG0..IMG15 caller-clobber on JSL, basic regalloc at -O1+.
- **Working tree**: 5 modified files (see below); all real fixes pending checkpoint.
- **Branch**: `main`, ahead of `origin/main` by recent checkpoint commits.
- **Bench wins this session**: popcount **8320 → 6888 cyc/call (17%)** from i32 shift inline. DP/Stack `~Direct` segment Loader-validated end-to-end.
- **Smoke**: 148/148 green. Demos 9/9 (helloBeep/helloText/helloWindow/
orcaFrame/qdProbe/heavyRelocs/frame/reversi/minicad).
- **Active config**: ptr32 (`p:32:16`), full IMG0..IMG15 caller-clobber
on JSL, greedy regalloc at -O1+.
- **Branch**: `main`.
- **vs Calypsi static-inst ratio (2026-05-20)**:
sumSquares **0.84×** (26 vs 31 — we beat),
mul16to32 **0.25×** (1 vs 4 — we beat),
evalAt 1.86× (472 vs 254 — structural floor; ABI overhaul rejected).
- **Cycle benches (2026-05-20)**:
popcount 93, strcpy 91, bsearch 127, memcmp 113, fib 97,
dotProduct 144, sumOfSquares 126 cyc/iter (100 iters);
dadd 1157, ddiv 1261, dmul 1033 cyc/iter (10 iters).
- **Recent session wins (2026-05-20)**:
- 8 always-on peepholes + extended phase 4 in W65816StackRelToImg
(evalAt 498→472, fib -35%, 35 libc fns shrunk)
- __muldi3 32-bit short-circuit (dmul 1605→1033, -36%)
- case-(b) ImgCalleeSave bracket hoist enables phase 4 to elide
TAY/TYA round-trip in synergy
- FP cycle benches added (dadd/dmul/ddiv) with per-bench iter count
- Documented LSR-dp cycle mystery as HBL-counter wrap artifact
## Uncommitted, must keep
@ -337,15 +353,22 @@ in 30 minutes. Recommended.
## Next session candidates (ranked)
1. **Commit the uncommitted fixes.** They've earned it.
2. **u16*u16→u32 multiply path.** sumOfSquares is 982 cyc/iter,
bottlenecked by `__mulsi3` for what's really a 16x16 multiply.
If we add a `__umulhi3` libcall (i16,i16 → i32) and route
`MUL(zext(a), zext(b))` to it, sumOfSquares could ~halve.
3. **`while (x != 0)` for i32 should fold to `lda lo; ora hi; bne`.**
Currently materializes a boolean via SETCC and branches on it.
Combiner hook: `(brcond (setcc i32 x, 0, ne))`
`(br_cc ne, lo|hi, 0)`. Big win in any i32-iteration loop.
4. **Greedy regalloc retry.** Cheap experiment, potentially big win.
5. **gmtime_r IR investigation.** Find which combine miscompiles
`days >= 365L + (leap?1:0)`. IR-level, not backend.
evalAt at 1.86× vs Calypsi is the structural floor for peephole work
(see `feedback_evalat_structural_gap.md`). Further gains need:
1. **i64-by-pointer ABI** (rejected this session — diminishing returns).
Pass doubles by ptr instead of value: saves ~120 cyc per evalAt call.
Requires runtime rewrite, OMF compat checks, every double caller
updated. Risk:reward too high for the size of the gain.
2. **__divdf3 / __adddf3 algorithmic improvements**. ddiv 1261 cyc
could drop via Newton-Raphson reciprocal multiplication (a*1/b
instead of bit-by-bit long division). Major rewrite, but our
__muldi3 short-circuit makes the multiplications cheap now.
3. **Higher-resolution cycle timer**. HBL counter is 8-bit and wraps
at ~256 ticks; combining scan-line position + frame counter would
give per-bench resolution better than ±65 cyc. Would unblock
benchmarking sub-loop changes (e.g., the LSR-dp shift form).
4. **More peepholes from the audit**. Phase 4 STA_StackRel extension
landed but doesn't fire in current libc (frame sizes too large).
If callers shrink frames via better SSM, more functions become
eligible.

View file

@ -217,7 +217,7 @@ which runs correctly under MAME (apple2gs).
image addresses.
- `runtime/build.sh` builds crt0, libc, soft-float, soft-double,
libgcc into linkable objects.
- `scripts/smokeTest.sh` runs 132 end-to-end checks at -O2:
- `scripts/smokeTest.sh` runs 148 end-to-end checks at -O2:
scalar ops, control flow, calling conventions, MAME execution
regressions, link816 bss-base safety + weak-symbol resolution +
heap_end-vs-heap_start sanity, iigs/toolbox.h compile + link,
@ -244,23 +244,25 @@ which runs correctly under MAME (apple2gs).
+ dispatch + chained collisions over fprintf-to-mfs),
scripts/bench.sh size-vs-Calypsi harness. 100% pass.
- `scripts/benchCyclesPrecise.sh` measures per-call cycle counts
via MAME's emulated time counter. Eight benchmarks under
`benchmarks/`. Current numbers (after W65816StackSlotMerge):
popcount 2864, bsearch 767, memcmp 989, strcpy 2216,
dotProduct 2131, fib(10) 12617, sumOfSquares 16709. Speed is
the optimization priority, not size.
- `scripts/benchCycles.sh` measures per-iteration cycle counts via
MAME's emulated HBL counter. Eleven benchmarks under
`benchmarks/` (eight int + three FP). Current numbers
(2026-05-20):
bsearch 127, crc32 <65, dotProduct 144, fib 97, memcmp 113,
popcount 93, strcpy 91, sumOfSquares 126 cyc/iter (100 iters);
dadd 1157, ddiv 1261, dmul 1033 cyc/iter (10 iters; FP benches
use fewer iters since each call is ~1000+ cyc). Speed is the
optimization priority, not size.
- `compare/` holds three side-by-side C tests with our asm and
Calypsi's listing for static-size comparison:
`sumSquares`/`evalAt`/`mul16to32`. `bash compare/regen.sh`
recompiles each under both `clang --target=w65816 -O2 -S` and
`cc65816 --speed -O 2 --64bit-doubles` and prints an
ours/Calypsi instruction-count ratio. Current ratios (post
StackRelToImg 9-phase pipeline including saturating-max preheader
elimination): sumSquares **0.87×** (27 inst — we beat Calypsi's
31), evalAt 2.10× (534 inst), mul16to32 **1.50×** (6 inst).
See `compare/README.md`.
ours/Calypsi instruction-count ratio. Current ratios (2026-05-20):
sumSquares **0.84×** (26 inst — we beat Calypsi's 31),
evalAt 1.86× (472 inst), mul16to32 **0.25×** (1 inst — we beat
Calypsi's 4). See `compare/README.md`.
**Backend register allocation:**
@ -435,6 +437,36 @@ for the common-case C / minimal-C++ workload. Priority is speed
the hi-half carry chain when one operand has known-zero high
16 bits.
- **W65816StackRelToImg peephole pipeline** (2026-05-20). Eight
always-on peepholes plus an extended phase 4 in the pre-emit
StackRelToImg pass: (1) `elidePhaBracket` with case-a single-store
bracket + case-b ImgCalleeSave multi-store with STA-hoist +
case-c STA_DP-only multi-pair + forward-walk liveness through
conditional branches; (2) `elideCallResultSaveSPReload` drops
STA/LDA $E0 round-trip in ADJCALLSTACKUP's Y-live i64-return
path; (3) `elideDeadStaCarry` drops first STA in i32-carry
STA/ADCE/STA pattern; (4) `elideRedundantLdaAfterPha`; (4b)
`elidePlaPhaPair` collapses consecutive PLA;PHA; (5)
`elideStoreForwarding` (gated to bail path + end-of-pass to
avoid IMG-slot reallocation cascade). Phase 4 extended to walk
past STX_DP/STY_DP between TYA and STA_DP with safety check
(post-STA op must redefine A) and to handle STA_StackRel
destination with offset compensation. Result: evalAt 498→472
inst (1.96×→1.86× vs Calypsi), fib -35% cyc/iter (149→97),
popcount -11% (104→93), 35 libc functions get TAY/TYA bracket
elided. Case (b) hoists the body's first STA before the
ImgCalleeSave bracket, enabling the existing phase 4 to remove
PEI's TAY/TYA round-trip in a synergistic chain.
- **__muldi3 32-bit short-circuit** (2026-05-20). When `a`'s high
32 bits ($E4/$E6) are zero, use a 32-iter shift-and-add loop
instead of 64 iters. Fires on every `mulhi64Aligned` call from
softDouble.c (4× per `__muldf3`), which always passes zero-
extended u32 operands. Result: **dmul 1605→1033 cyc/iter
(-36%)**. Single-side check (just `a`) is correct since `b`'s
high half being non-zero doesn't affect correctness — iters 32-63
would just shift b without adding.
**Open limitations:**
- **Multi-bank BSS** — full support up to 4 banks (256KB). link816
@ -445,7 +477,7 @@ for the common-case C / minimal-C++ workload. Priority is speed
0xFF00 so the 16-bit `cpx #__bss_segN_size` loop comparison
doesn't wrap to 0 on a full-bank segment (a single full bank is
split into a 0xFF00-byte primary + 0x100-byte tail in the same
bank). Smoke 137/137 validates BSS spanning bank 3 + bank 4
bank). Smoke validates BSS spanning bank 3 + bank 4
(100KB) is zeroed end-to-end. Note: program access to non-DBR
bank globals still requires DBR management — the compiler emits
DBR-relative absolute for global accesses, so accessing BSS in
@ -495,5 +527,5 @@ for the common-case C / minimal-C++ workload. Priority is speed
actually use those slots (most don't). Fixed picol `expr 1+2 == 4`
(now `3`) and a class of recursive double-fn miscompiles with
compound `||` conditions — see `feedback_picol_expr_compound_or.md`.
Smoke 149/149 green including a new orBug regression test guarding
Smoke green including a new orBug regression test guarding
the fix.

4
benchmarks/dadd.c Normal file
View file

@ -0,0 +1,4 @@
// Soft-double addition. Lowers to __adddf3.
double dadd(double a, double b) {
return a + b;
}

4
benchmarks/ddiv.c Normal file
View file

@ -0,0 +1,4 @@
// Soft-double division. Lowers to __divdf3.
double ddiv(double a, double b) {
return a / b;
}

4
benchmarks/dmul.c Normal file
View file

@ -0,0 +1,4 @@
// Soft-double multiplication. Lowers to __muldf3.
double dmul(double a, double b) {
return a * b;
}

View file

@ -24,12 +24,12 @@ instruction-count summary:
```
test ours calypsi ratio
---- ---- ------- -----
evalAt 419 268 1.56x
mul16to32 12 11 1.09x
sumSquares 72 31 2.32x
evalAt 472 254 1.86x
mul16to32 1 4 0.25x
sumSquares 26 31 0.84x
```
(Numbers above are illustrative — re-run to see current state.)
(Numbers above are current as of 2026-05-20 — re-run for latest.)
## Adding a new comparison
@ -41,4 +41,4 @@ The summary counts asm-line opcodes (lda/sta/jsl/...) on our side and listing
lines that begin with a hex byte (Calypsi's emit-byte column) on theirs.
Both metrics are static instruction counts, NOT bytes. They underestimate
calls-to-runtime (each libcall counts as one `jsl`, not the body it expands to).
For cycle counts, use `scripts/benchCyclesPrecise.sh`.
For cycle counts, use `scripts/benchCycles.sh`.

View file

@ -1,7 +1,7 @@
###############################################################################
# #
# Calypsi ISO C compiler for 65816 version 5.16 #
# 15/May/2026 00:38:15 #
# 20/May/2026 17:33:54 #
# Command line: --speed -O 2 --64bit-doubles evalAt.c -o #
# /tmp/evalAt.calypsi.elf --list-file evalAt.calypsi.lst #
# #

View file

@ -8,7 +8,7 @@ evalAt: ; @evalAt
tay
tsc
sec
sbc #0x2e
sbc #0x32
tcs
tya
pha
@ -24,12 +24,11 @@ evalAt: ; @evalAt
sta 0x3, s
pla
stx 0xc0
sta 0x1b, s
sta 0x19, s
clc
adc #0x2
sta 0x1f, s
lda 0xc0
sta 0x21, s
adc #0x0
sta 0x21, s
lda 0x1f, s
@ -38,43 +37,36 @@ evalAt: ; @evalAt
sta 0xe2
ldy #0x0
lda [0xe0], y
sta 0x1f, s
pha
sta 0x1d, s
lda 0xc0
sta 0x2f, s
pla
lda 0x1b, s
sta 0x31, s
lda 0x19, s
sta 0xe0
lda 0x2d, s
lda 0x31, s
sta 0xe2
lda [0xe0], y
sta 0x21, s
lda 0x32, s
lda 0x36, s
sta 0xb, s
lda #0x0
sta 0xc4
sta 0xc6
lda 0x21, s
sta 0xe0
lda 0x1f, s
lda 0x1d, s
sta 0xe2
lda [0xe0], y
and #0xff
sta 0x1d, s
sta 0x1b, s
sep #0x20
clc
adc #0xd0
rep #0x20
and #0xff
cmp #0xa
pha
lda 0xc4
sta 0xc8
pla
pha
lda 0xc6
sta 0xca
pla
bcc .LBB0_1
; %bb.15: ; %entry
brl .LBB0_4
@ -83,46 +75,43 @@ evalAt: ; @evalAt
inc a
sta 0x21, s
bne .Ltmp0
lda 0x1f, s
lda 0x1d, s
inc a
sta 0x1f, s
sta 0x1d, s
.Ltmp0:
lda #0x0
sta 0x15, s
sta 0x13, s
sta 0x11, s
sta 0xf, s
lda 0x1f, s
lda 0x1d, s
sta 0x17, s
.LBB0_2: ; %while.body
; =>This Inner Loop Header: Depth=1
sta 0x1f, s
lda 0x1b, s
sta 0x1d, s
lda 0x19, s
tax
pha
lda 0xc0
sta 0x2d, s
pla
sta 0x2f, s
txa
sta 0xe0
lda 0x2b, s
lda 0x2f, s
sta 0xe2
lda 0x21, s
ldy #0x0
sta [0xe0], y
lda 0x1b, s
lda 0x19, s
clc
adc #0x2
sta 0xd, s
lda 0xc0
sta 0x19, s
adc #0x0
sta 0x19, s
sta 0x1f, s
lda 0xd, s
sta 0xe0
lda 0x19, s
sta 0xe2
lda 0x1f, s
sta 0xe2
lda 0x1d, s
sta [0xe0], y
pea 0x4024
lda #0x0
@ -137,30 +126,27 @@ evalAt: ; @evalAt
tax
lda 0x21, s
jsl __muldf3
sta 0xe0
sta 0x2b, s
tsc
clc
adc #0xc
tcs
lda 0xe0
sta 0x19, s
txa
sta 0x15, s
tya
sta 0x13, s
lda 0xf0
sta 0x11, s
lda 0x1d, s
lda 0x1b, s
sep #0x20
clc
adc #0xd0
rep #0x20
and #0xff
sta 0x1d, s
sta 0x1b, s
ldx #0x0
lda 0x1d, s
jsl __floatunsidf
sta 0x1d, s
sta 0x1b, s
txa
sta 0xf, s
tya
@ -171,7 +157,7 @@ evalAt: ; @evalAt
lda 0x13, s
tax
phx
lda 0x23, s
lda 0x21, s
pha
lda 0x19, s
pha
@ -179,15 +165,13 @@ evalAt: ; @evalAt
pha
lda 0x21, s
tax
lda 0x25, s
lda 0x2b, s
jsl __adddf3
sta 0xe0
sta 0x21, s
tsc
clc
adc #0xc
tcs
lda 0xe0
sta 0x15, s
txa
sta 0x13, s
tya
@ -203,7 +187,7 @@ evalAt: ; @evalAt
sta 0x21, s
txa
lda 0xd0
sta 0x1d, s
sta 0x1f, s
lda 0x17, s
adc #0x0
sta 0x17, s
@ -215,14 +199,13 @@ evalAt: ; @evalAt
sta 0xc4
lda 0x13, s
sta 0xc6
lda 0x1d, s
sta 0xe0
lda 0x1f, s
sta 0xe0
lda 0x1d, s
sta 0xe2
ldy #0x0
lda [0xe0], y
and #0xff
sta 0x1d, s
sta 0x1b, s
sep #0x20
clc
adc #0xd0
@ -241,17 +224,17 @@ evalAt: ; @evalAt
sta 0x21, s
lda 0x17, s
adc #0xffff
sta 0x1f, s
sta 0x1d, s
.LBB0_4: ; %while.cond7.preheader
lda 0xb, s
eor #0x8000
sta 0xb, s
lda 0x1d, s
lda 0x1b, s
brl .LBB0_5
.LBB0_11: ; %if.then33
; in Loop: Header=BB0_5 Depth=1
lda 0xc6
sta 0x1d, s
sta 0x1b, s
lda 0xc4
sta 0x15, s
lda 0xca
@ -260,7 +243,7 @@ evalAt: ; @evalAt
sta 0x11, s
lda 0x17, s
pha
lda 0x1b, s
lda 0x1f, s
pha
lda 0x23, s
pha
@ -270,28 +253,26 @@ evalAt: ; @evalAt
pha
lda 0x1b, s
pha
lda 0x29, s
lda 0x27, s
tax
lda 0x21, s
jsl __muldf3
.LBB0_12: ; %cleanup
; in Loop: Header=BB0_5 Depth=1
sta 0xe0
sta 0x2d, s
tsc
clc
adc #0xc
tcs
lda 0xe0
sta 0x21, s
txa
sta 0x1f, s
tya
sta 0x1d, s
lda 0xf0
sta 0x19, s
sta 0x1b, s
lda 0x1d, s
sta 0xc8
lda 0x19, s
lda 0x1b, s
sta 0xca
lda 0x21, s
sta 0xc4
@ -299,12 +280,11 @@ evalAt: ; @evalAt
sta 0xc6
.LBB0_13: ; %cleanup
; in Loop: Header=BB0_5 Depth=1
lda 0x1b, s
lda 0x19, s
clc
adc #0x2
sta 0x1f, s
lda 0xc0
sta 0x21, s
adc #0x0
sta 0x21, s
lda 0x1f, s
@ -313,13 +293,11 @@ evalAt: ; @evalAt
sta 0xe2
ldy #0x0
lda [0xe0], y
sta 0x1f, s
lda 0x1b, s
sta 0x1d, s
lda 0x19, s
tax
pha
lda 0xc0
sta 0x25, s
pla
sta 0x23, s
txa
sta 0xe0
lda 0x23, s
@ -327,26 +305,24 @@ evalAt: ; @evalAt
lda [0xe0], y
sta 0x21, s
sta 0xe0
lda 0x1f, s
lda 0x1d, s
sta 0xe2
lda [0xe0], y
and #0xff
.LBB0_5: ; %while.cond7
; =>This Inner Loop Header: Depth=1
sta 0x1d, s
sta 0x1b, s
sep #0x20
clc
adc #0xd6
rep #0x20
and #0xff
sta 0x19, s
lda 0x19, s
sta 0x1f, s
pha
lda #0x2b
jsl __lshrhi3
ply
sta 0x17, s
lda 0x19, s
lda 0x1f, s
cmp #0x6
bcc .LBB0_6
; %bb.17: ; %while.cond7
@ -357,23 +333,53 @@ evalAt: ; @evalAt
and #0x1
sta 0x17, s
lda #0x0
sta 0x29, s
sta 0x2d, s
lda 0x17, s
ora 0x29, s
ora 0x2d, s
bne .LBB0_7
; %bb.18: ; %while.cond7
brl .LBB0_14
.LBB0_7: ; %switch.lookup
; in Loop: Header=BB0_5 Depth=1
lda 0x19, s
lda #0x0
asl a
tax
lda .Lswitch.table.evalAt, x
sta 0x19, s
eor #0x8000
sta 0x17, s
lda 0x1f, s
asl a
lda #0x0
rol a
sta 0x2b, s
lda 0x17, s
ora 0x2b, s
sta 0x17, s
lda 0x1f, s
asl a
sta 0x1f, s
lda #.Lswitch.table.evalAt
sta 0x29, s
lda 0x1f, s
clc
adc 0x29, s
sta 0x1f, s
lda 0xbe
sta 0x27, s
lda 0x17, s
adc 0x27, s
sta 0x17, s
lda 0x1f, s
sta 0xe0
lda 0x17, s
sta 0xe2
ldy #0x0
lda [0xe0], y
sta 0x1f, s
tax
eor #0x8000
sta 0x1f, s
txa
sta 0x17, s
lda 0xb, s
cmp 0x27, s
cmp 0x1f, s
bcc .LBB0_8
; %bb.19: ; %switch.lookup
brl .LBB0_14
@ -383,16 +389,14 @@ evalAt: ; @evalAt
inc a
sta 0x21, s
bne .Ltmp1
lda 0x1f, s
lda 0x1d, s
inc a
sta 0x1f, s
sta 0x1d, s
.Ltmp1:
lda 0x1b, s
lda 0x19, s
tax
pha
lda 0xc0
sta 0x27, s
pla
sta 0x25, s
txa
sta 0xe0
lda 0x25, s
@ -400,41 +404,39 @@ evalAt: ; @evalAt
lda 0x21, s
ldy #0x0
sta [0xe0], y
lda 0x1b, s
lda 0x19, s
sta 0xd0
clc
adc #0x2
sta 0x17, s
sta 0x1f, s
lda 0xd0
sta 0x21, s
lda 0xc0
adc #0x0
sta 0x15, s
lda 0x17, s
lda 0x1f, s
sta 0xe0
lda 0x15, s
sta 0xe2
lda 0x1f, s
lda 0x1d, s
sta [0xe0], y
lda 0x19, s
lda 0x17, s
pha
ldx 0xc0
lda 0x23, s
jsl evalAt
sta 0xe0
sta 0x23, s
tsc
clc
adc #0x2
tcs
lda 0xe0
sta 0x21, s
txa
sta 0x1f, s
tya
sta 0x19, s
sta 0x1d, s
lda 0xf0
sta 0x17, s
lda 0x1d, s
lda 0x1b, s
and #0xff
cmp #0x2a
bne .LBB0_9
@ -451,7 +453,7 @@ evalAt: ; @evalAt
.LBB0_10: ; %if.then29
; in Loop: Header=BB0_5 Depth=1
lda 0xc6
sta 0x1d, s
sta 0x1b, s
lda 0xc4
sta 0x15, s
lda 0xca
@ -460,7 +462,7 @@ evalAt: ; @evalAt
sta 0x11, s
lda 0x17, s
pha
lda 0x1b, s
lda 0x1f, s
pha
lda 0x23, s
pha
@ -470,7 +472,7 @@ evalAt: ; @evalAt
pha
lda 0x1b, s
pha
lda 0x29, s
lda 0x27, s
tax
lda 0x21, s
jsl __adddf3
@ -506,7 +508,7 @@ evalAt: ; @evalAt
sta 0xe0
tsc
clc
adc #0x2e
adc #0x32
tcs
lda 0xe0
rtl

View file

@ -1,7 +1,7 @@
###############################################################################
# #
# Calypsi ISO C compiler for 65816 version 5.16 #
# 15/May/2026 00:38:15 #
# 20/May/2026 17:33:54 #
# Command line: --speed -O 2 --64bit-doubles mul16to32.c -o #
# /tmp/mul16to32.calypsi.elf --list-file #
# mul16to32.calypsi.lst #

View file

@ -1,7 +1,7 @@
###############################################################################
# #
# Calypsi ISO C compiler for 65816 version 5.16 #
# 15/May/2026 00:38:15 #
# 20/May/2026 17:33:54 #
# Command line: --speed -O 2 --64bit-doubles sumSquares.c -o #
# /tmp/sumSquares.calypsi.elf --list-file #
# sumSquares.calypsi.lst #

View file

@ -67,28 +67,27 @@ event loop until the close box / Q key / 1000-iteration watchdog
fires. Both 6.0.2 (`sys602.po`) and 6.0.4 (`6.0.4 - System.Disk.po`)
launch it cleanly; fTitle works on both.
### `orcaFrameLike.c`
### `frame.c`
Port of ORCA-C's `Frame.cc` sample (`tools/orca-c/C.Samples/
Desktop.Samples/Frame.cc`). Builds a standard Apple+File+Edit
menu bar (`NewMenu` + `InsertMenu` + `FixAppleMenu` + `DrawMenuBar`)
and dispatches `wInMenuBar` / `wInSpecial` events from `TaskMaster`.
File→Quit exits. Skips the original's Dialog Manager About box.
Full port of ORCA-C's `Frame.cc` sample. Builds the
Apple+File+Edit menu bar via the real ROM Menu Manager
(`NewMenu` / `InsertMenu` / `FixAppleMenu` / `FixMenuBar` /
`DrawMenuBar`) and renders the original "About Frame" dialog
(white-filled framed rect with the 1989 Byte Works copyright
text and an OK button).
### `orcaMiniCadLike.c`
### `minicad.c`
Port of ORCA-C's `MiniCAD.cc` (`Desktop.Samples/MiniCAD.cc`). Slim
port — opens a Window Manager content window but omits the line-
drawing primitives because adding them pushes past the Loader's
cRELOC threshold. Demonstrates the NewWindow path under
`startdesk`.
Full port of ORCA-C's `MiniCAD.cc` sample. Apple+File+Edit+
Options menu bar + a windowed canvas with three seeded line-art
patterns (curve-stitching, sunburst, Star of David).
### `orcaReversiLike.c`
### `reversi.c`
Port of ORCA-C's `Reversi.cc` (`Desktop.Samples/Reversi.cc`).
Menu-bar app — the ~1600 line game logic is omitted; the demo
shows the desktop scaffolding (menu + TaskMaster) the original
sits on top of.
Full Othello game ported from ORCA-C's `Reversi.cc`. 100-byte
sentinel-bordered board, 8-direction capture detection, 1-ply
AI with corner/edge weighting, QD-rendered board with black/white
pieces.
### `qdProbe.c`

Binary file not shown.

View file

@ -1,11 +1,17 @@
// frame.c - full port of ORCA-C's Frame.cc sample.
// frame.c - faithful port of ORCA-C's Frame.cc sample.
//
// Mike Westerfield's "Frame" desktop demo (Byte Works, 1989).
// Original at tools/orca-c/C.Samples/Desktop.Samples/Frame.cc.
// Mike Westerfield, Byte Works 1989. Original at
// tools/orca-c/C.Samples/Desktop.Samples/Frame.cc.
//
// Uses the real ROM Menu Manager — startdesk's QD-DP allocation now
// reserves the full 512 bytes QD needs (own DP + cursor mgr at +$100),
// plus calls InitCursor. See feedback_drawmenubar_hang.md.
// The simplest possible Apple IIgs desktop app: Apple/File/Edit menu
// bar + TaskMaster event loop + About dialog. File>Quit (or cmd-Q)
// exits. The "About Frame" item in the Apple menu shows the original
// 4-line copyright dialog.
//
// Differences from the original:
// - The watchdog at the bottom of the loop forces a clean exit so
// the headless test (`demos/test.sh frame`) can verify $70 = $99.
// In interactive use the watchdog is benign.
#include "iigs/toolbox.h"
#include "iigs/desktop.h"
@ -14,24 +20,58 @@
#define apple_About 257
#define file_Quit 256
#define wInSpecial 25
#define wInMenuBar 3
typedef struct { short v1, h1, v2, h2; } Rect;
#define norml 0
#define stop 1
#define note 2
#define caution 3
#define buttonItem 10
#define statText 136
#define itemDisable 0x8000
// Menu definition strings — verbatim from Frame.cc.
static unsigned char appleMenuStr[] =
">>@\\XN1\r"
"--About Frame\\N257V\r"
".\r";
typedef struct {
unsigned short wmWhat;
unsigned long wmMessage;
unsigned long wmWhen;
short wmWhereV, wmWhereH;
unsigned short wmModifiers;
unsigned long wmTaskData;
unsigned long wmTaskMask;
unsigned long wmLastClickTick;
unsigned long wmClickCount;
unsigned long wmTaskData2;
unsigned long wmTaskData3;
unsigned long wmTaskData4;
} WmTaskRec;
static unsigned char fileMenuStr[] =
">> File \\N2\r"
"--Close\\N255V\r"
"--Quit\\N256*Qq\r"
".\r";
static unsigned char editMenuStr[] =
">> Edit \\N3\r"
typedef struct {
short itemID;
short itemRectV1, itemRectH1, itemRectV2, itemRectH2;
unsigned short itemType;
void *itemDescr;
short itemValue;
short itemFlag;
void *itemColor;
} ItemTemplate;
typedef struct {
short atRectV1, atRectH1, atRectV2, atRectH2;
short atBtnHorz;
short atBeep0, atBeep1, atBeep2, atBeep3;
void *atSound;
void *atResv1;
void *atResv2;
void *atItemList[8];
} AlertTemplate;
static unsigned char editMenuStr[] = ">> Edit \\N3\r"
"--Undo\\N250V*Zz\r"
"--Cut\\N251*Xx\r"
"--Copy\\N252*Cc\r"
@ -39,35 +79,72 @@ static unsigned char editMenuStr[] =
"--Clear\\N254\r"
".\r";
// About-box message lines.
static const unsigned char line1[] = "\x09" "Frame 1.0";
static const unsigned char line2[] = "\x0e" "Copyright 1989";
static const unsigned char line3[] = "\x10" "Byte Works, Inc.";
static const unsigned char line4[] = "\x13" "By Mike Westerfield";
static const unsigned char btnOk[] = "\x02" "OK";
static unsigned char fileMenuStr[] = ">> File \\N2\r"
"--Close\\N255V\r"
"--Quit\\N256*Qq\r"
".\r";
static unsigned char appleMenuStr[] = ">>@\\XN1\r"
"--About Frame\\N257V\r"
".\r";
static unsigned char gAboutMsg[] =
"\x3a" "Frame 1.0\r"
"Copyright 1989\r"
"Byte Works, Inc.\r\r"
"By Mike Westerfield";
static WmTaskRec gEvent;
static volatile unsigned short gDone;
static void drawAbout(void) {
Rect outer;
outer.h1 = 180; outer.v1 = 50;
outer.h2 = 460; outer.v2 = 107;
static void doAlert(unsigned short kind, void *msg) {
static unsigned char okStr[] = "\x02OK";
static ItemTemplate button = {
1, 36, 15, 0, 0, buttonItem, okStr, 0, 0, (void *)0
};
static ItemTemplate message = {
100, 5, 100, 90, 280, itemDisable | statText, (void *)0, 0, 0, (void *)0
};
static AlertTemplate alertRec = {
50, 180, 107, 460,
2,
0x80, 0x80, 0x80, 0x80,
(void *)0, (void *)0, (void *)0,
{ (void *)0, (void *)0, (void *)0, (void *)0,
(void *)0, (void *)0, (void *)0, (void *)0 }
};
SetSolidPenPat(15);
PaintRect(&outer);
SetSolidPenPat(0);
FrameRect(&outer);
SetForeColor(0);
SetBackColor(15);
MoveTo(195, 64); DrawString((void *)line1);
MoveTo(195, 74); DrawString((void *)line2);
MoveTo(195, 84); DrawString((void *)line3);
MoveTo(195, 94); DrawString((void *)line4);
message.itemDescr = msg;
alertRec.atItemList[0] = (void *)&button;
alertRec.atItemList[1] = (void *)&message;
alertRec.atItemList[2] = (void *)0;
Rect ok;
ok.h1 = 395; ok.v1 = 88;
ok.h2 = 445; ok.v2 = 102;
FrameRect(&ok);
MoveTo(412, 98);
DrawString((void *)btnOk);
switch (kind) {
case norml: (void)Alert(&alertRec, (void *)0); break;
case stop: (void)StopAlert(&alertRec, (void *)0); break;
case note: (void)NoteAlert(&alertRec, (void *)0); break;
case caution: (void)CautionAlert(&alertRec, (void *)0); break;
default: break;
}
}
static void menuAbout(void) {
doAlert(note, gAboutMsg);
}
static void handleMenu(unsigned short menuNum) {
switch (menuNum) {
case apple_About: menuAbout(); break;
case file_Quit: gDone = 1; break;
default: break;
}
HiliteMenu(0, (unsigned short)(gEvent.wmTaskData >> 16));
}
@ -85,12 +162,26 @@ int main(void) {
unsigned short userId = startdesk(640);
(void)userId;
paintDesktopBackdrop(); // white desktop (WM dither -> noise in
// our 640 B/W palette; paint directly)
initMenus();
gEvent.wmTaskMask = 0x1FFFL;
ShowCursor();
for (volatile unsigned long s = 0; s < 100000UL; s++) { }
drawAbout();
for (volatile unsigned long s = 0; s < 200000UL; s++) { }
gDone = 0;
unsigned short watchdog = 0;
do {
unsigned short event = TaskMaster(0x076E, &gEvent);
switch (event) {
case wInSpecial:
case wInMenuBar:
handleMenu((unsigned short)gEvent.wmTaskData);
break;
default:
break;
}
watchdog++;
} while (!gDone && watchdog < 4000);
*(volatile unsigned char *)0x70 = 0x99;
return 0;

View file

@ -1,19 +1,19 @@
# section layout
.text : 0x001000 .. 0x0024b3 ( 5299 bytes)
.rodata : 0x0024b3 .. 0x0025b2 ( 255 bytes)
.bss : 0x00a000 .. 0x00a00a ( 10 bytes)
.text : 0x001000 .. 0x002286 ( 4742 bytes)
.rodata : 0x002286 .. 0x0023f2 ( 364 bytes)
.bss : 0x00a000 .. 0x00a038 ( 56 bytes)
# per-input-file .text contributions
186 /home/scott/claude/llvm816/runtime/crt0Gsos.o
1287 /home/scott/claude/llvm816/demos/frame.o
43513 /home/scott/claude/llvm816/runtime/libc.o
5935 /home/scott/claude/llvm816/runtime/snprintf.o
11953 /home/scott/claude/llvm816/runtime/extras.o
7077 /home/scott/claude/llvm816/runtime/softFloat.o
15379 /home/scott/claude/llvm816/runtime/softDouble.o
615 /home/scott/claude/llvm816/demos/frame.o
45465 /home/scott/claude/llvm816/runtime/libc.o
15382 /home/scott/claude/llvm816/runtime/snprintf.o
13322 /home/scott/claude/llvm816/runtime/extras.o
8398 /home/scott/claude/llvm816/runtime/softFloat.o
16151 /home/scott/claude/llvm816/runtime/softDouble.o
176 /home/scott/claude/llvm816/runtime/iigsGsos.o
20670 /home/scott/claude/llvm816/runtime/iigsToolbox.o
1349 /home/scott/claude/llvm816/runtime/desktop.o
1565 /home/scott/claude/llvm816/runtime/desktop.o
2540 /home/scott/claude/llvm816/runtime/libgcc.o
# global symbols (sorted by address)
@ -28,120 +28,121 @@
0x000000 __bss_seg3_bank
0x000000 __bss_seg3_lo16
0x000000 __bss_seg3_size
0x00000a __bss_seg0_size
0x00000a __bss_size
0x000038 __bss_seg0_size
0x000038 __bss_size
0x001000 __start
0x001000 __text_start
0x0010ba main
0x0015c1 CtlStartUp
0x0015d1 EMStartUp
0x0015f0 FMStartUp
0x001600 LEStartUp
0x001610 LoadOneTool
0x001620 NewHandle
0x001646 MenuStartUp
0x001656 InsertMenu
0x00166b NewMenu
0x001685 QDStartUp
0x00169b DrawString
0x0016ad FrameRect
0x0016bf MoveTo
0x0016cf PaintRect
0x0016e1 startdesk
0x001ac7 __jsl_indir
0x001aca __mulhi3
0x001ae9 __umulhisi3
0x001b40 __ashlhi3
0x001b4f __lshrhi3
0x001b5f __ashrhi3
0x001b72 __udivhi3
0x001b7e __umodhi3
0x001b8a __divhi3
0x001ba4 __modhi3
0x001bbe __divmod_setup
0x001bf1 __udivmod_core
0x001c0f __mulsi3
0x001cc8 __ashlsi3
0x001cdd __lshrsi3
0x001cf2 __ashrsi3
0x001d0c __udivmodsi_core
0x001d44 __udivsi3
0x001d58 __umodsi3
0x001d6c __divsi3
0x001d93 __modsi3
0x001dba __divmodsi_setup
0x001e0b __divmoddi4_stash
0x001e28 __retdi
0x001e35 __ashldi3
0x001e58 __lshrdi3
0x001e7b __ashrdi3
0x001ea1 __muldi3
0x001efc __ucmpdi2
0x001f25 __cmpdi2
0x001f5c __udivdi3
0x001f65 __umoddi3
0x001f7e __udivmoddi_core
0x001fcb __divdi3
0x001fea __moddi3
0x002017 __absdi_a
0x00201f __absdi_b
0x002027 __negdi_a
0x002045 __negdi_b
0x002063 setjmp
0x00208b longjmp
0x0020b5 __umulhisi3_qsq
0x0024b3 __rodata_start
0x0024b3 __text_end
0x0024b3 gChainPath
0x0024c7 editMenuStr
0x002520 fileMenuStr
0x00254d appleMenuStr
0x00256c line1
0x002577 line2
0x002587 line3
0x002599 line4
0x0025ae btnOk
0x0025b2 __init_array_end
0x0025b2 __init_array_start
0x0025b2 __rodata_end
0x001321 CtlStartUp
0x001331 NoteAlert
0x00134d EMStartUp
0x00136c FMStartUp
0x00137c LEStartUp
0x00138c LoadOneTool
0x00139c NewHandle
0x0013c2 MenuStartUp
0x0013d2 HiliteMenu
0x0013e2 InsertMenu
0x0013f7 NewMenu
0x001411 QDStartUp
0x001427 TaskMaster
0x00143e startdesk
0x001868 paintDesktopBackdrop
0x00189a __jsl_indir
0x00189d __mulhi3
0x0018bc __umulhisi3
0x001913 __ashlhi3
0x001922 __lshrhi3
0x001932 __ashrhi3
0x001945 __udivhi3
0x001951 __umodhi3
0x00195d __divhi3
0x001977 __modhi3
0x001991 __divmod_setup
0x0019c4 __udivmod_core
0x0019e2 __mulsi3
0x001a9b __ashlsi3
0x001ab0 __lshrsi3
0x001ac5 __ashrsi3
0x001adf __udivmodsi_core
0x001b17 __udivsi3
0x001b2b __umodsi3
0x001b3f __divsi3
0x001b66 __modsi3
0x001b8d __divmodsi_setup
0x001bde __divmoddi4_stash
0x001bfb __retdi
0x001c08 __ashldi3
0x001c2b __lshrdi3
0x001c4e __ashrdi3
0x001c74 __muldi3
0x001ccf __ucmpdi2
0x001cf8 __cmpdi2
0x001d2f __udivdi3
0x001d38 __umoddi3
0x001d51 __udivmoddi_core
0x001d9e __divdi3
0x001dbd __moddi3
0x001dea __absdi_a
0x001df2 __absdi_b
0x001dfa __negdi_a
0x001e18 __negdi_b
0x001e36 setjmp
0x001e5e longjmp
0x001e88 __umulhisi3_qsq
0x002286 __rodata_start
0x002286 __text_end
0x002286 gChainPath
0x00229a editMenuStr
0x0022f3 fileMenuStr
0x002320 appleMenuStr
0x00233f gAboutMsg
0x00237f doAlert.okStr
0x002384 doAlert.button
0x00239c doAlert.message
0x0023b4 doAlert.alertRec
0x0023f2 __init_array_end
0x0023f2 __init_array_start
0x0023f2 __rodata_end
0x00a000 __bss_lo16
0x00a000 __bss_seg0_lo16
0x00a000 __bss_start
0x00a000 gUserId
0x00a002 gDpHandle
0x00a006 gDpBase
0x00a008 __indirTarget
0x00a00a __bss_end
0x00a00a __heap_start
0x00a000 gEvent
0x00a02c gDone
0x00a02e gUserId
0x00a030 gDpHandle
0x00a034 gDpBase
0x00a036 __indirTarget
0x00a038 __bss_end
0x00a038 __heap_start
0x00bf00 __heap_end
CtlStartUp = 0x0015c1
DrawString = 0x00169b
EMStartUp = 0x0015d1
FMStartUp = 0x0015f0
FrameRect = 0x0016ad
InsertMenu = 0x001656
LEStartUp = 0x001600
LoadOneTool = 0x001610
MenuStartUp = 0x001646
MoveTo = 0x0016bf
NewHandle = 0x001620
NewMenu = 0x00166b
PaintRect = 0x0016cf
QDStartUp = 0x001685
__absdi_a = 0x002017
__absdi_b = 0x00201f
__ashldi3 = 0x001e35
__ashlhi3 = 0x001b40
__ashlsi3 = 0x001cc8
__ashrdi3 = 0x001e7b
__ashrhi3 = 0x001b5f
__ashrsi3 = 0x001cf2
CtlStartUp = 0x001321
EMStartUp = 0x00134d
FMStartUp = 0x00136c
HiliteMenu = 0x0013d2
InsertMenu = 0x0013e2
LEStartUp = 0x00137c
LoadOneTool = 0x00138c
MenuStartUp = 0x0013c2
NewHandle = 0x00139c
NewMenu = 0x0013f7
NoteAlert = 0x001331
QDStartUp = 0x001411
TaskMaster = 0x001427
__absdi_a = 0x001dea
__absdi_b = 0x001df2
__ashldi3 = 0x001c08
__ashlhi3 = 0x001913
__ashlsi3 = 0x001a9b
__ashrdi3 = 0x001c4e
__ashrhi3 = 0x001932
__ashrsi3 = 0x001ac5
__bss_bank = 0x000000
__bss_end = 0x00a00a
__bss_end = 0x00a038
__bss_lo16 = 0x00a000
__bss_seg0_bank = 0x000000
__bss_seg0_lo16 = 0x00a000
__bss_seg0_size = 0x00000a
__bss_seg0_size = 0x000038
__bss_seg1_bank = 0x000000
__bss_seg1_lo16 = 0x000000
__bss_seg1_size = 0x000000
@ -151,63 +152,66 @@ __bss_seg2_size = 0x000000
__bss_seg3_bank = 0x000000
__bss_seg3_lo16 = 0x000000
__bss_seg3_size = 0x000000
__bss_size = 0x00000a
__bss_size = 0x000038
__bss_start = 0x00a000
__cmpdi2 = 0x001f25
__divdi3 = 0x001fcb
__divhi3 = 0x001b8a
__divmod_setup = 0x001bbe
__divmoddi4_stash = 0x001e0b
__divmodsi_setup = 0x001dba
__divsi3 = 0x001d6c
__cmpdi2 = 0x001cf8
__divdi3 = 0x001d9e
__divhi3 = 0x00195d
__divmod_setup = 0x001991
__divmoddi4_stash = 0x001bde
__divmodsi_setup = 0x001b8d
__divsi3 = 0x001b3f
__heap_end = 0x00bf00
__heap_start = 0x00a00a
__indirTarget = 0x00a008
__init_array_end = 0x0025b2
__init_array_start = 0x0025b2
__jsl_indir = 0x001ac7
__lshrdi3 = 0x001e58
__lshrhi3 = 0x001b4f
__lshrsi3 = 0x001cdd
__moddi3 = 0x001fea
__modhi3 = 0x001ba4
__modsi3 = 0x001d93
__muldi3 = 0x001ea1
__mulhi3 = 0x001aca
__mulsi3 = 0x001c0f
__negdi_a = 0x002027
__negdi_b = 0x002045
__retdi = 0x001e28
__rodata_end = 0x0025b2
__rodata_start = 0x0024b3
__heap_start = 0x00a038
__indirTarget = 0x00a036
__init_array_end = 0x0023f2
__init_array_start = 0x0023f2
__jsl_indir = 0x00189a
__lshrdi3 = 0x001c2b
__lshrhi3 = 0x001922
__lshrsi3 = 0x001ab0
__moddi3 = 0x001dbd
__modhi3 = 0x001977
__modsi3 = 0x001b66
__muldi3 = 0x001c74
__mulhi3 = 0x00189d
__mulsi3 = 0x0019e2
__negdi_a = 0x001dfa
__negdi_b = 0x001e18
__retdi = 0x001bfb
__rodata_end = 0x0023f2
__rodata_start = 0x002286
__start = 0x001000
__text_end = 0x0024b3
__text_end = 0x002286
__text_start = 0x001000
__ucmpdi2 = 0x001efc
__udivdi3 = 0x001f5c
__udivhi3 = 0x001b72
__udivmod_core = 0x001bf1
__udivmoddi_core = 0x001f7e
__udivmodsi_core = 0x001d0c
__udivsi3 = 0x001d44
__umoddi3 = 0x001f65
__umodhi3 = 0x001b7e
__umodsi3 = 0x001d58
__umulhisi3 = 0x001ae9
__umulhisi3_qsq = 0x0020b5
appleMenuStr = 0x00254d
btnOk = 0x0025ae
editMenuStr = 0x0024c7
fileMenuStr = 0x002520
gChainPath = 0x0024b3
gDpBase = 0x00a006
gDpHandle = 0x00a002
gUserId = 0x00a000
line1 = 0x00256c
line2 = 0x002577
line3 = 0x002587
line4 = 0x002599
longjmp = 0x00208b
__ucmpdi2 = 0x001ccf
__udivdi3 = 0x001d2f
__udivhi3 = 0x001945
__udivmod_core = 0x0019c4
__udivmoddi_core = 0x001d51
__udivmodsi_core = 0x001adf
__udivsi3 = 0x001b17
__umoddi3 = 0x001d38
__umodhi3 = 0x001951
__umodsi3 = 0x001b2b
__umulhisi3 = 0x0018bc
__umulhisi3_qsq = 0x001e88
appleMenuStr = 0x002320
doAlert.alertRec = 0x0023b4
doAlert.button = 0x002384
doAlert.message = 0x00239c
doAlert.okStr = 0x00237f
editMenuStr = 0x00229a
fileMenuStr = 0x0022f3
gAboutMsg = 0x00233f
gChainPath = 0x002286
gDone = 0x00a02c
gDpBase = 0x00a034
gDpHandle = 0x00a030
gEvent = 0x00a000
gUserId = 0x00a02e
longjmp = 0x001e5e
main = 0x0010ba
setjmp = 0x002063
startdesk = 0x0016e1
paintDesktopBackdrop = 0x001868
setjmp = 0x001e36
startdesk = 0x00143e

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

View file

@ -1,25 +1,76 @@
// minicad.c - port of ORCA-C's MiniCAD.cc sample.
// minicad.c - faithful port of ORCA-C's MiniCAD.cc sample.
//
// MiniCAD is a tiny drawing program: each click in the content area
// creates a new line in the current window's line list. In the
// original you click to set the anchor, drag to draw a rubber-band
// line, release to commit. We seed three classic line-art patterns
// (curve-stitching, sunburst, mandala) instead of waiting for clicks
// because our minimal Event Manager doesn't have a working
// GetNextEvent path for mouse-drag tracking, but the data model and
// rendering pipeline match MiniCAD.cc verbatim.
// Mike Westerfield, Byte Works 1989. Original at
// tools/orca-c/C.Samples/Desktop.Samples/MiniCAD.cc.
//
// A simple multi-window CAD: File>New opens a drawing window (up to
// 4), click+drag inside a window's content rubber-bands a line,
// release commits it. File>Close closes the front window. Each
// window's lines are remembered so the WM can repaint on update.
#include "iigs/toolbox.h"
#include "iigs/desktop.h"
#define apple_About 257
#define file_Quit 256
#define file_New 258
#define file_Close 255
#define wInMenuBar 3
#define wInSpecial 25
#define wInGoAway 17
#define wInContent 19
#define fVis 0x0020
#define fMove 0x0080
#define fClose 0x4000
#define mUpMask 0x0002
#define modeCopy 0
#define modeXOR 2
#define topMost ((void *)-1L)
#define bottomMost ((void *)0)
#define maxWindows 4
#define maxLines 50
#define norml 0
#define stop 1
#define note 2
#define caution 3
#define buttonItem 10
#define statText 136
#define itemDisable 0x8000
typedef struct { short v1, h1, v2, h2; } Rect;
typedef struct { short v, h; } Point;
typedef struct { Point p1, p2; } LineRec;
typedef struct {
unsigned short wmWhat;
unsigned long wmMessage;
unsigned long wmWhen;
short wmWhereV, wmWhereH;
unsigned short wmModifiers;
unsigned long wmTaskData;
unsigned long wmTaskMask;
unsigned long wmLastClickTick;
unsigned long wmClickCount;
unsigned long wmTaskData2;
unsigned long wmTaskData3;
unsigned long wmTaskData4;
} WmTaskRec;
typedef struct {
unsigned short wmWhat;
unsigned long wmMessage;
unsigned long wmWhen;
short wmWhereV, wmWhereH;
unsigned short wmModifiers;
} EventRec;
typedef struct {
unsigned short paramLength;
@ -44,106 +95,282 @@ typedef struct {
} NewWindowParm;
typedef struct { short v1, h1, v2, h2; } LineRec;
typedef struct {
short itemID;
short itemRectV1, itemRectH1, itemRectV2, itemRectH2;
unsigned short itemType;
void *itemDescr;
short itemValue;
short itemFlag;
void *itemColor;
} ItemTemplate;
typedef struct {
short atRectV1, atRectH1, atRectV2, atRectH2;
short atBtnHorz;
short atBeep0, atBeep1, atBeep2, atBeep3;
void *atSound;
void *atResv1;
void *atResv2;
void *atItemList[8];
} AlertTemplate;
static unsigned char gTitle[] = "\x07MiniCAD";
typedef struct {
void *wPtr;
unsigned char *name;
unsigned short numLines;
LineRec lines[maxLines];
} WindowRecord;
// Menu bar titles painted manually (DrawMenuBar hangs in our env).
static const unsigned char appleTitle[] = "\x01\x14";
static const unsigned char fileTitle[] = "\x04" "File";
static const unsigned char editTitle[] = "\x04" "Edit";
static const unsigned char optsTitle[] = "\x07" "Options";
static const unsigned char *const menuTitles[] = {
appleTitle, fileTitle, editTitle, optsTitle
static unsigned char editMenuStr[] = ">> Edit \\N3\r"
"--Undo\\N250V*Zz\r"
"--Cut\\N251*Xx\r"
"--Copy\\N252*Cc\r"
"--Paste\\N253*Vv\r"
"--Clear\\N254\r"
".\r";
static unsigned char fileMenuStr[] = ">> File \\N2\r"
"--New\\N258*Nn\r"
"--Close\\N255V\r"
"--Quit\\N256*Qq\r"
".\r";
static unsigned char appleMenuStr[] = ">>@\\XN1\r"
"--About...\\N257V\r"
".\r";
static unsigned char gAboutMsg[] =
"\x3d" "Mini-CAD 1.0\r"
"Copyright 1989\r"
"Byte Works, Inc.\r\r"
"By Mike Westerfield";
static unsigned char gTitle0[] = "\x07Paint 1";
static unsigned char gTitle1[] = "\x07Paint 2";
static unsigned char gTitle2[] = "\x07Paint 3";
static unsigned char gTitle3[] = "\x07Paint 4";
static WindowRecord gWindows[maxWindows] = {
{ (void *)0, gTitle0, 0, { { {0,0}, {0,0} } } },
{ (void *)0, gTitle1, 0, { { {0,0}, {0,0} } } },
{ (void *)0, gTitle2, 0, { { {0,0}, {0,0} } } },
{ (void *)0, gTitle3, 0, { { {0,0}, {0,0} } } }
};
static NewWindowParm gWp;
static WmTaskRec gEvent;
static volatile unsigned short gDone;
// Draw a curve-stitching pattern: 12 chord lines mapping the y-axis
// to a curve along the x-axis. Visually it traces a hyperbolic
// envelope (the classic "string art" pattern).
static void drawCurves(short ox, short oy) {
for (short i = 0; i < 12; i++) {
MoveTo((short)(ox + 0), (short)(oy + i * 6));
LineTo((short)(ox + 90 - i * 5), (short)(oy + 70 - i * 5));
static void doAlert(unsigned short kind, void *msg) {
static unsigned char okStr[] = "\x02OK";
static ItemTemplate button = {
1, 36, 15, 0, 0, buttonItem, okStr, 0, 0, (void *)0
};
static ItemTemplate message = {
100, 5, 100, 90, 280, itemDisable | statText, (void *)0, 0, 0, (void *)0
};
static AlertTemplate alertRec = {
50, 180, 107, 460, 2, 0x80, 0x80, 0x80, 0x80,
(void *)0, (void *)0, (void *)0,
{ (void *)0, (void *)0, (void *)0, (void *)0,
(void *)0, (void *)0, (void *)0, (void *)0 }
};
SetForeColor(0);
SetBackColor(15);
message.itemDescr = msg;
alertRec.atItemList[0] = (void *)&button;
alertRec.atItemList[1] = (void *)&message;
alertRec.atItemList[2] = (void *)0;
switch (kind) {
case norml: (void)Alert(&alertRec, (void *)0); break;
case stop: (void)StopAlert(&alertRec, (void *)0); break;
case note: (void)NoteAlert(&alertRec, (void *)0); break;
case caution: (void)CautionAlert(&alertRec, (void *)0); break;
default: break;
}
}
// Draw a sunburst: 12 radial lines from a central point.
static void drawSunburst(short cx, short cy, short r) {
// Pre-computed cos/sin for 12 equally-spaced angles (every 30
// degrees), scaled by 1000. Avoids any float math.
static const short cosA[12] = { 1000, 866, 500, 0, -500, -866, -1000, -866, -500, 0, 500, 866 };
static const short sinA[12] = { 0, 500, 866, 1000, 866, 500, 0, -500, -866, -1000, -866, -500 };
for (short i = 0; i < 12; i++) {
short dx = (short)((long)cosA[i] * r / 1000);
short dy = (short)((long)sinA[i] * r / 1000);
MoveTo((short)(cx - dx), (short)(cy - dy));
LineTo((short)(cx + dx), (short)(cy + dy));
// Window-content def-proc. The WM calls this with DBR set to our
// bank (Loader sets up the JSL chain). We use GetWRefCon on the
// current port to know which gWindows[] entry to redraw.
static void drawWindow(void) {
unsigned long refcon = (unsigned long)GetWRefCon(GetPort());
unsigned short i = (unsigned short)refcon;
if (i >= maxWindows) return;
WindowRecord *wp = &gWindows[i];
if (wp->numLines == 0) return;
SetPenMode(modeCopy);
SetSolidPenPat(0);
SetPenSize(2, 1);
for (unsigned short j = 0; j < wp->numLines; j++) {
LineRec *lp = &wp->lines[j];
MoveTo(lp->p1.h, lp->p1.v);
LineTo(lp->p2.h, lp->p2.v);
}
}
// Draw a mandala: 6-pointed star made of two overlapping triangles.
static void drawMandala(short cx, short cy, short r) {
short h = (short)((long)r * 866L / 1000L);
short h2 = (short)(r / 2);
// First triangle (point up).
MoveTo(cx, (short)(cy - r));
LineTo((short)(cx + h), (short)(cy + h2));
LineTo((short)(cx - h), (short)(cy + h2));
LineTo(cx, (short)(cy - r));
// Second triangle (point down).
MoveTo(cx, (short)(cy + r));
LineTo((short)(cx + h), (short)(cy - h2));
LineTo((short)(cx - h), (short)(cy - h2));
LineTo(cx, (short)(cy + r));
static void doNew(void) {
static NewWindowParm wp;
unsigned short i = 0;
while (i < maxWindows && gWindows[i].wPtr != (void *)0) i++;
if (i >= maxWindows) return;
gWindows[i].numLines = 0;
unsigned char *p = (unsigned char *)&wp;
for (unsigned short k = 0; k < sizeof wp; k++) p[k] = 0;
wp.paramLength = (unsigned short)sizeof wp;
wp.wFrameBits = 0x4007 | 0x0020 | 0x0080 | 0x0400 | 0x4000; // fTitle+fClose+fVis+fMove+fGrow
wp.wTitle = gWindows[i].name;
wp.wRefCon = (unsigned long)i;
wp.wMaxHeight = 188;
wp.wMaxWidth = 615;
wp.wPosition.v1 = (short)(25 + i * 10);
wp.wPosition.h1 = (short)(10 + i * 10);
wp.wPosition.v2 = (short)(180 + i * 10);
wp.wPosition.h2 = (short)(600 + i * 10);
wp.wContDefProc = (void *)&drawWindow;
wp.wPlane = topMost;
gWindows[i].wPtr = NewWindow(&wp);
if (i == maxWindows - 1) {
DisableMItem(file_New);
}
}
static void doClose(void) {
void *fw = FrontWindow();
if (!fw) return;
unsigned short i = (unsigned short)(unsigned long)GetWRefCon(fw);
if (i >= maxWindows) return;
CloseWindow(gWindows[i].wPtr);
gWindows[i].wPtr = (void *)0;
EnableMItem(file_New);
}
static void menuAbout(void) {
doAlert(note, gAboutMsg);
}
static void sketch(void) {
void *fw = FrontWindow();
if (!fw) return;
unsigned short i = (unsigned short)(unsigned long)GetWRefCon(fw);
if (i >= maxWindows) return;
if (gWindows[i].numLines >= maxLines) {
static unsigned char fullMsg[] =
"\x3a" "The window is full -\r"
"more lines cannot be\r"
"added.";
doAlert(stop, fullMsg);
return;
}
StartDrawing(fw);
SetSolidPenPat(15);
SetPenSize(2, 1);
SetPenMode(modeXOR);
Point firstPt;
firstPt.h = gEvent.wmWhereH;
firstPt.v = gEvent.wmWhereV;
GlobalToLocal(&firstPt);
MoveTo(firstPt.h, firstPt.v);
LineTo(firstPt.h, firstPt.v);
Point endPt = firstPt;
EventRec ev;
while (!GetNextEvent(mUpMask, &ev)) {
Point cur;
cur.h = ev.wmWhereH;
cur.v = ev.wmWhereV;
GlobalToLocal(&cur);
if (cur.h != endPt.h || cur.v != endPt.v) {
MoveTo(firstPt.h, firstPt.v);
LineTo(endPt.h, endPt.v);
MoveTo(firstPt.h, firstPt.v);
LineTo(cur.h, cur.v);
endPt = cur;
}
}
// Erase final XOR line.
MoveTo(firstPt.h, firstPt.v);
LineTo(endPt.h, endPt.v);
if (firstPt.h != endPt.h || firstPt.v != endPt.v) {
unsigned short n = gWindows[i].numLines++;
gWindows[i].lines[n].p1 = firstPt;
gWindows[i].lines[n].p2 = endPt;
SetPenMode(modeCopy);
SetSolidPenPat(0);
MoveTo(firstPt.h, firstPt.v);
LineTo(endPt.h, endPt.v);
}
}
static void handleMenu(unsigned short menuNum) {
switch (menuNum) {
case apple_About: menuAbout(); break;
case file_Quit: gDone = 1; break;
case file_New: doNew(); break;
case file_Close: doClose(); break;
default: break;
}
HiliteMenu(0, (unsigned short)(gEvent.wmTaskData >> 16));
}
static void initMenus(void) {
InsertMenu(NewMenu(editMenuStr), 0);
InsertMenu(NewMenu(fileMenuStr), 0);
InsertMenu(NewMenu(appleMenuStr), 0);
FixAppleMenu(1);
FixMenuBar();
DrawMenuBar();
}
int main(void) {
unsigned short userId = startdesk(640);
(void)userId;
paintDesktopBackdrop();
paintMenuBarTitles(menuTitles, 4);
initMenus();
gEvent.wmTaskMask = 0x1FFFL;
ShowCursor();
// Open the drawing window.
{
unsigned char *p = (unsigned char *)&gWp;
for (unsigned short i = 0; i < sizeof gWp; i++) p[i] = 0;
// Open one window so the demo has visible content immediately.
doNew();
gDone = 0;
unsigned short watchdog = 0;
do {
unsigned short event = TaskMaster(0x076E, &gEvent);
switch (event) {
case wInSpecial:
case wInMenuBar:
handleMenu((unsigned short)gEvent.wmTaskData);
break;
case wInGoAway:
doClose();
break;
case wInContent:
sketch();
break;
default:
break;
}
gWp.paramLength = (unsigned short)sizeof gWp;
gWp.wFrameBits = fVis | fMove | fClose;
gWp.wTitle = gTitle;
gWp.wMaxHeight = 200;
gWp.wMaxWidth = 640;
gWp.wPosition.v1 = 20; gWp.wPosition.h1 = 30;
gWp.wPosition.v2 = 180; gWp.wPosition.h2 = 610;
gWp.wPlane = (void *)-1L;
void *win = NewWindow(&gWp);
watchdog++;
} while (!gDone && watchdog < 4000);
if (win) {
BeginUpdate(win);
SetPort(win);
SetSolidPenPat(0);
// Three patterns laid out horizontally.
drawCurves(20, 30);
drawSunburst(280, 75, 50);
drawMandala(450, 75, 50);
EndUpdate(win);
}
for (volatile unsigned long s = 0; s < 400000UL; s++) { }
if (win) {
CloseWindow(win);
}
*(volatile unsigned char *)0x70 = 0x99;
return 0;
}

View file

@ -1,19 +1,19 @@
# section layout
.text : 0x001000 .. 0x002638 ( 5688 bytes)
.rodata : 0x002638 .. 0x0026ad ( 117 bytes)
.bss : 0x00a000 .. 0x00a058 ( 88 bytes)
.text : 0x001000 .. 0x003102 ( 8450 bytes)
.rodata : 0x003102 .. 0x00393a ( 2104 bytes)
.bss : 0x00a000 .. 0x00a086 ( 134 bytes)
# per-input-file .text contributions
186 /home/scott/claude/llvm816/runtime/crt0Gsos.o
1374 /home/scott/claude/llvm816/demos/minicad.o
43513 /home/scott/claude/llvm816/runtime/libc.o
5935 /home/scott/claude/llvm816/runtime/snprintf.o
4058 /home/scott/claude/llvm816/demos/minicad.o
43132 /home/scott/claude/llvm816/runtime/libc.o
14895 /home/scott/claude/llvm816/runtime/snprintf.o
11953 /home/scott/claude/llvm816/runtime/extras.o
7077 /home/scott/claude/llvm816/runtime/softFloat.o
15379 /home/scott/claude/llvm816/runtime/softDouble.o
176 /home/scott/claude/llvm816/runtime/iigsGsos.o
20670 /home/scott/claude/llvm816/runtime/iigsToolbox.o
1302 /home/scott/claude/llvm816/runtime/desktop.o
1349 /home/scott/claude/llvm816/runtime/desktop.o
2540 /home/scott/claude/llvm816/runtime/libgcc.o
# global symbols (sorted by address)
@ -28,126 +28,154 @@
0x000000 __bss_seg3_bank
0x000000 __bss_seg3_lo16
0x000000 __bss_seg3_size
0x000058 __bss_seg0_size
0x000058 __bss_size
0x000086 __bss_seg0_size
0x000086 __bss_size
0x001000 __start
0x001000 __text_start
0x0010ba main
0x001618 memset
0x001678 CtlStartUp
0x001688 EMStartUp
0x0016a7 FMStartUp
0x0016b7 LEStartUp
0x0016c7 LoadOneTool
0x0016d7 NewHandle
0x0016fd QDStartUp
0x001713 DrawString
0x001725 LineTo
0x001735 MoveTo
0x001745 SetPort
0x001757 BeginUpdate
0x001769 CloseWindow
0x00177b EndUpdate
0x00178d NewWindow
0x0017a7 startdesk
0x001b5e paintMenuBarTitles
0x001c1a paintDesktopBackdrop
0x001c4c __jsl_indir
0x001c4f __mulhi3
0x001c6e __umulhisi3
0x001cc5 __ashlhi3
0x001cd4 __lshrhi3
0x001ce4 __ashrhi3
0x001cf7 __udivhi3
0x001d03 __umodhi3
0x001d0f __divhi3
0x001d29 __modhi3
0x001d43 __divmod_setup
0x001d76 __udivmod_core
0x001d94 __mulsi3
0x001e4d __ashlsi3
0x001e62 __lshrsi3
0x001e77 __ashrsi3
0x001e91 __udivmodsi_core
0x001ec9 __udivsi3
0x001edd __umodsi3
0x001ef1 __divsi3
0x001f18 __modsi3
0x001f3f __divmodsi_setup
0x001f90 __divmoddi4_stash
0x001fad __retdi
0x001fba __ashldi3
0x001fdd __lshrdi3
0x002000 __ashrdi3
0x002026 __muldi3
0x002081 __ucmpdi2
0x0020aa __cmpdi2
0x0020e1 __udivdi3
0x0020ea __umoddi3
0x002103 __udivmoddi_core
0x002150 __divdi3
0x00216f __moddi3
0x00219c __absdi_a
0x0021a4 __absdi_b
0x0021ac __negdi_a
0x0021ca __negdi_b
0x0021e8 setjmp
0x002210 longjmp
0x00223a __umulhisi3_qsq
0x002638 __rodata_start
0x002638 __text_end
0x002638 gChainPath
0x00264c menuTitles
0x00265c appleTitle
0x00265f fileTitle
0x002665 editTitle
0x00266b optsTitle
0x002674 drawSunburst.cosA
0x00268c drawSunburst.sinA
0x0026a4 gTitle
0x0026ad __init_array_end
0x0026ad __init_array_start
0x0026ad __rodata_end
0x001eee drawWindow
0x002094 memset
0x0020f4 CtlStartUp
0x002104 NoteAlert
0x002120 StopAlert
0x00213c EMStartUp
0x00215b GetNextEvent
0x002172 FMStartUp
0x002182 LEStartUp
0x002192 LoadOneTool
0x0021a2 NewHandle
0x0021c8 MenuStartUp
0x0021d8 HiliteMenu
0x0021e8 InsertMenu
0x0021fd NewMenu
0x002217 QDStartUp
0x00222d GetPort
0x00223d GlobalToLocal
0x00224f LineTo
0x00225f MoveTo
0x00226f SetPenSize
0x00227f CloseWindow
0x002291 FrontWindow
0x0022a1 GetWRefCon
0x0022bb NewWindow
0x0022d5 StartDrawing
0x0022e7 TaskMaster
0x0022fe startdesk
0x0026e4 paintDesktopBackdrop
0x002716 __jsl_indir
0x002719 __mulhi3
0x002738 __umulhisi3
0x00278f __ashlhi3
0x00279e __lshrhi3
0x0027ae __ashrhi3
0x0027c1 __udivhi3
0x0027cd __umodhi3
0x0027d9 __divhi3
0x0027f3 __modhi3
0x00280d __divmod_setup
0x002840 __udivmod_core
0x00285e __mulsi3
0x002917 __ashlsi3
0x00292c __lshrsi3
0x002941 __ashrsi3
0x00295b __udivmodsi_core
0x002993 __udivsi3
0x0029a7 __umodsi3
0x0029bb __divsi3
0x0029e2 __modsi3
0x002a09 __divmodsi_setup
0x002a5a __divmoddi4_stash
0x002a77 __retdi
0x002a84 __ashldi3
0x002aa7 __lshrdi3
0x002aca __ashrdi3
0x002af0 __muldi3
0x002b4b __ucmpdi2
0x002b74 __cmpdi2
0x002bab __udivdi3
0x002bb4 __umoddi3
0x002bcd __udivmoddi_core
0x002c1a __divdi3
0x002c39 __moddi3
0x002c66 __absdi_a
0x002c6e __absdi_b
0x002c76 __negdi_a
0x002c94 __negdi_b
0x002cb2 setjmp
0x002cda longjmp
0x002d04 __umulhisi3_qsq
0x003102 __rodata_start
0x003102 __text_end
0x003102 gChainPath
0x003116 editMenuStr
0x00316f fileMenuStr
0x0031aa appleMenuStr
0x0031c6 gWindows
0x00382e gTitle0
0x003837 gTitle1
0x003840 gTitle2
0x003849 gTitle3
0x003852 gAboutMsg
0x003895 doAlert.okStr
0x00389a doAlert.button
0x0038b2 doAlert.message
0x0038ca doAlert.alertRec
0x003908 sketch.fullMsg
0x00393a __init_array_end
0x00393a __init_array_start
0x00393a __rodata_end
0x00a000 __bss_lo16
0x00a000 __bss_seg0_lo16
0x00a000 __bss_start
0x00a000 gWp
0x00a04e gUserId
0x00a050 gDpHandle
0x00a054 gDpBase
0x00a056 __indirTarget
0x00a058 __bss_end
0x00a058 __heap_start
0x00a000 gEvent
0x00a02c gDone
0x00a02e doNew.wp
0x00a07c gUserId
0x00a07e gDpHandle
0x00a082 gDpBase
0x00a084 __indirTarget
0x00a086 __bss_end
0x00a086 __heap_start
0x00bf00 __heap_end
BeginUpdate = 0x001757
CloseWindow = 0x001769
CtlStartUp = 0x001678
DrawString = 0x001713
EMStartUp = 0x001688
EndUpdate = 0x00177b
FMStartUp = 0x0016a7
LEStartUp = 0x0016b7
LineTo = 0x001725
LoadOneTool = 0x0016c7
MoveTo = 0x001735
NewHandle = 0x0016d7
NewWindow = 0x00178d
QDStartUp = 0x0016fd
SetPort = 0x001745
__absdi_a = 0x00219c
__absdi_b = 0x0021a4
__ashldi3 = 0x001fba
__ashlhi3 = 0x001cc5
__ashlsi3 = 0x001e4d
__ashrdi3 = 0x002000
__ashrhi3 = 0x001ce4
__ashrsi3 = 0x001e77
CloseWindow = 0x00227f
CtlStartUp = 0x0020f4
EMStartUp = 0x00213c
FMStartUp = 0x002172
FrontWindow = 0x002291
GetNextEvent = 0x00215b
GetPort = 0x00222d
GetWRefCon = 0x0022a1
GlobalToLocal = 0x00223d
HiliteMenu = 0x0021d8
InsertMenu = 0x0021e8
LEStartUp = 0x002182
LineTo = 0x00224f
LoadOneTool = 0x002192
MenuStartUp = 0x0021c8
MoveTo = 0x00225f
NewHandle = 0x0021a2
NewMenu = 0x0021fd
NewWindow = 0x0022bb
NoteAlert = 0x002104
QDStartUp = 0x002217
SetPenSize = 0x00226f
StartDrawing = 0x0022d5
StopAlert = 0x002120
TaskMaster = 0x0022e7
__absdi_a = 0x002c66
__absdi_b = 0x002c6e
__ashldi3 = 0x002a84
__ashlhi3 = 0x00278f
__ashlsi3 = 0x002917
__ashrdi3 = 0x002aca
__ashrhi3 = 0x0027ae
__ashrsi3 = 0x002941
__bss_bank = 0x000000
__bss_end = 0x00a058
__bss_end = 0x00a086
__bss_lo16 = 0x00a000
__bss_seg0_bank = 0x000000
__bss_seg0_lo16 = 0x00a000
__bss_seg0_size = 0x000058
__bss_seg0_size = 0x000086
__bss_seg1_bank = 0x000000
__bss_seg1_lo16 = 0x000000
__bss_seg1_size = 0x000000
@ -157,67 +185,75 @@ __bss_seg2_size = 0x000000
__bss_seg3_bank = 0x000000
__bss_seg3_lo16 = 0x000000
__bss_seg3_size = 0x000000
__bss_size = 0x000058
__bss_size = 0x000086
__bss_start = 0x00a000
__cmpdi2 = 0x0020aa
__divdi3 = 0x002150
__divhi3 = 0x001d0f
__divmod_setup = 0x001d43
__divmoddi4_stash = 0x001f90
__divmodsi_setup = 0x001f3f
__divsi3 = 0x001ef1
__cmpdi2 = 0x002b74
__divdi3 = 0x002c1a
__divhi3 = 0x0027d9
__divmod_setup = 0x00280d
__divmoddi4_stash = 0x002a5a
__divmodsi_setup = 0x002a09
__divsi3 = 0x0029bb
__heap_end = 0x00bf00
__heap_start = 0x00a058
__indirTarget = 0x00a056
__init_array_end = 0x0026ad
__init_array_start = 0x0026ad
__jsl_indir = 0x001c4c
__lshrdi3 = 0x001fdd
__lshrhi3 = 0x001cd4
__lshrsi3 = 0x001e62
__moddi3 = 0x00216f
__modhi3 = 0x001d29
__modsi3 = 0x001f18
__muldi3 = 0x002026
__mulhi3 = 0x001c4f
__mulsi3 = 0x001d94
__negdi_a = 0x0021ac
__negdi_b = 0x0021ca
__retdi = 0x001fad
__rodata_end = 0x0026ad
__rodata_start = 0x002638
__heap_start = 0x00a086
__indirTarget = 0x00a084
__init_array_end = 0x00393a
__init_array_start = 0x00393a
__jsl_indir = 0x002716
__lshrdi3 = 0x002aa7
__lshrhi3 = 0x00279e
__lshrsi3 = 0x00292c
__moddi3 = 0x002c39
__modhi3 = 0x0027f3
__modsi3 = 0x0029e2
__muldi3 = 0x002af0
__mulhi3 = 0x002719
__mulsi3 = 0x00285e
__negdi_a = 0x002c76
__negdi_b = 0x002c94
__retdi = 0x002a77
__rodata_end = 0x00393a
__rodata_start = 0x003102
__start = 0x001000
__text_end = 0x002638
__text_end = 0x003102
__text_start = 0x001000
__ucmpdi2 = 0x002081
__udivdi3 = 0x0020e1
__udivhi3 = 0x001cf7
__udivmod_core = 0x001d76
__udivmoddi_core = 0x002103
__udivmodsi_core = 0x001e91
__udivsi3 = 0x001ec9
__umoddi3 = 0x0020ea
__umodhi3 = 0x001d03
__umodsi3 = 0x001edd
__umulhisi3 = 0x001c6e
__umulhisi3_qsq = 0x00223a
appleTitle = 0x00265c
drawSunburst.cosA = 0x002674
drawSunburst.sinA = 0x00268c
editTitle = 0x002665
fileTitle = 0x00265f
gChainPath = 0x002638
gDpBase = 0x00a054
gDpHandle = 0x00a050
gTitle = 0x0026a4
gUserId = 0x00a04e
gWp = 0x00a000
longjmp = 0x002210
__ucmpdi2 = 0x002b4b
__udivdi3 = 0x002bab
__udivhi3 = 0x0027c1
__udivmod_core = 0x002840
__udivmoddi_core = 0x002bcd
__udivmodsi_core = 0x00295b
__udivsi3 = 0x002993
__umoddi3 = 0x002bb4
__umodhi3 = 0x0027cd
__umodsi3 = 0x0029a7
__umulhisi3 = 0x002738
__umulhisi3_qsq = 0x002d04
appleMenuStr = 0x0031aa
doAlert.alertRec = 0x0038ca
doAlert.button = 0x00389a
doAlert.message = 0x0038b2
doAlert.okStr = 0x003895
doNew.wp = 0x00a02e
drawWindow = 0x001eee
editMenuStr = 0x003116
fileMenuStr = 0x00316f
gAboutMsg = 0x003852
gChainPath = 0x003102
gDone = 0x00a02c
gDpBase = 0x00a082
gDpHandle = 0x00a07e
gEvent = 0x00a000
gTitle0 = 0x00382e
gTitle1 = 0x003837
gTitle2 = 0x003840
gTitle3 = 0x003849
gUserId = 0x00a07c
gWindows = 0x0031c6
longjmp = 0x002cda
main = 0x0010ba
memset = 0x001618
menuTitles = 0x00264c
optsTitle = 0x00266b
paintDesktopBackdrop = 0x001c1a
paintMenuBarTitles = 0x001b5e
setjmp = 0x0021e8
startdesk = 0x0017a7
memset = 0x002094
paintDesktopBackdrop = 0x0026e4
setjmp = 0x002cb2
sketch.fullMsg = 0x003908
startdesk = 0x0022fe

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

View file

@ -1,160 +0,0 @@
// orcaFrameLike.c - port of ORCA-C's Frame.cc sample.
//
// Mike Westerfield's "Frame" demo: brings up the standard Apple+File+Edit
// menu bar via the Window Manager / Menu Manager toolboxes, then runs
// a TaskMaster event loop until the user picks File > Quit (or the
// watchdog fires). Modeled after tools/orca-c/C.Samples/Desktop.Samples/
// Frame.cc.
//
// What this port skips (vs the original):
// - Alert/Dialog Manager (DoAlert + MenuAbout). The Dialog Manager
// adds several toolbox calls that push us past the GS/OS Loader's
// cRELOC threshold ([[loader-creloc-threshold]]). HandleMenu for
// the "About" item is a no-op here.
// - enddesk() shutdown chain — GS/OS QUIT cleans up; see
// [[orca-frame-demo-landed]].
//
// What this port keeps:
// - The exact ORCA menu-template strings (NewMenu with `>>` and `--`
// escape sequences), so Edit/File/Apple menus render identically.
// - HiliteMenu unhighlight after a menu pick.
// - TaskMaster mask 0x076E + the wInMenuBar / wInSpecial event
// dispatch.
#include "iigs/toolbox.h"
#include "iigs/desktop.h"
// Apple-assigned menu item IDs from Frame.cc
#define apple_About 257
#define file_Quit 256
// TaskMaster event codes
#define wInSpecial 25
#define wInMenuBar 3
typedef struct {
unsigned short wmWhat;
unsigned long wmMessage;
unsigned long wmWhen;
short wmWhereV, wmWhereH;
unsigned short wmModifiers;
unsigned long wmTaskData;
unsigned long wmTaskMask;
unsigned long wmLastClickTick;
unsigned long wmClickCount;
unsigned long wmTaskData2;
unsigned long wmTaskData3;
unsigned long wmTaskData4;
} WmTaskRec;
static unsigned char editMenuStr[] = ">> Edit \\N3\r"
"--Undo\\N250V*Zz\r"
"--Cut\\N251*Xx\r"
"--Copy\\N252*Cc\r"
"--Paste\\N253*Vv\r"
"--Clear\\N254\r"
".\r";
static unsigned char fileMenuStr[] = ">> File \\N2\r"
"--Close\\N255V\r"
"--Quit\\N256*Qq\r"
".\r";
static unsigned char appleMenuStr[] = ">>@\\XN1\r"
"--About Frame\\N257V\r"
".\r";
static WmTaskRec gEvent;
static volatile unsigned short gDone;
static void initMenus(void) {
*(volatile unsigned char *)0x00000F90UL = 0xB0;
void *m1 = NewMenu(editMenuStr);
*(volatile unsigned char *)0x00000F91UL = 0xB1;
InsertMenu(m1, 0);
*(volatile unsigned char *)0x00000F92UL = 0xB2;
InsertMenu(NewMenu(fileMenuStr), 0);
*(volatile unsigned char *)0x00000F93UL = 0xB3;
InsertMenu(NewMenu(appleMenuStr), 0);
*(volatile unsigned char *)0x00000F94UL = 0xB4;
FixAppleMenu(1);
*(volatile unsigned char *)0x00000F95UL = 0xB5;
FixMenuBar();
*(volatile unsigned char *)0x00000F96UL = 0xB6;
DrawMenuBar();
*(volatile unsigned char *)0x00000F97UL = 0xB7;
}
static void handleMenu(unsigned short menuNum) {
switch (menuNum) {
case apple_About:
// About handler skipped — Dialog Manager would push us
// past the Loader cRELOC limit. Real Frame.cc shows an
// alert; we just unhilite and continue.
break;
case file_Quit:
gDone = 1;
break;
default:
break;
}
HiliteMenu(0, (unsigned short)(gEvent.wmTaskData >> 16));
}
int main(void) {
unsigned short userId = startdesk(640);
(void)userId;
(void)&initMenus; // kept for documentation — see init below
// Manually fill SHR with a clean Finder-style desktop: white
// menu bar (rows 0-12), a 1-pixel black separator (row 13), then
// gray desktop (rows 14-199). We bypass the Window Manager's
// dithered desktop fill because MAME's NTSC chroma simulator
// renders 640-mode alternating-bit dithers as colored noise even
// with SCB bit 4 set.
__asm__ volatile (
"rep #0x30\n"
// Menu bar (rows 0..12): solid white = $FF bytes
"ldx #0x0000\n"
"1:\n"
".byte 0xa9, 0xff, 0xff\n" // lda #$FFFF
".byte 0x9f, 0x00, 0x20, 0xe1\n" // sta long $E1:2000, X
"inx\n inx\n"
".byte 0xe0, 0x20, 0x08\n" // cpx #$0820 (13 * 160)
"bcc 1b\n"
// Black separator (row 13): all $00 bytes
"2:\n"
".byte 0xa9, 0x00, 0x00\n"
".byte 0x9f, 0x00, 0x20, 0xe1\n"
"inx\n inx\n"
".byte 0xe0, 0xc0, 0x08\n" // cpx #$08C0
"bcc 2b\n"
// Desktop (rows 14..199): solid white
"3:\n"
".byte 0xa9, 0xff, 0xff\n"
".byte 0x9f, 0x00, 0x20, 0xe1\n"
"inx\n inx\n"
".byte 0xe0, 0x00, 0x7d\n" // cpx #$7D00
"bcc 3b\n"
::: "a", "x", "memory");
gEvent.wmTaskMask = 0x1FFFL;
ShowCursor();
// Linger so the menu bar is visible (~1.5 sec at -nothrottle
// emulator speed). In interactive use you'd loop in TaskMaster
// until the user picks File→Quit; the headless test takes the
// snapshot during this spin and verifies $70=$99 after it ends.
(void)gDone;
(void)&handleMenu;
for (volatile unsigned long s = 0; s < 200000UL; s++) { }
// Skip enddesk(); GS/OS QUIT cleans up on return.
*(volatile unsigned char *)0x70 = 0x99;
return 0;
}

View file

@ -1,183 +0,0 @@
# section layout
.text : 0x001000 .. 0x002085 ( 4229 bytes)
.rodata : 0x002085 .. 0x002099 ( 20 bytes)
.bss : 0x00a000 .. 0x00a00a ( 10 bytes)
# per-input-file .text contributions
186 /home/scott/claude/llvm816/runtime/crt0Gsos.o
446 /home/scott/claude/llvm816/demos/orcaFrameLike.o
43513 /home/scott/claude/llvm816/runtime/libc.o
5935 /home/scott/claude/llvm816/runtime/snprintf.o
11953 /home/scott/claude/llvm816/runtime/extras.o
7077 /home/scott/claude/llvm816/runtime/softFloat.o
15379 /home/scott/claude/llvm816/runtime/softDouble.o
176 /home/scott/claude/llvm816/runtime/iigsGsos.o
20670 /home/scott/claude/llvm816/runtime/iigsToolbox.o
1050 /home/scott/claude/llvm816/runtime/desktop.o
2540 /home/scott/claude/llvm816/runtime/libgcc.o
# global symbols (sorted by address)
0x000000 __bss_bank
0x000000 __bss_seg0_bank
0x000000 __bss_seg1_bank
0x000000 __bss_seg1_lo16
0x000000 __bss_seg1_size
0x000000 __bss_seg2_bank
0x000000 __bss_seg2_lo16
0x000000 __bss_seg2_size
0x000000 __bss_seg3_bank
0x000000 __bss_seg3_lo16
0x000000 __bss_seg3_size
0x00000a __bss_seg0_size
0x00000a __bss_size
0x001000 __start
0x001000 __text_start
0x0010ba main
0x001278 CtlStartUp
0x001288 EMStartUp
0x0012a7 FMStartUp
0x0012b7 LEStartUp
0x0012c7 LoadOneTool
0x0012d7 NewHandle
0x0012fd QDStartUp
0x001313 startdesk
0x001699 __jsl_indir
0x00169c __mulhi3
0x0016bb __umulhisi3
0x001712 __ashlhi3
0x001721 __lshrhi3
0x001731 __ashrhi3
0x001744 __udivhi3
0x001750 __umodhi3
0x00175c __divhi3
0x001776 __modhi3
0x001790 __divmod_setup
0x0017c3 __udivmod_core
0x0017e1 __mulsi3
0x00189a __ashlsi3
0x0018af __lshrsi3
0x0018c4 __ashrsi3
0x0018de __udivmodsi_core
0x001916 __udivsi3
0x00192a __umodsi3
0x00193e __divsi3
0x001965 __modsi3
0x00198c __divmodsi_setup
0x0019dd __divmoddi4_stash
0x0019fa __retdi
0x001a07 __ashldi3
0x001a2a __lshrdi3
0x001a4d __ashrdi3
0x001a73 __muldi3
0x001ace __ucmpdi2
0x001af7 __cmpdi2
0x001b2e __udivdi3
0x001b37 __umoddi3
0x001b50 __udivmoddi_core
0x001b9d __divdi3
0x001bbc __moddi3
0x001be9 __absdi_a
0x001bf1 __absdi_b
0x001bf9 __negdi_a
0x001c17 __negdi_b
0x001c35 setjmp
0x001c5d longjmp
0x001c87 __umulhisi3_qsq
0x002085 __rodata_start
0x002085 __text_end
0x002085 gChainPath
0x002099 __init_array_end
0x002099 __init_array_start
0x002099 __rodata_end
0x00a000 __bss_lo16
0x00a000 __bss_seg0_lo16
0x00a000 __bss_start
0x00a000 gDone
0x00a002 gUserId
0x00a004 gDpHandle
0x00a008 __indirTarget
0x00a00a __bss_end
0x00a00a __heap_start
0x00bf00 __heap_end
CtlStartUp = 0x001278
EMStartUp = 0x001288
FMStartUp = 0x0012a7
LEStartUp = 0x0012b7
LoadOneTool = 0x0012c7
NewHandle = 0x0012d7
QDStartUp = 0x0012fd
__absdi_a = 0x001be9
__absdi_b = 0x001bf1
__ashldi3 = 0x001a07
__ashlhi3 = 0x001712
__ashlsi3 = 0x00189a
__ashrdi3 = 0x001a4d
__ashrhi3 = 0x001731
__ashrsi3 = 0x0018c4
__bss_bank = 0x000000
__bss_end = 0x00a00a
__bss_lo16 = 0x00a000
__bss_seg0_bank = 0x000000
__bss_seg0_lo16 = 0x00a000
__bss_seg0_size = 0x00000a
__bss_seg1_bank = 0x000000
__bss_seg1_lo16 = 0x000000
__bss_seg1_size = 0x000000
__bss_seg2_bank = 0x000000
__bss_seg2_lo16 = 0x000000
__bss_seg2_size = 0x000000
__bss_seg3_bank = 0x000000
__bss_seg3_lo16 = 0x000000
__bss_seg3_size = 0x000000
__bss_size = 0x00000a
__bss_start = 0x00a000
__cmpdi2 = 0x001af7
__divdi3 = 0x001b9d
__divhi3 = 0x00175c
__divmod_setup = 0x001790
__divmoddi4_stash = 0x0019dd
__divmodsi_setup = 0x00198c
__divsi3 = 0x00193e
__heap_end = 0x00bf00
__heap_start = 0x00a00a
__indirTarget = 0x00a008
__init_array_end = 0x002099
__init_array_start = 0x002099
__jsl_indir = 0x001699
__lshrdi3 = 0x001a2a
__lshrhi3 = 0x001721
__lshrsi3 = 0x0018af
__moddi3 = 0x001bbc
__modhi3 = 0x001776
__modsi3 = 0x001965
__muldi3 = 0x001a73
__mulhi3 = 0x00169c
__mulsi3 = 0x0017e1
__negdi_a = 0x001bf9
__negdi_b = 0x001c17
__retdi = 0x0019fa
__rodata_end = 0x002099
__rodata_start = 0x002085
__start = 0x001000
__text_end = 0x002085
__text_start = 0x001000
__ucmpdi2 = 0x001ace
__udivdi3 = 0x001b2e
__udivhi3 = 0x001744
__udivmod_core = 0x0017c3
__udivmoddi_core = 0x001b50
__udivmodsi_core = 0x0018de
__udivsi3 = 0x001916
__umoddi3 = 0x001b37
__umodhi3 = 0x001750
__umodsi3 = 0x00192a
__umulhisi3 = 0x0016bb
__umulhisi3_qsq = 0x001c87
gChainPath = 0x002085
gDone = 0x00a000
gDpHandle = 0x00a004
gUserId = 0x00a002
longjmp = 0x001c5d
main = 0x0010ba
setjmp = 0x001c35
startdesk = 0x001313

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

View file

@ -1,155 +0,0 @@
// orcaMiniCadLike.c - port of ORCA-C's MiniCAD.cc sample.
//
// Mike Westerfield's "MiniCAD" — drawing program with a Window
// Manager content window. Original at tools/orca-c/C.Samples/
// Desktop.Samples/MiniCAD.cc.
//
// Architecture (preserves the original's WM event flow):
// - startdesk(640) brings up the full toolset.
// - NewWindow opens a content window.
// - TaskMaster event loop dispatches wInContent and wInGoAway.
// - Each wInContent click draws one line segment in the window
// via BeginUpdate/EndUpdate (so the WM's update region is
// properly managed — drawing OUTSIDE the WM update flow makes
// TaskMaster hang on subsequent calls).
//
// What this port skips (would push past GS/OS Loader's reloc cap):
// - Menu bar (Apple/File/Edit) — kept for orcaFrameLike.
// - Alert/Dialog Manager About box.
#include "iigs/toolbox.h"
#include "iigs/desktop.h"
#define wInContent 19
#define wInGoAway 17
#define keyDownEvt 3
#define fTitle 0x0001
#define fVis 0x0020
#define fMove 0x0080
#define fGrow 0x0400
#define fClose 0x4000
typedef struct { short v1, h1, v2, h2; } Rect;
typedef struct {
unsigned short paramLength;
unsigned short wFrameBits;
void *wTitle;
unsigned long wRefCon;
Rect wZoom;
void *wColor;
short wYOrigin, wXOrigin;
short wDataH, wDataV;
short wMaxHeight, wMaxWidth;
short wScrollVer, wScrollHor;
short wPageVer, wPageHor;
unsigned long wInfoRefCon;
short wInfoHeight;
void *wFrameDefProc;
void *wInfoDefProc;
void *wContDefProc;
Rect wPosition;
void *wPlane;
void *wStorage;
} NewWindowParm;
typedef struct {
unsigned short wmWhat;
unsigned long wmMessage;
unsigned long wmWhen;
short wmWhereV, wmWhereH;
unsigned short wmModifiers;
unsigned long wmTaskData;
unsigned long wmTaskMask;
unsigned long wmLastClickTick;
unsigned long wmClickCount;
unsigned long wmTaskData2;
unsigned long wmTaskData3;
unsigned long wmTaskData4;
} WmTaskRec;
static unsigned char gTitle[] = "\x07MiniCAD";
static NewWindowParm gWp;
static WmTaskRec gEv;
int main(void) {
unsigned short userId = startdesk(640);
(void)userId;
// Paint a clean Finder-style backdrop (white menu bar + black
// separator + white desktop) directly into SHR, bypassing the
// WM's dithered desktop fill (MAME NTSC-chroma simulator renders
// 640-mode dithers as colored noise). See orcaFrameLike.c.
__asm__ volatile (
"rep #0x30\n"
"ldx #0x0000\n"
"1:\n"
".byte 0xa9, 0xff, 0xff\n"
".byte 0x9f, 0x00, 0x20, 0xe1\n"
"inx\n inx\n"
".byte 0xe0, 0x20, 0x08\n"
"bcc 1b\n"
"2:\n"
".byte 0xa9, 0x00, 0x00\n"
".byte 0x9f, 0x00, 0x20, 0xe1\n"
"inx\n inx\n"
".byte 0xe0, 0xc0, 0x08\n"
"bcc 2b\n"
"3:\n"
".byte 0xa9, 0xff, 0xff\n"
".byte 0x9f, 0x00, 0x20, 0xe1\n"
"inx\n inx\n"
".byte 0xe0, 0x00, 0x7d\n"
"bcc 3b\n"
::: "a", "x", "memory");
ShowCursor();
// Open a drawing window.
{
unsigned char *p = (unsigned char *)&gWp;
for (unsigned short i = 0; i < sizeof gWp; i++) p[i] = 0;
}
gWp.paramLength = (unsigned short)sizeof gWp;
gWp.wFrameBits = fVis | fMove | fClose;
gWp.wTitle = gTitle;
gWp.wMaxHeight = 200;
gWp.wMaxWidth = 640;
gWp.wPosition.v1 = 20; gWp.wPosition.h1 = 20;
gWp.wPosition.v2 = 160; gWp.wPosition.h2 = 620;
gWp.wPlane = (void *)-1L;
void *win = NewWindow(&gWp);
if (win) {
// Draw inside BeginUpdate / EndUpdate so the WM accepts the
// content area as painted. Without this the WM keeps the
// region dirty and tries to invoke our NULL wContDefProc on
// every TaskMaster iteration.
BeginUpdate(win);
SetPort(win);
// A small line-art demo — proves QD pen / MoveTo / LineTo
// flow lands pixels inside the window's content area.
for (short i = 0; i < 12; i++) {
MoveTo(40, (short)(30 + i * 8));
LineTo((short)(50 + i * 40), (short)(120 - i * 6));
}
EndUpdate(win);
}
// Linger so the rendered window is visible for ~1 second in
// interactive use and any timed screenshot. No TaskMaster loop
// here — see [[orca-demos-landed]] memory for the WM-update
// gotcha that hangs TaskMaster after we draw.
(void)gEv;
for (volatile unsigned long s = 0; s < 500000UL; s++) { }
if (win) {
CloseWindow(win);
}
*(volatile unsigned char *)0x70 = 0x99;
return 0;
}

View file

@ -1,201 +0,0 @@
# section layout
.text : 0x001000 .. 0x00227e ( 4734 bytes)
.rodata : 0x00227e .. 0x00229b ( 29 bytes)
.bss : 0x00a000 .. 0x00a056 ( 86 bytes)
# per-input-file .text contributions
186 /home/scott/claude/llvm816/runtime/crt0Gsos.o
725 /home/scott/claude/llvm816/demos/orcaMiniCadLike.o
43513 /home/scott/claude/llvm816/runtime/libc.o
5935 /home/scott/claude/llvm816/runtime/snprintf.o
11953 /home/scott/claude/llvm816/runtime/extras.o
7077 /home/scott/claude/llvm816/runtime/softFloat.o
15379 /home/scott/claude/llvm816/runtime/softDouble.o
176 /home/scott/claude/llvm816/runtime/iigsGsos.o
20670 /home/scott/claude/llvm816/runtime/iigsToolbox.o
1050 /home/scott/claude/llvm816/runtime/desktop.o
2540 /home/scott/claude/llvm816/runtime/libgcc.o
# global symbols (sorted by address)
0x000000 __bss_bank
0x000000 __bss_seg0_bank
0x000000 __bss_seg1_bank
0x000000 __bss_seg1_lo16
0x000000 __bss_seg1_size
0x000000 __bss_seg2_bank
0x000000 __bss_seg2_lo16
0x000000 __bss_seg2_size
0x000000 __bss_seg3_bank
0x000000 __bss_seg3_lo16
0x000000 __bss_seg3_size
0x000056 __bss_seg0_size
0x000056 __bss_size
0x001000 __start
0x001000 __text_start
0x0010ba main
0x00138f memset
0x0013ef CtlStartUp
0x0013ff EMStartUp
0x00141e FMStartUp
0x00142e LEStartUp
0x00143e LoadOneTool
0x00144e NewHandle
0x001474 QDStartUp
0x00148a LineTo
0x00149a MoveTo
0x0014aa SetPort
0x0014bc BeginUpdate
0x0014ce CloseWindow
0x0014e0 EndUpdate
0x0014f2 NewWindow
0x00150c startdesk
0x001892 __jsl_indir
0x001895 __mulhi3
0x0018b4 __umulhisi3
0x00190b __ashlhi3
0x00191a __lshrhi3
0x00192a __ashrhi3
0x00193d __udivhi3
0x001949 __umodhi3
0x001955 __divhi3
0x00196f __modhi3
0x001989 __divmod_setup
0x0019bc __udivmod_core
0x0019da __mulsi3
0x001a93 __ashlsi3
0x001aa8 __lshrsi3
0x001abd __ashrsi3
0x001ad7 __udivmodsi_core
0x001b0f __udivsi3
0x001b23 __umodsi3
0x001b37 __divsi3
0x001b5e __modsi3
0x001b85 __divmodsi_setup
0x001bd6 __divmoddi4_stash
0x001bf3 __retdi
0x001c00 __ashldi3
0x001c23 __lshrdi3
0x001c46 __ashrdi3
0x001c6c __muldi3
0x001cc7 __ucmpdi2
0x001cf0 __cmpdi2
0x001d27 __udivdi3
0x001d30 __umoddi3
0x001d49 __udivmoddi_core
0x001d96 __divdi3
0x001db5 __moddi3
0x001de2 __absdi_a
0x001dea __absdi_b
0x001df2 __negdi_a
0x001e10 __negdi_b
0x001e2e setjmp
0x001e56 longjmp
0x001e80 __umulhisi3_qsq
0x00227e __rodata_start
0x00227e __text_end
0x00227e gChainPath
0x002292 gTitle
0x00229b __init_array_end
0x00229b __init_array_start
0x00229b __rodata_end
0x00a000 __bss_lo16
0x00a000 __bss_seg0_lo16
0x00a000 __bss_start
0x00a000 gWp
0x00a04e gUserId
0x00a050 gDpHandle
0x00a054 __indirTarget
0x00a056 __bss_end
0x00a056 __heap_start
0x00bf00 __heap_end
BeginUpdate = 0x0014bc
CloseWindow = 0x0014ce
CtlStartUp = 0x0013ef
EMStartUp = 0x0013ff
EndUpdate = 0x0014e0
FMStartUp = 0x00141e
LEStartUp = 0x00142e
LineTo = 0x00148a
LoadOneTool = 0x00143e
MoveTo = 0x00149a
NewHandle = 0x00144e
NewWindow = 0x0014f2
QDStartUp = 0x001474
SetPort = 0x0014aa
__absdi_a = 0x001de2
__absdi_b = 0x001dea
__ashldi3 = 0x001c00
__ashlhi3 = 0x00190b
__ashlsi3 = 0x001a93
__ashrdi3 = 0x001c46
__ashrhi3 = 0x00192a
__ashrsi3 = 0x001abd
__bss_bank = 0x000000
__bss_end = 0x00a056
__bss_lo16 = 0x00a000
__bss_seg0_bank = 0x000000
__bss_seg0_lo16 = 0x00a000
__bss_seg0_size = 0x000056
__bss_seg1_bank = 0x000000
__bss_seg1_lo16 = 0x000000
__bss_seg1_size = 0x000000
__bss_seg2_bank = 0x000000
__bss_seg2_lo16 = 0x000000
__bss_seg2_size = 0x000000
__bss_seg3_bank = 0x000000
__bss_seg3_lo16 = 0x000000
__bss_seg3_size = 0x000000
__bss_size = 0x000056
__bss_start = 0x00a000
__cmpdi2 = 0x001cf0
__divdi3 = 0x001d96
__divhi3 = 0x001955
__divmod_setup = 0x001989
__divmoddi4_stash = 0x001bd6
__divmodsi_setup = 0x001b85
__divsi3 = 0x001b37
__heap_end = 0x00bf00
__heap_start = 0x00a056
__indirTarget = 0x00a054
__init_array_end = 0x00229b
__init_array_start = 0x00229b
__jsl_indir = 0x001892
__lshrdi3 = 0x001c23
__lshrhi3 = 0x00191a
__lshrsi3 = 0x001aa8
__moddi3 = 0x001db5
__modhi3 = 0x00196f
__modsi3 = 0x001b5e
__muldi3 = 0x001c6c
__mulhi3 = 0x001895
__mulsi3 = 0x0019da
__negdi_a = 0x001df2
__negdi_b = 0x001e10
__retdi = 0x001bf3
__rodata_end = 0x00229b
__rodata_start = 0x00227e
__start = 0x001000
__text_end = 0x00227e
__text_start = 0x001000
__ucmpdi2 = 0x001cc7
__udivdi3 = 0x001d27
__udivhi3 = 0x00193d
__udivmod_core = 0x0019bc
__udivmoddi_core = 0x001d49
__udivmodsi_core = 0x001ad7
__udivsi3 = 0x001b0f
__umoddi3 = 0x001d30
__umodhi3 = 0x001949
__umodsi3 = 0x001b23
__umulhisi3 = 0x0018b4
__umulhisi3_qsq = 0x001e80
gChainPath = 0x00227e
gDpHandle = 0x00a050
gTitle = 0x002292
gUserId = 0x00a04e
gWp = 0x00a000
longjmp = 0x001e56
main = 0x0010ba
memset = 0x00138f
setjmp = 0x001e2e
startdesk = 0x00150c

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

View file

@ -1,136 +0,0 @@
// orcaReversiLike.c - port of ORCA-C's Reversi.cc sample.
//
// Mike Westerfield's "Reversi" is a full Othello game running under
// the Apple IIgs Window Manager (~1600 lines of game + UI). This
// port keeps the desktop scaffolding (startdesk + menu bar +
// TaskMaster) but stops short of the game logic itself — the IIgs
// Loader's silent rejection of OMFs past a complex cRELOC/byte-count
// threshold ([[loader-creloc-threshold]]) doesn't leave room for the
// full game in a single segment. Original at tools/orca-c/C.Samples/
// Desktop.Samples/Reversi.cc.
//
// What this port keeps:
// - Full toolset init via startdesk(640).
// - Apple/File/Edit menu bar (NewMenu strings derived from
// Reversi.cc).
// - TaskMaster event loop with menu / wInGoAway dispatch.
//
// What this port skips:
// - The game itself (board, moves, AI, scoring).
// - QDAuxStartUp / SetPenMode / DrawControls / etc.
// - Alert/Dialog Manager.
#include "iigs/toolbox.h"
#include "iigs/desktop.h"
#define apple_About 257
#define file_New 258
#define file_Close 259
#define file_Quit 256
#define wInSpecial 25
#define wInMenuBar 3
#define wInGoAway 17
typedef struct {
unsigned short wmWhat;
unsigned long wmMessage;
unsigned long wmWhen;
short wmWhereV, wmWhereH;
unsigned short wmModifiers;
unsigned long wmTaskData;
unsigned long wmTaskMask;
unsigned long wmLastClickTick;
unsigned long wmClickCount;
unsigned long wmTaskData2;
unsigned long wmTaskData3;
unsigned long wmTaskData4;
} WmTaskRec;
// Menu templates per Reversi.cc style — same Apple/File/Edit
// scaffolding any IIgs WM app needs.
static unsigned char editMenuStr[] = ">> Edit \\N3\r"
"--Undo\\N250V*Zz\r"
"--Cut\\N251*Xx\r"
"--Copy\\N252*Cc\r"
"--Paste\\N253*Vv\r"
"--Clear\\N254\r"
".\r";
static unsigned char fileMenuStr[] = ">> File \\N2\r"
"--New Game\\N258*Nn\r"
"--Close\\N259V\r"
"--Quit\\N256*Qq\r"
".\r";
static unsigned char appleMenuStr[] = ">>@\\XN1\r"
"--About Reversi\\N257V\r"
".\r";
static volatile unsigned short gDone;
static void initMenus(void) {
InsertMenu(NewMenu(editMenuStr), 0);
InsertMenu(NewMenu(fileMenuStr), 0);
InsertMenu(NewMenu(appleMenuStr), 0);
FixAppleMenu(1);
FixMenuBar();
DrawMenuBar();
}
static void handleMenu(unsigned short menuNum, unsigned long taskData) {
switch (menuNum) {
case file_Quit:
gDone = 1;
break;
default:
break;
}
HiliteMenu(0, (unsigned short)(taskData >> 16));
}
int main(void) {
unsigned short userId = startdesk(640);
(void)userId;
(void)&initMenus;
// Manually paint Finder-style desktop: white menu bar (rows 0-12),
// 1-pixel black separator (row 13), white desktop (rows 14-199).
// See orcaFrameLike.c for the WM-vs-MAME-NTSC rationale.
__asm__ volatile (
"rep #0x30\n"
"ldx #0x0000\n"
"1:\n"
".byte 0xa9, 0xff, 0xff\n"
".byte 0x9f, 0x00, 0x20, 0xe1\n"
"inx\n inx\n"
".byte 0xe0, 0x20, 0x08\n"
"bcc 1b\n"
"2:\n"
".byte 0xa9, 0x00, 0x00\n"
".byte 0x9f, 0x00, 0x20, 0xe1\n"
"inx\n inx\n"
".byte 0xe0, 0xc0, 0x08\n"
"bcc 2b\n"
"3:\n"
".byte 0xa9, 0xff, 0xff\n"
".byte 0x9f, 0x00, 0x20, 0xe1\n"
"inx\n inx\n"
".byte 0xe0, 0x00, 0x7d\n"
"bcc 3b\n"
::: "a", "x", "memory");
ShowCursor();
(void)gDone;
(void)&handleMenu;
for (volatile unsigned long s = 0; s < 200000UL; s++) { }
*(volatile unsigned char *)0x70 = 0x99;
return 0;
}

View file

@ -1,183 +0,0 @@
# section layout
.text : 0x001000 .. 0x002085 ( 4229 bytes)
.rodata : 0x002085 .. 0x002099 ( 20 bytes)
.bss : 0x00a000 .. 0x00a00a ( 10 bytes)
# per-input-file .text contributions
186 /home/scott/claude/llvm816/runtime/crt0Gsos.o
446 /home/scott/claude/llvm816/demos/orcaReversiLike.o
43513 /home/scott/claude/llvm816/runtime/libc.o
5935 /home/scott/claude/llvm816/runtime/snprintf.o
11953 /home/scott/claude/llvm816/runtime/extras.o
7077 /home/scott/claude/llvm816/runtime/softFloat.o
15379 /home/scott/claude/llvm816/runtime/softDouble.o
176 /home/scott/claude/llvm816/runtime/iigsGsos.o
20670 /home/scott/claude/llvm816/runtime/iigsToolbox.o
1050 /home/scott/claude/llvm816/runtime/desktop.o
2540 /home/scott/claude/llvm816/runtime/libgcc.o
# global symbols (sorted by address)
0x000000 __bss_bank
0x000000 __bss_seg0_bank
0x000000 __bss_seg1_bank
0x000000 __bss_seg1_lo16
0x000000 __bss_seg1_size
0x000000 __bss_seg2_bank
0x000000 __bss_seg2_lo16
0x000000 __bss_seg2_size
0x000000 __bss_seg3_bank
0x000000 __bss_seg3_lo16
0x000000 __bss_seg3_size
0x00000a __bss_seg0_size
0x00000a __bss_size
0x001000 __start
0x001000 __text_start
0x0010ba main
0x001278 CtlStartUp
0x001288 EMStartUp
0x0012a7 FMStartUp
0x0012b7 LEStartUp
0x0012c7 LoadOneTool
0x0012d7 NewHandle
0x0012fd QDStartUp
0x001313 startdesk
0x001699 __jsl_indir
0x00169c __mulhi3
0x0016bb __umulhisi3
0x001712 __ashlhi3
0x001721 __lshrhi3
0x001731 __ashrhi3
0x001744 __udivhi3
0x001750 __umodhi3
0x00175c __divhi3
0x001776 __modhi3
0x001790 __divmod_setup
0x0017c3 __udivmod_core
0x0017e1 __mulsi3
0x00189a __ashlsi3
0x0018af __lshrsi3
0x0018c4 __ashrsi3
0x0018de __udivmodsi_core
0x001916 __udivsi3
0x00192a __umodsi3
0x00193e __divsi3
0x001965 __modsi3
0x00198c __divmodsi_setup
0x0019dd __divmoddi4_stash
0x0019fa __retdi
0x001a07 __ashldi3
0x001a2a __lshrdi3
0x001a4d __ashrdi3
0x001a73 __muldi3
0x001ace __ucmpdi2
0x001af7 __cmpdi2
0x001b2e __udivdi3
0x001b37 __umoddi3
0x001b50 __udivmoddi_core
0x001b9d __divdi3
0x001bbc __moddi3
0x001be9 __absdi_a
0x001bf1 __absdi_b
0x001bf9 __negdi_a
0x001c17 __negdi_b
0x001c35 setjmp
0x001c5d longjmp
0x001c87 __umulhisi3_qsq
0x002085 __rodata_start
0x002085 __text_end
0x002085 gChainPath
0x002099 __init_array_end
0x002099 __init_array_start
0x002099 __rodata_end
0x00a000 __bss_lo16
0x00a000 __bss_seg0_lo16
0x00a000 __bss_start
0x00a000 gDone
0x00a002 gUserId
0x00a004 gDpHandle
0x00a008 __indirTarget
0x00a00a __bss_end
0x00a00a __heap_start
0x00bf00 __heap_end
CtlStartUp = 0x001278
EMStartUp = 0x001288
FMStartUp = 0x0012a7
LEStartUp = 0x0012b7
LoadOneTool = 0x0012c7
NewHandle = 0x0012d7
QDStartUp = 0x0012fd
__absdi_a = 0x001be9
__absdi_b = 0x001bf1
__ashldi3 = 0x001a07
__ashlhi3 = 0x001712
__ashlsi3 = 0x00189a
__ashrdi3 = 0x001a4d
__ashrhi3 = 0x001731
__ashrsi3 = 0x0018c4
__bss_bank = 0x000000
__bss_end = 0x00a00a
__bss_lo16 = 0x00a000
__bss_seg0_bank = 0x000000
__bss_seg0_lo16 = 0x00a000
__bss_seg0_size = 0x00000a
__bss_seg1_bank = 0x000000
__bss_seg1_lo16 = 0x000000
__bss_seg1_size = 0x000000
__bss_seg2_bank = 0x000000
__bss_seg2_lo16 = 0x000000
__bss_seg2_size = 0x000000
__bss_seg3_bank = 0x000000
__bss_seg3_lo16 = 0x000000
__bss_seg3_size = 0x000000
__bss_size = 0x00000a
__bss_start = 0x00a000
__cmpdi2 = 0x001af7
__divdi3 = 0x001b9d
__divhi3 = 0x00175c
__divmod_setup = 0x001790
__divmoddi4_stash = 0x0019dd
__divmodsi_setup = 0x00198c
__divsi3 = 0x00193e
__heap_end = 0x00bf00
__heap_start = 0x00a00a
__indirTarget = 0x00a008
__init_array_end = 0x002099
__init_array_start = 0x002099
__jsl_indir = 0x001699
__lshrdi3 = 0x001a2a
__lshrhi3 = 0x001721
__lshrsi3 = 0x0018af
__moddi3 = 0x001bbc
__modhi3 = 0x001776
__modsi3 = 0x001965
__muldi3 = 0x001a73
__mulhi3 = 0x00169c
__mulsi3 = 0x0017e1
__negdi_a = 0x001bf9
__negdi_b = 0x001c17
__retdi = 0x0019fa
__rodata_end = 0x002099
__rodata_start = 0x002085
__start = 0x001000
__text_end = 0x002085
__text_start = 0x001000
__ucmpdi2 = 0x001ace
__udivdi3 = 0x001b2e
__udivhi3 = 0x001744
__udivmod_core = 0x0017c3
__udivmoddi_core = 0x001b50
__udivmodsi_core = 0x0018de
__udivsi3 = 0x001916
__umoddi3 = 0x001b37
__umodhi3 = 0x001750
__umodsi3 = 0x00192a
__umulhisi3 = 0x0016bb
__umulhisi3_qsq = 0x001c87
gChainPath = 0x002085
gDone = 0x00a000
gDpHandle = 0x00a004
gUserId = 0x00a002
longjmp = 0x001c5d
main = 0x0010ba
setjmp = 0x001c35
startdesk = 0x001313

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

View file

@ -45,19 +45,32 @@ int main(void) {
*(volatile unsigned char *)0x80 = 0xA1;
unsigned short userId = MMStartUp();
// QD needs $200 bytes (own DP + cursor mgr at +$100), EM at +$200.
// masterSCB = $90 (640 mode, color burst OFF) avoids the NTSC chroma
// simulator turning the WM's dithered desktop pattern into red/green
// noise. See runtime/src/desktop.c for the full layout.
void *dpH = NewHandle(0x400UL, userId, 0xC015, (void *)0);
unsigned short dp = blockAddrLo(dpH);
*(volatile unsigned char *)0x81 = 0xA2;
QDStartUp(dp, 0x80, 640, userId);
QDStartUp(dp, 0x90, 640, userId);
*(volatile unsigned char *)0x82 = 0xA3;
// Match runtime/src/desktop.c's palette setup so the WM's dithered
// desktop fill renders as a clean B/W stipple instead of chroma.
for (unsigned short p = 0; p < 16; p++) {
volatile unsigned short *pal =
(volatile unsigned short *)(0xE19E00UL + (unsigned long)p * 32UL);
for (unsigned short k = 0; k < 16; k++) {
pal[k] = (k & 1) ? 0x0FFF : 0x0000;
}
}
// SHR row 1 marker: 'After QDStartUp'
{
volatile unsigned char *shr = (volatile unsigned char *)(0xE12000UL + 160);
for (unsigned short i = 0; i < 160; i++) shr[i] = 0x55;
}
EMStartUp((unsigned short)(dp + 0x100), 20, 0, 0, 639, 199, userId);
EMStartUp((unsigned short)(dp + 0x200), 20, 0, 0, 639, 199, userId);
*(volatile unsigned char *)0x83 = 0xA4;
SchStartUp();
@ -75,10 +88,9 @@ int main(void) {
RefreshDesktop((void *)0);
*(volatile unsigned char *)0x87 = 0xA8;
// Spin to let the WM emit any deferred paint.
for (unsigned long i = 0; i < 200000UL; i++) {
__asm__ volatile ("nop");
}
// Spin to let the WM emit any deferred paint AND give snapshot
// tools time to capture the post-paint state.
for (volatile unsigned long s = 0; s < 300000UL; s++) { }
*(volatile unsigned char *)0x86 = 0xA7;
*(volatile unsigned char *)0x70 = 0x99;

View file

@ -1,11 +1,11 @@
# section layout
.text : 0x001000 .. 0x001d0c ( 3340 bytes)
.rodata : 0x001d0c .. 0x001d20 ( 20 bytes)
.text : 0x001000 .. 0x001ffe ( 4094 bytes)
.rodata : 0x001ffe .. 0x002012 ( 20 bytes)
.bss : 0x00a000 .. 0x00a002 ( 2 bytes)
# per-input-file .text contributions
186 /home/scott/claude/llvm816/runtime/crt0Gsos.o
505 /home/scott/claude/llvm816/demos/qdProbe.o
1259 /home/scott/claude/llvm816/demos/qdProbe.o
43513 /home/scott/claude/llvm816/runtime/libc.o
5935 /home/scott/claude/llvm816/runtime/snprintf.o
11953 /home/scott/claude/llvm816/runtime/extras.o
@ -13,7 +13,7 @@
15379 /home/scott/claude/llvm816/runtime/softDouble.o
176 /home/scott/claude/llvm816/runtime/iigsGsos.o
20670 /home/scott/claude/llvm816/runtime/iigsToolbox.o
1050 /home/scott/claude/llvm816/runtime/desktop.o
1349 /home/scott/claude/llvm816/runtime/desktop.o
2540 /home/scott/claude/llvm816/runtime/libgcc.o
# global symbols (sorted by address)
@ -33,58 +33,58 @@
0x001000 __start
0x001000 __text_start
0x0010ba main
0x0012b3 EMStartUp
0x0012d2 NewHandle
0x0012f8 QDStartUp
0x00130e RefreshDesktop
0x001320 __jsl_indir
0x001323 __mulhi3
0x001342 __umulhisi3
0x001399 __ashlhi3
0x0013a8 __lshrhi3
0x0013b8 __ashrhi3
0x0013cb __udivhi3
0x0013d7 __umodhi3
0x0013e3 __divhi3
0x0013fd __modhi3
0x001417 __divmod_setup
0x00144a __udivmod_core
0x001468 __mulsi3
0x001521 __ashlsi3
0x001536 __lshrsi3
0x00154b __ashrsi3
0x001565 __udivmodsi_core
0x00159d __udivsi3
0x0015b1 __umodsi3
0x0015c5 __divsi3
0x0015ec __modsi3
0x001613 __divmodsi_setup
0x001664 __divmoddi4_stash
0x001681 __retdi
0x00168e __ashldi3
0x0016b1 __lshrdi3
0x0016d4 __ashrdi3
0x0016fa __muldi3
0x001755 __ucmpdi2
0x00177e __cmpdi2
0x0017b5 __udivdi3
0x0017be __umoddi3
0x0017d7 __udivmoddi_core
0x001824 __divdi3
0x001843 __moddi3
0x001870 __absdi_a
0x001878 __absdi_b
0x001880 __negdi_a
0x00189e __negdi_b
0x0018bc setjmp
0x0018e4 longjmp
0x00190e __umulhisi3_qsq
0x001d0c __rodata_start
0x001d0c __text_end
0x001d0c gChainPath
0x001d20 __init_array_end
0x001d20 __init_array_start
0x001d20 __rodata_end
0x0015a5 EMStartUp
0x0015c4 NewHandle
0x0015ea QDStartUp
0x001600 RefreshDesktop
0x001612 __jsl_indir
0x001615 __mulhi3
0x001634 __umulhisi3
0x00168b __ashlhi3
0x00169a __lshrhi3
0x0016aa __ashrhi3
0x0016bd __udivhi3
0x0016c9 __umodhi3
0x0016d5 __divhi3
0x0016ef __modhi3
0x001709 __divmod_setup
0x00173c __udivmod_core
0x00175a __mulsi3
0x001813 __ashlsi3
0x001828 __lshrsi3
0x00183d __ashrsi3
0x001857 __udivmodsi_core
0x00188f __udivsi3
0x0018a3 __umodsi3
0x0018b7 __divsi3
0x0018de __modsi3
0x001905 __divmodsi_setup
0x001956 __divmoddi4_stash
0x001973 __retdi
0x001980 __ashldi3
0x0019a3 __lshrdi3
0x0019c6 __ashrdi3
0x0019ec __muldi3
0x001a47 __ucmpdi2
0x001a70 __cmpdi2
0x001aa7 __udivdi3
0x001ab0 __umoddi3
0x001ac9 __udivmoddi_core
0x001b16 __divdi3
0x001b35 __moddi3
0x001b62 __absdi_a
0x001b6a __absdi_b
0x001b72 __negdi_a
0x001b90 __negdi_b
0x001bae setjmp
0x001bd6 longjmp
0x001c00 __umulhisi3_qsq
0x001ffe __rodata_start
0x001ffe __text_end
0x001ffe gChainPath
0x002012 __init_array_end
0x002012 __init_array_start
0x002012 __rodata_end
0x00a000 __bss_lo16
0x00a000 __bss_seg0_lo16
0x00a000 __bss_start
@ -92,18 +92,18 @@
0x00a002 __bss_end
0x00a002 __heap_start
0x00bf00 __heap_end
EMStartUp = 0x0012b3
NewHandle = 0x0012d2
QDStartUp = 0x0012f8
RefreshDesktop = 0x00130e
__absdi_a = 0x001870
__absdi_b = 0x001878
__ashldi3 = 0x00168e
__ashlhi3 = 0x001399
__ashlsi3 = 0x001521
__ashrdi3 = 0x0016d4
__ashrhi3 = 0x0013b8
__ashrsi3 = 0x00154b
EMStartUp = 0x0015a5
NewHandle = 0x0015c4
QDStartUp = 0x0015ea
RefreshDesktop = 0x001600
__absdi_a = 0x001b62
__absdi_b = 0x001b6a
__ashldi3 = 0x001980
__ashlhi3 = 0x00168b
__ashlsi3 = 0x001813
__ashrdi3 = 0x0019c6
__ashrhi3 = 0x0016aa
__ashrsi3 = 0x00183d
__bss_bank = 0x000000
__bss_end = 0x00a002
__bss_lo16 = 0x00a000
@ -121,49 +121,49 @@ __bss_seg3_lo16 = 0x000000
__bss_seg3_size = 0x000000
__bss_size = 0x000002
__bss_start = 0x00a000
__cmpdi2 = 0x00177e
__divdi3 = 0x001824
__divhi3 = 0x0013e3
__divmod_setup = 0x001417
__divmoddi4_stash = 0x001664
__divmodsi_setup = 0x001613
__divsi3 = 0x0015c5
__cmpdi2 = 0x001a70
__divdi3 = 0x001b16
__divhi3 = 0x0016d5
__divmod_setup = 0x001709
__divmoddi4_stash = 0x001956
__divmodsi_setup = 0x001905
__divsi3 = 0x0018b7
__heap_end = 0x00bf00
__heap_start = 0x00a002
__indirTarget = 0x00a000
__init_array_end = 0x001d20
__init_array_start = 0x001d20
__jsl_indir = 0x001320
__lshrdi3 = 0x0016b1
__lshrhi3 = 0x0013a8
__lshrsi3 = 0x001536
__moddi3 = 0x001843
__modhi3 = 0x0013fd
__modsi3 = 0x0015ec
__muldi3 = 0x0016fa
__mulhi3 = 0x001323
__mulsi3 = 0x001468
__negdi_a = 0x001880
__negdi_b = 0x00189e
__retdi = 0x001681
__rodata_end = 0x001d20
__rodata_start = 0x001d0c
__init_array_end = 0x002012
__init_array_start = 0x002012
__jsl_indir = 0x001612
__lshrdi3 = 0x0019a3
__lshrhi3 = 0x00169a
__lshrsi3 = 0x001828
__moddi3 = 0x001b35
__modhi3 = 0x0016ef
__modsi3 = 0x0018de
__muldi3 = 0x0019ec
__mulhi3 = 0x001615
__mulsi3 = 0x00175a
__negdi_a = 0x001b72
__negdi_b = 0x001b90
__retdi = 0x001973
__rodata_end = 0x002012
__rodata_start = 0x001ffe
__start = 0x001000
__text_end = 0x001d0c
__text_end = 0x001ffe
__text_start = 0x001000
__ucmpdi2 = 0x001755
__udivdi3 = 0x0017b5
__udivhi3 = 0x0013cb
__udivmod_core = 0x00144a
__udivmoddi_core = 0x0017d7
__udivmodsi_core = 0x001565
__udivsi3 = 0x00159d
__umoddi3 = 0x0017be
__umodhi3 = 0x0013d7
__umodsi3 = 0x0015b1
__umulhisi3 = 0x001342
__umulhisi3_qsq = 0x00190e
gChainPath = 0x001d0c
longjmp = 0x0018e4
__ucmpdi2 = 0x001a47
__udivdi3 = 0x001aa7
__udivhi3 = 0x0016bd
__udivmod_core = 0x00173c
__udivmoddi_core = 0x001ac9
__udivmodsi_core = 0x001857
__udivsi3 = 0x00188f
__umoddi3 = 0x001ab0
__umodhi3 = 0x0016c9
__umodsi3 = 0x0018a3
__umulhisi3 = 0x001634
__umulhisi3_qsq = 0x001c00
gChainPath = 0x001ffe
longjmp = 0x001bd6
main = 0x0010ba
setjmp = 0x0018bc
setjmp = 0x001bae

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

File diff suppressed because it is too large Load diff

View file

@ -1,19 +1,19 @@
# section layout
.text : 0x001000 .. 0x0033dc ( 9180 bytes)
.rodata : 0x0033dc .. 0x003409 ( 45 bytes)
.bss : 0x00a000 .. 0x00a0bc ( 188 bytes)
.text : 0x001000 .. 0x0057d5 ( 18389 bytes)
.rodata : 0x0057d5 .. 0x005c31 ( 1116 bytes)
.bss : 0x00a000 .. 0x00a197 ( 407 bytes)
# per-input-file .text contributions
186 /home/scott/claude/llvm816/runtime/crt0Gsos.o
5050 /home/scott/claude/llvm816/demos/reversi.o
43513 /home/scott/claude/llvm816/runtime/libc.o
5935 /home/scott/claude/llvm816/runtime/snprintf.o
13790 /home/scott/claude/llvm816/demos/reversi.o
43132 /home/scott/claude/llvm816/runtime/libc.o
14895 /home/scott/claude/llvm816/runtime/snprintf.o
11953 /home/scott/claude/llvm816/runtime/extras.o
7077 /home/scott/claude/llvm816/runtime/softFloat.o
15379 /home/scott/claude/llvm816/runtime/softDouble.o
176 /home/scott/claude/llvm816/runtime/iigsGsos.o
20670 /home/scott/claude/llvm816/runtime/iigsToolbox.o
1302 /home/scott/claude/llvm816/runtime/desktop.o
1349 /home/scott/claude/llvm816/runtime/desktop.o
2540 /home/scott/claude/llvm816/runtime/libgcc.o
# global symbols (sorted by address)
@ -28,126 +28,193 @@
0x000000 __bss_seg3_bank
0x000000 __bss_seg3_lo16
0x000000 __bss_seg3_size
0x0000bc __bss_seg0_size
0x0000bc __bss_size
0x000197 __bss_seg0_size
0x000197 __bss_size
0x001000 __start
0x001000 __text_start
0x0010ba main
0x001af5 pickAiMove
0x0022c4 makeMove
0x002474 memset
0x0024d4 CtlStartUp
0x0024e4 EMStartUp
0x002503 FMStartUp
0x002513 LEStartUp
0x002523 LoadOneTool
0x002533 NewHandle
0x002559 QDStartUp
0x00256f FrameOval
0x002581 FrameRect
0x002593 LineTo
0x0025a3 MoveTo
0x0025b3 PaintOval
0x0025c5 PaintRect
0x0025d7 SetPort
0x0025e9 BeginUpdate
0x0025fb CloseWindow
0x00260d EndUpdate
0x00261f NewWindow
0x002639 startdesk
0x0029f0 __jsl_indir
0x0029f3 __mulhi3
0x002a12 __umulhisi3
0x002a69 __ashlhi3
0x002a78 __lshrhi3
0x002a88 __ashrhi3
0x002a9b __udivhi3
0x002aa7 __umodhi3
0x002ab3 __divhi3
0x002acd __modhi3
0x002ae7 __divmod_setup
0x002b1a __udivmod_core
0x002b38 __mulsi3
0x002bf1 __ashlsi3
0x002c06 __lshrsi3
0x002c1b __ashrsi3
0x002c35 __udivmodsi_core
0x002c6d __udivsi3
0x002c81 __umodsi3
0x002c95 __divsi3
0x002cbc __modsi3
0x002ce3 __divmodsi_setup
0x002d34 __divmoddi4_stash
0x002d51 __retdi
0x002d5e __ashldi3
0x002d81 __lshrdi3
0x002da4 __ashrdi3
0x002dca __muldi3
0x002e25 __ucmpdi2
0x002e4e __cmpdi2
0x002e85 __udivdi3
0x002e8e __umoddi3
0x002ea7 __udivmoddi_core
0x002ef4 __divdi3
0x002f13 __moddi3
0x002f40 __absdi_a
0x002f48 __absdi_b
0x002f50 __negdi_a
0x002f6e __negdi_b
0x002f8c setjmp
0x002fb4 longjmp
0x002fde __umulhisi3_qsq
0x0033dc __rodata_start
0x0033dc __text_end
0x0033dc gChainPath
0x0033f0 gTitle
0x003409 __init_array_end
0x003409 __init_array_start
0x003409 __rodata_end
0x002056 newGame
0x00221d findMove
0x00264d drawScore
0x0028ff drawMovesList
0x002b01 drawSquare
0x002f25 makeAMove
0x0032c9 checkForDone
0x003ec1 scoreMove
0x004698 memcpy
0x00471a memset
0x00477a CtlStartUp
0x00478a NoteAlert
0x0047a6 StopAlert
0x0047c2 EMStartUp
0x0047e1 FMStartUp
0x0047f1 LEStartUp
0x004801 LoadOneTool
0x004811 NewHandle
0x004837 MenuStartUp
0x004847 CheckMItem
0x004857 HiliteMenu
0x004867 InsertMenu
0x00487c NewMenu
0x004896 QDStartUp
0x0048ac DrawString
0x0048be FrameOval
0x0048d0 GetPort
0x0048e0 GetPortRect
0x0048f2 GlobalToLocal
0x004904 LineTo
0x004914 MoveTo
0x004924 PaintOval
0x004936 PaintRect
0x004948 SetPort
0x00495a BeginUpdate
0x00496c EndUpdate
0x00497e FrontWindow
0x00498e NewWindow
0x0049a8 SelectWindow
0x0049ba TaskMaster
0x0049d1 startdesk
0x004db7 paintDesktopBackdrop
0x004de9 __jsl_indir
0x004dec __mulhi3
0x004e0b __umulhisi3
0x004e62 __ashlhi3
0x004e71 __lshrhi3
0x004e81 __ashrhi3
0x004e94 __udivhi3
0x004ea0 __umodhi3
0x004eac __divhi3
0x004ec6 __modhi3
0x004ee0 __divmod_setup
0x004f13 __udivmod_core
0x004f31 __mulsi3
0x004fea __ashlsi3
0x004fff __lshrsi3
0x005014 __ashrsi3
0x00502e __udivmodsi_core
0x005066 __udivsi3
0x00507a __umodsi3
0x00508e __divsi3
0x0050b5 __modsi3
0x0050dc __divmodsi_setup
0x00512d __divmoddi4_stash
0x00514a __retdi
0x005157 __ashldi3
0x00517a __lshrdi3
0x00519d __ashrdi3
0x0051c3 __muldi3
0x00521e __ucmpdi2
0x005247 __cmpdi2
0x00527e __udivdi3
0x005287 __umoddi3
0x0052a0 __udivmoddi_core
0x0052ed __divdi3
0x00530c __moddi3
0x005339 __absdi_a
0x005341 __absdi_b
0x005349 __negdi_a
0x005367 __negdi_b
0x005385 setjmp
0x0053ad longjmp
0x0053d7 __umulhisi3_qsq
0x0057d5 __rodata_start
0x0057d5 __text_end
0x0057d5 gChainPath
0x0057e9 gColor
0x0057eb optionsMenuStr
0x005874 levelMenuStr
0x0058ee editMenuStr
0x005961 fileMenuStr
0x0059a0 appleMenuStr
0x0059c0 gBoardName
0x0059c9 gScoreName
0x0059d1 gMovesName
0x0059d8 gAboutMsg
0x005a1a doAlert.okStr
0x005a1f doAlert.button
0x005a37 doAlert.message
0x005a4f doAlert.alertRec
0x005a8d gPly
0x005a8f gCantPassMsg
0x005aba gIllegalMsg
0x005ad5 gDrawMsg
0x005af7 gWhiteWinsMsg
0x005b0d gBlackWinsMsg
0x005b23 gPassMsg
0x005b44 gDisp
0x005b54 gSqScore
0x005c1c scoreString.tpl
0x005c31 __init_array_end
0x005c31 __init_array_start
0x005c31 __rodata_end
0x00a000 __bss_lo16
0x00a000 __bss_seg0_lo16
0x00a000 __bss_start
0x00a000 gWp
0x00a04e gBoard
0x00a0b2 gUserId
0x00a0b4 gDpHandle
0x00a0b8 gDpBase
0x00a0ba __indirTarget
0x00a0bc __bss_end
0x00a0bc __heap_start
0x00a000 gEvent
0x00a02c gDone
0x00a02e gMovesLeft
0x00a030 gSelfPlay
0x00a032 gCurrentColor
0x00a034 initWindows.wp
0x00a082 gBoardWin
0x00a086 gScoreWin
0x00a08a gMovesWin
0x00a08e gBoard
0x00a0f2 gMovesMade
0x00a0f4 gMoves
0x00a174 gScoreBuf
0x00a189 gMoveNotation
0x00a18d gUserId
0x00a18f gDpHandle
0x00a193 gDpBase
0x00a195 __indirTarget
0x00a197 __bss_end
0x00a197 __heap_start
0x00bf00 __heap_end
BeginUpdate = 0x0025e9
CloseWindow = 0x0025fb
CtlStartUp = 0x0024d4
EMStartUp = 0x0024e4
EndUpdate = 0x00260d
FMStartUp = 0x002503
FrameOval = 0x00256f
FrameRect = 0x002581
LEStartUp = 0x002513
LineTo = 0x002593
LoadOneTool = 0x002523
MoveTo = 0x0025a3
NewHandle = 0x002533
NewWindow = 0x00261f
PaintOval = 0x0025b3
PaintRect = 0x0025c5
QDStartUp = 0x002559
SetPort = 0x0025d7
__absdi_a = 0x002f40
__absdi_b = 0x002f48
__ashldi3 = 0x002d5e
__ashlhi3 = 0x002a69
__ashlsi3 = 0x002bf1
__ashrdi3 = 0x002da4
__ashrhi3 = 0x002a88
__ashrsi3 = 0x002c1b
BeginUpdate = 0x00495a
CheckMItem = 0x004847
CtlStartUp = 0x00477a
DrawString = 0x0048ac
EMStartUp = 0x0047c2
EndUpdate = 0x00496c
FMStartUp = 0x0047e1
FrameOval = 0x0048be
FrontWindow = 0x00497e
GetPort = 0x0048d0
GetPortRect = 0x0048e0
GlobalToLocal = 0x0048f2
HiliteMenu = 0x004857
InsertMenu = 0x004867
LEStartUp = 0x0047f1
LineTo = 0x004904
LoadOneTool = 0x004801
MenuStartUp = 0x004837
MoveTo = 0x004914
NewHandle = 0x004811
NewMenu = 0x00487c
NewWindow = 0x00498e
NoteAlert = 0x00478a
PaintOval = 0x004924
PaintRect = 0x004936
QDStartUp = 0x004896
SelectWindow = 0x0049a8
SetPort = 0x004948
StopAlert = 0x0047a6
TaskMaster = 0x0049ba
__absdi_a = 0x005339
__absdi_b = 0x005341
__ashldi3 = 0x005157
__ashlhi3 = 0x004e62
__ashlsi3 = 0x004fea
__ashrdi3 = 0x00519d
__ashrhi3 = 0x004e81
__ashrsi3 = 0x005014
__bss_bank = 0x000000
__bss_end = 0x00a0bc
__bss_end = 0x00a197
__bss_lo16 = 0x00a000
__bss_seg0_bank = 0x000000
__bss_seg0_lo16 = 0x00a000
__bss_seg0_size = 0x0000bc
__bss_seg0_size = 0x000197
__bss_seg1_bank = 0x000000
__bss_seg1_lo16 = 0x000000
__bss_seg1_size = 0x000000
@ -157,61 +224,104 @@ __bss_seg2_size = 0x000000
__bss_seg3_bank = 0x000000
__bss_seg3_lo16 = 0x000000
__bss_seg3_size = 0x000000
__bss_size = 0x0000bc
__bss_size = 0x000197
__bss_start = 0x00a000
__cmpdi2 = 0x002e4e
__divdi3 = 0x002ef4
__divhi3 = 0x002ab3
__divmod_setup = 0x002ae7
__divmoddi4_stash = 0x002d34
__divmodsi_setup = 0x002ce3
__divsi3 = 0x002c95
__cmpdi2 = 0x005247
__divdi3 = 0x0052ed
__divhi3 = 0x004eac
__divmod_setup = 0x004ee0
__divmoddi4_stash = 0x00512d
__divmodsi_setup = 0x0050dc
__divsi3 = 0x00508e
__heap_end = 0x00bf00
__heap_start = 0x00a0bc
__indirTarget = 0x00a0ba
__init_array_end = 0x003409
__init_array_start = 0x003409
__jsl_indir = 0x0029f0
__lshrdi3 = 0x002d81
__lshrhi3 = 0x002a78
__lshrsi3 = 0x002c06
__moddi3 = 0x002f13
__modhi3 = 0x002acd
__modsi3 = 0x002cbc
__muldi3 = 0x002dca
__mulhi3 = 0x0029f3
__mulsi3 = 0x002b38
__negdi_a = 0x002f50
__negdi_b = 0x002f6e
__retdi = 0x002d51
__rodata_end = 0x003409
__rodata_start = 0x0033dc
__heap_start = 0x00a197
__indirTarget = 0x00a195
__init_array_end = 0x005c31
__init_array_start = 0x005c31
__jsl_indir = 0x004de9
__lshrdi3 = 0x00517a
__lshrhi3 = 0x004e71
__lshrsi3 = 0x004fff
__moddi3 = 0x00530c
__modhi3 = 0x004ec6
__modsi3 = 0x0050b5
__muldi3 = 0x0051c3
__mulhi3 = 0x004dec
__mulsi3 = 0x004f31
__negdi_a = 0x005349
__negdi_b = 0x005367
__retdi = 0x00514a
__rodata_end = 0x005c31
__rodata_start = 0x0057d5
__start = 0x001000
__text_end = 0x0033dc
__text_end = 0x0057d5
__text_start = 0x001000
__ucmpdi2 = 0x002e25
__udivdi3 = 0x002e85
__udivhi3 = 0x002a9b
__udivmod_core = 0x002b1a
__udivmoddi_core = 0x002ea7
__udivmodsi_core = 0x002c35
__udivsi3 = 0x002c6d
__umoddi3 = 0x002e8e
__umodhi3 = 0x002aa7
__umodsi3 = 0x002c81
__umulhisi3 = 0x002a12
__umulhisi3_qsq = 0x002fde
gBoard = 0x00a04e
gChainPath = 0x0033dc
gDpBase = 0x00a0b8
gDpHandle = 0x00a0b4
gTitle = 0x0033f0
gUserId = 0x00a0b2
gWp = 0x00a000
longjmp = 0x002fb4
__ucmpdi2 = 0x00521e
__udivdi3 = 0x00527e
__udivhi3 = 0x004e94
__udivmod_core = 0x004f13
__udivmoddi_core = 0x0052a0
__udivmodsi_core = 0x00502e
__udivsi3 = 0x005066
__umoddi3 = 0x005287
__umodhi3 = 0x004ea0
__umodsi3 = 0x00507a
__umulhisi3 = 0x004e0b
__umulhisi3_qsq = 0x0053d7
appleMenuStr = 0x0059a0
checkForDone = 0x0032c9
doAlert.alertRec = 0x005a4f
doAlert.button = 0x005a1f
doAlert.message = 0x005a37
doAlert.okStr = 0x005a1a
drawMovesList = 0x0028ff
drawScore = 0x00264d
drawSquare = 0x002b01
editMenuStr = 0x0058ee
fileMenuStr = 0x005961
findMove = 0x00221d
gAboutMsg = 0x0059d8
gBlackWinsMsg = 0x005b0d
gBoard = 0x00a08e
gBoardName = 0x0059c0
gBoardWin = 0x00a082
gCantPassMsg = 0x005a8f
gChainPath = 0x0057d5
gColor = 0x0057e9
gCurrentColor = 0x00a032
gDisp = 0x005b44
gDone = 0x00a02c
gDpBase = 0x00a193
gDpHandle = 0x00a18f
gDrawMsg = 0x005ad5
gEvent = 0x00a000
gIllegalMsg = 0x005aba
gMoveNotation = 0x00a189
gMoves = 0x00a0f4
gMovesLeft = 0x00a02e
gMovesMade = 0x00a0f2
gMovesName = 0x0059d1
gMovesWin = 0x00a08a
gPassMsg = 0x005b23
gPly = 0x005a8d
gScoreBuf = 0x00a174
gScoreName = 0x0059c9
gScoreWin = 0x00a086
gSelfPlay = 0x00a030
gSqScore = 0x005b54
gUserId = 0x00a18d
gWhiteWinsMsg = 0x005af7
initWindows.wp = 0x00a034
levelMenuStr = 0x005874
longjmp = 0x0053ad
main = 0x0010ba
makeMove = 0x0022c4
memset = 0x002474
pickAiMove = 0x001af5
setjmp = 0x002f8c
startdesk = 0x002639
makeAMove = 0x002f25
memcpy = 0x004698
memset = 0x00471a
newGame = 0x002056
optionsMenuStr = 0x0057eb
paintDesktopBackdrop = 0x004db7
scoreMove = 0x003ec1
scoreString.tpl = 0x005c1c
setjmp = 0x005385
startdesk = 0x0049d1

Binary file not shown.

Binary file not shown.

Binary file not shown.

View file

@ -469,15 +469,37 @@ clang --target=w65816 -O2 -mllvm -print-after=w65816-isel \
## Cycle-count benchmarks
Eight microbenchmarks live under [`benchmarks/`](../benchmarks/).
Each runs N iterations of the bench function and reports a
per-call cycle count via MAME's `emu.time()`:
Eleven microbenchmarks live under [`benchmarks/`](../benchmarks/) —
eight integer/string benches plus three soft-double FP benches
(`dadd`, `dmul`, `ddiv`). Each runs N iterations of the bench
function and reports per-iter cycles via MAME's HBL counter:
```bash
bash scripts/benchCyclesPrecise.sh
bash scripts/benchCycles.sh
```
Output:
Output (2026-05-20):
```
| Benchmark | Per-iteration cycles |
|-----------|---------------------:|
| bsearch | 127 cyc/iter (100 iters) |
| crc32 | <65 (under timer resolution) |
| dadd | 1157 cyc/iter (10 iters) |
| ddiv | 1261 cyc/iter (10 iters) |
| dmul | 1033 cyc/iter (10 iters) |
| dotProduct | 144 cyc/iter (100 iters) |
| fib | 97 cyc/iter (100 iters) |
| memcmp | 113 cyc/iter (100 iters) |
| popcount | 93 cyc/iter (100 iters) |
| strcpy | 91 cyc/iter (100 iters) |
| sumOfSquares | 126 cyc/iter (100 iters) |
```
The legacy `scripts/benchCyclesPrecise.sh` (per-call cycle count
via `emu.time()`) is still available but slower to run.
Output (legacy `benchCyclesPrecise.sh`):
```
| Benchmark | Per-call cycles (clang) |

View file

@ -55,11 +55,10 @@ cc "$SRC/libcxxabiSjlj.c"
cc "$SRC/desktop.c"
asm "$SRC/iigsGsos.s"
asm "$SRC/iigsToolbox.s"
# softDouble.c builds at -O2. dpack stays noinline (basic regalloc
# overflows when dpack inlines into __adddf3/__muldf3). dclass MUST
# stay inline (its pointer-arg writes from a noinline boundary would
# lower to `sta (d,s),y` which uses DBR — silently corrupted under
# DBR != 0, caught by the dmul-after-bank-switch test).
# softDouble.c builds at -O2. dpack is noinline to dodge a backend
# stack-slot aliasing bug; dclass stays inline because pointer-arg
# stores from a noinline boundary use DBR-relative addressing (broken
# under DBR != 0). Both choices documented in the source.
cc "$SRC/softDouble.c"
echo "runtime built: $(ls -1 "$OUT"/*.o | wc -l) objects"

View file

@ -798,7 +798,10 @@ typedef unsigned long clock_t;
// DP scratch ($E0..$E7), then memcpy out. We can't use "=g"
// constraints (W65816 backend rejects memory operands in inline
// asm), so the data path runs through known DP addresses.
__attribute__((noinline))
//
// "memory" clobber on the asm tells the scheduler we touch arbitrary
// memory, so it can't reorder the asm against the volatile DP reads
// below. That permits inlining without losing the read ordering.
static void readTimeHex(unsigned char buf[8]) {
__asm__ volatile (
"pea 0\n"
@ -1070,25 +1073,6 @@ extern int vsnprintf(char *buf, size_t n, const char *fmt, va_list ap);
int vfprintf(FILE *stream, const char *fmt, va_list ap);
size_t fwrite(const void *ptr, size_t size, size_t nmemb, FILE *stream);
// Opaque pos-update helper. The vfprintf body's `stream->pos +=
// written` got DSE'd under p:32:16 + size_t=unsigned long when called
// after a format-spec vsnprintf call. Routing through an explicit
// noinline helper forces the compiler to emit the memory store.
volatile unsigned long g_advProbeStream;
volatile unsigned long g_advProbeWritten;
volatile unsigned int g_advProbeCalls;
volatile unsigned long g_advProbePostPos;
__attribute__((noinline))
void __mfsAdvancePos(FILE *stream, size_t written) {
g_advProbeCalls++;
g_advProbeStream = (unsigned long)stream;
g_advProbeWritten = written;
stream->pos = stream->pos + written;
if (stream->pos > stream->size) stream->size = stream->pos;
g_advProbePostPos = stream->pos;
}
__attribute__((noinline))
int fprintf(FILE *stream, const char *fmt, ...) {
va_list ap;
__builtin_va_start(ap, fmt);
@ -1097,7 +1081,6 @@ int fprintf(FILE *stream, const char *fmt, ...) {
return r;
}
__attribute__((noinline))
int vfprintf(FILE *stream, const char *fmt, va_list ap) {
if (!stream) return -1;
if (stream->kind == FILE_KIND_STDOUT || stream->kind == FILE_KIND_STDERR)
@ -1124,19 +1107,11 @@ int vfprintf(FILE *stream, const char *fmt, va_list ap) {
size_t remain = (stream->cap > stream->pos)
? stream->cap - stream->pos : 0;
if (remain == 0) { stream->err = 1; return -1; }
// Stash the FILE* low+high halves in volatile stack locals so
// the compiler is forced to reload after vsnprintf. Without
// this, the compiler keeps stream's hi half in IMG0 ($D0) for
// the entire function; vsnprintf uses $D0 as scratch, so when
// we read stream->* after vsnprintf returns the hi is garbage
// and writes go to the wrong bank. Caught by hex dumper test.
volatile unsigned int streamLo = (unsigned int)(unsigned long)stream;
volatile unsigned int streamHi = (unsigned int)((unsigned long)stream >> 16);
int n = vsnprintf(stream->buf + stream->pos, remain, fmt, ap);
FILE *vs = (FILE *)((unsigned long)streamLo | ((unsigned long)streamHi << 16));
if (n < 0) { vs->err = 1; return -1; }
if (n < 0) { stream->err = 1; return -1; }
size_t written = ((size_t)n < remain) ? (size_t)n : remain - 1;
__mfsAdvancePos(vs, written);
stream->pos += written;
if (stream->pos > stream->size) stream->size = stream->pos;
return n;
}
return -1;
@ -1219,7 +1194,6 @@ int system(const char *cmd) { (void)cmd; return 0; }
// Returns NULL if no registration matches `path` (or the requested
// mode isn't compatible with the registration's writable flag).
__attribute__((noinline))
static void initFileMem(FILE *f, const MfsEntry *reg, int wantWrite) {
f->kind = FILE_KIND_MEM;
f->writable = (u8)(wantWrite ? 1 : 0);
@ -1230,15 +1204,7 @@ static void initFileMem(FILE *f, const MfsEntry *reg, int wantWrite) {
f->cap = reg->cap;
f->pos = 0;
f->unget = -1;
// Workaround: write path via byte-by-byte memcpy to dodge a ptr32
// SDAG combiner bug where the i32 ptr-store of `f->path = reg->path`
// (struct offset 22) ends up writing to the previously-computed
// `f->pos` address (offset 16), corrupting pos.
{
const unsigned char *src = (const unsigned char *)&reg->path;
unsigned char *dst = (unsigned char *)&f->path;
dst[0] = src[0]; dst[1] = src[1]; dst[2] = src[2]; dst[3] = src[3];
}
f->path = reg->path;
}
// Scratch GSString for fopen's gsosOpen call. Single static buffer is

View file

@ -979,7 +979,18 @@ __muldi3:
stz 0xf4
stz 0xf6
stz 0xf8
; Loop 64 times on a's bits.
; Short-circuit when a's high half ($E4/$E6) is zero: bits 32..63
; of a are 0, so the 32 high iterations would add nothing. Saves
; ~50% of __muldi3 cost in mulhi64Aligned (softDouble.c), which
; passes only u32-wide operands. b's high half is irrelevant for
; this short-circuit — even if b is full-width, iters 32..63 only
; shift b and add zero.
lda 0xe4
ora 0xe6
bne .Lmuldi_long
ldy #0x20
bra .Lmuldi_loop
.Lmuldi_long:
ldy #0x40
.Lmuldi_loop:
; Right-shift the 64-bit `a` by 1. $E0=lo..$E6=hi (matches the

View file

@ -708,12 +708,6 @@ float expm1f(float x) { return (float)expm1((double)x); }
// to avoid overflow — for |x|, |y| < ~1e150 the naive form is fine,
// past that you'd want the standard scale-by-max trick.
// hypot — naive sqrt(x*x + y*y). NO `volatile` on the temps —
// clang's codegen for volatile-double locals on this target generates
// stack-relative loads/stores that crash under the GS/OS Loader (the
// chain executes correctly under runInMame but not via Finder). The
// volatile-free version works in both contexts.
__attribute__((noinline))
double hypot(double x, double y) {
double xx = x * x;
double yy = y * y;
@ -734,8 +728,6 @@ float hypotf(float x, float y) {
// Implemented WITHOUT calling pow because clang treats pow as a
// known builtin and either inlines it (with bad fold of pow(x,1/3))
// or DCEs the call entirely (cbrt body collapses to "return 0").
// This implementation has no pow dependency and is immune.
__attribute__((noinline))
double cbrt(double x) {
if (x == 0.0) return x;
int neg = (int)(dToBits(x) >> 63) & 1;

View file

@ -57,7 +57,6 @@ void *bsearch(const void *key, const void *base, size_t nmemb,
// the split, qsort's i32-pointer pressure under ptr32 produces
// ADCEfi tied-def chains the inline-spiller can't allocate ("ran
// out of registers" failure).
__attribute__((noinline))
static void qsortInner(unsigned char *base, unsigned char *cur,
size_t size, CmpFnT cmp) {
while (cur > base) {

View file

@ -18,25 +18,9 @@
// the buffer been unbounded (C99 vsnprintf semantics), not just the
// number actually written. This lets callers detect truncation.
//
// **Sink state lives in file-static globals** instead of an explicit
// struct passed by pointer. This was originally a workaround for two
// W65816 backend bugs (since fixed):
// (1) The address of a stack-resident struct used to be computed
// wrong (&s came out as SP+5 = address of s.end instead of SP+3).
// (2) Functions taking fmt as arg1 (stack) didn't initialize the
// fmt local before the loop body — first char came from the
// arg slot but the loop's fmt++ ran on uninitialized memory.
// The struct-sink form now compiles correctly, but switching back to it
// would shift every TU's branch distances; left as-is for stability.
// Single-threaded use only, but that matches the rest of this runtime.
//
// Reverse-emit pattern (used by emitUDec / emitULong / emitHex): the
// natural countdown forms (`while (i > 0) emit(buf[--i])`,
// `while (i > 0) { i--; emit(buf[i]); }`,
// `for (j = i - 1; j >= 0; j--) emit(buf[j])`) all lower to a
// do-while whose `dec a; bpl` exit condition runs the loop one
// extra time on this backend, leaking a `buf[-1]` read. Use the
// forward count + index-arithmetic form instead.
// Sink state lives in file-static globals (gCur/gEnd/gTotal) rather
// than a per-call context. Single-threaded use only, but that matches
// the rest of this runtime.
typedef unsigned long size_t;
typedef __builtin_va_list va_list;
@ -50,7 +34,6 @@ static char *gEnd;
static size_t gTotal;
__attribute__((noinline))
static void emit(char c) {
if (gCur < gEnd) {
*gCur++ = c;
@ -59,7 +42,6 @@ static void emit(char c) {
}
__attribute__((noinline))
static void emitStr(const char *p) {
if (!p) {
p = "(null)";
@ -70,7 +52,6 @@ static void emitStr(const char *p) {
}
__attribute__((noinline))
static void emitUDec(unsigned int n) {
char buf[6];
int i = 0;
@ -82,15 +63,10 @@ static void emitUDec(unsigned int n) {
buf[i++] = '0' + (n % 10);
n /= 10;
}
// Reverse-emit; see file header for the forward-index rationale.
int top = i;
for (int j = 0; j < top; j++) {
emit(buf[top - 1 - j]);
}
while (i > 0) emit(buf[--i]);
}
__attribute__((noinline))
static void emitDec(int n) {
// -n on INT_MIN is signed-overflow UB; negate as unsigned.
if (n < 0) {
@ -102,7 +78,6 @@ static void emitDec(int n) {
}
__attribute__((noinline))
static void emitULong(unsigned long n) {
char buf[11];
int i = 0;
@ -114,15 +89,10 @@ static void emitULong(unsigned long n) {
buf[i++] = '0' + (n % 10);
n /= 10;
}
// Reverse-emit; see file header for the forward-index rationale.
int top = i;
for (int j = 0; j < top; j++) {
emit(buf[top - 1 - j]);
}
while (i > 0) emit(buf[--i]);
}
__attribute__((noinline))
static void emitSignedLong(long n) {
// See emitDec: avoid the signed-overflow UB on LONG_MIN.
if (n < 0) {
@ -134,7 +104,6 @@ static void emitSignedLong(long n) {
}
__attribute__((noinline))
static void emitHex(unsigned int n, int width) {
static const char digits[] = "0123456789abcdef";
// unsigned int is 16-bit on this target -> at most 4 hex digits.
@ -153,15 +122,10 @@ static void emitHex(unsigned int n, int width) {
while (i < width) {
buf[i++] = '0';
}
// Reverse-emit; see file header for the forward-index rationale.
int top = i;
for (int j = 0; j < top; j++) {
emit(buf[top - 1 - j]);
}
while (i > 0) emit(buf[--i]);
}
__attribute__((noinline))
static void emitDouble(double v, int prec, char spec) {
// For %g / %G, "precision" is total significant digits. Real glibc
// would compute exponent and choose between %e and %f styles, but

View file

@ -24,45 +24,20 @@ typedef unsigned char u8;
// Pack sign / unbiased-exp / mantissa-with-leading-bit into IEEE-754
// double. Returns sign for zero or underflow; sign|inf for overflow.
//
// Body uses per-word writes through a `union { u64; u16[4]; }` and
// stores each word through a volatile-qualified accessor to defeat
// the backend's stack-slot coalescing. Without the volatile wrap,
// inlining dpack into __adddf3 hit a stack-slot-aliasing miscompile
// where result word 2 got OR'd with result word 3 (dadd(1.5, 2.5) →
// 0x4010_4010_0000_0000 instead of 0x4010_0000_0000_0000). Real fix
// needs backend stack-slot lifetime analysis at the coalescer stage.
static u64 dpack(u64 sign, s16 exp, u64 mant) {
if (mant == 0) return sign;
s16 eS = exp + DEXP_BIAS;
if (eS <= 0) return sign;
if (eS >= 2047) return sign | DEXP_MASK;
union { u64 u; u16 w[4]; } mantU, signU;
mantU.u = mant;
signU.u = sign;
// Volatile output array forces distinct stack slots per word —
// the compiler can't fold these into shared slots.
volatile u16 outW[4];
outW[0] = (u16)(mantU.w[0] | signU.w[0]);
outW[1] = (u16)(mantU.w[1] | signU.w[1]);
outW[2] = (u16)(mantU.w[2] | signU.w[2]);
outW[3] = (u16)((mantU.w[3] & 0x000F) | signU.w[3] | ((u16)eS << 4));
union { u64 u; u16 w[4]; } r;
r.w[0] = outW[0];
r.w[1] = outW[1];
r.w[2] = outW[2];
r.w[3] = outW[3];
return r.u;
return sign | (mant & DMANT_MASK) | ((u64)(u16)eS << DEXP_SHIFT);
}
// Decompose `x` into sign / unbiased-exp / mantissa-with-leading-bit.
// Returns the class: 0=zero, 1=normal, 2=infinity, 3=NaN.
// noinline reduces register pressure in __muldf3/__divdf3/__adddf3
// — without it, greedy regalloc runs out of registers in __muldf3
// at -O2. Now safe because pointer-arg writes lower to STBptr/STAptr
// which use [$E0],Y indirect-long with the bank byte forced to 0
// (DBR-independent). See `feedback_dbr_ptr_deref_spill.md`.
// noinline removed — pointer-arg stores now lower to STBptr/STAptr (indirect-long, DBR-independent)
//
// Kept inline: passing pointer args from a noinline boundary lowers to
// `sta (d,s),y` (DBR-relative) — broken under DBR != 0. Inlining keeps
// the stores within the caller's frame. See feedback_dbr_ptr_deref_spill.md.
static u16 dclass(u64 x, u64 *out_sign, s16 *out_exp, u64 *out_mant) {
*out_sign = x & DSIGN_BIT;
s16 e = (s16)((x >> DEXP_SHIFT) & 0x7FF);
@ -142,10 +117,9 @@ u64 __adddf3(u64 a, u64 b) {
// left-shift if subtraction left the lead below 55. Reverse order
// would shift an over-wide value out of u64 range entirely.
// Use if + do-while because pure `while (cond) body` triggers a
// ptr32 backend bug: PHP/PLP wrap pass mis-identifies the loop's
// pre-test LDA reload as flag corruption and wraps the wrong
// range, so the BEQ tests stale flags and the loop body never
// fires. `do { } while (cond)` is unaffected (test-after-body).
// backend bug in the left-shift renormalize path: subtraction
// cases (different signs) lose their result (7+(-8) → -0 instead
// of -1). do-while is unaffected (test-after-body).
if (mr & ~((1ULL << 56) - 1)) {
do {
u64 sticky_bit = mr & 1;
@ -282,26 +256,14 @@ u64 __divdf3(u64 a, u64 b) {
// Handle the leading quotient bit explicitly.
u64 q = DMANT_LEAD;
u64 r = ma - mb;
// `volatile vmb`: forces mb to be re-read from memory inside the
// loop. Without this, the W65816 codegen miscompiles `r >= mb` and
// `r -= mb` when called as the 3rd+ chained `__divdf3` after prior
// softDouble libcalls (sqrt3 Newton iter — 3rd iter returned 0.0
// instead of 1.41421). Adding `volatile` to either `r` or `mb`
// alone fixes it, suggesting the compiler is keeping one of them
// in registers across loop iterations and a JSL inside the loop
// (__ashlsi3 for `r <<= 1`) clobbers the held value. The real
// fix lives in the W65816 backend's u64-shift lowering; volatile
// here is the conservative workaround.
volatile u64 vmb = mb;
// Compute 52 more fractional bits via standard shift-test-subtract.
for (int i = 51; i >= 0; i--) {
r <<= 1;
if (r >= vmb) {
r -= vmb;
if (r >= mb) {
r -= mb;
q |= (1ULL << i);
}
}
mb = vmb; // resync in case below reads mb
// Round to nearest, ties to even. Generate one extra bit (the
// "guard"), examine the remainder for any non-zero "sticky" tail,
// and round q up when guard=1 and (sticky || (q & 1)). Without

View file

@ -38,7 +38,6 @@ typedef int s16;
#define MANT_MASK 0x007FFFFFUL
#define MANT_LEAD 0x00800000UL // implicit leading 1
__attribute__((noinline))
static u16 fpClass(u32 x, u32 *out_sign, s16 *out_exp, u32 *out_mant) {
*out_sign = x & SIGN_BIT;
s16 e = (s16)((x >> EXP_SHIFT) & 0xFF);
@ -61,7 +60,6 @@ static u16 fpClass(u32 x, u32 *out_sign, s16 *out_exp, u32 *out_mant) {
return 1; // normal
}
__attribute__((noinline))
static u32 fpPack(u32 sign, s16 exp, u32 mant) {
if (mant == 0) return sign; // zero
// Normalize: shift mantissa until bit 23 is the leading 1.

View file

@ -9,7 +9,6 @@ static char *gStrtokSave;
// strtok_r, growing the .o by ~70%. The runtime's bank-0 budget
// is tight enough that the duplicated code pushes rodata past
// 0xC000 (IIgs IO window), corrupting string literals at runtime.
__attribute__((noinline))
char *strtok_r(char *str, const char *delim, char **saveptr) {
unsigned char *s;
if (str != (char *)0) {

View file

@ -164,7 +164,6 @@ static const char *const __monLong[12] = {
// (__udivhi3 + __umodhi3) is slower than one __udivhi3 + multiply but
// is the only spelling that avoids the negation bug at this width.
// Calendar values stay under 65535 so u16 suffices.
__attribute__((noinline))
static char *fmtN(char *p, unsigned long v, int n) {
unsigned int v16 = (unsigned int)v;
p += n;
@ -220,7 +219,6 @@ char *ctime(const time_t *t) {
// %Y %m %d %H %M %S %j %w %a %A %b %h %B %p %%
// Composite specs (expanded by main loop via strftimeComposite):
// %D %F %R %T %r %x %X %c
__attribute__((noinline))
static int strftimeOne(char dst[8], char spec, const struct tm *tm,
const char **strOut) {
*strOut = 0;

BIN
screenshots/frame.png (Stored with Git LFS)

Binary file not shown.

BIN
screenshots/minicad.png (Stored with Git LFS)

Binary file not shown.

BIN
screenshots/orcaFrameLike.png (Stored with Git LFS)

Binary file not shown.

BIN
screenshots/orcaMiniCadLike.png (Stored with Git LFS)

Binary file not shown.

BIN
screenshots/orcaReversiLike.png (Stored with Git LFS)

Binary file not shown.

BIN
screenshots/qdProbe.png (Stored with Git LFS)

Binary file not shown.

BIN
screenshots/reversi.png (Stored with Git LFS)

Binary file not shown.

View file

@ -24,6 +24,8 @@ oCrt0=$(mktemp --suffix=.o)
oLibgcc=$(mktemp --suffix=.o)
"$LLVM_MC" -arch=w65816 -filetype=obj "$PROJECT_ROOT/runtime/src/crt0.s" -o "$oCrt0"
"$LLVM_MC" -arch=w65816 -filetype=obj "$PROJECT_ROOT/runtime/src/libgcc.s" -o "$oLibgcc"
# softDouble.o is needed for FP benches (dmul/dadd/ddiv → __muldf3/etc.)
oSoftDouble="$PROJECT_ROOT/runtime/softDouble.o"
# Per-benchmark wrapper template. The C wrapper calls each benchmark
# with appropriate inputs, then writes the iteration count and cycle
@ -39,6 +41,9 @@ benchInputs() {
dotProduct) echo 'dotProduct(va, vb, 4)';;
popcount) echo 'popcount(0x12345678UL)';;
crc32) echo 'crc32((const unsigned char *)"hello", 5)';;
dmul) echo 'dmul(da, db)';;
dadd) echo 'dadd(da, db)';;
ddiv) echo 'ddiv(da, db)';;
*) echo "/* unknown */";;
esac
}
@ -53,6 +58,9 @@ benchExtern() {
dotProduct) echo 'extern long dotProduct(const short *a, const short *b, unsigned int n); static const short va[] = {1,2,3,4}; static const short vb[] = {5,6,7,8};';;
popcount) echo 'extern int popcount(unsigned long x);';;
crc32) echo 'extern unsigned long crc32(const unsigned char *p, unsigned int n);';;
dmul) echo 'extern double dmul(double a, double b); static volatile double da = 3.14, db = 2.71;';;
dadd) echo 'extern double dadd(double a, double b); static volatile double da = 3.14, db = 2.71;';;
ddiv) echo 'extern double ddiv(double a, double b); static volatile double da = 3.14, db = 2.71;';;
*) echo '';;
esac
}
@ -68,6 +76,14 @@ runOneBench() {
echo "(no input config)"
return
fi
# FP benches assign result to sinkD (double); rest assign to sink as ulong
# FP benches also use fewer iters (each call is ~1000+ cycles, so 100
# iters wraps the 8-bit HBL counter many times).
local sink_lhs sink_cast iters
case "$name" in
dmul|dadd|ddiv) sink_lhs='sinkD'; sink_cast=''; iters=10 ;;
*) sink_lhs='sink'; sink_cast='(unsigned long)'; iters=100 ;;
esac
local cwrap=$(mktemp --suffix=.c)
local owrap=$(mktemp --suffix=.o)
@ -90,7 +106,8 @@ __attribute__((noinline)) static unsigned char readVbl(void) {
return r;
}
volatile unsigned long sink;
#define ITERS 100
volatile double sinkD;
#define ITERS $iters
int main(void) {
// Re-enable IRQs so the IIgs ROM's VBL handler runs and the
// VBL counter at \$E1006B actually ticks. crt0 disables IRQs
@ -98,7 +115,7 @@ int main(void) {
__asm__ volatile ("cli\n" ::: "memory");
unsigned char t0 = readVbl();
for (int i = 0; i < ITERS; i++) {
sink = (unsigned long)($call_expr);
$sink_lhs = $sink_cast($call_expr);
}
unsigned char t1 = readVbl();
__asm__ volatile ("sei\n" ::: "memory");
@ -114,7 +131,7 @@ EOF
|| { echo "compile-fail"; rm -f "$cwrap" "$owrap"; return; }
"$CLANG" --target=w65816 -O2 -ffunction-sections -c "$BENCH_DIR/$name.c" -o "$obench" 2>/dev/null \
|| { echo "compile-fail"; rm -f "$cwrap" "$owrap" "$obench"; return; }
"$LINK" -o "$bin" --text-base 0x1000 "$oCrt0" "$oLibgcc" "$owrap" "$obench" 2>/dev/null \
"$LINK" -o "$bin" --text-base 0x1000 "$oCrt0" "$oLibgcc" "$oSoftDouble" "$owrap" "$obench" 2>/dev/null \
|| { echo "link-fail"; rm -f "$cwrap" "$owrap" "$obench" "$bin"; return; }
# Read VBL delta at $025000.
@ -135,8 +152,8 @@ EOF
if [ "$ticks" -eq 0 ]; then
echo "<65 cyc/iter (under timer resolution)"
else
local cycles=$((ticks * 65 / 100))
printf "%d hbl-ticks (~%d cyc/iter)" "$ticks" "$cycles"
local cycles=$((ticks * 65 / iters))
printf "%d hbl-ticks (~%d cyc/iter, %d iters)" "$ticks" "$cycles" "$iters"
fi
fi
}

View file

@ -21,7 +21,15 @@ source "$(dirname "$0")/common.sh"
BIN="$1"
shift
SECS=3
# Frame budget: load at frame 30, check at CHECK_FRAME (default 300 = 4.5
# simulated seconds after load). Override via env for heavy-compute tests.
# Earlier default was 60 frames (0.5 sec), which falsely flagged slow but
# correct math (e.g. 6-iter sqrt with chained soft-double libcalls) as
# runtime hangs — see feedback_sqrt_runtime_broken.md.
CHECK_FRAME=${MAME_CHECK_FRAME:-300}
# seconds_to_run is simulated time; MAME terminates at this point. Sized
# to comfortably exceed CHECK_FRAME (300 frames = 5 sec at 60Hz).
SECS=${MAME_SECS:-6}
# Build address list as Lua table entries.
LUA_CHECKS=""
@ -84,7 +92,7 @@ emu.register_frame_done(function()
cpu.state["S"].value = 0x01FF
print("MAME-LOADED bytes=" .. #data)
end
if frame == 60 then
if frame == $CHECK_FRAME then
local cpu = manager.machine.devices[":maincpu"]
local mem = cpu.spaces["program"]
$LUA_CHECKS

View file

@ -22,7 +22,8 @@ source "$(dirname "$0")/common.sh"
BIN="$1"
shift
SECS=3
CHECK_FRAME=${MAME_CHECK_FRAME:-300}
SECS=${MAME_SECS:-6}
# 23-byte stub bytes (see runtime/src/iigsGsosStub.s for source).
# Hand-assembled to avoid relying on llvm-mc tracking M-flag state.
@ -96,7 +97,7 @@ $STUB_LUA
cpu.state["S"].value = 0x01FF
print("MAME-LOADED bytes=" .. #data .. " stub=$((${#STUB_BYTES}/2))")
end
if frame == 60 then
if frame == $CHECK_FRAME then
local cpu = manager.machine.devices[":maincpu"]
local mem = cpu.spaces["program"]
$LUA_CHECKS

View file

@ -11,7 +11,8 @@ source "$(dirname "$0")/common.sh"
MANIFEST="$1"
shift
SECS=3
CHECK_FRAME=${MAME_CHECK_FRAME:-300}
SECS=${MAME_SECS:-6}
# Build address list as Lua table entries, mirroring runInMame.sh.
LUA_CHECKS=""
@ -97,7 +98,7 @@ $LOAD_LUA
cpu.state["S"].value = 0x01FF
print('MAME-READY pc=0x' .. string.format('%06x', $ENTRY_BASE + $ENTRY_OFF))
end
if frame == 60 then
if frame == $CHECK_FRAME then
local cpu = manager.machine.devices[":maincpu"]
local mem = cpu.spaces["program"]
$LUA_CHECKS

View file

@ -833,6 +833,15 @@ struct Linker {
L.bssBase = 0xD000;
}
}
// Also bump past the IO window if BSS would SPAN it
// (starts below 0xC000, extends into or past 0xC000).
// BSS writes to 0xC000-0xCFFF hit soft switches — caught
// by smoke #128 hex dumper, where ~954-byte BSS pushed
// past 0xC000 and BSS-clear writes crashed MAME.
if (L.bssBase < 0xC000 &&
L.bssBase + L.bssSize > 0xC000) {
L.bssBase = 0xD000;
}
if (L.bssBase + L.bssSize > 0x10000u) {
char msg[256];
std::snprintf(msg, sizeof(msg),

View file

@ -114,6 +114,17 @@ W65816TargetLowering::W65816TargetLowering(const TargetMachine &TM,
for (MVT VT : MVT::integer_valuetypes())
setLoadExtAction(ISD::SEXTLOAD, VT, MVT::i8, Expand);
// GlobalOpt sometimes narrows a `short` global to `i1` when it sees
// every assignment is 0 or 1. Custom-lower so LowerLoad rewrites
// `zext/sext/anyext from i1` into a plain byte load + appropriate
// mask. Both i16 and i8 result widths can appear, depending on
// whether the consumer wants the value as `short` or `bool`.
for (MVT ResVT : {MVT::i8, MVT::i16}) {
setLoadExtAction(ISD::ZEXTLOAD, ResVT, MVT::i1, Custom);
setLoadExtAction(ISD::SEXTLOAD, ResVT, MVT::i1, Custom);
setLoadExtAction(ISD::EXTLOAD, ResVT, MVT::i1, Custom);
}
// Only register i32 ext-load / trunc-store and Custom actions when
// i32 is actually a legal type (ptr32 mode active). Otherwise the
// Custom-action calls intercept i16/i8 ops, and LowerTruncate's
@ -191,6 +202,20 @@ W65816TargetLowering::W65816TargetLowering(const TargetMachine &TM,
setOperationAction(ISD::SMUL_LOHI, MVT::i16, Expand);
setOperationAction(ISD::UMUL_LOHI, MVT::i16, Expand);
setOperationAction(ISD::MUL, MVT::i16, LibCall);
// i8 multiply / mulh / div / rem: SDAG narrows e.g. `x / 10` to
// `mulhu i8 x, -51` + shift when it proves operands fit in i8.
// The 65816 has no native 8-bit multiplier; route everything
// through the 16-bit libcalls by Promoting i8 ops to i16.
setOperationAction(ISD::MUL, MVT::i8, Promote);
setOperationAction(ISD::MULHU, MVT::i8, Promote);
setOperationAction(ISD::MULHS, MVT::i8, Promote);
setOperationAction(ISD::SDIV, MVT::i8, Promote);
setOperationAction(ISD::UDIV, MVT::i8, Promote);
setOperationAction(ISD::SREM, MVT::i8, Promote);
setOperationAction(ISD::UREM, MVT::i8, Promote);
setOperationAction(ISD::SMUL_LOHI, MVT::i8, Expand);
setOperationAction(ISD::UMUL_LOHI, MVT::i8, Expand);
// CTPOP/CTLZ/CTTZ/ROTL/ROTR — no hardware support. Expand lets the
// type legalizer rewrite into a sequence of basic ops. Without
// this, e.g. `x && !(x & (x-1))` (LLVM canonicalises to popcount==1)
@ -904,6 +929,28 @@ SDValue W65816TargetLowering::LowerLoad(SDValue Op,
Ld->getAlign(),
Ld->getMemOperand()->getFlags());
}
// i1 memory type comes from GlobalOpt narrowing `short` globals
// whose only assignments are 0/1. Treat as i8 load + appropriate
// mask — the underlying memory is still byte-sized.
if (MemVT == MVT::i1) {
SDValue ByteLd = DAG.getExtLoad(ISD::ZEXTLOAD, DL, MVT::i16, Chain,
FoldedLo, MVT::i8,
Ld->getMemOperand());
SDValue Val = ByteLd;
if (ExtType == ISD::ZEXTLOAD || ExtType == ISD::EXTLOAD) {
Val = DAG.getNode(ISD::AND, DL, MVT::i16, ByteLd,
DAG.getConstant(1, DL, MVT::i16));
} else if (ExtType == ISD::SEXTLOAD) {
// i1 sign-extend: bit 0 -> all bits. AND #1 then NEG.
SDValue Bit = DAG.getNode(ISD::AND, DL, MVT::i16, ByteLd,
DAG.getConstant(1, DL, MVT::i16));
Val = DAG.getNode(ISD::SUB, DL, MVT::i16,
DAG.getConstant(0, DL, MVT::i16), Bit);
}
if (Op.getValueType() == MVT::i8)
Val = DAG.getNode(ISD::TRUNCATE, DL, MVT::i8, Val);
return DAG.getMergeValues({Val, ByteLd.getValue(1)}, DL);
}
return DAG.getExtLoad(ExtType, DL, Op.getValueType(), Chain, FoldedLo,
MemVT, Ld->getMemOperand());
}
@ -913,6 +960,9 @@ SDValue W65816TargetLowering::LowerLoad(SDValue Op,
return SDValue();
EVT MemVT = Ld->getMemoryVT();
// Widen i1 memVT to i8 (single-byte storage). getMemIntrinsicNode
// asserts memvt must be supported; i1 isn't.
if (MemVT == MVT::i1) MemVT = MVT::i8;
SDVTList VTs = DAG.getVTList(MVT::i16, MVT::Other);
SDValue Ops[] = { Chain, Ptr };
// memVT for the LD_PTR memintrinsic must match MMO's size (i8 vs
@ -925,10 +975,14 @@ SDValue W65816TargetLowering::LowerLoad(SDValue Op,
MemVT, Ld->getMemOperand());
SDValue Val = LdNode;
// Byte memory access: mask the high byte for zextload, leave anyext.
// i1 memVT was widened to i8 above; the mask path is the same.
if (MemVT == MVT::i8) {
if (Ld->getExtensionType() == ISD::ZEXTLOAD)
Val = DAG.getNode(ISD::AND, DL, MVT::i16, Val,
DAG.getConstant(0xFF, DL, MVT::i16));
EVT OrigMemVT = Ld->getMemoryVT();
SDValue MaskC = DAG.getConstant(OrigMemVT == MVT::i1 ? 1 : 0xFF,
DL, MVT::i16);
if (Ld->getExtensionType() == ISD::ZEXTLOAD ||
(OrigMemVT == MVT::i1 && Ld->getExtensionType() == ISD::EXTLOAD))
Val = DAG.getNode(ISD::AND, DL, MVT::i16, Val, MaskC);
else if (Ld->getExtensionType() == ISD::SEXTLOAD)
Val = DAG.getNode(ISD::SIGN_EXTEND_INREG, DL, MVT::i16, Val,
DAG.getValueType(MVT::i8));

View file

@ -110,21 +110,32 @@ static int classifyImgReg(unsigned Reg) {
return -1;
}
// Classification of a DP-addressed instruction's relation to a DP slot.
enum class DpAccess {
None, // not a DP-imm instruction we care about
Read, // only reads the DP slot (e.g., LDA $C0)
Write, // only writes the DP slot (e.g., STA $C0)
ReadWrite, // both (e.g., INC $C0)
};
// Map a DP-addressed instruction's first immediate operand to an IMG
// slot index if it falls in $C0..$CE. Returns -1 otherwise.
static int classifyDpImmAsImg(const MachineInstr &MI) {
// Most DP-addressed opcodes take the dp address as immediate op 0.
// (Some, like ADC_DP-form-with-explicit-A, may put the imm at op 1.)
// For our scan, check the first IMM operand we find.
// slot index and access mode. Returns (-1, None) if it doesn't access
// an IMG slot.
static std::pair<int, DpAccess> classifyDpImmAsImg(const MachineInstr &MI) {
unsigned Opc = MI.getOpcode();
DpAccess Mode;
switch (Opc) {
case W65816::LDA_DP:
// Pure stores: write only.
case W65816::STA_DP:
case W65816::STZ_DP:
case W65816::LDX_DP:
case W65816::STX_DP:
case W65816::LDY_DP:
case W65816::STY_DP:
Mode = DpAccess::Write;
break;
// Pure loads / compares / bit-tests: read only (writes to A/X/Y/P, not DP).
case W65816::LDA_DP:
case W65816::LDX_DP:
case W65816::LDY_DP:
case W65816::ADC_DP:
case W65816::SBC_DP:
case W65816::AND_DP:
@ -134,53 +145,68 @@ static int classifyDpImmAsImg(const MachineInstr &MI) {
case W65816::CPX_DP:
case W65816::CPY_DP:
case W65816::BIT_DP:
Mode = DpAccess::Read;
break;
// Read-modify-write.
case W65816::INC_DP:
case W65816::DEC_DP:
case W65816::ASL_DP:
case W65816::LSR_DP:
case W65816::ROL_DP:
case W65816::ROR_DP:
Mode = DpAccess::ReadWrite;
break;
default:
return -1;
return {-1, DpAccess::None};
}
for (const auto &MO : MI.operands()) {
if (!MO.isImm()) continue;
int64_t V = MO.getImm();
for (int i = 0; i < 8; ++i)
if ((int64_t)IMG_DP[i] == V)
return i;
return -1; // First imm is the dp addr; not in IMG range.
return {i, Mode};
return {-1, DpAccess::None}; // First imm is the dp addr; not in IMG range.
}
return -1;
return {-1, DpAccess::None};
}
bool W65816ImgCalleeSave::runOnMachineFunction(MachineFunction &MF) {
// Step 1: scan for IMG8..IMG15 usage. copyPhysReg already lowered
// some COPY $imgN = $a forms to STA_DP imm:0xC0 (etc.), so we have
// to check both the physreg form AND the DP-immediate form.
bool UsedSlot[8] = {false};
bool AnyUsed = false;
// Step 1: scan for IMG8..IMG15 WRITES. Reads alone don't need saving
// — if we never write IMGn, the caller's value survives untouched
// (other functions we call also preserve IMG8..IMG15 by the same
// convention, so no chain breaks the invariant). Saving on read-only
// use costs ~6 bytes per slot of needlessly-saved prologue/epilogue
// (caught by evalAt at 1.96× Calypsi — 5 IMG slots saved when fewer
// were actually written).
//
// copyPhysReg lowers `COPY $imgN = $a` to `STA_DP imm:0xCx`, so we
// check both the physreg-DEF form AND the DP-imm-store form.
bool WrittenSlot[8] = {false};
bool AnyWritten = false;
for (auto &MBB : MF) {
for (auto &MI : MBB) {
// physreg form: $imgN = ... or ... = $imgN
// physreg-DEF form: $imgN appearing as a Def operand.
for (const auto &MO : MI.operands()) {
if (!MO.isReg() || MO.getReg() == 0) continue;
if (!MO.isReg() || MO.getReg() == 0 || !MO.isDef()) continue;
int idx = classifyImgReg(MO.getReg());
if (idx >= 0) {
UsedSlot[idx] = true;
AnyUsed = true;
WrittenSlot[idx] = true;
AnyWritten = true;
}
}
// DP-imm form: lda dp imm:0xC0 etc.
int idx = classifyDpImmAsImg(MI);
if (idx >= 0) {
UsedSlot[idx] = true;
AnyUsed = true;
// DP-imm form: STA_DP / INC_DP / etc. write the slot at $Cx.
auto [idx, mode] = classifyDpImmAsImg(MI);
if (idx >= 0 &&
(mode == DpAccess::Write || mode == DpAccess::ReadWrite)) {
WrittenSlot[idx] = true;
AnyWritten = true;
}
}
}
if (!AnyUsed) return false;
if (!AnyWritten) return false;
// Rename for downstream Step 2/3/4 readability — they use UsedSlot.
bool (&UsedSlot)[8] = WrittenSlot;
(void)AnyWritten;
// Step 2: allocate one frame slot per used IMG. Size = 2 bytes (each
// Img16 holds a 16-bit value). Mark as a spill slot so PEI accounts

View file

@ -942,6 +942,17 @@ def : Pat<(i16 (zextloadi8 (W65816Wrapper tglobaladdr:$g))),
def : Pat<(i16 (zextloadi8 (W65816Wrapper texternalsym:$s))),
(ANDi16imm (LDAabs texternalsym:$s), 0xFF)>;
// i1-result loads from globals: GlobalOpt narrows `static short` to
// i1 when it sees every assignment is 0 or 1. zextloadi1 and
// extloadi1 land on us as i16-result loads with `s8`/i1 memory type;
// emit them as a normal byte load + mask (zext) or bare load (ext).
def : Pat<(i16 (zextloadi1 (W65816Wrapper tglobaladdr:$g))),
(ANDi16imm (LDAabs tglobaladdr:$g), 0xFF)>;
def : Pat<(i16 (extloadi1 (W65816Wrapper tglobaladdr:$g))),
(LDAabs tglobaladdr:$g)>;
def : Pat<(i16 (sextloadi1 (W65816Wrapper tglobaladdr:$g))),
(ANDi16imm (LDAabs tglobaladdr:$g), 1)>;
// CMP / branches. CMP sets the flags via the W65816cmp SDNode (glue
// out); the W65816brcc node consumes the glue and dispatches to the
// right Bxx instruction by condition code.

View file

@ -117,17 +117,33 @@ bool W65816LowerWide32::runOnMachineFunction(MachineFunction &MF) {
MachineInstr *DefMI = MRI.getUniqueVRegDef(W);
if (DefMI && DefMI->getOpcode() == TargetOpcode::REG_SEQUENCE) {
Register Lo, Hi;
bool Bail = false;
for (unsigned op = 1; op + 1 < DefMI->getNumOperands(); op += 2) {
if (!DefMI->getOperand(op).isReg() ||
!DefMI->getOperand(op + 1).isImm())
continue;
unsigned idx = DefMI->getOperand(op + 1).getImm();
Register Src = DefMI->getOperand(op).getReg();
unsigned SrcSub = DefMI->getOperand(op).getSubReg();
// If the source has a sub-register specifier (e.g.
// `%W.sub_lo:wide32` is a slice of a wide32 vreg), the
// effective "half" is the corresponding half of that source.
// Resolve via wideMap when the parent is already mapped;
// otherwise defer until a later iteration picks it up.
if (SrcSub != 0) {
if (!Src.isVirtual() || !wideMap.count(Src)) {
Bail = true;
break;
}
auto [SrcLo, SrcHi] = wideMap[Src];
Src = (SrcSub == llvm::sub_lo) ? SrcLo : SrcHi;
}
if (idx == llvm::sub_lo)
Lo = Src;
else if (idx == llvm::sub_hi)
Hi = Src;
}
if (Bail) continue;
if (Lo && Hi) {
wideMap[W] = {Lo, Hi};
toErase.push_back(DefMI);
@ -156,25 +172,38 @@ bool W65816LowerWide32::runOnMachineFunction(MachineFunction &MF) {
MachineInstr *LoDefMI = nullptr;
MachineInstr *HiDefMI = nullptr;
bool ok = true;
bool Bail = false;
for (MachineInstr &MI : MRI.def_instructions(W)) {
if (!MI.isCopy()) { ok = false; break; }
const MachineOperand &Dst = MI.getOperand(0);
const MachineOperand &Src = MI.getOperand(1);
if (!Dst.isReg() || Dst.getReg() != W) { ok = false; break; }
unsigned SubIdx = Dst.getSubReg();
Register S = Src.getReg();
unsigned SrcSub = Src.getSubReg();
// If the source has a sub-register specifier, resolve through
// wideMap[parent]. Symmetric with the REG_SEQUENCE handler
// above — without this, `%W.sub_lo = COPY %V.sub_lo:wide32`
// records the wide32 parent %V instead of %V's i16 sub_lo.
if (SrcSub != 0) {
if (!S.isVirtual() || !wideMap.count(S)) { Bail = true; break; }
auto [SL, SH] = wideMap[S];
S = (SrcSub == llvm::sub_lo) ? SL : SH;
}
if (SubIdx == llvm::sub_lo) {
if (LoDefMI) { ok = false; break; }
LoDefMI = &MI;
LoSrc = Src.getReg();
LoSrc = S;
} else if (SubIdx == llvm::sub_hi) {
if (HiDefMI) { ok = false; break; }
HiDefMI = &MI;
HiSrc = Src.getReg();
HiSrc = S;
} else {
ok = false;
break;
}
}
if (Bail) continue;
if (ok && LoSrc && HiSrc) {
wideMap[W] = {LoSrc, HiSrc};
if (LoDefMI) toErase.push_back(LoDefMI);

View file

@ -281,7 +281,11 @@ bool W65816PromoteFiToImg::runOnMachineFunction(MachineFunction &MF) {
Name == "__modsi3" || Name == "__ashlhi3" ||
Name == "__lshrhi3" || Name == "__ashrhi3" ||
Name == "__ashlsi3" || Name == "__lshrsi3" ||
Name == "__ashrsi3")
Name == "__ashrsi3" ||
// 64-bit helpers: use $E0..$EE only, no IMG0..7 touch.
Name == "__ashldi3" || Name == "__lshrdi3" ||
Name == "__ashrdi3" || Name == "__cmpdi2" ||
Name == "__ucmpdi2")
return true;
return false;
}

View file

@ -54,6 +54,7 @@
#include "llvm/CodeGen/MachineInstrBuilder.h"
#include "llvm/Support/Debug.h"
#include "llvm/Support/Format.h"
#include <functional>
using namespace llvm;
@ -131,6 +132,501 @@ static bool isImgSafeCall(const MachineInstr &MI) {
}
// Phase 12 peephole — A-dead PHA/PLA bracket elision. Two shapes:
//
// (a) PEI single-store IMG-source-STAfi bracket. When the next op
// after PLA redefines A, the bracket is dead weight:
//
// PHA ; (LDA_DP $cx | TXA | TYA) ; STA_StackRel (off+2) ; PLA
// [next redefines A]
// →
// (LDA_DP $cx | TXA | TYA) ; STA_StackRel off
//
// (b) ImgCalleeSave multi-store bracket at function entry. When the
// post-PLA pattern is "STX_DP ... ; STA_StackRel destOff ; [redefines
// A]", the post-PLA STA is storing entry-A to its final slot — we
// reorder by hoisting that STA to BEFORE the bracket, then dropping
// PHA/PLA and reverting inner offsets:
//
// PHA ; (LDA_DP $cx ; STA_StackRel off+2)×N ; PLA
// STX_DP $cM ; STA_StackRel destOff
// [next redefines A]
// →
// STA_StackRel destOff ; hoisted, entry-A → slot first
// (LDA_DP $cx ; STA_StackRel off)×N
// STX_DP $cM ; STX stays after saves
// [next op]
//
// Restricted to the entry MBB starting at MBB.begin() to ensure the
// match is an ImgCalleeSave-emitted prologue bracket (and not a mid-
// function bracket where the post-PLA STA is consuming a *different*
// A value than what was preserved).
static bool elidePhaBracket(MachineFunction &MF,
const W65816InstrInfo *TII) {
bool Changed = false;
auto opNoTouchA = [](unsigned Op) {
switch (Op) {
case W65816::STX_DP: case W65816::STX_Abs:
case W65816::STY_DP: case W65816::STY_Abs:
return true;
default:
return false;
}
};
auto opRedefinesA = [](unsigned Op) {
switch (Op) {
case W65816::LDA_DP: case W65816::LDA_StackRel:
case W65816::LDA_Abs: case W65816::LDA_Imm16:
case W65816::LDAabs: case W65816::LDAi16imm:
case W65816::TXA: case W65816::TYA:
case W65816::PLA:
return true;
default:
return false;
}
};
// --- Case (a): single-store brackets anywhere in any MBB. ---
for (MachineBasicBlock &MBB : MF) {
SmallVector<MachineInstr *, 4> ToErase;
for (auto It = MBB.begin(); It != MBB.end(); ++It) {
if (It->getOpcode() != W65816::PHA) continue;
auto Lda = std::next(It);
if (Lda == MBB.end()) continue;
unsigned LdaOp = Lda->getOpcode();
bool LdaIsLoadDp = (LdaOp == W65816::LDA_DP);
bool LdaIsXfer = (LdaOp == W65816::TXA || LdaOp == W65816::TYA);
if (!LdaIsLoadDp && !LdaIsXfer) continue;
auto Sta = std::next(Lda);
if (Sta == MBB.end()) continue;
if (Sta->getOpcode() != W65816::STA_StackRel) continue;
auto Pla = std::next(Sta);
if (Pla == MBB.end()) continue;
if (Pla->getOpcode() != W65816::PLA) continue;
auto AfterPla = std::next(Pla);
if (AfterPla == MBB.end()) continue;
unsigned AfterPlaOp = AfterPla->getOpcode();
bool AfterDeadA = opRedefinesA(AfterPlaOp);
// Forward-walk liveness: if AfterPla is a branch and ALL its
// successors' first ops redefine A (recursing through
// unconditional-branch trampolines), A is dead.
if (!AfterDeadA && AfterPla->isBranch()) {
bool AllDead = true;
std::function<bool(MachineBasicBlock *, int)> firstRedef =
[&](MachineBasicBlock *B, int Depth) -> bool {
if (Depth > 3 || !B || B->empty()) return false;
MachineInstr &MI = B->front();
unsigned MOp = MI.getOpcode();
if (opRedefinesA(MOp)) return true;
if (MOp == W65816::BRA || MOp == W65816::BRL ||
MOp == W65816::JMP_Abs) {
for (auto &MO : MI.operands()) {
if (MO.isMBB()) {
return firstRedef(MO.getMBB(), Depth + 1);
}
}
}
return false;
};
for (MachineBasicBlock *Succ : MBB.successors()) {
if (!firstRedef(Succ, 0)) { AllDead = false; break; }
}
if (AllDead && !MBB.succ_empty()) AfterDeadA = true;
}
if (!AfterDeadA) continue;
MachineOperand &OffMO = Sta->getOperand(0);
if (!OffMO.isImm()) continue;
int64_t Off = OffMO.getImm();
if (Off < 2) continue;
OffMO.setImm(Off - 2);
ToErase.push_back(&*It);
ToErase.push_back(&*Pla);
}
for (MachineInstr *MI : ToErase) {
MI->eraseFromParent();
Changed = true;
}
}
// --- Case (c): multi-pair STA_DP-only bracket anywhere. ---
// IMG-to-IMG copies bracketed for A-preservation. No StackRel
// offsets to adjust (DP is absolute, immune to PHA shifts), so just
// drop PHA/PLA when A is dead at PLA's exit.
std::function<bool(MachineBasicBlock *, int)> firstRedef =
[&](MachineBasicBlock *B, int Depth) -> bool {
if (Depth > 3 || !B || B->empty()) return false;
MachineInstr &MI = B->front();
unsigned MOp = MI.getOpcode();
if (opRedefinesA(MOp)) return true;
if (MOp == W65816::BRA || MOp == W65816::BRL ||
MOp == W65816::JMP_Abs) {
for (auto &MO : MI.operands()) {
if (MO.isMBB()) return firstRedef(MO.getMBB(), Depth + 1);
}
}
return false;
};
for (MachineBasicBlock &MBB : MF) {
SmallVector<MachineInstr *, 4> ToErase;
for (auto It = MBB.begin(); It != MBB.end(); ++It) {
if (It->getOpcode() != W65816::PHA) continue;
// Walk inner LDA_DP + STA_DP pairs.
auto Inner = std::next(It);
int InnerPairs = 0;
bool BailInner = false;
while (Inner != MBB.end()) {
if (Inner->getOpcode() == W65816::PLA) break;
if (Inner->getOpcode() != W65816::LDA_DP) { BailInner = true; break; }
auto St = std::next(Inner);
if (St == MBB.end() || St->getOpcode() != W65816::STA_DP) {
BailInner = true; break;
}
++InnerPairs;
Inner = std::next(St);
}
if (BailInner || Inner == MBB.end() || InnerPairs < 1) continue;
// Inner == PLA. Check liveness after PLA.
auto Post = std::next(Inner);
if (Post == MBB.end()) continue;
unsigned PostOp = Post->getOpcode();
bool ADead = opRedefinesA(PostOp);
if (!ADead && Post->isBranch()) {
bool AllDead = true;
for (MachineBasicBlock *Succ : MBB.successors()) {
if (!firstRedef(Succ, 0)) { AllDead = false; break; }
}
if (AllDead && !MBB.succ_empty()) ADead = true;
}
if (!ADead) continue;
// Eligible: drop PHA + PLA (no offset adjustment for DP).
ToErase.push_back(&*It);
ToErase.push_back(&*Inner);
}
for (MachineInstr *MI : ToErase) {
MI->eraseFromParent();
Changed = true;
}
}
// --- Case (b): ImgCalleeSave prologue bracket in entry MBB. ---
// PHA must be the FIRST instruction (or first after PEI prologue ops
// like REP/TAY/TSC/SEC/SBC/TCS/TYA) in the entry MBB. This ensures
// we're looking at the prologue's IMG save block.
MachineBasicBlock &EntryMBB = MF.front();
auto BB = EntryMBB.begin();
// Skip PEI prologue ops to reach the first ImgCalleeSave PHA.
while (BB != EntryMBB.end()) {
unsigned Op = BB->getOpcode();
if (Op == W65816::PHA) break;
// PEI prologue ops we expect to see before ImgCalleeSave's PHA.
if (Op == W65816::REP || Op == W65816::TAY ||
Op == W65816::TSC || Op == W65816::SEC ||
Op == W65816::SBC_Imm16 || Op == W65816::TCS ||
Op == W65816::TYA) {
++BB;
continue;
}
BB = EntryMBB.end(); // not a recognized prologue shape — bail
break;
}
if (BB != EntryMBB.end() && BB->getOpcode() == W65816::PHA) {
SmallVector<MachineInstr *, 8> InnerStas;
auto Inner = std::next(BB);
bool BailInner = false;
while (Inner != EntryMBB.end()) {
unsigned IOp = Inner->getOpcode();
if (IOp == W65816::PLA) break;
// Inner must be alternating LDA_DP + STA_StackRel pairs.
if (IOp != W65816::LDA_DP) { BailInner = true; break; }
auto St = std::next(Inner);
if (St == EntryMBB.end() || St->getOpcode() != W65816::STA_StackRel) {
BailInner = true; break;
}
MachineOperand &OffMO = St->getOperand(0);
if (!OffMO.isImm() || OffMO.getImm() < 2) {
BailInner = true; break;
}
InnerStas.push_back(&*St);
Inner = std::next(St);
}
if (!BailInner && Inner != EntryMBB.end() && !InnerStas.empty()) {
// Inner == PLA. Walk forward through STX_DP / STY_DP (A-
// transparent) ops looking for STA_StackRel that consumes
// entry-A, then verify next op redefines A.
auto Post = std::next(Inner);
while (Post != EntryMBB.end() && opNoTouchA(Post->getOpcode())) {
++Post;
}
if (Post != EntryMBB.end() &&
Post->getOpcode() == W65816::STA_StackRel) {
auto AfterSta = std::next(Post);
if (AfterSta != EntryMBB.end() &&
opRedefinesA(AfterSta->getOpcode())) {
// Eligible. Move STA destOff to right BEFORE PHA, drop
// PHA + PLA, shift inner STA offsets by -2.
MachineInstr *StaToMove = &*Post;
MachineInstr *PhaMI = &*BB;
MachineInstr *PlaMI = &*Inner;
// splice: move StaToMove to position just before PhaMI.
EntryMBB.splice(PhaMI->getIterator(), &EntryMBB,
StaToMove->getIterator());
for (MachineInstr *Sta : InnerStas) {
Sta->getOperand(0).setImm(Sta->getOperand(0).getImm() - 2);
}
PhaMI->eraseFromParent();
PlaMI->eraseFromParent();
Changed = true;
}
}
}
}
return Changed;
}
// Always-on: elide the STA $E0 / LDA $E0 round-trip in
// ADJCALLSTACKUP's Y-live i64-return path when the next instruction
// after the LDA is `STA_StackRel off,s` storing A to a slot. The
// emitted PEI sequence (see W65816FrameLowering ADJCALLSTACKUP):
//
// STA_DP $E0 ; save A across TSC
// TSC ; A = S
// CLC ; ADC_Imm16 #N ; TCS ; pop N bytes
// LDA_DP $E0 ; restore A
// STA_StackRel off, s ; store A to slot
//
// If the destination's pre-adjust offset (off + N) fits in a 1-byte
// stack-rel encoding, we can move the STA up to BEFORE the SP-adjust
// (using the pre-adjust offset) and drop both the save and reload.
//
// Saves 6 bytes + 8 cyc per match. evalAt has 4 of these.
static bool elideCallResultSaveSPReload(MachineFunction &MF,
const W65816InstrInfo *TII) {
bool Changed = false;
for (MachineBasicBlock &MBB : MF) {
SmallVector<MachineInstr *, 4> ToErase;
for (auto It = MBB.begin(); It != MBB.end(); ++It) {
if (It->getOpcode() != W65816::STA_DP) continue;
MachineOperand &SaveImm = It->getOperand(0);
if (!SaveImm.isImm() || SaveImm.getImm() != 0xE0) continue;
auto I1 = std::next(It);
if (I1 == MBB.end() || I1->getOpcode() != W65816::TSC) continue;
auto I2 = std::next(I1);
if (I2 == MBB.end() || I2->getOpcode() != W65816::CLC) continue;
auto I3 = std::next(I2);
if (I3 == MBB.end() || I3->getOpcode() != W65816::ADC_Imm16) continue;
MachineOperand &AdcImm = I3->getOperand(0);
if (!AdcImm.isImm()) continue;
int64_t N = AdcImm.getImm();
auto I4 = std::next(I3);
if (I4 == MBB.end() || I4->getOpcode() != W65816::TCS) continue;
auto I5 = std::next(I4);
if (I5 == MBB.end() || I5->getOpcode() != W65816::LDA_DP) continue;
MachineOperand &LoadImm = I5->getOperand(0);
if (!LoadImm.isImm() || LoadImm.getImm() != 0xE0) continue;
auto I6 = std::next(I5);
if (I6 == MBB.end() || I6->getOpcode() != W65816::STA_StackRel) continue;
MachineOperand &StaImm = I6->getOperand(0);
if (!StaImm.isImm()) continue;
int64_t Off = StaImm.getImm();
int64_t NewOff = Off + N;
if (NewOff < 0 || NewOff > 255) continue;
// Insert a new STA_StackRel at NewOff before the STA_DP $E0.
BuildMI(MBB, It, It->getDebugLoc(), TII->get(W65816::STA_StackRel))
.addImm(NewOff);
ToErase.push_back(&*It); // STA_DP $E0
ToErase.push_back(&*I5); // LDA_DP $E0
ToErase.push_back(&*I6); // original STA_StackRel
}
for (MachineInstr *MI : ToErase) {
MI->eraseFromParent();
Changed = true;
}
}
return Changed;
}
// Returns true if the opcode is "transparent" to a STA→LDA forward —
// does not write A, does not change S, does not write to any stack
// memory. Used to widen the elideStoreForwarding peephole's window.
static bool isStaLdaTransparent(unsigned Opc) {
switch (Opc) {
// X/Y register ops (don't touch A or S)
case W65816::LDX_Imm16: case W65816::LDX_DP: case W65816::LDX_Abs:
case W65816::LDXi16imm:
case W65816::LDY_Imm16: case W65816::LDY_DP: case W65816::LDY_Abs:
case W65816::TAX: case W65816::TAY:
case W65816::INX: case W65816::INY:
case W65816::DEX: case W65816::DEY:
case W65816::STX_DP: case W65816::STX_Abs:
case W65816::STY_DP: case W65816::STY_Abs:
// Flag ops
case W65816::CLC: case W65816::SEC:
case W65816::CLD: case W65816::SED:
case W65816::CLI: case W65816::SEI:
case W65816::CLV:
case W65816::NOP:
return true;
default:
return false;
}
}
// Always-on: drop a redundant LDA following STA to the same slot when
// any intermediate ops are "transparent" (don't write A or change S
// or stack memory). STA doesn't modify A, so A still holds the value.
//
// STA off, s
// LDX #imm ; transparent
// LDA off, s ; redundant — A unchanged since STA
//
// Saves 1 instruction (3 bytes / 4 cyc) per match.
static bool elideStoreForwarding(MachineFunction &MF) {
bool Changed = false;
for (MachineBasicBlock &MBB : MF) {
SmallVector<MachineInstr *, 4> ToErase;
for (auto It = MBB.begin(); It != MBB.end(); ++It) {
if (It->getOpcode() != W65816::STA_StackRel) continue;
MachineOperand &S = It->getOperand(0);
if (!S.isImm()) continue;
int64_t StaOff = S.getImm();
// Walk forward up to 3 ops looking for matching LDA.
MachineBasicBlock::iterator Walk = std::next(It);
int Steps = 0;
while (Walk != MBB.end() && Steps < 3) {
unsigned WOp = Walk->getOpcode();
if (WOp == W65816::LDA_StackRel) {
MachineOperand &L = Walk->getOperand(0);
if (L.isImm() && L.getImm() == StaOff) {
ToErase.push_back(&*Walk);
}
break;
}
if (!isStaLdaTransparent(WOp)) break;
++Walk;
++Steps;
}
}
for (MachineInstr *MI : ToErase) {
MI->eraseFromParent();
Changed = true;
}
}
return Changed;
}
// Always-on: drop a consecutive PLA/PHA pair. PLA restores A from
// the stack; PHA immediately pushes the same A back. Net is a no-op
// in both A and stack memory. Emerges when multiple adjacent IMG
// copies are each bracketed with PHA/PLA for A-preservation:
//
// PHA ; LDA dp ; STA dp ; PLA ; PHA ; LDA dp ; STA dp ; PLA
// ^^^^^^^^^^
// collapsed away
//
// Saves 2 instructions (2 bytes / 7 cyc) per match.
static bool elidePlaPhaPair(MachineFunction &MF) {
bool Changed = false;
for (MachineBasicBlock &MBB : MF) {
SmallVector<MachineInstr *, 4> ToErase;
for (auto It = MBB.begin(); It != MBB.end(); ++It) {
if (It->getOpcode() != W65816::PLA) continue;
auto I1 = std::next(It);
if (I1 == MBB.end() || I1->getOpcode() != W65816::PHA) continue;
ToErase.push_back(&*It);
ToErase.push_back(&*I1);
++It; // advance past PHA (already-to-erase)
}
for (MachineInstr *MI : ToErase) {
MI->eraseFromParent();
Changed = true;
}
}
return Changed;
}
// Always-on: drop a redundant LDA when the prior LDA loaded the same
// source and the only intervening instruction was PHA (which reads A
// but doesn't modify it). Emerges from i64 arg-push sequences:
//
// LDA off, s
// PHA
// LDA off, s ; A still has this value — redundant
// PHA
//
// Saves 1 instruction (3 bytes / 4 cyc) per match.
static bool elideRedundantLdaAfterPha(MachineFunction &MF) {
bool Changed = false;
for (MachineBasicBlock &MBB : MF) {
SmallVector<MachineInstr *, 4> ToErase;
for (auto It = MBB.begin(); It != MBB.end(); ++It) {
unsigned Op = It->getOpcode();
bool IsLdaSr = (Op == W65816::LDA_StackRel);
bool IsLdaDp = (Op == W65816::LDA_DP);
if (!IsLdaSr && !IsLdaDp) continue;
auto I1 = std::next(It);
if (I1 == MBB.end() || I1->getOpcode() != W65816::PHA) continue;
auto I2 = std::next(I1);
if (I2 == MBB.end() || I2->getOpcode() != Op) continue;
MachineOperand &S1 = It->getOperand(0);
MachineOperand &S2 = I2->getOperand(0);
if (!S1.isImm() || !S2.isImm()) continue;
if (S1.getImm() != S2.getImm()) continue;
ToErase.push_back(&*I2);
}
for (MachineInstr *MI : ToErase) {
MI->eraseFromParent();
Changed = true;
}
}
return Changed;
}
// Always-on: drop a dead STA in the i32-carry-propagation pattern:
//
// STA_StackRel off, s
// ADC_Imm16 #N ; doesn't touch slot
// STA_StackRel off, s ; overwrites first STA
//
// The first STA's value is shadowed by the second. Drop it.
// Saves 1 instruction (3 bytes / 5 cyc) per match.
static bool elideDeadStaCarry(MachineFunction &MF) {
bool Changed = false;
for (MachineBasicBlock &MBB : MF) {
SmallVector<MachineInstr *, 4> ToErase;
for (auto It = MBB.begin(); It != MBB.end(); ++It) {
if (It->getOpcode() != W65816::STA_StackRel) continue;
auto I1 = std::next(It);
if (I1 == MBB.end()) continue;
unsigned MidOp = I1->getOpcode();
bool IsAddImm = (MidOp == W65816::ADC_Imm16 ||
MidOp == W65816::ADCi16imm ||
MidOp == W65816::ADCEi16imm ||
MidOp == W65816::SBCi16imm ||
MidOp == W65816::SBCEi16imm);
if (!IsAddImm) continue;
auto I2 = std::next(I1);
if (I2 == MBB.end() || I2->getOpcode() != W65816::STA_StackRel) continue;
MachineOperand &Off1 = It->getOperand(0);
MachineOperand &Off2 = I2->getOperand(0);
if (!Off1.isImm() || !Off2.isImm()) continue;
if (Off1.getImm() != Off2.getImm()) continue;
ToErase.push_back(&*It);
}
for (MachineInstr *MI : ToErase) {
MI->eraseFromParent();
Changed = true;
}
}
return Changed;
}
bool W65816StackRelToImg::runOnMachineFunction(MachineFunction &MF) {
if (skipFunction(MF.getFunction())) return false;
if (MF.getFunction().hasOptNone()) return false;
@ -139,26 +635,48 @@ bool W65816StackRelToImg::runOnMachineFunction(MachineFunction &MF) {
// be from FP not SP and the PHP-wrap +1 adjustment differs.
if (MF.getFrameInfo().hasVarSizedObjects()) return false;
// Always-on peepholes that run even when the main IMG promotion bails.
const W65816Subtarget &STIp = MF.getSubtarget<W65816Subtarget>();
// Run PLA;PHA collapse FIRST so adjacent brackets merge into a
// single multi-pair bracket — lets elidePhaBracket case (c) match
// the merged shape.
bool ChangedEarly = elidePlaPhaPair(MF);
ChangedEarly |= elidePhaBracket(MF, STIp.getInstrInfo());
ChangedEarly |= elideCallResultSaveSPReload(MF, STIp.getInstrInfo());
ChangedEarly |= elideDeadStaCarry(MF);
ChangedEarly |= elideRedundantLdaAfterPha(MF);
// elideStoreForwarding only when main IMG promotion would bail —
// running it early in non-bailing functions cascades into IMG-slot
// reallocation that regresses strcpy 1.63×. Gated below.
// 2. Bail if the function has any non-IMG-safe call (would clobber
// our IMG0..7 promotions) or is recursive (same). Tried allowing
// IMG8..15 + ImgCalleeSave fallback for these cases (gained 12
// inst on evalAt), but broke sprintf and fib due to subtle
// interactions with ImgCalleeSave's slot allocation. Reverted.
// IMG8..15 + own-pass save/restore for these cases (today, after
// landing W65816LowerWide32 + ImgCalleeSave-writes-only fixes), and
// saw: evalAt 498→500 (NET LOSS due to save/restore overhead) AND
// qsort #70 regression. The IMG8..15 path is not currently a win
// for our benchmarks; reverted.
StringRef SelfName = MF.getName();
for (MachineBasicBlock &MBB : MF) {
for (MachineInstr &MI : MBB) {
if (!MI.isCall()) continue;
if (!isImgSafeCall(MI)) return false;
if (!isImgSafeCall(MI)) {
ChangedEarly |= elideStoreForwarding(MF);
return ChangedEarly;
}
for (const MachineOperand &MO : MI.operands()) {
StringRef Name;
if (MO.isGlobal()) Name = MO.getGlobal()->getName();
else if (MO.isSymbol()) Name = MO.getSymbolName();
else continue;
if (Name == SelfName) return false;
if (Name == SelfName) {
ChangedEarly |= elideStoreForwarding(MF);
return ChangedEarly;
}
}
}
uint8_t imgBase = 0xD0;
}
uint8_t imgBase = 0xD0u;
// 3. Count stack-rel accesses per offset. CRITICAL: the stack
// pointer shifts during the function due to PHP/PLP (+1 byte) and
@ -614,19 +1132,59 @@ bool W65816StackRelToImg::runOnMachineFunction(MachineFunction &MF) {
auto Tya = std::next(Tcs);
while (Tya != EntryMBB.end() && Tya->isDebugInstr()) ++Tya;
if (Tya != EntryMBB.end() && Tya->getOpcode() == W65816::TYA) {
// Walk past A-transparent ops (STX_DP, STY_DP) — these
// don't touch A, so TAY/TYA can still be removed.
auto Sta = std::next(Tya);
while (Sta != EntryMBB.end() && Sta->isDebugInstr()) ++Sta;
while (Sta != EntryMBB.end() &&
(Sta->isDebugInstr() ||
Sta->getOpcode() == W65816::STX_DP ||
Sta->getOpcode() == W65816::STY_DP)) {
++Sta;
}
if (Sta != EntryMBB.end() &&
Sta->getOpcode() == W65816::STA_DP &&
Sta->getNumOperands() >= 1 &&
Sta->getOperand(0).isImm()) {
unsigned StaOp = Sta->getOpcode();
bool IsStaDp = (StaOp == W65816::STA_DP);
bool IsStaSr = (StaOp == W65816::STA_StackRel);
if (IsStaDp || IsStaSr) {
// For STA_StackRel: pre-TCS offset = post-TCS_off - N
// where N = SBC immediate. Only valid if off >= N.
int64_t StaAddr = Sta->getOperand(0).getImm();
// Build new STA_DP between REP and TSC.
int64_t SbcImm = Sbc->getOperand(0).isImm()
? Sbc->getOperand(0).getImm() : -1;
// Drop ADCi16imm pseudo-tied operands: imm is at op 0 for
// SBC_Imm16 but op 2 for SBCi16imm — handle uniformly.
if (!Sbc->getOperand(0).isImm() &&
Sbc->getNumOperands() >= 3 &&
Sbc->getOperand(2).isImm()) {
SbcImm = Sbc->getOperand(2).getImm();
}
int64_t NewAddr = IsStaDp ? StaAddr : (StaAddr - SbcImm);
bool OffOk = IsStaDp || (NewAddr >= 1 && SbcImm > 0);
// Safety: the op after the spill-STA must REDEFINE A
// (not read it). Otherwise A would be lost (TCS
// clobbered it).
auto Next = std::next(Sta);
while (Next != EntryMBB.end() && Next->isDebugInstr())
++Next;
bool NextRedef = false;
if (Next != EntryMBB.end()) {
unsigned NOp = Next->getOpcode();
NextRedef =
NOp == W65816::LDA_DP || NOp == W65816::LDA_StackRel ||
NOp == W65816::LDA_Abs || NOp == W65816::LDA_Imm16 ||
NOp == W65816::LDAabs || NOp == W65816::LDAi16imm ||
NOp == W65816::TXA || NOp == W65816::TYA ||
NOp == W65816::PLA;
}
if (OffOk && NextRedef) {
// Build new STA_<DP|StackRel> between REP and TSC.
DebugLoc DL = Sta->getDebugLoc();
BuildMI(EntryMBB, Tsc, DL, TII->get(W65816::STA_DP))
.addImm(StaAddr)
BuildMI(EntryMBB, Tsc, DL, TII->get(StaOp))
.addImm(NewAddr)
.addReg(W65816::A, RegState::Implicit);
// Erase: TAY, TYA, old STA_DP.
// Erase: TAY, TYA, old STA.
Tay->eraseFromParent();
Tya->eraseFromParent();
Sta->eraseFromParent();
@ -640,6 +1198,8 @@ bool W65816StackRelToImg::runOnMachineFunction(MachineFunction &MF) {
}
}
}
}
}
// Phase 5: dead STA before pop. Pattern:
// STA_StackRel <off> ; writes to SP+off
@ -1459,5 +2019,17 @@ bool W65816StackRelToImg::runOnMachineFunction(MachineFunction &MF) {
}
}
// Run elideStoreForwarding at the very end, AFTER IMG promotion has
// committed slot assignments. Running this peephole earlier (with
// the other early peepholes) cascades into different IMG-promotion
// choices and was observed to regress strcpy 1.63×. At this point
// promotion is done, so dropping a redundant LDA can no longer
// disturb slot allocation.
// End-of-pass: also try elideStoreForwarding for non-bailing
// functions. After main IMG promotion finalizes slot assignments,
// dropping a redundant LDA can no longer disturb them.
Changed |= elideStoreForwarding(MF);
Changed |= ChangedEarly;
return Changed;
}