Checkpoint
parent 35aaf7953a
commit f338d93bae
21 changed files with 26533 additions and 965 deletions
632
STATUS.md
@@ -14,9 +14,16 @@ which runs correctly under MAME (apple2gs).
 (signed and unsigned). Carry-chained multi-word ops via ADC/SBC pseudos
 + ASLA16 / shift libcalls.
 - Comparisons and signed/unsigned widening (sext, zext, trunc) for all
-the above sizes.
+the above sizes. Signed compare near INT_MIN handled via EOR-with-
+sign-bit transform.
 - Pointer arithmetic, array indexing, struct field access, struct
 return-by-value (up to 8 bytes — Pair, Vec4, double).
+- Pointer dereference (`*p`) lowers via `LDAptr / STAptr / STBptr`
+to `[$E0],Y` indirect-LONG with the bank byte at `$E2` forced to 0
+— DBR-independent, so `pha;plb` bank-switched callers don't corrupt
+data through callee local-pointer writes. Const-int pointers
+(`*(volatile uint16 *)0x5000 = v` MMIO idiom) lower to `STAabs`
+(DBR-relative) so bank-2 writes still work.
 - Bitfields, switch statements (verified up to ~12 cases + default),
 function pointers, function-pointer tables, indirect calls via
 `__jsl_indir` trampoline.
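The EOR-with-sign-bit transform noted in this hunk can be sketched in portable C. This is a minimal illustration assuming a 16-bit operand width; the name `signedLess16` is hypothetical and not taken from the backend. Flipping bit 15 of both operands maps the signed range onto the unsigned range in the same order, so an unsigned compare gives the signed answer even at INT_MIN, where a subtract-and-test-sign approach would overflow.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical helper: signed 16-bit "a < b" using only an unsigned compare.
 * XOR with 0x8000 flips the sign bit, which maps INT16_MIN..INT16_MAX onto
 * 0x0000..0xFFFF while preserving order. */
bool signedLess16(int16_t a, int16_t b) {
    uint16_t ua = (uint16_t)((uint16_t)a ^ 0x8000u);
    uint16_t ub = (uint16_t)((uint16_t)b ^ 0x8000u);
    return ua < ub;
}
```

For example, `signedLess16(INT16_MIN, 0)` is true, while a plain unsigned compare of the raw bit patterns (0x8000 against 0x0000) would say the opposite.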
@@ -25,14 +32,15 @@ which runs correctly under MAME (apple2gs).
 - Loops with goto / break / continue, nested loops, state machines.
 - `<stdarg.h>` varargs with int / long / unsigned long long mixed args.
 - Heap: `malloc` / `free` (libc.c first-fit allocator) — linked-list
-reverse with `cons` works.
+reverse with `cons` works; free-list coalesce verified.
 - Strings: hand-rolled `strlen`, `strcmp`, `strcpy`, `strchr`, atoi/itoa
 roundtrip.
 - Soft-float (single): all four ops + comparisons, MAME-verified.
 - Soft-double: add, sub, mul, div all return correct bit patterns
 bit-for-bit against gcc with round-to-nearest-even rounding;
-3-iter Newton sqrt converges. Long-running iterations may hit MAME's
-1-second sim-time budget (test config issue, not a compiler bug).
+3-iter Newton sqrt converges. Compiles at -O2 throughout. Long-
+running iterations may hit MAME's 1-second sim-time budget (test
+config issue, not a compiler bug).
 - Inline assembly with `"a"`, `"x"`, `"y"` register constraints and
 arbitrary opcode bytes (used for the `pha;plb` bank-switch idiom).
 - C++ minimal: clang++ compiles a class with virtual + non-trivial
@@ -43,22 +51,41 @@ which runs correctly under MAME (apple2gs).
 C99 truncation semantics for snprintf. `%.Nf` produces the
 correct fractional digits with round-half-up.
 - qsort + bsearch over arbitrary element size with a user `cmp`
-callback (insertion-sort variant — sidesteps the greedy regalloc
-bug in the recursive iterative-qsort form).
+callback.
 - Standard string/stdlib glue: strcat, strncat, strpbrk, strspn,
 strcspn, atol, llabs (kept in their own translation unit so
 vprintf's branch layout doesn't shift).
 - `<math.h>`: fabs, floor, ceil, fmod, copysign, sqrt, pow,
-sin, cos, exp, log, atan, atan2, asin, acos, sinh, cosh, tanh
-(and float variants). Bit-twiddling for fabs/floor/ceil/copysign;
-Newton iteration for sqrt; range-reduction + Taylor for sin/cos/
-exp/log/atan; identities for asin/acos/atan2/sinh/cosh/tanh.
-Accuracy is in the ~1e-6 range — good enough for typical numeric
-work, far short of glibc-quality. These are slow (each call is
-dozens to hundreds of soft-double libcalls) — pre-compute or
-cache when possible.
+sin, cos, tan, exp, log, atan, atan2, asin, acos, sinh, cosh,
+tanh (and float variants). Bit-twiddling for fabs/floor/ceil/
+copysign; Newton iteration for sqrt; range-reduction + Taylor
+for sin/cos/exp/log/atan; identities for asin/acos/atan2/sinh/
+cosh/tanh. Accuracy is in the ~1e-6 range — good enough for
+typical numeric work, far short of glibc-quality. These are
+slow (each call is dozens to hundreds of soft-double libcalls)
+— pre-compute or cache when possible.
 - `setjmp` / `longjmp` from libgcc.s.
 - Static constructors via crt0's init_array walk.
+- `<stdio.h>` file I/O against an in-memory FS: `mfsRegister
+(path, buf, size, cap, writable)` stages a buffer as a named
+file; `fopen`/`fread`/`fwrite`/`fseek`/`ftell`/`fclose`/`fgetc`
+/`fgets`/`ungetc`/`fprintf` operate on it via a per-FILE
+(kind, buf, size, cap, pos, eof, err, unget) record. stdin/
+stdout/stderr route through `putchar` as before.
+- `<wchar.h>`: wcslen / wcscmp / wcsncmp / wcscpy / wcsncpy /
+wcscat / wcschr / wcsrchr; mbtowc / wctomb / mbstowcs /
+wcstombs / mblen with the trivial 1:1 byte<->wide mapping
+(Latin-1). wchar_t is 16-bit on this target.
+- `<signal.h>`: in-process signal table. signal() registers a
+handler; raise() invokes it. Default actions: SIGABRT calls
+abort(), SIGINT/SIGTERM call exit(128+sig), others ignored.
+- `<locale.h>`: setlocale always returns "C"; localeconv returns
+a fixed C-locale lconv struct.
+- C++ subset: classes, single inheritance, virtual functions,
+polymorphism via base-class pointer arrays, virtual dtors.
+Compile with `clang++ -fno-exceptions -fno-rtti`. Multiple
+inheritance with virtual bases, full RTTI, exceptions are
+out of scope.
 
 **Toolchain:**
 
@@ -67,23 +94,60 @@ which runs correctly under MAME (apple2gs).
 text/rodata/bss, emits a flat binary the IIgs ROM can load.
 Auto-relocates bss above text+rodata when the default
 `--bss-base 0x2000` would overlap text, and skips past the
-IIgs IO window ($C000-$CFFF) if needed.
+IIgs IO window ($C000-$CFFF) if needed. `--gc-sections`
+(default ON) drops unreachable functions: a minimal program
+with full runtime linked shrinks from ~43KB to ~1.5KB.
 - `tools/omfEmit` produces OMF v2.1 single-segment files (the IIgs's
 native object format) for round-tripping with classic dev tools.
+- `link816 --debug-out FILE` writes a DWARF sidecar with text/
+rodata/bss/init_array relocations applied to every `.debug_*`
+section, so `.debug_addr` / `.debug_line` PC values are final-
+image addresses.
 - `runtime/build.sh` builds crt0, libc, soft-float, soft-double,
 libgcc into linkable objects.
-- `scripts/smokeTest.sh` runs 107 end-to-end checks (scalar ops,
-control flow, calling conventions, MAME execution, regressions,
-link816 bss-base safety + weak-symbol resolution +
-heap_end-vs-heap_start sanity, iigs/toolbox.h compile + link
-check (singles inlined, multi-arg wrappers in iigsToolbox.s),
-standalone runtime headers, AsmPrinter peepholes for STZ /
-PEA / PEI — single-STA, shared-LDA-multi-STA, and DPF0-
-forwarding cases — malloc/free coalesce ordering, plus
-real-world coverage tests: Conway's Game of Life blinker
-(2D loop + neighbour bounds), binary search tree (recursive
-struct + malloc), function-pointer dispatch table (indirect
-JSL via `__jsl_indir`). Currently 100% pass at -O2 throughout.
+- `scripts/smokeTest.sh` runs 113 end-to-end checks at -O2:
+scalar ops, control flow, calling conventions, MAME execution
+regressions, link816 bss-base safety + weak-symbol resolution +
+heap_end-vs-heap_start sanity, iigs/toolbox.h compile + link,
+standalone runtime headers, AsmPrinter peepholes (STZ / PEA /
+PEI — single-STA, shared-LDA-multi-STA, DPF0-forwarding),
+malloc/free coalesce ordering, plus real-world coverage:
+Conway's Game of Life blinker (2D loop + neighbour bounds),
+binary search tree (recursive struct + malloc), function-pointer
+dispatch table (indirect JSL via `__jsl_indir`), memory-backed
+file I/O (mfsRegister + fopen/fread/fwrite/fseek/fprintf), C++
+polymorphism (single inheritance + virtual functions), wchar /
+signal core APIs, hex dumper writing through fprintf, JSON
+tokenizer state machine, scripts/bench.sh size-vs-Calypsi
+harness. 100% pass.
+
+- `scripts/bench.sh` compiles a microbenchmark suite with both
+clang (this toolchain) and Calypsi cc65816, comparing emitted
+text-section size. Current ratio: ~2.2x (clang generates more
+bytes than Calypsi on average; sumOfSquares is the worst case
+at 6.45x because of __mulsi3 dispatch). Eight benchmarks
+shipped under `benchmarks/`.
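As a rough picture of what the `--gc-sections` option in the toolchain list does, the sketch below marks every function reachable from an entry point over a call graph and treats the rest as droppable. It is an illustrative model only: the graph representation, `MAX_FUNCS`, `CallGraph`, and `markReachable` are assumptions made for this example, not link816's actual data structures.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_FUNCS 8

/* calls[i][j] is true when function i references function j.
 * count must not exceed MAX_FUNCS. */
typedef struct {
    bool calls[MAX_FUNCS][MAX_FUNCS];
    size_t count;
} CallGraph;

/* Marks everything reachable from `entry` in keep[] and returns how many
 * functions were kept. Unmarked functions are the ones a linker could drop. */
size_t markReachable(const CallGraph *g, size_t entry, bool keep[MAX_FUNCS]) {
    size_t work[MAX_FUNCS];   /* each function is pushed at most once */
    size_t top = 0, kept = 0;
    for (size_t i = 0; i < MAX_FUNCS; i++) keep[i] = false;
    if (entry >= g->count) return 0;
    keep[entry] = true;
    work[top++] = entry;
    while (top > 0) {
        size_t f = work[--top];
        kept++;
        for (size_t j = 0; j < g->count; j++) {
            if (g->calls[f][j] && !keep[j]) {
                keep[j] = true;
                work[top++] = j;
            }
        }
    }
    return kept;
}
```

A function that calls into the program but is never called itself stays unmarked, which is the effect that shrinks a minimal program when the whole runtime is linked in.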
+
+**Backend register allocation:**
+
+- Greedy regalloc as default at -O1+; fast at -O0/optnone.
+- Pre-RA passes: `WidenAcc16` (Acc16→Wide16 promotion, lets
+greedy spread i16 pressure across A and 16 IMG slots);
+`TiedDefSpill` (handles tied-def-multi-use hazard);
+`ABridgeViaX` (bridges via X/Y when free).
+- Post-RA passes: `SpillToX` (STA/LDA pairs → TAX/TXA bridges
+when X dead); `StackSlotCleanup` (deletes redundant adjacent
+spills); `NegYIndY` (rewrites negative-Y indirect-Y stack-rel
+ops to avoid the 24-bit-add bank-cross).
+- Pre-emit: `BranchExpand` (long Bxx → INV_Bxx skip; BRA target);
+`SepRepCleanup` (coalesces adjacent SEP/REP toggles, plus a
+cross-mode-neutral coalesce that drops REP/SEP pairs sandwiching
+X-flag-only ops, branches, transfers — saves 4B / 12cyc per
+collapse). AsmPrinter LDAi8imm peephole walks past mode-neutral
+MIs to fuse the closing REP into a following SEP.
+- Imaginary registers IMG0..IMG15 backed by DP $C0..$CE +
+$D0..$DE — gives greedy 17 effective i16 carriers (A + 16 IMG)
+before stack spills kick in.
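To make the direct-page layout concrete, here is a small sketch of a slot-to-address mapping. The removed "Recently fixed" text in this same diff states that IMG0..IMG7 keep their existing `$D0..$DE` mapping; that IMG8..IMG15 occupy `$C0..$CE` in ascending order is an assumption made for this example, as is the helper name `imgDpAddress`.

```c
#include <stdint.h>

/* Returns the direct-page address backing IMGn, or 0 for an invalid index.
 * Each imaginary register is 16 bits wide, so slots sit 2 bytes apart. */
uint8_t imgDpAddress(unsigned n) {
    if (n < 8) {
        return (uint8_t)(0xD0u + 2u * n);        /* IMG0..IMG7  -> $D0..$DE */
    }
    if (n < 16) {
        return (uint8_t)(0xC0u + 2u * (n - 8u)); /* IMG8..IMG15 -> $C0..$CE (assumed order) */
    }
    return 0;
}
```

Sixteen such slots plus the accumulator give the 17 i16 carriers mentioned in the list.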
 
 **ABI:**
 
@@ -95,459 +159,83 @@ which runs correctly under MAME (apple2gs).
 - Frame is empty-descending (S points to next-free); offsets account
 for the +1 skew vs LLVM's full-descending model.
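The +1 skew follows from where S points. A minimal host-side model, using hypothetical names (`Stack816`, `push8`, `stackRel`, `fromFullDescending`) and an 8-bit stack pointer for brevity, shows that in an empty-descending stack the most recently pushed byte lives at S+1, whereas a full-descending model would find it at offset 0.

```c
#include <stdint.h>

typedef struct {
    uint8_t mem[256];
    uint8_t s;          /* points at the next free byte (empty-descending) */
} Stack816;

/* Push: store at S, then decrement S. */
void push8(Stack816 *st, uint8_t v) {
    st->mem[st->s] = v;
    st->s--;
}

/* Stack-relative read: the effective address is S + offset. */
uint8_t stackRel(const Stack816 *st, uint8_t offset) {
    return st->mem[(uint8_t)(st->s + offset)];
}

/* Converts an offset from a full-descending frame model (top item at 0)
 * into the operand needed when S is empty-descending. */
uint8_t fromFullDescending(uint8_t offset) {
    return (uint8_t)(offset + 1);
}
```

After two pushes, `stackRel(&st, 1)` reads the second value pushed and `stackRel(&st, 2)` reads the first.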
 
+**IIgs toolbox:**
+
+- `iigs/toolbox.h` — autogenerated wrappers for all ~1300 IIgs
+toolbox routines across 35 tool sets (Tool Locator, Memory
+Manager, Misc Tools, QuickDraw II / Aux, Event Manager,
+Sound Manager, Apple Desktop Bus, SANE, Integer Math, Text
+Tools, Window Manager, Menu Manager, Control Manager,
+LineEdit, Dialog Manager, Scrap Manager, Standard File,
+Note Synth/Sequencer, Font Manager, List Manager, ACE,
+Resource Manager, MIDI, Video Overlay, TextEdit, Media
+Control, Print Manager, Scheduler, Desk Manager, …). Names
+match Apple's IIgs Toolbox Reference exactly (TLStartUp,
+MMStartUp, NewWindow, SysBeep, …). 417 simple wrappers
+(zero/single-arg, i16-or-void return) inline in the header;
+890 multi-arg ones live in `runtime/src/iigsToolbox.s`.
+Generated by `scripts/genToolbox.py` from ORCA-C's
+`ORCACDefs/` (re-runnable when ORCA-C updates).
+
 ## In flight
 
-Two open bugs tracked:
-
-1. **#107 — strtok / qsort -O1+ miscompile — RESOLVED.** Three
-independent issues across the backend, runtime, and linker;
-all fixed.
-
-**Fix 1 (W65816StackSlotCleanup cross-MBB):** Pass -4 /
-Pass -4c collapsed `LDA fs.X; STA stk.Y; ... LDA_indY stk.Y`
-patterns with only an MBB-local safety check, missing cross-MBB
-readers of stk.Y. Greedy regalloc had spilled an in-place INA
-result back to stk.Y; eliminating the bb.3 init store left the
-bb.10 reload reading garbage. Function-wide cross-MBB check
-added.
-
-**Fix 2 (W65816SepRepCleanup LDAi8imm hoist):** Pre-pass that
-relocates LDAi8imm BEFORE byte-store SEP/REP wraps. LDAi8imm
-expands at AsmPrinter to its own SEP+LDA8+REP that toggles M;
-the post-RA scheduler was moving it INSIDE an STBptr wrap, so
-the LDAi8imm's REP fired BEFORE the byte STA. The STA then
-ran in M=16, writing 2 bytes of zero and clobbering the next
-byte. Hoist puts the toggle in the outer M=16 zone, leaving
-the byte STA in M=8.
-
-**Fix 3 (link816 bss-base safety + strtok_r noinline):** With
-the backend fixes, -O2 strtok grew large enough that the
-strtok() wrapper inlining (~290 extra bytes) pushed the
-binary's text+rodata past 0xC000 (IIgs IO window). Reads of
-string literals or stdio handles in that range hit IO
-registers and corrupted execution. Two complementary fixes:
-`__attribute__((noinline))` on `strtok_r` so the wrapper
-doesn't duplicate it (-O2 strtok.o now 1564B, was 2156B);
-link816 auto-relocates bss above text+rodata when default
-`--bss-base 0x2000` would overlap, and skips past the IO
-window if needed.
-
-strtok.c now compiles at -O2 with everything else. Smoke
-#84 (4-call strtok continuation) and #92 (recursive parser)
-both pass. Workaround comments in build.sh / smokeTest.sh
-removed.
-
-The `__attribute__((noinline,optnone))` defenses on iterative
-qsort / RPN `runAll` / expression-parser `runAll` were
-subsequently dropped; the smoke now compiles them at plain
-`-O2` without escape hatches.
-
-The W65816 backend assembler now supports all common indirect
-addressing modes (`(dp)`, `(dp),Y`, `(dp,X)`, `(d,s),Y`,
-`[dp]`, `[dp],Y`, and `JMP (abs)`). All `.byte` opcode hacks in
-the runtime have been removed in favour of the mnemonics. The
-disassembler decodes them too.
-
-Runtime now exposes a ~complete C99 subset: sprintf/snprintf with correct %.Nf precision, qsort/bsearch,
-the full string.h family (strcat/strncat/strpbrk/strspn/strcspn/
-strtok/strtok_r), math.h with the eleven common transcendentals
-(sqrt/pow/sin/cos/exp/log/atan/atan2/asin/acos/sinh/cosh/tanh),
-atol/llabs/atexit/exit/abort, and a smoke test that exercises
-malloc + struct pointers + strcmp/strcpy via a working hash table
-end-to-end in MAME.
-
-`strtok` / `strtok_r` live in their own TU at `-O2` (with
-`__attribute__((noinline))` on `strtok_r` so the strtok() wrapper
-doesn't duplicate it). Multi-call strtok over "a,b,,c" works
-end-to-end in smoke. The layout-sensitive miscompile that
-previously haunted strtok_r's inner CMP loop has been fixed by
-modelling `Uses=[P]` on the conditional branches (the LICM/sink
-interaction that elided "redundant" CMPs no longer fires); no
-surgical workaround flags needed.
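The multi-call case mentioned in the paragraph above can be reproduced on a host with the standard `strtok`. The sketch below is an assumption about what such a check might look like (the helper `countTokens` is hypothetical and is not the smoke test itself). It relies on the standard behaviour that runs of consecutive delimiters are skipped, so `"a,b,,c"` yields three tokens rather than four.

```c
#include <string.h>

/* Counts tokens in `text` split on any character in `delims`.
 * strtok modifies its input, so the text is copied into a local
 * buffer first; inputs longer than 63 characters are truncated. */
int countTokens(const char *text, const char *delims) {
    char buf[64];
    int count = 0;
    strncpy(buf, text, sizeof buf - 1);
    buf[sizeof buf - 1] = '\0';
    for (char *tok = strtok(buf, delims); tok != NULL; tok = strtok(NULL, delims)) {
        count++;
    }
    return count;
}
```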
-
-A small **RPN calculator** test (smoke #87) chains strtok, atol,
-push/pop over a static stack, snprintf "%ld", and strcmp to verify
-the end-to-end composition under a realistic-ish workload — adds,
-subs, muls, divs, and 3-deep operand stacks all work.
-
-**setjmp / longjmp** (smoke #88) now work end-to-end: setjmp saves
-SP / 24-bit ret addr / DP, longjmp restores them and returns the
-val argument as setjmp's "second return". Required two fixes:
-(a) the W65816 assembler had no instruction definition for
-`(dp)` / `(dp), y` / `(dp, x)` indirect addressing modes, so the
-mnemonic forms silently fell through to absolute-,Y opcodes —
-fixed in `src/llvm/lib/Target/W65816/W65816InstrFormats.td` +
-`W65816InstrInfo.td` + `AsmParser/W65816AsmParser.cpp` (the runtime
-.byte hacks have been replaced with mnemonics); (b) added
-`__attribute__((returns_twice))` to the setjmp declaration so the
-optimizer doesn't constant-fold post-setjmp env reads to 0.
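The "second return" behaviour can be seen in a few lines of standard C. This is a generic host-side sketch, not the runtime's smoke test; the function name `secondReturn` is hypothetical. It places `setjmp` directly in a `switch` controlling expression, which is one of the contexts the C standard permits.

```c
#include <setjmp.h>

static jmp_buf env;

/* Calls longjmp with `val` and reports what setjmp returned the second
 * time around: 1 or 2 when it matches, -1 for any other value. */
int secondReturn(int val) {
    switch (setjmp(env)) {
    case 0:
        longjmp(env, val);   /* never returns; control resumes at setjmp */
    case 1:
        return 1;
    case 2:
        return 2;
    default:
        return -1;
    }
}
```

Passing 0 to `longjmp` is a special case: the standard requires `setjmp` to return 1 instead, so `secondReturn(0)` reports 1.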
-
-**CRC32** (smoke #89) verifies the standard "123456789" → 0xCBF43926
-end-to-end — exercises uint32_t shifts, XORs, char-by-char loops.
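For reference, the check value quoted above comes from the common reflected CRC-32 (polynomial 0xEDB88320, initial value and final XOR of 0xFFFFFFFF). The bitwise version below is a minimal sketch of that algorithm; the function name `crc32Bitwise` is an assumption, and the smoke test's own implementation may differ (for example, it could be table-driven).

```c
#include <stdint.h>
#include <stddef.h>

/* Bitwise CRC-32: processes one input byte at a time, shifting the
 * register right and folding in the reflected polynomial whenever the
 * low bit is set. */
uint32_t crc32Bitwise(const unsigned char *data, size_t len) {
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1u) {
                crc = (crc >> 1) ^ 0xEDB88320u;
            } else {
                crc >>= 1;
            }
        }
    }
    return crc ^ 0xFFFFFFFFu;
}
```

Calling it on the nine ASCII bytes of `"123456789"` yields 0xCBF43926, the same value the smoke test checks.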
-
-**Brainfuck interpreter** (smoke #90) executes a small bf program
-and verifies the output bytes — exercises loop bracket matching,
-pointer math (data pointer), branching on cell value.
-
-**Recursive-descent expression parser** (smoke #92) evaluates
-"3+4", "2*3+4", "2+3*4", "(3+4)*5", "100/4-5*2+1" with proper
-operator precedence and parentheses — exercises mutual recursion,
-char-by-char tokenization, and integer arithmetic in concert.
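A parser of the shape described above fits in well under a hundred lines. The sketch below is an illustrative reconstruction, not the smoke test's source: the grammar, the global `cursor`, and the function names are assumptions. It has no error reporting, and division by zero is simply skipped.

```c
#include <ctype.h>

/* Grammar (integers only):
 *   expr   := term   (('+' | '-') term)*
 *   term   := factor (('*' | '/') factor)*
 *   factor := NUMBER | '(' expr ')'
 */
static const char *cursor;

long parseExpr(void);

static void skipSpaces(void) {
    while (*cursor == ' ') cursor++;
}

long parseFactor(void) {
    skipSpaces();
    if (*cursor == '(') {
        cursor++;                       /* consume '(' */
        long inner = parseExpr();
        skipSpaces();
        if (*cursor == ')') cursor++;   /* consume ')' when present */
        return inner;
    }
    long value = 0;
    while (isdigit((unsigned char)*cursor)) {
        value = value * 10 + (*cursor - '0');
        cursor++;
    }
    return value;
}

long parseTerm(void) {
    long value = parseFactor();
    for (;;) {
        skipSpaces();
        char op = *cursor;
        if (op != '*' && op != '/') return value;
        cursor++;
        long rhs = parseFactor();
        if (op == '*') value *= rhs;
        else if (rhs != 0) value /= rhs;  /* division by zero is ignored here */
    }
}

long parseExpr(void) {
    long value = parseTerm();
    for (;;) {
        skipSpaces();
        char op = *cursor;
        if (op != '+' && op != '-') return value;
        cursor++;
        long rhs = parseTerm();
        value = (op == '+') ? value + rhs : value - rhs;
    }
}

/* Entry point: evaluates a whole expression string. */
long evaluate(const char *text) {
    cursor = text;
    return parseExpr();
}
```

Precedence falls out of the call structure: `parseExpr` only sees complete terms, so `2+3*4` multiplies before it adds.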
-
-The **DWARF sidecar** (`link816 --debug-out FILE`) now applies
-text/rodata/bss/init_array relocations to every `.debug_*` section
-before writing it. PC values in `.debug_addr` and `.debug_line` end
-up as final-image addresses, so a consumer can map back to source
-lines without re-running the linker. Intra-debug references (e.g.
-`.debug_info` -> `.debug_str` offsets) are intentionally left
-object-local — sections are concatenated, not recompacted, and each
-slice carries an `; OBJ ... SEC ... SIZE ...` header so a multi-TU
-consumer can scope intra-debug offsets per-slice. The smoke test
-verifies the address of a known function appears in the patched
-sidecar bytes.
-
-## Known issues / workarounds
-
-- **(d,s),y / (sr,s),y addressing wraps the bank** when Y is
-negative as 16-bit unsigned. Worked around by `W65816NegYIndY`
-rewriting the affected ops to `TAX ; LDA/STA $0000,X`. Stays
-correct for negative offsets like `arr[i-1]`.
-
-- **Pointer-deref bank policy is now split-by-syntax** (FIXED):
-`*p` (where `p` is a runtime pointer / local-or-arg vreg) lowers
-via `LDAptr / STAptr / STBptr` to `[$E0],Y` indirect-LONG with
-the bank byte at `$E2` forced to 0 — DBR-independent. The
-`*(volatile uint16 *)0x5000 = v` MMIO idiom (const-int pointer)
-is matched by a separate TableGen pattern that lowers straight
-to `STAabs` (DBR-relative) so the smoke tests' bank-2 write
-path still works. Two tracked issues this resolved:
-(a) PHI-elim was eliding the inserter's `COPY $a = ptr_vreg`
-when the loop body had multiple Acc16 PHIs competing for A —
-the inserter now spills the pointer to a fresh stack slot and
-reloads via LDAfi to keep RA honest; sumTable now correct.
-(b) pointer staging through `[$E0]` is bank-0 only, so
-switchToBank2 + helper-with-local-ptr no longer corrupts data
-in the wrong bank. See `feedback_dbr_ptr_deref_spill.md`.
-
 - **Greedy regalloc fails on long-arg call chains** — a function
 that strings ~7+ independent `helper(longArg1, longArg2)` calls
-overflows greedy at -O1+ ("ran out of registers during register
-allocation"). Same root issue as softDouble's old -O2 hold-out.
-Threshold raised somewhat by expanding IMG slots from 8 to 16
-(now backed by DP $C0..$DE) — most "normal-looking" mixed-arity
-workloads now compile, but pathological pressure (many i32+ args
-+ bitmask SETCC chain) still fails. Workarounds (in order of
-preference): mark the heaviest helper `__attribute__((noinline))`
-to reduce caller pressure; `-mllvm -regalloc=fast` for that TU;
-or `__attribute__((optnone))` on the affected function. A proper
-fix needs either a custom greedy→fast fallback in
-`W65816TargetMachine::createTargetRegisterAllocator` or a smarter
-spill-placement pre-RA pass.
+overflows greedy at -O1+ with "ran out of registers during
+register allocation". IMG slot expansion (8→16) raised the
+threshold; most "normal-looking" mixed-arity workloads now
+compile, but pathological pressure (many i32+ args + bitmask
+SETCC chain in one function) still fails. Workarounds: mark
+the heaviest helper `__attribute__((noinline))`; or
+`-mllvm -regalloc=fast` for that TU; or `__attribute__((optnone))`
+on the affected function. Proper fix needs either a custom
+greedy→fast fallback in
+`W65816TargetMachine::createTargetRegisterAllocator` or a
+smarter spill-placement pre-RA pass.
 
-- **Bank-0 size limit (~48KB)** — the runtime + program must fit in
-$1000-$BFFF (text+rodata) plus $D000-$DFFF (LC1 for rodata-spill
-and BSS). Past that, link816 hard-fails because text would
-cross the IO window. In practice this is rarely hit now that
-link816 has `--gc-sections` (default ON, see Recently Fixed)
-which drops unreachable functions: a minimal program shrinks
-from ~43KB (whole runtime) to ~1.5KB. Programs that genuinely
-use most of the runtime can still hit the limit.
+- **`time()` / `clock()` are stubs** returning 0. ReadTimeHex
+(Misc Tool $0D03) needs the Tool Locator initialised in crt0
+to not crash MAME; the VBL counter at $E1006B needs 24-bit
-
-## Recently fixed
-
-- **DBR pointer-deref RA elision (sumTable miscompile)** —
-`LDAptr / STAptr / STBptr` inserter's first-thing
-`COPY $a = ptr_vreg` was being elided by RA when the loop body
-had multiple Acc16 PHIs competing for A. PHI-elim silently
-dropped the COPY needed to refresh A with the pointer at the
-top of each iteration; sumTable's inner loop did `STA $E0`
-while A held the accumulator. Fix: spill the pointer to a
-fresh stack slot via `STAfi` and reload via `LDAfi` — forces
-RA to materialize the value through real machine ops. See
-`feedback_dbr_ptr_deref_spill.md`.
-
-- **softDouble.c -O2 hold-out** — with the DBR fix in place,
-`dclass` can be `noinline` (its pointer-arg writes go through
-`STBptr / STAptr` which now use `[$E0],Y` indirect-long with
-bank=0). Drops register pressure in `__muldf3 / __divdf3 /
-__adddf3` enough that greedy regalloc no longer runs out. All
-three smoke build sites moved from `-O1` to `-O2`.
-
-- **IMG slot count doubled (8 → 16)** — Img16 / Wide16 register
-classes now hold IMG0..IMG15, backed by DP $C0..$CE + $D0..$DE.
-Reduces greedy regalloc spills for moderately-busy functions.
-Existing `IMG0..IMG7 → $D0..$DE` mapping unchanged so smoke
-tests that assume specific DP carriers (e.g. DPF0 at $F0) still
-work. User app DP is now $00..$BF (was $00..$CF).
-
-- **Real-world smoke coverage added** — Conway's Game of Life
-blinker (2D arrays + neighbour bounds), binary search tree
-(recursive struct + malloc), function-pointer dispatch table
-(indirect-JSL via `__jsl_indir`). Total smoke tests at 107.
-
-- **iigs/toolbox.h expanded** — from 4 stubs to 18+ wrappers
-across Tool Locator, Memory Manager, Misc Tools, QuickDraw II,
-Event Manager, Window Manager, plus GS/OS Quit. Multi-arg
-wrappers live in `runtime/src/iigsToolbox.s` (the backend's
-inline-asm constraints can't take memory operands); single-arg
-ones stay inline.
-
-- **#70 — iterative qsort -O2 miscompile** — `W65816StackSlotCleanup`
-Pass -2 was deleting a store to a slot the loop body read.
-Function-wide `slotHasOtherRefs` safety check added (Pass -1 and
-Pass -2c hardened with the same pattern). Iterative qsort at
-plain -O2 + greedy now compiles correctly; the `optnone` workaround
-in smoke #70 was removed.
-
-- **strtok -O2 layout-sensitive miscompile** — modelling `Uses=[P]`
-on the conditional branches (BEQ/BNE/BCS/BCC/BMI/BPL/BVS/BVC) made
-MachineCSE / scheduler / LICM / sink see the CMP→Bxx flag
-dependency. An entire class of layout-sensitive flag-corruption
-bugs went away; verified by sweeping `--rodata-base` from text-end
-to text-end+300 in 13 increments — every layout returns the correct
-strtok result. As a follow-on, MachineCSE has been re-enabled
-(was previously disabled in `W65816TargetMachine::addMachineSSAOpti
-mization` as a workaround for the same root cause).
-
-- **link816 silently produced 4.3GB binaries** when `--rodata-base`
-was set inside the text region. Now dies with a clear error:
-`--rodata-base 0xX overlaps text 0xY+N (must start at or after 0xZ)`.
-
-- **link816 BSS-relocate landed in IIgs Language Card area** —
-when text+rodata grew past $C000, link816 placed BSS at $D000
-(the LC1 area), where IIgs-by-default maps ROM (writes drop
-silently, reads return ROM bytes). Globals never initialised;
-caught by the expression-parser smoke (#92) when adding rand /
-strnlen / etc. pushed the runtime past that threshold. Two-part
-fix: crt0 now enables LC1 RAM via the standard `lda $C083`
-read-twice trick at startup, and link816 hard-fails (rather
-than silently corrupt) if BSS would exceed the LC1 ceiling
-($E000) — past that you'd need crt0 to also enable LC2 / shadow
-RAM, which we haven't wired up.
-
-- **STZ peephole multi-STA latent miscompile** — AsmPrinter's
-`LDA #0; STA $g` -> `STZ $g` peephole eliminated the LDA but
-only consumed the FIRST `STA`. When SDAG-CSE shared one
-`LDA #0` across multiple `STA`s (`g16=0; g32=0;` is one IR
-shape), trailing `STA`s read whatever was in A on entry —
-silently corrupting any global where A wasn't 0 at function
-entry. Smoke happened to pass because A was 0 by luck in
-every covered path. Fixed by gating the peephole on the
-consuming `STA` killing A (regalloc only sets `killed` on the
-last reader); smoke #98 added to lock the multi-STA case.
-
-- **PEI AsmPrinter peephole** — new: `LDA $dp; PHA` -> `PEI $dp`
-saves 1 byte and avoids touching A. Fires on the
-`copyPhysReg(A=DPF0); PUSH16` pattern (i64-libcall return-value
-forwarding into the next call's stacked args), which appears
-in every chained soft-double / soft-int64 expression. Saves
-68 bytes across the runtime (-64 in math.o alone). Same
-next-instruction-modifies-A safety check as the PEA peephole.
-Smoke #99 added.
-
-- **PEA peephole opcode-allowlist replaced with `modifiesRegister`** —
-the next-after-PUSH16 check that gates the PEA peephole was a
-hand-curated list of opcodes that obviously redefine A; switched
-to `MachineInstr::modifiesRegister(A, TRI)` which also catches
-implicit-defs (e.g. JSL clobbering A as part of the call ABI).
-Saves a few bytes and is more robust.
-
-- **libgcc.s `lda #0; sta $XX` -> `stz $XX`** — 7 sites converted
-in libgcc.s after STZ landed in the assembler. Saves 28 bytes;
-also removes two PHA/PLA save-restore wraps around the LDA #0
-(STZ doesn't touch A, so the wraps are unnecessary).
-
-- **libgcc.s `lda dp; pha` -> `pei dp`** — 2 sites in __divhi3 /
-__modhi3 where the loaded A is dead after the push. PEI
-doesn't touch A, saves 1 byte each.
-
-- **W65816StackSlotCleanup Pass 1c skip-list extended** — added
-STAabs / STA8abs / STAptr / STBptr / STAptrOff / STBptrOff and
-ADJCALLSTACKDOWN to the A-transparent list. Lets the redundant-
-CMP-after-A-modifier elimination see through more pseudo
-stores and the call-stack-down pseudo. Saves 8 bytes in math.o.
-(ADJCALLSTACKUP is NOT transparent — when PEI doesn't process
-it, AsmPrinter emits a TSC/CLC/ADC/TCS that clobbers A.)
-
-- **crt0.s `lda #0; sta` -> `stz`** — IRQ-disable block and the
-BSS-zero loop both used `.byte 0xa9, 0x00 ; sta` raw-byte
-workarounds for `lda #0` (the assembler emits a 16-bit immediate
-in M=8, mis-encoding it). `stz` works in M=8 (stores 1 byte) and
-doesn't touch A — both `.byte` workarounds removed; saves 4 bytes
-in crt0.o.
-
-- **Runtime correctness pass — five real bugs fixed:**
-- `free()` coalesce: when a freed block was absorbed into a
-lower-address neighbour (`bEnd == a` path), the absorbed entry
-was left in the free list overlapping the extended one. A
-follow-on malloc could hand out the same memory to two
-callers. Fix: track outer-loop predecessor and excise the
-absorbed entry. Smoke #100 added.
-- `sqrt(-0.0)` returned NaN; should return -0.0 per IEEE-754.
-The sign-bit check fired before the zero check. Fix: mask
-sign bit when testing for zero.
-- `log(0)` returned NaN; should return -Infinity (pole error).
-Same sign-bit-vs-zero ordering issue; both ±0 now return
-`-1.0/0.0`.
-- `snprintf(buf, 0, ...)` wrote `'\0'` to `buf[-1]` (one byte
-BEFORE the buffer). C99 says n=0 must not touch the buffer.
-Fix: set `gEnd = NULL` for n=0 so neither the normal nor the
-truncation NUL-write path fires. Smoke #76 extended.
-- `malloc(>~32KB)` and `calloc(n, m)` had silent integer overflow
-on size_t (16-bit), wrapping to small values and handing out
-tiny allocations claiming huge sizes. Bumped malloc to bail
-above 0x7FF0 (heap is at most ~32KB anyway) and made calloc
-overflow-check before multiplying.
|
|
||||||
|
|
||||||
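The `calloc` overflow check described in the list above can be sketched as follows. This is a minimal illustration, not the runtime's code: it models the target's 16-bit `size_t` with `uint16_t`, and the helper name `mulWouldOverflow16` is hypothetical.

```c
#include <stdint.h>

// Hypothetical helper (not from the runtime): models a 16-bit size_t
// with uint16_t. Returns 1 when n * m does not fit in 16 bits, else 0.
int mulWouldOverflow16(uint16_t n, uint16_t m) {
    if (m == 0) return 0;                 // zero-sized request cannot overflow
    // n * m > 0xFFFF  <=>  n > 0xFFFF / m  (integer division, m != 0)
    return n > (uint16_t)(0xFFFFu / m);
}
```

Doing the comparison with a division keeps every intermediate value inside 16 bits, so the check itself cannot wrap.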
- **Removed** dead `runtime/src/softDouble.s` (a stub from before
  `softDouble.c` was implemented; the build script doesn't reference
  it but it was confusing to leave around).

- **inttypes.h PRId64 / PRIu64 / PRIx64** documented as
  unsupported in the runtime's printf — the macros expand to
  `"lld"`/`"llu"`/`"llx"` but the formatter only knows the `l`
  length modifier, not `ll`, so the format prints literally and
  the va_list misaligns. Use `PRId32` etc. for now.

- **More runtime fixes (round 2):**
  - `fputs(s, stream)` was forwarding to `puts(s)`, which appends a
    newline. C says fputs MUST NOT add one. Direct char-by-char
    write now.
  - `exit(code)` never invoked the registered `atexit` handler.
    C99 7.20.4.3 requires it. Now runs the single-slot handler
    (with re-entry guard) before the BRK.
  - `printf("%f", -0.0)` printed `0.000000` instead of `-0.000000`
    because `if (v < 0)` (a `__ltdf2` call) returns false for
    negative zero. Switched to the IEEE-754 sign-bit test that
    snprintf already uses.
  - `vfprintf` was missing entirely (neither declared in stdio.h
    nor implemented). Added a thin wrapper around vprintf.

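The negative-zero case above is easy to reproduce on a host compiler. The sketch below assumes `double` is a 64-bit IEEE-754 value; the function name `doubleSignBit` is hypothetical and is not the runtime's helper.

```c
#include <stdint.h>
#include <string.h>

// Hypothetical helper: returns 1 if the IEEE-754 sign bit of v is set
// (this includes -0.0), else 0. Assumes a 64-bit binary64 double.
int doubleSignBit(double v) {
    uint64_t bits;
    memcpy(&bits, &v, sizeof bits);   // copy out the bit pattern without aliasing issues
    return (int)(bits >> 63);         // bit 63 is the sign bit
}
```

A relational test such as `v < 0` cannot distinguish the two zeros because IEEE-754 defines `-0.0 == 0.0`; only the bit pattern carries the sign.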
- **link816 weak-symbol resolution:** the linker previously used
  "last def wins" with no regard for STB_GLOBAL vs STB_WEAK. When
  a user provided a strong override of a weak libc stub (e.g.
  `putchar`), it worked only by link-order luck — reversing the
  order let the weak stub silently overwrite the strong def.
  The linker now resolves these properly: strong beats weak (in
  any order), strong + strong is an error, and weak + weak picks
  the first. Smoke #100 added.

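The precedence rules stated above fit in a small decision function. This is only a sketch of the rule table; the type and function names are hypothetical and do not come from link816.

```c
// Hypothetical names; only the precedence rules come from the note above.
typedef enum { BIND_WEAK, BIND_STRONG } Binding;
typedef enum { KEEP_EXISTING, TAKE_INCOMING, DUPLICATE_ERROR } Resolution;

// Decide what to do when a second definition of a symbol is seen.
Resolution resolveSymbol(Binding existing, Binding incoming) {
    if (existing == BIND_STRONG && incoming == BIND_STRONG)
        return DUPLICATE_ERROR;        // two strong defs: link error
    if (existing == BIND_WEAK && incoming == BIND_STRONG)
        return TAKE_INCOMING;          // strong overrides an earlier weak def
    return KEEP_EXISTING;              // strong stays over weak; first weak wins
}
```

Because the result depends only on the two bindings, the outcome is the same whichever object file is seen first.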
- **More runtime fixes (round 3):**
  - `writeHex` / `emitHex` had a stack buffer overrun
    (`char buf[5]` but `printf("%08x", ...)` would write 8 bytes).
    On 16-bit `unsigned int`, max useful width is 4 — buf shrunk
    to 4 and width is now capped.
  - `writeDec` / `writeSignedLong` / `emitDec` / `emitSignedLong`
    used `-n` on signed input, which overflows for INT_MIN /
    LONG_MIN (UB). All four switched to unsigned-negation
    (`0u - (unsigned)n`) for correctness and to keep an
    optimizer-aware compiler from exploiting the UB.
  - `atoi` / `atol` / `strtol` / `strtoul` likewise built the
    parsed magnitude in a signed accumulator and negated at the
    end — same UB on the boundary value. All switched to
    unsigned magnitude + unsigned-negation cast.
  - `link816 parseInt` / `omfEmit parseInt` silently truncated
    addresses > 24 bits to `uint32_t` low bits — `--text-base
    0x100000000` would silently wrap to 0. Both now reject
    out-of-range addresses with a clear error.

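The unsigned-negation idiom mentioned above (`0u - (unsigned)n`) can be shown in isolation. In this sketch `int16_t` stands in for the target's 16-bit `int`, and the name `magnitude16` is hypothetical.

```c
#include <stdint.h>

// Hypothetical helper: magnitude of a 16-bit signed value without ever
// negating a signed integer, so INT16_MIN is handled without UB.
uint16_t magnitude16(int16_t n) {
    uint16_t u = (uint16_t)n;                 // modular conversion, always defined
    return (n < 0) ? (uint16_t)(0u - u) : u;  // unsigned subtraction wraps, no UB
}
```

For `INT16_MIN` the result is 32768, which fits in the unsigned type even though it has no signed counterpart.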
- **More runtime fixes (round 4):**
  - `pow(x, y)` computed `n = -n` for the integer-y branch when
    yi was INT_MIN (-32768); same signed-overflow UB pattern as
    the print functions. Switched to unsigned magnitude.
  - Added `perror(prefix)` — was missing from the runtime; common
    pattern in portable code that reports I/O failure via
    `errno + strerror`. Declared in stdio.h, implemented as
    char-by-char emit through putchar (no fprintf dependency).

- **link816 `__heap_end` was hardcoded at $BF00**, ignoring where
  `__heap_start` actually ended up. When BSS got auto-relocated
  into LC1 ($D000+), heap_start ended up > heap_end and malloc
  immediately returned NULL on every call — silently bricking any
  program that allocated dynamic memory after the runtime grew
  past the default-bss threshold. Heap_end now picks
  $BF00 / $E000 based on where heap_start lands (and skips the IO
  window if heap_start would have landed in $C000-$CFFF).
  Smoke #102 added.

- **link816 rodata auto-skips IIgs IO window** ($C000-$CFFF). When
  text+rodata grew past 0xC000 the rodata bytes were silently
  corrupted at runtime — string literals in the IO range read back as
  hardware register values, breaking strcmp / strstr / printf / etc.
  Now: rodata that would land in or cross $C000-$CFFF auto-skips
  to $D000. Init_array gets the same treatment. Text that would
  cross IO is hard-rejected at link time (no auto-fix possible —
  PC fetches in IO would read hardware registers). This was the
  root cause of the "tan/tanf triggers layout-sensitive failure"
  symptom listed in older STATUS notes.

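The rodata placement rule above ("land in or cross $C000-$CFFF, then skip to $D000") is an interval-overlap test. A minimal sketch follows; the helper name `placeRodata` and the constant names are hypothetical, and only the three addresses come from the note.

```c
#include <stdint.h>

#define IO_START 0xC000u   // first byte of the IIgs IO window
#define IO_END   0xCFFFu   // last byte of the IO window
#define LC_START 0xD000u   // first address after the window

// Hypothetical helper: given the proposed base and size of rodata,
// return the base to use. Anything touching the IO window moves to $D000.
uint32_t placeRodata(uint32_t base, uint32_t size) {
    if (size == 0) return base;                 // nothing to place
    uint32_t last = base + size - 1;            // address of the final byte
    int overlaps = (base <= IO_END) && (last >= IO_START);
    return overlaps ? LC_START : base;
}
```

The same test would flag text that crosses the window; per the note, that case is rejected rather than relocated.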
- **runInMame skips writes to IO window** during the binary load.
  Without this, the zero-padding in the rodata-skip gap would
  clobber soft switches (e.g. the LC1 RAM enable that crt0 sets
  via $C083) when the loader naively wrote the entire image
  byte-by-byte to memory.

- **link816 `--gc-sections` (default ON)** — discards sections not
  reachable from the entry point (`__start` / `_start` / `main`
  for the canonical crt0 setup) plus all `.init_array` sections.
  Built on `-ffunction-sections` so each function is in its own
  section. A minimal program with full runtime linked shrinks
  from ~43KB to ~1.5KB. Adding `tan/tanf` to math.c (which
  caused the latent layout-sensitive failure described above)
  no longer pushes any test past the bank-0 limit. Tests that
  intentionally check unreachable symbols pass `--no-gc-sections`
  to opt out.

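Section garbage collection of the kind described above is a reachability walk over the section reference graph. The sketch below uses a hypothetical adjacency-matrix model (`SectionGraph`, `markLive`); link816's real data structures are not shown in this diff.

```c
#include <stddef.h>

#define MAX_SECTIONS 64

// Hypothetical model: refs[i][j] != 0 means section i has a relocation
// that targets section j. count must not exceed MAX_SECTIONS.
typedef struct {
    size_t count;
    unsigned char refs[MAX_SECTIONS][MAX_SECTIONS];
    unsigned char live[MAX_SECTIONS];
} SectionGraph;

// Mark every section reachable from the roots (entry point sections,
// .init_array sections). Sections left with live == 0 can be discarded.
void markLive(SectionGraph *g, const size_t *roots, size_t nroots) {
    size_t stack[MAX_SECTIONS];   // each section is pushed at most once
    size_t top = 0;
    for (size_t i = 0; i < g->count; i++) g->live[i] = 0;
    for (size_t i = 0; i < nroots; i++) {
        if (roots[i] < g->count && !g->live[roots[i]]) {
            g->live[roots[i]] = 1;
            stack[top++] = roots[i];
        }
    }
    while (top > 0) {
        size_t s = stack[--top];
        for (size_t t = 0; t < g->count; t++) {
            if (g->refs[s][t] && !g->live[t]) {
                g->live[t] = 1;   // mark before pushing to avoid duplicates
                stack[top++] = t;
            }
        }
    }
}
```

With one function per section (`-ffunction-sections`), an unreferenced function is simply a section that is never marked.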
- **`fwrite(..., stdout)` was a stub returning 0** even though
  `stdout` has a working `putchar` route. Now actually writes
  through `putchar` for stdout/stderr (only). Also gained the
  same `size * nmemb` overflow guard as `calloc`.

## What's still needed for a "ship-ready" toolchain

- **softDouble.c -O2 — FIXED.** Marking `dclass` noinline (in
  addition to `dpack`) drops register pressure in `__muldf3`/
  `__divdf3`/`__adddf3` enough that greedy regalloc no longer
  runs out. The previous blocker was that noinline-dclass would
  write through pointer args via the DBR-relative `(d,s),y` mode
  and corrupt caller data after a bank switch — that path now
  goes through `STAptr/STBptr` which use `[$E0],Y` indirect-long
  with the bank byte forced to 0, so DBR is irrelevant. All
  three smoke build sites moved to `-O2`.


- **More of the C standard library**: real `<stdio.h>` file I/O
  (`fopen`, `fread`, `fwrite`, `fseek` are currently stubs
  returning success/zero) — would need a memory-backed FS or a
  MAME hook. `<locale.h>` / `<signal.h>` / `<time.h>` are stubbed
  (compile and return safe defaults). `<wchar.h>` mostly absent.
  A `time()` impl wired to ReadTimeHex (Misc Tool $0D03) was
  attempted but crashes MAME without the Tool Locator initialised
  in crt0; `clock()` via VBL counter at $E1006B needs 24-bit
  far-pointer support that the backend doesn't yet model.

Removed lines:

- **C++ runtime support**: vtable layout for multiple inheritance,
  RTTI, exceptions (or a documented `-fno-exceptions` requirement).

- **REP/SEP scheduling pass** (design doc §3.3): the current
  prologue picks one M-mode for the whole function based on
  whether any 8-bit accumulator value is used. A per-region
  scheduler would reduce the SEP/REP wrap overhead on i8 stores.

- **Toolbox / IIgs system call bindings**: `iigs/toolbox.h` covers
  the common entry points across Tool Locator, Memory Manager,
  Misc Tools, QuickDraw II, Event Manager, Window Manager, plus
  GS/OS Quit. Multi-arg wrappers (NewHandle, QDStartUp, MoveTo,
  EMStartUp, GetNextEvent, NewWindow, CloseWindow) live in
  `runtime/src/iigsToolbox.s` because the backend's inline-asm
  constraints can't take memory operands. Single-arg / no-arg
  wrappers stay inline. More routines (Menu Manager, Dialog
  Manager, Standard File, Sound) still TBD.

- **Real-world program coverage**: the smoke tests are
  microbenchmarks. A few known-good Apple IIgs C programs (e.g.
  a textfile pager, a small game) compiled and run end-to-end
  would catch issues no synthetic test currently exercises.

- **Cycle-time / size benchmarks vs Calypsi 5.16**: design doc §1
  says the goal is to "match or exceed" Calypsi. We have neither
  baseline numbers nor a comparison harness yet.

Added lines:

- **`(d,s),y / (sr,s),y` addressing wraps the bank** when Y is
  negative as 16-bit unsigned. Worked around by `W65816NegYIndY`
  rewriting the affected ops to `TAX ; LDA/STA $0000,X`. The
  workaround stays correct for negative offsets like `arr[i-1]`
  but the underlying issue is unfixed at the addressing-mode
  level.

- **Bank-0 size limit (~48KB)** — the runtime + program must fit
  in $1000-$BFFF (text+rodata) plus $D000-$DFFF (LC1 for rodata-
  spill and BSS). Past that, link816 hard-fails because text
  would cross the IO window. In practice rarely hit thanks to
  `--gc-sections`, but programs that genuinely use most of the
  runtime can still trip it. Future work: enable LC2 / shadow
  RAM via crt0 to add ~16KB more.

## Yet to come

- **GS/OS-backed `<stdio.h>` file I/O** — current FS is
  memory-backed (programs `mfsRegister` buffers as files). A
  GS/OS backend would let programs see the real ProDOS volume
  during MAME execution, but needs Tool Locator init in crt0
  and a class-1 parm-block dispatch wrapper around $E100A8.

- **C++ exceptions / RTTI / multiple inheritance with virtual
  bases** — only the `-fno-exceptions -fno-rtti` subset is
  supported. `__cxa_throw` etc. would need an unwind ABI on
  this target plus a personality routine.

- **Close the size gap to Calypsi** — `scripts/bench.sh`
  shows clang at ~2.2x Calypsi text size on the included
  microbenchmarks, with sumOfSquares as the worst case (6.45x)
  due to __mulsi3 dispatch overhead. Targeted improvements:
  inline 16x16->32 multiply for small operands; widen the
  IMG slot heuristic so greedy uses them more aggressively;
  cycle-time benchmark harness (separate from size).

- **Larger/real-world end-to-end programs** — current real-world
  smoke (Game of Life, BST, dispatch, hex dumper, JSON tokenizer)
  exercises core idioms. A multi-thousand-line program (e.g.
  a small interactive shell, a text editor command loop) would
  catch issues no smaller test reaches.

51
bench_simple.s
Normal file
@@ -0,0 +1,51 @@
; Generated by Calypsi ISO C compiler for 65816

              .rtmodel version,"1"
              .rtmodel codeModel,"large"
              .rtmodel dataModel,"small"
              .rtmodel core,"65816"
              .rtmodel huge,"0"
              .rtmodel target,"none-specified"
              .extern _Dp
              .extern _Mul16
              .extern _Vfp
; unsigned long sumOfSquares(unsigned short n) {
              .section farcode,text
              .public sumOfSquares
sumOfSquares:
              phy
              phy
              sta 1,s
; unsigned long total = 0;
              stz dp:.tiny _Dp
              stz dp:.tiny (_Dp+2)
; for (unsigned short i = 1; i <= n; i++) total += (unsigned long)i * i;
              lda ##1
              sta 3,s
`?L5`:        lda 1,s
              cmp 3,s
              bcs `?L4`
; return total;
              ldx dp:.tiny (_Dp+2)
              lda dp:.tiny _Dp
; }
              ply
              ply
              rtl
`?L4`:        lda 3,s
              tax
              jsl long:_Mul16
              clc
              adc dp:.tiny _Dp
              pha
              txa
              adc dp:.tiny (_Dp+2)
              tax
              pla
              stx dp:.tiny (_Dp+2)
              sta dp:.tiny _Dp
              lda 3,s
              inc a
              sta 3,s
              bra `?L5`

10
benchmarks/bsearch.c
Normal file
@@ -0,0 +1,10 @@
int bsearch(const int *arr, int n, int key) {
    int lo = 0, hi = n - 1;
    while (lo <= hi) {
        int mid = (lo + hi) / 2;
        if (arr[mid] == key) return mid;
        if (arr[mid] < key) lo = mid + 1;
        else hi = mid - 1;
    }
    return -1;
}
10
benchmarks/crc32.c
Normal file
@@ -0,0 +1,10 @@
unsigned long crc32(const unsigned char *p, unsigned int n) {
    unsigned long crc = 0xFFFFFFFFUL;
    while (n--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++) {
            crc = (crc >> 1) ^ (0xEDB88320UL & -(long)(crc & 1));
        }
    }
    return crc ^ 0xFFFFFFFFUL;
}
7
benchmarks/dotProduct.c
Normal file
@@ -0,0 +1,7 @@
long dotProduct(const short *a, const short *b, unsigned int n) {
    long sum = 0;
    for (unsigned int i = 0; i < n; i++) {
        sum += (long)a[i] * (long)b[i];
    }
    return sum;
}
4
benchmarks/fib.c
Normal file
@@ -0,0 +1,4 @@
unsigned short fib(unsigned short n) {
    if (n < 2) return n;
    return fib(n - 1) + fib(n - 2);
}
10
benchmarks/memcmp.c
Normal file
@@ -0,0 +1,10 @@
typedef unsigned char u8;
int mymemcmp(const void *a, const void *b, unsigned int n) {
    const u8 *p = (const u8 *)a;
    const u8 *q = (const u8 *)b;
    while (n--) {
        if (*p != *q) return *p - *q;
        p++; q++;
    }
    return 0;
}
5
benchmarks/popcount.c
Normal file
@@ -0,0 +1,5 @@
int popcount(unsigned long x) {
    int n = 0;
    while (x) { n += x & 1; x >>= 1; }
    return n;
}
5
benchmarks/strcpy.c
Normal file
@@ -0,0 +1,5 @@
char *mystrcpy(char *dst, const char *src) {
    char *d = dst;
    while ((*d++ = *src++)) {}
    return dst;
}
5
benchmarks/sumOfSquares.c
Normal file
@@ -0,0 +1,5 @@
unsigned long sumOfSquares(unsigned short n) {
    unsigned long total = 0;
    for (unsigned short i = 1; i <= n; i++) total += (unsigned long)i * i;
    return total;
}
File diff suppressed because it is too large
|
|
@ -41,12 +41,19 @@ void clearerr(FILE *stream);
|
||||||
|
|
||||||
#define EOF (-1)
|
#define EOF (-1)
|
||||||
|
|
||||||
// Input stubs. Real implementations would route through GS/OS
|
|
||||||
// console I/O; current impl in libc.c returns EOF / 0.
|
|
||||||
int getchar(void);
|
int getchar(void);
|
||||||
int fgetc(FILE *stream);
|
int fgetc(FILE *stream);
|
||||||
char *fgets(char *buf, int n, FILE *stream);
|
char *fgets(char *buf, int n, FILE *stream);
|
||||||
int ungetc(int c, FILE *stream);
|
int ungetc(int c, FILE *stream);
|
||||||
#define getc(s) fgetc(s)
|
#define getc(s) fgetc(s)
|
||||||
|
|
||||||
|
// Memory-backed FS: register a memory region as a named file so
|
||||||
|
// fopen can open it. `cap` should be >= size; use cap > size for
|
||||||
|
// files that may grow on write. `writable` controls whether
|
||||||
|
// fopen("...", "w") / "a" / "r+" succeeds. Returns 0 on success,
|
||||||
|
// -1 on duplicate name or table full.
|
||||||
|
int mfsRegister(const char *path, void *buf, size_t size, size_t cap,
|
||||||
|
int writable);
|
||||||
|
int mfsUnregister(const char *path);
|
||||||
|
|
||||||
#endif
|
#endif
|
||||||
|
|
|
||||||
38
runtime/include/wchar.h
Normal file
@@ -0,0 +1,38 @@
// Minimal wchar.h for the W65816 runtime.
//
// wchar_t is 16-bit (matches `int` on this target). No real
// multi-byte / locale support — mbtowc/wctomb assume a one-byte =
// one-wchar mapping (essentially Latin-1). The wcs* functions
// mirror the str* family.

#ifndef _WCHAR_H
#define _WCHAR_H

typedef unsigned short wchar_t;
typedef unsigned int size_t;
typedef long wint_t;

#define WEOF ((wint_t)-1)

#ifndef NULL
#define NULL ((void *)0)
#endif

size_t   wcslen (const wchar_t *s);
int      wcscmp (const wchar_t *a, const wchar_t *b);
int      wcsncmp(const wchar_t *a, const wchar_t *b, size_t n);
wchar_t *wcscpy (wchar_t *dst, const wchar_t *src);
wchar_t *wcsncpy(wchar_t *dst, const wchar_t *src, size_t n);
wchar_t *wcscat (wchar_t *dst, const wchar_t *src);
wchar_t *wcschr (const wchar_t *s, wchar_t c);
wchar_t *wcsrchr(const wchar_t *s, wchar_t c);

// Multi-byte conversion. Trivial 1:1 in our impl: each byte maps
// to the wide char with the same numeric value (zero-extended).
int    mbtowc  (wchar_t *pwc, const char *s, size_t n);
int    wctomb  (char *s, wchar_t wc);
size_t mbstowcs(wchar_t *pwcs, const char *s, size_t n);
size_t wcstombs(char *s, const wchar_t *pwcs, size_t n);
int    mblen   (const char *s, size_t n);

#endif
@@ -176,3 +176,107 @@ size_t strcspn(const char *s, const char *reject) {


// strtok / strtok_r are in runtime/src/strtok.c.

// ---- wchar.h ----
// wchar_t is 16-bit on this target. The wcs* functions mirror the
// str* family. mbtowc / wctomb use the trivial 1:1 byte<->wide-char
// mapping (essentially Latin-1) — no real multi-byte / locale support.

typedef unsigned short wchar_t;

size_t wcslen(const wchar_t *s) {
    size_t n = 0;
    while (*s++) n++;
    return n;
}

int wcscmp(const wchar_t *a, const wchar_t *b) {
    while (*a && *a == *b) { a++; b++; }
    return (int)((short)*a - (short)*b);
}

int wcsncmp(const wchar_t *a, const wchar_t *b, size_t n) {
    while (n && *a && *a == *b) { a++; b++; n--; }
    if (!n) return 0;
    return (int)((short)*a - (short)*b);
}

wchar_t *wcscpy(wchar_t *dst, const wchar_t *src) {
    wchar_t *d = dst;
    while ((*d++ = *src++)) {}
    return dst;
}

wchar_t *wcsncpy(wchar_t *dst, const wchar_t *src, size_t n) {
    wchar_t *d = dst;
    while (n && (*d = *src)) { d++; src++; n--; }
    while (n--) *d++ = 0;
    return dst;
}

wchar_t *wcscat(wchar_t *dst, const wchar_t *src) {
    wchar_t *d = dst;
    while (*d) d++;
    while ((*d++ = *src++)) {}
    return dst;
}

wchar_t *wcschr(const wchar_t *s, wchar_t c) {
    while (*s) {
        if (*s == c) return (wchar_t *)s;
        s++;
    }
    return (c == 0) ? (wchar_t *)s : (wchar_t *)0;
}

wchar_t *wcsrchr(const wchar_t *s, wchar_t c) {
    const wchar_t *last = (const wchar_t *)0;
    while (*s) {
        if (*s == c) last = s;
        s++;
    }
    if (c == 0) return (wchar_t *)s;
    return (wchar_t *)last;
}

int mbtowc(wchar_t *pwc, const char *s, size_t n) {
    if (!s) return 0;  // no shift state
    if (n == 0) return -1;
    unsigned char c = (unsigned char)*s;
    if (pwc) *pwc = (wchar_t)c;
    return c ? 1 : 0;
}

int wctomb(char *s, wchar_t wc) {
    if (!s) return 0;  // no shift state
    if (wc > 0xFF) return -1;
    *s = (char)wc;
    return 1;
}

size_t mbstowcs(wchar_t *pwcs, const char *s, size_t n) {
    size_t i = 0;
    while (i < n && s[i]) {
        if (pwcs) pwcs[i] = (wchar_t)(unsigned char)s[i];
        i++;
    }
    if (pwcs && i < n) pwcs[i] = 0;
    return i;
}

size_t wcstombs(char *s, const wchar_t *pwcs, size_t n) {
    size_t i = 0;
    while (i < n && pwcs[i]) {
        if (pwcs[i] > 0xFF) return (size_t)-1;
        if (s) s[i] = (char)pwcs[i];
        i++;
    }
    if (s && i < n) s[i] = 0;
    return i;
}

int mblen(const char *s, size_t n) {
    if (!s) return 0;
    if (n == 0) return -1;
    return *s ? 1 : 0;
}
File diff suppressed because it is too large
|
|
@ -167,18 +167,11 @@ int puts(const char *s) {
|
||||||
|
|
||||||
// ---- input stubs ----
|
// ---- input stubs ----
|
||||||
//
|
//
|
||||||
// Real input would route through GS/OS console / event handling.
|
// getchar reads from the keyboard; real input would route through
|
||||||
// These return EOF / NULL so user code that calls them links and
|
// the IIgs Event Manager. Returns -1 (EOF) for now. fgetc/fgets/
|
||||||
// gets predictable end-of-input behaviour. FILE struct is defined
|
// ungetc are defined further down alongside the FILE-table-backed
|
||||||
// further down (alongside fopen etc.) — forward-declare for the
|
// fopen/fread/etc.
|
||||||
// signatures.
|
|
||||||
struct __sFILE;
|
|
||||||
int getchar(void) { return -1; /* EOF */ }
|
int getchar(void) { return -1; /* EOF */ }
|
||||||
int fgetc(struct __sFILE *s) { (void)s; return -1; }
|
|
||||||
char *fgets(char *b, int n, struct __sFILE *s) {
|
|
||||||
(void)b; (void)n; (void)s; return (char *)0;
|
|
||||||
}
|
|
||||||
int ungetc(int c, struct __sFILE *s) { (void)c; (void)s; return -1; }
|
|
||||||
|
|
||||||
// ---- minimal printf ----
|
// ---- minimal printf ----
|
||||||
|
|
||||||
|
|
@ -621,47 +614,191 @@ clock_t clock(void) {
|
||||||
return (clock_t)0;
|
return (clock_t)0;
|
||||||
}
|
}
|
||||||
|
|
||||||
// ---- FILE* abstraction (minimal) ----
|
// ---- FILE* abstraction (memory-backed FS) ----
|
||||||
//
|
//
|
||||||
// stdin / stdout / stderr exist as opaque non-NULL pointers. fputs /
|
// stdin / stdout / stderr are tagged as kind=STDIO and route through
|
||||||
// fputc forward to puts/putchar (which currently no-op or hit a debug
|
// putchar / fgetc-from-keyboard; opening a regular file allocates a
|
||||||
// hook). fprintf forwards to printf, ignoring the stream. fflush is
|
// FILE slot and keeps a (buf, size, pos, writable) record. Programs
|
||||||
// a no-op. Real file I/O via GS/OS toolbox is a separate feature
|
// stage files into the FS at startup via mfsRegister(name, ptr, size,
|
||||||
// (would need open/read/write/close + a file-descriptor table).
|
// cap, writable) and then use the standard fopen/fread/fwrite/fseek API.
|
||||||
|
//
|
||||||
|
// Why in-memory rather than GS/OS-backed: the smoke harness doesn't
|
||||||
|
// boot ProDOS, so toolbox-FS calls would crash MAME. An in-RAM FS
|
||||||
|
// covers the common need (parser/printer that wants a FILE*) without
|
||||||
|
// pulling in GS/OS init. A future GS/OS backend can replace
|
||||||
|
// fopenImpl/etc. without touching callers.
|
||||||
|
//
|
||||||
|
// FILE-table layout: 8 entries. Slots 0..2 are stdin/stdout/stderr
|
||||||
|
// (immutable); 3..7 are user-allocated by fopen. Each entry has:
|
||||||
|
// kind (0=stdin, 1=stdout, 2=stderr, 3=memory)
|
||||||
|
// buf (memory buffer base)
|
||||||
|
// size (logical size in bytes)
|
||||||
|
// cap (allocated capacity — for write-grow)
|
||||||
|
// pos (current seek position)
|
||||||
|
// eof, err flags
|
||||||
|
// writable (1 if opened for "w" or "r+" or "a")
|
||||||
|
// ungetc holding cell (-1 = empty)
|
||||||
|
|
||||||
typedef struct __sFILE { unsigned int magic; } FILE;
|
#define FILE_KIND_STDIN 0
|
||||||
|
#define FILE_KIND_STDOUT 1
|
||||||
|
#define FILE_KIND_STDERR 2
|
||||||
|
#define FILE_KIND_MEM 3
|
||||||
|
|
||||||
static FILE __stdin_obj = { 1 };
|
typedef struct __sFILE {
|
||||||
static FILE __stdout_obj = { 2 };
|
u8 kind;
|
||||||
static FILE __stderr_obj = { 3 };
|
u8 writable;
|
||||||
FILE *stdin = &__stdin_obj;
|
u8 eof;
|
||||||
FILE *stdout = &__stdout_obj;
|
u8 err;
|
||||||
FILE *stderr = &__stderr_obj;
|
char *buf;
|
||||||
|
size_t size;
|
||||||
|
size_t cap;
|
||||||
|
size_t pos;
|
||||||
|
int unget; // -1 if no pushed-back char
|
||||||
|
const char *path; // borrowed from caller, NULL for stdio
|
||||||
|
} FILE;
|
||||||
|
|
||||||
|
#define MFS_MAX_FILES 8
|
||||||
|
static FILE __mfs[MFS_MAX_FILES] = {
|
||||||
|
{ FILE_KIND_STDIN, 0, 0, 0, 0, 0, 0, 0, -1, 0 },
|
||||||
|
{ FILE_KIND_STDOUT, 1, 0, 0, 0, 0, 0, 0, -1, 0 },
|
||||||
|
{ FILE_KIND_STDERR, 1, 0, 0, 0, 0, 0, 0, -1, 0 },
|
||||||
|
};
|
||||||
|
|
||||||
|
FILE *stdin = &__mfs[0];
|
||||||
|
FILE *stdout = &__mfs[1];
|
||||||
|
FILE *stderr = &__mfs[2];
|
||||||
|
|
||||||
|
// Registered "files" available to fopen. Each registration is
|
||||||
|
// (path, buf, size, cap, writable). Order doesn't matter — fopen scans
|
||||||
|
// linearly.
|
||||||
|
typedef struct {
|
||||||
|
const char *path;
|
||||||
|
char *buf;
|
||||||
|
size_t size;
|
||||||
|
size_t cap;
|
||||||
|
u8 writable;
|
||||||
|
u8 inUse;
|
||||||
|
} MfsEntry;
|
||||||
|
|
||||||
|
#define MFS_MAX_REG 16
|
||||||
|
static MfsEntry __mfsReg[MFS_MAX_REG];
|
||||||
|
|
||||||
|
// Register a memory region as a named file. Returns 0 on success,
|
||||||
|
// -1 if the table is full or a duplicate name exists. `cap` may be
|
||||||
|
// larger than `size` to allow appends without reallocation; pass
|
||||||
|
// cap=size if writes must not grow the file.
|
||||||
|
int mfsRegister(const char *path, void *buf, size_t size, size_t cap,
|
||||||
|
int writable) {
|
||||||
|
if (cap < size) cap = size;
|
||||||
|
for (int i = 0; i < MFS_MAX_REG; i++) {
|
||||||
|
if (__mfsReg[i].inUse && strcmp(__mfsReg[i].path, path) == 0)
|
||||||
|
return -1;
|
||||||
|
}
|
||||||
|
for (int i = 0; i < MFS_MAX_REG; i++) {
|
||||||
|
if (!__mfsReg[i].inUse) {
|
||||||
|
__mfsReg[i].path = path;
|
||||||
|
__mfsReg[i].buf = (char *)buf;
|
||||||
|
__mfsReg[i].size = size;
|
||||||
|
__mfsReg[i].cap = cap;
|
||||||
|
__mfsReg[i].writable = (u8)(writable != 0);
|
||||||
|
__mfsReg[i].inUse = 1;
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return -1;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Drop a registration. Returns 0 on success, -1 if not found.
|
||||||
|
int mfsUnregister(const char *path) {
|
||||||
|
for (int i = 0; i < MFS_MAX_REG; i++) {
|
||||||
|
if (__mfsReg[i].inUse && strcmp(__mfsReg[i].path, path) == 0) {
|
||||||
|
__mfsReg[i].inUse = 0;
|
||||||
|
__mfsReg[i].path = (const char *)0;
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return -1;
|
||||||
|
}
|
||||||
|
|
||||||
|
int fputc(int c, FILE *stream) {
|
||||||
|
if (!stream) return -1;
|
||||||
|
if (stream->kind == FILE_KIND_STDOUT || stream->kind == FILE_KIND_STDERR)
|
||||||
|
return putchar(c);
|
||||||
|
if (stream->kind == FILE_KIND_MEM) {
|
||||||
|
if (!stream->writable) { stream->err = 1; return -1; }
|
||||||
|
if (stream->pos >= stream->cap) { stream->err = 1; return -1; }
|
||||||
|
stream->buf[stream->pos++] = (char)c;
|
||||||
|
if (stream->pos > stream->size) stream->size = stream->pos;
|
||||||
|
return (int)(unsigned char)c;
|
||||||
|
}
|
||||||
|
return -1;
|
||||||
|
}
|
||||||
|
|
||||||
int fputc(int c, FILE *stream) { (void)stream; return putchar(c); }
|
|
||||||
// fputs writes the string WITHOUT appending a newline (puts does append).
|
|
||||||
// Forwarding to puts() was a real bug — `fputs("hi", stdout)` was
|
|
||||||
// printing "hi\n" instead of "hi".
|
|
||||||
int fputs(const char *s, FILE *stream) {
|
int fputs(const char *s, FILE *stream) {
|
||||||
(void)stream;
|
if (!stream || !s) return -1;
|
||||||
|
if (stream->kind == FILE_KIND_STDOUT || stream->kind == FILE_KIND_STDERR) {
|
||||||
while (*s) { putchar(*s); s++; }
|
while (*s) { putchar(*s); s++; }
|
||||||
return 0;
|
return 0;
|
||||||
}
|
}
|
||||||
|
if (stream->kind == FILE_KIND_MEM) {
|
||||||
|
while (*s) {
|
||||||
|
if (fputc(*s, stream) == -1) return -1;
|
||||||
|
s++;
|
||||||
|
}
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
return -1;
|
||||||
|
}
|
||||||
|
|
||||||
int fflush(FILE *stream) { (void)stream; return 0; }
|
int fflush(FILE *stream) { (void)stream; return 0; }
|
||||||
int fclose(FILE *stream) { (void)stream; return 0; }
|
|
||||||
|
int fclose(FILE *stream) {
|
||||||
|
if (!stream) return -1;
|
||||||
|
// Don't close stdin/stdout/stderr — they're long-lived statics.
|
||||||
|
if (stream->kind != FILE_KIND_MEM) return 0;
|
||||||
|
stream->kind = 0;
|
||||||
|
stream->buf = (char *)0;
|
||||||
|
stream->size = 0;
|
||||||
|
stream->cap = 0;
|
||||||
|
stream->pos = 0;
|
||||||
|
stream->path = (const char *)0;
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Forward decls for routines that live in snprintf.c.
|
||||||
|
extern int vsnprintf(char *buf, size_t n, const char *fmt, va_list ap);
|
||||||
|
|
||||||
|
// Forward decl for vfprintf so fprintf can call it.
|
||||||
|
int vfprintf(FILE *stream, const char *fmt, va_list ap);
|
||||||
|
|
||||||
int fprintf(FILE *stream, const char *fmt, ...) {
|
int fprintf(FILE *stream, const char *fmt, ...) {
|
||||||
(void)stream;
|
|
||||||
va_list ap;
|
va_list ap;
|
||||||
__builtin_va_start(ap, fmt);
|
__builtin_va_start(ap, fmt);
|
||||||
int r = vprintf(fmt, ap);
|
int r = vfprintf(stream, fmt, ap);
|
||||||
__builtin_va_end(ap);
|
__builtin_va_end(ap);
|
||||||
return r;
|
return r;
|
||||||
}
|
}
|
||||||
|
|
||||||
int vfprintf(FILE *stream, const char *fmt, va_list ap) {
|
int vfprintf(FILE *stream, const char *fmt, va_list ap) {
|
||||||
(void)stream;
|
if (!stream) return -1;
|
||||||
|
if (stream->kind == FILE_KIND_STDOUT || stream->kind == FILE_KIND_STDERR)
|
||||||
return vprintf(fmt, ap);
|
return vprintf(fmt, ap);
|
||||||
|
if (stream->kind == FILE_KIND_MEM) {
|
||||||
|
// Format at the current position, using the remaining buffer
|
||||||
|
// capacity as the vsnprintf target. The caller is responsible for
|
||||||
|
// sizing the file's buffer.
|
||||||
|
if (!stream->writable) { stream->err = 1; return -1; }
|
||||||
|
size_t remain = (stream->cap > stream->pos)
|
||||||
|
? stream->cap - stream->pos : 0;
|
||||||
|
if (remain == 0) { stream->err = 1; return -1; }
|
||||||
|
int n = vsnprintf(stream->buf + stream->pos, remain, fmt, ap);
|
||||||
|
if (n < 0) { stream->err = 1; return -1; }
|
||||||
|
size_t written = ((size_t)n < remain) ? (size_t)n : remain - 1;
|
||||||
|
stream->pos += written;
|
||||||
|
if (stream->pos > stream->size) stream->size = stream->pos;
|
||||||
|
return n;
|
||||||
|
}
|
||||||
|
return -1;
|
||||||
}
|
}
|
||||||
|
|
||||||
// ---- assert ----
|
// ---- assert ----
|
||||||
|
|
@ -688,56 +825,204 @@ int atexit(AtexitFn fn) {
|
||||||
return 0;
|
return 0;
|
||||||
}
|
}
|
||||||
|
|
||||||
// ---- File I/O stubs ----
|
// ---- File I/O (memory-backed) ----
|
||||||
//
|
//
|
||||||
// A real implementation would route through the GS/OS dispatcher at
|
// Backed by mfsRegister'd entries. Mode strings:
|
||||||
// $E100A8 (build a class-1 parm block, push its pointer, JSL with X
|
// "r" read only
|
||||||
// = callNum, copy the refNum out). fopen would maintain a small
|
// "w" write, truncate to zero on open
|
||||||
// FD table mapping FILE* magic values back to GS/OS refNums.
|
// "a" write, position at end on open
|
||||||
// Until that lands, every call returns failure so code that links
|
// "r+" read+write
|
||||||
// against stdio degrades gracefully instead of trapping.
|
// "w+" read+write, truncate
|
||||||
|
// Plus optional "b" (no-op since we're memory-backed).
|
||||||
|
//
|
||||||
|
// Returns NULL if no registration matches `path` (or the requested
|
||||||
|
// mode isn't compatible with the registration's writable flag).
|
||||||
|
|
||||||
FILE *fopen(const char *path, const char *mode) {
|
FILE *fopen(const char *path, const char *mode) {
|
||||||
(void)path; (void)mode;
|
if (!path || !mode) return (FILE *)0;
|
||||||
return (FILE *)0;
|
int wantWrite = 0;
|
||||||
|
int wantRead = 1;
|
||||||
|
int truncate = 0;
|
||||||
|
int append = 0;
|
||||||
|
if (mode[0] == 'r') { wantRead = 1; wantWrite = (mode[1] == '+' || (mode[1] == 'b' && mode[2] == '+')); }
|
||||||
|
else if (mode[0] == 'w') { wantWrite = 1; truncate = 1; wantRead = (mode[1] == '+' || (mode[1] == 'b' && mode[2] == '+')); }
|
||||||
|
else if (mode[0] == 'a') { wantWrite = 1; append = 1; wantRead = (mode[1] == '+' || (mode[1] == 'b' && mode[2] == '+')); }
|
||||||
|
else return (FILE *)0;
|
||||||
|
|
||||||
|
// Locate registration.
|
||||||
|
MfsEntry *reg = (MfsEntry *)0;
|
||||||
|
for (int i = 0; i < MFS_MAX_REG; i++) {
|
||||||
|
if (__mfsReg[i].inUse && strcmp(__mfsReg[i].path, path) == 0) {
|
||||||
|
reg = &__mfsReg[i];
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
if (!reg) return (FILE *)0;
|
||||||
|
if (wantWrite && !reg->writable) return (FILE *)0;
|
||||||
|
|
||||||
|
// Allocate a FILE slot (3..MAX-1 — 0..2 are stdin/out/err).
|
||||||
|
FILE *f = (FILE *)0;
|
||||||
|
for (int i = 3; i < MFS_MAX_FILES; i++) {
|
||||||
|
if (__mfs[i].kind == 0) {
|
||||||
|
f = &__mfs[i];
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
if (!f) return (FILE *)0;
|
||||||
|
|
||||||
|
f->kind = FILE_KIND_MEM;
|
||||||
|
f->writable = (u8)(wantWrite ? 1 : 0);
|
||||||
|
f->eof = 0;
|
||||||
|
f->err = 0;
|
||||||
|
f->buf = reg->buf;
|
||||||
|
f->size = reg->size;
|
||||||
|
f->cap = reg->cap;
|
||||||
|
f->pos = 0;
|
||||||
|
f->unget = -1;
|
||||||
|
f->path = reg->path;
|
||||||
|
(void)wantRead;
|
||||||
|
|
||||||
|
if (truncate) f->size = 0;
|
||||||
|
if (append) f->pos = f->size;
|
||||||
|
return f;
|
||||||
}
|
}
|
||||||
|
|
||||||
size_t fread(void *ptr, size_t size, size_t nmemb, FILE *stream) {
|
size_t fread(void *ptr, size_t size, size_t nmemb, FILE *stream) {
|
||||||
(void)ptr; (void)size; (void)nmemb; (void)stream;
|
if (!stream || stream->kind != FILE_KIND_MEM) return 0;
|
||||||
return 0;
|
if (size == 0 || nmemb == 0) return 0;
|
||||||
|
// Avoid 32-bit overflow on size * nmemb: cap nmemb so the total
|
||||||
|
// byte count (size * nmemb) fits in the 16-bit address space.
|
||||||
|
if (nmemb > (size_t)0xFFFE / size) nmemb = (size_t)0xFFFE / size;
|
||||||
|
char *out = (char *)ptr;
|
||||||
|
size_t items = 0;
|
||||||
|
while (items < nmemb) {
|
||||||
|
size_t b;
|
||||||
|
// Each item: size bytes.
|
||||||
|
for (b = 0; b < size; b++) {
|
||||||
|
if (stream->unget >= 0) {
|
||||||
|
*out++ = (char)stream->unget;
|
||||||
|
stream->unget = -1;
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
if (stream->pos >= stream->size) {
|
||||||
|
stream->eof = 1;
|
||||||
|
return items;
|
||||||
|
}
|
||||||
|
*out++ = stream->buf[stream->pos++];
|
||||||
|
}
|
||||||
|
items++;
|
||||||
|
}
|
||||||
|
return items;
|
||||||
}
|
}
|
||||||
|
|
||||||
size_t fwrite(const void *ptr, size_t size, size_t nmemb, FILE *stream) {
|
size_t fwrite(const void *ptr, size_t size, size_t nmemb, FILE *stream) {
|
||||||
// For stdout/stderr, route through putchar so programs that use
|
if (!stream) return 0;
|
||||||
// fwrite for binary output ("write %d bytes to stdout") actually
|
if (size == 0 || nmemb == 0) return 0;
|
||||||
// produce output instead of silently dropping it. For other
|
// Cap nmemb so the total byte count (size * nmemb) fits in the address space
|
||||||
// streams (real file handles), still a stub returning 0.
|
// — avoids 32-bit `size * nmemb` that the i32 multiply path triggers.
|
||||||
if (stream == stdout || stream == stderr) {
|
if (nmemb > (size_t)0xFFFE / size) nmemb = (size_t)0xFFFE / size;
|
||||||
// size * nmemb can overflow size_t (16-bit on this target);
|
const char *in = (const char *)ptr;
|
||||||
// bail rather than silently truncate the byte count.
|
if (stream->kind == FILE_KIND_STDOUT || stream->kind == FILE_KIND_STDERR) {
|
||||||
if (size != 0 && nmemb > (size_t)0xFFFF / size) return 0;
|
size_t items = 0;
|
||||||
const u8 *p = (const u8 *)ptr;
|
while (items < nmemb) {
|
||||||
size_t total = size * nmemb;
|
for (size_t b = 0; b < size; b++) putchar(*in++);
|
||||||
for (size_t i = 0; i < total; i++) putchar(p[i]);
|
items++;
|
||||||
return nmemb;
|
|
||||||
}
|
}
|
||||||
(void)ptr; (void)size; (void)nmemb;
|
return items;
|
||||||
|
}
|
||||||
|
if (stream->kind != FILE_KIND_MEM) return 0;
|
||||||
|
if (!stream->writable) { stream->err = 1; return 0; }
|
||||||
|
size_t items = 0;
|
||||||
|
while (items < nmemb) {
|
||||||
|
size_t b;
|
||||||
|
for (b = 0; b < size; b++) {
|
||||||
|
if (stream->pos >= stream->cap) {
|
||||||
|
stream->err = 1;
|
||||||
|
if (stream->pos > stream->size) stream->size = stream->pos;
|
||||||
|
return items;
|
||||||
|
}
|
||||||
|
stream->buf[stream->pos++] = *in++;
|
||||||
|
}
|
||||||
|
items++;
|
||||||
|
}
|
||||||
|
if (stream->pos > stream->size) stream->size = stream->pos;
|
||||||
|
return items;
|
||||||
|
}
|
||||||
|
|
||||||
|
#define SEEK_SET 0
|
||||||
|
#define SEEK_CUR 1
|
||||||
|
#define SEEK_END 2
|
||||||
|
|
||||||
|
int fseek(FILE *stream, long offset, int whence) {
|
||||||
|
if (!stream || stream->kind != FILE_KIND_MEM) return -1;
|
||||||
|
long base;
|
||||||
|
if (whence == SEEK_SET) base = 0;
|
||||||
|
else if (whence == SEEK_CUR) base = (long)stream->pos;
|
||||||
|
else if (whence == SEEK_END) base = (long)stream->size;
|
||||||
|
else return -1;
|
||||||
|
long target = base + offset;
|
||||||
|
if (target < 0 || target > (long)stream->size) return -1;
|
||||||
|
stream->pos = (size_t)target;
|
||||||
|
stream->eof = 0;
|
||||||
|
stream->unget = -1;
|
||||||
return 0;
|
return 0;
|
||||||
}
|
}
|
||||||
|
|
||||||
int fseek(FILE *stream, long offset, int whence) {
|
long ftell(FILE *stream) {
|
||||||
(void)stream; (void)offset; (void)whence;
|
if (!stream || stream->kind != FILE_KIND_MEM) return -1L;
|
||||||
|
return (long)stream->pos;
|
||||||
|
}
|
||||||
|
|
||||||
|
int fgetc(FILE *stream) {
|
||||||
|
if (!stream) return -1;
|
||||||
|
if (stream->unget >= 0) {
|
||||||
|
int c = stream->unget;
|
||||||
|
stream->unget = -1;
|
||||||
|
return c;
|
||||||
|
}
|
||||||
|
if (stream->kind == FILE_KIND_MEM) {
|
||||||
|
if (stream->pos >= stream->size) { stream->eof = 1; return -1; }
|
||||||
|
return (int)(unsigned char)stream->buf[stream->pos++];
|
||||||
|
}
|
||||||
|
if (stream->kind == FILE_KIND_STDIN) return getchar();
|
||||||
return -1;
|
return -1;
|
||||||
}
|
}
|
||||||
|
|
||||||
long ftell(FILE *stream) {
|
char *fgets(char *buf, int n, FILE *stream) {
|
||||||
(void)stream;
|
if (!buf || n <= 0 || !stream) return (char *)0;
|
||||||
return -1L;
|
int i = 0;
|
||||||
|
while (i < n - 1) {
|
||||||
|
int c = fgetc(stream);
|
||||||
|
if (c < 0) {
|
||||||
|
if (i == 0) return (char *)0;
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
buf[i++] = (char)c;
|
||||||
|
if (c == '\n') break;
|
||||||
|
}
|
||||||
|
buf[i] = 0;
|
||||||
|
return buf;
|
||||||
}
|
}
|
||||||
|
|
||||||
int feof(FILE *stream) { (void)stream; return 1; }
|
int ungetc(int c, FILE *stream) {
|
||||||
int ferror(FILE *stream) { (void)stream; return 0; }
|
if (!stream || c < 0) return -1;
|
||||||
void clearerr(FILE *stream) { (void)stream; }
|
if (stream->unget >= 0) return -1; // only one slot
|
||||||
|
stream->unget = c & 0xFF;
|
||||||
|
stream->eof = 0;
|
||||||
|
return c & 0xFF;
|
||||||
|
}
|
||||||
|
|
||||||
|
int feof(FILE *stream) {
|
||||||
|
return stream ? (int)stream->eof : 1;
|
||||||
|
}
|
||||||
|
|
||||||
|
int ferror(FILE *stream) {
|
||||||
|
return stream ? (int)stream->err : 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
void clearerr(FILE *stream) {
|
||||||
|
if (stream) { stream->eof = 0; stream->err = 0; }
|
||||||
|
}
|
||||||
|
|
||||||
// ---- locale.h stubs ----
|
// ---- locale.h stubs ----
|
||||||
//
|
//
|
||||||
|
|
@ -792,22 +1077,46 @@ struct lconv *localeconv(void) {
|
||||||
return &__c_lconv;
|
return &__c_lconv;
|
||||||
}
|
}
|
||||||
|
|
||||||
// ---- signal.h stubs ----
|
// ---- signal.h ----
|
||||||
//
|
//
|
||||||
// IIgs has no POSIX-style signal model. signal() always fails (returns
|
// IIgs has no POSIX-style signal source (no kernel-delivered signals
|
||||||
// SIG_ERR); raise() returns -1. Code that uses these for diagnostic
|
// from external events), but a small in-process signal table makes
|
||||||
// fall-through (e.g. abort -> raise(SIGABRT) -> stub) compiles and
|
// signal()/raise() work for synchronous diagnostic use: a program
|
||||||
// behaves as "signals disabled".
|
// can install SIGABRT/SIGINT/etc. handlers and abort()-equivalent
|
||||||
|
// code can raise(SIGABRT) to invoke them. No async signal delivery.
|
||||||
|
//
|
||||||
|
// Table indexed by signal number 0..15; raise() looks up the
|
||||||
|
// installed handler and calls it. SIG_DFL falls through to a
|
||||||
|
// per-signal default (SIGABRT calls abort(); SIGINT/SIGTERM call exit(128 + sig); others are ignored).
|
||||||
|
|
||||||
typedef void (*__sighandler_t)(int);
|
typedef void (*__sighandler_t)(int);
|
||||||
|
#define _SIG_DFL ((__sighandler_t)0)
|
||||||
|
#define _SIG_IGN ((__sighandler_t)1)
|
||||||
#define _SIG_ERR ((__sighandler_t)-1)
|
#define _SIG_ERR ((__sighandler_t)-1)
|
||||||
|
|
||||||
|
#define _NSIG 16
|
||||||
|
static __sighandler_t __sigHandlers[_NSIG];
|
||||||
|
|
||||||
__sighandler_t signal(int sig, __sighandler_t handler) {
|
__sighandler_t signal(int sig, __sighandler_t handler) {
|
||||||
(void)sig; (void)handler;
|
if (sig < 0 || sig >= _NSIG) return _SIG_ERR;
|
||||||
return _SIG_ERR;
|
__sighandler_t prev = __sigHandlers[sig];
|
||||||
|
if (!prev) prev = _SIG_DFL;
|
||||||
|
__sigHandlers[sig] = handler;
|
||||||
|
return prev;
|
||||||
}
|
}
|
||||||
|
|
||||||
int raise(int sig) {
|
int raise(int sig) {
|
||||||
(void)sig;
|
if (sig < 0 || sig >= _NSIG) return -1;
|
||||||
return -1;
|
__sighandler_t h = __sigHandlers[sig];
|
||||||
|
if (h == _SIG_IGN) return 0;
|
||||||
|
if (!h || h == _SIG_DFL) {
|
||||||
|
// Default action: SIGABRT -> abort(); SIGTERM/SIGINT -> exit;
|
||||||
|
// others -> ignore.
|
||||||
|
if (sig == 6) abort(); // SIGABRT
|
||||||
|
if (sig == 2 || sig == 15) // SIGINT, SIGTERM
|
||||||
|
exit(128 + sig);
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
h(sig);
|
||||||
|
return 0;
|
||||||
}
|
}
|
||||||
|
|
|
||||||
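A minimal usage sketch of the synchronous signal path described in the signal.h comment above. It is written against the standard C `signal`/`raise` interface only and assumes nothing about this runtime's internals; `lastSignal`, `onSignal`, and `demoRaise` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <signal.h>

/* Hypothetical: records the most recent signal number seen by the handler. */
static volatile sig_atomic_t lastSignal = 0;

/* Hypothetical handler: just remembers which signal was raised. */
static void onSignal(int sig) {
    lastSignal = sig;
}

/* Installs a handler, raises the signal synchronously, and returns
   the signal number the handler observed. */
int demoRaise(void) {
    void (*prev)(int) = signal(SIGINT, onSignal);
    assert(prev != SIG_ERR);   /* an in-range signal number should succeed */
    int rc = raise(SIGINT);    /* the handler runs before raise() returns */
    assert(rc == 0);
    signal(SIGINT, prev);      /* restore the previous disposition */
    return (int)lastSignal;
}
```

Because delivery is synchronous, the handler has already run by the time `raise()` returns, so `demoRaise()` is expected to return `SIGINT`.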
106
scripts/bench.sh
Executable file
|
|
@ -0,0 +1,106 @@
|
||||||
|
#!/usr/bin/env bash
|
||||||
|
# bench.sh — compile a benchmark suite with both clang (this toolchain)
|
||||||
|
# and Calypsi cc65816, compare emitted code size.
|
||||||
|
#
|
||||||
|
# Each benchmark is a self-contained .c file under benchmarks/. We
|
||||||
|
# compile each with both toolchains (-O2 / --speed), then count
|
||||||
|
# bytes in the .text section of the resulting object (code size only).
|
||||||
|
# Output is a markdown table on stdout.
|
||||||
|
#
|
||||||
|
# Cycle-time comparison would require running each benchmark in MAME
|
||||||
|
# under both toolchains' produced code, with a wrapper function that
|
||||||
|
# instruments the cycle counter. That's a separate, more involved
|
||||||
|
# tool — left for future work.
|
||||||
|
|
||||||
|
set -euo pipefail
|
||||||
|
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
|
||||||
|
PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
|
||||||
|
BENCH_DIR="$PROJECT_ROOT/benchmarks"
|
||||||
|
|
||||||
|
CLANG="$PROJECT_ROOT/tools/llvm-mos-build/bin/clang"
|
||||||
|
CALYPSI="$PROJECT_ROOT/tools/calypsi/usr/local/lib/calypsi-65816-5.16/bin/cc65816"
|
||||||
|
|
||||||
|
[ -x "$CLANG" ] || { echo "ERROR: clang not built" >&2; exit 1; }
|
||||||
|
[ -x "$CALYPSI" ] || { echo "ERROR: Calypsi not installed" >&2; exit 1; }
|
||||||
|
[ -d "$BENCH_DIR" ] || { echo "ERROR: $BENCH_DIR not found" >&2; exit 1; }
|
||||||
|
|
||||||
|
# Object-size measurement. Different object formats — for clang it's
|
||||||
|
# ELF (use llvm-readobj), for Calypsi it's its own format (use the
|
||||||
|
# highest offset in the assembler listing as a proxy). ELF .text +
|
||||||
|
# .rodata + .data covers code + constants; we report code-only as the
|
||||||
|
# primary metric.
|
||||||
|
clangSize() {
|
||||||
|
local o="$1"
|
||||||
|
"$PROJECT_ROOT/tools/llvm-mos-build/bin/llvm-readobj" --section-headers "$o" \
|
||||||
|
2>/dev/null | awk '
|
||||||
|
/Name: .text/ { intext=1; inrodata=0; indata=0; next }
|
||||||
|
/Name: .rodata/ { intext=0; inrodata=1; indata=0; next }
|
||||||
|
/Name: .data/ { intext=0; inrodata=0; indata=1; next }
|
||||||
|
/Name: / { intext=0; inrodata=0; indata=0; next }
|
||||||
|
/Size:/ {
|
||||||
|
if (intext) text += strtonum($2)
|
||||||
|
if (inrodata) rodata += strtonum($2)
|
||||||
|
if (indata) data += strtonum($2)
|
||||||
|
}
|
||||||
|
END { print text " " rodata " " data }
|
||||||
|
'
|
||||||
|
}
|
||||||
|
|
||||||
|
# Calypsi text size: extract the highest farcode offset from the
|
||||||
|
# assembler listing. cc65816 -> .s, then as65816 --list-file
|
||||||
|
# emits "OFFSET hexbytes" columns; we report the max offset and do
|
||||||
|
# not add the byte width of the final instruction (1-3 bytes typically),
|
||||||
|
# so the result undercounts the true text size by a few bytes.
|
||||||
|
calypsiTextSize() {
|
||||||
|
local src="$1"
|
||||||
|
local s lst tmp
|
||||||
|
s=$(mktemp --suffix=.s)
|
||||||
|
lst=$(mktemp --suffix=.lst)
|
||||||
|
tmp=$(mktemp --suffix=.o)
|
||||||
|
"$CALYPSI" -O 2 --speed --assembly-source "$s" -c "$src" -o "$tmp" 2>/dev/null \
|
||||||
|
|| { echo 0; rm -f "$s" "$lst" "$tmp"; return; }
|
||||||
|
"$CALYPSI" -O 2 --speed -c "$src" -o "$tmp" 2>/dev/null
|
||||||
|
"$PROJECT_ROOT/tools/calypsi/usr/local/lib/calypsi-65816-5.16/bin/as65816" \
|
||||||
|
--list-file "$lst" -o "$tmp" "$s" 2>/dev/null
|
||||||
|
# Highest farcode offset. We skip the +instruction-bytes detail
|
||||||
|
# (rough estimate is fine for relative comparison).
|
||||||
|
local maxOff
|
||||||
|
maxOff=$(grep -oE "^[0-9]+ [0-9a-f]{6}" "$lst" 2>/dev/null \
|
||||||
|
| awk '{print strtonum("0x"$2)}' | sort -n | tail -1)
|
||||||
|
echo "${maxOff:-0}"
|
||||||
|
rm -f "$s" "$lst" "$tmp"
|
||||||
|
}
|
||||||
|
|
||||||
|
# Print markdown header.
|
||||||
|
printf '| Benchmark | clang (B) | Calypsi (B) | clang vs Calypsi |\n'
|
||||||
|
printf '|-----------|----------:|------------:|-----------------:|\n'
|
||||||
|
|
||||||
|
totalClang=0
|
||||||
|
totalCalypsi=0
|
||||||
|
for src in "$BENCH_DIR"/*.c; do
|
||||||
|
name=$(basename "$src" .c)
|
||||||
|
cObj=$(mktemp --suffix=.clang.o)
|
||||||
|
|
||||||
|
"$CLANG" --target=w65816 -O2 -ffunction-sections \
|
||||||
|
-c "$src" -o "$cObj" 2>/dev/null || { echo "clang failed on $name" >&2; rm -f "$cObj"; continue; }
|
||||||
|
|
||||||
|
read clangText _ _ < <(clangSize "$cObj")
|
||||||
|
clangText=${clangText:-0}
|
||||||
|
|
||||||
|
calText=$(calypsiTextSize "$src")
|
||||||
|
|
||||||
|
if [ "$calText" -gt 0 ]; then
|
||||||
|
ratio=$(awk -v a="$clangText" -v b="$calText" 'BEGIN{printf "%.2fx", a/b}')
|
||||||
|
else
|
||||||
|
ratio="—"
|
||||||
|
fi
|
||||||
|
printf '| %s | %d | %d | %s |\n' "$name" "$clangText" "$calText" "$ratio"
|
||||||
|
totalClang=$((totalClang + clangText))
|
||||||
|
totalCalypsi=$((totalCalypsi + calText))
|
||||||
|
rm -f "$cObj"
|
||||||
|
done
|
||||||
|
|
||||||
|
if [ "$totalCalypsi" -gt 0 ]; then
|
||||||
|
totalRatio=$(awk -v a="$totalClang" -v b="$totalCalypsi" 'BEGIN{printf "%.2fx", a/b}')
|
||||||
|
printf '| **total** | **%d** | **%d** | **%s** |\n' "$totalClang" "$totalCalypsi" "$totalRatio"
|
||||||
|
fi
|
||||||
425
scripts/genToolbox.py
Normal file
|
|
@ -0,0 +1,425 @@
|
||||||
|
#!/usr/bin/env python3
|
||||||
|
# genToolbox.py — generate IIgs toolbox wrappers from ORCA-C headers.
|
||||||
|
#
|
||||||
|
# Reads ORCA's extern declarations of the form:
|
||||||
|
# extern pascal RetType FuncName(ArgType, ArgType) inline(0xNNTT, dispatcher);
|
||||||
|
# and emits two outputs:
|
||||||
|
# - C header with `static inline` wrappers using clang inline-asm
|
||||||
|
# - .s file with extern wrapper bodies for multi-arg routines that
|
||||||
|
# can't fit in inline asm (our backend's constraints don't take
|
||||||
|
# memory operands).
|
||||||
|
#
|
||||||
|
# Tool number convention: 0xNNTT, where the high byte (NN) is the function and the low byte (TT) is the tool set.
|
||||||
|
# Dispatcher: JSL $E10000 for normal toolbox; JSL $E100A8 for GS/OS
|
||||||
|
# (only the ProDOS-16 / GS/OS calls use _CallBackVector).
|
||||||
|
#
|
||||||
|
# Calling convention conversion: ORCA uses Pascal (args pushed L-to-R),
|
||||||
|
# our C ABI passes arg0 in A and arg1+ on stack RTL. Each generated
|
||||||
|
# wrapper re-pushes args in toolbox order.
|
||||||
|
#
|
||||||
|
# Type widths (matching ORCA):
|
||||||
|
# Word, Boolean, Integer, Char, Byte = 2 bytes (16-bit)
|
||||||
|
# LongWord, Long, Handle, Pointer = 4 bytes (32-bit)
|
||||||
|
# Ptr, Ref, ResType = 4 bytes
|
||||||
|
# (Pointer is 4 bytes in ORCA -- it's a far/24-bit pointer. Our backend
|
||||||
|
# uses 16-bit pointers, but the toolbox expects 32-bit on the stack;
|
||||||
|
# we extend with a zero high word.)
|
||||||
|
#
|
||||||
|
# Output files are written to the runtime tree.
|
||||||
|
|
||||||
|
import re
|
||||||
|
import sys
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
ORCA_DIR = Path("/tmp/orca-headers")
|
||||||
|
OUT_HEADER = Path("/home/scott/claude/llvm816/runtime/include/iigs/toolbox.h")
|
||||||
|
OUT_ASM = Path("/home/scott/claude/llvm816/runtime/src/iigsToolbox.s")
|
||||||
|
|
||||||
|
# Type table: (size in bytes, c-type)
|
||||||
|
TYPE_MAP = {
|
||||||
|
"void": (0, "void"),
|
||||||
|
"Word": (2, "unsigned short"),
|
||||||
|
"Boolean": (2, "unsigned short"),
|
||||||
|
"Integer": (2, "short"),
|
||||||
|
"Char": (2, "char"), # widened on stack
|
||||||
|
"Byte": (2, "unsigned char"),
|
||||||
|
"LongWord": (4, "unsigned long"),
|
||||||
|
"Long": (4, "long"),
|
||||||
|
"Handle": (4, "void *"), # 4-byte handle
|
||||||
|
"Pointer": (4, "void *"), # 4-byte pointer (toolbox semantics)
|
||||||
|
"Ref": (4, "void *"),
|
||||||
|
"Ptr": (4, "void *"),
|
||||||
|
"ResType": (4, "unsigned long"),
|
||||||
|
"Real": (4, "float"),
|
||||||
|
"Double": (8, "double"),
|
||||||
|
"Comp": (8, "long long"),
|
||||||
|
"Extended": (10, "long double"),
|
||||||
|
"GrafPortPtr":(4, "void *"),
|
||||||
|
"WindowPtr": (4, "void *"),
|
||||||
|
"MenuHandle": (4, "void *"),
|
||||||
|
"CtlRecHndl": (4, "void *"),
|
||||||
|
"DialogPtr": (4, "void *"),
|
||||||
|
"RgnHandle": (4, "void *"),
|
||||||
|
"PrPort": (4, "void *"),
|
||||||
|
"PrRecHndl": (4, "void *"),
|
||||||
|
"PicHandle": (4, "void *"),
|
||||||
|
"WindRecHndl":(4, "void *"),
|
||||||
|
}
|
||||||
|
|
||||||
|
# Tool number → tool-set name mapping (low byte of toolNumber)
|
||||||
|
TOOLSET_NAME = {
|
||||||
|
0x01: "ToolLocator",
|
||||||
|
0x02: "MemoryManager",
|
||||||
|
0x03: "MiscTools",
|
||||||
|
0x04: "QuickDraw",
|
||||||
|
0x05: "DeskManager",
|
||||||
|
0x06: "EventManager",
|
||||||
|
0x07: "Scheduler",
|
||||||
|
0x08: "SoundManager",
|
||||||
|
0x09: "AppleDeskBus",
|
||||||
|
0x0A: "SANE",
|
||||||
|
0x0B: "IntegerMath",
|
||||||
|
0x0C: "TextTools",
|
||||||
|
0x0E: "WindowManager",
|
||||||
|
0x0F: "MenuManager",
|
||||||
|
0x10: "ControlManager",
|
||||||
|
0x11: "Loader",
|
||||||
|
0x12: "QDAuxiliary",
|
||||||
|
0x13: "PrintManager",
|
||||||
|
0x14: "LineEdit",
|
||||||
|
0x15: "DialogManager",
|
||||||
|
0x16: "ScrapManager",
|
||||||
|
0x17: "StandardFile",
|
||||||
|
0x18: "DiskUtil",
|
||||||
|
0x19: "NoteSynth",
|
||||||
|
0x1A: "NoteSequencer",
|
||||||
|
0x1B: "FontManager",
|
||||||
|
0x1C: "ListManager",
|
||||||
|
0x1D: "ACETools",
|
||||||
|
0x1E: "ResourceManager",
|
||||||
|
0x1F: "MIDITools",
|
||||||
|
0x20: "VideoOverlay",
|
||||||
|
0x21: "Teletext",
|
||||||
|
0x22: "TextEdit",
|
||||||
|
0x23: "MediaControl",
|
||||||
|
0x32: "MediaControl2",
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
def parseLine(line):
|
||||||
|
"""Parse `extern pascal RetType Name(args) inline(0xNNTT, dispatcher);`
|
||||||
|
Returns dict or None if not a toolbox decl.
|
||||||
|
"""
|
||||||
|
m = re.match(
|
||||||
|
r'^\s*extern\s+pascal\s+(\w+)\s+(\w+)\s*\((.*?)\)\s+inline\(0x([0-9A-Fa-f]+)\s*,\s*(\w+)\)\s*;',
|
||||||
|
line,
|
||||||
|
)
|
||||||
|
if not m:
|
||||||
|
return None
|
||||||
|
retType, name, args, toolHex, dispatcher = m.group(1, 2, 3, 4, 5)
|
||||||
|
toolNum = int(toolHex, 16)
|
||||||
|
|
||||||
|
# Parse arg types (just the types, no names since ORCA omits them).
|
||||||
|
args = args.strip()
|
||||||
|
argTypes = []
|
||||||
|
if args and args != "void":
|
||||||
|
for a in args.split(","):
|
||||||
|
a = a.strip()
|
||||||
|
# ORCA may have type-only or "type name"; take the first word.
|
||||||
|
t = a.split()[0]
|
||||||
|
argTypes.append(t)
|
||||||
|
return {
|
||||||
|
"ret": retType,
|
||||||
|
"name": name,
|
||||||
|
"args": argTypes,
|
||||||
|
"tool": toolNum,
|
||||||
|
"dispatcher": dispatcher,
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
def typeInfo(t):
|
||||||
|
"""Return (size_bytes, c_type) for an ORCA type; unknown types default to (4, "void *")."""
|
||||||
|
if t in TYPE_MAP:
|
||||||
|
return TYPE_MAP[t]
|
||||||
|
# Default: assume 4 bytes / void* (pointer-like)
|
||||||
|
return (4, "void *")
|
||||||
|
|
||||||
|
|
||||||
|
def emit(decls):
|
||||||
|
"""Generate C header and .s file from parsed decls."""
|
||||||
|
|
||||||
|
cLines = [
|
||||||
|
"// AUTOGENERATED by scripts/genToolbox.py from ORCA-C ORCACDefs/.",
|
||||||
|
"// DO NOT EDIT by hand — regenerate to update.",
|
||||||
|
"//",
|
||||||
|
"// Complete IIgs toolbox: ~1300 routines across 35 tool sets.",
|
||||||
|
"// Names match Apple's IIgs Toolbox Reference (TLStartUp,",
|
||||||
|
"// MMStartUp, NewWindow, SysBeep, etc.). Multi-arg wrappers",
|
||||||
|
"// (those whose stub body uses memory operands) live in",
|
||||||
|
"// runtime/src/iigsToolbox.s; zero-arg / single-arg simple",
|
||||||
|
"// ones are inlined here.",
|
||||||
|
"",
|
||||||
|
"#ifndef IIGS_TOOLBOX_H",
|
||||||
|
"#define IIGS_TOOLBOX_H",
|
||||||
|
"",
|
||||||
|
"#ifdef __cplusplus",
|
||||||
|
'extern "C" {',
|
||||||
|
"#endif",
|
||||||
|
"",
|
||||||
|
]
|
||||||
|
|
||||||
|
sLines = [
|
||||||
|
"; AUTOGENERATED by scripts/genToolbox.py from ORCA-C ORCACDefs/.",
|
||||||
|
"; DO NOT EDIT by hand — regenerate to update.",
|
||||||
|
";",
|
||||||
|
"; IIgs toolbox multi-arg wrappers.",
|
||||||
|
";",
|
||||||
|
"; C ABI: arg0 (i16) in A, arg0 (i32) in A:X, arg1+ on stack (4,S etc.).",
|
||||||
|
"; Each wrapper re-pushes args in toolbox (Pascal-style L-to-R) order,",
|
||||||
|
"; preceded by result space if non-void return, then JSL $E10000",
|
||||||
|
"; (or $E100A8 for GS/OS). Pops result if non-void.",
|
||||||
|
";",
|
||||||
|
"; Tool number: high byte = function, low byte = tool set.",
|
||||||
|
"",
|
||||||
|
"\t.text",
|
||||||
|
"",
|
||||||
|
]
|
||||||
|
|
||||||
|
seenNames = set()
|
||||||
|
inlineCount = 0
|
||||||
|
asmCount = 0
|
||||||
|
skipped = []
|
||||||
|
|
||||||
|
for d in decls:
|
||||||
|
name = d["name"]
|
||||||
|
if name in seenNames:
|
||||||
|
continue # duplicate from header re-include, etc.
|
||||||
|
seenNames.add(name)
|
||||||
|
|
||||||
|
retType = d["ret"]
|
||||||
|
argTypes = d["args"]
|
||||||
|
tool = d["tool"]
|
||||||
|
dispatcher = d["dispatcher"]
|
||||||
|
|
||||||
|
# Check if all types are known.
|
||||||
|
retSize, retC = typeInfo(retType)
|
||||||
|
argInfo = [typeInfo(a) for a in argTypes]
|
||||||
|
if any(ai is None for ai in argInfo):
|
||||||
|
skipped.append((name, "unknown arg type"))
|
||||||
|
continue
|
||||||
|
|
||||||
|
# Build C-style arg list.
|
||||||
|
cArgs = ", ".join(f"{ai[1]} a{i}" for i, ai in enumerate(argInfo))
|
||||||
|
if not cArgs:
|
||||||
|
cArgs = "void"
|
||||||
|
cDecl = f"{retC} {name}({cArgs});"
|
||||||
|
|
||||||
|
# Decide inline vs asm.
|
||||||
|
# Simple cases that can be inlined: no args (with or without 16-bit
|
||||||
|
# return), or single 16-bit arg with void return / 16-bit return.
|
||||||
|
canInline = False
|
||||||
|
if not argInfo and retSize in (0, 2):
|
||||||
|
canInline = True
|
||||||
|
elif (
|
||||||
|
len(argInfo) == 1
|
||||||
|
and argInfo[0][0] == 2
|
||||||
|
and retSize in (0, 2)
|
||||||
|
):
|
||||||
|
canInline = True
|
||||||
|
|
||||||
|
dispAddr = "0xe10000" if dispatcher == "dispatcher" else "0xe100a8"
|
||||||
|
|
||||||
|
if canInline:
|
||||||
|
# Generate inline asm body.
|
||||||
|
if not argInfo:
|
||||||
|
if retSize == 0:
|
||||||
|
body = (
|
||||||
|
f' __asm__ volatile (\n'
|
||||||
|
f' "ldx #0x{tool:04X}\\n"\n'
|
||||||
|
f' "jsl {dispAddr}\\n"\n'
|
||||||
|
f' :\n'
|
||||||
|
f' :\n'
|
||||||
|
f' : "a", "x", "y", "memory"\n'
|
||||||
|
f' );\n'
|
||||||
|
)
|
||||||
|
else: # 16-bit return
|
||||||
|
body = (
|
||||||
|
f' {retC} _r;\n'
|
||||||
|
f' __asm__ volatile (\n'
|
||||||
|
f' "pha\\n" // result space\n'
|
||||||
|
f' "ldx #0x{tool:04X}\\n"\n'
|
||||||
|
f' "jsl {dispAddr}\\n"\n'
|
||||||
|
f' "pla\\n"\n'
|
||||||
|
f' : "=a"(_r)\n'
|
||||||
|
f' :\n'
|
||||||
|
f' : "x", "y", "memory"\n'
|
||||||
|
f' );\n'
|
||||||
|
f' return _r;\n'
|
||||||
|
)
|
||||||
|
else: # 1-arg
|
||||||
|
if retSize == 0:
|
||||||
|
body = (
|
||||||
|
f' __asm__ volatile (\n'
|
||||||
|
f' "pha\\n" // arg0\n'
|
||||||
|
f' "ldx #0x{tool:04X}\\n"\n'
|
||||||
|
f' "jsl {dispAddr}\\n"\n'
|
||||||
|
f' :\n'
|
||||||
|
f' : "a"(a0)\n'
|
||||||
|
f' : "x", "y", "memory"\n'
|
||||||
|
f' );\n'
|
||||||
|
)
|
||||||
|
else:
|
||||||
|
body = (
|
||||||
|
f' {retC} _r;\n'
|
||||||
|
f' __asm__ volatile (\n'
|
||||||
|
f' "pha\\n" // result space\n'
|
||||||
|
f' "pha\\n" // arg0\n'
|
||||||
|
f' "ldx #0x{tool:04X}\\n"\n'
|
||||||
|
f' "jsl {dispAddr}\\n"\n'
|
||||||
|
f' "pla\\n"\n'
|
||||||
|
f' : "=a"(_r)\n'
|
||||||
|
f' : "a"(a0)\n'
|
||||||
|
f' : "x", "y", "memory"\n'
|
||||||
|
f' );\n'
|
||||||
|
f' return _r;\n'
|
||||||
|
)
|
||||||
|
|
||||||
|
cLines.append(f"// tool 0x{tool:04X} set 0x{tool & 0xFF:02X} ({TOOLSET_NAME.get(tool & 0xFF, '?')})")
|
||||||
|
cLines.append(f"static inline {retC} {name}({cArgs}) {{")
|
||||||
|
cLines.append(body.rstrip())
|
||||||
|
cLines.append("}")
|
||||||
|
cLines.append("")
|
||||||
|
inlineCount += 1
|
||||||
|
else:
|
||||||
|
# Extern decl in header, asm body in .s file.
|
||||||
|
cLines.append(f"extern {retC} {name}({cArgs}); // 0x{tool:04X}")
|
||||||
|
|
||||||
|
# Generate asm body.
|
||||||
|
sLines.append(f"; {name}({', '.join(argTypes) or 'void'}) -> {retType}")
|
||||||
|
sLines.append(f"; tool 0x{tool:04X}, set 0x{tool & 0xFF:02X} ({TOOLSET_NAME.get(tool & 0xFF, '?')})")
|
||||||
|
sLines.append(f"\t.globl {name}")
|
||||||
|
sLines.append(f"{name}:")
|
||||||
|
|
||||||
|
# Compute total stack arg bytes (excluding arg0 which is in regs).
|
||||||
|
# Determine where each arg starts on the caller's stack.
|
||||||
|
# arg0 is in A (or A:X for i32-first-arg).
|
||||||
|
firstArgIs32 = argInfo and argInfo[0][0] == 4
|
||||||
|
stackArgStart = 4 # offset to first stack-passed arg after JSL retaddr
|
||||||
|
|
||||||
|
# Stash arg0. i16: 'sta scratch'. i32: 'sta scratch; stx scratch+2'.
|
||||||
|
scratchDP = 0xE0 # libcall scratch zone
|
||||||
|
sLines.append(f"\t; --- stash arg0 (in A{'/X' if firstArgIs32 else ''}) ---")
|
||||||
|
sLines.append(f"\tsta 0x{scratchDP:02X}")
|
||||||
|
if firstArgIs32:
|
||||||
|
sLines.append(f"\tstx 0x{scratchDP + 2:02X}")
|
||||||
|
|
||||||
|
# Push result space (toolbox order: result is highest on stack).
|
||||||
|
if retSize > 0:
|
||||||
|
sLines.append(f"\t; --- result space ({retSize} bytes) ---")
|
||||||
|
for _ in range((retSize + 1) // 2):
|
||||||
|
sLines.append(f"\tpea 0")
|
||||||
|
|
||||||
|
# Push args in Pascal order (L-to-R, but each multi-byte value
|
||||||
|
# pushed lo-word first then hi-word per ORCA convention).
|
||||||
|
# Tracker: how many bytes have we pushed beyond the original
|
||||||
|
# caller-stack so all stack-arg loads need to add (pushed) to
|
||||||
|
# their original offset.
|
||||||
|
pushedBytes = (retSize + 1) // 2 * 2 # result space rounded up to word
|
||||||
|
# arg0 first.
|
||||||
|
sLines.append(f"\t; --- arg0 ---")
|
||||||
|
sLines.append(f"\tlda 0x{scratchDP:02X}")
|
||||||
|
sLines.append(f"\tpha")
|
||||||
|
pushedBytes += 2
|
||||||
|
if firstArgIs32:
|
||||||
|
sLines.append(f"\tlda 0x{scratchDP + 2:02X}")
|
||||||
|
sLines.append(f"\tpha")
|
||||||
|
pushedBytes += 2
|
||||||
|
|
||||||
|
# arg1, arg2, ... — each loaded from caller stack at original
|
||||||
|
# offset + pushedBytes.
|
||||||
|
stackArgOffset = stackArgStart # original offset of next arg
|
||||||
|
for i, ai in enumerate(argInfo[1:], start=1):
|
||||||
|
size = ai[0]
|
||||||
|
sLines.append(f"\t; --- arg{i} ({argTypes[i]}, {size}B) ---")
|
||||||
|
# i16 / 16-bit-on-stack args: 1 word, push lo
|
||||||
|
# i32 / 32-bit-on-stack: 2 words, push lo then hi
|
||||||
|
# We're loading from caller's pre-push stack. Original
|
||||||
|
# offsets: arg1 at 4, arg2 at 4+size(arg1), ...
|
||||||
|
# But each load from `(orig+pushed),s` accounts for pushes.
|
||||||
|
if size <= 2:
|
||||||
|
sLines.append(f"\tlda {stackArgOffset + pushedBytes}, s")
|
||||||
|
sLines.append(f"\tpha")
|
||||||
|
pushedBytes += 2
|
||||||
|
stackArgOffset += 2
|
||||||
|
elif size == 4:
|
||||||
|
# Load lo, push; load hi, push.
|
||||||
|
sLines.append(f"\tlda {stackArgOffset + pushedBytes}, s")
|
||||||
|
sLines.append(f"\tpha")
|
||||||
|
pushedBytes += 2
|
||||||
|
sLines.append(f"\tlda {stackArgOffset + 2 + pushedBytes}, s")  # hi word: 2 bytes above lo
|
||||||
|
sLines.append(f"\tpha")
|
||||||
|
pushedBytes += 2
|
||||||
|
stackArgOffset += 4
|
||||||
|
else:
|
||||||
|
# Bigger types (8-byte Comp, 10-byte Extended) — push word by word.
|
||||||
|
nWords = (size + 1) // 2
|
||||||
|
for w in range(nWords):
|
||||||
|
sLines.append(f"\tlda {stackArgOffset + 2 * w + pushedBytes}, s")  # word w of this arg
|
||||||
|
sLines.append(f"\tpha")
|
||||||
|
pushedBytes += 2
|
||||||
|
stackArgOffset += size
|
||||||
|
|
||||||
|
# Dispatch.
|
||||||
|
sLines.append(f"\tldx #0x{tool:04X}")
|
||||||
|
sLines.append(f"\tjsl {dispAddr}")
|
||||||
|
|
||||||
|
# Pop result.
|
||||||
|
if retSize == 2:
|
||||||
|
sLines.append(f"\tpla ; result -> A")
|
||||||
|
elif retSize == 4:
|
||||||
|
sLines.append(f"\tpla ; result lo -> A")
|
||||||
|
sLines.append(f"\tplx ; result hi -> X")
|
||||||
|
elif retSize > 4:
|
||||||
|
# Larger results: pop into scratch then load A/X for return.
|
||||||
|
# Treat as "best effort" — caller should not expect a real
|
||||||
|
# return value beyond what fits in A:X.
|
||||||
|
nWords = (retSize + 1) // 2
|
||||||
|
for _ in range(nWords):
|
||||||
|
sLines.append(f"\tpla")
|
||||||
|
|
||||||
|
sLines.append(f"\trtl")
|
||||||
|
sLines.append("")
|
||||||
|
|
||||||
|
asmCount += 1
|
||||||
|
|
||||||
|
cLines.append("")
|
||||||
|
cLines.append("#ifdef __cplusplus")
|
||||||
|
cLines.append("}")
|
||||||
|
cLines.append("#endif")
|
||||||
|
cLines.append("")
|
||||||
|
cLines.append("#endif // IIGS_TOOLBOX_H")
|
||||||
|
|
||||||
|
OUT_HEADER.write_text("\n".join(cLines))
|
||||||
|
OUT_ASM.write_text("\n".join(sLines))
|
||||||
|
|
||||||
|
print(f"wrote {OUT_HEADER}: {inlineCount} inline + {asmCount} extern decls")
|
||||||
|
print(f"wrote {OUT_ASM}: {asmCount} bodies")
|
||||||
|
if skipped:
|
||||||
|
print(f"skipped {len(skipped)} routines (unhandled types):")
|
||||||
|
for n, why in skipped[:5]:
|
||||||
|
print(f" {n}: {why}")
|
||||||
|
|
||||||
|
|
||||||
|
def main():
|
||||||
|
decls = []
|
||||||
|
for h in sorted(ORCA_DIR.glob("*.h")):
|
||||||
|
for line in h.read_text().splitlines():
|
||||||
|
d = parseLine(line)
|
||||||
|
if d:
|
||||||
|
decls.append(d)
|
||||||
|
print(f"parsed {len(decls)} declarations from {ORCA_DIR}")
|
||||||
|
emit(decls)
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
main()
|
||||||
|
|
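Review note on the stack-relative bookkeeping in the wrapper generator: the stack grows downward and `lda N,s` reads at S + N, so every `pha` moves each caller argument 2 bytes further from S. The sketch below is a hypothetical Python model of that arithmetic (the names `word_offsets` and `simulate` are illustrative and are not part of genToolbox.py). It assumes the caller stores multi-word arguments low word first at increasing addresses.

```python
# Hypothetical model of 65816 stack-relative loads; not part of genToolbox.py.
# The stack grows downward and `lda N,s` reads the word at S + N.

def word_offsets(orig_offset, n_words, pushed_bytes):
    """Return the N to use in `lda N,s` for each word of one stack arg.

    orig_offset  -- offset of the arg's low word at wrapper entry
    n_words      -- number of 16-bit words in the arg
    pushed_bytes -- bytes the wrapper has pushed before this arg
    """
    offsets = []
    for w in range(n_words):
        # Word w sits 2*w bytes above the low word, and the w words of
        # this arg that were already re-pushed moved S down by 2*w more.
        offsets.append(orig_offset + 2 * w + pushed_bytes + 2 * w)
    return offsets


def simulate(caller_words, orig_offset, pushed_bytes):
    """Re-push one arg word by word and return the words that were read."""
    s0 = 0x1000  # arbitrary stack pointer value at wrapper entry
    mem = {s0 + orig_offset + 2 * i: v for i, v in enumerate(caller_words)}
    s = s0 - pushed_bytes
    read = []
    for n in word_offsets(orig_offset, len(caller_words), pushed_bytes):
        read.append(mem[s + n])  # lda n,s
        s -= 2                   # pha
    return read
```

Under this model, a load that reuses the same base offset for the second word reads the low word again, because the intervening `pha` has already shifted S by 2.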
@ -3601,6 +3601,345 @@ EOF
|
||||||
fi
|
fi
|
||||||
rm -f "$cDpFile" "$oDpFile" "$binDpFile"
|
rm -f "$cDpFile" "$oDpFile" "$binDpFile"
|
||||||
|
|
||||||
|
# Memory-backed file I/O. mfsRegister stages a buffer as a
|
||||||
|
# named file, then fopen/fread/fwrite/fseek/ftell/fclose
|
||||||
|
# operate on it. Verifies fopen returns a non-NULL FILE,
|
||||||
|
# fread copies bytes into the caller's buffer, ftell advances,
|
||||||
|
# fseek rewinds, fclose succeeds, and fprintf into a writable
|
||||||
|
# in-memory file produces the expected formatted bytes.
|
||||||
|
log "check: MAME runs memory-backed stdio (fopen/fread/fseek/fprintf)"
|
||||||
|
cFioFile="$(mktemp --suffix=.c)"
|
||||||
|
oFioFile="$(mktemp --suffix=.o)"
|
||||||
|
binFioFile="$(mktemp --suffix=.bin)"
|
||||||
|
cat > "$cFioFile" <<'EOF'
|
||||||
|
extern int mfsRegister(const char *path, void *buf, unsigned int size, unsigned int cap, int writable);
|
||||||
|
extern struct __sFILE *fopen(const char *path, const char *mode);
|
||||||
|
extern unsigned int fread(void *p, unsigned int s, unsigned int n, struct __sFILE *f);
|
||||||
|
extern int fseek(struct __sFILE *f, long off, int whence);
|
||||||
|
extern long ftell(struct __sFILE *f);
|
||||||
|
extern int fclose(struct __sFILE *f);
|
||||||
|
extern int fgetc(struct __sFILE *f);
|
||||||
|
extern int fprintf(struct __sFILE *f, const char *fmt, ...);
|
||||||
|
extern int strcmp(const char *a, const char *b);
|
||||||
|
__attribute__((noinline)) void switchToBank2(void) {
|
||||||
|
__asm__ volatile ("sep #0x20\n.byte 0xa9,0x02\npha\nplb\nrep #0x20\n");
|
||||||
|
}
|
||||||
|
static char data[14] = "Hello, world!";
|
||||||
|
static char wbuf[64];
|
||||||
|
static char rbuf[32];
|
||||||
|
int main(void) {
|
||||||
|
unsigned short ok = 0;
|
||||||
|
if (mfsRegister("greet", data, 13, 13, 0) == 0) ok |= 0x01;
|
||||||
|
struct __sFILE *f = fopen("greet", "r");
|
||||||
|
if (f) ok |= 0x02;
|
||||||
|
unsigned int n = fread(rbuf, 1, 13, f);
|
||||||
|
rbuf[13] = 0;
|
||||||
|
if (n == 13 && strcmp(rbuf, "Hello, world!") == 0) ok |= 0x04;
|
||||||
|
if (ftell(f) == 13L) ok |= 0x08;
|
||||||
|
fseek(f, 0L, 0);
|
||||||
|
if (fgetc(f) == 'H') ok |= 0x10;
|
||||||
|
if (fclose(f) == 0) ok |= 0x20;
|
||||||
|
if (mfsRegister("out", wbuf, 0, 64, 1) == 0) ok |= 0x40;
|
||||||
|
f = fopen("out", "w");
|
||||||
|
int wlen = fprintf(f, "n=%d", 42);
|
||||||
|
if (wlen == 4 && wbuf[0] == 'n' && wbuf[1] == '=' && wbuf[2] == '4' && wbuf[3] == '2')
|
||||||
|
ok |= 0x80;
|
||||||
|
switchToBank2();
|
||||||
|
*(volatile unsigned short *)0x5000 = ok;
|
||||||
|
while (1) {}
|
||||||
|
}
|
||||||
|
EOF
|
||||||
|
"$CLANG" --target=w65816 -O2 -ffunction-sections -c \
|
||||||
|
"$cFioFile" -o "$oFioFile"
|
||||||
|
"$PROJECT_ROOT/tools/link816" -o "$binFioFile" --text-base 0x1000 \
|
||||||
|
"$oCrt0F" "$oLibcF" "$oExtrasF" "$oSnprintfF" \
|
||||||
|
"$oSfF" "$oSdF" "$oLibgccFile" "$oFioFile" \
|
||||||
|
>/dev/null 2>&1
|
||||||
|
if ! bash "$PROJECT_ROOT/scripts/runInMame.sh" "$binFioFile" --check \
|
||||||
|
0x025000=00ff >/dev/null 2>&1; then
|
||||||
|
die "MAME: memory-backed file I/O bitmap != 0xFF (mfsRegister/fopen/fread/fseek/fprintf regression)"
|
||||||
|
fi
|
||||||
|
rm -f "$cFioFile" "$oFioFile" "$binFioFile"
|
||||||
|
|
||||||
|
# wchar.h + signal.h. wcslen/wcscmp/wcscpy/wcschr cover the
|
||||||
|
# core wide-char family; mbtowc/wctomb verify the trivial 1:1
|
||||||
|
# Latin-1 mapping. signal()/raise() are exercised by
|
||||||
|
# installing a handler, raising, and verifying the handler ran.
|
||||||
|
log "check: MAME runs wchar.h + signal.h core API"
|
||||||
|
cWsFile="$(mktemp --suffix=.c)"
|
||||||
|
oWsFile="$(mktemp --suffix=.o)"
|
||||||
|
binWsFile="$(mktemp --suffix=.bin)"
|
||||||
|
cat > "$cWsFile" <<'EOF'
|
||||||
|
#include <wchar.h>
|
||||||
|
#include <signal.h>
|
||||||
|
__attribute__((noinline)) void switchToBank2(void) {
|
||||||
|
__asm__ volatile ("sep #0x20\n.byte 0xa9,0x02\npha\nplb\nrep #0x20\n");
|
||||||
|
}
|
||||||
|
static volatile int sigSeen = 0;
|
||||||
|
static void onSig(int s) { sigSeen = s; }
|
||||||
|
int main(void) {
|
||||||
|
unsigned short ok = 0;
|
||||||
|
static const wchar_t hello[] = { 'h','e','l','l','o',0 };
|
||||||
|
static const wchar_t hellp[] = { 'h','e','l','l','p',0 };
|
||||||
|
wchar_t buf[16];
|
||||||
|
if (wcslen(hello) == 5) ok |= 0x01;
|
||||||
|
if (wcscmp(hello, hello) == 0) ok |= 0x02;
|
||||||
|
if (wcscmp(hello, hellp) < 0) ok |= 0x04;
|
||||||
|
wcscpy(buf, hello);
|
||||||
|
if (wcscmp(buf, hello) == 0) ok |= 0x08;
|
||||||
|
if (wcschr(hello, 'l') == hello + 2) ok |= 0x10;
|
||||||
|
char mb[8]; wchar_t wc;
|
||||||
|
int n = mbtowc(&wc, "A", 1);
|
||||||
|
if (n == 1 && wc == 'A') ok |= 0x20;
|
||||||
|
if (wctomb(mb, 'Z') == 1 && mb[0] == 'Z') ok |= 0x40;
|
||||||
|
// signal: install handler, raise, verify it fired.
|
||||||
|
signal(SIGABRT, onSig); // would normally abort; we override
|
||||||
|
signal(SIGFPE, onSig);
|
||||||
|
raise(SIGFPE);
|
||||||
|
if (sigSeen == SIGFPE) ok |= 0x80;
|
||||||
|
switchToBank2();
|
||||||
|
*(volatile unsigned short *)0x5000 = ok;
|
||||||
|
while (1) {}
|
||||||
|
}
|
||||||
|
EOF
|
||||||
|
"$CLANG" --target=w65816 -O2 -ffunction-sections -I"$PROJECT_ROOT/runtime/include" -c \
|
||||||
|
"$cWsFile" -o "$oWsFile"
|
||||||
|
"$PROJECT_ROOT/tools/link816" -o "$binWsFile" --text-base 0x1000 \
|
||||||
|
"$oCrt0F" "$oLibcF" "$oExtrasF" "$oLibgccFile" "$oWsFile" \
|
||||||
|
>/dev/null 2>&1
|
||||||
|
if ! bash "$PROJECT_ROOT/scripts/runInMame.sh" "$binWsFile" --check \
|
||||||
|
0x025000=00ff >/dev/null 2>&1; then
|
||||||
|
die "MAME: wchar/signal core != 0xFF (wcs* / mbtowc / signal/raise regression)"
|
||||||
|
fi
|
||||||
|
rm -f "$cWsFile" "$oWsFile" "$binWsFile"
|
||||||
|
|
||||||
|
# C++ subset: classes, single inheritance, virtual functions,
|
||||||
|
# polymorphism via base-class pointer arrays, virtual dtors.
|
||||||
|
# Compiled with -fno-exceptions -fno-rtti (the supported subset
|
||||||
|
# — full RTTI / exceptions / multi-inheritance with virtual
|
||||||
|
# bases are not supported).
|
||||||
|
log "check: MAME runs C++ polymorphism (virtuals + single inheritance)"
|
||||||
|
cppFile="$(mktemp --suffix=.cpp)"
|
||||||
|
oCppFile="$(mktemp --suffix=.o)"
|
||||||
|
binCppFile="$(mktemp --suffix=.bin)"
|
||||||
|
cat > "$cppFile" <<'EOF'
|
||||||
|
extern "C" __attribute__((noinline)) void switchToBank2(void) {
|
||||||
|
__asm__ volatile ("sep #0x20\n.byte 0xa9,0x02\npha\nplb\nrep #0x20\n");
|
||||||
|
}
|
||||||
|
class Shape {
|
||||||
|
public:
|
||||||
|
virtual int area() const = 0;
|
||||||
|
virtual int perimeter() const = 0;
|
||||||
|
virtual ~Shape() {}
|
||||||
|
};
|
||||||
|
class Rect : public Shape {
|
||||||
|
int w, h;
|
||||||
|
public:
|
||||||
|
Rect(int w, int h) : w(w), h(h) {}
|
||||||
|
int area() const override { return w * h; }
|
||||||
|
int perimeter() const override { return 2 * (w + h); }
|
||||||
|
};
|
||||||
|
class Square : public Rect {
|
||||||
|
public:
|
||||||
|
Square(int s) : Rect(s, s) {}
|
||||||
|
};
|
||||||
|
class Circle : public Shape {
|
||||||
|
int r;
|
||||||
|
public:
|
||||||
|
Circle(int r) : r(r) {}
|
||||||
|
int area() const override { return (314 * r * r) / 100; }
|
||||||
|
int perimeter() const override { return (628 * r) / 100; }
|
||||||
|
};
|
||||||
|
static int sumAreas(Shape **shapes, int n) {
|
||||||
|
int total = 0;
|
||||||
|
for (int i = 0; i < n; i++) total += shapes[i]->area();
|
||||||
|
return total;
|
||||||
|
}
|
||||||
|
extern "C" int main(void) {
|
||||||
|
Rect r(3, 4); Square s(5); Circle c(2);
|
||||||
|
Shape *arr[3] = { &r, &s, &c };
|
||||||
|
int total = sumAreas(arr, 3);
|
||||||
|
int ok = 0;
|
||||||
|
if (r.area() == 12) ok |= 1;
|
||||||
|
if (r.perimeter() == 14) ok |= 2;
|
||||||
|
if (s.area() == 25) ok |= 4;
|
||||||
|
if (c.area() == 12) ok |= 8;
|
||||||
|
if (total == 49) ok |= 0x10;
|
||||||
|
switchToBank2();
|
||||||
|
*(volatile unsigned short *)0x5000 = (unsigned short)ok;
|
||||||
|
while (1) {}
|
||||||
|
}
|
||||||
|
EOF
|
||||||
|
"$PROJECT_ROOT/tools/llvm-mos-build/bin/clang++" --target=w65816 -O2 \
|
||||||
|
-ffunction-sections -fno-exceptions -fno-rtti \
|
||||||
|
-c "$cppFile" -o "$oCppFile"
|
||||||
|
"$PROJECT_ROOT/tools/link816" -o "$binCppFile" --text-base 0x1000 \
|
||||||
|
"$oCrt0F" "$oLibgccFile" "$oCppFile" \
|
||||||
|
>/dev/null 2>&1
|
||||||
|
if ! bash "$PROJECT_ROOT/scripts/runInMame.sh" "$binCppFile" --check \
|
||||||
|
0x025000=001f >/dev/null 2>&1; then
|
||||||
|
die "MAME: C++ polymorphism != 0x1F (vtable / virtual call regression)"
|
||||||
|
fi
|
||||||
|
rm -f "$cppFile" "$oCppFile" "$binCppFile"
|
||||||
|
|
||||||
|
# Real-world: hex dumper using memory-backed file I/O. Reads
|
||||||
|
# 16 bytes from a registered "in" file, writes a hex+ASCII
|
||||||
|
# dump to a registered "out" file via fprintf. Verifies the
|
||||||
|
# output via strstr lookups (clang dead-code-eliminates direct
|
||||||
|
# static-buffer byte reads after extern fn calls; strstr defeats that).
|
||||||
|
log "check: MAME runs hex dumper (file I/O + fprintf real-world)"
|
||||||
|
cHdFile="$(mktemp --suffix=.c)"
|
||||||
|
oHdFile="$(mktemp --suffix=.o)"
|
||||||
|
binHdFile="$(mktemp --suffix=.bin)"
|
||||||
|
cat > "$cHdFile" <<'EOF'
|
||||||
|
extern int mfsRegister(const char *path, void *buf, unsigned int size, unsigned int cap, int writable);
|
||||||
|
extern struct __sFILE *fopen(const char *path, const char *mode);
|
||||||
|
extern int fclose(struct __sFILE *f);
|
||||||
|
extern int fgetc(struct __sFILE *f);
|
||||||
|
extern int fprintf(struct __sFILE *f, const char *fmt, ...);
|
||||||
|
extern char *strstr(const char *h, const char *n);
|
||||||
|
__attribute__((noinline)) void switchToBank2(void) {
|
||||||
|
__asm__ volatile ("sep #0x20\n.byte 0xa9,0x02\npha\nplb\nrep #0x20\n");
|
||||||
|
}
|
||||||
|
__attribute__((noinline)) void hexdump(struct __sFILE *in, struct __sFILE *out) {
|
||||||
|
unsigned int offset = 0;
|
||||||
|
unsigned char line[16];
|
||||||
|
int linelen;
|
||||||
|
while (1) {
|
||||||
|
linelen = 0;
|
||||||
|
while (linelen < 16) {
|
||||||
|
int c = fgetc(in);
|
||||||
|
if (c < 0) break;
|
||||||
|
line[linelen++] = (unsigned char)c;
|
||||||
|
}
|
||||||
|
if (linelen == 0) break;
|
||||||
|
fprintf(out, "%04x: ", offset);
|
||||||
|
for (int i = 0; i < 16; i++) {
|
||||||
|
if (i < linelen) fprintf(out, "%02x ", line[i]);
|
||||||
|
else fprintf(out, " ");
|
||||||
|
}
|
||||||
|
fprintf(out, " |");
|
||||||
|
for (int i = 0; i < linelen; i++) {
|
||||||
|
unsigned char c = line[i];
|
||||||
|
int p = (c >= 0x20 && c < 0x7F) ? c : '.';
|
||||||
|
fprintf(out, "%c", p);
|
||||||
|
}
|
||||||
|
fprintf(out, "|\n");
|
||||||
|
offset += linelen;
|
||||||
|
if (linelen < 16) break;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
static char input[16] = { 'H','e','l','l','o','!','\n','A','B','C',0,1,2,3,4,5 };
|
||||||
|
static char output[300];
|
||||||
|
int main(void) {
|
||||||
|
mfsRegister("in", input, 16, 16, 0);
|
||||||
|
mfsRegister("out", output, 0, 300, 1);
|
||||||
|
struct __sFILE *in = fopen("in", "r");
|
||||||
|
struct __sFILE *out = fopen("out", "w");
|
||||||
|
hexdump(in, out);
|
||||||
|
fclose(in); fclose(out);
|
||||||
|
int ok = 0;
|
||||||
|
if (strstr(output, "0000:")) ok |= 1;
|
||||||
|
if (strstr(output, "48 65 6c 6c 6f 21")) ok |= 2;
|
||||||
|
if (strstr(output, "|Hello!.ABC......|")) ok |= 4;
|
||||||
|
switchToBank2();
|
||||||
|
*(volatile unsigned short *)0x5000 = (unsigned short)ok;
|
||||||
|
while (1) {}
|
||||||
|
}
|
||||||
|
EOF
|
||||||
|
"$CLANG" --target=w65816 -O2 -ffunction-sections -c \
|
||||||
|
"$cHdFile" -o "$oHdFile"
|
||||||
|
"$PROJECT_ROOT/tools/link816" -o "$binHdFile" --text-base 0x1000 \
|
||||||
|
"$oCrt0F" "$oLibcF" "$oExtrasF" "$oSnprintfF" \
|
||||||
|
"$oSfF" "$oSdF" "$oLibgccFile" "$oHdFile" \
|
||||||
|
>/dev/null 2>&1
|
||||||
|
if ! bash "$PROJECT_ROOT/scripts/runInMame.sh" "$binHdFile" --check \
|
||||||
|
0x025000=0007 >/dev/null 2>&1; then
|
||||||
|
die "MAME: hex dumper output strstr lookups failed"
|
||||||
|
fi
|
||||||
|
rm -f "$cHdFile" "$oHdFile" "$binHdFile"
|
||||||
|
|
||||||
|
# Real-world: JSON tokenizer. Walks a literal JSON string,
|
||||||
|
# producing token-type counts. Exercises a state machine
|
||||||
|
# over char-by-char input, mixed string/number/keyword
|
||||||
|
# parsing, strncmp on keywords, and 16-bit globals. ~50
|
||||||
|
# lines of code, ~10 distinct token types.
|
||||||
|
log "check: MAME runs JSON tokenizer (state machine + strncmp)"
|
||||||
|
cJsFile="$(mktemp --suffix=.c)"
|
||||||
|
oJsFile="$(mktemp --suffix=.o)"
|
||||||
|
binJsFile="$(mktemp --suffix=.bin)"
|
||||||
|
cat > "$cJsFile" <<'EOF'
|
||||||
|
extern int strncmp(const char *a, const char *b, unsigned int n);
|
||||||
|
__attribute__((noinline)) void switchToBank2(void) {
|
||||||
|
__asm__ volatile ("sep #0x20\n.byte 0xa9,0x02\npha\nplb\nrep #0x20\n");
|
||||||
|
}
|
||||||
|
enum { TOK_LBRACE, TOK_RBRACE, TOK_LBRACK, TOK_RBRACK, TOK_COMMA, TOK_COLON,
|
||||||
|
TOK_STRING, TOK_NUMBER, TOK_TRUE, TOK_FALSE, TOK_NULL, TOK_EOF, TOK_ERR };
|
||||||
|
static const char *p;
|
||||||
|
static int counts[16];
|
||||||
|
__attribute__((noinline)) static int nextToken(void) {
|
||||||
|
while (*p == ' ' || *p == '\t' || *p == '\n' || *p == '\r') p++;
|
||||||
|
if (*p == 0) return TOK_EOF;
|
||||||
|
if (*p == '{') { p++; return TOK_LBRACE; }
|
||||||
|
if (*p == '}') { p++; return TOK_RBRACE; }
|
||||||
|
if (*p == '[') { p++; return TOK_LBRACK; }
|
||||||
|
if (*p == ']') { p++; return TOK_RBRACK; }
|
||||||
|
if (*p == ',') { p++; return TOK_COMMA; }
|
||||||
|
if (*p == ':') { p++; return TOK_COLON; }
|
||||||
|
if (*p == '"') {
|
||||||
|
p++;
|
||||||
|
while (*p && *p != '"') p++;
|
||||||
|
if (*p == '"') p++;
|
||||||
|
return TOK_STRING;
|
||||||
|
}
|
||||||
|
if (*p == '-' || (*p >= '0' && *p <= '9')) {
|
||||||
|
if (*p == '-') p++;
|
||||||
|
while (*p >= '0' && *p <= '9') p++;
|
||||||
|
return TOK_NUMBER;
|
||||||
|
}
|
||||||
|
if (strncmp(p, "true", 4) == 0) { p += 4; return TOK_TRUE; }
|
||||||
|
if (strncmp(p, "false", 5) == 0) { p += 5; return TOK_FALSE; }
|
||||||
|
if (strncmp(p, "null", 4) == 0) { p += 4; return TOK_NULL; }
|
||||||
|
return TOK_ERR;
|
||||||
|
}
|
||||||
|
__attribute__((noinline)) static void tokenize(const char *src) {
|
||||||
|
p = src;
|
||||||
|
int t;
|
||||||
|
while ((t = nextToken()) != TOK_EOF && t != TOK_ERR) {
|
||||||
|
if (t < 16) counts[t]++;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
int main(void) {
|
||||||
|
static const char input[] =
|
||||||
|
"{\"name\": \"alice\", \"age\": 30, \"isCool\": true, \"things\": [1, 2, null]}";
|
||||||
|
tokenize(input);
|
||||||
|
int ok = 0;
|
||||||
|
if (counts[TOK_LBRACE] == 1) ok |= 0x01;
|
||||||
|
if (counts[TOK_RBRACE] == 1) ok |= 0x02;
|
||||||
|
if (counts[TOK_LBRACK] == 1) ok |= 0x04;
|
||||||
|
if (counts[TOK_RBRACK] == 1) ok |= 0x08;
|
||||||
|
if (counts[TOK_COMMA] == 5) ok |= 0x10;
|
||||||
|
if (counts[TOK_COLON] == 4) ok |= 0x20;
|
||||||
|
if (counts[TOK_STRING] == 5) ok |= 0x40;
|
||||||
|
if (counts[TOK_NUMBER] == 3) ok |= 0x80;
|
||||||
|
if (counts[TOK_TRUE] == 1) ok |= 0x100;
|
||||||
|
if (counts[TOK_NULL] == 1) ok |= 0x200;
|
||||||
|
switchToBank2();
|
||||||
|
*(volatile unsigned short *)0x5000 = (unsigned short)ok;
|
||||||
|
while (1) {}
|
||||||
|
}
|
||||||
|
EOF
|
||||||
|
"$CLANG" --target=w65816 -O2 -ffunction-sections -c \
|
||||||
|
"$cJsFile" -o "$oJsFile"
|
||||||
|
"$PROJECT_ROOT/tools/link816" -o "$binJsFile" --text-base 0x1000 \
|
||||||
|
"$oCrt0F" "$oLibcF" "$oLibgccFile" "$oJsFile" \
|
||||||
|
>/dev/null 2>&1
|
||||||
|
if ! bash "$PROJECT_ROOT/scripts/runInMame.sh" "$binJsFile" --check \
|
||||||
|
0x025000=03ff >/dev/null 2>&1; then
|
||||||
|
die "MAME: JSON tokenizer count bitmap != 0x3ff"
|
||||||
|
fi
|
||||||
|
rm -f "$cJsFile" "$oJsFile" "$binJsFile"
|
||||||
|
|
||||||
rm -f "$oLibcF" "$oStrtolF" "$oSnprintfF" "$oQsortF" \
|
rm -f "$oLibcF" "$oStrtolF" "$oSnprintfF" "$oQsortF" \
|
||||||
"$oExtrasF" "$oStrtokF" "$oMathF" "$oSfF" "$oSdF" "$oCrt0F"
|
"$oExtrasF" "$oStrtokF" "$oMathF" "$oSfF" "$oSdF" "$oCrt0F"
|
||||||
else
|
else
|
||||||
|
|
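Review note on the bitmap idiom used by the MAME checks in this hunk: each sub-check sets one bit of `ok`, and the harness compares the word at `0x025000` with the expected mask. When a check fails, the missing bits identify which sub-checks did not pass. A minimal Python sketch of that decoding follows; `failed_checks` and the `STDIO_CHECKS` labels are hypothetical helpers written for illustration, with the labels taken from the bit assignments in the memory-backed stdio test.

```python
# Hypothetical helper; not part of runInMame.sh or the smoke test.

# Bit 0 upward, following the `ok |= ...` lines in the stdio test.
STDIO_CHECKS = [
    "mfsRegister(greet)",  # 0x01
    "fopen",               # 0x02
    "fread + strcmp",      # 0x04
    "ftell",               # 0x08
    "fseek + fgetc",       # 0x10
    "fclose",              # 0x20
    "mfsRegister(out)",    # 0x40
    "fprintf",             # 0x80
]


def failed_checks(observed, expected, names):
    """Return the names of checks whose bit is expected but not observed."""
    missing = expected & ~observed
    return [name for bit, name in enumerate(names) if missing & (1 << bit)]
```

For example, an observed value of `0x00F7` against an expected `0x00FF` points at the `ftell` check alone.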
@ -3628,54 +3967,68 @@ EOF
|
||||||
die "inline asm: 'inc a' missing from output"
|
die "inline asm: 'inc a' missing from output"
|
||||||
fi
|
fi
|
||||||
|
|
||||||
|
# bench.sh runs the size-comparison harness against Calypsi.
|
||||||
|
# Smoke just verifies it produces a non-empty markdown table —
|
||||||
|
# actual ratios are reported in STATUS.
|
||||||
|
log "check: scripts/bench.sh runs (size vs Calypsi)"
|
||||||
|
benchOut="$(mktemp)"
|
||||||
|
bash "$PROJECT_ROOT/scripts/bench.sh" >"$benchOut" 2>/dev/null
|
||||||
|
if ! grep -q '^| \*\*total\*\*' "$benchOut"; then
|
||||||
|
die "bench.sh did not produce a totals row"
|
||||||
|
fi
|
||||||
|
rm -f "$benchOut"
|
||||||
|
|
||||||
# iigs/toolbox.h compiles cleanly and emits the JSL $E10000 dispatch
|
# iigs/toolbox.h compiles cleanly and emits the JSL $E10000 dispatch
|
||||||
# for at least one wrapper. Don't run in MAME (toolbox needs the
|
# for at least one wrapper. Don't run in MAME (toolbox needs the
|
||||||
# real ROM dispatcher, smoke runs in bare-CPU mode); just check
|
# real ROM dispatcher, smoke runs in bare-CPU mode); just check
|
||||||
# the codegen.
|
# the codegen.
|
||||||
log "check: iigs/toolbox.h wrappers compile and emit JSL E10000"
|
# iigs/toolbox.h — autogenerated wrappers for the entire IIgs
|
||||||
|
# toolbox (~1300 routines from 35 tool sets, sourced from ORCA-C
|
||||||
|
# ORCACDefs/ via scripts/genToolbox.py). Names match Apple's
|
||||||
|
# IIgs Toolbox Reference (TLStartUp, MMStartUp, NewWindow,
|
||||||
|
# SysBeep, ...). Verify the header compiles, the multi-arg
|
||||||
|
# asm bodies in iigsToolbox.s assemble, and that linking
|
||||||
|
# together produces a binary that emits the JSL $E10000 (Tool
|
||||||
|
# Locator) and JSL $E100A8 (GS/OS) dispatchers. Don't run in
|
||||||
|
# MAME (toolbox needs the real ROM dispatcher).
|
||||||
|
log "check: iigs/toolbox.h (autogenerated, ~1300 routines, Apple names)"
|
||||||
cToolFile="$(mktemp --suffix=.c)"
|
cToolFile="$(mktemp --suffix=.c)"
|
||||||
sToolFile="$(mktemp --suffix=.s)"
|
sToolFile="$(mktemp --suffix=.s)"
|
||||||
trap 'rm -f "$irFile" "$sFile" "$irCallFile" "$sCallFile" "$irMaFile" "$sMaFile" "$irI8File" "$sI8File" "$cFile" "$oFile2" "$cI32File" "$oI32File" "$cFibFile" "$sFibFile" "$cMulFile" "$sMulFile" "$cAllocaFile" "$sAllocaFile" "$cStrFile" "$sStrFile" "$cIndFile" "$sIndFile" "$irCoalesceFile" "$sCoalesceFile" "$cMixFile" "$sMixFile" "$cLinkFile" "$oLinkFile" "$oLibgccFile" "$binLinkFile" "$mapLinkFile" "$cFltFile" "$oFltFile" "$oSfFile" "$binFltFile" "$mapFltFile" "$cAsmFile" "$sAsmFile" "$cToolFile" "$sToolFile"' EXIT
|
trap 'rm -f "$irFile" "$sFile" "$irCallFile" "$sCallFile" "$irMaFile" "$sMaFile" "$irI8File" "$sI8File" "$cFile" "$oFile2" "$cI32File" "$oI32File" "$cFibFile" "$sFibFile" "$cMulFile" "$sMulFile" "$cAllocaFile" "$sAllocaFile" "$cStrFile" "$sStrFile" "$cIndFile" "$sIndFile" "$irCoalesceFile" "$sCoalesceFile" "$cMixFile" "$sMixFile" "$cLinkFile" "$oLinkFile" "$oLibgccFile" "$binLinkFile" "$mapLinkFile" "$cFltFile" "$oFltFile" "$oSfFile" "$binFltFile" "$mapFltFile" "$cAsmFile" "$sAsmFile" "$cToolFile" "$sToolFile"' EXIT
|
||||||
cat > "$cToolFile" <<'EOF'
|
cat > "$cToolFile" <<'EOF'
|
||||||
#include <iigs/toolbox.h>
|
#include <iigs/toolbox.h>
|
||||||
void greet(void) {
|
// Cover wrappers across multiple tool sets to verify the header
|
||||||
TBoxWriteCString("Hello");
|
// compiles and the multi-arg asm bodies in iigsToolbox.s link.
|
||||||
TBoxBeep();
|
void useToolLocator(void) {
|
||||||
|
TLStartUp(); TLShutDown(); TLBootInit(); TLReset();
|
||||||
|
unsigned short v = TLVersion(); (void)v;
|
||||||
}
|
}
|
||||||
// Cover all wrappers: ensures the multi-arg ones (declared extern in
|
void useMM(void) {
|
||||||
// the header, implemented in iigsToolbox.s) at least link.
|
unsigned short id = MMStartUp();
|
||||||
void everything(void) {
|
MMShutDown(id);
|
||||||
|
void *h = NewHandle(1024UL, id, 0, 0UL);
|
||||||
|
DisposeHandle(h);
|
||||||
|
}
|
||||||
|
void useEvent(void) {
|
||||||
|
unsigned short b = Button(0); (void)b;
|
||||||
|
unsigned long t = TickCount(); (void)t;
|
||||||
|
}
|
||||||
|
void useQD(void) {
|
||||||
short rect[4] = {0, 0, 100, 100};
|
short rect[4] = {0, 0, 100, 100};
|
||||||
char buf[20];
|
PaintRect(rect); FrameRect(rect); MoveTo(50, 50);
|
||||||
char buf2[16];
|
|
||||||
TBoxTLStartUp(); TBoxTLShutDown();
|
|
||||||
unsigned short id = TBoxMMStartUp();
|
|
||||||
unsigned long h = TBoxNewHandle(1024UL, id, 0, 0UL);
|
|
||||||
TBoxDisposeHandle(h);
|
|
||||||
TBoxMMShutDown(id);
|
|
||||||
TBoxReadAsciiTime(buf);
|
|
||||||
TBoxMoveTo(10, 20);
|
|
||||||
TBoxFrameRect(rect); TBoxPaintRect(rect); TBoxEraseRect(rect);
|
|
||||||
TBoxDrawString("\005hello");
|
|
||||||
TBoxQDStartUp(0x80, 0x1A00, id); TBoxQDShutDown();
|
|
||||||
TBoxEMStartUp(id); TBoxEMShutDown(); TBoxSystemTask();
|
|
||||||
TBoxGetNextEvent(0xFFFF, buf2);
|
|
||||||
void *win = TBoxNewWindow((const void *)0x5000);
|
|
||||||
TBoxCloseWindow(win);
|
|
||||||
char k = TBoxReadKey();
|
|
||||||
(void)k;
|
|
||||||
}
|
}
|
||||||
|
void useMisc(void) { SysBeep(); }
|
||||||
EOF
|
EOF
|
||||||
"$CLANG" --target=w65816 -O2 -I"$PROJECT_ROOT/runtime/include" \
|
"$CLANG" --target=w65816 -O2 -I"$PROJECT_ROOT/runtime/include" \
|
||||||
-S "$cToolFile" -o "$sToolFile"
|
-S "$cToolFile" -o "$sToolFile"
|
||||||
if ! grep -qE '\bjsl\s+0xe10000\b' "$sToolFile"; then
|
if ! grep -qE '\bjsl\s+0xe10000\b' "$sToolFile"; then
|
||||||
die "iigs/toolbox.h: JSL \$E10000 (Tool Locator) not emitted"
|
die "iigs/toolbox.h: JSL \$E10000 (Tool Locator) not emitted"
|
||||||
fi
|
fi
|
||||||
if ! grep -qE '\bldx\s+#0x290[Bb]\b' "$sToolFile"; then
|
# SysBeep tool number $2C03 per ORCA (function $2C of Misc Tools $03).
|
||||||
die "iigs/toolbox.h: WriteCString tool number 0x290B not in output"
|
# Match case-insensitively — clang lowercases hex constants.
|
||||||
|
if ! grep -qiE '\bldx\s+#0x2c03\b' "$sToolFile"; then
|
||||||
|
die "iigs/toolbox.h: SysBeep tool number 0x2C03 not in output"
|
||||||
fi
|
fi
|
||||||
# Make sure the multi-arg wrappers in iigsToolbox.s assemble and
|
|
||||||
# linking the test object against them succeeds.
|
|
||||||
oToolFile="$(mktemp --suffix=.o)"
|
oToolFile="$(mktemp --suffix=.o)"
|
||||||
oToolboxAsm="$(mktemp --suffix=.o)"
|
oToolboxAsm="$(mktemp --suffix=.o)"
|
||||||
"$CLANG" --target=w65816 -O2 -I"$PROJECT_ROOT/runtime/include" \
|
"$CLANG" --target=w65816 -O2 -I"$PROJECT_ROOT/runtime/include" \
|
||||||
|
|
|
||||||
|
|
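Review note on the tool-number check above: the comment describes `$2C03` as function `$2C` of tool set `$03`, so the value loaded into X packs the function number in the high byte and the tool set in the low byte. The Python sketch below illustrates that packing and a case-insensitive match equivalent in spirit to the `grep -qiE` test; `tool_number` and `has_tool_load` are hypothetical names, and the sample assembly line is made up for the example.

```python
import re


def tool_number(function, tool_set):
    """Pack a toolbox call number: function in the high byte, tool set in the low byte."""
    return ((function & 0xFF) << 8) | (tool_set & 0xFF)


def has_tool_load(asm_text, number):
    """Return True if asm_text contains `ldx #0x<number>`, ignoring hex case."""
    pattern = r"\bldx\s+#0x%04x\b" % number
    return re.search(pattern, asm_text, re.IGNORECASE) is not None
```

With these helpers, `tool_number(0x2C, 0x03)` yields `0x2C03`, and both `ldx #0x2c03` and `ldx #0x2C03` match.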
@ -107,8 +107,11 @@ void W65816AsmPrinter::emitInstruction(const MachineInstr *MI) {
|
||||||
getSubtargetInfo().getFeatureBits());
|
getSubtargetInfo().getFeatureBits());
|
||||||
|
|
||||||
// Drop a SEP that the previous LDAi8imm expansion marked redundant.
|
// Drop a SEP that the previous LDAi8imm expansion marked redundant.
|
||||||
// The LDAi8imm peephole leaves M=8 set when its successor is a SEP
|
// The LDAi8imm peephole leaves M=8 set when its successor (or the
|
||||||
// #$20 — that SEP would re-set the same flag, so we elide it.
|
// next non-mode-neutral MI) is a SEP #$20 — that SEP would re-set
|
||||||
|
// the same flag, so we elide it. Mode-neutral MIs (X-flag-only
|
||||||
|
// index ops, branches, transfers that don't touch A) pass through
|
||||||
|
// freely without invalidating the skip.
|
||||||
if (SkipNextSepImm >= 0 && !MI->isDebugInstr()) {
|
if (SkipNextSepImm >= 0 && !MI->isDebugInstr()) {
|
||||||
if (MI->getOpcode() == W65816::SEP &&
|
if (MI->getOpcode() == W65816::SEP &&
|
||||||
MI->getNumOperands() >= 1 && MI->getOperand(0).isImm() &&
|
MI->getNumOperands() >= 1 && MI->getOperand(0).isImm() &&
|
||||||
|
|
@ -116,10 +119,29 @@ void W65816AsmPrinter::emitInstruction(const MachineInstr *MI) {
|
||||||
SkipNextSepImm = -1;
|
SkipNextSepImm = -1;
|
||||||
return; // consume the SEP, don't emit
|
return; // consume the SEP, don't emit
|
||||||
}
|
}
|
||||||
// Conservative: any non-debug, non-matching MI between LDAi8imm
|
// Check if MI is mode-neutral; if so, pass through and KEEP the skip.
|
||||||
// and the expected SEP invalidates the elision (it might re-clear
|
bool isMNeutral = false;
|
||||||
// M, observe P, etc.). Reset and proceed normally.
|
if (MI->isBranch() || MI->isReturn()) isMNeutral = true;
|
||||||
SkipNextSepImm = -1;
|
else switch (MI->getOpcode()) {
|
||||||
|
case W65816::LDX_Imm16: case W65816::LDX_DP: case W65816::LDX_Abs:
|
||||||
|
case W65816::LDX_DPY: case W65816::LDX_AbsY:
|
||||||
|
case W65816::LDY_Imm16: case W65816::LDY_DP: case W65816::LDY_Abs:
|
||||||
|
case W65816::LDY_DPX: case W65816::LDY_AbsX:
|
||||||
|
case W65816::STX_DP: case W65816::STX_Abs: case W65816::STX_DPY:
|
||||||
|
case W65816::STY_DP: case W65816::STY_Abs: case W65816::STY_DPX:
|
||||||
|
case W65816::INX: case W65816::DEX:
|
||||||
|
case W65816::INY: case W65816::DEY:
|
||||||
|
case W65816::CPX_Imm16: case W65816::CPX_DP: case W65816::CPX_Abs:
|
||||||
|
case W65816::CPY_Imm16: case W65816::CPY_DP: case W65816::CPY_Abs:
|
||||||
|
case W65816::PHX: case W65816::PHY:
|
||||||
|
case W65816::PLX: case W65816::PLY:
|
||||||
|
case W65816::NOP:
|
||||||
|
isMNeutral = true; break;
|
||||||
|
default: break;
|
||||||
|
}
|
||||||
|
// Anything else invalidates the elision (might re-clear M, push/pop
|
||||||
|
// 8-bit P that observes mode, call out, etc.).
|
||||||
|
if (!isMNeutral) SkipNextSepImm = -1;
|
||||||
}
|
}
|
||||||
|
|
||||||
// Drop the STAabs that the LDAi16imm-0 peephole replaced with STZ.
|
// Drop the STAabs that the LDAi16imm-0 peephole replaced with STZ.
|
||||||
|
|
@ -318,8 +340,37 @@ void W65816AsmPrinter::emitInstruction(const MachineInstr *MI) {
|
||||||
Lda.addOperand(MCOperand::createImm(Val));
|
Lda.addOperand(MCOperand::createImm(Val));
|
||||||
EmitToStreamer(*OutStreamer, Lda);
|
EmitToStreamer(*OutStreamer, Lda);
|
||||||
bool SkipRep = false;
|
bool SkipRep = false;
|
||||||
|
// Walk past mode-neutral MIs (X-flag-only ops, branches, transfers
|
||||||
|
// that don't touch A) to find the next SEP/REP — same idea as the
|
||||||
|
// pre-emit REP/SEP scheduler, but applied to LDAi8imm's closing
|
||||||
|
// REP. If a SEP #$20 sits there, we can elide the REP+SEP pair.
|
||||||
|
auto isMNeutralMI = [](const MachineInstr &MI) -> bool {
|
||||||
|
if (MI.isDebugInstr()) return true;
|
||||||
|
if (MI.isBranch() || MI.isReturn()) return true;
|
||||||
|
unsigned O = MI.getOpcode();
|
||||||
|
switch (O) {
|
||||||
|
case W65816::LDX_Imm16: case W65816::LDX_DP: case W65816::LDX_Abs:
|
||||||
|
case W65816::LDX_DPY: case W65816::LDX_AbsY:
|
||||||
|
case W65816::LDY_Imm16: case W65816::LDY_DP: case W65816::LDY_Abs:
|
||||||
|
case W65816::LDY_DPX: case W65816::LDY_AbsX:
|
||||||
|
case W65816::STX_DP: case W65816::STX_Abs: case W65816::STX_DPY:
|
||||||
|
case W65816::STY_DP: case W65816::STY_Abs: case W65816::STY_DPX:
|
||||||
|
case W65816::INX: case W65816::DEX:
|
||||||
|
case W65816::INY: case W65816::DEY:
|
||||||
|
case W65816::CPX_Imm16: case W65816::CPX_DP: case W65816::CPX_Abs:
|
||||||
|
case W65816::CPY_Imm16: case W65816::CPY_DP: case W65816::CPY_Abs:
|
||||||
|
case W65816::PHX: case W65816::PHY:
|
||||||
|
case W65816::PLX: case W65816::PLY:
|
||||||
|
case W65816::CLC: case W65816::SEC:
|
||||||
|
case W65816::PHP: case W65816::PLP:
|
||||||
|
case W65816::NOP:
|
||||||
|
return true;
|
||||||
|
default:
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
};
|
||||||
auto It = std::next(MI->getIterator());
|
auto It = std::next(MI->getIterator());
|
||||||
while (It != MI->getParent()->end() && It->isDebugInstr()) ++It;
|
while (It != MI->getParent()->end() && isMNeutralMI(*It)) ++It;
|
||||||
if (It != MI->getParent()->end() &&
|
if (It != MI->getParent()->end() &&
|
||||||
It->getOpcode() == W65816::SEP &&
|
It->getOpcode() == W65816::SEP &&
|
||||||
It->getNumOperands() >= 1 && It->getOperand(0).isImm() &&
|
It->getNumOperands() >= 1 && It->getOperand(0).isImm() &&
|
||||||
|
|
|
||||||
|
|
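Review note on the lookahead in the LDAi8imm expansion above: after emitting the 8-bit immediate load, the printer walks past instructions that do not depend on the M flag and elides the closing REP only if the next relevant instruction is `SEP #$20`. The Python sketch below models that decision at the mnemonic level. It is a simplification under stated assumptions: `M_NEUTRAL` and `can_elide_closing_rep` are hypothetical names, the set covers only the index-register and NOP cases from the hunk, and branches, returns and debug instructions are left out for brevity.

```python
# Hypothetical mnemonic-level model; the real pass works on MachineInstr opcodes.

# Index-register and NOP cases from the hunk; none of them depends on M.
M_NEUTRAL = {
    "ldx", "ldy", "stx", "sty",
    "inx", "dex", "iny", "dey",
    "cpx", "cpy",
    "phx", "phy", "plx", "ply",
    "nop",
}


def can_elide_closing_rep(following):
    """Decide whether the REP that closes an 8-bit load can be dropped.

    following -- list of (mnemonic, operand) pairs after the load, in order
    """
    for mnemonic, operand in following:
        if mnemonic in M_NEUTRAL:
            continue  # does not care about accumulator width; keep looking
        # The first M-sensitive instruction decides the outcome.
        return mnemonic == "sep" and operand == 0x20
    return False  # ran off the end of the block without finding a SEP
```

The conservative default matters: reaching the end of the block without seeing `SEP #$20` returns False, so the REP is kept.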
@ -307,6 +307,93 @@ bool W65816SepRepCleanup::runOnMachineFunction(MachineFunction &MF) {
|
||||||
Changed = true;
|
Changed = true;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// Extended toggle coalesce — REP/SEP scheduling.
|
||||||
|
//
|
||||||
|
// Walk the MBB looking for `T1 ; ...neutral... ; T2` where T1 and
|
||||||
|
// T2 are opposite-polarity SEP/REP toggles (T1=REP T2=SEP, or
|
||||||
|
// vice versa) with the same imm, and the gap contains only
|
||||||
|
// M-mode-neutral instructions (transfers/branches/X-flag-only
|
||||||
|
// index ops). In that case T1+T2 form a no-op pair around code
|
||||||
|
// that doesn't care about M, so both can be dropped. Equivalent
|
||||||
|
// to "moving the SEP/REP wrap inward to skip the neutral region".
|
||||||
|
//
|
||||||
|
// Saves 4 bytes / 6 cycles per gap collapsed (REP and SEP are each 2 bytes / 3 cycles). The common
|
||||||
|
// trigger is two STA8 stores separated by an LDY for the second
|
||||||
|
// store's address — STA8fi each emit SEP/STA/REP, the existing
|
||||||
|
// adjacent coalesce can't see across the LDY, this pass can.
|
||||||
|
{
|
||||||
|
// Mode-neutral instruction set: don't touch the M-bit and
|
||||||
|
// don't depend on A's width. X-flag dependent ops (LDX/LDY/
|
||||||
|
// STX/STY/INX/DEX/INY/DEY/CPX/CPY/PHX/PHY/PLX/PLY) are
|
||||||
|
// independent of M. So are all branches, JMP/RTL/RTS, CLC/SEC,
|
||||||
|
// NOP, and PHP/PLP (they push 8-bit P regardless of M). Calls
|
||||||
|
// (JSR/JSL) and inline asm are not neutral; they end the walk.
|
||||||
|
auto isMNeutral = [](const MachineInstr &MI) -> bool {
|
||||||
|
if (MI.isDebugInstr()) return true;
|
||||||
|
if (MI.isBranch() || MI.isReturn()) return true;
|
||||||
|
unsigned O = MI.getOpcode();
|
||||||
|
switch (O) {
|
||||||
|
case W65816::LDX_Imm16: case W65816::LDX_DP: case W65816::LDX_Abs:
|
||||||
|
case W65816::LDX_DPY: case W65816::LDX_AbsY:
|
||||||
|
case W65816::LDY_Imm16: case W65816::LDY_DP: case W65816::LDY_Abs:
|
||||||
|
case W65816::LDY_DPX: case W65816::LDY_AbsX:
|
||||||
|
case W65816::STX_DP: case W65816::STX_Abs: case W65816::STX_DPY:
|
||||||
|
case W65816::STY_DP: case W65816::STY_Abs: case W65816::STY_DPX:
|
||||||
|
case W65816::INX: case W65816::DEX:
|
||||||
|
case W65816::INY: case W65816::DEY:
|
||||||
|
case W65816::CPX_Imm16: case W65816::CPX_DP: case W65816::CPX_Abs:
|
||||||
|
case W65816::CPY_Imm16: case W65816::CPY_DP: case W65816::CPY_Abs:
|
||||||
|
case W65816::PHX: case W65816::PHY:
|
||||||
|
case W65816::PLX: case W65816::PLY:
|
||||||
|
case W65816::CLC: case W65816::SEC:
|
||||||
|
case W65816::PHP: case W65816::PLP:
|
||||||
|
case W65816::NOP:
|
||||||
|
return true;
|
||||||
|
default:
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
bool again = true;
|
||||||
|
while (again) {
|
||||||
|
again = false;
|
||||||
|
for (auto It = MBB.begin(); It != MBB.end(); ++It) {
|
||||||
|
unsigned Op1 = It->getOpcode();
|
||||||
|
if (Op1 != W65816::REP && Op1 != W65816::SEP) continue;
|
||||||
|
if (It->getNumOperands() < 1 || !It->getOperand(0).isImm()) continue;
|
||||||
|
int Imm1 = It->getOperand(0).getImm();
|
||||||
|
if (Imm1 != 0x20) continue; // M-bit only
|
||||||
|
// Walk forward across mode-neutral ops looking for the matching
|
||||||
|
// opposite toggle. Bail at calls, asm, ALU ops on A, etc.
|
||||||
|
unsigned WantOp = (Op1 == W65816::REP) ? W65816::SEP : W65816::REP;
|
||||||
|
auto Walker = std::next(It);
|
||||||
|
MachineInstr *Match = nullptr;
|
||||||
|
while (Walker != MBB.end()) {
|
||||||
|
if (Walker->isDebugInstr()) { ++Walker; continue; }
|
||||||
|
unsigned WO = Walker->getOpcode();
|
||||||
|
if (WO == WantOp && Walker->getNumOperands() >= 1 &&
|
||||||
|
Walker->getOperand(0).isImm() &&
|
||||||
|
Walker->getOperand(0).getImm() == Imm1) {
|
||||||
|
Match = &*Walker;
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
// Bail on anything that touches A or otherwise cares about M.
|
||||||
|
if (Walker->isCall() || Walker->isInlineAsm()) break;
|
||||||
|
if (!isMNeutral(*Walker)) break;
|
||||||
|
++Walker;
|
||||||
|
}
|
||||||
|
if (!Match) continue;
|
||||||
|
// Drop both toggles. Erasing changes iterator stability; restart.
|
||||||
|
MachineInstr *T1 = &*It;
|
||||||
|
T1->eraseFromParent();
|
||||||
|
Match->eraseFromParent();
|
||||||
|
Changed = true;
|
||||||
|
again = true;
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
// Second peephole: collapse `ADCi16imm src, ±1/±2` (and SBCi16imm)
|
// Second peephole: collapse `ADCi16imm src, ±1/±2` (and SBCi16imm)
|
||||||
// into INA/DEA chains when the carry flag they would set is unused.
|
// into INA/DEA chains when the carry flag they would set is unused.
|
||||||
// ADCi16imm is a pseudo (expands to CLC+ADC_Imm16); we rewrite it
|
// ADCi16imm is a pseudo (expands to CLC+ADC_Imm16); we rewrite it
|
||||||
|
|
|
||||||
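The extended toggle coalesce in the hunk above can be tried out without LLVM. The following is a minimal sketch on a toy instruction list, not the pass itself: `ToyInst`, `isNeutralToy`, and `coalesceToggles` are hypothetical names introduced for illustration, and the neutral set is a small assumed subset of the opcodes the real `isMNeutral` lambda accepts. Like the pass, it assumes the M state before the first toggle already matches what the second toggle would set.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a MachineInstr: a mnemonic plus one immediate.
struct ToyInst {
    std::string op;
    int imm;
};

// Hypothetical neutrality test. The real pass switches on W65816 opcodes;
// here a few index-register ops and NOP stand in for that list.
static bool isNeutralToy(const ToyInst &I) {
    return I.op == "LDY" || I.op == "LDX" || I.op == "INY" || I.op == "NOP";
}

// Remove every `T1 ; neutral... ; T2` pair where T1 and T2 are opposite
// M-bit toggles (REP/SEP with imm 0x20). Returns the number of pairs removed.
static int coalesceToggles(std::vector<ToyInst> &Code) {
    int Removed = 0;
    bool Again = true;
    while (Again) {
        Again = false;
        for (std::size_t i = 0; i < Code.size(); ++i) {
            const std::string Op = Code[i].op;
            if ((Op != "REP" && Op != "SEP") || Code[i].imm != 0x20)
                continue;
            const std::string Want = (Op == "REP") ? "SEP" : "REP";
            // Walk forward over neutral instructions only.
            std::size_t j = i + 1;
            while (j < Code.size() && isNeutralToy(Code[j]))
                ++j;
            if (j == Code.size() || Code[j].op != Want || Code[j].imm != 0x20)
                continue;
            // Erase the later toggle first so index i stays valid.
            Code.erase(Code.begin() + j);
            Code.erase(Code.begin() + i);
            ++Removed;
            Again = true;
            break;  // restart the scan, as the pass does after erasing
        }
    }
    return Removed;
}
```

Feeding it the sequence `SEP #$20 ; STA ; REP #$20 ; LDY ; SEP #$20 ; STA ; REP #$20` removes the inner `REP`/`SEP` pair around the `LDY` and leaves a single 8-bit region wrapping both stores, which is the two-store case the diff comment describes.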