llvm816 — Current Status
LLVM/Clang backend for the WDC 65816 (Apple IIgs), forked from
llvm-mos as a separate W65816 target.
What works
End-to-end C-to-binary toolchain that produces 65816 machine code which runs correctly under MAME (apple2gs).
Language coverage at -O2 (no extra flags):
- All scalar arithmetic: i8 / i16 / i32 / i64 add, sub, mul, div, mod
(signed and unsigned). Carry-chained multi-word ops via ADC/SBC pseudos + ASLA16 / shift libcalls.
- Comparisons and signed/unsigned widening (sext, zext, trunc) for all the above sizes.
- Pointer arithmetic, array indexing, struct field access, struct return-by-value (up to 8 bytes — Pair, Vec4, double).
- Bitfields, switch statements (verified up to ~12 cases + default),
function pointers, function-pointer tables, indirect calls via
the __jsl_indir trampoline.
- Recursion: factorial, Fibonacci, depth-3 binary-tree insert/sum/min/max, simple recursive quicksort.
- Loops with goto / break / continue, nested loops, state machines.
- <stdarg.h> varargs with int / long / unsigned long long mixed args.
- Heap: malloc/free (libc.c first-fit allocator); linked-list reverse with cons works.
- Strings: hand-rolled strlen, strcmp, strcpy, strchr, atoi/itoa roundtrip.
- Soft-float (single): all four ops + comparisons, MAME-verified.
- Soft-double: add, sub, mul, div all return correct bit patterns bit-for-bit against gcc with round-to-nearest-even rounding; 3-iter Newton sqrt converges. Long-running iterations may hit MAME's 1-second sim-time budget (test config issue, not a compiler bug).
- Inline assembly with "a", "x", "y" register constraints and arbitrary opcode bytes (used for the pha; plb bank-switch idiom).
- C++ minimal: clang++ compiles a class with virtual + non-trivial ctor (vtable + RTTI omitted; no exceptions).
- printf with %d %x %s %c %p and width/precision specifiers.
- sprintf / snprintf / vsprintf / vsnprintf with the same format coverage as printf (%d %u %x %ld %lu %s %c %f %p %% + width). C99 truncation semantics for snprintf. %.Nf produces the correct fractional digits with round-half-up.
- qsort + bsearch over arbitrary element size with a user cmp callback (insertion-sort variant; sidesteps the greedy regalloc bug in the recursive iterative-qsort form).
- Standard string/stdlib glue: strcat, strncat, strpbrk, strspn, strcspn, atol, llabs (kept in their own translation unit so vprintf's branch layout doesn't shift).
- <math.h>: fabs, floor, ceil, fmod, copysign, sqrt, pow, sin, cos, exp, log, atan, atan2, asin, acos, sinh, cosh, tanh (and float variants). Bit-twiddling for fabs/floor/ceil/copysign; Newton iteration for sqrt; range-reduction + Taylor for sin/cos/exp/log/atan; identities for asin/acos/atan2/sinh/cosh/tanh. Accuracy is in the ~1e-6 range: good enough for typical numeric work, far short of glibc quality. These are slow (each call is dozens to hundreds of soft-double libcalls), so pre-compute or cache when possible.
- setjmp/longjmp from libgcc.s.
- Static constructors via crt0's init_array walk.
Toolchain:
- clang/llc produce W65816 assembly + ELF object files.
- tools/link816 resolves cross-translation-unit refs, lays out text/rodata/bss, and emits a flat binary the IIgs ROM can load. Auto-relocates bss above text+rodata when the default --bss-base 0x2000 would overlap text, and skips past the IIgs IO window ($C000-$CFFF) if needed.
- tools/omfEmit produces OMF v2.1 single-segment files (the IIgs's native object format) for round-tripping with classic dev tools.
- runtime/build.sh builds crt0, libc, soft-float, soft-double, libgcc into linkable objects.
- scripts/smokeTest.sh runs 102 end-to-end checks: scalar ops, control flow, calling conventions, MAME execution, regressions, link816 bss-base safety + weak-symbol resolution + heap_end-vs-heap_start sanity, iigs/toolbox.h compile-check, standalone runtime headers, AsmPrinter peepholes for STZ / PEA / PEI (single-STA, shared-LDA-multi-STA, and DPF0-forwarding cases), and malloc/free coalesce ordering. Currently 100% pass at -O2 throughout.
ABI:
- arg0 in A; arg1 in X for i32-first-arg signatures; rest pushed RTL on the system stack with PHA. Caller deallocates via tsc; clc; adc #N; tcs or PLY*N/2.
- Return: i8/i16 in A; i32 in A:X; i64 in A:X:Y plus DP[$F0..$F1] for the highest 16 bits.
- Frame is empty-descending (S points to next-free); offsets account for the +1 skew vs LLVM's full-descending model.
In flight
Two open bugs tracked:
- #107, strtok / qsort -O1+ miscompile: RESOLVED. Three independent issues across the backend, runtime, and linker; all fixed.

  Fix 1 (W65816StackSlotCleanup cross-MBB): Pass -4 / Pass -4c collapsed LDA fs.X; STA stk.Y; ... LDA_indY stk.Y patterns with only an MBB-local safety check, missing cross-MBB readers of stk.Y. Greedy regalloc had spilled an in-place INA result back to stk.Y; eliminating the bb.3 init store left the bb.10 reload reading garbage. Function-wide cross-MBB check added.

  Fix 2 (W65816SepRepCleanup LDAi8imm hoist): Pre-pass that relocates LDAi8imm BEFORE byte-store SEP/REP wraps. LDAi8imm expands at AsmPrinter to its own SEP+LDA8+REP that toggles M; the post-RA scheduler was moving it INSIDE an STBptr wrap, so the LDAi8imm's REP fired BEFORE the byte STA. The STA then ran in M=16, writing 2 bytes of zero and clobbering the next byte. Hoist puts the toggle in the outer M=16 zone, leaving the byte STA in M=8.

  Fix 3 (link816 bss-base safety + strtok_r noinline): With the backend fixes, -O2 strtok grew large enough that the strtok() wrapper inlining (~290 extra bytes) pushed the binary's text+rodata past 0xC000 (IIgs IO window). Reads of string literals or stdio handles in that range hit IO registers and corrupted execution. Two complementary fixes:
  - __attribute__((noinline)) on strtok_r so the wrapper doesn't duplicate it (-O2 strtok.o now 1564B, was 2156B);
  - link816 auto-relocates bss above text+rodata when the default --bss-base 0x2000 would overlap, and skips past the IO window if needed.

  strtok.c now compiles at -O2 with everything else. Smoke #84 (4-call strtok continuation) and #92 (recursive parser) both pass. Workaround comments in build.sh / smokeTest.sh removed.

  The __attribute__((noinline,optnone)) defenses on iterative qsort / RPN runAll / expression-parser runAll were subsequently dropped; the smoke now compiles them at plain -O2 without escape hatches.
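The Fix 2 failure mode (a byte store that executes with M=16 and clobbers its neighbour) can be illustrated with a toy memory model. This is only a host-side sketch: `sta` and `neighbourAfterStore` are hypothetical names, and nothing here is backend code.

```c
#include <stdint.h>

/* Toy model of a 65816 STA: with M=8 the store writes one byte,
   with M=16 it writes two bytes, low byte first. */
static void sta(uint8_t *mem, uint16_t addr, uint16_t a, int m16)
{
    mem[addr] = (uint8_t)(a & 0xFF);
    if (m16) {
        mem[addr + 1] = (uint8_t)(a >> 8);
    }
}

/* Performs the intended "mem[1] = 0" byte store and returns the byte
   that lives right after the store target. */
static uint8_t neighbourAfterStore(int m16)
{
    uint8_t mem[4] = { 0x11, 0x22, 0x33, 0x44 };
    sta(mem, 1, 0x0000, m16);
    return mem[2];
}
```

With `m16 == 0` the neighbour keeps its value; with `m16 == 1` it is zeroed, which matches the "writing 2 bytes of zero" symptom described in Fix 2.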
The W65816 backend assembler now supports all common indirect
addressing modes ((dp), (dp),Y, (dp,X), (d,s),Y,
[dp], [dp],Y, and JMP (abs)). All .byte opcode hacks in
the runtime have been removed in favour of the mnemonics. The
disassembler decodes them too.
Runtime now exposes a near-complete C99 subset: sprintf/snprintf with correct %.Nf precision, qsort/bsearch, the full string.h family (strcat/strncat/strpbrk/strspn/strcspn/strtok/strtok_r), math.h with the thirteen common transcendentals (sqrt/pow/sin/cos/exp/log/atan/atan2/asin/acos/sinh/cosh/tanh), atol/llabs/atexit/exit/abort, and a smoke test that exercises malloc + struct pointers + strcmp/strcpy via a working hash table end-to-end in MAME.
strtok / strtok_r live in their own TU at -O2 (with
__attribute__((noinline)) on strtok_r so the strtok() wrapper
doesn't duplicate it). Multi-call strtok over "a,b,,c" works
end-to-end in smoke. The layout-sensitive miscompile that
previously haunted strtok_r's inner CMP loop has been fixed by
modelling Uses=[P] on the conditional branches (the LICM/sink
interaction that elided "redundant" CMPs no longer fires); no
surgical workaround flags needed.
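For reference, the multi-call pattern over "a,b,,c" looks roughly like the sketch below, written against standard hosted C rather than the smoke-test source (which is not reproduced here). `countTokens` is a hypothetical helper name.

```c
#include <string.h>

/* Counts tokens in a comma-separated string using standard strtok.
   The buffer is modified in place, so it must be writable. */
static int countTokens(char *buf)
{
    int n = 0;
    for (char *tok = strtok(buf, ","); tok != NULL; tok = strtok(NULL, ",")) {
        n++;
    }
    return n;
}
```

Standard strtok treats consecutive delimiters as one separator, so the empty field between "b" and "c" is skipped and the input yields three tokens.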
A small RPN calculator test (smoke #87) chains strtok, atol, push/pop over a static stack, snprintf "%ld", and strcmp to verify the end-to-end composition under a realistic-ish workload — adds, subs, muls, divs, and 3-deep operand stacks all work.
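A minimal sketch of that composition, assuming space-separated tokens and an 8-entry stack. The names `rpnEval`, `push`, `pop`, and `RPN_STACK_MAX` are illustrative and not taken from the smoke test.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define RPN_STACK_MAX 8

static long rpnStack[RPN_STACK_MAX];
static int  rpnTop;

static int push(long v)
{
    if (rpnTop >= RPN_STACK_MAX) return -1;
    rpnStack[rpnTop++] = v;
    return 0;
}

static int pop(long *out)
{
    if (rpnTop <= 0) return -1;
    *out = rpnStack[--rpnTop];
    return 0;
}

/* Evaluates a space-separated RPN expression (modified in place) and
   formats the result into out. Returns 0 on success, -1 on error. */
static int rpnEval(char *expr, char *out, size_t outSize)
{
    rpnTop = 0;
    for (char *tok = strtok(expr, " "); tok != NULL; tok = strtok(NULL, " ")) {
        int isOp = strcmp(tok, "+") == 0 || strcmp(tok, "-") == 0 ||
                   strcmp(tok, "*") == 0 || strcmp(tok, "/") == 0;
        if (!isOp) {
            if (push(atol(tok)) != 0) return -1;
            continue;
        }
        long rhs, lhs, r;
        if (pop(&rhs) != 0 || pop(&lhs) != 0) return -1;
        switch (tok[0]) {
        case '+': r = lhs + rhs; break;
        case '-': r = lhs - rhs; break;
        case '*': r = lhs * rhs; break;
        default:
            if (rhs == 0) return -1;   /* reject division by zero */
            r = lhs / rhs;
            break;
        }
        if (push(r) != 0) return -1;
    }
    long result;
    if (pop(&result) != 0 || rpnTop != 0) return -1;
    snprintf(out, outSize, "%ld", result);
    return 0;
}
```

For example, evaluating "3 4 + 2 *" leaves "14" in the output buffer.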
setjmp / longjmp (smoke #88) now work end-to-end: setjmp saves
SP / 24-bit ret addr / DP, longjmp restores them and returns the
val argument as setjmp's "second return". Required two fixes:
(a) the W65816 assembler had no instruction definition for
(dp) / (dp), y / (dp, x) indirect addressing modes, so the
mnemonic forms silently fell through to absolute-,Y opcodes —
fixed in src/llvm/lib/Target/W65816/W65816InstrFormats.td +
W65816InstrInfo.td + AsmParser/W65816AsmParser.cpp (the runtime
.byte hacks have been replaced with mnemonics); (b) added
__attribute__((returns_twice)) to the setjmp declaration so the
optimizer doesn't constant-fold post-setjmp env reads to 0.
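The "second return" behaviour can be sketched in portable C as follows. This is a generic illustration, not the smoke #88 source; `runGuarded`, `fail`, and `ERR_PARSE` are made-up names.

```c
#include <setjmp.h>

#define ERR_PARSE 7

static jmp_buf env;

static void fail(int code)
{
    longjmp(env, code);   /* control resumes inside setjmp's caller */
}

/* Returns 0 when no failure occurs, or the value passed to longjmp. */
static int runGuarded(int shouldFail)
{
    switch (setjmp(env)) {
    case 0:
        break;              /* direct return: carry on with the work */
    case ERR_PARSE:
        return ERR_PARSE;   /* second return, carrying longjmp's val */
    default:
        return -1;
    }
    if (shouldFail) {
        fail(ERR_PARSE);
    }
    return 0;
}
```

The `switch (setjmp(env))` form keeps the call in one of the contexts the C standard explicitly permits for setjmp.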
CRC32 (smoke #89) verifies the standard "123456789" → 0xCBF43926 end-to-end — exercises uint32_t shifts, XORs, char-by-char loops.
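The check value corresponds to the standard reflected CRC-32 (polynomial 0xEDB88320, initial value and final XOR of 0xFFFFFFFF). A bitwise sketch is shown below; the smoke test's own implementation may differ (for example, it could be table-driven), and `crc32Bitwise` is a name chosen for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit-at-a-time reflected CRC-32. */
static uint32_t crc32Bitwise(const unsigned char *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1u) {
                crc = (crc >> 1) ^ 0xEDB88320u;
            } else {
                crc >>= 1;
            }
        }
    }
    return ~crc;
}
```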
Brainfuck interpreter (smoke #90) executes a small bf program and verifies the output bytes — exercises loop bracket matching, pointer math (data pointer), branching on cell value.
Recursive-descent expression parser (smoke #92) evaluates "3+4", "2*3+4", "2+3*4", "(3+4)*5", "100/4-5*2+1" with proper operator precedence and parentheses, exercising mutual recursion, char-by-char tokenization, and integer arithmetic in concert.
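A compact sketch of such a parser, assuming unsigned integer literals, the four basic operators, parentheses, and no whitespace. Function names are illustrative and the smoke-test source is not reproduced here.

```c
#include <ctype.h>

static const char *cur;   /* cursor into the expression being parsed */

static long parseExpr(void);

/* factor := number | '(' expr ')' */
static long parseFactor(void)
{
    if (*cur == '(') {
        cur++;                        /* consume '(' */
        long inner = parseExpr();
        if (*cur == ')') cur++;       /* consume ')' */
        return inner;
    }
    long v = 0;
    while (isdigit((unsigned char)*cur)) {
        v = v * 10 + (*cur - '0');
        cur++;
    }
    return v;
}

/* term := factor (('*' | '/') factor)* */
static long parseTerm(void)
{
    long v = parseFactor();
    while (*cur == '*' || *cur == '/') {
        char op = *cur++;
        long rhs = parseFactor();
        if (op == '*') {
            v *= rhs;
        } else if (rhs != 0) {        /* this sketch ignores division by zero */
            v /= rhs;
        }
    }
    return v;
}

/* expr := term (('+' | '-') term)* */
static long parseExpr(void)
{
    long v = parseTerm();
    while (*cur == '+' || *cur == '-') {
        char op = *cur++;
        long rhs = parseTerm();
        v = (op == '+') ? v + rhs : v - rhs;
    }
    return v;
}

static long evalExpr(const char *s)
{
    cur = s;
    return parseExpr();
}
```

Precedence falls out of the grammar layering: `parseExpr` only sees complete terms, so multiplication and division bind tighter than addition and subtraction.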
The DWARF sidecar (link816 --debug-out FILE) now applies
text/rodata/bss/init_array relocations to every .debug_* section
before writing it. PC values in .debug_addr and .debug_line end
up as final-image addresses, so a consumer can map back to source
lines without re-running the linker. Intra-debug references (e.g.
.debug_info -> .debug_str offsets) are intentionally left
object-local — sections are concatenated, not recompacted, and each
slice carries an ; OBJ ... SEC ... SIZE ... header so a multi-TU
consumer can scope intra-debug offsets per-slice. The smoke test
verifies the address of a known function appears in the patched
sidecar bytes.
Known issues / workarounds
- (d,s),y / (sr,s),y addressing crosses into the next bank when Y holds a negative offset, because Y is added as a 16-bit unsigned value. Worked around by W65816NegYIndY rewriting the affected ops to TAX ; LDA/STA $0000,X. Stays correct for negative offsets like arr[i-1].
- Pointer-deref bank policy is now split-by-syntax (FIXED): *p (where p is a runtime pointer / local-or-arg vreg) lowers via LDAptr / STAptr / STBptr to [$E0],Y indirect-LONG with the bank byte at $E2 forced to 0, so it is DBR-independent. The *(volatile uint16 *)0x5000 = v MMIO idiom (const-int pointer) is matched by a separate TableGen pattern that lowers straight to STAabs (DBR-relative) so the smoke tests' bank-2 write path still works. Two tracked issues this resolved: (a) PHI-elim was eliding the inserter's COPY $a = ptr_vreg when the loop body had multiple Acc16 PHIs competing for A; the inserter now spills the pointer to a fresh stack slot and reloads via LDAfi to keep RA honest; sumTable now correct. (b) Pointer staging through [$E0] is bank-0 only, so switchToBank2 + helper-with-local-ptr no longer corrupts data in the wrong bank. See feedback_dbr_ptr_deref_spill.md.
- Greedy regalloc fails on long-arg call chains: a function that strings ~7+ independent helper(longArg1, longArg2) calls overflows greedy at -O1+ ("ran out of registers during register allocation"). Same root issue as softDouble's old -O2 hold-out. Threshold raised somewhat by expanding IMG slots from 8 to 16 (now backed by DP $C0..$DE): most "normal-looking" mixed-arity workloads now compile, but pathological pressure (many i32+ args + bitmask SETCC chain) still fails. Workarounds (in order of preference): mark the heaviest helper __attribute__((noinline)) to reduce caller pressure; -mllvm -regalloc=fast for that TU; or __attribute__((optnone)) on the affected function. A proper fix needs either a custom greedy→fast fallback in W65816TargetMachine::createTargetRegisterAllocator or a smarter spill-placement pre-RA pass.
- Bank-0 size limit (~48KB): the runtime + program must fit in $1000-$BFFF (text+rodata) plus $D000-$DFFF (LC1 for rodata-spill and BSS). Past that, link816 hard-fails because text would cross the IO window. In practice this is rarely hit now that link816 has --gc-sections (default ON, see Recently Fixed), which drops unreachable functions: a minimal program shrinks from ~43KB (whole runtime) to ~1.5KB. Programs that genuinely use most of the runtime can still hit the limit.
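The bank-crossing behaviour in the first item above can be shown with plain address arithmetic. This is a sketch under the assumption that a DBR-relative indexed access forms a 24-bit address by adding the 16-bit base and the 16-bit Y without wrapping inside the bank; `effAddr` is a hypothetical helper.

```c
#include <stdint.h>

/* 24-bit effective address of a DBR-relative indexed access: Y is added
   as an unsigned 16-bit value and any carry lands in the bank byte. */
static uint32_t effAddr(uint8_t dbr, uint16_t base, uint16_t y)
{
    uint32_t addr = ((uint32_t)dbr << 16) + base + y;
    return addr & 0xFFFFFFu;
}
```

With base $1000 and Y = $FFFE (that is, -2 as a signed 16-bit value), the result is $010FFE rather than the intended $000FFE, which is why the index has to be folded into the base first.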
Recently fixed
- #70, iterative qsort -O2 miscompile: W65816StackSlotCleanup Pass -2 was deleting a store to a slot the loop body read. Function-wide slotHasOtherRefs safety check added (Pass -1 and Pass -2c hardened with the same pattern). Iterative qsort at plain -O2 + greedy now compiles correctly; the optnone workaround in smoke #70 was removed.
- strtok -O2 layout-sensitive miscompile: modelling Uses=[P] on the conditional branches (BEQ/BNE/BCS/BCC/BMI/BPL/BVS/BVC) made MachineCSE / scheduler / LICM / sink see the CMP→Bxx flag dependency. An entire class of layout-sensitive flag-corruption bugs went away; verified by sweeping --rodata-base from text-end to text-end+300 in 13 increments, with every layout returning the correct strtok result. As a follow-on, MachineCSE has been re-enabled (it was previously disabled in W65816TargetMachine::addMachineSSAOptimization as a workaround for the same root cause).
- link816 silently produced 4.3GB binaries when --rodata-base was set inside the text region. Now dies with a clear error: --rodata-base 0xX overlaps text 0xY+N (must start at or after 0xZ).
- link816 BSS-relocate landed in IIgs Language Card area: when text+rodata grew past $C000, link816 placed BSS at $D000 (the LC1 area), where the IIgs maps ROM by default (writes drop silently, reads return ROM bytes). Globals were never initialised; caught by the expression-parser smoke (#92) when adding rand / strnlen / etc. pushed the runtime past that threshold. Two-part fix: crt0 now enables LC1 RAM via the standard lda $C083 read-twice trick at startup, and link816 hard-fails (rather than silently corrupting) if BSS would exceed the LC1 ceiling ($E000); past that you'd need crt0 to also enable LC2 / shadow RAM, which we haven't wired up.
- STZ peephole multi-STA latent miscompile: AsmPrinter's LDA #0; STA $g -> STZ $g peephole eliminated the LDA but only consumed the FIRST STA. When SDAG-CSE shared one LDA #0 across multiple STAs (g16=0; g32=0; is one IR shape), trailing STAs stored whatever was in A on entry, silently corrupting any global where A wasn't 0 at function entry. Smoke happened to pass because A was 0 by luck in every covered path. Fixed by gating the peephole on the consuming STA killing A (regalloc only sets killed on the last reader); smoke #98 added to lock the multi-STA case.
- PEI AsmPrinter peephole (new): LDA $dp; PHA -> PEI $dp saves 1 byte and avoids touching A. Fires on the copyPhysReg(A=DPF0); PUSH16 pattern (i64-libcall return-value forwarding into the next call's stacked args), which appears in every chained soft-double / soft-int64 expression. Saves 68 bytes across the runtime (-64 in math.o alone). Same next-instruction-modifies-A safety check as the PEA peephole. Smoke #99 added.
- PEA peephole opcode-allowlist replaced with modifiesRegister: the next-after-PUSH16 check that gates the PEA peephole was a hand-curated list of opcodes that obviously redefine A; switched to MachineInstr::modifiesRegister(A, TRI), which also catches implicit-defs (e.g. JSL clobbering A as part of the call ABI). Saves a few bytes and is more robust.
- libgcc.s lda #0; sta $XX -> stz $XX: 7 sites converted in libgcc.s after STZ landed in the assembler. Saves 28 bytes; also removes two PHA/PLA save-restore wraps around the LDA #0 (STZ doesn't touch A, so the wraps are unnecessary).
- libgcc.s lda dp; pha -> pei dp: 2 sites in __divhi3 / __modhi3 where the loaded A is dead after the push. PEI doesn't touch A, saves 1 byte each.
- W65816StackSlotCleanup Pass 1c skip-list extended: added STAabs / STA8abs / STAptr / STBptr / STAptrOff / STBptrOff and ADJCALLSTACKDOWN to the A-transparent list. Lets the redundant-CMP-after-A-modifier elimination see through more pseudo stores and the call-stack-down pseudo. Saves 8 bytes in math.o. (ADJCALLSTACKUP is NOT transparent: when PEI doesn't process it, AsmPrinter emits a TSC/CLC/ADC/TCS that clobbers A.)
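The --rodata-base overlap check from the list above can be sketched as below. This assumes, hypothetically, that the oversized output came from an unsigned subtraction wrapping when the rodata base sat below the end of text; the real link816 code is not shown in this document, and both function names are invented for the example.

```c
#include <stdint.h>

/* Returns 1 if rodataBase starts at or after the end of text, else 0. */
static int rodataBaseOk(uint32_t textBase, uint32_t textSize, uint32_t rodataBase)
{
    uint32_t textEnd = textBase + textSize;   /* first address after text */
    return rodataBase >= textEnd;
}

/* Padding between text and rodata computed without the check above.
   With rodataBase inside text, the unsigned subtraction wraps. */
static uint32_t uncheckedGap(uint32_t textBase, uint32_t textSize, uint32_t rodataBase)
{
    return rodataBase - (textBase + textSize);
}
```

For text at 0x1000 with size 0x800 and a rodata base of 0x1400, the unchecked gap is 0xFFFFFC00 bytes, which is on the order of the 4.3GB figure quoted above.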
- crt0.s lda #0; sta -> stz: the IRQ-disable block and the BSS-zero loop both used .byte 0xa9, 0x00 ; sta raw-byte workarounds for lda #0 (the assembler emits a 16-bit immediate in M=8, mis-encoding it). stz works in M=8 (stores 1 byte) and doesn't touch A; both .byte workarounds removed; saves 4 bytes in crt0.o.
- Runtime correctness pass, five real bugs fixed:
  - free() coalesce: when a freed block was absorbed into a lower-address neighbour (bEnd == a path), the absorbed entry was left in the free list overlapping the extended one. A follow-on malloc could hand out the same memory to two callers. Fix: track outer-loop predecessor and excise the absorbed entry. Smoke #100 added.
  - sqrt(-0.0) returned NaN; should return -0.0 per IEEE-754. The sign-bit check fired before the zero check. Fix: mask sign bit when testing for zero.
  - log(0) returned NaN; should return -Infinity (pole error). Same sign-bit-vs-zero ordering issue; both ±0 now return -1.0/0.0.
  - snprintf(buf, 0, ...) wrote '\0' to buf[-1] (one byte BEFORE the buffer). C99 says n=0 must not touch the buffer. Fix: set gEnd = NULL for n=0 so neither the normal nor the truncation NUL-write path fires. Smoke #76 extended.
  - malloc(>~32KB) and calloc(n, m) had silent integer overflow on size_t (16-bit), wrapping to small values and handing out tiny allocations claiming huge sizes. Bumped malloc to bail above 0x7FF0 (heap is at most ~32KB anyway) and made calloc overflow-check before multiplying.
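The calloc overflow check in the last item can be sketched as follows. `size16` stands in for the target's 16-bit size_t so the example behaves the same on a host compiler, and `mulFits16` is a hypothetical helper, not the runtime's actual code.

```c
#include <stdint.h>

/* Stand-in for the target's 16-bit size_t (an assumption for host use). */
typedef uint16_t size16;

/* Stores n * m in *out and returns 1 if the product fits in 16 bits;
   returns 0 without touching *out if it would wrap. */
static int mulFits16(size16 n, size16 m, size16 *out)
{
    if (m != 0 && n > (size16)(UINT16_MAX / m)) {
        return 0;
    }
    *out = (size16)(n * m);
    return 1;
}
```

The division-based test runs before the multiplication, so the wrapped product is never computed or used as an allocation size.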
- Removed dead runtime/src/softDouble.s (a stub from before softDouble.c was implemented; the build script doesn't reference it but it was confusing to leave around).
- inttypes.h PRId64 / PRIu64 / PRIx64 documented as unsupported in the runtime's printf: the macros expand to "lld" / "llu" / "llx" but the formatter only knows the l length modifier, not ll, so the format prints literally and the va_list misaligns. Use PRId32 etc. for now.
- More runtime fixes (round 2):
  - fputs(s, stream) was forwarding to puts(s), which appends a newline. C says fputs MUST NOT add one. Direct char-by-char write now.
  - exit(code) never invoked the registered atexit handler. C99 7.20.4.3 requires it. Now runs the single-slot handler (with re-entry guard) before the BRK.
  - printf("%f", -0.0) printed 0.000000 instead of -0.000000 because if (v < 0) (a __ltdf2 call) returns false for negative zero. Switched to the IEEE-754 sign-bit test that snprintf already uses.
  - vfprintf was missing entirely (neither declared in stdio.h nor implemented). Added a thin wrapper around vprintf.
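The negative-zero issue in the printf item comes down to `v < 0` being false for -0.0, while the sign bit is still set. A host-side sketch of a sign-bit test, assuming double is IEEE-754 binary64 (`signBitSet` is an illustrative name; the runtime's own helper is not shown here):

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 if the sign bit of v is set, including for -0.0. */
static int signBitSet(double v)
{
    uint64_t bits;
    memcpy(&bits, &v, sizeof bits);   /* well-defined way to read the bits */
    return (int)(bits >> 63);
}
```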
- link816 weak-symbol resolution: the linker previously used "last def wins" with no regard for STB_GLOBAL vs STB_WEAK. When a user provided a strong override of a weak libc stub (e.g. putchar), it worked only by link-order luck; reversing the order let the weak stub silently overwrite the strong def. Now handled properly: strong over weak (any order), strong + strong errors out, weak + weak picks the first. Smoke #100 added.
- More runtime fixes (round 3):
  - writeHex / emitHex had a stack buffer overrun (char buf[5] but printf("%08x", ...) would write 8 bytes). On 16-bit unsigned int, max useful width is 4, so buf was shrunk to 4 and the width is now capped.
  - writeDec / writeSignedLong / emitDec / emitSignedLong used -n on signed input, which overflows for INT_MIN / LONG_MIN (UB). All four switched to unsigned negation (0u - (unsigned)n) for correctness and to keep an optimizer-aware compiler from exploiting the UB.
  - atoi / atol / strtol / strtoul likewise built the parsed magnitude in a signed accumulator and negated at the end: same UB on the boundary value. All switched to unsigned magnitude + unsigned-negation cast.
  - link816 parseInt / omfEmit parseInt silently truncated addresses > 24 bits to uint32_t low bits; --text-base 0x100000000 would silently wrap to 0. Both now reject out-of-range addresses with a clear error.
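The unsigned-negation idiom used for the print and parse fixes can be sketched like this. `int16_t` models the target's 16-bit int so the boundary case is visible on a host compiler; `magnitude16` is a hypothetical helper name.

```c
#include <stdint.h>

/* Magnitude of a 16-bit signed value without ever evaluating -n on a
   signed type, so the most negative value does not overflow. */
static uint16_t magnitude16(int16_t n)
{
    uint16_t u = (uint16_t)n;                   /* modular conversion */
    return (n < 0) ? (uint16_t)(0u - u) : u;    /* negate in unsigned */
}
```

For the most negative 16-bit value the result is 32768, which is not representable as a signed 16-bit integer and is exactly the case where the signed `-n` was undefined.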
- More runtime fixes (round 4):
  - pow(x, y) computed n = -n for the integer-y branch when yi was INT_MIN (-32768); same signed-overflow UB pattern as the print functions. Switched to unsigned magnitude.
  - Added perror(prefix): it was missing from the runtime, and is a common pattern in portable code that reports I/O failure via errno + strerror. Declared in stdio.h, implemented as char-by-char emit through putchar (no fprintf dependency).
- link816 __heap_end was hardcoded at $BF00, ignoring where __heap_start actually ended up. When BSS got auto-relocated into LC1 ($D000+), heap_start ended up > heap_end and malloc immediately returned NULL on every call, silently bricking any program that allocated dynamic memory after the runtime grew past the default-bss threshold. Heap_end now picks $BF00 / $E000 based on where heap_start lands (and skips the IO window if heap_start would have landed in $C000-$CFFF). Smoke #102 added.
- link816 rodata auto-skips IIgs IO window ($C000-$CFFF). When text+rodata grew past 0xC000 the rodata bytes silently corrupted at runtime: string literals in the IO range read back as hardware register values, breaking strcmp / strstr / printf / etc. Now rodata that would land in or cross $C000-$CFFF auto-skips to $D000. Init_array gets the same treatment. Text that would cross IO is hard-rejected at link time (no auto-fix possible, since PC fetches in IO would read hardware registers). This was the root cause of the "tan/tanf triggers layout-sensitive failure" symptom listed in older STATUS notes.
- runInMame skips writes to IO window during the binary load. Without this, the zero-padding in the rodata-skip gap would clobber soft switches (e.g. the LC1 RAM enable that crt0 sets via $C083) when the loader naively wrote the entire image byte-by-byte to memory.
- link816 --gc-sections (default ON): discards sections not reachable from the entry point (__start / _start / main for the canonical crt0 setup) plus all .init_array sections. Built on -ffunction-sections so each function is in its own section. A minimal program with full runtime linked shrinks from ~43KB to ~1.5KB. Adding tan / tanf to math.c (which caused the latent layout-sensitive failure described above) no longer pushes any test past the bank-0 limit. Tests that intentionally check unreachable symbols pass --no-gc-sections to opt out.
- fwrite(stdout, ...) was a stub returning 0 even though stdout has a working putchar route. Now actually writes through putchar for stdout/stderr (only). Also gained the same size * nmemb overflow guard as calloc.
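The IO-window auto-skip applied to rodata and init_array can be sketched as a simple interval test. The constants follow the $C000-$CFFF window named above; `skipIoWindow` is a hypothetical helper and not link816's actual function.

```c
#include <stdint.h>

#define IO_START 0xC000u
#define IO_END   0xD000u   /* first address after the IO window */

/* Returns the start address to use for a section of the given size:
   unchanged if [base, base + size) avoids the IO window, otherwise
   moved to the first address past it. */
static uint32_t skipIoWindow(uint32_t base, uint32_t size)
{
    uint32_t end = base + size;   /* exclusive end of the section */
    if (base < IO_END && end > IO_START) {
        return IO_END;
    }
    return base;
}
```

A section that ends exactly at $C000 is left in place, while one that starts at $BF00 with size $200 is moved to $D000. A real linker would additionally need to verify that the relocated section still fits below its ceiling.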
What's still needed for a "ship-ready" toolchain
- softDouble.c -O2: FIXED. Marking dclass noinline (in addition to dpack) drops register pressure in __muldf3 / __divdf3 / __adddf3 enough that greedy regalloc no longer runs out. The previous blocker was that noinline-dclass would write through pointer args via the DBR-relative (d,s),y mode and corrupt caller data after a bank switch; that path now goes through STAptr / STBptr, which use [$E0],Y indirect-long with the bank byte forced to 0, so DBR is irrelevant. All three smoke build sites moved to -O2.
- More of the C standard library: real <stdio.h> file I/O (fopen, fread, fwrite, fseek are currently stubs returning success/zero) would need a memory-backed FS or a MAME hook. <locale.h> / <signal.h> / <time.h> are stubbed (compile and return safe defaults). <wchar.h> mostly absent. A time() impl wired to ReadTimeHex (Misc Tool $0D03) was attempted but crashes MAME without the Tool Locator initialised in crt0; clock() via VBL counter at $E1006B needs 24-bit far-pointer support that the backend doesn't yet model.
- C++ runtime support: vtable layout for multiple inheritance, RTTI, exceptions (or a documented -fno-exceptions requirement).
- REP/SEP scheduling pass (design doc §3.3): the current prologue picks one M-mode for the whole function based on whether any 8-bit accumulator value is used. A per-region scheduler would reduce the SEP/REP wrap overhead on i8 stores.
- Toolbox / IIgs system call bindings: iigs/toolbox.h covers the common entry points across Tool Locator, Memory Manager, Misc Tools, QuickDraw II, Event Manager, Window Manager, plus GS/OS Quit. Multi-arg wrappers (NewHandle, QDStartUp, MoveTo, EMStartUp, GetNextEvent, NewWindow, CloseWindow) live in runtime/src/iigsToolbox.s because the backend's inline-asm constraints can't take memory operands. Single-arg / no-arg wrappers stay inline. More routines (Menu Manager, Dialog Manager, Standard File, Sound) still TBD.
- Real-world program coverage: the smoke tests are microbenchmarks. A few known-good Apple IIgs C programs (e.g. a textfile pager, a small game) compiled and run end-to-end would catch issues no synthetic test currently exercises.
- Cycle-time / size benchmarks vs Calypsi 5.16: design doc §1 says the goal is to "match or exceed" Calypsi. We have neither baseline numbers nor a comparison harness yet.