# llvm816 — Current Status

LLVM/Clang backend for the WDC 65816 (Apple IIgs), forked from
llvm-mos as a separate `W65816` target.

## What works

End-to-end C-to-binary toolchain that produces 65816 machine code
which runs correctly under MAME (apple2gs).

**Language coverage at -O2 (no extra flags):**

- All scalar arithmetic: i8 / i16 / i32 / i64 add, sub, mul, div, mod
  (signed and unsigned). Carry-chained multi-word ops via ADC/SBC pseudos
  plus ASLA16 / shift libcalls.
- Comparisons and signed/unsigned widening (sext, zext, trunc) for all
  the above sizes. Signed compares near INT_MIN are handled via an
  EOR-with-sign-bit transform.
- Pointer arithmetic, array indexing, struct field access, struct
  return-by-value (up to 8 bytes — Pair, Vec4, double).
- Pointer dereference (`*p`) lowers via `LDAptr / STAptr / STBptr`
  to `[$E0],Y` indirect-LONG with the bank byte at `$E2` forced to 0
  — DBR-independent, so `pha;plb` bank-switched callers don't corrupt
  data through callee local-pointer writes. Const-int pointers
  (`*(volatile uint16 *)0x5000 = v` MMIO idiom) lower to `STAabs`
  (DBR-relative) so bank-2 writes still work.
- Bitfields, switch statements (verified up to ~12 cases + default),
  function pointers, function-pointer tables, indirect calls via
  `__jsl_indir` trampoline.
- Recursion: factorial, Fibonacci, depth-3 binary-tree
  insert/sum/min/max, simple recursive quicksort.
- Loops with goto / break / continue, nested loops, state machines.
- `<stdarg.h>` varargs with int / long / unsigned long long mixed args.
- Heap: `malloc` / `free` (libc.c first-fit allocator) — linked-list
  reverse with `cons` works; free-list coalesce verified.
- Strings: hand-rolled `strlen`, `strcmp`, `strcpy`, `strchr`, atoi/itoa
  roundtrip.
- Soft-float (single): all four ops + comparisons, MAME-verified.
- Soft-double: add, sub, mul, div match gcc bit-for-bit with
  round-to-nearest-even rounding; 3-iter Newton sqrt converges.
  Compiles at -O2 throughout. Long-running iterations may hit MAME's
  1-second sim-time budget (test config issue, not a compiler bug).
- Inline assembly with `"a"`, `"x"`, `"y"` register constraints and
  arbitrary opcode bytes (used for the `pha;plb` bank-switch idiom).
- C++ minimal: clang++ compiles a class with virtual + non-trivial
  ctor (vtable + RTTI omitted; no exceptions).
- printf with `%d %x %s %c %p` and width/precision specifiers.
- sprintf / snprintf / vsprintf / vsnprintf with the same format
  coverage as printf (`%d %u %x %ld %lu %s %c %f %p %%` + width).
  C99 truncation semantics for snprintf. `%.Nf` produces the
  correct fractional digits with round-half-up.
- qsort + bsearch over arbitrary element size with a user `cmp`
  callback.
- Standard string/stdlib glue: strcat, strncat, strpbrk, strspn,
  strcspn, atol, llabs (kept in their own translation unit so
  vprintf's branch layout doesn't shift).
- `<math.h>`: fabs, floor, ceil, fmod, copysign, sqrt, pow,
  sin, cos, tan, exp, log, atan, atan2, asin, acos, sinh, cosh,
  tanh (and float variants). Bit-twiddling for
  fabs/floor/ceil/copysign; Newton iteration for sqrt;
  range-reduction + Taylor for sin/cos/exp/log/atan; identities for
  asin/acos/atan2/sinh/cosh/tanh. Accuracy is in the ~1e-6 range —
  good enough for typical numeric work, far short of glibc-quality.
  These are slow (each call is dozens to hundreds of soft-double
  libcalls) — pre-compute or cache when possible.
- `setjmp` / `longjmp` from libgcc.s.
- Static constructors via crt0's init_array walk.
- `<stdio.h>` file I/O against an in-memory FS:
  `mfsRegister(path, buf, size, cap, writable)` stages a buffer as a
  named file;
  `fopen`/`fread`/`fwrite`/`fseek`/`ftell`/`fclose`/`fgetc`/`fgets`/`ungetc`/`fprintf`
  operate on it via a per-FILE
  (kind, buf, size, cap, pos, eof, err, unget) record.
  stdin/stdout/stderr route through `putchar` as before.
- `<wchar.h>`: wcslen / wcscmp / wcsncmp / wcscpy / wcsncpy /
  wcscat / wcschr / wcsrchr; mbtowc / wctomb / mbstowcs /
  wcstombs / mblen with the trivial 1:1 byte<->wide mapping
  (Latin-1). wchar_t is 16-bit on this target.
- `<signal.h>`: in-process signal table. signal() registers a
  handler; raise() invokes it. Default actions: SIGABRT calls
  abort(), SIGINT/SIGTERM call exit(128+sig), others ignored.
- `<locale.h>`: setlocale always returns "C"; localeconv returns
  a fixed C-locale lconv struct.
- C++ subset: classes, single inheritance, multiple inheritance
  (Drawable+Movable through one Sprite), virtual base diamond
  (A and B virtually derive Base; Diamond inherits from both
  with one shared Base subobject), virtual functions,
  polymorphism via base-class pointer arrays, virtual dtors,
  this-pointer adjustment for non-leftmost bases, vbase offset
  tables. RTTI / `dynamic_cast` works (downcast, MI cross-cast,
  virtual-base sibling cast) via a minimal libcxxabi shim
  (`runtime/src/libcxxabi.c`) that provides `__dynamic_cast` +
  the three typeinfo class vtables (`__class_type_info`,
  `__si_class_type_info`, `__vmi_class_type_info`) + sized
  `operator delete` + `__cxa_pure_virtual`.
- C++ exceptions via `clang++ -fsjlj-exceptions`: throw, catch,
  catch-by-value, multiple catch handlers, exception destruction.
  Backend wiring: `MCAsmInfo` selects `ExceptionHandling::SjLj`
  so clang's `SjLjEHPrepare` runs; a custom `W65816SjLjFinalize`
  IR pass (in `src/llvm/lib/Target/W65816/`) finishes the
  lowering by inserting an actual `setjmp` at function entry,
  building a `switch`-on-call-site dispatch block, building a
  per-function catch table referenced via the lsda field, and
  rewriting `eh.typeid.for(@TI)` to use typeinfo addresses as
  selectors. Runtime in `runtime/src/libcxxabiSjlj.c` provides
  the full Itanium SJLJ surface:
  `_Unwind_SjLj_Register/Unregister/RaiseException/Resume`,
  `__cxa_allocate_exception`, `__cxa_throw`, `__cxa_begin_catch`,
  `__cxa_end_catch`, `__cxa_rethrow`, plus a no-op
  `__gxx_personality_sj0` (we dispatch via call_site directly, not
  via the personality). Two backend bug fixes were required along
  the way: longjmp's SP restore was off by 3 (libgcc.s subtracted 3
  before TCS, leaving the caller's stack 3 bytes off), and
  `W65816StackSlotCleanup` was eliminating volatile stores to
  dead-from-its-perspective stack slots (now skipped via a
  `hasOrderedMemoryRef()` gate).

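The EOR-with-sign-bit transform mentioned in the comparisons item can be illustrated in portable C. This is a minimal host-side sketch of the idea, not the backend's actual lowering; the names `signed_less16` and `check_signed_less16` are invented for illustration. The point is that flipping the sign bit of both operands lets a plain unsigned compare give the signed answer, including at INT_MIN, where subtraction-based compares overflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Signed 16-bit "a < b" computed with an unsigned comparison only.
 * Flipping the sign bit (XOR with 0x8000) maps -32768..32767 onto
 * 0..65535 in the same order, so INT16_MIN needs no special case. */
bool signed_less16(int16_t a, int16_t b)
{
    uint16_t ua = (uint16_t)((uint16_t)a ^ 0x8000u);
    uint16_t ub = (uint16_t)((uint16_t)b ^ 0x8000u);
    return ua < ub;
}

/* Cross-checks the transform against the native signed comparison
 * for a few boundary values. */
void check_signed_less16(void)
{
    static const int16_t v[] = { INT16_MIN, INT16_MIN + 1, -1, 0, 1, INT16_MAX };
    const int n = (int)(sizeof v / sizeof v[0]);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            assert(signed_less16(v[i], v[j]) == (v[i] < v[j]));
}
```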
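For the snprintf item, "C99 truncation semantics" presumably means the standard behaviour: the output is always NUL-terminated when the buffer size is non-zero, and the return value is the length the full string would have had. A small sketch using only the standard `snprintf`; the helper `format_label` and its parameters are hypothetical and not part of this runtime.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Formats "<id>-<name>" into dst (capacity cap bytes, including the NUL).
 * Returns true when the whole string fit. *needed receives the untruncated
 * length (excluding the NUL), which C99 snprintf reports even when the
 * output was cut short. */
bool format_label(char *dst, size_t cap, int id, const char *name, size_t *needed)
{
    assert(needed != NULL && name != NULL);
    int n = snprintf(dst, cap, "%d-%s", id, name);
    if (n < 0) {            /* encoding error */
        *needed = 0;
        return false;
    }
    *needed = (size_t)n;
    return (size_t)n < cap; /* equal to cap means the NUL did not fit */
}
```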

**Toolchain:**

- `clang` / `llc` produce W65816 assembly + ELF object files.
- `tools/link816` resolves cross-translation-unit refs, lays out
  text/rodata/bss, and emits a flat binary the IIgs ROM can load.
  Auto-relocates bss above text+rodata when the default
  `--bss-base 0x2000` would overlap text, and skips past the
  IIgs IO window ($C000-$CFFF) if needed. `--gc-sections`
  (default ON) drops unreachable functions: a minimal program
  with full runtime linked shrinks from ~43KB to ~1.5KB.
- `link816 --segment-cap N` packs `.text` greedily into multiple
  bank-aligned segments, capped at N bytes per segment. Segment 1
  stays at `--text-base` in bank 0 (alongside rodata + bss + init);
  segments 2..M start at `--segment-bank-base` (default $040000)
  in successive banks. `--manifest path.json` writes a JSON file
  listing each segment's image, base, and entry offset.
  Cross-bank `JSL` (IMM24 reloc) just works — patched at link
  time with the full 24-bit address. Cross-bank IMM16 is
  permitted (uses DBR for bank — caller pins DBR to data's bank);
  cross-bank PCREL is rejected with a clear diagnostic.
  `scripts/runMultiSeg.sh` is a small Lua loader for MAME that
  reads the manifest, places each segment's bytes, and runs from
  segment 1's entry — used by the smoke test to verify cross-bank
  JSL end-to-end (helper3 chain across 3 bank-aligned segments).
- `tools/omfEmit` produces OMF v2.1 files in two modes:
  (a) single-segment —
  `--input flat.bin --map flat.map --base ADDR --entry SYM`,
  KIND=0x0000 (CODE, dynamic), ORG=0 (loader picks bank);
  (b) multi-segment — `--manifest path.json` reads
  link816's manifest and emits one OMF segment per entry with
  KIND=0x8800 (STATIC|ABSBANK|CODE) + ORG=segment-base, asking
  the GS/OS Loader to place each at its declared bank-aligned
  address. All intra-segment relocations were already patched by
  the linker, so no INTERSEG/RELOC opcodes are needed for v1
  static placement.
- `link816 --debug-out FILE` writes a DWARF sidecar with
  text/rodata/bss/init_array relocations applied to every `.debug_*`
  section, so `.debug_addr` / `.debug_line` PC values are
  final-image addresses.
- `runtime/build.sh` builds crt0, libc, soft-float, soft-double,
  libgcc into linkable objects.
- `scripts/smokeTest.sh` runs 126 end-to-end checks at -O2:
  scalar ops, control flow, calling conventions, MAME execution
  regressions, link816 bss-base safety + weak-symbol resolution +
  heap_end-vs-heap_start sanity, iigs/toolbox.h compile + link,
  iigs/gsos.h compile + link, standalone runtime headers,
  AsmPrinter peepholes (STZ / PEA / PEI — single-STA,
  shared-LDA-multi-STA, DPF0-forwarding), malloc/free coalesce
  ordering, plus real-world coverage: Conway's Game of Life blinker
  (2D loop + neighbour bounds), binary search tree (recursive
  struct + malloc), function-pointer dispatch table (indirect
  JSL via `__jsl_indir`), memory-backed file I/O (mfsRegister +
  fopen/fread/fwrite/fseek/fprintf), C++ polymorphism (single
  inheritance), C++ multiple inheritance (Drawable+Movable),
  C++ virtual base diamond, C++ dynamic_cast (SI + MI cross-cast +
  virtual-base sibling cast through libcxxabi shim), SJLJ exception
  runtime end-to-end (libcxxabiSjlj.c throw/catch round-trip via
  setjmp/longjmp + catch-table walk), C++ -fsjlj-exceptions
  compile + link (the C++ frontend → backend path is
  execution-verified manually but skipped from MAME smoke due to
  MAME-side flakiness; see "Yet to come"), GS/OS wrapper
  round-trip via stub dispatcher pre-loaded at $E100A8 (validates
  the PHA + PEA 0 + JSL + post-call SP-fixup contract end-to-end),
  wchar / signal core APIs, hex dumper writing through fprintf,
  JSON tokenizer state machine, hash-table command shell (parser,
  dispatch, and chained collisions over fprintf-to-mfs), and the
  scripts/bench.sh size-vs-Calypsi harness. 100% pass.

- `scripts/bench.sh` compiles a microbenchmark suite with both
  clang (this toolchain) and Calypsi cc65816, comparing emitted
  text-section size. Current ratio: ~1.9x (down from 2.2x once
  the W65816 target started overriding `replexitval` to "never"
  by default in `LLVMInitializeW65816Target`; SCEV's closed-form
  rewrite was promoting i16 induction expressions to i64 and
  hitting `__muldi3`, which on a 16-bit target is dramatically
  bigger than the loop it replaces). sumOfSquares went 335B →
  128B, a 2.6x shrink with no other benchmark affected. Eight
  benchmarks shipped under `benchmarks/`. Remaining gap is
  structural: Calypsi uses `(sr,s),Y` for stack-relative
  pointer indirection where we route through DP $E0
  indirect-long for bank safety.

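The `--segment-cap` packing described above can be sketched as a next-fit loop over function sizes. This is a simplified model under stated assumptions, not link816's code: the real linker's packing order and tie-breaking are not documented here, and `Placement`, `pack_text`, and their parameters are hypothetical names. Items are placed in input order; when the next one would exceed the cap, a new segment is opened in the next bank.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BANK_SIZE 0x10000u

/* Hypothetical result record: which segment an item landed in and where. */
typedef struct {
    unsigned segment;   /* 1-based segment index */
    uint32_t address;   /* 24-bit load address */
} Placement;

/* Packs items, in input order, into segments of at most cap bytes.
 * Segment 1 starts at text_base; segment k >= 2 starts at
 * bank_base + (k - 2) * BANK_SIZE. Returns the number of segments used. */
unsigned pack_text(const uint32_t *sizes, size_t count, uint32_t cap,
                   uint32_t text_base, uint32_t bank_base, Placement *out)
{
    unsigned segment = 1;
    uint32_t used = 0;

    assert(cap > 0 && cap <= BANK_SIZE);
    for (size_t i = 0; i < count; i++) {
        assert(sizes[i] <= cap);        /* a single item must fit in one segment */
        if (used + sizes[i] > cap) {    /* does not fit: open the next segment */
            segment++;
            used = 0;
        }
        uint32_t base = (segment == 1)
            ? text_base
            : bank_base + (uint32_t)(segment - 2) * BANK_SIZE;
        out[i].segment = segment;
        out[i].address = base + used;
        used += sizes[i];
    }
    return count ? segment : 0;
}
```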

**Backend register allocation:**

- Basic regalloc as default at -O1+; fast at -O0/optnone. We use
  basic instead of greedy because greedy fails ("ran out of
  registers during register allocation") on functions with many
  cross-call Acc16 vregs (the `ok |= bit; helper(); ok |= bit;`
  pattern across many if-blocks). Basic handles those cleanly
  with negligible code-size overhead vs greedy on the bench
  suite (~0.6%).
- Pre-RA passes: `WidenAcc16` (Acc16→Wide16 promotion, lets
  greedy spread i16 pressure across A and 16 IMG slots);
  `TiedDefSpill` (handles tied-def-multi-use hazard);
  `ABridgeViaX` (bridges via X/Y when free).
- Post-RA passes: `SpillToX` (STA/LDA pairs → TAX/TXA bridges
  when X dead); `StackSlotCleanup` (deletes redundant adjacent
  spills); `NegYIndY` (rewrites negative-Y indirect-Y stack-rel
  ops to avoid the 24-bit-add bank-cross).
- Pre-emit: `BranchExpand` (long Bxx → INV_Bxx skip; BRA target);
  `SepRepCleanup` (coalesces adjacent SEP/REP toggles, plus a
  cross-mode-neutral coalesce that drops REP/SEP pairs sandwiching
  X-flag-only ops, branches, transfers — saves 4B / 12cyc per
  collapse). AsmPrinter LDAi8imm peephole walks past mode-neutral
  MIs to fuse the closing REP into a following SEP.
- Imaginary registers IMG0..IMG15 backed by DP $C0..$CE +
  $D0..$DE — gives greedy 17 effective i16 carriers (A + 16 IMG)
  before stack spills kick in.

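The two direct-page ranges given for the imaginary registers are adjacent, so the 16 slots form one contiguous run of 16-bit cells. A tiny sketch of that mapping, assuming IMG0 sits at `$C0` and the indices ascend with the address (the text gives the ranges but not the per-register order); `img_dp_address` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

enum { IMG_COUNT = 16, IMG_DP_BASE = 0xC0 };

/* Direct-page address of imaginary register IMGn (each is 16 bits wide).
 * ASSUMPTION: IMGn lives at $C0 + 2*n, which covers $C0..$CE for
 * IMG0..IMG7 and $D0..$DE for IMG8..IMG15. */
uint8_t img_dp_address(unsigned n)
{
    assert(n < IMG_COUNT);
    return (uint8_t)(IMG_DP_BASE + 2u * n);
}
```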

**ABI:**

- arg0 in A; arg1 in X for i32-first-arg signatures; rest pushed RTL
  on the system stack with PHA. Caller deallocates via
  `tsc;clc;adc #N;tcs` or `PLY*N/2`.
- Return: i8/i16 in A; i32 in A:X; i64 in A:X:Y plus DP[$F0..$F1] for
  the highest 16 bits.
- Frame is empty-descending (S points to next-free); offsets account
  for the +1 skew vs LLVM's full-descending model.

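The i64 return convention spreads one value over three registers and a direct-page slot. The host-side sketch below models that layout. It assumes A carries the least-significant word, with X and Y following in ascending significance; the text above only states that the DP slot holds the highest 16 bits. `Ret64`, `split_ret64`, `join_ret64`, and `check_ret64` are illustrative names.

```c
#include <assert.h>
#include <stdint.h>

/* Where each 16-bit word of an i64 return value travels.
 * ASSUMPTION: A is the least-significant word; only the DP slot's role
 * (highest 16 bits) is stated in the document. */
typedef struct {
    uint16_t a;      /* bits  0..15 (assumed) */
    uint16_t x;      /* bits 16..31 (assumed) */
    uint16_t y;      /* bits 32..47 (assumed) */
    uint16_t dp_f0;  /* bits 48..63, held at DP $F0..$F1 */
} Ret64;

Ret64 split_ret64(uint64_t v)
{
    Ret64 r;
    r.a     = (uint16_t)(v & 0xFFFFu);
    r.x     = (uint16_t)((v >> 16) & 0xFFFFu);
    r.y     = (uint16_t)((v >> 32) & 0xFFFFu);
    r.dp_f0 = (uint16_t)(v >> 48);
    return r;
}

uint64_t join_ret64(Ret64 r)
{
    return (uint64_t)r.a
         | ((uint64_t)r.x << 16)
         | ((uint64_t)r.y << 32)
         | ((uint64_t)r.dp_f0 << 48);
}

/* Round-trip self-check on one sample value. */
void check_ret64(void)
{
    const uint64_t v = 0x1122334455667788ULL;
    assert(join_ret64(split_ret64(v)) == v);
    assert(split_ret64(v).dp_f0 == 0x1122u);
}
```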

**IIgs toolbox:**

- `iigs/toolbox.h` — autogenerated wrappers for all ~1300 IIgs
  toolbox routines across 35 tool sets (Tool Locator, Memory
  Manager, Misc Tools, QuickDraw II / Aux, Event Manager,
  Sound Manager, Apple Desktop Bus, SANE, Integer Math, Text
  Tools, Window Manager, Menu Manager, Control Manager,
  LineEdit, Dialog Manager, Scrap Manager, Standard File,
  Note Synth/Sequencer, Font Manager, List Manager, ACE,
  Resource Manager, MIDI, Video Overlay, TextEdit, Media
  Control, Print Manager, Scheduler, Desk Manager, …). Names
  match Apple's IIgs Toolbox Reference exactly (TLStartUp,
  MMStartUp, NewWindow, SysBeep, …). 417 simple wrappers
  (zero/single-arg, i16-or-void return) inline in the header;
  890 multi-arg ones live in `runtime/src/iigsToolbox.s`.
  Generated by `scripts/genToolbox.py` from ORCA-C's
  `ORCACDefs/` (re-runnable when ORCA-C updates).

## In flight

(Nothing currently — the four previous in-flight items all
landed: basic-regalloc-by-default replaced greedy and resolved
the long-arg-chain failure; `time()` reads ReadTimeHex when the
program has called `iigsToolboxInit()` and `clock()` reads the
VBL counter via 24-bit absolute load; the (sr,s),Y bank-wrap
addressing is no longer emitted by any inserter and the
`W65816NegYIndY` workaround is disabled; LC ceiling extended
from $E000 to $10000 since crt0's `lda $C083` read-twice enables
RAM through $FFFF, gaining 8KB of bank-0 space.)

## Yet to come

- **Multi-bank BSS / init_array** — multi-segment splits text
  across banks but BSS + init_array still live in segment 1's bank
  (bank 0). Programs whose zero-init data exceeds the ~60KB bank-0
  budget would need crt0 to walk a per-segment table of
  `(start, end)` pairs. Not blocking >64KB *code* programs; only
  matters for programs with very large global arrays.

- **GS/OS Loader OMF format compatibility** — the OMF format we
  emit is now byte-equivalent to real Apple S16 segments at the
  header level. Verified by extracting the ABOUT segment from
  real `/SYSTEM/START` (FINDER) via Cadius (`/tmp/cadius/cadius`,
  not AppleCommander which can't extract forks) and comparing
  field-by-field against ours. Five fixes landed in
  `src/link816/omfEmit.cpp` along the way:

  1. VERSION byte 0x21 → 0x02 (was BCD-style "2.1"; real format
     is enum where 0x02 = v2.1). Cleared error $1102.
  2. Body opcode 0xF1 (DS = N zeros) → 0xF2 (compact LCONST,
     2-byte length + N data bytes). Long-form 0xF5 LCONST is in
     the spec but real Loader appears to mis-parse it (3 stale
     copies of the segment ended up scattered in RAM). Every real
     segment we decoded uses 0xF2.
  3. KIND 0x0000 (CODE) → 0x8000 (CODE|STATIC) for legacy
     single-segment mode. Real ABOUT segment uses 0x8000; with
     0x0000 the Loader returns $110A loadSegFailErr. Multi-segment
     mode keeps 0x8800 (CODE|STATIC|ABSBANK) since each seg has a
     fixed ORG.
  4. BANKSIZE 0 → 0x10000 (matches real code segments).
  5. LOAD_NAME emitted as 10 bytes of zeros immediately after
     the 44-byte header (some sources omit it, real OMFs include it).

  GS/OS 6.0.2 is installed under `tools/gsos/` and boots cleanly
  to Finder in MAME. Replacing `/SYSTEM/START` with a known-good
  OMF (the extracted ABOUT segment) gives error `$005C` —
  identical to what we get with our test program — meaning our
  OMF is indistinguishable from real Apple S16 as far as the
  Loader is concerned. The $005C is *not* OMF rejection; it is
  the boot-launcher path failing because a minimal `/SYSTEM/START`
  doesn't chain to a real Finder via QUIT-with-pathname.

  `runtime/src/crt0Gsos.s` is committed: skips SEI/LC-reconfig
  (GS/OS owns CPU state), zeros BSS, runs init_array, calls
  main, then QUIT(pcount=2) chained to `gChainPath` (default
  `/SYSTEM/START.ORIG`). Linkage works.

  Tested with a marker write as the very first instruction of
  crt0Gsos, replacing `/SYSTEM/START` with our OMF and saving
  the original as `/SYSTEM/START.ORIG` for chain-back. After a
  110-second boot, the marker at `$00/0078` is still 0 — the Loader
  places our segment in RAM (entry signature found in 3 banks
  via memory search) but **never JSLs entry**. Tested ENTRY=0,
  ENTRY=1 (with NOP pad), auxtype=0 and =DB03; all give the
  same $005C without ever calling our code. Conclusion: the
  boot-launcher path requires the `~ExpressLoad` segment that
  every real `/SYSTEM/START` carries. Without ExpressLoad,
  the bootstrap takes a code path that loads our segment but
  never auto-calls it.

  **OMF format → fully Loader-compatible** after reading
  Merlin32 source. Final canonical fields (single-segment
  Finder-launchable app):

  - KIND=0x1000 (CODE|PRIV) — was 0x8000 (CODE|STATIC) which
    came from extracting ABOUT from real FINDER, but ABOUT is a
    sub-segment called as a subroutine, not a launchable app
  - LABLEN=10 (fixed-width 10-byte LOAD_NAME and SEG_NAME,
    space-padded) — was 0 (length-prefixed) which is what
    /SYSTEM/START FINDER uses but the Loader will only LOAD,
    not JSL-into, that format
  - VERSION=0x02 (OMF v2.1)
  - BANKSIZE=0x10000 for code segs
  - Body opcode 0xF2 LCONST with NUMLEN-byte (=4) count

  ExpressLoad emission also landed (`omfEmit --expressload`):
  6-byte header + segment list + remap list + header info,
  byte-equivalent to Merlin32's `BuildExpressLoadSegment`.

  End-to-end runtime verification: new `scripts/runViaFinder.sh`
  injects an OMF as `/SYSTEM.DISK/HELLO`, boots GS/OS in MAME,
  drives Finder via Lua keyboard automation (S+Cmd-O to open
  System.Disk, H+Cmd-O to launch HELLO), and samples specified
  memory addresses to verify execution. Pattern adapted from
  `joeylib/scripts/run-iigs-mame.sh` in a sibling project.
  Pure-asm marker tests (`sta $000078 long, value=$42`) are
  confirmed running under real GS/OS Loader with
  `runViaFinder.sh hello.omf --check 0x000078=0x42` returning
  exit 0.

  **Compiled C now runs under real GS/OS Loader.** Implemented
  option (a) from the analysis: OMF cRELOC opcode emission.

  - `link816 --reloc-out FILE` records every R_W65816_IMM24
    relocation site (intra-segment 24-bit refs only — GS/OS
    dispatcher calls and other cross-bank refs are filtered out)
    as a binary sidecar of (patchOff, offsetRef) pairs.
  - `omfEmit --relocs FILE` reads the sidecar and emits a
    cRELOC opcode (0xF5) per site between the LCONST data and the
    END opcode. Format per Merlin32:
    `0xF5 ByteCnt(=3) Shift(=0) OffsetPatch(2) OffsetReference(2)`
    = 7 bytes.
  - The Loader rewrites segment[OffsetPatch..OffsetPatch+2] to
    `(segPlacedBase + OffsetReference)` at load time, fixing
    every `jsl`/`jml`/`sta long`/`lda long` operand that targets
    an in-segment symbol.
  - End-to-end verified: a real C function call + for loop
    (`sumTo(10)` → 55, `sumTo(100)` → 5050) compiled with clang
    -O2, linked, OMF-emitted with cRELOC, injected as
    `/SYSTEM.DISK/HELLO`, launched from Finder via MAME-Lua
    keyboard automation, marker bytes verified at the expected
    values. Smoke check #62 verifies cRELOC opcode count
    matches the link816 sidecar count.

  Smoke tests #59-#60 (omfEmit single + multi-segment) verify
  the structural format invariants (VERSION=0x02, KIND=0x8000
  or 0x8800, body opcode 0xF2 LCONST) so regressions are
  caught. `scripts/runMultiSeg.sh` mini-loader continues to
  cover the >64KB use case end-to-end.

- **C++ exceptions in CI smoke** — runs reliably outside smoke;
  see context below. The SJLJ runtime end-to-end test passes;
  the C++ frontend→backend path is compile/link verified in
  smoke; the full execution path is left out due to MAME-side I/O
  flakiness (the same binary runs fine interactively).

- **GS/OS validated against a real ProDOS volume** — the wrapper
  contract (PHA + PEA 0 + LDX + JSL $E100A8 + post-call SP fixup)
  is verified end-to-end in MAME against a stub dispatcher
  (`scripts/runInMameWithGsosStub.sh`). Validating against an
  actual GS/OS-loaded volume needs a bootable system disk image
  attached as a MAME smartport hard disk and Tool Locator init —
  out of scope for an automated CI smoke.

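The 7-byte cRELOC record described in the OMF item above, and the patch the Loader is described as applying, can be sketched in a few lines of C. This is an illustration under stated assumptions, not the code in omfEmit: multi-byte fields are assumed little-endian, and `emit_creloc` and `apply_creloc` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { CRELOC_OPCODE = 0xF5, CRELOC_SIZE = 7 };

/* Writes one cRELOC record into out (at least 7 bytes):
 * opcode, ByteCnt, Shift, OffsetPatch(2), OffsetReference(2).
 * ASSUMPTION: the two 16-bit fields are stored low byte first. */
size_t emit_creloc(uint8_t *out, uint16_t patch_off, uint16_t ref_off)
{
    out[0] = CRELOC_OPCODE;
    out[1] = 3;                              /* ByteCnt: 24-bit operand */
    out[2] = 0;                              /* Shift: none */
    out[3] = (uint8_t)(patch_off & 0xFFu);
    out[4] = (uint8_t)(patch_off >> 8);
    out[5] = (uint8_t)(ref_off & 0xFFu);
    out[6] = (uint8_t)(ref_off >> 8);
    return CRELOC_SIZE;
}

/* Models the load-time fixup: store (placed_base + ref_off) as a 24-bit
 * little-endian value at seg[patch_off .. patch_off + 2]. */
void apply_creloc(uint8_t *seg, size_t seg_len, uint32_t placed_base,
                  uint16_t patch_off, uint16_t ref_off)
{
    uint32_t target = (placed_base + ref_off) & 0xFFFFFFu;
    assert((size_t)patch_off + 3 <= seg_len);
    seg[patch_off]     = (uint8_t)(target & 0xFFu);
    seg[patch_off + 1] = (uint8_t)((target >> 8) & 0xFFu);
    seg[patch_off + 2] = (uint8_t)((target >> 16) & 0xFFu);
}
```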