65816-llvm-mos/docs/USAGE.md
Scott Duensing 6bff7bea3f Docs!
2026-05-14 11:23:00 -05:00


Using llvm816

This document covers compiling a C program, linking it into an Apple IIgs binary, and running it under MAME. It assumes you've followed INSTALL.md and have a working tools/llvm-mos-build/bin/clang.

Quick reference

CLANG=tools/llvm-mos-build/bin/clang
LINK=tools/link816
RUNTIME=runtime

# 1. Compile C to object
$CLANG --target=w65816 -O2 -I$RUNTIME/include -c hello.c -o hello.o

# 2. Link to a raw binary (loadable at $00:1000)
$LINK -o hello.bin --text-base 0x1000 \
    $RUNTIME/crt0.o $RUNTIME/libc.o $RUNTIME/libgcc.o hello.o

# 3. Run under MAME
bash scripts/runInMame.sh hello.bin --check 0x025000=????

Compiling C

The compiler is invoked just like a normal clang, with --target=w65816:

clang --target=w65816 -O2 -c source.c -o source.o

Recommended flags:

| Flag | Meaning |
|------|---------|
| --target=w65816 | Selects the W65816 backend (required) |
| -O2 | Default optimization level. -O0 and -O1 work but produce ~3-5× larger code |
| -ffunction-sections | Put each function in its own section. Lets the linker drop unreferenced functions |
| -I runtime/include | Find <stdio.h> etc. |
| -c | Compile only: produce .o, don't link |

What works at -O2:

  • All C99 scalars: int8_t through int64_t, signed and unsigned, all arithmetic operators
  • Soft float and double (full IEEE-754 with round-to-nearest-even)
  • Pointers, arrays, structs, unions, bitfields
  • All control flow: if, for, while, goto, switch, recursion
  • <stdarg.h> varargs
  • <setjmp.h> setjmp/longjmp (SJLJ, no DWARF unwinder)
  • Inline __asm__ with "a", "x", "y" register constraints
  • C++ subset: classes, single+multiple inheritance, virtual functions, RTTI, dynamic_cast. No exceptions (DWARF unwinder not implemented).

See STATUS.md for the full feature matrix.
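
The round-to-nearest-even claim for soft float can be spot-checked with a tie case. The following is a minimal sketch in standard C; the helper name roundsTiesToEven is hypothetical and not part of the runtime. The value 2^24 + 1 lies exactly halfway between two representable floats, so a conforming conversion has to pick the neighbour with the even mantissa.

```c
#include <stdint.h>

/* Hypothetical helper (not part of the runtime): returns 1 if the
   int32 -> float conversion rounds ties to even, 0 otherwise. */
int roundsTiesToEven(void) {
    /* volatile keeps the compiler from folding the conversions away */
    volatile int32_t lowTie  = 16777217;  /* 2^24 + 1: halfway between 16777216 and 16777218 */
    volatile int32_t highTie = 16777219;  /* 2^24 + 3: halfway between 16777218 and 16777220 */
    volatile float a = (float)lowTie;
    volatile float b = (float)highTie;
    /* Ties-to-even picks the neighbour whose mantissa is even. */
    return a == 16777216.0f && b == 16777220.0f;
}
```

Storing the result of roundsTiesToEven() to a cell such as $02:5000 would let the --check form described later confirm the behaviour on the target.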

Linking

The linker is tools/link816. It produces either a raw binary for direct execution (loaded at a fixed address) or an OMF binary for the GS/OS Loader.

Raw binary

link816 -o output.bin --text-base 0x1000 crt0.o libc.o libgcc.o yourprog.o
  • --text-base 0x1000: physical address where code is loaded. 0x1000 is the conventional starting address; the first 4 KB of bank 0 ($00:0000-$00:0FFF) is reserved for the stack and zero page.
  • crt0.o: the C runtime startup. Sets DBR, calls main, halts. Always link first.
  • libc.o: printf, malloc, strlen, etc.
  • libgcc.o: compiler-helper routines (__mulhi3, __umulhisi3, __divhi3, __ashlhi3, etc.). Required by most non-trivial programs.
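
The 65816 has no hardware multiply or divide instruction, which is why even simple arithmetic can pull in libgcc.o helpers. As an illustration of the kind of work such a helper does, here is a sketch of the classic shift-and-add technique for a 16-bit multiply. It is not the runtime's actual __mulhi3, and the name mul16ShiftAdd is hypothetical.

```c
#include <stdint.h>

/* Illustrative only: NOT the runtime's __mulhi3, just the shift-and-add
   technique a 16-bit multiply helper could use. Result wraps modulo 2^16. */
uint16_t mul16ShiftAdd(uint16_t a, uint16_t b) {
    uint16_t result = 0;
    while (b != 0) {
        if (b & 1u) {
            result = (uint16_t)(result + a);  /* add the current partial product */
        }
        a = (uint16_t)(a << 1);  /* the next bit of b is worth twice as much */
        b >>= 1;                 /* move on to the next bit of b */
    }
    return result;
}
```

The loop runs once per significant bit of b, so the cost depends on the operand values.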

Additional runtime libraries

| Library | What you get |
|---------|--------------|
| runtime/libc.o | Core C library: printf, malloc, strlen, etc. |
| runtime/libgcc.o | Compiler helpers: multiply, divide, shift |
| runtime/snprintf.o | sprintf / snprintf / vsnprintf |
| runtime/sscanf.o | sscanf / vsscanf / fscanf |
| runtime/softDouble.o | IEEE 754 double-precision math |
| runtime/softFloat.o | IEEE 754 single-precision math |
| runtime/math.o | fabs, floor, sqrt, sin, cos, etc. |
| runtime/qsort.o | qsort / bsearch |
| runtime/strtol.o | strtol / strtoul / atoi / atol |
| runtime/strtok.o | strtok / strtok_r |
| runtime/extras.o | strcat, strncat, llabs, rand/srand |
| runtime/timeExt.o | time / gmtime / mktime |
| runtime/iigsToolbox.o | Apple IIgs Toolbox call wrappers |
| runtime/iigsGsos.o | GS/OS call wrappers |

Link only what you use — the linker drops unreferenced symbols.

Build them all once with:

bash runtime/build.sh

Multi-segment OMF (for GS/OS Loader)

For programs that need >60 KB of code (the usable bank-0 limit after subtracting the stack, zero-page, and I/O window), build a multi-segment OMF that GS/OS Loader can place across banks:

link816 -o myprog.bin --omf --manifest my.manifest \
    --expressload \
    crt0Gsos.o ... yourprog.o

See docs/multiSegmentPlan.md for details and scripts/runMultiSeg.sh for a working example.

Running under MAME

The supplied scripts/runInMame.sh launches MAME's apple2gs with the right ROM path, loads your binary at $00:1000, runs for a few seconds, and reads back a memory cell.

bash scripts/runInMame.sh prog.bin                     # just run for 5s
bash scripts/runInMame.sh prog.bin --check 0x025000=00ff
bash scripts/runInMame.sh prog.bin 0x025000 0x025002   # dump these addrs

The --check ADDR=VALUE form exits with status 0 if ADDR contains VALUE after the run, and with status 1 otherwise. Pass ???? in place of the value (as in --check 0x025000=????) to dump the cell without checking.

MAME is invoked headless by default (no window) via -video none + SDL_VIDEODRIVER=dummy. This works on servers/CI runners.

The bank-switch idiom

Bank 0 ($00:0000-$00:FFFF) has the I/O window at $C000-$CFFF that interferes with normal data access. The convention is to switch the data bank register (DBR) to bank 2 ($02:0000) before doing any data work:

__attribute__((noinline)) void switchToBank2(void) {
    __asm__ volatile (
        "sep #0x20\n"        // 8-bit accumulator
        ".byte 0xa9,0x02\n"  // lda #2 (force as bytes — llvm-mc bug)
        "pha\n"
        "plb\n"              // DBR = 2
        "rep #0x20\n"        // back to 16-bit
    );
}

After switchToBank2(), your data lives at $02:0000 upward. The runInMame.sh --check 0x025000=... address is $02:5000 — accessible via a normal store in bank 2.
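
The relationship between the 24-bit address given to --check and the 16-bit pointer used in C can be made explicit. The helpers below are a hypothetical sketch (none of these names exist in the runtime): they split a 24-bit address into bank and offset, and test whether a bank-0 offset falls inside the I/O window mentioned above.

```c
#include <stdint.h>

/* Hypothetical helpers for reasoning about 24-bit 65816 addresses. */

/* Bank byte: bits 16-23 of the address (what DBR must hold). */
uint8_t addrBank(uint32_t addr24) {
    return (uint8_t)((addr24 >> 16) & 0xFFu);
}

/* Offset within the bank: bits 0-15 (what a 16-bit C pointer holds). */
uint16_t addrOffset(uint32_t addr24) {
    return (uint16_t)(addr24 & 0xFFFFu);
}

/* Returns 1 if a bank-0 offset lies in the $C000-$CFFF I/O window. */
int inIoWindow(uint16_t offset) {
    return offset >= 0xC000u && offset <= 0xCFFFu;
}
```

For 0x025000 this gives bank 0x02 and offset 0x5000, which is why the examples store through a pointer of 0x5000 after switching DBR to 2.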

Examples

Hello, integer

__attribute__((noinline)) void switchToBank2(void) {
    __asm__ volatile (
        "sep #0x20\n.byte 0xa9,0x02\npha\nplb\nrep #0x20\n"
    );
}

int main(void) {
    int x = 42;
    switchToBank2();
    *(volatile int *)0x5000 = x;
    while (1) {}
}

Build & run:

clang --target=w65816 -O2 -c hello.c -o hello.o
link816 -o hello.bin --text-base 0x1000 \
    runtime/crt0.o runtime/libc.o runtime/libgcc.o hello.o
bash scripts/runInMame.sh hello.bin --check 0x025000=002a    # 0x2a = 42
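
The 65816 is little-endian, so the 16-bit store of 42 places 0x2a at $02:5000 and 0x00 at $02:5001. The sketch below shows how two such bytes combine into the word 002a; it assumes the check compares a 16-bit little-endian word, which the four-digit value suggests but which is not confirmed here. The name readWordLE is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: assembles a 16-bit word from two bytes stored
   low byte first, as the 65816 lays them out in memory. */
uint16_t readWordLE(const uint8_t *mem) {
    return (uint16_t)(mem[0] | ((uint16_t)mem[1] << 8));
}
```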

Recursion + printing

#include <stdio.h>
#include <stdlib.h>

unsigned long fib(unsigned n) {
    if (n < 2) return n;
    return fib(n-1) + fib(n-2);
}

__attribute__((noinline)) void switchToBank2(void) {
    __asm__ volatile (
        "sep #0x20\n.byte 0xa9,0x02\npha\nplb\nrep #0x20\n"
    );
}

int main(void) {
    char buf[32];
    int len = snprintf(buf, sizeof buf, "fib(10) = %lu", fib(10));
    switchToBank2();
    // Copy buf to $025000 so we can read it after the run
    for (int i = 0; i <= len; i++)
        ((volatile char *)0x5000)[i] = buf[i];
    while (1) {}
}

Build (note: need snprintf.o for snprintf):

clang --target=w65816 -O2 -I runtime/include -c fib.c -o fib.o
link816 -o fib.bin --text-base 0x1000 \
    runtime/crt0.o runtime/libc.o runtime/libgcc.o \
    runtime/snprintf.o runtime/softDouble.o runtime/sscanf.o fib.o
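
To know what to expect in memory after the run, the formatting step can be tried on a host compiler first. This is a sketch using only standard C; formatFib is a hypothetical helper that produces the same string main() copies to $02:5000.

```c
#include <stdio.h>

unsigned long fib(unsigned n) {
    if (n < 2) return n;
    return fib(n - 1) + fib(n - 2);
}

/* Hypothetical helper: formats the string the example writes to $02:5000.
   Returns the length snprintf reports (excluding the terminating NUL). */
int formatFib(char *buf, size_t size, unsigned n) {
    return snprintf(buf, size, "fib(%u) = %lu", n, fib(n));
}
```

For n = 10 the buffer holds "fib(10) = 55", which is 12 characters, so the copy loop in the example writes 13 bytes including the terminating NUL.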

Apple IIgs Toolbox

#include <iigs/toolbox_full.h>

int main(void) {
    DrawString("\pHello, World");
    while (1) {}
}

Build:

clang --target=w65816 -O2 -I runtime/include -c hello_gs.c -o hello_gs.o
link816 -o hello_gs.bin --text-base 0x1000 \
    runtime/crt0Gsos.o runtime/iigsToolbox.o runtime/iigsGsos.o \
    runtime/libgcc.o hello_gs.o

Use crt0Gsos.o (not crt0.o) for programs that call into the toolbox — it sets up the IIgs runtime environment.
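
The "\p" prefix in the DrawString call marks a Pascal string: a length byte followed by the characters, with no NUL terminator relied upon. A minimal sketch of that layout, built by hand in portable C, looks like this; makePascalString is a hypothetical helper, not part of the toolbox headers.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: builds a length-prefixed (Pascal) string in out[].
   Returns the total number of bytes written, or 0 if the text does not fit. */
size_t makePascalString(unsigned char *out, size_t outSize, const char *text) {
    size_t len = strlen(text);
    if (len > 255 || len + 1 > outSize) {
        return 0;  /* the length prefix is one byte; buffer needs prefix + text */
    }
    out[0] = (unsigned char)len;  /* length prefix */
    memcpy(out + 1, text, len);   /* characters follow, no NUL terminator */
    return len + 1;
}
```

For "Hello, World" the first byte is 12 and the whole string occupies 13 bytes.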

Inline assembly

The W65816 backend supports __asm__ with operand constraints "a", "x", "y":

unsigned short addOne(unsigned short x) {
    unsigned short r;
    __asm__("inc a" : "=a"(r) : "a"(x));
    return r;
}

Multi-instruction asm and raw bytes both work:

__asm__ volatile (
    "sep #0x20\n"
    ".byte 0x68\n"      // pla
    "rep #0x20\n"
);

The .byte 0xa9, ... form is sometimes needed to work around llvm-mc encoding gaps — the assembler doesn't yet support every 65816 addressing mode literally. The pattern works for any opcode whose mnemonic doesn't yet parse.
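
When testing inline asm, a portable reference for the same operation is handy to compare against. The sketch below mirrors the addOne example, assuming "inc a" runs with a 16-bit accumulator so the result wraps at 0xFFFF. The name addOneRef is hypothetical.

```c
#include <stdint.h>

/* Portable stand-in for the inline-asm addOne: a 16-bit increment
   that wraps from 0xFFFF to 0, matching a 16-bit "inc a". */
uint16_t addOneRef(uint16_t x) {
    return (uint16_t)(x + 1u);
}
```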

Tools reference

| Tool | Location | Purpose |
|------|----------|---------|
| clang | tools/llvm-mos-build/bin/clang | C/C++ compiler |
| llvm-mc | tools/llvm-mos-build/bin/llvm-mc | Assembler |
| llvm-objdump | tools/llvm-mos-build/bin/llvm-objdump | Disassembler |
| llc | tools/llvm-mos-build/bin/llc | Standalone codegen (.ll → .s) |
| link816 | tools/link816 | Our relocating linker |
| omfEmit | tools/omfEmit | Emit OMF v2.1 binary from link816 output |
| mame | apt (system-wide) | Apple IIgs emulator |

Debugging

Look at the asm

clang --target=w65816 -O2 -S -o prog.s prog.c

Look at the MIR after each pass

clang --target=w65816 -O2 -mllvm -print-after-all -S prog.c 2>&1 | less

Useful pass names to filter on:

| Pass name | What it does |
|-----------|--------------|
| w65816-isel | SDAG → MachineInstr selection |
| w65816-widen-acc16 | Promote Acc16 vregs to Wide16 (regalloc help) |
| w65816-stack-slot-cleanup | Remove redundant spill/reload |
| w65816-stackrel-to-img | Promote hot stack slots to DP IMG slots |
| w65816-stack-slot-merge | Collapse PHI src/dst slot pairs |
| w65816-branch-expand | Long-distance Bxx → INV_Bxx skip;BRA |

Single-pass filter

clang --target=w65816 -O2 -mllvm -print-after=w65816-isel \
    -mllvm -filter-print-funcs=myfunc -S prog.c 2>&1 | less

Cycle-count benchmarks

Eight microbenchmarks live under benchmarks/. Each runs N iterations of the bench function and reports a per-call cycle count via MAME's emu.time():

bash scripts/benchCyclesPrecise.sh

Output:

| Benchmark | Per-call cycles (clang) |
|-----------|------------------------:|
| bsearch | 767 cyc/call |
| dotProduct | 2131 cyc/call |
| fib | 12617 cyc/call |
| memcmp | 989 cyc/call |
| popcount | 2864 cyc/call |
| strcpy | 2216 cyc/call |
| sumOfSquares | 16709 cyc/call |

The compare/ directory has side-by-side .s files vs Calypsi 5.16 for sumSquares, evalAt, and mul16to32. Rerun with:

bash compare/regen.sh

Known limitations

  • C++ exceptions are not implemented. try/catch compiles but doesn't unwind. -fsjlj-exceptions works for limited SJLJ-style throwing.
  • stdin always returns EOF. scanf compiles but isn't useful. Use sscanf on a buffer instead.
  • File I/O through fopen etc. requires a backing implementation. The default mfs backing (memory-file-system) lets you simulate files via mfsRegister() — useful for tests, not for real disk I/O. GS/OS file I/O works via runtime/iigsGsos.o if you link against the GS/OS runtime.
  • fork/exec — not applicable on a 65816, no support.
  • Code generation gotcha: very large frames (>200 bytes) trigger FP-relative addressing. Most programs fit under that limit. See the frame-rel discussion in LLVM_65816_DESIGN.md.
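
For the stdin limitation above, parsing from a buffer with sscanf is the suggested workaround. A minimal sketch, using only standard C and a hypothetical helper name parseSize, might look like this; on the target it would need runtime/sscanf.o at link time.

```c
#include <stdio.h>

/* Hypothetical helper: parses "WIDTHxHEIGHT" (for example "320x200")
   from a buffer. Returns 1 on success, 0 if the text does not match. */
int parseSize(const char *text, int *width, int *height) {
    /* sscanf returns the number of fields it converted */
    return sscanf(text, "%dx%d", width, height) == 2;
}
```

Checking the return value of sscanf against the expected field count is what distinguishes a full match from a partial or failed one.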

Where to go next

  • Building real GS/OS apps: see docs/multiSegmentPlan.md and the runViaFinder.sh script for booting through real GS/OS 6.0.2 in MAME.
  • Backend internals (you're hacking on the compiler): LLVM_65816_DESIGN.md.
  • Smoke tests: scripts/smokeTest.sh runs ~150 end-to-end checks. Read it for examples of every feature in action.