65816-llvm-mos/tests/coremark/README.md
2026-05-27 19:37:26 -05:00

CoreMark — EEMBC's standard embedded benchmark

CoreMark 1.0 ported to the W65816 / Apple IIgs target. Source is vendored under coremark-src/ from github.com/eembc/coremark.

CoreMark exercises three distinct algorithm families:

  1. Linked list traversal + insert/sort (core_list_join.c)
  2. Matrix init + multiply (core_matrix.c)
  3. State machine processing a string (core_state.c)

…plus utility code (CRC, RNG) in core_util.c. Total ~2000 LOC.

This is the embedded benchmark against which vendors publish CoreMark/MHz scores (Cortex-M0, AVR, RISC-V, ...). It is a useful cross-platform sanity check on our backend's code quality.

Files

  • coremark-src/ — vendored EEMBC source (read-only)
  • core_portme.h / .c — W65816 porting layer (timing, malloc, printf bridge)
  • build.sh — compile the 5 core .c files + portme
  • runCoreMark.sh — build + link + run under MAME

Building

bash tests/coremark/build.sh --layer2

--layer2 enables -mllvm -w65816-dbr-safe-ptrs. This is required to fit the binary in a single bank; without it, text crosses the IO window at 0xC000. CoreMark only touches malloc/static-array memory, so the dbr-safe-ptrs assumption is correct.

Default iteration count is 1 (smallest valid run). Override for publishable scores:

ITERATIONS=5 bash tests/coremark/build.sh --layer2

The CoreMark spec recommends >= 10 seconds of runtime. At ~1 MHz, expect roughly one iteration per second of in-IIgs time, so iteration counts of 10 to 60 give a representative score.
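The iteration-count arithmetic above can be sketched in C. This is only an illustration: `min_iterations` is a hypothetical helper, not part of the port, and the rate of about one iteration per second is the estimate quoted above rather than a measurement.

```c
#include <assert.h>

/* Hypothetical helper: smallest iteration count whose estimated runtime
 * reaches min_seconds. The estimated rate is passed as a fraction
 * (iters_num / iters_den iterations per second) to stay in integer math. */
unsigned long min_iterations(unsigned long min_seconds,
                             unsigned long iters_num,
                             unsigned long iters_den)
{
    assert(iters_den != 0);
    /* ceil(min_seconds * iters_num / iters_den) */
    return (min_seconds * iters_num + iters_den - 1) / iters_den;
}
```

With the estimate of one iteration per second, `min_iterations(10, 1, 1)` gives the lower bound of 10 iterations for a 10-second run.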

Running

bash tests/coremark/runCoreMark.sh --layer2

The run terminates with 0xC0DE written to $025000 on success. Elapsed VBL ticks (60 Hz) are stored at $025002 (low/hi halves).
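As a rough sketch, a host-side tool could decode those result words from a memory dump and turn the tick count into iterations per second. The 6-byte layout (marker, then low half, then high half, each little-endian 16-bit) and the names `decode_result` and `score_milli` are assumptions made for illustration; check them against core_portme.c before relying on them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DONE_MARKER   0xC0DEu  /* success marker described above */
#define TICKS_PER_SEC 60u      /* VBL rate described above */

/* Read a little-endian 16-bit word (assumed byte order). */
static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | ((uint16_t)p[1] << 8));
}

/* Assumed layout of the 6 bytes dumped from $025000:
 * bytes 0-1 marker, bytes 2-3 low half of ticks, bytes 4-5 high half.
 * Returns 1 and fills *ticks when the marker is present, 0 otherwise. */
int decode_result(const uint8_t dump[6], uint32_t *ticks)
{
    assert(dump != NULL && ticks != NULL);
    if (read_le16(dump) != DONE_MARKER)
        return 0;
    *ticks = (uint32_t)read_le16(dump + 2)
           | ((uint32_t)read_le16(dump + 4) << 16);
    return 1;
}

/* Iterations per second scaled by 1000, to avoid floating point. */
uint32_t score_milli(uint32_t iterations, uint32_t ticks)
{
    if (ticks == 0)
        return 0;
    return (uint32_t)(((uint64_t)iterations * TICKS_PER_SEC * 1000u) / ticks);
}
```

For example, 10 iterations finishing in 600 ticks (10 seconds at 60 Hz) yields `score_milli(10, 600) == 1000`, that is, 1.000 iterations per second.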

Note: running CoreMark under MAME inside this project's restricted shell crashes MAME (same SIGSEGV as Lua's full interpreter run — see feedback_lua_compile_test.md). The build produces a valid binary; the run only works in an unrestricted shell. Workaround: copy coreMark.bin out of the sandbox and run with the same runInMame.sh invocation directly.

Size vs Calypsi (5 core files, ITERATIONS=1, PERFORMANCE_RUN)

| File | Ours (L2+threshold=50) | Calypsi 5.16 | Ratio |
|---|---|---|---|
| core_list_join.o | 10,008 | 9,073 | 1.10× |
| core_main.o | 11,588 | 19,772 | 0.59× |
| core_matrix.o | 10,660 | 11,078 | 0.96× |
| core_state.o | 7,256 | 9,944 | 0.73× |
| core_util.o | 3,156 | 4,631 | 0.68× |
| TOTAL | 42,668 | 54,498 | 0.78× |

Our total is 22% smaller than Calypsi's on CoreMark overall. (Since the inline threshold dropped from 75 to 50 target-wide, core_matrix.o improved from 1.37× → 0.96× by no longer inlining 5 nested-loop helpers.)
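The Ratio column and the 22% figure follow from simple arithmetic on the object sizes. A minimal sketch, with `ratio_centi` and `total_size` as hypothetical helper names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Size ratio ours/theirs in hundredths, rounded to nearest
 * (a result of 78 means 0.78x). */
uint32_t ratio_centi(uint32_t ours, uint32_t theirs)
{
    assert(theirs != 0);
    return (ours * 100u + theirs / 2u) / theirs;
}

/* Sum of n object sizes in bytes. */
uint32_t total_size(const uint32_t *sizes, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += sizes[i];
    return sum;
}
```

Feeding in the five per-file sizes reproduces the totals of 42,668 and 54,498 bytes, and `ratio_centi(42668, 54498)` returns 78, so our total is 100 - 78 = 22% smaller.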

Notes on the porting layer

  • ee_u32 is unsigned long (not unsigned int — on W65816 int is 16-bit; long is 32-bit). CoreMark depends on 32-bit ee_u32 for CRC and timing math.
  • MEM_METHOD = MEM_STATIC — a single 2 KB static array in BSS. Avoids dynamic alloc and the resulting heap-management overhead.
  • start_time / stop_time use clock() which returns the 60 Hz VBL counter. EE_TICKS_PER_SEC = 60.
  • HAS_FLOAT = 1 — CoreMark uses double precision for the score calculation; our soft-double handles it.
  • MULTITHREAD = 1 — single-context. The IIgs doesn't have threads.
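To illustrate why the width of ee_u32 matters, the sketch below emulates 16-bit and 32-bit arithmetic on a host using fixed-width types. The millisecond conversion is a made-up example of timing math, not code taken from CoreMark or from core_portme.c, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* On the W65816 target, per the notes above:
 *   typedef unsigned long ee_u32;   (32-bit there)
 * On a host, uint16_t and uint32_t stand in for the target's int and long. */

/* Elapsed milliseconds computed as if every intermediate were 16 bits wide. */
uint16_t elapsed_ms_16(uint16_t ticks, uint16_t ticks_per_sec)
{
    assert(ticks_per_sec != 0);
    /* The product is truncated to 16 bits, so it wraps modulo 65536. */
    uint16_t scaled = (uint16_t)((uint32_t)ticks * 1000u);
    return (uint16_t)(scaled / ticks_per_sec);
}

/* The same computation with 32-bit intermediates. */
uint32_t elapsed_ms_32(uint32_t ticks, uint32_t ticks_per_sec)
{
    assert(ticks_per_sec != 0);
    return (ticks * 1000u) / ticks_per_sec;
}
```

For a 10-second run (600 ticks at 60 Hz), the 32-bit version returns 10000 ms, while the 16-bit version wraps and returns 169.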

Comparing builds

Lua and CoreMark together cover roughly disjoint code patterns:

| Pattern | Lua | CoreMark |
|---|---|---|
| VM dispatch | yes (luaV_execute 30+ case switch) | no |
| Recursive descent parsing | yes (lparser.c) | no |
| String + hash table | yes | no |
| Linked-list traversal + sort | (small) | yes |
| Matrix init + multiply | no | yes |
| State machine | (JSON tokenizer in smoke) | yes (formal CoreMark state) |
| CRC | yes (in smoke) | yes |
| Recursion-heavy | yes | no |

So they complement each other for backend coverage. Both now compile to a size under or near Calypsi's with the standard Layer 2 + threshold=50 config.