65816-llvm-mos/tests/coremark/README.md
2026-05-27 19:37:26 -05:00

# CoreMark — EEMBC's standard embedded benchmark
CoreMark 1.0 ported to the W65816 / Apple IIgs target. Source is
vendored under `coremark-src/` from
[github.com/eembc/coremark](https://github.com/eembc/coremark).
CoreMark exercises three distinct algorithm families:

1. **Linked list traversal + insert/sort** (`core_list_join.c`)
2. **Matrix init + multiply** (`core_matrix.c`)
3. **State machine** processing a string (`core_state.c`)

Utility code (CRC, RNG) lives in `core_util.c`. The total is about 2000 LOC.

Vendors publish CoreMark/MHz scores against this benchmark
(Cortex-M0, AVR, RISC-V, ...), so it is a useful cross-platform
sanity check on our backend's code quality.
## Files
- `coremark-src/` — vendored EEMBC source (read-only)
- `core_portme.h` / `.c` — W65816 porting layer (timing, malloc,
printf bridge)
- `build.sh` — compile the 5 core .c files + portme
- `runCoreMark.sh` — build + link + run under MAME
## Building
```bash
bash tests/coremark/build.sh --layer2
```
`--layer2` enables `-mllvm -w65816-dbr-safe-ptrs`. This is **required**
to fit the binary in a single bank; without it, text crosses the IO
window at `0xC000`. CoreMark only touches malloc/static-array memory,
so the dbr-safe-ptrs assumption is correct.
Default iteration count is 1 (smallest valid run). Override for
publishable scores:
```bash
ITERATIONS=5 bash tests/coremark/build.sh --layer2
```
The CoreMark spec recommends at least 10 seconds of runtime. At ~1 MHz,
expect roughly one iteration per second of in-IIgs time, so iteration
counts of 10 to 60 give a representative score.
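
The runtime rule above can be turned into a quick estimate. This is a minimal sketch, not part of `build.sh`: the helper `min_iterations` and its rate parameter are hypothetical, and the rate of about one iteration per second is only the rough figure quoted above.

```c
#include <stdint.h>

/* Hypothetical helper: smallest iteration count whose estimated runtime
 * reaches min_seconds. The rate is given in iterations per 1000 seconds
 * so the whole calculation stays in integer math. */
uint32_t min_iterations(uint32_t min_seconds, uint32_t iters_per_1000s)
{
    uint64_t scaled = (uint64_t)min_seconds * iters_per_1000s;
    return (uint32_t)((scaled + 999u) / 1000u); /* round up */
}
```

With the rough rate of one iteration per second (`iters_per_1000s = 1000`), a 10 second run needs `min_iterations(10, 1000)` iterations, which is 10. Measure the real rate on your build before relying on the estimate.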
## Running
```bash
bash tests/coremark/runCoreMark.sh --layer2
```
The run terminates with `0xC0DE` written to `$025000` on success.
Elapsed VBL ticks (60 Hz) are stored at `$025002` (low/hi halves).
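
The result words can be decoded on the host once they are read back from the emulator. The following is a minimal sketch under stated assumptions: all function names are hypothetical, the two halves of the tick count are assumed to be 16-bit words with the low half first, and how the words are read out of MAME is not shown.

```c
#include <stdint.h>

#define SUCCESS_MARKER 0xC0DEu /* value written to $025000 on success */
#define TICKS_PER_SEC  60u     /* VBL rate */

/* Nonzero if the marker word signals a successful run. */
int run_succeeded(uint16_t marker)
{
    return marker == SUCCESS_MARKER;
}

/* Join the two 16-bit halves into one tick count.
 * Assumption: `lo` is the low half, `hi` is the high half. */
uint32_t join_ticks(uint16_t lo, uint16_t hi)
{
    return ((uint32_t)hi << 16) | lo;
}

/* Elapsed time in seconds for a given tick count. */
double elapsed_seconds(uint32_t ticks)
{
    return (double)ticks / TICKS_PER_SEC;
}

/* CoreMark reports iterations per second. Returns 0 if no ticks elapsed. */
double iterations_per_second(uint32_t iterations, uint32_t ticks)
{
    if (ticks == 0)
        return 0.0;
    return (double)iterations * TICKS_PER_SEC / (double)ticks;
}
```

For example, 600 ticks at 60 Hz is 10 seconds, so a 10-iteration run that takes 600 ticks scores 1.0 iterations per second. Dividing that score by the clock rate in MHz gives CoreMark/MHz.
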
**Note:** running CoreMark under MAME inside this project's restricted
shell crashes MAME (same SIGSEGV as Lua's full interpreter run —
see `feedback_lua_compile_test.md`). The build produces a valid
binary; the run only works in an unrestricted shell. Workaround: copy
`coreMark.bin` out of the sandbox and run it directly with the same
`runInMame.sh` invocation.
## Size vs Calypsi (5 core files, ITERATIONS=1, PERFORMANCE_RUN)
| File | Ours (L2+threshold=50) | Calypsi 5.16 | Ratio |
|------|----------------------:|-------------:|------:|
| core_list_join.o | 10,008 | 9,073 | 1.10× |
| core_main.o | 11,588 | 19,772 | 0.59× |
| core_matrix.o | 10,660 | 11,078 | 0.96× |
| core_state.o | 7,256 | 9,944 | 0.73× |
| core_util.o | 3,156 | 4,631 | 0.68× |
| **TOTAL** | **42,668** | **54,498** | **0.78×** |

Overall, our CoreMark objects are 22% smaller than Calypsi's. (Since the
inline threshold dropped from 75 to 50 target-wide, `core_matrix.o`
improved from 1.37× to 0.96× because 5 nested-loop helpers are no longer
inlined.)
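
The totals and ratios in the table can be re-derived when the numbers are refreshed. This is a small sketch using the sizes exactly as listed above; the struct and function names are hypothetical, and ratios are expressed in thousandths to avoid floating point.

```c
#include <stddef.h>

struct obj_size {
    const char *name;
    unsigned long ours;
    unsigned long calypsi;
};

/* Sizes copied from the table above. */
static const struct obj_size sizes[] = {
    { "core_list_join.o", 10008UL,  9073UL },
    { "core_main.o",      11588UL, 19772UL },
    { "core_matrix.o",    10660UL, 11078UL },
    { "core_state.o",      7256UL,  9944UL },
    { "core_util.o",       3156UL,  4631UL },
};

#define NUM_SIZES (sizeof sizes / sizeof sizes[0])

unsigned long total_ours(void)
{
    unsigned long sum = 0;
    for (size_t i = 0; i < NUM_SIZES; i++)
        sum += sizes[i].ours;
    return sum;
}

unsigned long total_calypsi(void)
{
    unsigned long sum = 0;
    for (size_t i = 0; i < NUM_SIZES; i++)
        sum += sizes[i].calypsi;
    return sum;
}

/* ours / calypsi in thousandths, rounded to nearest (783 means 0.783x). */
unsigned long ratio_permille(unsigned long ours, unsigned long calypsi)
{
    return (ours * 1000UL + calypsi / 2UL) / calypsi;
}
```

`ratio_permille(total_ours(), total_calypsi())` yields 783, which matches the 0.78× in the TOTAL row.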
## Notes on the porting layer
- `ee_u32` is `unsigned long` (not `unsigned int` — on W65816 `int` is
16-bit; `long` is 32-bit). CoreMark depends on 32-bit `ee_u32` for
CRC and timing math.
- `MEM_METHOD = MEM_STATIC` — a single 2 KB static array in BSS.
Avoids dynamic alloc and the resulting heap-management overhead.
- `start_time` / `stop_time` use `clock()` which returns the 60 Hz VBL
counter. `EE_TICKS_PER_SEC = 60`.
- `HAS_FLOAT = 1` — CoreMark uses double precision for the score
calculation; our soft-double handles it.
- `MULTITHREAD = 1` — single-context. The IIgs doesn't have threads.
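
The type and timing notes above can be illustrated with a short sketch. It is not the contents of our `core_portme.c`: the function names follow CoreMark's stock porting template, while `fake_vbl_counter` and `read_ticks` are hypothetical stand-ins for the `clock()` call so the example also runs on a host machine.

```c
#include <limits.h>
#include <stdint.h>

/* On the W65816 `int` is 16 bits and `long` is 32 bits, so the port uses
 * `unsigned long`. The fallback keeps the sketch 32-bit on hosts where
 * `long` is wider. */
#if ULONG_MAX == 0xFFFFFFFFUL
typedef unsigned long ee_u32;
#else
typedef uint32_t ee_u32;
#endif

typedef ee_u32 CORE_TICKS;
#define EE_TICKS_PER_SEC 60 /* one tick per VBL */

/* Hypothetical stand-in for the 60 Hz VBL counter behind clock(). */
static CORE_TICKS fake_vbl_counter;

static CORE_TICKS read_ticks(void)
{
    return fake_vbl_counter;
}

static CORE_TICKS start_val, stop_val;

void start_time(void) { start_val = read_ticks(); }
void stop_time(void)  { stop_val = read_ticks(); }

/* Unsigned subtraction stays correct across one counter wraparound. */
CORE_TICKS get_time(void)
{
    return (CORE_TICKS)(stop_val - start_val);
}

/* Uses double, which is why the port needs HAS_FLOAT and soft-double. */
double time_in_secs(CORE_TICKS ticks)
{
    return (double)ticks / EE_TICKS_PER_SEC;
}
```

Keeping the tick type at 32 bits matters here: a 16-bit counter at 60 Hz would wrap after about 18 minutes of runtime.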
## Comparing builds
Lua and CoreMark together cover roughly disjoint code patterns:
| Pattern | Lua | CoreMark |
|---|---|---|
| VM dispatch | yes (`luaV_execute` 30+ case switch) | no |
| Recursive descent parsing | yes (`lparser.c`) | no |
| String + hash table | yes | no |
| Linked-list traversal + sort | (small) | yes |
| Matrix init + multiply | no | yes |
| State machine | (JSON tokenizer in smoke) | yes (formal CoreMark state) |
| CRC | yes (in smoke) | yes |
| Recursion-heavy | yes | no |

The two therefore complement each other for backend coverage. Both now
compile to near or below Calypsi's size with the standard Layer 2 +
threshold=50 config.