# CoreMark — EEMBC's standard embedded benchmark

CoreMark 1.0 ported to the W65816 / Apple IIgs target.  Source is
vendored under `coremark-src/` from
[github.com/eembc/coremark](https://github.com/eembc/coremark).

CoreMark exercises three distinct algorithm families:

1. **Linked list traversal + insert/sort** (`core_list_join.c`)
2. **Matrix init + multiply** (`core_matrix.c`)
3. **State machine** processing a string (`core_state.c`)

…plus utility code (CRC, RNG) in `core_util.c`.  Total ~2000 LOC.

CoreMark is the benchmark that silicon vendors publish CoreMark/MHz
scores for (Cortex-M0, AVR, RISC-V, ...), which makes it a useful
cross-platform sanity check on our backend's code quality.

## Files

- `coremark-src/`            — vendored EEMBC source (read-only)
- `core_portme.h` / `.c`     — W65816 porting layer (timing, malloc,
                              printf bridge)
- `build.sh`                 — compile the 5 core .c files + portme
- `runCoreMark.sh`           — build + link + run under MAME

## Building

```bash
bash tests/coremark/build.sh --layer2
```

`--layer2` enables `-mllvm -w65816-dbr-safe-ptrs`.  This is **required**
to fit the binary in a single bank; without it, text crosses the IO
window at `0xC000`.  CoreMark only touches malloc/static-array memory,
so the dbr-safe-ptrs assumption is correct.

The default iteration count is 1, the smallest valid run.  Set
`ITERATIONS` to run more:

```bash
ITERATIONS=5 bash tests/coremark/build.sh --layer2
```

The CoreMark run rules call for at least 10 seconds of runtime for a
reportable score.  At ~1 MHz, expect roughly one iteration per second
of in-IIgs time, so iteration counts of 10–60 give a representative
score.
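The relationship between iteration count and runtime can be sketched as a small host-side helper.  This is an illustration only: `min_iterations` and `ticks_per_iter` are hypothetical names, and the per-iteration tick count is assumed to come from a prior 1-iteration run.

```c
#include <stdint.h>

#define TICKS_PER_SEC 60u   /* 60 Hz VBL counter */

/* Hypothetical helper: given the ticks measured for one iteration,
   return the smallest ITERATIONS value whose run lasts at least
   min_seconds.  Returns 0 when there is no measurement. */
uint32_t min_iterations(uint32_t ticks_per_iter, uint32_t min_seconds)
{
    if (ticks_per_iter == 0)
        return 0;
    uint32_t needed = min_seconds * TICKS_PER_SEC;
    /* Integer ceiling of needed / ticks_per_iter. */
    return (needed + ticks_per_iter - 1) / ticks_per_iter;
}
```

At one iteration per second (60 ticks), `min_iterations(60, 10)` gives 10, which matches the lower end of the range above.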

## Running

```bash
bash tests/coremark/runCoreMark.sh --layer2
```

On success, the run terminates with `0xC0DE` written to `$025000`.
The elapsed VBL tick count (60 Hz) is stored at `$025002` as two
16-bit halves, low half first.
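A host-side sketch of turning those values into a score, assuming the two halves are 16 bits each and have already been read out of emulator memory.  `join_ticks` and `coremark_score` are hypothetical helpers, not part of this repo; the score formula is the usual CoreMark one, iterations per second.

```c
#include <stdint.h>

#define SUCCESS_MARKER 0xC0DEu  /* value expected at $025000 */
#define TICKS_PER_SEC  60u      /* 60 Hz VBL counter */

/* Join the low and high 16-bit halves into one 32-bit tick count. */
uint32_t join_ticks(uint16_t lo, uint16_t hi)
{
    return ((uint32_t)hi << 16) | lo;
}

/* Iterations per second, or -1.0 when the success marker is missing
   or no ticks elapsed. */
double coremark_score(uint16_t marker, uint32_t ticks, uint32_t iterations)
{
    if (marker != SUCCESS_MARKER || ticks == 0)
        return -1.0;
    return (double)iterations * TICKS_PER_SEC / (double)ticks;
}
```

For example, 10 iterations in 600 ticks (10 seconds) yields a score of 1.0; dividing that by the clock rate in MHz gives CoreMark/MHz.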

**Note:** running CoreMark under MAME inside this project's restricted
shell crashes MAME with the same SIGSEGV as Lua's full interpreter run
(see `feedback_lua_compile_test.md`).  The build still produces a valid
binary; only the run needs an unrestricted shell.  Workaround: copy
`coreMark.bin` out of the sandbox and run it directly with the same
`runInMame.sh` invocation.

## Size vs Calypsi (5 core files, ITERATIONS=1, PERFORMANCE_RUN)

| File | Ours (L2+threshold=75) | Calypsi 5.16 | Ratio |
|------|----------------------:|-------------:|------:|
| core_list_join.o | 10,188 | 9,073 | 1.12× |
| core_main.o      | 11,656 | 19,772 | 0.59× |
| core_matrix.o    | 15,180 | 11,078 | 1.37× |
| core_state.o     | 7,348 | 9,944 | 0.74× |
| core_util.o      | 3,156 | 4,631 | 0.68× |
| **TOTAL**        | **47,528** | **54,498** | **0.87×** |

Our total is 13% smaller than Calypsi's (0.87×), although
`core_list_join.o` and `core_matrix.o` are still 12% and 37% larger.
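The totals and ratios in the table can be re-derived with a few lines of C.  This is only a sketch for checking the arithmetic: the struct and function names are made up for illustration, and the ratio is expressed as an integer percentage (ours / Calypsi × 100, rounded to nearest).

```c
#include <stddef.h>

struct obj_size { const char *name; unsigned long ours, calypsi; };

/* Sizes copied from the table above. */
static const struct obj_size sizes[] = {
    { "core_list_join.o", 10188UL,  9073UL },
    { "core_main.o",      11656UL, 19772UL },
    { "core_matrix.o",    15180UL, 11078UL },
    { "core_state.o",      7348UL,  9944UL },
    { "core_util.o",       3156UL,  4631UL },
};

#define N_SIZES (sizeof sizes / sizeof sizes[0])

unsigned long total_ours(void)
{
    unsigned long sum = 0;
    for (size_t i = 0; i < N_SIZES; i++)
        sum += sizes[i].ours;
    return sum;
}

unsigned long total_calypsi(void)
{
    unsigned long sum = 0;
    for (size_t i = 0; i < N_SIZES; i++)
        sum += sizes[i].calypsi;
    return sum;
}

/* ours / calypsi as a percentage, rounded to the nearest integer. */
unsigned long ratio_percent(unsigned long ours, unsigned long calypsi)
{
    return (ours * 100UL + calypsi / 2UL) / calypsi;
}
```

A result below 100 means our object is smaller; the overall figure comes out at 87.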

## Notes on the porting layer

- `ee_u32` is `unsigned long` (not `unsigned int` — on W65816 `int` is
  16-bit; `long` is 32-bit).  CoreMark depends on 32-bit `ee_u32` for
  CRC and timing math.
- `MEM_METHOD = MEM_STATIC` — a single 2 KB static array in BSS.
  Avoids dynamic alloc and the resulting heap-management overhead.
- `start_time` / `stop_time` use `clock()` which returns the 60 Hz VBL
  counter.  `EE_TICKS_PER_SEC = 60`.
- `HAS_FLOAT = 1` — CoreMark uses double precision for the score
  calculation; our soft-double handles it.
- `MULTITHREAD = 1` — single-context.  The IIgs doesn't have threads.
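
The type and timing choices above can be sketched roughly as follows.  This is not the project's actual `core_portme.c`; it only reuses the standard CoreMark porting-layer names (`CORE_TICKS`, `secs_ret`, `start_time`, `stop_time`, `get_time`, `time_in_secs`) and assumes, as described above, that `clock()` on the target returns the 60 Hz VBL counter.

```c
#include <time.h>

/* 32-bit on the W65816, where int is 16-bit and long is 32-bit. */
typedef unsigned long ee_u32;
typedef ee_u32 CORE_TICKS;
typedef double secs_ret;        /* HAS_FLOAT = 1 */

#define EE_TICKS_PER_SEC 60     /* 60 Hz VBL counter */

/* Fail the build if ee_u32 cannot hold 32 bits. */
_Static_assert(sizeof(ee_u32) >= 4, "ee_u32 must hold at least 32 bits");

static CORE_TICKS start_ticks, stop_ticks;

/* On the target, clock() is assumed to return VBL ticks; on a host
   libc it counts CLOCKS_PER_SEC units instead. */
void start_time(void) { start_ticks = (CORE_TICKS)clock(); }
void stop_time(void)  { stop_ticks  = (CORE_TICKS)clock(); }

/* Elapsed ticks between the two samples. */
CORE_TICKS get_time(void) { return stop_ticks - start_ticks; }

/* Convert a tick count to seconds using the 60 Hz tick rate. */
secs_ret time_in_secs(CORE_TICKS ticks)
{
    return (secs_ret)ticks / (secs_ret)EE_TICKS_PER_SEC;
}
```

With a 16-bit `unsigned int` in place of `unsigned long`, the tick counter would wrap after 65,536 ticks (about 18 minutes at 60 Hz) and the CRC and timing math would truncate, which is why the wider type matters here.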

## Comparing builds

Lua and CoreMark together cover roughly disjoint code patterns:

| Pattern | Lua | CoreMark |
|---|---|---|
| VM dispatch | yes (`luaV_execute` 30+ case switch) | no |
| Recursive descent parsing | yes (`lparser.c`) | no |
| String + hash table | yes | no |
| Linked-list traversal + sort | (small) | yes |
| Matrix init + multiply | no | yes |
| State machine | (JSON tokenizer in smoke) | yes (formal CoreMark state) |
| CRC | yes (in smoke) | yes |
| Recursion-heavy | yes | no |

So they complement each other for backend coverage.  Both now compile
to a size below or close to Calypsi's with the standard Layer 2 +
threshold=75 config.