// Benchmark cycle regression sweep, 2026-06-03
//
// Methodology
//
// - scripts/benchCyclesPrecise.sh harness (default Layer 1, no
//   W65816_CC_EXTRA), measured via emu.time() inside MAME.
// - Three back-to-back runs; numbers were byte-identical across
//   runs (emu.time() is deterministic when MAME is driven from the
//   same Lua boot script). No MAME flakiness involved.
// - Compared against the most recent recorded baseline in each
//   bench's MEMORY.md entry (see "Baseline source" column).
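//
// The run-to-run identity check described above can be sketched as
// follows. This is a minimal Python sketch, not the harness's own
// code: the function name and the one-dict-per-run input shape are
// hypothetical, and the sample cycle counts are "Current" values
// reported in this sweep.

```python
def differing_benches(runs):
    """Return the set of bench names whose cycle counts differ across runs.

    `runs` is a list of dicts mapping bench name -> cycle count
    (a hypothetical parsed form of the harness output).
    """
    first = runs[0]
    differing = set()
    for run in runs[1:]:
        # A bench missing from one run counts as a difference too.
        for name in first.keys() | run.keys():
            if first.get(name) != run.get(name):
                differing.add(name)
    return differing


# Three back-to-back runs with identical numbers: nothing differs.
one_run = {"bsearch": 767, "strLen": 2643, "memcmp": 887}
runs = [dict(one_run), dict(one_run), dict(one_run)]
print(differing_benches(runs))
```

// An empty result corresponds to the "byte-identical across runs"
// observation; any name in the set would point at a flaky bench.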
//
// Suspected cause of regressions: commit 09f7405 (2026-06-03,
// "Updates") removed three major peephole/pass bodies:
//
// - W65816UnLSR.cpp lost processReturnedCounter (-241 lines).
//   This was the strLen-style counter-PHI-to-pointer-PHI undo that
//   enabled the downstream Y-as-counter peephole in StackRelToImg.
//   Without it, strLen / strcpy / memcmp loops emit the
//   pre-2026-05-25 22 cyc/iter form instead of the 13 cyc/iter
//   form.
// - W65816SepRepCleanup.cpp lost the store-forwarding pass body
//   (-370 lines, including 358 comment+code lines). This was the
//   PHI-copy memory-to-memory eliminator that fed djb2Hash and
//   popcount.
// - W65816WidenAcc16.cpp lost the Phase-2 PHI cycle widening
//   scaffolding (-214 lines). Its effect on the benches is less
//   direct, but it correlates with the djb2Hash, popcount, and
//   memcmp regressions.
//
// The commit message says only "Updates"; the diff is a wholesale
// removal of "disabled" / "experimental" #if-0'd code blocks. Some
// of those blocks were actually wired in:
// UnLSR.processReturnedCounter was not gated behind any disable.
// Per memory, the call site at line ~107 was
// `Changed |= processReturnedCounter(L);`, and the "disabled"
// comment now shows the call removed.
//
//
// Results
//
// benchCyclesPrecise.sh on commit HEAD (09f7405), default Layer 1
// (no -mllvm -w65816-dbr-safe-ptrs), all benches identical across
// the three runs.
//
// | Bench         | Baseline | Current |  Delta % | Regression?   | Baseline source                               |
// |---------------|---------:|--------:|---------:|:--------------|-----------------------------------------------|
// | bsearch       |      767 |     767 |    +0.0% | NO            | feedback_remaining_optimization_opportunities |
// | bubbleSort    |    15004 |   15004 |    +0.0% | NO            | feedback_layer2_loop_miscompile (L1 baseline) |
// | crc32         |      n/a |   55839 |      n/a | NO BASELINE   | first measurement                             |
// | djb2Hash      |     2387 |    2728 |   +14.3% | YES           | feedback_mul_const_strength_reduce 2026-05-25 |
// | dotProduct    |     1620 |    1620 |    +0.0% | NO            | feedback_dpf0_setup_collapse 2026-05-15       |
// | fib           |    11594 |   11764 |    +1.5% | marginal      | feedback_stackrel_dead_store_fib 2026-05-27   |
// | memcmp        |      716 |     887 |   +23.9% | YES           | feedback_dp_dead_store_elim 2026-05-25        |
// | popcount      |     1194 |    1228 |    +2.8% | YES (mild)    | feedback_popcount_carry_trick 2026-05-26      |
// | strcpy        |     1108 |    1705 |   +53.9% | YES           | feedback_stackrel_dead_store_elim 2026-05-27  |
// | strLen        |      767 |    2643 |  +244.6% | YES (severe)  | feedback_y_as_counter_strlen 2026-05-27       |
// | sumOfSquares  |    n/cmp |    6820 |      n/a | NO (improved) | harness change since 18755 number             |
// | globalArr8Sum |      n/a |    3922 |      n/a | NO BASELINE   | first measurement                             |
// | globalArrFill |      n/a |    8184 |      n/a | NO BASELINE   | first measurement                             |
// | globalArrSum  |      n/a |    8525 |      n/a | NO BASELINE   | first measurement                             |
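//
// The "Delta %" column is the change in cycle count relative to the
// baseline. A minimal Python sketch of that arithmetic, using the
// (baseline, current) pairs from the table; the helper name is
// hypothetical and this is not the harness's own reporting code.

```python
def delta_percent(baseline, current):
    """Percent change of `current` relative to `baseline`, one decimal."""
    return round((current - baseline) / baseline * 100.0, 1)


# (baseline, current) cycle counts copied from the results table.
results = {
    "bsearch": (767, 767),
    "djb2Hash": (2387, 2728),
    "fib": (11594, 11764),
    "memcmp": (716, 887),
    "popcount": (1194, 1228),
    "strcpy": (1108, 1705),
    "strLen": (767, 2643),
}

for bench, (baseline, current) in results.items():
    print(f"{bench:10s} {delta_percent(baseline, current):+.1f}%")
```

// Benches without a recorded baseline (crc32, the globalArr* set)
// are left out because there is nothing to divide by.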
//
//
// Notes per regression
//
// strLen    +244.6%  The 767-cyc baseline came from the Y-as-counter
//                    peephole in W65816StackRelToImg, whose INPUT
//                    pattern is produced by W65816UnLSR's
//                    processReturnedCounter (the strLen-style undo).
//                    With that undo removed, StackRelToImg sees the
//                    LSR-widened counter-PHI form and bails to
//                    generic codegen. The peephole code is still
//                    present in StackRelToImg.cpp (lines 2941, 3106),
//                    but it never matches.
//
// strcpy    +53.9%   Same root cause: UnLSR's processReturnedCounter
//                    also fed the strcpy-style pointer-walk shapes.
//                    The "stack-rel dead-store elim" peephole in
//                    StackRelToImg (which produced the 1108-cyc
//                    baseline) sits downstream of the pattern
//                    collapse that UnLSR used to perform, so it no
//                    longer sees the shape it matches.
//
// memcmp    +23.9%   Two-pointer deref loop; same family of patterns.
//                    The Pass-2c DPF0-setup-collapse in
//                    W65816StackSlotCleanup (which produced 818 cyc
//                    and was later tightened to 716 via dead-store
//                    elim) is still present, but the structural
//                    shape it expects from upstream is no longer
//                    being produced.
//
// djb2Hash  +14.3%   Hash loop with i32 accumulator. The
//                    store-forwarding pass removed from
//                    SepRepCleanup was the eliminator for the PHI
//                    memory copy at the end of the loop body (the
//                    2387-cyc baseline required it).
//
// popcount  +2.8%    Slight regression; the carry-trick peephole
//                    is still present (StackRelToImg.cpp line 2541),
//                    but the lagged-PHI store-forwarding step it
//                    relied on is gone, costing 3 cyc/iter * 16 iters
//                    plus a few cleanup cycles at exit.
//
// fib       +1.5%    Marginal. Stack-rel dead-store-elim is still
//                    present per StackRelToImg.cpp; the small
//                    regression may be CMake / regalloc noise from
//                    the unrelated WidenAcc16 changes.
//
//
// Verdict: REGRESSIONS FOUND.
//
// Five clear regressions (strLen, strcpy, memcmp, djb2Hash, popcount)
// and one marginal one (fib), attributable to commit 09f7405
// (2026-06-03, "Updates"), which removed perf-critical pass bodies
// from W65816UnLSR.cpp, W65816SepRepCleanup.cpp, and
// W65816WidenAcc16.cpp.
//
// Fix path (not this agent): restore the deleted blocks (especially
// W65816UnLSR::processReturnedCounter and its registration in
// runOnFunction), then re-run this sweep to confirm strLen 2643 →
// 767, strcpy 1705 → 1108, memcmp 887 → 716, djb2Hash 2728 → 2387.
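//
// The confirmation step above could be automated along these lines.
// This is a minimal Python sketch under stated assumptions: the
// function name and the dict input are hypothetical, the targets are
// the four baselines named in the fix path, and exact equality is
// used because emu.time() was observed to be deterministic.

```python
# Expected cycle counts after the deleted blocks are restored,
# copied from the fix-path paragraph above.
TARGETS = {"strLen": 767, "strcpy": 1108, "memcmp": 716, "djb2Hash": 2387}


def unmet_targets(measured, targets=TARGETS):
    """Return {bench: (measured, expected)} for benches not back at target."""
    unmet = {}
    for bench, expected in targets.items():
        got = measured.get(bench)  # None if the bench was not measured
        if got != expected:
            unmet[bench] = (got, expected)
    return unmet


# With the current (regressed) numbers, every target is still unmet.
current = {"strLen": 2643, "strcpy": 1705, "memcmp": 887, "djb2Hash": 2728}
print(unmet_targets(current))
```

// An empty dict after the re-run would confirm that all four benches
// are back at their recorded baselines.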
//
// Files unchanged by this agent: src/llvm/lib/Target/W65816/*.
// New file created by this agent: tests/benchSummary_2026_06_03.md
// (this file).