# calog -- Polyglot Script Broker: Design & Implementation Plan
A C "broker" that lets one application be written in a mix of scripting languages
(Lua and my-basic first; Squirrel and others later). Native C functions are added
once and become callable from every language. Functions and data exported from one
module are callable from modules written in another language. Threading is actor-based;
networking rides the same dispatcher. Data sharing is by-value for v1.
This document is the reconciled output of a design pass plus an adversarial verification
pass. Where the verification corrected the first-cut design, the correction is folded in
and noted as "[verified]".
---
## 1. Architecture: hub and spoke
Nothing talks to anything else directly. Every engine talks only to the broker, through
two shared contracts:
1. One universal value type, `ValueT` (a tagged union).
2. One uniform native-function signature:
```c
typedef int32_t (*NativeFnT)(ValueT *args, int32_t argCount, ValueT *result, void *userData);
```
A developer writes a native function once against that signature and registers it once.
A *script* function exported from a module is itself stored as a `NativeFnT` whose body
re-enters its owning interpreter -- so "call C from script", "call script from C", and
"call module A's function from module B" are all the same code path. Adding an engine is
O(1) adapter work, not O(N) per existing engine.
The single most important lesson from the verification pass: **there must be exactly one
`ValueT` / `AggregateT` / `ValueTypeE`, defined in `broker.h`, included verbatim by every
adapter.** The first-cut design had three divergent copies and would not have linked, let
alone round-tripped data. Section 2 is therefore the load-bearing part of this plan.
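
A minimal sketch of "write once, register once", assuming a reduced `ValueT` (nil and int only) so the block stands alone. `nativeRegister`, `nativeCall`, `NativeEntryT`, `MAX_NATIVES`, `addInts`, and the `0`/`-1` status values are hypothetical names for illustration, not the broker's actual API. The point it shows is that a lookup by name plus one uniform signature is the whole call path, whichever language the caller is written in.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

// Reduced stand-in for the canonical ValueT (nil and int only).
typedef enum ValueTypeE { valueNilE = 0, valueIntE = 2 } ValueTypeE;
typedef struct ValueT {
    ValueTypeE type;
    union { int64_t i; } as;
} ValueT;

typedef int32_t (*NativeFnT)(ValueT *args, int32_t argCount, ValueT *result, void *userData);

// Hypothetical registry: a fixed table keyed by name.
typedef struct NativeEntryT { const char *name; NativeFnT fn; void *userData; } NativeEntryT;
#define MAX_NATIVES 16
static NativeEntryT natives[MAX_NATIVES];
static int32_t nativeCount = 0;

static int32_t nativeRegister(const char *name, NativeFnT fn, void *userData) {
    assert(name != NULL && fn != NULL);
    if (nativeCount >= MAX_NATIVES) return -1;
    natives[nativeCount].name = name;
    natives[nativeCount].fn = fn;
    natives[nativeCount].userData = userData;
    nativeCount++;
    return 0;
}

// The single call path: every engine adapter would land here.
static int32_t nativeCall(const char *name, ValueT *args, int32_t argCount, ValueT *result) {
    for (int32_t n = 0; n < nativeCount; n++) {
        if (strcmp(natives[n].name, name) == 0) {
            return natives[n].fn(args, argCount, result, natives[n].userData);
        }
    }
    return -1; // unknown name
}

// A native written once against the uniform signature.
static int32_t addInts(ValueT *args, int32_t argCount, ValueT *result, void *userData) {
    (void)userData;
    if (argCount != 2 || args[0].type != valueIntE || args[1].type != valueIntE) return -1;
    result->type = valueIntE;
    result->as.i = args[0].as.i + args[1].as.i;
    return 0;
}
```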
---
## 2. The canonical type system (`broker.h`) -- single source of truth
### 2.1 Value tags and the value struct
```c
typedef enum ValueTypeE {
valueNilE = 0,
valueBoolE = 1,
valueIntE = 2, // int64_t
valueRealE = 3, // double
valueStringE = 4, // length-prefixed, binary-safe
valueAggregateE = 5, // hybrid array + map container
valueFnE = 6 // function value: refcounted handle to a CallableT
} ValueTypeE;
typedef struct StringT {
char *bytes; // owned; always NUL-terminated at bytes[length] for C consumers
int64_t length; // byte count excluding the convenience terminator (binary-safe)
} StringT;
typedef struct ValueT {
ValueTypeE type;
union {
bool b;
int64_t i;
double r;
StringT s;
struct AggregateT *agg; // heap-owned subtree
struct CallableT *callable; // refcounted, broker-owned
} as;
} ValueT;
```
### 2.2 The aggregate (one shape both engines map onto)
```c
typedef struct PairT {
ValueT key; // marshal layer constrains to int/real/string keys (see 2.5)
ValueT value;
} PairT;
typedef enum AggregateKindE {
aggregateListE = 0, // empty container round-trips as a list by default
aggregateMapE = 1,
aggregateBothE = 2 // array part AND pairs part populated
} AggregateKindE;
typedef struct AggregateT {
AggregateKindE kind; // disambiguates empty/mixed containers across engines
ValueT *array; // dense elements [0, arrayCount)
int64_t arrayCount;
int64_t arrayCap;
PairT *pairs; // map part; preserves insertion order
int64_t pairCount;
int64_t pairCap;
} AggregateT;
```
A Lua table's sequence part maps to `array`, its remaining keys to `pairs`. A my-basic
`LIST` maps to `array`, a `DICT` maps to `pairs`. The explicit `kind` flag fixes two
problems the verifier flagged: an empty `{}` had no defined type on the far side, and a
mixed array+hash Lua table had no representation at all.
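
One way the `kind` flag could be kept consistent is sketched below, assuming a reduced int-only `ValueT`. `aggregateAppend`, `aggregatePut`, and `aggregateReclassify` are hypothetical helper names; the sketch does not check for duplicate keys. An empty container keeps whatever `kind` its producer set, and a zero-initialized aggregate is a list, matching the "empty round-trips as a list by default" rule.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct ValueT { int64_t i; } ValueT;            // reduced stand-in
typedef struct PairT { ValueT key; ValueT value; } PairT;
typedef enum AggregateKindE {
    aggregateListE = 0, aggregateMapE = 1, aggregateBothE = 2
} AggregateKindE;
typedef struct AggregateT {
    AggregateKindE kind;
    ValueT *array; int64_t arrayCount; int64_t arrayCap;
    PairT *pairs;  int64_t pairCount;  int64_t pairCap;
} AggregateT;

// Derive kind from what is populated; an empty container keeps its kind.
static void aggregateReclassify(AggregateT *agg) {
    if (agg->arrayCount > 0 && agg->pairCount > 0) agg->kind = aggregateBothE;
    else if (agg->pairCount > 0) agg->kind = aggregateMapE;
    else if (agg->arrayCount > 0) agg->kind = aggregateListE;
}

// Append to the dense array part, doubling capacity on demand.
static int32_t aggregateAppend(AggregateT *agg, ValueT v) {
    assert(agg != NULL);
    if (agg->arrayCount == agg->arrayCap) {
        int64_t newCap = agg->arrayCap ? agg->arrayCap * 2 : 4;
        ValueT *grown = realloc(agg->array, (size_t)newCap * sizeof *grown);
        if (!grown) return -1;
        agg->array = grown;
        agg->arrayCap = newCap;
    }
    agg->array[agg->arrayCount++] = v;
    aggregateReclassify(agg);
    return 0;
}

// Append to the map part; insertion order is preserved.
static int32_t aggregatePut(AggregateT *agg, ValueT key, ValueT value) {
    assert(agg != NULL);
    if (agg->pairCount == agg->pairCap) {
        int64_t newCap = agg->pairCap ? agg->pairCap * 2 : 4;
        PairT *grown = realloc(agg->pairs, (size_t)newCap * sizeof *grown);
        if (!grown) return -1;
        agg->pairs = grown;
        agg->pairCap = newCap;
    }
    agg->pairs[agg->pairCount].key = key;
    agg->pairs[agg->pairCount].value = value;
    agg->pairCount++;
    aggregateReclassify(agg);
    return 0;
}
```

A Lua table with both a sequence part and extra keys would end up as `aggregateBothE` here, which is the case the my-basic side later collapses to a DICT.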
### 2.3 Function values: `CallableT`
`valueFnE` is the one deliberate exception to "by-value everything". A function cannot be
meaningfully copied between heaps, so it is shared by *reference* -- but safely, because it
is only ever *invoked*, never inspected, and invocation always routes to the owning
context's thread.
```c
typedef struct CallableT {
NativeFnT fn; // uniform invoke entry
void *userData; // luaL_ref slot, pinned mb_value_t, or C closure ctx
uint32_t ownerCtxId; // context whose thread MUST run it
uint32_t ownerGen; // generation of that context (UAF guard, see sec 9)
int32_t refCount; // ATOMIC; shared-handle lifetime across threads
bool alive; // false once the owning context is torn down
} CallableT;
```
Rules (these resolve the verifier's critical "function value breaks the no-shared-pointer
invariant" finding):
- `valueCopy` of a `valueFnE` does an **atomic refcount increment** on the same
`CallableT` -- it does NOT clone the closure. The shared `CallableT*` across threads is
allowed precisely because `refCount` is atomic and the closure is touched only on its
owner thread.
- `valueFree` of a `valueFnE` does an **atomic decrement**. When it hits zero, the
underlying closure must be released with an interpreter call (`luaL_unref`, my-basic
unref) -- which can only run on the owner's thread. So a zero-drop on a foreign thread
**posts a release message to the owner context** rather than calling the interpreter
directly (sec 10).
- Invoking a dead handle (`alive == false`, owner gone) returns a clean broker error,
never a call into a freed interpreter.
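
The retain/release rules above can be sketched with C11 atomics. This is a reduced illustration, not the as-built code: the `CallableT` here omits `fn`/`userData`, declares `refCount` and `alive` with atomic types, and `ReleaseActionE`, `callableRetain`, and the `currentCtxId` parameter are hypothetical. The release function only decides what must happen; actually posting the message or running the engine unref is left to the caller.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct CallableT {          // reduced: fn and userData omitted
    uint32_t ownerCtxId;
    uint32_t ownerGen;
    atomic_int refCount;
    atomic_bool alive;
} CallableT;

typedef enum ReleaseActionE {
    releaseNoneE,         // other references remain
    releaseInlineE,       // last drop on the owner thread, or owner already gone
    releasePostToOwnerE   // last drop on a foreign thread: post a release message
} ReleaseActionE;

// valueCopy of a function value: share the handle, never clone the closure.
static void callableRetain(CallableT *c) {
    atomic_fetch_add_explicit(&c->refCount, 1, memory_order_relaxed);
}

// valueFree of a function value: decide who may release the closure.
static ReleaseActionE callableRelease(CallableT *c, uint32_t currentCtxId) {
    int prev = atomic_fetch_sub_explicit(&c->refCount, 1, memory_order_acq_rel);
    assert(prev > 0);
    if (prev != 1) return releaseNoneE;
    if (!atomic_load(&c->alive)) return releaseInlineE;   // closure died with its interpreter
    return (currentCtxId == c->ownerCtxId) ? releaseInlineE : releasePostToOwnerE;
}
```

Returning an action instead of acting keeps this piece independent of the actor layer, which is the same separation section 10 relies on.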
### 2.4 Value operation contract (one signature, used everywhere)
The verifier caught two designs declaring `valueCopy` with incompatible signatures. The
canonical form is status-returning and takes `dst` by pointer, so OOM is checkable on the hot path:
```c
int32_t valueCopy(ValueT *dst, const ValueT *src); // deep copy; brokerOkE / brokerErrOomE
void valueFree(ValueT *v); // recursive free; leaves a safe nil
void valueMove(ValueT *dst, ValueT *src); // zero-alloc ownership transfer
```
`valueFree` and `valueCopy` MUST cover every tag, including `valueFnE`, with either an
explicit `case` or a `default:` -- a missing case was how the first cut silently leaked function payloads.
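
A sketch of the three operations for the scalar and string tags only, to show the shape of the contract. `valueAggregateE` and `valueFnE` are deliberately left out of this reduced enum to keep the block short; in the real core each needs its own case. The numeric values of `brokerOkE`/`brokerErrOomE` are assumptions, and `valueCopy` here assumes `dst` does not already own anything.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

enum { brokerOkE = 0, brokerErrOomE = 1 };   // numeric values are assumptions

typedef enum ValueTypeE {
    valueNilE = 0, valueBoolE = 1, valueIntE = 2, valueRealE = 3, valueStringE = 4
} ValueTypeE;
typedef struct StringT { char *bytes; int64_t length; } StringT;
typedef struct ValueT {
    ValueTypeE type;
    union { bool b; int64_t i; double r; StringT s; } as;
} ValueT;

// Free any owned payload and leave a safe nil behind.
void valueFree(ValueT *v) {
    assert(v != NULL);
    switch (v->type) {          // no default: the compiler can flag a missing tag
        case valueStringE: free(v->as.s.bytes); break;
        case valueNilE: case valueBoolE: case valueIntE: case valueRealE: break;
    }
    memset(v, 0, sizeof *v);    // valueNilE == 0
}

// Deep copy; on OOM dst is left as a safe nil.
int32_t valueCopy(ValueT *dst, const ValueT *src) {
    assert(dst != NULL && src != NULL);
    memset(dst, 0, sizeof *dst);
    switch (src->type) {
        case valueNilE:  break;
        case valueBoolE: dst->as.b = src->as.b; break;
        case valueIntE:  dst->as.i = src->as.i; break;
        case valueRealE: dst->as.r = src->as.r; break;
        case valueStringE: {
            size_t len = (size_t)src->as.s.length;
            char *bytes = malloc(len + 1);
            if (!bytes) return brokerErrOomE;
            if (len) memcpy(bytes, src->as.s.bytes, len);   // binary-safe: no strlen
            bytes[len] = '\0';                              // convenience terminator
            dst->as.s.bytes = bytes;
            dst->as.s.length = src->as.s.length;
            break;
        }
    }
    dst->type = src->type;
    return brokerOkE;
}

// Zero-alloc ownership transfer; src becomes nil.
void valueMove(ValueT *dst, ValueT *src) {
    assert(dst != NULL && src != NULL);
    *dst = *src;
    memset(src, 0, sizeof *src);
}
```

Because the string copy uses `length` rather than `strlen`, an embedded NUL survives the copy, which is the binary-safe behavior section 2.1 asks for.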
### 2.5 Cross-engine fidelity table (the honest limits)
By-value marshalling across `Lua <-> broker <-> my-basic` is lossless for the common cases
and lossy at documented edges. These are inherent to my-basic's value model (verified
against `my_basic.h`/`.c`), not marshalling bugs:
| Aspect | Lua | my-basic | Crossing Lua <-> BASIC |
|-----------------------|------------------------|-----------------------------------|-------------------------------------------------|
| integers | 64-bit | `int_t == int` (32-bit, always) | **truncates outside [-2^31, 2^31 - 1]** -- range-check + error |
| reals | double | `float` (double w/ `-DMB_DOUBLE_FLOAT`) | precision loss unless double build |
| int vs real subtype | distinct (5.4) | integral real auto-collapses to int | **subtype not preserved** through BASIC |
| strings | binary-safe (length) | bare `char*` (`strlen`) | **embedded NUL truncates** -- detect + error |
| array / list | table sequence part | `LIST` | OK |
| map / dict | table hash part | `DICT` (int/real/string keys) | OK |
| mixed array+hash table| one table | no single LIST+DICT value | **collapses to DICT** (array part -> int keys), documented |
| empty `{}` | table | LIST or DICT? | `kind` flag; default LIST |
| function value | closure (`luaL_ref`) | lambda/routine (`MB_DT_ROUTINE`) | OK via `CallableT` (by-reference) |
| nested depth | bounded | bounded | one shared cap; defined error past it |
| cycles | rejected on ingress | rejected on ingress | impossible at rest (by-value, no shared refs) |
Policy decisions baked in to make the above deterministic:
- **One shared recursion-depth cap** applied on *every* recursive path -- both ingress and
egress, all components -- failing with a defined status instead of overflowing the C
stack. (The first cut bounded only Lua ingress.)
- **One strict/lenient switch**, owned by the broker and honored by both adapters: in
strict mode an unrepresentable value/key (e.g. a function used as a table key, a 64-bit
int into BASIC, a NUL-bearing string into BASIC) is an error; in lenient mode it is
dropped/coerced with a documented rule. Never a silent surprise either way.
- **Keys**: broker allows int/real/string keys (my-basic dicts accept all three). Float
keys get defined equality; non-representable keys follow the strict/lenient switch.
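
A sketch of how the strict/lenient switch might look at the integer edge. `intToBasic`, `BasicNumberT`, and `stringHasEmbeddedNul` are hypothetical names, and the value of `brokerErrRangeE` is an assumption. The lenient rule used here (promote an out-of-range int to a real) is the option mentioned in section 5.6; a lenient rule for NUL-bearing strings is not specified in this plan, so the sketch only detects them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

enum { brokerOkE = 0, brokerErrRangeE = 2 };   // numeric values are assumptions

// Hypothetical carrier for a number on its way into my-basic.
typedef struct BasicNumberT { bool isReal; int32_t i; double r; } BasicNumberT;

// Narrow a broker int64 to my-basic's 32-bit int_t under the strict/lenient switch.
static int32_t intToBasic(int64_t v, bool strict, BasicNumberT *out) {
    assert(out != NULL);
    if (v >= INT32_MIN && v <= INT32_MAX) {
        out->isReal = false;
        out->i = (int32_t)v;
        out->r = 0.0;
        return brokerOkE;
    }
    if (strict) return brokerErrRangeE;   // strict: never truncate silently
    out->isReal = true;                   // lenient: promote to real
    out->i = 0;
    out->r = (double)v;                   // exact only up to 2^53 in magnitude
    return brokerOkE;
}

// A NUL inside the counted bytes cannot survive a bare char* + strlen consumer.
static bool stringHasEmbeddedNul(const char *bytes, int64_t length) {
    return length > 0 && memchr(bytes, '\0', (size_t)length) != NULL;
}
```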
---
## 3. The engine vtable -- what makes the actor loop engine-agnostic
Each adapter implements one small vtable so the broker/actor core never special-cases an
engine:
```c
typedef struct EngineT {
const char *name;
void *(*createInterpreter)(struct ScriptContextT *ctx); // ON the owning thread
void (*destroyInterpreter)(void *interp);
int32_t (*loadSource)(void *interp, const char *src, int64_t len);
int32_t (*registerNative)(void *interp, const char *name, NativeFnT fn, void *userData);
int32_t (*callExport)(void *interp, void *exportRef, ValueT *args, int32_t argCount, ValueT *result);
void (*releaseExport)(void *interp, void *exportRef); // ON the owning thread
} EngineT;
```
`createInterpreter`, `destroyInterpreter`, `releaseExport`, and every `callExport` run on
the context's own thread -- that is the invariant that keeps each interpreter
single-threaded.
---
## 4. The Lua adapter
Target Lua 5.4 (note 5.1/5.3 deltas where they matter). The C API surface was verified
accurate; the fixes below are about lifecycle, not API names.
### 4.1 Native registration and the trampoline
Each registered `NativeFnT` becomes a Lua C closure: push the `NativeFnT` and its
`userData` as upvalues with `lua_pushcclosure`, then `lua_setglobal` (or into a module
table). The single trampoline recovers them from upvalues, marshals the Lua stack into a
`ValueT[]`, calls the `NativeFnT`, and pushes the result back.
- `lua_setglobal` returns `void` (the first cut documented `int` -- harmless but wrong).
- Lua allocation APIs (`lua_newuserdatauv`, `lua_createtable`, ...) **longjmp on OOM and
never return NULL** -- so NULL-checks after them are dead code; the OOM path is a Lua
error, not a C return. Only `luaL_newstate` can return NULL and must be checked.
### 4.2 Marshalling `ValueT <-> Lua` (by value)
Scalars are direct. Strings use `lua_tolstring` + length (binary-safe, preserves NULs).
Tables deep-copy in both directions:
- Ingress (table -> `AggregateT`): normalize the table index to absolute before `lua_next`;
route numeric keys through `lua_tointeger`/`lua_tonumber` (do **not** let `lua_tolstring`
mutate a numeric key in place); fill `array` for the sequence part and `pairs` for the
rest; detect cycles via an ancestor-pointer stack (`lua_topointer`); use raw access
(`lua_rawget` here, `lua_rawset` on the egress side) to avoid metamethods; enforce the shared depth cap.
- Egress (`AggregateT` -> table): build with `lua_createtable`, populate `array` as the
sequence and `pairs` as keyed entries, **with the same depth cap** (the first cut had no
egress cap -- a deep BASIC-origin structure could overflow the stack on the way back).
Make the builder self-balancing: record `lua_gettop` on entry and `lua_settop` back on
any error return.
### 4.3 Exporting a Lua function (and the leak fix)
A Lua function crossing the boundary becomes a `valueFnE`: pin the closure with
`luaL_ref(L, LUA_REGISTRYINDEX)`, wrap it in a `CallableT`. The `fn` body does
`lua_rawgeti` to retrieve, marshals args onto the stack, `lua_pcall`, marshals the return;
on error it pulls the message with `luaL_tolstring` and reports it as a broker error
(sec 8).
Two bugs the verifier found, fixed here:
- **The exports array needs grow-on-demand.** The context is `calloc`'d, so the array
starts NULL/0; the first export must `realloc` (double the cap, handle the NULL/0 seed)
before storing, and `luaL_unref` the just-created ref if the realloc fails.
- **Transient vs persistent ownership.** Every Lua function passed as a *native argument*
was creating a `luaL_ref` that only got released at `lua_close` -- an unbounded leak for
any long-lived context that takes callbacks. Fix: the `CallableT` refcount owns the
`luaL_ref`. When the last `valueFree` drops the handle to zero, the ref is released (on
the Lua thread per sec 10). A function the broker retains as a real export holds a
reference for as long as it is registered; a function merely borrowed for the duration
of one call is released when that call's `ValueT` args are freed. Same mechanism, two
lifetimes.
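
The grow-on-demand fix can be sketched without any Lua calls. `ExportT`, `ExportTableT`, and `exportTableAdd` are hypothetical names, `ref` stands in for the `luaL_ref` slot, and the initial capacity of 8 is an arbitrary choice. The shape to note is that `realloc(NULL, n)` behaves like `malloc`, so the `calloc`'d NULL/0 seed needs no special first-use branch beyond picking a starting capacity.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct ExportT { int ref; } ExportT;   // ref stands in for a luaL_ref slot
typedef struct ExportTableT { ExportT *items; int32_t count; int32_t cap; } ExportTableT;

// Returns the new index, or -1 on OOM with the table unchanged.
// On -1 the caller must release the ref it just created.
static int32_t exportTableAdd(ExportTableT *t, ExportT e) {
    assert(t != NULL);
    if (t->count == t->cap) {
        int32_t newCap = t->cap ? t->cap * 2 : 8;   // handles the NULL/0 seed
        ExportT *grown = realloc(t->items, (size_t)newCap * sizeof *grown);
        if (!grown) return -1;
        t->items = grown;
        t->cap = newCap;
    }
    t->items[t->count] = e;
    return t->count++;
}
```

In the adapter, the `-1` branch is where the just-created ref would be handed back with `luaL_unref` before reporting the failure.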
### 4.4 Calling a function value from Lua
A `valueFnE` marshalled *into* Lua becomes a Lua C closure over the `CallableT*`, so script
authors just write `cb(x, y)` and it transparently dispatches (to the owner's thread if the
function lives elsewhere). A universal `call(fn, ...)` native is also provided for
uniformity across engines.
### 4.5 Context lifecycle
`luaL_newstate` + `luaL_openlibs` on the owning thread; confine the `lua_State` to that
thread forever; teardown `luaL_unref`s outstanding refs then `lua_close`.
---
## 5. The my-basic adapter
Verified against `paladin-t/my_basic` (`my_basic.h` + `.c` read directly). **Zero
hallucinated calls.** The interesting work is three my-basic-specific quirks that shape the
adapter; all three are forced by the source, not stylistic.
### 5.1 Lifecycle and the inverted register result
`mb_init` (once per process) / `mb_open(&bas)` / `mb_load_string(bas, src, true)` /
`mb_run` / `mb_close` / `mb_dispose`. The broker pointer is threaded through the interp's
single userdata slot via `mb_set_userdata` / `mb_get_userdata`.
- **`mb_register_func` returns a count, not a status** -- nonzero means registered, `0`
means duplicate/failure. That is the opposite of the `MB_FUNC_OK == 0` convention, so the
success test must be inverted. Names are uppercased internally (`mb_strupr`), so the
broker key is the uppercased identifier (BASIC is case-insensitive).
### 5.2 The native-function protocol and the trampoline bank
The native signature is `typedef int (*mb_func_t)(struct mb_interpreter_t*, void**);` --
**no per-callback userData parameter**, and the interpreter has only **one** userdata slot.
So a single shared C trampoline cannot tell *which* broker function it is serving.
Fix (the verifier confirmed this limitation is real): a **macro-generated bank of
trampolines** `mbTramp0 .. mbTrampN`, each hardcoding its slot index, each looking up
`ctx->nativeBank[slot]` (the `NativeFnT` + `userData`) via the interpreter's userdata
pointer. The bank size caps how many natives one my-basic context can host; size it
generously and document it.
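
The bank idea, sketched without the my-basic headers so it compiles on its own. `FakeInterpT` stands in for `struct mb_interpreter_t` and its single userdata slot, `SlotFnT` is a reduced native signature, and `ContextT`, `BankEntryT`, `BANK_SIZE`, `trampDispatch`, `MB_TRAMP`, `trampBank`, and `returnTag` are all hypothetical. The real trampolines would also run the argument frame protocol; here each one only proves it can find its own binding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Stand-in for the interpreter: only the single userdata slot matters here.
typedef struct FakeInterpT { void *userdata; } FakeInterpT;
typedef int (*TrampFnT)(FakeInterpT *s, void **l);   // same shape as mb_func_t

typedef int32_t (*SlotFnT)(void *userData);          // reduced native signature
typedef struct BankEntryT { SlotFnT fn; void *userData; } BankEntryT;

#define BANK_SIZE 4
typedef struct ContextT {
    BankEntryT nativeBank[BANK_SIZE];
    int32_t lastResult;
} ContextT;

// Shared body: the slot index is the only thing a trampoline adds.
static int trampDispatch(FakeInterpT *s, void **l, int slot) {
    (void)l;
    assert(slot >= 0 && slot < BANK_SIZE);
    ContextT *ctx = s->userdata;
    BankEntryT *entry = &ctx->nativeBank[slot];
    if (entry->fn == NULL) return 1;                  // nothing bound to this slot
    ctx->lastResult = entry->fn(entry->userData);
    return 0;                                         // 0 mirrors MB_FUNC_OK
}

// One distinct C function per slot, each hardcoding its index.
#define MB_TRAMP(n) \
    static int mbTramp##n(FakeInterpT *s, void **l) { return trampDispatch(s, l, n); }
MB_TRAMP(0)
MB_TRAMP(1)
MB_TRAMP(2)
MB_TRAMP(3)

static const TrampFnT trampBank[BANK_SIZE] = { mbTramp0, mbTramp1, mbTramp2, mbTramp3 };

// Sample native: returns the int its userData points at.
static int32_t returnTag(void *userData) {
    return *(int32_t *)userData;
}
```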
Inside a trampoline the argument protocol is the real my-basic frame dance:
`mb_attempt_open_bracket` / loop `mb_pop_value` (honoring `mb_has_arg`) /
`mb_attempt_close_bracket` / compute / `mb_push_value` (or the typed `mb_push_*`).
### 5.3 String ownership (memdup is mandatory)
`mb_pop_string` hands back a **borrowed interior pointer** -- the broker must `strdup`/copy
it immediately. Pushed strings are taken over by the interpreter and later freed with *its*
allocator, so any string handed to `mb_push_value`/`mb_make_string` **must come from
`mb_memdup`** (not plain `malloc`). Embedded NULs cannot survive (bare `char*` + `strlen`)
-- enforce the strict/lenient policy on egress.
### 5.4 Aggregates: the collection API
There is no `mb_make_coll`. A list/dict is built by presetting `coll->type =
MB_DT_LIST`/`MB_DT_DICT` then calling `mb_init_coll`, and accessed with
`mb_get_coll` / `mb_set_coll` / `mb_remove_coll` / `mb_count_coll` / `mb_keys_of_coll`.
Collection support is on by default (`MB_ENABLE_COLLECTION_LIB`). A broker aggregate with
both `array` and `pairs` populated collapses to a DICT (array part becomes integer keys)
per the fidelity table.
### 5.5 Exporting a BASIC routine -- the parked `__BROKERSERVE` frame
This is the my-basic-specific crux. To call a BASIC routine/lambda from C you use
`mb_get_routine(s, l, name, &val)` then `mb_eval_routine(s, l, val, args, argc, &ret)` --
and **`mb_eval_routine` dereferences `*l` and hard-requires a live, non-NULL `void** l`**
(verified at `my_basic.c:14344/14358`). A valid `l` only exists *inside* a running native
call. Therefore a my-basic context cannot be driven from arbitrary C; it must be **parked
inside a native frame**.
Design: register a native `__BROKERSERVE` whose C body *is* the context's message-pump /
serve loop. A module hands control to the broker by ending with a `SERVE` call (the adapter
appends one if absent). While parked there, the loop holds a valid `l`, which it uses to
`mb_eval_routine` whenever another context calls one of this module's exported routines.
`mb_get_routine` returns `MB_FUNC_OK` with a *nil* value when a name is absent, so the
not-found test is `routine.type != MB_DT_ROUTINE`, not the status code.
### 5.6 Numeric and identity caveats
`int_t` is 32-bit unconditionally (64-bit broker ints truncate -- range-check + error or
promote to real with documented precision loss); integral reals auto-collapse to int so
real/int subtype is not preserved across a BASIC hop. Both are in the fidelity table; both
follow the strict/lenient switch.
---
## 6. Threading: the actor model
Each `ScriptContextT` owns one interpreter, one OS thread (pthreads -- chosen over C11
`<threads.h>` for portability/maturity), and one inbound MPSC message queue. Interpreters
are single-threaded; only the owning thread ever enters `callExport`. A cross-context call
is a message; the caller blocks for the reply on a per-call condvar future (lost-wakeup
safe via a predicate loop). The verifier confirmed the core is sound: no path lets two
threads touch one interpreter, the deep-copy ownership ledger is correct on the success
path, and the epoll thread enqueuing while a context is mid-dispatch is race-free.
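
The per-call reply future with its predicate loop might look like the sketch below. `ReplyFutureT` and the `future*` functions are hypothetical names, the result is reduced to an `int64_t` instead of a `ValueT`, and `replierThread` is a stand-in for the callee context. It shows only the plain blocking wait; the nested pump of section 6.1 would replace the body of the wait loop.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct ReplyFutureT {
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    bool done;          // the predicate: set exactly once, under the mutex
    int32_t status;
    int64_t result;     // reduced: a ValueT in the real design
} ReplyFutureT;

static void futureInit(ReplyFutureT *f) {
    assert(f != NULL);
    pthread_mutex_init(&f->mutex, NULL);
    pthread_cond_init(&f->cond, NULL);
    f->done = false;
    f->status = 0;
    f->result = 0;
}

static void futureDestroy(ReplyFutureT *f) {
    pthread_cond_destroy(&f->cond);
    pthread_mutex_destroy(&f->mutex);
}

// Callee side: publish the reply, then wake the waiter.
static void futureComplete(ReplyFutureT *f, int32_t status, int64_t result) {
    pthread_mutex_lock(&f->mutex);
    f->status = status;
    f->result = result;
    f->done = true;
    pthread_cond_signal(&f->cond);
    pthread_mutex_unlock(&f->mutex);
}

// Caller side: the while loop makes the wait safe against spurious wakeups
// and against a reply that arrived before the wait began.
static int32_t futureWait(ReplyFutureT *f, int64_t *result) {
    pthread_mutex_lock(&f->mutex);
    while (!f->done) {
        pthread_cond_wait(&f->cond, &f->mutex);
    }
    int32_t status = f->status;
    *result = f->result;
    pthread_mutex_unlock(&f->mutex);
    return status;
}

// Stand-in for the callee context's thread.
static void *replierThread(void *arg) {
    futureComplete(arg, 0, 42);
    return NULL;
}
```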
The fixes folded in from verification:
- **One error channel.** The first cut carried a separate error `ValueT` in the reply that
the caller never freed -- a leak on every errored call, lost error text, and a second
source of truth contradicting the broker's "error travels in `result`" contract. Fix:
on failure the adapter writes the error string into `result` (as `brokerSetError` does);
the reply carries only `{status, result}`. One channel, one owner, freed once.
- **`valueCopy` checked on enqueue.** Use the canonical `int32_t valueCopy(dst, src)`,
check each arg, and unwind partially-copied args on OOM (mirroring the broker route
path). The actor layer should call the broker's marshalling, not reimplement a copy loop.
- **Shutdown drains everything.** On `SHUTDOWN`, error-reply every queued `CALL` *and*
free every queued/stashed `REPLY` (its `result`, which also carries any error text) before
join -- the first cut leaked in-flight replies unwound by a nested shutdown.
- **Explicit thread stack size.** The reentrancy depth bound counts dispatch nesting, not
C-stack bytes; set a validated stack size with `pthread_attr_setstacksize` (or lower the
bound) so the "clean catchable depth error" promise actually holds instead of a UB
overflow.
- **Split the ready-handshake condvar** off the queue condvar so `queueCond` has exactly
one semantic (latent lost-wakeup footgun if a second waiter is ever added).
### 6.1 What a context does while blocked: always-live nested pump [DECIDED]
When context A makes a synchronous cross-context call and waits for the reply, A's thread
**pumps its own inbound queue** instead of sleeping idle. An incoming call to A -- including
a re-entrant B->A issued during the very call A is waiting on -- is serviced on A's own
thread, then A resumes waiting. This was chosen over strict run-to-completion because it
never deadlocks and needs no wait-for-graph deadlock detector; the rejected alternative
would have had to raise a "synchronous call cycle" error on A->B->A and would leave a busy
context unresponsive to other callers. The verifier validated the pump as sound: only A's
thread ever enters A's interpreter (the single-threaded invariant holds), reply nesting is
strict LIFO, and depth is bounded.
The contract this commits the runtime and script authors to:
- **Re-entry happens only at explicit cross-context call points** (`x = getUserInfo()`,
`data = sockRecv(c)`), never mid-statement -- a call point is a yield point.
- **Module-global state may differ after a cross-context call returns**, because another
call may have run on this context while it was outstanding (the same contract as any RPC).
Local variables are unaffected.
- **Reentrancy is depth-bounded** with a catchable error (backed by an explicit pthread
stack size, sec 6 fixes), so runaway ping-pong fails cleanly instead of overflowing.
Script code stays plain synchronous-blocking regardless -- `info = getUserInfo()` just
works; this only governs what the runtime does while a call is outstanding.
---
## 7. Networking and the dispatcher
One dedicated I/O thread runs `epoll` (Linux; `kqueue`/`poll` for portability) and owns no
interpreter. Async socket primitives (`sockConnect`, `sockListen`, `sockSend`, `sockRecv`,
`sockClose`, plus a timer) are registered once through the broker, so every language gets
them. The recommended v1 model is **synchronous-blocking at the script level**: `data =
sockRecv(conn)` parks the calling context on a reply future; when epoll reports readiness,
the I/O thread builds a `CALL`/reply and enqueues it onto the *owning* context's queue, so
the result lands on the right interpreter thread. **Callbacks are opt-in on top**: pass a
`valueFnE` (e.g. `onConnect(myFunc)`) and the completion invokes it via the same dispatch,
always back on its home thread. No separate async keyword, no per-engine coroutine support
needed.
Fixes from verification:
- The I/O command queue must be **strict tail-append FIFO** (a `sockSend` issued right
after `sockConnect` must be processed after the connect that registered the handle);
assert it.
- The resolver/connect path must **deep-copy `host`** before the command is freed (the
first cut had an unconditional `free(cmd->host)` that would UAF if stored by pointer).
- Wake the epoll thread for new interest via `eventfd`/self-pipe; deregister on close;
drain `eventfd` to `EAGAIN`.
- Portability note: `pthread_condattr_setclock(CLOCK_MONOTONIC)` is absent on Darwin --
guard it with `#ifdef` and derive any monotonic timed wait accordingly (a monotonic
deadline cannot be handed to a realtime-clock condvar).
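
For the Darwin case, one plausible derivation is to keep the deadline on the monotonic clock and re-express only the remaining time on the realtime clock before each timed wait. The helpers below are hypothetical (`timespecAddNs`, `timespecDiffNs`, `realtimeDeadlineFor`) and take the clock readings as parameters so the arithmetic can be checked in isolation; a caller would obtain them from `clock_gettime(CLOCK_MONOTONIC, ...)` and `clock_gettime(CLOCK_REALTIME, ...)` and pass the result to `pthread_cond_timedwait`, recomputing it on every pass of the predicate loop.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

// t + ns, with tv_nsec normalized back into [0, 1e9).
static struct timespec timespecAddNs(struct timespec t, int64_t ns) {
    assert(ns >= 0);
    t.tv_sec += (time_t)(ns / 1000000000);
    t.tv_nsec += (long)(ns % 1000000000);
    if (t.tv_nsec >= 1000000000L) {
        t.tv_sec += 1;
        t.tv_nsec -= 1000000000L;
    }
    return t;
}

// a - b in nanoseconds (may be negative).
static int64_t timespecDiffNs(struct timespec a, struct timespec b) {
    return (int64_t)(a.tv_sec - b.tv_sec) * 1000000000 + (int64_t)(a.tv_nsec - b.tv_nsec);
}

// Re-express a monotonic deadline as an absolute realtime deadline, for a
// condvar that can only wait against CLOCK_REALTIME. A deadline already in
// the past maps to "now", so the timed wait returns immediately.
static struct timespec realtimeDeadlineFor(struct timespec monoDeadline,
                                           struct timespec monoNow,
                                           struct timespec realNow) {
    int64_t remaining = timespecDiffNs(monoDeadline, monoNow);
    if (remaining < 0) remaining = 0;
    return timespecAddNs(realNow, remaining);
}
```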
---
## 8. Error model (one source of truth)
A `NativeFnT` returns a status int and, on failure, writes a human-readable message **into
`result`** (a `valueStringE` tagged as an error, or a small error-struct convention). That
single in-band channel crosses the actor boundary unchanged, is freed exactly once by the
caller, and is surfaced into the calling engine as that engine's native error
(`luaL_error`/`lua_error` for Lua, `mb_raise_error` for my-basic). There is no second error
field anywhere.
---
## 9. Context lifetime and the registry (UAF fix)
The critical use-after-free: contexts were addressed by raw `ScriptContextT*` (and exports
held raw owner pointers), while `contextShutdown` frees the context and destroys its
mutex/cond at runtime -- so a foreign thread could enqueue onto a freed `queueMutex`.
Fix:
- Address contexts by a **stable integer id** through a **locked registry**; never by raw
pointer. `contextEnqueue`/`contextCall`/`ioDispatch` resolve id -> context under the
registry lock and either hold the lock across enqueue or take a reference so the context
(and its mutex) cannot be freed mid-enqueue.
- Add a **generation counter** to context ids and to `CallableT.ownerGen` so a recycled id
cannot misroute an in-flight completion to a different context.
- `contextShutdown`: under the lock, mark dead and remove from the id map; reject new
enqueues with a defined "dead context" error; drain and error-reply queued work; wait for
in-flight references to drain; then free.
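
A reduced sketch of the id + generation registry. `ContextHandleT`, `SlotT`, `MAX_CONTEXTS`, and the `registry*` functions are hypothetical names, `ScriptContextT` is cut down to a reference count, and the drain/wait steps of `contextShutdown` are omitted. What it illustrates is that a stale handle fails the generation check even after its id has been reused, and that a successful lookup takes its reference while still under the registry lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct ContextHandleT { uint32_t id; uint32_t gen; } ContextHandleT;
typedef struct ScriptContextT { int32_t refCount; } ScriptContextT;   // reduced
typedef struct SlotT { ScriptContextT *ctx; uint32_t gen; } SlotT;

#define MAX_CONTEXTS 8u
static SlotT slots[MAX_CONTEXTS];
static pthread_mutex_t registryLock = PTHREAD_MUTEX_INITIALIZER;

// Claim the first free slot; the handle captures the slot's current generation.
static bool registryAdd(ScriptContextT *ctx, ContextHandleT *out) {
    assert(ctx != NULL && out != NULL);
    bool ok = false;
    pthread_mutex_lock(&registryLock);
    for (uint32_t id = 0; id < MAX_CONTEXTS; id++) {
        if (slots[id].ctx == NULL) {
            slots[id].ctx = ctx;
            out->id = id;
            out->gen = slots[id].gen;
            ok = true;
            break;
        }
    }
    pthread_mutex_unlock(&registryLock);
    return ok;
}

// Resolve id -> context and take a reference under the lock.
// Returns NULL for a removed context or a handle from an older generation.
static ScriptContextT *registryAcquire(ContextHandleT h) {
    ScriptContextT *ctx = NULL;
    pthread_mutex_lock(&registryLock);
    if (h.id < MAX_CONTEXTS && slots[h.id].ctx != NULL && slots[h.id].gen == h.gen) {
        ctx = slots[h.id].ctx;
        ctx->refCount++;          // guarded by registryLock in this sketch
    }
    pthread_mutex_unlock(&registryLock);
    return ctx;
}

// Remove from the id map and bump the generation so old handles go stale.
static void registryRemove(ContextHandleT h) {
    pthread_mutex_lock(&registryLock);
    if (h.id < MAX_CONTEXTS && slots[h.id].ctx != NULL && slots[h.id].gen == h.gen) {
        slots[h.id].ctx = NULL;
        slots[h.id].gen++;
    }
    pthread_mutex_unlock(&registryLock);
}
```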
---
## 10. Function-value lifecycle across threads
`CallableT.refCount` is atomic. `valueCopy` bumps it; `valueFree` drops it. The subtlety:
releasing the underlying closure is an interpreter op (`luaL_unref` / my-basic unref) that
must run on the owner's thread. So when a drop reaches zero on a *foreign* thread, the
broker posts a **release message** to the owner context instead of touching the interpreter
directly; the owner releases the closure on its own thread and frees the `CallableT`. If
the owner is already gone (`alive == false`), the `CallableT` shell is freed directly (the
closure is already gone with the interpreter) and any pending invoke returns a clean error.
---
## 11. Build order
1. **Broker core**: `broker.h` (the canonical `ValueT`/`AggregateT`/`CallableT`/enums),
`valueCopy`/`valueFree`/`valueMove` with full tag coverage and the depth cap, the
name->entry registry, `brokerCall`, the error convention. Unit-test value round-trips
and deep-copy/free under a leak checker before any engine exists.
2. **Lua adapter** against the core: trampoline, scalar+string marshalling, table
deep-copy both directions with caps, native registration, export with the refcounted
`luaL_ref` lifecycle and exports-array growth. Test C<->Lua and Lua-export-called-from-C
single-threaded.
3. **my-basic adapter**: lifecycle, the trampoline bank, the arg-frame protocol, `mb_memdup`
string ownership, the collection mapping, and the parked `__BROKERSERVE` export frame.
Test C<->BASIC and the full Lua<->broker<->BASIC round-trip against the fidelity table
(assert the lossy edges error or coerce exactly as documented).
4. **Actor layer**: `ScriptContextT`, the MPSC queue, the reply future, the id+generation
registry, the single error channel, the chosen block-while-waiting semantics (sec 6.1),
and shutdown drain. Stress cross-context calls and teardown under a thread sanitizer.
5. **Networking/dispatcher**: epoll I/O thread, the FIFO command queue, the socket/timer
natives, completion dispatch onto owning queues, callbacks via `valueFnE`.
6. **Squirrel** (later): a third adapter validates that the vtable + canonical `ValueT`
really make new engines O(1).
---
## 12. Open decisions
- **Block-while-waiting semantics: DECIDED -- always-live nested pump (sec 6.1).**
- Strict-vs-lenient default for the lossy marshal edges (recommend: strict by default so
truncation/loss is an explicit error; lenient opt-in per call).
- my-basic native-bank size (cap on natives per BASIC context).
- Whether a foreign function injected into BASIC should be transparently callable as a
routine value (`cb(x)`) or only via the portable `CALL(fn, ...)` primitive (Lua gets the
transparent form for free; BASIC's transparent form needs confirming).
---
## 13. Implementation notes (as-built: broker core + both adapters)
Built and tested: `broker.h`/`value.c`/`broker.c` (core), `luaAdapter.*` (Lua 5.4),
`mybasicAdapter.*` (vendored my-basic in `vendor/`), with `testBroker`/`testLua`/
`testMyBasic`/`testPolyglot` -- 378 checks, clean under ASan+UBSan. The polyglot test
proves the thesis: one C native called from both engines, and a Lua function invoked from
a BASIC program through the broker.
**Core refinement.** The single global callable-release hook could not distinguish a Lua
closure from a my-basic routine, so release is now a per-callable `CallableReleaseFnT`
passed to `callableCreate` (design sec 10's "owner releases the closure", just synchronous
for now). Added `callableUserData` so a release fn can reach its closure handle.
**Lua adapter.** Context pointer lives in `lua_getextraspace`. Native bindings are
context-owned `{fn,userData}` structs referenced by a light-userdata upvalue on one shared
trampoline. A Lua function crossing out becomes a `CallableT` over a pinned `luaL_ref`
(released via `luaL_unref` in the per-callable release fn); transient callback args are
freed automatically because `valueFree` drops the handle. A `CallableT` crossing in becomes
a callable userdata with `__call`/`__gc`. Lua allocation APIs longjmp on OOM (no NULL
checks). Caveat: release exported callables before `luaContextDestroy` (the `luaL_ref`
lives in that state's registry).
**my-basic adapter** (the high-effort one; these rules were forced by ASan):
- Build with `-DMB_DOUBLE_FLOAT` (double reals) and link `-lm`.
- Native signature has no per-call userData and one interpreter userdata slot, so a
macro-generated **trampoline bank** (`MB_BANK_SIZE`) supplies slot-specific entries that
recover the binding from the context.
- `mb_register_func` returns a count: **nonzero = success, 0 = failure** (inverted vs the
usual `MB_FUNC_OK == 0`).
- **Ownership is asymmetric and was the main source of bugs** (verified against the my-basic
source during adversarial review):
- A popped **collection** is owned by the consumer (`mb_dispose_value` after marshalling);
a popped **string** is a borrowed interior pointer (copy, never free).
- `mb_set_coll` **copies** a scalar/string key-value (dispose your copy after) but stores a
**collection by pointer without a reference** -- so a nested collection needs an explicit
`mb_ref_value` *before* the set, and must then NOT be disposed (the parent owns it).
- `mb_push_value` **transfers** a collection, but **borrows** a string -- a string result
must be pushed with `mb_push_string` (which marks it for lazy destroy), not
`mb_push_value`.
- `mb_eval_routine` **borrows** its arguments (it never frees them), so marshalled routine
args are disposed by the caller after the call -- and the **return value is marshalled
out first**, because a routine may return one of those borrowed arguments.
- Routine values are **not** ref-counted (`mb_ref_value`/`mb_unref_value` corrupt them); a
routine name must be **uppercased** before `mb_get_routine` (BASIC uppercases at parse).
- int64 entering BASIC is range-checked to 32-bit `int_t` (`brokerErrRangeE` on overflow).
- **Routine export** uses `mb_get_routine`(by name) + `mb_eval_routine`, both of which need
a live `void** l`. That cursor only exists inside a native call, so the dispatch stashes
it in `currentL`; an exported BASIC-routine `CallableT` is therefore valid only while the
context is *serving* (a native frame is on the stack -- what the actor layer's parked
`__BROKERSERVE` frame will guarantee). For now, fetch and invoke within one native call.
- **One interpreter per program:** a my-basic context hosts a single program; reset+reload
after disposing native-pushed collection intermediates is unreliable, so the tests spin a
fresh context per run. (The actor layer will own one long-lived parked context per
module, which sidesteps this.)
**Build/verify.** Core compiled strict (`-Wconversion -Wsign-conversion`); adapters drop
those two (engine headers use wide macros) but keep `-Wall -Wextra -Werror`. **All three
engines are vendored under `vendor/` and built from source** -- `vendor/lua` (Lua 5.4.6,
library = `src/*.c` minus the `lua.c`/`luac.c` mains), `vendor/mybasic` (my-basic),
`vendor/squirrel-src` (Squirrel 3.2) -- each relaxed and un-sanitized but linked into the
sanitized binaries so cross-boundary heap misuse is still caught. Nothing depends on a
system-installed engine or `pkg-config`, so the build is reproducible. The Lua platform
define is selected automatically: `$(OS)` first (Windows sets `Windows_NT` and has no
`uname` -> no define / ISO C), else from `uname -s` (LUA_USE_LINUX + `-ldl` /
LUA_USE_MACOSX / LUA_USE_POSIX). NB the project is otherwise Unix-only (pthreads,
sanitizers, `setarch`), so the Windows branch only keeps the define correct.
**Function-value lifecycle across threads (sec 10), DONE.** `callableInvoke` and the
final `callableRelease` are now thread-correct. The core exposes two installable hooks
(`callableSetInvokeHook`/`callableSetReleaseHook`, the same pattern as `brokerSetRouteHook`)
so it stays independent of the actor layer; `actorInit` installs them. An invoke from a
thread other than the callable's owner is marshalled to the owner's thread by reusing the
CALL machinery (a callable's `fn`+`userData` are exactly a native call -- `callableFn` is
the one new accessor). The final reference drop is routed too: a new `messageReleaseE`
posts the finalize (fire-and-forget) to the owner, which runs the engine release
(`luaL_unref` / `sq_release`) on its own thread. `callableFinalize` is the shared "run
release + free shell" tail; the core still runs it inline when no actor is present (so the
single-threaded `testCallableDead` semantics -- a dead callable still runs its release on
last drop -- are preserved). `testEngineLua` captures a Lua closure on its context's
thread, then invokes and releases it from the main thread; both marshal to the owner,
ASan/TSan-clean. Limit: releasing a callable whose owner *context* has been destroyed is
the deferred non-quiescent-teardown case (sec 9) -- best-effort inline finalize for now.
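
The installable-hook pattern described above can be modelled in a few lines. This is a minimal single-threaded sketch, not calog's code: every `sketch`/`Sketch` name is hypothetical, the argument and result are reduced to one `int64_t`, and the stand-in hook only counts a cross-owner call where a real hook would enqueue a CALL on the owner's queue and wait for the reply.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature of a callable: an owner id plus fn/userData. */
typedef int32_t (*SketchFnT)(int64_t arg, int64_t *result, void *userData);

typedef struct SketchCallableT {
    uint32_t  ownerId;    /* context that must run fn */
    SketchFnT fn;
    void     *userData;
} SketchCallableT;

/* Installable hook: the threading layer supplies it, the core only calls it. */
typedef int32_t (*SketchInvokeHookT)(SketchCallableT *callable, int64_t arg, int64_t *result);

static SketchInvokeHookT sketchHook        = NULL;
static uint32_t          sketchCallerId    = 0;  /* stand-in for "which context is calling" */
static int               sketchRoutedCalls = 0;

void sketchSetInvokeHook(SketchInvokeHookT hook) { sketchHook = hook; }
void sketchSetCaller(uint32_t id)                { sketchCallerId = id; }
int  sketchRoutedCount(void)                     { return sketchRoutedCalls; }

/* Core entry point: inline when no hook is installed, otherwise defer to the hook. */
int32_t sketchInvoke(SketchCallableT *callable, int64_t arg, int64_t *result) {
    assert(callable != NULL && callable->fn != NULL);
    if (sketchHook == NULL) {
        return callable->fn(arg, result, callable->userData);
    }
    return sketchHook(callable, arg, result);
}

/* Stand-in actor hook: a same-owner call runs inline; a cross-owner call is
 * only counted here (a real hook would marshal it to the owner's thread). */
int32_t sketchActorHook(SketchCallableT *callable, int64_t arg, int64_t *result) {
    if (callable->ownerId != sketchCallerId) {
        sketchRoutedCalls++;
    }
    return callable->fn(arg, result, callable->userData);
}

/* Example body for the demo. */
int32_t sketchDoubleIt(int64_t arg, int64_t *result, void *userData) {
    (void)userData;
    *result = arg * 2;
    return 0;
}
```

The point of the shape is that the core never names the actor layer: with no hook installed, `sketchInvoke` is the old inline path.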
**JavaScript adapter (Duktape), the fourth engine.** Vendored Duktape 2.7.0 (the
single amalgamated `duktape.c`/`duktape.h`/`duk_config.h`) in `vendor/duktape`, built
relaxed/un-sanitized. `src/js/jsAdapter.*` mirrors the Lua/Squirrel adapters: one shared
trampoline recovers its binding from a hidden property on the function object
(`duk_push_current_function` + an internal `\xFF`-prefixed key) and dispatches through
`brokerCall`; marshalling covers scalars, binary-safe strings, and the hybrid aggregate
(JS array <-> list, object <-> map) with the depth cap. JS numbers are doubles, so an
integral in-range number round-trips as an int (else a real). A JS function crossing out
becomes a refcounted `CallableT` over a Duktape **heap pointer** kept alive by a per-heap
export registry object in the global stash (the slot is dropped on release) -- and it
participates in the sec-10 cross-thread routing, so a JS closure captured on its context's
thread is invoked and released from another thread correctly. `src/js/jsEngine.*` is the
EngineT binding. `testJs` (single-threaded: scalar/string/array/object marshalling, export
+ invoke-from-C, closure-as-arg callback, error paths) and `testEngineJs` (threaded:
cross-context call + the sec-10 callback) -- clean under ASan/UBSan and TSan (`make tsanjs`).
Adding the engine touched zero lines of the broker core or actor layer (one adapter TU +
one engine-binding TU + Makefile rules), re-confirming the O(1)-engine-add thesis. v1 limit
(as with Squirrel): pushing a foreign `CallableT` *into* JS is unsupported.
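
The "integral in-range number round-trips as an int (else a real)" rule can be sketched as below. This is an illustration under assumptions, not the adapter's code: `SketchNumberT` and `sketchFromDouble` are hypothetical names, and the range test assumes the target integer is `int64_t`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum SketchKindE { sketchIntE, sketchRealE } SketchKindE;

typedef struct SketchNumberT {
    SketchKindE kind;
    union { int64_t i; double r; } as;
} SketchNumberT;

/* 2^63 as a double: the first magnitude that no longer fits in int64_t. */
#define SKETCH_INT64_LIMIT 9223372036854775808.0

/* Classify a double: an integral value inside the int64 range becomes an int,
 * everything else (fractions, out-of-range values, NaN, infinities) stays real. */
SketchNumberT sketchFromDouble(double d) {
    SketchNumberT out;
    /* NaN fails both comparisons, so it falls through to the real branch. */
    bool inRange = (d >= -SKETCH_INT64_LIMIT) && (d < SKETCH_INT64_LIMIT);
    if (inRange && (double)(int64_t)d == d) {
        out.kind = sketchIntE;
        out.as.i = (int64_t)d;
    } else {
        out.kind = sketchRealE;
        out.as.r = d;
    }
    return out;
}
```

The range check runs before the cast because converting an out-of-range double to `int64_t` is undefined behaviour in C.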
**Source layout.** `src/` holds the project source (core + actor directly in `src/`,
one subdir per script language: `src/lua`, `src/mybasic`, `src/squirrel`, `src/js`);
`tests/` holds the test programs; `obj/` collects every object file (ours and the vendored
engines, via `patsubst` into `obj/`); `bin/` collects the binaries. The Makefile finds our
sources by `VPATH` and groups object rules by flag set; `-MMD -MP` generate header
dependencies automatically. `make clean` removes `obj/` and `bin/`. `make test` builds and
runs all ten binaries; `make tsan`/`make tsansq`/`make tsanjs` are the ThreadSanitizer
variants.
## 16. Threading model rewrite -- host-thread natives, fire-and-forget scripts
Supersedes the earlier "natives run inline on the calling context thread" model. The
host's own thread is now an implicit **host context (id 0)**: it has a queue but no OS
thread of its own, and the host drives it by calling **`calogPump`** in its loop.
- **Scripts are fire-and-forget.** `calogContextEval(ctx, src)` enqueues the script
onto the context's thread and returns a status (not the result); the script runs
asynchronously, and results come back by calling natives.
- **A registered native runs on the host thread, serialized.** A script calling one
posts a CALL onto the host queue and parks; the host runs it during `calogPump`. So
host C code is never called concurrently and needs no locking. `actorRoute` inlines a
call already on the host thread; otherwise it marshals to the host context (id 0).
- **`calogRegisterInline`** is the escape hatch: the registry entry's `runInline` flag
(which replaced `ownerCtxId`) makes the native run on the calling script's thread.
- **Errors** from a fire-and-forget script are posted to the host queue and delivered
to the `CalogErrorFnT` handler (`calogSetErrorHandler`) during `calogPump` (default:
log to stderr).
- **Function values** (`CalogFnT`) still run on their owning engine's thread -- sec 10
routing is unchanged, and `calogFnInvoke` from the host blocks-and-pumps the host
queue while it waits (the same nested pump, now applied to the host context).
- **Nested eval is allowed:** a new eval that arrives while a context is mid-script
(parked on a native call) runs nested via `pumpUntil` -- consistent with the sec-6.1
re-entrancy contract (interpreters support nested `pcall`/`peval`).
API shape: `calogRegister(c,name,fn,ud)` / `calogRegisterInline(...)`;
`calogContextOpen(c,engine) -> CalogContextT*` (create+start merged, since nothing is
registered between them anymore) and `calogContextClose`; `calogContextEval(ctx,src)`
fire-and-forget; `calogPump`; `calogSetErrorHandler`. `CalogConfigT` and the
`createInterpreter` config parameter are gone -- a context now exposes *every*
registered native (the engine binding walks the registry via the internal
`calogForEach`). Tests rewritten to drive calog the host way (register, open, eval,
pump-until-a-native-records-the-result); `testActor` is now engine-free, exercising the
dispatch machinery with C callables (`calogFnCreate`) on synthetic contexts. Verified:
`make test` 441 checks across 11 binaries (incl. `examples/embed.c`), gcc + clang
strict, ASan/UBSan + TSan clean (`make tsan`/`tsansq`/`tsanjs`).
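
The "post a CALL onto the host queue, host runs it during the pump" flow can be modelled without threads. The sketch below is a hypothetical single-threaded stand-in (all `sketch` names are invented here): a real implementation would guard the queue with a mutex and have the posting script park until `done` is set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t (*SketchNativeT)(int64_t arg, int64_t *result, void *userData);

/* One pending call from a script to a host-registered native. */
typedef struct SketchCallT {
    SketchNativeT fn;
    void         *userData;
    int64_t       arg;
    int64_t       result;
    int32_t       status;
    int           done;     /* set by the pump; a parked script would wait on this */
} SketchCallT;

#define SKETCH_QUEUE_CAP 16

typedef struct SketchHostQueueT {
    SketchCallT *slots[SKETCH_QUEUE_CAP];
    size_t       head;
    size_t       count;
} SketchHostQueueT;

/* Script side: post a CALL for the host. Returns 0 on success, -1 when full. */
int sketchPost(SketchHostQueueT *q, SketchCallT *call) {
    assert(q != NULL && call != NULL);
    if (q->count == SKETCH_QUEUE_CAP) {
        return -1;
    }
    q->slots[(q->head + q->count) % SKETCH_QUEUE_CAP] = call;
    q->count++;
    return 0;
}

/* Host side: drain the queue, running natives one at a time, in order.
 * Returns how many calls were serviced. */
int sketchPump(SketchHostQueueT *q) {
    int serviced = 0;
    while (q->count > 0) {
        SketchCallT *call = q->slots[q->head];
        q->head = (q->head + 1) % SKETCH_QUEUE_CAP;
        q->count--;
        call->status = call->fn(call->arg, &call->result, call->userData);
        call->done = 1;
        serviced++;
    }
    return serviced;
}

/* Example native: records the value it was handed into host-owned storage. */
int32_t sketchRecord(int64_t arg, int64_t *result, void *userData) {
    *(int64_t *)userData = arg;
    *result = arg;
    return 0;
}
```

Because only the pump runs natives, host code in this model is never entered concurrently, which is the property the section relies on.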
**`CalogT` owns its contexts; ids are unbounded.** The active-context registry moved
from `context.c` file-static globals *into* `struct CalogT` (now defined in
`calogInternal.h`): a runtime owns both its native-function registry and its
active-context registry (`ctxMutex`, `ctxSlots`, freelist). `context.c` reaches it via
one `runtime` pointer set in `calogActorInit` (which also refuses a second runtime).
So `calogDestroy` closes every still-open context automatically -- the host need not
track them (a test opens 32 and never closes them; ASan confirms no leak). Context ids
widened to **`uint64_t`** (32-bit slot index + 32-bit generation), so neither the live
count nor open/close churn hits a preset ceiling; `calogContextId`/`calogCurrentId`
return `uint64_t`. The now-dead `ownerGen` parameter was dropped from `calogFnCreate`
(generation lives in the packed id). Re-verified: `make test` 473 checks, ASan no
leaks, TSan clean, gcc + clang strict.
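
A packed id of this shape can be sketched as follows. The helper names are hypothetical, and which half holds the slot and which the generation is an assumption made for the example; the text only fixes the two widths.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout for illustration: generation in the high 32 bits, slot in the low 32. */
uint64_t sketchIdPack(uint32_t slot, uint32_t generation) {
    return ((uint64_t)generation << 32) | (uint64_t)slot;
}

uint32_t sketchIdSlot(uint64_t id) {
    return (uint32_t)(id & 0xFFFFFFFFu);
}

uint32_t sketchIdGeneration(uint64_t id) {
    return (uint32_t)(id >> 32);
}

/* An id is stale when its slot has since been reused under a newer generation. */
int sketchIdIsCurrent(uint64_t id, uint32_t slotGenerationNow) {
    return sketchIdGeneration(id) == slotGenerationNow;
}
```

With the generation carried inside the id, a separate owner-generation argument is redundant, which matches the dropped `ownerGen` parameter.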
**Independent runtimes in one process.** The one-runtime limit was not fundamental --
just process-global state that hadn't moved into `CalogT`. All of it has now moved: the host
context, the routing hooks (`routeHook`/`invokeHook`/`releaseHook`), and the error sink
are `CalogT` fields; a `CalogFnT` carries a `runtime` pointer so `calogFnInvoke`/release
reach the right hooks (the callable path has no `CalogT` otherwise). The dispatch
reaches its runtime through the object it already holds -- the route hook is handed its
`calog`, the callable hooks read `calogFnRuntime`, context-thread code uses
`context->broker`, and the rest take an explicit `calog` argument. The **only** remaining
process global is `currentContext`, and it is thread-local (it names the calling
thread's context). So the setters (`calogSetRouteHook` etc.) are gone -- `calogActorInit`
assigns the fields directly -- and the `runtime` static that the earlier review flagged
is deleted. Runtimes are isolated: don't pass a value or callable between them (a
cross-runtime reply cannot route). A test spawns N threads, each creating, driving, and
destroying its own runtime concurrently.
**One thread may host several runtimes.** `calogPump(calog)` sets `currentContext` to
`calog`'s host context for the drain and restores it after, so a single thread can drive
many runtimes by pumping each in turn -- a native serviced during `calogPump(A)` sees
`calogCurrent() == A` even if the thread also hosts B. Two decisions are affected and
handled: context ids number from 1 in every runtime, so an id alone no longer identifies
a context, and both the "already on the owner's thread" (inline) and the "caller can take
the token/pump path" (reply) decisions now match the *runtime* too, not the id alone -- a
foreign or wrong-runtime caller takes a reply box,
which cannot misroute. A test creates two runtimes on one thread, runs a script in each,
and pumps both in a loop, asserting each runtime's native resolved `calogCurrent()` to
its own runtime (it fails if the pump doesn't rebind `currentContext`). Re-verified:
`make test` 480 checks, ASan no leaks, TSan clean (both concurrent runtimes and one
thread pumping two), gcc + clang strict.
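
The save/rebind/restore step is the part that is easy to get wrong, so here is a minimal sketch of it. It is a hypothetical model: the thread-local names the runtime directly (the text binds the runtime's host *context*), and each runtime holds at most one pending native for brevity.

```c
#include <assert.h>
#include <stddef.h>

typedef struct SketchRuntimeT SketchRuntimeT;
typedef void (*SketchNativeT)(void *userData);

struct SketchRuntimeT {
    const char   *name;
    SketchNativeT pending;      /* one queued native, for brevity */
    void         *pendingData;
};

/* Thread-local: names whichever runtime the calling thread is currently servicing. */
static _Thread_local SketchRuntimeT *sketchCurrentRuntime = NULL;

SketchRuntimeT *sketchCurrent(void) { return sketchCurrentRuntime; }

/* Pump one runtime: rebind "current" for the drain, then restore the previous value
 * so a thread hosting several runtimes (or a nested pump) sees a consistent binding. */
void sketchPump(SketchRuntimeT *runtime) {
    assert(runtime != NULL);
    SketchRuntimeT *saved = sketchCurrentRuntime;
    sketchCurrentRuntime = runtime;
    if (runtime->pending != NULL) {
        SketchNativeT fn = runtime->pending;
        runtime->pending = NULL;
        fn(runtime->pendingData);
    }
    sketchCurrentRuntime = saved;
}

/* Example native: records which runtime it observed as current. */
void sketchRecordCurrent(void *userData) {
    *(SketchRuntimeT **)userData = sketchCurrent();
}
```

Dropping the rebind line makes both natives observe the same stale value, which is the failure the two-runtime test is described as catching.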
**Loading a script by filename.** Each engine carries a NULL-terminated `extensions`
list (`{"lua"}` / `{"js"}` / `{"nut"}` / `{"bas"}`), and a host makes engines available
for filename-based loading with `calogRegisterEngine(calog, &engine)`.
`calogContextLoad(calog, base)` then walks the registered engines in registration order
(each engine's extensions in order), forms `"<base>.<ext>"`, and the first one that
`fopen`s wins: it reads the file on the calling thread, opens a context on that engine,
and loads the contents fire-and-forget -- returning the context (NULL if nothing matched
or the load failed). Registration matters for more than search order: hardcoding the
built-in engine vtables in the core would force-link *all* of them (and their vendored
runtimes) into every binary, defeating the per-engine archives -- so the host opts in,
and a binary that never references an engine pulls in none (`testActor` stays
engine-free). Engine selection is fundamentally a build-time (link) choice, so
`calogRegisterBuiltinEngines` (a header-inline in `calog.h`) registers exactly the
engines whose `CALOG_WITH_<ENGINE>` macro is set -- the host defines those alongside the
archives it links, and the inline emits nothing (references no engine) unless called, so
it never force-links.
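
The search order (engines in registration order, each engine's extensions in order, first file that opens wins) can be sketched like this. All names are hypothetical; the existence check is passed in as a function pointer so the walk can be exercised without touching the filesystem, with an `fopen`-based probe supplied as the obvious default.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef struct SketchEngineT {
    const char        *name;
    const char *const *extensions;   /* NULL-terminated, e.g. {"lua", NULL} */
} SketchEngineT;

typedef bool (*SketchExistsT)(const char *path);

/* Default probe: a path "exists" if it can be opened for reading. */
bool sketchFileExists(const char *path) {
    FILE *f = fopen(path, "rb");
    if (f == NULL) {
        return false;
    }
    fclose(f);
    return true;
}

/* Stand-in probe for the demo: pretends only "app.js" is on disk. */
bool sketchFakeExists(const char *path) {
    return strcmp(path, "app.js") == 0;
}

/* Walk engines in registration order and each engine's extensions in order.
 * On a hit, pathOut holds "<base>.<ext>" and the owning engine is returned. */
const SketchEngineT *sketchFindScript(const SketchEngineT *const *engines, size_t engineCount,
                                      const char *base, SketchExistsT exists,
                                      char *pathOut, size_t pathCap) {
    for (size_t e = 0; e < engineCount; e++) {
        for (const char *const *ext = engines[e]->extensions; *ext != NULL; ext++) {
            int n = snprintf(pathOut, pathCap, "%s.%s", base, *ext);
            if (n < 0 || (size_t)n >= pathCap) {
                continue;   /* candidate path would not fit; skip it */
            }
            if (exists(pathOut)) {
                return engines[e];
            }
        }
    }
    return NULL;
}
```

A host would pass `sketchFileExists` as the probe; the demo probe only exists to make the ordering observable.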
**my-basic as an actor engine.** Making my-basic loadable meant running it under the
actor model for the first time, which exposed two things. (1) Its native dispatch called
the C function *directly* instead of through `calogCall`, so natives ran on the my-basic
context thread rather than marshalling to the host -- fixed by routing `mbDispatch`
through `calogCall` (the binding now stores the registry name), matching the other
engines. (2) my-basic keeps process-global state -- lazy `mb_init` singletons and a
global `_mb_allocated` counter touched on every allocation (forced on in the vendored
header) -- so two my-basic contexts on different threads race (TSan-confirmed). The
singletons are built once by `mb_init` and read-only thereafter, so the only
execution-time shared write is that counter; a one-line vendored patch makes it
`_Atomic` (the original is preserved as `vendor/mybasic/myBasic.c.orig`). With the
counter safe, the my-basic *engine* (not the adapter, which stays usable single-threaded
and lock-free) needs a lock only across *lifecycle* -- `mb_init`'s first-context build,
`mb_dispose`'s last-context teardown, and the shared context refcount -- and NOT across
`runSource`, so several my-basic scripts execute concurrently. A `tsanmb` target proves
the parallel case is race-free (verified further by a 4-context stress running
arithmetic, strings, lists, and booleans). Verified: `make test` 494 checks (13
binaries), ASan no leaks, TSan clean on all four engines
(`tsan`/`tsansq`/`tsanjs`/`tsanmb`), gcc + clang strict.
## 15. Public embedding API (`calog.h`) -- as-built (superseded by sec 16 for threading/API)
calog is packaged as an embedding library: a host links it, registers its own native
C functions, creates script contexts on an engine, and runs scripts. Every public
symbol carries a `calog` prefix (types `Calog...T`, enums `Calog...E`) so the library
is a good citizen in a host binary. The API was curated to the minimum:
- **One handle, one header.** `CalogT` is the runtime; `calogCreate()` composes the
registry with the actor layer (installs the routing hooks) and `calogDestroy()`
tears both down -- no separate init/shutdown for the host. The entire embedding
surface is `src/calog.h` (~30 functions); internal machinery (the registry entry
type, the route/invoke/release hooks, the low-level callable lifecycle, the split
`calogBrokerCreate`/`calogActorInit`) lives in `src/calogInternal.h`, which host code
never includes. `calog.h` leaks no internal symbol.
- **One config type.** The three per-engine configs collapsed into `CalogConfigT`
(`exposeNames` + `exposeCount`), used by every built-in engine vtable.
- **Value model unchanged, just prefixed.** `CalogValueT`/`CalogAggT`/`CalogFnT` +
constructors (`calogValueInt`, ...), ops (`calogValueCopy`/`Free`/`Move`/`Equals`),
aggregates (`calogAgg*`), function values (`calogFnInvoke`/`Retain`/`Release`), and
`calogFail`/`calogTypeName` for writing natives. Contexts: `calogContextCreate`/
`Start`/`Eval`/`Destroy`/`Id`, plus `calogCurrentId`/`calogCurrent` for natives. The
built-in engine vtables are `calogLuaEngine`, `calogJsEngine`, `calogSquirrelEngine`
(a host may also supply a custom `CalogEngineT`).
- **Packaging.** `make` builds `lib/libcalog.a` (calog itself: core + actor + every
adapter/binding) and separate vendored-engine archives (`liblua.a`, `libduktape.a`,
`libsquirrel.a`, `libmybasic.a`). A host links `libcalog.a` plus whichever engine
archives it uses; unused adapters (and their engine deps) stay unlinked because static
members are pulled only when referenced -- so a JS-only host never links Lua/Squirrel.
The tests consume the archives; the threaded/engine tests use only `calog.h`,
validating that the public surface is complete. `examples/embed.c` is a ~30-line host
(public header only) that registers a native and calls it from JavaScript.
- **Reconfirmed:** rename + restructure kept all 441 checks passing across 10 test
binaries, clean under ASan/UBSan and TSan (all four engines), gcc + clang strict.
## 14. Implementation notes (as-built: actor layer, engine-on-a-thread, Squirrel)
The actor layer (`context.h`/`context.c`, build step 4) is built and tested:
`testActor` exercises cross-context routing, the always-live nested pump (the
re-entrant A->B->A deadlock test), and a concurrent fan-out stress; clean under
ASan+UBSan and ThreadSanitizer (`make tsan`, run under `setarch -R` -- some kernels
hand out more mmap randomization than TSan's shadow tolerates). One thread + one
MPSC queue per `ScriptContextT`; `brokerCall` routes through an installed hook
(`brokerSetRouteHook`) so owner-0/same-context calls run inline and others marshal
to the owning thread; an external caller blocks on a private reply box, a context
caller pumps. The reply carries only `{status, result}` -- the single error channel
(sec 8) is structural, the error string rides in `result`. `contextSendBlocking`
and `contextReply` are the shared enqueue-wait and reply tails behind both CALL and
EVAL dispatch.
**Generationed registry (sec 9), DONE.** Context ids pack a 16-bit slot index and
a 16-bit generation (since widened to 32 + 32 bits in a `uint64_t`; see sec 16); the
registry is a slot table plus a freelist. `contextDestroy`
unlinks a context under the registry lock (after stopping+joining its thread) and
returns the slot to the freelist; the next reuse bumps the generation. A stale id
(slot since freed/recycled) resolves to `brokerErrDeadE`, never misroutes to the
recycler -- `testActor`'s generation test proves it. The registry lock is held
across enqueue, so a foreign enqueue cannot race a destroy onto a freed queue mutex.
Still quiescence-assuming (no call to the context in flight at teardown); in-flight
reference draining is the remaining sec 9 hardening.
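
A slot table with a freelist and per-slot generations can be sketched as below. This is a hypothetical single-threaded model using the 16 + 16 bit packing from this paragraph; it omits the registry lock and the thread stop/join, and it bumps the generation when a slot is (re)used, as the text describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SLOTS 4

typedef struct SketchSlotT {
    uint16_t generation;
    int      live;
    int16_t  nextFree;   /* freelist link; -1 ends the list */
    void    *context;
} SketchSlotT;

typedef struct SketchRegistryT {
    SketchSlotT slots[SKETCH_SLOTS];
    int16_t     freeHead;
} SketchRegistryT;

void sketchRegistryInit(SketchRegistryT *r) {
    for (int16_t i = 0; i < SKETCH_SLOTS; i++) {
        r->slots[i].generation = 0;
        r->slots[i].live = 0;
        r->slots[i].context = NULL;
        r->slots[i].nextFree = (int16_t)((i + 1 < SKETCH_SLOTS) ? i + 1 : -1);
    }
    r->freeHead = 0;
}

/* Returns a packed id (generation << 16 | slot), or 0 when the table is full. */
uint32_t sketchRegistryAdd(SketchRegistryT *r, void *context) {
    assert(context != NULL);
    if (r->freeHead < 0) {
        return 0;
    }
    int16_t index = r->freeHead;
    SketchSlotT *slot = &r->slots[index];
    r->freeHead = slot->nextFree;
    slot->generation++;              /* each (re)use bumps the generation */
    if (slot->generation == 0) {
        slot->generation = 1;        /* skip 0 on wrap so a valid id is never 0 */
    }
    slot->live = 1;
    slot->context = context;
    return ((uint32_t)slot->generation << 16) | (uint32_t)index;
}

/* Returns the context, or NULL when the id is stale, dead, or out of range. */
void *sketchRegistryResolve(const SketchRegistryT *r, uint32_t id) {
    uint16_t index = (uint16_t)(id & 0xFFFFu);
    uint16_t generation = (uint16_t)(id >> 16);
    if (index >= SKETCH_SLOTS) {
        return NULL;
    }
    const SketchSlotT *slot = &r->slots[index];
    if (!slot->live || slot->generation != generation) {
        return NULL;
    }
    return slot->context;
}

/* Unlinks a live context and returns its slot to the freelist. */
void sketchRegistryRemove(SketchRegistryT *r, uint32_t id) {
    uint16_t index = (uint16_t)(id & 0xFFFFu);
    if (sketchRegistryResolve(r, id) == NULL) {
        return;
    }
    r->slots[index].live = 0;
    r->slots[index].context = NULL;
    r->slots[index].nextFree = r->freeHead;
    r->freeHead = (int16_t)index;
}
```

The `NULL` from `sketchRegistryResolve` plays the role the text gives to `brokerErrDeadE`: a recycled slot never answers to an old id.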
**Engine on a thread (the EngineT vtable).** `EngineT` gained `runSource`;
`contextEval(context, source, result)` marshals a script run onto the context's own
thread (a new `messageEvalE`) and blocks like a call. Each adapter's engine binding
lives in its own TU (`luaEngine.*`, `squirrelEngine.*`) -- the only Lua/Squirrel
files that depend on the threading layer, keeping the adapters thread-agnostic so
`testLua`/`testPolyglot` link them without `context.o`/pthread. `createInterpreter`
runs on the thread and exposes the configured natives there. Crucially, the exposed-
native trampolines now dispatch through `brokerCall` (by broker+name) instead of a
captured fn pointer, so an exposed native owned by another context is transparently
routed to its thread -- the script author still writes `doubleIt(21)`. With no route
hook installed this is identical to the old inline path (testLua still passes).
`testEngineLua` proves a real Lua interpreter on a context thread calling a thread-
agnostic native and a cross-context native, on the correct threads.
**Squirrel adapter (sec 11 step 6), the O(1)-engine-add validation.** Vendored
Squirrel 3.2 in `vendor/squirrel-src` (C++), built relaxed/un-sanitized with
`-D_SQ64 -DSQUSEDOUBLE` so `SQInteger`/`SQFloat` are 64-bit int / double matching
`ValueT` -- the adapter shares those defines so the ABI matches. `squirrelAdapter.*`
mirrors the Lua adapter: one shared trampoline recovers its binding from the
closure's single free variable (which the VM pushes onto the stack *after* the args,
so it sits at the top -- verified in `sqvm.cpp` CallNative), marshals scalars,
binary-safe strings, and the hybrid aggregate (array<->list, table<->map) with the
shared depth cap, and dispatches through `brokerCall`. `testEngineSquirrel` runs a
real VM on a thread doing the cross-context call plus string and array round-trips;
clean under ASan+UBSan and TSan (`make tsansq`). The total surface a new engine
added: one adapter TU + one engine-binding TU + Makefile rules -- no change to the
broker core or the actor layer, which is the thesis.
**Squirrel closure export, DONE.** A Squirrel closure crossing the boundary now
becomes a refcounted `CallableT` over a pinned `HSQOBJECT` (`sq_addref`/`sq_release`,
mirroring Lua's `luaL_ref` lifecycle): `squirrelExport` fetches a named global
closure, and a closure passed as a native argument is exported the same way during
ingress (the VM's foreign pointer -- finally used -- recovers the owning context).
`squirrelCallableInvoke` runs `sq_pushobject`+`sq_call` on the owner's VM and
marshals the return; `squirrelCallableRelease` `sq_release`s on the owner thread.
Single-threaded `testSquirrel` covers export+invoke-from-C, a closure passed as an
argument and called back through the broker, and the not-found/type-error paths;
ASan-clean (no addref/release leak). Caveat (same as Lua): release exported
callables before `squirrelContextDestroy`. (The reverse direction -- a foreign `CalogFnT`
pushed INTO Squirrel -- is now supported too: a native closure whose one free variable is
a release-hooked userdata holding the `CalogFnT`; see sec 18.)
`make test` runs all seven binaries (411 checks). `make tsan` covers the actor core
and the Lua engine path; `make tsansq` the Squirrel path.
**Adversarial review (3 parallel reviewers: actor concurrency, Squirrel adapter,
Lua trampoline + engine bindings).** Two real defects found and fixed:
- *NULL-interpreter crash.* `threadMain` ignores `createInterpreter`'s status, so a
failed create (e.g. a config expose-name that was never registered) left a context
serving with `interp == NULL`; `contextDispatchEval` only checked `runSource !=
NULL`, so the first eval called `runSource(NULL,...)` -> NULL deref. Fixed by
guarding `interp == NULL` (the context still serves native calls, just rejects
evals); regression test in `testEngineLua` (testFailedInterpreter).
- *OOM lost-wakeup.* `contextReply`'s context-caller branch allocated a fresh REPLY
and, on `calloc` failure, dropped the wakeup -- the caller hung in `pumpUntil`
forever. Fixed by reusing the request message as the reply (it already carries the
token and replyToId), which removes the allocation entirely, so the wakeup can no
longer be lost to OOM.
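
The second fix amounts to turning the serviced request into its own reply. A minimal sketch of that idea, with a hypothetical message layout (the real one is not shown in this document):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum SketchKindE { sketchCallE, sketchReplyE } SketchKindE;

typedef struct SketchMessageT {
    SketchKindE kind;
    uint64_t    token;       /* matches a reply to its waiting caller */
    uint32_t    replyToId;   /* context whose queue receives the reply */
    int32_t     status;
    int64_t     payload;     /* argument on the way in, result on the way out */
    struct SketchMessageT *next;
} SketchMessageT;

/* Convert a serviced request into its reply in place. Nothing is allocated,
 * so there is no failure path on which the caller's wakeup could be dropped. */
SketchMessageT *sketchIntoReply(SketchMessageT *request, int32_t status, int64_t result) {
    assert(request != NULL && request->kind == sketchCallE);
    request->kind = sketchReplyE;
    request->status = status;
    request->payload = result;
    request->next = NULL;    /* about to be linked into the caller's queue */
    return request;          /* token and replyToId ride along unchanged */
}
```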
The Squirrel adapter was traced clean against the real Squirrel source (trampoline
free-var indexing, stack balance, ValueT ownership, the throwerror/free order, binary
strings); added `sq_reservestack` guards before the recursive marshallers to match the
Lua adapter's `lua_checkstack` discipline. Documented (not changed): the registry must
be frozen before contexts start (`brokerCall` reads it locklessly from context
threads -- noted in broker.h); teardown still assumes quiescence (sec 9); and the
hybrid-aggregate-to-Squirrel-table egress flattens array indices and integer keys into
one table (same lossy edge as elsewhere in the fidelity table).
## 17. Engine expansion -- QuickJS-ng, and three new languages (Berry, s7, Wren)
calog now ships **seven** engines. Each is one adapter TU (marshalling + native
trampoline + the sec-10 callable export) plus one engine-binding TU (the four-hook
`CalogEngineT`), a vendored-from-source archive, a `testEngine*`, and a `tsan*` target --
the core, the actor layer, and `calog.h` were untouched, re-confirming the O(1)-per-engine
thesis. Which engines a binary pulls in is a link-time choice: the header-inline
`calogRegisterBuiltinEngines` references only the `CALOG_WITH_<ENGINE>`-selected vtables,
so `testActor` still links **zero** engine code.
**QuickJS-ng replaces Duktape** (same `calogJsEngine` / `.js`, same `jsAdapter.h` /
`jsEngine.c` -- only `jsAdapter.c` and the Makefile changed). The wins: a JS **BigInt
round-trips to int64 exactly** (`JS_ToBigInt64`), closing the double-only fidelity gap
Duktape had (proven by a 2^53+1 test); JS functions are **refcounted `JSValue`s**
(`JS_DupValue` / `JS_FreeValue`), replacing the Duktape heap-pointer-pinning registry
behind `CalogFnT`; and the broker/name binding rides on `JS_SetContextOpaque` +
`JS_NewCFunctionData`. One gotcha: a missing global reads back as `undefined` (not an
error), so `calogJsExport` maps `JS_IsUndefined` to not-found while a bound non-function
stays a type error. Core library = `quickjs.c` + `libregexp.c` + `libunicode.c` +
`dtoa.c`, built with `-D_GNU_SOURCE`.
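
Why 2^53+1 is the natural probe for the double-only gap: it is the smallest positive integer an IEEE double cannot represent. The helper below is a standalone illustration (hypothetical name, no QuickJS calls) of the check a marshaller could use to decide whether an int64 survives a trip through a double.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when v converts to double and back without changing value. */
bool sketchSurvivesDouble(int64_t v) {
    double d = (double)v;
    /* Values just under INT64_MAX round up to 2^63, which cannot be cast
     * back to int64_t without undefined behaviour; report them as lossy. */
    if (d >= 9223372036854775808.0) {
        return false;
    }
    return (int64_t)d == v;
}
```

For example, `sketchSurvivesDouble(INT64_C(9007199254740992))` (2^53) holds, while the same call with `INT64_C(9007199254740993)` (2^53+1) does not, because the nearest double is 2^53.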
**Berry** (`.be`) is a Lua-like stack VM (64-bit ints, binary-safe `be_pushnstring`).
Natives are Berry **native closures carrying two upvalues** (the context comptr and the
name), recovered with `be_getupval(vm, 0, pos)`; a Berry function crossing out is pinned
under a uniquely-named hidden global (globals are GC roots) and dropped by setting it to
nil. The sharp edge: `be_pcall(vm, argc)` leaves the **result in the function's slot
(`base+1`)**, not at `-1` (which holds the last stale argument). Vendoring needs Berry's
`coc` codegen prebuild plus its OS port (`be_port.c`) and module/class tables
(`be_modtab.c`). A subtlety for records: `be_newmap`/`be_newlist` push *raw* containers
that a script cannot subscript, so an aggregate crossing out is wrapped in its `map`/`list`
class instance (`map(raw)` via `be_getbuiltin` + `be_call`, then `be_moveto`/`be_pop` to
drop the raw and `init`'s nil return) -- then `user['name']` works. Ingress reverses it:
a list/map instance holds its raw container in the hidden `.p` member, iterated with
`be_pushiter`/`be_iter_next` (see sec 18).
**s7 Scheme** (`.scm`) uses the *current* official s7 (an older mirror lacked `s7_free`,
which would leak a heap per context). Since s7 native functions carry no user data, all
natives route through **one generic `%calog-call` dispatcher** plus a per-name Scheme
wrapper (`(define (report . a) (apply %calog-call "report" a))`); the context rides on a
`*calog-context*` c-pointer global. Callables are kept alive by `s7_gc_protect` and
invoked with `s7_call`. Because `s7_eval_c_string` evaluates a single form, `calogS7Run`
wraps the (escaped) source in `(catch #t (lambda () (eval-string …)) handler)`, so both
read and run errors surface as a value -- a marker pair the runner detects. An aggregate
crossing out is a Scheme list, or an (applicable) hash-table when keyed, so a materialized
record reads as `(user "name")`; reading a script's keyed value back in is a v1 limit.
s7's intentional "permanent string" interning (which `s7_free` does not reclaim) is a small,
bounded allocation, suppressed with a documented, allocation-site-specific `__lsan` hook. s7 is
per-interpreter thread-safe -- no serialization needed (unlike my-basic).
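
Two pieces of text generation are implied above: the per-name wrapper form and the escaping of script source before it is embedded in a string literal. The sketch below shows both as plain string helpers. The function names are hypothetical, and the escaping rule (prefix backslash and double quote with a backslash, leave everything else alone) is an assumption about what "escaped" needs to cover, not a statement of what `calogS7Run` does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Writes (define (<name> . a) (apply %calog-call "<name>" a)) into out.
 * Returns 0 on success, -1 if the form does not fit in cap bytes. */
int sketchWrapperForm(const char *name, char *out, size_t cap) {
    int n = snprintf(out, cap, "(define (%s . a) (apply %%calog-call \"%s\" a))", name, name);
    return (n >= 0 && (size_t)n < cap) ? 0 : -1;
}

/* Escapes src for use as the body of a Scheme string literal: backslash and
 * double quote each get a backslash prefix. Returns 0, or -1 if out is too small. */
int sketchEscape(const char *src, char *out, size_t cap) {
    size_t used = 0;
    for (const char *p = src; *p != '\0'; p++) {
        if (*p == '\\' || *p == '"') {
            if (used + 1 >= cap) {
                return -1;
            }
            out[used++] = '\\';
        }
        if (used + 1 >= cap) {   /* always keep room for the terminating NUL */
            return -1;
        }
        out[used++] = *p;
    }
    if (cap == 0) {
        return -1;
    }
    out[used] = '\0';
    return 0;
}
```

The escaped text is what would sit between the quotes of the `eval-string` argument in the `catch` form quoted above.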
**Wren** (`.wren`) is the outlier: Wren has **no bare function calls**, so every native is
reached through a single foreign method. A preamble defines `class Calog { foreign static
call(name, args) }`, and scripts call `Calog.call("report", [42])`; the C dispatcher
recovers the context from `wrenGetUserData`, marshals the argument list, and dispatches
through `calogCall`. A Wren function crossing out is a retained `WrenHandle`, invoked with
a cached per-arity `call(_)` handle. Wren numbers are IEEE **doubles**, so int64 above
2^53 loses precision (the same edge my-basic and old-JS have). An aggregate crossing out
is a Wren `List`, or a `Map` when keyed (a record reads as `user["name"]`); reading a
script's list/map back in is a v1 limit -- Wren's C API cannot enumerate a `Map`'s keys.
Wren keeps no process-global state, so contexts run in parallel.
Aggregate egress is therefore uniform across all seven engines: a host native can return a
keyed `CalogValueT` record and every engine reads its fields with native syntax
(`user.name` / `user['name']` / `user["name"]` / `(user "name")`). The reverse -- a script
handing a list/map *back* to C -- is complete on Lua/JS/Squirrel/my-basic and was a v1 limit
on Berry/s7/Wren as first shipped (since closed; see sec 18).
**Verified** across all seven engines: `make test` (531 checks, including a materialized-
record read per new engine), ASan/UBSan clean, a `tsan<engine>` target clean for each, and
gcc + clang strict on the core.
## 18. Closing the v1 marshalling limits (function-into-script, aggregate ingress)
The engines above shipped with two directional gaps in the value bridge: a foreign
function value could not be pushed *into* a script (only Lua did it), and a script could
not hand a keyed aggregate (map) *back* to C on the three newest engines. Both are now
closed everywhere they can be, with one honest exception each.
**New public API** `calogFnFromNative(out, calog, fn, userData)` -- wraps one of your
natives as a host-owned function value (owner id 0, runs on the host thread during
`calogPump`, like `calogRegister` but anonymous). Without it, function-into-script was
unusable from `calog.h` alone (`calogFnCreate` is internal), so a host could only forward
a *script*-derived callable, never one of its own natives. Return the result from a native
and the script gets a callable that routes back to the host.
**Function value -> script** (each `*FromValue` `calogFnE` case): an engine callable
object wraps the `CalogFnT*`, a trampoline marshals the script's args -> `calogFnInvoke`
-> marshals the result, and a finalizer runs `calogFnRelease`; `calogFnRetain` at push.
Per engine:
- **Lua** (pre-existing): userdata + `__call` + `__gc` metatable -- the reference for the rest.
- **JS**: a `JSClassDef` with both `.call` and `.finalizer`; the finalizer only receives
the runtime, so `JS_SetRuntimeOpaque` carries the context to it.
- **Squirrel**: a native closure whose single free variable is a release-hooked userdata
holding the `CalogFnT` (freeing the closure frees the userdata -> the hook releases).
- **s7**: an applicable c-object (`s7_make_c_type` + `s7_c_type_set_ref` for the call,
`s7_c_type_set_free` for release); the `ref` fn gets `(obj . args)`, so the object is
`s7_car`. Script calls `(f ...)`.
- **Berry**: no per-value finalizer exists, so the `CalogFnT`s pushed into a context are
*tracked on the context* and released together in `calogBerryDestroy`; the callable is a
native closure over `(context, CalogFnT)` comptr upvalues. (`berryFromValue` gained the
context parameter so it could record them.)
- **Wren**: a `foreign class CalogFn { construct new() {} foreign call(args) }` whose
`call` takes a *list* (Wren method arity is fixed, so a list absorbs any argument count);
finalize releases. Script calls `f.call([...])`. Gotcha: Wren requires newlines between
class members, so the preamble is multi-line.
- **MY-BASIC**: inherent gap -- BASIC has no first-class callable values to invoke.
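
Stripped of engine specifics, every variant in the list follows the same three-step lifecycle: retain at push, invoke through a trampoline, release in the finalizer. The sketch below models only that lifecycle; all names are hypothetical, the value is reduced to one `int64_t`, and the "finalizer" is called by hand where an engine's GC would call it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t (*SketchBodyT)(int64_t arg, int64_t *result, void *userData);

/* Stand-in for a refcounted function value. */
typedef struct SketchFnT {
    int         refCount;
    SketchBodyT body;
    void       *userData;
    int        *releasedFlag;   /* demo only: set when the last reference drops */
} SketchFnT;

void sketchFnRetain(SketchFnT *fn) {
    fn->refCount++;
}

void sketchFnRelease(SketchFnT *fn) {
    assert(fn->refCount > 0);
    if (--fn->refCount == 0 && fn->releasedFlag != NULL) {
        *fn->releasedFlag = 1;  /* a real release would run the owner's unref and free */
    }
}

/* What the engine holds: a script-visible object wrapping the function value. */
typedef struct SketchScriptCallableT {
    SketchFnT *fn;
} SketchScriptCallableT;

/* Push: the wrapper takes its own reference, independent of the host's. */
void sketchPush(SketchScriptCallableT *wrapper, SketchFnT *fn) {
    sketchFnRetain(fn);
    wrapper->fn = fn;
}

/* Trampoline: script args in, invoke, result out (a single int here). */
int32_t sketchTrampoline(SketchScriptCallableT *wrapper, int64_t arg, int64_t *result) {
    return wrapper->fn->body(arg, result, wrapper->fn->userData);
}

/* Finalizer: runs once when the engine collects the wrapper. */
void sketchFinalize(SketchScriptCallableT *wrapper) {
    sketchFnRelease(wrapper->fn);
    wrapper->fn = NULL;
}

/* Example body for the demo. */
int32_t sketchAddOne(int64_t arg, int64_t *result, void *userData) {
    (void)userData;
    *result = arg + 1;
    return 0;
}
```

The Berry variant differs only in when `sketchFinalize` would run: once per tracked wrapper at context destroy rather than per object at collection.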
**Aggregate ingress** (`*ToValue` map/list): Lua/JS/Squirrel/MY-BASIC already read both.
Added:
- **s7**: read a hash-table by `s7_make_iterator` + `s7_iterate` (each yields a
`(key . value)` cons; gotcha: at end `s7_iterate` returns a *non-pair* sentinel even
when the at-end flag was still false, so guard `if (!s7_is_pair(pair)) break`).
- **Berry**: a list/map instance's raw container is its hidden `.p` member; iterate with
`be_pushiter`/`be_iter_hasnext`/`be_iter_next` (which take the *container* index with the
iterator kept on top, pushing one value for a list and key+value for a map -- restore the
stack to `[container, iterator]` after each entry).
- **Wren**: a `List` reads back directly. A `Map` needs key enumeration, which upstream
Wren's C API lacks (`wrenGetMapValue` is by-key only) -- so calog adds a small patch to
the vendored `wren.c`/`wren.h` (`wrenGetMapCapacity` / `wrenGetMapEntry`, mirroring Wren's
own internal `map_iterate`), and the adapter walks the raw table skipping empty slots.
With the patch, Wren too reads maps back in. (Documented in LICENSE.md; re-apply if the
amalgamation is regenerated.)
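
The Wren case reduces to "ask for the table capacity, then probe every slot and skip the empty ones". The sketch below shows that walk over a toy open-addressed table. It does not reproduce the patched Wren functions: `sketchMapCapacity` and `sketchMapEntry` are hypothetical stand-ins whose signatures are invented for the example, and keys and values are plain integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct SketchEntryT {
    bool    used;
    int64_t key;
    int64_t value;
} SketchEntryT;

typedef struct SketchMapT {
    SketchEntryT *entries;
    uint32_t      capacity;   /* number of slots, not number of live entries */
} SketchMapT;

uint32_t sketchMapCapacity(const SketchMapT *map) {
    return map->capacity;
}

/* Probes one raw slot. Returns false for an empty slot, else fills key/value. */
bool sketchMapEntry(const SketchMapT *map, uint32_t index, int64_t *key, int64_t *value) {
    assert(index < map->capacity);
    const SketchEntryT *e = &map->entries[index];
    if (!e->used) {
        return false;
    }
    *key = e->key;
    *value = e->value;
    return true;
}

typedef void (*SketchVisitT)(int64_t key, int64_t value, void *userData);

/* Walks the raw table, visiting only occupied slots. Returns the entry count. */
uint32_t sketchMapWalk(const SketchMapT *map, SketchVisitT visit, void *userData) {
    uint32_t seen = 0;
    for (uint32_t i = 0; i < sketchMapCapacity(map); i++) {
        int64_t key = 0;
        int64_t value = 0;
        if (!sketchMapEntry(map, i, &key, &value)) {
            continue;
        }
        visit(key, value, userData);
        seen++;
    }
    return seen;
}

/* Example visitor: sums the values it is shown. */
void sketchSumValues(int64_t key, int64_t value, void *userData) {
    (void)key;
    *(int64_t *)userData += value;
}
```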
**Deliberately not "fixed" (inherent to the engine's value model):** MY-BASIC 32-bit ints
(over 2^31 range-checked to an error, not silently truncated), MY-BASIC NUL-in-string
truncation and serialize-at-load, and JS/Wren int64 above 2^53 (IEEE doubles -- JS *could*
emit a BigInt but that breaks arithmetic mixing with Number, a worse trap than the
documented precision edge). `WREN_MAX_CALL_ARITY` (16) is pinned to Wren's own engine
limit (`MAX_PARAMETERS`) and can't be raised. `MB_BANK_SIZE` (the MY-BASIC native cap) was
32 -- the one hard cap a real app could hit, since MY-BASIC natives can't carry userdata
so each needs a hand-materialized slot trampoline; it is now **256** (the trampoline bank
and its `[MB_BANK_SIZE]` table are regenerated together, so a count mismatch fails to
compile). Berry's `BE_BYTES_MAX_SIZE` was likewise raised from 32 KB to 256 MB.
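
A slot-trampoline bank exists because an engine-facing function with no userdata parameter has to know "which native am I" from its own address. One way to generate such a bank is an X-macro, sketched below with a bank of 4. Everything here is hypothetical (names, signatures, the bank size), and the `_Static_assert` is this sketch's way of making a count mismatch a compile error; the text does not say which mechanism the real bank uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t (*SketchNativeT)(int64_t arg, int64_t *result, void *userData);
/* Engine-facing signature: no userData parameter, which is the whole problem. */
typedef int32_t (*SketchEngineFnT)(int64_t arg, int64_t *result);

#define SKETCH_BANK_SIZE 4
#define SKETCH_BANK(X) X(0) X(1) X(2) X(3)

typedef struct SketchBindingT {
    SketchNativeT fn;
    void         *userData;
} SketchBindingT;

static SketchBindingT sketchBindings[SKETCH_BANK_SIZE];

/* Shared tail: look up the slot's binding and call it with its userData. */
int32_t sketchDispatch(size_t slot, int64_t arg, int64_t *result) {
    assert(slot < SKETCH_BANK_SIZE && sketchBindings[slot].fn != NULL);
    return sketchBindings[slot].fn(arg, result, sketchBindings[slot].userData);
}

/* One tiny function per slot; each one hard-codes its own index. */
#define SKETCH_DEFINE(n) \
    int32_t sketchSlot##n(int64_t arg, int64_t *result) { return sketchDispatch(n, arg, result); }
SKETCH_BANK(SKETCH_DEFINE)

#define SKETCH_ENTRY(n) sketchSlot##n,
static const SketchEngineFnT sketchBank[] = { SKETCH_BANK(SKETCH_ENTRY) };

/* The list and the size are kept in lock-step at compile time. */
_Static_assert(sizeof sketchBank / sizeof sketchBank[0] == SKETCH_BANK_SIZE,
               "trampoline bank and SKETCH_BANK_SIZE disagree");

/* Bind a native to the next free slot. Returns its engine-callable trampoline,
 * or NULL when the bank is exhausted (the hard cap the text refers to). */
SketchEngineFnT sketchBind(SketchNativeT fn, void *userData) {
    for (size_t i = 0; i < SKETCH_BANK_SIZE; i++) {
        if (sketchBindings[i].fn == NULL) {
            sketchBindings[i].fn = fn;
            sketchBindings[i].userData = userData;
            return sketchBank[i];
        }
    }
    return NULL;
}

/* Example native that depends on its userData. */
int32_t sketchScale(int64_t arg, int64_t *result, void *userData) {
    *result = arg * *(int64_t *)userData;
    return 0;
}
```

The fifth `sketchBind` call on a bank of 4 returns `NULL`, which is the small-scale version of an application hitting the native cap.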
**Verified**: `make test` (539 checks) with a `testForeignFunction` per engine and a
`testMapIngress` on s7 and Berry (the Berry one nests a list to exercise list ingress);
ASan-clean on every engine (retain/release balanced); gcc strict; per-engine `tsan*`.