# calog -- Polyglot Script Broker: Design & Implementation Plan

A C "broker" that lets one application be written in a mix of scripting languages (Lua and my-basic first; Squirrel and others later). Native C functions are added once and become callable from every language. Functions and data exported from one module are callable from modules written in another language. Threading is actor-based; networking rides the same dispatcher. Data sharing is by-value for v1.

This document is the reconciled output of a design pass plus an adversarial verification pass. Where the verification corrected the first-cut design, the correction is folded in and noted as "[verified]".

---

## 1. Architecture: hub and spoke

Nothing talks to anything else directly. Every engine talks only to the broker, through two shared contracts:

1. One universal value type, `ValueT` (a tagged union).
2. One uniform native-function signature:

```c
typedef int32_t (*NativeFnT)(ValueT *args, int32_t argCount, ValueT *result, void *userData);
```

A developer writes a native function once against that signature and registers it once. A *script* function exported from a module is itself stored as a `NativeFnT` whose body re-enters its owning interpreter -- so "call C from script", "call script from C", and "call module A's function from module B" are all the same code path. Adding an engine is O(1) adapter work, not O(N) per existing engine.

The single most important lesson from the verification pass: **there must be exactly one `ValueT` / `AggregateT` / `ValueTypeE`, defined in `broker.h`, included verbatim by every adapter.** The first-cut design had three divergent copies and would not have linked, let alone round-tripped data. Section 2 is therefore the load-bearing part of this plan.

---

## 2. The canonical type system (`broker.h`) -- single source of truth

### 2.1 Value tags and the value struct

```c
typedef enum ValueTypeE {
    valueNilE       = 0,
    valueBoolE      = 1,
    valueIntE       = 2,   // int64_t
    valueRealE      = 3,   // double
    valueStringE    = 4,   // length-prefixed, binary-safe
    valueAggregateE = 5,   // hybrid array + map container
    valueFnE        = 6    // function value: refcounted handle to a CallableT
} ValueTypeE;

typedef struct StringT {
    char    *bytes;    // owned; always NUL-terminated at bytes[length] for C consumers
    int64_t  length;   // byte count excluding the convenience terminator (binary-safe)
} StringT;

typedef struct ValueT {
    ValueTypeE type;
    union {
        bool               b;
        int64_t            i;
        double             r;
        StringT            s;
        struct AggregateT *agg;        // heap-owned subtree
        struct CallableT  *callable;   // refcounted, broker-owned
    } as;
} ValueT;
```

### 2.2 The aggregate (one shape both engines map onto)

```c
typedef struct PairT {
    ValueT key;     // marshal layer constrains to int/real/string keys (see 2.5)
    ValueT value;
} PairT;

typedef enum AggregateKindE {
    aggregateListE = 0,   // empty container round-trips as a list by default
    aggregateMapE  = 1,
    aggregateBothE = 2    // array part AND pairs part populated
} AggregateKindE;

typedef struct AggregateT {
    AggregateKindE kind;   // disambiguates empty/mixed containers across engines
    ValueT  *array;        // dense elements [0, arrayCount)
    int64_t  arrayCount;
    int64_t  arrayCap;
    PairT   *pairs;        // map part; preserves insertion order
    int64_t  pairCount;
    int64_t  pairCap;
} AggregateT;
```

A Lua table's sequence part maps to `array`, its remaining keys to `pairs`. A my-basic `LIST` maps to `array`, a `DICT` maps to `pairs`. The explicit `kind` flag fixes two problems the verifier flagged: an empty `{}` had no defined type on the far side, and a mixed array+hash Lua table had no representation at all.

### 2.3 Function values: `CallableT`

`valueFnE` is the one deliberate exception to "by-value everything".
A function cannot be meaningfully copied between heaps, so it is shared by *reference* -- but safely, because it is only ever *invoked*, never inspected, and invocation always routes to the owning context's thread.

```c
typedef struct CallableT {
    NativeFnT fn;           // uniform invoke entry
    void     *userData;     // luaL_ref slot, pinned mb_value_t, or C closure ctx
    uint32_t  ownerCtxId;   // context whose thread MUST run it
    uint32_t  ownerGen;     // generation of that context (UAF guard, see sec 9)
    int32_t   refCount;     // ATOMIC; shared-handle lifetime across threads
    bool      alive;        // false once the owning context is torn down
} CallableT;
```

Rules (these resolve the verifier's critical "function value breaks the no-shared-pointer invariant" finding):

- `valueCopy` of a `valueFnE` does an **atomic refcount increment** on the same `CallableT` -- it does NOT clone the closure. The shared `CallableT*` across threads is allowed precisely because `refCount` is atomic and the closure is touched only on its owner thread.
- `valueFree` of a `valueFnE` does an **atomic decrement**. When it hits zero, the underlying closure must be released with an interpreter call (`luaL_unref`, my-basic unref) -- which can only run on the owner's thread. So a zero-drop on a foreign thread **posts a release message to the owner context** rather than calling the interpreter directly (sec 10).
- Invoking a dead handle (`alive == false`, owner gone) returns a clean broker error, never a call into a freed interpreter.

### 2.4 Value operation contract (one signature, used everywhere)

The verifier caught two designs declaring `valueCopy` with incompatible signatures.
The canonical form -- status-returning, dst-by-pointer, so OOM is checkable on the hot path:

```c
int32_t valueCopy(ValueT *dst, const ValueT *src);   // deep copy; brokerOkE / brokerErrOomE
void    valueFree(ValueT *v);                        // recursive free; leaves a safe nil
void    valueMove(ValueT *dst, ValueT *src);         // zero-alloc ownership transfer
```

`valueFree` and `valueCopy` MUST have a `default:`/explicit case for every tag including `valueFnE` -- a missing case was how the first cut silently leaked function payloads.

### 2.5 Cross-engine fidelity table (the honest limits)

By-value marshalling across `Lua <-> broker <-> my-basic` is lossless for the common cases and lossy at documented edges. These are inherent to my-basic's value model (verified against `my_basic.h`/`.c`), not marshalling bugs:

| Aspect | Lua | my-basic | Crossing Lua <-> BASIC |
|--------|-----|----------|------------------------|
| integers | 64-bit | `int_t == int` (32-bit, always) | **truncates above 2^31** -- range-check + error |
| reals | double | `float` (double w/ `-DMB_DOUBLE_FLOAT`) | precision loss unless double build |
| int vs real subtype | distinct (5.4) | integral real auto-collapses to int | **subtype not preserved** through BASIC |
| strings | binary-safe (length) | bare `char*` (`strlen`) | **embedded NUL truncates** -- detect + error |
| array / list | table sequence part | `LIST` | OK |
| map / dict | table hash part | `DICT` (int/real/string keys) | OK |
| mixed array+hash table | one table | no single LIST+DICT value | **collapses to DICT** (array part -> int keys), documented |
| empty `{}` | table | LIST or DICT? | `kind` flag; default LIST |
| function value | closure (`luaL_ref`) | lambda/routine (`MB_DT_ROUTINE`) | OK via `CallableT` (by-reference) |
| nested depth | bounded | bounded | one shared cap; defined error past it |
| cycles | rejected on ingress | rejected on ingress | impossible at rest (by-value, no shared refs) |

Policy decisions baked in to make the above deterministic:

- **One shared recursion-depth cap** applied on *every* recursive path -- both ingress and egress, all components -- failing with a defined status instead of overflowing the C stack. (The first cut bounded only Lua ingress.)
- **One strict/lenient switch**, owned by the broker and honored by both adapters: in strict mode an unrepresentable value/key (e.g. a function used as a table key, a 64-bit int into BASIC, a NUL-bearing string into BASIC) is an error; in lenient mode it is dropped/coerced with a documented rule. Never a silent surprise either way.
- **Keys**: broker allows int/real/string keys (my-basic dicts accept all three). Float keys get defined equality; non-representable keys follow the strict/lenient switch.

---

## 3. The engine vtable -- what makes the actor loop engine-agnostic

Each adapter implements one small vtable so the broker/actor core never special-cases an engine:

```c
typedef struct EngineT {
    const char *name;
    void   *(*createInterpreter)(struct ScriptContextT *ctx);   // ON the owning thread
    void    (*destroyInterpreter)(void *interp);
    int32_t (*loadSource)(void *interp, const char *src, int64_t len);
    int32_t (*registerNative)(void *interp, const char *name, NativeFnT fn, void *userData);
    int32_t (*callExport)(void *interp, void *exportRef, ValueT *args, int32_t argCount, ValueT *result);
    void    (*releaseExport)(void *interp, void *exportRef);    // ON the owning thread
} EngineT;
```

`createInterpreter`, `destroyInterpreter`, `releaseExport`, and every `callExport` run on the context's own thread -- that is the invariant that keeps each interpreter single-threaded.

---

## 4. The Lua adapter

Target Lua 5.4 (note 5.1/5.3 deltas where they matter). The C API surface was verified accurate; the fixes below are about lifecycle, not API names.

### 4.1 Native registration and the trampoline

Each registered `NativeFnT` becomes a Lua C closure: push the `NativeFnT` and its `userData` as upvalues with `lua_pushcclosure`, then `lua_setglobal` (or into a module table). The single trampoline recovers them from upvalues, marshals the Lua stack into a `ValueT[]`, calls the `NativeFnT`, and pushes the result back.

- `lua_setglobal` returns `void` (the first cut documented `int` -- harmless but wrong).
- Lua allocation APIs (`lua_newuserdatauv`, `lua_createtable`, ...) **longjmp on OOM and never return NULL** -- so NULL-checks after them are dead code; the OOM path is a Lua error, not a C return. Only `luaL_newstate` can return NULL and must be checked.

### 4.2 Marshalling `ValueT <-> Lua` (by value)

Scalars are direct. Strings use `lua_tolstring` + length (binary-safe, preserves NULs). Tables deep-copy in both directions:

- Ingress (table -> `AggregateT`): normalize the table index to absolute before `lua_next`; route numeric keys through `lua_tointeger`/`lua_tonumber` (do **not** let `lua_tolstring` mutate a numeric key in place); fill `array` for the sequence part and `pairs` for the rest; detect cycles via an ancestor-pointer stack (`lua_topointer`); use `lua_rawset`/`lua_rawget` to avoid metamethods; enforce the shared depth cap.
- Egress (`AggregateT` -> table): build with `lua_createtable`, populate `array` as the sequence and `pairs` as keyed entries, **with the same depth cap** (the first cut had no egress cap -- a deep BASIC-origin structure could overflow the stack on the way back). Make the builder self-balancing: record `lua_gettop` on entry and `lua_settop` back on any error return.
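
The shared depth cap and the unwind-on-failure rule used by both marshalling directions can be illustrated without any Lua dependency. The following is only a minimal sketch under stated assumptions: `NodeT`, `nodeCopy`, `nodeFree`, `chainBuild`, the `sketch*` status names, and the cap value of 64 are all hypothetical stand-ins, not the broker's actual `AggregateT` code. It shows a recursive deep copy that checks the cap before descending and frees any partially built subtree before returning an error.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

// Hypothetical status codes and cap; the real broker defines its own.
enum { sketchOkE = 0, sketchErrOomE = -1, sketchErrDepthE = -2 };
enum { SKETCH_MAX_DEPTH = 64 };

// Simplified stand-in for AggregateT: every node is a list of nodes.
typedef struct NodeT {
    struct NodeT *children;     // owned array of childCount nodes
    int64_t       childCount;
    int64_t       leaf;         // payload carried by every node
} NodeT;

// Recursive free; leaves the node empty so a repeated free is harmless.
static void nodeFree(NodeT *n) {
    for (int64_t i = 0; i < n->childCount; i++) {
        nodeFree(&n->children[i]);
    }
    free(n->children);
    n->children = NULL;
    n->childCount = 0;
}

// Deep copy. The cap is checked on entry, so a node deeper than
// SKETCH_MAX_DEPTH is reported as an error instead of being descended into.
static int32_t nodeCopy(NodeT *dst, const NodeT *src, int32_t depth) {
    assert(dst != NULL && src != NULL);
    dst->children = NULL;
    dst->childCount = 0;
    dst->leaf = src->leaf;
    if (depth > SKETCH_MAX_DEPTH) {
        return sketchErrDepthE;
    }
    if (src->childCount == 0) {
        return sketchOkE;
    }
    dst->children = calloc((size_t)src->childCount, sizeof(NodeT));
    if (dst->children == NULL) {
        return sketchErrOomE;
    }
    for (int64_t i = 0; i < src->childCount; i++) {
        int32_t status = nodeCopy(&dst->children[i], &src->children[i], depth + 1);
        dst->childCount = i + 1;    // children [0, i] are initialized
        if (status != sketchOkE) {
            nodeFree(dst);          // unwind the partial copy
            return status;
        }
    }
    return sketchOkE;
}

// Test helper: builds a single chain nested `levels` deep below the root.
static int32_t chainBuild(NodeT *root, int32_t levels) {
    NodeT *cur = root;
    cur->children = NULL;
    cur->childCount = 0;
    cur->leaf = 0;
    for (int32_t i = 1; i <= levels; i++) {
        cur->children = calloc(1, sizeof(NodeT));
        if (cur->children == NULL) {
            return sketchErrOomE;
        }
        cur->childCount = 1;
        cur = &cur->children[0];
        cur->leaf = i;
    }
    return sketchOkE;
}
```

In this sketch a chain nested exactly `SKETCH_MAX_DEPTH` levels copies successfully, while one level more fails with `sketchErrDepthE` and leaves the destination empty. A real Lua egress builder would additionally restore the Lua stack with `lua_settop` on the error path, as described above.
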
### 4.3 Exporting a Lua function (and the leak fix)

A Lua function crossing the boundary becomes a `valueFnE`: pin the closure with `luaL_ref(L, LUA_REGISTRYINDEX)`, wrap it in a `CallableT`. The `fn` body does `lua_rawgeti` to retrieve, marshals args onto the stack, `lua_pcall`, marshals the return; on error it pulls the message with `luaL_tolstring` and reports it as a broker error (sec 8).

Two bugs the verifier found, fixed here:

- **The exports array needs grow-on-demand.** The context is `calloc`'d, so the array starts NULL/0; the first export must `realloc` (double the cap, handle the NULL/0 seed) before storing, and `luaL_unref` the just-created ref if the realloc fails.
- **Transient vs persistent ownership.** Every Lua function passed as a *native argument* was creating a `luaL_ref` that only got released at `lua_close` -- an unbounded leak for any long-lived context that takes callbacks. Fix: the `CallableT` refcount owns the `luaL_ref`. When the last `valueFree` drops the handle to zero, the ref is released (on the Lua thread per sec 10). A function the broker retains as a real export holds a reference for as long as it is registered; a function merely borrowed for the duration of one call is released when that call's `ValueT` args are freed. Same mechanism, two lifetimes.

### 4.4 Calling a function value from Lua

A `valueFnE` marshalled *into* Lua becomes a Lua C closure over the `CallableT*`, so script authors just write `cb(x, y)` and it transparently dispatches (to the owner's thread if the function lives elsewhere). A universal `call(fn, ...)` native is also provided for uniformity across engines.

### 4.5 Context lifecycle

`luaL_newstate` + `luaL_openlibs` on the owning thread; confine the `lua_State` to that thread forever; teardown `luaL_unref`s outstanding refs then `lua_close`.

---

## 5. The my-basic adapter

Verified against `paladin-t/my_basic` (`my_basic.h` + `.c` read directly).
**Zero hallucinated calls.** The interesting work is three my-basic-specific quirks that shape the adapter; all three are forced by the source, not stylistic.

### 5.1 Lifecycle and the inverted register result

`mb_init` (once per process) / `mb_open(&bas)` / `mb_load_string(bas, src, true)` / `mb_run` / `mb_close` / `mb_dispose`. The broker pointer is threaded through the interp's single userdata slot via `mb_set_userdata` / `mb_get_userdata`.

- **`mb_register_func` returns a count, not a status** -- nonzero means registered, `0` means duplicate/failure. That is the opposite of the `MB_FUNC_OK == 0` convention, so the success test must be inverted. Names are uppercased internally (`mb_strupr`), so the broker key is the uppercased identifier (BASIC is case-insensitive).

### 5.2 The native-function protocol and the trampoline bank

The native signature is `typedef int (*mb_func_t)(struct mb_interpreter_t*, void**);` -- **no per-callback userData parameter**, and the interpreter has only **one** userdata slot. So a single shared C trampoline cannot tell *which* broker function it is serving.

Fix (the verifier confirmed this limitation is real): a **macro-generated bank of trampolines** `mbTramp0 .. mbTrampN`, each hardcoding its slot index, each looking up `ctx->nativeBank[slot]` (the `NativeFnT` + `userData`) via the interpreter's userdata pointer. The bank size caps how many natives one my-basic context can host; size it generously and document it.

Inside a trampoline the argument protocol is the real my-basic frame dance: `mb_attempt_open_bracket` / loop `mb_pop_value` (honoring `mb_has_arg`) / `mb_attempt_close_bracket` / compute / `mb_push_value` (or the typed `mb_push_*`).

### 5.3 String ownership (memdup is mandatory)

`mb_pop_string` hands back a **borrowed interior pointer** -- the broker must `strdup`/copy it immediately.
Pushed strings are taken over by the interpreter and later freed with *its* allocator, so any string handed to `mb_push_value`/`mb_make_string` **must come from `mb_memdup`** (not plain `malloc`). Embedded NULs cannot survive (bare `char*` + `strlen`) -- enforce the strict/lenient policy on egress.

### 5.4 Aggregates: the collection API

There is no `mb_make_coll`. A list/dict is built by presetting `coll->type = MB_DT_LIST`/`MB_DT_DICT` then calling `mb_init_coll`, and accessed with `mb_get_coll` / `mb_set_coll` / `mb_remove_coll` / `mb_count_coll` / `mb_keys_of_coll`. Collection support is on by default (`MB_ENABLE_COLLECTION_LIB`). A broker aggregate with both `array` and `pairs` populated collapses to a DICT (array part becomes integer keys) per the fidelity table.

### 5.5 Exporting a BASIC routine -- the parked `__BROKERSERVE` frame

This is the my-basic-specific crux. To call a BASIC routine/lambda from C you use `mb_get_routine(s, l, name, &val)` then `mb_eval_routine(s, l, val, args, argc, &ret)` -- and **`mb_eval_routine` dereferences `*l` and hard-requires a live, non-NULL `void** l`** (verified at `my_basic.c:14344/14358`). A valid `l` only exists *inside* a running native call. Therefore a my-basic context cannot be driven from arbitrary C; it must be **parked inside a native frame**.

Design: register a native `__BROKERSERVE` whose C body *is* the context's message-pump / serve loop. A module hands control to the broker by ending with a `SERVE` call (the adapter appends one if absent). While parked there, the loop holds a valid `l`, which it uses to `mb_eval_routine` whenever another context calls one of this module's exported routines. `mb_get_routine` returns `MB_FUNC_OK` with a *nil* value when a name is absent, so the not-found test is `routine.type != MB_DT_ROUTINE`, not the status code.
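
The trampoline bank from section 5.2 is the least familiar pattern in this adapter, so a small illustration may help. The following is a minimal sketch under stated assumptions, not the adapter's code: `FakeInterpT` and `FakeFuncT` stand in for my-basic's interpreter and `mb_func_t`, `BoundFnT` is a simplified `NativeFnT`, and `bankBind`, `bankDispatch`, `sampleRead`, and the bank size of 4 are hypothetical. It shows how a macro stamps out one function per slot so that a callback without a userData parameter can still recover a per-native binding from the context.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { BANK_SIZE = 4 };   // hypothetical cap; a real bank would be larger

// Hypothetical stand-ins for the my-basic interpreter and mb_func_t.
typedef struct FakeInterpT { void *userdata; } FakeInterpT;
typedef int (*FakeFuncT)(FakeInterpT *s, void **l);

typedef int32_t (*BoundFnT)(void *userData);   // simplified NativeFnT

typedef struct BindingT { BoundFnT fn; void *userData; } BindingT;

typedef struct BankCtxT {
    BindingT nativeBank[BANK_SIZE];
    int32_t  used;         // slots handed out so far
    int32_t  lastStatus;   // status returned by the most recent native
} BankCtxT;

// Shared body: the slot index is the only thing that differs per trampoline.
static int bankDispatch(FakeInterpT *s, int32_t slot) {
    assert(s != NULL && slot >= 0 && slot < BANK_SIZE);
    BankCtxT *ctx = (BankCtxT *)s->userdata;   // the single userdata slot
    BindingT *binding = &ctx->nativeBank[slot];
    if (binding->fn == NULL) {
        return 1;                              // nonzero: slot is unbound
    }
    ctx->lastStatus = binding->fn(binding->userData);
    return 0;                                  // zero: success
}

// Each expansion hardcodes its own slot index.
#define BANK_TRAMPOLINE(n) \
    static int mbTramp##n(FakeInterpT *s, void **l) { (void)l; return bankDispatch(s, n); }

BANK_TRAMPOLINE(0)
BANK_TRAMPOLINE(1)
BANK_TRAMPOLINE(2)
BANK_TRAMPOLINE(3)

static const FakeFuncT bankEntries[BANK_SIZE] = { mbTramp0, mbTramp1, mbTramp2, mbTramp3 };

// Claims the next free slot; returns its trampoline, or NULL when the bank is full.
static FakeFuncT bankBind(BankCtxT *ctx, BoundFnT fn, void *userData) {
    if (ctx->used >= BANK_SIZE) {
        return NULL;
    }
    int32_t slot = ctx->used++;
    ctx->nativeBank[slot].fn = fn;
    ctx->nativeBank[slot].userData = userData;
    return bankEntries[slot];
}

// Sample native for the demonstration: reports the integer it was bound to.
static int32_t sampleRead(void *userData) {
    return *(int32_t *)userData;
}
```

Binding `sampleRead` twice with different `userData` yields two distinct trampolines, and calling each one reaches its own binding even though neither receives a userData argument. In the real adapter the returned pointer would be what gets passed to `mb_register_func`, and the context would be fetched with `mb_get_userdata`.
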
### 5.6 Numeric and identity caveats

`int_t` is 32-bit unconditionally (64-bit broker ints truncate -- range-check + error or promote to real with documented precision loss); integral reals auto-collapse to int so real/int subtype is not preserved across a BASIC hop. Both are in the fidelity table; both follow the strict/lenient switch.

---

## 6. Threading: the actor model

Each `ScriptContextT` owns one interpreter, one OS thread (pthreads -- chosen over C11 `<threads.h>` for portability/maturity), and one inbound MPSC message queue. Interpreters are single-threaded; only the owning thread ever enters `callExport`. A cross-context call is a message; the caller blocks for the reply on a per-call condvar future (lost-wakeup safe via a predicate loop).

The verifier confirmed the core is sound: no path lets two threads touch one interpreter, the deep-copy ownership ledger is correct on the success path, and the epoll thread enqueuing while a context is mid-dispatch is race-free. The fixes folded in from verification:

- **One error channel.** The first cut carried a separate error `ValueT` in the reply that the caller never freed -- a leak on every errored call, lost error text, and a second source of truth contradicting the broker's "error travels in `result`" contract. Fix: on failure the adapter writes the error string into `result` (as `brokerSetError` does); the reply carries only `{status, result}`. One channel, one owner, freed once.
- **`valueCopy` checked on enqueue.** Use the canonical `int32_t valueCopy(dst, src)`, check each arg, and unwind partially-copied args on OOM (mirroring the broker route path). The actor layer should call the broker's marshalling, not reimplement a copy loop.
- **Shutdown drains everything.** On `SHUTDOWN`, error-reply every queued `CALL` *and* free every queued/stashed `REPLY` (result + error) before join -- the first cut leaked in-flight replies unwound by a nested shutdown.
- **Explicit thread stack size.** The reentrancy depth bound counts dispatch nesting, not C-stack bytes; set a validated stack size with `pthread_attr_setstacksize` (or lower the bound) so the "clean catchable depth error" promise actually holds instead of a UB overflow.
- **Split the ready-handshake condvar** off the queue condvar so `queueCond` has exactly one semantic (latent lost-wakeup footgun if a second waiter is ever added).

### 6.1 What a context does while blocked: always-live nested pump [DECIDED]

When context A makes a synchronous cross-context call and waits for the reply, A's thread **pumps its own inbound queue** instead of sleeping idle. An incoming call to A -- including a re-entrant B->A issued during the very call A is waiting on -- is serviced on A's own thread, then A resumes waiting.

This was chosen over strict run-to-completion because it never deadlocks and needs no wait-for-graph deadlock detector; the rejected alternative would have had to raise a "synchronous call cycle" error on A->B->A and would leave a busy context unresponsive to other callers. The verifier validated the pump as sound: only A's thread ever enters A's interpreter (the single-threaded invariant holds), reply nesting is strict LIFO, and depth is bounded.

The contract this commits the runtime and script authors to:

- **Re-entry happens only at explicit cross-context call points** (`x = getUserInfo()`, `data = sockRecv(c)`), never mid-statement -- a call point is a yield point.
- **Module-global state may differ after a cross-context call returns**, because another call may have run on this context while it was outstanding (the same contract as any RPC). Local variables are unaffected.
- **Reentrancy is depth-bounded** with a catchable error (backed by an explicit pthread stack size, sec 6 fixes), so runaway ping-pong fails cleanly instead of overflowing.
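
The pump contract above can be made concrete with a single-threaded simulation. This is only a sketch under stated assumptions: every name in it (`PumpCtxT`, `PumpMsgT`, `pumpEnqueue`, `pumpAwaitReply`, the `pump*` status codes, and the depth cap of 8) is hypothetical, the queue is a plain ring buffer with no locking, and where the real runtime would block on a condvar the sketch simply returns `pumpErrEmptyE`. It also omits reply stashing and treats a reply for a different call as a protocol error. What it does show is the core idea: while waiting for one reply, inbound calls are serviced on the waiting thread, and nesting is refused past a fixed depth.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical status codes and limits for this sketch only.
enum { pumpOkE = 0, pumpErrDepthE = -1, pumpErrEmptyE = -2, pumpErrProtocolE = -3 };
enum { PUMP_MAX_DEPTH = 8, PUMP_QUEUE_CAP = 16 };

typedef enum PumpKindE { pumpCallE = 0, pumpReplyE = 1 } PumpKindE;

typedef struct PumpMsgT {
    PumpKindE kind;
    int32_t   callId;    // which outstanding call a reply answers
    int64_t   payload;   // call argument or reply result
} PumpMsgT;

typedef struct PumpCtxT {
    PumpMsgT queue[PUMP_QUEUE_CAP];
    int32_t  head;     // index of the next message to pop
    int32_t  count;    // messages currently queued
    int32_t  depth;    // nesting of outstanding waits
    int64_t  served;   // sum of payloads of the calls serviced so far
} PumpCtxT;

// Tail-append; returns false when the ring is full.
static bool pumpEnqueue(PumpCtxT *ctx, PumpMsgT msg) {
    if (ctx->count >= PUMP_QUEUE_CAP) {
        return false;
    }
    ctx->queue[(ctx->head + ctx->count) % PUMP_QUEUE_CAP] = msg;
    ctx->count++;
    return true;
}

static bool pumpPop(PumpCtxT *ctx, PumpMsgT *out) {
    if (ctx->count == 0) {
        return false;
    }
    *out = ctx->queue[ctx->head];
    ctx->head = (ctx->head + 1) % PUMP_QUEUE_CAP;
    ctx->count--;
    return true;
}

// Waits for the reply to `callId`, servicing inbound calls in the meantime.
static int32_t pumpAwaitReply(PumpCtxT *ctx, int32_t callId, int64_t *result) {
    assert(ctx != NULL && result != NULL);
    if (ctx->depth >= PUMP_MAX_DEPTH) {
        return pumpErrDepthE;              // bounded reentrancy
    }
    ctx->depth++;
    int32_t status = pumpErrEmptyE;        // the real runtime would block here
    PumpMsgT msg;
    while (pumpPop(ctx, &msg)) {
        if (msg.kind == pumpCallE) {
            ctx->served += msg.payload;    // stand-in for running the export
            continue;                      // then resume waiting
        }
        if (msg.callId == callId) {
            *result = msg.payload;
            status = pumpOkE;
        } else {
            status = pumpErrProtocolE;     // sketch does not stash replies
        }
        break;
    }
    ctx->depth--;
    return status;
}
```

If two calls are queued ahead of the awaited reply, both are serviced before `pumpAwaitReply` returns, and anything queued after the reply stays in the queue for the normal dispatch loop.
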
Script code stays plain synchronous-blocking regardless -- `info = getUserInfo()` just works; this only governs what the runtime does while a call is outstanding.

---

## 7. Networking and the dispatcher

One dedicated I/O thread runs `epoll` (Linux; `kqueue`/`poll` for portability) and owns no interpreter. Async socket primitives (`sockConnect`, `sockListen`, `sockSend`, `sockRecv`, `sockClose`, plus a timer) are registered once through the broker, so every language gets them.

The recommended v1 model is **synchronous-blocking at the script level**: `data = sockRecv(conn)` parks the calling context on a reply future; when epoll reports readiness, the I/O thread builds a `CALL`/reply and enqueues it onto the *owning* context's queue, so the result lands on the right interpreter thread. **Callbacks are opt-in on top**: pass a `valueFnE` (e.g. `onConnect(myFunc)`) and the completion invokes it via the same dispatch, always back on its home thread. No separate async keyword, no per-engine coroutine support needed.

Fixes from verification:

- The I/O command queue must be **strict tail-append FIFO** (a `sockSend` issued right after `sockConnect` must be processed after the connect that registered the handle); assert it.
- The resolver/connect path must **deep-copy `host`** before the command is freed (the first cut had an unconditional `free(cmd->host)` that would UAF if stored by pointer).
- Wake the epoll thread for new interest via `eventfd`/self-pipe; deregister on close; drain `eventfd` to `EAGAIN`.
- Portability note: `pthread_condattr_setclock(CLOCK_MONOTONIC)` is absent on Darwin -- guard it with `#ifdef` and derive any monotonic timed wait accordingly (a monotonic deadline cannot be handed to a realtime-clock condvar).

---

## 8. Error model (one source of truth)

A `NativeFnT` returns a status int and, on failure, writes a human-readable message **into `result`** (a `valueStringE` tagged as an error, or a small error-struct convention).
That single in-band channel crosses the actor boundary unchanged, is freed exactly once by the caller, and is surfaced into the calling engine as that engine's native error (`luaL_error`/`lua_error` for Lua, `mb_raise_error` for my-basic). There is no second error field anywhere.

---

## 9. Context lifetime and the registry (UAF fix)

The critical use-after-free: contexts were addressed by raw `ScriptContextT*` (and exports held raw owner pointers), while `contextShutdown` frees the context and destroys its mutex/cond at runtime -- so a foreign thread could enqueue onto a freed `queueMutex`. Fix:

- Address contexts by a **stable integer id** through a **locked registry**; never by raw pointer. `contextEnqueue`/`contextCall`/`ioDispatch` resolve id -> context under the registry lock and either hold the lock across enqueue or take a reference so the context (and its mutex) cannot be freed mid-enqueue.
- Add a **generation counter** to context ids and to `CallableT.ownerGen` so a recycled id cannot misroute an in-flight completion to a different context.
- `contextShutdown`: under the lock, mark dead and remove from the id map; reject new enqueues with a defined "dead context" error; drain and error-reply queued work; wait for in-flight references to drain; then free.

---

## 10. Function-value lifecycle across threads

`CallableT.refCount` is atomic. `valueCopy` bumps it; `valueFree` drops it. The subtlety: releasing the underlying closure is an interpreter op (`luaL_unref` / my-basic unref) that must run on the owner's thread. So when a drop reaches zero on a *foreign* thread, the broker posts a **release message** to the owner context instead of touching the interpreter directly; the owner releases the closure on its own thread and frees the `CallableT`. If the owner is already gone (`alive == false`), the `CallableT` shell is freed directly (the closure is already gone with the interpreter) and any pending invoke returns a clean error.

---

## 11. Build order

1. **Broker core**: `broker.h` (the canonical `ValueT`/`AggregateT`/`CallableT`/enums), `valueCopy`/`valueFree`/`valueMove` with full tag coverage and the depth cap, the name->entry registry, `brokerCall`, the error convention. Unit-test value round-trips and deep-copy/free under a leak checker before any engine exists.
2. **Lua adapter** against the core: trampoline, scalar+string marshalling, table deep-copy both directions with caps, native registration, export with the refcounted `luaL_ref` lifecycle and exports-array growth. Test C<->Lua and Lua-export-called-from-C single-threaded.
3. **my-basic adapter**: lifecycle, the trampoline bank, the arg-frame protocol, `mb_memdup` string ownership, the collection mapping, and the parked `__BROKERSERVE` export frame. Test C<->BASIC and the full Lua<->broker<->BASIC round-trip against the fidelity table (assert the lossy edges error or coerce exactly as documented).
4. **Actor layer**: `ScriptContextT`, the MPSC queue, the reply future, the id+generation registry, the single error channel, the chosen block-while-waiting semantics (sec 6.1), and shutdown drain. Stress cross-context calls and teardown under a thread sanitizer.
5. **Networking/dispatcher**: epoll I/O thread, the FIFO command queue, the socket/timer natives, completion dispatch onto owning queues, callbacks via `valueFnE`.
6. **Squirrel** (later): a third adapter validates that the vtable + canonical `ValueT` really make new engines O(1).

---

## 12. Open decisions

- **Block-while-waiting semantics: DECIDED -- always-live nested pump (sec 6.1).**
- Strict-vs-lenient default for the lossy marshal edges (recommend: strict by default so truncation/loss is an explicit error; lenient opt-in per call).
- my-basic native-bank size (cap on natives per BASIC context).
- Whether a foreign function injected into BASIC should be transparently callable as a routine value (`cb(x)`) or only via the portable `CALL(fn, ...)` primitive (Lua gets the transparent form for free; BASIC's transparent form needs confirming).

---

## 13. Implementation notes (as-built: broker core + both adapters)

Built and tested: `broker.h`/`value.c`/`broker.c` (core), `luaAdapter.*` (Lua 5.4), `mybasicAdapter.*` (vendored my-basic in `vendor/`), with `testBroker`/`testLua`/`testMyBasic`/`testPolyglot` -- 378 checks, clean under ASan+UBSan. The polyglot test proves the thesis: one C native called from both engines, and a Lua function invoked from a BASIC program through the broker.

**Core refinement.** The single global callable-release hook could not distinguish a Lua closure from a my-basic routine, so release is now a per-callable `CallableReleaseFnT` passed to `callableCreate` (design sec 10's "owner releases the closure", just synchronous for now). Added `callableUserData` so a release fn can reach its closure handle.

**Lua adapter.** Context pointer lives in `lua_getextraspace`. Native bindings are context-owned `{fn,userData}` structs referenced by a light-userdata upvalue on one shared trampoline. A Lua function crossing out becomes a `CallableT` over a pinned `luaL_ref` (released via `luaL_unref` in the per-callable release fn); transient callback args are freed automatically because `valueFree` drops the handle. A `CallableT` crossing in becomes a callable userdata with `__call`/`__gc`. Lua allocation APIs longjmp on OOM (no NULL checks). Caveat: release exported callables before `luaContextDestroy` (the `luaL_ref` lives in that state's registry).

**my-basic adapter** (the high-effort one; these rules were forced by ASan):

- Build with `-DMB_DOUBLE_FLOAT` (double reals) and link `-lm`.
- Native signature has no per-call userData and one interpreter userdata slot, so a macro-generated **trampoline bank** (`MB_BANK_SIZE`) supplies slot-specific entries that recover the binding from the context.
- `mb_register_func` returns a count: **nonzero = success, 0 = failure** (inverted vs the usual `MB_FUNC_OK == 0`).
- **Ownership is asymmetric and was the main source of bugs** (verified against the my-basic source during adversarial review):
  - A popped **collection** is owned by the consumer (`mb_dispose_value` after marshalling); a popped **string** is a borrowed interior pointer (copy, never free).
  - `mb_set_coll` **copies** a scalar/string key-value (dispose your copy after) but stores a **collection by pointer without a reference** -- so a nested collection needs an explicit `mb_ref_value` *before* the set, and must then NOT be disposed (the parent owns it).
  - `mb_push_value` **transfers** a collection, but **borrows** a string -- a string result must be pushed with `mb_push_string` (which marks it for lazy destroy), not `mb_push_value`.
  - `mb_eval_routine` **borrows** its arguments (it never frees them), so marshalled routine args are disposed by the caller after the call -- and the **return value is marshalled out first**, because a routine may return one of those borrowed arguments.
  - Routine values are **not** ref-counted (`mb_ref_value`/`mb_unref_value` corrupt them); a routine name must be **uppercased** before `mb_get_routine` (BASIC uppercases at parse).
- int64 entering BASIC is range-checked to 32-bit `int_t` (`brokerErrRangeE` on overflow).
- **Routine export** uses `mb_get_routine` (by name) + `mb_eval_routine`, both of which need a live `void** l`. That cursor only exists inside a native call, so the dispatch stashes it in `currentL`; an exported BASIC-routine `CallableT` is therefore valid only while the context is *serving* (a native frame is on the stack -- what the actor layer's parked `__BROKERSERVE` frame will guarantee).
  For now, fetch and invoke within one native call.
- **One interpreter per program:** a my-basic context hosts a single program; reset+reload after disposing native-pushed collection intermediates is unreliable, so the tests spin a fresh context per run. (The actor layer will own one long-lived parked context per module, which sidesteps this.)

**Build/verify.** Core compiled strict (`-Wconversion -Wsign-conversion`); adapters drop those two (engine headers use wide macros) but keep `-Wall -Wextra -Werror`. **All three engines are vendored under `vendor/` and built from source** -- `vendor/lua` (Lua 5.4.6, library = `src/*.c` minus the `lua.c`/`luac.c` mains), `vendor/mybasic` (my-basic), `vendor/squirrel-src` (Squirrel 3.2) -- each relaxed and un-sanitized but linked into the sanitized binaries so cross-boundary heap misuse is still caught. Nothing depends on a system-installed engine or `pkg-config`, so the build is reproducible. The Lua platform define is selected automatically: `$(OS)` first (Windows sets `Windows_NT` and has no `uname` -> no define / ISO C), else from `uname -s` (LUA_USE_LINUX + `-ldl` / LUA_USE_MACOSX / LUA_USE_POSIX). NB the project is otherwise Unix-only (pthreads, sanitizers, `setarch`), so the Windows branch only keeps the define correct.

**Function-value lifecycle across threads (sec 10), DONE.** `callableInvoke` and the final `callableRelease` are now thread-correct. The core exposes two installable hooks (`callableSetInvokeHook`/`callableSetReleaseHook`, the same pattern as `brokerSetRouteHook`) so it stays independent of the actor layer; `actorInit` installs them. An invoke from a thread other than the callable's owner is marshalled to the owner's thread by reusing the CALL machinery (a callable's `fn`+`userData` are exactly a native call -- `callableFn` is the one new accessor).
The final reference drop is routed too: a new `messageReleaseE` posts the finalize (fire-and-forget) to the owner, which runs the engine release (`luaL_unref` / `sq_release`) on its own thread. `callableFinalize` is the shared "run release + free shell" tail; the core still runs it inline when no actor is present (so the single-threaded `testCallableDead` semantics -- a dead callable still runs its release on last drop -- are preserved). `testEngineLua` captures a Lua closure on its context's thread, then invokes and releases it from the main thread; both marshal to the owner, ASan/TSan-clean. Limit: releasing a callable whose owner *context* has been destroyed is the deferred non-quiescent-teardown case (sec 9) -- best-effort inline finalize for now.

**JavaScript adapter (Duktape), the fourth engine.** Vendored Duktape 2.7.0 (the single amalgamated `duktape.c`/`duktape.h`/`duk_config.h`) in `vendor/duktape`, built relaxed/un-sanitized. `src/js/jsAdapter.*` mirrors the Lua/Squirrel adapters: one shared trampoline recovers its binding from a hidden property on the function object (`duk_push_current_function` + an internal `\xFF`-prefixed key) and dispatches through `brokerCall`; marshalling covers scalars, binary-safe strings, and the hybrid aggregate (JS array <-> list, object <-> map) with the depth cap. JS numbers are doubles, so an integral in-range number round-trips as an int (else a real). A JS function crossing out becomes a refcounted `CallableT` over a Duktape **heap pointer** kept alive by a per-heap export registry object in the global stash (the slot is dropped on release) -- and it participates in the sec-10 cross-thread routing, so a JS closure captured on its context's thread is invoked and released from another thread correctly. `src/js/jsEngine.*` is the EngineT binding.
`testJs` (single-threaded: scalar/string/array/object marshalling, export + invoke-from-C, closure-as-arg callback, error paths) and `testEngineJs` (threaded: cross-context call + the sec-10 callback) -- clean under ASan/UBSan and TSan (`make tsanjs`). Adding the engine touched zero lines of the broker core or actor layer (one adapter TU + one engine-binding TU + Makefile rules), re-confirming the O(1)-engine-add thesis. v1 limit (as with Squirrel): pushing a foreign `CallableT` *into* JS is unsupported.

**Source layout.** `src/` holds the project source (core + actor directly in `src/`, one subdir per script language: `src/lua`, `src/mybasic`, `src/squirrel`, `src/js`); `tests/` holds the test programs; `obj/` collects every object file (ours and the vendored engines, via `patsubst` into `obj/`); `bin/` collects the binaries. The Makefile finds our sources by `VPATH` and groups object rules by flag set; `-MMD -MP` generate header dependencies automatically. `make clean` removes `obj/` and `bin/`. `make test` builds and runs all ten binaries; `make tsan`/`make tsansq`/`make tsanjs` are the ThreadSanitizer variants.

## 15. Public embedding API (`calog.h`) -- as-built

calog is packaged as an embedding library: a host links it, registers its own native C functions, creates script contexts on an engine, and runs scripts. Every public symbol carries a `calog` prefix (types `Calog...T`, enums `Calog...E`) so the library is a good citizen in a host binary. The API was curated to the minimum:

- **One handle, one header.** `CalogT` is the runtime; `calogCreate()` composes the registry with the actor layer (installs the routing hooks) and `calogDestroy()` tears both down -- no separate init/shutdown for the host.
The entire embedding surface is `src/calog.h` (~30 functions); internal machinery (the registry entry type, the route/invoke/release hooks, the low-level callable lifecycle, the split `calogBrokerCreate`/`calogActorInit`) lives in `src/calogInternal.h`, which host code never includes. `calog.h` leaks no internal symbol.
- **One config type.** The three per-engine configs collapsed into `CalogConfigT` (`exposeNames` + `exposeCount`), used by every built-in engine vtable.
- **Value model unchanged, just prefixed.** `CalogValueT`/`CalogAggT`/`CalogFnT` + constructors (`calogValueInt`, ...), ops (`calogValueCopy`/`Free`/`Move`/`Equals`), aggregates (`calogAgg*`), function values (`calogFnInvoke`/`Retain`/`Release`), and `calogFail`/`calogTypeName` for writing natives. Contexts: `calogContextCreate`/`Start`/`Eval`/`Destroy`/`Id`, plus `calogCurrentId`/`calogCurrent` for natives. The built-in engine vtables are `calogLuaEngine`, `calogJsEngine`, `calogSquirrelEngine` (a host may also supply a custom `CalogEngineT`).
- **Packaging.** `make` builds `lib/libcalog.a` (calog itself: core + actor + every adapter/binding) and separate vendored-engine archives (`liblua.a`, `libduktape.a`, `libsquirrel.a`, `libmybasic.a`). A host links `libcalog.a` plus whichever engine archives it uses; unused adapters (and their engine deps) stay unlinked because static members are pulled only when referenced -- so a JS-only host never links Lua/Squirrel. The tests consume the archives; the threaded/engine tests use only `calog.h`, validating that the public surface is complete. `examples/embed.c` is a ~30-line host (public header only) that registers a native and calls it from JavaScript.
- **Reconfirmed:** rename + restructure kept all 441 checks passing across 10 test binaries, clean under ASan/UBSan and TSan (all four engines), gcc + clang strict.

## 14. Implementation notes (as-built: actor layer, engine-on-a-thread, Squirrel)

The actor layer (`context.h`/`context.c`, build step 4) is built and tested: `testActor` exercises cross-context routing, the always-live nested pump (the re-entrant A->B->A deadlock test), and a concurrent fan-out stress; clean under ASan+UBSan and ThreadSanitizer (`make tsan`, run under `setarch -R` -- some kernels hand out more mmap randomization than TSan's shadow tolerates). One thread + one MPSC queue per `ScriptContextT`; `brokerCall` routes through an installed hook (`brokerSetRouteHook`) so owner-0/same-context calls run inline and others marshal to the owning thread; an external caller blocks on a private reply box, a context caller pumps. The reply carries only `{status, result}` -- the single error channel (sec 8) is structural, the error string rides in `result`. `contextSendBlocking` and `contextReply` are the shared enqueue-wait and reply tails behind both CALL and EVAL dispatch.

**Generationed registry (sec 9), DONE.** Context ids pack a 16-bit slot index and a 16-bit generation; the registry is a slot table plus a freelist. `contextDestroy` unlinks a context under the registry lock (after stopping+joining its thread) and returns the slot to the freelist; the next reuse bumps the generation. A stale id (slot since freed/recycled) resolves to `brokerErrDeadE`, never misroutes to the recycler -- `testActor`'s generation test proves it. The registry lock is held across enqueue, so a foreign enqueue cannot race a destroy onto a freed queue mutex. Still quiescence-assuming (no call to the context in flight at teardown); in-flight reference draining is the remaining sec 9 hardening.

**Engine on a thread (the EngineT vtable).** `EngineT` gained `runSource`; `contextEval(context, source, result)` marshals a script run onto the context's own thread (a new `messageEvalE`) and blocks like a call.
Each adapter's engine binding lives in its own TU (`luaEngine.*`, `squirrelEngine.*`) -- the only Lua/Squirrel files that depend on the threading layer, keeping the adapters thread-agnostic so `testLua`/`testPolyglot` link them without `context.o`/pthread. `createInterpreter` runs on the thread and exposes the configured natives there. Crucially, the exposed-native trampolines now dispatch through `brokerCall` (by broker+name) instead of a captured fn pointer, so an exposed native owned by another context is transparently routed to its thread -- the script author still writes `doubleIt(21)`. With no route hook installed this is identical to the old inline path (`testLua` still passes). `testEngineLua` proves a real Lua interpreter on a context thread calling a thread-agnostic native and a cross-context native, on the correct threads.

**Squirrel adapter (sec 11 step 6), the O(1)-engine-add validation.** Vendored Squirrel 3.2 in `vendor/squirrel-src` (C++), built relaxed/un-sanitized with `-D_SQ64 -DSQUSEDOUBLE` so `SQInteger`/`SQFloat` are 64-bit int / double matching `ValueT` -- the adapter shares those defines so the ABI matches. `squirrelAdapter.*` mirrors the Lua adapter: one shared trampoline recovers its binding from the closure's single free variable (which the VM pushes onto the stack *after* the args, so it sits at the top -- verified in `sqvm.cpp` `CallNative`), marshals scalars, binary-safe strings, and the hybrid aggregate (array<->list, table<->map) with the shared depth cap, and dispatches through `brokerCall`. `testEngineSquirrel` runs a real VM on a thread doing the cross-context call plus string and array round-trips; clean under ASan+UBSan and TSan (`make tsansq`). The total surface a new engine added: one adapter TU + one engine-binding TU + Makefile rules -- no change to the broker core or the actor layer, which is the thesis.
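
The generationed registry described earlier in this section (16-bit slot index plus 16-bit generation, slot table plus freelist) can be sketched roughly as below. This is a minimal single-threaded illustration, not the as-built `context.c`: every name in it (`RegistryT`, `registryAcquire`, `registryResolve`, ...) is hypothetical, the bit order inside the id is an assumption, the registry lock is omitted, and a `-1` return stands in for the `brokerErrDeadE` result the text mentions.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical names throughout. Only the 16-bit slot / 16-bit generation
// split and the slot-table-plus-freelist shape come from the text.
enum { registrySlotCount = 4 };

typedef struct SlotT {
    uint16_t generation;  // bumped each time the slot is (re)used
    int      live;        // nonzero while a context occupies the slot
    int32_t  nextFree;    // freelist link; -1 terminates the list
} SlotT;

typedef struct RegistryT {
    SlotT   slots[registrySlotCount];
    int32_t freeHead;     // first free slot, or -1 when full
} RegistryT;

// Assumed layout: generation in the high 16 bits, slot in the low 16.
static uint32_t idPack(uint16_t slot, uint16_t generation) {
    return ((uint32_t)generation << 16) | (uint32_t)slot;
}
static uint16_t idSlot(uint32_t id)       { return (uint16_t)(id & 0xFFFFu); }
static uint16_t idGeneration(uint32_t id) { return (uint16_t)(id >> 16); }

static void registryInit(RegistryT *r) {
    for (int32_t i = 0; i < registrySlotCount; i++) {
        r->slots[i].generation = 0;
        r->slots[i].live       = 0;
        r->slots[i].nextFree   = (i + 1 < registrySlotCount) ? i + 1 : -1;
    }
    r->freeHead = 0;
}

// Takes a slot off the freelist and bumps its generation.
// Returns 0 (never a valid id in this sketch) when the table is full.
static uint32_t registryAcquire(RegistryT *r) {
    if (r->freeHead < 0) {
        return 0;
    }
    int32_t index = r->freeHead;
    assert(index < registrySlotCount);
    SlotT *slot = &r->slots[index];
    r->freeHead = slot->nextFree;
    slot->generation = (uint16_t)(slot->generation + 1);
    if (slot->generation == 0) {
        slot->generation = 1;  // skip 0 on wraparound so id 0 stays invalid
    }
    slot->live = 1;
    return idPack((uint16_t)index, slot->generation);
}

// Returns the slot index for a live id, or -1 for a stale or invalid id.
static int32_t registryResolve(const RegistryT *r, uint32_t id) {
    uint16_t index = idSlot(id);
    if (index >= registrySlotCount) {
        return -1;
    }
    const SlotT *slot = &r->slots[index];
    if (!slot->live || slot->generation != idGeneration(id)) {
        return -1;
    }
    return (int32_t)index;
}

// Returns the slot to the freelist; a stale id is ignored.
static void registryRelease(RegistryT *r, uint32_t id) {
    int32_t index = registryResolve(r, id);
    if (index < 0) {
        return;
    }
    r->slots[index].live     = 0;
    r->slots[index].nextFree = r->freeHead;
    r->freeHead              = index;
}
```

The point of the generation check is that an id held past its context's destruction fails to resolve even after the slot has been recycled, instead of silently addressing the new occupant.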

**Squirrel closure export, DONE.** A Squirrel closure crossing the boundary now becomes a refcounted `CallableT` over a pinned `HSQOBJECT` (`sq_addref`/`sq_release`, mirroring Lua's `luaL_ref` lifecycle): `squirrelExport` fetches a named global closure, and a closure passed as a native argument is exported the same way during ingress (the VM's foreign pointer -- finally used -- recovers the owning context). `squirrelCallableInvoke` runs `sq_pushobject`+`sq_call` on the owner's VM and marshals the return; `squirrelCallableRelease` `sq_release`s on the owner thread. Single-threaded `testSquirrel` covers export+invoke-from-C, a closure passed as an argument and called back through the broker, and the not-found/type-error paths; ASan-clean (no addref/release leak). Caveat (same as Lua): release exported callables before `squirrelContextDestroy`. Remaining limit: the reverse direction (a foreign `CallableT` pushed INTO Squirrel so a script can call it) returns `brokerErrUnsupportedE` -- Squirrel has no clean callable-userdata-with-`__gc` like Lua, so it needs a class instance with a `_call` metamethod + release hook.

`make test` runs all seven binaries (411 checks). `make tsan` covers the actor core and the Lua engine path; `make tsansq` the Squirrel path.

**Adversarial review (3 parallel reviewers: actor concurrency, Squirrel adapter, Lua trampoline + engine bindings).** Two real defects found and fixed:

- *NULL-interpreter crash.* `threadMain` ignores `createInterpreter`'s status, so a failed create (e.g. a config expose-name that was never registered) left a context serving with `interp == NULL`; `contextDispatchEval` only checked `runSource != NULL`, so the first eval called `runSource(NULL,...)` -> NULL deref. Fixed by guarding `interp == NULL` (the context still serves native calls, just rejects evals); regression test in `testEngineLua` (`testFailedInterpreter`).
- *OOM lost-wakeup.* `contextReply`'s context-caller branch allocated a fresh REPLY and, on `calloc` failure, dropped the wakeup -- the caller hung in `pumpUntil` forever. Fixed by reusing the request message as the reply (it already carries the token and replyToId), which removes the allocation entirely, so the wakeup can no longer be lost to OOM.

The Squirrel adapter was traced clean against the real Squirrel source (trampoline free-var indexing, stack balance, ValueT ownership, the throwerror/free order, binary strings); added `sq_reservestack` guards before the recursive marshallers to match the Lua adapter's `lua_checkstack` discipline.

Documented (not changed): the registry must be frozen before contexts start (`brokerCall` reads it locklessly from context threads -- noted in `broker.h`); teardown still assumes quiescence (sec 9); and the hybrid-aggregate-to-Squirrel-table egress flattens array indices and integer keys into one table (same lossy edge as elsewhere in the fidelity table).
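
The OOM lost-wakeup fix above (reuse the request message as its own reply) can be illustrated with a small single-threaded sketch. All names here (`MessageT`, `InboxT`, `replyInPlace`) are hypothetical stand-ins, the `int64_t` result replaces the real `ValueT`, and the queue is a plain linked list rather than the MPSC queue the actor layer uses; the only idea taken from the text is that the request already carries the token and `replyToId`, so replying needs no allocation and therefore has no failure path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical message shape; the real layout is not shown in the text.
typedef enum MessageKindE {
    messageCallE  = 0,
    messageReplyE = 1
} MessageKindE;

typedef struct MessageT {
    MessageKindE     kind;
    uint32_t         token;      // pairs a reply with its pending call
    uint32_t         replyToId;  // id of the context waiting on the reply
    int32_t          status;
    int64_t          result;     // stand-in for the ValueT result
    struct MessageT *next;
} MessageT;

// Stand-in for the caller's queue (no locking in this sketch).
typedef struct InboxT {
    MessageT *head;
} InboxT;

static void inboxPush(InboxT *inbox, MessageT *message) {
    message->next = inbox->head;
    inbox->head   = message;
}

// Converts the request into the reply in place and posts it back.
// No allocation happens here, so the wakeup cannot be dropped on OOM.
static void replyInPlace(InboxT *callerInbox, MessageT *request,
                         int32_t status, int64_t result) {
    request->kind   = messageReplyE;
    request->status = status;
    request->result = result;
    // token and replyToId are left untouched: the caller matches on them.
    inboxPush(callerInbox, request);
}
```

Ownership of the message simply passes back to the caller along with the reply, which is what makes the in-place conversion safe in this sketch.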