1006 lines
62 KiB
Markdown
1006 lines
62 KiB
Markdown
# calog -- Polyglot Script Broker: Design & Implementation Plan
|
|
|
|
A C "broker" that lets one application be written in a mix of scripting languages
|
|
(Lua and my-basic first; Squirrel and others later). Native C functions are added
|
|
once and become callable from every language. Functions and data exported from one
|
|
module are callable from modules written in another language. Threading is actor-based;
|
|
networking rides the same dispatcher. Data sharing is by-value for v1.
|
|
|
|
This document is the reconciled output of a design pass plus an adversarial verification
|
|
pass. Where the verification corrected the first-cut design, the correction is folded in
|
|
and noted as "[verified]".
|
|
|
|
---
|
|
|
|
## 1. Architecture: hub and spoke
|
|
|
|
Nothing talks to anything else directly. Every engine talks only to the broker, through
|
|
two shared contracts:
|
|
|
|
1. One universal value type, `ValueT` (a tagged union).
|
|
2. One uniform native-function signature:
|
|
|
|
```c
|
|
typedef int32_t (*NativeFnT)(ValueT *args, int32_t argCount, ValueT *result, void *userData);
|
|
```
|
|
|
|
A developer writes a native function once against that signature and registers it once.
|
|
A *script* function exported from a module is itself stored as a `NativeFnT` whose body
|
|
re-enters its owning interpreter -- so "call C from script", "call script from C", and
|
|
"call module A's function from module B" are all the same code path. Adding an engine is
|
|
O(1) adapter work, not O(N) per existing engine.
|
|
|
|
The single most important lesson from the verification pass: **there must be exactly one
|
|
`ValueT` / `AggregateT` / `ValueTypeE`, defined in `broker.h`, included verbatim by every
|
|
adapter.** The first-cut design had three divergent copies and would not have linked, let
|
|
alone round-tripped data. Section 2 is therefore the load-bearing part of this plan.
|
|
|
|
---
|
|
|
|
## 2. The canonical type system (`broker.h`) -- single source of truth
|
|
|
|
### 2.1 Value tags and the value struct
|
|
|
|
```c
|
|
typedef enum ValueTypeE {
|
|
valueNilE = 0,
|
|
valueBoolE = 1,
|
|
valueIntE = 2, // int64_t
|
|
valueRealE = 3, // double
|
|
valueStringE = 4, // length-prefixed, binary-safe
|
|
valueAggregateE = 5, // hybrid array + map container
|
|
valueFnE = 6 // function value: refcounted handle to a CallableT
|
|
} ValueTypeE;
|
|
|
|
|
|
typedef struct StringT {
|
|
char *bytes; // owned; always NUL-terminated at bytes[length] for C consumers
|
|
int64_t length; // byte count excluding the convenience terminator (binary-safe)
|
|
} StringT;
|
|
|
|
|
|
typedef struct ValueT {
|
|
ValueTypeE type;
|
|
union {
|
|
bool b;
|
|
int64_t i;
|
|
double r;
|
|
StringT s;
|
|
struct AggregateT *agg; // heap-owned subtree
|
|
struct CallableT *callable; // refcounted, broker-owned
|
|
} as;
|
|
} ValueT;
|
|
```
|
|
|
|
### 2.2 The aggregate (one shape both engines map onto)
|
|
|
|
```c
|
|
typedef struct PairT {
|
|
ValueT key; // marshal layer constrains to int/real/string keys (see 2.5)
|
|
ValueT value;
|
|
} PairT;
|
|
|
|
|
|
typedef enum AggregateKindE {
|
|
aggregateListE = 0, // empty container round-trips as a list by default
|
|
aggregateMapE = 1,
|
|
aggregateBothE = 2 // array part AND pairs part populated
|
|
} AggregateKindE;
|
|
|
|
|
|
typedef struct AggregateT {
|
|
AggregateKindE kind; // disambiguates empty/mixed containers across engines
|
|
ValueT *array; // dense elements [0, arrayCount)
|
|
int64_t arrayCount;
|
|
int64_t arrayCap;
|
|
PairT *pairs; // map part; preserves insertion order
|
|
int64_t pairCount;
|
|
int64_t pairCap;
|
|
} AggregateT;
|
|
```
|
|
|
|
A Lua table's sequence part maps to `array`, its remaining keys to `pairs`. A my-basic
|
|
`LIST` maps to `array`, a `DICT` maps to `pairs`. The explicit `kind` flag fixes two
|
|
problems the verifier flagged: an empty `{}` had no defined type on the far side, and a
|
|
mixed array+hash Lua table had no representation at all.
|
|
|
|
### 2.3 Function values: `CallableT`
|
|
|
|
`valueFnE` is the one deliberate exception to "by-value everything". A function cannot be
|
|
meaningfully copied between heaps, so it is shared by *reference* -- but safely, because it
|
|
is only ever *invoked*, never inspected, and invocation always routes to the owning
|
|
context's thread.
|
|
|
|
```c
|
|
typedef struct CallableT {
|
|
NativeFnT fn; // uniform invoke entry
|
|
void *userData; // luaL_ref slot, pinned mb_value_t, or C closure ctx
|
|
uint32_t ownerCtxId; // context whose thread MUST run it
|
|
uint32_t ownerGen; // generation of that context (UAF guard, see sec 9)
|
|
int32_t refCount; // ATOMIC; shared-handle lifetime across threads
|
|
bool alive; // false once the owning context is torn down
|
|
} CallableT;
|
|
```
|
|
|
|
Rules (these resolve the verifier's critical "function value breaks the no-shared-pointer
|
|
invariant" finding):
|
|
|
|
- `valueCopy` of a `valueFnE` does an **atomic refcount increment** on the same
|
|
`CallableT` -- it does NOT clone the closure. The shared `CallableT*` across threads is
|
|
allowed precisely because `refCount` is atomic and the closure is touched only on its
|
|
owner thread.
|
|
- `valueFree` of a `valueFnE` does an **atomic decrement**. When it hits zero, the
|
|
underlying closure must be released with an interpreter call (`luaL_unref`, my-basic
|
|
unref) -- which can only run on the owner's thread. So a zero-drop on a foreign thread
|
|
**posts a release message to the owner context** rather than calling the interpreter
|
|
directly (sec 10).
|
|
- Invoking a dead handle (`alive == false`, owner gone) returns a clean broker error,
|
|
never a call into a freed interpreter.
|
|
|
|
### 2.4 Value operation contract (one signature, used everywhere)
|
|
|
|
The verifier caught two designs declaring `valueCopy` with incompatible signatures. The
|
|
canonical form -- status-returning, dst-by-pointer, so OOM is checkable on the hot path:
|
|
|
|
```c
|
|
int32_t valueCopy(ValueT *dst, const ValueT *src); // deep copy; brokerOkE / brokerErrOomE
|
|
void valueFree(ValueT *v); // recursive free; leaves a safe nil
|
|
void valueMove(ValueT *dst, ValueT *src); // zero-alloc ownership transfer
|
|
```
|
|
|
|
`valueFree` and `valueCopy` MUST have a `default:`/explicit case for every tag including
|
|
`valueFnE` -- a missing case was how the first cut silently leaked function payloads.
|
|
|
|
### 2.5 Cross-engine fidelity table (the honest limits)
|
|
|
|
By-value marshalling across `Lua <-> broker <-> my-basic` is lossless for the common cases
|
|
and lossy at documented edges. These are inherent to my-basic's value model (verified
|
|
against `my_basic.h`/`.c`), not marshalling bugs:
|
|
|
|
| Aspect | Lua | my-basic | Crossing Lua <-> BASIC |
|
|
|-----------------------|------------------------|-----------------------------------|-------------------------------------------------|
|
|
| integers | 64-bit | `int_t == int` (32-bit, always) | **truncates above 2^31** -- range-check + error |
|
|
| reals | double | `float` (double w/ `-DMB_DOUBLE_FLOAT`) | precision loss unless double build |
|
|
| int vs real subtype | distinct (5.4) | integral real auto-collapses to int | **subtype not preserved** through BASIC |
|
|
| strings | binary-safe (length) | bare `char*` (`strlen`) | **embedded NUL truncates** -- detect + error |
|
|
| array / list | table sequence part | `LIST` | OK |
|
|
| map / dict | table hash part | `DICT` (int/real/string keys) | OK |
|
|
| mixed array+hash table| one table | no single LIST+DICT value | **collapses to DICT** (array part -> int keys), documented |
|
|
| empty `{}` | table | LIST or DICT? | `kind` flag; default LIST |
|
|
| function value | closure (`luaL_ref`) | lambda/routine (`MB_DT_ROUTINE`) | OK via `CallableT` (by-reference) |
|
|
| nested depth | bounded | bounded | one shared cap; defined error past it |
|
|
| cycles | rejected on ingress | rejected on ingress | impossible at rest (by-value, no shared refs) |
|
|
|
|
Policy decisions baked in to make the above deterministic:
|
|
|
|
- **One shared recursion-depth cap** applied on *every* recursive path -- both ingress and
|
|
egress, all components -- failing with a defined status instead of overflowing the C
|
|
stack. (The first cut bounded only Lua ingress.)
|
|
- **One strict/lenient switch**, owned by the broker and honored by both adapters: in
|
|
strict mode an unrepresentable value/key (e.g. a function used as a table key, a 64-bit
|
|
int into BASIC, a NUL-bearing string into BASIC) is an error; in lenient mode it is
|
|
dropped/coerced with a documented rule. Never a silent surprise either way.
|
|
- **Keys**: broker allows int/real/string keys (my-basic dicts accept all three). Float
|
|
keys get defined equality; non-representable keys follow the strict/lenient switch.
|
|
|
|
---
|
|
|
|
## 3. The engine vtable -- what makes the actor loop engine-agnostic
|
|
|
|
Each adapter implements one small vtable so the broker/actor core never special-cases an
|
|
engine:
|
|
|
|
```c
|
|
typedef struct EngineT {
|
|
const char *name;
|
|
void *(*createInterpreter)(struct ScriptContextT *ctx); // ON the owning thread
|
|
void (*destroyInterpreter)(void *interp);
|
|
int32_t (*loadSource)(void *interp, const char *src, int64_t len);
|
|
int32_t (*registerNative)(void *interp, const char *name, NativeFnT fn, void *userData);
|
|
int32_t (*callExport)(void *interp, void *exportRef, ValueT *args, int32_t argCount, ValueT *result);
|
|
void (*releaseExport)(void *interp, void *exportRef); // ON the owning thread
|
|
} EngineT;
|
|
```
|
|
|
|
`createInterpreter`, `destroyInterpreter`, `releaseExport`, and every `callExport` run on
|
|
the context's own thread -- that is the invariant that keeps each interpreter
|
|
single-threaded.
|
|
|
|
---
|
|
|
|
## 4. The Lua adapter
|
|
|
|
Target Lua 5.4 (note 5.1/5.3 deltas where they matter). The C API surface was verified
|
|
accurate; the fixes below are about lifecycle, not API names.
|
|
|
|
### 4.1 Native registration and the trampoline
|
|
|
|
Each registered `NativeFnT` becomes a Lua C closure: push the `NativeFnT` and its
|
|
`userData` as upvalues with `lua_pushcclosure`, then `lua_setglobal` (or into a module
|
|
table). The single trampoline recovers them from upvalues, marshals the Lua stack into a
|
|
`ValueT[]`, calls the `NativeFnT`, and pushes the result back.
|
|
|
|
- `lua_setglobal` returns `void` (the first cut documented `int` -- harmless but wrong).
|
|
- Lua allocation APIs (`lua_newuserdatauv`, `lua_createtable`, ...) **longjmp on OOM and
|
|
never return NULL** -- so NULL-checks after them are dead code; the OOM path is a Lua
|
|
error, not a C return. Only `luaL_newstate` can return NULL and must be checked.
|
|
|
|
### 4.2 Marshalling `ValueT <-> Lua` (by value)
|
|
|
|
Scalars are direct. Strings use `lua_tolstring` + length (binary-safe, preserves NULs).
|
|
Tables deep-copy in both directions:
|
|
|
|
- Ingress (table -> `AggregateT`): normalize the table index to absolute before `lua_next`;
|
|
route numeric keys through `lua_tointeger`/`lua_tonumber` (do **not** let `lua_tolstring`
|
|
mutate a numeric key in place); fill `array` for the sequence part and `pairs` for the
|
|
rest; detect cycles via an ancestor-pointer stack (`lua_topointer`); use `lua_rawset`/
|
|
`lua_rawget` to avoid metamethods; enforce the shared depth cap.
|
|
- Egress (`AggregateT` -> table): build with `lua_createtable`, populate `array` as the
|
|
sequence and `pairs` as keyed entries, **with the same depth cap** (the first cut had no
|
|
egress cap -- a deep BASIC-origin structure could overflow the stack on the way back).
|
|
Make the builder self-balancing: record `lua_gettop` on entry and `lua_settop` back on
|
|
any error return.
|
|
|
|
### 4.3 Exporting a Lua function (and the leak fix)
|
|
|
|
A Lua function crossing the boundary becomes a `valueFnE`: pin the closure with
|
|
`luaL_ref(L, LUA_REGISTRYINDEX)`, wrap it in a `CallableT`. The `fn` body does
|
|
`lua_rawgeti` to retrieve, marshals args onto the stack, `lua_pcall`, marshals the return;
|
|
on error it pulls the message with `luaL_tolstring` and reports it as a broker error
|
|
(sec 8).
|
|
|
|
Two bugs the verifier found, fixed here:
|
|
|
|
- **The exports array needs grow-on-demand.** The context is `calloc`'d, so the array
|
|
starts NULL/0; the first export must `realloc` (double the cap, handle the NULL/0 seed)
|
|
before storing, and `luaL_unref` the just-created ref if the realloc fails.
|
|
- **Transient vs persistent ownership.** Every Lua function passed as a *native argument*
|
|
was creating a `luaL_ref` that only got released at `lua_close` -- an unbounded leak for
|
|
any long-lived context that takes callbacks. Fix: the `CallableT` refcount owns the
|
|
`luaL_ref`. When the last `valueFree` drops the handle to zero, the ref is released (on
|
|
the Lua thread per sec 10). A function the broker retains as a real export holds a
|
|
reference for as long as it is registered; a function merely borrowed for the duration
|
|
of one call is released when that call's `ValueT` args are freed. Same mechanism, two
|
|
lifetimes.
|
|
|
|
### 4.4 Calling a function value from Lua
|
|
|
|
A `valueFnE` marshalled *into* Lua becomes a Lua C closure over the `CallableT*`, so script
|
|
authors just write `cb(x, y)` and it transparently dispatches (to the owner's thread if the
|
|
function lives elsewhere). A universal `call(fn, ...)` native is also provided for
|
|
uniformity across engines.
|
|
|
|
### 4.5 Context lifecycle
|
|
|
|
`luaL_newstate` + `luaL_openlibs` on the owning thread; confine the `lua_State` to that
|
|
thread forever; teardown `luaL_unref`s outstanding refs then `lua_close`.
|
|
|
|
---
|
|
|
|
## 5. The my-basic adapter
|
|
|
|
Verified against `paladin-t/my_basic` (`my_basic.h` + `.c` read directly). **Zero
|
|
hallucinated calls.** The interesting work is three my-basic-specific quirks that shape the
|
|
adapter; all three are forced by the source, not stylistic.
|
|
|
|
### 5.1 Lifecycle and the inverted register result
|
|
|
|
`mb_init` (once per process) / `mb_open(&bas)` / `mb_load_string(bas, src, true)` /
|
|
`mb_run` / `mb_close` / `mb_dispose`. The broker pointer is threaded through the interp's
|
|
single userdata slot via `mb_set_userdata` / `mb_get_userdata`.
|
|
|
|
- **`mb_register_func` returns a count, not a status** -- nonzero means registered, `0`
|
|
means duplicate/failure. That is the opposite of the `MB_FUNC_OK == 0` convention, so the
|
|
success test must be inverted. Names are uppercased internally (`mb_strupr`), so the
|
|
broker key is the uppercased identifier (BASIC is case-insensitive).
|
|
|
|
### 5.2 The native-function protocol and the trampoline bank
|
|
|
|
The native signature is `typedef int (*mb_func_t)(struct mb_interpreter_t*, void**);` --
|
|
**no per-callback userData parameter**, and the interpreter has only **one** userdata slot.
|
|
So a single shared C trampoline cannot tell *which* broker function it is serving.
|
|
|
|
Fix (the verifier confirmed this limitation is real): a **macro-generated bank of
|
|
trampolines** `mbTramp0 .. mbTrampN`, each hardcoding its slot index, each looking up
|
|
`ctx->nativeBank[slot]` (the `NativeFnT` + `userData`) via the interpreter's userdata
|
|
pointer. The bank size caps how many natives one my-basic context can host; size it
|
|
generously and document it.
|
|
|
|
Inside a trampoline the argument protocol is the real my-basic frame dance:
|
|
`mb_attempt_open_bracket` / loop `mb_pop_value` (honoring `mb_has_arg`) /
|
|
`mb_attempt_close_bracket` / compute / `mb_push_value` (or the typed `mb_push_*`).
|
|
|
|
### 5.3 String ownership (memdup is mandatory)
|
|
|
|
`mb_pop_string` hands back a **borrowed interior pointer** -- the broker must `strdup`/copy
|
|
it immediately. Pushed strings are taken over by the interpreter and later freed with *its*
|
|
allocator, so any string handed to `mb_push_value`/`mb_make_string` **must come from
|
|
`mb_memdup`** (not plain `malloc`). Embedded NULs cannot survive (bare `char*` + `strlen`)
|
|
-- enforce the strict/lenient policy on egress.
|
|
|
|
### 5.4 Aggregates: the collection API
|
|
|
|
There is no `mb_make_coll`. A list/dict is built by presetting `coll->type =
|
|
MB_DT_LIST`/`MB_DT_DICT` then calling `mb_init_coll`, and accessed with
|
|
`mb_get_coll` / `mb_set_coll` / `mb_remove_coll` / `mb_count_coll` / `mb_keys_of_coll`.
|
|
Collection support is on by default (`MB_ENABLE_COLLECTION_LIB`). A broker aggregate with
|
|
both `array` and `pairs` populated collapses to a DICT (array part becomes integer keys)
|
|
per the fidelity table.
|
|
|
|
### 5.5 Exporting a BASIC routine -- the parked `__BROKERSERVE` frame
|
|
|
|
This is the my-basic-specific crux. To call a BASIC routine/lambda from C you use
|
|
`mb_get_routine(s, l, name, &val)` then `mb_eval_routine(s, l, val, args, argc, &ret)` --
|
|
and **`mb_eval_routine` dereferences `*l` and hard-requires a live, non-NULL `void** l`**
|
|
(verified at `my_basic.c:14344/14358`). A valid `l` only exists *inside* a running native
|
|
call. Therefore a my-basic context cannot be driven from arbitrary C; it must be **parked
|
|
inside a native frame**.
|
|
|
|
Design: register a native `__BROKERSERVE` whose C body *is* the context's message-pump /
|
|
serve loop. A module hands control to the broker by ending with a `SERVE` call (the adapter
|
|
appends one if absent). While parked there, the loop holds a valid `l`, which it uses to
|
|
`mb_eval_routine` whenever another context calls one of this module's exported routines.
|
|
`mb_get_routine` returns `MB_FUNC_OK` with a *nil* value when a name is absent, so the
|
|
not-found test is `routine.type != MB_DT_ROUTINE`, not the status code.
|
|
|
|
### 5.6 Numeric and identity caveats
|
|
|
|
`int_t` is 32-bit unconditionally (64-bit broker ints truncate -- range-check + error or
|
|
promote to real with documented precision loss); integral reals auto-collapse to int so
|
|
real/int subtype is not preserved across a BASIC hop. Both are in the fidelity table; both
|
|
follow the strict/lenient switch.
|
|
|
|
---
|
|
|
|
## 6. Threading: the actor model
|
|
|
|
Each `ScriptContextT` owns one interpreter, one OS thread (pthreads -- chosen over C11
|
|
`<threads.h>` for portability/maturity), and one inbound MPSC message queue. Interpreters
|
|
are single-threaded; only the owning thread ever enters `callExport`. A cross-context call
|
|
is a message; the caller blocks for the reply on a per-call condvar future (lost-wakeup
|
|
safe via a predicate loop). The verifier confirmed the core is sound: no path lets two
|
|
threads touch one interpreter, the deep-copy ownership ledger is correct on the success
|
|
path, and the epoll thread enqueuing while a context is mid-dispatch is race-free.
|
|
|
|
The fixes folded in from verification:
|
|
|
|
- **One error channel.** The first cut carried a separate error `ValueT` in the reply that
|
|
the caller never freed -- a leak on every errored call, lost error text, and a second
|
|
source of truth contradicting the broker's "error travels in `result`" contract. Fix:
|
|
on failure the adapter writes the error string into `result` (as `brokerSetError` does);
|
|
the reply carries only `{status, result}`. One channel, one owner, freed once.
|
|
- **`valueCopy` checked on enqueue.** Use the canonical `int32_t valueCopy(dst, src)`,
|
|
check each arg, and unwind partially-copied args on OOM (mirroring the broker route
|
|
path). The actor layer should call the broker's marshalling, not reimplement a copy loop.
|
|
- **Shutdown drains everything.** On `SHUTDOWN`, error-reply every queued `CALL` *and*
|
|
free every queued/stashed `REPLY` (result + error) before join -- the first cut leaked
|
|
in-flight replies unwound by a nested shutdown.
|
|
- **Explicit thread stack size.** The reentrancy depth bound counts dispatch nesting, not
|
|
C-stack bytes; set a validated stack size with `pthread_attr_setstacksize` (or lower the
|
|
bound) so the "clean catchable depth error" promise actually holds instead of a UB
|
|
overflow.
|
|
- **Split the ready-handshake condvar** off the queue condvar so `queueCond` has exactly
|
|
one semantic (latent lost-wakeup footgun if a second waiter is ever added).
|
|
|
|
### 6.1 What a context does while blocked: always-live nested pump [DECIDED]
|
|
|
|
When context A makes a synchronous cross-context call and waits for the reply, A's thread
|
|
**pumps its own inbound queue** instead of sleeping idle. An incoming call to A -- including
|
|
a re-entrant B->A issued during the very call A is waiting on -- is serviced on A's own
|
|
thread, then A resumes waiting. This was chosen over strict run-to-completion because it
|
|
never deadlocks and needs no wait-for-graph deadlock detector; the rejected alternative
|
|
would have had to raise a "synchronous call cycle" error on A->B->A and would leave a busy
|
|
context unresponsive to other callers. The verifier validated the pump as sound: only A's
|
|
thread ever enters A's interpreter (the single-threaded invariant holds), reply nesting is
|
|
strict LIFO, and depth is bounded.
|
|
|
|
The contract this commits the runtime and script authors to:
|
|
|
|
- **Re-entry happens only at explicit cross-context call points** (`x = getUserInfo()`,
|
|
`data = sockRecv(c)`), never mid-statement -- a call point is a yield point.
|
|
- **Module-global state may differ after a cross-context call returns**, because another
|
|
call may have run on this context while it was outstanding (the same contract as any RPC).
|
|
Local variables are unaffected.
|
|
- **Reentrancy is depth-bounded** with a catchable error (backed by an explicit pthread
|
|
stack size, sec 6 fixes), so runaway ping-pong fails cleanly instead of overflowing.
|
|
|
|
Script code stays plain synchronous-blocking regardless -- `info = getUserInfo()` just
|
|
works; this only governs what the runtime does while a call is outstanding.
|
|
|
|
---
|
|
|
|
## 7. Networking and the dispatcher
|
|
|
|
One dedicated I/O thread runs `epoll` (Linux; `kqueue`/`poll` for portability) and owns no
|
|
interpreter. Async socket primitives (`sockConnect`, `sockListen`, `sockSend`, `sockRecv`,
|
|
`sockClose`, plus a timer) are registered once through the broker, so every language gets
|
|
them. The recommended v1 model is **synchronous-blocking at the script level**: `data =
|
|
sockRecv(conn)` parks the calling context on a reply future; when epoll reports readiness,
|
|
the I/O thread builds a `CALL`/reply and enqueues it onto the *owning* context's queue, so
|
|
the result lands on the right interpreter thread. **Callbacks are opt-in on top**: pass a
|
|
`valueFnE` (e.g. `onConnect(myFunc)`) and the completion invokes it via the same dispatch,
|
|
always back on its home thread. No separate async keyword, no per-engine coroutine support
|
|
needed.
|
|
|
|
Fixes from verification:
|
|
|
|
- The I/O command queue must be **strict tail-append FIFO** (a `sockSend` issued right
|
|
after `sockConnect` must be processed after the connect that registered the handle);
|
|
assert it.
|
|
- The resolver/connect path must **deep-copy `host`** before the command is freed (the
|
|
first cut had an unconditional `free(cmd->host)` that would UAF if stored by pointer).
|
|
- Wake the epoll thread for new interest via `eventfd`/self-pipe; deregister on close;
|
|
drain `eventfd` to `EAGAIN`.
|
|
- Portability note: `pthread_condattr_setclock(CLOCK_MONOTONIC)` is absent on Darwin --
|
|
guard it with `#ifdef` and derive any monotonic timed wait accordingly (a monotonic
|
|
deadline cannot be handed to a realtime-clock condvar).
|
|
|
|
---
|
|
|
|
## 8. Error model (one source of truth)
|
|
|
|
A `NativeFnT` returns a status int and, on failure, writes a human-readable message **into
|
|
`result`** (a `valueStringE` tagged as an error, or a small error-struct convention). That
|
|
single in-band channel crosses the actor boundary unchanged, is freed exactly once by the
|
|
caller, and is surfaced into the calling engine as that engine's native error
|
|
(`luaL_error`/`lua_error` for Lua, `mb_raise_error` for my-basic). There is no second error
|
|
field anywhere.
|
|
|
|
---
|
|
|
|
## 9. Context lifetime and the registry (UAF fix)
|
|
|
|
The critical use-after-free: contexts were addressed by raw `ScriptContextT*` (and exports
|
|
held raw owner pointers), while `contextShutdown` frees the context and destroys its
|
|
mutex/cond at runtime -- so a foreign thread could enqueue onto a freed `queueMutex`.
|
|
|
|
Fix:
|
|
|
|
- Address contexts by a **stable integer id** through a **locked registry**; never by raw
|
|
pointer. `contextEnqueue`/`contextCall`/`ioDispatch` resolve id -> context under the
|
|
registry lock and either hold the lock across enqueue or take a reference so the context
|
|
(and its mutex) cannot be freed mid-enqueue.
|
|
- Add a **generation counter** to context ids and to `CallableT.ownerGen` so a recycled id
|
|
cannot misroute an in-flight completion to a different context.
|
|
- `contextShutdown`: under the lock, mark dead and remove from the id map; reject new
|
|
enqueues with a defined "dead context" error; drain and error-reply queued work; wait for
|
|
in-flight references to drain; then free.
|
|
|
|
---
|
|
|
|
## 10. Function-value lifecycle across threads
|
|
|
|
`CallableT.refCount` is atomic. `valueCopy` bumps it; `valueFree` drops it. The subtlety:
|
|
releasing the underlying closure is an interpreter op (`luaL_unref` / my-basic unref) that
|
|
must run on the owner's thread. So when a drop reaches zero on a *foreign* thread, the
|
|
broker posts a **release message** to the owner context instead of touching the interpreter
|
|
directly; the owner releases the closure on its own thread and frees the `CallableT`. If
|
|
the owner is already gone (`alive == false`), the `CallableT` shell is freed directly (the
|
|
closure is already gone with the interpreter) and any pending invoke returns a clean error.
|
|
|
|
---
|
|
|
|
## 11. Build order
|
|
|
|
1. **Broker core**: `broker.h` (the canonical `ValueT`/`AggregateT`/`CallableT`/enums),
|
|
`valueCopy`/`valueFree`/`valueMove` with full tag coverage and the depth cap, the
|
|
name->entry registry, `brokerCall`, the error convention. Unit-test value round-trips
|
|
and deep-copy/free under a leak checker before any engine exists.
|
|
2. **Lua adapter** against the core: trampoline, scalar+string marshalling, table
|
|
deep-copy both directions with caps, native registration, export with the refcounted
|
|
`luaL_ref` lifecycle and exports-array growth. Test C<->Lua and Lua-export-called-from-C
|
|
single-threaded.
|
|
3. **my-basic adapter**: lifecycle, the trampoline bank, the arg-frame protocol, `mb_memdup`
|
|
string ownership, the collection mapping, and the parked `__BROKERSERVE` export frame.
|
|
Test C<->BASIC and the full Lua<->broker<->BASIC round-trip against the fidelity table
|
|
(assert the lossy edges error or coerce exactly as documented).
|
|
4. **Actor layer**: `ScriptContextT`, the MPSC queue, the reply future, the id+generation
|
|
registry, the single error channel, the chosen block-while-waiting semantics (sec 6.1),
|
|
and shutdown drain. Stress cross-context calls and teardown under a thread sanitizer.
|
|
5. **Networking/dispatcher**: epoll I/O thread, the FIFO command queue, the socket/timer
|
|
natives, completion dispatch onto owning queues, callbacks via `valueFnE`.
|
|
6. **Squirrel** (later): a third adapter validates that the vtable + canonical `ValueT`
|
|
really make new engines O(1).
|
|
|
|
---
|
|
|
|
## 12. Open decisions
|
|
|
|
- **Block-while-waiting semantics: DECIDED -- always-live nested pump (sec 6.1).**
|
|
- Strict-vs-lenient default for the lossy marshal edges (recommend: strict by default so
|
|
truncation/loss is an explicit error; lenient opt-in per call).
|
|
- my-basic native-bank size (cap on natives per BASIC context).
|
|
- Whether a foreign function injected into BASIC should be transparently callable as a
|
|
routine value (`cb(x)`) or only via the portable `CALL(fn, ...)` primitive (Lua gets the
|
|
transparent form for free; BASIC's transparent form needs confirming).
|
|
|
|
---
|
|
|
|
## 13. Implementation notes (as-built: broker core + both adapters)
|
|
|
|
Built and tested: `broker.h`/`value.c`/`broker.c` (core), `luaAdapter.*` (Lua 5.4),
|
|
`mybasicAdapter.*` (vendored my-basic in `vendor/`), with `testBroker`/`testLua`/
|
|
`testMyBasic`/`testPolyglot` -- 378 checks, clean under ASan+UBSan. The polyglot test
|
|
proves the thesis: one C native called from both engines, and a Lua function invoked from
|
|
a BASIC program through the broker.
|
|
|
|
**Core refinement.** The single global callable-release hook could not distinguish a Lua
|
|
closure from a my-basic routine, so release is now a per-callable `CallableReleaseFnT`
|
|
passed to `callableCreate` (design sec 10's "owner releases the closure", just synchronous
|
|
for now). Added `callableUserData` so a release fn can reach its closure handle.
|
|
|
|
**Lua adapter.** Context pointer lives in `lua_getextraspace`. Native bindings are
|
|
context-owned `{fn,userData}` structs referenced by a light-userdata upvalue on one shared
|
|
trampoline. A Lua function crossing out becomes a `CallableT` over a pinned `luaL_ref`
|
|
(released via `luaL_unref` in the per-callable release fn); transient callback args are
|
|
freed automatically because `valueFree` drops the handle. A `CallableT` crossing in becomes
|
|
a callable userdata with `__call`/`__gc`. Lua allocation APIs longjmp on OOM (no NULL
|
|
checks). Caveat: release exported callables before `luaContextDestroy` (the `luaL_ref`
|
|
lives in that state's registry).
|
|
|
|
**my-basic adapter** (the high-effort one; these rules were forced by ASan):
|
|
- Build with `-DMB_DOUBLE_FLOAT` (double reals) and link `-lm`.
|
|
- Native signature has no per-call userData and one interpreter userdata slot, so a
|
|
macro-generated **trampoline bank** (`MB_BANK_SIZE`) supplies slot-specific entries that
|
|
recover the binding from the context.
|
|
- `mb_register_func` returns a count: **nonzero = success, 0 = failure** (inverted vs the
|
|
usual `MB_FUNC_OK == 0`).
|
|
- **Ownership is asymmetric and was the main source of bugs** (verified against the my-basic
|
|
source during adversarial review):
|
|
- A popped **collection** is owned by the consumer (`mb_dispose_value` after marshalling);
|
|
a popped **string** is a borrowed interior pointer (copy, never free).
|
|
- `mb_set_coll` **copies** a scalar/string key-value (dispose your copy after) but stores a
|
|
**collection by pointer without a reference** -- so a nested collection needs an explicit
|
|
`mb_ref_value` *before* the set, and must then NOT be disposed (the parent owns it).
|
|
- `mb_push_value` **transfers** a collection, but **borrows** a string -- a string result
|
|
must be pushed with `mb_push_string` (which marks it for lazy destroy), not
|
|
`mb_push_value`.
|
|
- `mb_eval_routine` **borrows** its arguments (it never frees them), so marshalled routine
|
|
args are disposed by the caller after the call -- and the **return value is marshalled
|
|
out first**, because a routine may return one of those borrowed arguments.
|
|
- Routine values are **not** ref-counted (`mb_ref_value`/`mb_unref_value` corrupt them); a
|
|
routine name must be **uppercased** before `mb_get_routine` (BASIC uppercases at parse).
|
|
- int64 entering BASIC is range-checked to 32-bit `int_t` (`brokerErrRangeE` on overflow).
|
|
- **Routine export** uses `mb_get_routine`(by name) + `mb_eval_routine`, both of which need
|
|
a live `void** l`. That cursor only exists inside a native call, so the dispatch stashes
|
|
it in `currentL`; an exported BASIC-routine `CallableT` is therefore valid only while the
|
|
context is *serving* (a native frame is on the stack -- what the actor layer's parked
|
|
`__BROKERSERVE` frame will guarantee). For now, fetch and invoke within one native call.
|
|
- **One interpreter per program:** a my-basic context hosts a single program; reset+reload
|
|
after disposing native-pushed collection intermediates is unreliable, so the tests spin a
|
|
fresh context per run. (The actor layer will own one long-lived parked context per
|
|
module, which sidesteps this.)
|
|
|
|
**Build/verify.** Core compiled strict (`-Wconversion -Wsign-conversion`); adapters drop
|
|
those two (engine headers use wide macros) but keep `-Wall -Wextra -Werror`. **All three
|
|
engines are vendored under `vendor/` and built from source** -- `vendor/lua` (Lua 5.4.6,
|
|
library = `src/*.c` minus the `lua.c`/`luac.c` mains), `vendor/mybasic` (my-basic),
|
|
`vendor/squirrel-src` (Squirrel 3.2) -- each relaxed and un-sanitized but linked into the
|
|
sanitized binaries so cross-boundary heap misuse is still caught. Nothing depends on a
|
|
system-installed engine or `pkg-config`, so the build is reproducible. The Lua platform
|
|
define is selected automatically: `$(OS)` first (Windows sets `Windows_NT` and has no
|
|
`uname` -> no define / ISO C), else from `uname -s` (LUA_USE_LINUX + `-ldl` /
|
|
LUA_USE_MACOSX / LUA_USE_POSIX). NB the project is otherwise Unix-only (pthreads,
|
|
sanitizers, `setarch`), so the Windows branch only keeps the define correct.
|
|
|
|
**Function-value lifecycle across threads (sec 10), DONE.** `callableInvoke` and the
|
|
final `callableRelease` are now thread-correct. The core exposes two installable hooks
|
|
(`callableSetInvokeHook`/`callableSetReleaseHook`, the same pattern as `brokerSetRouteHook`)
|
|
so it stays independent of the actor layer; `actorInit` installs them. An invoke from a
|
|
thread other than the callable's owner is marshalled to the owner's thread by reusing the
|
|
CALL machinery (a callable's `fn`+`userData` are exactly a native call -- `callableFn` is
|
|
the one new accessor). The final reference drop is routed too: a new `messageReleaseE`
|
|
posts the finalize (fire-and-forget) to the owner, which runs the engine release
|
|
(`luaL_unref` / `sq_release`) on its own thread. `callableFinalize` is the shared "run
|
|
release + free shell" tail; the core still runs it inline when no actor is present (so the
|
|
single-threaded `testCallableDead` semantics -- a dead callable still runs its release on
|
|
last drop -- are preserved). `testEngineLua` captures a Lua closure on its context's
|
|
thread, then invokes and releases it from the main thread; both marshal to the owner,
|
|
ASan/TSan-clean. Limit: releasing a callable whose owner *context* has been destroyed is
|
|
the deferred non-quiescent-teardown case (sec 9) -- best-effort inline finalize for now.
|
|
|
|
**JavaScript adapter (Duktape), the fourth engine.** Vendored Duktape 2.7.0 (the
|
|
single amalgamated `duktape.c`/`duktape.h`/`duk_config.h`) in `vendor/duktape`, built
|
|
relaxed/un-sanitized. `src/js/jsAdapter.*` mirrors the Lua/Squirrel adapters: one shared
|
|
trampoline recovers its binding from a hidden property on the function object
|
|
(`duk_push_current_function` + an internal `\xFF`-prefixed key) and dispatches through
|
|
`brokerCall`; marshalling covers scalars, binary-safe strings, and the hybrid aggregate
|
|
(JS array <-> list, object <-> map) with the depth cap. JS numbers are doubles, so an
|
|
integral in-range number round-trips as an int (else a real). A JS function crossing out
|
|
becomes a refcounted `CallableT` over a Duktape **heap pointer** kept alive by a per-heap
|
|
export registry object in the global stash (the slot is dropped on release) -- and it
|
|
participates in the sec-10 cross-thread routing, so a JS closure captured on its context's
|
|
thread is invoked and released from another thread correctly. `src/js/jsEngine.*` is the
|
|
EngineT binding. `testJs` (single-threaded: scalar/string/array/object marshalling, export
|
|
+ invoke-from-C, closure-as-arg callback, error paths) and `testEngineJs` (threaded:
|
|
cross-context call + the sec-10 callback) -- clean under ASan/UBSan and TSan (`make tsanjs`).
|
|
Adding the engine touched zero lines of the broker core or actor layer (one adapter TU +
|
|
one engine-binding TU + Makefile rules), re-confirming the O(1)-engine-add thesis. v1 limit
|
|
(as with Squirrel): pushing a foreign `CallableT` *into* JS is unsupported.
|
|
|
|
**Source layout.** `src/` holds the project source (core + actor directly in `src/`,
|
|
one subdir per script language: `src/lua`, `src/mybasic`, `src/squirrel`, `src/js`);
|
|
`tests/` holds the test programs; `obj/` collects every object file (ours and the vendored
|
|
engines, via `patsubst` into `obj/`); `bin/` collects the binaries. The Makefile finds our
|
|
sources by `VPATH` and groups object rules by flag set; `-MMD -MP` generate header
|
|
dependencies automatically. `make clean` removes `obj/` and `bin/`. `make test` builds and
|
|
runs all ten binaries; `make tsan`/`make tsansq`/`make tsanjs` are the ThreadSanitizer
|
|
variants.
|
|
|
|
## 16. Threading model rewrite -- host-thread natives, fire-and-forget scripts
|
|
|
|
Supersedes the earlier "natives run inline on the calling context thread" model. The
|
|
host's own thread is now an implicit **host context (id 0)**: it has a queue but no OS
|
|
thread of its own, and the host drives it by calling **`calogPump`** in its loop.
|
|
|
|
- **Scripts are fire-and-forget.** `calogContextEval(ctx, src)` enqueues the script
|
|
onto the context's thread and returns a status (not the result); the script runs
|
|
asynchronously, and results come back by calling natives.
|
|
- **A registered native runs on the host thread, serialized.** A script calling one
|
|
posts a CALL onto the host queue and parks; the host runs it during `calogPump`. So
|
|
host C code is never called concurrently and needs no locking. `actorRoute` inlines a
|
|
call already on the host thread; otherwise it marshals to the host context (id 0).
|
|
- **`calogRegisterInline`** is the escape hatch: the registry entry's `runInline` flag
|
|
(which replaced `ownerCtxId`) makes the native run on the calling script's thread.
|
|
- **Errors** from a fire-and-forget script are posted to the host queue and delivered
|
|
to the `CalogErrorFnT` handler (`calogSetErrorHandler`) during `calogPump` (default:
|
|
log to stderr).
|
|
- **Function values** (`CalogFnT`) still run on their owning engine's thread -- sec 10
|
|
routing is unchanged, and `calogFnInvoke` from the host blocks-and-pumps the host
|
|
queue while it waits (the same nested pump, now applied to the host context).
|
|
- **Nested eval is allowed:** a new eval that arrives while a context is mid-script
|
|
(parked on a native call) runs nested via `pumpUntil` -- consistent with the sec-6.1
|
|
re-entrancy contract (interpreters support nested `pcall`/`peval`).
|
|
|
|
API shape: `calogRegister(c,name,fn,ud)` / `calogRegisterInline(...)`;
|
|
`calogContextOpen(c,engine) -> CalogContextT*` (create+start merged, since nothing is
|
|
registered between them anymore) and `calogContextClose`; `calogContextEval(ctx,src)`
|
|
fire-and-forget; `calogPump`; `calogSetErrorHandler`. `CalogConfigT` and the
|
|
`createInterpreter` config parameter are gone -- a context now exposes *every*
|
|
registered native (the engine binding walks the registry via the internal
|
|
`calogForEach`). Tests rewritten to drive calog the host way (register, open, eval,
|
|
pump-until-a-native-records-the-result); `testActor` is now engine-free, exercising the
|
|
dispatch machinery with C callables (`calogFnCreate`) on synthetic contexts. Verified:
|
|
`make test` 441 checks across 11 binaries (incl. `examples/embed.c`), gcc + clang
|
|
strict, ASan/UBSan + TSan clean (`make tsan`/`tsansq`/`tsanjs`).
|
|
|
|
**`CalogT` owns its contexts; ids are unbounded.** The active-context registry moved
|
|
from `context.c` file-static globals *into* `struct CalogT` (now defined in
|
|
`calogInternal.h`): a runtime owns both its native-function registry and its
|
|
active-context registry (`ctxMutex`, `ctxSlots`, freelist). `context.c` reaches it via
|
|
one `runtime` pointer set in `calogActorInit` (which also refuses a second runtime).
|
|
So `calogDestroy` closes every still-open context automatically -- the host need not
|
|
track them (a test opens 32 and never closes them; ASan confirms no leak). Context ids
|
|
widened to **`uint64`** (32-bit slot index + 32-bit generation), so neither the live
|
|
count nor open/close churn hits a preset ceiling; `calogContextId`/`calogCurrentId`
|
|
return `uint64_t`. The now-dead `ownerGen` parameter was dropped from `calogFnCreate`
|
|
(generation lives in the packed id). Re-verified: `make test` 473 checks, ASan no
|
|
leaks, TSan clean, gcc + clang strict.
|
|
|
|
**Independent runtimes in one process.** The one-runtime limit was not fundamental --
|
|
just process-global state that hadn't moved into `CalogT`. All of it now has: the host
|
|
context, the routing hooks (`routeHook`/`invokeHook`/`releaseHook`), and the error sink
|
|
are `CalogT` fields; a `CalogFnT` carries a `runtime` pointer so `calogFnInvoke`/release
|
|
reach the right hooks (the callable path has no `CalogT` otherwise). The dispatch
|
|
reaches its runtime through the object it already holds -- the route hook is handed its
|
|
`calog`, the callable hooks read `calogFnRuntime`, context-thread code uses
|
|
`context->broker`, and the rest take an explicit `calog` argument. The **only** remaining
|
|
process global is `currentContext`, and it is thread-local (it names the calling
|
|
thread's context). So the setters (`calogSetRouteHook` etc.) are gone -- `calogActorInit`
|
|
assigns the fields directly -- and the `runtime` static that the earlier review flagged
|
|
is deleted. Runtimes are isolated: don't pass a value or callable between them (a
|
|
cross-runtime reply cannot route). A test spawns N threads, each creating, driving, and
|
|
destroying its own runtime concurrently.
|
|
|
|
**One thread may host several runtimes.** `calogPump(calog)` sets `currentContext` to
|
|
`calog`'s host context for the drain and restores it after, so a single thread can drive
|
|
many runtimes by pumping each in turn -- a native serviced during `calogPump(A)` sees
|
|
`calogCurrent() == A` even if the thread also hosts B. Two consequences fall out and are
|
|
handled: context ids number from 1 in every runtime, so the "already on the owner's
|
|
thread" (inline) and "caller can take the token/pump path" (reply) decisions match the
|
|
*runtime* too, not the id alone -- a foreign or wrong-runtime caller takes a reply box,
|
|
which cannot misroute. A test creates two runtimes on one thread, runs a script in each,
|
|
and pumps both in a loop, asserting each runtime's native resolved `calogCurrent()` to
|
|
its own runtime (it fails if the pump doesn't rebind `currentContext`). Re-verified:
|
|
`make test` 480 checks, ASan no leaks, TSan clean (both concurrent runtimes and one
|
|
thread pumping two), gcc + clang strict.
|
|
|
|
**Loading a script by filename.** Each engine carries a NULL-terminated `extensions`
|
|
list (`{"lua"}` / `{"js"}` / `{"nut"}` / `{"bas"}`), and a host makes engines available
|
|
for filename-based loading with `calogRegisterEngine(calog, &engine)`.
|
|
`calogContextLoad(calog, base)` then walks the registered engines in registration order
|
|
(each engine's extensions in order), forms `"<base>.<ext>"`, and the first one that
|
|
`fopen`s wins: it reads the file on the calling thread, opens a context on that engine,
|
|
and loads the contents fire-and-forget -- returning the context (NULL if nothing matched
|
|
or the load failed). Registration matters for more than search order: hardcoding the
|
|
built-in engine vtables in the core would force-link *all* of them (and their vendored
|
|
runtimes) into every binary, defeating the per-engine archives -- so the host opts in,
|
|
and a binary that never references an engine pulls in none (`testActor` stays
|
|
engine-free). Engine selection is fundamentally a build-time (link) choice, so
|
|
`calogRegisterBuiltinEngines` (a header-inline in `calog.h`) registers exactly the
|
|
engines whose `CALOG_WITH_<ENGINE>` macro is set -- the host defines those alongside the
|
|
archives it links, and the inline emits nothing (references no engine) unless called, so
|
|
it never force-links.
|
|
|
|
**my-basic as an actor engine.** Making my-basic loadable meant running it under the
|
|
actor model for the first time, which exposed two things. (1) Its native dispatch called
|
|
the C function *directly* instead of through `calogCall`, so natives ran on the my-basic
|
|
context thread rather than marshalling to the host -- fixed by routing `mbDispatch`
|
|
through `calogCall` (the binding now stores the registry name), matching the other
|
|
engines. (2) my-basic keeps process-global state -- lazy `mb_init` singletons and a
|
|
global `_mb_allocated` counter touched on every allocation (forced on in the vendored
|
|
header) -- so two my-basic contexts on different threads race (TSan-confirmed). The
|
|
singletons are built once by `mb_init` and read-only thereafter, so the only
|
|
execution-time shared write is that counter; a one-line vendored patch makes it
|
|
`_Atomic` (the original is preserved as `vendor/mybasic/myBasic.c.orig`). With the
|
|
counter safe, the my-basic *engine* (not the adapter, which stays usable single-threaded
|
|
and lock-free) needs a lock only across *lifecycle* -- `mb_init`'s first-context build,
|
|
`mb_dispose`'s last-context teardown, and the shared context refcount -- and NOT across
|
|
`runSource`, so several my-basic scripts execute concurrently. A `tsanmb` target proves
|
|
the parallel case is race-free (verified further by a 4-context stress running
|
|
arithmetic, strings, lists, and booleans). Verified: `make test` 494 checks (13
|
|
binaries), ASan no leaks, TSan clean on all four engines
|
|
(`tsan`/`tsansq`/`tsanjs`/`tsanmb`), gcc + clang strict.
|
|
|
|
## 15. Public embedding API (`calog.h`) -- as-built (superseded by sec 16 for threading/API)
|
|
|
|
calog is packaged as an embedding library: a host links it, registers its own native
|
|
C functions, creates script contexts on an engine, and runs scripts. Every public
|
|
symbol carries a `calog` prefix (types `Calog...T`, enums `Calog...E`) so the library
|
|
is a good citizen in a host binary. The API was curated to the minimum:
|
|
|
|
- **One handle, one header.** `CalogT` is the runtime; `calogCreate()` composes the
|
|
registry with the actor layer (installs the routing hooks) and `calogDestroy()`
|
|
tears both down -- no separate init/shutdown for the host. The entire embedding
|
|
surface is `src/calog.h` (~30 functions); internal machinery (the registry entry
|
|
type, the route/invoke/release hooks, the low-level callable lifecycle, the split
|
|
`calogBrokerCreate`/`calogActorInit`) lives in `src/calogInternal.h`, which host code
|
|
never includes. `calog.h` leaks no internal symbol.
|
|
- **One config type.** The three per-engine configs collapsed into `CalogConfigT`
|
|
(`exposeNames` + `exposeCount`), used by every built-in engine vtable.
|
|
- **Value model unchanged, just prefixed.** `CalogValueT`/`CalogAggT`/`CalogFnT` +
|
|
constructors (`calogValueInt`, ...), ops (`calogValueCopy`/`Free`/`Move`/`Equals`),
|
|
aggregates (`calogAgg*`), function values (`calogFnInvoke`/`Retain`/`Release`), and
|
|
`calogFail`/`calogTypeName` for writing natives. Contexts: `calogContextCreate`/
|
|
`Start`/`Eval`/`Destroy`/`Id`, plus `calogCurrentId`/`calogCurrent` for natives. The
|
|
built-in engine vtables are `calogLuaEngine`, `calogJsEngine`, `calogSquirrelEngine`
|
|
(a host may also supply a custom `CalogEngineT`).
|
|
- **Packaging.** `make` builds `lib/libcalog.a` (calog itself: core + actor + every
|
|
adapter/binding) and separate vendored-engine archives (`liblua.a`, `libduktape.a`,
|
|
`libsquirrel.a`, `libmybasic.a`). A host links `libcalog.a` plus whichever engine
|
|
archives it uses; unused adapters (and their engine deps) stay unlinked because static
|
|
members are pulled only when referenced -- so a JS-only host never links Lua/Squirrel.
|
|
The tests consume the archives; the threaded/engine tests use only `calog.h`,
|
|
validating that the public surface is complete. `examples/embed.c` is a ~30-line host
|
|
(public header only) that registers a native and calls it from JavaScript.
|
|
- **Reconfirmed:** rename + restructure kept all 441 checks passing across 10 test
|
|
binaries, clean under ASan/UBSan and TSan (all four engines), gcc + clang strict.
|
|
|
|
## 14. Implementation notes (as-built: actor layer, engine-on-a-thread, Squirrel)
|
|
|
|
The actor layer (`context.h`/`context.c`, build step 4) is built and tested:
|
|
`testActor` exercises cross-context routing, the always-live nested pump (the
|
|
re-entrant A->B->A deadlock test), and a concurrent fan-out stress; clean under
|
|
ASan+UBSan and ThreadSanitizer (`make tsan`, run under `setarch -R` -- some kernels
|
|
hand out more mmap randomization than TSan's shadow tolerates). One thread + one
|
|
MPSC queue per `ScriptContextT`; `brokerCall` routes through an installed hook
|
|
(`brokerSetRouteHook`) so owner-0/same-context calls run inline and others marshal
|
|
to the owning thread; an external caller blocks on a private reply box, a context
|
|
caller pumps. The reply carries only `{status, result}` -- the single error channel
|
|
(sec 8) is structural, the error string rides in `result`. `contextSendBlocking`
|
|
and `contextReply` are the shared enqueue-wait and reply tails behind both CALL and
|
|
EVAL dispatch.
|
|
|
|
**Generationed registry (sec 9), DONE.** Context ids pack a 16-bit slot index and
|
|
a 16-bit generation; the registry is a slot table plus a freelist. `contextDestroy`
|
|
unlinks a context under the registry lock (after stopping+joining its thread) and
|
|
returns the slot to the freelist; the next reuse bumps the generation. A stale id
|
|
(slot since freed/recycled) resolves to `brokerErrDeadE`, never misroutes to the
|
|
recycler -- `testActor`'s generation test proves it. The registry lock is held
|
|
across enqueue, so a foreign enqueue cannot race a destroy onto a freed queue mutex.
|
|
Still quiescence-assuming (no call to the context in flight at teardown); in-flight
|
|
reference draining is the remaining sec 9 hardening.
|
|
|
|
**Engine on a thread (the EngineT vtable).** `EngineT` gained `runSource`;
|
|
`contextEval(context, source, result)` marshals a script run onto the context's own
|
|
thread (a new `messageEvalE`) and blocks like a call. Each adapter's engine binding
|
|
lives in its own TU (`luaEngine.*`, `squirrelEngine.*`) -- the only Lua/Squirrel
|
|
files that depend on the threading layer, keeping the adapters thread-agnostic so
|
|
`testLua`/`testPolyglot` link them without `context.o`/pthread. `createInterpreter`
|
|
runs on the thread and exposes the configured natives there. Crucially, the exposed-
|
|
native trampolines now dispatch through `brokerCall` (by broker+name) instead of a
|
|
captured fn pointer, so an exposed native owned by another context is transparently
|
|
routed to its thread -- the script author still writes `doubleIt(21)`. With no route
|
|
hook installed this is identical to the old inline path (testLua still passes).
|
|
`testEngineLua` proves a real Lua interpreter on a context thread calling a thread-
|
|
agnostic native and a cross-context native, on the correct threads.
|
|
|
|
**Squirrel adapter (sec 11 step 6), the O(1)-engine-add validation.** Vendored
|
|
Squirrel 3.2 in `vendor/squirrel-src` (C++), built relaxed/un-sanitized with
|
|
`-D_SQ64 -DSQUSEDOUBLE` so `SQInteger`/`SQFloat` are 64-bit int / double matching
|
|
`ValueT` -- the adapter shares those defines so the ABI matches. `squirrelAdapter.*`
|
|
mirrors the Lua adapter: one shared trampoline recovers its binding from the
|
|
closure's single free variable (which the VM pushes onto the stack *after* the args,
|
|
so it sits at the top -- verified in `sqvm.cpp` CallNative), marshals scalars,
|
|
binary-safe strings, and the hybrid aggregate (array<->list, table<->map) with the
|
|
shared depth cap, and dispatches through `brokerCall`. `testEngineSquirrel` runs a
|
|
real VM on a thread doing the cross-context call plus string and array round-trips;
|
|
clean under ASan+UBSan and TSan (`make tsansq`). The total surface a new engine
|
|
added: one adapter TU + one engine-binding TU + Makefile rules -- no change to the
|
|
broker core or the actor layer, which is the thesis.
|
|
|
|
**Squirrel closure export, DONE.** A Squirrel closure crossing the boundary now
|
|
becomes a refcounted `CallableT` over a pinned `HSQOBJECT` (`sq_addref`/`sq_release`,
|
|
mirroring Lua's `luaL_ref` lifecycle): `squirrelExport` fetches a named global
|
|
closure, and a closure passed as a native argument is exported the same way during
|
|
ingress (the VM's foreign pointer -- finally used -- recovers the owning context).
|
|
`squirrelCallableInvoke` runs `sq_pushobject`+`sq_call` on the owner's VM and
|
|
marshals the return; `squirrelCallableRelease` `sq_release`s on the owner thread.
|
|
Single-threaded `testSquirrel` covers export+invoke-from-C, a closure passed as an
|
|
argument and called back through the broker, and the not-found/type-error paths;
|
|
ASan-clean (no addref/release leak). Caveat (same as Lua): release exported
|
|
callables before `squirrelContextDestroy`. (The reverse direction -- a foreign `CalogFnT`
|
|
pushed INTO Squirrel -- is now supported too: a native closure whose one free variable is
|
|
a release-hooked userdata holding the `CalogFnT`; see sec 18.)
|
|
|
|
`make test` runs all seven binaries (411 checks). `make tsan` covers the actor core
|
|
and the Lua engine path; `make tsansq` the Squirrel path.
|
|
|
|
**Adversarial review (3 parallel reviewers: actor concurrency, Squirrel adapter,
|
|
Lua trampoline + engine bindings).** Two real defects found and fixed:
|
|
- *NULL-interpreter crash.* `threadMain` ignores `createInterpreter`'s status, so a
|
|
failed create (e.g. a config expose-name that was never registered) left a context
|
|
serving with `interp == NULL`; `contextDispatchEval` only checked `runSource !=
|
|
NULL`, so the first eval called `runSource(NULL,...)` -> NULL deref. Fixed by
|
|
guarding `interp == NULL` (the context still serves native calls, just rejects
|
|
evals); regression test in `testEngineLua` (testFailedInterpreter).
|
|
- *OOM lost-wakeup.* `contextReply`'s context-caller branch allocated a fresh REPLY
|
|
and, on `calloc` failure, dropped the wakeup -- the caller hung in `pumpUntil`
|
|
forever. Fixed by reusing the request message as the reply (it already carries the
|
|
token and replyToId), which removes the allocation entirely, so the wakeup can no
|
|
longer be lost to OOM.
|
|
The Squirrel adapter was traced clean against the real Squirrel source (trampoline
|
|
free-var indexing, stack balance, ValueT ownership, the throwerror/free order, binary
|
|
strings); added `sq_reservestack` guards before the recursive marshallers to match the
|
|
Lua adapter's `lua_checkstack` discipline. Documented (not changed): the registry must
|
|
be frozen before contexts start (`brokerCall` reads it locklessly from context
|
|
threads -- noted in broker.h); teardown still assumes quiescence (sec 9); and the
|
|
hybrid-aggregate-to-Squirrel-table egress flattens array indices and integer keys into
|
|
one table (same lossy edge as elsewhere in the fidelity table).
|
|
|
|
## 17. Engine expansion -- QuickJS-ng, and three new languages (Berry, s7, Wren)
|
|
|
|
calog now ships **seven** engines. Each is one adapter TU (marshalling + native
|
|
trampoline + the sec-10 callable export) plus one engine-binding TU (the four-hook
|
|
`CalogEngineT`), a vendored-from-source archive, a `testEngine*`, and a `tsan*` target --
|
|
the core, the actor layer, and `calog.h` were untouched, re-confirming the O(1)-per-engine
|
|
thesis. Which engines a binary pulls in is a link-time choice: the header-inline
|
|
`calogRegisterBuiltinEngines` references only the `CALOG_WITH_<ENGINE>`-selected vtables,
|
|
so `testActor` still links **zero** engine code.
|
|
|
|
**QuickJS-ng replaces Duktape** (same `calogJsEngine` / `.js`, same `jsAdapter.h` /
|
|
`jsEngine.c` -- only `jsAdapter.c` and the Makefile changed). The wins: a JS **BigInt
|
|
round-trips to int64 exactly** (`JS_ToBigInt64`), closing the double-only fidelity gap
|
|
Duktape had (proven by a 2^53+1 test); JS functions are **refcounted `JSValue`s**
|
|
(`JS_DupValue` / `JS_FreeValue`), replacing the Duktape heap-pointer-pinning registry
|
|
behind `CalogFnT`; and the broker/name binding rides on `JS_SetContextOpaque` +
|
|
`JS_NewCFunctionData`. One gotcha: a missing global reads back as `undefined` (not an
|
|
error), so `calogJsExport` maps `JS_IsUndefined` to not-found while a bound non-function
|
|
stays a type error. Core library = `quickjs.c` + `libregexp.c` + `libunicode.c` +
|
|
`dtoa.c`, built with `-D_GNU_SOURCE`.
|
|
|
|
**Berry** (`.be`) is a Lua-like stack VM (64-bit ints, binary-safe `be_pushnstring`).
|
|
Natives are Berry **native closures carrying two upvalues** (the context comptr and the
|
|
name), recovered with `be_getupval(vm, 0, pos)`; a Berry function crossing out is pinned
|
|
under a uniquely-named hidden global (globals are GC roots) and dropped by setting it to
|
|
nil. The sharp edge: `be_pcall(vm, argc)` leaves the **result in the function's slot
|
|
(`base+1`)**, not at `-1` (which holds the last stale argument). Vendoring needs Berry's
|
|
`coc` codegen prebuild plus its OS port (`be_port.c`) and module/class tables
|
|
(`be_modtab.c`). A subtlety for records: `be_newmap`/`be_newlist` push *raw* containers
|
|
that a script cannot subscript, so an aggregate crossing out is wrapped in its `map`/`list`
|
|
class instance (`map(raw)` via `be_getbuiltin` + `be_call`, then `be_moveto`/`be_pop` to
|
|
drop the raw and `init`'s nil return) -- then `user['name']` works. Ingress reverses it:
|
|
a list/map instance holds its raw container in the hidden `.p` member, iterated with
|
|
`be_pushiter`/`be_iter_next` (see sec 18).
|
|
|
|
**s7 Scheme** (`.scm`) uses the *current* official s7 (an older mirror lacked `s7_free`,
|
|
which would leak a heap per context). Since s7 native functions carry no user data, all
|
|
natives route through **one generic `%calog-call` dispatcher** plus a per-name Scheme
|
|
wrapper (`(define (report . a) (apply %calog-call "report" a))`); the context rides on a
|
|
`*calog-context*` c-pointer global. Callables are kept alive by `s7_gc_protect` and
|
|
invoked with `s7_call`. Because `s7_eval_c_string` evaluates a single form, `calogS7Run`
|
|
wraps the (escaped) source in `(catch #t (lambda () (eval-string …)) handler)`, so both
|
|
read and run errors surface as a value -- a marker pair the runner detects. An aggregate
|
|
crossing out is a Scheme list, or an (applicable) hash-table when keyed, so a materialized
|
|
record reads as `(user "name")`; reading a script's keyed value back in is a v1 limit. s7's intentional
|
|
"permanent string" interning (which `s7_free` does not reclaim) is a small, bounded
|
|
allocation, suppressed with a documented, allocation-site-specific `__lsan` hook. s7 is
|
|
per-interpreter thread-safe -- no serialization needed (unlike my-basic).
|
|
|
|
**Wren** (`.wren`) is the outlier: Wren has **no bare function calls**, so every native is
|
|
reached through a single foreign method. A preamble defines `class Calog { foreign static
|
|
call(name, args) }`, and scripts call `Calog.call("report", [42])`; the C dispatcher
|
|
recovers the context from `wrenGetUserData`, marshals the argument list, and dispatches
|
|
through `calogCall`. A Wren function crossing out is a retained `WrenHandle`, invoked with
|
|
a cached per-arity `call(_)` handle. Wren numbers are IEEE **doubles**, so int64 above
|
|
2^53 loses precision (the same edge my-basic and old-JS have). An aggregate crossing out
|
|
is a Wren `List`, or a `Map` when keyed (a record reads as `user["name"]`); reading a
|
|
script's list/map back in is a v1 limit -- Wren's C API cannot enumerate a `Map`'s keys.
|
|
Wren keeps no process-global state, so contexts run in parallel.
|
|
|
|
Aggregate egress is therefore uniform across all seven engines: a host native can return a
|
|
keyed `CalogValueT` record and every engine reads its fields with native syntax
|
|
(`user.name` / `user['name']` / `user["name"]` / `(user "name")`). The reverse -- a script
|
|
handing a list/map *back* to C -- is complete on Lua/JS/Squirrel/my-basic and a v1 limit on
|
|
Berry/s7/Wren.
|
|
|
|
**Verified** across all seven engines: `make test` (531 checks, including a materialized-
|
|
record read per new engine), ASan/UBSan clean, a `tsan<engine>` target clean for each, and
|
|
gcc + clang strict on the core.
|
|
|
|
## 18. Closing the v1 marshalling limits (function-into-script, aggregate ingress)
|
|
|
|
The engines above shipped with two directional gaps in the value bridge: a foreign
|
|
function value could not be pushed *into* a script (only Lua did it), and a script could
|
|
not hand a keyed aggregate (map) *back* to C on the three newest engines. Both are now
|
|
closed everywhere they can be, with one honest exception each.
|
|
|
|
**New public API** `calogFnFromNative(out, calog, fn, userData)` -- wraps one of your
|
|
natives as a host-owned function value (owner id 0, runs on the host thread during
|
|
`calogPump`, like `calogRegister` but anonymous). Without it, function-into-script was
|
|
unusable from `calog.h` alone (`calogFnCreate` is internal), so a host could only forward
|
|
a *script*-derived callable, never one of its own natives. Return the result from a native
|
|
and the script gets a callable that routes back to the host.
|
|
|
|
**Function value -> script** (each `*FromValue` `calogFnE` case): an engine callable
|
|
object wraps the `CalogFnT*`, a trampoline marshals the script's args -> `calogFnInvoke`
|
|
-> marshals the result, and a finalizer runs `calogFnRelease`; `calogFnRetain` at push.
|
|
Per engine:
|
|
- **Lua** (pre-existing): userdata + `__call` + `__gc` metatable -- the reference for the rest.
|
|
- **JS**: a `JSClassDef` with both `.call` and `.finalizer`; the finalizer only receives
|
|
the runtime, so `JS_SetRuntimeOpaque` carries the context to it.
|
|
- **Squirrel**: a native closure whose single free variable is a release-hooked userdata
|
|
holding the `CalogFnT` (freeing the closure frees the userdata -> the hook releases).
|
|
- **s7**: an applicable c-object (`s7_make_c_type` + `s7_c_type_set_ref` for the call,
|
|
`s7_c_type_set_free` for release); the `ref` fn gets `(obj . args)`, so the object is
|
|
`s7_car`. Script calls `(f ...)`.
|
|
- **Berry**: no per-value finalizer exists, so the `CalogFnT`s pushed into a context are
|
|
*tracked on the context* and released together in `calogBerryDestroy`; the callable is a
|
|
native closure over `(context, CalogFnT)` comptr upvalues. (`berryFromValue` gained the
|
|
context parameter so it could record them.)
|
|
- **Wren**: a `foreign class CalogFn { construct new() {} foreign call(args) }` whose
|
|
`call` takes a *list* (Wren method arity is fixed, so a list absorbs any argument count);
|
|
finalize releases. Script calls `f.call([...])`. Gotcha: Wren requires newlines between
|
|
class members, so the preamble is multi-line.
|
|
- **MY-BASIC**: inherent gap -- BASIC has no first-class callable values to invoke.
|
|
|
|
**Aggregate ingress** (`*ToValue` map/list): Lua/JS/Squirrel/MY-BASIC already read both.
|
|
Added:
|
|
- **s7**: read a hash-table by `s7_make_iterator` + `s7_iterate` (each yields a
|
|
`(key . value)` cons; gotcha: at end `s7_iterate` returns a *non-pair* sentinel even
|
|
when the at-end flag was still false, so guard `if (!s7_is_pair(pair)) break`).
|
|
- **Berry**: a list/map instance's raw container is its hidden `.p` member; iterate with
|
|
`be_pushiter`/`be_iter_hasnext`/`be_iter_next` (which take the *container* index with the
|
|
iterator kept on top, pushing one value for a list and key+value for a map -- restore the
|
|
stack to `[container, iterator]` after each entry).
|
|
- **Wren**: a `List` reads back directly. A `Map` needs key enumeration, which upstream
|
|
Wren's C API lacks (`wrenGetMapValue` is by-key only) -- so calog adds a small patch to
|
|
the vendored `wren.c`/`wren.h` (`wrenGetMapCapacity` / `wrenGetMapEntry`, mirroring Wren's
|
|
own internal `map_iterate`), and the adapter walks the raw table skipping empty slots.
|
|
With the patch, Wren too reads maps back in. (Documented in LICENSE.md; re-apply if the
|
|
amalgamation is regenerated.)
|
|
|
|
**Deliberately not "fixed" (inherent to the engine's value model):** MY-BASIC 32-bit ints
|
|
(over 2^31 range-checked to an error, not silently truncated), MY-BASIC NUL-in-string
|
|
truncation and serialize-at-load, and JS/Wren int64 above 2^53 (IEEE doubles -- JS *could*
|
|
emit a BigInt but that breaks arithmetic mixing with Number, a worse trap than the
|
|
documented precision edge). `WREN_MAX_CALL_ARITY` (16) is pinned to Wren's own engine
|
|
limit (`MAX_PARAMETERS`) and can't be raised. `MB_BANK_SIZE` (the MY-BASIC native cap) was
|
|
32 -- the one hard cap a real app could hit, since MY-BASIC natives can't carry userdata
|
|
so each needs a hand-materialized slot trampoline; it is now **256** (the trampoline bank
|
|
and its `[MB_BANK_SIZE]` table are regenerated together, so a count mismatch fails to
|
|
compile). Berry's `BE_BYTES_MAX_SIZE` was likewise raised from 32 kb to 256 MB.
|
|
|
|
**Verified**: `make test` (539 checks) with a `testForeignFunction` per engine and a
|
|
`testMapIngress` on s7 and Berry (the Berry one nests a list to exercise list ingress);
|
|
ASan-clean on every engine (retain/release balanced); gcc strict; per-engine `tsan*`.
|