sc-dev-deploy/docs/architecture.md
2026-06-01 16:43:43 -05:00


# dev-deploy architecture
dev-deploy is a Saltcorn plugin that migrates metadata changes (and, when opted in,
row data) across Dev/Test/Prod environments. It records every metadata mutation as an
append-only journal entry keyed by a stable cross-environment UUID, then replays
those entries onto peer instances over an HMAC-authenticated HTTP transport.
This document explains the core model: the ops journal, stable entity UUIDs, the
wrap layer that produces ops, the apply layer that consumes them (including
conflict handling), the per-instance environment identity, and the three table
data modes.
Code references below name the file in the plugin source where each behavior lives.
## Contents
- [Plugin load sequence](#plugin-load-sequence)
- [The ops journal](#the-ops-journal-_dd_ops)
- [Stable entity UUIDs](#stable-entity-uuids-_dd_entity_ids)
- [The wrap layer](#the-wrap-layer)
- [Apply](#apply)
- [Environment identity](#environment-identity-_dd_env)
- [Data modes](#data-modes-user--starter--managed)
- [Plugin tables](#plugin-tables)
- [HTTP endpoints](#http-endpoints)
## Plugin load sequence
`onLoad` (`index.js`) runs on every plugin load and is idempotent:
1. `createAllTables()` creates the six plugin tables if missing (`index.js`,
`schema.js`).
2. `initEnvIfMissing()` ensures this instance has a singleton identity row in
`_dd_env` (`index.js`, `env.js`).
3. On first load only (when `bootstrapped_at` is NULL), `backfillAll()` assigns
UUIDs to every pre-existing metadata entity, then `markBootstrapped()` stamps
the env so backfill never runs again (`index.js`, `entityIds.js`,
`env.js`).
4. `installAllWraps()` monkey-patches the Saltcorn model classes so subsequent
mutations get journaled (`index.js`, `wrap.js`).
5. `ensureCsrfBypass()` appends `/dev-deploy/api/` to Saltcorn's
`disable_csrf_routes` config so peers can POST to the machine API
(`index.js`).
The plugin exports `sc_plugin_api_version: 1`, `onLoad`, and `routes`
(`index.js`).
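
The load sequence above can be sketched as follows. This is a minimal illustration rather than the plugin's code: the `deps` object and `makeFakeDeps` are hypothetical stand-ins for the real modules, and the sketch assumes `initEnvIfMissing` returns the env row.

```javascript
// Hypothetical sketch of the five load steps; `deps` replaces the real modules.
async function onLoad(deps) {
  await deps.createAllTables();               // 1. create any missing tables
  const env = await deps.initEnvIfMissing();  // 2. ensure the singleton identity row
  if (env.bootstrapped_at == null) {          // 3. first load only
    await deps.backfillAll();
    await deps.markBootstrapped();
  }
  deps.installAllWraps();                     // 4. start journaling mutations
  await deps.ensureCsrfBypass();              // 5. exempt the machine API from CSRF
}

// In-memory fakes, used only to show that a second load skips the backfill.
function makeFakeDeps() {
  const env = { env_id: "env-1", bootstrapped_at: null };
  const calls = [];
  return {
    calls,
    createAllTables: async () => { calls.push("createAllTables"); },
    initEnvIfMissing: async () => { calls.push("initEnvIfMissing"); return env; },
    backfillAll: async () => { calls.push("backfillAll"); },
    markBootstrapped: async () => {
      env.bootstrapped_at = new Date().toISOString();
      calls.push("markBootstrapped");
    },
    installAllWraps: () => { calls.push("installAllWraps"); },
    ensureCsrfBypass: async () => { calls.push("ensureCsrfBypass"); },
  };
}
```

Calling `onLoad` twice with the same fakes runs `backfillAll` once, because the second call sees a non-NULL `bootstrapped_at`.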
## The ops journal (`_dd_ops`)
Every tracked change is one append-only row in `_dd_ops` (`schema.js`).
The journal is never rewritten in place: an undo is itself a new compensating op
(see the Journal viewer note at `routes.js`).
### Op record shape
`recordOp` (`ops.js`) builds and inserts a row with these fields:
| Column | Source | Notes |
| --- | --- | --- |
| `op_id` | `rec.op_id` or a fresh v4 UUID | Primary key (`ops.js`, `ids.js`) |
| `source_env_id` | this instance's `env.env_id` | Which environment authored the op (`ops.js`) |
| `op_type` | `<action>_<kind>` | e.g. `create_table`, `update_field`, `drop_view` (`wrap.js`) |
| `entity_kind` | the entity kind | e.g. `table`, `field`; NULL for config ops |
| `entity_uuid` | stable UUID of the touched entity | NULL for `set_config`/`update_plugin_config` (`wrap.js`) |
| `payload` | JSON, stored as TEXT | before/after/patch snapshots (`ops.js`) |
| `parent_op_id` | from AsyncLocalStorage | Set for cascaded child ops (`ops.js`, `context.js`) |
| `correlation_id` | from AsyncLocalStorage | Groups all ops in one logical operation (`ops.js`, `context.js`) |
| `schema_version` | `OP_SCHEMA_VERSION` (currently 1) | `ops.js`, `constants.js` |
| `created_at` | ISO 8601 timestamp | Ordering key for sync and apply (`ops.js`) |
| `applied_at` | ISO 8601 timestamp | Set on locally authored ops; on ingest set only if applied (`ops.js`, `apply.js`) |
| `status` | `committed` by default | See status values below (`ops.js`) |
| `conflict_with_op_id` | NULL unless conflicting | The local op_id this incoming op conflicts with (`apply.js`) |
`payload` is always a JSON string, parsed/stringified at the application layer;
the schema uses portable TEXT/INTEGER types with no JSONB (`schema.js`).
`recordOpSafely` wraps `recordOp` so a journal failure logs but never throws into
the user's mutation (`ops.js`).
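
As a rough illustration of the record shape and the "log, never throw" contract, the sketch below builds a row with the columns listed above. The `env`, `ctx`, and `insert` parameters are hypothetical; the real `recordOp` reads the env and AsyncLocalStorage context itself and writes to `_dd_ops`.

```javascript
const crypto = require("node:crypto");

const OP_SCHEMA_VERSION = 1;

// Build one journal row. `ctx` stands in for the AsyncLocalStorage context.
function buildOpRow(rec, env, ctx) {
  const now = new Date().toISOString();
  return {
    op_id: rec.op_id || crypto.randomUUID(),    // fresh v4 UUID if none supplied
    source_env_id: env.env_id,
    op_type: rec.op_type,
    entity_kind: rec.entity_kind ?? null,
    entity_uuid: rec.entity_uuid ?? null,
    payload: JSON.stringify(rec.payload ?? {}), // JSON stored as TEXT
    parent_op_id: ctx.parentOpId ?? null,
    correlation_id: ctx.correlationId ?? null,
    schema_version: OP_SCHEMA_VERSION,
    created_at: now,
    applied_at: now,                            // set on locally authored ops
    status: "committed",
    conflict_with_op_id: null,
  };
}

async function recordOp(rec, env, ctx, insert) {
  const row = buildOpRow(rec, env, ctx);
  await insert(row);
  return row;
}

// A journal failure is logged and swallowed so the user's mutation still succeeds.
async function recordOpSafely(rec, env, ctx, insert) {
  try {
    return await recordOp(rec, env, ctx, insert);
  } catch (err) {
    console.error("dev-deploy: failed to record op:", err.message);
    return null;
  }
}
```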
### op_type catalog
`op_type` is `<action>_<kind>`. The set of types is fixed by the apply dispatch
table `HANDLERS` (`apply.js`):
- Metadata create/update/drop: `create_table`/`update_table`/`drop_table`,
and the same triad for `field`, `view`, `page`, `trigger`, `role`, `library`,
`tag`, `page_group`, `workflow_step` (`apply.js`).
- Create/drop only (no update handler): `create_`/`drop_` for `constraint`, `file`,
and `page_group_member` (`apply.js`).
- Config: `set_config` (tracked keys only) and `update_plugin_config`
(`apply.js`, `wrap.js`).
- Row data: `insert_row`, `update_row`, `drop_row`, and `set_table_mode`
(`apply.js`).
### Status values
`status` is set when an op is recorded or ingested (`apply.js`):
- `committed` -- authored locally, or ingested and applied successfully.
- `skipped_cascade` -- the op's `parent_op_id` was in the same incoming batch, so
the parent's apply reproduces this child locally (`apply.js`).
- `conflict` -- not applied; a local op touched the same entity since the last
sync. Resolved by the admin (`apply.js`).
- `error` -- apply failed, or no handler exists for the op_type (`apply.js`).
- `rejected` -- a conflict the admin resolved with "use mine" (`apply.js`).
- `merged` -- a conflict the admin resolved with a per-field merge
(`apply.js`).
- `reverted` -- excluded from conflict scanning alongside `rejected`
(`apply.js`).
Indexes on `_dd_ops` cover `created_at`, `(source_env_id, created_at)`,
`entity_uuid`, `correlation_id`, and a partial index on `status='conflict'`
(`schema.js`).
## Stable entity UUIDs (`_dd_entity_ids`)
Saltcorn core identifies metadata by integer id and human name, neither of which
is stable across environments. dev-deploy maintains a side table
`_dd_entity_ids` mapping `(kind, current_id) -> uuid` (`entityIds.js`,
`schema.js`). The UNIQUE constraint `(kind, current_id)` enforces one mapping
per local entity (`schema.js`).
| Column | Meaning |
| --- | --- |
| `uuid` | The stable cross-environment identity (PK) (`schema.js`) |
| `kind` | One of `ENTITY_KINDS` (`constants.js`) |
| `current_name` | Friendly current name; updated on rename, not used for identity (`entityIds.js`) |
| `current_id` | The local Saltcorn integer id |
| `parent_uuid` | Parent entity UUID (e.g. a field's table); preserved so revert can find the parent (`wrap.js`) |
| `created_at` | ISO 8601 timestamp |
Secondary indexes cover `(kind, current_name)` and `parent_uuid`
(`schema.js`).
### Two ways a UUID is born
- **Deterministic (backfill / first run).** `ensureUuid` derives a UUID from
`deterministicUuid(kind, canonical)` -- SHA-256 over
`ID_NAMESPACE | kind | canonicalName` shaped into an RFC 4122 v5-style UUID
(`entityIds.js`, `ids.js`, `constants.js`). Because the namespace is
frozen, two environments installed from the same metadata population converge
on identical UUIDs with no coordination step (`entityIds.js`). The canonical
key is the entity name, with two exceptions: fields use `table.field`, and
constraints use a fingerprint scoped to the parent table UUID (`entityIds.js`).
- **Random (live creation).** When a wrap observes a new entity created after
bootstrap, `assignNewUuid` mints a fresh v4 UUID (`entityIds.js`). On the
receiving side, `adoptUuid` inserts a row with the *source's* UUID so the
identity is preserved across instances (`entityIds.js`).
`backfillAll` walks every tracked kind in dependency order so parents exist
before children resolve their `parent_uuid`: tables, fields, views, pages,
triggers, roles, library, tags, constraints, page groups, workflow steps
(`entityIds.js`). Each backfill function counts only newly inserted rows so
re-running is a no-op (`entityIds.js`).
Lifecycle helpers: `lookupByCurrent` / `lookupByUuid` (`entityIds.js`),
`updateName` on rename (`entityIds.js`), and `removeEntityRow` on delete so a
reused local integer id can't collide with a stale mapping (`entityIds.js`).
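
The deterministic path can be illustrated with a short sketch. The namespace value, the `|` separator, and the exact byte layout below are assumptions made for illustration; the real `ID_NAMESPACE` lives in `constants.js` and the real derivation in `ids.js`.

```javascript
const crypto = require("node:crypto");

// Placeholder value: the plugin's frozen namespace is not reproduced here.
const ID_NAMESPACE = "example-namespace";

// Hash namespace + kind + canonical name, then stamp version/variant bits so
// the first 16 bytes of the digest read as a v5-style UUID.
function deterministicUuid(kind, canonicalName) {
  const hash = crypto
    .createHash("sha256")
    .update(`${ID_NAMESPACE}|${kind}|${canonicalName}`)
    .digest();
  const bytes = Buffer.from(hash.subarray(0, 16)); // a UUID is 128 bits
  bytes[6] = (bytes[6] & 0x0f) | 0x50;             // version nibble = 5
  bytes[8] = (bytes[8] & 0x3f) | 0x80;             // RFC 4122 variant bits = 10
  const hex = bytes.toString("hex");
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20, 32),
  ].join("-");
}
```

Because the inputs are only the namespace, the kind, and the canonical name, two instances that call `deterministicUuid("field", "books.title")` get the same UUID without exchanging any state.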
## The wrap layer
`installAllWraps` monkey-patches the Saltcorn model classes so that create,
update, and delete each append an op (`wrap.js`). It wraps Table, Field,
View, Page, Trigger, Role, Library, Tag, TableConstraint, File, PageGroup,
PageGroupMember, WorkflowStep, plus `state.setConfig`, `Plugin.prototype.upsert`,
and the Table row methods (`wrap.js`).
### The generic wrap
`wrap(target, method, kind, action, hooks)` (`wrap.js`) replaces a method with
an async wrapper that:
1. Calls straight through to the original method, recording nothing, if journaling
is suppressed -- the apply path sets this flag (`wrap.js`, `context.js`).
2. Pre-generates an `op_id` and runs an optional `before` hook to capture
pre-state (`wrap.js`).
3. Calls `enterOp(opId, ...)` to push the op_id onto an AsyncLocalStorage stack,
then invokes the original method inside that scope (`wrap.js`,
`context.js`).
4. On success, runs the `after` hook to compute `entityUuid` + `payload`,
translates any local file references to portable
`__dd_file_ref::<uuid>` placeholders, then calls `recordOpSafely`
(`wrap.js`).
Each wrapped method is tagged `__ddWrapped` (with `__ddOriginal` kept) so a
second `installAllWraps` is a no-op (`wrap.js`). Hook errors are caught and
logged rather than rethrown, so the journal can never break a user-facing
operation (`wrap.js`); a failure in the original method propagates normally
and no op is recorded (`wrap.js`).
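
A simplified sketch of the wrapper is shown below. The sixth `deps` parameter is hypothetical and exists only to keep the example self-contained (the real `wrap` imports its collaborators), and the file-reference translation step is omitted.

```javascript
// Hypothetical sketch: `deps` supplies isSuppressed, newOpId, enterOp, recordOpSafely.
function wrap(target, method, kind, action, hooks, deps) {
  const original = target[method];
  if (original.__ddWrapped) return; // already wrapped: a second install is a no-op

  const wrapped = async function (...args) {
    // 1. Suppressed (e.g. during apply): call through without journaling.
    if (deps.isSuppressed()) return original.apply(this, args);

    // 2. Pre-generate the op id and capture pre-state; hook errors are only logged.
    const opId = deps.newOpId();
    let pre;
    try {
      pre = hooks.before ? await hooks.before.call(this, args) : undefined;
    } catch (err) {
      console.error(`dev-deploy: before hook failed (${action}_${kind}):`, err.message);
    }

    // 3. Run the original inside the op scope. A throw here propagates to the
    //    caller, and nothing below runs, so no op is recorded.
    const result = await deps.enterOp(opId, () => original.apply(this, args));

    // 4. Compute the op and record it; failures never reach the caller.
    try {
      const { entityUuid, payload } = await hooks.after.call(this, { args, result, pre });
      await deps.recordOpSafely({
        op_id: opId,
        op_type: `${action}_${kind}`,
        entity_kind: kind,
        entity_uuid: entityUuid,
        payload,
      });
    } catch (err) {
      console.error(`dev-deploy: after hook failed (${action}_${kind}):`, err.message);
    }
    return result;
  };

  wrapped.__ddWrapped = true;
  wrapped.__ddOriginal = original;
  target[method] = wrapped;
}
```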
### AsyncLocalStorage correlation
`context.js` holds a per-async-flow store of `{ stack, correlationId, suppressed }`
(`context.js`). `enterOp` pushes the new op_id and inherits the parent's
`correlationId` (or mints one at the top level) (`context.js`). When the
original method triggers nested mutations (e.g. a table delete cascading into
field deletes), each child op reads:
- `currentParentOpId()` -- the second-from-top of the stack, recorded as
`parent_op_id` (`context.js`).
- `currentCorrelationId()` -- shared across the whole operation
(`context.js`).
This is how apply later recognizes cascades and skips re-applying children whose
parent is in the same batch (`apply.js`).
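
The stack-and-correlation mechanics can be sketched with Node's `AsyncLocalStorage`. The function names follow the text above, but the bodies are an illustrative reconstruction, not the contents of `context.js`.

```javascript
const { AsyncLocalStorage } = require("node:async_hooks");
const crypto = require("node:crypto");

const als = new AsyncLocalStorage();

// Run `fn` in a scope whose stack has `opId` on top. A nested call inherits
// the parent's correlation id; a top-level call mints a new one.
function enterOp(opId, fn) {
  const parent = als.getStore();
  const store = {
    stack: parent ? [...parent.stack, opId] : [opId],
    correlationId: parent ? parent.correlationId : crypto.randomUUID(),
    suppressed: parent ? parent.suppressed : false,
  };
  return als.run(store, fn);
}

// Second-from-top of the stack: the op that triggered the current one.
function currentParentOpId() {
  const store = als.getStore();
  if (!store || store.stack.length < 2) return null;
  return store.stack[store.stack.length - 2];
}

function currentCorrelationId() {
  const store = als.getStore();
  return store ? store.correlationId : null;
}
```

With this sketch, calling `enterOp("drop-table", ...)` and, inside it, `enterOp("drop-field", ...)` makes `currentParentOpId()` return `"drop-table"` in the inner scope, while both scopes see the same correlation id.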
### Payload contents per action
- **create**: `payload.after` is a snapshot of selected keys; child kinds also
carry `parent_uuid`. The snapshot key lists per kind are defined in `wrap.js`.
- **update**: `payload.patch` is the caller's change set; some kinds also include
`before` and `after` snapshots; renames call `updateName` (`wrap.js`).
- **drop**: the shared `standardDropHooks` capture the UUID, `parent_uuid`, and a
`before` snapshot, then `removeEntityRow` after the delete completes
(`wrap.js`).
### Config and plugin wraps
`wrapSetConfig` wraps `state.setConfig` but only journals a `set_config` op for a
small allowlist of keys -- `menu_items`, `site_name`, `site_logo_id`, `base_url`
(`wrap.js`). `menu_items` references pages/views by name, which
is naturally stable, so no UUID translation is needed (`wrap.js`).
`wrapPlugin` wraps `Plugin.prototype.upsert` to journal `update_plugin_config` ops,
skipping the dev-deploy plugin itself and skipping upserts that don't actually
change the configuration (Saltcorn upserts on every plugin load) (`wrap.js`).
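
The two filters described above might look roughly like this. The allowlist keys come from the text; `shouldJournalConfig`, `stableStringify`, and `pluginConfigChanged` are hypothetical names, and comparing key-sorted JSON is only one plausible way to detect a no-op upsert.

```javascript
// Keys taken from the allowlist described above.
const TRACKED_CONFIG_KEYS = new Set(["menu_items", "site_name", "site_logo_id", "base_url"]);

function shouldJournalConfig(key) {
  return TRACKED_CONFIG_KEYS.has(key);
}

// Serialize with sorted object keys so key order alone never counts as a change.
function stableStringify(value) {
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(",")}]`;
  }
  if (value && typeof value === "object") {
    const parts = Object.keys(value)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${stableStringify(value[k])}`);
    return `{${parts.join(",")}}`;
  }
  return JSON.stringify(value);
}

// True only when the upsert would actually change the stored configuration.
function pluginConfigChanged(before, after) {
  return stableStringify(before ?? {}) !== stableStringify(after ?? {});
}
```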
### Row wraps
`wrapTableRows` wraps `insertRow`/`updateRow`/`deleteRows` on `Table.prototype`
(`wrap.js`). Each consults `journalDecision(this.id)` and passes through
silently unless the table's data mode says to journal (`wrap.js`). Row ops
carry a `table_uuid` plus a row-level UUID as the op's `entity_uuid`
(`wrap.js`). See [Data modes](#data-modes-user--starter--managed) and
[managed-rows.md](./managed-rows.md).
## Apply
`apply.js` replays ops authored elsewhere onto this instance. `applyBatch(ops,
opts)` is the entry point used by both `pull` and `apiIngest` (`apply.js`,
`routes.js`).
### Batch algorithm
`applyBatch` sorts the incoming ops by `created_at`, then for each op
(`apply.js`):
1. **Idempotency** -- if an op with this `op_id` already exists locally, record
`already_applied` and skip (`apply.js`). Every create handler also
re-checks `lookupByUuid(op.entity_uuid)` and returns a `noop` if the UUID is
already mapped (e.g. `apply.js`), giving a second idempotency layer.
2. **Cascade skip** -- if `parent_op_id` is in the same batch, persist
`skipped_cascade` and let the parent's apply reproduce the child
(`apply.js`).
3. **Conflict detection** -- `findConflictingLocalOp` looks for a local op on the
same `entity_uuid` applied since the last inbound sync; if found, persist the
incoming op as `conflict` (not applied) and continue (`apply.js`).
4. **Dispatch** -- look up the handler in `HANDLERS`; missing handler -> `error`
(`apply.js`).
5. **Apply suppressed** -- parse the payload, resolve file placeholders, run the
handler inside `runSuppressed(...)` so the inner Saltcorn calls don't
re-journal or auto-assign UUIDs, then persist `committed`; any throw -> `error`
(`apply.js`, `context.js`).
Each handler resolves the op's `entity_uuid` (and any `parent_uuid`) to the local
integer id, invokes the matching Saltcorn model method, and updates
`_dd_entity_ids` via `adoptUuid` / `updateName` / `removeEntityRow`
(`apply.js`). `stripSurrogateKeys` drops non-portable id columns (`id`,
`table_id`, `view_id`, `page_id`, `role_id_for_create`) from create/patch
payloads before they reach the model (`apply.js`). `persistOp` writes the op
into the local journal preserving its source-side identity (`apply.js`).
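
The five steps of the batch loop can be condensed into an in-memory sketch. Everything passed in the second argument is a hypothetical stand-in: `local` is a `Map` playing the role of `_dd_ops`, and `findConflict` and `runSuppressed` replace the real helpers. File-placeholder resolution and `_dd_entity_ids` updates are left out.

```javascript
// Illustrative only: the real applyBatch reads and writes the database.
async function applyBatch(ops, { local, handlers, findConflict, runSuppressed }) {
  const results = [];
  const batchIds = new Set(ops.map((op) => op.op_id));
  const sorted = [...ops].sort((a, b) =>
    a.created_at < b.created_at ? -1 : a.created_at > b.created_at ? 1 : 0
  );

  for (const op of sorted) {
    const persist = (status, extra = {}) => {
      local.set(op.op_id, { ...op, status, ...extra });
      results.push({ op_id: op.op_id, status });
    };

    if (local.has(op.op_id)) {                              // 1. idempotency
      results.push({ op_id: op.op_id, status: "already_applied" });
      continue;
    }
    if (op.parent_op_id && batchIds.has(op.parent_op_id)) { // 2. cascade skip
      persist("skipped_cascade");
      continue;
    }
    const conflicting = await findConflict(op);             // 3. conflict detection
    if (conflicting) {
      persist("conflict", { conflict_with_op_id: conflicting.op_id });
      continue;
    }
    const handler = handlers[op.op_type];                   // 4. dispatch
    if (!handler) {
      persist("error");
      continue;
    }
    try {                                                   // 5. apply suppressed
      const payload = JSON.parse(op.payload);
      await runSuppressed(() => handler(op, payload));
      persist("committed", { applied_at: new Date().toISOString() });
    } catch (err) {
      persist("error");
    }
  }
  return results;
}
```

Feeding the same batch through twice yields `already_applied` for every op on the second pass, which is the property the idempotency step is there to provide.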
### Idempotency summary
Apply is safe to re-run: a duplicate `op_id` short-circuits at the top of the loop
(`apply.js`); create handlers no-op on an already-mapped UUID
(`apply.js`); drop handlers no-op when the entity or mapping is already gone
(`apply.js`); and `applyInsertRow` no-ops when the row UUID is already present
(`apply.js`).
### Conflict detection
`findConflictingLocalOp` (`apply.js`) reads the `inbound` anchor for the peer
to get a cutoff timestamp, then finds the most recent local op (where
`source_env_id` is this env's id) on the same `entity_uuid` with
`applied_at > cutoff`, excluding `rejected` and `reverted` ops. If such an op
exists, the incoming op represents a concurrent, divergent change and is stored with
`status='conflict'` and `conflict_with_op_id` set to that local op_id.
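
Expressed over an in-memory array instead of SQL, the lookup might read as below. The options object is hypothetical, and two behaviors are assumptions of this sketch rather than documented facts: ops without an `entity_uuid` are not checked, and a missing cutoff (no prior inbound sync) means every local op counts.

```javascript
const EXCLUDED_STATUSES = new Set(["rejected", "reverted"]);

// Return the most recent local op on the same entity applied after the
// inbound cutoff, or null if there is none.
function findConflictingLocalOp(incoming, { localOps, localEnvId, cutoff }) {
  if (incoming.entity_uuid == null) return null; // assumption: nothing to compare

  const candidates = localOps.filter(
    (op) =>
      op.source_env_id === localEnvId &&            // authored on this instance
      op.entity_uuid === incoming.entity_uuid &&    // same entity
      !EXCLUDED_STATUSES.has(op.status) &&
      (cutoff == null || op.applied_at > cutoff)    // ISO 8601 strings compare in time order
  );

  // Newest first.
  candidates.sort((a, b) =>
    a.applied_at < b.applied_at ? 1 : a.applied_at > b.applied_at ? -1 : 0
  );
  return candidates[0] ?? null;
}
```

The string comparison on `applied_at` relies on all timestamps being ISO 8601 in the same timezone and precision, which holds for values produced by `Date.prototype.toISOString()`.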
### Conflict resolution: theirs / mine / merge
The admin resolves pending conflicts via the Conflicts UI (`routes.js`):
- **theirs** -- `resolveConflict(opId, "theirs")` applies the incoming op now
under suppression and marks it `committed`, clearing `conflict_with_op_id`
(`apply.js`).
- **mine** -- `resolveConflict(opId, "mine")` marks the incoming op `rejected`
and leaves local state alone; future pulls skip it by idempotency
(`apply.js`).
- **merge** (update-vs-update only) -- `conflictFieldDiff` computes per-field
differences between the incoming patch and current local state
(`apply.js`); the admin picks current/incoming/custom per field, and
`resolveConflictByMerge` writes only the chosen fields and marks the op
`merged` (`apply.js`). Mergeability is gated to matching `update_<kind>`
op types on both sides (`routes.js`).
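
The per-field merge can be pictured with two small helpers. `conflictFieldDiff` is named in the text, but this body is a guess at its shape; `buildMergedPatch` and the `choices` format are hypothetical.

```javascript
// List the fields where the incoming patch disagrees with current local state.
function conflictFieldDiff(incomingPatch, localState) {
  return Object.keys(incomingPatch)
    .filter((f) => JSON.stringify(incomingPatch[f]) !== JSON.stringify(localState[f]))
    .map((f) => ({ field: f, current: localState[f], incoming: incomingPatch[f] }));
}

// Turn the admin's per-field choices into the patch that will be written.
// choices[field] is { pick: "current" | "incoming" | "custom", value? }.
function buildMergedPatch(diff, choices) {
  const patch = {};
  for (const { field, incoming } of diff) {
    const choice = choices[field] || { pick: "current" };
    if (choice.pick === "incoming") {
      patch[field] = incoming;
    } else if (choice.pick === "custom") {
      patch[field] = choice.value;
    }
    // "current" keeps the local value, so the field is left out of the patch.
  }
  return patch;
}
```

Only fields the admin moved away from "current" end up in the patch, which matches the description of writing only the chosen fields.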
## Environment identity (`_dd_env`)
Each instance has exactly one identity row in `_dd_env`, the singleton selected
by `getEnv` (`env.js`, `schema.js`).
| Column | Meaning |
| --- | --- |
| `env_id` | This instance's stable UUID; stamped as `source_env_id` on every op (`env.js`) |
| `env_label` | Optional human label (e.g. test, prod) shown in the admin UI |
| `on_destructive_op` | Destructive-op policy: `auto`, `confirm`, or `refuse`; defaults to `confirm` (`constants.js`, `schema.js`) |
| `require_tls` | Default TLS requirement flag (0/1) (`schema.js`) |
| `created_at` | ISO 8601 timestamp |
| `bootstrapped_at` | NULL until first-run backfill completes; gates backfill (`index.js`, `env.js`) |
`env_id` is created once by `initEnvIfMissing` as a random v4 UUID
(`env.js`). The env cache is keyed per tenant schema, not a module-level
singleton, so a multi-tenant process never shares one env across tenants
(`env.js`).
The `env_id` is the unit of pairing: an admin copies this instance's `env_id`
into a peer's add-peer form, and it is sent as the source identity on every signed
request (`routes.js`).
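
The per-tenant env cache mentioned earlier in this section can be sketched in a few lines. The `tenantSchema` argument and the `loadEnvRow` callback are hypothetical; the real `getEnv` presumably discovers the current tenant and queries `_dd_env` on its own.

```javascript
// One cached env row per tenant schema, never a single module-level value.
const envCache = new Map();

async function getEnv(tenantSchema, loadEnvRow) {
  if (!envCache.has(tenantSchema)) {
    envCache.set(tenantSchema, await loadEnvRow(tenantSchema));
  }
  return envCache.get(tenantSchema);
}
```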
## Data modes (user / starter / managed)
Per-table row propagation is governed by a data mode stored in `_dd_table_modes`,
keyed by `table_uuid` (`schema.js`). The Tables admin page sets these
(`routes.js`). The three modes (`constants.js`):
| Mode | Behavior |
| --- | --- |
| `user` (default) | Rows belong to the local environment; deploys never touch them. The only safe choice for end-user-entered data (`routes.js`). |
| `starter` | Rows ship to the target on first install (initial ship), then the target owns them; later source changes don't propagate (`routes.js`). |
| `managed` | Rows always sync from source; source is canonical, target edits get overwritten or surface as conflicts (`routes.js`). |
The Saltcorn `users` table is hard-locked to `user` and cannot be changed
(`routes.js`).
Switching a table to `managed` or `starter` adds a hidden `_dd_row_uuid` column,
backfills UUIDs onto existing rows, journals a `set_table_mode` op, and then
journals an `insert_row` op per existing row (the initial ship). For `starter`,
`markStarterShipped` then locks out further row ops (`routes.js`). Reverting to
`user` drops the hidden column on a best-effort basis (`routes.js`). On the
receiving side, `applySetTableMode`
records the mode and ensures the managed schema before any row ops arrive
(`apply.js`).
For the full row-identity and propagation mechanics (the `_dd_row_uuid` column,
`journalDecision`, portable row payloads, and the binary/file-fetch path), see
[managed-rows.md](./managed-rows.md).
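
As a compact restatement of the three modes, the decision of whether a row mutation should be journaled could be written as below. `decideRowJournaling` is a hypothetical helper that takes the `_dd_table_modes` row directly; the real `journalDecision` takes a table id, and its details are covered in managed-rows.md.

```javascript
// modeRow is the table's `_dd_table_modes` row, or undefined if none exists.
function decideRowJournaling(modeRow) {
  const mode = modeRow ? modeRow.mode : "user"; // `user` is the default
  switch (mode) {
    case "managed":
      return true;                               // always sync from source
    case "starter":
      return modeRow.starter_shipped_at == null; // only until the initial ship
    default:
      return false;                              // `user`: deploys never touch rows
  }
}
```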
## Plugin tables
All six tables are created idempotently by `createAllTables` (`schema.js`)
using portable TEXT/INTEGER types (no JSONB; booleans as 0/1) (`schema.js`).
Names are prefixed with the tenant schema via `db.getTenantSchemaPrefix()`.
| Table | Purpose | Defined at |
| --- | --- | --- |
| `_dd_env` | Singleton instance identity + policies | `schema.js` |
| `_dd_peers` | Configured peers; HMAC secret stored as hex ciphertext/iv/tag | `schema.js` |
| `_dd_entity_ids` | `(kind, current_id) -> uuid` mapping | `schema.js` |
| `_dd_ops` | Append-only ops journal | `schema.js` |
| `_dd_anchors` | Per-peer per-direction sync cursor (`last_op_id`) | `schema.js` |
| `_dd_table_modes` | Per-table data mode + `starter_shipped_at` | `schema.js` |
`_dd_peers` stores the shared secret as three hex TEXT columns
(`peer_secret_ciphertext`, `peer_secret_iv`, `peer_secret_tag`) rather than a
BLOB, because Saltcorn's SQLite insert layer would JSON-stringify a Buffer
(`schema.js`). The PK uses `integer` on SQLite and `serial` on Postgres for a
portable auto-increment (`schema.js`). `_dd_anchors` is keyed on
`(peer_id, direction)` where direction is `inbound` or `outbound`
(`schema.js`, `routes.js`). Both `_dd_ops.conflict_with_op_id` and
`_dd_table_modes.starter_shipped_at` are added by idempotent migrations for
older installs (`schema.js`).
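
The ciphertext/iv/tag column triple suggests an authenticated cipher. The sketch below assumes AES-256-GCM and a 32-byte key supplied by the caller; neither the cipher nor the key source is confirmed by this document, so treat it as one way to fill those three hex columns.

```javascript
const crypto = require("node:crypto");

// Assumption: AES-256-GCM with a 12-byte random IV. `key` must be 32 bytes.
function encryptSecret(secret, key) {
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(secret, "utf8"), cipher.final()]);
  return {
    peer_secret_ciphertext: ciphertext.toString("hex"),
    peer_secret_iv: iv.toString("hex"),
    peer_secret_tag: cipher.getAuthTag().toString("hex"),
  };
}

// Throws if the ciphertext or tag has been altered.
function decryptSecret(row, key) {
  const decipher = crypto.createDecipheriv(
    "aes-256-gcm",
    key,
    Buffer.from(row.peer_secret_iv, "hex")
  );
  decipher.setAuthTag(Buffer.from(row.peer_secret_tag, "hex"));
  const plaintext = Buffer.concat([
    decipher.update(Buffer.from(row.peer_secret_ciphertext, "hex")),
    decipher.final(),
  ]);
  return plaintext.toString("utf8");
}
```

Storing all three values as hex strings keeps them plain TEXT, which sidesteps the Buffer-serialization problem described above.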
## HTTP endpoints
Routes are declared in `routes.js`. Admin UI routes require an admin session
(`role_id === 1`, checked by `isAdmin`, `routes.js`). Machine API routes use
HMAC peer auth via `requirePeerAuth` and are CSRF-exempt (`noCsrf: true`,
registered into `disable_csrf_routes` at load) (`routes.js`, `index.js`).
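
For orientation, HMAC request signing of the kind `requirePeerAuth` relies on generally looks like the sketch below. The canonical string layout and the `signRequest`/`verifyRequest` names are assumptions for illustration; the plugin's actual header names and signed fields are not specified in this document.

```javascript
const crypto = require("node:crypto");

// Hypothetical canonical form: method, path, timestamp, sender env_id, body hash.
function signRequest(secret, { method, path, timestamp, envId, body }) {
  const bodyHash = crypto.createHash("sha256").update(body || "").digest("hex");
  const canonical = [method.toUpperCase(), path, timestamp, envId, bodyHash].join("\n");
  return crypto.createHmac("sha256", secret).update(canonical).digest("hex");
}

// Recompute the signature and compare in constant time.
function verifyRequest(secret, request, signatureHex) {
  const expected = Buffer.from(signRequest(secret, request), "hex");
  const given = Buffer.from(String(signatureHex), "hex");
  // timingSafeEqual throws on unequal lengths, so check the length first.
  return given.length === expected.length && crypto.timingSafeEqual(given, expected);
}
```

Including a timestamp in the signed string lets a receiver reject stale requests, although whether and how the plugin enforces a freshness window is not covered here.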
### Admin UI (session + admin role)
| Method | URL | Handler | Purpose |
| --- | --- | --- | --- |
| GET | `/admin/dev-deploy/` | `dashboard` | Env summary, op/entity counts, pending-conflict count (`routes.js`) |
| GET | `/admin/dev-deploy/ops` | `opsView` | Journal viewer; JSON if `Accept: application/json` (`routes.js`) |
| GET | `/admin/dev-deploy/peers` | `peersView` | List peers; add-peer form (`routes.js`) |
| POST | `/admin/dev-deploy/peers/add` | `peersAdd` | Pair a peer; secret shown once (`routes.js`) |
| POST | `/admin/dev-deploy/peers/rotate` | `peersRotate` | Rotate a peer secret (`routes.js`) |
| POST | `/admin/dev-deploy/peers/delete` | `peersDelete` | Delete a peer (`routes.js`) |
| GET | `/admin/dev-deploy/plan` | `planView` | Preview ops to send to a peer (`routes.js`) |
| POST | `/admin/dev-deploy/promote` | `promote` | Push ops since outbound anchor to a peer (`routes.js`) |
| POST | `/admin/dev-deploy/pull` | `pull` | Pull + apply ops since inbound anchor (`routes.js`) |
| POST | `/admin/dev-deploy/revert` | `revertView` | Append a compensating op (`routes.js`) |
| GET | `/admin/dev-deploy/tables` | `tablesView` | View/set per-table data mode (`routes.js`) |
| POST | `/admin/dev-deploy/tables/set` | `tablesSet` | Set a table's data mode (`routes.js`) |
| GET | `/admin/dev-deploy/conflicts` | `conflictsView` | List pending conflicts (`routes.js`) |
| POST | `/admin/dev-deploy/conflicts/resolve` | `conflictsResolve` | Resolve theirs/mine (`routes.js`) |
| GET | `/admin/dev-deploy/conflicts/merge` | `conflictsMergeView` | Per-field merge form (`routes.js`) |
| POST | `/admin/dev-deploy/conflicts/merge/apply` | `conflictsMergeApply` | Apply a per-field merge (`routes.js`) |
### Machine API (HMAC peer auth, CSRF-exempt)
| Method | URL | Handler | Purpose |
| --- | --- | --- | --- |
| GET | `/dev-deploy/api/journal` | `apiJournal` | Serve this env's ops since `?since=op_id` (`routes.js`) |
| POST | `/dev-deploy/api/ingest` | `apiIngest` | Receive + apply a batch of ops (`routes.js`) |
| GET | `/dev-deploy/api/file/:uuid` | `apiFile` | Stream a file's bytes for `create_file` apply (`routes.js`) |
| GET | `/dev-deploy/api/health` | `apiHealth` | Report env_id/label and installed plugin list (`routes.js`) |
Promote and pull advance the per-peer anchor after a successful exchange so the
next sync carries only new ops (`routes.js`). Both also
compare installed-plugin lists with the peer via `/dev-deploy/api/health` and
surface mismatches as warnings (`routes.js`).
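
The anchor behavior can be sketched with an in-memory journal and a `Map` of cursors. `opsSince`, the `peerId:direction` key format, and the `send` callback are hypothetical, and falling back to the full journal when the anchor is unknown is an assumption of this sketch.

```javascript
// `journal` is ordered by created_at; `lastOpId` is the anchor's last_op_id.
function opsSince(journal, lastOpId) {
  if (lastOpId == null) return journal.slice();
  const idx = journal.findIndex((op) => op.op_id === lastOpId);
  // Assumption: an unknown anchor resends everything; idempotent apply absorbs repeats.
  return idx === -1 ? journal.slice() : journal.slice(idx + 1);
}

// Push pending ops, then advance the outbound anchor only if the send succeeded.
async function promote(journal, anchors, peerId, send) {
  const key = `${peerId}:outbound`;
  const pending = opsSince(journal, anchors.get(key));
  if (pending.length === 0) return 0;
  await send(pending); // a throw here leaves the anchor where it was
  anchors.set(key, pending[pending.length - 1].op_id);
  return pending.length;
}
```

Because the anchor moves only after a successful exchange, a failed promote can simply be retried; the receiver's `op_id` idempotency check handles any ops that did arrive.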