# dev-deploy peering

How two dev-deploy instances find, authenticate, and exchange ops with each other: the peer model, the pairing flow, the HMAC wire protocol, the sync anchors that drive promote and pull, and how a standalone instance pairs with a single tenant on a multi-tenant server.

See also: [architecture.md](architecture.md) for the ops journal, stable UUIDs, and the apply pipeline; the [README](../README.md) for the full table and endpoint inventory.

## Contents

- [The peer model](#the-peer-model)
- [Pairing flow](#pairing-flow)
- [The HMAC wire protocol](#the-hmac-wire-protocol)
- [Promote, pull, and anchors](#promote-pull-and-anchors)
- [Mixed-topology peering](#mixed-topology-peering)
- [Endpoint reference](#endpoint-reference)
- [File reference](#file-reference)

## The peer model

A peer is one row in `_dd_peers` (defined in `lib/schema.js`). Each instance stores a row per peer it talks to; the relationship is configured independently on both sides (there is no central registry).

| Column | Type | Meaning |
| --- | --- | --- |
| `peer_id` | `serial` / `integer` PK | Local surrogate id (auto-assigned). |
| `env_id` | `TEXT` UNIQUE | The peer's dev-deploy `env_id` (the other side's `_dd_env.env_id`). |
| `label` | `TEXT` | Optional human label (e.g. `test`, `prod`). |
| `base_url` | `TEXT` | Where to reach the peer (e.g. `http://localhost:3001` or `https://tenant.example.com`). |
| `peer_secret_ciphertext` | `TEXT` | Sealed shared secret (hex). |
| `peer_secret_iv` | `TEXT` | AES-GCM IV (hex). |
| `peer_secret_tag` | `TEXT` | AES-GCM auth tag (hex). |
| `require_tls` | `INTEGER` | TLS-required flag (stored as 0/1). |
| `created_at` | `TEXT` | ISO 8601 creation time. |
| `last_seen_at` | `TEXT` | ISO 8601 of the last verified inbound request from this peer; `null` until first contact. |

### The sealed shared secret

The shared secret is 32 random bytes (`randomSecret()`, `lib/crypto.js`). It is never stored in plaintext.
At rest it is sealed with AES-256-GCM (`seal()`, `lib/crypto.js`) and split across the three `peer_secret_*` columns as hex. The hex-text storage is deliberate: Saltcorn's SQLite insert layer JSON-stringifies object values, which would mangle a raw `Buffer` column (`lib/schema.js`).

The 32-byte key-encryption key (KEK) used by `seal`/`open` is derived once per process via HKDF-SHA256 from `SALTCORN_SESSION_SECRET` (`getKek()`, `lib/crypto.js`; falls back to the Saltcorn `session_secret` config). Because the KEK is tied to the session secret, rotating `SALTCORN_SESSION_SECRET` invalidates every stored pairing -- existing ciphertexts no longer decrypt (documented in `lib/crypto.js`).

Plaintext only crosses the process boundary at two moments:

- At pairing time, when the operator copies the secret into the other side's pairing form.
- At HMAC sign/verify time, when `peerSecret()` (`lib/peers.js`) opens the sealed bytes to compute or check a signature.

`rowToPeer()` (`lib/peers.js`) deliberately omits the sealed columns from the plain accessor; callers must go through `peerSecret()` / `peerSecretByEnvId()`.

## Pairing flow

Pairing is symmetric: each side ends up with a `_dd_peers` row pointing at the other side's `env_id` and `base_url`, and both rows seal the *same* shared secret. One side generates the secret; the operator pastes it into the other.

Each instance's own `env_id` is shown on its Peers page (`peersView`, `lib/routes.js`): "This instance's env_id is ... Paste this into the other instance's peer form." The `env_id` itself is a random UUID minted once at bootstrap (`lib/env.js`).

Steps:

1. On instance A, open `/admin/dev-deploy/peers` and submit the **Add peer** form (`peersAdd`, `lib/routes.js`) with the peer's `env_id` (B's), an optional `label`, B's `base_url`, and an optional `require_tls` checkbox. Leave **Existing secret** blank.
2. `addPeer` (`lib/peers.js`) generates a fresh 32-byte secret, seals it, and inserts the row.
The plaintext secret is rendered once as 64 hex characters on the confirmation page (`lib/routes.js`) -- "it will not be shown again."
3. On instance B, open its own Peers page and submit **Add peer** with A's `env_id`, A's `base_url`, and paste the 64-hex secret into the **Existing secret** field. `peersAdd` validates it against `/^[0-9a-fA-F]{64}$/` (`lib/routes.js`) and passes it to `addPeer` as `existingSecret`, so B seals the identical secret rather than generating a new one.

After both rows exist, A and B share one secret and each knows the other's `env_id` and `base_url`. `env_id` is enforced UNIQUE, so re-adding the same peer fails with "peer with env_id ... already exists" (`lib/peers.js`).

### Rotation and deletion

- **Rotate** (`peersRotate`, `lib/routes.js` -> `rotatePeerSecret`, `lib/peers.js`) mints a new secret for an existing peer, re-seals it, and shows the new value once. The operator must paste the new secret on the other side (re-pair or rotate there) or the pairing breaks.
- **Delete** (`peersDelete`, `lib/routes.js` -> `deletePeer`, `lib/peers.js`) removes the `_dd_peers` row *and* deletes that peer's `_dd_anchors` rows, so a later re-pair starts syncing from the epoch again.

## The HMAC wire protocol

Every machine-API request is signed with the shared secret using HMAC-SHA256. The outbound side is `lib/transport.js`; the inbound check is `requirePeerAuth` (`lib/peerAuth.js`).

### Headers

| Header | Source | Meaning |
| --- | --- | --- |
| `X-DD-Env-Id` | sender's own `env_id` | Caller identity; the receiver looks it up in `_dd_peers` via `findPeerByEnvId` to find the matching secret. |
| `X-DD-Timestamp` | `String(Date.now())` | Milliseconds since epoch. |
| `X-DD-Nonce` | `randomNonce().toString("hex")` | 16 random bytes, hex (replay padding). |
| `X-DD-Signature` | `sign(secret, canonical)` | Hex HMAC-SHA256 over the canonical string. |

All four headers are required; a missing one returns `400 missing header ...` (`lib/peerAuth.js`).

When there is a request body, the sender sets `Content-Type: application/vnd.dev-deploy+json` (`lib/transport.js`). This custom type stops Saltcorn's `express.json()` middleware from consuming the request stream, so the receiver can read the exact raw bytes and HMAC them verbatim -- no re-serialization, no whitespace or key-order assumptions (`lib/peerAuth.js`).

### The canonical string

Both sides build the signed string with `buildCanonical` (`lib/crypto.js`). It is six fields joined by newlines (`\n`):

```
timestamp
nonce
METHOD
path
targetHost
sha256hex(body)
```

- `METHOD` is uppercased.
- `path` is the request path including query string. Outbound it is the literal `path` argument; inbound it is `req.originalUrl || req.url` (`lib/peerAuth.js`).
- `body` is hashed with SHA-256 (`sha256Hex`, `lib/crypto.js`); an empty body hashes the empty string. GET/HEAD never have a body (`lib/peerAuth.js`).

### Host binding (anti-cross-tenant replay)

`targetHost` is the normalized host the request is aimed at, and binding it into the signature is what stops a request signed for one tenant from being replayed against another tenant on the same multi-tenant server.

- Outbound, the host is derived from the peer's `base_url`: `normalizeHost(new URL(baseUrl).host)` (`lib/transport.js`).
- Inbound, it is derived from the request: prefer `X-Forwarded-Host` (first value, set by a trusted proxy), else the `Host` header, then normalized the same way (`lib/peerAuth.js`).

`normalizeHost` (`lib/crypto.js`) lowercases, trims, and drops a trailing `:80` or `:443` so both sides produce byte-identical strings (clients omit the default port from the `Host` header).
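The canonical string and host normalization described above can be sketched in a few lines of Node.js. This is a minimal illustration, not the code in `lib/crypto.js`: the object-style argument to `buildCanonical`, the regular expression in `normalizeHost`, and the sample hosts and path are assumptions made for the example. It shows that the same request signed for two different normalized hosts yields two different signatures.

```javascript
const crypto = require("node:crypto");

// Lowercase, trim, and drop a default :80 / :443 port (regex is an assumption).
function normalizeHost(host) {
  return String(host).trim().toLowerCase().replace(/:(80|443)$/, "");
}

// Hex SHA-256 of the raw body bytes; an empty body hashes the empty string.
function sha256Hex(body) {
  return crypto.createHash("sha256").update(body).digest("hex");
}

// Six fields joined by "\n". The object-style argument is hypothetical.
function buildCanonical({ timestamp, nonce, method, path, targetHost, body }) {
  return [
    timestamp,
    nonce,
    method.toUpperCase(),
    path,
    targetHost,
    sha256Hex(body),
  ].join("\n");
}

// Hex HMAC-SHA256 over the canonical string.
function sign(secret, canonical) {
  return crypto.createHmac("sha256", secret).update(canonical).digest("hex");
}

const secret = crypto.randomBytes(32);
const base = {
  timestamp: String(Date.now()),
  nonce: crypto.randomBytes(16).toString("hex"),
  method: "post",
  path: "/dev-deploy/api/ingest",
  body: Buffer.from('{"ops":[]}'),
};

// Same tenant, written two ways: normalization makes the signatures match.
const sigT1 = sign(secret, buildCanonical({ ...base, targetHost: normalizeHost("T1.example.com:443") }));
const sigT1Again = sign(secret, buildCanonical({ ...base, targetHost: normalizeHost("t1.example.com") }));
// Different tenant: the canonical string differs, so the signature differs.
const sigT2 = sign(secret, buildCanonical({ ...base, targetHost: normalizeHost("t2.example.com") }));

console.log(sigT1 === sigT1Again); // true
console.log(sigT1 === sigT2); // false
```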
Because the canonical includes `targetHost`, a signature computed for `t1.example.com` will not verify when the same bytes are re-sent to `t2.example.com`: the receiver rebuilds the canonical with its own host, the MAC differs, and verification fails with `401 bad signature`.

Note (`lib/peerAuth.js`): the receiver derives the host from the request, NOT from `peerRow.base_url`. Inbound, `base_url` is the *sender's* address (used for pull-back), not the receiver's own host.

### Verification order

`requirePeerAuth` (`lib/peerAuth.js`) checks, in order, and returns `null` (after sending a 4xx) on the first failure:

1. All four required headers present, else `400`.
2. Timestamp within the +/- 5 minute skew window (`timestampWithinSkew`, `lib/crypto.js`; `SKEW_TOLERANCE_MS = 5 * 60 * 1000`, `lib/crypto.js`), else `401 timestamp out of skew window`.
3. `X-DD-Env-Id` resolves to a `_dd_peers` row, else `401 unknown peer env_id`.
4. The peer has a sealed secret that opens, else `401 peer not provisioned`.
5. Signature matches via constant-time compare (`verifySignature`, `lib/crypto.js`, uses `crypto.timingSafeEqual`), else `401 bad signature`.
6. If there was a body, it parses as JSON (after the signature already covered the raw bytes), else `400 body is not valid JSON`.

On success it parses the body into `req.body`, advances the peer's `last_seen_at` (`touchPeerLastSeen`, `lib/peers.js`), sets `req.dd_peer` to the peer row, and returns it.

The nonce is sent and signed but the current code does not maintain a server-side seen-nonce cache; replay protection rests on the skew window and the host binding. (Stated to avoid over-claiming; no nonce store exists in the code read.)

## Promote, pull, and anchors

Sync progress is tracked per peer and per direction in `_dd_anchors` (`lib/schema.js`):

| Column | Meaning |
| --- | --- |
| `peer_id` | FK-by-convention to `_dd_peers.peer_id` (PK part). |
| `direction` | `outbound` or `inbound` (PK part). |
| `last_op_id` | The last op id synced in that direction for that peer. |
| `updated_at` | ISO 8601 of the last advance. |

`PRIMARY KEY (peer_id, direction)` means at most one outbound and one inbound watermark per peer.

Both promote and pull select only ops authored by the *local* env (`source_env_id = env.env_id`) and only those whose `created_at` is later than the anchor op's `created_at`. If there is no anchor, sync starts from the epoch (the whole journal). Helpers: `getOutboundAnchor` / `getInboundAnchor` / `upsertAnchor` (`lib/routes.js`).

### Promote (push ops to a peer)

`promote` (`lib/routes.js`):

1. Look up the peer and the local env; read the outbound anchor.
2. Select the local env's ops after the anchor, oldest first, `LIMIT 500` (`lib/routes.js`). If none, redirect with "no ops to promote".
3. `signedFetch` `POST /dev-deploy/api/ingest` with `{ ops }` and the peer's secret (`lib/routes.js`).
4. On success, advance the outbound anchor to the last op's `op_id` (`upsertAnchor(peerId, "outbound", ...)`, `lib/routes.js`).
5. Summarize applied/error counts from the response and append any plugin-version warnings from `diffPluginsWithPeer` (`lib/routes.js`, which calls `/dev-deploy/api/health`).

`planView` (`lib/routes.js`) is the dry run: same anchor-relative selection (`LIMIT 500`) but rendered as a preview table instead of being sent.

The receiving side, `apiIngest` (`lib/routes.js`), authenticates, applies the batch with `applyBatch`, and advances *its* `inbound` anchor for the sender to the last received `op_id` (`lib/routes.js`).

### Pull (fetch a peer's ops)

`pull` (`lib/routes.js`):

1. Read the inbound anchor; build the path `/dev-deploy/api/journal?since=` followed by the anchor's `last_op_id` (or no `since` if no anchor) (`lib/routes.js`).
2. `signedFetch` `GET` that path (`lib/routes.js`).
3. Apply the returned `ops` with `applyBatch` (`lib/routes.js`).
4. Advance the inbound anchor to the last pulled op's `op_id` (`lib/routes.js`).
5. Summarize applied/error/conflict counts and plugin warnings.

The serving side, `apiJournal` (`lib/routes.js`), returns the local env's ops after `since` (resolved to the op's `created_at`), oldest first, `LIMIT 1000` (`lib/routes.js`), as `{ source_env_id, ops }`.

## Mixed-topology peering

A standalone instance and a specific tenant on a multi-tenant server peer the same way as two standalone instances; the only difference is the `base_url`.

- Address the tenant by its tenant hostname as `base_url`, e.g. `https://tenant.example.com`. Saltcorn routes the request to that tenant by host, and dev-deploy's tables are schema-qualified per tenant (`db.getTenantSchemaPrefix()` is used throughout, e.g. `lib/routes.js`), so the peer row, ops, and anchors all live in that tenant's schema.
- The host binding makes this safe: the signature is computed over the tenant hostname (outbound from `base_url`; inbound from `X-Forwarded-Host` / `Host`). A request signed for one tenant cannot be replayed against another tenant on the same server, because each tenant's host produces a different canonical string (see [Host binding](#host-binding-anti-cross-tenant-replay)).
- Each side still stores the other's `env_id` and `base_url` in its own `_dd_peers`. A standalone instance points `base_url` at the tenant's hostname; the tenant points `base_url` back at the standalone instance's hostname.

If a reverse proxy fronts the tenants, it must set `X-Forwarded-Host` to the tenant hostname so the inbound canonical matches the outbound one (`lib/peerAuth.js`).

## Endpoint reference

All four machine-API routes are registered with `noCsrf: true` (`lib/routes.js`) and require HMAC peer auth via `requirePeerAuth`. The admin peer/sync routes require a session with admin role (`role_id === 1`, `isAdmin`, `lib/routes.js`) and use CSRF fields.
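Every machine-API route runs the same inbound checks listed under Verification order. The sketch below restates that order as a pure function so the sequence is easy to follow. It is a hypothetical stand-in, not `requirePeerAuth` itself: `checkRequest`, its argument shape, and the `peers` map of `env_id` to plaintext secret are assumptions, and step 4 (opening the sealed secret) is omitted because this example keeps secrets unsealed in memory.

```javascript
const crypto = require("node:crypto");

const SKEW_TOLERANCE_MS = 5 * 60 * 1000;
const REQUIRED = ["x-dd-env-id", "x-dd-timestamp", "x-dd-nonce", "x-dd-signature"];

const sha256Hex = (buf) => crypto.createHash("sha256").update(buf).digest("hex");
const hmac = (secret, text) => crypto.createHmac("sha256", secret).update(text).digest();

// Hypothetical pure-function version of the inbound check order.
// `peers` maps env_id -> secret Buffer (assumption for this sketch).
function checkRequest({ headers, method, path, host, rawBody }, peers, now = Date.now()) {
  // 1. All four headers present.
  for (const name of REQUIRED) {
    if (!headers[name]) return { status: 400, error: `missing header ${name}` };
  }
  // 2. Timestamp inside the +/- 5 minute window.
  const ts = Number(headers["x-dd-timestamp"]);
  if (!Number.isFinite(ts) || Math.abs(now - ts) > SKEW_TOLERANCE_MS) {
    return { status: 401, error: "timestamp out of skew window" };
  }
  // 3. Caller env_id must be a known peer.
  const secret = peers.get(headers["x-dd-env-id"]);
  if (!secret) return { status: 401, error: "unknown peer env_id" };
  // 5. Rebuild the canonical string from the request and compare in constant time.
  const canonical = [
    headers["x-dd-timestamp"], headers["x-dd-nonce"],
    method.toUpperCase(), path, host, sha256Hex(rawBody),
  ].join("\n");
  const expected = hmac(secret, canonical);
  const given = Buffer.from(headers["x-dd-signature"], "hex");
  if (given.length !== expected.length || !crypto.timingSafeEqual(given, expected)) {
    return { status: 401, error: "bad signature" };
  }
  // 6. Only now parse the body, after the raw bytes were authenticated.
  if (rawBody.length === 0) return { status: 200, body: null };
  try {
    return { status: 200, body: JSON.parse(rawBody.toString("utf8")) };
  } catch {
    return { status: 400, error: "body is not valid JSON" };
  }
}

// Build one correctly signed request aimed at t1.example.com.
const secret = crypto.randomBytes(32);
const peers = new Map([["env-a", secret]]);
const rawBody = Buffer.from('{"ops":[]}');
const ts = String(Date.now());
const nonce = crypto.randomBytes(16).toString("hex");
const path = "/dev-deploy/api/ingest";
const canonical = [ts, nonce, "POST", path, "t1.example.com", sha256Hex(rawBody)].join("\n");
const headers = {
  "x-dd-env-id": "env-a",
  "x-dd-timestamp": ts,
  "x-dd-nonce": nonce,
  "x-dd-signature": hmac(secret, canonical).toString("hex"),
};
const req = { headers, method: "POST", path, host: "t1.example.com", rawBody };

console.log(checkRequest(req, peers).status); // 200
console.log(checkRequest({ ...req, host: "t2.example.com" }, peers).error); // bad signature
```

Parsing the JSON only after the signature check means unauthenticated bytes never reach the parser, which mirrors the ordering of steps 5 and 6.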
### Machine API (HMAC peer auth)

| Method | Path | Handler | File | Purpose |
| --- | --- | --- | --- | --- |
| GET | `/dev-deploy/api/journal?since=op_id` | `apiJournal` | `lib/routes.js` | Return local env ops after `since`, oldest first, max 1000. Returns `{ source_env_id, ops }`. |
| POST | `/dev-deploy/api/ingest` | `apiIngest` | `lib/routes.js` | Apply `{ ops }` from a peer; advance that peer's inbound anchor. Returns `{ received, results }`. |
| GET | `/dev-deploy/api/file/:uuid` | `apiFile` | `lib/routes.js` | Stream a file entity's bytes by UUID (octet-stream). 404 if no `_dd_entity_ids` mapping for kind `file`. |
| GET | `/dev-deploy/api/health` | `apiHealth` | `lib/routes.js` | Return `{ env_id, label, plugins }` for plugin-drift checks. |

### Admin peer and sync routes (session + admin role)

| Method | Path | Handler | File | Purpose |
| --- | --- | --- | --- | --- |
| GET | `/admin/dev-deploy/peers` | `peersView` | `lib/routes.js` | List peers, show this env's `env_id`, add-peer form. |
| POST | `/admin/dev-deploy/peers/add` | `peersAdd` | `lib/routes.js` | Pair a peer; generate or accept a 64-hex secret. |
| POST | `/admin/dev-deploy/peers/rotate` | `peersRotate` | `lib/routes.js` | Rotate a peer's shared secret (shown once). |
| POST | `/admin/dev-deploy/peers/delete` | `peersDelete` | `lib/routes.js` | Delete a peer and its anchors. |
| GET | `/admin/dev-deploy/plan` | `planView` | `lib/routes.js` | Preview ops that would be promoted to a peer. |
| POST | `/admin/dev-deploy/promote` | `promote` | `lib/routes.js` | Push outbound ops to a peer via signed `ingest`. |
| POST | `/admin/dev-deploy/pull` | `pull` | `lib/routes.js` | Pull a peer's ops via signed `journal` and apply them. |

## File reference

| File | Responsibility |
| --- | --- |
| `lib/peers.js` | `_dd_peers` CRUD; seal/open the shared secret; `peerSecret`, `addPeer`, `rotatePeerSecret`, `deletePeer`, `touchPeerLastSeen`. |
| `lib/crypto.js` | AES-256-GCM seal/open, HKDF KEK, HMAC sign/verify, `buildCanonical`, `normalizeHost`, skew check, random secret/nonce. |
| `lib/transport.js` | Outbound signed requests: `signedFetch` (JSON) and `signedFetchBinary` (raw bytes). |
| `lib/peerAuth.js` | Inbound `requirePeerAuth`: header check, skew, peer lookup, raw-body HMAC verify, host binding. |
| `lib/routes.js` | Admin UI for pairing/plan/promote/pull and the four machine-API handlers. |
| `lib/schema.js` | `_dd_peers` (`:38`) and `_dd_anchors` (`:116`) table definitions. |
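As a companion to the `lib/crypto.js` row above, here is a minimal sketch of the seal/open round trip behind the three `peer_secret_*` hex columns. It is an illustration under stated assumptions rather than the plugin's implementation: the HKDF salt and info strings, the 12-byte IV length, and the function signatures are all placeholders chosen for the example. It also shows why a changed session secret breaks stored pairings.

```javascript
const crypto = require("node:crypto");

// Derive a 32-byte KEK with HKDF-SHA256. The salt and info strings are
// placeholders, not the values used by the plugin.
function deriveKek(sessionSecret) {
  return Buffer.from(
    crypto.hkdfSync("sha256", sessionSecret, "example-salt", "example-peer-kek", 32)
  );
}

// Seal a secret with AES-256-GCM and return the three hex columns.
function seal(kek, plaintext) {
  const iv = crypto.randomBytes(12); // 96-bit IV, a common GCM choice (assumed here)
  const cipher = crypto.createCipheriv("aes-256-gcm", kek, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  return {
    peer_secret_ciphertext: ciphertext.toString("hex"),
    peer_secret_iv: iv.toString("hex"),
    peer_secret_tag: cipher.getAuthTag().toString("hex"),
  };
}

// Open a sealed row; throws if the key, IV, tag, or ciphertext do not match.
function open(kek, row) {
  const decipher = crypto.createDecipheriv(
    "aes-256-gcm",
    kek,
    Buffer.from(row.peer_secret_iv, "hex")
  );
  decipher.setAuthTag(Buffer.from(row.peer_secret_tag, "hex"));
  return Buffer.concat([
    decipher.update(Buffer.from(row.peer_secret_ciphertext, "hex")),
    decipher.final(),
  ]);
}

const kek = deriveKek("session-secret-1");
const sharedSecret = crypto.randomBytes(32);
const row = seal(kek, sharedSecret);
console.log(open(kek, row).equals(sharedSecret)); // true

// A KEK derived from a different session secret cannot open the row,
// which is why rotating the session secret invalidates stored pairings.
let openedAfterRotation = true;
try {
  open(deriveKek("session-secret-2"), row);
} catch {
  openedAfterRotation = false;
}
console.log(openedAfterRotation); // false
```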