From 279f373f80c2e26ead6bd563ed24ecb6d2d0a8ff Mon Sep 17 00:00:00 2001 From: "alexei.dolgolyov" Date: Sat, 16 May 2026 22:00:41 +0300 Subject: [PATCH] docs(extra_json): policy doc for containers.extra_json evolution MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit New CODEMAPS/container-extra-json.md documents the contract every source plugin must follow when reading or writing containers.extra_json. Closes the open architectural question that was tracked in WORKLOAD_REFACTOR_TODO.md. Covers: - Schema position (column default, four write-path normalization sites) and ownership model (per-source row keys, current writers). - Reader rules: tolerate unknown keys via default json.Unmarshal, tolerate decode failure where first-class columns suffice. - Writer patterns: wholesale-overwrite (image source, single-writer short-lived rows) vs preserve-unknown-keys (static source, RMW with generic-map round-trip). Preserve-unknown-keys is the recommended default for new sources. - Concurrency: SetMaxOpenConns(1) + WAL gives atomic per-row writes and consistent reader snapshots, but does NOT serialize multi-goroutine RMW; a per-workload sync.Mutex is required for that (fenced by TestSaveState_ConcurrentWritesDoNotLoseUpdates). - What extra_json is NOT for (workload config, cross-source state, queryable data, secrets) and a checklist for adding a new field. - Pointers to every example in tree: image's containerExtra writer/reader, static's saveState round-trip, workload_runtime.go's decode-and-tolerate consumer. WORKLOAD_REFACTOR_TODO Container.extra_json question flipped to DONE. CODEMAPS/INDEX bumped + entry linked. Reviewer pass (code-reviewer subagent) caught one HIGH factual error (wrong cross-source consumer claim) and several MEDIUM/LOW drifts; all addressed inline before commit. 
--- docs/CODEMAPS/INDEX.md | 3 +- docs/CODEMAPS/container-extra-json.md | 105 ++++++++++++++++++++++++++ docs/WORKLOAD_REFACTOR_TODO.md | 14 ++-- 3 files changed, 115 insertions(+), 7 deletions(-) create mode 100644 docs/CODEMAPS/container-extra-json.md diff --git a/docs/CODEMAPS/INDEX.md b/docs/CODEMAPS/INDEX.md index 2843492..4a36d16 100644 --- a/docs/CODEMAPS/INDEX.md +++ b/docs/CODEMAPS/INDEX.md @@ -1,6 +1,6 @@ # Tinyforge Codemaps — Index -**Last Updated:** 2026-05-16 +**Last Updated:** 2026-05-16 (added `container-extra-json` policy doc) This directory contains architectural maps of key Tinyforge subsystems. Each codemap focuses on one major area: core data types, contract surfaces, integration points, and recipes for extending the system. @@ -8,6 +8,7 @@ This directory contains architectural maps of key Tinyforge subsystems. Each cod - **[Workload Plugin](./workload-plugin.md)** — Source × Trigger plugin contracts; registry lookups; webhook fan-out; how to add new kinds. - **[Discovery & Runtime API](./discovery-and-runtime.md)** — `/api/discovery/*` helpers (Git provider probe, repo/branch/tree pickers, image conflicts); `/api/workloads/{id}/runtime-state` + `/storage` + `/stop` + `/start`; SSRF-safe HTTP client in `internal/staticsite`. +- **[`containers.extra_json` Evolution Policy](./container-extra-json.md)** — Ownership model, reader/writer rules, wholesale-overwrite vs preserve-unknown-keys patterns, concurrency invariants; checklist for adding a new field without breaking older deployers. 
## Cross-References diff --git a/docs/CODEMAPS/container-extra-json.md b/docs/CODEMAPS/container-extra-json.md new file mode 100644 index 0000000..e10f5b8 --- /dev/null +++ b/docs/CODEMAPS/container-extra-json.md @@ -0,0 +1,105 @@ +# `containers.extra_json` — Evolution Policy + +**Last Updated:** 2026-05-16 + +`extra_json` is a TEXT column on the `containers` table that source plugins use to persist source-specific runtime state that hasn't been promoted to a first-class column. It is the single forward-compatibility seam between the canonical container row and per-source needs that arise after a schema is in production. + +This doc captures the rules every reader and writer must follow so new sources can extend the blob without breaking older ones. + +## Schema position + +- Column: `containers.extra_json TEXT NOT NULL DEFAULT '{}'` ([`internal/store/store.go:233`](../../internal/store/store.go#L233)). +- All four write paths (`CreateContainer`, `UpsertContainer`, `ReconcileContainer`, `UpdateContainer`) normalize `""` → `'{}'` before the SQL exec — readers can assume a non-empty JSON object string and never need to handle SQL `NULL` or the empty-string edge. +- Defined on the `Container` model: [`internal/store/models.go:342-347`](../../internal/store/models.go#L342-L347). + +## Ownership model + +**One container row → one owning source.** Sources never write to a row that belongs to another source. In practice: + +| Source kind | Row key | Number of rows per workload | Writes `extra_json` today? 
| +| ----------- | -------------------------------------- | --------------------------- | --------------------------- | +| `static` | deterministic `:site` | exactly 1 | yes (preserve-unknown-keys) | +| `image` | UUID per deployed container | 1 + N (blue-green rolls) | yes (wholesale-overwrite) | +| `compose` | deterministic `:` | N (one per compose service) | no — left at `'{}'` default | + +Two sources cannot contend on the same row, so the policy below is concerned with **forward compatibility across versions of the same source**, not cross-source contention. When compose (or any future source) starts writing `extra_json`, the same rules apply. + +## Reader rules — ALL readers + +1. **Tolerate unknown keys.** Decode into a typed struct using `encoding/json`; Go's default unmarshaller silently drops unknown keys, which is the desired behaviour. Never use `json.Decoder.DisallowUnknownFields()` on `extra_json`. +2. **Tolerate decode failure as non-fatal where the row's first-class columns are useful.** A corrupted `extra_json` is debug-logged and the reader falls back to zero state — see `workload_runtime.go:118-133` for the canonical pattern. The container's `ContainerID`, `State`, `ProxyRouteID`, etc. live in their own columns and are still trustworthy. +3. **Tolerate `''` and `'{}'`.** Both are equivalent to "no extras yet". Readers must short-circuit before json.Unmarshal to avoid `unexpected end of JSON input` on the empty case. + +## Writer rules — by mutation style + +Two distinct write patterns live in the codebase today. Pick the one that matches your source's needs. + +### Wholesale-overwrite (image source pattern) + +When the writer owns 100% of the blob's shape and discards old contents on every write: + +```go +// internal/workload/plugin/source/image/image.go:341-343 +extra := containerExtra{ProxyRoutes: faceRoutes} +if b, err := json.Marshal(extra); err == nil { + created.ExtraJSON = string(b) +} +``` + +- Cheap and simple. 
+- **Loses unknown keys written by future versions of the same source.** Only use when you are certain no other writer (including a future version of this code) needs to round-trip an unknown key. +- The `containerExtra` struct must be **additive-only**: never rename or remove a field once shipped, and never change its JSON type. Mark new fields with `omitempty` so that unset values are omitted from the blob and a reader running an older build after a downgrade never sees surprise nulls. + +### Preserve-unknown-keys (static source pattern) + +When future versions of the source (or sibling writers) may add fields and the current writer must round-trip them: + +```go +// internal/workload/plugin/source/static/state.go saveState +// 1. Decode existing blob into map[string]json.RawMessage. +// 2. Strip every key the current typed-state struct owns +// (runtimeStateKeys) so a cleared field actually drops. +// 3. Apply caller's mutate() to the typed state. +// 4. Re-marshal typed state, splice its keys back into the +// generic map (overwriting any historical sibling). +// 5. Marshal the merged map back into extra_json. +``` + +- Slightly more expensive (two decode/encode round-trips through `encoding/json`). +- Preserves keys the current writer doesn't know about, which is required for safe rolling deploys where a newer instance writes a new key and an older instance then reads, mutates, and writes back. +- Must declare the typed key set explicitly (`runtimeStateKeys`) so step 2 can strip them. This invariant is fenced by `TestRuntimeState_JSONTagsRoundTrip` in [`state_integration_test.go`](../../internal/workload/plugin/source/static/state_integration_test.go). + +**Default to preserve-unknown-keys for any new source.** Wholesale-overwrite is acceptable for the image source today because the row's lifetime is short (replaced on every blue-green roll) and only one writer touches it. Sources whose container rows are long-lived (static, future compose-with-stateful-services) should preserve unknown keys. 
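The five `saveState` steps listed above can be sketched as a standalone helper. This is a minimal illustration under stated assumptions, not the code in `state.go`: the `runtimeState` fields, the `stateKeys` slice, and the `mergeState` name are hypothetical stand-ins, and the per-workload lock and the store call are left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// runtimeState is a hypothetical typed state; the real struct in
// state.go may carry different fields.
type runtimeState struct {
	BuildSHA string `json:"build_sha,omitempty"`
	Healthy  bool   `json:"healthy,omitempty"`
}

// stateKeys lists every JSON key owned by runtimeState (a stand-in
// for the runtimeStateKeys slice the policy doc mentions).
var stateKeys = []string{"build_sha", "healthy"}

// mergeState applies mutate to the typed view of blob and returns a
// new blob that keeps every key runtimeState does not own.
func mergeState(blob string, mutate func(*runtimeState)) (string, error) {
	if blob == "" {
		blob = "{}"
	}
	// Step 1: decode the existing blob into a generic map.
	generic := map[string]json.RawMessage{}
	if err := json.Unmarshal([]byte(blob), &generic); err != nil {
		return "", err
	}
	if generic == nil { // a literal "null" blob decodes to a nil map
		generic = map[string]json.RawMessage{}
	}
	// Typed view of the same blob; unknown keys are dropped here.
	var st runtimeState
	if err := json.Unmarshal([]byte(blob), &st); err != nil {
		return "", err
	}
	// Step 2: strip owned keys so a cleared field actually disappears.
	for _, k := range stateKeys {
		delete(generic, k)
	}
	// Step 3: let the caller change the typed state.
	mutate(&st)
	// Step 4: re-marshal the typed state and splice its keys back.
	typed, err := json.Marshal(st)
	if err != nil {
		return "", err
	}
	owned := map[string]json.RawMessage{}
	if err := json.Unmarshal(typed, &owned); err != nil {
		return "", err
	}
	for k, v := range owned {
		generic[k] = v
	}
	// Step 5: marshal the merged map back into a blob.
	out, err := json.Marshal(generic)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	// future_key was written by a (hypothetical) newer version.
	old := `{"build_sha":"old","healthy":true,"future_key":42}`
	merged, err := mergeState(old, func(s *runtimeState) {
		s.BuildSHA = "new"
		s.Healthy = false // cleared: omitempty drops it from the blob
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(merged) // {"build_sha":"new","future_key":42}
}
```

In this sketch the cleared `healthy` key vanishes because it is stripped in step 2 and then omitted by `omitempty`, while `future_key` survives untouched. A wholesale-overwrite writer would have dropped `future_key` on the same input.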
+ +## Concurrency + +`UpsertContainer` is atomic at the SQL layer — SQLite serializes statements through one connection ([`internal/store/store.go:55`](../../internal/store/store.go#L55) `SetMaxOpenConns(1)`) with WAL mode enabled ([`store.go:60`](../../internal/store/store.go#L60)). That guarantees no torn write on a single row, and concurrent readers see a consistent snapshot — they read either the pre- or post-write state, never a half-applied one. + +What that does **not** guarantee is atomic read-modify-write across two Go goroutines. The static source serializes its RMW through a per-workload `sync.Mutex` keyed by workload ID (`internal/workload/plugin/source/static/state.go` `lockFor` + `saveState`). Any source that does its own read-modify-write on `extra_json` must do the same — verified in `TestSaveState_ConcurrentWritesDoNotLoseUpdates` (which loses 15+ markers per 20-writer run when the mutex is disabled, as confirmed in commit `ef62a41`). + +If a future source is purely wholesale-overwrite from a single writer, no lock is needed. + +## What `extra_json` is NOT for + +- **Workload-level config.** Workload config goes in `workloads.source_config` and is the operator's surface. +- **Cross-source state.** If two sources need the same data, promote it to a column. +- **Anything queryable.** SQLite can JSON-path `extra_json` but no index supports it; readers always pull the column wholesale and parse in Go. +- **Secrets.** Anything sensitive lives in `workload_env` (per-entry encrypt flag) or another encrypted table. + +## Adding a new field — checklist + +1. Add the field to your source's typed struct with `omitempty` and a stable `json:"snake_case"` tag. +2. If you use the **preserve-unknown-keys** pattern, add the JSON key to your `*Keys` slice (the equivalent of `runtimeStateKeys`). +3. Confirm older readers (older deploys of the same binary) still parse the blob — `encoding/json` should drop the unknown key silently. 
Add a regression test if there's any doubt. +4. Document the new field in this codemap if it's load-bearing for cross-source code (e.g., the static-only fields that `internal/api/workload_runtime.go` decodes). + +## Pointers + +- Container model + `ExtraJSON` comment: [`internal/store/models.go:342-347`](../../internal/store/models.go#L342-L347) +- Schema declaration: [`internal/store/store.go:233`](../../internal/store/store.go#L233) +- Store-level normalization (`'{}'` default) across all four write paths: [`internal/store/containers.go:42-43`](../../internal/store/containers.go#L42-L43) (CreateContainer), `:77-78` (UpsertContainer), `:129-130` (ReconcileContainer), `:321-322` (UpdateContainer). +- Wholesale-overwrite writer + struct: [`image.go:341-343`](../../internal/workload/plugin/source/image/image.go#L341-L343) writes; [`image.go:481-487`](../../internal/workload/plugin/source/image/image.go#L481-L487) defines `containerExtra`; [`image.go:449-456`](../../internal/workload/plugin/source/image/image.go#L449-L456) reads it back in Teardown. +- Preserve-unknown-keys example + concurrency lock: [`internal/workload/plugin/source/static/state.go`](../../internal/workload/plugin/source/static/state.go). +- Canonical "decode-and-tolerate" consumer (the only cross-source reader in tree today): [`internal/api/workload_runtime.go:118-133`](../../internal/api/workload_runtime.go#L118-L133) decodes the static-only typed fields and falls back to first-class columns when the blob is empty, is missing keys, or is malformed. + +Note: no cross-source consumer reads `extra_json` in `internal/store/`. The proxy/route data exposed by `ListProxyRoutes` ([`containers.go:196`](../../internal/store/containers.go#L196)) comes from first-class columns (`proxy_route_id`, `subdomain`, `port`); the `proxy_routes` map inside `extra_json` is read only by the image source's own Teardown for cleanup. 
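The "decode-and-tolerate" behaviour described for `workload_runtime.go` can be approximated by the following sketch. It is an illustration only: `staticExtras`, its `build_sha` field, and `decodeExtras` are hypothetical names, and logging through the standard `log` package is an assumption rather than what the real consumer does.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"strings"
)

// staticExtras is a hypothetical typed view of the blob; the real
// consumer decodes whatever fields the static source actually writes.
type staticExtras struct {
	BuildSHA string `json:"build_sha,omitempty"`
}

// decodeExtras returns the typed extras plus a flag that is false only
// when the blob was malformed. Callers keep using the row's first-class
// columns either way.
func decodeExtras(blob string) (staticExtras, bool) {
	trimmed := strings.TrimSpace(blob)
	// "" and "{}" both mean "no extras yet": skip the decoder entirely.
	if trimmed == "" || trimmed == "{}" {
		return staticExtras{}, true
	}
	var ex staticExtras
	// Default Unmarshal silently ignores keys the struct does not know.
	if err := json.Unmarshal([]byte(trimmed), &ex); err != nil {
		log.Printf("debug: ignoring malformed extra_json: %v", err)
		// Return a fresh zero value; ex may be partially filled.
		return staticExtras{}, false
	}
	return ex, true
}

func main() {
	for _, blob := range []string{"", `{"build_sha":"abc","future":1}`, `{not json`} {
		ex, ok := decodeExtras(blob)
		fmt.Printf("blob=%q sha=%q ok=%v\n", blob, ex.BuildSHA, ok)
	}
}
```

The boolean lets a caller distinguish "nothing stored yet" from "stored but unreadable" without turning either case into a request failure.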
diff --git a/docs/WORKLOAD_REFACTOR_TODO.md b/docs/WORKLOAD_REFACTOR_TODO.md index d20754c..30c9461 100644 --- a/docs/WORKLOAD_REFACTOR_TODO.md +++ b/docs/WORKLOAD_REFACTOR_TODO.md @@ -500,13 +500,15 @@ covers the use case — `promote-from` works, the UI shows the relationship. Probably can leave the legacy `stages` table dropped entirely once cutover proceeds. -### `Container.extra_json` evolution +### ~~`Container.extra_json` evolution~~ — DONE (2026-05-16) -Currently only the image source uses it (per-face proxy route IDs). If -other sources gain similar needs (compose service health metadata, static -build SHAs), the schema there should stay versionless and additive — every -reader must tolerate unknown keys. Document this in the source plugin -guide alongside the codemap entry. +Both writer patterns now have an active example in-tree (image source +clobbers, static source preserves) and the policy is documented in +[`docs/CODEMAPS/container-extra-json.md`](CODEMAPS/container-extra-json.md): +ownership model, wholesale-overwrite vs preserve-unknown-keys, reader +tolerance for unknown keys + decode failure, the per-workload mutex +requirement for any read-modify-write writer, and a checklist for adding +a new field without breaking older deployers. ## File pointers for the next session