docs(extra_json): policy doc for containers.extra_json evolution

New CODEMAPS/container-extra-json.md documents the contract every
source plugin must follow when reading or writing containers.extra_json.
Closes the open architectural question that was tracked in
WORKLOAD_REFACTOR_TODO.md.

Covers:
- Schema position (column default, four write-path normalization
  sites) and ownership model (per-source row keys, current writers).
- Reader rules: tolerate unknown keys via default json.Unmarshal,
  tolerate decode failure where first-class columns suffice.
- Writer patterns: wholesale-overwrite (image source, single-writer
  short-lived rows) vs preserve-unknown-keys (static source, RMW with
  generic-map round-trip). Preserve-unknown-keys is the recommended
  default for new sources.
- Concurrency: SetMaxOpenConns(1) + WAL gives atomic per-row writes
  and consistent reader snapshots, but does NOT serialize multi-
  goroutine RMW — a per-workload sync.Mutex is required for that
  (fenced by TestSaveState_ConcurrentWritesDoNotLoseUpdates).
- What extra_json is NOT for (workload config, cross-source state,
  queryable data, secrets) and a checklist for adding a new field.
- Pointers to every example in tree: image's containerExtra writer/
  reader, static's saveState round-trip, workload_runtime.go's
  decode-and-tolerate consumer.

WORKLOAD_REFACTOR_TODO Container.extra_json question flipped to DONE.
CODEMAPS/INDEX bumped + entry linked.

Reviewer pass (code-reviewer subagent) caught one HIGH factual error
(wrong cross-source consumer claim) and several MEDIUM/LOW drifts;
all addressed inline before commit.
This commit is contained in:
2026-05-16 22:00:41 +03:00
parent ea55d31177
commit 279f373f80
3 changed files with 115 additions and 7 deletions
+2 -1
View File
@@ -1,6 +1,6 @@
# Tinyforge Codemaps — Index # Tinyforge Codemaps — Index
**Last Updated:** 2026-05-16 **Last Updated:** 2026-05-16 (added `container-extra-json` policy doc)
This directory contains architectural maps of key Tinyforge subsystems. Each codemap focuses on one major area: core data types, contract surfaces, integration points, and recipes for extending the system. This directory contains architectural maps of key Tinyforge subsystems. Each codemap focuses on one major area: core data types, contract surfaces, integration points, and recipes for extending the system.
@@ -8,6 +8,7 @@ This directory contains architectural maps of key Tinyforge subsystems. Each cod
- **[Workload Plugin](./workload-plugin.md)** — Source × Trigger plugin contracts; registry lookups; webhook fan-out; how to add new kinds. - **[Workload Plugin](./workload-plugin.md)** — Source × Trigger plugin contracts; registry lookups; webhook fan-out; how to add new kinds.
- **[Discovery & Runtime API](./discovery-and-runtime.md)** — `/api/discovery/*` helpers (Git provider probe, repo/branch/tree pickers, image conflicts); `/api/workloads/{id}/runtime-state` + `/storage` + `/stop` + `/start`; SSRF-safe HTTP client in `internal/staticsite`. - **[Discovery & Runtime API](./discovery-and-runtime.md)** — `/api/discovery/*` helpers (Git provider probe, repo/branch/tree pickers, image conflicts); `/api/workloads/{id}/runtime-state` + `/storage` + `/stop` + `/start`; SSRF-safe HTTP client in `internal/staticsite`.
- **[`containers.extra_json` Evolution Policy](./container-extra-json.md)** — Ownership model, reader/writer rules, wholesale-overwrite vs preserve-unknown-keys patterns, concurrency invariants; checklist for adding a new field without breaking older deployers.
## Cross-References ## Cross-References
+105
View File
@@ -0,0 +1,105 @@
# `containers.extra_json` — Evolution Policy
**Last Updated:** 2026-05-16
`extra_json` is a TEXT column on the `containers` table that source plugins use to persist source-specific runtime state that hasn't been promoted to a first-class column. It is the single forward-compatibility seam between the canonical container row and per-source needs that arise after a schema is in production.
This doc captures the rules every reader and writer must follow so new sources can extend the blob without breaking older ones.
## Schema position
- Column: `containers.extra_json TEXT NOT NULL DEFAULT '{}'` ([`internal/store/store.go:233`](../../internal/store/store.go#L233)).
- All four write paths (`CreateContainer`, `UpsertContainer`, `ReconcileContainer`, `UpdateContainer`) normalize `""``'{}'` before the SQL exec — readers can assume a non-empty JSON object string and never need to handle SQL `NULL` or the empty-string edge.
- Defined on the `Container` model: [`internal/store/models.go:342-347`](../../internal/store/models.go#L342-L347).
## Ownership model
**One container row → one owning source.** Sources never write to a row that belongs to another source. In practice:
| Source kind | Row key | Number of rows per workload | Writes `extra_json` today? |
| ----------- | -------------------------------------- | --------------------------- | --------------------------- |
| `static` | deterministic `<workloadID>:site` | exactly 1 | yes (preserve-unknown-keys) |
| `image` | UUID per deployed container | 1 + N (blue-green rolls) | yes (wholesale-overwrite) |
| `compose` | deterministic `<workloadID>:<service>` | N (one per compose service) | no — left at `'{}'` default |
Two sources cannot contend on the same row, so the policy below is concerned with **forward compatibility across versions of the same source**, not cross-source contention. When compose (or any future source) starts writing `extra_json`, the same rules apply.
## Reader rules — ALL readers
1. **Tolerate unknown keys.** Decode into a typed struct using `encoding/json`; Go's default unmarshaller silently drops unknown keys, which is the desired behaviour. Never use `json.Decoder.DisallowUnknownFields()` on `extra_json`.
2. **Tolerate decode failure as non-fatal where the row's first-class columns are useful.** A corrupted `extra_json` is debug-logged and the reader falls back to zero state — see `workload_runtime.go:118-133` for the canonical pattern. The container's `ContainerID`, `State`, `ProxyRouteID`, etc. live in their own columns and are still trustworthy.
3. **Tolerate `''` and `'{}'`.** Both are equivalent to "no extras yet". Readers must short-circuit before json.Unmarshal to avoid `unexpected end of JSON input` on the empty case.
## Writer rules — by mutation style
Two distinct write patterns live in the codebase today. Pick the one that matches your source's needs.
### Wholesale-overwrite (image source pattern)
When the writer owns 100% of the blob's shape and discards old contents on every write:
```go
// internal/workload/plugin/source/image/image.go:341-343
extra := containerExtra{ProxyRoutes: faceRoutes}
if b, err := json.Marshal(extra); err == nil {
created.ExtraJSON = string(b)
}
```
- Cheap and simple.
- **Loses unknown keys written by future versions of the same source.** Only use when you are certain no other writer (including a future version of this code) needs to round-trip an unknown key.
- The `containerExtra` struct must be **additive-only**: never rename or remove a field once shipped, and never change its JSON type. Mark new fields with `omitempty` so older readers downgrading to an older codebase don't see surprise nulls.
### Preserve-unknown-keys (static source pattern)
When future versions of the source (or sibling writers) may add fields and the current writer must round-trip them:
```go
// internal/workload/plugin/source/static/state.go saveState
// 1. Decode existing blob into map[string]json.RawMessage.
// 2. Strip every key the current typed-state struct owns
// (runtimeStateKeys) so a cleared field actually drops.
// 3. Apply caller's mutate() to the typed state.
// 4. Re-marshal typed state, splice its keys back into the
// generic map (overwriting any historical sibling).
// 5. Marshal the merged map back into extra_json.
```
- Slightly more expensive (two round-trips through `json`).
- Preserves keys the current writer doesn't know about — required for safe rolling deploys where a newer instance writes a new key, an older instance then reads, mutates, and writes back.
- Must declare the typed key set explicitly (`runtimeStateKeys`) so step 2 can strip them. This invariant is fenced by `TestRuntimeState_JSONTagsRoundTrip` in [`state_integration_test.go`](../../internal/workload/plugin/source/static/state_integration_test.go).
**Default to preserve-unknown-keys for any new source.** Wholesale-overwrite is acceptable for the image source today because the row's lifetime is short (replaced on every blue-green roll) and only one writer touches it. Sources whose container rows are long-lived (static, future compose-with-stateful-services) should preserve unknown keys.
## Concurrency
`UpsertContainer` is atomic at the SQL layer — SQLite serializes statements through one connection ([`internal/store/store.go:55`](../../internal/store/store.go#L55) `SetMaxOpenConns(1)`) with WAL mode enabled ([`store.go:60`](../../internal/store/store.go#L60)). That guarantees no torn write on a single row, and concurrent readers see a consistent snapshot — they read either the pre- or post-write state, never a half-applied one.
What that does **not** guarantee is atomic read-modify-write across two Go goroutines. The static source serializes its RMW through a per-workload `sync.Mutex` keyed by workload ID (`internal/workload/plugin/source/static/state.go` `lockFor` + `saveState`). Any source that does its own read-modify-write on `extra_json` must do the same — verified in `TestSaveState_ConcurrentWritesDoNotLoseUpdates` (which loses 15+ markers per 20-writer run when the mutex is disabled, as confirmed in commit `ef62a41`).
If a future source is purely wholesale-overwrite from a single writer, no lock is needed.
## What `extra_json` is NOT for
- **Workload-level config.** Workload config goes in `workloads.source_config` and is the operator's surface.
- **Cross-source state.** If two sources need the same data, promote it to a column.
- **Anything queryable.** SQLite can JSON-path `extra_json` but no index supports it; readers always pull the column wholesale and parse in Go.
- **Secrets.** Anything sensitive lives in `workload_env` (per-entry encrypt flag) or another encrypted table.
## Adding a new field — checklist
1. Add the field to your source's typed struct with `omitempty` and a stable `json:"snake_case"` tag.
2. If you use the **preserve-unknown-keys** pattern, add the JSON key to your `*Keys` slice (the equivalent of `runtimeStateKeys`).
3. Confirm older readers (older deploys of the same binary) still parse the blob — `encoding/json` should drop the unknown key silently. Add a regression test if there's any doubt.
4. Document the new field in this codemap if it's load-bearing for cross-source code (e.g., the proxy_routes map drives `ListProxyRoutes`).
## Pointers
- Container model + `ExtraJSON` comment: [`internal/store/models.go:342-347`](../../internal/store/models.go#L342-L347)
- Schema declaration: [`internal/store/store.go:233`](../../internal/store/store.go#L233)
- Store-level normalization (`'{}'` default) across all four write paths: [`internal/store/containers.go:42-43`](../../internal/store/containers.go#L42-L43) (CreateContainer), `:77-78` (UpsertContainer), `:129-130` (ReconcileContainer), `:321-322` (UpdateContainer).
- Wholesale-overwrite writer + struct: [`image.go:341-343`](../../internal/workload/plugin/source/image/image.go#L341-L343) writes; [`image.go:481-487`](../../internal/workload/plugin/source/image/image.go#L481-L487) defines `containerExtra`; [`image.go:449-456`](../../internal/workload/plugin/source/image/image.go#L449-L456) reads it back in Teardown.
- Preserve-unknown-keys example + concurrency lock: [`internal/workload/plugin/source/static/state.go`](../../internal/workload/plugin/source/static/state.go).
- Canonical "decode-and-tolerate" consumer (the only cross-source reader in tree today): [`internal/api/workload_runtime.go:118-133`](../../internal/api/workload_runtime.go#L118-L133) decodes the static-only typed fields and falls back to first-class columns when the blob is empty, missing keys, or malformed.
Note: no cross-source consumer reads `extra_json` in `internal/store/`. The proxy/route data exposed by `ListProxyRoutes` ([`containers.go:196`](../../internal/store/containers.go#L196)) comes from first-class columns (`proxy_route_id`, `subdomain`, `port`); the `proxy_routes` map inside `extra_json` is read only by the image source's own Teardown for cleanup.
+8 -6
View File
@@ -500,13 +500,15 @@ covers the use case — `promote-from` works, the UI shows the relationship.
Probably can leave the legacy `stages` table dropped entirely once cutover Probably can leave the legacy `stages` table dropped entirely once cutover
proceeds. proceeds.
### `Container.extra_json` evolution ### ~~`Container.extra_json` evolution~~ — DONE (2026-05-16)
Currently only the image source uses it (per-face proxy route IDs). If Both writer patterns now have an active example in-tree (image source
other sources gain similar needs (compose service health metadata, static clobbers, static source preserves) and the policy is documented in
build SHAs), the schema there should stay versionless and additive — every [`docs/CODEMAPS/container-extra-json.md`](CODEMAPS/container-extra-json.md):
reader must tolerate unknown keys. Document this in the source plugin ownership model, wholesale-overwrite vs preserve-unknown-keys, reader
guide alongside the codemap entry. tolerance for unknown keys + decode failure, the per-workload mutex
requirement for any read-modify-write writer, and a checklist for adding
a new field without breaking older deployers.
## File pointers for the next session ## File pointers for the next session