Restore a captured volume snapshot onto an image workload's live host-bind
data volumes, then redeploy. This is the most destructive workload action,
so it is built to the adversarially reviewed design (C1–C6) with all of its
data-loss guards in place.
- Engine.Restore (engine-owned): all-or-nothing pre-flight re-resolution from
the workload's CURRENT config (never the tamperable manifest), per-filesystem
disk pre-check, per-workload lock, container quiesce, extract-to-tmp, durable
pre-restore snapshot, write-ahead journal, atomic rename swap, redeploy, and
crash-recovery sweep (RecoverInterruptedRestores) wired before serving.
- internal/keyedmutex: shared per-key lock; deployer now serializes every
deploy entrypoint per workload via DispatchPlugin. LockWorkload and
RedeployLocked let the restore re-dispatch while it already holds the lock,
so it does not deadlock.
- Untrusted-archive extractor: zip-slip containment, type allow-list (reg/dir
only), decompression-bomb cap, manifest-index bounds.
- POST /api/workloads/{id}/snapshots/{sid}/restore: admin-only, requires the
X-Confirm-Restore header (CSRF guard), per-workload single-flight (409 on
a concurrent restore).
- WebUI: Restore button + danger ConfirmDialog + busy state + i18n (en/ru).
Scope: image-source workloads only; volume scopes absolute/stage/project
(driven by the same supportedScopes constant that capture uses).
Plan-reviewed before coding; per-phase go/security/ts reviews; final review
verdict: READY TO MERGE. The security review caught a CRITICAL manifest-Source
path traversal, fixed by re-deriving the target from current config and
enforcing base containment.
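
The base-containment idea behind the zip-slip guard and the traversal fix can
be illustrated as follows. This is a sketch under assumptions: safeJoin is a
hypothetical helper, and the real extractor's names and exact rules are not
shown in this message.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// safeJoin (hypothetical) joins an untrusted entry name onto base and
// rejects any result that would land outside base.
func safeJoin(base, name string) (string, error) {
	if filepath.IsAbs(name) {
		return "", fmt.Errorf("absolute entry path %q", name)
	}
	target := filepath.Join(base, name) // Join also cleans ".." segments
	rel, err := filepath.Rel(base, target)
	if err != nil {
		return "", err
	}
	if rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", fmt.Errorf("entry %q escapes %q", name, base)
	}
	return target, nil
}
```

A lexical check like this does not resolve symlinks, which is one reason a
type allow-list of regular files and directories complements it.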
Plan: plans/volume-snapshot-restore/
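
For readers unfamiliar with the per-key lock pattern named above for
internal/keyedmutex, here is a minimal sketch. It is not the package's code:
the KeyedMutex type, the Lock method returning an unlock func, and the
refcounted map are all assumptions made for illustration.

```go
package main

import "sync"

// KeyedMutex is a hypothetical per-key lock: callers using different keys
// never block each other, callers using the same key are serialized.
type KeyedMutex struct {
	mu    sync.Mutex
	locks map[string]*entry
}

type entry struct {
	mu   sync.Mutex
	refs int // holders plus waiters; the entry is dropped when it reaches 0
}

// Lock blocks until the lock for key is held and returns its unlock func.
func (k *KeyedMutex) Lock(key string) (unlock func()) {
	k.mu.Lock()
	if k.locks == nil {
		k.locks = make(map[string]*entry)
	}
	e, ok := k.locks[key]
	if !ok {
		e = &entry{}
		k.locks[key] = e
	}
	e.refs++
	k.mu.Unlock()

	e.mu.Lock() // wait outside k.mu so other keys stay unblocked

	return func() {
		e.mu.Unlock()
		k.mu.Lock()
		e.refs--
		if e.refs == 0 {
			delete(k.locks, key)
		}
		k.mu.Unlock()
	}
}
```

Keying on the workload ID would give the per-workload serialization described
above while leaving unrelated workloads free to deploy concurrently.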
Closes the workload-first refactor by landing the Priority 3 polish
items and filling the Priority 4 test gap. Net: ~2,400 lines added,
~350 lines modified across 13 files.
Priority 3 — polish
- apps.* i18n namespace: 276 new keys across apps.list.* (27),
apps.new.* (91, sibling of existing apps.new.triggers.*), and
apps.detail.* (158, sibling of existing apps.detail.bindings.*).
EN+RU at 1314 keys each, perfectly in sync. /apps, /apps/new,
/apps/[id] now render entirely from i18n.
- New codemap docs/CODEMAPS/workload-plugin.md (238 lines):
Source × Trigger contract, dispatch seam, webhook fan-out path,
recipes for adding a new Source or Trigger kind. Also adds a
docs/CODEMAPS/INDEX.md gateway.
Priority 4 — tests
- internal/api/workloads_test.go (new, ~30 subtests): /api/workloads
CRUD + deploy + delete + env + volumes + chain + promote-from +
triggers list/inline-bind + auth gating + standalone /api/triggers
CRUD (create / dup-409 / kind filter / delete). Uses real
POST handlers via httptest.NewServer + a fake plugin source
registered under "testfakesource".
- internal/deployer/dispatch_test.go (new, 11 tests):
DispatchPlugin / DispatchTeardown / DispatchReconcile happy +
unknown-kind + propagated-error each; PluginDeps wiring; a real
2s-bounded RWMutex deadlock probe on PluginDeps vs SetDNSProvider.
- internal/workload/plugin/source/compose/compose_test.go (new,
~26 subtests): composeProjectName sanitization,
writeYAML / writeYAMLIfChanged hash short-circuit, Validate happy
+ bad inputs, Kind / SchemaSample.
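
The httptest.NewServer style used by the API tests, including a dup-409
check, looks roughly like this. Everything here is a stand-in: the handler,
the ?name= query parameter, and the helper names are hypothetical and only
mimic the shape of the real /api/triggers tests.

```go
package main

import (
	"net/http"
	"net/http/httptest"
	"strings"
	"sync"
)

// newFakeTriggerHandler is a hypothetical stand-in for the real handler: it
// accepts POST /api/triggers and answers 409 when a name is created twice.
func newFakeTriggerHandler() http.Handler {
	var mu sync.Mutex
	seen := map[string]bool{}
	mux := http.NewServeMux()
	mux.HandleFunc("/api/triggers", func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			w.WriteHeader(http.StatusMethodNotAllowed)
			return
		}
		name := r.URL.Query().Get("name")
		mu.Lock()
		defer mu.Unlock()
		if seen[name] {
			w.WriteHeader(http.StatusConflict)
			return
		}
		seen[name] = true
		w.WriteHeader(http.StatusCreated)
	})
	return mux
}

// postStatus sends a small JSON POST to url and returns the status code.
func postStatus(url string) (int, error) {
	resp, err := http.Post(url, "application/json", strings.NewReader("{}"))
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

// createTwice starts a throwaway server, posts the same trigger name twice,
// and returns both status codes.
func createTwice(name string) (first, second int, err error) {
	srv := httptest.NewServer(newFakeTriggerHandler())
	defer srv.Close()
	url := srv.URL + "/api/triggers?name=" + name
	if first, err = postStatus(url); err != nil {
		return
	}
	second, err = postStatus(url)
	return
}
```

Going through a real HTTP round trip exercises routing and status codes that
a direct function call on the handler logic would skip.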
Coverage delta on the workload-plugin path:
- internal/api: 1.1% → 16.0%
- internal/deployer: 0% → 54.1%
- internal/workload/plugin/source/compose: 0% → 38.5%
- Trigger plugins already at 87-95% from the trigger-split work.
Production fix surfaced by the tests
- store.CreateWorkload now self-references RefID = ID when the caller
leaves RefID empty (the typical plugin-native path). The api
layer's broken backfill loop is gone; it called UpdateWorkload, which
deliberately omits ref_id, so it never took effect. Multiple sibling
plugin workloads can now coexist under the UNIQUE(kind, ref_id)
constraint.
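
The RefID default described above amounts to the following. This is a sketch
with a pared-down, hypothetical Workload struct and helper name; the real
store code and schema are not reproduced here.

```go
package main

// Workload is a pared-down stand-in for the stored record; only the fields
// relevant to the UNIQUE(kind, ref_id) constraint are shown.
type Workload struct {
	ID    string
	Kind  string
	RefID string
}

// withDefaultRefID (hypothetical) mirrors the described behaviour: an empty
// RefID falls back to the workload's own ID before the row is inserted.
func withDefaultRefID(w Workload) Workload {
	if w.RefID == "" {
		w.RefID = w.ID
	}
	return w
}

// uniqueKey builds the (kind, ref_id) pair the constraint is defined over.
func uniqueKey(w Workload) [2]string {
	return [2]string{w.Kind, w.RefID}
}
```

Without the default, two workloads of the same kind that both leave RefID
empty would share the key (kind, "") and collide on insert.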
Review fixes addressed before commit
- CRITICAL: deadlock-detect test gained a real 2s time.After (was
selecting on context.Background().Done() which never fires).
- HIGH: happy-path test now hard-asserts RefID = ID (was a t.Logf
that would silently pass after a production fix).
- HIGH: standalone /api/triggers CRUD coverage added (was bypassed
by the workload-side bind flow).
- HIGH: seedWorkload bypass deleted; tests now go through the
real POST /api/workloads handler.
- MEDIUM: withTempDir restore is a no-op (t.Setenv auto-restores);
dead `old := os.Getenv(...)` capture removed.
- MEDIUM: list-workloads test now asserts ID membership, not just
count.
Doc
- WORKLOAD_REFACTOR_TODO: all three Priority 1 items, Priority 3
polish, and Priority 4 tests marked DONE. The workload-first arc
is closed.