Restore a captured volume snapshot onto an image workload's live host-bind
data volumes, then redeploy — the most destructive workload action, built to
the adversarially-reviewed design (C1–C6) with all data-loss guards.
- Engine.Restore (engine-owned): all-or-nothing pre-flight re-resolution from
the workload's CURRENT config (never the tamperable manifest), per-filesystem
disk pre-check, per-workload lock, container quiesce, extract-to-tmp, durable
pre-restore snapshot, write-ahead journal, atomic rename swap, redeploy, and
crash-recovery sweep (RecoverInterruptedRestores) wired before serving.
- internal/keyedmutex: shared per-key lock; deployer now serializes every
deploy entrypoint per workload via DispatchPlugin (+ LockWorkload/RedeployLocked
for the restore re-dispatch, no deadlock).
- Untrusted-archive extractor: zip-slip containment, type allow-list (reg/dir
only), decompression-bomb cap, manifest-index bounds.
- POST /api/workloads/{id}/snapshots/{sid}/restore: admin, X-Confirm-Restore
header (CSRF), per-workload single-flight (409).
- WebUI: Restore button + danger ConfirmDialog + busy state + i18n (en/ru).
Scope: image-source only; scopes absolute/stage/project (driven off the same
supportedScopes constant capture uses).
Plan-reviewed before coding; per-phase go/security/ts reviews; final review
READY TO MERGE. Security review caught + fixed a CRITICAL manifest-Source path
traversal (re-derive target from current config + base containment).
Plan: plans/volume-snapshot-restore/
Phase 2: Engine.Restore orchestration + lifecycle/locking + rollback
Status: ✅ Complete
Parent plan: PLAN.md
Domain: backend
Objective
Wire the Phase 1 primitives into the full stop → swap → redeploy sequence under a
per-workload lock, with crash-safe rollback (journal + recovery sweep) and a durable
pre-restore auto-capture. Define the Lifecycle seam; modify the Deployer for per-workload
locking + an unlocked redeploy.
Tasks
- `internal/keyedmutex/keyedmutex.go`: extract the `gitops.go` pattern into a shared package, `type Mutex` with `Lock(key string) func()` and `TryLock(key string) (func(), bool)` (the Try variant serves the Phase 3 API single-flight → 409). Unit test both.
- Deployer locking (C1) in `internal/deployer/`:
  - add `workloadLocks keyedmutex.Mutex` field.
  - refactor `DispatchPlugin` → `unlock := d.workloadLocks.Lock(w.ID); defer unlock(); return d.dispatchLocked(ctx, w, intent)`; move the current body into unexported `dispatchLocked`.
  - wrap `DispatchTeardown` in the same per-workload lock.
  - do NOT lock `DispatchReconcile` (periodic; image Reconcile is a no-op; reconciler `markMissingRows` only flips labels = benign; locking it would stall the reconcile loop behind long deploys).
  - expose `func (d *Deployer) LockWorkload(id string) func()` and `func (d *Deployer) RedeployLocked(ctx, w, intent) error` (= `dispatchLocked`, doc: "caller already holds the workload lock; calling DispatchPlugin would deadlock").
- add `volsnap.Lifecycle` interface (in volsnap):
  - `Lock(workloadID string) func()`
  - `StopContainers(ctx, workloadID string) (runningTag string, err error)`: stop every running container for the workload; return the newest-running container's `ImageTag` (so redeploy pins the same version; empty ⇒ source default). Mark stopped rows `State="stopped"`.
  - `Redeploy(ctx, w store.Workload, reference string) error`: unlocked re-dispatch, Reason `"restore"`, Reference=tag.
- `Engine.Restore(ctx, snapshotID, workloadID string) error` in `internal/volsnap/restore.go` (engine owns it). Sequence (does NOT hold `e.mu`, R1):
  1. load snap; verify `snap.WorkloadID == workloadID`; load workload + settings; require `source_kind=="image"`.
  2. `parseManifest`; `preflightResolve` (C3: abort if any fails); `archiveUncompressedSize` + per-filesystem `freeDiskBytes` pre-check (C5/R4: abort).
  3. `unlock := lc.Lock(workloadID); defer unlock()` (C1).
  4. re-validate the workload still exists (R4: teardown may have won the lock); abort if gone.
  5. `tag, _ := lc.StopContainers(ctx, workloadID)` (C4 stop).
  6. durably capture pre-restore snapshot: `e.Create(w, settings, "pre-restore")` (folded; AFTER stop = quiesced; BEFORE any rename = R3). `Create` takes its own `e.mu`, so Restore must hold none.
  7. write restore journal `<snapDir>/restore-<workloadID>.json` (snapshotID, per-volume {live, old, tmp, swapped:false}).
  8. extract ALL volumes to their `tmp` staging dirs (`safeExtractIndex`): R3 (shrinks the destructive window to pure renames).
  9. swap each volume (`swapVolumeDir`), updating the journal `swapped=true` per volume.
  10. on ANY error in 8–9 → `rollbackSwaps` + `lc.Redeploy(ctx, w, tag)` + delete journal + return wrapped error.
  11. success → `lc.Redeploy(ctx, w, tag)` (C4 redeploy); remove `.old` staging dirs (reclaim disk); delete journal; best-effort audit event (`store.InsertEvent` source `"volsnap"`).
- `Engine.SetLifecycle(lc Lifecycle)` setter; `Restore` errors clearly if lifecycle is nil.
- `Engine.RecoverInterruptedRestores() (int, error)` (R3): startup sweep, mirrors `CleanOrphans`. For each `restore-*.json` journal, per volume: if `swapped` → remove `old` + `tmp`; else if live missing && old exists → rename old→live (revert mid-rename crash), remove tmp; else (live present, not swapped) → remove tmp. Delete journal. Log loudly. (Wiring at startup happens in Phase 3's main.go change, beside `CleanOrphans`.)
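The per-volume recovery rule in the last task can be read as a three-way decision. The following is a minimal sketch of that decision, not the repository's `recoverVolume`: the names `journalVolume`, `decide`, `recoverOne`, and the action constants are hypothetical, and the fallthrough branch also covers the case where neither live nor old exists, which the task text does not spell out.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// journalVolume mirrors the per-volume journal fields named in the plan
// ({live, old, tmp, swapped}); the Go field names are assumptions.
type journalVolume struct {
	Live, Old, Tmp string
	Swapped        bool
}

type action string

const (
	actCleanup action = "cleanup"     // swap finished: remove old + tmp
	actRevert  action = "revert"      // crashed mid-rename: old -> live, remove tmp
	actDiscard action = "discard-tmp" // live untouched: remove tmp only
)

// decide is the pure state machine: it looks only at the journal flag and
// at which directories exist.
func decide(swapped, liveExists, oldExists bool) action {
	switch {
	case swapped:
		return actCleanup
	case !liveExists && oldExists:
		return actRevert
	default:
		return actDiscard
	}
}

// exists reports whether path is present. Any error other than "not exist"
// counts as present, so an unreadable live dir is never replaced.
func exists(path string) bool {
	_, err := os.Lstat(path)
	return !errors.Is(err, fs.ErrNotExist)
}

// recoverOne applies decide to the filesystem for a single volume.
func recoverOne(v journalVolume) (action, error) {
	a := decide(v.Swapped, exists(v.Live), exists(v.Old))
	switch a {
	case actCleanup:
		if err := os.RemoveAll(v.Old); err != nil {
			return a, err
		}
	case actRevert:
		if err := os.Rename(v.Old, v.Live); err != nil {
			return a, err
		}
	}
	return a, os.RemoveAll(v.Tmp)
}

func main() {
	base, err := os.MkdirTemp("", "recover-sketch")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(base)

	// Simulate a crash between the two renames: live is gone, old and tmp remain.
	v := journalVolume{
		Live: filepath.Join(base, "data"),
		Old:  filepath.Join(base, "data.old"),
		Tmp:  filepath.Join(base, "data.tmp"),
	}
	for _, d := range []string{v.Old, v.Tmp} {
		if err := os.Mkdir(d, 0o755); err != nil {
			panic(err)
		}
	}

	a, err := recoverOne(v)
	fmt.Println(a, err, exists(v.Live), exists(v.Old), exists(v.Tmp))
	// prints: revert <nil> true false false
}
```

Keeping the decision in a pure function makes each crash state testable without touching the filesystem.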
Files to Modify/Create
- `internal/keyedmutex/keyedmutex.go` (+ `_test.go`): shared lock (new)
- `internal/deployer/deployer.go`, `internal/deployer/dispatch.go`: workloadLocks, dispatchLocked, LockWorkload, RedeployLocked, locked Teardown
- `internal/volsnap/restore.go`: Lifecycle interface, Engine.Restore, RecoverInterruptedRestores, SetLifecycle, journal type
- `internal/volsnap/restore_test.go`: fake-Lifecycle orchestration tests (extends Phase 1 file)
- `internal/api/gitops.go`: (optional, low-risk) migrate `keyedMutex` → `keyedmutex.Mutex` for DRY
Acceptance Criteria
- Lock re-entrancy: `Engine.Restore` → `RedeployLocked` does NOT re-acquire the workload lock (no deadlock). All existing deployer tests still pass (lock is externally transparent).
- Happy-path orchestration test uses the REAL `Engine.Create` (real store + `t.TempDir()`) for the pre-restore capture so the `e.mu` deadlock (R1) would fail `go test`, not prod. Asserts call order: preflight → lock → stop → create → extract-all → swap-all → redeploy → cleanup.
- Rollback test: a swap fails midway → originals restored, redeploy called, journal deleted, error returned.
- Preflight-fail test: lock/stop NEVER called (abort before lock).
- Disk-pre-check-fail test: abort before lock.
- `RecoverInterruptedRestores` test: simulate journals in each crash state → correct revert/keep/cleanup.
- `go build ./...`, `go vet ./internal/...`, `go test ./internal/...` green.
Notes
- ⚠️ The Deployer lock change touches the hot deploy path: verify no existing path re-enters `DispatchPlugin` under a held lock (webhook preview = sequential teardown-then-deploy on the child, not nested; confirmed safe).
- The API single-flight (Phase 3) is a fast 409 reject; the deployer lock is the real mutex. They compose (document).
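How the two layers compose can be sketched with a small per-key lock. This is an illustrative stand-in for `internal/keyedmutex`, not its actual code: only the `Lock` and `TryLock` signatures come from the plan, while the map-of-mutexes layout and the two variables `api` and `deployer` are assumptions. `sync.Mutex.TryLock` requires Go 1.18 or newer.

```go
package main

import (
	"fmt"
	"sync"
)

// Mutex is a per-key lock; the zero value is ready to use.
// Entries are never evicted in this sketch, so the map grows with the key set.
type Mutex struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex
}

// get returns the mutex for key, creating it on first use.
func (m *Mutex) get(key string) *sync.Mutex {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.locks == nil {
		m.locks = make(map[string]*sync.Mutex)
	}
	l, ok := m.locks[key]
	if !ok {
		l = &sync.Mutex{}
		m.locks[key] = l
	}
	return l
}

// Lock blocks until key is free and returns the unlock function.
func (m *Mutex) Lock(key string) func() {
	l := m.get(key)
	l.Lock()
	return l.Unlock
}

// TryLock returns (unlock, true) if key was free, else (nil, false).
func (m *Mutex) TryLock(key string) (func(), bool) {
	l := m.get(key)
	if !l.TryLock() {
		return nil, false
	}
	return l.Unlock, true
}

func main() {
	var api, deployer Mutex // two independent layers keyed by workload ID

	// API layer: the first restore request wins the single-flight slot.
	release, ok := api.TryLock("w1")
	fmt.Println(ok) // true

	// A second request for the same workload is rejected immediately;
	// the handler would map this to HTTP 409.
	_, again := api.TryLock("w1")
	fmt.Println(again) // false

	// Engine layer: the blocking lock serializes against deploys and teardowns.
	unlock := deployer.Lock("w1")
	unlock()
	release()
}
```

The fast reject keeps duplicate requests from queuing, while the blocking lock is what actually orders restore against deploy and teardown.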
Review Checklist
- All tasks completed
- Code follows project conventions
- No unintended side effects (existing deploy/teardown behavior unchanged externally)
- Build passes
- Tests pass (new + existing)
Handoff to Next Phase
Implemented: internal/keyedmutex (Lock+TryLock, tested); deployer workloadLocks +
dispatchLocked + LockWorkload + RedeployLocked, DispatchPlugin/DispatchTeardown
now per-workload-locked (reconciler intentionally NOT). volsnap.Lifecycle interface,
Engine.Restore, restoreJournal (atomic write — W1), RecoverInterruptedRestores,
recoverVolume, checkDiskSpace, SetLifecycle. Tests: restore_engine_test.go
(happy/real-Create, redeploy-fail, preflight-abort, extract-fail-after-lock, nil-lifecycle,
wrong-workload, recovery×3 states), keyedmutex_test.go. Full go test ./internal/... green.
Review (go-reviewer, APPROVE WITH NOTES): no functional blockers in this diff. Verified:
no lock re-entrancy/e.mu self-deadlock, no prune-race (extract-all precedes e.Create),
recovery state machine doesn't revert good data. Addressed in-phase: W1 (atomic journal),
W3 (extract-failure orchestration test). Residual W3 (mid-swap fault injection) accepted.
🔴 HARD PREREQUISITES for Phase 3 (B1 + N1 from review):
- Wire `snapshotEngine.RecoverInterruptedRestores()` at startup in `cmd/server/main.go`, BEFORE the API server serves, beside the existing `CleanOrphans()` call (~main.go:333). Without it the journal/WAL protects nothing: a crash mid-restore is unrecovered.
- Wire `snapshotEngine.SetLifecycle(adapter)` strictly BEFORE serving (same place as `SetSnapshotEngine`) so the `e.lifecycle` field is safely published (no race).
- The restore endpoint MUST NOT be reachable until both are wired.
Lifecycle adapter (Phase 3, main.go) maps: Lock→deployer.LockWorkload;
StopContainers→store.ListContainersByWorkload + docker.StopContainer each running +
UpdateContainerState(...,"stopped") + return newest-running ImageTag;
Redeploy→deployer.RedeployLocked with a restore-reason intent (Reference=tag).