Restore a captured volume snapshot onto an image workload's live host-bind
data volumes, then redeploy — the most destructive workload action, built to
the adversarially-reviewed design (C1–C6) with all data-loss guards.
- Engine.Restore (engine-owned): all-or-nothing pre-flight re-resolution from
the workload's CURRENT config (never the tamperable manifest), per-filesystem
disk pre-check, per-workload lock, container quiesce, extract-to-tmp, durable
pre-restore snapshot, write-ahead journal, atomic rename swap, redeploy, and
crash-recovery sweep (RecoverInterruptedRestores) wired before serving.
- internal/keyedmutex: shared per-key lock; deployer now serializes every
deploy entrypoint per workload via DispatchPlugin (+ LockWorkload/RedeployLocked
for the restore re-dispatch, no deadlock).
- Untrusted-archive extractor: zip-slip containment, type allow-list (reg/dir
only), decompression-bomb cap, manifest-index bounds.
- POST /api/workloads/{id}/snapshots/{sid}/restore: admin, X-Confirm-Restore
header (CSRF), per-workload single-flight (409).
- WebUI: Restore button + danger ConfirmDialog + busy state + i18n (en/ru).
Scope: image-source only; scopes absolute/stage/project (driven off the same
supportedScopes constant capture uses).
Plan-reviewed before coding; per-phase go/security/ts reviews; final review
READY TO MERGE. Security review caught + fixed a CRITICAL manifest-Source path
traversal (re-derive target from current config + base containment).
Plan: plans/volume-snapshot-restore/
Phase 1: Restore engine primitives + path-safe extractor + unit tests
Status: ✅ Complete
Parent plan: PLAN.md
Domain: backend
Objective
Build the dangerous filesystem primitives in isolation, fully unit-tested, with NO docker/lifecycle wiring. Each is a pure function over directories + the store + a parsed manifest. No caller yet (exercised by tests so not "unused"). Zero behavior change to existing capture.
Tasks
- internal/volsnap/extract.go: safeExtractIndex(archivePath string, index int, dest string, bombCap int64) (int64, error). Open the gzip tar, extract only entries under the "<index>/" prefix into dest, return bytes written. UNTRUSTED-input guards (C6):
  - zip-slip: target := filepath.Join(dest, rel); require strings.HasPrefix(filepath.Clean(target)+sep, cleanDest+sep) (or target == cleanDest); reject otherwise.
  - allow ONLY tar.TypeReg + tar.TypeDir; reject symlink/hardlink/char/block/fifo/socket with an error (never follow).
  - decompression-bomb cap: running byte counter; abort when it would exceed bombCap.
  - create parent dirs as needed; files 0o600, dirs 0o700 (data dirs; ownership is the container's concern).
  - skip manifest.json and any entry whose leading path segment ≠ index.
- internal/volsnap/restore.go (primitives only, NO orchestration):
  - archiveUncompressedSize(archivePath string, bombCap int64) (int64, error): header-only sizing pass summing hdr.Size, enforcing bombCap (feeds C5). Per-index sizes too (map[int]int64) so C5 can check per filesystem.
  - parseManifest(snap store.VolumeSnapshot) ([]SnapshotVolume, error).
  - preflightResolve(w store.Workload, settings store.Settings, manifest []SnapshotVolume) ([]resolvedVol, error): ALL-OR-NOTHING (C3). For every manifest volume require supportedScopes[scope] AND that volume.ResolveWorkloadPath succeeds; on first failure return an error naming target+scope+reason (abort signal). resolvedVol{Index, Target, Scope, LivePath}. Reuses the SAME supportedScopes map.
  - swap helpers (C2 + R2 + R3): staging is a sibling of the live dir (it sits under the live dir's parent) so renames are same-filesystem.
    - stagingRoot(live string) string = filepath.Join(filepath.Dir(live), ".tf-restore-"+token).
    - swapVolumeDir(live, tmp, old string) error = rename(live→old) then rename(tmp→live); detect EXDEV/cross-device and return a clear error WITHOUT having moved anything irreversibly (check device equivalence up-front or treat the rename error as fatal/rollback).
    - rollbackSwaps(done []swap) error = for each completed swap in reverse, rename(live→discard), rename(old→live).
  - freeDiskBytes(path string) (uint64, error): platform helper. Build-tag split mirroring the repo's lockfile_windows.go / lockfile_unix.go precedent: disk_unix.go (//go:build !windows, syscall.Statfs) + disk_windows.go (golang.org/x/sys/windows.GetDiskFreeSpaceEx). Production target is Linux.
- Constants: maxRestoreUncompressedBytes (decompression-bomb cap) + diskFreeSafetyMargin, named consts with rationale comments.
Files to Modify/Create
- internal/volsnap/extract.go: untrusted extractor (new)
- internal/volsnap/restore.go: primitives for sizing, manifest parse, preflight, swap/rollback, free-disk (new)
- internal/volsnap/disk_unix.go, internal/volsnap/disk_windows.go: free-disk platform split (new)
- internal/volsnap/extract_test.go, internal/volsnap/restore_test.go: unit tests (new)
- go.mod: golang.org/x/sys promoted indirect→direct (already present v0.33.0)
Acceptance Criteria
- Zip-slip (../, absolute, ..\\ on win), symlink, hardlink, device, fifo entries all rejected by safeExtractIndex.
- Decompression-bomb cap aborts extraction + sizing past the cap.
- Happy-path extract round-trip restores file tree + contents byte-for-byte under dest.
- swapVolumeDir + rollbackSwaps: full and PARTIAL-swap rollback leave the original live dirs byte-identical.
- preflightResolve is all-or-nothing: one unresolvable/unsupported-scope volume → error, and the caller renames nothing.
- archiveUncompressedSize matches the real extracted total.
- go test ./internal/volsnap/..., go build ./..., go vet ./internal/... all green.
Notes
- Open the archive once per pass; on Unix an open fd survives a concurrent Delete unlink (defence against a racing snapshot delete); Windows refuses delete of an open file. Acceptable.
- safeExtractIndex writes into a caller-provided dest (the staging tmp), never directly onto the live path; the swap is a separate step (C2).
Review Checklist
- All tasks completed
- Code follows project conventions (gofmt, wrapped errors, small funcs)
- No unintended side effects (no change to Create/List/Delete)
- Build passes
- Tests pass (new + existing)
Handoff to Next Phase
Implemented files: extract.go (safeExtractIndex, stripIndexPrefix, leadingIndex,
withinDir), restore.go (parseManifest, preflightResolve, archiveUncompressedSize,
swap/swapVolumeDir/rollbackSwaps/stagingDirs, consts maxRestoreUncompressedBytes
= 50 GiB, diskFreeHeadroomBytes = 256 MiB), disk_unix.go/disk_windows.go
(freeDiskBytes). Tests in extract_test.go + restore_test.go. go.mod: x/sys →
direct.
API contract for Phase 2 (Engine.Restore):
- safeExtractIndex(archivePath, index, dest, bombCap): extracts ONE volume's subtree into a FRESH dest (uses O_EXCL); returns bytes written. Call once per resolved volume into its tmp staging dir.
- preflightResolve(w, settings, manifest) → []resolvedVol{Index,Target,Scope,LivePath}, ALL-OR-NOTHING; already rejects unsupported scopes AND negative indices. Run BEFORE Lock/StopContainers.
- stagingDirs(live, token, index) → (tmp, old), siblings of live under filepath.Dir(live) (same-fs ⇒ atomic rename). Use a per-restore token.
- swapVolumeDir(live, tmp, old) → (hadOld, err); self-reverts the first rename on failure (live never left missing). Collect each completed swap into []swap{live,old,tmp,hadOld} and call rollbackSwaps(done) on any later failure.
- archiveUncompressedSize(archivePath, bombCap) → (perIndex map[int]int64, total, err) for the C5 per-filesystem free-disk check. NOTE: it's a LOWER-BOUND (ignores dir/inode overhead), so treat it as advisory; the staged-extract+swap is the real net.
- freeDiskBytes(path): pass the live dir's PARENT (where tmp/old land).
Phase 2 must: extract ALL tmp dirs first, THEN swap all (shrinks the destructive window); validate each manifest index maps to an existing archive subtree (W2 — only the negative check is done so far); the disk pre-check should sum per-target-filesystem.
Review (go-reviewer, APPROVE WITH NOTES): no blockers. Addressed in-phase: W2 (negative index reject), W3 (explicit second-rename self-revert test), W4 (stagingDirs test), N1/N2/N4 (comments + sparse-type rejection test). W1 (disk estimate is lower-bound) folded into Phase 2 guidance above.