tiny-forge/CLAUDE.md
alexei.dolgolyov 1c47030854 feat(volsnap): volume snapshot restore (backlog #6)
Restore a captured volume snapshot onto an image workload's live host-bind
data volumes, then redeploy — the most destructive workload action, built to
the adversarially-reviewed design (C1–C6) with all data-loss guards.

- Engine.Restore (engine-owned): all-or-nothing pre-flight re-resolution from
  the workload's CURRENT config (never the tamperable manifest), per-filesystem
  disk pre-check, per-workload lock, container quiesce, extract-to-tmp, durable
  pre-restore snapshot, write-ahead journal, atomic rename swap, redeploy, and
  crash-recovery sweep (RecoverInterruptedRestores) wired before serving.
- internal/keyedmutex: shared per-key lock; deployer now serializes every
  deploy entrypoint per workload via DispatchPlugin (+ LockWorkload/RedeployLocked
  for the restore re-dispatch, no deadlock).
- Untrusted-archive extractor: zip-slip containment, type allow-list (reg/dir
  only), decompression-bomb cap, manifest-index bounds.
- POST /api/workloads/{id}/snapshots/{sid}/restore: admin, X-Confirm-Restore
  header (CSRF), per-workload single-flight (409).
- WebUI: Restore button + danger ConfirmDialog + busy state + i18n (en/ru).

Scope: image-source only; scopes absolute/stage/project (driven off the same
supportedScopes constant capture uses).

Plan-reviewed before coding; per-phase go/security/ts reviews; final review
READY TO MERGE. Security review caught + fixed a CRITICAL manifest-Source path
traversal (re-derive target from current config + base containment).

Plan: plans/volume-snapshot-restore/
2026-06-22 17:23:52 +03:00

# Tinyforge
## Dev Server
Start/restart with: `./scripts/dev-server.sh`
- Runs on port **8090** (avoids 8080 conflict with other local services)
- Auto-generates `ENCRYPTION_KEY` if not set
- Default login: `admin` / `admin123`
- Override port: `LISTEN_ADDR=:9000 ./scripts/dev-server.sh`
## Frontend
- **Boolean inputs use `ToggleSwitch`** (`$lib/components/ToggleSwitch.svelte`) — the slide-style switch is the unified control across the WebUI. Do not introduce raw `<input type="checkbox">` elements; place a `<ToggleSwitch>` next to a label/help block instead.
- **Confirmations & destructive actions use `ConfirmDialog`** (`$lib/components/ConfirmDialog.svelte`) — never native `window.confirm` / `alert`. For navigation guards (e.g. the unsaved-changes prompt on `/apps/new`), `cancel()` the navigation in `beforeNavigate`, open `ConfirmDialog`, and re-issue the navigation with a bypass flag on confirm. Native `beforeunload` is acceptable only for hard tab-close/reload, where the browser forbids custom UI.
- **Source-config shape: `$lib/workload/sourceForms.ts`** is the single source of truth (seed/serialize/validity for image/compose/static/dockerfile), consumed by both `/apps/new` and `/apps/[id]`. Don't re-inline seed/serialize logic.
- **"App" = workload with `source_kind !== ''`.** Triggers are first-class bindings (`workload_trigger_bindings`), NOT on the workload row — never gate app lists/counts on `trigger_kind` (it's empty for plugin workloads). Legacy pre-cutover `kind:project/stack/site` rows have an empty `source_kind` and must be excluded everywhere.
- **i18n parity is mandatory.** Every key must exist in BOTH `web/src/lib/i18n/{en,ru}.json`. A missing key is NOT a build error (`$t` returns the key string), so verify parity manually.
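
The manual i18n parity check can be approximated with a small script. The following is a minimal sketch in Go, not a tool that exists in the repository: `keysOf`, `missing`, and the dotted-path flattening are hypothetical, and it assumes the locale files are JSON objects whose leaves are the translated strings (possibly nested).

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"sort"
)

// flatten records every leaf of a decoded JSON object as a dotted key path.
// Nesting is an assumption; a flat locale file flattens to its own keys.
func flatten(prefix string, v any, out map[string]struct{}) {
	m, ok := v.(map[string]any)
	if !ok {
		out[prefix] = struct{}{}
		return
	}
	for k, child := range m {
		key := k
		if prefix != "" {
			key = prefix + "." + k
		}
		flatten(key, child, out)
	}
}

// keysOf parses one locale file and returns its set of leaf key paths.
func keysOf(data []byte) (map[string]struct{}, error) {
	var root map[string]any
	if err := json.Unmarshal(data, &root); err != nil {
		return nil, err
	}
	out := make(map[string]struct{})
	flatten("", root, out)
	return out, nil
}

// missing returns the keys present in a but absent from b, sorted.
func missing(a, b map[string]struct{}) []string {
	var out []string
	for k := range a {
		if _, ok := b[k]; !ok {
			out = append(out, k)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	if len(os.Args) != 3 {
		fmt.Fprintln(os.Stderr, "usage: parity <en.json> <ru.json>")
		return
	}
	sets := make([]map[string]struct{}, 2)
	for i, path := range os.Args[1:] {
		data, err := os.ReadFile(path)
		if err == nil {
			sets[i], err = keysOf(data)
		}
		if err != nil {
			fmt.Fprintln(os.Stderr, path+":", err)
			os.Exit(1)
		}
	}
	onlyFirst, onlySecond := missing(sets[0], sets[1]), missing(sets[1], sets[0])
	for _, k := range onlyFirst {
		fmt.Println("missing in second file:", k)
	}
	for _, k := range onlySecond {
		fmt.Println("missing in first file:", k)
	}
	if len(onlyFirst)+len(onlySecond) > 0 {
		os.Exit(1)
	}
}
```

Run it with the two locale paths as arguments. A non-zero exit status means the key sets differ.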
## Backend
- **Per-workload deploy lock.** Every deploy entrypoint (API deploy, rollback, promote,
generic-hooks, webhook trigger dispatch) funnels through `deployer.DispatchPlugin`, which
holds a per-workload `keyedmutex` lock (`internal/keyedmutex`) for the whole dispatch;
`DispatchTeardown` takes it too. This serializes all container/volume mutation per workload.
Do NOT add a deploy/teardown path that bypasses `DispatchPlugin`. Operations that must run
a deploy *while already holding* the lock (volume-snapshot restore) use
`Deployer.LockWorkload` + `RedeployLocked` (the unlocked dispatch) — calling `DispatchPlugin`
under the held lock would deadlock (Go mutexes are not reentrant). `activeWg` is a global
drain barrier for shutdown, NOT a per-workload lock.
- **Volume snapshot restore** lives in `volsnap.Engine.Restore` (engine-owned, not the API
handler): preflight re-resolves volumes from the workload's CURRENT config (never the
snapshot manifest — that's tamper-influenceable) → lock → stop → extract-to-tmp →
pre-restore snapshot → journal → atomic rename swap → redeploy. A startup
`RecoverInterruptedRestores` sweep replays the journal after a crash; it MUST be wired (with
`SetLifecycle`) before the API serves. The archive extractor treats the tar as untrusted
(zip-slip/type-allowlist/bomb-cap); the endpoint requires an `X-Confirm-Restore: <sid>`
header (CSRF), like the DB restore.
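
The locking rule above (a locked entrypoint for normal deploys, plus an unlocked dispatch for callers that already hold the lock) can be sketched as follows. This is an illustrative model only, not the code in `internal/keyedmutex` or the deployer. The method names are borrowed from the text, but their signatures and bodies, the `Restore` helper, and the `events` log are assumptions.

```go
package main

import (
	"fmt"
	"sync"
)

// lockEntry is one key's mutex plus a count of holders and waiters.
type lockEntry struct {
	mu   sync.Mutex
	refs int
}

// KeyedMutex hands out one mutex per key (hypothetical stand-in).
type KeyedMutex struct {
	mu      sync.Mutex
	entries map[string]*lockEntry
}

func NewKeyedMutex() *KeyedMutex {
	return &KeyedMutex{entries: make(map[string]*lockEntry)}
}

// Lock blocks until the key's mutex is held and returns its unlock function.
func (k *KeyedMutex) Lock(key string) (unlock func()) {
	k.mu.Lock()
	e, ok := k.entries[key]
	if !ok {
		e = &lockEntry{}
		k.entries[key] = e
	}
	e.refs++
	k.mu.Unlock()

	e.mu.Lock()
	return func() {
		e.mu.Unlock()
		k.mu.Lock()
		e.refs--
		if e.refs == 0 {
			delete(k.entries, key) // drop idle entries so the map stays small
		}
		k.mu.Unlock()
	}
}

// Deployer models the locked/unlocked split; events stands in for real work.
type Deployer struct {
	locks  *KeyedMutex
	mu     sync.Mutex // guards events only
	events []string
}

func NewDeployer() *Deployer { return &Deployer{locks: NewKeyedMutex()} }

func (d *Deployer) record(e string) {
	d.mu.Lock()
	d.events = append(d.events, e)
	d.mu.Unlock()
}

// LockWorkload takes the per-workload lock for a multi-step operation.
func (d *Deployer) LockWorkload(id string) (unlock func()) { return d.locks.Lock(id) }

// RedeployLocked is the unlocked dispatch: the caller MUST hold the lock.
func (d *Deployer) RedeployLocked(id string) { d.record("deploy " + id) }

// DispatchPlugin is the locked entrypoint used by every normal deploy path.
func (d *Deployer) DispatchPlugin(id string) {
	unlock := d.LockWorkload(id)
	defer unlock()
	d.RedeployLocked(id)
}

// Restore holds the lock across the swap and the redeploy.
func Restore(d *Deployer, id string) {
	unlock := d.LockWorkload(id)
	defer unlock()
	d.record("swap " + id)
	d.RedeployLocked(id) // d.DispatchPlugin(id) here would block forever
}

func main() {
	d := NewDeployer()
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			d.DispatchPlugin("w1")
		}()
	}
	Restore(d, "w1")
	wg.Wait()
	for i, e := range d.events {
		if e == "swap w1" {
			fmt.Println("event after swap:", d.events[i+1])
		}
	}
}
```

Because `Restore` keeps the lock from the swap through the redeploy, no concurrent `DispatchPlugin` call for the same workload can record an event between the two steps.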
## Build & Test
- Frontend (from `web/`): `npm run check` (svelte-check — expect 0 errors), `npm run build`, `npm run test` (vitest; pure-logic units like `sourceForms.test.ts`).
- Backend (repo root): `go build ./...`, `go vet ./internal/...`, `go test ./internal/...`.
- `./scripts/dev-server.sh` rebuilds the SPA + restarts the Go server on :8090; it kills the prior process, so a previous background dev-server task reporting **exit 1 is expected**, not a failure.