# Configurable Deploy Strategy: Implementation Plan

**Status:** planned (workflow-designed + adversarially reviewed) · **Feature rank:** #3 · **Date:** 2026-06-19

## Problem

`image` does zero-downtime blue-green; `dockerfile` and `static` **stop+remove the old container before creating the new one** on every redeploy (a real downtime window). `compose` is stack-managed. Give operators a per-workload **deploy strategy** and bring blue-green to the built-from-source sources.

## Design (chosen via a 3-proposal judge panel; "minimal" won, 9/10)

Per-source `deploy_strategy` field **inside each source's `SourceConfig` JSON blob**: **no new DB column, no migration, no `dispatch.go` change**. Values: `""` (back-compat default), `"recreate"`, `"blue-green"`. Round-trips opaquely through `plugin.WorkloadFromStore` / `SourceConfigOf[Config]`; validated in each source's existing `Validate(json.RawMessage)` (runs on create **and** update at `workloads_plugin.go:291`).

**Per-source default (load-bearing):** a single shared default would silently flip image's native blue-green to recreate, so each source has a tiny `effectiveStrategy`:

- `image`: `""` → **blue-green**
- `dockerfile` / `static` / `compose`: `""` → **recreate**

The blue-green branch for dockerfile/static uses a **transient two-container / single-row swap** so `state.go`, `teardown.go`, and `reconcile.go` (which read one deterministic row) stay **untouched**. This is the lowest-risk way to ship gap-free cutover.

## Review fixes folded in (adversarial pass)

1. **BLOCKER: ordering / crash-safety.** Blue-green order MUST be: create+start green → readiness-gate green → `ConfigureRoute(green)` (upsert) → **`saveState(green)` into the single row FIRST** → only THEN stop+remove blue (captured before saveState). The single row must always point at a running container; reaping blue before persisting green orphans green and makes the reconciler flip a healthy workload to `failed`.

2.
   **Unique green name is load-bearing.** dockerfile/static names are deterministic (`tf-build--` / `dw-site--`) and double as the proxy `forwardHost`. The green container needs a genuinely unique name (`…-`, lifted from `image.buildContainerName`) set in **both** `cc.Name` **and** the `ConfigureRoute` `forwardHost`.

3. **Readiness, not liveness.** Before cutover, use `deps.Health.Check(ctx, http://: )` when a healthcheck path is configured (dockerfile has `Healthcheck`); fall back to the existing 3s liveness gate otherwise. Don't advertise "zero-downtime" on the liveness-only path.

4. **Pure upsert.** Drop the pre-`DeleteRoute`; call only `ConfigureRoute` (upsert-by-FQDN for NPM repoints in place; Traefik is label-driven). **Traefik caveat:** blue+green briefly carry the same host-rule labels → momentary dual-serve; documented as a Traefik-only phase-1 limitation (NPM, the common case, is gap-free).

5. **deno + storage → force recreate.** When `static` has `StorageEnabled && mode==deno`, `effectiveStrategy` forces `recreate`, because blue-green would mount the same RW named volume into both containers (a concurrent-writer window recreate never had).

6. **image `recreate` gets its own shape.** Don't reuse `rollbackNew` (assumes blue survives). image `recreate` = reap existing running containers **after** a successful pull, then create green; on green failure the downtime is the accepted recreate contract (logged distinctly, not as a non-disruptive rollback).

7. Image tag `:latest` shared by blue/green is **safe**: containers pin image-by-id at create (no fix needed).

## Files (phase 1, backend-only)

- **NEW** `internal/workload/plugin/strategy.go`: `StrategyRecreate`/`StrategyBlueGreen` consts, `ValidateStrategy(value string, allowBlueGreen bool) error`, `BuildGreenName(name, id string, ts time.Time) string` (lifted unique-suffix scheme). `+ strategy_test.go`.
- `image/image.go`: `DeployStrategy` on Config; `effectiveStrategy` (`""` → blue-green); Validate; honor `recreate` (reap-after-pull + dedicated log).
- `dockerfile/dockerfile.go` (Config + Validate) + `dockerfile/deploy.go` (blue-green branch, fixes 1–4) + `dockerfile/deploy_test.go`.
- `static/static.go` (Config + Validate) + `static/deploy.go` (blue-green branch + deno gate, fixes 1–5) + `static/deploy_test.go`.
- `compose/compose.go`: Config field + Validate rejects `blue-green` (allowBlueGreen=false) + test.

## Phase 1 backward-compat lock (mandatory, unit-tested)

`ValidateStrategy("", …)` returns nil; every `effectiveStrategy("")` returns the source's historical default. Existing rows (no `deploy_strategy` key) decode `""` → today's exact behavior, byte-for-byte.

## Later phases (deferred)

- **P2 (UI):** `sourceForms.ts` seed/serialize + `/apps/new` & `/apps/[id]` select + en/ru i18n (hide blue-green for compose).
- **P3 (harden):** mandatory HTTP readiness probe for static; connection draining before blue removal; Traefik label suppression at cutover.
- **P4 (architecture):** extract image's proven sequence into a shared `plugin.DeploySingleContainer`; migrate dockerfile/static to the multi-row model (crash-safe mid-swap; unlocks `MaxInstances>1`).
- **P5:** true `rolling` (needs a backend-pool primitive on `proxy.Provider`) + compose green-project blue-green.

## Test plan

Table-driven, TDD: `ValidateStrategy` accept/reject matrix (incl. `allowBlueGreen=false`, reserved `rolling` rejected, `""` accepted); per-source `effectiveStrategy` defaults + deno-storage→recreate; dockerfile/static blue-green deploy tests asserting (a) green named ≠ deterministic name, (b) collision teardown NOT run, (c) `ConfigureRoute` called with `forwardHost==green` and NO preceding `DeleteRoute`, (d) `saveState(green)` **before** `RemoveContainer(blue)`, (e) single row ends at green; failure path: green fails gate → green removed, blue + route untouched; compose rejects blue-green.
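
The accept/reject matrix and the per-source defaults could be exercised with something like the following minimal sketch. It assumes the `ValidateStrategy` signature listed under Files; the error values, the `sourceDefault` parameter on `effectiveStrategy`, and the `main` driver are illustrative assumptions, not the actual implementation (the plan gives each source its own small `effectiveStrategy`).

```go
package main

import (
	"errors"
	"fmt"
)

// Strategy values named in the plan. "rolling" is reserved for a later
// phase and is therefore rejected here.
const (
	StrategyRecreate  = "recreate"
	StrategyBlueGreen = "blue-green"
)

// ErrBlueGreenUnsupported is a hypothetical sentinel for sources
// (such as compose) that cannot do blue-green in phase 1.
var ErrBlueGreenUnsupported = errors.New("blue-green is not supported by this source")

// ValidateStrategy accepts "" (back-compat default) and "recreate" for every
// source, "blue-green" only when allowBlueGreen is true, and nothing else.
func ValidateStrategy(value string, allowBlueGreen bool) error {
	switch value {
	case "", StrategyRecreate:
		return nil
	case StrategyBlueGreen:
		if !allowBlueGreen {
			return ErrBlueGreenUnsupported
		}
		return nil
	default:
		return fmt.Errorf("unknown deploy_strategy %q", value)
	}
}

// effectiveStrategy resolves "" to a per-source default. The sourceDefault
// parameter is an assumption made to keep this sketch in one function.
func effectiveStrategy(configured, sourceDefault string) string {
	if configured == "" {
		return sourceDefault
	}
	return configured
}

func main() {
	cases := []struct {
		value          string
		allowBlueGreen bool
	}{
		{"", true},
		{"", false},
		{"recreate", false},
		{"blue-green", true},
		{"blue-green", false}, // compose-style source
		{"rolling", true},     // reserved value
	}
	for _, c := range cases {
		err := ValidateStrategy(c.value, c.allowBlueGreen)
		fmt.Printf("%q allowBlueGreen=%v accepted=%v\n", c.value, c.allowBlueGreen, err == nil)
	}
	fmt.Println("image default:", effectiveStrategy("", StrategyBlueGreen))
	fmt.Println("dockerfile default:", effectiveStrategy("", StrategyRecreate))
}
```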
Gates: `go build`, `go vet`, `go test ./internal/...`, `npm run check/test`, `./scripts/dev-server.sh`.
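
To make the crash-safe ordering from review fix 1 concrete, here is a minimal, hypothetical sketch of the single-row swap run against an in-memory fake. `fakeDeps`, `blueGreenSwap`, and every method name are invented for illustration; only the step order (create+start green, readiness gate, `ConfigureRoute`, `saveState`, then remove blue) and the failure behavior come from the plan.

```go
package main

import (
	"errors"
	"fmt"
)

// fakeDeps is a hypothetical stand-in for the container runtime, proxy, and
// state store. It only records the order in which steps are called.
type fakeDeps struct {
	calls     []string
	failReady bool   // simulate green failing the readiness gate
	row       string // the single persisted state row (container name)
}

func (d *fakeDeps) record(step string) { d.calls = append(d.calls, step) }

func (d *fakeDeps) createAndStart(name string) { d.record("create+start " + name) }

func (d *fakeDeps) ready(name string) error {
	d.record("readiness " + name)
	if d.failReady {
		return errors.New("green failed the readiness gate")
	}
	return nil
}

func (d *fakeDeps) configureRoute(forwardHost string) { d.record("ConfigureRoute " + forwardHost) }

func (d *fakeDeps) saveState(name string) {
	d.row = name
	d.record("saveState " + name)
}

func (d *fakeDeps) remove(name string) { d.record("remove " + name) }

// blueGreenSwap follows the order required by review fix 1: the row is
// repointed at green before blue is reaped, so it always names a running
// container even if the process dies mid-swap.
func blueGreenSwap(d *fakeDeps, green string) error {
	blue := d.row // capture blue before saveState overwrites the row
	d.createAndStart(green)
	if err := d.ready(green); err != nil {
		d.remove(green) // failure path: drop green, leave blue and route alone
		return err
	}
	d.configureRoute(green) // pure upsert, no preceding DeleteRoute
	d.saveState(green)
	if blue != "" {
		d.remove(blue)
	}
	return nil
}

func main() {
	ok := &fakeDeps{row: "blue"}
	fmt.Println(blueGreenSwap(ok, "green"), ok.calls, "row:", ok.row)

	bad := &fakeDeps{row: "blue", failReady: true}
	fmt.Println(blueGreenSwap(bad, "green"), bad.calls, "row:", bad.row)
}
```

Recording the call order like this is one way the deploy tests could assert points (c), (d), and (e) of the test plan without a real container runtime or proxy.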