feat(deployer): configurable per-workload deploy strategy (blue-green for built sources)

Add a deploy_strategy field to each source's config blob — "" (default),
"recreate", or "blue-green" — validated in each source's Validate and read on
the deploy path. No new DB column, no migration: the field rides inside the
existing SourceConfig JSON and every existing workload decodes "" to its
historical behavior (image -> blue-green, others -> recreate).

The real gap this closes: dockerfile and static stopped the old container
before creating the new one on every redeploy — a downtime window image never
had. Their blue-green branch now:
- names the new "green" container with a unique suffix so it coexists with the
  still-serving blue (plumbed into both the container name AND the proxy
  forwardHost);
- skips the collision teardown that destroyed blue early;
- gates green — an HTTP readiness probe (deps.Health.Check) when a healthcheck
  is configured, else the existing liveness window;
- swaps the route via a pure upsert (no pre-DeleteRoute) so NPM repoints in
  place with no gap;
- persists green into the single runtime-state row BEFORE reaping blue, so a
  crash mid-swap can never orphan green or leave the row pointing at a removed
  container (state.go/teardown.go/reconcile.go stay untouched).

image honors explicit "recreate" (reap existing containers after pull, before
cutover); its default blue-green path is unchanged. compose stays
stack-managed and rejects "blue-green" at Validate so the contract is honest.
static forces recreate for storage-backed deno sites — blue-green would mount
the same RW volume into both containers at once.

Shared helper internal/workload/plugin/strategy.go (ValidateStrategy +
BuildGreenName). Backend-only (phase 1); the field is usable today via the
app's advanced-JSON editor — a friendly toggle + i18n follow in phase 2.
Tests: ValidateStrategy matrix, per-source Validate (incl. the empty-key
backward-compat lock), and effectiveStrategy defaults + the deno gate. Design
+ adversarial review: docs/plans/DEPLOY_STRATEGY_PLAN.md.
2026-06-19 16:51:20 +03:00
parent 0c4c338bfe
commit e3d140c57a
13 changed files with 592 additions and 12 deletions
@@ -0,0 +1,98 @@
# Configurable Deploy Strategy — Implementation Plan
**Status:** planned (workflow-designed + adversarially reviewed) · **Feature rank:** #3 · **Date:** 2026-06-19
## Problem
`image` does zero-downtime blue-green; `dockerfile` and `static` **stop+remove the old
container before creating the new one** on every redeploy (a real downtime window).
`compose` is stack-managed. Give operators a per-workload **deploy strategy** and bring
blue-green to the built-from-source sources.
## Design (chosen via a 3-proposal judge panel; "minimal" won, 9/10)
Per-source `deploy_strategy` field **inside each source's `SourceConfig` JSON blob**:
**no new DB column, no migration, no `dispatch.go` change**. Values: `""` (back-compat
default), `"recreate"`, `"blue-green"`. Round-trips opaquely through
`plugin.WorkloadFromStore` / `SourceConfigOf[Config]`; validated in each source's existing
`Validate(json.RawMessage)` (runs on create **and** update at `workloads_plugin.go:291`).
**Per-source default (load-bearing):** a single shared default would silently flip
image's native blue-green to recreate, so each source has a tiny `effectiveStrategy`:
- `image`: `""` → **blue-green**
- `dockerfile` / `static` / `compose`: `""` → **recreate**
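
The per-source defaults could look roughly like this. It is a sketch, not the real code: the plan gives each source its own small `effectiveStrategy`, while this version shares one hypothetical `resolve` helper and uses illustrative function names so all sources fit in one file.

```go
package main

import "fmt"

const (
	StrategyRecreate  = "recreate"
	StrategyBlueGreen = "blue-green"
)

// resolve returns the configured value, or the source's own default
// when the field is empty. Hypothetical shared helper for this sketch.
func resolve(configured, sourceDefault string) string {
	if configured == "" {
		return sourceDefault
	}
	return configured
}

// One tiny function per source, so image keeps its native blue-green
// while the built sources keep their historical recreate.
func imageEffectiveStrategy(s string) string      { return resolve(s, StrategyBlueGreen) }
func dockerfileEffectiveStrategy(s string) string { return resolve(s, StrategyRecreate) }

func main() {
	fmt.Println(imageEffectiveStrategy(""))                // blue-green
	fmt.Println(dockerfileEffectiveStrategy(""))           // recreate
	fmt.Println(imageEffectiveStrategy("recreate"))        // recreate
	fmt.Println(dockerfileEffectiveStrategy("blue-green")) // blue-green
}
```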
The blue-green branch for dockerfile/static uses a **transient two-container / single-row
swap** so `state.go`, `teardown.go`, and `reconcile.go` (which read one deterministic row)
stay **untouched** — the lowest-risk way to ship gap-free cutover.
## Review fixes folded in (adversarial pass)
1. **BLOCKER — ordering / crash-safety.** Blue-green order MUST be: create+start green →
readiness-gate green → `ConfigureRoute(green)` (upsert) → **`saveState(green)` into the
single row FIRST** → only THEN stop+remove blue (captured before saveState). The single
row must always point at a running container; reaping blue before persisting green
orphans green and makes the reconciler flip a healthy workload to `failed`.
2. **Unique green name is load-bearing.** dockerfile/static names are deterministic
(`tf-build-<name>-<id>` / `dw-site-<name>-<id>`) and double as the proxy `forwardHost`.
The green container needs a genuinely unique name (`…-<ms-hex>`, lifted from
`image.buildContainerName`) set in **both** `cc.Name` **and** the `ConfigureRoute`
`forwardHost`.
3. **Readiness, not liveness.** Before cutover, use
`deps.Health.Check(ctx, http://<green>:<port><healthcheck>)` when a healthcheck path is
configured (dockerfile has `Healthcheck`);
fall back to the existing 3s liveness gate otherwise. Don't advertise "zero-downtime" on
the liveness-only path.
4. **Pure upsert.** Drop the pre-`DeleteRoute`; call only `ConfigureRoute` (upsert-by-FQDN,
so NPM repoints in place; Traefik is label-driven). **Traefik caveat:** blue+green
briefly carry the same host-rule labels → momentary dual-serve; documented as a
Traefik-only phase-1 limitation (NPM, the common case, is gap-free).
5. **deno + storage → force recreate.** When `static` has `StorageEnabled && mode==deno`,
`effectiveStrategy` forces `recreate` — blue-green would mount the same RW named volume
into both containers (a concurrent-writer window recreate never had).
6. **image `recreate` gets its own shape.** Don't reuse `rollbackNew` (assumes blue
survives). image `recreate` = reap existing running containers **after** a successful
pull, then create green; on green failure the downtime is the accepted recreate
contract (logged distinctly, not as a non-disruptive rollback).
7. Image tag `:latest` shared by blue/green is **safe** — containers pin image-by-id at
create (no fix needed).
## Files (phase 1, backend-only)
- **NEW** `internal/workload/plugin/strategy.go` — `StrategyRecreate`/`StrategyBlueGreen`
consts, `ValidateStrategy(value string, allowBlueGreen bool) error`,
`BuildGreenName(name, id string, ts time.Time) string` (lifted unique-suffix scheme).
`+ strategy_test.go`.
- `image/image.go` — `DeployStrategy` on Config; `effectiveStrategy` (""→blue-green);
Validate; honor `recreate` (reap-after-pull + dedicated log).
- `dockerfile/dockerfile.go` (Config + Validate) + `dockerfile/deploy.go` (blue-green
branch, fixes 1–4) + `dockerfile/deploy_test.go`.
- `static/static.go` (Config + Validate) + `static/deploy.go` (blue-green branch + deno
gate, fixes 1–5) + `static/deploy_test.go`.
- `compose/compose.go` — Config field + Validate rejects `blue-green` (allowBlueGreen=false)
+ test.
## Phase 1 backward-compat lock (mandatory, unit-tested)
`ValidateStrategy("", …)` returns nil; every `effectiveStrategy("")` returns the source's
historical default. Existing rows (no `deploy_strategy` key) decode `""` → today's exact
behavior, byte-for-byte.
## Later phases (deferred)
- **P2 (UI):** `sourceForms.ts` seed/serialize + `/apps/new` & `/apps/[id]` select +
en/ru i18n (hide blue-green for compose).
- **P3 (harden):** mandatory HTTP readiness probe for static; connection draining before
blue removal; Traefik label suppression at cutover.
- **P4 (architecture):** extract image's proven sequence into a shared
`plugin.DeploySingleContainer`; migrate dockerfile/static to the multi-row model
(crash-safe mid-swap; unlocks `MaxInstances>1`).
- **P5:** true `rolling` (needs a backend-pool primitive on `proxy.Provider`) + compose
green-project blue-green.
## Test plan
Table-driven, TDD: `ValidateStrategy` accept/reject matrix (incl. `allowBlueGreen=false`,
reserved `rolling` rejected, `""` accepted); per-source `effectiveStrategy` defaults +
deno-storage→recreate; dockerfile/static blue-green deploy tests asserting (a) green named
≠ deterministic name, (b) collision teardown NOT run, (c) `ConfigureRoute` called with
`forwardHost==green` and NO preceding `DeleteRoute`, (d) `saveState(green)` **before**
`RemoveContainer(blue)`, (e) single row ends at green; failure path: green fails gate →
green removed, blue + route untouched; compose rejects blue-green. Gates: `go build`,
`go vet`, `go test ./internal/...`, `npm run check/test`, `./scripts/dev-server.sh`.
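
As an illustration of the table-driven style for the deno-storage case, here is a small runnable sketch. The `staticConfig` struct and `staticEffectiveStrategy` are hypothetical. The `StorageEnabled` and `Mode` fields follow the condition quoted in fix 5, but their real names and types are not confirmed by this plan, and the `"spa"` mode value is invented for the table.

```go
package main

import "fmt"

// staticConfig is a hypothetical subset of the static source's config.
type staticConfig struct {
	Mode           string
	StorageEnabled bool
	DeployStrategy string
}

// staticEffectiveStrategy forces recreate for storage-backed deno sites,
// otherwise falls back to the static default (recreate) when unset.
func staticEffectiveStrategy(c staticConfig) string {
	if c.StorageEnabled && c.Mode == "deno" {
		return "recreate"
	}
	if c.DeployStrategy == "" {
		return "recreate"
	}
	return c.DeployStrategy
}

func main() {
	cases := []struct {
		name string
		cfg  staticConfig
		want string
	}{
		{"empty defaults to recreate", staticConfig{}, "recreate"},
		{"explicit blue-green honored", staticConfig{Mode: "spa", DeployStrategy: "blue-green"}, "blue-green"},
		{"deno without storage honored", staticConfig{Mode: "deno", DeployStrategy: "blue-green"}, "blue-green"},
		{"deno with storage forced", staticConfig{Mode: "deno", StorageEnabled: true, DeployStrategy: "blue-green"}, "recreate"},
	}
	for _, tc := range cases {
		got := staticEffectiveStrategy(tc.cfg)
		status := "ok"
		if got != tc.want {
			status = "FAIL"
		}
		fmt.Printf("%-4s %s: got %q, want %q\n", status, tc.name, got, tc.want)
	}
}
```

In the repository the same table would normally live in a `_test.go` file and use `t.Run` per case.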