feat(deployer): configurable per-workload deploy strategy (blue-green for built sources)
Add a deploy_strategy field to each source's config blob ("" (default), "recreate", or "blue-green"), validated in each source's Validate and read on the deploy path. No new DB column, no migration: the field rides inside the existing SourceConfig JSON and every existing workload decodes "" to its historical behavior (image -> blue-green, others -> recreate).

The real gap this closes: dockerfile and static stopped the old container before creating the new one on every redeploy, a downtime window image never had. Their blue-green branch now:
- names the new "green" container with a unique suffix so it coexists with the still-serving blue (plumbed into both the container name AND the proxy forwardHost);
- skips the collision teardown that destroyed blue early;
- gates green: an HTTP readiness probe (deps.Health.Check) when a healthcheck is configured, else the existing liveness window;
- swaps the route via a pure upsert (no pre-DeleteRoute) so NPM repoints in place with no gap;
- persists green into the single runtime-state row BEFORE reaping blue, so a crash mid-swap can never orphan green or leave the row pointing at a removed container (state.go/teardown.go/reconcile.go stay untouched).

image honors explicit "recreate" (reap existing containers after pull, before cutover); its default blue-green path is unchanged.
compose stays stack-managed and rejects "blue-green" at Validate so the contract is honest.
static forces recreate for storage-backed deno sites: blue-green would mount the same RW volume into both containers at once.

Shared helper internal/workload/plugin/strategy.go (ValidateStrategy + BuildGreenName).
Backend-only (phase 1); the field is usable today via the app's advanced-JSON editor. A friendly toggle + i18n follow in phase 2.

Tests: ValidateStrategy matrix, per-source Validate (incl. the empty-key backward-compat lock), and effectiveStrategy defaults + the deno gate.
Design + adversarial review: docs/plans/DEPLOY_STRATEGY_PLAN.md.
@@ -0,0 +1,98 @@
# Configurable Deploy Strategy: Implementation Plan

**Status:** planned (workflow-designed + adversarially reviewed) · **Feature rank:** #3 · **Date:** 2026-06-19

## Problem

`image` does zero-downtime blue-green; `dockerfile` and `static` **stop+remove the old
container before creating the new one** on every redeploy (a real downtime window).
`compose` is stack-managed. Give operators a per-workload **deploy strategy** and bring
blue-green to the built-from-source sources.

## Design (chosen via a 3-proposal judge panel; "minimal" won, 9/10)

Per-source `deploy_strategy` field **inside each source's `SourceConfig` JSON blob**:
**no new DB column, no migration, no `dispatch.go` change**. Values: `""` (back-compat
default), `"recreate"`, `"blue-green"`. Round-trips opaquely through
`plugin.WorkloadFromStore` / `SourceConfigOf[Config]`; validated in each source's existing
`Validate(json.RawMessage)` (runs on create **and** update at `workloads_plugin.go:291`).

**Per-source default (load-bearing):** a single shared default would silently flip
image's native blue-green to recreate, so each source has a tiny `effectiveStrategy`:
- `image`: `""` → **blue-green**
- `dockerfile` / `static` / `compose`: `""` → **recreate**

The blue-green branch for dockerfile/static uses a **transient two-container / single-row
swap** so `state.go`, `teardown.go`, and `reconcile.go` (which read one deterministic row)
stay **untouched**, the lowest-risk way to ship gap-free cutover.

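The per-source default described above can be sketched as follows. This is a minimal illustration, not the code in the commit: the two function names are hypothetical (the plan only says each source has its own `effectiveStrategy`), and the constant values are assumed to match the JSON strings.

```go
package main

import "fmt"

// Constant names follow the plan; the string values are assumed to equal
// the JSON values of deploy_strategy.
const (
	StrategyRecreate  = "recreate"
	StrategyBlueGreen = "blue-green"
)

// imageEffectiveStrategy is a hypothetical stand-in for image's
// effectiveStrategy: an empty value keeps the historical blue-green path.
func imageEffectiveStrategy(configured string) string {
	if configured == "" {
		return StrategyBlueGreen
	}
	return configured
}

// builtEffectiveStrategy is a hypothetical stand-in for the dockerfile,
// static, and compose variants: an empty value keeps recreate.
func builtEffectiveStrategy(configured string) string {
	if configured == "" {
		return StrategyRecreate
	}
	return configured
}

func main() {
	fmt.Println(imageEffectiveStrategy(""))                // blue-green
	fmt.Println(builtEffectiveStrategy(""))                // recreate
	fmt.Println(builtEffectiveStrategy(StrategyBlueGreen)) // blue-green
}
```

Keeping the defaulting local to each source is what prevents a shared default from changing image's behavior for rows that never set the field.
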
## Review fixes folded in (adversarial pass)

1. **BLOCKER: ordering / crash-safety.** Blue-green order MUST be: create+start green →
   readiness-gate green → `ConfigureRoute(green)` (upsert) → **`saveState(green)` into the
   single row FIRST** → only THEN stop+remove blue (captured before saveState). The single
   row must always point at a running container; reaping blue before persisting green
   orphans green and makes the reconciler flip a healthy workload to `failed`.
2. **Unique green name is load-bearing.** dockerfile/static names are deterministic
   (`tf-build-<name>-<id>` / `dw-site-<name>-<id>`) and double as the proxy `forwardHost`.
   The green container needs a genuinely unique name (`…-<ms-hex>`, lifted from
   `image.buildContainerName`) set in **both** `cc.Name` **and** the `ConfigureRoute`
   `forwardHost`.
3. **Readiness, not liveness.** Before cutover, use
   `deps.Health.Check(ctx, http://<green>:<port><healthcheck>)` when a healthcheck path is
   configured (dockerfile has `Healthcheck`); fall back to the existing 3s liveness gate
   otherwise. Don't advertise "zero-downtime" on the liveness-only path.
4. **Pure upsert.** Drop the pre-`DeleteRoute`; call only `ConfigureRoute` (upsert-by-FQDN,
   so NPM repoints in place; Traefik is label-driven). **Traefik caveat:** blue+green
   briefly carry the same host-rule labels → momentary dual-serve; documented as a
   Traefik-only phase-1 limitation (NPM, the common case, is gap-free).
5. **deno + storage → force recreate.** When `static` has `StorageEnabled && mode==deno`,
   `effectiveStrategy` forces `recreate`: blue-green would mount the same RW named volume
   into both containers (a concurrent-writer window recreate never had).
6. **image `recreate` gets its own shape.** Don't reuse `rollbackNew` (assumes blue
   survives). image `recreate` = reap existing running containers **after** a successful
   pull, then create green; on green failure the downtime is the accepted recreate
   contract (logged distinctly, not as a non-disruptive rollback).
7. Image tag `:latest` shared by blue/green is **safe**: containers pin image-by-id at
   create (no fix needed).

## Files (phase 1, backend-only)

- **NEW** `internal/workload/plugin/strategy.go`: `StrategyRecreate`/`StrategyBlueGreen`
  consts, `ValidateStrategy(value string, allowBlueGreen bool) error`,
  `BuildGreenName(name, id string, ts time.Time) string` (lifted unique-suffix scheme).
  `+ strategy_test.go`.
- `image/image.go`: `DeployStrategy` on Config; `effectiveStrategy` (""→blue-green);
  Validate; honor `recreate` (reap-after-pull + dedicated log).
- `dockerfile/dockerfile.go` (Config + Validate) + `dockerfile/deploy.go` (blue-green
  branch, fixes 1–4) + `dockerfile/deploy_test.go`.
- `static/static.go` (Config + Validate) + `static/deploy.go` (blue-green branch + deno
  gate, fixes 1–5) + `static/deploy_test.go`.
- `compose/compose.go`: Config field + Validate rejects `blue-green` (allowBlueGreen=false) + test.

## Phase 1 backward-compat lock (mandatory, unit-tested)
`ValidateStrategy("", …)` returns nil; every `effectiveStrategy("")` returns the source's
historical default. Existing rows (no `deploy_strategy` key) decode `""` → today's exact
behavior, byte-for-byte.

## Later phases (deferred)
- **P2 (UI):** `sourceForms.ts` seed/serialize + `/apps/new` & `/apps/[id]` select +
  en/ru i18n (hide blue-green for compose).
- **P3 (harden):** mandatory HTTP readiness probe for static; connection draining before
  blue removal; Traefik label suppression at cutover.
- **P4 (architecture):** extract image's proven sequence into a shared
  `plugin.DeploySingleContainer`; migrate dockerfile/static to the multi-row model
  (crash-safe mid-swap; unlocks `MaxInstances>1`).
- **P5:** true `rolling` (needs a backend-pool primitive on `proxy.Provider`) + compose
  green-project blue-green.

## Test plan
Table-driven, TDD: `ValidateStrategy` accept/reject matrix (incl. `allowBlueGreen=false`,
reserved `rolling` rejected, `""` accepted); per-source `effectiveStrategy` defaults +
deno-storage→recreate; dockerfile/static blue-green deploy tests asserting (a) green named
≠ deterministic name, (b) collision teardown NOT run, (c) `ConfigureRoute` called with
`forwardHost==green` and NO preceding `DeleteRoute`, (d) `saveState(green)` **before**
`RemoveContainer(blue)`, (e) single row ends at green; failure path: green fails gate →
green removed, blue + route untouched; compose rejects blue-green. Gates: `go build`,
`go vet`, `go test ./internal/...`, `npm run check/test`, `./scripts/dev-server.sh`.