tiny-forge/docs/plans/DEPLOY_STRATEGY_PLAN.md
alexei.dolgolyov e3d140c57a feat(deployer): configurable per-workload deploy strategy (blue-green for built sources)
Add a deploy_strategy field to each source's config blob — "" (default),
"recreate", or "blue-green" — validated in each source's Validate and read on
the deploy path. No new DB column, no migration: the field rides inside the
existing SourceConfig JSON and every existing workload decodes "" to its
historical behavior (image -> blue-green, others -> recreate).

The real gap this closes: dockerfile and static stopped the old container
before creating the new one on every redeploy — a downtime window image never
had. Their blue-green branch now:
- names the new "green" container with a unique suffix so it coexists with the
  still-serving blue (plumbed into both the container name AND the proxy
  forwardHost);
- skips the collision teardown that destroyed blue early;
- gates green — an HTTP readiness probe (deps.Health.Check) when a healthcheck
  is configured, else the existing liveness window;
- swaps the route via a pure upsert (no pre-DeleteRoute) so NPM repoints in
  place with no gap;
- persists green into the single runtime-state row BEFORE reaping blue, so a
  crash mid-swap can never orphan green or leave the row pointing at a removed
  container (state.go/teardown.go/reconcile.go stay untouched).

image honors explicit "recreate" (reap existing containers after pull, before
cutover); its default blue-green path is unchanged. compose stays
stack-managed and rejects "blue-green" at Validate so the contract is honest.
static forces recreate for storage-backed deno sites — blue-green would mount
the same RW volume into both containers at once.

Shared helper internal/workload/plugin/strategy.go (ValidateStrategy +
BuildGreenName). Backend-only (phase 1); the field is usable today via the
app's advanced-JSON editor — a friendly toggle + i18n follow in phase 2.
Tests: ValidateStrategy matrix, per-source Validate (incl. the empty-key
backward-compat lock), and effectiveStrategy defaults + the deno gate. Design
+ adversarial review: docs/plans/DEPLOY_STRATEGY_PLAN.md.
2026-06-19 16:51:20 +03:00

Configurable Deploy Strategy — Implementation Plan

Status: planned (workflow-designed + adversarially reviewed) · Feature rank: #3 · Date: 2026-06-19

Problem

image does zero-downtime blue-green; dockerfile and static stop+remove the old container before creating the new one on every redeploy (a real downtime window). compose is stack-managed. Give operators a per-workload deploy strategy and bring blue-green to the built-from-source sources.

Design (chosen via a 3-proposal judge panel; "minimal" won, 9/10)

Per-source deploy_strategy field inside each source's SourceConfig JSON blob: no new DB column, no migration, no dispatch.go change. Values: "" (back-compat default), "recreate", "blue-green". Round-trips opaquely through plugin.WorkloadFromStore / SourceConfigOf[Config]; validated in each source's existing Validate(json.RawMessage) (runs on create and update at workloads_plugin.go:291).

Per-source default (load-bearing): a single shared default would silently flip image's native blue-green to recreate, so each source has a tiny effectiveStrategy:

  • image: "" → blue-green
  • dockerfile / static / compose: "" → recreate
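
A minimal sketch of the per-source default, assuming each source resolves the empty value on its own. The plan names the helper effectiveStrategy in each source package; the two function names below are hypothetical and only exist so both defaults fit in one file.

```go
package main

import "fmt"

// Strategy values named in the plan; "" means "use the source's historical default".
const (
	StrategyRecreate  = "recreate"
	StrategyBlueGreen = "blue-green"
)

// imageEffectiveStrategy (hypothetical name) resolves the strategy for the
// image source, whose historical behavior is blue-green.
func imageEffectiveStrategy(configured string) string {
	if configured == "" {
		return StrategyBlueGreen
	}
	return configured
}

// dockerfileEffectiveStrategy (hypothetical name) resolves the strategy for
// dockerfile; static and compose would share the same "" -> recreate default.
func dockerfileEffectiveStrategy(configured string) string {
	if configured == "" {
		return StrategyRecreate
	}
	return configured
}

func main() {
	for _, s := range []string{"", StrategyRecreate, StrategyBlueGreen} {
		fmt.Printf("%-12q image=%s dockerfile=%s\n",
			s, imageEffectiveStrategy(s), dockerfileEffectiveStrategy(s))
	}
}
```

Keeping the default inside each source means an explicit value always wins, while an absent key never changes what a workload did before the field existed.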

The blue-green branch for dockerfile/static uses a transient two-container / single-row swap so state.go, teardown.go, and reconcile.go (which read one deterministic row) stay untouched — the lowest-risk way to ship gap-free cutover.

Review fixes folded in (adversarial pass)

  1. BLOCKER — ordering / crash-safety. Blue-green order MUST be: create+start green → readiness-gate green → ConfigureRoute(green) (upsert) → saveState(green) into the single row FIRST → only THEN stop+remove blue (captured before saveState). The single row must always point at a running container; reaping blue before persisting green orphans green and makes the reconciler flip a healthy workload to failed.
  2. Unique green name is load-bearing. dockerfile/static names are deterministic (tf-build-<name>-<id> / dw-site-<name>-<id>) and double as the proxy forwardHost. The green container needs a genuinely unique name (…-<ms-hex>, lifted from image.buildContainerName) set in both cc.Name and the ConfigureRoute forwardHost.
  3. Readiness, not liveness. Before cutover, use deps.Health.Check(ctx, http://<green>:<port><healthcheck>) when a healthcheck path is configured (dockerfile has Healthcheck); fall back to the existing 3s liveness gate otherwise. Don't advertise "zero-downtime" on the liveness-only path.
  4. Pure upsert. Drop the pre-DeleteRoute; call only ConfigureRoute (an upsert by FQDN, so NPM repoints in place; Traefik is label-driven). Traefik caveat: blue+green briefly carry the same host-rule labels → momentary dual-serve; documented as a Traefik-only phase-1 limitation (NPM, the common case, is gap-free).
  5. deno + storage → force recreate. When static has StorageEnabled && mode==deno, effectiveStrategy forces recreate — blue-green would mount the same RW named volume into both containers (a concurrent-writer window recreate never had).
  6. image recreate gets its own shape. Don't reuse rollbackNew (assumes blue survives). image recreate = reap existing running containers after a successful pull, then create green; on green failure the downtime is the accepted recreate contract (logged distinctly, not as a non-disruptive rollback).
  7. Image tag :latest shared by blue/green is safe — containers pin image-by-id at create (no fix needed).
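
The ordering in fix 1 can be sketched against a recording fake. Everything here is an assumption made for illustration: the swapRuntime interface, its method names, and the recorder are hypothetical stand-ins, not the deployer's real dependencies. Recovery from a route or state failure is not specified by the plan, so the sketch simply returns without reaping blue in those cases.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// swapRuntime is a hypothetical stand-in for the container, proxy, and
// state dependencies used during a blue-green swap.
type swapRuntime interface {
	CreateAndStart(name string) error
	Ready(name string) error
	ConfigureRoute(forwardHost string) error
	SaveState(name string) error
	Remove(name string) error
}

// blueGreenSwap applies the order from fix 1: start green, gate it,
// repoint the route, persist green, and only then reap blue.
func blueGreenSwap(rt swapRuntime, blue, green string) error {
	if err := rt.CreateAndStart(green); err != nil {
		_ = rt.Remove(green)
		return fmt.Errorf("start green: %w", err)
	}
	if err := rt.Ready(green); err != nil {
		_ = rt.Remove(green) // blue and its route stay untouched
		return fmt.Errorf("green failed the gate: %w", err)
	}
	// Pure upsert: no DeleteRoute is issued before this call.
	if err := rt.ConfigureRoute(green); err != nil {
		return fmt.Errorf("configure route: %w", err)
	}
	// Persist green BEFORE reaping blue so the single row never
	// points at a removed container.
	if err := rt.SaveState(green); err != nil {
		return fmt.Errorf("save state: %w", err)
	}
	if err := rt.Remove(blue); err != nil {
		return fmt.Errorf("reap blue: %w", err)
	}
	return nil
}

// recorder logs every call in order and can fail one chosen operation.
type recorder struct {
	calls  []string
	failOp string
}

func (r *recorder) step(op, name string) error {
	r.calls = append(r.calls, op+":"+name)
	if op == r.failOp {
		return errors.New(op + " failed")
	}
	return nil
}

func (r *recorder) CreateAndStart(n string) error { return r.step("create", n) }
func (r *recorder) Ready(n string) error          { return r.step("ready", n) }
func (r *recorder) ConfigureRoute(n string) error { return r.step("route", n) }
func (r *recorder) SaveState(n string) error      { return r.step("save", n) }
func (r *recorder) Remove(n string) error         { return r.step("remove", n) }

// trace renders the recorded call order as one string.
func (r *recorder) trace() string { return strings.Join(r.calls, " > ") }

func main() {
	ok := &recorder{}
	fmt.Println(blueGreenSwap(ok, "blue", "green"), "|", ok.trace())

	bad := &recorder{failOp: "ready"}
	fmt.Println(blueGreenSwap(bad, "blue", "green"), "|", bad.trace())
}
```

In the happy path the trace ends with save:green followed by remove:blue; when the gate fails, only green is removed and neither the route nor the state row is touched.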

Files (phase 1, backend-only)

  • NEW internal/workload/plugin/strategy.go: StrategyRecreate/StrategyBlueGreen consts, ValidateStrategy(value string, allowBlueGreen bool) error, BuildGreenName(name, id string, ts time.Time) string (lifted unique-suffix scheme). + strategy_test.go.
  • image/image.go: DeployStrategy on Config; effectiveStrategy ("" → blue-green); Validate; honor recreate (reap-after-pull + dedicated log).
  • dockerfile/dockerfile.go (Config + Validate) + dockerfile/deploy.go (blue-green branch, fixes 1–4) + dockerfile/deploy_test.go.
  • static/static.go (Config + Validate) + static/deploy.go (blue-green branch + deno gate, fixes 1–5) + static/deploy_test.go.
  • compose/compose.go: Config field + Validate rejects blue-green (allowBlueGreen=false) + test.

Phase 1 backward-compat lock (mandatory, unit-tested)

ValidateStrategy("", …) returns nil; every effectiveStrategy("") returns the source's historical default. Existing rows (no deploy_strategy key) decode "" → today's exact behavior, byte-for-byte.

Later phases (deferred)

  • P2 (UI): sourceForms.ts seed/serialize + /apps/new & /apps/[id] select + en/ru i18n (hide blue-green for compose).
  • P3 (harden): mandatory HTTP readiness probe for static; connection draining before blue removal; Traefik label suppression at cutover.
  • P4 (architecture): extract image's proven sequence into a shared plugin.DeploySingleContainer; migrate dockerfile/static to the multi-row model (crash-safe mid-swap; unlocks MaxInstances>1).
  • P5: true rolling (needs a backend-pool primitive on proxy.Provider) + compose green-project blue-green.

Test plan

Table-driven, TDD: ValidateStrategy accept/reject matrix (incl. allowBlueGreen=false, reserved rolling rejected, "" accepted); per-source effectiveStrategy defaults + deno-storage→recreate; dockerfile/static blue-green deploy tests asserting (a) green named ≠ deterministic name, (b) collision teardown NOT run, (c) ConfigureRoute called with forwardHost==green and NO preceding DeleteRoute, (d) saveState(green) before RemoveContainer(blue), (e) single row ends at green; failure path: green fails gate → green removed, blue + route untouched; compose rejects blue-green. Gates: go build, go vet, go test ./internal/..., npm run check/test, ./scripts/dev-server.sh.
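
As an illustration of the table-driven style, the following sketch checks the static defaults and the deno-storage gate. It is written as a plain program so it stays self-contained; in the repository it would presumably live in a _test.go file using the testing package. The staticConfig struct, the Mode value "plain", and the helper names are hypothetical.

```go
package main

import "fmt"

const (
	StrategyRecreate  = "recreate"
	StrategyBlueGreen = "blue-green"
)

// staticConfig is a hypothetical subset of the static source's Config.
type staticConfig struct {
	DeployStrategy string
	StorageEnabled bool
	Mode           string
}

// effectiveStrategy forces recreate for storage-backed deno sites and
// otherwise falls back to recreate only when nothing is configured.
func effectiveStrategy(c staticConfig) string {
	if c.StorageEnabled && c.Mode == "deno" {
		return StrategyRecreate
	}
	if c.DeployStrategy == "" {
		return StrategyRecreate
	}
	return c.DeployStrategy
}

type strategyCase struct {
	name string
	cfg  staticConfig
	want string
}

// cases lists the default, the explicit values, and the deno gate.
func cases() []strategyCase {
	return []strategyCase{
		{"empty defaults to recreate", staticConfig{}, StrategyRecreate},
		{"explicit recreate", staticConfig{DeployStrategy: StrategyRecreate}, StrategyRecreate},
		{"explicit blue-green", staticConfig{DeployStrategy: StrategyBlueGreen, Mode: "plain"}, StrategyBlueGreen},
		{"deno without storage keeps blue-green", staticConfig{DeployStrategy: StrategyBlueGreen, Mode: "deno"}, StrategyBlueGreen},
		{"deno with storage forces recreate", staticConfig{DeployStrategy: StrategyBlueGreen, StorageEnabled: true, Mode: "deno"}, StrategyRecreate},
	}
}

// runCases returns one message per failing case.
func runCases() []string {
	var failures []string
	for _, tc := range cases() {
		if got := effectiveStrategy(tc.cfg); got != tc.want {
			failures = append(failures, fmt.Sprintf("%s: got %q, want %q", tc.name, got, tc.want))
		}
	}
	return failures
}

func main() {
	if f := runCases(); len(f) > 0 {
		fmt.Println("failures:", f)
		return
	}
	fmt.Println("all cases pass")
}
```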