tiny-forge/docs/WORKLOAD_REFACTOR_TODO.md
alexei.dolgolyov 2aff22f565
feat(triggers): first-class triggers + bindings with fan-out webhook
Promote triggers from embedded workload fields to standalone records
joined to workloads via workload_trigger_bindings. One trigger (webhook,
registry watcher, git push, manual) now fans out to many workloads with
per-binding config overrides (top-level JSON merge, binding wins).

Backend
- new triggers + workload_trigger_bindings tables with ON DELETE CASCADE
- boot-time backfill of embedded trigger config inside per-workload tx
- store.ErrUnique sentinel translates SQLite UNIQUE at store boundary
- /api/triggers CRUD + /api/triggers/{id}/{webhook,bindings}
- /api/bindings/{id} update/delete; /api/workloads/{id}/triggers list+bind
- bindTriggerToWorkload accepts trigger_id or inline {kind,name,config}
- inline-create uses CreateTriggerWithBindingTx (no orphan triggers)
- validateBindingConfig enforces 8 KiB cap + plugin Validate on merged
- ListTriggersWithBindingCount + ListBindings*WithNames remove N+1
- POST /api/webhook/triggers/{secret} resolves trigger then fans out
- bounded worker pool (4) per request; per-binding error isolation
- outcome accounting: deployed / skipped / no-match / errored
- legacy /api/webhook/workloads/{secret} route removed (clean break;
  backfill keeps secrets resolvable at the new /triggers/{secret} path)
- reconciler gate dropped from (Source && Trigger) to Source only
- MergeJSONConfig returns freshly allocated slices (no fan-out aliasing)
- WithEffectiveTrigger lets existing Trigger.Match contract stay unchanged

Frontend
- /triggers list, new wizard, [id] detail (bindings, webhook rotate)
- workload create wizard: NEW / PICK / SKIP trigger modes
- workload detail: bindings panel + Add-trigger modal (inline / pick)
- per-binding override editor with merged-preview + 8 KiB guard
- "OVERRIDES n FIELDS" row badge when binding_config is non-empty
- shared TriggerKindForm component (registry / git / manual + JSON)
- 3 raw <input type=checkbox> replaced with <ToggleSwitch>
- full EN + RU i18n: redeployTriggers.*, apps.detail.bindings.*,
  apps.new.triggers.*, nav.triggers; event-triggers nav disambiguated

Doc
- WORKLOAD_REFACTOR_TODO: trigger-split marked DONE; next focus is
  the static-source inline port + hard legacy cutover (Priority 1)
2026-05-16 02:24:31 +03:00

# Workload-First Refactor — Remaining Work
Handoff for resuming the refactor. The plugin architecture (Source × Trigger),
`/api/workloads` surface, `/apps` UI, env/volume/webhook/logs/chain panels,
multi-face proxy routes, blue-green image deploys, schema-driven wizard, and
test coverage on triggers / image helpers / webhook parser / store upserts are
**already landed and live**. What follows is what's still pending, in priority
order.
> ## Current focus (read this first)
>
> **Triggers as first-class reusable entities — DONE** (2026-05-16). The
> trigger-split arc shipped end-to-end: `triggers` + `workload_trigger_bindings`
> tables, boot-time backfill, fan-out webhook handler at
> `/api/webhook/triggers/{secret}` with bounded concurrency, `/api/triggers`
> CRUD + `/api/bindings/{id}` + workload-side bind endpoints, full `/triggers`
> frontend (list, new, detail), workload-page bindings panel + per-binding
> override editor, i18n EN+RU.
>
> **Next on Priority 1** is the **static source inline port** (~2150 LOC
> across 8 files; details in the section below). After that, the
> **hard legacy cutover** (drop `/api/projects`, `/api/stacks`, `/api/sites`,
> `/api/stages` + their tables and frontends) clears the deck.
## Status at a glance
| Item | Priority | Status |
| ---- | -------- | ------ |
| Triggers as first-class reusable entities | 1 | **DONE** (2026-05-16) |
| Static source inline port | 1 | **PENDING — current focus** |
| Hard legacy cutover | 1 | **PENDING** — gated by static port (volume scopes blocker is resolved) |
| Generalized volume scopes | 2 | DONE |
| Kind-aware editors (compose / image / static) | 2 | DONE |
| Vendor-specific webhook parsing | 2 | DONE |
| Chain-panel CSS | 3 | DONE |
| Log Rules panel on `/apps/[id]` | adjacent | DONE — uses `getEffectiveLogScanRules` + per-workload override action |
| i18n for `/apps/*` page strings | 3 | **PARTIAL** — Log Rules panel + Observability surfaces i18n'd; `apps.*` namespace still pending |
| Docs / codemap entries for `internal/workload/plugin/` | 3 | **PENDING** |
| API-handler / dispatcher / compose-source / static-backend tests | 4 | **PENDING** |

Cross-references to the adjacent Observability work (Event Triggers + Log
Scanner backend + drop-counter stats panel) live in
[docs/LOGSCAN_AND_TRIGGERS_TODO.md](LOGSCAN_AND_TRIGGERS_TODO.md).
## Priority 1 — Architecture unlock
### ~~Triggers as first-class reusable entities~~ — DONE (2026-05-16)
Trigger config used to live embedded in the workload row
(`workload.trigger_kind` + `workload.trigger_config`). One workload owned
exactly one trigger; one trigger served exactly one workload. The split
makes a Trigger its own record so one inbound webhook / registry watcher /
schedule / git-push filter fans out to many workloads.

**Schema + store**: `triggers` + `workload_trigger_bindings` tables with
`ON DELETE CASCADE`. `binding_config` JSON merges on top of `trigger.config`
(top-level merge, binding wins). Boot-time backfill lifts every existing
embedded trigger into a standalone trigger row + binding inside a
per-workload transaction so a partial failure rolls back cleanly. Trigger
names are id-suffixed unconditionally to dodge the (name, kind) collision
race. `store.ErrUnique` sentinel translates SQLite UNIQUE violations at
the store boundary; API handlers use `errors.Is` instead of substring
match. `MergeJSONConfig` always returns a freshly allocated slice (no
aliasing under fan-out).
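The merge rule above (top-level keys only, binding wins, result never aliases its inputs) can be illustrated with a minimal sketch. `mergeTopLevel` is a hypothetical stand-in, not the actual `MergeJSONConfig` signature, and the config keys are invented for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mergeTopLevel overlays override onto base one top-level key at a time.
// A key present in override replaces the base value wholesale (no deep merge).
// The result comes from json.Marshal, so it never shares memory with an input.
// Hypothetical helper: the real MergeJSONConfig signature may differ.
func mergeTopLevel(base, override []byte) ([]byte, error) {
	merged := map[string]json.RawMessage{}
	if len(base) > 0 {
		if err := json.Unmarshal(base, &merged); err != nil {
			return nil, fmt.Errorf("base config: %w", err)
		}
		if merged == nil { // a literal JSON null resets the map to nil
			merged = map[string]json.RawMessage{}
		}
	}
	if len(override) > 0 {
		over := map[string]json.RawMessage{}
		if err := json.Unmarshal(override, &over); err != nil {
			return nil, fmt.Errorf("binding config: %w", err)
		}
		for k, v := range over {
			merged[k] = v
		}
	}
	return json.Marshal(merged)
}

func main() {
	base := []byte(`{"image":"app","tag_pattern":"v*","branch":"main"}`)
	override := []byte(`{"tag_pattern":"v2.*"}`)
	out, err := mergeTopLevel(base, override)
	if err != nil {
		panic(err)
	}
	// prints {"branch":"main","image":"app","tag_pattern":"v2.*"}
	fmt.Println(string(out))
}
```

Because the merge is top-level only, a binding that overrides a nested object replaces that object entirely rather than patching individual fields inside it.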
**Webhook fan-out** — new `POST /api/webhook/triggers/{secret}` resolves
to one Trigger and fans out to every enabled binding via a bounded worker
pool (`maxTriggerFanOutConcurrency = 4`). Per-binding errors are isolated
(one broken workload doesn't block siblings). Outcome accounting splits
deployed / skipped / no-match / errored cleanly. Legacy
`POST /api/webhook/workloads/{secret}` route dropped (clean break per the
workload-first memory; the boot backfill kept secrets resolvable at the
new path).
**API**: `/api/triggers` CRUD, `/api/triggers/{id}/webhook`,
`/api/triggers/{id}/bindings` (list + bind), `/api/bindings/{id}` for
update and delete, and `/api/workloads/{id}/triggers` (list + bind,
accepts either `trigger_id` or inline `{kind, name, config, ...}`).
Inline-create path
runs trigger insert + binding insert inside one transaction
(`CreateTriggerWithBindingTx`) so a binding failure can't leak an orphan
trigger. `validateBindingConfig` enforces 8 KiB cap and runs the trigger
plugin's `Validate()` against the merged shape on every bind/update.
List endpoints use `LEFT JOIN ... GROUP BY` (`ListTriggersWithBindingCount`,
`ListBindingsForTriggerWithNames`, `ListBindingsForWorkloadWithNames`) —
no per-row N+1.
**Plugin contract unchanged**: `Trigger.Match` still takes `(Workload,
InboundEvent)`. The fan-out path uses `plugin.WithEffectiveTrigger` to
stuff the merged config into a copied workload before the call, so the
existing `registry`, `git`, `manual` plugins work unchanged.
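The copy-then-match idea can be sketched as follows. The `Workload` struct here is a trimmed, hypothetical stand-in for the plugin type, and `withEffectiveTrigger` / `matchImage` only imitate the roles of `plugin.WithEffectiveTrigger` and `Trigger.Match`; the real signatures are not shown in this document.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Workload is a trimmed, hypothetical stand-in for the plugin workload type.
type Workload struct {
	ID            string
	TriggerKind   string
	TriggerConfig []byte
}

// withEffectiveTrigger returns a copy of w carrying the merged trigger config.
// w is passed by value and the bytes are cloned, so neither the caller's
// workload nor the shared merged slice can be mutated through the copy.
func withEffectiveTrigger(w Workload, kind string, merged []byte) Workload {
	w.TriggerKind = kind
	w.TriggerConfig = append([]byte(nil), merged...)
	return w
}

// matchImage plays the role of an unchanged Trigger.Match implementation:
// it only ever looks at the workload it is handed.
func matchImage(w Workload, eventImage string) bool {
	var cfg struct {
		Image string `json:"image"`
	}
	if err := json.Unmarshal(w.TriggerConfig, &cfg); err != nil {
		return false
	}
	return cfg.Image == eventImage
}

func main() {
	stored := Workload{ID: "w1"} // no embedded trigger fields any more
	merged := []byte(`{"image":"registry.local/app"}`)

	effective := withEffectiveTrigger(stored, "registry", merged)
	fmt.Println(matchImage(effective, "registry.local/app")) // prints true
	fmt.Println(stored.TriggerKind == "")                    // prints true
}
```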
**Reconciler** — gate dropped from `(SourceKind != "" && TriggerKind != "")`
to `SourceKind != ""`. A workload with a Source but no triggers still
gets `Source.Reconcile` called every tick (manual-only deploys are
common during early setup).

**Frontend** — new pages under `web/src/routes/triggers/`:
- `+page.svelte` — list with kind chips, binding count, webhook status,
empty state.
- `new/+page.svelte` — wizard with kind picker (cards), name, kind-aware
config form (registry / git / manual + JSON fallback), webhook toggles.
- `[id]/+page.svelte` — editable per-kind form, webhook URL panel
(origin-prefixed, copy + ConfirmDialog-gated rotate), bindings list
with per-row enabled `<ToggleSwitch>` + ConfirmDialog-gated unbind,
danger-zone delete.

**Workload UI** — embedded trigger fields removed.
- `apps/new/+page.svelte` — wizard now has Trigger step with NEW / PICK /
SKIP modes; bind happens after `createPluginWorkload` succeeds.
- `apps/[id]/+page.svelte` — Bindings panel above Containers, "Add trigger"
modal with Inline / Pick-existing tabs, **per-binding override editor**
(inline disclosure with read-only base config, editable JSON override,
merged preview, 8 KiB byte cap, save / reset-to-inherit). Per-row
"OVERRIDES n FIELDS" badge surfaces deviation from the trigger.

**Shared component**: `web/src/lib/components/TriggerKindForm.svelte`
hosts the kind picker + name + per-kind config + JSON fallback + webhook
toggles. Reused on both `/triggers/new` and the workload Add-trigger modal.

**i18n** — full EN + RU coverage under `redeployTriggers.*` (standalone
pages), `apps.detail.bindings.*` (workload bindings panel including
`override.*`), `apps.new.triggers.*` (wizard mode picker), `nav.triggers`.
The existing `/event-triggers` nav label was disambiguated to "Event
Triggers" to coexist with the new `/triggers` entry.

**Compliance** — three pre-existing raw `<input type="checkbox">`
instances in `apps/new` + `apps/[id]` (render-markdown, env-encrypted)
replaced with `<ToggleSwitch>` to honor the project rule.

**Touch points (final):**
- `internal/store/triggers.go`, `workload_trigger_bindings.go`, `models.go`,
`store.go` (schema + backfill + `translateSQLError`).
- `internal/workload/plugin/binding.go` (`MergeJSONConfig`,
`WithEffectiveTrigger`).
- `internal/webhook/trigger_handler.go` + `handler.go` (route mount,
legacy route removed).
- `internal/reconciler/reconciler.go` (trigger gate dropped).
- `internal/api/triggers.go` + `router.go` (REST surface).
- `web/src/routes/triggers/`, `web/src/routes/apps/{new,[id]}`,
`web/src/lib/components/TriggerKindForm.svelte`, `web/src/lib/api.ts`,
`web/src/lib/i18n/{en,ru}.json`, `web/src/routes/+layout.svelte`.

**Reviews shipped through go-reviewer + security-reviewer +
typescript-reviewer subagents** — 0 CRITICAL; 5 HIGH and 4 MEDIUM
findings addressed inline before merge.
### Static source inline port — ~2150 LOC across 8 files
The current `internal/workload/plugin/source/static/` delegates to
`staticsite.Manager` via a phantom-row adapter
(`cmd/server/static_backend.go`) that keeps a synthetic row in the legacy
`static_sites` table per workload. This works but blocks the hard cutover —
you can't drop `static_sites` until the adapter is gone.

To port inline, the deploy pipeline body has to move into
`internal/workload/plugin/source/static/`:

| Source file | Lines | What to keep / port |
| --- | --- | --- |
| `internal/staticsite/manager.go` | 834 | Deploy / Stop / status pipeline. State should move to `containers` rows + `workload_env` instead of `static_sites`. |
| `internal/staticsite/gitea_content.go` | 360 | Keep as helper — Gitea content download/listing. |
| `internal/staticsite/github_provider.go` | 276 | Keep as helper. |
| `internal/staticsite/gitlab_provider.go` | 254 | Keep as helper. |
| `internal/staticsite/healthcheck.go` | 111 | Convert to plugin Reconcile body. |
| `internal/staticsite/markdown.go` | 83 | Keep as helper. |
| `internal/staticsite/provider.go` | 171 | Keep — provider abstraction. |
| `internal/staticsite/deno/` | (sub-pkg) | Keep — Dockerfile + router.ts codegen. |

Estimated as its own dedicated turn (or two). Strategy: keep the provider
abstraction + helpers exported; rewrite only `Manager.Deploy` body into a new
`source/static/deploy.go` that operates against `plugin.Workload` directly and
writes container rows + workload_env rather than the `static_sites` table.
### Hard legacy cutover
Sole remaining blocker is the static source inline port above. The
generalized-volume-scopes blocker is resolved (legacy `ResolvePath`
stays in place for legacy callers and dies with the cutover). When the
static port lands:
- Delete `/api/projects`, `/api/stacks`, `/api/sites`, `/api/stages` handlers.
- Drop tables: `projects`, `stages`, `stacks`, `stack_revisions`,
`stack_deploys`, `static_sites`, `static_site_secrets`, `deploys`,
`poll_states`.
- Delete `internal/stack/`, `internal/staticsite/` packages.
- Delete frontend `/projects`, `/sites`, `/stacks` routes.
- Delete legacy `volume.ResolvePath` + `internal/api/volume_browser.go`
callers (the only remaining users).
## Priority 2 — Behavior gaps
### ~~Generalized volume scopes~~ — DONE
Landed: `internal/volume.ResolveWorkloadPath` (workload-keyed; sits next to the
legacy `ResolvePath` so legacy code paths keep working) plus the wired-through
`computeMounts` in `internal/workload/plugin/source/image/image.go`. All
`VolumeScope` values are now honored at deploy time:
- `absolute` — host bind, validated against `settings.AllowedVolumePaths`.
- `ephemeral` — tmpfs.
- `instance` — per-tag dir under `<base>/<workload>-<idShort>/instance-<tag>/<source>`.
- `stage`, `project` — both collapse to `<base>/<workload>-<idShort>/<source>`.
- `project_named` — Docker named volume prefixed `tf-<idShort>-<name>`.
- `named` — Docker named volume by raw name.

Test coverage: `internal/volume/resolver_test.go` (table-driven, portable
Linux/Windows). The legacy `ResolvePath` stays in place for legacy deployer +
volume-browser callers and dies with the hard cutover.
### ~~Kind-aware editors on `/apps/new` and `/apps/[id]` edit~~ — DONE
All three Source plugins now have hand-rolled forms on both pages, with
an "Advanced JSON" toggle preserved as the power-user escape hatch.
Submit logic marshals form fields back into the same JSON shape the
backend already expects — no API or store changes required.

**Principle:** the plugin contract makes new Source / Trigger kinds cheap
on the backend, but the UI is not cheap by default — every kind needs a
paired hand-rolled form to be daily-driver usable. The shared JSON
editor is the fallback for power users and brand-new plugins, not the
end state. New Source / Trigger merge requests should treat "ship the
kind-aware form" as part of done, not a follow-up.

**Landed:**
- `compose`: YAML textarea + project_name input on both `/apps/new`
and `/apps/[id]`.
- `image`: form fields for image / port / healthcheck / default_tag /
registry_name / cpu_limit / memory_limit / max_instances on both
pages. Registry name is a select populated from `/api/registries`
(with text-input fallback when the list is empty). env + volumes
stay in their detail-page panels and round-trip through the form
via `imageFormBody` so manual edits aren't clobbered.
- `static`: provider select (gitea / github / gitlab), base URL,
repo_owner / repo_name (both required), branch (default "main"),
folder_path, access_token (password input, for private repos),
mode radio (static / deno), render_markdown checkbox. The
storage_enabled / storage_limit_mb fields aren't surfaced as
form controls yet, but they round-trip through `staticFormBody`
so values set via the raw JSON editor survive form edits.

**Still pending forms:** none — all three Source plugins now have
hand-rolled forms on both `/apps/new` and `/apps/[id]`.

The raw JSON editor stays available behind the "Advanced JSON" toggle
(shipped with compose) so the plugin's full sample is still reachable
for power users and for any new plugin kind without a hand-rolled form.

Effort: per-kind form roughly half a turn each; can land incrementally.
Touches `web/src/routes/apps/new/+page.svelte` and the edit block in
`web/src/routes/apps/[id]/+page.svelte`. The Svelte side keeps
serializing into the same `source_config` JSON shape the backend
already expects — no API or store change required.
### ~~Vendor-specific webhook parsing for `/api/webhook/workloads/{secret}`~~ — DONE
Landed: `internal/webhook/vendor_parsers.go` plus rewrites in
`internal/webhook/handler.go` `buildInboundEvent`. The dispatch order is now:
1. Empty body → manual event.
2. Vendor-specific parsers, short-circuit on a recognized `X-*-Event`
header — Gitea package, GitHub `package` / `registry_package`, GitHub
push, Gitea push, GitLab `Push Hook` / `Tag Push Hook`.
3. Generic simple-body fallback: top-level `image` or top-level `ref`,
which is what the legacy CI integrations already send.

Vendor parsers can populate fields the generic parser cannot: image
digest, `GitEvent.Vendor`, registry host. When a vendor parser claims a
request (header matches) it is authoritative — a malformed Gitea
package payload surfaces as an error rather than silently falling
through to the generic parser. Test coverage:
`internal/webhook/vendor_parsers_test.go` covers each vendor branch +
the routed-via-`buildInboundEvent` integration cases.
Open follow-ups deferred to future turns:
- GitLab Container Registry events use a custom envelope outside the
webhook event surface — handle if a user reports needing it.
- Docker Hub webhook (push event) uses `{"push_data": {"tag": ...}, "repository": {...}}` — add when there's a user request.
## Priority 3 — Polish
### ~~Chain-panel CSS~~ — DONE
Landed: rules for `.chain-row`, `.chain-card` (with hover/transform on
anchors), `.chain-self` (brand-tinted highlight), `.chain-name`,
`.chain-label` (70px fixed-width mono column), `.chain-children-list`
(flex-wrap), plus a sub-600px stack to keep the panel usable on narrow
screens. Appended at the end of the `<style>` block in
`web/src/routes/apps/[id]/+page.svelte`.
### Docs / codemap entries
Nothing under `docs/CODEMAPS/` for `internal/workload/plugin/`. Should cover:
- The Source × Trigger contract + registry pattern (`init()` + blank-import in
`cmd/server/main.go`).
- How a new Source kind is added (write `init()` registration, blank-import,
add to wizard via `SchemaSample`).
- The dispatcher seam: `deployer.DispatchPlugin` / `DispatchTeardown` /
`DispatchReconcile` and how the reconciler / webhook ingress / API
handlers all flow through it.

`README.md` should mention `/apps` as the new user surface and that
`/projects` / `/sites` / `/stacks` carry `Deprecation: true` headers.
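For the codemap entry, the registry pattern mentioned above (`init()` registration plus a blank import) could be illustrated with something like the following single-file sketch. `Source`, `RegisterSource`, and `LookupSource` are hypothetical names that only approximate whatever `internal/workload/plugin/registry.go` exposes.

```go
package main

import (
	"fmt"
	"sync"
)

// Source is a minimal, hypothetical slice of the plugin contract.
type Source interface {
	Kind() string
}

var (
	mu      sync.RWMutex
	sources = map[string]Source{}
)

// RegisterSource adds a Source under its kind. Registering the same kind
// twice is a programming error, so it panics at startup.
func RegisterSource(s Source) {
	mu.Lock()
	defer mu.Unlock()
	if _, dup := sources[s.Kind()]; dup {
		panic("source kind registered twice: " + s.Kind())
	}
	sources[s.Kind()] = s
}

// LookupSource returns the Source registered for kind, if any.
func LookupSource(kind string) (Source, bool) {
	mu.RLock()
	defer mu.RUnlock()
	s, ok := sources[kind]
	return s, ok
}

type imageSource struct{}

func (imageSource) Kind() string { return "image" }

// In a real plugin package this init sits next to the implementation, and the
// server binary pulls it in with a blank import of that package. Here it runs
// in the same file so the sketch stays self-contained.
func init() {
	RegisterSource(imageSource{})
}

func main() {
	if s, ok := LookupSource("image"); ok {
		fmt.Println("found source:", s.Kind())
	}
	_, ok := LookupSource("compose")
	fmt.Println("compose registered:", ok)
}
```

The point worth documenting is that nothing outside the plugin package names the implementation: adding a kind means writing the package and adding one blank import.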
### i18n: page-level strings — PARTIAL
Already i18n'd:
- `nav.apps`, `nav.eventTriggers`, `nav.logScanRules` — top nav labels.
- Log Rules panel on `/apps/[id]` reuses `logscan.panel.*` keys
(shipped with the Observability work).
- All `/event-triggers/*` and `/log-scan-rules/*` page strings — keys
live under `triggers.*` and `logscan.*` namespaces in
`web/src/lib/i18n/{en,ru}.json`.

Still hardcoded English:
- `/apps/+page.svelte` — list page (hero, lede, stats, empty state,
table headers, status pills).
- `/apps/new/+page.svelte` — wizard labels, form copy, kind-aware
form rows (compose / image / static all hardcoded English today).
- `/apps/[id]/+page.svelte` — detail page sections (chain, env,
volumes, webhook, manual deploy, danger zone) — the Log Rules
panel embedded inside it is the only i18n'd section.

Roughly 80 to 100 keys across the three `/apps/*` pages once extracted.
Namespace: `apps.*` (with sub-namespaces `apps.list.*`, `apps.new.*`,
`apps.detail.*`, `apps.form.*`).
## Priority 4 — Tests we still don't have
Solid pure-function coverage landed in the prior turn. Still missing:
- **API-handler integration tests** for `/api/workloads/*` (CRUD, deploy,
env, volumes, webhook, chain, promote-from). Pattern: in-memory store +
fake deployer + fake docker / proxy / dns providers, exercise via
`httptest`.
- **Deployer dispatcher**: `DispatchPlugin` / `DispatchTeardown` /
`DispatchReconcile` with a fake Source registered.
- **Compose source**: `composeProjectName` sanitizer, `writeYAMLIfChanged`
short-circuit. (Both pure; just need fixtures.)
- **Static source Backend adapter** in `cmd/server/static_backend.go`.
## Open architectural questions
### Stages chain vs explicit Stage entity
`parent_workload_id` is now the canonical mechanism for stage chains
(dev → staging → prod). Decision deferred: do we need a separate `Stage`
entity at all, or is the chain sufficient? Currently feels like the chain
covers the use case — `promote-from` works, the UI shows the relationship.
Probably can leave the legacy `stages` table dropped entirely once cutover
proceeds.
### `Container.extra_json` evolution
Currently only the image source uses it (per-face proxy route IDs). If
other sources gain similar needs (compose service health metadata, static
build SHAs), the schema there should stay versionless and additive — every
reader must tolerate unknown keys. Document this in the source plugin
guide alongside the codemap entry.
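One way to keep `extra_json` versionless and additive is for each reader to decode only the keys it owns and for each writer to carry unknown keys through untouched. A minimal sketch, assuming a hypothetical `proxy_route_ids` key (the real key names are not given here):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// imageExtra lists only the keys the image source cares about.
// The key name is a hypothetical example.
type imageExtra struct {
	ProxyRouteIDs []string `json:"proxy_route_ids"`
}

// readImageExtra decodes the known keys. encoding/json ignores unknown
// fields by default, so keys written by other sources are tolerated.
func readImageExtra(extra []byte) (imageExtra, error) {
	var out imageExtra
	if len(extra) == 0 {
		return out, nil
	}
	err := json.Unmarshal(extra, &out)
	return out, err
}

// setExtraKey rewrites a single key and preserves every other key as-is.
func setExtraKey(extra []byte, key string, value any) ([]byte, error) {
	doc := map[string]json.RawMessage{}
	if len(extra) > 0 {
		if err := json.Unmarshal(extra, &doc); err != nil {
			return nil, err
		}
		if doc == nil { // a literal JSON null resets the map to nil
			doc = map[string]json.RawMessage{}
		}
	}
	raw, err := json.Marshal(value)
	if err != nil {
		return nil, err
	}
	doc[key] = raw
	return json.Marshal(doc)
}

func main() {
	stored := []byte(`{"proxy_route_ids":["r1"],"build_sha":"abc123"}`)

	known, err := readImageExtra(stored)
	if err != nil {
		panic(err)
	}
	fmt.Println(known.ProxyRouteIDs)

	updated, err := setExtraKey(stored, "proxy_route_ids", []string{"r1", "r2"})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(updated)) // build_sha survives the rewrite
}
```

The failure mode to document is the opposite approach: decoding into a struct and marshalling that struct back, which silently drops every key the struct does not declare.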
## File pointers for the next session
- Plugin contracts: `internal/workload/plugin/{plugin,source,trigger,types,registry}.go`
- Source implementations: `internal/workload/plugin/source/{image,compose,static}/`
- Trigger implementations: `internal/workload/plugin/trigger/{registry,git,manual}/`
- Dispatcher: `internal/deployer/dispatch.go`
- Webhook ingress (plugin path): `internal/webhook/handler.go` `handlePluginWorkloadWebhook`
- Reconciler hook: `internal/reconciler/reconciler.go` `reconcilePluginWorkloads`
- Static backend adapter (to be deleted post-port): `cmd/server/static_backend.go`
- Frontend pages: `web/src/routes/apps/+page.svelte`, `web/src/routes/apps/new/+page.svelte`, `web/src/routes/apps/[id]/+page.svelte`
- Tests: `internal/workload/plugin/trigger/*/*_test.go`, `internal/workload/plugin/source/image/image_helpers_test.go`, `internal/webhook/inbound_event_test.go`, `internal/store/workload_env_test.go`
## Memory pointer
Memory at
`C:/Users/Alexei/.claude/projects/c--Users-Alexei-Documents-docker-watcher/memory/`
already covers the Workload-first decision and the no-migration constraint.
Refresh as the cutover lands.