# Workload-First Refactor — Remaining Work

Handoff for resuming the refactor. The plugin architecture (Source × Trigger),
`/api/workloads` surface, `/apps` UI, env/volume/webhook/logs/chain panels,
multi-face proxy routes, blue-green image deploys, schema-driven wizard, and
test coverage on triggers / image helpers / webhook parser / store upserts are
**already landed and live**. What follows is what's still pending, in priority
order.

## Status at a glance

| Item | Priority | Status |
| ---- | -------- | ------ |
| Static source inline port | 1 | **PENDING** — only remaining blocker for hard cutover |
| Hard legacy cutover | 1 | **PENDING** — gated by static port (volume scopes blocker is resolved) |
| Generalized volume scopes | 2 | DONE |
| Kind-aware editors (compose / image / static) | 2 | DONE |
| Vendor-specific webhook parsing | 2 | DONE |
| Chain-panel CSS | 3 | DONE |
| Log Rules panel on `/apps/[id]` | adjacent | DONE — uses `getEffectiveLogScanRules` + per-workload override action |
| i18n for `/apps/*` page strings | 3 | **PARTIAL** — Log Rules panel + Observability surfaces i18n'd; `apps.*` namespace still pending |
| Docs / codemap entries for `internal/workload/plugin/` | 3 | **PENDING** |
| API-handler / dispatcher / compose-source / static-backend tests | 4 | **PENDING** |
| Triggers as first-class reusable entities (post-cutover) | 5 | **PENDING** |

Cross-references to the adjacent Observability work (Event Triggers + Log
Scanner backend + drop-counter stats panel) live in
[docs/LOGSCAN_AND_TRIGGERS_TODO.md](LOGSCAN_AND_TRIGGERS_TODO.md).

## Priority 1 — Architecture unlock

### Static source inline port — ~2150 LOC across 8 files

The current `internal/workload/plugin/source/static/` delegates to
`staticsite.Manager` via a phantom-row adapter
(`cmd/server/static_backend.go`) that keeps a synthetic row in the legacy
`static_sites` table per workload. This works but blocks the hard cutover —
you can't drop `static_sites` until the adapter is gone.

To port inline, the deploy pipeline body has to move into
`internal/workload/plugin/source/static/`:

| Source file | Lines | What to keep / port |
| --- | --- | --- |
| `internal/staticsite/manager.go` | 834 | Deploy / Stop / status pipeline. State should move to `containers` rows + `workload_env` instead of `static_sites`. |
| `internal/staticsite/gitea_content.go` | 360 | Keep as helper — Gitea content download/listing. |
| `internal/staticsite/github_provider.go` | 276 | Keep as helper. |
| `internal/staticsite/gitlab_provider.go` | 254 | Keep as helper. |
| `internal/staticsite/healthcheck.go` | 111 | Convert to plugin Reconcile body. |
| `internal/staticsite/markdown.go` | 83 | Keep as helper. |
| `internal/staticsite/provider.go` | 171 | Keep — provider abstraction. |
| `internal/staticsite/deno/` | (sub-pkg) | Keep — Dockerfile + router.ts codegen. |

Estimated as its own dedicated turn (or two). Strategy: keep the provider
abstraction + helpers exported; rewrite only `Manager.Deploy` body into a new
`source/static/deploy.go` that operates against `plugin.Workload` directly and
writes container rows + workload_env rather than the `static_sites` table.

### Hard legacy cutover

Sole remaining blocker is the static source inline port above. The
generalized-volume-scopes blocker is resolved (legacy `ResolvePath`
stays in place for legacy callers and dies with the cutover). When the
static port lands:

- Delete `/api/projects`, `/api/stacks`, `/api/sites`, `/api/stages` handlers.
- Drop tables: `projects`, `stages`, `stacks`, `stack_revisions`,
  `stack_deploys`, `static_sites`, `static_site_secrets`, `deploys`,
  `poll_states`.
- Delete `internal/stack/`, `internal/staticsite/` packages.
- Delete frontend `/projects`, `/sites`, `/stacks` routes.
- Delete legacy `volume.ResolvePath` + `internal/api/volume_browser.go`
  callers (the only remaining users).

## Priority 2 — Behavior gaps

### ~~Generalized volume scopes~~ — DONE

Landed: `internal/volume.ResolveWorkloadPath` (workload-keyed; sits next to the
legacy `ResolvePath` so legacy code paths keep working) plus the wired-through
`computeMounts` in `internal/workload/plugin/source/image/image.go`. All
`VolumeScope` values are now honored at deploy time:

- `absolute` — host bind, validated against `settings.AllowedVolumePaths`.
- `ephemeral` — tmpfs.
- `instance` — per-tag dir under `<base>/<workload>-<idShort>/instance-<tag>/<source>`.
- `stage`, `project` — both collapse to `<base>/<workload>-<idShort>/<source>`.
- `project_named` — Docker named volume prefixed `tf-<idShort>-<name>`.
- `named` — Docker named volume by raw name.

Test coverage: `internal/volume/resolver_test.go` (table-driven, portable
Linux/Windows). The legacy `ResolvePath` stays in place for legacy deployer +
volume-browser callers and dies with the hard cutover.

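For orientation, the scope table above could be expressed roughly as follows. This is a minimal sketch, not the body of `ResolveWorkloadPath`: the `Mount` and `Workload` types, the `resolve` function, and its parameters are all hypothetical, and forward-slash paths are used only to keep the example deterministic.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// Mount is a hypothetical result type: Kind is "bind", "tmpfs", or "volume".
type Mount struct {
	Kind   string
	Source string // host path or Docker volume name; empty for tmpfs
}

// Workload carries only the two fields the path layout needs (assumed names).
type Workload struct {
	Name    string
	IDShort string
}

// resolve maps one volume entry to a mount according to its scope.
func resolve(base string, allowed []string, w Workload, scope, source, tag string) (Mount, error) {
	root := path.Join(base, w.Name+"-"+w.IDShort)
	switch scope {
	case "absolute":
		// Host bind: accept only paths at or below an allowed root.
		clean := path.Clean(source)
		for _, a := range allowed {
			prefix := strings.TrimSuffix(path.Clean(a), "/") + "/"
			if clean == path.Clean(a) || strings.HasPrefix(clean, prefix) {
				return Mount{Kind: "bind", Source: clean}, nil
			}
		}
		return Mount{}, fmt.Errorf("%q is outside the allowed volume paths", source)
	case "ephemeral":
		return Mount{Kind: "tmpfs"}, nil
	case "instance":
		return Mount{Kind: "bind", Source: path.Join(root, "instance-"+tag, source)}, nil
	case "stage", "project":
		return Mount{Kind: "bind", Source: path.Join(root, source)}, nil
	case "project_named":
		return Mount{Kind: "volume", Source: "tf-" + w.IDShort + "-" + source}, nil
	case "named":
		return Mount{Kind: "volume", Source: source}, nil
	}
	return Mount{}, fmt.Errorf("unknown volume scope %q", scope)
}

func main() {
	w := Workload{Name: "api", IDShort: "ab12cd"}
	for _, scope := range []string{"instance", "project", "project_named", "ephemeral"} {
		m, err := resolve("/srv/volumes", nil, w, scope, "data", "v1")
		fmt.Println(scope, m.Kind, m.Source, err)
	}
}
```

Keeping every scope in a single switch makes the `stage` / `project` collapse explicit and gives a table-driven test one function to target.
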
### ~~Kind-aware editors on `/apps/new` and `/apps/[id]` edit~~ — DONE

All three Source plugins now have hand-rolled forms on both pages, with
an "Advanced JSON" toggle preserved as the power-user escape hatch.
Submit logic marshals form fields back into the same JSON shape the
backend already expects — no API or store changes required.

**Principle:** the plugin contract makes new Source / Trigger kinds cheap
on the backend, but the UI is not cheap by default — every kind needs a
paired hand-rolled form to be daily-driver usable. The shared JSON
editor is the fallback for power users and brand-new plugins, not the
end state. New Source / Trigger merge requests should treat "ship the
kind-aware form" as part of done, not a follow-up.

**Landed:**

- `compose`: YAML textarea + project_name input on both `/apps/new`
  and `/apps/[id]`.
- `image`: form fields for image / port / healthcheck / default_tag /
  registry_name / cpu_limit / memory_limit / max_instances on both
  pages. Registry name is a select populated from `/api/registries`
  (with text-input fallback when the list is empty). env + volumes
  stay in their detail-page panels and round-trip through the form
  via `imageFormBody` so manual edits aren't clobbered.
- `static`: provider select (gitea / github / gitlab), base URL,
  repo_owner / repo_name (both required), branch (default "main"),
  folder_path, access_token (password input, for private repos),
  mode radio (static / deno), render_markdown checkbox. The
  storage_enabled / storage_limit_mb fields aren't surfaced as
  form controls yet, but they round-trip through `staticFormBody`
  so values set via the raw JSON editor survive form edits.

**Still pending forms:** none — all three Source plugins now have
hand-rolled forms on both `/apps/new` and `/apps/[id]`.

The raw JSON editor stays available behind the "Advanced JSON" toggle
(shipped with compose) so the plugin's full sample is still reachable
for power users and for any new plugin kind without a hand-rolled form.

Effort: per-kind form roughly half a turn each; can land incrementally.
Touches `web/src/routes/apps/new/+page.svelte` and the edit block in
`web/src/routes/apps/[id]/+page.svelte`. The Svelte side keeps
serializing into the same `source_config` JSON shape the backend
already expects — no API or store change required.

### ~~Vendor-specific webhook parsing for `/api/webhook/workloads/{secret}`~~ — DONE

Landed: `internal/webhook/vendor_parsers.go` plus rewrites in
`internal/webhook/handler.go` `buildInboundEvent`. The dispatch order is now:

1. Empty body → manual event.
2. Vendor-specific parsers, short-circuit on a recognized `X-*-Event`
   header — Gitea package, GitHub `package` / `registry_package`, GitHub
   push, Gitea push, GitLab `Push Hook` / `Tag Push Hook`.
3. Generic simple-body fallback: top-level `image` or top-level `ref` —
   what the legacy CI integrations already send.

Vendor parsers can populate fields the generic parser cannot: image
digest, `GitEvent.Vendor`, registry host. When a vendor parser claims a
request (header matches) it is authoritative — a malformed Gitea
package payload surfaces as an error rather than silently falling
through to the generic parser. Test coverage:
`internal/webhook/vendor_parsers_test.go` covers each vendor branch +
the routed-via-`buildInboundEvent` integration cases.

Open follow-ups deferred to future turns:

- GitLab Container Registry events use a custom envelope outside the
  webhook event surface — handle if a user reports needing it.
- Docker Hub webhook (push event) uses `{"push_data": {"tag": ...}, "repository": {...}}` — add when there's a user request.

## Priority 3 — Polish

### ~~Chain-panel CSS~~ — DONE

Landed: rules for `.chain-row`, `.chain-card` (with hover/transform on
anchors), `.chain-self` (brand-tinted highlight), `.chain-name`,
`.chain-label` (70px fixed-width mono column), `.chain-children-list`
(flex-wrap), plus a sub-600px stack to keep the panel usable on narrow
screens. Appended at the end of the `<style>` block in
`web/src/routes/apps/[id]/+page.svelte`.

### Docs / codemap entries

Nothing under `docs/CODEMAPS/` for `internal/workload/plugin/`. Should cover:

- The Source × Trigger contract + registry pattern (`init()` + blank-import in
  `cmd/server/main.go`).
- How a new Source kind is added (write `init()` registration, blank-import,
  add to wizard via `SchemaSample`).
- The dispatcher seam: `deployer.DispatchPlugin` / `DispatchTeardown` /
  `DispatchReconcile` and how the reconciler / webhook ingress / API
  handlers all flow through it.

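A codemap entry could open with a compact illustration of the registry pattern. The sketch below is an assumption about its general shape, not the project's actual contract: the `Source` interface is trimmed to two methods, and `RegisterSource`, `LookupSource`, `Kinds`, and the `SchemaSample` signature are hypothetical. In the real layout the plugin type and its `init()` would sit in their own package, pulled in by a blank import from `cmd/server/main.go`; here everything shares one file so it runs standalone.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Source is a trimmed-down, hypothetical version of the plugin contract.
type Source interface {
	Kind() string
	SchemaSample() string // sample config a wizard could prefill (assumed shape)
}

var (
	mu      sync.RWMutex
	sources = map[string]Source{}
)

// RegisterSource is meant to be called from a plugin package's init().
// A duplicate kind is a programming error, so it panics at startup.
func RegisterSource(s Source) {
	mu.Lock()
	defer mu.Unlock()
	if _, dup := sources[s.Kind()]; dup {
		panic("duplicate source kind: " + s.Kind())
	}
	sources[s.Kind()] = s
}

// LookupSource returns the plugin registered for kind, if any.
func LookupSource(kind string) (Source, bool) {
	mu.RLock()
	defer mu.RUnlock()
	s, ok := sources[kind]
	return s, ok
}

// Kinds lists the registered kinds in a stable order.
func Kinds() []string {
	mu.RLock()
	defer mu.RUnlock()
	out := make([]string, 0, len(sources))
	for k := range sources {
		out = append(out, k)
	}
	sort.Strings(out)
	return out
}

// imageSource stands in for a plugin that would live in its own package.
type imageSource struct{}

func (imageSource) Kind() string         { return "image" }
func (imageSource) SchemaSample() string { return `{"image":"nginx","port":80}` }

// init runs when the package is linked in; with one package per plugin,
// a blank import is what triggers the registration.
func init() { RegisterSource(imageSource{}) }

func main() {
	fmt.Println(Kinds())
	if s, ok := LookupSource("image"); ok {
		fmt.Println(s.SchemaSample())
	}
}
```

With this shape, adding a kind touches only the new package plus one blank-import line, which is the property the codemap entry should call out.
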
`README.md` should mention `/apps` as the new user surface and that
`/projects` / `/sites` / `/stacks` carry `Deprecation: true` headers.

### i18n: page-level strings — PARTIAL

Already i18n'd:

- `nav.apps`, `nav.eventTriggers`, `nav.logScanRules` — top nav labels.
- Log Rules panel on `/apps/[id]` reuses `logscan.panel.*` keys
  (shipped with the Observability work).
- All `/event-triggers/*` and `/log-scan-rules/*` page strings — keys
  live under `triggers.*` and `logscan.*` namespaces in
  `web/src/lib/i18n/{en,ru}.json`.

Still hardcoded English:

- `/apps/+page.svelte` — list page (hero, lede, stats, empty state,
  table headers, status pills).
- `/apps/new/+page.svelte` — wizard labels, form copy, kind-aware
  form rows (compose / image / static all hardcoded English today).
- `/apps/[id]/+page.svelte` — detail page sections (chain, env,
  volumes, webhook, manual deploy, danger zone) — the Log Rules
  panel embedded inside it is the only i18n'd section.

Roughly 80–100 keys across the three `/apps/*` pages once extracted.
Namespace: `apps.*` (with sub-namespaces `apps.list.*`, `apps.new.*`,
`apps.detail.*`, `apps.form.*`).

## Priority 4 — Tests we still don't have

Solid pure-function coverage landed in the prior turn. Still missing:

- **API-handler integration tests** for `/api/workloads/*` (CRUD, deploy,
  env, volumes, webhook, chain, promote-from). Pattern: in-memory store +
  fake deployer + fake docker / proxy / dns providers, exercise via
  `httptest`.
- **Deployer dispatcher**: `DispatchPlugin` / `DispatchTeardown` /
  `DispatchReconcile` with a fake Source registered.
- **Compose source**: `composeProjectName` sanitizer, `writeYAMLIfChanged`
  short-circuit. (Both pure; just need fixtures.)
- **Static source Backend adapter** in `cmd/server/static_backend.go`.

## Priority 5 — Post-cutover roadmap

### Triggers as first-class reusable entities

Today a trigger's config lives embedded in the workload row
(`workload.trigger_kind` plus `workload.trigger_config` JSON via the plugin
contract). One workload owns exactly one trigger; one trigger serves exactly
one workload. This couples two concepts that users increasingly want to keep
orthogonal:

- One **inbound webhook** fanning out to several workloads (a single CI push
  rebuilds dev + staging together).
- One **registry watcher** driving multiple workloads off the same image
  (different tag filters per binding, shared poll state).
- One **schedule** kicking off a batch of jobs.
- One **git push** filter shared by sibling stack services.

**Direction:** promote triggers to their own table with a join.

- `triggers` — `id`, `kind` (registry / git / webhook / schedule / manual /
  log_scan), `config` JSON, `secret`, `created_at`, audit fields.
- `workload_trigger_bindings` — `workload_id`, `trigger_id`, `binding_config`
  JSON (per-binding overrides: tag filter, path filter, branch filter), plus
  ordering / enabled flag.

The dispatcher seam stays unchanged — `deployer.DispatchPlugin` still receives
a `(Workload, TriggerEvent)` pair; the only change is that the event's source
is resolved through the binding row instead of the workload row.

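To make the binding model concrete, here is one way the fan-out could be resolved before each `(Workload, TriggerEvent)` pair is handed to the dispatcher. This is a sketch under assumptions: the `Binding` struct flattens `binding_config` into a single `BranchFilter` field, and `fanOut` is a hypothetical helper, not a planned API.

```go
package main

import (
	"fmt"
	"sort"
)

// Binding is a hypothetical row of workload_trigger_bindings. BranchFilter
// stands in for the per-binding override JSON.
type Binding struct {
	WorkloadID   string
	TriggerID    string
	BranchFilter string // empty matches every branch
	Enabled      bool
	Order        int
}

// fanOut returns the workloads one trigger event should reach, in binding
// order, skipping disabled bindings and those whose filter does not match.
func fanOut(triggerID, branch string, bindings []Binding) []string {
	var matched []Binding
	for _, b := range bindings {
		if b.TriggerID != triggerID || !b.Enabled {
			continue
		}
		if b.BranchFilter != "" && b.BranchFilter != branch {
			continue
		}
		matched = append(matched, b)
	}
	sort.SliceStable(matched, func(i, j int) bool {
		return matched[i].Order < matched[j].Order
	})
	ids := make([]string, 0, len(matched))
	for _, b := range matched {
		ids = append(ids, b.WorkloadID)
	}
	return ids
}

func main() {
	bindings := []Binding{
		{WorkloadID: "staging", TriggerID: "ci-push", Enabled: true, Order: 2},
		{WorkloadID: "dev", TriggerID: "ci-push", Enabled: true, Order: 1},
		{WorkloadID: "prod", TriggerID: "ci-push", BranchFilter: "release", Enabled: true, Order: 3},
		{WorkloadID: "docs", TriggerID: "nightly", Enabled: true, Order: 1},
	}
	// One CI push on main reaches dev and staging; prod waits for "release".
	fmt.Println(fanOut("ci-push", "main", bindings))
}
```

The 1:1 case falls out naturally: a workload with one auto-created trigger and one binding yields a single-element result.
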
**UX principle: first-class on the backend, inline by default in the UI.**
The workload create/edit form still has an "Add trigger" control that creates
a fresh trigger record in one step, so the 1:1 case (git push → this workload)
feels unchanged from today. Reuse is **opt-in** via a "Pick existing trigger"
picker on the same control. Triggers also get their own list/detail pages under
`/triggers` so the fan-out cases are discoverable and centrally manageable
(rotate secret once, audit once).

**Per-kind modal applies, same rule as Source plugins** — the create/edit
form for a trigger switches body by `kind` (git: repo / branch / path;
registry: image / tag regex; webhook: secret + payload preview; schedule:
cron). Backend cheap, UI requires a paired hand-rolled form per kind. Treat
"ship the kind-aware form" as part of done for any new trigger kind.

**Migration:** clean break (no migration) per the workload-first memory —
at cutover, each workload's embedded trigger config becomes a single
auto-created trigger record with a single binding row. No user-visible change
on day one; reuse becomes possible thereafter.

**Sequencing:** lands **after** the Priority 1 hard cutover. The embedded
trigger config works fine for the 1:1 case that dominates today; the
static-source inline port is the higher-value blocker. Treat this as the
next major arc once cutover ships.

**Touch points to expect:**

- `internal/workload/plugin/trigger/*` — kind handlers stay; only their input
  shape changes (read from binding + trigger row, not workload row).
- `internal/store/` — new `triggers` + `workload_trigger_bindings` tables and
  CRUD; remove `trigger_kind` / `trigger_config` from the workload row.
- `internal/api/workloads.go` — adapt the workload create/edit handlers to
  accept either "inline new trigger" or "bind existing trigger" payloads.
- New `/api/triggers` surface + `/triggers` frontend pages.
- `internal/webhook/handler.go` — inbound webhook now resolves to a trigger,
  fans out to all bound workloads.
- `internal/reconciler/reconciler.go` — registry watchers iterate triggers,
  not workloads; each trigger may fire N bindings.

## Open architectural questions

### Stages chain vs explicit Stage entity

`parent_workload_id` is now the canonical mechanism for stage chains
(dev → staging → prod). Decision deferred: do we need a separate `Stage`
entity at all, or is the chain sufficient? For now the chain appears to
cover the use case: `promote-from` works, and the UI shows the relationship.
The legacy `stages` table can probably be dropped entirely once the cutover
proceeds.

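As a reference point for that decision, walking a chain needs nothing beyond the parent pointer. The sketch below assumes that a workload's parent is the stage it promotes from (so `prod` points at `staging`); the `Workload` struct and the `chain` helper are hypothetical.

```go
package main

import "fmt"

// Workload keeps only the fields needed to walk a stage chain.
type Workload struct {
	ID       string
	ParentID string // parent_workload_id; empty for the chain root
}

// chain returns the IDs from the root down to id, for example
// dev, staging, prod. It rejects unknown IDs and cycles.
func chain(byID map[string]Workload, id string) ([]string, error) {
	seen := map[string]bool{}
	var rev []string
	for cur := id; cur != ""; {
		if seen[cur] {
			return nil, fmt.Errorf("cycle in stage chain at %q", cur)
		}
		w, ok := byID[cur]
		if !ok {
			return nil, fmt.Errorf("unknown workload %q", cur)
		}
		seen[cur] = true
		rev = append(rev, cur)
		cur = w.ParentID
	}
	// The walk collected leaf-to-root; reverse into root-to-leaf order.
	for i, j := 0, len(rev)-1; i < j; i, j = i+1, j-1 {
		rev[i], rev[j] = rev[j], rev[i]
	}
	return rev, nil
}

func main() {
	byID := map[string]Workload{
		"dev":     {ID: "dev"},
		"staging": {ID: "staging", ParentID: "dev"},
		"prod":    {ID: "prod", ParentID: "staging"},
	}
	fmt.Println(chain(byID, "prod"))
}
```

If a separate `Stage` entity is ever introduced, the cycle and unknown-parent checks are the invariants it would need to preserve.
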
### `Container.extra_json` evolution

Currently only the image source uses it (per-face proxy route IDs). If
other sources gain similar needs (compose service health metadata, static
build SHAs), the schema there should stay versionless and additive — every
reader must tolerate unknown keys. Document this in the source plugin
guide alongside the codemap entry.

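The "versionless and additive" rule maps directly onto how Go's `encoding/json` behaves: `json.Unmarshal` ignores keys that have no matching struct field. The sketch below shows a tolerant reader and a writer that preserves keys it does not own. The `imageExtra` struct, the `route_ids` and `build_sha` key names, and both helpers are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// imageExtra is a hypothetical view of the keys the image source cares about.
type imageExtra struct {
	RouteIDs map[string]string `json:"route_ids"`
}

// readImageExtra decodes only the known keys. encoding/json skips unknown
// keys by default, so data written by other sources is tolerated.
func readImageExtra(raw string) (imageExtra, error) {
	var e imageExtra
	if raw == "" {
		return e, nil
	}
	err := json.Unmarshal([]byte(raw), &e)
	return e, err
}

// setKey adds or replaces one top-level key and keeps every other key
// byte-for-byte, so one source never drops another source's data.
func setKey(raw, key string, value any) (string, error) {
	doc := map[string]json.RawMessage{}
	if raw != "" {
		if err := json.Unmarshal([]byte(raw), &doc); err != nil {
			return "", err
		}
		if doc == nil { // raw was the JSON literal null
			doc = map[string]json.RawMessage{}
		}
	}
	b, err := json.Marshal(value)
	if err != nil {
		return "", err
	}
	doc[key] = b
	out, err := json.Marshal(doc)
	return string(out), err
}

func main() {
	raw := `{"build_sha":"abc"}` // written by some other source
	raw, _ = setKey(raw, "route_ids", map[string]string{"web": "r1"})
	fmt.Println(raw)
	e, _ := readImageExtra(raw)
	fmt.Println(e.RouteIDs["web"])
}
```

A reader that opted into `Decoder.DisallowUnknownFields` would break this contract, which is worth stating explicitly in the plugin guide.
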
## File pointers for the next session

- Plugin contracts: `internal/workload/plugin/{plugin,source,trigger,types,registry}.go`
- Source implementations: `internal/workload/plugin/source/{image,compose,static}/`
- Trigger implementations: `internal/workload/plugin/trigger/{registry,git,manual}/`
- Dispatcher: `internal/deployer/dispatch.go`
- Webhook ingress (plugin path): `internal/webhook/handler.go` `handlePluginWorkloadWebhook`
- Reconciler hook: `internal/reconciler/reconciler.go` `reconcilePluginWorkloads`
- Static backend adapter (to be deleted post-port): `cmd/server/static_backend.go`
- Frontend pages: `web/src/routes/apps/+page.svelte`, `web/src/routes/apps/new/+page.svelte`, `web/src/routes/apps/[id]/+page.svelte`
- Tests: `internal/workload/plugin/trigger/*/*_test.go`, `internal/workload/plugin/source/image/image_helpers_test.go`, `internal/webhook/inbound_event_test.go`, `internal/store/workload_env_test.go`

## Memory pointer

Memory at
`C:/Users/Alexei/.claude/projects/c--Users-Alexei-Documents-docker-watcher/memory/`
already covers the Workload-first decision and the no-migration constraint.
Refresh as the cutover lands.