Workload-First Refactor — Remaining Work
Handoff for resuming the refactor. The plugin architecture (Source × Trigger),
/api/workloads surface, /apps UI, env/volume/webhook/logs/chain panels,
multi-face proxy routes, blue-green image deploys, schema-driven wizard, and
test coverage on triggers / image helpers / webhook parser / store upserts are
already landed and live. What follows is what's still pending, in priority
order.
Status at a glance
| Item | Priority | Status |
|---|---|---|
| Static source inline port | 1 | PENDING: only remaining blocker for hard cutover |
| Hard legacy cutover | 1 | PENDING: gated by static port (volume scopes blocker is resolved) |
| Generalized volume scopes | 2 | DONE |
| Kind-aware editors (compose / image / static) | 2 | DONE |
| Vendor-specific webhook parsing | 2 | DONE |
| Chain-panel CSS | 3 | DONE |
| Log Rules panel on `/apps/[id]` | adjacent | DONE: uses `getEffectiveLogScanRules` + per-workload override action |
| i18n for `/apps/*` page strings | 3 | PARTIAL: Log Rules panel + Observability surfaces i18n'd; `apps.*` namespace still pending |
| Docs / codemap entries for `internal/workload/plugin/` | 3 | PENDING |
| API-handler / dispatcher / compose-source / static-backend tests | 4 | PENDING |
| Triggers as first-class reusable entities (post-cutover) | 5 | PENDING |
Cross-references to the adjacent Observability work (Event Triggers + Log Scanner backend + drop-counter stats panel) live in docs/LOGSCAN_AND_TRIGGERS_TODO.md.
Priority 1 — Architecture unlock
Static source inline port — ~2150 LOC across 8 files
The current internal/workload/plugin/source/static/ delegates to
staticsite.Manager via a phantom-row adapter
(cmd/server/static_backend.go) that keeps a synthetic row in the legacy
static_sites table per workload. This works but blocks the hard cutover —
you can't drop static_sites until the adapter is gone.
To port inline, the deploy pipeline body has to move into
internal/workload/plugin/source/static/:
| Source file | Lines | What to keep / port |
|---|---|---|
| `internal/staticsite/manager.go` | 834 | Deploy / Stop / status pipeline. State should move to containers rows + workload_env instead of static_sites. |
| `internal/staticsite/gitea_content.go` | 360 | Keep as helper: Gitea content download/listing. |
| `internal/staticsite/github_provider.go` | 276 | Keep as helper. |
| `internal/staticsite/gitlab_provider.go` | 254 | Keep as helper. |
| `internal/staticsite/healthcheck.go` | 111 | Convert to plugin Reconcile body. |
| `internal/staticsite/markdown.go` | 83 | Keep as helper. |
| `internal/staticsite/provider.go` | 171 | Keep: provider abstraction. |
| `internal/staticsite/deno/` | (sub-pkg) | Keep: Dockerfile + router.ts codegen. |
Estimated as its own dedicated turn (or two). Strategy: keep the provider
abstraction + helpers exported; rewrite only Manager.Deploy body into a new
source/static/deploy.go that operates against plugin.Workload directly and
writes container rows + workload_env rather than the static_sites table.
Hard legacy cutover
Sole remaining blocker is the static source inline port above. The
generalized-volume-scopes blocker is resolved (legacy ResolvePath
stays in place for legacy callers and dies with the cutover). When the
static port lands:
- Delete `/api/projects`, `/api/stacks`, `/api/sites`, `/api/stages` handlers.
- Drop tables: `projects`, `stages`, `stacks`, `stack_revisions`, `stack_deploys`, `static_sites`, `static_site_secrets`, `deploys`, `poll_states`.
- Delete `internal/stack/`, `internal/staticsite/` packages.
- Delete frontend `/projects`, `/sites`, `/stacks` routes.
- Delete legacy `volume.ResolvePath` + `internal/api/volume_browser.go` callers (the only remaining users).
Priority 2 — Behavior gaps
Generalized volume scopes — DONE
Landed: internal/volume.ResolveWorkloadPath (workload-keyed; sits next to the
legacy ResolvePath so legacy code paths keep working) plus the wired-through
computeMounts in internal/workload/plugin/source/image/image.go. All
VolumeScope values are now honored at deploy time:
- `absolute`: host bind, validated against `settings.AllowedVolumePaths`.
- `ephemeral`: tmpfs.
- `instance`: per-tag dir under `<base>/<workload>-<idShort>/instance-<tag>/<source>`.
- `stage`, `project`: both collapse to `<base>/<workload>-<idShort>/<source>`.
- `project_named`: Docker named volume prefixed `tf-<idShort>-<name>`.
- `named`: Docker named volume by raw name.
Test coverage: internal/volume/resolver_test.go (table-driven, portable
Linux/Windows). The legacy ResolvePath stays in place for legacy deployer +
volume-browser callers and dies with the hard cutover.
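The scope rules above can be sketched as one switch. This is a minimal illustration, not the real `ResolveWorkloadPath`: the `Mount` type, the `resolveMount` name and its parameter list are assumptions made for the example, the allow-list check for `absolute` is left out, and it uses `path` rather than `path/filepath` so the output is the same on every OS.

```go
package main

import (
	"fmt"
	"path"
)

// Mount is a hypothetical result type: how one volume gets attached.
type Mount struct {
	Kind   string // "bind", "tmpfs", or "volume"
	Source string // host path or Docker volume name; empty for tmpfs
}

// resolveMount maps a scope to a mount following the rules listed above.
// The function name and signature are assumptions, not the real API.
func resolveMount(scope, base, workload, idShort, tag, source string) (Mount, error) {
	root := path.Join(base, workload+"-"+idShort)
	switch scope {
	case "absolute":
		// The real resolver also validates against settings.AllowedVolumePaths.
		return Mount{Kind: "bind", Source: source}, nil
	case "ephemeral":
		return Mount{Kind: "tmpfs"}, nil
	case "instance":
		return Mount{Kind: "bind", Source: path.Join(root, "instance-"+tag, source)}, nil
	case "stage", "project":
		// Both scopes collapse to the same per-workload directory.
		return Mount{Kind: "bind", Source: path.Join(root, source)}, nil
	case "project_named":
		return Mount{Kind: "volume", Source: "tf-" + idShort + "-" + source}, nil
	case "named":
		return Mount{Kind: "volume", Source: source}, nil
	}
	return Mount{}, fmt.Errorf("unknown volume scope %q", scope)
}

func main() {
	for _, scope := range []string{"instance", "stage", "project_named", "ephemeral"} {
		m, err := resolveMount(scope, "/data", "web", "ab12cd", "v1", "uploads")
		fmt.Println(scope, m.Kind, m.Source, err)
	}
}
```

A table-driven test over this function would take one row per scope, which is roughly the shape `resolver_test.go` is described as having.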
Kind-aware editors on /apps/new and /apps/[id] edit — DONE
All three Source plugins now have hand-rolled forms on both pages, with an "Advanced JSON" toggle preserved as the power-user escape hatch. Submit logic marshals form fields back into the same JSON shape the backend already expects, so no API or store changes are required.
Principle: the plugin contract makes new Source / Trigger kinds cheap on the backend, but the UI is not cheap by default — every kind needs a paired hand-rolled form to be daily-driver usable. The shared JSON editor is the fallback for power users and brand-new plugins, not the end state. New Source / Trigger merge requests should treat "ship the kind-aware form" as part of done, not a follow-up.
Landed:
- `compose`: YAML textarea + project_name input on both `/apps/new` and `/apps/[id]`.
- `image`: form fields for image / port / healthcheck / default_tag / registry_name / cpu_limit / memory_limit / max_instances on both pages. Registry name is a select populated from `/api/registries` (with text-input fallback when the list is empty). env + volumes stay in their detail-page panels and round-trip through the form via `imageFormBody` so manual edits aren't clobbered.
- `static`: provider select (gitea / github / gitlab), base URL, repo_owner / repo_name (both required), branch (default "main"), folder_path, access_token (password input, for private repos), mode radio (static / deno), render_markdown checkbox. The storage_enabled / storage_limit_mb fields aren't surfaced as form controls yet, but they round-trip through `staticFormBody` so values set via the raw JSON editor survive form edits.
Still pending forms: none — all three Source plugins now have
hand-rolled forms on both /apps/new and /apps/[id].
The raw JSON editor stays available behind the "Advanced JSON" toggle (shipped with compose) so the plugin's full sample is still reachable for power users and for any new plugin kind without a hand-rolled form.
Effort: per-kind form roughly half a turn each; can land incrementally.
Touches web/src/routes/apps/new/+page.svelte and the edit block in
web/src/routes/apps/[id]/+page.svelte. The Svelte side keeps
serializing into the same source_config JSON shape the backend
already expects — no API or store change required.
Vendor-specific webhook parsing for /api/webhook/workloads/{secret} — DONE
Landed: `internal/webhook/vendor_parsers.go` plus a rewrite of `buildInboundEvent` in
`internal/webhook/handler.go`. The dispatch order is now:
- Empty body → manual event.
- Vendor-specific parsers, short-circuit on a recognized `X-*-Event` header: Gitea package, GitHub `package` / `registry_package`, GitHub push, Gitea push, GitLab `Push Hook` / `Tag Push Hook`.
- Generic simple-body fallback: top-level `image` or top-level `ref`, which is what the legacy CI integrations already send.
Vendor parsers can populate fields the generic parser cannot: image
digest, GitEvent.Vendor, registry host. When a vendor parser claims a
request (header matches) it is authoritative — a malformed Gitea
package payload surfaces as an error rather than silently falling
through to the generic parser. Test coverage:
internal/webhook/vendor_parsers_test.go covers each vendor branch +
the routed-via-buildInboundEvent integration cases.
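A compressed sketch of that dispatch order, including the "authoritative once claimed" rule, might look like the following. Everything here is illustrative: `InboundEvent`, `classify`, `parseVendorPush` and `classifyRaw` are hypothetical names, the vendor branch only understands a push-style `ref` field, and the real parsers in `vendor_parsers.go` handle more event types and fields.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
)

// InboundEvent is a hypothetical, trimmed-down stand-in for the real event type.
type InboundEvent struct {
	Kind   string // "manual", "image", or "git"
	Vendor string // set only by vendor parsers
	Image  string
	Ref    string
}

// vendorHeaders lists the headers a vendor parser may claim, in check order.
var vendorHeaders = []struct{ header, vendor string }{
	{"X-Gitea-Event", "gitea"},
	{"X-GitHub-Event", "github"},
	{"X-Gitlab-Event", "gitlab"},
}

// classify follows the three-step order: empty body, vendor header, generic body.
func classify(h http.Header, body []byte) (InboundEvent, error) {
	if len(body) == 0 {
		return InboundEvent{Kind: "manual"}, nil
	}
	for _, v := range vendorHeaders {
		if h.Get(v.header) != "" {
			// A claimed request never falls through to the generic parser.
			return parseVendorPush(v.vendor, body)
		}
	}
	return parseGeneric(body)
}

// parseVendorPush is a simplified vendor parser that only reads "ref".
func parseVendorPush(vendor string, body []byte) (InboundEvent, error) {
	var p struct {
		Ref string `json:"ref"`
	}
	if err := json.Unmarshal(body, &p); err != nil {
		return InboundEvent{}, fmt.Errorf("%s payload: %w", vendor, err)
	}
	if p.Ref == "" {
		return InboundEvent{}, fmt.Errorf("%s payload: missing ref", vendor)
	}
	return InboundEvent{Kind: "git", Vendor: vendor, Ref: p.Ref}, nil
}

// parseGeneric accepts the simple body shape: top-level image or ref.
func parseGeneric(body []byte) (InboundEvent, error) {
	var p struct {
		Image string `json:"image"`
		Ref   string `json:"ref"`
	}
	if err := json.Unmarshal(body, &p); err != nil {
		return InboundEvent{}, err
	}
	switch {
	case p.Image != "":
		return InboundEvent{Kind: "image", Image: p.Image}, nil
	case p.Ref != "":
		return InboundEvent{Kind: "git", Ref: p.Ref}, nil
	}
	return InboundEvent{}, errors.New("body has neither image nor ref")
}

// classifyRaw is a demo convenience: at most one header, body as a string.
func classifyRaw(headerName, headerValue, body string) (InboundEvent, error) {
	h := http.Header{}
	if headerName != "" {
		h.Set(headerName, headerValue)
	}
	return classify(h, []byte(body))
}

func main() {
	fmt.Println(classifyRaw("", "", ""))
	fmt.Println(classifyRaw("X-Gitea-Event", "push", `{"ref":"refs/heads/main"}`))
	fmt.Println(classifyRaw("X-Gitea-Event", "package", `{not json`))
	fmt.Println(classifyRaw("", "", `{"image":"registry.example/app:1.2"}`))
}
```

The third call is the interesting one: the header claims the request, so the malformed body is reported as an error instead of being retried against the generic parser.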
Open follow-ups deferred to future turns:
- GitLab Container Registry events use a custom envelope outside the webhook event surface; handle if a user reports needing it.
- Docker Hub webhook (push event) uses `{"push_data": {"tag": ...}, "repository": {...}}`; add when there's a user request.
Priority 3 — Polish
Chain-panel CSS — DONE
Landed: rules for .chain-row, .chain-card (with hover/transform on
anchors), .chain-self (brand-tinted highlight), .chain-name,
.chain-label (70px fixed-width mono column), .chain-children-list
(flex-wrap), plus a sub-600px stack to keep the panel usable on narrow
screens. Appended at the end of the <style> block in
web/src/routes/apps/[id]/+page.svelte.
Docs / codemap entries
Nothing under docs/CODEMAPS/ for internal/workload/plugin/. Should cover:
- The Source × Trigger contract + registry pattern (`init()` + blank-import in `cmd/server/main.go`).
- How a new Source kind is added (write `init()` registration, blank-import, add to wizard via `SchemaSample`).
- The dispatcher seam: `deployer.DispatchPlugin` / `DispatchTeardown` / `DispatchReconcile` and how the reconciler / webhook ingress / API handlers all flow through it.
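For the codemap entry, the registry pattern could be illustrated with something like the sketch below. It is a single-file approximation under stated assumptions: the `Source` interface here has only two methods, and `RegisterSource`, `LookupSource` and `SourceKinds` are hypothetical names rather than the actual contents of `registry.go`. In the real tree each plugin lives in its own package, and its `init()` only runs because `cmd/server/main.go` blank-imports that package.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Source is a hypothetical, minimal stand-in for the real plugin contract.
type Source interface {
	Kind() string
	SchemaSample() string // sample config, e.g. for pre-filling the wizard
}

var (
	mu      sync.RWMutex
	sources = map[string]Source{}
)

// RegisterSource is meant to be called from a plugin package's init().
// Registering the same kind twice is treated as a programming error.
func RegisterSource(s Source) {
	mu.Lock()
	defer mu.Unlock()
	if _, dup := sources[s.Kind()]; dup {
		panic("duplicate source kind: " + s.Kind())
	}
	sources[s.Kind()] = s
}

// LookupSource returns the plugin registered for kind, if any.
func LookupSource(kind string) (Source, bool) {
	mu.RLock()
	defer mu.RUnlock()
	s, ok := sources[kind]
	return s, ok
}

// SourceKinds lists registered kinds in sorted order.
func SourceKinds() []string {
	mu.RLock()
	defer mu.RUnlock()
	kinds := make([]string, 0, len(sources))
	for k := range sources {
		kinds = append(kinds, k)
	}
	sort.Strings(kinds)
	return kinds
}

// imageSource would normally live in its own package and be pulled in
// by a blank import; it sits here only to keep the sketch in one file.
type imageSource struct{}

func (imageSource) Kind() string         { return "image" }
func (imageSource) SchemaSample() string { return `{"image":"nginx","port":80}` }

func init() { RegisterSource(imageSource{}) }

func main() {
	fmt.Println(SourceKinds())
	if s, ok := LookupSource("image"); ok {
		fmt.Println(s.SchemaSample())
	}
}
```

The point worth documenting is the failure mode: forgetting the blank import compiles cleanly and simply leaves the kind unregistered, so lookups fail at runtime rather than at build time.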
README.md should mention /apps as the new user surface and that
/projects / /sites / /stacks carry Deprecation: true headers.
i18n: page-level strings — PARTIAL
Already i18n'd:
- `nav.apps`, `nav.eventTriggers`, `nav.logScanRules`: top nav labels.
- Log Rules panel on `/apps/[id]` reuses `logscan.panel.*` keys (shipped with the Observability work).
- All `/event-triggers/*` and `/log-scan-rules/*` page strings: keys live under `triggers.*` and `logscan.*` namespaces in `web/src/lib/i18n/{en,ru}.json`.
Still hardcoded English:
- `/apps/+page.svelte`: list page (hero, lede, stats, empty state, table headers, status pills).
- `/apps/new/+page.svelte`: wizard labels, form copy, kind-aware form rows (compose / image / static all hardcoded English today).
- `/apps/[id]/+page.svelte`: detail page sections (chain, env, volumes, webhook, manual deploy, danger zone); the Log Rules panel embedded inside it is the only i18n'd section.
Roughly 80–100 keys across the three /apps/* pages once extracted.
Namespace: apps.* (with sub-namespaces apps.list.*, apps.new.*,
apps.detail.*, apps.form.*).
Priority 4 — Tests we still don't have
Solid pure-function coverage landed in the prior turn. Still missing:
- API-handler integration tests for `/api/workloads/*` (CRUD, deploy, env, volumes, webhook, chain, promote-from). Pattern: in-memory store + fake deployer + fake docker / proxy / dns providers, exercise via `httptest`.
- Deployer dispatcher: `DispatchPlugin` / `DispatchTeardown` / `DispatchReconcile` with a fake Source registered.
- Compose source: `composeProjectName` sanitizer, `writeYAMLIfChanged` short-circuit. (Both pure; just need fixtures.)
- Static source Backend adapter in `cmd/server/static_backend.go`.
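The "in-memory store + fake deployer, exercise via `httptest`" pattern could take roughly the following shape. This is a sketch with invented pieces: `memStore`, `fakeDeployer`, `newHandler` and the two routes are placeholders for illustration and do not reflect the real `/api/workloads` handler signatures or route layout.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync"
)

// memStore is a hypothetical in-memory stand-in for the real store.
type memStore struct {
	mu        sync.Mutex
	workloads map[string]string // id -> name
}

func (s *memStore) put(id, name string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.workloads[id] = name
}

func (s *memStore) has(id string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	_, ok := s.workloads[id]
	return ok
}

// fakeDeployer records deploy calls instead of touching Docker.
type fakeDeployer struct{ deployed []string }

func (d *fakeDeployer) Deploy(id string) { d.deployed = append(d.deployed, id) }

// newHandler wires the fakes into two placeholder routes.
func newHandler(st *memStore, dep *fakeDeployer) http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/api/workloads", func(w http.ResponseWriter, r *http.Request) {
		var in struct {
			ID   string `json:"id"`
			Name string `json:"name"`
		}
		if r.Method != http.MethodPost {
			w.WriteHeader(http.StatusMethodNotAllowed)
			return
		}
		if err := json.NewDecoder(r.Body).Decode(&in); err != nil || in.ID == "" {
			w.WriteHeader(http.StatusBadRequest)
			return
		}
		st.put(in.ID, in.Name)
		w.WriteHeader(http.StatusCreated)
	})
	mux.HandleFunc("/api/workloads/", func(w http.ResponseWriter, r *http.Request) {
		rest := strings.TrimPrefix(r.URL.Path, "/api/workloads/")
		id := strings.TrimSuffix(rest, "/deploy")
		// id == rest means the path did not end in /deploy.
		if r.Method != http.MethodPost || id == rest || !st.has(id) {
			w.WriteHeader(http.StatusNotFound)
			return
		}
		dep.Deploy(id)
		w.WriteHeader(http.StatusAccepted)
	})
	return mux
}

// do sends one in-process request and returns the status code.
func do(h http.Handler, method, target, body string) int {
	req := httptest.NewRequest(method, target, strings.NewReader(body))
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	st := &memStore{workloads: map[string]string{}}
	dep := &fakeDeployer{}
	h := newHandler(st, dep)
	fmt.Println(do(h, "POST", "/api/workloads", `{"id":"w1","name":"web"}`))
	fmt.Println(do(h, "POST", "/api/workloads/w1/deploy", ""))
	fmt.Println(dep.deployed)
}
```

Because the fake deployer only records calls, a test can assert on both the HTTP status and the side effect without a Docker daemon, proxy, or DNS provider being present.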
Priority 5 — Post-cutover roadmap
Triggers as first-class reusable entities
Today a trigger's config lives embedded in the workload row
(workload.trigger_kind plus workload.trigger_config JSON via the plugin
contract). One workload owns exactly one trigger; one trigger serves exactly
one workload. This couples two concepts that users increasingly want
orthogonal:
- One inbound webhook fanning out to several workloads (a single CI push rebuilds dev + staging together).
- One registry watcher driving multiple workloads off the same image (different tag filters per binding, shared poll state).
- One schedule kicking off a batch of jobs.
- One git push filter shared by sibling stack services.
Direction: promote triggers to their own table with a join.
- `triggers`: `id`, `kind` (registry / git / webhook / schedule / manual / log_scan), `config` JSON, `secret`, `created_at`, audit fields.
- `workload_trigger_bindings`: `workload_id`, `trigger_id`, `binding_config` JSON (per-binding overrides: tag filter, path filter, branch filter), plus ordering / enabled flag.
The dispatcher seam stays unchanged — deployer.DispatchPlugin still receives
a (Workload, TriggerEvent) pair; the only change is that the event's source
is resolved through the binding row instead of the workload row.
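To make the fan-out concrete, here is a rough sketch of resolving one trigger event through binding rows. The types and the `fanOut` function are hypothetical, and the per-binding override is reduced to a single tag-prefix filter purely for illustration; the real `binding_config` is described above as JSON carrying tag, path and branch filters.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Binding is a hypothetical in-memory view of a workload_trigger_bindings row.
type Binding struct {
	WorkloadID string
	TriggerID  string
	TagPrefix  string // illustrative per-binding override; empty accepts any tag
	Enabled    bool
	Order      int
}

// TriggerEvent is a hypothetical event emitted by one trigger.
type TriggerEvent struct {
	TriggerID string
	Tag       string
}

// fanOut returns the workloads that should receive ev: bindings of the
// firing trigger that are enabled and whose filter accepts the tag,
// sorted by their Order field.
func fanOut(bindings []Binding, ev TriggerEvent) []string {
	matched := make([]Binding, 0, len(bindings))
	for _, b := range bindings {
		if b.TriggerID != ev.TriggerID || !b.Enabled {
			continue
		}
		if b.TagPrefix != "" && !strings.HasPrefix(ev.Tag, b.TagPrefix) {
			continue
		}
		matched = append(matched, b)
	}
	sort.SliceStable(matched, func(i, j int) bool {
		return matched[i].Order < matched[j].Order
	})
	ids := make([]string, len(matched))
	for i, b := range matched {
		ids[i] = b.WorkloadID
	}
	return ids
}

func main() {
	bindings := []Binding{
		{WorkloadID: "prod", TriggerID: "t1", TagPrefix: "v", Enabled: true, Order: 2},
		{WorkloadID: "dev", TriggerID: "t1", Enabled: true, Order: 1},
		{WorkloadID: "old", TriggerID: "t1", Enabled: false},
	}
	fmt.Println(fanOut(bindings, TriggerEvent{TriggerID: "t1", Tag: "v1.4"}))
	fmt.Println(fanOut(bindings, TriggerEvent{TriggerID: "t1", Tag: "nightly"}))
}
```

Each returned workload ID would then be paired with the event and handed to the existing dispatcher, which is why the seam itself does not need to change.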
UX principle: first-class on the backend, inline by default in the UI.
The workload create/edit form still has an "Add trigger" control that creates
a fresh trigger record in one step, so the 1:1 case (git push → this workload)
feels unchanged from today. Reuse is opt-in via a "Pick existing trigger"
picker on the same control. Triggers also get their own list/detail pages under
/triggers so the fan-out cases are discoverable and centrally manageable
(rotate secret once, audit once).
Per-kind modal applies, same rule as Source plugins — the create/edit
form for a trigger switches body by kind (git: repo / branch / path;
registry: image / tag regex; webhook: secret + payload preview; schedule:
cron). Backend cheap, UI requires a paired hand-rolled form per kind. Treat
"ship the kind-aware form" as part of done for any new trigger kind.
Migration: clean break (no migration) per the workload-first memory — at cutover, each workload's embedded trigger config becomes a single auto-created trigger record with a single binding row. No user-visible change on day one; reuse becomes possible thereafter.
Sequencing: lands after the Priority 1 hard cutover. The embedded trigger config works fine for the 1:1 case that dominates today; the static-source inline port is the higher-value blocker. Treat this as the next major arc once cutover ships.
Touch points to expect:
- `internal/workload/plugin/trigger/*`: kind handlers stay; only their input shape changes (read from binding + trigger row, not workload row).
- `internal/store/`: new `triggers` + `workload_trigger_bindings` tables and CRUD; remove `trigger_kind` / `trigger_config` from the workload row.
- `internal/api/workloads.go`: adapt the workload create/edit handlers to accept either "inline new trigger" or "bind existing trigger" payloads.
- New `/api/triggers` surface + `/triggers` frontend pages.
- `internal/webhook/handler.go`: inbound webhook now resolves to a trigger, fans out to all bound workloads.
- `internal/reconciler/reconciler.go`: registry watchers iterate triggers, not workloads; each trigger may fire N bindings.
Open architectural questions
Stages chain vs explicit Stage entity
parent_workload_id is now the canonical mechanism for stage chains
(dev → staging → prod). Decision deferred: do we need a separate Stage
entity at all, or is the chain sufficient? So far the chain appears to
cover the use case: promote-from works and the UI shows the relationship.
The legacy stages table can probably be dropped entirely once the cutover
proceeds.
Container.extra_json evolution
Currently only the image source uses it (per-face proxy route IDs). If other sources gain similar needs (compose service health metadata, static build SHAs), the schema there should stay versionless and additive — every reader must tolerate unknown keys. Document this in the source plugin guide alongside the codemap entry.
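One way to honor the "versionless and additive" rule is sketched below. The key names (`proxy_route_ids`, `build_sha`) and both helper functions are assumptions made for this example, not the actual `extra_json` layout. Writers update a single key through a generic map so keys owned by other sources survive, and readers decode into a narrow struct, relying on `encoding/json` ignoring unknown fields by default.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// setExtraKey updates one key in an extra_json document and leaves every
// other key untouched, whether or not this code knows what it means.
func setExtraKey(extraJSON, key string, value interface{}) (string, error) {
	doc := map[string]json.RawMessage{}
	if extraJSON != "" {
		if err := json.Unmarshal([]byte(extraJSON), &doc); err != nil {
			return "", err
		}
	}
	raw, err := json.Marshal(value)
	if err != nil {
		return "", err
	}
	doc[key] = raw
	out, err := json.Marshal(doc) // map keys are emitted in sorted order
	if err != nil {
		return "", err
	}
	return string(out), nil
}

// imageExtra is a hypothetical view of the keys the image source cares about.
type imageExtra struct {
	ProxyRouteIDs map[string]string `json:"proxy_route_ids"`
}

// readImageExtra decodes only the known keys; unknown keys are skipped.
func readImageExtra(extraJSON string) (imageExtra, error) {
	var v imageExtra
	if extraJSON == "" {
		return v, nil
	}
	err := json.Unmarshal([]byte(extraJSON), &v)
	return v, err
}

func main() {
	doc := `{"proxy_route_ids":{"web":"r1"}}`
	doc, err := setExtraKey(doc, "build_sha", "abc123")
	fmt.Println(doc, err)
	v, err := readImageExtra(doc)
	fmt.Println(v.ProxyRouteIDs["web"], err)
}
```

The thing to avoid is the opposite pattern: decoding into a struct and re-encoding that struct on write, which silently drops every key the struct does not declare.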
File pointers for the next session
- Plugin contracts: `internal/workload/plugin/{plugin,source,trigger,types,registry}.go`
- Source implementations: `internal/workload/plugin/source/{image,compose,static}/`
- Trigger implementations: `internal/workload/plugin/trigger/{registry,git,manual}/`
- Dispatcher: `internal/deployer/dispatch.go`
- Webhook ingress (plugin path): `internal/webhook/handler.go` `handlePluginWorkloadWebhook`
- Reconciler hook: `internal/reconciler/reconciler.go` `reconcilePluginWorkloads`
- Static backend adapter (to be deleted post-port): `cmd/server/static_backend.go`
- Frontend pages: `web/src/routes/apps/+page.svelte`, `web/src/routes/apps/new/+page.svelte`, `web/src/routes/apps/[id]/+page.svelte`
- Tests: `internal/workload/plugin/trigger/*/!(_test).go`, `internal/workload/plugin/source/image/image_helpers_test.go`, `internal/webhook/inbound_event_test.go`, `internal/store/workload_env_test.go`
Memory pointer
Memory at
C:/Users/Alexei/.claude/projects/c--Users-Alexei-Documents-docker-watcher/memory/
already covers the Workload-first decision and the no-migration constraint.
Refresh as the cutover lands.