Follow-ups on commit 39e1e36 addressing review feedback from
go-reviewer / security-reviewer / typescript-reviewer.
Backend:
- New POST /api/triggers/{id}/fire (AdminOnly, schedule-only): operator
"Fire now" button — dispatches immediately without waiting for the
next natural interval. Persists last_fired_at BEFORE dispatch, same
ordering as the scheduler. Per-trigger in-flight guard (429 if a
fire is already running) to defend against rapid double-clicks /
runaway scripts. Refuses request when AdminOnly claims are absent
rather than logging an unattributable deploy.
- SetTriggerLastFired now validates timestamp parses as RFC3339 before
writing. Rejects empty string explicitly — empty-clears semantics
were dead (no caller) and would silently re-fire on next tick if
ever accidentally written. A future reset-cadence flow must add a
dedicated ClearTriggerLastFired so the call site is grep-able and
separately auditable.
- Scheduler logs WARN on catch-up fires (now - lastFired > 2× interval)
so the "surprise burst at restart" pattern shows up in audit logs.
- BindingResult reason strings extracted to package consts
(webhook.Reason*) so the scheduler and api fire-now classifications
stay in sync without string-matching drift.
- SECURITY NOTE on FanOutForTrigger documents that the
WebhookRequireSignature gate is ingress-only by design.
Frontend:
- Refactored /triggers/new (770 LOC → 155 LOC) and /triggers/[id]
(~350 LOC dropped) to use the shared TriggerKindForm. Eliminates the
triplicated per-kind state + buildConfig + canSubmit + template that
caused the d-unit regex drift in the prior commit.
- New seedTriggerKindFormState helper on TriggerKindForm primes the
form from a server-returned trigger config with defensive type
guards; resets per-kind slots first so re-seeding across kinds
doesn't inherit stale state.
- /triggers/[id] gains a Schedule status panel with Last Fired + Fire
Now button (gated on binding_count > 0). Confirmation dialog,
result flash, timer cleanup on unmount + new-fire (no stale-closure
race). EN+RU i18n parity.
Fourth trigger kind alongside registry/git/manual. Recurring time-interval
fires driven by a new internal/scheduler tick loop (default 30s, clamped
to 5m). Goes through the same webhook.Handler.FanOutForTrigger seam as
inbound HTTP webhooks, so per-binding concurrency, outcome accounting,
and config-merge semantics are identical.
Schema: triggers.last_fired_at TEXT column (additive ALTER for existing
DBs). Scheduler persists last_fired_at BEFORE dispatch so a panicking
Match cannot wedge a tight loop; failed deploys wait one full interval
before retry — correct trade-off for a periodic refresh trigger.
Frontend: TriggerKindForm + /triggers/new + /triggers/[id] gain the
schedule kind (4-col card grid, preset chips Hourly/Daily/Weekly,
custom interval input matched to Go time.ParseDuration syntax, optional
pinned reference). /triggers/[id] surfaces "last fired" on schedule rows.
EN+RU i18n in parity.
Review fixes from go-reviewer / security-reviewer / typescript-reviewer:
- Scheduler Start/Stop wrapped in sync.Once (no goroutine leak / double-
cancel panic on shutdown re-entry).
- shouldFire rejects sub-MinInterval as defense-in-depth against
hand-inserted rows that bypassed Validate.
- fire() asserts trigger Kind=="schedule" before dispatching.
- Aligned isValidInterval regex across all three frontend sites; reject
the unsupported "d" unit (Go time.ParseDuration doesn't accept it).
- formatLastFired falls back to lastFiredNever on malformed timestamps
rather than leaking raw bytes into the UI.
- main.go scheduler closure logs per-fire deployed/errored counts.
Closes the workload-first refactor by landing the Priority 3 polish
items and the Priority 4 test gap. Net: ~2,400 lines added,
~350 lines modified across 13 files.
Priority 3 — polish
- apps.* i18n namespace: 276 new keys across apps.list.* (27),
apps.new.* (91, sibling of existing apps.new.triggers.*), and
apps.detail.* (158, sibling of existing apps.detail.bindings.*).
EN+RU at 1314 keys each, perfectly in sync. /apps, /apps/new,
/apps/[id] now render entirely from i18n.
- New codemap docs/CODEMAPS/workload-plugin.md (238 lines):
Source × Trigger contract, dispatch seam, webhook fan-out path,
recipes for adding a new Source or Trigger kind. Plus
docs/CODEMAPS/INDEX.md gateway.
Priority 4 — tests
- internal/api/workloads_test.go (new, ~30 subtests): /api/workloads
CRUD + deploy + delete + env + volumes + chain + promote-from +
triggers list/inline-bind + auth gating + standalone /api/triggers
CRUD (create / dup-409 / kind filter / delete). Uses real
POST handlers via httptest.NewServer + a fake plugin source
registered under "testfakesource".
- internal/deployer/dispatch_test.go (new, 11 tests):
DispatchPlugin / DispatchTeardown / DispatchReconcile happy +
unknown-kind + propagated-error each; PluginDeps wiring; a real
2s-bounded RWMutex deadlock probe on PluginDeps vs SetDNSProvider.
- internal/workload/plugin/source/compose/compose_test.go (new,
~26 subtests): composeProjectName sanitization,
writeYAML / writeYAMLIfChanged hash short-circuit, Validate happy
+ bad inputs, Kind / SchemaSample.
Coverage delta on the workload-plugin path:
- internal/api: 1.1% → 16.0%
- internal/deployer: 0% → 54.1%
- internal/workload/plugin/source/compose: 0% → 38.5%
- Trigger plugins already at 87-95% from the trigger-split work.
Production fix surfaced by the tests
- store.CreateWorkload now self-references RefID = ID when caller
leaves RefID empty (the typical plugin-native path). The api
layer's broken backfill loop (called UpdateWorkload, which
deliberately omits ref_id) is gone. Multiple sibling plugin
workloads can now coexist under the UNIQUE(kind, ref_id) constraint.
Review fixes addressed before commit
- CRITICAL: deadlock-detect test gained a real 2s time.After (was
selecting on context.Background().Done() which never fires).
- HIGH: happy-path test now hard-asserts RefID = ID (was a t.Logf
that would silently pass after a production fix).
- HIGH: standalone /api/triggers CRUD coverage added (was bypassed
by the workload-side bind flow).
- HIGH: seedWorkload bypass deleted; tests now go through the
real POST /api/workloads handler.
- MEDIUM: withTempDir restore is a no-op (t.Setenv auto-restores);
dead `old := os.Getenv(...)` capture removed.
- MEDIUM: list-workloads test now asserts ID membership, not just
count.
Doc
- WORKLOAD_REFACTOR_TODO: all three Priority 1 items, Priority 3
polish, and Priority 4 tests marked DONE. The workload-first arc
is closed.
Wraps up the workload refactor with the fixes that came out of the multi-agent
code review (see docs/plans/workload-refactor.md "What actually shipped").
Backend:
- store.ReconcileContainer: separate write path so the 30s reconciler tick no
longer overwrites deployer-owned fields (subdomain, proxy_route_id,
npm_proxy_id, image_tag).
- Container.stage_id column + index; ListProxyRoutes / ListContainersByStageID
join via stage_id (survives stage rename), with legacy fallback to
(project_id, role=stage_name).
- Reconciler: workload-existence check (rejects forged tinyforge.workload.id
labels), skips inventing project-kind rows, child-context cancel before
wg.Wait() on shutdown.
- Transactional CRUD across projects / stacks / static_sites: parent UPDATE
and workload sync land in one transaction so secret rotations are durable.
- Webhook routing reads exclusively through workloads.webhook_secret; legacy
GetProjectByWebhookSecret / GetStaticSiteByWebhookSecret fallback removed.
- store.GetStackByComposeProjectName + indexed lookup (no more full-table
stack scan per compose container per tick).
- store.ListMissingSweepRows: filtered query for the missing-sweep.
- /api/instances/* handlers verify (workload_id, role) match URL
(project_id, stage_name) before mutating — closes the cross-project
hijack the security review flagged.
- extra_json no longer referenced from Go (column kept on disk for now).
Frontend:
- WorkloadContainers.svelte: generic detail-page panel reusable by stack and
site detail pages.
- Containers page polish: client-side kind/state filters over an unfiltered
fetch, URL-synced filters, race-safe loads via sequence number, EN+RU i18n,
sidebar counter via navCounts.containers.
Misc:
- scripts/dev-server.sh: tolerate empty netstat grep result.
- .gitignore: ignore docker-watcher binaries, .claude/worktrees/, .facts-sync.json.
End-to-end extraction of the Instance concept. After this commit:
* internal/store/instances.go — DELETED
* internal/store/models.go — Instance struct gone, ProxyRoute moved here
* containers table is the single source of truth for project/stack/site
container state. instances table is dropped via DROP TABLE migration
(idempotent; re-runnable on every boot).
* Legacy tinyforge.project / tinyforge.stage / tinyforge.instance-id
Docker labels are no longer emitted; only tinyforge.workload.{id,kind},
tinyforge.role, and tinyforge.managed are stamped on new containers.
Backend rewrites:
- internal/deployer: executeDeploy + blueGreenDeploy + rollback +
promote use store.Container natively. New
removeContainer() replaces removeInstance().
enforceMaxInstances reads via
ListContainersByStageID.
- internal/reconciler: legacy tinyforge.instance-id dispatch removed;
upsertByWorkloadLabel now finds existing rows
by docker container ID first and falls back to
the deterministic workloadID:role key.
- internal/stale/scanner: Scan + new FindStaleContainers walk the
containers table; emit StaleContainer JSON.
- internal/stats/collector: ListContainers replaces ListAllInstances.
- internal/webhook/handler: workload-secret lookup tried first; falls back
to project / static_site secret column.
- internal/api: instances.go, stale.go, stats.go, stats_history.go,
projects.go, settings.go, docker.go, dns.go all read /
write through Container.
Docker layer:
- ManagedContainer exposes WorkloadID/Kind/Role from the canonical labels.
- ListContainers filters by tinyforge.managed=true.
- Network creation uses LabelManaged instead of LabelProject.
Frontend:
- Instance type is now a Container alias; .status → .state,
.last_alive_at → .last_seen_at.
- InstanceCard takes stageId as a prop (no longer derived from Instance).
- StaleContainer JSON shape rewritten: { container, workload_name, role,
days_stale }. StaleContainerCard + /containers/stale page updated.
- ProjectCard / homepage / SystemHealthCard filter by .state.
The migration loop now tolerates "no such table" alongside "duplicate
column" / "already exists" so obsolete ALTER TABLE entries targeting the
dropped instances table no-op cleanly on first boot.
Tests: store + deployer + reconciler + webhook + staticsite + notify all
still pass. Frontend svelte-check: zero errors.
The Proxies page consumer (and the secondary callers in
internal/api/health.go and internal/api/settings.go) now read
from the normalized containers index instead of the instances
table. Stage ID is recovered through a (project_id, role=stage_name)
join — uniquely-indexed via the existing UNIQUE(project_id, name)
constraint on stages.
Source field stays "instance" for back-compat with the Proxies
page filter (the frontend keys off the literal string).
Three new tests pin the join shape, verify the npm_proxy_id-only
WHERE branch survives, and check that an orphan-role row falls
out of the join cleanly (catches a regression to LEFT JOIN).
CRUD on Project / Stack / StaticSite now keeps a paired Workload
row in sync. Secret setters (webhook secret, signing secret,
require-signature toggle, notification secret) all re-sync after
mutating the source-of-truth row so the workload row always
reflects the canonical state.
Delete cascades: DeleteProject/Stack/StaticSite now drop the
matching workload row plus any container index entries owned by
it, so global views don't show ghost rows.
Boot-time BackfillWorkloads scans every project/stack/site and
ensures each has a workload row. Idempotent — safe to run on
every restart, recovers from a deleted/missing workload row.
Behavior unchanged for existing call sites; the workloads table
just starts being populated. Deployer / reconciler / consumer
switchover land in the next commit.
Introduces the data layer for the Workload refactor (see
docs/plans/workload-refactor.md): three new tables and store
methods, no behavior changes elsewhere yet.
- workloads: unifying primitive over Project/Stack/StaticSite,
paired via UNIQUE(kind, ref_id). Notification + webhook config
hosted here so it lives in one place across kinds.
- containers: normalized index of every Tinyforge-managed
container with first-class subdomain/proxy_route_id/npm_proxy_id
columns (heavily queried by ListProxyRoutes / stale detection).
- apps: optional grouping of workloads; schema only, no UI in v1.
Foundation only — deployer surgery, reconciler, and consumer
switchover land in the next commit.
Persists every inbound webhook hit (project + site) so users can debug
"why didn't my deploy fire?" without grepping daemon logs. Surfaces a
14-day rolling history under the WebhookPanel on each project + site
detail page; refreshes every 30s while open. Daily cron prunes records
older than 14 days alongside the existing event log prune.
Schema:
- webhook_deliveries(id, target_type, target_id, target_name, received_at,
source_ip, signature_state, status_code, outcome, detail, body_size)
- indexes on (target_type,target_id,received_at) and (received_at)
Backend:
- store: WebhookDelivery model + Insert/List/Prune helpers
- webhook/handler: deferred recordDelivery() captures the final outcome
on every return path including HMAC rejects, image mismatch, no-stage,
auto_deploy=false, and successful deploys; signatureStateFor()
classifies "unconfigured" vs "missing" vs "invalid" vs "valid"
- api: GET /api/{projects,sites}/{id}/webhook/deliveries with
parseLimit() helper (default 50, max 200)
- main: daily prune cron retains the last 14 days
Frontend:
- WebhookDeliveryLog.svelte: panel with refresh button, status code +
outcome + signature badges, relative time tooltip-on-hover for
absolute time, source IP column
- Mounted below WebhookPanel on project + site detail pages
- en/ru i18n strings for outcome/signature enums and column labels
Adds an opt-in inbound HMAC scheme so a leaked URL alone is not enough
to forge deploy/sync requests — the caller must also know a separate
signing secret. Header format is X-Hub-Signature-256, matching the
Gitea/GitHub/GitLab convention so existing CI integrations work without
custom code.
Behaviour:
- per-project / per-site signing_secret is independent of the URL secret
- require_signature flag does a hard 401 on missing/invalid signatures
- even when require_signature is off, an *invalid* submitted signature
returns 401 — surfaces CI misconfiguration instead of silently passing
- comparison uses subtle/hmac.Equal (constant time)
Backend:
- store: webhook_signing_secret + webhook_require_signature columns on
projects + static_sites; scanProject helper, scan helpers updated; new
Set* helpers for both fields
- webhook/handler: verifyHMAC helper, body read once, integrated into
both project and site handlers
- api: per-entity signing-secret rotate / disable / require-toggle
endpoints under /api/{projects,sites}/{id}/webhook/...
Frontend:
- WebhookPanel gains optional signing handlers (no breaking change for
existing callers; signing UI hides when handlers aren't wired)
- one-shot reveal of the issued secret with copy + dismiss
- ToggleSwitch for require-signature, disabled until a secret is issued
- en/ru i18n strings
Tests:
- HMACRequiredAndValid (200 + deploy fires)
- HMACRequiredButMissing (401, no deploy)
- HMACPresentButWrong (401 even when require_signature=false)
- HMACOptionalUnsignedAccepted (200 when neither configured)
Adds an opt-in "auto_backup_before_deploy" setting that triggers a
"pre-deploy" backup at the start of every project deploy via the deploy
pipeline (covers both the async HTTP path and the sync poller/webhook
path). Failures are logged to the deploy log but do not abort — missing
a backup is preferable to refusing to ship a fix.
- store: settings.auto_backup_before_deploy column + scan/update wiring
- backup: accept "pre-deploy" as a valid backup_type
- deployer: small PreDeployBackuper interface, hooked into runDeploy
right after settings load and before any state-mutating work
- api: settings request/response surface the new flag
- web: ToggleSwitch on the backup settings page; "Pre-deploy" badge
variant in the backup list (badge-warning so it stands out)
- i18n: en/ru strings for the toggle, help text, and badge label
Outgoing notifications were bare POSTs with no auth and no way to verify
they came from Tinyforge. They also went out from one global URL only,
even though stages had a notification_url field, and static-site sync
emitted no events at all.
Schema: add notification_url + notification_secret (lazy-generated) to
settings, projects, stages and static_sites. Migrations are additive.
Notifier: SendSigned computes HMAC-SHA256 over the exact body bytes and
sends X-Hub-Signature-256 (GitHub-compatible — receivers built for
GitHub/Gitea/Forgejo verify out of the box). Aux headers
X-Tinyforge-Event/Delivery/Timestamp/Tier are advisory and not signed.
Empty secret => unsigned send for back-compat.
Resolution: deploys fall through stage > project > settings, sites fall
through site > settings. The secret travels with the URL that sourced
it, so any tier can sign even when its parents are unsigned. Site sync
events now actually emit (site_sync_success / site_sync_failure).
API: 12 new endpoints — {GET secret, POST regenerate, POST disable,
POST test} for each of the 4 tiers. SendSyncForTest returns
status_code/latency_ms/signature_sent/delivery_id/response_snippet so
the UI surfaces receiver feedback inline.
UI: shared OutgoingWebhookPanel.svelte fits the existing card aesthetic.
Signing-state pill, secret reveal-on-demand, regenerate/disable behind
ConfirmDialog modals (not inline strips — too easy to misclick), send-
test result card with colour-coded status. Wired into Settings →
Integrations, project edit form, per-stage edit, and per-site detail.
EN + RU i18n.
Tests: round-trip (sender signs, receiver verifies), tampered-body and
wrong-secret rejection, unsigned-send omits header, send-test surfaces
4xx, concurrent fan-out via Drain. Resolver precedence locked for both
deploy and site paths.
Docs: docs/webhooks.md with header reference, verifier snippets in
Node/Python/Go, and a recipe for the service-to-notification-bridge
generic webhook provider.
Security:
- rate limit /api/webhook routes per-IP and cap concurrent site syncs
- global SSE connection cap (256) with new sse_gate
- validate ?tail= and cap JSON log responses at 4 MiB
- strip ANSI/CSI/OSC and control bytes from streamed log lines
- redact webhook secret from request log middleware
- scrub host details from /api/health for non-admin viewers
- drop container_id from /api/system/stats/top for non-admins
- generate webhook secrets via crypto/rand; require >=32 chars on insert
- verify iid path consistency in streamContainerLogs
- LimitReader on site webhook body; reject malformed non-empty bodies
Concurrency / correctness:
- stats collector: Stop() no longer hangs without Start(), semaphore
acquired in parent loop so ctx cancellation short-circuits the queue,
in-flight tick cancellable via shared base context, zero-ts guard
- webhook handler: replace fire-and-forget goroutine with WaitGroup-tracked
workers + Drain() wired into graceful shutdown
- $derived(() => ...) mis-idiom fixed in ContainerStats / InstanceCard /
ProjectCard (returned function instead of value)
- SystemResourcesCard: rename `window` and `t` locals to avoid shadowing
globalThis.window and the i18n `t` import
Quality / performance:
- replace O(n^2) insertion sort with sort.Slice in stats top
- runMigrations only swallows duplicate-column / already-exists errors
- PruneStatsSamplesBefore wrapped in a transaction
- collapse N+1 in unusedImageStats / pruneImages to one ListAllInstances
pass; surface DB errors instead of silently treating them as inactive
- run Docker Info + DiskUsage in parallel via errgroup
- container log SSE emits `: ping` heartbeat every 20 s
- imageMatches case-insensitive on registry host (RFC behaviour)
- log warning on invalid stage tag pattern instead of silent skip
- reject malformed non-empty site webhook payloads
Frontend / i18n:
- shared formatBytes utility replaces three local copies
- statsInterval store drives dynamic "no samples / collection disabled"
copy across ContainerStats and SystemResourcesCard
- top consumers row now shows owner_name (project/stage or site name)
- drop seven `as any` casts on the Settings type; add cloudflare_api_token
write-only field
- move "Service status", "Docker daemon", "Docker unreachable",
"Proxy unreachable", "reachable", and "Docker daemon is not reachable."
strings into en/ru i18n bundles
Background collector samples CPU/memory/network/block I/O for every
instance and site on a configurable interval (default 15s, range
5-300s), persists samples to SQLite with a configurable retention
window (default 2h, range 0-24h), and skips ticks gracefully when
the Docker daemon is unreachable. Settings are reloadable without
a restart — each tick re-reads them.
New API endpoints:
- GET /api/system/stats (host snapshot: info + df)
- GET /api/system/stats/history
- GET /api/system/stats/top?by=cpu|memory
- GET /api/projects/{id}/stages/{s}/instances/{iid}/stats/history
- GET /api/sites/{id}/stats[/history]
- GET /api/sites/{id}/logs (SSE + JSON, reuses instance log streamer)
Frontend:
- ECharts added with tree-shaken imports (~180KB gzip) for
future-proof time-series/gantt/graph visualizations
- CollapsibleSection wraps all dashboard sections (system health,
daemons, system resources, static sites, projects) with
localStorage-persisted open state
- SystemResourcesCard shows capacity tiles, workload utilization
chart with 30m/2h/6h/24h window picker, disk breakdown with
reclaimable callouts, and top 5 consumers
- ContainerStats and ContainerLogs take a source discriminated union
so sites reuse the same components as instances; sites detail page
embeds both for Deno backend debugging
- Settings › Maintenance exposes collection interval + retention
- Docker-unavailable state returns 503 and renders an amber banner
instead of a generic 500
Full i18n coverage (en + ru) for all new strings.
Replace the single global webhook secret with entity-scoped secrets stored
on each project and static site. Webhook-driven project autocreate is
removed — projects must exist before their URL can trigger deploys.
Also wires static-site webhooks (sync_trigger=push|tag), turning the
previously inert "push" trigger into a functional one: POST the site's
webhook URL from a Git provider and Tinyforge re-syncs on matching refs.
- Adds webhook_secret columns + unique indexes to projects and static_sites
- Per-entity GET/regenerate endpoints under /api/projects/{id}/webhook
and /api/sites/{id}/webhook (admin-only)
- Removes /api/settings/webhook-url and the global webhook panel
- Reusable WebhookPanel Svelte component on both detail pages, i18n in en/ru
- Tests for matcher (siteRefMatches, ParseImageRef) and handler (project
match/mismatch/404 and site push/manual/branch-skip)
UI consistency
- ForgeHero now supports backHref, mono kicker, stats snippet, staggered
entrance animation, and a registration-tick divider
- Every route now opens with the same "THE FORGE // SECTION" eyebrow: projects,
sites, stacks, proxies, events, dns, deploy, settings, stale containers,
site/project detail + env/volumes/browse, new site wizard
- Stacks list/detail/new moved to the shared hero and brand-anchor eyebrow
- Toolbars migrated from bespoke buttons to the shared .forge-btn utilities
- Sidebar footline adds a live UTC "forge clock" and a vim-style g-prefix
quick-nav hint (g d/p/s/k/x/r/e/c jumps to each section)
Proxies page
- Server-side: merge static site proxy routes with instance routes and sort
by domain (internal/api/proxies.go, internal/store/static_sites.go)
- ProxyRoute gains a Source field ("instance" | "static_site")
- Frontend adds source filter tabs and per-source labels/badges
Adds a new Stacks feature: upload/edit docker-compose YAML,
deploy as atomic units, browse revisions, roll back, and
stream logs. Backend in internal/stack + internal/api/stacks.go,
persistent storage in internal/store/stacks.go.
Stacks pages (list, new, detail) use a modern Forge aesthetic —
Instrument Serif display type, JetBrains Mono for meta/code,
indigo ember accents, dot-grid hero, registration marks on
hover, terminal panel for logs. Palette is sourced from the
app's existing design tokens so the feature remains consistent
with the rest of Tinyforge.
Fonts self-hosted via @fontsource/instrument-serif and
@fontsource/jetbrains-mono to satisfy the strict CSP.
- Add storage_enabled and storage_limit_mb columns to static_sites.
- Create/attach Docker volumes (tinyforge-site-{name}-data) for Deno
sites with storage enabled, mounted at /app/data.
- Grant --allow-write=/app/data in Deno container CMD.
- Add storage usage API endpoint (GET /api/sites/{id}/storage).
- Show storage section in site detail page with usage bar.
- Add storage toggle and limit field to new site wizard.
- Use ConfirmDialog for secret deletion instead of inline delete.
Rebrand the project as Tinyforge to reflect its evolution from a Docker
container watcher into a self-hosted mini CI/deployment platform.
Rename covers: Go module path, Docker labels, DB/config filenames,
JWT issuer, Dockerfile binary, docker-compose, CI workflows, frontend
i18n, README with static sites docs, and all code comments.
- Add public_ip field to Settings for DNS A records (proxy/load balancer IP)
- DNS records now use public_ip, falling back to server_ip if empty
- Server IP renamed to "Server IP (Docker Host)" for clarity
- Public IP labeled "Public IP (DNS Target)"
- Updated help texts for domain, server IP, public IP, and Docker network
- DB migration + schema for public_ip column
Resource limits:
- Add cpu_limit (cores) and memory_limit (MB) fields to Stage model
- Pass limits to Docker container via NanoCPUs and Memory in HostConfig
- Add CPU/Memory fields to stage creation form in project detail
- 0 = unlimited (default)
NPM access list:
- Add npm_access_list_id to Settings (global default) and Project (per-project override)
- Per-project overrides global when > 0
- NPM provider passes access_list_id when configuring proxy hosts
- Add GET /api/settings/npm-access-lists endpoint to list NPM access lists
- Add access list picker on NPM settings page (global)
- Add access list ID field on project edit form (per-project)
- DB migrations for all new columns
- Add npm_remote setting: when enabled, proxy forwards to server_ip with
published host ports instead of Docker container names
- Deployer looks up assigned host port via InspectContainerPort in remote mode
- Auto-remove stale containers with same name before creating new ones
- Add Remote NPM toggle with warning on NPM settings page
- DB migration + schema for npm_remote column
- Add per-event delete button (trash icon on hover) in event log entries
- Set Docker network default to 'docker-watcher' in DB schema + migration for existing DBs
- Parse Go duration strings (5m, 1h) to seconds in settings UI, convert back on save
- Clear error when network is empty in deployer instead of hidden fallback
- Add enable_proxy toggle to Quick Deploy form (defaults to on)
- Add DELETE /api/events/log/{id} and DELETE /api/events/log endpoints
- Add Clear All button with confirmation on Events page
- Rename "NPM Proxy" to "Enable Proxy" on stage form (provider-agnostic)
- Fix polling interval validation (min 60s) and number input trim errors
- Fix domain field no longer required in settings
- Add /proxies page showing deploy-managed proxy routes with project/stage links, search, and status
- Add GET /api/proxies endpoint joining instances with project/stage names
- Add POST /api/settings/npm/test endpoint for NPM connection validation
- Add GET /api/auth/mode public endpoint for auth mode detection
- Add NPM Test Connection button with validation on save
- Fix OIDC SSO button only shown when auth_mode is oidc
- Fix webhook URL showing empty when domain not set (fallback to request host)
- Fix quick deploy double-tag (image:latest:latest) by splitting tag from image URL
- Fix trim() errors on number inputs in deploy and settings forms
- Fix NPM client auto-append /api to base URL
- Sanitize NPM test error messages (no raw HTML)
- Remove healthcheck field from Quick Deploy form
- Fix env vars placeholder newline
- Make domain field optional in settings
- Set polling interval minimum to 60s
- Add Proxies and Events to sidebar navigation
- Fix SSL cert name flash on NPM settings page
- Fix empty state icon on proxies page
Replace direct npm.Client usage throughout the codebase with the
proxy.Provider interface, enabling pluggable proxy backends. The
deployer, API layer, and proxy manager now use provider-agnostic
route management (ConfigureRoute/DeleteRoute) instead of NPM-specific
API calls. Adds ProxyRouteID (string) to Instance model and
ProxyProvider setting to Settings, with SQLite migrations for
backward compatibility.
- Detect existing projects with same image on quick deploy; show conflict dialog with options
- Move logout button to sidebar header as icon-only
- Replace backup checkbox with ToggleSwitch component
- Allow unsafe-inline in CSP script-src for SvelteKit hydration
- Guard SSE connection behind isAuthenticated() check
- Add notification_url ALTER TABLE migration for existing databases
- Restore RegisterPersistentLogger on event bus
- Expand health endpoint to check DB, Docker, and NPM connectivity (FUNC-M4)
- Add project_id, stage_id, offset query params to deploys endpoint (FUNC-M5, FUNC-M6)
- Add notification_url field to Stage model for per-project overrides (FUNC-M2)
- Add NPM Ping method for health checking
- Sanitize all internal error messages in API handlers (SEC-M4)
- Add audit trail events for admin actions (FUNC-M3)
- Add EventLog event type to event bus
Add backup/restore functionality for the SQLite database. Users can
trigger manual backups, configure automatic backups on an interval
with retention policies, list/download/delete backups, and restore
from any backup.
- Backup engine using VACUUM INTO (safe with WAL mode)
- Backup metadata tracked in DB, files stored in DATA_DIR/backups/
- Settings: backup_enabled, backup_interval_hours, backup_retention_count
- API: POST/GET/DELETE /api/backups, download, restore endpoints
- Autobackup via cron scheduler with configurable interval
- Retention: prune on startup, after each backup (manual and auto)
- Orphan cleanup: removes backup files without metadata on startup
- Restore: replaces DB and triggers graceful server shutdown
- Settings UI: /settings/backup with toggle, interval, retention config
- Backup list with download, delete, restore actions
- i18n: English and Russian translations
Add flexible DNS management to Docker Watcher. By default, wildcard DNS
is assumed (current behavior). When disabled, users can configure a
Cloudflare DNS provider with API token and zone selection. DNS A records
are automatically created/updated/deleted in sync with proxy consumers
(deployed instances and standalone proxies).
- Settings: wildcard_dns toggle, dns_provider, cloudflare credentials
- Cloudflare client: Provider interface with EnsureRecord/DeleteRecord/ListRecords
- DNS lifecycle hooks in deployer and proxy manager (best-effort)
- Settings UI: DNS config section with provider picker, zone selector, test button
- DNS Records page at /dns with filtering, sync status, reconciliation
- Records visible in both wildcard and managed modes
- Cleanup on provider change: removes old records when switching modes
- Add 'absolute' volume scope for direct host paths (NFS, external mounts)
- Allowlist in settings: allowed_volume_paths (JSON array of prefixes)
- Validation: absolute source must be under an allowed prefix
- Empty allowlist = absolute scope disabled entirely
- Settings API exposes/validates allowed_volume_paths
- Frontend type updated with absolute scope
- CRITICAL: validate volume Name against path traversal (safe regex)
- HIGH: log data migration errors instead of silently ignoring
- HIGH: reject empty source when switching from ephemeral scope
Replace confusing shared/isolated volume modes with explicit scopes:
- instance: per-deploy isolated directory
- stage: shared within a stage across deploys
- project: shared across all stages
- project_named: named group within a project
- named: global named volume across projects
- ephemeral: tmpfs in-memory mount
Includes schema migration (shared→project, isolated→instance),
backward-compatible deployer resolution, scope metadata API endpoint,
and redesigned volume editor UI with scope guide cards and hints.
Critical fixes:
- Fix StaleContainer frontend type to match nested backend response shape
- Guard ContainerID[:12] slice against empty/short IDs in ListAllProxies
High-priority fixes:
- Support comma-separated severity/source in event log filtering (IN clause)
- Eliminate N+1 queries in ListAllProxies and FindStaleInstances (pre-load maps)
- Stop leaking internal error messages to API clients (use slog + generic msgs)
Add database foundation for observability features:
- event_log table with severity/source filtering and pagination
- standalone_proxies table for user-created reverse proxies
- stale_threshold_days setting (default 7 days)
- Auto-persist warn/error events from event bus to database
- SSE broadcast of persistent events for real-time UI updates
- Frontend types and API functions for downstream UI phases
Add enable_proxy boolean to stages (default true). When disabled,
the deployer skips NPM proxy host creation — useful for internal
services, workers, or externally-routed containers. UI shows
toggle in Add Stage form and "No Proxy" badge on stage header.
Security: apply AdminOnly middleware to mutating routes, require
ENCRYPTION_KEY and ADMIN_PASSWORD (no insecure defaults), restrict
CORS to same-origin, fix OIDC token delivery via cookie instead of
URL query param, add rate limiting on login, add MaxBytesReader,
validate volume paths against traversal, add security headers,
validate user roles, add Secure flag to OIDC cookie.
Performance: set SQLite MaxOpenConns(1) to prevent SQLITE_BUSY,
add FK indexes on 8 columns, track notifier goroutines with
WaitGroup for graceful shutdown, use GetRegistryByName instead of
GetAllRegistries in deployer, pass basePath param to avoid redundant
settings query, return empty slices from store to remove reflection.
Quality: refactor TriggerDeploy to delegate to runDeploy (~100 lines
removed), consolidate duplicated utilities (extractPort, boolToInt,
now, isTerminalStatus) into shared exports, migrate all log.Printf
to slog structured logging, use consistent webhook response envelope,
remove dead code (parseEnvVars, duplicate auth types).
UX: clean up NPM proxy on instance removal via API, add README with
quickstart guide, add .env.example, require ADMIN_PASSWORD in
docker-compose, document staging-net prerequisite.
Add global base_volume_path to settings. Relative volume source
paths are automatically prepended with the base path at deploy
time. Absolute paths are used as-is. Configurable in Settings >
General.
- Add ListImages() to registry interface, implement for Gitea
- Add owner field to registry config (needed for Gitea packages API)
- GET /api/registries/:id/images endpoint
- "Browse Images" button on Projects and Quick Deploy pages
- Image dropdown with registry grouping and search
- i18n support (EN/RU) for all new UI strings