Background worker that keeps the containers table in sync with
docker ps. Runs one boot pass and ticks every 30s.
Dispatch precedence per container:
1. tinyforge.workload.id label (canonical, new)
2. tinyforge.instance-id label (legacy project — joins via instances)
3. tinyforge.static-site label (legacy site)
4. com.docker.compose.project (stacks — joins via ComposeProjectName)
Rows whose Docker container ID is no longer present are flipped
to state='missing'. Placeholder rows (empty container_id, e.g.
a deploy mid-flight) are left alone so a tick that races a
deploy doesn't mark them as missing.
DockerLister interface lets tests substitute a fake daemon —
6 unit tests cover the dispatch matrix, missing-sweep, and
state normalization.
Wired into cmd/server/main.go between docker.New and the
existing startup chain. Boot pass populates the containers
table from any pre-refactor running containers.
CRUD on Project / Stack / StaticSite now keeps a paired Workload
row in sync. Secret setters (webhook secret, signing secret,
require-signature toggle, notification secret) all re-sync after
mutating the source-of-truth row so the workload row always
reflects the canonical state.
Delete cascades: DeleteProject/Stack/StaticSite now drop the
matching workload row plus any container index entries owned by
it, so global views don't show ghost rows.
Boot-time BackfillWorkloads scans every project/stack/site and
ensures each has a workload row. Idempotent — safe to run on
every restart, recovers from a deleted/missing workload row.
Behavior unchanged for existing call sites; the workloads table
just starts being populated. Deployer / reconciler / consumer
switchover land in the next commit.
Persists every inbound webhook hit (project + site) so users can debug
"why didn't my deploy fire?" without grepping daemon logs. Surfaces a
14-day rolling history under the WebhookPanel on each project + site
detail page; refreshes every 30s while open. Daily cron prunes records
older than 14 days alongside the existing event log prune.
Schema:
- webhook_deliveries(id, target_type, target_id, target_name, received_at,
source_ip, signature_state, status_code, outcome, detail, body_size)
- indexes on (target_type,target_id,received_at) and (received_at)
Backend:
- store: WebhookDelivery model + Insert/List/Prune helpers
- webhook/handler: deferred recordDelivery() captures the final outcome
on every return path including HMAC rejects, image mismatch, no-stage,
auto_deploy=false, and successful deploys; signatureStateFor()
classifies "unconfigured" vs "missing" vs "invalid" vs "valid"
- api: GET /api/{projects,sites}/{id}/webhook/deliveries with
parseLimit() helper (default 50, max 200)
- main: daily prune cron retains the last 14 days
Frontend:
- WebhookDeliveryLog.svelte: panel with refresh button, status code +
outcome + signature badges, relative time tooltip-on-hover for
absolute time, source IP column
- Mounted below WebhookPanel on project + site detail pages
- en/ru i18n strings for outcome/signature enums and column labels
Adds an opt-in "auto_backup_before_deploy" setting that triggers a
"pre-deploy" backup at the start of every project deploy via the deploy
pipeline (covers both the async HTTP path and the sync poller/webhook
path). Failures are logged to the deploy log but do not abort — missing
a backup is preferable to refusing to ship a fix.
- store: settings.auto_backup_before_deploy column + scan/update wiring
- backup: accept "pre-deploy" as a valid backup_type
- deployer: small PreDeployBackuper interface, hooked into runDeploy
right after settings load and before any state-mutating work
- api: settings request/response surface the new flag
- web: ToggleSwitch on the backup settings page; "Pre-deploy" badge
variant in the backup list (badge-warning so it stands out)
- i18n: en/ru strings for the toggle, help text, and badge label
Outgoing notifications were bare POSTs with no auth and no way to verify
they came from Tinyforge. They also went out from one global URL only,
even though stages had a notification_url field, and static-site sync
emitted no events at all.
Schema: add notification_url + notification_secret (lazy-generated) to
settings, projects, stages and static_sites. Migrations are additive.
Notifier: SendSigned computes HMAC-SHA256 over the exact body bytes and
sends X-Hub-Signature-256 (GitHub-compatible — receivers built for
GitHub/Gitea/Forgejo verify out of the box). Aux headers
X-Tinyforge-Event/Delivery/Timestamp/Tier are advisory and not signed.
Empty secret => unsigned send for back-compat.
Resolution: deploys fall through stage > project > settings, sites fall
through site > settings. The secret travels with the URL that sourced
it, so any tier can sign even when its parents are unsigned. Site sync
events now actually emit (site_sync_success / site_sync_failure).
API: 12 new endpoints — {GET secret, POST regenerate, POST disable,
POST test} for each of the 4 tiers. SendSyncForTest returns
status_code/latency_ms/signature_sent/delivery_id/response_snippet so
the UI surfaces receiver feedback inline.
UI: shared OutgoingWebhookPanel.svelte fits the existing card aesthetic.
Signing-state pill, secret reveal-on-demand, regenerate/disable behind
ConfirmDialog modals (not inline strips — too easy to misclick), send-
test result card with colour-coded status. Wired into Settings →
Integrations, project edit form, per-stage edit, and per-site detail.
EN + RU i18n.
Tests: round-trip (sender signs, receiver verifies), tampered-body and
wrong-secret rejection, unsigned-send omits header, send-test surfaces
4xx, concurrent fan-out via Drain. Resolver precedence locked for both
deploy and site paths.
Docs: docs/webhooks.md with header reference, verifier snippets in
Node/Python/Go, and a recipe for the service-to-notification-bridge
generic webhook provider.
Security:
- rate limit /api/webhook routes per-IP and cap concurrent site syncs
- global SSE connection cap (256) with new sse_gate
- validate ?tail= and cap JSON log responses at 4 MiB
- strip ANSI/CSI/OSC and control bytes from streamed log lines
- redact webhook secret from request log middleware
- scrub host details from /api/health for non-admin viewers
- drop container_id from /api/system/stats/top for non-admins
- generate webhook secrets via crypto/rand; require >=32 chars on insert
- verify iid path consistency in streamContainerLogs
- LimitReader on site webhook body; reject malformed non-empty bodies
Concurrency / correctness:
- stats collector: Stop() no longer hangs without Start(), semaphore
acquired in parent loop so ctx cancellation short-circuits the queue,
in-flight tick cancellable via shared base context, zero-ts guard
- webhook handler: replace fire-and-forget goroutine with WaitGroup-tracked
workers + Drain() wired into graceful shutdown
- $derived(() => ...) mis-idiom fixed in ContainerStats / InstanceCard /
ProjectCard (returned function instead of value)
- SystemResourcesCard: rename `window` and `t` locals to avoid shadowing
globalThis.window and the i18n `t` import
Quality / performance:
- replace O(n^2) insertion sort with sort.Slice in stats top
- runMigrations only swallows duplicate-column / already-exists errors
- PruneStatsSamplesBefore wrapped in a transaction
- collapse N+1 in unusedImageStats / pruneImages to one ListAllInstances
pass; surface DB errors instead of silently treating them as inactive
- run Docker Info + DiskUsage in parallel via errgroup
- container log SSE emits `: ping` heartbeat every 20 s
- imageMatches case-insensitive on registry host (RFC behaviour)
- log warning on invalid stage tag pattern instead of silent skip
- reject malformed non-empty site webhook payloads
Frontend / i18n:
- shared formatBytes utility replaces three local copies
- statsInterval store drives dynamic "no samples / collection disabled"
copy across ContainerStats and SystemResourcesCard
- top consumers row now shows owner_name (project/stage or site name)
- drop seven `as any` casts on the Settings type; add cloudflare_api_token
write-only field
- move "Service status", "Docker daemon", "Docker unreachable",
"Proxy unreachable", "reachable", and "Docker daemon is not reachable."
strings into en/ru i18n bundles
Background collector samples CPU/memory/network/block I/O for every
instance and site on a configurable interval (default 15s, range
5-300s), persists samples to SQLite with a configurable retention
window (default 2h, range 0-24h), and skips ticks gracefully when
the Docker daemon is unreachable. Settings are reloadable without
a restart — each tick re-reads them.
New API endpoints:
- GET /api/system/stats (host snapshot: info + df)
- GET /api/system/stats/history
- GET /api/system/stats/top?by=cpu|memory
- GET /api/projects/{id}/stages/{s}/instances/{iid}/stats/history
- GET /api/sites/{id}/stats[/history]
- GET /api/sites/{id}/logs (SSE + JSON, reuses instance log streamer)
Frontend:
- ECharts added with tree-shaken imports (~180KB gzip) for
future-proof time-series/gantt/graph visualizations
- CollapsibleSection wraps all dashboard sections (system health,
daemons, system resources, static sites, projects) with
localStorage-persisted open state
- SystemResourcesCard shows capacity tiles, workload utilization
chart with 30m/2h/6h/24h window picker, disk breakdown with
reclaimable callouts, and top 5 consumers
- ContainerStats and ContainerLogs take a source discriminated union
so sites reuse the same components as instances; sites detail page
embeds both for Deno backend debugging
- Settings › Maintenance exposes collection interval + retention
- Docker-unavailable state returns 503 and renders an amber banner
instead of a generic 500
Full i18n coverage (en + ru) for all new strings.
Replace the single global webhook secret with entity-scoped secrets stored
on each project and static site. Webhook-driven project autocreate is
removed — projects must exist before their URL can trigger deploys.
Also wires static-site webhooks (sync_trigger=push|tag), turning the
previously inert "push" trigger into a functional one: POST the site's
webhook URL from a Git provider and Tinyforge re-syncs on matching refs.
- Adds webhook_secret columns + unique indexes to projects and static_sites
- Per-entity GET/regenerate endpoints under /api/projects/{id}/webhook
and /api/sites/{id}/webhook (admin-only)
- Removes /api/settings/webhook-url and the global webhook panel
- Reusable WebhookPanel Svelte component on both detail pages, i18n in en/ru
- Tests for matcher (siteRefMatches, ParseImageRef) and handler (project
match/mismatch/404 and site push/manual/branch-skip)
Adds a new Stacks feature: upload/edit docker-compose YAML,
deploy as atomic units, browse revisions, roll back, and
stream logs. Backend in internal/stack + internal/api/stacks.go,
persistent storage in internal/store/stacks.go.
Stacks pages (list, new, detail) use a modern Forge aesthetic —
Instrument Serif display type, JetBrains Mono for meta/code,
indigo ember accents, dot-grid hero, registration marks on
hover, terminal panel for logs. Palette is sourced from the
app's existing design tokens so the feature remains consistent
with the rest of Tinyforge.
Fonts self-hosted via @fontsource/instrument-serif and
@fontsource/jetbrains-mono to satisfy the strict CSP.
Rebrand the project as Tinyforge to reflect its evolution from a Docker
container watcher into a self-hosted mini CI/deployment platform.
Rename covers: Go module path, Docker labels, DB/config filenames,
JWT issuer, Dockerfile binary, docker-compose, CI workflows, frontend
i18n, README with static sites docs, and all code comments.
When domain, SSL certificate, or proxy provider changes in settings:
- Delete old proxy routes from the previous provider
- Switch to None: clear all route IDs on instances
- Switch to NPM/Traefik: re-create routes with new settings
- Domain change: re-configure all routes with new FQDN
- SSL cert change: re-apply to all existing routes
- Provider created dynamically at runtime via createProxyProvider()
- Deployer and API server updated via SetProxyProvider callback
Replace direct npm.Client usage throughout the codebase with the
proxy.Provider interface, enabling pluggable proxy backends. The
deployer, API layer, and proxy manager now use provider-agnostic
route management (ConfigureRoute/DeleteRoute) instead of NPM-specific
API calls. Adds ProxyRouteID (string) to Instance model and
ProxyProvider setting to Settings, with SQLite migrations for
backward compatibility.
- HIGH: Add sync.Mutex to backup Engine to prevent concurrent
backup/restore operations
- HIGH: Restore uses io.Copy instead of ReadFile to avoid OOM on
large databases
- HIGH: Send HTTP response before closing DB during restore, then
perform destructive operations in a goroutine
- HIGH: Create pre-restore safety backup before overwriting database
- HIGH: Autobackup cron reschedules dynamically when settings change
via callback pattern (same as DNS provider changes)
Add backup/restore functionality for the SQLite database. Users can
trigger manual backups, configure automatic backups on an interval
with retention policies, list/download/delete backups, and restore
from any backup.
- Backup engine using VACUUM INTO (safe with WAL mode)
- Backup metadata tracked in DB, files stored in DATA_DIR/backups/
- Settings: backup_enabled, backup_interval_hours, backup_retention_count
- API: POST/GET/DELETE /api/backups, download, restore endpoints
- Autobackup via cron scheduler with configurable interval
- Retention: prune on startup, after each backup (manual and auto)
- Orphan cleanup: removes backup files without metadata on startup
- Restore: replaces DB and triggers graceful server shutdown
- Settings UI: /settings/backup with toggle, interval, retention config
- Backup list with download, delete, restore actions
- i18n: English and Russian translations
Add flexible DNS management to Docker Watcher. By default, wildcard DNS
is assumed (current behavior). When disabled, users can configure a
Cloudflare DNS provider with API token and zone selection. DNS A records
are automatically created/updated/deleted in sync with proxy consumers
(deployed instances and standalone proxies).
- Settings: wildcard_dns toggle, dns_provider, cloudflare credentials
- Cloudflare client: Provider interface with EnsureRecord/DeleteRecord/ListRecords
- DNS lifecycle hooks in deployer and proxy manager (best-effort)
- Settings UI: DNS config section with provider picker, zone selector, test button
- DNS Records page at /dns with filtering, sync status, reconciliation
- Records visible in both wildcard and managed modes
- Cleanup on provider change: removes old records when switching modes
Add standalone proxy management:
- Multi-step validation pipeline (DNS, TCP, HTTP) with diagnostic hints
- Proxy lifecycle: create/update/delete via NPM API with SSL auto-assign
- Periodic health monitoring (5min) with event log on status transitions
- Unified /api/proxies/all endpoint merging standalone + managed proxies
- Frontend types and API functions for downstream UI phases
Add database foundation for observability features:
- event_log table with severity/source filtering and pagination
- standalone_proxies table for user-created reverse proxies
- stale_threshold_days setting (default 7 days)
- Auto-persist warn/error events from event bus to database
- SSE broadcast of persistent events for real-time UI updates
- Frontend types and API functions for downstream UI phases
Security: apply AdminOnly middleware to mutating routes, require
ENCRYPTION_KEY and ADMIN_PASSWORD (no insecure defaults), restrict
CORS to same-origin, fix OIDC token delivery via cookie instead of
URL query param, add rate limiting on login, add MaxBytesReader,
validate volume paths against traversal, add security headers,
validate user roles, add Secure flag to OIDC cookie.
Performance: set SQLite MaxOpenConns(1) to prevent SQLITE_BUSY,
add FK indexes on 8 columns, track notifier goroutines with
WaitGroup for graceful shutdown, use GetRegistryByName instead of
GetAllRegistries in deployer, pass basePath param to avoid redundant
settings query, return empty slices from store to remove reflection.
Quality: refactor TriggerDeploy to delegate to runDeploy (~100 lines
removed), consolidate duplicated utilities (extractPort, boolToInt,
now, isTerminalStatus) into shared exports, migrate all log.Printf
to slog structured logging, use consistent webhook response envelope,
remove dead code (parseEnvVars, duplicate auth types).
UX: clean up NPM proxy on instance removal via API, add README with
quickstart guide, add .env.example, require ADMIN_PASSWORD in
docker-compose, document staging-net prerequisite.
Embed SvelteKit static build in Go binary via go:embed. Event bus
for pub/sub with deploy log, instance status, and deploy status events.
SSE endpoints for real-time streaming. Frontend SSE client with
exponential backoff reconnection. Makefile for build pipeline.
Update Phase 12 auth plan with OAuth2/OIDC support.
Remove webhook secret from logs and API response.
Add auth-pending note to router. Fix decrypt fallback that
would use ciphertext as auth token on decrypt failure.
All REST endpoints wired with chi router: projects, stages, instances,
deploys, registries, settings, quick deploy, webhook. Full main.go
wiring with graceful shutdown. Consistent JSON envelope responses.
Sensitive fields stripped from API responses.
AES-256-GCM encryption for credential storage, YAML seed config
parser with validation, and transactional import into SQLite.
Credentials (registry tokens, NPM password) encrypted before storage.
Initialize Go module, directory structure, and full SQLite store layer:
- 7-table schema (projects, stages, registries, settings, instances,
deploys, deploy_logs) with auto-migration
- CRUD operations for all entities with proper error handling
- ErrNotFound sentinel for distinguishing 404 from 500 in handlers
- WAL mode, foreign keys, busy timeout pragmas