7641c0de1209b9ce62fc587006a2c4a544525b56
127 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
1c47030854 |
feat(volsnap): volume snapshot restore (backlog #6)
Restore a captured volume snapshot onto an image workload's live host-bind
data volumes, then redeploy — the most destructive workload action, built to
the adversarially-reviewed design (C1–C6) with all data-loss guards.
- Engine.Restore (engine-owned): all-or-nothing pre-flight re-resolution from
the workload's CURRENT config (never the tamperable manifest), per-filesystem
disk pre-check, per-workload lock, container quiesce, extract-to-tmp, durable
pre-restore snapshot, write-ahead journal, atomic rename swap, redeploy, and
crash-recovery sweep (RecoverInterruptedRestores) wired before serving.
- internal/keyedmutex: shared per-key lock; deployer now serializes every
deploy entrypoint per workload via DispatchPlugin (+ LockWorkload/RedeployLocked
for the restore re-dispatch, no deadlock).
- Untrusted-archive extractor: zip-slip containment, type allow-list (reg/dir
only), decompression-bomb cap, manifest-index bounds.
- POST /api/workloads/{id}/snapshots/{sid}/restore: admin, X-Confirm-Restore
header (CSRF), per-workload single-flight (409).
- WebUI: Restore button + danger ConfirmDialog + busy state + i18n (en/ru).
Scope: image-source only; scopes absolute/stage/project (driven off the same
supportedScopes constant capture uses).
Plan-reviewed before coding; per-phase go/security/ts reviews; final review
READY TO MERGE. Security review caught + fixed a CRITICAL manifest-Source path
traversal (re-derive target from current config + base containment).
Plan: plans/volume-snapshot-restore/
|
||
|
|
8a5f69af87 |
feat(web): bidirectional stage promotion + confirm dialog
Completes env-promotion #5. The promote-from endpoint (admin, tested) and the parent->self chain UI already shipped; the endpoint is direction-agnostic, so this is frontend-only: - self->child promotion: each eligible child (image source, non-preview) in the stage chain gets a "Promote to this" button that pushes this workload's running version to that child. - both directions route through a ConfirmDialog (promote copies the source's running tag onto the target AND deploys it) with a success toast. - i18n apps.detail.chain* in en + ru (parity 1810/1810). |
||
|
|
7733e64b08 |
feat(gitops): config-as-code via .tinyforge.yml for repo-backed workloads
A dockerfile or static workload can opt in to reading its deploy config from a
.tinyforge.yml in its own repo. Tinyforge fetches the file, shows field-level
drift vs the live config, and an admin applies it with an explicit Sync. The
repo becomes the source of truth for the declared fields. Manual-sync only;
no auto-apply on deploy, no multi-workload reconcile, no create/delete in v1.
Scope is deliberately source-aware and source_config-resident: dockerfile
declares port/healthcheck/deploy_strategy, static declares deploy_strategy.
The file never carries repo coords or secrets (those stay in the encrypted
DB), which keeps credentials out of the repo.
Backend:
- internal/gitops: Spec/ParseSpec (KnownFields rejects unknown keys), a
source-aware ApplyPlan/BuildPlan, MergeAndValidate (omitted-field-preserving
deep merge + validate-the-merged-result-then-commit — never a partial
config), declared-only Drift with normalization, and Fetch with
ok/no_file/fetch_failed/invalid statuses and token-redacted messages.
- staticsite: DownloadFile added to GitProvider + Gitea/GitHub/GitLab impls,
reusing each provider's SSRF-safe client; 64 KiB cap; ErrFileNotFound.
- store: 4 additive gitops_* columns + setters (disjoint from UpdateWorkload
so the edit-form save and a sync never clobber each other).
- api: GET /workloads/{id}/gitops (status + raw + live drift + managed_fields),
PUT /gitops (admin, enable/path, traversal-safe), POST /gitops/sync (admin,
per-workload locked read->merge->validate->write, audited to event_log).
Frontend:
- GitOpsPanel.svelte: status pill, a purpose-built field-level drift view,
.tinyforge.yml preview, enable ToggleSwitch, Sync via ConfirmDialog; all five
statuses handled, admin affordances gated on the real viewer role.
- GitOps-managed badge (list + detail hero) and a read-only edit-form banner.
- api.ts fetchers + types; i18n apps.detail.gitops.* (en + ru parity).
Built phase-by-phase with an adversarial plan review (caught 5 design flaws
pre-implementation) and an independent review per phase (go / security / ts /
final) — all APPROVE, 0 CRITICAL/HIGH. docs/gitops.md documents the schema and
what's intentionally out of v1. Plan: plans/gitops/.
|
||
|
|
5b51bbbd7f |
feat(web): deploy-strategy selector UI for image/dockerfile/static sources
Phase-2 UI for the per-workload deploy_strategy shipped in
|
||
|
|
0c4c338bfe |
feat(apps): per-workload deploy history, rollback, and resource metrics
Two additions to the app detail page, each backed by a per-workload
endpoint.
Deploy history + rollback:
- New deploy_history table — a structured, version-pinned ledger of every
dispatch (success AND failure), distinct from the free-text event_log.
Recorded at the single DispatchPlugin choke point so every source kind
is covered. The raw deploy error is never persisted (it can carry
registry-auth / compose-stdout secrets) — only a generic marker, with
detail going to slog. Pruned to the newest N per workload; cascade-
deleted with the workload.
- GET /api/workloads/{id}/deploys lists the ledger; POST .../rollback
(admin) replays a prior successful deploy's pinned reference as a
rollback-reason dispatch. Phase 1 is image-source only (RollbackCapable);
git-built sources need checkout-by-commit, a later phase.
- DeployHistoryPanel.svelte renders the ledger with confirm-gated rollback.
Per-workload metrics:
- ListContainerStatsSamplesByWorkload joins the existing container stats
samples through the containers index; GET /api/workloads/{id}/stats/history
aggregates CPU/memory per timestamp across the workload's containers.
- WorkloadMetricsPanel.svelte reuses ResourceChart (CPU% + memory MiB,
windowed, 15s poll).
en/ru i18n added with parity. Tests: store CRUD + cascade + workload-scoped
join, deployer recording (incl. secret-non-leak on failure), API rollback
guards, and per-timestamp aggregation. Plans under docs/plans/.
|
||
|
|
6492944c8f |
fix(web): keep the image-ref conflict indicator from reflowing the form
Build / build (push) Successful in 11m30s
Move the conflict-checking hint inside the image-ref field as an absolutely-positioned overlay so a blur→check→clear cycle no longer shifts the rows below it on the /apps/new wizard. |
||
|
|
ec8c0cd891 |
feat(web): warm-seed cache for triggers/[id] (last detail page)
Build / build (push) Successful in 11m31s
Trigger detail re-fetched on every visit, flashing a skeleton. Warm-seed the trigger + config form from triggerDetailCache so the hero + form render instantly on revisit; gate the webhook + bindings panels on a detailsLoaded flag (loading row until their data lands) so they never flash a wrong "OFF"/"empty" state. The rotated webhook secret lives in the separate (uncached) webhook object and is always fetched fresh — never persisted. Hardened against reused-component nav races (review): a monotonic loadSeq token makes the latest load() the sole writer (fixes A→B→A same-id concurrency), and each cache-writing mutation handler pins the id for its round-trip so a mid-mutation nav can't poison another id's cache entry or surface its secret. |
||
|
|
192204a51c |
feat(web): stale-while-revalidate caches to eliminate tab-switch flicker
Build / build (push) Failing after 4m51s
Sidebar tabs, Settings, and drill-in detail pages re-fetched on every
visit (loading=true + onMount), flashing an empty skeleton frame on each
navigation. Add an SWR cache layer so revisiting a view renders cached
data instantly while refreshing in the background.
- resourceCache.ts: single-value + keyed (per-id) SWR cache factories
- caches.ts: per-resource cache instances; resetAllCaches() on logout
- eventsSnapshot.ts: warm-seed snapshot for the SSE/paginated events page
- List/sidebar pages read $cache.value via $derived, refresh() on mount;
mutations refresh the cache
- Settings forms seed once from settingsCache (edit-safe) and refetch
after save (PUT /api/settings returns {status}, not the Settings object)
- Detail [id] pages warm-seed per id; apps/[id] seeds {workload,containers},
resets non-seeded panels on warm nav, clears workload on 404, and
invalidates its cache entry on delete
Deferred (still cold-fetch): triggers/[id] (webhook secret + multi-fetch
body gate), apps/new (create wizard).
|
||
|
|
6b45ed62bb |
feat(snapshots): capture app data-volume snapshots
Build / build (push) Successful in 10m59s
Add per-workload capture of host-bind data volumes as downloadable tar.gz archives: a new internal/volsnap engine (enumerate host-bind volumes via the computeMounts merge, archive with archive/tar+gzip skipping symlinks/special files, per-workload retention + startup orphan cleanup), a volume_snapshots table + store CRUD, admin-gated API (list/snapshotable/create/download/delete), and a Snapshots panel on /apps/[id] that shows coverage and which volumes are skipped (and why). Scope: image-source apps, host-bind scopes (absolute/stage/project); Docker named volumes, tmpfs, and instance scope are surfaced as not-yet-supported. Restore is a separate later phase. Download/FilePath are containment-checked; create returns a typed no-data error (400) vs generic 500. Covered by archiver unit tests + full API e2e. |
||
|
|
97f338fba3 |
feat(maintenance): add Docker build-cache prune action
Add an admin-only POST /api/docker/prune-build-cache endpoint plus a Settings > Maintenance danger-zone button to reclaim disk used by the Docker build cache (image + static-site builds), which previously grew unbounded with no UI lever. Prunes unused-only (all=false) so a warm cache is preserved for apps redeploying soon. Mirrors the existing prune-images vertical slice; full en/ru i18n parity. |
||
|
|
15e5b186cd |
feat(secrets): scoped shared secrets rule-management UI (Phase 2)
Completes scoped shared secrets end-to-end: /shared-secrets list/new/edit routes (mirroring metric-alert-rules) with an env-key name, a WRITE-ONLY value (password input; never pre-filled — the API returns only has_value; omitted on PATCH to keep the stored secret, provided to rotate; cleared after save), an encrypted toggle (flipping it requires re-entering the value, matching the server's 400 guard), a global|app scope with an App-grouping picker (listApps), description, and enabled. 409 conflicts surface a friendly message. New "System" nav entry (IconKey) + api.ts client + full sharedsecrets.* i18n (en/ru parity). Reviewed: typescript APPROVE (0 CRITICAL/HIGH). |
||
|
|
7576f54e76 |
feat(alerts): metric-alert rule-management UI (Phase 2)
Completes metric-threshold alerting end-to-end: /metric-alert-rules list/new/edit routes (mirroring log-scan-rules) with metric/comparator/ threshold fields, the workload scope picker, ToggleSwitch, and a ConfirmDialog delete flow; an api.ts MetricAlertRule CRUD client; an "Observe" nav entry; and a full metricalert.* i18n namespace (en/ru parity). Create-form cooldown defaults to 300s to match the server. Rules are now manageable in the WebUI; breaches already surface in the per-app activity timeline and fire any configured event-trigger webhook. Reviewed: typescript APPROVE (0 CRITICAL/HIGH). |
||
|
|
2e26f555c5 |
fix(dashboard): count Recent workloads by source_kind, not raw rows
The Recent-workloads badge and empty-state guard used workloads.length (which includes legacy kind:site rows with empty source_kind) while the list renders pluginWorkloads (source_kind != ''). With one legacy row and no real workloads the badge showed "1" over an empty list and the empty state never appeared. Both now use pluginWorkloads, matching the list and the headline Total count. |
||
|
|
cdb9fd57d1 |
feat(alerts): metric-threshold alerting (backend + API)
Operators can define metric-threshold alert rules (cpu_percent, memory_percent, memory_bytes; gt/lt) per-workload or global via /api/metric-alert-rules. A periodic evaluator (internal/metricalert, 30s tick) checks the freshest container stats sample per container against enabled rules and, on breach (per-rule-per-workload cooldown), emits into the existing event_log + bus pipeline (source "metric_alert", workload_id set). Alerts therefore surface on the global events page, the per-app activity timeline, and any configured event-trigger webhook -- no new notification plumbing. Mirrors the log_scan_rules store/API/route patterns and the stats.Collector lifecycle. Rule CRUD reads are authed, mutations AdminOnly. Frontend rule-config UI is a follow-up phase. Reviewed: go APPROVE (0 CRITICAL/HIGH). |
||
|
|
93b6911b34 |
feat(apps): per-app deploy/activity timeline
Every deploy across all four source kinds now writes a workload-scoped
event via a shared plugin.EmitDeployEvent helper (replacing the inline
emit duplicated in static/dockerfile, standardizing static's metadata
key site_id->workload_id, and adding emission to image+compose which
were silent). New indexed event_log.workload_id column, EventLogFilter
.WorkloadID, and GET /api/workloads/{id}/events (id pinned from path).
Frontend: a forge "Activity" panel on /apps/[id] reusing EventLogEntry,
live SSE prepend filtered by workload_id, load-more pagination, an
All/Errors severity filter, and a shared toEventLogEntry mapper. en/ru
i18n parity.
Security: compose's failure status emits a generic reason instead of raw
`docker compose up` output, which can echo app secrets and egresses to
operator webhooks (NotificationURL + event-trigger actions); full detail
stays only in the returned error. Rune-safe 256-rune status cap.
Reviewed: go + typescript APPROVE; security HIGH fixed.
|
||
|
|
3071cda512 |
feat(deploy): commit-status reporting to Git providers
Report deploy status back to the Git provider as a commit status (pending/success/failure) for git-sourced workloads (static + dockerfile). - GitProvider.SetCommitStatus on gitea/github/gitlab over the existing SSRF-safe client; fixed "tinyforge" context so redeploys update one row. postJSON returns status-code-only errors (never echoes the upstream body, which a hostile provider could use to reflect the auth token into the best-effort log line). - Best-effort deploy hook: pending on deploy start, success/failure on outcome, gated on a per-workload report_commit_status flag. Never fails or blocks a deploy; emits nothing on the unchanged-SHA short-circuit. - UI ToggleSwitch (create + edit) + reportCommitStatus in sourceForms.ts + en/ru i18n. - Tests: per-provider state mapping + request shape; reporter gating (enabled/disabled/empty-SHA/nil/error-swallow). Reviewed via go-reviewer + security-reviewer (0 CRITICAL/HIGH; one MEDIUM body-echo log-leak fixed). |
||
|
|
410a131cec |
feat(apps): stepped creation wizard, branch previews, and app-creation fixes
This session (frontend focus):
- Rebuild /apps/new as a 4-step wizard (Basics → Configure → Trigger → Review):
WizardRail, SourceKindPicker card grid, AppManifest review, per-step validation,
ConfirmDialog-based unsaved-changes guard.
- Extract lib/workload/sourceForms.ts (single source of truth for source_config)
+ {Image,Compose,Static,Dockerfile}SourceForm + StaticDiscoveryWizard; fold the
/apps/[id] edit form onto the same components (removes the duplication). Add
vitest + sourceForms unit tests.
- Branch preview environments UI: /chain is_preview/preview_branch + a Preview
environments panel on /apps/[id] (per-branch URLs, ConfirmDialog teardown, armed
state); RegistryImagePicker on the registry trigger and the image source.
- Fixes: image-inspect 404 -> admin-gated POST /api/discovery/image/inspect;
conflict-panel blur flicker; friendly localized discovery errors; CPU/Memory
label hints; dashboard + /apps "Total workloads" count only source_kind workloads
(drop stale trigger_kind gate); NPM cert/access-list name cache; EntityPicker
empty-list guard.
- Update CLAUDE.md frontend conventions + add a Build & Test section.
Also captures pre-existing in-progress platform work (not from this session):
workload notifications, Prometheus metrics export, store lockfile, health probes,
backup hardening, and related store/webhook/scheduler changes.
|
||
|
|
956943edbb |
feat(proxies): per-row Triggers deep-link to /apps/[id]#bindings
The proxies page now exposes the trigger bindings for each routed
workload via a per-row action chip. Resolves the explicit "what's
next" call-out in WORKLOAD_REFACTOR_TODO under Priority 3 polish.
- Added id="bindings" to the existing trigger bindings <section> on
/apps/[id]/+page.svelte so URL fragments resolve to the panel.
- New triggersHref(route) helper in /proxies that builds
/apps/{workload_id}#bindings; code-pointer comment explains the
back-compat naming (ProxyRoute.project_id is actually the workload
ID — see internal/store/models.go:110-113), so a future contributor
doesn't trip on the mismatch and rip the helper out.
- New right-aligned "Actions" column with a button-shaped link;
defensive — falls back to — when project_id is absent.
- Three new i18n keys under proxies.* (actions, viewTriggers,
viewTriggersTitle) mirrored across EN + RU. Key parity now 1512
each.
No backend change needed; ListProxyRoutes already selects w.id into
ProxyRoute.project_id. Workload-aware batch endpoints (showing
trigger counts inline) were deliberately out of scope for this
half-turn — flagged as a future enhancement only if users want
inline counts.
Verification: svelte-check 0 errors + 3 pre-existing warnings in
TagCombobox; go build + go test ./... all green across 20 packages.
|
||
|
|
ea55d31177 |
feat(discovery+runtime): restore static-site wizard discovery + close /sites/[id] feature parity
Build / build (push) Successful in 10m43s
Two-stage feature arc closing the gaps left by the hard legacy cutover.
The static-site creation wizard regains its auto-discovery + connection-test
flow; /apps/[id] grows the runtime/storage/lifecycle surface the legacy
/sites/[id] page used to expose.
Backend (Go)
- internal/api/discovery.go: six admin-gated endpoints wrapping
staticsite.GitProvider — POST /api/discovery/git/{detect-provider,
test-connection,repos,branches,tree} + GET /api/discovery/image/conflicts.
Identifier validation (validateGitIdent / validateGitBranch) at the
boundary so provider URL interpolation cannot be hijacked via `..`.
Upstream errors scrubbed: detailed slog on the server, generic 502 to
the client (mitigates token-reflection-in-error-page).
- internal/api/workload_runtime.go: four endpoints —
GET /api/workloads/{id}/runtime-state decodes containers.extra_json for
static workloads; GET /api/workloads/{id}/storage execs `du -sb /app/data`
with a 30s in-process cache (storageProbeCache) so polling can't turn
into per-request execs; POST /api/workloads/{id}/{stop,start} iterate
ListContainersByWorkload and call docker.StopContainer / StartContainer,
returning 200 / 409 (nothing to act on) / 502 (all failed).
- internal/staticsite/safehttp.go: NewSafeHTTPClient + ValidateBaseURL +
blockReason. DialContext re-resolves hostnames and refuses loopback /
link-local / multicast / unspecified addresses. RFC1918 + ULA explicitly
allowed (self-hosted Gitea on LAN is the dominant deployment).
Replaced four raw &http.Client{} constructions in the provider files.
- internal/staticsite/gitlab_provider.go: url.PathEscape each segment in
the raw-file URL builder for parity with projectPath().
- Test coverage: 26 cases in discovery_test.go (image-tag stripping,
source-config decoding, conflict scenarios, validator boundaries,
scheme rejection), 14 in workload_runtime_test.go (404 / 409 / nil-docker
/ probe-cache), 16 in safehttp_test.go (URL validation + block-reason
policy matrix + live dial against loopback + AWS metadata literals).
Frontend (Svelte 5 + runes)
- web/src/lib/api.ts: typed wrappers for every endpoint, AbortSignal
threaded through post(); ApiError exported so callers can narrow on
e.status; new DetectedGitProvider narrow union.
- web/src/routes/apps/new/+page.svelte: static-form discovery controls
(auto-detect provider, test connection, repo / branch / folder
EntityPickers, Deno auto-detect); image-form conflict panel with
debounced lookup + double-click submit guard ("Forge anyway") + Inspect
button that pre-fills port/healthcheck; English error fallbacks routed
through apps.new.errors.* (en + ru).
- web/src/routes/apps/[id]/+page.svelte: runtime-state panel + storage
panel + Stop / Start / Open-site toolbar; universal live-state badge
in the hero lede for image/compose/static (RUNNING / TRANSITIONING /
STOPPED / NOT DEPLOYED / MIXED · n/m RUNNING); ContainerStats panel
per row (auto-collapsing native <details> when N > 2); read-only
webhook bindings summary card; responsive toolbar overflow with native
<details> at <640px (z-index 100 above sticky nav).
- web/src/app.css: project-wide .forge-btn-ghost:focus-visible outline.
Hardening from go-reviewer + security-reviewer + typescript-reviewer +
frontend-design UI/UX subagents (0 CRITICAL, all HIGH/BLOCKER addressed
inline, IMPORTANT applied before commit):
- AbortController + per-call sequence tokens on every long-running
fetch (loadRuntimeState / loadStorage / loadTriggerMeta / inspectImage /
listImageConflicts) plus onDestroy cleanup so late resolves cannot
mutate dead component state.
- doStop / doStart snapshot and restore `error` across the finally-block
reload so a load()-cleared message doesn't hide a real failure.
- triggersById refreshed after inline trigger creation so the webhook
card doesn't silently exclude the just-created trigger.
- Live-state badge wraps in role=status / aria-live=polite (no redundant
aria-label).
- Webhook row has a single click target (was two pointing at the same URL).
- Empty webhook section hides entirely.
- Dropped role=menu / role=menuitem from the overflow menu (they would
promise arrow-key nav we don't wire; native Tab + ESC carry it).
Doc
- docs/CODEMAPS/INDEX.md + new docs/CODEMAPS/discovery-and-runtime.md
map the endpoint surface, security posture, frontend integration
patterns, and an "add a new probe" recipe.
Verification
- svelte-check: 0 errors, 3 pre-existing a11y warnings.
- go build + go vet + go test ./...: all green.
- i18n parity: en + ru at 1413 keys each.
- Live smoke against :8090: 404 / 409 / 502 envelopes correct, discovery
sanity passes, ProbeError surfaces on no-container path.
|
||
|
|
5e78f13e06 |
refactor(triggers): review followups — fire-now, dedupe trigger pages, hardening
Build / build (push) Failing after 34s
Follow-ups on commit
|
||
|
|
39e1e36510 |
feat(triggers): add schedule trigger kind + internal scheduler
Build / build (push) Successful in 10m42s
Fourth trigger kind alongside registry/git/manual. Recurring time-interval fires driven by a new internal/scheduler tick loop (default 30s, clamped to 5m). Goes through the same webhook.Handler.FanOutForTrigger seam as inbound HTTP webhooks, so per-binding concurrency, outcome accounting, and config-merge semantics are identical. Schema: triggers.last_fired_at TEXT column (additive ALTER for existing DBs). Scheduler persists last_fired_at BEFORE dispatch so a panicking Match cannot wedge a tight loop; failed deploys wait one full interval before retry — correct trade-off for a periodic refresh trigger. Frontend: TriggerKindForm + /triggers/new + /triggers/[id] gain the schedule kind (4-col card grid, preset chips Hourly/Daily/Weekly, custom interval input matched to Go time.ParseDuration syntax, optional pinned reference). /triggers/[id] surfaces "last fired" on schedule rows. EN+RU i18n in parity. Review fixes from go-reviewer / security-reviewer / typescript-reviewer: - Scheduler Start/Stop wrapped in sync.Once (no goroutine leak / double- cancel panic on shutdown re-entry). - shouldFire rejects sub-MinInterval as defense-in-depth against hand-inserted rows that bypassed Validate. - fire() asserts trigger Kind=="schedule" before dispatching. - Aligned isValidInterval regex across all three frontend sites; reject the unsupported "d" unit (Go time.ParseDuration doesn't accept it). - formatLastFired falls back to lastFiredNever on malformed timestamps rather than leaking raw bytes into the UI. - main.go scheduler closure logs per-fire deployed/errored counts. |
||
|
|
e3c7b13d58 |
chore(workload): close the workload-first arc — apps i18n + codemap + tests
Build / build (push) Successful in 10m36s
Closes the workload-first refactor by landing the Priority 3 polish items and the Priority 4 test gap. Net: ~2,400 lines added, ~350 lines modified across 13 files. Priority 3 — polish - apps.* i18n namespace: 276 new keys across apps.list.* (27), apps.new.* (91, sibling of existing apps.new.triggers.*), and apps.detail.* (158, sibling of existing apps.detail.bindings.*). EN+RU at 1314 keys each, perfectly in sync. /apps, /apps/new, /apps/[id] now render entirely from i18n. - New codemap docs/CODEMAPS/workload-plugin.md (238 lines): Source × Trigger contract, dispatch seam, webhook fan-out path, recipes for adding a new Source or Trigger kind. Plus docs/CODEMAPS/INDEX.md gateway. Priority 4 — tests - internal/api/workloads_test.go (new, ~30 subtests): /api/workloads CRUD + deploy + delete + env + volumes + chain + promote-from + triggers list/inline-bind + auth gating + standalone /api/triggers CRUD (create / dup-409 / kind filter / delete). Uses real POST handlers via httptest.NewServer + a fake plugin source registered under "testfakesource". - internal/deployer/dispatch_test.go (new, 11 tests): DispatchPlugin / DispatchTeardown / DispatchReconcile happy + unknown-kind + propagated-error each; PluginDeps wiring; a real 2s-bounded RWMutex deadlock probe on PluginDeps vs SetDNSProvider. - internal/workload/plugin/source/compose/compose_test.go (new, ~26 subtests): composeProjectName sanitization, writeYAML / writeYAMLIfChanged hash short-circuit, Validate happy + bad inputs, Kind / SchemaSample. Coverage delta on the workload-plugin path: - internal/api: 1.1% → 16.0% - internal/deployer: 0% → 54.1% - internal/workload/plugin/source/compose: 0% → 38.5% - Trigger plugins already at 87-95% from the trigger-split work. Production fix surfaced by the tests - store.CreateWorkload now self-references RefID = ID when caller leaves RefID empty (the typical plugin-native path). The api layer's broken backfill loop (called UpdateWorkload, which deliberately omits ref_id) is gone. Multiple sibling plugin workloads can now coexist under the UNIQUE(kind, ref_id) constraint. Review fixes addressed before commit - CRITICAL: deadlock-detect test gained a real 2s time.After (was selecting on context.Background().Done() which never fires). - HIGH: happy-path test now hard-asserts RefID = ID (was a t.Logf that would silently pass after a production fix). - HIGH: standalone /api/triggers CRUD coverage added (was bypassed by the workload-side bind flow). - HIGH: seedWorkload bypass deleted; tests now go through the real POST /api/workloads handler. - MEDIUM: withTempDir restore is a no-op (t.Setenv auto-restores); dead `old := os.Getenv(...)` capture removed. - MEDIUM: list-workloads test now asserts ID membership, not just count. Doc - WORKLOAD_REFACTOR_TODO: all three Priority 1 items, Priority 3 polish, and Priority 4 tests marked DONE. The workload-first arc is closed. |
||
|
|
739b67856a |
feat(cutover): hard legacy cutover — drop projects/stacks/sites/deploys
Build / build (push) Successful in 10m39s
The clean-break delete that closes the workload-first refactor arc.
Net diff: ~30 backend files deleted, ~20 modified, ~12k LOC removed
on the Go side; entire /projects /stacks /sites /deploy frontend
trees gone; ~6.7k LOC removed on the Svelte/TypeScript side.
Backend
- API handlers gone: internal/api/{projects,stages,stage_env,stacks,
static_sites,deploys,instances,volume_browser}.go
- Store CRUD + tests gone: internal/store/{projects,stages,stage_env,
stacks,static_sites,static_site_secrets,deploys,poll_state,volumes,
workload_sync}.go (+ _test.go siblings)
- Legacy deployer pipeline gone: internal/deployer/{bluegreen,promote,
rollback,subdomain,resolver_test}.go; deployer.go trimmed to just the
dispatch surface used by the plugin pipeline
- internal/staticsite/{manager,healthcheck}.go and
internal/stack/manager.go gone (the rest of those packages stay as
helpers imported by the static + compose plugins)
- internal/registry/poller.go gone (legacy registry poller)
- internal/volume.ResolvePath gone; ResolveWorkloadPath stays
- internal/webhook: handleWebhook (project) + handleSiteWebhook (site)
gone; only POST /api/webhook/triggers/{secret} remains
- workload-side webhook URL handlers (getWorkloadWebhook +
regenerateWorkloadWebhook + EnsureWorkloadWebhookSecret +
SetWorkloadWebhookSecret + GetWorkloadByWebhookSecret) gone — they
minted URLs that would 404 against the new trigger-only ingress
- cmd/server/main.go: dropped staticsite.Manager, stack.Manager,
staticsite.HealthChecker, registry poller, SetSiteSyncTriggerer,
SetStaticSiteManager, SetStackManager, wireStaticBackend
- store/store.go: idempotent DROP TABLE IF EXISTS for every legacy
table (projects, stages, stage_env, volumes, deploys, deploy_logs,
poll_states, stacks, stack_revisions, stack_deploys, static_sites,
static_site_secrets); FK order children-then-parents
- store/models.go: dropped Project, Stage, Deploy, DeployLog, StageEnv,
Volume, StaticSite, StaticSiteSecret, Stack, StackRevision,
StackDeploy types; kept WorkloadKind constants as documented strings
- internal/store/helpers.go (new): BoolToInt, rowScanner,
GenerateWebhookSecret extracted from deleted CRUD files
- internal/api/secrets.go (new): forwards to store.GenerateWebhookSecret
so api + store paths share one secret-generation impl (no
panic-vs-UUID-fallback divergence)
- internal/reconciler/reconciler.go: dropped legacy stack-by-compose
+ static-site label paths; only canonical tinyforge.workload.id
dispatch remains
- providers (gitea_content/github_provider/gitlab_provider) gained
path-traversal rejection on every tree entry
- internal/webhook ParsedImage / ParseImageRef demoted to package-
private (no external callers)
Frontend
- /projects /stacks /sites /deploy routes deleted (entire trees)
- ProjectCard / InstanceCard / StaleContainerCard components deleted
- api.ts: dropped every project/stage/stack/site/deploy/instance
helper + types (Project, Stage, Stack, StaticSite, Deploy,
Instance, Volume, etc.); kept Workload, Container, App, Settings,
Registry, EventTrigger, LogScanRule, webhook envelopes
- WorkloadWebhook type + getWorkloadWebhook/regenerateWorkloadWebhook
api functions gone (mirror of the backend deletion above)
- web/src/routes/+layout.svelte: dropped /projects /sites /stacks
/deploy nav entries, trimmed quick-nav keymap
- web/src/routes/+page.svelte: dashboard rewrite — reads
listWorkloads + listContainers only; 4-card stat grid
(workloads/running/failed/stale) + recent workloads strip
- navCounts.ts, SystemHealthCard.svelte, ContainerLogs.svelte,
ContainerStats.svelte, StatusBadge.svelte, TagCombobox.svelte,
proxies/+page.svelte, containers/+page.svelte all rewired to the
workload-first surface
- AbortController plumbing on dashboard, nav-counts, stale page,
SystemHealthCard so navigation doesn't leave dangling fetches
- i18n: dropped projects.*, projectDetail.*, envEditor.*,
volumeEditor.*, volumeBrowser.*, quickDeploy.*, sites.*, stacks.*,
instance.*, confirm.* namespaces; en/ru parity preserved (1042
keys each)
Hardening from go-reviewer + security-reviewer + typescript-reviewer
subagent passes (0 CRITICAL across all three; 1 HIGH + ~12 MEDIUM
addressed inline before commit):
- Sec H1: dead-end workload webhook URL handlers (would mint URLs
that 404 the new trigger-only ingress) deleted across backend +
frontend
- Go M1: IsTerminalDeployStatus dropped (no production callers)
- Go M2: ParsedImage/ParseImageRef lowercased (in-package only)
- Go M6: generateWebhookSecret unified — api shim forwards to
store.GenerateWebhookSecret
- Doc/comment freshness: stage_id (no longer FK), ProxyRoute legacy
field names, workloadIDRow rationale, webhook_deliveries.target_type
enum, WebhookDeliveryLog component header
Doc
- WORKLOAD_REFACTOR_TODO: cutover marked DONE; all three Priority 1
items are now shipped. Next focus is Priority 3 polish (apps.* i18n
+ codemap entries) and Priority 4 tests.
Behavioral notes for operators upgrading from a pre-cutover build
- Existing rows in the dropped tables disappear on first boot.
- Legacy webhook URLs at /api/webhook/{secret} and
/api/webhook/sites/{secret} return 404; CI configs must repoint to
/api/webhook/triggers/{secret} (the trigger-split boot backfill
lifted any embedded workload secret onto a Trigger row, so the
secret value itself carries over).
- Frontend routes /projects /stacks /sites /deploy are gone; nav
links replaced with /apps and /triggers.
|
||
|
|
2aff22f565 |
feat(triggers): first-class triggers + bindings with fan-out webhook
Build / build (push) Successful in 10m39s
Promote triggers from embedded workload fields to standalone records
joined to workloads via workload_trigger_bindings. One trigger (webhook,
registry watcher, git push, manual) now fans out to many workloads with
per-binding config overrides (top-level JSON merge, binding wins).
Backend
- new triggers + workload_trigger_bindings tables with ON DELETE CASCADE
- boot-time backfill of embedded trigger config inside per-workload tx
- store.ErrUnique sentinel translates SQLite UNIQUE at store boundary
- /api/triggers CRUD + /api/triggers/{id}/{webhook,bindings}
- /api/bindings/{id} update/delete; /api/workloads/{id}/triggers list+bind
- bindTriggerToWorkload accepts trigger_id or inline {kind,name,config}
- inline-create uses CreateTriggerWithBindingTx (no orphan triggers)
- validateBindingConfig enforces 8 KiB cap + plugin Validate on merged
- ListTriggersWithBindingCount + ListBindings*WithNames remove N+1
- POST /api/webhook/triggers/{secret} resolves trigger then fans out
- bounded worker pool (4) per request; per-binding error isolation
- outcome accounting: deployed / skipped / no-match / errored
- legacy /api/webhook/workloads/{secret} route removed (clean break;
backfill keeps secrets resolvable at the new /triggers/{secret} path)
- reconciler gate dropped from (Source && Trigger) to Source only
- MergeJSONConfig returns freshly allocated slices (no fan-out aliasing)
- WithEffectiveTrigger lets existing Trigger.Match contract stay unchanged
Frontend
- /triggers list, new wizard, [id] detail (bindings, webhook rotate)
- workload create wizard: NEW / PICK / SKIP trigger modes
- workload detail: bindings panel + Add-trigger modal (inline / pick)
- per-binding override editor with merged-preview + 8 KiB guard
- "OVERRIDES n FIELDS" row badge when binding_config is non-empty
- shared TriggerKindForm component (registry / git / manual + JSON)
- 3 raw <input type=checkbox> replaced with <ToggleSwitch>
- full EN + RU i18n: redeployTriggers.*, apps.detail.bindings.*,
apps.new.triggers.*, nav.triggers; event-triggers nav disambiguated
Doc
- WORKLOAD_REFACTOR_TODO: trigger-split marked DONE; next focus is
the static-source inline port + hard legacy cutover (Priority 1)
|
||
|
|
4707db1c3b |
feat(observability): event-triggers + log-scan-rules UI + i18n
Operator-facing surfaces for the two backend features:
- /event-triggers — list (filter summary, status pill),
/event-triggers/new (form with regex validation), and
/event-triggers/[id] (edit + Send-test + delete) with
CONFIGURED secret badge + clear-to-rotate flow, ConfirmDialog
for delete, aria-live regions on async result slots.
- /log-scan-rules — list with scope filter chips and stats panel
(active tails, RATE-LIMITED, COOLED DOWN, COMPILE ERRORS),
/log-scan-rules/new (with EntityPicker for workload scope and
inline RegexTestBox), /log-scan-rules/[id] (edit + server-side
/test + delete + live RegexTestBox panel).
- web/src/lib/components/RegexTestBox.svelte — reusable
client-side regex test with sample input + captures display.
- web/src/lib/api.ts — typed wrappers for EventTrigger and
LogScanRule CRUD + /test + getLogScanStats +
getEffectiveLogScanRules.
- web/src/routes/+layout.svelte — nav entries for both surfaces.
- web/src/lib/i18n/{en,ru}.json — ~90 keys under observability.*,
triggers.*, logscan.* namespaces; Russian translations cover
every key.
Design + a11y polish per a frontend-design review pass: all
boolean inputs use ToggleSwitch, all destructive actions use
ConfirmDialog with confirmVariant="danger" / onconfirm /
oncancel, hand-rolled .btn-primary replaced with global
forge-btn classes, hex colors replaced with var(--*) tokens,
role="alert" on error banners, aria-invalid + aria-describedby
on invalid-regex inputs, aria-busy on async forms, mobile
breakpoints (hide-md columns, .row.three collapsing 3→2→1,
.table-wrap overflow-x).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
8d6a527a2b |
refactor(workload): plugin architecture wave + apps UI + volume scopes
Completes the workload-first refactor's plugin layer:
- internal/workload/plugin/ — Source/Trigger plugin contract,
registry, types (Workload, DeploymentIntent, InboundEvent,
PublicFace). Self-registering init() pattern + blank-import
in cmd/server/main.go.
- Source plugins: image (blue-green with multi-face proxy routing),
compose, static. Trigger plugins: registry, git, manual.
- internal/deployer/dispatch.go — DispatchPlugin/Teardown/Reconcile
seam routing the legacy deployer through plugins.
- internal/api/workload_*.go — REST surface: workloads, env,
volumes, chain (parent/children), promote-from. hooks.go
serves /api/hooks/kinds/{kind}/schema for the wizard.
- internal/store: workload_env (encrypt-at-rest secrets) and
workload_volumes tables, keyed on workload_id.
- cmd/server/static_backend.go — phantom-row adapter delegating
the static source plugin to the legacy staticsite.Manager
(deleted at hard cutover once the static inline port lands).
- web/src/routes/apps/ — /apps list + /apps/new wizard +
/apps/[id] detail with kind-aware compose / image / static
forms (Advanced JSON toggle), env panel, volumes panel,
webhook panel, chain panel, manual deploy.
Volume scope generalization (v2 resolver):
- internal/volume.ResolveWorkloadPath (workload-keyed, sits
next to legacy ResolvePath). Honors all VolumeScope values:
absolute, ephemeral, instance, stage, project, project_named,
named. internal/workload/plugin/source/image/image.go
computeMounts wires settings + imageTag through. Coverage in
internal/volume/resolver_test.go (portable Linux/Windows via
t.TempDir).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
f42b21a2b9 |
fix(sites): show Secrets card before Resources on site detail
Build / build (push) Successful in 10m30s
Secrets is the more frequently edited section than Resources/logs; reorder so it appears right after Site Info on /sites/[id]. |
||
|
|
cba2149aa9 |
refactor(workload): finalize containers index + post-review hardening
Wraps up the workload refactor with the fixes that came out of the multi-agent code review (see docs/plans/workload-refactor.md "What actually shipped"). Backend: - store.ReconcileContainer: separate write path so the 30s reconciler tick no longer overwrites deployer-owned fields (subdomain, proxy_route_id, npm_proxy_id, image_tag). - Container.stage_id column + index; ListProxyRoutes / ListContainersByStageID join via stage_id (survives stage rename), with legacy fallback to (project_id, role=stage_name). - Reconciler: workload-existence check (rejects forged tinyforge.workload.id labels), skips inventing project-kind rows, child-context cancel before wg.Wait() on shutdown. - Transactional CRUD across projects / stacks / static_sites: parent UPDATE and workload sync land in one transaction so secret rotations are durable. - Webhook routing reads exclusively through workloads.webhook_secret; legacy GetProjectByWebhookSecret / GetStaticSiteByWebhookSecret fallback removed. - store.GetStackByComposeProjectName + indexed lookup (no more full-table stack scan per compose container per tick). - store.ListMissingSweepRows: filtered query for the missing-sweep. - /api/instances/* handlers verify (workload_id, role) match URL (project_id, stage_name) before mutating — closes the cross-project hijack the security review flagged. - extra_json no longer referenced from Go (column kept on disk for now). Frontend: - WorkloadContainers.svelte: generic detail-page panel reusable by stack and site detail pages. - Containers page polish: client-side kind/state filters over an unfiltered fetch, URL-synced filters, race-safe loads via sequence number, EN+RU i18n, sidebar counter via navCounts.containers. Misc: - scripts/dev-server.sh: tolerate empty netstat grep result. - .gitignore: ignore docker-watcher binaries, .claude/worktrees/, .facts-sync.json. |
||
|
|
d8ab22876f |
refactor(workload): extract Instance entirely; Container is canonical
Build / build (push) Successful in 10m41s
End-to-end extraction of the Instance concept. After this commit:
* internal/store/instances.go — DELETED
* internal/store/models.go — Instance struct gone, ProxyRoute moved here
* containers table is the single source of truth for project/stack/site
container state. instances table is dropped via DROP TABLE migration
(idempotent; re-runnable on every boot).
* Legacy tinyforge.project / tinyforge.stage / tinyforge.instance-id
Docker labels are no longer emitted; only tinyforge.workload.{id,kind},
tinyforge.role, and tinyforge.managed are stamped on new containers.
Backend rewrites:
- internal/deployer: executeDeploy + blueGreenDeploy + rollback +
promote use store.Container natively. New
removeContainer() replaces removeInstance().
enforceMaxInstances reads via
ListContainersByStageID.
- internal/reconciler: legacy tinyforge.instance-id dispatch removed;
upsertByWorkloadLabel now finds existing rows
by docker container ID first and falls back to
the deterministic workloadID:role key.
- internal/stale/scanner: Scan + new FindStaleContainers walk the
containers table; emit StaleContainer JSON.
- internal/stats/collector: ListContainers replaces ListAllInstances.
- internal/webhook/handler: workload-secret lookup tried first; falls back
to project / static_site secret column.
- internal/api: instances.go, stale.go, stats.go, stats_history.go,
projects.go, settings.go, docker.go, dns.go all read /
write through Container.
Docker layer:
- ManagedContainer exposes WorkloadID/Kind/Role from the canonical labels.
- ListContainers filters by tinyforge.managed=true.
- Network creation uses LabelManaged instead of LabelProject.
Frontend:
- Instance type is now a Container alias; .status → .state,
.last_alive_at → .last_seen_at.
- InstanceCard takes stageId as a prop (no longer derived from Instance).
- StaleContainer JSON shape rewritten: { container, workload_name, role,
days_stale }. StaleContainerCard + /containers/stale page updated.
- ProjectCard / homepage / SystemHealthCard filter by .state.
The migration loop now tolerates "no such table" alongside "duplicate
column" / "already exists" so obsolete ALTER TABLE entries targeting the
dropped instances table no-op cleanly on first boot.
Tests: store + deployer + reconciler + webhook + staticsite + notify all
still pass. Frontend svelte-check: zero errors.
|
||
|
|
3e28588f10 |
feat(workload): global Containers tab + frontend client
Adds the user-visible piece of the Workload refactor:
- web/src/lib/types.ts — Workload, Container, ContainerView,
App, WorkloadKind, ContainerState
- web/src/lib/api.ts — listWorkloads, getWorkload,
listWorkloadContainers, setWorkloadAppID,
listContainers (with filter),
CRUD for apps
- web/src/lib/i18n/{en,ru}.json — nav.containers
- web/src/routes/+layout.svelte — "Containers" nav item between Stacks
and Deploy, IconContainer
- web/src/routes/containers/+page.svelte — global Containers table:
* filter chips for kind (project/stack/site) and state
* client-side search across workload name / role / image /
subdomain / container ID prefix
* Workload column links to the kind-specific detail page,
resolved through a one-time /api/workloads call to map
workload_id → ref_id
* existing /containers/stale route untouched
The page renders against the live database now — boot backfill
populated workload rows from existing projects/stacks/sites,
the deployer dual-writes containers on every deploy, and the
30s reconciler keeps the index in sync with `docker ps`.
|
||
|
|
0f60a7a5db |
feat(webhook): inbound delivery audit log
Build / build (push) Successful in 10m35s
Persists every inbound webhook hit (project + site) so users can debug
"why didn't my deploy fire?" without grepping daemon logs. Surfaces a
14-day rolling history under the WebhookPanel on each project + site
detail page; refreshes every 30s while open. Daily cron prunes records
older than 14 days alongside the existing event log prune.
Schema:
- webhook_deliveries(id, target_type, target_id, target_name, received_at,
source_ip, signature_state, status_code, outcome, detail, body_size)
- indexes on (target_type,target_id,received_at) and (received_at)
Backend:
- store: WebhookDelivery model + Insert/List/Prune helpers
- webhook/handler: deferred recordDelivery() captures the final outcome
on every return path including HMAC rejects, image mismatch, no-stage,
auto_deploy=false, and successful deploys; signatureStateFor()
classifies "unconfigured" vs "missing" vs "invalid" vs "valid"
- api: GET /api/{projects,sites}/{id}/webhook/deliveries with
parseLimit() helper (default 50, max 200)
- main: daily prune cron retains the last 14 days
Frontend:
- WebhookDeliveryLog.svelte: panel with refresh button, status code +
outcome + signature badges, relative time tooltip-on-hover for
absolute time, source IP column
- Mounted below WebhookPanel on project + site detail pages
- en/ru i18n strings for outcome/signature enums and column labels
|
||
|
|
831b5c1a43 |
feat(webhook): HMAC-SHA256 signature verification on inbound webhooks
Adds an opt-in inbound HMAC scheme so a leaked URL alone is not enough
to forge deploy/sync requests — the caller must also know a separate
signing secret. Header format is X-Hub-Signature-256, matching the
Gitea/GitHub/GitLab convention so existing CI integrations work without
custom code.
Behaviour:
- per-project / per-site signing_secret is independent of the URL secret
- require_signature flag does a hard 401 on missing/invalid signatures
- even when require_signature is off, an *invalid* submitted signature
returns 401 — surfaces CI misconfiguration instead of silently passing
- comparison uses subtle/hmac.Equal (constant time)
Backend:
- store: webhook_signing_secret + webhook_require_signature columns on
projects + static_sites; scanProject helper, scan helpers updated; new
Set* helpers for both fields
- webhook/handler: verifyHMAC helper, body read once, integrated into
both project and site handlers
- api: per-entity signing-secret rotate / disable / require-toggle
endpoints under /api/{projects,sites}/{id}/webhook/...
Frontend:
- WebhookPanel gains optional signing handlers (no breaking change for
existing callers; signing UI hides when handlers aren't wired)
- one-shot reveal of the issued secret with copy + dismiss
- ToggleSwitch for require-signature, disabled until a secret is issued
- en/ru i18n strings
Tests:
- HMACRequiredAndValid (200 + deploy fires)
- HMACRequiredButMissing (401, no deploy)
- HMACPresentButWrong (401 even when require_signature=false)
- HMACOptionalUnsignedAccepted (200 when neither configured)
|
||
|
|
793570f4a1 |
feat(stats): inline 30-min sparklines on container CPU + memory bars
Always-visible trend lines next to the existing bars on every running instance and static-site card so the user can spot a slow drift or recent spike at a glance, without expanding the full history chart. Implemented as a tiny pure-SVG component (no ECharts on hot list paths) — values are fetched once per 30s alongside the current snapshot via a 30m history query that the collector already serves. - Sparkline.svelte: pure-SVG polyline + optional area fill, normalises values to [0,100] and clamps out-of-range points - ContainerStats: parallel fetch of stats + 30m history, two new $derived arrays for CPU% and memory%, sparklines slotted between the bar and the numeric readout |
||
|
|
2c109913bd |
fix(types): resolve pre-existing svelte-check errors
stale-containers page referenced container.id but StaleContainer wraps an inner instance object — switch to container.instance.id at the three call sites. env editor derived projectId from $page.params.id (string | undefined); coalesce to empty string so call sites that pass it to API helpers don't trip the strictNullChecks gate. Pre-existing errors flagged by `svelte-check`; not introduced by recent feature work but blocking the green check. |
||
|
|
8b886ddf2b |
feat(backup): take Tinyforge DB snapshot before every deploy
Adds an opt-in "auto_backup_before_deploy" setting that triggers a "pre-deploy" backup at the start of every project deploy via the deploy pipeline (covers both the async HTTP path and the sync poller/webhook path). Failures are logged to the deploy log but do not abort — missing a backup is preferable to refusing to ship a fix. - store: settings.auto_backup_before_deploy column + scan/update wiring - backup: accept "pre-deploy" as a valid backup_type - deployer: small PreDeployBackuper interface, hooked into runDeploy right after settings load and before any state-mutating work - api: settings request/response surface the new flag - web: ToggleSwitch on the backup settings page; "Pre-deploy" badge variant in the backup list (badge-warning so it stands out) - i18n: en/ru strings for the toggle, help text, and badge label |
||
|
|
0405ecd9ce |
feat(notify): HMAC-signed outgoing webhooks with per-tier secrets and test sender
Build / build (push) Successful in 10m36s
Outgoing notifications were bare POSTs with no auth and no way to verify
they came from Tinyforge. They also went out from one global URL only,
even though stages had a notification_url field, and static-site sync
emitted no events at all.
Schema: add notification_url + notification_secret (lazy-generated) to
settings, projects, stages and static_sites. Migrations are additive.
Notifier: SendSigned computes HMAC-SHA256 over the exact body bytes and
sends X-Hub-Signature-256 (GitHub-compatible — receivers built for
GitHub/Gitea/Forgejo verify out of the box). Aux headers
X-Tinyforge-Event/Delivery/Timestamp/Tier are advisory and not signed.
Empty secret => unsigned send for back-compat.
Resolution: deploys fall through stage > project > settings, sites fall
through site > settings. The secret travels with the URL that sourced
it, so any tier can sign even when its parents are unsigned. Site sync
events now actually emit (site_sync_success / site_sync_failure).
API: 12 new endpoints — {GET secret, POST regenerate, POST disable,
POST test} for each of the 4 tiers. SendSyncForTest returns
status_code/latency_ms/signature_sent/delivery_id/response_snippet so
the UI surfaces receiver feedback inline.
UI: shared OutgoingWebhookPanel.svelte fits the existing card aesthetic.
Signing-state pill, secret reveal-on-demand, regenerate/disable behind
ConfirmDialog modals (not inline strips — too easy to misclick), send-
test result card with colour-coded status. Wired into Settings →
Integrations, project edit form, per-stage edit, and per-site detail.
EN + RU i18n.
Tests: round-trip (sender signs, receiver verifies), tampered-body and
wrong-secret rejection, unsigned-send omits header, send-test surfaces
4xx, concurrent fan-out via Drain. Resolver precedence locked for both
deploy and site paths.
Docs: docs/webhooks.md with header reference, verifier snippets in
Node/Python/Go, and a recipe for the service-to-notification-bridge
generic webhook provider.
|
||
|
|
134fe22fde |
refactor(ui): standardize boolean inputs on ToggleSwitch
Build / build (push) Successful in 11m22s
Replace raw <input type="checkbox"> with the ToggleSwitch component on sites/[id] (encrypt-secret), sites/new (render-markdown, enable-storage), and stacks/new (deploy-immediately). Document the convention in CLAUDE.md so future forms keep the same control instead of mixing styles. |
||
|
|
a4362b842d |
fix: harden security, fix concurrency bugs, and address review findings
Build / build (push) Successful in 11m42s
Security: - rate limit /api/webhook routes per-IP and cap concurrent site syncs - global SSE connection cap (256) with new sse_gate - validate ?tail= and cap JSON log responses at 4 MiB - strip ANSI/CSI/OSC and control bytes from streamed log lines - redact webhook secret from request log middleware - scrub host details from /api/health for non-admin viewers - drop container_id from /api/system/stats/top for non-admins - generate webhook secrets via crypto/rand; require >=32 chars on insert - verify iid path consistency in streamContainerLogs - LimitReader on site webhook body; reject malformed non-empty bodies Concurrency / correctness: - stats collector: Stop() no longer hangs without Start(), semaphore acquired in parent loop so ctx cancellation short-circuits the queue, in-flight tick cancellable via shared base context, zero-ts guard - webhook handler: replace fire-and-forget goroutine with WaitGroup-tracked workers + Drain() wired into graceful shutdown - $derived(() => ...) mis-idiom fixed in ContainerStats / InstanceCard / ProjectCard (returned function instead of value) - SystemResourcesCard: rename `window` and `t` locals to avoid shadowing globalThis.window and the i18n `t` import Quality / performance: - replace O(n^2) insertion sort with sort.Slice in stats top - runMigrations only swallows duplicate-column / already-exists errors - PruneStatsSamplesBefore wrapped in a transaction - collapse N+1 in unusedImageStats / pruneImages to one ListAllInstances pass; surface DB errors instead of silently treating them as inactive - run Docker Info + DiskUsage in parallel via errgroup - container log SSE emits `: ping` heartbeat every 20 s - imageMatches case-insensitive on registry host (RFC behaviour) - log warning on invalid stage tag pattern instead of silent skip - reject malformed non-empty site webhook payloads Frontend / i18n: - shared formatBytes utility replaces three local copies - statsInterval store drives dynamic "no samples / collection disabled" copy across ContainerStats and SystemResourcesCard - top consumers row now shows owner_name (project/stage or site name) - drop seven `as any` casts on the Settings type; add cloudflare_api_token write-only field - move "Service status", "Docker daemon", "Docker unreachable", "Proxy unreachable", "reachable", and "Docker daemon is not reachable." strings into en/ru i18n bundles |
||
|
|
05440a5f92 |
feat(stats): resource metrics dashboard + sites logs/stats
Build / build (push) Successful in 10m50s
Background collector samples CPU/memory/network/block I/O for every
instance and site on a configurable interval (default 15s, range
5-300s), persists samples to SQLite with a configurable retention
window (default 2h, range 0-24h), and skips ticks gracefully when
the Docker daemon is unreachable. Settings are reloadable without
a restart — each tick re-reads them.
New API endpoints:
- GET /api/system/stats (host snapshot: info + df)
- GET /api/system/stats/history
- GET /api/system/stats/top?by=cpu|memory
- GET /api/projects/{id}/stages/{s}/instances/{iid}/stats/history
- GET /api/sites/{id}/stats[/history]
- GET /api/sites/{id}/logs (SSE + JSON, reuses instance log streamer)
Frontend:
- ECharts added with tree-shaken imports (~180KB gzip) for
future-proof time-series/gantt/graph visualizations
- CollapsibleSection wraps all dashboard sections (system health,
daemons, system resources, static sites, projects) with
localStorage-persisted open state
- SystemResourcesCard shows capacity tiles, workload utilization
chart with 30m/2h/6h/24h window picker, disk breakdown with
reclaimable callouts, and top 5 consumers
- ContainerStats and ContainerLogs take a source discriminated union
so sites reuse the same components as instances; sites detail page
embeds both for Deno backend debugging
- Settings › Maintenance exposes collection interval + retention
- Docker-unavailable state returns 503 and renders an amber banner
instead of a generic 500
Full i18n coverage (en + ru) for all new strings.
|
||
|
|
0632f512e6 |
feat(webhook): per-project and per-site webhook URLs
Build / build (push) Successful in 10m25s
Replace the single global webhook secret with entity-scoped secrets stored
on each project and static site. Webhook-driven project autocreate is
removed — projects must exist before their URL can trigger deploys.
Also wires static-site webhooks (sync_trigger=push|tag), turning the
previously inert "push" trigger into a functional one: POST the site's
webhook URL from a Git provider and Tinyforge re-syncs on matching refs.
- Adds webhook_secret columns + unique indexes to projects and static_sites
- Per-entity GET/regenerate endpoints under /api/projects/{id}/webhook
and /api/sites/{id}/webhook (admin-only)
- Removes /api/settings/webhook-url and the global webhook panel
- Reusable WebhookPanel Svelte component on both detail pages, i18n in en/ru
- Tests for matcher (siteRefMatches, ParseImageRef) and handler (project
match/mismatch/404 and site push/manual/branch-skip)
|
||
|
|
e08acf5c0e |
refactor(settings): split General into focused pages
Build / build (push) Successful in 10m38s
General was a 547-line catch-all mixing seven concerns, destructive
actions (image prune) inches away from form fields, and Cloudflare DNS
buried under four unrelated cards. A single "Save" committed everything
at once — one invalid field blocked valid edits elsewhere.
Splits:
- /settings Overview: timezone, core infra, proxy choice
- /settings/integrations outgoing notification URL + incoming webhook
- /settings/dns wildcard + Cloudflare provider
- /settings/maintenance stale threshold, prune threshold, prune action
(in a dedicated "Danger zone" card)
- /settings/credentials removed (was an 18-line redirect stub)
Sidebar is grouped (Overview / Routing / System / Security) with
section headers; NPM & Traefik items remain conditional on the
proxy-provider choice. Each page loads settings and PUTs only its own
subset, so mistakes on one page can't block edits on another.
No backend changes — the API already accepts Partial<Settings>.
|
||
|
|
03d58a072c |
fix: treat naive backend timestamps as UTC for relative labels
Build / build (push) Successful in 10m35s
Backend emits store.Now() as "2026-04-23 14:05:32" — UTC but without a Z marker or T separator. JavaScript parses such strings as local time, so every "N ago" label is skewed by the browser's UTC offset (3h off in UTC+3, causing the visible mismatch with the tooltip). Normalise bare/space-separated timestamps to UTC inside toDate() so every $fmt.* call, including relative, matches the absolute timestamp shown in the tooltip. InstanceCard drops its own parser and delegates to $fmt.relative for consistency. |
||
|
|
90e6e59d9e |
feat: daemon health panel, brand-rail status chips, user timezone selector
Build / build (push) Successful in 10m35s
- Health API now surfaces Docker /info + /version (version, platform, kernel, container/image counts, storage driver, memory, latency) and NPM aggregates (proxy host total, managed-by-Tinyforge count, access lists, certificates, endpoint URL). - Docker/NPM indicators moved out of the sidebar footer and into a compact mono-styled rail directly under the Tinyforge brand title, with pulse/fault animations and click-to-expand error hints. - New SystemDaemonsCard on the dashboard: two terminal-styled panels (Docker Engine + Proxy) with a running/paused/stopped stacked bar, key-value diagnostics, and a total-vs-managed proportion meter on the proxy-hosts tile. - Shared health store so the sidebar and dashboard share a single 30 s poll instead of duplicating traffic. - User-facing timezone preference with auto-detect fallback; all dates across projects, sites, stacks, settings, backup, event log and stale containers now render through \$fmt.date / \$fmt.datetime. - en/ru translations for both features. |
||
|
|
a182a93950 |
feat: nav counter badges, login backdrop, events i18n + misc fixes
Build / build (push) Successful in 10m29s
Nav & UI polish
- Sidebar nav items show monospace count badges (projects, sites, stacks,
proxies). Events badge shows error count only, styled red as actionable
- New $lib/stores/navCounts.ts polls all counts in parallel every 60s and
refreshes on route change so badges track mutations
- Login page gets a dynamic forge backdrop: rotating conic glow, drifting
embers, dot-grid texture, vignette — all pure CSS, reduced-motion safe
- main element gets scrollbar-gutter: stable so Settings tab switching no
longer shifts horizontally when content heights differ
Events i18n
- events.source.* dictionary rewritten to match actually-emitted backend
sources (deploy, static_site, stale_scanner, stale_cleanup, admin);
dead keys (container, proxy, system) removed
- EventLogFilter.allSources + /events default sources state updated to match
- Localize "{N} total" via events.totalCount in the page hero toolbar
Backend
- Stage API accepts enable_proxy on create/update (defaults to true) so
proxy registration can be opted out per stage
Concurrency
- api.ts: queued request waiters no longer double-increment the inflight
counter; releasing a slot hands it off directly
Reactive effects
- project detail / env / volumes pages wrap side-effect calls in untrack()
to prevent $effect feedback loops when their loaders mutate tracked state
|
||
|
|
ef0669d5dd |
feat: unified THE FORGE // SECTION headers and merged proxy routes
Build / build (push) Successful in 10m37s
UI consistency
- ForgeHero now supports backHref, mono kicker, stats snippet, staggered
entrance animation, and a registration-tick divider
- Every route now opens with the same "THE FORGE // SECTION" eyebrow: projects,
sites, stacks, proxies, events, dns, deploy, settings, stale containers,
site/project detail + env/volumes/browse, new site wizard
- Stacks list/detail/new moved to the shared hero and brand-anchor eyebrow
- Toolbars migrated from bespoke buttons to the shared .forge-btn utilities
- Sidebar footline adds a live UTC "forge clock" and a vim-style g-prefix
quick-nav hint (g d/p/s/k/x/r/e/c jumps to each section)
Proxies page
- Server-side: merge static site proxy routes with instance routes and sort
by domain (internal/api/proxies.go, internal/store/static_sites.go)
- ProxyRoute gains a Source field ("instance" | "static_site")
- Frontend adds source filter tabs and per-source labels/badges
|
||
|
|
0fd92fdfa3 |
feat: Forge design system app-wide + Stacks i18n
Build / build (push) Successful in 10m47s
Promotes the Forge visual language from the Stacks feature into a global design system used across the app: - app.css: Forge utilities (dot-grid backdrop, eyebrow, ember, display/lede, status pills, stat grid, panels, registration marks, alert, terminal, buttons). CSS variables alias the forge display font to the app's standard sans stack (Inter, now properly self-hosted via @fontsource/inter). - +layout.svelte: reskinned sidebar brand, active nav rail, mobile top bar, global h1/h2 typography overrides, main dot-grid backdrop. - Shared components reskinned: EmptyState (breathing-ember empty mark), StatusBadge (mono pills with pulse), ConfirmDialog (registration marks + forge buttons). - Dashboard (+page.svelte): ForgeHero header, forge-stat-grid, Instrument-style section titles with accent. - New ForgeHero component for reusable hero headers. Stacks feature fully localized (EN + RU): - 80+ keys under stacks.* covering list, new, detail, revisions, logs, errors, status labels, delete/rollback dialogs. - Russian uses forge vocabulary (куются/наковальня/куём/etc). - $t() wired through all three Stacks pages. |
||
|
|
75424a5f25 |
feat: docker-compose stacks with Forge-themed UI
Build / build (push) Successful in 10m42s
Adds a new Stacks feature: upload/edit docker-compose YAML, deploy as atomic units, browse revisions, roll back, and stream logs. Backend in internal/stack + internal/api/stacks.go, persistent storage in internal/store/stacks.go. Stacks pages (list, new, detail) use a modern Forge aesthetic — Instrument Serif display type, JetBrains Mono for meta/code, indigo ember accents, dot-grid hero, registration marks on hover, terminal panel for logs. Palette is sourced from the app's existing design tokens so the feature remains consistent with the rest of Tinyforge. Fonts self-hosted via @fontsource/instrument-serif and @fontsource/jetbrains-mono to satisfy the strict CSP. |
||
|
|
b622384774 |
feat: persistent storage for Deno static sites
Build / build (push) Successful in 10m21s
- Add storage_enabled and storage_limit_mb columns to static_sites.
- Create/attach Docker volumes (tinyforge-site-{name}-data) for Deno
sites with storage enabled, mounted at /app/data.
- Grant --allow-write=/app/data in Deno container CMD.
- Add storage usage API endpoint (GET /api/sites/{id}/storage).
- Show storage section in site detail page with usage bar.
- Add storage toggle and limit field to new site wizard.
- Use ConfirmDialog for secret deletion instead of inline delete.
|
||
|
|
9ec25a8d5a |
feat: project detail UX improvements
- Tag picker: replace raw text input with EntityPicker modal showing registry tags (auto-detected by image hostname) and local images. Fix URL encoding bug where encodeURIComponent encoded slashes in image paths, causing 502 on registry tag API. - Stage editing: inline edit form for name, tag pattern, max instances, CPU/memory limits, auto-deploy and proxy toggles. - Stage delete: use ConfirmDialog modal instead of window.confirm(). Immediately remove stage from local state after deletion. - Project-level env: add/edit/delete project env vars (stored in project.env JSON field). Move stage selector inline with Stage Overrides heading so it's clear project env is independent. - Access list UX: rename "None (public)" to "Global default", clarify help text. - Add missing i18n keys for all new UI (en + ru). |
||
|
|
96fd910603 |
fix: resolve ERR_INSUFFICIENT_RESOURCES connection exhaustion
- Add concurrency limiter (max 4 GET requests) to API layer, leaving slots for SSE and health checks. Write ops bypass the limiter. - Add AbortController to ContainerStats, project detail page, and dashboard to cancel in-flight requests on navigation/unmount. - Move global SSE connection from layout to events page (only consumer). - Add 30s heartbeat to SSE endpoint to detect zombie connections. - Serialize dashboard project fetches to avoid parallel burst. - Rebuild frontend in dev-server.sh so go:embed stays in sync. |