tiny-forge

Author	SHA1	Message	Date
alexei.dolgolyov	1c47030854	feat(volsnap): volume snapshot restore (backlog #6) Restore a captured volume snapshot onto an image workload's live host-bind data volumes, then redeploy — the most destructive workload action, built to the adversarially-reviewed design (C1–C6) with all data-loss guards. - Engine.Restore (engine-owned): all-or-nothing pre-flight re-resolution from the workload's CURRENT config (never the tamperable manifest), per-filesystem disk pre-check, per-workload lock, container quiesce, extract-to-tmp, durable pre-restore snapshot, write-ahead journal, atomic rename swap, redeploy, and crash-recovery sweep (RecoverInterruptedRestores) wired before serving. - internal/keyedmutex: shared per-key lock; deployer now serializes every deploy entrypoint per workload via DispatchPlugin (+ LockWorkload/RedeployLocked for the restore re-dispatch, no deadlock). - Untrusted-archive extractor: zip-slip containment, type allow-list (reg/dir only), decompression-bomb cap, manifest-index bounds. - POST /api/workloads/{id}/snapshots/{sid}/restore: admin, X-Confirm-Restore header (CSRF), per-workload single-flight (409). - WebUI: Restore button + danger ConfirmDialog + busy state + i18n (en/ru). Scope: image-source only; scopes absolute/stage/project (driven off the same supportedScopes constant capture uses). Plan-reviewed before coding; per-phase go/security/ts reviews; final review READY TO MERGE. Security review caught + fixed a CRITICAL manifest-Source path traversal (re-derive target from current config + base containment). Plan: plans/volume-snapshot-restore/	2026-06-22 17:23:52 +03:00
alexei.dolgolyov	7733e64b08	feat(gitops): config-as-code via .tinyforge.yml for repo-backed workloads A dockerfile or static workload can opt in to reading its deploy config from a .tinyforge.yml in its own repo. Tinyforge fetches the file, shows field-level drift vs the live config, and an admin applies it with an explicit Sync. The repo becomes the source of truth for the declared fields. Manual-sync only; no auto-apply on deploy, no multi-workload reconcile, no create/delete in v1. Scope is deliberately source-aware and source_config-resident: dockerfile declares port/healthcheck/deploy_strategy, static declares deploy_strategy. The file never carries repo coords or secrets (those stay in the encrypted DB), which keeps credentials out of the repo. Backend: - internal/gitops: Spec/ParseSpec (KnownFields rejects unknown keys), a source-aware ApplyPlan/BuildPlan, MergeAndValidate (omitted-field-preserving deep merge + validate-the-merged-result-then-commit — never a partial config), declared-only Drift with normalization, and Fetch with ok/no_file/fetch_failed/invalid statuses and token-redacted messages. - staticsite: DownloadFile added to GitProvider + Gitea/GitHub/GitLab impls, reusing each provider's SSRF-safe client; 64 KiB cap; ErrFileNotFound. - store: 4 additive gitops_* columns + setters (disjoint from UpdateWorkload so the edit-form save and a sync never clobber each other). - api: GET /workloads/{id}/gitops (status + raw + live drift + managed_fields), PUT /gitops (admin, enable/path, traversal-safe), POST /gitops/sync (admin, per-workload locked read->merge->validate->write, audited to event_log). Frontend: - GitOpsPanel.svelte: status pill, a purpose-built field-level drift view, .tinyforge.yml preview, enable ToggleSwitch, Sync via ConfirmDialog; all five statuses handled, admin affordances gated on the real viewer role. - GitOps-managed badge (list + detail hero) and a read-only edit-form banner. 
- api.ts fetchers + types; i18n apps.detail.gitops.* (en + ru parity). Built phase-by-phase with an adversarial plan review (caught 5 design flaws pre-implementation) and an independent review per phase (go / security / ts / final) — all APPROVE, 0 CRITICAL/HIGH. docs/gitops.md documents the schema and what's intentionally out of v1. Plan: plans/gitops/.	2026-06-21 23:32:02 +03:00
alexei.dolgolyov	0c4c338bfe	feat(apps): per-workload deploy history, rollback, and resource metrics Two additions to the app detail page, each backed by a per-workload endpoint. Deploy history + rollback: - New deploy_history table — a structured, version-pinned ledger of every dispatch (success AND failure), distinct from the free-text event_log. Recorded at the single DispatchPlugin choke point so every source kind is covered. The raw deploy error is never persisted (it can carry registry-auth / compose-stdout secrets) — only a generic marker, with detail going to slog. Pruned to the newest N per workload; cascade- deleted with the workload. - GET /api/workloads/{id}/deploys lists the ledger; POST .../rollback (admin) replays a prior successful deploy's pinned reference as a rollback-reason dispatch. Phase 1 is image-source only (RollbackCapable); git-built sources need checkout-by-commit, a later phase. - DeployHistoryPanel.svelte renders the ledger with confirm-gated rollback. Per-workload metrics: - ListContainerStatsSamplesByWorkload joins the existing container stats samples through the containers index; GET /api/workloads/{id}/stats/history aggregates CPU/memory per timestamp across the workload's containers. - WorkloadMetricsPanel.svelte reuses ResourceChart (CPU% + memory MiB, windowed, 15s poll). en/ru i18n added with parity. Tests: store CRUD + cascade + workload-scoped join, deployer recording (incl. secret-non-leak on failure), API rollback guards, and per-timestamp aggregation. Plans under docs/plans/.	2026-06-19 16:22:12 +03:00
alexei.dolgolyov	c8e71a0c34	refactor(plugin): centralize workload conversion + container cleanup Three packages (api, reconciler, webhook) each carried a private 30-line toPluginWorkload() copy that had drifted — only the api version logged malformed public_faces JSON; the others swallowed it. Hoist the single implementation to plugin.WorkloadFromStore() (convert.go); store is already a plugin dependency so no new import edge or cycle forms. Likewise the dockerfile and static sources each had a private removeContainerByName() that disagreed (remove-all vs stop-at-first). Docker enforces unique container names, so the two were equivalent for every reachable state; converge on plugin.RemoveContainerByName() (container.go, stop-at-first) with a note on why remove-all was moot. Callers migrated; old copies removed. Adds convert_test.go pinning the field-by-field contract and JSON edge cases.	2026-06-19 16:21:54 +03:00
alexei.dolgolyov	6b45ed62bb	feat(snapshots): capture app data-volume snapshots Add per-workload capture of host-bind data volumes as downloadable tar.gz archives: a new internal/volsnap engine (enumerate host-bind volumes via the computeMounts merge, archive with archive/tar+gzip skipping symlinks/special files, per-workload retention + startup orphan cleanup), a volume_snapshots table + store CRUD, admin-gated API (list/snapshotable/create/download/delete), and a Snapshots panel on /apps/[id] that shows coverage and which volumes are skipped (and why). Scope: image-source apps, host-bind scopes (absolute/stage/project); Docker named volumes, tmpfs, and instance scope are surfaced as not-yet-supported. Restore is a separate later phase. Download/FilePath are containment-checked; create returns a typed no-data error (400) vs generic 500. Covered by archiver unit tests + full API e2e.	2026-06-02 14:56:10 +03:00
alexei.dolgolyov	97f338fba3	feat(maintenance): add Docker build-cache prune action Add an admin-only POST /api/docker/prune-build-cache endpoint plus a Settings > Maintenance danger-zone button to reclaim disk used by the Docker build cache (image + static-site builds), which previously grew unbounded with no UI lever. Prunes unused-only (all=false) so a warm cache is preserved for apps redeploying soon. Mirrors the existing prune-images vertical slice; full en/ru i18n parity.	2026-06-02 13:34:05 +03:00
alexei.dolgolyov	fa6d5bd3ba	feat(secrets): scoped shared secrets — backend + API (Phase 1) Secrets defined once and applied to many workloads by scope (global or per-app), encrypted at rest and resolved into container env as a low-precedence default layer: global-shared < app-shared < image cfg.Env < workload_env. A workload with no applicable shared secrets is byte-identical to the prior workload_env-only behavior. - store: shared_secrets table + CRUD + ListApplicableSharedSecrets (enabled global + app, global-first), UNIQUE(scope,app_id,name). - plugin.ResolveSharedSecrets + integration into BuildWorkloadEnv (static/dockerfile) and image buildEnv; best-effort — a shared-secret store/decrypt error never fails a deploy, and values are never logged. - REST CRUD at /api/shared-secrets (reads authed, mutations AdminOnly); values encrypted at the boundary via crypto.Encrypt and never returned (only a has_value flag), mirroring workload_env. UNIQUE collisions 409. Compose is out of scope (YAML-defined env). Frontend rule UI is Phase 2. Reviewed: go + security APPROVE (0 CRITICAL/HIGH); two MEDIUMs fixed (translateSQLError -> 409, no driver-message leak). Deferred defense-in- depth: json:"-" on the model value + a description length cap.	2026-05-29 15:26:09 +03:00
alexei.dolgolyov	cdb9fd57d1	feat(alerts): metric-threshold alerting (backend + API) Operators can define metric-threshold alert rules (cpu_percent, memory_percent, memory_bytes; gt/lt) per-workload or global via /api/metric-alert-rules. A periodic evaluator (internal/metricalert, 30s tick) checks the freshest container stats sample per container against enabled rules and, on breach (per-rule-per-workload cooldown), emits into the existing event_log + bus pipeline (source "metric_alert", workload_id set). Alerts therefore surface on the global events page, the per-app activity timeline, and any configured event-trigger webhook -- no new notification plumbing. Mirrors the log_scan_rules store/API/route patterns and the stats.Collector lifecycle. Rule CRUD reads are authed, mutations AdminOnly. Frontend rule-config UI is a follow-up phase. Reviewed: go APPROVE (0 CRITICAL/HIGH).	2026-05-29 14:06:23 +03:00
alexei.dolgolyov	93b6911b34	feat(apps): per-app deploy/activity timeline Every deploy across all four source kinds now writes a workload-scoped event via a shared plugin.EmitDeployEvent helper (replacing the inline emit duplicated in static/dockerfile, standardizing static's metadata key site_id->workload_id, and adding emission to image+compose which were silent). New indexed event_log.workload_id column, EventLogFilter .WorkloadID, and GET /api/workloads/{id}/events (id pinned from path). Frontend: a forge "Activity" panel on /apps/[id] reusing EventLogEntry, live SSE prepend filtered by workload_id, load-more pagination, an All/Errors severity filter, and a shared toEventLogEntry mapper. en/ru i18n parity. Security: compose's failure status emits a generic reason instead of raw `docker compose up` output, which can echo app secrets and egresses to operator webhooks (NotificationURL + event-trigger actions); full detail stays only in the returned error. Rune-safe 256-rune status cap. Reviewed: go + typescript APPROVE; security HIGH fixed.	2026-05-29 13:51:17 +03:00
alexei.dolgolyov	410a131cec	feat(apps): stepped creation wizard, branch previews, and app-creation fixes This session (frontend focus): - Rebuild /apps/new as a 4-step wizard (Basics → Configure → Trigger → Review): WizardRail, SourceKindPicker card grid, AppManifest review, per-step validation, ConfirmDialog-based unsaved-changes guard. - Extract lib/workload/sourceForms.ts (single source of truth for source_config) + {Image,Compose,Static,Dockerfile}SourceForm + StaticDiscoveryWizard; fold the /apps/[id] edit form onto the same components (removes the duplication). Add vitest + sourceForms unit tests. - Branch preview environments UI: /chain is_preview/preview_branch + a Preview environments panel on /apps/[id] (per-branch URLs, ConfirmDialog teardown, armed state); RegistryImagePicker on the registry trigger and the image source. - Fixes: image-inspect 404 -> admin-gated POST /api/discovery/image/inspect; conflict-panel blur flicker; friendly localized discovery errors; CPU/Memory label hints; dashboard + /apps "Total workloads" count only source_kind workloads (drop stale trigger_kind gate); NPM cert/access-list name cache; EntityPicker empty-list guard. - Update CLAUDE.md frontend conventions + add a Build & Test section. Also captures pre-existing in-progress platform work (not from this session): workload notifications, Prometheus metrics export, store lockfile, health probes, backup hardening, and related store/webhook/scheduler changes.	2026-05-29 02:09:54 +03:00
alexei.dolgolyov	ea55d31177	feat(discovery+runtime): restore static-site wizard discovery + close /sites/[id] feature parity Two-stage feature arc closing the gaps left by the hard legacy cutover. The static-site creation wizard regains its auto-discovery + connection-test flow; /apps/[id] grows the runtime/storage/lifecycle surface the legacy /sites/[id] page used to expose. Backend (Go) - internal/api/discovery.go: six admin-gated endpoints wrapping staticsite.GitProvider — POST /api/discovery/git/{detect-provider, test-connection,repos,branches,tree} + GET /api/discovery/image/conflicts. Identifier validation (validateGitIdent / validateGitBranch) at the boundary so provider URL interpolation cannot be hijacked via `..`. Upstream errors scrubbed: detailed slog on the server, generic 502 to the client (mitigates token-reflection-in-error-page). - internal/api/workload_runtime.go: four endpoints — GET /api/workloads/{id}/runtime-state decodes containers.extra_json for static workloads; GET /api/workloads/{id}/storage execs `du -sb /app/data` with a 30s in-process cache (storageProbeCache) so polling can't turn into per-request execs; POST /api/workloads/{id}/{stop,start} iterate ListContainersByWorkload and call docker.StopContainer / StartContainer, returning 200 / 409 (nothing to act on) / 502 (all failed). - internal/staticsite/safehttp.go: NewSafeHTTPClient + ValidateBaseURL + blockReason. DialContext re-resolves hostnames and refuses loopback / link-local / multicast / unspecified addresses. RFC1918 + ULA explicitly allowed (self-hosted Gitea on LAN is the dominant deployment). Replaced four raw &http.Client{} constructions in the provider files. - internal/staticsite/gitlab_provider.go: url.PathEscape each segment in the raw-file URL builder for parity with projectPath(). 
- Test coverage: 26 cases in discovery_test.go (image-tag stripping, source-config decoding, conflict scenarios, validator boundaries, scheme rejection), 14 in workload_runtime_test.go (404 / 409 / nil-docker / probe-cache), 16 in safehttp_test.go (URL validation + block-reason policy matrix + live dial against loopback + AWS metadata literals). Frontend (Svelte 5 + runes) - web/src/lib/api.ts: typed wrappers for every endpoint, AbortSignal threaded through post(); ApiError exported so callers can narrow on e.status; new DetectedGitProvider narrow union. - web/src/routes/apps/new/+page.svelte: static-form discovery controls (auto-detect provider, test connection, repo / branch / folder EntityPickers, Deno auto-detect); image-form conflict panel with debounced lookup + double-click submit guard ("Forge anyway") + Inspect button that pre-fills port/healthcheck; English error fallbacks routed through apps.new.errors.* (en + ru). - web/src/routes/apps/[id]/+page.svelte: runtime-state panel + storage panel + Stop / Start / Open-site toolbar; universal live-state badge in the hero lede for image/compose/static (RUNNING / TRANSITIONING / STOPPED / NOT DEPLOYED / MIXED · n/m RUNNING); ContainerStats panel per row (auto-collapsing native <details> when N > 2); read-only webhook bindings summary card; responsive toolbar overflow with native <details> at <640px (z-index 100 above sticky nav). - web/src/app.css: project-wide .forge-btn-ghost:focus-visible outline. Hardening from go-reviewer + security-reviewer + typescript-reviewer + frontend-design UI/UX subagents (0 CRITICAL, all HIGH/BLOCKER addressed inline, IMPORTANT applied before commit): - AbortController + per-call sequence tokens on every long-running fetch (loadRuntimeState / loadStorage / loadTriggerMeta / inspectImage / listImageConflicts) plus onDestroy cleanup so late resolves cannot mutate dead component state. 
- doStop / doStart snapshot and restore `error` across the finally-block reload so a load()-cleared message doesn't hide a real failure. - triggersById refreshed after inline trigger creation so the webhook card doesn't silently exclude the just-created trigger. - Live-state badge wraps in role=status / aria-live=polite (no redundant aria-label). - Webhook row has a single click target (was two pointing at the same URL). - Empty webhook section hides entirely. - Dropped role=menu / role=menuitem from the overflow menu (they would promise arrow-key nav we don't wire; native Tab + ESC carry it). Doc - docs/CODEMAPS/INDEX.md + new docs/CODEMAPS/discovery-and-runtime.md map the endpoint surface, security posture, frontend integration patterns, and an "add a new probe" recipe. Verification - svelte-check: 0 errors, 3 pre-existing a11y warnings. - go build + go vet + go test ./...: all green. - i18n parity: en + ru at 1413 keys each. - Live smoke against :8090: 404 / 409 / 502 envelopes correct, discovery sanity passes, ProbeError surfaces on no-container path.	2026-05-16 21:35:51 +03:00
alexei.dolgolyov	5e78f13e06	refactor(triggers): review followups — fire-now, dedupe trigger pages, hardening Follow-ups on commit `39e1e36` addressing review feedback from go-reviewer / security-reviewer / typescript-reviewer. Backend: - New POST /api/triggers/{id}/fire (AdminOnly, schedule-only): operator "Fire now" button — dispatches immediately without waiting for the next natural interval. Persists last_fired_at BEFORE dispatch, same ordering as the scheduler. Per-trigger in-flight guard (429 if a fire is already running) to defend against rapid double-clicks / runaway scripts. Refuses request when AdminOnly claims are absent rather than logging an unattributable deploy. - SetTriggerLastFired now validates timestamp parses as RFC3339 before writing. Rejects empty string explicitly — empty-clears semantics were dead (no caller) and would silently re-fire on next tick if ever accidentally written. A future reset-cadence flow must add a dedicated ClearTriggerLastFired so the call site is grep-able and separately auditable. - Scheduler logs WARN on catch-up fires (now - lastFired > 2× interval) so the "surprise burst at restart" pattern shows up in audit logs. - BindingResult reason strings extracted to package consts (webhook.Reason*) so the scheduler and api fire-now classifications stay in sync without string-matching drift. - SECURITY NOTE on FanOutForTrigger documents that the WebhookRequireSignature gate is ingress-only by design. Frontend: - Refactored /triggers/new (770 LOC → 155 LOC) and /triggers/[id] (~350 LOC dropped) to use the shared TriggerKindForm. Eliminates the triplicated per-kind state + buildConfig + canSubmit + template that caused the d-unit regex drift in the prior commit. - New seedTriggerKindFormState helper on TriggerKindForm primes the form from a server-returned trigger config with defensive type guards; resets per-kind slots first so re-seeding across kinds doesn't inherit stale state. 
- /triggers/[id] gains a Schedule status panel with Last Fired + Fire Now button (gated on binding_count > 0). Confirmation dialog, result flash, timer cleanup on unmount + new-fire (no stale-closure race). EN+RU i18n parity.	2026-05-16 12:16:47 +03:00
alexei.dolgolyov	39e1e36510	feat(triggers): add schedule trigger kind + internal scheduler Fourth trigger kind alongside registry/git/manual. Recurring time-interval fires driven by a new internal/scheduler tick loop (default 30s, clamped to 5m). Goes through the same webhook.Handler.FanOutForTrigger seam as inbound HTTP webhooks, so per-binding concurrency, outcome accounting, and config-merge semantics are identical. Schema: triggers.last_fired_at TEXT column (additive ALTER for existing DBs). Scheduler persists last_fired_at BEFORE dispatch so a panicking Match cannot wedge a tight loop; failed deploys wait one full interval before retry — correct trade-off for a periodic refresh trigger. Frontend: TriggerKindForm + /triggers/new + /triggers/[id] gain the schedule kind (4-col card grid, preset chips Hourly/Daily/Weekly, custom interval input matched to Go time.ParseDuration syntax, optional pinned reference). /triggers/[id] surfaces "last fired" on schedule rows. EN+RU i18n in parity. Review fixes from go-reviewer / security-reviewer / typescript-reviewer: - Scheduler Start/Stop wrapped in sync.Once (no goroutine leak / double- cancel panic on shutdown re-entry). - shouldFire rejects sub-MinInterval as defense-in-depth against hand-inserted rows that bypassed Validate. - fire() asserts trigger Kind=="schedule" before dispatching. - Aligned isValidInterval regex across all three frontend sites; reject the unsupported "d" unit (Go time.ParseDuration doesn't accept it). - formatLastFired falls back to lastFiredNever on malformed timestamps rather than leaking raw bytes into the UI. - main.go scheduler closure logs per-fire deployed/errored counts.	2026-05-16 11:24:05 +03:00
alexei.dolgolyov	e3c7b13d58	chore(workload): close the workload-first arc — apps i18n + codemap + tests Closes the workload-first refactor by landing the Priority 3 polish items and the Priority 4 test gap. Net: ~2,400 lines added, ~350 lines modified across 13 files. Priority 3 — polish - apps.* i18n namespace: 276 new keys across apps.list.* (27), apps.new.* (91, sibling of existing apps.new.triggers.*), and apps.detail.* (158, sibling of existing apps.detail.bindings.*). EN+RU at 1314 keys each, perfectly in sync. /apps, /apps/new, /apps/[id] now render entirely from i18n. - New codemap docs/CODEMAPS/workload-plugin.md (238 lines): Source × Trigger contract, dispatch seam, webhook fan-out path, recipes for adding a new Source or Trigger kind. Plus docs/CODEMAPS/INDEX.md gateway. Priority 4 — tests - internal/api/workloads_test.go (new, ~30 subtests): /api/workloads CRUD + deploy + delete + env + volumes + chain + promote-from + triggers list/inline-bind + auth gating + standalone /api/triggers CRUD (create / dup-409 / kind filter / delete). Uses real POST handlers via httptest.NewServer + a fake plugin source registered under "testfakesource". - internal/deployer/dispatch_test.go (new, 11 tests): DispatchPlugin / DispatchTeardown / DispatchReconcile happy + unknown-kind + propagated-error each; PluginDeps wiring; a real 2s-bounded RWMutex deadlock probe on PluginDeps vs SetDNSProvider. - internal/workload/plugin/source/compose/compose_test.go (new, ~26 subtests): composeProjectName sanitization, writeYAML / writeYAMLIfChanged hash short-circuit, Validate happy + bad inputs, Kind / SchemaSample. Coverage delta on the workload-plugin path: - internal/api: 1.1% → 16.0% - internal/deployer: 0% → 54.1% - internal/workload/plugin/source/compose: 0% → 38.5% - Trigger plugins already at 87-95% from the trigger-split work. 
Production fix surfaced by the tests - store.CreateWorkload now self-references RefID = ID when caller leaves RefID empty (the typical plugin-native path). The api layer's broken backfill loop (called UpdateWorkload, which deliberately omits ref_id) is gone. Multiple sibling plugin workloads can now coexist under the UNIQUE(kind, ref_id) constraint. Review fixes addressed before commit - CRITICAL: deadlock-detect test gained a real 2s time.After (was selecting on context.Background().Done() which never fires). - HIGH: happy-path test now hard-asserts RefID = ID (was a t.Logf that would silently pass after a production fix). - HIGH: standalone /api/triggers CRUD coverage added (was bypassed by the workload-side bind flow). - HIGH: seedWorkload bypass deleted; tests now go through the real POST /api/workloads handler. - MEDIUM: withTempDir restore is a no-op (t.Setenv auto-restores); dead `old := os.Getenv(...)` capture removed. - MEDIUM: list-workloads test now asserts ID membership, not just count. Doc - WORKLOAD_REFACTOR_TODO: all three Priority 1 items, Priority 3 polish, and Priority 4 tests marked DONE. The workload-first arc is closed.	2026-05-16 06:42:43 +03:00
alexei.dolgolyov	739b67856a	feat(cutover): hard legacy cutover — drop projects/stacks/sites/deploys The clean-break delete that closes the workload-first refactor arc. Net diff: ~30 backend files deleted, ~20 modified, ~12k LOC removed on the Go side; entire /projects /stacks /sites /deploy frontend trees gone; ~6.7k LOC removed on the Svelte/TypeScript side. Backend - API handlers gone: internal/api/{projects,stages,stage_env,stacks, static_sites,deploys,instances,volume_browser}.go - Store CRUD + tests gone: internal/store/{projects,stages,stage_env, stacks,static_sites,static_site_secrets,deploys,poll_state,volumes, workload_sync}.go (+ _test.go siblings) - Legacy deployer pipeline gone: internal/deployer/{bluegreen,promote, rollback,subdomain,resolver_test}.go; deployer.go trimmed to just the dispatch surface used by the plugin pipeline - internal/staticsite/{manager,healthcheck}.go and internal/stack/manager.go gone (the rest of those packages stay as helpers imported by the static + compose plugins) - internal/registry/poller.go gone (legacy registry poller) - internal/volume.ResolvePath gone; ResolveWorkloadPath stays - internal/webhook: handleWebhook (project) + handleSiteWebhook (site) gone; only POST /api/webhook/triggers/{secret} remains - workload-side webhook URL handlers (getWorkloadWebhook + regenerateWorkloadWebhook + EnsureWorkloadWebhookSecret + SetWorkloadWebhookSecret + GetWorkloadByWebhookSecret) gone — they minted URLs that would 404 against the new trigger-only ingress - cmd/server/main.go: dropped staticsite.Manager, stack.Manager, staticsite.HealthChecker, registry poller, SetSiteSyncTriggerer, SetStaticSiteManager, SetStackManager, wireStaticBackend - store/store.go: idempotent DROP TABLE IF EXISTS for every legacy table (projects, stages, stage_env, volumes, deploys, deploy_logs, poll_states, stacks, stack_revisions, stack_deploys, static_sites, static_site_secrets); FK order 
children-then-parents - store/models.go: dropped Project, Stage, Deploy, DeployLog, StageEnv, Volume, StaticSite, StaticSiteSecret, Stack, StackRevision, StackDeploy types; kept WorkloadKind constants as documented strings - internal/store/helpers.go (new): BoolToInt, rowScanner, GenerateWebhookSecret extracted from deleted CRUD files - internal/api/secrets.go (new): forwards to store.GenerateWebhookSecret so api + store paths share one secret-generation impl (no panic-vs-UUID-fallback divergence) - internal/reconciler/reconciler.go: dropped legacy stack-by-compose + static-site label paths; only canonical tinyforge.workload.id dispatch remains - providers (gitea_content/github_provider/gitlab_provider) gained path-traversal rejection on every tree entry - internal/webhook ParsedImage / ParseImageRef demoted to package- private (no external callers) Frontend - /projects /stacks /sites /deploy routes deleted (entire trees) - ProjectCard / InstanceCard / StaleContainerCard components deleted - api.ts: dropped every project/stage/stack/site/deploy/instance helper + types (Project, Stage, Stack, StaticSite, Deploy, Instance, Volume, etc.); kept Workload, Container, App, Settings, Registry, EventTrigger, LogScanRule, webhook envelopes - WorkloadWebhook type + getWorkloadWebhook/regenerateWorkloadWebhook api functions gone (mirror of the backend deletion above) - web/src/routes/+layout.svelte: dropped /projects /sites /stacks /deploy nav entries, trimmed quick-nav keymap - web/src/routes/+page.svelte: dashboard rewrite — reads listWorkloads + listContainers only; 4-card stat grid (workloads/running/failed/stale) + recent workloads strip - navCounts.ts, SystemHealthCard.svelte, ContainerLogs.svelte, ContainerStats.svelte, StatusBadge.svelte, TagCombobox.svelte, proxies/+page.svelte, containers/+page.svelte all rewired to the workload-first surface - AbortController plumbing on dashboard, nav-counts, stale page, SystemHealthCard so navigation doesn't leave dangling fetches 
- i18n: dropped projects., projectDetail., envEditor., volumeEditor., volumeBrowser., quickDeploy., sites., stacks., instance., confirm. namespaces; en/ru parity preserved (1042 keys each) Hardening from go-reviewer + security-reviewer + typescript-reviewer subagent passes (0 CRITICAL across all three; 1 HIGH + ~12 MEDIUM addressed inline before commit): - Sec H1: dead-end workload webhook URL handlers (would mint URLs that 404 the new trigger-only ingress) deleted across backend + frontend - Go M1: IsTerminalDeployStatus dropped (no production callers) - Go M2: ParsedImage/ParseImageRef lowercased (in-package only) - Go M6: generateWebhookSecret unified — api shim forwards to store.GenerateWebhookSecret - Doc/comment freshness: stage_id (no longer FK), ProxyRoute legacy field names, workloadIDRow rationale, webhook_deliveries.target_type enum, WebhookDeliveryLog component header Doc - WORKLOAD_REFACTOR_TODO: cutover marked DONE; all three Priority 1 items are now shipped. Next focus is Priority 3 polish (apps.* i18n + codemap entries) and Priority 4 tests. Behavioral notes for operators upgrading from a pre-cutover build - Existing rows in the dropped tables disappear on first boot. - Legacy webhook URLs at /api/webhook/{secret} and /api/webhook/sites/{secret} return 404; CI configs must repoint to /api/webhook/triggers/{secret} (the trigger-split boot backfill lifted any embedded workload secret onto a Trigger row, so the secret value itself carries over). - Frontend routes /projects /stacks /sites /deploy are gone; nav links replaced with /apps and /triggers.	2026-05-16 06:00:21 +03:00
alexei.dolgolyov	2aff22f565	feat(triggers): first-class triggers + bindings with fan-out webhook Promote triggers from embedded workload fields to standalone records joined to workloads via workload_trigger_bindings. One trigger (webhook, registry watcher, git push, manual) now fans out to many workloads with per-binding config overrides (top-level JSON merge, binding wins). Backend - new triggers + workload_trigger_bindings tables with ON DELETE CASCADE - boot-time backfill of embedded trigger config inside per-workload tx - store.ErrUnique sentinel translates SQLite UNIQUE at store boundary - /api/triggers CRUD + /api/triggers/{id}/{webhook,bindings} - /api/bindings/{id} update/delete; /api/workloads/{id}/triggers list+bind - bindTriggerToWorkload accepts trigger_id or inline {kind,name,config} - inline-create uses CreateTriggerWithBindingTx (no orphan triggers) - validateBindingConfig enforces 8 KiB cap + plugin Validate on merged - ListTriggersWithBindingCount + ListBindingsWithNames remove N+1 - POST /api/webhook/triggers/{secret} resolves trigger then fans out - bounded worker pool (4) per request; per-binding error isolation - outcome accounting: deployed / skipped / no-match / errored - legacy /api/webhook/workloads/{secret} route removed (clean break; backfill keeps secrets resolvable at the new /triggers/{secret} path) - reconciler gate dropped from (Source && Trigger) to Source only - MergeJSONConfig returns freshly allocated slices (no fan-out aliasing) - WithEffectiveTrigger lets existing Trigger.Match contract stay unchanged Frontend - /triggers list, new wizard, [id] detail (bindings, webhook rotate) - workload create wizard: NEW / PICK / SKIP trigger modes - workload detail: bindings panel + Add-trigger modal (inline / pick) - per-binding override editor with merged-preview + 8 KiB guard - "OVERRIDES n FIELDS" row badge when binding_config is non-empty - shared TriggerKindForm component (registry / 
git / manual + JSON) - 3 raw <input type=checkbox> replaced with <ToggleSwitch> - full EN + RU i18n: redeployTriggers., apps.detail.bindings., apps.new.triggers., nav.triggers; event-triggers nav disambiguated Doc - WORKLOAD_REFACTOR_TODO: trigger-split marked DONE; next focus is the static-source inline port + hard legacy cutover (Priority 1)	2026-05-16 02:24:31 +03:00
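The "top-level JSON merge, binding wins" rule in the commit above can be sketched roughly as follows. This is a minimal illustration only: `mergeTopLevel` and the sample field names (`branch`, `auto_deploy`, `match`) are hypothetical and are not taken from the repository's MergeJSONConfig.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mergeTopLevel is a hypothetical stand-in for a top-level JSON merge.
// Keys from bindingCfg replace keys from triggerCfg wholesale; nested
// objects are NOT merged recursively. The result is a freshly marshalled
// slice, so callers fanning out to many bindings never share backing memory.
func mergeTopLevel(triggerCfg, bindingCfg []byte) ([]byte, error) {
	merged := map[string]json.RawMessage{}
	for _, layer := range [][]byte{triggerCfg, bindingCfg} {
		if len(layer) == 0 {
			continue // an empty override means "inherit everything"
		}
		var fields map[string]json.RawMessage
		if err := json.Unmarshal(layer, &fields); err != nil {
			return nil, fmt.Errorf("config layer is not a JSON object: %w", err)
		}
		for key, value := range fields {
			merged[key] = value // later layer (the binding) wins
		}
	}
	return json.Marshal(merged)
}

func main() {
	// Hypothetical trigger config and per-binding override.
	out, err := mergeTopLevel(
		[]byte(`{"branch":"main","auto_deploy":true,"match":{"tag":"v*"}}`),
		[]byte(`{"branch":"staging","match":{"path":"web/"}}`),
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Because the merge is top-level only, the binding's `match` object replaces the trigger's `match` object entirely rather than being combined with it.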
alexei.dolgolyov	7a9ff7ad54	feat(observability): event triggers + log scanner backend Two paired backends sharing the events.Bus seam: Event triggers (consumer-side): - internal/store/event_triggers.go — CRUD with action_secret redaction on read (placeholder echo treated as "no change" on PATCH so secrets aren't accidentally wiped). - internal/events/dispatcher.go — bus subscriber, AND-composed filters (severity CSV, source CSV, message regex with memoized compile cache). Structural loop-prevention: never writes to event_log. Sends via notifier.SendPayload. - internal/notify: SendPayload + SendSyncForTestPayload methods, TierEventTrigger constant, doSendRaw shared with the legacy Event-shaped path. - internal/api/event_triggers.go — admin-gated CRUD + /test sending the real TriggerWebhookPayload shape. SSRF guard rejects loopback / link-local / unspecified targets. PATCH uses pointer-typed DTO for partial updates. Log scanner (producer-side): - internal/logscanner/ — engine (per-rule cooldown + per-container token bucket, atomic drop counters), tail (multiplexed docker frame demuxer with TTY fallback + 16 MiB payload cap + 1 MiB reassembly cap + RFC3339Nano-validated timestamp strip + UTF-8-safe message truncation), manager (5s container polling, atomic.Pointer[Snapshot] hot-reload, HitEmitter writes event_log + publishes EventLog so the trigger dispatcher picks them up immediately). - internal/docker/container.go — ContainerLogsOpts exposes stream selection for stderr-only / stdout-only rules. - internal/store: log_scan_rules table + CRUD with EffectiveLogScanRules resolver (globals minus per-workload overrides plus workload-only additions). Transactional cascade-delete of overrides when a global rule is removed. - internal/api/log_scan_rules.go — admin-gated CRUD + /test (sample_line → matched/captures) + /stats (drop counters + active tail count + last-snapshot compile errors) + GET /api/workloads/{id}/effective-rules. cmd/server/main.go wires both subsystems next to the existing RegisterPersistentLogger. Coverage spans engine cooldown / bucket counter tests, snapshot effective-set semantics, manager compile-error capture, dispatcher matching, store validation + cascade-delete, API URL validator + secret redaction. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-11 22:18:11 +03:00
alexei.dolgolyov	8d6a527a2b	refactor(workload): plugin architecture wave + apps UI + volume scopes Completes the workload-first refactor's plugin layer: - internal/workload/plugin/ — Source/Trigger plugin contract, registry, types (Workload, DeploymentIntent, InboundEvent, PublicFace). Self-registering init() pattern + blank-import in cmd/server/main.go. - Source plugins: image (blue-green with multi-face proxy routing), compose, static. Trigger plugins: registry, git, manual. - internal/deployer/dispatch.go — DispatchPlugin/Teardown/Reconcile seam routing the legacy deployer through plugins. - internal/api/workload_*.go — REST surface: workloads, env, volumes, chain (parent/children), promote-from. hooks.go serves /api/hooks/kinds/{kind}/schema for the wizard. - internal/store: workload_env (encrypt-at-rest secrets) and workload_volumes tables, keyed on workload_id. - cmd/server/static_backend.go — phantom-row adapter delegating the static source plugin to the legacy staticsite.Manager (deleted at hard cutover once the static inline port lands). - web/src/routes/apps/ — /apps list + /apps/new wizard + /apps/[id] detail with kind-aware compose / image / static forms (Advanced JSON toggle), env panel, volumes panel, webhook panel, chain panel, manual deploy. Volume scope generalization (v2 resolver): - internal/volume.ResolveWorkloadPath (workload-keyed, sits next to legacy ResolvePath). Honors all VolumeScope values: absolute, ephemeral, instance, stage, project, project_named, named. internal/workload/plugin/source/image/image.go computeMounts wires settings + imageTag through. Coverage in internal/volume/resolver_test.go (portable Linux/Windows via t.TempDir). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-11 22:17:41 +03:00
alexei.dolgolyov	cba2149aa9	refactor(workload): finalize containers index + post-review hardening Wraps up the workload refactor with the fixes that came out of the multi-agent code review (see docs/plans/workload-refactor.md "What actually shipped"). Backend: - store.ReconcileContainer: separate write path so the 30s reconciler tick no longer overwrites deployer-owned fields (subdomain, proxy_route_id, npm_proxy_id, image_tag). - Container.stage_id column + index; ListProxyRoutes / ListContainersByStageID join via stage_id (survives stage rename), with legacy fallback to (project_id, role=stage_name). - Reconciler: workload-existence check (rejects forged tinyforge.workload.id labels), skips inventing project-kind rows, child-context cancel before wg.Wait() on shutdown. - Transactional CRUD across projects / stacks / static_sites: parent UPDATE and workload sync land in one transaction so secret rotations are durable. - Webhook routing reads exclusively through workloads.webhook_secret; legacy GetProjectByWebhookSecret / GetStaticSiteByWebhookSecret fallback removed. - store.GetStackByComposeProjectName + indexed lookup (no more full-table stack scan per compose container per tick). - store.ListMissingSweepRows: filtered query for the missing-sweep. - /api/instances/* handlers verify (workload_id, role) match URL (project_id, stage_name) before mutating — closes the cross-project hijack the security review flagged. - extra_json no longer referenced from Go (column kept on disk for now). Frontend: - WorkloadContainers.svelte: generic detail-page panel reusable by stack and site detail pages. - Containers page polish: client-side kind/state filters over an unfiltered fetch, URL-synced filters, race-safe loads via sequence number, EN+RU i18n, sidebar counter via navCounts.containers. Misc: - scripts/dev-server.sh: tolerate empty netstat grep result. - .gitignore: ignore docker-watcher binaries, .claude/worktrees/, .facts-sync.json.	2026-05-09 15:44:41 +03:00
alexei.dolgolyov	d8ab22876f	refactor(workload): extract Instance entirely; Container is canonical End-to-end extraction of the Instance concept. After this commit: * internal/store/instances.go — DELETED * internal/store/models.go — Instance struct gone, ProxyRoute moved here * containers table is the single source of truth for project/stack/site container state. instances table is dropped via DROP TABLE migration (idempotent; re-runnable on every boot). * Legacy tinyforge.project / tinyforge.stage / tinyforge.instance-id Docker labels are no longer emitted; only tinyforge.workload.{id,kind}, tinyforge.role, and tinyforge.managed are stamped on new containers. Backend rewrites: - internal/deployer: executeDeploy + blueGreenDeploy + rollback + promote use store.Container natively. New removeContainer() replaces removeInstance(). enforceMaxInstances reads via ListContainersByStageID. - internal/reconciler: legacy tinyforge.instance-id dispatch removed; upsertByWorkloadLabel now finds existing rows by docker container ID first and falls back to the deterministic workloadID:role key. - internal/stale/scanner: Scan + new FindStaleContainers walk the containers table; emit StaleContainer JSON. - internal/stats/collector: ListContainers replaces ListAllInstances. - internal/webhook/handler: workload-secret lookup tried first; falls back to project / static_site secret column. - internal/api: instances.go, stale.go, stats.go, stats_history.go, projects.go, settings.go, docker.go, dns.go all read / write through Container. Docker layer: - ManagedContainer exposes WorkloadID/Kind/Role from the canonical labels. - ListContainers filters by tinyforge.managed=true. - Network creation uses LabelManaged instead of LabelProject. Frontend: - Instance type is now a Container alias; .status → .state, .last_alive_at → .last_seen_at. - InstanceCard takes stageId as a prop (no longer derived from Instance). - StaleContainer JSON shape rewritten: { container, workload_name, role, days_stale }. StaleContainerCard + /containers/stale page updated. - ProjectCard / homepage / SystemHealthCard filter by .state. The migration loop now tolerates "no such table" alongside "duplicate column" / "already exists" so obsolete ALTER TABLE entries targeting the dropped instances table no-op cleanly on first boot. Tests: store + deployer + reconciler + webhook + staticsite + notify all still pass. Frontend svelte-check: zero errors.	2026-05-09 14:43:12 +03:00
alexei.dolgolyov	0acbcda084	feat(workload): /api/workloads /api/containers /api/apps endpoints Adds the read API surface that the global Containers view (and the per-workload container panel on project/stack/site detail pages) consume. - GET /api/workloads (?kind=) → workload list - GET /api/workloads/{id} → single workload - GET /api/workloads/{id}/containers → workload's containers - PATCH /api/workloads/{id}/app → assign/clear app_id (admin) - GET /api/containers (?workload_id=&kind=&state=&app_id=) → global index, decorated with workload + app name so the table renders without N+1 fetches - GET /api/containers/{id} → single container row - GET /api/apps → list - GET /api/apps/{id} → single - POST /api/apps → create (admin) - PUT /api/apps/{id} → update (admin) - DELETE /api/apps/{id} → delete (admin) — clears app_id on owning workloads but leaves them assigned-to-none Mutations on projects/stacks/sites still go through the existing kind-specific endpoints; the new surface is read-only at the workload layer.	2026-05-09 13:52:31 +03:00
alexei.dolgolyov	7f2d1bdae1	feat(workload): switch buildActiveImagesSet to containers index First consumer migration off the instances table. The image prune logic now walks the normalized containers.image_ref column directly — one DB pass against a single table instead of joining instances against projects to reconstruct the full "image:tag" string. Demonstrates the consumer-switch pattern the remaining read sites (proxies, stale scanner, webhook matcher) will follow. The legacy `projects []store.Project` parameter is kept on the function signature for now so call sites don't change in this commit; the underscore-discard in the body makes it explicit that it's no longer load-bearing.	2026-05-09 13:47:20 +03:00
alexei.dolgolyov	0f60a7a5db	feat(webhook): inbound delivery audit log Persists every inbound webhook hit (project + site) so users can debug "why didn't my deploy fire?" without grepping daemon logs. Surfaces a 14-day rolling history under the WebhookPanel on each project + site detail page; refreshes every 30s while open. Daily cron prunes records older than 14 days alongside the existing event log prune. Schema: - webhook_deliveries(id, target_type, target_id, target_name, received_at, source_ip, signature_state, status_code, outcome, detail, body_size) - indexes on (target_type,target_id,received_at) and (received_at) Backend: - store: WebhookDelivery model + Insert/List/Prune helpers - webhook/handler: deferred recordDelivery() captures the final outcome on every return path including HMAC rejects, image mismatch, no-stage, auto_deploy=false, and successful deploys; signatureStateFor() classifies "unconfigured" vs "missing" vs "invalid" vs "valid" - api: GET /api/{projects,sites}/{id}/webhook/deliveries with parseLimit() helper (default 50, max 200) - main: daily prune cron retains the last 14 days Frontend: - WebhookDeliveryLog.svelte: panel with refresh button, status code + outcome + signature badges, relative time tooltip-on-hover for absolute time, source IP column - Mounted below WebhookPanel on project + site detail pages - en/ru i18n strings for outcome/signature enums and column labels	2026-05-07 02:40:39 +03:00
alexei.dolgolyov	831b5c1a43	feat(webhook): HMAC-SHA256 signature verification on inbound webhooks Adds an opt-in inbound HMAC scheme so a leaked URL alone is not enough to forge deploy/sync requests — the caller must also know a separate signing secret. Header format is X-Hub-Signature-256, matching the Gitea/GitHub/GitLab convention so existing CI integrations work without custom code. Behaviour: - per-project / per-site signing_secret is independent of the URL secret - require_signature flag does a hard 401 on missing/invalid signatures - even when require_signature is off, an invalid submitted signature returns 401 — surfaces CI misconfiguration instead of silently passing - comparison uses subtle/hmac.Equal (constant time) Backend: - store: webhook_signing_secret + webhook_require_signature columns on projects + static_sites; scanProject helper, scan helpers updated; new Set* helpers for both fields - webhook/handler: verifyHMAC helper, body read once, integrated into both project and site handlers - api: per-entity signing-secret rotate / disable / require-toggle endpoints under /api/{projects,sites}/{id}/webhook/... Frontend: - WebhookPanel gains optional signing handlers (no breaking change for existing callers; signing UI hides when handlers aren't wired) - one-shot reveal of the issued secret with copy + dismiss - ToggleSwitch for require-signature, disabled until a secret is issued - en/ru i18n strings Tests: - HMACRequiredAndValid (200 + deploy fires) - HMACRequiredButMissing (401, no deploy) - HMACPresentButWrong (401 even when require_signature=false) - HMACOptionalUnsignedAccepted (200 when neither configured)	2026-05-07 02:34:40 +03:00
alexei.dolgolyov	8b886ddf2b	feat(backup): take Tinyforge DB snapshot before every deploy Adds an opt-in "auto_backup_before_deploy" setting that triggers a "pre-deploy" backup at the start of every project deploy via the deploy pipeline (covers both the async HTTP path and the sync poller/webhook path). Failures are logged to the deploy log but do not abort — missing a backup is preferable to refusing to ship a fix. - store: settings.auto_backup_before_deploy column + scan/update wiring - backup: accept "pre-deploy" as a valid backup_type - deployer: small PreDeployBackuper interface, hooked into runDeploy right after settings load and before any state-mutating work - api: settings request/response surface the new flag - web: ToggleSwitch on the backup settings page; "Pre-deploy" badge variant in the backup list (badge-warning so it stands out) - i18n: en/ru strings for the toggle, help text, and badge label	2026-05-07 02:14:26 +03:00
alexei.dolgolyov	0405ecd9ce	feat(notify): HMAC-signed outgoing webhooks with per-tier secrets and test sender Outgoing notifications were bare POSTs with no auth and no way to verify they came from Tinyforge. They also went out from one global URL only, even though stages had a notification_url field, and static-site sync emitted no events at all. Schema: add notification_url + notification_secret (lazy-generated) to settings, projects, stages and static_sites. Migrations are additive. Notifier: SendSigned computes HMAC-SHA256 over the exact body bytes and sends X-Hub-Signature-256 (GitHub-compatible — receivers built for GitHub/Gitea/Forgejo verify out of the box). Aux headers X-Tinyforge-Event/Delivery/Timestamp/Tier are advisory and not signed. Empty secret => unsigned send for back-compat. Resolution: deploys fall through stage > project > settings, sites fall through site > settings. The secret travels with the URL that sourced it, so any tier can sign even when its parents are unsigned. Site sync events now actually emit (site_sync_success / site_sync_failure). API: 12 new endpoints — {GET secret, POST regenerate, POST disable, POST test} for each of the 4 tiers. SendSyncForTest returns status_code/latency_ms/signature_sent/delivery_id/response_snippet so the UI surfaces receiver feedback inline. UI: shared OutgoingWebhookPanel.svelte fits the existing card aesthetic. Signing-state pill, secret reveal-on-demand, regenerate/disable behind ConfirmDialog modals (not inline strips — too easy to misclick), send-test result card with colour-coded status. Wired into Settings → Integrations, project edit form, per-stage edit, and per-site detail. EN + RU i18n. Tests: round-trip (sender signs, receiver verifies), tampered-body and wrong-secret rejection, unsigned-send omits header, send-test surfaces 4xx, concurrent fan-out via Drain. Resolver precedence locked for both deploy and site paths. Docs: docs/webhooks.md with header reference, verifier snippets in Node/Python/Go, and a recipe for the service-to-notification-bridge generic webhook provider.	2026-05-07 02:03:32 +03:00
alexei.dolgolyov	a4362b842d	fix: harden security, fix concurrency bugs, and address review findings Security: - rate limit /api/webhook routes per-IP and cap concurrent site syncs - global SSE connection cap (256) with new sse_gate - validate ?tail= and cap JSON log responses at 4 MiB - strip ANSI/CSI/OSC and control bytes from streamed log lines - redact webhook secret from request log middleware - scrub host details from /api/health for non-admin viewers - drop container_id from /api/system/stats/top for non-admins - generate webhook secrets via crypto/rand; require >=32 chars on insert - verify iid path consistency in streamContainerLogs - LimitReader on site webhook body; reject malformed non-empty bodies Concurrency / correctness: - stats collector: Stop() no longer hangs without Start(), semaphore acquired in parent loop so ctx cancellation short-circuits the queue, in-flight tick cancellable via shared base context, zero-ts guard - webhook handler: replace fire-and-forget goroutine with WaitGroup-tracked workers + Drain() wired into graceful shutdown - $derived(() => ...) mis-idiom fixed in ContainerStats / InstanceCard / ProjectCard (returned function instead of value) - SystemResourcesCard: rename `window` and `t` locals to avoid shadowing globalThis.window and the i18n `t` import Quality / performance: - replace O(n^2) insertion sort with sort.Slice in stats top - runMigrations only swallows duplicate-column / already-exists errors - PruneStatsSamplesBefore wrapped in a transaction - collapse N+1 in unusedImageStats / pruneImages to one ListAllInstances pass; surface DB errors instead of silently treating them as inactive - run Docker Info + DiskUsage in parallel via errgroup - container log SSE emits `: ping` heartbeat every 20 s - imageMatches case-insensitive on registry host (RFC behaviour) - log warning on invalid stage tag pattern instead of silent skip - reject malformed non-empty site webhook payloads Frontend / i18n: - shared formatBytes utility replaces three local copies - statsInterval store drives dynamic "no samples / collection disabled" copy across ContainerStats and SystemResourcesCard - top consumers row now shows owner_name (project/stage or site name) - drop seven `as any` casts on the Settings type; add cloudflare_api_token write-only field - move "Service status", "Docker daemon", "Docker unreachable", "Proxy unreachable", "reachable", and "Docker daemon is not reachable." strings into en/ru i18n bundles	2026-05-07 00:56:14 +03:00
alexei.dolgolyov	05440a5f92	feat(stats): resource metrics dashboard + sites logs/stats Background collector samples CPU/memory/network/block I/O for every instance and site on a configurable interval (default 15s, range 5-300s), persists samples to SQLite with a configurable retention window (default 2h, range 0-24h), and skips ticks gracefully when the Docker daemon is unreachable. Settings are reloadable without a restart — each tick re-reads them. New API endpoints: - GET /api/system/stats (host snapshot: info + df) - GET /api/system/stats/history - GET /api/system/stats/top?by=cpu|memory - GET /api/projects/{id}/stages/{s}/instances/{iid}/stats/history - GET /api/sites/{id}/stats[/history] - GET /api/sites/{id}/logs (SSE + JSON, reuses instance log streamer) Frontend: - ECharts added with tree-shaken imports (~180KB gzip) for future-proof time-series/gantt/graph visualizations - CollapsibleSection wraps all dashboard sections (system health, daemons, system resources, static sites, projects) with localStorage-persisted open state - SystemResourcesCard shows capacity tiles, workload utilization chart with 30m/2h/6h/24h window picker, disk breakdown with reclaimable callouts, and top 5 consumers - ContainerStats and ContainerLogs take a source discriminated union so sites reuse the same components as instances; sites detail page embeds both for Deno backend debugging - Settings › Maintenance exposes collection interval + retention - Docker-unavailable state returns 503 and renders an amber banner instead of a generic 500 Full i18n coverage (en + ru) for all new strings.	2026-04-24 15:02:43 +03:00
alexei.dolgolyov	0632f512e6	feat(webhook): per-project and per-site webhook URLs Replace the single global webhook secret with entity-scoped secrets stored on each project and static site. Webhook-driven project autocreate is removed — projects must exist before their URL can trigger deploys. Also wires static-site webhooks (sync_trigger=push|tag), turning the previously inert "push" trigger into a functional one: POST the site's webhook URL from a Git provider and Tinyforge re-syncs on matching refs. - Adds webhook_secret columns + unique indexes to projects and static_sites - Per-entity GET/regenerate endpoints under /api/projects/{id}/webhook and /api/sites/{id}/webhook (admin-only) - Removes /api/settings/webhook-url and the global webhook panel - Reusable WebhookPanel Svelte component on both detail pages, i18n in en/ru - Tests for matcher (siteRefMatches, ParseImageRef) and handler (project match/mismatch/404 and site push/manual/branch-skip)	2026-04-23 15:18:19 +03:00
alexei.dolgolyov	90e6e59d9e	feat: daemon health panel, brand-rail status chips, user timezone selector - Health API now surfaces Docker /info + /version (version, platform, kernel, container/image counts, storage driver, memory, latency) and NPM aggregates (proxy host total, managed-by-Tinyforge count, access lists, certificates, endpoint URL). - Docker/NPM indicators moved out of the sidebar footer and into a compact mono-styled rail directly under the Tinyforge brand title, with pulse/fault animations and click-to-expand error hints. - New SystemDaemonsCard on the dashboard: two terminal-styled panels (Docker Engine + Proxy) with a running/paused/stopped stacked bar, key-value diagnostics, and a total-vs-managed proportion meter on the proxy-hosts tile. - Shared health store so the sidebar and dashboard share a single 30 s poll instead of duplicating traffic. - User-facing timezone preference with auto-detect fallback; all dates across projects, sites, stacks, settings, backup, event log and stale containers now render through $fmt.date / $fmt.datetime. - en/ru translations for both features.	2026-04-23 14:32:30 +03:00
alexei.dolgolyov	a182a93950	feat: nav counter badges, login backdrop, events i18n + misc fixes Nav & UI polish - Sidebar nav items show monospace count badges (projects, sites, stacks, proxies). Events badge shows error count only, styled red as actionable - New $lib/stores/navCounts.ts polls all counts in parallel every 60s and refreshes on route change so badges track mutations - Login page gets a dynamic forge backdrop: rotating conic glow, drifting embers, dot-grid texture, vignette — all pure CSS, reduced-motion safe - main element gets scrollbar-gutter: stable so Settings tab switching no longer shifts horizontally when content heights differ Events i18n - events.source.* dictionary rewritten to match actually-emitted backend sources (deploy, static_site, stale_scanner, stale_cleanup, admin); dead keys (container, proxy, system) removed - EventLogFilter.allSources + /events default sources state updated to match - Localize "{N} total" via events.totalCount in the page hero toolbar Backend - Stage API accepts enable_proxy on create/update (defaults to true) so proxy registration can be opted out per stage Concurrency - api.ts: queued request waiters no longer double-increment the inflight counter; releasing a slot hands it off directly Reactive effects - project detail / env / volumes pages wrap side-effect calls in untrack() to prevent $effect feedback loops when their loaders mutate tracked state	2026-04-22 18:30:34 +03:00
alexei.dolgolyov	ef0669d5dd	feat: unified THE FORGE // SECTION headers and merged proxy routes UI consistency - ForgeHero now supports backHref, mono kicker, stats snippet, staggered entrance animation, and a registration-tick divider - Every route now opens with the same "THE FORGE // SECTION" eyebrow: projects, sites, stacks, proxies, events, dns, deploy, settings, stale containers, site/project detail + env/volumes/browse, new site wizard - Stacks list/detail/new moved to the shared hero and brand-anchor eyebrow - Toolbars migrated from bespoke buttons to the shared .forge-btn utilities - Sidebar footline adds a live UTC "forge clock" and a vim-style g-prefix quick-nav hint (g d/p/s/k/x/r/e/c jumps to each section) Proxies page - Server-side: merge static site proxy routes with instance routes and sort by domain (internal/api/proxies.go, internal/store/static_sites.go) - ProxyRoute gains a Source field ("instance" | "static_site") - Frontend adds source filter tabs and per-source labels/badges	2026-04-22 16:27:55 +03:00
alexei.dolgolyov	75424a5f25	feat: docker-compose stacks with Forge-themed UI Adds a new Stacks feature: upload/edit docker-compose YAML, deploy as atomic units, browse revisions, roll back, and stream logs. Backend in internal/stack + internal/api/stacks.go, persistent storage in internal/store/stacks.go. Stacks pages (list, new, detail) use a modern Forge aesthetic — Instrument Serif display type, JetBrains Mono for meta/code, indigo ember accents, dot-grid hero, registration marks on hover, terminal panel for logs. Palette is sourced from the app's existing design tokens so the feature remains consistent with the rest of Tinyforge. Fonts self-hosted via @fontsource/instrument-serif and @fontsource/jetbrains-mono to satisfy the strict CSP.	2026-04-16 03:48:37 +03:00
alexei.dolgolyov	b622384774	feat: persistent storage for Deno static sites - Add storage_enabled and storage_limit_mb columns to static_sites. - Create/attach Docker volumes (tinyforge-site-{name}-data) for Deno sites with storage enabled, mounted at /app/data. - Grant --allow-write=/app/data in Deno container CMD. - Add storage usage API endpoint (GET /api/sites/{id}/storage). - Show storage section in site detail page with usage bar. - Add storage toggle and limit field to new site wizard. - Use ConfirmDialog for secret deletion instead of inline delete.	2026-04-13 00:12:51 +03:00
alexei.dolgolyov	96fd910603	fix: resolve ERR_INSUFFICIENT_RESOURCES connection exhaustion - Add concurrency limiter (max 4 GET requests) to API layer, leaving slots for SSE and health checks. Write ops bypass the limiter. - Add AbortController to ContainerStats, project detail page, and dashboard to cancel in-flight requests on navigation/unmount. - Move global SSE connection from layout to events page (only consumer). - Add 30s heartbeat to SSE endpoint to detect zombie connections. - Serialize dashboard project fetches to avoid parallel burst. - Rebuild frontend in dev-server.sh so go:embed stays in sync.	2026-04-13 00:12:14 +03:00
alexei.dolgolyov	791cd4d6af	feat: rename Docker Watcher to Tinyforge Rebrand the project as Tinyforge to reflect its evolution from a Docker container watcher into a self-hosted mini CI/deployment platform. Rename covers: Go module path, Docker labels, DB/config filenames, JWT issuer, Dockerfile binary, docker-compose, CI workflows, frontend i18n, README with static sites docs, and all code comments.	2026-04-12 21:30:39 +03:00
alexei.dolgolyov	8d2c5a063b	feat: static sites feature with Gitea/GitHub/GitLab support and Deno backend Deploy static content from Git repository folders with optional server-side API endpoints. Supports Gitea/Forgejo/Gogs, GitHub, and GitLab with provider autodetection. - New Sites entity with CRUD, encrypted secrets, and manual/push/tag sync triggers - Pluggable GitProvider interface with three implementations - Deno container mode: auto-generates router from API_{method}_{name} exports - Static container mode: nginx serving files with optional markdown rendering - Wizard UI with provider selector, repo picker, branch/folder tree pickers - Deploy pipeline builds fresh image, starts container, configures NPM proxy - Stop/Start buttons, force redeploy on manual trigger - Periodic health checker detects crashed containers - Proxy route existence check during auto-sync	2026-04-11 03:35:57 +03:00
alexei.dolgolyov	b0816502bf	feat: configurable unused images threshold with dashboard warning - Add image_prune_threshold_mb setting (default 1024 MB) - Add GET /api/docker/unused-images endpoint returning unused image count, size, and threshold status - Dashboard shows amber warning banner when unused project images exceed threshold - Banner links to settings page for pruning, shows count and human-readable size (MB/GB) - Threshold configurable in Docker Image Cleanup section of settings - DB migration + schema for image_prune_threshold_mb	2026-04-05 14:34:48 +03:00
alexei.dolgolyov	21ffef2ee2	feat: separate Public IP for DNS records from Server IP, improve settings help texts - Add public_ip field to Settings for DNS A records (proxy/load balancer IP) - DNS records now use public_ip, falling back to server_ip if empty - Server IP renamed to "Server IP (Docker Host)" for clarity - Public IP labeled "Public IP (DNS Target)" - Updated help texts for domain, server IP, public IP, and Docker network - DB migration + schema for public_ip column	2026-04-05 14:12:53 +03:00
alexei.dolgolyov	d03cc3c811	feat: container logs viewer with SSE streaming and line limiter - Add GET /api/projects/{id}/stages/{stage}/instances/{iid}/logs endpoint - Supports JSON mode (returns array of lines) and SSE mode (streams in real-time) - Docker log stream header (8-byte prefix) stripped automatically - ContainerLogs component with: - Tail line selector (50/200/500/1000) - Follow button for real-time streaming via SSE - Auto-scroll to bottom - Dark terminal-style display - Close button - Logs button (events icon) on each instance card - i18n keys in EN and RU	2026-04-05 14:04:45 +03:00
alexei.dolgolyov	ac3132d172	feat: show local Docker images on project detail page - Add GET /api/projects/{id}/images endpoint returning local images matching the project - Add ListImagesByRef with tag, size, and created timestamp to Docker client - Display images table on project page with tag, ID (truncated), size (MB), and created date - Only shown when Docker is available and images exist locally	2026-04-05 13:56:55 +03:00
alexei.dolgolyov	5577851f22	feat: project-scoped Docker image prune, conflict fix, deploy toggle, access list picker - Image prune only removes images matching project image refs, skips active instances - Add ListImagesByRef and RemoveImage to Docker client - Fix 409 conflict: use listProjects instead of duplicate POST - Add "Deploy immediately" toggle to Quick Deploy (off by default) - Replace raw access list ID with EntityPicker on project edit form - Trigger proxy resync on access list change - Fix stage form layout: single responsive row - Fix empty port default on project creation - Improve inspect error message for remote Docker	2026-04-05 13:49:20 +03:00
alexei.dolgolyov	a830378c5b	fix: replace access list ID field with EntityPicker, add deploy toggle, improve UX - Replace raw NPM access list ID input with EntityPicker on project edit form - Resolve access list name from NPM API when editing project - Add "Deploy immediately" toggle to Quick Deploy (off by default) - Fix stage form layout: all fields on same row with toggles - Fix empty port default on project creation (placeholder instead of pre-filled) - Improve inspect error message when Docker is unavailable - Trigger proxy resync when NPM access list changes - Resolve access list name on NPM settings page load	2026-04-05 13:07:09 +03:00
alexei.dolgolyov	7550fe9e32	feat: CPU/RAM limits per stage, NPM access list (global + per-project) Resource limits: - Add cpu_limit (cores) and memory_limit (MB) fields to Stage model - Pass limits to Docker container via NanoCPUs and Memory in HostConfig - Add CPU/Memory fields to stage creation form in project detail - 0 = unlimited (default) NPM access list: - Add npm_access_list_id to Settings (global default) and Project (per-project override) - Per-project overrides global when > 0 - NPM provider passes access_list_id when configuring proxy hosts - Add GET /api/settings/npm-access-lists endpoint to list NPM access lists - Add access list picker on NPM settings page (global) - Add access list ID field on project edit form (per-project) - DB migrations for all new columns	2026-04-05 12:44:26 +03:00
alexei.dolgolyov	c6d20ca26e	feat: NPM access list support (global default + per-project override)	2026-04-05 12:38:20 +03:00
alexei.dolgolyov	4ff8daafc4	fix: reconcile instance status with Docker on list, add IsContainerRunning	2026-04-05 02:42:31 +03:00
alexei.dolgolyov	12d78bec99	fix: instance link includes domain, project delete cleans up containers and proxies - InstanceCard appends settings domain to subdomain link (stage-dev-app.example.com instead of just stage-dev-app) - Project deletion now removes Docker containers and proxy routes before deleting DB records - Pass domain from settings to InstanceCard via project detail page	2026-04-05 02:38:32 +03:00
alexei.dolgolyov	b54481aff8	fix: NPM remote toggle auto-save, proxy resync on remote change, webhook URL as path - Remote NPM toggle now auto-saves immediately when toggled - Toggling npm_remote triggers proxy resync (re-creates routes with server_ip or container name) - Webhook URL shows just the path (/api/webhook/{secret}) instead of full URL with wrong domain - Fix tag dropdown: resolve registry ID from name before fetching tags - Remove unused fmt import	2026-04-05 02:27:41 +03:00
alexei.dolgolyov	195ef3e7e5	feat: NPM remote mode for cross-machine deployments - Add npm_remote setting: when enabled, proxy forwards to server_ip with published host ports instead of Docker container names - Deployer looks up assigned host port via InspectContainerPort in remote mode - Auto-remove stale containers with same name before creating new ones - Add Remote NPM toggle with warning on NPM settings page - DB migration + schema for npm_remote column	2026-04-05 02:18:06 +03:00
alexei.dolgolyov	c26c41e6a1	feat: enable proxy toggle on quick deploy, event log clearing, and UX fixes - Add enable_proxy toggle to Quick Deploy form (defaults to on) - Add DELETE /api/events/log/{id} and DELETE /api/events/log endpoints - Add Clear All button with confirmation on Events page - Rename "NPM Proxy" to "Enable Proxy" on stage form (provider-agnostic) - Fix polling interval validation (min 60s) and number input trim errors - Fix domain field no longer required in settings	2026-04-05 01:50:19 +03:00

94 Commits