2e26f555c5c795e53cdc793173853c7f3ddfcebc
13 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
279f373f80 |
docs(extra_json): policy doc for containers.extra_json evolution
New CODEMAPS/container-extra-json.md documents the contract every source plugin must follow when reading or writing containers.extra_json. Closes the open architectural question that was tracked in WORKLOAD_REFACTOR_TODO.md. Covers: - Schema position (column default, four write-path normalization sites) and ownership model (per-source row keys, current writers). - Reader rules: tolerate unknown keys via default json.Unmarshal, tolerate decode failure where first-class columns suffice. - Writer patterns: wholesale-overwrite (image source, single-writer short-lived rows) vs preserve-unknown-keys (static source, RMW with generic-map round-trip). Preserve-unknown-keys is the recommended default for new sources. - Concurrency: SetMaxOpenConns(1) + WAL gives atomic per-row writes and consistent reader snapshots, but does NOT serialize multi- goroutine RMW — a per-workload sync.Mutex is required for that (fenced by TestSaveState_ConcurrentWritesDoNotLoseUpdates). - What extra_json is NOT for (workload config, cross-source state, queryable data, secrets) and a checklist for adding a new field. - Pointers to every example in tree: image's containerExtra writer/ reader, static's saveState round-trip, workload_runtime.go's decode-and-tolerate consumer. WORKLOAD_REFACTOR_TODO Container.extra_json question flipped to DONE. CODEMAPS/INDEX bumped + entry linked. Reviewer pass (code-reviewer subagent) caught one HIGH factual error (wrong cross-source consumer claim) and several MEDIUM/LOW drifts; all addressed inline before commit. |
||
|
|
ea55d31177 |
feat(discovery+runtime): restore static-site wizard discovery + close /sites/[id] feature parity
Build / build (push) Successful in 10m43s
Two-stage feature arc closing the gaps left by the hard legacy cutover.
The static-site creation wizard regains its auto-discovery + connection-test
flow; /apps/[id] grows the runtime/storage/lifecycle surface the legacy
/sites/[id] page used to expose.
Backend (Go)
- internal/api/discovery.go: six admin-gated endpoints wrapping
staticsite.GitProvider — POST /api/discovery/git/{detect-provider,
test-connection,repos,branches,tree} + GET /api/discovery/image/conflicts.
Identifier validation (validateGitIdent / validateGitBranch) at the
boundary so provider URL interpolation cannot be hijacked via `..`.
Upstream errors scrubbed: detailed slog on the server, generic 502 to
the client (mitigates token-reflection-in-error-page).
- internal/api/workload_runtime.go: four endpoints —
GET /api/workloads/{id}/runtime-state decodes containers.extra_json for
static workloads; GET /api/workloads/{id}/storage execs `du -sb /app/data`
with a 30s in-process cache (storageProbeCache) so polling can't turn
into per-request execs; POST /api/workloads/{id}/{stop,start} iterate
ListContainersByWorkload and call docker.StopContainer / StartContainer,
returning 200 / 409 (nothing to act on) / 502 (all failed).
- internal/staticsite/safehttp.go: NewSafeHTTPClient + ValidateBaseURL +
blockReason. DialContext re-resolves hostnames and refuses loopback /
link-local / multicast / unspecified addresses. RFC1918 + ULA explicitly
allowed (self-hosted Gitea on LAN is the dominant deployment).
Replaced four raw &http.Client{} constructions in the provider files.
- internal/staticsite/gitlab_provider.go: url.PathEscape each segment in
the raw-file URL builder for parity with projectPath().
- Test coverage: 26 cases in discovery_test.go (image-tag stripping,
source-config decoding, conflict scenarios, validator boundaries,
scheme rejection), 14 in workload_runtime_test.go (404 / 409 / nil-docker
/ probe-cache), 16 in safehttp_test.go (URL validation + block-reason
policy matrix + live dial against loopback + AWS metadata literals).
Frontend (Svelte 5 + runes)
- web/src/lib/api.ts: typed wrappers for every endpoint, AbortSignal
threaded through post(); ApiError exported so callers can narrow on
e.status; new DetectedGitProvider narrow union.
- web/src/routes/apps/new/+page.svelte: static-form discovery controls
(auto-detect provider, test connection, repo / branch / folder
EntityPickers, Deno auto-detect); image-form conflict panel with
debounced lookup + double-click submit guard ("Forge anyway") + Inspect
button that pre-fills port/healthcheck; English error fallbacks routed
through apps.new.errors.* (en + ru).
- web/src/routes/apps/[id]/+page.svelte: runtime-state panel + storage
panel + Stop / Start / Open-site toolbar; universal live-state badge
in the hero lede for image/compose/static (RUNNING / TRANSITIONING /
STOPPED / NOT DEPLOYED / MIXED · n/m RUNNING); ContainerStats panel
per row (auto-collapsing native <details> when N > 2); read-only
webhook bindings summary card; responsive toolbar overflow with native
<details> at <640px (z-index 100 above sticky nav).
- web/src/app.css: project-wide .forge-btn-ghost:focus-visible outline.
Hardening from go-reviewer + security-reviewer + typescript-reviewer +
frontend-design UI/UX subagents (0 CRITICAL, all HIGH/BLOCKER addressed
inline, IMPORTANT applied before commit):
- AbortController + per-call sequence tokens on every long-running
fetch (loadRuntimeState / loadStorage / loadTriggerMeta / inspectImage /
listImageConflicts) plus onDestroy cleanup so late resolves cannot
mutate dead component state.
- doStop / doStart snapshot and restore `error` across the finally-block
reload so a load()-cleared message doesn't hide a real failure.
- triggersById refreshed after inline trigger creation so the webhook
card doesn't silently exclude the just-created trigger.
- Live-state badge wraps in role=status / aria-live=polite (no redundant
aria-label).
- Webhook row has a single click target (was two pointing at the same URL).
- Empty webhook section hides entirely.
- Dropped role=menu / role=menuitem from the overflow menu (they would
promise arrow-key nav we don't wire; native Tab + ESC carry it).
Doc
- docs/CODEMAPS/INDEX.md + new docs/CODEMAPS/discovery-and-runtime.md
map the endpoint surface, security posture, frontend integration
patterns, and an "add a new probe" recipe.
Verification
- svelte-check: 0 errors, 3 pre-existing a11y warnings.
- go build + go vet + go test ./...: all green.
- i18n parity: en + ru at 1413 keys each.
- Live smoke against :8090: 404 / 409 / 502 envelopes correct, discovery
sanity passes, ProbeError surfaces on no-container path.
|
||
|
|
39e1e36510 |
feat(triggers): add schedule trigger kind + internal scheduler
Build / build (push) Successful in 10m42s
Fourth trigger kind alongside registry/git/manual. Recurring time-interval fires driven by a new internal/scheduler tick loop (default 30s, clamped to 5m). Goes through the same webhook.Handler.FanOutForTrigger seam as inbound HTTP webhooks, so per-binding concurrency, outcome accounting, and config-merge semantics are identical. Schema: triggers.last_fired_at TEXT column (additive ALTER for existing DBs). Scheduler persists last_fired_at BEFORE dispatch so a panicking Match cannot wedge a tight loop; failed deploys wait one full interval before retry — correct trade-off for a periodic refresh trigger. Frontend: TriggerKindForm + /triggers/new + /triggers/[id] gain the schedule kind (4-col card grid, preset chips Hourly/Daily/Weekly, custom interval input matched to Go time.ParseDuration syntax, optional pinned reference). /triggers/[id] surfaces "last fired" on schedule rows. EN+RU i18n in parity. Review fixes from go-reviewer / security-reviewer / typescript-reviewer: - Scheduler Start/Stop wrapped in sync.Once (no goroutine leak / double- cancel panic on shutdown re-entry). - shouldFire rejects sub-MinInterval as defense-in-depth against hand-inserted rows that bypassed Validate. - fire() asserts trigger Kind=="schedule" before dispatching. - Aligned isValidInterval regex across all three frontend sites; reject the unsupported "d" unit (Go time.ParseDuration doesn't accept it). - formatLastFired falls back to lastFiredNever on malformed timestamps rather than leaking raw bytes into the UI. - main.go scheduler closure logs per-fire deployed/errored counts. |
||
|
|
e3c7b13d58 |
chore(workload): close the workload-first arc — apps i18n + codemap + tests
Build / build (push) Successful in 10m36s
Closes the workload-first refactor by landing the Priority 3 polish items and the Priority 4 test gap. Net: ~2,400 lines added, ~350 lines modified across 13 files. Priority 3 — polish - apps.* i18n namespace: 276 new keys across apps.list.* (27), apps.new.* (91, sibling of existing apps.new.triggers.*), and apps.detail.* (158, sibling of existing apps.detail.bindings.*). EN+RU at 1314 keys each, perfectly in sync. /apps, /apps/new, /apps/[id] now render entirely from i18n. - New codemap docs/CODEMAPS/workload-plugin.md (238 lines): Source × Trigger contract, dispatch seam, webhook fan-out path, recipes for adding a new Source or Trigger kind. Plus docs/CODEMAPS/INDEX.md gateway. Priority 4 — tests - internal/api/workloads_test.go (new, ~30 subtests): /api/workloads CRUD + deploy + delete + env + volumes + chain + promote-from + triggers list/inline-bind + auth gating + standalone /api/triggers CRUD (create / dup-409 / kind filter / delete). Uses real POST handlers via httptest.NewServer + a fake plugin source registered under "testfakesource". - internal/deployer/dispatch_test.go (new, 11 tests): DispatchPlugin / DispatchTeardown / DispatchReconcile happy + unknown-kind + propagated-error each; PluginDeps wiring; a real 2s-bounded RWMutex deadlock probe on PluginDeps vs SetDNSProvider. - internal/workload/plugin/source/compose/compose_test.go (new, ~26 subtests): composeProjectName sanitization, writeYAML / writeYAMLIfChanged hash short-circuit, Validate happy + bad inputs, Kind / SchemaSample. Coverage delta on the workload-plugin path: - internal/api: 1.1% → 16.0% - internal/deployer: 0% → 54.1% - internal/workload/plugin/source/compose: 0% → 38.5% - Trigger plugins already at 87-95% from the trigger-split work. Production fix surfaced by the tests - store.CreateWorkload now self-references RefID = ID when caller leaves RefID empty (the typical plugin-native path). The api layer's broken backfill loop (called UpdateWorkload, which deliberately omits ref_id) is gone. Multiple sibling plugin workloads can now coexist under the UNIQUE(kind, ref_id) constraint. Review fixes addressed before commit - CRITICAL: deadlock-detect test gained a real 2s time.After (was selecting on context.Background().Done() which never fires). - HIGH: happy-path test now hard-asserts RefID = ID (was a t.Logf that would silently pass after a production fix). - HIGH: standalone /api/triggers CRUD coverage added (was bypassed by the workload-side bind flow). - HIGH: seedWorkload bypass deleted; tests now go through the real POST /api/workloads handler. - MEDIUM: withTempDir restore is a no-op (t.Setenv auto-restores); dead `old := os.Getenv(...)` capture removed. - MEDIUM: list-workloads test now asserts ID membership, not just count. Doc - WORKLOAD_REFACTOR_TODO: all three Priority 1 items, Priority 3 polish, and Priority 4 tests marked DONE. The workload-first arc is closed. |
||
|
|
739b67856a |
feat(cutover): hard legacy cutover — drop projects/stacks/sites/deploys
Build / build (push) Successful in 10m39s
The clean-break delete that closes the workload-first refactor arc.
Net diff: ~30 backend files deleted, ~20 modified, ~12k LOC removed
on the Go side; entire /projects /stacks /sites /deploy frontend
trees gone; ~6.7k LOC removed on the Svelte/TypeScript side.
Backend
- API handlers gone: internal/api/{projects,stages,stage_env,stacks,
static_sites,deploys,instances,volume_browser}.go
- Store CRUD + tests gone: internal/store/{projects,stages,stage_env,
stacks,static_sites,static_site_secrets,deploys,poll_state,volumes,
workload_sync}.go (+ _test.go siblings)
- Legacy deployer pipeline gone: internal/deployer/{bluegreen,promote,
rollback,subdomain,resolver_test}.go; deployer.go trimmed to just the
dispatch surface used by the plugin pipeline
- internal/staticsite/{manager,healthcheck}.go and
internal/stack/manager.go gone (the rest of those packages stay as
helpers imported by the static + compose plugins)
- internal/registry/poller.go gone (legacy registry poller)
- internal/volume.ResolvePath gone; ResolveWorkloadPath stays
- internal/webhook: handleWebhook (project) + handleSiteWebhook (site)
gone; only POST /api/webhook/triggers/{secret} remains
- workload-side webhook URL handlers (getWorkloadWebhook +
regenerateWorkloadWebhook + EnsureWorkloadWebhookSecret +
SetWorkloadWebhookSecret + GetWorkloadByWebhookSecret) gone — they
minted URLs that would 404 against the new trigger-only ingress
- cmd/server/main.go: dropped staticsite.Manager, stack.Manager,
staticsite.HealthChecker, registry poller, SetSiteSyncTriggerer,
SetStaticSiteManager, SetStackManager, wireStaticBackend
- store/store.go: idempotent DROP TABLE IF EXISTS for every legacy
table (projects, stages, stage_env, volumes, deploys, deploy_logs,
poll_states, stacks, stack_revisions, stack_deploys, static_sites,
static_site_secrets); FK order children-then-parents
- store/models.go: dropped Project, Stage, Deploy, DeployLog, StageEnv,
Volume, StaticSite, StaticSiteSecret, Stack, StackRevision,
StackDeploy types; kept WorkloadKind constants as documented strings
- internal/store/helpers.go (new): BoolToInt, rowScanner,
GenerateWebhookSecret extracted from deleted CRUD files
- internal/api/secrets.go (new): forwards to store.GenerateWebhookSecret
so api + store paths share one secret-generation impl (no
panic-vs-UUID-fallback divergence)
- internal/reconciler/reconciler.go: dropped legacy stack-by-compose
+ static-site label paths; only canonical tinyforge.workload.id
dispatch remains
- providers (gitea_content/github_provider/gitlab_provider) gained
path-traversal rejection on every tree entry
- internal/webhook ParsedImage / ParseImageRef demoted to package-
private (no external callers)
Frontend
- /projects /stacks /sites /deploy routes deleted (entire trees)
- ProjectCard / InstanceCard / StaleContainerCard components deleted
- api.ts: dropped every project/stage/stack/site/deploy/instance
helper + types (Project, Stage, Stack, StaticSite, Deploy,
Instance, Volume, etc.); kept Workload, Container, App, Settings,
Registry, EventTrigger, LogScanRule, webhook envelopes
- WorkloadWebhook type + getWorkloadWebhook/regenerateWorkloadWebhook
api functions gone (mirror of the backend deletion above)
- web/src/routes/+layout.svelte: dropped /projects /sites /stacks
/deploy nav entries, trimmed quick-nav keymap
- web/src/routes/+page.svelte: dashboard rewrite — reads
listWorkloads + listContainers only; 4-card stat grid
(workloads/running/failed/stale) + recent workloads strip
- navCounts.ts, SystemHealthCard.svelte, ContainerLogs.svelte,
ContainerStats.svelte, StatusBadge.svelte, TagCombobox.svelte,
proxies/+page.svelte, containers/+page.svelte all rewired to the
workload-first surface
- AbortController plumbing on dashboard, nav-counts, stale page,
SystemHealthCard so navigation doesn't leave dangling fetches
- i18n: dropped projects.*, projectDetail.*, envEditor.*,
volumeEditor.*, volumeBrowser.*, quickDeploy.*, sites.*, stacks.*,
instance.*, confirm.* namespaces; en/ru parity preserved (1042
keys each)
Hardening from go-reviewer + security-reviewer + typescript-reviewer
subagent passes (0 CRITICAL across all three; 1 HIGH + ~12 MEDIUM
addressed inline before commit):
- Sec H1: dead-end workload webhook URL handlers (would mint URLs
that 404 the new trigger-only ingress) deleted across backend +
frontend
- Go M1: IsTerminalDeployStatus dropped (no production callers)
- Go M2: ParsedImage/ParseImageRef lowercased (in-package only)
- Go M6: generateWebhookSecret unified — api shim forwards to
store.GenerateWebhookSecret
- Doc/comment freshness: stage_id (no longer FK), ProxyRoute legacy
field names, workloadIDRow rationale, webhook_deliveries.target_type
enum, WebhookDeliveryLog component header
Doc
- WORKLOAD_REFACTOR_TODO: cutover marked DONE; all three Priority 1
items are now shipped. Next focus is Priority 3 polish (apps.* i18n
+ codemap entries) and Priority 4 tests.
Behavioral notes for operators upgrading from a pre-cutover build
- Existing rows in the dropped tables disappear on first boot.
- Legacy webhook URLs at /api/webhook/{secret} and
/api/webhook/sites/{secret} return 404; CI configs must repoint to
/api/webhook/triggers/{secret} (the trigger-split boot backfill
lifted any embedded workload secret onto a Trigger row, so the
secret value itself carries over).
- Frontend routes /projects /stacks /sites /deploy are gone; nav
links replaced with /apps and /triggers.
|
||
|
|
234c3c711e |
feat(static): inline static-source plugin; drop phantom-row adapter
Build / build (push) Successful in 10m43s
Lift the static-site deploy pipeline from internal/staticsite/manager.go into internal/workload/plugin/source/static/ so plugin-native static workloads operate directly on plugin.Workload + the containers table + workload_env. The cmd/server/static_backend.go phantom-row adapter is gone; the legacy static_sites table is no longer touched on plugin deploys. Backend - new state.go: runtimeState (last_commit_sha, last_sync_at, last_error, status) persisted in containers.extra_json under the deterministic row id <workloadID>:site - per-workload sync.Mutex serializes saveState read-modify-write so parallel deploys for the same workload can't race container_id / proxy_route_id writes - extra_json round-trips through map[string]json.RawMessage so unknown keys survive — typed runtimeStateKeys are stripped before merge so clearing a typed field actually drops the key - new env.go reads workload_env (replaces static_site_secrets for plugin-native sites); decrypt-failure logs and skips one entry rather than failing the whole deploy - new build.go ports prepareDenoBuild + prepareStaticBuild + copyDir; copyDir uses filepath.WalkDir + Lstat to refuse symlinks and non-regular files - new deploy.go is the ~300-line core; intent.Reason gates force vs skip-if-no-changes; success-path saveState failure rolls back container + proxy route and writes "failed" state (no orphans) - new teardown.go combines Remove + Stop; idempotent on never-deployed workloads - new reconcile.go refreshes container state from Docker; flips runtimeState.Status to failed when the container is missing/crashed Hardening (from go-reviewer + security-reviewer subagent passes; 1 CRITICAL + 5 HIGH + 3 MEDIUM addressed before merge) - path-traversal defense in all 3 providers (gitea_content, github_provider, gitlab_provider): reject tree entries whose resolved local path escapes destDir - verifyDownloadInsideRoot walks the build dir post-download as a second line of defense - sanitizeError redacts the access token, collapses to one line, and clamps to 240 bytes before persisting to extra_json or fanning out to the notification webhook - container/image/volume names suffixed with workload-id short prefix (workload name is not UNIQUE in schema) - primaryDomain reads settings.Domain to complete a bare subdomain face into a full FQDN (matches legacy Manager behavior) - ctx-aware health-check sleep - json.Marshal for event metadata (was fmt.Sprintf JSON template) - strings.HasPrefix for failed-status detection (was brittle slice expression) Wire-up - cmd/server/main.go: removed wireStaticBackend(...) call; existing blank import on _ ".../source/static" drives init() registration - cmd/server/static_backend.go deleted Doc - WORKLOAD_REFACTOR_TODO: static port marked DONE; next focus is the hard legacy cutover (drop /api/projects, /api/stacks, /api/sites, /api/stages + their tables, internal/stack + internal/staticsite packages, frontend /projects /stacks /sites) Behavior notes for operators - plugin-native static workloads no longer write to static_sites; legacy /api/sites/* still serves original rows unchanged - legacy tinyforge.static-site / .static-site-name container labels dropped on plugin deploys; canonical tinyforge.workload.id / .kind cover ownership - container/image/volume names gained an 8-char ID suffix (e.g. dw-site-mysite-a1b2c3d4); legacy-deployed sites keep the old shape until redeployed through the plugin path |
||
|
|
2aff22f565 |
feat(triggers): first-class triggers + bindings with fan-out webhook
Build / build (push) Successful in 10m39s
Promote triggers from embedded workload fields to standalone records
joined to workloads via workload_trigger_bindings. One trigger (webhook,
registry watcher, git push, manual) now fans out to many workloads with
per-binding config overrides (top-level JSON merge, binding wins).
Backend
- new triggers + workload_trigger_bindings tables with ON DELETE CASCADE
- boot-time backfill of embedded trigger config inside per-workload tx
- store.ErrUnique sentinel translates SQLite UNIQUE at store boundary
- /api/triggers CRUD + /api/triggers/{id}/{webhook,bindings}
- /api/bindings/{id} update/delete; /api/workloads/{id}/triggers list+bind
- bindTriggerToWorkload accepts trigger_id or inline {kind,name,config}
- inline-create uses CreateTriggerWithBindingTx (no orphan triggers)
- validateBindingConfig enforces 8 KiB cap + plugin Validate on merged
- ListTriggersWithBindingCount + ListBindings*WithNames remove N+1
- POST /api/webhook/triggers/{secret} resolves trigger then fans out
- bounded worker pool (4) per request; per-binding error isolation
- outcome accounting: deployed / skipped / no-match / errored
- legacy /api/webhook/workloads/{secret} route removed (clean break;
backfill keeps secrets resolvable at the new /triggers/{secret} path)
- reconciler gate dropped from (Source && Trigger) to Source only
- MergeJSONConfig returns freshly allocated slices (no fan-out aliasing)
- WithEffectiveTrigger lets existing Trigger.Match contract stay unchanged
Frontend
- /triggers list, new wizard, [id] detail (bindings, webhook rotate)
- workload create wizard: NEW / PICK / SKIP trigger modes
- workload detail: bindings panel + Add-trigger modal (inline / pick)
- per-binding override editor with merged-preview + 8 KiB guard
- "OVERRIDES n FIELDS" row badge when binding_config is non-empty
- shared TriggerKindForm component (registry / git / manual + JSON)
- 3 raw <input type=checkbox> replaced with <ToggleSwitch>
- full EN + RU i18n: redeployTriggers.*, apps.detail.bindings.*,
apps.new.triggers.*, nav.triggers; event-triggers nav disambiguated
Doc
- WORKLOAD_REFACTOR_TODO: trigger-split marked DONE; next focus is
the static-source inline port + hard legacy cutover (Priority 1)
|
||
|
|
30133bc1eb |
docs: workload refactor + observability progress
Build / build (push) Successful in 10m40s
Two design + handoff docs: - docs/WORKLOAD_REFACTOR_TODO.md — status-at-a-glance table showing what's done (volume scopes, kind-aware editors, vendor webhook parsing, chain-panel CSS, Log Rules panel) and what's still pending (static source inline port + the hard legacy cutover gated on it; codemap entries; /apps page-level i18n; Priority 4 integration tests). - docs/LOGSCAN_AND_TRIGGERS_TODO.md — companion design + status doc for the two Observability features. Records the loop-prevention invariant (event_log = system observing itself, webhook_deliveries = system talking to outside) so the next contributor doesn't accidentally break it by adding a new EventLog subscriber that re-publishes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
cba2149aa9 |
refactor(workload): finalize containers index + post-review hardening
Wraps up the workload refactor with the fixes that came out of the multi-agent code review (see docs/plans/workload-refactor.md "What actually shipped"). Backend: - store.ReconcileContainer: separate write path so the 30s reconciler tick no longer overwrites deployer-owned fields (subdomain, proxy_route_id, npm_proxy_id, image_tag). - Container.stage_id column + index; ListProxyRoutes / ListContainersByStageID join via stage_id (survives stage rename), with legacy fallback to (project_id, role=stage_name). - Reconciler: workload-existence check (rejects forged tinyforge.workload.id labels), skips inventing project-kind rows, child-context cancel before wg.Wait() on shutdown. - Transactional CRUD across projects / stacks / static_sites: parent UPDATE and workload sync land in one transaction so secret rotations are durable. - Webhook routing reads exclusively through workloads.webhook_secret; legacy GetProjectByWebhookSecret / GetStaticSiteByWebhookSecret fallback removed. - store.GetStackByComposeProjectName + indexed lookup (no more full-table stack scan per compose container per tick). - store.ListMissingSweepRows: filtered query for the missing-sweep. - /api/instances/* handlers verify (workload_id, role) match URL (project_id, stage_name) before mutating — closes the cross-project hijack the security review flagged. - extra_json no longer referenced from Go (column kept on disk for now). Frontend: - WorkloadContainers.svelte: generic detail-page panel reusable by stack and site detail pages. - Containers page polish: client-side kind/state filters over an unfiltered fetch, URL-synced filters, race-safe loads via sequence number, EN+RU i18n, sidebar counter via navCounts.containers. Misc: - scripts/dev-server.sh: tolerate empty netstat grep result. - .gitignore: ignore docker-watcher binaries, .claude/worktrees/, .facts-sync.json. |
||
|
|
f54a6ecee3 |
feat(workload): add Workload/Container/App store foundation
Introduces the data layer for the Workload refactor (see docs/plans/workload-refactor.md): three new tables and store methods, no behavior changes elsewhere yet. - workloads: unifying primitive over Project/Stack/StaticSite, paired via UNIQUE(kind, ref_id). Notification + webhook config hosted here so it lives in one place across kinds. - containers: normalized index of every Tinyforge-managed container with first-class subdomain/proxy_route_id/npm_proxy_id columns (heavily queried by ListProxyRoutes / stale detection). - apps: optional grouping of workloads; schema only, no UI in v1. Foundation only — deployer surgery, reconciler, and consumer switchover land in the next commit. |
||
|
|
0405ecd9ce |
feat(notify): HMAC-signed outgoing webhooks with per-tier secrets and test sender
Build / build (push) Successful in 10m36s
Outgoing notifications were bare POSTs with no auth and no way to verify
they came from Tinyforge. They also went out from one global URL only,
even though stages had a notification_url field, and static-site sync
emitted no events at all.
Schema: add notification_url + notification_secret (lazy-generated) to
settings, projects, stages and static_sites. Migrations are additive.
Notifier: SendSigned computes HMAC-SHA256 over the exact body bytes and
sends X-Hub-Signature-256 (GitHub-compatible — receivers built for
GitHub/Gitea/Forgejo verify out of the box). Aux headers
X-Tinyforge-Event/Delivery/Timestamp/Tier are advisory and not signed.
Empty secret => unsigned send for back-compat.
Resolution: deploys fall through stage > project > settings, sites fall
through site > settings. The secret travels with the URL that sourced
it, so any tier can sign even when its parents are unsigned. Site sync
events now actually emit (site_sync_success / site_sync_failure).
API: 12 new endpoints — {GET secret, POST regenerate, POST disable,
POST test} for each of the 4 tiers. SendSyncForTest returns
status_code/latency_ms/signature_sent/delivery_id/response_snippet so
the UI surfaces receiver feedback inline.
UI: shared OutgoingWebhookPanel.svelte fits the existing card aesthetic.
Signing-state pill, secret reveal-on-demand, regenerate/disable behind
ConfirmDialog modals (not inline strips — too easy to misclick), send-
test result card with colour-coded status. Wired into Settings →
Integrations, project edit form, per-stage edit, and per-site detail.
EN + RU i18n.
Tests: round-trip (sender signs, receiver verifies), tampered-body and
wrong-secret rejection, unsigned-send omits header, send-test surfaces
4xx, concurrent fan-out via Drain. Resolver precedence locked for both
deploy and site paths.
Docs: docs/webhooks.md with header reference, verifier snippets in
Node/Python/Go, and a recipe for the service-to-notification-bridge
generic webhook provider.
|
||
|
|
a4362b842d |
fix: harden security, fix concurrency bugs, and address review findings
Build / build (push) Successful in 11m42s
Security: - rate limit /api/webhook routes per-IP and cap concurrent site syncs - global SSE connection cap (256) with new sse_gate - validate ?tail= and cap JSON log responses at 4 MiB - strip ANSI/CSI/OSC and control bytes from streamed log lines - redact webhook secret from request log middleware - scrub host details from /api/health for non-admin viewers - drop container_id from /api/system/stats/top for non-admins - generate webhook secrets via crypto/rand; require >=32 chars on insert - verify iid path consistency in streamContainerLogs - LimitReader on site webhook body; reject malformed non-empty bodies Concurrency / correctness: - stats collector: Stop() no longer hangs without Start(), semaphore acquired in parent loop so ctx cancellation short-circuits the queue, in-flight tick cancellable via shared base context, zero-ts guard - webhook handler: replace fire-and-forget goroutine with WaitGroup-tracked workers + Drain() wired into graceful shutdown - $derived(() => ...) mis-idiom fixed in ContainerStats / InstanceCard / ProjectCard (returned function instead of value) - SystemResourcesCard: rename `window` and `t` locals to avoid shadowing globalThis.window and the i18n `t` import Quality / performance: - replace O(n^2) insertion sort with sort.Slice in stats top - runMigrations only swallows duplicate-column / already-exists errors - PruneStatsSamplesBefore wrapped in a transaction - collapse N+1 in unusedImageStats / pruneImages to one ListAllInstances pass; surface DB errors instead of silently treating them as inactive - run Docker Info + DiskUsage in parallel via errgroup - container log SSE emits `: ping` heartbeat every 20 s - imageMatches case-insensitive on registry host (RFC behaviour) - log warning on invalid stage tag pattern instead of silent skip - reject malformed non-empty site webhook payloads Frontend / i18n: - shared formatBytes utility replaces three local copies - statsInterval store drives dynamic "no samples / collection disabled" copy across ContainerStats and SystemResourcesCard - top consumers row now shows owner_name (project/stage or site name) - drop seven `as any` casts on the Settings type; add cloudflare_api_token write-only field - move "Service status", "Docker daemon", "Docker unreachable", "Proxy unreachable", "reachable", and "Docker daemon is not reachable." strings into en/ru i18n bundles |
||
|
|
670948f113 |
fix: address code review findings for DNS management
- CRITICAL: Change DNS zones endpoint from GET to POST to avoid leaking API token in URL query parameters - HIGH: Add sync.RWMutex to protect dnsProvider field in Server, Deployer, and proxy Manager against concurrent read/write races - HIGH: Capture old DNS provider reference synchronously before launching background cleanup goroutine - HIGH: Use getDNS()/getDNSProviderLocked() accessors instead of direct field reads in all DNS operations |