Commit Graph

58 Commits

Author SHA1 Message Date
alexei.dolgolyov 956943edbb feat(proxies): per-row Triggers deep-link to /apps/[id]#bindings
The proxies page now exposes the trigger bindings for each routed
workload via a per-row action chip. Resolves the explicit "what's
next" call-out in WORKLOAD_REFACTOR_TODO under Priority 3 polish.

- Added id="bindings" to the existing trigger bindings <section> on
  /apps/[id]/+page.svelte so URL fragments resolve to the panel.
- New triggersHref(route) helper in /proxies that builds
  /apps/{workload_id}#bindings; code-pointer comment explains the
  back-compat naming (ProxyRoute.project_id is actually the workload
  ID — see internal/store/models.go:110-113), so a future contributor
  doesn't trip on the mismatch and rip the helper out.
- New right-aligned "Actions" column with a button-shaped link;
  defensive — falls back to — when project_id is absent.
- Three new i18n keys under proxies.* (actions, viewTriggers,
  viewTriggersTitle) mirrored across EN + RU. Key parity now 1512
  each.

No backend change needed; ListProxyRoutes already selects w.id into
ProxyRoute.project_id. Workload-aware batch endpoints (showing
trigger counts inline) were deliberately out of scope for this
half-turn — flagged as a future enhancement only if users want
inline counts.

Verification: svelte-check 0 errors + 3 pre-existing warnings in
TagCombobox; go build + go test ./... all green across 20 packages.
2026-05-16 22:46:51 +03:00
alexei.dolgolyov ea55d31177 feat(discovery+runtime): restore static-site wizard discovery + close /sites/[id] feature parity
Build / build (push) Successful in 10m43s
Two-stage feature arc closing the gaps left by the hard legacy cutover.
The static-site creation wizard regains its auto-discovery + connection-test
flow; /apps/[id] grows the runtime/storage/lifecycle surface the legacy
/sites/[id] page used to expose.

Backend (Go)
- internal/api/discovery.go: six admin-gated endpoints wrapping
  staticsite.GitProvider — POST /api/discovery/git/{detect-provider,
  test-connection,repos,branches,tree} + GET /api/discovery/image/conflicts.
  Identifier validation (validateGitIdent / validateGitBranch) at the
  boundary so provider URL interpolation cannot be hijacked via `..`.
  Upstream errors scrubbed: detailed slog on the server, generic 502 to
  the client (mitigates token-reflection-in-error-page).
- internal/api/workload_runtime.go: four endpoints —
  GET /api/workloads/{id}/runtime-state decodes containers.extra_json for
  static workloads; GET /api/workloads/{id}/storage execs `du -sb /app/data`
  with a 30s in-process cache (storageProbeCache) so polling can't turn
  into per-request execs; POST /api/workloads/{id}/{stop,start} iterate
  ListContainersByWorkload and call docker.StopContainer / StartContainer,
  returning 200 / 409 (nothing to act on) / 502 (all failed).
- internal/staticsite/safehttp.go: NewSafeHTTPClient + ValidateBaseURL +
  blockReason. DialContext re-resolves hostnames and refuses loopback /
  link-local / multicast / unspecified addresses. RFC1918 + ULA explicitly
  allowed (self-hosted Gitea on LAN is the dominant deployment).
  Replaced four raw &http.Client{} constructions in the provider files.
- internal/staticsite/gitlab_provider.go: url.PathEscape each segment in
  the raw-file URL builder for parity with projectPath().
- Test coverage: 26 cases in discovery_test.go (image-tag stripping,
  source-config decoding, conflict scenarios, validator boundaries,
  scheme rejection), 14 in workload_runtime_test.go (404 / 409 / nil-docker
  / probe-cache), 16 in safehttp_test.go (URL validation + block-reason
  policy matrix + live dial against loopback + AWS metadata literals).

Frontend (Svelte 5 + runes)
- web/src/lib/api.ts: typed wrappers for every endpoint, AbortSignal
  threaded through post(); ApiError exported so callers can narrow on
  e.status; new DetectedGitProvider narrow union.
- web/src/routes/apps/new/+page.svelte: static-form discovery controls
  (auto-detect provider, test connection, repo / branch / folder
  EntityPickers, Deno auto-detect); image-form conflict panel with
  debounced lookup + double-click submit guard ("Forge anyway") + Inspect
  button that pre-fills port/healthcheck; English error fallbacks routed
  through apps.new.errors.* (en + ru).
- web/src/routes/apps/[id]/+page.svelte: runtime-state panel + storage
  panel + Stop / Start / Open-site toolbar; universal live-state badge
  in the hero lede for image/compose/static (RUNNING / TRANSITIONING /
  STOPPED / NOT DEPLOYED / MIXED · n/m RUNNING); ContainerStats panel
  per row (auto-collapsing native <details> when N > 2); read-only
  webhook bindings summary card; responsive toolbar overflow with native
  <details> at <640px (z-index 100 above sticky nav).
- web/src/app.css: project-wide .forge-btn-ghost:focus-visible outline.

Hardening from go-reviewer + security-reviewer + typescript-reviewer +
frontend-design UI/UX subagents (0 CRITICAL, all HIGH/BLOCKER addressed
inline, IMPORTANT applied before commit):
- AbortController + per-call sequence tokens on every long-running
  fetch (loadRuntimeState / loadStorage / loadTriggerMeta / inspectImage /
  listImageConflicts) plus onDestroy cleanup so late resolves cannot
  mutate dead component state.
- doStop / doStart snapshot and restore `error` across the finally-block
  reload so a load()-cleared message doesn't hide a real failure.
- triggersById refreshed after inline trigger creation so the webhook
  card doesn't silently exclude the just-created trigger.
- Live-state badge wraps in role=status / aria-live=polite (no redundant
  aria-label).
- Webhook row has a single click target (was two pointing at the same URL).
- Empty webhook section hides entirely.
- Dropped role=menu / role=menuitem from the overflow menu (they would
  promise arrow-key nav we don't wire; native Tab + ESC carry it).

Doc
- docs/CODEMAPS/INDEX.md + new docs/CODEMAPS/discovery-and-runtime.md
  map the endpoint surface, security posture, frontend integration
  patterns, and an "add a new probe" recipe.

Verification
- svelte-check: 0 errors, 3 pre-existing a11y warnings.
- go build + go vet + go test ./...: all green.
- i18n parity: en + ru at 1413 keys each.
- Live smoke against :8090: 404 / 409 / 502 envelopes correct, discovery
  sanity passes, ProbeError surfaces on no-container path.
2026-05-16 21:35:51 +03:00
alexei.dolgolyov 5e78f13e06 refactor(triggers): review followups — fire-now, dedupe trigger pages, hardening
Build / build (push) Failing after 34s
Follow-ups on commit 39e1e36 addressing review feedback from
go-reviewer / security-reviewer / typescript-reviewer.

Backend:
- New POST /api/triggers/{id}/fire (AdminOnly, schedule-only): operator
  "Fire now" button — dispatches immediately without waiting for the
  next natural interval. Persists last_fired_at BEFORE dispatch, same
  ordering as the scheduler. Per-trigger in-flight guard (429 if a
  fire is already running) to defend against rapid double-clicks /
  runaway scripts. Refuses request when AdminOnly claims are absent
  rather than logging an unattributable deploy.
- SetTriggerLastFired now validates timestamp parses as RFC3339 before
  writing. Rejects empty string explicitly — empty-clears semantics
  were dead (no caller) and would silently re-fire on next tick if
  ever accidentally written. A future reset-cadence flow must add a
  dedicated ClearTriggerLastFired so the call site is grep-able and
  separately auditable.
- Scheduler logs WARN on catch-up fires (now - lastFired > 2× interval)
  so the "surprise burst at restart" pattern shows up in audit logs.
- BindingResult reason strings extracted to package consts
  (webhook.Reason*) so the scheduler and api fire-now classifications
  stay in sync without string-matching drift.
- SECURITY NOTE on FanOutForTrigger documents that the
  WebhookRequireSignature gate is ingress-only by design.

Frontend:
- Refactored /triggers/new (770 LOC → 155 LOC) and /triggers/[id]
  (~350 LOC dropped) to use the shared TriggerKindForm. Eliminates the
  triplicated per-kind state + buildConfig + canSubmit + template that
  caused the d-unit regex drift in the prior commit.
- New seedTriggerKindFormState helper on TriggerKindForm primes the
  form from a server-returned trigger config with defensive type
  guards; resets per-kind slots first so re-seeding across kinds
  doesn't inherit stale state.
- /triggers/[id] gains a Schedule status panel with Last Fired + Fire
  Now button (gated on binding_count > 0). Confirmation dialog,
  result flash, timer cleanup on unmount + new-fire (no stale-closure
  race). EN+RU i18n parity.
2026-05-16 12:16:47 +03:00
alexei.dolgolyov 39e1e36510 feat(triggers): add schedule trigger kind + internal scheduler
Build / build (push) Successful in 10m42s
Fourth trigger kind alongside registry/git/manual. Recurring time-interval
fires driven by a new internal/scheduler tick loop (default 30s, clamped
to 5m). Goes through the same webhook.Handler.FanOutForTrigger seam as
inbound HTTP webhooks, so per-binding concurrency, outcome accounting,
and config-merge semantics are identical.

Schema: triggers.last_fired_at TEXT column (additive ALTER for existing
DBs). Scheduler persists last_fired_at BEFORE dispatch so a panicking
Match cannot wedge a tight loop; failed deploys wait one full interval
before retry — correct trade-off for a periodic refresh trigger.

Frontend: TriggerKindForm + /triggers/new + /triggers/[id] gain the
schedule kind (4-col card grid, preset chips Hourly/Daily/Weekly,
custom interval input matched to Go time.ParseDuration syntax, optional
pinned reference). /triggers/[id] surfaces "last fired" on schedule rows.
EN+RU i18n in parity.

Review fixes from go-reviewer / security-reviewer / typescript-reviewer:
- Scheduler Start/Stop wrapped in sync.Once (no goroutine leak / double-
  cancel panic on shutdown re-entry).
- shouldFire rejects sub-MinInterval as defense-in-depth against
  hand-inserted rows that bypassed Validate.
- fire() asserts trigger Kind=="schedule" before dispatching.
- Aligned isValidInterval regex across all three frontend sites; reject
  the unsupported "d" unit (Go time.ParseDuration doesn't accept it).
- formatLastFired falls back to lastFiredNever on malformed timestamps
  rather than leaking raw bytes into the UI.
- main.go scheduler closure logs per-fire deployed/errored counts.
2026-05-16 11:24:05 +03:00
alexei.dolgolyov e3c7b13d58 chore(workload): close the workload-first arc — apps i18n + codemap + tests
Build / build (push) Successful in 10m36s
Closes the workload-first refactor by landing the Priority 3 polish
items and the Priority 4 test gap. Net: ~2,400 lines added,
~350 lines modified across 13 files.

Priority 3 — polish
- apps.* i18n namespace: 276 new keys across apps.list.* (27),
  apps.new.* (91, sibling of existing apps.new.triggers.*), and
  apps.detail.* (158, sibling of existing apps.detail.bindings.*).
  EN+RU at 1314 keys each, perfectly in sync. /apps, /apps/new,
  /apps/[id] now render entirely from i18n.
- New codemap docs/CODEMAPS/workload-plugin.md (238 lines):
  Source × Trigger contract, dispatch seam, webhook fan-out path,
  recipes for adding a new Source or Trigger kind. Plus
  docs/CODEMAPS/INDEX.md gateway.

Priority 4 — tests
- internal/api/workloads_test.go (new, ~30 subtests): /api/workloads
  CRUD + deploy + delete + env + volumes + chain + promote-from +
  triggers list/inline-bind + auth gating + standalone /api/triggers
  CRUD (create / dup-409 / kind filter / delete). Uses real
  POST handlers via httptest.NewServer + a fake plugin source
  registered under "testfakesource".
- internal/deployer/dispatch_test.go (new, 11 tests):
  DispatchPlugin / DispatchTeardown / DispatchReconcile happy +
  unknown-kind + propagated-error each; PluginDeps wiring; a real
  2s-bounded RWMutex deadlock probe on PluginDeps vs SetDNSProvider.
- internal/workload/plugin/source/compose/compose_test.go (new,
  ~26 subtests): composeProjectName sanitization,
  writeYAML / writeYAMLIfChanged hash short-circuit, Validate happy
  + bad inputs, Kind / SchemaSample.

Coverage delta on the workload-plugin path:
- internal/api: 1.1% → 16.0%
- internal/deployer: 0% → 54.1%
- internal/workload/plugin/source/compose: 0% → 38.5%
- Trigger plugins already at 87-95% from the trigger-split work.

Production fix surfaced by the tests
- store.CreateWorkload now self-references RefID = ID when caller
  leaves RefID empty (the typical plugin-native path). The api
  layer's broken backfill loop (called UpdateWorkload, which
  deliberately omits ref_id) is gone. Multiple sibling plugin
  workloads can now coexist under the UNIQUE(kind, ref_id) constraint.

Review fixes addressed before commit
- CRITICAL: deadlock-detect test gained a real 2s time.After (was
  selecting on context.Background().Done() which never fires).
- HIGH: happy-path test now hard-asserts RefID = ID (was a t.Logf
  that would silently pass after a production fix).
- HIGH: standalone /api/triggers CRUD coverage added (was bypassed
  by the workload-side bind flow).
- HIGH: seedWorkload bypass deleted; tests now go through the
  real POST /api/workloads handler.
- MEDIUM: withTempDir restore is a no-op (t.Setenv auto-restores);
  dead `old := os.Getenv(...)` capture removed.
- MEDIUM: list-workloads test now asserts ID membership, not just
  count.

Doc
- WORKLOAD_REFACTOR_TODO: all three Priority 1 items, Priority 3
  polish, and Priority 4 tests marked DONE. The workload-first arc
  is closed.
2026-05-16 06:42:43 +03:00
alexei.dolgolyov 739b67856a feat(cutover): hard legacy cutover — drop projects/stacks/sites/deploys
Build / build (push) Successful in 10m39s
The clean-break delete that closes the workload-first refactor arc.
Net diff: ~30 backend files deleted, ~20 modified, ~12k LOC removed
on the Go side; entire /projects /stacks /sites /deploy frontend
trees gone; ~6.7k LOC removed on the Svelte/TypeScript side.

Backend
- API handlers gone: internal/api/{projects,stages,stage_env,stacks,
  static_sites,deploys,instances,volume_browser}.go
- Store CRUD + tests gone: internal/store/{projects,stages,stage_env,
  stacks,static_sites,static_site_secrets,deploys,poll_state,volumes,
  workload_sync}.go (+ _test.go siblings)
- Legacy deployer pipeline gone: internal/deployer/{bluegreen,promote,
  rollback,subdomain,resolver_test}.go; deployer.go trimmed to just the
  dispatch surface used by the plugin pipeline
- internal/staticsite/{manager,healthcheck}.go and
  internal/stack/manager.go gone (the rest of those packages stay as
  helpers imported by the static + compose plugins)
- internal/registry/poller.go gone (legacy registry poller)
- internal/volume.ResolvePath gone; ResolveWorkloadPath stays
- internal/webhook: handleWebhook (project) + handleSiteWebhook (site)
  gone; only POST /api/webhook/triggers/{secret} remains
- workload-side webhook URL handlers (getWorkloadWebhook +
  regenerateWorkloadWebhook + EnsureWorkloadWebhookSecret +
  SetWorkloadWebhookSecret + GetWorkloadByWebhookSecret) gone — they
  minted URLs that would 404 against the new trigger-only ingress
- cmd/server/main.go: dropped staticsite.Manager, stack.Manager,
  staticsite.HealthChecker, registry poller, SetSiteSyncTriggerer,
  SetStaticSiteManager, SetStackManager, wireStaticBackend
- store/store.go: idempotent DROP TABLE IF EXISTS for every legacy
  table (projects, stages, stage_env, volumes, deploys, deploy_logs,
  poll_states, stacks, stack_revisions, stack_deploys, static_sites,
  static_site_secrets); FK order children-then-parents
- store/models.go: dropped Project, Stage, Deploy, DeployLog, StageEnv,
  Volume, StaticSite, StaticSiteSecret, Stack, StackRevision,
  StackDeploy types; kept WorkloadKind constants as documented strings
- internal/store/helpers.go (new): BoolToInt, rowScanner,
  GenerateWebhookSecret extracted from deleted CRUD files
- internal/api/secrets.go (new): forwards to store.GenerateWebhookSecret
  so api + store paths share one secret-generation impl (no
  panic-vs-UUID-fallback divergence)
- internal/reconciler/reconciler.go: dropped legacy stack-by-compose
  + static-site label paths; only canonical tinyforge.workload.id
  dispatch remains
- providers (gitea_content/github_provider/gitlab_provider) gained
  path-traversal rejection on every tree entry
- internal/webhook ParsedImage / ParseImageRef demoted to package-
  private (no external callers)

Frontend
- /projects /stacks /sites /deploy routes deleted (entire trees)
- ProjectCard / InstanceCard / StaleContainerCard components deleted
- api.ts: dropped every project/stage/stack/site/deploy/instance
  helper + types (Project, Stage, Stack, StaticSite, Deploy,
  Instance, Volume, etc.); kept Workload, Container, App, Settings,
  Registry, EventTrigger, LogScanRule, webhook envelopes
- WorkloadWebhook type + getWorkloadWebhook/regenerateWorkloadWebhook
  api functions gone (mirror of the backend deletion above)
- web/src/routes/+layout.svelte: dropped /projects /sites /stacks
  /deploy nav entries, trimmed quick-nav keymap
- web/src/routes/+page.svelte: dashboard rewrite — reads
  listWorkloads + listContainers only; 4-card stat grid
  (workloads/running/failed/stale) + recent workloads strip
- navCounts.ts, SystemHealthCard.svelte, ContainerLogs.svelte,
  ContainerStats.svelte, StatusBadge.svelte, TagCombobox.svelte,
  proxies/+page.svelte, containers/+page.svelte all rewired to the
  workload-first surface
- AbortController plumbing on dashboard, nav-counts, stale page,
  SystemHealthCard so navigation doesn't leave dangling fetches
- i18n: dropped projects.*, projectDetail.*, envEditor.*,
  volumeEditor.*, volumeBrowser.*, quickDeploy.*, sites.*, stacks.*,
  instance.*, confirm.* namespaces; en/ru parity preserved (1042
  keys each)

Hardening from go-reviewer + security-reviewer + typescript-reviewer
subagent passes (0 CRITICAL across all three; 1 HIGH + ~12 MEDIUM
addressed inline before commit):

- Sec H1: dead-end workload webhook URL handlers (would mint URLs
  that 404 the new trigger-only ingress) deleted across backend +
  frontend
- Go M1: IsTerminalDeployStatus dropped (no production callers)
- Go M2: ParsedImage/ParseImageRef lowercased (in-package only)
- Go M6: generateWebhookSecret unified — api shim forwards to
  store.GenerateWebhookSecret
- Doc/comment freshness: stage_id (no longer FK), ProxyRoute legacy
  field names, workloadIDRow rationale, webhook_deliveries.target_type
  enum, WebhookDeliveryLog component header

Doc
- WORKLOAD_REFACTOR_TODO: cutover marked DONE; all three Priority 1
  items are now shipped. Next focus is Priority 3 polish (apps.* i18n
  + codemap entries) and Priority 4 tests.

Behavioral notes for operators upgrading from a pre-cutover build
- Existing rows in the dropped tables disappear on first boot.
- Legacy webhook URLs at /api/webhook/{secret} and
  /api/webhook/sites/{secret} return 404; CI configs must repoint to
  /api/webhook/triggers/{secret} (the trigger-split boot backfill
  lifted any embedded workload secret onto a Trigger row, so the
  secret value itself carries over).
- Frontend routes /projects /stacks /sites /deploy are gone; nav
  links replaced with /apps and /triggers.
2026-05-16 06:00:21 +03:00
alexei.dolgolyov 2aff22f565 feat(triggers): first-class triggers + bindings with fan-out webhook
Build / build (push) Successful in 10m39s
Promote triggers from embedded workload fields to standalone records
joined to workloads via workload_trigger_bindings. One trigger (webhook,
registry watcher, git push, manual) now fans out to many workloads with
per-binding config overrides (top-level JSON merge, binding wins).

Backend
- new triggers + workload_trigger_bindings tables with ON DELETE CASCADE
- boot-time backfill of embedded trigger config inside per-workload tx
- store.ErrUnique sentinel translates SQLite UNIQUE at store boundary
- /api/triggers CRUD + /api/triggers/{id}/{webhook,bindings}
- /api/bindings/{id} update/delete; /api/workloads/{id}/triggers list+bind
- bindTriggerToWorkload accepts trigger_id or inline {kind,name,config}
- inline-create uses CreateTriggerWithBindingTx (no orphan triggers)
- validateBindingConfig enforces 8 KiB cap + plugin Validate on merged
- ListTriggersWithBindingCount + ListBindings*WithNames remove N+1
- POST /api/webhook/triggers/{secret} resolves trigger then fans out
- bounded worker pool (4) per request; per-binding error isolation
- outcome accounting: deployed / skipped / no-match / errored
- legacy /api/webhook/workloads/{secret} route removed (clean break;
  backfill keeps secrets resolvable at the new /triggers/{secret} path)
- reconciler gate dropped from (Source && Trigger) to Source only
- MergeJSONConfig returns freshly allocated slices (no fan-out aliasing)
- WithEffectiveTrigger lets existing Trigger.Match contract stay unchanged

Frontend
- /triggers list, new wizard, [id] detail (bindings, webhook rotate)
- workload create wizard: NEW / PICK / SKIP trigger modes
- workload detail: bindings panel + Add-trigger modal (inline / pick)
- per-binding override editor with merged-preview + 8 KiB guard
- "OVERRIDES n FIELDS" row badge when binding_config is non-empty
- shared TriggerKindForm component (registry / git / manual + JSON)
- 3 raw <input type=checkbox> replaced with <ToggleSwitch>
- full EN + RU i18n: redeployTriggers.*, apps.detail.bindings.*,
  apps.new.triggers.*, nav.triggers; event-triggers nav disambiguated

Doc
- WORKLOAD_REFACTOR_TODO: trigger-split marked DONE; next focus is
  the static-source inline port + hard legacy cutover (Priority 1)
2026-05-16 02:24:31 +03:00
alexei.dolgolyov 4707db1c3b feat(observability): event-triggers + log-scan-rules UI + i18n
Operator-facing surfaces for the two backend features:

- /event-triggers — list (filter summary, status pill),
  /event-triggers/new (form with regex validation), and
  /event-triggers/[id] (edit + Send-test + delete) with
  CONFIGURED secret badge + clear-to-rotate flow, ConfirmDialog
  for delete, aria-live regions on async result slots.
- /log-scan-rules — list with scope filter chips and stats panel
  (active tails, RATE-LIMITED, COOLED DOWN, COMPILE ERRORS),
  /log-scan-rules/new (with EntityPicker for workload scope and
  inline RegexTestBox), /log-scan-rules/[id] (edit + server-side
  /test + delete + live RegexTestBox panel).
- web/src/lib/components/RegexTestBox.svelte — reusable
  client-side regex test with sample input + captures display.
- web/src/lib/api.ts — typed wrappers for EventTrigger and
  LogScanRule CRUD + /test + getLogScanStats +
  getEffectiveLogScanRules.
- web/src/routes/+layout.svelte — nav entries for both surfaces.
- web/src/lib/i18n/{en,ru}.json — ~90 keys under observability.*,
  triggers.*, logscan.* namespaces; Russian translations cover
  every key.

Design + a11y polish per a frontend-design review pass: all
boolean inputs use ToggleSwitch, all destructive actions use
ConfirmDialog with confirmVariant="danger" / onconfirm /
oncancel, hand-rolled .btn-primary replaced with global
forge-btn classes, hex colors replaced with var(--*) tokens,
role="alert" on error banners, aria-invalid + aria-describedby
on invalid-regex inputs, aria-busy on async forms, mobile
breakpoints (hide-md columns, .row.three collapsing 3→2→1,
.table-wrap overflow-x).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 22:18:29 +03:00
alexei.dolgolyov cba2149aa9 refactor(workload): finalize containers index + post-review hardening
Wraps up the workload refactor with the fixes that came out of the multi-agent
code review (see docs/plans/workload-refactor.md "What actually shipped").

Backend:
- store.ReconcileContainer: separate write path so the 30s reconciler tick no
  longer overwrites deployer-owned fields (subdomain, proxy_route_id,
  npm_proxy_id, image_tag).
- Container.stage_id column + index; ListProxyRoutes / ListContainersByStageID
  join via stage_id (survives stage rename), with legacy fallback to
  (project_id, role=stage_name).
- Reconciler: workload-existence check (rejects forged tinyforge.workload.id
  labels), skips inventing project-kind rows, child-context cancel before
  wg.Wait() on shutdown.
- Transactional CRUD across projects / stacks / static_sites: parent UPDATE
  and workload sync land in one transaction so secret rotations are durable.
- Webhook routing reads exclusively through workloads.webhook_secret; legacy
  GetProjectByWebhookSecret / GetStaticSiteByWebhookSecret fallback removed.
- store.GetStackByComposeProjectName + indexed lookup (no more full-table
  stack scan per compose container per tick).
- store.ListMissingSweepRows: filtered query for the missing-sweep.
- /api/instances/* handlers verify (workload_id, role) match URL
  (project_id, stage_name) before mutating — closes the cross-project
  hijack the security review flagged.
- extra_json no longer referenced from Go (column kept on disk for now).

Frontend:
- WorkloadContainers.svelte: generic detail-page panel reusable by stack and
  site detail pages.
- Containers page polish: client-side kind/state filters over an unfiltered
  fetch, URL-synced filters, race-safe loads via sequence number, EN+RU i18n,
  sidebar counter via navCounts.containers.

Misc:
- scripts/dev-server.sh: tolerate empty netstat grep result.
- .gitignore: ignore docker-watcher binaries, .claude/worktrees/, .facts-sync.json.
2026-05-09 15:44:41 +03:00
alexei.dolgolyov 3e28588f10 feat(workload): global Containers tab + frontend client
Adds the user-visible piece of the Workload refactor:

- web/src/lib/types.ts          — Workload, Container, ContainerView,
                                  App, WorkloadKind, ContainerState
- web/src/lib/api.ts            — listWorkloads, getWorkload,
                                  listWorkloadContainers, setWorkloadAppID,
                                  listContainers (with filter),
                                  CRUD for apps
- web/src/lib/i18n/{en,ru}.json — nav.containers
- web/src/routes/+layout.svelte — "Containers" nav item between Stacks
                                  and Deploy, IconContainer
- web/src/routes/containers/+page.svelte — global Containers table:
    * filter chips for kind (project/stack/site) and state
    * client-side search across workload name / role / image /
      subdomain / container ID prefix
    * Workload column links to the kind-specific detail page,
      resolved through a one-time /api/workloads call to map
      workload_id → ref_id
    * existing /containers/stale route untouched

The page renders against the live database now — boot backfill
populated workload rows from existing projects/stacks/sites,
the deployer dual-writes containers on every deploy, and the
30s reconciler keeps the index in sync with `docker ps`.
2026-05-09 14:02:20 +03:00
alexei.dolgolyov 0f60a7a5db feat(webhook): inbound delivery audit log
Build / build (push) Successful in 10m35s
Persists every inbound webhook hit (project + site) so users can debug
"why didn't my deploy fire?" without grepping daemon logs. Surfaces a
14-day rolling history under the WebhookPanel on each project + site
detail page; refreshes every 30s while open. Daily cron prunes records
older than 14 days alongside the existing event log prune.

Schema:
- webhook_deliveries(id, target_type, target_id, target_name, received_at,
  source_ip, signature_state, status_code, outcome, detail, body_size)
- indexes on (target_type,target_id,received_at) and (received_at)

Backend:
- store: WebhookDelivery model + Insert/List/Prune helpers
- webhook/handler: deferred recordDelivery() captures the final outcome
  on every return path including HMAC rejects, image mismatch, no-stage,
  auto_deploy=false, and successful deploys; signatureStateFor()
  classifies "unconfigured" vs "missing" vs "invalid" vs "valid"
- api: GET /api/{projects,sites}/{id}/webhook/deliveries with
  parseLimit() helper (default 50, max 200)
- main: daily prune cron retains the last 14 days

Frontend:
- WebhookDeliveryLog.svelte: panel with refresh button, status code +
  outcome + signature badges, relative time tooltip-on-hover for
  absolute time, source IP column
- Mounted below WebhookPanel on project + site detail pages
- en/ru i18n strings for outcome/signature enums and column labels
2026-05-07 02:40:39 +03:00
alexei.dolgolyov 831b5c1a43 feat(webhook): HMAC-SHA256 signature verification on inbound webhooks
Adds an opt-in inbound HMAC scheme so a leaked URL alone is not enough
to forge deploy/sync requests — the caller must also know a separate
signing secret. Header format is X-Hub-Signature-256, matching the
Gitea/GitHub/GitLab convention so existing CI integrations work without
custom code.

Behaviour:
- per-project / per-site signing_secret is independent of the URL secret
- require_signature flag does a hard 401 on missing/invalid signatures
- even when require_signature is off, an *invalid* submitted signature
  returns 401 — surfaces CI misconfiguration instead of silently passing
- comparison uses subtle/hmac.Equal (constant time)

Backend:
- store: webhook_signing_secret + webhook_require_signature columns on
  projects + static_sites; scanProject helper, scan helpers updated; new
  Set* helpers for both fields
- webhook/handler: verifyHMAC helper, body read once, integrated into
  both project and site handlers
- api: per-entity signing-secret rotate / disable / require-toggle
  endpoints under /api/{projects,sites}/{id}/webhook/...

Frontend:
- WebhookPanel gains optional signing handlers (no breaking change for
  existing callers; signing UI hides when handlers aren't wired)
- one-shot reveal of the issued secret with copy + dismiss
- ToggleSwitch for require-signature, disabled until a secret is issued
- en/ru i18n strings

Tests:
- HMACRequiredAndValid (200 + deploy fires)
- HMACRequiredButMissing (401, no deploy)
- HMACPresentButWrong (401 even when require_signature=false)
- HMACOptionalUnsignedAccepted (200 when neither configured)
2026-05-07 02:34:40 +03:00
alexei.dolgolyov 8b886ddf2b feat(backup): take Tinyforge DB snapshot before every deploy
Adds an opt-in "auto_backup_before_deploy" setting that triggers a
"pre-deploy" backup at the start of every project deploy via the deploy
pipeline (covers both the async HTTP path and the sync poller/webhook
path). Failures are logged to the deploy log but do not abort — missing
a backup is preferable to refusing to ship a fix.

- store: settings.auto_backup_before_deploy column + scan/update wiring
- backup: accept "pre-deploy" as a valid backup_type
- deployer: small PreDeployBackuper interface, hooked into runDeploy
  right after settings load and before any state-mutating work
- api: settings request/response surface the new flag
- web: ToggleSwitch on the backup settings page; "Pre-deploy" badge
  variant in the backup list (badge-warning so it stands out)
- i18n: en/ru strings for the toggle, help text, and badge label
2026-05-07 02:14:26 +03:00
alexei.dolgolyov 0405ecd9ce feat(notify): HMAC-signed outgoing webhooks with per-tier secrets and test sender
Build / build (push) Successful in 10m36s
Outgoing notifications were bare POSTs with no auth and no way to verify
they came from Tinyforge. They also went out from one global URL only,
even though stages had a notification_url field, and static-site sync
emitted no events at all.

Schema: add notification_url + notification_secret (lazy-generated) to
settings, projects, stages and static_sites. Migrations are additive.

Notifier: SendSigned computes HMAC-SHA256 over the exact body bytes and
sends X-Hub-Signature-256 (GitHub-compatible — receivers built for
GitHub/Gitea/Forgejo verify out of the box). Aux headers
X-Tinyforge-Event/Delivery/Timestamp/Tier are advisory and not signed.
Empty secret => unsigned send for back-compat.

Resolution: deploys fall through stage > project > settings, sites fall
through site > settings. The secret travels with the URL that sourced
it, so any tier can sign even when its parents are unsigned. Site sync
events now actually emit (site_sync_success / site_sync_failure).

API: 12 new endpoints — {GET secret, POST regenerate, POST disable,
POST test} for each of the 4 tiers. SendSyncForTest returns
status_code/latency_ms/signature_sent/delivery_id/response_snippet so
the UI surfaces receiver feedback inline.

UI: shared OutgoingWebhookPanel.svelte fits the existing card aesthetic.
Signing-state pill, secret reveal-on-demand, regenerate/disable behind
ConfirmDialog modals (not inline strips — too easy to misclick), send-
test result card with colour-coded status. Wired into Settings →
Integrations, project edit form, per-stage edit, and per-site detail.
EN + RU i18n.

Tests: round-trip (sender signs, receiver verifies), tampered-body and
wrong-secret rejection, unsigned-send omits header, send-test surfaces
4xx, concurrent fan-out via Drain. Resolver precedence locked for both
deploy and site paths.

Docs: docs/webhooks.md with header reference, verifier snippets in
Node/Python/Go, and a recipe for the service-to-notification-bridge
generic webhook provider.
2026-05-07 02:03:32 +03:00
alexei.dolgolyov a4362b842d fix: harden security, fix concurrency bugs, and address review findings
Build / build (push) Successful in 11m42s
Security:
- rate limit /api/webhook routes per-IP and cap concurrent site syncs
- global SSE connection cap (256) with new sse_gate
- validate ?tail= and cap JSON log responses at 4 MiB
- strip ANSI/CSI/OSC and control bytes from streamed log lines
- redact webhook secret from request log middleware
- scrub host details from /api/health for non-admin viewers
- drop container_id from /api/system/stats/top for non-admins
- generate webhook secrets via crypto/rand; require >=32 chars on insert
- verify iid path consistency in streamContainerLogs
- LimitReader on site webhook body; reject malformed non-empty bodies

Concurrency / correctness:
- stats collector: Stop() no longer hangs without Start(), semaphore
  acquired in parent loop so ctx cancellation short-circuits the queue,
  in-flight tick cancellable via shared base context, zero-ts guard
- webhook handler: replace fire-and-forget goroutine with WaitGroup-tracked
  workers + Drain() wired into graceful shutdown
- $derived(() => ...) mis-idiom fixed in ContainerStats / InstanceCard /
  ProjectCard (returned function instead of value)
- SystemResourcesCard: rename `window` and `t` locals to avoid shadowing
  globalThis.window and the i18n `t` import

Quality / performance:
- replace O(n^2) insertion sort with sort.Slice in stats top
- runMigrations only swallows duplicate-column / already-exists errors
- PruneStatsSamplesBefore wrapped in a transaction
- collapse N+1 in unusedImageStats / pruneImages to one ListAllInstances
  pass; surface DB errors instead of silently treating them as inactive
- run Docker Info + DiskUsage in parallel via errgroup
- container log SSE emits `: ping` heartbeat every 20 s
- imageMatches case-insensitive on registry host (RFC behaviour)
- log warning on invalid stage tag pattern instead of silent skip
- reject malformed non-empty site webhook payloads

Frontend / i18n:
- shared formatBytes utility replaces three local copies
- statsInterval store drives dynamic "no samples / collection disabled"
  copy across ContainerStats and SystemResourcesCard
- top consumers row now shows owner_name (project/stage or site name)
- drop seven `as any` casts on the Settings type; add cloudflare_api_token
  write-only field
- move "Service status", "Docker daemon", "Docker unreachable",
  "Proxy unreachable", "reachable", and "Docker daemon is not reachable."
  strings into en/ru i18n bundles
2026-05-07 00:56:14 +03:00
alexei.dolgolyov 05440a5f92 feat(stats): resource metrics dashboard + sites logs/stats
Build / build (push) Successful in 10m50s
Background collector samples CPU/memory/network/block I/O for every
instance and site on a configurable interval (default 15s, range
5-300s), persists samples to SQLite with a configurable retention
window (default 2h, range 0-24h), and skips ticks gracefully when
the Docker daemon is unreachable. Settings are reloadable without
a restart — each tick re-reads them.

New API endpoints:
- GET /api/system/stats (host snapshot: info + df)
- GET /api/system/stats/history
- GET /api/system/stats/top?by=cpu|memory
- GET /api/projects/{id}/stages/{s}/instances/{iid}/stats/history
- GET /api/sites/{id}/stats[/history]
- GET /api/sites/{id}/logs (SSE + JSON, reuses instance log streamer)

Frontend:
- ECharts added with tree-shaken imports (~180KB gzip) for
  future-proof time-series/gantt/graph visualizations
- CollapsibleSection wraps all dashboard sections (system health,
  daemons, system resources, static sites, projects) with
  localStorage-persisted open state
- SystemResourcesCard shows capacity tiles, workload utilization
  chart with 30m/2h/6h/24h window picker, disk breakdown with
  reclaimable callouts, and top 5 consumers
- ContainerStats and ContainerLogs take a source discriminated union
  so sites reuse the same components as instances; sites detail page
  embeds both for Deno backend debugging
- Settings › Maintenance exposes collection interval + retention
- Docker-unavailable state returns 503 and renders an amber banner
  instead of a generic 500

Full i18n coverage (en + ru) for all new strings.
2026-04-24 15:02:43 +03:00
alexei.dolgolyov 0632f512e6 feat(webhook): per-project and per-site webhook URLs
Build / build (push) Successful in 10m25s
Replace the single global webhook secret with entity-scoped secrets stored
on each project and static site. Webhook-driven project autocreate is
removed — projects must exist before their URL can trigger deploys.

Also wires static-site webhooks (sync_trigger=push|tag), turning the
previously inert "push" trigger into a functional one: POST the site's
webhook URL from a Git provider and Tinyforge re-syncs on matching refs.

- Adds webhook_secret columns + unique indexes to projects and static_sites
- Per-entity GET/regenerate endpoints under /api/projects/{id}/webhook
  and /api/sites/{id}/webhook (admin-only)
- Removes /api/settings/webhook-url and the global webhook panel
- Reusable WebhookPanel Svelte component on both detail pages, i18n in en/ru
- Tests for matcher (siteRefMatches, ParseImageRef) and handler (project
  match/mismatch/404 and site push/manual/branch-skip)
2026-04-23 15:18:19 +03:00
alexei.dolgolyov e08acf5c0e refactor(settings): split General into focused pages
Build / build (push) Successful in 10m38s
General was a 547-line catch-all mixing seven concerns, destructive
actions (image prune) inches away from form fields, and Cloudflare DNS
buried under four unrelated cards. A single "Save" committed everything
at once — one invalid field blocked valid edits elsewhere.

Splits:
- /settings                Overview: timezone, core infra, proxy choice
- /settings/integrations   outgoing notification URL + incoming webhook
- /settings/dns            wildcard + Cloudflare provider
- /settings/maintenance    stale threshold, prune threshold, prune action
                           (in a dedicated "Danger zone" card)
- /settings/credentials    removed (was an 18-line redirect stub)

Sidebar is grouped (Overview / Routing / System / Security) with
section headers; NPM & Traefik items remain conditional on the
proxy-provider choice. Each page loads settings and PUTs only its own
subset, so mistakes on one page can't block edits on another.

No backend changes — the API already accepts Partial<Settings>.
2026-04-23 14:53:48 +03:00
alexei.dolgolyov 90e6e59d9e feat: daemon health panel, brand-rail status chips, user timezone selector
Build / build (push) Successful in 10m35s
- Health API now surfaces Docker /info + /version (version, platform,
  kernel, container/image counts, storage driver, memory, latency) and
  NPM aggregates (proxy host total, managed-by-Tinyforge count, access
  lists, certificates, endpoint URL).
- Docker/NPM indicators moved out of the sidebar footer and into a
  compact mono-styled rail directly under the Tinyforge brand title,
  with pulse/fault animations and click-to-expand error hints.
- New SystemDaemonsCard on the dashboard: two terminal-styled panels
  (Docker Engine + Proxy) with a running/paused/stopped stacked bar,
  key-value diagnostics, and a total-vs-managed proportion meter on
  the proxy-hosts tile.
- Shared health store so the sidebar and dashboard share a single
  30 s poll instead of duplicating traffic.
- User-facing timezone preference with auto-detect fallback; all
  dates across projects, sites, stacks, settings, backup, event log
  and stale containers now render through \$fmt.date / \$fmt.datetime.
- en/ru translations for both features.
2026-04-23 14:32:30 +03:00
alexei.dolgolyov a182a93950 feat: nav counter badges, login backdrop, events i18n + misc fixes
Build / build (push) Successful in 10m29s
Nav & UI polish
- Sidebar nav items show monospace count badges (projects, sites, stacks,
  proxies). Events badge shows error count only, styled red as actionable
- New $lib/stores/navCounts.ts polls all counts in parallel every 60s and
  refreshes on route change so badges track mutations
- Login page gets a dynamic forge backdrop: rotating conic glow, drifting
  embers, dot-grid texture, vignette — all pure CSS, reduced-motion safe
- main element gets scrollbar-gutter: stable so Settings tab switching no
  longer shifts horizontally when content heights differ

Events i18n
- events.source.* dictionary rewritten to match actually-emitted backend
  sources (deploy, static_site, stale_scanner, stale_cleanup, admin);
  dead keys (container, proxy, system) removed
- EventLogFilter.allSources + /events default sources state updated to match
- Localize "{N} total" via events.totalCount in the page hero toolbar

Backend
- Stage API accepts enable_proxy on create/update (defaults to true) so
  proxy registration can be opted out per stage

Concurrency
- api.ts: queued request waiters no longer double-increment the inflight
  counter; releasing a slot hands it off directly

Reactive effects
- project detail / env / volumes pages wrap side-effect calls in untrack()
  to prevent $effect feedback loops when their loaders mutate tracked state
2026-04-22 18:30:34 +03:00
alexei.dolgolyov ef0669d5dd feat: unified THE FORGE // SECTION headers and merged proxy routes
Build / build (push) Successful in 10m37s
UI consistency
- ForgeHero now supports backHref, mono kicker, stats snippet, staggered
  entrance animation, and a registration-tick divider
- Every route now opens with the same "THE FORGE // SECTION" eyebrow: projects,
  sites, stacks, proxies, events, dns, deploy, settings, stale containers,
  site/project detail + env/volumes/browse, new site wizard
- Stacks list/detail/new moved to the shared hero and brand-anchor eyebrow
- Toolbars migrated from bespoke buttons to the shared .forge-btn utilities
- Sidebar footline adds a live UTC "forge clock" and a vim-style g-prefix
  quick-nav hint (g d/p/s/k/x/r/e/c jumps to each section)

Proxies page
- Server-side: merge static site proxy routes with instance routes and sort
  by domain (internal/api/proxies.go, internal/store/static_sites.go)
- ProxyRoute gains a Source field ("instance" | "static_site")
- Frontend adds source filter tabs and per-source labels/badges
2026-04-22 16:27:55 +03:00
alexei.dolgolyov 0fd92fdfa3 feat: Forge design system app-wide + Stacks i18n
Build / build (push) Successful in 10m47s
Promotes the Forge visual language from the Stacks feature into a
global design system used across the app:

- app.css: Forge utilities (dot-grid backdrop, eyebrow, ember,
  display/lede, status pills, stat grid, panels, registration
  marks, alert, terminal, buttons). CSS variables alias the forge
  display font to the app's standard sans stack (Inter, now
  properly self-hosted via @fontsource/inter).
- +layout.svelte: reskinned sidebar brand, active nav rail,
  mobile top bar, global h1/h2 typography overrides, main dot-grid
  backdrop.
- Shared components reskinned: EmptyState (breathing-ember empty
  mark), StatusBadge (mono pills with pulse), ConfirmDialog
  (registration marks + forge buttons).
- Dashboard (+page.svelte): ForgeHero header, forge-stat-grid,
  Instrument-style section titles with accent.
- New ForgeHero component for reusable hero headers.

Stacks feature fully localized (EN + RU):
- 80+ keys under stacks.* covering list, new, detail, revisions,
  logs, errors, status labels, delete/rollback dialogs.
- Russian uses forge vocabulary (куются/наковальня/куём/etc).
- $t() wired through all three Stacks pages.
2026-04-16 04:17:42 +03:00
alexei.dolgolyov 75424a5f25 feat: docker-compose stacks with Forge-themed UI
Build / build (push) Successful in 10m42s
Adds a new Stacks feature: upload/edit docker-compose YAML,
deploy as atomic units, browse revisions, roll back, and
stream logs. Backend in internal/stack + internal/api/stacks.go,
persistent storage in internal/store/stacks.go.

Stacks pages (list, new, detail) use a modern Forge aesthetic —
Instrument Serif display type, JetBrains Mono for meta/code,
indigo ember accents, dot-grid hero, registration marks on
hover, terminal panel for logs. Palette is sourced from the
app's existing design tokens so the feature remains consistent
with the rest of Tinyforge.

Fonts self-hosted via @fontsource/instrument-serif and
@fontsource/jetbrains-mono to satisfy the strict CSP.
2026-04-16 03:48:37 +03:00
alexei.dolgolyov 9ec25a8d5a feat: project detail UX improvements
- Tag picker: replace raw text input with EntityPicker modal showing
  registry tags (auto-detected by image hostname) and local images.
  Fix URL encoding bug where encodeURIComponent encoded slashes in
  image paths, causing 502 on registry tag API.
- Stage editing: inline edit form for name, tag pattern, max instances,
  CPU/memory limits, auto-deploy and proxy toggles.
- Stage delete: use ConfirmDialog modal instead of window.confirm().
  Immediately remove stage from local state after deletion.
- Project-level env: add/edit/delete project env vars (stored in
  project.env JSON field). Move stage selector inline with Stage
  Overrides heading so it's clear project env is independent.
- Access list UX: rename "None (public)" to "Global default", clarify
  help text.
- Add missing i18n keys for all new UI (en + ru).
2026-04-13 00:12:34 +03:00
alexei.dolgolyov 791cd4d6af feat: rename Docker Watcher to Tinyforge
Build / build (push) Successful in 12m20s
Rebrand the project as Tinyforge to reflect its evolution from a Docker
container watcher into a self-hosted mini CI/deployment platform.

Rename covers: Go module path, Docker labels, DB/config filenames,
JWT issuer, Dockerfile binary, docker-compose, CI workflows, frontend
i18n, README with static sites docs, and all code comments.
2026-04-12 21:30:39 +03:00
alexei.dolgolyov 8d2c5a063b feat: static sites feature with Gitea/GitHub/GitLab support and Deno backend
Deploy static content from Git repository folders with optional server-side
API endpoints. Supports Gitea/Forgejo/Gogs, GitHub, and GitLab with provider
autodetection.

- New Sites entity with CRUD, encrypted secrets, and manual/push/tag sync triggers
- Pluggable GitProvider interface with three implementations
- Deno container mode: auto-generates router from API_{method}_{name} exports
- Static container mode: nginx serving files with optional markdown rendering
- Wizard UI with provider selector, repo picker, branch/folder tree pickers
- Deploy pipeline builds fresh image, starts container, configures NPM proxy
- Stop/Start buttons, force redeploy on manual trigger
- Periodic health checker detects crashed containers
- Proxy route existence check during auto-sync
2026-04-11 03:35:57 +03:00
alexei.dolgolyov b0816502bf feat: configurable unused images threshold with dashboard warning
- Add image_prune_threshold_mb setting (default 1024 MB)
- Add GET /api/docker/unused-images endpoint returning unused image count, size, and threshold status
- Dashboard shows amber warning banner when unused project images exceed threshold
- Banner links to settings page for pruning, shows count and human-readable size (MB/GB)
- Threshold configurable in Docker Image Cleanup section of settings
- DB migration + schema for image_prune_threshold_mb
2026-04-05 14:34:48 +03:00
alexei.dolgolyov 0ddad87a9a fix: add confirmation dialog for Docker image prune button 2026-04-05 14:20:49 +03:00
alexei.dolgolyov 21ffef2ee2 feat: separate Public IP for DNS records from Server IP, improve settings help texts
- Add public_ip field to Settings for DNS A records (proxy/load balancer IP)
- DNS records now use public_ip, falling back to server_ip if empty
- Server IP renamed to "Server IP (Docker Host)" for clarity
- Public IP labeled "Public IP (DNS Target)"
- Updated help texts for domain, server IP, public IP, and Docker network
- DB migration + schema for public_ip column
2026-04-05 14:12:53 +03:00
alexei.dolgolyov d03cc3c811 feat: container logs viewer with SSE streaming and line limiter
- Add GET /api/projects/{id}/stages/{stage}/instances/{iid}/logs endpoint
- Supports JSON mode (returns array of lines) and SSE mode (streams in real-time)
- Docker log stream header (8-byte prefix) stripped automatically
- ContainerLogs component with:
  - Tail line selector (50/200/500/1000)
  - Follow button for real-time streaming via SSE
  - Auto-scroll to bottom
  - Dark terminal-style display
  - Close button
- Logs button (events icon) on each instance card
- i18n keys in EN and RU
2026-04-05 14:04:45 +03:00
alexei.dolgolyov ac3132d172 feat: show local Docker images on project detail page
- Add GET /api/projects/{id}/images endpoint returning local images matching the project
- Add ListImagesByRef with tag, size, and created timestamp to Docker client
- Display images table on project page with tag, ID (truncated), size (MB), and created date
- Only shown when Docker is available and images exist locally
2026-04-05 13:56:55 +03:00
alexei.dolgolyov 5577851f22 feat: project-scoped Docker image prune, conflict fix, deploy toggle, access list picker
- Image prune only removes images matching project image refs, skips active instances
- Add ListImagesByRef and RemoveImage to Docker client
- Fix 409 conflict: use listProjects instead of duplicate POST
- Add "Deploy immediately" toggle to Quick Deploy (off by default)
- Replace raw access list ID with EntityPicker on project edit form
- Trigger proxy resync on access list change
- Fix stage form layout: single responsive row
- Fix empty port default on project creation
- Improve inspect error message for remote Docker
2026-04-05 13:49:20 +03:00
alexei.dolgolyov a830378c5b fix: replace access list ID field with EntityPicker, add deploy toggle, improve UX
- Replace raw NPM access list ID input with EntityPicker on project edit form
- Resolve access list name from NPM API when editing project
- Add "Deploy immediately" toggle to Quick Deploy (off by default)
- Fix stage form layout: all fields on same row with toggles
- Fix empty port default on project creation (placeholder instead of pre-filled)
- Improve inspect error message when Docker is unavailable
- Trigger proxy resync when NPM access list changes
- Resolve access list name on NPM settings page load
2026-04-05 13:07:09 +03:00
alexei.dolgolyov 7550fe9e32 feat: CPU/RAM limits per stage, NPM access list (global + per-project)
Resource limits:
- Add cpu_limit (cores) and memory_limit (MB) fields to Stage model
- Pass limits to Docker container via NanoCPUs and Memory in HostConfig
- Add CPU/Memory fields to stage creation form in project detail
- 0 = unlimited (default)

NPM access list:
- Add npm_access_list_id to Settings (global default) and Project (per-project override)
- Per-project overrides global when > 0
- NPM provider passes access_list_id when configuring proxy hosts
- Add GET /api/settings/npm-access-lists endpoint to list NPM access lists
- Add access list picker on NPM settings page (global)
- Add access list ID field on project edit form (per-project)
- DB migrations for all new columns
2026-04-05 12:44:26 +03:00
alexei.dolgolyov 195ef3e7e5 feat: NPM remote mode for cross-machine deployments
- Add npm_remote setting: when enabled, proxy forwards to server_ip with
  published host ports instead of Docker container names
- Deployer looks up assigned host port via InspectContainerPort in remote mode
- Auto-remove stale containers with same name before creating new ones
- Add Remote NPM toggle with warning on NPM settings page
- DB migration + schema for npm_remote column
2026-04-05 02:18:06 +03:00
alexei.dolgolyov c26c41e6a1 feat: enable proxy toggle on quick deploy, event log clearing, and UX fixes
- Add enable_proxy toggle to Quick Deploy form (defaults to on)
- Add DELETE /api/events/log/{id} and DELETE /api/events/log endpoints
- Add Clear All button with confirmation on Events page
- Rename "NPM Proxy" to "Enable Proxy" on stage form (provider-agnostic)
- Fix polling interval validation (min 60s) and number input trim errors
- Fix domain field no longer required in settings
2026-04-05 01:50:19 +03:00
alexei.dolgolyov 187e302f4a feat: proxy routes page, OIDC login fix, NPM test connection, webhook URL fix, and UX improvements
- Add /proxies page showing deploy-managed proxy routes with project/stage links, search, and status
- Add GET /api/proxies endpoint joining instances with project/stage names
- Add POST /api/settings/npm/test endpoint for NPM connection validation
- Add GET /api/auth/mode public endpoint for auth mode detection
- Add NPM Test Connection button with validation on save
- Fix OIDC SSO button only shown when auth_mode is oidc
- Fix webhook URL showing empty when domain not set (fallback to request host)
- Fix quick deploy double-tag (image:latest:latest) by splitting tag from image URL
- Fix trim() errors on number inputs in deploy and settings forms
- Fix NPM client auto-append /api to base URL
- Sanitize NPM test error messages (no raw HTML)
- Remove healthcheck field from Quick Deploy form
- Fix env vars placeholder newline
- Make domain field optional in settings
- Set polling interval minimum to 60s
- Add Proxies and Events to sidebar navigation
- Fix SSL cert name flash on NPM settings page
- Fix empty state icon on proxies page
2026-04-05 01:27:54 +03:00
alexei.dolgolyov 1aa9c3f0e9 feat: separate NPM and Traefik settings tabs, add Events to sidebar nav
- Create /settings/npm page with NPM credentials + SSL certificate picker
- Create /settings/traefik page with entrypoint, cert resolver, network, API URL + labels reference
- Dynamic settings nav: NPM/Traefik tabs only visible when respective provider is selected
- Remove inline Traefik config from general settings page
- Remove old credentials page (replaced by NPM tab)
- Add Events page to sidebar navigation
- Fix SystemHealthCard after standalone proxy removal
2026-04-04 23:27:00 +03:00
alexei.dolgolyov 308547a3d7 refactor: remove standalone proxies, add Traefik provider with Docker labels
Standalone proxy removal:
- Delete store, API handlers, proxy manager, health monitor, validator, hints
- Delete frontend pages (proxies list, create, edit) and components (ProxyCard, ProxyForm, ProxyFilter, ProxyGroup, ValidationChecklist)
- Remove proxy routes from router, nav items, dashboard references
- Clean up SystemHealthCard to remove proxy section

Traefik provider:
- Add TraefikProvider implementing proxy.Provider via Docker labels
- ContainerLabels() returns traefik.enable, router rule, entrypoints, service port, TLS cert resolver, docker network
- ConfigureRoute() returns router name (labels handle routing at container creation)
- DeleteRoute() is no-op (container removal auto-deregisters)
- Ping() checks Traefik API health (optional)
- Wire ContainerLabels into deployer (executeDeploy + blueGreenDeploy)
- Add Traefik settings: entrypoint, cert_resolver, network, api_url
- Add traefik option to proxy provider selector in settings UI
- Show conditional Traefik config fields
- Add i18n keys (EN + RU)
2026-04-04 22:54:31 +03:00
alexei.dolgolyov 216bd7e2db fix: UI/UX consistency overhaul — fix 8 bugs, standardize design system
Bug fixes:
- Backup refresh no longer re-renders entire page (separate refreshing state)
- SSL cert button no longer flickers when no certs available
- Volume mode selector rewritten to use proper scope system (7 scopes)
- Navigation flicker eliminated when returning from env/volumes pages
- Logout button moved to sidebar footer near theme/locale controls
- Subdomain pattern now shows variable hint tooltip ({project}, {stage}, etc.)
- SSL certificate selector moved to Credentials page with auto-save
- Projects page now has search/filter by name, image, or registry

Consistency improvements:
- New Breadcrumb component replaces 5 inline implementations
- New IconArrowLeft, IconChevronDown components replace inline SVGs
- All inline spinners replaced with IconLoader component
- 10 semantic badge classes with dark mode variants in tokens.css
- Global disabled button cursor-not-allowed rule
- Raw inputs in auth page replaced with FormField components
- Missing aria-labels added to icon-only buttons
- Error panels standardized to use design tokens
2026-04-04 21:34:36 +03:00
alexei.dolgolyov 97d2980e95 feat: proxy provider UI, webhook URL fix, and dev-server key persistence
- Add proxy provider selector (None/NPM) to settings page with radio cards
- Show warning when switching to None about existing NPM routes
- Conditionally hide SSL certificate picker and NPM credentials when provider is None
- Fix webhook URL regenerate not updating UI (field name mismatch: url vs webhook_url)
- Fix dev-server.sh to persist ENCRYPTION_KEY across restarts via data/.dev-key
- Fix selected radio card background visibility in dark mode
2026-04-04 20:33:08 +03:00
alexei.dolgolyov 6667abf03c fix: quick deploy duplicate detection, logout UX, backup toggle, CSP, SSE guard, and migration
- Detect existing projects with same image on quick deploy; show conflict dialog with options
- Move logout button to sidebar header as icon-only
- Replace backup checkbox with ToggleSwitch component
- Allow unsafe-inline in CSP script-src for SvelteKit hydration
- Guard SSE connection behind isAuthenticated() check
- Add notification_url ALTER TABLE migration for existing databases
- Restore RegisterPersistentLogger on event bus
2026-04-04 14:40:59 +03:00
alexei.dolgolyov 04c1411f5d fix: extract hardcoded English strings to i18n system with Russian translations
- Extract ~40 hardcoded strings from project detail, deploy, settings, credentials, registries, auth, env editor pages
- Add corresponding Russian translations
- Replace native confirm() default labels with i18n keys in ConfirmDialog
- Fix InstanceCard pluralization to use i18n
2026-04-04 13:03:05 +03:00
alexei.dolgolyov a9c7775bb7 feat: configuration backup management with manual and auto backup
Add backup/restore functionality for the SQLite database. Users can
trigger manual backups, configure automatic backups on an interval
with retention policies, list/download/delete backups, and restore
from any backup.

- Backup engine using VACUUM INTO (safe with WAL mode)
- Backup metadata tracked in DB, files stored in DATA_DIR/backups/
- Settings: backup_enabled, backup_interval_hours, backup_retention_count
- API: POST/GET/DELETE /api/backups, download, restore endpoints
- Autobackup via cron scheduler with configurable interval
- Retention: prune on startup, after each backup (manual and auto)
- Orphan cleanup: removes backup files without metadata on startup
- Restore: replaces DB and triggers graceful server shutdown
- Settings UI: /settings/backup with toggle, interval, retention config
- Backup list with download, delete, restore actions
- i18n: English and Russian translations
2026-04-02 15:32:15 +03:00
alexei.dolgolyov c730cfaa45 feat: Cloudflare DNS management with automatic record sync
Add flexible DNS management to Docker Watcher. By default, wildcard DNS
is assumed (current behavior). When disabled, users can configure a
Cloudflare DNS provider with API token and zone selection. DNS A records
are automatically created/updated/deleted in sync with proxy consumers
(deployed instances and standalone proxies).

- Settings: wildcard_dns toggle, dns_provider, cloudflare credentials
- Cloudflare client: Provider interface with EnsureRecord/DeleteRecord/ListRecords
- DNS lifecycle hooks in deployer and proxy manager (best-effort)
- Settings UI: DNS config section with provider picker, zone selector, test button
- DNS Records page at /dns with filtering, sync status, reconciliation
- Records visible in both wildcard and managed modes
- Cleanup on provider change: removes old records when switching modes
2026-04-02 14:49:21 +03:00
alexei.dolgolyov 6b54a72ec9 feat(volume-browser): phase 2 - file browser UI
- Browse route: /projects/{id}/volumes/{volId}/browse
- Directory listing with file icons, sizes, dates
- Breadcrumb navigation, click-to-navigate directories
- Download entire volume or folder as ZIP
- Upload files via file picker
- i18n EN/RU for all browser strings
2026-04-01 23:04:30 +03:00
alexei.dolgolyov 8fb959f81f feat: volume scopes redesign — replace shared/isolated with 6 scopes
Replace confusing shared/isolated volume modes with explicit scopes:
- instance: per-deploy isolated directory
- stage: shared within a stage across deploys
- project: shared across all stages
- project_named: named group within a project
- named: global named volume across projects
- ephemeral: tmpfs in-memory mount

Includes schema migration (shared→project, isolated→instance),
backward-compatible deployer resolution, scope metadata API endpoint,
and redesigned volume editor UI with scope guide cards and hints.
2026-03-31 23:22:43 +03:00
alexei.dolgolyov 1a8dfefa77 feat: Docker diagnostic hints on disconnection
- Classify Docker errors into categories (socket_not_found, connection_refused,
  permission_denied, timeout, tls_error) with platform-specific hints
- Enrich GET /api/health with structured diagnostics (category, hints, platform)
- Expandable hints panel in sidebar when Docker is disconnected
- "Retry now" button for immediate re-check
- Collapsible raw error details for advanced users
2026-03-30 14:05:00 +03:00
alexei.dolgolyov 37cfa090ac feat: global Docker health indicator and graceful degradation
- GET /api/health endpoint returning Docker connectivity status
- Sidebar shows Docker connection dot (green=connected, red=disconnected)
- Stale scanner returns store-only results when Docker is unavailable
- Polls health every 30s
2026-03-30 13:43:33 +03:00
alexei.dolgolyov 71aeb615b3 fix(observability): router conflict, logout button, missing i18n
- Fix chi duplicate Route() panic by consolidating read/write routes
  into single Route blocks with nested admin Group
- Add logout button to sidebar with token cleanup
- Add missing settingsAuth.password i18n key
2026-03-30 12:26:22 +03:00