Two additions to the app detail page, each backed by a per-workload
endpoint.
Deploy history + rollback:
- New deploy_history table — a structured, version-pinned ledger of every
dispatch (success AND failure), distinct from the free-text event_log.
Recorded at the single DispatchPlugin choke point so every source kind
is covered. The raw deploy error is never persisted (it can carry
registry-auth / compose-stdout secrets) — only a generic marker, with
detail going to slog. Pruned to the newest N per workload; cascade-
deleted with the workload.
- GET /api/workloads/{id}/deploys lists the ledger; POST .../rollback
(admin) replays a prior successful deploy's pinned reference as a
rollback-reason dispatch. Phase 1 is image-source only (RollbackCapable);
git-built sources need checkout-by-commit, a later phase.
- DeployHistoryPanel.svelte renders the ledger with confirm-gated rollback.
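
The ledger rules in the bullets above (record success and failure, never persist the raw error, keep only the newest N) can be sketched roughly as follows. This is a minimal illustration, not the actual store or DispatchPlugin code: `deployRecord`, `newRecord`, `pruneNewest`, `genericFailure`, and `errExample` are hypothetical names, and the real table schema is not shown in this description.

```go
package main

import (
	"errors"
	"fmt"
	"log/slog"
)

// deployRecord is a hypothetical, simplified stand-in for a deploy_history row.
type deployRecord struct {
	WorkloadID string
	Reference  string // pinned reference that was dispatched
	Reason     string // e.g. "deploy" or "rollback"
	Status     string // "success" or "failed"
	ErrorNote  string // generic marker only, never the raw error text
}

// genericFailure is the only failure text that reaches the ledger (assumed wording).
const genericFailure = "deploy failed (see server logs)"

// errExample mimics a raw deploy error that carries a secret.
var errExample = errors.New("registry auth failed: token=s3cr3t")

// newRecord builds a ledger row. The raw error goes to slog only;
// the row stores a generic marker so secrets never reach the database.
func newRecord(workloadID, ref, reason string, deployErr error) deployRecord {
	rec := deployRecord{WorkloadID: workloadID, Reference: ref, Reason: reason, Status: "success"}
	if deployErr != nil {
		slog.Error("deploy failed", "workload", workloadID, "err", deployErr)
		rec.Status = "failed"
		rec.ErrorNote = genericFailure
	}
	return rec
}

// pruneNewest keeps at most n records from a slice ordered oldest first.
func pruneNewest(records []deployRecord, n int) []deployRecord {
	if n < 0 {
		n = 0
	}
	if len(records) <= n {
		return records
	}
	return records[len(records)-n:]
}

func main() {
	rec := newRecord("w1", "registry.example/app@sha256:abc", "deploy", errExample)
	fmt.Println(rec.Status, "|", rec.ErrorNote)
}
```

In a real store the pruning would more likely be a DELETE scoped to the workload; the slice version only illustrates the "newest N" rule.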
Per-workload metrics:
- ListContainerStatsSamplesByWorkload joins the existing container stats
samples through the containers index; GET /api/workloads/{id}/stats/history
aggregates CPU/memory per timestamp across the workload's containers.
- WorkloadMetricsPanel.svelte reuses ResourceChart (CPU% + memory MiB,
windowed, 15s poll).
en/ru i18n added with parity. Tests: store CRUD + cascade + workload-scoped
join, deployer recording (incl. secret-non-leak on failure), API rollback
guards, and per-timestamp aggregation. Plans under docs/plans/.
Per-Workload Metrics Graph — Implementation Plan
Status: planned · Feature rank: #2 · Date: 2026-06-19
Problem
Stats are collected per container (container_stats_samples, CPU/mem/net/disk) and
charted globally on the dashboard (SystemResourcesCard + ResourceChart), but
/apps/[id] shows only live snapshots — there's no per-workload "is my app leaking
memory / pegging CPU over the last few hours" view. This is a daily question and the
data already exists; we just need a per-workload query + a panel that reuses the chart.
Verified facts
- ContainerStatsSample.OwnerID == the container row id (containers.id), confirmed by lookupInstanceName → GetContainerByID(sm.OwnerID) in stats_history.go. OwnerType ∈ {instance, site}.
- Each sample's ts is that container's own Docker-stats Timestamp.Unix() (collector.go), NOT one shared tick stamp. In a multi-container tick the per-second truncation usually collapses them to the same integer ts, so per-ts aggregation works; a ±1s split at a second boundary is cosmetic for a trend line. (Reviewer-corrected.) The handler 404s on an unknown workload id but returns [] for a known workload with no samples yet.
- ResourceChart.svelte takes a fully-built EChartsOption from the parent; the parent owns series/axes (see SystemResourcesCard). Reads stay available when Docker is down (samples come from SQLite, not the daemon).
- Per-workload reads (/events, /runtime-state) are open to any authenticated user; this endpoint follows suit (no AdminOnly).
Backend
- Store: ListContainerStatsSamplesByWorkload(workloadID string, sinceTS int64):

      SELECT cs.container_id, cs.owner_type, cs.owner_id, cs.ts,
             cs.cpu_percent, cs.memory_usage, cs.memory_limit,
             cs.network_rx, cs.network_tx, cs.block_read, cs.block_write
      FROM container_stats_samples cs
      JOIN containers c ON c.id = cs.owner_id
      WHERE c.workload_id = ? AND cs.ts >= ?
      ORDER BY cs.ts ASC

  Returns []ContainerStatsSample.
- API: getWorkloadStatsHistory (GET /api/workloads/{id}/stats/history?window=): reuse parseWindow / sinceTimestamp; aggregate samples per ts into a compact series so multi-container workloads (compose) sum correctly:

      type workloadStatsPoint struct {
          TS          int64   `json:"ts"`
          CPUPercent  float64 `json:"cpu_percent"`  // sum across the workload's containers
          MemoryUsage int64   `json:"memory_usage"` // sum bytes
          MemoryLimit int64   `json:"memory_limit"` // max (effective ceiling)
      }

  For a known workload the response is always a JSON array (never 503); it is empty ([]) when stats are disabled, Docker was down, or the workload is new. Register in the /workloads/{id} route block.
- Tests: store: join scopes to the right workload (A's samples ≠ B's); API: per-ts aggregation sums two containers at the same tick.
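
The per-ts aggregation described in the API bullet above could look roughly like the sketch below. It assumes a trimmed `sample` type standing in for ContainerStatsSample and a hypothetical helper name `aggregateByTS`; only the sum/sum/max rule and the workloadStatsPoint fields come from the plan.

```go
package main

import (
	"fmt"
	"sort"
)

// sample is a trimmed, hypothetical stand-in for ContainerStatsSample.
type sample struct {
	OwnerID     string
	TS          int64
	CPUPercent  float64
	MemoryUsage int64
	MemoryLimit int64
}

type workloadStatsPoint struct {
	TS          int64   `json:"ts"`
	CPUPercent  float64 `json:"cpu_percent"`
	MemoryUsage int64   `json:"memory_usage"`
	MemoryLimit int64   `json:"memory_limit"`
}

// aggregateByTS folds samples that share a ts into one point:
// CPU and memory usage are summed, memory limit takes the max.
// The result is sorted by ts and is never nil.
func aggregateByTS(samples []sample) []workloadStatsPoint {
	byTS := make(map[int64]*workloadStatsPoint)
	for _, s := range samples {
		p, ok := byTS[s.TS]
		if !ok {
			p = &workloadStatsPoint{TS: s.TS}
			byTS[s.TS] = p
		}
		p.CPUPercent += s.CPUPercent
		p.MemoryUsage += s.MemoryUsage
		if s.MemoryLimit > p.MemoryLimit {
			p.MemoryLimit = s.MemoryLimit
		}
	}
	out := make([]workloadStatsPoint, 0, len(byTS))
	for _, p := range byTS {
		out = append(out, *p)
	}
	// Map iteration order is random, so sort explicitly.
	sort.Slice(out, func(i, j int) bool { return out[i].TS < out[j].TS })
	return out
}

func main() {
	pts := aggregateByTS([]sample{
		{OwnerID: "web", TS: 100, CPUPercent: 12.5, MemoryUsage: 64 << 20},
		{OwnerID: "db", TS: 100, CPUPercent: 7.5, MemoryUsage: 128 << 20, MemoryLimit: 512 << 20},
		{OwnerID: "web", TS: 115, CPUPercent: 10, MemoryUsage: 66 << 20},
	})
	for _, p := range pts {
		fmt.Printf("ts=%d cpu=%.1f mem=%d limit=%d\n", p.TS, p.CPUPercent, p.MemoryUsage, p.MemoryLimit)
	}
}
```

Because the store query already orders by ts, a real handler could also aggregate in a single pass without the map; the map version is shown because it does not depend on input order.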
Frontend
- api.ts: WorkloadStatsPoint type + fetchWorkloadStatsHistory(id, window, signal).
- WorkloadMetricsPanel.svelte: window selector (30m / 2h / 6h), fetch + 15s poll (mirror SystemResourcesCard), build an EChartsOption with two series: CPU % on the left axis, Memory (MiB) on the right axis (absolute bytes, because memory_limit is often 0/unlimited so a % would divide by zero). EmptyState / hint when there are no samples. Render via ResourceChart. Mount on /apps/[id] near the deploy-history panel.
- i18n: apps.detail.metrics.* in both en.json and ru.json (parity mandatory).
Risks / mitigations
- Docker down / stats disabled → empty series, friendly hint (no error). SQLite read path is independent of the daemon.
- memory_limit = 0 (unlimited) → plot absolute MiB, not %, to avoid div-by-zero.
- Sparse sampling → chart shows whatever ticks exist; window selector lets the user widen. No interpolation.
- Auth → read-only, any authenticated user (consistent with other per-workload reads).
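
One detail behind the "empty series" mitigation: in Go, `json.Marshal` encodes a nil slice as `null`, not `[]`, so a handler that wants the frontend to always receive an array has to normalize first. The sketch below shows this with hypothetical `point` and `encodeSeries` names; how the real handler writes its response is not shown in this plan.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// point is a one-field stand-in for workloadStatsPoint.
type point struct {
	TS int64 `json:"ts"`
}

// encodeSeries marshals pts as a JSON array, replacing a nil slice
// with an empty one so the body is "[]" rather than "null".
func encodeSeries(pts []point) (string, error) {
	if pts == nil {
		pts = []point{}
	}
	b, err := json.Marshal(pts)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	var none []point // nil: no samples were found
	body, err := encodeSeries(none)
	if err != nil {
		panic(err)
	}
	fmt.Println(body) // prints []
}
```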
Rollout
Single change set, additive, no migration. Reuses the existing echarts dependency and
ResourceChart component.