# Per-Workload Metrics Graph: Implementation Plan

**Status:** planned · **Feature rank:** #2 · **Date:** 2026-06-19

## Problem

Stats are collected per container (`container_stats_samples`, CPU/mem/net/disk) and charted **globally** on the dashboard (`SystemResourcesCard` + `ResourceChart`), but `/apps/[id]` shows only live snapshots: there's no per-workload "is my app leaking memory / pegging CPU over the last few hours" view. This is a daily question and the data already exists; we just need a per-workload query + a panel that reuses the chart.

## Verified facts

- `ContainerStatsSample.OwnerID` == the **container row id** (`containers.id`), confirmed by `lookupInstanceName` → `GetContainerByID(sm.OwnerID)` in [stats_history.go](../../internal/api/stats_history.go). `OwnerType` ∈ {instance, site}.
- Each sample's `ts` is that container's own Docker-stats `Timestamp.Unix()` ([collector.go](../../internal/stats/collector.go)), NOT one shared tick stamp. In a multi-container tick the per-second truncation usually collapses them to the same integer `ts`, so per-`ts` aggregation works; a ±1s split at a second boundary is cosmetic for a trend line. (Reviewer-corrected.) The handler 404s on an unknown workload id but returns `[]` for a known workload with no samples yet.
- `ResourceChart.svelte` takes a fully-built `EChartsOption` from the parent; the parent owns series/axes (see `SystemResourcesCard`). Reads stay available when Docker is down (samples come from SQLite, not the daemon).
- Per-workload reads (`/events`, `/runtime-state`) are open to any authenticated user; this endpoint follows suit (no `AdminOnly`).

## Backend

1. **Store**: `ListContainerStatsSamplesByWorkload(workloadID string, sinceTS int64)`:

   ```sql
   SELECT cs.container_id, cs.owner_type, cs.owner_id, cs.ts,
          cs.cpu_percent, cs.memory_usage, cs.memory_limit,
          cs.network_rx, cs.network_tx, cs.block_read, cs.block_write
   FROM container_stats_samples cs
   JOIN containers c ON c.id = cs.owner_id
   WHERE c.workload_id = ? AND cs.ts >= ?
   ORDER BY cs.ts ASC
   ```

   Returns `[]ContainerStatsSample`.

2. **API**: `getWorkloadStatsHistory` (GET `/api/workloads/{id}/stats/history?window=`): reuse `parseWindow`/`sinceTimestamp`; aggregate samples **per ts** into a compact series so multi-container workloads (compose) sum correctly:

   ```go
   type workloadStatsPoint struct {
       TS          int64   `json:"ts"`
       CPUPercent  float64 `json:"cpu_percent"`  // sum across the workload's containers
       MemoryUsage int64   `json:"memory_usage"` // sum bytes
       MemoryLimit int64   `json:"memory_limit"` // max (effective ceiling)
   }
   ```

   Always returns `[]` (never 503): empty when stats are disabled / Docker was down / the workload is new. Register in the `/workloads/{id}` route block.

3. **Tests**: store: join scopes to the right workload (A's samples ≠ B's); API: per-ts aggregation sums two containers at the same tick.

## Frontend

4. **api.ts**: `WorkloadStatsPoint` type + `fetchWorkloadStatsHistory(id, window, signal)`.
5. **`WorkloadMetricsPanel.svelte`**: window selector (30m / 2h / 6h), fetch + 15s poll (mirror `SystemResourcesCard`), build an `EChartsOption` with **two series**: CPU % on the left axis, Memory (MiB) on the right axis (absolute bytes, because `memory_limit` is often 0/unlimited so a % would divide by zero). `EmptyState`/hint when there are no samples. Render via `ResourceChart`. Mount on `/apps/[id]` near the deploy-history panel.
6. **i18n**: `apps.detail.metrics.*` in both en.json and ru.json (parity mandatory).

## Risks / mitigations

- **Docker down / stats disabled** → empty series, friendly hint (no error). SQLite read path is independent of the daemon.
- **memory_limit = 0 (unlimited)** → plot absolute MiB, not %, to avoid div-by-zero.
- **Sparse sampling** → chart shows whatever ticks exist; window selector lets the user widen. No interpolation.
- **Auth** → read-only, any authenticated user (consistent with other per-workload reads).

## Rollout

Single change set, additive, no migration. Reuses the existing `echarts` dependency and `ResourceChart` component.
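
## Appendix: per-ts aggregation sketch

The per-`ts` aggregation planned for `getWorkloadStatsHistory` could look roughly like the sketch below. It is not the final handler code: `statsSample` is a trimmed, hypothetical stand-in for `ContainerStatsSample`, and `aggregateByTS` is an illustrative name. It sums CPU and memory usage across containers that share a `ts`, takes the maximum `memory_limit`, and returns a non-nil slice so that an empty result should serialize as `[]` rather than `null`.

```go
package main

import "sort"

// statsSample is a hypothetical, trimmed stand-in for ContainerStatsSample;
// only the fields needed for aggregation are included.
type statsSample struct {
	TS          int64
	CPUPercent  float64
	MemoryUsage int64
	MemoryLimit int64
}

// workloadStatsPoint mirrors the response shape described in the plan.
type workloadStatsPoint struct {
	TS          int64   `json:"ts"`
	CPUPercent  float64 `json:"cpu_percent"`  // sum across the workload's containers
	MemoryUsage int64   `json:"memory_usage"` // sum bytes
	MemoryLimit int64   `json:"memory_limit"` // max (effective ceiling)
}

// aggregateByTS collapses per-container samples into one point per ts.
// The result is always non-nil, so encoding/json emits [] for no samples.
func aggregateByTS(samples []statsSample) []workloadStatsPoint {
	points := make([]workloadStatsPoint, 0, len(samples))
	index := make(map[int64]int, len(samples)) // ts -> position in points

	for _, s := range samples {
		i, seen := index[s.TS]
		if !seen {
			i = len(points)
			index[s.TS] = i
			points = append(points, workloadStatsPoint{TS: s.TS})
		}
		p := &points[i]
		p.CPUPercent += s.CPUPercent
		p.MemoryUsage += s.MemoryUsage
		if s.MemoryLimit > p.MemoryLimit {
			p.MemoryLimit = s.MemoryLimit
		}
	}

	// The store query already orders by ts, but sorting keeps the
	// function correct for unordered input as well.
	sort.Slice(points, func(a, b int) bool { return points[a].TS < points[b].TS })
	return points
}
```

Keying on the integer `ts` keeps the code simple and accepts the cosmetic ±1s split noted under Verified facts; bucketing to a coarser interval would be an alternative if that split ever becomes visible on the chart.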