Files
tiny-forge/internal/api/stats_history.go
T
alexei.dolgolyov 739b67856a
Build / build (push) Successful in 10m39s
feat(cutover): hard legacy cutover — drop projects/stacks/sites/deploys
The clean-break delete that closes the workload-first refactor arc.
Net diff: ~30 backend files deleted, ~20 modified, ~12k LOC removed
on the Go side; entire /projects /stacks /sites /deploy frontend
trees gone; ~6.7k LOC removed on the Svelte/TypeScript side.

Backend
- API handlers gone: internal/api/{projects,stages,stage_env,stacks,
  static_sites,deploys,instances,volume_browser}.go
- Store CRUD + tests gone: internal/store/{projects,stages,stage_env,
  stacks,static_sites,static_site_secrets,deploys,poll_state,volumes,
  workload_sync}.go (+ _test.go siblings)
- Legacy deployer pipeline gone: internal/deployer/{bluegreen,promote,
  rollback,subdomain,resolver_test}.go; deployer.go trimmed to just the
  dispatch surface used by the plugin pipeline
- internal/staticsite/{manager,healthcheck}.go and
  internal/stack/manager.go gone (the rest of those packages stay as
  helpers imported by the static + compose plugins)
- internal/registry/poller.go gone (legacy registry poller)
- internal/volume.ResolvePath gone; ResolveWorkloadPath stays
- internal/webhook: handleWebhook (project) + handleSiteWebhook (site)
  gone; only POST /api/webhook/triggers/{secret} remains
- workload-side webhook URL handlers (getWorkloadWebhook +
  regenerateWorkloadWebhook + EnsureWorkloadWebhookSecret +
  SetWorkloadWebhookSecret + GetWorkloadByWebhookSecret) gone — they
  minted URLs that would 404 against the new trigger-only ingress
- cmd/server/main.go: dropped staticsite.Manager, stack.Manager,
  staticsite.HealthChecker, registry poller, SetSiteSyncTriggerer,
  SetStaticSiteManager, SetStackManager, wireStaticBackend
- store/store.go: idempotent DROP TABLE IF EXISTS for every legacy
  table (projects, stages, stage_env, volumes, deploys, deploy_logs,
  poll_states, stacks, stack_revisions, stack_deploys, static_sites,
  static_site_secrets); FK order children-then-parents
- store/models.go: dropped Project, Stage, Deploy, DeployLog, StageEnv,
  Volume, StaticSite, StaticSiteSecret, Stack, StackRevision,
  StackDeploy types; kept WorkloadKind constants as documented strings
- internal/store/helpers.go (new): BoolToInt, rowScanner,
  GenerateWebhookSecret extracted from deleted CRUD files
- internal/api/secrets.go (new): forwards to store.GenerateWebhookSecret
  so api + store paths share one secret-generation impl (no
  panic-vs-UUID-fallback divergence)
- internal/reconciler/reconciler.go: dropped legacy stack-by-compose
  + static-site label paths; only canonical tinyforge.workload.id
  dispatch remains
- providers (gitea_content/github_provider/gitlab_provider) gained
  path-traversal rejection on every tree entry
- internal/webhook ParsedImage / ParseImageRef demoted to package-
  private (no external callers)

Frontend
- /projects /stacks /sites /deploy routes deleted (entire trees)
- ProjectCard / InstanceCard / StaleContainerCard components deleted
- api.ts: dropped every project/stage/stack/site/deploy/instance
  helper + types (Project, Stage, Stack, StaticSite, Deploy,
  Instance, Volume, etc.); kept Workload, Container, App, Settings,
  Registry, EventTrigger, LogScanRule, webhook envelopes
- WorkloadWebhook type + getWorkloadWebhook/regenerateWorkloadWebhook
  api functions gone (mirror of the backend deletion above)
- web/src/routes/+layout.svelte: dropped /projects /sites /stacks
  /deploy nav entries, trimmed quick-nav keymap
- web/src/routes/+page.svelte: dashboard rewrite — reads
  listWorkloads + listContainers only; 4-card stat grid
  (workloads/running/failed/stale) + recent workloads strip
- navCounts.ts, SystemHealthCard.svelte, ContainerLogs.svelte,
  ContainerStats.svelte, StatusBadge.svelte, TagCombobox.svelte,
  proxies/+page.svelte, containers/+page.svelte all rewired to the
  workload-first surface
- AbortController plumbing on dashboard, nav-counts, stale page,
  SystemHealthCard so navigation doesn't leave dangling fetches
- i18n: dropped projects.*, projectDetail.*, envEditor.*,
  volumeEditor.*, volumeBrowser.*, quickDeploy.*, sites.*, stacks.*,
  instance.*, confirm.* namespaces; en/ru parity preserved (1042
  keys each)

Hardening from go-reviewer + security-reviewer + typescript-reviewer
subagent passes (0 CRITICAL across all three; 1 HIGH + ~12 MEDIUM
addressed inline before commit):

- Sec H1: dead-end workload webhook URL handlers (would mint URLs
  that 404 the new trigger-only ingress) deleted across backend +
  frontend
- Go M1: IsTerminalDeployStatus dropped (no production callers)
- Go M2: ParsedImage/ParseImageRef lowercased (in-package only)
- Go M6: generateWebhookSecret unified — api shim forwards to
  store.GenerateWebhookSecret
- Doc/comment freshness: stage_id (no longer FK), ProxyRoute legacy
  field names, workloadIDRow rationale, webhook_deliveries.target_type
  enum, WebhookDeliveryLog component header

Doc
- WORKLOAD_REFACTOR_TODO: cutover marked DONE; all three Priority 1
  items are now shipped. Next focus is Priority 3 polish (apps.* i18n
  + codemap entries) and Priority 4 tests.

Behavioral notes for operators upgrading from a pre-cutover build
- Existing rows in the dropped tables disappear on first boot.
- Legacy webhook URLs at /api/webhook/{secret} and
  /api/webhook/sites/{secret} return 404; CI configs must repoint to
  /api/webhook/triggers/{secret} (the trigger-split boot backfill
  lifted any embedded workload secret onto a Trigger row, so the
  secret value itself carries over).
- Frontend routes /projects /stacks /sites /deploy are gone; nav
  links replaced with /apps and /triggers.
2026-05-16 06:00:21 +03:00

190 lines
5.9 KiB
Go

package api
import (
"log/slog"
"net/http"
"sort"
"strconv"
"time"
"github.com/alexei/tinyforge/internal/auth"
"github.com/alexei/tinyforge/internal/store"
)
// topConsumerMinWindow is how recent a container sample must be to count toward
// the "top consumers" list. Scaled with the collector interval (read from
// settings) so it stays meaningful even when sampling is sparse.
const topConsumerMinWindow = 2 * time.Minute
// TopContainerSample augments a stats sample with the human-readable owner
// name so the UI can show "workload/role" without an extra round-trip per row.
type TopContainerSample struct {
store.ContainerStatsSample
OwnerName string `json:"owner_name"`
}
const (
// defaultHistoryWindow is used when no ?window= param is provided or the
// value fails to parse. Matches the default retention so the "last 2h"
// view always has data when collection is enabled.
defaultHistoryWindow = 2 * time.Hour
maxHistoryWindow = 24 * time.Hour
)
// parseWindow reads the ?window= query (Go duration string, e.g. "1h", "30m")
// and returns a bounded duration.
func parseWindow(r *http.Request) time.Duration {
raw := r.URL.Query().Get("window")
if raw == "" {
return defaultHistoryWindow
}
d, err := time.ParseDuration(raw)
if err != nil || d <= 0 {
return defaultHistoryWindow
}
if d > maxHistoryWindow {
return maxHistoryWindow
}
return d
}
// sinceTimestamp converts a duration into a Unix-seconds cutoff.
func sinceTimestamp(window time.Duration) int64 {
return time.Now().UTC().Add(-window).Unix()
}
// getSystemStats handles GET /api/system/stats — current host snapshot.
// When the Docker daemon is unreachable (e.g. Docker Desktop stopped) the
// handler returns 503 so the frontend can show a dedicated unavailable
// state instead of treating it as a generic 5xx failure.
func (s *Server) getSystemStats(w http.ResponseWriter, r *http.Request) {
if s.docker == nil {
respondError(w, http.StatusServiceUnavailable, "Docker is not available")
return
}
sys, err := s.docker.GetSystemStats(r.Context())
if err != nil {
slog.Warn("system stats unavailable", "error", err)
respondError(w, http.StatusServiceUnavailable, "Docker is not available")
return
}
respondJSON(w, http.StatusOK, sys)
}
// getSystemStatsHistory handles GET /api/system/stats/history?window=1h.
func (s *Server) getSystemStatsHistory(w http.ResponseWriter, r *http.Request) {
samples, err := s.store.ListSystemStatsSamples(sinceTimestamp(parseWindow(r)))
if err != nil {
slog.Error("failed to list system stats samples", "error", err)
respondError(w, http.StatusInternalServerError, "failed to list samples")
return
}
if samples == nil {
samples = []store.SystemStatsSample{}
}
respondJSON(w, http.StatusOK, samples)
}
// listTopContainers handles GET /api/system/stats/top?limit=5&by=cpu.
// Returns the top-N most recent samples across containers, sorted by CPU or
// memory. Container IDs are stripped for non-admins so a low-privilege viewer
// cannot enumerate workloads outside their scope.
func (s *Server) listTopContainers(w http.ResponseWriter, r *http.Request) {
limit := 5
if raw := r.URL.Query().Get("limit"); raw != "" {
if n, err := strconv.Atoi(raw); err == nil && n > 0 && n <= 50 {
limit = n
}
}
by := r.URL.Query().Get("by")
if by != "memory" {
by = "cpu"
}
// Samples must be at least as recent as max(2*interval, 2 minutes) so the
// list reflects near-current load even when collection is sparse.
window := topConsumerMinWindow
if settings, err := s.store.GetSettings(); err == nil && settings.StatsIntervalSeconds > 0 {
if w := time.Duration(settings.StatsIntervalSeconds*2) * time.Second; w > window {
window = w
}
}
samples, err := s.store.ListAllRecentContainerStatsSamples(sinceTimestamp(window))
if err != nil {
slog.Error("failed to list container samples for top", "error", err)
respondError(w, http.StatusInternalServerError, "failed to list samples")
return
}
// Keep only the latest sample per container.
latest := make(map[string]store.ContainerStatsSample, len(samples))
for _, sm := range samples {
if prev, ok := latest[sm.ContainerID]; !ok || sm.TS > prev.TS {
latest[sm.ContainerID] = sm
}
}
top := make([]store.ContainerStatsSample, 0, len(latest))
for _, sm := range latest {
top = append(top, sm)
}
sort.Slice(top, func(i, j int) bool {
if by == "memory" {
return top[i].MemoryUsage > top[j].MemoryUsage
}
return top[i].CPUPercent > top[j].CPUPercent
})
if len(top) > limit {
top = top[:limit]
}
enriched := s.enrichWithOwnerNames(top)
// Scrub container IDs for non-admins. The owner name is the actionable
// identifier; the container ID is a host-level handle that reveals
// workload existence to viewers who shouldn't have it.
claims, _ := auth.ClaimsFromContext(r.Context())
if claims.Role != "admin" {
for i := range enriched {
enriched[i].ContainerID = ""
}
}
respondJSON(w, http.StatusOK, enriched)
}
// enrichWithOwnerNames attaches a human-readable owner name to each sample.
// Names are resolved through the containers index → workloads, which after
// the cutover is the only available lookup path.
func (s *Server) enrichWithOwnerNames(samples []store.ContainerStatsSample) []TopContainerSample {
out := make([]TopContainerSample, len(samples))
for i, sm := range samples {
out[i] = TopContainerSample{ContainerStatsSample: sm}
out[i].OwnerName = s.lookupInstanceName(sm.OwnerID)
}
return out
}
// lookupInstanceName returns "workload/role" for a container row, or empty
// on any lookup error so a transient miss does not break the response.
func (s *Server) lookupInstanceName(instanceID string) string {
c, err := s.store.GetContainerByID(instanceID)
if err != nil {
return ""
}
w, err := s.store.GetWorkloadByID(c.WorkloadID)
if err != nil {
if c.Role != "" {
return c.Role
}
return ""
}
if c.Role != "" {
return w.Name + "/" + c.Role
}
return w.Name
}