739b67856a
Build / build (push) Successful in 10m39s
The clean-break delete that closes the workload-first refactor arc.

Net diff: ~30 backend files deleted, ~20 modified, ~12k LOC removed
on the Go side; entire /projects /stacks /sites /deploy frontend
trees gone; ~6.7k LOC removed on the Svelte/TypeScript side.
Backend
- API handlers gone: internal/api/{projects,stages,stage_env,stacks,
static_sites,deploys,instances,volume_browser}.go
- Store CRUD + tests gone: internal/store/{projects,stages,stage_env,
stacks,static_sites,static_site_secrets,deploys,poll_state,volumes,
workload_sync}.go (+ _test.go siblings)
- Legacy deployer pipeline gone: internal/deployer/{bluegreen,promote,
rollback,subdomain,resolver_test}.go; deployer.go trimmed to just the
dispatch surface used by the plugin pipeline
- internal/staticsite/{manager,healthcheck}.go and
internal/stack/manager.go gone (the rest of those packages stay as
helpers imported by the static + compose plugins)
- internal/registry/poller.go gone (legacy registry poller)
- internal/volume.ResolvePath gone; ResolveWorkloadPath stays
- internal/webhook: handleWebhook (project) + handleSiteWebhook (site)
gone; only POST /api/webhook/triggers/{secret} remains
- workload-side webhook URL handlers (getWorkloadWebhook +
regenerateWorkloadWebhook + EnsureWorkloadWebhookSecret +
SetWorkloadWebhookSecret + GetWorkloadByWebhookSecret) gone — they
minted URLs that would 404 against the new trigger-only ingress
- cmd/server/main.go: dropped staticsite.Manager, stack.Manager,
staticsite.HealthChecker, registry poller, SetSiteSyncTriggerer,
SetStaticSiteManager, SetStackManager, wireStaticBackend
- store/store.go: idempotent DROP TABLE IF EXISTS for every legacy
table (projects, stages, stage_env, volumes, deploys, deploy_logs,
poll_states, stacks, stack_revisions, stack_deploys, static_sites,
static_site_secrets); FK order children-then-parents
- store/models.go: dropped Project, Stage, Deploy, DeployLog, StageEnv,
Volume, StaticSite, StaticSiteSecret, Stack, StackRevision,
StackDeploy types; kept WorkloadKind constants as documented strings
- internal/store/helpers.go (new): BoolToInt, rowScanner,
GenerateWebhookSecret extracted from deleted CRUD files
- internal/api/secrets.go (new): forwards to store.GenerateWebhookSecret
so api + store paths share one secret-generation impl (no
panic-vs-UUID-fallback divergence)
- internal/reconciler/reconciler.go: dropped legacy stack-by-compose
+ static-site label paths; only canonical tinyforge.workload.id
dispatch remains
- providers (gitea_content/github_provider/gitlab_provider) gained
path-traversal rejection on every tree entry
- internal/webhook ParsedImage / ParseImageRef demoted to package-
private (no external callers)
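The path-traversal rejection added to the providers can be sketched roughly as follows. This is a minimal illustration, not the code in gitea_content/github_provider/gitlab_provider: the helper name safeTreePath and its exact rules (reject absolute paths, backslashes, and any ".." segment) are assumptions made for the example.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// safeTreePath is a hypothetical helper: it validates one repository tree
// entry before the entry is joined onto a local destination directory.
// It returns the cleaned relative path, or an error if the entry could
// escape the destination root.
func safeTreePath(entry string) (string, error) {
	if entry == "" {
		return "", fmt.Errorf("empty tree entry")
	}
	// Absolute paths and Windows-style separators are never expected in a
	// provider tree listing, so treat them as hostile.
	if strings.HasPrefix(entry, "/") || strings.ContainsRune(entry, '\\') {
		return "", fmt.Errorf("unsafe tree entry %q", entry)
	}
	// Reject any ".." segment outright instead of trying to normalise it.
	for _, seg := range strings.Split(entry, "/") {
		if seg == ".." {
			return "", fmt.Errorf("tree entry %q escapes the repository root", entry)
		}
	}
	return path.Clean(entry), nil
}

func main() {
	for _, e := range []string{"public/index.html", "../etc/passwd", "a/../../b", "/abs"} {
		p, err := safeTreePath(e)
		fmt.Printf("entry=%q cleaned=%q err=%v\n", e, p, err)
	}
}
```

Rejecting ".." segments before cleaning is a deliberately strict choice: an entry such as "a/../b" is refused even though it would resolve inside the root, which keeps the check easy to audit.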
Frontend
- /projects /stacks /sites /deploy routes deleted (entire trees)
- ProjectCard / InstanceCard / StaleContainerCard components deleted
- api.ts: dropped every project/stage/stack/site/deploy/instance
helper + types (Project, Stage, Stack, StaticSite, Deploy,
Instance, Volume, etc.); kept Workload, Container, App, Settings,
Registry, EventTrigger, LogScanRule, webhook envelopes
- WorkloadWebhook type + getWorkloadWebhook/regenerateWorkloadWebhook
api functions gone (mirror of the backend deletion above)
- web/src/routes/+layout.svelte: dropped /projects /sites /stacks
/deploy nav entries, trimmed quick-nav keymap
- web/src/routes/+page.svelte: dashboard rewrite — reads
listWorkloads + listContainers only; 4-card stat grid
(workloads/running/failed/stale) + recent workloads strip
- navCounts.ts, SystemHealthCard.svelte, ContainerLogs.svelte,
ContainerStats.svelte, StatusBadge.svelte, TagCombobox.svelte,
proxies/+page.svelte, containers/+page.svelte all rewired to the
workload-first surface
- AbortController plumbing on dashboard, nav-counts, stale page,
SystemHealthCard so navigation doesn't leave dangling fetches
- i18n: dropped projects.*, projectDetail.*, envEditor.*,
volumeEditor.*, volumeBrowser.*, quickDeploy.*, sites.*, stacks.*,
instance.*, confirm.* namespaces; en/ru parity preserved (1042
keys each)
Hardening from go-reviewer + security-reviewer + typescript-reviewer
subagent passes (0 CRITICAL across all three; 1 HIGH + ~12 MEDIUM
addressed inline before commit):
- Sec H1: dead-end workload webhook URL handlers (would mint URLs
that 404 against the new trigger-only ingress) deleted across backend +
frontend
- Go M1: IsTerminalDeployStatus dropped (no production callers)
- Go M2: ParsedImage/ParseImageRef lowercased (in-package only)
- Go M6: generateWebhookSecret unified — api shim forwards to
store.GenerateWebhookSecret
- Doc/comment freshness: stage_id (no longer FK), ProxyRoute legacy
field names, workloadIDRow rationale, webhook_deliveries.target_type
enum, WebhookDeliveryLog component header
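The Go M6 item (one secret-generation implementation behind a forwarding shim) might look roughly like this. It is a sketch only: both functions live in one file here, whereas the commit places them in internal/store and internal/api, and the 32-byte length, hex encoding, and error-returning signature are assumptions rather than the project's actual choices.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// GenerateWebhookSecret stands in for store.GenerateWebhookSecret.
// Assumption: 32 random bytes, hex-encoded, with failures surfaced as an
// error so every caller sees the same behaviour.
func GenerateWebhookSecret() (string, error) {
	buf := make([]byte, 32)
	if _, err := rand.Read(buf); err != nil {
		return "", fmt.Errorf("generate webhook secret: %w", err)
	}
	return hex.EncodeToString(buf), nil
}

// generateWebhookSecret stands in for the api-side shim. It only forwards,
// so the api and store paths cannot drift apart in how they handle failure.
func generateWebhookSecret() (string, error) {
	return GenerateWebhookSecret()
}

func main() {
	secret, err := generateWebhookSecret()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("secret length:", len(secret))
}
```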
Doc
- WORKLOAD_REFACTOR_TODO: cutover marked DONE; all three Priority 1
items are now shipped. Next focus is Priority 3 polish (apps.* i18n
+ codemap entries) and Priority 4 tests.
Behavioral notes for operators upgrading from a pre-cutover build
- Existing rows in the dropped tables disappear on first boot.
- Legacy webhook URLs at /api/webhook/{secret} and
/api/webhook/sites/{secret} return 404; CI configs must repoint to
/api/webhook/triggers/{secret} (the trigger-split boot backfill
lifted any embedded workload secret onto a Trigger row, so the
secret value itself carries over).
- Frontend routes /projects /stacks /sites /deploy are gone; nav
links replaced with /apps and /triggers.
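For operators with many CI configs, the webhook repoint described above can be scripted. The sketch below is a hypothetical helper, not part of the repository: repointWebhookURL and the example host are invented for illustration, and it relies on the commit's statement that the secret value itself carries over to the trigger URL.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// repointWebhookURL rewrites a legacy webhook URL of the form
// /api/webhook/{secret} or /api/webhook/sites/{secret} to the trigger-only
// form /api/webhook/triggers/{secret}. URLs already in the new form are
// returned unchanged.
func repointWebhookURL(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", fmt.Errorf("parse %q: %w", raw, err)
	}
	const prefix = "/api/webhook/"
	if !strings.HasPrefix(u.Path, prefix) {
		return "", fmt.Errorf("%q is not a webhook URL", raw)
	}
	rest := strings.TrimPrefix(u.Path, prefix)
	if strings.HasPrefix(rest, "triggers/") {
		return u.String(), nil // already repointed
	}
	secret := strings.TrimPrefix(rest, "sites/")
	if secret == "" || strings.Contains(secret, "/") {
		return "", fmt.Errorf("%q has no recognisable secret segment", raw)
	}
	u.Path = prefix + "triggers/" + secret
	return u.String(), nil
}

func main() {
	for _, raw := range []string{
		"https://forge.example.com/api/webhook/abc123",
		"https://forge.example.com/api/webhook/sites/abc123",
		"https://forge.example.com/api/webhook/triggers/abc123",
	} {
		out, err := repointWebhookURL(raw)
		fmt.Println(out, err)
	}
}
```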
190 lines
5.9 KiB
Go
package api

import (
	"log/slog"
	"net/http"
	"sort"
	"strconv"
	"time"

	"github.com/alexei/tinyforge/internal/auth"
	"github.com/alexei/tinyforge/internal/store"
)

// topConsumerMinWindow is the minimum lookback window for samples counted
// toward the "top consumers" list. listTopContainers widens it to twice the
// collector interval (read from settings) so it stays meaningful even when
// sampling is sparse.
const topConsumerMinWindow = 2 * time.Minute

// TopContainerSample augments a stats sample with the human-readable owner
// name so the UI can show "workload/role" without an extra round-trip per row.
type TopContainerSample struct {
	store.ContainerStatsSample
	OwnerName string `json:"owner_name"`
}

const (
	// defaultHistoryWindow is used when no ?window= param is provided or the
	// value fails to parse. Matches the default retention so the "last 2h"
	// view always has data when collection is enabled.
	defaultHistoryWindow = 2 * time.Hour
	maxHistoryWindow     = 24 * time.Hour
)

// parseWindow reads the ?window= query (Go duration string, e.g. "1h", "30m")
// and returns a bounded duration.
func parseWindow(r *http.Request) time.Duration {
	raw := r.URL.Query().Get("window")
	if raw == "" {
		return defaultHistoryWindow
	}
	d, err := time.ParseDuration(raw)
	if err != nil || d <= 0 {
		return defaultHistoryWindow
	}
	if d > maxHistoryWindow {
		return maxHistoryWindow
	}
	return d
}

// sinceTimestamp converts a duration into a Unix-seconds cutoff.
func sinceTimestamp(window time.Duration) int64 {
	return time.Now().UTC().Add(-window).Unix()
}

// getSystemStats handles GET /api/system/stats: current host snapshot.
// When the Docker daemon is unreachable (e.g. Docker Desktop stopped) the
// handler returns 503 so the frontend can show a dedicated unavailable
// state instead of treating it as a generic 5xx failure.
func (s *Server) getSystemStats(w http.ResponseWriter, r *http.Request) {
	if s.docker == nil {
		respondError(w, http.StatusServiceUnavailable, "Docker is not available")
		return
	}
	sys, err := s.docker.GetSystemStats(r.Context())
	if err != nil {
		slog.Warn("system stats unavailable", "error", err)
		respondError(w, http.StatusServiceUnavailable, "Docker is not available")
		return
	}
	respondJSON(w, http.StatusOK, sys)
}

// getSystemStatsHistory handles GET /api/system/stats/history?window=1h.
func (s *Server) getSystemStatsHistory(w http.ResponseWriter, r *http.Request) {
	samples, err := s.store.ListSystemStatsSamples(sinceTimestamp(parseWindow(r)))
	if err != nil {
		slog.Error("failed to list system stats samples", "error", err)
		respondError(w, http.StatusInternalServerError, "failed to list samples")
		return
	}
	if samples == nil {
		samples = []store.SystemStatsSample{}
	}
	respondJSON(w, http.StatusOK, samples)
}

// listTopContainers handles GET /api/system/stats/top?limit=5&by=cpu.
// Returns the top-N most recent samples across containers, sorted by CPU or
// memory. Container IDs are stripped for non-admins so a low-privilege viewer
// cannot enumerate workloads outside their scope.
func (s *Server) listTopContainers(w http.ResponseWriter, r *http.Request) {
	limit := 5
	if raw := r.URL.Query().Get("limit"); raw != "" {
		if n, err := strconv.Atoi(raw); err == nil && n > 0 && n <= 50 {
			limit = n
		}
	}
	by := r.URL.Query().Get("by")
	if by != "memory" {
		by = "cpu"
	}

	// Samples must be at least as recent as max(2*interval, 2 minutes) so the
	// list reflects near-current load even when collection is sparse.
	window := topConsumerMinWindow
	if settings, err := s.store.GetSettings(); err == nil && settings.StatsIntervalSeconds > 0 {
		// Named "scaled" rather than "w" to avoid shadowing the ResponseWriter.
		if scaled := time.Duration(settings.StatsIntervalSeconds*2) * time.Second; scaled > window {
			window = scaled
		}
	}

	samples, err := s.store.ListAllRecentContainerStatsSamples(sinceTimestamp(window))
	if err != nil {
		slog.Error("failed to list container samples for top", "error", err)
		respondError(w, http.StatusInternalServerError, "failed to list samples")
		return
	}

	// Keep only the latest sample per container.
	latest := make(map[string]store.ContainerStatsSample, len(samples))
	for _, sm := range samples {
		if prev, ok := latest[sm.ContainerID]; !ok || sm.TS > prev.TS {
			latest[sm.ContainerID] = sm
		}
	}

	top := make([]store.ContainerStatsSample, 0, len(latest))
	for _, sm := range latest {
		top = append(top, sm)
	}

	sort.Slice(top, func(i, j int) bool {
		if by == "memory" {
			return top[i].MemoryUsage > top[j].MemoryUsage
		}
		return top[i].CPUPercent > top[j].CPUPercent
	})
	if len(top) > limit {
		top = top[:limit]
	}

	enriched := s.enrichWithOwnerNames(top)

	// Scrub container IDs for non-admins. The owner name is the actionable
	// identifier; the container ID is a host-level handle that reveals
	// workload existence to viewers who shouldn't have it.
	claims, _ := auth.ClaimsFromContext(r.Context())
	if claims.Role != "admin" {
		for i := range enriched {
			enriched[i].ContainerID = ""
		}
	}

	respondJSON(w, http.StatusOK, enriched)
}

// enrichWithOwnerNames attaches a human-readable owner name to each sample.
// Names are resolved through the containers index → workloads, which after
// the cutover is the only available lookup path.
func (s *Server) enrichWithOwnerNames(samples []store.ContainerStatsSample) []TopContainerSample {
	out := make([]TopContainerSample, len(samples))
	for i, sm := range samples {
		out[i] = TopContainerSample{ContainerStatsSample: sm}
		out[i].OwnerName = s.lookupInstanceName(sm.OwnerID)
	}
	return out
}

// lookupInstanceName returns "workload/role" for a container row. It falls
// back to the bare role when the workload lookup fails, and to an empty
// string when the container lookup fails, so a transient miss does not break
// the response.
func (s *Server) lookupInstanceName(instanceID string) string {
	c, err := s.store.GetContainerByID(instanceID)
	if err != nil {
		return ""
	}
	w, err := s.store.GetWorkloadByID(c.WorkloadID)
	if err != nil {
		if c.Role != "" {
			return c.Role
		}
		return ""
	}
	if c.Role != "" {
		return w.Name + "/" + c.Role
	}
	return w.Name
}