fix: harden security, fix concurrency bugs, and address review findings
Build / build (push) Successful in 11m42s
Security:
- rate limit /api/webhook routes per-IP and cap concurrent site syncs
- global SSE connection cap (256) with new sse_gate
- validate ?tail= and cap JSON log responses at 4 MiB
- strip ANSI/CSI/OSC and control bytes from streamed log lines
- redact webhook secret from request log middleware
- scrub host details from /api/health for non-admin viewers
- drop container_id from /api/system/stats/top for non-admins
- generate webhook secrets via crypto/rand; require >=32 chars on insert
- verify iid path consistency in streamContainerLogs
- LimitReader on site webhook body; reject malformed non-empty bodies

Concurrency / correctness:
- stats collector: Stop() no longer hangs without Start(), semaphore
  acquired in parent loop so ctx cancellation short-circuits the queue,
  in-flight tick cancellable via shared base context, zero-ts guard
- webhook handler: replace fire-and-forget goroutine with
  WaitGroup-tracked workers + Drain() wired into graceful shutdown
- $derived(() => ...) mis-idiom fixed in ContainerStats / InstanceCard /
  ProjectCard (returned function instead of value)
- SystemResourcesCard: rename `window` and `t` locals to avoid shadowing
  globalThis.window and the i18n `t` import

Quality / performance:
- replace O(n^2) insertion sort with sort.Slice in stats top
- runMigrations only swallows duplicate-column / already-exists errors
- PruneStatsSamplesBefore wrapped in a transaction
- collapse N+1 in unusedImageStats / pruneImages to one ListAllInstances
  pass; surface DB errors instead of silently treating them as inactive
- run Docker Info + DiskUsage in parallel via errgroup
- container log SSE emits `: ping` heartbeat every 20 s
- imageMatches case-insensitive on registry host (RFC behaviour)
- log warning on invalid stage tag pattern instead of silent skip
- reject malformed non-empty site webhook payloads

Frontend / i18n:
- shared formatBytes utility replaces three local copies
- statsInterval store drives dynamic "no samples / collection disabled"
  copy across ContainerStats and SystemResourcesCard
- top consumers row now shows owner_name (project/stage or site name)
- drop seven `as any` casts on the Settings type; add
  cloudflare_api_token write-only field
- move "Service status", "Docker daemon", "Docker unreachable",
  "Proxy unreachable", "reachable", and "Docker daemon is not reachable."
  strings into en/ru i18n bundles
+64
-8
@@ -5,20 +5,57 @@ import (
 	"net/http"
 	"time"
 
+	"github.com/alexei/tinyforge/internal/auth"
 	"github.com/alexei/tinyforge/internal/proxy"
 )
 
+// healthProbeTimeout caps a single health probe so a stuck dependency does
+// not hold the polling endpoint open. The UI polls every 30 s, so 8 s leaves
+// headroom for the ping + Info + NPM list calls.
+const healthProbeTimeout = 8 * time.Second
+
+// nonAdminDockerFields enumerates the fields any authenticated user is
+// allowed to see — version + connectivity + container counts. Host-detail
+// fields (kernel, root_dir, hostname, OS, storage driver) are admin-only to
+// avoid recon information leaks.
+var nonAdminDockerFields = map[string]bool{
+	"connected":    true,
+	"latency_ms":   true,
+	"error":        true,
+	"version":      true,
+	"api_version":  true,
+	"containers":   true,
+	"running":      true,
+	"paused":       true,
+	"stopped":      true,
+	"images":       true,
+	"ncpu":         true,
+	"memory_total": true,
+}
+
+// nonAdminProxyFields are the proxy fields safe to share with non-admins.
+// Configured URLs and aggregate counts of internal lists/certs are stripped.
+var nonAdminProxyFields = map[string]bool{
+	"provider":            true,
+	"connected":           true,
+	"latency_ms":          true,
+	"error":               true,
+	"proxy_hosts_managed": true,
+}
+
 // getHealth handles GET /api/health.
 //
-// Returns the connectivity state and (when connected) rich diagnostics for the
-// Docker daemon and the active proxy provider. This endpoint is polled by the
-// UI every 30 seconds — keep the calls cheap. The expensive NPM list calls
-// are only issued when the initial ping succeeds, so a down proxy never
-// amplifies latency.
+// Returns the connectivity state and (when connected) diagnostics for the
+// Docker daemon and the active proxy provider. Detailed host information
+// (kernel, root_dir, internal NPM URL, …) is stripped for non-admin users to
+// avoid leaking infrastructure details to read-only viewers.
 func (s *Server) getHealth(w http.ResponseWriter, r *http.Request) {
-	ctx, cancel := context.WithTimeout(r.Context(), 8*time.Second)
+	ctx, cancel := context.WithTimeout(r.Context(), healthProbeTimeout)
 	defer cancel()
 
+	claims, _ := auth.ClaimsFromContext(r.Context())
+	isAdmin := claims.Role == "admin"
+
 	now := time.Now().UTC().Format(time.RFC3339)
 	result := map[string]any{
 		"checked_at": now,
@@ -32,16 +69,35 @@ func (s *Server) getHealth(w http.ResponseWriter, r *http.Request) {
 	}
 
 	// ── Docker daemon ────────────────────────────────────────────────
-	result["docker"] = s.dockerHealth(ctx)
+	docker := s.dockerHealth(ctx)
+	if !isAdmin {
+		docker = filterFields(docker, nonAdminDockerFields)
+	}
+	result["docker"] = docker
 
 	// ── Proxy provider ───────────────────────────────────────────────
 	if s.proxyProvider != nil {
-		result["proxy"] = s.proxyHealth(ctx)
+		proxyInfo := s.proxyHealth(ctx)
+		if !isAdmin {
+			proxyInfo = filterFields(proxyInfo, nonAdminProxyFields)
+		}
+		result["proxy"] = proxyInfo
 	}
 
 	respondJSON(w, http.StatusOK, result)
 }
 
+// filterFields returns a copy of m containing only the keys present in allow.
+func filterFields(m map[string]any, allow map[string]bool) map[string]any {
+	out := make(map[string]any, len(allow))
+	for k, v := range m {
+		if allow[k] {
+			out[k] = v
+		}
+	}
+	return out
+}
+
 // dockerHealth probes the Docker daemon and, if reachable, attaches a full
 // DaemonInfo snapshot. The caller does not need to error-check the Info()
 // call — if it fails, the connected flag remains true (ping succeeded) but
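
To make the effect of the allow-list filtering concrete, here is a small standalone sketch. `filterFields` is copied from the diff above; the sample payload values, the shortened allow-list, and the `sortedKeys` helper are made up for illustration and are not part of the commit.

```go
package main

import (
	"fmt"
	"sort"
)

// filterFields returns a copy of m containing only the keys present in allow.
// (Copied from the diff above.)
func filterFields(m map[string]any, allow map[string]bool) map[string]any {
	out := make(map[string]any, len(allow))
	for k, v := range m {
		if allow[k] {
			out[k] = v
		}
	}
	return out
}

// sortedKeys is a hypothetical helper that lists map keys in a stable order,
// since Go map iteration order is randomized.
func sortedKeys(m map[string]any) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	// Illustrative payload; real dockerHealth output is not shown in this diff.
	docker := map[string]any{
		"connected": true,
		"version":   "0.0.0-example",
		"kernel":    "example-kernel",
		"hostname":  "example-host",
	}
	// Shortened stand-in for nonAdminDockerFields.
	allow := map[string]bool{"connected": true, "version": true, "latency_ms": true}

	filtered := filterFields(docker, allow)
	fmt.Println(sortedKeys(filtered)) // prints [connected version]
	fmt.Println(len(docker))          // prints 4: the input map is not modified
}
```

Because the filter is an allow-list, any field later added to the health payload stays hidden from non-admins until it is explicitly listed. Keys that are allowed but absent from the payload (`latency_ms` here) are simply not emitted.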