tiny-forge/plans/observability-proxy-mgmt/phase-8-stats-notifications.md
alexei.dolgolyov c38b7d4c78 feat(observability): phase 1 - schema, models & event log backend
2026-03-30 10:59:13 +03:00


Phase 8: Container Stats & Notifications

Status: Not Started
Parent plan: PLAN.md
Domain: fullstack

Objective

Add container resource monitoring (CPU/memory), notification triggers for operational events, and a system health dashboard summary.

Tasks

  • Task 1: Create internal/docker/stats.go — wrapper around Docker Stats API to get CPU %, memory usage/limit for a container
  • Task 2: Add API endpoint: GET /api/projects/{id}/stages/{stage}/instances/{iid}/stats — returns current CPU/memory for an instance
  • Task 3: Create SSE event type container_stats — periodically broadcast stats for running containers (every 30s)
  • Task 4: Extend notification stub (internal/notify/) — implement webhook sender for events:
    • Stale container detected
    • Proxy health failure
    • Deploy failure/rollback
    • Format: JSON payload with event type, details, timestamp
  • Task 5: Add notification settings UI — enable/disable per event type in settings page
  • Task 6: Update instance cards in frontend — show CPU % bar and memory usage badge
  • Task 7: Create ContainerStats component — mini CPU/memory visualization (progress bars)
  • Task 8: Dashboard system health summary card — total containers (running/stopped), healthy/unhealthy proxies, recent error count (last 24h)
  • Task 9: Wire notification sender to event bus — subscribe to relevant event types, fire notifications
  • Task 10: Add event log pruning cron job — delete events older than 30 days (configurable)
  • Task 11: Add i18n keys for stats and notifications

Files to Modify/Create

  • internal/docker/stats.go — NEW: Docker Stats API wrapper
  • internal/api/stats.go — NEW: Stats HTTP handler
  • internal/api/router.go — Mount stats endpoint
  • internal/notify/sender.go — Implement webhook notification sender
  • internal/notify/types.go — NEW: Notification event types and payloads
  • cmd/server/main.go — Wire notification subscriber and event pruning cron
  • web/src/lib/types.ts — Add ContainerStats, NotificationSettings types
  • web/src/lib/api.ts — Add fetchContainerStats function
  • web/src/lib/components/ContainerStats.svelte — NEW: CPU/memory display
  • web/src/lib/components/SystemHealthCard.svelte — NEW: Dashboard summary
  • web/src/routes/+page.svelte — Add system health card to dashboard
  • web/src/routes/settings/+page.svelte — Add notification settings section
  • web/src/lib/sse.ts — Add container_stats SSE handler

Acceptance Criteria

  • Container stats (CPU/memory) visible on instance cards
  • Stats update live via SSE without a page reload (broadcast every 30s)
  • Webhook notifications fire for configured event types
  • Dashboard shows system health summary
  • Event log auto-prunes old entries
  • Settings page allows configuring notification preferences
  • Build passes, existing tests pass

Notes

  • Docker Stats API returns a stream — read one snapshot and close, don't hold the connection
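  • CPU calculation: (container CPU delta / system CPU delta) * 100. This needs two samples; a non-streamed stats response requested without one-shot already carries the previous sample in precpu_stats, so a single read can be enough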
  • Memory: usage_bytes / limit_bytes * 100 for percentage
  • Notification webhook format should be compatible with common receivers (Slack webhook, Discord webhook, generic HTTP)
  • System health card: consider caching aggregated stats to avoid N+1 queries on dashboard load

Review Checklist

  • All tasks completed
  • Code follows project conventions
  • No unintended side effects
  • Build passes
  • Tests pass (new + existing)

Handoff to Next Phase