diff --git a/plans/observability-proxy-mgmt/CONTEXT.md b/plans/observability-proxy-mgmt/CONTEXT.md
deleted file mode 100644
index 025cb0e..0000000
--- a/plans/observability-proxy-mgmt/CONTEXT.md
+++ /dev/null
@@ -1,52 +0,0 @@
-# Feature Context: Observability & Proxy Management
-
-## Configuration
-- **Development mode:** Automated
-- **Execution mode:** Orchestrator
-- **Strategy:** Incremental
-- **Build (full):** `make build`
-- **Build (frontend):** `cd web && npm install && npm run build`
-- **Build (backend):** `go build -o docker-watcher ./cmd/server`
-- **Test:** `go test ./...`
-- **Lint (backend):** `go vet ./...`
-- **Lint (frontend):** `cd web && npm run check`
-- **Dev server:** `make dev` (port: 8080)
-
-## Current State
-Feature branch just created. No implementation yet. Codebase is fully working on main.
-
-## Temporary Workarounds
-(none yet)
-
-## Cross-Phase Dependencies
-- Phases 2 & 3 depend on Phase 1 (schema, event_log table, store methods)
-- Phases 4, 5, 6, 7 depend on their respective backend phases (1-3) for API endpoints
-- Phase 8 depends on Phases 1-3 for backend infrastructure and event system
-
-## Deferred Work
-(none yet)
-
-## Failed Approaches
-(none yet)
-
-## Review Findings Log
-(none yet)
-
-## Phase Execution Log
-| Phase | Agent Used | Test Writer | Parallel | Notes |
-|-------|-----------|-------------|----------|-------|
-| (none yet) | | | | |
-
-## Environment & Runtime Notes
-- Build is currently blocked on Go 1.25 transitive dep from Docker SDK - may need to use Go 1.24 toolchain
-- SQLite has MaxOpenConns=1, so all DB operations are serialized
-- Frontend is embedded into Go binary via embed.FS
-
-## Implementation Notes
-- Event bus (`internal/events/bus.go`) uses buffered channels (64 cap), non-blocking publish
-- NPM client (`internal/npm/client.go`) handles JWT auth with auto-refresh
-- Store uses additive migrations - new `ALTER TABLE` statements are appended to runMigrations(), errors ignored for idempotency
-- New tables use `CREATE TABLE IF NOT EXISTS` in the schema constant
-- All API responses use envelope pattern: `{success: bool, data?: T, error?: string}`
-- Frontend types in `web/src/lib/types.ts` mirror Go models
-- API functions centralized in `web/src/lib/api.ts`
diff --git a/plans/observability-proxy-mgmt/PLAN.md b/plans/observability-proxy-mgmt/PLAN.md
deleted file mode 100644
index 8f4ec19..0000000
--- a/plans/observability-proxy-mgmt/PLAN.md
+++ /dev/null
@@ -1,71 +0,0 @@
-# Feature: Observability & Proxy Management
-
-**Branch:** `feature/observability-proxy-mgmt`
-**Base branch:** `main`
-**Created:** 2026-03-30
-**Status:** 🟡 In Progress
-**Strategy:** Incremental
-**Mode:** Automated
-**Execution:** Orchestrator
-
-## Summary
-
-Extend Docker Watcher with four interconnected features: stale container detection,
-standalone proxy management with health monitoring, a unified proxy viewer, and a
-persistent event log - plus container stats and notification triggers.
-
-## Build & Test Commands
-- **Build (frontend):** `cd web && npm install && npm run build`
-- **Build (backend):** `go build -o docker-watcher ./cmd/server`
-- **Build (full):** `make build`
-- **Test (backend):** `go test ./...`
-- **Lint (backend):** `go vet ./...`
-- **Lint (frontend):** `cd web && npm run check`
-
-## Tech Stack Summary
-- **Backend:** Go 1.24, chi v5 router, SQLite (modernc.org/sqlite), Docker SDK (moby/moby/client)
-- **Frontend:** SvelteKit 2.15, Svelte 5, TypeScript 5.7, Tailwind CSS 4, Vite 6
-- **Real-time:** Server-Sent Events with auto-reconnect
-- **Auth:** JWT + optional OIDC
-- **Encryption:** AES-256-GCM for credentials
-
-## Project Conventions
-- **Go:** gofmt, small interfaces, error wrapping with `fmt.Errorf("context: %w", err)`, constructor injection
-- **DB:** Single-row settings, additive migrations via `ALTER TABLE` (errors ignored for idempotency), `CREATE TABLE IF NOT EXISTS` for new tables
-- **API:** Envelope pattern `{success, data?, error?}`, chi route groups, admin middleware for writes
-- **Frontend:** Svelte 5 runes ($state, $derived, $effect), TypeScript interfaces mirroring Go models, centralized api.ts, custom components (no UI library)
-- **Files:** Feature-organized, small focused files
-- **State:** Immutable patterns, no mutation
-
-## Phases
-
-- [ ] Phase 1: Schema, Models & Event Log Backend [domain: backend] → [subplan](./phase-1-schema-eventlog.md)
-- [ ] Phase 2: Stale Container Detection [domain: backend] → [subplan](./phase-2-stale-detection.md)
-- [ ] Phase 3: Direct Proxy Creation with Validation [domain: backend] → [subplan](./phase-3-proxy-creation.md)
-- [ ] Phase 4: Unified Proxy Viewer UI [domain: frontend] → [subplan](./phase-4-proxy-viewer.md)
-- [ ] Phase 5: Stale Containers UI [domain: frontend] → [subplan](./phase-5-stale-ui.md)
-- [ ] Phase 6: Direct Proxy Creation UI [domain: frontend] → [subplan](./phase-6-proxy-creation-ui.md)
-- [ ] Phase 7: Event Log UI [domain: frontend] → [subplan](./phase-7-eventlog-ui.md)
-- [ ] Phase 8: Container Stats & Notifications [domain: fullstack] → [subplan](./phase-8-stats-notifications.md)
-
-**Parallelizable phases:**
-- Phases 4, 5, 6, 7 are all frontend phases that touch different routes/components and can potentially run in parallel after all backend phases (1-3) complete.
-
-## Phase Progress Log
-
-| Phase | Domain | Status | Review | Build | Committed |
-|-------|--------|--------|--------|-------|-----------|
-| Phase 1: Schema & Event Log | backend | ⬜ Not Started | ⬜ | ⬜ | ⬜ |
-| Phase 2: Stale Detection | backend | ⬜ Not Started | ⬜ | ⬜ | ⬜ |
-| Phase 3: Proxy Creation | backend | ⬜ Not Started | ⬜ | ⬜ | ⬜ |
-| Phase 4: Proxy Viewer UI | frontend | ⬜ Not Started | ⬜ | ⬜ | ⬜ |
-| Phase 5: Stale Containers UI | frontend | ⬜ Not Started | ⬜ | ⬜ | ⬜ |
-| Phase 6: Proxy Creation UI | frontend | ⬜ Not Started | ⬜ | ⬜ | ⬜ |
-| Phase 7: Event Log UI | frontend | ⬜ Not Started | ⬜ | ⬜ | ⬜ |
-| Phase 8: Stats & Notifications | fullstack | ⬜ Not Started | ⬜ | ⬜ | ⬜ |
-
-## Final Review
-- [ ] Comprehensive code review
-- [ ] Full build passes
-- [ ] Full test suite passes
-- [ ] Merged to `main`
diff --git a/plans/observability-proxy-mgmt/phase-1-schema-eventlog.md b/plans/observability-proxy-mgmt/phase-1-schema-eventlog.md
deleted file mode 100644
index 247d673..0000000
--- a/plans/observability-proxy-mgmt/phase-1-schema-eventlog.md
+++ /dev/null
@@ -1,60 +0,0 @@
-# Phase 1: Schema, Models & Event Log Backend
-
-**Status:** ⬜ Not Started
-**Parent plan:** [PLAN.md](./PLAN.md)
-**Domain:** backend
-
-## Objective
-Lay the database foundation for all new features and implement the persistent event log system.
-
-## Tasks
-
-- [ ] Task 1: Add `event_log` table to schema (id INTEGER PK AUTOINCREMENT, source TEXT, severity TEXT, message TEXT, metadata TEXT JSON, created_at TEXT)
-- [ ] Task 2: Add `standalone_proxies` table to schema (id TEXT PK, domain TEXT UNIQUE, destination_url TEXT, destination_port INTEGER, ssl_certificate_id INTEGER, npm_proxy_id INTEGER, health_status TEXT, health_checked_at TEXT, created_at TEXT, updated_at TEXT)
-- [ ] Task 3: Add `stale_threshold_days` column to settings table (migration, default 7)
-- [ ] Task 4: Create `internal/store/eventlog.go` - store methods: InsertEvent, ListEvents (paginated, filterable by severity/source/date range), GetEventStats (counts by severity), PruneEvents (delete old entries)
-- [ ] Task 5: Create `internal/store/standalone_proxy.go` - store methods: CreateStandaloneProxy, GetStandaloneProxy, ListStandaloneProxies, UpdateStandaloneProxy, DeleteStandaloneProxy, UpdateProxyHealth
-- [ ] Task 6: Create Go models in `internal/store/models.go` - EventLog struct, StandaloneProxy struct
-- [ ] Task 7: Update settings model to include stale_threshold_days field; update GetSettings/SaveSettings
-- [ ] Task 8: Enhance event bus to auto-persist warn/error events - add a subscriber in events.Bus that writes to store
-- [ ] Task 9: Add API endpoints: `GET /api/events/log` (paginated, filterable), `GET /api/events/log/stats`
-- [ ] Task 10: Add new SSE event type `event_log` - broadcast persistent events in real-time
-- [ ] Task 11: Add frontend types: EventLogEntry, StandaloneProxy interfaces in types.ts
-- [ ] Task 12: Add API functions in api.ts: fetchEventLog, fetchEventLogStats
-
-## Files to Modify/Create
-- `internal/store/store.go` - Add schema for event_log, standalone_proxies tables; migration for stale_threshold_days
-- `internal/store/models.go` - Add EventLog, StandaloneProxy structs; update Settings struct
-- `internal/store/eventlog.go` - NEW: Event log store methods
-- `internal/store/standalone_proxy.go` - NEW: Standalone proxy store methods
-- `internal/store/settings.go` - Update GetSettings/SaveSettings for new field
-- `internal/events/bus.go` - Add persistent event subscriber
-- `internal/api/router.go` - Mount new event log routes
-- `internal/api/eventlog.go` - NEW: Event log HTTP handlers
-- `web/src/lib/types.ts` - Add EventLogEntry, StandaloneProxy types
-- `web/src/lib/api.ts` - Add fetchEventLog, fetchEventLogStats functions
-
-## Acceptance Criteria
-- event_log and standalone_proxies tables created on startup (migration is idempotent)
-- stale_threshold_days setting accessible via settings API
-- Events with warn/error severity auto-persisted from event bus
-- GET /api/events/log returns paginated, filterable results
-- GET /api/events/log/stats returns severity counts
-- Frontend types and API functions ready for downstream UI phases
-- Existing functionality unchanged - all current tests/builds pass
-
-## Notes
-- Follow existing migration pattern: ALTER TABLE errors ignored for idempotency
-- event_log metadata is a JSON TEXT column for flexible structured data
-- Pagination follows offset/limit pattern (no cursor - SQLite is simple enough)
-- Event log pruning can be called from a cron job later (Phase 8)
-
-## Review Checklist
-- [ ] All tasks completed
-- [ ] Code follows project conventions
-- [ ] No unintended side effects
-- [ ] Build passes
-- [ ] Tests pass (new + existing)
-
-## Handoff to Next Phase
-
diff --git a/plans/observability-proxy-mgmt/phase-2-stale-detection.md b/plans/observability-proxy-mgmt/phase-2-stale-detection.md
deleted file mode 100644
index aa10c15..0000000
--- a/plans/observability-proxy-mgmt/phase-2-stale-detection.md
+++ /dev/null
@@ -1,55 +0,0 @@
-# Phase 2: Stale Container Detection
-
-**Status:** ⬜ Not Started
-**Parent plan:** [PLAN.md](./PLAN.md)
-**Domain:** backend
-
-## Objective
-Implement a periodic scanner that detects containers managed by docker-watcher which have been non-running for more than N configurable days, and exposes them via API.
-
-## Tasks
-
-- [ ] Task 1: Create `internal/stale/scanner.go` - Scanner struct with dependencies (store, docker client, event bus)
-- [ ] Task 2: Implement scan logic: query all instances from store, check Docker container state via Docker SDK, compare against stale_threshold_days from settings
-- [ ] Task 3: Add `last_alive_at` column to instances table (migration) - updated when instance is seen running
-- [ ] Task 4: Update deployer/instance lifecycle to set last_alive_at when container starts/is seen running
-- [ ] Task 5: Implement stale detection: instance is stale if status != 'running' AND (now - last_alive_at) > threshold days
-- [ ] Task 6: Emit event_log warnings when containers become newly stale (avoid re-emitting for already-known stale containers)
-- [ ] Task 7: Register scanner as cron job (reuse existing robfig/cron infrastructure from registry poller)
-- [ ] Task 8: Add API endpoints: `GET /api/containers/stale` (list stale with project/stage info), `POST /api/containers/stale/{id}/cleanup` (remove single), `POST /api/containers/stale/cleanup` (bulk remove)
-- [ ] Task 9: Cleanup handler: stop container via Docker SDK, remove instance from store, emit event
-- [ ] Task 10: Wire scanner into main.go startup (after store, docker client, event bus init)
-
-## Files to Modify/Create
-- `internal/stale/scanner.go` - NEW: Stale container scanner
-- `internal/store/store.go` - Migration for last_alive_at column
-- `internal/store/models.go` - Update Instance struct with LastAliveAt field
-- `internal/store/instances.go` - Update queries to include last_alive_at; add UpdateLastAliveAt method
-- `internal/api/router.go` - Mount stale container routes
-- `internal/api/stale.go` - NEW: Stale container HTTP handlers
-- `cmd/server/main.go` - Wire scanner with cron
-
-## Acceptance Criteria
-- Scanner runs on configurable interval (e.g., every hour)
-- Stale containers correctly identified based on threshold
-- GET /api/containers/stale returns list with project name, stage name, image tag, last alive timestamp, days stale
-- Cleanup endpoints properly stop Docker containers and remove from store
-- Events emitted when containers become stale
-- Existing deploy flow unaffected - last_alive_at updated on successful deploy
-- Build passes, existing tests pass
-
-## Notes
-- Scanner should handle gracefully: containers that no longer exist in Docker (already removed externally)
-- Bulk cleanup should be admin-only
-- Consider: scan interval could be derived from stale_threshold_days (e.g., scan every threshold/7 days, min 1h)
-- Don't remove containers that are in 'removing' status (already being cleaned up)
-
-## Review Checklist
-- [ ] All tasks completed
-- [ ] Code follows project conventions
-- [ ] No unintended side effects
-- [ ] Build passes
-- [ ] Tests pass (new + existing)
-
-## Handoff to Next Phase
-
diff --git a/plans/observability-proxy-mgmt/phase-3-proxy-creation.md b/plans/observability-proxy-mgmt/phase-3-proxy-creation.md
deleted file mode 100644
index c713044..0000000
--- a/plans/observability-proxy-mgmt/phase-3-proxy-creation.md
+++ /dev/null
@@ -1,81 +0,0 @@
-# Phase 3: Direct Proxy Creation with Validation
-
-**Status:** ⬜ Not Started
-**Parent plan:** [PLAN.md](./PLAN.md)
-**Domain:** backend
-
-## Objective
-Implement standalone proxy creation with a multi-step validation pipeline that checks destination reachability, and periodic health monitoring for all standalone proxies.
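Reviewer sketch (not part of the deleted file): the multi-step validation pipeline named in this objective could be shaped roughly as below. This is a minimal sketch under stated assumptions: `StepResult`, `step`, `checkSyntax`, `checkTCP`, and `runPipeline` are hypothetical names, the pipeline stops at the first failing step (the plan does not say whether it should), and only the syntax and TCP steps are shown. The TCP step uses the standard library's `net.DialTimeout` with a 5 s timeout.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
	"time"
)

// StepResult records the outcome of one validation step (hypothetical type).
type StepResult struct {
	Name   string
	Passed bool
	Hint   string // diagnostic hint; empty when the step passed
}

// step is one check in the pipeline; run returns a hint when the check fails.
type step struct {
	name string
	run  func(host string, port int) (hint string, ok bool)
}

// checkSyntax rejects empty hosts and out-of-range ports without touching the network.
func checkSyntax(host string, port int) (string, bool) {
	if host == "" {
		return "Destination host is empty.", false
	}
	if port < 1 || port > 65535 {
		return "Port must be between 1 and 65535.", false
	}
	return "", true
}

// checkTCP tries to open a TCP connection within 5 seconds.
// A fuller version would distinguish refusal, timeout, and unreachable hosts.
func checkTCP(host string, port int) (string, bool) {
	addr := net.JoinHostPort(host, strconv.Itoa(port))
	conn, err := net.DialTimeout("tcp", addr, 5*time.Second)
	if err != nil {
		return fmt.Sprintf("Port %d is not accepting connections.", port), false
	}
	conn.Close()
	return "", true
}

// runPipeline executes the steps in order and stops at the first failure
// (an assumption of this sketch), returning one result per executed step.
func runPipeline(steps []step, host string, port int) []StepResult {
	results := make([]StepResult, 0, len(steps))
	for _, s := range steps {
		hint, ok := s.run(host, port)
		results = append(results, StepResult{Name: s.name, Passed: ok, Hint: hint})
		if !ok {
			break
		}
	}
	return results
}

func main() {
	steps := []step{{"syntax", checkSyntax}, {"tcp", checkTCP}}
	// Port 70000 is invalid, so the pipeline stops before any network call.
	for _, r := range runPipeline(steps, "127.0.0.1", 70000) {
		fmt.Printf("%s passed=%v hint=%q\n", r.Name, r.Passed, r.Hint)
	}
}
```

Keeping each check behind the same function signature would let DNS and HTTP probes be appended to the slice without changing `runPipeline`.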
-
-## Tasks
-
-- [ ] Task 1: Create `internal/proxy/validator.go` - validation pipeline:
-  - URL/port syntax validation
-  - DNS resolution check
-  - TCP port reachability (net.DialTimeout, 5s)
-  - HTTP health probe (GET to destination, 10s timeout)
-  - Returns structured ValidationResult with per-step pass/fail and diagnostic hints
-- [ ] Task 2: Create `internal/proxy/hints.go` - diagnostic hint generator:
-  - DNS failure → "Domain cannot be resolved. Check DNS settings or use an IP address."
-  - TCP refused → "Port {port} is not accepting connections. Check if the service is running and the port is correct."
-  - TCP timeout → "Connection timed out. Possible firewall blocking. Check network/firewall rules."
-  - Host unreachable → "Host is not reachable. Verify the IP address and network connectivity."
-  - HTTP error → "Service responded with HTTP {status}. The service may not be healthy."
-- [ ] Task 3: Create `internal/proxy/manager.go` - proxy lifecycle:
-  - CreateProxy: validate destination, create NPM proxy host (using npm.Client), assign SSL cert from settings, save to standalone_proxies table
-  - UpdateProxy: re-validate, update NPM proxy host, update store
-  - DeleteProxy: remove NPM proxy host, remove from store
-  - GetProxy/ListProxies: read from store with health status
-- [ ] Task 4: Create `internal/proxy/health.go` - periodic health monitor:
-  - Cron job that checks all standalone proxies
-  - HTTP GET to destination URL/port
-  - Updates health_status (healthy/unhealthy/unknown) and health_checked_at in store
-  - Emits event_log on status change (healthy→unhealthy or vice versa)
-- [ ] Task 5: Add API endpoints:
-  - `POST /api/proxies/validate` - run validation without creating
-  - `POST /api/proxies` - create standalone proxy
-  - `GET /api/proxies` - list standalone proxies
-  - `GET /api/proxies/{id}` - get single proxy
-  - `PUT /api/proxies/{id}` - update proxy
-  - `DELETE /api/proxies/{id}` - delete proxy
-  - `GET /api/proxies/all` - merged view: standalone + deploy-managed proxies (for Phase 4 UI)
-- [ ] Task 6: Wire health monitor cron job in main.go
-- [ ] Task 7: Add frontend API functions in api.ts: validateProxy, createProxy, listProxies, getProxy, updateProxy, deleteProxy, listAllProxies
-- [ ] Task 8: Add frontend types: ValidationResult, ValidationStep, ProxyHealthStatus
-
-## Files to Modify/Create
-- `internal/proxy/validator.go` - NEW: Validation pipeline
-- `internal/proxy/hints.go` - NEW: Diagnostic hints
-- `internal/proxy/manager.go` - NEW: Proxy lifecycle management
-- `internal/proxy/health.go` - NEW: Health monitoring
-- `internal/api/router.go` - Mount proxy routes
-- `internal/api/proxy.go` - NEW: Proxy HTTP handlers
-- `cmd/server/main.go` - Wire proxy manager and health monitor
-- `web/src/lib/types.ts` - Add ValidationResult, ProxyHealthStatus types
-- `web/src/lib/api.ts` - Add proxy API functions
-
-## Acceptance Criteria
-- Validation pipeline returns structured results with specific failure hints
-- POST /api/proxies/validate runs full check without side effects
-- Proxy creation creates NPM proxy host with SSL cert from global settings
-- Health monitor runs periodically and updates proxy status
-- Events emitted on health status changes
-- GET /api/proxies/all merges standalone and deploy-managed proxy data
-- Build passes, existing tests pass
-
-## Notes
-- Validation should be fast (short timeouts) - user waits for results
-- Health monitor interval: every 5 minutes (configurable later)
-- For /api/proxies/all: query NPM for all proxy hosts, join with instances table for managed proxies, join with standalone_proxies for standalone ones
-- SSL cert auto-assigned from settings.ssl_certificate_id
-- Consider: proxy domain must be unique across both standalone and managed proxies
-
-## Review Checklist
-- [ ] All tasks completed
-- [ ] Code follows project conventions
-- [ ] No unintended side effects
-- [ ] Build passes
-- [ ] Tests pass (new + existing)
-
-## Handoff to Next Phase
-
diff --git a/plans/observability-proxy-mgmt/phase-4-proxy-viewer.md b/plans/observability-proxy-mgmt/phase-4-proxy-viewer.md
deleted file mode 100644
index e77218c..0000000
--- a/plans/observability-proxy-mgmt/phase-4-proxy-viewer.md
+++ /dev/null
@@ -1,56 +0,0 @@
-# Phase 4: Unified Proxy Viewer UI
-
-**Status:** ⬜ Not Started
-**Parent plan:** [PLAN.md](./PLAN.md)
-**Domain:** frontend
-
-## Objective
-Build a unified proxy viewer page showing ALL proxies (deploy-managed and standalone) with grouping, filtering, and real-time health indicators.
-
-## Tasks
-
-- [ ] Task 1: Create route `/proxies` with `+page.svelte` and `+page.ts` data loader
-- [ ] Task 2: Create ProxyCard component - displays: domain, destination, SSL badge, health indicator (green/yellow/red dot), proxy type badge (managed/standalone), last health check timestamp
-- [ ] Task 3: Create ProxyGroup component - collapsible section with project name header, stage sub-groups, proxy count badge
-- [ ] Task 4: Create StandaloneProxyGroup component - separate collapsible section for user-created proxies
-- [ ] Task 5: Implement filtering: by project, stage, health status (healthy/unhealthy/unknown), proxy type (managed/standalone), free-text search by domain/destination
-- [ ] Task 6: Filter bar component with dropdown selects and search input
-- [ ] Task 7: SSE integration - subscribe to proxy health events, update health indicators in real-time
-- [ ] Task 8: Empty state - friendly message when no proxies exist, with link to create one
-- [ ] Task 9: Add navigation link in sidebar layout (+layout.svelte)
-- [ ] Task 10: Add i18n keys for proxy viewer page
-
-## Files to Modify/Create
-- `web/src/routes/proxies/+page.svelte` - NEW: Proxy viewer page
-- `web/src/routes/proxies/+page.ts` - NEW: Data loader
-- `web/src/lib/components/ProxyCard.svelte` - NEW: Individual proxy display
-- `web/src/lib/components/ProxyGroup.svelte` - NEW: Collapsible project/stage group
-- `web/src/lib/components/ProxyFilter.svelte` - NEW: Filter bar
-- `web/src/routes/+layout.svelte` - Add proxies nav link
-- `web/src/lib/i18n/en.ts` (or equivalent) - Add proxy viewer strings
-
-## Acceptance Criteria
-- All proxies visible: both deploy-managed and standalone
-- Proxies grouped by project/stage in collapsible sections
-- Health indicators show real-time status (green=healthy, red=unhealthy, yellow=unknown)
-- Filtering works: project, stage, health, type, text search
-- SSE updates health indicators without page refresh
-- Navigation accessible from sidebar
-- Responsive layout (mobile-friendly)
-
-## Notes
-- Use existing component patterns (ConfirmDialog, FormField styles, etc.)
-- Follow existing Svelte 5 patterns ($state, $derived, $effect)
-- The /api/proxies/all endpoint from Phase 3 provides the data source
-- Health indicator should pulse/animate briefly on status change
-- Consider: show proxy count in sidebar nav badge
-
-## Review Checklist
-- [ ] All tasks completed
-- [ ] Code follows project conventions
-- [ ] No unintended side effects
-- [ ] Build passes
-- [ ] Tests pass (new + existing)
-
-## Handoff to Next Phase
-
diff --git a/plans/observability-proxy-mgmt/phase-5-stale-ui.md b/plans/observability-proxy-mgmt/phase-5-stale-ui.md
deleted file mode 100644
index 28adfb6..0000000
--- a/plans/observability-proxy-mgmt/phase-5-stale-ui.md
+++ /dev/null
@@ -1,55 +0,0 @@
-# Phase 5: Stale Containers UI
-
-**Status:** ⬜ Not Started
-**Parent plan:** [PLAN.md](./PLAN.md)
-**Domain:** frontend
-
-## Objective
-Build the stale containers dashboard widget and dedicated view, with cleanup actions and settings configuration.
-
-## Tasks
-
-- [ ] Task 1: Add API functions in api.ts: fetchStaleContainers, cleanupStaleContainer, bulkCleanupStaleContainers
-- [ ] Task 2: Create StaleContainerCard component - shows: container name, project, stage, image tag, last alive timestamp, "X days stale" badge (color-coded by severity)
-- [ ] Task 3: Create stale containers section on dashboard (+page.svelte) - count badge, mini-list of top 5 offenders, "View all" link
-- [ ] Task 4: Create dedicated route `/containers/stale` with full stale container list
-- [ ] Task 5: Individual cleanup action - ConfirmDialog with warning, calls cleanup API
-- [ ] Task 6: Bulk cleanup action - "Clean up all" button with confirmation, progress indicator
-- [ ] Task 7: Settings integration - add stale_threshold_days field to settings page with validation (min 1 day)
-- [ ] Task 8: Add navigation link or sub-nav for stale containers
-- [ ] Task 9: Add i18n keys for stale containers
-
-## Files to Modify/Create
-- `web/src/lib/api.ts` - Add stale container API functions
-- `web/src/lib/types.ts` - Add StaleContainer interface
-- `web/src/lib/components/StaleContainerCard.svelte` - NEW: Stale container display
-- `web/src/routes/+page.svelte` - Add stale containers dashboard widget
-- `web/src/routes/containers/stale/+page.svelte` - NEW: Dedicated stale view
-- `web/src/routes/containers/stale/+page.ts` - NEW: Data loader
-- `web/src/routes/settings/+page.svelte` - Add stale threshold setting field
-- `web/src/routes/+layout.svelte` - Add nav link if needed
-
-## Acceptance Criteria
-- Dashboard shows stale container count and top offenders
-- Dedicated page lists all stale containers with details
-- Individual cleanup removes container with confirmation
-- Bulk cleanup works with progress feedback
-- Settings page allows configuring stale threshold
-- Severity coloring: 7-14 days = yellow, 14+ days = red
-- Responsive layout
-
-## Notes
-- Reuse existing ConfirmDialog for destructive actions
-- Dashboard widget should not slow down initial page load (lazy load or small payload)
-- Stale container data comes from GET /api/containers/stale (Phase 2)
-- Settings update uses existing PUT /api/settings endpoint
-
-## Review Checklist
-- [ ] All tasks completed
-- [ ] Code follows project conventions
-- [ ] No unintended side effects
-- [ ] Build passes
-- [ ] Tests pass (new + existing)
-
-## Handoff to Next Phase
-
diff --git a/plans/observability-proxy-mgmt/phase-6-proxy-creation-ui.md b/plans/observability-proxy-mgmt/phase-6-proxy-creation-ui.md
deleted file mode 100644
index 7ccf7df..0000000
--- a/plans/observability-proxy-mgmt/phase-6-proxy-creation-ui.md
+++ /dev/null
@@ -1,54 +0,0 @@
-# Phase 6: Direct Proxy Creation UI
-
-**Status:** ⬜ Not Started
-**Parent plan:** [PLAN.md](./PLAN.md)
-**Domain:** frontend
-
-## Objective
-Build the proxy creation form with live validation feedback, diagnostic hints, and management actions (edit/delete).
-
-## Tasks
-
-- [ ] Task 1: Create "Create Proxy" form component - fields: destination URL/IP, port, domain (auto-suggested from subdomain pattern), optional custom subdomain override
-- [ ] Task 2: Live validation - debounced calls to POST /api/proxies/validate as user types (300ms debounce)
-- [ ] Task 3: Validation result display - step-by-step checklist with icons:
-  - ✅ DNS resolution OK / ❌ DNS resolution failed
-  - ✅ TCP port reachable / ❌ TCP port not reachable
-  - ✅ HTTP responding / ❌ HTTP not responding
-  - Each failure shows the diagnostic hint from the backend
-- [ ] Task 4: Create proxy submission - calls POST /api/proxies, shows success toast with health indicator
-- [ ] Task 5: Edit proxy - modal or inline form, pre-populated with current values, re-validates on save
-- [ ] Task 6: Delete proxy - ConfirmDialog with domain name confirmation
-- [ ] Task 7: Integration with proxy viewer page - "Create Proxy" button in the proxy viewer header
-- [ ] Task 8: Domain auto-suggestion - when user enters destination, suggest domain based on subdomain_pattern from settings
-- [ ] Task 9: Add i18n keys for proxy creation
-
-## Files to Modify/Create
-- `web/src/lib/components/ProxyForm.svelte` - NEW: Create/edit proxy form
-- `web/src/lib/components/ValidationChecklist.svelte` - NEW: Step-by-step validation display
-- `web/src/routes/proxies/+page.svelte` - Add "Create Proxy" button and modal/panel
-- `web/src/lib/api.ts` - Ensure validateProxy, createProxy, updateProxy, deleteProxy are present (from Phase 3)
-
-## Acceptance Criteria
-- Form validates destination in real-time with debouncing
-- Each validation step shows pass/fail with diagnostic hints
-- Proxy creation works end-to-end (form → API → NPM → success)
-- Edit and delete work for existing standalone proxies
-- Domain auto-suggestion works from settings pattern
-- Error states handled gracefully (network errors, API failures)
-
-## Notes
-- Validation should show a loading spinner while in progress
-- Don't validate on every keystroke - use 300ms debounce
-- If all validation steps fail, still allow creation (user might know better - just warn)
-- SSL certificate is applied automatically from global settings (no cert picker in form)
-
-## Review Checklist
-- [ ] All tasks completed
-- [ ] Code follows project conventions
-- [ ] No unintended side effects
-- [ ] Build passes
-- [ ] Tests pass (new + existing)
-
-## Handoff to Next Phase
-
diff --git a/plans/observability-proxy-mgmt/phase-7-eventlog-ui.md b/plans/observability-proxy-mgmt/phase-7-eventlog-ui.md
deleted file mode 100644
index d17e39e..0000000
--- a/plans/observability-proxy-mgmt/phase-7-eventlog-ui.md
+++ /dev/null
@@ -1,54 +0,0 @@
-# Phase 7: Event Log UI
-
-**Status:** ⬜ Not Started
-**Parent plan:** [PLAN.md](./PLAN.md)
-**Domain:** frontend
-
-## Objective
-Build a persistent, searchable event log viewer with real-time streaming, filters, and resource linking.
-
-## Tasks
-
-- [ ] Task 1: Create route `/events` with `+page.svelte` and `+page.ts` data loader
-- [ ] Task 2: Create EventLogEntry component - timestamp, severity badge (info=blue, warn=yellow, error=red), source icon (container/proxy/deploy/system), message text, expandable metadata section
-- [ ] Task 3: Create EventLogFilter component - filters: severity multi-select, source multi-select, date range picker (start/end), free-text search
-- [ ] Task 4: Implement pagination - "Load more" button at bottom (offset/limit pattern matching API)
-- [ ] Task 5: SSE integration - subscribe to event_log events, prepend new entries at top with subtle highlight animation
-- [ ] Task 6: Quick actions - clickable links to related resources (e.g., click container name → go to project/stage, click proxy domain → go to proxy viewer)
-- [ ] Task 7: Stats header - show counts by severity (from GET /api/events/log/stats), with colored badges
-- [ ] Task 8: Add navigation link in sidebar
-- [ ] Task 9: Add i18n keys for event log page
-
-## Files to Modify/Create
-- `web/src/routes/events/+page.svelte` - NEW: Event log page
-- `web/src/routes/events/+page.ts` - NEW: Data loader
-- `web/src/lib/components/EventLogEntry.svelte` - NEW: Event entry display
-- `web/src/lib/components/EventLogFilter.svelte` - NEW: Filter controls
-- `web/src/routes/+layout.svelte` - Add events nav link
-- `web/src/lib/sse.ts` - Add event_log SSE subscription helper (if needed)
-
-## Acceptance Criteria
-- Event log shows all persistent events with severity and source
-- Filters work: severity, source, date range, text search
-- New events stream in real-time via SSE without page refresh
-- Pagination loads older events on demand
-- Quick actions link to related resources
-- Stats header shows severity distribution
-- Responsive layout
-
-## Notes
-- Follow existing SSE patterns from deploy logs viewer
-- Date range filter: consider "last hour", "last 24h", "last 7 days" presets + custom range
-- Metadata section is JSON - render as formatted key-value pairs, not raw JSON
-- Resource linking: parse source and metadata to construct navigation URLs
-- Consider: auto-scroll to top when new event arrives (if user is at top), otherwise show "N new events" badge
-
-## Review Checklist
-- [ ] All tasks completed
-- [ ] Code follows project conventions
-- [ ] No unintended side effects
-- [ ] Build passes
-- [ ] Tests pass (new + existing)
-
-## Handoff to Next Phase
-
diff --git a/plans/observability-proxy-mgmt/phase-8-stats-notifications.md b/plans/observability-proxy-mgmt/phase-8-stats-notifications.md
deleted file mode 100644
index 857236b..0000000
--- a/plans/observability-proxy-mgmt/phase-8-stats-notifications.md
+++ /dev/null
@@ -1,67 +0,0 @@
-# Phase 8: Container Stats & Notifications
-
-**Status:** ⬜ Not Started
-**Parent plan:** [PLAN.md](./PLAN.md)
-**Domain:** fullstack
-
-## Objective
-Add container resource monitoring (CPU/memory), notification triggers for operational events, and a system health dashboard summary.
-
-## Tasks
-
-- [ ] Task 1: Create `internal/docker/stats.go` - wrapper around Docker Stats API to get CPU %, memory usage/limit for a container
-- [ ] Task 2: Add API endpoint: `GET /api/projects/{id}/stages/{stage}/instances/{iid}/stats` - returns current CPU/memory for an instance
-- [ ] Task 3: Create SSE event type `container_stats` - periodically broadcast stats for running containers (every 30s)
-- [ ] Task 4: Extend notification stub (`internal/notify/`) - implement webhook sender for events:
-  - Stale container detected
-  - Proxy health failure
-  - Deploy failure/rollback
-  - Format: JSON payload with event type, details, timestamp
-- [ ] Task 5: Add notification settings UI - enable/disable per event type in settings page
-- [ ] Task 6: Update instance cards in frontend - show CPU % bar and memory usage badge
-- [ ] Task 7: Create ContainerStats component - mini CPU/memory visualization (progress bars)
-- [ ] Task 8: Dashboard system health summary card - total containers (running/stopped), healthy/unhealthy proxies, recent error count (last 24h)
-- [ ] Task 9: Wire notification sender to event bus - subscribe to relevant event types, fire notifications
-- [ ] Task 10: Add event log pruning cron job - delete events older than 30 days (configurable)
-- [ ] Task 11: Add i18n keys for stats and notifications
-
-## Files to Modify/Create
-- `internal/docker/stats.go` - NEW: Docker Stats API wrapper
-- `internal/api/stats.go` - NEW: Stats HTTP handler
-- `internal/api/router.go` - Mount stats endpoint
-- `internal/notify/sender.go` - Implement webhook notification sender
-- `internal/notify/types.go` - NEW: Notification event types and payloads
-- `cmd/server/main.go` - Wire notification subscriber and event pruning cron
-- `web/src/lib/types.ts` - Add ContainerStats, NotificationSettings types
-- `web/src/lib/api.ts` - Add fetchContainerStats function
-- `web/src/lib/components/ContainerStats.svelte` - NEW: CPU/memory display
-- `web/src/lib/components/SystemHealthCard.svelte` - NEW: Dashboard summary
-- `web/src/routes/+page.svelte` - Add system health card to dashboard
-- `web/src/routes/settings/+page.svelte` - Add notification settings section
-- `web/src/lib/sse.ts` - Add container_stats SSE handler
-
-## Acceptance Criteria
-- Container stats (CPU/memory) visible on instance cards
-- Stats update in real-time via SSE
-- Webhook notifications fire for configured event types
-- Dashboard shows system health summary
-- Event log auto-prunes old entries
-- Settings page allows configuring notification preferences
-- Build passes, existing tests pass
-
-## Notes
-- Docker Stats API returns a stream - read one snapshot and close, don't hold the connection
-- CPU calculation: (container CPU delta / system CPU delta) * 100 - needs two reads
-- Memory: usage_bytes / limit_bytes * 100 for percentage
-- Notification webhook format should be compatible with common receivers (Slack webhook, Discord webhook, generic HTTP)
-- System health card: consider caching aggregated stats to avoid N+1 queries on dashboard load
-
-## Review Checklist
-- [ ] All tasks completed
-- [ ] Code follows project conventions
-- [ ] No unintended side effects
-- [ ] Build passes
-- [ ] Tests pass (new + existing)
-
-## Handoff to Next Phase
-
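Reviewer sketch (not part of the deleted files): the Phase 8 notes give the CPU formula as (container CPU delta / system CPU delta) * 100 over two reads, and memory as usage_bytes / limit_bytes * 100. A minimal sketch of that arithmetic follows, assuming hypothetical names `cpuSample`, `cpuPercent`, and `memoryPercent` and cumulative counters as inputs; it does not call the Docker SDK. Note that the `docker stats` CLI additionally scales the CPU figure by the number of online CPUs, which the plan's formula omits, so the numbers may differ from the CLI on multi-core hosts.

```go
package main

import "fmt"

// cpuSample holds cumulative CPU counters from one stats read (hypothetical type).
type cpuSample struct {
	ContainerNanos uint64 // total CPU time consumed by the container
	SystemNanos    uint64 // total CPU time consumed by the host
}

// cpuPercent implements (container CPU delta / system CPU delta) * 100
// across two reads. It returns 0 when the counters did not advance,
// which also avoids division by zero and unsigned underflow.
func cpuPercent(prev, cur cpuSample) float64 {
	if cur.SystemNanos <= prev.SystemNanos || cur.ContainerNanos < prev.ContainerNanos {
		return 0
	}
	containerDelta := float64(cur.ContainerNanos - prev.ContainerNanos)
	systemDelta := float64(cur.SystemNanos - prev.SystemNanos)
	return containerDelta / systemDelta * 100
}

// memoryPercent implements usage_bytes / limit_bytes * 100,
// returning 0 when no limit is reported.
func memoryPercent(usageBytes, limitBytes uint64) float64 {
	if limitBytes == 0 {
		return 0
	}
	return float64(usageBytes) / float64(limitBytes) * 100
}

func main() {
	prev := cpuSample{ContainerNanos: 100, SystemNanos: 1000}
	cur := cpuSample{ContainerNanos: 200, SystemNanos: 1200}

	fmt.Printf("cpu=%.1f%% mem=%.1f%%\n", cpuPercent(prev, cur), memoryPercent(256, 1024))
}
```

Keeping the arithmetic separate from the code that reads the stats stream means the "read one snapshot and close" concern in the notes can be handled and tested independently.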