Commit Graph

42 Commits

Author SHA1 Message Date
alexei.dolgolyov a9c7775bb7 feat: configuration backup management with manual and auto backup
Add backup/restore functionality for the SQLite database. Users can
trigger manual backups, configure automatic backups on an interval
with retention policies, list/download/delete backups, and restore
from any backup.

- Backup engine using VACUUM INTO (safe with WAL mode)
- Backup metadata tracked in DB, files stored in DATA_DIR/backups/
- Settings: backup_enabled, backup_interval_hours, backup_retention_count
- API: POST/GET/DELETE /api/backups, download, restore endpoints
- Autobackup via cron scheduler with configurable interval
- Retention: prune on startup, after each backup (manual and auto)
- Orphan cleanup: removes backup files without metadata on startup
- Restore: replaces DB and triggers graceful server shutdown
- Settings UI: /settings/backup with toggle, interval, retention config
- Backup list with download, delete, restore actions
- i18n: English and Russian translations
2026-04-02 15:32:15 +03:00
alexei.dolgolyov 6bb4781158 fix: address final review findings
- CRITICAL: Add binaries and .svelte-kit/ to .gitignore, remove from tracking
- HIGH: Return error from computeExpectedFQDNs to prevent mass DNS
  deletion on transient DB errors during sync
- MEDIUM: Log error in rollback DNS cleanup when GetSettings fails
2026-04-02 15:04:53 +03:00
alexei.dolgolyov 670948f113 fix: address code review findings for DNS management
- CRITICAL: Change DNS zones endpoint from GET to POST to avoid
  leaking API token in URL query parameters
- HIGH: Add sync.RWMutex to protect dnsProvider field in Server,
  Deployer, and proxy Manager against concurrent read/write races
- HIGH: Capture old DNS provider reference synchronously before
  launching background cleanup goroutine
- HIGH: Use getDNS()/getDNSProviderLocked() accessors instead of
  direct field reads in all DNS operations
2026-04-02 14:54:15 +03:00
alexei.dolgolyov c730cfaa45 feat: Cloudflare DNS management with automatic record sync
Add flexible DNS management to Docker Watcher. By default, wildcard DNS
is assumed (current behavior). When disabled, users can configure a
Cloudflare DNS provider with API token and zone selection. DNS A records
are automatically created/updated/deleted in sync with proxy consumers
(deployed instances and standalone proxies).

- Settings: wildcard_dns toggle, dns_provider, cloudflare credentials
- Cloudflare client: Provider interface with EnsureRecord/DeleteRecord/ListRecords
- DNS lifecycle hooks in deployer and proxy manager (best-effort)
- Settings UI: DNS config section with provider picker, zone selector, test button
- DNS Records page at /dns with filtering, sync status, reconciliation
- Records visible in both wildcard and managed modes
- Cleanup on provider change: removes old records when switching modes
2026-04-02 14:49:21 +03:00
alexei.dolgolyov 582e7e39e3 feat(volume-browser): absolute scope with allowlist security
- Add 'absolute' volume scope for direct host paths (NFS, external mounts)
- Allowlist in settings: allowed_volume_paths (JSON array of prefixes)
- Validation: absolute source must be under an allowed prefix
- Empty allowlist = absolute scope disabled entirely
- Settings API exposes/validates allowed_volume_paths
- Frontend type updated with absolute scope
2026-04-01 23:31:27 +03:00
alexei.dolgolyov 0491849f0f fix(volume-browser): address security review findings
Critical fixes:
- IDOR: verify volume belongs to project before resolving path
- Upload: override global 1MB body limit for upload endpoint (100MB)

High-priority fixes:
- Symlink escape: use filepath.EvalSymlinks in safePath validation
- Remove host filesystem path from browse API response
- Sanitize Content-Disposition filenames, force application/octet-stream
- Strip directory components from upload filenames
2026-04-01 23:17:35 +03:00
alexei.dolgolyov 4a0f223d61 feat(volume-browser): phase 1 - path resolver & file system API
- Extract volume path resolution into shared internal/volume/resolver.go
- File browser operations: ListDir, OpenFile, WriteZip, SaveFile
- Strict path traversal protection (double-validated)
- API endpoints: browse, download (file or zip), upload (multipart)
- Refactor deployer to use shared resolver
2026-04-01 22:59:02 +03:00
alexei.dolgolyov bb2729ad12 fix: address volume scopes review findings
- CRITICAL: validate volume Name against path traversal (safe regex)
- HIGH: log data migration errors instead of silently ignoring
- HIGH: reject empty source when switching from ephemeral scope
2026-03-31 23:31:27 +03:00
alexei.dolgolyov 8fb959f81f feat: volume scopes redesign — replace shared/isolated with 6 scopes
Replace confusing shared/isolated volume modes with explicit scopes:
- instance: per-deploy isolated directory
- stage: shared within a stage across deploys
- project: shared across all stages
- project_named: named group within a project
- named: global named volume across projects
- ephemeral: tmpfs in-memory mount

Includes schema migration (shared→project, isolated→instance),
backward-compatible deployer resolution, scope metadata API endpoint,
and redesigned volume editor UI with scope guide cards and hints.
2026-03-31 23:22:43 +03:00
alexei.dolgolyov 1a8dfefa77 feat: Docker diagnostic hints on disconnection
- Classify Docker errors into categories (socket_not_found, connection_refused,
  permission_denied, timeout, tls_error) with platform-specific hints
- Enrich GET /api/health with structured diagnostics (category, hints, platform)
- Expandable hints panel in sidebar when Docker is disconnected
- "Retry now" button for immediate re-check
- Collapsible raw error details for advanced users
2026-03-30 14:05:00 +03:00
alexei.dolgolyov 37cfa090ac feat: global Docker health indicator and graceful degradation
- GET /api/health endpoint returning Docker connectivity status
- Sidebar shows Docker connection dot (green=connected, red=disconnected)
- Stale scanner returns store-only results when Docker is unavailable
- Polls health every 30s
2026-03-30 13:43:33 +03:00
alexei.dolgolyov 71aeb615b3 fix(observability): router conflict, logout button, missing i18n
- Fix chi duplicate Route() panic by consolidating read/write routes
  into single Route blocks with nested admin Group
- Add logout button to sidebar with token cleanup
- Add missing settingsAuth.password i18n key
2026-03-30 12:26:22 +03:00
alexei.dolgolyov e0a648fb0c fix(observability): address final review findings
Critical fixes:
- Fix StaleContainer frontend type to match nested backend response shape
- Guard ContainerID[:12] slice against empty/short IDs in ListAllProxies

High-priority fixes:
- Support comma-separated severity/source in event log filtering (IN clause)
- Eliminate N+1 queries in ListAllProxies and FindStaleInstances (pre-load maps)
- Stop leaking internal error messages to API clients (use slog + generic msgs)
2026-03-30 11:47:16 +03:00
alexei.dolgolyov 7c57c740b4 feat(observability): phase 8 - container stats, notifications & dashboard
Add container monitoring and notification system:
- Docker Stats API: real-time CPU/memory for running containers
- Webhook notifications for errors (deploy failures, stale, proxy unhealthy)
- Event log auto-pruning (daily, 30-day retention)
- ContainerStats component with auto-polling progress bars
- SystemHealthCard dashboard widget with running/proxy/error counts
- Full EN/RU i18n for stats and system health
2026-03-30 11:37:25 +03:00
alexei.dolgolyov 7a85441b81 feat(observability): phase 3 - direct proxy creation with validation
Add standalone proxy management:
- Multi-step validation pipeline (DNS, TCP, HTTP) with diagnostic hints
- Proxy lifecycle: create/update/delete via NPM API with SSL auto-assign
- Periodic health monitoring (5min) with event log on status transitions
- Unified /api/proxies/all endpoint merging standalone + managed proxies
- Frontend types and API functions for downstream UI phases
2026-03-30 11:19:55 +03:00
alexei.dolgolyov aefecdffdf feat(observability): phase 2 - stale container detection
Add periodic scanner for stale containers:
- Cron-based scanner (hourly) detects non-running containers exceeding threshold
- last_alive_at tracking on instances, updated on deploy/start/restart
- API: GET /api/containers/stale, POST cleanup (single + bulk)
- Event log warnings emitted for newly stale containers
- Graceful handling of externally removed containers
2026-03-30 11:12:25 +03:00
alexei.dolgolyov c38b7d4c78 feat(observability): phase 1 - schema, models & event log backend
Add database foundation for observability features:
- event_log table with severity/source filtering and pagination
- standalone_proxies table for user-created reverse proxies
- stale_threshold_days setting (default 7 days)
- Auto-persist warn/error events from event bus to database
- SSE broadcast of persistent events for real-time UI updates
- Frontend types and API functions for downstream UI phases
2026-03-30 10:59:13 +03:00
alexei.dolgolyov f71c314262 feat: auto-reapply SSL cert to all managed proxies on change
When the SSL certificate is changed in settings, automatically
updates all existing NPM proxy hosts managed by Docker Watcher
in the background. Clears SSL if cert is removed.
2026-03-29 13:11:21 +03:00
alexei.dolgolyov 9f284932a1 feat: SSL wildcard certificate picker from NPM
- NPM client: ListCertificates endpoint
- API: GET /api/settings/npm-certificates (wildcard-only filter)
- Settings UI: EntityPicker for selecting wildcard certs
- Deployer: applies certificate_id + ssl_forced to proxy hosts
- Uses HTTPS subdomain URLs when SSL cert is configured
2026-03-29 13:07:58 +03:00
alexei.dolgolyov e94c4f9116 feat: optional NPM proxy per stage
Add enable_proxy boolean to stages (default true). When disabled,
the deployer skips NPM proxy host creation — useful for internal
services, workers, or externally-routed containers. UI shows
toggle in Add Stage form and "No Proxy" badge on stage header.
2026-03-29 12:58:13 +03:00
alexei.dolgolyov be6ad15efc fix: comprehensive security, performance, and quality hardening
Security: apply AdminOnly middleware to mutating routes, require
ENCRYPTION_KEY and ADMIN_PASSWORD (no insecure defaults), restrict
CORS to same-origin, fix OIDC token delivery via cookie instead of
URL query param, add rate limiting on login, add MaxBytesReader,
validate volume paths against traversal, add security headers,
validate user roles, add Secure flag to OIDC cookie.

Performance: set SQLite MaxOpenConns(1) to prevent SQLITE_BUSY,
add FK indexes on 8 columns, track notifier goroutines with
WaitGroup for graceful shutdown, use GetRegistryByName instead of
GetAllRegistries in deployer, pass basePath param to avoid redundant
settings query, return empty slices from store to remove reflection.

Quality: refactor TriggerDeploy to delegate to runDeploy (~100 lines
removed), consolidate duplicated utilities (extractPort, boolToInt,
now, isTerminalStatus) into shared exports, migrate all log.Printf
to slog structured logging, use consistent webhook response envelope,
remove dead code (parseEnvVars, duplicate auth types).

UX: clean up NPM proxy on instance removal via API, add README with
quickstart guide, add .env.example, require ADMIN_PASSWORD in
docker-compose, document staging-net prerequisite.
2026-03-29 12:49:24 +03:00
alexei.dolgolyov 1cfd23c431 feat: base volume path setting
Add global base_volume_path to settings. Relative volume source
paths are automatically prepended with the base path at deploy
time. Absolute paths are used as-is. Configurable in Settings >
General.
2026-03-28 15:21:37 +03:00
alexei.dolgolyov a47f703910 fix: nil slices return [] not null in all API responses 2026-03-28 14:33:36 +03:00
alexei.dolgolyov 4ba3673b96 feat: support multiple owners per registry (comma-separated) 2026-03-28 14:15:42 +03:00
alexei.dolgolyov 37e251da85 feat: auto-discover container images from registries
- Add ListImages() to registry interface, implement for Gitea
- Add owner field to registry config (needed for Gitea packages API)
- GET /api/registries/:id/images endpoint
- "Browse Images" button on Projects and Quick Deploy pages
- Image dropdown with registry grouping and search
- i18n support (EN/RU) for all new UI strings
2026-03-28 14:04:11 +03:00
alexei.dolgolyov 77251c540b fix: align webhook regenerate route with frontend path 2026-03-28 13:58:53 +03:00
alexei.dolgolyov 52eec11d16 fix: registry test endpoint accepts empty body for connectivity check 2026-03-28 13:54:05 +03:00
alexei.dolgolyov 28abad27c6 fix: SSE flusher support, null-safe API responses
- Add Flush() to statusRecorder so SSE works through logging middleware
- Return empty array instead of null for empty project lists
- Fixes 500 on /api/events and null.length crash on dashboard
2026-03-28 13:48:20 +03:00
alexei.dolgolyov 4d4e07eb2e fix: serve SvelteKit _app assets correctly
- Use all: prefix in go:embed to include _app/ directory (Go skips
  _-prefixed dirs by default)
- Move jsonContentType middleware to /api route group only
- Use http.ServeContent for proper MIME type detection
- Remove unused fileServer variable
2026-03-28 13:34:02 +03:00
alexei.dolgolyov 652229c67f chore: fix build dependencies and frontend config
Migrate Docker SDK from github.com/docker/docker (+incompatible)
to github.com/moby/moby/client v0.3.0 + moby/moby/api v1.54.0
(proper Go modules). Adapt all container/image/network operations
to the new moby API (Filters, ContainerListOptions, PullResponse,
InspectResult, etc.). Add AsyncTriggerDeploy runDeploy method.

Fix SvelteKit build: disable prerender, set strict=false for SPA,
bump vite-plugin-svelte to v5 for vite 6 compat.

Add .dockerignore to exclude .git, node_modules, plans.
2026-03-28 13:13:45 +03:00
alexei.dolgolyov 1f81ca9eb0 fix(docker-watcher): address final review findings
Security:
- Move config export behind auth middleware
- Validate OIDC callback token before storing in localStorage
- Use constant-time comparison for webhook secret
- Encrypt OIDC client secret at rest (like registry tokens)

Performance:
- Make TriggerDeploy async from HTTP handlers (return deploy ID
  immediately, run pipeline in background goroutine)

Robustness:
- Wrap api.ts res.json() in try/catch for non-JSON responses

i18n:
- Replace ~20 hardcoded English validation messages with $t() calls
- Localize ConfirmDialog cancel button, InstanceCard confirm titles,
  ProjectCard instance/instances pluralization
- Add validation keys to both en.json and ru.json
2026-03-28 00:14:53 +03:00
alexei.dolgolyov d4659146fc feat(docker-watcher): phase 13 - volumes & environment
Per-stage env var overrides with encryption for secrets.
Volume mounts with shared/isolated modes (isolated appends
/{stage}-{tag}/ to source path). Store CRUD, API endpoints,
and frontend editors for both. Env merge during deploy.
2026-03-27 23:28:59 +03:00
alexei.dolgolyov 32de5b26a8 feat(docker-watcher): phase 12 - hardening
Blue-green zero-downtime deploys, promote flow validation.
Dual auth: local (bcrypt + JWT) and OAuth2/OIDC (any provider).
Auth middleware, login page, auth settings UI.
Structured logging (slog JSON), config export to YAML.
Graceful shutdown with deploy draining.
Multi-stage Dockerfile and production docker-compose.yml.
Swap phase order: Volumes & Env before UI Polish.
2026-03-27 23:20:56 +03:00
alexei.dolgolyov 5558396bb7 feat(docker-watcher): phase 11 - frontend embed & SSE
Embed SvelteKit static build in Go binary via go:embed. Event bus
for pub/sub with deploy log, instance status, and deploy status events.
SSE endpoints for real-time streaming. Frontend SSE client with
exponential backoff reconnection. Makefile for build pipeline.
Update Phase 12 auth plan with OAuth2/OIDC support.
2026-03-27 22:30:25 +03:00
alexei.dolgolyov 757c72eea1 fix(docker-watcher): phase 8 security fixes
Remove webhook secret from logs and API response.
Add auth-pending note to router. Fix decrypt fallback that
would use ciphertext as auth token on decrypt failure.
2026-03-27 22:10:00 +03:00
alexei.dolgolyov 97d4243cfe feat(docker-watcher): phase 8 - REST API layer
All REST endpoints wired with chi router: projects, stages, instances,
deploys, registries, settings, quick deploy, webhook. Full main.go
wiring with graceful shutdown. Consistent JSON envelope responses.
Sensitive fields stripped from API responses.
2026-03-27 22:06:57 +03:00
alexei.dolgolyov bbcc4f55f0 feat(docker-watcher): phase 7 - deployer & health checker
Deploy orchestrator with full pipeline: pull → create container →
start → network → NPM proxy → health check. Rollback on failure,
multi-instance support, max_instances enforcement, webhook notifications.
Fix NPM auth in rollback and error logging in removeInstance.
2026-03-27 22:02:09 +03:00
alexei.dolgolyov eef60a4302 feat(docker-watcher): phase 6 - webhook handler
Secret UUID-based webhook endpoint for CI image push notifications.
Project/stage matching via glob patterns, auto-creation of unknown
projects from image inspection. Fix JSON response injection.
2026-03-27 21:56:18 +03:00
alexei.dolgolyov 90be636d66 feat(docker-watcher): phase 5 - registry client & poller
Gitea registry client with tag listing and pattern matching, cron-based
polling scheduler with first-poll safety, poll state persistence.
DeployTriggerer interface for decoupled deploy triggering.
2026-03-27 21:34:09 +03:00
alexei.dolgolyov 389ed5aff8 feat(docker-watcher): phases 3+4 - Docker client & NPM client
Phase 3: Docker Engine API wrapper — pull/inspect images, container
lifecycle (create/start/stop/remove/restart), network management,
label-based container tracking, deterministic naming.

Phase 4: Nginx Proxy Manager API client — JWT auth with auto-refresh,
CRUD for proxy hosts, domain-based host lookup.
2026-03-27 21:08:57 +03:00
alexei.dolgolyov cdf21682d6 feat(docker-watcher): phase 2 - crypto & config seed loader
AES-256-GCM encryption for credential storage, YAML seed config
parser with validation, and transactional import into SQLite.
Credentials (registry tokens, NPM password) encrypted before storage.
2026-03-27 21:01:16 +03:00
alexei.dolgolyov d63c831d15 feat(docker-watcher): phase 1 - project scaffold & SQLite store
Initialize Go module, directory structure, and full SQLite store layer:
- 7-table schema (projects, stages, registries, settings, instances,
  deploys, deploy_logs) with auto-migration
- CRUD operations for all entities with proper error handling
- ErrNotFound sentinel for distinguishing 404 from 500 in handlers
- WAL mode, foreign keys, busy timeout pragmas
2026-03-27 20:52:29 +03:00