feat(cutover): hard legacy cutover — drop projects/stacks/sites/deploys
Build / build (push) Successful in 10m39s
Build / build (push) Successful in 10m39s
The clean-break delete that closes the workload-first refactor arc.
Net diff: ~30 backend files deleted, ~20 modified, ~12k LOC removed
on the Go side; entire /projects /stacks /sites /deploy frontend
trees gone; ~6.7k LOC removed on the Svelte/TypeScript side.
Backend
- API handlers gone: internal/api/{projects,stages,stage_env,stacks,
static_sites,deploys,instances,volume_browser}.go
- Store CRUD + tests gone: internal/store/{projects,stages,stage_env,
stacks,static_sites,static_site_secrets,deploys,poll_state,volumes,
workload_sync}.go (+ _test.go siblings)
- Legacy deployer pipeline gone: internal/deployer/{bluegreen,promote,
rollback,subdomain,resolver_test}.go; deployer.go trimmed to just the
dispatch surface used by the plugin pipeline
- internal/staticsite/{manager,healthcheck}.go and
internal/stack/manager.go gone (the rest of those packages stay as
helpers imported by the static + compose plugins)
- internal/registry/poller.go gone (legacy registry poller)
- internal/volume.ResolvePath gone; ResolveWorkloadPath stays
- internal/webhook: handleWebhook (project) + handleSiteWebhook (site)
gone; only POST /api/webhook/triggers/{secret} remains
- workload-side webhook URL handlers (getWorkloadWebhook +
regenerateWorkloadWebhook + EnsureWorkloadWebhookSecret +
SetWorkloadWebhookSecret + GetWorkloadByWebhookSecret) gone — they
minted URLs that would 404 against the new trigger-only ingress
- cmd/server/main.go: dropped staticsite.Manager, stack.Manager,
staticsite.HealthChecker, registry poller, SetSiteSyncTriggerer,
SetStaticSiteManager, SetStackManager, wireStaticBackend
- store/store.go: idempotent DROP TABLE IF EXISTS for every legacy
table (projects, stages, stage_env, volumes, deploys, deploy_logs,
poll_states, stacks, stack_revisions, stack_deploys, static_sites,
static_site_secrets); FK order children-then-parents
- store/models.go: dropped Project, Stage, Deploy, DeployLog, StageEnv,
Volume, StaticSite, StaticSiteSecret, Stack, StackRevision,
StackDeploy types; kept WorkloadKind constants as documented strings
- internal/store/helpers.go (new): BoolToInt, rowScanner,
GenerateWebhookSecret extracted from deleted CRUD files
- internal/api/secrets.go (new): forwards to store.GenerateWebhookSecret
so api + store paths share one secret-generation impl (no
panic-vs-UUID-fallback divergence)
- internal/reconciler/reconciler.go: dropped legacy stack-by-compose
+ static-site label paths; only canonical tinyforge.workload.id
dispatch remains
- providers (gitea_content/github_provider/gitlab_provider) gained
path-traversal rejection on every tree entry
- internal/webhook ParsedImage / ParseImageRef demoted to package-
private (no external callers)
Frontend
- /projects /stacks /sites /deploy routes deleted (entire trees)
- ProjectCard / InstanceCard / StaleContainerCard components deleted
- api.ts: dropped every project/stage/stack/site/deploy/instance
helper + types (Project, Stage, Stack, StaticSite, Deploy,
Instance, Volume, etc.); kept Workload, Container, App, Settings,
Registry, EventTrigger, LogScanRule, webhook envelopes
- WorkloadWebhook type + getWorkloadWebhook/regenerateWorkloadWebhook
api functions gone (mirror of the backend deletion above)
- web/src/routes/+layout.svelte: dropped /projects /sites /stacks
/deploy nav entries, trimmed quick-nav keymap
- web/src/routes/+page.svelte: dashboard rewrite — reads
listWorkloads + listContainers only; 4-card stat grid
(workloads/running/failed/stale) + recent workloads strip
- navCounts.ts, SystemHealthCard.svelte, ContainerLogs.svelte,
ContainerStats.svelte, StatusBadge.svelte, TagCombobox.svelte,
proxies/+page.svelte, containers/+page.svelte all rewired to the
workload-first surface
- AbortController plumbing on dashboard, nav-counts, stale page,
SystemHealthCard so navigation doesn't leave dangling fetches
- i18n: dropped projects.*, projectDetail.*, envEditor.*,
volumeEditor.*, volumeBrowser.*, quickDeploy.*, sites.*, stacks.*,
instance.*, confirm.* namespaces; en/ru parity preserved (1042
keys each)
Hardening from go-reviewer + security-reviewer + typescript-reviewer
subagent passes (0 CRITICAL across all three; 1 HIGH + ~12 MEDIUM
addressed inline before commit):
- Sec H1: dead-end workload webhook URL handlers (would mint URLs
that 404 the new trigger-only ingress) deleted across backend +
frontend
- Go M1: IsTerminalDeployStatus dropped (no production callers)
- Go M2: ParsedImage/ParseImageRef lowercased (in-package only)
- Go M6: generateWebhookSecret unified — api shim forwards to
store.GenerateWebhookSecret
- Doc/comment freshness: stage_id (no longer FK), ProxyRoute legacy
field names, workloadIDRow rationale, webhook_deliveries.target_type
enum, WebhookDeliveryLog component header
Doc
- WORKLOAD_REFACTOR_TODO: cutover marked DONE; all three Priority 1
items are now shipped. Next focus is Priority 3 polish (apps.* i18n
+ codemap entries) and Priority 4 tests.
Behavioral notes for operators upgrading from a pre-cutover build
- Existing rows in the dropped tables disappear on first boot.
- Legacy webhook URLs at /api/webhook/{secret} and
/api/webhook/sites/{secret} return 404; CI configs must repoint to
/api/webhook/triggers/{secret} (the trigger-split boot backfill
lifted any embedded workload secret onto a Trigger row, so the
secret value itself carries over).
- Frontend routes /projects /stacks /sites /deploy are gone; nav
links replaced with /apps and /triggers.
This commit is contained in:
+19
-796
@@ -1,16 +1,15 @@
|
||||
// Package deployer dispatches plugin-native Source deploys. The legacy
|
||||
// project-pipeline lived here until the hard cutover; what remains is a
|
||||
// thin holder for the Deployer's shared dependencies that `dispatch.go`
|
||||
// hands to every Source via PluginDeps().
|
||||
package deployer
|
||||
|
||||
import (
|
||||
"context"
|
||||
"encoding/json"
|
||||
"errors"
|
||||
"fmt"
|
||||
"log/slog"
|
||||
"sort"
|
||||
"sync"
|
||||
"sync/atomic"
|
||||
|
||||
"github.com/alexei/tinyforge/internal/crypto"
|
||||
"github.com/alexei/tinyforge/internal/dns"
|
||||
"github.com/alexei/tinyforge/internal/docker"
|
||||
"github.com/alexei/tinyforge/internal/events"
|
||||
@@ -18,14 +17,11 @@ import (
|
||||
"github.com/alexei/tinyforge/internal/notify"
|
||||
"github.com/alexei/tinyforge/internal/proxy"
|
||||
"github.com/alexei/tinyforge/internal/store"
|
||||
"github.com/alexei/tinyforge/internal/volume"
|
||||
"github.com/moby/moby/api/types/mount"
|
||||
"github.com/google/uuid"
|
||||
)
|
||||
|
||||
// Deployer orchestrates the full deployment flow: pull image, create container,
|
||||
// start, configure proxy, health check, and handle rollback on failure.
|
||||
// It implements both webhook.DeployTriggerer and registry.DeployTriggerer.
|
||||
// Deployer owns the dependency bundle each Source plugin needs at deploy
|
||||
// time. The plugin pipeline reaches in via PluginDeps(); see dispatch.go
|
||||
// for the dispatch surface itself.
|
||||
type Deployer struct {
|
||||
docker *docker.Client
|
||||
proxy proxy.Provider
|
||||
@@ -88,21 +84,20 @@ func (d *Deployer) SetPreDeployBackuper(b PreDeployBackuper) {
|
||||
d.backuper = b
|
||||
}
|
||||
|
||||
// maybeBackupBeforeDeploy creates a "pre-deploy" Tinyforge DB snapshot when
|
||||
// MaybeBackupBeforeDeploy creates a "pre-deploy" Tinyforge DB snapshot when
|
||||
// the setting is enabled. Failures are logged but do not abort the deploy:
|
||||
// missing a backup is preferable to refusing to ship a fix.
|
||||
func (d *Deployer) maybeBackupBeforeDeploy(deployID string, settings store.Settings) {
|
||||
// missing a backup is preferable to refusing to ship a fix. Exposed so
|
||||
// Source plugins can opt into the same behaviour.
|
||||
func (d *Deployer) MaybeBackupBeforeDeploy(deployID string, settings store.Settings) {
|
||||
if !settings.AutoBackupBeforeDeploy || d.backuper == nil {
|
||||
return
|
||||
}
|
||||
backup, err := d.backuper.CreateBackup("pre-deploy")
|
||||
if err != nil {
|
||||
slog.Warn("pre-deploy backup failed", "deploy_id", deployID, "error", err)
|
||||
d.logDeploy(deployID, fmt.Sprintf("Pre-deploy backup failed: %v", err), "warn")
|
||||
return
|
||||
}
|
||||
slog.Info("pre-deploy backup created", "deploy_id", deployID, "backup_id", backup.ID, "filename", backup.Filename)
|
||||
d.logDeploy(deployID, fmt.Sprintf("Pre-deploy backup created: %s", backup.Filename), "info")
|
||||
}
|
||||
|
||||
// SetDNSProvider sets the DNS provider for managing DNS records during deployments.
|
||||
@@ -113,796 +108,24 @@ func (d *Deployer) SetDNSProvider(provider dns.Provider) {
|
||||
d.dns = provider
|
||||
}
|
||||
|
||||
// getDNS returns the current DNS provider under read lock.
|
||||
func (d *Deployer) getDNS() dns.Provider {
|
||||
d.dnsMu.RLock()
|
||||
defer d.dnsMu.RUnlock()
|
||||
return d.dns
|
||||
}
|
||||
|
||||
// Drain waits for all in-progress deploys to complete. Call this during graceful shutdown.
|
||||
func (d *Deployer) Drain() {
|
||||
d.shuttingDown.Store(true)
|
||||
if !d.shuttingDown.CompareAndSwap(false, true) {
|
||||
// Already draining.
|
||||
}
|
||||
slog.Info("deployer: draining in-progress deploys")
|
||||
d.activeWg.Wait()
|
||||
slog.Info("deployer: all deploys drained")
|
||||
}
|
||||
|
||||
// AsyncTriggerDeploy creates a deploy record and returns the deploy ID immediately,
|
||||
// then runs the full deploy pipeline in a background goroutine. Use this from HTTP handlers
|
||||
// to avoid blocking the request. Progress is streamed via SSE.
|
||||
func (d *Deployer) AsyncTriggerDeploy(ctx context.Context, projectID, stageID, imageTag string) (string, error) {
|
||||
if d.shuttingDown.Load() {
|
||||
return "", fmt.Errorf("deployer is shutting down, rejecting new deploy")
|
||||
}
|
||||
// ShuttingDown reports whether Drain() has been called.
|
||||
func (d *Deployer) ShuttingDown() bool { return d.shuttingDown.Load() }
|
||||
|
||||
// Validate inputs synchronously so the caller gets immediate feedback.
|
||||
project, err := d.store.GetProjectByID(projectID)
|
||||
if err != nil {
|
||||
return "", fmt.Errorf("get project: %w", err)
|
||||
}
|
||||
stage, err := d.store.GetStageByID(stageID)
|
||||
if err != nil {
|
||||
return "", fmt.Errorf("get stage: %w", err)
|
||||
}
|
||||
if err := d.validatePromoteFrom(stage, imageTag); err != nil {
|
||||
return "", fmt.Errorf("promote validation: %w", err)
|
||||
}
|
||||
|
||||
// Create deploy record synchronously so caller gets the ID.
|
||||
deploy, err := d.store.CreateDeploy(store.Deploy{
|
||||
ProjectID: projectID,
|
||||
StageID: stageID,
|
||||
ImageTag: imageTag,
|
||||
Status: "pending",
|
||||
})
|
||||
if err != nil {
|
||||
return "", fmt.Errorf("create deploy record: %w", err)
|
||||
}
|
||||
|
||||
// Run the actual deploy in the background.
|
||||
d.activeWg.Add(1)
|
||||
go func() {
|
||||
defer d.activeWg.Done()
|
||||
// Use a detached context so client disconnect doesn't abort the deploy.
|
||||
bgCtx := context.Background()
|
||||
if err := d.runDeploy(bgCtx, project, stage, deploy.ID, imageTag); err != nil {
|
||||
slog.Error("async deploy failed", "deploy_id", deploy.ID, "error", err)
|
||||
}
|
||||
}()
|
||||
|
||||
return deploy.ID, nil
|
||||
}
|
||||
|
||||
// runDeploy is the internal deploy pipeline used by AsyncTriggerDeploy.
|
||||
// It assumes the deploy record already exists and project/stage are validated.
|
||||
func (d *Deployer) runDeploy(ctx context.Context, project store.Project, stage store.Stage, deployID string, imageTag string) error {
|
||||
settings, err := d.store.GetSettings()
|
||||
if err != nil {
|
||||
if updateErr := d.store.UpdateDeployStatus(deployID, "failed", err.Error()); updateErr != nil {
|
||||
slog.Warn("update deploy status", "error", updateErr)
|
||||
}
|
||||
return fmt.Errorf("get settings: %w", err)
|
||||
}
|
||||
|
||||
slog.Info("starting deploy",
|
||||
"deploy_id", deployID,
|
||||
"project", project.Name,
|
||||
"stage", stage.Name,
|
||||
"tag", imageTag,
|
||||
)
|
||||
d.logDeploy(deployID, fmt.Sprintf("Starting deploy of %s:%s for project %s, stage %s", project.Image, imageTag, project.Name, stage.Name), "info")
|
||||
|
||||
// Take a pre-deploy DB snapshot if the operator opted in. Runs before
|
||||
// any state-mutating work so a corrupted deploy is recoverable.
|
||||
d.maybeBackupBeforeDeploy(deployID, settings)
|
||||
|
||||
// Enforce max_instances before deploying.
|
||||
if err := d.enforceMaxInstances(ctx, stage, deployID, settings); err != nil {
|
||||
d.logDeploy(deployID, fmt.Sprintf("Failed to enforce max instances: %v", err), "error")
|
||||
}
|
||||
|
||||
var containerID string
|
||||
var proxyRouteID string
|
||||
var instanceID string
|
||||
var deployErr error
|
||||
|
||||
if stage.MaxInstances == 1 {
|
||||
containerID, proxyRouteID, instanceID, deployErr = d.blueGreenDeploy(ctx, project, stage, settings, deployID, imageTag)
|
||||
} else {
|
||||
containerID, proxyRouteID, instanceID, deployErr = d.executeDeploy(ctx, project, stage, settings, deployID, imageTag)
|
||||
}
|
||||
|
||||
if deployErr != nil {
|
||||
d.logDeploy(deployID, fmt.Sprintf("Deploy failed: %v", deployErr), "error")
|
||||
d.publishDeployStatus(deployID, project.ID, stage.ID, imageTag, "failed", deployErr.Error())
|
||||
d.rollback(ctx, deployID, containerID, proxyRouteID, instanceID)
|
||||
|
||||
url, secret, tier := resolveDeployTarget(stage, project, settings)
|
||||
d.notifier.SendSigned(url, secret, tier, notify.Event{
|
||||
Type: "deploy_failure",
|
||||
Project: project.Name,
|
||||
Stage: stage.Name,
|
||||
ImageTag: imageTag,
|
||||
Error: deployErr.Error(),
|
||||
})
|
||||
|
||||
return fmt.Errorf("deploy failed: %w", deployErr)
|
||||
}
|
||||
|
||||
if err := d.store.UpdateDeployStatus(deployID, "success", ""); err != nil {
|
||||
slog.Warn("update deploy status to success", "error", err)
|
||||
}
|
||||
d.publishDeployStatus(deployID, project.ID, stage.ID, imageTag, "success", "")
|
||||
|
||||
subdomain := d.buildSubdomain(project, stage, settings, imageTag)
|
||||
fullURL := fmt.Sprintf("https://%s.%s", subdomain, settings.Domain)
|
||||
|
||||
d.logDeploy(deployID, fmt.Sprintf("Deploy successful: %s", fullURL), "info")
|
||||
|
||||
url, secret, tier := resolveDeployTarget(stage, project, settings)
|
||||
d.notifier.SendSigned(url, secret, tier, notify.Event{
|
||||
Type: "deploy_success",
|
||||
Project: project.Name,
|
||||
Stage: stage.Name,
|
||||
ImageTag: imageTag,
|
||||
Subdomain: subdomain,
|
||||
URL: fullURL,
|
||||
})
|
||||
|
||||
return nil
|
||||
}
|
||||
|
||||
// resolveDeployTarget picks the most-specific (URL, secret, tier) for a
|
||||
// deploy notification: stage > project > global. An empty URL at a tier
|
||||
// means "fall through to the next" — never "send unsigned to nowhere". The
|
||||
// secret is always paired with the URL that sourced it, so a stage can sign
|
||||
// even when project and global are unsigned (and vice versa).
|
||||
func resolveDeployTarget(stage store.Stage, project store.Project, settings store.Settings) (string, string, notify.Tier) {
|
||||
if stage.NotificationURL != "" {
|
||||
return stage.NotificationURL, stage.NotificationSecret, notify.TierStage
|
||||
}
|
||||
if project.NotificationURL != "" {
|
||||
return project.NotificationURL, project.NotificationSecret, notify.TierProject
|
||||
}
|
||||
return settings.NotificationURL, settings.NotificationSecret, notify.TierSettings
|
||||
}
|
||||
|
||||
// TriggerDeploy is the synchronous entry point for deployments (used by poller and webhook).
|
||||
// It validates inputs, creates a deploy record, and delegates to runDeploy.
|
||||
func (d *Deployer) TriggerDeploy(ctx context.Context, projectID, stageID, imageTag string) error {
|
||||
// rejectIfDraining is exposed in case any plugin wants the same hard-stop
|
||||
// behaviour the legacy pipeline used.
|
||||
func (d *Deployer) rejectIfDraining() error {
|
||||
if d.shuttingDown.Load() {
|
||||
return fmt.Errorf("deployer is shutting down, rejecting new deploy")
|
||||
}
|
||||
|
||||
d.activeWg.Add(1)
|
||||
defer d.activeWg.Done()
|
||||
|
||||
project, err := d.store.GetProjectByID(projectID)
|
||||
if err != nil {
|
||||
return fmt.Errorf("get project: %w", err)
|
||||
}
|
||||
|
||||
stage, err := d.store.GetStageByID(stageID)
|
||||
if err != nil {
|
||||
return fmt.Errorf("get stage: %w", err)
|
||||
}
|
||||
|
||||
if err := d.validatePromoteFrom(stage, imageTag); err != nil {
|
||||
return fmt.Errorf("promote validation: %w", err)
|
||||
}
|
||||
|
||||
deploy, err := d.store.CreateDeploy(store.Deploy{
|
||||
ProjectID: projectID,
|
||||
StageID: stageID,
|
||||
ImageTag: imageTag,
|
||||
Status: "pending",
|
||||
})
|
||||
if err != nil {
|
||||
return fmt.Errorf("create deploy record: %w", err)
|
||||
}
|
||||
|
||||
if err := d.runDeploy(ctx, project, stage, deploy.ID, imageTag); err != nil {
|
||||
return err
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
// executeDeploy runs the deploy pipeline steps and returns rollback-relevant state.
|
||||
// It returns (containerID, proxyRouteID, instanceID, error).
|
||||
func (d *Deployer) executeDeploy(
|
||||
ctx context.Context,
|
||||
project store.Project,
|
||||
stage store.Stage,
|
||||
settings store.Settings,
|
||||
deployID string,
|
||||
imageTag string,
|
||||
) (string, string, string, error) {
|
||||
var containerID string
|
||||
var proxyRouteID string
|
||||
var instanceID string
|
||||
|
||||
// Step 1: Pull image.
|
||||
if err := d.store.UpdateDeployStatus(deployID, "pulling", ""); err != nil {
|
||||
slog.Warn("update deploy status", "error", err)
|
||||
}
|
||||
d.publishDeployStatus(deployID, project.ID, stage.ID, imageTag, "pulling", "")
|
||||
d.logDeploy(deployID, fmt.Sprintf("Pulling image %s:%s", project.Image, imageTag), "info")
|
||||
|
||||
authConfig, err := d.buildRegistryAuth(project)
|
||||
if err != nil {
|
||||
return containerID, proxyRouteID, instanceID, fmt.Errorf("build registry auth: %w", err)
|
||||
}
|
||||
|
||||
if err := d.docker.PullImage(ctx, project.Image, imageTag, authConfig); err != nil {
|
||||
return containerID, proxyRouteID, instanceID, fmt.Errorf("pull image: %w", err)
|
||||
}
|
||||
d.logDeploy(deployID, "Image pulled successfully", "info")
|
||||
|
||||
// Step 2: Ensure network exists.
|
||||
if settings.Network == "" {
|
||||
return containerID, proxyRouteID, instanceID, fmt.Errorf("docker network not configured in settings")
|
||||
}
|
||||
networkID, err := d.docker.EnsureNetwork(ctx, settings.Network)
|
||||
if err != nil {
|
||||
return containerID, proxyRouteID, instanceID, fmt.Errorf("ensure network: %w", err)
|
||||
}
|
||||
d.logDeploy(deployID, fmt.Sprintf("Network %s ready (ID: %s)", settings.Network, truncateID(networkID)), "info")
|
||||
|
||||
// Step 3: Create and start container.
|
||||
if err := d.store.UpdateDeployStatus(deployID, "starting", ""); err != nil {
|
||||
slog.Warn("update deploy status", "error", err)
|
||||
}
|
||||
d.publishDeployStatus(deployID, project.ID, stage.ID, imageTag, "starting", "")
|
||||
|
||||
// Pre-generate instance ID so it can be set as a container label.
|
||||
instanceID = uuid.New().String()
|
||||
subdomain := d.buildSubdomain(project, stage, settings, imageTag)
|
||||
workloadID := d.resolveProjectWorkloadID(project.ID)
|
||||
|
||||
containerName := docker.ContainerName(project.Name, stage.Name, imageTag)
|
||||
|
||||
// Remove any stale container with the same name (e.g., from a previous failed deploy).
|
||||
_ = d.docker.RemoveContainer(ctx, containerName, true)
|
||||
|
||||
portStr := fmt.Sprintf("%d/tcp", project.Port)
|
||||
envVars := d.mergeEnvVars(project, stage.ID)
|
||||
mounts := d.computeVolumeMounts(project.ID, project.Name, stage.Name, imageTag, settings.BaseVolumePath)
|
||||
|
||||
containerCfg := docker.ContainerConfig{
|
||||
Name: containerName,
|
||||
Image: project.Image + ":" + imageTag,
|
||||
Env: envVars,
|
||||
ExposedPorts: []string{portStr},
|
||||
NetworkName: settings.Network,
|
||||
NetworkID: networkID,
|
||||
WorkloadID: workloadID,
|
||||
WorkloadKind: string(store.WorkloadKindProject),
|
||||
Role: stage.Name,
|
||||
Mounts: mounts,
|
||||
CpuLimit: stage.CpuLimit,
|
||||
MemoryLimit: stage.MemoryLimit,
|
||||
}
|
||||
|
||||
// Set proxy labels for providers that use Docker labels (e.g., Traefik).
|
||||
if stage.EnableProxy {
|
||||
fqdn := subdomain + "." + settings.Domain
|
||||
if proxyLabels := d.proxy.ContainerLabels(fqdn, project.Port); proxyLabels != nil {
|
||||
if containerCfg.Labels == nil {
|
||||
containerCfg.Labels = make(map[string]string)
|
||||
}
|
||||
for k, v := range proxyLabels {
|
||||
containerCfg.Labels[k] = v
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
d.logDeploy(deployID, fmt.Sprintf("Creating container %s", containerName), "info")
|
||||
containerID, err = d.docker.CreateContainer(ctx, containerCfg)
|
||||
if err != nil {
|
||||
return containerID, proxyRouteID, instanceID, fmt.Errorf("create container: %w", err)
|
||||
}
|
||||
d.logDeploy(deployID, fmt.Sprintf("Container created (ID: %s)", truncateID(containerID)), "info")
|
||||
|
||||
// Create container row with the pre-generated ID. The deployer is the
|
||||
// authoritative writer until the next reconciler tick — it's important
|
||||
// the row exists before StartContainer so a fast tick doesn't see an
|
||||
// orphan and mark it missing.
|
||||
row, err := d.store.CreateContainer(store.Container{
|
||||
ID: instanceID,
|
||||
WorkloadID: workloadID,
|
||||
WorkloadKind: string(store.WorkloadKindProject),
|
||||
Role: stage.Name,
|
||||
StageID: stage.ID,
|
||||
ContainerID: containerID,
|
||||
ImageRef: project.Image + ":" + imageTag,
|
||||
ImageTag: imageTag,
|
||||
Host: "local",
|
||||
State: "stopped",
|
||||
Port: project.Port,
|
||||
Subdomain: subdomain,
|
||||
})
|
||||
if err != nil {
|
||||
return containerID, proxyRouteID, instanceID, fmt.Errorf("create container row: %w", err)
|
||||
}
|
||||
instanceID = row.ID
|
||||
|
||||
// Link deploy to container row (the existing Deploy.InstanceID column
|
||||
// stores the row ID — same value as before, just a renamed concept).
|
||||
if err := d.store.SetDeployInstanceID(deployID, instanceID); err != nil {
|
||||
slog.Warn("link deploy to container", "error", err)
|
||||
}
|
||||
|
||||
d.logDeploy(deployID, fmt.Sprintf("Starting container %s", containerName), "info")
|
||||
if err := d.docker.StartContainer(ctx, containerID); err != nil {
|
||||
return containerID, proxyRouteID, instanceID, fmt.Errorf("start container: %w", err)
|
||||
}
|
||||
|
||||
if err := d.store.UpdateContainerState(instanceID, "running"); err != nil {
|
||||
slog.Warn("update container state to running", "error", err)
|
||||
}
|
||||
row.State = "running"
|
||||
row.LastSeenAt = store.Now()
|
||||
d.publishInstanceStatus(instanceID, project.ID, stage.ID, "running")
|
||||
d.logDeploy(deployID, "Container started", "info")
|
||||
|
||||
// Step 4: Configure NPM proxy (optional per stage).
|
||||
if stage.EnableProxy {
|
||||
if err := d.store.UpdateDeployStatus(deployID, "configuring_proxy", ""); err != nil {
|
||||
slog.Warn("update deploy status", "error", err)
|
||||
}
|
||||
d.publishDeployStatus(deployID, project.ID, stage.ID, imageTag, "configuring_proxy", "")
|
||||
|
||||
accessListID := settings.NpmAccessListID
|
||||
if project.NpmAccessListID > 0 {
|
||||
accessListID = project.NpmAccessListID
|
||||
}
|
||||
|
||||
proxyRouteID, err = d.configureProxy(ctx, deployID, settings, containerID, containerName, project.Port, subdomain, accessListID)
|
||||
if err != nil {
|
||||
return containerID, proxyRouteID, instanceID, fmt.Errorf("configure proxy: %w", err)
|
||||
}
|
||||
|
||||
// Update container row with proxy route ID.
|
||||
row.ProxyRouteID = proxyRouteID
|
||||
row.Subdomain = subdomain
|
||||
if err := d.store.UpdateContainer(row); err != nil {
|
||||
slog.Warn("update container with proxy ID", "error", err)
|
||||
}
|
||||
|
||||
// Create DNS record for this container.
|
||||
fqdn := subdomain + "." + settings.Domain
|
||||
d.ensureDNS(ctx, fqdn, "instance", instanceID, deployID)
|
||||
} else {
|
||||
d.logDeploy(deployID, "Proxy creation skipped (disabled for this stage)", "info")
|
||||
row.Subdomain = subdomain
|
||||
if err := d.store.UpdateContainer(row); err != nil {
|
||||
slog.Warn("update container", "error", err)
|
||||
}
|
||||
}
|
||||
|
||||
// Step 5: Health check.
|
||||
if project.Healthcheck != "" {
|
||||
if err := d.store.UpdateDeployStatus(deployID, "health_checking", ""); err != nil {
|
||||
slog.Warn("update deploy status", "error", err)
|
||||
}
|
||||
d.publishDeployStatus(deployID, project.ID, stage.ID, imageTag, "health_checking", "")
|
||||
|
||||
healthURL := fmt.Sprintf("http://%s:%d%s", containerName, project.Port, project.Healthcheck)
|
||||
d.logDeploy(deployID, fmt.Sprintf("Running health check: %s", healthURL), "info")
|
||||
|
||||
if err := d.health.Check(ctx, healthURL); err != nil {
|
||||
return containerID, proxyRouteID, instanceID, fmt.Errorf("health check: %w", err)
|
||||
}
|
||||
d.logDeploy(deployID, "Health check passed", "info")
|
||||
} else {
|
||||
d.logDeploy(deployID, "No health check configured, skipping", "info")
|
||||
}
|
||||
|
||||
return containerID, proxyRouteID, instanceID, nil
|
||||
}
|
||||
|
||||
// configureProxy creates or updates a proxy route for the deployed container.
|
||||
// Uses the configured proxy.Provider (NPM, Traefik, or None).
|
||||
// In NPM remote mode, uses server_ip + published host port instead of container name.
|
||||
// Returns the proxy route ID string.
|
||||
func (d *Deployer) configureProxy(
|
||||
ctx context.Context,
|
||||
deployID string,
|
||||
settings store.Settings,
|
||||
containerID string,
|
||||
containerName string,
|
||||
containerPort int,
|
||||
subdomain string,
|
||||
accessListID int,
|
||||
) (string, error) {
|
||||
fqdn := subdomain + "." + settings.Domain
|
||||
|
||||
forwardHost := containerName
|
||||
forwardPort := containerPort
|
||||
|
||||
// In NPM remote mode, use server_ip and the published host port.
|
||||
if settings.NpmRemote && settings.ProxyProvider == "npm" {
|
||||
if settings.ServerIP == "" {
|
||||
return "", fmt.Errorf("NPM remote mode requires Server IP to be configured in settings")
|
||||
}
|
||||
forwardHost = settings.ServerIP
|
||||
|
||||
hostPort, err := d.docker.InspectContainerPort(ctx, containerID, fmt.Sprintf("%d/tcp", containerPort))
|
||||
if err != nil {
|
||||
return "", fmt.Errorf("look up host port for remote NPM: %w", err)
|
||||
}
|
||||
forwardPort = int(hostPort)
|
||||
d.logDeploy(deployID, fmt.Sprintf("NPM remote mode: using %s:%d (host port)", forwardHost, forwardPort), "info")
|
||||
}
|
||||
|
||||
d.logDeploy(deployID, fmt.Sprintf("Configuring proxy (%s): %s -> %s:%d", d.proxy.Name(), fqdn, forwardHost, forwardPort), "info")
|
||||
|
||||
routeID, err := d.proxy.ConfigureRoute(ctx, fqdn, forwardHost, forwardPort, proxy.RouteOptions{
|
||||
SSLCertificateID: settings.SSLCertificateID,
|
||||
AccessListID: accessListID,
|
||||
})
|
||||
if err != nil {
|
||||
return "", fmt.Errorf("configure proxy route: %w", err)
|
||||
}
|
||||
|
||||
if routeID != "" {
|
||||
d.logDeploy(deployID, fmt.Sprintf("Proxy route configured (ID: %s)", routeID), "info")
|
||||
}
|
||||
return routeID, nil
|
||||
}
|
||||
|
||||
// enforceMaxInstances removes the oldest container rows when the stage has
|
||||
// reached its instance limit, making room for the new deploy.
|
||||
func (d *Deployer) enforceMaxInstances(ctx context.Context, stage store.Stage, deployID string, settings store.Settings) error {
|
||||
if stage.MaxInstances <= 0 {
|
||||
return nil
|
||||
}
|
||||
|
||||
containers, err := d.store.ListContainersByStageID(stage.ID)
|
||||
if err != nil {
|
||||
return fmt.Errorf("get containers for stage: %w", err)
|
||||
}
|
||||
|
||||
// Filter to running/stopped containers (not already failed/removing).
|
||||
var active []store.Container
|
||||
for _, c := range containers {
|
||||
if c.State == "running" || c.State == "stopped" {
|
||||
active = append(active, c)
|
||||
}
|
||||
}
|
||||
|
||||
// We need room for one more container, so remove the oldest when at limit.
|
||||
removeCount := len(active) - stage.MaxInstances + 1
|
||||
if removeCount <= 0 {
|
||||
return nil
|
||||
}
|
||||
|
||||
// Sort by created_at ascending (oldest first).
|
||||
sort.Slice(active, func(i, j int) bool {
|
||||
return active[i].CreatedAt < active[j].CreatedAt
|
||||
})
|
||||
|
||||
for i := 0; i < removeCount && i < len(active); i++ {
|
||||
c := active[i]
|
||||
d.logDeploy(deployID, fmt.Sprintf("Removing oldest container %s (tag: %s) to enforce max_instances=%d", c.ID, c.ImageTag, stage.MaxInstances), "info")
|
||||
|
||||
if err := d.removeContainer(ctx, c, settings); err != nil {
|
||||
d.logDeploy(deployID, fmt.Sprintf("Failed to remove container %s: %v", c.ID, err), "warn")
|
||||
continue
|
||||
}
|
||||
d.logDeploy(deployID, fmt.Sprintf("Removed container %s", c.ID), "info")
|
||||
}
|
||||
|
||||
return nil
|
||||
}
|
||||
|
||||
// removeContainer stops + removes the Docker container, deletes its proxy
|
||||
// route, drops the DNS record, and removes the container row from the store.
|
||||
func (d *Deployer) removeContainer(ctx context.Context, c store.Container, settings store.Settings) error {
|
||||
// Mark as removing.
|
||||
if err := d.store.UpdateContainerState(c.ID, "removing"); err != nil {
|
||||
slog.Warn("update container state to removing", "id", c.ID, "error", err)
|
||||
}
|
||||
|
||||
// Remove Docker container.
|
||||
if c.ContainerID != "" {
|
||||
if err := d.docker.RemoveContainer(ctx, c.ContainerID, true); err != nil {
|
||||
slog.Warn("remove docker container", "container_id", c.ContainerID, "error", err)
|
||||
}
|
||||
}
|
||||
|
||||
// Delete proxy route.
|
||||
if c.ProxyRouteID != "" {
|
||||
if err := d.proxy.DeleteRoute(ctx, c.ProxyRouteID); err != nil {
|
||||
slog.Warn("delete proxy route", "route_id", c.ProxyRouteID, "error", err)
|
||||
}
|
||||
|
||||
// Remove DNS record.
|
||||
if c.Subdomain != "" && settings.Domain != "" {
|
||||
fqdn := c.Subdomain + "." + settings.Domain
|
||||
d.removeDNS(ctx, fqdn, "")
|
||||
}
|
||||
}
|
||||
|
||||
// Drop the container row.
|
||||
if err := d.store.DeleteContainer(c.ID); err != nil && !errors.Is(err, store.ErrNotFound) {
|
||||
return fmt.Errorf("delete container row: %w", err)
|
||||
}
|
||||
|
||||
return nil
|
||||
}
|
||||
|
||||
// buildSubdomain generates the subdomain for an instance based on settings and stage config.
|
||||
func (d *Deployer) buildSubdomain(project store.Project, stage store.Stage, settings store.Settings, imageTag string) string {
|
||||
return GenerateTaggedSubdomain(settings.SubdomainPattern, project.Name, stage.Name, imageTag, stage.Subdomain)
|
||||
}
|
||||
|
||||
// buildRegistryAuth constructs the Docker registry auth string for pulling images.
|
||||
// If the project has a registry configured, it looks up the registry token.
|
||||
func (d *Deployer) buildRegistryAuth(project store.Project) (string, error) {
|
||||
if project.Registry == "" {
|
||||
return "", nil
|
||||
}
|
||||
|
||||
reg, err := d.store.GetRegistryByName(project.Registry)
|
||||
if err != nil {
|
||||
return "", fmt.Errorf("get registry %s: %w", project.Registry, err)
|
||||
}
|
||||
|
||||
if reg.Token != "" {
|
||||
decrypted, err := crypto.Decrypt(d.encKey, reg.Token)
|
||||
if err != nil {
|
||||
return "", fmt.Errorf("decrypt registry token: %w", err)
|
||||
}
|
||||
return docker.EncodeRegistryAuth(decrypted, decrypted, reg.URL)
|
||||
}
|
||||
return "", nil
|
||||
}
|
||||
|
||||
// mergeEnvVars builds the final environment variable list for a container:
|
||||
// 1. Parse project-level env JSON
|
||||
// 2. Overlay with stage-level env overrides (stage wins on key conflict)
|
||||
// 3. Decrypt any encrypted (secret) values
|
||||
// Returns a []string of KEY=VALUE pairs.
|
||||
func (d *Deployer) mergeEnvVars(project store.Project, stageID string) []string {
|
||||
// Step 1: Parse project-level env.
|
||||
envMap := make(map[string]string)
|
||||
if project.Env != "" && project.Env != "{}" {
|
||||
var projectEnv map[string]string
|
||||
if err := json.Unmarshal([]byte(project.Env), &projectEnv); err != nil {
|
||||
slog.Warn("parse project env vars", "error", err)
|
||||
} else {
|
||||
for k, v := range projectEnv {
|
||||
envMap[k] = v
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Step 2: Overlay with stage-level overrides.
|
||||
stageEnvs, err := d.store.GetStageEnvByStageID(stageID)
|
||||
if err != nil {
|
||||
slog.Warn("get stage env overrides", "stage_id", stageID, "error", err)
|
||||
} else {
|
||||
for _, se := range stageEnvs {
|
||||
value := se.Value
|
||||
if se.Encrypted {
|
||||
// Step 3: Decrypt secret values.
|
||||
decrypted, err := crypto.Decrypt(d.encKey, se.Value)
|
||||
if err != nil {
|
||||
slog.Warn("decrypt stage env value", "key", se.Key, "error", err)
|
||||
continue
|
||||
}
|
||||
value = decrypted
|
||||
}
|
||||
envMap[se.Key] = value
|
||||
}
|
||||
}
|
||||
|
||||
vars := make([]string, 0, len(envMap))
|
||||
for k, v := range envMap {
|
||||
vars = append(vars, k+"="+v)
|
||||
}
|
||||
return vars
|
||||
}
|
||||
|
||||
// computeVolumeMounts builds Docker mount specifications from the project's volume config.
|
||||
// Uses the shared volume.ResolvePath for path resolution.
|
||||
func (d *Deployer) computeVolumeMounts(projectID, projectName, stageName, imageTag, basePath string) []mount.Mount {
|
||||
vols, err := d.store.GetVolumesByProjectID(projectID)
|
||||
if err != nil {
|
||||
slog.Warn("get project volumes", "project_id", projectID, "error", err)
|
||||
return nil
|
||||
}
|
||||
|
||||
if len(vols) == 0 {
|
||||
return nil
|
||||
}
|
||||
|
||||
params := volume.ResolveParams{
|
||||
BasePath: basePath,
|
||||
ProjectName: projectName,
|
||||
StageName: stageName,
|
||||
ImageTag: imageTag,
|
||||
}
|
||||
|
||||
mounts := make([]mount.Mount, 0, len(vols))
|
||||
for _, vol := range vols {
|
||||
scope := vol.Scope
|
||||
if scope == "" {
|
||||
switch vol.Mode {
|
||||
case "isolated":
|
||||
scope = "instance"
|
||||
default:
|
||||
scope = "project"
|
||||
}
|
||||
}
|
||||
|
||||
// Ephemeral volumes use tmpfs — no host path.
|
||||
if scope == "ephemeral" {
|
||||
mounts = append(mounts, mount.Mount{
|
||||
Type: mount.TypeTmpfs,
|
||||
Target: vol.Target,
|
||||
})
|
||||
continue
|
||||
}
|
||||
|
||||
source, err := volume.ResolvePath(vol, params)
|
||||
if err != nil {
|
||||
slog.Warn("resolve volume path", "volume_id", vol.ID, "error", err)
|
||||
continue
|
||||
}
|
||||
|
||||
mounts = append(mounts, mount.Mount{
|
||||
Type: mount.TypeBind,
|
||||
Source: source,
|
||||
Target: vol.Target,
|
||||
})
|
||||
}
|
||||
return mounts
|
||||
}
|
||||
|
||||
// logDeploy appends a log entry for a deploy and publishes it on the event bus.
|
||||
// Errors are logged to stderr but not propagated.
|
||||
func (d *Deployer) logDeploy(deployID, message, level string) {
|
||||
if err := d.store.AppendDeployLog(deployID, message, level); err != nil {
|
||||
slog.Warn("append deploy log", "error", err)
|
||||
}
|
||||
if d.eventBus != nil {
|
||||
d.eventBus.Publish(events.Event{
|
||||
Type: events.EventDeployLog,
|
||||
Payload: events.DeployLogPayload{
|
||||
DeployID: deployID,
|
||||
Message: message,
|
||||
Level: level,
|
||||
},
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
// publishDeployStatus publishes a deploy status change event on the bus.
|
||||
func (d *Deployer) publishDeployStatus(deployID, projectID, stageID, imageTag, status, deployErr string) {
|
||||
if d.eventBus != nil {
|
||||
d.eventBus.Publish(events.Event{
|
||||
Type: events.EventDeployStatus,
|
||||
Payload: events.DeployStatusPayload{
|
||||
DeployID: deployID,
|
||||
ProjectID: projectID,
|
||||
StageID: stageID,
|
||||
ImageTag: imageTag,
|
||||
Status: status,
|
||||
Error: deployErr,
|
||||
},
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
// publishInstanceStatus publishes an instance status change event on the bus.
|
||||
func (d *Deployer) publishInstanceStatus(instanceID, projectID, stageID, status string) {
|
||||
if d.eventBus != nil {
|
||||
d.eventBus.Publish(events.Event{
|
||||
Type: events.EventInstanceStatus,
|
||||
Payload: events.InstanceStatusPayload{
|
||||
InstanceID: instanceID,
|
||||
ProjectID: projectID,
|
||||
StageID: stageID,
|
||||
Status: status,
|
||||
},
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
// ensureDNS creates or updates a DNS record for the given FQDN. Best-effort: logs warnings on failure.
|
||||
func (d *Deployer) ensureDNS(ctx context.Context, fqdn, consumerType, consumerID, deployID string) {
|
||||
dnsProvider := d.getDNS()
|
||||
if dnsProvider == nil {
|
||||
return
|
||||
}
|
||||
settings, err := d.store.GetSettings()
|
||||
if err != nil {
|
||||
slog.Warn("dns: get settings for server IP", "error", err)
|
||||
return
|
||||
}
|
||||
if settings.ServerIP == "" {
|
||||
slog.Warn("dns: server IP not configured, skipping DNS record creation", "fqdn", fqdn)
|
||||
return
|
||||
}
|
||||
|
||||
recordID, err := dnsProvider.EnsureRecord(ctx, fqdn, settings.ServerIP)
|
||||
if err != nil {
|
||||
msg := fmt.Sprintf("DNS: failed to create/update record for %s: %v", fqdn, err)
|
||||
slog.Warn(msg)
|
||||
if deployID != "" {
|
||||
d.logDeploy(deployID, msg, "warn")
|
||||
}
|
||||
return
|
||||
}
|
||||
|
||||
// Track the record locally.
|
||||
if _, err := d.store.CreateDNSRecord(store.DNSRecord{
|
||||
FQDN: fqdn,
|
||||
ProviderRecordID: recordID,
|
||||
ConsumerType: consumerType,
|
||||
ConsumerID: consumerID,
|
||||
}); err != nil {
|
||||
// May already exist — try updating.
|
||||
if updateErr := d.store.UpdateDNSRecordProviderID(fqdn, recordID); updateErr != nil {
|
||||
slog.Warn("dns: failed to track record", "fqdn", fqdn, "error", updateErr)
|
||||
}
|
||||
}
|
||||
|
||||
logMsg := fmt.Sprintf("DNS: record ensured for %s", fqdn)
|
||||
slog.Info(logMsg)
|
||||
if deployID != "" {
|
||||
d.logDeploy(deployID, logMsg, "info")
|
||||
}
|
||||
}
|
||||
|
||||
// removeDNS deletes a DNS record for the given FQDN. Best-effort: logs warnings on failure.
|
||||
func (d *Deployer) removeDNS(ctx context.Context, fqdn, deployID string) {
|
||||
dnsProvider := d.getDNS()
|
||||
if dnsProvider == nil {
|
||||
return
|
||||
}
|
||||
|
||||
if err := dnsProvider.DeleteRecord(ctx, fqdn); err != nil {
|
||||
msg := fmt.Sprintf("DNS: failed to delete record for %s: %v", fqdn, err)
|
||||
slog.Warn(msg)
|
||||
if deployID != "" {
|
||||
d.logDeploy(deployID, msg, "warn")
|
||||
}
|
||||
return
|
||||
}
|
||||
|
||||
// Remove local tracking.
|
||||
if err := d.store.DeleteDNSRecord(fqdn); err != nil {
|
||||
slog.Warn("dns: failed to remove tracking record", "fqdn", fqdn, "error", err)
|
||||
}
|
||||
|
||||
logMsg := fmt.Sprintf("DNS: record deleted for %s", fqdn)
|
||||
slog.Info(logMsg)
|
||||
if deployID != "" {
|
||||
d.logDeploy(deployID, logMsg, "info")
|
||||
}
|
||||
}
|
||||
|
||||
// truncateID safely truncates a Docker ID to 12 characters for display.
|
||||
func truncateID(id string) string {
|
||||
if len(id) > 12 {
|
||||
return id[:12]
|
||||
}
|
||||
return id
|
||||
}
|
||||
|
||||
// resolveProjectWorkloadID returns the workload ID paired with a project.
|
||||
// Backfill-on-boot guarantees the row exists, so this is essentially a lookup.
|
||||
// On miss (defensive), it logs and returns empty so the caller can decide.
|
||||
func (d *Deployer) resolveProjectWorkloadID(projectID string) string {
|
||||
w, err := d.store.GetWorkloadByRef(store.WorkloadKindProject, projectID)
|
||||
if err != nil {
|
||||
slog.Warn("resolve project workload", "project_id", projectID, "error", err)
|
||||
return ""
|
||||
}
|
||||
return w.ID
|
||||
}
|
||||
|
||||
|
||||
Reference in New Issue
Block a user