feat(observability): event triggers + log scanner backend

Two paired backends sharing the events.Bus seam:

Event triggers (consumer-side):
- internal/store/event_triggers.go: CRUD with action_secret
  redaction on read (placeholder echo treated as "no change" on
  PATCH so secrets aren't accidentally wiped).
- internal/events/dispatcher.go: bus subscriber, AND-composed
  filters (severity CSV, source CSV, message regex with memoized
  compile cache). Structural loop-prevention: never writes to
  event_log. Sends via notifier.SendPayload.
- internal/notify: SendPayload + SendSyncForTestPayload methods,
  TierEventTrigger constant, doSendRaw shared with the legacy
  Event-shaped path.
- internal/api/event_triggers.go: admin-gated CRUD + /test
  sending the real TriggerWebhookPayload shape. SSRF guard
  rejects loopback / link-local / unspecified targets. PATCH
  uses pointer-typed DTO for partial updates.
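
The AND-composed filter matching described for the dispatcher can be sketched roughly as follows. This is an illustration under assumptions, not the code in internal/events/dispatcher.go: the `Trigger` and `Event` types, the `regexCache` type, and the `inCSV` and `matches` helpers are hypothetical names, and treating an invalid regex as "never matches" is an assumed policy.

```go
package main

import (
	"regexp"
	"strings"
	"sync"
)

// Trigger mirrors only the filter fields of the EventTrigger model.
type Trigger struct {
	FilterSeverity     string // comma list; "" = any
	FilterSource       string // comma list; "" = any
	FilterMessageRegex string // "" = any
}

// Event is a hypothetical stand-in for the EventLog fields used in matching.
type Event struct {
	Severity, Source, Message string
}

// regexCache memoizes compiled patterns so each pattern is compiled once.
type regexCache struct {
	mu sync.Mutex
	m  map[string]*regexp.Regexp
}

func (c *regexCache) get(pattern string) (*regexp.Regexp, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if re, ok := c.m[pattern]; ok {
		return re, nil
	}
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	if c.m == nil {
		c.m = make(map[string]*regexp.Regexp)
	}
	c.m[pattern] = re
	return re, nil
}

// inCSV reports whether value appears in a comma-separated list.
// An empty list means "no filter on this dimension".
func inCSV(list, value string) bool {
	if strings.TrimSpace(list) == "" {
		return true
	}
	for _, item := range strings.Split(list, ",") {
		if strings.TrimSpace(item) == value {
			return true
		}
	}
	return false
}

// matches AND-composes the three filters. Assumption: an invalid regex
// never matches rather than matching everything.
func matches(c *regexCache, t Trigger, e Event) bool {
	if !inCSV(t.FilterSeverity, e.Severity) || !inCSV(t.FilterSource, e.Source) {
		return false
	}
	if t.FilterMessageRegex == "" {
		return true
	}
	re, err := c.get(t.FilterMessageRegex)
	if err != nil {
		return false
	}
	return re.MatchString(e.Message)
}
```

A trigger with all three filters empty matches every event, which lines up with the "empty string means no filter" convention in the model comments.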

Log scanner (producer-side):
- internal/logscanner/:
  - engine: per-rule cooldown + per-container token bucket, atomic
    drop counters.
  - tail: multiplexed docker frame demuxer with TTY fallback, 16 MiB
    payload cap, 1 MiB reassembly cap, RFC3339Nano-validated
    timestamp strip, UTF-8-safe message truncation.
  - manager: 5s container polling, atomic.Pointer[Snapshot]
    hot-reload, HitEmitter writes event_log + publishes EventLog so
    the trigger dispatcher picks them up immediately.
- internal/docker/container.go: ContainerLogsOpts exposes
  stream selection for stderr-only / stdout-only rules.
- internal/store: log_scan_rules table + CRUD with
  EffectiveLogScanRules resolver (globals minus per-workload
  overrides plus workload-only additions). Transactional
  cascade-delete of overrides when a global rule is removed.
- internal/api/log_scan_rules.go: admin-gated CRUD + /test
  (sample_line → matched/captures) + /stats (drop counters +
  active tail count + last-snapshot compile errors) +
  GET /api/workloads/{id}/effective-rules.
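
One way to read the "globals minus per-workload overrides plus workload-only additions" rule is sketched below. It is not the EffectiveLogScanRules implementation from internal/store: the `effectiveRules` function is a hypothetical name, the struct is trimmed to the fields needed here, and it assumes that an enabled override replaces its global rule while a disabled override simply drops it.

```go
package main

// LogScanRule is trimmed to the fields the resolution logic needs.
type LogScanRule struct {
	ID          int64
	WorkloadID  string // "" = global
	OverridesID int64  // 0 = not an override
	Name        string
	Enabled     bool
}

// effectiveRules returns the enabled rules that apply to one workload,
// in input order. Assumption: an override row stands in for the global
// rule it names, so a disabled override removes that rule entirely.
func effectiveRules(all []LogScanRule, workloadID string) []LogScanRule {
	// Index this workload's overrides by the global rule ID they target.
	overrides := make(map[int64]LogScanRule)
	for _, r := range all {
		if r.WorkloadID == workloadID && r.OverridesID != 0 {
			overrides[r.OverridesID] = r
		}
	}

	var out []LogScanRule
	for _, r := range all {
		switch {
		case r.WorkloadID == "" && r.OverridesID == 0:
			// Global rule: swap in the override if this workload has one.
			if o, ok := overrides[r.ID]; ok {
				r = o
			}
		case r.WorkloadID == workloadID && r.OverridesID == 0:
			// Workload-only addition: keep as is.
		default:
			// Rows for other workloads, or override rows handled above.
			continue
		}
		if r.Enabled {
			out = append(out, r)
		}
	}
	return out
}
```

With two globals (IDs 1 and 2), a disabled override of rule 1 for workload "w1", and a "w1"-only rule with ID 4, resolving for "w1" under these assumptions yields rules 2 and 4.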

cmd/server/main.go wires both subsystems next to the existing
RegisterPersistentLogger. Coverage spans engine cooldown / bucket
counter tests, snapshot effective-set semantics, manager
compile-error capture, dispatcher matching, store validation +
cascade-delete, API URL validator + secret redaction.
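
For orientation, a URL validator of the kind the SSRF guard implies might look like the sketch below. The `validateWebhookTarget` name and the error messages are hypothetical, and the sketch only inspects literal IP hosts. Whether the real guard also resolves hostnames before checking is not stated in this commit, so hostnames pass through unchecked here.

```go
package main

import (
	"errors"
	"net"
	"net/url"
)

// validateWebhookTarget rejects URLs whose host is a loopback,
// link-local, or unspecified IP literal. Assumption: hostnames are not
// resolved in this sketch, so only literal IPs are screened.
func validateWebhookTarget(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return err
	}
	host := u.Hostname() // strips the port and any IPv6 brackets
	if host == "" {
		return errors.New("target has no host")
	}
	ip := net.ParseIP(host)
	if ip == nil {
		return nil // a hostname, not an IP literal
	}
	if ip.IsLoopback() || ip.IsLinkLocalUnicast() ||
		ip.IsLinkLocalMulticast() || ip.IsUnspecified() {
		return errors.New("target address is not allowed")
	}
	return nil
}
```

Under these assumptions, `http://127.0.0.1:8080/hook` and `http://169.254.169.254/` are rejected, while `https://example.com/hook` is accepted.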
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
+126 -4
@@ -197,6 +197,34 @@ type StageEnv struct {
 	UpdatedAt string `json:"updated_at"`
 }
 
+// WorkloadVolume is the plugin-shape equivalent of legacy Volume: a
+// per-workload mount declaration. The Scope enum matches the existing
+// VolumeScope contract so the legacy resolver can be reused once its
+// project_id assumption is loosened.
+type WorkloadVolume struct {
+	ID         string `json:"id"`
+	WorkloadID string `json:"workload_id"`
+	Source     string `json:"source"`
+	Target     string `json:"target"`
+	Scope      string `json:"scope"`
+	Name       string `json:"name"`
+	CreatedAt  string `json:"created_at"`
+	UpdatedAt  string `json:"updated_at"`
+}
+
+// WorkloadEnv is the plugin-shape equivalent of StageEnv: per-workload
+// environment variable overrides, optionally encrypted at rest. Read by
+// the Source plugin at deploy time, merged on top of source_config.env.
+type WorkloadEnv struct {
+	ID         string `json:"id"`
+	WorkloadID string `json:"workload_id"`
+	Key        string `json:"key"`
+	Value      string `json:"value"`
+	Encrypted  bool   `json:"encrypted"`
+	CreatedAt  string `json:"created_at"`
+	UpdatedAt  string `json:"updated_at"`
+}
+
 // VolumeScope defines the sharing scope for a volume mount.
 // Valid scopes: instance, stage, project, project_named, named, ephemeral.
 type VolumeScope string
@@ -333,6 +361,82 @@ type EventLog struct {
 	CreatedAt string `json:"created_at"`
 }
 
+// EventTrigger is a filter+action rule evaluated against EventLog
+// entries published on the bus. When all non-empty filters match, the
+// trigger fires its configured action (webhook today, additional action
+// types extensible via the ActionType enum).
+//
+// Filter fields use a comma-separated list shape for multi-value
+// filters (severity, source) to keep the schema flat — empty string
+// means "no filter on this dimension." FilterMessageRegex is a single
+// regex evaluated against EventLog.Message.
+//
+// Loop-prevention: deliveries are recorded in webhook_deliveries (the
+// existing audit trail). The dispatcher MUST NOT write to event_log
+// or it will recurse.
+type EventTrigger struct {
+	ID                 int64  `json:"id"`
+	Name               string `json:"name"`
+	FilterSeverity     string `json:"filter_severity"`      // comma list: "warn,error"; "" = any
+	FilterSource       string `json:"filter_source"`        // comma list: "logscan,deploy"; "" = any
+	FilterMessageRegex string `json:"filter_message_regex"` // "" = any
+	ActionType         string `json:"action_type"`          // "webhook" today
+	ActionTarget       string `json:"action_target"`        // URL for webhook
+	ActionSecret       string `json:"action_secret"`        // optional HMAC secret for signed delivery
+	Enabled            bool   `json:"enabled"`
+	CreatedAt          string `json:"created_at"`
+	UpdatedAt          string `json:"updated_at"`
+}
+
+// EventTriggerActionType enumerates the supported action_type values.
+// Adding a new action is additive — old triggers keep working, the
+// dispatcher just learns a new branch.
+const (
+	EventTriggerActionWebhook = "webhook"
+)
+
+// LogScanRule is one regex-based pattern the log scanner evaluates
+// against container log lines. The (workload_id, overrides_id) pair
+// implements the "global rule with optional per-workload override"
+// pattern documented in docs/LOGSCAN_AND_TRIGGERS_TODO.md:
+//
+//   - WorkloadID == "" && OverridesID == 0 → global rule, applies to
+//     every workload unless overridden.
+//   - WorkloadID != "" && OverridesID == 0 → workload-only addition.
+//   - WorkloadID != "" && OverridesID != 0 → override of the named
+//     global rule for one workload (Enabled=false to disable globally
+//     for this workload).
+type LogScanRule struct {
+	ID              int64  `json:"id"`
+	WorkloadID      string `json:"workload_id"`  // "" = global
+	OverridesID     int64  `json:"overrides_id"` // 0 = not an override
+	Name            string `json:"name"`
+	Pattern         string `json:"pattern"`  // regex, compiled at load
+	Severity        string `json:"severity"` // info|warn|error
+	Streams         string `json:"streams"`  // all|stdout|stderr
+	CooldownSeconds int    `json:"cooldown_seconds"`
+	Enabled         bool   `json:"enabled"`
+	CreatedAt       string `json:"created_at"`
+	UpdatedAt       string `json:"updated_at"`
+}
+
+// Log scan stream filter values. "all" reads both streams; "stdout"
+// or "stderr" filter to one. Used both for store validation and at
+// docker-side log read time.
+const (
+	LogScanStreamAll    = "all"
+	LogScanStreamStdout = "stdout"
+	LogScanStreamStderr = "stderr"
+)
+
+// Log scan severity values mirror the event_log enum so a matched
+// rule lands as an event_log row with the rule's severity verbatim.
+const (
+	LogScanSeverityInfo  = "info"
+	LogScanSeverityWarn  = "warn"
+	LogScanSeverityError = "error"
+)
+
 // WorkloadKind enumerates the kinds of things that own containers.
 // Each kind has a corresponding row in projects/stacks/static_sites referenced via Workload.RefID.
 type WorkloadKind string
@@ -346,12 +450,24 @@ const (
 // Workload is the unifying primitive that abstracts Project, Stack, and StaticSite.
 // Each row is paired with exactly one project/stack/site via (Kind, RefID).
 // Notification + webhook config moves here so it lives in one place across kinds.
+//
+// SourceKind / SourceConfig / TriggerKind / TriggerConfig / PublicFaces /
+// ParentWorkloadID populate the unified plugin model from the Workload-first
+// refactor. Existing rows keep these empty until they are explicitly migrated
+// or replaced — the legacy Kind/RefID columns continue to point at
+// project/stack/site rows in parallel during the cutover.
 type Workload struct {
 	ID                 string `json:"id"`
-	Kind               string `json:"kind"` // project | stack | site
+	Kind               string `json:"kind"` // project | stack | site (legacy discriminator)
 	RefID              string `json:"ref_id"`
 	Name               string `json:"name"`
-	AppID              string `json:"app_id"` // nullable; "" = unassigned
+	AppID              string `json:"app_id"`        // nullable; "" = unassigned (a.k.a. GroupID after rename)
+	SourceKind         string `json:"source_kind"`   // "" until plugin-mode populated
+	SourceConfig       string `json:"source_config"` // JSON-encoded, decoded by the matching Source
+	TriggerKind        string `json:"trigger_kind"`
+	TriggerConfig      string `json:"trigger_config"`     // JSON-encoded, decoded by the matching Trigger
+	PublicFaces        string `json:"public_faces"`       // JSON-encoded []PublicFace
+	ParentWorkloadID   string `json:"parent_workload_id"` // "" = root; non-empty = stage chain
 	NotificationURL    string `json:"notification_url"`
 	NotificationSecret string `json:"-"` // never serialized
 	WebhookSecret      string `json:"-"` // URL-identifier secret; never serialized
@@ -384,8 +500,14 @@ type Container struct {
 	ProxyRouteID string `json:"proxy_route_id"`
 	NpmProxyID   int    `json:"npm_proxy_id"`
 	LastSeenAt   string `json:"last_seen_at"`
-	CreatedAt    string `json:"created_at"`
-	UpdatedAt    string `json:"updated_at"`
+	// ExtraJSON carries source-specific metadata that isn't promoted to a
+	// first-class column — currently per-face proxy route IDs for
+	// multi-face image deploys. Stored as a JSON object; '{}' on empty
+	// rows. Sources own the shape; consumers should tolerate unknown
+	// keys.
+	ExtraJSON string `json:"extra_json"`
+	CreatedAt string `json:"created_at"`
+	UpdatedAt string `json:"updated_at"`
 }
 
 // App is an optional grouping of workloads (e.g., "my-saas" = web project + worker stack + redis stack).