Phase 3: Event instrumentation (4 categories)

Status: Not Started
Parent plan: PLAN.md
Domain: backend · 🔒 security-sensitive (security reviewer triggers)

Objective

Emit audit records at the real call sites for all four categories, using the Phase 2 recorder. Maximize coverage via the central fire_entity_event choke point; add explicit recorder.record(...) calls for non-entity events. Never log secrets.

Tasks

Entity CRUD (via the choke point)

  • In api/dependencies.py, extend fire_entity_event to ALSO record an audit entry:
    • Signature gains an optional entity_name: str | None = None.
    • For created/updated: if entity_name not supplied, best-effort resolve from the matching store in _deps keyed by entity_type (entity still present). For deleted: do not resolve post-hoc — rely on the explicit entity_name passed by the handler (deletes are the most important; a name-less delete entry is unacceptable).
    • Map action → severity (info), category entity. Build a human message (e.g. "Target 'Desk' updated"). Read actor from the ContextVar.
    • Recording is best-effort (never break the entity operation).
  • Update entity delete handlers to pass entity_name into fire_entity_event (the entity object is already loaded for the 404 check). Cover the representative/most-used entities at minimum: output targets, sync clocks, devices, picture/audio/color-strip sources, automations, scene presets/playlists, templates, gradients. (Create/update can rely on hook resolution but pass the name where trivially available.)
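The choke-point behaviour described above could look roughly like the following. This is a minimal sketch, not the project's code: `current_actor`, `InMemoryRecorder`, `recorder`, and `_stores` are hypothetical stand-ins for the Phase 2 ContextVar, the recorder, and the stores in `_deps`, and the parameter order of `fire_entity_event` is assumed.

```python
from __future__ import annotations

import logging
from contextvars import ContextVar
from dataclasses import dataclass, field

logger = logging.getLogger(__name__)

# Hypothetical stand-ins for Phase 2 pieces; the real names may differ.
current_actor: ContextVar[str] = ContextVar("current_actor", default="system")


@dataclass
class InMemoryRecorder:
    entries: list[dict] = field(default_factory=list)

    def record(self, **entry) -> None:
        self.entries.append(entry)


recorder = InMemoryRecorder()
# Assumed shape: entity_type -> {entity_id: name}.
_stores: dict[str, dict[str, str]] = {"target": {"t1": "Desk"}}


def fire_entity_event(entity_type: str, action: str, entity_id: str,
                      entity_name: str | None = None) -> None:
    # ... the existing event broadcast would run here, unchanged ...
    try:
        # Resolve the name only while the entity still exists; a delete
        # must carry the name passed in by its handler.
        if entity_name is None and action != "deleted":
            entity_name = _stores.get(entity_type, {}).get(entity_id)
        label = entity_type.replace("_", " ").capitalize()
        recorder.record(
            category="entity",
            severity="info",
            action=action,
            actor=current_actor.get(),
            entity_type=entity_type,
            entity_id=entity_id,
            entity_name=entity_name,
            message=f"{label} '{entity_name or entity_id}' {action}",
        )
    except Exception:
        # Best-effort: an audit failure must never break the entity operation.
        logger.exception("audit record failed for %s %s", entity_type, action)
```

A delete handler would then call something like `fire_entity_event("target", "deleted", target_id, entity_name=target.name)` using the object it already loaded for the 404 check. The ContextVar default of `"system"` covers engine and thread call sites that never set an actor.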

Authentication (DESCOPED: no key create/rotate/revoke — those routes don't exist)

  • In api/auth.py, record:
    • auth failures: missing/invalid Bearer token (HTTP), rejected LAN-without-keys, rejected WS origin (4403), WS auth handshake failure (4401). Category auth, severity warning. Include the caller IP/label and the reason in metadata; never record the attempted token.
    • WS session establishment (successful accept_and_authenticate_ws): category auth, severity info, actor = authenticated label.
    • (Do NOT record per-request HTTP auth success — too frequent.)
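One way to keep the attempted token out of the audit trail is to give the recording helper no token parameter at all. The sketch below assumes hypothetical names throughout (`audit_entries`, `record_auth_failure`, `check_bearer`, and the reason strings); the real `api/auth.py` flow may be structured differently.

```python
from __future__ import annotations

import hmac

# Hypothetical stand-in for the Phase 2 recorder.
audit_entries: list[dict] = []


def record_auth_failure(reason: str, client_ip: str, label: str | None = None) -> dict:
    """Store a warning record. There is deliberately no token parameter."""
    entry = {
        "category": "auth",
        "severity": "warning",
        "actor": label or "anonymous",
        "message": f"Authentication failed: {reason}",
        "metadata": {"ip": client_ip, "reason": reason},
    }
    audit_entries.append(entry)
    return entry


def check_bearer(header: str | None, valid_tokens: dict[str, str],
                 client_ip: str) -> str | None:
    """Return the authenticated label, or None after recording the failure."""
    if not header or not header.startswith("Bearer "):
        record_auth_failure("missing_bearer_token", client_ip)
        return None
    attempted = header[len("Bearer "):].encode()
    for token, label in valid_tokens.items():
        # Constant-time comparison; the attempted value stays local.
        if hmac.compare_digest(token.encode(), attempted):
            return label  # success is not recorded per request
    record_auth_failure("invalid_bearer_token", client_ip)
    return None
```

Because the reason is a fixed string chosen by the caller rather than anything derived from the header, the record cannot echo the header contents by accident.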

Device connect/disconnect (use existing discrete seams)

  • Hook device_health_changed (core/processing/device_health.py, fired only on online != prev_online) → record online/offline transition. Category device, severity info (online) / warning (offline).
  • Hook device_discovered / device_lost (core/devices/discovery_watcher.py, runs on the zeroconf thread → recorder must marshal to the loop, which Phase 2 handles). Category device.
  • ADB connect/disconnect (api/routes/system_settings.py:adb_connect/adb_disconnect).
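The device hooks can fire on a non-loop thread, so the record call has to hop onto the event loop. Phase 2 is stated to handle that; the sketch below only illustrates the idea with `loop.call_soon_threadsafe`. `LoopBoundRecorder`, `severity_for_health`, and `on_health_changed` are hypothetical names, and the real recorder may marshal differently.

```python
from __future__ import annotations

import asyncio
import threading


def severity_for_health(online: bool) -> str:
    # Coming online is routine; going offline deserves attention.
    return "info" if online else "warning"


class LoopBoundRecorder:
    """Hypothetical recorder whose record() is safe to call from any thread."""

    def __init__(self, loop: asyncio.AbstractEventLoop) -> None:
        self._loop = loop
        self.entries: list[dict] = []

    def record(self, **entry) -> None:
        # Schedule the append on the loop thread instead of mutating state here.
        self._loop.call_soon_threadsafe(self.entries.append, entry)


async def demo() -> list[dict]:
    recorder = LoopBoundRecorder(asyncio.get_running_loop())

    def on_health_changed(name: str, online: bool) -> None:
        state = "online" if online else "offline"
        recorder.record(
            category="device",
            severity=severity_for_health(online),
            actor="system",
            message=f"Device '{name}' went {state}",
        )

    # Simulate a watcher thread firing the hook.
    worker = threading.Thread(target=on_health_changed, args=("wled-desk", False))
    worker.start()
    worker.join()
    await asyncio.sleep(0)  # yield once so the scheduled callback runs
    return recorder.entries
```

Running `asyncio.run(demo())` returns the single offline record written from the worker thread. The actor is `"system"` because no request context exists on that thread.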

Capture & system events (explicit record calls)

  • Target processing start/stop + bulk (api/routes/output_targets_control.py).
  • Scene activation (scene_presets.py:activate_scene_preset), playlist start/stop (scene_playlists.py), automation activate/deactivate (automation_engine.py).
  • System: backup create/restore/delete (backup.py), update apply/dismiss (update.py), restart/shutdown (backup.py), calibration start/stop/cancel (calibration.py).
  • Settings changes: scope to high-value settings only (auto-backup, update, shutdown action). Exclude the activity-log's own "activity_log" settings key to avoid self-referential churn.
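The settings scoping can be expressed as an allowlist plus an explicit exclusion. In this sketch only the `activity_log` key comes from the plan; the other key names and the helper `audited_setting_changes` are assumptions for illustration.

```python
from __future__ import annotations

# Hypothetical key names for the high-value settings.
AUDITED_SETTINGS_KEYS = frozenset({"auto_backup", "update", "shutdown_action"})
# The activity log's own key is excluded to avoid self-referential churn.
EXCLUDED_SETTINGS_KEYS = frozenset({"activity_log"})


def audited_setting_changes(old: dict, new: dict) -> list[str]:
    """Return the sorted keys that changed and are worth an audit record."""
    changed = [key for key in sorted(new) if old.get(key) != new.get(key)]
    return [
        key for key in changed
        if key in AUDITED_SETTINGS_KEYS and key not in EXCLUDED_SETTINGS_KEYS
    ]
```

Returning key names rather than values keeps setting contents out of the record by default, which fits the "never log secrets" objective. A handler would emit one record per returned key.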

Tests

  • server/tests/test_activity_instrumentation.py (or per-area):
    • representative entity create/update/delete produces a record with correct category/actor/name (incl. a delete carrying its name);
    • an auth failure produces a warning record and the token never appears in any field;
    • a device health transition and a discovery event produce records;
    • a capture start and a backup/restore produce records.
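For the "token never appears in any field" check, a recursive scan over the recorded entries is more robust than asserting on one known field. The helper below is a hypothetical test utility, assuming records are plain dicts of nested dicts, lists, and scalars.

```python
from __future__ import annotations


def find_secret_leaks(entries: list[dict], secret: str) -> list[str]:
    """Return the path of every key or value that contains `secret`."""
    leaks: list[str] = []

    def walk(value, path: str) -> None:
        if isinstance(value, dict):
            for key, item in value.items():
                if secret in str(key):
                    leaks.append(f"{path}.{key} (key)")
                walk(item, f"{path}.{key}")
        elif isinstance(value, (list, tuple)):
            for index, item in enumerate(value):
                walk(item, f"{path}[{index}]")
        elif secret in str(value):
            leaks.append(path)

    for index, entry in enumerate(entries):
        walk(entry, f"entries[{index}]")
    return leaks
```

A test would trigger an auth failure with a known dummy token, collect the recorded entries, and assert that `find_secret_leaks(entries, dummy_token)` is empty. The returned paths make a failing test point directly at the offending field.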

Files to Modify/Create

  • server/src/ledgrab/api/dependencies.py — modify: fire_entity_event records + entity_name
  • entity delete route handlers under api/routes/ — modify: pass entity_name
  • server/src/ledgrab/api/auth.py — modify: auth-failure + WS-session records
  • server/src/ledgrab/core/processing/device_health.py — modify: online/offline record
  • server/src/ledgrab/core/devices/discovery_watcher.py — modify: discovered/lost record
  • server/src/ledgrab/api/routes/system_settings.py — modify: ADB + settings records
  • server/src/ledgrab/api/routes/output_targets_control.py — modify: start/stop records
  • server/src/ledgrab/api/routes/{scene_presets,scene_playlists,backup,update,calibration}.py — modify
  • server/src/ledgrab/core/automations/automation_engine.py — modify: activate/deactivate records
  • server/tests/test_activity_instrumentation.py — new

Acceptance Criteria

  • All four categories emit records at the named sites; entity deletes carry the entity name.
  • API-key tokens / secrets never appear in any audit field (test-enforced).
  • Recording never breaks the audited action (best-effort; failures swallowed + logged).
  • Actor is the authenticated label for request-originated events, "system" for engine/thread events.
  • New + existing tests green; ruff clean.

Notes

  • Get the recorder via the Phase 2 DI getter; for engine/thread sites that lack DI, use the module singleton/accessor Phase 2 exposes.
  • Keep messages human-readable and locale-agnostic (English source strings; the frontend renders structured fields rather than translating server messages, so message is a fallback/summary).
  • This is the security-sensitive phase — the security reviewer runs here AND at final review.

Review Checklist

  • All tasks completed
  • Code follows project conventions
  • No unintended side effects (audited actions still succeed on recorder failure)
  • No secrets logged (token never recorded) — explicitly verified
  • Build passes (ruff + pytest)
  • Tests pass (new + existing)

Handoff to Next Phase