ledgrab/plans/activity-log/phase-3-instrumentation.md
alexei.dolgolyov 25c613c5cb feat(activity-log): phase 3 - event instrumentation (4 categories)
- entity CRUD via fire_entity_event choke point (name resolved/sanitized; deletes pass name explicitly)
- auth: failures + WS session establishment (no tokens logged); per-IP audit-record throttle
- device: online/offline (health), discovered/lost (zeroconf), ADB connect/disconnect
- capture/system: target start-stop, scenes, playlists, automations, backup/restore, update, restart, calibration, settings
- security hardening: sanitize_display strips control/NUL/ANSI/newlines from untrusted strings; malformed-IPv6 origin guard
- 129 instrumentation tests (incl. secret-leak, log-injection, throttle, best-effort) + autouse throttle-reset fixture
2026-06-09 19:20:57 +03:00


Phase 3: Event instrumentation (4 categories)

Status: Done
Parent plan: PLAN.md
Domain: backend · 🔒 security-sensitive (security reviewer triggers)

Objective

Emit audit records at the real call sites for all four categories, using the Phase 2 recorder. Maximize coverage via the central fire_entity_event choke point; add explicit recorder.record(...) calls for non-entity events. Never log secrets.

Tasks

Entity CRUD (via the choke point)

  • In api/dependencies.py, extend fire_entity_event to ALSO record an audit entry:
    • Signature gains an optional entity_name: str | None = None.
    • For created/updated: if entity_name is not supplied, resolve it best-effort from the matching store in _deps, keyed by entity_type (the entity is still present at that point). For deleted: do not resolve post-hoc; rely on the explicit entity_name passed by the handler (deletes are the most important events, and a name-less delete entry is unacceptable).
    • Map action → severity (info), category entity. Build a human message (e.g. "Target 'Desk' updated"). Read actor from the ContextVar.
    • Recording is best-effort (never break the entity operation).
  • Update entity delete handlers to pass entity_name into fire_entity_event (the entity object is already loaded for the 404 check). Cover the representative/most-used entities at minimum: output targets, sync clocks, devices, picture/audio/color-strip sources, automations, scene presets/playlists, templates, gradients. (Create/update can rely on hook resolution but pass the name where trivially available.)
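
The choke-point behaviour described above can be sketched as follows. This is a minimal illustration, not the project's actual code: FakeRecorder, current_actor, stores, and the exact fire_entity_event parameter list are hypothetical stand-ins for the Phase 2 recorder, the actor ContextVar, and the stores held in _deps.

```python
from __future__ import annotations

import logging
from contextvars import ContextVar
from dataclasses import dataclass, field

logger = logging.getLogger(__name__)

# Hypothetical stand-in for the ContextVar that carries the authenticated actor.
current_actor: ContextVar[str] = ContextVar("current_actor", default="system")


@dataclass
class FakeRecorder:
    """Hypothetical stand-in for the Phase 2 recorder."""

    records: list[dict] = field(default_factory=list)

    def record(self, **fields) -> None:
        self.records.append(fields)


recorder = FakeRecorder()
# Hypothetical store registry: {entity_type: {entity_id: entity_name}}.
stores: dict[str, dict[str, str]] = {"target": {"t1": "Desk"}}


def fire_entity_event(
    entity_type: str,
    action: str,
    entity_id: str,
    entity_name: str | None = None,
) -> None:
    # (The existing event dispatch would run here, unchanged.)
    try:
        if entity_name is None and action != "deleted":
            # created/updated: the entity is still in its store, so look it up.
            entity_name = stores.get(entity_type, {}).get(entity_id)
        label = entity_type.replace("_", " ").capitalize()
        shown = f"'{entity_name}'" if entity_name else entity_id
        recorder.record(
            category="entity",
            action=f"entity.{action}",
            severity="info",
            actor=current_actor.get(),
            entity_type=entity_type,
            entity_id=entity_id,
            entity_name=entity_name,
            message=f"{label} {shown} {action}",
        )
    except Exception:
        # Best-effort: a recorder failure must never break the entity operation.
        logger.exception("audit record failed for %s %s", entity_type, action)
```

A delete handler would call fire_entity_event("target", "deleted", "t1", entity_name="Desk") with the name taken from the object it already loaded for the 404 check, since the store lookup is skipped for deletes.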

Authentication (DESCOPED: no key create/rotate/revoke — those routes don't exist)

  • In api/auth.py, record:
    • auth failures: missing/invalid Bearer token (HTTP), rejected LAN-without-keys, rejected WS origin (4403), WS auth handshake failure (4401). Category auth, severity warning. Include the caller IP/label and the reason in metadata; never the attempted token.
    • WS session establishment (successful accept_and_authenticate_ws): category auth, severity info, actor = authenticated label.
    • (Do NOT record per-request HTTP auth success — too frequent.)
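
One way to keep the token out of the audit trail is to give the recording helper no token parameter at all. The sketch below also shows a per-client throttle of the kind mentioned in the commit summary. The names record_auth_failure, THROTTLE_SECONDS, and audit_log, and the 60-second window, are assumptions made for illustration only.

```python
from __future__ import annotations

import time

THROTTLE_SECONDS = 60.0  # hypothetical window; the real value is not stated here
_last_recorded: dict[str, float] = {}
audit_log: list[dict] = []  # stand-in for the Phase 2 recorder


def record_auth_failure(client: str, reason: str, now: float | None = None) -> bool:
    """Record an auth failure for `client`; return False if throttled.

    The signature deliberately has no token parameter, so the attempted
    token cannot reach any audit field by accident.
    """
    now = time.monotonic() if now is None else now
    last = _last_recorded.get(client)
    if last is not None and now - last < THROTTLE_SECONDS:
        return False  # this client already produced a record within the window
    _last_recorded[client] = now
    audit_log.append(
        {
            "category": "auth",
            "action": "auth.rejected",
            "severity": "warning",
            "actor": client,
            "metadata": {"reason": reason, "client": client},
        }
    )
    return True
```

Throttling per client bounds how many records a single source can generate while still recording the first failure from every new source.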

Device connect/disconnect (use existing discrete seams)

  • Hook device_health_changed (core/processing/device_health.py, fired only on online != prev_online) → record online/offline transition. Category device, severity info (online) / warning (offline).
  • Hook device_discovered / device_lost (core/devices/discovery_watcher.py, runs on the zeroconf thread → recorder must marshal to the loop, which Phase 2 handles). Category device.
  • ADB connect/disconnect (api/routes/system_settings.py:adb_connect/adb_disconnect).
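
The thread-to-loop marshaling that the discovery hooks rely on can be illustrated with asyncio's loop.call_soon_threadsafe. This is only a sketch of the general technique: LoopBoundRecorder is a hypothetical class, not the Phase 2 recorder, and the URL and device type are placeholder values.

```python
import asyncio
import threading


class LoopBoundRecorder:
    """Hypothetical recorder whose state is only mutated on the event loop."""

    def __init__(self, loop: asyncio.AbstractEventLoop) -> None:
        self._loop = loop
        self.records: list[dict] = []

    def _append(self, entry: dict) -> None:
        # Always runs on the event loop thread.
        self.records.append(entry)

    def record(self, **fields) -> None:
        # Safe to call from any thread: the append is scheduled on the loop.
        self._loop.call_soon_threadsafe(self._append, fields)


async def main() -> list[dict]:
    recorder = LoopBoundRecorder(asyncio.get_running_loop())

    def on_device_discovered() -> None:
        # Stands in for a callback running on the zeroconf thread.
        recorder.record(
            category="device",
            action="device.discovered",
            severity="info",
            actor="system",
            metadata={"url": "http://192.0.2.10", "device_type": "example"},
        )

    worker = threading.Thread(target=on_device_discovered)
    worker.start()
    worker.join()
    await asyncio.sleep(0)  # yield once so the loop runs the scheduled callback
    return recorder.records


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Because the hook only schedules work, the discovery thread never touches loop-owned state directly.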

Capture & system events (explicit record calls)

  • Target processing start/stop + bulk (api/routes/output_targets_control.py).
  • Scene activation (scene_presets.py:activate_scene_preset), playlist start/stop (scene_playlists.py), automation activate/deactivate (automation_engine.py).
  • System: backup create/restore/delete (backup.py), update apply/dismiss (update.py), restart/shutdown (backup.py), calibration start/stop/cancel (calibration.py).
  • Settings changes: scope to high-value settings only (auto-backup, update, shutdown action). Exclude the activity-log's own "activity_log" settings key to avoid self-referential churn.
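
The settings scoping rule can be expressed as a small allow-list. In this sketch the helper name settings_audit_fields and the record shape are hypothetical; the key names are the setting_key values listed in the handoff tables.

```python
from __future__ import annotations

# High-value settings only. "activity_log" is intentionally absent so that
# changing the activity log's own settings never produces an audit record.
AUDITED_SETTINGS = frozenset({"auto_backup", "update", "shutdown_action"})


def settings_audit_fields(setting_key: str) -> dict | None:
    """Return the audit fields for a settings change, or None if not audited."""
    if setting_key not in AUDITED_SETTINGS:
        return None
    return {
        "category": "system",
        "action": "settings.changed",
        "severity": "info",
        "metadata": {"setting_key": setting_key},
    }
```

An allow-list means newly added settings stay unaudited until someone opts them in, which matches the "high-value settings only" scope.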

Tests

  • server/tests/test_activity_instrumentation.py (or per-area):
    • representative entity create/update/delete produces a record with correct category/actor/name (incl. a delete carrying its name);
    • an auth failure produces a warning record and the token never appears in any field;
    • a device health transition and a discovery event produce records;
    • a capture start and a backup/restore produce records.
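
A secret-leak check of the kind listed above might walk every field of a record, including nested metadata, and assert that the token is absent. The sketch below is pytest-style but uses only plain asserts. The record literal is a hypothetical example of what an auth-failure handler would emit, and iter_strings is an illustrative helper, not part of the project.

```python
def iter_strings(value):
    """Yield every key and scalar value in a nested record as a string."""
    if isinstance(value, dict):
        for key, item in value.items():
            yield from iter_strings(key)
            yield from iter_strings(item)
    elif isinstance(value, (list, tuple, set)):
        for item in value:
            yield from iter_strings(item)
    elif value is not None:
        yield str(value)


def test_auth_failure_never_records_token():
    token = "s3cr3t-token-value"
    # Hypothetical record, shaped like the auth.rejected entries in this plan.
    record = {
        "category": "auth",
        "action": "auth.rejected",
        "severity": "warning",
        "actor": "192.0.2.7",
        "message": "Authentication rejected: invalid token",
        "metadata": {"reason": "invalid_token", "client": "192.0.2.7"},
    }
    assert record["severity"] == "warning"
    assert all(token not in text for text in iter_strings(record))
```

In the real suite the record would come from the recorder after driving the actual auth path with that token, rather than from a literal.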

Files to Modify/Create

  • server/src/ledgrab/api/dependencies.py — modify: fire_entity_event records + entity_name
  • entity delete route handlers under api/routes/ — modify: pass entity_name
  • server/src/ledgrab/api/auth.py — modify: auth-failure + WS-session records
  • server/src/ledgrab/core/processing/device_health.py — modify: online/offline record
  • server/src/ledgrab/core/devices/discovery_watcher.py — modify: discovered/lost record
  • server/src/ledgrab/api/routes/system_settings.py — modify: ADB + settings records
  • server/src/ledgrab/api/routes/output_targets_control.py — modify: start/stop records
  • server/src/ledgrab/api/routes/{scene_presets,scene_playlists,backup,update,calibration}.py — modify
  • server/src/ledgrab/core/automations/automation_engine.py — modify: activate/deactivate records
  • server/tests/test_activity_instrumentation.py — new

Acceptance Criteria

  • All four categories emit records at the named sites; entity deletes carry the entity name.
  • API-key tokens / secrets never appear in any audit field (test-enforced).
  • Recording never breaks the audited action (best-effort; failures swallowed + logged).
  • Actor is the authenticated label for request-originated events, "system" for engine/thread events.
  • New + existing tests green; ruff clean.

Notes

  • Get the recorder via the Phase 2 DI getter; for engine/thread sites that lack DI, use the module singleton/accessor Phase 2 exposes.
  • Keep messages human-readable and locale-agnostic (English source strings; the frontend renders structured fields and does not translate server messages, so message is only a fallback/summary).
  • This is the security-sensitive phase — the security reviewer runs here AND at final review.

Review Checklist

  • All tasks completed
  • Code follows project conventions
  • No unintended side effects (audited actions still succeed on recorder failure)
  • No secrets logged (token never recorded) — explicitly verified
  • Build passes (ruff + pytest)
  • Tests pass (new + existing)

Handoff to Next Phase

Phase 3 is complete. The following (category, action) pairs are now emitted, along with their metadata keys, for Phase 4 to expose via query/filter and for Phase 5 to use in quick-filter presets.

entity category

| Action | Severity | Metadata keys | Notes |
|---|---|---|---|
| entity.created | info | | All entity types via fire_entity_event choke-point |
| entity.updated | info | | All entity types; name resolved from store when not passed |
| entity.deleted | info | | Name passed explicitly by delete handler before deletion |

auth category

| Action | Severity | Metadata keys | Notes |
|---|---|---|---|
| auth.rejected | warning | reason (str), client (str/IP) | Missing Bearer, invalid Bearer, LAN-no-keys, WS origin, WS auth timeout, invalid WS token |
| auth.ws_connected | info | client (str/IP) | Successful WS session established |

device category

| Action | Severity | Metadata keys | Notes |
|---|---|---|---|
| device.online | info | latency_ms (float) | Health monitor, transition only |
| device.offline | warning | latency_ms (float) | Health monitor, transition only |
| device.discovered | info | url (str), device_type (str) | Zeroconf discovery thread; recorder marshals to loop |
| device.lost | warning | url (str), device_type (str) | Zeroconf discovery thread |
| device.adb_connected | info | address (str) | ADB route success |
| device.adb_disconnected | info | address (str) | ADB route success |

capture category

| Action | Severity | Metadata keys | Notes |
|---|---|---|---|
| capture.started | info | | Per target (individual + bulk) |
| capture.stopped | info | | Per target (individual + bulk) |
| scene.activated | info | | scene_presets.py:activate_scene_preset |
| playlist.started | info | | scene_playlists.py:start_scene_playlist |
| playlist.stopped | info | | scene_playlists.py:stop_scene_playlist |
| automation.activated | info | | automation_engine.py:_activate_automation; actor="system" |
| automation.deactivated | info | | automation_engine.py:_deactivate_automation; actor="system" |

system category

| Action | Severity | Metadata keys | Notes |
|---|---|---|---|
| backup.created | info | filename (str) | backup.py:backup_config |
| backup.restored | info | | backup.py:restore_config |
| backup.deleted | info | filename (str) | backup.py:delete_saved_backup |
| server.restarting | info | | backup.py:restart_server |
| server.shutdown_requested | info | | backup.py:shutdown_server |
| update.dismissed | info | version (str) | update.py:dismiss_update |
| update.applied | info | version (str) | update.py:apply_update |
| settings.changed | info | setting_key (str) + setting-specific keys | setting_key values: "auto_backup", "update", "shutdown_action". Activity-log own key excluded. |
| calibration.started | info | | calibration.py; entity_type="device", entity_id=device_id |
| calibration.stopped | info | | calibration.py |
| calibration.cancelled | info | | calibration.py |

Implementation notes for Phase 4

  • The metadata field is a JSON TEXT column. All keys above are scalars (str, float).
  • A Phase 4 metadata_key / metadata_value filter, if added, can target setting_key to filter settings changes.
  • entity_type is populated for entity CRUD and calibration.started. For auth/system/capture events entity_type may be None.
  • entity_name is always populated for entity.deleted; populated for CRUD create/update when resolved; populated for most capture/system events where a name is meaningful.
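
As a rough illustration of a metadata_key / metadata_value lookup over a JSON TEXT column, the sketch below decodes each row in Python. The table name, the column names other than metadata, the nullable metadata column, and the find_by_metadata helper are all assumptions. A real Phase 4 query might instead push the filter into SQL, for example with SQLite's json_extract function.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema; only "metadata stored as JSON TEXT" comes from the plan.
conn.execute("CREATE TABLE activity_log (action TEXT, metadata TEXT)")
conn.executemany(
    "INSERT INTO activity_log VALUES (?, ?)",
    [
        ("settings.changed", json.dumps({"setting_key": "auto_backup"})),
        ("settings.changed", json.dumps({"setting_key": "update"})),
        ("backup.created", json.dumps({"filename": "example.json"})),
        ("capture.started", None),  # events without metadata (assumed nullable)
    ],
)


def find_by_metadata(key: str, value: str) -> list[str]:
    """Return the actions whose metadata has `key` equal to `value`."""
    matches = []
    for action, raw in conn.execute("SELECT action, metadata FROM activity_log"):
        metadata = json.loads(raw) if raw else {}
        if metadata.get(key) == value:
            matches.append(action)
    return matches
```

Calling find_by_metadata("setting_key", "auto_backup") selects only the matching settings change. Because all metadata keys are scalars, a plain equality comparison is enough.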