Phase 2: Recorder, actor context, retention, lifecycle

Status: Not Started
Parent plan: PLAN.md
Domain: backend

Objective

Build the runtime layer over the Phase 1 repository: a thread-safe ActivityRecorder facade that persists an entry AND pushes a live activity_logged event; an actor ContextVar populated by the auth layer; a background ActivityLogRetentionEngine mirroring AutoBackupEngine; and the main.py/dependencies.py wiring (init, DI getter, retention start/stop, shutdown ordering). After this phase no audited action is recorded yet, because the instrumentation call sites arrive in Phase 3, but the full machinery is live and unit-tested.

Tasks

  • Create server/src/ledgrab/core/activity_log/__init__.py and server/src/ledgrab/core/activity_log/recorder.py:
    • ActivityRecorder(repo: ActivityLogRepository, processor_manager, *, loop=None).
    • record(category, action, *, severity="info", actor=None, entity_type=None, entity_id=None, entity_name=None, message, metadata=None) -> None:
      • resolve actor from the actor ContextVar when not supplied, default "system";
      • build an ActivityLogEntry (id al_<uuid8>, ts=datetime.now(timezone.utc));
      • thread-safe write: if called on the event loop thread, write inline via repo.record(entry), then fire the live event; if called from another thread (such as zeroconf discovery), marshal the whole write+emit onto the loop via loop.call_soon_threadsafe(...). Capture the loop lazily (mirror ensure_loop/call_soon_threadsafe in utils/log_broadcaster.py). Never raise into the caller: audit recording is best-effort and must not break the audited action, so catch failures and log them at WARNING level.
      • live push: processor_manager.fire_event({"type": "activity_logged", "entry": entry_as_dict}).
    • Provide a tiny helper to serialize an entry to the same dict shape the API returns (reuse in Phase 4 / frontend).
    • enabled flag honored: when retention settings say enabled=false, record() is a no-op — EXCEPT the "audit log disabled" event itself, which must be recorded before the flag takes effect (see retention engine).
  • Actor ContextVar:
    • Add current_actor: ContextVar[str] (module-level, e.g. in core/activity_log/context.py or api/auth.py). In verify_api_key (api/auth.py), set it next to the existing request.state.auth_label = ... (both the authenticated label and the "anonymous" branch). Default "system" when unset. Ensure no cross-request leakage (set on every auth evaluation).
  • Create server/src/ledgrab/core/activity_log/retention.py:
    • ActivityLogRetentionEngine(repo, db, recorder), mirroring core/backup/auto_backup.py:
      • _load_settings()/_save_settings() via db.get_setting("activity_log") / db.set_setting("activity_log", {...}); DEFAULT_SETTINGS = {"enabled": True, "max_days": 90, "max_entries": 20000}.
      • async start() spawns _retention_loop() via asyncio.create_task; the loop sleeps a sane interval (e.g. hourly), then calls repo.prune(before_ts=now - timedelta(days=max_days), max_entries=...).
      • async stop() cancels the task and awaits it.
      • get_settings() / async update_settings(...) persist and apply changes; a change to enabled is logged via the recorder BEFORE disabling takes effect.
  • Wiring:
    • main.py: instantiate activity_log_repo = ActivityLogRepository(db) (module level near other stores); in lifespan startup build activity_recorder + activity_log_retention_engine, pass to init_dependencies(...), and await activity_log_retention_engine.start().
    • In lifespan shutdown: record a system / server_shutting_down event via the recorder as the first shutdown action (before engines/db close), then await _bounded("activity_log_retention.stop", activity_log_retention_engine.stop(), timeout=0.5).
    • api/dependencies.py: add activity_recorder + activity_log_repo + activity_log_retention_engine to _deps, parameters to init_dependencies, and getters get_activity_recorder(), get_activity_log_repo(), get_activity_log_retention_engine().
  • Realtime allowlist (order matters — do allowlist FIRST so the parity test stays green):
    • Add 'activity_logged' to _ALLOWED_SERVER_EVENT_TYPES in server/src/ledgrab/static/js/core/events-ws.ts (+ a one-line comment naming the source).
    • Confirm tests/test_events_ws_parity.py passes with the new emit type.
  • Unit tests server/tests/core/test_activity_recorder.py + test_activity_log_retention.py:
    • recorder persists an entry AND calls fire_event with type=="activity_logged";
    • actor resolves from ContextVar; defaults to "system"; failure in repo doesn't raise;
    • cross-thread record() (call from a threading.Thread) routes through the loop and persists;
    • retention prunes per settings; settings round-trip via db; disabling logs the disable event.
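
The recorder task above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the ActivityLogEntry field set is inferred from the record(...) signature, entry_to_dict is a hypothetical name for the serialization helper, and the enabled flag is left out. The repo and processor_manager objects only need record(entry) and fire_event(dict) respectively.

```python
import asyncio
import logging
import threading
import uuid
from contextvars import ContextVar
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional

logger = logging.getLogger("activity_log")

# Stand-in for the actor ContextVar; the real one is set by the auth layer.
current_actor: ContextVar[str] = ContextVar("current_actor", default="system")


@dataclass
class ActivityLogEntry:
    """Assumed field set; the real model is defined in Phase 1."""
    id: str
    ts: datetime
    category: str
    action: str
    severity: str
    actor: str
    message: str
    entity_type: Optional[str] = None
    entity_id: Optional[str] = None
    entity_name: Optional[str] = None
    metadata: dict = field(default_factory=dict)


def entry_to_dict(entry: ActivityLogEntry) -> dict:
    """Hypothetical helper: JSON-friendly dict with an ISO 8601 timestamp."""
    data = asdict(entry)
    data["ts"] = entry.ts.isoformat()
    return data


class ActivityRecorder:
    def __init__(self, repo, processor_manager, *, loop=None):
        self._repo = repo
        self._pm = processor_manager
        self._loop = loop

    def record(self, category, action, *, severity="info", actor=None,
               entity_type=None, entity_id=None, entity_name=None,
               message, metadata=None) -> None:
        try:
            entry = ActivityLogEntry(
                id=f"al_{uuid.uuid4().hex[:8]}",
                ts=datetime.now(timezone.utc),
                category=category, action=action, severity=severity,
                # Resolved in the caller's context, before any thread hop.
                actor=actor or current_actor.get(),
                message=message, entity_type=entity_type,
                entity_id=entity_id, entity_name=entity_name,
                metadata=dict(metadata or {}),
            )
            try:
                running = asyncio.get_running_loop()
            except RuntimeError:
                running = None  # caller is not on an event loop thread
            if self._loop is None:
                self._loop = running  # lazy capture on the first on-loop call
            if running is not None and running is self._loop:
                self._write_and_emit(entry)
            elif self._loop is not None:
                self._loop.call_soon_threadsafe(self._write_and_emit, entry)
            else:
                logger.warning("activity log: no loop captured, dropped %s", action)
        except Exception:
            logger.warning("activity log: record() failed", exc_info=True)

    def _write_and_emit(self, entry: ActivityLogEntry) -> None:
        # Runs on the loop thread; failures are logged, never propagated.
        try:
            self._repo.record(entry)
            self._pm.fire_event({"type": "activity_logged", "entry": entry_to_dict(entry)})
        except Exception:
            logger.warning("activity log: write failed", exc_info=True)
```

In this sketch a call that arrives from a worker thread before any on-loop call has captured the loop is dropped with a warning; passing loop= at construction time during lifespan startup would avoid that window.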

Files to Modify/Create

  • server/src/ledgrab/core/activity_log/__init__.py — new
  • server/src/ledgrab/core/activity_log/recorder.py — new
  • server/src/ledgrab/core/activity_log/context.py — new (actor ContextVar; alternatively place it in api/auth.py)
  • server/src/ledgrab/core/activity_log/retention.py — new
  • server/src/ledgrab/api/auth.py — modify: set actor ContextVar in verify_api_key
  • server/src/ledgrab/main.py — modify: instantiate, wire lifespan start/shutdown
  • server/src/ledgrab/api/dependencies.py — modify: _deps, init_dependencies, getters
  • server/src/ledgrab/static/js/core/events-ws.ts — modify: allowlist activity_logged
  • server/tests/core/test_activity_recorder.py — new
  • server/tests/core/test_activity_log_retention.py — new

Acceptance Criteria

  • Recorder persists + fires activity_logged; never raises into callers; thread-safe from non-loop threads.
  • Actor ContextVar populated by auth; default "system"; no cross-request leakage.
  • Retention engine starts/stops cleanly in lifespan; prunes by age + count; settings persist.
  • server_shutting_down is recorded before teardown; no entries are lost on graceful shutdown.
  • test_events_ws_parity.py green (allowlist updated). Existing tests still green; ruff clean.
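
The "no cross-request leakage" criterion relies on how contextvars interacts with asyncio: each Task runs in its own copy of the context, so a value set while handling one request is not visible to another. The sketch below illustrates that behaviour in isolation. set_actor_from_auth is a hypothetical stand-in for the assignment inside verify_api_key, and the two coroutines stand in for concurrent requests.

```python
import asyncio
from contextvars import ContextVar

current_actor: ContextVar[str] = ContextVar("current_actor", default="system")


def set_actor_from_auth(label):
    """Hypothetical stand-in for the assignment inside verify_api_key."""
    # Set on every evaluation, including the anonymous branch.
    current_actor.set(label if label else "anonymous")


async def handle_request(label, seen):
    set_actor_from_auth(label)
    await asyncio.sleep(0)  # yield so both requests interleave
    seen.append(current_actor.get())


async def main():
    seen = []
    # gather() wraps each coroutine in a Task, and each Task runs in a
    # copy of the current context, so the set() calls stay isolated.
    await asyncio.gather(
        handle_request("admin-key", seen),
        handle_request(None, seen),
    )
    return seen, current_actor.get()


seen, outer = asyncio.run(main())
print(sorted(seen), outer)
```

Each simulated request sees its own actor, and the outer context still reads the "system" default. Whether a value set inside verify_api_key is visible to the route handler depends on how the framework runs that dependency (same task versus a worker thread), so that path is worth covering with a test.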

Notes

  • Reference: core/backup/auto_backup.py (engine shape, settings persistence, _bounded shutdown in main.py), utils/log_broadcaster.py (ensure_loop, call_soon_threadsafe thread marshaling), core/processing/processor_manager.py:247 (fire_event).
  • Do not add any instrumentation call sites in this phase — only the machinery. Phase 3 adds the record(...) calls. (Intermediate commit emits nothing; that is fine and green.)
  • Freeze the ActivityLogEntry dict shape here — Phase 4 (API response) and Phase 5 (frontend entry) consume it.
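
For orientation, the engine shape referenced in these notes might look roughly like the sketch below. It is written from the task description only, not from core/backup/auto_backup.py, so details will differ. Assumptions: db.get_setting returns None when nothing is stored, the interval_s parameter and prune_once method are hypothetical conveniences for testing, and the "system" / "activity_log_disabled" event names are placeholders.

```python
import asyncio
import contextlib
import logging
from datetime import datetime, timedelta, timezone

logger = logging.getLogger("activity_log.retention")

DEFAULT_SETTINGS = {"enabled": True, "max_days": 90, "max_entries": 20000}


class ActivityLogRetentionEngine:
    def __init__(self, repo, db, recorder, *, interval_s=3600.0):
        self._repo = repo
        self._db = db
        self._recorder = recorder
        self._interval_s = interval_s  # hypothetical knob; hourly by default
        self._task = None
        # Stored values override defaults (assumes None when unset).
        self._settings = {**DEFAULT_SETTINGS, **(db.get_setting("activity_log") or {})}

    def get_settings(self) -> dict:
        return dict(self._settings)

    async def update_settings(self, **changes) -> None:
        if changes.get("enabled") is False and self._settings["enabled"]:
            # Record the disable event BEFORE the flag takes effect.
            self._recorder.record(
                "system", "activity_log_disabled", message="Audit log disabled"
            )
        self._settings.update(changes)
        self._db.set_setting("activity_log", dict(self._settings))

    def prune_once(self) -> None:
        cutoff = datetime.now(timezone.utc) - timedelta(days=self._settings["max_days"])
        self._repo.prune(before_ts=cutoff, max_entries=self._settings["max_entries"])

    async def start(self) -> None:
        if self._task is None:
            self._task = asyncio.create_task(self._retention_loop())

    async def stop(self) -> None:
        if self._task is not None:
            self._task.cancel()
            with contextlib.suppress(asyncio.CancelledError):
                await self._task
            self._task = None

    async def _retention_loop(self) -> None:
        while True:
            await asyncio.sleep(self._interval_s)
            try:
                self.prune_once()
            except Exception:
                # A failed prune should not kill the background task.
                logger.warning("activity log prune failed", exc_info=True)
```

Because stop() only cancels a sleeping task and awaits it, it should return almost immediately, which fits the short bounded timeout planned for shutdown.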

Review Checklist

  • All tasks completed
  • Code follows project conventions (engine/DI patterns)
  • No unintended side effects (no call sites yet; lifespan order correct)
  • Build passes (ruff + pytest, incl. parity test)
  • Tests pass (new + existing)

Handoff to Next Phase