ledgrab/plans/activity-log/phase-2-recorder-retention.md
alexei.dolgolyov 726f39e2ba feat(activity-log): phase 2 - recorder, actor context, retention, lifecycle
- ActivityRecorder: thread-safe record() (inline on loop, call_soon_threadsafe off-loop), best-effort, fires activity_logged event
- current_actor ContextVar set in verify_api_key (both branches), default system
- ActivityLogRetentionEngine: prune loop (max_days+max_entries), settings persistence, rehydrates recorder.enabled on startup
- lifespan wiring: server.shutting_down recorded first on shutdown, retention stop before db.close
- events-ws.ts allowlist + parity; DI getters + module accessor; 62 new tests
2026-06-09 18:10:27 +03:00


# Phase 2: Recorder, actor context, retention, lifecycle
**Status:** ✅ Done
**Parent plan:** [PLAN.md](./PLAN.md)
**Domain:** backend
## Objective
Build the runtime layer over the Phase 1 repository: a thread-safe `ActivityRecorder` facade
that persists an entry AND pushes a live `activity_logged` event; an actor `ContextVar`
populated by the auth layer; a background `ActivityLogRetentionEngine` mirroring
`AutoBackupEngine`; and the `main.py`/`dependencies.py` wiring (init, DI getter, retention
start/stop, shutdown ordering). After this phase the audit log has no instrumentation call
sites yet (Phase 3 adds them); only the lifecycle events wired here, such as the shutdown
record, are written. The full machinery is nevertheless live and unit-tested.
## Tasks
- [x] Create `server/src/ledgrab/core/activity_log/__init__.py` and
  `server/src/ledgrab/core/activity_log/recorder.py`:
  - `ActivityRecorder(repo: ActivityLogRepository, processor_manager, *, loop=None)`.
  - `record(category, action, *, severity="info", actor=None, entity_type=None, entity_id=None, entity_name=None, message, metadata=None) -> None`:
    - resolve `actor` from the actor `ContextVar` when not supplied, default `"system"`;
    - build an `ActivityLogEntry` (id `al_<uuid8>`, `ts=datetime.now(timezone.utc)`);
    - **thread-safe write:** if called on the event loop thread, write inline via
      `repo.record(entry)` then fire the live event; if called from another thread (zeroconf
      discovery), marshal the whole write+emit onto the loop via
      `loop.call_soon_threadsafe(...)`. Capture the loop lazily (mirror
      `utils/log_broadcaster.py:ensure_loop`/`call_soon_threadsafe`). Never raise into the
      caller: audit recording is best-effort and must not break the audited action; log
      failures at `warning`.
    - live push: `processor_manager.fire_event({"type": "activity_logged", "entry": entry_as_dict})`.
  - Provide a tiny helper to serialize an entry to the same dict shape the API returns
    (reuse in Phase 4 / frontend).
  - `enabled` flag honored: when retention settings say `enabled=false`, `record()` is a
    no-op, EXCEPT for the "audit log disabled" event itself, which must be recorded before the
    flag takes effect (see retention engine).
- [x] Actor `ContextVar`:
  - Add `current_actor: ContextVar[str]` (module-level, e.g. in `core/activity_log/context.py`
    or `api/auth.py`). In `verify_api_key` (`api/auth.py`), set it next to the existing
    `request.state.auth_label = ...` (both the authenticated label and the `"anonymous"`
    branch). Default `"system"` when unset. Ensure no cross-request leakage (set on every
    auth evaluation).
- [x] Create `server/src/ledgrab/core/activity_log/retention.py`:
  - `ActivityLogRetentionEngine(repo, db, recorder)` mirroring `core/backup/auto_backup.py`:
    - `_load_settings()`/`_save_settings()` via `db.get_setting("activity_log")` /
      `db.set_setting("activity_log", {...})`, with
      `DEFAULT_SETTINGS = {"enabled": True, "max_days": 90, "max_entries": 20000}`.
    - `async start()` → spawn `_retention_loop()` (`asyncio.create_task`); the loop sleeps a sane
      interval (e.g. hourly) then calls `repo.prune(before_ts=now-max_days, max_entries=...)`.
    - `async stop()` → cancel + await task.
    - `get_settings()` / `async update_settings(...)` that persist and apply (changing
      `enabled` is logged via the recorder BEFORE disabling).
- [x] Wiring:
  - `main.py`: instantiate `activity_log_repo = ActivityLogRepository(db)` (module level near
    other stores); in `lifespan` startup build `activity_recorder` + `activity_log_retention_engine`,
    pass to `init_dependencies(...)`, and `await activity_log_retention_engine.start()`.
  - In `lifespan` **shutdown**: record a `system` / `server_shutting_down` event via the
    recorder as the **first** shutdown action (before engines/db close), then
    `await _bounded("activity_log_retention.stop", activity_log_retention_engine.stop(), timeout=0.5)`.
  - `api/dependencies.py`: add `activity_recorder` + `activity_log_repo` +
    `activity_log_retention_engine` to `_deps`, parameters to `init_dependencies`, and
    getters `get_activity_recorder()`, `get_activity_log_repo()`,
    `get_activity_log_retention_engine()`.
- [x] Realtime allowlist (order matters: do the allowlist FIRST so the parity test stays green):
  - Add `'activity_logged'` to `_ALLOWED_SERVER_EVENT_TYPES` in
    `server/src/ledgrab/static/js/core/events-ws.ts` (+ a one-line comment naming the source).
  - Confirm `tests/test_events_ws_parity.py` passes with the new emit type.
- [x] Unit tests `server/tests/core/test_activity_recorder.py` +
  `test_activity_log_retention.py`:
  - recorder persists an entry AND calls `fire_event` with `type=="activity_logged"`;
  - actor resolves from ContextVar; defaults to `"system"`; failure in repo doesn't raise;
  - cross-thread `record()` (call from a `threading.Thread`) routes through the loop and persists;
  - retention prunes per settings; settings round-trip via db; disabling logs the disable event.
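
The retention task above (age cutoff plus count cap, with a cancellable background loop) can be sketched as follows. This is a minimal illustration, not the project's engine: `InMemoryRepo`, `RetentionSketch`, `prune_once`, and `run_briefly` are hypothetical names, the repository is replaced by an in-memory list, and settings persistence and the recorder hook are omitted. Only the `prune(before_ts=..., max_entries=...)` call shape and the `start()`/`stop()` lifecycle come from the plan.

```python
import asyncio
from datetime import datetime, timedelta, timezone


class InMemoryRepo:
    """Hypothetical stand-in for ActivityLogRepository: a list of (ts, id) pairs."""

    def __init__(self, entries):
        self.entries = sorted(entries)  # oldest first

    def prune(self, *, before_ts, max_entries):
        # Drop entries older than the cutoff, then keep only the newest max_entries.
        kept = [e for e in self.entries if e[0] >= before_ts]
        kept = kept[max(0, len(kept) - max_entries):]
        removed = len(self.entries) - len(kept)
        self.entries = kept
        return removed


class RetentionSketch:
    """Assumed engine shape: periodic prune loop with explicit start/stop."""

    def __init__(self, repo, settings, interval_s=3600.0):
        self._repo = repo
        self._settings = dict(settings)
        self._interval_s = interval_s  # "hourly" is the plan's example interval
        self._task = None

    def prune_once(self, now=None):
        now = now or datetime.now(timezone.utc)
        cutoff = now - timedelta(days=self._settings["max_days"])
        return self._repo.prune(
            before_ts=cutoff, max_entries=self._settings["max_entries"]
        )

    async def _retention_loop(self):
        while True:
            await asyncio.sleep(self._interval_s)
            try:
                self.prune_once()
            except Exception:
                pass  # a real engine would log the failure and keep looping

    async def start(self):
        if self._task is None:
            self._task = asyncio.create_task(self._retention_loop())

    async def stop(self):
        if self._task is not None:
            self._task.cancel()
            try:
                await self._task
            except asyncio.CancelledError:
                pass
            self._task = None


async def run_briefly(engine):
    # Usage demo: start the loop, let it reach its first sleep, then stop it.
    await engine.start()
    await asyncio.sleep(0)
    await engine.stop()
    return engine._task
```

Keeping `prune_once` separate from the loop makes the age and count rules testable without waiting on the sleep interval.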
## Files to Modify/Create
- `server/src/ledgrab/core/activity_log/__init__.py` — new
- `server/src/ledgrab/core/activity_log/recorder.py` — new
- `server/src/ledgrab/core/activity_log/context.py` — new (actor ContextVar) *(or place in auth.py)*
- `server/src/ledgrab/core/activity_log/retention.py` — new
- `server/src/ledgrab/api/auth.py` — modify: set actor ContextVar in `verify_api_key`
- `server/src/ledgrab/main.py` — modify: instantiate, wire lifespan start/shutdown
- `server/src/ledgrab/api/dependencies.py` — modify: `_deps`, `init_dependencies`, getters
- `server/src/ledgrab/static/js/core/events-ws.ts` — modify: allowlist `activity_logged`
- `server/tests/core/test_activity_recorder.py` — new
- `server/tests/core/test_activity_log_retention.py` — new
## Acceptance Criteria
- Recorder persists + fires `activity_logged`; never raises into callers; thread-safe from
non-loop threads.
- Actor ContextVar populated by auth; default `"system"`; no cross-request leakage.
- Retention engine starts/stops cleanly in lifespan; prunes by age + count; settings persist.
- `server_shutting_down` is recorded before teardown; no lost-on-graceful-shutdown entries.
- `test_events_ws_parity.py` green (allowlist updated). Existing tests still green; `ruff` clean.
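
The shutdown-ordering criterion can be illustrated with a small sketch. The plan references a `_bounded(name, coro, timeout=...)` helper in `main.py` but does not show its body; `bounded_sketch` below is a hypothetical stand-in that assumes it wraps `asyncio.wait_for` and swallows timeouts. `shutdown_sketch` and the `order` list exist only to make the sequence observable.

```python
import asyncio
import logging

logger = logging.getLogger("shutdown_sketch")


async def bounded_sketch(name, coro, *, timeout):
    """Await coro for at most `timeout` seconds; report success instead of raising."""
    try:
        await asyncio.wait_for(coro, timeout=timeout)
        return True
    except asyncio.TimeoutError:
        logger.warning("%s did not finish within %.2fs", name, timeout)
        return False


async def shutdown_sketch(order):
    # 1. Record the shutdown event first, while the database is still open.
    order.append("record:server_shutting_down")

    async def stop_retention():
        order.append("retention.stop")

    async def hung_engine():
        await asyncio.sleep(10)  # simulates a stop() that never returns in time

    # 2. Stop engines with a bounded wait so a hung task cannot block teardown.
    ok = await bounded_sketch(
        "activity_log_retention.stop", stop_retention(), timeout=0.5
    )
    hung = await bounded_sketch("hung.stop", hung_engine(), timeout=0.05)

    # 3. Close the database last.
    order.append("db.close")
    return ok, hung
```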
## Notes
- Reference: `core/backup/auto_backup.py` (engine shape, settings persistence, `_bounded`
shutdown in `main.py`), `utils/log_broadcaster.py` (`ensure_loop`, `call_soon_threadsafe`
thread marshaling), `core/processing/processor_manager.py:247` (`fire_event`).
- **Do not add any instrumentation call sites in this phase** — only the machinery. Phase 3
adds the `record(...)` calls. (Intermediate commit emits nothing; that is fine and green.)
- Freeze the `ActivityLogEntry` dict shape here — Phase 4 (API response) and Phase 5
(frontend `entry`) consume it.
## Review Checklist
- [x] All tasks completed
- [x] Code follows project conventions (engine/DI patterns)
- [x] No unintended side effects (no call sites yet; lifespan order correct)
- [x] Build passes (ruff + pytest, incl. parity test)
- [x] Tests pass (new + existing)
## Handoff to Next Phase
### recorder.record(...) — final signature
```python
def record(
    self,
    category: str,                    # ActivityCategory constant
    action: str,                      # verb-object label
    *,
    severity: str = "info",           # ActivitySeverity constant
    actor: str | None = None,         # resolved from current_actor ContextVar when None
    entity_type: str | None = None,
    entity_id: str | None = None,
    entity_name: str | None = None,
    message: str,
    metadata: dict | None = None,
    _bypass_enabled: bool = False,    # internal: used by retention engine only
) -> None: ...
```
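
For readers who want to see the thread-marshaling behaviour behind this signature, here is a reduced sketch. It is not the shipped `ActivityRecorder`: `RecorderSketch`, its `write`/`emit` callables (stand-ins for `repo.record` and `processor_manager.fire_event`), and `demo` are hypothetical, and entries are plain dicts. It shows only the inline-on-loop versus `call_soon_threadsafe` split and the best-effort error handling described in the tasks.

```python
import asyncio
import logging
import threading

logger = logging.getLogger("activity_log_sketch")


class RecorderSketch:
    """Hypothetical reduction of the recorder: only the thread-marshaling path."""

    def __init__(self, write, emit):
        self._write = write  # stand-in for repo.record(entry)
        self._emit = emit    # stand-in for processor_manager.fire_event(event)
        self._loop = None    # captured lazily on first on-loop call

    def _write_and_emit(self, entry):
        try:
            self._write(entry)
            self._emit({"type": "activity_logged", "entry": entry})
        except Exception as exc:  # best-effort: never raise into the caller
            logger.warning("activity record failed: %s", exc)

    def record(self, entry):
        try:
            try:
                running = asyncio.get_running_loop()
            except RuntimeError:
                running = None  # no loop is running in this thread
            if running is not None:
                self._loop = running
                self._write_and_emit(entry)  # on the loop thread: write inline
            elif self._loop is not None:
                # Off-loop thread: marshal the whole write+emit onto the loop.
                self._loop.call_soon_threadsafe(self._write_and_emit, entry)
            else:
                # Assumption for this sketch: with no loop known yet, write inline.
                self._write_and_emit(entry)
        except Exception as exc:
            logger.warning("activity record failed: %s", exc)


def failing_write(entry):
    raise RuntimeError("db down")


async def demo():
    written, events = [], []
    rec = RecorderSketch(written.append, events.append)
    rec.record({"id": "al_loop"})  # called on the loop thread
    worker = threading.Thread(target=rec.record, args=({"id": "al_thread"},))
    worker.start()
    worker.join()              # record() only schedules, so this returns quickly
    await asyncio.sleep(0.01)  # give the loop a chance to run the scheduled callback
    RecorderSketch(failing_write, events.append).record({"id": "al_fail"})
    return written, events
```

The failing recorder at the end of `demo` logs a warning and returns normally, which is the property that lets call sites skip `try`/`except`.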
### Actor ContextVar import path
```python
from ledgrab.core.activity_log.context import current_actor
```
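
The "no cross-request leakage" property relies on standard `contextvars` behaviour: each asyncio task runs in its own copy of the context. The sketch below demonstrates that with a locally defined variable. `verify_api_key_sketch` and `handle_request` are hypothetical reductions (the real `verify_api_key` takes a request), and the `"admin-key"` label is made up for the demo.

```python
import asyncio
from contextvars import ContextVar

# Same name and default as the plan describes, redefined locally for the demo.
current_actor: ContextVar[str] = ContextVar("current_actor", default="system")


def verify_api_key_sketch(label):
    # Set the actor on every auth evaluation, including the anonymous branch.
    current_actor.set(label if label is not None else "anonymous")


async def handle_request(label):
    verify_api_key_sketch(label)
    await asyncio.sleep(0)  # yield so the other "request" interleaves
    return current_actor.get()


async def demo():
    # gather() wraps each coroutine in a task, and each task copies the context.
    results = await asyncio.gather(
        handle_request("admin-key"), handle_request(None)
    )
    # Outside any request the variable was never set, so the default applies.
    return results, current_actor.get()
```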
### Module accessor (for non-DI sites)
```python
from ledgrab.core.activity_log.recorder import get_module_recorder, set_module_recorder
recorder = get_module_recorder() # returns ActivityRecorder | None
```
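
Because the accessor may return `None` (for example before startup wiring has run), non-DI call sites need a guard. A minimal sketch of that pattern follows, assuming the accessor is backed by a module-level variable. `FakeRecorder`, `fire_entity_event_sketch`, the `"entity"` category, and the `"device"` values are all hypothetical.

```python
_module_recorder = None  # assumed backing store for the accessor pair


def set_module_recorder(recorder):
    global _module_recorder
    _module_recorder = recorder


def get_module_recorder():
    return _module_recorder


class FakeRecorder:
    """Test double that captures record() calls instead of persisting them."""

    def __init__(self):
        self.calls = []

    def record(self, category, action, *, message, **fields):
        self.calls.append(
            {"category": category, "action": action, "message": message, **fields}
        )


def fire_entity_event_sketch(entity_type, entity_id, action):
    recorder = get_module_recorder()
    if recorder is None:
        return False  # not wired yet: skip auditing, never fail the caller
    recorder.record(
        "entity",  # hypothetical category value
        action,
        entity_type=entity_type,
        entity_id=entity_id,
        message=f"{entity_type} {entity_id}: {action}",
    )
    return True
```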
### entry_to_dict helper (for API response serialisation)
```python
from ledgrab.core.activity_log.recorder import entry_to_dict
d = entry_to_dict(entry) # returns dict with 11 keys
```
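
A sketch of what such a serialiser might look like, assuming the entry is a dataclass whose fields match the 11 frozen keys. `EntrySketch` and `entry_to_dict_sketch` are hypothetical; the real `ActivityLogEntry` is defined in Phase 1 and is not shown in this document.

```python
import json
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class EntrySketch:
    """Hypothetical entry model; field names follow the frozen payload keys."""

    category: str
    action: str
    message: str
    severity: str = "info"
    actor: str = "system"
    entity_type: Optional[str] = None
    entity_id: Optional[str] = None
    entity_name: Optional[str] = None
    metadata: dict = field(default_factory=dict)
    id: str = field(default_factory=lambda: f"al_{uuid.uuid4().hex[:8]}")
    ts: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def entry_to_dict_sketch(entry):
    return {
        "id": entry.id,
        "ts": entry.ts.isoformat(),        # ISO-8601 string with a UTC offset
        "category": entry.category,
        "action": entry.action,
        "severity": entry.severity,
        "actor": entry.actor,
        "entity_type": entry.entity_type,
        "entity_id": entry.entity_id,
        "entity_name": entry.entity_name,
        "message": entry.message,
        "metadata": dict(entry.metadata),  # a real dict, not a JSON string
    }


def to_json(entry):
    # The dict contains only JSON-friendly types, so it can be sent as-is.
    return json.dumps(entry_to_dict_sketch(entry))
```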
### Frozen `activity_logged` event payload shape
```python
{
    "type": "activity_logged",
    "entry": {
        "id": str,                  # "al_<8-hex>"
        "ts": str,                  # ISO-8601 UTC string
        "category": str,
        "action": str,
        "severity": str,
        "actor": str,
        "entity_type": str | None,
        "entity_id": str | None,
        "entity_name": str | None,
        "message": str,
        "metadata": dict,           # real dict, not JSON string
    }
}
```
### DI getter names (in `api/dependencies.py`)
```python
from ledgrab.api.dependencies import (
    get_activity_recorder,
    get_activity_log_repo,
    get_activity_log_retention_engine,
)
```
### Notes for Phase 3
- Phase 3 instruments `fire_entity_event` in `api/dependencies.py` by calling
`get_module_recorder()` there (not via FastAPI DI — it's a plain function).
- The actor ContextVar is already set by `verify_api_key` before any route
handler runs, so entity events carry the correct actor automatically.
- `recorder.record(...)` never raises; Phase 3 call sites need no try/except.
Phase 2 landed (2026-06-09): ActivityRecorder, actor ContextVar, ActivityLogRetentionEngine,
all wiring in main.py/dependencies.py/auth.py, activity_logged allowlist in events-ws.ts,
24 new tests — all green. Full suite 2309 passed, 2 skipped, 0 failed. Ruff clean.