Phase 1: Storage — model, migration, repository

Status: ⬜ Not Started
Parent plan: PLAN.md
Domain: data

Objective

Create the persistent foundation for the audit log: an ActivityLogEntry dataclass, an additive idempotent SQLite migration that creates a dedicated indexed activity_log table, and a purpose-built ActivityLogRepository (NOT BaseSqliteStore) supporting append, keyset-paginated filtered query, count, time/count-based prune, and streaming export.

Tasks

Create server/src/ledgrab/storage/activity_log.py:
- ActivityCategory and ActivitySeverity string enums (or Literal unions used as constants). Categories: auth, device, entity, capture, system. Severities: info, warning, error.
- @dataclass ActivityLogEntry with fields: id: str (e.g. al_<uuid8>), ts: datetime (UTC, server-assigned), category: str, action: str, severity: str, actor: str, entity_type: str | None, entity_id: str | None, entity_name: str | None, message: str, metadata: dict (small JSON; default empty). Provide to_row() / from_row() (column tuple/dict ↔ dataclass; metadata JSON-encoded; ts isoformat).
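A minimal sketch of the entry dataclass and its row codec, under stated assumptions: the helper new_activity_id, the field ordering, and the dict-shaped row are illustrative choices, not the project's confirmed layout. The timestamp is serialized with a fixed microsecond width so that stored values compare correctly as text.

```python
from __future__ import annotations

import json
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Mapping


def new_activity_id() -> str:
    """Hypothetical helper: 'al_' plus the first 8 hex chars of a UUID4."""
    return f"al_{uuid.uuid4().hex[:8]}"


@dataclass
class ActivityLogEntry:
    category: str
    action: str
    severity: str
    actor: str
    message: str
    entity_type: str | None = None
    entity_id: str | None = None
    entity_name: str | None = None
    metadata: dict[str, Any] = field(default_factory=dict)
    id: str = field(default_factory=new_activity_id)
    # Server-assigned, timezone-aware UTC timestamp.
    ts: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def to_row(self) -> dict[str, Any]:
        """Encode to column values: ts as ISO 8601 text, metadata as JSON text."""
        return {
            "id": self.id,
            "ts": self.ts.astimezone(timezone.utc).isoformat(timespec="microseconds"),
            "category": self.category,
            "action": self.action,
            "severity": self.severity,
            "actor": self.actor,
            "entity_type": self.entity_type,
            "entity_id": self.entity_id,
            "entity_name": self.entity_name,
            "message": self.message,
            "metadata": json.dumps(self.metadata, separators=(",", ":")),
        }

    @classmethod
    def from_row(cls, row: Mapping[str, Any]) -> ActivityLogEntry:
        """Decode a mapping-style row (for example sqlite3.Row) back to an entry."""
        return cls(
            id=row["id"],
            ts=datetime.fromisoformat(row["ts"]),
            category=row["category"],
            action=row["action"],
            severity=row["severity"],
            actor=row["actor"],
            entity_type=row["entity_type"],
            entity_id=row["entity_id"],
            entity_name=row["entity_name"],
            message=row["message"],
            metadata=json.loads(row["metadata"] or "{}"),
        )
```

The seq column is deliberately absent from the dataclass in this sketch because the plan states it is DB-assigned; whether from_row should also surface seq for pagination cursors is left open here.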
Add migration to server/src/ledgrab/storage/data_migrations.py:
- New DataMigration subclass AddActivityLogTableMigration with unique name (next sequential id, e.g. "NNN_add_activity_log" — match existing naming) and apply(conn) creating activity_log with an INTEGER PRIMARY KEY AUTOINCREMENT seq (monotonic keyset tiebreaker) plus columns: id TEXT UNIQUE NOT NULL, ts TEXT NOT NULL, category TEXT NOT NULL, action TEXT NOT NULL, severity TEXT NOT NULL, actor TEXT NOT NULL, entity_type TEXT, entity_id TEXT, entity_name TEXT, message TEXT NOT NULL, metadata TEXT NOT NULL DEFAULT '{}'.
- Indexes: (ts DESC, seq DESC) (primary keyset/sort), category, severity, actor, (entity_type, entity_id). Use CREATE TABLE/INDEX IF NOT EXISTS for idempotency.
- Append the instance to ALL_MIGRATIONS (never reorder existing entries).
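The schema and indexes above could look roughly like the following. This is a sketch against the standard-library sqlite3 module only: the project's DataMigration base class and its apply(conn) contract are not reproduced, and the function name and index names are hypothetical.

```python
import sqlite3

# Every statement uses IF NOT EXISTS, so running the list twice is a no-op.
ACTIVITY_LOG_DDL = (
    """
    CREATE TABLE IF NOT EXISTS activity_log (
        seq         INTEGER PRIMARY KEY AUTOINCREMENT,
        id          TEXT UNIQUE NOT NULL,
        ts          TEXT NOT NULL,
        category    TEXT NOT NULL,
        action      TEXT NOT NULL,
        severity    TEXT NOT NULL,
        actor       TEXT NOT NULL,
        entity_type TEXT,
        entity_id   TEXT,
        entity_name TEXT,
        message     TEXT NOT NULL,
        metadata    TEXT NOT NULL DEFAULT '{}'
    )
    """,
    # Index names below are illustrative, not taken from the codebase.
    "CREATE INDEX IF NOT EXISTS idx_activity_log_ts_seq"
    " ON activity_log (ts DESC, seq DESC)",
    "CREATE INDEX IF NOT EXISTS idx_activity_log_category"
    " ON activity_log (category)",
    "CREATE INDEX IF NOT EXISTS idx_activity_log_severity"
    " ON activity_log (severity)",
    "CREATE INDEX IF NOT EXISTS idx_activity_log_actor"
    " ON activity_log (actor)",
    "CREATE INDEX IF NOT EXISTS idx_activity_log_entity"
    " ON activity_log (entity_type, entity_id)",
)


def apply_activity_log_migration(conn: sqlite3.Connection) -> None:
    """Create the table and its indexes; safe to call repeatedly."""
    for statement in ACTIVITY_LOG_DDL:
        conn.execute(statement)
    conn.commit()
```

In the real migration the body of this function would presumably live in AddActivityLogTableMigration.apply(conn), with commit handling following whatever the existing MigrationRunner expects.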
Create server/src/ledgrab/storage/activity_log_repository.py:
- class ActivityLogRepository taking db: Database (NOT subclassing BaseSqliteStore).
- record(entry: ActivityLogEntry) -> None: single parameterized INSERT via db.execute(...) (auto-commit). The seq is DB-assigned. Caller guarantees this runs on the event-loop thread (see Phase 2 — cross-thread marshaling lives in the recorder).
- query(filters: ActivityLogFilters, *, before_seq: int | None, limit: int) -> list[ActivityLogEntry]: keyset pagination (WHERE seq < ? ORDER BY seq DESC LIMIT ?, with the seq < ? clause omitted when before_seq is None) plus optional filters: category IN (...), severity IN (...), actor = ?, entity_type = ?, entity_id = ?, ts >= ? / ts <= ?, message LIKE ? (free-text; the pattern is %q% with the LIKE wildcards % and _ in q escaped). All values are bound as parameters.
- count(filters) -> int.
- prune(*, before_ts: datetime | None, max_entries: int | None) -> int: delete rows older than before_ts, and/or trim to the newest max_entries by seq. Returns rows deleted.
- clear() -> int: delete all rows (used by the API clear endpoint; the clear action is itself audited by the recorder, not here). Returns rows deleted.
- iter_export(filters) -> Iterator[ActivityLogEntry]: cursor-based streaming for export (does not load all rows into memory).
- Define a small ActivityLogFilters dataclass (all-optional fields) in the repository or activity_log.py and reuse it across query/count/iter_export. (prune takes before_ts/max_entries directly, not a filters object.)
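The filtered keyset query described above can be sketched as follows. Assumptions: it talks to a raw sqlite3 connection instead of the project's Database wrapper, the ActivityLogFilters field names are hypothetical, and it returns raw rows where the repository would decode them to ActivityLogEntry.

```python
from __future__ import annotations

import sqlite3
from dataclasses import dataclass
from typing import Any, Sequence


@dataclass
class ActivityLogFilters:
    # All fields optional; names are illustrative, not from the codebase.
    categories: Sequence[str] | None = None
    severities: Sequence[str] | None = None
    actor: str | None = None
    entity_type: str | None = None
    entity_id: str | None = None
    ts_from: str | None = None  # ISO 8601 text, inclusive lower bound
    ts_to: str | None = None    # ISO 8601 text, inclusive upper bound
    text: str | None = None     # free-text search over message


def escape_like(term: str) -> str:
    """Escape LIKE wildcards so user text matches literally."""
    return term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")


def build_where(
    filters: ActivityLogFilters, before_seq: int | None
) -> tuple[str, list[Any]]:
    """Build a WHERE clause from fixed column names and bound parameters."""
    clauses: list[str] = []
    params: list[Any] = []
    if before_seq is not None:
        clauses.append("seq < ?")
        params.append(before_seq)
    for column, values in (
        ("category", filters.categories),
        ("severity", filters.severities),
    ):
        if values:
            placeholders = ", ".join("?" for _ in values)
            clauses.append(f"{column} IN ({placeholders})")
            params.extend(values)
    for clause, value in (
        ("actor = ?", filters.actor),
        ("entity_type = ?", filters.entity_type),
        ("entity_id = ?", filters.entity_id),
        ("ts >= ?", filters.ts_from),
        ("ts <= ?", filters.ts_to),
    ):
        if value is not None:
            clauses.append(clause)
            params.append(value)
    if filters.text:
        clauses.append("message LIKE ? ESCAPE '\\'")
        params.append(f"%{escape_like(filters.text)}%")
    where = " WHERE " + " AND ".join(clauses) if clauses else ""
    return where, params


def query_page(
    conn: sqlite3.Connection,
    filters: ActivityLogFilters,
    *,
    before_seq: int | None,
    limit: int,
) -> list[Any]:
    """Return one page, newest first; pass the last row's seq for the next page."""
    where, params = build_where(filters, before_seq)
    sql = f"SELECT * FROM activity_log{where} ORDER BY seq DESC LIMIT ?"
    return conn.execute(sql, [*params, limit]).fetchall()
```

Only column names and placeholder lists are formatted into the SQL string; every user-supplied value travels as a bound parameter. A count(filters) could reuse build_where with before_seq=None and a SELECT COUNT(*) statement.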
Unit tests in server/tests/storage/test_activity_log_repository.py:
- insert + read back round-trip (incl. metadata JSON, UTC ts);
- filter by each dimension (category/severity/actor/entity/date/free-text);
- keyset pagination stability across two pages with same-ts rows (seq tiebreaker);
- prune by age and by max_entries;
- clear; count; export iterator yields all matching rows;
- migration idempotency (constructing the repo twice / running migrations twice is safe).

Files to Modify/Create

server/src/ledgrab/storage/activity_log.py — new: dataclass + enums + filters + row codec
server/src/ledgrab/storage/data_migrations.py — modify: add migration + append to ALL_MIGRATIONS
server/src/ledgrab/storage/activity_log_repository.py — new: repository
server/tests/storage/test_activity_log_repository.py — new: unit tests

Acceptance Criteria

activity_log table + indexes created idempotently on startup (running migrations twice is a no-op).
Query is keyset-paginated and index-backed; a 10k-row table never loads fully into memory.
Pagination is stable when many rows share the same millisecond ts (uses seq tiebreaker).
prune removes by age AND by max-entry cap; clear empties the table; export streams.
All filters use parameterized SQL (no string interpolation of user input).
New unit tests pass; ruff check clean; existing tests still green.
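To make the prune and streaming-export criteria concrete, here is a hedged sketch on a raw sqlite3 connection. The function shapes mirror the plan's signatures, but the batch_size parameter is an assumption, the export omits filters for brevity, and the age comparison assumes ts is stored as fixed-width UTC ISO 8601 text so that text order equals time order.

```python
from __future__ import annotations

import sqlite3
from datetime import datetime, timezone
from typing import Any, Iterator


def prune(
    conn: sqlite3.Connection,
    *,
    before_ts: datetime | None = None,
    max_entries: int | None = None,
) -> int:
    """Delete rows older than before_ts and/or beyond the newest max_entries."""
    deleted = 0
    with conn:  # both deletes commit together, or roll back together on error
        if before_ts is not None:
            cutoff = before_ts.astimezone(timezone.utc).isoformat(
                timespec="microseconds"
            )
            cur = conn.execute("DELETE FROM activity_log WHERE ts < ?", (cutoff,))
            deleted += cur.rowcount
        if max_entries is not None:
            # The subquery finds the seq of the first row past the cap. If the
            # table holds max_entries rows or fewer it yields NULL, the
            # comparison is never true, and nothing is deleted.
            cur = conn.execute(
                "DELETE FROM activity_log WHERE seq <= ("
                " SELECT seq FROM activity_log"
                " ORDER BY seq DESC LIMIT 1 OFFSET ?)",
                (max_entries,),
            )
            deleted += cur.rowcount
    return deleted


def iter_export(conn: sqlite3.Connection, batch_size: int = 500) -> Iterator[Any]:
    """Yield rows newest first, fetching batch_size rows at a time."""
    cur = conn.execute("SELECT * FROM activity_log ORDER BY seq DESC")
    while True:
        rows = cur.fetchmany(batch_size)
        if not rows:
            return
        yield from rows
```

Because iter_export is a generator over an open cursor, at most batch_size rows are held in Python at once; the caller should consume it on the same thread that owns the connection, in line with the threading note in this phase.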

Notes

Reference patterns: storage/database.py (execute, transaction, get_setting), storage/data_migrations.py (DataMigration, MigrationRunner, ALL_MIGRATIONS), storage/sync_clock.py (dataclass to_dict/from_dict style).
🔒 Migration-safety addendum (data domain): this migration is purely additive (new table) — no rename, no field/key/file move, no data movement → no data-loss risk. Still idempotent (IF NOT EXISTS). Rollback = drop the table; no user data is transformed.
Do NOT wire the repository into main.py or dependencies.py here — that is Phase 2.
Database's connection is created with the existing threading model; the repository must not assume it can be called from arbitrary threads. Thread marshaling is Phase 2's job.

Review Checklist

All tasks completed
Code follows project conventions (dataclass codec style, migration naming)
No unintended side effects (no startup wiring yet)
Build passes (ruff + pytest)
Tests pass (new + existing)
