# Phase 1: Storage (model, migration, repository)

**Status:** ⬜ Not Started
**Parent plan:** [PLAN.md](./PLAN.md)
**Domain:** data

## Objective

Create the persistent foundation for the audit log: an `ActivityLogEntry` dataclass, an
additive, idempotent SQLite migration that creates a dedicated indexed `activity_log` table,
and a purpose-built `ActivityLogRepository` (NOT `BaseSqliteStore`) supporting append,
keyset-paginated filtered query, count, time/count-based prune, and streaming export.

## Tasks

- [ ] Create `server/src/ledgrab/storage/activity_log.py`:
  - `ActivityCategory` and `ActivitySeverity` string enums (or `Literal` unions used as
    constants). Categories: `auth`, `device`, `entity`, `capture`, `system`. Severities:
    `info`, `warning`, `error`.
  - `@dataclass ActivityLogEntry` with fields: `id: str` (e.g. `al_<uuid8>`), `ts: datetime`
    (UTC, server-assigned), `category: str`, `action: str`, `severity: str`, `actor: str`,
    `entity_type: str | None`, `entity_id: str | None`, `entity_name: str | None`,
    `message: str`, `metadata: dict` (small JSON; default empty). Provide `to_row()` /
    `from_row()` (column tuple/dict ↔ dataclass; `metadata` JSON-encoded; `ts` isoformat).
- [ ] Add migration to `server/src/ledgrab/storage/data_migrations.py`:
  - New `DataMigration` subclass `AddActivityLogTableMigration` with unique `name`
    (next sequential id, e.g. `"NNN_add_activity_log"`; match existing naming) and
    `apply(conn)` creating `activity_log` with an `INTEGER PRIMARY KEY AUTOINCREMENT` column `seq`
    (monotonic keyset tiebreaker) plus columns: `id TEXT UNIQUE NOT NULL`, `ts TEXT NOT NULL`,
    `category TEXT NOT NULL`, `action TEXT NOT NULL`, `severity TEXT NOT NULL`,
    `actor TEXT NOT NULL`, `entity_type TEXT`, `entity_id TEXT`, `entity_name TEXT`,
    `message TEXT NOT NULL`, `metadata TEXT NOT NULL DEFAULT '{}'`.
  - Indexes: `(ts DESC, seq DESC)` (primary keyset/sort), `category`, `severity`, `actor`,
    `(entity_type, entity_id)`. Use `CREATE TABLE/INDEX IF NOT EXISTS` for idempotency.
  - Append the instance to `ALL_MIGRATIONS` (never reorder existing entries).
- [ ] Create `server/src/ledgrab/storage/activity_log_repository.py`:
  - `class ActivityLogRepository` taking `db: Database` (NOT subclassing `BaseSqliteStore`).
  - `record(entry: ActivityLogEntry) -> None`: single parameterized INSERT via
    `db.execute(...)` (auto-commit). The `seq` is DB-assigned. **Caller guarantees this runs
    on the event-loop thread** (see Phase 2; cross-thread marshaling lives in the recorder).
  - `query(filters: ActivityLogFilters, *, before_seq: int | None, limit: int) -> list[ActivityLogEntry]`:
    keyset pagination `WHERE seq < ? ORDER BY seq DESC LIMIT ?` plus optional filters:
    `category IN (...)`, `severity IN (...)`, `actor = ?`, `entity_type = ?`, `entity_id = ?`,
    `ts >= ?` / `ts <= ?`, `message LIKE ?` (free-text match on `%q%`, with LIKE wildcards in
    `q` escaped). All parameterized.
  - `count(filters) -> int`.
  - `prune(*, before_ts: datetime | None, max_entries: int | None) -> int`: delete rows older
    than `before_ts`, and/or trim to the newest `max_entries` by `seq`. Returns rows deleted.
  - `clear() -> int`: delete all rows (used by the API clear endpoint; the clear action is
    itself audited by the recorder, not here). Returns rows deleted.
  - `iter_export(filters) -> Iterator[ActivityLogEntry]`: cursor-based streaming for export
    (does not load all rows into memory).
  - Define a small `ActivityLogFilters` dataclass (all-optional fields) in the repository or
    `activity_log.py` and reuse it across query/count/export.
- [ ] Unit tests in `server/tests/storage/test_activity_log_repository.py`:
  - insert + read back round-trip (incl. metadata JSON, UTC ts);
  - filter by each dimension (category/severity/actor/entity/date/free-text);
  - keyset pagination stability across two pages with same-`ts` rows (seq tiebreaker);
  - prune by age and by max_entries;
  - clear; count; export iterator yields all matching rows;
  - migration idempotency (constructing the repo twice / running migrations twice is safe).

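The `ActivityLogEntry` row codec described in the first task could look roughly like the sketch below. It is illustrative only: the `new_entry_id` and `utc_now` helpers, the default factories, and the field order (required fields first so that defaults are legal in a dataclass) are assumptions rather than the settled design, and `from_row` skips the `seq` column simply because this sketch does not model it.

```python
from __future__ import annotations

import json
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone


def new_entry_id() -> str:
    """Hypothetical id helper: 'al_' plus the first 8 hex chars of a UUID4."""
    return f"al_{uuid.uuid4().hex[:8]}"


def utc_now() -> datetime:
    """Hypothetical helper: timezone-aware 'now' in UTC."""
    return datetime.now(timezone.utc)


@dataclass
class ActivityLogEntry:
    category: str
    action: str
    severity: str
    actor: str
    message: str
    entity_type: str | None = None
    entity_id: str | None = None
    entity_name: str | None = None
    metadata: dict = field(default_factory=dict)
    id: str = field(default_factory=new_entry_id)
    ts: datetime = field(default_factory=utc_now)

    def to_row(self) -> dict:
        """Encode for a named-parameter INSERT: ts as ISO text, metadata as JSON text."""
        return {
            "id": self.id,
            "ts": self.ts.isoformat(),
            "category": self.category,
            "action": self.action,
            "severity": self.severity,
            "actor": self.actor,
            "entity_type": self.entity_type,
            "entity_id": self.entity_id,
            "entity_name": self.entity_name,
            "message": self.message,
            "metadata": json.dumps(self.metadata),
        }

    @classmethod
    def from_row(cls, row) -> ActivityLogEntry:
        """Decode a mapping-like row (a dict or sqlite3.Row); the seq column is ignored."""
        data = {key: row[key] for key in row.keys() if key != "seq"}
        data["ts"] = datetime.fromisoformat(data["ts"])
        data["metadata"] = json.loads(data["metadata"])
        return cls(**data)
```

Returning a dict from `to_row()` keeps the INSERT readable with named parameters; a tuple codec would work equally well if the existing stores prefer positional binding.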
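The migration task boils down to a block of idempotent DDL. The sketch below shows that DDL against a plain `sqlite3` connection; the function name `apply_activity_log_schema` and the `idx_activity_log_*` index names are hypothetical, and in the real code the statements would live inside `AddActivityLogTableMigration.apply(conn)`, whose base-class contract is not shown here.

```python
import sqlite3

# Table and index DDL taken from the task list; index names are illustrative.
ACTIVITY_LOG_SCHEMA = """
CREATE TABLE IF NOT EXISTS activity_log (
    seq         INTEGER PRIMARY KEY AUTOINCREMENT,
    id          TEXT UNIQUE NOT NULL,
    ts          TEXT NOT NULL,
    category    TEXT NOT NULL,
    action      TEXT NOT NULL,
    severity    TEXT NOT NULL,
    actor       TEXT NOT NULL,
    entity_type TEXT,
    entity_id   TEXT,
    entity_name TEXT,
    message     TEXT NOT NULL,
    metadata    TEXT NOT NULL DEFAULT '{}'
);
CREATE INDEX IF NOT EXISTS idx_activity_log_ts_seq
    ON activity_log (ts DESC, seq DESC);
CREATE INDEX IF NOT EXISTS idx_activity_log_category
    ON activity_log (category);
CREATE INDEX IF NOT EXISTS idx_activity_log_severity
    ON activity_log (severity);
CREATE INDEX IF NOT EXISTS idx_activity_log_actor
    ON activity_log (actor);
CREATE INDEX IF NOT EXISTS idx_activity_log_entity
    ON activity_log (entity_type, entity_id);
"""


def apply_activity_log_schema(conn: sqlite3.Connection) -> None:
    """Create the table and indexes; safe to call repeatedly thanks to IF NOT EXISTS."""
    conn.executescript(ACTIVITY_LOG_SCHEMA)


if __name__ == "__main__":
    demo = sqlite3.connect(":memory:")
    apply_activity_log_schema(demo)
    apply_activity_log_schema(demo)  # second run changes nothing
```

Because every statement carries `IF NOT EXISTS`, a repeated run leaves the schema untouched, which is the property the idempotency test is meant to pin down.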
## Files to Modify/Create

- `server/src/ledgrab/storage/activity_log.py` (new): dataclass + enums + filters + row codec
- `server/src/ledgrab/storage/data_migrations.py` (modify): add migration + append to `ALL_MIGRATIONS`
- `server/src/ledgrab/storage/activity_log_repository.py` (new): repository
- `server/tests/storage/test_activity_log_repository.py` (new): unit tests

## Acceptance Criteria

- `activity_log` table + indexes created idempotently on startup (running migrations twice is a no-op).
- Query is keyset-paginated and index-backed; a 10k-row table never loads fully into memory.
- Pagination is stable when many rows share the same millisecond `ts` (uses `seq` tiebreaker).
- `prune` removes by age AND by max-entry cap; `clear` empties the table; `iter_export` streams.
- All filters use parameterized SQL (no string interpolation of user input).
- New unit tests pass; `ruff check` clean; existing tests still green.

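The pagination and parameterization criteria can be illustrated with a minimal sketch. It runs against a cut-down demo table, and `escape_like` and `query_page` are hypothetical names; the real `query` takes an `ActivityLogFilters` object and returns `ActivityLogEntry` instances. Only `?` placeholders are ever formatted into the SQL text, and user values travel separately as parameters.

```python
import sqlite3


def escape_like(text: str) -> str:
    """Escape LIKE wildcards so user text is matched literally."""
    return text.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")


def query_page(conn, *, before_seq=None, limit=50, categories=(), text=None):
    """Return up to `limit` (seq, category, message) rows, newest first."""
    clauses, params = [], []
    if before_seq is not None:
        clauses.append("seq < ?")  # keyset cursor: strictly older than the last row seen
        params.append(before_seq)
    if categories:
        # Builds "?, ?, ?": placeholders only, never the values themselves.
        clauses.append(f"category IN ({', '.join('?' * len(categories))})")
        params.extend(categories)
    if text:
        clauses.append("message LIKE ? ESCAPE '\\'")
        params.append(f"%{escape_like(text)}%")
    where = f"WHERE {' AND '.join(clauses)}" if clauses else ""
    sql = (
        "SELECT seq, category, message FROM activity_log "
        f"{where} ORDER BY seq DESC LIMIT ?"
    )
    return conn.execute(sql, (*params, limit)).fetchall()


# Demo: five rows sharing one timestamp still paginate without gaps or repeats.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE activity_log ("
    "seq INTEGER PRIMARY KEY AUTOINCREMENT, ts TEXT, category TEXT, message TEXT)"
)
same_ts = "2024-01-01T00:00:00+00:00"
conn.executemany(
    "INSERT INTO activity_log (ts, category, message) VALUES (?, ?, ?)",
    [
        (same_ts, "auth", "login ok"),
        (same_ts, "device", "load 50%"),
        (same_ts, "auth", "logout"),
        (same_ts, "system", "load 500"),
        (same_ts, "auth", "login failed"),
    ],
)
page1 = query_page(conn, limit=2)
page2 = query_page(conn, before_seq=page1[-1][0], limit=2)
```

Passing the last `seq` of one page as `before_seq` for the next is what keeps pages stable when timestamps collide, since `seq` is unique and monotonic while `ts` is not.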
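For the prune and export criteria, one possible shape is sketched below against a plain `sqlite3` connection. The standalone `prune` and `iter_export` functions, the `batch_size` parameter, and the reduced column list are assumptions made for illustration. The age comparison relies on every `ts` being stored as a UTC ISO-8601 string in the same format, so that text order matches time order.

```python
import sqlite3
from datetime import datetime, timezone
from typing import Iterator, Optional


def prune(
    conn: sqlite3.Connection,
    *,
    before_ts: Optional[datetime] = None,
    max_entries: Optional[int] = None,
) -> int:
    """Delete rows older than before_ts and/or all but the newest max_entries rows."""
    deleted = 0
    with conn:  # commit both steps together, roll back on error
        if before_ts is not None:
            cur = conn.execute(
                "DELETE FROM activity_log WHERE ts < ?", (before_ts.isoformat(),)
            )
            deleted += cur.rowcount
        if max_entries is not None:
            cur = conn.execute(
                "DELETE FROM activity_log WHERE seq NOT IN "
                "(SELECT seq FROM activity_log ORDER BY seq DESC LIMIT ?)",
                (max_entries,),
            )
            deleted += cur.rowcount
    return deleted


def iter_export(conn: sqlite3.Connection, batch_size: int = 500) -> Iterator[tuple]:
    """Yield rows oldest-first, holding at most batch_size rows in memory at a time."""
    cur = conn.execute("SELECT seq, ts, message FROM activity_log ORDER BY seq")
    while True:
        batch = cur.fetchmany(batch_size)
        if not batch:
            return
        yield from batch
```

Running the age cut first and the cap second means the cap applies to whatever survives the age cut, and the returned count is the sum of both deletions.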
## Notes

- Reference patterns: `storage/database.py` (`execute`, `transaction`, `get_setting`),
  `storage/data_migrations.py` (`DataMigration`, `MigrationRunner`, `ALL_MIGRATIONS`),
  `storage/sync_clock.py` (dataclass `to_dict`/`from_dict` style).
- 🔒 **Migration-safety addendum (data domain):** this migration is purely additive (new
  table): no rename, no field/key/file move, and no data movement, so there is no data-loss
  risk. It is still idempotent (`IF NOT EXISTS`). Rollback = drop the table; no user data is
  transformed.
- Do NOT wire the repository into `main.py` or `dependencies.py` here; that is Phase 2.
- `Database`'s connection is created with the existing threading model; the repository must
  not assume it can be called from arbitrary threads. Thread marshaling is Phase 2's job.

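The threading note can be made concrete with Python's standard `sqlite3` module, whose connections refuse use from a thread other than the one that created them unless `check_same_thread=False` is passed. How `Database` actually configures its connection is not shown in this plan, so the sketch below only illustrates the general hazard; `error_from_other_thread` is a hypothetical helper.

```python
from __future__ import annotations

import sqlite3
import threading


def error_from_other_thread(conn: sqlite3.Connection) -> Exception | None:
    """Run a query on `conn` from a new thread and return the error raised, if any."""
    outcome: list[Exception | None] = []

    def worker() -> None:
        try:
            conn.execute("SELECT 1")
            outcome.append(None)
        except sqlite3.ProgrammingError as exc:
            # Default connections are bound to the thread that created them.
            outcome.append(exc)

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return outcome[0]


if __name__ == "__main__":
    strict = sqlite3.connect(":memory:")
    print(type(error_from_other_thread(strict)).__name__)
```

This is why `record()` documents an event-loop-thread guarantee instead of trying to be thread-safe on its own.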
## Review Checklist

- [ ] All tasks completed
- [ ] Code follows project conventions (dataclass codec style, migration naming)
- [ ] No unintended side effects (no startup wiring yet)
- [ ] Build passes (ruff + pytest)
- [ ] Tests pass (new + existing)

## Handoff to Next Phase

<!-- Filled in by the implementer: final ActivityLogEntry field list + the ActivityLogFilters
shape (Phase 2/4 depend on the frozen schema), the migration name used, and the exact
repository method signatures. -->