feat(activity-log): phase 1 - storage model, migration, repository

- ActivityLogEntry dataclass + ActivityCategory/ActivitySeverity + ActivityLogFilters
- additive idempotent migration 002_add_activity_log (indexed activity_log table, seq keyset tiebreaker)
- ActivityLogRepository (record/query/count/prune/clear/iter_export), keyset pagination, parameterized SQL
- 102 unit + adversarial tests (SQL-injection, pagination, prune, codec, migration idempotency)
2026-06-09 17:40:37 +03:00
parent 1afe7d6fcc
commit 1ac4a0f66d
8 changed files with 2100 additions and 18 deletions
+3 -3
@@ -45,8 +45,8 @@ context (survives across phases; graduates to CLAUDE.md only if it's a lasting p
## Frozen contracts (fill as phases complete)
- ActivityLogEntry fields / dict shape: _(Phase 1/2 handoff)_
- ActivityLogFilters shape: _(Phase 1 handoff)_
- ActivityLogEntry fields / dict shape: **frozen** — see phase-1-storage.md Handoff section. 11 fields: `id`, `ts`, `category`, `action`, `severity`, `actor`, `message`, `entity_type`, `entity_id`, `entity_name`, `metadata`. `seq` is DB-only (not on dataclass).
- ActivityLogFilters shape: **frozen** — 8 optional fields: `categories`, `severities`, `actor`, `entity_type`, `entity_id`, `since`, `until`, `message_like`. See phase-1-storage.md Handoff.
- recorder.record(...) signature + actor ContextVar import path: _(Phase 2 handoff)_
- API endpoints + query params + page envelope + settings bounds: _(Phase 4 handoff)_
@@ -65,4 +65,4 @@ context (survives across phases; graduates to CLAUDE.md only if it's a lasting p
## Phase progress notes
_(Orchestrator appends a short note per phase: what landed, commit sha, any warnings.)_
Phase 1 landed (2026-06-09): `activity_log.py` (dataclass + enums + filters + codec), `AddActivityLogTableMigration` (`002_add_activity_log`) appended to `ALL_MIGRATIONS`, `ActivityLogRepository` (record/query/count/prune/clear/iter_export), 41 new tests, all green. Full suite: 2226 passed, 0 failed. Schema and method signatures are frozen in the phase-1-storage.md Handoff section. Gotcha: `Database.execute` takes a positional tuple, so use `?` placeholders (not `:name`); Python 3.14 raises `ProgrammingError` when named placeholders are bound from a sequence.
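The placeholder gotcha can be reproduced with the standard-library `sqlite3` module on its own. This is a minimal sketch that assumes `Database.execute` ultimately forwards its SQL and parameters to `sqlite3`; the `demo` table and its values are purely illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (actor TEXT, message TEXT)")

# qmark style: positional "?" placeholders bound from a tuple.
conn.execute(
    "INSERT INTO demo (actor, message) VALUES (?, ?)",
    ("system", "started"),
)

# Named ":name" placeholders are only valid with a mapping, never a tuple.
conn.execute(
    "INSERT INTO demo (actor, message) VALUES (:actor, :message)",
    {"actor": "api-key-1", "message": "login"},
)

rows = conn.execute("SELECT actor, message FROM demo ORDER BY rowid").fetchall()
```

Because the repository passes tuples, the first form is the one that applies to every statement it issues.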
+2 -2
@@ -59,7 +59,7 @@ is an on-demand CSV/JSON **export** (no separate backup subsystem).
## Phases
- [ ] Phase 1: Storage — model, migration, repository [domain: data] → [subplan](./phase-1-storage.md)
- [x] Phase 1: Storage — model, migration, repository [domain: data] → [subplan](./phase-1-storage.md)
- [ ] Phase 2: Recorder, actor context, retention, lifecycle [domain: backend] → [subplan](./phase-2-recorder-retention.md)
- [ ] Phase 3: Event instrumentation (4 categories) [domain: backend] → [subplan](./phase-3-instrumentation.md)
- [ ] Phase 4: REST API — query/filter/export/settings/clear [domain: backend] → [subplan](./phase-4-api.md)
@@ -79,7 +79,7 @@ is an on-demand CSV/JSON **export** (no separate backup subsystem).
| Phase | Domain | Status | Review | Build | Committed |
|-------|--------|--------|--------|-------|-----------|
| Phase 1: Storage | data | ⬜ Not Started | ⬜ | ⬜ | |
| Phase 1: Storage | data | ✅ Done | ✅ Passed | ✅ Passed | |
| Phase 2: Recorder/Retention | backend | ⬜ Not Started | ⬜ | ⬜ | ⬜ |
| Phase 3: Instrumentation | backend | ⬜ Not Started | ⬜ | ⬜ | ⬜ |
| Phase 4: REST API | backend | ⬜ Not Started | ⬜ | ⬜ | ⬜ |
+88 -13
@@ -1,6 +1,6 @@
# Phase 1: Storage — model, migration, repository
**Status:** ⬜ Not Started
**Status:** ✅ Done
**Parent plan:** [PLAN.md](./PLAN.md)
**Domain:** data
@@ -13,7 +13,7 @@ keyset-paginated filtered query, count, time/count-based prune, and streaming ex
## Tasks
- [ ] Create `server/src/ledgrab/storage/activity_log.py`:
- [x] Create `server/src/ledgrab/storage/activity_log.py`:
- `ActivityCategory` and `ActivitySeverity` string enums (or `Literal` unions used as
constants). Categories: `auth`, `device`, `entity`, `capture`, `system`. Severities:
`info`, `warning`, `error`.
@@ -22,7 +22,7 @@ keyset-paginated filtered query, count, time/count-based prune, and streaming ex
`entity_type: str | None`, `entity_id: str | None`, `entity_name: str | None`,
`message: str`, `metadata: dict` (small JSON; default empty). Provide `to_row()` /
`from_row()` (column tuple/dict ↔ dataclass; `metadata` JSON-encoded; `ts` isoformat).
- [ ] Add migration to `server/src/ledgrab/storage/data_migrations.py`:
- [x] Add migration to `server/src/ledgrab/storage/data_migrations.py`:
- New `DataMigration` subclass `AddActivityLogTableMigration` with unique `name`
(next sequential id, e.g. `"NNN_add_activity_log"` — match existing naming) and
`apply(conn)` creating `activity_log` with an INTEGER PRIMARY KEY AUTOINCREMENT `seq`
@@ -33,7 +33,7 @@ keyset-paginated filtered query, count, time/count-based prune, and streaming ex
- Indexes: `(ts DESC, seq DESC)` (primary keyset/sort), `category`, `severity`, `actor`,
`(entity_type, entity_id)`. Use `CREATE TABLE/INDEX IF NOT EXISTS` for idempotency.
- Append the instance to `ALL_MIGRATIONS` (never reorder existing entries).
- [ ] Create `server/src/ledgrab/storage/activity_log_repository.py`:
- [x] Create `server/src/ledgrab/storage/activity_log_repository.py`:
- `class ActivityLogRepository` taking `db: Database` (NOT subclassing `BaseSqliteStore`).
- `record(entry: ActivityLogEntry) -> None`: single parameterized INSERT via
`db.execute(...)` (auto-commit). The `seq` is DB-assigned. **Caller guarantees this runs
@@ -51,7 +51,7 @@ keyset-paginated filtered query, count, time/count-based prune, and streaming ex
(does not load all rows into memory).
- Define a small `ActivityLogFilters` dataclass (all-optional fields) in the repository or
`activity_log.py` and reuse it across query/count/prune/export.
- [ ] Unit tests in `server/tests/storage/test_activity_log_repository.py`:
- [x] Unit tests in `server/tests/storage/test_activity_log_repository.py`:
- insert + read back round-trip (incl. metadata JSON, UTC ts);
- filter by each dimension (category/severity/actor/entity/date/free-text);
- keyset pagination stability across two pages with same-`ts` rows (seq tiebreaker);
@@ -89,14 +89,89 @@ keyset-paginated filtered query, count, time/count-based prune, and streaming ex
## Review Checklist
- [ ] All tasks completed
- [ ] Code follows project conventions (dataclass codec style, migration naming)
- [ ] No unintended side effects (no startup wiring yet)
- [ ] Build passes (ruff + pytest)
- [ ] Tests pass (new + existing)
- [x] All tasks completed
- [x] Code follows project conventions (dataclass codec style, migration naming)
- [x] No unintended side effects (no startup wiring yet)
- [x] Build passes (ruff + pytest)
- [x] Tests pass (new + existing)
## Handoff to Next Phase
<!-- Filled in by the implementer: final ActivityLogEntry field list + the ActivityLogFilters
shape (Phase 2/4 depend on the frozen schema), the migration name used, and the exact
repository method signatures. -->
### ActivityLogEntry — final field list and dict shape
```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class ActivityLogEntry:
    id: str                   # "al_<uuid8>", caller-assigned
    ts: datetime              # UTC-aware; stored as ISO-8601 string in DB
    category: str             # ActivityCategory constant
    action: str               # verb-object label, e.g. "entity.created"
    severity: str             # ActivitySeverity constant
    actor: str                # API-key label or "system"
    message: str              # human-readable description
    entity_type: str | None   # e.g. "output_target"
    entity_id: str | None     # stable entity id
    entity_name: str | None   # name at time of event
    metadata: dict = field(default_factory=dict)  # JSON-serialisable; default {}
```
`to_row()` returns a flat dict with 11 keys (same names as the fields); `metadata` is serialized to a JSON string and `ts` to an ISO-8601 string. `seq` is NOT in `to_row()` because it is DB-assigned.
### ActivityLogFilters — shape (all fields optional, default None)
```python
from collections.abc import Sequence
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ActivityLogFilters:
    categories: Sequence[str] | None = None   # category IN (...)
    severities: Sequence[str] | None = None   # severity IN (...)
    actor: str | None = None                  # exact match
    entity_type: str | None = None            # exact match
    entity_id: str | None = None              # exact match
    since: datetime | None = None             # ts >= since
    until: datetime | None = None             # ts <= until
    message_like: str | None = None           # LIKE %value% (escaped)
```
### Migration name used
`"002_add_activity_log"` — appended as position [1] in `ALL_MIGRATIONS`.
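An idempotent migration of this kind can be sketched with plain `sqlite3`. The column types, `NOT NULL` constraints, and index names below are assumptions for illustration; only the column names, the `seq` primary key, and the indexed column sets come from the plan.

```python
import sqlite3

# Assumed DDL: types, constraints and index names are illustrative only.
STATEMENTS = (
    """
    CREATE TABLE IF NOT EXISTS activity_log (
        seq         INTEGER PRIMARY KEY AUTOINCREMENT,
        id          TEXT NOT NULL,
        ts          TEXT NOT NULL,
        category    TEXT NOT NULL,
        action      TEXT NOT NULL,
        severity    TEXT NOT NULL,
        actor       TEXT NOT NULL,
        message     TEXT NOT NULL,
        entity_type TEXT,
        entity_id   TEXT,
        entity_name TEXT,
        metadata    TEXT NOT NULL DEFAULT '{}'
    )
    """,
    "CREATE INDEX IF NOT EXISTS idx_al_ts_seq ON activity_log (ts DESC, seq DESC)",
    "CREATE INDEX IF NOT EXISTS idx_al_category ON activity_log (category)",
    "CREATE INDEX IF NOT EXISTS idx_al_severity ON activity_log (severity)",
    "CREATE INDEX IF NOT EXISTS idx_al_actor ON activity_log (actor)",
    "CREATE INDEX IF NOT EXISTS idx_al_entity ON activity_log (entity_type, entity_id)",
)


def apply(conn: sqlite3.Connection) -> None:
    """Run every statement; IF NOT EXISTS makes a repeat run a no-op."""
    for statement in STATEMENTS:
        conn.execute(statement)
    conn.commit()


conn = sqlite3.connect(":memory:")
apply(conn)
apply(conn)  # applying twice must not raise or duplicate anything

index_names = sorted(
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'index' AND tbl_name = 'activity_log'"
    )
)
```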
### ActivityLogRepository — exact method signatures
```python
class ActivityLogRepository:
    def __init__(self, db: Database) -> None: ...
    def record(self, entry: ActivityLogEntry) -> None: ...
    def query(
        self,
        filters: ActivityLogFilters,
        *,
        before_seq: int | None = None,
        limit: int = 50,
    ) -> list[ActivityLogEntry]: ...
    def count(self, filters: ActivityLogFilters | None = None) -> int: ...
    def prune(
        self,
        *,
        before_ts: datetime | None = None,
        max_entries: int | None = None,
    ) -> int: ...
    def clear(self) -> int: ...
    def iter_export(
        self, filters: ActivityLogFilters | None = None
    ) -> Iterator[ActivityLogEntry]: ...
```
### Key behavioural notes for Phase 2/3/4
- `record()` expects to be called from the event-loop thread (or with `Database` RLock already held). Phase 2 is responsible for thread marshaling via `loop.call_soon_threadsafe`.
- `query()` returns entries in **ascending chronological order within the page** (reversed internally from DESC fetch for display convenience). The smallest `seq` on a page is `page[0]`'s seq — pass that as `before_seq` for the next page.
- `count(None)` == `count(ActivityLogFilters())` — both count all rows.
- `prune(before_ts=X, max_entries=N)` applies both predicates independently (age prune first, then count cap).
- `iter_export` holds `db._lock` for the entire iteration. Phase 4 should stream the response and consume promptly.
- `ActivityCategory` and `ActivitySeverity` are plain classes with string class-attributes and an `ALL` tuple, NOT `enum.Enum`.
- Imports for Phase 2/3/4:
```python
from ledgrab.storage.activity_log import ActivityLogEntry, ActivityLogFilters, ActivityCategory, ActivitySeverity
from ledgrab.storage.activity_log_repository import ActivityLogRepository
```