alexei.dolgolyov 4a0927521a feat(activity-log): phase 4 - REST API (list/export/settings/clear)

- GET /activity-log: filtered, keyset-paginated list (categories/severities/actor/entity/date/q)
- GET /activity-log/export: streaming CSV/JSON, chunked keyset (releases DB lock per batch), CSV formula-injection guard
- GET/PUT /activity-log/settings: retention config (PUT require_authenticated)
- DELETE /activity-log: clear (require_authenticated, self-audited)
- security: export DoS fix, settings-PUT auth gate, CSV \t/\r guard, metadata-as-JSON
- 122 API tests (auth posture, CSV injection, pagination integrity, filters, settings bounds, clear-audited)

2026-06-09 20:09:46 +03:00

Phase 4: REST API — query / filter / export / settings / clear

Status: ✅ Done
Parent plan: PLAN.md
Domain: backend

Objective

Expose the audit log over the REST API: a filtered, keyset-paginated list endpoint; a streaming CSV/JSON export honoring the same filters; retention settings get/update; and a destructive clear. Apply the project's auth posture (stricter auth on export + clear).

Tasks

server/src/ledgrab/api/schemas/activity_log.py (Pydantic):
- ActivityLogEntryResponse (matches the frozen Phase 2 entry dict shape).
- ActivityLogPageResponse { entries: list[...], next_before_seq: int | None, total: int (count of entries matching the filters, across all pages), has_more: bool }.
- ActivityLogSettingsResponse / UpdateActivityLogSettingsRequest (enabled, max_days, max_entries) with validation bounds.
server/src/ledgrab/api/routes/activity_log.py — APIRouter(prefix="/api/v1/activity-log"):
- GET "" — list. Query params: categories, severities, actor, entity_type, entity_id, since/until (ISO-8601), q (free-text), before_seq (cursor), limit (default 50, max 200). AuthRequired. Maps params → ActivityLogFilters, calls repo.query(...), and returns the page envelope.
- GET "/export" — streaming export. Same filters; format=csv|json. Uses StreamingResponse over repo.iter_export(...). require_authenticated() (may contain IPs/labels). Sets Content-Disposition with a timestamped filename.
- GET "/settings" / PUT "/settings" — retention settings via the retention engine. AuthRequired; updates apply immediately.
- DELETE "" — clear all entries. require_authenticated(). The clear is itself audited (recorder records a system/activity_log_cleared entry AFTER the wipe, so the log shows who cleared it and when).
- Register the router in server/src/ledgrab/api/__init__.py (aggregator).
API tests server/tests/api/routes/test_activity_log_api.py:
- list returns entries; each filter narrows results; before_seq cursor paginates without overlap/gaps; limit cap enforced;
- export CSV and JSON both stream and honor filters; export requires authentication (401 for loopback-anonymous when keys configured);
- settings get/update round-trip + validation rejects out-of-range;
- clear empties the log, requires auth, and leaves exactly one post-clear audit entry.
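
The clear-then-audit behavior and the "exactly one post-clear audit entry" check can be sketched with the standard library's sqlite3. The table layout, the `clear_log` helper, and the choice to commit the wipe and the audit entry in one transaction are assumptions for illustration; they are not the project's recorder or repository API.

```python
import sqlite3
from datetime import datetime, timezone


def clear_log(conn: sqlite3.Connection, actor: str) -> int:
    """Wipe the log, then record who cleared it. Returns the number of rows removed."""
    with conn:  # assumption: the wipe and the audit entry commit together
        removed = conn.execute("SELECT COUNT(*) FROM activity_log").fetchone()[0]
        conn.execute("DELETE FROM activity_log")
        # The audit entry is written AFTER the wipe so that it survives the clear.
        conn.execute(
            "INSERT INTO activity_log (ts, category, action, actor, message) "
            "VALUES (?, ?, ?, ?, ?)",
            (
                datetime.now(timezone.utc).isoformat(),
                "system",
                "activity_log_cleared",
                actor,
                f"Activity log cleared ({removed} entries removed)",
            ),
        )
    return removed


# Hypothetical minimal schema, just enough to exercise the helper.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE activity_log ("
    "seq INTEGER PRIMARY KEY AUTOINCREMENT, ts TEXT, category TEXT, "
    "action TEXT, actor TEXT, message TEXT)"
)
conn.executemany(
    "INSERT INTO activity_log (ts, category, action, actor, message) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("2026-06-09T12:00:00+00:00", "entity", "entity.created", "my-api-key", f"entry {i}")
        for i in range(3)
    ],
)
conn.commit()

removed = clear_log(conn, actor="my-api-key")
remaining = conn.execute("SELECT category, action, actor FROM activity_log").fetchall()
```

After the call, `remaining` holds only the audit row, which is what the clear test asserts against.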

Files to Modify/Create

server/src/ledgrab/api/schemas/activity_log.py — new
server/src/ledgrab/api/routes/activity_log.py — new
server/src/ledgrab/api/__init__.py — modify: register router
server/tests/api/routes/test_activity_log_api.py — new

Acceptance Criteria

List is filterable on every dimension and keyset-paginated (stable, no dupes/gaps).
Export streams CSV + JSON, honors filters, and requires authentication.
Settings get/update works and validates bounds; changes take effect immediately.
Clear requires authentication and is itself audited.
New + existing tests green; ruff clean.

Notes

Auth helpers: AuthRequired dependency for normal endpoints; require_authenticated() for export + clear (pattern: backup download / secret reveal). See api/auth.py + server/CLAUDE.md.
Follow the existing route/schema conventions (one schema file per entity, router registered in api/__init__.py). Reference api/routes/backup.py for settings-style GET/PUT + a StreamingResponse/download pattern.
Reuse the entry→dict serializer from Phase 2 to keep the response shape single-sourced.
Backup/restore: no STORE_MAP change needed — backup is whole-DB; the table is auto-covered.

Review Checklist

All tasks completed
Code follows project conventions (router registration, schema-per-entity, auth posture)
No unintended side effects
Build passes (ruff + pytest)
Tests pass (new + existing)

Handoff to Next Phase

Endpoint paths

| Method | Path | Auth |
| --- | --- | --- |
| GET | `/api/v1/activity-log` | AuthRequired (anonymous allowed) |
| GET | `/api/v1/activity-log/export` | require_authenticated (no anonymous) |
| GET | `/api/v1/activity-log/settings` | AuthRequired |
| PUT | `/api/v1/activity-log/settings` | AuthRequired |
| DELETE | `/api/v1/activity-log` | require_authenticated (no anonymous) |

List query parameters (GET /api/v1/activity-log)

| Param | Type | Default | Notes |
| --- | --- | --- | --- |
| `categories` | `list[str]` | — | Repeatable. Values: auth, device, entity, capture, system |
| `severities` | `list[str]` | — | Repeatable. Values: info, warning, error |
| `actor` | `str` | — | Exact match |
| `entity_type` | `str` | — | Exact match |
| `entity_id` | `str` | — | Exact match |
| `since` | `datetime` (ISO-8601) | — | Inclusive lower bound on ts |
| `until` | `datetime` (ISO-8601) | — | Inclusive upper bound on ts |
| `q` | `str` | — | Substring match on message (LIKE %q%) |
| `before_seq` | `int` | — | Keyset cursor from previous page's `next_before_seq` |
| `limit` | `int` | 50 | Max entries per page. ge=1, le=200 |

Export endpoint (GET /api/v1/activity-log/export) accepts the same filter params plus format=csv|json.
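
As a rough illustration of how these parameters could translate into a query, the sketch below builds a parameterized WHERE clause from a filter object. The `Filters` dataclass is a hypothetical stand-in for ActivityLogFilters, `build_where` is an invented helper, and the column names and ISO-text comparison on `ts` are assumptions; the real repository may do this differently.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class Filters:  # hypothetical stand-in for ActivityLogFilters
    categories: list = field(default_factory=list)
    severities: list = field(default_factory=list)
    actor: Optional[str] = None
    entity_type: Optional[str] = None
    entity_id: Optional[str] = None
    since: Optional[datetime] = None
    until: Optional[datetime] = None
    q: Optional[str] = None


def build_where(f: Filters) -> tuple:
    """Return (sql_fragment, params); user values only ever travel as bound parameters."""
    clauses, params = [], []
    # Repeatable params become IN (...) lists.
    for column, values in (("category", f.categories), ("severity", f.severities)):
        if values:
            clauses.append(f"{column} IN ({', '.join('?' * len(values))})")
            params.extend(values)
    # Exact-match params.
    for column, value in (
        ("actor", f.actor),
        ("entity_type", f.entity_type),
        ("entity_id", f.entity_id),
    ):
        if value is not None:
            clauses.append(f"{column} = ?")
            params.append(value)
    # Inclusive time bounds (assumes ts is stored as comparable ISO-8601 text).
    if f.since is not None:
        clauses.append("ts >= ?")
        params.append(f.since.isoformat())
    if f.until is not None:
        clauses.append("ts <= ?")
        params.append(f.until.isoformat())
    # Free-text substring match on message.
    if f.q:
        clauses.append("message LIKE ?")
        params.append(f"%{f.q}%")
    return (" AND ".join(clauses) or "1=1"), params


where, params = build_where(
    Filters(categories=["auth", "device"], actor="my-api-key", q="Desk")
)
```

Because the list and export endpoints share the same filters, a single builder like this could serve both, with the keyset condition on seq appended separately.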

Page envelope fields (ActivityLogPageResponse)

{
  "entries": [...],          // list[ActivityLogEntryResponse]
  "next_before_seq": 42,     // int | null — pass as before_seq for next page
  "has_more": true,          // bool
  "total": 1337              // int — total matching all filters (all pages)
}

Entry dict shape (ActivityLogEntryResponse)

11 fields — identical to entry_to_dict() output:

{
  "id": "al_abcd1234",
  "ts": "2026-06-09T12:34:56.789+00:00",
  "category": "entity",
  "action": "entity.created",
  "severity": "info",
  "actor": "my-api-key",
  "entity_type": "output_target",
  "entity_id": "pt_abc",
  "entity_name": "Desk",
  "message": "Output target 'Desk' created",
  "metadata": {}
}

Settings field bounds (UpdateActivityLogSettingsRequest)

| Field | Type | ge | le | Notes |
| --- | --- | --- | --- | --- |
| `enabled` | `bool` | — | — | Enable/disable recording |
| `max_days` | `int` | 0 | 3650 | 0 = no age-based pruning |
| `max_entries` | `int` | 0 | 10_000_000 | 0 = no count-based pruning |
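
The bounds above can be checked with a dependency-free sketch. The project schema is Pydantic, where such limits are typically declared with `Field(ge=..., le=...)`; the dataclass below is only a hypothetical stand-in that mirrors the same ranges, and its names are invented for illustration.

```python
from dataclasses import dataclass

MAX_DAYS_LIMIT = 3650
MAX_ENTRIES_LIMIT = 10_000_000


@dataclass(frozen=True)
class RetentionSettings:  # hypothetical stand-in for UpdateActivityLogSettingsRequest
    enabled: bool
    max_days: int
    max_entries: int

    def __post_init__(self) -> None:
        # Mirror the ge/le bounds from the table; 0 is valid and disables that rule.
        if not 0 <= self.max_days <= MAX_DAYS_LIMIT:
            raise ValueError(f"max_days must be in 0..{MAX_DAYS_LIMIT}")
        if not 0 <= self.max_entries <= MAX_ENTRIES_LIMIT:
            raise ValueError(f"max_entries must be in 0..{MAX_ENTRIES_LIMIT}")

    @property
    def prunes_by_age(self) -> bool:
        return self.max_days > 0

    @property
    def prunes_by_count(self) -> bool:
        return self.max_entries > 0


settings = RetentionSettings(enabled=True, max_days=0, max_entries=50_000)

try:
    RetentionSettings(enabled=True, max_days=3651, max_entries=0)
    rejected = False
except ValueError:
    rejected = True
```

Here `settings` keeps entries regardless of age but prunes by count, and the out-of-range `max_days` is rejected.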

Export format param

?format=csv (default) → text/csv; charset=utf-8
?format=json → application/json (streamed JSON array)
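
The commit summary mentions a CSV formula-injection guard (including \t and \r) and metadata serialized as JSON. A minimal sketch of a chunked CSV generator with such a guard follows. The quote-prefix strategy, the `guard_cell` and `iter_csv` names, and the column order are assumptions; the document does not say how the project implements the guard.

```python
import csv
import io
import json
from typing import Iterable, Iterator

# Leading characters that spreadsheet applications may interpret as a formula.
FORMULA_PREFIXES = ("=", "+", "-", "@", "\t", "\r")

COLUMNS = [
    "id", "ts", "category", "action", "severity", "actor",
    "entity_type", "entity_id", "entity_name", "message", "metadata",
]


def guard_cell(value: object) -> str:
    """Neutralize cells that start with a formula trigger by prefixing a quote."""
    text = "" if value is None else str(value)
    return "'" + text if text.startswith(FORMULA_PREFIXES) else text


def iter_csv(entries: Iterable[dict]) -> Iterator[str]:
    """Yield the header, then one CSV chunk per entry, without building the full file."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)

    def flush() -> str:
        chunk = buffer.getvalue()
        buffer.seek(0)
        buffer.truncate(0)
        return chunk

    writer.writerow(COLUMNS)
    yield flush()
    for entry in entries:
        row = []
        for column in COLUMNS:
            value = entry.get(column)
            if column == "metadata":
                # Metadata is emitted as a JSON string rather than a Python repr.
                value = json.dumps(value or {}, sort_keys=True)
            row.append(guard_cell(value))
        writer.writerow(row)
        yield flush()


sample = [{"id": "al_abcd1234", "message": "=SUM(A1:A9)", "metadata": {"k": "v"}}]
rows = list(csv.reader(io.StringIO("".join(iter_csv(sample)))))
```

A generator of this shape can be handed to a streaming response so that memory use stays bounded by one row at a time.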

Pagination algorithm

Keyset cursor (before_seq) works as follows:

Omit before_seq (or pass null) to get the FIRST (newest) page.
Each page response that has more results includes next_before_seq (the seq of the oldest entry on the page).
Pass next_before_seq as before_seq in the next request to get the following (older) page.
has_more=false means there are no more pages; next_before_seq is null.
total is constant across pages for the same filter set.
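
The steps above can be sketched against an in-memory SQLite table. The schema, the `query_page` helper, and the fetch-one-extra-row trick for computing has_more are assumptions for illustration; filters are omitted to keep the cursor logic visible.

```python
import sqlite3
from typing import Optional


def query_page(conn: sqlite3.Connection, before_seq: Optional[int], limit: int = 50) -> dict:
    """Return one newest-first page using a keyset cursor on seq."""
    sql = "SELECT seq, id FROM activity_log"
    params = []
    if before_seq is not None:
        sql += " WHERE seq < ?"  # strictly older than the cursor: no overlap
        params.append(before_seq)
    sql += " ORDER BY seq DESC LIMIT ?"
    params.append(limit + 1)  # one extra row tells us whether another page exists
    rows = conn.execute(sql, params).fetchall()

    has_more = len(rows) > limit
    rows = rows[:limit]
    total = conn.execute("SELECT COUNT(*) FROM activity_log").fetchone()[0]
    return {
        "entries": [{"id": entry_id} for _, entry_id in rows],
        "next_before_seq": rows[-1][0] if has_more else None,
        "has_more": has_more,
        "total": total,
    }


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE activity_log (seq INTEGER PRIMARY KEY AUTOINCREMENT, id TEXT UNIQUE)"
)
conn.executemany(
    "INSERT INTO activity_log (id) VALUES (?)", [(f"al_{n}",) for n in range(1, 6)]
)

# Walk every page, newest first, following the cursor until has_more is false.
seen, cursor = [], None
while True:
    page = query_page(conn, before_seq=cursor, limit=2)
    seen.extend(entry["id"] for entry in page["entries"])
    if not page["has_more"]:
        break
    cursor = page["next_before_seq"]
```

Because each page is selected by `seq < cursor` rather than by an offset, rows inserted while a client is paging do not shift later pages, which is what keeps the listing free of duplicates and gaps.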

New method added to ActivityLogRepository (additive, not breaking)

get_seq_for_id(entry_id: str) -> int | None — indexed point-lookup of seq by entry id. Used internally by the list endpoint to build the keyset cursor.
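
A point-lookup of this kind might look like the following sketch. Only the method name and return type come from the note above; the free-function form, the table layout, and the UNIQUE index on id are assumptions.

```python
import sqlite3
from typing import Optional


def get_seq_for_id(conn: sqlite3.Connection, entry_id: str) -> Optional[int]:
    """Look up the seq for an entry id; returns None when the id is unknown."""
    row = conn.execute(
        "SELECT seq FROM activity_log WHERE id = ?", (entry_id,)
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
# The UNIQUE constraint on id gives SQLite an index to serve the lookup.
conn.execute(
    "CREATE TABLE activity_log (seq INTEGER PRIMARY KEY AUTOINCREMENT, id TEXT UNIQUE)"
)
conn.executemany(
    "INSERT INTO activity_log (id) VALUES (?)", [("al_a",), ("al_b",), ("al_c",)]
)

found = get_seq_for_id(conn, "al_b")
missing = get_seq_for_id(conn, "al_zzz")
```

Since the entry dict exposes id but not seq, a lookup like this lets the list endpoint derive next_before_seq from the oldest entry on the page.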

Phase 4 landed (2026-06-09): schemas, route (list/export/settings/clear), router registration, 49 new tests — all green. Full suite 2486 passed, 2 skipped, 0 failed. Ruff clean.
