ledgrab/plans/activity-log/phase-4-api.md
alexei.dolgolyov 4a0927521a feat(activity-log): phase 4 - REST API (list/export/settings/clear)
- GET /activity-log: filtered, keyset-paginated list (categories/severities/actor/entity/date/q)
- GET /activity-log/export: streaming CSV/JSON, chunked keyset (releases DB lock per batch), CSV formula-injection guard
- GET/PUT /activity-log/settings: retention config (PUT require_authenticated)
- DELETE /activity-log: clear (require_authenticated, self-audited)
- security: export DoS fix, settings-PUT auth gate, CSV \t/\r guard, metadata-as-JSON
- 122 API tests (auth posture, CSV injection, pagination integrity, filters, settings bounds, clear-audited)
2026-06-09 20:09:46 +03:00

Phase 4: REST API — query / filter / export / settings / clear

Status: Done
Parent plan: PLAN.md
Domain: backend

Objective

Expose the audit log over the REST API: a filtered, keyset-paginated list endpoint; a streaming CSV/JSON export honoring the same filters; retention settings get/update; and a destructive clear. Apply the project's auth posture (stricter auth on export + clear).

Tasks

  • server/src/ledgrab/api/schemas/activity_log.py (Pydantic):
    • ActivityLogEntryResponse (matches the frozen Phase 2 entry dict shape).
    • ActivityLogPageResponse { entries: list[...], next_before_seq: int | None, total: int (optional/over filters), has_more: bool }.
    • ActivityLogSettingsResponse / UpdateActivityLogSettingsRequest (enabled, max_days, max_entries) with validation bounds.
  • server/src/ledgrab/api/routes/activity_log.py, APIRouter(prefix="/api/v1/activity-log"):
    • GET "" — list. Query params: categories, severities, actor, entity_type, entity_id, since/until (ISO), q (free-text), before_seq (cursor), limit (default 50, capped e.g. 200). AuthRequired. Maps params → ActivityLogFilters, calls repo.query(...), returns page envelope.
    • GET "/export" — streaming export. Same filters; format=csv|json. Uses StreamingResponse over repo.iter_export(...). require_authenticated() (may contain IPs/labels). Sets Content-Disposition with a timestamped filename.
    • GET "/settings" / PUT "/settings" — retention settings via the retention engine. AuthRequired; updates apply immediately.
    • DELETE "" — clear all entries. require_authenticated(). The clear is itself audited (recorder records a system/activity_log_cleared entry AFTER the wipe, so the log shows who cleared it and when).
    • Register the router in server/src/ledgrab/api/__init__.py (aggregator).
  • API tests server/tests/api/routes/test_activity_log_api.py:
    • list returns entries; each filter narrows results; before_seq cursor paginates without overlap/gaps; limit cap enforced;
    • export CSV and JSON both stream and honor filters; export requires authentication (401 for loopback-anonymous when keys configured);
    • settings get/update round-trip + validation rejects out-of-range;
    • clear empties the log, requires auth, and leaves exactly one post-clear audit entry.
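
The wipe-then-audit ordering described for DELETE can be sketched with an in-memory stand-in. This is a minimal illustration, not the project's code: InMemoryLog and clear_log are hypothetical names, and the real repository and recorder interfaces are not shown in this plan.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class InMemoryLog:
    """Hypothetical stand-in for the activity-log repository and recorder."""

    entries: list[dict] = field(default_factory=list)

    def record(self, category: str, action: str, actor: str, message: str) -> None:
        # Append one audit entry stamped with the current UTC time.
        self.entries.append(
            {
                "ts": datetime.now(timezone.utc).isoformat(),
                "category": category,
                "action": action,
                "actor": actor,
                "message": message,
            }
        )

    def clear(self) -> int:
        # Remove every entry and report how many were dropped.
        removed = len(self.entries)
        self.entries.clear()
        return removed


def clear_log(log: InMemoryLog, actor: str) -> int:
    """Wipe first, then record the audit entry so it survives the wipe."""
    removed = log.clear()
    log.record(
        category="system",
        action="activity_log_cleared",
        actor=actor,
        message=f"Activity log cleared ({removed} entries removed)",
    )
    return removed


log = InMemoryLog()
log.record("auth", "auth.login", "my-api-key", "Login")
log.record("entity", "entity.created", "my-api-key", "Output target 'Desk' created")
removed = clear_log(log, actor="my-api-key")
```

Recording before the wipe would delete the audit entry along with everything else, which is why the order matters for the "exactly one post-clear audit entry" test.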

Files to Modify/Create

  • server/src/ledgrab/api/schemas/activity_log.py — new
  • server/src/ledgrab/api/routes/activity_log.py — new
  • server/src/ledgrab/api/__init__.py — modify: register router
  • server/tests/api/routes/test_activity_log_api.py — new

Acceptance Criteria

  • List is filterable on every dimension and keyset-paginated (stable, no dupes/gaps).
  • Export streams CSV + JSON, honors filters, and requires authentication.
  • Settings get/update works and validates bounds; changes take effect immediately.
  • Clear requires authentication and is itself audited.
  • New + existing tests green; ruff clean.

Notes

  • Auth helpers: AuthRequired dependency for normal endpoints; require_authenticated() for export + clear (pattern: backup download / secret reveal). See api/auth.py + server/CLAUDE.md.
  • Follow the existing route/schema conventions (one schema file per entity, router registered in api/__init__.py). Reference api/routes/backup.py for settings-style GET/PUT + a StreamingResponse/download pattern.
  • Reuse the entry→dict serializer from Phase 2 to keep the response shape single-sourced.
  • Backup/restore: no STORE_MAP change needed — backup is whole-DB; the table is auto-covered.

Review Checklist

  • All tasks completed
  • Code follows project conventions (router registration, schema-per-entity, auth posture)
  • No unintended side effects
  • Build passes (ruff + pytest)
  • Tests pass (new + existing)

Handoff to Next Phase

Endpoint paths

| Method | Path | Auth |
| --- | --- | --- |
| GET | /api/v1/activity-log | AuthRequired (anonymous allowed) |
| GET | /api/v1/activity-log/export | require_authenticated (no anonymous) |
| GET | /api/v1/activity-log/settings | AuthRequired |
| PUT | /api/v1/activity-log/settings | AuthRequired |
| DELETE | /api/v1/activity-log | require_authenticated (no anonymous) |

List query parameters (GET /api/v1/activity-log)

| Param | Type | Default | Notes |
| --- | --- | --- | --- |
| categories | list[str] | | Repeatable. Values: auth, device, entity, capture, system |
| severities | list[str] | | Repeatable. Values: info, warning, error |
| actor | str | | Exact match |
| entity_type | str | | Exact match |
| entity_id | str | | Exact match |
| since | datetime (ISO-8601) | | Inclusive lower bound on ts |
| until | datetime (ISO-8601) | | Inclusive upper bound on ts |
| q | str | | Substring match on message (LIKE %q%) |
| before_seq | int | | Keyset cursor from previous page's next_before_seq |
| limit | int | 50 | Max entries per page. ge=1, le=200 |

Export endpoint (GET /api/v1/activity-log/export) accepts the same filter params plus format=csv|json.
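
As an illustration of how a client might encode these parameters, here is a small sketch using only the standard library. build_list_url is a hypothetical helper, not part of the project. It assumes repeatable params are sent as repeated keys (categories=auth&categories=device) and clamps limit on the client side so the request stays inside the documented ge=1, le=200 range.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

BASE_PATH = "/api/v1/activity-log"


def build_list_url(
    categories=(), severities=(), since=None, q=None, before_seq=None, limit=50
) -> str:
    """Build a list-endpoint URL (hypothetical client helper)."""
    params: list[tuple[str, str]] = []
    # Repeatable params: one key=value pair per value.
    params += [("categories", c) for c in categories]
    params += [("severities", s) for s in severities]
    if since is not None:
        params.append(("since", since.isoformat()))
    if q:
        params.append(("q", q))
    if before_seq is not None:
        params.append(("before_seq", str(before_seq)))
    # Clamp to the documented bounds instead of letting validation reject it.
    params.append(("limit", str(min(max(limit, 1), 200))))
    return f"{BASE_PATH}?{urlencode(params)}"


url = build_list_url(
    categories=["auth", "device"],
    severities=["error"],
    since=datetime(2026, 6, 1, tzinfo=timezone.utc),
    q="login failed",
    limit=500,
)
```

The same query string, with format=csv or format=json appended, would apply to the export endpoint.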

Page envelope fields (ActivityLogPageResponse)

{
  "entries": [...],          // list[ActivityLogEntryResponse]
  "next_before_seq": 42,     // int | null — pass as before_seq for next page
  "has_more": true,          // bool
  "total": 1337              // int — total matching all filters (all pages)
}

Entry dict shape (ActivityLogEntryResponse)

11 fields — identical to entry_to_dict() output:

{
  "id": "al_abcd1234",
  "ts": "2026-06-09T12:34:56.789+00:00",
  "category": "entity",
  "action": "entity.created",
  "severity": "info",
  "actor": "my-api-key",
  "entity_type": "output_target",
  "entity_id": "pt_abc",
  "entity_name": "Desk",
  "message": "Output target 'Desk' created",
  "metadata": {}
}

Settings field bounds (UpdateActivityLogSettingsRequest)

| Field | Type | ge | le | Notes |
| --- | --- | --- | --- | --- |
| enabled | bool | | | Enable/disable recording |
| max_days | int | 0 | 3650 | 0 = no age-based pruning |
| max_entries | int | 0 | 10_000_000 | 0 = no count-based pruning |
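
The bounds above can be mirrored in a standalone check. The actual schema is a Pydantic model; the sketch below uses a plain dataclass instead so it runs without third-party packages, and RetentionSettings and is_valid are hypothetical names.

```python
from dataclasses import dataclass

MAX_DAYS_LIMIT = 3650
MAX_ENTRIES_LIMIT = 10_000_000


@dataclass(frozen=True)
class RetentionSettings:
    """Stand-in for UpdateActivityLogSettingsRequest with the same bounds."""

    enabled: bool
    max_days: int      # 0 = no age-based pruning
    max_entries: int   # 0 = no count-based pruning

    def __post_init__(self) -> None:
        if not 0 <= self.max_days <= MAX_DAYS_LIMIT:
            raise ValueError(f"max_days must be in [0, {MAX_DAYS_LIMIT}]")
        if not 0 <= self.max_entries <= MAX_ENTRIES_LIMIT:
            raise ValueError(f"max_entries must be in [0, {MAX_ENTRIES_LIMIT}]")


def is_valid(**fields) -> bool:
    """Return True when the given fields pass the bounds check."""
    try:
        RetentionSettings(**fields)
    except ValueError:
        return False
    return True
```

For example, is_valid(enabled=True, max_days=30, max_entries=0) passes, while max_days=3651 or max_entries=-1 is rejected.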

Export format param

  • ?format=csv (default) → text/csv; charset=utf-8
  • ?format=json → application/json (streamed JSON array)
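
The commit notes mention a CSV formula-injection guard that also covers \t and \r. One common way to implement such a guard is to prefix risky cells with a single quote, and the sketch below assumes that approach. guard_cell and iter_csv are hypothetical names, and the generator stands in for the body that would be handed to StreamingResponse.

```python
import csv
import io
from typing import Iterable, Iterator

# Leading characters that spreadsheet applications may treat as a formula.
FORMULA_TRIGGERS = ("=", "+", "-", "@", "\t", "\r")


def guard_cell(value: object) -> str:
    """Neutralise cells that a spreadsheet could evaluate as a formula."""
    text = "" if value is None else str(value)
    if text.startswith(FORMULA_TRIGGERS):
        return "'" + text
    return text


def iter_csv(rows: Iterable[dict], columns: list[str]) -> Iterator[str]:
    """Yield the CSV one line at a time: header first, then guarded rows."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(columns)
    yield buffer.getvalue()
    for row in rows:
        # Reset the buffer so each yielded chunk holds exactly one row.
        buffer.seek(0)
        buffer.truncate(0)
        writer.writerow([guard_cell(row.get(col)) for col in columns])
        yield buffer.getvalue()


rows = [
    {"actor": "my-api-key", "message": "Output target 'Desk' created"},
    {"actor": "=1+1", "message": "+1 suspicious"},
]
chunks = list(iter_csv(rows, ["actor", "message"]))
```

A side effect of this approach is that legitimate values starting with a minus sign, such as negative numbers, also gain a leading quote. That trade-off is usually acceptable for an audit export.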

Pagination algorithm

Keyset cursor (before_seq) works as follows:

  • Omit before_seq (or pass null) to get the FIRST (newest) page.
  • Each page response includes next_before_seq (the seq of the oldest entry on the page).
  • Pass next_before_seq as before_seq in the next request to get the following (older) page.
  • has_more=false means there are no more pages; next_before_seq is null.
  • total is constant across pages for the same filter set.
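
The cursor rules above can be exercised against an in-memory list. This is a simplified sketch: query_page and fetch_all are hypothetical names, and entries carry their seq directly, whereas the real response exposes only the 11 entry fields.

```python
from typing import Optional


def query_page(entries: list[dict], before_seq: Optional[int], limit: int) -> dict:
    """Return one newest-first page of entries with seq < before_seq."""
    matching = sorted(entries, key=lambda e: e["seq"], reverse=True)
    total = len(matching)  # counts all matches, independent of the cursor
    if before_seq is not None:
        matching = [e for e in matching if e["seq"] < before_seq]
    window = matching[: limit + 1]  # fetch one extra row to detect has_more
    page = window[:limit]
    has_more = len(window) > limit
    return {
        "entries": page,
        "next_before_seq": page[-1]["seq"] if has_more else None,
        "has_more": has_more,
        "total": total,
    }


def fetch_all(entries: list[dict], limit: int) -> list[dict]:
    """Walk every page by feeding next_before_seq back in as before_seq."""
    collected: list[dict] = []
    cursor: Optional[int] = None
    while True:
        page = query_page(entries, cursor, limit)
        collected.extend(page["entries"])
        if not page["has_more"]:
            return collected
        cursor = page["next_before_seq"]


log = [{"seq": n, "id": f"al_{n:08d}"} for n in range(1, 8)]
first = query_page(log, before_seq=None, limit=3)
seqs = [e["seq"] for e in fetch_all(log, limit=3)]
```

Because each page is selected by seq < before_seq and not by an offset, entries written while a client is paging land on the newest page only and never shift older pages, which is what keeps the walk free of duplicates and gaps.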

New method added to ActivityLogRepository (additive, not breaking)

get_seq_for_id(entry_id: str) -> int | None — indexed point-lookup of seq by entry id. Used internally by the list endpoint to build the keyset cursor.
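
A rough sketch of such a point-lookup follows, assuming a SQLite table with an integer seq primary key and a unique id column. The table layout, the free-function form, and the next_before_seq helper are all assumptions made for illustration; the plan does not show the actual schema or repository code.

```python
import sqlite3
from typing import Optional

# Hypothetical schema: seq is the keyset column, id is the public entry id.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE activity_log ("
    " seq INTEGER PRIMARY KEY AUTOINCREMENT,"
    " id TEXT NOT NULL UNIQUE,"
    " message TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO activity_log (id, message) VALUES (?, ?)",
    [("al_aaaa0001", "first"), ("al_aaaa0002", "second"), ("al_aaaa0003", "third")],
)


def get_seq_for_id(conn: sqlite3.Connection, entry_id: str) -> Optional[int]:
    """Look up seq by entry id; the UNIQUE constraint provides the index."""
    row = conn.execute(
        "SELECT seq FROM activity_log WHERE id = ?", (entry_id,)
    ).fetchone()
    return row[0] if row else None


def next_before_seq(conn: sqlite3.Connection, page_entries: list[dict]) -> Optional[int]:
    """Derive the cursor from the oldest (last) entry on a newest-first page."""
    if not page_entries:
        return None
    return get_seq_for_id(conn, page_entries[-1]["id"])
```

The lookup is needed because the entry dict exposes id but not seq, so the list endpoint has to translate the last entry's id back into a cursor value.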

Phase 4 landed (2026-06-09): schemas, route (list/export/settings/clear), router registration, 49 new tests — all green. Full suite 2486 passed, 2 skipped, 0 failed. Ruff clean.