Files
ledgrab/plans/activity-log/PLAN.md
T
alexei.dolgolyov 6e1dd2111d feat(activity-log): phase 6 - dashboard widget + settings panel + docs
- Dashboard 'Recent Activity' widget: latest 5 entries, live prepend, 'View all' -> Activity tab
- Settings 'Activity Log' panel: retention (enabled/max_days/max_entries) GET/PUT, clear (confirm + auth-required toast), CSV/JSON export
- audit-log vs ephemeral debug Log Viewer distinction note + cross-links
- public helpers fetchRecentEntries/renderCompactEntry on activity-log.ts (reused, no dup markup)
- README Activity Log section; i18n across en/ru/zh
- review fixes: clear 401 surfaces toast; empty widget transitions on first live event
2026-06-09 21:05:40 +03:00

8.8 KiB

Feature: Activity / Audit Log

Branch: feature/activity-log Base branch: master (merge target) Branch point: 17dd2e02bab4d00a93479eb6af1a8c6ddc0c7224 (use for clean review diffs) Created: 2026-06-09 Status: 🟢 All phases complete — awaiting final review + merge Strategy: Incremental Mode: Automated Execution: Orchestrator Remote: origin → https://git.dolgolyov-family.by/alexei.dolgolyov/ledgrab.git

Summary

A persistent, queryable audit log of meaningful LedGrab actions, surfaced in the WebUI. Captures four categories — entity CRUD, authentication, device connect/disconnect, and capture & system events — as action-metadata-only records (who/what/when + entity type/name/id + a human-readable message + small structured metadata; no before/after diffs). Surfaced as a dedicated top-level Activity tab with smart filtering + live updates, a compact Recent Activity widget on the Dashboard, and a Settings panel for retention. Durability rides on the existing whole-DB ledgrab.db backup; portability is an on-demand CSV/JSON export (no separate backup subsystem).

Design pillars (the load-bearing decisions)

  1. Dedicated activity_log table + repository — NOT BaseSqliteStore. That base loads every row into an in-memory cache and uses a generic id/name/data blob — wrong for an append-heavy, unbounded log. We use a purpose-built indexed table with query-on-demand keyset pagination.
  2. Central choke-point instrumentation. fire_entity_event() (api/dependencies.py:202) is already called by every entity route on create/update/delete and has _deps access to resolve names. The recorder hooks there for all entity CRUD. Non-entity events get explicit recorder.record(...) calls.
  3. Actor via ContextVar. Set inside verify_api_key (next to request.state.auth_label), default "system", reset per-request. The recorder reads it without threading actor through every call.
  4. Direct synchronous write on the event-loop thread (no separate buffered-writer subsystem — simpler, and the request already did a synchronous=FULL entity write). Cross-thread callers (zeroconf discovery thread) marshal via loop.call_soon_threadsafe, mirroring utils/log_broadcaster.py. The "server shutting down" event is recorded as the FIRST action in the lifespan shutdown block, before any teardown.
  5. Reuse the existing realtime bus. A new activity_logged event over /api/v1/events/ws (one events-ws.ts allowlist entry + test_events_ws_parity.py update). No new socket.
  6. Never log secrets. API-key tokens are never stored — only labels/ids.
  7. Differentiate from the existing Log Viewer. utils/log_broadcaster.py is an ephemeral 500-line debug-log tail. The audit log is persistent, structured, semantic. Cross-link in the Settings panel; never duplicate.

Build & Test Commands

  • Build (frontend): cd server && npm run build
  • Type-check (TS): cd server && npx tsc --noEmit
  • Test: cd server && py -3.13 -m pytest tests/ --no-cov -q
  • Lint (Python): cd server && ruff check src/ tests/ --fix
  • Events parity test (load-bearing for P2): included in pytest (tests/test_events_ws_parity.py)

Scope checks to the files actually edited: backend phases run ruff + pytest; frontend-only phases run tsc --noEmit + npm run build (no pytest/ruff). Phases touching both run both.

Phases

  • Phase 1: Storage — model, migration, repository [domain: data] → subplan
  • Phase 2: Recorder, actor context, retention, lifecycle [domain: backend] → subplan
  • Phase 3: Event instrumentation (4 categories) [domain: backend] → subplan
  • Phase 4: REST API — query/filter/export/settings/clear [domain: backend] → subplan
  • Phase 5: Frontend — Activity tab + smart filtering + live updates [domain: frontend] → subplan
  • Phase 6: Dashboard widget + Settings panel + docs [domain: frontend] → subplan

Parallelizable Phase Groups (Orchestrator mode only)

  • Phases 3 and 4 are parallelizable only after the schema is frozen at end of Phase 2. Both depend on the P2 recorder/schema; P4 registers the router in api/__init__.py, P3 edits dependencies.py/auth.py (different files, shared schema contract). To keep the Automated run simple and low-risk, they run sequentially (3 → 4) unless time pressure warrants worktree-isolated parallelism. Phases 5 → 6 are sequential (6 reuses P5 formatters and the feature module).

Phase Progress Log

Phase Domain Status Review Build Committed
Phase 1: Storage data Done Passed Passed
Phase 2: Recorder/Retention backend Done Passed Passed
Phase 3: Instrumentation backend Done Passed Passed
Phase 4: REST API backend Done Passed Passed
Phase 5: Frontend tab frontend Done Passed Passed (tsc+build)
Phase 6: Dashboard/Settings frontend Done Passed Passed (tsc+build)

Outstanding Warnings

Phase Warning Severity Status (open / resolved / accepted)
3 Log injection via unauth mDNS device name/url into audit message 🟠 High (security) resolved — sanitize_display helper applied
3 Origin sanitizer missed spaces/NUL/ANSI 🟠 High (security) resolved — sanitize_display over netloc
3 Unauth auth-failure audit-write flood (no write-rate bound) 🟠 High (security) resolved — per-IP audit-record throttle (10s, capped)
3 Malformed-IPv6 Origin → urlparse ValueError into WS handler 🟡 Warning resolved — try/except guard
3 Throttle module-global state caused flaky test contamination 🟡 Warning resolved — autouse conftest reset fixture
4 Export held global DB write-lock across the stream (slow-client DoS) 🟠 High (security) resolved — chunked keyset export releases lock per batch
4 PUT /settings only AuthRequired → anon could disable auditing/prune trail 🟠 High (security) resolved — require_authenticated on settings PUT
4 CSV formula-injection missed leading TAB/CR 🟡 Medium (security) resolved — added \t/\r to guard
4 total count full-scans on every list request 🔵 Low (perf) accepted — bounded by retention; read-only; optional opt-in deferred
5 Inverted list ordering broke pagination + live-append 🔴 Blocker resolved — pages reversed to newest-first; re-review PASS
5 Attribute-context XSS (entity_name title + JSON.stringify onclick) 🟡 Warning (security) resolved — _escapeAttr + data-attr event delegation
5 Filter toolbar value= attrs not quote-escaped (new code) 🟡 Warning (security) resolved — _escapeAttr on q/actor/entity_type/since/until
5 Manual browser smoke test (tab loads, filters, live, export) 🔵 Note open — recommend at final review (server restart needed)
6 clearActivityLog() 401 path unreachable → silent failure 🟡 Warning resolved — handle401:false surfaces auth-required toast
6 Recent Activity widget dropped first live event when empty 🔵 Note resolved — empty→list transition on first live event
6 Widget outside dashboard layout-toggle/ordering system 🔵 Note accepted — deliberate (always-visible), collapse still works

Final Review

  • Comprehensive code review
  • Security review (auth/PII-in-logs/secrets/log-injection — triggered)
  • All Outstanding Warnings resolved or consciously accepted
  • Full build passes (npm run build + tsc --noEmit)
  • Full test suite passes (pytest)
  • Merged to master

Amendment Log

(Filled in if the plan is amended mid-implementation.)

  • 2026-06-09: Plan reviewer (pre-implementation) → ⚠️ with 3 Critical Gaps, all resolved before Phase 1: (G1) descoped non-existent API-key mutation events; (G2) dropped the buffered-writer subsystem for a direct synchronous-on-loop write with call_soon_threadsafe marshaling for thread-origin events; (G3) record "server shutting down" first in the shutdown block (no buffer to flush). Concerns folded in: device events via device_health_changed + device_discovered/_lost; actor ContextVar in verify_api_key; name-on-delete passed explicitly; settings-audit scoped + self-key excluded; parity allowlist before emit site. Adopted suggestions: rowid keyset tiebreaker; log the disable action. Deferred: setup-scaffold noise suppression.