Files
ledgrab/plans/activity-log/PLAN.md
T
alexei.dolgolyov ff1ff06cb5 fix(activity-log): post-test polish - localize descriptions, dashboard widget, ticking time
- localize entry descriptions client-side via localizeMessage (activity_log.msg.* + entity_type.* templates x3 locales); server message kept as fallback/export/search
- remove redundant Activity header banner from tab
- Recent Activity widget is now a first-class dashboard section (Customize Dashboard show/hide/reorder; pre-existing layouts preserved)
- live activity event updates the widget surgically (no full dashboard rebuild); single listener with teardown
- relative-time labels tick via shared ensureRelativeTimeTicker (single 30s interval, visibility-aware)
2026-06-10 12:03:18 +03:00

139 lines
10 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Feature: Activity / Audit Log
**Branch:** `feature/activity-log`
**Base branch:** `master` (merge target)
**Branch point:** `17dd2e02bab4d00a93479eb6af1a8c6ddc0c7224` (use for clean review diffs)
**Created:** 2026-06-09
**Status:** 🟢 All phases complete — awaiting final review + merge
**Strategy:** Incremental
**Mode:** Automated
**Execution:** Orchestrator
**Remote:** origin → https://git.dolgolyov-family.by/alexei.dolgolyov/ledgrab.git
## Summary
A persistent, queryable audit log of meaningful LedGrab actions, surfaced in the WebUI.
Captures four categories — entity CRUD, authentication, device connect/disconnect, and
capture & system events — as **action-metadata-only** records (who/what/when + entity
type/name/id + a human-readable message + small structured metadata; **no before/after
diffs**). Surfaced as a dedicated top-level **Activity** tab with smart filtering + live
updates, a compact **Recent Activity** widget on the Dashboard, and a **Settings** panel
for retention. Durability rides on the existing whole-DB `ledgrab.db` backup; portability
is an on-demand CSV/JSON **export** (no separate backup subsystem).
## Design pillars (the load-bearing decisions)
1. **Dedicated `activity_log` table + repository — NOT `BaseSqliteStore`.** That base loads
every row into an in-memory cache and uses a generic `id/name/data` blob — wrong for an
append-heavy, unbounded log. We use a purpose-built indexed table with query-on-demand
keyset pagination.
2. **Central choke-point instrumentation.** `fire_entity_event()` (`api/dependencies.py:202`)
is already called by every entity route on create/update/delete and has `_deps` access to
resolve names. The recorder hooks there for all entity CRUD. Non-entity events get explicit
`recorder.record(...)` calls.
3. **Actor via `ContextVar`.** Set inside `verify_api_key` (next to `request.state.auth_label`),
default `"system"`, reset per-request. The recorder reads it without threading actor
through every call.
4. **Direct synchronous write on the event-loop thread** (no separate buffered-writer
subsystem — simpler, and the request already did a `synchronous=FULL` entity write).
Cross-thread callers (zeroconf discovery thread) marshal via `loop.call_soon_threadsafe`,
mirroring `utils/log_broadcaster.py`. The "server shutting down" event is recorded as the
FIRST action in the lifespan shutdown block, before any teardown.
5. **Reuse the existing realtime bus.** A new `activity_logged` event over `/api/v1/events/ws`
(one `events-ws.ts` allowlist entry + `test_events_ws_parity.py` update). No new socket.
6. **Never log secrets.** API-key *tokens* are never stored — only labels/ids.
7. **Differentiate from the existing Log Viewer.** `utils/log_broadcaster.py` is an ephemeral
500-line debug-log tail. The audit log is persistent, structured, semantic. Cross-link in
the Settings panel; never duplicate.
## Build & Test Commands
- **Build (frontend):** `cd server && npm run build`
- **Type-check (TS):** `cd server && npx tsc --noEmit`
- **Test:** `cd server && py -3.13 -m pytest tests/ --no-cov -q`
- **Lint (Python):** `cd server && ruff check src/ tests/ --fix`
- **Events parity test (load-bearing for P2):** included in pytest (`tests/test_events_ws_parity.py`)
> Scope checks to the files actually edited: backend phases run ruff + pytest; frontend-only
> phases run `tsc --noEmit` + `npm run build` (no pytest/ruff). Phases touching both run both.
## Phases
- [x] Phase 1: Storage — model, migration, repository [domain: data] → [subplan](./phase-1-storage.md)
- [x] Phase 2: Recorder, actor context, retention, lifecycle [domain: backend] → [subplan](./phase-2-recorder-retention.md)
- [x] Phase 3: Event instrumentation (4 categories) [domain: backend] → [subplan](./phase-3-instrumentation.md)
- [x] Phase 4: REST API — query/filter/export/settings/clear [domain: backend] → [subplan](./phase-4-api.md)
- [x] Phase 5: Frontend — Activity tab + smart filtering + live updates [domain: frontend] → [subplan](./phase-5-frontend-tab.md)
- [x] Phase 6: Dashboard widget + Settings panel + docs [domain: frontend] → [subplan](./phase-6-dashboard-settings.md)
## Parallelizable Phase Groups (Orchestrator mode only)
- Phases 3 and 4 are **parallelizable only after the schema is frozen at end of Phase 2**.
Both depend on the P2 recorder/schema; P4 registers the router in `api/__init__.py`, P3
edits `dependencies.py`/`auth.py` (different files, shared schema contract). To keep the
Automated run simple and low-risk, they run **sequentially** (3 → 4) unless time pressure
warrants worktree-isolated parallelism. Phases 5 → 6 are sequential (6 reuses P5 formatters
and the feature module).
## Phase Progress Log
| Phase | Domain | Status | Review | Build | Committed |
|-------|--------|--------|--------|-------|-----------|
| Phase 1: Storage | data | ✅ Done | ✅ Passed | ✅ Passed | ✅ |
| Phase 2: Recorder/Retention | backend | ✅ Done | ✅ Passed | ✅ Passed | ✅ |
| Phase 3: Instrumentation | backend | ✅ Done | ✅ Passed | ✅ Passed | ✅ |
| Phase 4: REST API | backend | ✅ Done | ✅ Passed | ✅ Passed | ✅ |
| Phase 5: Frontend tab | frontend | ✅ Done | ✅ Passed | ✅ Passed (tsc+build) | ✅ |
| Phase 6: Dashboard/Settings | frontend | ✅ Done | ✅ Passed | ✅ Passed (tsc+build) | ✅ |
## Outstanding Warnings
| Phase | Warning | Severity | Status (open / resolved / accepted) |
|-------|---------|----------|-------------------------------------|
| 3 | Log injection via unauth mDNS device name/url into audit message | 🟠 High (security) | resolved — `sanitize_display` helper applied |
| 3 | Origin sanitizer missed spaces/NUL/ANSI | 🟠 High (security) | resolved — `sanitize_display` over netloc |
| 3 | Unauth auth-failure audit-write flood (no write-rate bound) | 🟠 High (security) | resolved — per-IP audit-record throttle (10s, capped) |
| 3 | Malformed-IPv6 Origin → urlparse ValueError into WS handler | 🟡 Warning | resolved — try/except guard |
| 3 | Throttle module-global state caused flaky test contamination | 🟡 Warning | resolved — autouse conftest reset fixture |
| 4 | Export held global DB write-lock across the stream (slow-client DoS) | 🟠 High (security) | resolved — chunked keyset export releases lock per batch |
| 4 | PUT /settings only AuthRequired → anon could disable auditing/prune trail | 🟠 High (security) | resolved — `require_authenticated` on settings PUT |
| 4 | CSV formula-injection missed leading TAB/CR | 🟡 Medium (security) | resolved — added `\t`/`\r` to guard |
| 4 | `total` count full-scans on every list request | 🔵 Low (perf) | accepted — bounded by retention; read-only; optional opt-in deferred |
| 5 | Inverted list ordering broke pagination + live-append | 🔴 Blocker | resolved — pages reversed to newest-first; re-review PASS |
| 5 | Attribute-context XSS (entity_name title + JSON.stringify onclick) | 🟡 Warning (security) | resolved — `_escapeAttr` + data-attr event delegation |
| 5 | Filter toolbar value= attrs not quote-escaped (new code) | 🟡 Warning (security) | resolved — `_escapeAttr` on q/actor/entity_type/since/until |
| 5 | Manual browser smoke test (tab loads, filters, live, export) | 🔵 Note | open — recommend at final review (server restart needed) |
| 6 | clearActivityLog() 401 path unreachable → silent failure | 🟡 Warning | resolved — `handle401:false` surfaces auth-required toast |
| 6 | Recent Activity widget dropped first live event when empty | 🔵 Note | resolved — empty→list transition on first live event |
| 6 | Widget outside dashboard layout-toggle/ordering system | 🔵 Note | accepted — deliberate (always-visible), collapse still works |
| Final | Entity-crosslink map keys mismatched backend entity_type (device/color_strip_source/audio_source) | 🟡 Warning | resolved — `_ENTITY_NAV` keys corrected + scene_playlist added |
| Final | Owner-authored names interpolated raw at some record sites | 🔵 Note (defense-in-depth) | resolved — `sanitize_display` applied uniformly |
| Manual test | Entry descriptions rendered server English (not localized) | 🟡 Warning (acceptance criterion) | resolved — client-side `localizeMessage` from structured fields + `activity_log.msg.*`/`entity_type.*` templates ×3 |
| Manual test | Redundant "Activity" header banner at top of tab | 🔵 Note | resolved — header block removed |
| Manual test | Recent Activity widget missing from Customize Dashboard | 🟡 Warning | resolved — registered as a first-class dashboard section (show/hide/reorder; pre-existing layouts preserved) |
| Manual test | Activity widget live event rebuilt the whole dashboard | 🟡 Warning (perf) | resolved — surgical list update; single listener with teardown |
| Manual test | Relative-time labels static (never tick) | 🟡 Warning | resolved — shared `ensureRelativeTimeTicker` (single 30s interval, visibility-aware) |
## Final Review
- [ ] Comprehensive code review
- [ ] Security review (auth/PII-in-logs/secrets/log-injection — triggered)
- [ ] All Outstanding Warnings resolved or consciously accepted
- [ ] Full build passes (`npm run build` + `tsc --noEmit`)
- [ ] Full test suite passes (`pytest`)
- [ ] Merged to `master`
## Amendment Log
_(Filled in if the plan is amended mid-implementation.)_
- 2026-06-09: Plan reviewer (pre-implementation) → ⚠️ with 3 Critical Gaps, all resolved
before Phase 1: (G1) descoped non-existent API-key mutation events; (G2) dropped the
buffered-writer subsystem for a direct synchronous-on-loop write with `call_soon_threadsafe`
marshaling for thread-origin events; (G3) record "server shutting down" first in the shutdown
block (no buffer to flush). Concerns folded in: device events via `device_health_changed` +
`device_discovered/_lost`; actor ContextVar in `verify_api_key`; name-on-delete passed
explicitly; settings-audit scoped + self-key excluded; parity allowlist before emit site.
Adopted suggestions: rowid keyset tiebreaker; log the disable action. Deferred: setup-scaffold
noise suppression.