From 628c6b2f0d6eccfb0e02b53896b4ae2a030b1b3f Mon Sep 17 00:00:00 2001 From: "alexei.dolgolyov" Date: Sat, 23 May 2026 00:36:39 +0300 Subject: [PATCH] docs: capture architecture-audit remainder for follow-up sessions 10 commits in this branch landed the data-safety bugs (C2, C11), the worst parallel-change problems (C1/C3/C4/C6/C7), and 5 of the 9 HIGH audit findings. The remainder splits cleanly into four follow-up sessions: a frontend sprint (C8/C9/C10/H6/H7/H8/M7-M11/L1), a Device redesign (H4), a BaseTargetProcessor ABC (C5/H5/M1/M2/L2), and polish (M3/M6/M12/L3-L5). This file documents what's left, the recommended ordering, and the three registry patterns already in the codebase that new contributors should reach for instead of writing fresh if/elif chains. --- AUDIT_REMAINING.md | 347 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 347 insertions(+) create mode 100644 AUDIT_REMAINING.md diff --git a/AUDIT_REMAINING.md b/AUDIT_REMAINING.md new file mode 100644 index 0000000..7c0665c --- /dev/null +++ b/AUDIT_REMAINING.md @@ -0,0 +1,347 @@ +# LedGrab Architecture Audit — Remaining Items + +Roadmap for the architecture-audit refactor sprint that started 2026-05-22. +This file lists every audit finding that is **not yet addressed**; the ones +already landed in commits `563cbac..2f15fbb` are summarised below for +context. + +## Already done (10 commits) + +| Commit | Findings addressed | +|---|---| +| `563cbac` | C2, C11, C1 (parallel-change only), C3, C4, C6, C7-streams | +| `29bdacf` | C5 (HA/Z2M swap helper; full ABC deferred) | +| `97dae2c` | H1 | +| `5fec8db` | M4 | +| `98fb61d` | H2 | +| `9f3f346` | M5 | +| `05f73ee` | H6 (bindable extraction only) | +| `3b8f00e` + `c1aa2eb` | C7 store-side | +| `2f15fbb` | H3 | + +All commits have ≥1 code-review subagent pass with HIGH findings fixed +before commit. Tests pass on each commit; ruff clean; tsc + bundle build +clean for the frontend commit. + +The two CRITICAL **data-safety** items (C2 silent CSS fallback, C11 +string-replace JSON migration) are fixed. The two CRITICAL +**parallel-change** problems for color-strip + value-source dispatch are +fixed. The two HIGH dispatch problems (H1 effects, H2 rules) are fixed. + +--- + +## Remaining backend items + +### HIGH + +#### H4 — `Device.__init__` 40+ params mixing per-type fields + +**File:** `server/src/ledgrab/storage/device_store.py:46-150` + +The `Device` dataclass constructor accepts ~40 parameters that mix common +fields with DMX-only / DDP-only / Hue-only / Yeelight-only / Wiz-only / +LIFX-only / Govee-only / Nanoleaf-only / SPI-only / Chroma-only / +GameSense-only fields. Setting `hue_username` on a WLED device is +silently ignored. + +**Approach:** introduce per-device-type config dataclasses +(`DmxConfig`, `HueConfig`, `DdpConfig`, …) and make `Device.config` a +discriminated union. Per-type validation moves to the config classes. +Wire migration: every existing device row needs to be re-parsed; use the +versioned `MigrationRunner` introduced in Phase 1.2. + +**Risk:** medium-high. Touches: +- `storage/device_store.py` — Device dataclass, `from_dict`, `to_dict`, + `create_device`, `update_device` +- `api/schemas/devices.py` — Pydantic schemas +- `api/routes/devices.py` — request validation +- `core/devices/*` — every provider reads device fields +- A new migration to translate flat fields → nested `config` + +**Estimated scope:** ~1500 LOC diff, 1-2 dedicated sessions. + +#### H5 — `WledTargetProcessor` god class (32 methods, 5 responsibilities) + +**File:** `server/src/ledgrab/core/processing/wled_target_processor.py` (1238 LOC) + +Conflates: +1. Device connectivity (probe, liveness, reconnect) +2. FPS negotiation (adaptive_fps, keepalive_interval, state_check_interval) +3. LED resampling (`_fit_to_device` — 60 lines of numpy) +4. Preview WebSocket fanout (`_preview_clients`, `_broadcast_led_preview`) +5. Metrics emission (`get_state`, `get_metrics`) + +**Approach:** extract `WledDeviceConnector`, `WledPixelSender`, +`TargetFitProcessor`, `TargetPreviewBroadcaster`, `TargetMetricsCollector`. +`WledTargetProcessor` becomes an orchestrator that composes them. + +**Risk:** HIGHEST in the audit. This class drives physical LED hardware +in production. A regression caught at runtime (in the user's living +room) is the expensive failure mode. Needs manual verification with at +least one real WLED device after the refactor. + +**Coupled with:** C5 (HA/Z2M shared the same shape; should extract a +common `BaseTargetProcessor` ABC at the same time so all three +processors share lifecycle / preview / metrics code). + +**Estimated scope:** ~2000 LOC diff, 2-3 dedicated sessions, with manual +device testing after each. + +#### H7 — `device-discovery.ts` 1745 LOC + +Frontend mirror of H4. The `onDeviceTypeChanged` handler has a giant +switch with 15+ device kinds and 15+ `_showXxxFields` / `_buildXxxItems` +helpers. Adding a device type requires editing 5 separate frontend hooks. + +**Approach:** mirror the H4 backend redesign — once the storage layer +has per-type config objects, the frontend can have a per-type field-set +registry. Best done **after** H4 lands so the schemas drive the +registry. + +**Estimated scope:** 1-2 sessions; coupled to H4. + +#### H8 — `automations.ts` 1410 LOC + +Frontend mirror of H2 (rule polymorphism). Already addressed on the +backend in `98fb61d`; the frontend dispatch on `RuleType` is still +hand-rolled. + +**Approach:** introduce a rule-type registry on the frontend matching +the backend's `_RULE_HANDLERS` shape. + +**Estimated scope:** half a session. + +### MEDIUM + +#### M1 — `ProcessorManager.add_target` shotgun (11 args, WLED-leak) + +**File:** `server/src/ledgrab/core/processing/processor_manager.py:396` + +Method is named generically (`add_target`) but accepts `protocol="ddp"` +and `keepalive_interval` — WLED-only fields. HA and Z2M have sibling +methods with their own bespoke params. + +**Approach:** extract a `TargetFactory` (per-kind builders, similar to +`value_source_factories.py` from Phase 7). Couple with H5/C5 work. + +#### M2 — `TargetContext` god-bag + +**File:** `server/src/ledgrab/core/processing/processor_manager.py` + +`@dataclass TargetContext` exposes ~8 attributes (device_store, +color_strip_stream_manager, value_stream_manager, metrics_history, +mqtt_manager, ha_manager, …). Processors silently depend on whichever +fields they read. Tests have to construct a huge mock context. + +**Approach:** make per-processor explicit dependency injection. Couple +with H5 work. + +#### M3 — Validation duplicated across layers + +Field-level constraints (composite nesting depth, name uniqueness, span +ranges) are enforced in route + schema + store. Adding a new constraint +means editing 3 places. + +**Approach:** move all validation to the model/schema layer (Pydantic +validators + dataclass `__post_init__`). Routes trust the schema; store +trusts the model. + +**Risk:** moderate — cross-cutting; needs careful review of which layer +currently owns which constraint. + +#### M6 — `ws_stream.py` mixed concerns (699 LOC) + +**File:** `server/src/ledgrab/api/routes/color_strip_sources/ws_stream.py` + +The worst part (stream-creation dispatch) was fixed in Phase 2.1 — it +now calls `color_strip_kinds.build_stream(source, deps)`. The remaining +699 lines mix config parsing + WebSocket lifecycle + frame loop. Could +extract the frame loop into a separate `PreviewFrameLoop` class. + +**Estimated scope:** half a session. Low impact since the parallel-change +problem is already fixed. + +#### M7 — No shared frontend API client + +**File:** every `static/js/features/*.ts` + +`fetchWithAuth(...)` + bespoke error-unwrapping is copy-pasted in every +feature's save / load function. ~25 files. + +**Approach:** introduce `static/js/core/api-client.ts` with typed +methods (`get`, `post`, `put`, `delete`) that handle auth, JSON parsing, +error normalisation. Replace `fetchWithAuth` calls across features. + +#### M8 — Global `_cached*` `let` vars + +Mutable module-level state mutated from multiple feature modules. No +subscription model — features manually `invalidate()` after CRUD. + +**Approach:** introduce a reactive cache (EventEmitter pattern or a tiny +store like Nano Stores). Couple with M7 (the API client can drive cache +invalidation on write). + +#### M9 — `dashboard.ts` 1421 LOC + +Frontend god-module orchestrating + rendering device / target / CSS +cards. Couple with C8/C9/C10 frontend split work. + +#### M10 — Duplicate frontend modal classes + +`ValueSourceModal`, `StreamEditorModal`, `TargetEditorModal`, +`AddDeviceModal`, etc. each reimplement pristine-check / undo / focus +management. + +**Approach:** introduce a `FormModal` base class. + +#### M11 — Hardcoded `_getSectionForSource` / `_getTabForSource` + +Routing tables duplicated across multiple feature files (streams.ts, +value-sources.ts). Adding a new stream type requires hunting strings. + +**Approach:** single routing registry keyed by source_type. + +#### M12 — Late imports masking cycles + +Partially addressed by the kind registries (Phase 2.1, 2.2). Some +late-imports still exist in `value_stream.py`, `audio_stream.py`, the +target processors. Resolving them requires restructuring module layout +to break the circular dependencies. + +**Estimated scope:** small follow-up after H5. + +### LOW + +#### L1 — `(src as any).field` casts in `value-sources.ts` + +Discriminated unions aren't narrowed properly. Couple with C8 frontend +split. + +#### L2 — Mutable state without locks + +`_preview_clients`, `_last_preview_data`, `_color_stream`, +`_css_stream` are mutated from multiple async tasks without explicit +locks. Production has not exhibited issues but the contract is fragile. + +**Approach:** add explicit `asyncio.Lock` per processor. Couple with H5. + +#### L3 — `Calibration.validate()` raises instead of returning result + +**File:** `server/src/ledgrab/core/capture/calibration.py:164` + +All 4 call sites currently rely on the raise; converting to +`ValidationResult` would force every caller to check a return value +without adding safety. **Recommendation:** skip — current design is +appropriate. + +#### L4 — `_SOURCE_TYPE_MAP` is module-private + +No public `GET /api/v1/source-types` discovery endpoint. Frontend +hardcodes the list of source types in `types.ts`. + +**Approach:** add a discovery route + matching frontend fetch. Couple +with H6 frontend split (since `types.ts` is involved). + +#### L5 — `AudioValueStream` implicit state machine + +**File:** `server/src/ledgrab/core/processing/value_stream.py:169-383` + +`get_value()` can be called before `start()`; transitions are implicit. +**Approach:** explicit State pattern. Low value (production callers +always start before reading). + +--- + +## Remaining frontend items (all) + +### CRITICAL + +- **C8** — `value-sources.ts` 1972 LOC (4 god-functions, type-dispatch ladders) +- **C9** — `graph-editor.ts` 2707 LOC (layout + interaction + state + WS sync + …) +- **C10** — `streams.ts` 2341 LOC (picture / audio / template kitchen-sink) + +### Other frontend (severity in main list above) + +- **H6 rest** — split remaining ~1100 LOC of `types.ts` into per-entity files +- **H7** — `device-discovery.ts` 1745 LOC (couple with H4) +- **H8** — `automations.ts` 1410 LOC (mirror H2) +- **M7** — shared API client +- **M8** — reactive cache +- **M9** — `dashboard.ts` 1421 LOC +- **M10** — `FormModal` base +- **M11** — routing registry +- **L1** — narrowing the discriminated unions + +The frontend remainder is **multi-day work** even when broken up by +finding. Recommended approach: a dedicated frontend sprint with the +typescript-reviewer agent + manual UI testing for each god-module +split. Order: + +1. Finish `types.ts` split (H6) — pure organisation, low risk, unblocks + the rest +2. Introduce API client (M7) — every feature file gains a cleaner shape +3. Split `value-sources.ts` (C8) — uses the API client + per-type + registry pattern +4. Split `streams.ts` (C10) +5. Split `graph-editor.ts` (C9) — needs the most care; the file owns + the entire visual editor +6. Polish: `dashboard.ts` (M9), `device-discovery.ts` (H7), + `automations.ts` (H8), `FormModal` (M10), routing registry (M11), + reactive cache (M8), narrowing (L1) + +--- + +## Recommended ordering for future sessions + +### Session A — Frontend sprint (multi-day) + +Address H6-rest, C8, C9, C10, H7, H8, M7-M11, L1. See order above. +Critical to have typescript-reviewer feedback + manual UI testing after +each split. + +### Session B — Device redesign (1-2 sessions) + +Address H4 alone. Touches device storage + provider classes; needs a +data migration. Once H4 lands, H7 frontend mirror can follow. + +### Session C — BaseTargetProcessor ABC (2-3 sessions) + +Address C5 (full) + H5 + M1 + M2 + L2 together. Highest risk in the +audit because it drives physical LED hardware. Each step needs manual +verification with a real device. + +### Session D — Polish (half a session) + +Address M3, M6 (remainder), M12 (remainder), L3 (decision: skip), L4, +L5. + +--- + +## Pattern reference for new contributors + +Three registry-pattern templates that already exist in the codebase and +should be the model for the remaining dispatch ladders: + +1. **Class-level handler dict + import-time coverage assertion** + - `core/processing/effect_stream.py::_RENDERERS` + (`@_effect_renderer` decorator + `@_collect_effect_renderers` + class decorator) + - `core/automations/automation_engine.py::AutomationEngine._RULE_HANDLERS` + (module-level binding after class definition) + - `api/routes/output_targets.py::_TARGET_RESPONSE_BUILDERS` + (response-shape dispatch keyed by storage class) + +2. **Per-type free functions + dependency-bag dataclass** + - `core/processing/color_strip_kinds.py` (`StreamDeps` + `STREAM_BUILDERS`) + - `core/processing/value_kinds.py` (`ValueStreamDeps` + `STREAM_BUILDERS`) + - `storage/value_source_factories.py` (`CREATE_BUILDERS` + `UPDATE_APPLIERS`) + +3. **Versioned migration runner** + - `storage/data_migrations.py` (`MigrationRunner` + `DataMigration` ABC) + - Used for any storage rename / field-shape change in the future. + - Audit-table contract: atomic transaction covers + applied-check + apply + record, so partial-failure cannot leave + data rewritten but unrecorded. + +Adding a new feature that touches dispatch should reach for one of +these three patterns before writing a fresh if/elif chain.