# LedGrab Architecture Audit — Remaining Items Roadmap for the architecture-audit refactor sprint that started 2026-05-22. This file lists every audit finding that is **not yet addressed**; the ones already landed in commits `563cbac..2f15fbb` are summarised below for context. ## Already done (10 commits) | Commit | Findings addressed | |---|---| | `563cbac` | C2, C11, C1 (parallel-change only), C3, C4, C6, C7-streams | | `29bdacf` | C5 (HA/Z2M swap helper; full ABC deferred) | | `97dae2c` | H1 | | `5fec8db` | M4 | | `98fb61d` | H2 | | `9f3f346` | M5 | | `05f73ee` | H6 (bindable extraction only) | | `3b8f00e` + `c1aa2eb` | C7 store-side | | `2f15fbb` | H3 | | _uncommitted (2026-05-27 autonomous pass)_ | H6-rest, H8, M7 (foundation + 3 reference files) | All commits have ≥1 code-review subagent pass with HIGH findings fixed before commit. Tests pass on each commit; ruff clean; tsc + bundle build clean for the frontend commit. The two CRITICAL **data-safety** items (C2 silent CSS fallback, C11 string-replace JSON migration) are fixed. The two CRITICAL **parallel-change** problems for color-strip + value-source dispatch are fixed. The two HIGH dispatch problems (H1 effects, H2 rules) are fixed. --- ## Remaining backend items ### HIGH #### H4 — `Device.__init__` 40+ params mixing per-type fields **File:** `server/src/ledgrab/storage/device_store.py:46-150` The `Device` dataclass constructor accepts ~40 parameters that mix common fields with DMX-only / DDP-only / Hue-only / Yeelight-only / Wiz-only / LIFX-only / Govee-only / Nanoleaf-only / SPI-only / Chroma-only / GameSense-only fields. Setting `hue_username` on a WLED device is silently ignored. **Approach:** introduce per-device-type config dataclasses (`DmxConfig`, `HueConfig`, `DdpConfig`, …) and make `Device.config` a discriminated union. Per-type validation moves to the config classes. Wire migration: every existing device row needs to be re-parsed; use the versioned `MigrationRunner` introduced in Phase 1.2. **Risk:** medium-high. Touches: - `storage/device_store.py` — Device dataclass, `from_dict`, `to_dict`, `create_device`, `update_device` - `api/schemas/devices.py` — Pydantic schemas - `api/routes/devices.py` — request validation - `core/devices/*` — every provider reads device fields - A new migration to translate flat fields → nested `config` **Estimated scope:** ~1500 LOC diff, 1-2 dedicated sessions. #### H5 — `WledTargetProcessor` god class (32 methods, 5 responsibilities) **File:** `server/src/ledgrab/core/processing/wled_target_processor.py` (1238 LOC) Conflates: 1. Device connectivity (probe, liveness, reconnect) 2. FPS negotiation (adaptive_fps, keepalive_interval, state_check_interval) 3. LED resampling (`_fit_to_device` — 60 lines of numpy) 4. Preview WebSocket fanout (`_preview_clients`, `_broadcast_led_preview`) 5. Metrics emission (`get_state`, `get_metrics`) **Approach:** extract `WledDeviceConnector`, `WledPixelSender`, `TargetFitProcessor`, `TargetPreviewBroadcaster`, `TargetMetricsCollector`. `WledTargetProcessor` becomes an orchestrator that composes them. **Risk:** HIGHEST in the audit. This class drives physical LED hardware in production. A regression caught at runtime (in the user's living room) is the expensive failure mode. Needs manual verification with at least one real WLED device after the refactor. **Coupled with:** C5 (HA/Z2M shared the same shape; should extract a common `BaseTargetProcessor` ABC at the same time so all three processors share lifecycle / preview / metrics code). **Estimated scope:** ~2000 LOC diff, 2-3 dedicated sessions, with manual device testing after each. #### H7 — `device-discovery.ts` 1745 LOC Frontend mirror of H4. The `onDeviceTypeChanged` handler has a giant switch with 15+ device kinds and 15+ `_showXxxFields` / `_buildXxxItems` helpers. Adding a device type requires editing 5 separate frontend hooks. **Approach:** mirror the H4 backend redesign — once the storage layer has per-type config objects, the frontend can have a per-type field-set registry. Best done **after** H4 lands so the schemas drive the registry. **Estimated scope:** 1-2 sessions; coupled to H4. #### H8 — `automations.ts` 1410 LOC — ✅ DONE (uncommitted, 2026-05-27) Frontend mirror of H2 (rule polymorphism). Already addressed on the backend in `98fb61d`; the frontend dispatch on `RuleType` was hand-rolled. **Done:** the two remaining hand-rolled dispatch ladders were converted to registries keyed by `RuleType`, alongside the pre-existing `RULE_CHIP_RENDERERS`: - `RULE_FIELD_RENDERERS` — the `renderFields` if/elif ladder was extracted into module-level `_renderXxxFields(container, data)` functions (they only ever closed over `container`); the in-row `renderFields` is now a 3-line dispatcher. - `RULE_COLLECTORS` — the `getAutomationEditorRules` if/elif ladder became per-type collectors; the loop is now a registry lookup. - All three registries are typed `Record` (compile-time exhaustiveness) and an import-time `_assertRuleHandlerCoverage()` logs loudly if any registry drifts from `RULE_TYPE_KEYS`. (Frontend logs rather than throws — a thrown error at import would brick the whole bundle, not just the editor — the one intentional divergence from the backend's raising `_assert_rule_handler_coverage`.) Adding a new rule type now means: one entry in `RULE_TYPE_KEYS`, `RULE_TYPE_ICONS`, and each of the three registries — and tsc + the coverage check flag any omission. Verified: tsc + bundle build clean; typescript-reviewer APPROVE (the extracted renderer bodies are byte-identical to the originals; no stray closure captures; http_poll widget-stash + HA entity loading preserved). ### MEDIUM #### M1 — `ProcessorManager.add_target` shotgun (11 args, WLED-leak) **File:** `server/src/ledgrab/core/processing/processor_manager.py:396` Method is named generically (`add_target`) but accepts `protocol="ddp"` and `keepalive_interval` — WLED-only fields. HA and Z2M have sibling methods with their own bespoke params. **Approach:** extract a `TargetFactory` (per-kind builders, similar to `value_source_factories.py` from Phase 7). Couple with H5/C5 work. #### M2 — `TargetContext` god-bag **File:** `server/src/ledgrab/core/processing/processor_manager.py` `@dataclass TargetContext` exposes ~8 attributes (device_store, color_strip_stream_manager, value_stream_manager, metrics_history, mqtt_manager, ha_manager, …). Processors silently depend on whichever fields they read. Tests have to construct a huge mock context. **Approach:** make per-processor explicit dependency injection. Couple with H5 work. #### M3 — Validation duplicated across layers Field-level constraints (composite nesting depth, name uniqueness, span ranges) are enforced in route + schema + store. Adding a new constraint means editing 3 places. **Approach:** move all validation to the model/schema layer (Pydantic validators + dataclass `__post_init__`). Routes trust the schema; store trusts the model. **Risk:** moderate — cross-cutting; needs careful review of which layer currently owns which constraint. #### M6 — `ws_stream.py` mixed concerns (699 LOC) **File:** `server/src/ledgrab/api/routes/color_strip_sources/ws_stream.py` The worst part (stream-creation dispatch) was fixed in Phase 2.1 — it now calls `color_strip_kinds.build_stream(source, deps)`. The remaining 699 lines mix config parsing + WebSocket lifecycle + frame loop. Could extract the frame loop into a separate `PreviewFrameLoop` class. **Estimated scope:** half a session. Low impact since the parallel-change problem is already fixed. #### M7 — No shared frontend API client — 🟡 FOUNDATION DONE (uncommitted, 2026-05-27) **File:** every `static/js/features/*.ts` `fetchWithAuth(...)` + bespoke error-unwrapping is copy-pasted in every feature's save / load function. ~45 files, ~243 call sites. **Done:** `static/js/core/api-client.ts` now provides typed `apiGet` / `apiPost` / `apiPut` / `apiPatch` / `apiDelete` that wrap `fetchWithAuth` (so auth, 401-relogin, retry, timeout, and the offline toast are unchanged) and collapse the repeated `if (!resp.ok) { detail || HTTP } … resp.json()` dance into one call returning a typed body and throwing `ApiError` on failure. The `detail` unwrap is hardened to join FastAPI validation arrays instead of stringifying to `[object Object]`. **35 feature/core files migrated** (covers GET/POST/PUT/DELETE, typed response bodies, custom i18n error messages, silent-failure GETs, bulk `Promise.allSettled` deletes, inline-error saves, array-`detail` joins, fire-and-forget POSTs, and local catch handling) — reviewer-approved for behaviour parity across the riskier divergences. Migrated files include the integration sources (weather / HA / MQTT / HTTP), the template families (capture / audio / audio-processing / pattern), the scene-preset CRUD, the simple-CRUD entity files (sync-clocks / audio-sources / game-integration / gradient / displays / device-discovery), the light-target editors (z2m / ha), the preferences modules (dashboard-layout / card-modes / notifications-watcher), the calibration editors (simple + advanced), the entire `automations.ts` and `devices.ts` CRUD surfaces, and several core utilities (`api-client.ts` itself, `cache.ts`, `command-palette.ts`, `graph-connections.ts`, `tag-input.ts`, `process-picker.ts`, `perf-charts.ts`, `icon-picker.ts`, `update.ts`, `integrations.ts`). Also added **14 new locale keys** (en / ru / zh) so the fallback messages the migration surfaces — `pattern.error.save_failed`, `audio_processing.error.save_failed`, `audio_template.error.save_failed`, `audio_template.error.load_failed`, `templates.error.save_failed`, `templates.error.load_failed`, `gradient.error.save_failed`, `target.error.load_failed`, `device.error.load_failed`, `automations.error.{load,save,delete,toggle}_failed`, plus `gradient.error.delete_failed` for ru/zh — are translated instead of hardcoded English. A scan confirms **no `errorMessage: ''` strings remain** in the migrated diff. **Remaining:** 9 feature files (~94 call sites). All but one are the big god-modules whose migration is best done as part of their C8/C9/C10 splits: `streams.ts` (18), `settings.ts` (18), `targets.ts` (16), `dashboard.ts` (15), `color-strips/index.ts` (8), `graph-editor.ts` (7), `assets.ts` (6 — also blocked by multipart upload + blob download paths that legitimately bypass the JSON client), and `value-sources.ts` (5). The lone leaf file still on `fetchWithAuth` is `pairing-flow.ts` (1) — its branching on raw `Response.status` codes (200 / 409 / 4xx) doesn't fit the api-client contract, so it stays on raw fetch by design. Migration is mechanical but **not** a blind find/replace — each site carries its own localised error key that must be preserved as the `errorMessage` option, and binary/multipart endpoints (e.g. `assets.ts` file upload / blob download) must stay on raw `fetchWithAuth` (the client is JSON-only). Each migrated file ideally gets manual UI smoke-testing. **Behaviour note:** migrated GET sites now prefer the server's `detail` over the generic localised fallback when present — matching what the write paths already did; intended, but user-visible. #### M8 — Global `_cached*` `let` vars Mutable module-level state mutated from multiple feature modules. No subscription model — features manually `invalidate()` after CRUD. **Approach:** introduce a reactive cache (EventEmitter pattern or a tiny store like Nano Stores). Couple with M7 (the API client can drive cache invalidation on write). #### M9 — `dashboard.ts` 1421 LOC Frontend god-module orchestrating + rendering device / target / CSS cards. Couple with C8/C9/C10 frontend split work. #### M10 — Duplicate frontend modal classes `ValueSourceModal`, `StreamEditorModal`, `TargetEditorModal`, `AddDeviceModal`, etc. each reimplement pristine-check / undo / focus management. **Approach:** introduce a `FormModal` base class. #### M11 — Hardcoded `_getSectionForSource` / `_getTabForSource` Routing tables duplicated across multiple feature files (streams.ts, value-sources.ts). Adding a new stream type requires hunting strings. **Approach:** single routing registry keyed by source_type. #### M12 — Late imports masking cycles Partially addressed by the kind registries (Phase 2.1, 2.2). Some late-imports still exist in `value_stream.py`, `audio_stream.py`, the target processors. Resolving them requires restructuring module layout to break the circular dependencies. **Estimated scope:** small follow-up after H5. ### LOW #### L1 — `(src as any).field` casts in `value-sources.ts` Discriminated unions aren't narrowed properly. Couple with C8 frontend split. #### L2 — Mutable state without locks `_preview_clients`, `_last_preview_data`, `_color_stream`, `_css_stream` are mutated from multiple async tasks without explicit locks. Production has not exhibited issues but the contract is fragile. **Approach:** add explicit `asyncio.Lock` per processor. Couple with H5. #### L3 — `Calibration.validate()` raises instead of returning result **File:** `server/src/ledgrab/core/capture/calibration.py:164` All 4 call sites currently rely on the raise; converting to `ValidationResult` would force every caller to check a return value without adding safety. **Recommendation:** skip — current design is appropriate. #### L4 — `_SOURCE_TYPE_MAP` is module-private No public `GET /api/v1/source-types` discovery endpoint. Frontend hardcodes the list of source types in `types.ts`. **Approach:** add a discovery route + matching frontend fetch. Couple with H6 frontend split (since `types.ts` is involved). #### L5 — `AudioValueStream` implicit state machine **File:** `server/src/ledgrab/core/processing/value_stream.py:169-383` `get_value()` can be called before `start()`; transitions are implicit. **Approach:** explicit State pattern. Low value (production callers always start before reading). --- ## Remaining frontend items (all) ### CRITICAL - **C8** — `value-sources.ts` 1972 LOC (4 god-functions, type-dispatch ladders) - **C9** — `graph-editor.ts` 2707 LOC (layout + interaction + state + WS sync + …) - **C10** — `streams.ts` 2341 LOC (picture / audio / template kitchen-sink) ### Other frontend (severity in main list above) - **H6 rest** — ✅ DONE (uncommitted, 2026-05-27): `types.ts` (1140 LOC) split into 18 per-entity files under `types/` (joining the existing `bindable.ts`); `types.ts` is now a ~200-line pure re-export barrel, so every `import { … } from '../types.ts'` still resolves. Reviewer confirmed all 102 exported symbols preserved, none renamed. - **H7** — `device-discovery.ts` 1745 LOC (couple with H4) - **H8** — `automations.ts` 1410 LOC (mirror H2) - **M7** — shared API client - **M8** — reactive cache - **M9** — `dashboard.ts` 1421 LOC - **M10** — `FormModal` base - **M11** — routing registry - **L1** — narrowing the discriminated unions The frontend remainder is **multi-day work** even when broken up by finding. Recommended approach: a dedicated frontend sprint with the typescript-reviewer agent + manual UI testing for each god-module split. Order: 1. Finish `types.ts` split (H6) — pure organisation, low risk, unblocks the rest 2. Introduce API client (M7) — every feature file gains a cleaner shape 3. Split `value-sources.ts` (C8) — uses the API client + per-type registry pattern 4. Split `streams.ts` (C10) 5. Split `graph-editor.ts` (C9) — needs the most care; the file owns the entire visual editor 6. Polish: `dashboard.ts` (M9), `device-discovery.ts` (H7), `automations.ts` (H8), `FormModal` (M10), routing registry (M11), reactive cache (M8), narrowing (L1) --- ## Recommended ordering for future sessions ### Session A — Frontend sprint (multi-day) Address H6-rest, C8, C9, C10, H7, H8, M7-M11, L1. See order above. Critical to have typescript-reviewer feedback + manual UI testing after each split. > **Progress (2026-05-27, uncommitted):** steps 1 & 2 of the order above > are done — H6-rest (`types.ts` split) and M7-foundation (`api-client.ts` > + 3 reference migrations). H8 (automations registry) also landed. Still > open: C8, C9, C10, H7, the remaining ~40 M7 file migrations, M8-M11, L1. > Next per the order: introduce the API client everywhere (finish M7), > then split `value-sources.ts` (C8). ### Session B — Device redesign (1-2 sessions) Address H4 alone. Touches device storage + provider classes; needs a data migration. Once H4 lands, H7 frontend mirror can follow. ### Session C — BaseTargetProcessor ABC (2-3 sessions) Address C5 (full) + H5 + M1 + M2 + L2 together. Highest risk in the audit because it drives physical LED hardware. Each step needs manual verification with a real device. ### Session D — Polish (half a session) Address M3, M6 (remainder), M12 (remainder), L3 (decision: skip), L4, L5. --- ## Pattern reference for new contributors Three registry-pattern templates that already exist in the codebase and should be the model for the remaining dispatch ladders: 1. **Class-level handler dict + import-time coverage assertion** - `core/processing/effect_stream.py::_RENDERERS` (`@_effect_renderer` decorator + `@_collect_effect_renderers` class decorator) - `core/automations/automation_engine.py::AutomationEngine._RULE_HANDLERS` (module-level binding after class definition) - `api/routes/output_targets.py::_TARGET_RESPONSE_BUILDERS` (response-shape dispatch keyed by storage class) 2. **Per-type free functions + dependency-bag dataclass** - `core/processing/color_strip_kinds.py` (`StreamDeps` + `STREAM_BUILDERS`) - `core/processing/value_kinds.py` (`ValueStreamDeps` + `STREAM_BUILDERS`) - `storage/value_source_factories.py` (`CREATE_BUILDERS` + `UPDATE_APPLIERS`) 3. **Versioned migration runner** - `storage/data_migrations.py` (`MigrationRunner` + `DataMigration` ABC) - Used for any storage rename / field-shape change in the future. - Audit-table contract: atomic transaction covers applied-check + apply + record, so partial-failure cannot leave data rewritten but unrecorded. Adding a new feature that touches dispatch should reach for one of these three patterns before writing a fresh if/elif chain.