ledgrab/AUDIT_REMAINING.md

# LedGrab Architecture Audit — Remaining Items

Roadmap for the architecture-audit refactor sprint that started 2026-05-22.
This file lists every audit finding that is **not yet addressed**; the ones
already landed in commits `563cbac..2f15fbb` are summarised below for
context.

## Already done (10 commits)

| Commit | Findings addressed |
|---|---|
| `563cbac` | C2, C11, C1 (parallel-change only), C3, C4, C6, C7-streams |
| `29bdacf` | C5 (HA/Z2M swap helper; full ABC deferred) |
| `97dae2c` | H1 |
| `5fec8db` | M4 |
| `98fb61d` | H2 |
| `9f3f346` | M5 |
| `05f73ee` | H6 (bindable extraction only) |
| `3b8f00e` + `c1aa2eb` | C7 store-side |
| `2f15fbb` | H3 |

All commits have ≥1 code-review subagent pass with HIGH findings fixed
before commit. Tests pass on each commit; ruff clean; tsc + bundle build
clean for the frontend commit.

The two CRITICAL **data-safety** items (C2 silent CSS fallback, C11
string-replace JSON migration) are fixed. The two CRITICAL
**parallel-change** problems for color-strip + value-source dispatch are
fixed. The two HIGH dispatch problems (H1 effects, H2 rules) are fixed.

---

## Remaining backend items

### HIGH

#### H4 — `Device.__init__` 40+ params mixing per-type fields

**File:** `server/src/ledgrab/storage/device_store.py:46-150`

The `Device` dataclass constructor accepts ~40 parameters that mix common
fields with DMX-only / DDP-only / Hue-only / Yeelight-only / Wiz-only /
LIFX-only / Govee-only / Nanoleaf-only / SPI-only / Chroma-only /
GameSense-only fields. Setting `hue_username` on a WLED device is
silently ignored.

**Approach:** introduce per-device-type config dataclasses
(`DmxConfig`, `HueConfig`, `DdpConfig`, …) and make `Device.config` a
discriminated union. Per-type validation moves to the config classes.
Wire migration: every existing device row needs to be re-parsed; use the
versioned `MigrationRunner` introduced in Phase 1.2.

**Risk:** medium-high. Touches:
- `storage/device_store.py` — Device dataclass, `from_dict`, `to_dict`,
  `create_device`, `update_device`
- `api/schemas/devices.py` — Pydantic schemas
- `api/routes/devices.py` — request validation
- `core/devices/*` — every provider reads device fields
- A new migration to translate flat fields → nested `config`

**Estimated scope:** ~1500 LOC diff, 1-2 dedicated sessions.

#### H5 — `WledTargetProcessor` god class (32 methods, 5 responsibilities)

**File:** `server/src/ledgrab/core/processing/wled_target_processor.py` (1238 LOC)

Conflates:
1. Device connectivity (probe, liveness, reconnect)
2. FPS negotiation (adaptive_fps, keepalive_interval, state_check_interval)
3. LED resampling (`_fit_to_device` — 60 lines of numpy)
4. Preview WebSocket fanout (`_preview_clients`, `_broadcast_led_preview`)
5. Metrics emission (`get_state`, `get_metrics`)

**Approach:** extract `WledDeviceConnector`, `WledPixelSender`,
`TargetFitProcessor`, `TargetPreviewBroadcaster`, `TargetMetricsCollector`.
`WledTargetProcessor` becomes an orchestrator that composes them.

**Risk:** HIGHEST in the audit. This class drives physical LED hardware
in production. A regression caught at runtime (in the user's living
room) is the expensive failure mode. Needs manual verification with at
least one real WLED device after the refactor.

**Coupled with:** C5 (HA/Z2M shared the same shape; should extract a
common `BaseTargetProcessor` ABC at the same time so all three
processors share lifecycle / preview / metrics code).

**Estimated scope:** ~2000 LOC diff, 2-3 dedicated sessions, with manual
device testing after each.

#### H7 — `device-discovery.ts` 1745 LOC

Frontend mirror of H4. The `onDeviceTypeChanged` handler has a giant
switch with 15+ device kinds and 15+ `_showXxxFields` / `_buildXxxItems`
helpers. Adding a device type requires editing 5 separate frontend hooks.

**Approach:** mirror the H4 backend redesign — once the storage layer
has per-type config objects, the frontend can have a per-type field-set
registry. Best done **after** H4 lands so the schemas drive the
registry.

**Estimated scope:** 1-2 sessions; coupled to H4.

#### H8 — `automations.ts` 1410 LOC

Frontend mirror of H2 (rule polymorphism). Already addressed on the
backend in `98fb61d`; the frontend dispatch on `RuleType` is still
hand-rolled.

**Approach:** introduce a rule-type registry on the frontend matching
the backend's `_RULE_HANDLERS` shape.

**Estimated scope:** half a session.

### MEDIUM

#### M1 — `ProcessorManager.add_target` shotgun (11 args, WLED-leak)

**File:** `server/src/ledgrab/core/processing/processor_manager.py:396`

Method is named generically (`add_target`) but accepts `protocol="ddp"`
and `keepalive_interval` — WLED-only fields. HA and Z2M have sibling
methods with their own bespoke params.

**Approach:** extract a `TargetFactory` (per-kind builders, similar to
`value_source_factories.py` from Phase 7). Couple with H5/C5 work.

#### M2 — `TargetContext` god-bag

**File:** `server/src/ledgrab/core/processing/processor_manager.py`

`@dataclass TargetContext` exposes ~8 attributes (device_store,
color_strip_stream_manager, value_stream_manager, metrics_history,
mqtt_manager, ha_manager, …). Processors silently depend on whichever
fields they read. Tests have to construct a huge mock context.

**Approach:** make per-processor explicit dependency injection. Couple
with H5 work.

#### M3 — Validation duplicated across layers

Field-level constraints (composite nesting depth, name uniqueness, span
ranges) are enforced in route + schema + store. Adding a new constraint
means editing 3 places.

**Approach:** move all validation to the model/schema layer (Pydantic
validators + dataclass `__post_init__`). Routes trust the schema; store
trusts the model.

**Risk:** moderate — cross-cutting; needs careful review of which layer
currently owns which constraint.

#### M6 — `ws_stream.py` mixed concerns (699 LOC)

**File:** `server/src/ledgrab/api/routes/color_strip_sources/ws_stream.py`

The worst part (stream-creation dispatch) was fixed in Phase 2.1 — it
now calls `color_strip_kinds.build_stream(source, deps)`. The remaining
699 lines mix config parsing + WebSocket lifecycle + frame loop. Could
extract the frame loop into a separate `PreviewFrameLoop` class.

**Estimated scope:** half a session. Low impact since the parallel-change
problem is already fixed.

#### M7 — No shared frontend API client

**File:** every `static/js/features/*.ts`

`fetchWithAuth(...)` + bespoke error-unwrapping is copy-pasted in every
feature's save / load function. ~25 files.

**Approach:** introduce `static/js/core/api-client.ts` with typed
methods (`get`, `post`, `put`, `delete`) that handle auth, JSON parsing,
error normalisation. Replace `fetchWithAuth` calls across features.

#### M8 — Global `_cached*` `let` vars

Mutable module-level state mutated from multiple feature modules. No
subscription model — features manually `invalidate()` after CRUD.

**Approach:** introduce a reactive cache (EventEmitter pattern or a tiny
store like Nano Stores). Couple with M7 (the API client can drive cache
invalidation on write).

#### M9 — `dashboard.ts` 1421 LOC

Frontend god-module orchestrating + rendering device / target / CSS
cards. Couple with C8/C9/C10 frontend split work.

#### M10 — Duplicate frontend modal classes

`ValueSourceModal`, `StreamEditorModal`, `TargetEditorModal`,
`AddDeviceModal`, etc. each reimplement pristine-check / undo / focus
management.

**Approach:** introduce a `FormModal<T>` base class.

#### M11 — Hardcoded `_getSectionForSource` / `_getTabForSource`

Routing tables duplicated across multiple feature files (streams.ts,
value-sources.ts). Adding a new stream type requires hunting strings.

**Approach:** single routing registry keyed by source_type.

#### M12 — Late imports masking cycles

Partially addressed by the kind registries (Phase 2.1, 2.2). Some
late-imports still exist in `value_stream.py`, `audio_stream.py`, the
target processors. Resolving them requires restructuring module layout
to break the circular dependencies.

**Estimated scope:** small follow-up after H5.

### LOW

#### L1 — `(src as any).field` casts in `value-sources.ts`

Discriminated unions aren't narrowed properly. Couple with C8 frontend
split.

#### L2 — Mutable state without locks

`_preview_clients`, `_last_preview_data`, `_color_stream`,
`_css_stream` are mutated from multiple async tasks without explicit
locks. Production has not exhibited issues but the contract is fragile.

**Approach:** add explicit `asyncio.Lock` per processor. Couple with H5.

#### L3 — `Calibration.validate()` raises instead of returning result

**File:** `server/src/ledgrab/core/capture/calibration.py:164`

All 4 call sites currently rely on the raise; converting to
`ValidationResult` would force every caller to check a return value
without adding safety. **Recommendation:** skip — current design is
appropriate.

#### L4 — `_SOURCE_TYPE_MAP` is module-private

No public `GET /api/v1/source-types` discovery endpoint. Frontend
hardcodes the list of source types in `types.ts`.

**Approach:** add a discovery route + matching frontend fetch. Couple
with H6 frontend split (since `types.ts` is involved).

#### L5 — `AudioValueStream` implicit state machine

**File:** `server/src/ledgrab/core/processing/value_stream.py:169-383`

`get_value()` can be called before `start()`; transitions are implicit.
**Approach:** explicit State pattern. Low value (production callers
always start before reading).

---

## Remaining frontend items (all)

### CRITICAL

- **C8** — `value-sources.ts` 1972 LOC (4 god-functions, type-dispatch ladders)
- **C9** — `graph-editor.ts` 2707 LOC (layout + interaction + state + WS sync + …)
- **C10** — `streams.ts` 2341 LOC (picture / audio / template kitchen-sink)

### Other frontend (severity in main list above)

- **H6 rest** — split remaining ~1100 LOC of `types.ts` into per-entity files
- **H7** — `device-discovery.ts` 1745 LOC (couple with H4)
- **H8** — `automations.ts` 1410 LOC (mirror H2)
- **M7** — shared API client
- **M8** — reactive cache
- **M9** — `dashboard.ts` 1421 LOC
- **M10** — `FormModal<T>` base
- **M11** — routing registry
- **L1** — narrowing the discriminated unions

The frontend remainder is **multi-day work** even when broken up by
finding. Recommended approach: a dedicated frontend sprint with the
typescript-reviewer agent + manual UI testing for each god-module
split. Order:

1. Finish `types.ts` split (H6) — pure organisation, low risk, unblocks
   the rest
2. Introduce API client (M7) — every feature file gains a cleaner shape
3. Split `value-sources.ts` (C8) — uses the API client + per-type
   registry pattern
4. Split `streams.ts` (C10)
5. Split `graph-editor.ts` (C9) — needs the most care; the file owns
   the entire visual editor
6. Polish: `dashboard.ts` (M9), `device-discovery.ts` (H7),
   `automations.ts` (H8), `FormModal` (M10), routing registry (M11),
   reactive cache (M8), narrowing (L1)

---

## Recommended ordering for future sessions

### Session A — Frontend sprint (multi-day)

Address H6-rest, C8, C9, C10, H7, H8, M7-M11, L1. See order above.
Critical to have typescript-reviewer feedback + manual UI testing after
each split.

### Session B — Device redesign (1-2 sessions)

Address H4 alone. Touches device storage + provider classes; needs a
data migration. Once H4 lands, H7 frontend mirror can follow.

### Session C — BaseTargetProcessor ABC (2-3 sessions)

Address C5 (full) + H5 + M1 + M2 + L2 together. Highest risk in the
audit because it drives physical LED hardware. Each step needs manual
verification with a real device.

### Session D — Polish (half a session)

Address M3, M6 (remainder), M12 (remainder), L3 (decision: skip), L4,
L5.

---

## Pattern reference for new contributors

Three registry-pattern templates that already exist in the codebase and
should be the model for the remaining dispatch ladders:

1. **Class-level handler dict + import-time coverage assertion**
   - `core/processing/effect_stream.py::_RENDERERS`
     (`@_effect_renderer` decorator + `@_collect_effect_renderers`
     class decorator)
   - `core/automations/automation_engine.py::AutomationEngine._RULE_HANDLERS`
     (module-level binding after class definition)
   - `api/routes/output_targets.py::_TARGET_RESPONSE_BUILDERS`
     (response-shape dispatch keyed by storage class)

2. **Per-type free functions + dependency-bag dataclass**
   - `core/processing/color_strip_kinds.py` (`StreamDeps` + `STREAM_BUILDERS`)
   - `core/processing/value_kinds.py` (`ValueStreamDeps` + `STREAM_BUILDERS`)
   - `storage/value_source_factories.py` (`CREATE_BUILDERS` + `UPDATE_APPLIERS`)

3. **Versioned migration runner**
   - `storage/data_migrations.py` (`MigrationRunner` + `DataMigration` ABC)
   - Used for any storage rename / field-shape change in the future.
   - Audit-table contract: atomic transaction covers
     applied-check + apply + record, so partial-failure cannot leave
     data rewritten but unrecorded.

Adding a new feature that touches dispatch should reach for one of
these three patterns before writing a fresh if/elif chain.