Files
ledgrab/AUDIT_REMAINING.md
T
alexei.dolgolyov 628c6b2f0d docs: capture architecture-audit remainder for follow-up sessions
10 commits in this branch landed the data-safety bugs (C2, C11), the
worst parallel-change problems (C1/C3/C4/C6/C7), and 5 of the 9 HIGH
audit findings. The remainder splits cleanly into four follow-up
sessions: a frontend sprint (C8/C9/C10/H6/H7/H8/M7-M11/L1), a Device
redesign (H4), a BaseTargetProcessor ABC (C5/H5/M1/M2/L2), and polish
(M3/M6/M12/L3-L5).

This file documents what's left, the recommended ordering, and the
three registry patterns already in the codebase that new contributors
should reach for instead of writing fresh if/elif chains.
2026-05-23 00:36:39 +03:00

348 lines
13 KiB
Markdown

# LedGrab Architecture Audit — Remaining Items
Roadmap for the architecture-audit refactor sprint that started 2026-05-22.
This file lists every audit finding that is **not yet addressed**; the ones
already landed in commits `563cbac..2f15fbb` are summarised below for
context.
## Already done (10 commits)
| Commit | Findings addressed |
|---|---|
| `563cbac` | C2, C11, C1 (parallel-change only), C3, C4, C6, C7-streams |
| `29bdacf` | C5 (HA/Z2M swap helper; full ABC deferred) |
| `97dae2c` | H1 |
| `5fec8db` | M4 |
| `98fb61d` | H2 |
| `9f3f346` | M5 |
| `05f73ee` | H6 (bindable extraction only) |
| `3b8f00e` + `c1aa2eb` | C7 store-side |
| `2f15fbb` | H3 |
All commits have ≥1 code-review subagent pass with HIGH findings fixed
before commit. Tests pass on each commit; ruff clean; tsc + bundle build
clean for the frontend commit.
The two CRITICAL **data-safety** items (C2 silent CSS fallback, C11
string-replace JSON migration) are fixed. The two CRITICAL
**parallel-change** problems for color-strip + value-source dispatch are
fixed. The two HIGH dispatch problems (H1 effects, H2 rules) are fixed.
---
## Remaining backend items
### HIGH
#### H4 — `Device.__init__` 40+ params mixing per-type fields
**File:** `server/src/ledgrab/storage/device_store.py:46-150`
The `Device` dataclass constructor accepts ~40 parameters that mix common
fields with DMX-only / DDP-only / Hue-only / Yeelight-only / Wiz-only /
LIFX-only / Govee-only / Nanoleaf-only / SPI-only / Chroma-only /
GameSense-only fields. Setting `hue_username` on a WLED device is
silently ignored.
**Approach:** introduce per-device-type config dataclasses
(`DmxConfig`, `HueConfig`, `DdpConfig`, …) and make `Device.config` a
discriminated union. Per-type validation moves to the config classes.
Wire migration: every existing device row needs to be re-parsed; use the
versioned `MigrationRunner` introduced in Phase 1.2.
**Risk:** medium-high. Touches:
- `storage/device_store.py` — Device dataclass, `from_dict`, `to_dict`,
`create_device`, `update_device`
- `api/schemas/devices.py` — Pydantic schemas
- `api/routes/devices.py` — request validation
- `core/devices/*` — every provider reads device fields
- A new migration to translate flat fields → nested `config`
**Estimated scope:** ~1500 LOC diff, 1-2 dedicated sessions.
#### H5 — `WledTargetProcessor` god class (32 methods, 5 responsibilities)
**File:** `server/src/ledgrab/core/processing/wled_target_processor.py` (1238 LOC)
Conflates:
1. Device connectivity (probe, liveness, reconnect)
2. FPS negotiation (adaptive_fps, keepalive_interval, state_check_interval)
3. LED resampling (`_fit_to_device` — 60 lines of numpy)
4. Preview WebSocket fanout (`_preview_clients`, `_broadcast_led_preview`)
5. Metrics emission (`get_state`, `get_metrics`)
**Approach:** extract `WledDeviceConnector`, `WledPixelSender`,
`TargetFitProcessor`, `TargetPreviewBroadcaster`, `TargetMetricsCollector`.
`WledTargetProcessor` becomes an orchestrator that composes them.
**Risk:** HIGHEST in the audit. This class drives physical LED hardware
in production. A regression caught at runtime (in the user's living
room) is the expensive failure mode. Needs manual verification with at
least one real WLED device after the refactor.
**Coupled with:** C5 (HA/Z2M shared the same shape; should extract a
common `BaseTargetProcessor` ABC at the same time so all three
processors share lifecycle / preview / metrics code).
**Estimated scope:** ~2000 LOC diff, 2-3 dedicated sessions, with manual
device testing after each.
#### H7 — `device-discovery.ts` 1745 LOC
Frontend mirror of H4. The `onDeviceTypeChanged` handler has a giant
switch with 15+ device kinds and 15+ `_showXxxFields` / `_buildXxxItems`
helpers. Adding a device type requires editing 5 separate frontend hooks.
**Approach:** mirror the H4 backend redesign — once the storage layer
has per-type config objects, the frontend can have a per-type field-set
registry. Best done **after** H4 lands so the schemas drive the
registry.
**Estimated scope:** 1-2 sessions; coupled to H4.
#### H8 — `automations.ts` 1410 LOC
Frontend mirror of H2 (rule polymorphism). Already addressed on the
backend in `98fb61d`; the frontend dispatch on `RuleType` is still
hand-rolled.
**Approach:** introduce a rule-type registry on the frontend matching
the backend's `_RULE_HANDLERS` shape.
**Estimated scope:** half a session.
### MEDIUM
#### M1 — `ProcessorManager.add_target` shotgun (11 args, WLED-leak)
**File:** `server/src/ledgrab/core/processing/processor_manager.py:396`
Method is named generically (`add_target`) but accepts `protocol="ddp"`
and `keepalive_interval` — WLED-only fields. HA and Z2M have sibling
methods with their own bespoke params.
**Approach:** extract a `TargetFactory` (per-kind builders, similar to
`value_source_factories.py` from Phase 7). Couple with H5/C5 work.
#### M2 — `TargetContext` god-bag
**File:** `server/src/ledgrab/core/processing/processor_manager.py`
`@dataclass TargetContext` exposes ~8 attributes (device_store,
color_strip_stream_manager, value_stream_manager, metrics_history,
mqtt_manager, ha_manager, …). Processors silently depend on whichever
fields they read. Tests have to construct a huge mock context.
**Approach:** make per-processor explicit dependency injection. Couple
with H5 work.
#### M3 — Validation duplicated across layers
Field-level constraints (composite nesting depth, name uniqueness, span
ranges) are enforced in route + schema + store. Adding a new constraint
means editing 3 places.
**Approach:** move all validation to the model/schema layer (Pydantic
validators + dataclass `__post_init__`). Routes trust the schema; store
trusts the model.
**Risk:** moderate — cross-cutting; needs careful review of which layer
currently owns which constraint.
#### M6 — `ws_stream.py` mixed concerns (699 LOC)
**File:** `server/src/ledgrab/api/routes/color_strip_sources/ws_stream.py`
The worst part (stream-creation dispatch) was fixed in Phase 2.1 — it
now calls `color_strip_kinds.build_stream(source, deps)`. The remaining
699 lines mix config parsing + WebSocket lifecycle + frame loop. Could
extract the frame loop into a separate `PreviewFrameLoop` class.
**Estimated scope:** half a session. Low impact since the parallel-change
problem is already fixed.
#### M7 — No shared frontend API client
**File:** every `static/js/features/*.ts`
`fetchWithAuth(...)` + bespoke error-unwrapping is copy-pasted in every
feature's save / load function. ~25 files.
**Approach:** introduce `static/js/core/api-client.ts` with typed
methods (`get`, `post`, `put`, `delete`) that handle auth, JSON parsing,
error normalisation. Replace `fetchWithAuth` calls across features.
#### M8 — Global `_cached*` `let` vars
Mutable module-level state mutated from multiple feature modules. No
subscription model — features manually `invalidate()` after CRUD.
**Approach:** introduce a reactive cache (EventEmitter pattern or a tiny
store like Nano Stores). Couple with M7 (the API client can drive cache
invalidation on write).
#### M9 — `dashboard.ts` 1421 LOC
Frontend god-module orchestrating + rendering device / target / CSS
cards. Couple with C8/C9/C10 frontend split work.
#### M10 — Duplicate frontend modal classes
`ValueSourceModal`, `StreamEditorModal`, `TargetEditorModal`,
`AddDeviceModal`, etc. each reimplement pristine-check / undo / focus
management.
**Approach:** introduce a `FormModal<T>` base class.
#### M11 — Hardcoded `_getSectionForSource` / `_getTabForSource`
Routing tables duplicated across multiple feature files (streams.ts,
value-sources.ts). Adding a new stream type requires hunting strings.
**Approach:** single routing registry keyed by source_type.
#### M12 — Late imports masking cycles
Partially addressed by the kind registries (Phase 2.1, 2.2). Some
late-imports still exist in `value_stream.py`, `audio_stream.py`, the
target processors. Resolving them requires restructuring module layout
to break the circular dependencies.
**Estimated scope:** small follow-up after H5.
### LOW
#### L1 — `(src as any).field` casts in `value-sources.ts`
Discriminated unions aren't narrowed properly. Couple with C8 frontend
split.
#### L2 — Mutable state without locks
`_preview_clients`, `_last_preview_data`, `_color_stream`,
`_css_stream` are mutated from multiple async tasks without explicit
locks. Production has not exhibited issues but the contract is fragile.
**Approach:** add explicit `asyncio.Lock` per processor. Couple with H5.
#### L3 — `Calibration.validate()` raises instead of returning result
**File:** `server/src/ledgrab/core/capture/calibration.py:164`
All 4 call sites currently rely on the raise; converting to
`ValidationResult` would force every caller to check a return value
without adding safety. **Recommendation:** skip — current design is
appropriate.
#### L4 — `_SOURCE_TYPE_MAP` is module-private
No public `GET /api/v1/source-types` discovery endpoint. Frontend
hardcodes the list of source types in `types.ts`.
**Approach:** add a discovery route + matching frontend fetch. Couple
with H6 frontend split (since `types.ts` is involved).
#### L5 — `AudioValueStream` implicit state machine
**File:** `server/src/ledgrab/core/processing/value_stream.py:169-383`
`get_value()` can be called before `start()`; transitions are implicit.
**Approach:** explicit State pattern. Low value (production callers
always start before reading).
---
## Remaining frontend items (all)
### CRITICAL
- **C8** — `value-sources.ts` 1972 LOC (4 god-functions, type-dispatch ladders)
- **C9** — `graph-editor.ts` 2707 LOC (layout + interaction + state + WS sync + …)
- **C10** — `streams.ts` 2341 LOC (picture / audio / template kitchen-sink)
### Other frontend (severity in main list above)
- **H6 rest** — split remaining ~1100 LOC of `types.ts` into per-entity files
- **H7** — `device-discovery.ts` 1745 LOC (couple with H4)
- **H8** — `automations.ts` 1410 LOC (mirror H2)
- **M7** — shared API client
- **M8** — reactive cache
- **M9** — `dashboard.ts` 1421 LOC
- **M10** — `FormModal<T>` base
- **M11** — routing registry
- **L1** — narrowing the discriminated unions
The frontend remainder is **multi-day work** even when broken up by
finding. Recommended approach: a dedicated frontend sprint with the
typescript-reviewer agent + manual UI testing for each god-module
split. Order:
1. Finish `types.ts` split (H6) — pure organisation, low risk, unblocks
the rest
2. Introduce API client (M7) — every feature file gains a cleaner shape
3. Split `value-sources.ts` (C8) — uses the API client + per-type
registry pattern
4. Split `streams.ts` (C10)
5. Split `graph-editor.ts` (C9) — needs the most care; the file owns
the entire visual editor
6. Polish: `dashboard.ts` (M9), `device-discovery.ts` (H7),
`automations.ts` (H8), `FormModal` (M10), routing registry (M11),
reactive cache (M8), narrowing (L1)
---
## Recommended ordering for future sessions
### Session A — Frontend sprint (multi-day)
Address H6-rest, C8, C9, C10, H7, H8, M7-M11, L1. See order above.
Critical to have typescript-reviewer feedback + manual UI testing after
each split.
### Session B — Device redesign (1-2 sessions)
Address H4 alone. Touches device storage + provider classes; needs a
data migration. Once H4 lands, H7 frontend mirror can follow.
### Session C — BaseTargetProcessor ABC (2-3 sessions)
Address C5 (full) + H5 + M1 + M2 + L2 together. Highest risk in the
audit because it drives physical LED hardware. Each step needs manual
verification with a real device.
### Session D — Polish (half a session)
Address M3, M6 (remainder), M12 (remainder), L3 (decision: skip), L4,
L5.
---
## Pattern reference for new contributors
Three registry-pattern templates that already exist in the codebase and
should be the model for the remaining dispatch ladders:
1. **Class-level handler dict + import-time coverage assertion**
- `core/processing/effect_stream.py::_RENDERERS`
(`@_effect_renderer` decorator + `@_collect_effect_renderers`
class decorator)
- `core/automations/automation_engine.py::AutomationEngine._RULE_HANDLERS`
(module-level binding after class definition)
- `api/routes/output_targets.py::_TARGET_RESPONSE_BUILDERS`
(response-shape dispatch keyed by storage class)
2. **Per-type free functions + dependency-bag dataclass**
- `core/processing/color_strip_kinds.py` (`StreamDeps` + `STREAM_BUILDERS`)
- `core/processing/value_kinds.py` (`ValueStreamDeps` + `STREAM_BUILDERS`)
- `storage/value_source_factories.py` (`CREATE_BUILDERS` + `UPDATE_APPLIERS`)
3. **Versioned migration runner**
- `storage/data_migrations.py` (`MigrationRunner` + `DataMigration` ABC)
- Used for any storage rename / field-shape change in the future.
- Audit-table contract: atomic transaction covers
applied-check + apply + record, so partial-failure cannot leave
data rewritten but unrecorded.
Adding a new feature that touches dispatch should reach for one of
these three patterns before writing a fresh if/elif chain.