Files

T

alexei.dolgolyov 628c6b2f0d docs: capture architecture-audit remainder for follow-up sessions

10 commits in this branch landed the data-safety bugs (C2, C11), the
worst parallel-change problems (C1/C3/C4/C6/C7), and 5 of the 9 HIGH
audit findings. The remainder splits cleanly into four follow-up
sessions: a frontend sprint (C8/C9/C10/H6/H7/H8/M7-M11/L1), a Device
redesign (H4), a BaseTargetProcessor ABC (C5/H5/M1/M2/L2), and polish
(M3/M6/M12/L3-L5).

This file documents what's left, the recommended ordering, and the
three registry patterns already in the codebase that new contributors
should reach for instead of writing fresh if/elif chains.

2026-05-23 00:36:39 +03:00

13 KiB

Raw Blame History

LedGrab Architecture Audit — Remaining Items

Roadmap for the architecture-audit refactor sprint that started 2026-05-22. This file lists every audit finding that is not yet addressed; the ones already landed in commits 563cbac..2f15fbb are summarised below for context.

Already done (10 commits)

Commit	Findings addressed
`563cbac`	C2, C11, C1 (parallel-change only), C3, C4, C6, C7-streams
`29bdacf`	C5 (HA/Z2M swap helper; full ABC deferred)
`97dae2c`	H1
`5fec8db`	M4
`98fb61d`	H2
`9f3f346`	M5
`05f73ee`	H6 (bindable extraction only)
`3b8f00e` + `c1aa2eb`	C7 store-side
`2f15fbb`	H3

All commits have ≥1 code-review subagent pass with HIGH findings fixed before commit. Tests pass on each commit; ruff clean; tsc + bundle build clean for the frontend commit.

The two CRITICAL data-safety items (C2 silent CSS fallback, C11 string-replace JSON migration) are fixed. The two CRITICAL parallel-change problems for color-strip + value-source dispatch are fixed. The two HIGH dispatch problems (H1 effects, H2 rules) are fixed.

Remaining backend items

HIGH

H4 — `Device.init` 40+ params mixing per-type fields

File: server/src/ledgrab/storage/device_store.py:46-150

The Device dataclass constructor accepts ~40 parameters that mix common fields with DMX-only / DDP-only / Hue-only / Yeelight-only / Wiz-only / LIFX-only / Govee-only / Nanoleaf-only / SPI-only / Chroma-only / GameSense-only fields. Setting hue_username on a WLED device is silently ignored.

Approach: introduce per-device-type config dataclasses (DmxConfig, HueConfig, DdpConfig, …) and make Device.config a discriminated union. Per-type validation moves to the config classes. Wire migration: every existing device row needs to be re-parsed; use the versioned MigrationRunner introduced in Phase 1.2.

Risk: medium-high. Touches:

storage/device_store.py — Device dataclass, from_dict, to_dict, create_device, update_device
api/schemas/devices.py — Pydantic schemas
api/routes/devices.py — request validation
core/devices/* — every provider reads device fields
A new migration to translate flat fields → nested config

Estimated scope: ~1500 LOC diff, 1-2 dedicated sessions.

H5 — `WledTargetProcessor` god class (32 methods, 5 responsibilities)

File: server/src/ledgrab/core/processing/wled_target_processor.py (1238 LOC)

Conflates:

Device connectivity (probe, liveness, reconnect)
FPS negotiation (adaptive_fps, keepalive_interval, state_check_interval)
LED resampling (_fit_to_device — 60 lines of numpy)
Preview WebSocket fanout (_preview_clients, _broadcast_led_preview)
Metrics emission (get_state, get_metrics)

Approach: extract WledDeviceConnector, WledPixelSender, TargetFitProcessor, TargetPreviewBroadcaster, TargetMetricsCollector. WledTargetProcessor becomes an orchestrator that composes them.

Risk: HIGHEST in the audit. This class drives physical LED hardware in production. A regression caught at runtime (in the user's living room) is the expensive failure mode. Needs manual verification with at least one real WLED device after the refactor.

Coupled with: C5 (HA/Z2M shared the same shape; should extract a common BaseTargetProcessor ABC at the same time so all three processors share lifecycle / preview / metrics code).

Estimated scope: ~2000 LOC diff, 2-3 dedicated sessions, with manual device testing after each.

H7 — `device-discovery.ts` 1745 LOC

Frontend mirror of H4. The onDeviceTypeChanged handler has a giant switch with 15+ device kinds and 15+ _showXxxFields / _buildXxxItems helpers. Adding a device type requires editing 5 separate frontend hooks.

Approach: mirror the H4 backend redesign — once the storage layer has per-type config objects, the frontend can have a per-type field-set registry. Best done after H4 lands so the schemas drive the registry.

Estimated scope: 1-2 sessions; coupled to H4.

H8 — `automations.ts` 1410 LOC

Frontend mirror of H2 (rule polymorphism). Already addressed on the backend in 98fb61d; the frontend dispatch on RuleType is still hand-rolled.

Approach: introduce a rule-type registry on the frontend matching the backend's _RULE_HANDLERS shape.

Estimated scope: half a session.

MEDIUM

M1 — `ProcessorManager.add_target` shotgun (11 args, WLED-leak)

File: server/src/ledgrab/core/processing/processor_manager.py:396

Method is named generically (add_target) but accepts protocol="ddp" and keepalive_interval — WLED-only fields. HA and Z2M have sibling methods with their own bespoke params.

Approach: extract a TargetFactory (per-kind builders, similar to value_source_factories.py from Phase 7). Couple with H5/C5 work.

M2 — `TargetContext` god-bag

File: server/src/ledgrab/core/processing/processor_manager.py

@dataclass TargetContext exposes ~8 attributes (device_store, color_strip_stream_manager, value_stream_manager, metrics_history, mqtt_manager, ha_manager, …). Processors silently depend on whichever fields they read. Tests have to construct a huge mock context.

Approach: make per-processor explicit dependency injection. Couple with H5 work.

M3 — Validation duplicated across layers

Field-level constraints (composite nesting depth, name uniqueness, span ranges) are enforced in route + schema + store. Adding a new constraint means editing 3 places.

Approach: move all validation to the model/schema layer (Pydantic validators + dataclass __post_init__). Routes trust the schema; store trusts the model.

Risk: moderate — cross-cutting; needs careful review of which layer currently owns which constraint.

M6 — `ws_stream.py` mixed concerns (699 LOC)

File: server/src/ledgrab/api/routes/color_strip_sources/ws_stream.py

The worst part (stream-creation dispatch) was fixed in Phase 2.1 — it now calls color_strip_kinds.build_stream(source, deps). The remaining 699 lines mix config parsing + WebSocket lifecycle + frame loop. Could extract the frame loop into a separate PreviewFrameLoop class.

Estimated scope: half a session. Low impact since the parallel-change problem is already fixed.

M7 — No shared frontend API client

File: every static/js/features/*.ts

fetchWithAuth(...) + bespoke error-unwrapping is copy-pasted in every feature's save / load function. ~25 files.

Approach: introduce static/js/core/api-client.ts with typed methods (get, post, put, delete) that handle auth, JSON parsing, error normalisation. Replace fetchWithAuth calls across features.

M8 — Global `_cached*` `let` vars

Mutable module-level state mutated from multiple feature modules. No subscription model — features manually invalidate() after CRUD.

Approach: introduce a reactive cache (EventEmitter pattern or a tiny store like Nano Stores). Couple with M7 (the API client can drive cache invalidation on write).

M9 — `dashboard.ts` 1421 LOC

Frontend god-module orchestrating + rendering device / target / CSS cards. Couple with C8/C9/C10 frontend split work.

ValueSourceModal, StreamEditorModal, TargetEditorModal, AddDeviceModal, etc. each reimplement pristine-check / undo / focus management.

Approach: introduce a FormModal<T> base class.

M11 — Hardcoded `_getSectionForSource` / `_getTabForSource`

Routing tables duplicated across multiple feature files (streams.ts, value-sources.ts). Adding a new stream type requires hunting strings.

Approach: single routing registry keyed by source_type.

M12 — Late imports masking cycles

Partially addressed by the kind registries (Phase 2.1, 2.2). Some late-imports still exist in value_stream.py, audio_stream.py, the target processors. Resolving them requires restructuring module layout to break the circular dependencies.

Estimated scope: small follow-up after H5.

LOW

L1 — `(src as any).field` casts in `value-sources.ts`

Discriminated unions aren't narrowed properly. Couple with C8 frontend split.

L2 — Mutable state without locks

_preview_clients, _last_preview_data, _color_stream, _css_stream are mutated from multiple async tasks without explicit locks. Production has not exhibited issues but the contract is fragile.

Approach: add explicit asyncio.Lock per processor. Couple with H5.

L3 — `Calibration.validate()` raises instead of returning result

File: server/src/ledgrab/core/capture/calibration.py:164

All 4 call sites currently rely on the raise; converting to ValidationResult would force every caller to check a return value without adding safety. Recommendation: skip — current design is appropriate.

L4 — `_SOURCE_TYPE_MAP` is module-private

No public GET /api/v1/source-types discovery endpoint. Frontend hardcodes the list of source types in types.ts.

Approach: add a discovery route + matching frontend fetch. Couple with H6 frontend split (since types.ts is involved).

L5 — `AudioValueStream` implicit state machine

File: server/src/ledgrab/core/processing/value_stream.py:169-383

get_value() can be called before start(); transitions are implicit. Approach: explicit State pattern. Low value (production callers always start before reading).

Remaining frontend items (all)

CRITICAL

C8 — value-sources.ts 1972 LOC (4 god-functions, type-dispatch ladders)
C9 — graph-editor.ts 2707 LOC (layout + interaction + state + WS sync + …)
C10 — streams.ts 2341 LOC (picture / audio / template kitchen-sink)

Other frontend (severity in main list above)

H6 rest — split remaining ~1100 LOC of types.ts into per-entity files
H7 — device-discovery.ts 1745 LOC (couple with H4)
H8 — automations.ts 1410 LOC (mirror H2)
M7 — shared API client
M8 — reactive cache
M9 — dashboard.ts 1421 LOC
M10 — FormModal<T> base
M11 — routing registry
L1 — narrowing the discriminated unions

The frontend remainder is multi-day work even when broken up by finding. Recommended approach: a dedicated frontend sprint with the typescript-reviewer agent + manual UI testing for each god-module split. Order:

Finish types.ts split (H6) — pure organisation, low risk, unblocks the rest
Introduce API client (M7) — every feature file gains a cleaner shape
Split value-sources.ts (C8) — uses the API client + per-type registry pattern
Split streams.ts (C10)
Split graph-editor.ts (C9) — needs the most care; the file owns the entire visual editor
Polish: dashboard.ts (M9), device-discovery.ts (H7), automations.ts (H8), FormModal (M10), routing registry (M11), reactive cache (M8), narrowing (L1)

Recommended ordering for future sessions

Session A — Frontend sprint (multi-day)

Address H6-rest, C8, C9, C10, H7, H8, M7-M11, L1. See order above. Critical to have typescript-reviewer feedback + manual UI testing after each split.

Session B — Device redesign (1-2 sessions)

Address H4 alone. Touches device storage + provider classes; needs a data migration. Once H4 lands, H7 frontend mirror can follow.

Session C — BaseTargetProcessor ABC (2-3 sessions)

Address C5 (full) + H5 + M1 + M2 + L2 together. Highest risk in the audit because it drives physical LED hardware. Each step needs manual verification with a real device.

Session D — Polish (half a session)

Address M3, M6 (remainder), M12 (remainder), L3 (decision: skip), L4, L5.

Pattern reference for new contributors

Three registry-pattern templates that already exist in the codebase and should be the model for the remaining dispatch ladders:

Class-level handler dict + import-time coverage assertion
- core/processing/effect_stream.py::_RENDERERS (@_effect_renderer decorator + @_collect_effect_renderers class decorator)
- core/automations/automation_engine.py::AutomationEngine._RULE_HANDLERS (module-level binding after class definition)
- api/routes/output_targets.py::_TARGET_RESPONSE_BUILDERS (response-shape dispatch keyed by storage class)
Per-type free functions + dependency-bag dataclass
- core/processing/color_strip_kinds.py (StreamDeps + STREAM_BUILDERS)
- core/processing/value_kinds.py (ValueStreamDeps + STREAM_BUILDERS)
- storage/value_source_factories.py (CREATE_BUILDERS + UPDATE_APPLIERS)
Versioned migration runner
- storage/data_migrations.py (MigrationRunner + DataMigration ABC)
- Used for any storage rename / field-shape change in the future.
- Audit-table contract: atomic transaction covers applied-check + apply + record, so partial-failure cannot leave data rewritten but unrecorded.

Adding a new feature that touches dispatch should reach for one of these three patterns before writing a fresh if/elif chain.

13 KiB Raw Blame History

LedGrab Architecture Audit — Remaining Items

Already done (10 commits)

Remaining backend items

HIGH

H4 — Device.__init__ 40+ params mixing per-type fields

H5 — WledTargetProcessor god class (32 methods, 5 responsibilities)

H7 — device-discovery.ts 1745 LOC

H8 — automations.ts 1410 LOC