Files
ledgrab/AUDIT_REMAINING.md
alexei.dolgolyov bb3a316e35 refactor(frontend): shared API client + automations registry (audit M7, H8)
H8 — automations.ts rule-type registry
  Convert the two hand-rolled RuleType dispatch ladders into per-type
  registries (RULE_FIELD_RENDERERS + RULE_COLLECTORS) keyed by RuleType,
  joining the existing RULE_CHIP_RENDERERS. All three are typed
  Record<RuleType, ...> for compile-time exhaustiveness; an import-time
  _assertRuleHandlerCoverage() check logs loudly if any registry drifts
  from RULE_TYPE_KEYS — mirrors the backend's _RULE_HANDLERS shape, the
  one intentional divergence being that the frontend logs rather than
  throws (a thrown error at module import would brick the whole bundle,
  not just the editor).

M7 — shared API client + 35 file migrations
  New core/api-client.ts wrapping fetchWithAuth with typed apiGet /
  apiPost / apiPut / apiPatch / apiDelete. Auth, 401-relogin, retry,
  timeout, and the offline toast all stay owned by fetchWithAuth; the
  client just collapses the
  if (!resp.ok) { detail || HTTP <status> } ... resp.json()
  dance into one typed call. The detail unwrap is hardened to join
  FastAPI validation arrays instead of stringifying to [object Object].

  35 feature/core files migrated to it across many batches, reviewer-
  approved for behaviour parity in three passes covering the riskier
  divergences (bulk Promise.allSettled deletes, inline-error saves,
  array-detail joins, silent-failure GETs, immutable clones).

  9 files remain on fetchWithAuth — the big god-modules tied to the
  pending C8/C9/C10 splits (streams, settings, targets, dashboard,
  color-strips/index, graph-editor, assets, value-sources) plus
  pairing-flow which by design stays on raw fetch (branches on raw
  Response.status codes).

i18n — 14 new locale keys (en / ru / zh)
  Added save/load/delete error keys across automations, pattern,
  audio_processing, audio_template, templates, gradient, target,
  device namespaces, plus backfilled gradient.error.delete_failed into
  ru/zh. Scan confirms no hardcoded English errorMessage strings
  remain in the migrated diff.

AUDIT_REMAINING.md updated to reflect H6, H8, and M7 status.

Verified: tsc --noEmit clean + npm run build clean after every batch.
2026-05-28 14:58:08 +03:00

18 KiB

LedGrab Architecture Audit — Remaining Items

Roadmap for the architecture-audit refactor sprint that started 2026-05-22. This file lists every audit finding that is not yet addressed; the ones already landed in commits 563cbac..2f15fbb are summarised below for context.

Already done (10 commits)

Commit Findings addressed
563cbac C2, C11, C1 (parallel-change only), C3, C4, C6, C7-streams
29bdacf C5 (HA/Z2M swap helper; full ABC deferred)
97dae2c H1
5fec8db M4
98fb61d H2
9f3f346 M5
05f73ee H6 (bindable extraction only)
3b8f00e + c1aa2eb C7 store-side
2f15fbb H3
uncommitted (2026-05-27 autonomous pass) H6-rest, H8, M7 (foundation + 3 reference files)

All commits have ≥1 code-review subagent pass with HIGH findings fixed before commit. Tests pass on each commit; ruff clean; tsc + bundle build clean for the frontend commit.

The two CRITICAL data-safety items (C2 silent CSS fallback, C11 string-replace JSON migration) are fixed. The two CRITICAL parallel-change problems for color-strip + value-source dispatch are fixed. The two HIGH dispatch problems (H1 effects, H2 rules) are fixed.


Remaining backend items

HIGH

H4 — Device.__init__ 40+ params mixing per-type fields

File: server/src/ledgrab/storage/device_store.py:46-150

The Device dataclass constructor accepts ~40 parameters that mix common fields with DMX-only / DDP-only / Hue-only / Yeelight-only / Wiz-only / LIFX-only / Govee-only / Nanoleaf-only / SPI-only / Chroma-only / GameSense-only fields. Setting hue_username on a WLED device is silently ignored.

Approach: introduce per-device-type config dataclasses (DmxConfig, HueConfig, DdpConfig, …) and make Device.config a discriminated union. Per-type validation moves to the config classes. Wire migration: every existing device row needs to be re-parsed; use the versioned MigrationRunner introduced in Phase 1.2.

Risk: medium-high. Touches:

  • storage/device_store.py — Device dataclass, from_dict, to_dict, create_device, update_device
  • api/schemas/devices.py — Pydantic schemas
  • api/routes/devices.py — request validation
  • core/devices/* — every provider reads device fields
  • A new migration to translate flat fields → nested config

Estimated scope: ~1500 LOC diff, 1-2 dedicated sessions.

H5 — WledTargetProcessor god class (32 methods, 5 responsibilities)

File: server/src/ledgrab/core/processing/wled_target_processor.py (1238 LOC)

Conflates:

  1. Device connectivity (probe, liveness, reconnect)
  2. FPS negotiation (adaptive_fps, keepalive_interval, state_check_interval)
  3. LED resampling (_fit_to_device — 60 lines of numpy)
  4. Preview WebSocket fanout (_preview_clients, _broadcast_led_preview)
  5. Metrics emission (get_state, get_metrics)

Approach: extract WledDeviceConnector, WledPixelSender, TargetFitProcessor, TargetPreviewBroadcaster, TargetMetricsCollector. WledTargetProcessor becomes an orchestrator that composes them.

Risk: HIGHEST in the audit. This class drives physical LED hardware in production. A regression caught at runtime (in the user's living room) is the expensive failure mode. Needs manual verification with at least one real WLED device after the refactor.

Coupled with: C5 (HA/Z2M shared the same shape; should extract a common BaseTargetProcessor ABC at the same time so all three processors share lifecycle / preview / metrics code).

Estimated scope: ~2000 LOC diff, 2-3 dedicated sessions, with manual device testing after each.

H7 — device-discovery.ts 1745 LOC

Frontend mirror of H4. The onDeviceTypeChanged handler has a giant switch with 15+ device kinds and 15+ _showXxxFields / _buildXxxItems helpers. Adding a device type requires editing 5 separate frontend hooks.

Approach: mirror the H4 backend redesign — once the storage layer has per-type config objects, the frontend can have a per-type field-set registry. Best done after H4 lands so the schemas drive the registry.

Estimated scope: 1-2 sessions; coupled to H4.

H8 — automations.ts 1410 LOC — DONE (uncommitted, 2026-05-27)

Frontend mirror of H2 (rule polymorphism). Already addressed on the backend in 98fb61d; the frontend dispatch on RuleType was hand-rolled.

Done: the two remaining hand-rolled dispatch ladders were converted to registries keyed by RuleType, alongside the pre-existing RULE_CHIP_RENDERERS:

  • RULE_FIELD_RENDERERS — the renderFields if/elif ladder was extracted into module-level _renderXxxFields(container, data) functions (they only ever closed over container); the in-row renderFields is now a 3-line dispatcher.
  • RULE_COLLECTORS — the getAutomationEditorRules if/elif ladder became per-type collectors; the loop is now a registry lookup.
  • All three registries are typed Record<RuleType, …> (compile-time exhaustiveness) and an import-time _assertRuleHandlerCoverage() logs loudly if any registry drifts from RULE_TYPE_KEYS. (Frontend logs rather than throws — a thrown error at import would brick the whole bundle, not just the editor — the one intentional divergence from the backend's raising _assert_rule_handler_coverage.)

Adding a new rule type now means: one entry in RULE_TYPE_KEYS, RULE_TYPE_ICONS, and each of the three registries — and tsc + the coverage check flag any omission.

Verified: tsc + bundle build clean; typescript-reviewer APPROVE (the extracted renderer bodies are byte-identical to the originals; no stray closure captures; http_poll widget-stash + HA entity loading preserved).

MEDIUM

M1 — ProcessorManager.add_target shotgun (11 args, WLED-leak)

File: server/src/ledgrab/core/processing/processor_manager.py:396

Method is named generically (add_target) but accepts protocol="ddp" and keepalive_interval — WLED-only fields. HA and Z2M have sibling methods with their own bespoke params.

Approach: extract a TargetFactory (per-kind builders, similar to value_source_factories.py from Phase 7). Couple with H5/C5 work.

M2 — TargetContext god-bag

File: server/src/ledgrab/core/processing/processor_manager.py

@dataclass TargetContext exposes ~8 attributes (device_store, color_strip_stream_manager, value_stream_manager, metrics_history, mqtt_manager, ha_manager, …). Processors silently depend on whichever fields they read. Tests have to construct a huge mock context.

Approach: make per-processor explicit dependency injection. Couple with H5 work.

M3 — Validation duplicated across layers

Field-level constraints (composite nesting depth, name uniqueness, span ranges) are enforced in route + schema + store. Adding a new constraint means editing 3 places.

Approach: move all validation to the model/schema layer (Pydantic validators + dataclass __post_init__). Routes trust the schema; store trusts the model.

Risk: moderate — cross-cutting; needs careful review of which layer currently owns which constraint.

M6 — ws_stream.py mixed concerns (699 LOC)

File: server/src/ledgrab/api/routes/color_strip_sources/ws_stream.py

The worst part (stream-creation dispatch) was fixed in Phase 2.1 — it now calls color_strip_kinds.build_stream(source, deps). The remaining 699 lines mix config parsing + WebSocket lifecycle + frame loop. Could extract the frame loop into a separate PreviewFrameLoop class.

Estimated scope: half a session. Low impact since the parallel-change problem is already fixed.

M7 — No shared frontend API client — 🟡 FOUNDATION DONE (uncommitted, 2026-05-27)

File: every static/js/features/*.ts

fetchWithAuth(...) + bespoke error-unwrapping is copy-pasted in every feature's save / load function. ~45 files, ~243 call sites.

Done: static/js/core/api-client.ts now provides typed apiGet / apiPost / apiPut / apiPatch / apiDelete that wrap fetchWithAuth (so auth, 401-relogin, retry, timeout, and the offline toast are unchanged) and collapse the repeated if (!resp.ok) { detail || HTTP <status> } … resp.json() dance into one call returning a typed body and throwing ApiError on failure. The detail unwrap is hardened to join FastAPI validation arrays instead of stringifying to [object Object]. 35 feature/core files migrated (covers GET/POST/PUT/DELETE, typed response bodies, custom i18n error messages, silent-failure GETs, bulk Promise.allSettled deletes, inline-error saves, array-detail joins, fire-and-forget POSTs, and local catch handling) — reviewer-approved for behaviour parity across the riskier divergences. Migrated files include the integration sources (weather / HA / MQTT / HTTP), the template families (capture / audio / audio-processing / pattern), the scene-preset CRUD, the simple-CRUD entity files (sync-clocks / audio-sources / game-integration / gradient / displays / device-discovery), the light-target editors (z2m / ha), the preferences modules (dashboard-layout / card-modes / notifications-watcher), the calibration editors (simple + advanced), the entire automations.ts and devices.ts CRUD surfaces, and several core utilities (api-client.ts itself, cache.ts, command-palette.ts, graph-connections.ts, tag-input.ts, process-picker.ts, perf-charts.ts, icon-picker.ts, update.ts, integrations.ts).

Also added 14 new locale keys (en / ru / zh) so the fallback messages the migration surfaces — pattern.error.save_failed, audio_processing.error.save_failed, audio_template.error.save_failed, audio_template.error.load_failed, templates.error.save_failed, templates.error.load_failed, gradient.error.save_failed, target.error.load_failed, device.error.load_failed, automations.error.{load,save,delete,toggle}_failed, plus gradient.error.delete_failed for ru/zh — are translated instead of hardcoded English. A scan confirms no errorMessage: '<English>' strings remain in the migrated diff.

Remaining: 9 feature files (~94 call sites). All but one are the big god-modules whose migration is best done as part of their C8/C9/C10 splits: streams.ts (18), settings.ts (18), targets.ts (16), dashboard.ts (15), color-strips/index.ts (8), graph-editor.ts (7), assets.ts (6 — also blocked by multipart upload + blob download paths that legitimately bypass the JSON client), and value-sources.ts (5). The lone leaf file still on fetchWithAuth is pairing-flow.ts (1) — its branching on raw Response.status codes (200 / 409 / 4xx) doesn't fit the api-client contract, so it stays on raw fetch by design. Migration is mechanical but not a blind find/replace — each site carries its own localised error key that must be preserved as the errorMessage option, and binary/multipart endpoints (e.g. assets.ts file upload / blob download) must stay on raw fetchWithAuth (the client is JSON-only). Each migrated file ideally gets manual UI smoke-testing. Behaviour note: migrated GET sites now prefer the server's detail over the generic localised fallback when present — matching what the write paths already did; intended, but user-visible.

M8 — Global _cached* let vars

Mutable module-level state mutated from multiple feature modules. No subscription model — features manually invalidate() after CRUD.

Approach: introduce a reactive cache (EventEmitter pattern or a tiny store like Nano Stores). Couple with M7 (the API client can drive cache invalidation on write).

M9 — dashboard.ts 1421 LOC

Frontend god-module orchestrating + rendering device / target / CSS cards. Couple with C8/C9/C10 frontend split work.

M10 — Duplicate frontend modal classes

ValueSourceModal, StreamEditorModal, TargetEditorModal, AddDeviceModal, etc. each reimplement pristine-check / undo / focus management.

Approach: introduce a FormModal<T> base class.

M11 — Hardcoded _getSectionForSource / _getTabForSource

Routing tables duplicated across multiple feature files (streams.ts, value-sources.ts). Adding a new stream type requires hunting strings.

Approach: single routing registry keyed by source_type.

M12 — Late imports masking cycles

Partially addressed by the kind registries (Phase 2.1, 2.2). Some late-imports still exist in value_stream.py, audio_stream.py, the target processors. Resolving them requires restructuring module layout to break the circular dependencies.

Estimated scope: small follow-up after H5.

LOW

L1 — (src as any).field casts in value-sources.ts

Discriminated unions aren't narrowed properly. Couple with C8 frontend split.

L2 — Mutable state without locks

_preview_clients, _last_preview_data, _color_stream, _css_stream are mutated from multiple async tasks without explicit locks. Production has not exhibited issues but the contract is fragile.

Approach: add explicit asyncio.Lock per processor. Couple with H5.

L3 — Calibration.validate() raises instead of returning result

File: server/src/ledgrab/core/capture/calibration.py:164

All 4 call sites currently rely on the raise; converting to ValidationResult would force every caller to check a return value without adding safety. Recommendation: skip — current design is appropriate.

L4 — _SOURCE_TYPE_MAP is module-private

No public GET /api/v1/source-types discovery endpoint. Frontend hardcodes the list of source types in types.ts.

Approach: add a discovery route + matching frontend fetch. Couple with H6 frontend split (since types.ts is involved).

L5 — AudioValueStream implicit state machine

File: server/src/ledgrab/core/processing/value_stream.py:169-383

get_value() can be called before start(); transitions are implicit. Approach: explicit State pattern. Low value (production callers always start before reading).


Remaining frontend items (all)

CRITICAL

  • C8value-sources.ts 1972 LOC (4 god-functions, type-dispatch ladders)
  • C9graph-editor.ts 2707 LOC (layout + interaction + state + WS sync + …)
  • C10streams.ts 2341 LOC (picture / audio / template kitchen-sink)

Other frontend (severity in main list above)

  • H6 rest DONE (uncommitted, 2026-05-27): types.ts (1140 LOC) split into 18 per-entity files under types/ (joining the existing bindable.ts); types.ts is now a ~200-line pure re-export barrel, so every import { … } from '../types.ts' still resolves. Reviewer confirmed all 102 exported symbols preserved, none renamed.
  • H7device-discovery.ts 1745 LOC (couple with H4)
  • H8automations.ts 1410 LOC (mirror H2)
  • M7 — shared API client
  • M8 — reactive cache
  • M9dashboard.ts 1421 LOC
  • M10FormModal<T> base
  • M11 — routing registry
  • L1 — narrowing the discriminated unions

The frontend remainder is multi-day work even when broken up by finding. Recommended approach: a dedicated frontend sprint with the typescript-reviewer agent + manual UI testing for each god-module split. Order:

  1. Finish types.ts split (H6) — pure organisation, low risk, unblocks the rest
  2. Introduce API client (M7) — every feature file gains a cleaner shape
  3. Split value-sources.ts (C8) — uses the API client + per-type registry pattern
  4. Split streams.ts (C10)
  5. Split graph-editor.ts (C9) — needs the most care; the file owns the entire visual editor
  6. Polish: dashboard.ts (M9), device-discovery.ts (H7), automations.ts (H8), FormModal (M10), routing registry (M11), reactive cache (M8), narrowing (L1)

Session A — Frontend sprint (multi-day)

Address H6-rest, C8, C9, C10, H7, H8, M7-M11, L1. See order above. Critical to have typescript-reviewer feedback + manual UI testing after each split.

Progress (2026-05-27, uncommitted): steps 1 & 2 of the order above are done — H6-rest (types.ts split) and M7-foundation (api-client.ts

  • 3 reference migrations). H8 (automations registry) also landed. Still open: C8, C9, C10, H7, the remaining ~40 M7 file migrations, M8-M11, L1. Next per the order: introduce the API client everywhere (finish M7), then split value-sources.ts (C8).

Session B — Device redesign (1-2 sessions)

Address H4 alone. Touches device storage + provider classes; needs a data migration. Once H4 lands, H7 frontend mirror can follow.

Session C — BaseTargetProcessor ABC (2-3 sessions)

Address C5 (full) + H5 + M1 + M2 + L2 together. Highest risk in the audit because it drives physical LED hardware. Each step needs manual verification with a real device.

Session D — Polish (half a session)

Address M3, M6 (remainder), M12 (remainder), L3 (decision: skip), L4, L5.


Pattern reference for new contributors

Three registry-pattern templates that already exist in the codebase and should be the model for the remaining dispatch ladders:

  1. Class-level handler dict + import-time coverage assertion

    • core/processing/effect_stream.py::_RENDERERS (@_effect_renderer decorator + @_collect_effect_renderers class decorator)
    • core/automations/automation_engine.py::AutomationEngine._RULE_HANDLERS (module-level binding after class definition)
    • api/routes/output_targets.py::_TARGET_RESPONSE_BUILDERS (response-shape dispatch keyed by storage class)
  2. Per-type free functions + dependency-bag dataclass

    • core/processing/color_strip_kinds.py (StreamDeps + STREAM_BUILDERS)
    • core/processing/value_kinds.py (ValueStreamDeps + STREAM_BUILDERS)
    • storage/value_source_factories.py (CREATE_BUILDERS + UPDATE_APPLIERS)
  3. Versioned migration runner

    • storage/data_migrations.py (MigrationRunner + DataMigration ABC)
    • Used for any storage rename / field-shape change in the future.
    • Audit-table contract: atomic transaction covers applied-check + apply + record, so partial-failure cannot leave data rewritten but unrecorded.

Adding a new feature that touches dispatch should reach for one of these three patterns before writing a fresh if/elif chain.