wled-screen-controller-mixed/REVIEW.md

# Codebase Review Report

_Generated 2026-03-09_

---

## 1. Bugs (Critical)

### Thread Safety / Race Conditions

| Issue | Location | Description |
|-------|----------|-------------|
| **Dict mutation during iteration** | `composite_stream.py:121`, `mapped_stream.py:102` | `update_source()` calls `_sub_streams.clear()` from the API thread while `_processing_loop` iterates the dict on a background thread. **Will crash with `RuntimeError: dictionary changed size during iteration`.** |
| **Clock ref-count corruption** | `color_strip_stream_manager.py:286-304` | On clock hot-swap, `_release_clock` reads the *new* clock_id from the store (already updated), so it releases the newly acquired clock instead of the old one. Leaks the old runtime, destroys the new one. |
| **SyncClockRuntime race** | `sync_clock_runtime.py:42-49` | `get_time()` reads `_running`, `_offset`, `_epoch` without `_lock`, while `pause()`/`resume()`/`reset()` modify them under `_lock`. Compound read can double-count elapsed time. |
| **SyncClockManager unprotected dicts** | `sync_clock_manager.py:26-54` | `_runtimes` and `_ref_counts` are plain dicts mutated from both the async event loop and background threads with no lock. |

### Silent Failures

| Issue | Location | Description |
|-------|----------|-------------|
| **Crashed streams go undetected** | `mapped_stream.py:214`, `composite_stream.py` | When the processing loop dies, `get_latest_colors()` permanently returns stale data. The target keeps sending frozen colors to LEDs with no indicator anything is wrong. |
| **Crash doesn't fire state_change event** | `wled_target_processor.py:900` | Fatal exception path sets `_is_running = False` without firing `state_change` event (only `stop()` fires it). Dashboard doesn't learn about crashes via WebSocket. |
| **WebSocket broadcast client mismatch** | `kc_target_processor.py:481-485` | `zip(self._ws_clients, results)` pairs results with the live list, but clients can be removed between scheduling `gather` and collecting results, causing wrong clients to be dropped. |

### Security

| Issue | Location | Description |
|-------|----------|-------------|
| **Incomplete path traversal guard** | `auto_backup.py` | Filename validation uses string checks (`".." in filename`) instead of `Path.resolve().is_relative_to()`. |

---

## 2. Performance

### High Impact (Hot Path)

| Issue | Location | Impact |
|-------|----------|--------|
| **Per-frame `np.array()` from list** | `ddp_client.py:195` | Allocates a new numpy array from a Python list every frame. Should use pre-allocated buffer. |
| **Triple FFT for mono audio** | `analysis.py:168-174` | When audio is mono (common for system loopback), runs 3 identical FFTs. 2x wasted CPU. |
| **`frame_time = 1.0/fps` in every loop iteration** | 8 stream files | Recomputed every frame despite `_fps` only changing on consumer subscribe. Should be cached. |
| **4x deque traversals per frame for metrics** | `kc_target_processor.py:413-416` | Full traversal of metrics deques every frame to compute avg/min/max. |
| **3x spectrum `.copy()` per audio chunk** | `analysis.py:195-201` | ~258 array allocations/sec for read-only consumers. Could use non-writable views. |

### Medium Impact

| Issue | Location |
|-------|----------|
| `getattr` + dict lookup per composite layer per frame | `composite_stream.py:299-304` |
| Unconditional `self.*=` attribute writes every frame in audio stream | `audio_stream.py:255-261` |
| `JSON.parse(localStorage)` on every collapsed-section call | `dashboard.js` `_getCollapsedSections` |
| Effect/composite/mapped streams hardcoded to 30 FPS | `effect_stream.py`, `composite_stream.py:37`, `mapped_stream.py:33` |
| Double `querySelectorAll` on card reconcile | `card-sections.js:229-232` |
| Module import inside per-second sampling function | `metrics_history.py:21,35` |
| `datetime.utcnow()` twice per frame | `kc_target_processor.py:420,464` |
| Redundant `bytes()` copy of bytes slice | `ddp_client.py:222` |
| Unnecessary `.copy()` of temp interp result | `audio_stream.py:331,342` |
| Multiple intermediate numpy allocs for luminance | `value_stream.py:486-494` |

---

## 3. Code Quality

### Architecture

| Issue | Description |
|-------|-------------|
| **12 store classes with duplicated boilerplate** | All JSON stores repeat the same load/save/CRUD pattern with no base class. A `BaseJsonStore[T]` would eliminate ~60% of each store file. |
| **`DeviceStore.save()` uses unsafe temp file** | Fixed-path temp file instead of `atomic_write_json` used by all other stores. |
| **`scene_activator.py` accesses `ProcessorManager._processors` directly** | Lines 33, 68, 90, 110 — bypasses public API, breaks encapsulation. |
| **Route code directly mutates `ProcessorManager` internals** | `devices.py` accesses `manager._devices` and `manager._color_strip_stream_manager` in 13+ places. |
| **`color-strips.js` is 1900+ lines** | Handles 11 CSS source types, gradient editor, composite layers, mapped zones, card rendering, overlay control — should be split. |
| **No `DataCache` for color strip sources** | Every other entity uses `DataCache`. CSS sources are fetched with raw `fetchWithAuth` in 5+ places with no deduplication. |

### Consistency / Hygiene

| Issue | Location |
|-------|----------|
| `Dict[str, any]` (lowercase `any`) — invalid type annotation | `template_store.py:138,187`, `audio_template_store.py:126,155` |
| `datetime.utcnow()` deprecated — 88 call sites in 42 files | Project-wide |
| `_icon` SVG helper duplicated verbatim in 3 JS files | `color-strips.js:293`, `automations.js:41`, `kc-targets.js:49` |
| `hexToRgbArray` private to one file, pattern inlined elsewhere | `color-strips.js:471` vs line 1403 |
| Hardcoded English fallback in `showToast` | `color-strips.js:1593` |
| `ColorStripStore.create_source` silently creates wrong type for unknown `source_type` | `color_strip_store.py:92-332` |
| `update_source` clock_id clearing uses undocumented empty-string sentinel | `color_strip_store.py:394-395` |
| `DeviceStore._load` lacks per-item error isolation (unlike all other stores) | `device_store.py:122-138` |
| No unit tests | Zero test files. Highest-risk: `CalibrationConfig`/`PixelMapper` geometry, DDP packets, automation conditions. |

---

## 4. Features & Suggestions

### High Impact / Low Effort

| Suggestion | Details |
|------------|---------|
| **Auto-restart crashed processing loops** | Add backoff-based restart when `_processing_loop` dies. Currently crashes are permanent until manual intervention. |
| **Fire `state_change` on crash** | Add `finally` block in `_processing_loop` to notify the dashboard immediately. |
| **`POST /system/auto-backup/trigger`** | ~5 lines of Python. Manual backup trigger before risky config changes. |
| **`is_healthy` property on streams** | Let target processors detect when their color source has died. |
| **Rotate webhook token endpoint** | `POST /automations/{id}/rotate-webhook-token` — regenerate without recreating automation. |
| **"Start All" targets button** | "Stop All" exists but "Start All" (the more common operation after restart) is missing. |
| **Include auto-backup settings in backup** | Currently lost on restore. |
| **Distinguish "crashed" vs "stopped" in dashboard** | `metrics.last_error` is already populated — just surface it. |

### High Impact / Moderate Effort

| Suggestion | Details |
|------------|---------|
| **Home Assistant MQTT discovery** | Publish auto-discovery payloads so devices appear in HA automatically. MQTT infra already exists. |
| **Device health WebSocket events** | Eliminates 5-30s poll latency for online/offline detection. |
| **`GET /system/store-errors`** | Surface startup deserialization failures to the user. Currently only in logs. |
| **Scene snapshot should capture device brightness** | `software_brightness` is not saved/restored by scenes. |
| **Exponential backoff on events WebSocket reconnect** | Currently fixed 3s retry, generates constant logs during outages. |
| **CSS source import/export** | Share individual sources without full config backup. |
| **Per-target error ring buffer via API** | `GET /targets/{id}/logs` for remote debugging. |
| **DDP socket reconnection** | UDP socket invalidated on network changes; no reconnect path exists. |
| **Adalight serial reconnection** | COM port disconnect crashes the target permanently. |
| **MQTT-controlled brightness and scene activation** | Direct command handler without requiring API key management. |

### Nice-to-Have

| Suggestion | Details |
|------------|---------|
| Configurable metrics history window (currently hardcoded 120 samples / 2 min) | |
| Replace `window.prompt()` API key entry with proper modal | |
| Pattern template live preview (SVG/Canvas) | |
| Keyboard shortcuts for start/stop targets and scene activation | |
| FPS chart auto-scaling y-axis (`Math.max(target*1.15, maxSeen*1.1)`) | |
| WLED native preset target type (send `{"ps": id}` instead of pixels) | |
| Configurable DDP max packet size per device | |
| `GET /system/active-streams` unified runtime snapshot | |
| OpenMetrics / Prometheus endpoint for Grafana integration | |
| Configurable health check intervals (currently hardcoded 10s/60s) | |
| Configurable backup directory path | |
| `GET /system/logs?tail=100&level=ERROR` for in-app log viewing | |
| Device card "currently streaming" badge | |