Backend
- snapshot: GET /api/v1/snapshot aggregates targets, devices, sources,
presets and system into one payload for the HA coordinator, collapsing
the prior ~2N+M request fan-out; per-section ?include= gating.
- graph: GET /api/v1/graph{,/schema,/dependents} backed by a pure,
unit-tested graph_schema engine — one authoritative connectable-field
registry so the editor no longer hard-codes topology in two places.
- devices: thread mqtt_source_id through DeviceCreate/Update/Response and
the routes for multi-broker MQTT; shared validate_mqtt_source_exists
(_mqtt_validation.py) reused by device + output-target routes; stop
update_device masking intentional 4xx as 500.
- shutdown: bound uvicorn graceful-shutdown via GRACEFUL_SHUTDOWN_TIMEOUT
(shared by __main__, android_entry, demo) so a lingering events WebSocket
can't strand LED targets or block process exit.
- access log: structured _access_log middleware attributing each request to
its authenticated token label (never the secret); uvicorn access_log off.
Frontend
- graph editor: generic schema-driven port/edge rendering, layout and
connection handling; service-worker refresh.
- device modals: MQTT broker EntitySelect for device_type=mqtt in add-device
and settings, wired into load/save/validate/dirty-check/clone.
- i18n: en/ru/zh keys.
Tests: graph routes + schema, snapshot routes, access log, mqtt_source_id
device regressions, bounded-shutdown entrypoint. 1614 passed.
WindowsShutdownGuard was binding user32.GetMessageW.argtypes with
POINTER(_MSG) (project-local struct), while PlatformDetector's display-
power monitor binds it with POINTER(wintypes.MSG). argtypes is a
mutable global on the cached WinDLL handle, so whichever module
imported last won, and the other module's byref() then tripped
Python 3.13's strict argtype check with
"expected LP_MSG instance instead of pointer to _MSG".
The two structs are byte-identical (same field types in the same
order, just pt vs pt_x/pt_y naming) and we never touch the pt field,
so aliasing _MSG to wintypes.MSG eliminates the conflict — both
modules now bind the same POINTER class, the writes become idempotent,
and the full test suite passes regardless of import order.
CI runs on Linux so this never fired in release builds, but it broke
the local Windows test run.
Previously, both the per-app override sound dropdown and the main
notification sound row only attached an EntitySelect when at least
one sound asset was registered — and never with allowNone. That left
users with no way to pick "no sound" once an entry existed, and made
the override dropdown silently inert before any assets were added.
Always construct the EntitySelect (so an empty assets list still
renders a usable, searchable input) and pass allowNone with the
localized none label so "no sound" is a first-class choice in both
the override list and the main row.
Design doc capturing the architectural pattern behind the perf-card
flicker (every innerHTML rewrite tears down the subtree, restarting
CSS animations / dropping focus / wasting layout) and the two
drivers — poll timers and server:* push events. Inventories every
remaining latent site in perf-charts.ts / dashboard.ts /
entity-events.ts / game-integration.ts, walks the decision ladder
from a setInnerHtmlIfChanged helper through hand-rolled cell
components to a Lit migration, and lands on Lit for polling-heavy
modules with entity-events.ts tab reconciliation sequenced ahead of
the dashboard cards because of its higher blast radius.
Planning artifact only — no implementation here.
The CSS editor modal collected every other notification field on
save but silently dropped notification_sound and notification_volume,
so toggling them in the modal had no effect on the saved strip.
Include both in the save payload alongside the existing notification
fields.
create-release now creates the release as a draft so users never see
a release page that's missing artifacts (or, worse, missing the
sha256 sidecars that the in-app updater requires). A new
publish-release job runs after create-release, build-windows,
build-linux, and build-docker all succeed, and PATCHes the release
to draft=false in one step. If any build fails, the draft stays
hidden and can be deleted manually.
Regenerate the LedGrab icon family from a single Pillow script
(build/generate_icon.py): a rounded-square aperture traced by a
continuous RGB color-wheel stroke over a vignette canvas with a soft
chromatic bloom. 4x supersampled then downsampled per output for
crispness. Outputs 192/512 standard, 512 maskable (safe-area padded
for PWA round-crops), and a new 256 transparent-background tray
variant so the taskbar icon reads cleanly against light themes
instead of showing a dark tile.
icon.ico now embeds 16/24/32/48/64/128/256 frames sourced from the
transparent tray master, fixing the dark-square halo around the
file/taskbar icon on light Windows themes.
__main__ picks icon-tray.png for the tray and falls back to
icon-192.png when the tray asset isn't present (older bundles /
forks).
Auto-backups now produce a ZIP containing ledgrab.db plus every file
in the assets dir under assets/ — matching the manual
GET /api/v1/system/backup format, so restore accepts either output
interchangeably. Legacy .db backups remain listable, restorable, and
prunable; both extensions count toward max_backups.
Writes stage to <name>.partial then os.replace into place — a crash
mid-ZIP never leaves a half-written backup that masquerades as valid.
Stale .partials from prior crashes are swept on the next run.
Symlinks inside the assets dir are skipped so a hostile link can't
slurp a target outside the dir into every backup. Backups larger than
500 MB log a warning so operators notice unbounded asset growth before
disk fills up.
restart.py: redirect the spawned restart script's stdout/stderr to
restart.log and bail out early if the script is missing — silent
failures (PowerShell off PATH, restart.ps1 erroring) used to vanish
into a detached child with no diagnostic trail.
Tests cover happy path, asset bytes round-trip, partial cleanup,
None/missing assets_dir, failure rollback, stale-partial sweep,
symlink rejection, mixed legacy+new listing, and cross-format prune.
Python 3.13's ctypes tightened argtypes checks and now rejects
mismatched POINTER(MSG) cache entries — each call to POINTER(MSG)
can return a class identity that doesn't match what byref() of an
instance produces, raising "expected LP_MSG instance instead of
pointer to _MSG" inside GetMessageW/TranslateMessage/DispatchMessageW.
Capture POINTER(MSG) once into LPMSG and reuse the same class object
across all three prototypes in both the WindowsShutdownGuard pump and
the PlatformDetector display-power monitor. Restores the 4 failing
win_shutdown tests.
H8 — automations.ts rule-type registry
Convert the two hand-rolled RuleType dispatch ladders into per-type
registries (RULE_FIELD_RENDERERS + RULE_COLLECTORS) keyed by RuleType,
joining the existing RULE_CHIP_RENDERERS. All three are typed
Record<RuleType, ...> for compile-time exhaustiveness; an import-time
_assertRuleHandlerCoverage() check logs loudly if any registry drifts
from RULE_TYPE_KEYS — mirrors the backend's _RULE_HANDLERS shape, the
one intentional divergence being that the frontend logs rather than
throws (a thrown error at module import would brick the whole bundle,
not just the editor).
M7 — shared API client + 35 file migrations
New core/api-client.ts wrapping fetchWithAuth with typed apiGet /
apiPost / apiPut / apiPatch / apiDelete. Auth, 401-relogin, retry,
timeout, and the offline toast all stay owned by fetchWithAuth; the
client just collapses the
if (!resp.ok) { detail || HTTP <status> } ... resp.json()
dance into one typed call. The detail unwrap is hardened to join
FastAPI validation arrays instead of stringifying to [object Object].
35 feature/core files migrated to it across many batches, reviewer-
approved for behaviour parity in three passes covering the riskier
divergences (bulk Promise.allSettled deletes, inline-error saves,
array-detail joins, silent-failure GETs, immutable clones).
9 files remain on fetchWithAuth — the big god-modules tied to the
pending C8/C9/C10 splits (streams, settings, targets, dashboard,
color-strips/index, graph-editor, assets, value-sources) plus
pairing-flow which by design stays on raw fetch (branches on raw
Response.status codes).
i18n — 14 new locale keys (en / ru / zh)
Added save/load/delete error keys across automations, pattern,
audio_processing, audio_template, templates, gradient, target,
device namespaces, plus backfilled gradient.error.delete_failed into
ru/zh. Scan confirms no hardcoded English errorMessage strings
remain in the migrated diff.
AUDIT_REMAINING.md updated to reflect H6, H8, and M7 status.
Verified: tsc --noEmit clean + npm run build clean after every batch.
Convert the 1140-LOC types.ts into a pure re-export barrel backed by
focused per-entity files under types/, joining the existing
bindable.ts. Every import { ... } from '../types.ts' resolves
unchanged; reviewer-confirmed all 102 type exports preserved.
Multi-axis lift to ship-quality after a full review:
Security
- ApiKeyManager: per-install random API key, persisted via SharedPreferences
with synchronous first-write; threaded into uvicorn via the
LEDGRAB_AUTH__API_KEYS env var; embedded in QR as a URL fragment (#k=)
so it never appears in HTTP requests or server logs; frontend reads
location.hash on first visit and strips it via history.replaceState
- Root.runAsRoot(argv: Array<String>) overload with POSIX shell-quoting to
eliminate the shell-injection footgun (= excluded from unquoted-safe set)
- UsbSerialBridge: ContextCompat.RECEIVER_NOT_EXPORTED + intent.package
check in the broadcast receiver for defence-in-depth across API levels
- Release builds refuse to silently fall back to debug keystore; require
ANDROID_KEYSTORE_* env vars or explicit
ANDROID_ALLOW_DEBUG_SIGNED_RELEASE=1
- Crash log retention capped at 10 entries
- Fatal-error stack trace hidden behind a toggle on the error screen
Performance
- ScreenCapture / RootScreenrecord reuse a single RGBA ByteArray per
pipeline instead of allocating per frame — eliminates ~15 MB/s GC churn
at 30 fps on low-end TV boxes
- Frame pacer switched from System.currentTimeMillis() + integer division
(~30.3 fps drift) to SystemClock.elapsedRealtimeNanos with a catch-up
accumulator
- ScreenCapture computes capture dimensions from source aspect ratio so
non-16:9 displays don't get squashed
- RootScreenrecord input pump backs off 5 ms when MediaCodec is starved,
ending a tight spin that burned a CPU core on decoder stalls
- QR cached by URL — onResume from background no longer rebuilds the
560×560 bitmap each time
- ApiKey commit() pre-warmed off Main on app startup
Compatibility
- compileSdk / targetSdk bumped to 35 (Play Store requirement)
- armeabi-v7a build path added to build script + conditionally included
in gradle splits when the matching wheel is present in android/wheels/
- Foreground service type declared as mediaProjection|specialUse with
PROPERTY_SPECIAL_USE_FGS_SUBTYPE rationale; promotion via
ServiceCompat.startForeground with the correct type per mode
- NetworkUtils picks Ethernet > Wi-Fi > VPN > cellular instead of just
activeNetwork — fixes wrong-URL on TV boxes with both Ethernet + Wi-Fi
- enableOnBackInvokedCallback=true for Android 15 predictive-back
- Splash screen API via androidx.core:core-splashscreen — hides Chaquopy
stdlib unpack delay on cold first launch
UI / UX
- All previously hardcoded English strings (root prompt, permission
denial, fatal-error screen, notification text) now localised across
en/ru/zh
- Monochrome notification icon (was a colored launcher → gray blob in
status bar)
- 320×180 TV banner (was the square launcher → squashed on Leanback row)
- ViewStub-based running panel (deferred inflation)
- ObjectAnimator pulse on the Running status dot for liveness feedback
- "Starting…" button state while root is being probed
- Autostart checkbox hidden entirely on unrooted devices
- "No network" status when getLocalIpAddress returns null
- QR fallback hint text
- Animator cancelled in onStop to avoid leaking view hierarchy
Lifecycle hardening (from review)
- RootScreenrecord: processLock serialises EOF respawn vs concurrent
stop() to prevent orphaned screenrecord processes
- CaptureService.restartRootPipeline: publish-before-start under
@Synchronized to close the orphan window during watchdog restarts
- ScreenCapture.MediaProjection.Callback.onStop just flips
running=false instead of calling stop() (which self-joined
captureThread and hung 500 ms)
- updateUI early-returns when lateinit not initialised (fatal-error path)
- Watchdog give-up bound fixed (>= instead of >, was allowing 4 attempts)
server/android_entry.py accepts an optional api_key, sets
LEDGRAB_AUTH__API_KEYS={"android":<key>} as JSON before any LedGrab
import, logs a clear error if pydantic-settings parsing doesn't land
the value back in config (defensive guard against future settings
behaviour drift).
server/static/js/app.ts: bootstrap reads #k= from location.hash,
persists to localStorage, then strips via history.replaceState.
Two independent code-review passes; 147 relevant server tests still
pass; TypeScript and ruff clean.
Database opened its sqlite3 connection eagerly in __init__ and closed it
in close(); the lifespan called close() on shutdown. In production this
is fine — the lifespan runs once per process. Under pytest the module-
level ``db`` singleton survives across every TestClient session, so the
second test file's lifespan startup hit
``sqlite3.ProgrammingError: Cannot operate on a closed database`` at
fixture-setup time (AutoBackupEngine.__init__ → db.get_setting("…")
was the first reader). 65 spurious "errors" on a full Windows pytest run.
- Database: extract _open() from __init__, add ensure_open() that
reopens iff _conn is None, and have close() null _conn after the
TRUNCATE checkpoint so re-close is idempotent.
- main.py lifespan startup: call db.ensure_open() before any setting
read, so subsequent TestClient sessions get a live connection.
- tests/storage/test_database_reopen.py: pin the four invariants —
close→ensure_open round-trips data, ensure_open is a no-op when
open, close is idempotent, and using the DB after close without
ensure_open raises (callers must opt in).
Full backend suite: 1551 pass / 1 skip / 0 errors. Ruff clean.
- dashboard: only destroy/recreate FPS charts for added/removed/detached
targets; skip the history fetch when local samples already exist.
Drops sync-clock `is_running` from the structure signature so toggles
don't trigger a full rebuild; route clock/automation refresh through
the in-place path.
- perf-charts: cache SVG skeleton per spark host and mutate node
attributes instead of rewriting `innerHTML` every poll. Memoize
patches/devices rendering by content signature so unchanged ticks
no longer restart CSS animations. Skip render for env-hidden cards.
- perf-charts: switch `/system/performance` poll to `fetchWithAuth`,
re-read `dashboardPollInterval` per tooltip move, and route the
remaining hardcoded English strings ("no captures", "{n} total",
"{rate} skipped/s", tooltip age, metric labels) through `t()`.
- locales: add `perf.no_captures`, `perf.captures_count`,
`perf.ratio_of_requested`, `perf.total_count`, `perf.skipped_per_sec`,
`perf.tip.now`, `perf.tip.ago` in en/ru/zh.
Mark devices.py PATCH fix, WLED route-level test, IPv6 regression
test, IconSelect XSS audit, PEP-604 sweep, magic-number constants,
api/auth except specificity, and the (window as any) static-access
cleanup as done. Defer items are unchanged: performance items keep
their "profile first" caveat, Hue cert pinning + CSP keep the design-
sensitive note, architecture refactors keep the multi-day banner,
and i18n parity is now annotated with the exact missing-key counts
(328 ru / 325 zh) so the next translator pass has a clear scope.
59 sites across 19 feature modules switched from
`(window as any).foo` to the typed `window.foo` form against
global-types.d.ts. The 7 remaining sites use dynamic string indexing
(`window[fnName]`) where a typed access is impossible — those keep
the narrow cast and are documented as the legitimate exception in
the typedef file's header.
global-types.d.ts grows entries for `loadIntegrations`,
`loadUpdateSettings`, `loadUpdateStatus`, `initUpdateSettingsPanel`,
`onTestDisplaySelected`, `openSettingsModal`, `renderAboutPanel`,
`switchSettingsTab`. The `applyAccentColor` signature is widened to
accept the (accent, persist) call shape observed at the appearance
preset call site so tsc validates the real contract.
ruff --select UP007,UP045 --fix converted ~1760 sites across the
backend: `Optional[T]` → `T | None`, `Union[X, Y]` → `X | Y`. The
remaining module-level alias targets that ruff conservatively skips
(BindableFloatInput, ColorList, DeviceConfig) were converted by hand
earlier in the pass. black -formatted the result so the wider unions
fit cleanly under the 100-char line budget.
pyproject.toml now sets [tool.ruff.lint] extend-select = ["UP007",
"UP045"] so future legacy imports fire CI on every push. The
pre-commit ruff hook was bumped from v0.8.0 -> v0.15.12 to recognise
UP045 (split off from UP007 in v0.13).
The 11 except Exception sites around websocket.send_json and
websocket.close are now except _WS_SEND_BENIGN_EXC — a narrow tuple of
WebSocketDisconnect, RuntimeError, ConnectionError, OSError. Real
programming errors (AttributeError, TypeError) no longer silently
disappear inside the handshake path. The receive_text branch grows a
narrow `(RuntimeError, ConnectionError, OSError)` case plus a final
`except Exception: logger.exception(...)` catch-all so genuinely
unexpected error shapes are recorded with a stack trace instead of
being swallowed.
processed_stream lifts the 30-iteration filter recheck cadence to
_FILTER_RECHECK_EVERY_N_FRAMES with a comment explaining the 30 fps
trade-off. wled_target_processor lifts SKIP_REPOLL, the diagnostics
interval, and the CSPT recheck cadence to module-level
_SKIP_REPOLL_SLEEP_SECONDS, _DIAGNOSTICS_REPORT_INTERVAL_SECONDS, and
_CSPT_RECHECK_EVERY_N_ITERATIONS. Tests can monkeypatch them now.
Every IconSelect caller was audited: each builds item.icon from a
constant ICON_* literal, a lookup-table getter, or
renderDeviceIcon(stored_id) — none of which embed user input today.
The new sanitiseIcon() helper is the belt-and-braces guard for a
future caller that forgets the trusted-SVG contract: reject icon
strings containing <script>/<iframe>/<embed>/<object>/javascript:/on*=
markers, warn to the console, and fall back to empty so the cell still
renders the (escaped) label + desc.
TestWLEDSchemeInference in test_devices_routes covers the POST/PUT
create-and-update flow with a stubbed WLED provider so the
infer_http_scheme integration hop has end-to-end coverage instead of
just the unit tests.
test_url_scheme grows public IPv6 (Cloudflare / Google / Quad9 DNS),
bracketed-form, and ULA cases. Adds an explicit pin for the Python
ipaddress documentation-prefix quirk (2001:db8::/32 is is_private,
so it routes to http:// even though some audits colloquially call it
"public").
When a PATCH omits `url` (rename / icon-only edit), normalized_url
arrived at the processor as None and the manager kept whatever it had
cached — or refused to re-sync if it had nothing. Fall back to
existing.url so the processor is always told the current address.
Surfaced by the production-review backlog.
TODO.md grows the device-support follow-up roadmap. CLAUDE.md trims a
stale section. en/ru/zh locales add the strings used by the new
HTTP-endpoint editor, MiniSelect labels, automations expansion, and
value-source kinds. Ru/zh parity for the older keys is tracked
separately in REVIEW_TODO.md.
events-ws gains an inbound-event allowlist matching the new server-side
allowlist; test_events_ws_parity pins the two lists in sync.
state + storage modules and the streams / integrations /
z2m-light-targets / streams-*-templates editors absorb the
closeIfPristine guard alongside small UX fixes. css-editor template
picks up the new MiniSelect markup for the filter-kind picker.
Bundle the remaining backend touch-ups that the production review
landed individually as small surgical edits across many modules:
- MQTT runtime: fire-and-forget task tracking + drain resilience.
- mqtt_source + store + storage/color_strip_source: secret_box
encryption for credentials with auto-migration of plaintext fields.
- devices/discovery_watcher: task tracking on watcher start/stop.
- devices/wled_client + wled_provider: URL scheme inference helper
applied at the create/update boundary so bare hostnames stay valid.
- core/capture/screen_capture: hardened error paths.
- core/processing (mapped/processed/processor_manager/video/wled_target):
smaller follow-throughs from the registry refactor that landed
earlier on the branch.
- utils/safe_source + utils/file_ops + utils/__init__: shared URL +
IP classification helpers + larger streaming upload size caps.
- api/auth: WebSocket Origin allow-list + /docs auth-gate.
- api/dependencies: register the new HTTP-endpoint store.
- api/routes (assets, backup, webhooks): streaming-upload caps +
asyncio.gather return_exceptions on broadcast loops.
- tests/test_api + tests/e2e/test_backup_flow: cover the new caps and
the Origin allow-list.
update_service grows explicit URL validation on the redirect chain so a
hostile mirror can't bounce the updater to a private IP. restart.ps1
gets stricter argument handling and clearer log lines.
default_config.yaml exposes the new toggles. test_system_routes pins
the new behaviour.
The "static" source kind always rendered a SINGLE color and the name
confused new code paths. Rename the module + kind to "single". Storage
keeps backward-compatible serialisation. Frontend color-strip cards /
gradient / index / test modules and the affected tests follow the new
name.
Storage model + Pydantic schema + route gain fields for the value-source
kinds introduced by the per-type factory refactor. Frontend editor adds
inputs for them and the modal template grows new field rows.
Storage model + Pydantic schema + route surface gain the new rule
shapes the engine already supports. Frontend automations editor grows
the matching inputs. New core/test_automation_engine.py pins the
dispatch table rules behind ~285 lines of unit coverage.
Modal gains closeIfPristine(entityId): when editing an existing entity
and no tracked field has changed, the helper force-closes the modal
silently and returns true so the caller can skip the PUT and the
misleading "updated" toast. Each editor's save handler now early-returns
on the no-op edit path: advanced-calibration, assets,
audio-processing-templates, audio-sources, calibration, devices,
game-integration, ha-light-targets, home-assistant-sources,
mqtt-sources, pattern-templates, scene-presets, sync-clocks, targets,
weather-sources.
MiniSelect replaces the forbidden plain <select> in editors where the
option list isn't large enough to justify the full IconSelect grid. It
shares the IconSelect look-and-feel (chip + dropdown panel, keyboard
nav, search). IconSelect grows an explicit HTML-escape for item labels
and keeps `item.icon` documented as a trusted-SVG sink for callers
that build the icon string from constants. global-types.d.ts gives
existing window.* accesses real types so feature modules can stop
falling back to `(window as any)`. The modal.css additions style the
two selectors and the new dropdown panels.
New output kind that POSTs the current strip frame to a user-configured
HTTP endpoint, alongside WLED / MQTT / Hue. Stack mirrors the existing
output-target shape end-to-end: storage model + store, FastAPI router +
Pydantic schemas, JS feature module + modal template, router wiring in
api/__init__.py and the modal include in index.html. Tests cover both
the routes and the store.
Add .vex.toml so `vex` is the project's primary code-search backend with
auto-update + semantic embeddings enabled. Ignore the .fastembed_cache/
directory that vex creates on first --semantic run. REVIEW_TODO.md
captures items flagged by the multi-agent production review that were
deliberately deferred (multi-day refactors, profile-first perf, and
design-sensitive security work).
10 commits in this branch landed the data-safety bugs (C2, C11), the
worst parallel-change problems (C1/C3/C4/C6/C7), and 5 of the 9 HIGH
audit findings. The remainder splits cleanly into four follow-up
sessions: a frontend sprint (C8/C9/C10/H6/H7/H8/M7-M11/L1), a Device
redesign (H4), a BaseTargetProcessor ABC (C5/H5/M1/M2/L2), and polish
(M3/M6/M12/L3-L5).
This file documents what's left, the recommended ordering, and the
three registry patterns already in the codebase that new contributors
should reach for instead of writing fresh if/elif chains.
``_target_to_response`` in ``api/routes/output_targets.py`` used to be
an isinstance ladder over the three OutputTarget subclasses with a
silent fallback that fabricated a ``LedOutputTargetResponse`` for
unknown types (audit finding H3). The fallback masked exactly the
kind of bug we hit on the CSS side in Phase 1.1: a new target subclass
slipped past the ladder and got mis-shaped on the wire.
Replace the ladder with a ``_TARGET_RESPONSE_BUILDERS`` dict keyed by
the concrete subclass plus an import-time
``_assert_target_response_coverage()`` that requires the registry to
exactly match ``{WledOutputTarget, HALightOutputTarget,
Z2MLightOutputTarget}``. ``_target_to_response`` now raises
``RuntimeError`` instead of silently fabricating a LED response for an
unknown subclass — coverage is asserted at import so this branch is
unreachable in normal operation.
Tests: 5 new regression tests cover bijection between expected classes
and registered builders, callable shape, the rogue-target-raises
contract, and missing/extra entry rejection in the assertion. 24
existing output-target tests stay green; ruff clean.
Two HIGH issues surfaced by review of 3b8f00e:
1. ``_build_game_event`` was newly succeeding where the old store
raised ``ValueError("Invalid source type: game_event")``. The
coverage-assertion-symmetry comment was honest about it being
a path that didn't exist before, but silent broadening of the
create contract is a real behaviour delta — any internal caller
that previously caught the error would now succeed.
Make ``_build_game_event`` raise NotImplementedError. The
coverage assertion still passes (the entry exists), but the
historical "you can't create game_event sources through the
store" contract is preserved. game_event instances continue to
be wired up by the game-integration setup path.
2. The new ``create_source`` ran ``_check_name_unique`` BEFORE
``build_source``. When both ``source_type`` is invalid AND
``name`` collides with an existing source, the old code raised
``"Invalid source type: …"`` first; the new code raised the
name-collision error. Swap the order: build first (which
validates source_type), then check name uniqueness, then
persist. Bonus: a uuid is no longer minted for a source we end
up rejecting on type.
New test pins the game_event NotImplementedError so a future
refactor doesn't accidentally re-open the create path.
38 value-source-store + factory tests stay green; ruff clean.
ValueSourceStore.create_source used to be a ~260-line if/elif chain
over 14 source_type strings; update_source did the same dance again
with 14 isinstance branches (audit finding C7 store-side). Each
branch duplicated the common-fields scaffold and the per-type
defaulting + validation logic.
Lift each per-type create / update body into a free function in a
new ``storage.value_source_factories`` module:
* ``CREATE_BUILDERS[source_type]`` — owns defaulting + per-type
validation (HA needs ha_source_id + entity_id; gradient_map
needs value_source_id; system_metrics validates against
VALID_SYSTEM_METRICS; http rejects interval_s < 1; the two
adaptive_* sub-modes route to the same AdaptiveValueSource
class with different source_type discriminators).
* ``UPDATE_APPLIERS[source_type]`` — mirrors the above on the
update side; ``resolve_ref`` is applied to cross-entity
references so empty-string clears keep working.
* ``build_source(...)`` / ``apply_update(source, **kwargs)`` are
the public entry points the store calls.
* ``_assert_factory_coverage()`` runs at module import and
requires BOTH registries to match storage's _VALUE_SOURCE_MAP
exactly.
The store's ``create_source`` shrinks from ~260 lines to ~25;
``update_source`` from ~200 lines to ~40.
Tests: 14 new tests cover registry coverage in both directions
plus drift assertions, representative builder paths (static /
adaptive_time / adaptive_scene / ha_entity / http / unknown),
the AdaptiveValueSource dual-source-type discriminator, and
several applier paths including ``**_`` swallowing unknown kwargs
and HTTP zero-interval rejection. 47 existing value-source store
tests stay green; 769 storage / core / api tests in aggregate.
Ruff clean.
types.ts is 1159 lines of kitchen-sink discriminated-union definitions
(audit finding H6) shared across ~30 frontend modules. Splitting the
whole file in one pass would need careful per-group TypeScript wrangling
and verification across every entity shape; this commit lands the
first slice as a proof of pattern.
What changed
------------
* New ``static/js/types/bindable.ts`` owns ``BindableFloat`` /
``BindableColor`` plus their four accessor helpers
(``bindableValue``, ``bindableSourceId``, ``bindableColor``,
``bindableColorSourceId``).
* ``static/js/types.ts`` keeps every interface and union shape that
references those primitives, but the primitives themselves now come
from the new file. Re-export keeps every existing ``import { ... }
from '../types.ts'`` site working unchanged.
Why this slice first
--------------------
Bindable types and helpers are the most heavily-imported piece of
types.ts (~30 modules already use ``bindable*`` helpers) and they have
zero downstream dependencies — they don't reference any other type
group. That makes them the safest extraction and the cleanest
demonstration of the barrel-re-export pattern the remaining groups
(devices, sources, integrations, automations, templates, …) will
follow in a follow-up sprint.
Verification
------------
* ``npx tsc --noEmit`` clean (no compile errors anywhere in the
frontend tree).
* ``npm run build`` clean (esbuild bundle and CSS bundle produced
without warnings).
The remaining ~1130 lines of types.ts plus the C8/C9/C10 god-module
splits (value-sources.ts, streams.ts, graph-editor.ts) need a
dedicated frontend session with typescript-reviewer + manual UI
testing — deferring those rather than half-finishing them here.
SystemMetricsValueStream used to dispatch on its ``self._metric`` string
across three independent if/elif chains (audit finding M5):
* priming in ``start()`` (cpu_percent seed, initial network counter);
* raw reading in ``_read_metric_psutil`` plus ``_read_metric_fallback``;
* normalisation in ``_normalize`` (percent / min-max range / max-rate).
Adding a new metric meant editing all three chains plus the Android
fallback — and forgetting one branch made the metric silently return 0.
Lift each per-metric concern into a free function and register them as a
``MetricSpec(name, read_psutil, read_fallback, normalize, prime)`` in a
new ``core.processing.metric_readers`` module. Shared normalisers
(``_norm_percent`` / ``_norm_range`` / ``_norm_rate`` / ``_zero``) live
once. The stream's ``start()`` / ``_read_metric()`` / ``_normalize()``
collapse to a single registry lookup + delegation.
The stream still owns its mutable state (``_disk_path``,
``_sensor_label``, ``_gpu_unavailable``, ``_prev_net_bytes``,
``_prev_net_time``, etc.) — readers operate on the stream by
parameter, not by inheritance, so the kitchen-sink class shrinks by
~140 lines without losing the per-stream cadence bookkeeping. Each
spec function's docstring documents which fields it reads or mutates.
Tests: 16 new tests cover the 10-metric coverage set, callable shape
of every spec field, the three normaliser primitives' clamping +
divide-by-zero behaviour, prime-hook presence (only the three metrics
that need a baseline: cpu_load + network_rx + network_tx), and
fallback-path expectations (desktop-only sensors -> _zero, cpu/ram ->
real MetricsProvider).
754 existing core / storage / api tests stay green; ruff clean.
AutomationEngine._evaluate_rule used to rebuild a 9-entry dispatch
dict on EVERY rule evaluation (audit finding H2). Unknown rule types
silently returned False — adding a new Rule subclass without an entry
just made it inert forever.
Refactor:
* Per-rule-type bodies are now ``_handle_<kind>(self, rule, ctx)``
methods on AutomationEngine.
* A ``_RuleEvalContext`` frozen dataclass bundles all the
cross-cutting state (running_procs, topmost_proc,
topmost_fullscreen, fullscreen_procs, idle_seconds, display_state)
so adding a new handler does not require widening
``_evaluate_rule``'s parameter list.
* ``AutomationEngine._RULE_HANDLERS`` is bound once at module-import
time after the class is defined.
* ``_assert_rule_handler_coverage()`` runs at import: every Rule
subclass imported by the module must have an entry, and entries
keyed by an unknown class are also rejected.
Unknown-type fallback now logs a warning instead of silently returning
False, so a future Rule subclass missing from the registry surfaces in
operator logs rather than just behaving as if the automation were off.
The pure storage layer (storage/automation.py) is untouched — the
handler bodies stay on the engine where the cross-layer dependencies
(MQTT runtime, HA manager, HTTP endpoint store, webhook state) live.
Tests: 4 new tests cover the rule-type/handler bijection, callable
shape, missing-entry rejection, and unknown-class rejection. 44
existing automation engine tests stay green; ruff clean.
PixelMapper and AdvancedPixelMapper in calibration.py used to carry
byte-for-byte copies of two ~80-line numpy kernels (audit finding M4):
* the vectorised average-colour-per-LED path with its cumsum + take
scratch-buffer dance; and
* the per-LED fallback loop for median / dominant colour modes.
Lift both into a new ``core.capture.edge_interpolation`` module exposing
``average_edge_to_leds(edge_pixels, edge_name, led_count, cache,
cache_key)`` and ``fallback_edge_to_leds(edge_pixels, edge_name,
led_count, calc_color)``. The cache parameter is the caller-owned dict
(``self._edge_cache``) so allocations still happen once per
(edge_len, led_count) signature — the difference is that the
boundary-builder, the buffer set, and the inner numpy ops live in
exactly one place.
PixelMapper keys its cache by edge name (``"top"`` / ``"left"`` etc.);
AdvancedPixelMapper keys by line-index int (same dict, no collision).
Both mappers' ``_map_edge_average`` / ``_map_edge_fallback`` shrink to
single delegating lines.
Tests: 9 new kernel-level tests cover uint8 dtype + shape, the cache
reuse / rebuild contract, independent cache keying, a gradient input
producing a monotonic output, the calc_color callable contract for the
fallback path, and segment-position tracking for both axes. 30
existing calibration tests stay green; ruff clean.
EffectColorStripStream._animate_loop used to rebuild a 12-entry dict
``renderers = {"fire": self._render_fire, ...}`` on every frame, then
look up ``renderers.get(self._effect_type, self._render_fire)``. Two
audit smells (H1) at once: per-frame dict-rebuild churn and a silent
fallback to fire whenever ``self._effect_type`` was a typo or any
``_render_*`` method got renamed without updating the dict.
Fix:
* ``@_effect_renderer("fire")`` stamps an attribute on the unbound
method.
* ``@_collect_effect_renderers`` (applied to the class) walks
members at class-creation, gathers the marked ones into
``cls._RENDERERS``, and raises ``RuntimeError`` on duplicate
registration.
The loop now reads ``type(self)._RENDERERS`` once and calls the
unbound method with explicit ``self``. An unknown ``_effect_type``
logs a warning and skips the frame (sleep one frame_time) instead of
silently rendering fire — louder failure mode without crashing the
animation thread.
Tests: 5 new tests cover the 12-effect coverage set, callable shape,
class-level (not per-instance) dict identity, duplicate-name
rejection, and the marker stamp contract.
343 existing processing / storage / API tests stay green; ruff clean.
HALightTargetProcessor and Z2MLightTargetProcessor used to carry
character-for-character identical _swap_color_source method bodies
(audit finding C5) — only the log prefix differed. Extract the body
into a free function ``swap_color_source(processor, new_kind,
new_color_vs_id, *, log_label)`` in a new ``light_target_helpers``
module. Each processor's _swap_color_source now delegates to the helper
and then clears its per-entity history (``_previous_colors`` /
``_previous_on``) — that bit stays on the processor because it's per-
target state, not colour-source state.
Scope deliberately narrower than the full BaseLightTargetProcessor ABC
the audit gestured at: the 76 read sites for the per-processor colour
state across the two files made a full state-composition refactor too
risky for the live LED control loop. The free-function helper is the
minimum-blast-radius way to delete the duplication while leaving WLED
(which has no value-stream-vs-CSS dispatch) untouched.
The helper standardises both warning messages on HA's original wording
("failed to acquire color VS stream" / "failed to re-acquire CSS
stream") so existing log alerts/grep patterns keep working.
A LightTargetSwapState Protocol under TYPE_CHECKING documents the
expected processor surface; no runtime enforcement (acceptable trade-
off vs a 76-site touchpoint).
Tests: 7 new tests cover the release+acquire ordering, the not-running
no-op path, the manager-error-swallowing behaviour, the empty-id
short-circuit, and the missing-manager (TargetContext(None, None))
fallback. 354 existing storage + API + e2e + processing tests stay
green; ruff clean.
Two CRITICAL data-safety bugs from the architecture audit and the two
worst parallel-change problems are fixed in one coherent pass.
Audit findings addressed:
- C2 silent CSS response fallback. The previous _RESPONSE_MAP fell
through to a fabricated PictureCSSResponse whenever a source
class lacked an entry; in particular game_event sources were
silently mis-shaped. Now: GameEventCSSResponse/Create/Update
schemas exist, _RESPONSE_MAP is re-keyed by source_type string,
an import-time _assert_response_map_coverage() requires symmetric
agreement with storage._SOURCE_TYPE_MAP, and the runtime path
raises instead of fabricating a response.
- C11 string-replace JSON migration. ColorStripStore used
blob.replace('"source_type": "static"', '"source_type":
"single_color"') which can corrupt unrelated substrings (e.g.
an animation type named "static_wave") and provides no audit,
no transaction, no idempotency. Replaced with
storage.data_migrations.MigrationRunner backed by a
data_migrations audit table. Each migration runs inside one
db.transaction() that covers the applied-check, the apply(),
and the audit-INSERT — partial failures roll back atomically.
StaticToSingleColorMigration parses each row with json.loads
and mutates only the source_type field. Frozen-write databases
skip with a warning.
- C3+C4 color-strip stream dispatch. The 7-branch elif in
ColorStripStreamManager.acquire() and the duplicate one in
ws_stream._create_stream() now share a single STREAM_BUILDERS
registry in core.processing.color_strip_kinds, keyed by
source.source_type. Both call sites populate a StreamDeps bag
and delegate to build_stream(). _assert_stream_kind_coverage()
asserts at import that STREAM_BUILDERS plus SHARABLE_KINDS
partitions storage._SOURCE_TYPE_MAP. ws_stream's preview path
wraps each FastAPI-DI getter in _safe() so non-audio previews
no longer crash when audio/CSPT stores are not wired.
- C6+C7 value stream dispatch. The 14-branch isinstance ladder in
ValueStreamManager._create_stream and its silent
StaticValueStream(value=1.0) fallback are replaced by
core.processing.value_kinds.STREAM_BUILDERS, keyed by
source_type string (so AdaptiveValueSource's adaptive_time and
adaptive_scene route to different builders correctly). The
manager retains only the SyncClockRuntime pre-acquisition step
for animated_color (kinds needing this are listed explicitly
in NEEDS_CLOCK_RUNTIME). Symmetric coverage assertion plus a
separate assertion that NEEDS_CLOCK_RUNTIME is a subset of the
registry.
Bundled in: the static->single_color rename plus the HTTPValueStream
/ http_endpoint introduction that were already in flight on this
branch share these files; the registry refactor naturally absorbs
both via the new "single_color" / "static" alias entries and the
_build_http builder.
Tests: 26 new tests cover response-map coverage drift, migration
runner audit-table mechanics + transactional rollback +
frozen-write skip, and the two stream-builder registries. 343
existing storage / API / e2e tests stay green. Ruff clean.
Two bugs caused user data ('G502' target's color-strip ref, etc.) to
revert after PC restart while persisting fine across normal app
restarts:
1. SQLite was in WAL mode with synchronous=NORMAL and Database.close()
was never called. On graceful Python exit the sqlite3 finalizer
checkpoints the WAL, but on an unclean PC shutdown (power loss,
forced reboot, or Windows force-terminating pythonw.exe) the WAL
stayed in OS cache, never reached disk, and the next boot rolled the
DB back to the last checkpoint -- losing recent edits.
2. Nothing handled WM_QUERYENDSESSION / WM_ENDSESSION, so on PC
shutdown Windows force-killed pythonw.exe after ~5s and the FastAPI
lifespan never ran. The 'stop_targets' setting was silently ignored
and devices were left at their last frame.
Changes:
- Database: PRAGMA synchronous=FULL + wal_autocheckpoint=100, plus an
explicit wal_checkpoint(TRUNCATE) inside Database.close().
- New utils/win_shutdown.py: hidden top-level window in a daemon thread
with a ctypes WindowProc that catches WM_QUERYENDSESSION (calls
ShutdownBlockReasonCreate to extend Windows' 5s hung-app timeout up
to the ~20s GUI ceiling), fires the shutdown callback, then waits in
WM_ENDSESSION on a completion event before returning. Also raises
the process shutdown priority via SetProcessShutdownParameters. All
Win32 argtypes/restypes are bound once at import to avoid LPARAM
overflow on x64.
- New shutdown_state.py: leaf module owning the cross-thread Event so
__main__ does not import the heavy ledgrab.main at startup.
- main.py lifespan: per-step asyncio.wait_for budgets (8s for
processor_manager.stop_all, 1.5s each for HA/MQTT, etc.) so a hung
device cannot starve the DB checkpoint, then db.close() and
shutdown_complete.set() always run.
- __main__.py: install the Windows shutdown guard before tray start;
install SIGINT/SIGTERM/SIGBREAK handlers only on the tray path
(uvicorn overwrites them on no-tray); raise server_thread.join to 20s.
- Tests cover WM_QUERYENDSESSION (fires callback, returns TRUE,
idempotent), WM_ENDSESSION (waits on event, times out cleanly,
cancel-path returns instantly), signal handler installation, and
that main and shutdown_state share the same Event instance.
The MODIFIED hint in the Customize Dashboard panel was driven by
`presetActive`, recomputed on every save/load via strict deep-equal
against each preset. Any drift between a saved layout and the current
defaults — older app versions that hadn't yet had some new perf cells
added, prior buggy merges that appended new registry keys to the end
of perfCells, or stale `visible` values from intermediate dev builds —
left `presetActive` undefined forever and pinned the panel in MODIFIED
state for users who had not actually edited anything.
Split the two concerns:
- `presetActive` keeps driving the chip highlight (recomputed). When
the layout happens to match a preset exactly the chip lights up.
- New `userModified` boolean drives the MODIFIED indicator. Set to true
only on actual edits through the panel (visibility / density /
ordering / select changes) and on JSON import; cleared by applying a
preset and by Reset.
Legacy saves without the field load as `userModified: false` so the
indicator no longer fires retroactively on data the user never
touched. Also tighten `_mergeWithDefaults` so newly-added registry
keys land at their canonical positions (subsequence detection) when
the saved order is consistent with defaults, which keeps the chip
highlight stable across upgrades.
The inline transport-uptime ticker only repainted on its 1 s setInterval,
so the field could sit on - for up to ~1 s after page load (and much
longer if init's first /health response landed between ticks - the next
seed then had to wait for the 10 s connection-monitor poll). Dispatch a
serverUptimeChanged DOM event when window.__serverUptime is seeded and
let the inline IIFE re-render on receipt, so the value appears as soon
as the response arrives.