A) Per-pixel smart-lights
- LIFX multizone (SetExtendedColorZones msg 510, <=82 zones) + Tile
(SetTileState64 715), auto-detected on connect with single-colour
fallback; lifx_per_zone threaded like nanoleaf_per_panel
- Hue gradient-lightstrip mapping: Entertainment v2 frame now keyed by
channel id (was 1 light=1 LED), channels discovered on connect;
hue_gradient_mode toggle (default on)
B) Integrations bundle
- Outbound webhook automation action (Discord/IFTTT/Zapier/Node-RED),
SSRF-gated via validate_polling_url at both save and fire time; fires
on activate/deactivate, best-effort, audited
- Home Assistant MQTT auto-discovery: read-only binary_sensors per
automation + connectivity, availability via birth/will, cleanup on
disable/delete, live state from the engine
Shared: pixel_reduce.resample_to_n nearest-neighbour helper.
57 new tests (lifx_multizone, hue_segment, webhook_action, ha_discovery).
Gate: ruff + tsc + build clean, pytest 2719 passed / 2 skipped.
Eight roadmap features from the 2026-06-19 review, each a full vertical
(backend + tests + frontend + i18n en/ru/zh); ~67 new unit tests:
- automations: SolarRule sunrise/sunset trigger (new utils/solar.py, shared
with the daylight cycle; window logic mirrors TimeOfDayRule)
- ci: best-effort arm64 multi-arch Docker manifest via QEMU + docker manifest
(release.yml; amd64 path untouched, continue-on-error)
- game-integration: wire the orphaned LoLPoller via a LoLPollManager + a shared
runtime_state module (poll lifecycle on enable/CRUD/startup/shutdown)
- ui: color-harmony gradient generator (complementary/analogous/triadic/...)
- effects: audio-reactive palette modulation (new audio_energy_tap; brightness/
saturation modulation across all 12 procedural effects)
- capture: linear-light blending + spatio-temporal dithering, opt-in per
calibration (new utils/linear_light.py, utils/dither.py)
- devices: Nanoleaf extControl v2 per-panel UDP streaming (per_panel mode)
Also bundles the pending 2026-06-18 production-review fixes and other
in-progress work already in the working tree (manual-trigger rule, etc.),
since they share files and could not be cleanly separated.
Gate: ruff + tsc clean; pytest 2654 passed / 2 skipped. The single failing
test (automation manual_trigger handler coverage) is a separate in-progress
item owned elsewhere, intentionally left as-is.
The /docs, /redoc and /openapi.json routes are gated by AuthRequired, so a
browser can't open them on plain navigation (no way to attach a Bearer token).
Add an opt-in auth.expose_docs flag (default off) that relaxes ONLY those three
routes to anonymous access (loopback + LAN) via a new verify_docs_access
dependency. Every real endpoint stays protected, and a startup WARNING fires
when the flag is on.
- config: AuthConfig.expose_docs: bool = False
- auth: verify_docs_access / DocsAccess dependency
- main: docs routes use DocsAccess; startup warning
- default_config.yaml: documented flag
- tests: docs anonymous when exposed; real endpoints still 401
Final-review blocker: the setup scaffold created the LED output target in the
store but never registered it with the ProcessorManager, so the wizard's
"Start" step 404'd on a fresh setup (target not found) — the lights never
started despite a success screen. Now the scaffold calls
target.register_with_manager(manager) right after create (mirroring the
canonical POST /output-targets route, same ValueError guard), so
start_processing finds the target. Rollback unregisters via
manager.remove_target before deleting the store entity, so a post-registration
failure leaves no half-registered target.
Also from the final review:
- solve corner_indices elements now bounded ge=0 (clear 422 instead of silent
modulo-wrap).
- setup-wizard.ts: reuse tutorials' suppressGettingStartedTour()/TOUR_KEY
instead of a duplicated 'tour_completed' literal; drop a duplicate manual-form
submit listener.
Tests: + adversarial pass over the whole feature (solver/session/scaffold edge
cases) and a scaffold->register->startable regression test. Full suite
2149 passed / 2 skipped; tsc clean; build passes; ruff clean.
The aggregated /api/v1/snapshot poll now emits a `scene_playlists` section
(each playlist with its `is_running` flag) plus a companion `playlist_state`
key carrying the single global cycling state (running playlist, current index/
preset, dwell) — so the HA-coordinator and other low-overhead pollers get
playlist state in the same round trip as scenes/targets, matching the other
entity sections. Gated by the `scene_playlists` include-section like the rest.
Reuses the existing list_scene_playlists handler; snapshot route tests updated.
Backend for the first-run wizard (phase 4).
- POST /api/v1/setup/scaffold: given an existing device_id + display_index
(+ optional calibration), wires a working chain via the real validated
store create paths — create-or-reuse capture template -> raw picture
source -> picture color-strip source (calibration or default) -> LED
output target -> returns the ids. Does NOT auto-start. Rolls back every
entity it created (reverse order) on any partial failure, leaving no
orphans; "created" events are deferred until the whole chain succeeds so
a rolled-back scaffold never leaves ghost cards in the UI.
- Requires an existing device_id (no inline device creation) — the wizard
creates the device first via the canonical, URL-validated POST /devices,
so the scaffold can't bypass device validation. display_index is bounded.
- GET/PUT /api/v1/preferences/onboarding: persistent first-run flag
({onboarded, completed_at}) via db.set_setting; server stamps completed_at.
- Both routes AuthRequired. Tests: 25 (scaffold happy/reuse/rollback/
validation + onboarding + calibration round-trip integration). docs/API.md.
Part of the edge-calibration + first-run-wizard feature (Big Bang; intermediate
phase — full build/suite gated at the final phase).
Backend engine for guided LED-chase calibration, driven by the upcoming
auto-calibration UI (phase 3) and first-run wizard (phase 4).
- solve_calibration(): pure function mapping start corner + direction + 4
corner-tap indices to per-edge LED counts, consistent with EDGE_ORDER/
EDGE_REVERSE so it round-trips through build_segments().
- CalibrationChaseMixin.set_calibration_pixel(): light a specific LED index
(+ optional window) on a device, reusing the device_test_mode idle-client
send path.
- CalibrationSession: single-active session with start/position/stop/cancel,
a 60s idle-timeout watchdog, and a concurrency lock so interleaved calls
can't corrupt the stop/restore bookkeeping — start() stops + remembers any
running target on the device and stop/cancel/timeout always restore it
(never leaves the device dark or stuck in chase).
- Routes /api/v1/calibration/{session,session/position,session/stop,
session/cancel,session/state,solve} (all AuthRequired, bounds-validated);
calibration is persisted by reusing the existing PUT /color-strip-sources/
{id} (hot-reloads running streams) rather than a duplicate endpoint.
- Tests: 19 solver pure-logic + 19 route/bounds. docs/API.md updated.
Part of the edge-calibration + first-run-wizard feature (Big Bang; intermediate
phase — full build/suite gated at the final phase).
Add ordered, timed sequences of scene presets that auto-cycle — activating
each preset and holding it for its dwell duration before advancing.
Backend:
- ScenePlaylist / PlaylistItem models + SQLite store (new scene_playlists table)
- PlaylistEngine: cycles ONE playlist at a time (starting one stops any other),
loop/shuffle, re-reads the playlist each cycle so edits/deletes apply at the
boundary, skips missing presets, guards against busy-loops; reuses the shared
apply_scene_state path used by scene presets and automations
- REST API: CRUD + /start, /stop, /state with scene-preset reference validation
- Constructed in the app lifespan with a bounded stop on shutdown
Frontend:
- New "Playlists" sub-tab in the Automations tab with start/stop controls and a
running indicator; editor modal with ordered scene rows (reorder + per-item
duration), loop/shuffle toggles, and tags
- Live refresh via the playlist_state_changed WebSocket event
- i18n in en/ru/zh
Tests: new unit + API coverage for the store/model, engine (cycling,
single-active exclusivity, missing-preset skip, shuffle, and the
playlist_state_changed event contract), and routes. Full suite green;
ruff and tsc clean.
Sample only a sub-rectangle of the captured frame instead of the whole display,
so a taskbar, game HUD, or letterbox bars don't pollute the border colours — the
first functional gap a reviewer hits (capture was full-display only).
- New pure crop_screen_capture() returns a numpy view (no copy), fast-paths the
full-frame case, and clamps degenerate/out-of-range ROIs to >=1px.
- ROI lives on CalibrationConfig (simple mode) as fractions 0..1 with a has_roi
helper; applied in the picture color-strip stream just before border
extraction, clamping border_width to the cropped size. Additive + backward
compatible (full-frame default, omitted from serialization when unset -> no
migration).
- Round-trips through the calibration schema automatically; frontend adds an
X/Y/Width/Height (%) 'Capture region' group to the calibration editor with
i18n (en/ru/zh).
10 unit tests (crop geometry, view-not-copy, clamping, ROI round-trip, legacy
default); full suite green (1946 passed).
Extend the time-of-day condition from a bare server-local HH:MM window to a real
schedule: pick which weekdays it is active (0=Mon..6=Sun, empty = every day) and
an optional IANA timezone (empty = server local). Closes the parity gap where
even a $5 WLED chip has weekday timers.
- Overnight windows (start > end) count toward the day they START on, so the
after-midnight tail is matched against the previous weekday.
- Timezones are resolved via zoneinfo, cached, and fall back to server-local
with a one-time warning on an invalid name (the ~1Hz tick never log-spams).
- Backward compatible: new fields default to all-days / server-local, so
existing automations are unchanged (no migration).
- Frontend: weekday chips + timezone input on the rule editor, day/timezone in
the rule summary, styles + i18n (en/ru/zh).
10 unit tests (weekday filter, overnight start-day semantics, tz fallback,
round-trip, invalid-day filtering); full suite green (1936 passed).
(Geographic sunrise/sunset triggers are a natural follow-up — the daylight
value source already has the solar math to reuse.)
Seed five curated, read-only post-processing templates so a non-expert gets
instant good-looking output before discovering the filter pipeline. Each is an
opinionated chain of existing filters (auto-crop/saturation/contrast/colour-
temperature/temporal-blur) tuned for a use case (films, games, evening ambience,
low-flicker, crisp cool-white).
Mirrors the built-in-gradient pattern: adds is_builtin to PostprocessingTemplate,
seeds missing looks on store init (idempotent, additive — no migration), and
makes built-ins read-only (update/delete raise -> 400; clone to customise).
Surfaced via the existing template picker + is_builtin in the response/type.
7 unit tests (seeding, idempotency, read-only protection, round-trip); full
suite green (1926 passed). (A runtime intensity slider is a follow-up — it needs
a filter-chain parameterisation layer.)
Add WLED's native realtime UDP protocol (port 21324) as a third output mode for
LED targets, alongside DDP and HTTP. For the device LedGrab drives most, this
brings three user-visible wins DDP lacks:
- Auto-revert: every packet carries a timeout byte, so if the stream stops
(host hiccup/sleep/crash) WLED returns to its preset instead of freezing on
the last frame.
- Correct RGBW whites: the DRGBW variant carries an explicit white channel.
- Lighter on weak Wi-Fi: raw RGB with a 2-byte header.
New WledRealtimeClient auto-selects DRGB (<=490), DRGBW (<=367), or chunked
DNRGB (>490). WLED applies its own per-bus colour order in realtime mode, so we
send plain RGB and the user's colour-order config just works. Protocol 'udp' is
threaded through WLEDConfig/provider/processor and the schema pattern; the
target editor gains a protocol option + badge + i18n (en/ru/zh).
8 unit tests for the packet builder; full suite green (1919 passed).
Cap an addressable strip's estimated current draw to a PSU budget so bright/
white scenes can't brown out an under-spec'd supply (voltage sag -> red/orange
shift, flicker, controller resets) — a classic 'it's broken' first impression.
- New core/processing/power_limit.py: pure current estimate (full white over N
LEDs draws N * mA_per_led) + a (0,1] scale to land a frame on budget.
- Applied in WledTargetProcessor._send_to_device (single choke point, every send
path; scales into a reusable scratch buffer, never mutates shared frames).
- Two per-target fields on LED targets: max_milliamps (0 = unlimited) and
milliamps_per_led (default 55), threaded through model/store/manager/processor/
schema/route with hot-update via update_target_settings. Additive with safe
defaults (no data migration needed; legacy targets read as unlimited).
- Frontend: editor fields + i18n (en/ru/zh) + LedOutputTarget type.
- Tests: 10 unit tests for the estimator/scale; full suite green (1911 passed).
Make the existing Application automation rule (foreground app -> activate
scene) work on the Android-TV build. A Kotlin ForegroundAppBridge reads the
foreground app via UsageStatsManager and lists launchable apps via LauncherApps;
PlatformDetector bridges it in (ahead of the Windows-only ctypes guard) so the
existing AutomationEngine / ApplicationRule / storage / deactivation modes are
unchanged. New /system/installed-apps + /system/info endpoints feed an app picker
that stores package names (vs process names on desktop); on Android the editor
hides the match-type selector since the foreground app is the only obtainable
signal. PACKAGE_USAGE_STATS is granted via an on-device button + a web-UI banner
(no blanket prompt at capture start); detection degrades gracefully until granted.
Zero new Python/Gradle deps (UsageStatsManager + LauncherApps are in-platform;
matching only string-compares the package name, so no QUERY_ALL_PACKAGES).
assembleDebug + 1897 pytest + ruff + tsc + npm build all green; independent final
review (0 blockers) + security review (no critical issues).
Add on-device webcam capture to the experimental Android-TV build. Desktop
captures webcams via OpenCV (no Chaquopy/Android wheel); this adds a push-based
AndroidCameraEngine that plugs into the same selection path desktop uses
(capture template engine_type="android_camera" + display_index, HAS_OWN_DISPLAYS).
A Kotlin CameraBridge (Camera2) enumerates cameras and opens them on demand —
only while a capture source is active, driven Python->Kotlin via a guarded jclass
singleton (BleBridge pattern) — converts each frame YUV_420_888->RGB, and pushes
RGB bytes into a module-level queue mirroring mediaprojection_engine.py. Cameras
surface as selectable displays like the desktop OpenCV engine; the data-driven
capture-template UI is unchanged. No new Python deps; no new Gradle deps
(Camera2 is in-platform).
Engine: ENGINE_PRIORITY=0 (never auto-selected over MediaProjection=100; explicit
engine_type only). Single-camera ownership is serialized with a lock + ref-count
(same-camera streams attach, different-camera refused, last release stops),
mirroring the desktop CameraEngine guard.
Permission: CAMERA requested at capture-start, gated on FEATURE_CAMERA_ANY so
camera-less TV boxes never prompt; graceful degradation when denied. The service
is promoted with the camera FGS type (+ FOREGROUND_SERVICE_CAMERA) only when
CAMERA is already granted, so backgrounded capture keeps working without risking
a failed startForeground on camera-less boxes (camera can't ride the
MediaProjection token the way audio playback capture does).
Reviewed via multi-agent adversarial pass (13 findings -> 4 fixed: device leak on
session-failure, multi-stream collision, camera FGS type, i18n key; 9 refuted).
Tests: 18 new desktop-CI tests (no device needed); full suite 1883 passed.
Verified: assembleDebug BUILD SUCCESSFUL, ruff clean.
Docs: ANDROID-REVIEW/android-webcam-capture-plan.md (design), updated
android-missing-functionality.md + README feature table + en/ru/zh locales.
Add an Android backend to os_notification_listener.py so notifications on the
experimental Android-TV build drive the existing NotificationColorStripSource
LED effects (flash/pulse/sweep, per-app colors + sounds) at app-name parity
with the Windows/Linux backends.
A Kotlin NotificationListenerService forwards the posting app's display label
across the Chaquopy JNI boundary into a new push-based _AndroidBackend +
module-level push_notification() receiver; the existing color-strip pipeline,
per-app colors/filters, and history endpoint are reused unchanged.
- Python: _AndroidBackend (probed first), push_notification() receiver,
_LinuxBackend.probe() hardened with is_linux() to exclude Android (which
also reports platform.system() == "Linux").
- Android: LedGrabNotificationListener NLS — serial single-thread executor,
full crash isolation around Python.getInstance(), label-only forwarding
(never notification title/body), ongoing/group-summary/self-package noise
filtering. Manifest service exported + gated by
BIND_NOTIFICATION_LISTENER_SERVICE (no new uses-permission).
- UX: prompt-once notification-access + manual "Grant notification access"
button wired into the D-pad focus chain (computed from visible controls);
en/ru/zh strings.
- Tests: 11 isolated unit tests — module-global + tmp_path history isolation,
push routing contract, callback-exception swallowing, None app-name, and a
desktop-regression lock on backend selection order.
- Docs: README OS-support Android column (notification + audio cells),
ANDROID-REVIEW status flipped to Implemented.
Zero new Python deps; no build.gradle.kts / Chaquopy pip changes.
Enable audio-reactive lighting on the Android-TV build. A push-based
AndroidAudioEngine captures system playback audio via AudioPlaybackCapture
(API 29+), reusing the existing MediaProjection token, and feeds PCM into
the unchanged AudioAnalyzer pipeline. No new Python deps; no Chaquopy/pip
changes (numpy already bundled).
- Python: android_audio_engine.py — module-level queue + configure/
push_samples/shutdown mirroring mediaprojection_engine; AndroidAudioEngine
(priority 100) registered behind a guarded import. push_samples copies and
defensively trims/clamps each block so the analyzer can't crash on
variable-length or non-frame-divisible PCM.
- Kotlin: AudioCapture.kt — AudioRecord + AudioPlaybackCaptureConfiguration,
fixed chunk-size block framing, little-endian float32, mic fallback;
reads back the actual negotiated channel/sample rate. PythonBridge gains
configureAudio/pushAudio/shutdownAudio with a cached module handle.
- Wiring: CaptureService starts/stops AudioCapture in the MediaProjection
path (gated on API>=29 + RECORD_AUDIO + live projection); MainActivity
requests RECORD_AUDIO; manifest declares it. Degrades gracefully when
denied; root path stays audio-less by design.
- Tests: 13 desktop-CI tests incl. an over-length/non-divisible regression
guard that exercises the full read_chunk -> AudioAnalyzer.analyze path.
Add an opt-out `normalize` flag to the four magnitude value sources
(ha_entity, http, system_metrics, game_event). get_value() stays in
[0,1] for every source (the normalized scalar-bus invariant), so all
existing consumers — brightness sinks, gradient_map, template `name`
bindings, and color-strip bindable floats — are unaffected.
- normalize=False is a clamp-passthrough: skip the min/max rescale and
clamp the raw reading into [0,1] (for sources already reporting a
0..1 fraction). The un-normalized magnitude stays available via
get_raw_value() / template raw[name] / automations.
- Add get_raw_value() + a raw channel to GameEventValueStream (it had
none). game_event's flag is model/stream-level only (no CRUD schema).
- Finite-safe clamp01() util; harden the composite-layer brightness
multiply (latent negative-wrap / >=1 skip) with it.
- Preview WebSocket: tolerate non-numeric raw, generalize raw-range.
- Frontend: settings-toggle slider per HA/system_metrics/http editor
with min/max grey-out; toggle hidden for fixed-mapping percent
metrics. en/ru/zh locale keys.
- Additive optional field (default True) — JSON round-trip, no migration.
Tests: store create/update round-trip, clamp-passthrough, live
normalize flip, game_event raw channel + build_stream forwarding,
and finite-safe clamp01.
A new `template` value source evaluates a hardened, sandboxed Jinja
expression over the live values of other value sources — the system's
first float combinator.
Backend:
- Shared engine (utils/template_expr.py): ImmutableSandboxedEnvironment with
filters/tests and auto-injected globals stripped; only min/max/abs/round/
clamp exposed; rejects **, string/collection-literal repetition, attribute
access and non-global calls; NaN/inf-safe result coercion.
- TemplateValueSource model + TemplateValueStream runtime: compile-once,
primitives-only eval context, raw[name] exposure, eval_interval throttle,
ref-counted input acquire/release, rename-safe hot-update.
- Validation: unbound-variable + reserved-name rejection, reference
cycle/depth guards (depth-only at create, full cycle at update), runtime
acquire() depth backstop, and delete referential-integrity.
- API: Create/Update/Response schemas + discriminated unions, _RESPONSE_MAP,
and an advisory POST /value-sources/validate-template endpoint.
- Demo seed: a static source plus a template combinator example.
Frontend:
- Editor modal section: repeatable inputs list (EntitySelect rows), a
zero-dependency Jinja syntax highlighter, a hints/reference panel, and a
debounced live validator that gates Save (stale-response-safe).
- Graph editor: read-only template node with one edge per input.
- i18n (en/ru/zh), icon, and card rendering.
Tests: engine, stream, factory/cycle, validate endpoint, and demo seed.
Follow-up polish from the duplicate-subgraph review:
- BaseSqliteStore.clone() now requires an opt-in `_cloneable` flag (default
off), so a new or secret-bearing store can never be cloned by accident —
defence-in-depth on top of the endpoint's `_DUPLICABLE_KINDS` restriction.
Only value-source and colour-strip stores are flagged.
- Fix `_check_name_unique` annotation (`str | None = None`).
- Add tests: `layers[].brightness_source_id` remap, the warnings safety net
firing, the clone allowlist invariant, and clone() refusing a non-cloneable
store.
A secret-safe equivalent of blueprint import: the graph editor's overflow menu
gains "Duplicate selection", which deep-clones the selected value and
colour-strip sources server-side (full config preserved, never crossing the
wire) and rewires references that point within the selection — shared
dependencies (devices, HA sources, …) stay shared.
- graph_schema.remap_refs: write-twin of extract_refs (same dot/list/bindable
grammar) that rewrites only in-selection ids; 8 unit tests.
- BaseSqliteStore.clone(): faithful deep-copy clone (no schema round-trip, so no
field is lost), prefix-preserving fresh id; reusable by any store.
- POST /api/v1/graph/duplicate: two-pass clone-then-rewire restricted to value /
colour-strip sources (no inline secrets), with a safety net flagging any
unremapped reference; 7 integration tests vs real stores.
- Frontend: duplicateSubgraph (+cache invalidation), graphDuplicateSelection
(reload + reselect the new cluster), overflow-menu item, i18n (en/ru/zh).
Lets users wire the system end-to-end from the graph, and fixes the core
bug that made drag-to-wire silently fail.
- Fix drag-to-wire 422s across 5 entity kinds: updateConnection() now echoes
the target's discriminator (source_type/stream_type/target_type) into the
partial PUT, so value/colour-strip/audio/picture sources and output targets
all wire correctly. New contract test (54 cases) in test_graph_wiring_contract.py.
- Re-wire composite layers / mapped zones from the graph (right-click a
layer/zone source edge -> Re-wire). Whole-list write preserves every sibling
layer/zone setting, with an optimistic-concurrency guard and undo.
- Secret-safe /graph topology: project entities to id/name/subtype + reference
roots so the endpoint cannot leak webhook tokens or other credentials.
- Carry slot indices on list edges; node custom-icon + schema-drift refinements;
rewire i18n keys (en/ru/zh); wiring-control roadmap (TODO.md).
Backend
- snapshot: GET /api/v1/snapshot aggregates targets, devices, sources,
presets and system into one payload for the HA coordinator, collapsing
the prior ~2N+M request fan-out; per-section ?include= gating.
- graph: GET /api/v1/graph{,/schema,/dependents} backed by a pure,
unit-tested graph_schema engine — one authoritative connectable-field
registry so the editor no longer hard-codes topology in two places.
- devices: thread mqtt_source_id through DeviceCreate/Update/Response and
the routes for multi-broker MQTT; shared validate_mqtt_source_exists
(_mqtt_validation.py) reused by device + output-target routes; stop
update_device masking intentional 4xx as 500.
- shutdown: bound uvicorn graceful-shutdown via GRACEFUL_SHUTDOWN_TIMEOUT
(shared by __main__, android_entry, demo) so a lingering events WebSocket
can't strand LED targets or block process exit.
- access log: structured _access_log middleware attributing each request to
its authenticated token label (never the secret); uvicorn access_log off.
Frontend
- graph editor: generic schema-driven port/edge rendering, layout and
connection handling; service-worker refresh.
- device modals: MQTT broker EntitySelect for device_type=mqtt in add-device
and settings, wired into load/save/validate/dirty-check/clone.
- i18n: en/ru/zh keys.
Tests: graph routes + schema, snapshot routes, access log, mqtt_source_id
device regressions, bounded-shutdown entrypoint. 1614 passed.
Auto-backups now produce a ZIP containing ledgrab.db plus every file
in the assets dir under assets/ — matching the manual
GET /api/v1/system/backup format, so restore accepts either output
interchangeably. Legacy .db backups remain listable, restorable, and
prunable; both extensions count toward max_backups.
Writes stage to <name>.partial then os.replace into place — a crash
mid-ZIP never leaves a half-written backup that masquerades as valid.
Stale .partials from prior crashes are swept on the next run.
Symlinks inside the assets dir are skipped so a hostile link can't
slurp a target outside the dir into every backup. Backups larger than
500 MB log a warning so operators notice unbounded asset growth before
disk fills up.
restart.py: redirect the spawned restart script's stdout/stderr to
restart.log and bail out early if the script is missing — silent
failures (PowerShell off PATH, restart.ps1 erroring) used to vanish
into a detached child with no diagnostic trail.
Tests cover happy path, asset bytes round-trip, partial cleanup,
None/missing assets_dir, failure rollback, stale-partial sweep,
symlink rejection, mixed legacy+new listing, and cross-format prune.
Database opened its sqlite3 connection eagerly in __init__ and closed it
in close(); the lifespan called close() on shutdown. In production this
is fine — the lifespan runs once per process. Under pytest the module-
level ``db`` singleton survives across every TestClient session, so the
second test file's lifespan startup hit
``sqlite3.ProgrammingError: Cannot operate on a closed database`` at
fixture-setup time (AutoBackupEngine.__init__ → db.get_setting("…")
was the first reader). 65 spurious "errors" on a full Windows pytest run.
- Database: extract _open() from __init__, add ensure_open() that
reopens iff _conn is None, and have close() null _conn after the
TRUNCATE checkpoint so re-close is idempotent.
- main.py lifespan startup: call db.ensure_open() before any setting
read, so subsequent TestClient sessions get a live connection.
- tests/storage/test_database_reopen.py: pin the four invariants —
close→ensure_open round-trips data, ensure_open is a no-op when
open, close is idempotent, and using the DB after close without
ensure_open raises (callers must opt in).
Full backend suite: 1551 pass / 1 skip / 0 errors. Ruff clean.
ruff --select UP007,UP045 --fix converted ~1760 sites across the
backend: `Optional[T]` → `T | None`, `Union[X, Y]` → `X | Y`. The
remaining module-level alias targets that ruff conservatively skips
(BindableFloatInput, ColorList, DeviceConfig) were converted by hand
earlier in the pass. black -formatted the result so the wider unions
fit cleanly under the 100-char line budget.
pyproject.toml now sets [tool.ruff.lint] extend-select = ["UP007",
"UP045"] so future legacy imports fire CI on every push. The
pre-commit ruff hook was bumped from v0.8.0 -> v0.15.12 to recognise
UP045 (split off from UP007 in v0.13).
TestWLEDSchemeInference in test_devices_routes covers the POST/PUT
create-and-update flow with a stubbed WLED provider so the
infer_http_scheme integration hop has end-to-end coverage instead of
just the unit tests.
test_url_scheme grows public IPv6 (Cloudflare / Google / Quad9 DNS),
bracketed-form, and ULA cases. Adds an explicit pin for the Python
ipaddress documentation-prefix quirk (2001:db8::/32 is is_private,
so it routes to http:// even though some audits colloquially call it
"public").
events-ws gains an inbound-event allowlist matching the new server-side
allowlist; test_events_ws_parity pins the two lists in sync.
state + storage modules and the streams / integrations /
z2m-light-targets / streams-*-templates editors absorb the
closeIfPristine guard alongside small UX fixes. css-editor template
picks up the new MiniSelect markup for the filter-kind picker.
Bundle the remaining backend touch-ups that the production review
landed individually as small surgical edits across many modules:
- MQTT runtime: fire-and-forget task tracking + drain resilience.
- mqtt_source + store + storage/color_strip_source: secret_box
encryption for credentials with auto-migration of plaintext fields.
- devices/discovery_watcher: task tracking on watcher start/stop.
- devices/wled_client + wled_provider: URL scheme inference helper
applied at the create/update boundary so bare hostnames stay valid.
- core/capture/screen_capture: hardened error paths.
- core/processing (mapped/processed/processor_manager/video/wled_target):
smaller follow-throughs from the registry refactor that landed
earlier on the branch.
- utils/safe_source + utils/file_ops + utils/__init__: shared URL +
IP classification helpers + larger streaming upload size caps.
- api/auth: WebSocket Origin allow-list + /docs auth-gate.
- api/dependencies: register the new HTTP-endpoint store.
- api/routes (assets, backup, webhooks): streaming-upload caps +
asyncio.gather return_exceptions on broadcast loops.
- tests/test_api + tests/e2e/test_backup_flow: cover the new caps and
the Origin allow-list.
update_service grows explicit URL validation on the redirect chain so a
hostile mirror can't bounce the updater to a private IP. restart.ps1
gets stricter argument handling and clearer log lines.
default_config.yaml exposes the new toggles. test_system_routes pins
the new behaviour.
The "static" source kind always rendered a SINGLE color and the name
confused new code paths. Rename the module + kind to "single". Storage
keeps backward-compatible serialisation. Frontend color-strip cards /
gradient / index / test modules and the affected tests follow the new
name.
Storage model + Pydantic schema + route surface gain the new rule
shapes the engine already supports. Frontend automations editor grows
the matching inputs. New core/test_automation_engine.py pins the
dispatch table rules behind ~285 lines of unit coverage.
New output kind that POSTs the current strip frame to a user-configured
HTTP endpoint, alongside WLED / MQTT / Hue. Stack mirrors the existing
output-target shape end-to-end: storage model + store, FastAPI router +
Pydantic schemas, JS feature module + modal template, router wiring in
api/__init__.py and the modal include in index.html. Tests cover both
the routes and the store.
``_target_to_response`` in ``api/routes/output_targets.py`` used to be
an isinstance ladder over the three OutputTarget subclasses with a
silent fallback that fabricated a ``LedOutputTargetResponse`` for
unknown types (audit finding H3). The fallback masked exactly the
kind of bug we hit on the CSS side in Phase 1.1: a new target subclass
slipped past the ladder and got mis-shaped on the wire.
Replace the ladder with a ``_TARGET_RESPONSE_BUILDERS`` dict keyed by
the concrete subclass plus an import-time
``_assert_target_response_coverage()`` that requires the registry to
exactly match ``{WledOutputTarget, HALightOutputTarget,
Z2MLightOutputTarget}``. ``_target_to_response`` now raises
``RuntimeError`` instead of silently fabricating a LED response for an
unknown subclass — coverage is asserted at import so this branch is
unreachable in normal operation.
Tests: 5 new regression tests cover bijection between expected classes
and registered builders, callable shape, the rogue-target-raises
contract, and missing/extra entry rejection in the assertion. 24
existing output-target tests stay green; ruff clean.
Two HIGH issues surfaced by review of 3b8f00e:
1. ``_build_game_event`` was newly succeeding where the old store
raised ``ValueError("Invalid source type: game_event")``. The
coverage-assertion-symmetry comment was honest about it being
a path that didn't exist before, but silent broadening of the
create contract is a real behaviour delta — any internal caller
that previously caught the error would now succeed.
Make ``_build_game_event`` raise NotImplementedError. The
coverage assertion still passes (the entry exists), but the
historical "you can't create game_event sources through the
store" contract is preserved. game_event instances continue to
be wired up by the game-integration setup path.
2. The new ``create_source`` ran ``_check_name_unique`` BEFORE
``build_source``. When both ``source_type`` is invalid AND
``name`` collides with an existing source, the old code raised
``"Invalid source type: …"`` first; the new code raised the
name-collision error. Swap the order: build first (which
validates source_type), then check name uniqueness, then
persist. Bonus: a uuid is no longer minted for a source we end
up rejecting on type.
New test pins the game_event NotImplementedError so a future
refactor doesn't accidentally re-open the create path.
38 value-source-store + factory tests stay green; ruff clean.
ValueSourceStore.create_source used to be a ~260-line if/elif chain
over 14 source_type strings; update_source did the same dance again
with 14 isinstance branches (audit finding C7 store-side). Each
branch duplicated the common-fields scaffold and the per-type
defaulting + validation logic.
Lift each per-type create / update body into a free function in a
new ``storage.value_source_factories`` module:
* ``CREATE_BUILDERS[source_type]`` — owns defaulting + per-type
validation (HA needs ha_source_id + entity_id; gradient_map
needs value_source_id; system_metrics validates against
VALID_SYSTEM_METRICS; http rejects interval_s < 1; the two
adaptive_* sub-modes route to the same AdaptiveValueSource
class with different source_type discriminators).
* ``UPDATE_APPLIERS[source_type]`` — mirrors the above on the
update side; ``resolve_ref`` is applied to cross-entity
references so empty-string clears keep working.
* ``build_source(...)`` / ``apply_update(source, **kwargs)`` are
the public entry points the store calls.
* ``_assert_factory_coverage()`` runs at module import and
requires BOTH registries to match storage's _VALUE_SOURCE_MAP
exactly.
The store's ``create_source`` shrinks from ~260 lines to ~25;
``update_source`` from ~200 lines to ~40.
Tests: 14 new tests cover registry coverage in both directions
plus drift assertions, representative builder paths (static /
adaptive_time / adaptive_scene / ha_entity / http / unknown),
the AdaptiveValueSource dual-source-type discriminator, and
several applier paths including ``**_`` swallowing unknown kwargs
and HTTP zero-interval rejection. 47 existing value-source store
tests stay green; 769 storage / core / api tests in aggregate.
Ruff clean.
SystemMetricsValueStream used to dispatch on its ``self._metric`` string
across three independent if/elif chains (audit finding M5):
* priming in ``start()`` (cpu_percent seed, initial network counter);
* raw reading in ``_read_metric_psutil`` plus ``_read_metric_fallback``;
* normalisation in ``_normalize`` (percent / min-max range / max-rate).
Adding a new metric meant editing all three chains plus the Android
fallback — and forgetting one branch made the metric silently return 0.
Lift each per-metric concern into a free function and register them as a
``MetricSpec(name, read_psutil, read_fallback, normalize, prime)`` in a
new ``core.processing.metric_readers`` module. Shared normalisers
(``_norm_percent`` / ``_norm_range`` / ``_norm_rate`` / ``_zero``) live
once. The stream's ``start()`` / ``_read_metric()`` / ``_normalize()``
collapse to a single registry lookup + delegation.
The stream still owns its mutable state (``_disk_path``,
``_sensor_label``, ``_gpu_unavailable``, ``_prev_net_bytes``,
``_prev_net_time``, etc.) — readers operate on the stream by
parameter, not by inheritance, so the kitchen-sink class shrinks by
~140 lines without losing the per-stream cadence bookkeeping. Each
spec function's docstring documents which fields it reads or mutates.
Tests: 16 new tests cover the 10-metric coverage set, callable shape
of every spec field, the three normaliser primitives' clamping +
divide-by-zero behaviour, prime-hook presence (only the three metrics
that need a baseline: cpu_load + network_rx + network_tx), and
fallback-path expectations (desktop-only sensors -> _zero, cpu/ram ->
real MetricsProvider).
754 existing core / storage / api tests stay green; ruff clean.
AutomationEngine._evaluate_rule used to rebuild a 9-entry dispatch
dict on EVERY rule evaluation (audit finding H2). Unknown rule types
silently returned False — adding a new Rule subclass without an entry
just made it inert forever.
Refactor:
* Per-rule-type bodies are now ``_handle_<kind>(self, rule, ctx)``
methods on AutomationEngine.
* A ``_RuleEvalContext`` frozen dataclass bundles all the
cross-cutting state (running_procs, topmost_proc,
topmost_fullscreen, fullscreen_procs, idle_seconds, display_state)
so adding a new handler does not require widening
``_evaluate_rule``'s parameter list.
* ``AutomationEngine._RULE_HANDLERS`` is bound once at module-import
time after the class is defined.
* ``_assert_rule_handler_coverage()`` runs at import: every Rule
subclass imported by the module must have an entry, and entries
keyed by an unknown class are also rejected.
Unknown-type fallback now logs a warning instead of silently returning
False, so a future Rule subclass missing from the registry surfaces in
operator logs rather than just behaving as if the automation were off.
The pure storage layer (storage/automation.py) is untouched — the
handler bodies stay on the engine where the cross-layer dependencies
(MQTT runtime, HA manager, HTTP endpoint store, webhook state) live.
Tests: 4 new tests cover the rule-type/handler bijection, callable
shape, missing-entry rejection, and unknown-class rejection. 44
existing automation engine tests stay green; ruff clean.
PixelMapper and AdvancedPixelMapper in calibration.py used to carry
byte-for-byte copies of two ~80-line numpy kernels (audit finding M4):
* the vectorised average-colour-per-LED path with its cumsum + take
scratch-buffer dance; and
* the per-LED fallback loop for median / dominant colour modes.
Lift both into a new ``core.capture.edge_interpolation`` module exposing
``average_edge_to_leds(edge_pixels, edge_name, led_count, cache,
cache_key)`` and ``fallback_edge_to_leds(edge_pixels, edge_name,
led_count, calc_color)``. The cache parameter is the caller-owned dict
(``self._edge_cache``) so allocations still happen once per
(edge_len, led_count) signature — the difference is that the
boundary-builder, the buffer set, and the inner numpy ops live in
exactly one place.
PixelMapper keys its cache by edge name (``"top"`` / ``"left"`` etc.);
AdvancedPixelMapper keys by line-index int (same dict, no collision).
Both mappers' ``_map_edge_average`` / ``_map_edge_fallback`` shrink to
single delegating lines.
Tests: 9 new kernel-level tests cover uint8 dtype + shape, the cache
reuse / rebuild contract, independent cache keying, a gradient input
producing a monotonic output, the calc_color callable contract for the
fallback path, and segment-position tracking for both axes. 30
existing calibration tests stay green; ruff clean.
EffectColorStripStream._animate_loop used to rebuild a 12-entry dict
``renderers = {"fire": self._render_fire, ...}`` on every frame, then
look up ``renderers.get(self._effect_type, self._render_fire)``. Two
audit smells (H1) at once: per-frame dict-rebuild churn and a silent
fallback to fire whenever ``self._effect_type`` was a typo or any
``_render_*`` method got renamed without updating the dict.
Fix:
* ``@_effect_renderer("fire")`` stamps an attribute on the unbound
method.
* ``@_collect_effect_renderers`` (applied to the class) walks
members at class-creation, gathers the marked ones into
``cls._RENDERERS``, and raises ``RuntimeError`` on duplicate
registration.
The loop now reads ``type(self)._RENDERERS`` once and calls the
unbound method with explicit ``self``. An unknown ``_effect_type``
logs a warning and skips the frame (sleep one frame_time) instead of
silently rendering fire — louder failure mode without crashing the
animation thread.
Tests: 5 new tests cover the 12-effect coverage set, callable shape,
class-level (not per-instance) dict identity, duplicate-name
rejection, and the marker stamp contract.
343 existing processing / storage / API tests stay green; ruff clean.
HALightTargetProcessor and Z2MLightTargetProcessor used to carry
character-for-character identical _swap_color_source method bodies
(audit finding C5) — only the log prefix differed. Extract the body
into a free function ``swap_color_source(processor, new_kind,
new_color_vs_id, *, log_label)`` in a new ``light_target_helpers``
module. Each processor's _swap_color_source now delegates to the helper
and then clears its per-entity history (``_previous_colors`` /
``_previous_on``) — that bit stays on the processor because it's per-
target state, not colour-source state.
Scope deliberately narrower than the full BaseLightTargetProcessor ABC
the audit gestured at: the 76 read sites for the per-processor colour
state across the two files made a full state-composition refactor too
risky for the live LED control loop. The free-function helper is the
minimum-blast-radius way to delete the duplication while leaving WLED
(which has no value-stream-vs-CSS dispatch) untouched.
The helper standardises both warning messages on HA's original wording
("failed to acquire color VS stream" / "failed to re-acquire CSS
stream") so existing log alerts/grep patterns keep working.
A LightTargetSwapState Protocol under TYPE_CHECKING documents the
expected processor surface; no runtime enforcement (acceptable trade-
off vs a 76-site touchpoint).
Tests: 7 new tests cover the release+acquire ordering, the not-running
no-op path, the manager-error-swallowing behaviour, the empty-id
short-circuit, and the missing-manager (TargetContext(None, None))
fallback. 354 existing storage + API + e2e + processing tests stay
green; ruff clean.
Two CRITICAL data-safety bugs from the architecture audit and the two
worst parallel-change problems are fixed in one coherent pass.
Audit findings addressed:
- C2 silent CSS response fallback. The previous _RESPONSE_MAP fell
through to a fabricated PictureCSSResponse whenever a source
class lacked an entry; in particular game_event sources were
silently mis-shaped. Now: GameEventCSSResponse/Create/Update
schemas exist, _RESPONSE_MAP is re-keyed by source_type string,
an import-time _assert_response_map_coverage() requires symmetric
agreement with storage._SOURCE_TYPE_MAP, and the runtime path
raises instead of fabricating a response.
- C11 string-replace JSON migration. ColorStripStore used
blob.replace('"source_type": "static"', '"source_type":
"single_color"') which can corrupt unrelated substrings (e.g.
an animation type named "static_wave") and provides no audit,
no transaction, no idempotency. Replaced with
storage.data_migrations.MigrationRunner backed by a
data_migrations audit table. Each migration runs inside one
db.transaction() that covers the applied-check, the apply(),
and the audit-INSERT — partial failures roll back atomically.
StaticToSingleColorMigration parses each row with json.loads
and mutates only the source_type field. Frozen-write databases
skip with a warning.
- C3+C4 color-strip stream dispatch. The 7-branch elif in
ColorStripStreamManager.acquire() and the duplicate one in
ws_stream._create_stream() now share a single STREAM_BUILDERS
registry in core.processing.color_strip_kinds, keyed by
source.source_type. Both call sites populate a StreamDeps bag
and delegate to build_stream(). _assert_stream_kind_coverage()
asserts at import that STREAM_BUILDERS plus SHARABLE_KINDS
partitions storage._SOURCE_TYPE_MAP. ws_stream's preview path
wraps each FastAPI-DI getter in _safe() so non-audio previews
no longer crash when audio/CSPT stores are not wired.
- C6+C7 value stream dispatch. The 14-branch isinstance ladder in
ValueStreamManager._create_stream and its silent
StaticValueStream(value=1.0) fallback are replaced by
core.processing.value_kinds.STREAM_BUILDERS, keyed by
source_type string (so AdaptiveValueSource's adaptive_time and
adaptive_scene route to different builders correctly). The
manager retains only the SyncClockRuntime pre-acquisition step
for animated_color (kinds needing this are listed explicitly
in NEEDS_CLOCK_RUNTIME). Symmetric coverage assertion plus a
separate assertion that NEEDS_CLOCK_RUNTIME is a subset of the
registry.
Bundled in: the static->single_color rename plus the HTTPValueStream
/ http_endpoint introduction that were already in flight on this
branch share these files; the registry refactor naturally absorbs
both via the new "single_color" / "static" alias entries and the
_build_http builder.
Tests: 26 new tests cover response-map coverage drift, migration
runner audit-table mechanics + transactional rollback +
frozen-write skip, and the two stream-builder registries. 343
existing storage / API / e2e tests stay green. Ruff clean.
Two bugs caused user data ('G502' target's color-strip ref, etc.) to
revert after PC restart while persisting fine across normal app
restarts:
1. SQLite was in WAL mode with synchronous=NORMAL and Database.close()
was never called. On graceful Python exit the sqlite3 finalizer
checkpoints the WAL, but on an unclean PC shutdown (power loss,
forced reboot, or Windows force-terminating pythonw.exe) the WAL
stayed in OS cache, never reached disk, and the next boot rolled the
DB back to the last checkpoint -- losing recent edits.
2. Nothing handled WM_QUERYENDSESSION / WM_ENDSESSION, so on PC
shutdown Windows force-killed pythonw.exe after ~5s and the FastAPI
lifespan never ran. The 'stop_targets' setting was silently ignored
and devices were left at their last frame.
Changes:
- Database: PRAGMA synchronous=FULL + wal_autocheckpoint=100, plus an
explicit wal_checkpoint(TRUNCATE) inside Database.close().
- New utils/win_shutdown.py: hidden top-level window in a daemon thread
with a ctypes WindowProc that catches WM_QUERYENDSESSION (calls
ShutdownBlockReasonCreate to extend Windows' 5s hung-app timeout up
to the ~20s GUI ceiling), fires the shutdown callback, then waits in
WM_ENDSESSION on a completion event before returning. Also raises
the process shutdown priority via SetProcessShutdownParameters. All
Win32 argtypes/restypes are bound once at import to avoid LPARAM
overflow on x64.
- New shutdown_state.py: leaf module owning the cross-thread Event so
__main__ does not import the heavy ledgrab.main at startup.
- main.py lifespan: per-step asyncio.wait_for budgets (8s for
processor_manager.stop_all, 1.5s each for HA/MQTT, etc.) so a hung
device cannot starve the DB checkpoint, then db.close() and
shutdown_complete.set() always run.
- __main__.py: install the Windows shutdown guard before tray start;
install SIGINT/SIGTERM/SIGBREAK handlers only on the tray path
(uvicorn overwrites them on no-tray); raise server_thread.join to 20s.
- Tests cover WM_QUERYENDSESSION (fires callback, returns TRUE,
idempotent), WM_ENDSESSION (waits on event, times out cleanly,
cancel-path returns instantly), signal handler installation, and
that main and shutdown_state share the same Event instance.
Closes the issues surfaced by the pre-merge code review of the
expand-device-support branch.
CRITICAL #2 -- update_device double-encrypts secrets in memory.
storage/device_store.py round-tripped through device.to_dict() which
encrypts hue_username / hue_client_key / ble_govee_key / nanoleaf_token
via _enc(), but Device.__init__ does not decrypt. The cached
self._items[device_id] thus held ciphertext where plaintext belonged,
breaking runtime auth for paired devices on any update -- even an
innocuous rename. Sourcing kwargs from vars(device) directly avoids
the round-trip. Regression tests cover Nanoleaf and Hue.
HIGH #3 -- secrets leaked in GET /api/v1/devices response.
DeviceResponse previously returned nanoleaf_token / hue_username /
hue_client_key in plaintext (decrypted server-side from storage),
defeating the encryption-at-rest. Replaced with nanoleaf_paired and
hue_paired booleans. ble_govee_key intentionally stays -- it's a
user-managed value pasted from a third-party tool, must remain visible
for edit. Frontend types.ts + the one nanoleaf_token reader updated to
the boolean.
HIGH #4 -- SSRF surface. validate_lan_host() added to net_classify.py;
called from each new driver's validate_device (DDP / Yeelight / WiZ /
LIFX / Govee / OPC / Nanoleaf) and from pair_device. Rejects literal
public IPs with a descriptive ValueError; non-IP hostnames pass
through (mDNS labels, bare hostnames). RFC6890 ranges (documentation,
former class E) are accepted as LAN-like since Python's
ipaddress.is_private treats them so -- correct policy for LedGrab.
HIGH #5 -- decrypt failure deletes the device row. _dec() now catches
the exception, logs an error, and returns "" instead of propagating.
Without the fix, a regenerated data/.secret_key would silently make
every Hue / Nanoleaf / BLE-Govee device disappear from the device list
on next startup. Regression test asserts a corrupt envelope leaves the
device hydratable.
HIGH #6 -- update_device route does not rstrip("/") for non-WLED.
Moved the trim before the WLED-specific scheme inference so every
device type gets consistent URL normalization between create and
update.
MEDIUM #7 -- Govee discovery port 4002 collision. Added a lazily-
initialized module-level asyncio.Lock that serializes concurrent
discover_govee_devices() calls; the previous behavior had the second
parallel scan silently return [] when the first still held port 4002.
Error message also clarified to mention another Govee tool.
MEDIUM #8 -- Nanoleaf discover() leaked browser tasks on cancellation.
Moved the browser cancel loop into the finally block so an interrupted
mDNS scan still tears them down.
MEDIUM #9 -- pair endpoint logged user-supplied URL with exc_info=True.
Added _sanitize_url_for_log() that strips userinfo + fragment, and
demoted the log from exc_info to type(exc).__name__ + str(exc) so a
hostile receiver's response body can't end up in the log file.
LOW -- Nanoleaf was the only client without a .port property. Added
one (returns NANOLEAF_PORT, fixed) for cross-driver symmetry.
LOW -- no end-to-end pair-then-create coverage. Added
TestPairThenCreateFlow.test_pair_then_create_persists_encrypted_token
which exercises the full path: POST /api/v1/devices/pair returned
fields, store.create_device, then asserts (a) in-memory plaintext,
(b) to_config() plaintext, (c) persisted ciphertext, (d) API response
strip + paired-boolean.
Tests: 1379 pass (was 1358 -- 21 new regression tests added).
ruff clean. TypeScript clean.