Mark devices.py PATCH fix, WLED route-level test, IPv6 regression test, IconSelect XSS audit, PEP-604 sweep, magic-number constants, api/auth except specificity, and the (window as any) static-access cleanup as done. Defer items are unchanged: performance items keep their "profile first" caveat, Hue cert pinning + CSP keep the design- sensitive note, architecture refactors keep the multi-day banner, and i18n parity is now annotated with the exact missing-key counts (328 ru / 325 zh) so the next translator pass has a clear scope.
11 KiB
Production Review — Remaining Items
Output of the multi-agent production review (security / Python / TypeScript / performance / architecture / code-quality). Each entry below is something the original audit flagged and the autonomous hardening pass deliberately did not address — either because it needs design input, profiling validation, or a multi-day refactor that should land in its own session.
The hardening pass landed everything else: see git log between master and
the head of the review branch for the applied changes (URL-scheme +
malicious-input rejection, IconSelect XSS escape, MiniSelect for forbidden
plain <select>s, WebSocket Origin allow-list, /docs auth-gate, security
headers middleware, streaming upload size caps, fire-and-forget task
tracking + drain resilience in MQTT runtime, discovery_watcher task
tracking, asyncio.gather return_exceptions, secret_box encryption for MQTT
/ Hue / Govee credentials with auto-migration, SSRF-validated update
redirects, single source of truth for IP classification in
utils/net_classify.py, allowlist + parity test for inbound WS events,
typed Window globals, and more).
Items completed in the follow-up autonomous pass (2026-05-23)
- devices.py PATCH-without-url processor desync —
update_devicenow falls back toexisting.urlso a rename / icon-only edit always tells the processor the current address. - WLED scheme integration test on
/api/v1/devices— covers bare IPv4 (http://), public hostname (https://), and trailing-slash normalisation; lives intests/api/routes/test_devices_routes.py. - IPv6 regression test —
tests/test_url_scheme.pynow pins public IPv6 →https://, ULA →http://, and documents the Python-ipaddressdocumentation-prefix classification quirk. - IconSelect XSS audit + defence-in-depth — every caller
audited (all feed
iconfrom constants or lookup tables); addedsanitiseIconthat rejects<script>,javascript:,on*=,<iframe>,<embed>,<object>and warns to the console. Optional[T]→T | None(PEP 604) — 55 sites cleaned viaruff --fix UP007. The remainingUnion[…]aliases for pixel/colour/device-config typing converted by hand.UP007now lives inpyproject.tomlso the rule fires on new code.- Hot-path magic numbers → named constants —
processed_streamgains_FILTER_RECHECK_EVERY_N_FRAMES;wled_target_processorgains_SKIP_REPOLL_SLEEP_SECONDS,_DIAGNOSTICS_REPORT_INTERVAL_SECONDS,_CSPT_RECHECK_EVERY_N_ITERATIONS. api/auth.pyexcept Exceptiontightening — every WS send / close site is nowexcept _WS_SEND_BENIGN_EXC(a narrow tuple of WebSocketDisconnect / RuntimeError / ConnectionError / OSError). The auth-receive path catches the same set plus a finallogger.exceptioncatch-all for observability on truly unexpected shapes.(window as any)cleanup — 59 static-property accesses migrated to typedwindow.<name>againstglobal-types.d.ts. The remaining 7 sites use dynamic string indexing (window[fnName]) and intentionally keep the cast (documented in the typedef file).
Architecture refactors (multi-day — own session)
- Split
core/processing/value_stream.py(1856 LOC, 14 stream classes) into avalue_streams/package. Each value-stream type gets its own file ≤300 LOC;manager.pyholdsValueStreamManager. - Split
storage/color_strip_source.py(1841 LOC, 18 source kinds) into acolor_strip_sources/package mirroringvalue_streams/. - Frontend file splits —
graph-editor.ts(2707),streams.ts(2335),value-sources.ts(1889),types.ts(1062). Highest-churn modules; mixed UI / state / network responsibilities. - Layering reversal: introduce a neutral
domain/package and move shared DTOs (FilterInstance,CalibrationConfig, etc.) into it sostorage/no longer importscore/. Eliminates 7+ layering violations and the lazy-import hacks used to break the resulting circulars. main.pyboot refactor — extract import-time side effects intobootstrap.py+create_app()factory.lifespan()becomes the single place that wires stores and managers.- DI consolidation — replace
api/dependencies.pygetter sprawl (30+get_*()functions reading a process-global_depsdict) with a single typedget_container()dependency. Makes test-overrides trivial; ban direct getter calls in handler bodies. - Exception hierarchy — define
ledgrab/errors.py(LedGrabError,NotFoundError,ValidationError,RemoteUnavailableError,SSRFBlockedError). Move HTTP translation into a FastAPI exception handler. Stop raisingHTTPExceptionfromutils/safe_source.py. - Lazy-import audit — 289 in-function
from ledgrab.*imports. Specificallycore/processing/daylight_settings.pyimportsapi.dependencies(core → api inversion). Pass the database in via the constructor instead of service-locator lookup.
Performance (profile before applying)
composite_stream.pyblend modes — pre-allocate scratch buffers in_blend_override / overlay / hard_light / soft_light / difference / exclusion. Each currently allocates per frame (mul,scr,blended,np.where(...)). At 100 LEDs × 30 fps × N layers this adds up.mapped_stream/composite_streamzone resize — replace the per-channelnp.interpcalls with a cachedfloor/ceil/fracLUT (same trick aswled_target_processor._fit_to_device) or a singlecv2.resizecall on the (N,3) array.np.interpallocates a newfloat64array per channel per frame even on cache-hit.processed_stream._processing_loop— add ping-pong output buffers and pass them asout=to filterprocess_strip()calls. Today every filter that returns a fresh allocation costs us a copy per frame. Also: the loop usestime.sleepinstead of an event-driven wait on the input stream — input updates faster than 30 fps see up toframe_timeof latency.mqtt_client.pysend_pixels— add a binary publish path (or at minimum cache the outer dict skeleton). Today every framepixels.tolist()+json.dumpsfor ~300 LEDs × 30 fps × N devices.- Frontend
static/js/features/color-strips/test.ts— cacheImageDataper canvas (canvas._imageData); only re-create on dimension change; use aUint32Arrayview to copy pixels in one loop instead of the per-pixel JS loop. Border-overlay rebuild on every frame should also be debounced to dimension changes only. ws_stream.pycomposite branch — pre-allocate abytearraysized to the largest frame and write into slices instead ofb"".join(tobytes()) per layerevery iteration. Same anti-pattern inwled_target_processor._broadcast_led_preview.- Preview broadcast slow-client guard —
asyncio.gatherover preview clients waits for the slowest. Move toasyncio.waitwith a timeout and drop slow clients, or fire-and-forget with aws.application_statefilter.
Security (deferred — non-trivial or design-sensitive)
- Content-Security-Policy header — would need careful tuning because the UI uses inline event handlers / Jinja templates. Mis-set CSP would break the app silently. Defer until templates can move to event-delegated handlers, then add a strict policy.
api/auth.pyexception specificity — done in the 2026-05-23 pass; see top of file.- Hue bridge cert pinning —
httpx.AsyncClient(verify=False)for Hue bridge (self-signed cert by design). Should record the certificate fingerprint at pairing time and pin it on subsequent requests; otherwise an on-path attacker can MITM the bridge.
Mechanical / code-quality (low risk, high line-count)
- i18n parity — confirmed 328 keys missing in
ru.jsonand 325 missing inzh.jsonagainst the canonical English file. Translation work — needs a native speaker, not a machine-translation pass. Runpy scripts/diff_locale_keys.py(or copy the diff block out of the 2026-05-23 pass log) to get the exact key list. Optional[T]→T | None— done;UP007now enforced viapyproject.tomlso the rule prevents regressions.- Hot-path
logger.error(f"...")→logger.error("... %s", e)lazy-eval — 658 sites flagged byruff --select G004. Deferred because it is genuinely cosmetic at ERROR level (always emitted) and the cumulative cost is negligible. Worth doing if/when ruff gains a safe autofix, or as a Codemod in a dedicated session. - Remaining
(window as any)sites — 59 migrated to typedwindow.<name>access; the 7 surviving sites use dynamic string indexing and are documented as the legitimate exception. - Magic numbers → named constants — done; see
processed_streamandwled_target_processorconstants at the top of each module. - Standardise
from __future__ import annotations— partially mooted by the UP007 cleanup. Files that previously relied onOptional/Unionno longer need the future import; the few that already use__future__keep it for forward-reference convenience. A blanket policy would still help — leave as a stylistic followup.
Test gaps
- Route-level integration test for the WLED scheme inference —
done; covers create + update in
tests/api/routes/test_devices_routes.py::TestWLEDSchemeInference. - IPv6 public address regression — done; pinned in
tests/test_url_scheme.pyfor both bracketless and bracketed forms.
Pre-existing issues surfaced during the audit (not in our diff)
These were flagged by the auditors but predate the review session — kept here as a future-work backlog:
icon-select.ts:_buildGriditem.iconis interpolated raw — audited; all callers pass project-owned literals or table-lookup results. Added a runtime sanitiser as defence-in-depth.devices.pymanager.update_device_info(device_url=update_data.url)None-on-PATCH path — fixed; now falls back toexisting.url.asyncio.gatherover uncapped client lists in preview broadcasts — slow clients block the loop. Already noted under Performance above; pre-existing.