Files
ledgrab/REVIEW_TODO.md
T
alexei.dolgolyov 48dbdb90e9 docs(review-todo): check off items addressed in 2026-05-23 autonomous pass
Mark devices.py PATCH fix, WLED route-level test, IPv6 regression
test, IconSelect XSS audit, PEP-604 sweep, magic-number constants,
api/auth except specificity, and the (window as any) static-access
cleanup as done. Defer items are unchanged: performance items keep
their "profile first" caveat, Hue cert pinning + CSP keep the design-
sensitive note, architecture refactors keep the multi-day banner,
and i18n parity is now annotated with the exact missing-key counts
(328 ru / 325 zh) so the next translator pass has a clear scope.
2026-05-23 01:22:41 +03:00

181 lines
11 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Production Review — Remaining Items
Output of the multi-agent production review (security / Python / TypeScript /
performance / architecture / code-quality). Each entry below is something
the original audit flagged and the autonomous hardening pass deliberately
did **not** address — either because it needs design input, profiling
validation, or a multi-day refactor that should land in its own session.
The hardening pass landed everything else: see git log between `master` and
the head of the review branch for the applied changes (URL-scheme +
malicious-input rejection, IconSelect XSS escape, MiniSelect for forbidden
plain `<select>`s, WebSocket Origin allow-list, /docs auth-gate, security
headers middleware, streaming upload size caps, fire-and-forget task
tracking + drain resilience in MQTT runtime, discovery_watcher task
tracking, asyncio.gather return_exceptions, secret_box encryption for MQTT
/ Hue / Govee credentials with auto-migration, SSRF-validated update
redirects, single source of truth for IP classification in
`utils/net_classify.py`, allowlist + parity test for inbound WS events,
typed `Window` globals, and more).
## Items completed in the follow-up autonomous pass (2026-05-23)
- [x] **devices.py PATCH-without-url processor desync**`update_device`
now falls back to `existing.url` so a rename / icon-only edit
always tells the processor the current address.
- [x] **WLED scheme integration test** on `/api/v1/devices` — covers
bare IPv4 (`http://`), public hostname (`https://`), and trailing-slash
normalisation; lives in `tests/api/routes/test_devices_routes.py`.
- [x] **IPv6 regression test**`tests/test_url_scheme.py` now pins
public IPv6 → `https://`, ULA → `http://`, and documents the
Python-`ipaddress` documentation-prefix classification quirk.
- [x] **IconSelect XSS audit + defence-in-depth** — every caller
audited (all feed `icon` from constants or lookup tables); added
`sanitiseIcon` that rejects `<script>`, `javascript:`, `on*=`,
`<iframe>`, `<embed>`, `<object>` and warns to the console.
- [x] **`Optional[T]``T | None` (PEP 604)** — 55 sites cleaned via
`ruff --fix UP007`. The remaining `Union[…]` aliases for
pixel/colour/device-config typing converted by hand. `UP007` now
lives in `pyproject.toml` so the rule fires on new code.
- [x] **Hot-path magic numbers → named constants**`processed_stream`
gains `_FILTER_RECHECK_EVERY_N_FRAMES`; `wled_target_processor`
gains `_SKIP_REPOLL_SLEEP_SECONDS`, `_DIAGNOSTICS_REPORT_INTERVAL_SECONDS`,
`_CSPT_RECHECK_EVERY_N_ITERATIONS`.
- [x] **`api/auth.py` `except Exception` tightening** — every WS send /
close site is now `except _WS_SEND_BENIGN_EXC` (a narrow tuple of
WebSocketDisconnect / RuntimeError / ConnectionError / OSError).
The auth-receive path catches the same set plus a final
`logger.exception` catch-all for observability on truly unexpected
shapes.
- [x] **`(window as any)` cleanup** — 59 static-property accesses
migrated to typed `window.<name>` against `global-types.d.ts`. The
remaining 7 sites use dynamic string indexing (`window[fnName]`)
and intentionally keep the cast (documented in the typedef file).
---
## Architecture refactors (multi-day — own session)
- [ ] **Split `core/processing/value_stream.py`** (1856 LOC, 14 stream classes)
into a `value_streams/` package. Each value-stream type gets its own
file ≤300 LOC; `manager.py` holds `ValueStreamManager`.
- [ ] **Split `storage/color_strip_source.py`** (1841 LOC, 18 source kinds)
into a `color_strip_sources/` package mirroring `value_streams/`.
- [ ] **Frontend file splits**`graph-editor.ts` (2707), `streams.ts`
(2335), `value-sources.ts` (1889), `types.ts` (1062). Highest-churn
modules; mixed UI / state / network responsibilities.
- [ ] **Layering reversal**: introduce a neutral `domain/` package and move
shared DTOs (`FilterInstance`, `CalibrationConfig`, etc.) into it so
`storage/` no longer imports `core/`. Eliminates 7+ layering
violations and the lazy-import hacks used to break the resulting
circulars.
- [ ] **`main.py` boot refactor** — extract import-time side effects into
`bootstrap.py` + `create_app()` factory. `lifespan()` becomes the
single place that wires stores and managers.
- [ ] **DI consolidation** — replace `api/dependencies.py` getter sprawl
(30+ `get_*()` functions reading a process-global `_deps` dict) with
a single typed `get_container()` dependency. Makes test-overrides
trivial; ban direct getter calls in handler bodies.
- [ ] **Exception hierarchy** — define `ledgrab/errors.py` (`LedGrabError`,
`NotFoundError`, `ValidationError`, `RemoteUnavailableError`,
`SSRFBlockedError`). Move HTTP translation into a FastAPI exception
handler. Stop raising `HTTPException` from `utils/safe_source.py`.
- [ ] **Lazy-import audit** — 289 in-function `from ledgrab.*` imports.
Specifically `core/processing/daylight_settings.py` imports
`api.dependencies` (core → api inversion). Pass the database in via
the constructor instead of service-locator lookup.
## Performance (profile before applying)
- [ ] **`composite_stream.py` blend modes** — pre-allocate scratch buffers
in `_blend_override / overlay / hard_light / soft_light / difference
/ exclusion`. Each currently allocates per frame (`mul`, `scr`,
`blended`, `np.where(...)`). At 100 LEDs × 30 fps × N layers this
adds up.
- [ ] **`mapped_stream` / `composite_stream` zone resize** — replace the
per-channel `np.interp` calls with a cached `floor/ceil/frac` LUT
(same trick as `wled_target_processor._fit_to_device`) or a single
`cv2.resize` call on the (N,3) array. `np.interp` allocates a new
`float64` array per channel per frame even on cache-hit.
- [ ] **`processed_stream._processing_loop`** — add ping-pong output
buffers and pass them as `out=` to filter `process_strip()` calls.
Today every filter that returns a fresh allocation costs us a copy
per frame. Also: the loop uses `time.sleep` instead of an
event-driven wait on the input stream — input updates faster than
30 fps see up to `frame_time` of latency.
- [ ] **`mqtt_client.py` `send_pixels`** — add a binary publish path (or
at minimum cache the outer dict skeleton). Today every frame
`pixels.tolist()` + `json.dumps` for ~300 LEDs × 30 fps × N devices.
- [ ] **Frontend `static/js/features/color-strips/test.ts`** — cache
`ImageData` per canvas (`canvas._imageData`); only re-create on
dimension change; use a `Uint32Array` view to copy pixels in one
loop instead of the per-pixel JS loop. Border-overlay rebuild on
every frame should also be debounced to dimension changes only.
- [ ] **`ws_stream.py` composite branch** — pre-allocate a `bytearray`
sized to the largest frame and write into slices instead of
`b"".join(tobytes()) per layer` every iteration. Same anti-pattern
in `wled_target_processor._broadcast_led_preview`.
- [ ] **Preview broadcast slow-client guard**`asyncio.gather` over
preview clients waits for the slowest. Move to `asyncio.wait` with a
timeout and drop slow clients, or fire-and-forget with a
`ws.application_state` filter.
## Security (deferred — non-trivial or design-sensitive)
- [ ] **Content-Security-Policy header** — would need careful tuning
because the UI uses inline event handlers / Jinja templates.
Mis-set CSP would break the app silently. Defer until templates can
move to event-delegated handlers, then add a strict policy.
- [x] **`api/auth.py` exception specificity** — done in the 2026-05-23
pass; see top of file.
- [ ] **Hue bridge cert pinning**`httpx.AsyncClient(verify=False)` for
Hue bridge (self-signed cert by design). Should record the
certificate fingerprint at pairing time and pin it on subsequent
requests; otherwise an on-path attacker can MITM the bridge.
## Mechanical / code-quality (low risk, high line-count)
- [ ] **i18n parity** — confirmed **328** keys missing in `ru.json` and
**325** missing in `zh.json` against the canonical English file.
Translation work — needs a native speaker, not a machine-translation
pass. Run `py scripts/diff_locale_keys.py` (or copy the diff block
out of the 2026-05-23 pass log) to get the exact key list.
- [x] **`Optional[T]``T | None`** — done; `UP007` now enforced via
`pyproject.toml` so the rule prevents regressions.
- [ ] **Hot-path `logger.error(f"...")` → `logger.error("... %s", e)`**
lazy-eval — 658 sites flagged by `ruff --select G004`. Deferred
because it is genuinely cosmetic at ERROR level (always emitted)
and the cumulative cost is negligible. Worth doing if/when ruff
gains a safe autofix, or as a Codemod in a dedicated session.
- [x] **Remaining `(window as any)` sites** — 59 migrated to typed
`window.<name>` access; the 7 surviving sites use dynamic string
indexing and are documented as the legitimate exception.
- [x] **Magic numbers → named constants** — done; see `processed_stream`
and `wled_target_processor` constants at the top of each module.
- [ ] **Standardise `from __future__ import annotations`** — partially
mooted by the UP007 cleanup. Files that previously relied on
`Optional`/`Union` no longer need the future import; the few that
already use `__future__` keep it for forward-reference convenience.
A blanket policy would still help — leave as a stylistic followup.
## Test gaps
- [x] **Route-level integration test** for the WLED scheme inference —
done; covers create + update in `tests/api/routes/test_devices_routes.py::TestWLEDSchemeInference`.
- [x] **IPv6 public address regression** — done; pinned in
`tests/test_url_scheme.py` for both bracketless and bracketed forms.
## Pre-existing issues surfaced during the audit (not in our diff)
These were flagged by the auditors but predate the review session — kept
here as a future-work backlog:
- [x] **`icon-select.ts:_buildGrid` `item.icon` is interpolated raw** —
audited; all callers pass project-owned literals or table-lookup
results. Added a runtime sanitiser as defence-in-depth.
- [x] **`devices.py` `manager.update_device_info(device_url=update_data.url)`**
None-on-PATCH path — fixed; now falls back to `existing.url`.
- [ ] **`asyncio.gather` over uncapped client lists** in preview broadcasts
— slow clients block the loop. Already noted under Performance
above; pre-existing.