Design doc capturing the architectural pattern behind the perf-card flicker (every innerHTML rewrite tears down the subtree, restarting CSS animations / dropping focus / wasting layout) and the two drivers — poll timers and server:* push events. Inventories every remaining latent site in perf-charts.ts / dashboard.ts / entity-events.ts / game-integration.ts, walks the decision ladder from a setInnerHtmlIfChanged helper through hand-rolled cell components to a Lit migration, and lands on Lit for polling-heavy modules with entity-events.ts tab reconciliation sequenced ahead of the dashboard cards because of its higher blast radius. Planning artifact only — no implementation here.
20 KiB
Dashboard Reconciliation — Review Notes
Captured 2026-05-26. Session focused on dashboard + perf-card flicker and per-poll re-rendering.
Updated 2026-05-27 — widened the audit beyond the two poll timers and found a second driver (server push) plus the highest-blast-radius site (entity-events.ts). Added §3.5, corrected the "out of scope" reasoning in §5, and confirmed the decision: commit to the Lit migration. Implementation deferred — this is still a planning doc, not a spec.
This is a thinking-aloud document for whoever picks up reconciliation work next (likely me). It captures the bug class, what's already shipped, what's still latent, the decision ladder we walked through, and the recommendation we landed on. It is not a spec — treat any code shown as illustrative.
1. The bug class in one sentence
Every place a data-driven render — a poll timer or a server-pushed
server:*event — writesel.innerHTML = ..., the existing DOM is torn down — even when the new HTML equals the old — which restarts CSS animations, drops focus, skips transitions, and burns wasted DOM mutation cycles.
The symptom only becomes visually loud when the destroyed subtree contains a CSS keyframe animation (e.g. the pulsing .perf-patches-empty-dot). Everywhere else the cost is silent: lost transitions, broken focus, wasted layout work. The bug is load-bearing in the architecture, not in any single call site — that's why we keep coming back to it.
2. What landed in commit f6486f9 (this session)
Tactical work — solves the worst cases, does not change the architecture.
server/src/ledgrab/static/js/features/dashboard.ts
- Collapsed the two fast-path branches into one. Fast path runs when
structureUnchanged && !forceFullRenderregardless ofrunning.length. Previously, zero running targets meant every poll rebuilt the entire dashboard even when nothing changed. _lastSyncClockIdsno longer fingerprintsis_running— pausing/resuming a clock no longer tears down every card._updateSyncClocksInPlacealready handles the toggle._updateAutomationsInPlacenow called from the unified fast path. Automation badges were silently going stale on the fast path._initFpsChartsrewritten diff-based: only destroy charts for ids that left or whose canvas was detached by a DOM swap; only create for new ids; only fetch/api/metrics/historywhen there are genuinely new ids needing seed data.- Sync-clock pause/resume/reset callers +
server:automation_state_changedSSE handler now useloadDashboard()(no force) —forceFullRenderis now actually load-bearing, meaning "settings changed, full rebuild required."
server/src/ledgrab/static/js/features/perf-charts.ts
_renderChartSvgno longer rewritesinnerHTMLper poll. The SVG skeleton (ref line + sys area/line + app line) is built once via_ensureSparkNodesand mutated thereafter. WeakMap cache (_sparkNodeCache) keyed by host element avoids the per-tickquerySelectorcost.- Hidden cards (env-disabled GPU/Temp) skip render entirely.
_fetchPerformanceswitched tofetchWithAuth.- Hardcoded English strings replaced with
t()calls. New keys:perf.no_captures,perf.captures_count.{one,few,many,other},perf.ratio_of_requested,perf.total_count,perf.skipped_per_sec,perf.tip.now,perf.tip.ago(en/ru/zh). - Tooltip reads
dashboardPollIntervalper mousemove tick (was captured at bind time). - Dead
<defs><linearGradient>block removed. updateTotalCaptureFpsActualnow delegates to_paintCaptureFpsActualValue— single code path.updateActivePatches/updateDevicesskip theinnerHTMLwrite when content signature hasn't changed. This is the direct fix for the "READY TO LAUNCH flickers every update" report — the empty-state dot's CSS pulse no longer resets.- Two missing semicolons in
_seedAggregateHistories(ASI was saving us).
Reviewer findings addressed (typescript-reviewer pass)
- HIGH:
_metricLabelwas looking updashboard.perf.${key}but the FPS family usesdashboard.perf.total_fps,total_capture_fps,total_capture_fps_actual. Tooltip would have shoutedFPS/CAPTURE_FPS/CAPTURE_FPS_ACTUAL. Fixed via explicitMETRIC_LABEL_KEYSmap. - HIGH:
_ensureSparkNodessilently coercednullchildren to non-null when the SVG existed but a child was missing. Hardened to validate all four children and rebuild if any are missing.
3. Hot spots still latent
These are the call sites where innerHTML is still written every poll. None are flickering today (no CSS animations on their inner elements), but every one is the same bug shape and will bite the next time someone adds a keyframe / transition / focus target inside.
perf-charts.ts
| Line | Site | Fires per poll? | Notes |
|---|---|---|---|
| 462 | updateActivePatches → listEl.innerHTML |
yes | guarded by signature compare (✓) |
| 493 | updateTotalFps → valEl.innerHTML |
yes | FPS value, no inner animation |
| 526 | updateTotalCaptureFps → valEl.innerHTML |
yes | same |
| 638 | _paintNetworkValue → valEl.innerHTML |
yes | bytes/s value |
| 655 | _paintDeviceLatencyValue → valEl.innerHTML (no-devices hint) |
yes | hint span |
| 657 | _paintDeviceLatencyValue → valEl.innerHTML (offline hint) |
yes | hint span |
| 660 | _paintDeviceLatencyValue → valEl.innerHTML (ms value) |
yes | value |
| 676 | _paintSendTimingValue → valEl.innerHTML (idle hint) |
yes | hint span |
| 679 | _paintSendTimingValue → valEl.innerHTML (ms value) |
yes | value |
| 738 | _paintErrorsValue → valEl.innerHTML |
yes | rate value |
| 806 | updateDevices → dotsEl.innerHTML |
yes | guarded by signature compare (✓) |
| 1086 | _renderValuePair → mainEl.innerHTML = appVal |
yes | dual sys/app value |
| 1088 | _renderValuePair → mainEl.innerHTML = sysVal |
yes | dual sys/app value |
| 1094 | _renderValuePair → tagEl.innerHTML (App tag) |
mode='both' only | App tag in both mode |
| 1181 | _applyPerfDataToDom temp hint |
only when cpu_temp_hint_key changes | rare |
| 1449 | _paintFpsValue |
seed only | once per init |
| 1456 | _paintCaptureFpsValue |
seed only | once per init |
| 1463 | _paintCaptureFpsActualValue (no-captures hint) |
yes via live updater | now goes through painter |
| 1469 | _paintCaptureFpsActualValue (value) |
yes via live updater | same |
| 1499 | _paintErrorsValue (duplicate of 738) |
seed only | once per init |
| 1823 | tooltip tip.innerHTML |
per mousemove | rate-limited by hover only |
dashboard.ts
| Line | Site | Fires per poll? | Notes |
|---|---|---|---|
| 275 | _updateRunningMetrics → fpsEl.innerHTML |
per running target | live FPS pill — visible churn |
| 293 | _updateRunningMetrics → labelEl.innerHTML (errors label) |
per running target | rebuilt each poll |
| 340 | _updateAutomationsInPlace → btn.innerHTML |
only on enable/disable change | low frequency |
| 366 | _updateSyncClocksInPlace → btn.innerHTML |
per poll for every clock | wasteful |
| 975 | loadDashboard first-load → container.innerHTML |
once per init | fine |
| 989 | loadDashboard slow path → dynamic.innerHTML = dynamicHtml |
only when slow path fires | the big swap, scoped already |
| 1010 | loadDashboard error path |
rare | fine |
| 1416 | subscribeDashboardLayout clear |
rare | fine |
What this list tells us
- The remaining innerHTML writes are per-cell value updates that paint formatted spans (
{value}<span class="perf-fps-unit">fps</span>). Each rewrite destroys two text nodes + a span every poll across ~10 cells. Not flickering today; will flicker the moment anyone adds an animation to.perf-fps-unitor.perf-fps-ceiling. - The pattern can be killed without architectural change by splitting these into a stable structure (number text node + static unit span) and only updating
textContentof the number. That's what L3 / Lit would force naturally.
3.5 Beyond dashboard/perf — push-driven reconciliation
Added 2026-05-27. The §3 audit was scoped to the two poll timers we were debugging. Widening the \.innerHTML\s*= search showed the bug class has a second driver and lives outside dashboard/perf too.
Two drivers, not one
The teardown is triggered by anything that re-renders without user intent:
- Poll timers (
setInterval) — what §2/§3 covered (dashboard.ts_uptimeTimer+ main refresh,perf-charts.ts_pollTimer). - Server-pushed
server:*events —core/events-ws.tsturns each WS message into aserver:*CustomEvent; feature modules listen and re-render through the sameinnerHTMLpaths.
So the one-line bug class in §1 reads "poll- or push-driven," not just poll.
Genuinely-affected sites outside dashboard/perf
| Site | Driver | Shape | Notes |
|---|---|---|---|
core/entity-events.ts _invalidateAndReload |
push (server:entity_changed, server:device_health_changed) |
full-tab rebuild via loadTargetsTab / loadPictureSources / loadAutomations / loadIntegrations |
highest blast radius. A single pushed entity change tears down and rebuilds an entire tab — losing scroll, focus, open inline editors, restarting card-enter animations. |
features/game-integration.ts event feed (_eventMonitorTimer) |
poll (2 s) | feed.innerHTML = events.slice(0,20).map(...) |
full 20-item list rebuild every 2 s while the panel is open. |
features/game-integration.ts connection test (_connectionTestTimer) |
poll | panel.innerHTML = … per tick |
transient, low frequency. |
entity-events.ts already has the L1 floor applied by hand: a 600 ms debounce plus a diff check (oldData === newData, then length + id + updated_at compare) that skips the reload when nothing changed. That kills the no-op case — but a real change still does the full-tab teardown. This is exactly the §4-L1 limitation ("still tears down when content does differ"), live across the whole app.
Counter-examples that already do it right
Two poll loops never flicker because they mutate textContent on a stable structure instead of rewriting innerHTML:
core/api.tsloadServerInfo(connection-check poll) —versionEl.textContent/statusEl.textContent.features/color-strips/test.tsFPS sampler (1 s) —valueEl.textContent/avgEl.textContent.
These are live proof that "stable structure + mutate text node" is the fix — i.e. what L3 / Lit force by construction.
What this changes about the plan
The §4 ladder was reasoned entirely around per-cell rendering, because that was the visible flicker. The push-driven finding surfaces a second, qualitatively different problem:
- Problem A — cell value churn: every poll, one value span. Loud only with animations. Mostly fixed in
f6486f9. → wantssetText/ skip-if-unchanged. - Problem B — list/tab teardown: on change/push, an entire list or tab. Loses scroll/focus/open editors. Unaddressed.
entity-events.tsand the game feed are Problem B. → wants keyed list reconciliation.
Problem B is a list-level concern, not a cell-level one. In Lit terms it maps to a keyed repeat() directive over the tab/list body — the dashboard-card work in Phase 2 already needs this, but entity-events.ts needs it for tabs that §5 used to list as "out of scope." This does not change the chosen direction (Lit); it adds entity-events.ts as a first-class, high-priority target.
4. Decision ladder
Walked through with the user 2026-05-26. Captured here so we don't re-litigate.
L1 — drop-in setInnerHtmlIfChanged helper
- Shape:
WeakMap<Element, string>cache; replace everyel.innerHTML = xwithsetInnerHtmlIfChanged(el, x). - Wins: stops the no-change rewrites globally; zero behavior risk; ~30 call-site changes.
- Misses: still tears down DOM when content does differ (e.g. FPS row values change every tick); doesn't preserve focus/transition state inside a list.
- Verdict: floor, not ceiling. Worth doing for cells that don't get migrated to L3/Lit.
L2 — lint guard
- Shape: pre-commit script greps
\.innerHTML\s*=instatic/js/outside an allowlist, fails the commit. - Wins: keeps the discipline; cheap.
- Misses: only useful as a pair with L1+; bare guard with no helper makes contributors angry.
- Verdict: pair with whatever helper we land on.
L3 — hand-rolled cell-component pattern
- Shape:
defineCell({ html, refs, mount, update, unmount })+reconcileList(host, items, binding)+setText/setClass/setAttrmutators. ~150–300 lines of runtime. - Wins: correct by construction; no dependencies; explicit about what mutates; composes with existing customize panel / color picker.
- Misses: we own the abstraction — it grows over time as we need transitions, async data, focus, devtools, error boundaries. Death by a thousand features.
- Verdict: second-best. Strong contender if zero-deps is a hard constraint.
Lit migration of polling modules — recommended
- Shape: convert each perf cell + each dashboard card cell to a Lit web component. Use
html\${value}`` tagged-template + targeted diff. ~5KB gzip added to bundle, no new build step (esbuild handles it). - Wins: solves the bug class by design; maintained by Google + community; web-components-based so no framework lock-in; composes with vanilla DOM trivially; mental model is close to current template-string idiom; non-polling code can stay vanilla forever.
- Misses: introduces a dependency; contributors learn one more thing; rare edge cases (
@html-equivalent exists and reintroduces the bug if misused). - Verdict: best ceiling-to-cost ratio for a small team. Recommended.
Full framework rewrite (React / Vue / Solid)
- Verdict: overkill. The bug class lives in polling paths; the rest of the app is fine. Spending the migration budget on rebuilding IconSelect / EntitySelect / modals / customize panel / graph editor — none of which are broken — is a bad trade.
5. Recommendation
Lit for the polling-heavy modules.
Migration plan:
Phase 0 — spike (2-hour time-box)
- Convert
patchescell to a Lit component, end to end. - Verify it plays nicely with: color picker integration, customize panel layout reorder,
rerenderPerfGridreconciliation,setPerfModetoggle, hidden-by-env state, the spark tooltip handler. - If any of those break in an unfixable way → pivot to L3.
- If they work → commit to the migration.
Phase 1 — perf-charts cells
patches(already spiked)devicesfps/capture_fps/capture_fps_actual(share a sparkline base class)cpu/ram/gpu/temp(share_sparkCardHtmltemplate family)network/device_latency/send_timing/errors
Each is its own PR, dashboard stays working at every step. renderPerfSection becomes a registry of Lit components; rerenderPerfGrid becomes "reorder existing elements in the grid" (which it mostly already does).
Phase 2 — dashboard card cells
- Output target cards (running variant — biggest payoff, has live FPS + uptime + errors)
- Output target cards (stopped variant)
- Sync clock cards
- Automation cards
- Integration (HA / MQTT) cards
These get bigger wins from the migration because they have nested mutable state (FPS pill, errors cell, health dot, action button) that's currently rebuilt per poll via the _updateRunningMetrics path.
Highest-impact: entity-events.ts tab reconciliation (sequence early)
entity-events.ts (§3.5) is the single highest-blast-radius site and is not on the dashboard — it re-renders the Targets / Integrations / Automations tabs on server push. Whether or not those tabs' cells become Lit components, the loader path (loadTargetsTab / loadIntegrations / loadAutomations) should switch from a full innerHTML rebuild to a keyed list reconcile (a Lit repeat() over the tab body). This preserves scroll / focus / open inline editors across pushes. If the goal is "biggest UX win first" rather than "lowest-risk first," sequence this ahead of Phase 2.
Phase 3 — stopgap helper for the rest
Add setInnerHtmlIfChanged and apply to any remaining vanilla polling sites we don't plan to migrate. Add the L2 lint guard at this point — by now everything that polls is either Lit-managed or uses the helper.
Out of scope (deliberately) — with one correction (2026-05-27)
- Targets tab, automations editor, integrations, scene presets — these render on-demand, but they are ALSO re-rendered on server push via
entity-events.ts(see §3.5). The original claim that "the bug class doesn't bite them" was wrong: a pushedserver:entity_changeddoes a full-tabinnerHTMLteardown. The editor / on-demand views can stay vanilla, but the list/tab render that entity-events triggers needs reconciliation (a keyed list diff) regardless of whether those cells become Lit components. Treat the entity-events reload path as in-scope — it is the highest-blast-radius Problem B site. - Color strips editor, graph editor, settings — genuinely on-demand, no push re-render path, stay vanilla.
- Transport bar cells (CPU/Mem chip in the top bar) — read from the same perf payload, can be migrated opportunistically but not urgent.
6. Open questions to settle before committing
These came up during the discussion and weren't resolved:
- Bundle-size budget. Is +5KB acceptable? Current bundle is 2.7MB so this is noise — but worth confirming there isn't a strict cap (e.g. for slow networks / Android Chaquopy embed).
- Contributor model. If the project will grow to multiple contributors, Lit's smaller community vs React's is a recruiting tradeoff. Currently solo-ish, so probably moot.
- Android TV target. Chaquopy embed serves the same bundle. Lit works fine in any modern browser — Android TV WebView is Chromium-based. Should be a no-op but verify in Phase 0 spike.
- Long-term framework intent. If there's a chance we ever migrate to React/Vue/Solid for the rest of the app, doing Lit now is not lock-in (web components are standard), but it does add a second mental model. Probably fine; just naming the tradeoff.
- Customize panel. The drag-reorder code in
dashboard-customize.tsmutates.dashboard-sectionDOM directly. Lit components reorder cleanly viamoveBefore/insertBeforesince they're just elements, but the dnd library needs to treat them as opaque drag handles. Phase 0 spike should confirm.
7. Pointers
- Source files most relevant:
server/src/ledgrab/static/js/features/dashboard.tsserver/src/ledgrab/static/js/features/perf-charts.tsserver/src/ledgrab/static/js/features/dashboard-layout.ts(cell ordering + visibility)server/src/ledgrab/static/js/features/dashboard-customize.ts(drag-reorder UI)server/src/ledgrab/static/js/core/card-modes.ts(mode toggle that hangs off section headers)server/src/ledgrab/static/js/core/entity-events.ts(push-driven tab reloads — §3.5, highest blast radius)server/src/ledgrab/static/js/core/events-ws.ts(WS →server:*CustomEvent dispatch)server/src/ledgrab/static/js/features/game-integration.ts(2 s event-feed list rebuild — §3.5)
- Most recent reconciliation commit:
f6486f9. - Related skill files in
~/.claude/skills/:frontend-patterns,documentation-lookup(for Lit docs via Context7). - Locale convention:
perf.*for cross-card primitives,dashboard.perf.*for cell titles.
8. If this doc gets stale
If you read this and the perf cells are already Lit components — delete this file. If you read this and there's a new flicker / focus / transition bug nobody can explain — search for \.innerHTML\s*= in static/js/features/ and static/js/core/ (entity-events lives in core) and you've probably found it. For state loss on a server event (scroll jump, focus drop, an inline editor closing itself), look at the server:* listeners in core/entity-events.ts first.