Optimize processing pipeline and fix multi-target crash

Performance optimizations across 5 phases:
- Saturation filter: float32 → int32 integer math (~2-3x faster)
- Frame interpolation: pre-allocated uint16 scratch buffers
- Color correction: single-pass cv2.LUT instead of 3 channel lookups
- DDP: numpy vectorized color reorder + pre-allocated RGBW buffer
- Calibration boundaries: vectorized with np.arange + np.maximum
- wled_client: vectorized pixel validation and HTTP pixel list
- _fit_to_device: cached linspace arrays (now per-instance)
- Diagnostic lists: bounded deque(maxlen=...) instead of unbounded list
- Health checks: adaptive intervals (10s streaming, 60s idle)
- Profile engine: poll interval 3s → 1s
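
The calibration-boundary change can be sketched as below. This is a minimal illustration, not the project's code: the function names `boundaries_loop` and `boundaries_vectorized` are hypothetical, and only the `np.arange` / `np.maximum` / `np.minimum` calls mirror what this commit touches.

```python
import numpy as np


def boundaries_loop(edge_len: int, led_count: int) -> np.ndarray:
    """Reference form: per-LED Python loops (hypothetical helper)."""
    step = edge_len / led_count
    b = np.empty(led_count + 1, dtype=np.int64)
    for i in range(led_count + 1):
        b[i] = int(i * step)
    # Sequential fix-up: each bump is visible to the next iteration.
    for i in range(led_count):
        if b[i + 1] <= b[i]:
            b[i + 1] = b[i] + 1
    return np.minimum(b, edge_len)


def boundaries_vectorized(edge_len: int, led_count: int) -> np.ndarray:
    """Vectorized form using np.arange + np.maximum (hypothetical helper)."""
    step = edge_len / led_count
    b = (np.arange(led_count + 1, dtype=np.float64) * step).astype(np.int64)
    # One-shot fix-up: compares against the unbumped neighbours only.
    b[1:] = np.maximum(b[1:], b[:-1] + 1)
    # Clamp in place, avoiding a second allocation.
    np.minimum(b, edge_len, out=b)
    return b


# Typical case (at least one pixel per LED): both forms agree.
same = np.array_equal(boundaries_loop(120, 30), boundaries_vectorized(120, 30))

# More LEDs than pixels (step < 1): the loop cascades its bumps,
# the vectorized form does not, so the results differ.
differs = not np.array_equal(boundaries_loop(4, 8), boundaries_vectorized(4, 8))
```

The two forms coincide whenever `step >= 1`. With more LEDs than pixels they can diverge, because the loop propagates each bump to the next comparison while `np.maximum` sees only the original values.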

Bug fixes:
- Fix deque slicing crash that killed targets when several ran in parallel
  (unlike list, deque does not support slice syntax such as [-1:] or [:5])
- Fix numpy array boolean ambiguity in send_pixels() validation
- Persist fatal processing loop errors to metrics for API visibility
- Move _fit_to_device cache from class-level to instance-level to
  prevent cross-target cache thrashing
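
The deque slicing issue can be reproduced in isolation. The sketch below uses only the standard library; `recent`, `head`, and `tail` are hypothetical names for illustration and are not taken from the codebase.

```python
from collections import deque
from itertools import islice

# Bounded diagnostic buffer: only the newest 5 entries are kept.
recent = deque(maxlen=5)
for n in range(8):
    recent.append(n)  # ends up holding 3, 4, 5, 6, 7

# Integer indexing works on a deque, but slicing raises TypeError.
try:
    recent[-1:]
    slicing_supported = True
except TypeError:
    slicing_supported = False


def head(d: deque, n: int) -> list:
    """First n items, equivalent to list(d)[:n]."""
    return list(islice(d, n))


def tail(d: deque, n: int) -> list:
    """Last n items, equivalent to list(d)[-n:] for n > 0."""
    return list(islice(d, max(len(d) - n, 0), None))


last_item = recent[-1]
first_two = head(recent, 2)
last_two = tail(recent, 2)
```

`itertools.islice` avoids copying the whole deque for a head slice; for small diagnostic buffers, converting with `list(d)` and slicing the result is an equally reasonable choice.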

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 21:28:17 +03:00
parent fda040ae18
commit 6f5bda6d8f
9 changed files with 135 additions and 92 deletions
@@ -264,15 +264,11 @@ class PixelMapper:
 # Compute segment boundaries (matching get_edge_segments float stepping)
 step = edge_len / led_count
-boundaries = np.empty(led_count + 1, dtype=np.int64)
-for i in range(led_count + 1):
-    boundaries[i] = int(i * step)
+boundaries = (np.arange(led_count + 1, dtype=np.float64) * step).astype(np.int64)
 # Ensure each segment has at least 1 pixel
-for i in range(led_count):
-    if boundaries[i + 1] <= boundaries[i]:
-        boundaries[i + 1] = boundaries[i] + 1
+boundaries[1:] = np.maximum(boundaries[1:], boundaries[:-1] + 1)
 # Clamp all boundaries to edge_len (not just the last one)
-boundaries = np.minimum(boundaries, edge_len)
+np.minimum(boundaries, edge_len, out=boundaries)
 # Cumulative sum for O(1) range means — no per-LED Python numpy calls
 cumsum = np.zeros((edge_len + 1, 3), dtype=np.float64)