feat(android): on-device webcam capture via Camera2 (AndroidCameraEngine)

Add on-device webcam capture to the experimental Android-TV build. Desktop
captures webcams via OpenCV (no Chaquopy/Android wheel); this adds a push-based
AndroidCameraEngine that plugs into the same selection path desktop uses
(capture template engine_type="android_camera" + display_index, HAS_OWN_DISPLAYS).

A Kotlin CameraBridge (Camera2) enumerates cameras and opens them on demand —
only while a capture source is active, driven Python->Kotlin via a guarded jclass
singleton (BleBridge pattern) — converts each frame YUV_420_888->RGB, and pushes
RGB bytes into a module-level queue mirroring mediaprojection_engine.py. Cameras
surface as selectable displays like the desktop OpenCV engine; the data-driven
capture-template UI is unchanged. No new Python deps; no new Gradle deps
(Camera2 is in-platform).

Engine: ENGINE_PRIORITY=0 (never auto-selected over MediaProjection=100; explicit
engine_type only). Single-camera ownership is serialized with a lock + ref-count
(same-camera streams attach, different-camera refused, last release stops),
mirroring the desktop CameraEngine guard.

Permission: CAMERA requested at capture-start, gated on FEATURE_CAMERA_ANY so
camera-less TV boxes never prompt; graceful degradation when denied. The service
is promoted with the camera FGS type (+ FOREGROUND_SERVICE_CAMERA) only when
CAMERA is already granted, so backgrounded capture keeps working without risking
a failed startForeground on camera-less boxes (camera can't ride the
MediaProjection token the way audio playback capture does).

Reviewed via multi-agent adversarial pass (13 findings -> 4 fixed: device leak on
session-failure, multi-stream collision, camera FGS type, i18n key; 9 refuted).

Tests: 18 new desktop-CI tests (no device needed); full suite 1883 passed.
Verified: assembleDebug BUILD SUCCESSFUL, ruff clean.

Docs: ANDROID-REVIEW/android-webcam-capture-plan.md (design), updated
android-missing-functionality.md + README feature table + en/ru/zh locales.
This commit is contained in:
2026-06-02 13:36:23 +03:00
parent 34db5de8c3
commit 4bf3fe65db
14 changed files with 1480 additions and 17 deletions
+37 -14
View File
@@ -46,7 +46,7 @@ Python receiver engine mirroring that pattern.**
| System metrics | psutil | ✅ CPU/RAM/battery/thermal via `/proc`, `/sys` (`AndroidMetricsProvider`) | No |
| **Audio capture** | WASAPI / Sounddevice | ❌ no PortAudio | **Yes** |
| Notification capture | WinRT / D-Bus | ✅ NotificationListenerService → `push_notification()` | No (implemented) |
| Webcam capture | OpenCV | ❌ no OpenCV wheel | Yes (niche) |
| Webcam capture | OpenCV | ✅ Camera2 + on-demand bridge (`AndroidCameraEngine`) | No (implemented) |
| GPU monitoring | NVML | ❌ no NVIDIA GPU | Marginal |
| Capture from *another* Android phone | scrcpy/ADB | ❌ | Skip (redundant) |
| Automation: window/process conditions | Windows ctypes | ❌ sandboxed | Partial |
@@ -90,13 +90,34 @@ Python receiver engine mirroring that pattern.**
`app_name` / Android `getApplicationLabel`), so desktop-configured per-app colors/filters
may need re-matching on Android.
### 📷 Webcam capture — **FEASIBLE, LOW VALUE**
### 📷 Webcam capture — **IMPLEMENTED** ✅ (shipped)
- **Blocker** is `opencv-python-headless` (no Chaquopy cp311 wheel) — but capture doesn't
*need* OpenCV. Use **CameraX / Camera2** + `ImageReader` in Kotlin and push frames through
the same bridge as MediaProjection into a new `CameraBridgeEngine`.
- **Effort:** moderate. **Value:** low — TVs rarely have cameras; USB-UVC webcams need extra
device handling. Recommend deferring unless a concrete use case appears.
- **Blocker** was `opencv-python-headless` (no Chaquopy cp311 wheel) — but capture doesn't
*need* OpenCV. Implemented with **Camera2** + `ImageReader` in Kotlin pushing RGB frames
through the same bridge as MediaProjection into a new `AndroidCameraEngine`.
- **Path:** a Kotlin `CameraBridge` singleton (Camera2) enumerates cameras and **opens the
camera on demand** (only while a capture source is active — driven Python→Kotlin via the
`BleBridge`/`UsbSerialBridge` pattern), converts each frame YUV_420_888→RGB, and pushes it
into a push-based `AndroidCameraEngine` (`core/capture_engines/android_camera_engine.py`)
that mirrors `mediaprojection_engine.py`. Cameras surface as selectable "displays" exactly
like the desktop OpenCV `CameraEngine`; the data-driven capture-template UI (engine list +
`resolution` config + display picker) needs **no changes**. **No new Python deps; no new
Gradle deps** (Camera2 is in-platform).
- **Permission:** `CAMERA` requested at capture-start, gated on `FEATURE_CAMERA_ANY` so
camera-less TV boxes never see the prompt; graceful degradation when denied. The service is
promoted with the `camera` FGS type (+ `FOREGROUND_SERVICE_CAMERA`) **only when CAMERA is
already granted**, so backgrounded capture keeps working without risking a failed service
start on camera-less boxes. (Unlike audio playback capture, the camera can't ride the
MediaProjection token, so it needs its own FGS type to survive backgrounding.)
- **Effort:** moderate. **Value:** low (TVs rarely have cameras), but the implementation reuses
existing infrastructure end-to-end. **Priority `0`** so it's never auto-selected over
MediaProjection — chosen explicitly via `engine_type="android_camera"`.
- ⚠️ **MVP scope / limitations:** webcam capture works **while LedGrab capture is running**
(no camera-only server path on Android); one camera active at a time; `"auto"` picks a
balanced output size (not the sensor max) to keep per-frame YUV→RGB cheap; USB-UVC webcams
appear only if the device routes them through Camera2 (varies by box); no frame-rotation
correction.
- 📄 **See `android-webcam-capture-plan.md`** for the full implementation notes.
### 🎮 GPU monitoring — **MARGINAL, SKIP FOR NOW**
@@ -134,17 +155,19 @@ Python receiver engine mirroring that pattern.**
| Priority | Feature | Effort | Value | New Python deps | Status |
| -------- | ------- | ------ | ----- | --------------- | ------ |
| 1 | Notification capture | Moderate | High | None | **✅ Implemented** |
| 2 | Audio capture | Moderate | High | None | **Plan written** (this folder) |
| 3 | Automation: foreground-app condition | Moderate | Moderate | None | Idea |
| 4 | Webcam capture (CameraX) | Moderate | Low | None | Idea |
| 2 | Audio capture | Moderate | High | None | **✅ Implemented** |
| 4 | Webcam capture (Camera2) | Moderate | Low | None | **✅ Implemented** |
| 3 | Automation: foreground-app condition | Moderate | Moderate | None | Idea (only remaining) |
| — | GPU load (vendor sysfs) | LowMed | Low | None | Not recommended |
| — | Capture from another phone | — | — | — | Won't do |
| — | Multi-display / monitor names | Low | Low | None | Not recommended |
**Recommended order:** ship notifications → ship audio → reassess. Both reuse existing
infrastructure (bridge pattern, the MediaProjection consent token, the audio/notification
pipelines) and add **zero** Python dependencies, so neither risks the Chaquopy
`--no-deps` build constraint documented in `CLAUDE.md`.
**Status:** notifications, audio, **and webcam** are all shipped — each reuses existing
infrastructure (bridge pattern, the MediaProjection consent token / process-global
`Python.getInstance()`, the capture/audio/notification pipelines) and adds **zero** Python
dependencies, so none risks the Chaquopy `--no-deps` build constraint documented in
`CLAUDE.md`. The only remaining idea is the **foreground-app automation condition** (moderate
value); GPU load, another-phone capture, and multi-display remain not-recommended / won't-do.
## Cross-cutting notes
@@ -0,0 +1,168 @@
# Plan: Android on-device webcam capture
> Status: **implemented** on branch `feature/android-webcam-capture`. Last updated 2026-06-02.
## Context
LedGrab captures webcams on desktop through OpenCV (`cv2.VideoCapture`) in
`server/src/ledgrab/core/capture_engines/camera_engine.py`. On the **experimental Android-TV
build**, `opencv-python-headless` has no Chaquopy cp311 wheel, so the camera engine never
loads and cameras are unusable on-device.
Android doesn't need OpenCV to capture a camera: the platform exposes **Camera2**
(`android.hardware.camera2`), and the codebase already has the bridge shape to plug a Kotlin
capture source into a push-based Python engine. This feature adds an on-device camera engine
so a USB/integrated camera can drive ambient lighting, at parity with how the desktop OpenCV
camera engine feeds the pipeline.
The design mirrors the working screen-capture bridge
(`mediaprojection_engine.py``ScreenCapture.kt`) and the just-shipped audio engine
(`android_audio_engine.py``AudioCapture.kt`). **No new Python dependencies** (numpy already
bundled) and **no new Gradle dependencies** (Camera2 is in-platform) → no Chaquopy /
`build.gradle.kts` changes.
## Approach
A new **push-based** capture engine registered in the existing `EngineRegistry`, plus a Kotlin
`CameraBridge` that opens the camera **on demand**:
```
[capture source acquired] → AndroidCameraCaptureStream.initialize()
→ android_camera_engine.start_camera(index, w, h) [guarded jclass]
→ CameraBridge.startCamera(index, w, h) [Camera2 open + session]
→ onImageAvailable → YUV_420_888→RGB (stride-aware) → push_frame(rgbBytes, w, h)
→ android_camera_engine [module-level queue] → AndroidCameraCaptureStream.capture_frame()
→ ScreenCaptureLiveStream → processing pipeline [unchanged]
[capture source released] → AndroidCameraCaptureStream.cleanup()
→ android_camera_engine.stop_camera() → CameraBridge.stopCamera() [releases the camera]
```
The camera is **only open while a camera source is active** — the camera-in-use indicator and
battery cost are bounded to actual use, unlike always-on screen/audio capture. This on-demand
control reuses the synchronous Python→Kotlin singleton pattern of `BleBridge`/`UsbSerialBridge`.
## Selection path (why nothing downstream changes)
Webcams on desktop are a `ScreenCapturePictureSource` (`stream_type="raw"`) bound to a capture
template whose `engine_type="camera"` + a `display_index`. `live_stream_manager`
`_create_screen_capture_live_stream` reads `engine_type` from the template and calls
`EngineRegistry.create_stream(engine_type, display_index, config)`. Android adds
`engine_type="android_camera"` — the **same path**. The frontend
(`static/js/features/streams-capture-templates.ts`) is fully data-driven: the engine list,
the `resolution` config dropdown (keyed by field name), and the camera picker
(`/config/displays?engine_type=android_camera`, since `HAS_OWN_DISPLAYS=True`) all work with
no frontend changes.
## Part A — Python (`core/capture_engines/android_camera_engine.py`)
Mirrors `mediaprojection_engine.py` (module-level `queue.Queue` + `push_frame` + `_last_frame`
fallback + drop-oldest) and the desktop `CameraEngine` shape (cameras as displays,
`resolution` config).
- `_camera_bridge()` — lazy, `is_android()`-guarded `from java import jclass;
jclass("com.ledgrab.android.CameraBridge").INSTANCE`. **Never imported at module load** (this
module imports on desktop CI). Mirrors `core/devices/android_ble_transport.py`.
- `list_cameras()` → parses `CameraBridge.listCameras()` JSON into
`[{"index","name","facing"}]`; `_enumerate_cameras()` caches it (30 s TTL).
- `push_frame(rgb_bytes, w, h)` → `np.frombuffer(...uint8)` reshape **`(h, w, 3)`** (RGB, 3
B/px — NOT the RGBA `(h,w,4)` of the screen engine) → `.copy()` → drop-oldest enqueue. A
short/malformed buffer is dropped, never reshape-crashes.
- `start_camera(index, w, h) -> bool` / `stop_camera(index)` → guarded bridge calls.
- `AndroidCameraEngine`: `ENGINE_TYPE="android_camera"`, `ENGINE_PRIORITY=0` (never
auto-selected over MediaProjection=100 — explicit `engine_type` only), `HAS_OWN_DISPLAYS=True`,
`is_available()=is_android() and ≥1 enumerated camera`, `get_config_choices()` exposes
`resolution` (same presets as desktop).
- `AndroidCameraCaptureStream`: `initialize()` parses `resolution` → `start_camera(...)` (raises
if it returns False), drains stale frames; `capture_frame()` pops queue / returns `_last_frame`;
`cleanup()` → `stop_camera(...)`.
Registered in `capture_engines/__init__.py` behind a guarded import (mirrors the
mediaprojection block).
## Part B — Android (`CameraBridge.kt`)
`object CameraBridge` (mirrors `BleBridge`):
- `init(context)` — from `LedGrabApp.onCreate` (context only, no camera opened).
- `listCameras(): String` — JSON array from `CameraManager.cameraIdList` + `LENS_FACING`
(front/back/external). No CAMERA permission needed.
- `startCamera(index, width, height): Boolean` — checks CAMERA permission; resolves cameraId;
picks the supported YUV size closest to the request (balanced default ≤1280×720 for "auto");
opens device + capture session on a private `HandlerThread`, blocking until configured
(`runBlocking { withTimeout { ... } }` over `suspendCancellableCoroutine`-wrapped Camera2
callbacks); sets a repeating preview request. Returns false (no throw across JNI) on
permission/range/configure failure. Closes any prior camera first.
- `onImageAvailable` → paced (≈20 fps) → stride-aware **YUV_420_888→RGB** (BT.601 fixed-point,
reused plane + RGB buffers) → push to the cached `android_camera_engine` module handle.
- `stopCamera()` — stops repeating, closes session/device/reader, idempotent.
## Part C — Wiring + permission + manifest
- `LedGrabApp.kt` — `CameraBridge.init(this)` next to `BleBridge.init`.
- `MainActivity.kt` — `ensureCameraPermission()` (mirror `ensureAudioPermission`): request
`CAMERA` iff `hasSystemFeature(FEATURE_CAMERA_ANY)`; called from both `startCaptureService`
(MediaProjection path) and `startRootCaptureService` (root path). Fire-and-forget.
- `AndroidManifest.xml` — `<uses-permission CAMERA>` + `<uses-feature camera.any required=false>`
+ `<uses-permission FOREGROUND_SERVICE_CAMERA>`, and `camera` added to the `CaptureService`
`foregroundServiceType` union (`mediaProjection|specialUse|camera`).
- `CaptureService.onStartCommand` — on API 34+, OR `FOREGROUND_SERVICE_TYPE_CAMERA` into the
promotion type **only when CAMERA is already granted**. Unlike audio playback capture (which
rides the MediaProjection token under the mediaProjection type), the camera has no such
coupling, so without its own FGS type Android 14+ revokes camera access once the app is
backgrounded. The conditional guard avoids a failed `startForeground` (which would kill the
whole service) on a camera-less / not-yet-granted box. If CAMERA is granted later, the camera
type takes effect on the next Start.
- No `proguard-rules.pro` change — the blanket `-keep class com.ledgrab.android.** { *; }`
already covers `CameraBridge`, and R8/minify is disabled.
## What does NOT change
- **Frontend / API** — data-driven engine list, config, and display picker.
- **`build.gradle.kts` / Chaquopy pip block** — no new Python or Gradle packages.
- **Processing pipeline** — `ScreenCaptureLiveStream`, filters, color-strip sources unchanged.
## Files
**Create**
- `server/src/ledgrab/core/capture_engines/android_camera_engine.py`
- `android/app/src/main/java/com/ledgrab/android/CameraBridge.kt`
- `server/tests/core/test_android_camera_engine.py`
**Modify**
- `server/src/ledgrab/core/capture_engines/__init__.py` — guarded import + registration.
- `android/app/src/main/java/com/ledgrab/android/LedGrabApp.kt` — `CameraBridge.init`.
- `android/app/src/main/java/com/ledgrab/android/MainActivity.kt` — `ensureCameraPermission`.
- `android/app/src/main/AndroidManifest.xml` — `CAMERA` + `camera.any`.
## Tests (Python — desktop CI, no device)
`server/tests/core/test_android_camera_engine.py`: push→capture round-trips RGB `(h,w,3)`;
drop-oldest when full; `_last_frame` fallback on empty; short-buffer never crashes;
`initialize()` opens with parsed/auto resolution and raises on open-failure / off-Android;
`cleanup()` closes once (idempotent); `is_available()` gating (android + cameras); display
enumeration; priority 0 never beats MediaProjection; create-via-registry yields a pushed frame.
## Verification
1. **Python:** `py -3.13 -m pytest tests/core/test_android_camera_engine.py --no-cov -q`, then
the full suite (1880 passed, 2 skipped; 15 new).
2. **Lint:** `ruff check src/ tests/ --fix` — clean.
3. **Android build:** `./gradlew :app:assembleDebug` — BUILD SUCCESSFUL.
4. **On device (manual):** install APK → Start capture → grant CAMERA → create a capture
template with engine `android_camera` + a camera display + a ScreenCapture source bound to
a strip → confirm LEDs react to the camera feed and the camera indicator only lights while
the source is active.
## Risks / notes
- **MVP scope:** webcam works **while LedGrab capture is running** (the Python server only runs
inside `CaptureService`; there is no camera-only start path on Android).
- **One camera at a time:** `startCamera` closes any previously-open camera first.
- **`"auto"` resolution** picks a balanced output size (~720p), not the sensor max, to keep the
per-frame YUV→RGB conversion cheap on low-end TV boxes.
- **USB-UVC webcams** appear only if the device exposes them through Camera2 (`LENS_FACING_EXTERNAL`),
which varies by box; an explicit UVC library would be a separate, larger effort.
- **No frame-rotation correction** — sensor orientation is not applied (ambient color sampling
is largely orientation-tolerant); could be added later.
- **CAMERA denied** → the engine reports no usable camera and capture proceeds without it.