feat(android): on-device webcam capture via Camera2 (AndroidCameraEngine)

Add on-device webcam capture to the experimental Android-TV build. Desktop
captures webcams via OpenCV, which has no Chaquopy/Android wheel, so this adds a
push-based AndroidCameraEngine that plugs into the same selection path desktop
uses (capture template engine_type="android_camera" + display_index,
HAS_OWN_DISPLAYS).

A Kotlin CameraBridge (Camera2) enumerates cameras and opens them on demand,
only while a capture source is active. Opening is driven Python->Kotlin via a
guarded jclass singleton (BleBridge pattern). The bridge converts each frame
YUV_420_888->RGB and pushes the RGB bytes into a module-level queue mirroring
mediaprojection_engine.py. Cameras surface as selectable displays like the
desktop OpenCV engine; the data-driven capture-template UI is unchanged. No new
Python deps; no new Gradle deps (Camera2 is in-platform).
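
A minimal sketch of the push-based hand-off described above, assuming a bounded
module-level queue that drops the oldest frame when the consumer falls behind.
The names (_FRAME_QUEUE, push_frame, pop_latest_frame) and the queue depth are
hypothetical and are not taken from the actual engine module:

```python
import queue
from typing import Optional, Tuple

# (width, height, packed RGB bytes); hypothetical frame layout for illustration.
Frame = Tuple[int, int, bytes]

# Small bound so a slow consumer never accumulates stale frames (assumed depth).
_FRAME_QUEUE: "queue.Queue[Frame]" = queue.Queue(maxsize=2)


def push_frame(width: int, height: int, rgb: bytes) -> bool:
    """Producer side (called from the bridge): enqueue one RGB frame.

    Returns False if the buffer size does not match width * height * 3.
    When the queue is full, the oldest frame is discarded to make room.
    """
    if len(rgb) != width * height * 3:
        return False
    while True:
        try:
            _FRAME_QUEUE.put_nowait((width, height, rgb))
            return True
        except queue.Full:
            try:
                _FRAME_QUEUE.get_nowait()  # drop the oldest frame
            except queue.Empty:
                pass  # consumer drained it in the meantime; retry the put


def pop_latest_frame() -> Optional[Frame]:
    """Consumer side: drain the queue and return the newest frame, if any."""
    frame: Optional[Frame] = None
    while True:
        try:
            frame = _FRAME_QUEUE.get_nowait()
        except queue.Empty:
            return frame
```

Dropping the oldest frame rather than blocking keeps the camera callback thread
from stalling when the consumer is slow, at the cost of skipped frames.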

Engine: ENGINE_PRIORITY=0 (never auto-selected over MediaProjection=100; explicit
engine_type only). Single-camera ownership is serialized with a lock + ref-count
(same-camera streams attach, different-camera refused, last release stops),
mirroring the desktop CameraEngine guard.
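
The ownership rule above (same camera attaches, different camera is refused,
last release stops) can be sketched as a lock-protected ref-count. This is an
illustrative model only; the class and method names are hypothetical and do not
come from the engine code:

```python
import threading
from typing import Optional


class CameraOwnership:
    """Hypothetical guard: at most one camera id is owned at any time."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._camera_id: Optional[str] = None
        self._refs = 0

    def acquire(self, camera_id: str) -> bool:
        """Attach a stream; refuse if a different camera is already open."""
        with self._lock:
            if self._refs > 0 and self._camera_id != camera_id:
                return False
            self._camera_id = camera_id
            self._refs += 1
            return True

    def release(self, camera_id: str) -> bool:
        """Detach a stream; return True when the caller should stop the camera."""
        with self._lock:
            if self._refs == 0 or self._camera_id != camera_id:
                return False  # nothing owned, or not the owning camera
            self._refs -= 1
            if self._refs == 0:
                self._camera_id = None
                return True  # last release: stop the device
            return False
```

Returning the "should stop" decision from release() lets the caller close the
device outside the lock, so a slow close cannot block other streams from
querying ownership.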

Permission: CAMERA requested at capture-start, gated on FEATURE_CAMERA_ANY so
camera-less TV boxes never prompt; graceful degradation when denied. The service
is promoted with the camera FGS type (+ FOREGROUND_SERVICE_CAMERA) only when
CAMERA is already granted, so backgrounded capture keeps working without risking
a failed startForeground on camera-less boxes (camera can't ride the
MediaProjection token the way audio playback capture does).
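
The permission and service-type decisions above can be modelled as two small
functions. This is a Python model of logic that lives on the Kotlin side; the
function names are hypothetical, and the numeric values are assumed to mirror
the FOREGROUND_SERVICE_TYPE_MEDIA_PROJECTION and FOREGROUND_SERVICE_TYPE_CAMERA
constants in android.content.pm.ServiceInfo:

```python
# Assumed to match android.content.pm.ServiceInfo bit flags.
FGS_TYPE_MEDIA_PROJECTION = 0x20
FGS_TYPE_CAMERA = 0x40


def should_request_camera(has_camera_feature: bool, camera_granted: bool) -> bool:
    """Prompt for CAMERA only on devices that report FEATURE_CAMERA_ANY
    and have not granted the permission yet."""
    return has_camera_feature and not camera_granted


def foreground_service_types(camera_granted: bool) -> int:
    """Add the camera FGS type only when CAMERA is already granted, so
    startForeground is never asked for a type the app cannot hold."""
    types = FGS_TYPE_MEDIA_PROJECTION
    if camera_granted:
        types |= FGS_TYPE_CAMERA
    return types
```

On a camera-less box both checks fall through: no prompt is shown and the
service keeps only the MediaProjection type.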

Reviewed via multi-agent adversarial pass (13 findings -> 4 fixed: device leak on
session-failure, multi-stream collision, camera FGS type, i18n key; 9 refuted).

Tests: 18 new desktop-CI tests (no device needed); full suite 1883 passed.
Verified: assembleDebug BUILD SUCCESSFUL, ruff clean.

Docs: ANDROID-REVIEW/android-webcam-capture-plan.md (design), updated
android-missing-functionality.md + README feature table + en/ru/zh locales.
2026-06-02 13:36:23 +03:00
parent 34db5de8c3
commit 4bf3fe65db
14 changed files with 1480 additions and 17 deletions
+37 -14
@@ -46,7 +46,7 @@ Python receiver engine mirroring that pattern.**
| System metrics | psutil | ✅ CPU/RAM/battery/thermal via `/proc`, `/sys` (`AndroidMetricsProvider`) | No |
| **Audio capture** | WASAPI / Sounddevice | ❌ no PortAudio | **Yes** |
| Notification capture | WinRT / D-Bus | ✅ NotificationListenerService → `push_notification()` | No (implemented) |
-| Webcam capture | OpenCV | ❌ no OpenCV wheel | Yes (niche) |
+| Webcam capture | OpenCV | ✅ Camera2 + on-demand bridge (`AndroidCameraEngine`) | No (implemented) |
| GPU monitoring | NVML | ❌ no NVIDIA GPU | Marginal |
| Capture from *another* Android phone | scrcpy/ADB | ❌ | Skip (redundant) |
| Automation: window/process conditions | Windows ctypes | ❌ sandboxed | Partial |
@@ -90,13 +90,34 @@ Python receiver engine mirroring that pattern.**
`app_name` / Android `getApplicationLabel`), so desktop-configured per-app colors/filters
may need re-matching on Android.
-### 📷 Webcam capture — **FEASIBLE, LOW VALUE**
+### 📷 Webcam capture — **IMPLEMENTED** ✅ (shipped)
-- **Blocker** is `opencv-python-headless` (no Chaquopy cp311 wheel) — but capture doesn't
-*need* OpenCV. Use **CameraX / Camera2** + `ImageReader` in Kotlin and push frames through
-the same bridge as MediaProjection into a new `CameraBridgeEngine`.
-- **Effort:** moderate. **Value:** low — TVs rarely have cameras; USB-UVC webcams need extra
-device handling. Recommend deferring unless a concrete use case appears.
+- **Blocker** was `opencv-python-headless` (no Chaquopy cp311 wheel) — but capture doesn't
+*need* OpenCV. Implemented with **Camera2** + `ImageReader` in Kotlin pushing RGB frames
+through the same bridge as MediaProjection into a new `AndroidCameraEngine`.
+- **Path:** a Kotlin `CameraBridge` singleton (Camera2) enumerates cameras and **opens the
+camera on demand** (only while a capture source is active — driven Python→Kotlin via the
+`BleBridge`/`UsbSerialBridge` pattern), converts each frame YUV_420_888→RGB, and pushes it
+into a push-based `AndroidCameraEngine` (`core/capture_engines/android_camera_engine.py`)
+that mirrors `mediaprojection_engine.py`. Cameras surface as selectable "displays" exactly
+like the desktop OpenCV `CameraEngine`; the data-driven capture-template UI (engine list +
+`resolution` config + display picker) needs **no changes**. **No new Python deps; no new
+Gradle deps** (Camera2 is in-platform).
+- **Permission:** `CAMERA` requested at capture-start, gated on `FEATURE_CAMERA_ANY` so
+camera-less TV boxes never see the prompt; graceful degradation when denied. The service is
+promoted with the `camera` FGS type (+ `FOREGROUND_SERVICE_CAMERA`) **only when CAMERA is
+already granted**, so backgrounded capture keeps working without risking a failed service
+start on camera-less boxes. (Unlike audio playback capture, the camera can't ride the
+MediaProjection token, so it needs its own FGS type to survive backgrounding.)
+- **Effort:** moderate. **Value:** low (TVs rarely have cameras), but the implementation reuses
+existing infrastructure end-to-end. **Priority `0`** so it's never auto-selected over
+MediaProjection — chosen explicitly via `engine_type="android_camera"`.
+- ⚠️ **MVP scope / limitations:** webcam capture works **while LedGrab capture is running**
+(no camera-only server path on Android); one camera active at a time; `"auto"` picks a
+balanced output size (not the sensor max) to keep per-frame YUV→RGB cheap; USB-UVC webcams
+appear only if the device routes them through Camera2 (varies by box); no frame-rotation
+correction.
+- 📄 **See `android-webcam-capture-plan.md`** for the full implementation notes.
### 🎮 GPU monitoring — **MARGINAL, SKIP FOR NOW**
@@ -134,17 +155,19 @@ Python receiver engine mirroring that pattern.**
| Priority | Feature | Effort | Value | New Python deps | Status |
| -------- | ------- | ------ | ----- | --------------- | ------ |
| 1 | Notification capture | Moderate | High | None | **✅ Implemented** |
-| 2 | Audio capture | Moderate | High | None | **Plan written** (this folder) |
-| 3 | Automation: foreground-app condition | Moderate | Moderate | None | Idea |
-| 4 | Webcam capture (CameraX) | Moderate | Low | None | Idea |
+| 2 | Audio capture | Moderate | High | None | **✅ Implemented** |
+| 4 | Webcam capture (Camera2) | Moderate | Low | None | **✅ Implemented** |
+| 3 | Automation: foreground-app condition | Moderate | Moderate | None | Idea (only remaining) |
| — | GPU load (vendor sysfs) | Low-Med | Low | None | Not recommended |
| — | Capture from another phone | — | — | — | Won't do |
| — | Multi-display / monitor names | Low | Low | None | Not recommended |
-**Recommended order:** ship notifications → ship audio → reassess. Both reuse existing
-infrastructure (bridge pattern, the MediaProjection consent token, the audio/notification
-pipelines) and add **zero** Python dependencies, so neither risks the Chaquopy
-`--no-deps` build constraint documented in `CLAUDE.md`.
+**Status:** notifications, audio, **and webcam** are all shipped — each reuses existing
+infrastructure (bridge pattern, the MediaProjection consent token / process-global
+`Python.getInstance()`, the capture/audio/notification pipelines) and adds **zero** Python
+dependencies, so none risks the Chaquopy `--no-deps` build constraint documented in
+`CLAUDE.md`. The only remaining idea is the **foreground-app automation condition** (moderate
+value); GPU load, another-phone capture, and multi-display remain not-recommended / won't-do.
## Cross-cutting notes