feat(android): on-device webcam capture via Camera2 (AndroidCameraEngine)
Add on-device webcam capture to the experimental Android-TV build. Desktop captures webcams via OpenCV (no Chaquopy/Android wheel); this adds a push-based AndroidCameraEngine that plugs into the same selection path desktop uses (capture template engine_type="android_camera" + display_index, HAS_OWN_DISPLAYS).

A Kotlin CameraBridge (Camera2) enumerates cameras and opens them on demand, only while a capture source is active, driven Python->Kotlin via a guarded jclass singleton (BleBridge pattern). It converts each frame YUV_420_888->RGB and pushes RGB bytes into a module-level queue mirroring mediaprojection_engine.py. Cameras surface as selectable displays like the desktop OpenCV engine; the data-driven capture-template UI is unchanged. No new Python deps; no new Gradle deps (Camera2 is in-platform).

Engine: ENGINE_PRIORITY=0 (never auto-selected over MediaProjection=100; explicit engine_type only). Single-camera ownership is serialized with a lock + ref-count (same-camera streams attach, different-camera refused, last release stops), mirroring the desktop CameraEngine guard.

Permission: CAMERA requested at capture-start, gated on FEATURE_CAMERA_ANY so camera-less TV boxes never prompt; graceful degradation when denied. The service is promoted with the camera FGS type (+ FOREGROUND_SERVICE_CAMERA) only when CAMERA is already granted, so backgrounded capture keeps working without risking a failed startForeground on camera-less boxes (camera can't ride the MediaProjection token the way audio playback capture does).

Reviewed via multi-agent adversarial pass (13 findings -> 4 fixed: device leak on session-failure, multi-stream collision, camera FGS type, i18n key; 9 refuted).

Tests: 18 new desktop-CI tests (no device needed); full suite 1883 passed.

Verified: assembleDebug BUILD SUCCESSFUL, ruff clean.

Docs: ANDROID-REVIEW/android-webcam-capture-plan.md (design), updated android-missing-functionality.md + README feature table + en/ru/zh locales.
@@ -46,7 +46,7 @@ Python receiver engine mirroring that pattern.**
|
||||
| System metrics | psutil | ✅ CPU/RAM/battery/thermal via `/proc`, `/sys` (`AndroidMetricsProvider`) | No |
|
||||
| **Audio capture** | WASAPI / Sounddevice | ❌ no PortAudio | **Yes** |
|
||||
| Notification capture | WinRT / D-Bus | ✅ NotificationListenerService → `push_notification()` | No (implemented) |
|
||||
| Webcam capture | OpenCV | ❌ no OpenCV wheel | Yes (niche) |
|
||||
| Webcam capture | OpenCV | ✅ Camera2 + on-demand bridge (`AndroidCameraEngine`) | No (implemented) |
|
||||
| GPU monitoring | NVML | ❌ no NVIDIA GPU | Marginal |
|
||||
| Capture from *another* Android phone | scrcpy/ADB | ❌ | Skip (redundant) |
|
||||
| Automation: window/process conditions | Windows ctypes | ❌ sandboxed | Partial |
|
||||
@@ -90,13 +90,34 @@ Python receiver engine mirroring that pattern.**
|
||||
`app_name` / Android `getApplicationLabel`), so desktop-configured per-app colors/filters
|
||||
may need re-matching on Android.
|
||||
|
||||
### 📷 Webcam capture — **FEASIBLE, LOW VALUE**
|
||||
### 📷 Webcam capture — **IMPLEMENTED** ✅ (shipped)
|
||||
|
||||
- **Blocker** is `opencv-python-headless` (no Chaquopy cp311 wheel) — but capture doesn't
|
||||
*need* OpenCV. Use **CameraX / Camera2** + `ImageReader` in Kotlin and push frames through
|
||||
the same bridge as MediaProjection into a new `CameraBridgeEngine`.
|
||||
- **Effort:** moderate. **Value:** low — TVs rarely have cameras; USB-UVC webcams need extra
|
||||
device handling. Recommend deferring unless a concrete use case appears.
|
||||
- **Blocker** was `opencv-python-headless` (no Chaquopy cp311 wheel) — but capture doesn't
|
||||
*need* OpenCV. Implemented with **Camera2** + `ImageReader` in Kotlin pushing RGB frames
|
||||
through the same bridge as MediaProjection into a new `AndroidCameraEngine`.
|
||||
- **Path:** a Kotlin `CameraBridge` singleton (Camera2) enumerates cameras and **opens the
|
||||
camera on demand** (only while a capture source is active — driven Python→Kotlin via the
|
||||
`BleBridge`/`UsbSerialBridge` pattern), converts each frame YUV_420_888→RGB, and pushes it
|
||||
into a push-based `AndroidCameraEngine` (`core/capture_engines/android_camera_engine.py`)
|
||||
that mirrors `mediaprojection_engine.py`. Cameras surface as selectable "displays" exactly
|
||||
like the desktop OpenCV `CameraEngine`; the data-driven capture-template UI (engine list +
|
||||
`resolution` config + display picker) needs **no changes**. **No new Python deps; no new
|
||||
Gradle deps** (Camera2 is in-platform).
|
||||
- **Permission:** `CAMERA` requested at capture-start, gated on `FEATURE_CAMERA_ANY` so
|
||||
camera-less TV boxes never see the prompt; graceful degradation when denied. The service is
|
||||
promoted with the `camera` FGS type (+ `FOREGROUND_SERVICE_CAMERA`) **only when CAMERA is
|
||||
already granted**, so backgrounded capture keeps working without risking a failed service
|
||||
start on camera-less boxes. (Unlike audio playback capture, the camera can't ride the
|
||||
MediaProjection token, so it needs its own FGS type to survive backgrounding.)
|
||||
- **Effort:** moderate. **Value:** low (TVs rarely have cameras), but the implementation reuses
|
||||
existing infrastructure end-to-end. **Priority `0`** so it's never auto-selected over
|
||||
MediaProjection — chosen explicitly via `engine_type="android_camera"`.
|
||||
- ⚠️ **MVP scope / limitations:** webcam capture works **while LedGrab capture is running**
|
||||
(no camera-only server path on Android); one camera active at a time; `"auto"` picks a
|
||||
balanced output size (not the sensor max) to keep per-frame YUV→RGB cheap; USB-UVC webcams
|
||||
appear only if the device routes them through Camera2 (varies by box); no frame-rotation
|
||||
correction.
|
||||
- 📄 **See `android-webcam-capture-plan.md`** for the full implementation notes.
|
||||
|
||||
### 🎮 GPU monitoring — **MARGINAL, SKIP FOR NOW**
|
||||
|
||||
@@ -134,17 +155,19 @@ Python receiver engine mirroring that pattern.**
|
||||
| Priority | Feature | Effort | Value | New Python deps | Status |
|
||||
| -------- | ------- | ------ | ----- | --------------- | ------ |
|
||||
| 1 | Notification capture | Moderate | High | None | **✅ Implemented** |
|
||||
| 2 | Audio capture | Moderate | High | None | **Plan written** (this folder) |
|
||||
| 3 | Automation: foreground-app condition | Moderate | Moderate | None | Idea |
|
||||
| 4 | Webcam capture (CameraX) | Moderate | Low | None | Idea |
|
||||
| 2 | Audio capture | Moderate | High | None | **✅ Implemented** |
|
||||
| 4 | Webcam capture (Camera2) | Moderate | Low | None | **✅ Implemented** |
|
||||
| 3 | Automation: foreground-app condition | Moderate | Moderate | None | Idea (only remaining) |
|
||||
| — | GPU load (vendor sysfs) | Low–Med | Low | None | Not recommended |
|
||||
| — | Capture from another phone | — | — | — | Won't do |
|
||||
| — | Multi-display / monitor names | Low | Low | None | Not recommended |
|
||||
|
||||
**Recommended order:** ship notifications → ship audio → reassess. Both reuse existing
|
||||
infrastructure (bridge pattern, the MediaProjection consent token, the audio/notification
|
||||
pipelines) and add **zero** Python dependencies, so neither risks the Chaquopy
|
||||
`--no-deps` build constraint documented in `CLAUDE.md`.
|
||||
**Status:** notifications, audio, **and webcam** are all shipped — each reuses existing
|
||||
infrastructure (bridge pattern, the MediaProjection consent token / process-global
|
||||
`Python.getInstance()`, the capture/audio/notification pipelines) and adds **zero** Python
|
||||
dependencies, so none risks the Chaquopy `--no-deps` build constraint documented in
|
||||
`CLAUDE.md`. The only remaining idea is the **foreground-app automation condition** (moderate
|
||||
value); GPU load, another-phone capture, and multi-display remain not-recommended / won't-do.
|
||||
|
||||
## Cross-cutting notes
|
||||
|
||||
|
||||
@@ -0,0 +1,168 @@
|
||||
# Plan: Android on-device webcam capture
|
||||
|
||||
> Status: **implemented** on branch `feature/android-webcam-capture`. Last updated 2026-06-02.
|
||||
|
||||
## Context
|
||||
|
||||
LedGrab captures webcams on desktop through OpenCV (`cv2.VideoCapture`) in
|
||||
`server/src/ledgrab/core/capture_engines/camera_engine.py`. On the **experimental Android-TV
|
||||
build**, `opencv-python-headless` has no Chaquopy cp311 wheel, so the camera engine never
|
||||
loads and cameras are unusable on-device.
|
||||
|
||||
Android doesn't need OpenCV to capture a camera: the platform exposes **Camera2**
|
||||
(`android.hardware.camera2`), and the codebase already has the bridge shape to plug a Kotlin
|
||||
capture source into a push-based Python engine. This feature adds an on-device camera engine
|
||||
so a USB/integrated camera can drive ambient lighting, at parity with how the desktop OpenCV
|
||||
camera engine feeds the pipeline.
|
||||
|
||||
The design mirrors the working screen-capture bridge
|
||||
(`mediaprojection_engine.py` ↔ `ScreenCapture.kt`) and the just-shipped audio engine
|
||||
(`android_audio_engine.py` ↔ `AudioCapture.kt`). **No new Python dependencies** (numpy already
|
||||
bundled) and **no new Gradle dependencies** (Camera2 is in-platform) → no Chaquopy /
|
||||
`build.gradle.kts` changes.
|
||||
|
||||
## Approach
|
||||
|
||||
A new **push-based** capture engine registered in the existing `EngineRegistry`, plus a Kotlin
|
||||
`CameraBridge` that opens the camera **on demand**:
|
||||
|
||||
```
[capture source acquired] → AndroidCameraCaptureStream.initialize()
→ android_camera_engine.start_camera(index, w, h) [guarded jclass]
→ CameraBridge.startCamera(index, w, h) [Camera2 open + session]
→ onImageAvailable → YUV_420_888→RGB (stride-aware) → push_frame(rgbBytes, w, h)
→ android_camera_engine [module-level queue] → AndroidCameraCaptureStream.capture_frame()
→ ScreenCaptureLiveStream → processing pipeline [unchanged]

[capture source released] → AndroidCameraCaptureStream.cleanup()
→ android_camera_engine.stop_camera() → CameraBridge.stopCamera() [releases the camera]
```
|
||||
|
||||
The camera is **only open while a camera source is active** — the camera-in-use indicator and
|
||||
battery cost are bounded to actual use, unlike always-on screen/audio capture. This on-demand
|
||||
control reuses the synchronous Python→Kotlin singleton pattern of `BleBridge`/`UsbSerialBridge`.
|
||||
|
||||
## Selection path (why nothing downstream changes)
|
||||
|
||||
Webcams on desktop are a `ScreenCapturePictureSource` (`stream_type="raw"`) bound to a capture
|
||||
template with `engine_type="camera"` and a `display_index`. In `live_stream_manager`,
|
||||
`_create_screen_capture_live_stream` reads `engine_type` from the template and calls
|
||||
`EngineRegistry.create_stream(engine_type, display_index, config)`. Android adds
|
||||
`engine_type="android_camera"` — the **same path**. The frontend
|
||||
(`static/js/features/streams-capture-templates.ts`) is fully data-driven: the engine list,
|
||||
the `resolution` config dropdown (keyed by field name), and the camera picker
|
||||
(`/config/displays?engine_type=android_camera`, since `HAS_OWN_DISPLAYS=True`) all work with
|
||||
no frontend changes.
|
||||
|
||||
## Part A — Python (`core/capture_engines/android_camera_engine.py`)
|
||||
|
||||
Mirrors `mediaprojection_engine.py` (module-level `queue.Queue` + `push_frame` + `_last_frame`
|
||||
fallback + drop-oldest) and the desktop `CameraEngine` shape (cameras as displays,
|
||||
`resolution` config).
|
||||
|
||||
- `_camera_bridge()` — lazy, `is_android()`-guarded `from java import jclass;
|
||||
jclass("com.ledgrab.android.CameraBridge").INSTANCE`. **Never imported at module load** (this
|
||||
module imports on desktop CI). Mirrors `core/devices/android_ble_transport.py`.
|
||||
- `list_cameras()` → parses `CameraBridge.listCameras()` JSON into
|
||||
`[{"index","name","facing"}]`; `_enumerate_cameras()` caches it (30 s TTL).
|
||||
- `push_frame(rgb_bytes, w, h)` → `np.frombuffer(...uint8)` reshape **`(h, w, 3)`** (RGB, 3
|
||||
B/px — NOT the RGBA `(h,w,4)` of the screen engine) → `.copy()` → drop-oldest enqueue. A
|
||||
short/malformed buffer is dropped, never reshape-crashes.
|
||||
- `start_camera(index, w, h) -> bool` / `stop_camera(index)` → guarded bridge calls.
|
||||
- `AndroidCameraEngine`: `ENGINE_TYPE="android_camera"`, `ENGINE_PRIORITY=0` (never
|
||||
auto-selected over MediaProjection=100 — explicit `engine_type` only), `HAS_OWN_DISPLAYS=True`,
|
||||
`is_available()=is_android() and ≥1 enumerated camera`, `get_config_choices()` exposes
|
||||
`resolution` (same presets as desktop).
|
||||
- `AndroidCameraCaptureStream`: `initialize()` parses `resolution` → `start_camera(...)` (raises
|
||||
if it returns False), drains stale frames; `capture_frame()` pops queue / returns `_last_frame`;
|
||||
`cleanup()` → `stop_camera(...)`.
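
The queue behaviour described in the bullets above (drop-oldest enqueue, `_last_frame` fallback, short buffers dropped) can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the shipped module: the `Frame` container, the queue size of 2, and the boolean return value are assumptions, and the real engine reshapes the bytes into a NumPy `(h, w, 3)` array rather than keeping raw bytes.

```python
import queue
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Frame:
    """Hypothetical container; the real engine stores a NumPy (h, w, 3) array."""
    width: int
    height: int
    rgb: bytes  # packed RGB, 3 bytes per pixel


# Module-level state, as in the plan; maxsize=2 is an assumed value.
_frame_queue: "queue.Queue[Frame]" = queue.Queue(maxsize=2)
_last_frame: Optional[Frame] = None


def push_frame(rgb_bytes: bytes, w: int, h: int) -> bool:
    """Enqueue one frame; return False when the buffer is rejected."""
    expected = w * h * 3
    if w <= 0 or h <= 0 or len(rgb_bytes) < expected:
        return False  # short or malformed buffer: drop it instead of crashing
    frame = Frame(w, h, bytes(rgb_bytes[:expected]))  # copy, detached from the caller
    while True:
        try:
            _frame_queue.put_nowait(frame)
            return True
        except queue.Full:
            try:
                _frame_queue.get_nowait()  # drop the oldest frame to make room
            except queue.Empty:
                pass  # a consumer emptied the queue in between; retry the put


def capture_frame() -> Optional[Frame]:
    """Return the next queued frame, or the last one seen if the queue is empty."""
    global _last_frame
    try:
        _last_frame = _frame_queue.get_nowait()
    except queue.Empty:
        pass
    return _last_frame
```

Dropping the oldest frame rather than blocking keeps the Kotlin producer thread from stalling when the pipeline falls behind, at the cost of skipped frames.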
|
||||
|
||||
Registered in `capture_engines/__init__.py` behind a guarded import (mirrors the
|
||||
mediaprojection block).
|
||||
|
||||
## Part B — Android (`CameraBridge.kt`)
|
||||
|
||||
`object CameraBridge` (mirrors `BleBridge`):
|
||||
|
||||
- `init(context)` — from `LedGrabApp.onCreate` (context only, no camera opened).
|
||||
- `listCameras(): String` — JSON array from `CameraManager.cameraIdList` + `LENS_FACING`
|
||||
(front/back/external). No CAMERA permission needed.
|
||||
- `startCamera(index, width, height): Boolean` — checks CAMERA permission; resolves cameraId;
|
||||
picks the supported YUV size closest to the request (balanced default ≤1280×720 for "auto");
|
||||
opens device + capture session on a private `HandlerThread`, blocking until configured
|
||||
(`runBlocking { withTimeout { ... } }` over `suspendCancellableCoroutine`-wrapped Camera2
|
||||
callbacks); sets a repeating preview request. Returns false (no throw across JNI) on
|
||||
permission/range/configure failure. Closes any prior camera first.
|
||||
- `onImageAvailable` → paced (≈20 fps) → stride-aware **YUV_420_888→RGB** (BT.601 fixed-point,
|
||||
reused plane + RGB buffers) → push to the cached `android_camera_engine` module handle.
|
||||
- `stopCamera()` — stops repeating, closes session/device/reader, idempotent.
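
For reference, the stride-aware YUV_420_888→RGB step can be illustrated with a small Python sketch; the real conversion lives in Kotlin inside `CameraBridge`. The full-range (JFIF) BT.601 coefficients, the 16-bit fixed-point scale, and the function name `yuv420_to_rgb` are assumptions made for illustration, since the plan only says "BT.601 fixed-point".

```python
def _clamp(x: int) -> int:
    """Clamp an integer to the 0..255 byte range."""
    return 0 if x < 0 else 255 if x > 255 else x


def yuv420_to_rgb(y: bytes, u: bytes, v: bytes, width: int, height: int,
                  y_row_stride: int, uv_row_stride: int,
                  uv_pixel_stride: int) -> bytes:
    """Convert planar or semi-planar 4:2:0 YUV to packed RGB (3 bytes per pixel).

    Row strides may exceed the visible width (padding), and the chroma pixel
    stride may be 1 (planar) or 2 (interleaved), so every index is computed
    from the strides rather than from the width.
    """
    out = bytearray(width * height * 3)
    o = 0
    for row in range(height):
        y_base = row * y_row_stride
        uv_base = (row // 2) * uv_row_stride  # one chroma row per two luma rows
        for col in range(width):
            luma = y[y_base + col]
            uv_i = uv_base + (col // 2) * uv_pixel_stride
            cb = u[uv_i] - 128
            cr = v[uv_i] - 128
            # Assumed full-range BT.601 coefficients scaled by 2**16, with rounding.
            out[o] = _clamp(luma + ((91881 * cr + 32768) >> 16))
            out[o + 1] = _clamp(luma - ((22553 * cb + 46802 * cr + 32768) >> 16))
            out[o + 2] = _clamp(luma + ((116130 * cb + 32768) >> 16))
            o += 3
    return bytes(out)
```

Integer multiplies and shifts avoid per-pixel floating-point work, which is the usual reason for choosing a fixed-point form on low-end hardware.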
|
||||
|
||||
## Part C — Wiring + permission + manifest
|
||||
|
||||
- `LedGrabApp.kt` — `CameraBridge.init(this)` next to `BleBridge.init`.
|
||||
- `MainActivity.kt` — `ensureCameraPermission()` (mirror `ensureAudioPermission`): request
|
||||
`CAMERA` iff `hasSystemFeature(FEATURE_CAMERA_ANY)`; called from both `startCaptureService`
|
||||
(MediaProjection path) and `startRootCaptureService` (root path). Fire-and-forget.
|
||||
- `AndroidManifest.xml` — `<uses-permission CAMERA>` + `<uses-feature camera.any required=false>`
|
||||
+ `<uses-permission FOREGROUND_SERVICE_CAMERA>`, and `camera` added to the `CaptureService`
|
||||
`foregroundServiceType` union (`mediaProjection|specialUse|camera`).
|
||||
- `CaptureService.onStartCommand` — on API 34+, OR `FOREGROUND_SERVICE_TYPE_CAMERA` into the
|
||||
promotion type **only when CAMERA is already granted**. Unlike audio playback capture (which
|
||||
rides the MediaProjection token under the mediaProjection type), the camera has no such
|
||||
coupling, so without its own FGS type Android 14+ revokes camera access once the app is
|
||||
backgrounded. The conditional guard avoids a failed `startForeground` (which would kill the
|
||||
whole service) on a camera-less / not-yet-granted box. If CAMERA is granted later, the camera
|
||||
type takes effect on the next Start.
|
||||
- No `proguard-rules.pro` change — the blanket `-keep class com.ledgrab.android.** { *; }`
|
||||
already covers `CameraBridge`, and R8/minify is disabled.
|
||||
|
||||
## What does NOT change
|
||||
|
||||
- **Frontend / API** — data-driven engine list, config, and display picker.
|
||||
- **`build.gradle.kts` / Chaquopy pip block** — no new Python or Gradle packages.
|
||||
- **Processing pipeline** — `ScreenCaptureLiveStream`, filters, color-strip sources unchanged.
|
||||
|
||||
## Files
|
||||
|
||||
**Create**
|
||||
- `server/src/ledgrab/core/capture_engines/android_camera_engine.py`
|
||||
- `android/app/src/main/java/com/ledgrab/android/CameraBridge.kt`
|
||||
- `server/tests/core/test_android_camera_engine.py`
|
||||
|
||||
**Modify**
|
||||
- `server/src/ledgrab/core/capture_engines/__init__.py` — guarded import + registration.
|
||||
- `android/app/src/main/java/com/ledgrab/android/LedGrabApp.kt` — `CameraBridge.init`.
|
||||
- `android/app/src/main/java/com/ledgrab/android/MainActivity.kt` — `ensureCameraPermission`.
|
||||
- `android/app/src/main/AndroidManifest.xml` — `CAMERA` + `camera.any`.
|
||||
|
||||
## Tests (Python — desktop CI, no device)
|
||||
|
||||
`server/tests/core/test_android_camera_engine.py`: push→capture round-trips RGB `(h,w,3)`;
|
||||
drop-oldest when full; `_last_frame` fallback on empty; short-buffer never crashes;
|
||||
`initialize()` opens with parsed/auto resolution and raises on open-failure / off-Android;
|
||||
`cleanup()` closes once (idempotent); `is_available()` gating (android + cameras); display
|
||||
enumeration; priority 0 never beats MediaProjection; create-via-registry yields a pushed frame.
|
||||
|
||||
## Verification
|
||||
|
||||
1. **Python:** `py -3.13 -m pytest tests/core/test_android_camera_engine.py --no-cov -q`, then
|
||||
the full suite (1880 passed, 2 skipped; 15 new).
|
||||
2. **Lint:** `ruff check src/ tests/ --fix` — clean.
|
||||
3. **Android build:** `./gradlew :app:assembleDebug` — BUILD SUCCESSFUL.
|
||||
4. **On device (manual):** install APK → Start capture → grant CAMERA → create a capture
|
||||
template with engine `android_camera` + a camera display + a ScreenCapture source bound to
|
||||
a strip → confirm LEDs react to the camera feed and the camera indicator only lights while
|
||||
the source is active.
|
||||
|
||||
## Risks / notes
|
||||
|
||||
- **MVP scope:** webcam works **while LedGrab capture is running** (the Python server only runs
|
||||
inside `CaptureService`; there is no camera-only start path on Android).
|
||||
- **One camera at a time:** `startCamera` closes any previously-open camera first.
|
||||
- **`"auto"` resolution** picks a balanced output size (~720p), not the sensor max, to keep the
|
||||
per-frame YUV→RGB conversion cheap on low-end TV boxes.
|
||||
- **USB-UVC webcams** appear only if the device exposes them through Camera2 (`LENS_FACING_EXTERNAL`),
|
||||
which varies by box; an explicit UVC library would be a separate, larger effort.
|
||||
- **No frame-rotation correction** — sensor orientation is not applied (ambient color sampling
|
||||
is largely orientation-tolerant); could be added later.
|
||||
- **CAMERA denied** → the engine reports no usable camera and capture proceeds without it.
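
The commit message describes Python-side single-camera ownership as a lock plus ref-count (same-camera streams attach, a different camera is refused, the last release stops). A minimal sketch of that guard, assuming a hypothetical `CameraOwnership` class with injected `start`/`stop` callables standing in for the bridge calls, might look like this:

```python
import threading
from typing import Callable, Optional


class CameraOwnership:
    """Hypothetical ref-counted guard; names are illustrative, not the real API."""

    def __init__(self, start: Callable[[int], bool], stop: Callable[[], None]) -> None:
        self._lock = threading.Lock()
        self._index: Optional[int] = None
        self._refs = 0
        self._start = start
        self._stop = stop

    def acquire(self, index: int) -> bool:
        with self._lock:
            if self._refs > 0:
                if self._index != index:
                    return False  # a different camera is already open: refuse
                self._refs += 1  # same camera: attach to the open session
                return True
            if not self._start(index):
                return False  # open failed; ownership stays unchanged
            self._index, self._refs = index, 1
            return True

    def release(self) -> None:
        with self._lock:
            if self._refs == 0:
                return  # nothing held; release is idempotent
            self._refs -= 1
            if self._refs == 0:
                self._index = None
                self._stop()  # last holder gone: close the camera
```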
|
||||