feat(android): on-device webcam capture via Camera2 (AndroidCameraEngine)

Add on-device webcam capture to the experimental Android-TV build. Desktop
captures webcams via OpenCV, which has no Chaquopy/Android wheel, so this adds a
push-based AndroidCameraEngine that plugs into the same selection path desktop
uses (capture template engine_type="android_camera" + display_index,
HAS_OWN_DISPLAYS).

A Kotlin CameraBridge (Camera2) enumerates cameras and opens a camera on demand,
only while a capture source is active. Python drives it through a guarded jclass
singleton (BleBridge pattern). The bridge converts each frame
YUV_420_888->RGB and pushes the RGB bytes into a module-level queue that mirrors
mediaprojection_engine.py. Cameras surface as selectable displays, as in the
desktop OpenCV engine; the data-driven capture-template UI is unchanged. No new
Python deps; no new Gradle deps (Camera2 is in-platform).
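
For illustration, the drop-oldest hand-off described above can be sketched in
isolation. This is a minimal sketch, not the engine code: `frames` and
`push_drop_oldest` are hypothetical names, and plain bytes stand in for the
ScreenCapture frames the engine actually queues.

```python
import queue

# Hypothetical stand-in for the engine's module-level frame queue.
frames: "queue.Queue[bytes]" = queue.Queue(maxsize=2)


def push_drop_oldest(q: "queue.Queue[bytes]", item: bytes) -> None:
    """Enqueue without blocking; evict the oldest item when the queue is full."""
    try:
        q.put_nowait(item)
        return
    except queue.Full:
        pass
    try:
        q.get_nowait()  # make room by discarding the oldest frame
    except queue.Empty:
        pass  # a consumer emptied the queue in the meantime
    try:
        q.put_nowait(item)
    except queue.Full:
        pass  # another producer refilled the queue; drop this frame


# Three frames pushed into a queue of two: the first one is evicted.
for payload in (b"f1", b"f2", b"f3"):
    push_drop_oldest(frames, payload)

remaining = [frames.get_nowait(), frames.get_nowait()]
```

With a bound of two, a slow consumer only ever sees the two newest frames, so
the producer never blocks and latency stays bounded.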

Engine: ENGINE_PRIORITY=0 (never auto-selected over MediaProjection=100; explicit
engine_type only). Single-camera ownership is serialized with a lock + ref-count
(same-camera streams attach, different-camera refused, last release stops),
mirroring the desktop CameraEngine guard.
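
The ownership rules above, sketched as a standalone helper. This is an
illustration under assumptions, not the committed code: `SingleCameraOwnership`,
`acquire`, and `release` are hypothetical names, and the engine itself keeps
this state in module-level variables.

```python
import threading
from typing import Optional


class SingleCameraOwnership:
    """Hypothetical helper: one camera open at a time, shared by ref-count."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._owner: Optional[int] = None
        self._refs = 0

    def acquire(self, index: int) -> bool:
        """Return True if the caller must open the camera, False if it attached."""
        with self._lock:
            if self._owner is not None and self._owner != index:
                # A different camera is streaming: refuse instead of stealing it.
                raise RuntimeError(f"camera {self._owner} is already in use")
            if self._owner == index:
                self._refs += 1
                return False
            self._owner, self._refs = index, 1
            return True

    def release(self) -> bool:
        """Return True if this was the last reference and the camera should stop."""
        with self._lock:
            if self._refs == 0:
                return False
            self._refs -= 1
            if self._refs > 0:
                return False
            self._owner = None
            return True


guard = SingleCameraOwnership()
opened = guard.acquire(0)      # first stream opens camera 0
attached = guard.acquire(0)    # second stream on camera 0 attaches
try:
    guard.acquire(1)           # a different camera is refused
    refused = False
except RuntimeError:
    refused = True
first_release = guard.release()
last_release = guard.release()  # only the last release stops the camera
```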

Permission: CAMERA requested at capture-start, gated on FEATURE_CAMERA_ANY so
camera-less TV boxes never prompt; graceful degradation when denied. The service
is promoted with the camera FGS type (+ FOREGROUND_SERVICE_CAMERA) only when
CAMERA is already granted, so backgrounded capture keeps working without risking
a failed startForeground on camera-less boxes (camera can't ride the
MediaProjection token the way audio playback capture does).

Reviewed via multi-agent adversarial pass (13 findings -> 4 fixed: device leak on
session-failure, multi-stream collision, camera FGS type, i18n key; 9 refuted).

Tests: 18 new desktop-CI tests (no device needed); full suite 1883 passed.
Verified: assembleDebug BUILD SUCCESSFUL, ruff clean.

Docs: ANDROID-REVIEW/android-webcam-capture-plan.md (design), updated
android-missing-functionality.md + README feature table + en/ru/zh locales.
2026-06-02 13:36:23 +03:00
parent 34db5de8c3
commit 4bf3fe65db
14 changed files with 1480 additions and 17 deletions
@@ -86,6 +86,18 @@ try:
except ImportError:
    _has_mediaprojection = False

# ── Android camera/webcam (Camera2 via Chaquopy bridge) ─────────────
try:
    from ledgrab.core.capture_engines.android_camera_engine import (
        AndroidCameraEngine,
        AndroidCameraCaptureStream,
    )

    _has_android_camera = True
except ImportError:
    _has_android_camera = False

# ── Android root screenrecord (rooted Magisk devices) ───────────────
try:
@@ -120,6 +132,8 @@ if _has_camera:
    EngineRegistry.register(CameraEngine)
if _has_mediaprojection:
    EngineRegistry.register(MediaProjectionEngine)
if _has_android_camera:
    EngineRegistry.register(AndroidCameraEngine)
if _has_root_screenrecord:
    EngineRegistry.register(RootScreenrecordEngine)
EngineRegistry.register(DemoCaptureEngine)
@@ -152,5 +166,7 @@ if _has_camera:
    __all__ += ["CameraEngine", "CameraCaptureStream"]
if _has_mediaprojection:
    __all__ += ["MediaProjectionEngine", "MediaProjectionCaptureStream"]
if _has_android_camera:
    __all__ += ["AndroidCameraEngine", "AndroidCameraCaptureStream"]
if _has_root_screenrecord:
    __all__ += ["RootScreenrecordEngine", "RootScreenrecordCaptureStream"]
@@ -0,0 +1,430 @@
"""Android camera (webcam) capture engine.
Receives camera frames pushed from Kotlin (via Chaquopy) through a
module-level frame queue. The Kotlin :class:`CameraBridge` opens a
camera with the Camera2 API, converts each frame to RGB, and calls
:func:`push_frame` with raw RGB bytes.
The physical camera is opened **on demand** — only while a capture
stream is active. :meth:`AndroidCameraCaptureStream.initialize` calls
:func:`start_camera` (which signals the Kotlin bridge to open the
camera) and :meth:`cleanup` calls :func:`stop_camera`. This keeps the
camera-in-use indicator and battery cost limited to actual use, unlike
the always-on screen/audio capture.
Mirrors the screen-capture bridge
(``core/capture_engines/mediaprojection_engine.py``): a module-level
queue plus push/last-frame fallback/drop-oldest, consumed through the
standard :class:`CaptureEngine` / :class:`CaptureStream` interface so
the live-stream and processing pipelines work unchanged. Cameras are
exposed as selectable "displays" exactly like the desktop OpenCV
:class:`CameraEngine`.
This engine is only available when running inside the LedGrab Android
app (``is_android()``) with at least one camera the Kotlin bridge can
enumerate. All Java interop is lazy + guarded so this module imports
cleanly on desktop CI.
"""
import json
import queue
import threading
import time
from typing import Any, Dict, List, Optional
import numpy as np
from ledgrab.core.capture_engines.base import (
    CaptureEngine,
    CaptureStream,
    DisplayInfo,
    ScreenCapture,
)
from ledgrab.utils import get_logger
from ledgrab.utils.platform import is_android
logger = get_logger(__name__)
# ---------------------------------------------------------------------------
# Frame queue — the bridge between Kotlin and Python
# ---------------------------------------------------------------------------
_frame_queue: "queue.Queue[ScreenCapture]" = queue.Queue(maxsize=2)
_active = False
_active_index = 0
_frames_received = 0
# Single-camera ownership. The Kotlin bridge supports exactly one open camera
# at a time (it closes any prior camera on a new open), and all streams share
# the one module-level frame queue. So the engine serializes ownership the way
# the desktop CameraEngine does with its _camera_lock/_active_cv2_indices: the
# first stream to initialize() owns the camera; a second stream on the SAME
# camera attaches (ref-counted); a second stream on a DIFFERENT camera is
# refused. Only the last owner to clean up actually stops the camera. Without
# this, two concurrent android_camera sources on different displays would make
# the second open silently steal the first's frames, and either stream's
# cleanup would drain the shared queue out from under the other.
_state_lock = threading.Lock()
_owner_index: int | None = None # display_index that currently owns the camera
_owner_refs = 0 # number of streams attached to the active camera
# Camera2 delivers frames continuously, but cache the last one so a
# brief consumer stall still has something to read (mirrors
# mediaprojection_engine's _last_frame).
_last_frame: Optional["ScreenCapture"] = None
# Enumeration cache. is_available() is polled by the engine registry,
# so the (cheap but non-free) Camera2 enumeration is cached briefly —
# matching the desktop CameraEngine's 30 s TTL.
_cam_cache: List[Dict[str, Any]] | None = None
_cam_cache_time: float = 0.0
_CAM_CACHE_TTL = 30.0 # seconds
# Resolution presets shown in the UI. Identical to the desktop
# CameraEngine set so the data-driven capture-template config UI
# (keyed by the "resolution" field name) renders the same dropdown.
# "auto" lets the Kotlin bridge pick a balanced output size.
_RESOLUTION_CHOICES: List[str] = [
    "auto",
    "640x480",
    "1280x720",
    "1920x1080",
    "2560x1440",
    "3840x2160",
]
def _parse_resolution(value: Any) -> tuple[int, int] | None:
    """Parse a 'WxH' string into (width, height). None for 'auto'/invalid."""
    if not isinstance(value, str):
        return None
    s = value.strip().lower()
    if s in ("", "auto"):
        return None
    parts = s.replace("×", "x").split("x")
    if len(parts) != 2:
        return None
    try:
        w, h = int(parts[0]), int(parts[1])
    except ValueError:
        return None
    if w <= 0 or h <= 0:
        return None
    return w, h

# ---------------------------------------------------------------------------
# Kotlin CameraBridge interop — lazy + guarded (never at import time)
# ---------------------------------------------------------------------------
def _camera_bridge():
    """Return the Kotlin ``CameraBridge`` singleton, or None off-Android.

    The ``from java import jclass`` import only resolves inside the
    Chaquopy runtime, so it must never run at module import time (this
    module is imported on desktop CI too). Mirrors
    ``core/devices/android_ble_transport.py``.
    """
    if not is_android():
        return None
    try:
        from java import jclass  # type: ignore[import-not-found]
    except ImportError as exc:
        logger.debug("Chaquopy java interop not available: %s", exc)
        return None
    try:
        return jclass("com.ledgrab.android.CameraBridge").INSTANCE
    except Exception as exc:  # pragma: no cover - Android-only path
        logger.debug("CameraBridge singleton unavailable: %s", exc)
        return None

def list_cameras() -> List[Dict[str, Any]]:
    """Enumerate cameras via the Kotlin bridge.

    Returns a list of ``{"index": int, "name": str, "facing": str}``
    dicts in stable enumeration order, or ``[]`` off-Android / on error
    / when the device has no cameras or CAMERA enumeration fails.

    Monkeypatched in tests to inject a fake list without Android.
    """
    bridge = _camera_bridge()
    if bridge is None:
        return []
    try:
        raw = bridge.listCameras()  # JSON array string
    except Exception as exc:  # pragma: no cover - Android-only path
        logger.warning("CameraBridge.listCameras failed: %s", exc)
        return []
    try:
        parsed = json.loads(str(raw))
    except (ValueError, TypeError) as exc:  # pragma: no cover
        logger.warning("CameraBridge.listCameras returned invalid JSON: %s", exc)
        return []
    cameras: List[Dict[str, Any]] = []
    for i, entry in enumerate(parsed if isinstance(parsed, list) else []):
        if not isinstance(entry, dict):
            continue
        cameras.append(
            {
                "index": int(entry.get("index", i)),
                "name": str(entry.get("name") or f"Camera {i}"),
                "facing": str(entry.get("facing") or "unknown"),
            }
        )
    return cameras

def _enumerate_cameras() -> List[Dict[str, Any]]:
    """Cached camera enumeration (TTL ``_CAM_CACHE_TTL``)."""
    global _cam_cache, _cam_cache_time
    now = time.monotonic()
    if _cam_cache is not None and (now - _cam_cache_time) < _CAM_CACHE_TTL:
        return _cam_cache
    _cam_cache = list_cameras()
    _cam_cache_time = now
    return _cam_cache

def start_camera(index: int, width: int, height: int) -> bool:
    """Signal the Kotlin bridge to open camera ``index`` (on demand).

    ``width``/``height`` are the requested capture size (0 => let the
    bridge pick a balanced default). Returns True if the camera began
    streaming. False off-Android, when the bridge is unavailable, or
    when the open failed (e.g. CAMERA permission denied, camera in use).

    Monkeypatched in tests.
    """
    bridge = _camera_bridge()
    if bridge is None:
        return False
    try:
        return bool(bridge.startCamera(index, width, height))
    except Exception as exc:  # pragma: no cover - Android-only path
        logger.warning("CameraBridge.startCamera(%d) failed: %s", index, exc)
        return False

def stop_camera(index: int) -> None:
    """Signal the Kotlin bridge to close the active camera. No-op off-Android.

    ``index`` is not forwarded: the bridge holds at most one open camera
    and ``stopCamera()`` closes whichever one that is.
    """
    bridge = _camera_bridge()
    if bridge is None:
        return
    try:
        bridge.stopCamera()
    except Exception as exc:  # pragma: no cover - Android-only path
        logger.debug("CameraBridge.stopCamera failed: %s", exc)

def push_frame(rgb_bytes: bytes, width: int, height: int) -> None:
    """Push one RGB frame from Kotlin into the capture pipeline.

    Called from ``CameraBridge`` on its capture thread. The byte buffer
    is interpreted as tightly-packed RGB (``width * height * 3`` bytes,
    3 bytes/pixel — NOT RGBA). The buffer is copied out so Kotlin may
    reuse its backing array; the oldest queued frame is dropped if the
    consumer is slow.
    """
    global _frames_received, _last_frame
    expected = width * height * 3
    if expected <= 0:
        return
    arr = np.frombuffer(rgb_bytes, dtype=np.uint8)
    if arr.size < expected:
        # Short/malformed buffer — drop rather than reshape-crash.
        return
    # Copy out of the read-only frombuffer view (and off any reusable
    # Kotlin buffer) so the queued frame owns its memory. Mirrors
    # mediaprojection_engine.push_frame's .copy().
    rgb = arr[:expected].reshape((height, width, 3)).copy()
    frame = ScreenCapture(
        image=rgb,
        width=width,
        height=height,
        display_index=_active_index,
    )
    _last_frame = frame
    _frames_received += 1
    if _frames_received == 1 or _frames_received % 100 == 0:
        logger.info("Android camera: received %d frames", _frames_received)
    # Drop oldest frame if queue is full (non-blocking).
    try:
        _frame_queue.put_nowait(frame)
    except queue.Full:
        try:
            _frame_queue.get_nowait()
        except queue.Empty:
            pass
        try:
            _frame_queue.put_nowait(frame)
        except queue.Full:
            pass

def shutdown() -> None:
    """Deactivate the engine. Called when the Android app stops."""
    global _active
    _active = False
    logger.info("Android camera engine shut down")


def _drain_queue() -> None:
    """Discard any queued frames (stale frames from a prior session)."""
    global _last_frame
    while not _frame_queue.empty():
        try:
            _frame_queue.get_nowait()
        except queue.Empty:
            break
    _last_frame = None

# ---------------------------------------------------------------------------
# CaptureStream
# ---------------------------------------------------------------------------
class AndroidCameraCaptureStream(CaptureStream):
    """Reads camera frames pushed by Kotlin from the module-level queue.

    Opening the physical camera is on demand: :meth:`initialize` asks
    the Kotlin bridge to open the camera bound to ``display_index`` and
    :meth:`cleanup` asks it to close.
    """

    def initialize(self) -> None:
        if self._initialized:
            return
        if not is_android():
            raise RuntimeError(
                "Android camera engine not available. "
                "This engine is only usable inside the Android app."
            )
        parsed = _parse_resolution(self.config.get("resolution", "auto"))
        target_w, target_h = parsed if parsed is not None else (0, 0)
        global _active, _active_index, _owner_index, _owner_refs
        with _state_lock:
            if _owner_index is not None and _owner_index != self.display_index:
                # Another camera is already streaming — the bridge can only
                # drive one at a time, so refuse rather than silently stealing
                # the active camera's frames (mirrors the desktop CameraEngine's
                # "already in use by another stream").
                raise RuntimeError(
                    f"Android camera {_owner_index} is already in use by another "
                    f"capture; only one camera can stream at a time"
                )
            if _owner_index == self.display_index:
                # Same camera already open — attach to it (ref-counted).
                _owner_refs += 1
                self._initialized = True
                logger.info(
                    "Android camera capture stream attached (camera=%d, refs=%d)",
                    self.display_index,
                    _owner_refs,
                )
                return
            # No camera open — open this one. Drain stale frames first so the
            # first captured frame is actually current.
            _drain_queue()
            # Set the frame tag before opening: Kotlin may call push_frame()
            # before start_camera() returns, and push_frame() stamps each
            # frame with _active_index.
            _active_index = self.display_index
            if not start_camera(self.display_index, target_w, target_h):
                raise RuntimeError(
                    f"Failed to open Android camera {self.display_index} "
                    f"(CAMERA permission denied, camera in use, or unavailable)"
                )
            _owner_index = self.display_index
            _owner_refs = 1
            _active = True
        self._initialized = True
        logger.info("Android camera capture stream initialized (camera=%d)", self.display_index)

    def capture_frame(self) -> ScreenCapture | None:
        if not self._initialized:
            self.initialize()
        # Prefer a fresh frame; fall back to the last one on a brief stall.
        try:
            return _frame_queue.get(timeout=0.1)
        except queue.Empty:
            return _last_frame

    def cleanup(self) -> None:
        if self._initialized:
            global _active, _owner_index, _owner_refs
            with _state_lock:
                _owner_refs -= 1
                if _owner_refs <= 0:
                    # Last owner released — actually stop the camera.
                    stop_camera(self.display_index)
                    _owner_index = None
                    _owner_refs = 0
                    _active = False
                    _drain_queue()
            self._initialized = False
            logger.info("Android camera capture stream cleaned up (camera=%d)", self.display_index)
        else:
            self._initialized = False

# ---------------------------------------------------------------------------
# CaptureEngine
# ---------------------------------------------------------------------------
class AndroidCameraEngine(CaptureEngine):
    """Android camera/webcam capture engine (Camera2 via Kotlin bridge).

    Only available inside the LedGrab Android app with at least one
    enumerable camera. Each camera is exposed as a selectable
    "display", mirroring the desktop OpenCV :class:`CameraEngine`.

    Selected explicitly via ``engine_type="android_camera"`` in a
    capture template — never auto-selected (priority 0, below
    MediaProjection's 100).
    """

    ENGINE_TYPE = "android_camera"
    ENGINE_PRIORITY = 0  # never auto-selected over MediaProjection (100); explicit only
    HAS_OWN_DISPLAYS = True

    @classmethod
    def is_available(cls) -> bool:
        return is_android() and len(_enumerate_cameras()) > 0

    @classmethod
    def get_default_config(cls) -> Dict[str, Any]:
        return {"resolution": "auto"}

    @classmethod
    def get_config_choices(cls) -> Dict[str, List[str]]:
        return {"resolution": list(_RESOLUTION_CHOICES)}

    @classmethod
    def get_available_displays(cls) -> List[DisplayInfo]:
        displays: List[DisplayInfo] = []
        for cam in _enumerate_cameras():
            idx = cam["index"]
            displays.append(
                DisplayInfo(
                    index=idx,
                    name=cam["name"],
                    width=0,
                    height=0,
                    x=idx * 500,
                    y=0,
                    is_primary=(idx == 0),
                    refresh_rate=30,
                )
            )
        return displays

    @classmethod
    def create_stream(
        cls, display_index: int, config: Dict[str, Any]
    ) -> AndroidCameraCaptureStream:
        merged = {**cls.get_default_config(), **config}
        return AndroidCameraCaptureStream(display_index, merged)
@@ -103,6 +103,7 @@
"templates.engine.wgc.desc": "Windows Graphics Capture",
"templates.engine.demo.desc": "Animated test pattern (demo mode)",
"templates.engine.mediaprojection.desc": "Native Android screen capture",
"templates.engine.android_camera.desc": "On-device camera capture (Camera2)",
"templates.config": "Configuration",
"templates.config.show": "Show configuration",
"templates.config.none": "No additional configuration",
@@ -158,6 +158,7 @@
"templates.engine.wgc.desc": "Windows Graphics Capture",
"templates.engine.demo.desc": "Тестовый анимированный шаблон (демо)",
"templates.engine.mediaprojection.desc": "Нативный захват экрана Android",
"templates.engine.android_camera.desc": "Захват камеры устройства (Camera2)",
"templates.config": "Конфигурация",
"templates.config.show": "Показать конфигурацию",
"templates.config.none": "Нет дополнительных настроек",
@@ -156,6 +156,7 @@
"templates.engine.wgc.desc": "Windows图形捕获",
"templates.engine.demo.desc": "动画测试图案(演示模式)",
"templates.engine.mediaprojection.desc": "原生Android屏幕捕获",
"templates.engine.android_camera.desc": "设备摄像头捕获 (Camera2)",
"templates.config": "配置",
"templates.config.show": "显示配置",
"templates.config.none": "无额外配置",