feat(android): on-device webcam capture via Camera2 (AndroidCameraEngine)

Add on-device webcam capture to the experimental Android-TV build. Desktop
captures webcams via OpenCV, which has no Chaquopy/Android wheel, so this adds
a push-based AndroidCameraEngine that plugs into the same selection path desktop
uses (capture template engine_type="android_camera" + display_index,
HAS_OWN_DISPLAYS).

A Kotlin CameraBridge (Camera2) enumerates cameras and opens them on demand,
only while a capture source is active. Python drives it through a guarded jclass
singleton (BleBridge pattern). The bridge converts each frame YUV_420_888->RGB
and pushes the RGB bytes into a module-level queue, mirroring
mediaprojection_engine.py. Cameras surface as selectable displays like the
desktop OpenCV engine; the data-driven capture-template UI is unchanged. No new
Python deps; no new Gradle deps (Camera2 is in-platform).
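
A minimal sketch of the push-based hand-off described above, assuming a small
bounded queue that drops the oldest frame when the consumer falls behind. Only
the push_frame(rgb, width, height) entry point is taken from this commit
(CameraBridge calls it once per frame). The queue name, its depth, and
pop_frame are hypothetical, and the real android_camera_engine.py may differ.

```python
import queue

# Hypothetical module-level queue; the real engine's name and depth may differ.
_FRAME_QUEUE = queue.Queue(maxsize=2)


def push_frame(rgb, width, height):
    """Called by the Kotlin bridge per frame; never blocks the camera thread."""
    # Copy the payload: the Kotlin side reuses its RGB buffer between frames.
    # (Assumes the incoming byte array converts with bytes().)
    data = bytes(rgb)
    if len(data) != width * height * 3:
        return False  # reject malformed frames instead of crashing the consumer
    while True:
        try:
            _FRAME_QUEUE.put_nowait((data, width, height))
            return True
        except queue.Full:
            # Drop the oldest frame so the consumer always sees fresh data.
            try:
                _FRAME_QUEUE.get_nowait()
            except queue.Empty:
                pass


def pop_frame(timeout=1.0):
    """Hypothetical consumer side: return (rgb, width, height) or None."""
    try:
        return _FRAME_QUEUE.get(timeout=timeout)
    except queue.Empty:
        return None
```

Dropping the oldest frame rather than blocking suits ambient LED sampling,
where a stale frame is worth less than a late one.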

Engine: ENGINE_PRIORITY=0 (never auto-selected over MediaProjection=100; explicit
engine_type only). Single-camera ownership is serialized with a lock + ref-count
(same-camera streams attach, different-camera refused, last release stops),
mirroring the desktop CameraEngine guard.
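
The ownership rule above can be sketched as follows. This is an illustration
under stated assumptions, not the engine's actual code: the CameraOwnership
class and its start/stop callables are hypothetical stand-ins for the calls
into CameraBridge.startCamera / stopCamera (the real startCamera also takes a
width and height).

```python
import threading


class CameraOwnership:
    """Hypothetical guard: one physical camera, shared by ref-count."""

    def __init__(self, start, stop):
        self._lock = threading.Lock()
        self._start = start   # callable(index) -> bool, stand-in for startCamera
        self._stop = stop     # callable() -> None, stand-in for stopCamera
        self._index = None    # camera currently open, or None
        self._refs = 0

    def acquire(self, index):
        with self._lock:
            if self._refs > 0:
                if index != self._index:
                    return False   # different camera: refused
                self._refs += 1    # same camera: attach to the open stream
                return True
            if not self._start(index):
                return False       # open failed; nothing is held
            self._index, self._refs = index, 1
            return True

    def release(self):
        with self._lock:
            if self._refs == 0:
                return             # idempotent when nothing is held
            self._refs -= 1
            if self._refs == 0:
                self._index = None
                self._stop()       # last release stops the camera
```

Holding the lock across the start call keeps two streams from racing to open
different cameras at the same time.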

Permission: CAMERA requested at capture-start, gated on FEATURE_CAMERA_ANY so
camera-less TV boxes never prompt; graceful degradation when denied. The service
is promoted with the camera FGS type (+ FOREGROUND_SERVICE_CAMERA) only when
CAMERA is already granted, so backgrounded capture keeps working without risking
a failed startForeground on camera-less boxes (camera can't ride the
MediaProjection token the way audio playback capture does).

Reviewed via multi-agent adversarial pass (13 findings -> 4 fixed: device leak on
session-failure, multi-stream collision, camera FGS type, i18n key; 9 refuted).

Tests: 18 new desktop-CI tests (no device needed); full suite 1883 passed.
Verified: assembleDebug BUILD SUCCESSFUL, ruff clean.

Docs: ANDROID-REVIEW/android-webcam-capture-plan.md (design), updated
android-missing-functionality.md + README feature table + en/ru/zh locales.
2026-06-02 13:36:23 +03:00
parent 34db5de8c3
commit 4bf3fe65db
14 changed files with 1480 additions and 17 deletions
+28 -1
@@ -35,6 +35,13 @@
<uses-permission android:name="android.permission.FOREGROUND_SERVICE" />
<uses-permission android:name="android.permission.FOREGROUND_SERVICE_MEDIA_PROJECTION" />
<uses-permission android:name="android.permission.FOREGROUND_SERVICE_SPECIAL_USE" />
<!-- FOREGROUND_SERVICE_CAMERA (API 34+): required to keep camera access while
the app is backgrounded during on-device webcam capture. The service is
promoted with the `camera` FGS type ONLY when CAMERA is already granted
(see CaptureService.onStartCommand) — unlike audio playback capture (which
rides the MediaProjection token under the mediaProjection type), the camera
has no such coupling and needs its own FGS type to survive backgrounding. -->
<uses-permission android:name="android.permission.FOREGROUND_SERVICE_CAMERA" />
<!-- POST_NOTIFICATIONS for Android 13+ foreground service notification -->
<uses-permission android:name="android.permission.POST_NOTIFICATIONS" />
@@ -47,6 +54,17 @@
only be required if the mic-fallback path ran inside the service). -->
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<!-- CAMERA for on-device webcam capture (Camera2). Runtime "dangerous"
permission, requested in MainActivity gated on FEATURE_CAMERA_ANY so
camera-less TV boxes never see the prompt; capture degrades gracefully
when denied. The camera is opened ON DEMAND (only while a camera
capture source is active). To keep capturing after the app is
backgrounded, the service is promoted with the `camera` FGS type
(FOREGROUND_SERVICE_CAMERA above) — but only when CAMERA is already
granted, so a camera-less / not-yet-granted box never risks a failed
service start. -->
<uses-permission android:name="android.permission.CAMERA" />
<!-- Autostart on boot — BootReceiver spawns CaptureService in root
mode so capture resumes without the user touching the remote. -->
<uses-permission android:name="android.permission.RECEIVE_BOOT_COMPLETED" />
@@ -71,6 +89,15 @@
android:name="android.hardware.usb.host"
android:required="false" />
<!-- Camera hardware — for on-device webcam capture. required=false so
camera-less TV boxes (the common case) still install; the camera
engine simply reports no displays on such devices. camera.any covers
built-in (front/back) and external/USB-UVC cameras the platform
routes through Camera2. -->
<uses-feature
android:name="android.hardware.camera.any"
android:required="false" />
<application
android:name=".LedGrabApp"
android:allowBackup="false"
@@ -103,7 +130,7 @@
PROPERTY_SPECIAL_USE_FGS_SUBTYPE rationale below. -->
<service
android:name=".CaptureService"
-android:foregroundServiceType="mediaProjection|specialUse"
+android:foregroundServiceType="mediaProjection|specialUse|camera"
android:exported="false">
<property
android:name="android.app.PROPERTY_SPECIAL_USE_FGS_SUBTYPE"
@@ -0,0 +1,411 @@
package com.ledgrab.android
import android.Manifest
import android.annotation.SuppressLint
import android.content.Context
import android.content.pm.PackageManager
import android.graphics.ImageFormat
import android.hardware.camera2.CameraCaptureSession
import android.hardware.camera2.CameraCharacteristics
import android.hardware.camera2.CameraDevice
import android.hardware.camera2.CameraManager
import android.media.Image
import android.media.ImageReader
import android.os.Handler
import android.os.HandlerThread
import android.os.SystemClock
import android.util.Log
import android.util.Size
import android.view.Surface
import com.chaquo.python.PyObject
import com.chaquo.python.Python
import kotlin.coroutines.resume
import kotlin.coroutines.resumeWithException
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.suspendCancellableCoroutine
import kotlinx.coroutines.withTimeout
import org.json.JSONArray
import org.json.JSONObject
/**
* Android camera bridge exposed to the Python server via Chaquopy.
*
* Wraps the Camera2 API into synchronous, blocking calls that can be
* invoked from a Python thread (Chaquopy proxy threads are real OS
* threads). The physical camera is opened **on demand** — Python's
* `android_camera_engine` calls [startCamera] when a capture stream
* initializes and [stopCamera] when it cleans up, so the camera-in-use
* indicator and battery cost are limited to actual use.
*
* Each captured frame is converted YUV_420_888 → RGB and pushed to the
* Python engine's `push_frame`, mirroring how [ScreenCapture] feeds
* `mediaprojection_engine`. Camera2 callbacks run on a private
* [HandlerThread] so they never touch the main looper.
*
* Python callers access the singleton via
* `jclass("com.ledgrab.android.CameraBridge").INSTANCE` — see
* `server/src/ledgrab/core/capture_engines/android_camera_engine.py`.
*/
object CameraBridge {
private const val TAG = "CameraBridge"
private const val ENGINE_MODULE = "ledgrab.core.capture_engines.android_camera_engine"
private const val OPEN_TIMEOUT_MS = 8_000L
private const val MAX_IMAGES = 2
private const val TARGET_FPS = 20
// "auto" capture size — balanced for ambient LED sampling (the LED
// pipeline downscales anyway), kept modest so the per-frame YUV→RGB
// conversion stays cheap on low-end TV boxes.
private const val DEFAULT_W = 1280
private const val DEFAULT_H = 720
private const val BYTES_PER_RGB = 3
@Volatile private var appContext: Context? = null
// Dedicated looper thread so Camera2 callbacks don't land on main.
private val camThread = HandlerThread("LedGrab-Camera").also { it.start() }
private val camHandler = Handler(camThread.looper)
// Active session state — guarded by [lock]. One camera at a time.
private val lock = Any()
private var cameraDevice: CameraDevice? = null
private var captureSession: CameraCaptureSession? = null
private var imageReader: ImageReader? = null
@Volatile private var running = false
private var activeIndex = -1
// Cached Python engine module handle for the per-frame push fast path.
@Volatile private var engineModule: PyObject? = null
// Reusable conversion buffers — sized once per session (output size is
// fixed for the session), reused to avoid per-frame GC churn on TV boxes.
private var rgbBuffer: ByteArray? = null
private var yBuf: ByteArray? = null
private var uBuf: ByteArray? = null
private var vBuf: ByteArray? = null
// Monotonic frame pacing (mirrors ScreenCapture's accumulator).
private val frameIntervalNanos = 1_000_000_000L / TARGET_FPS.coerceAtLeast(1)
private var nextFrameNanos = 0L
/** Called once from [LedGrabApp.onCreate] to bind the application context. */
@JvmStatic
fun init(context: Context) {
appContext = context.applicationContext
}
/**
* Enumerate cameras as a JSON array string the Python engine parses:
* `[{"index":0,"name":"Back camera","facing":"back","cameraId":"0"}, ...]`
*
* Indices are stable (positional in [CameraManager.cameraIdList]) so
* Python's `display_index` maps 1:1 to [startCamera]'s `index`.
* Enumeration needs no CAMERA permission. Returns `[]` on any error.
*/
@JvmStatic
fun listCameras(): String {
val arr = JSONArray()
val ctx = appContext
if (ctx == null) {
Log.w(TAG, "listCameras: context not bound (init not called)")
return arr.toString()
}
try {
val mgr = ctx.getSystemService(Context.CAMERA_SERVICE) as CameraManager
mgr.cameraIdList.forEachIndexed { idx, id ->
val facing = facingOf(mgr, id)
val name = when (facing) {
"front" -> "Front camera"
"back" -> "Back camera"
"external" -> "External camera $idx"
else -> "Camera $idx"
}
arr.put(
JSONObject()
.put("index", idx)
.put("name", name)
.put("facing", facing)
.put("cameraId", id),
)
}
} catch (e: Exception) {
Log.w(TAG, "listCameras failed: ${e.message}")
}
return arr.toString()
}
/**
* Open camera [index] and start streaming RGB frames to Python.
* Blocks until the capture session is configured (or fails/times out).
*
* Returns false — without throwing across the JNI boundary — when the
* CAMERA permission is missing, the index is out of range, or the
* device/session fails to configure. Closes any previously-open camera
* first (one active at a time).
*/
@SuppressLint("MissingPermission")
@JvmStatic
fun startCamera(index: Int, width: Int, height: Int): Boolean {
synchronized(lock) {
closeLocked()
val ctx = appContext ?: run {
Log.w(TAG, "startCamera: context not bound")
return false
}
if (ctx.checkSelfPermission(Manifest.permission.CAMERA)
!= PackageManager.PERMISSION_GRANTED
) {
Log.w(TAG, "startCamera: CAMERA permission not granted")
return false
}
val mgr = ctx.getSystemService(Context.CAMERA_SERVICE) as CameraManager
val ids = try {
mgr.cameraIdList
} catch (e: Exception) {
Log.w(TAG, "startCamera: cameraIdList failed: ${e.message}")
return false
}
if (index < 0 || index >= ids.size) {
Log.w(TAG, "startCamera: index $index out of range (${ids.size} cameras)")
return false
}
val cameraId = ids[index]
val size = chooseSize(mgr, cameraId, width, height) ?: run {
Log.w(TAG, "startCamera: no YUV output sizes for camera $index")
return false
}
val reader = ImageReader.newInstance(
size.width, size.height, ImageFormat.YUV_420_888, MAX_IMAGES,
)
// Size the conversion buffers once for this session.
rgbBuffer = ByteArray(size.width * size.height * BYTES_PER_RGB)
yBuf = null; uBuf = null; vBuf = null
nextFrameNanos = SystemClock.elapsedRealtimeNanos()
reader.setOnImageAvailableListener({ r -> onFrame(r) }, camHandler)
return try {
runBlocking {
withTimeout(OPEN_TIMEOUT_MS) {
// Publish each resource to its field as soon as it exists so
// closeLocked() (in the catch) can release it if a LATER step
// throws. Assigning only after setRepeatingRequest succeeds
// would orphan the opened CameraDevice on a createSession /
// setRepeatingRequest failure (camera stuck on; subsequent
// opens fail with CAMERA_IN_USE).
imageReader = reader
val device = openCamera(mgr, cameraId)
cameraDevice = device
val session = createSession(device, reader.surface)
captureSession = session
val request = device.createCaptureRequest(CameraDevice.TEMPLATE_PREVIEW)
.apply { addTarget(reader.surface) }
.build()
session.setRepeatingRequest(request, null, camHandler)
activeIndex = index
running = true
Log.i(TAG, "Camera $index opened (${size.width}x${size.height} @ ${TARGET_FPS}fps)")
true
}
}
} catch (e: Exception) {
Log.e(TAG, "startCamera($index) failed: ${e.message}")
// imageReader/cameraDevice/captureSession are now whatever got
// assigned before the failure — closeLocked releases each exactly
// once (idempotent, runCatching-wrapped).
closeLocked()
false
}
}
}
/** Stop streaming and release the camera. Idempotent; safe if not started. */
@JvmStatic
fun stopCamera() {
synchronized(lock) { closeLocked() }
Log.i(TAG, "Camera stopped")
}
// ── internals ────────────────────────────────────────────────────────
private fun facingOf(mgr: CameraManager, id: String): String =
when (mgr.getCameraCharacteristics(id).get(CameraCharacteristics.LENS_FACING)) {
CameraCharacteristics.LENS_FACING_FRONT -> "front"
CameraCharacteristics.LENS_FACING_BACK -> "back"
CameraCharacteristics.LENS_FACING_EXTERNAL -> "external"
else -> "unknown"
}
/** Pick the supported YUV size closest in area to the request (or the
* balanced default for `auto`/0). */
private fun chooseSize(mgr: CameraManager, cameraId: String, reqW: Int, reqH: Int): Size? {
val map = mgr.getCameraCharacteristics(cameraId)
.get(CameraCharacteristics.SCALER_STREAM_CONFIGURATION_MAP) ?: return null
val sizes = map.getOutputSizes(ImageFormat.YUV_420_888)
if (sizes == null || sizes.isEmpty()) return null
val targetArea = (if (reqW > 0) reqW else DEFAULT_W).toLong() *
(if (reqH > 0) reqH else DEFAULT_H)
return sizes.minByOrNull { kotlin.math.abs(it.width.toLong() * it.height - targetArea) }
}
@SuppressLint("MissingPermission")
private suspend fun openCamera(mgr: CameraManager, cameraId: String): CameraDevice =
suspendCancellableCoroutine { cont ->
mgr.openCamera(cameraId, object : CameraDevice.StateCallback() {
override fun onOpened(device: CameraDevice) {
if (cont.isActive) cont.resume(device) else device.close()
}
override fun onDisconnected(device: CameraDevice) {
device.close()
if (cont.isActive) cont.resumeWithException(IllegalStateException("camera disconnected"))
}
override fun onError(device: CameraDevice, error: Int) {
device.close()
if (cont.isActive) cont.resumeWithException(IllegalStateException("camera error $error"))
}
}, camHandler)
}
@Suppress("DEPRECATION")
private suspend fun createSession(device: CameraDevice, surface: Surface): CameraCaptureSession =
suspendCancellableCoroutine { cont ->
// createCaptureSession(List, callback, handler) is deprecated at
// API 30 but is the correct API down to minSdk 24 (the
// SessionConfiguration overload is API 28+).
device.createCaptureSession(
listOf(surface),
object : CameraCaptureSession.StateCallback() {
override fun onConfigured(session: CameraCaptureSession) {
if (cont.isActive) cont.resume(session)
}
override fun onConfigureFailed(session: CameraCaptureSession) {
if (cont.isActive) cont.resumeWithException(IllegalStateException("session configure failed"))
}
},
camHandler,
)
}
/** ImageReader callback — paced, converts YUV→RGB, pushes to Python. */
private fun onFrame(reader: ImageReader) {
if (!running) {
runCatching { reader.acquireLatestImage()?.close() }
return
}
val now = SystemClock.elapsedRealtimeNanos()
if (now < nextFrameNanos) {
runCatching { reader.acquireLatestImage()?.close() }
return
}
val image = runCatching { reader.acquireLatestImage() }.getOrNull() ?: return
try {
val w = image.width
val h = image.height
val out = ensureRgbBuffer(w * h * BYTES_PER_RGB)
yuv420ToRgb(image, out, w, h)
pushFrame(out, w, h)
nextFrameNanos += frameIntervalNanos
if (now - nextFrameNanos > frameIntervalNanos * 4) {
nextFrameNanos = now + frameIntervalNanos
}
} catch (e: Exception) {
Log.w(TAG, "frame processing error: ${e.message}")
} finally {
runCatching { image.close() }
}
}
private fun ensureRgbBuffer(size: Int): ByteArray {
val buf = rgbBuffer
if (buf != null && buf.size == size) return buf
return ByteArray(size).also { rgbBuffer = it }
}
/**
* Stride-aware YUV_420_888 → packed RGB (3 bytes/px) using BT.601
* fixed-point coefficients. Handles both planar and semi-planar
* (NV21-like, pixelStride 2) chroma layouts via the plane strides.
*/
private fun yuv420ToRgb(image: Image, out: ByteArray, width: Int, height: Int) {
val planes = image.planes
val yPlane = planes[0]
val uPlane = planes[1]
val vPlane = planes[2]
val yRowStride = yPlane.rowStride
val yPixStride = yPlane.pixelStride
val uRowStride = uPlane.rowStride
val uPixStride = uPlane.pixelStride
val vRowStride = vPlane.rowStride
val vPixStride = vPlane.pixelStride
// Copy each plane to a reusable array for fast indexed access
// (ByteBuffer absolute-get per pixel is far slower).
val yByteBuf = yPlane.buffer
val uByteBuf = uPlane.buffer
val vByteBuf = vPlane.buffer
val yArr = ensurePlane(yBuf, yByteBuf.remaining()).also { yBuf = it }
val uArr = ensurePlane(uBuf, uByteBuf.remaining()).also { uBuf = it }
val vArr = ensurePlane(vBuf, vByteBuf.remaining()).also { vBuf = it }
yByteBuf.get(yArr, 0, yArr.size)
uByteBuf.get(uArr, 0, uArr.size)
vByteBuf.get(vArr, 0, vArr.size)
var o = 0
for (row in 0 until height) {
val yRowBase = row * yRowStride
val uvRow = row shr 1
val uRowBase = uvRow * uRowStride
val vRowBase = uvRow * vRowStride
for (col in 0 until width) {
val y = (yArr[yRowBase + col * yPixStride].toInt() and 0xFF)
val uvCol = col shr 1
val u = (uArr[uRowBase + uvCol * uPixStride].toInt() and 0xFF) - 128
val v = (vArr[vRowBase + uvCol * vPixStride].toInt() and 0xFF) - 128
// BT.601 full-range, fixed-point (<<16).
var r = y + ((91881 * v) shr 16)
var g = y - ((22554 * u + 46802 * v) shr 16)
var b = y + ((116130 * u) shr 16)
if (r < 0) r = 0 else if (r > 255) r = 255
if (g < 0) g = 0 else if (g > 255) g = 255
if (b < 0) b = 0 else if (b > 255) b = 255
out[o++] = r.toByte()
out[o++] = g.toByte()
out[o++] = b.toByte()
}
}
}
/** Return [cached] if it is exactly [n] bytes long, else a fresh array. */
private fun ensurePlane(cached: ByteArray?, n: Int): ByteArray =
if (cached != null && cached.size == n) cached else ByteArray(n)
private fun pushFrame(rgb: ByteArray, width: Int, height: Int) {
val module = engineModule ?: runCatching {
Python.getInstance().getModule(ENGINE_MODULE)
}.getOrNull()?.also { engineModule = it } ?: return
try {
module.callAttr("push_frame", rgb, width, height)
} catch (e: Exception) {
Log.w(TAG, "push_frame failed: ${e.message}")
}
}
/** Tear down the active session. Caller holds [lock]. */
private fun closeLocked() {
running = false
activeIndex = -1
runCatching { imageReader?.setOnImageAvailableListener(null, null) }
runCatching { captureSession?.stopRepeating() }
runCatching { captureSession?.close() }
captureSession = null
runCatching { cameraDevice?.close() }
cameraDevice = null
runCatching { imageReader?.close() }
imageReader = null
}
}
@@ -113,11 +113,25 @@ class CaptureService : Service() {
val url = "http://$localIp:$SERVER_PORT"
try {
val type = if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.UPSIDE_DOWN_CAKE) {
-if (useRoot) {
+var t = if (useRoot) {
ServiceInfo.FOREGROUND_SERVICE_TYPE_SPECIAL_USE
} else {
ServiceInfo.FOREGROUND_SERVICE_TYPE_MEDIA_PROJECTION
}
// On-demand webcam capture opens the camera from this service.
// To retain camera access once the app is backgrounded (the
// always-on ambient-lighting case), API 34+ requires the camera
// FGS type. Add it ONLY when CAMERA is already granted — promoting
// with the camera type without the runtime permission throws and
// would kill the whole service on the (common) camera-less or
// not-yet-granted box. If CAMERA is granted later, it takes effect
// on the next Start (matches the audio/permission UX).
if (checkSelfPermission(Manifest.permission.CAMERA) ==
PackageManager.PERMISSION_GRANTED
) {
t = t or ServiceInfo.FOREGROUND_SERVICE_TYPE_CAMERA
}
t
} else {
0
}
@@ -51,6 +51,9 @@ class LedGrabApp : Application() {
// Bind application context for the BLE bridge so Python can
// scan and connect to BLE LED controllers.
BleBridge.init(this)
// Bind application context for the camera bridge so Python can
// enumerate cameras and open them on demand (webcam capture).
CameraBridge.init(this)
// Pre-warm the API key on a background thread. First-launch
// generation does a SharedPreferences.commit() (synchronous
@@ -55,6 +55,7 @@ class MainActivity : Activity() {
private const val REQUEST_MEDIA_PROJECTION = 1001
private const val REQUEST_POST_NOTIFICATIONS = 1002
private const val REQUEST_RECORD_AUDIO = 1003
private const val REQUEST_CAMERA = 1004
private const val QR_SIZE_PX = 560
private const val NOTIF_PREFS = "ledgrab_notif"
private const val KEY_NOTIF_ACCESS_PROMPTED = "notif_access_prompted"
@@ -209,6 +210,7 @@ class MainActivity : Activity() {
private fun startRootCaptureService() {
ensureNotificationPermission()
ensureNotificationListenerAccess()
ensureCameraPermission()
ContextCompat.startForegroundService(this, CaptureService.createRootIntent(this))
updateUI()
}
@@ -230,6 +232,7 @@ class MainActivity : Activity() {
ensureNotificationPermission()
ensureNotificationListenerAccess()
ensureAudioPermission()
ensureCameraPermission()
val intent = CaptureService.createIntent(this, resultCode, resultData)
ContextCompat.startForegroundService(this, intent)
updateUI()
@@ -507,6 +510,29 @@ class MainActivity : Activity() {
}
}
/**
* Request CAMERA so the capture service can open the device camera for
* on-device webcam capture. Fire-and-forget, like [ensureAudioPermission]:
* capture still works without it (just no camera engine), so we don't block
* on the result. Gated on actual camera hardware via FEATURE_CAMERA_ANY so
* camera-less TV boxes (the common case) never see the prompt. The camera
* is opened on demand only while a camera source is active — granting this
* does not keep the camera on. If first granted here, the camera engine
* becomes available on the next Start.
*/
private fun ensureCameraPermission() {
if (!packageManager.hasSystemFeature(PackageManager.FEATURE_CAMERA_ANY)) return
if (checkSelfPermission(Manifest.permission.CAMERA)
!= PackageManager.PERMISSION_GRANTED
) {
@Suppress("DEPRECATION")
requestPermissions(
arrayOf(Manifest.permission.CAMERA),
REQUEST_CAMERA,
)
}
}
/** Whether the user has granted notification-listener access to this app. */
private fun isNotificationAccessGranted(): Boolean =
NotificationManagerCompat.getEnabledListenerPackages(this).contains(packageName)