ledgrab/ANDROID-REVIEW/android-missing-functionality.md
alexei.dolgolyov 1c1bbe2551 feat(android): foreground-app automation condition
Make the existing Application automation rule (foreground app -> activate
scene) work on the Android-TV build. A Kotlin ForegroundAppBridge reads the
foreground app via UsageStatsManager and lists launchable apps via LauncherApps;
PlatformDetector bridges it in (ahead of the Windows-only ctypes guard) so the
existing AutomationEngine / ApplicationRule / storage / deactivation modes are
unchanged. New /system/installed-apps + /system/info endpoints feed an app picker
that stores package names (vs process names on desktop); on Android the editor
hides the match-type selector since the foreground app is the only obtainable
signal. PACKAGE_USAGE_STATS is granted via an on-device button + a web-UI banner
(no blanket prompt at capture start); detection degrades gracefully until granted.

Zero new Python/Gradle deps (UsageStatsManager + LauncherApps are in-platform;
matching only string-compares the package name, so no QUERY_ALL_PACKAGES).
assembleDebug + 1897 pytest + ruff + tsc + npm build all green; independent final
review (0 blockers) + security review (no critical issues).
2026-06-02 14:57:29 +03:00


Android (TV) — Missing Functionality Assessment

Status: review/feasibility document. No code changes. Last updated 2026-06-01.

Context

LedGrab ships an experimental on-device Android-TV build: a Kotlin shell that embeds the Python FastAPI server via Chaquopy, with Kotlin↔Python bridges (PythonBridge, BleBridge, UsbSerialBridge). Several desktop features are unavailable on this build because their Python backends rely on native libraries that have no Android/Chaquopy wheels (mss, dxcam, sounddevice/PortAudio, opencv, nvidia-ml-py, winrt, dbus-next), or on OS facilities Android sandboxes differently.

The README "Feature support by OS" table now carries an Android column reflecting this. This document assesses whether each missing feature can be added, how, and whether it's worth it.

The enabling pattern (why most of this is feasible)

Every desktop capability that's "missing" on Android is missing only because of a native dependency, not because the capability is impossible. Android exposes the same capability through a platform API, and the codebase already has the bridge shape to plug it in:

Bridge pattern: a Kotlin component captures an event/buffer → pushes it across the Chaquopy JNI boundary into a module-level receiver in a small Python engine → an existing engine/stream consumes it unchanged.

Reference implementation: server/src/ledgrab/core/capture_engines/mediaprojection_engine.py (configure() + push_frame() + a bounded queue.Queue) ↔ android/app/src/main/java/com/ledgrab/android/ScreenCapture.kt → PythonBridge.pushFrame(). Screen capture already works on Android this exact way.

So for most missing features the work is: add a Kotlin capture source + a thin Python receiver engine mirroring that pattern.
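
The receiver half of that pattern can be sketched as below. This is a minimal illustration, not the contents of mediaprojection_engine.py: only configure(), push_frame() and the bounded queue.Queue come from the description above, while read_frame(), the Frame tuple and the drop-oldest policy are assumptions made for the example.

```python
import queue
from typing import Optional, Tuple

# Hypothetical frame shape for this sketch: (width, height, raw RGB bytes).
Frame = Tuple[int, int, bytes]

# Module-level receiver: the Kotlin side pushes, a Python consumer pulls.
_frames: "queue.Queue[Frame]" = queue.Queue(maxsize=2)


def configure(max_queued: int = 2) -> None:
    """Reset the receiver with a new bound (assumed semantics)."""
    global _frames
    _frames = queue.Queue(maxsize=max_queued)


def push_frame(width: int, height: int, data: bytes) -> bool:
    """Entry point called across the bridge. Never blocks the caller.

    Assumption: when the queue is full, the stalest frame is dropped so the
    consumer always sees recent content.
    """
    frame = (width, height, bytes(data))
    try:
        _frames.put_nowait(frame)
        return True
    except queue.Full:
        try:
            _frames.get_nowait()  # discard the oldest queued frame
        except queue.Empty:
            pass
        try:
            _frames.put_nowait(frame)
            return True
        except queue.Full:
            return False  # lost a race with another producer


def read_frame(timeout: float = 0.1) -> Optional[Frame]:
    """Consumer side (hypothetical name): next frame, or None on timeout."""
    try:
        return _frames.get(timeout=timeout)
    except queue.Empty:
        return None
```

Bounding the queue keeps memory flat when the consumer stalls; whether to drop the oldest or the newest frame is a design choice the real engine may make differently.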


Current Android capability matrix

| Feature | Desktop | Android (TV) today | Missing? |
|---|---|---|---|
| Screen capture | DXCam/WGC/MSS | MediaProjection + root screenrecord | No |
| LED transports (network/USB-serial/BLE) | | (USB via Android driver, BLE via Android bridge) | No |
| System metrics | psutil | CPU/RAM/battery/thermal via /proc, /sys (AndroidMetricsProvider) | No |
| Audio capture | WASAPI / Sounddevice | no PortAudio | Yes |
| Notification capture | WinRT / D-Bus | NotificationListenerService → push_notification() | No (implemented) |
| Webcam capture | OpenCV | Camera2 + on-demand bridge (AndroidCameraEngine) | No (implemented) |
| GPU monitoring | NVML | no NVIDIA GPU | Marginal |
| Capture from another Android phone | scrcpy/ADB | | Skip (redundant) |
| Automation: foreground-app condition | Windows ctypes (running/topmost/fullscreen) | foreground app via UsageStatsManager (ForegroundAppBridge) | No (implemented) |
| Monitor names / multi-display | WMI / generic | Single built-in display | Low value |

Per-feature feasibility

🔊 Audio capture — FEASIBLE, HIGH VALUE (detailed plan exists)

  • Blocker: only sounddevice/PortAudio is missing — not the capability.
  • Android path: AudioPlaybackCapture (API 29+) captures system playback audio and takes a MediaProjection token — which the app already obtains for screen capture. Kotlin AudioRecord → push PCM (float32) → a new push-based AndroidAudioEngine mirroring mediaprojection_engine.py, registered in core/audio/__init__.py, feeding the existing AudioAnalyzer unchanged. Mic (AudioSource.MIC) is the fallback.
  • Effort: moderate. Value: high — music/sound-reactive lighting is a flagship use on a TV box. No new Python deps.
  • ⚠️ DRM-protected apps (Netflix etc.) opt out of playback capture; works for non-DRM media and the device's own audio. Root mode (no MediaProjection) → mic-only.
  • 📄 See android-audio-capture-plan.md for the full implementation plan.
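
The Python side of the audio path described above could look roughly like this. It is a sketch under stated assumptions: push_audio(), read_chunk() and rms() are hypothetical names, the PCM is assumed to arrive as native-endian float32 bytes, and rms() only stands in for whatever the existing AudioAnalyzer computes.

```python
import math
import queue
from array import array
from typing import Optional

# Bounded buffer of decoded chunks; the bound of 8 is an arbitrary choice.
_chunks: "queue.Queue[array]" = queue.Queue(maxsize=8)


def push_audio(pcm: bytes) -> bool:
    """Receive one chunk of float32 PCM pushed from the Kotlin AudioRecord loop.

    Assumption: samples are native-endian 4-byte floats. Trailing bytes that
    do not form a whole sample are ignored.
    """
    samples = array("f")
    usable = len(pcm) - (len(pcm) % samples.itemsize)
    samples.frombytes(pcm[:usable])
    try:
        _chunks.put_nowait(samples)
        return True
    except queue.Full:
        return False  # consumer is behind; drop this chunk instead of blocking


def read_chunk(timeout: float = 0.1) -> Optional[array]:
    """Consumer side: next decoded chunk, or None if nothing arrived in time."""
    try:
        return _chunks.get(timeout=timeout)
    except queue.Empty:
        return None


def rms(samples) -> float:
    """Root-mean-square level of a chunk (placeholder for real analysis)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))
```

Dropping chunks on overflow trades a brief gap in the analysis for a capture thread that never stalls, which seems the safer default for reactive lighting.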

🔔 Notification capture — IMPLEMENTED (shipped)

  • Android is the best platform for this: NotificationListenerService is the native, event-push mechanism (no polling).
  • Path: a NotificationListenerService resolves the posting app's display label and pushes it via a module-level push_notification() into the existing os_notification_listener.py pipeline (a new push-based _AndroidBackend alongside _WindowsBackend/_LinuxBackend). Existing NotificationColorStripSource filters, per-app colors/sounds, and the history endpoint all work unchanged. No new Python deps.
  • Permission: user enables "Notification access" in Settings (ACTION_NOTIFICATION_LISTENER_SETTINGS); no runtime-permission popup.
  • Effort: moderate. Value: high.
  • Implemented on branch feature/android-notification-capture: a push-based _AndroidBackend + module-level push_notification() in os_notification_listener.py, a Kotlin LedGrabNotificationListener (NLS), and prompt-once permission UX. App-name parity — only the resolved app label crosses the JNI boundary, never the notification title/body. ⚠️ App labels can differ across OSes (Windows display_name / Linux D-Bus app_name / Android getApplicationLabel), so desktop-configured per-app colors/filters may need re-matching on Android.
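
As a rough illustration of the push-based backend described above, the sketch below wires a module-level push_notification() to a backend object. Only the name push_notification() and the rule that just the app label crosses the boundary come from this document; AndroidBackendSketch, install_backend() and the start/stop/deliver methods are hypothetical and do not mirror the real _AndroidBackend.

```python
import threading
from typing import Callable, Optional


class AndroidBackendSketch:
    """Hypothetical push-based backend: no polling, events are delivered to it."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._callback: Optional[Callable[[str], None]] = None

    def start(self, on_notification: Callable[[str], None]) -> None:
        with self._lock:
            self._callback = on_notification

    def stop(self) -> None:
        with self._lock:
            self._callback = None

    def deliver(self, app_label: str) -> bool:
        with self._lock:
            callback = self._callback
        if callback is None:
            return False  # backend not started: the event is dropped
        callback(app_label)
        return True


_backend: Optional[AndroidBackendSketch] = None


def install_backend(backend: Optional[AndroidBackendSketch]) -> None:
    """Register the backend that push_notification() should feed."""
    global _backend
    _backend = backend


def push_notification(app_label: str) -> bool:
    """Module-level entry point for the Kotlin listener.

    Only the resolved app label is accepted; title and body never cross over.
    """
    backend = _backend
    label = app_label.strip() if app_label else ""
    if backend is None or not label:
        return False
    return backend.deliver(label)
```

For example, after `backend.start(handler)` and `install_backend(backend)`, a call such as `push_notification("Netflix")` reaches `handler` with the label only.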

📷 Webcam capture — IMPLEMENTED (shipped)

  • Blocker was opencv-python-headless (no Chaquopy cp311 wheel) — but capture doesn't need OpenCV. Implemented with Camera2 + ImageReader in Kotlin pushing RGB frames through the same bridge as MediaProjection into a new AndroidCameraEngine.
  • Path: a Kotlin CameraBridge singleton (Camera2) enumerates cameras and opens the camera on demand (only while a capture source is active — driven Python→Kotlin via the BleBridge/UsbSerialBridge pattern), converts each frame YUV_420_888→RGB, and pushes it into a push-based AndroidCameraEngine (core/capture_engines/android_camera_engine.py) that mirrors mediaprojection_engine.py. Cameras surface as selectable "displays" exactly like the desktop OpenCV CameraEngine; the data-driven capture-template UI (engine list + resolution config + display picker) needs no changes. No new Python deps; no new Gradle deps (Camera2 is in-platform).
  • Permission: CAMERA requested at capture-start, gated on FEATURE_CAMERA_ANY so camera-less TV boxes never see the prompt; graceful degradation when denied. The service is promoted with the camera FGS type (+ FOREGROUND_SERVICE_CAMERA) only when CAMERA is already granted, so backgrounded capture keeps working without risking a failed service start on camera-less boxes. (Unlike audio playback capture, the camera can't ride the MediaProjection token, so it needs its own FGS type to survive backgrounding.)
  • Effort: moderate. Value: low (TVs rarely have cameras), but the implementation reuses existing infrastructure end-to-end. Priority 0 so it's never auto-selected over MediaProjection — chosen explicitly via engine_type="android_camera".
  • ⚠️ MVP scope / limitations: webcam capture works while LedGrab capture is running (no camera-only server path on Android); one camera active at a time; "auto" picks a balanced output size (not the sensor max) to keep per-frame YUV→RGB cheap; USB-UVC webcams appear only if the device routes them through Camera2 (varies by box); no frame-rotation correction.
  • 📄 See android-webcam-capture-plan.md for the full implementation notes.
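
The YUV→RGB step mentioned above is done in Kotlin in the shipped code; the Python sketch below only illustrates the arithmetic. It assumes full-range BT.601 (JFIF) coefficients and tightly packed planar 4:2:0 input, whereas real YUV_420_888 buffers carry row and pixel strides that a production converter has to honor. Both function names are hypothetical.

```python
from typing import Tuple


def yuv_to_rgb(y: int, u: int, v: int) -> Tuple[int, int, int]:
    """Convert one pixel using full-range BT.601 (JFIF) coefficients."""
    d = u - 128
    e = v - 128
    r = y + 1.402 * e
    g = y - 0.344136 * d - 0.714136 * e
    b = y + 1.772 * d
    # Clamp each channel to the 8-bit range after rounding.
    return tuple(max(0, min(255, round(c))) for c in (r, g, b))


def i420_to_rgb(width: int, height: int,
                y_plane: bytes, u_plane: bytes, v_plane: bytes) -> bytes:
    """Convert tightly packed planar 4:2:0 data to interleaved RGB bytes.

    Assumption: no stride padding; each chroma sample covers a 2x2 luma block.
    """
    out = bytearray(width * height * 3)
    chroma_width = (width + 1) // 2
    for row in range(height):
        for col in range(width):
            luma = y_plane[row * width + col]
            ci = (row // 2) * chroma_width + (col // 2)
            rgb = yuv_to_rgb(luma, u_plane[ci], v_plane[ci])
            i = (row * width + col) * 3
            out[i:i + 3] = bytes(rgb)
    return bytes(out)
```

The per-pixel loop makes the cost visible: work scales with width × height, which is why a smaller "auto" output size keeps conversion cheap.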

🎮 GPU monitoring — MARGINAL, SKIP FOR NOW

  • NVML is desktop-NVIDIA only. Android GPU load lives in vendor-specific sysfs (Adreno /sys/class/kgsl/kgsl-3d0/gpubusy, Mali /sys/class/devfreq/*.mali/...), inconsistent and often root-only.
  • CPU/RAM/battery/thermal are already covered by AndroidMetricsProvider. A best-effort GPU-load reader could be added to that provider, but reliability is poor and value is low.
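
If a best-effort reader were ever added, it might look like the sketch below. It assumes the Adreno gpubusy node reports two whitespace-separated integers (busy time, then total time); that format is an assumption about the kernel driver, not something this document specifies, and both function names are hypothetical. Any failure returns None so the metric can simply be omitted.

```python
from pathlib import Path
from typing import Optional

# Path taken from the note above; often unreadable without root.
ADRENO_GPUBUSY = Path("/sys/class/kgsl/kgsl-3d0/gpubusy")


def parse_gpubusy(text: str) -> Optional[float]:
    """Parse '<busy> <total>' into a load percentage, or None if unusable."""
    parts = text.split()
    if len(parts) < 2:
        return None
    try:
        busy, total = int(parts[0]), int(parts[1])
    except ValueError:
        return None
    if total <= 0 or busy < 0:
        return None  # idle window or malformed counters
    return min(100.0, 100.0 * busy / total)


def read_gpu_load(path: Path = ADRENO_GPUBUSY) -> Optional[float]:
    """Best-effort read: a missing file or denied permission yields None."""
    try:
        return parse_gpubusy(path.read_text())
    except OSError:
        return None
```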

🪟 Automation: foreground-app condition — IMPLEMENTED (shipped)

  • Android forbids full window/process enumeration (getRunningTasks has been restricted since API 21), but the current foreground app package is obtainable via UsageStatsManager (needs the PACKAGE_USAGE_STATS special access).
  • Path: a Kotlin ForegroundAppBridge (UsageStatsManager queryEvents over a ~10s trailing window + LauncherApps for the picker + an AppOpsManager access check) bridged into automations/platform_detector.py via the guarded-jclass pattern, ahead of the Windows-only ctypes path. The existing ApplicationRule / AutomationEngine / storage / deactivation modes are unchanged — only the detection + the picker's data source were filled in. No new Python or Gradle deps (UsageStatsManager + LauncherApps are in-platform; matching only string-compares the package name, so no QUERY_ALL_PACKAGES / package visibility is needed).
  • UI: the automation editor's app picker lists launchable apps by human label (storing the package name) via a new GET /api/v1/system/installed-apps; on Android the match-type selector is hidden and match_type is forced to topmost (the only obtainable signal), with a cross-platform value caveat — apps are package names on Android (com.netflix.mediaclient) vs process names on Windows (chrome.exe), so rules are not portable across platforms.
  • Permission: PACKAGE_USAGE_STATS is a special access (Settings deep-link via ACTION_USAGE_ACCESS_SETTINGS); the device shows a "Grant usage access" button when missing, and the web-UI rule editor shows a banner (driven by /system/info's usage_access_granted). No blanket prompt at capture start. Detection degrades gracefully (rule never matches, warned once) until access is granted. Effort: moderate. Value: moderate (per-app scenes on a TV box). Full window-title matching remains out of scope (Android does not expose it).
  • 📄 See android-foreground-app-automation-plan.md for the full implementation notes.
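
The detection logic above can be sketched in Python as follows. The guarded import shows the general shape of the guarded-jclass pattern (`from java import jclass` is the Chaquopy import and is absent on desktop); everything else is illustrative. The event tuples, foreground_package() and rule_matches() are hypothetical names, and the real bridge performs the UsageStatsManager query in Kotlin.

```python
from typing import Iterable, Optional, Tuple

try:
    from java import jclass  # importable only inside the Chaquopy runtime
except ImportError:
    jclass = None  # desktop or test environment: the Android path stays disabled

# Value of UsageEvents.Event.ACTIVITY_RESUMED in the Android SDK.
ACTIVITY_RESUMED = 1

# Hypothetical event shape for this sketch: (timestamp_ms, event_type, package).
UsageEvent = Tuple[int, int, str]


def foreground_package(events: Iterable[UsageEvent], now_ms: int,
                       window_ms: int = 10_000,
                       previous: Optional[str] = None) -> Optional[str]:
    """Return the package of the most recent resume event in the trailing window.

    Assumption: if no resume event falls inside the window (the same app has
    stayed in front), the caller's previous result is kept.
    """
    latest: Optional[Tuple[int, str]] = None
    cutoff = now_ms - window_ms
    for ts, event_type, package in events:
        if event_type != ACTIVITY_RESUMED or ts < cutoff or ts > now_ms:
            continue
        if latest is None or ts >= latest[0]:
            latest = (ts, package)
    return latest[1] if latest else previous


def rule_matches(rule_apps: Iterable[str], foreground: Optional[str]) -> bool:
    """Plain string comparison on package names; never matches when unknown."""
    return foreground is not None and foreground in set(rule_apps)
```

Because matching is a plain string comparison against stored package names, a rule such as `["com.netflix.mediaclient"]` needs no package-visibility query, and an unknown foreground app (access not granted) simply never matches.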

📱 Capture from another Android phone (scrcpy/ADB) — SKIP

  • Impractical and redundant: no adb binary in Chaquopy, TV boxes can't reliably host an adb server, and the device already captures its own screen via MediaProjection.

🖥️ Monitor names / multi-display — LOW VALUE

  • DisplayManager can report a better display name and enumerate secondary (HDMI) displays, but MediaProjection captures the default display; capturing a secondary display is more involved and rarely useful on a single-screen box.

Prioritization

| Priority | Feature | Effort | Value | New Python deps | Status |
|---|---|---|---|---|---|
| 1 | Notification capture | Moderate | High | None | Implemented |
| 2 | Audio capture | Moderate | High | None | Implemented |
| 4 | Webcam capture (Camera2) | Moderate | Low | None | Implemented |
| 3 | Automation: foreground-app condition | Moderate | Moderate | None | Implemented |
| | GPU load (vendor sysfs) | Low-Med | Low | None | Not recommended |
| | Capture from another phone | | | | Won't do |
| | Multi-display / monitor names | Low | Low | None | Not recommended |

Status: notifications, audio, webcam, and the foreground-app automation condition are all shipped — each reuses existing infrastructure (the Kotlin↔Python bridge pattern, the MediaProjection consent token / process-global Python.getInstance(), the capture/audio/notification/automation pipelines) and adds zero Python dependencies, so none risks the Chaquopy --no-deps build constraint documented in CLAUDE.md. No prioritized ideas remain; GPU load, another-phone capture, and multi-display remain not-recommended / won't-do.

Cross-cutting notes

  • No build.gradle.kts / Chaquopy pip impact for notifications or audio — both use Android platform APIs (Kotlin) + stdlib/numpy (already bundled) on the Python side.
  • Per-instance PythonBridge: PythonBridge is created per CaptureService instance, so system-bound services (e.g. a NotificationListenerService) call Python via the process-global Python.getInstance() rather than borrowing that bridge.
  • Permissions are the recurring friction, not the capture: audio needs RECORD_AUDIO + (for playback capture) a MediaProjection token; notifications need the "Notification access" settings toggle; foreground-app automation needs PACKAGE_USAGE_STATS.