docs(android): add audio-capture design + missing-functionality review

- android-audio-capture-plan.md — design behind the merged on-device audio
  capture feature (487259a).
- android-missing-functionality.md — Android missing-feature review notes.
This commit is contained in:
2026-06-02 03:30:43 +03:00
parent 487259a96d
commit 4b2e8fc5ec
2 changed files with 461 additions and 0 deletions
@@ -0,0 +1,153 @@
# Android (TV) — Missing Functionality Assessment
> Status: review/feasibility document. No code changes. Last updated 2026-06-01.
## Context
LedGrab ships an **experimental on-device Android-TV build**: a Kotlin shell that
embeds the Python FastAPI server via **Chaquopy**, with Kotlin↔Python **bridges**
(`PythonBridge`, `BleBridge`, `UsbSerialBridge`). Several desktop features are
unavailable on this build because their Python backends rely on native libraries
that have no Android/Chaquopy wheels (`mss`, `dxcam`, `sounddevice`/PortAudio,
`opencv`, `nvidia-ml-py`, `winrt`, `dbus-next`), or on OS facilities Android
sandboxes differently.
The README "Feature support by OS" table now carries an Android column reflecting
this. This document assesses **whether each missing feature can be added**, how, and
whether it's worth it.
### The enabling pattern (why most of this is feasible)
Every desktop capability that's "missing" on Android is missing only because of a
*native dependency*, not because the capability is impossible. Android exposes the
same capability through a platform API, and the codebase already has the bridge
shape to plug it in:
> **Bridge pattern:** a Kotlin component captures an event/buffer → pushes it across
> the Chaquopy JNI boundary into a **module-level receiver** in a small Python engine
> → an existing engine/stream consumes it unchanged.
Reference implementation: `server/src/ledgrab/core/capture_engines/mediaprojection_engine.py`
(`configure()` + `push_frame()` + a bounded `queue.Queue`) ↔
`android/app/src/main/java/com/ledgrab/android/ScreenCapture.kt`
`PythonBridge.pushFrame()`. Screen capture already works on Android this exact way.
So for most missing features the work is: **add a Kotlin capture source + a thin
Python receiver engine mirroring that pattern.**
---
## Current Android capability matrix
| Feature | Desktop | Android (TV) today | Missing? |
| ------- | ------- | ------------------ | -------- |
| Screen capture | DXCam/WGC/MSS | ✅ MediaProjection + root `screenrecord` | No |
| LED transports (network/USB-serial/BLE) | ✅ | ✅ (USB via Android driver, BLE via Android bridge) | No |
| System metrics | psutil | ✅ CPU/RAM/battery/thermal via `/proc`, `/sys` (`AndroidMetricsProvider`) | No |
| **Audio capture** | WASAPI / Sounddevice | ❌ no PortAudio | **Yes** |
| **Notification capture** | WinRT / D-Bus | ❌ listener only Win/Linux | **Yes** |
| Webcam capture | OpenCV | ❌ no OpenCV wheel | Yes (niche) |
| GPU monitoring | NVML | ❌ no NVIDIA GPU | Marginal |
| Capture from *another* Android phone | scrcpy/ADB | ❌ | Skip (redundant) |
| Automation: window/process conditions | Windows ctypes | ❌ sandboxed | Partial |
| Monitor names / multi-display | WMI / generic | Single built-in display | Low value |
---
## Per-feature feasibility
### 🔊 Audio capture — **FEASIBLE, HIGH VALUE** ⭐ (detailed plan exists)
- **Blocker:** only `sounddevice`/PortAudio is missing — not the capability.
- **Android path:** `AudioPlaybackCapture` (API 29+) captures system playback audio and
**takes a `MediaProjection` token — which the app already obtains for screen capture.**
Kotlin `AudioRecord` → push PCM (float32) → a new push-based `AndroidAudioEngine`
mirroring `mediaprojection_engine.py`, registered in `core/audio/__init__.py`, feeding
the existing `AudioAnalyzer` unchanged. Mic (`AudioSource.MIC`) is the fallback.
- **Effort:** moderate. **Value:** high — music/sound-reactive lighting is a flagship use
on a TV box. **No new Python deps.**
- ⚠️ DRM-protected apps (Netflix etc.) opt out of playback capture; works for non-DRM
media and the device's own audio. Root mode (no MediaProjection) → mic-only.
- 📄 **See `android-audio-capture-plan.md`** for the full implementation plan.
### 🔔 Notification capture — **FEASIBLE, HIGH VALUE** ⭐ (planned)
- **Android is the *best* platform for this:** `NotificationListenerService` is the native,
event-push mechanism (no polling).
- **Path:** a `NotificationListenerService` resolves the posting app's display label and
pushes it via a module-level `push_notification()` into the existing
`os_notification_listener.py` pipeline (a new push-based `_AndroidBackend` alongside
`_WindowsBackend`/`_LinuxBackend`). Existing `NotificationColorStripSource` filters,
per-app colors/sounds, and the history endpoint all work unchanged. **No new Python deps.**
- **Permission:** user enables "Notification access" in Settings (`ACTION_NOTIFICATION_LISTENER_SETTINGS`);
no runtime-permission popup.
- **Effort:** moderate. **Value:** high.
- 📄 **Plan approved & detailed** — see `C:\Users\Alexei\.claude\plans\deep-enchanting-muffin.md`
(app-name parity; prompt-once permission UX).
### 📷 Webcam capture — **FEASIBLE, LOW VALUE**
- **Blocker** is `opencv-python-headless` (no Chaquopy cp311 wheel) — but capture doesn't
*need* OpenCV. Use **CameraX / Camera2** + `ImageReader` in Kotlin and push frames through
the same bridge as MediaProjection into a new `CameraBridgeEngine`.
- **Effort:** moderate. **Value:** low — TVs rarely have cameras; USB-UVC webcams need extra
device handling. Recommend deferring unless a concrete use case appears.
### 🎮 GPU monitoring — **MARGINAL, SKIP FOR NOW**
- NVML is desktop-NVIDIA only. Android GPU load lives in **vendor-specific sysfs**
(Adreno `/sys/class/kgsl/kgsl-3d0/gpubusy`, Mali `/sys/class/devfreq/*.mali/...`),
inconsistent and often root-only.
- CPU/RAM/battery/thermal are **already** covered by `AndroidMetricsProvider`. A best-effort
GPU-load reader could be added to that provider, but reliability is poor and value is low.
### 🪟 Automation: window/process conditions — **PARTIAL**
- Android forbids full window/process enumeration (`getRunningTasks` restricted since API 21+).
- **Obtainable:** the *current foreground app package* via `UsageStatsManager` (needs the
`PACKAGE_USAGE_STATS` special access) or an `AccessibilityService`.
- So "when <app> is in the foreground → scene X" is feasible (mirrors
`automations/platform_detector.py`, which currently returns empty off-Windows); full
window-title matching is **not**. **Effort:** moderate. **Value:** moderate (per-app scenes
on a TV box).
### 📱 Capture from *another* Android phone (scrcpy/ADB) — **SKIP**
- Impractical and redundant: no `adb` binary in Chaquopy, TV boxes can't reliably host an
adb server, and the device already captures its **own** screen via MediaProjection.
### 🖥️ Monitor names / multi-display — **LOW VALUE**
- `DisplayManager` can report a better display name and enumerate secondary (HDMI) displays,
but MediaProjection captures the default display; capturing a secondary display is more
involved and rarely useful on a single-screen box.
---
## Prioritization
| Priority | Feature | Effort | Value | New Python deps | Status |
| -------- | ------- | ------ | ----- | --------------- | ------ |
| 1 | Notification capture | Moderate | High | None | **Plan approved** |
| 2 | Audio capture | Moderate | High | None | **Plan written** (this folder) |
| 3 | Automation: foreground-app condition | Moderate | Moderate | None | Idea |
| 4 | Webcam capture (CameraX) | Moderate | Low | None | Idea |
| — | GPU load (vendor sysfs) | LowMed | Low | None | Not recommended |
| — | Capture from another phone | — | — | — | Won't do |
| — | Multi-display / monitor names | Low | Low | None | Not recommended |
**Recommended order:** ship notifications → ship audio → reassess. Both reuse existing
infrastructure (bridge pattern, the MediaProjection consent token, the audio/notification
pipelines) and add **zero** Python dependencies, so neither risks the Chaquopy
`--no-deps` build constraint documented in `CLAUDE.md`.
## Cross-cutting notes
- **No `build.gradle.kts` / Chaquopy pip impact** for notifications or audio — both use Android
platform APIs (Kotlin) + stdlib/`numpy` (already bundled) on the Python side.
- **Per-instance `PythonBridge`:** `PythonBridge` is created per `CaptureService` instance, so
system-bound services (e.g. a `NotificationListenerService`) call Python via the
process-global `Python.getInstance()` rather than borrowing that bridge.
- **Permissions are the recurring friction**, not the capture: audio needs `RECORD_AUDIO` +
(for playback capture) a MediaProjection token; notifications need the "Notification access"
settings toggle; foreground-app automation needs `PACKAGE_USAGE_STATS`.