# Plan: Android on-device audio capture

> Status: proposed plan (not yet approved). No code changes. Last updated 2026-06-01.

## Context

LedGrab's audio-reactive features (music analyzer, audio value sources, band filters) depend on capturing an audio stream and running it through `AudioAnalyzer` (`server/src/ledgrab/core/audio/analysis.py`). On desktop this is fed by **WASAPI** (Windows) or **Sounddevice/PortAudio** (cross-platform). On the **experimental Android-TV build** neither is available (`sounddevice` has no Chaquopy wheel and PortAudio isn't bundled), so `core/audio/__init__.py` registers only `DemoAudioEngine`, and audio-reactive lighting is effectively dead on Android.

Android does not need PortAudio: the platform exposes **`AudioPlaybackCapture`** (API 29+), which captures system playback audio and **takes a `MediaProjection` token, the very token the app already obtains for screen capture** (`ScreenCapture(projection, …)`). This plan adds a push-based Android audio engine so the TV box can drive sound-reactive lighting from its own media playback, at parity with how desktop audio feeds the analyzer.

The design mirrors the working screen-capture bridge (`mediaprojection_engine.py` ↔ `ScreenCapture.kt` ↔ `PythonBridge`) and the existing audio engine abstraction (`AudioCaptureEngine` / `AudioCaptureStreamBase` / `AudioEngineRegistry`). **No new Python dependencies** (`numpy` is already bundled), so no Chaquopy / `build.gradle.kts` `pip {}` changes are needed.

---

## Approach

A new **push-based** audio engine registered in the existing `AudioEngineRegistry`:

- **Python:** `AndroidAudioEngine` + `AndroidAudioCaptureStream` mirroring `SounddeviceEngine`, but `read_chunk()` pops PCM from a module-level queue that **Kotlin fills** (mirror of `mediaprojection_engine.push_frame`). High `ENGINE_PRIORITY` so `AudioEngineRegistry.get_best_available_engine()` selects it on Android.
  The existing `ManagedAudioStream` capture loop and `AudioAnalyzer` consume `read_chunk()` unchanged.
- **Android:** an `AudioCapture` helper using `AudioRecord` + `AudioPlaybackCaptureConfiguration` (reusing `CaptureService`'s `MediaProjection`), pushing float32 PCM to Python, with a mic (`AudioSource.MIC`) fallback. Wired into `CaptureService` next to `ScreenCapture`.

```
[media playback]
  → AudioRecord (AudioPlaybackCapture, reuses MediaProjection)
  → AudioCapture.kt
  → PythonBridge.pushAudio(pcmFloat32, frames, channels)
  → android_audio_engine.push_samples()   [module-level queue]
  → AndroidAudioCaptureStream.read_chunk()
  → ManagedAudioStream → AudioAnalyzer    [unchanged]
```

---

## Part A: Python (server)

**New file: `server/src/ledgrab/core/audio/android_audio_engine.py`**, mirroring `mediaprojection_engine.py` (queue + configure + push) and `sounddevice_engine.py` (engine/stream shape):

```python
import queue
from typing import Any, Dict, List

import numpy as np

from ledgrab.core.audio.base import AudioCaptureEngine, AudioCaptureStreamBase, AudioDeviceInfo
from ledgrab.utils import get_logger

logger = get_logger(__name__)

# Module-level state shared between the Kotlin push path and the Python stream.
_pcm_queue: "queue.Queue[np.ndarray]" = queue.Queue(maxsize=8)
_sample_rate = 48000
_channels = 2
_chunk_size = 1024
_active = False


def configure(sample_rate: int, channels: int, chunk_size: int) -> None:
    """Called from Kotlin before audio frames start flowing. Drains stale PCM."""
    global _sample_rate, _channels, _chunk_size, _active
    while not _pcm_queue.empty():
        try:
            _pcm_queue.get_nowait()
        except queue.Empty:
            break
    _sample_rate, _channels, _chunk_size = sample_rate, channels, chunk_size
    _active = True


def push_samples(pcm_float32: bytes) -> None:
    """Push one interleaved float32 PCM chunk from Kotlin.
    Drops oldest if full."""
    samples = np.frombuffer(pcm_float32, dtype=np.float32)
    try:
        _pcm_queue.put_nowait(samples)
    except queue.Full:
        # Queue is full: discard the oldest chunk, then retry once.
        try:
            _pcm_queue.get_nowait()
        except queue.Empty:
            pass
        try:
            _pcm_queue.put_nowait(samples)
        except queue.Full:
            pass


def shutdown() -> None:
    global _active
    _active = False


class AndroidAudioCaptureStream(AudioCaptureStreamBase):
    @property
    def channels(self) -> int:
        return _channels

    @property
    def sample_rate(self) -> int:
        return _sample_rate

    @property
    def chunk_size(self) -> int:
        return _chunk_size

    def initialize(self) -> None:
        if not _active:
            raise RuntimeError("Android audio engine not configured (only valid in-app).")
        self._initialized = True

    def cleanup(self) -> None:
        self._initialized = False

    def read_chunk(self) -> np.ndarray | None:
        try:
            return _pcm_queue.get(timeout=0.1)  # 1-D float32 interleaved
        except queue.Empty:
            return None


class AndroidAudioEngine(AudioCaptureEngine):
    ENGINE_TYPE = "android_playback"
    ENGINE_PRIORITY = 100  # highest on Android (demo is lower)

    @classmethod
    def is_available(cls) -> bool:
        from ledgrab.utils.platform import is_android
        return is_android() and _active

    @classmethod
    def get_default_config(cls) -> Dict[str, Any]:
        return {"sample_rate": _sample_rate, "channels": _channels, "chunk_size": _chunk_size}

    @classmethod
    def enumerate_devices(cls) -> List[AudioDeviceInfo]:
        if not cls.is_available():
            return []
        return [AudioDeviceInfo(index=0, name="Android playback (system audio)",
                                is_input=True, is_loopback=True, channels=_channels,
                                default_samplerate=float(_sample_rate))]

    @classmethod
    def create_stream(cls, device_index, is_loopback, config) -> AndroidAudioCaptureStream:
        return AndroidAudioCaptureStream(device_index, is_loopback,
                                         {**cls.get_default_config(), **config})
```

**Modify `server/src/ledgrab/core/audio/__init__.py`** to register the engine behind a guarded import, matching the existing `_has_wasapi` / `_has_sounddevice` pattern:

```python
try:
    from ledgrab.core.audio.android_audio_engine import AndroidAudioEngine
    _has_android_audio = True
except ImportError:
    _has_android_audio = False
...
if _has_android_audio:
    AudioEngineRegistry.register(AndroidAudioEngine)
```

**Reused, unchanged:** `AudioEngineRegistry.get_best_available_engine()` (picks by priority), `ManagedAudioStream._capture_loop()` (`audio_capture.py`), `AudioAnalyzer`, the audio value sources, and the device-enumeration endpoints. The Android engine appears as one loopback device named "Android playback (system audio)".

---

## Part B: Android (Kotlin + manifest)

**New file: `android/app/src/main/java/com/ledgrab/android/AudioCapture.kt`**

Mirrors `ScreenCapture.kt`, taking the same `MediaProjection`:

```kotlin
class AudioCapture(
    private val projection: MediaProjection,
    private val bridge: PythonBridge,
    private val sampleRate: Int = 48000,
    private val channels: Int = 2,
    private val chunkFrames: Int = 1024,
)
```

- `start()` (API 29+, MediaProjection mode):
  - Build an `AudioPlaybackCaptureConfiguration` with `AudioPlaybackCaptureConfiguration.Builder(projection)`, calling `addMatchingUsage(...)` for `USAGE_MEDIA`, `USAGE_GAME`, and `USAGE_UNKNOWN` (the capturable set).
  - `AudioRecord.Builder().setAudioPlaybackCaptureConfig(cfg)` with `AudioFormat(ENCODING_PCM_FLOAT, sampleRate, CHANNEL_IN_STEREO)`.
  - On a dedicated `HandlerThread`, loop `audioRecord.read(floatBuf, …, READ_BLOCKING)` → wrap into a little-endian float32 `ByteArray` (reusable buffer, like `ScreenCapture`'s `frameBuffer`) → `bridge.pushAudio(bytes, framesRead, channels)`.
- `stop()`: stop/release `AudioRecord`, quit the thread.
- **Mic fallback** (`startMic()`): `AudioSource.MIC` for root mode (no MediaProjection) or API < 29. Used only when playback capture is unavailable.

**Modify `android/app/src/main/java/com/ledgrab/android/PythonBridge.kt`** to add the audio push path (same shape as `pushFrame`, with a cached PyObject handle):

```kotlin
@Volatile private var androidAudioEngine: PyObject? = null

fun configureAudio(sampleRate: Int, channels: Int, chunkFrames: Int) {
    val engine = Python.getInstance().getModule("ledgrab.core.audio.android_audio_engine")
    engine.callAttr("configure", sampleRate, channels, chunkFrames)
    androidAudioEngine = engine
}

fun pushAudio(pcmFloat32: ByteArray, frames: Int, channels: Int) {
    if (!running) return
    androidAudioEngine?.let {
        try {
            it.callAttr("push_samples", pcmFloat32)
        } catch (e: Exception) {
            Log.w(TAG, "pushAudio failed: ${e.message}")
        }
    }
}
```

**Modify `android/app/src/main/java/com/ledgrab/android/CaptureService.kt`:** in the MediaProjection start path (where `ScreenCapture` is created with the projection), if `RECORD_AUDIO` is granted and API ≥ 29, also call `bridge.configureAudio(...)` and start an `AudioCapture(projection, bridge)`. Stop/release it in `onDestroy` alongside `ScreenCapture`. Root path → optional mic fallback (or skip; see Risks).

**Modify `android/app/src/main/AndroidManifest.xml`:**

```xml
<uses-permission android:name="android.permission.RECORD_AUDIO" />
```

The existing `CaptureService` already declares `foregroundServiceType="mediaProjection|specialUse"` and holds `FOREGROUND_SERVICE_MEDIA_PROJECTION`; add `microphone` to the type only if mic fallback is implemented.

**Modify `MainActivity.kt`** to request `RECORD_AUDIO` at runtime alongside the existing `ensureNotificationPermission()` (POST_NOTIFICATIONS) flow, before starting capture. Capture proceeds without audio if the permission is denied (graceful degradation).

---

## Orchestration decision (the main trade-off)

Desktop starts audio capture **on demand** when an audio-reactive source is acquired (`AudioCaptureManager.acquire`). On Android, PCM only flows if Kotlin has set up `AudioRecord`.

- **MVP (recommended):** start `AudioCapture` when `CaptureService` starts (if `RECORD_AUDIO` granted + MediaProjection mode + API ≥ 29) and push continuously; the bounded queue drops the oldest chunks when no audio source consumes them. Simplest; modest extra CPU.
- **Future optimization:** on-demand start/stop signaled Python→Kotlin (Chaquopy can call Kotlin, as `BleBridge`/`UsbSerialBridge` show) so `AudioRecord` runs only while an audio-reactive source is active. Defer unless CPU/battery on low-end boxes warrants it.

---

## What does NOT change

- **Frontend / API:** audio engine + device selection, the music analyzer UI, and audio value sources are engine-agnostic; the Android engine shows up via the existing device enumeration.
- **`build.gradle.kts` / Chaquopy pip block:** no new Python packages.
- **Audio analysis pipeline:** `AudioAnalyzer`, band filters, `ManagedAudioStream` untouched.

---

## Files

**Create**

- `server/src/ledgrab/core/audio/android_audio_engine.py`
- `android/app/src/main/java/com/ledgrab/android/AudioCapture.kt`
- `server/tests/core/audio/test_android_audio_engine.py`

**Modify**

- `server/src/ledgrab/core/audio/__init__.py`: guarded import + registry registration.
- `android/app/src/main/java/com/ledgrab/android/PythonBridge.kt`: `configureAudio` + `pushAudio`.
- `android/app/src/main/java/com/ledgrab/android/CaptureService.kt`: start/stop `AudioCapture`.
- `android/app/src/main/java/com/ledgrab/android/MainActivity.kt`: request `RECORD_AUDIO`.
- `android/app/src/main/AndroidManifest.xml`: `RECORD_AUDIO` (+ mic FGS if mic fallback).

---

## Tests (Python; run on desktop CI, no Android device needed)

New `server/tests/core/audio/test_android_audio_engine.py`:

- `configure()` then `push_samples()` → `read_chunk()` returns the same float32 samples; queue drops oldest when full (push > maxsize).
- `AndroidAudioEngine.is_available()` is `False` until `configure()` and only on Android (monkeypatch `ledgrab.utils.platform.is_android`); `True` after.
- `enumerate_devices()` returns exactly one loopback device when active, `[]` otherwise.
- Integration: with `is_android()` patched true + `configure()`, `get_best_available_engine()` returns `"android_playback"` (priority beats demo), and a stream created via `AudioEngineRegistry.create_stream("android_playback", 0, True, {})` yields pushed chunks.
- Registry isolation: use `AudioEngineRegistry.clear_registry()` / re-register in fixtures so desktop engines aren't disturbed.

## Verification

1. **Python:** `py -3.13 -m pytest tests/core/audio/test_android_audio_engine.py --no-cov -q` (from `server/`), then the full suite.
2. **Lint:** `ruff check src/ tests/ --fix` (from `server/`).
3. **Android build:** `./gradlew :app:assembleDebug` (from `android/`).
4. **On device/emulator (manual):** install APK → grant `RECORD_AUDIO` + screen-capture consent → start capture → play non-DRM media (e.g. a local video / YouTube web) → create an audio-reactive value source bound to a strip → confirm the LEDs react to the audio, and the Android playback device appears in audio device enumeration.

## Risks / notes

- **DRM opt-out:** Netflix/Disney+/etc. set audio as non-capturable; `AudioPlaybackCapture` yields silence for them. Works for non-DRM media and the device's own audio. Document in UI.
- **API 29 minimum** for playback capture (minSdk is 24). API 24–28 and root mode (no MediaProjection) → mic fallback only, or audio unsupported. Gate cleanly + log.
- **`RECORD_AUDIO`** is a runtime "dangerous" permission: it must be requested, and capture must degrade gracefully when denied.
- **Format:** request `ENCODING_PCM_FLOAT` so Kotlin pushes float32 matching `read_chunk()`'s contract (1-D interleaved float32, length = frames × channels). If a device rejects float, capture 16-bit PCM and convert (`/32768.0`) before pushing.
- **Latency/CPU:** small `chunkFrames` (e.g. 1024 frames @ 48 kHz ≈ 21 ms) keeps reactivity tight; continuous capture (MVP) adds modest CPU on low-end boxes (see the orchestration trade-off).
- **R8/ProGuard:** minify is disabled and the Python module is resolved by string from Kotlin; no new keep-rules needed.
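
The arithmetic behind the **Format** and **Latency/CPU** notes above can be sketched as follows. This is a minimal illustration only: under this plan the 16-bit conversion would live on the Kotlin side in `AudioCapture.kt`, and Python is used here just to show the numbers. The helper names `int16_to_float32_bytes` and `chunk_latency_ms` are hypothetical and not part of the codebase.

```python
import struct

INT16_SCALE = 32768.0  # divisor from the Format note: int16 -> float in [-1.0, 1.0)


def int16_to_float32_bytes(pcm_int16: bytes) -> bytes:
    """Convert little-endian interleaved int16 PCM to little-endian float32 PCM.

    Hypothetical helper: mirrors the fallback conversion that would run before
    pushing, so the payload matches the float32 contract of push_samples().
    """
    count = len(pcm_int16) // 2  # two bytes per int16 sample
    ints = struct.unpack(f"<{count}h", pcm_int16[: count * 2])
    return struct.pack(f"<{count}f", *(s / INT16_SCALE for s in ints))


def chunk_latency_ms(chunk_frames: int, sample_rate: int) -> float:
    """Duration of one chunk in milliseconds (frames / rate)."""
    return 1000.0 * chunk_frames / sample_rate


if __name__ == "__main__":
    # One stereo frame pair: silence, half scale, negative full scale, max positive.
    raw = struct.pack("<4h", 0, 16384, -32768, 32767)
    print(struct.unpack("<4f", int16_to_float32_bytes(raw)))
    print(f"{chunk_latency_ms(1024, 48000):.1f} ms per chunk")
```

Dividing by 32768 rather than 32767 maps the most negative int16 value to exactly -1.0 and keeps every output inside [-1.0, 1.0), at the cost of the maximum positive sample landing just under 1.0.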