- android-audio-capture-plan.md — design behind the merged on-device audio
capture feature (487259a).
- android-missing-functionality.md — Android missing-feature review notes.
Plan: Android on-device audio capture
Status: proposed plan (not yet approved). No code changes. Last updated 2026-06-01.
Context
LedGrab's audio-reactive features (music analyzer, audio value sources, band filters)
depend on capturing an audio stream and running it through AudioAnalyzer
(server/src/ledgrab/core/audio/analysis.py). On desktop this is fed by WASAPI
(Windows) or Sounddevice/PortAudio (cross-platform). On the experimental
Android-TV build neither is available — sounddevice has no Chaquopy wheel and PortAudio
isn't bundled — so core/audio/__init__.py registers only DemoAudioEngine, and
audio-reactive lighting is effectively dead on Android.
Android does not need PortAudio: the platform exposes AudioPlaybackCapture (API 29+),
which captures system playback audio and takes a MediaProjection token — the very token
the app already obtains for screen capture (ScreenCapture(projection, …)). This plan adds
a push-based Android audio engine so the TV box can drive sound-reactive lighting from its own
media playback, at parity with how desktop audio feeds the analyzer.
The design mirrors the working screen-capture bridge
(mediaprojection_engine.py ↔ ScreenCapture.kt ↔ PythonBridge) and the existing audio
engine abstraction (AudioCaptureEngine / AudioCaptureStreamBase /
AudioEngineRegistry). No new Python dependencies (numpy is already bundled) → no
Chaquopy / build.gradle.kts pip {} changes.
Approach
A new push-based audio engine registered in the existing AudioEngineRegistry:
- Python: AndroidAudioEngine + AndroidAudioCaptureStream mirroring SounddeviceEngine, but read_chunk() pops PCM from a module-level queue that Kotlin fills (mirror of mediaprojection_engine.push_frame). High ENGINE_PRIORITY so AudioEngineRegistry.get_best_available_engine() selects it on Android. The existing ManagedAudioStream capture loop and AudioAnalyzer consume read_chunk() unchanged.
- Android: an AudioCapture helper using AudioRecord + AudioPlaybackCaptureConfiguration (reusing CaptureService's MediaProjection), pushing float32 PCM to Python. Mic (AudioSource.MIC) fallback. Wired into CaptureService next to ScreenCapture.

    [media playback] → AudioRecord (AudioPlaybackCapture, reuses MediaProjection)
        → AudioCapture.kt → PythonBridge.pushAudio(pcmFloat32, frames, channels)
        → android_audio_engine.push_samples()   [module-level queue]
        → AndroidAudioCaptureStream.read_chunk() → ManagedAudioStream → AudioAnalyzer   [unchanged]

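The hand-off in the middle of this flow can be tried in isolation. The following is a minimal stdlib-only sketch, assuming plain Python lists stand in for the numpy chunks and a hypothetical fake_capture() function stands in for the Kotlin capture thread. It shows the two behaviours the design relies on: chunks arrive in order while the consumer keeps up, and the oldest chunks are dropped once the bounded queue overflows.

```python
import queue
import threading

MAX_CHUNKS = 8  # same bound as the plan's module-level queue
pcm_queue: "queue.Queue[list[float]]" = queue.Queue(maxsize=MAX_CHUNKS)


def push_samples(chunk: list[float]) -> None:
    """Producer side: never blocks; evicts the oldest chunk when full."""
    try:
        pcm_queue.put_nowait(chunk)
    except queue.Full:
        try:
            pcm_queue.get_nowait()  # drop the oldest chunk
        except queue.Empty:
            pass
        try:
            pcm_queue.put_nowait(chunk)
        except queue.Full:
            pass


def read_chunk(timeout: float = 0.1) -> "list[float] | None":
    """Consumer side: wait briefly for a chunk, return None when none arrives."""
    try:
        return pcm_queue.get(timeout=timeout)
    except queue.Empty:
        return None


def drain() -> list[float]:
    """Read until the queue runs dry; return the first sample of each chunk."""
    firsts = []
    while (chunk := read_chunk()) is not None:
        firsts.append(chunk[0])
    return firsts


def fake_capture(n_chunks: int) -> None:
    """Hypothetical producer: chunk i is filled with the value float(i)."""
    for i in range(n_chunks):
        push_samples([float(i)] * 4)


# Case 1: the producer stays within the bound, so nothing is lost.
producer = threading.Thread(target=fake_capture, args=(5,))
producer.start()
producer.join()
in_order = drain()

# Case 2: 12 chunks arrive with no consumer; only the newest 8 survive.
fake_capture(12)
after_overflow = drain()

print(in_order)        # [0.0, 1.0, 2.0, 3.0, 4.0]
print(after_overflow)  # [4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0]
```

Dropping the oldest chunk rather than blocking keeps the capture side free of back-pressure, which is what lets the MVP push continuously even when nothing consumes the audio.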
Part A — Python (server)
New file: server/src/ledgrab/core/audio/android_audio_engine.py — mirror
mediaprojection_engine.py (queue + configure + push) and sounddevice_engine.py (engine/stream shape):

    import queue
    import numpy as np
    from typing import Any, Dict, List

    from ledgrab.core.audio.base import AudioCaptureEngine, AudioCaptureStreamBase, AudioDeviceInfo
    from ledgrab.utils import get_logger

    logger = get_logger(__name__)

    # Bounded hand-off queue: Kotlin pushes, read_chunk() pops.
    _pcm_queue: "queue.Queue[np.ndarray]" = queue.Queue(maxsize=8)
    _sample_rate = 48000
    _channels = 2
    _chunk_size = 1024
    _active = False

    def configure(sample_rate: int, channels: int, chunk_size: int) -> None:
        """Called from Kotlin before audio frames start flowing. Drains stale PCM."""
        global _sample_rate, _channels, _chunk_size, _active
        while not _pcm_queue.empty():
            try:
                _pcm_queue.get_nowait()
            except queue.Empty:
                break
        _sample_rate, _channels, _chunk_size = sample_rate, channels, chunk_size
        _active = True

    def push_samples(pcm_float32: bytes) -> None:
        """Push one interleaved float32 PCM chunk from Kotlin. Drops oldest if full."""
        # Copy so a queued chunk never aliases a buffer that Kotlin may reuse.
        samples = np.frombuffer(pcm_float32, dtype=np.float32).copy()
        try:
            _pcm_queue.put_nowait(samples)
        except queue.Full:
            try:
                _pcm_queue.get_nowait()
            except queue.Empty:
                pass
            try:
                _pcm_queue.put_nowait(samples)
            except queue.Full:
                pass

    def shutdown() -> None:
        global _active
        _active = False

    class AndroidAudioCaptureStream(AudioCaptureStreamBase):
        @property
        def channels(self) -> int:
            return _channels

        @property
        def sample_rate(self) -> int:
            return _sample_rate

        @property
        def chunk_size(self) -> int:
            return _chunk_size

        def initialize(self) -> None:
            if not _active:
                raise RuntimeError("Android audio engine not configured (only valid in-app).")
            self._initialized = True

        def cleanup(self) -> None:
            self._initialized = False

        def read_chunk(self) -> np.ndarray | None:
            try:
                return _pcm_queue.get(timeout=0.1)  # 1-D float32 interleaved
            except queue.Empty:
                return None

    class AndroidAudioEngine(AudioCaptureEngine):
        ENGINE_TYPE = "android_playback"
        ENGINE_PRIORITY = 100  # highest on Android (demo is lower)

        @classmethod
        def is_available(cls) -> bool:
            # Imported at call time so tests can patch is_android().
            from ledgrab.utils.platform import is_android
            return is_android() and _active

        @classmethod
        def get_default_config(cls) -> Dict[str, Any]:
            return {"sample_rate": _sample_rate, "channels": _channels, "chunk_size": _chunk_size}

        @classmethod
        def enumerate_devices(cls) -> List[AudioDeviceInfo]:
            if not cls.is_available():
                return []
            return [AudioDeviceInfo(index=0, name="Android playback (system audio)",
                                    is_input=True, is_loopback=True,
                                    channels=_channels, default_samplerate=float(_sample_rate))]

        @classmethod
        def create_stream(cls, device_index, is_loopback, config) -> AndroidAudioCaptureStream:
            return AndroidAudioCaptureStream(device_index, is_loopback, {**cls.get_default_config(), **config})

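Because read_chunk() hands back a flat, interleaved buffer, a consumer has to know the channel count to split it. The sketch below illustrates that layout with the stdlib array module standing in for a numpy float32 array; the deinterleave() helper is hypothetical and not part of the plan.

```python
from array import array


def deinterleave(samples: array, channels: int) -> list:
    """Split an interleaved 1-D buffer into one array per channel."""
    if channels < 1 or len(samples) % channels != 0:
        raise ValueError("buffer length must be a multiple of the channel count")
    # Channel c owns every `channels`-th sample, starting at offset c.
    return [samples[c::channels] for c in range(channels)]


# Three stereo frames laid out as L0 R0 L1 R1 L2 R2.
chunk = array("f", [0.5, -0.5, 0.25, -0.25, 0.125, -0.125])
left, right = deinterleave(chunk, 2)

print(left.tolist())   # [0.5, 0.25, 0.125]
print(right.tolist())  # [-0.5, -0.25, -0.125]
```

With numpy the same split is usually written as samples.reshape(-1, channels), which yields one row per frame and one column per channel.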
Modify server/src/ledgrab/core/audio/__init__.py — register behind a guarded import,
matching the existing _has_wasapi / _has_sounddevice pattern:

    try:
        from ledgrab.core.audio.android_audio_engine import AndroidAudioEngine
        _has_android_audio = True
    except ImportError:
        _has_android_audio = False
    ...
    if _has_android_audio:
        AudioEngineRegistry.register(AndroidAudioEngine)

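Registering at import time is safe because selection consults is_available() when an engine is requested, not when it is registered. The following sketch illustrates that with a hypothetical, self-contained Registry; it is not the real AudioEngineRegistry, and the demo priority of 1 is an assumed value (the plan only states that demo is lower than 100).

```python
from typing import Dict, Optional, Type


class Engine:
    ENGINE_TYPE = "base"
    ENGINE_PRIORITY = 0

    @classmethod
    def is_available(cls) -> bool:
        return False


class DemoEngine(Engine):
    ENGINE_TYPE = "demo"
    ENGINE_PRIORITY = 1  # assumed value; the plan only says demo is lower

    @classmethod
    def is_available(cls) -> bool:
        return True


class AndroidEngine(Engine):
    ENGINE_TYPE = "android_playback"
    ENGINE_PRIORITY = 100
    configured = False  # stand-in for the module-level _active flag

    @classmethod
    def is_available(cls) -> bool:
        return cls.configured


class Registry:
    """Hypothetical stand-in for a priority-based engine registry."""

    _engines: Dict[str, Type[Engine]] = {}

    @classmethod
    def register(cls, engine: Type[Engine]) -> None:
        cls._engines[engine.ENGINE_TYPE] = engine

    @classmethod
    def get_best_available_engine(cls) -> Optional[str]:
        # Availability is checked now, at selection time.
        available = [e for e in cls._engines.values() if e.is_available()]
        if not available:
            return None
        return max(available, key=lambda e: e.ENGINE_PRIORITY).ENGINE_TYPE


Registry.register(DemoEngine)
Registry.register(AndroidEngine)

before = Registry.get_best_available_engine()  # Android not configured yet
AndroidEngine.configured = True                # what configure() would flip
after = Registry.get_best_available_engine()

print(before, after)  # demo android_playback
```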
Reused, unchanged: AudioEngineRegistry.get_best_available_engine() (picks by priority),
ManagedAudioStream._capture_loop() (audio_capture.py), AudioAnalyzer, the audio value
sources, and the device-enumeration endpoints. The Android engine appears as one loopback
device named "Android playback (system audio)".
Part B — Android (Kotlin + manifest)
New file: android/app/src/main/java/com/ledgrab/android/AudioCapture.kt
Mirrors ScreenCapture.kt, taking the same MediaProjection:

    class AudioCapture(
        private val projection: MediaProjection,
        private val bridge: PythonBridge,
        private val sampleRate: Int = 48000,
        private val channels: Int = 2,
        private val chunkFrames: Int = 1024,
    )

- start() (API 29+, MediaProjection mode):
  - Build an AudioPlaybackCaptureConfiguration through AudioPlaybackCaptureConfiguration.Builder(projection), adding the matching usages USAGE_MEDIA, USAGE_GAME, USAGE_UNKNOWN (the capturable set).
  - AudioRecord.Builder().setAudioPlaybackCaptureConfig(cfg) with AudioFormat(ENCODING_PCM_FLOAT, sampleRate, CHANNEL_IN_STEREO).
  - On a dedicated HandlerThread, loop audioRecord.read(floatBuf, …, READ_BLOCKING) → wrap into a little-endian float32 ByteArray (reusable buffer, like ScreenCapture's frameBuffer) → bridge.pushAudio(bytes, framesRead, channels).
- stop(): stop/release AudioRecord, quit the thread.
- Mic fallback (startMic()): AudioSource.MIC for root mode (no MediaProjection) or API < 29. Used only when playback capture is unavailable.
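The byte layout Kotlin has to produce can be checked from the Python side. This is a small sketch using the stdlib struct module, with hypothetical helper names, showing the payload size for one chunk and why the byte order has to be set explicitly on the Kotlin side.

```python
import struct


def pack_float32_le(samples: list) -> bytes:
    """Pack floats as consecutive little-endian IEEE 754 float32 values."""
    return struct.pack("<%df" % len(samples), *samples)


def unpack_float32_le(payload: bytes) -> list:
    """Inverse of pack_float32_le; the length must be a multiple of 4 bytes."""
    if len(payload) % 4 != 0:
        raise ValueError("payload is not a whole number of float32 samples")
    return list(struct.unpack("<%df" % (len(payload) // 4), payload))


frames, channels = 1024, 2
payload_size = len(pack_float32_le([0.0] * (frames * channels)))

one_le = pack_float32_le([1.0])
round_trip = unpack_float32_le(pack_float32_le([0.5, -0.25, 1.0]))
misread = struct.unpack(">f", one_le)[0]  # same bytes read as big-endian

print(payload_size)    # 8192 bytes = 1024 frames x 2 channels x 4 bytes
print(one_le.hex())    # 0000803f
print(round_trip)      # [0.5, -0.25, 1.0]
print(misread == 1.0)  # False
```

np.frombuffer(..., dtype=np.float32) reads in the host's native byte order, so the little-endian payload matches on little-endian hosts, which is the assumption the plan makes for Android devices.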
Modify android/app/src/main/java/com/ledgrab/android/PythonBridge.kt — add the audio
push path (same shape as pushFrame, with a cached PyObject handle):

    @Volatile private var androidAudioEngine: PyObject? = null

    fun configureAudio(sampleRate: Int, channels: Int, chunkFrames: Int) {
        val engine = Python.getInstance().getModule("ledgrab.core.audio.android_audio_engine")
        engine.callAttr("configure", sampleRate, channels, chunkFrames)
        androidAudioEngine = engine
    }

    fun pushAudio(pcmFloat32: ByteArray, frames: Int, channels: Int) {
        if (!running) return
        androidAudioEngine?.let {
            try { it.callAttr("push_samples", pcmFloat32) }
            catch (e: Exception) { Log.w(TAG, "pushAudio failed: ${e.message}") }
        }
    }

Modify android/app/src/main/java/com/ledgrab/android/CaptureService.kt — in the
MediaProjection start path (where ScreenCapture is created with the projection), if
RECORD_AUDIO is granted and API ≥ 29, also call bridge.configureAudio(...) and start an
AudioCapture(projection, bridge). Stop/release it in onDestroy alongside ScreenCapture.
Root path → optional mic fallback (or skip; see Risks).
Modify android/app/src/main/AndroidManifest.xml:

    <uses-permission android:name="android.permission.RECORD_AUDIO" />
    <!-- For mic-mode foreground capture on API 34+ (playback capture is covered by the
         existing mediaProjection FGS type): -->
    <uses-permission android:name="android.permission.FOREGROUND_SERVICE_MICROPHONE" />

The existing CaptureService already declares foregroundServiceType="mediaProjection|specialUse"
and holds FOREGROUND_SERVICE_MEDIA_PROJECTION; add microphone to the type only if mic
fallback is implemented.
Modify MainActivity.kt — request RECORD_AUDIO at runtime alongside the existing
ensureNotificationPermission() (POST_NOTIFICATIONS) flow, before starting capture. Capture
proceeds without audio if denied (graceful degradation).
Orchestration decision (the main trade-off)
Desktop starts audio capture on demand when an audio-reactive source is acquired
(AudioCaptureManager.acquire). On Android, PCM only flows if Kotlin has set up AudioRecord.
- MVP (recommended): start AudioCapture when CaptureService starts (if RECORD_AUDIO granted + MediaProjection mode + API ≥ 29) and push continuously; the bounded queue drops frames when no audio source consumes them. Simplest; modest extra CPU.
- Future optimization: on-demand start/stop signaled Python→Kotlin (Chaquopy can call Kotlin, as BleBridge/UsbSerialBridge show) so AudioRecord runs only while an audio-reactive source is active. Defer unless CPU/battery on low-end boxes warrants it.
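If the on-demand variant is ever pursued, one plausible shape is a reference count on the Python side that signals Kotlin only on the first acquire and the last release. The sketch below is purely illustrative: OnDemandCapture and its start/stop callbacks are hypothetical names, and the callbacks stand in for whatever Python→Kotlin call the bridge would expose.

```python
import threading
from typing import Callable, List


class OnDemandCapture:
    """Hypothetical ref-counted gate around a start/stop pair."""

    def __init__(self, start: Callable[[], None], stop: Callable[[], None]) -> None:
        self._start = start
        self._stop = stop
        self._users = 0
        self._lock = threading.Lock()

    def acquire(self) -> None:
        with self._lock:
            self._users += 1
            if self._users == 1:  # first audio-reactive source
                self._start()

    def release(self) -> None:
        with self._lock:
            if self._users == 0:
                raise RuntimeError("release() without matching acquire()")
            self._users -= 1
            if self._users == 0:  # last source went away
                self._stop()


events: List[str] = []
capture = OnDemandCapture(
    start=lambda: events.append("start"),  # would ask Kotlin to start AudioRecord
    stop=lambda: events.append("stop"),    # would ask Kotlin to stop it
)

capture.acquire()  # source A
capture.acquire()  # source B, capture already running
capture.release()  # A gone, B still active
capture.release()  # B gone, capture stops

print(events)  # ['start', 'stop']
```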
What does NOT change
- Frontend / API: audio engine + device selection, the music analyzer UI, and audio value sources are engine-agnostic; the Android engine shows up via the existing device enumeration.
- build.gradle.kts / Chaquopy pip block: no new Python packages.
- Audio analysis pipeline: AudioAnalyzer, band filters, ManagedAudioStream untouched.
Files
Create
- server/src/ledgrab/core/audio/android_audio_engine.py
- android/app/src/main/java/com/ledgrab/android/AudioCapture.kt
- server/tests/core/audio/test_android_audio_engine.py
Modify
- server/src/ledgrab/core/audio/__init__.py: guarded import + registry registration.
- android/app/src/main/java/com/ledgrab/android/PythonBridge.kt: configureAudio + pushAudio.
- android/app/src/main/java/com/ledgrab/android/CaptureService.kt: start/stop AudioCapture.
- android/app/src/main/java/com/ledgrab/android/MainActivity.kt: request RECORD_AUDIO.
- android/app/src/main/AndroidManifest.xml: RECORD_AUDIO (+ mic FGS if mic fallback).
Tests (Python — run on desktop CI, no Android device needed)
New server/tests/core/audio/test_android_audio_engine.py:
- configure() then push_samples() → read_chunk() returns the same float32 samples; queue drops oldest when full (push > maxsize).
- AndroidAudioEngine.is_available() is False until configure() and only on Android (monkeypatch ledgrab.utils.platform.is_android); True after.
- enumerate_devices() returns exactly one loopback device when active, [] otherwise.
- Integration: with is_android() patched true + configure(), get_best_available_engine() returns "android_playback" (priority beats demo), and a stream created via AudioEngineRegistry.create_stream("android_playback", 0, True, {}) yields pushed chunks.
- Registry isolation: use AudioEngineRegistry.clear_registry() / re-register in fixtures so desktop engines aren't disturbed.
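The availability test depends on is_android() being looked up at call time, so that a patch applied by the test is actually seen. The following self-contained sketch shows that pattern with unittest.mock; the platform namespace here is a stand-in for ledgrab.utils.platform, and the real test would use pytest's monkeypatch as listed above.

```python
import types
from unittest import mock

# Stand-in for ledgrab.utils.platform on a desktop CI host.
platform = types.SimpleNamespace(is_android=lambda: False)
_active = False


def configure() -> None:
    global _active
    _active = True


def is_available() -> bool:
    # Attribute lookup happens on every call, so patches take effect.
    return platform.is_android() and _active


results = []
results.append(is_available())      # desktop, not configured
with mock.patch.object(platform, "is_android", return_value=True):
    results.append(is_available())  # "Android", but not configured yet
    configure()
    results.append(is_available())  # "Android" and configured
results.append(is_available())      # patch reverted: desktop again

print(results)  # [False, False, True, False]
```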
Verification
- Python: py -3.13 -m pytest tests/core/audio/test_android_audio_engine.py --no-cov -q (from server/), then the full suite.
- Lint: ruff check src/ tests/ --fix (from server/).
- Android build: ./gradlew :app:assembleDebug (from android/).
- On device/emulator (manual): install APK → grant RECORD_AUDIO + screen-capture consent → start capture → play non-DRM media (e.g. a local video / YouTube web) → create an audio-reactive value source bound to a strip → confirm the LEDs react to the audio, and the Android playback device appears in audio device enumeration.
Risks / notes
- DRM opt-out: Netflix/Disney+/etc. set audio as non-capturable; AudioPlaybackCapture yields silence for them. Works for non-DRM media and the device's own audio. Document in UI.
- API 29 minimum for playback capture (minSdk is 24). API 24–28 and root mode (no MediaProjection) → mic fallback only, or audio unsupported. Gate cleanly + log.
- RECORD_AUDIO is a runtime "dangerous" permission: it must be requested, and capture must degrade gracefully when denied.
- Format: request ENCODING_PCM_FLOAT so Kotlin pushes float32 matching read_chunk()'s contract (1-D interleaved float32, length = frames × channels). If a device rejects float, capture 16-bit PCM and convert (/32768.0) before pushing.
- Latency/CPU: small chunkFrames (e.g. 1024 @ 48 kHz ≈ 21 ms) keeps reactivity tight; continuous capture (MVP) adds modest CPU on low-end boxes (see the orchestration trade-off).
- R8/ProGuard: minify is disabled and the Python module is resolved by string from Kotlin; no new keep-rules needed.
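Two of the notes above reduce to simple arithmetic: the 16-bit fallback conversion and the per-chunk latency. The sketch below works through both with hypothetical helper names; in the plan the conversion would happen in Kotlin before pushing, and it is written in Python here only to show the numbers.

```python
import struct


def pcm16_to_float(pcm16: bytes) -> list:
    """Convert little-endian signed 16-bit PCM to floats in [-1.0, 1.0)."""
    if len(pcm16) % 2 != 0:
        raise ValueError("payload is not a whole number of 16-bit samples")
    ints = struct.unpack("<%dh" % (len(pcm16) // 2), pcm16)
    return [s / 32768.0 for s in ints]


def chunk_latency_ms(chunk_frames: int, sample_rate: int) -> float:
    """Time covered by one chunk, in milliseconds."""
    return 1000.0 * chunk_frames / sample_rate


converted = pcm16_to_float(struct.pack("<4h", -32768, 0, 16384, 32767))
latency = chunk_latency_ms(1024, 48000)

print(converted[:3])      # [-1.0, 0.0, 0.5]
print(converted[3] < 1)   # True: full-scale positive stays just below 1.0
print(round(latency, 2))  # 21.33
```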