ledgrab/plans/processed-audio-sources/phase-2-audio-filters.md
alexei.dolgolyov eb94066386 feat(processed-audio-sources): phase 2 - implement 11 audio filters
Add all audio filters that transform AudioAnalysis data:
- Channel Extract, Band Extract (migration from old source types)
- Peak Hold, Gain, Noise Gate, Envelope Follower
- Spectral Smoothing, Compressor, Inverter, Beat Gate, Delay
All registered via AudioFilterRegistry with option schemas.
2026-03-31 18:43:36 +03:00

Phase 2: Audio Filters

Status: 🔨 In Progress
Parent plan: PLAN.md
Domain: backend

Objective

Implement all 11 audio filters and register them with the AudioFilterRegistry.

Tasks

  • Task 1: Channel Extract filter (core/audio/filters/channel_extract.py)
    • Options: channel (select: mono | left | right)
    • Stateful: No
    • Behavior: Replaces main rms/spectrum with selected channel data. If "mono", averages L+R. If "left"/"right", copies that channel's data to the main fields.
  • Task 2: Band Extract filter (core/audio/filters/band_extract.py)
    • Options: band (select: bass | mid | treble | custom), freq_low (float, 20-20000), freq_high (float, 20-20000)
    • Stateful: No
    • Behavior: Computes a band mask for the 64 log-spaced bins, applies it to spectrum, recomputes RMS from in-band data. Reuse logic from existing core/audio/band_filter.py.
    • Presets: bass=20-250Hz, mid=250-4000Hz, treble=4000-20000Hz
  • Task 3: Peak Hold filter (core/audio/filters/peak_hold.py)
    • Options: decay_rate (float, 0.1-50.0, dB/s), per_bin (bool, default true)
    • Stateful: Yes
    • Behavior: For each spectrum bin (if per_bin) or for rms/peak, retains the maximum value seen and decays it over time. Outputs the max of current value and held peak.
  • Task 4: Gain filter (core/audio/filters/gain.py)
    • Options: factor (float, 0.1-10.0, default 1.0)
    • Stateful: No
    • Behavior: Multiplies rms, peak, spectrum, and per-channel values by factor. Clamps to [0, 1] for rms/peak.
  • Task 5: Noise Gate filter (core/audio/filters/noise_gate.py)
    • Options: threshold (float, 0.0-1.0), hysteresis (float, 0.0-0.2, default 0.05)
    • Stateful: Yes (hysteresis requires remembering whether the gate is currently open or closed)
    • Behavior: While the gate is closed, zeros out all levels and spectrum. The gate opens when rms rises above threshold and closes when rms drops below (threshold - hysteresis). Between those two levels it keeps its previous state.
  • Task 6: Envelope Follower filter (core/audio/filters/envelope_follower.py)
    • Options: attack_ms (float, 1-500, default 10), release_ms (float, 10-2000, default 200)
    • Stateful: Yes
    • Behavior: Smooths rms and peak with asymmetric time constants: the attack rate applies while the signal rises, the release rate while it falls. Can optionally be applied per bin to the spectrum.
    • Fast attack + slow release = punchy transients that fade smoothly.
  • Task 7: Spectral Smoothing filter (core/audio/filters/spectral_smoothing.py)
    • Options: factor (float, 0.0-0.99, default 0.5)
    • Stateful: Yes (maintains previous spectrum state)
    • Behavior: Applies exponential moving average per-bin: smoothed[i] = factor * prev[i] + (1-factor) * current[i]. Higher factor = smoother/slower.
  • Task 8: Compressor filter (core/audio/filters/compressor.py)
    • Options: threshold (float, 0.0-1.0, default 0.5), ratio (float, 1.0-20.0, default 4.0), makeup_gain (float, 0.0-2.0, default 1.0)
    • Stateful: Yes (envelope tracking for gain reduction)
    • Behavior: When signal exceeds threshold, reduces by ratio. output = threshold + (input - threshold) / ratio. Apply makeup_gain after. Applied to rms, peak, and spectrum.
  • Task 9: Inverter filter (core/audio/filters/inverter.py)
    • Options: none (or invert_spectrum bool, default true)
    • Stateful: No
    • Behavior: rms = 1.0 - rms, peak = 1.0 - peak, spectrum bins inverted if option set. Beat fields unchanged.
  • Task 10: Beat Gate filter (core/audio/filters/beat_gate.py)
    • Options: hold_ms (float, 10-500, default 50) — how long to hold signal after beat
    • Stateful: Yes (tracks last beat timestamp)
    • Behavior: When beat detected, passes signal through for hold_ms milliseconds. Between beats, zeros out rms/peak/spectrum. Beat fields themselves always pass through.
  • Task 11: Delay filter (core/audio/filters/delay.py)
    • Options: delay_ms (float, 10-2000, default 100)
    • Stateful: Yes (ring buffer of AudioAnalysis snapshots)
    • Behavior: Buffers incoming AudioAnalysis snapshots and outputs the one from delay_ms ago. Ring buffer sized based on ~30Hz update rate.
  • Task 12: Register all 11 filters in core/audio/filters/__init__.py
  • Task 13: Update Noise Gate to be stateful (hysteresis requires gate state tracking)
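
The hysteresis behavior described for the Noise Gate (Tasks 5 and 13) can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: `Levels` is a hypothetical stand-in for AudioAnalysis (whose real fields are not listed in this plan), and `NoiseGateSketch` is an illustrative name.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Levels:
    """Hypothetical stand-in for AudioAnalysis (only the fields used here)."""
    rms: float
    peak: float
    spectrum: tuple


class NoiseGateSketch:
    """Opens above threshold, closes below (threshold - hysteresis)."""

    def __init__(self, threshold: float, hysteresis: float = 0.05):
        self.threshold = threshold
        self.hysteresis = hysteresis
        self._open = False  # gate state carried from frame to frame

    def reset(self) -> None:
        self._open = False

    def process(self, a: Levels) -> Levels:
        if self._open:
            # Stay open until the level falls below the lower threshold.
            if a.rms < self.threshold - self.hysteresis:
                self._open = False
        elif a.rms > self.threshold:
            self._open = True
        if self._open:
            return a
        # Closed: return a new object with levels and spectrum zeroed.
        zeros = tuple(0.0 for _ in a.spectrum)
        return replace(a, rms=0.0, peak=0.0, spectrum=zeros)


gate = NoiseGateSketch(threshold=0.3, hysteresis=0.1)
frames = [Levels(r, r, (r, r)) for r in (0.1, 0.4, 0.25, 0.15, 0.25)]
out = [gate.process(f).rms for f in frames]
```

The same input level (0.25) passes while the gate is open and is zeroed once the gate has closed, which is why the filter cannot be stateless.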

Files to Modify/Create

  • core/audio/filters/channel_extract.py (create)
  • core/audio/filters/band_extract.py (create)
  • core/audio/filters/peak_hold.py (create)
  • core/audio/filters/gain.py (create)
  • core/audio/filters/noise_gate.py (create)
  • core/audio/filters/envelope_follower.py (create)
  • core/audio/filters/spectral_smoothing.py (create)
  • core/audio/filters/compressor.py (create)
  • core/audio/filters/inverter.py (create)
  • core/audio/filters/beat_gate.py (create)
  • core/audio/filters/delay.py (create)
  • core/audio/filters/__init__.py (modify): register all filters

Acceptance Criteria

  • All 11 filters are implemented and registered
  • Each filter correctly transforms AudioAnalysis according to its specification
  • Stateful filters (peak hold, envelope follower, spectral smoothing, compressor, beat gate, delay, noise gate) properly maintain and reset state
  • Filter option schemas are complete and accurate
  • All filters are accessible via GET /api/v1/audio-filters
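
One way to check the "maintain and reset state" criterion is shown below, using the Spectral Smoothing formula from Task 7. This is only a sketch: it works on plain lists rather than AudioAnalysis, the class name is hypothetical, and passing the first frame through unchanged is an assumption the plan does not specify.

```python
class SpectralSmoothingSketch:
    """EMA per bin: smoothed[i] = factor * prev[i] + (1 - factor) * current[i]."""

    def __init__(self, factor: float = 0.5):
        self.factor = factor
        self._prev = None  # no history until the first frame arrives

    def reset(self) -> None:
        self._prev = None

    def process(self, spectrum: list) -> list:
        if self._prev is None:
            # Assumption: the first frame passes through unchanged.
            smoothed = list(spectrum)
        else:
            f = self.factor
            smoothed = [f * p + (1.0 - f) * c
                        for p, c in zip(self._prev, spectrum)]
        self._prev = smoothed
        return smoothed


smoother = SpectralSmoothingSketch(factor=0.5)
first = smoother.process([1.0, 0.0])
second = smoother.process([0.0, 1.0])       # blends with the previous frame
smoother.reset()
after_reset = smoother.process([0.0, 1.0])  # history cleared, no blending
```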

Notes

  • 7 stateful filters: peak hold, noise gate (gate open/closed state for hysteresis), envelope follower, spectral smoothing, compressor, beat gate, delay.
  • Band extract can reuse math from existing core/audio/band_filter.py: compute_band_mask() and apply_band_filter()
  • Filters must produce a NEW AudioAnalysis (immutability principle), not mutate the input
  • For delay filter, ring buffer size = delay_ms / (1000 / update_rate). At 30Hz, 2000ms delay = 60 slots.
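
The ring buffer sizing rule above can be sketched like this. The names `ring_buffer_slots` and `DelaySketch` are hypothetical, snapshots are represented by arbitrary objects, and returning the oldest available frame while the buffer is still filling is an assumption (the plan does not say what the delay outputs during warm-up).

```python
import math
from collections import deque


def ring_buffer_slots(delay_ms: float, update_rate_hz: float = 30.0) -> int:
    """slots = delay_ms / (1000 / update_rate), rounded up to whole frames.

    Written as delay_ms * rate / 1000 to avoid dividing by the inexact
    frame period 1000 / 30.
    """
    return max(1, math.ceil(delay_ms * update_rate_hz / 1000.0))


class DelaySketch:
    """Outputs the snapshot received `slots` frames earlier."""

    def __init__(self, delay_ms: float, update_rate_hz: float = 30.0):
        self._slots = ring_buffer_slots(delay_ms, update_rate_hz)
        # One extra slot so the newest and the delayed frame coexist.
        self._buf = deque(maxlen=self._slots + 1)

    def reset(self) -> None:
        self._buf.clear()

    def process(self, snapshot):
        self._buf.append(snapshot)
        # Oldest retained frame; during warm-up this is the first frame seen.
        return self._buf[0]


delay = DelaySketch(delay_ms=100)  # 3 frames at 30 Hz
outputs = [delay.process(i) for i in range(6)]
```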

Review Checklist

  • All tasks completed
  • Code follows project conventions
  • No unintended side effects
  • Build passes
  • Tests pass (new + existing)

Handoff to Next Phase

What was built

  • All 11 audio filters implemented, each in its own file under core/audio/filters/
  • 7 stateful filters (peak_hold, noise_gate, envelope_follower, spectral_smoothing, compressor, beat_gate, delay) with proper is_stateful and reset() implementations
  • 4 stateless filters (channel_extract, band_extract, gain, inverter)
  • All filters registered in __init__.py via import-triggered @AudioFilterRegistry.register
  • All filters produce NEW AudioAnalysis via dataclasses.replace() (immutability preserved)
  • Band extract reuses existing compute_band_mask() and apply_band_filter() from core/audio/band_filter.py
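
As an illustration of the immutability point, the Gain behavior from Task 4 could be written with dataclasses.replace() roughly as follows. `Levels` is a hypothetical stand-in for AudioAnalysis, per-channel fields are omitted for brevity, and `apply_gain` is an illustrative name rather than the project's function.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Levels:
    """Hypothetical stand-in for AudioAnalysis (only the fields used here)."""
    rms: float
    peak: float
    spectrum: tuple


def clamp01(x: float) -> float:
    return min(1.0, max(0.0, x))


def apply_gain(a: Levels, factor: float) -> Levels:
    """Return a new Levels scaled by factor; rms and peak are clamped to [0, 1]."""
    return replace(
        a,
        rms=clamp01(a.rms * factor),
        peak=clamp01(a.peak * factor),
        spectrum=tuple(v * factor for v in a.spectrum),
    )


src = Levels(rms=0.4, peak=0.6, spectrum=(0.1, 0.2))
out = apply_gain(src, 2.0)
```

Because the dataclass is frozen and replace() builds a new instance, `src` is left untouched and can still be consumed by other filter chains.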

What Phase 3 needs to know

  • All 11 filters + the audio_filter_template meta-filter are now registered in the AudioFilterRegistry (12 total)
  • GET /api/v1/audio-filters will return all filters with their option schemas
  • Filters are instantiated via AudioFilterRegistry.create_instance(filter_id, options)
  • Stateful filters need per-stream instances (not shared) due to internal state
  • The process() method signature is process(analysis: AudioAnalysis) -> AudioAnalysis
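
The per-stream instance requirement can be sketched as below. Everything here is a hypothetical stand-in: `_FACTORIES` and the module-level `create_instance` only mimic the shape of AudioFilterRegistry.create_instance(filter_id, options), `Levels` replaces AudioAnalysis, and `PeakHoldSketch` omits decay for brevity.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Levels:
    """Hypothetical stand-in for AudioAnalysis."""
    rms: float


class PeakHoldSketch:
    """Stateful toy filter: remembers the largest rms seen (no decay)."""

    def __init__(self, options: dict):
        self._held = 0.0

    def process(self, analysis: Levels) -> Levels:
        self._held = max(self._held, analysis.rms)
        return replace(analysis, rms=self._held)


# Hypothetical stand-in for the registry: filter_id -> factory.
_FACTORIES = {"peak_hold": PeakHoldSketch}


def create_instance(filter_id: str, options: dict):
    return _FACTORIES[filter_id](options)


def build_chain(specs):
    """Create fresh instances; call once per stream so state is never shared."""
    return [create_instance(fid, opts) for fid, opts in specs]


def run_chain(chain, analysis: Levels) -> Levels:
    for f in chain:
        analysis = f.process(analysis)
    return analysis


specs = [("peak_hold", {})]
stream_a = build_chain(specs)
stream_b = build_chain(specs)
run_chain(stream_a, Levels(0.9))          # loud frame on stream A only
a_out = run_chain(stream_a, Levels(0.1))  # A still holds the earlier peak
b_out = run_chain(stream_b, Levels(0.1))  # B is unaffected
```

If both streams shared one chain, the loud frame on stream A would leak into stream B's output.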

Known deviations from plan

  • None. All 11 filters implemented exactly as specified plus Task 13 (noise gate stateful).