feat(processed-audio-sources): phase 2 - implement 11 audio filters
Add all audio filters that transform AudioAnalysis data:
- Channel Extract, Band Extract (migration from old source types)
- Peak Hold, Gain, Noise Gate, Envelope Follower
- Spectral Smoothing, Compressor, Inverter, Beat Gate, Delay

All registered via AudioFilterRegistry with option schemas.
@@ -9,13 +9,20 @@
 - **Test:** `cd server && py -3.13 -m pytest tests/ --no-cov -q`
 
 ## Current State
-Phase 1 (Audio Filter Framework) implemented. Core framework is in place:
+Phase 1 (Audio Filter Framework) and Phase 2 (Audio Filters) implemented.
+
+Phase 1 framework:
 - `AudioFilter` base class, `AudioFilterRegistry`, `AudioFilterOptionDef` in `core/audio/filters/`
 - `AudioProcessingTemplate` dataclass + `AudioProcessingTemplateStore` (SQLite-backed) in `storage/`
 - `audio_filter_template` meta-filter with recursive resolution
 - Full REST API: CRUD templates + filter registry discovery
 - Dependency injection wired in `dependencies.py` and `main.py`
+
+Phase 2 filters (12 total registered, 11 real + 1 meta):
+- Stateless: `channel_extract`, `band_extract`, `gain`, `inverter`
+- Stateful: `peak_hold`, `noise_gate`, `envelope_follower`, `spectral_smoothing`, `compressor`, `beat_gate`, `delay`
+- All produce new `AudioAnalysis` via `dataclasses.replace()` (immutability preserved)
 
 ## Key Architecture Reference
 
 ### Existing Pattern to Mirror: Processed Picture Sources
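The Current State notes in this hunk say every filter returns a new `AudioAnalysis` through `dataclasses.replace()`. A minimal sketch of that pattern for a stateless gain step follows. The three-field `AudioAnalysis` and the `apply_gain` helper are assumptions made for illustration; the project's real dataclass has more fields and its filters are classes registered with `AudioFilterRegistry`.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AudioAnalysis:
    """Hypothetical stand-in for the project's AudioAnalysis snapshot."""
    rms: float
    peak: float
    spectrum: tuple[float, ...]


def clamp01(value: float) -> float:
    """Clamp a level to the [0, 1] range."""
    return max(0.0, min(1.0, value))


def apply_gain(analysis: AudioAnalysis, factor: float) -> AudioAnalysis:
    """Return a new snapshot with scaled levels; the input is not mutated."""
    return replace(
        analysis,
        rms=clamp01(analysis.rms * factor),
        peak=clamp01(analysis.peak * factor),
        # Assumption: spectrum bins are scaled but not clamped, mirroring
        # the plan's "Clamps to [0, 1] for rms/peak" wording.
        spectrum=tuple(b * factor for b in analysis.spectrum),
    )


snapshot = AudioAnalysis(rms=0.4, peak=0.6, spectrum=(0.1, 0.2))
louder = apply_gain(snapshot, 2.0)
```

Because `replace()` builds a fresh object, `snapshot` can still be handed to other consumers unchanged after `apply_gain` runs.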
@@ -83,7 +90,7 @@ _(none yet)_
 | Phase | Agent Used | Test Writer | Parallel | Notes |
 |-------|-----------|-------------|----------|-------|
 | Phase 1 | impl-agent | — | No | Tasks 7+8 skipped (SQLite migration made them obsolete) |
-| Phase 2 | — | — | — | — |
+| Phase 2 | impl-agent | — | No | All 11 filters implemented, no deviations |
 | Phase 3 | — | — | — | — |
 | Phase 4 | — | — | — | — |
 | Phase 5 | — | — | — | — |
@@ -98,6 +105,6 @@ _(none yet)_
 
 ## Implementation Notes
 - Clean-slate approach: no migration of existing MonoAudioSource/BandExtractAudioSource data
-- 5 of 11 filters are stateful (peak hold, envelope follower, spectral smoothing, compressor, delay) — need per-stream instance lifecycle
+- 7 of 11 filters are stateful (peak hold, noise gate, envelope follower, spectral smoothing, compressor, beat gate, delay) — need per-stream instance lifecycle
 - Audio filters operate on AudioAnalysis snapshots, not raw audio samples
 - Big Bang strategy: intermediate phases may break the build; only Phase 7 enforces build/tests
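The note that stateful filters need a per-stream instance lifecycle can be illustrated with a small sketch. Everything here is hypothetical: `LevelSmoother` is a single-value exponential moving average standing in for a stateful filter, `PerStreamFilters` is an invented holder class, and seeding the state with the first sample is an assumption rather than something the plan specifies.

```python
from typing import Callable, Dict, Optional


class LevelSmoother:
    """Hypothetical stateful filter: exponential moving average of one level."""

    def __init__(self, factor: float = 0.5) -> None:
        self.factor = factor
        self._prev: Optional[float] = None

    def process(self, level: float) -> float:
        if self._prev is None:
            # Assumption: the first sample seeds the state.
            self._prev = level
        else:
            self._prev = self.factor * self._prev + (1.0 - self.factor) * level
        return self._prev

    def reset(self) -> None:
        """Forget history, e.g. when a stream restarts."""
        self._prev = None


class PerStreamFilters:
    """Hypothetical holder that keeps one filter instance per stream id."""

    def __init__(self, factory: Callable[[], LevelSmoother]) -> None:
        self._factory = factory
        self._instances: Dict[str, LevelSmoother] = {}

    def process(self, stream_id: str, level: float) -> float:
        # Create the instance lazily on the stream's first snapshot.
        if stream_id not in self._instances:
            self._instances[stream_id] = self._factory()
        return self._instances[stream_id].process(level)

    def close(self, stream_id: str) -> None:
        """Drop the stream's state when the stream ends."""
        self._instances.pop(stream_id, None)


filters = PerStreamFilters(lambda: LevelSmoother(factor=0.5))
filters.process("stream-a", 1.0)            # seeds stream-a's state
a_next = filters.process("stream-a", 0.0)   # 0.5 * 1.0 + 0.5 * 0.0
b_first = filters.process("stream-b", 0.0)  # not affected by stream-a
```

If both streams shared one `LevelSmoother`, the history of `stream-a` would leak into the first output of `stream-b`, which is the failure mode the note guards against.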

@@ -40,7 +40,7 @@ Clean-slate approach: no data migration for old source types.
 | Phase | Domain | Status | Review | Build | Committed |
 |-------|--------|--------|--------|-------|-----------|
 | Phase 1: Audio Filter Framework | backend | 🔨 In Progress | ⬜ | ⬜ | ⬜ |
-| Phase 2: Audio Filters | backend | ⬜ Not Started | ⬜ | ⬜ | ⬜ |
+| Phase 2: Audio Filters | backend | 🔨 In Progress | ⬜ | ⬜ | ⬜ |
 | Phase 3: Processed Audio Source Model | backend | ⬜ Not Started | ⬜ | ⬜ | ⬜ |
 | Phase 4: Runtime Integration | backend | ⬜ Not Started | ⬜ | ⬜ | ⬜ |
 | Phase 5: Frontend — Audio Processing Templates | frontend | ⬜ Not Started | ⬜ | ⬜ | ⬜ |

@@ -1,6 +1,6 @@
 # Phase 2: Audio Filters
 
-**Status:** ⬜ Not Started
+**Status:** 🔨 In Progress
 **Parent plan:** [PLAN.md](./PLAN.md)
 **Domain:** backend
 
@@ -9,55 +9,55 @@ Implement all 11 audio filters and register them with the AudioFilterRegistry.
 
 ## Tasks
 
-- [ ] Task 1: **Channel Extract** filter (`core/audio/filters/channel_extract.py`)
+- [x] Task 1: **Channel Extract** filter (`core/audio/filters/channel_extract.py`)
   - Options: `channel` (select: mono | left | right)
   - Stateful: No
   - Behavior: Replaces main rms/spectrum with selected channel data. If "mono", averages L+R. If "left"/"right", copies that channel's data to the main fields.
-- [ ] Task 2: **Band Extract** filter (`core/audio/filters/band_extract.py`)
+- [x] Task 2: **Band Extract** filter (`core/audio/filters/band_extract.py`)
   - Options: `band` (select: bass | mid | treble | custom), `freq_low` (float, 20-20000), `freq_high` (float, 20-20000)
   - Stateful: No
   - Behavior: Computes a band mask for the 64 log-spaced bins, applies it to spectrum, recomputes RMS from in-band data. Reuse logic from existing `core/audio/band_filter.py`.
   - Presets: bass=20-250Hz, mid=250-4000Hz, treble=4000-20000Hz
-- [ ] Task 3: **Peak Hold** filter (`core/audio/filters/peak_hold.py`)
+- [x] Task 3: **Peak Hold** filter (`core/audio/filters/peak_hold.py`)
   - Options: `decay_rate` (float, 0.1-50.0, dB/s), `per_bin` (bool, default true)
   - Stateful: Yes
   - Behavior: For each spectrum bin (if per_bin) or for rms/peak, retains the maximum value seen and decays it over time. Outputs the max of current value and held peak.
-- [ ] Task 4: **Gain** filter (`core/audio/filters/gain.py`)
+- [x] Task 4: **Gain** filter (`core/audio/filters/gain.py`)
   - Options: `factor` (float, 0.1-10.0, default 1.0)
   - Stateful: No
   - Behavior: Multiplies rms, peak, spectrum, and per-channel values by factor. Clamps to [0, 1] for rms/peak.
-- [ ] Task 5: **Noise Gate** filter (`core/audio/filters/noise_gate.py`)
+- [x] Task 5: **Noise Gate** filter (`core/audio/filters/noise_gate.py`)
   - Options: `threshold` (float, 0.0-1.0), `hysteresis` (float, 0.0-0.2, default 0.05)
   - Stateful: No (hysteresis is stateless — it's a secondary threshold, not temporal)
   - Behavior: If rms < threshold, zeros out all levels and spectrum. Hysteresis means: if gate was open and rms drops below (threshold - hysteresis), close it; if gate was closed and rms rises above threshold, open it.
   - Actually stateful for hysteresis tracking: needs to remember gate open/closed state.
-- [ ] Task 6: **Envelope Follower** filter (`core/audio/filters/envelope_follower.py`)
+- [x] Task 6: **Envelope Follower** filter (`core/audio/filters/envelope_follower.py`)
   - Options: `attack_ms` (float, 1-500, default 10), `release_ms` (float, 10-2000, default 200)
   - Stateful: Yes
   - Behavior: Smooths rms and peak with asymmetric time constants. When signal rises, uses attack rate. When signal falls, uses release rate. Applied per-bin to spectrum optionally.
   - Fast attack + slow release = punchy transients that fade smoothly.
-- [ ] Task 7: **Spectral Smoothing** filter (`core/audio/filters/spectral_smoothing.py`)
+- [x] Task 7: **Spectral Smoothing** filter (`core/audio/filters/spectral_smoothing.py`)
   - Options: `factor` (float, 0.0-0.99, default 0.5)
   - Stateful: Yes (maintains previous spectrum state)
   - Behavior: Applies exponential moving average per-bin: `smoothed[i] = factor * prev[i] + (1-factor) * current[i]`. Higher factor = smoother/slower.
-- [ ] Task 8: **Compressor** filter (`core/audio/filters/compressor.py`)
+- [x] Task 8: **Compressor** filter (`core/audio/filters/compressor.py`)
   - Options: `threshold` (float, 0.0-1.0, default 0.5), `ratio` (float, 1.0-20.0, default 4.0), `makeup_gain` (float, 0.0-2.0, default 1.0)
   - Stateful: Yes (envelope tracking for gain reduction)
   - Behavior: When signal exceeds threshold, reduces by ratio. `output = threshold + (input - threshold) / ratio`. Apply makeup_gain after. Applied to rms, peak, and spectrum.
-- [ ] Task 9: **Inverter** filter (`core/audio/filters/inverter.py`)
+- [x] Task 9: **Inverter** filter (`core/audio/filters/inverter.py`)
   - Options: none (or `invert_spectrum` bool, default true)
   - Stateful: No
   - Behavior: `rms = 1.0 - rms`, `peak = 1.0 - peak`, spectrum bins inverted if option set. Beat fields unchanged.
-- [ ] Task 10: **Beat Gate** filter (`core/audio/filters/beat_gate.py`)
+- [x] Task 10: **Beat Gate** filter (`core/audio/filters/beat_gate.py`)
   - Options: `hold_ms` (float, 10-500, default 50) — how long to hold signal after beat
   - Stateful: Yes (tracks last beat timestamp)
   - Behavior: When beat detected, passes signal through for `hold_ms` milliseconds. Between beats, zeros out rms/peak/spectrum. Beat fields themselves always pass through.
-- [ ] Task 11: **Delay** filter (`core/audio/filters/delay.py`)
+- [x] Task 11: **Delay** filter (`core/audio/filters/delay.py`)
   - Options: `delay_ms` (float, 10-2000, default 100)
   - Stateful: Yes (ring buffer of AudioAnalysis snapshots)
   - Behavior: Buffers incoming AudioAnalysis snapshots and outputs the one from `delay_ms` ago. Ring buffer sized based on ~30Hz update rate.
-- [ ] Task 12: Register all 11 filters in `core/audio/filters/__init__.py`
-- [ ] Task 13: Update Noise Gate to be stateful (hysteresis requires gate state tracking)
+- [x] Task 12: Register all 11 filters in `core/audio/filters/__init__.py`
+- [x] Task 13: Update Noise Gate to be stateful (hysteresis requires gate state tracking)
 
 ## Files to Modify/Create
 - `core/audio/filters/channel_extract.py` — **create**
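Two of the behaviors specified in the task list above are easy to misread, so here is a hedged sketch of each on plain float levels. `HysteresisGate` follows the Task 5 wording and shows why Task 13 makes the gate stateful: the result depends on whether the gate was already open. `compress` is only the static transfer curve from Task 8; the envelope tracking that makes the real compressor stateful is omitted. Both names, the float-only interface, and the gate starting closed are assumptions, not the code in this commit.

```python
class HysteresisGate:
    """Hypothetical noise gate operating on a single rms level."""

    def __init__(self, threshold: float, hysteresis: float = 0.05) -> None:
        self.threshold = threshold
        self.hysteresis = hysteresis
        self.is_open = False  # assumption: the gate starts closed

    def process(self, rms: float) -> float:
        if self.is_open:
            # An open gate only closes below the lower threshold.
            if rms < self.threshold - self.hysteresis:
                self.is_open = False
        elif rms > self.threshold:
            # A closed gate only opens above the upper threshold.
            self.is_open = True
        return rms if self.is_open else 0.0

    def reset(self) -> None:
        self.is_open = False


def compress(level: float, threshold: float = 0.5, ratio: float = 4.0,
             makeup_gain: float = 1.0) -> float:
    """Static curve: output = threshold + (input - threshold) / ratio."""
    if level > threshold:
        level = threshold + (level - threshold) / ratio
    return level * makeup_gain


gate = HysteresisGate(threshold=0.3, hysteresis=0.05)
gated = [gate.process(rms) for rms in (0.2, 0.35, 0.28, 0.24, 0.28)]
# 0.28 passes while the gate is open but is zeroed once the gate has closed.

squeezed = compress(1.0)  # 0.5 + (1.0 - 0.5) / 4.0
```

The same input value, 0.28, yields two different outputs in `gated`, which a stateless function of `rms` alone could not produce.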
@@ -94,4 +94,21 @@ Implement all 11 audio filters and register them with the AudioFilterRegistry.
 - [ ] Tests pass (new + existing)
 
 ## Handoff to Next Phase
 <!-- Filled in by the implementation agent after completing this phase. -->
+
+### What was built
+- All 11 audio filters implemented, each in its own file under `core/audio/filters/`
+- 7 stateful filters (peak_hold, noise_gate, envelope_follower, spectral_smoothing, compressor, beat_gate, delay) with proper `is_stateful` and `reset()` implementations
+- 4 stateless filters (channel_extract, band_extract, gain, inverter)
+- All filters registered in `__init__.py` via import-triggered `@AudioFilterRegistry.register`
+- All filters produce NEW AudioAnalysis via `dataclasses.replace()` (immutability preserved)
+- Band extract reuses existing `compute_band_mask()` and `apply_band_filter()` from `core/audio/band_filter.py`
+
+### What Phase 3 needs to know
+- All 11 filters + the `audio_filter_template` meta-filter are now registered in the AudioFilterRegistry (12 total)
+- `GET /api/v1/audio-filters` will return all filters with their option schemas
+- Filters are instantiated via `AudioFilterRegistry.create_instance(filter_id, options)`
+- Stateful filters need per-stream instances (not shared) due to internal state
+- The `process()` method signature is `process(analysis: AudioAnalysis) -> AudioAnalysis`
+
+### Known deviations from plan
+- None. All 11 filters implemented exactly as specified plus Task 13 (noise gate stateful).
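The handoff notes mention decorator-based registration and `create_instance(filter_id, options)`. The sketch below shows how such a registry could be consumed to build one filter chain per stream. `ToyFilterRegistry`, `Gain`, `MaxHold`, `build_chain`, and `run_chain` are all invented for illustration, and the filters process a plain float instead of an `AudioAnalysis`; the real `AudioFilterRegistry` internals are not shown in this commit and may differ.

```python
from typing import Any, Dict, List, Tuple


class ToyFilterRegistry:
    """Hypothetical stand-in for AudioFilterRegistry, not the project's code."""

    _filters: Dict[str, type] = {}

    @classmethod
    def register(cls, filter_cls: type) -> type:
        # Used as a class decorator; keys the class by its filter_id.
        cls._filters[filter_cls.filter_id] = filter_cls
        return filter_cls

    @classmethod
    def create_instance(cls, filter_id: str, options: Dict[str, Any]):
        # Each call returns a fresh instance, so state is never shared.
        return cls._filters[filter_id](**options)


@ToyFilterRegistry.register
class Gain:
    filter_id = "gain"
    is_stateful = False

    def __init__(self, factor: float = 1.0) -> None:
        self.factor = factor

    def process(self, level: float) -> float:
        return min(1.0, level * self.factor)


@ToyFilterRegistry.register
class MaxHold:
    """Simplified stateful example: holds the maximum seen, with no decay."""
    filter_id = "max_hold"
    is_stateful = True

    def __init__(self) -> None:
        self._held = 0.0

    def process(self, level: float) -> float:
        self._held = max(self._held, level)
        return self._held

    def reset(self) -> None:
        self._held = 0.0


def build_chain(specs: List[Tuple[str, Dict[str, Any]]]) -> list:
    """Instantiate a fresh chain; call once per stream."""
    return [ToyFilterRegistry.create_instance(fid, opts) for fid, opts in specs]


def run_chain(chain: list, level: float) -> float:
    for audio_filter in chain:
        level = audio_filter.process(level)
    return level


specs = [("gain", {"factor": 2.0}), ("max_hold", {})]
chain_a, chain_b = build_chain(specs), build_chain(specs)
first_a = run_chain(chain_a, 0.4)   # gain -> 0.8, held -> 0.8
second_a = run_chain(chain_a, 0.1)  # gain -> 0.2, held stays 0.8
first_b = run_chain(chain_b, 0.1)   # separate instance, held -> 0.2
```

Building the chain once per stream keeps `chain_b` independent of whatever `chain_a` has already held, matching the note that stateful filters must not be shared.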