Initial commit: Media Server for remote media control

FastAPI REST API server for controlling system-wide media playback
on Windows, Linux, macOS, and Android.

Features:
- Play/Pause/Stop/Next/Previous track controls
- Volume control and mute
- Seek within tracks
- Current track info (title, artist, album, artwork)
- WebSocket real-time status updates
- Script execution API
- Token-based authentication
- Cross-platform support

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 14:41:00 +03:00
commit 83acf5f1ec
26 changed files with 3562 additions and 0 deletions

.gitignore

@@ -0,0 +1,48 @@
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
# Virtual environments
venv/
ENV/
env/
.venv/
# IDE
.idea/
.vscode/
.claude/
*.swp
*.swo
*~
# Config files with secrets
config.yaml
config.json
.env
# Logs
*.log
logs/
# OS
.DS_Store
Thumbs.db

CLAUDE.md

@@ -0,0 +1,39 @@
# Media Server - Development Guide
## Overview
Standalone REST API server (FastAPI) for controlling system-wide media playback on Windows, Linux, macOS, and Android.
## Running the Server
### Manual Start
```bash
python -m media_server.main
```
### Auto-Start on Boot (Windows Task Scheduler)
Run in **Administrator PowerShell** from the media-server directory:
```powershell
.\media_server\service\install_task_windows.ps1
```
To remove the scheduled task:
```powershell
Unregister-ScheduledTask -TaskName "MediaServer" -Confirm:$false
```
## Configuration
Copy `config.example.yaml` to `config.yaml` and customize.
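If `config.yaml` does not set `api_token`, a random token is generated in memory at each startup and only its first 8 characters are logged to the console. Run `python -m media_server.main --generate-config` to write a config file with a persistent token.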
Default port: `8765`
## Git Rules
Always ask for user approval before committing changes to git.

README.md

@@ -0,0 +1,383 @@
# Media Server
A REST API server for controlling system media playback on Windows, Linux, macOS, and Android.
## Features
- Control any media player via system-wide media transport controls
- Play/Pause/Stop/Next/Previous track
- Volume control and mute
- Seek within tracks
- Get current track info (title, artist, album, artwork)
- Token-based authentication
- Cross-platform support
## Requirements
- Python 3.10+
- Platform-specific dependencies (see below)
## Installation
### Windows
```bash
pip install -r requirements.txt
```
Required packages: `winsdk`, `pywin32`, `pycaw`, `comtypes`
### Linux
```bash
# Install system dependencies
sudo apt-get install python3-dbus python3-gi libdbus-1-dev libglib2.0-dev
pip install -r requirements.txt
```
### macOS
```bash
pip install -r requirements.txt
```
No additional dependencies - uses built-in `osascript`.
### Android (Termux)
```bash
# In Termux
pkg install python termux-api
pip install -r requirements.txt
```
Requires Termux and Termux:API apps from F-Droid.
## Quick Start
1. Generate configuration with API token:
```bash
python -m media_server.main --generate-config
```
2. View your API token:
```bash
python -m media_server.main --show-token
```
3. Start the server:
```bash
python -m media_server.main
```
4. Test the connection:
```bash
curl http://localhost:8765/api/health
```
5. Test with authentication:
```bash
curl -H "Authorization: Bearer YOUR_TOKEN" http://localhost:8765/api/media/status
```
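The same two checks can be scripted from Python. This is a minimal sketch using only the standard library; `BASE_URL`, `build_request`, and `fetch_json` are illustrative names that are not part of the package, and the token is whatever `--show-token` printed for you.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:8765"  # assumption: server on this machine, default port


def build_request(path: str, token: Optional[str] = None) -> urllib.request.Request:
    """Build a GET request, adding the bearer token when one is given."""
    request = urllib.request.Request(BASE_URL + path)
    if token is not None:
        request.add_header("Authorization", f"Bearer {token}")
    return request


def fetch_json(path: str, token: Optional[str] = None, timeout: float = 5.0) -> dict:
    """Send the request and decode the JSON body (raises on HTTP errors)."""
    with urllib.request.urlopen(build_request(path, token), timeout=timeout) as response:
        return json.load(response)
```

With the server running, `fetch_json("/api/health")` needs no token, while `fetch_json("/api/media/status", token="YOUR_TOKEN")` should raise an HTTP 401 error if the token is wrong.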
## Configuration
Configuration file locations:
- Windows: `%APPDATA%\media-server\config.yaml`
- Linux/macOS: `~/.config/media-server/config.yaml`
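The loader in `media_server/config.py` tries the working directory first (`config.yaml`, then `config.yml`), then the per-user directory, and on non-Windows systems finally `/etc/media-server/config.yaml`. A minimal sketch of that search order follows; `candidate_config_paths` and `first_existing` are hypothetical helper names used only for illustration.

```python
import os
from pathlib import Path
from typing import Optional


def candidate_config_paths(os_name: str = os.name, appdata: Optional[str] = None) -> list:
    """Return config locations in the order the loader tries them."""
    paths = [Path("config.yaml"), Path("config.yml")]  # working directory first
    if os_name == "nt":
        # Fall back to the real APPDATA variable when none is passed in
        appdata = appdata if appdata is not None else os.environ.get("APPDATA", "")
        if appdata:
            paths.append(Path(appdata) / "media-server" / "config.yaml")
    else:
        paths.append(Path.home() / ".config" / "media-server" / "config.yaml")
        paths.append(Path("/etc/media-server/config.yaml"))
    return paths


def first_existing(paths: list) -> Optional[Path]:
    """Pick the first path that exists, or None when no config file is found."""
    return next((p for p in paths if p.exists()), None)
```

Because the working directory wins, a stray `config.yaml` next to where you launch the server will shadow the one in your user directory.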
### config.yaml
```yaml
host: 0.0.0.0
port: 8765
api_token: your-secret-token-here
poll_interval: 1.0
log_level: INFO
```
### Environment Variables
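Settings can also be supplied through environment variables (prefix: `MEDIA_SERVER_`). Values present in `config.yaml` are passed to the settings class as constructor arguments, which pydantic-settings ranks above environment variables, so an environment variable only takes effect for keys the YAML file does not set: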
```bash
export MEDIA_SERVER_HOST=0.0.0.0
export MEDIA_SERVER_PORT=8765
export MEDIA_SERVER_API_TOKEN=your-token
export MEDIA_SERVER_LOG_LEVEL=DEBUG
```
## API Reference
### Health Check
```
GET /api/health
```
No authentication required. Returns server status and platform info.
**Response:**
```json
{
"status": "healthy",
"platform": "Windows",
"version": "1.0.0"
}
```
### Get Media Status
```
GET /api/media/status
Authorization: Bearer <token>
```
**Response:**
```json
{
"state": "playing",
"title": "Song Title",
"artist": "Artist Name",
"album": "Album Name",
"album_art_url": "https://...",
"duration": 240.5,
"position": 120.3,
"volume": 75,
"muted": false,
"source": "Spotify"
}
```
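Most fields in this response can be `null` (the models in this commit declare title, artist, album, artwork URL, duration, and position as optional), so clients should not assume timing data is present. A small sketch of defensive handling; `playback_progress` and `describe` are illustrative names, not part of the API.

```python
import json
from typing import Optional


def playback_progress(status: dict) -> Optional[float]:
    """Return position/duration as a fraction in [0, 1], or None if unknown."""
    duration = status.get("duration")
    position = status.get("position")
    if not duration or position is None:
        return None  # idle session, or a player that does not report timing
    return max(0.0, min(1.0, position / duration))


def describe(status: dict) -> str:
    """Format a one-line summary of the current track."""
    title = status.get("title") or "Unknown title"
    artist = status.get("artist") or "Unknown artist"
    return f"{status.get('state', 'idle')}: {artist} - {title}"


# Sample payload taken from the response documented above
sample = json.loads('{"state": "playing", "title": "Song Title", "artist": "Artist Name", '
                    '"duration": 240.5, "position": 120.3, "volume": 75, "muted": false}')
```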
### Media Controls
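All control endpoints require authentication and return `{"success": true}` on success. `volume`, `mute`, and `seek` also echo the resulting `volume`, `muted`, or `position` value. Except for `mute`, a control that fails (for example because no media session is active) returns HTTP 503.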
| Endpoint | Method | Body | Description |
|----------|--------|------|-------------|
| `/api/media/play` | POST | - | Resume playback |
| `/api/media/pause` | POST | - | Pause playback |
| `/api/media/stop` | POST | - | Stop playback |
| `/api/media/next` | POST | - | Next track |
| `/api/media/previous` | POST | - | Previous track |
| `/api/media/volume` | POST | `{"volume": 75}` | Set volume (0-100) |
| `/api/media/mute` | POST | - | Toggle mute |
| `/api/media/seek` | POST | `{"position": 60.0}` | Seek to position (seconds) |
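As a rough illustration of the table, the sketch below builds the POST requests with the standard library and checks the documented ranges before sending anything. The helper names and `BASE_URL` are assumptions made for this example; only the paths and JSON bodies come from the table.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:8765"  # assumption: default port on the local machine


def control_request(action: str, token: str, body: Optional[dict] = None) -> urllib.request.Request:
    """Build an authenticated POST to /api/media/<action>, with an optional JSON body."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(f"{BASE_URL}/api/media/{action}", data=data, method="POST")
    request.add_header("Authorization", f"Bearer {token}")
    if data is not None:
        request.add_header("Content-Type", "application/json")
    return request


def volume_request(level: int, token: str) -> urllib.request.Request:
    """Volume must be an integer from 0 to 100."""
    if not 0 <= level <= 100:
        raise ValueError("volume must be between 0 and 100")
    return control_request("volume", token, {"volume": level})


def seek_request(seconds: float, token: str) -> urllib.request.Request:
    """Seek position is given in seconds and cannot be negative."""
    if seconds < 0:
        raise ValueError("position must be non-negative")
    return control_request("seek", token, {"position": seconds})
```

Pass the resulting object to `urllib.request.urlopen` to send it; body-less actions such as `play` or `mute` can use `control_request("play", token)` directly.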
### Script Execution
The server supports executing pre-defined scripts via API.
#### List Scripts
```
GET /api/scripts/list
Authorization: Bearer <token>
```
**Response:**
```json
[
{
"name": "lock_screen",
"label": "Lock Screen",
"description": "Lock the workstation",
"timeout": 5
}
]
```
#### Execute Script
```
POST /api/scripts/execute/{script_name}
Authorization: Bearer <token>
Content-Type: application/json

{"args": []}
```
**Response:**
```json
{
"success": true,
"script": "lock_screen",
"exit_code": 0,
"stdout": "",
"stderr": ""
}
```
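In the route code from this commit, a script that fails, times out, or cannot be launched still comes back as a normal JSON body with `"success": false` (timeouts and launch failures use `exit_code` -1 with the reason in `stderr`, and unexpected server-side errors fill an `error` field). Clients should therefore check `success` rather than the HTTP status alone. A minimal sketch, with `summarize_script_result` as a hypothetical helper name:

```python
def summarize_script_result(result: dict) -> str:
    """Turn an execute response into a short human-readable message."""
    name = result.get("script", "<unknown>")
    if result.get("success"):
        return f"{name}: ok"
    if result.get("error"):
        # Set when the server hit an unexpected exception around the run
        return f"{name}: server error: {result['error']}"
    detail = (result.get("stderr") or "").strip() or "no stderr output"
    return f"{name}: exit code {result.get('exit_code')} ({detail})"
```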
### Configuring Scripts
Add scripts in your `config.yaml`:
```yaml
scripts:
lock_screen:
command: "rundll32.exe user32.dll,LockWorkStation"
label: "Lock Screen"
description: "Lock the workstation"
timeout: 5
shell: true
shutdown:
command: "shutdown /s /t 0"
label: "Shutdown"
description: "Shutdown the PC immediately"
timeout: 10
shell: true
restart:
command: "shutdown /r /t 0"
label: "Restart"
description: "Restart the PC"
timeout: 10
shell: true
hibernate:
command: "shutdown /h"
label: "Hibernate"
description: "Hibernate the PC"
timeout: 10
shell: true
sleep:
command: "rundll32.exe powrprof.dll,SetSuspendState 0,1,0"
label: "Sleep"
description: "Put PC to sleep"
timeout: 10
shell: true
```
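Note that in this commit the execute route joins any `args` from the request onto `command` with plain spaces, so with `shell: true` each argument reaches the shell unquoted. The sketch below shows one way arguments could be quoted for a POSIX shell using `shlex.quote`; it is an illustration rather than what the server does, `build_shell_command` is a hypothetical name, and `shlex.quote` does not produce quoting that `cmd.exe` on Windows understands.

```python
import shlex


def build_shell_command(command: str, args: list) -> str:
    """Append arguments to a command string, quoting each for a POSIX shell."""
    if not args:
        return command
    return command + " " + " ".join(shlex.quote(arg) for arg in args)
```

For scripts that never need caller-supplied arguments, the simplest safeguard is to send an empty `args` list and keep everything in the configured `command`.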
Script configuration options:
| Field | Required | Description |
|-------|----------|-------------|
| `command` | Yes | Command to execute |
| `label` | No | User-friendly display name (defaults to script name) |
| `description` | No | Description of what the script does |
| `icon` | No | Custom MDI icon (e.g., `mdi:power`) |
| `timeout` | No | Execution timeout in seconds (default: 30, max: 300) |
| `working_dir` | No | Working directory for the command |
| `shell` | No | Run in shell (default: true) |
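The defaults and limits in this table can be mirrored in a few lines. The real package validates entries with a pydantic model (`ScriptConfig`); the dataclass below is a standard-library stand-in written for illustration, and `ScriptEntry` is a hypothetical name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScriptEntry:
    """Stand-in for one entry under `scripts:` in config.yaml."""

    command: str
    label: Optional[str] = None
    description: str = ""
    icon: Optional[str] = None
    timeout: int = 30
    working_dir: Optional[str] = None
    shell: bool = True

    def __post_init__(self) -> None:
        # Same bounds as the table: 1 to 300 seconds
        if not 1 <= self.timeout <= 300:
            raise ValueError("timeout must be between 1 and 300 seconds")

    def display_label(self, name: str) -> str:
        """Fall back to a title-cased script name when no label is set."""
        return self.label or name.replace("_", " ").title()
```

For example, `ScriptEntry(command="shutdown /h", timeout=10).display_label("hibernate")` gives the label the list endpoint would show when `label` is omitted.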
## Running as a Service
### Windows Task Scheduler (Recommended)
Run in **Administrator PowerShell** from the project root:
```powershell
.\media_server\service\install_task_windows.ps1
```
To remove the scheduled task:
```powershell
Unregister-ScheduledTask -TaskName "MediaServer" -Confirm:$false
```
### Windows Service
Install:
```bash
python -m media_server.service.install_windows install
```
Start/Stop:
```bash
python -m media_server.service.install_windows start
python -m media_server.service.install_windows stop
```
Remove:
```bash
python -m media_server.service.install_windows remove
```
### Linux (systemd)
Install:
```bash
sudo ./service/install_linux.sh install
```
Enable and start for your user:
```bash
sudo systemctl enable media-server@$USER
sudo systemctl start media-server@$USER
```
View logs:
```bash
journalctl -u media-server@$USER -f
```
## Command Line Options
```
python -m media_server.main [OPTIONS]
Options:
--host TEXT Host to bind to (default: 0.0.0.0)
--port INTEGER Port to bind to (default: 8765)
--generate-config Generate default config file and exit
--show-token Show current API token and exit
```
## Security Recommendations
1. **Use HTTPS in production** - Set up a reverse proxy (nginx, Caddy) with SSL
2. **Strong tokens** - Generated tokens come from 32 random bytes (about 43 URL-safe characters); don't replace them with weak tokens
3. **Firewall** - Only expose the port to trusted networks
4. **Secrets management** - Don't commit tokens to version control
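If you prefer to create a token yourself, the standard library is enough. The package generates its tokens with `secrets.token_urlsafe(32)`; the sketch below does the same and adds a constant-time comparison, which is a general hardening suggestion rather than a description of the server's current check. `new_api_token` and `tokens_equal` are illustrative names.

```python
import secrets


def new_api_token(num_bytes: int = 32) -> str:
    """Generate a URL-safe random token from `num_bytes` bytes of entropy."""
    return secrets.token_urlsafe(num_bytes)


def tokens_equal(supplied: str, expected: str) -> bool:
    """Compare tokens in constant time so timing does not reveal partial matches."""
    # Compare as bytes: compare_digest rejects non-ASCII str arguments
    return secrets.compare_digest(supplied.encode("utf-8"), expected.encode("utf-8"))
```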
## Supported Media Players
### Windows
- Spotify
- Windows Media Player
- VLC
- Groove Music
- Web browsers (Chrome, Edge, Firefox)
- Any app using Windows Media Transport Controls
### Linux
- Any MPRIS-compliant player:
- Spotify
- VLC
- Rhythmbox
- Clementine
- Web browsers
- MPD (with MPRIS bridge)
### macOS
- Spotify
- Apple Music
- VLC (partial)
- QuickTime Player
### Android (via Termux)
- System media controls
- Limited seek support
## Troubleshooting
### "No active media session"
- Ensure a media player is running and has played content
- On Windows, check that the app supports media transport controls
- On Linux, verify MPRIS with: `dbus-send --print-reply --dest=org.freedesktop.DBus /org/freedesktop/DBus org.freedesktop.DBus.ListNames | grep mpris`
### Permission errors on Linux
- Ensure your user has access to the D-Bus session bus
- For systemd service, the `DBUS_SESSION_BUS_ADDRESS` must be set correctly
### Volume control not working
- Windows: Run as administrator if needed
- Linux: Ensure PulseAudio/PipeWire is running
## License
MIT License

config.example.yaml

@@ -0,0 +1,47 @@
# Media Server Configuration
# Copy this file to config.yaml and customize as needed.
# If api_token is omitted, a random token is generated in memory at each startup and is not saved.
# API Token (generate a secure random token)
api_token: "your-secure-token-here"
# Server settings
host: "0.0.0.0"
port: 8765
# Custom scripts
scripts:
lock_screen:
command: "rundll32.exe user32.dll,LockWorkStation"
label: "Lock Screen"
description: "Lock the workstation"
timeout: 5
shell: true
hibernate:
command: "shutdown /h"
label: "Hibernate"
description: "Hibernate the PC"
timeout: 10
shell: true
sleep:
command: "rundll32.exe powrprof.dll,SetSuspendState 0,1,0"
label: "Sleep"
description: "Put PC to sleep"
timeout: 10
shell: true
shutdown:
command: "shutdown /s /t 0"
label: "Shutdown"
description: "Shutdown the PC immediately"
timeout: 10
shell: true
restart:
command: "shutdown /r /t 0"
label: "Restart"
description: "Restart the PC immediately"
timeout: 10
shell: true

media_server/__init__.py

@@ -0,0 +1,3 @@
"""Media Server - REST API for controlling system media playback."""
__version__ = "1.0.0"

media_server/auth.py

@@ -0,0 +1,111 @@
"""Authentication middleware and utilities."""
import secrets
from typing import Optional

from fastapi import Depends, HTTPException, Query, Request, status
from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer

from .config import settings

security = HTTPBearer(auto_error=False)


def _token_matches(candidate: str) -> bool:
    """Compare a candidate token with the configured one in constant time."""
    # Compare as bytes: compare_digest raises TypeError on non-ASCII str input.
    return secrets.compare_digest(
        candidate.encode("utf-8"), settings.api_token.encode("utf-8")
    )


def _unauthorized(detail: str) -> HTTPException:
    """Build the 401 response shared by all token checks."""
    return HTTPException(
        status_code=status.HTTP_401_UNAUTHORIZED,
        detail=detail,
        headers={"WWW-Authenticate": "Bearer"},
    )


async def verify_token(
    request: Request,
    credentials: HTTPAuthorizationCredentials = Depends(security),
) -> str:
    """Verify the API token from the Authorization header.

    Args:
        request: The incoming request
        credentials: The bearer token credentials

    Returns:
        The validated token

    Raises:
        HTTPException: If the token is missing or invalid
    """
    if credentials is None:
        raise _unauthorized("Missing authentication token")
    if not _token_matches(credentials.credentials):
        raise _unauthorized("Invalid authentication token")
    return credentials.credentials


class TokenAuth:
    """Dependency class for token authentication."""

    def __init__(self, auto_error: bool = True):
        self.auto_error = auto_error

    async def __call__(
        self,
        request: Request,
        credentials: HTTPAuthorizationCredentials = Depends(security),
    ) -> Optional[str]:
        """Verify the token and return it; raise, or return None if auto_error is off."""
        if credentials is None:
            if self.auto_error:
                raise _unauthorized("Missing authentication token")
            return None
        if not _token_matches(credentials.credentials):
            if self.auto_error:
                raise _unauthorized("Invalid authentication token")
            return None
        return credentials.credentials


async def verify_token_or_query(
    credentials: HTTPAuthorizationCredentials = Depends(security),
    token: Optional[str] = Query(None, description="API token as query parameter"),
) -> str:
    """Verify the API token from header or query parameter.

    Useful for endpoints that need to be accessed via URL (like images).

    Args:
        credentials: The bearer token credentials from header
        token: Token from query parameter

    Returns:
        The validated token

    Raises:
        HTTPException: If the token is missing or invalid
    """
    # Try header first
    if credentials is not None and _token_matches(credentials.credentials):
        return credentials.credentials
    # Try query parameter
    if token is not None and _token_matches(token):
        return token
    raise _unauthorized("Missing or invalid authentication token")

media_server/config.py

@@ -0,0 +1,142 @@
"""Configuration management for the media server."""
import os
import secrets
from pathlib import Path
from typing import Optional
import yaml
from pydantic import BaseModel, Field
from pydantic_settings import BaseSettings, SettingsConfigDict
class ScriptConfig(BaseModel):
"""Configuration for a custom script."""
command: str = Field(..., description="Command or script to execute")
label: Optional[str] = Field(default=None, description="User-friendly display label")
description: str = Field(default="", description="Script description")
icon: Optional[str] = Field(default=None, description="Custom icon (e.g., 'mdi:power')")
timeout: int = Field(default=30, description="Execution timeout in seconds", ge=1, le=300)
working_dir: Optional[str] = Field(default=None, description="Working directory")
shell: bool = Field(default=True, description="Run command in shell")
class Settings(BaseSettings):
"""Application settings loaded from environment or config file."""
model_config = SettingsConfigDict(
env_prefix="MEDIA_SERVER_",
env_file=".env",
env_file_encoding="utf-8",
extra="ignore",
)
# Server settings
host: str = Field(default="0.0.0.0", description="Server bind address")
port: int = Field(default=8765, description="Server port")
# Authentication
api_token: str = Field(
default_factory=lambda: secrets.token_urlsafe(32),
description="API authentication token",
)
# Media controller settings
poll_interval: float = Field(
default=1.0, description="Media status poll interval in seconds"
)
# Logging
log_level: str = Field(default="INFO", description="Logging level")
# Custom scripts (loaded separately from YAML)
scripts: dict[str, ScriptConfig] = Field(
default_factory=dict,
description="Custom scripts that can be executed via API",
)
@classmethod
def load_from_yaml(cls, path: Optional[Path] = None) -> "Settings":
"""Load settings from a YAML configuration file."""
if path is None:
# Look for config in standard locations
search_paths = [
Path("config.yaml"),
Path("config.yml"),
]
# Add platform-specific config directory
if os.name == "nt": # Windows
appdata = os.environ.get("APPDATA", "")
if appdata:
search_paths.append(Path(appdata) / "media-server" / "config.yaml")
else: # Linux/Unix/macOS
search_paths.append(Path.home() / ".config" / "media-server" / "config.yaml")
search_paths.append(Path("/etc/media-server/config.yaml"))
for search_path in search_paths:
if search_path.exists():
path = search_path
break
if path and path.exists():
with open(path, "r", encoding="utf-8") as f:
config_data = yaml.safe_load(f) or {}
return cls(**config_data)
return cls()
def get_config_dir() -> Path:
"""Get the configuration directory path."""
if os.name == "nt": # Windows
config_dir = Path(os.environ.get("APPDATA", "")) / "media-server"
else: # Linux/Unix/macOS
config_dir = Path.home() / ".config" / "media-server"
config_dir.mkdir(parents=True, exist_ok=True)
return config_dir
def generate_default_config(path: Optional[Path] = None) -> Path:
"""Generate a default configuration file with a new API token."""
if path is None:
path = get_config_dir() / "config.yaml"
config = {
"host": "0.0.0.0",
"port": 8765,
"api_token": secrets.token_urlsafe(32),
"poll_interval": 1.0,
"log_level": "INFO",
"scripts": {
"example_script": {
"command": "echo Hello from Media Server!",
"description": "Example script - echoes a message",
"timeout": 10,
"shell": True,
},
# Add your custom scripts here:
# "shutdown": {
# "command": "shutdown /s /t 60",
# "description": "Shutdown computer in 60 seconds",
# "timeout": 5,
# },
# "lock_screen": {
# "command": "rundll32.exe user32.dll,LockWorkStation",
# "description": "Lock the workstation",
# "timeout": 5,
# },
},
}
path.parent.mkdir(parents=True, exist_ok=True)
with open(path, "w", encoding="utf-8") as f:
yaml.dump(config, f, default_flow_style=False, sort_keys=False)
return path
# Global settings instance
settings = Settings.load_from_yaml()

media_server/main.py

@@ -0,0 +1,123 @@
"""Media Server - FastAPI application entry point."""
import argparse
import logging
import sys
from contextlib import asynccontextmanager
import uvicorn
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from .config import settings, generate_default_config, get_config_dir
from .routes import health_router, media_router, scripts_router
from .services import get_media_controller
from .services.websocket_manager import ws_manager
def setup_logging():
"""Configure application logging."""
logging.basicConfig(
level=getattr(logging, settings.log_level.upper()),
format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
handlers=[logging.StreamHandler(sys.stdout)],
)
@asynccontextmanager
async def lifespan(app: FastAPI):
"""Application lifespan handler."""
setup_logging()
logger = logging.getLogger(__name__)
logger.info(f"Media Server starting on {settings.host}:{settings.port}")
logger.info(f"API Token: {settings.api_token[:8]}...")
# Start WebSocket status monitor
controller = get_media_controller()
await ws_manager.start_status_monitor(controller.get_status)
logger.info("WebSocket status monitor started")
yield
# Stop WebSocket status monitor
await ws_manager.stop_status_monitor()
logger.info("Media Server shutting down")
def create_app() -> FastAPI:
"""Create and configure the FastAPI application."""
app = FastAPI(
title="Media Server",
description="REST API for controlling system media playback",
version="1.0.0",
lifespan=lifespan,
)
# Add CORS middleware for cross-origin requests
app.add_middleware(
CORSMiddleware,
allow_origins=["*"],
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
# Register routers
app.include_router(health_router)
app.include_router(media_router)
app.include_router(scripts_router)
return app
app = create_app()
def main():
"""Main entry point for running the server."""
parser = argparse.ArgumentParser(description="Media Server")
parser.add_argument(
"--host",
default=settings.host,
help=f"Host to bind to (default: {settings.host})",
)
parser.add_argument(
"--port",
type=int,
default=settings.port,
help=f"Port to bind to (default: {settings.port})",
)
parser.add_argument(
"--generate-config",
action="store_true",
help="Generate a default configuration file and exit",
)
parser.add_argument(
"--show-token",
action="store_true",
help="Show the current API token and exit",
)
args = parser.parse_args()
if args.generate_config:
config_path = generate_default_config()
print(f"Configuration file generated at: {config_path}")
print("API Token has been saved to the config file.")
return
if args.show_token:
print(f"API Token: {settings.api_token}")
print(f"Config directory: {get_config_dir()}")
return
uvicorn.run(
"media_server.main:app",
host=args.host,
port=args.port,
reload=False,
)
if __name__ == "__main__":
main()


@@ -0,0 +1,17 @@
"""Pydantic models for the media server API."""
from .media import (
MediaState,
MediaStatus,
VolumeRequest,
SeekRequest,
MediaInfo,
)
__all__ = [
"MediaState",
"MediaStatus",
"VolumeRequest",
"SeekRequest",
"MediaInfo",
]


@@ -0,0 +1,61 @@
"""Media-related Pydantic models."""
from enum import Enum
from typing import Optional
from pydantic import BaseModel, Field
class MediaState(str, Enum):
"""Playback state enumeration."""
PLAYING = "playing"
PAUSED = "paused"
STOPPED = "stopped"
IDLE = "idle"
class MediaInfo(BaseModel):
"""Information about the currently playing media."""
title: Optional[str] = Field(None, description="Track/media title")
artist: Optional[str] = Field(None, description="Artist name")
album: Optional[str] = Field(None, description="Album name")
album_art_url: Optional[str] = Field(None, description="URL to album artwork")
duration: Optional[float] = Field(
None, description="Total duration in seconds", ge=0
)
position: Optional[float] = Field(
None, description="Current position in seconds", ge=0
)
class MediaStatus(BaseModel):
"""Complete media playback status."""
state: MediaState = Field(default=MediaState.IDLE, description="Playback state")
title: Optional[str] = Field(None, description="Track/media title")
artist: Optional[str] = Field(None, description="Artist name")
album: Optional[str] = Field(None, description="Album name")
album_art_url: Optional[str] = Field(None, description="URL to album artwork")
duration: Optional[float] = Field(
None, description="Total duration in seconds", ge=0
)
position: Optional[float] = Field(
None, description="Current position in seconds", ge=0
)
volume: int = Field(default=100, description="Volume level (0-100)", ge=0, le=100)
muted: bool = Field(default=False, description="Whether audio is muted")
source: Optional[str] = Field(None, description="Media source/player name")
class VolumeRequest(BaseModel):
"""Request model for setting volume."""
volume: int = Field(..., description="Volume level (0-100)", ge=0, le=100)
class SeekRequest(BaseModel):
"""Request model for seeking to a position."""
position: float = Field(..., description="Position in seconds to seek to", ge=0)


@@ -0,0 +1,7 @@
"""API route modules."""
from .health import router as health_router
from .media import router as media_router
from .scripts import router as scripts_router
__all__ = ["health_router", "media_router", "scripts_router"]


@@ -0,0 +1,22 @@
"""Health check endpoint."""
import platform
from typing import Any
from fastapi import APIRouter
router = APIRouter(prefix="/api", tags=["health"])
@router.get("/health")
async def health_check() -> dict[str, Any]:
"""Health check endpoint - no authentication required.
Returns:
Health status and server information
"""
return {
"status": "healthy",
"platform": platform.system(),
"version": "1.0.0",
}


@@ -0,0 +1,242 @@
"""Media control API endpoints."""
import logging
from fastapi import APIRouter, Depends, HTTPException, Query, WebSocket, WebSocketDisconnect, status
from fastapi.responses import Response
from ..auth import verify_token, verify_token_or_query
from ..config import settings
from ..models import MediaStatus, VolumeRequest, SeekRequest
from ..services import get_media_controller, get_current_album_art
from ..services.websocket_manager import ws_manager
logger = logging.getLogger(__name__)
router = APIRouter(prefix="/api/media", tags=["media"])
@router.get("/status", response_model=MediaStatus)
async def get_media_status(_: str = Depends(verify_token)) -> MediaStatus:
"""Get current media playback status.
Returns:
Current playback state, media info, volume, etc.
"""
controller = get_media_controller()
return await controller.get_status()
@router.post("/play")
async def play(_: str = Depends(verify_token)) -> dict:
"""Resume or start playback.
Returns:
Success status
"""
controller = get_media_controller()
success = await controller.play()
if not success:
raise HTTPException(
status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
detail="Failed to start playback - no active media session",
)
return {"success": True}
@router.post("/pause")
async def pause(_: str = Depends(verify_token)) -> dict:
"""Pause playback.
Returns:
Success status
"""
controller = get_media_controller()
success = await controller.pause()
if not success:
raise HTTPException(
status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
detail="Failed to pause - no active media session",
)
return {"success": True}
@router.post("/stop")
async def stop(_: str = Depends(verify_token)) -> dict:
"""Stop playback.
Returns:
Success status
"""
controller = get_media_controller()
success = await controller.stop()
if not success:
raise HTTPException(
status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
detail="Failed to stop - no active media session",
)
return {"success": True}
@router.post("/next")
async def next_track(_: str = Depends(verify_token)) -> dict:
"""Skip to next track.
Returns:
Success status
"""
controller = get_media_controller()
success = await controller.next_track()
if not success:
raise HTTPException(
status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
detail="Failed to skip - no active media session",
)
return {"success": True}
@router.post("/previous")
async def previous_track(_: str = Depends(verify_token)) -> dict:
"""Go to previous track.
Returns:
Success status
"""
controller = get_media_controller()
success = await controller.previous_track()
if not success:
raise HTTPException(
status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
detail="Failed to go back - no active media session",
)
return {"success": True}
@router.post("/volume")
async def set_volume(
request: VolumeRequest, _: str = Depends(verify_token)
) -> dict:
"""Set the system volume.
Args:
request: Volume level (0-100)
Returns:
Success status with new volume level
"""
controller = get_media_controller()
success = await controller.set_volume(request.volume)
if not success:
raise HTTPException(
status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
detail="Failed to set volume",
)
return {"success": True, "volume": request.volume}
@router.post("/mute")
async def toggle_mute(_: str = Depends(verify_token)) -> dict:
"""Toggle mute state.
Returns:
Success status with new mute state
"""
controller = get_media_controller()
muted = await controller.toggle_mute()
return {"success": True, "muted": muted}
@router.post("/seek")
async def seek(request: SeekRequest, _: str = Depends(verify_token)) -> dict:
"""Seek to a position in the current track.
Args:
request: Position in seconds
Returns:
Success status
"""
controller = get_media_controller()
success = await controller.seek(request.position)
if not success:
raise HTTPException(
status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
detail="Failed to seek - no active media session or seek not supported",
)
return {"success": True, "position": request.position}
@router.get("/artwork")
async def get_artwork(_: str = Depends(verify_token_or_query)) -> Response:
"""Get the current album artwork.
Returns:
The album art image (PNG, JPEG, or WebP, detected from magic bytes)
"""
art_bytes = get_current_album_art()
if art_bytes is None:
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND,
detail="No album artwork available",
)
# Try to detect image type from magic bytes
content_type = "image/png" # Default
if art_bytes[:3] == b"\xff\xd8\xff":
content_type = "image/jpeg"
elif art_bytes[:8] == b"\x89PNG\r\n\x1a\n":
content_type = "image/png"
elif art_bytes[:4] == b"RIFF" and art_bytes[8:12] == b"WEBP":
content_type = "image/webp"
return Response(content=art_bytes, media_type=content_type)
@router.websocket("/ws")
async def websocket_endpoint(
websocket: WebSocket,
token: str = Query(..., description="API authentication token"),
) -> None:
"""WebSocket endpoint for real-time media status updates.
Authentication uses a query parameter because the browser WebSocket API
cannot set custom request headers.
Messages sent to client:
- {"type": "status", "data": {...}} - Initial status on connect
- {"type": "status_update", "data": {...}} - Status changes
- {"type": "error", "message": "..."} - Error messages
Client can send:
- {"type": "ping"} - Keepalive, server responds with {"type": "pong"}
- {"type": "get_status"} - Request current status
"""
# Verify token
if token != settings.api_token:
await websocket.close(code=4001, reason="Invalid authentication token")
return
await ws_manager.connect(websocket)
try:
while True:
# Wait for messages from client (for keepalive/ping)
data = await websocket.receive_json()
if data.get("type") == "ping":
await websocket.send_json({"type": "pong"})
elif data.get("type") == "get_status":
# Allow manual status request
controller = get_media_controller()
status_data = await controller.get_status()
await websocket.send_json({
"type": "status",
"data": status_data.model_dump(),
})
except WebSocketDisconnect:
await ws_manager.disconnect(websocket)
except Exception as e:
logger.error("WebSocket error: %s", e)
await ws_manager.disconnect(websocket)


@@ -0,0 +1,169 @@
"""Script execution API endpoints."""
import asyncio
import logging
import subprocess
from typing import Any
from fastapi import APIRouter, Depends, HTTPException, status
from pydantic import BaseModel, Field
from ..auth import verify_token
from ..config import settings
router = APIRouter(prefix="/api/scripts", tags=["scripts"])
logger = logging.getLogger(__name__)
class ScriptExecuteRequest(BaseModel):
"""Request model for script execution with optional arguments."""
args: list[str] = Field(default_factory=list, description="Additional arguments")
class ScriptExecuteResponse(BaseModel):
"""Response model for script execution."""
success: bool
script: str
exit_code: int | None = None
stdout: str = ""
stderr: str = ""
error: str | None = None
class ScriptInfo(BaseModel):
"""Information about an available script."""
name: str
label: str
description: str
icon: str | None = None
timeout: int
@router.get("/list")
async def list_scripts(_: str = Depends(verify_token)) -> list[ScriptInfo]:
"""List all available scripts.
Returns:
List of available scripts with their descriptions
"""
return [
ScriptInfo(
name=name,
label=config.label or name.replace("_", " ").title(),
description=config.description,
icon=config.icon,
timeout=config.timeout,
)
for name, config in settings.scripts.items()
]
@router.post("/execute/{script_name}")
async def execute_script(
script_name: str,
request: ScriptExecuteRequest | None = None,
_: str = Depends(verify_token),
) -> ScriptExecuteResponse:
"""Execute a pre-defined script by name.
Args:
script_name: Name of the script to execute (must be defined in config)
request: Optional arguments to pass to the script
Returns:
Execution result including stdout, stderr, and exit code
"""
# Check if script exists
if script_name not in settings.scripts:
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND,
detail=f"Script '{script_name}' not found. Use /api/scripts/list to see available scripts.",
)
script_config = settings.scripts[script_name]
args = request.args if request else []
logger.info(f"Executing script: {script_name}")
try:
# Build command
command = script_config.command
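if args:
# Append arguments to command. NOTE: they are joined unquoted, so with shell=True
# any shell metacharacters supplied by the API client are interpreted by the shell.
command = f"{command} {' '.join(args)}"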
# Execute in thread pool to not block
loop = asyncio.get_running_loop()
result = await loop.run_in_executor(
None,
lambda: _run_script(
command=command,
timeout=script_config.timeout,
shell=script_config.shell,
working_dir=script_config.working_dir,
),
)
return ScriptExecuteResponse(
success=result["exit_code"] == 0,
script=script_name,
exit_code=result["exit_code"],
stdout=result["stdout"],
stderr=result["stderr"],
)
except Exception as e:
logger.error(f"Script execution error: {e}")
return ScriptExecuteResponse(
success=False,
script=script_name,
error=str(e),
)
def _run_script(
command: str,
timeout: int,
shell: bool,
working_dir: str | None,
) -> dict[str, Any]:
"""Run a script synchronously.
Args:
command: Command to execute
timeout: Timeout in seconds
shell: Whether to run in shell
working_dir: Working directory
Returns:
Dict with exit_code, stdout, stderr
"""
try:
result = subprocess.run(
command,
shell=shell,
cwd=working_dir,
capture_output=True,
text=True,
timeout=timeout,
)
return {
"exit_code": result.returncode,
"stdout": result.stdout[:10000], # Limit output size
"stderr": result.stderr[:10000],
}
except subprocess.TimeoutExpired:
return {
"exit_code": -1,
"stdout": "",
"stderr": f"Script timed out after {timeout} seconds",
}
except Exception as e:
return {
"exit_code": -1,
"stdout": "",
"stderr": str(e),
}
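
The execute handler above appends request arguments to the configured command with a plain space join. A minimal sketch of one way to quote them first, assuming the command runs under a POSIX shell. The helper name `build_command` is hypothetical and not part of this module, and `shlex` quoting does not target cmd.exe or PowerShell, so a Windows deployment would need a different approach.

```python
import shlex


def build_command(base_command: str, args: list[str]) -> str:
    """Append arguments to a shell command, quoting each for a POSIX shell.

    Hypothetical helper for illustration; not part of the media server code.
    """
    if not args:
        return base_command
    # shlex.join quotes every argument so shell metacharacters stay literal
    return f"{base_command} {shlex.join(args)}"


if __name__ == "__main__":
    print(build_command("echo", ["hello world", "a;b"]))
```

With this shape, an argument such as `a;b` reaches the script as a single literal string instead of terminating the command.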

@@ -0,0 +1,144 @@
#!/bin/bash
# Linux service installation script for Media Server
set -e
SERVICE_NAME="media-server"
INSTALL_DIR="/opt/media-server"
SERVICE_FILE="/etc/systemd/system/${SERVICE_NAME}@.service"
CURRENT_USER=$(whoami)
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color
echo_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
echo_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
echo_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
check_root() {
if [[ $EUID -ne 0 ]]; then
echo_error "This script must be run as root (use sudo)"
exit 1
fi
}
install_dependencies() {
echo_info "Installing system dependencies..."
if command -v apt-get &> /dev/null; then
apt-get update
apt-get install -y python3 python3-pip python3-venv python3-dbus python3-gi libdbus-1-dev libglib2.0-dev
elif command -v dnf &> /dev/null; then
dnf install -y python3 python3-pip python3-dbus python3-gobject dbus-devel glib2-devel
elif command -v pacman &> /dev/null; then
pacman -S --noconfirm python python-pip python-dbus python-gobject
else
echo_warn "Unknown package manager. Please install dependencies manually:"
echo " - python3, python3-pip, python3-venv"
echo " - python3-dbus, python3-gi"
echo " - libdbus-1-dev, libglib2.0-dev"
fi
}
install_service() {
echo_info "Installing Media Server..."
# Create installation directory
mkdir -p "$INSTALL_DIR"
# Copy source files
cp -r "$(dirname "$0")/../"* "$INSTALL_DIR/"
# Create virtual environment
echo_info "Creating Python virtual environment..."
python3 -m venv "$INSTALL_DIR/venv"
# Install Python dependencies
echo_info "Installing Python dependencies..."
"$INSTALL_DIR/venv/bin/pip" install --upgrade pip
"$INSTALL_DIR/venv/bin/pip" install -r "$INSTALL_DIR/requirements.txt"
# Install systemd service file
echo_info "Installing systemd service..."
cp "$INSTALL_DIR/service/media-server.service" "$SERVICE_FILE"
# Reload systemd
systemctl daemon-reload
# Generate config if not exists
if [[ ! -f "/home/$SUDO_USER/.config/media-server/config.yaml" ]]; then
echo_info "Generating configuration file..."
sudo -u "$SUDO_USER" "$INSTALL_DIR/venv/bin/python" -m media_server.main --generate-config
fi
echo_info "Installation complete!"
echo ""
echo "To enable and start the service for user '$SUDO_USER':"
echo " sudo systemctl enable ${SERVICE_NAME}@${SUDO_USER}"
echo " sudo systemctl start ${SERVICE_NAME}@${SUDO_USER}"
echo ""
echo "To view the API token:"
echo " cat ~/.config/media-server/config.yaml"
echo ""
echo "To view logs:"
echo " journalctl -u ${SERVICE_NAME}@${SUDO_USER} -f"
}
uninstall_service() {
echo_info "Uninstalling Media Server..."
# Stop and disable service
systemctl stop "${SERVICE_NAME}@*" 2>/dev/null || true
systemctl disable "${SERVICE_NAME}@*" 2>/dev/null || true
# Remove service file
rm -f "$SERVICE_FILE"
systemctl daemon-reload
# Remove installation directory
rm -rf "$INSTALL_DIR"
echo_info "Uninstallation complete!"
echo "Note: Configuration files in ~/.config/media-server were not removed."
}
show_usage() {
echo "Usage: $0 [install|uninstall|deps]"
echo ""
echo "Commands:"
echo " install Install the Media Server as a systemd service"
echo " uninstall Remove the Media Server service"
echo " deps Install system dependencies only"
}
# Main
case "${1:-}" in
install)
check_root
install_dependencies
install_service
;;
uninstall)
check_root
uninstall_service
;;
deps)
check_root
install_dependencies
;;
*)
show_usage
exit 1
;;
esac

@@ -0,0 +1,10 @@
# Get the project root directory (two levels up from this script)
$projectRoot = (Get-Item $PSScriptRoot).Parent.Parent.FullName
$action = New-ScheduledTaskAction -Execute "python" -Argument "-m media_server.main" -WorkingDirectory $projectRoot
$trigger = New-ScheduledTaskTrigger -AtStartup
$principal = New-ScheduledTaskPrincipal -UserId "$env:USERNAME" -LogonType S4U -RunLevel Highest
$settings = New-ScheduledTaskSettingsSet -AllowStartIfOnBatteries -DontStopIfGoingOnBatteries -StartWhenAvailable
Register-ScheduledTask -TaskName "MediaServer" -Action $action -Trigger $trigger -Principal $principal -Settings $settings -Description "Media Server for Home Assistant"
Write-Host "Scheduled task 'MediaServer' created with working directory: $projectRoot"

@@ -0,0 +1,151 @@
"""Windows service installer for Media Server.
This module allows the media server to be installed as a Windows service
that starts automatically on boot.
Usage:
Install: python -m media_server.service.install_windows install
Start: python -m media_server.service.install_windows start
Stop: python -m media_server.service.install_windows stop
Remove: python -m media_server.service.install_windows remove
Debug: python -m media_server.service.install_windows debug
"""
import os
import sys
import socket
import logging
try:
import win32serviceutil
import win32service
import win32event
import servicemanager
import win32api
WIN32_AVAILABLE = True
except ImportError:
WIN32_AVAILABLE = False
print("pywin32 not installed. Install with: pip install pywin32")
class MediaServerService:
"""Windows service wrapper for the Media Server."""
_svc_name_ = "MediaServer"
_svc_display_name_ = "Media Server"
_svc_description_ = "REST API server for controlling system media playback"
def __init__(self, args=None):
if WIN32_AVAILABLE:
win32serviceutil.ServiceFramework.__init__(self, args)
self.stop_event = win32event.CreateEvent(None, 0, 0, None)
self.is_running = False
self.server = None
def SvcStop(self):
"""Stop the service."""
self.ReportServiceStatus(win32service.SERVICE_STOP_PENDING)
win32event.SetEvent(self.stop_event)
self.is_running = False
if self.server:
self.server.should_exit = True
def SvcDoRun(self):
"""Run the service."""
servicemanager.LogMsg(
servicemanager.EVENTLOG_INFORMATION_TYPE,
servicemanager.PYS_SERVICE_STARTED,
(self._svc_name_, ""),
)
self.is_running = True
self.main()
def main(self):
"""Main service loop."""
import uvicorn
from media_server.main import app
from media_server.config import settings
config = uvicorn.Config(
app,
host=settings.host,
port=settings.port,
log_level=settings.log_level.lower(),
)
self.server = uvicorn.Server(config)
self.server.run()
if WIN32_AVAILABLE:
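# Dynamically inherit from ServiceFramework when available.
# Skip the per-class __dict__/__weakref__ descriptors: they are bound to the
# original class and must not be copied into the new one.
MediaServerService = type(
"MediaServerService",
(win32serviceutil.ServiceFramework,),
{
k: v
for k, v in MediaServerService.__dict__.items()
if k not in ("__dict__", "__weakref__")
},
)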
def install_service():
"""Install the Windows service."""
if not WIN32_AVAILABLE:
print("Error: pywin32 is required for Windows service installation")
print("Install with: pip install pywin32")
return False
try:
# Build the "module.ClassName" string pywin32 uses to locate the service class
class_string = win32serviceutil.GetServiceClassString(MediaServerService)
win32serviceutil.InstallService(
class_string,
MediaServerService._svc_name_,
MediaServerService._svc_display_name_,
startType=win32service.SERVICE_AUTO_START,
description=MediaServerService._svc_description_,
)
print(f"Service '{MediaServerService._svc_display_name_}' installed successfully")
print("Start the service with: sc start MediaServer")
return True
except Exception as e:
print(f"Failed to install service: {e}")
return False
def remove_service():
"""Remove the Windows service."""
if not WIN32_AVAILABLE:
print("Error: pywin32 is required")
return False
try:
win32serviceutil.RemoveService(MediaServerService._svc_name_)
print(f"Service '{MediaServerService._svc_display_name_}' removed successfully")
return True
except Exception as e:
print(f"Failed to remove service: {e}")
return False
def main():
"""Main entry point for service management."""
if not WIN32_AVAILABLE:
print("Error: pywin32 is required for Windows service support")
print("Install with: pip install pywin32")
sys.exit(1)
if len(sys.argv) == 1:
# Running as a service
servicemanager.Initialize()
servicemanager.PrepareToHostSingle(MediaServerService)
servicemanager.StartServiceCtrlDispatcher()
else:
# Command line management
win32serviceutil.HandleCommandLine(MediaServerService)
if __name__ == "__main__":
main()

@@ -0,0 +1,36 @@
[Unit]
Description=Media Server - REST API for controlling system media playback
After=network.target sound.target
Wants=sound.target
[Service]
Type=simple
User=%i
Group=%i
# Environment variables (optional - can also use config file)
# Environment=MEDIA_SERVER_HOST=0.0.0.0
# Environment=MEDIA_SERVER_PORT=8765
# Environment=MEDIA_SERVER_API_TOKEN=your-secret-token
# Working directory
WorkingDirectory=/opt/media-server
# Start command - adjust path to your Python environment
ExecStart=/opt/media-server/venv/bin/python -m media_server.main
# Restart policy
Restart=always
RestartSec=10
# Security hardening
NoNewPrivileges=true
ProtectSystem=strict
ProtectHome=read-only
PrivateTmp=true
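# Required for D-Bus access (MPRIS).
# NOTE: in a system unit, %U expands to the UID of the service manager (0),
# not the User= account. If the session bus is not found, replace %U with
# the numeric UID of the target user.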
Environment=DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/%U/bus
[Install]
WantedBy=multi-user.target

@@ -0,0 +1,75 @@
"""Media controller services."""
import os
import platform
from pathlib import Path
from typing import TYPE_CHECKING
if TYPE_CHECKING:
from .media_controller import MediaController
_controller_instance: "MediaController | None" = None
def _is_android() -> bool:
"""Check if running on Android (e.g., via Termux)."""
# Check for Android-specific paths and environment
android_indicators = [
Path("/system/build.prop").exists(),
Path("/data/data/com.termux").exists(),
"ANDROID_ROOT" in os.environ,
"TERMUX_VERSION" in os.environ,
]
return any(android_indicators)
def get_media_controller() -> "MediaController":
"""Get the platform-specific media controller instance.
Returns:
The media controller for the current platform
Raises:
RuntimeError: If the platform is not supported
"""
global _controller_instance
if _controller_instance is not None:
return _controller_instance
system = platform.system()
if system == "Windows":
from .windows_media import WindowsMediaController
_controller_instance = WindowsMediaController()
elif system == "Linux":
# Check if running on Android
if _is_android():
from .android_media import AndroidMediaController
_controller_instance = AndroidMediaController()
else:
from .linux_media import LinuxMediaController
_controller_instance = LinuxMediaController()
elif system == "Darwin": # macOS
from .macos_media import MacOSMediaController
_controller_instance = MacOSMediaController()
else:
raise RuntimeError(f"Unsupported platform: {system}")
return _controller_instance
def get_current_album_art() -> bytes | None:
"""Get the current album art bytes (Windows only for now)."""
system = platform.system()
if system == "Windows":
from .windows_media import get_current_album_art as _get_art
return _get_art()
return None
__all__ = ["get_media_controller", "get_current_album_art"]

@@ -0,0 +1,232 @@
"""Android media controller using Termux:API.
This controller is designed to run on Android devices using Termux.
It requires the Termux:API app and termux-api package to be installed.
Installation:
1. Install Termux from F-Droid (not Play Store)
2. Install Termux:API from F-Droid
3. In Termux: pkg install termux-api
4. Grant necessary permissions to Termux:API
"""
import asyncio
import json
import logging
import subprocess
from typing import Optional, Any
from ..models import MediaState, MediaStatus
from .media_controller import MediaController
logger = logging.getLogger(__name__)
def _check_termux_api() -> bool:
"""Check if termux-api is available."""
try:
result = subprocess.run(
["which", "termux-media-player"],
capture_output=True,
timeout=5,
)
return result.returncode == 0
except Exception:
return False
TERMUX_API_AVAILABLE = _check_termux_api()
class AndroidMediaController(MediaController):
"""Media controller for Android using Termux:API.
Requires:
- Termux app
- Termux:API app
- termux-api package (pkg install termux-api)
"""
def __init__(self):
if not TERMUX_API_AVAILABLE:
logger.warning(
"Termux:API not available. Install with: pkg install termux-api"
)
def _run_termux_command(
self, command: list[str], timeout: int = 10
) -> Optional[str]:
"""Run a termux-api command and return the output."""
try:
result = subprocess.run(
command,
capture_output=True,
text=True,
timeout=timeout,
)
if result.returncode == 0:
return result.stdout.strip()
logger.error(f"Termux command failed: {result.stderr}")
return None
except subprocess.TimeoutExpired:
logger.error(f"Termux command timed out: {command}")
return None
except Exception as e:
logger.error(f"Termux command error: {e}")
return None
def _send_media_key(self, key: str) -> bool:
"""Send a media key event.
Args:
key: One of: play, pause, play-pause, stop, next, previous
"""
# termux-media-player command
result = self._run_termux_command(["termux-media-player", key])
return result is not None
def _get_media_info(self) -> dict[str, Any]:
"""Get current media playback info using termux-media-player."""
result = self._run_termux_command(["termux-media-player", "info"])
if result:
try:
return json.loads(result)
except json.JSONDecodeError:
pass
return {}
def _get_volume(self) -> tuple[int, bool]:
"""Get current volume using termux-volume."""
result = self._run_termux_command(["termux-volume"])
if result:
try:
volumes = json.loads(result)
# Find music stream
for stream in volumes:
if stream.get("stream") == "music":
volume = stream.get("volume", 0)
max_volume = stream.get("max_volume", 15)
# Convert to 0-100 scale
percent = round((volume / max_volume) * 100) if max_volume > 0 else 0
return percent, False
except (json.JSONDecodeError, KeyError):
pass
return 100, False
def _set_volume_internal(self, volume: int) -> bool:
"""Set volume using termux-volume."""
# termux-volume expects stream name and volume level
# Convert 0-100 to device scale (usually 0-15)
result = self._run_termux_command(["termux-volume"])
if result:
try:
volumes = json.loads(result)
for stream in volumes:
if stream.get("stream") == "music":
max_volume = stream.get("max_volume", 15)
device_volume = round((volume / 100) * max_volume)
# Report failure if the termux-volume call itself fails
result = self._run_termux_command(
["termux-volume", "music", str(device_volume)]
)
return result is not None
except (json.JSONDecodeError, KeyError):
pass
return False
async def get_status(self) -> MediaStatus:
"""Get current media playback status."""
status = MediaStatus()
# Get volume
volume, muted = self._get_volume()
status.volume = volume
status.muted = muted
# Get media info
info = self._get_media_info()
if not info:
status.state = MediaState.IDLE
return status
# Parse playback status
playback_status = info.get("status", "").lower()
if playback_status == "playing":
status.state = MediaState.PLAYING
elif playback_status == "paused":
status.state = MediaState.PAUSED
elif playback_status == "stopped":
status.state = MediaState.STOPPED
else:
status.state = MediaState.IDLE
# Parse track info
status.title = info.get("title") or info.get("Track") or None
status.artist = info.get("artist") or info.get("Artist") or None
status.album = info.get("album") or info.get("Album") or None
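# Duration and position: values above 1000 are assumed to be milliseconds.
# This is a heuristic; a track longer than about 16 minutes that is reported
# in seconds would be misread.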
duration = info.get("duration", 0)
if duration > 1000: # Likely milliseconds
duration = duration / 1000
status.duration = duration if duration > 0 else None
position = info.get("position", info.get("current_position", 0))
if position > 1000: # Likely milliseconds
position = position / 1000
status.position = position if position > 0 else None
status.source = "Android"
return status
async def play(self) -> bool:
"""Resume playback."""
return self._send_media_key("play")
async def pause(self) -> bool:
"""Pause playback."""
return self._send_media_key("pause")
async def stop(self) -> bool:
"""Stop playback."""
return self._send_media_key("stop")
async def next_track(self) -> bool:
"""Skip to next track."""
return self._send_media_key("next")
async def previous_track(self) -> bool:
"""Go to previous track."""
return self._send_media_key("previous")
async def set_volume(self, volume: int) -> bool:
"""Set system volume."""
return self._set_volume_internal(volume)
async def toggle_mute(self) -> bool:
"""Toggle mute state.
Note: Android doesn't have a simple mute toggle via termux-api,
so we set volume to 0 or restore previous volume.
"""
volume, _ = self._get_volume()
if volume > 0:
# Store current volume and mute
self._previous_volume = volume
self._set_volume_internal(0)
return True
else:
# Restore previous volume
prev = getattr(self, "_previous_volume", 50)
self._set_volume_internal(prev)
return False
async def seek(self, position: float) -> bool:
"""Seek to position in seconds.
Note: Seek functionality may be limited depending on the media player.
"""
# termux-media-player doesn't support seek directly
# This is a limitation of the API
logger.warning("Seek not fully supported on Android via Termux:API")
return False
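
Converting between the 0-100 API scale and the device's step scale is sensitive to rounding: truncating in both directions can make a value drift down by one step on each read/write cycle. A minimal sketch of a round-trip-stable conversion follows. The function names are hypothetical, and the 15-step maximum is only the default the controller above assumes for the music stream.

```python
def percent_to_steps(percent: int, max_steps: int) -> int:
    """Map a 0-100 percentage onto a 0..max_steps device scale."""
    percent = max(0, min(100, percent))  # clamp out-of-range input
    return round(percent / 100 * max_steps)


def steps_to_percent(steps: int, max_steps: int) -> int:
    """Map a device step value back onto a 0-100 percentage."""
    if max_steps <= 0:
        return 0
    return round(steps / max_steps * 100)


if __name__ == "__main__":
    # Every device step survives a percent round trip unchanged
    for step in range(16):
        assert percent_to_steps(steps_to_percent(step, 15), 15) == step
    print("round trip stable for 0..15")
```

Rounding to nearest keeps the conversion error below half a step, which is why the round trip is stable here.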

@@ -0,0 +1,295 @@
"""Linux media controller using MPRIS D-Bus interface."""
import asyncio
import logging
import subprocess
from typing import Optional, Any
from ..models import MediaState, MediaStatus
from .media_controller import MediaController
logger = logging.getLogger(__name__)
# Linux-specific imports
try:
import dbus
from dbus.mainloop.glib import DBusGMainLoop
DBUS_AVAILABLE = True
except ImportError:
DBUS_AVAILABLE = False
logger.warning("D-Bus libraries not available")
class LinuxMediaController(MediaController):
"""Media controller for Linux using MPRIS D-Bus interface."""
MPRIS_PATH = "/org/mpris/MediaPlayer2"
MPRIS_INTERFACE = "org.mpris.MediaPlayer2.Player"
MPRIS_PREFIX = "org.mpris.MediaPlayer2."
PROPERTIES_INTERFACE = "org.freedesktop.DBus.Properties"
def __init__(self):
if not DBUS_AVAILABLE:
raise RuntimeError(
"Linux media control requires dbus-python package. "
"Install with: sudo apt-get install python3-dbus"
)
DBusGMainLoop(set_as_default=True)
self._bus = dbus.SessionBus()
def _get_active_player(self) -> Optional[str]:
"""Find an active MPRIS media player on the bus."""
try:
bus_names = self._bus.list_names()
mpris_players = [
name for name in bus_names if name.startswith(self.MPRIS_PREFIX)
]
if not mpris_players:
return None
# Prefer players that are currently playing
for player in mpris_players:
try:
proxy = self._bus.get_object(player, self.MPRIS_PATH)
props = dbus.Interface(proxy, self.PROPERTIES_INTERFACE)
status = props.Get(self.MPRIS_INTERFACE, "PlaybackStatus")
if status == "Playing":
return player
except Exception:
continue
# Return the first available player
return mpris_players[0]
except Exception as e:
logger.error(f"Failed to get active player: {e}")
return None
def _get_player_interface(self, player_name: str):
"""Get the MPRIS player interface."""
proxy = self._bus.get_object(player_name, self.MPRIS_PATH)
return dbus.Interface(proxy, self.MPRIS_INTERFACE)
def _get_properties_interface(self, player_name: str):
"""Get the properties interface for a player."""
proxy = self._bus.get_object(player_name, self.MPRIS_PATH)
return dbus.Interface(proxy, self.PROPERTIES_INTERFACE)
def _get_property(self, player_name: str, property_name: str) -> Any:
"""Get a property from the player."""
try:
props = self._get_properties_interface(player_name)
return props.Get(self.MPRIS_INTERFACE, property_name)
except Exception as e:
logger.debug(f"Failed to get property {property_name}: {e}")
return None
def _get_volume_pulseaudio(self) -> tuple[int, bool]:
"""Get volume using pactl (PulseAudio/PipeWire)."""
try:
# Get default sink volume
result = subprocess.run(
["pactl", "get-sink-volume", "@DEFAULT_SINK@"],
capture_output=True,
text=True,
timeout=5,
)
if result.returncode == 0:
# Parse volume from output like "Volume: front-left: 65536 / 100% / 0.00 dB"
for part in result.stdout.split("/"):
if "%" in part:
volume = int(part.strip().rstrip("%"))
break
else:
volume = 100
else:
volume = 100
# Get mute status
result = subprocess.run(
["pactl", "get-sink-mute", "@DEFAULT_SINK@"],
capture_output=True,
text=True,
timeout=5,
)
muted = "yes" in result.stdout.lower() if result.returncode == 0 else False
return volume, muted
except Exception as e:
logger.error(f"Failed to get volume via pactl: {e}")
return 100, False
def _set_volume_pulseaudio(self, volume: int) -> bool:
"""Set volume using pactl."""
try:
result = subprocess.run(
["pactl", "set-sink-volume", "@DEFAULT_SINK@", f"{volume}%"],
capture_output=True,
timeout=5,
)
return result.returncode == 0
except Exception as e:
logger.error(f"Failed to set volume: {e}")
return False
def _toggle_mute_pulseaudio(self) -> bool:
"""Toggle mute using pactl, returns new mute state."""
try:
result = subprocess.run(
["pactl", "set-sink-mute", "@DEFAULT_SINK@", "toggle"],
capture_output=True,
timeout=5,
)
if result.returncode == 0:
_, muted = self._get_volume_pulseaudio()
return muted
return False
except Exception as e:
logger.error(f"Failed to toggle mute: {e}")
return False
async def get_status(self) -> MediaStatus:
"""Get current media playback status."""
status = MediaStatus()
# Get system volume
volume, muted = self._get_volume_pulseaudio()
status.volume = volume
status.muted = muted
# Get active player
player_name = self._get_active_player()
if player_name is None:
status.state = MediaState.IDLE
return status
# Get playback status
playback_status = self._get_property(player_name, "PlaybackStatus")
if playback_status == "Playing":
status.state = MediaState.PLAYING
elif playback_status == "Paused":
status.state = MediaState.PAUSED
elif playback_status == "Stopped":
status.state = MediaState.STOPPED
else:
status.state = MediaState.IDLE
# Get metadata
metadata = self._get_property(player_name, "Metadata")
if metadata:
status.title = str(metadata.get("xesam:title", "")) or None
artists = metadata.get("xesam:artist", [])
if artists:
status.artist = str(artists[0]) if isinstance(artists, list) else str(artists)
status.album = str(metadata.get("xesam:album", "")) or None
status.album_art_url = str(metadata.get("mpris:artUrl", "")) or None
# Duration in microseconds
length = metadata.get("mpris:length", 0)
if length:
status.duration = int(length) / 1_000_000
# Get position (in microseconds)
position = self._get_property(player_name, "Position")
if position is not None:
status.position = int(position) / 1_000_000
# Get source name
status.source = player_name.replace(self.MPRIS_PREFIX, "")
return status
async def play(self) -> bool:
"""Resume playback."""
player_name = self._get_active_player()
if player_name is None:
return False
try:
player = self._get_player_interface(player_name)
player.Play()
return True
except Exception as e:
logger.error(f"Failed to play: {e}")
return False
async def pause(self) -> bool:
"""Pause playback."""
player_name = self._get_active_player()
if player_name is None:
return False
try:
player = self._get_player_interface(player_name)
player.Pause()
return True
except Exception as e:
logger.error(f"Failed to pause: {e}")
return False
async def stop(self) -> bool:
"""Stop playback."""
player_name = self._get_active_player()
if player_name is None:
return False
try:
player = self._get_player_interface(player_name)
player.Stop()
return True
except Exception as e:
logger.error(f"Failed to stop: {e}")
return False
async def next_track(self) -> bool:
"""Skip to next track."""
player_name = self._get_active_player()
if player_name is None:
return False
try:
player = self._get_player_interface(player_name)
player.Next()
return True
except Exception as e:
logger.error(f"Failed to skip next: {e}")
return False
async def previous_track(self) -> bool:
"""Go to previous track."""
player_name = self._get_active_player()
if player_name is None:
return False
try:
player = self._get_player_interface(player_name)
player.Previous()
return True
except Exception as e:
logger.error(f"Failed to skip previous: {e}")
return False
async def set_volume(self, volume: int) -> bool:
"""Set system volume."""
return self._set_volume_pulseaudio(volume)
async def toggle_mute(self) -> bool:
"""Toggle mute state."""
return self._toggle_mute_pulseaudio()
async def seek(self, position: float) -> bool:
"""Seek to position in seconds."""
player_name = self._get_active_player()
if player_name is None:
return False
try:
player = self._get_player_interface(player_name)
# MPRIS expects position in microseconds
player.SetPosition(
self._get_property(player_name, "Metadata").get("mpris:trackid", "/"),
int(position * 1_000_000),
)
return True
except Exception as e:
logger.error(f"Failed to seek: {e}")
return False

@@ -0,0 +1,296 @@
"""macOS media controller using AppleScript and system commands."""
import asyncio
import logging
import subprocess
import json
from typing import Optional
from ..models import MediaState, MediaStatus
from .media_controller import MediaController
logger = logging.getLogger(__name__)
class MacOSMediaController(MediaController):
"""Media controller for macOS using osascript and system commands."""
def _run_osascript(self, script: str) -> Optional[str]:
"""Run an AppleScript and return the output."""
try:
result = subprocess.run(
["osascript", "-e", script],
capture_output=True,
text=True,
timeout=5,
)
if result.returncode == 0:
return result.stdout.strip()
return None
except Exception as e:
logger.error(f"osascript error: {e}")
return None
def _get_active_app(self) -> Optional[str]:
"""Get the currently active media application."""
# Check common media apps in order of preference
apps = ["Spotify", "Music", "TV", "VLC", "QuickTime Player"]
for app in apps:
script = f'''
tell application "System Events"
if exists (processes where name is "{app}") then
return "{app}"
end if
end tell
return ""
'''
result = self._run_osascript(script)
if result:
return result
return None
def _get_spotify_info(self) -> dict:
"""Get playback info from Spotify."""
script = '''
tell application "Spotify"
if player state is playing then
set currentState to "playing"
else if player state is paused then
set currentState to "paused"
else
set currentState to "stopped"
end if
try
set trackName to name of current track
set artistName to artist of current track
set albumName to album of current track
set trackDuration to duration of current track
set trackPosition to player position
set artUrl to artwork url of current track
on error
set trackName to ""
set artistName to ""
set albumName to ""
set trackDuration to 0
set trackPosition to 0
set artUrl to ""
end try
return currentState & "|" & trackName & "|" & artistName & "|" & albumName & "|" & trackDuration & "|" & trackPosition & "|" & artUrl
end tell
'''
result = self._run_osascript(script)
if result:
parts = result.split("|")
if len(parts) >= 7:
return {
"state": parts[0],
"title": parts[1] or None,
"artist": parts[2] or None,
"album": parts[3] or None,
"duration": float(parts[4]) / 1000 if parts[4] else None, # ms to seconds
"position": float(parts[5]) if parts[5] else None,
"art_url": parts[6] or None,
}
return {}
def _get_music_info(self) -> dict:
"""Get playback info from Apple Music."""
script = '''
tell application "Music"
if player state is playing then
set currentState to "playing"
else if player state is paused then
set currentState to "paused"
else
set currentState to "stopped"
end if
try
set trackName to name of current track
set artistName to artist of current track
set albumName to album of current track
set trackDuration to duration of current track
set trackPosition to player position
on error
set trackName to ""
set artistName to ""
set albumName to ""
set trackDuration to 0
set trackPosition to 0
end try
return currentState & "|" & trackName & "|" & artistName & "|" & albumName & "|" & trackDuration & "|" & trackPosition
end tell
'''
result = self._run_osascript(script)
if result:
parts = result.split("|")
if len(parts) >= 6:
return {
"state": parts[0],
"title": parts[1] or None,
"artist": parts[2] or None,
"album": parts[3] or None,
"duration": float(parts[4]) if parts[4] else None,
"position": float(parts[5]) if parts[5] else None,
}
return {}
def _get_volume(self) -> tuple[int, bool]:
"""Get system volume and mute state."""
try:
# Get volume level
result = self._run_osascript("output volume of (get volume settings)")
volume = int(result) if result else 100
# Get mute state
result = self._run_osascript("output muted of (get volume settings)")
muted = result == "true"
return volume, muted
except Exception as e:
logger.error(f"Failed to get volume: {e}")
return 100, False
async def get_status(self) -> MediaStatus:
"""Get current media playback status."""
status = MediaStatus()
# Get system volume
volume, muted = self._get_volume()
status.volume = volume
status.muted = muted
# Try to get info from active media app
active_app = self._get_active_app()
if active_app is None:
status.state = MediaState.IDLE
return status
status.source = active_app
if active_app == "Spotify":
info = self._get_spotify_info()
elif active_app == "Music":
info = self._get_music_info()
else:
info = {}
if info:
state = info.get("state", "stopped")
if state == "playing":
status.state = MediaState.PLAYING
elif state == "paused":
status.state = MediaState.PAUSED
else:
status.state = MediaState.STOPPED
status.title = info.get("title")
status.artist = info.get("artist")
status.album = info.get("album")
status.duration = info.get("duration")
status.position = info.get("position")
status.album_art_url = info.get("art_url")
else:
status.state = MediaState.IDLE
return status
async def play(self) -> bool:
"""Resume playback using media key simulation."""
# Prefer the active app's own play command
active_app = self._get_active_app()
if active_app == "Spotify":
self._run_osascript('tell application "Spotify" to play')
return True
elif active_app == "Music":
self._run_osascript('tell application "Music" to play')
return True
# Fall back to sending the space key (key code 49), which toggles
# playback in many players when they have focus
result = subprocess.run(
["osascript", "-e", 'tell application "System Events" to key code 49'],
capture_output=True,
)
return result.returncode == 0
async def pause(self) -> bool:
"""Pause playback."""
active_app = self._get_active_app()
if active_app == "Spotify":
self._run_osascript('tell application "Spotify" to pause')
return True
elif active_app == "Music":
self._run_osascript('tell application "Music" to pause')
return True
return False
async def stop(self) -> bool:
"""Stop playback."""
active_app = self._get_active_app()
if active_app == "Spotify":
self._run_osascript('tell application "Spotify" to pause')
return True
elif active_app == "Music":
self._run_osascript('tell application "Music" to stop')
return True
return False
async def next_track(self) -> bool:
"""Skip to next track."""
active_app = self._get_active_app()
if active_app == "Spotify":
self._run_osascript('tell application "Spotify" to next track')
return True
elif active_app == "Music":
self._run_osascript('tell application "Music" to next track')
return True
return False
async def previous_track(self) -> bool:
"""Go to previous track."""
active_app = self._get_active_app()
if active_app == "Spotify":
self._run_osascript('tell application "Spotify" to previous track')
return True
elif active_app == "Music":
self._run_osascript('tell application "Music" to previous track')
return True
return False
async def set_volume(self, volume: int) -> bool:
"""Set system volume."""
result = self._run_osascript(f"set volume output volume {volume}")
return result is not None or True # osascript returns empty on success
async def toggle_mute(self) -> bool:
"""Toggle mute state."""
_, current_mute = self._get_volume()
new_mute = not current_mute
self._run_osascript(f"set volume output muted {str(new_mute).lower()}")
return new_mute
async def seek(self, position: float) -> bool:
"""Seek to position in seconds."""
active_app = self._get_active_app()
if active_app == "Spotify":
self._run_osascript(
f'tell application "Spotify" to set player position to {position}'
)
return True
elif active_app == "Music":
self._run_osascript(
f'tell application "Music" to set player position to {position}'
)
return True
return False
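
The volume and mute methods above interpolate their arguments straight into AppleScript. A minimal sketch of how the command strings could be built and range-checked before being handed to `osascript`. The helper names `build_volume_script` and `build_mute_script` are hypothetical and not part of this commit, and the 0-100 clamp assumes the range documented on the abstract base class.

```python
def build_volume_script(volume: int) -> str:
    """Return the AppleScript command for a volume level clamped to 0-100."""
    clamped = max(0, min(100, int(volume)))
    return f"set volume output volume {clamped}"


def build_mute_script(muted: bool) -> str:
    """Return the AppleScript command that sets the output mute flag."""
    # AppleScript booleans are written in lowercase: true / false
    return f"set volume output muted {str(bool(muted)).lower()}"


if __name__ == "__main__":
    # Out-of-range input is clamped rather than passed through
    print(build_volume_script(150))
    print(build_mute_script(True))
```

Keeping the string construction in pure functions like these would make it testable on any platform, since no `osascript` call is involved.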


@@ -0,0 +1,96 @@
"""Abstract base class for media controllers."""

from abc import ABC, abstractmethod

from ..models import MediaStatus


class MediaController(ABC):
    """Abstract base class for platform-specific media controllers."""

    @abstractmethod
    async def get_status(self) -> MediaStatus:
        """Get the current media playback status.

        Returns:
            MediaStatus with current playback info
        """
        pass

    @abstractmethod
    async def play(self) -> bool:
        """Resume or start playback.

        Returns:
            True if successful, False otherwise
        """
        pass

    @abstractmethod
    async def pause(self) -> bool:
        """Pause playback.

        Returns:
            True if successful, False otherwise
        """
        pass

    @abstractmethod
    async def stop(self) -> bool:
        """Stop playback.

        Returns:
            True if successful, False otherwise
        """
        pass

    @abstractmethod
    async def next_track(self) -> bool:
        """Skip to the next track.

        Returns:
            True if successful, False otherwise
        """
        pass

    @abstractmethod
    async def previous_track(self) -> bool:
        """Go to the previous track.

        Returns:
            True if successful, False otherwise
        """
        pass

    @abstractmethod
    async def set_volume(self, volume: int) -> bool:
        """Set the system volume.

        Args:
            volume: Volume level (0-100)

        Returns:
            True if successful, False otherwise
        """
        pass

    @abstractmethod
    async def toggle_mute(self) -> bool:
        """Toggle the mute state.

        Returns:
            The new mute state (True = muted)
        """
        pass

    @abstractmethod
    async def seek(self, position: float) -> bool:
        """Seek to a position in the current track.

        Args:
            position: Position in seconds

        Returns:
            True if successful, False otherwise
        """
        pass
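
Because every method is marked `@abstractmethod`, a subclass can only be instantiated once it overrides all of them. A minimal sketch of that behaviour, using a reduced stand-in for the base class (`MiniController`, with just two of the methods) and a hypothetical in-memory `FakeController` of the kind one might use in tests. None of these names exist in the commit.

```python
import asyncio
from abc import ABC, abstractmethod


class MiniController(ABC):
    """Reduced stand-in for MediaController (two methods only)."""

    @abstractmethod
    async def play(self) -> bool: ...

    @abstractmethod
    async def set_volume(self, volume: int) -> bool: ...


class FakeController(MiniController):
    """In-memory implementation that records state instead of touching the OS."""

    def __init__(self) -> None:
        self.playing = False
        self.volume = 50

    async def play(self) -> bool:
        self.playing = True
        return True

    async def set_volume(self, volume: int) -> bool:
        # Reject values outside the documented 0-100 range
        if not 0 <= volume <= 100:
            return False
        self.volume = volume
        return True


class Incomplete(MiniController):
    """Overrides only one abstract method, so it cannot be instantiated."""

    async def play(self) -> bool:
        return True


def can_instantiate(cls: type) -> bool:
    """Return True if cls() succeeds, False if the ABC machinery rejects it."""
    try:
        cls()
        return True
    except TypeError:
        return False


if __name__ == "__main__":
    controller = FakeController()
    print(asyncio.run(controller.set_volume(30)), controller.volume)
    print(can_instantiate(Incomplete))
```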


@@ -0,0 +1,189 @@
"""WebSocket connection manager and status broadcaster."""

import asyncio
import logging
import time
from typing import Any, Callable, Coroutine

from fastapi import WebSocket

logger = logging.getLogger(__name__)


class ConnectionManager:
    """Manages WebSocket connections and broadcasts status updates."""

    def __init__(self) -> None:
        """Initialize the connection manager."""
        self._active_connections: set[WebSocket] = set()
        self._lock = asyncio.Lock()
        self._last_status: dict[str, Any] | None = None
        self._broadcast_task: asyncio.Task | None = None
        self._poll_interval: float = 0.5  # Internal poll interval for change detection
        # Intended interval for periodic position updates during playback
        # (currently unused in this module)
        self._position_broadcast_interval: float = 5.0
        self._last_broadcast_time: float = 0.0
        self._running: bool = False

    async def connect(self, websocket: WebSocket) -> None:
        """Accept a new WebSocket connection."""
        await websocket.accept()
        async with self._lock:
            self._active_connections.add(websocket)
        logger.info(
            "WebSocket client connected. Total: %d", len(self._active_connections)
        )

        # Send current status immediately upon connection
        if self._last_status:
            try:
                await websocket.send_json({"type": "status", "data": self._last_status})
            except Exception as e:
                logger.debug("Failed to send initial status: %s", e)

    async def disconnect(self, websocket: WebSocket) -> None:
        """Remove a WebSocket connection."""
        async with self._lock:
            self._active_connections.discard(websocket)
        logger.info(
            "WebSocket client disconnected. Total: %d", len(self._active_connections)
        )

    async def broadcast(self, message: dict[str, Any]) -> None:
        """Broadcast a message to all connected clients."""
        # Copy under the lock, then send without holding it
        async with self._lock:
            connections = list(self._active_connections)

        if not connections:
            return

        disconnected = []
        for websocket in connections:
            try:
                await websocket.send_json(message)
            except Exception as e:
                logger.debug("Failed to send to client: %s", e)
                disconnected.append(websocket)

        # Clean up disconnected clients
        for ws in disconnected:
            await self.disconnect(ws)

    def status_changed(
        self, old: dict[str, Any] | None, new: dict[str, Any]
    ) -> bool:
        """Detect if media status has meaningfully changed.

        Position is NOT included for normal playback (let HA interpolate).
        But seeks (large unexpected jumps) are detected.
        """
        if old is None:
            return True

        # Fields to compare for changes (NO position - let HA interpolate)
        significant_fields = [
            "state",
            "title",
            "artist",
            "album",
            "volume",
            "muted",
            "duration",
            "source",
            "album_art_url",
        ]
        for field in significant_fields:
            if old.get(field) != new.get(field):
                return True

        # Detect seeks - large position jumps that aren't normal playback
        old_pos = old.get("position") or 0
        new_pos = new.get("position") or 0
        pos_diff = new_pos - old_pos

        # During playback, position should increase by ~0.5s (our poll interval)
        # A seek is when position jumps backwards OR forward by more than expected
        if new.get("state") == "playing":
            # Backward jump > 1s or forward jump > 3s indicates a seek
            if pos_diff < -1.0 or pos_diff > 3.0:
                return True
        else:
            # When paused, any significant position change is a seek
            if abs(pos_diff) > 1.0:
                return True

        return False

    async def start_status_monitor(
        self,
        get_status_func: Callable[[], Coroutine[Any, Any, Any]],
    ) -> None:
        """Start the background status monitoring loop."""
        if self._running:
            return
        self._running = True
        self._broadcast_task = asyncio.create_task(
            self._status_monitor_loop(get_status_func)
        )
        logger.info("WebSocket status monitor started")

    async def stop_status_monitor(self) -> None:
        """Stop the background status monitoring loop."""
        self._running = False
        if self._broadcast_task:
            self._broadcast_task.cancel()
            try:
                await self._broadcast_task
            except asyncio.CancelledError:
                pass
        logger.info("WebSocket status monitor stopped")

    async def _status_monitor_loop(
        self,
        get_status_func: Callable[[], Coroutine[Any, Any, Any]],
    ) -> None:
        """Background loop that polls for status changes and broadcasts."""
        while self._running:
            try:
                async with self._lock:
                    has_clients = len(self._active_connections) > 0

                # Always poll, so the cache is fresh when a client connects
                status = await get_status_func()
                status_dict = status.model_dump()

                # Only broadcast on actual state changes, and only when someone
                # is listening. Let HA handle position interpolation during playback
                changed = has_clients and self.status_changed(
                    self._last_status, status_dict
                )
                self._last_status = status_dict

                if changed:
                    self._last_broadcast_time = time.time()
                    await self.broadcast(
                        {"type": "status_update", "data": status_dict}
                    )
                    logger.debug("Broadcast sent: status change")

                await asyncio.sleep(self._poll_interval)
            except asyncio.CancelledError:
                break
            except Exception as e:
                logger.error("Error in status monitor: %s", e)
                await asyncio.sleep(self._poll_interval)

    @property
    def client_count(self) -> int:
        """Return the number of connected clients."""
        return len(self._active_connections)


# Global instance
ws_manager = ConnectionManager()
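
Since the monitor deliberately leaves position out of change detection, a client is expected to estimate the playback position between broadcasts. A minimal sketch of that interpolation, assuming the client records the local time at which each status message arrived. `estimate_position` and its parameters are illustrative names, not part of the server or of any client library.

```python
from typing import Any, Optional


def estimate_position(
    status: dict[str, Any], received_at: float, now: float
) -> Optional[float]:
    """Estimate the current position from the last status message.

    status:      the "data" payload of the last status message
    received_at: local clock reading when that message arrived
    now:         current local clock reading (same clock as received_at)
    """
    position = status.get("position")
    if position is None:
        return None

    # Only advance the position while the player reports "playing"
    if status.get("state") == "playing":
        position += max(0.0, now - received_at)

    # Never report a position past the end of the track
    duration = status.get("duration")
    if duration:
        position = min(position, duration)
    return position


if __name__ == "__main__":
    last = {"state": "playing", "position": 10.0, "duration": 200.0}
    print(estimate_position(last, received_at=100.0, now=102.5))
```

A monotonic clock (for example `time.monotonic()`) is a reasonable choice for `received_at` and `now`, since only the difference between the two readings matters.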


@@ -0,0 +1,596 @@
"""Windows media controller using WinRT APIs."""

import asyncio
import logging
import time as _time
from concurrent.futures import ThreadPoolExecutor
from typing import Any

from ..models import MediaState, MediaStatus
from .media_controller import MediaController

logger = logging.getLogger(__name__)

# Thread pool for WinRT operations (they don't play well with asyncio)
_executor = ThreadPoolExecutor(max_workers=2, thread_name_prefix="winrt")

# Global storage for current album art (as bytes)
_current_album_art_bytes: bytes | None = None

# Global storage for position tracking
_position_cache = {
    "track_id": "",
    "base_position": 0.0,
    "base_time": 0.0,
    "is_playing": False,
    "duration": 0.0,
    "last_smtc_pos": -999,  # Last raw SMTC position seen (-999 = none yet)
}

# Flag to force position to 0 after track skip (until title changes)
_track_skip_pending = {
    "active": False,
    "old_title": "",
    "skip_time": 0.0,
    "grace_until": 0.0,  # After title changes, ignore stale SMTC positions
    "stale_pos": -999,  # The stale SMTC position we're ignoring
}


def get_current_album_art() -> bytes | None:
    """Get the current album art bytes."""
    return _current_album_art_bytes

# Windows-specific imports
try:
    from winsdk.windows.media.control import (
        GlobalSystemMediaTransportControlsSessionManager as MediaManager,
        GlobalSystemMediaTransportControlsSessionPlaybackStatus as PlaybackStatus,
    )

    WINSDK_AVAILABLE = True
except ImportError:
    WINSDK_AVAILABLE = False
    logger.warning("winsdk not available")

# Volume control imports
PYCAW_AVAILABLE = False
_volume_control = None

try:
    from ctypes import cast, POINTER
    from comtypes import CLSCTX_ALL, CoInitialize, CoUninitialize
    from pycaw.pycaw import AudioUtilities, IAudioEndpointVolume

    def _init_volume_control():
        """Initialize volume control interface."""
        global _volume_control
        if _volume_control is not None:
            return _volume_control
        try:
            devices = AudioUtilities.GetSpeakers()
            interface = devices.Activate(IAudioEndpointVolume._iid_, CLSCTX_ALL, None)
            _volume_control = cast(interface, POINTER(IAudioEndpointVolume))
            return _volume_control
        except AttributeError:
            # Try accessing the underlying device
            try:
                devices = AudioUtilities.GetSpeakers()
                if hasattr(devices, '_dev'):
                    interface = devices._dev.Activate(IAudioEndpointVolume._iid_, CLSCTX_ALL, None)
                    _volume_control = cast(interface, POINTER(IAudioEndpointVolume))
                    return _volume_control
            except Exception as e:
                logger.debug(f"Volume control init failed: {e}")
        except Exception as e:
            logger.debug(f"Volume control init error: {e}")
        return None

    PYCAW_AVAILABLE = True
except ImportError as e:
    logger.warning(f"pycaw not available: {e}")

    def _init_volume_control():
        return None

# Media control only needs winsdk; pycaw/comtypes are optional (volume only)
WINDOWS_AVAILABLE = WINSDK_AVAILABLE

def _sync_get_media_status() -> dict[str, Any]:
    """Synchronously get media status (runs in thread pool)."""
    global _current_album_art_bytes

    result = {
        "state": "idle",
        "title": None,
        "artist": None,
        "album": None,
        "duration": None,
        "position": None,
        "source": None,
    }

    try:
        # Create a new event loop for this thread
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
        try:
            # Get media session manager
            manager = loop.run_until_complete(MediaManager.request_async())
            if manager is None:
                return result

            session = _find_best_session(manager, loop)
            if session is None:
                return result

            # Get playback status
            playback_info = session.get_playback_info()
            if playback_info:
                status = playback_info.playback_status
                if status == PlaybackStatus.PLAYING:
                    result["state"] = "playing"
                elif status == PlaybackStatus.PAUSED:
                    result["state"] = "paused"
                elif status == PlaybackStatus.STOPPED:
                    result["state"] = "stopped"

            # Get media properties FIRST (needed for track ID)
            media_props = loop.run_until_complete(
                session.try_get_media_properties_async()
            )
            if media_props:
                result["title"] = media_props.title or None
                result["artist"] = media_props.artist or None
                result["album"] = media_props.album_title or None

            # Get timeline
            timeline = session.get_timeline_properties()
            if timeline:
                try:
                    # end_time and position are datetime.timedelta objects
                    end_time = timeline.end_time
                    position = timeline.position

                    # Get duration
                    if hasattr(end_time, 'total_seconds'):
                        duration = end_time.total_seconds()
                        # Sanity check: duration should be positive and reasonable (< 24 hours)
                        if 0 < duration < 86400:
                            result["duration"] = duration

                    # Get position from SMTC and interpolate for smooth updates
                    if hasattr(position, 'total_seconds'):
                        smtc_pos = position.total_seconds()
                        current_time = _time.time()
                        is_playing = result["state"] == "playing"
                        # title/artist may be None in result, so normalise to ""
                        current_title = result.get("title") or ""
                        current_artist = result.get("artist") or ""

                        # Check if track skip is pending and title changed
                        skip_just_completed = False
                        if _track_skip_pending["active"]:
                            if current_title and current_title != _track_skip_pending["old_title"]:
                                # Title changed - clear the skip flag and start grace period
                                _track_skip_pending["active"] = False
                                _track_skip_pending["old_title"] = ""
                                _track_skip_pending["grace_until"] = current_time + 300.0  # Long grace period
                                _track_skip_pending["stale_pos"] = -999  # Reset stale position tracking
                                skip_just_completed = True
                                # Reset position cache for new track
                                new_track_id = f"{current_title}:{current_artist}:{result.get('duration', 0)}"
                                _position_cache["track_id"] = new_track_id
                                _position_cache["base_position"] = 0.0
                                _position_cache["base_time"] = current_time
                                _position_cache["last_smtc_pos"] = -999  # Force fresh start
                                _position_cache["is_playing"] = is_playing
                                logger.debug(f"Track skip complete, new title: {current_title}, grace until: {_track_skip_pending['grace_until']}")
                            elif current_time - _track_skip_pending["skip_time"] > 5.0:
                                # Timeout after 5 seconds
                                _track_skip_pending["active"] = False
                                logger.debug("Track skip timeout")

                        # Check if we're in grace period (after skip, ignore high SMTC positions)
                        in_grace_period = current_time < _track_skip_pending.get("grace_until", 0)

                        # If track skip is pending or just completed, use cached/reset position
                        if _track_skip_pending["active"]:
                            pos = 0.0
                            _position_cache["base_position"] = 0.0
                            _position_cache["base_time"] = current_time
                            _position_cache["is_playing"] = is_playing
                        elif skip_just_completed:
                            # Just completed skip - interpolate from 0
                            if is_playing:
                                elapsed = current_time - _position_cache["base_time"]
                                pos = elapsed
                            else:
                                pos = 0.0
                        elif in_grace_period:
                            # Grace period after track skip
                            # SMTC position is stale (from old track) and won't update until seek/pause
                            # We interpolate from 0 and only trust SMTC when it changes or reports low value

                            # Calculate interpolated position from start of new track
                            if is_playing:
                                elapsed = current_time - _position_cache.get("base_time", current_time)
                                interpolated_pos = _position_cache.get("base_position", 0.0) + elapsed
                            else:
                                interpolated_pos = _position_cache.get("base_position", 0.0)

                            # Get the stale position we've been tracking
                            stale_pos = _track_skip_pending.get("stale_pos", -999)

                            # Detect if SMTC position changed significantly from the stale value (user seeked)
                            smtc_changed = stale_pos >= 0 and abs(smtc_pos - stale_pos) > 3.0

                            # Trust SMTC if:
                            # 1. It reports a low position (indicating new track started)
                            # 2. It changed from the stale value (user seeked)
                            if smtc_pos < 10.0 or smtc_changed:
                                # SMTC is now trustworthy
                                _position_cache["base_position"] = smtc_pos
                                _position_cache["base_time"] = current_time
                                _position_cache["last_smtc_pos"] = smtc_pos
                                _position_cache["is_playing"] = is_playing
                                pos = smtc_pos
                                _track_skip_pending["grace_until"] = 0
                                _track_skip_pending["stale_pos"] = -999
                                logger.debug(f"Grace period: accepting SMTC pos {smtc_pos} (low={smtc_pos < 10}, changed={smtc_changed})")
                            else:
                                # SMTC is stale - keep interpolating
                                pos = interpolated_pos
                                # Record the stale position for change detection
                                if stale_pos < 0:
                                    _track_skip_pending["stale_pos"] = smtc_pos
                                # Keep grace period active indefinitely while SMTC is stale
                                _track_skip_pending["grace_until"] = current_time + 300.0
                                logger.debug(f"Grace period: SMTC stale ({smtc_pos}), using interpolated {interpolated_pos}")
                        else:
                            # Normal position tracking
                            # Create track ID from title + artist + duration
                            track_id = f"{current_title}:{current_artist}:{result.get('duration', 0)}"

                            # Detect if SMTC position changed (new track, seek, or state change)
                            smtc_pos_changed = abs(smtc_pos - _position_cache.get("last_smtc_pos", -999)) > 0.5
                            track_changed = track_id != _position_cache.get("track_id", "")

                            if smtc_pos_changed or track_changed:
                                # SMTC updated - store new baseline
                                _position_cache["track_id"] = track_id
                                _position_cache["last_smtc_pos"] = smtc_pos
                                _position_cache["base_position"] = smtc_pos
                                _position_cache["base_time"] = current_time
                                _position_cache["is_playing"] = is_playing
                                pos = smtc_pos
                            elif is_playing:
                                # Interpolate position based on elapsed time
                                elapsed = current_time - _position_cache.get("base_time", current_time)
                                pos = _position_cache.get("base_position", smtc_pos) + elapsed
                            else:
                                # Paused - use base position
                                pos = _position_cache.get("base_position", smtc_pos)

                            # Update playing state
                            if _position_cache.get("is_playing") != is_playing:
                                _position_cache["base_position"] = pos if is_playing else _position_cache.get("base_position", smtc_pos)
                                _position_cache["base_time"] = current_time
                                _position_cache["is_playing"] = is_playing

                        # Sanity check: position should be non-negative and <= duration
                        if pos >= 0:
                            if result["duration"] and pos > result["duration"]:
                                result["position"] = result["duration"]
                            else:
                                result["position"] = pos

                    logger.debug(f"Timeline: duration={result['duration']}, position={result['position']}")
                except Exception as e:
                    logger.debug(f"Timeline parse error: {e}")

            # Try to get album art (requires media_props)
            if media_props:
                try:
                    thumbnail = media_props.thumbnail
                    if thumbnail:
                        stream = loop.run_until_complete(thumbnail.open_read_async())
                        if stream:
                            size = stream.size
                            if 0 < size < 10 * 1024 * 1024:  # Max 10MB
                                from winsdk.windows.storage.streams import DataReader

                                reader = DataReader(stream)
                                loop.run_until_complete(reader.load_async(size))
                                buffer = bytearray(size)
                                reader.read_bytes(buffer)
                                reader.close()
                                stream.close()

                                _current_album_art_bytes = bytes(buffer)
                                result["album_art_url"] = "/api/media/artwork"
                except Exception as e:
                    logger.debug(f"Failed to get album art: {e}")

            result["source"] = session.source_app_user_model_id
        finally:
            loop.close()
    except Exception as e:
        logger.error(f"Error getting media status: {e}")

    return result

def _find_best_session(manager, loop):
    """Find the best media session to control."""
    # First try the current session
    session = manager.get_current_session()

    # Log all available sessions for debugging
    sessions = manager.get_sessions()
    if sessions:
        logger.debug(f"Total sessions available: {sessions.size}")
        for i in range(sessions.size):
            s = sessions.get_at(i)
            if s:
                playback_info = s.get_playback_info()
                status_name = "unknown"
                if playback_info:
                    status_name = str(playback_info.playback_status)
                logger.debug(f"  Session {i}: {s.source_app_user_model_id} - status: {status_name}")

    # If no current session, try to find any active session
    if session is None:
        if sessions and sessions.size > 0:
            # Find a playing session, or use the first one
            for i in range(sessions.size):
                s = sessions.get_at(i)
                if s:
                    playback_info = s.get_playback_info()
                    if playback_info and playback_info.playback_status == PlaybackStatus.PLAYING:
                        session = s
                        break
            # If no playing session found, use the first available one
            if session is None and sessions.size > 0:
                session = sessions.get_at(0)

    return session

def _sync_media_command(command: str) -> bool:
    """Synchronously execute a media command (runs in thread pool)."""
    try:
        # Each call gets its own event loop, since it runs on a worker thread
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
        try:
            manager = loop.run_until_complete(MediaManager.request_async())
            if manager is None:
                return False

            session = _find_best_session(manager, loop)
            if session is None:
                return False

            if command == "play":
                return loop.run_until_complete(session.try_play_async())
            elif command == "pause":
                return loop.run_until_complete(session.try_pause_async())
            elif command == "stop":
                return loop.run_until_complete(session.try_stop_async())
            elif command == "next":
                return loop.run_until_complete(session.try_skip_next_async())
            elif command == "previous":
                return loop.run_until_complete(session.try_skip_previous_async())
            # Unknown command
            return False
        finally:
            loop.close()
    except Exception as e:
        logger.error(f"Error executing media command {command}: {e}")
        return False


def _sync_seek(position: float) -> bool:
    """Synchronously seek to position (in seconds; runs in thread pool)."""
    try:
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
        try:
            manager = loop.run_until_complete(MediaManager.request_async())
            if manager is None:
                return False

            session = _find_best_session(manager, loop)
            if session is None:
                return False

            # The WinRT API takes the position in 100-nanosecond ticks
            position_ticks = int(position * 10_000_000)
            return loop.run_until_complete(
                session.try_change_playback_position_async(position_ticks)
            )
        finally:
            loop.close()
    except Exception as e:
        logger.error(f"Error seeking: {e}")
        return False

class WindowsMediaController(MediaController):
    """Media controller for Windows using WinRT and pycaw."""

    def __init__(self):
        if not WINDOWS_AVAILABLE:
            raise RuntimeError(
                "Windows media control requires the winsdk package "
                "(pycaw and comtypes are additionally needed for volume control)"
            )
        self._volume_interface = None
        self._volume_init_attempted = False

    def _get_volume_interface(self):
        """Get the audio endpoint volume interface."""
        if not self._volume_init_attempted:
            self._volume_init_attempted = True
            self._volume_interface = _init_volume_control()
            if self._volume_interface:
                logger.info("Volume control initialized successfully")
            else:
                logger.warning("Volume control not available")
        return self._volume_interface

    async def get_status(self) -> MediaStatus:
        """Get current media playback status."""
        status = MediaStatus()

        # Get volume info (synchronous, fast)
        volume_if = self._get_volume_interface()
        if volume_if:
            try:
                volume_scalar = volume_if.GetMasterVolumeLevelScalar()
                status.volume = int(volume_scalar * 100)
                status.muted = bool(volume_if.GetMute())
            except Exception as e:
                logger.debug(f"Failed to get volume: {e}")

        # Get media info in thread pool (avoids asyncio/WinRT issues)
        try:
            loop = asyncio.get_running_loop()
            media_info = await asyncio.wait_for(
                loop.run_in_executor(_executor, _sync_get_media_status),
                timeout=5.0
            )

            state_map = {
                "playing": MediaState.PLAYING,
                "paused": MediaState.PAUSED,
                "stopped": MediaState.STOPPED,
                "idle": MediaState.IDLE,
            }
            status.state = state_map.get(media_info.get("state", "idle"), MediaState.IDLE)
            status.title = media_info.get("title")
            status.artist = media_info.get("artist")
            status.album = media_info.get("album")
            status.album_art_url = media_info.get("album_art_url")
            status.duration = media_info.get("duration")
            status.position = media_info.get("position")
            status.source = media_info.get("source")
        except asyncio.TimeoutError:
            logger.warning("Media status request timed out")
            status.state = MediaState.IDLE
        except Exception as e:
            logger.error(f"Error getting media status: {e}")
            status.state = MediaState.IDLE

        return status

    async def _run_command(self, command: str) -> bool:
        """Run a media command in the thread pool."""
        try:
            loop = asyncio.get_running_loop()
            return await asyncio.wait_for(
                loop.run_in_executor(_executor, _sync_media_command, command),
                timeout=5.0
            )
        except asyncio.TimeoutError:
            logger.warning(f"Media command {command} timed out")
            return False
        except Exception as e:
            logger.error(f"Error running media command {command}: {e}")
            return False

    async def _skip_track(self, command: str) -> bool:
        """Run "next" or "previous" and flag the skip for position tracking."""
        # Get current title before skipping
        try:
            status = await self.get_status()
            old_title = status.title or ""
        except Exception:
            old_title = ""

        result = await self._run_command(command)
        if result:
            # Set flag to force position to 0 until title changes
            _track_skip_pending["active"] = True
            _track_skip_pending["old_title"] = old_title
            _track_skip_pending["skip_time"] = _time.time()
            logger.debug(f"Track skip initiated, old title: {old_title}")
        return result

    async def play(self) -> bool:
        """Resume playback."""
        return await self._run_command("play")

    async def pause(self) -> bool:
        """Pause playback."""
        return await self._run_command("pause")

    async def stop(self) -> bool:
        """Stop playback."""
        return await self._run_command("stop")

    async def next_track(self) -> bool:
        """Skip to next track."""
        return await self._skip_track("next")

    async def previous_track(self) -> bool:
        """Go to previous track."""
        return await self._skip_track("previous")

    async def set_volume(self, volume: int) -> bool:
        """Set system volume."""
        volume_if = self._get_volume_interface()
        if volume_if is None:
            return False
        try:
            # pycaw expects a scalar in the range 0.0-1.0
            volume_if.SetMasterVolumeLevelScalar(volume / 100.0, None)
            return True
        except Exception as e:
            logger.error(f"Failed to set volume: {e}")
            return False

    async def toggle_mute(self) -> bool:
        """Toggle mute state and return the new state (False if unavailable)."""
        volume_if = self._get_volume_interface()
        if volume_if is None:
            return False
        try:
            current_mute = bool(volume_if.GetMute())
            volume_if.SetMute(not current_mute, None)
            return not current_mute
        except Exception as e:
            logger.error(f"Failed to toggle mute: {e}")
            return False

    async def seek(self, position: float) -> bool:
        """Seek to position in seconds."""
        try:
            loop = asyncio.get_running_loop()
            return await asyncio.wait_for(
                loop.run_in_executor(_executor, _sync_seek, position),
                timeout=5.0
            )
        except asyncio.TimeoutError:
            logger.warning("Seek command timed out")
            return False
        except Exception as e:
            logger.error(f"Failed to seek: {e}")
            return False
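
The normal-tracking branch of `_sync_get_media_status` boils down to two rules: re-baseline whenever the reported position moves, and otherwise extrapolate from the last baseline while playing. A stripped-down sketch of that idea with an injectable clock, leaving out the track-skip and grace-period handling. `PositionTracker` is a hypothetical name and is not how the commit structures the code; the 0.5 s threshold is taken from the code above.

```python
import time
from typing import Callable, Optional


class PositionTracker:
    """Smooths a position source that only updates occasionally."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._base_position = 0.0
        self._base_time = clock()
        self._is_playing = False
        self._last_reported: Optional[float] = None

    def update(self, reported: float, is_playing: bool) -> float:
        """Feed the latest reported position; return the smoothed estimate."""
        now = self._clock()
        moved = (
            self._last_reported is None
            or abs(reported - self._last_reported) > 0.5
        )

        if moved:
            # The source moved (new track, seek): trust it
            position = reported
            self._last_reported = reported
        elif self._is_playing:
            # Source is stale but we were playing: extrapolate from baseline
            position = self._base_position + (now - self._base_time)
        else:
            # Paused: hold the baseline
            position = self._base_position

        # Start a new baseline on any jump or play/pause transition
        if moved or is_playing != self._is_playing:
            self._base_position = position
            self._base_time = now
            self._is_playing = is_playing
        return position


if __name__ == "__main__":
    fake_now = [0.0]
    tracker = PositionTracker(clock=lambda: fake_now[0])
    print(tracker.update(30.0, True))
    fake_now[0] = 2.0
    print(tracker.update(30.0, True))  # source still says 30, estimate advances
```

Passing the clock in as a parameter is a design choice made here for testability; the module above reads `_time.time()` directly.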

28
requirements.txt Normal file

@@ -0,0 +1,28 @@
# Core dependencies
fastapi>=0.109.0
uvicorn[standard]>=0.27.0
pydantic>=2.0
pydantic-settings>=2.0
pyyaml>=6.0

# Windows media control (install on Windows only)
# pip install winsdk pywin32 pycaw comtypes
winsdk>=1.0.0b10; sys_platform == "win32"
pywin32>=306; sys_platform == "win32"
comtypes>=1.2.0; sys_platform == "win32"
pycaw>=20230407; sys_platform == "win32"

# Linux media control (install on Linux only)
# pip install dbus-python PyGObject
# Note: dbus-python requires system dependencies:
# sudo apt-get install libdbus-1-dev libglib2.0-dev python3-gi
# dbus-python>=1.3.2; sys_platform == "linux"
# PyGObject>=3.46.0; sys_platform == "linux"

# macOS media control
# No additional dependencies needed - uses osascript (AppleScript)

# Android media control (via Termux)
# Requires Termux and Termux:API apps from F-Droid
# In Termux: pkg install python termux-api
# No additional pip packages needed
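
The environment markers above mean the Windows packages are only installed on `win32`, and the Linux ones are left to manual installation. A small sketch of how a startup check could report which optional backend modules are missing on the current machine. The `OPTIONAL_BACKENDS` mapping and the `missing_modules` helper are assumptions made for illustration and do not exist in the commit; `dbus` and `gi` are the import names of dbus-python and PyGObject.

```python
import importlib.util
import sys

# Hypothetical mapping from sys.platform to the modules each backend imports
OPTIONAL_BACKENDS = {
    "win32": ["winsdk", "pycaw", "comtypes"],
    "linux": ["dbus", "gi"],
    "darwin": [],  # uses osascript, no extra Python packages
}


def missing_modules(platform: str = sys.platform) -> list[str]:
    """Return the backend modules for `platform` that cannot be imported."""
    required = OPTIONAL_BACKENDS.get(platform, [])
    # find_spec returns None when a top-level module is not installed
    return [name for name in required if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing optional packages:", ", ".join(missing))
    else:
        print("All backend modules for this platform are importable")
```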