fix: production-readiness hardening — security, perf, a11y, observability
Lint & Test / test (push) Successful in 20s
Security
- Default scripts_management, callbacks_management, links_management, and media_folders_management to False so a leaked token cannot escalate to RCE through admin CRUD endpoints.
- TokenSpec + scope hierarchy (read | control | admin); legacy bare-string api_tokens entries promote to admin for back-compat. Management endpoints now require admin scope.
- WebSocket subprotocol auth (Sec-WebSocket-Protocol: media-server.token.<T>) preferred over ?token= query so the token no longer lands in URL/history/Referer; query fallback retained for HA integration back-compat.
- Origin allow-list check on the WS endpoint (CSWSH defence).
- In-process token-bucket rate limiter: 5/min for failed auths, 10/min for /api/scripts/execute and /api/callbacks/execute.
- shell=False subprocess path (shlex.split) + per-parameter regex `pattern` in ScriptParameterConfig to harden shell=true scripts against parameter injection (Windows cmd.exe env-var expansion).
- CSP gains form-action, worker-src, manifest-src directives.
- Refuse cors_origins=["*"] at startup; strip token=... from uvicorn access logs; validate Gitea release tag against strict SemVer regex.
- noopener noreferrer + no-referrer referrerpolicy on every outbound link.
- icacls hardening of config.yaml on Windows (current user + SYSTEM + Administrators only); 0600 still enforced on POSIX.
- WS volume handler clamps input and never drops the socket on bad messages.

Performance
- Album-art read in windows_media gated by track key; it was decoding the WinRT thumbnail twice per second regardless of track changes.
- /api/media/artwork returns content-derived ETag + Cache-Control so the browser sends If-None-Match and gets 304s on track repeats.
- Foreground-service ctypes argtypes hoisted to one-time module init (was re-declaring ~14 prototypes per probe).
- display_service _static_cache keyed by (edid_hash, ...) tuple with eviction of disappeared monitors; fixes stale capabilities on hot-plug swaps where the new topology has the same monitor count.
- Visualizer rAF loop paused on document.hidden, resumed on visible.

Reliability / bug fixes
- Lifespan rewritten as try/yield/finally so a partial-startup failure cannot orphan background tasks or executors.
- _run_callback in routes/media.py keeps a strong task ref (GC-safe) and uses the dedicated callback executor instead of the default pool.
- macos_media.set_volume() no longer always returns True.
- TrayManager._restart_requested initialised in __init__; set before signalling exit so the main thread observes it correctly.
- Missing static_dir now logs a WARNING instead of silent UI disable.

UX / accessibility / PWA
- manifest.json theme_color and background_color match the Studio Reference base (#0E0D0B); added id and scope for PWA installability.
- ARIA on mini-player icon buttons; inner SVGs marked aria-hidden.
- OS mediaSession API wired so headset / lockscreen / Bluetooth buttons drive play/pause/next/prev/seek and show track metadata + artwork.

Observability
- X-Request-ID middleware (accept upstream id if it matches a safe regex, otherwise UUID4); request_id_var added to ContextVars and included in every log line alongside the token label.
- Audit log (append-only JSONL) for every script + callback execution, including the on_play/on_pause/etc. event callbacks. Background-thread writer; queue capped; flushed in lifespan teardown.

Deployment
- proxy_headers + forwarded_allow_ips plumbed through Settings → uvicorn.Config for reverse-proxy installs.
- HTTPS support via ssl_certfile + ssl_keyfile (+ optional password); startup refuses to launch with only one of the pair set.
- Thumbnail cache moved from project-root .cache to %LOCALAPPDATA%/media-server/cache (Windows) and $XDG_CACHE_HOME/media-server/thumbnails (POSIX).

Tests
- 35 new tests across auth scopes, rate limiter, browser path traversal (../ NUL UNC absolute), script-param validation incl. regex, Gitea tag whitelist, config atomic write + POSIX perms. 47 passed / 4 skipped.
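
The WebSocket subprotocol auth described under Security is not part of the diff shown on this page, so the following is only a minimal sketch of how the token could be pulled out of the handshake. The `media-server.token.` prefix comes from the commit message; the function name `extract_ws_token` and its parameters are hypothetical.

```python
from typing import Optional

# Prefix taken from the commit message; everything else is illustrative.
SUBPROTOCOL_PREFIX = "media-server.token."


def extract_ws_token(
    protocol_header: Optional[str], query_token: Optional[str]
) -> Optional[str]:
    """Prefer the Sec-WebSocket-Protocol token, else fall back to ?token=."""
    if protocol_header:
        # The header carries a comma-separated list of offered subprotocols.
        for offered in protocol_header.split(","):
            offered = offered.strip()
            if offered.startswith(SUBPROTOCOL_PREFIX):
                token = offered[len(SUBPROTOCOL_PREFIX):]
                if token:
                    return token
    # Legacy fallback kept for back-compat; this path still exposes the
    # token in the URL, which is why the header is tried first.
    return query_token or None
```

A server that accepts the header variant also has to echo one of the offered subprotocols back in the handshake response, otherwise browsers close the connection.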
+247
-90
@@ -15,7 +15,7 @@ from fastapi.responses import FileResponse
 from fastapi.staticfiles import StaticFiles
 
 from . import __version__
-from .auth import get_token_label, token_label_var
+from .auth import get_token_label, request_id_var, token_label_var
 from .config import generate_default_config, get_config_dir, settings
 from .routes import (
     audio_router,
@@ -33,10 +33,34 @@ from .services.websocket_manager import ws_manager
 
 
 class TokenLabelFilter(logging.Filter):
-    """Add token label to log records."""
+    """Add token label + request_id to log records."""
 
     def filter(self, record):
         record.token_label = token_label_var.get("unknown")
+        record.request_id = request_id_var.get("-")
         return True
 
 
+class _StripTokenQueryFilter(logging.Filter):
+    """Strip `token=...` from query strings before they hit the access log.
+
+    uvicorn's default access log format includes the full request line, so
+    `/api/media/artwork?token=SECRET` would otherwise be persisted verbatim
+    in stdout/journald/file sinks.
+    """
+
+    import re as _re
+
+    _TOKEN_RE = _re.compile(r"([?&])token=[^&\s\"']+")
+
+    def filter(self, record):  # type: ignore[override]
+        if isinstance(record.args, tuple):
+            record.args = tuple(
+                self._TOKEN_RE.sub(r"\1token=REDACTED", a) if isinstance(a, str) else a
+                for a in record.args
+            )
+        if isinstance(record.msg, str) and "token=" in record.msg:
+            record.msg = self._TOKEN_RE.sub(r"\1token=REDACTED", record.msg)
+        return True
+
+
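
To see what the redaction filter in this hunk does to an access-log record, here is a small standalone sketch. The regex and replacement string are copied from the diff; the names `redact` and `RedactingFilter`, and the hand-built log record, are illustrative only.

```python
import logging
import re

# Same pattern as _StripTokenQueryFilter._TOKEN_RE in the diff.
TOKEN_RE = re.compile(r"([?&])token=[^&\s\"']+")


def redact(text: str) -> str:
    """Replace the value of a token= query parameter, keeping the ? or &."""
    return TOKEN_RE.sub(r"\1token=REDACTED", text)


class RedactingFilter(logging.Filter):
    """Hypothetical stand-in for the filter added in this commit."""

    def filter(self, record: logging.LogRecord) -> bool:
        if isinstance(record.args, tuple):
            record.args = tuple(
                redact(a) if isinstance(a, str) else a for a in record.args
            )
        return True


# An access-log style record: the path arrives as a format argument.
record = logging.LogRecord(
    name="uvicorn.access",
    level=logging.INFO,
    pathname="demo",
    lineno=0,
    msg='%s - "%s %s HTTP/%s" %d',
    args=("127.0.0.1:5000", "GET", "/api/media/artwork?token=SECRET&size=64", "1.1", 200),
    exc_info=None,
)
RedactingFilter().filter(record)
message = record.getMessage()
```

Because the pattern requires a leading `?` or `&`, a parameter such as `mytoken=` is left alone, and other query parameters after the token survive unchanged.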
@@ -49,17 +73,34 @@ def setup_logging():
 
     logging.basicConfig(
         level=getattr(logging, settings.log_level.upper()),
-        format="%(asctime)s - %(name)s - [%(token_label)s] - %(levelname)s - %(message)s",
+        format=(
+            "%(asctime)s - %(name)s - [%(token_label)s] [%(request_id)s]"
+            " - %(levelname)s - %(message)s"
+        ),
         handlers=[handler],
     )
 
     # Suppress noisy third-party loggers
     logging.getLogger("screen_brightness_control").setLevel(logging.ERROR)
 
+    # Make sure the uvicorn access log never persists tokens leaked into the
+    # query string (the artwork + WS endpoints accept `?token=` for browser
+    # compatibility — see verify_token_or_query).
+    strip_filter = _StripTokenQueryFilter()
+    for name in ("uvicorn.access", "uvicorn"):
+        logging.getLogger(name).addFilter(strip_filter)
+
 
 @asynccontextmanager
 async def lifespan(app: FastAPI):
-    """Application lifespan handler."""
+    """Application lifespan handler.
+
+    All long-lived resources started during startup are kept in local refs and
+    torn down in a `finally:` so a partial-startup failure cannot orphan tasks
+    or thread pools.
+    """
+    import asyncio
+
     setup_logging()
     logger = logging.getLogger(__name__)
     logger.info(f"Media Server starting on {settings.host}:{settings.port}")
@@ -71,92 +112,125 @@ async def lifespan(app: FastAPI):
     else:
         logger.warning("No API tokens configured — authentication is DISABLED")
 
-    # Start WebSocket status monitor
-    controller = get_media_controller()
-    await ws_manager.start_status_monitor(controller.get_status)
-    logger.info("WebSocket status monitor started")
-
-    # Start update checker
     update_checker = None
-    if settings.update_check_enabled:
-        from .services.gitea_release_provider import GiteaReleaseProvider
-        from .services.update_checker import UpdateChecker
-
-        provider = GiteaReleaseProvider()
-        update_checker = UpdateChecker(provider, __version__)
-        await update_checker.start(settings.update_check_interval)
-        # Store globally so health endpoint can access cached result
-        app.state.update_checker = update_checker
-
-    # Schedule periodic thumbnail cache cleanup so the 500 MB cap is actually
-    # enforced. Runs once at startup and then hourly until shutdown.
-    from .services.thumbnail_service import ThumbnailService
-
-    async def _thumbnail_cleanup_loop() -> None:
-        while True:
-            try:
-                await asyncio.to_thread(ThumbnailService.cleanup_cache)
-            except Exception as e:
-                logger.warning("Thumbnail cache cleanup failed: %s", e)
-            try:
-                await asyncio.sleep(3600)
-            except asyncio.CancelledError:
-                break
-
-    import asyncio
-    cleanup_task = asyncio.create_task(_thumbnail_cleanup_loop())
-
-    # Register audio visualizer (capture starts on-demand when clients subscribe)
+    cleanup_task: asyncio.Task | None = None
     analyzer = None
-    if settings.visualizer_enabled:
-        from .services.audio_analyzer import get_audio_analyzer
+    status_monitor_started = False
 
-        analyzer = get_audio_analyzer(
-            num_bins=settings.visualizer_bins,
-            target_fps=settings.visualizer_fps,
-            device_name=settings.visualizer_device,
-        )
-        if analyzer.available:
-            await ws_manager.start_audio_monitor(analyzer)
-            logger.info("Audio visualizer available (capture on-demand)")
-        else:
-            logger.info("Audio visualizer unavailable (install soundcard + numpy)")
-
-    yield
-
-    # Stop update checker
-    if update_checker is not None:
-        await update_checker.stop()
-
-    # Cancel periodic thumbnail cleanup
-    cleanup_task.cancel()
     try:
-        await cleanup_task
-    except asyncio.CancelledError:
-        pass
+        # Start WebSocket status monitor
+        controller = get_media_controller()
+        await ws_manager.start_status_monitor(controller.get_status)
+        status_monitor_started = True
+        logger.info("WebSocket status monitor started")
 
-    # Stop audio visualizer
-    await ws_manager.stop_audio_monitor()
-    if analyzer and analyzer.running:
-        analyzer.stop()
+        # Start update checker
+        if settings.update_check_enabled:
+            from .services.gitea_release_provider import GiteaReleaseProvider
+            from .services.update_checker import UpdateChecker
 
-    # Stop WebSocket status monitor
-    await ws_manager.stop_status_monitor()
+            provider = GiteaReleaseProvider()
+            update_checker = UpdateChecker(provider, __version__)
+            await update_checker.start(settings.update_check_interval)
+            # Store globally so health endpoint can access cached result
+            app.state.update_checker = update_checker
 
-    # Shut down dedicated thread pools so pending scripts don't leak threads
-    from .routes.callbacks import shutdown_callback_executor
-    from .routes.scripts import shutdown_script_executor
+        # Schedule periodic thumbnail cache cleanup so the 500 MB cap is actually
+        # enforced. Runs once at startup and then hourly until shutdown.
+        from .services.thumbnail_service import ThumbnailService
 
-    shutdown_script_executor()
-    shutdown_callback_executor()
+        async def _thumbnail_cleanup_loop() -> None:
+            while True:
+                try:
+                    await asyncio.to_thread(ThumbnailService.cleanup_cache)
+                except Exception as e:
+                    logger.warning("Thumbnail cache cleanup failed: %s", e)
+                try:
+                    await asyncio.sleep(3600)
+                except asyncio.CancelledError:
+                    break
 
-    # Clean up platform-specific resources
-    import platform as _platform
-    if _platform.system() == "Windows":
-        from .services.windows_media import shutdown_executor
-        shutdown_executor()
+        cleanup_task = asyncio.create_task(_thumbnail_cleanup_loop())
 
-    logger.info("Media Server shutting down")
+        # Register audio visualizer (capture starts on-demand when clients subscribe)
+        if settings.visualizer_enabled:
+            from .services.audio_analyzer import get_audio_analyzer
+
+            analyzer = get_audio_analyzer(
+                num_bins=settings.visualizer_bins,
+                target_fps=settings.visualizer_fps,
+                device_name=settings.visualizer_device,
+            )
+            if analyzer.available:
+                await ws_manager.start_audio_monitor(analyzer)
+                logger.info("Audio visualizer available (capture on-demand)")
+            else:
+                logger.info("Audio visualizer unavailable (install soundcard + numpy)")
+
+        yield
+    finally:
+        # Stop update checker
+        if update_checker is not None:
+            try:
+                await update_checker.stop()
+            except Exception:
+                logger.exception("Error stopping update checker")
+
+        # Cancel periodic thumbnail cleanup
+        if cleanup_task is not None:
+            cleanup_task.cancel()
+            try:
+                await cleanup_task
+            except asyncio.CancelledError:
+                pass
+            except Exception:
+                logger.exception("Error awaiting thumbnail cleanup task")
+
+        # Stop audio visualizer
+        try:
+            await ws_manager.stop_audio_monitor()
+        except Exception:
+            logger.exception("Error stopping audio monitor")
+        if analyzer and analyzer.running:
+            try:
+                analyzer.stop()
+            except Exception:
+                logger.exception("Error stopping audio analyzer")
+
+        # Stop WebSocket status monitor
+        if status_monitor_started:
+            try:
+                await ws_manager.stop_status_monitor()
+            except Exception:
+                logger.exception("Error stopping status monitor")
+
+        # Shut down dedicated thread pools so pending scripts don't leak threads
+        try:
+            from .routes.callbacks import shutdown_callback_executor
+            from .routes.scripts import shutdown_script_executor
+
+            shutdown_script_executor()
+            shutdown_callback_executor()
+        except Exception:
+            logger.exception("Error shutting down script/callback executors")
+
+        # Flush audit log writer
+        try:
+            from .services.audit_log import shutdown_audit_log
+            shutdown_audit_log()
+        except Exception:
+            logger.exception("Error flushing audit log")
+
+        # Clean up platform-specific resources
+        import platform as _platform
+        if _platform.system() == "Windows":
+            try:
+                from .services.windows_media import shutdown_executor
+                shutdown_executor()
+            except Exception:
+                logger.exception("Error shutting down windows_media executor")
+
+        logger.info("Media Server shutting down")
 
 
 def create_app() -> FastAPI:
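
The point of the try/yield/finally rewrite in this hunk can be reproduced in isolation. The sketch below uses only `contextlib` and `asyncio`. The `_ticker` task and the `fail_during_startup` flag are made up for the demonstration and stand in for the real background tasks and a later startup step that raises.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import Optional

events = []  # records what happened, in order


async def _ticker() -> None:
    """Stand-in for a long-lived background task."""
    while True:
        await asyncio.sleep(3600)


@asynccontextmanager
async def lifespan(fail_during_startup: bool):
    # Resources are declared before the try so the finally block can
    # tell which of them were actually started.
    task: Optional[asyncio.Task] = None
    try:
        task = asyncio.create_task(_ticker())
        events.append("task started")
        if fail_during_startup:
            raise RuntimeError("later startup step failed")
        yield
    finally:
        if task is not None:
            task.cancel()
            try:
                await task
            except asyncio.CancelledError:
                pass
            events.append("task cancelled")


async def main() -> None:
    try:
        async with lifespan(fail_during_startup=True):
            events.append("serving")  # not reached in this run
    except RuntimeError:
        events.append("startup error propagated")


asyncio.run(main())
```

With the older layout (teardown code placed after a bare `yield`), the exception raised during startup would skip the cancellation entirely and leave the task pending.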
@@ -173,7 +247,15 @@ def create_app() -> FastAPI:
 
     # CORS — restrict to same-origin by default; users that integrate the API
     # from another origin (e.g. Home Assistant on a different host) can set
-    # cors_origins in config.yaml.
+    # cors_origins in config.yaml. Refuse "*" outright: combined with the
+    # admin endpoints this would let any origin in the universe run
+    # arbitrary shell. If users genuinely need every origin, they can list
+    # them explicitly.
+    if any(o.strip() == "*" for o in settings.cors_origins):
+        raise RuntimeError(
+            "cors_origins must not contain '*' — list exact origins instead. "
+            "This protects the script-execution endpoints from any-origin abuse."
+        )
     cors_origins = settings.cors_origins or [
         f"http://localhost:{settings.port}",
         f"http://127.0.0.1:{settings.port}",
@@ -186,6 +268,23 @@ def create_app() -> FastAPI:
         allow_headers=["Authorization", "Content-Type"],
     )
 
+    # Request correlation ID — accept upstream X-Request-ID if it's a sane
+    # ASCII id, otherwise mint a fresh UUID4. Emitted on the response so
+    # clients can quote it back in bug reports.
+    import re
+    import uuid as _uuid
+
+    _REQ_ID_RE = re.compile(r"^[A-Za-z0-9._\-]{1,128}$")
+
+    @app.middleware("http")
+    async def request_id_middleware(request: Request, call_next):
+        incoming = request.headers.get("x-request-id", "")
+        req_id = incoming if _REQ_ID_RE.match(incoming) else _uuid.uuid4().hex[:16]
+        request_id_var.set(req_id)
+        response = await call_next(request)
+        response.headers["X-Request-ID"] = req_id
+        return response
+
     # Security headers — strict CSP for the bundled UI, disallow framing, hide referrer.
     @app.middleware("http")
     async def security_headers_middleware(request: Request, call_next):
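
The accept-or-mint rule in the request-ID middleware is easy to exercise on its own. This sketch copies the regex and the `uuid4().hex[:16]` fallback from the hunk above; the helper name `pick_request_id` is hypothetical.

```python
import re
import uuid

# Same pattern as _REQ_ID_RE in the diff: 1 to 128 safe ASCII characters.
REQ_ID_RE = re.compile(r"^[A-Za-z0-9._\-]{1,128}$")


def pick_request_id(incoming: str) -> str:
    """Reuse a well-formed upstream ID, otherwise mint a fresh one."""
    if REQ_ID_RE.match(incoming):
        return incoming
    # As in the diff, the fallback is the first 16 hex characters of a
    # UUID4, not the full 36-character canonical form.
    return uuid.uuid4().hex[:16]


reused = pick_request_id("gateway-7f3a.42")
minted = pick_request_id("contains spaces; rejected")
```

Rejecting anything outside the allow-list keeps attacker-chosen header values (control characters, very long strings) out of the log lines that embed the ID.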
@@ -200,6 +299,9 @@ def create_app() -> FastAPI:
                 "style-src 'self' 'unsafe-inline'; "
                 "font-src 'self' data:; "
                 "frame-ancestors 'none'; "
+                "form-action 'self'; "
+                "worker-src 'self'; "
+                "manifest-src 'self'; "
                 "base-uri 'self'"
             ),
         )
@@ -208,32 +310,63 @@ def create_app() -> FastAPI:
         response.headers.setdefault("Referrer-Policy", "no-referrer")
         return response
 
-    # Add token logging middleware
+    # Add token logging middleware + auth-failure rate limit
+    from fastapi.responses import JSONResponse
+
+    from .services.rate_limit import check as ratelimit_check
+    from .services.rate_limit import get_peer
+
     @app.middleware("http")
     async def token_logging_middleware(request: Request, call_next):
-        """Extract token label and set in context for logging."""
+        """Extract token label, set in context, and rate-limit failed auths."""
         if not settings.api_tokens:
             token_label_var.set("anonymous")
         else:
             token_label = "unknown"
+            token_present = False
+            token_valid = False
 
             # Try Authorization header
             auth_header = request.headers.get("authorization", "")
             if auth_header.startswith("Bearer "):
+                token_present = True
                 token = auth_header[7:]
                 label = get_token_label(token)
                 if label:
                     token_label = label
+                    token_valid = True
 
             # Try query parameter (for artwork endpoint)
             elif "token" in request.query_params:
+                token_present = True
                 token = request.query_params["token"]
                 label = get_token_label(token)
                 if label:
                     token_label = label
+                    token_valid = True
 
             token_label_var.set(token_label)
 
+            # Brute-force gate: a peer that produces a wrong/missing token gets
+            # 5 failures per minute before being throttled. Static-asset
+            # requests (GET /static/*, /, /sw.js) and the docs endpoint are
+            # exempt — they're served unauthenticated by design.
+            if token_present and not token_valid:
+                path = request.url.path
+                if not (
+                    path == "/" or path == "/sw.js"
+                    or path.startswith("/static/")
+                    or path.startswith("/docs") or path.startswith("/openapi")
+                    or path.startswith("/redoc")
+                ):
+                    allowed, retry_after = ratelimit_check("auth", get_peer(request))
+                    if not allowed:
+                        return JSONResponse(
+                            status_code=429,
+                            content={"detail": "Too many authentication failures"},
+                            headers={"Retry-After": str(int(retry_after or 60))},
+                        )
+
         response = await call_next(request)
         return response
 
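
The `services/rate_limit` module that this middleware calls is not included in the diff on this page. As a rough illustration of an in-process token bucket with the same `(allowed, retry_after)` return shape, here is a sketch under stated assumptions: the class name `TokenBucketLimiter`, its constructor parameters, and the injectable clock are all hypothetical, and the real module may behave differently.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TokenBucketLimiter:
    """Illustrative token bucket keyed by (bucket name, peer address)."""

    def __init__(
        self,
        capacity: int,
        per_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.capacity = capacity
        self.refill_rate = capacity / per_seconds  # tokens per second
        self.clock = clock
        # (bucket, peer) -> (tokens remaining, time of last update)
        self._state: Dict[Tuple[str, str], Tuple[float, float]] = {}

    def check(self, bucket: str, peer: str) -> Tuple[bool, Optional[float]]:
        """Consume one token; return (allowed, seconds until next token)."""
        now = self.clock()
        tokens, last = self._state.get((bucket, peer), (float(self.capacity), now))
        # Refill in proportion to elapsed time, never above capacity.
        tokens = min(float(self.capacity), tokens + (now - last) * self.refill_rate)
        if tokens >= 1.0:
            self._state[(bucket, peer)] = (tokens - 1.0, now)
            return True, None
        self._state[(bucket, peer)] = (tokens, now)
        return False, (1.0 - tokens) / self.refill_rate


# 5 events per minute, matching the failed-auth budget in the commit message.
auth_limiter = TokenBucketLimiter(capacity=5, per_seconds=60.0)
```

Passing a fake clock makes the limiter deterministic in tests. A long-running server would also need to evict idle peers so the state dictionary does not grow without bound, which this sketch omits.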
@@ -266,6 +399,11 @@ def create_app() -> FastAPI:
         async def serve_ui():
             """Serve the Web UI."""
             return FileResponse(static_dir / "index.html")
+    else:
+        logging.getLogger(__name__).warning(
+            "static_dir not found at %s — Web UI disabled (API only)",
+            static_dir,
+        )
 
     return app
 
@@ -316,8 +454,9 @@ def main():
         print(f"Config directory: {get_config_dir()}")
         if settings.api_tokens:
             print("\nAPI Tokens:")
-            for label, token in settings.api_tokens.items():
-                print(f" {label:20} {token}")
+            for label, spec in settings.api_tokens.items():
+                scope_str = ",".join(spec.scopes)
+                print(f" {label:20} {spec.token} [scopes: {scope_str}]")
         else:
             print("\nAuthentication is DISABLED (no tokens configured)")
         return
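
The hunk above only shows that each `api_tokens` value now exposes `.token` and `.scopes`; the TokenSpec definition itself is not on this page. The following sketch shows one way the read | control | admin hierarchy and the legacy-string promotion from the commit message could fit together. The dictionary config shape, the default `read` scope, and the helper names `allows` and `parse_entry` are assumptions.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple, Union

# Assumed ordering: a scope implies every scope listed before it.
SCOPE_ORDER = ("read", "control", "admin")


@dataclass(frozen=True)
class TokenSpec:
    token: str
    scopes: Tuple[str, ...]

    def allows(self, required: str) -> bool:
        """True if any granted scope is at or above the required one."""
        needed = SCOPE_ORDER.index(required)
        return any(SCOPE_ORDER.index(s) >= needed for s in self.scopes)


def parse_entry(raw: Union[str, Dict[str, Any]]) -> TokenSpec:
    """Normalise one api_tokens value (hypothetical config shapes)."""
    if isinstance(raw, str):
        # Legacy bare-string entry: promoted to admin for back-compat.
        return TokenSpec(token=raw, scopes=("admin",))
    return TokenSpec(token=str(raw["token"]), scopes=tuple(raw.get("scopes", ("read",))))


legacy = parse_entry("s3cret")
remote = parse_entry({"token": "t0ken", "scopes": ["control"]})
```

Under this model a `control` token can hit read and playback endpoints but is refused by the management endpoints, which require `admin`.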
@@ -374,6 +513,27 @@ def main():
 
     use_tray = PYSTRAY_AVAILABLE and not args.no_tray
 
+    # Validate TLS pair consistency before either path so we don't fail late.
+    if bool(settings.ssl_certfile) ^ bool(settings.ssl_keyfile):
+        _fatal(
+            "ERROR: ssl_certfile and ssl_keyfile must both be set, or both unset."
+        )
+
+    def _uvicorn_kwargs() -> dict:
+        kw: dict = {
+            "host": args.host,
+            "port": args.port,
+            "log_level": settings.log_level.lower(),
+            "proxy_headers": settings.proxy_headers,
+            "forwarded_allow_ips": settings.forwarded_allow_ips,
+        }
+        if settings.ssl_certfile and settings.ssl_keyfile:
+            kw["ssl_certfile"] = settings.ssl_certfile
+            kw["ssl_keyfile"] = settings.ssl_keyfile
+            if settings.ssl_keyfile_password:
+                kw["ssl_keyfile_password"] = settings.ssl_keyfile_password
+        return kw
+
     if use_tray:
         import asyncio
         import threading
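
The XOR check and the conditional TLS keys in this hunk can be folded into one pure function for testing. This is a sketch only: `build_tls_kwargs` is a hypothetical name, and it raises `ValueError` where the commit calls its own `_fatal` helper.

```python
from typing import Any, Dict, Optional


def build_tls_kwargs(
    certfile: Optional[str],
    keyfile: Optional[str],
    password: Optional[str] = None,
) -> Dict[str, Any]:
    """Return uvicorn TLS keyword arguments, or {} when TLS is off."""
    # XOR is true when exactly one of the two paths is set.
    if bool(certfile) ^ bool(keyfile):
        raise ValueError(
            "ssl_certfile and ssl_keyfile must both be set, or both unset."
        )
    kw: Dict[str, Any] = {}
    if certfile and keyfile:
        kw["ssl_certfile"] = certfile
        kw["ssl_keyfile"] = keyfile
        if password:
            kw["ssl_keyfile_password"] = password
    return kw


plain = build_tls_kwargs(None, None)
secured = build_tls_kwargs("cert.pem", "key.pem", password="hunter2")
```

Keeping the keys out of the dictionary when TLS is disabled means the same `**kwargs` expansion works for both `uvicorn.Config` and `uvicorn.run`.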
@@ -381,9 +541,7 @@ def main():
         # Run uvicorn in a background thread so tray owns the main thread message loop
         uv_config = uvicorn.Config(
             "media_server.main:app",
-            host=args.host,
-            port=args.port,
-            log_level=settings.log_level.lower(),
+            **_uvicorn_kwargs(),
         )
         server = uvicorn.Server(uv_config)
 
@@ -421,9 +579,8 @@ def main():
     else:
         uvicorn.run(
             "media_server.main:app",
-            host=args.host,
-            port=args.port,
             reload=False,
+            **_uvicorn_kwargs(),
         )
 
 