fix: production-readiness hardening — security, perf, a11y, observability

Security
- Default scripts_management, callbacks_management, links_management, and
  media_folders_management to False so a leaked token cannot escalate to RCE
  through admin CRUD endpoints.
- TokenSpec + scope hierarchy (read | control | admin); legacy bare-string
  api_tokens entries promote to admin for back-compat. Management endpoints
  now require admin scope.
- WebSocket subprotocol auth (Sec-WebSocket-Protocol: media-server.token.<T>)
  preferred over ?token= query so the token no longer lands in URL/history/
  Referer; query fallback retained for HA integration back-compat.
- Origin allow-list check on the WS endpoint (CSWSH defence).
- In-process token-bucket rate limiter: 5/min for failed auths,
  10/min for /api/scripts/execute and /api/callbacks/execute.
- shell=False subprocess path (shlex.split) + per-parameter regex `pattern`
  in ScriptParameterConfig to harden shell=true scripts against parameter
  injection (Windows cmd.exe env-var expansion).
- CSP gains form-action, worker-src, manifest-src directives.
- Refuse cors_origins=["*"] at startup; strip token=... from uvicorn access
  logs; validate Gitea release tag against strict SemVer regex.
- noopener noreferrer + no-referrer referrerpolicy on every outbound link.
- icacls hardening of config.yaml on Windows (current user + SYSTEM +
  Administrators only); 0600 still enforced on POSIX.
- WS volume handler clamps input and never drops the socket on bad messages.
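
A minimal sketch of an in-process token-bucket limiter like the one described in the Security list above. The class name, the injectable clock, and the wiring at the bottom are illustrative assumptions, not the code that ships in this commit; only the 5/min and 10/min budgets come from the message.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow `capacity` events per `per_seconds`, refilled continuously."""

    def __init__(self, capacity: int, per_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = float(capacity)
        self.rate = capacity / per_seconds  # tokens regained per second
        self.clock = clock                  # injectable so tests need not sleep
        self.tokens = float(capacity)       # start with a full bucket
        self.last = clock()

    def allow(self) -> bool:
        """Consume one token if available; return False when the budget is spent."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Hypothetical wiring: in practice one bucket per (client, route) key.
failed_auth_bucket = TokenBucket(capacity=5, per_seconds=60.0)
execute_bucket = TokenBucket(capacity=10, per_seconds=60.0)
```

Because the bucket lives in process memory, the budget resets on restart and is not shared between worker processes, which is presumably acceptable for a single-process server.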

Performance
- Album-art read in windows_media gated by track key — was decoding the
  WinRT thumbnail twice per second regardless of track changes.
- /api/media/artwork returns content-derived ETag + Cache-Control so the
  browser sends If-None-Match and gets 304s on track repeats.
- Foreground-service ctypes argtypes hoisted to one-time module init
  (was re-declaring ~14 prototypes per probe).
- display_service _static_cache keyed by (edid_hash, ...) tuple with
  eviction of disappeared monitors — fixes stale capabilities on hot-plug
  swaps where the new topology has the same monitor count.
- Visualizer rAF loop paused on document.hidden, resumed on visible.
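
The artwork caching item above can be sketched as a pure function, assuming a SHA-256 digest of the image bytes as the ETag. The function name, the digest truncation, and the Cache-Control value are assumptions for illustration; the actual route may differ.

```python
import hashlib
from typing import Optional


def artwork_response(image: bytes, if_none_match: Optional[str]):
    """Return (status, headers, body) for an artwork request."""
    # Content-derived validator: identical bytes always give the same ETag.
    etag = '"' + hashlib.sha256(image).hexdigest()[:32] + '"'
    headers = {
        "ETag": etag,
        # Assumed policy: the browser may cache but must revalidate each time.
        "Cache-Control": "private, no-cache",
    }
    if if_none_match is not None:
        # If-None-Match may carry several comma-separated validators.
        candidates = [part.strip() for part in if_none_match.split(",")]
        if etag in candidates:
            return 304, headers, b""
    return 200, headers, image
```

When the same track repeats, the browser replays the stored ETag in If-None-Match and receives an empty 304 instead of the image bytes.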

Reliability / bug fixes
- Lifespan rewritten as try/yield/finally so a partial-startup failure
  cannot orphan background tasks or executors.
- _run_callback in routes/media.py keeps a strong task ref (GC-safe) and
  uses the dedicated callback executor instead of the default pool.
- macos_media.set_volume() no longer always returns True.
- TrayManager._restart_requested initialised in __init__; set before
  signalling exit so the main thread observes it correctly.
- Missing static_dir now logs a WARNING instead of silently disabling the UI.

UX / accessibility / PWA
- manifest.json theme_color and background_color match the Studio Reference
  base (#0E0D0B); added id and scope for PWA installability.
- ARIA on mini-player icon buttons; inner SVGs marked aria-hidden.
- OS mediaSession API wired so headset / lockscreen / Bluetooth buttons
  drive play/pause/next/prev/seek and show track metadata + artwork.

Observability
- X-Request-ID middleware (accept upstream id if it matches a safe regex,
  otherwise UUID4); request_id_var added to ContextVars and included in
  every log line alongside the token label.
- Audit log (append-only JSONL) for every script + callback execution,
  including the on_play/on_pause/etc. event callbacks. Background-thread
  writer; queue capped; flushed in lifespan teardown.
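
A sketch of the request-id resolution described above. The commit only says the upstream id must match "a safe regex"; the pattern below, the default value "-", and the helper name are assumptions. `request_id_var` is the name used in the commit message.

```python
import re
import uuid
from contextvars import ContextVar
from typing import Optional

# Assumed allow-list: short, printable, no whitespace or control characters,
# so a client-supplied id cannot inject extra log lines.
_SAFE_REQUEST_ID = re.compile(r"[A-Za-z0-9._-]{1,64}")

request_id_var: ContextVar[str] = ContextVar("request_id", default="-")


def resolve_request_id(upstream: Optional[str]) -> str:
    """Reuse a well-formed upstream X-Request-ID, otherwise mint a UUID4."""
    if upstream and _SAFE_REQUEST_ID.fullmatch(upstream):
        request_id = upstream
    else:
        request_id = str(uuid.uuid4())
    # Visible to every log call made while handling this request.
    request_id_var.set(request_id)
    return request_id
```

Using `fullmatch` rather than `match` matters here: `match` would accept a value with a valid prefix followed by a newline.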

Deployment
- proxy_headers + forwarded_allow_ips plumbed through Settings →
  uvicorn.Config for reverse-proxy installs.
- HTTPS support via ssl_certfile + ssl_keyfile (+ optional password);
  startup refuses to launch with only one of the pair set.
- Thumbnail cache moved from project-root .cache to
  %LOCALAPPDATA%/media-server/cache (Windows) and
  $XDG_CACHE_HOME/media-server/thumbnails (POSIX).
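
The "refuses to launch with only one of the pair set" rule above amounts to a small startup check. A minimal sketch, assuming a hypothetical helper name and error message (the real check lives wherever Settings are turned into uvicorn.Config):

```python
from typing import Optional


def tls_enabled(ssl_certfile: Optional[str], ssl_keyfile: Optional[str]) -> bool:
    """Return True when TLS should be enabled; raise on a half-configured pair."""
    if bool(ssl_certfile) != bool(ssl_keyfile):
        # Exactly one of the two is set: fail loudly instead of silently
        # falling back to plain HTTP.
        raise ValueError("ssl_certfile and ssl_keyfile must be set together")
    return bool(ssl_certfile)
```

Failing at startup avoids the worse outcome of an operator believing TLS is on while the server is actually serving plaintext.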

Tests
- 35 new tests across auth scopes, rate limiter, browser path traversal
  (../, NUL, UNC, absolute), script-param validation incl. regex, Gitea tag
  whitelist, config atomic write + POSIX perms. 47 passed / 4 skipped.
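
For context on the path-traversal cases listed above (../, NUL, UNC, absolute), here is a sketch of the kind of guard such tests would exercise. The function name and the exact rejection rules are assumptions; the browser code itself is not part of this diff excerpt.

```python
from pathlib import Path


def resolve_inside(root: Path, relative: str) -> Path:
    """Resolve `relative` under `root`, rejecting anything that escapes it."""
    if "\x00" in relative:
        raise ValueError("NUL byte in path")
    # Leading slash or backslash covers POSIX absolute paths and UNC paths;
    # a second character of ':' covers Windows drive letters.
    if relative.startswith(("/", "\\")) or relative[1:2] == ":":
        raise ValueError("absolute or UNC path")
    root_resolved = root.resolve()
    candidate = (root_resolved / relative).resolve()
    # After resolution, '..' segments and symlinks are gone, so a simple
    # ancestry check is enough.
    if candidate != root_resolved and root_resolved not in candidate.parents:
        raise ValueError("path escapes the media folder")
    return candidate
```

Checking the resolved path rather than scanning the string for ".." also catches escapes that go through symlinks.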
2026-05-22 22:25:54 +03:00
parent 450f9fe1ee
commit d131ba461c
31 changed files with 1586 additions and 204 deletions
+157 -11
@@ -7,12 +7,49 @@ from pathlib import Path
from typing import Optional
import yaml
from pydantic import BaseModel, Field
from pydantic import BaseModel, Field, field_validator
from pydantic_settings import BaseSettings, SettingsConfigDict
logger = logging.getLogger(__name__)
# Token scopes form a strict hierarchy: admin > control > read. Helper utility
# used by both auth.py and the validator below.
SCOPE_HIERARCHY: dict[str, frozenset[str]] = {
    "read": frozenset({"read"}),
    "control": frozenset({"read", "control"}),
    "admin": frozenset({"read", "control", "admin"}),
}
ALL_SCOPES: frozenset[str] = frozenset(SCOPE_HIERARCHY.keys())


class TokenSpec(BaseModel):
    """Per-token authentication entry with explicit scopes."""

    token: str = Field(..., min_length=8, description="Secret token value")
    scopes: list[str] = Field(
        default_factory=lambda: ["admin"],
        description="Granted scopes (subset of read|control|admin).",
    )

    @field_validator("scopes")
    @classmethod
    def _validate_scopes(cls, v: list[str]) -> list[str]:
        if not v:
            raise ValueError("scopes must list at least one of read|control|admin")
        unknown = set(v) - ALL_SCOPES
        if unknown:
            raise ValueError(f"unknown scopes: {sorted(unknown)}; valid={sorted(ALL_SCOPES)}")
        return v

    def grants(self, required: str) -> bool:
        """Whether this token grants the requested scope (with hierarchy expansion)."""
        granted: set[str] = set()
        for s in self.scopes:
            granted |= SCOPE_HIERARCHY.get(s, frozenset({s}))
        return required in granted


class MediaFolderConfig(BaseModel):
"""Configuration for a media folder."""
@@ -48,6 +85,13 @@ class ScriptParameterConfig(BaseModel):
options: Optional[list[str]] = Field(
default=None, description="Allowed values (select type only)"
)
pattern: Optional[str] = Field(
default=None,
description=(
"Optional regex (Python flavour) that string-typed values must match."
" Use to harden parameters that flow into shell=true scripts."
),
)
class ScriptConfig(BaseModel):
@@ -108,19 +152,84 @@ class Settings(BaseSettings):
),
)
    # Reverse-proxy deployment: when serving the API behind nginx/Caddy/Traefik,
    # uvicorn must trust the X-Forwarded-* headers from the proxy so that the
    # `Origin` allow-list, request URLs, and logs reflect the public-facing
    # values. Off by default; only enable when there's a real proxy in front
    # (otherwise clients can spoof their own IP).
    proxy_headers: bool = Field(
        default=False,
        description="Honor X-Forwarded-For / X-Forwarded-Proto from upstream proxy.",
    )
    forwarded_allow_ips: str = Field(
        default="127.0.0.1",
        description=(
            "Comma-separated IPs / CIDRs that uvicorn should trust X-Forwarded-* from."
            " Use '*' to trust all (only safe when bound to a private interface)."
        ),
    )
    # HTTPS / TLS. Both must be set together to enable TLS; if only one is set
    # the server refuses to start. Use `mkcert` or Let's Encrypt to generate the
    # pair; the server reads them at startup.
    ssl_certfile: Optional[str] = Field(
        default=None,
        description="Path to TLS certificate (PEM). Pair with ssl_keyfile.",
    )
    ssl_keyfile: Optional[str] = Field(
        default=None,
        description="Path to TLS private key (PEM). Pair with ssl_certfile.",
    )
    ssl_keyfile_password: Optional[str] = Field(
        default=None,
        description="Optional password for the private key if encrypted.",
    )
# Admin-grade operations (script / callback / link / folder create/update/delete).
# When True the same token used for read/play can also persist arbitrary shell
# commands. Disable to make the API read+execute only.
scripts_management: bool = Field(default=True, description="Allow scripts CRUD via API")
callbacks_management: bool = Field(default=True, description="Allow callbacks CRUD via API")
links_management: bool = Field(default=True, description="Allow links CRUD via API")
# commands. Default False so a single leaked token cannot escalate to RCE; opt
# in explicitly to manage scripts/callbacks/links via the Web UI.
scripts_management: bool = Field(default=False, description="Allow scripts CRUD via API")
callbacks_management: bool = Field(default=False, description="Allow callbacks CRUD via API")
links_management: bool = Field(default=False, description="Allow links CRUD via API")
# Authentication (empty = auth disabled, anyone can access the API)
api_tokens: dict[str, str] = Field(
# Authentication (empty = auth disabled, anyone can access the API).
#
# Each entry can be either:
# • a bare string (legacy form, treated as scopes = ["admin"] for back-compat), OR
# • a mapping with explicit scopes, e.g.
# "ha": {token: "<token>", scopes: ["read", "control"]}
# "kiosk": {token: "<token>", scopes: ["read"]}
# "ops": {token: "<token>", scopes: ["admin"]}
#
# Available scopes:
# read — GET /api/* (status, list, browse) but no state-changing calls.
# control — read + media transport, display/audio, script EXECUTE, callback EXECUTE.
# admin — control + CRUD on scripts/callbacks/links/folders.
#
# Validation normalises both forms to TokenSpec at load time.
api_tokens: dict[str, TokenSpec] = Field(
default_factory=dict,
description="Named API tokens for access control (label: token pairs). Empty = no auth.",
description=(
"Named API tokens. Value can be a bare token string (= admin scope) or"
" a {token, scopes} mapping. See TokenSpec for scope definitions."
),
)
    @field_validator("api_tokens", mode="before")
    @classmethod
    def _normalise_tokens(cls, v):
        """Accept legacy `label: <bare-token>` form and promote to TokenSpec."""
        if not isinstance(v, dict):
            return v
        out: dict[str, dict | TokenSpec] = {}
        for label, entry in v.items():
            if isinstance(entry, str):
                out[label] = {"token": entry, "scopes": ["admin"]}
            else:
                out[label] = entry
        return out
# Media controller settings
poll_interval: float = Field(
default=1.0, description="Media status poll interval in seconds"
@@ -156,7 +265,7 @@ class Settings(BaseSettings):
description="Media folders available for browsing in the media browser",
)
media_folders_management: bool = Field(
default=True,
default=False,
description="Allow adding, editing, and deleting media folders from the Web UI",
)
@@ -263,8 +372,11 @@ def generate_default_config(path: Optional[Path] = None) -> Path:
config = {
"host": "127.0.0.1",
"port": 8765,
# Default token grants "admin" scope (full access). To create a
# read-only or control-only token, add a second entry:
# ha_readonly: {token: "<token>", scopes: ["read"]}
"api_tokens": {
"default": default_token,
"default": {"token": default_token, "scopes": ["admin"]},
},
"poll_interval": 1.0,
"log_level": "INFO",
@@ -298,8 +410,16 @@ def _write_yaml_atomic(path: Path, data: dict) -> None:
def _restrict_config_perms(path: Path) -> None:
"""On POSIX, ensure config file is readable only by owner (0600)."""
"""Ensure config file is readable only by its owner.
POSIX → ``chmod 0600``. On Windows the default NTFS ACL leaves the file
readable by every interactive user on the machine (Users group has Read),
which is bad given the file stores plaintext API tokens. Use ``icacls`` to
grant exclusive access to the current user + SYSTEM + Administrators and
strip inheritance.
"""
if os.name == "nt":
_restrict_config_perms_windows(path)
return
try:
os.chmod(path, 0o600)
@@ -308,5 +428,31 @@ def _restrict_config_perms(path: Path) -> None:
logger.debug("Could not chmod %s", path, exc_info=True)
def _restrict_config_perms_windows(path: Path) -> None:
    """Apply restrictive NTFS ACL to a config file (Windows only)."""
    import subprocess
    try:
        username = os.environ.get("USERNAME") or os.environ.get("USER")
        if not username:
            logger.debug("Cannot detect current user; skipping icacls hardening")
            return
        # Disable inheritance and drop the inherited ACEs (/inheritance:r), then
        # grant read/write to the current user, SYSTEM, and Administrators only.
        # /grant:r replaces any earlier explicit grant for the same principal.
        subprocess.run(
            ["icacls", str(path), "/inheritance:r"],
            check=False, capture_output=True, timeout=5,
        )
        for principal in (username, "SYSTEM", "Administrators"):
            subprocess.run(
                ["icacls", str(path), "/grant:r", f"{principal}:(R,W)"],
                check=False, capture_output=True, timeout=5,
            )
    except (FileNotFoundError, subprocess.TimeoutExpired, OSError):
        # `icacls` missing or sandboxed; leave the default ACL in place.
        logger.debug("icacls hardening failed for %s", path, exc_info=True)
# Global settings instance
settings = Settings.load_from_yaml()
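
To illustrate how the scope hierarchy and the legacy-token promotion in this diff behave together, here is a dependency-free sketch. It mirrors `SCOPE_HIERARCHY`, `TokenSpec.grants`, and `_normalise_tokens` from the diff but uses plain dicts instead of pydantic models; the helper names `normalise_entry` and `grants` are illustrative only.

```python
SCOPE_HIERARCHY = {
    "read": frozenset({"read"}),
    "control": frozenset({"read", "control"}),
    "admin": frozenset({"read", "control", "admin"}),
}


def normalise_entry(entry):
    """Promote a legacy bare-string token to the explicit mapping form."""
    if isinstance(entry, str):
        return {"token": entry, "scopes": ["admin"]}
    return entry


def grants(scopes, required):
    """Expand each granted scope through the hierarchy, then test membership."""
    granted = set()
    for scope in scopes:
        granted |= SCOPE_HIERARCHY.get(scope, frozenset({scope}))
    return required in granted


legacy = normalise_entry("s3cr3t-token")            # bare string from an old config
kiosk = normalise_entry({"token": "kiosk-token", "scopes": ["read"]})

print(grants(legacy["scopes"], "admin"))   # legacy tokens keep full access
print(grants(kiosk["scopes"], "control"))  # read-only token cannot control
```

Existing deployments therefore keep working unchanged, while new tokens can be issued with `read` or `control` only.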