fix: production-readiness hardening — security, perf, a11y, observability
Lint & Test / test (push) Successful in 20s

Security
- Default scripts_management, callbacks_management, links_management, and
  media_folders_management to False so a leaked token cannot escalate to RCE
  through admin CRUD endpoints.
- TokenSpec + scope hierarchy (read | control | admin); legacy bare-string
  api_tokens entries promote to admin for back-compat. Management endpoints
  now require admin scope.
- WebSocket subprotocol auth (Sec-WebSocket-Protocol: media-server.token.<T>)
  preferred over ?token= query so the token no longer lands in URL/history/
  Referer; query fallback retained for HA integration back-compat.
- Origin allow-list check on the WS endpoint (CSWSH defence).
- In-process token-bucket rate limiter: 5/min for failed auths,
  10/min for /api/scripts/execute and /api/callbacks/execute.
- shell=False subprocess path (shlex.split) + per-parameter regex `pattern`
  in ScriptParameterConfig to harden shell=true scripts against parameter
  injection (Windows cmd.exe env-var expansion).
- CSP gains form-action, worker-src, manifest-src directives.
- Refuse cors_origins=["*"] at startup; strip token=... from uvicorn access
  logs; validate Gitea release tag against strict SemVer regex.
- noopener noreferrer + no-referrer referrerpolicy on every outbound link.
- icacls hardening of config.yaml on Windows (current user + SYSTEM +
  Administrators only); 0600 still enforced on POSIX.
- WS volume handler clamps input and never drops the socket on bad messages.

Performance
- Album-art read in windows_media gated by track key — was decoding the
  WinRT thumbnail twice per second regardless of track changes.
- /api/media/artwork returns content-derived ETag + Cache-Control so the
  browser sends If-None-Match and gets 304s on track repeats.
- Foreground-service ctypes argtypes hoisted to one-time module init
  (was re-declaring ~14 prototypes per probe).
- display_service _static_cache keyed by (edid_hash, ...) tuple with
  eviction of disappeared monitors — fixes stale capabilities on hot-plug
  swaps where the new topology has the same monitor count.
- Visualizer rAF loop paused on document.hidden, resumed on visible.

Reliability / bug fixes
- Lifespan rewritten as try/yield/finally so a partial-startup failure
  cannot orphan background tasks or executors.
- _run_callback in routes/media.py keeps a strong task ref (GC-safe) and
  uses the dedicated callback executor instead of the default pool.
- macos_media.set_volume() no longer always returns True.
- TrayManager._restart_requested initialised in __init__; set before
  signalling exit so the main thread observes it correctly.
- Missing static_dir now logs a WARNING instead of silent UI disable.

UX / accessibility / PWA
- manifest.json theme_color and background_color match the Studio Reference
  base (#0E0D0B); added id and scope for PWA installability.
- ARIA on mini-player icon buttons; inner SVGs marked aria-hidden.
- OS mediaSession API wired so headset / lockscreen / Bluetooth buttons
  drive play/pause/next/prev/seek and show track metadata + artwork.

Observability
- X-Request-ID middleware (accept upstream id if it matches a safe regex,
  otherwise UUID4); request_id_var added to ContextVars and included in
  every log line alongside the token label.
- Audit log (append-only JSONL) for every script + callback execution,
  including the on_play/on_pause/etc. event callbacks. Background-thread
  writer; queue capped; flushed in lifespan teardown.

Deployment
- proxy_headers + forwarded_allow_ips plumbed through Settings →
  uvicorn.Config for reverse-proxy installs.
- HTTPS support via ssl_certfile + ssl_keyfile (+ optional password);
  startup refuses to launch with only one of the pair set.
- Thumbnail cache moved from project-root .cache to
  %LOCALAPPDATA%/media-server/cache (Windows) and
  $XDG_CACHE_HOME/media-server/thumbnails (POSIX).

Tests
- 35 new tests across auth scopes, rate limiter, browser path traversal
  (../ NUL UNC absolute), script-param validation incl. regex, Gitea tag
  whitelist, config atomic write + POSIX perms. 47 passed / 4 skipped.
This commit is contained in:
2026-05-22 22:25:54 +03:00
parent 450f9fe1ee
commit d131ba461c
31 changed files with 1586 additions and 204 deletions
+38 -14
View File
@@ -192,10 +192,11 @@ _CACHE_TTL = 5.0 # seconds
# Per-monitor cache of static capabilities (option lists + support flags).
# DDC/CI capability discovery is the slow part — it only changes when a
# monitor is replaced or rewired, so we probe it once per monitor and reuse
# it across refreshes. Cleared on explicit `rediscover` or when the monitor
# count changes (cheap stale-detection for hot-plug events).
_static_cache: dict[int, dict] = {}
_static_cache_monitor_count: int = -1
# it across refreshes. Keyed by a stable identity tuple
# (manufacturer, model, edid_hash) so that hot-plug swaps where the new
# topology has the same number of monitors but different devices still
# refresh the cache for the new monitor instead of serving stale capabilities.
_static_cache: dict[tuple, dict] = {}
def _enum_name(value, enum_cls=None) -> str | None:
@@ -353,7 +354,7 @@ def list_monitors(force_refresh: bool = False, rediscover: bool = False) -> list
next probe re-runs DDC/CI capability discovery. Use after hot-plug
or when a monitor's reported capabilities change.
"""
global _monitor_cache, _cache_time, _static_cache_monitor_count
global _monitor_cache, _cache_time
if (
not force_refresh
@@ -372,12 +373,11 @@ def list_monitors(force_refresh: bool = False, rediscover: bool = False) -> list
info_list = sbc.list_monitors_info()
brightnesses = sbc.get_brightness()
# Invalidate the static cache on explicit rediscover OR on topology
# change (hot-plug / disconnect). Both indicate the cached probe is
# potentially stale.
if rediscover or len(info_list) != _static_cache_monitor_count:
# Explicit rediscover wipes the whole cache; otherwise rely on stable
# per-monitor keys (manufacturer|model|edid_hash) so a hot-plug swap
# invalidates the entry for the missing monitor automatically.
if rediscover:
_static_cache.clear()
_static_cache_monitor_count = len(info_list)
mc = _load_monitorcontrol()
ddc_monitors = []
@@ -387,6 +387,9 @@ def list_monitors(force_refresh: bool = False, rediscover: bool = False) -> list
except Exception:
pass
import hashlib
seen_keys: set[tuple] = set()
for i, info in enumerate(info_list):
name = info.get("name", f"Monitor {i}")
model = info.get("model", "")
@@ -400,6 +403,21 @@ def list_monitors(force_refresh: bool = False, rediscover: bool = False) -> list
edid = info.get("edid", "")
resolution = _parse_edid_resolution(edid) if edid else None
# Stable cache key — EDID hash is unique per physical monitor.
# Fall back to (manufacturer, model, serial-ish) when EDID is
# missing, then to the legacy index as a last resort.
if edid:
edid_hash = hashlib.blake2b(
edid.encode("utf-8") if isinstance(edid, str) else bytes(edid),
digest_size=8,
).hexdigest()
cache_key: tuple = ("edid", edid_hash)
elif manufacturer or model:
cache_key = ("mm", manufacturer, model, name)
else:
cache_key = ("idx", i)
seen_keys.add(cache_key)
static: dict = {}
dynamic: dict = {}
@@ -409,13 +427,13 @@ def list_monitors(force_refresh: bool = False, rediscover: bool = False) -> list
if power_supported and i < len(ddc_monitors):
try:
with ddc_monitors[i] as mon:
if i not in _static_cache:
_static_cache[i] = _probe_static_open(mon, mc, i)
static = _static_cache[i]
if cache_key not in _static_cache:
_static_cache[cache_key] = _probe_static_open(mon, mc, i)
static = _static_cache[cache_key]
dynamic = _probe_dynamic_open(mon, mc, i, static)
except Exception as e:
logger.debug("Monitor %d: DDC/CI session failed: %s", i, e)
static = _static_cache.get(i, {})
static = _static_cache.get(cache_key, {})
monitors.append(MonitorInfo(
id=i,
@@ -439,6 +457,12 @@ def list_monitors(force_refresh: bool = False, rediscover: bool = False) -> list
available_picture_modes=static.get("available_picture_modes", []),
picture_mode_supported=static.get("picture_mode_supported", False),
))
# Evict cache entries for monitors that disappeared from this scan so
# the next hot-plug of a different monitor with the same identity
# tuple (e.g. same model) doesn't hit a stale entry first.
for stale_key in list(_static_cache.keys()):
if stale_key not in seen_keys:
_static_cache.pop(stale_key, None)
except Exception as e:
logger.error("Failed to enumerate monitors: %s", e)