alexei.dolgolyov d131ba461c
fix: production-readiness hardening — security, perf, a11y, observability
Security
- Default scripts_management, callbacks_management, links_management, and
  media_folders_management to False so a leaked token cannot escalate to RCE
  through admin CRUD endpoints.
- TokenSpec + scope hierarchy (read | control | admin); legacy bare-string
  api_tokens entries promote to admin for back-compat. Management endpoints
  now require admin scope.
- WebSocket subprotocol auth (Sec-WebSocket-Protocol: media-server.token.<T>)
  preferred over ?token= query so the token no longer lands in URL/history/
  Referer; query fallback retained for HA integration back-compat.
- Origin allow-list check on the WS endpoint (CSWSH defence).
- In-process token-bucket rate limiter: 5/min for failed auths,
  10/min for /api/scripts/execute and /api/callbacks/execute.
- shell=False subprocess path (shlex.split) + per-parameter regex `pattern`
  in ScriptParameterConfig to harden shell=true scripts against parameter
  injection (Windows cmd.exe env-var expansion).
- CSP gains form-action, worker-src, manifest-src directives.
- Refuse cors_origins=["*"] at startup; strip token=... from uvicorn access
  logs; validate Gitea release tag against strict SemVer regex.
- noopener noreferrer + no-referrer referrerpolicy on every outbound link.
- icacls hardening of config.yaml on Windows (current user + SYSTEM +
  Administrators only); 0600 still enforced on POSIX.
- WS volume handler clamps input and never drops the socket on bad messages.
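
The in-process token-bucket limiter mentioned above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the class name `TokenBucket`, its parameters, and the injectable `clock` are all hypothetical, and a real limiter would keep one bucket per client or token.

```python
import time


class TokenBucket:
    """Allow up to `capacity` events per `period` seconds, refilling continuously.

    Hypothetical sketch: names and signature are assumptions for illustration.
    """

    def __init__(self, capacity: int, period: float, clock=time.monotonic) -> None:
        self.capacity = capacity
        self.refill_rate = capacity / period  # tokens earned per second
        self.tokens = float(capacity)         # start with a full bucket
        self.clock = clock
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Credit the tokens earned since the last call, never exceeding capacity.
        earned = (now - self.updated) * self.refill_rate
        self.tokens = min(float(self.capacity), self.tokens + earned)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# 5 per minute, as for failed auths; the sixth immediate attempt is rejected.
failed_auth_bucket = TokenBucket(capacity=5, period=60.0)
print([failed_auth_bucket.allow() for _ in range(6)])
```

Using a monotonic clock keeps the limiter unaffected by wall-clock adjustments.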

Performance
- Album-art read in windows_media gated by track key — was decoding the
  WinRT thumbnail twice per second regardless of track changes.
- /api/media/artwork returns content-derived ETag + Cache-Control so the
  browser sends If-None-Match and gets 304s on track repeats.
- Foreground-service ctypes argtypes hoisted to one-time module init
  (was re-declaring ~14 prototypes per probe).
- display_service _static_cache keyed by (edid_hash, ...) tuple with
  eviction of disappeared monitors — fixes stale capabilities on hot-plug
  swaps where the new topology has the same monitor count.
- Visualizer rAF loop paused on document.hidden, resumed on visible.
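
As an illustration of the content-derived ETag behaviour described for /api/media/artwork, here is a framework-free sketch. The function names, the SHA-256 choice, and the `no-cache` Cache-Control value are assumptions made for this example and are not taken from the commit; weak validators (`W/...`) and `*` are not handled.

```python
from __future__ import annotations

import hashlib


def artwork_etag(image_bytes: bytes) -> str:
    # Derive the validator from the content, so identical artwork yields an
    # identical ETag regardless of when it was fetched.
    return '"' + hashlib.sha256(image_bytes).hexdigest()[:32] + '"'


def conditional_response(
    image_bytes: bytes, if_none_match: str | None
) -> tuple[int, dict[str, str], bytes]:
    """Return (status, headers, body) for a conditional GET (hypothetical helper)."""
    etag = artwork_etag(image_bytes)
    # "no-cache" lets the browser store the image but forces revalidation,
    # which is what makes it send If-None-Match on the next request.
    headers = {"ETag": etag, "Cache-Control": "no-cache"}
    if if_none_match is not None:
        candidates = [c.strip() for c in if_none_match.split(",")]
        if etag in candidates:
            return 304, headers, b""  # unchanged: no body needed
    return 200, headers, image_bytes


status, headers, _ = conditional_response(b"fake-image-bytes", None)
repeat_status, _, _ = conditional_response(b"fake-image-bytes", headers["ETag"])
print(status, repeat_status)
```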

Reliability / bug fixes
- Lifespan rewritten as try/yield/finally so a partial-startup failure
  cannot orphan background tasks or executors.
- _run_callback in routes/media.py keeps a strong task ref (GC-safe) and
  uses the dedicated callback executor instead of the default pool.
- macos_media.set_volume() no longer always returns True.
- TrayManager._restart_requested initialised in __init__; set before
  signalling exit so the main thread observes it correctly.
- Missing static_dir now logs a WARNING instead of silent UI disable.

UX / accessibility / PWA
- manifest.json theme_color and background_color match the Studio Reference
  base (#0E0D0B); added id and scope for PWA installability.
- ARIA on mini-player icon buttons; inner SVGs marked aria-hidden.
- OS mediaSession API wired so headset / lockscreen / Bluetooth buttons
  drive play/pause/next/prev/seek and show track metadata + artwork.

Observability
- X-Request-ID middleware (accept upstream id if it matches a safe regex,
  otherwise UUID4); request_id_var added to ContextVars and included in
  every log line alongside the token label.
- Audit log (append-only JSONL) for every script + callback execution,
  including the on_play/on_pause/etc. event callbacks. Background-thread
  writer; queue capped; flushed in lifespan teardown.
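
A rough sketch of the X-Request-ID rule described above (accept the upstream id only if it matches a safe regex, otherwise mint a UUID4). The regex and the function name are assumptions for illustration; the pattern actually used by the middleware is not shown in this commit message.

```python
from __future__ import annotations

import re
import uuid

# Hypothetical "safe" pattern: short, and limited to characters that cannot
# inject newlines or control characters into log lines.
_SAFE_REQUEST_ID = re.compile(r"[A-Za-z0-9._-]{1,64}")


def resolve_request_id(upstream: str | None) -> str:
    """Reuse a well-formed upstream id; otherwise generate a fresh UUID4."""
    if upstream is not None and _SAFE_REQUEST_ID.fullmatch(upstream):
        return upstream
    return str(uuid.uuid4())


print(resolve_request_id("proxy-abc123"))       # reused as-is
print(resolve_request_id("bad\nid: injected"))  # replaced by a random UUID4
```

`fullmatch` is used rather than `match` so that a valid prefix followed by unsafe characters is still rejected.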

Deployment
- proxy_headers + forwarded_allow_ips plumbed through Settings →
  uvicorn.Config for reverse-proxy installs.
- HTTPS support via ssl_certfile + ssl_keyfile (+ optional password);
  startup refuses to launch with only one of the pair set.
- Thumbnail cache moved from project-root .cache to
  %LOCALAPPDATA%/media-server/cache (Windows) and
  $XDG_CACHE_HOME/media-server/thumbnails (POSIX).
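
The "refuses to launch with only one of the pair set" check can be sketched as below. The function name and error text are hypothetical; only the ssl_certfile / ssl_keyfile setting names come from this commit message.

```python
from __future__ import annotations


def tls_enabled(ssl_certfile: str | None, ssl_keyfile: str | None) -> bool:
    """Return True when both files are set and False when neither is.

    A half-configured pair is rejected so the server never silently falls
    back to plain HTTP. Hypothetical helper for illustration.
    """
    if bool(ssl_certfile) != bool(ssl_keyfile):
        raise ValueError("ssl_certfile and ssl_keyfile must be set together")
    return bool(ssl_certfile)


print(tls_enabled(None, None))
print(tls_enabled("cert.pem", "key.pem"))
```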

Tests
- 35 new tests across auth scopes, rate limiter, browser path traversal
  (../ NUL UNC absolute), script-param validation incl. regex, Gitea tag
  whitelist, config atomic write + POSIX perms. 47 passed / 4 skipped.
2026-05-22 22:25:54 +03:00


"""Append-only audit log for sensitive actions (script + callback execution).
Writes a single JSONL line per event to ``<config_dir>/audit.log``. The log is
write-only from the app's perspective — it never reads back, and rotation is
left to the operator (the file size is dominated by stdout/stderr truncation,
which is already capped at 10 KB per stream in `_run_script`).
Designed to be cheap: the write goes through a small background thread so the
hot path never blocks on disk I/O, and a failure to write is logged at WARNING
but never raised to callers.
"""
from __future__ import annotations
import json
import logging
import queue
import threading
import time
from typing import Any
from ..auth import token_label_var
from ..config import get_config_dir
logger = logging.getLogger(__name__)
# Cap (in characters) on stdout/stderr inside the audit record so a chatty
# script doesn't explode the log. Note this is tighter than the 10 KB
# per-stream cap applied by _run_script.
_OUTPUT_CAP = 2000
_audit_queue: "queue.Queue[dict[str, Any] | None]" = queue.Queue(maxsize=1000)
_audit_thread: threading.Thread | None = None
_audit_lock = threading.Lock()


def _ensure_writer_started() -> None:
    """Start the background writer thread if it is not already running."""
    global _audit_thread
    with _audit_lock:
        if _audit_thread is not None and _audit_thread.is_alive():
            return
        _audit_thread = threading.Thread(
            target=_audit_writer_loop,
            name="audit-log",
            daemon=True,
        )
        _audit_thread.start()


def _audit_writer_loop() -> None:
    """Drain the queue, appending one JSON line per record, until a None sentinel arrives."""
    log_path = get_config_dir() / "audit.log"
    while True:
        try:
            record = _audit_queue.get()
        except Exception:
            return
        if record is None:
            # Sentinel enqueued by shutdown_audit_log().
            return
        try:
            line = json.dumps(record, ensure_ascii=False, default=str)
            with open(log_path, "a", encoding="utf-8") as f:
                f.write(line + "\n")
        except OSError as e:
            logger.warning("Failed to write audit record: %s", e)


def _truncate(value: str | None) -> str | None:
    if value is None:
        return None
    if len(value) <= _OUTPUT_CAP:
        return value
    return value[:_OUTPUT_CAP] + f"\n…[truncated, {len(value) - _OUTPUT_CAP} chars]"


def record_script_execution(
    *,
    kind: str,
    name: str,
    exit_code: int | None,
    duration: float | None,
    stdout: str | None = None,
    stderr: str | None = None,
    error: str | None = None,
) -> None:
    """Append a single audit record. Never raises."""
    try:
        # Inside the try so a failure to start the thread is logged, not raised.
        _ensure_writer_started()
        record = {
            "ts": time.time(),
            "iso": time.strftime("%Y-%m-%dT%H:%M:%S", time.gmtime()),
            "token_label": token_label_var.get("unknown"),
            "kind": kind,
            "name": name,
            "exit_code": exit_code,
            "duration_s": round(duration, 4) if duration is not None else None,
            "success": exit_code == 0 if exit_code is not None else False,
            "stdout": _truncate(stdout),
            "stderr": _truncate(stderr),
            "error": error,
        }
        _audit_queue.put_nowait(record)
    except queue.Full:
        # Backpressure: drop oldest record to make room. We'd rather lose an
        # old entry than block the script that just ran.
        try:
            _audit_queue.get_nowait()
            _audit_queue.put_nowait(record)
        except (queue.Empty, queue.Full):
            # Another thread raced us (queue drained or refilled); give up
            # on this record rather than raise to the caller.
            pass
    except Exception as e:
        logger.warning("Failed to enqueue audit record: %s", e)


def shutdown_audit_log() -> None:
    """Flush the audit queue on app shutdown."""
    try:
        _audit_queue.put_nowait(None)
    except queue.Full:
        pass
    if _audit_thread is not None:
        _audit_thread.join(timeout=2)