fix: production-readiness hardening — security, perf, a11y, observability

Security - Default scripts_management, callbacks_management, links_management, and media_folders_management to False so a leaked token cannot escalate to RCE through admin CRUD endpoints. - TokenSpec + scope hierarchy (read | control | admin); legacy bare-string api_tokens entries promote to admin for back-compat. Management endpoints now require admin scope. - WebSocket subprotocol auth (Sec-WebSocket-Protocol: media-server.token.<T>) preferred over ?token= query so the token no longer lands in URL/history/ Referer; query fallback retained for HA integration back-compat. - Origin allow-list check on the WS endpoint (CSWSH defence). - In-process token-bucket rate limiter: 5/min for failed auths, 10/min for /api/scripts/execute and /api/callbacks/execute. - shell=False subprocess path (shlex.split) + per-parameter regex `pattern` in ScriptParameterConfig to harden shell=true scripts against parameter injection (Windows cmd.exe env-var expansion). - CSP gains form-action, worker-src, manifest-src directives. - Refuse cors_origins=["*"] at startup; strip token=... from uvicorn access logs; validate Gitea release tag against strict SemVer regex. - noopener noreferrer + no-referrer referrerpolicy on every outbound link. - icacls hardening of config.yaml on Windows (current user + SYSTEM + Administrators only); 0600 still enforced on POSIX. - WS volume handler clamps input and never drops the socket on bad messages. Performance - Album-art read in windows_media gated by track key — was decoding the WinRT thumbnail twice per second regardless of track changes. - /api/media/artwork returns content-derived ETag + Cache-Control so the browser sends If-None-Match and gets 304s on track repeats. - Foreground-service ctypes argtypes hoisted to one-time module init (was re-declaring ~14 prototypes per probe). - display_service _static_cache keyed by (edid_hash, ...) tuple with eviction of disappeared monitors — fixes stale capabilities on hot-plug swaps where the new topology has the same monitor count. - Visualizer rAF loop paused on document.hidden, resumed on visible. Reliability / bug fixes - Lifespan rewritten as try/yield/finally so a partial-startup failure cannot orphan background tasks or executors. - _run_callback in routes/media.py keeps a strong task ref (GC-safe) and uses the dedicated callback executor instead of the default pool. - macos_media.set_volume() no longer always returns True. - TrayManager._restart_requested initialised in __init__; set before signalling exit so the main thread observes it correctly. - Missing static_dir now logs a WARNING instead of silent UI disable. UX / accessibility / PWA - manifest.json theme_color and background_color match the Studio Reference base (#0E0D0B); added id and scope for PWA installability. - ARIA on mini-player icon buttons; inner SVGs marked aria-hidden. - OS mediaSession API wired so headset / lockscreen / Bluetooth buttons drive play/pause/next/prev/seek and show track metadata + artwork. Observability - X-Request-ID middleware (accept upstream id if it matches a safe regex, otherwise UUID4); request_id_var added to ContextVars and included in every log line alongside the token label. - Audit log (append-only JSONL) for every script + callback execution, including the on_play/on_pause/etc. event callbacks. Background-thread writer; queue capped; flushed in lifespan teardown. Deployment - proxy_headers + forwarded_allow_ips plumbed through Settings → uvicorn.Config for reverse-proxy installs. - HTTPS support via ssl_certfile + ssl_keyfile (+ optional password); startup refuses to launch with only one of the pair set. - Thumbnail cache moved from project-root .cache to %LOCALAPPDATA%/media-server/cache (Windows) and $XDG_CACHE_HOME/media-server/thumbnails (POSIX). Tests - 35 new tests across auth scopes, rate limiter, browser path traversal (../ NUL UNC absolute), script-param validation incl. regex, Gitea tag whitelist, config atomic write + POSIX perms. 47 passed / 4 skipped.
2026-05-22 22:25:54 +03:00
parent 450f9fe1ee
commit d131ba461c
31 changed files with 1586 additions and 204 deletions
@@ -0,0 +1,120 @@
+"""Append-only audit log for sensitive actions (script + callback execution).
+
+Writes a single JSONL line per event to ``<config_dir>/audit.log``. The log is
+write-only from the app's perspective — it never reads back, and rotation is
+left to the operator (the file size is dominated by stdout/stderr truncation,
+which is already capped at 10 KB per stream in `_run_script`).
+
+Designed to be cheap: the write goes through a small background thread so the
+hot path never blocks on disk I/O, and a failure to write is logged at WARNING
+but never raised to callers.
+"""
+
+from __future__ import annotations
+
+import json
+import logging
+import queue
+import threading
+import time
+from typing import Any
+
+from ..auth import token_label_var
+from ..config import get_config_dir
+
+logger = logging.getLogger(__name__)
+
+# Cap on stdout/stderr inside the audit record so a chatty script doesn't
+# explode the log. Mirrors the 10k cap used by _run_script.
+_OUTPUT_CAP = 2000
+
+_audit_queue: "queue.Queue[dict[str, Any] | None]" = queue.Queue(maxsize=1000)
+_audit_thread: threading.Thread | None = None
+_audit_lock = threading.Lock()
+
+
+def _ensure_writer_started() -> None:
+    global _audit_thread
+    with _audit_lock:
+        if _audit_thread is not None and _audit_thread.is_alive():
+            return
+        _audit_thread = threading.Thread(
+            target=_audit_writer_loop,
+            name="audit-log",
+            daemon=True,
+        )
+        _audit_thread.start()
+
+
+def _audit_writer_loop() -> None:
+    log_path = get_config_dir() / "audit.log"
+    while True:
+        try:
+            record = _audit_queue.get()
+        except Exception:
+            return
+        if record is None:
+            return
+        try:
+            line = json.dumps(record, ensure_ascii=False, default=str)
+            with open(log_path, "a", encoding="utf-8") as f:
+                f.write(line + "\n")
+        except OSError as e:
+            logger.warning("Failed to write audit record: %s", e)
+
+
+def _truncate(value: str | None) -> str | None:
+    if value is None:
+        return None
+    if len(value) <= _OUTPUT_CAP:
+        return value
+    return value[:_OUTPUT_CAP] + f"\n…[truncated, {len(value) - _OUTPUT_CAP} chars]"
+
+
+def record_script_execution(
+    *,
+    kind: str,
+    name: str,
+    exit_code: int | None,
+    duration: float | None,
+    stdout: str | None = None,
+    stderr: str | None = None,
+    error: str | None = None,
+) -> None:
+    """Append a single audit record. Never raises."""
+    _ensure_writer_started()
+    try:
+        record = {
+            "ts": time.time(),
+            "iso": time.strftime("%Y-%m-%dT%H:%M:%S", time.gmtime()),
+            "token_label": token_label_var.get("unknown"),
+            "kind": kind,
+            "name": name,
+            "exit_code": exit_code,
+            "duration_s": round(duration, 4) if duration is not None else None,
+            "success": exit_code == 0 if exit_code is not None else False,
+            "stdout": _truncate(stdout),
+            "stderr": _truncate(stderr),
+            "error": error,
+        }
+        _audit_queue.put_nowait(record)
+    except queue.Full:
+        # Backpressure: drop oldest record to make room. We'd rather lose an
+        # old entry than block the script that just ran.
+        try:
+            _audit_queue.get_nowait()
+            _audit_queue.put_nowait(record)
+        except queue.Empty:
+            pass
+    except Exception as e:
+        logger.warning("Failed to enqueue audit record: %s", e)
+
+
+def shutdown_audit_log() -> None:
+    """Flush the audit queue on app shutdown."""
+    try:
+        _audit_queue.put_nowait(None)
+    except queue.Full:
+        pass
+    if _audit_thread is not None:
+        _audit_thread.join(timeout=2)
@@ -192,10 +192,11 @@ _CACHE_TTL = 5.0  # seconds
 # Per-monitor cache of static capabilities (option lists + support flags).
 # DDC/CI capability discovery is the slow part — it only changes when a
 # monitor is replaced or rewired, so we probe it once per monitor and reuse
-# it across refreshes. Cleared on explicit `rediscover` or when the monitor
-# count changes (cheap stale-detection for hot-plug events).
-_static_cache: dict[int, dict] = {}
-_static_cache_monitor_count: int = -1
+# it across refreshes. Keyed by a stable identity tuple
+# (manufacturer, model, edid_hash) so that hot-plug swaps where the new
+# topology has the same number of monitors but different devices still
+# refresh the cache for the new monitor instead of serving stale capabilities.
+_static_cache: dict[tuple, dict] = {}


 def _enum_name(value, enum_cls=None) -> str | None:
@@ -353,7 +354,7 @@ def list_monitors(force_refresh: bool = False, rediscover: bool = False) -> list
            next probe re-runs DDC/CI capability discovery. Use after hot-plug
            or when a monitor's reported capabilities change.
    """
-    global _monitor_cache, _cache_time, _static_cache_monitor_count
+    global _monitor_cache, _cache_time

    if (
        not force_refresh
@@ -372,12 +373,11 @@ def list_monitors(force_refresh: bool = False, rediscover: bool = False) -> list
        info_list = sbc.list_monitors_info()
        brightnesses = sbc.get_brightness()

-        # Invalidate the static cache on explicit rediscover OR on topology
-        # change (hot-plug / disconnect). Both indicate the cached probe is
-        # potentially stale.
-        if rediscover or len(info_list) != _static_cache_monitor_count:
+        # Explicit rediscover wipes the whole cache; otherwise rely on stable
+        # per-monitor keys (manufacturer|model|edid_hash) so a hot-plug swap
+        # invalidates the entry for the missing monitor automatically.
+        if rediscover:
            _static_cache.clear()
-            _static_cache_monitor_count = len(info_list)

        mc = _load_monitorcontrol()
        ddc_monitors = []
@@ -387,6 +387,9 @@ def list_monitors(force_refresh: bool = False, rediscover: bool = False) -> list
            except Exception:
                pass

+        import hashlib
+
+        seen_keys: set[tuple] = set()
        for i, info in enumerate(info_list):
            name = info.get("name", f"Monitor {i}")
            model = info.get("model", "")
@@ -400,6 +403,21 @@ def list_monitors(force_refresh: bool = False, rediscover: bool = False) -> list
            edid = info.get("edid", "")
            resolution = _parse_edid_resolution(edid) if edid else None

+            # Stable cache key — EDID hash is unique per physical monitor.
+            # Fall back to (manufacturer, model, serial-ish) when EDID is
+            # missing, then to the legacy index as a last resort.
+            if edid:
+                edid_hash = hashlib.blake2b(
+                    edid.encode("utf-8") if isinstance(edid, str) else bytes(edid),
+                    digest_size=8,
+                ).hexdigest()
+                cache_key: tuple = ("edid", edid_hash)
+            elif manufacturer or model:
+                cache_key = ("mm", manufacturer, model, name)
+            else:
+                cache_key = ("idx", i)
+            seen_keys.add(cache_key)
+
            static: dict = {}
            dynamic: dict = {}

@@ -409,13 +427,13 @@ def list_monitors(force_refresh: bool = False, rediscover: bool = False) -> list
            if power_supported and i < len(ddc_monitors):
                try:
                    with ddc_monitors[i] as mon:
-                        if i not in _static_cache:
-                            _static_cache[i] = _probe_static_open(mon, mc, i)
-                        static = _static_cache[i]
+                        if cache_key not in _static_cache:
+                            _static_cache[cache_key] = _probe_static_open(mon, mc, i)
+                        static = _static_cache[cache_key]
                        dynamic = _probe_dynamic_open(mon, mc, i, static)
                except Exception as e:
                    logger.debug("Monitor %d: DDC/CI session failed: %s", i, e)
-                    static = _static_cache.get(i, {})
+                    static = _static_cache.get(cache_key, {})

            monitors.append(MonitorInfo(
                id=i,
@@ -439,6 +457,12 @@ def list_monitors(force_refresh: bool = False, rediscover: bool = False) -> list
                available_picture_modes=static.get("available_picture_modes", []),
                picture_mode_supported=static.get("picture_mode_supported", False),
            ))
+        # Evict cache entries for monitors that disappeared from this scan so
+        # the next hot-plug of a different monitor with the same identity
+        # tuple (e.g. same model) doesn't hit a stale entry first.
+        for stale_key in list(_static_cache.keys()):
+            if stale_key not in seen_keys:
+                _static_cache.pop(stale_key, None)
    except Exception as e:
        logger.error("Failed to enumerate monitors: %s", e)

@@ -86,9 +86,29 @@ class _Cache:

 _cache = _Cache()

+# Win32 handles + signatures are declared once at module load (when running on
+# Windows). The TTL cache fires this hundreds of times per minute; redoing the
+# DLL load + ~10 argtype assignments per call was the largest chunk of probe
+# cost. Keep these guarded behind a lazy init so non-Windows platforms don't
+# pay the import.
+_WIN32_INITIALIZED = False
+_win32_user32 = None
+_win32_kernel32 = None
+_win32_psapi = None
+
+
+def _init_win32_apis() -> None:
+    """Declare ctypes argtypes/restype on every Win32 call we make.
+
+    CRITICAL: ctypes defaults to `c_int` (32-bit) for HANDLE/HWND/HMONITOR
+    which silently truncates 64-bit pointer values on x64 — that corrupts the
+    handle so `CloseHandle()` can either fail or close the wrong kernel
+    object, and pointer-equality comparisons (monitor index lookup) miss.
+    """
+    global _WIN32_INITIALIZED, _win32_user32, _win32_kernel32, _win32_psapi
+    if _WIN32_INITIALIZED:
+        return

-def _probe_windows() -> ForegroundInfo:
-    """Probe foreground window state on Windows via Win32 API."""
    import ctypes
    import ctypes.wintypes as wt

@@ -96,11 +116,6 @@ def _probe_windows() -> ForegroundInfo:
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    psapi = ctypes.WinDLL("psapi", use_last_error=True)

-    # CRITICAL: declare argtypes/restype on every Win32 call that returns a
-    # HANDLE/HWND/HMONITOR. ctypes defaults to `c_int` (32-bit) which
-    # silently truncates 64-bit pointer values on x64 — that corrupts the
-    # handle so `CloseHandle()` can either fail or close the wrong kernel
-    # object, and pointer-equality comparisons (monitor index lookup) miss.
    user32.GetForegroundWindow.restype = wt.HWND
    user32.GetWindowThreadProcessId.argtypes = [wt.HWND, ctypes.POINTER(wt.DWORD)]
    user32.GetWindowThreadProcessId.restype = wt.DWORD
@@ -137,6 +152,20 @@ def _probe_windows() -> ForegroundInfo:
    psapi.GetModuleFileNameExW.argtypes = [wt.HANDLE, wt.HMODULE, wt.LPWSTR, wt.DWORD]
    psapi.GetModuleFileNameExW.restype = wt.DWORD

+    _win32_user32, _win32_kernel32, _win32_psapi = user32, kernel32, psapi
+    _WIN32_INITIALIZED = True
+
+
+def _probe_windows() -> ForegroundInfo:
+    """Probe foreground window state on Windows via Win32 API."""
+    import ctypes
+    import ctypes.wintypes as wt
+
+    _init_win32_apis()
+    user32 = _win32_user32
+    kernel32 = _win32_kernel32
+    psapi = _win32_psapi
+
    hwnd = user32.GetForegroundWindow()
    if not hwnd:
        return ForegroundInfo(available=True, error="no foreground window")
@@ -2,6 +2,7 @@

 import json
 import logging
+import re
 import urllib.error
 import urllib.request
 from typing import Optional
@@ -15,6 +16,11 @@ _DEFAULT_BASE_URL = "https://git.dolgolyov-family.by"
 _DEFAULT_OWNER = "alexei.dolgolyov"
 _DEFAULT_REPO = "media-player-server"

+# Restrictive tag whitelist — prevents a hostile Gitea response (or MITM) from
+# injecting `..`, slashes, or URL-altering characters into the release URL we
+# broadcast to clients. SemVer + pre-release suffix only.
+_TAG_RE = re.compile(r"^v?\d+\.\d+\.\d+(?:[\w.\-+]{0,32})?$")
+

 class GiteaReleaseProvider(ReleaseProvider):
    """Fetches the latest release from a Gitea repository."""
@@ -53,6 +59,9 @@ class GiteaReleaseProvider(ReleaseProvider):
                continue

            tag = release.get("tag_name", "")
+            if not isinstance(tag, str) or not _TAG_RE.match(tag):
+                logger.warning("Rejecting malformed release tag from upstream: %r", tag)
+                continue
            version = tag.lstrip("v")
            if not version:
                continue
@@ -264,8 +264,12 @@ class MacOSMediaController(MediaController):

    async def set_volume(self, volume: int) -> bool:
        """Set system volume."""
-        result = self._run_osascript(f"set volume output volume {volume}")
-        return result is not None or True  # osascript returns empty on success
+        # osascript returns empty string on success and None on failure (the
+        # _run_osascript helper catches subprocess errors). The previous
+        # `result is not None or True` always returned True regardless of
+        # outcome — surface real failures so the route can return 503.
+        result = self._run_osascript(f"set volume output volume {int(volume)}")
+        return result is not None

    async def toggle_mute(self) -> bool:
        """Toggle mute state."""
@@ -0,0 +1,95 @@
+"""In-process token-bucket rate limiter.
+
+Light enough for a single-process app: one dict keyed by ``(bucket, peer)``
+guarded by a thread lock. No extra dependency, no Redis. Good enough for
+defeating credential-stuffing and runaway clients on a LAN; not a substitute
+for an upstream WAF in a public deployment.
+
+Buckets:
+    auth        — failed-auth attempts, 5/min/peer (used in auth middleware)
+    execute     — script + callback execute calls, 10/min/peer (LAN-friendly)
+    default     — generic POST/DELETE writes, 60/min/peer
+"""
+
+from __future__ import annotations
+
+import logging
+import threading
+import time
+from dataclasses import dataclass
+from typing import Optional
+
+logger = logging.getLogger(__name__)
+
+
+@dataclass
+class BucketConfig:
+    capacity: float       # max tokens (= burst size)
+    refill_per_sec: float # tokens added per second
+
+
+# Defaults — tuned for "trusted LAN" use; operator can override via Settings.
+BUCKETS: dict[str, BucketConfig] = {
+    "auth":    BucketConfig(capacity=5,  refill_per_sec=5 / 60),     # 5/min
+    "execute": BucketConfig(capacity=10, refill_per_sec=10 / 60),    # 10/min
+    "default": BucketConfig(capacity=60, refill_per_sec=60 / 60),    # 60/min
+}
+
+
+_state: dict[tuple[str, str], tuple[float, float]] = {}
+_lock = threading.Lock()
+_LAST_CLEANUP = 0.0
+
+
+def _evict_stale_locked(now: float) -> None:
+    """Drop entries whose buckets are full (= idle for capacity / refill seconds)."""
+    global _LAST_CLEANUP
+    if now - _LAST_CLEANUP < 60:
+        return
+    _LAST_CLEANUP = now
+    stale = []
+    for key, (tokens, last) in _state.items():
+        bucket = BUCKETS.get(key[0])
+        if bucket is None:
+            continue
+        if tokens >= bucket.capacity and (now - last) > 3600:
+            stale.append(key)
+    for key in stale:
+        _state.pop(key, None)
+
+
+def check(bucket: str, peer: str) -> tuple[bool, Optional[float]]:
+    """Try to consume one token from ``(bucket, peer)``.
+
+    Returns:
+        (allowed, retry_after_seconds). When allowed=True retry_after is None.
+        When allowed=False, retry_after is the seconds to wait for one more token.
+    """
+    cfg = BUCKETS.get(bucket) or BUCKETS["default"]
+    now = time.monotonic()
+    with _lock:
+        _evict_stale_locked(now)
+        tokens, last = _state.get((bucket, peer), (cfg.capacity, now))
+        elapsed = max(0.0, now - last)
+        tokens = min(cfg.capacity, tokens + elapsed * cfg.refill_per_sec)
+        if tokens >= 1:
+            tokens -= 1
+            _state[(bucket, peer)] = (tokens, now)
+            return True, None
+        deficit = 1 - tokens
+        retry = deficit / cfg.refill_per_sec if cfg.refill_per_sec > 0 else 60
+        _state[(bucket, peer)] = (tokens, now)
+        return False, retry
+
+
+def get_peer(request) -> str:
+    """Best-effort peer identifier from a Starlette request.
+
+    Honors X-Forwarded-For (only when settings.proxy_headers is True, which is
+    already enforced by uvicorn's middleware) so a reverse-proxied install
+    still rate-limits per real client.
+    """
+    client = getattr(request, "client", None)
+    if client and client.host:
+        return client.host
+    return "unknown"
@@ -26,12 +26,23 @@ class ThumbnailService:
    def get_cache_dir() -> Path:
        """Get the thumbnail cache directory path.

-        Returns:
-            Path to the cache directory (project-local).
+        Returns user-writable platform cache dir so installs under
+        ``%PROGRAMFILES%`` / ``/opt`` work without elevated permissions.
+        Mirrors the platform branching of ``config.get_config_dir``.
        """
-        # Store cache in project directory: media-server/.cache/thumbnails/
-        project_root = Path(__file__).parent.parent.parent
-        cache_dir = project_root / ".cache" / "thumbnails"
+        import os
+
+        if os.name == "nt":
+            # %LOCALAPPDATA% so the cache survives roaming-profile sync.
+            base = Path(os.environ.get("LOCALAPPDATA")
+                        or os.environ.get("APPDATA")
+                        or Path.home() / "AppData" / "Local")
+            cache_dir = base / "media-server" / "cache" / "thumbnails"
+        else:
+            # XDG_CACHE_HOME convention; falls back to ~/.cache.
+            xdg = os.environ.get("XDG_CACHE_HOME")
+            base = Path(xdg) if xdg else Path.home() / ".cache"
+            cache_dir = base / "media-server" / "thumbnails"

        cache_dir.mkdir(parents=True, exist_ok=True)
        return cache_dir
@@ -33,9 +33,15 @@ class ConnectionManager:
        self._audio_task: asyncio.Task | None = None
        self._audio_analyzer = None

-    async def connect(self, websocket: WebSocket) -> None:
-        """Accept a new WebSocket connection."""
-        await websocket.accept()
+    async def connect(self, websocket: WebSocket, already_accepted: bool = False) -> None:
+        """Accept a new WebSocket connection.
+
+        ``already_accepted=True`` is for callers that needed to call
+        ``websocket.accept(subprotocol=...)`` themselves (token-via-subprotocol
+        auth path).
+        """
+        if not already_accepted:
+            await websocket.accept()
        async with self._lock:
            self._active_connections.add(websocket)
        logger.info(
@@ -31,8 +31,15 @@ def _thread_loop() -> asyncio.AbstractEventLoop:
        _thread_local.loop = loop
    return loop

-# Global storage for current album art (as bytes)
+# Global storage for current album art (as bytes). Guarded by _art_lock so the
+# WinRT polling thread and the FastAPI handler thread don't race on swap.
 _current_album_art_bytes: bytes | None = None
+_art_lock = threading.Lock()
+
+# Identity of the track whose art is currently in _current_album_art_bytes.
+# Used to gate the expensive WinRT thumbnail.open_read_async() so the bytes
+# aren't re-decoded on every 500ms status poll.
+_current_album_art_key: tuple | None = None

 # Lock protecting _position_cache and _track_skip_pending from concurrent access
 _position_lock = threading.Lock()
@@ -56,8 +63,9 @@ _track_skip_pending = {


 def get_current_album_art() -> bytes | None:
-    """Get the current album art bytes."""
-    return _current_album_art_bytes
+    """Get the current album art bytes (thread-safe snapshot)."""
+    with _art_lock:
+        return _current_album_art_bytes

 # Windows-specific imports
 try:
@@ -379,28 +387,48 @@ def _sync_get_media_status() -> dict[str, Any]:
                except Exception as e:
                    logger.debug(f"Timeline parse error: {e}")

-            # Try to get album art (requires media_props)
+            # Try to get album art (requires media_props). Gated by track key so
+            # the WinRT IPC + bytes copy only runs when the track actually
+            # changes; otherwise we just preserve the existing cached bytes.
            if media_props:
-                try:
-                    thumbnail = media_props.thumbnail
-                    if thumbnail:
-                        stream = loop.run_until_complete(thumbnail.open_read_async())
-                        if stream:
-                            size = stream.size
-                            if size > 0 and size < 10 * 1024 * 1024:  # Max 10MB
-                                from winsdk.windows.storage.streams import DataReader
-                                reader = DataReader(stream)
-                                loop.run_until_complete(reader.load_async(size))
-                                buffer = bytearray(size)
-                                reader.read_bytes(buffer)
-                                reader.close()
-                                stream.close()
+                track_key = (
+                    getattr(media_props, "title", "") or "",
+                    getattr(media_props, "artist", "") or "",
+                    getattr(media_props, "album_title", "") or "",
+                )
+                global _current_album_art_bytes, _current_album_art_key
+                if track_key == _current_album_art_key and _current_album_art_bytes:
+                    # Same track — reuse cached art bytes without touching WinRT.
+                    result["album_art_url"] = "/api/media/artwork"
+                else:
+                    try:
+                        thumbnail = media_props.thumbnail
+                        if thumbnail:
+                            stream = loop.run_until_complete(thumbnail.open_read_async())
+                            if stream:
+                                size = stream.size
+                                if size > 0 and size < 10 * 1024 * 1024:  # Max 10MB
+                                    from winsdk.windows.storage.streams import DataReader
+                                    reader = DataReader(stream)
+                                    loop.run_until_complete(reader.load_async(size))
+                                    buffer = bytearray(size)
+                                    reader.read_bytes(buffer)
+                                    reader.close()
+                                    stream.close()

-                                global _current_album_art_bytes
-                                _current_album_art_bytes = bytes(buffer)
-                                result["album_art_url"] = "/api/media/artwork"
-                except Exception as e:
-                    logger.debug(f"Failed to get album art: {e}")
+                                    with _art_lock:
+                                        _current_album_art_bytes = bytes(buffer)
+                                        _current_album_art_key = track_key
+                                    result["album_art_url"] = "/api/media/artwork"
+                        else:
+                            # No thumbnail on this track — drop stale bytes so
+                            # the ETag flips and clients don't keep showing the
+                            # previous album's cover.
+                            with _art_lock:
+                                _current_album_art_bytes = None
+                                _current_album_art_key = track_key
+                    except Exception as e:
+                        logger.debug(f"Failed to get album art: {e}")

            result["source"] = session.source_app_user_model_id