notify-bridge/packages/server/src/notify_bridge_server/services/manual_dispatch.py
alexei.dolgolyov 10d30fc956 feat: production readiness — security, perf, bug fixes, bridge self-monitoring
Comprehensive multi-area pass driven by a parallel 8-agent production
review. Frontend, backend, database, security, performance, operational,
plus a new self-monitoring feature.

## Critical fixes
- Planka webhook: reads bounded raw body (was NameError on every call)
- HA quiet hours: ha_state_changed/automation_triggered/service_called/
  event_fired added to deferrable set (were silently dropped)
- DNS-rebinding SSRF: PinnedResolver wired into shared aiohttp session
- Telegram inbound webhook: secret now mandatory (401 without)
- Generic webhook: auth_mode="none" requires explicit
  acknowledge_unauthenticated=true; per-IP rate limit 60/min
- svelte-check: 5 null-narrowing errors in EventDetailModal fixed
- Provider hardcoding: Immich-only block extracted to descriptor
  featureDiscoveryHint
- command_sync: snapshot+expunge bot before exiting AsyncSession

## Bug fixes
- notifier asyncio.gather(return_exceptions=True) — one bad chat no longer
  cancels peer sends
- NotificationDispatcher hoisted out of per-tracker loop
- Provider credential resolution unified across all 5 dispatch sites
- HA asyncio.shield now drains inner task on cancellation
- Provider construction switched from if/elif ladder to factory registry
- NUT first poll seeds silently (no spurious ups_on_battery)
- Quiet-hours gate: event-type-disabled now wins over deferral
- APScheduler drain job ID resolution upgraded to seconds
- HA on_status_change wired through to EventLog
- Webhook payload rollback failures now logged (not swallowed)
- Batched receivers/chats/bots in load_link_data (was per-target N+1)
- flag_modified on JSON column reassignments in deferred_dispatch

## Database
- UNIQUE indexes on service_provider.webhook_token,
  telegram_bot.webhook_path_id, partial UNIQUE on telegram_bot.bot_id,
  telegram_chat(bot_id, chat_id), notification_tracker_target unique link,
  partial UNIQUE on bridge_self provider per user
- Composite ix_event_log_user_event_type_created index
- save_chat_from_webhook switched to ON CONFLICT DO UPDATE
- ondelete=CASCADE on user-id FKs (model annotation; app-side cascade
  delete added for existing data)
- delete_notification_tracker converted from N+1 to bulk DELETE/UPDATE
- Module-level asyncio.Lock replaced with lazy _get_lock() pattern
- VACUUM INTO snapshot now PRAGMA integrity_check verified

## Performance
- Jinja2 template compilation LRU cached (lru_cache maxsize=512)
- Per-locale render cache in NotificationDispatcher (skips re-rendering
  identical content for receivers sharing a locale)
- Tracker list cached per provider_id with 5s TTL + explicit invalidation
  on tracker CRUD (relieves HA chat-bus rate query pressure)
- Nav-counts collapsed from 16 round-trips to single UNION ALL
- HA event_log: skip persisting empty assets_added/removed events

## Security hardening
- Mass-assignment guard on Action create/update; cron sub-minute reject
- Backup JSON depth/node-count cap (depth ≤ 10, nodes ≤ 100k)
- _sanitize_config extended to all JSON-typed fields on backup import
- Telegram _safe_get walks redirects manually with SSRF revalidation
- Bcrypt 72-byte password length cap with clear 422
- Webhook payload body redaction; sensitive substring set extended with
  oauth/client_secret/webhook_secret/csrf in both header filter and
  template extras filter

## Frontend
- 76 catch (err: any) sites converted to errMsg(err) helper
- globalProviderFilter: pure getter; reconciliation moved to one-time
  $effect in +layout
- Provider-filter binding: removed paired $effects + _syncingFilter flag,
  now one-way derived
- entity-cache: separate _refreshing flag for background re-fetches
- api.ts 401 handling: AuthRedirectError class + dedup _redirecting flag,
  goto() instead of window.location.href
- a11y: aria-expanded on mobile More, role=switch + aria-checked on
  Telegram bot toggles

## Tests & operations
- CI pytest gate added to .gitea/workflows/build.yml + release.yml
  (wheel-built install to dodge editable-install slowness)
- /api/ready upgraded to deep healthcheck (db SELECT 1, scheduler.running,
  HA supervisor presence) returning {ready, checks, errors, version}
- /api/metrics endpoint with prometheus_client (deferred_pending,
  event_log_total, dispatch_duration, poll_failures, send_failures)
- New OPERATIONS.md covering deploy, healthchecks, metrics, backup/restore
  procedures, log handling, common scenarios, upgrade flow
- New tests: test_bridge_self (11), test_gitea_parser (9),
  test_planka_parser (6), test_immich_change_detector (6),
  test_backup_roundtrip (1)

## New feature: bridge self-monitoring
- New bridge_self provider type — internal sink for bridge health events
- Three event types: bridge_self_poll_failures (consecutive tracker poll
  failures), bridge_self_deferred_backlog (pending count crosses
  threshold), bridge_self_target_failures (consecutive 5xx/network
  failures per target)
- Per-user thresholds (defaults: 3 / 100 / 5) configurable via the
  provider config form
- Auto-seeded on user create + /setup + boot backfill for existing users
- Anti-spam: counters reset after emission; backlog uses transition latch
- Self-loop guard: bridge_self failures don't count toward target-failure
  thresholds (logged only) — wire to your own Telegram/Email/Matrix to
  get notified when polls/dispatches/sends fail
- 6 default templates (3 events × 2 locales), tracking config columns
  with backfill migration, frontend descriptor (excluded from "create
  provider" wizard since auto-managed)

Operator-visible behavior changes (call out in release notes):
- NOTIFY_BRIDGE_TELEGRAM_WEBHOOK_SECRET now REQUIRED for webhook mode
- Existing webhook providers with auth_mode="none" need explicit opt-in
- Generic webhook endpoint rate-limited 60/min per source IP
- HA disconnect/reconnect writes ha_status_* EventLog rows
- Every user gets a bridge_self provider — wire it to a target to
  receive failure alerts

Pre-existing test failures (test_ssrf, test_release_provider) on
Python 3.13 are unrelated; CI runs on 3.12.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 02:16:49 +03:00

"""Test dispatch — manual trigger through the real NotificationDispatcher.
No separate logic — just builds a ServiceEvent + TargetConfig from DB
objects and dispatches through the same path the watcher uses.
"""

import logging
from typing import Any

from sqlmodel import select
from sqlmodel.ext.asyncio.session import AsyncSession

from notify_bridge_core.models.events import EventType, ServiceEvent
from notify_bridge_core.models.media import MediaAsset
from notify_bridge_core.notifications.dispatcher import NotificationDispatcher, TargetConfig
from notify_bridge_core.providers.base import ServiceProviderType

from ..database.models import (
    NotificationTarget,
    NotificationTracker,
    NotificationTrackerTarget,
    ServiceProvider,
    TemplateConfig,
    TemplateSlot,
    TrackingConfig,
)
from .dispatch_helpers import _resolve_target, resolve_provider_credential
from .watcher import _get_telegram_caches

_LOGGER = logging.getLogger(__name__)

# Maps test_type → DB template slot name
_TEST_TYPE_SLOT_MAP = {
    "periodic": "periodic_summary_message",
    "scheduled": "scheduled_assets_message",
    "memory": "memory_mode_message",
}


async def dispatch_test_notification(
    *,
    session: AsyncSession,
    tracker: NotificationTracker,
    tt: NotificationTrackerTarget,
    target: NotificationTarget,
    test_type: str,
    locale: str = "en",
) -> dict[str, Any]:
    """Dispatch a test notification through the real NotificationDispatcher."""
    # Load provider
    provider = await session.get(ServiceProvider, tracker.provider_id)
    if not provider:
        return {"success": False, "error": "Provider not found"}

    provider_config = dict(provider.config)
    collection_ids = list(tracker.collection_ids or [])

    # Resolve tracking config: per-link override, else the tracker's default.
    # The real watcher applies this fallback in ``load_link_data``; tests
    # must use the same logic or the user's per-tracker defaults look broken.
    tracking_config_id = tt.tracking_config_id or tracker.default_tracking_config_id
    tracking_config = None
    if tracking_config_id:
        tracking_config = await session.get(TrackingConfig, tracking_config_id)

    # Same fallback for template config.
    template_config_id = tt.template_config_id or tracker.default_template_config_id
    template_config = None
    template_slots: dict[str, dict[str, str]] | None = None
    slot_name = _TEST_TYPE_SLOT_MAP.get(test_type, test_type)
    if template_config_id:
        template_config = await session.get(TemplateConfig, template_config_id)
        if template_config:
            slot_result = await session.exec(
                select(TemplateSlot).where(
                    TemplateSlot.config_id == template_config.id,
                    TemplateSlot.slot_name == slot_name,
                )
            )
            locale_map: dict[str, str] = {}
            for s in slot_result.all():
                locale_map[s.locale] = s.template
            if locale_map:
                template_slots = {EventType.SCHEDULED_MESSAGE.value: locale_map}

    # Resolve target config + receivers (same as watcher; this already sets
    # each receiver.locale from TargetReceiver.locale or TelegramChat override)
    resolved = await _resolve_target(session, target)
    target_cfg = TargetConfig(
        type=resolved["target_type"],
        config=resolved["target_config"],
        template_slots=template_slots,
        locale=locale,
        date_format=template_config.date_format if template_config else "%d.%m.%Y, %H:%M UTC",
        date_only_format=template_config.date_only_format if template_config and template_config.date_only_format else "%d.%m.%Y",
        provider_api_key=resolve_provider_credential(provider_config),
        provider_internal_url=provider_config.get("url", ""),
        provider_external_url=provider_config.get("external_domain", ""),
        receivers=resolved["receivers"],
    )

    if not template_slots:
        if not template_config_id:
            return {
                "success": False,
                "error": (
                    "This tracker has no Template Config linked (neither on the "
                    "tracker's default nor on this target link). Assign one in the "
                    "tracker settings and make sure it defines a "
                    f"'{slot_name}' slot."
                ),
            }
        return {
            "success": False,
            "error": (
                f"No '{slot_name}' template defined in the linked Template Config "
                f"'{template_config.name if template_config else template_config_id}' "
                f"(locale: {locale}). Add the slot under Template Configs."
            ),
        }

    # Build events (single or per-album) via the shared helper so test and
    # cron dispatch stay in lockstep on the mode decision.
    try:
        if provider.type == "immich" and test_type in ("periodic", "scheduled", "memory"):
            events = await build_immich_dispatch_events(
                provider_config=provider_config,
                provider_name=provider.name or provider.type,
                tracker_name=tracker.name or "",
                collection_ids=collection_ids,
                kind=test_type,
                tracking_config=tracking_config,
            )
        else:
            ev = await _build_event(
                provider_type=provider.type,
                provider_config=provider_config,
                provider_name=provider.name or provider.type,
                tracker_name=tracker.name or "",
                tracker_filters=dict(tracker.filters) if tracker.filters else {},
                collection_ids=collection_ids,
                test_type=test_type,
                tracking_config=tracking_config,
            )
            events = [ev] if ev is not None else []
    except Exception as err:  # noqa: BLE001
        _LOGGER.exception("Test dispatch event build failed")
        return {"success": False, "error": f"Provider connection failed: {err}"}

    if not events:
        if test_type in ("scheduled", "memory"):
            # " for today" belongs inside the first sentence; appending it after
            # the closing period produced "...asset type). for today".
            return {
                "success": False,
                "error": (
                    "No matching assets found"
                    + (" for today" if test_type == "memory" else "")
                    + ". Verify the tracker's albums contain assets "
                    "that pass the tracking config filters (favorites only, rating, asset type)."
                ),
            }
        return {
            "success": False,
            "error": (
                "Provider returned no data. Check that the provider is reachable, "
                "credentials are valid, and the tracker has collections configured."
            ),
        }

    # Dispatch each event to the same target (per-album fan-out sends N messages).
    # Apply display filters so the test notification matches production behavior
    # for ``favorites_only``, ``include_tags``, ``include_asset_details``, etc.
    from .dispatch_helpers import apply_tracking_display_filters

    url_cache, asset_cache = await _get_telegram_caches()
    dispatcher = NotificationDispatcher(url_cache=url_cache, asset_cache=asset_cache)
    all_results: list[dict[str, Any]] = []
    for event in events:
        shaped_event = apply_tracking_display_filters(event, tracking_config)
        if shaped_event is None:
            all_results.append({
                "success": False,
                "error": (
                    "Event suppressed by tracking config (favorites_only is on "
                    "but no added assets are favorites)."
                ),
            })
            continue
        results = await dispatcher.dispatch(shaped_event, [target_cfg])
        if results:
            all_results.append(results[0])

    if not all_results:
        return {"success": False, "error": "No dispatch results"}

    all_ok = all(r.get("success") for r in all_results)
    if all_ok:
        return {"success": True, "dispatched": len(all_results)}

    first_err = next(
        (r.get("error") for r in all_results if not r.get("success")),
        "Unknown error",
    )
    return {
        "success": False,
        "error": first_err,
        "dispatched": sum(1 for r in all_results if r.get("success")),
        "failed": sum(1 for r in all_results if not r.get("success")),
    }


async def build_immich_dispatch_events(
    *,
    provider_config: dict,
    provider_name: str,
    tracker_name: str,
    collection_ids: list[str],
    kind: str,
    tracking_config: TrackingConfig | None,
) -> list[ServiceEvent]:
    """Build the list of ServiceEvents to dispatch for an Immich scheduled kind.

    Single source of truth for the mode decision: ``periodic`` is always one
    summary event; ``scheduled``/``memory`` honour the ``{kind}_collection_mode``
    on the tracking config and fan out one event per album in ``per_collection``
    mode, or one combined event in ``combined`` mode.

    Empty-payload filtering (no assets matched) is applied here so callers get
    back only events that should actually dispatch. ``periodic`` is exempt:
    a zero-asset summary is still meaningful (shows album stats only).
    """
    if kind == "periodic":
        ev = await _build_immich_periodic_event(
            provider_config=provider_config,
            provider_name=provider_name,
            tracker_name=tracker_name,
            collection_ids=collection_ids,
        )
        return [ev] if ev is not None else []

    mode = getattr(
        tracking_config, f"{kind}_collection_mode", "combined"
    ) or "combined"

    if mode == "per_collection" and len(collection_ids) > 1:
        events: list[ServiceEvent] = []
        for aid in collection_ids:
            ev = await _build_immich_event(
                provider_config=provider_config,
                provider_name=provider_name,
                tracker_name=tracker_name,
                collection_ids=[aid],
                test_type=kind,
                tracking_config=tracking_config,
            )
            if ev is not None and ev.added_assets:
                events.append(ev)
        return events

    ev = await _build_immich_event(
        provider_config=provider_config,
        provider_name=provider_name,
        tracker_name=tracker_name,
        collection_ids=collection_ids,
        test_type=kind,
        tracking_config=tracking_config,
    )
    if ev is None or not ev.added_assets:
        return []
    return [ev]


async def _build_event(
    *,
    provider_type: str,
    provider_config: dict,
    provider_name: str,
    tracker_name: str,
    tracker_filters: dict,
    collection_ids: list[str],
    test_type: str,
    tracking_config: TrackingConfig | None = None,
) -> ServiceEvent | None:
    """Build a ServiceEvent with real provider data."""
    from datetime import datetime, timezone

    if provider_type == "immich":
        if test_type == "periodic":
            return await _build_immich_periodic_event(
                provider_config=provider_config,
                provider_name=provider_name,
                tracker_name=tracker_name,
                collection_ids=collection_ids,
            )
        return await _build_immich_event(
            provider_config=provider_config,
            provider_name=provider_name,
            tracker_name=tracker_name,
            collection_ids=collection_ids,
            test_type=test_type,
            tracking_config=tracking_config,
        )
    elif provider_type == "scheduler":
        from notify_bridge_core.providers.scheduler import SchedulerServiceProvider

        custom_vars = tracker_filters.get("custom_variables", {})
        sched = SchedulerServiceProvider(
            name=provider_name,
            tracker_name=tracker_name,
            custom_variables=custom_vars,
        )
        events, _ = await sched.poll(collection_ids, {})
        return events[0] if events else None

    return None


async def _build_immich_event(
    *,
    provider_config: dict,
    provider_name: str,
    tracker_name: str,
    collection_ids: list[str],
    test_type: str,
    tracking_config: TrackingConfig | None = None,
) -> ServiceEvent | None:
    """Build an Immich scheduled/memory event using shared core utilities."""
    from datetime import datetime, timezone

    from notify_bridge_core.providers.immich import ImmichServiceProvider
    from notify_bridge_core.providers.immich.asset_utils import collect_scheduled_assets
    from notify_bridge_core.providers.immich.models import ImmichAlbumData, SharedLinkInfo

    ext_domain = provider_config.get("external_domain") or provider_config.get("url", "")
    prefix = "memory" if test_type == "memory" else "scheduled"
    limit = getattr(tracking_config, f"{prefix}_limit", 10) if tracking_config else 10
    asset_type = getattr(tracking_config, f"{prefix}_asset_type", "all") if tracking_config else "all"
    favorite_only = getattr(tracking_config, f"{prefix}_favorite_only", False) if tracking_config else False
    min_rating = getattr(tracking_config, f"{prefix}_min_rating", 0) if tracking_config else 0
    memory_source = getattr(tracking_config, "memory_source", "albums") if tracking_config else "albums"
    is_memory = test_type == "memory"

    from .http_session import get_http_session

    http_session = await get_http_session()
    immich = ImmichServiceProvider(
        http_session,
        provider_config.get("url", ""),
        provider_config.get("api_key", ""),
        provider_config.get("external_domain"),
        provider_name,
    )
    if not await immich.connect():
        return None

    # Native Immich memories API path
    if is_memory and memory_source == "native":
        return await _build_native_memory_event(
            immich, ext_domain, provider_name, tracker_name,
            collection_ids, limit, asset_type, favorite_only, min_rating,
        )

    # Album-based path: use shared collect_scheduled_assets.
    # Fetch albums + shared links in parallel: on a 20-album tracker the old
    # serial ``await`` loop took ~2 × 20 × RTT, now it's one round-trip.
    import asyncio as _asyncio

    album_tasks = [immich.client.get_album(aid) for aid in collection_ids]
    link_tasks = [immich.client.get_shared_links(aid) for aid in collection_ids]
    album_results, link_results = await _asyncio.gather(
        _asyncio.gather(*album_tasks, return_exceptions=True),
        _asyncio.gather(*link_tasks, return_exceptions=True),
    )
    albums: dict[str, ImmichAlbumData] = {}
    shared_links: dict[str, list[SharedLinkInfo]] = {}
    for album_id, album, links in zip(collection_ids, album_results, link_results):
        if isinstance(album, Exception) or album is None:
            continue
        albums[album_id] = album
        shared_links[album_id] = links if not isinstance(links, Exception) else []

    assets, collections_extra = collect_scheduled_assets(
        albums, shared_links, ext_domain,
        limit=limit,
        asset_type=asset_type,
        favorite_only=favorite_only,
        min_rating=min_rating,
        is_memory=is_memory,
    )
    first_col = collections_extra[0] if collections_extra else {}
    return ServiceEvent(
        event_type=EventType.SCHEDULED_MESSAGE,
        provider_type=ServiceProviderType.IMMICH,
        provider_name=provider_name,
        collection_id=collection_ids[0] if collection_ids else "",
        collection_name=first_col.get("name", tracker_name),
        timestamp=datetime.now(timezone.utc),
        added_assets=assets,
        added_count=len(assets),
        extra={
            "collections": collections_extra,
            "albums": collections_extra,
            **(first_col if first_col else {}),
        },
    )


async def _build_immich_periodic_event(
    *,
    provider_config: dict,
    provider_name: str,
    tracker_name: str,
    collection_ids: list[str],
) -> ServiceEvent | None:
    """Build a periodic-summary event (album stats only, no assets).

    Reuses the same shared core utility (`collect_scheduled_assets`) that
    scheduled/memory tests use, invoked with limit=0 so we get the full
    ``collections_extra`` block (album name/url/counts/...) without selecting
    any individual assets, which is exactly what the
    ``periodic_summary_message`` template renders.
    """
    from datetime import datetime, timezone

    from notify_bridge_core.providers.immich import ImmichServiceProvider
    from notify_bridge_core.providers.immich.asset_utils import collect_scheduled_assets
    from notify_bridge_core.providers.immich.models import ImmichAlbumData, SharedLinkInfo

    from .http_session import get_http_session

    http_session = await get_http_session()
    immich = ImmichServiceProvider(
        http_session,
        provider_config.get("url", ""),
        provider_config.get("api_key", ""),
        provider_config.get("external_domain"),
        provider_name,
    )
    if not await immich.connect():
        return None

    ext_domain = provider_config.get("external_domain") or provider_config.get("url", "")

    # Parallel fetch; see _build_immich_event above for the same rationale.
    import asyncio as _asyncio

    album_tasks = [immich.client.get_album(aid) for aid in collection_ids]
    link_tasks = [immich.client.get_shared_links(aid) for aid in collection_ids]
    album_results, link_results = await _asyncio.gather(
        _asyncio.gather(*album_tasks, return_exceptions=True),
        _asyncio.gather(*link_tasks, return_exceptions=True),
    )
    albums: dict[str, ImmichAlbumData] = {}
    shared_links: dict[str, list[SharedLinkInfo]] = {}
    for album_id, album, links in zip(collection_ids, album_results, link_results):
        if isinstance(album, Exception) or album is None:
            continue
        albums[album_id] = album
        shared_links[album_id] = links if not isinstance(links, Exception) else []

    # limit=0 → returns ([], collections_extra) with full per-album stats.
    _assets, collections_extra = collect_scheduled_assets(
        albums, shared_links, ext_domain,
        limit=0,
        asset_type="all",
        favorite_only=False,
        min_rating=0,
        is_memory=False,
    )
    first_col = collections_extra[0] if collections_extra else {}
    return ServiceEvent(
        event_type=EventType.SCHEDULED_MESSAGE,
        provider_type=ServiceProviderType.IMMICH,
        provider_name=provider_name,
        collection_id=collection_ids[0] if collection_ids else "",
        collection_name=first_col.get("name", tracker_name),
        timestamp=datetime.now(timezone.utc),
        added_assets=[],
        added_count=0,
        extra={
            "collections": collections_extra,
            "albums": collections_extra,
            **(first_col if first_col else {}),
        },
    )


async def _build_native_memory_event(
    immich,
    ext_domain: str,
    provider_name: str,
    tracker_name: str,
    collection_ids: list[str],
    limit: int,
    asset_type: str,
    favorite_only: bool,
    min_rating: int,
) -> ServiceEvent | None:
    """Build event from Immich native memories API."""
    import random
    from datetime import datetime, timezone

    from notify_bridge_core.models.media import MediaAsset, MediaType
    from notify_bridge_core.providers.immich.asset_utils import filter_assets
    from notify_bridge_core.providers.immich.models import ImmichAssetInfo

    memories = await immich.client.get_memories()
    tracked_ids = set(collection_ids) if collection_ids else None

    # Collect raw assets, convert to ImmichAssetInfo for unified filtering
    raw_assets: list[ImmichAssetInfo] = []
    year_map: dict[str, int | None] = {}  # asset_id → memory year
    for mem in memories:
        mem_year = mem.get("data", {}).get("year")
        for raw in mem.get("assets", []):
            asset_id = raw.get("id", "")
            if tracked_ids:
                asset_albums = raw.get("albums", [])
                if not any(a.get("id") in tracked_ids for a in asset_albums):
                    continue
            asset = ImmichAssetInfo.from_api_response(raw)
            if not asset.is_processed:
                continue
            raw_assets.append(asset)
            year_map[asset_id] = mem_year

    # Apply standard filters (no memory_date; native API already filters by date)
    filtered = filter_assets(
        raw_assets,
        favorite_only=favorite_only,
        min_rating=min_rating,
        asset_type=asset_type,
    )

    # Random sample
    if len(filtered) > limit:
        selected = random.sample(filtered, limit)
    else:
        random.shuffle(filtered)
        selected = filtered

    from notify_bridge_core.providers.immich.asset_utils import asset_to_media

    all_assets = []
    for asset in selected:
        media = asset_to_media(asset, ext_domain)
        media.extra["year"] = year_map.get(asset.id)
        all_assets.append(media)

    return ServiceEvent(
        event_type=EventType.SCHEDULED_MESSAGE,
        provider_type=ServiceProviderType.IMMICH,
        provider_name=provider_name,
        collection_id=collection_ids[0] if collection_ids else "",
        collection_name=tracker_name,
        timestamp=datetime.now(timezone.utc),
        added_assets=all_assets,
        added_count=len(all_assets),
        extra={
            "collections": [],
            "albums": [],
        },
    )