notify-bridge/packages/server/tests/test_backup_roundtrip.py
alexei.dolgolyov 10d30fc956 feat: production readiness — security, perf, bug fixes, bridge self-monitoring
Comprehensive multi-area pass driven by a parallel 8-agent production
review. Covers frontend, backend, database, security, performance, and
operations, plus a new self-monitoring feature.

## Critical fixes
- Planka webhook: reads bounded raw body (was NameError on every call)
- HA quiet hours: ha_state_changed/automation_triggered/service_called/
  event_fired added to deferrable set (were silently dropped)
- DNS-rebinding SSRF: PinnedResolver wired into shared aiohttp session
- Telegram inbound webhook: secret now mandatory (401 without)
- Generic webhook: auth_mode="none" requires explicit
  acknowledge_unauthenticated=true; per-IP rate limit 60/min
- svelte-check: 5 null-narrowing errors in EventDetailModal fixed
- Provider hardcoding: Immich-only block extracted to descriptor
  featureDiscoveryHint
- command_sync: snapshot+expunge bot before exiting AsyncSession

## Bug fixes
- notifier asyncio.gather(return_exceptions=True) — one bad chat no longer
  cancels peer sends
- NotificationDispatcher hoisted out of per-tracker loop
- Provider credential resolution unified across all 5 dispatch sites
- HA asyncio.shield now drains inner task on cancellation
- Provider construction switched from if/elif ladder to factory registry
- NUT first poll seeds silently (no spurious ups_on_battery)
- Quiet-hours gate: event-type-disabled now wins over deferral
- APScheduler drain job ID resolution upgraded to seconds
- HA on_status_change wired through to EventLog
- Webhook payload rollback failures now logged (not swallowed)
- Batched receivers/chats/bots in load_link_data (was per-target N+1)
- flag_modified on JSON column reassignments in deferred_dispatch
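
The first bullet in this list (one bad chat no longer cancelling peer sends) can be illustrated with a minimal sketch. `send_to_chat`, `send_all`, and the chat IDs are hypothetical stand-ins, not the notifier's real API; the only thing taken from the commit is the use of `asyncio.gather(return_exceptions=True)`.

```python
import asyncio


async def send_to_chat(chat_id: str) -> str:
    """Hypothetical sender: fails for one chat, succeeds for the rest."""
    await asyncio.sleep(0)
    if chat_id == "bad":
        raise RuntimeError(f"send failed for {chat_id}")
    return f"sent:{chat_id}"


async def send_all(chat_ids: list[str]) -> tuple[list[str], list[BaseException]]:
    # return_exceptions=True hands failures back as result values instead of
    # propagating the first exception to the caller.
    results = await asyncio.gather(
        *(send_to_chat(c) for c in chat_ids), return_exceptions=True
    )
    ok = [r for r in results if not isinstance(r, BaseException)]
    failed = [r for r in results if isinstance(r, BaseException)]
    return ok, failed


ok, failed = asyncio.run(send_all(["a", "bad", "b"]))
```

Results come back in input order, so a caller can pair each failure with its chat and log it without interrupting the other sends.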

## Database
- UNIQUE indexes on service_provider.webhook_token,
  telegram_bot.webhook_path_id, partial UNIQUE on telegram_bot.bot_id,
  telegram_chat(bot_id, chat_id), notification_tracker_target unique link,
  partial UNIQUE on bridge_self provider per user
- Composite ix_event_log_user_event_type_created index
- save_chat_from_webhook switched to ON CONFLICT DO UPDATE
- ondelete=CASCADE on user-id FKs (model annotation; app-side cascade
  delete added for existing data)
- delete_notification_tracker converted from N+1 to bulk DELETE/UPDATE
- Module-level asyncio.Lock replaced with lazy _get_lock() pattern
- VACUUM INTO snapshot now PRAGMA integrity_check verified
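
The last bullet (verifying a `VACUUM INTO` snapshot with `PRAGMA integrity_check`) could look roughly like the sketch below. It uses the stdlib `sqlite3` module rather than the project's async engine, and `snapshot_and_verify` plus the file names are assumptions made for illustration.

```python
import sqlite3
import tempfile
from pathlib import Path


def snapshot_and_verify(source: sqlite3.Connection, dest: Path) -> bool:
    """Write a snapshot with VACUUM INTO, then integrity-check the copy."""
    # VACUUM cannot run inside a transaction, so commit pending work first.
    source.commit()
    source.execute("VACUUM INTO ?", (str(dest),))
    copy = sqlite3.connect(dest)
    try:
        rows = copy.execute("PRAGMA integrity_check").fetchall()
    finally:
        copy.close()
    # A healthy database reports a single row containing "ok".
    return rows == [("ok",)]


with tempfile.TemporaryDirectory() as tmp:
    src = sqlite3.connect(Path(tmp) / "live.db")
    src.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")
    src.execute("INSERT INTO t (name) VALUES ('row')")
    snapshot_ok = snapshot_and_verify(src, Path(tmp) / "snapshot.db")
    src.close()
```

Checking the copy rather than the live file means a corrupt snapshot is caught before it is kept as a backup.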

## Performance
- Jinja2 template compilation LRU cached (lru_cache maxsize=512)
- Per-locale render cache in NotificationDispatcher (skips re-rendering
  identical content for receivers sharing a locale)
- Tracker list cached per provider_id with 5s TTL + explicit invalidation
  on tracker CRUD (relieves HA chat-bus rate query pressure)
- Nav-counts collapsed from 16 round-trips to single UNION ALL
- HA event_log: skip persisting empty assets_added/removed events
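
The tracker-list cache (5s TTL plus explicit invalidation on CRUD) follows a common pattern that can be sketched as below. `TTLCache`, `load_trackers`, and the injected clock are hypothetical names for illustration; the real cache's structure is not shown in this commit message.

```python
import time
from typing import Callable


class TTLCache:
    """Tiny per-key TTL cache with explicit invalidation (illustrative only)."""

    def __init__(self, ttl: float = 5.0, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl
        self._clock = clock
        self._entries: dict[int, tuple[float, list[str]]] = {}

    def get(self, provider_id: int, loader: Callable[[int], list[str]]) -> list[str]:
        now = self._clock()
        hit = self._entries.get(provider_id)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # still fresh: skip the loader
        value = loader(provider_id)
        self._entries[provider_id] = (now, value)
        return value

    def invalidate(self, provider_id: int) -> None:
        # Called on tracker create/update/delete so readers never see stale rows.
        self._entries.pop(provider_id, None)


calls: list[int] = []
now = [0.0]  # fake clock so the example does not need to sleep


def load_trackers(provider_id: int) -> list[str]:
    calls.append(provider_id)
    return [f"tracker-{provider_id}-{len(calls)}"]


cache = TTLCache(ttl=5.0, clock=lambda: now[0])
first = cache.get(1, load_trackers)   # miss: loader runs
second = cache.get(1, load_trackers)  # hit: served from cache
now[0] = 6.0
third = cache.get(1, load_trackers)   # expired: loader runs again
cache.invalidate(1)
fourth = cache.get(1, load_trackers)  # invalidated: loader runs again
```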

## Security hardening
- Mass-assignment guard on Action create/update; cron sub-minute reject
- Backup JSON depth/node-count cap (depth ≤ 10, nodes ≤ 100k)
- _sanitize_config extended to all JSON-typed fields on backup import
- Telegram _safe_get walks redirects manually with SSRF revalidation
- Bcrypt 72-byte password length cap with clear 422
- Webhook payload body redaction; sensitive substring set extended with
  oauth/client_secret/webhook_secret/csrf in both header filter and
  template extras filter
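
A depth and node-count cap like the one described for backup JSON can be enforced with an iterative walk over the parsed document. This is a sketch only: `check_json_limits` is a hypothetical name, and counting the top-level value as depth 1 is an assumption, since the commit message does not say how depth is measured.

```python
import json
from typing import Any

MAX_DEPTH = 10
MAX_NODES = 100_000


def check_json_limits(
    value: Any, max_depth: int = MAX_DEPTH, max_nodes: int = MAX_NODES
) -> int:
    """Return the node count, or raise ValueError if a limit is exceeded."""
    nodes = 0
    # Explicit stack instead of recursion, so hostile input cannot blow
    # Python's recursion limit during the check itself.
    stack: list[tuple[Any, int]] = [(value, 1)]
    while stack:
        item, depth = stack.pop()
        nodes += 1
        if depth > max_depth:
            raise ValueError(f"JSON nesting exceeds depth {max_depth}")
        if nodes > max_nodes:
            raise ValueError(f"JSON exceeds {max_nodes} nodes")
        if isinstance(item, dict):
            stack.extend((v, depth + 1) for v in item.values())
        elif isinstance(item, list):
            stack.extend((v, depth + 1) for v in item)
    return nodes


small_count = check_json_limits(json.loads('{"a": [1, 2, {"b": null}]}'))

nested: Any = []
for _ in range(10):
    nested = [nested]  # 11 levels of lists in total
try:
    check_json_limits(nested)
    too_deep_rejected = False
except ValueError:
    too_deep_rejected = True
```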

## Frontend
- 76 catch (err: any) sites converted to errMsg(err) helper
- globalProviderFilter: pure getter; reconciliation moved to one-time
  $effect in +layout
- Provider-filter binding: removed paired $effects + _syncingFilter flag,
  now one-way derived
- entity-cache: separate _refreshing flag for background re-fetches
- api.ts 401 handling: AuthRedirectError class + dedup _redirecting flag,
  goto() instead of window.location.href
- a11y: aria-expanded on mobile More, role=switch + aria-checked on
  Telegram bot toggles

## Tests & operations
- CI pytest gate added to .gitea/workflows/build.yml + release.yml
  (wheel-built install to dodge editable-install slowness)
- /api/ready upgraded to deep healthcheck (db SELECT 1, scheduler.running,
  HA supervisor presence) returning {ready, checks, errors, version}
- /api/metrics endpoint with prometheus_client (deferred_pending,
  event_log_total, dispatch_duration, poll_failures, send_failures)
- New OPERATIONS.md covering deploy, healthchecks, metrics, backup/restore
  procedures, log handling, common scenarios, upgrade flow
- New tests: test_bridge_self (11), test_gitea_parser (9),
  test_planka_parser (6), test_immich_change_detector (6),
  test_backup_roundtrip (1)

## New feature: bridge self-monitoring
- New bridge_self provider type — internal sink for bridge health events
- Three event types: bridge_self_poll_failures (consecutive tracker poll
  failures), bridge_self_deferred_backlog (pending count crosses
  threshold), bridge_self_target_failures (consecutive 5xx/network
  failures per target)
- Per-user thresholds (defaults: 3 / 100 / 5) configurable via the
  provider config form
- Auto-seeded on user create + /setup + boot backfill for existing users
- Anti-spam: counters reset after emission; backlog uses transition latch
- Self-loop guard: bridge_self failures don't count toward target-failure
  thresholds (logged only) — wire to your own Telegram/Email/Matrix to
  get notified when polls/dispatches/sends fail
- 6 default templates (3 events × 2 locales), tracking config columns
  with backfill migration, frontend descriptor (excluded from "create
  provider" wizard since auto-managed)
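
The anti-spam rules above (counters reset after emission; backlog uses a transition latch) can be sketched as two small state holders. The class names are hypothetical, and treating "crosses threshold" as `pending >= threshold` is an assumption; the defaults mirror the 3 / 100 values listed above.

```python
from dataclasses import dataclass


@dataclass
class ConsecutiveFailureCounter:
    """Emits once when `threshold` consecutive failures accumulate, then resets."""

    threshold: int = 3
    count: int = 0

    def record(self, ok: bool) -> bool:
        if ok:
            self.count = 0  # any success breaks the streak
            return False
        self.count += 1
        if self.count >= self.threshold:
            self.count = 0  # reset after emission to avoid repeat alerts
            return True
        return False


@dataclass
class BacklogLatch:
    """Emits only on the transition from below to at-or-above the threshold."""

    threshold: int = 100
    tripped: bool = False

    def observe(self, pending: int) -> bool:
        above = pending >= self.threshold
        fire = above and not self.tripped
        self.tripped = above  # re-arms once the backlog drops below threshold
        return fire


counter = ConsecutiveFailureCounter(threshold=3)
poll_events = [
    counter.record(ok) for ok in (False, False, True, False, False, False, False)
]

latch = BacklogLatch(threshold=100)
backlog_events = [latch.observe(n) for n in (50, 120, 150, 80, 100)]
```

The counter fires once per completed streak, while the latch stays silent for as long as the backlog remains above the threshold and fires again only after it has dropped and risen again.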

Operator-visible behavior changes (call out in release notes):
- NOTIFY_BRIDGE_TELEGRAM_WEBHOOK_SECRET now REQUIRED for webhook mode
- Existing webhook providers with auth_mode="none" need explicit opt-in
- Generic webhook endpoint rate-limited 60/min per source IP
- HA disconnect/reconnect writes ha_status_* EventLog rows
- Every user gets a bridge_self provider — wire it to a target to
  receive failure alerts

Pre-existing test failures (test_ssrf, test_release_provider) on
Python 3.13 are unrelated; CI runs on 3.12.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 02:16:49 +03:00


"""End-to-end backup roundtrip: seed -> export -> wipe -> import -> verify.
Drives the backup service module directly (no HTTP layer) against a fresh
SQLite DB built in the conftest temp data dir. Verifies entity counts and
key fields survive a full round-trip.
Kept under 5s by avoiding the lifespan startup — we build a private engine
in an isolated DB file so we don't share state with other tests in the
session.
"""
from __future__ import annotations

from pathlib import Path

import pytest
from sqlalchemy.ext.asyncio import create_async_engine
from sqlmodel import SQLModel, select
from sqlmodel.ext.asyncio.session import AsyncSession


@pytest.fixture
async def isolated_engine(tmp_path: Path):
    """A throwaway SQLite engine + freshly created schema for one test.

    Avoids the global engine in ``database.engine``: tests in the same
    session share that singleton, and recreating tables on it would corrupt
    parallel tests' state.
    """
    # Importing the module registers all SQLModel tables on the metadata.
    from notify_bridge_server.database import models  # noqa: F401

    db_path = tmp_path / "roundtrip.db"
    engine = create_async_engine(f"sqlite+aiosqlite:///{db_path}")
    async with engine.begin() as conn:
        await conn.run_sync(SQLModel.metadata.create_all)
    yield engine
    await engine.dispose()


async def _seed(session: AsyncSession, user_id: int) -> dict[str, int]:
    """Insert enough rows to exercise the major code paths in import/export."""
    from notify_bridge_server.database.models import (
        EventLog,
        NotificationTarget,
        NotificationTracker,
        ServiceProvider,
        TargetReceiver,
        TelegramBot,
        TrackingConfig,
        User,
    )

    user = User(
        id=user_id,
        username="roundtrip-user",
        hashed_password="hash",
        role="user",
    )
    session.add(user)
    await session.flush()

    bot = TelegramBot(
        user_id=user_id, name="Test bot", token="123456:fake-token-value",
        bot_username="testbot", bot_id=1,
    )
    session.add(bot)
    await session.flush()

    provider = ServiceProvider(
        user_id=user_id, type="immich", name="Immich prod",
        config={"base_url": "https://immich.example.com", "api_key": "secret"},
    )
    session.add(provider)
    await session.flush()

    target = NotificationTarget(
        user_id=user_id, type="telegram", name="My channel",
        config={"bot_token_id": bot.id, "disable_url_preview": True},
    )
    session.add(target)
    await session.flush()

    receiver = TargetReceiver(
        target_id=target.id, name="Channel A",
        config={"chat_id": "-100123"}, receiver_key="-100123", locale="en",
    )
    session.add(receiver)

    tc = TrackingConfig(
        user_id=user_id, provider_type="immich",
        name="Default Immich tracking", track_assets_added=True,
    )
    session.add(tc)
    await session.flush()

    tracker = NotificationTracker(
        user_id=user_id, provider_id=provider.id,
        name="Family album tracker", scan_interval=120,
        collection_ids=["album-uuid-1"],
    )
    session.add(tracker)
    await session.flush()

    # Capture IDs before commit: accessing attributes after commit
    # triggers a refresh that needs an async-IO context the test caller
    # may not be inside. Better to snapshot now and use plain ints later.
    ids = {
        "provider_id": provider.id,
        "target_id": target.id,
        "bot_id": bot.id,
        "tracker_id": tracker.id,
        "tracking_config_id": tc.id,
        "tracker_name": tracker.name,
        "provider_name": provider.name,
    }

    # EventLog rows are NOT in the backup schema: they're operational data,
    # not configuration. Insert a few anyway so we can verify they survive
    # the export step (since export only reads, never writes/wipes them).
    for i in range(3):
        session.add(EventLog(
            user_id=user_id, tracker_id=ids["tracker_id"], tracker_name=ids["tracker_name"],
            provider_id=ids["provider_id"], provider_name=ids["provider_name"],
            event_type="assets_added", collection_id="album-uuid-1",
            collection_name="Family", assets_count=i,
        ))
    await session.commit()
    return ids


async def _wipe_user_owned_rows(engine, user_id: int) -> None:  # noqa: ARG001
    """Delete every backup-able row via raw SQL.

    The DELETEs are unfiltered (``user_id`` is unused) because the isolated
    DB only holds this one user's rows.

    Using ORM-level deletes triggers SQLAlchemy's cascade machinery, which
    lazy-loads relationships in a sync context that the async driver cannot
    serve (MissingGreenlet). Raw DELETE statements skip cascades and let
    SQLite's FKs enforce ordering naturally.

    Order matters: child rows first, then parents.
    """
    from sqlalchemy import text

    statements = (
        "DELETE FROM event_log",
        "DELETE FROM notification_tracker_target",
        "DELETE FROM notification_tracker",
        "DELETE FROM target_receiver",
        "DELETE FROM notification_target",
        "DELETE FROM tracking_config",
        "DELETE FROM service_provider",
        "DELETE FROM template_slot",
        "DELETE FROM template_config",
        "DELETE FROM telegram_bot",
        "DELETE FROM appsetting",
    )
    async with engine.begin() as conn:
        for stmt in statements:
            try:
                await conn.execute(text(stmt))
            except Exception:  # noqa: BLE001 - table may not exist in test schema
                pass


@pytest.mark.asyncio
async def test_export_wipe_import_roundtrip(isolated_engine, tmp_data_dir) -> None:  # noqa: ARG001
    """A full round-trip preserves entity counts and the key fields the
    UI relies on: names, configs (with secrets included), provider
    references via id_map.
    """
    from notify_bridge_server.database.models import (
        NotificationTarget, NotificationTracker, ServiceProvider,
        TargetReceiver, TelegramBot, TrackingConfig,
    )
    from notify_bridge_server.services.backup_schema import (
        ConflictMode, SecretsMode,
    )
    from notify_bridge_server.services.backup_service import (
        export_backup, import_backup,
    )

    user_id = 1

    # ---- Seed ----
    async with AsyncSession(isolated_engine) as session:
        ids = await _seed(session, user_id)

    # ---- Export with secrets included so import sees real values ----
    async with AsyncSession(isolated_engine) as session:
        backup = await export_backup(
            session, user_id, secrets_mode=SecretsMode.INCLUDE,
        )
        assert len(backup.data.providers) == 1
        assert len(backup.data.telegram_bots) == 1
        assert len(backup.data.targets) == 1
        assert len(backup.data.targets[0].receivers) == 1
        assert len(backup.data.tracking_configs) == 1
        assert len(backup.data.notification_trackers) == 1
        assert backup.data.providers[0].config["api_key"] == "secret"

    # ---- Wipe ----
    await _wipe_user_owned_rows(isolated_engine, user_id)
    async with AsyncSession(isolated_engine) as session:
        result = await session.exec(
            select(ServiceProvider).where(ServiceProvider.user_id == user_id)
        )
        assert result.all() == []

    # ---- Import ----
    async with AsyncSession(isolated_engine) as session:
        result = await import_backup(
            session, user_id, backup, conflict_mode=ConflictMode.SKIP,
        )
        assert result.errors == [], f"Import errors: {result.errors}"
        assert result.created > 0

    # ---- Verify the entities are back ----
    async with AsyncSession(isolated_engine) as session:
        providers = (await session.exec(
            select(ServiceProvider).where(ServiceProvider.user_id == user_id)
        )).all()
        assert len(providers) == 1
        prov = providers[0]
        assert prov.name == "Immich prod"
        assert prov.config["base_url"] == "https://immich.example.com"
        # Secrets imported intact when SecretsMode.INCLUDE was used at export.
        assert prov.config["api_key"] == "secret"

        bots = (await session.exec(
            select(TelegramBot).where(TelegramBot.user_id == user_id)
        )).all()
        assert len(bots) == 1
        assert bots[0].name == "Test bot"

        targets = (await session.exec(
            select(NotificationTarget).where(NotificationTarget.user_id == user_id)
        )).all()
        assert len(targets) == 1

        receivers = (await session.exec(
            select(TargetReceiver).where(TargetReceiver.target_id == targets[0].id)
        )).all()
        assert len(receivers) == 1
        assert receivers[0].config["chat_id"] == "-100123"

        tcs = (await session.exec(
            select(TrackingConfig).where(TrackingConfig.user_id == user_id)
        )).all()
        assert len(tcs) == 1
        assert tcs[0].name == "Default Immich tracking"

        trackers = (await session.exec(
            select(NotificationTracker).where(NotificationTracker.user_id == user_id)
        )).all()
        assert len(trackers) == 1
        # provider_id was remapped via id_map; the original provider id may
        # have changed across the wipe, so just check it links to a real row.
        assert trackers[0].provider_id == prov.id
        assert trackers[0].scan_interval == 120
        assert trackers[0].collection_ids == ["album-uuid-1"]