Take a consistent, atomic copy of the DB at lifespan startup BEFORE
migrations run, so a botched future upgrade is recoverable by restoring
a single file instead of a data-loss incident.
Uses SQLite's VACUUM INTO — safe under WAL, cannot tear against
concurrent writes. Best-effort: failures are logged, never raised —
the main DB remains the source of truth.
Configurable via NOTIFY_BRIDGE_PRE_MIGRATE_SNAPSHOT_KEEP (default 5;
0 disables). Snapshots land in ``data_dir/backups/pre-migrate-<ts>.db``
and the N oldest are pruned each boot.
Security
- SSRF: async DNS resolver; allow_redirects=False on all outbound clients;
matrix homeserver_url validated on create/update/test; update_provider
and email_bot merge incoming config and reject ***-masked secrets.
- Auth: bcrypt offloaded to asyncio.to_thread; JWT now carries iss/aud +
leeway and rejects missing claims; setup TOCTOU closed inside a
transaction; rate limits extended (default 600/min, 10/min on password
change, 30/min on needs-setup); constant-time login to prevent username
enumeration.
- Config: rejects known dev secret keys; validates CORS origin schemes,
port range, token lifetimes.
- Webhook handlers stream-read body with a 1 MiB cap; Discord 429 retries
bounded (3 attempts, Retry-After capped at 60 s).
- CSP + HSTS added to SecurityHeadersMiddleware.
Async / runtime
- SQLite engine: WAL, synchronous=NORMAL, foreign_keys=ON, busy_timeout,
pool_pre_ping, dispose on shutdown.
- Lifespan shutdown now stops scheduler before closing HTTP session and
disposing the engine.
- Shared aiohttp session locked against concurrent first-caller races;
core NotificationDispatcher accepts and reuses it.
- Storage and scheduled backup writes wrapped in asyncio.to_thread.
- NUT client writes bounded by asyncio.wait_for.
- Telegram poller switched from 3 s short-poll to 30 s interval + 25 s
long-poll (~10x fewer API calls).
Database
- New performance-indexes migration covers every FK/owner column and
hot-path composite (notification_tracker(provider_id, enabled);
event_log(user_id, created_at DESC); webhook_payload_log(provider_id,
created_at DESC); action_execution(action_id, started_at DESC)).
- New schema_version table for future upgrade gating.
- __system__ placeholder user (id=0) seeded so user_id=0 system defaults
satisfy the newly enforced FK; filtered out of /auth/needs-setup,
/api/users, and setup.
- list_notification_trackers rewritten to batched loads (was 1+N+N*M).
- Retention job extended to event_log, webhook_payload_log, and
action_execution; retention days exposed as a setting.
Scheduler
- AsyncIOScheduler job_defaults: coalesce, misfire_grace_time=300,
max_instances=1.
Ops
- uvicorn runs with proxy_headers, forwarded_allow_ips,
timeout_graceful_shutdown; access log suppressed in non-debug.
- FastAPI version string now reads from importlib.metadata.
- New /api/ready endpoint separate from /api/health.
- docker-compose drops the ALLOW_PRIVATE_URLS=1 default, adds mem/cpu/pid
limits, read_only + tmpfs, cap_drop:ALL, no-new-privileges; healthcheck
targets /api/ready.
- CI now runs on push/PR with backend pytest, frontend svelte-check +
build, and a non-push image build; release workflow gated on tests,
publishes immutable sha-<commit> image tag, adds Trivy scan.
Tests
- New packages/server/tests/ with 29 passing tests: config validation,
JWT round-trip + aud/alg=none rejection, SSRF scheme and private-range
enforcement (sync + async), Discord bounded retry, and a lifespan-level
/api/health + /api/ready smoke check.
- Renamed the misnamed services/test_dispatch.py to manual_dispatch.py so
pytest never auto-collects production code.
Frontend
- /login now redirects already-authenticated users to /, shows a distinct
'backend unreachable' banner (en/ru) when /auth/needs-setup fails.
Boot-time logging was a three-line basicConfig stub with no timestamps, no
correlation, and silent drops at every layer of the Telegram send path — a
/random command that delivered text but no media left zero evidence in the
log. This replaces the setup and closes every silent drop encountered end-to-end.
New infrastructure:
- notify_bridge_core.log_context: request_id/command/chat_id/bot_id/dispatch_id
ContextVars with a bind_log_context() context manager so deep call sites
(TelegramClient, NotificationDispatcher) inherit the correlation tag without
threading args through.
- notify_bridge_server.logging_setup: dictConfig-based setup with a
LogRecordFactory that tags every record, a SecretMaskingFilter that redacts
/botN:TOKEN plus Authorization/x-api-key/password/secret in messages AND
tracebacks, a JSON formatter for aggregators, text formatter with grep-friendly
[req=... cmd=... bot=... chat=... disp=...] prefix, and default dampening
for sqlalchemy/aiohttp/apscheduler/urllib3/PIL.
Runtime control:
- NOTIFY_BRIDGE_LOG_LEVEL / _FORMAT / _LEVELS env vars (boot).
- DB-backed log_level / log_format / log_levels AppSettings, applied on
boot after migrations and live via apply_log_levels() when edited in
the settings UI (format still requires restart, logs a WARN).
- Frontend settings page gains a Logging card (level dropdown, format
dropdown, per-module overrides); en/ru i18n keys added.
Call-site fixes (/random media-group blind spot and adjacent):
- TelegramClient._fetch_asset: every silent drop now WARN-logs with reason
(missing url, HTTP non-200, size/dimension limits, ClientError).
- TelegramClient._send_media_group: WARN on "chunk had N items but 0 usable",
ERROR on sendMediaGroup non-ok/transport with full context; returns
success=False + "no_items_delivered" instead of success=True with an empty
message_ids list so callers can distinguish.
- TelegramClient.send_message / _upload_media / _send_from_cache: ERROR on
non-ok + transport failures with status/code/desc; DEBUG for cache-hit
fallbacks.
- NotificationDispatcher.dispatch: generates a dispatch_id, binds it, logs
start/finish with failure count, uses exc_info for target failures.
- commands/handler: missing/failed templates -> ERROR + exc_info; send_reply
and send_media_group errors upgraded WARNING -> ERROR with chat/error_code
context; rate-limit and truncation cases logged with full context.
- commands/webhook and services/telegram_poller: bind_log_context(request_id
=tg:<update_id>, command, chat_id, bot_id), INFO on receive/dispatch/
completion with duration, exc_info on raise, INFO when commands disabled.
- commands/immich: INFO when album scope is empty; WARN per asset dropped
from media payload and a summary WARN when "N assets in, 0 out".
- Remove top paginator from dashboard events, keep only bottom
- Fix test message locale: pass UI locale to email/matrix bot tests
- Convert webhook auth mode from text input to icon grid selector
- Generate secure UUID tokens for webhook URLs instead of sequential IDs
- Move Recent Payloads into per-provider expandable container (lazy-loaded)
- Make template config languages dynamic via app settings instead of hardcoded
- Change default dev port to 5175
Set up the Notify Bridge project structure:
- packages/core (notify_bridge_core) with provider, model, notification, template packages
- packages/server (notify_bridge_server) with FastAPI skeleton and health endpoint
- frontend with SvelteKit 2, Svelte 5, Tailwind CSS v4, static adapter
- Root configs: .gitignore, README.md, CLAUDE.md
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>