feat: production readiness — security, perf, bug fixes, bridge self-monitoring

Comprehensive multi-area pass driven by a parallel 8-agent production
review. Frontend, backend, database, security, performance, operational,
plus a new self-monitoring feature.

## Critical fixes
- Planka webhook: reads bounded raw body (was NameError on every call)
- HA quiet hours: ha_state_changed/automation_triggered/service_called/
  event_fired added to deferrable set (were silently dropped)
- DNS-rebinding SSRF: PinnedResolver wired into shared aiohttp session
- Telegram inbound webhook: secret now mandatory (401 without)
- Generic webhook: auth_mode="none" requires explicit
  acknowledge_unauthenticated=true; per-IP rate limit 60/min
- svelte-check: 5 null-narrowing errors in EventDetailModal fixed
- Provider hardcoding: Immich-only block extracted to descriptor
  featureDiscoveryHint
- command_sync: snapshot+expunge bot before exiting AsyncSession

## Bug fixes
- notifier asyncio.gather(return_exceptions=True) — one bad chat no longer
  cancels peer sends
- NotificationDispatcher hoisted out of per-tracker loop
- Provider credential resolution unified across all 5 dispatch sites
- HA asyncio.shield now drains inner task on cancellation
- Provider construction switched from if/elif ladder to factory registry
- NUT first poll seeds silently (no spurious ups_on_battery)
- Quiet-hours gate: event-type-disabled now wins over deferral
- APScheduler drain job ID resolution upgraded to seconds
- HA on_status_change wired through to EventLog
- Webhook payload rollback failures now logged (not swallowed)
- Batched receivers/chats/bots in load_link_data (was per-target N+1)
- flag_modified on JSON column reassignments in deferred_dispatch

## Database
- UNIQUE indexes on service_provider.webhook_token,
  telegram_bot.webhook_path_id, partial UNIQUE on telegram_bot.bot_id,
  telegram_chat(bot_id, chat_id), notification_tracker_target unique link,
  partial UNIQUE on bridge_self provider per user
- Composite ix_event_log_user_event_type_created index
- save_chat_from_webhook switched to ON CONFLICT DO UPDATE
- ondelete=CASCADE on user-id FKs (model annotation; app-side cascade
  delete added for existing data)
- delete_notification_tracker converted from N+1 to bulk DELETE/UPDATE
- Module-level asyncio.Lock replaced with lazy _get_lock() pattern
- VACUUM INTO snapshot now PRAGMA integrity_check verified

## Performance
- Jinja2 template compilation LRU cached (lru_cache maxsize=512)
- Per-locale render cache in NotificationDispatcher (skips re-rendering
  identical content for receivers sharing a locale)
- Tracker list cached per provider_id with 5s TTL + explicit invalidation
  on tracker CRUD (relieves HA chat-bus rate query pressure)
- Nav-counts collapsed from 16 round-trips to single UNION ALL
- HA event_log: skip persisting empty assets_added/removed events

## Security hardening
- Mass-assignment guard on Action create/update; cron sub-minute reject
- Backup JSON depth/node-count cap (depth ≤ 10, nodes ≤ 100k)
- _sanitize_config extended to all JSON-typed fields on backup import
- Telegram _safe_get walks redirects manually with SSRF revalidation
- Bcrypt 72-byte password length cap with clear 422
- Webhook payload body redaction; sensitive substring set extended with
  oauth/client_secret/webhook_secret/csrf in both header filter and
  template extras filter

## Frontend
- 76 catch (err: any) sites converted to errMsg(err) helper
- globalProviderFilter: pure getter; reconciliation moved to one-time
  $effect in +layout
- Provider-filter binding: removed paired $effects + _syncingFilter flag,
  now one-way derived
- entity-cache: separate _refreshing flag for background re-fetches
- api.ts 401 handling: AuthRedirectError class + dedup _redirecting flag,
  goto() instead of window.location.href
- a11y: aria-expanded on mobile More, role=switch + aria-checked on
  Telegram bot toggles

## Tests & operations
- CI pytest gate added to .gitea/workflows/build.yml + release.yml
  (wheel-built install to dodge editable-install slowness)
- /api/ready upgraded to deep healthcheck (db SELECT 1, scheduler.running,
  HA supervisor presence) returning {ready, checks, errors, version}
- /api/metrics endpoint with prometheus_client (deferred_pending,
  event_log_total, dispatch_duration, poll_failures, send_failures)
- New OPERATIONS.md covering deploy, healthchecks, metrics, backup/restore
  procedures, log handling, common scenarios, upgrade flow
- New tests: test_bridge_self (11), test_gitea_parser (9),
  test_planka_parser (6), test_immich_change_detector (6),
  test_backup_roundtrip (1)

## New feature: bridge self-monitoring
- New bridge_self provider type — internal sink for bridge health events
- Three event types: bridge_self_poll_failures (consecutive tracker poll
  failures), bridge_self_deferred_backlog (pending count crosses
  threshold), bridge_self_target_failures (consecutive 5xx/network
  failures per target)
- Per-user thresholds (defaults: 3 / 100 / 5) configurable via the
  provider config form
- Auto-seeded on user create + /setup + boot backfill for existing users
- Anti-spam: counters reset after emission; backlog uses transition latch
- Self-loop guard: bridge_self failures don't count toward target-failure
  thresholds (logged only) — wire to your own Telegram/Email/Matrix to
  get notified when polls/dispatches/sends fail
- 6 default templates (3 events × 2 locales), tracking config columns
  with backfill migration, frontend descriptor (excluded from "create
  provider" wizard since auto-managed)

Operator-visible behavior changes (call out in release notes):
- NOTIFY_BRIDGE_TELEGRAM_WEBHOOK_SECRET now REQUIRED for webhook mode
- Existing webhook providers with auth_mode="none" need explicit opt-in
- Generic webhook endpoint rate-limited 60/min per source IP
- HA disconnect/reconnect writes ha_status_* EventLog rows
- Every user gets a bridge_self provider — wire it to a target to
  receive failure alerts

Pre-existing test failures (test_ssrf, test_release_provider) on
Python 3.13 are unrelated; CI runs on 3.12.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-05-16 02:16:49 +03:00
parent 22127e2a59
commit 10d30fc956
97 changed files with 5423 additions and 821 deletions
+33 -31
View File
@@ -3,7 +3,7 @@
import { SvelteSet } from 'svelte/reactivity';
import { slide } from 'svelte/transition';
import { page } from '$app/state';
import { api, getBlockedBy, type BlockedByDetail } from '$lib/api';
import { api, getBlockedBy, type BlockedByDetail , errMsg} from '$lib/api';
import BlockedByModal from '$lib/components/BlockedByModal.svelte';
import { t, getLocale } from '$lib/i18n';
import { targetsCache, telegramBotsCache, emailBotsCache, matrixBotsCache } from '$lib/stores/caches.svelte';
@@ -26,7 +26,7 @@
import ReceiverSection from './ReceiverSection.svelte';
import BotGroupHeader from './BotGroupHeader.svelte';
// ── Helpers ──
// в”Ђв”Ђ Helpers в”Ђв”Ђ
function getBotName(target: NotificationTarget): string | null {
if (target.type === 'telegram' && target.config?.bot_id) {
@@ -74,7 +74,7 @@
return recv.receiver_key || '?';
}
// ── Constants ──
// в”Ђв”Ђ Constants в”Ђв”Ђ
const ALL_TYPES = ['telegram', 'webhook', 'email', 'discord', 'slack', 'ntfy', 'matrix', 'broadcast'] as const;
type TargetType = typeof ALL_TYPES[number];
@@ -97,7 +97,7 @@
function targetTiles(target: NotificationTarget): MetaTile[] {
const tiles: MetaTile[] = [];
// Type tile useful when the "all types" filter is active and rows
// Type tile — useful when the "all types" filter is active and rows
// from multiple types appear side-by-side. The receivers count is
// already shown inside the `target-summary` button, so we don't repeat
// it as a tile.
@@ -115,8 +115,8 @@
tone: 'sky',
});
}
// Telegram targets expose a chat label in config surface it so the
// row reads "Telegram · @bot · Family chat" without expanding.
// Telegram targets expose a chat label in config — surface it so the
// row reads "Telegram Р’В· @bot Р’В· Family chat" without expanding.
const cfg = (target.config || {}) as Record<string, any>;
if (target.type === 'telegram' && cfg.chat_id) {
tiles.push({
@@ -126,7 +126,7 @@
mono: true,
});
}
// Webhook target show host
// Webhook target — show host
if (target.type === 'webhook' && cfg.url) {
let host = String(cfg.url);
try { host = new URL(host).host; } catch { /* keep raw */ }
@@ -142,7 +142,7 @@
return tiles;
}
// ── Derived state ──
// в”Ђв”Ђ Derived state в”Ђв”Ђ
let allTargets = $derived(targetsCache.items);
let activeType = $derived(page.url.searchParams.get('type') as TargetType | null);
@@ -158,7 +158,7 @@
const emailBotItems = $derived(emailBots.map(b => ({ value: b.id, label: b.name, icon: b.icon || 'mdiEmailOutline', desc: b.email })));
const matrixBotItems = $derived(matrixBots.map(b => ({ value: b.id, label: b.name, icon: b.icon || 'mdiMatrix', desc: b.display_name || b.homeserver_url })));
// ── Target form state ──
// в”Ђв”Ђ Target form state в”Ђв”Ђ
let showForm = $state(false);
let editing = $state<number | null>(null);
@@ -204,7 +204,7 @@
formEl?.scrollIntoView({ behavior: 'smooth', block: 'start' });
}
// ── Receiver inline form state ──
// в”Ђв”Ђ Receiver inline form state в”Ђв”Ђ
let addingReceiverForTarget = $state<number | null>(null);
let receiverForm = $state<Record<string, any>>({});
@@ -228,7 +228,7 @@
if (!expandedTargets.has(id)) expandedTargets.add(id);
}
// ── Effects ──
// в”Ђв”Ђ Effects в”Ђв”Ђ
// Reset form when switching target type tabs
$effect(() => {
@@ -239,11 +239,11 @@
addingReceiverForTarget = null;
});
// ── Data loading ──
// в”Ђв”Ђ Data loading в”Ђв”Ђ
onMount(load);
// ── Bot grouping ──
// в”Ђв”Ђ Bot grouping в”Ђв”Ђ
type TargetGroup = {
key: string;
@@ -355,8 +355,8 @@
emailBotsCache.fetch(), matrixBotsCache.fetch(),
]);
loadError = '';
} catch (err: any) {
loadError = err.message || t('common.loadError');
} catch (err: unknown) {
loadError = errMsg(err, t('common.loadError'));
snackError(loadError);
} finally {
loaded = true;
@@ -372,7 +372,7 @@
} catch (e) { console.warn('Failed to load bot chats:', e); }
}
// Active discovery actually polls Telegram getUpdates and persists any new chats.
// Active discovery — actually polls Telegram getUpdates and persists any new chats.
// Fired when the chat picker opens so the user sees the freshest list without a manual click.
async function discoverReceiverBotChats(botId: number) {
if (!botId) return;
@@ -382,7 +382,7 @@
} catch (e) { console.warn('Failed to discover bot chats:', e); }
}
// ── Target CRUD ──
// в”Ђв”Ђ Target CRUD в”Ђв”Ђ
function openNew() {
form = defaultForm();
@@ -475,9 +475,10 @@
editing = null;
await load();
snackSuccess(t('snack.targetSaved'));
} catch (err: any) {
error = err.message;
snackError(err.message);
} catch (err: unknown) {
const m = errMsg(err);
error = m;
snackError(m);
} finally {
submitting = false;
}
@@ -488,7 +489,7 @@
const res = await api<{ success: boolean; error?: string }>(`/targets/${id}/test?locale=${getLocale()}`, { method: 'POST' });
if (res.success) snackSuccess(t('snack.targetTestSent'));
else snackError(`Failed: ${res.error}`);
} catch (err: any) { snackError(err.message); }
} catch (err: unknown) { snackError(errMsg(err)); }
}
let blockedBy = $state<BlockedByDetail | null>(null);
@@ -497,15 +498,16 @@
await api(`/targets/${id}`, { method: 'DELETE' });
await load();
snackSuccess(t('snack.targetDeleted'));
} catch (err: any) {
} catch (err: unknown) {
const bb = getBlockedBy(err);
if (bb) { blockedBy = bb; return; }
error = err.message;
snackError(err.message);
const m = errMsg(err);
error = m;
snackError(m);
}
}
// ── Receiver CRUD ──
// в”Ђв”Ђ Receiver CRUD в”Ђв”Ђ
async function openReceiverForm(targetId: number, targetType: string) {
// Force a remount of any picker palette when the same target is reopened
@@ -575,8 +577,8 @@
addingReceiverForTarget = null;
await load();
snackSuccess(t('targets.receiverAdded'));
} catch (err: any) {
snackError(err.message);
} catch (err: unknown) {
snackError(errMsg(err));
} finally {
receiverSubmitting = false;
}
@@ -590,7 +592,7 @@
});
await load();
snackSuccess(receiver.enabled ? t('targets.receiverDisabled') : t('targets.receiverEnabled'));
} catch (err: any) { snackError(err.message); }
} catch (err: unknown) { snackError(errMsg(err)); }
}
async function removeReceiver(targetId: number, receiverId: number) {
@@ -598,7 +600,7 @@
await api(`/targets/${targetId}/receivers/${receiverId}`, { method: 'DELETE' });
await load();
snackSuccess(t('targets.receiverDeleted'));
} catch (err: any) { snackError(err.message); }
} catch (err: unknown) { snackError(errMsg(err)); }
}
async function toggleBroadcastChild(targetId: number, childId: number) {
@@ -613,7 +615,7 @@
body: JSON.stringify({ config: { ...tgt.config, disabled_child_ids: [...disabled] } }),
});
await load();
} catch (err: any) { snackError(err.message); }
} catch (err: unknown) { snackError(errMsg(err)); }
}
async function testReceiver(targetId: number, receiverId: number) {
@@ -622,7 +624,7 @@
const res = await api<{ success: boolean; error?: string }>(`/targets/${targetId}/receivers/${receiverId}/test?locale=${getLocale()}`, { method: 'POST' });
if (res.success) snackSuccess(t('snack.targetTestSent'));
else snackError(`Failed: ${res.error}`);
} catch (err: any) { snackError(err.message); }
} catch (err: unknown) { snackError(errMsg(err)); }
finally { receiverTesting = { ...receiverTesting, [receiverId]: false }; }
}
</script>