feat(logging): production-grade logging with context vars, secret masking, and runtime level control

Boot-time logging was a three-line basicConfig stub with no timestamps, no
correlation, and silent drops at every layer of the Telegram send path — a
/random command that delivered text but no media left zero evidence in the
log. This replaces the setup and closes every silent drop encountered end-to-end.

New infrastructure:
- notify_bridge_core.log_context: request_id/command/chat_id/bot_id/dispatch_id
  ContextVars with a bind_log_context() context manager so deep call sites
  (TelegramClient, NotificationDispatcher) inherit the correlation tag without
  threading args through.
- notify_bridge_server.logging_setup: dictConfig-based setup with a
  LogRecordFactory that tags every record, a SecretMaskingFilter that redacts
  /botN:TOKEN plus Authorization/x-api-key/password/secret in messages AND
  tracebacks, a JSON formatter for aggregators, text formatter with grep-friendly
  [req=... cmd=... bot=... chat=... disp=...] prefix, and default dampening
  for sqlalchemy/aiohttp/apscheduler/urllib3/PIL.

Runtime control:
- NOTIFY_BRIDGE_LOG_LEVEL / _FORMAT / _LEVELS env vars (boot).
- DB-backed log_level / log_format / log_levels AppSettings, applied on
  boot after migrations and live via apply_log_levels() when edited in
  the settings UI (format still requires restart, logs a WARN).
- Frontend settings page gains a Logging card (level dropdown, format
  dropdown, per-module overrides); en/ru i18n keys added.

Call-site fixes (/random media-group blind spot and adjacent):
- TelegramClient._fetch_asset: every silent drop now WARN-logs with reason
  (missing url, HTTP non-200, size/dimension limits, ClientError).
- TelegramClient._send_media_group: WARN on "chunk had N items but 0 usable",
  ERROR on sendMediaGroup non-ok/transport with full context; returns
  success=False + "no_items_delivered" instead of success=True with an empty
  message_ids list so callers can distinguish.
- TelegramClient.send_message / _upload_media / _send_from_cache: ERROR on
  non-ok + transport failures with status/code/desc; DEBUG for cache-hit
  fallbacks.
- NotificationDispatcher.dispatch: generates a dispatch_id, binds it, logs
  start/finish with failure count, uses exc_info for target failures.
- commands/handler: missing/failed templates -> ERROR + exc_info; send_reply
  and send_media_group errors upgraded WARNING -> ERROR with chat/error_code
  context; rate-limit and truncation cases logged with full context.
- commands/webhook and services/telegram_poller: bind_log_context(request_id
  =tg:<update_id>, command, chat_id, bot_id), INFO on receive/dispatch/
  completion with duration, exc_info on raise, INFO when commands disabled.
- commands/immich: INFO when album scope is empty; WARN per asset dropped
  from media payload and a summary WARN when "N assets in, 0 out".
This commit is contained in:
2026-04-23 14:41:26 +03:00
parent 1f880daa0c
commit f50d465c0e
15 changed files with 831 additions and 63 deletions
+7
View File
@@ -695,6 +695,13 @@
"locales": "Template Languages",
"supportedLocales": "Supported Locales",
"supportedLocalesHint": "Languages available when authoring notification and command templates. Built-in defaults ship for English and Russian; other languages start empty.",
"logging": "Logging",
"logLevel": "Log Level",
"logLevelHint": "Root log level for the server. Raise to DEBUG while investigating; keep at INFO in production. WARNING/ERROR hide per-command progress lines.",
"logFormat": "Log Format",
"logFormatHint": "Output format. 'text' is human-readable; 'json' emits one object per line for log aggregators (Loki, ELK). Changing this requires a server restart.",
"logLevels": "Per-Module Overrides",
"logLevelsHint": "Comma-separated 'module=LEVEL' pairs to silence noisy modules or drill into one area. Example: sqlalchemy.engine=WARNING,notify_bridge_core.notifications.telegram.client=DEBUG",
"saved": "Settings saved"
},
"hints": {
+7
View File
@@ -695,6 +695,13 @@
"locales": "Языки шаблонов",
"supportedLocales": "Поддерживаемые локали",
"supportedLocalesHint": "Языки, доступные для редактирования шаблонов уведомлений и команд. Встроенные шаблоны поставляются для английского и русского; другие языки начинают с пустых.",
"logging": "Логирование",
"logLevel": "Уровень логов",
"logLevelHint": "Уровень логирования сервера. Поднимайте до DEBUG при отладке; оставляйте INFO в продакшене. WARNING/ERROR скрывают пошаговые строки по командам.",
"logFormat": "Формат логов",
"logFormatHint": "Формат вывода. 'text' — читаемый человеком; 'json' — по одному объекту в строке для агрегаторов (Loki, ELK). Смена требует перезапуска сервера.",
"logLevels": "Переопределения по модулям",
"logLevelsHint": "Пары 'модуль=УРОВЕНЬ' через запятую, чтобы приглушить шумные модули или углубиться в один. Пример: sqlalchemy.engine=WARNING,notify_bridge_core.notifications.telegram.client=DEBUG",
"saved": "Настройки сохранены"
},
"hints": {
+37
View File
@@ -37,6 +37,9 @@
telegram_asset_cache_max_entries: '5000',
supported_locales: 'en,ru',
timezone: 'UTC',
log_level: 'INFO',
log_format: 'text',
log_levels: '',
});
let cacheStats = $state<CacheStats | null>(null);
@@ -204,6 +207,40 @@
</div>
</Card>
<!-- Logging section -->
<Card>
<h3 class="text-sm font-semibold mb-4 flex items-center gap-2">
<MdiIcon name="mdiTextBoxOutline" size={18} />
{t('settings.logging')}
</h3>
<div class="grid grid-cols-1 sm:grid-cols-2 gap-4">
<div>
<label class="block text-xs font-medium mb-1">{t('settings.logLevel')}<Hint text={t('settings.logLevelHint')} /></label>
<select bind:value={settings.log_level}
class="w-full px-3 py-1.5 text-sm border border-[var(--color-border)] rounded-md bg-[var(--color-background)]">
<option value="DEBUG">DEBUG</option>
<option value="INFO">INFO</option>
<option value="WARNING">WARNING</option>
<option value="ERROR">ERROR</option>
</select>
</div>
<div>
<label class="block text-xs font-medium mb-1">{t('settings.logFormat')}<Hint text={t('settings.logFormatHint')} /></label>
<select bind:value={settings.log_format}
class="w-full px-3 py-1.5 text-sm border border-[var(--color-border)] rounded-md bg-[var(--color-background)]">
<option value="text">text</option>
<option value="json">json</option>
</select>
</div>
<div class="sm:col-span-2">
<label class="block text-xs font-medium mb-1">{t('settings.logLevels')}<Hint text={t('settings.logLevelsHint')} /></label>
<input bind:value={settings.log_levels}
placeholder="sqlalchemy.engine=WARNING,notify_bridge_core.notifications.telegram.client=DEBUG"
class="w-full px-3 py-1.5 text-sm border border-[var(--color-border)] rounded-md bg-[var(--color-background)] font-mono" />
</div>
</div>
</Card>
<Button onclick={save} disabled={saving}>
{saving ? t('common.loading') : t('common.save')}
</Button>