feat(logging): production-grade logging with context vars, secret masking, and runtime level control
Boot-time logging was a three-line basicConfig stub with no timestamps, no correlation, and silent drops at every layer of the Telegram send path — a /random command that delivered text but no media left zero evidence in the log. This replaces the setup and closes every silent drop encountered end-to-end. New infrastructure: - notify_bridge_core.log_context: request_id/command/chat_id/bot_id/dispatch_id ContextVars with a bind_log_context() context manager so deep call sites (TelegramClient, NotificationDispatcher) inherit the correlation tag without threading args through. - notify_bridge_server.logging_setup: dictConfig-based setup with a LogRecordFactory that tags every record, a SecretMaskingFilter that redacts /botN:TOKEN plus Authorization/x-api-key/password/secret in messages AND tracebacks, a JSON formatter for aggregators, text formatter with grep-friendly [req=... cmd=... bot=... chat=... disp=...] prefix, and default dampening for sqlalchemy/aiohttp/apscheduler/urllib3/PIL. Runtime control: - NOTIFY_BRIDGE_LOG_LEVEL / _FORMAT / _LEVELS env vars (boot). - DB-backed log_level / log_format / log_levels AppSettings, applied on boot after migrations and live via apply_log_levels() when edited in the settings UI (format still requires restart, logs a WARN). - Frontend settings page gains a Logging card (level dropdown, format dropdown, per-module overrides); en/ru i18n keys added. Call-site fixes (/random media-group blind spot and adjacent): - TelegramClient._fetch_asset: every silent drop now WARN-logs with reason (missing url, HTTP non-200, size/dimension limits, ClientError). - TelegramClient._send_media_group: WARN on "chunk had N items but 0 usable", ERROR on sendMediaGroup non-ok/transport with full context; returns success=False + "no_items_delivered" instead of success=True with an empty message_ids list so callers can distinguish. - TelegramClient.send_message / _upload_media / _send_from_cache: ERROR on non-ok + transport failures with status/code/desc; DEBUG for cache-hit fallbacks. - NotificationDispatcher.dispatch: generates a dispatch_id, binds it, logs start/finish with failure count, uses exc_info for target failures. - commands/handler: missing/failed templates -> ERROR + exc_info; send_reply and send_media_group errors upgraded WARNING -> ERROR with chat/error_code context; rate-limit and truncation cases logged with full context. - commands/webhook and services/telegram_poller: bind_log_context(request_id =tg:<update_id>, command, chat_id, bot_id), INFO on receive/dispatch/ completion with duration, exc_info on raise, INFO when commands disabled. - commands/immich: INFO when album scope is empty; WARN per asset dropped from media payload and a summary WARN when "N assets in, 0 out".
This commit is contained in:
@@ -144,6 +144,7 @@ def _format_assets(
|
||||
# other's cached file_ids (which is what made the cache look empty
|
||||
# from the WebUI after running /random).
|
||||
media_items: list[dict[str, Any]] = []
|
||||
dropped = 0
|
||||
for asset in assets:
|
||||
asset_id = asset.get("id", "")
|
||||
asset_type = (asset.get("type") or "").upper()
|
||||
@@ -156,6 +157,20 @@ def _format_assets(
|
||||
)
|
||||
if entry is not None:
|
||||
media_items.append(entry)
|
||||
else:
|
||||
dropped += 1
|
||||
_LOGGER.warning(
|
||||
"Dropped asset from /%s media payload: id=%s type=%s (empty preview URL)",
|
||||
cmd, asset_id, asset_type,
|
||||
)
|
||||
if not media_items and assets:
|
||||
# All assets were filtered out before reaching Telegram. The user
|
||||
# will see the text reply but no media — surface it here so the
|
||||
# log shows WHY the media group ended up empty.
|
||||
_LOGGER.warning(
|
||||
"/%s media payload empty: %d asset(s) in, 0 out (all dropped)",
|
||||
cmd, len(assets),
|
||||
)
|
||||
# Return text message + media items — text is sent first, media as reply
|
||||
return {"text": text, "media": media_items}
|
||||
|
||||
|
||||
Reference in New Issue
Block a user