Commit Graph

138 Commits

Author SHA1 Message Date
alexei.dolgolyov 920920bc67 feat: production-readiness hardening across security, async, DB, ops
Build and Test / test-frontend (push) Successful in 9m37s
Build and Test / test-backend (push) Successful in 10m53s
Build and Test / build-image (push) Failing after 14m52s
Security
- SSRF: async DNS resolver; allow_redirects=False on all outbound clients;
  matrix homeserver_url validated on create/update/test; update_provider
  and email_bot merge incoming config and reject ***-masked secrets.
- Auth: bcrypt offloaded to asyncio.to_thread; JWT now carries iss/aud +
  leeway and rejects missing claims; setup TOCTOU closed inside a
  transaction; rate limits extended (default 600/min, 10/min on password
  change, 30/min on needs-setup); constant-time login to prevent username
  enumeration.
- Config: rejects known dev secret keys; validates CORS origin schemes,
  port range, token lifetimes.
- Webhook handlers stream-read body with a 1 MiB cap; Discord 429 retries
  bounded (3 attempts, Retry-After capped at 60 s).
- CSP + HSTS added to SecurityHeadersMiddleware.

Async / runtime
- SQLite engine: WAL, synchronous=NORMAL, foreign_keys=ON, busy_timeout,
  pool_pre_ping, dispose on shutdown.
- Lifespan shutdown now stops scheduler before closing HTTP session and
  disposing the engine.
- Shared aiohttp session locked against concurrent first-caller races;
  core NotificationDispatcher accepts and reuses it.
- Storage and scheduled backup writes wrapped in asyncio.to_thread.
- NUT client writes bounded by asyncio.wait_for.
- Telegram poller switched from 3 s short-poll to 30 s interval + 25 s
  long-poll (~10x fewer API calls).

Database
- New performance-indexes migration covers every FK/owner column and
  hot-path composite (notification_tracker(provider_id, enabled);
  event_log(user_id, created_at DESC); webhook_payload_log(provider_id,
  created_at DESC); action_execution(action_id, started_at DESC)).
- New schema_version table for future upgrade gating.
- __system__ placeholder user (id=0) seeded so user_id=0 system defaults
  satisfy the newly enforced FK; filtered out of /auth/needs-setup,
  /api/users, and setup.
- list_notification_trackers rewritten to batched loads (was 1+N+N*M).
- Retention job extended to event_log, webhook_payload_log, and
  action_execution; retention days exposed as a setting.

Scheduler
- AsyncIOScheduler job_defaults: coalesce, misfire_grace_time=300,
  max_instances=1.

Ops
- uvicorn runs with proxy_headers, forwarded_allow_ips,
  timeout_graceful_shutdown; access log suppressed in non-debug.
- FastAPI version string now reads from importlib.metadata.
- New /api/ready endpoint separate from /api/health.
- docker-compose drops the ALLOW_PRIVATE_URLS=1 default, adds mem/cpu/pid
  limits, read_only + tmpfs, cap_drop:ALL, no-new-privileges; healthcheck
  targets /api/ready.
- CI now runs on push/PR with backend pytest, frontend svelte-check +
  build, and a non-push image build; release workflow gated on tests,
  publishes immutable sha-<commit> image tag, adds Trivy scan.

Tests
- New packages/server/tests/ with 29 passing tests: config validation,
  JWT round-trip + aud/alg=none rejection, SSRF scheme and private-range
  enforcement (sync + async), Discord bounded retry, and a lifespan-level
  /api/health + /api/ready smoke check.
- Renamed the misnamed services/test_dispatch.py to manual_dispatch.py so
  pytest never auto-collects production code.

Frontend
- /login now redirects already-authenticated users to /, shows a distinct
  'backend unreachable' banner (en/ru) when /auth/needs-setup fails.
2026-04-23 19:44:56 +03:00
alexei.dolgolyov f50d465c0e feat(logging): production-grade logging with context vars, secret masking, and runtime level control
Boot-time logging was a three-line basicConfig stub with no timestamps, no
correlation, and silent drops at every layer of the Telegram send path — a
/random command that delivered text but no media left zero evidence in the
log. This replaces the setup and closes every silent drop encountered end-to-end.

New infrastructure:
- notify_bridge_core.log_context: request_id/command/chat_id/bot_id/dispatch_id
  ContextVars with a bind_log_context() context manager so deep call sites
  (TelegramClient, NotificationDispatcher) inherit the correlation tag without
  threading args through.
- notify_bridge_server.logging_setup: dictConfig-based setup with a
  LogRecordFactory that tags every record, a SecretMaskingFilter that redacts
  /botN:TOKEN plus Authorization/x-api-key/password/secret in messages AND
  tracebacks, a JSON formatter for aggregators, text formatter with grep-friendly
  [req=... cmd=... bot=... chat=... disp=...] prefix, and default dampening
  for sqlalchemy/aiohttp/apscheduler/urllib3/PIL.

Runtime control:
- NOTIFY_BRIDGE_LOG_LEVEL / _FORMAT / _LEVELS env vars (boot).
- DB-backed log_level / log_format / log_levels AppSettings, applied on
  boot after migrations and live via apply_log_levels() when edited in
  the settings UI (format still requires restart, logs a WARN).
- Frontend settings page gains a Logging card (level dropdown, format
  dropdown, per-module overrides); en/ru i18n keys added.

Call-site fixes (/random media-group blind spot and adjacent):
- TelegramClient._fetch_asset: every silent drop now WARN-logs with reason
  (missing url, HTTP non-200, size/dimension limits, ClientError).
- TelegramClient._send_media_group: WARN on "chunk had N items but 0 usable",
  ERROR on sendMediaGroup non-ok/transport with full context; returns
  success=False + "no_items_delivered" instead of success=True with an empty
  message_ids list so callers can distinguish.
- TelegramClient.send_message / _upload_media / _send_from_cache: ERROR on
  non-ok + transport failures with status/code/desc; DEBUG for cache-hit
  fallbacks.
- NotificationDispatcher.dispatch: generates a dispatch_id, binds it, logs
  start/finish with failure count, uses exc_info for target failures.
- commands/handler: missing/failed templates -> ERROR + exc_info; send_reply
  and send_media_group errors upgraded WARNING -> ERROR with chat/error_code
  context; rate-limit and truncation cases logged with full context.
- commands/webhook and services/telegram_poller: bind_log_context(request_id
  =tg:<update_id>, command, chat_id, bot_id), INFO on receive/dispatch/
  completion with duration, exc_info on raise, INFO when commands disabled.
- commands/immich: INFO when album scope is empty; WARN per asset dropped
  from media payload and a summary WARN when "N assets in, 0 out".
2026-04-23 14:41:26 +03:00
alexei.dolgolyov 1f880daa0c chore: release v0.3.2
Release / release (push) Successful in 1m22s
v0.3.2
2026-04-23 13:38:28 +03:00
alexei.dolgolyov 1024085cdd fix(scheduler): honor app timezone for cron triggers and log scheduled events
CronTrigger.from_crontab was constructed without a timezone, so a cron like
'0 9 * * *' fired at 09:00 host-local instead of 09:00 in the admin-configured
timezone. Now all tracker/action cron triggers are built with the app tz, and
the setting endpoint rebuilds existing cron jobs when the tz changes (since
CronTrigger freezes its tz at construction time).

The scheduler provider also renders current_date/time/datetime/weekday in the
configured tz and exposes a new 'timezone' template variable.

EventLog entries for scheduled_message now include schedule_type,
cron_expression/interval_seconds, timezone, and fire_count, and the dashboard
shows the event type with a label/icon/color.
2026-04-23 13:35:49 +03:00
alexei.dolgolyov 5604c733d1 chore: release v0.3.1
Release / release (push) Successful in 1m1s
v0.3.1
2026-04-22 19:27:45 +03:00
alexei.dolgolyov 3b7808aa9c perf(immich): TTL cache for album bodies and shared-link listings
Bot commands like /random, /latest, /memory refetch the same albums in
quick succession; the GET /api/albums/{id} response can be tens of MB on
large albums, and /api/shared-links has no per-album filter so every
get_shared_links call was already paying for the full server-wide list.

- Module-level 60s TTL cache for album bodies, keyed by
  (server_digest, album_id), 32-entry FIFO cap.  Module-scoped (not
  instance-scoped) because ImmichClient is constructed fresh per request
  in several places, so an instance cache would never survive a second
  caller.  Mirrors the existing _users_cache pattern.
- Module-level 60s TTL cache for the bucketed shared-links map, keyed by
  server_digest.  get_shared_links(album_id) now delegates to a single
  server-wide fetch that serves every album.
- server_digest hashes url+api_key so raw creds don't sit in dict keys.
- get_album(use_cache=False) escape hatch for paths that must observe
  current server state — wired into ImmichActionExecutor.execute (diffs
  the album to decide what to add) and ImmichServiceProvider.poll's
  full-fetch path (stale data would silently delay removal events).
- Async locks guard cache writes with under-lock re-check so concurrent
  misses collapse to one fetch.
2026-04-22 19:27:09 +03:00
alexei.dolgolyov 155d25edf9 chore: release v0.3.0
Release / release (push) Successful in 1m19s
v0.3.0
2026-04-22 18:59:15 +03:00
alexei.dolgolyov 69711bbc84 feat(commands): keep chat-action hint alive during slow command fetches
Slow bot commands (/latest, /random, /favorites, /memory, /search,
/find, /person, /place, /summary) spend most of their wall time
fetching assets from the service provider, not uploading to Telegram.
Telegram chat actions expire after ~5s, so the previous one-shot hint
vanished long before media arrived — users saw nothing happening.

- TelegramClient.start_chat_action_keepalive: promoted from private
  helper to public API, posts the action every 4s until cancelled.
- telegram_send.telegram_chat_action: async context manager that
  starts the keep-alive task on enter and cancels + awaits it on
  exit. A None action makes it a no-op so callers don't branch.
- classify_command_chat_action: maps command name to the right
  Telegram action (upload_photo for media-returning commands, typing
  for /summary, None for fast DB-only commands like /status /events).
- webhook.py + telegram_poller.py: wrap handle_command in the context
  manager so the hint persists through the whole fetch+upload window
  in both webhook and long-poll modes.
2026-04-22 18:56:18 +03:00
alexei.dolgolyov fe38d20b96 perf(immich): skip full album fetch on idle ticks; delta-fetch for active ones
Optimizes polling for large Immich albums (tested path targets ~200k
assets). Combined impact on idle albums drops per-tick cost from ~150 MB
fetch to ~few hundred bytes; active albums fetch O(changes) instead of
O(library).

Core changes
- ImmichAlbumMeta + get_album_meta() using ?withoutAssets=true as a
  cheap change-detection probe.
- poll() fast-path: skip full fetch when meta fingerprint matches and
  no pending assets are outstanding.
- poll() delta-path: search/metadata with updatedAfter when fingerprint
  changed, falling back to full fetch on count decrease or mixed
  add+remove that delta can't reconcile.
- asyncio.gather over meta probes so a 20-album tracker pays one
  round-trip of latency instead of 20.
- Event payload cap (50 added / 200 removed) so a bulk import can't
  explode a Jinja template or exceed Telegram's message limits.
- Module-level users cache (1h TTL, sha256-keyed) shared across
  providers on the same Immich server.
- Tick-scoped shared-links cache via new
  get_all_shared_links_by_album() — one /api/shared-links request per
  tick instead of one per changed album.

Server changes
- meta_fingerprint JSON column on NotificationTrackerState + migration.
- watcher skips the asset_ids DB rewrite when the fingerprint didn't
  change, avoiding ~8 MB JSON writes on idle ticks for huge albums.
- Adaptive polling: after 10 empty ticks skip 1-in-2, after 30 skip
  1-in-4, reset on first detected change; resets on schedule changes.
- APScheduler jitter (interval/4, capped at 30s) to smooth thundering-
  herd bursts when many trackers share the same scan_interval.
2026-04-22 18:55:26 +03:00
alexei.dolgolyov d02616069d chore: release v0.2.8
Release / release (push) Successful in 1m58s
v0.2.8
2026-04-22 18:02:09 +03:00
alexei.dolgolyov 7dae68fd93 fix(commands): match notification cache-key format so writes share one namespace
common._format_assets was passing cache_key=<bare asset UUID>, but the
notification dispatcher writes keys as <host>:<uuid> (derived from the
URL by extract_asset_id_from_url). Result: the two paths populated
different keys for the same asset, so neither could hit the other's
cached file_id and the WebUI stats only ever reflected the notification
side.

Drop the explicit cache_key — TelegramClient derives <host>:<uuid> from
the URL, identical to the notification path, so one file_id cached by
any dispatch or /random / /latest reply is reused by every later send.
2026-04-22 17:00:07 +03:00
alexei.dolgolyov e6481605ca chore: release v0.2.7
Release / release (push) Successful in 4m23s
v0.2.7
2026-04-22 16:47:25 +03:00
alexei.dolgolyov 6de9a1289e fix(telegram): unify send routine across notifications and commands
- Route cache_key values that look like asset UUIDs through asset_cache
  in TelegramClient._get_cache_and_key. Single-asset sends previously
  stored file_ids in url_cache while the media-group path stored them
  in asset_cache, so repeat sends never hit.
- Extract build_asset_media_urls so the notification dispatcher
  (asset_to_media) and the bot command handlers (common._format_assets)
  share one rule for /video/playback vs thumbnail URLs.
- Add services/telegram_send.py as the single factory for constructing
  a TelegramClient. It always wires the shared aiohttp session and both
  file caches, so commands now reuse file_ids populated by notification
  dispatches (and vice versa) instead of re-uploading the same bytes.
- send_reply / send_media_group in commands/handler.py now delegate to
  the factory rather than constructing their own uncached clients.
2026-04-22 16:45:31 +03:00
alexei.dolgolyov 325eabd751 chore: release v0.2.6
Release / release (push) Successful in 1m19s
v0.2.6
2026-04-22 16:29:24 +03:00
alexei.dolgolyov fab6169cf9 fix(commands): enrich search assets, surface variables for all command slots
- UI: command-template-configs now resolves slot variables against the
  active provider first (varsRef[provider_type][slot]) before falling back
  to shared entries, so provider-specific slots like /search, /status,
  /repos, /issues, /boards show the Variables button and autocomplete.
- Backend: /search, /find, /person, /place now normalize raw Immich API
  responses through build_asset_dict, extracting city/country from
  exifInfo and mapping isFavorite -> is_favorite so templates render
  location and favorite indicators.
- Telegram: extract build_telegram_asset_entry into a shared helper so
  the notification dispatcher and command media groups agree on video
  typing and /video/playback URLs; videos no longer render as still
  thumbnails in /latest /random /favorites media mode.
- Commands: send_media_group now reuses the same Telegram file_id caches
  as the notification dispatcher, avoiding re-upload churn for repeated
  commands.
2026-04-22 16:28:26 +03:00
alexei.dolgolyov 85311684d9 fix(settings): don't clobber webhook secret with its mask on save
GET /settings returns the Telegram webhook secret masked as "***<last4>".
The frontend binds that masked value into its state, and any Save ships it
back — the PUT handler then persisted the mask as the new secret, silently
invalidating HMAC for every webhook-mode bot. The next GET re-masks the
mask to itself, so the UI showed no corruption.

Treat incoming values that begin with "***" as "unchanged" for the
webhook-secret field. Empty strings still pass through (explicit clear).
2026-04-22 16:10:34 +03:00
alexei.dolgolyov d7daadadc2 chore: release v0.2.5
Release / release (push) Successful in 1m9s
v0.2.5
2026-04-22 15:56:20 +03:00
alexei.dolgolyov e04ad16ca6 docs: add hotfix note to v0.2.4 release notes 2026-04-22 15:51:28 +03:00
alexei.dolgolyov d7d0a5d921 fix(settings): accept numeric values in update payload
Svelte bind:value on <input type="number"> coerces to a JS number, so the
frontend sends {telegram_cache_ttl_hours: 0} after v0.2.4. Pydantic v2
won't auto-coerce int -> str, which produced a 422 on every save that
touched a numeric setting.

- Widen numeric fields to int | str | None in SettingsUpdate.
- Normalize to str before persisting (DB column is text).
2026-04-22 15:44:40 +03:00
alexei.dolgolyov 93df538819 chore: release v0.2.4
Release / release (push) Successful in 1m18s
v0.2.4
2026-04-22 15:36:08 +03:00
alexei.dolgolyov 2be608ba95 feat(cache): thumbhash-validated asset cache + settings UX overhaul
Cache engine:
- TelegramFileCache: configurable max_entries (LRU cap applies in both TTL
  and thumbhash modes), ttl_seconds<=0 disables TTL, stats() method.
- Dispatcher builds an asset.id -> thumbhash resolver from event.added_assets
  (Immich populates thumbhash in extra) and passes it to TelegramClient, so
  asset-cache entries invalidate on visual change rather than age.
- Watcher wires app settings into cache init: URL cache = TTL + LRU cap,
  asset cache = thumbhash + LRU cap. Adds soft-reset (in-memory only) used
  when cache params change.

Settings:
- New key telegram_asset_cache_max_entries (default 5000).
- telegram_cache_ttl_hours default bumped 48 -> 720 (30d); now URL-only.
- PUT /settings resets in-memory caches when cache keys change (files kept).
- New endpoints: GET/POST /settings/telegram-cache/stats and /clear.

Settings page:
- Cache stats card (count + size + oldest/newest per bucket) with a hint
  explaining that the size is cumulative uploaded-to-Telegram bytes.
- Clear-cache button behind a confirm modal.
- New TimezoneSelector + LocaleSelector components replace raw inputs.
- max-entries input, TTL range updated (0..8760, 0 = disabled).

Mobile nav:
- "More" panel now mirrors the full sidebar tree (groups + subnodes) so
  every destination is reachable on mobile; previously flat hand-picked list.
- Nav height uses env(safe-area-inset-bottom); panel bottom + z-index fixed
  so content can't visually overlay the bottom bar.

A11y / DOM warnings:
- Password-change form has a hidden username field for password-manager
  association; autocomplete hints on all three password inputs.
- Telegram webhook secret wrapped in a no-op form + autocomplete=off.

Bug fix:
- update_settings used any(await ... for ...) which raised TypeError at
  runtime (async generator not an iterator); replaced with explicit loop.
2026-04-22 15:09:59 +03:00
alexei.dolgolyov 5028f15f4f chore: release v0.2.3
Release / release (push) Successful in 1m17s
v0.2.3
2026-04-22 03:30:45 +03:00
alexei.dolgolyov 5a232f18b8 feat(commands): drop tracker counts from /status
trackers_active / trackers_total are per-provider aggregates — once the
rest of /status is scoped to the chat's album set (total_albums and
last_event both filtered by the derived scope), leaving tracker counts
in would leak info about trackers this chat has no visibility into.

- _cmd_status no longer emits trackers_active / trackers_total.
- Immich default status templates (en, ru) just show Albums + Last event.
- Variable catalog updated so the template editor stops suggesting the
  removed vars for the Immich /status slot.
2026-04-22 03:28:05 +03:00
alexei.dolgolyov 3b76a09759 feat(commands): per-chat album scope derived from notification routing
The "per-chat album scope" feature stored on CommandTrackerListener was
really per-bot: listener_id = bot.id, and every chat that bot served
shared the same scope.  Commands like /albums, /random, /status,
/events leaked the full provider catalog into chats that were never
wired up to receive notifications from those trackers.

New model: the album scope for /commands in a given chat is derived
from the notification-routing graph.  For a (provider, bot, chat_id)
triple we walk TargetReceiver (chat_id match, enabled) →
NotificationTarget (telegram or broadcast parent) →
NotificationTrackerTarget → NotificationTracker (provider match) and
union their collection_ids.  That's the natural "what does this chat
get notifications about" set, and it becomes the command scope.

- New helper: command_utils.resolve_chat_album_scope(provider_id,
  bot_id, chat_id) -> set[str].  Empty set is the default for chats
  with no routing — commands return nothing rather than leaking the
  provider's catalog.
- Dispatcher computes the scope per (tracker, bot, chat) and threads
  it through handler.handle(..., allowed_album_ids=...).  Explicit
  CommandTrackerListener.allowed_album_ids override, when set, still
  wins verbatim (kept as an escape hatch for users who want a divergent
  scope for a whole bot).
- /status, /albums, /events, and all /_cmd_immich-routed commands
  (/random, /search, /find, /latest, /memory, /summary, /favorites,
  /place, /person) now intersect with the resolved scope.
- UI scope modal relabeled: it's an explicit *override for this bot*,
  not a per-chat setting.  Default is "derive from notification
  routing", which matches what users already configured elsewhere.

Also:
- /search, /find, /person, /place — _enrich_assets return value was
  discarded, dropping public_url enrichment.  Assign the return value.
- search_smart / search_metadata — consolidated into _search_items
  helper that logs non-200 responses and transport errors instead of
  silently returning [].  Makes "always no results" bugs actually
  diagnosable.  Also accepts the alternate {"assets": [...]} flat-list
  shape from older Immich versions.
- Immich search error bodies go through _redact_body so credentials
  echoed by authenticating proxies don't land in server logs.
2026-04-22 03:20:51 +03:00
alexei.dolgolyov 4ff3876e49 fix(commands): /albums honors per-chat scope, disable link previews
- /albums ignored CommandTrackerListener.allowed_album_ids and listed
  every album tracked by the provider — scoped chats saw neighbours'
  albums.  Thread the listener through _cmd_albums and apply the same
  intersect filter the media commands already use in _cmd_immich.
- Command text replies are listings (albums, events, people, ...) that
  embed multiple links; Telegram's default behavior of rendering a
  preview for the first URL is never useful here and ignored the
  "Disable link previews" toggle operators set on their target.  Always
  pass disable_web_page_preview=True from send_reply.
2026-04-22 03:03:09 +03:00
alexei.dolgolyov 83215473c7 chore: release v0.2.2
Release / release (push) Successful in 1m0s
v0.2.2
2026-04-22 02:51:10 +03:00
alexei.dolgolyov 4e23d2b054 chore(compose): hardcode NOTIFY_BRIDGE_ALLOW_PRIVATE_URLS=1 in compose
This project ships for homelab use; downstream targets (Immich, Gitea,
...) sit on RFC1918 addresses which the SSRF guard blocks by default.
Setting the flag directly in compose — not via ${...} substitution —
avoids the Portainer gotcha where the stack-level "Environment variables"
panel is for compose-file substitutions only, not runtime container env.
Operators who want to run this on a public-facing box can drop the line.
2026-04-22 02:49:19 +03:00
alexei.dolgolyov f7d51b27d2 Revert "chore(compose): default NOTIFY_BRIDGE_ALLOW_PRIVATE_URLS=1 for homelab"
This reverts commit 3bb0585e43.
2026-04-22 02:47:09 +03:00
alexei.dolgolyov 3bb0585e43 chore(compose): default NOTIFY_BRIDGE_ALLOW_PRIVATE_URLS=1 for homelab
Homelab targets (Immich, Gitea, ...) are almost always on RFC1918
addresses, which the SSRF guard rejects by default.  Exporting the flag
to 1 in the compose file — overridable via the host environment —
matches how this project is actually deployed (TrueNAS / unraid / etc.)
without weakening the defense for anyone who sets it to 0 on a
public-facing box.
2026-04-22 02:46:10 +03:00
alexei.dolgolyov 58cba88c92 docs(immich-ssrf): surface NOTIFY_BRIDGE_ALLOW_PRIVATE_URLS hint in error
Homelab/LAN Immich instances trip the SSRF guard (Host 192.168.x resolves
to blocked address).  The fix is to set NOTIFY_BRIDGE_ALLOW_PRIVATE_URLS=1
in the runtime env — call that out directly in the error message so
operators don't have to dig through source to find it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 02:42:22 +03:00
alexei.dolgolyov 645331d320 chore: release v0.2.1
Release / release (push) Successful in 1m27s
v0.2.1
2026-04-22 02:35:38 +03:00
alexei.dolgolyov 6c3dd67c1b feat(tracking): per-config quiet hours with app-level IANA timezone
Add quiet_hours_enabled/start/end to TrackingConfig (HH:MM strings
interpreted in the app-level timezone AppSetting). The dispatch path
loads the app timezone once per run and passes it through
event_allowed_by_config -> in_quiet_hours, so overnight windows like
22:00-07:00 work correctly in any IANA tz.

Frontend exposes a Timezone field under Settings and a Quiet Hours
section on the Immich tracking-config form with time-picker inputs.
2026-04-22 02:31:48 +03:00
alexei.dolgolyov 56993d2ca3 fix(security,perf): harden restore, CSRF, token_version + perf pass
Security
- Sign pending_restore.json (SHA256 stored in AppSetting, verified on
  startup apply) + refuse path outside data_dir, tighten to 0600.
- Require same-origin Origin/Referer on POST /api/backup/apply-restart —
  Bearer-in-localStorage is CSRF-reachable from any XSS'd admin tab.
- Bump token_version on role/username change and admin password reset so
  demoted admins lose admin in already-issued JWTs.  Guard last-admin
  TOCTOU via COUNT + post-commit re-check that rolls back a race.
- SSRF guard (validate_outbound_url) in ImmichClient.__init__ and the
  external_domain setter — admin-mutable URLs were bypassing the check
  that webhook/slack/discord paths already used.  Dev restart script now
  sets NOTIFY_BRIDGE_ALLOW_PRIVATE_URLS=1 so homelab Immich still works.
- Redact + cap Immich error bodies to ~120 chars before they flow into
  ActionExecution.error / EventLog.details (both UI-visible).
- Deny-list sensitive keys (api_key / token / secret / password /
  authorization / cookie / ...) in template-context merges so a rogue
  template can't exfiltrate provider creds via {{ api_key }}.
- Cap user-controlled Immich search params (query ≤256, person_ids ≤50,
  size ≤100) so a Telegram listener can't DoS upstream.
- Stream upload reads with running byte counter + content-length precheck
  instead of buffering the full body and then rejecting.
- Log Telegram parse_mode fallbacks instead of swallowing silently;
  template escape bugs now surface in server logs.
- Rollback partial imports on pending-restore failure (error recorded on
  a fresh session).

Performance
- Fix N+1 in _refresh_telegram_chat_titles: single IN query instead of
  session.get per chat.
- Parallelize album + shared-link fetches in test_dispatch (asyncio.gather)
  and per-receiver Telegram test sends in notifier (semaphore 5).
- Early-exit collect_scheduled_assets(limit=0) so the periodic-summary
  test path skips full per-album filter/sample (was O(album_assets)).
- Emit explicit CREATE INDEX IF NOT EXISTS for event_log user_id /
  action_id / provider_id so the first boot after upgrade isn't left
  unindexed for the dashboard query.
- Add AbortController timeout (120s) to fetchAuth so uploads/downloads
  don't hang indefinitely.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 02:28:55 +03:00
alexei.dolgolyov fe92b206b7 chore: release v0.2.0
Release / release (push) Successful in 1m20s
v0.2.0
2026-04-22 01:35:24 +03:00
alexei.dolgolyov cf4976da2f fix(telegram): load chats/listeners before expanding to fix slide animation height 2026-04-22 01:29:44 +03:00
alexei.dolgolyov 80c034d2af fix(test-dispatch): fall back to tracker defaults, surface soft errors
- dispatch_test_notification now resolves tracking_config / template_config
  from the tracker's default_* fields when the per-link override is unset,
  matching what load_link_data does for the real watcher. Previously
  periodic/scheduled/memory tests silently failed with "no template
  defined" whenever the user configured the template config at the
  tracker level instead of on each link (the UI's normal default).
- Distinguish the two missing-template cases in the returned error
  ("no template config linked" vs. "slot missing in linked config").
- Frontend testTrackerTarget now treats {success:false,error:"..."} in a
  2xx body as a failure — previously any 2xx flashed a success snack so
  users never saw the real reason their test didn't deliver.
2026-04-22 01:25:35 +03:00
alexei.dolgolyov a7a2b4efa4 feat: large polish pass — UX fixes, per-chat scope, restore/backup, action events
Backend
- Per-chat album scope for Immich commands (search/latest/memory/...): new
  allowed_album_ids on CommandTrackerListener, threaded listener/page kwargs
  through ProviderCommandHandler.handle; PATCH listener-scope endpoint.
- /search and /find accept a trailing page number; Immich client search_smart
  / search_metadata take a page param.
- Immich person-asset lookup switched from removed GET /api/people/{id}/assets
  to POST /api/search/metadata with personIds (fixes /person command and
  auto_organize rules silently returning zero candidates on Immich 1.106+).
- Auto_organize rule now sets the target album's thumbnail to the first added
  image when missing (falls back to any asset type); failures do not fail the
  rule. add_assets_to_album surfaces the Immich error body on non-2xx.
- EventLog.user_id / action_id / action_name columns with defensive migration
  + backfill. Status query filters by user_id directly; Immich/webhook paths
  emit user_id explicitly. action_runner writes an action_success/partial/
  failed event on each non-dry-run.
- Dashboard DELETE /api/status/events (scoped to user_id) + rendering live
  tracker/provider/action names via FK join with snapshot fallback.
- PATCH /api/users/{id} for username/role change with last-admin guard.
- Deletion protection returns structured {message, entity, blocked_by}
  (ApiError carries .blockedBy; frontend opens BlockedByModal).
- Backup prepare-restore → AppSetting markers + atomic write of
  pending_restore.json; lifespan hook applies on next startup and archives
  under data/applied_restores/. apply-restart sends SIGTERM so the lifespan
  shutdown runs; NOTIFY_BRIDGE_SUPERVISED env override gates the button.
  Manual POST /api/backup/files (same format as scheduled).
- New periodic-summary test path reuses shared collect_scheduled_assets
  (limit=0) so test and future production code go through one primitive.
- Per-receiver locale for Telegram test messages (resolves
  TelegramChat.language_override per chat instead of applying the first
  receiver's locale to everyone).
- Bounded concurrency (semaphores) in NotificationDispatcher._preload_asset_data
  and _refresh_telegram_chat_titles; chat title sweep extended to 24h since
  save_chat_from_webhook covers active chats opportunistically.
- Telegram poller detects the \"webhook is active\" 409 and auto-calls
  deleteWebhook for bots whose DB update_mode is polling (throttled per bot).
- TelegramClient.get_chat added (CLAUDE.md rule 6); set_album_thumbnail added.
- Seeds: rename \"Default Commands\" → \"Default Immich Commands\";
  track_assets_removed default False.

Frontend
- Global provider selector visible when there is only one provider.
- Clear-events button + i18n + ConfirmModal on the dashboard; new icons/
  labels/filters/colors for action_success / action_partial / action_failed.
- Auto-select first available tracking/template/command/config + bot on
  create forms (trackers, command-trackers, targets, template/command
  configs).
- Telegram target disable_url_preview defaults to true.
- BlockedByModal wired into 8 deletion flows; fetchAuth helper for
  multipart/binary calls (reuses api()'s refresh + ApiError mapping).
- Immich tracker 'Checking links' parallelised (concurrency cap 6).
- Backup page: pending-restore banner + Apply-now / Apply-later modal,
  restarting overlay polling /api/health, manual 'Create backup' button.
- Command-trackers listener row gets an 'Edit album scope' modal with
  inherit/explicit multiselect.
- Users page: Edit user modal (username + role).
- parseDate helper for consistent UTC date rendering.

Migrations / schema
- event_log: + user_id, action_id, action_name (+ backfill user_id from
  notification_tracker).
- command_tracker_listener: + allowed_album_ids.
2026-04-22 01:13:11 +03:00
alexei.dolgolyov b5ffab7ece fix(command_trackers): allow system-shared command configs (user_id=0)
Release / release (push) Successful in 59s
Creating or updating a command tracker failed with 404
"Command config not found" when the selected config was a system
default (seeded with user_id=0). The LIST endpoint already accepts
both owned and system-shared rows via
  or_(CommandConfig.user_id == user.id, CommandConfig.user_id == 0)
so the frontend legitimately offered a user_id=0 option — the POST
and PATCH handlers then rejected it.

Align the create/update checks with the list behavior:
  config.user_id not in (user.id, 0)
v0.1.0
2026-04-21 21:02:33 +03:00
alexei.dolgolyov 28465f56f9 fix(spa): serve index.html for SvelteKit client-side routes
Release / release (push) Successful in 1m7s
Deep-linking to non-root URLs like /settings or /notification-trackers
returned {"detail":"Not Found"} because StaticFiles(html=True) only
serves index.html for directory roots, not for arbitrary SPA routes.

Subclass StaticFiles with an SPA fallback: any 404 on a non-/api path
serves the root index.html, letting the SvelteKit router hydrate the
correct view on the client. Real /api/* 404s still bubble up as JSON
from FastAPI.
2026-04-21 20:57:39 +03:00
alexei.dolgolyov 2eccbc7279 fix(webhook): avoid MissingGreenlet on expired ORM instance after commit
Release / release (push) Successful in 58s
Telegram webhook handler crashed with sqlalchemy.exc.MissingGreenlet
when processing any incoming message after committing the chat row:

    TelegramChat.bot_id == bot.id
                           ^^^^^^
    MissingGreenlet: greenlet_spawn has not been called

AsyncSession expires all instances on commit. Accessing bot.id/bot.token
after that triggers implicit lazy-load I/O from a sync attribute getter,
which can't enter the greenlet dispatcher → crash.

Fix: snapshot bot.id + bot.token to locals before commit, refresh the
ORM instance after a successful commit so handle_command() can still
use it, and route the remaining call sites through the snapshot
variables.
2026-04-21 20:54:00 +03:00
alexei.dolgolyov 293614d667 fix(db): declare locale on CommandConfig model + defensive migration
Release / release (push) Successful in 1m7s
Startup was crashing on fresh databases because:
- init_db() calls SQLModel.metadata.create_all(), which builds tables
  from the model classes. CommandConfig didn't declare `locale`, so
  the created command_config table lacked the column.
- The seeder then issued INSERTs that included locale='en', causing
  `OperationalError: table command_config has no column named locale`.

The legacy migration #6 in migrate_schema creates command_config WITH
locale via raw SQL, so upgraded databases worked. Only fresh installs
broke.

Fix:
- Add `locale: str = Field(default='en')` to CommandConfig model so
  create_all() produces a consistent schema.
- Add a defensive ALTER TABLE ... ADD COLUMN locale in migrate_schema's
  else-branch, so any existing command_config table missing the column
  (from a broken v0.1.0 install) is backfilled on next startup.
2026-04-21 20:35:21 +03:00
alexei.dolgolyov f27fa42b87 fix(ci): build release payload via heredoc, drop broken env-var passing
Release / release (push) Successful in 24s
Previous attempt used `python3 -c "..." KEY=VALUE` which passes
KEY=VALUE as positional args, not environment variables — the python
block then crashed with KeyError: 'BODY' because nothing actually set
it in the environment.

Consolidate into a single heredoc-fed python3 block that reads
RELEASE_NOTES from the already-exported env var and reads TAG/VERSION/
IS_PRE after an explicit `export`. Uses <<'PY' so shell metachars in
the Python source (backticks, $, quotes) are not interpreted.

Also drops the redundant intermediate BODY variable — body is built
directly inside the single python invocation.
2026-04-21 20:16:27 +03:00
alexei.dolgolyov e12820f150 ci: robust Gitea release creation with HTTP status + diagnostics
Release / release (push) Failing after 21s
Previous implementation silently assumed any missing 'id' in POST
response meant "release already exists", then called an unguarded
python3 on the fallback response — which crashes (exit 1) if the
fallback also fails (e.g. release really doesn't exist).

New logic:
- Build JSON payload in Python (avoids shell escaping + CLI length limits)
- Capture HTTP status explicitly
- 201 → success
- 409 or "already exists" message → reuse existing (with HTTP check on fetch)
- Anything else → fail loudly with the response body printed

This also unblocks diagnosis of the current v0.1.0 failure by surfacing
the actual error the Gitea API is returning.
2026-04-21 20:09:55 +03:00
alexei.dolgolyov 866a8df310 ci: fix changelog step on shallow checkout and small repos
Release / release (push) Failing after 54s
- Set fetch-depth: 0 so previous tag lookups work across full history.
- Use `-n 20` instead of HEAD~20..HEAD, which fails when the repo has
  fewer than 20 commits (e.g. on the first release).
2026-04-21 19:59:40 +03:00
alexei.dolgolyov 56b345188e ci: consolidate release.yml into single checkout step
Release / release (push) Failing after 1m53s
The two-step pattern (sparse-checkout RELEASE_NOTES.md, then full
checkout) left sparse-checkout config active on the workspace, so the
second checkout still only restored RELEASE_NOTES.md. Docker build
then failed with "open Dockerfile: no such file or directory".

Since both RELEASE_NOTES.md and the full source are needed in the same
job, one full checkout is simpler and correct.
2026-04-21 19:50:49 +03:00
alexei.dolgolyov af59615036 chore: release v0.1.0
Release / release (push) Failing after 2m53s
2026-04-21 19:45:22 +03:00
alexei.dolgolyov 90bc3ccdc2 chore: pre-release cleanup
- Skip token clear/redirect on 401 for unauthenticated requests
- Fix typo in test secret key in restart-backend script
- Remove completed plan documents (entity-relationship-refactor, ux-notification-improvements)
2026-04-21 19:39:33 +03:00
alexei.dolgolyov eecc9e295c ci: consolidate release tokens to single DEPLOY_TOKEN, rename redeploy step
- Use one DEPLOY_TOKEN for both registry login and Gitea release API,
  matching the claude-code-facts convention.
- Rename "Trigger Portainer redeploy" to "Trigger redeploy webhook" —
  the step calls a generic DOCKER_REDEPLOY_WEBHOOK_URL, not a
  Portainer-specific endpoint.
- Add .facts-sync.json to pin this project to the facts repo commit.
2026-04-21 19:35:50 +03:00
alexei.dolgolyov f0739ca949 feat: security hardening — SSRF guard, template sandbox timeout, webhook log prune, auth & backup polish
- Add outbound URL validation (SSRF) for webhook/Discord/Slack/ntfy/Matrix dispatch
- Template renderer: input/output caps and thread-based render timeout
- Webhook log filter: strip Authorization/signature/token-like headers; atomic prune
- Auth/JWT/backup/config tightening; misc frontend UX fixes
2026-04-16 03:21:45 +03:00
alexei.dolgolyov 734e5c9340 feat: UX improvements — secure webhooks, locale fixes, dynamic languages, UI polish
- Remove top paginator from dashboard events, keep only bottom
- Fix test message locale: pass UI locale to email/matrix bot tests
- Convert webhook auth mode from text input to icon grid selector
- Generate secure UUID tokens for webhook URLs instead of sequential IDs
- Move Recent Payloads into per-provider expandable container (lazy-loaded)
- Make template config languages dynamic via app settings instead of hardcoded
- Change default dev port to 5175
2026-04-11 02:14:15 +03:00