Adds Home Assistant as a service provider with two coordinated surfaces:

Notifications (subscription):
- Long-lived WebSocket client (aiohttp ws_connect) with auth handshake, exponential-backoff reconnect, bounded event queue, and area-registry enrichment cached per (re)connect
- ServiceProvider ABC gains an optional `subscribe()` method for push-style providers; HomeAssistantServiceProvider uses it via a per-provider supervisor task started in the FastAPI lifespan
- 4 event types (state_changed, automation_triggered, call_service, event_fired), 4 default Jinja templates (en + ru), HA-specific tracker filters (entity_glob, domain_allowlist, exact entity ids)
- Extracted shared dispatch pipeline (api/webhooks.py → services/event_dispatch.py) so subscription and webhook ingest share the same event_log + deferred-dispatch + quiet-hours code path

Bot commands:
- /status, /entities [glob], /state <entity_id>, /areas
- Multi-command WS session so /status and /areas cost one handshake
- Sensitive-attribute blocklist (camera access_token, entity_picture, etc.) and 30-attribute cap to keep /state output safe and within Telegram's message size
- Error-message redaction strips URL userinfo before surfacing to chat

Frontend:
- HA descriptor with toggle ConfigField type (new) and tag-input filter mode for free-text glob/domain lists (new TagInput component)
- 15 command slots + 4 notification slots wired into the existing template-config UI
Home Assistant Provider — Implementation Plan
Status: planned, not started. Sequencing: third item on the backlog (see feature-backlog.md). Last updated: 2026-05-13.
Decision: WebSocket subscription, not webhook
We considered three ingest modes (webhook automation, WebSocket subscription, hybrid). The WebSocket route is chosen as the architectural foundation because the medium-term roadmap forces it anyway:
| Phase | Capability | Needs API access? |
|---|---|---|
| 1 | Subscribe to events, emit notifications | Read (event stream) |
| 2 | Bot commands (/state, /entities, /areas) | Read (REST or WS get_states) |
| 3 | Smart Actions (light.turn_on, scene activation) | Write (call_service) |
A webhook-only Phase 1 would still need a REST client by Phase 2 and a write path by Phase 3 — net result is two client implementations + one event pipeline refactor. WebSocket consolidates all three phases on one connection.
Tradeoff (be honest): WebSocket introduces a long-lived-connection pattern
this codebase does not have yet. Reconnect logic, missed-events-on-restart
gap, and a new shape on the ServiceProvider ABC are real costs. Phase 1 is
not shippable in one short session — plan for a multi-session build.
Provider abstraction extension
The current ServiceProvider ABC
(packages/core/src/notify_bridge_core/providers/base.py)
is poll-oriented: every provider implements poll(collection_ids, state) → (events, new_state). Webhook providers (Gitea, Planka, Webhook) satisfy this
by no-op'ing poll and shoving events in via api/webhooks.py instead.
Home Assistant fits neither cleanly. The plan:
- Add an optional `async subscribe(emit) → None` method on the base ABC. Default implementation raises `NotImplementedError`. Polling providers do not override it. The scheduler / lifecycle layer (currently `services/watcher.py`) gains a "subscription manager" branch that, for any provider whose class overrides `subscribe`, starts a long-lived task instead of registering a polling job.
- `emit` is a callback `(event: ServiceEvent) → None` provided by the subscription manager; it routes events to the dispatcher exactly like the webhook handler does today. Keeping the dispatch path unchanged is the point of this design.
- Reconnect lives inside `subscribe`: the method is expected to be a `while not cancelled: try connect; on drop, sleep with backoff, retry` loop. The manager cancels the task on shutdown via the cooperative cancel token used elsewhere.
This is a small, additive change to one ABC. No existing provider is modified.
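A minimal sketch of how the optional hook and the override check could look. Everything here is an assumption for illustration: `ServiceProvider` is a simplified stand-in for the real class in `providers/base.py`, `ServiceEvent` is reduced to a plain dict, and `supports_subscribe`, `FakePushProvider`, `FakePollProvider`, and `run_briefly` are hypothetical names. The sketch also uses plain task cancellation rather than the project's cooperative cancel token, whose API is not shown in this plan.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Any, Callable

# Hypothetical stand-in for the project's ServiceEvent type.
ServiceEvent = dict[str, Any]
Emit = Callable[[ServiceEvent], None]


class ServiceProvider(ABC):
    """Simplified stand-in for the real ABC; only the relevant parts are shown."""

    @abstractmethod
    async def poll(self, collection_ids: list[str], state: dict) -> tuple[list[ServiceEvent], dict]:
        ...

    async def subscribe(self, emit: Emit) -> None:
        # Optional hook: push-style providers override this, pollers leave it alone.
        raise NotImplementedError


def supports_subscribe(provider: ServiceProvider) -> bool:
    """True when the provider's class overrides the base subscribe()."""
    return type(provider).subscribe is not ServiceProvider.subscribe


class FakePushProvider(ServiceProvider):
    async def poll(self, collection_ids, state):
        raise NotImplementedError

    async def subscribe(self, emit: Emit) -> None:
        emit({"type": "ha_state_changed", "entity_id": "light.kitchen"})
        # A real implementation would loop (connect, read, reconnect) until cancelled.
        await asyncio.Event().wait()


class FakePollProvider(ServiceProvider):
    async def poll(self, collection_ids, state):
        return [], state


async def run_briefly(provider: ServiceProvider) -> list[ServiceEvent]:
    """Start subscribe() as a task, let it emit once, then cancel it as shutdown would."""
    received: list[ServiceEvent] = []
    task = asyncio.create_task(provider.subscribe(received.append))
    await asyncio.sleep(0)  # give the task one turn on the event loop
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return received
```

Checking for an override on the class, instead of calling `subscribe` and catching `NotImplementedError`, lets the manager pick the polling or subscription branch before any task is started.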
Phase 1 — Subscribe + Dispatch (MVP)
Scope
- Long-lived WebSocket connection to HA, authenticated with a long-lived access token.
- Subscribe to the event bus with optional `event_type` filter (defaults to `state_changed`).
- Translate HA events into `ServiceEvent` and dispatch via the existing pipeline. Notifications go out exactly as they do today for any other provider.
- Filter UI: entity-id glob list, domain allowlist (e.g. `light.*`, `binary_sensor.*`), event-type allowlist. Hard-required to avoid the HA firehose drowning the bridge.
- Connection test + entity listing via WS `get_states` (no REST client yet; WS gives us both subscribe and read).
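One way the entity filter could be evaluated, as a rough sketch. The function name `entity_passes` and its parameters are hypothetical. Two behaviours are assumptions rather than decisions recorded in this plan: an entity passes if any configured filter matches, and a completely empty filter set passes nothing (so an unconfigured tracker does not receive the full event stream). Domain entries are accepted in both the `light` and `light.*` spellings.

```python
from fnmatch import fnmatchcase
from typing import Iterable


def entity_passes(
    entity_id: str,
    *,
    entity_ids: Iterable[str] = (),
    entity_globs: Iterable[str] = (),
    domain_allowlist: Iterable[str] = (),
) -> bool:
    """Return True if entity_id matches any exact id, glob, or allowed domain."""
    if entity_id in set(entity_ids):
        return True
    if any(fnmatchcase(entity_id, pattern) for pattern in entity_globs):
        return True
    # Accept both "light" and "light.*" as a domain entry (assumption).
    domains = {d[:-2] if d.endswith(".*") else d for d in domain_allowlist}
    domain = entity_id.split(".", 1)[0]
    return domain in domains
```

`fnmatchcase` is used instead of `fnmatch` so matching stays case-sensitive on every platform.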
Out of scope for Phase 1
- Bot commands → Phase 2.
- Service calls → Phase 3.
- Replay of events missed during disconnect (HA does not support this; we document the gap and surface "reconnected after N seconds" in the event log).
- Webhook-style ingestion (path-embedded token webhook receiver). If a user prefers webhooks, we add it later as a second ingestion mode on the same provider — out of scope for v1.
Event types (v1)
| HA event | ServiceEvent type | Notification slot |
|---|---|---|
| `state_changed` | `ha_state_changed` | `message_state_changed` |
| `automation_triggered` | `ha_automation_triggered` | `message_automation_triggered` |
| `call_service` | `ha_service_called` | `message_service_called` |
| (custom event types) | `ha_event_fired` | `message_event_fired` |
Default tracking config enables state_changed only — the others are loud
and opt-in.
Context variables exposed to templates
Pulled directly from HA's state_changed payload, normalized:
- `entity_id`: e.g. `light.kitchen`
- `friendly_name`: `attributes.friendly_name`, falling back to `entity_id`
- `domain`: the part of `entity_id` before the dot
- `old_state`: the state string of the payload's `old_state` object (null for a newly created entity)
- `new_state`: the state string of the payload's `new_state` object (null for a removed entity)
- `attributes`: dict of new-state attributes (raw)
- `device_class`: `attributes.device_class` if present
- `area`: `attributes.area_id` if present (best effort; only set if HA exposes it via the area registry, which costs a `config/area_registry/list` WS call; see "Open questions")
- `last_changed`, `last_updated`: ISO timestamps
- For non-`state_changed` events: `event_type`, `event_data` (full dict)
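The variable list above could be produced by a small parser along these lines. This is a sketch under assumptions: `build_context` and `service_event_type` are hypothetical names, the input is the `event` object of a Home Assistant WebSocket event message, and its `data` field is assumed to carry `entity_id`, `old_state`, and `new_state` objects, either of which may be null.

```python
from typing import Any

# Mapping from the "Event types (v1)" table; anything else is treated as a custom event.
SERVICE_EVENT_TYPES = {
    "state_changed": "ha_state_changed",
    "automation_triggered": "ha_automation_triggered",
    "call_service": "ha_service_called",
}


def service_event_type(ha_event_type: str) -> str:
    return SERVICE_EVENT_TYPES.get(ha_event_type, "ha_event_fired")


def build_context(event: dict[str, Any]) -> dict[str, Any]:
    """Flatten one HA event into the template context described above."""
    event_type = event.get("event_type", "")
    data = event.get("data") or {}
    if event_type != "state_changed":
        return {"event_type": event_type, "event_data": data}

    entity_id = data.get("entity_id", "")
    # Either side can be null: old_state for a new entity, new_state for a removed one.
    old = data.get("old_state") or {}
    new = data.get("new_state") or {}
    attributes = new.get("attributes") or {}
    return {
        "entity_id": entity_id,
        "friendly_name": attributes.get("friendly_name") or entity_id,
        "domain": entity_id.split(".", 1)[0],
        "old_state": old.get("state"),
        "new_state": new.get("state"),
        "attributes": attributes,
        "device_class": attributes.get("device_class"),
        "area": attributes.get("area_id"),  # best effort, usually absent
        "last_changed": new.get("last_changed"),
        "last_updated": new.get("last_updated"),
    }
```

Missing values come back as `None` rather than raising, so templates can guard on them with a simple `if`.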
File touch map (Phase 1)
Core (packages/core/src/notify_bridge_core/)
| Path | Action | Notes |
|---|---|---|
| `providers/base.py` | Modify | Add optional subscribe(emit) ABC method (default NotImplementedError); add HOME_ASSISTANT = "home_assistant" to ServiceProviderType |
| `providers/capabilities.py` | Modify | Add HOME_ASSISTANT_CAPABILITIES + register |
| `providers/home_assistant/__init__.py` | Create | Export + register template variables |
| `providers/home_assistant/client.py` | Create | WebSocket client (auth, subscribe, get_states, call_service stub) |
| `providers/home_assistant/event_parser.py` | Create | HA event dict → ServiceEvent |
| `providers/home_assistant/provider.py` | Create | Class with connect, disconnect, subscribe, list_collections (entity list), get_available_variables, get_provider_config_schema, test_connection. poll raises NotImplementedError. |
| `templates/defaults/en/home_assistant_*.jinja2` | Create | 4 slot templates |
| `templates/defaults/ru/home_assistant_*.jinja2` | Create | 4 slot templates |
| `templates/defaults/loader.py` | Modify | Add to PROVIDER_SLOT_FILE_MAP |
| `templates/command_defaults/loader.py` | Modify | Stub entry: empty commands list for now |
| `templates/context.py` | Modify | HA context builder |
| `templates/validator.py` | Modify | Whitelist HA variable names |
Server (packages/server/src/notify_bridge_server/)
| Path | Action | Notes |
|---|---|---|
| `services/watcher.py` (or scheduler / lifecycle module that hosts polling) | Modify | Add subscription-manager branch: for providers whose class overrides subscribe, start/stop long-running task instead of polling |
| `services/scheduler.py` | Verify | Confirm we cancel HA subscription on shutdown (graceful_shutdown_seconds path) |
| `api/template_configs.py` | Modify | get_template_variables() entry |
| `api/command_template_configs.py` | Modify | Sample ctx (minimal for Phase 1; no commands) |
| `services/sample_context.py` | Modify | `_SAMPLE_CONTEXT["home_assistant"]` |
| `database/seeds.py` | Modify | Seed notification templates + default tracking config |
Frontend (frontend/src/)
| Path | Action | Notes |
|---|---|---|
| `lib/providers/home-assistant.ts` | Create | Descriptor per CLAUDE.md rule 11 |
| `lib/providers/index.ts` | Modify | Register descriptor |
| `lib/locales/en.json` | Modify | providers.typeHomeAssistant, gridDesc.providerHomeAssistant |
| `lib/locales/ru.json` | Modify | Same |
Tests
| Path | Action |
|---|---|
| `packages/core/tests/providers/test_home_assistant_parser.py` | Create: HA payload → ServiceEvent |
| `packages/core/tests/providers/test_home_assistant_client.py` | Create: WS auth, subscribe, reconnect (use a fake server) |
| `packages/server/tests/test_home_assistant_subscription.py` | Create: subscription manager lifecycle, event flows through dispatcher |
Frontend descriptor essentials
type: "home_assistant"
defaultName: "Home Assistant"
icon: "home" (consider Lucide icon; HA logo if a custom asset exists)
hasUrl: true // base URL of HA (used to derive WS URL)
configFields:
- url: http(s)://homeassistant.local:8123
- access_token: long-lived access token (required)
- allowed_event_types: comma-separated, defaults to "state_changed"
eventFields: 4 checkboxes (state_changed, automation_triggered,
call_service, event_fired)
extraTrackingFields:
- entity_glob: tag input ("light.*", "binary_sensor.*_motion")
- domain_allowlist: tag input
collectionMeta: { label: "Entities", icon: "..." }
webhookBased: false // we are NOT webhook based
WS URL is derived: wss://{host}/api/websocket (or ws:// for plain http
HA). Document this in the UI hint.
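The derivation could be sketched as below. The function name `derive_ws_url` is hypothetical, and keeping any path prefix from the base URL (for an HA instance behind a reverse-proxy subpath) is an assumption, not something this plan specifies.

```python
from urllib.parse import urlsplit, urlunsplit


def derive_ws_url(base_url: str) -> str:
    """Map an http(s) Home Assistant base URL to its WebSocket endpoint."""
    parts = urlsplit(base_url.strip())
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError("Home Assistant URL must start with http:// or https://")
    scheme = "wss" if parts.scheme == "https" else "ws"
    # Drop a trailing slash so the result never contains "//api/websocket".
    path = parts.path.rstrip("/") + "/api/websocket"
    return urlunsplit((scheme, parts.netloc, path, "", ""))
```

For example, `derive_ws_url("http://homeassistant.local:8123")` yields `ws://homeassistant.local:8123/api/websocket`.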
Auth model
- Long-lived access token from HA (Profile → Long-Lived Access Tokens).
- Stored encrypted at rest via the same path the other providers use for secrets (the bridge already has a secret-encryption helper — verify the exact module name during implementation).
- WS auth handshake: connect → server sends `auth_required` → client sends `{type: "auth", access_token: "..."}` → server replies `auth_ok` or `auth_invalid`.
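The handshake can be sketched independently of the WebSocket library by passing in `recv` and `send` coroutines. All names here (`authenticate`, `AuthError`, `fake_transport`) are hypothetical, and the real client would wrap its own connection object to provide the two callables.

```python
import asyncio
from typing import Any, Awaitable, Callable

Recv = Callable[[], Awaitable[dict[str, Any]]]
Send = Callable[[dict[str, Any]], Awaitable[None]]


class AuthError(Exception):
    """Raised when the handshake does not end in auth_ok."""


async def authenticate(recv: Recv, send: Send, access_token: str) -> None:
    first = await recv()
    if first.get("type") != "auth_required":
        raise AuthError(f"expected auth_required, got {first.get('type')!r}")
    await send({"type": "auth", "access_token": access_token})
    reply = await recv()
    if reply.get("type") == "auth_ok":
        return
    if reply.get("type") == "auth_invalid":
        # Surface the server's message only; never echo the token back.
        raise AuthError(reply.get("message", "auth_invalid"))
    raise AuthError(f"unexpected reply type {reply.get('type')!r}")


def fake_transport(messages: list[dict[str, Any]]):
    """In-memory stand-in for a WebSocket, useful in unit tests."""
    inbox = list(messages)
    sent: list[dict[str, Any]] = []

    async def recv() -> dict[str, Any]:
        return inbox.pop(0)

    async def send(message: dict[str, Any]) -> None:
        sent.append(message)

    return recv, send, sent
```

Keeping the transport behind two callables means the same function can be exercised against recorded fixtures without a live HA instance.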
Risks / open questions (Phase 1)
- Reconnect strategy. Exponential backoff capped at 60s, jittered. On reconnect, log a `connection_restored_after` event so the UI can surface the gap. Document that HA does not support event replay.
- Area registry. Pulling `area_id` for entities requires a separate `config/area_registry/list` WS call. Decision needed: fetch once on connect and cache, refetch on `area_registry_updated` event, or skip `area` from the context entirely in v1. Recommendation: fetch on connect, refetch on `area_registry_updated`, skip if it fails (best-effort).
- TLS verification for self-signed HA. Homelab users often have self-signed certs. Need a `verify_tls: bool` config field (default true) and a clear warning when disabled. Same pattern as `NOTIFY_BRIDGE_ALLOW_PRIVATE_URLS` for the SSRF case.
- Backpressure. HA's `state_changed` can fire hundreds of events per minute in a busy install. The subscription manager must drop or coalesce if the dispatcher backlog grows beyond a threshold. Cheapest cut: a bounded `asyncio.Queue` between WS receiver and dispatch, using `put_nowait` with an overflow counter visible in the event log.
- Entity filter precedence. Tracking-config has `collection_ids` (entity_id list) and we want `entity_glob` + `domain_allowlist`. Decision: if both `collection_ids` and globs are set, union them (any match passes). Documented prominently in the tracker UI.
- Library choice. `hass-client` is a Python WS client maintained by the HA community; alternative is rolling our own with `websockets`. The latter is ~150 LOC and has no external dependency surface. Recommendation: roll our own. Re-evaluate if Phase 3 needs registry-aware service calls.
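The reconnect and backpressure items could be sketched as follows. `backoff_delays` and `BoundedEmitter` are hypothetical names, the 1-second base and the queue size of 1000 are placeholder values, and the jitter variant shown (a uniform delay between zero and the capped exponential ceiling, often called "full jitter") is one possible choice; the plan only says "jittered".

```python
import asyncio
import random
from typing import Any, Iterator, Optional


def backoff_delays(
    base: float = 1.0, cap: float = 60.0, rng: Optional[random.Random] = None
) -> Iterator[float]:
    """Yield reconnect delays: uniform in [0, min(cap, base * 2**attempt)]."""
    rng = rng or random.Random()
    attempt = 0
    while True:
        ceiling = min(cap, base * (2 ** attempt))
        yield rng.uniform(0, ceiling)
        if ceiling < cap:
            attempt += 1  # stop growing once the cap is reached


class BoundedEmitter:
    """Bounded buffer between the WS receiver and dispatch; drops on overflow."""

    def __init__(self, maxsize: int = 1000) -> None:
        self.queue: asyncio.Queue = asyncio.Queue(maxsize=maxsize)
        self.dropped = 0  # overflow counter to surface in the event log

    def emit(self, event: Any) -> None:
        try:
            self.queue.put_nowait(event)
        except asyncio.QueueFull:
            self.dropped += 1
```

A reconnect loop would create a fresh `backoff_delays()` generator after each successful connection, so the delay resets to the base once the link is healthy again.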
Phase 2 — Bot Commands
Adds Telegram bot commands for HA tracking configs.
- `/status`: connection status, subscribed event count
- `/entities <glob>`: list matching entities + current state
- `/state <entity_id>`: full state + attributes for one entity
- `/areas`: area registry summary
- `/help`
These use the existing WS connection (no new client) via WS commands
get_states, config/area_registry/list. Template slots and command
template configs follow the same pattern as Gitea/Planka — see
CLAUDE.md rule 7 / rule 11 for the full set of locations
that must be updated.
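As an illustration of how a read-only command could sit on top of `get_states`, here is a rough sketch for `/entities <glob>`. The function names, the output format, and the `limit` of 50 lines are all assumptions; the real output would come from the command template slots. The input is assumed to be the list of state objects returned by the WS `get_states` command, each with `entity_id` and `state` keys.

```python
from fnmatch import fnmatchcase
from typing import Any


def get_states_command(message_id: int) -> dict[str, Any]:
    """WS request for all current states; each command carries its own id."""
    return {"id": message_id, "type": "get_states"}


def format_entities(states: list[dict[str, Any]], glob: str = "*", limit: int = 50) -> str:
    """Render matching entities as 'entity_id: state' lines, sorted by id."""
    matches = sorted(
        (s for s in states if fnmatchcase(s.get("entity_id", ""), glob)),
        key=lambda s: s.get("entity_id", ""),
    )
    if not matches:
        return f"No entities match {glob!r}."
    lines = [f"{s.get('entity_id', '')}: {s.get('state', 'unknown')}" for s in matches[:limit]]
    if len(matches) > limit:
        lines.append(f"... and {len(matches) - limit} more")
    return "\n".join(lines)
```

Capping the number of lines keeps the reply within chat message size limits on installs with thousands of entities.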
Out-of-scope for Phase 2: any command that mutates HA state.
Phase 3 — Smart Actions (Service Calls)
A new action descriptor in the existing Smart Actions framework (packages/core/src/notify_bridge_core/providers/actions.py).
- Action type: `ha_call_service`
- Rule: trigger event → service call (e.g. "on motion event in `binary_sensor.front_door` → call `light.turn_on` on `light.porch`")
- Executor uses the existing WS connection to send `call_service`.
This phase is gated behind explicit per-target authorization in the UI — HA service calls can do anything the access token allows, including unlocking doors. Default state: disabled, with a clear consent flow when enabling.
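A minimal sketch of the gate, assuming authorization is stored as a set of entity ids the user has explicitly approved. `build_call_service`, `ServiceCallNotAuthorized`, and the `authorized_targets` parameter are hypothetical names; only the shape of the `call_service` message follows the Home Assistant WebSocket API.

```python
from typing import Any, Mapping, Optional


class ServiceCallNotAuthorized(Exception):
    """Raised when a rule targets an entity the user has not approved."""


def build_call_service(
    message_id: int,
    domain: str,
    service: str,
    entity_id: str,
    *,
    authorized_targets: frozenset[str],
    service_data: Optional[Mapping[str, Any]] = None,
) -> dict[str, Any]:
    """Build the WS message, refusing any target outside the approved set."""
    if entity_id not in authorized_targets:
        raise ServiceCallNotAuthorized(entity_id)
    message: dict[str, Any] = {
        "id": message_id,
        "type": "call_service",
        "domain": domain,
        "service": service,
        "target": {"entity_id": entity_id},
    }
    if service_data:
        message["service_data"] = dict(service_data)
    return message
```

With an empty `authorized_targets` set, which matches the disabled-by-default state, every call is refused before anything reaches the socket.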
Rough effort estimates
These are rough — sub-task discovery during Phase 1 will refine them.
| Phase | Estimate (focused work) |
|---|---|
| Phase 1 (subscribe + dispatch) | 2–3 sessions |
| Phase 2 (bot commands) | 1 session |
| Phase 3 (smart actions) | 1–2 sessions |
When to start
Phase 1 work order, once you green-light it:
- ABC extension (`base.py`) + tests for the new `subscribe` shape on a fake provider.
- WS client + parser + unit tests against recorded HA fixtures (no live HA needed for these).
- Subscription manager in `services/watcher.py`, with an integration test using the fake provider from step 1.
- Templates (en + ru), capabilities entry, validator whitelist.
- Server: seeds, sample context, template_configs entry.
- Frontend: descriptor, locale keys, i18n.
- End-to-end smoke test against a real HA instance (homelab).
Per the project rule, restart the backend after every change in
packages/server/ or packages/core/.
Decision log
- 2026-05-13 — Plan drafted. Ingest mode = WebSocket (chosen over webhook for future-proofing toward Phases 2 + 3). Not started.