Files
personal-ai-assistant/GeneralPlan.md
2026-03-19 11:30:58 +03:00

13 KiB

Personal AI-Assistant App - Implementation Plan

Context

Build a greenfield client-server personal AI-assistant focused on health management. Users upload health documents, chat with AI specialists, get proactive health reminders, and receive notifications across multiple channels. The app has admin and user roles with configurable limits.

Tech Stack

  • Backend: Python 3.12 + FastAPI (async)
  • Frontend: React 18 + TypeScript + Vite
  • UI: Shadcn/ui + Tailwind CSS + Lucide icons
  • Database: PostgreSQL 16 (async via SQLAlchemy 2.0 + asyncpg)
  • AI: Claude API (Anthropic SDK) with streaming + tool use
  • Notifications: In-app (WebSocket) + Email (aiosmtplib) + Telegram (aiogram)
  • File storage: Local filesystem with Docker volumes
  • i18n: i18next (English + Russian)
  • State: Zustand (UI state) + TanStack Query (server state)
  • Deployment: Docker Compose (nginx + backend + frontend + postgres + redis + telegram-bot)

Project Structure

ai-assistant/
├── docker-compose.yml
├── docker-compose.dev.yml
├── .env.example
├── .gitignore
│
├── backend/
│   ├── Dockerfile
│   ├── pyproject.toml
│   ├── alembic.ini
│   ├── alembic/versions/
│   ├── app/
│   │   ├── main.py              # App factory, lifespan, middleware
│   │   ├── config.py            # Pydantic Settings
│   │   ├── database.py          # Async engine + session factory
│   │   ├── models/              # SQLAlchemy ORM (user, chat, message, document, skill, memory, notification, setting, context_file)
│   │   ├── schemas/             # Pydantic request/response models
│   │   ├── api/
│   │   │   ├── deps.py          # get_db, get_current_user, require_admin
│   │   │   └── v1/              # auth, users, chats, messages, documents, skills, memory, notifications, admin, ws
│   │   ├── services/            # Business logic (auth, user, chat, ai, document, skill, memory, notification, pdf, email, telegram, scheduler)
│   │   ├── core/                # security.py (JWT, hashing), permissions.py, middleware.py
│   │   ├── workers/             # Background: document_processor, notification_sender
│   │   └── utils/               # file_storage.py, text_extraction.py
│   ├── tests/
│   └── scripts/seed_admin.py
│
├── frontend/
│   ├── Dockerfile
│   ├── package.json
│   ├── vite.config.ts
│   ├── tailwind.config.ts
│   ├── components.json
│   ├── public/locales/{en,ru}/translation.json
│   └── src/
│       ├── api/                 # Axios client + endpoint modules
│       ├── hooks/               # use-auth, use-chat, use-websocket, use-notifications
│       ├── stores/              # Zustand: auth-store, chat-store, notification-store, ui-store
│       ├── components/
│       │   ├── ui/              # shadcn/ui
│       │   ├── layout/          # app-layout, sidebar, header
│       │   ├── auth/            # login-form, register-form, account-switcher
│       │   ├── chat/            # chat-list, chat-window, message-bubble, message-input, skill-selector
│       │   ├── documents/       # document-list, upload, viewer
│       │   ├── notifications/   # bell, list, settings
│       │   ├── admin/           # user-management, context-editor, skill-editor, settings-panel
│       │   └── shared/          # theme-provider, language-toggle, protected-route, error-boundary
│       ├── pages/               # login, register, dashboard, chat, documents, memory, notifications, profile, admin/*
│       └── routes.tsx
│
├── telegram-bot/
│   ├── Dockerfile
│   ├── pyproject.toml
│   └── bot/                     # main.py, handlers.py, api_client.py
│
└── nginx/
    └── nginx.conf               # /api/* -> backend, /* -> frontend

Database Schema

Core Tables

users: id(UUID), email, username, hashed_password, full_name, role(user/admin), is_active, max_chats, oauth_provider, oauth_provider_id, telegram_chat_id, avatar_url, created_at, updated_at

sessions: id(UUID), user_id(FK), refresh_token_hash, device_info, ip_address, expires_at, created_at

chats: id(UUID), user_id(FK), title, skill_id(FK nullable), is_archived, created_at, updated_at

messages: id(UUID), chat_id(FK), role(user/assistant/system/tool), content(TEXT), metadata(JSONB), created_at

documents: id(UUID), user_id(FK), filename, storage_path, mime_type, file_size, doc_type(lab_result/consultation/prescription/imaging/other), extracted_text(TEXT), embedding_status(pending/processing/completed/failed), metadata(JSONB), created_at

skills: id(UUID), user_id(FK nullable, NULL=general), name, description, system_prompt(TEXT), icon, is_active, sort_order, created_at

memory_entries: id(UUID), user_id(FK), category(condition/medication/allergy/vital/document_summary/other), title, content(TEXT), source_document_id(FK nullable), importance(critical/high/medium/low), is_active, created_at

context_files: id(UUID), type(primary/personal), user_id(FK nullable), content(TEXT), version, updated_by(FK), created_at, updated_at. UNIQUE(type, user_id)

notifications: id(UUID), user_id(FK), title, body(TEXT), type(reminder/alert/info/ai_generated), channel(in_app/email/telegram), status(pending/sent/delivered/read/failed), scheduled_at, sent_at, read_at, metadata(JSONB), created_at

settings: key(PK), value(JSONB), updated_by(FK), updated_at. Keys: self_registration_enabled, default_max_chats, claude_model, smtp_config, telegram_bot_token

generated_pdfs: id(UUID), user_id(FK), title, storage_path, source_document_ids(UUID[]), source_chat_id(FK nullable), created_at


API Design (/api/v1/)

Group Key Endpoints
Auth POST login, register, refresh, logout; GET oauth/{provider}, oauth/{provider}/callback; POST switch-account
Users GET/PATCH /me; GET/PUT /me/context; PATCH /me/telegram
Chats CRUD + GET /{id}/messages + POST /{id}/messages (SSE streaming)
Documents CRUD + GET /{id}/download + POST /{id}/reindex
Skills CRUD (personal skills)
Memory CRUD
Notifications GET list, PATCH /{id}/read, POST mark-all-read, GET unread-count
PDF POST /compile, GET /{id}/download, GET /
Admin Users CRUD, GET/PUT /context, Skills CRUD, GET/PATCH /settings
WebSocket /ws/notifications, /ws/chat/{chat_id}

AI Integration

Context Assembly Order

  1. Primary context file (admin global prompt)
  2. Personal context file (user-specific)
  3. Active skill system prompt
  4. Global memory (critical/high importance entries)
  5. Relevant document excerpts (via PostgreSQL full-text search)
  6. Conversation history
  7. Current user message

AI Tools (Claude function calling)

Tool Purpose
save_memory Persist health info to user's global memory
schedule_notification Schedule reminders (one-time or recurring)
search_documents Search uploaded documents by content
generate_pdf Create PDF compilation from user data
get_memory Retrieve specific memory entries

Streaming

Backend receives SSE from Claude API, forwards via SSE to frontend. Tool use blocks are executed server-side in a loop until final text response.

Proactive AI

Daily scheduled job (APScheduler, 8 AM) reviews each user's memory + recent docs via Claude, generates reminders/notifications.


Document Processing Pipeline

  1. Upload -> validate type/size -> save to /data/uploads/{user_id}/{doc_id}/
  2. Background: extract text (PyMuPDF for PDFs, pytesseract for scanned/images)
  3. Background: Claude extracts structured data -> metadata JSONB
  4. Search: PostgreSQL full-text search on extracted_text (tsvector/tsquery)
  5. PDF generation: WeasyPrint renders HTML template -> PDF

Notification System

  • In-app: WebSocket push on connect, stored in DB for offline users
  • Email: aiosmtplib + Jinja2 HTML templates
  • Telegram: Separate aiogram bot service, linked via one-time code flow
  • Scheduling: APScheduler with CronTrigger (recurring) / DateTrigger (one-time)

Docker Compose Services

Service Image/Build Purpose
postgres postgres:16-alpine Database
redis redis:7-alpine Rate limiting, scheduler job store, WebSocket broker
backend ./backend FastAPI app (runs migrations on start)
frontend ./frontend Vite build served by nginx
telegram-bot ./telegram-bot Telegram notification bot
nginx ./nginx Reverse proxy (80/443)

Planning Rules

Subplan requirement: Before starting any phase, a detailed subplan must be created at plans/phase-N-<name>.md. The subplan breaks the phase into granular, trackable tasks with checkboxes. Implementation may only begin after the subplan is written.

Tracking format: Both this GeneralPlan and each subplan use [x]/[ ] checkboxes. GeneralPlan tracks phase-level progress (subplan created, phase completed). Subplans track individual tasks (files created, features implemented, tests passing). Update checkboxes as work is completed so progress is always visible at a glance.

Subplan structure: Each subplan must include:

  1. Goal — one-sentence summary of what the phase delivers
  2. Prerequisites — what must be done before this phase starts
  3. Tasks — numbered, checkboxed list of implementation steps (granular enough that each is ~1 working session)
  4. Files to create/modify — explicit list of files touched
  5. Acceptance criteria — how to verify the phase is complete
  6. Status — one of: NOT STARTED, IN PROGRESS, COMPLETED

Implementation Phases

Phase 1: Foundation

  • Status: NOT STARTED
  • Subplan created (plans/phase-1-foundation.md)
  • Phase completed
  • Summary: Monorepo setup, Docker Compose, FastAPI + Alembic, auth (JWT), frontend shell (Vite + React + shadcn/ui + i18n), seed admin script

Phase 2: Chat & AI Core

  • Status: NOT STARTED
  • Subplan created (plans/phase-2-chat-ai.md)
  • Phase completed
  • Summary: Chats + messages tables, chat CRUD, SSE streaming, Claude API integration, context assembly, frontend chat UI, admin context editor, chat limits

Phase 3: Skills & Context

  • Status: NOT STARTED
  • Subplan created (plans/phase-3-skills-context.md)
  • Phase completed
  • Summary: Skills + context_files tables, skills CRUD (general + personal), personal context CRUD, context layering, frontend skill selector + editors

Phase 4: Documents & Memory

  • Status: NOT STARTED
  • Subplan created (plans/phase-4-documents-memory.md)
  • Phase completed
  • Summary: Documents + memory tables, upload + processing pipeline, full-text search, AI tools (save_memory, search_documents, get_memory), frontend document/memory UI

Phase 5: Notifications

  • Status: NOT STARTED
  • Subplan created (plans/phase-5-notifications.md)
  • Phase completed
  • Summary: Notifications table, WebSocket + email + Telegram channels, APScheduler, AI schedule_notification tool, proactive health review job, frontend notification UI

Phase 6: PDF & Polish

  • Status: NOT STARTED
  • Subplan created (plans/phase-6-pdf-polish.md)
  • Phase completed
  • Summary: PDF generation (WeasyPrint), AI generate_pdf tool, OAuth, account switching, admin user management + settings, rate limiting, responsive pass

Phase 7: Hardening

  • Status: NOT STARTED
  • Subplan created (plans/phase-7-hardening.md)
  • Phase completed
  • Summary: Security audit, file upload validation, performance tuning, structured logging, production Docker images, health checks, backup strategy, documentation

Key Design Decisions

  • SSE for chat streaming (simpler than WebSocket, sufficient for server->client)
  • WebSocket only for notifications (bidirectional, real-time)
  • APScheduler over Celery (simpler for single-instance MVP, swappable later)
  • PostgreSQL FTS over vector DB (sufficient for MVP, pgvector addable later)
  • Separate Telegram bot service (different lifecycle, independent restarts)
  • file_storage.py abstraction (swap to S3 by changing one module)

Verification

  • Run docker-compose up and verify all services start
  • Create admin via seed script, log in, create a chat
  • Upload a document, verify text extraction
  • Chat with AI, verify streaming + skill context + memory tools
  • Schedule a notification, verify delivery across channels
  • Test i18n switching (en/ru), dark/light theme
  • Test on mobile viewport