Files

dolgolyov.alexei d86d53f473 Phase 10: Per-User Rate Limits — messages + tokens, quota UI, admin usage

Backend:
- max_ai_messages_per_day + max_ai_tokens_per_day on User model (nullable, override)
- Migration 008: add columns + seed default settings (100 msgs, 500K tokens)
- usage_service: count today's messages + tokens, check quota, get limits
- GET /chats/quota returns usage vs limits + reset time
- POST /chats/{id}/messages checks quota before streaming (429 if exceeded)
- Admin user schemas expose both limit fields
- GET /admin/usage returns per-user daily message + token counts
- admin_user_service allows updating both limit fields

Frontend:
- Chat header shows "X/Y messages · XK/YK tokens" with red highlight at limit
- Quota refreshes every 30s via TanStack Query
- Admin usage page with table: user, messages today, tokens today
- Route + sidebar entry for admin usage
- English + Russian translations

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-03-19 15:44:51 +03:00

14 KiB

Raw Blame History

Personal AI-Assistant App - Implementation Plan

Context

Build a greenfield client-server personal AI-assistant focused on health management. Users upload health documents, chat with AI specialists, get proactive health reminders, and receive notifications across multiple channels. The app has admin and user roles with configurable limits.

Tech Stack

Backend: Python 3.12 + FastAPI (async)
Frontend: React 18 + TypeScript + Vite
UI: Shadcn/ui + Tailwind CSS + Lucide icons
Database: PostgreSQL 16 (async via SQLAlchemy 2.0 + asyncpg)
AI: Claude API (Anthropic SDK) with streaming + tool use
Notifications: In-app (WebSocket) + Email (aiosmtplib) + Telegram (aiogram)
File storage: Local filesystem with Docker volumes
i18n: i18next (English + Russian)
State: Zustand (UI state) + TanStack Query (server state)
Deployment: Docker Compose (nginx + backend + frontend + postgres + redis + telegram-bot)

Project Structure

ai-assistant/
├── docker-compose.yml
├── docker-compose.dev.yml
├── .env.example
├── .gitignore
│
├── backend/
│   ├── Dockerfile
│   ├── pyproject.toml
│   ├── alembic.ini
│   ├── alembic/versions/
│   ├── app/
│   │   ├── main.py              # App factory, lifespan, middleware
│   │   ├── config.py            # Pydantic Settings
│   │   ├── database.py          # Async engine + session factory
│   │   ├── models/              # SQLAlchemy ORM (user, chat, message, document, skill, memory, notification, setting, context_file)
│   │   ├── schemas/             # Pydantic request/response models
│   │   ├── api/
│   │   │   ├── deps.py          # get_db, get_current_user, require_admin
│   │   │   └── v1/              # auth, users, chats, messages, documents, skills, memory, notifications, admin, ws
│   │   ├── services/            # Business logic (auth, user, chat, ai, document, skill, memory, notification, pdf, email, telegram, scheduler)
│   │   ├── core/                # security.py (JWT, hashing), permissions.py, middleware.py
│   │   ├── workers/             # Background: document_processor, notification_sender
│   │   └── utils/               # file_storage.py, text_extraction.py
│   ├── tests/
│   └── scripts/seed_admin.py
│
├── frontend/
│   ├── Dockerfile
│   ├── package.json
│   ├── vite.config.ts
│   ├── tailwind.config.ts
│   ├── components.json
│   ├── public/locales/{en,ru}/translation.json
│   └── src/
│       ├── api/                 # Axios client + endpoint modules
│       ├── hooks/               # use-auth, use-chat, use-websocket, use-notifications
│       ├── stores/              # Zustand: auth-store, chat-store, notification-store, ui-store
│       ├── components/
│       │   ├── ui/              # shadcn/ui
│       │   ├── layout/          # app-layout, sidebar, header
│       │   ├── auth/            # login-form, register-form, account-switcher
│       │   ├── chat/            # chat-list, chat-window, message-bubble, message-input, skill-selector
│       │   ├── documents/       # document-list, upload, viewer
│       │   ├── notifications/   # bell, list, settings
│       │   ├── admin/           # user-management, context-editor, skill-editor, settings-panel
│       │   └── shared/          # theme-provider, language-toggle, protected-route, error-boundary
│       ├── pages/               # login, register, dashboard, chat, documents, memory, notifications, profile, admin/*
│       └── routes.tsx
│
├── telegram-bot/
│   ├── Dockerfile
│   ├── pyproject.toml
│   └── bot/                     # main.py, handlers.py, api_client.py
│
└── nginx/
    └── nginx.conf               # /api/* -> backend, /* -> frontend

Database Schema

Core Tables

users: id(UUID), email, username, hashed_password, full_name, role(user/admin), is_active, max_chats, oauth_provider, oauth_provider_id, telegram_chat_id, avatar_url, created_at, updated_at

sessions: id(UUID), user_id(FK), refresh_token_hash, device_info, ip_address, expires_at, created_at

chats: id(UUID), user_id(FK), title, skill_id(FK nullable), is_archived, created_at, updated_at

messages: id(UUID), chat_id(FK), role(user/assistant/system/tool), content(TEXT), metadata(JSONB), created_at

documents: id(UUID), user_id(FK), filename, storage_path, mime_type, file_size, doc_type(lab_result/consultation/prescription/imaging/other), extracted_text(TEXT), embedding_status(pending/processing/completed/failed), metadata(JSONB), created_at

skills: id(UUID), user_id(FK nullable, NULL=general), name, description, system_prompt(TEXT), icon, is_active, sort_order, created_at

memory_entries: id(UUID), user_id(FK), category(condition/medication/allergy/vital/document_summary/other), title, content(TEXT), source_document_id(FK nullable), importance(critical/high/medium/low), is_active, created_at

context_files: id(UUID), type(primary/personal), user_id(FK nullable), content(TEXT), version, updated_by(FK), created_at, updated_at. UNIQUE(type, user_id)

notifications: id(UUID), user_id(FK), title, body(TEXT), type(reminder/alert/info/ai_generated), channel(in_app/email/telegram), status(pending/sent/delivered/read/failed), scheduled_at, sent_at, read_at, metadata(JSONB), created_at

settings: key(PK), value(JSONB), updated_by(FK), updated_at. Keys: self_registration_enabled, default_max_chats, claude_model, smtp_config, telegram_bot_token

generated_pdfs: id(UUID), user_id(FK), title, storage_path, source_document_ids(UUID[]), source_chat_id(FK nullable), created_at

API Design (`/api/v1/`)

Group	Key Endpoints
Auth	POST login, register, refresh, logout; GET oauth/{provider}, oauth/{provider}/callback; POST switch-account
Users	GET/PATCH /me; GET/PUT /me/context; PATCH /me/telegram
Chats	CRUD + GET /{id}/messages + POST /{id}/messages (SSE streaming)
Documents	CRUD + GET /{id}/download + POST /{id}/reindex
Skills	CRUD (personal skills)
Memory	CRUD
Notifications	GET list, PATCH /{id}/read, POST mark-all-read, GET unread-count
PDF	POST /compile, GET /{id}/download, GET /
Admin	Users CRUD, GET/PUT /context, Skills CRUD, GET/PATCH /settings
WebSocket	/ws/notifications, /ws/chat/{chat_id}

AI Integration

Context Assembly Order

Primary context file (admin global prompt)
Personal context file (user-specific)
Active skill system prompt
Global memory (critical/high importance entries)
Relevant document excerpts (via PostgreSQL full-text search)
Conversation history
Current user message

AI Tools (Claude function calling)

Tool	Purpose
`save_memory`	Persist health info to user's global memory
`schedule_notification`	Schedule reminders (one-time or recurring)
`search_documents`	Search uploaded documents by content
`generate_pdf`	Create PDF compilation from user data
`get_memory`	Retrieve specific memory entries

Streaming

Backend receives SSE from Claude API, forwards via SSE to frontend. Tool use blocks are executed server-side in a loop until final text response.

Proactive AI

Daily scheduled job (APScheduler, 8 AM) reviews each user's memory + recent docs via Claude, generates reminders/notifications.

Document Processing Pipeline

Upload -> validate type/size -> save to /data/uploads/{user_id}/{doc_id}/
Background: extract text (PyMuPDF for PDFs, pytesseract for scanned/images)
Background: Claude extracts structured data -> metadata JSONB
Search: PostgreSQL full-text search on extracted_text (tsvector/tsquery)
PDF generation: WeasyPrint renders HTML template -> PDF

Notification System

In-app: WebSocket push on connect, stored in DB for offline users
Email: aiosmtplib + Jinja2 HTML templates
Telegram: Separate aiogram bot service, linked via one-time code flow
Scheduling: APScheduler with CronTrigger (recurring) / DateTrigger (one-time)

Docker Compose Services

Service	Image/Build	Purpose
postgres	postgres:16-alpine	Database
redis	redis:7-alpine	Rate limiting, scheduler job store, WebSocket broker
backend	./backend	FastAPI app (runs migrations on start)
frontend	./frontend	Vite build served by nginx
telegram-bot	./telegram-bot	Telegram notification bot
nginx	./nginx	Reverse proxy (80/443)

Planning Rules

Subplan requirement: Before starting any phase, a detailed subplan must be created at plans/phase-N-<name>.md. The subplan breaks the phase into granular, trackable tasks with checkboxes. Implementation may only begin after the subplan is written.

Tracking format: Both this GeneralPlan and each subplan use [x]/[ ] checkboxes. GeneralPlan tracks phase-level progress (subplan created, phase completed). Subplans track individual tasks (files created, features implemented, tests passing). Update checkboxes as work is completed so progress is always visible at a glance.

Phase review requirement: After completing all tasks in a phase, a detailed code review must be performed before marking the phase as completed. The review should check: (1) all acceptance criteria are met, (2) code quality and consistency with existing patterns, (3) no security vulnerabilities introduced, (4) all new endpoints tested, (5) frontend TypeScript compiles and Vite builds cleanly, (6) i18n complete for both languages. Review findings and any fixes applied must be noted in the subplan under a ## Review Notes section.

Subplan structure: Each subplan must include:

Goal — one-sentence summary of what the phase delivers

Prerequisites — what must be done before this phase starts

Tasks — numbered, checkboxed list of implementation steps (granular enough that each is ~1 working session)

Files to create/modify — explicit list of files touched

Acceptance criteria — how to verify the phase is complete

Status — one of: NOT STARTED, IN PROGRESS, COMPLETED

Implementation Phases

Phase 1: Foundation

Status: COMPLETED
Subplan created (plans/phase-1-foundation.md)
Phase completed
Summary: Monorepo setup, Docker Compose, FastAPI + Alembic, auth (JWT), frontend shell (Vite + React + shadcn/ui + i18n), seed admin script

Phase 2: Chat & AI Core

Status: COMPLETED
Subplan created (plans/phase-2-chat-ai.md)
Phase completed
Summary: Chats + messages tables, chat CRUD, SSE streaming, Claude API integration, context assembly, frontend chat UI, admin context editor, chat limits

Phase 3: Skills & Context

Status: COMPLETED
Subplan created (plans/phase-3-skills-context.md)
Phase completed
Summary: Skills + context_files tables, skills CRUD (general + personal), personal context CRUD, context layering, frontend skill selector + editors

Phase 4: Documents & Memory

Status: COMPLETED
Subplan created (plans/phase-4-documents-memory.md)
Phase completed
Summary: Documents + memory tables, upload + processing pipeline, full-text search, AI tools (save_memory, search_documents, get_memory), frontend document/memory UI

Phase 5: Notifications

Status: COMPLETED
Subplan created (plans/phase-5-notifications.md)
Phase completed
Summary: Notifications table, WebSocket + email + Telegram channels, APScheduler, AI schedule_notification tool, proactive health review job, frontend notification UI

Phase 6: PDF & Polish

Status: COMPLETED
Subplan created (plans/phase-6-pdf-polish.md)
Phase completed
Summary: PDF generation (WeasyPrint), AI generate_pdf tool, OAuth, account switching, admin user management + settings, rate limiting, responsive pass

Phase 7: Hardening

Status: COMPLETED
Subplan created (plans/phase-7-hardening.md)
Phase completed
Summary: Security audit, file upload validation, performance tuning, structured logging, production Docker images, health checks, backup strategy, documentation

Phase 8: Customizable PDF Templates

Status: COMPLETED
Subplan created (plans/phase-8-pdf-templates.md)
Phase completed
Summary: Admin-managed Jinja2 PDF templates in DB with locale support (en/ru), template selector for users/AI, live preview editor, basic + medical seed templates

Phase 9: OAuth & Account Switching

Status: NOT STARTED
Subplan created (plans/phase-9-oauth.md)
Phase completed
Summary: OAuth (Google, GitHub), account switching UI, multiple stored sessions

Phase 10: Per-User Rate Limits

Status: COMPLETED
Subplan created (plans/phase-10-rate-limits.md)
Phase completed
Summary: Per-user AI message + token rate limits, admin-configurable defaults, usage tracking dashboard

Key Design Decisions

SSE for chat streaming (simpler than WebSocket, sufficient for server->client)
WebSocket only for notifications (bidirectional, real-time)
APScheduler over Celery (simpler for single-instance MVP, swappable later)
PostgreSQL FTS over vector DB (sufficient for MVP, pgvector addable later)
Separate Telegram bot service (different lifecycle, independent restarts)
file_storage.py abstraction (swap to S3 by changing one module)

Verification

Run docker-compose up and verify all services start
Create admin via seed script, log in, create a chat
Upload a document, verify text extraction
Chat with AI, verify streaming + skill context + memory tools
Schedule a notification, verify delivery across channels
Test i18n switching (en/ru), dark/light theme
Test on mobile viewport

14 KiB Raw Blame History