---
sidebar_position: 1
title: Architecture Overview
---
# Architecture Overview
Reflector is built as a scalable, microservices-based application designed to handle audio processing workloads efficiently while maintaining data privacy and control.
## System Components

### Frontend Application
The user interface is built with Next.js 15 using the App Router pattern, providing:
- Server-side rendering for optimal performance
- Real-time WebSocket connections for live transcription
- WebRTC support for audio streaming and live meetings (via Daily.co or Whereby)
- Responsive design with Chakra UI components
### Backend API Server
The core API is powered by FastAPI, a modern Python framework that provides:
- High-performance async request handling
- Automatic OpenAPI documentation generation
- Type safety with Pydantic models
- WebSocket support for real-time updates
### Processing Pipeline
Audio processing is handled through a modular pipeline architecture:
```
Audio Input → Chunking → Transcription → Diarization → Post-Processing → Storage
```
Each step can run independently and in parallel, allowing for:
- Scalable processing of large files
- Real-time streaming capabilities
- Fault tolerance and retry mechanisms
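The pipeline above can be sketched as independent stages passing offset-tagged chunks along. This is an illustrative outline only; the function names and the 16 kHz mono PCM assumption are ours, not Reflector's actual code:

```python
from dataclasses import dataclass
from typing import Iterator

BYTES_PER_SECOND = 32_000  # assumption: 16 kHz, 16-bit mono PCM

@dataclass
class Segment:
    start: float  # seconds from the start of the recording
    end: float
    text: str

def chunk(audio: bytes, chunk_seconds: float = 30.0) -> Iterator[tuple[float, bytes]]:
    """Split raw audio into fixed-length chunks, yielding (offset, data)."""
    step = int(chunk_seconds * BYTES_PER_SECOND)
    for i in range(0, len(audio), step):
        yield i / BYTES_PER_SECOND, audio[i:i + step]

def transcribe(offset: float, data: bytes) -> Segment:
    """Stand-in for the real ASR call; keeps the chunk's absolute offset."""
    return Segment(offset, offset + len(data) / BYTES_PER_SECOND, text="...")

def run_pipeline(audio: bytes) -> list[Segment]:
    # Chunks are independent, so this loop could fan out to parallel workers.
    return [transcribe(off, data) for off, data in chunk(audio)]
```

Because each chunk carries its own offset, the transcribe calls are order-independent and can be retried or distributed freely, which is what makes the parallelism and fault tolerance above possible.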
### Worker Architecture
Background tasks are managed by Celery workers with Redis as the message broker:
- Distributed task processing
- Priority queues for time-sensitive operations
- Automatic retry on failure
- Progress tracking and notifications
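Celery expresses retry policy declaratively (e.g. `max_retries` with exponential backoff); a stdlib-only sketch of the underlying behavior, for illustration:

```python
import time

def retry_with_backoff(task, max_retries: int = 3, base_delay: float = 1.0):
    """Run `task`, retrying on failure with exponential backoff
    (1 s, 2 s, 4 s, ...), similar to Celery's retry_backoff behavior."""
    for attempt in range(max_retries + 1):
        try:
            return task()
        except Exception:
            if attempt == max_retries:
                raise  # exhausted retries: surface the last error
            time.sleep(base_delay * 2 ** attempt)
```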
### GPU Acceleration
ML models run on GPU-accelerated infrastructure:
- Modal.com for serverless GPU processing
- Self-hosted GPU with Docker deployment
- Automatic scaling based on demand
- Cost-effective pay-per-use model
## Data Flow

### Daily.co Meeting Recording Flow
1. Recording: Daily.co captures separate audio tracks per participant
2. Webhook: Daily.co notifies Reflector when the recording is ready
3. Track Download: Individual participant tracks fetched from S3
4. Padding: Tracks padded with silence based on join time for synchronization
5. Transcription: Each track transcribed independently (speaker = track index)
6. Merge: Transcriptions sorted by timestamp and combined
7. Mixdown: Tracks mixed to a single MP3 for playback
8. Post-Processing: Topics, title, and summaries generated via LLM
9. Delivery: Results stored and user notified via WebSocket
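The padding and merge steps hinge on converting per-track timestamps to meeting time: padding a track with silence is equivalent to shifting its segment timestamps by the participant's join offset. A simplified sketch, with illustrative types and names:

```python
from dataclasses import dataclass

@dataclass
class TrackSegment:
    start: float  # seconds, relative to this track's own recording start
    text: str
    speaker: int  # track index stands in for speaker identity

def merge_tracks(tracks: list[tuple[float, list[TrackSegment]]]) -> list[TrackSegment]:
    """Each entry is (join_offset_seconds, segments). Shift each segment
    to absolute meeting time, then sort to interleave speakers."""
    merged = [
        TrackSegment(seg.start + join_offset, seg.text, seg.speaker)
        for join_offset, segments in tracks
        for seg in segments
    ]
    return sorted(merged, key=lambda s: s.start)
```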
### File Upload Flow
1. Upload: User uploads an audio file through the web interface
2. Storage: File stored temporarily
3. Transcription: Full file transcribed via Whisper
4. Diarization: ML-based speaker identification (Pyannote)
5. Post-Processing: Topics, title, and summaries
6. Delivery: Results stored and user notified
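Diarization yields speaker turns as time ranges; one common way to attach them to transcript segments is a maximum-overlap match. This is a simplified sketch, not Reflector's actual implementation:

```python
def assign_speakers(
    transcript: list[tuple[float, float, str]],  # (start, end, text)
    turns: list[tuple[float, float, str]],       # (start, end, speaker_label)
) -> list[tuple[str, str]]:
    """Label each transcript segment with the speaker whose turn
    overlaps it the most (empty label if nothing overlaps)."""
    labeled = []
    for t_start, t_end, text in transcript:
        best, best_overlap = "", 0.0
        for s_start, s_end, speaker in turns:
            overlap = min(t_end, s_end) - max(t_start, s_start)
            if overlap > best_overlap:
                best, best_overlap = speaker, overlap
        labeled.append((text, best))
    return labeled
```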
### Live Streaming Flow
1. WebRTC Connection: Browser establishes a peer connection via Daily.co or Whereby
2. Audio Capture: Microphone audio streamed to the server
3. Buffering: Audio buffered for processing
4. Real-time Processing: Segments transcribed as they arrive
5. WebSocket Updates: Results streamed back to the client
6. Continuous Assembly: Full transcript built progressively
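The continuous-assembly step can be pictured as a keyed buffer: partial segment text arrives over the WebSocket and may be revised until final. A minimal sketch (the id-keyed update protocol is an assumption, not Reflector's wire format):

```python
class LiveTranscript:
    """Assemble a transcript progressively: segments arrive keyed by id,
    and partial text may be revised by later updates."""
    def __init__(self):
        self.segments: dict[int, str] = {}

    def update(self, segment_id: int, text: str) -> None:
        self.segments[segment_id] = text  # a revision overwrites the partial

    def render(self) -> str:
        # Render segments in id order regardless of arrival order.
        return " ".join(text for _, text in sorted(self.segments.items()))
```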
## Deployment Architecture

### Container-Based Deployment
All components are containerized for consistent deployment:
```yaml
services:
  web:       # Next.js application
  server:    # FastAPI server
  worker:    # Celery workers
  redis:     # Message broker
  postgres:  # Database
  caddy:     # Reverse proxy
```
### Networking
- Host Network Mode: Required for WebRTC/ICE compatibility
- Caddy Reverse Proxy: Handles SSL termination and routing
- WebSocket Upgrade: Supports real-time connections
## Scalability Considerations

### Horizontal Scaling
- Stateless Backend: Multiple API server instances
- Worker Pools: Add workers based on queue depth
- Database Pooling: Connection management for concurrent access
### Vertical Scaling
- GPU Workers: Scale up for faster model inference
- Memory Optimization: Efficient audio buffering
## Security Architecture

### Authentication & Authorization
- JWT Tokens: Stateless authentication
- Authentik Integration: Enterprise SSO support
- Role-Based Access: Granular permissions
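"Stateless" here means the API can verify a token with only a key, no session store lookup. A stdlib sketch of HS256 signature verification; production code should use a maintained library such as PyJWT and also validate claims like `exp`, `aud`, and `iss`:

```python
import base64, hashlib, hmac, json

def b64url_decode(s: str) -> bytes:
    # JWTs use unpadded base64url; restore padding before decoding.
    return base64.urlsafe_b64decode(s + "=" * (-len(s) % 4))

def verify_hs256(token: str, secret: bytes) -> dict:
    """Verify an HS256 JWT signature and return its claims."""
    header_b64, payload_b64, sig_b64 = token.split(".")
    signing_input = f"{header_b64}.{payload_b64}".encode()
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(sig_b64)):
        raise ValueError("invalid signature")
    return json.loads(b64url_decode(payload_b64))
```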
### Data Protection
- Encryption in Transit: TLS for all connections
- Temporary Storage: Automatic cleanup of processed files
### Privacy by Design
- Local Processing: Option to process entirely on-premises
- No Training on User Data: Models are pre-trained
- Data Isolation: Multi-tenant data separation
## Integration Points

### External Services
- Modal.com: GPU processing
- AWS S3: Long-term storage
- Whereby: Video conferencing rooms
- Zulip: Chat integration (optional)
### APIs and Webhooks
- RESTful API: Standard CRUD operations
- WebSocket API: Real-time updates
- Webhook Notifications: Processing completion events
- OpenAPI Specification: Machine-readable API definition
## Performance Optimization

### Caching Strategy
- Redis Cache: Frequently accessed data
- CDN: Static asset delivery
- Browser Cache: Client-side optimization
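The read-through pattern behind a Redis cache reduces to "serve the stored value until the entry expires". A minimal in-process sketch of per-entry TTL (Redis provides the same semantics with `SET key value EX ttl`):

```python
import time

class TTLCache:
    """Tiny in-process cache with per-entry expiry."""
    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable for testing
        self.store: dict[str, tuple[float, object]] = {}

    def get(self, key: str):
        entry = self.store.get(key)
        if entry is None or self.clock() > entry[0]:
            return None  # miss or expired
        return entry[1]

    def set(self, key: str, value) -> None:
        self.store[key] = (self.clock() + self.ttl, value)
```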
### Database Optimization
- Indexed Queries: Fast search and retrieval
- Connection Pooling: Efficient resource usage
- Query Optimization: N+1 query prevention
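N+1 prevention means replacing one query per parent row with a single batched query. An sqlite3 sketch of the batched form (the `topics` table here is illustrative, not Reflector's actual schema):

```python
import sqlite3

def topics_for_transcripts(conn, transcript_ids: list[int]) -> dict[int, list[str]]:
    """One IN(...) query for all topics instead of one query per transcript."""
    placeholders = ",".join("?" * len(transcript_ids))
    rows = conn.execute(
        f"SELECT transcript_id, title FROM topics "
        f"WHERE transcript_id IN ({placeholders})",
        transcript_ids,
    )
    grouped: dict[int, list[str]] = {tid: [] for tid in transcript_ids}
    for tid, title in rows:
        grouped[tid].append(title)
    return grouped
```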
### Processing Optimization
- Batch Processing: Efficient GPU utilization
- Parallel Execution: Multi-core CPU usage
- Stream Processing: Reduced memory footprint
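Stream processing keeps peak memory at one chunk regardless of input size; a generator makes this explicit:

```python
from typing import BinaryIO, Iterator

def stream_chunks(f: BinaryIO, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield fixed-size chunks so peak memory stays at one chunk,
    no matter how large the audio file is."""
    while chunk := f.read(chunk_size):
        yield chunk
```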
## Monitoring and Observability

### Metrics Collection
- Application Metrics: Request rates, response times
- System Metrics: CPU, memory, disk usage
- Business Metrics: Transcription accuracy, processing times
### Logging
- Structured Logging: JSON format for analysis
- Log Aggregation: Centralized log management
- Error Tracking: Sentry integration
### Health Checks
- Liveness Probes: Component availability
- Readiness Probes: Service readiness
- Dependency Checks: External service status
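A dependency check often starts as "can we reach the service in time". A minimal TCP probe sketch; real checks would also issue a protocol-level ping (e.g. Redis `PING` or a `SELECT 1` against Postgres):

```python
import socket

def check_tcp(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False
```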