reflector/docs/docs/concepts/overview.md
Igor Monadical 407c15299f docs: docs website + installation (#778)
Co-authored-by: Mathieu Virbel <mat@meltingrocks.com>
Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2026-01-06 17:25:02 -05:00


---
sidebar_position: 1
title: Architecture Overview
---
# Architecture Overview
Reflector is a microservices-based application for audio processing. It transcribes, diarizes, and summarizes uploaded recordings and live meetings, and it can run entirely on infrastructure you control so that data stays private.
## System Components
### Frontend Application
The user interface is built with **Next.js 15** using the App Router pattern, providing:
- Server-side rendering for optimal performance
- Real-time WebSocket connections for live transcription
- WebRTC support for audio streaming and live meetings (via Daily.co or Whereby)
- Responsive design with Chakra UI components
### Backend API Server
The core API is powered by **FastAPI**, a modern Python framework that provides:
- High-performance async request handling
- Automatic OpenAPI documentation generation
- Type safety with Pydantic models
- WebSocket support for real-time updates
### Processing Pipeline
Audio processing is handled through a modular pipeline architecture:
```
Audio Input → Chunking → Transcription → Diarization → Post-Processing → Storage
```
Each step runs as an independent unit of work, and independent units (for example, separate chunks) can run in parallel, allowing for:
- Scalable processing of large files
- Real-time streaming capabilities
- Fault tolerance and retry mechanisms
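
The chunk-then-transcribe part of the pipeline can be illustrated with a minimal sketch. The function names (`chunk`, `transcribe`, `run_pipeline`) are hypothetical and the transcription step is a stand-in. This is not Reflector's implementation, only an illustration of how independent chunks can fan out in parallel.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk(audio: list[float], size: int) -> list[list[float]]:
    """Split a list of samples into fixed-size chunks (the last may be shorter)."""
    return [audio[i:i + size] for i in range(0, len(audio), size)]


def transcribe(chunk_samples: list[float]) -> str:
    """Stand-in for a speech-to-text call: it only reports the chunk length."""
    return f"<{len(chunk_samples)} samples>"


def run_pipeline(audio: list[float], chunk_size: int = 4) -> list[str]:
    chunks = chunk(audio, chunk_size)
    # Chunks do not depend on each other, so this stage can run in parallel.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(transcribe, chunks))  # map() preserves input order


print(run_pipeline([0.0] * 10))
```

Because `pool.map` returns results in input order, later stages can treat the output as a time-ordered sequence even though the chunks were processed concurrently.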
### Worker Architecture
Background tasks are managed by **Celery** workers with **Redis** as the message broker:
- Distributed task processing
- Priority queues for time-sensitive operations
- Automatic retry on failure
- Progress tracking and notifications
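
Priority ordering and retry-on-failure can be illustrated without Celery or Redis. The sketch below is a standard-library stand-in, not Celery's API. `PriorityTaskQueue` and its parameters are hypothetical names chosen for the example.

```python
import heapq
import itertools
from typing import Callable


class PriorityTaskQueue:
    """In-memory stand-in for a broker: a lower priority number runs first."""

    def __init__(self, max_retries: int = 2) -> None:
        self._heap: list = []
        self._counter = itertools.count()  # tie-breaker keeps FIFO order
        self.max_retries = max_retries

    def submit(self, priority: int, task: Callable[[], str], attempt: int = 0) -> None:
        heapq.heappush(self._heap, (priority, next(self._counter), attempt, task))

    def run_all(self) -> list[str]:
        results = []
        while self._heap:
            priority, _, attempt, task = heapq.heappop(self._heap)
            try:
                results.append(task())
            except Exception:
                # Re-enqueue the failed task until the retry budget is spent.
                if attempt < self.max_retries:
                    self.submit(priority, task, attempt + 1)
                else:
                    results.append("failed")
        return results
```

In a real deployment the broker and workers are separate processes. The sketch only shows the ordering and retry logic in one place.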
### GPU Acceleration
ML models run on GPU-accelerated infrastructure:
- **Modal.com** for serverless GPU processing, with automatic scaling based on demand and pay-per-use billing
- **Self-hosted GPU** with Docker deployment
## Data Flow
### Daily.co Meeting Recording Flow
1. **Recording**: Daily.co captures separate audio tracks per participant
2. **Webhook**: Daily.co notifies Reflector when recording is ready
3. **Track Download**: Individual participant tracks fetched from S3
4. **Padding**: Tracks padded with silence based on join time for synchronization
5. **Transcription**: Each track transcribed independently (speaker = track index)
6. **Merge**: Transcriptions sorted by timestamp and combined
7. **Mixdown**: Tracks mixed to single MP3 for playback
8. **Post-Processing**: Topics, title, and summaries generated via LLM
9. **Delivery**: Results stored and user notified via WebSocket
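
Steps 4 and 6 (padding and merge) can be sketched as follows. The `Segment` type, the raw-sample representation, and the function names are assumptions made for illustration. The actual pipeline works on audio files rather than Python lists.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # seconds from the start of the meeting
    speaker: int  # track index, as in step 5
    text: str


def pad_track(samples: list[float], join_offset_s: float, sample_rate: int) -> list[float]:
    """Prepend silence so the track lines up with the start of the meeting."""
    silence = [0.0] * round(join_offset_s * sample_rate)
    return silence + samples


def merge_transcripts(per_track: list[list[Segment]]) -> list[Segment]:
    """Flatten the per-track segments and order them by start time."""
    merged = [seg for track in per_track for seg in track]
    return sorted(merged, key=lambda seg: seg.start)
```

Padding first means every track shares the same time origin, so the merge step is a plain sort on `start`.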
### File Upload Flow
1. **Upload**: User uploads audio file through web interface
2. **Storage**: File stored temporarily
3. **Transcription**: Full file transcribed via Whisper
4. **Diarization**: ML-based speaker identification (Pyannote)
5. **Post-Processing**: Topics, title, summaries
6. **Delivery**: Results stored and user notified
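
Steps 3 and 4 produce two separate outputs, timed words and speaker turns, which then have to be combined. One common way to do this is to give each word the speaker whose turn overlaps it most. The sketch below assumes simple tuple shapes for both inputs and is not necessarily how Reflector combines them.

```python
def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(words, turns):
    """words: list of (start, end, text); turns: list of (start, end, speaker)."""
    labelled = []
    for w_start, w_end, text in words:
        best = max(turns, key=lambda t: overlap(w_start, w_end, t[0], t[1]), default=None)
        if best is not None and overlap(w_start, w_end, best[0], best[1]) > 0:
            labelled.append((best[2], text))
        else:
            labelled.append((None, text))  # no diarization turn covers this word
    return labelled
```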
### Live Streaming Flow
1. **WebRTC Connection**: Browser establishes peer connection via Daily.co or Whereby
2. **Audio Capture**: Microphone audio streamed to server
3. **Buffering**: Audio buffered for processing
4. **Real-time Processing**: Segments transcribed as they arrive
5. **WebSocket Updates**: Results streamed back to client
6. **Continuous Assembly**: Full transcript built progressively
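
Steps 3 to 6 (buffering, per-segment transcription, progressive assembly) can be sketched with two generators. The fixed byte-size segmentation and the `transcribe` callback are assumptions for the example. A real system would more likely segment on time or voice activity.

```python
from typing import Callable, Iterable, Iterator


def buffer_segments(frames: Iterable[bytes], segment_bytes: int) -> Iterator[bytes]:
    """Accumulate incoming audio frames and yield fixed-size segments."""
    buf = bytearray()
    for frame in frames:
        buf.extend(frame)
        while len(buf) >= segment_bytes:
            yield bytes(buf[:segment_bytes])
            del buf[:segment_bytes]
    if buf:
        yield bytes(buf)  # flush the remainder when the stream ends


def live_transcript(
    frames: Iterable[bytes], segment_bytes: int, transcribe: Callable[[bytes], str]
) -> Iterator[str]:
    """Yield the growing transcript after each segment is transcribed."""
    parts: list[str] = []
    for segment in buffer_segments(frames, segment_bytes):
        parts.append(transcribe(segment))
        yield " ".join(parts)
```

Each value yielded by `live_transcript` corresponds to one update that could be pushed to the client over the WebSocket.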
## Deployment Architecture
### Container-Based Deployment
All components are containerized for consistent deployment:
```yaml
services:
  web:       # Next.js application
  server:    # FastAPI server
  worker:    # Celery workers
  redis:     # Message broker
  postgres:  # Database
  caddy:     # Reverse proxy
```
### Networking
- **Host Network Mode**: Required for WebRTC/ICE compatibility
- **Caddy Reverse Proxy**: Handles SSL termination and routing
- **WebSocket Upgrade**: Supports real-time connections
## Scalability Considerations
### Horizontal Scaling
- **Stateless Backend**: Multiple API server instances
- **Worker Pools**: Add workers based on queue depth
- **Database Pooling**: Connection management for concurrent access
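
Sizing a worker pool from queue depth can be as simple as the heuristic below. The function name, the target of five tasks per worker, and the bounds are all hypothetical values chosen for illustration.

```python
import math


def desired_workers(
    queue_depth: int,
    tasks_per_worker: int = 5,
    min_workers: int = 1,
    max_workers: int = 10,
) -> int:
    """Return how many workers to run for the current backlog, within bounds."""
    needed = math.ceil(queue_depth / tasks_per_worker)
    return max(min_workers, min(max_workers, needed))
```

The lower bound keeps one worker warm when the queue is empty, and the upper bound caps cost when a burst arrives.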
### Vertical Scaling
- **GPU Workers**: Scale up for faster model inference
- **Memory Optimization**: Efficient audio buffering
## Security Architecture
### Authentication & Authorization
- **JWT Tokens**: Stateless authentication
- **Authentik Integration**: Enterprise SSO support
- **Role-Based Access**: Granular permissions
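
Stateless authentication means the server can validate a token from its signature and claims alone, without a session lookup. The sketch below shows this with HS256 and the standard library only. The algorithm choice is an assumption made to keep the example self-contained, since a deployment behind an identity provider may use an asymmetric algorithm and a maintained JWT library instead.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_hs256(claims: dict, secret: bytes) -> str:
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    signature = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(signature)}"


def verify_hs256(token: str, secret: bytes, now: Optional[float] = None) -> dict:
    header_b64, payload_b64, sig_b64 = token.split(".")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256
    ).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    if "exp" in claims and claims["exp"] < (time.time() if now is None else now):
        raise ValueError("token expired")
    return claims
```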
### Data Protection
- **Encryption in Transit**: TLS for all connections
- **Temporary Storage**: Automatic cleanup of processed files
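
Automatic cleanup of temporary files can be sketched as an age-based sweep over a directory. The function name and the use of file modification time as the age signal are assumptions. The source does not describe how Reflector decides when a file is safe to delete.

```python
import time
from pathlib import Path
from typing import Optional


def cleanup_old_files(
    directory: Path, max_age_s: float, now: Optional[float] = None
) -> list[Path]:
    """Delete regular files older than max_age_s and return the removed paths."""
    now = time.time() if now is None else now
    removed = []
    for path in directory.iterdir():
        if path.is_file() and now - path.stat().st_mtime > max_age_s:
            path.unlink()
            removed.append(path)
    return removed
```

A sweep like this would typically run as a periodic background task.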
### Privacy by Design
- **Local Processing**: Option to process entirely on-premises
- **No Training on User Data**: Models are pre-trained
- **Data Isolation**: Multi-tenant data separation
## Integration Points
### External Services
- **Modal.com**: GPU processing
- **AWS S3**: Long-term storage
- **Daily.co**: Video conferencing rooms with per-participant recording
- **Whereby**: Video conferencing rooms
- **Zulip**: Chat integration (optional)
### APIs and Webhooks
- **RESTful API**: Standard CRUD operations
- **WebSocket API**: Real-time updates
- **Webhook Notifications**: Processing completion events
- **OpenAPI Specification**: Machine-readable API definition
## Performance Optimization
### Caching Strategy
- **Redis Cache**: Frequently accessed data
- **CDN**: Static asset delivery
- **Browser Cache**: Client-side optimization
### Database Optimization
- **Indexed Queries**: Fast search and retrieval
- **Connection Pooling**: Efficient resource usage
- **Query Optimization**: N+1 query prevention
### Processing Optimization
- **Batch Processing**: Efficient GPU utilization
- **Parallel Execution**: Multi-core CPU usage
- **Stream Processing**: Reduced memory footprint
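
Batch processing and stream processing combine naturally: items are pulled lazily from a stream and grouped into fixed-size batches for the GPU. The helper below is a generic sketch with a hypothetical name, not code from Reflector.

```python
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], batch_size: int) -> Iterator[list[T]]:
    """Group a (possibly lazy) stream into lists of at most batch_size items."""
    batch: list[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch
```

Only one batch is held in memory at a time, which is where the reduced memory footprint comes from.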
## Monitoring and Observability
### Metrics Collection
- **Application Metrics**: Request rates, response times
- **System Metrics**: CPU, memory, disk usage
- **Business Metrics**: Transcription accuracy, processing times
### Logging
- **Structured Logging**: JSON format for analysis
- **Log Aggregation**: Centralized log management
- **Error Tracking**: Sentry integration
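
Structured JSON logging can be done with the standard `logging` module alone. The sketch below assumes a convention where extra fields are passed as `extra={"context": {...}}`. That convention and the `JsonFormatter` name are hypothetical, and the project may use a dedicated logging library instead.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Hypothetical convention: structured fields arrive via extra={"context": ...}.
        context = getattr(record, "context", None)
        if context:
            payload["context"] = context
        return json.dumps(payload)


def make_logger(name: str) -> logging.Logger:
    logger = logging.getLogger(name)
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

Usage would look like `make_logger("reflector").info("transcript done", extra={"context": {"transcript_id": "t1"}})`, which emits one JSON line that a log aggregator can index by field.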
### Health Checks
- **Liveness Probes**: Component availability
- **Readiness Probes**: Service readiness
- **Dependency Checks**: External service status
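
The difference between liveness and readiness can be sketched as two small functions. The names, the response shapes, and the 200/503 status convention are assumptions for illustration. The dependency checks are passed in as callables so the example stays self-contained.

```python
from typing import Callable


def liveness() -> dict:
    """The process is up and able to answer at all."""
    return {"status": "ok"}


def readiness(checks: dict[str, Callable[[], bool]]) -> tuple[int, dict]:
    """Run each dependency check and report 503 if any of them fails."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False  # a crashing check counts as a failed dependency
    status_code = 200 if all(results.values()) else 503
    return status_code, results
```

A liveness failure usually means the container should be restarted, while a readiness failure only means traffic should be held back until dependencies recover.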