---
sidebar_position: 1
title: Architecture Overview
---

# Architecture Overview

Reflector is a microservices-based application that processes audio workloads at scale while keeping data private and under your control.

## System Components

### Frontend Application

The user interface is built with **Next.js 15** using the App Router pattern, providing:

- Server-side rendering for optimal performance
- Real-time WebSocket connections for live transcription
- WebRTC support for audio streaming and live meetings (via Daily.co or Whereby)
- Responsive design with Chakra UI components

### Backend API Server

The core API is powered by **FastAPI**, a modern Python framework that provides:

- High-performance async request handling
- Automatic OpenAPI documentation generation
- Type safety with Pydantic models
- WebSocket support for real-time updates

### Processing Pipeline

Audio processing is handled through a modular pipeline architecture:

```
Audio Input → Chunking → Transcription → Diarization → Post-Processing → Storage
```

Each step can run independently and in parallel, allowing for:

- Scalable processing of large files
- Real-time streaming capabilities
- Fault tolerance and retry mechanisms
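
As a rough illustration of the modular pipeline above, the sketch below chains stages as Python generators, so each stage consumes items as soon as the previous one yields them. The names `chunk`, `transcribe`, `diarize`, and `run_pipeline` are hypothetical and do not come from the Reflector codebase, and the transcription and diarization bodies are placeholders rather than real model calls.

```python
from typing import Callable, Iterable, Iterator, List


def chunk(samples: Iterable[float], size: int = 4) -> Iterator[List[float]]:
    """Split a flat sample stream into fixed-size chunks (the last may be shorter)."""
    buf: List[float] = []
    for sample in samples:
        buf.append(sample)
        if len(buf) == size:
            yield buf
            buf = []
    if buf:
        yield buf


def transcribe(chunks: Iterable[List[float]]) -> Iterator[dict]:
    """Placeholder for a speech-to-text call: emits one fake segment per chunk."""
    for index, samples in enumerate(chunks):
        yield {"index": index, "text": f"<{len(samples)} samples>"}


def diarize(segments: Iterable[dict]) -> Iterator[dict]:
    """Placeholder for speaker labeling: tags every segment as speaker 0."""
    for segment in segments:
        yield {**segment, "speaker": 0}


def run_pipeline(source: Iterable, stages: List[Callable]) -> list:
    """Feed the source through each stage in order and collect the results."""
    stream = iter(source)
    for stage in stages:
        stream = stage(stream)
    return list(stream)


segments = run_pipeline(range(10), [chunk, transcribe, diarize])
```

Because every stage only depends on the stream it receives, a stage can be swapped or moved to a separate worker without touching the others.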

### Worker Architecture

Background tasks are managed by **Celery** workers with **Redis** as the message broker:

- Distributed task processing
- Priority queues for time-sensitive operations
- Automatic retry on failure
- Progress tracking and notifications
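
The sketch below illustrates two of the behaviors listed above, priority ordering and automatic retry, using only the Python standard library. It is a stand-in for explanation and is not Celery's API: `PriorityTaskQueue`, `submit`, and `max_retries` are hypothetical names, and a real deployment would rely on Celery's own queue routing and retry settings.

```python
import heapq
import itertools
from typing import Callable, List


class PriorityTaskQueue:
    """In-process queue: lower priority numbers run first, failures are retried."""

    def __init__(self, max_retries: int = 2) -> None:
        self._heap: list = []
        self._order = itertools.count()  # tie-breaker keeps FIFO order per priority
        self.max_retries = max_retries
        self.failed: List[str] = []

    def submit(self, priority: int, name: str, func: Callable[[], None], attempt: int = 0) -> None:
        heapq.heappush(self._heap, (priority, next(self._order), name, func, attempt))

    def run(self) -> List[str]:
        """Run until the queue is empty and return task names in completion order."""
        done: List[str] = []
        while self._heap:
            priority, _, name, func, attempt = heapq.heappop(self._heap)
            try:
                func()
                done.append(name)
            except Exception:
                if attempt < self.max_retries:
                    # Re-enqueue with the same priority and one more attempt recorded.
                    self.submit(priority, name, func, attempt + 1)
                else:
                    self.failed.append(name)
        return done
```

A time-sensitive task submitted with priority 1 runs before a summary task submitted earlier with priority 5, and a task that raises once is retried before being recorded in `failed`.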

### GPU Acceleration

ML models run on GPU-accelerated infrastructure:

- **Modal.com** for serverless GPU processing
- **Self-hosted GPU** with Docker deployment
- Automatic scaling based on demand
- Cost-effective pay-per-use model

## Data Flow

### Daily.co Meeting Recording Flow

1. **Recording**: Daily.co captures separate audio tracks per participant
2. **Webhook**: Daily.co notifies Reflector when recording is ready
3. **Track Download**: Individual participant tracks fetched from S3
4. **Padding**: Tracks padded with silence based on join time for synchronization
5. **Transcription**: Each track transcribed independently (speaker = track index)
6. **Merge**: Transcriptions sorted by timestamp and combined
7. **Mixdown**: Tracks mixed to single MP3 for playback
8. **Post-Processing**: Topics, title, and summaries generated via LLM
9. **Delivery**: Results stored and user notified via WebSocket
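
Steps 4 to 6 can be sketched as follows. This is a minimal illustration under stated assumptions, not Reflector's implementation: the `Segment` dataclass, `pad_with_silence`, and `merge_tracks` are hypothetical names, audio is represented as a plain list of float samples, and each track's transcription is assumed to arrive as `(start, end, text)` tuples already aligned to the meeting timeline.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Segment:
    start: float  # seconds from the start of the meeting
    end: float
    speaker: int  # index of the track the segment came from
    text: str


def pad_with_silence(samples: List[float], join_offset_s: float, sample_rate: int) -> List[float]:
    """Prepend silence so a late joiner's track lines up with the meeting start."""
    silence = [0.0] * round(join_offset_s * sample_rate)
    return silence + samples


def merge_tracks(tracks: List[List[Tuple[float, float, str]]]) -> List[Segment]:
    """Label each segment with its track index, then sort everything by start time."""
    merged = [
        Segment(start, end, track_index, text)
        for track_index, segments in enumerate(tracks)
        for start, end, text in segments
    ]
    merged.sort(key=lambda seg: (seg.start, seg.speaker))
    return merged
```

Since every participant has a separate track, the speaker label comes directly from the track index and no ML-based diarization is needed in this flow.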

### File Upload Flow

1. **Upload**: User uploads audio file through web interface
2. **Storage**: File stored temporarily
3. **Transcription**: Full file transcribed via Whisper
4. **Diarization**: ML-based speaker identification (Pyannote)
5. **Post-Processing**: Topics, title, and summaries generated via LLM
6. **Delivery**: Results stored and user notified
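
One common way to combine the outputs of steps 3 and 4 is to give each transcribed word the speaker whose diarization turn overlaps it the most. The sketch below shows that idea only; it does not use the Whisper or Pyannote APIs, the source does not state how Reflector performs this join, and `overlap` and `assign_speakers` are hypothetical helpers operating on plain tuples.

```python
from typing import List, Optional, Tuple

Word = Tuple[float, float, str]  # (start, end, text) from transcription
Turn = Tuple[float, float, str]  # (start, end, speaker) from diarization


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(words: List[Word], turns: List[Turn]) -> List[Tuple[str, Optional[str]]]:
    """Pair each word with the speaker whose turn overlaps it most, or None."""
    labeled = []
    for w_start, w_end, text in words:
        best_speaker, best_overlap = None, 0.0
        for t_start, t_end, speaker in turns:
            shared = overlap(w_start, w_end, t_start, t_end)
            if shared > best_overlap:
                best_speaker, best_overlap = speaker, shared
        labeled.append((text, best_speaker))
    return labeled
```

Words that fall outside every speaker turn keep `None` as their speaker, which leaves the decision about unlabeled speech to a later step.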

### Live Streaming Flow

1. **WebRTC Connection**: Browser establishes peer connection via Daily.co or Whereby
2. **Audio Capture**: Microphone audio streamed to server
3. **Buffering**: Audio buffered for processing
4. **Real-time Processing**: Segments transcribed as they arrive
5. **WebSocket Updates**: Results streamed back to client
6. **Continuous Assembly**: Full transcript built progressively
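
Steps 3 to 6 can be sketched with two generators: one that buffers incoming audio frames into fixed-size segments, and one that transcribes each segment and yields the transcript built so far. The function names, the byte-based segment size, and the injected `transcribe` callable are all assumptions made for illustration; the actual buffering strategy is not described in this overview.

```python
from typing import Callable, Iterable, Iterator


def buffer_segments(frames: Iterable[bytes], segment_bytes: int) -> Iterator[bytes]:
    """Accumulate small frames and emit segments of exactly segment_bytes bytes."""
    buf = bytearray()
    for frame in frames:
        buf.extend(frame)
        while len(buf) >= segment_bytes:
            yield bytes(buf[:segment_bytes])
            del buf[:segment_bytes]
    if buf:
        # Flush whatever is left when the stream ends.
        yield bytes(buf)


def live_transcript(
    frames: Iterable[bytes],
    segment_bytes: int,
    transcribe: Callable[[bytes], str],
) -> Iterator[str]:
    """Yield the progressively assembled transcript after each segment."""
    parts = []
    for segment in buffer_segments(frames, segment_bytes):
        parts.append(transcribe(segment))
        yield " ".join(parts)
```

Each value yielded by `live_transcript` corresponds to one update that a server could push to the client over the WebSocket connection.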

## Deployment Architecture

### Container-Based Deployment

All components are containerized for consistent deployment:

```yaml
services:
  web:         # Next.js application
  server:      # FastAPI server
  worker:      # Celery workers
  redis:       # Message broker
  postgres:    # Database
  caddy:       # Reverse proxy
```

### Networking

- **Host Network Mode**: Required for WebRTC/ICE compatibility
- **Caddy Reverse Proxy**: Handles SSL termination and routing
- **WebSocket Upgrade**: Supports real-time connections

## Scalability Considerations

### Horizontal Scaling

- **Stateless Backend**: Multiple API server instances
- **Worker Pools**: Add workers based on queue depth
- **Database Pooling**: Connection management for concurrent access
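
A simple policy for sizing worker pools from queue depth is shown below. The function name, the `tasks_per_worker` ratio, and the bounds are hypothetical values chosen for the example; they are not Reflector defaults.

```python
import math


def desired_workers(
    queue_depth: int,
    tasks_per_worker: int = 10,
    min_workers: int = 1,
    max_workers: int = 8,
) -> int:
    """Return a worker count proportional to the backlog, clamped to safe bounds."""
    needed = math.ceil(queue_depth / tasks_per_worker)
    return max(min_workers, min(max_workers, needed))
```

With these example values, a backlog of 25 tasks asks for 3 workers, an empty queue keeps 1 worker alive, and very large backlogs are capped at 8.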

### Vertical Scaling

- **GPU Workers**: Scale up for faster model inference
- **Memory Optimization**: Efficient audio buffering

## Security Architecture

### Authentication & Authorization

- **JWT Tokens**: Stateless authentication
- **Authentik Integration**: Enterprise SSO support
- **Role-Based Access**: Granular permissions
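
To show why JWT authentication is stateless, the sketch below signs and verifies an HS256 token with only the Python standard library: the server needs nothing but the secret to validate a request. This is an illustration, not Reflector's authentication code. The signing algorithm used in practice, the `role` claim, and the function names are assumptions, and production code should use a maintained JWT library rather than a hand-rolled verifier.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as the JWT format requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Restore the stripped padding before decoding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(claims: dict, secret: bytes) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    parts = [_b64url(json.dumps(p, separators=(",", ":")).encode()) for p in (header, claims)]
    signing_input = ".".join(parts)
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


def verify_token(token: str, secret: bytes, now: Optional[float] = None) -> dict:
    """Return the claims if the signature and expiry are valid, else raise ValueError."""
    header_b64, payload_b64, signature_b64 = token.split(".")  # ValueError if malformed
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    if "exp" in claims and current >= claims["exp"]:
        raise ValueError("token expired")
    return claims
```

A role check can then be a plain comparison on the verified claims, for example rejecting a request when `claims.get("role")` is not in the set of roles allowed for an endpoint.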

### Data Protection

- **Encryption in Transit**: TLS for all connections
- **Temporary Storage**: Automatic cleanup of processed files

### Privacy by Design

- **Local Processing**: Option to process entirely on-premises
- **No Training on User Data**: Models are pre-trained
- **Data Isolation**: Multi-tenant data separation

## Integration Points

### External Services

- **Modal.com**: GPU processing
- **AWS S3**: Long-term storage
- **Daily.co / Whereby**: Video conferencing rooms
- **Zulip**: Chat integration (optional)

### APIs and Webhooks

- **RESTful API**: Standard CRUD operations
- **WebSocket API**: Real-time updates
- **Webhook Notifications**: Processing completion events
- **OpenAPI Specification**: Machine-readable API definition

## Performance Optimization

### Caching Strategy

- **Redis Cache**: Frequently accessed data
- **CDN**: Static asset delivery
- **Browser Cache**: Client-side optimization
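
The cache-aside pattern behind a cache for frequently accessed data can be sketched with an in-memory dictionary standing in for Redis. `TTLCache`, `get_or_load`, and the injectable `clock` are hypothetical names used for illustration; with Redis, the expiry would typically be delegated to the server by setting a TTL on the key.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Cache-aside helper: return a fresh cached value or load and store a new one."""

    def __init__(self, ttl_s: float, clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_s = ttl_s
        self._clock = clock
        self._store: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_load(self, key: Hashable, loader: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit, still fresh
        value = loader()  # cache miss or expired: hit the slow source
        self._store[key] = (now + self.ttl_s, value)
        return value
```

Passing the clock as a parameter keeps the expiry logic easy to test without sleeping.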

### Database Optimization

- **Indexed Queries**: Fast search and retrieval
- **Connection Pooling**: Efficient resource usage
- **Query Optimization**: N+1 query prevention

### Processing Optimization

- **Batch Processing**: Efficient GPU utilization
- **Parallel Execution**: Multi-core CPU usage
- **Stream Processing**: Reduced memory footprint

## Monitoring and Observability

### Metrics Collection

- **Application Metrics**: Request rates, response times
- **System Metrics**: CPU, memory, disk usage
- **Business Metrics**: Transcription accuracy, processing times

### Logging

- **Structured Logging**: JSON format for analysis
- **Log Aggregation**: Centralized log management
- **Error Tracking**: Sentry integration
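
Structured JSON logging can be achieved with the standard `logging` module alone, as sketched below. `JsonFormatter` is a hypothetical class written for this example, and the field names in the output are assumptions rather than Reflector's actual log schema.

```python
import json
import logging

# Attribute names present on every LogRecord, used to detect fields passed via `extra=`.
_STANDARD_FIELDS = set(logging.LogRecord("", 0, "", 0, "", (), None).__dict__) | {
    "message",
    "asctime",
}


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Copy any custom fields attached through `extra=` into the payload.
        for key, value in record.__dict__.items():
            if key not in _STANDARD_FIELDS:
                payload[key] = value
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload, default=str)
```

To use it, attach the formatter to a handler with `handler.setFormatter(JsonFormatter())` and log with `logger.info("stage done", extra={"transcript_id": "t-1"})`; the extra key then appears as a top-level JSON field.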

### Health Checks

- **Liveness Probes**: Component availability
- **Readiness Probes**: Service readiness
- **Dependency Checks**: External service status
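
A readiness probe that aggregates dependency checks might look like the sketch below. The `readiness` function, the check names, and the response shape are hypothetical; the HTTP status codes follow the usual convention of 200 for ready and 503 for unavailable.

```python
from typing import Callable, Dict, Tuple


def readiness(checks: Dict[str, Callable[[], bool]]) -> Tuple[int, dict]:
    """Run every dependency check and return (HTTP status, JSON-serializable body)."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = "ok" if check() else "failing"
        except Exception as exc:
            # A check that raises counts as a failed dependency, not a crashed probe.
            results[name] = f"error: {exc}"
    healthy = all(state == "ok" for state in results.values())
    body = {"status": "ready" if healthy else "unavailable", "checks": results}
    return (200 if healthy else 503), body
```

A liveness probe, by contrast, would usually skip the dependency checks and only confirm that the process itself can respond.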