docs: docs website + installation (#778)

* feat: WIP doc (vibe started and iterated)
* install from scratch docs
* caddyfile.example
* gitignore
* authentik script
* authentik script
* authentik script
* llm doc
* authentik ongoing
* more daily setup logs
* doc website
* gpu self hosted setup guide (no-mistakes)
* doc review round
* doc review round
* doc review round
* update doc site sidebars
* feat(docs): add mermaid diagram support
* docs polishing
* live pipeline doc
* move pipeline dev docs to dev docs location
* doc pr review iteration
* dockerfile healthcheck
* docs/pr-comments
* remove jwt comment
* llm suggestion
* pr comments
* pr comments
* document auto migrations
* cleanup docs

---------

Co-authored-by: Mathieu Virbel <mat@meltingrocks.com>
Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2026-02-04 18:06:48 +00:00 · 2026-01-06 17:25:02 -05:00
parent e644d6497b
commit 407c15299f
61 changed files with 32653 additions and 26 deletions
--- a/docs/docs/concepts/modes.md
+++ b/docs/docs/concepts/modes.md
@@ -0,0 +1,115 @@
+---
+sidebar_position: 2
+title: Operating Modes
+---
+
+# Operating Modes
+
+Reflector operates in two distinct modes to accommodate different use cases and security requirements.
+
+## Public Mode
+
+Public mode provides immediate access to core transcription features without requiring authentication.
+
+### Features Available
+- **File Upload**: Process audio files
+- **Live Transcription**: Stream audio from microphone
+- **Basic Processing**: Transcription and diarization
+- **Temporary Storage**: Temporary data retention (configurable)
+
+### Limitations
+- No persistent storage
+- No meeting rooms
+- Limited to single-user sessions
+- No team collaboration features
+
+### Use Cases
+- Quick transcription needs
+- Testing and evaluation
+- Individual users
+- Public demonstrations
+
+## Private Mode
+
+Private mode unlocks the full potential of Reflector with authentication and persistent storage.
+
+### Additional Features
+- **Virtual Meeting Rooms**: Whereby and Daily.co integration
+- **Team Collaboration**: Share transcripts with team
+- **Persistent Storage**: Long-term transcript archive
+- **Meeting History**: Search and browse past transcripts
+- **Custom Integration**: Webhooks and API access
+- **User Management**: Role-based access control
+
+### Authentication Options
+
+#### Authentik Integration
+Enterprise-grade SSO with support for:
+- SAML 2.0
+- OAuth 2.0 / OIDC
+- LDAP / Active Directory
+- Multi-factor authentication
+
+### Room Management
+
+Virtual rooms provide dedicated spaces for meetings:
+- **Persistent URLs**: Same link for recurring meetings
+- **Access Control**: Invite-only or open rooms
+- **Recording Consent**: Automatic consent management
+- **Custom Settings**: Per-room configuration
+
+## Mode Selection
+
+The mode is determined by your deployment configuration:
+
+```env
+# Public Mode (no authentication)
+AUTH_BACKEND=none
+
+# Private Mode (with authentication)
+AUTH_BACKEND=jwt
+```
+
+See [Authentication Setup](../installation/auth-setup) for configuring JWT authentication.
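
As a rough illustration of the selection above, the sketch below maps `AUTH_BACKEND` to a mode name. Only the values `none` and `jwt` come from this page; the `operating_mode` helper and the fallback when the variable is unset are assumptions made for illustration, not Reflector's actual code.

```python
def operating_mode(env: dict[str, str]) -> str:
    """Return 'public' or 'private' for a given environment mapping."""
    # Falling back to "none" when the variable is missing is an assumption.
    backend = env.get("AUTH_BACKEND", "none").strip().lower()
    if backend == "none":
        return "public"   # no authentication
    if backend == "jwt":
        return "private"  # JWT-based authentication
    raise ValueError(f"unsupported AUTH_BACKEND: {backend!r}")
```

Rejecting unknown values instead of silently falling back keeps a typo in `.env` from exposing a deployment as public by accident.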
+
+## Feature Comparison
+
+| Feature | Public Mode | Private Mode |
+|---------|------------|--------------|
+| File Upload | ✅ | ✅ |
+| Live Transcription | ✅ | ✅ |
+| Speaker Diarization | ✅ | ✅ |
+| Summarization | ✅ | ✅ |
+| Meeting Rooms | ❌ | ✅ |
+| Persistent Storage | ❌ | ✅ |
+| Team Collaboration | ❌ | ✅ |
+| API Access | Limited | Full |
+| User Management | ❌ | ✅ |
+| Custom Branding | ❌ | ✅ |
+| Meeting History | ❌ | ✅ |
+| Webhooks | ❌ | ✅ |
+
+## Security Considerations
+
+### Public Mode Security
+- File size restrictions
+- Automatic cleanup of old data
+
+### Private Mode Security
+- Access control lists
+- Data retention policies
+
+## Choosing the Right Mode
+
+### Choose Public Mode if:
+- You need quick, one-time transcriptions
+- You're evaluating Reflector
+- You don't need persistent storage
+- You're processing non-sensitive content
+
+### Choose Private Mode if:
+- You need team collaboration
+- You require persistent storage
+- You're processing sensitive content
+- You need meeting room functionality
+- You want searchable meeting history
--- a/docs/docs/concepts/overview.md
+++ b/docs/docs/concepts/overview.md
@@ -0,0 +1,201 @@
+---
+sidebar_position: 1
+title: Architecture Overview
+---
+
+# Architecture Overview
+
+Reflector is built as a modern, scalable, microservices-based application designed to handle audio processing workloads efficiently while maintaining data privacy and control.
+
+## System Components
+
+### Frontend Application
+
+The user interface is built with **Next.js 15** using the App Router pattern, providing:
+
+- Server-side rendering for optimal performance
+- Real-time WebSocket connections for live transcription
+- WebRTC support for audio streaming and live meetings (via Daily.co or Whereby)
+- Responsive design with Chakra UI components
+
+### Backend API Server
+
+The core API is powered by **FastAPI**, a modern Python framework that provides:
+
+- High-performance async request handling
+- Automatic OpenAPI documentation generation
+- Type safety with Pydantic models
+- WebSocket support for real-time updates
+
+### Processing Pipeline
+
+Audio processing is handled through a modular pipeline architecture:
+
+```
+Audio Input → Chunking → Transcription → Diarization → Post-Processing → Storage
+```
+
+Each step can run independently and in parallel, allowing for:
+- Scalable processing of large files
+- Real-time streaming capabilities
+- Fault tolerance and retry mechanisms
+
+### Worker Architecture
+
+Background tasks are managed by **Celery** workers with **Redis** as the message broker:
+
+- Distributed task processing
+- Priority queues for time-sensitive operations
+- Automatic retry on failure
+- Progress tracking and notifications
+
+### GPU Acceleration
+
+ML models run on GPU-accelerated infrastructure:
+
+- **Modal.com** for serverless GPU processing
+- **Self-hosted GPU** with Docker deployment
+- Automatic scaling based on demand
+- Cost-effective pay-per-use model
+
+## Data Flow
+
+### Daily.co Meeting Recording Flow
+
+1. **Recording**: Daily.co captures separate audio tracks per participant
+2. **Webhook**: Daily.co notifies Reflector when recording is ready
+3. **Track Download**: Individual participant tracks fetched from S3
+4. **Padding**: Tracks padded with silence based on join time for synchronization
+5. **Transcription**: Each track transcribed independently (speaker = track index)
+6. **Merge**: Transcriptions sorted by timestamp and combined
+7. **Mixdown**: Tracks mixed to single MP3 for playback
+8. **Post-Processing**: Topics, title, and summaries generated via LLM
+9. **Delivery**: Results stored and user notified via WebSocket
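
Steps 4 to 6 of this flow can be sketched as follows. This is a minimal illustration that works on timestamps only: it assumes each track's segments are timed relative to the start of that track, and that shifting them by the participant's join offset is equivalent to padding the audio with leading silence. The `Segment` class and `merge_tracks` function are hypothetical names, not Reflector's API.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # seconds, relative to the start of the track
    end: float
    text: str
    speaker: int = -1  # filled in during the merge


def merge_tracks(tracks: list[tuple[float, list[Segment]]]) -> list[Segment]:
    """tracks[i] is (join_offset_seconds, segments) for participant i."""
    merged = []
    for speaker, (join_offset, segments) in enumerate(tracks):
        for seg in segments:
            # Shift onto the shared meeting timeline (step 4) and label the
            # segment with the track index as its speaker (step 5).
            merged.append(
                Segment(seg.start + join_offset, seg.end + join_offset, seg.text, speaker)
            )
    # Step 6: one transcript ordered by start time.
    return sorted(merged, key=lambda s: s.start)
```

Because every track belongs to exactly one participant, no ML diarization is needed here; the speaker label falls out of the track index.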
+
+### File Upload Flow
+
+1. **Upload**: User uploads audio file through web interface
+2. **Storage**: File stored temporarily
+3. **Transcription**: Full file transcribed via Whisper
+4. **Diarization**: ML-based speaker identification (Pyannote)
+5. **Post-Processing**: Topics, title, summaries
+6. **Delivery**: Results stored and user notified
+
+### Live Streaming Flow
+
+1. **WebRTC Connection**: Browser establishes peer connection via Daily.co or Whereby
+2. **Audio Capture**: Microphone audio streamed to server
+3. **Buffering**: Audio buffered for processing
+4. **Real-time Processing**: Segments transcribed as they arrive
+5. **WebSocket Updates**: Results streamed back to client
+6. **Continuous Assembly**: Full transcript built progressively
+
+## Deployment Architecture
+
+### Container-Based Deployment
+
+All components are containerized for consistent deployment:
+
+```yaml
+services:
+  web:         # Next.js application
+  server:      # FastAPI server
+  worker:      # Celery workers
+  redis:       # Message broker
+  postgres:    # Database
+  caddy:       # Reverse proxy
+```
+
+### Networking
+
+- **Host Network Mode**: Required for WebRTC/ICE compatibility
+- **Caddy Reverse Proxy**: Handles SSL termination and routing
+- **WebSocket Upgrade**: Supports real-time connections
+
+## Scalability Considerations
+
+### Horizontal Scaling
+
+- **Stateless Backend**: Multiple API server instances
+- **Worker Pools**: Add workers based on queue depth
+- **Database Pooling**: Connection management for concurrent access
+
+### Vertical Scaling
+
+- **GPU Workers**: Scale up for faster model inference
+- **Memory Optimization**: Efficient audio buffering
+
+## Security Architecture
+
+### Authentication & Authorization
+
+- **JWT Tokens**: Stateless authentication
+- **Authentik Integration**: Enterprise SSO support
+- **Role-Based Access**: Granular permissions
+
+### Data Protection
+
+- **Encryption in Transit**: TLS for all connections
+- **Temporary Storage**: Automatic cleanup of processed files
+
+### Privacy by Design
+
+- **Local Processing**: Option to process entirely on-premises
+- **No Training on User Data**: Models are pre-trained
+- **Data Isolation**: Multi-tenant data separation
+
+## Integration Points
+
+### External Services
+
+- **Modal.com**: GPU processing
+- **AWS S3**: Long-term storage
+- **Whereby**: Video conferencing rooms
+- **Zulip**: Chat integration (optional)
+
+### APIs and Webhooks
+
+- **RESTful API**: Standard CRUD operations
+- **WebSocket API**: Real-time updates
+- **Webhook Notifications**: Processing completion events
+- **OpenAPI Specification**: Machine-readable API definition
+
+## Performance Optimization
+
+### Caching Strategy
+
+- **Redis Cache**: Frequently accessed data
+- **CDN**: Static asset delivery
+- **Browser Cache**: Client-side optimization
+
+### Database Optimization
+
+- **Indexed Queries**: Fast search and retrieval
+- **Connection Pooling**: Efficient resource usage
+- **Query Optimization**: N+1 query prevention
+
+### Processing Optimization
+
+- **Batch Processing**: Efficient GPU utilization
+- **Parallel Execution**: Multi-core CPU usage
+- **Stream Processing**: Reduced memory footprint
+
+## Monitoring and Observability
+
+### Metrics Collection
+
+- **Application Metrics**: Request rates, response times
+- **System Metrics**: CPU, memory, disk usage
+- **Business Metrics**: Transcription accuracy, processing times
+
+### Logging
+
+- **Structured Logging**: JSON format for analysis
+- **Log Aggregation**: Centralized log management
+- **Error Tracking**: Sentry integration
+
+### Health Checks
+
+- **Liveness Probes**: Component availability
+- **Readiness Probes**: Service readiness
+- **Dependency Checks**: External service status
--- a/docs/docs/concepts/pipeline.md
+++ b/docs/docs/concepts/pipeline.md
@@ -0,0 +1,183 @@
+---
+sidebar_position: 4
+title: Processing Pipeline
+---
+
+# Processing Pipeline
+
+Reflector uses a modular pipeline architecture to process audio efficiently and accurately.
+
+## Pipeline Overview
+
+The processing pipeline consists of modular components that can be combined and configured based on your needs:
+
+```mermaid
+graph LR
+    A[Audio Input] --> B[Pre-processing]
+    B --> C[Chunking]
+    C --> D[Transcription]
+    D --> E[Diarization]
+    E --> F[Alignment]
+    F --> G[Post-processing]
+    G --> H[Output]
+```
+
+## Pipeline Components
+
+### Audio Input
+
+Accepts various input sources:
+- **File Upload**: MP3, WAV, M4A, WebM, MP4
+- **WebRTC Stream**: Live browser audio
+- **Recording Integration**: Daily.co and Whereby recordings
+- **API Upload**: Direct API submission
+
+### Pre-processing
+
+Prepares audio for optimal processing:
+- **Format Conversion**: Convert to 16kHz mono WAV
+- **Noise Reduction**: Optional background noise removal
+- **Validation**: Check duration and quality
+
+### Chunking
+
+Splits audio for parallel processing:
+- **Configurable Size**: Audio split into processable segments
+- **Silence Detection**: Optional splitting at natural pauses
+- **Metadata**: Track chunk positions
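
A minimal sketch of fixed-size chunking with position metadata is shown below. The 30-second default and the `chunk_spans` name are assumptions for illustration; silence-based splitting is not covered.

```python
def chunk_spans(duration: float, chunk_size: float = 30.0) -> list[tuple[int, float, float]]:
    """Return (index, start, end) spans, in seconds, covering [0, duration)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    spans = []
    start = 0.0
    index = 0
    while start < duration:
        end = min(start + chunk_size, duration)  # the last chunk may be shorter
        spans.append((index, start, end))
        start = end
        index += 1
    return spans
```

Keeping the index and absolute start time with each chunk is what lets later stages place per-chunk results back on a single timeline.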
+
+### Transcription
+
+Converts speech to text:
+- **Model Selection**: Whisper or Parakeet
+- **Language Detection**: Automatic or specified
+- **Timestamp Generation**: Word-level timing
+- **Confidence Scores**: Quality indicators
+
+### Diarization
+
+Identifies different speakers:
+- **Voice Activity Detection**: Find speech segments
+- **Speaker Embedding**: Extract voice characteristics
+- **Clustering**: Group similar voices
+- **Label Assignment**: Assign speaker IDs
+
+### Alignment
+
+Merges all processing results:
+- **Chunk Assembly**: Combine transcription chunks
+- **Speaker Mapping**: Align speakers with text
+- **Overlap Resolution**: Handle chunk boundaries
+- **Timeline Creation**: Build unified timeline
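
One common way to do speaker mapping is to give each word the speaker whose diarization turn overlaps it the most. The sketch below shows that technique under stated assumptions: the tuple layouts and the `assign_speakers` name are illustrative, and this page does not confirm that Reflector uses this exact rule.

```python
def assign_speakers(words, turns):
    """Label each word with the speaker whose turn overlaps it most.

    words: list of (start, end, text)
    turns: list of (start, end, speaker)
    Returns a list of (start, end, text, speaker); speaker is None when no
    turn overlaps the word.
    """
    labeled = []
    for w_start, w_end, text in words:
        best_speaker, best_overlap = None, 0.0
        for t_start, t_end, speaker in turns:
            # Length of the intersection of the two intervals (negative if disjoint).
            overlap = min(w_end, t_end) - max(w_start, t_start)
            if overlap > best_overlap:
                best_speaker, best_overlap = speaker, overlap
        labeled.append((w_start, w_end, text, best_speaker))
    return labeled
```

This quadratic loop is fine for a sketch; with long recordings, sorting both lists and sweeping through them once would scale better.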
+
+### Post-processing
+
+Enhances the final output:
+- **Formatting**: Apply punctuation and capitalization
+- **Summarization**: Generate concise summaries
+- **Topic Extraction**: Identify key themes
+- **Action Items**: Extract tasks and decisions
+
+## Processing Modes
+
+### Batch Processing
+
+For uploaded files:
+- Optimized for throughput
+- Parallel chunk processing
+- Higher accuracy models
+- Complete file analysis
+
+### Stream Processing
+
+For live audio:
+- Optimized for latency
+- Sequential processing
+- Real-time feedback
+- Progressive results
+
+### Hybrid Processing
+
+For meetings:
+- Stream during meeting
+- Batch after completion
+- Best of both modes
+- Maximum accuracy
+
+## Pipeline Orchestration
+
+### Error Handling
+
+Error recovery:
+- **Automatic Retry**: Failed tasks retry up to 3 times
+- **Partial Recovery**: Continue with successful chunks
+- **Fallback Models**: Use alternative models on failure
+- **Error Reporting**: Detailed error messages
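
The retry behaviour can be pictured with a small plain-Python helper. This sketch reads "retry up to 3 times" as one initial attempt plus at most three retries, which is an assumption; `run_with_retries` and its backoff delay are hypothetical and independent of whatever task runner performs the retries in a real deployment.

```python
import time


def run_with_retries(task, max_retries=3, base_delay=0.0):
    """Call task(); if it raises, retry up to max_retries more times."""
    attempt = 0
    while True:
        try:
            return task()
        except Exception:
            if attempt >= max_retries:
                raise  # retries exhausted: surface the last error
            attempt += 1
            # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
            time.sleep(base_delay * 2 ** (attempt - 1))
```

Re-raising the final exception, rather than returning a sentinel, keeps the detailed error available for reporting.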
+
+### Progress Tracking
+
+Real-time progress updates:
+- **Chunk Progress**: Track individual chunk processing
+- **Overall Progress**: Percentage completion
+- **ETA Calculation**: Estimated completion time
+- **WebSocket Updates**: Live progress to clients
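
Overall progress and a naive ETA can be derived from chunk counts alone. The sketch below assumes every chunk takes roughly the same time to process; the `progress` function is a hypothetical helper, not part of Reflector.

```python
def progress(done_chunks: int, total_chunks: int, elapsed_seconds: float):
    """Return (percent_complete, eta_seconds); eta is None until a chunk finishes."""
    if total_chunks <= 0:
        raise ValueError("total_chunks must be positive")
    percent = 100.0 * done_chunks / total_chunks
    if done_chunks == 0:
        return percent, None  # no throughput measurement yet
    seconds_per_chunk = elapsed_seconds / done_chunks
    return percent, seconds_per_chunk * (total_chunks - done_chunks)
```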
+
+## Optimization Strategies
+
+### GPU Utilization
+
+Maximize GPU efficiency:
+- **Batch Processing**: Process multiple chunks together
+- **Model Caching**: Keep models loaded in memory
+- **Dynamic Batching**: Adjust batch size based on GPU memory
+- **Multi-GPU Support**: Distribute across available GPUs
+
+### Memory Management
+
+Efficient memory usage:
+- **Streaming Processing**: Process large files in chunks
+- **Garbage Collection**: Clean up after each chunk
+- **Memory Limits**: Prevent out-of-memory errors
+- **Disk Caching**: Use disk for large intermediate results
+
+### Network Optimization
+
+Minimize network overhead:
+- **Compression**: Compress audio before transfer
+- **CDN Integration**: Use CDN for static assets
+- **Connection Pooling**: Reuse network connections
+- **Parallel Uploads**: Multiple concurrent uploads
+
+## Quality Assurance
+
+### Accuracy Metrics
+
+Monitor processing quality:
+- **Word Error Rate (WER)**: Transcription accuracy
+- **Diarization Error Rate (DER)**: Speaker identification accuracy
+- **Summary Coherence**: Summary quality metrics
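
Word Error Rate is the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the system output, divided by the number of reference words. The function below is a self-contained sketch of that standard definition; it splits on whitespace and does no text normalization, which real evaluations usually add.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

Note that WER can exceed 1.0 when the hypothesis contains many inserted words.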
+
+### Validation Steps
+
+Ensure output quality:
+- **Confidence Thresholds**: Filter low-confidence segments
+- **Consistency Checks**: Verify timeline consistency
+- **Language Validation**: Ensure correct language detection
+- **Format Validation**: Check output format compliance
+
+## Advanced Features
+
+### Custom Models
+
+Use your own models:
+- **Fine-tuned Whisper**: Domain-specific models
+- **Custom Diarization**: Trained on your speakers
+- **Specialized Post-processing**: Industry-specific formatting
+
+### Pipeline Extensions
+
+Add custom processing steps:
+- **Sentiment Analysis**: Analyze emotional tone
+- **Entity Extraction**: Identify people, places, organizations
+- **Custom Metrics**: Calculate domain-specific metrics
+- **Integration Hooks**: Call external services
--- a/docs/docs/installation/auth-setup.md
+++ b/docs/docs/installation/auth-setup.md
@@ -0,0 +1,285 @@
+---
+sidebar_position: 5
+title: Authentication Setup
+---
+
+# Authentication Setup
+
+This page covers authentication setup in detail. For the complete deployment guide, see [Deployment Guide](./overview).
+
+Reflector uses [Authentik](https://goauthentik.io/) for OAuth/OIDC authentication. This guide walks you through setting up Authentik and connecting it to Reflector.
+
+For simplicity, this guide runs Authentik on the same server as Reflector. You can point Reflector at an existing Authentik instance instead.
+
+## Overview
+
+Reflector's authentication flow:
+1. User clicks "Sign In" on frontend
+2. Frontend redirects to Authentik login page
+3. User authenticates with Authentik
+4. Authentik redirects back with OAuth tokens
+5. Frontend stores the tokens; the backend verifies the JWT signature
+
+## Option 1: Self-Hosted Authentik (Same Server)
+
+This setup runs Authentik on the same server as Reflector, with Caddy proxying to both.
+
+### Deploy Authentik
+
+```bash
+# Create directory for Authentik
+mkdir -p ~/authentik && cd ~/authentik
+
+# Download docker-compose file
+curl -O https://goauthentik.io/docker-compose.yml
+
+# Generate secrets and bootstrap credentials
+# (EOF is left unquoted so the $(openssl ...) substitutions run when the file is written)
+cat > .env << EOF
+PG_PASS=$(openssl rand -base64 36 | tr -d '\n')
+AUTHENTIK_SECRET_KEY=$(openssl rand -base64 60 | tr -d '\n')
+# Privacy-focused choice for self-hosted deployments
+AUTHENTIK_ERROR_REPORTING__ENABLED=false
+AUTHENTIK_BOOTSTRAP_PASSWORD=YourSecurePassword123
+AUTHENTIK_BOOTSTRAP_EMAIL=admin@example.com
+EOF
+
+# Start Authentik
+sudo docker compose up -d
+```
+
+Authentik takes ~2 minutes to run migrations and apply blueprints on first start.
+
+### Connect Authentik to Reflector's Network
+
+If Authentik runs in a separate Docker Compose project, connect it to Reflector's network so Caddy can proxy to it:
+
+```bash
+# Wait for Authentik to be healthy
+# Connect Authentik server to Reflector's network
+sudo docker network connect reflector_default authentik-server-1
+```
+
+**Important:** This step must be repeated if you restart Authentik with `docker compose down`. Add it to your deployment scripts or use `docker compose up -d` (which preserves containers) instead of down/up.
+
+### Add Authentik to Caddy
+
+Uncomment the Authentik section in your `Caddyfile` and set your domain:
+
+```bash
+nano Caddyfile
+```
+
+Uncomment and edit:
+```
+{$AUTHENTIK_DOMAIN:authentik.example.com} {
+    reverse_proxy authentik-server-1:9000
+}
+```
+
+Reload Caddy:
+```bash
+docker compose -f docker-compose.prod.yml exec caddy caddy reload --config /etc/caddy/Caddyfile
+```
+
+### Create OAuth2 Provider in Authentik
+
+**Option A: Automated Setup (Recommended)**
+
+**Location: Reflector server**
+
+Run the setup script from the Reflector repository:
+
+```bash
+ssh user@your-server-ip
+cd ~/reflector
+./scripts/setup-authentik-oauth.sh https://authentik.example.com YourSecurePassword123 https://app.example.com
+```
+
+**Important:** The script must be run from the `~/reflector` directory on your server, as it creates files using relative paths.
+
+The script will output the configuration values to add to your `.env` files. Skip to "Update docker-compose.prod.yml".
+
+**Option B: Manual Setup**
+
+1. **Login to Authentik Admin** at `https://authentik.example.com/`
+   - Username: `akadmin`
+   - Password: The `AUTHENTIK_BOOTSTRAP_PASSWORD` you set in .env
+
+2. **Create OAuth2 Provider:**
+   - Go to **Applications > Providers > Create**
+   - Select **OAuth2/OpenID Provider**
+   - Configure:
+     - **Name**: `Reflector`
+     - **Authorization flow**: `default-provider-authorization-implicit-consent`
+     - **Client type**: `Confidential`
+     - **Client ID**: Note this value (auto-generated)
+     - **Client Secret**: Note this value (auto-generated)
+     - **Redirect URIs**: Add entry with:
+       ```
+       https://app.example.com/api/auth/callback/authentik
+       ```
+   - Scroll down to **Advanced protocol settings**
+   - In **Scopes**, add these three mappings:
+     - `authentik default OAuth Mapping: OpenID 'email'`
+     - `authentik default OAuth Mapping: OpenID 'openid'`
+     - `authentik default OAuth Mapping: OpenID 'profile'`
+   - Click **Finish**
+
+3. **Create Application:**
+   - Go to **Applications > Applications > Create**
+   - Configure:
+     - **Name**: `Reflector`
+     - **Slug**: `reflector` (auto-filled)
+     - **Provider**: Select the `Reflector` provider you just created
+   - Click **Create**
+
+### Get Public Key for JWT Verification
+
+**Location: Reflector server**
+
+Extract the public key from Authentik's JWKS endpoint:
+
+```bash
+mkdir -p ~/reflector/server/reflector/auth/jwt/keys
+curl -s https://authentik.example.com/application/o/reflector/jwks/ | \
+  jq -r '.keys[0].x5c[0]' | base64 -d | openssl x509 -inform DER -pubkey -noout \
+  > ~/reflector/server/reflector/auth/jwt/keys/authentik_public.pem
+```
+
+### Update docker-compose.prod.yml
+
+**Location: Reflector server**
+
+**Note:** This step is already done in the current `docker-compose.prod.yml`. Verify the volume mounts exist:
+
+```yaml
+server:
+  image: monadicalsas/reflector-backend:latest
+  # ... other config ...
+  volumes:
+    - server_data:/app/data
+    - ./server/reflector/auth/jwt/keys:/app/reflector/auth/jwt/keys:ro
+
+worker:
+  image: monadicalsas/reflector-backend:latest
+  # ... other config ...
+  volumes:
+    - server_data:/app/data
+    - ./server/reflector/auth/jwt/keys:/app/reflector/auth/jwt/keys:ro
+```
+
+### Configure Reflector Backend
+
+**Location: Reflector server**
+
+Update `server/.env`:
+```env
+# Authentication
+AUTH_BACKEND=jwt
+AUTH_JWT_PUBLIC_KEY=authentik_public.pem
+AUTH_JWT_AUDIENCE=<your-client-id>
+CORS_ALLOW_CREDENTIALS=true
+```
+
+Replace `<your-client-id>` with the Client ID from previous steps.
+
+### Configure Reflector Frontend
+
+**Location: Reflector server**
+
+Update `www/.env`:
+```env
+# Authentication
+FEATURE_REQUIRE_LOGIN=true
+
+# Authentik OAuth
+AUTHENTIK_ISSUER=https://authentik.example.com/application/o/reflector
+AUTHENTIK_REFRESH_TOKEN_URL=https://authentik.example.com/application/o/token/
+AUTHENTIK_CLIENT_ID=<your-client-id>
+AUTHENTIK_CLIENT_SECRET=<your-client-secret>
+
+# NextAuth
+NEXTAUTH_SECRET=<generate-with-openssl-rand-hex-32>
+```
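
The `NEXTAUTH_SECRET` placeholder points at `openssl rand -hex 32`. If OpenSSL is not at hand, Python's standard library produces an equivalent value (32 random bytes rendered as 64 hex characters). The function name below is made up for this sketch.

```python
import secrets


def generate_nextauth_secret() -> str:
    """Return 32 cryptographically random bytes as a 64-character hex string."""
    return secrets.token_hex(32)


if __name__ == "__main__":
    print(f"NEXTAUTH_SECRET={generate_nextauth_secret()}")
```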
+
+### Restart Services
+
+**Location: Reflector server**
+
+```bash
+cd ~/reflector
+sudo docker compose -f docker-compose.prod.yml up -d --force-recreate server worker web
+```
+
+### Verify Authentication
+
+1. Visit `https://app.example.com`
+2. Click "Log in" or navigate to `/api/auth/signin`
+3. Click "Sign in with Authentik"
+4. Login with your Authentik credentials
+5. You should be redirected back and see "Log out" in the header
+
+## Option 2: Disable Authentication
+
+For testing or internal deployments where authentication isn't needed:
+
+**Backend `server/.env`:**
+```env
+AUTH_BACKEND=none
+```
+
+**Frontend `www/.env`:**
+```env
+FEATURE_REQUIRE_LOGIN=false
+```
+
+**Note:** The pre-built Docker images have `FEATURE_REQUIRE_LOGIN=true` baked in. To disable auth, you'll need to rebuild the frontend image with the env var set at build time, or set up Authentik.
+
+## Troubleshooting
+
+### "Invalid redirect URI" error
+- Verify the redirect URI in Authentik matches exactly:
+  ```
+  https://app.example.com/api/auth/callback/authentik
+  ```
+- Check for trailing slashes - they must match exactly
+
+### "Invalid audience" JWT error
+- Ensure `AUTH_JWT_AUDIENCE` in `server/.env` matches the Client ID from Authentik
+- The audience value is the OAuth Client ID, not the issuer URL
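
When chasing an audience mismatch, it can help to look at the `aud` claim of the token itself. The helper below is a debugging sketch only: it decodes the payload **without verifying the signature**, so it must never be used to make an authorization decision. The function names are hypothetical.

```python
import base64
import json


def unverified_claims(token: str) -> dict:
    """Decode a JWT payload WITHOUT verifying the signature (debugging only)."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64url padding
    return json.loads(base64.urlsafe_b64decode(payload))


def audience_matches(token: str, expected_audience: str) -> bool:
    """Check whether the token's 'aud' claim contains expected_audience."""
    aud = unverified_claims(token).get("aud")
    # RFC 7519 allows "aud" to be a single string or a list of strings.
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    return expected_audience in audiences
```

Compare the decoded `aud` value with `AUTH_JWT_AUDIENCE` in `server/.env`; both should equal the Client ID.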
+
+### "JWT verification failed" error
+- Verify the public key file is mounted in the container
+- Check `AUTH_JWT_PUBLIC_KEY` points to the correct filename
+- Ensure the key was extracted from the correct provider's JWKS endpoint
+
+### Caddy returns 503 for Authentik
+- Verify Authentik container is connected to Reflector's network:
+  ```bash
+  sudo docker network connect reflector_default authentik-server-1
+  ```
+- Check Authentik is healthy: `cd ~/authentik && sudo docker compose ps`
+
+### Users can't access protected pages
+- Verify `FEATURE_REQUIRE_LOGIN=true` in frontend
+- Check `AUTH_BACKEND=jwt` in backend
+- Verify CORS settings allow credentials
+
+### Token refresh errors
+- Ensure Redis is running (frontend uses Redis for token caching)
+- Verify `KV_URL` is set correctly in frontend env
+- Check `AUTHENTIK_REFRESH_TOKEN_URL` is correct
+
+## API Key Authentication
+
+For programmatic access (scripts, integrations), users can generate API keys:
+
+1. Login to Reflector
+2. Go to Settings > API Keys
+3. Click "Generate New Key"
+4. Use the key in requests:
+   ```bash
+   curl -H "X-API-Key: your-api-key" https://api.example.com/v1/transcripts
+   ```
+
+API keys are stored hashed and can be revoked at any time.
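
To illustrate what "stored hashed" means in general, the sketch below generates a key, keeps only its SHA-256 digest, and verifies a presented key against that digest. The choice of SHA-256 and both function names are assumptions; this page does not state which scheme Reflector actually uses.

```python
import hashlib
import hmac
import secrets


def generate_api_key() -> tuple[str, str]:
    """Return (plaintext_key, stored_hash); only the hash would be persisted."""
    key = secrets.token_urlsafe(32)
    return key, hashlib.sha256(key.encode()).hexdigest()


def verify_api_key(presented: str, stored_hash: str) -> bool:
    """Hash the presented key and compare it to the stored digest."""
    candidate = hashlib.sha256(presented.encode()).hexdigest()
    return hmac.compare_digest(candidate, stored_hash)  # constant-time comparison
```

With this design the plaintext key is shown to the user once and cannot be recovered from the database afterwards; revoking a key amounts to deleting its stored hash.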
--- a/docs/docs/installation/daily-setup.md
+++ b/docs/docs/installation/daily-setup.md
@@ -0,0 +1,165 @@
+---
+sidebar_position: 6
+title: Daily.co Setup
+---
+
+# Daily.co Setup
+
+This page covers Daily.co video platform setup for live meeting rooms. For the complete deployment guide, see [Deployment Guide](./overview).
+
+Daily.co enables live video meetings with automatic recording and transcription.
+
+## What You'll Set Up
+
+```
+User joins meeting → Daily.co video room → Recording to S3 → [Webhook] → Reflector transcribes
+```
+
+## Prerequisites
+
+- [ ] **Daily.co account** - Free tier at https://dashboard.daily.co
+- [ ] **AWS account** - For S3 storage
+- [ ] **Reflector deployed** - Complete steps from [Deployment Guide](./overview)
+
+---
+
+## Create Daily.co Account
+
+1. Visit https://dashboard.daily.co and sign up
+2. Verify your email
+3. Note your subdomain (e.g., `yourname.daily.co` → subdomain is `yourname`)
+
+---
+
+## Get Daily.co API Key
+
+1. In Daily.co dashboard, go to **Developers**
+2. Click **API Keys**
+3. Click **Create API Key**
+4. Copy the key (a long random string)
+
+Save this for later.
+
+---
+
+## Create AWS S3 Bucket
+
+Daily.co needs somewhere to store recordings before Reflector processes them.
+
+```bash
+# Choose a unique bucket name
+BUCKET_NAME="reflector-dailyco-yourname" # -yourname is not a requirement, you can name the bucket as you wish
+AWS_REGION="us-east-1"
+
+# Create bucket
+aws s3 mb s3://$BUCKET_NAME --region $AWS_REGION
+
+# Enable versioning (required)
+aws s3api put-bucket-versioning \
+  --bucket $BUCKET_NAME \
+  --versioning-configuration Status=Enabled
+```
+
+---
+
+## Create IAM Role for Daily.co
+
+Daily.co needs permission to write recordings to your S3 bucket.
+
+Follow Daily.co's guide: https://docs.daily.co/guides/products/live-streaming-recording/storing-recordings-in-a-custom-s3-bucket
+
+In short, you create an IAM role and grant it access to your S3 bucket.
+
+Save the role ARN; you'll need it in the next step. It looks like: `arn:aws:iam::123456789012:role/DailyCo`
+
+No additional configuration is needed in the Daily.co dashboard: Reflector tells Daily.co where to save the recordings.
+
+---
+
+## Configure Reflector
+
+**Location: Reflector server**
+
+Add to `server/.env`:
+
+```env
+# Daily.co Configuration
+DEFAULT_VIDEO_PLATFORM=daily
+DAILY_API_KEY=<your-api-key-from-daily-setup>
+DAILY_SUBDOMAIN=<your-subdomain-from-daily-setup>
+
+# S3 Storage for Daily.co recordings
+DAILYCO_STORAGE_AWS_BUCKET_NAME=<your-bucket-from-daily-setup>
+DAILYCO_STORAGE_AWS_REGION=us-east-1
+DAILYCO_STORAGE_AWS_ROLE_ARN=<your-role-arn-from-daily-setup>
+
+# Transcript storage (should already be configured from main setup)
+# TRANSCRIPT_STORAGE_BACKEND=aws
+# TRANSCRIPT_STORAGE_AWS_ACCESS_KEY_ID=<your-key>
+# TRANSCRIPT_STORAGE_AWS_SECRET_ACCESS_KEY=<your-secret>
+# TRANSCRIPT_STORAGE_AWS_BUCKET_NAME=<your-bucket-name>
+# TRANSCRIPT_STORAGE_AWS_REGION=<your-bucket-region>
+```
+
+---
+
+## Restart Services
+
+After changing `.env` files, reload with `up -d`:
+
+```bash
+sudo docker compose -f docker-compose.prod.yml up -d server worker
+```
+
+**Note**: `docker compose up -d` detects env changes and recreates containers automatically.
+
+---
+
+## Test Live Room
+
+1. Visit your Reflector frontend: `https://app.example.com`
+2. Go to **Rooms**
+3. Click **Create Room**
+4. Select **Daily** as the platform
+5. Allow camera/microphone access
+6. You should see Daily.co video interface
+7. Speak for 10-20 seconds
+8. Leave the meeting
+9. Recording should appear in **Transcripts** within 5 minutes (if webhooks aren't set up yet, see [Webhook Configuration](#webhook-configuration-optional) below)
+
+---
+
+## Troubleshooting
+
+### Recording doesn't appear in S3
+
+1. Check Daily.co dashboard → **Logs** for errors
+2. Verify IAM role trust policy has correct Daily.co account ID and your Daily.co subdomain
+3. Verify that the bucket has versioning enabled
+
+### Recording in S3 but not transcribed
+
+1. Check webhook is configured (Reflector should auto-create it)
+2. Check worker logs:
+   ```bash
+   docker compose -f docker-compose.prod.yml logs worker --tail 50
+   ```
+3. Verify `DAILYCO_STORAGE_AWS_*` vars in `server/.env`
+
+### "Access Denied" when Daily.co tries to write to S3
+
+1. Double-check the IAM role ARN set in `DAILYCO_STORAGE_AWS_ROLE_ARN` (`server/.env`)
+2. Verify bucket name matches exactly
+3. Check IAM policy has `s3:PutObject` permission
+
+---
+
+## Webhook Configuration [optional]
+
+The `manage_daily_webhook.py` script guides you through creating a webhook for Daily.co recordings.
+
+The webhook isn't required: polling is the default mechanism and runs automatically.
+
+This guide does not cover webhook setup in detail.
--- a/docs/docs/installation/docker-setup.md
+++ b/docs/docs/installation/docker-setup.md
@@ -0,0 +1,192 @@
+---
+sidebar_position: 3
+title: Docker Reference
+---
+
+# Docker Reference
+
+This page documents the Docker Compose configuration for Reflector. For the complete deployment guide, see [Deployment Guide](./overview).
+
+## Services
+
+The `docker-compose.prod.yml` includes these services:
+
+| Service | Image | Purpose |
+|---------|-------|---------|
+| `web` | `monadicalsas/reflector-frontend` | Next.js frontend |
+| `server` | `monadicalsas/reflector-backend` | FastAPI backend |
+| `worker` | `monadicalsas/reflector-backend` | Celery worker for background tasks |
+| `beat` | `monadicalsas/reflector-backend` | Celery beat scheduler |
+| `redis` | `redis:7.2-alpine` | Message broker and cache |
+| `postgres` | `postgres:17-alpine` | Primary database |
+| `caddy` | `caddy:2-alpine` | Reverse proxy with auto-SSL |
+
+## Environment Files
+
+Reflector uses two separate environment files:
+
+### Backend (`server/.env`)
+
+Used by: `server`, `worker`, `beat`
+
+Key variables:
+```env
+# Database connection
+DATABASE_URL=postgresql+asyncpg://reflector:reflector@postgres:5432/reflector
+
+# Redis
+REDIS_HOST=redis
+CELERY_BROKER_URL=redis://redis:6379/1
+CELERY_RESULT_BACKEND=redis://redis:6379/1
+
+# API domain and CORS
+BASE_URL=https://api.example.com
+CORS_ORIGIN=https://app.example.com
+
+# Modal GPU processing
+TRANSCRIPT_BACKEND=modal
+TRANSCRIPT_URL=https://...
+TRANSCRIPT_MODAL_API_KEY=...
+```
+
+### Frontend (`www/.env`)
+
+Used by: `web`
+
+Key variables:
+```env
+# Domain configuration
+SITE_URL=https://app.example.com
+API_URL=https://api.example.com
+WEBSOCKET_URL=wss://api.example.com
+SERVER_API_URL=http://server:1250
+
+# Authentication
+NEXTAUTH_URL=https://app.example.com
+NEXTAUTH_SECRET=...
+```
+
+Note: `API_URL` is used client-side (in the browser); `SERVER_API_URL` is used server-side (SSR).
+
+## Volumes
+
+| Volume | Purpose |
+|--------|---------|
+| `redis_data` | Redis persistence |
+| `postgres_data` | PostgreSQL data |
+| `server_data` | Uploaded files, local storage |
+| `caddy_data` | SSL certificates |
+| `caddy_config` | Caddy configuration |
+
+## Network
+
+All services share the default network. The network is marked `attachable: true` to allow external containers (like Authentik) to join.
+
+## Common Commands
+
+### Start all services
+```bash
+docker compose -f docker-compose.prod.yml up -d
+```
+
+### View logs
+```bash
+# All services
+docker compose -f docker-compose.prod.yml logs -f
+
+# Specific service
+docker compose -f docker-compose.prod.yml logs server --tail 50
+```
+
+### Restart a service
+```bash
+# Quick restart (doesn't reload .env changes)
+docker compose -f docker-compose.prod.yml restart server
+
+# Reload .env and restart
+docker compose -f docker-compose.prod.yml up -d server
+```
+
+### Run database migrations
+```bash
+docker compose -f docker-compose.prod.yml exec server uv run alembic upgrade head
+```
+
+### Access database
+```bash
+docker compose -f docker-compose.prod.yml exec postgres psql -U reflector
+```
+
+### Pull latest images
+```bash
+docker compose -f docker-compose.prod.yml pull
+docker compose -f docker-compose.prod.yml up -d
+```
+
+### Stop all services
+```bash
+docker compose -f docker-compose.prod.yml down
+```
+
+### Full reset (WARNING: deletes data)
+```bash
+docker compose -f docker-compose.prod.yml down -v
+```
+
+## Customization
+
+### Using a different database
+
+To use an external PostgreSQL:
+
+1. Remove `postgres` service from compose file
+2. Update `DATABASE_URL` in `server/.env`:
+   ```env
+   DATABASE_URL=postgresql+asyncpg://user:pass@external-host:5432/reflector
+   ```
+
+### Using external Redis
+
+1. Remove `redis` service from compose file
+2. Update Redis settings in `server/.env`:
+   ```env
+   REDIS_HOST=external-redis-host
+   CELERY_BROKER_URL=redis://external-redis-host:6379/1
+   ```
+
+### Adding Authentik
+
+To add Authentik for authentication, see [Authentication Setup](./auth-setup). Quick steps:
+
+1. Deploy Authentik separately
+2. Connect to Reflector's network:
+   ```bash
+   docker network connect reflector_default authentik-server-1
+   ```
+3. Add to Caddyfile:
+   ```
+   authentik.example.com {
+       reverse_proxy authentik-server-1:9000
+   }
+   ```
+
+## Caddyfile Reference
+
+The Caddyfile supports environment variable substitution:
+
+```
+{$FRONTEND_DOMAIN:app.example.com} {
+    reverse_proxy web:3000
+}
+
+{$API_DOMAIN:api.example.com} {
+    reverse_proxy server:1250
+}
+```
+
+Set `FRONTEND_DOMAIN` and `API_DOMAIN` environment variables, or edit the file directly.
+
+### Reload Caddy after changes
+```bash
+docker compose -f docker-compose.prod.yml exec caddy caddy reload --config /etc/caddy/Caddyfile
+```
--- a/docs/docs/installation/docs-deployment.md
+++ b/docs/docs/installation/docs-deployment.md
@@ -0,0 +1,139 @@
+---
+sidebar_position: 10
+title: Docs Website Deployment
+---
+
+# Docs Website Deployment
+
+This guide covers deploying the Reflector documentation website. **This is optional and intended for internal/experimental use only.**
+
+## Overview
+
+The documentation is built using Docusaurus and deployed as a static nginx-served site.
+
+## Prerequisites
+
+- Reflector already deployed (Steps 1-7 from [Deployment Guide](./overview))
+- DNS A record for docs subdomain (e.g., `docs.example.com`)
+
+## Deployment Steps
+
+### Step 1: Pre-fetch OpenAPI Spec
+
+The docs site includes API reference from your running backend. Fetch it before building:
+
+```bash
+cd ~/reflector
+docker compose -f docker-compose.prod.yml exec server curl -s http://localhost:1250/openapi.json > docs/static/openapi.json
+```
+
+This creates `docs/static/openapi.json` (should be ~70KB) which will be copied during Docker build.
+
+**Why not fetch during build?** Docker build containers are not attached to the Compose network, so they can't reach the running backend services.
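+
+Because a failed `curl` can leave an empty or partial file behind, it may be worth checking the spec before building. This is a minimal sketch: `check_spec` is a hypothetical helper, and the 1 KB threshold is an arbitrary assumption chosen to sit well below the ~70KB mentioned above.
+
+```bash
+# Hypothetical helper: fail if the spec file is missing, tiny, or not OpenAPI-like.
+check_spec() {
+  local file="$1"
+  local size
+  [ -f "$file" ] || { echo "not found: $file"; return 1; }
+  # Arithmetic expansion strips any padding that wc adds on some systems
+  size=$(( $(wc -c < "$file") ))
+  # Assumed threshold: anything under 1 KB is treated as a failed download
+  [ "$size" -ge 1024 ] || { echo "too small: ${size} bytes"; return 1; }
+  # An OpenAPI document carries a top-level "openapi" version field
+  grep -q '"openapi"' "$file" || { echo "no \"openapi\" field in $file"; return 1; }
+  echo "ok: ${size} bytes"
+}
+
+check_spec docs/static/openapi.json || echo "re-run the pre-fetch step"
+```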
+
+### Step 2: Verify Dockerfile
+
+The Dockerfile is already in `docs/Dockerfile`:
+
+```dockerfile
+FROM node:18-alpine AS builder
+WORKDIR /app
+
+# Copy package files
+COPY package*.json ./
+
+# Install dependencies
+RUN npm ci
+
+# Copy source (includes static/openapi.json if pre-fetched)
+COPY . .
+
+# Fix docusaurus config: change onBrokenLinks to 'warn' for Docker build
+RUN sed -i "s/onBrokenLinks: 'throw'/onBrokenLinks: 'warn'/g" docusaurus.config.ts
+
+# Build static site
+RUN npx docusaurus build
+
+FROM nginx:alpine
+COPY --from=builder /app/build /usr/share/nginx/html
+EXPOSE 80
+CMD ["nginx", "-g", "daemon off;"]
+```
+
+### Step 3: Add Docs Service to docker-compose.prod.yml
+
+Add this service to `docker-compose.prod.yml`:
+
+```yaml
+docs:
+  build: ./docs
+  restart: unless-stopped
+  networks:
+    - default
+```
+
+### Step 4: Add Caddy Route
+
+Add to `Caddyfile`:
+
+```
+{$DOCS_DOMAIN:docs.example.com} {
+    reverse_proxy docs:80
+}
+```
+
+### Step 5: Build and Deploy
+
+```bash
+cd ~/reflector
+docker compose -f docker-compose.prod.yml up -d --build docs
+docker compose -f docker-compose.prod.yml exec caddy caddy reload --config /etc/caddy/Caddyfile
+```
+
+### Step 6: Verify
+
+```bash
+# Check container status
+docker compose -f docker-compose.prod.yml ps docs
+# Should show "Up"
+
+# Test URL
+curl -I https://docs.example.com
+# Should return HTTP/2 200
+```
+
+Visit `https://docs.example.com` in your browser.
+
+## Updating Documentation
+
+When docs are updated:
+
+```bash
+cd ~/reflector
+git pull
+
+# Refresh OpenAPI spec from backend
+docker compose -f docker-compose.prod.yml exec server curl -s http://localhost:1250/openapi.json > docs/static/openapi.json
+
+# Rebuild docs
+docker compose -f docker-compose.prod.yml up -d --build docs
+```
+
+## Troubleshooting
+
+### Missing openapi.json during build
+- Make sure you ran the pre-fetch step first (Step 1)
+- Verify `docs/static/openapi.json` exists and is ~70KB
+- Re-run from the repository root: `docker compose -f docker-compose.prod.yml exec server curl -s http://localhost:1250/openapi.json > docs/static/openapi.json`
+
+### Build fails with "Docusaurus found broken links"
+- This happens if `onBrokenLinks: 'throw'` is set in docusaurus.config.ts
+- The Dockerfile already handles this: it uses `sed` to change the setting to `'warn'` during the build
+
+### 404 on all pages
+- Docusaurus baseUrl might be wrong - should be `/` for custom domain
+- Check `docs/docusaurus.config.ts`: `baseUrl: '/'`
+
+### Docs not updating after rebuild
+- Force rebuild: `docker compose -f docker-compose.prod.yml build --no-cache docs`
+- Then: `docker compose -f docker-compose.prod.yml up -d docs`
--- a/docs/docs/installation/modal-setup.md
+++ b/docs/docs/installation/modal-setup.md
@@ -0,0 +1,171 @@
+---
+sidebar_position: 4
+title: Modal.com Setup
+---
+
+# Modal.com Setup
+
+This page covers Modal.com GPU setup in detail. For the complete deployment guide, see [Deployment Guide](./overview).
+
+Reflector uses [Modal.com](https://modal.com) for GPU-accelerated audio processing. This guide walks you through deploying the required GPU functions.
+
+## What is Modal.com?
+
+Modal is a serverless GPU platform. You deploy Python code that runs on their GPUs, and pay only for actual compute time. Reflector uses Modal for:
+
+- **Transcription**: Whisper model for speech-to-text
+- **Diarization**: Pyannote model for speaker identification
+
+## Prerequisites
+
+1. **Modal.com account** - Sign up at https://modal.com (free tier available)
+2. **HuggingFace account** - Required for Pyannote diarization models:
+   - Create account at https://huggingface.co
+   - Accept **both** Pyannote licenses:
+     - https://huggingface.co/pyannote/speaker-diarization-3.1
+     - https://huggingface.co/pyannote/segmentation-3.0
+   - Generate access token at https://huggingface.co/settings/tokens
+
+## Deployment
+
+**Location: YOUR LOCAL COMPUTER (laptop/desktop)**
+
+Modal CLI requires browser authentication, so this must run on a machine with a browser - not on a headless server.
+
+### Install Modal CLI
+
+```bash
+uv tool install modal
+```
+
+### Authenticate with Modal
+
+```bash
+modal setup
+```
+
+This opens your browser for authentication. Complete the login flow.
+
+### Clone Repository and Deploy
+
+```bash
+git clone https://github.com/monadical-sas/reflector.git
+cd reflector/gpu/modal_deployments
+./deploy-all.sh --hf-token YOUR_HUGGINGFACE_TOKEN
+```
+
+Or run interactively (script will prompt for token):
+```bash
+./deploy-all.sh
+```
+
+### What the Script Does
+
+1. **Prompts for HuggingFace token** - Needed to download the Pyannote diarization model
+2. **Generates API key** - Creates a secure random key for authenticating requests to GPU functions
+3. **Creates Modal secrets**:
+   - `hf_token` - Your HuggingFace token
+   - `reflector-gpu` - The generated API key
+4. **Deploys GPU functions** - Transcriber (Whisper) and Diarizer (Pyannote)
+5. **Outputs configuration** - Prints URLs and API key to console
+
+### Example Output
+
+```
+==========================================
+Reflector GPU Functions Deployment
+==========================================
+
+Generating API key for GPU services...
+Creating Modal secrets...
+  -> Creating secret: hf_token
+  -> Creating secret: reflector-gpu
+
+Deploying transcriber (Whisper)...
+  -> https://yourname--reflector-transcriber-web.modal.run
+
+Deploying diarizer (Pyannote)...
+  -> https://yourname--reflector-diarizer-web.modal.run
+
+==========================================
+Deployment complete!
+==========================================
+
+Copy these values to your server's server/.env file:
+
+# --- Modal GPU Configuration ---
+TRANSCRIPT_BACKEND=modal
+TRANSCRIPT_URL=https://yourname--reflector-transcriber-web.modal.run
+TRANSCRIPT_MODAL_API_KEY=abc123...
+
+DIARIZATION_BACKEND=modal
+DIARIZATION_URL=https://yourname--reflector-diarizer-web.modal.run
+DIARIZATION_MODAL_API_KEY=abc123...
+# --- End Modal Configuration ---
+```
+
+Copy the output and paste it into your `server/.env` file on your server.
+
+## Costs
+
+Modal charges based on GPU compute time:
+- Functions scale to zero when not in use (no cost when idle)
+- You only pay for actual processing time
+- Free tier includes $30/month of credits
+
+Typical costs for audio processing:
+- Transcription: ~$0.01-0.05 per minute of audio
+- Diarization: ~$0.02-0.10 per minute of audio
+
+## Troubleshooting
+
+### "Modal CLI not installed"
+```bash
+uv tool install modal
+```
+
+### "Not authenticated with Modal"
+```bash
+modal setup
+# Complete browser authentication
+```
+
+### "Failed to create secret hf_token"
+- Verify your HuggingFace token is valid
+- Ensure you've accepted both Pyannote licenses (see Prerequisites)
+- Token needs `read` permission
+
+### Deployment fails
+Check the Modal dashboard for detailed error logs:
+- Visit https://modal.com/apps
+- Click on the failed function
+- View build and runtime logs
+
+### Re-running deployment
+The script is safe to re-run. It will:
+- Update existing secrets if they exist
+- Redeploy functions with latest code
+- Output new configuration (API key stays the same if secret exists)
+
+## Manual Deployment (Advanced)
+
+If you prefer to deploy functions individually:
+
+```bash
+cd gpu/modal_deployments
+
+# Create secrets manually
+modal secret create hf_token HF_TOKEN=your-hf-token
+modal secret create reflector-gpu REFLECTOR_GPU_APIKEY=$(openssl rand -hex 32)
+
+# Deploy each function
+modal deploy reflector_transcriber.py
+modal deploy reflector_diarizer.py
+```
+
+## Monitoring
+
+View your deployed functions and their usage:
+- **Modal Dashboard**: https://modal.com/apps
+- **Function logs**: Click on any function to view logs
+- **Usage**: View compute time and costs in the dashboard
--- a/docs/docs/installation/overview.md
+++ b/docs/docs/installation/overview.md
@@ -0,0 +1,411 @@
+---
+sidebar_position: 1
+title: Deployment Guide
+---
+
+# Deployment Guide
+
+This guide walks you through deploying Reflector from scratch. Follow these steps in order.
+
+## What You'll Set Up
+
+```mermaid
+flowchart LR
+    User --> Caddy["Caddy (auto-SSL)"]
+    Caddy --> Frontend["Frontend (Next.js)"]
+    Caddy --> Backend["Backend (FastAPI)"]
+    Backend --> PostgreSQL
+    Backend --> Redis
+    Backend --> Workers["Celery Workers"]
+    Workers --> PostgreSQL
+    Workers --> Redis
+    Workers --> GPU["GPU Processing<br/>(Modal.com OR Self-hosted)"]
+```
+
+## Prerequisites
+
+Before starting, you need:
+
+- **Production server** - 4+ cores, 8GB+ RAM, public IP
+- **Two domain names** - e.g., `app.example.com` (frontend) and `api.example.com` (backend)
+- **GPU processing** - Choose one:
+  - Modal.com account, OR
+  - GPU server with NVIDIA GPU (8GB+ VRAM)
+- **HuggingFace account** - Free at https://huggingface.co
+  - Accept both Pyannote licenses (required for speaker diarization):
+    - https://huggingface.co/pyannote/speaker-diarization-3.1
+    - https://huggingface.co/pyannote/segmentation-3.0
+- **LLM API** - For summaries and topic detection. Choose one:
+  - OpenAI API key at https://platform.openai.com/account/api-keys, OR
+  - Any OpenAI-compatible endpoint (vLLM, LiteLLM, Ollama, etc.)
+- **AWS S3 bucket** - For storing audio files and transcripts (see [S3 Setup](#create-s3-bucket-for-transcript-storage) below)
+
+### Optional (for live meeting rooms)
+
+- [ ] **Daily.co account** - Free tier at https://dashboard.daily.co
+- [ ] **AWS S3 bucket + IAM Role** - For Daily.co recording storage (separate from transcript storage)
+
+---
+
+## Configure DNS
+
+```
+Type: A    Name: app    Value: <your-server-ip>
+Type: A    Name: api    Value: <your-server-ip>
+```
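+
+Before continuing, it can help to confirm that both records resolve to the server, since certificate provisioning later depends on it. The following is a sketch under stated assumptions: `check_dns` is a hypothetical helper, and the hostnames and IP address are placeholders.
+
+```bash
+# Hypothetical helper: compare a hostname's A record with the expected IP.
+check_dns() {
+  local host="$1" expected="$2"
+  local resolved
+  # Take the last line in case a CNAME is printed before the address
+  resolved=$(dig +short "$host" A | tail -n 1)
+  if [ "$resolved" = "$expected" ]; then
+    echo "ok: $host -> $resolved"
+  else
+    echo "mismatch: $host -> ${resolved:-none} (expected $expected)"
+    return 1
+  fi
+}
+```
+
+For example, `check_dns app.example.com 203.0.113.10` and `check_dns api.example.com 203.0.113.10` should both report `ok` once the records have propagated.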
+
+---
+
+## Deploy GPU Processing
+
+Reflector requires GPU processing for transcription and speaker diarization. Choose one option:
+
+| | **Modal.com (Cloud)** | **Self-Hosted GPU** |
+|---|---|---|
+| **Best for** | No GPU hardware, zero maintenance | Own GPU server, full control |
+| **Pricing** | Pay-per-use | Fixed infrastructure cost |
+
+### Option A: Modal.com (Serverless Cloud GPU)
+
+#### Accept HuggingFace Licenses
+
+Visit both pages and click "Accept":
+- https://huggingface.co/pyannote/speaker-diarization-3.1
+- https://huggingface.co/pyannote/segmentation-3.0
+
+Generate a token at https://huggingface.co/settings/tokens
+
+#### Deploy to Modal
+
+The repository includes a deploy script for this setup. It uses the Modal CLI to create the required secrets and deploy the GPU functions.
+
+Alternatively, you can perform the same operations manually in the Modal web UI.
+
+```bash
+uv tool install modal
+modal setup  # opens browser for authentication
+
+git clone https://github.com/monadical-sas/reflector.git
+cd reflector/gpu/modal_deployments
+./deploy-all.sh --hf-token YOUR_HUGGINGFACE_TOKEN
+```
+
+**Save the output** - copy the configuration block, you'll need it soon.
+
+See [Modal Setup](./modal-setup) for troubleshooting and details.
+
+### Option B: Self-Hosted GPU
+
+**Location: YOUR GPU SERVER**
+
+Requires: NVIDIA GPU with 8GB+ VRAM, Ubuntu 22.04+, 40-50GB disk.
+
+See [Self-Hosted GPU Setup](./self-hosted-gpu-setup) for complete instructions. Quick summary:
+
+1. Install NVIDIA drivers and Docker
+2. Clone repository: `git clone https://github.com/monadical-sas/reflector.git`
+3. Configure `.env` with HuggingFace token
+4. Start service with Docker compose
+5. Set up Caddy reverse proxy for HTTPS
+
+**Save your API key and HTTPS URL** - you'll need them soon.
+
+---
+
+## Prepare Server
+
+**Location: dedicated reflector server**
+
+### Install Docker
+
+```bash
+ssh user@your-server-ip
+
+curl -fsSL https://get.docker.com | sh
+sudo usermod -aG docker $USER
+
+# Log out and back in for group changes
+exit
+ssh user@your-server-ip
+
+docker --version  # verify
+```
+
+### Firewall
+
+Ensure ports 80 (HTTP) and 443 (HTTPS) are open for inbound traffic. The method varies by cloud provider and OS configuration.
+
+**For live transcription without Daily/Whereby rooms**: WebRTC requires UDP port range 49152-65535 for media traffic.
+
+### Clone Repository
+
+The Docker images contain all application code. You clone the repository for configuration files and the compose definition:
+
+```bash
+git clone https://github.com/monadical-sas/reflector.git
+cd reflector
+```
+
+---
+
+## Create S3 Bucket for Transcript Storage
+
+Reflector requires AWS S3 to store audio files during processing.
+
+### Create Bucket
+
+```bash
+# Choose a unique bucket name
+BUCKET_NAME="reflector-transcripts-yourname"
+AWS_REGION="us-east-1"
+
+# Create bucket
+aws s3 mb s3://$BUCKET_NAME --region $AWS_REGION
+```
+
+### Create IAM User
+
+Create an IAM user with S3 access for Reflector:
+
+1. Go to AWS IAM Console → Users → Create User
+2. Name: `reflector-transcripts`
+3. Attach policy: `AmazonS3FullAccess` (or create a custom policy for just your bucket)
+4. Create access key (Access key ID + Secret access key)
+
+Save these credentials - you'll need them in the next step.
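+
+If you prefer the custom-policy route over `AmazonS3FullAccess`, a bucket-scoped policy could look like the output of the sketch below. The action list is an assumption (this guide does not enumerate the permissions Reflector needs), and `bucket_policy` is a hypothetical helper, so treat it as a starting point to adjust.
+
+```bash
+# Hypothetical helper: print an IAM policy limited to a single bucket.
+# The action list is an assumption; widen or narrow it for your deployment.
+bucket_policy() {
+  local bucket="$1"
+  cat <<EOF
+{
+  "Version": "2012-10-17",
+  "Statement": [
+    {
+      "Effect": "Allow",
+      "Action": ["s3:ListBucket"],
+      "Resource": "arn:aws:s3:::${bucket}"
+    },
+    {
+      "Effect": "Allow",
+      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
+      "Resource": "arn:aws:s3:::${bucket}/*"
+    }
+  ]
+}
+EOF
+}
+
+bucket_policy "reflector-transcripts-yourname"
+```
+
+The printed JSON can be pasted into the policy editor in the IAM console when creating the custom policy.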
+
+---
+
+## Configure Environment
+
+Reflector has two env files:
+- `server/.env` - Backend configuration
+- `www/.env` - Frontend configuration
+
+### Backend Configuration
+
+```bash
+cp server/.env.example server/.env
+nano server/.env
+```
+
+**Required settings:**
+```env
+# Database (defaults work with docker-compose.prod.yml)
+DATABASE_URL=postgresql+asyncpg://reflector:reflector@postgres:5432/reflector
+
+# Redis
+REDIS_HOST=redis
+CELERY_BROKER_URL=redis://redis:6379/1
+CELERY_RESULT_BACKEND=redis://redis:6379/1
+
+# Your domains
+BASE_URL=https://api.example.com
+CORS_ORIGIN=https://app.example.com
+CORS_ALLOW_CREDENTIALS=true
+
+# Secret key - generate with: openssl rand -hex 32
+SECRET_KEY=<your-generated-secret>
+
+# GPU Processing - choose ONE option:
+
+# Option A: Modal.com (paste from deploy-all.sh output)
+TRANSCRIPT_BACKEND=modal
+TRANSCRIPT_URL=https://yourname--reflector-transcriber-web.modal.run
+TRANSCRIPT_MODAL_API_KEY=<from-deploy-all.sh-output>
+DIARIZATION_BACKEND=modal
+DIARIZATION_URL=https://yourname--reflector-diarizer-web.modal.run
+DIARIZATION_MODAL_API_KEY=<from-deploy-all.sh-output>
+
+# Option B: Self-hosted GPU (use your GPU server URL and API key)
+# TRANSCRIPT_BACKEND=modal
+# TRANSCRIPT_URL=https://gpu.example.com
+# TRANSCRIPT_MODAL_API_KEY=<your-generated-api-key>
+# DIARIZATION_BACKEND=modal
+# DIARIZATION_URL=https://gpu.example.com
+# DIARIZATION_MODAL_API_KEY=<your-generated-api-key>
+
+# Storage - where to store audio files and transcripts (requires AWS S3)
+TRANSCRIPT_STORAGE_BACKEND=aws
+TRANSCRIPT_STORAGE_AWS_ACCESS_KEY_ID=your-aws-access-key
+TRANSCRIPT_STORAGE_AWS_SECRET_ACCESS_KEY=your-aws-secret-key
+TRANSCRIPT_STORAGE_AWS_BUCKET_NAME=reflector-transcripts-yourname
+TRANSCRIPT_STORAGE_AWS_REGION=us-east-1
+
+# LLM - for generating titles, summaries, and topics
+LLM_API_KEY=sk-your-openai-api-key
+LLM_MODEL=gpt-4o-mini
+# LLM_URL=https://api.openai.com/v1  # Optional: custom endpoint (vLLM, LiteLLM, Ollama, etc.)
+
+# Auth - disable for initial setup (see the "Enable Authentication" section below)
+AUTH_BACKEND=none
+```
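+
+The `SECRET_KEY` value above, and `NEXTAUTH_SECRET` in the frontend file, are both generated with `openssl rand -hex 32`. As a small sketch, the wrapper below prints ready-to-paste lines; `gen_secret` is a hypothetical name used only for illustration.
+
+```bash
+# Generate a 32-byte random secret, hex-encoded (64 characters).
+gen_secret() {
+  openssl rand -hex 32
+}
+
+# Print ready-to-paste lines; every call produces a different value.
+echo "SECRET_KEY=$(gen_secret)"
+echo "NEXTAUTH_SECRET=$(gen_secret)"
+```
+
+Use a separate value for each variable instead of reusing one secret in both files.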
+
+### Frontend Configuration
+
+```bash
+cp www/.env.example www/.env
+nano www/.env
+```
+
+**Required settings:**
+```env
+# Your domains
+SITE_URL=https://app.example.com
+API_URL=https://api.example.com
+WEBSOCKET_URL=wss://api.example.com
+SERVER_API_URL=http://server:1250
+
+# NextAuth
+NEXTAUTH_URL=https://app.example.com
+NEXTAUTH_SECRET=<generate-with-openssl-rand-hex-32>
+
+# Disable login requirement for initial setup
+FEATURE_REQUIRE_LOGIN=false
+```
+
+---
+
+## Configure Caddy
+
+```bash
+cp Caddyfile.example Caddyfile
+nano Caddyfile
+```
+
+Replace `example.com` with your domains. The `{$VAR:default}` syntax uses Caddy's env var substitution - you can either edit the file directly or set `FRONTEND_DOMAIN` and `API_DOMAIN` environment variables.
+
+```
+{$FRONTEND_DOMAIN:app.example.com} {
+    reverse_proxy web:3000
+}
+
+{$API_DOMAIN:api.example.com} {
+    reverse_proxy server:1250
+}
+```
+
+---
+
+## Start Services
+
+```bash
+docker compose -f docker-compose.prod.yml up -d
+```
+
+Wait for containers to start (first run may take 1-2 minutes to pull images and initialize).
+
+---
+
+## Verify Deployment
+
+### Check services
+```bash
+docker compose -f docker-compose.prod.yml ps
+# All should show "Up"
+```
+
+### Test API
+```bash
+curl https://api.example.com/health
+# Should return: {"status":"healthy"}
+```
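+
+On a first start the API may need a little time before it answers, so a single `curl` can fail spuriously. A minimal polling sketch follows; `wait_for_health` is a hypothetical helper, and the attempt count and delay are arbitrary defaults.
+
+```bash
+# Hypothetical helper: poll a URL until curl succeeds or attempts run out.
+wait_for_health() {
+  local url="$1" attempts="${2:-30}" delay="${3:-2}"
+  local i
+  for i in $(seq 1 "$attempts"); do
+    # -f makes curl exit non-zero on HTTP error responses; -sS hides progress
+    if curl -fsS -o /dev/null "$url" 2>/dev/null; then
+      echo "healthy after ${i} attempt(s)"
+      return 0
+    fi
+    sleep "$delay"
+  done
+  echo "not healthy after ${attempts} attempts"
+  return 1
+}
+```
+
+For example, `wait_for_health https://api.example.com/health 30 2` polls for up to about a minute before giving up.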
+
+### Test Frontend
+- Visit https://app.example.com
+- You should see the Reflector interface
+- Try uploading an audio file to test transcription
+
+If any verification fails, see [Troubleshooting](#troubleshooting) below.
+
+---
+
+## Enable Authentication (Required for Live Rooms)
+
+By default, Reflector is open (no login required). **Authentication is required if you want to use Live Meeting Rooms.**
+
+See [Authentication Setup](./auth-setup) for full Authentik OAuth configuration.
+
+Quick summary:
+1. Deploy Authentik on your server
+2. Create OAuth provider in Authentik
+3. Extract public key for JWT verification
+4. Update `server/.env`: `AUTH_BACKEND=jwt` + `AUTH_JWT_AUDIENCE`
+5. Update `www/.env`: `FEATURE_REQUIRE_LOGIN=true` + Authentik credentials
+6. Mount JWT keys volume and restart services
+
+---
+
+## Enable Live Meeting Rooms
+
+**Requires: Authentication Step**
+
+Live rooms require Daily.co and AWS S3. See [Daily.co Setup](./daily-setup) for complete S3/IAM configuration instructions.
+
+Note that Reflector also supports Whereby as a call provider - this doc doesn't cover its setup yet.
+
+Quick config - Add to `server/.env`:
+
+```env
+DEFAULT_VIDEO_PLATFORM=daily
+DAILY_API_KEY=<from-daily.co-dashboard>
+DAILY_SUBDOMAIN=<your-daily-subdomain>
+
+# S3 for recording storage
+DAILYCO_STORAGE_AWS_BUCKET_NAME=<your-bucket>
+DAILYCO_STORAGE_AWS_REGION=us-east-1
+DAILYCO_STORAGE_AWS_ROLE_ARN=<arn:aws:iam::ACCOUNT:role/DailyCo>
+```
+
+Reload env and restart:
+```bash
+docker compose -f docker-compose.prod.yml up -d server worker
+```
+
+---
+
+## Troubleshooting
+
+### Check logs for errors
+```bash
+docker compose -f docker-compose.prod.yml logs server --tail 20
+docker compose -f docker-compose.prod.yml logs worker --tail 20
+```
+
+### Services won't start
+```bash
+docker compose -f docker-compose.prod.yml logs
+```
+
+### CORS errors in browser
+- Verify `CORS_ORIGIN` in `server/.env` matches your frontend domain exactly (including `https://`)
+- Reload env: `docker compose -f docker-compose.prod.yml up -d server`
+
+### SSL certificate errors
+- Caddy auto-provisions Let's Encrypt certificates
+- Ensure ports 80 and 443 are open
+- Check: `docker compose -f docker-compose.prod.yml logs caddy`
+
+### Transcription not working
+- Check Modal dashboard: https://modal.com/apps
+- Verify URLs in `server/.env` match deployed functions
+- Check worker logs: `docker compose -f docker-compose.prod.yml logs worker`
+
+### "Login required" but auth not configured
+- Set `FEATURE_REQUIRE_LOGIN=false` in `www/.env`
+- Recreate the frontend container: `docker compose -f docker-compose.prod.yml up -d --force-recreate web`
+
+### Database migrations or connectivity issues
+Migrations run automatically on server startup. To check database connectivity or debug migration failures:
+
+```bash
+# Check server logs for migration errors
+docker compose -f docker-compose.prod.yml logs server | grep -i -E "(alembic|migration|database|postgres)"
+
+# Verify database connectivity
+docker compose -f docker-compose.prod.yml exec server uv run python -c "from reflector.db import engine; print('DB connected')"
+
+# Manually run migrations (if needed)
+docker compose -f docker-compose.prod.yml exec server uv run alembic upgrade head
+```
+
--- a/docs/docs/installation/requirements.md
+++ b/docs/docs/installation/requirements.md
@@ -0,0 +1,63 @@
+---
+sidebar_position: 2
+title: System Requirements
+---
+
+# System Requirements
+
+This page lists hardware and software requirements. For the complete deployment guide, see [Deployment Guide](./overview).
+
+## Server Requirements
+
+### Minimum Requirements
+
+- **CPU**: 4 cores
+- **RAM**: 8 GB
+- **Storage**: 50 GB SSD
+- **OS**: Ubuntu 22.04+ or compatible Linux
+- **Network**: Public IP address
+
+### Recommended Requirements
+
+- **CPU**: 8+ cores
+- **RAM**: 16 GB
+- **Storage**: 100 GB SSD
+- **Network**: 1 Gbps connection
+
+## Software Requirements
+
+- Docker Engine 20.10+
+- Docker Compose 2.0+
+
+## External Services
+
+### Required
+
+- **Two domain names** - One for frontend (e.g., `app.example.com`), one for API (e.g., `api.example.com`)
+- **GPU processing** - A Modal.com account (free tier available) or a self-hosted NVIDIA GPU server, for GPU-accelerated transcription and diarization
+- **HuggingFace account** - For Pyannote diarization model access
+- **LLM API** - For generating summaries and topic detection. Options:
+  - OpenAI API (https://platform.openai.com/account/api-keys)
+  - Any OpenAI-compatible endpoint (vLLM, LiteLLM, Ollama)
+  - Self-hosted: Phi-4 14B 4-bit recommended (~8GB VRAM)
+
+### Required for Live Meeting Rooms
+
+- **Daily.co account** - For video conferencing (free tier available at https://dashboard.daily.co)
+- **AWS S3 bucket + IAM Role** - For Daily.co to store recordings
+- **Another AWS S3 bucket (optional, can reuse the one above)** - For Reflector to store "compiled" mp3 files and temporary files from the diarization process
+
+### Optional
+
+- **AWS S3** - For cloud storage of recordings and transcripts
+- **Authentik** - For SSO/OIDC authentication
+- **Sentry** - For error tracking
+
+## Development Requirements
+
+For local development only (not required for production deployment):
+
+- Node.js 22+ (for frontend development)
+- Python 3.12+ (for backend development)
+- pnpm (for frontend package management)
+- uv (for Python package management)
--- a/docs/docs/installation/self-hosted-gpu-setup.md
+++ b/docs/docs/installation/self-hosted-gpu-setup.md
@@ -0,0 +1,307 @@
+---
+sidebar_position: 5
+title: Self-Hosted GPU Setup
+---
+
+# Self-Hosted GPU Setup
+
+This guide covers deploying Reflector's GPU processing on your own server instead of Modal.com. For the complete deployment guide, see [Deployment Guide](./overview).
+
+## When to Use Self-Hosted GPU
+
+**Choose self-hosted GPU if you:**
+- Have GPU hardware available (NVIDIA required)
+- Want full control over processing
+- Prefer fixed infrastructure costs over pay-per-use
+- Have privacy or data locality requirements
+- Need to process audio without external API calls
+
+**Choose Modal.com instead if you:**
+- Don't have GPU hardware
+- Want zero infrastructure management
+- Prefer pay-per-use pricing
+- Need instant scaling for variable workloads
+
+See [Modal.com Setup](./modal-setup) for cloud GPU deployment.
+
+## What Gets Deployed
+
+The self-hosted GPU service provides the same API endpoints as Modal:
+- `POST /v1/audio/transcriptions` - Whisper transcription
+- `POST /v1/audio/transcriptions-from-url` - Transcribe from URL
+- `POST /diarize` - Pyannote speaker diarization
+- `POST /translate` - Audio translation
+
+Your main Reflector server connects to this service exactly like it connects to Modal - only the URL changes.
+
+## Prerequisites
+
+### Hardware
+- **GPU**: NVIDIA GPU with 8GB+ VRAM (tested on Tesla T4 with 15GB)
+- **CPU**: 4+ cores recommended
+- **RAM**: 8GB minimum, 16GB recommended
+- **Disk**: 40-50GB minimum
+
+### Network
+- Public IP address
+- Domain name with DNS A record pointing to server
+
+### Accounts
+- **HuggingFace account** with accepted Pyannote licenses:
+  - https://huggingface.co/pyannote/speaker-diarization-3.1
+  - https://huggingface.co/pyannote/segmentation-3.0
+- **HuggingFace access token** from https://huggingface.co/settings/tokens
+
+## Docker Deployment
+
+### Step 1: Install NVIDIA Driver
+
+```bash
+sudo apt update
+sudo apt install -y nvidia-driver-535
+sudo reboot
+
+# After reboot, verify installation
+nvidia-smi
+```
+
+Expected output: GPU details with driver version and CUDA version.
+
+### Step 2: Install Docker
+
+Follow the [official Docker installation guide](https://docs.docker.com/engine/install/ubuntu/) for your distribution.
+
+After installation, add your user to the docker group:
+
+```bash
+sudo usermod -aG docker $USER
+
+# Log out and back in for group changes
+exit
+# SSH back in
+```
+
+### Step 3: Install NVIDIA Container Toolkit
+
+```bash
+# Add NVIDIA repository and install toolkit
+curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
+  sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
+
+curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
+  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
+  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
+
+sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
+sudo nvidia-ctk runtime configure --runtime=docker
+sudo systemctl restart docker
+```
+
+### Step 4: Clone Repository and Configure
+
+```bash
+git clone https://github.com/monadical-sas/reflector.git
+cd reflector/gpu/self_hosted
+
+# Create environment file
+cat > .env << EOF
+REFLECTOR_GPU_APIKEY=$(openssl rand -hex 16)
+HF_TOKEN=your_huggingface_token_here
+EOF
+
+# Note the generated API key - you'll need it for main server config
+cat .env
+```
+
+### Step 5: Build and Start
+
+The repository includes a `compose.yml` file. Build and start:
+
+
+```bash
+# Build image (takes ~5 minutes, downloads ~10GB)
+sudo docker compose build
+
+# Start service
+sudo docker compose up -d
+
+# Wait for startup and verify
+sleep 30
+sudo docker compose logs
+```
+
+Look for: `INFO: Application startup complete. Uvicorn running on http://0.0.0.0:8000`
+
+### Step 6: Verify GPU Access
+
+```bash
+# Check GPU is accessible from container
+sudo docker exec $(sudo docker ps -q) nvidia-smi
+```
+
+Should show GPU with ~3GB VRAM used (models loaded).
+
+---
+
+## Configure HTTPS with Caddy
+
+Caddy handles SSL automatically.
+
+### Install Caddy
+
+```bash
+sudo apt install -y debian-keyring debian-archive-keyring apt-transport-https curl
+
+curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/gpg.key' | \
+  sudo gpg --dearmor -o /usr/share/keyrings/caddy-stable-archive-keyring.gpg
+
+curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/debian.deb.txt' | \
+  sudo tee /etc/apt/sources.list.d/caddy-stable.list
+
+sudo apt update
+sudo apt install -y caddy
+```
+
+### Configure Reverse Proxy
+
+Edit the Caddyfile with your domain:
+
+```bash
+sudo nano /etc/caddy/Caddyfile
+```
+
+Add (replace `gpu.example.com` with your domain):
+
+```
+gpu.example.com {
+    reverse_proxy localhost:8000
+}
+```
+
+Reload Caddy (auto-provisions SSL certificate):
+
+```bash
+sudo systemctl reload caddy
+```
+
+### Verify HTTPS
+
+```bash
+curl -I https://gpu.example.com/docs
+# Should return HTTP/2 200
+```
+
+---
+
+## Configure Main Reflector Server
+
+On your main Reflector server, update `server/.env`:
+
+```env
+# GPU Processing - Self-hosted
+TRANSCRIPT_BACKEND=modal
+TRANSCRIPT_URL=https://gpu.example.com
+TRANSCRIPT_MODAL_API_KEY=<your-generated-api-key>
+
+DIARIZATION_BACKEND=modal
+DIARIZATION_URL=https://gpu.example.com
+DIARIZATION_MODAL_API_KEY=<your-generated-api-key>
+```
+
+**Note:** The backend type is `modal` because the self-hosted GPU service implements the same API contract as Modal.com. This allows you to switch between cloud and self-hosted GPU processing by only changing the URL and API key.
+
+Recreate the services so the updated `server/.env` is loaded (a plain `restart` does not re-read env files):
+
+```bash
+docker compose -f docker-compose.prod.yml up -d server worker
+```
+
+---
+
+## Service Management
+
+All commands in this section assume you're in `~/reflector/gpu/self_hosted/`.
+
+```bash
+# View logs
+sudo docker compose logs -f
+
+# Restart service
+sudo docker compose restart
+
+# Stop service
+sudo docker compose down
+
+# Check status
+sudo docker compose ps
+```
+
+### Monitor GPU
+
+```bash
+# Check GPU usage
+nvidia-smi
+
+# Watch in real-time
+watch -n 1 nvidia-smi
+```
+
+**Typical GPU memory usage:**
+- Idle (models loaded): ~3GB VRAM
+- During transcription: ~4-5GB VRAM
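+
+To compare actual usage against these figures without reading the full `nvidia-smi` table, the query flags can print just the memory values. This is a sketch: `gpu_mem_check` is a hypothetical helper, the 7000 MiB default is an arbitrary assumption for an 8GB card, and only the first GPU is inspected.
+
+```bash
+# Hypothetical helper: print used/total VRAM in MiB and warn above a limit.
+gpu_mem_check() {
+  local limit_mib="${1:-7000}"
+  local line used total
+  # One CSV line per GPU, e.g. "3012, 15360"; keep only the first GPU
+  line=$(nvidia-smi --query-gpu=memory.used,memory.total \
+    --format=csv,noheader,nounits | head -n 1)
+  used=${line%%,*}
+  total=${line##*, }
+  echo "VRAM: ${used} / ${total} MiB"
+  [ "$used" -lt "$limit_mib" ] || { echo "warning: above ${limit_mib} MiB"; return 1; }
+}
+```
+
+Running `gpu_mem_check` with no argument uses the default limit; pass a number in MiB to change it.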
+
+---
+
+## Troubleshooting
+
+### nvidia-smi fails after driver install
+
+```bash
+# Manually load kernel modules
+sudo modprobe nvidia
+nvidia-smi
+```
+
+### Service fails with "Could not download pyannote pipeline"
+
+1. Verify the token stored in `.env`: `grep HF_TOKEN .env`
+2. Check model access at https://huggingface.co/pyannote/speaker-diarization-3.1
+3. Update `.env` with the correct token
+4. Recreate the service so the new value is loaded: `sudo docker compose up -d`
+
+### Cannot connect to HTTPS endpoint
+
+1. Verify DNS resolves: `dig +short gpu.example.com`
+2. Check firewall: `sudo ufw status` (ports 80, 443 must be open)
+3. Check Caddy: `sudo systemctl status caddy`
+4. View Caddy logs: `sudo journalctl -u caddy -n 50`
+
+### SSL certificate not provisioning
+
+Requirements for Let's Encrypt:
+- Ports 80 and 443 publicly accessible
+- DNS resolves to server's public IP
+- Valid domain (not localhost or private IP)
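+
+The DNS and "not a private IP" requirements can be checked together. This is a rough sketch using only the Python standard library; `is_publicly_routable` and `resolve_and_check` are hypothetical names, and passing the check does not guarantee that ports 80 and 443 are reachable from outside.
+
+```python
+import ipaddress
+import socket
+
+
+def is_publicly_routable(address: str) -> bool:
+    """True if the IP address is globally routable (not private, loopback, etc.)."""
+    return ipaddress.ip_address(address).is_global
+
+
+def resolve_and_check(hostname: str) -> dict[str, bool]:
+    """Resolve a hostname and report, per address, whether it is public."""
+    infos = socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)
+    addresses = {info[4][0] for info in infos}
+    return {addr: is_publicly_routable(addr) for addr in sorted(addresses)}
+```
+
+Running `resolve_and_check("gpu.example.com")` with your own domain should list the server's public IP with a value of `True`.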
+
+### Docker container won't start
+
+```bash
+# Check logs
+sudo docker compose logs
+
+# Common issues:
+# - Port 8000 already in use
+# - GPU not accessible (nvidia-ctk not configured)
+# - Missing .env file
+```
+
+---
+
+## Updating
+
+```bash
+cd ~/reflector/gpu/self_hosted
+git pull
+sudo docker compose build
+sudo docker compose up -d
+```
--- a/docs/docs/intro.md
+++ b/docs/docs/intro.md
@@ -0,0 +1,61 @@
+---
+sidebar_position: 1
+title: Introduction
+---
+
+# Welcome to Reflector
+
+Reflector is a privacy-focused, self-hosted platform for AI-powered audio transcription and meeting analysis. It provides real-time transcription, speaker diarization, and summarization for audio content and live meetings. You keep complete control over your data and infrastructure: GPU processing can run on Modal.com or on your own hardware.
+
+## What is Reflector?
+
+Reflector is a web application that uses AI to process audio content, providing:
+
+- **Real-time Transcription**: Convert speech to text using [Whisper](https://github.com/openai/whisper) (multi-language) or [Parakeet](https://github.com/NVIDIA/NeMo) (English) models
+- **Speaker Diarization**: Identify and label different speakers using [Pyannote](https://github.com/pyannote/pyannote-audio) 3.1
+- **Topic Detection & Summarization**: Extract key topics and generate concise summaries using LLMs
+- **Meeting Recording**: Create permanent records of meetings with searchable transcripts
+
+![Reflector Transcript View](/img/reflector-transcript-view.png)
+
+## Features
+
+| Feature                                    | Public Mode | Private Mode |
+|--------------------------------------------|------------|--------------|
+| **Authentication**                         | None required | Required |
+| **Audio Upload**                           | ✅ | ✅ |
+| **Live Microphone Streaming**              | ✅ | ✅ |
+| **Transcription**                          | ✅ | ✅ |
+| **Speaker Diarization**                    | ✅ | ✅ |
+| **Topic Detection**                        | ✅ | ✅ |
+| **Summarization**                          | ✅ | ✅ |
+| **Virtual Meeting Rooms (Whereby, Daily)** | ❌ | ✅ |
+| **Browse Transcripts Page**                | ❌ | ✅ |
+| **Search Functionality**                   | ❌ | ✅ |
+| **Persistent Storage**                     | ❌ | ✅ |
+
+## Architecture Overview
+
+Reflector consists of three main components:
+
+- **Frontend**: React application built with Next.js
+- **Backend**: Python server using FastAPI
+- **Processing**: Scalable GPU workers for ML inference (Modal.com or local)
+
+## Getting Started
+
+Ready to deploy Reflector? Head over to our [Installation Guide](./installation/overview) to set up your own instance.
+
+For a quick overview of how Reflector processes audio, check out our [Pipeline Documentation](./pipelines/overview).
+
+## Open Source
+
+Reflector is open source software developed by [Monadical](https://monadical.com) and licensed under the **MIT License**. We welcome contributions from the community!
+
+- [GitHub Repository](https://github.com/monadical-sas/reflector)
+- [Issue Tracker](https://github.com/monadical-sas/reflector/issues)
+- [Pull Requests](https://github.com/monadical-sas/reflector/pulls)
+
+## Support
+
+Need help? Reach out to the community through GitHub Discussions.
--- a/docs/docs/pipelines/file-pipeline.md
+++ b/docs/docs/pipelines/file-pipeline.md
@@ -0,0 +1,83 @@
+---
+sidebar_position: 2
+title: File Processing Pipeline
+---
+
+# File Processing Pipeline
+
+The file processing pipeline handles uploaded audio files, optimizing for accuracy and throughput.
+
+## Pipeline Stages
+
+### 1. Input Stage
+
+**Accepted Formats:**
+- MP3 (most common)
+- WAV (uncompressed)
+- M4A (Apple format)
+- WebM (browser recordings)
+- MP4 (video with audio track)
+
+**File Validation:**
+- Sample rate: Any (will be resampled to 16kHz)
+
+### 2. Pre-processing
+
+**Audio Normalization:**
+```yaml
+# Convert to standard format
+- Sample rate: 16kHz (Whisper requirement)
+- Channels: Mono
+- Bit depth: 16-bit
+- Format: WAV
+```
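+
+As an illustration of this target format, the conversion could be done with `ffmpeg`. This is an assumption for the sake of the example: the pipeline itself may use a different tool, and `normalize_command` and `normalize` are hypothetical helpers.
+
+```python
+import subprocess
+
+
+def normalize_command(src: str, dst: str) -> list[str]:
+    """Build an ffmpeg command producing 16 kHz, mono, 16-bit PCM WAV."""
+    return [
+        "ffmpeg", "-y",          # overwrite dst if it already exists
+        "-i", src,
+        "-vn",                   # drop any video stream (e.g. MP4 input)
+        "-ar", "16000",          # sample rate: 16 kHz
+        "-ac", "1",              # channels: mono
+        "-c:a", "pcm_s16le",     # 16-bit little-endian PCM
+        dst,
+    ]
+
+
+def normalize(src: str, dst: str) -> None:
+    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
+    subprocess.run(normalize_command(src, dst), check=True)
+```
+
+Building the argument list separately from running it keeps the command easy to inspect and test without `ffmpeg` installed.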
+
+**Noise Reduction (Optional):**
+- Background noise removal
+- Echo cancellation
+- High-pass filter for rumble
+
+### 3. Chunking Strategy
+
+Audio is split into segments for processing:
+- Configurable chunk sizes
+- Optional silence detection for natural breaks
+- Parallel processing of chunks
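+
+A simple fixed-size variant of this strategy is sketched below. The 30-second chunk length and 1-second overlap are placeholder values chosen for the example, not Reflector's defaults, and silence detection is left out.
+
+```python
+def chunk_bounds(duration: float, chunk: float = 30.0,
+                 overlap: float = 1.0) -> list[tuple[float, float]]:
+    """Split [0, duration] into (start, end) windows that overlap slightly."""
+    if chunk <= overlap:
+        raise ValueError("chunk must be longer than overlap")
+    bounds = []
+    start = 0.0
+    while start < duration:
+        end = min(start + chunk, duration)
+        bounds.append((start, end))
+        if end >= duration:
+            break
+        # Step back by the overlap so words cut at a boundary appear in both chunks.
+        start = end - overlap
+    return bounds
+```
+
+Each `(start, end)` pair is independent of the others, which is what allows the chunks to be transcribed in parallel.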
+
+### 4. Transcription Processing
+
+Transcription uses OpenAI Whisper models via Modal.com or self-hosted GPU:
+- Automatic language detection
+- Word-level timestamps
+
+### 5. Diarization (Speaker Identification)
+
+Speaker diarization uses Pyannote 3.1:
+
+1. **Voice Activity Detection (VAD)** - Identifies speech segments
+2. **Speaker Embedding** - Extracts voice characteristics
+3. **Clustering** - Groups similar voices
+4. **Segmentation** - Assigns speaker labels to time segments
+
+### 6. Alignment & Merging
+
+- Combines transcription with speaker diarization
+- Maps speaker labels to transcript segments
+- Resolves timing overlaps
+- Validates timeline consistency
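+
+One common way to map speaker labels onto a transcript is to give each word the speaker whose turn overlaps it most in time. The sketch below shows that idea under simplified assumptions; the `Word` and `Turn` types and the `assign_speakers` function are invented for this example and are not Reflector's data model.
+
+```python
+from dataclasses import dataclass
+
+
+@dataclass
+class Word:
+    text: str
+    start: float
+    end: float
+
+
+@dataclass
+class Turn:
+    speaker: int
+    start: float
+    end: float
+
+
+def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
+    """Length in seconds of the intersection of two time intervals."""
+    return max(0.0, min(a_end, b_end) - max(a_start, b_start))
+
+
+def assign_speakers(words: list[Word], turns: list[Turn]) -> list[tuple[str, int | None]]:
+    """Label each word with the speaker whose turn overlaps it most (None if none does)."""
+    labeled = []
+    for word in words:
+        best = max(
+            turns,
+            key=lambda t: overlap(word.start, word.end, t.start, t.end),
+            default=None,
+        )
+        if best is not None and overlap(word.start, word.end, best.start, best.end) > 0:
+            labeled.append((word.text, best.speaker))
+        else:
+            labeled.append((word.text, None))
+    return labeled
+```
+
+Words that fall outside every speaker turn are labeled `None` here; a real pipeline would need a policy for them, such as inheriting the nearest speaker.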
+
+### 7. Post-processing Chain
+
+- **Text Formatting**: Punctuation, capitalization
+- **Topic Detection**: LLM-based topic extraction
+- **Summarization**: AI-generated summaries and action items
+
+### 8. Storage & Delivery
+
+**File Storage:**
+- Original audio: S3 (optional)
+- Transcript exports: JSON, VTT, TXT
+
+**Notifications:**
+- WebSocket updates during processing
+- Webhook notifications on completion (optional)
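+
+To give a sense of what a VTT export contains, here is a minimal sketch that renders speaker-labeled segments as WebVTT. The segment tuple layout and the `Speaker N` naming are assumptions made for this example; the actual export may differ.
+
+```python
+def vtt_timestamp(seconds: float) -> str:
+    """Format seconds as a WebVTT timestamp, HH:MM:SS.mmm."""
+    total_ms = round(seconds * 1000)
+    hours, rest = divmod(total_ms, 3_600_000)
+    minutes, rest = divmod(rest, 60_000)
+    secs, ms = divmod(rest, 1000)
+    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"
+
+
+def to_vtt(segments) -> str:
+    """Render (start, end, speaker, text) tuples as a WebVTT document."""
+    lines = ["WEBVTT", ""]
+    for start, end, speaker, text in segments:
+        lines.append(f"{vtt_timestamp(start)} --> {vtt_timestamp(end)}")
+        lines.append(f"<v Speaker {speaker}>{text}")  # WebVTT voice tag
+        lines.append("")
+    return "\n".join(lines)
+```
+
+The `<v ...>` voice tag is part of the WebVTT format and lets players show who is speaking in each cue.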
--- a/docs/docs/reference/api.md
+++ b/docs/docs/reference/api.md
@@ -0,0 +1,28 @@
+---
+title: API Reference
+---
+
+# API Reference
+
+The complete API documentation is auto-generated from the OpenAPI specification.
+
+## Interactive Documentation
+
+When running Reflector, interactive API docs are available at:
+
+- **Swagger UI**: `https://your-api-domain/docs`
+- **ReDoc**: `https://your-api-domain/redoc`
+
+## OpenAPI Specification
+
+The raw OpenAPI specification (JSON) can be downloaded from:
+
+```
+https://your-api-domain/openapi.json
+```
+
+A static copy is also available: [openapi.json](/openapi.json)
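+
+Because the specification is plain JSON, it is easy to inspect from a script. The following sketch lists every method and path in a downloaded spec; `list_operations` and `fetch_spec` are hypothetical helper names, and `your-api-domain` remains a placeholder for your own host.
+
+```python
+import json
+from urllib.request import urlopen
+
+HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}
+
+
+def list_operations(spec: dict) -> list[tuple[str, str]]:
+    """Return sorted (METHOD, path) pairs from an OpenAPI document."""
+    operations = []
+    for path, item in spec.get("paths", {}).items():
+        for method in item:
+            # Path items may also hold non-method keys such as "parameters".
+            if method in HTTP_METHODS:
+                operations.append((method.upper(), path))
+    return sorted(operations)
+
+
+def fetch_spec(base_url: str) -> dict:
+    """Download openapi.json from a running instance."""
+    with urlopen(f"{base_url.rstrip('/')}/openapi.json") as response:
+        return json.load(response)
+```
+
+For example, `list_operations(fetch_spec("https://your-api-domain"))` prints an overview of the API without opening Swagger UI.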
+
+## Authentication
+
+See [Authentication Setup](../installation/auth-setup) for configuring API authentication.
--- a/docs/docs/roadmap.md
+++ b/docs/docs/roadmap.md
@@ -0,0 +1,112 @@
+---
+sidebar_position: 100
+title: Roadmap
+---
+
+# Product Roadmap
+
+Our development roadmap for Reflector, focusing on expanding capabilities while maintaining privacy and performance.
+
+## Planned Features
+
+### 🌍 Multi-Language Support Enhancement
+
+**Current State:**
+- Whisper supports multi-language transcription
+- Parakeet supports English only, with high accuracy
+
+**Planned Improvements:**
+- Default language selection per room/user
+- Automatic language detection improvements
+- Multi-language diarization support
+- RTL (Right-to-Left) language UI support
+- Language-specific post-processing rules
+
+### 🏠 Self-Hosted Room Providers
+
+**Jitsi Integration**
+
+Moving beyond hosted providers (Whereby, Daily) to support self-hosted video conferencing:
+
+- No API keys required
+- Complete control over video infrastructure
+- Custom branding and configuration
+- Lower operational costs
+- Enhanced privacy with self-hosted video
+
+**Implementation Plan:**
+- WebRTC bridge for Jitsi Meet
+- Room management API integration
+- Recording synchronization
+- Participant tracking
+
+### 📅 Calendar Integration
+
+**Planned Capabilities:**
+- Google Calendar synchronization
+- Microsoft Outlook integration
+- Automatic meeting room creation
+- Pre-meeting document preparation
+- Post-meeting transcript delivery
+- Recurring meeting support
+
+**Features:**
+- Auto-join scheduled meetings
+- Calendar-based access control
+- Meeting agenda import
+- Action item export to calendar
+
+## Future Considerations
+
+### Enhanced Analytics
+- Meeting insights dashboard
+- Speaker participation metrics
+- Topic trends over time
+- Team collaboration patterns
+
+### Advanced AI Features
+- Real-time sentiment analysis
+- Emotion detection
+- Meeting quality scores
+- Automated coaching suggestions
+
+### Integration Ecosystem
+- Slack/Teams notifications
+- CRM integration (Salesforce, HubSpot)
+- Project management tools (Jira, Asana)
+- Knowledge bases (Notion, Confluence)
+
+### Performance Improvements
+- WebAssembly for client-side processing
+- Edge computing support
+- 5G network optimization
+- Blockchain for transcript verification
+
+## Contributing
+
+We welcome community contributions! Areas where you can help:
+
+1. **Language Support**: Add support for your language
+2. **Integrations**: Connect with your favorite tools
+3. **Models**: Fine-tune models for specific domains
+4. **Documentation**: Improve guides and examples
+
+See our [Contributing Guide](https://github.com/monadical-sas/reflector/blob/main/CONTRIBUTING.md) for details.
+
+## Timeline
+
+We don't provide specific dates as development depends on community contributions and priorities. Features are generally released when they're ready and properly tested.
+
+## Feature Requests
+
+Have an idea for Reflector? We'd love to hear it!
+
+- [Open a GitHub Issue](https://github.com/monadical-sas/reflector/issues/new)
+- [Join our Discord](#)
+- [Email us](mailto:reflector@monadical.com)
+
+## Stay Updated
+
+- Watch our [GitHub repository](https://github.com/monadical-sas/reflector)
+- Follow our [blog](#)
+- Subscribe to our [newsletter](#)