---
sidebar_position: 1
title: Architecture Overview
---
# Architecture Overview
Reflector is built as a scalable, microservices-based application designed to handle audio processing workloads efficiently while maintaining data privacy and control.
## System Components

### Frontend Application
The user interface is built with Next.js 15 using the App Router pattern, providing:
- Server-side rendering for optimal performance
- Real-time WebSocket connections for live transcription
- WebRTC support for audio streaming and live meetings (via Daily.co or Whereby)
- Responsive design with Chakra UI components
### Backend API Server
The core API is powered by FastAPI, a modern Python framework that provides:
- High-performance async request handling
- Automatic OpenAPI documentation generation
- Type safety with Pydantic models
- WebSocket support for real-time updates
### Processing Pipeline
Audio processing is handled through a modular pipeline architecture:
```
Audio Input → Chunking → Transcription → Diarization → Post-Processing → Storage
```
Each step can run independently and in parallel, allowing for:
- Scalable processing of large files
- Real-time streaming capabilities
- Fault tolerance and retry mechanisms
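The pipeline above can be sketched as independent stages passing offset-tagged chunks along. This is an illustrative outline only; the function names and the 16 kHz mono PCM assumption are ours, not Reflector's actual code:

```python
from dataclasses import dataclass
from typing import Iterator

BYTES_PER_SECOND = 32_000  # assumption: 16 kHz, 16-bit mono PCM

@dataclass
class Segment:
    start: float  # seconds from the start of the recording
    end: float
    text: str

def chunk(audio: bytes, chunk_seconds: float = 30.0) -> Iterator[tuple[float, bytes]]:
    """Split raw audio into fixed-length chunks, yielding (offset, data)."""
    step = int(chunk_seconds * BYTES_PER_SECOND)
    for i in range(0, len(audio), step):
        yield i / BYTES_PER_SECOND, audio[i:i + step]

def transcribe(offset: float, data: bytes) -> Segment:
    """Stand-in for the real ASR call; keeps the chunk's absolute offset."""
    return Segment(offset, offset + len(data) / BYTES_PER_SECOND, text="...")

def run_pipeline(audio: bytes) -> list[Segment]:
    # Chunks are independent, so this loop could fan out to parallel workers.
    return [transcribe(off, data) for off, data in chunk(audio)]
```

Because each chunk carries its own offset, the transcribe calls are order-independent and can be retried or distributed freely, which is what makes the parallelism and fault tolerance above possible.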
### Worker Architecture
Background tasks are managed by Celery workers with Redis as the message broker:
- Distributed task processing
- Priority queues for time-sensitive operations
- Automatic retry on failure
- Progress tracking and notifications
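Celery expresses retry policy declaratively (e.g. `max_retries` with exponential backoff); a stdlib-only sketch of the underlying behavior, for illustration:

```python
import time

def retry_with_backoff(task, max_retries: int = 3, base_delay: float = 1.0):
    """Run `task`, retrying on failure with exponential backoff
    (1 s, 2 s, 4 s, ...), similar to Celery's retry_backoff behavior."""
    for attempt in range(max_retries + 1):
        try:
            return task()
        except Exception:
            if attempt == max_retries:
                raise  # exhausted retries: surface the last error
            time.sleep(base_delay * 2 ** attempt)
```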
### GPU Acceleration
ML models run on GPU-accelerated infrastructure:
- Modal.com for serverless GPU processing
- Self-hosted GPU with Docker deployment
- Automatic scaling based on demand
- Cost-effective pay-per-use model
## Data Flow

### Daily.co Meeting Recording Flow
1. Recording: Daily.co captures separate audio tracks per participant
2. Webhook: Daily.co notifies Reflector when the recording is ready
3. Track Download: Individual participant tracks fetched from S3
4. Padding: Tracks padded with silence based on join time for synchronization
5. Transcription: Each track transcribed independently (speaker = track index)
6. Merge: Transcriptions sorted by timestamp and combined
7. Mixdown: Tracks mixed to a single MP3 for playback
8. Post-Processing: Topics, title, and summaries generated via LLM
9. Delivery: Results stored and user notified via WebSocket
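The padding and merge steps hinge on converting per-track timestamps to meeting time: padding a track with silence is equivalent to shifting its segment timestamps by the participant's join offset. A simplified sketch, with illustrative types and names:

```python
from dataclasses import dataclass

@dataclass
class TrackSegment:
    start: float  # seconds, relative to this track's own recording start
    text: str
    speaker: int  # track index stands in for speaker identity

def merge_tracks(tracks: list[tuple[float, list[TrackSegment]]]) -> list[TrackSegment]:
    """Each entry is (join_offset_seconds, segments). Shift each segment
    to absolute meeting time, then sort to interleave speakers."""
    merged = [
        TrackSegment(seg.start + join_offset, seg.text, seg.speaker)
        for join_offset, segments in tracks
        for seg in segments
    ]
    return sorted(merged, key=lambda s: s.start)
```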
### File Upload Flow
1. Upload: User uploads an audio file through the web interface
2. Storage: File stored temporarily
3. Transcription: Full file transcribed via Whisper
4. Diarization: ML-based speaker identification (Pyannote)
5. Post-Processing: Topics, title, and summaries
6. Delivery: Results stored and user notified
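Diarization yields speaker turns as time ranges; one common way to attach them to transcript segments is a maximum-overlap match. This is a simplified sketch, not Reflector's actual implementation:

```python
def assign_speakers(
    transcript: list[tuple[float, float, str]],  # (start, end, text)
    turns: list[tuple[float, float, str]],       # (start, end, speaker_label)
) -> list[tuple[str, str]]:
    """Label each transcript segment with the speaker whose turn
    overlaps it the most (empty label if nothing overlaps)."""
    labeled = []
    for t_start, t_end, text in transcript:
        best, best_overlap = "", 0.0
        for s_start, s_end, speaker in turns:
            overlap = min(t_end, s_end) - max(t_start, s_start)
            if overlap > best_overlap:
                best, best_overlap = speaker, overlap
        labeled.append((text, best))
    return labeled
```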
### Live Streaming Flow
1. WebRTC Connection: Browser establishes a peer connection via Daily.co or Whereby
2. Audio Capture: Microphone audio streamed to the server
3. Buffering: Audio buffered for processing
4. Real-time Processing: Segments transcribed as they arrive
5. WebSocket Updates: Results streamed back to the client
6. Continuous Assembly: Full transcript built progressively
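The continuous-assembly step can be pictured as a keyed buffer: partial segment text arrives over the WebSocket and may be revised until final. A minimal sketch (the id-keyed update protocol is an assumption, not Reflector's wire format):

```python
class LiveTranscript:
    """Assemble a transcript progressively: segments arrive keyed by id,
    and partial text may be revised by later updates."""
    def __init__(self):
        self.segments: dict[int, str] = {}

    def update(self, segment_id: int, text: str) -> None:
        self.segments[segment_id] = text  # a revision overwrites the partial

    def render(self) -> str:
        # Render segments in id order regardless of arrival order.
        return " ".join(text for _, text in sorted(self.segments.items()))
```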
## Deployment Architecture

### Container-Based Deployment
All components are containerized for consistent deployment:
```yaml
services:
  web:       # Next.js application
  server:    # FastAPI server
  worker:    # Celery workers
  redis:     # Message broker
  postgres:  # Database
  caddy:     # Reverse proxy
```
### Networking
- Host Network Mode: Required for WebRTC/ICE compatibility
- Caddy Reverse Proxy: Handles SSL termination and routing
- WebSocket Upgrade: Supports real-time connections
## Scalability Considerations

### Horizontal Scaling
- Stateless Backend: Multiple API server instances
- Worker Pools: Add workers based on queue depth
- Database Pooling: Connection management for concurrent access
### Vertical Scaling
- GPU Workers: Scale up for faster model inference
- Memory Optimization: Efficient audio buffering
## Security Architecture

### Authentication & Authorization
- JWT Tokens: Stateless authentication
- Authentik Integration: Enterprise SSO support
- Role-Based Access: Granular permissions
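"Stateless" here means the API can verify a token with only a key, no session store lookup. A stdlib sketch of HS256 signature verification; production code should use a maintained library such as PyJWT and also validate claims like `exp`, `aud`, and `iss`:

```python
import base64, hashlib, hmac, json

def b64url_decode(s: str) -> bytes:
    # JWTs use unpadded base64url; restore padding before decoding.
    return base64.urlsafe_b64decode(s + "=" * (-len(s) % 4))

def verify_hs256(token: str, secret: bytes) -> dict:
    """Verify an HS256 JWT signature and return its claims."""
    header_b64, payload_b64, sig_b64 = token.split(".")
    signing_input = f"{header_b64}.{payload_b64}".encode()
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(sig_b64)):
        raise ValueError("invalid signature")
    return json.loads(b64url_decode(payload_b64))
```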
### Data Protection
- Encryption in Transit: TLS for all connections
- Temporary Storage: Automatic cleanup of processed files
### Privacy by Design
- Local Processing: Option to process entirely on-premises
- No Training on User Data: Models are pre-trained
- Data Isolation: Multi-tenant data separation
## Integration Points

### External Services
- Modal.com: GPU processing
- AWS S3: Long-term storage
- Whereby: Video conferencing rooms
- Zulip: Chat integration (optional)
### APIs and Webhooks
- RESTful API: Standard CRUD operations
- WebSocket API: Real-time updates
- Webhook Notifications: Processing completion events
- OpenAPI Specification: Machine-readable API definition
## Performance Optimization

### Caching Strategy
- Redis Cache: Frequently accessed data
- CDN: Static asset delivery
- Browser Cache: Client-side optimization
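The read-through pattern behind a Redis cache reduces to "serve the stored value until the entry expires". A minimal in-process sketch of per-entry TTL (Redis provides the same semantics with `SET key value EX ttl`):

```python
import time

class TTLCache:
    """Tiny in-process cache with per-entry expiry."""
    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable for testing
        self.store: dict[str, tuple[float, object]] = {}

    def get(self, key: str):
        entry = self.store.get(key)
        if entry is None or self.clock() > entry[0]:
            return None  # miss or expired
        return entry[1]

    def set(self, key: str, value) -> None:
        self.store[key] = (self.clock() + self.ttl, value)
```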
### Database Optimization
- Indexed Queries: Fast search and retrieval
- Connection Pooling: Efficient resource usage
- Query Optimization: N+1 query prevention
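N+1 prevention means replacing one query per parent row with a single batched query. An sqlite3 sketch of the batched form (the `topics` table here is illustrative, not Reflector's actual schema):

```python
import sqlite3

def topics_for_transcripts(conn, transcript_ids: list[int]) -> dict[int, list[str]]:
    """One IN(...) query for all topics instead of one query per transcript."""
    placeholders = ",".join("?" * len(transcript_ids))
    rows = conn.execute(
        f"SELECT transcript_id, title FROM topics "
        f"WHERE transcript_id IN ({placeholders})",
        transcript_ids,
    )
    grouped: dict[int, list[str]] = {tid: [] for tid in transcript_ids}
    for tid, title in rows:
        grouped[tid].append(title)
    return grouped
```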
### Processing Optimization
- Batch Processing: Efficient GPU utilization
- Parallel Execution: Multi-core CPU usage
- Stream Processing: Reduced memory footprint
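Stream processing keeps peak memory at one chunk regardless of input size; a generator makes this explicit:

```python
from typing import BinaryIO, Iterator

def stream_chunks(f: BinaryIO, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield fixed-size chunks so peak memory stays at one chunk,
    no matter how large the audio file is."""
    while chunk := f.read(chunk_size):
        yield chunk
```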
## Monitoring and Observability

### Metrics Collection
- Application Metrics: Request rates, response times
- System Metrics: CPU, memory, disk usage
- Business Metrics: Transcription accuracy, processing times
### Logging
- Structured Logging: JSON format for analysis
- Log Aggregation: Centralized log management
- Error Tracking: Sentry integration
### Health Checks
- Liveness Probes: Component availability
- Readiness Probes: Service readiness
- Dependency Checks: External service status
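A dependency check often starts as "can we reach the service in time". A minimal TCP probe sketch; real checks would also issue a protocol-level ping (e.g. Redis `PING` or a `SELECT 1` against Postgres):

```python
import socket

def check_tcp(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False
```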