* feat: WIP doc (vibe started and iterated) * install from scratch docs * caddyfile.example * gitignore * authentik script * authentik script * authentik script * llm doc * authentik ongoing * more daily setup logs * doc website * gpu self hosted setup guide (no-mistakes) * doc review round * doc review round * doc review round * update doc site sidebars * feat(docs): add mermaid diagram support * docs polishing * live pipeline doc * move pipeline dev docs to dev docs location * doc pr review iteration * dockerfile healthcheck * docs/pr-comments * remove jwt comment * llm suggestion * pr comments * pr comments * document auto migrations * cleanup docs --------- Co-authored-by: Mathieu Virbel <mat@meltingrocks.com> Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
4.7 KiB
sidebar_position, title
| sidebar_position | title |
|---|---|
| 4 | Modal.com Setup |
Modal.com Setup
This page covers Modal.com GPU setup in detail. For the complete deployment guide, see Deployment Guide.
Reflector uses Modal.com for GPU-accelerated audio processing. This guide walks you through deploying the required GPU functions.
What is Modal.com?
Modal is a serverless GPU platform. You deploy Python code that runs on their GPUs, and pay only for actual compute time. Reflector uses Modal for:
- Transcription: Whisper model for speech-to-text
- Diarization: Pyannote model for speaker identification
Prerequisites
- Modal.com account - Sign up at https://modal.com (free tier available)
- HuggingFace account - Required for Pyannote diarization models:
- Create account at https://huggingface.co
- Accept both Pyannote licenses:
- Generate access token at https://huggingface.co/settings/tokens
Deployment
Location: YOUR LOCAL COMPUTER (laptop/desktop)
Modal CLI requires browser authentication, so this must run on a machine with a browser - not on a headless server.
Install Modal CLI
uv tool install modal
Authenticate with Modal
modal setup
This opens your browser for authentication. Complete the login flow.
Clone Repository and Deploy
git clone https://github.com/monadical-sas/reflector.git
cd reflector/gpu/modal_deployments
./deploy-all.sh --hf-token YOUR_HUGGINGFACE_TOKEN
Or run interactively (script will prompt for token):
./deploy-all.sh
What the Script Does
- Prompts for HuggingFace token - Needed to download the Pyannote diarization model
- Generates API key - Creates a secure random key for authenticating requests to GPU functions
- Creates Modal secrets:
hf_token- Your HuggingFace tokenreflector-gpu- The generated API key
- Deploys GPU functions - Transcriber (Whisper) and Diarizer (Pyannote)
- Outputs configuration - Prints URLs and API key to console
Example Output
==========================================
Reflector GPU Functions Deployment
==========================================
Generating API key for GPU services...
Creating Modal secrets...
-> Creating secret: hf_token
-> Creating secret: reflector-gpu
Deploying transcriber (Whisper)...
-> https://yourname--reflector-transcriber-web.modal.run
Deploying diarizer (Pyannote)...
-> https://yourname--reflector-diarizer-web.modal.run
==========================================
Deployment complete!
==========================================
Copy these values to your server's server/.env file:
# --- Modal GPU Configuration ---
TRANSCRIPT_BACKEND=modal
TRANSCRIPT_URL=https://yourname--reflector-transcriber-web.modal.run
TRANSCRIPT_MODAL_API_KEY=abc123...
DIARIZATION_BACKEND=modal
DIARIZATION_URL=https://yourname--reflector-diarizer-web.modal.run
DIARIZATION_MODAL_API_KEY=abc123...
# --- End Modal Configuration ---
Copy the output and paste it into your server/.env file on your server.
Costs
Modal charges based on GPU compute time:
- Functions scale to zero when not in use (no cost when idle)
- You only pay for actual processing time
- Free tier includes $30/month of credits
Typical costs for audio processing:
- Transcription: ~$0.01-0.05 per minute of audio
- Diarization: ~$0.02-0.10 per minute of audio
Troubleshooting
"Modal CLI not installed"
uv tool install modal
"Not authenticated with Modal"
modal setup
# Complete browser authentication
"Failed to create secret hf_token"
- Verify your HuggingFace token is valid
- Ensure you've accepted the Pyannote license
- Token needs
readpermission
Deployment fails
Check the Modal dashboard for detailed error logs:
- Visit https://modal.com/apps
- Click on the failed function
- View build and runtime logs
Re-running deployment
The script is safe to re-run. It will:
- Update existing secrets if they exist
- Redeploy functions with latest code
- Output new configuration (API key stays the same if secret exists)
Manual Deployment (Advanced)
If you prefer to deploy functions individually:
cd gpu/modal_deployments
# Create secrets manually
modal secret create hf_token HF_TOKEN=your-hf-token
modal secret create reflector-gpu REFLECTOR_GPU_APIKEY=$(openssl rand -hex 32)
# Deploy each function
modal deploy reflector_transcriber.py
modal deploy reflector_diarizer.py
Monitoring
View your deployed functions and their usage:
- Modal Dashboard: https://modal.com/apps
- Function logs: Click on any function to view logs
- Usage: View compute time and costs in the dashboard