mirror of https://github.com/Monadical-SAS/reflector.git
synced 2025-12-21 12:49:06 +00:00

Commit: doc review round
````diff
@@ -26,13 +26,15 @@ flowchart LR
 
 Before starting, you need:
 
-- [ ] **Production server** - Ubuntu 22.04+, 4+ cores, 8GB+ RAM, public IP
+- **Production server** - 4+ cores, 8GB+ RAM, public IP
-- [ ] **Two domain names** - e.g., `app.example.com` (frontend) and `api.example.com` (backend)
+- **Two domain names** - e.g., `app.example.com` (frontend) and `api.example.com` (backend)
-- [ ] **GPU processing** - Choose one:
+- **GPU processing** - Choose one:
-  - Modal.com account (free tier at https://modal.com), OR
+  - Modal.com account, OR
   - GPU server with NVIDIA GPU (8GB+ VRAM)
-- [ ] **HuggingFace account** - Free at https://huggingface.co
+- **HuggingFace account** - Free at https://huggingface.co
-- [ ] **OpenAI API key** - For summaries and topic detection at https://platform.openai.com/account/api-keys
+- **LLM API** - For summaries and topic detection. Choose one:
+  - OpenAI API key at https://platform.openai.com/account/api-keys, OR
+  - Any OpenAI-compatible endpoint (vLLM, LiteLLM, Ollama, etc.)
 
 ### Optional (for live meeting rooms)
 
````
````diff
@@ -41,52 +43,40 @@ Before starting, you need:
 
 ---
 
-## Step 1: Configure DNS
+## Configure DNS
 
-**Location: Your domain registrar / DNS provider**
-
-Create A records pointing to your server:
 ```
 Type: A  Name: app  Value: <your-server-ip>
 Type: A  Name: api  Value: <your-server-ip>
 ```
 
-Verify propagation (wait a few minutes):
-```bash
-dig app.example.com +short
-dig api.example.com +short
-# Both should return your server IP
-```
-
 ---
 
-## Step 2: Deploy GPU Processing
+## Deploy GPU Processing
 
-Reflector requires GPU processing for transcription (Whisper) and speaker diarization (Pyannote). Choose one option:
+Reflector requires GPU processing for transcription and speaker diarization. Choose one option:
 
 | | **Modal.com (Cloud)** | **Self-Hosted GPU** |
 |---|---|---|
 | **Best for** | No GPU hardware, zero maintenance | Own GPU server, full control |
-| **Pricing** | Pay-per-use (~$0.01-0.10/min audio) | Fixed infrastructure cost |
+| **Pricing** | Pay-per-use | Fixed infrastructure cost |
-| **Setup** | Run from laptop (browser auth) | Run on GPU server |
-| **Scaling** | Automatic | Manual |
 
 ### Option A: Modal.com (Serverless Cloud GPU)
 
-**Location: YOUR LOCAL COMPUTER (laptop/desktop)**
-
-Modal requires browser authentication, so this runs locally - not on your server.
-
 #### Accept HuggingFace Licenses
 
 Visit both pages and click "Accept":
 - https://huggingface.co/pyannote/speaker-diarization-3.1
 - https://huggingface.co/pyannote/segmentation-3.0
 
-Then generate a token at https://huggingface.co/settings/tokens
+Generate a token at https://huggingface.co/settings/tokens
 
 #### Deploy to Modal
 
+An install script helps with this setup; it uses the Modal API to set up all the necessary moving parts.
+
+Alternatively, everything the script does can be performed by hand in the Modal UI.
+
 ```bash
 pip install modal
 modal setup  # opens browser for authentication
````
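The hunk above drops the `dig`-based propagation check. If a scripted check is still wanted, here is a minimal sketch; it keeps the comparison pure so it can be tested without network access, and the hostnames and IPs below are placeholders:

```python
import socket

def records_match(resolved_ips, expected_ip):
    """Pure check: does the resolved A-record set contain the server IP?"""
    return expected_ip in resolved_ips

def check_domain(hostname, expected_ip, resolver=None):
    """Resolve hostname and compare against the expected server IP."""
    resolver = resolver or (lambda h: socket.gethostbyname_ex(h)[2])
    try:
        return records_match(resolver(hostname), expected_ip)
    except socket.gaierror:
        return False  # DNS not propagated (or typo in the record) yet

# Example with an injected fake resolver, so no network is needed:
fake = lambda h: ["203.0.113.10"]
print(check_domain("app.example.com", "203.0.113.10", resolver=fake))  # True
```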
````diff
@@ -96,7 +86,7 @@ cd reflector/gpu/modal_deployments
 ./deploy-all.sh --hf-token YOUR_HUGGINGFACE_TOKEN
 ```
 
-**Save the output** - copy the configuration block, you'll need it for Step 4.
+**Save the output** - copy the configuration block; you'll need it soon.
 
 See [Modal Setup](./modal-setup) for troubleshooting and details.
 
````
````diff
@@ -114,13 +104,13 @@ See [Self-Hosted GPU Setup](./self-hosted-gpu-setup) for complete instructions.
 4. Start service (Docker compose or systemd)
 5. Set up Caddy reverse proxy for HTTPS
 
-**Save your API key and HTTPS URL** - you'll need them for Step 4.
+**Save your API key and HTTPS URL** - you'll need them soon.
 
 ---
 
-## Step 3: Prepare Server
+## Prepare Server
 
-**Location: YOUR SERVER (via SSH)**
+**Location: dedicated Reflector server**
 
 ### Install Docker
 
````
````diff
@@ -150,7 +140,7 @@ cd reflector
 
 ---
 
-## Step 4: Configure Environment
+## Configure Environment
 
 **Location: YOUR SERVER (via SSH, in the `reflector` directory)**
 
````
````diff
@@ -183,7 +173,7 @@ CORS_ALLOW_CREDENTIALS=true
 # Secret key - generate with: openssl rand -hex 32
 SECRET_KEY=<your-generated-secret>
 
-# GPU Processing - choose ONE option from Step 2:
+# GPU Processing - choose ONE option:
 
 # Option A: Modal.com (paste from deploy-all.sh output)
 TRANSCRIPT_BACKEND=modal
````
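The env comment above points at `openssl rand -hex 32` for `SECRET_KEY`. The Python standard library can produce an equivalent value; this is just a convenience sketch, not something Reflector itself requires:

```python
import secrets

def make_secret_key(nbytes: int = 32) -> str:
    """Equivalent of `openssl rand -hex 32`: nbytes of randomness as hex."""
    return secrets.token_hex(nbytes)

key = make_secret_key()
print(len(key))  # 64 hex characters for 32 random bytes
```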
````diff
@@ -208,7 +198,7 @@ TRANSCRIPT_STORAGE_BACKEND=local
 LLM_API_KEY=sk-your-openai-api-key
 LLM_MODEL=gpt-4o-mini
 
-# Auth - disable for initial setup (see Step 8 for authentication)
+# Auth - disable for initial setup (see the dedicated authentication step)
 AUTH_BACKEND=none
 ```
 
````
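The `LLM_API_KEY`/`LLM_MODEL` settings above follow the OpenAI convention, which is why vLLM, LiteLLM, or Ollama can stand in for OpenAI. A sketch of what "OpenAI-compatible" means at the HTTP level — the endpoint path and payload shape are the chat-completions convention, and the local URL and model names are assumptions:

```python
import json
from urllib.request import Request

def chat_request(base_url, api_key, model, prompt):
    """Build an OpenAI-style chat-completions request (constructed, not sent)."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return Request(
        f"{base_url.rstrip('/')}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

# Hosted OpenAI vs. a hypothetical local vLLM server - only base_url changes:
openai_req = chat_request("https://api.openai.com/v1", "sk-...", "gpt-4o-mini", "hi")
local_req = chat_request("http://localhost:8000/v1", "unused", "llama-3.1-8b", "hi")
print(openai_req.full_url)  # https://api.openai.com/v1/chat/completions
```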
````diff
@@ -237,7 +227,7 @@ FEATURE_REQUIRE_LOGIN=false
 
 ---
 
-## Step 5: Configure Caddy
+## Configure Caddy
 
 **Location: YOUR SERVER (via SSH)**
 
````
````diff
@@ -260,7 +250,7 @@ Replace `example.com` with your domains. The `{$VAR:default}` syntax uses Caddy'
 
 ---
 
-## Step 6: Start Services
+## Start Services
 
 **Location: YOUR SERVER (via SSH)**
 
````
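For the `{$VAR:default}` syntax mentioned in the hunk header above, a minimal two-domain Caddyfile sketch — the upstream service names and ports are assumptions; adjust them to the compose file actually in use:

```
{$FRONTEND_DOMAIN:app.example.com} {
	reverse_proxy web:3000
}

{$BACKEND_DOMAIN:api.example.com} {
	reverse_proxy server:8000
}
```

With no environment variables set, Caddy serves the defaults after the `:`; exporting `FRONTEND_DOMAIN`/`BACKEND_DOMAIN` overrides them without editing the file.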
````diff
@@ -280,7 +270,7 @@ docker compose -f docker-compose.prod.yml exec server uv run alembic upgrade hea
 
 ---
 
-## Step 7: Verify Deployment
+## Verify Deployment
 
 ### Check services
 ```bash
````
````diff
@@ -307,9 +297,9 @@ curl https://api.example.com/health
 
 ---
 
-## Step 8: Enable Authentication (Required for Live Rooms)
+## Enable Authentication (Required for Live Rooms)
 
-By default, Reflector is open (no login required). **Authentication is required if you want to use Live Meeting Rooms (Step 9).**
+By default, Reflector is open (no login required). **Authentication is required if you want to use Live Meeting Rooms.**
 
 See [Authentication Setup](./auth-setup) for full Authentik OAuth configuration.
 
````
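The hunk header above checks `https://api.example.com/health` with `curl`; right after `docker compose up` the API can take a moment to come up, so a retry loop is handy. A sketch with the probe injected, so the retry logic itself is testable offline (a real probe would hit the health URL with `urllib` or `curl`):

```python
import time

def wait_for_health(probe, attempts=30, delay=2.0, sleep=time.sleep):
    """Retry `probe()` until it returns True or attempts run out."""
    for i in range(attempts):
        if probe():
            return True
        if i < attempts - 1:
            sleep(delay)
    return False

# Fake probe that succeeds on the third call - no network needed:
calls = {"n": 0}
def fake_probe():
    calls["n"] += 1
    return calls["n"] >= 3

print(wait_for_health(fake_probe, attempts=5, delay=0, sleep=lambda s: None))  # True
```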
````diff
@@ -323,9 +313,9 @@ Quick summary:
 
 ---
 
-## Step 9: Enable Live Meeting Rooms
+## Enable Live Meeting Rooms
 
-**Requires: Step 8 (Authentication)**
+**Requires: Authentication Step**
 
 Live rooms require Daily.co and AWS S3. See [Daily.co Setup](./daily-setup) for complete S3/IAM configuration instructions.
 
````
````diff
@@ -43,7 +43,6 @@ Your main Reflector server connects to this service exactly like it connects to
 - Systemd method: 25-30GB minimum
 
 ### Software
-- Ubuntu 22.04 or 24.04
 - Public IP address
 - Domain name with DNS A record pointing to server
 
````
````diff
@@ -55,34 +54,6 @@ Your main Reflector server connects to this service exactly like it connects to
 
 ## Choose Deployment Method
 
-### Docker Deployment (Recommended)
-
-**Pros:**
-- Container isolation and reproducibility
-- No manual library path configuration
-- Easier to replicate across servers
-- Built-in restart policies
-- Simpler dependency management
-
-**Cons:**
-- Higher disk usage (~15GB for container)
-- Requires 40-50GB disk minimum
-
-**Best for:** Teams wanting reproducible deployments, multiple GPU servers
-
-### Systemd Deployment
-
-**Pros:**
-- Lower disk usage (~8GB total)
-- Direct GPU access (no container layer)
-- Works on smaller disks (25-30GB)
-
-**Cons:**
-- Manual `LD_LIBRARY_PATH` configuration
-- Less portable across systems
-
-**Best for:** Single GPU server, limited disk space
-
 ---
 
 ## Docker Deployment
````
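One of the Docker pros removed above is "built-in restart policies"; combined with Compose's GPU device reservations, the relevant compose fragment looks roughly like this (the service and image names are hypothetical, not taken from the repo):

```yaml
services:
  reflector-gpu:
    image: reflector-gpu:latest   # hypothetical image tag
    restart: unless-stopped       # the built-in restart policy
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```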
````diff
@@ -422,16 +393,6 @@ watch -n 1 nvidia-smi
 
 ---
 
-## Performance Notes
-
-**Tesla T4 benchmarks:**
-- Transcription: ~2-3x real-time (10 min audio in 3-5 min)
-- Diarization: ~1.5x real-time
-- Max concurrent requests: 2-3 (depends on audio length)
-- First request warmup: ~10 seconds (model loading)
-
----
-
 ## Troubleshooting
 
 ### nvidia-smi fails after driver install
````
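The removed benchmark figures boil down to a real-time factor, and turning one into a wall-clock estimate is simple arithmetic; the 2.5x below is just the midpoint of the removed "~2-3x" claim, not a measured number:

```python
def processing_minutes(audio_minutes, realtime_factor):
    """Estimated wall-clock minutes for a job at `realtime_factor` x real-time."""
    return audio_minutes / realtime_factor

# 10 minutes of audio at ~2.5x real-time:
print(processing_minutes(10, 2.5))  # 4.0
```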
````diff
@@ -483,16 +444,6 @@ sudo docker compose logs
 
 ---
 
-## Security Considerations
-
-1. **API Key**: Keep `REFLECTOR_GPU_APIKEY` secret, rotate periodically
-2. **HuggingFace Token**: Treat as password, never commit to git
-3. **Firewall**: Only expose ports 80 and 443 publicly
-4. **Updates**: Regularly update system packages
-5. **Monitoring**: Set up alerts for service failures
-
----
-
 ## Updating
 
 ### Docker
````
````diff
@@ -6,7 +6,7 @@ usage() {
 echo "Usage: $0 [OPTIONS]"
 echo ""
 echo "Options:"
-echo "  --hf-token TOKEN    HuggingFace token for Pyannote model"
+echo "  --hf-token TOKEN    HuggingFace token"
 echo "  --help              Show this help message"
 echo ""
 echo "Examples:"
````
````diff
@@ -88,7 +88,7 @@ if [[ ! "$HF_TOKEN" =~ ^hf_ ]]; then
 fi
 fi
 
-# --- Auto-generate API Key ---
+# --- Auto-generate reflector<->GPU API Key ---
 echo ""
 echo "Generating API key for GPU services..."
 API_KEY=$(openssl rand -hex 32)
````