doc review round

This commit is contained in:
Igor Loskutov
2025-12-09 12:11:22 -05:00
parent 2b3f28993f
commit d890061056
3 changed files with 33 additions and 92 deletions

View File

@@ -26,13 +26,15 @@ flowchart LR
Before starting, you need:
- [ ] **Production server** - Ubuntu 22.04+, 4+ cores, 8GB+ RAM, public IP
- [ ] **Two domain names** - e.g., `app.example.com` (frontend) and `api.example.com` (backend)
- [ ] **GPU processing** - Choose one:
- Modal.com account (free tier at https://modal.com), OR
- **Production server** - 4+ cores, 8GB+ RAM, public IP
- **Two domain names** - e.g., `app.example.com` (frontend) and `api.example.com` (backend)
- **GPU processing** - Choose one:
- Modal.com account, OR
- GPU server with NVIDIA GPU (8GB+ VRAM)
- [ ] **HuggingFace account** - Free at https://huggingface.co
- [ ] **OpenAI API key** - For summaries and topic detection at https://platform.openai.com/account/api-keys
- **HuggingFace account** - Free at https://huggingface.co
- **LLM API** - For summaries and topic detection. Choose one:
- OpenAI API key at https://platform.openai.com/account/api-keys, OR
- Any OpenAI-compatible endpoint (vLLM, LiteLLM, Ollama, etc.)
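For the OpenAI-compatible option, the pattern is usually the same variables with different values plus a base URL pointing at your endpoint. A sketch (the variable names here are assumptions; confirm against Reflector's environment reference):

```
# sketch: pointing Reflector at a self-hosted OpenAI-compatible endpoint
# (exact variable names are assumptions; check the environment reference)
LLM_API_KEY=anything            # many self-hosted endpoints ignore the key
LLM_MODEL=llama3.1:8b           # whatever model your endpoint serves
# LLM_BASE_URL=http://localhost:11434/v1   # e.g. Ollama's OpenAI-compatible API
```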
### Optional (for live meeting rooms)
@@ -41,52 +43,40 @@ Before starting, you need:
---
## Step 1: Configure DNS
## Configure DNS
**Location: Your domain registrar / DNS provider**
Create A records pointing to your server:
```
Type: A Name: app Value: <your-server-ip>
Type: A Name: api Value: <your-server-ip>
```
Verify propagation (wait a few minutes):
```bash
dig app.example.com +short
dig api.example.com +short
# Both should return your server IP
```
---
## Step 2: Deploy GPU Processing
## Deploy GPU Processing
Reflector requires GPU processing for transcription (Whisper) and speaker diarization (Pyannote). Choose one option:
Reflector requires GPU processing for transcription and speaker diarization. Choose one option:
| | **Modal.com (Cloud)** | **Self-Hosted GPU** |
|---|---|---|
| **Best for** | No GPU hardware, zero maintenance | Own GPU server, full control |
| **Pricing** | Pay-per-use (~$0.01-0.10/min audio) | Fixed infrastructure cost |
| **Setup** | Run from laptop (browser auth) | Run on GPU server |
| **Scaling** | Automatic | Manual |
| **Pricing** | Pay-per-use | Fixed infrastructure cost |
### Option A: Modal.com (Serverless Cloud GPU)
**Location: YOUR LOCAL COMPUTER (laptop/desktop)**
Modal requires browser authentication, so this runs locally - not on your server.
#### Accept HuggingFace Licenses
Visit both pages and click "Accept":
- https://huggingface.co/pyannote/speaker-diarization-3.1
- https://huggingface.co/pyannote/segmentation-3.0
Then generate a token at https://huggingface.co/settings/tokens
Generate a token at https://huggingface.co/settings/tokens
#### Deploy to Modal
There's an install script to help with this setup. It uses the Modal API to set up all the necessary moving parts.
As an alternative, everything the script does can be performed manually through the Modal UI settings.
```bash
pip install modal
modal setup # opens browser for authentication
@@ -96,7 +86,7 @@ cd reflector/gpu/modal_deployments
./deploy-all.sh --hf-token YOUR_HUGGINGFACE_TOKEN
```
**Save the output** - copy the configuration block, you'll need it for Step 4.
**Save the output** - copy the configuration block, you'll need it soon.
See [Modal Setup](./modal-setup) for troubleshooting and details.
@@ -114,13 +104,13 @@ See [Self-Hosted GPU Setup](./self-hosted-gpu-setup) for complete instructions.
4. Start service (Docker compose or systemd)
5. Set up Caddy reverse proxy for HTTPS
**Save your API key and HTTPS URL** - you'll need them for Step 4.
**Save your API key and HTTPS URL** - you'll need them soon.
---
## Step 3: Prepare Server
## Prepare Server
**Location: YOUR SERVER (via SSH)**
**Location: dedicated Reflector server**
### Install Docker
@@ -150,7 +140,7 @@ cd reflector
---
## Step 4: Configure Environment
## Configure Environment
**Location: YOUR SERVER (via SSH, in the `reflector` directory)**
@@ -183,7 +173,7 @@ CORS_ALLOW_CREDENTIALS=true
# Secret key - generate with: openssl rand -hex 32
SECRET_KEY=<your-generated-secret>
# GPU Processing - choose ONE option from Step 2:
# GPU Processing - choose ONE option:
# Option A: Modal.com (paste from deploy-all.sh output)
TRANSCRIPT_BACKEND=modal
@@ -208,7 +198,7 @@ TRANSCRIPT_STORAGE_BACKEND=local
LLM_API_KEY=sk-your-openai-api-key
LLM_MODEL=gpt-4o-mini
# Auth - disable for initial setup (see Step 8 for authentication)
# Auth - disable for initial setup (see the authentication step below)
AUTH_BACKEND=none
```
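The `SECRET_KEY` comment above can be turned into a one-liner. A sketch (appending to `.env` assumes you run it from the `reflector` directory):

```shell
# generate a 32-byte secret (64 hex characters) and append it to .env
SECRET_KEY=$(openssl rand -hex 32)
echo "SECRET_KEY=${SECRET_KEY}" >> .env
echo "wrote ${#SECRET_KEY}-char secret to .env"
```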
@@ -237,7 +227,7 @@ FEATURE_REQUIRE_LOGIN=false
---
## Step 5: Configure Caddy
## Configure Caddy
**Location: YOUR SERVER (via SSH)**
@@ -260,7 +250,7 @@ Replace `example.com` with your domains. The `{$VAR:default}` syntax uses Caddy'
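The Caddyfile itself is abbreviated in this diff; the general shape is one site block per domain, each reverse-proxying to the matching container. The service names and ports below are assumptions, so use the ones from `docker-compose.prod.yml`:

```
app.example.com {
    reverse_proxy frontend:3000
}

api.example.com {
    reverse_proxy server:8000
}
```

Caddy provisions Let's Encrypt certificates automatically for any site block with a real domain, which is why the DNS records need to be in place first.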
---
## Step 6: Start Services
## Start Services
**Location: YOUR SERVER (via SSH)**
@@ -280,7 +270,7 @@ docker compose -f docker-compose.prod.yml exec server uv run alembic upgrade hea
---
## Step 7: Verify Deployment
## Verify Deployment
### Check services
```bash
@@ -307,9 +297,9 @@ curl https://api.example.com/health
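Right after startup the API can take a few seconds before `/health` responds, so a one-shot `curl` can give a false negative. A small retry helper (hypothetical, not part of the repo) polls until the check passes:

```shell
# retry a command until it succeeds or attempts run out
wait_for() {
  local attempts=$1; shift
  local i
  for i in $(seq 1 "$attempts"); do
    "$@" && return 0
    sleep 1
  done
  return 1
}

# e.g. wait up to ~30s for the API:
# wait_for 30 curl -fsS https://api.example.com/health
```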
---
## Step 8: Enable Authentication (Required for Live Rooms)
## Enable Authentication (Required for Live Rooms)
By default, Reflector is open (no login required). **Authentication is required if you want to use Live Meeting Rooms (Step 9).**
By default, Reflector is open (no login required). **Authentication is required if you want to use Live Meeting Rooms.**
See [Authentication Setup](./auth-setup) for full Authentik OAuth configuration.
@@ -323,9 +313,9 @@ Quick summary:
---
## Step 9: Enable Live Meeting Rooms
## Enable Live Meeting Rooms
**Requires: Step 8 (Authentication)**
**Requires: the Authentication step**
Live rooms require Daily.co and AWS S3. See [Daily.co Setup](./daily-setup) for complete S3/IAM configuration instructions.

View File

@@ -43,7 +43,6 @@ Your main Reflector server connects to this service exactly like it connects to
- Systemd method: 25-30GB minimum
### Software
- Ubuntu 22.04 or 24.04
- Public IP address
- Domain name with DNS A record pointing to server
@@ -55,34 +54,6 @@ Your main Reflector server connects to this service exactly like it connects to
## Choose Deployment Method
### Docker Deployment (Recommended)
**Pros:**
- Container isolation and reproducibility
- No manual library path configuration
- Easier to replicate across servers
- Built-in restart policies
- Simpler dependency management
**Cons:**
- Higher disk usage (~15GB for container)
- Requires 40-50GB disk minimum
**Best for:** Teams wanting reproducible deployments, multiple GPU servers
### Systemd Deployment
**Pros:**
- Lower disk usage (~8GB total)
- Direct GPU access (no container layer)
- Works on smaller disks (25-30GB)
**Cons:**
- Manual `LD_LIBRARY_PATH` configuration
- Less portable across systems
**Best for:** Single GPU server, limited disk space
---
## Docker Deployment
@@ -422,16 +393,6 @@ watch -n 1 nvidia-smi
---
## Performance Notes
**Tesla T4 benchmarks:**
- Transcription: ~2-3x real-time (10 min audio in 3-5 min)
- Diarization: ~1.5x real-time
- Max concurrent requests: 2-3 (depends on audio length)
- First request warmup: ~10 seconds (model loading)
---
## Troubleshooting
### nvidia-smi fails after driver install
@@ -483,16 +444,6 @@ sudo docker compose logs
---
## Security Considerations
1. **API Key**: Keep `REFLECTOR_GPU_APIKEY` secret, rotate periodically
2. **HuggingFace Token**: Treat as password, never commit to git
3. **Firewall**: Only expose ports 80 and 443 publicly
4. **Updates**: Regularly update system packages
5. **Monitoring**: Set up alerts for service failures
---
## Updating
### Docker

View File

@@ -6,7 +6,7 @@ usage() {
echo "Usage: $0 [OPTIONS]"
echo ""
echo "Options:"
echo " --hf-token TOKEN HuggingFace token for Pyannote model"
echo " --hf-token TOKEN HuggingFace token"
echo " --help Show this help message"
echo ""
echo "Examples:"
@@ -88,7 +88,7 @@ if [[ ! "$HF_TOKEN" =~ ^hf_ ]]; then
fi
fi
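The `^hf_` regex above works because HuggingFace user access tokens are prefixed with `hf_`. Extracted as a standalone check (sketch):

```shell
# return non-zero when a token does not look like a HuggingFace token
check_hf_token() {
  if [[ "$1" =~ ^hf_ ]]; then
    return 0
  fi
  echo "warning: token does not start with hf_" >&2
  return 1
}
```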
# --- Auto-generate API Key ---
# --- Auto-generate reflector<->GPU API Key ---
echo ""
echo "Generating API key for GPU services..."
API_KEY=$(openssl rand -hex 32)