doc review round

This commit is contained in:
Igor Loskutov
2025-12-09 12:11:22 -05:00
parent 2b3f28993f
commit d890061056
3 changed files with 33 additions and 92 deletions

View File

@@ -26,13 +26,15 @@ flowchart LR
Before starting, you need:
- [ ] **Production server** - Ubuntu 22.04+, 4+ cores, 8GB+ RAM, public IP
- [ ] **Two domain names** - e.g., `app.example.com` (frontend) and `api.example.com` (backend)
- [ ] **GPU processing** - Choose one:
- Modal.com account (free tier at https://modal.com), OR
- **Production server** - 4+ cores, 8GB+ RAM, public IP
- **Two domain names** - e.g., `app.example.com` (frontend) and `api.example.com` (backend)
- **GPU processing** - Choose one:
- Modal.com account, OR
- GPU server with NVIDIA GPU (8GB+ VRAM)
- [ ] **HuggingFace account** - Free at https://huggingface.co
- [ ] **OpenAI API key** - For summaries and topic detection at https://platform.openai.com/account/api-keys
- **HuggingFace account** - Free at https://huggingface.co
- **LLM API** - For summaries and topic detection. Choose one:
- OpenAI API key at https://platform.openai.com/account/api-keys, OR
- Any OpenAI-compatible endpoint (vLLM, LiteLLM, Ollama, etc.)
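For the OpenAI-compatible option, the pattern is usually the same variables with different values plus a base URL pointing at your endpoint. A sketch (the variable names here are assumptions; confirm against Reflector's environment reference):

```
# sketch: pointing Reflector at a self-hosted OpenAI-compatible endpoint
# (exact variable names are assumptions; check the environment reference)
LLM_API_KEY=anything            # many self-hosted endpoints ignore the key
LLM_MODEL=llama3.1:8b           # whatever model your endpoint serves
# LLM_BASE_URL=http://localhost:11434/v1   # e.g. Ollama's OpenAI-compatible API
```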
### Optional (for live meeting rooms)
@@ -41,52 +43,40 @@ Before starting, you need:
---
## Step 1: Configure DNS
## Configure DNS
**Location: Your domain registrar / DNS provider**
Create A records pointing to your server:
```
Type: A Name: app Value: <your-server-ip>
Type: A Name: api Value: <your-server-ip>
```
Verify propagation (wait a few minutes):
```bash
dig app.example.com +short
dig api.example.com +short
# Both should return your server IP
```
---
## Step 2: Deploy GPU Processing
## Deploy GPU Processing
Reflector requires GPU processing for transcription (Whisper) and speaker diarization (Pyannote). Choose one option:
Reflector requires GPU processing for transcription and speaker diarization. Choose one option:
| | **Modal.com (Cloud)** | **Self-Hosted GPU** |
|---|---|---|
| **Best for** | No GPU hardware, zero maintenance | Own GPU server, full control |
| **Pricing** | Pay-per-use (~$0.01-0.10/min audio) | Fixed infrastructure cost |
| **Setup** | Run from laptop (browser auth) | Run on GPU server |
| **Scaling** | Automatic | Manual |
| **Pricing** | Pay-per-use | Fixed infrastructure cost |
### Option A: Modal.com (Serverless Cloud GPU)
**Location: YOUR LOCAL COMPUTER (laptop/desktop)**
Modal requires browser authentication, so this runs locally - not on your server.
#### Accept HuggingFace Licenses
Visit both pages and click "Accept":
- https://huggingface.co/pyannote/speaker-diarization-3.1
- https://huggingface.co/pyannote/segmentation-3.0
Then generate a token at https://huggingface.co/settings/tokens
Generate a token at https://huggingface.co/settings/tokens
#### Deploy to Modal
There's an install script to help with this setup. It uses the Modal API to set up all the necessary moving parts.
As an alternative, everything the script does can be performed manually through the Modal UI settings.
```bash
pip install modal
modal setup # opens browser for authentication
@@ -96,7 +86,7 @@ cd reflector/gpu/modal_deployments
./deploy-all.sh --hf-token YOUR_HUGGINGFACE_TOKEN
```
**Save the output** - copy the configuration block, you'll need it for Step 4.
**Save the output** - copy the configuration block, you'll need it soon.
See [Modal Setup](./modal-setup) for troubleshooting and details.
@@ -114,13 +104,13 @@ See [Self-Hosted GPU Setup](./self-hosted-gpu-setup) for complete instructions.
4. Start service (Docker compose or systemd)
5. Set up Caddy reverse proxy for HTTPS
**Save your API key and HTTPS URL** - you'll need them for Step 4.
**Save your API key and HTTPS URL** - you'll need them soon.
---
## Step 3: Prepare Server
## Prepare Server
**Location: YOUR SERVER (via SSH)**
**Location: dedicated Reflector server**
### Install Docker
@@ -150,7 +140,7 @@ cd reflector
---
## Step 4: Configure Environment
## Configure Environment
**Location: YOUR SERVER (via SSH, in the `reflector` directory)**
@@ -183,7 +173,7 @@ CORS_ALLOW_CREDENTIALS=true
# Secret key - generate with: openssl rand -hex 32
SECRET_KEY=<your-generated-secret>
# GPU Processing - choose ONE option from Step 2:
# GPU Processing - choose ONE option:
# Option A: Modal.com (paste from deploy-all.sh output)
TRANSCRIPT_BACKEND=modal
@@ -208,7 +198,7 @@ TRANSCRIPT_STORAGE_BACKEND=local
LLM_API_KEY=sk-your-openai-api-key
LLM_MODEL=gpt-4o-mini
# Auth - disable for initial setup (see Step 8 for authentication)
# Auth - disable for initial setup (see the authentication step below)
AUTH_BACKEND=none
```
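The `SECRET_KEY` comment above can be turned into a one-liner. A sketch (appending to `.env` assumes you run it from the `reflector` directory):

```shell
# generate a 32-byte secret (64 hex characters) and append it to .env
SECRET_KEY=$(openssl rand -hex 32)
echo "SECRET_KEY=${SECRET_KEY}" >> .env
echo "wrote ${#SECRET_KEY}-char secret to .env"
```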
@@ -237,7 +227,7 @@ FEATURE_REQUIRE_LOGIN=false
---
## Step 5: Configure Caddy
## Configure Caddy
**Location: YOUR SERVER (via SSH)**
@@ -260,7 +250,7 @@ Replace `example.com` with your domains. The `{$VAR:default}` syntax uses Caddy'
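The Caddyfile itself is abbreviated in this diff; the general shape is one site block per domain, each reverse-proxying to the matching container. The service names and ports below are assumptions, so use the ones from `docker-compose.prod.yml`:

```
app.example.com {
    reverse_proxy frontend:3000
}

api.example.com {
    reverse_proxy server:8000
}
```

Caddy provisions Let's Encrypt certificates automatically for any site block with a real domain, which is why the DNS records need to be in place first.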
---
## Step 6: Start Services
## Start Services
**Location: YOUR SERVER (via SSH)**
@@ -280,7 +270,7 @@ docker compose -f docker-compose.prod.yml exec server uv run alembic upgrade hea
---
## Step 7: Verify Deployment
## Verify Deployment
### Check services
```bash
@@ -307,9 +297,9 @@ curl https://api.example.com/health
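Right after startup the API can take a few seconds before `/health` responds, so a one-shot `curl` can give a false negative. A small retry helper (hypothetical, not part of the repo) polls until the check passes:

```shell
# retry a command until it succeeds or attempts run out
wait_for() {
  local attempts=$1; shift
  local i
  for i in $(seq 1 "$attempts"); do
    "$@" && return 0
    sleep 1
  done
  return 1
}

# e.g. wait up to ~30s for the API:
# wait_for 30 curl -fsS https://api.example.com/health
```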
---
## Step 8: Enable Authentication (Required for Live Rooms)
## Enable Authentication (Required for Live Rooms)
By default, Reflector is open (no login required). **Authentication is required if you want to use Live Meeting Rooms (Step 9).**
By default, Reflector is open (no login required). **Authentication is required if you want to use Live Meeting Rooms.**
See [Authentication Setup](./auth-setup) for full Authentik OAuth configuration.
@@ -323,9 +313,9 @@ Quick summary:
---
## Step 9: Enable Live Meeting Rooms
## Enable Live Meeting Rooms
**Requires: Step 8 (Authentication)**
**Requires: the Authentication step**
Live rooms require Daily.co and AWS S3. See [Daily.co Setup](./daily-setup) for complete S3/IAM configuration instructions.

View File

@@ -43,7 +43,6 @@ Your main Reflector server connects to this service exactly like it connects to
- Systemd method: 25-30GB minimum
### Software
- Ubuntu 22.04 or 24.04
- Public IP address
- Domain name with DNS A record pointing to server
@@ -55,34 +54,6 @@ Your main Reflector server connects to this service exactly like it connects to
## Choose Deployment Method
### Docker Deployment (Recommended)
**Pros:**
- Container isolation and reproducibility
- No manual library path configuration
- Easier to replicate across servers
- Built-in restart policies
- Simpler dependency management
**Cons:**
- Higher disk usage (~15GB for container)
- Requires 40-50GB disk minimum
**Best for:** Teams wanting reproducible deployments, multiple GPU servers
### Systemd Deployment
**Pros:**
- Lower disk usage (~8GB total)
- Direct GPU access (no container layer)
- Works on smaller disks (25-30GB)
**Cons:**
- Manual `LD_LIBRARY_PATH` configuration
- Less portable across systems
**Best for:** Single GPU server, limited disk space
---
## Docker Deployment
@@ -422,16 +393,6 @@ watch -n 1 nvidia-smi
---
## Performance Notes
**Tesla T4 benchmarks:**
- Transcription: ~2-3x real-time (10 min audio in 3-5 min)
- Diarization: ~1.5x real-time
- Max concurrent requests: 2-3 (depends on audio length)
- First request warmup: ~10 seconds (model loading)
---
## Troubleshooting
### nvidia-smi fails after driver install
@@ -483,16 +444,6 @@ sudo docker compose logs
---
## Security Considerations
1. **API Key**: Keep `REFLECTOR_GPU_APIKEY` secret, rotate periodically
2. **HuggingFace Token**: Treat as password, never commit to git
3. **Firewall**: Only expose ports 80 and 443 publicly
4. **Updates**: Regularly update system packages
5. **Monitoring**: Set up alerts for service failures
---
## Updating
### Docker

View File

@@ -6,7 +6,7 @@ usage() {
echo "Usage: $0 [OPTIONS]"
echo ""
echo "Options:"
echo " --hf-token TOKEN HuggingFace token for Pyannote model"
echo " --hf-token TOKEN HuggingFace token"
echo " --help Show this help message"
echo ""
echo "Examples:"
@@ -88,7 +88,7 @@ if [[ ! "$HF_TOKEN" =~ ^hf_ ]]; then
fi
fi
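The `^hf_` regex above works because HuggingFace user access tokens are prefixed with `hf_`. Extracted as a standalone check (sketch):

```shell
# return non-zero when a token does not look like a HuggingFace token
check_hf_token() {
  if [[ "$1" =~ ^hf_ ]]; then
    return 0
  fi
  echo "warning: token does not start with hf_" >&2
  return 1
}
```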
# --- Auto-generate API Key ---
# --- Auto-generate reflector<->GPU API Key ---
echo ""
echo "Generating API key for GPU services..."
API_KEY=$(openssl rand -hex 32)