feat: custom ca for caddy (#931)

* fix: send email on transcript page permissions fixed
* feat: custom ca for caddy
2026-04-10 23:56:55 +00:00 · 2026-03-30 11:42:39 -05:00
parent bfaf4f403b
commit 12bf0c2d77
15 changed files with 1664 additions and 23 deletions
--- a/docsv2/custom-ca-setup.md
+++ b/docsv2/custom-ca-setup.md
@@ -0,0 +1,337 @@
+# Custom CA Certificate Setup
+
+Use a private Certificate Authority (CA) with Reflector self-hosted deployments. This covers two scenarios:
+
+1. **Custom local domain** — Serve Reflector over HTTPS on an internal domain (e.g., `reflector.local`) using certs signed by your own CA
+2. **Backend CA trust** — Let Reflector's backend services (server, workers, GPU) make HTTPS calls to GPU, LLM, or other internal services behind your private CA
+
+Both can be used independently or together.
+
+## Quick Start
+
+### Generate test certificates
+
+```bash
+./scripts/generate-certs.sh reflector.local
+```
+
+This creates `certs/` with:
+- `ca.key` + `ca.crt` — Root CA (10-year validity)
+- `server-key.pem` + `server.pem` — Server certificate (1-year, SAN: domain + localhost + 127.0.0.1)
+
+### Deploy with custom CA + domain
+
+```bash
+# Add domain to /etc/hosts on the server (use 127.0.0.1 for local, or server LAN IP for network access)
+echo "127.0.0.1 reflector.local" | sudo tee -a /etc/hosts
+
+# Run setup — pass the certs directory
+./scripts/setup-selfhosted.sh --gpu --caddy --domain reflector.local --custom-ca certs/
+
+# Trust the CA on your machine (see "Trust the CA" section below)
+```
+
+### Deploy with CA trust only (GPU/LLM behind private CA)
+
+```bash
+# Only need the CA cert file — no Caddy TLS certs needed
+./scripts/setup-selfhosted.sh --hosted --custom-ca /path/to/corporate-ca.crt
+```
+
+## How `--custom-ca` Works
+
+The flag accepts a **directory** or a **single file**:
+
+### Directory mode
+
+```bash
+--custom-ca certs/
+```
+
+Looks for these files by convention:
+- `ca.crt` (required) — CA certificate to trust
+- `server.pem` + `server-key.pem` (optional) — TLS certificate/key for Caddy
+
+If `server.pem` + `server-key.pem` are found AND `--domain` is provided:
+- Caddy serves HTTPS using those certs
+- Backend containers trust the CA for outbound calls
+
+If only `ca.crt` is found:
+- Backend containers trust the CA for outbound calls
+- Caddy is unaffected (uses Let's Encrypt, self-signed, or no Caddy)
+
+### Single file mode
+
+```bash
+--custom-ca /path/to/corporate-ca.crt
+```
+
+Only injects CA trust into backend containers. No Caddy TLS changes.
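+
+The resolution rules above can be sketched in Python. This only illustrates the documented conventions and is not the setup script's actual code; the function name `resolve_custom_ca` and the returned keys are hypothetical.
+
+```python
+from pathlib import Path
+
+
+def resolve_custom_ca(path, domain=None):
+    """Illustrative only: mirror the documented --custom-ca conventions."""
+    p = Path(path)
+    if p.is_file():
+        # Single file mode: CA trust only, no Caddy TLS changes.
+        return {"ca": p, "caddy_tls": None}
+    ca = p / "ca.crt"
+    if not ca.is_file():
+        raise FileNotFoundError(f"ca.crt is required in {p}")
+    cert, key = p / "server.pem", p / "server-key.pem"
+    # Caddy serves the custom cert only when both files exist AND a domain is set.
+    if cert.is_file() and key.is_file() and domain:
+        return {"ca": ca, "caddy_tls": (cert, key)}
+    return {"ca": ca, "caddy_tls": None}
+```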
+
+## Scenarios
+
+### Scenario 1: Custom local domain
+
+Your Reflector instance runs on an internal network. You want `https://reflector.local` with proper TLS (no browser warnings).
+
+```bash
+# 1. Generate certs
+./scripts/generate-certs.sh reflector.local
+
+# 2. Add to /etc/hosts on the server
+echo "127.0.0.1 reflector.local" | sudo tee -a /etc/hosts
+
+# 3. Deploy
+./scripts/setup-selfhosted.sh --gpu --garage --caddy --domain reflector.local --custom-ca certs/
+
+# 4. Trust the CA on your machine (see "Trust the CA" section below)
+```
+
+If other machines on the network need to access it, add the server's LAN IP to `/etc/hosts` on those machines instead:
+```bash
+echo "192.168.1.100 reflector.local" | sudo tee -a /etc/hosts
+```
+
+And include that IP as an extra SAN when generating certs:
+```bash
+./scripts/generate-certs.sh reflector.local "IP:192.168.1.100"
+```
+
+### Scenario 2: GPU/LLM behind corporate CA
+
+Your GPU or LLM server (e.g., `https://gpu.internal.corp`) uses certificates signed by your corporate CA. Reflector's backend needs to trust that CA for outbound HTTPS calls.
+
+```bash
+# Get the CA certificate from your IT team (PEM format)
+# Then deploy — Caddy can still use Let's Encrypt or self-signed
+./scripts/setup-selfhosted.sh --hosted --garage --caddy --custom-ca /path/to/corporate-ca.crt
+```
+
+This works because:
+- **TLS cert/key** = "this is my identity" — for Caddy to serve HTTPS to browsers
+- **CA cert** = "I trust this authority" — for backend containers to verify outbound connections
+
+Your Reflector frontend can use Let's Encrypt (public domain) or self-signed certs, while the backend trusts a completely different CA for GPU/LLM calls.
+
+### Scenario 3: Both combined (same CA)
+
+Custom domain + GPU/LLM all behind the same CA:
+
+```bash
+./scripts/generate-certs.sh reflector.local "DNS:gpu.local"
+./scripts/setup-selfhosted.sh --gpu --garage --caddy --domain reflector.local --custom-ca certs/
+```
+
+### Scenario 4: Multiple CAs (local domain + remote GPU on different CA)
+
+Your Reflector uses one CA for `reflector.local`, but the GPU host uses a different CA:
+
+```bash
+# Your local domain setup
+./scripts/generate-certs.sh reflector.local
+
+# Deploy with your CA + trust the GPU host's CA too
+./scripts/setup-selfhosted.sh --hosted --garage --caddy \
+    --domain reflector.local \
+    --custom-ca certs/ \
+    --extra-ca /path/to/gpu-machine-ca.crt
+```
+
+`--extra-ca` appends additional CA certs to the trust bundle. Backend containers trust ALL CAs — your local domain AND the GPU host's certs both work.
+
+You can repeat `--extra-ca` for multiple remote services:
+```bash
+--extra-ca /path/to/gpu-ca.crt --extra-ca /path/to/llm-ca.crt
+```
+
+For setting up a dedicated GPU host, see [Standalone GPU Host Setup](gpu-host-setup.md).
+
+## Trust the CA on Client Machines
+
+After deploying, clients need to trust the CA to avoid browser warnings.
+
+### macOS
+
+```bash
+sudo security add-trusted-cert -d -r trustRoot \
+    -k /Library/Keychains/System.keychain certs/ca.crt
+```
+
+### Linux (Ubuntu/Debian)
+
+```bash
+sudo cp certs/ca.crt /usr/local/share/ca-certificates/reflector-ca.crt
+sudo update-ca-certificates
+```
+
+### Linux (RHEL/Fedora)
+
+```bash
+sudo cp certs/ca.crt /etc/pki/ca-trust/source/anchors/reflector-ca.crt
+sudo update-ca-trust
+```
+
+### Windows (PowerShell as admin)
+
+```powershell
+Import-Certificate -FilePath .\certs\ca.crt -CertStoreLocation Cert:\LocalMachine\Root
+```
+
+### Firefox (all platforms)
+
+Firefox uses its own certificate store:
+1. Settings > Privacy & Security > View Certificates
+2. Authorities tab > Import
+3. Select `ca.crt` and check "Trust this CA to identify websites"
+
+## How It Works Internally
+
+### Docker entrypoint CA injection
+
+Each backend container (server, worker, beat, hatchet workers, GPU) has an entrypoint script (`docker-entrypoint.sh`) that:
+
+1. Checks if a CA cert is mounted at `/usr/local/share/ca-certificates/custom-ca.crt`
+2. If present, runs `update-ca-certificates` to create a **combined bundle** (system CAs + custom CA)
+3. Sets environment variables so all Python/gRPC libraries use the combined bundle:
+
+| Env var | Covers |
+|---------|--------|
+| `SSL_CERT_FILE` | httpx, OpenAI SDK, llama-index, Python ssl module |
+| `REQUESTS_CA_BUNDLE` | requests library (transitive dependencies) |
+| `CURL_CA_BUNDLE` | curl CLI (container healthchecks) |
+| `GRPC_DEFAULT_SSL_ROOTS_FILE_PATH` | grpcio (Hatchet gRPC client) |
+
+When no CA cert is mounted, the entrypoint is a no-op — containers behave exactly as before.
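+
+A minimal Python sketch of the decision the entrypoint makes (the real entrypoint is a shell script, so this is an illustration only). The bundle path `/etc/ssl/certs/ca-certificates.crt` is an assumption based on where Debian-based images write the output of `update-ca-certificates`; the function name is hypothetical.
+
+```python
+import os
+
+CUSTOM_CA = "/usr/local/share/ca-certificates/custom-ca.crt"
+# Assumption: Debian/Ubuntu location of the combined bundle.
+COMBINED_BUNDLE = "/etc/ssl/certs/ca-certificates.crt"
+
+
+def ca_environment(custom_ca=CUSTOM_CA, bundle=COMBINED_BUNDLE):
+    """Return the env vars to export, or {} when no CA cert is mounted."""
+    if not os.path.isfile(custom_ca):
+        return {}  # no-op: containers behave exactly as before
+    names = (
+        "SSL_CERT_FILE",
+        "REQUESTS_CA_BUNDLE",
+        "CURL_CA_BUNDLE",
+        "GRPC_DEFAULT_SSL_ROOTS_FILE_PATH",
+    )
+    return {name: bundle for name in names}
+```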
+
+### Why this replaces manual certifi patching
+
+Previously, the workaround for trusting a private CA in Python was to patch certifi's bundle directly:
+
+```bash
+# OLD approach — fragile, do NOT use
+cat custom-ca.crt >> $(python -c "import certifi; print(certifi.where())")
+```
+
+This breaks whenever certifi is updated (any `pip install`/`uv sync` overwrites the bundle and the CA is lost).
+
+Our entrypoint approach is permanent because:
+
+1. `SSL_CERT_FILE` takes precedence over certifi. OpenSSL's default verify paths, which `ssl.create_default_context()` loads, honor it, and httpx checks it **before** falling back to `certifi.where()`. When set, certifi's bundle is never read.
+2. `REQUESTS_CA_BUNDLE` similarly overrides certifi for the `requests` library.
+3. The CA is injected at container startup (runtime), not baked into the Python environment. It survives image rebuilds, dependency updates, and `uv sync`.
+
+```
+Python SSL lookup chain:
+  ssl.create_default_context()
+    → SSL_CERT_FILE env var? → YES → use combined bundle (system + custom CA) ✓
+    → (certifi.where() is never reached)
+```
+
+This covers all outbound HTTPS calls: httpx (transcription, diarization, translation, webhooks), OpenAI SDK (transcription), llama-index (LLM/summarization), and requests (transitive dependencies).
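+
+To see which file the standard library would load, `ssl.get_default_verify_paths()` reports the resolved CA file and the name of the environment variable that is consulted. A small check, using a throwaway temp file as a stand-in for the real bundle:
+
+```python
+import os
+import ssl
+import tempfile
+
+# Stand-in for the combined bundle; any existing file path works for this check.
+with tempfile.NamedTemporaryFile(suffix=".crt", delete=False) as f:
+    bundle_path = f.name
+
+os.environ["SSL_CERT_FILE"] = bundle_path
+paths = ssl.get_default_verify_paths()
+
+print("env var consulted:", paths.openssl_cafile_env)
+print("resolved cafile:  ", paths.cafile)
+```
+
+Inside a container, skip the temp file and simply print `paths.cafile`; it should point at the combined bundle when a custom CA is mounted.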
+
+### Compose override
+
+The setup script generates `docker-compose.ca.yml` which mounts the CA cert into every backend container as a read-only bind mount. This file is:
+- Only generated when `--custom-ca` is passed
+- Deleted on re-runs without `--custom-ca` (prevents stale overrides)
+- Added to `.gitignore`
+
+### Node.js (frontend)
+
+The web container uses `NODE_EXTRA_CA_CERTS` which **adds** to Node's trust store (unlike Python's `SSL_CERT_FILE` which replaces it). This is set via the compose override.
+
+## Generate Your Own CA (Manual)
+
+If you prefer not to use `generate-certs.sh`:
+
+```bash
+# 1. Create CA
+openssl genrsa -out ca.key 4096
+openssl req -x509 -new -nodes -key ca.key -sha256 -days 3650 \
+    -out ca.crt -subj "/CN=My CA/O=My Organization"
+
+# 2. Create server key
+openssl genrsa -out server-key.pem 2048
+
+# 3. Create CSR with SANs
+openssl req -new -key server-key.pem -out server.csr \
+    -subj "/CN=reflector.local" \
+    -addext "subjectAltName=DNS:reflector.local,DNS:localhost,IP:127.0.0.1"
+
+# 4. Sign with CA (-copy_extensions needs OpenSSL 3.0+; it carries the SANs from the CSR into the cert)
+openssl x509 -req -in server.csr -CA ca.crt -CAkey ca.key \
+    -CAcreateserial -out server.pem -days 365 -sha256 \
+    -copy_extensions copyall
+
+# 5. Clean up
+rm server.csr ca.srl
+```
+
+## Using Existing Corporate Certificates
+
+If your organization already has a CA:
+
+1. Get the CA certificate in PEM format from your IT team
+2. If you have a PKCS#12 (.p12/.pfx) bundle, extract the CA cert:
+   ```bash
+   openssl pkcs12 -in bundle.p12 -cacerts -nokeys -out ca.crt
+   ```
+3. If you have multiple intermediate CAs, concatenate them into one PEM file:
+   ```bash
+   cat intermediate-ca.crt root-ca.crt > ca.crt
+   ```
+
+## Troubleshooting
+
+### Browser: "Your connection is not private"
+
+The CA is not trusted on the client machine. See "Trust the CA" section above.
+
+Check certificate expiry:
+```bash
+openssl x509 -noout -dates -in certs/server.pem
+```
+
+### Backend: `SSL: CERTIFICATE_VERIFY_FAILED`
+
+CA cert not mounted or not loaded. Check inside the container:
+```bash
+docker compose exec server env | grep SSL_CERT_FILE
+docker compose exec server python -c "
+import ssl, os
+print('SSL_CERT_FILE:', os.environ.get('SSL_CERT_FILE', 'not set'))
+ctx = ssl.create_default_context()
+print('CA certs loaded:', ctx.cert_store_stats())
+"
+```
+
+### Caddy: "certificate is not valid for any names"
+
+Domain in Caddyfile doesn't match the certificate's SAN/CN. Check:
+```bash
+openssl x509 -noout -text -in certs/server.pem | grep -A1 "Subject Alternative Name"
+```
+
+### Certificate chain issues
+
+If you have intermediate CAs, concatenate them into `server.pem`:
+```bash
+cat server-cert.pem intermediate-ca.pem > certs/server.pem
+```
+
+Verify the chain:
+```bash
+openssl verify -CAfile certs/ca.crt certs/server.pem
+```
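+
+When a chain check fails, it can help to confirm how many certificates a PEM file actually contains. A small stdlib-only sketch (the helper name is hypothetical):
+
+```python
+import re
+
+PEM_CERT = re.compile(
+    r"-----BEGIN CERTIFICATE-----.+?-----END CERTIFICATE-----", re.DOTALL
+)
+
+
+def split_pem_bundle(text):
+    """Return each PEM certificate block found in text, in file order."""
+    return PEM_CERT.findall(text)
+```
+
+A `server.pem` built as shown above with one intermediate should yield two blocks: the server certificate first, then the intermediate.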
+
+### Certificate renewal
+
+Custom CA certs are NOT auto-renewed (unlike Let's Encrypt). Replace cert files and restart:
+```bash
+# Replace certs
+cp new-server.pem certs/server.pem
+cp new-server-key.pem certs/server-key.pem
+
+# Restart Caddy to pick up new certs
+docker compose restart caddy
+```
--- a/docsv2/gpu-host-setup.md
+++ b/docsv2/gpu-host-setup.md
@@ -0,0 +1,294 @@
+# Standalone GPU Host Setup
+
+Deploy Reflector's GPU transcription/diarization/translation service on a dedicated machine, separate from the main Reflector instance. Useful when:
+
+- Your GPU machine is on a different network than the Reflector server
+- You want to share one GPU service across multiple Reflector instances
+- The GPU machine has special hardware/drivers that can't run the full stack
+- You need to scale GPU processing independently
+
+## Architecture
+
+```
+┌──────────────────────┐         HTTPS          ┌────────────────────┐
+│  Reflector Server    │ ────────────────────── │  GPU Host          │
+│  (server, worker,    │  TRANSCRIPT_URL        │  (transcription,   │
+│   web, postgres,     │  DIARIZATION_URL       │   diarization,     │
+│   redis, hatchet)    │  TRANSLATE_URL         │   translation)     │
+│                      │                        │                    │
+│  setup-selfhosted.sh │                        │  setup-gpu-host.sh │
+│  --hosted            │                        │                    │
+└──────────────────────┘                        └────────────────────┘
+```
+
+The GPU service is a standalone FastAPI app that exposes transcription, diarization, translation, and audio padding endpoints. It has **no dependencies** on PostgreSQL, Redis, Hatchet, or any other Reflector service.
+
+## Quick Start
+
+### On the GPU machine
+
+```bash
+git clone <reflector-repo>
+cd reflector
+
+# Set HuggingFace token (required for diarization models)
+export HF_TOKEN=your-huggingface-token
+
+# Deploy with HTTPS (Let's Encrypt)
+./scripts/setup-gpu-host.sh --domain gpu.example.com --api-key my-secret-key
+
+# Or deploy with custom CA
+./scripts/generate-certs.sh gpu.local
+./scripts/setup-gpu-host.sh --domain gpu.local --custom-ca certs/ --api-key my-secret-key
+```
+
+### On the Reflector machine
+
+```bash
+# If the GPU host uses a custom CA, trust it
+./scripts/setup-selfhosted.sh --hosted --garage --caddy \
+    --extra-ca /path/to/gpu-machine-ca.crt
+
+# Or if you already have --custom-ca for your local domain
+./scripts/setup-selfhosted.sh --hosted --garage --caddy \
+    --domain reflector.local --custom-ca certs/ \
+    --extra-ca /path/to/gpu-machine-ca.crt
+```
+
+Then configure `server/.env` to point to the GPU host:
+
+```bash
+TRANSCRIPT_BACKEND=modal
+TRANSCRIPT_URL=https://gpu.example.com
+TRANSCRIPT_MODAL_API_KEY=my-secret-key
+
+DIARIZATION_BACKEND=modal
+DIARIZATION_URL=https://gpu.example.com
+DIARIZATION_MODAL_API_KEY=my-secret-key
+
+TRANSLATION_BACKEND=modal
+TRANSLATE_URL=https://gpu.example.com
+TRANSLATION_MODAL_API_KEY=my-secret-key
+```
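+
+A quick sanity check for these settings can be sketched as follows. The parser is deliberately simple (plain `KEY=value` lines only) and the function names are hypothetical; the key names are the ones listed above.
+
+```python
+REQUIRED_KEYS = [
+    "TRANSCRIPT_BACKEND", "TRANSCRIPT_URL", "TRANSCRIPT_MODAL_API_KEY",
+    "DIARIZATION_BACKEND", "DIARIZATION_URL", "DIARIZATION_MODAL_API_KEY",
+    "TRANSLATION_BACKEND", "TRANSLATE_URL", "TRANSLATION_MODAL_API_KEY",
+]
+
+
+def parse_env(text):
+    """Parse simple KEY=value lines, skipping blanks and # comments."""
+    values = {}
+    for line in text.splitlines():
+        line = line.strip()
+        if not line or line.startswith("#") or "=" not in line:
+            continue
+        key, _, value = line.partition("=")
+        values[key.strip()] = value.strip()
+    return values
+
+
+def missing_gpu_settings(text):
+    """Return the required keys that are absent or empty."""
+    values = parse_env(text)
+    return [k for k in REQUIRED_KEYS if not values.get(k)]
+```
+
+Calling `missing_gpu_settings(open("server/.env").read())` returns an empty list when all nine settings are present.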
+
+## Script Options
+
+```
+./scripts/setup-gpu-host.sh [OPTIONS]
+
+Options:
+  --domain DOMAIN    Domain name for HTTPS (Let's Encrypt or custom cert)
+  --custom-ca PATH   Custom CA (directory or single PEM file)
+  --extra-ca FILE    Additional CA cert to trust (repeatable)
+  --api-key KEY      API key to protect the service (strongly recommended)
+  --cpu              CPU-only mode (no NVIDIA GPU required)
+  --port PORT        Host port (default: 443 with Caddy, 8000 without)
+```
+
+## Deployment Scenarios
+
+### Public internet with Let's Encrypt
+
+GPU machine has a public IP and domain:
+
+```bash
+./scripts/setup-gpu-host.sh --domain gpu.example.com --api-key my-secret-key
+```
+
+Requirements:
+- DNS A record: `gpu.example.com` → GPU machine's public IP
+- Ports 80 and 443 open
+- Caddy auto-provisions Let's Encrypt certificate
+
+### Internal network with custom CA
+
+GPU machine on a private network:
+
+```bash
+# Generate certs on the GPU machine
+./scripts/generate-certs.sh gpu.internal "IP:192.168.1.200"
+
+# Deploy
+./scripts/setup-gpu-host.sh --domain gpu.internal --custom-ca certs/ --api-key my-secret-key
+```
+
+On each machine that connects (including the Reflector server), add a hosts entry:
+```bash
+echo "192.168.1.200 gpu.internal" | sudo tee -a /etc/hosts
+```
+
+### IP-only (no domain)
+
+No domain needed — just use the machine's IP:
+
+```bash
+./scripts/setup-gpu-host.sh --api-key my-secret-key
+```
+
+Caddy is not used; the GPU service runs directly on port 8000 over plain HTTP. There is no HTTPS in this mode, so the Reflector machine connects via `http://<GPU_IP>:8000`.
+
+### CPU-only (no NVIDIA GPU)
+
+Works on any machine — transcription will be slower:
+
+```bash
+./scripts/setup-gpu-host.sh --cpu --domain gpu.example.com --api-key my-secret-key
+```
+
+## DNS Resolution
+
+The Reflector server must be able to reach the GPU host by name or IP.
+
+| Setup | DNS Method | TRANSCRIPT_URL example |
+|-------|------------|----------------------|
+| Public domain | DNS A record | `https://gpu.example.com` |
+| Internal domain | `/etc/hosts` on both machines | `https://gpu.internal` |
+| IP only | No DNS needed | `http://192.168.1.200:8000` |
+
+For internal domains, add the GPU machine's IP to `/etc/hosts` on the Reflector machine:
+```bash
+echo "192.168.1.200 gpu.internal" | sudo tee -a /etc/hosts
+```
+
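+If the Reflector server runs in Docker, note that containers do not inherit custom entries from the host's `/etc/hosts`; they only inherit the host's DNS resolver settings. If a container cannot resolve the GPU hostname, add the mapping to the relevant services with Compose's `extra_hosts` option, or publish the name through a DNS server the host already uses.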
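+
+To confirm that a name resolves from a given environment (the host, or a container via `docker compose exec server python`), a small stdlib check can help. The function name is hypothetical; `gpu.internal` is the example hostname used on this page.
+
+```python
+import socket
+
+
+def resolve(hostname):
+    """Return the sorted IP addresses hostname resolves to, or [] on failure."""
+    try:
+        infos = socket.getaddrinfo(hostname, None)
+    except socket.gaierror:
+        return []
+    return sorted({info[4][0] for info in infos})
+```
+
+An empty list from `resolve("gpu.internal")` means the name is not resolvable where the code runs, regardless of what the host itself can resolve.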
+
+## Multi-CA Setup
+
+When your Reflector instance has its own CA (for `reflector.local`) and the GPU host has a different CA:
+
+**On the GPU machine:**
+```bash
+./scripts/generate-certs.sh gpu.local
+./scripts/setup-gpu-host.sh --domain gpu.local --custom-ca certs/ --api-key my-key
+```
+
+**On the Reflector machine:**
+```bash
+# Your local CA for reflector.local + the GPU host's CA
+./scripts/setup-selfhosted.sh --hosted --garage --caddy \
+    --domain reflector.local \
+    --custom-ca certs/ \
+    --extra-ca /path/to/gpu-machine-ca.crt
+```
+
+The `--extra-ca` flag appends the GPU host's CA to the trust bundle. Backend containers trust both CAs — your local domain works AND outbound calls to the GPU host succeed.
+
+You can repeat `--extra-ca` for multiple remote services:
+```bash
+--extra-ca /path/to/gpu-ca.crt --extra-ca /path/to/llm-ca.crt
+```
+
+## API Key Authentication
+
+The GPU service uses Bearer token authentication via `REFLECTOR_GPU_APIKEY`:
+
+```bash
+# Test from the Reflector machine
+curl -s https://gpu.example.com/docs                              # No auth needed for docs
+curl -s -X POST https://gpu.example.com/v1/audio/transcriptions \
+    -F "file=@audio.wav" \
+    -H "Authorization: Bearer <my-secret-key>"                      #gitleaks:allow
+```
+
+If `REFLECTOR_GPU_APIKEY` is not set, the service accepts all requests (open access). Always use `--api-key` for internet-facing deployments.
+
+The same key goes in Reflector's `server/.env` as `TRANSCRIPT_MODAL_API_KEY`, `DIARIZATION_MODAL_API_KEY`, and `TRANSLATION_MODAL_API_KEY`.
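+
+The behavior described in this section (a Bearer token check, with open access when no key is configured) can be sketched as follows. This is an illustration, not the GPU service's actual code; the function name is hypothetical.
+
+```python
+import hmac
+
+
+def is_authorized(authorization_header, configured_key):
+    """Return True if the request should be accepted."""
+    if not configured_key:
+        return True  # REFLECTOR_GPU_APIKEY unset: open access
+    scheme, _, token = (authorization_header or "").partition(" ")
+    if scheme.lower() != "bearer":
+        return False
+    # Constant-time comparison avoids leaking key contents via timing.
+    return hmac.compare_digest(token.encode(), configured_key.encode())
+```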
+
+## Files
+
+| File | Checked in? | Purpose |
+|------|-------------|---------|
+| `docker-compose.gpu-host.yml` | Yes | Static compose file with profiles (`gpu`, `cpu`, `caddy`) |
+| `.env.gpu-host` | No (generated) | Environment variables (HF_TOKEN, API key, ports) |
+| `Caddyfile.gpu-host` | No (generated) | Caddy config (only when using HTTPS) |
+| `docker-compose.gpu-ca.yml` | No (generated) | CA cert mounts override (only with --custom-ca) |
+| `certs/` | No (generated) | Staged certificates (when using --custom-ca) |
+
+The compose file is checked into the repo — you can read it to understand exactly what runs. The script only generates env vars, Caddyfile, and CA overrides. Profiles control which service starts:
+
+```bash
+# What the script does under the hood:
+docker compose -f docker-compose.gpu-host.yml --profile gpu --profile caddy \
+    --env-file .env.gpu-host up -d
+
+# CPU mode:
+docker compose -f docker-compose.gpu-host.yml --profile cpu --profile caddy \
+    --env-file .env.gpu-host up -d
+```
+
+Both `gpu` and `cpu` services get the network alias `transcription`, so Caddy's config works with either.
+
+## Management
+
+```bash
+# View logs
+docker compose -f docker-compose.gpu-host.yml --profile gpu logs -f gpu
+
+# Restart
+docker compose -f docker-compose.gpu-host.yml --profile gpu restart gpu
+
+# Stop
+docker compose -f docker-compose.gpu-host.yml --profile gpu --profile caddy down
+
+# Re-run setup
+./scripts/setup-gpu-host.sh [same flags]
+
+# Rebuild after code changes
+docker compose -f docker-compose.gpu-host.yml --profile gpu build gpu
+docker compose -f docker-compose.gpu-host.yml --profile gpu up -d gpu
+```
+
+If you deployed with `--custom-ca`, include the CA override in manual commands:
+```bash
+docker compose -f docker-compose.gpu-host.yml -f docker-compose.gpu-ca.yml \
+    --profile gpu logs -f gpu
+```
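+
+How the deployment options map onto a compose invocation can be sketched like this. It mirrors the commands shown on this page; the function name and parameters are hypothetical, and the real script may differ.
+
+```python
+def compose_command(cpu=False, caddy=True, custom_ca=False):
+    """Build the docker compose argv for the GPU host (illustrative)."""
+    cmd = ["docker", "compose", "-f", "docker-compose.gpu-host.yml"]
+    if custom_ca:
+        # The CA override is layered on top of the static compose file.
+        cmd += ["-f", "docker-compose.gpu-ca.yml"]
+    cmd += ["--profile", "cpu" if cpu else "gpu"]
+    if caddy:
+        cmd += ["--profile", "caddy"]
+    return cmd + ["--env-file", ".env.gpu-host", "up", "-d"]
+```
+
+Passing the result to `subprocess.run(..., check=True)` would start the stack; printing it first is a cheap way to review what will run.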
+
+## Troubleshooting
+
+### GPU service won't start
+
+Check logs:
+```bash
+docker compose -f docker-compose.gpu-host.yml --profile gpu logs gpu
+```
+
+Common causes:
+- NVIDIA driver not installed or `nvidia-container-toolkit` missing
+- `HF_TOKEN` not set (diarization model download fails)
+- Port already in use
+
+### Reflector can't connect to GPU host
+
+From the Reflector machine:
+```bash
+# Test HTTPS connectivity
+curl -v https://gpu.example.com/docs
+
+# If using custom CA, test with explicit CA
+curl --cacert /path/to/gpu-ca.crt https://gpu.internal/docs
+```
+
+From inside the Reflector container:
+```bash
+docker compose exec server python -c "
+import httpx
+r = httpx.get('https://gpu.internal/docs')
+print(r.status_code)
+"
+```
+
+### SSL: CERTIFICATE_VERIFY_FAILED
+
+The Reflector backend doesn't trust the GPU host's CA. Fix:
+```bash
+# Re-run Reflector setup with your original flags plus the GPU host's CA
+./scripts/setup-selfhosted.sh --hosted --extra-ca /path/to/gpu-ca.crt
+```
+
+### Diarization returns errors
+
+- Accept pyannote model licenses on HuggingFace:
+  - https://huggingface.co/pyannote/speaker-diarization-3.1
+  - https://huggingface.co/pyannote/segmentation-3.0
+- Verify `HF_TOKEN` is set in `.env.gpu-host`