feat: custom ca for caddy (#931)

* fix: send email on transcript page permissions fixed

* feat: custom ca for caddy
Juan Diego García
2026-03-30 11:42:39 -05:00
committed by GitHub
parent bfaf4f403b
commit 12bf0c2d77
15 changed files with 1664 additions and 23 deletions

docsv2/custom-ca-setup.md Normal file

@@ -0,0 +1,337 @@
# Custom CA Certificate Setup
Use a private Certificate Authority (CA) with Reflector self-hosted deployments. This covers two scenarios:
1. **Custom local domain** — Serve Reflector over HTTPS on an internal domain (e.g., `reflector.local`) using certs signed by your own CA
2. **Backend CA trust** — Let Reflector's backend services (server, workers, GPU) make HTTPS calls to GPU, LLM, or other internal services behind your private CA
Both can be used independently or together.
## Quick Start
### Generate test certificates
```bash
./scripts/generate-certs.sh reflector.local
```
This creates `certs/` with:
- `ca.key` + `ca.crt` — Root CA (10-year validity)
- `server-key.pem` + `server.pem` — Server certificate (1-year, SAN: domain + localhost + 127.0.0.1)
### Deploy with custom CA + domain
```bash
# Add domain to /etc/hosts on the server (use 127.0.0.1 for local, or server LAN IP for network access)
echo "127.0.0.1 reflector.local" | sudo tee -a /etc/hosts
# Run setup — pass the certs directory
./scripts/setup-selfhosted.sh --gpu --caddy --domain reflector.local --custom-ca certs/
# Trust the CA on your machine (see "Trust the CA" section below)
```
### Deploy with CA trust only (GPU/LLM behind private CA)
```bash
# Only need the CA cert file — no Caddy TLS certs needed
./scripts/setup-selfhosted.sh --hosted --custom-ca /path/to/corporate-ca.crt
```
## How `--custom-ca` Works
The flag accepts a **directory** or a **single file**:
### Directory mode
```bash
--custom-ca certs/
```
Looks for these files by convention:
- `ca.crt` (required) — CA certificate to trust
- `server.pem` + `server-key.pem` (optional) — TLS certificate/key for Caddy
If `server.pem` + `server-key.pem` are found AND `--domain` is provided:
- Caddy serves HTTPS using those certs
- Backend containers trust the CA for outbound calls
If only `ca.crt` is found:
- Backend containers trust the CA for outbound calls
- Caddy is unaffected (uses Let's Encrypt, self-signed, or no Caddy)
### Single file mode
```bash
--custom-ca /path/to/corporate-ca.crt
```
Only injects CA trust into backend containers. No Caddy TLS changes.
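The resolution rules for directory mode and single file mode can be summarized in a small function. This is only a sketch of the documented behavior, not the code in `setup-selfhosted.sh`; the function name `classify_custom_ca` and the mode labels it prints are hypothetical.

```bash
# Hypothetical helper (not from setup-selfhosted.sh): classifies a
# --custom-ca argument according to the conventions documented above.
classify_custom_ca() {
  local path="$1" domain="${2:-}"

  # Single file mode: the file itself is the CA cert to trust.
  if [ -f "$path" ]; then
    echo "ca-trust-only"
    return 0
  fi

  # Directory mode: ca.crt is required.
  if [ ! -d "$path" ] || [ ! -f "$path/ca.crt" ]; then
    echo "invalid"
    return 1
  fi

  # Caddy TLS only applies when both server files exist AND a domain is given.
  if [ -f "$path/server.pem" ] && [ -f "$path/server-key.pem" ] && [ -n "$domain" ]; then
    echo "caddy-tls+ca-trust"
  else
    echo "ca-trust-only"
  fi
}

# Example: classify_custom_ca certs/ reflector.local
```

Calling it with a directory that holds all three files but no domain still yields `ca-trust-only`, which mirrors the "AND `--domain` is provided" condition.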
## Scenarios
### Scenario 1: Custom local domain
Your Reflector instance runs on an internal network. You want `https://reflector.local` with proper TLS (no browser warnings).
```bash
# 1. Generate certs
./scripts/generate-certs.sh reflector.local
# 2. Add to /etc/hosts on the server
echo "127.0.0.1 reflector.local" | sudo tee -a /etc/hosts
# 3. Deploy
./scripts/setup-selfhosted.sh --gpu --garage --caddy --domain reflector.local --custom-ca certs/
# 4. Trust the CA on your machine (see "Trust the CA" section below)
```
If other machines on the network need to access it, add the server's LAN IP to `/etc/hosts` on those machines instead:
```bash
echo "192.168.1.100 reflector.local" | sudo tee -a /etc/hosts
```
And include that IP as an extra SAN when generating certs:
```bash
./scripts/generate-certs.sh reflector.local "IP:192.168.1.100"
```
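To make the SAN handling concrete, the sketch below joins the default entries (domain, `localhost`, `127.0.0.1`) with any extra entries into a single `subjectAltName` value. The helper `build_san` is hypothetical and is not taken from `generate-certs.sh`; it only illustrates the format that the extra arguments such as `"IP:192.168.1.100"` are expected to follow.

```bash
# Hypothetical helper (not part of generate-certs.sh): builds a
# subjectAltName value from a domain plus optional extra SAN entries.
build_san() {
  local domain="$1" extra san
  shift
  san="DNS:${domain},DNS:localhost,IP:127.0.0.1"
  for extra in "$@"; do
    san="${san},${extra}"
  done
  printf '%s\n' "$san"
}

build_san reflector.local "IP:192.168.1.100"
# prints: DNS:reflector.local,DNS:localhost,IP:127.0.0.1,IP:192.168.1.100
```

Each extra entry needs its own `DNS:` or `IP:` prefix, since that prefix is what tells OpenSSL how to encode the name.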
### Scenario 2: GPU/LLM behind corporate CA
Your GPU or LLM server (e.g., `https://gpu.internal.corp`) uses certificates signed by your corporate CA. Reflector's backend needs to trust that CA for outbound HTTPS calls.
```bash
# Get the CA certificate from your IT team (PEM format)
# Then deploy — Caddy can still use Let's Encrypt or self-signed
./scripts/setup-selfhosted.sh --hosted --garage --caddy --custom-ca /path/to/corporate-ca.crt
```
This works because:
- **TLS cert/key** = "this is my identity" — for Caddy to serve HTTPS to browsers
- **CA cert** = "I trust this authority" — for backend containers to verify outbound connections
Your Reflector frontend can use Let's Encrypt (public domain) or self-signed certs, while the backend trusts a completely different CA for GPU/LLM calls.
### Scenario 3: Both combined (same CA)
Custom domain + GPU/LLM all behind the same CA:
```bash
./scripts/generate-certs.sh reflector.local "DNS:gpu.local"
./scripts/setup-selfhosted.sh --gpu --garage --caddy --domain reflector.local --custom-ca certs/
```
### Scenario 4: Multiple CAs (local domain + remote GPU on different CA)
Your Reflector uses one CA for `reflector.local`, but the GPU host uses a different CA:
```bash
# Your local domain setup
./scripts/generate-certs.sh reflector.local
# Deploy with your CA + trust the GPU host's CA too
./scripts/setup-selfhosted.sh --hosted --garage --caddy \
--domain reflector.local \
--custom-ca certs/ \
--extra-ca /path/to/gpu-machine-ca.crt
```
`--extra-ca` appends additional CA certs to the trust bundle. Backend containers trust ALL CAs — your local domain AND the GPU host's certs both work.
You can repeat `--extra-ca` for multiple remote services:
```bash
--extra-ca /path/to/gpu-ca.crt --extra-ca /path/to/llm-ca.crt
```
For setting up a dedicated GPU host, see [Standalone GPU Host Setup](gpu-host-setup.md).
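Conceptually, a multi-CA trust bundle is just several PEM files concatenated into one. The sketch below shows that idea under the assumption that each input is a PEM-encoded CA cert; `build_trust_bundle` is a hypothetical name, and how the setup script actually assembles its bundle is not shown in this document.

```bash
# Hypothetical helper: concatenates CA certs into one PEM bundle.
# Usage: build_trust_bundle combined-ca.crt certs/ca.crt /path/to/gpu-ca.crt
build_trust_bundle() {
  local out="$1" ca
  shift
  : > "$out"   # start with an empty bundle
  for ca in "$@"; do
    if [ ! -f "$ca" ]; then
      echo "missing CA file: $ca" >&2
      return 1
    fi
    cat "$ca" >> "$out"
    # If a file lacks a trailing newline, add one so that two PEM
    # blocks are never glued together on the same line.
    [ -z "$(tail -c1 "$ca")" ] || echo >> "$out"
  done
}
```

The newline guard matters in practice: a bundle where `-----END CERTIFICATE-----` runs straight into the next `-----BEGIN CERTIFICATE-----` is typically rejected by TLS libraries.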
## Trust the CA on Client Machines
After deploying, clients need to trust the CA to avoid browser warnings.
### macOS
```bash
sudo security add-trusted-cert -d -r trustRoot \
-k /Library/Keychains/System.keychain certs/ca.crt
```
### Linux (Ubuntu/Debian)
```bash
sudo cp certs/ca.crt /usr/local/share/ca-certificates/reflector-ca.crt
sudo update-ca-certificates
```
### Linux (RHEL/Fedora)
```bash
sudo cp certs/ca.crt /etc/pki/ca-trust/source/anchors/reflector-ca.crt
sudo update-ca-trust
```
### Windows (PowerShell as admin)
```powershell
Import-Certificate -FilePath .\certs\ca.crt -CertStoreLocation Cert:\LocalMachine\Root
```
### Firefox (all platforms)
Firefox uses its own certificate store:
1. Settings > Privacy & Security > View Certificates
2. Authorities tab > Import
3. Select `ca.crt` and check "Trust this CA to identify websites"
## How It Works Internally
### Docker entrypoint CA injection
Each backend container (server, worker, beat, hatchet workers, GPU) has an entrypoint script (`docker-entrypoint.sh`) that:
1. Checks if a CA cert is mounted at `/usr/local/share/ca-certificates/custom-ca.crt`
2. If present, runs `update-ca-certificates` to create a **combined bundle** (system CAs + custom CA)
3. Sets environment variables so all Python/gRPC libraries use the combined bundle:
| Env var | Covers |
|---------|--------|
| `SSL_CERT_FILE` | httpx, OpenAI SDK, llama-index, Python ssl module |
| `REQUESTS_CA_BUNDLE` | requests library (transitive dependencies) |
| `CURL_CA_BUNDLE` | curl CLI (container healthchecks) |
| `GRPC_DEFAULT_SSL_ROOTS_FILE_PATH` | grpcio (Hatchet gRPC client) |
When no CA cert is mounted, the entrypoint is a no-op — containers behave exactly as before.
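The three entrypoint steps can be sketched as follows. This is an illustration, not the project's actual `docker-entrypoint.sh`: the bundle path assumes a Debian/Ubuntu base image (where `update-ca-certificates` writes `/etc/ssl/certs/ca-certificates.crt`), and the `CUSTOM_CA`, `BUNDLE`, and `UPDATE_CMD` variables are hypothetical knobs added to keep the sketch easy to try outside a container.

```bash
# Illustrative sketch only; the real docker-entrypoint.sh is not reproduced
# here. Paths assume a Debian/Ubuntu base image.
CUSTOM_CA="${CUSTOM_CA:-/usr/local/share/ca-certificates/custom-ca.crt}"
BUNDLE="${BUNDLE:-/etc/ssl/certs/ca-certificates.crt}"
UPDATE_CMD="${UPDATE_CMD:-update-ca-certificates}"

inject_custom_ca() {
  # Step 1: no CA mounted means no-op; the environment stays untouched.
  [ -f "$CUSTOM_CA" ] || return 0

  # Step 2: rebuild the system bundle so it includes the custom CA.
  "$UPDATE_CMD" >/dev/null || return 1

  # Step 3: point Python, requests, curl, and gRPC at the combined bundle.
  export SSL_CERT_FILE="$BUNDLE"
  export REQUESTS_CA_BUNDLE="$BUNDLE"
  export CURL_CA_BUNDLE="$BUNDLE"
  export GRPC_DEFAULT_SSL_ROOTS_FILE_PATH="$BUNDLE"
}

inject_custom_ca
# A real entrypoint would then hand off to the container command: exec "$@"
```

Because the variables are exported before the hand-off, every process started by the container command inherits them without any per-library configuration.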
### Why this replaces manual certifi patching
Previously, the workaround for trusting a private CA in Python was to patch certifi's bundle directly:
```bash
# OLD approach — fragile, do NOT use
cat custom-ca.crt >> $(python -c "import certifi; print(certifi.where())")
```
This breaks whenever certifi is reinstalled or upgraded: any `pip install` or `uv sync` that touches certifi overwrites the bundle, and the CA is lost.
Our entrypoint approach is permanent because:
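1. `SSL_CERT_FILE` takes precedence over certifi. OpenSSL's default verify paths, which Python's `ssl.create_default_context()` loads, honor `SSL_CERT_FILE`, and httpx checks the variable **before** falling back to `certifi.where()`. When it is set, certifi's bundle is never read.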
2. `REQUESTS_CA_BUNDLE` similarly overrides certifi for the `requests` library.
3. The CA is injected at container startup (runtime), not baked into the Python environment. It survives image rebuilds, dependency updates, and `uv sync`.
```
Python SSL lookup chain:
ssl.create_default_context()
→ SSL_CERT_FILE env var? → YES → use combined bundle (system + custom CA) ✓
→ (certifi.where() is never reached)
```
This covers all outbound HTTPS calls: httpx (transcription, diarization, translation, webhooks), OpenAI SDK (transcription), llama-index (LLM/summarization), and requests (transitive dependencies).
### Compose override
The setup script generates `docker-compose.ca.yml` which mounts the CA cert into every backend container as a read-only bind mount. This file is:
- Only generated when `--custom-ca` is passed
- Deleted on re-runs without `--custom-ca` (prevents stale overrides)
- Added to `.gitignore`
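For orientation, the sketch below prints what such an override could look like. It is a guess at the shape, not the generated file itself: the function `generate_ca_override`, the service names passed to it, and the host path `./certs/ca.crt` are assumptions; only the container path and the read-only mount come from the description above.

```bash
# Hypothetical generator; service names and the host-side path are
# assumptions. The container path matches the entrypoint's lookup path.
generate_ca_override() {
  local ca_path="$1" svc
  shift
  echo "services:"
  for svc in "$@"; do
    echo "  ${svc}:"
    echo "    volumes:"
    echo "      - ${ca_path}:/usr/local/share/ca-certificates/custom-ca.crt:ro"
  done
}

generate_ca_override ./certs/ca.crt server worker beat
```

Keeping the mounts in a separate override file means the base compose file stays unchanged, and removing the override is enough to return to stock behavior.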
### Node.js (frontend)
The web container uses `NODE_EXTRA_CA_CERTS` which **adds** to Node's trust store (unlike Python's `SSL_CERT_FILE` which replaces it). This is set via the compose override.
## Generate Your Own CA (Manual)
If you prefer not to use `generate-certs.sh`:
```bash
# 1. Create CA
openssl genrsa -out ca.key 4096
openssl req -x509 -new -nodes -key ca.key -sha256 -days 3650 \
-out ca.crt -subj "/CN=My CA/O=My Organization"
# 2. Create server key
openssl genrsa -out server-key.pem 2048
# 3. Create CSR with SANs
openssl req -new -key server-key.pem -out server.csr \
-subj "/CN=reflector.local" \
-addext "subjectAltName=DNS:reflector.local,DNS:localhost,IP:127.0.0.1"
# 4. Sign with CA (-copy_extensions needs OpenSSL 3.0+; it copies the SANs from the CSR)
openssl x509 -req -in server.csr -CA ca.crt -CAkey ca.key \
-CAcreateserial -out server.pem -days 365 -sha256 \
-copy_extensions copyall
# 5. Clean up
rm server.csr ca.srl
```
## Using Existing Corporate Certificates
If your organization already has a CA:
1. Get the CA certificate in PEM format from your IT team
2. If you have a PKCS#12 (.p12/.pfx) bundle, extract the CA cert:
```bash
openssl pkcs12 -in bundle.p12 -cacerts -nokeys -out ca.crt
```
3. If you have multiple intermediate CAs, concatenate them into one PEM file:
```bash
cat intermediate-ca.crt root-ca.crt > ca.crt
```
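After extracting or concatenating, a quick sanity check is to count how many certificates the resulting PEM file contains. The helper below is a minimal sketch using only `grep`; the name `count_pem_certs` is hypothetical.

```bash
# Hypothetical helper: counts the certificates in a PEM file.
# Usage: count_pem_certs ca.crt
count_pem_certs() {
  # grep -c prints the number of matching lines; '--' ends option parsing
  # so the leading dashes in the pattern are not read as flags.
  grep -c -- '-----BEGIN CERTIFICATE-----' "$1"
}
```

For a root plus one intermediate, the expected count is 2. A count of 0 usually means the file is DER-encoded or otherwise not PEM.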
## Troubleshooting
### Browser: "Your connection is not private"
The CA is not trusted on the client machine. See the "Trust the CA on Client Machines" section above.
If the CA is already trusted, check whether the server certificate has expired:
```bash
openssl x509 -noout -dates -in certs/server.pem
```
### Backend: `SSL: CERTIFICATE_VERIFY_FAILED`
CA cert not mounted or not loaded. Check inside the container:
```bash
docker compose exec server env | grep SSL_CERT_FILE
docker compose exec server python -c "
import ssl, os
print('SSL_CERT_FILE:', os.environ.get('SSL_CERT_FILE', 'not set'))
ctx = ssl.create_default_context()
print('CA certs loaded:', ctx.cert_store_stats())
"
```
### Caddy: "certificate is not valid for any names"
Domain in Caddyfile doesn't match the certificate's SAN/CN. Check:
```bash
openssl x509 -noout -text -in certs/server.pem | grep -A1 "Subject Alternative Name"
```
### Certificate chain issues
If you have intermediate CAs, concatenate them into `server.pem`:
```bash
cat server-cert.pem intermediate-ca.pem > certs/server.pem
```
Verify the chain:
```bash
openssl verify -CAfile certs/ca.crt certs/server.pem
```
### Certificate renewal
Custom CA certs are NOT auto-renewed (unlike Let's Encrypt). Replace cert files and restart:
```bash
# Replace certs
cp new-server.pem certs/server.pem
cp new-server-key.pem certs/server-key.pem
# Restart Caddy to pick up new certs
docker compose restart caddy
```

docsv2/gpu-host-setup.md Normal file

@@ -0,0 +1,294 @@
# Standalone GPU Host Setup
Deploy Reflector's GPU transcription/diarization/translation service on a dedicated machine, separate from the main Reflector instance. Useful when:
- Your GPU machine is on a different network than the Reflector server
- You want to share one GPU service across multiple Reflector instances
- The GPU machine has special hardware/drivers that can't run the full stack
- You need to scale GPU processing independently
## Architecture
```
┌─────────────────────┐ HTTPS ┌────────────────────┐
│ Reflector Server │ ────────────────────── │ GPU Host │
│ (server, worker, │ TRANSCRIPT_URL │ (transcription, │
│ web, postgres, │ DIARIZATION_URL │ diarization, │
│ redis, hatchet) │ TRANSLATE_URL │ translation) │
│ │ │ │
│ setup-selfhosted.sh │ │ setup-gpu-host.sh │
│ --hosted │ │ │
└─────────────────────┘ └────────────────────┘
```
The GPU service is a standalone FastAPI app that exposes transcription, diarization, translation, and audio padding endpoints. It has **no dependencies** on PostgreSQL, Redis, Hatchet, or any other Reflector service.
## Quick Start
### On the GPU machine
```bash
git clone <reflector-repo>
cd reflector
# Set HuggingFace token (required for diarization models)
export HF_TOKEN=your-huggingface-token
# Deploy with HTTPS (Let's Encrypt)
./scripts/setup-gpu-host.sh --domain gpu.example.com --api-key my-secret-key
# Or deploy with custom CA
./scripts/generate-certs.sh gpu.local
./scripts/setup-gpu-host.sh --domain gpu.local --custom-ca certs/ --api-key my-secret-key
```
### On the Reflector machine
```bash
# If the GPU host uses a custom CA, trust it
./scripts/setup-selfhosted.sh --hosted --garage --caddy \
--extra-ca /path/to/gpu-machine-ca.crt
# Or if you already have --custom-ca for your local domain
./scripts/setup-selfhosted.sh --hosted --garage --caddy \
--domain reflector.local --custom-ca certs/ \
--extra-ca /path/to/gpu-machine-ca.crt
```
Then configure `server/.env` to point to the GPU host:
```bash
TRANSCRIPT_BACKEND=modal
TRANSCRIPT_URL=https://gpu.example.com
TRANSCRIPT_MODAL_API_KEY=my-secret-key
DIARIZATION_BACKEND=modal
DIARIZATION_URL=https://gpu.example.com
DIARIZATION_MODAL_API_KEY=my-secret-key
TRANSLATION_BACKEND=modal
TRANSLATE_URL=https://gpu.example.com
TRANSLATION_MODAL_API_KEY=my-secret-key
```
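Since all nine variables have to be present for the three services to reach the GPU host, a small pre-flight check can catch omissions. This is a sketch, not part of the setup scripts: `check_gpu_env` is a hypothetical helper, and it only verifies that each key has a non-empty value, not that the URL is reachable.

```bash
# Hypothetical helper: verifies that an env file defines every variable
# listed above with a non-empty value.
# Usage: check_gpu_env server/.env
check_gpu_env() {
  local env_file="$1" missing=0 key
  for key in \
    TRANSCRIPT_BACKEND TRANSCRIPT_URL TRANSCRIPT_MODAL_API_KEY \
    DIARIZATION_BACKEND DIARIZATION_URL DIARIZATION_MODAL_API_KEY \
    TRANSLATION_BACKEND TRANSLATE_URL TRANSLATION_MODAL_API_KEY
  do
    # Match "KEY=<at least one character>" at the start of a line.
    if ! grep -q "^${key}=." "$env_file"; then
      echo "missing or empty: ${key}" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Note the naming asymmetry the check encodes: the translation URL is `TRANSLATE_URL`, while its backend and key use the `TRANSLATION_` prefix.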
## Script Options
```
./scripts/setup-gpu-host.sh [OPTIONS]
Options:
--domain DOMAIN Domain name for HTTPS (Let's Encrypt or custom cert)
--custom-ca PATH Custom CA (directory or single PEM file)
--extra-ca FILE Additional CA cert to trust (repeatable)
--api-key KEY API key to protect the service (strongly recommended)
--cpu CPU-only mode (no NVIDIA GPU required)
--port PORT Host port (default: 443 with Caddy, 8000 without)
```
## Deployment Scenarios
### Public internet with Let's Encrypt
GPU machine has a public IP and domain:
```bash
./scripts/setup-gpu-host.sh --domain gpu.example.com --api-key my-secret-key
```
Requirements:
- DNS A record: `gpu.example.com` → GPU machine's public IP
- Ports 80 and 443 open
- Caddy auto-provisions Let's Encrypt certificate
### Internal network with custom CA
GPU machine on a private network:
```bash
# Generate certs on the GPU machine
./scripts/generate-certs.sh gpu.internal "IP:192.168.1.200"
# Deploy
./scripts/setup-gpu-host.sh --domain gpu.internal --custom-ca certs/ --api-key my-secret-key
```
On each machine that connects (including the Reflector server), add a hosts entry so the name resolves:
```bash
echo "192.168.1.200 gpu.internal" | sudo tee -a /etc/hosts
```
### IP-only (no domain)
No domain needed — just use the machine's IP:
```bash
./scripts/setup-gpu-host.sh --api-key my-secret-key
```
Caddy is not used; the GPU service runs directly on port 8000 over plain HTTP. There is no HTTPS in this mode, so the Reflector machine connects via `http://<GPU_IP>:8000`.
### CPU-only (no NVIDIA GPU)
Works on any machine — transcription will be slower:
```bash
./scripts/setup-gpu-host.sh --cpu --domain gpu.example.com --api-key my-secret-key
```
## DNS Resolution
The Reflector server must be able to reach the GPU host by name or IP.
| Setup | DNS Method | TRANSCRIPT_URL example |
|-------|------------|----------------------|
| Public domain | DNS A record | `https://gpu.example.com` |
| Internal domain | `/etc/hosts` on both machines | `https://gpu.internal` |
| IP only | No DNS needed | `http://192.168.1.200:8000` |
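The table's mapping can be expressed as a small helper that derives the base URL from whichever of domain or IP is available. The function `gpu_base_url` is hypothetical, and it assumes the default ports (443 behind Caddy, 8000 without), so it does not account for a custom `--port`.

```bash
# Hypothetical helper: derives the GPU base URL for TRANSCRIPT_URL and
# friends. Assumes default ports (no --port override).
gpu_base_url() {
  local domain="${1:-}" ip="${2:-}"
  if [ -n "$domain" ]; then
    echo "https://${domain}"      # Caddy terminates TLS for the domain
  elif [ -n "$ip" ]; then
    echo "http://${ip}:8000"      # no Caddy: plain HTTP on the service port
  else
    echo "need a domain or an IP" >&2
    return 1
  fi
}

gpu_base_url gpu.internal
gpu_base_url "" 192.168.1.200
```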
For internal domains, add the GPU machine's IP to `/etc/hosts` on the Reflector machine:
```bash
echo "192.168.1.200 gpu.internal" | sudo tee -a /etc/hosts
```
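If the Reflector server runs in Docker, verify that the name also resolves inside the containers. Containers do not read the host's `/etc/hosts` directly, so if resolution fails there, add the mapping to the affected services (for example with Compose `extra_hosts`).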
## Multi-CA Setup
When your Reflector instance has its own CA (for `reflector.local`) and the GPU host has a different CA:
**On the GPU machine:**
```bash
./scripts/generate-certs.sh gpu.local
./scripts/setup-gpu-host.sh --domain gpu.local --custom-ca certs/ --api-key my-key
```
**On the Reflector machine:**
```bash
# Your local CA for reflector.local + the GPU host's CA
./scripts/setup-selfhosted.sh --hosted --garage --caddy \
--domain reflector.local \
--custom-ca certs/ \
--extra-ca /path/to/gpu-machine-ca.crt
```
The `--extra-ca` flag appends the GPU host's CA to the trust bundle. Backend containers trust both CAs — your local domain works AND outbound calls to the GPU host succeed.
You can repeat `--extra-ca` for multiple remote services:
```bash
--extra-ca /path/to/gpu-ca.crt --extra-ca /path/to/llm-ca.crt
```
## API Key Authentication
The GPU service uses Bearer token authentication via `REFLECTOR_GPU_APIKEY`:
```bash
# Test from the Reflector machine
curl -s https://gpu.example.com/docs # No auth needed for docs
curl -s -X POST https://gpu.example.com/v1/audio/transcriptions \
-F "file=@audio.wav" \
-H "Authorization: Bearer <my-secret-key>" # gitleaks:allow
```
If `REFLECTOR_GPU_APIKEY` is not set, the service accepts all requests (open access). Always use `--api-key` for internet-facing deployments.
The same key goes in Reflector's `server/.env` as `TRANSCRIPT_MODAL_API_KEY`, `DIARIZATION_MODAL_API_KEY`, and `TRANSLATION_MODAL_API_KEY`.
## Files
| File | Checked in? | Purpose |
|------|-------------|---------|
| `docker-compose.gpu-host.yml` | Yes | Static compose file with profiles (`gpu`, `cpu`, `caddy`) |
| `.env.gpu-host` | No (generated) | Environment variables (HF_TOKEN, API key, ports) |
| `Caddyfile.gpu-host` | No (generated) | Caddy config (only when using HTTPS) |
| `docker-compose.gpu-ca.yml` | No (generated) | CA cert mounts override (only with --custom-ca) |
| `certs/` | No (generated) | Staged certificates (when using --custom-ca) |
The compose file is checked into the repo — you can read it to understand exactly what runs. The script only generates env vars, Caddyfile, and CA overrides. Profiles control which service starts:
```bash
# What the script does under the hood:
docker compose -f docker-compose.gpu-host.yml --profile gpu --profile caddy \
--env-file .env.gpu-host up -d
# CPU mode:
docker compose -f docker-compose.gpu-host.yml --profile cpu --profile caddy \
--env-file .env.gpu-host up -d
```
Both `gpu` and `cpu` services get the network alias `transcription`, so Caddy's config works with either.
## Management
```bash
# View logs
docker compose -f docker-compose.gpu-host.yml --profile gpu logs -f gpu
# Restart
docker compose -f docker-compose.gpu-host.yml --profile gpu restart gpu
# Stop
docker compose -f docker-compose.gpu-host.yml --profile gpu --profile caddy down
# Re-run setup
./scripts/setup-gpu-host.sh [same flags]
# Rebuild after code changes
docker compose -f docker-compose.gpu-host.yml --profile gpu build gpu
docker compose -f docker-compose.gpu-host.yml --profile gpu up -d gpu
```
If you deployed with `--custom-ca`, include the CA override in manual commands:
```bash
docker compose -f docker-compose.gpu-host.yml -f docker-compose.gpu-ca.yml \
--profile gpu logs -f gpu
```
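To avoid forgetting the override file, manual commands can go through a thin wrapper that adds it whenever it exists. The sketch below only prints the command it would run, so it is safe to try; `gpu_compose` is a hypothetical name, and it assumes it is called from the directory containing the compose files.

```bash
# Hypothetical wrapper: prints the docker compose command for the GPU host,
# adding the CA override when docker-compose.gpu-ca.yml is present.
# Usage: gpu_compose <gpu|cpu> <compose args...>
gpu_compose() {
  local profile="$1" override=""
  shift
  if [ -f docker-compose.gpu-ca.yml ]; then
    override="-f docker-compose.gpu-ca.yml"
  fi
  # $override is intentionally unquoted so it expands to two words (or none).
  echo docker compose -f docker-compose.gpu-host.yml $override \
    --profile "$profile" "$@"
}

gpu_compose gpu logs -f gpu
```

Replacing `echo` with nothing (so the function runs `docker compose ...` directly) turns the dry run into the real command.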
## Troubleshooting
### GPU service won't start
Check logs:
```bash
docker compose -f docker-compose.gpu-host.yml logs gpu
```
Common causes:
- NVIDIA driver not installed or `nvidia-container-toolkit` missing
- `HF_TOKEN` not set (diarization model download fails)
- Port already in use
### Reflector can't connect to GPU host
From the Reflector machine:
```bash
# Test HTTPS connectivity
curl -v https://gpu.example.com/docs
# If using custom CA, test with explicit CA
curl --cacert /path/to/gpu-ca.crt https://gpu.internal/docs
```
From inside the Reflector container:
```bash
docker compose exec server python -c "
import httpx
r = httpx.get('https://gpu.internal/docs')
print(r.status_code)
"
```
### SSL: CERTIFICATE_VERIFY_FAILED
The Reflector backend doesn't trust the GPU host's CA. Fix:
```bash
# Re-run Reflector setup with the GPU host's CA
./scripts/setup-selfhosted.sh --hosted --extra-ca /path/to/gpu-ca.crt
```
### Diarization returns errors
- Accept pyannote model licenses on HuggingFace:
- https://huggingface.co/pyannote/speaker-diarization-3.1
- https://huggingface.co/pyannote/segmentation-3.0
- Verify `HF_TOKEN` is set in `.env.gpu-host`