mirror of https://github.com/Monadical-SAS/reflector.git (synced 2026-04-10 23:56:55 +00:00)

feat: custom ca for caddy (#931)

* fix: send email on transcript page permissions fixed
* feat: custom ca for caddy

Committed via GitHub (parent bfaf4f403b, commit 12bf0c2d77)

New file: docsv2/custom-ca-setup.md (337 lines)
# Custom CA Certificate Setup

Use a private Certificate Authority (CA) with Reflector self-hosted deployments. This covers two scenarios:

1. **Custom local domain** — Serve Reflector over HTTPS on an internal domain (e.g., `reflector.local`) using certs signed by your own CA
2. **Backend CA trust** — Let Reflector's backend services (server, workers, GPU) make HTTPS calls to GPU, LLM, or other internal services behind your private CA

Both can be used independently or together.
## Quick Start

### Generate test certificates

```bash
./scripts/generate-certs.sh reflector.local
```

This creates `certs/` with:

- `ca.key` + `ca.crt` — Root CA (10-year validity)
- `server-key.pem` + `server.pem` — Server certificate (1-year validity; SANs: domain, localhost, 127.0.0.1)

### Deploy with custom CA + domain

```bash
# Add the domain to /etc/hosts on the server (use 127.0.0.1 for local access,
# or the server's LAN IP for network access)
echo "127.0.0.1 reflector.local" | sudo tee -a /etc/hosts

# Run setup — pass the certs directory
./scripts/setup-selfhosted.sh --gpu --caddy --domain reflector.local --custom-ca certs/

# Trust the CA on your machine (see "Trust the CA on Client Machines" below)
```

### Deploy with CA trust only (GPU/LLM behind a private CA)

```bash
# Only the CA cert file is needed — no Caddy TLS certs
./scripts/setup-selfhosted.sh --hosted --custom-ca /path/to/corporate-ca.crt
```
## How `--custom-ca` Works

The flag accepts a **directory** or a **single file**.

### Directory mode

```bash
--custom-ca certs/
```

Looks for these files by convention:

- `ca.crt` (required) — CA certificate to trust
- `server.pem` + `server-key.pem` (optional) — TLS certificate/key for Caddy

If `server.pem` + `server-key.pem` are found AND `--domain` is provided:

- Caddy serves HTTPS using those certs
- Backend containers trust the CA for outbound calls

If only `ca.crt` is found:

- Backend containers trust the CA for outbound calls
- Caddy is unaffected (it keeps using Let's Encrypt, self-signed certs, or is not deployed at all)

### Single file mode

```bash
--custom-ca /path/to/corporate-ca.crt
```

Only injects CA trust into backend containers; Caddy's TLS configuration is untouched.
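The detection logic can be sketched as follows. This is a simplified illustration: everything beyond the file-name conventions above (the function name, the output format) is hypothetical, and the setup script itself is authoritative.

```shell
# Hypothetical sketch of --custom-ca argument resolution (simplified).
resolve_custom_ca() {
  arg="$1"
  if [ -d "$arg" ]; then
    # Directory mode: ca.crt is required by convention
    [ -f "$arg/ca.crt" ] || { echo "error: $arg/ca.crt is required" >&2; return 1; }
    if [ -f "$arg/server.pem" ] && [ -f "$arg/server-key.pem" ]; then
      echo "mode=directory caddy_tls=yes"    # certs for Caddy + CA trust
    else
      echo "mode=directory caddy_tls=no"     # CA trust only
    fi
  elif [ -f "$arg" ]; then
    # Single-file mode: CA trust only, no Caddy TLS changes
    echo "mode=file caddy_tls=no"
  else
    echo "error: $arg not found" >&2
    return 1
  fi
}

# Demo against a scratch directory
tmp=$(mktemp -d)
touch "$tmp/ca.crt"
resolve_custom_ca "$tmp"               # mode=directory caddy_tls=no
touch "$tmp/server.pem" "$tmp/server-key.pem"
resolve_custom_ca "$tmp"               # mode=directory caddy_tls=yes
resolve_custom_ca "$tmp/ca.crt"        # mode=file caddy_tls=no
rm -r "$tmp"
```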
## Scenarios

### Scenario 1: Custom local domain

Your Reflector instance runs on an internal network and you want `https://reflector.local` with proper TLS (no browser warnings).

```bash
# 1. Generate certs
./scripts/generate-certs.sh reflector.local

# 2. Add the domain to /etc/hosts on the server
echo "127.0.0.1 reflector.local" | sudo tee -a /etc/hosts

# 3. Deploy
./scripts/setup-selfhosted.sh --gpu --garage --caddy --domain reflector.local --custom-ca certs/

# 4. Trust the CA on your machine (see "Trust the CA on Client Machines" below)
```

If other machines on the network need access, add the server's LAN IP to `/etc/hosts` on those machines instead:

```bash
echo "192.168.1.100 reflector.local" | sudo tee -a /etc/hosts
```

And include that IP as an extra SAN when generating certs:

```bash
./scripts/generate-certs.sh reflector.local "IP:192.168.1.100"
```

### Scenario 2: GPU/LLM behind a corporate CA

Your GPU or LLM server (e.g., `https://gpu.internal.corp`) uses certificates signed by your corporate CA. Reflector's backend needs to trust that CA for outbound HTTPS calls.

```bash
# Get the CA certificate from your IT team (PEM format)
# Then deploy — Caddy can still use Let's Encrypt or self-signed certs
./scripts/setup-selfhosted.sh --hosted --garage --caddy --custom-ca /path/to/corporate-ca.crt
```

This works because the two artifacts play different roles:

- **TLS cert/key** = "this is my identity" — used by Caddy to serve HTTPS to browsers
- **CA cert** = "I trust this authority" — used by backend containers to verify outbound connections

Your Reflector frontend can use Let's Encrypt (public domain) or self-signed certs, while the backend trusts a completely different CA for GPU/LLM calls.

### Scenario 3: Both combined (same CA)

Custom domain plus GPU/LLM, all behind the same CA:

```bash
./scripts/generate-certs.sh reflector.local "DNS:gpu.local"
./scripts/setup-selfhosted.sh --gpu --garage --caddy --domain reflector.local --custom-ca certs/
```

### Scenario 4: Multiple CAs (local domain + remote GPU on a different CA)

Your Reflector uses one CA for `reflector.local`, but the GPU host uses a different CA:

```bash
# Your local domain setup
./scripts/generate-certs.sh reflector.local

# Deploy with your CA + trust the GPU host's CA too
./scripts/setup-selfhosted.sh --hosted --garage --caddy \
  --domain reflector.local \
  --custom-ca certs/ \
  --extra-ca /path/to/gpu-machine-ca.crt
```

`--extra-ca` appends additional CA certs to the trust bundle. Backend containers trust ALL of the CAs — your local domain works AND outbound calls to the GPU host succeed.

You can repeat `--extra-ca` for multiple remote services:

```bash
--extra-ca /path/to/gpu-ca.crt --extra-ca /path/to/llm-ca.crt
```
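Conceptually, the resulting trust bundle is plain PEM concatenation, with each `--extra-ca` contributing one more certificate block. A self-contained illustration (the `BEGIN CERTIFICATE` markers below stand in for real PEM data, and the script's actual mechanism may differ):

```shell
# Illustrative only: build a combined bundle from one primary CA plus extras
work=$(mktemp -d)
for name in ca gpu-ca llm-ca; do
  printf -- '-----BEGIN CERTIFICATE-----\n(%s)\n-----END CERTIFICATE-----\n' \
    "$name" > "$work/$name.crt"
done

# Equivalent in effect to: --custom-ca + --extra-ca + --extra-ca
cat "$work/ca.crt" "$work/gpu-ca.crt" "$work/llm-ca.crt" > "$work/combined-ca.crt"
grep -c 'BEGIN CERTIFICATE' "$work/combined-ca.crt"   # prints 3 (one per CA)
rm -r "$work"
```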
For setting up a dedicated GPU host, see [Standalone GPU Host Setup](gpu-host-setup.md).
## Trust the CA on Client Machines

After deploying, clients need to trust the CA to avoid browser warnings.

### macOS

```bash
sudo security add-trusted-cert -d -r trustRoot \
  -k /Library/Keychains/System.keychain certs/ca.crt
```

### Linux (Ubuntu/Debian)

```bash
sudo cp certs/ca.crt /usr/local/share/ca-certificates/reflector-ca.crt
sudo update-ca-certificates
```

### Linux (RHEL/Fedora)

```bash
sudo cp certs/ca.crt /etc/pki/ca-trust/source/anchors/reflector-ca.crt
sudo update-ca-trust
```

### Windows (PowerShell as admin)

```powershell
Import-Certificate -FilePath .\certs\ca.crt -CertStoreLocation Cert:\LocalMachine\Root
```

### Firefox (all platforms)

Firefox uses its own certificate store:

1. Settings > Privacy & Security > View Certificates
2. Authorities tab > Import
3. Select `ca.crt` and check "Trust this CA to identify websites"
## How It Works Internally

### Docker entrypoint CA injection

Each backend container (server, worker, beat, hatchet workers, GPU) has an entrypoint script (`docker-entrypoint.sh`) that:

1. Checks whether a CA cert is mounted at `/usr/local/share/ca-certificates/custom-ca.crt`
2. If present, runs `update-ca-certificates` to create a **combined bundle** (system CAs + custom CA)
3. Sets environment variables so all Python/gRPC libraries use the combined bundle:

| Env var | Covers |
|---------|--------|
| `SSL_CERT_FILE` | httpx, OpenAI SDK, llama-index, Python `ssl` module |
| `REQUESTS_CA_BUNDLE` | `requests` library (transitive dependencies) |
| `CURL_CA_BUNDLE` | curl CLI (container healthchecks) |
| `GRPC_DEFAULT_SSL_ROOTS_FILE_PATH` | grpcio (Hatchet gRPC client) |

When no CA cert is mounted, the entrypoint is a no-op — containers behave exactly as before.
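The steps above can be sketched as a minimal entrypoint. This is an assumed, simplified shape, not the repo's actual `docker-entrypoint.sh`:

```shell
# Hypothetical, simplified sketch of the CA-injection entrypoint
CA_SRC="${CA_SRC:-/usr/local/share/ca-certificates/custom-ca.crt}"
BUNDLE="${BUNDLE:-/etc/ssl/certs/ca-certificates.crt}"

setup_ca_trust() {
  if [ -f "$CA_SRC" ]; then
    # Rebuild the combined bundle (system CAs + the mounted custom CA)
    if command -v update-ca-certificates >/dev/null 2>&1; then
      update-ca-certificates >/dev/null 2>&1 || true
    fi
    export SSL_CERT_FILE="$BUNDLE"                      # httpx, OpenAI SDK, Python ssl
    export REQUESTS_CA_BUNDLE="$BUNDLE"                 # requests
    export CURL_CA_BUNDLE="$BUNDLE"                     # curl healthchecks
    export GRPC_DEFAULT_SSL_ROOTS_FILE_PATH="$BUNDLE"   # grpcio
    echo "custom CA trust enabled"
  else
    echo "no custom CA mounted (no-op)"
  fi
}

setup_ca_trust
exec "$@"   # hand off to the container's main process
```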
### Why this replaces manual certifi patching

Previously, the workaround for trusting a private CA in Python was to patch certifi's bundle directly:

```bash
# OLD approach — fragile, do NOT use
cat custom-ca.crt >> $(python -c "import certifi; print(certifi.where())")
```

This breaks whenever certifi is updated: any `pip install` or `uv sync` overwrites the bundle and the CA is lost.

The entrypoint approach is durable because:

1. `SSL_CERT_FILE` is checked by Python's `ssl.create_default_context()` **before** falling back to `certifi.where()`. When set, certifi's bundle is never read.
2. `REQUESTS_CA_BUNDLE` similarly overrides certifi for the `requests` library.
3. The CA is injected at container startup (runtime), not baked into the Python environment. It survives image rebuilds, dependency updates, and `uv sync`.

```
Python SSL lookup chain:
  ssl.create_default_context()
    → SSL_CERT_FILE env var? → YES → use combined bundle (system + custom CA) ✓
    → (certifi.where() is never reached)
```

This covers all outbound HTTPS calls: httpx (transcription, diarization, translation, webhooks), OpenAI SDK (transcription), llama-index (LLM/summarization), and requests (transitive dependencies).
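This precedence is easy to observe outside a container (requires `python3` on the host; `/tmp/combined.crt` is just a throwaway stand-in for the combined bundle):

```shell
# Create a stand-in bundle file, then ask Python which CA file it would use
touch /tmp/combined.crt
SSL_CERT_FILE=/tmp/combined.crt python3 -c \
  "import ssl; print(ssl.get_default_verify_paths().cafile)"
# prints /tmp/combined.crt; certifi's bundle is never consulted
```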
### Compose override

The setup script generates `docker-compose.ca.yml`, which mounts the CA cert into every backend container as a read-only bind mount. This file is:

- Only generated when `--custom-ca` is passed
- Deleted on re-runs without `--custom-ca` (prevents stale overrides)
- Added to `.gitignore`
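For reference, the generated override plausibly has this shape. This is a sketch only: the service list and the relative cert path are assumptions, and the generated `docker-compose.ca.yml` is authoritative. The mount target matches the path the entrypoint checks:

```yaml
# Hypothetical sketch of docker-compose.ca.yml (service names assumed)
services:
  server:
    volumes:
      - ./certs/ca.crt:/usr/local/share/ca-certificates/custom-ca.crt:ro
  worker:
    volumes:
      - ./certs/ca.crt:/usr/local/share/ca-certificates/custom-ca.crt:ro
  # ...one entry per backend container (beat, hatchet workers, GPU)
```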
### Node.js (frontend)

The web container uses `NODE_EXTRA_CA_CERTS`, which **adds** to Node's trust store (unlike Python's `SSL_CERT_FILE`, which replaces it). This is set via the compose override.
## Generate Your Own CA (Manual)

If you prefer not to use `generate-certs.sh`:

```bash
# 1. Create the CA
openssl genrsa -out ca.key 4096
openssl req -x509 -new -nodes -key ca.key -sha256 -days 3650 \
  -out ca.crt -subj "/CN=My CA/O=My Organization"

# 2. Create the server key
openssl genrsa -out server-key.pem 2048

# 3. Create a CSR with SANs
openssl req -new -key server-key.pem -out server.csr \
  -subj "/CN=reflector.local" \
  -addext "subjectAltName=DNS:reflector.local,DNS:localhost,IP:127.0.0.1"

# 4. Sign with the CA (-copy_extensions requires OpenSSL 3.0+)
openssl x509 -req -in server.csr -CA ca.crt -CAkey ca.key \
  -CAcreateserial -out server.pem -days 365 -sha256 \
  -copy_extensions copyall

# 5. Clean up
rm server.csr ca.srl
```
## Using Existing Corporate Certificates

If your organization already has a CA:

1. Get the CA certificate in PEM format from your IT team
2. If you have a PKCS#12 (`.p12`/`.pfx`) bundle, extract the CA cert:
   ```bash
   openssl pkcs12 -in bundle.p12 -cacerts -nokeys -out ca.crt
   ```
3. If you have multiple intermediate CAs, concatenate them into one PEM file:
   ```bash
   cat intermediate-ca.crt root-ca.crt > ca.crt
   ```
## Troubleshooting

### Browser: "Your connection is not private"

The CA is not trusted on the client machine. See "Trust the CA on Client Machines" above.

Also check certificate expiry:

```bash
openssl x509 -noout -dates -in certs/server.pem
```

### Backend: `SSL: CERTIFICATE_VERIFY_FAILED`

The CA cert is not mounted or not loaded. Check inside the container:

```bash
docker compose exec server env | grep SSL_CERT_FILE
docker compose exec server python -c "
import ssl, os
print('SSL_CERT_FILE:', os.environ.get('SSL_CERT_FILE', 'not set'))
ctx = ssl.create_default_context()
print('CA certs loaded:', ctx.cert_store_stats())
"
```

### Caddy: "certificate is not valid for any names"

The domain in the Caddyfile doesn't match the certificate's SAN/CN. Check:

```bash
openssl x509 -noout -text -in certs/server.pem | grep -A1 "Subject Alternative Name"
```

### Certificate chain issues

If you have intermediate CAs, concatenate them into `server.pem` (leaf certificate first):

```bash
cat server-cert.pem intermediate-ca.pem > certs/server.pem
```

Verify the chain:

```bash
openssl verify -CAfile certs/ca.crt certs/server.pem
```

### Certificate renewal

Custom CA certs are NOT auto-renewed (unlike Let's Encrypt). Replace the cert files and restart Caddy:

```bash
# Replace certs
cp new-server.pem certs/server.pem
cp new-server-key.pem certs/server-key.pem

# Restart Caddy to pick up the new certs
docker compose restart caddy
```
---

New file: docsv2/gpu-host-setup.md (294 lines)
# Standalone GPU Host Setup

Deploy Reflector's GPU transcription/diarization/translation service on a dedicated machine, separate from the main Reflector instance. Useful when:

- Your GPU machine is on a different network than the Reflector server
- You want to share one GPU service across multiple Reflector instances
- The GPU machine has special hardware/drivers that can't run the full stack
- You need to scale GPU processing independently
## Architecture

```
┌─────────────────────┐         HTTPS          ┌────────────────────┐
│ Reflector Server    │ ────────────────────── │ GPU Host           │
│ (server, worker,    │    TRANSCRIPT_URL      │ (transcription,    │
│  web, postgres,     │    DIARIZATION_URL     │  diarization,      │
│  redis, hatchet)    │    TRANSLATE_URL       │  translation)      │
│                     │                        │                    │
│ setup-selfhosted.sh │                        │ setup-gpu-host.sh  │
│   --hosted          │                        │                    │
└─────────────────────┘                        └────────────────────┘
```

The GPU service is a standalone FastAPI app that exposes transcription, diarization, translation, and audio-padding endpoints. It has **no dependencies** on PostgreSQL, Redis, Hatchet, or any other Reflector service.
## Quick Start

### On the GPU machine

```bash
git clone <reflector-repo>
cd reflector

# Set HuggingFace token (required for diarization models)
export HF_TOKEN=your-huggingface-token

# Deploy with HTTPS (Let's Encrypt)
./scripts/setup-gpu-host.sh --domain gpu.example.com --api-key my-secret-key

# Or deploy with a custom CA
./scripts/generate-certs.sh gpu.local
./scripts/setup-gpu-host.sh --domain gpu.local --custom-ca certs/ --api-key my-secret-key
```

### On the Reflector machine

```bash
# If the GPU host uses a custom CA, trust it
./scripts/setup-selfhosted.sh --hosted --garage --caddy \
  --extra-ca /path/to/gpu-machine-ca.crt

# Or, if you already use --custom-ca for your local domain
./scripts/setup-selfhosted.sh --hosted --garage --caddy \
  --domain reflector.local --custom-ca certs/ \
  --extra-ca /path/to/gpu-machine-ca.crt
```

Then configure `server/.env` to point to the GPU host:

```bash
TRANSCRIPT_BACKEND=modal
TRANSCRIPT_URL=https://gpu.example.com
TRANSCRIPT_MODAL_API_KEY=my-secret-key

DIARIZATION_BACKEND=modal
DIARIZATION_URL=https://gpu.example.com
DIARIZATION_MODAL_API_KEY=my-secret-key

TRANSLATION_BACKEND=modal
TRANSLATE_URL=https://gpu.example.com
TRANSLATION_MODAL_API_KEY=my-secret-key
```
## Script Options

```
./scripts/setup-gpu-host.sh [OPTIONS]

Options:
  --domain DOMAIN     Domain name for HTTPS (Let's Encrypt or custom cert)
  --custom-ca PATH    Custom CA (directory or single PEM file)
  --extra-ca FILE     Additional CA cert to trust (repeatable)
  --api-key KEY       API key to protect the service (strongly recommended)
  --cpu               CPU-only mode (no NVIDIA GPU required)
  --port PORT         Host port (default: 443 with Caddy, 8000 without)
```
## Deployment Scenarios

### Public internet with Let's Encrypt

The GPU machine has a public IP and domain:

```bash
./scripts/setup-gpu-host.sh --domain gpu.example.com --api-key my-secret-key
```

Requirements:

- DNS A record: `gpu.example.com` → GPU machine's public IP
- Ports 80 and 443 open
- Caddy auto-provisions the Let's Encrypt certificate

### Internal network with custom CA

The GPU machine sits on a private network:

```bash
# Generate certs on the GPU machine
./scripts/generate-certs.sh gpu.internal "IP:192.168.1.200"

# Deploy
./scripts/setup-gpu-host.sh --domain gpu.internal --custom-ca certs/ --api-key my-secret-key
```

On each machine that connects (including the Reflector server), add a DNS entry:

```bash
echo "192.168.1.200 gpu.internal" | sudo tee -a /etc/hosts
```

### IP-only (no domain)

No domain needed — just use the machine's IP:

```bash
./scripts/setup-gpu-host.sh --api-key my-secret-key
```

Caddy is not used; the GPU service listens directly on port 8000 over plain HTTP, and the Reflector machine connects via `http://<GPU_IP>:8000`.

### CPU-only (no NVIDIA GPU)

Works on any machine, though transcription will be slower:

```bash
./scripts/setup-gpu-host.sh --cpu --domain gpu.example.com --api-key my-secret-key
```
## DNS Resolution

The Reflector server must be able to reach the GPU host by name or IP.

| Setup | DNS method | `TRANSCRIPT_URL` example |
|-------|------------|--------------------------|
| Public domain | DNS A record | `https://gpu.example.com` |
| Internal domain | `/etc/hosts` on both machines | `https://gpu.internal` |
| IP only | No DNS needed | `http://192.168.1.200:8000` |

For internal domains, add the GPU machine's IP to `/etc/hosts` on the Reflector machine:

```bash
echo "192.168.1.200 gpu.internal" | sudo tee -a /etc/hosts
```

If the Reflector server runs in Docker, the containers resolve DNS through the host (Docker's default DNS behavior), so adding the entry to the host's `/etc/hosts` is sufficient.
## Multi-CA Setup

When your Reflector instance has its own CA (for `reflector.local`) and the GPU host has a different CA:

**On the GPU machine:**

```bash
./scripts/generate-certs.sh gpu.local
./scripts/setup-gpu-host.sh --domain gpu.local --custom-ca certs/ --api-key my-key
```

**On the Reflector machine:**

```bash
# Your local CA for reflector.local + the GPU host's CA
./scripts/setup-selfhosted.sh --hosted --garage --caddy \
  --domain reflector.local \
  --custom-ca certs/ \
  --extra-ca /path/to/gpu-machine-ca.crt
```

The `--extra-ca` flag appends the GPU host's CA to the trust bundle. Backend containers trust both CAs — your local domain works AND outbound calls to the GPU host succeed.

You can repeat `--extra-ca` for multiple remote services:

```bash
--extra-ca /path/to/gpu-ca.crt --extra-ca /path/to/llm-ca.crt
```
## API Key Authentication

The GPU service uses Bearer token authentication via `REFLECTOR_GPU_APIKEY`:

```bash
# Test from the Reflector machine
curl -s https://gpu.example.com/docs  # No auth needed for docs
curl -s -X POST https://gpu.example.com/v1/audio/transcriptions \
  -H "Authorization: Bearer <my-secret-key>" \ #gitleaks:allow
  -F "file=@audio.wav"
```

If `REFLECTOR_GPU_APIKEY` is not set, the service accepts all requests (open access). Always use `--api-key` for internet-facing deployments.

The same key goes in Reflector's `server/.env` as `TRANSCRIPT_MODAL_API_KEY` and `DIARIZATION_MODAL_API_KEY`.
## Files

| File | Checked in? | Purpose |
|------|-------------|---------|
| `docker-compose.gpu-host.yml` | Yes | Static compose file with profiles (`gpu`, `cpu`, `caddy`) |
| `.env.gpu-host` | No (generated) | Environment variables (HF_TOKEN, API key, ports) |
| `Caddyfile.gpu-host` | No (generated) | Caddy config (only when using HTTPS) |
| `docker-compose.gpu-ca.yml` | No (generated) | CA cert mounts override (only with `--custom-ca`) |
| `certs/` | No (generated) | Staged certificates (when using `--custom-ca`) |

The compose file is checked into the repo — you can read it to understand exactly what runs. The script only generates the env vars, Caddyfile, and CA overrides. Profiles control which service starts:

```bash
# What the script does under the hood:
docker compose -f docker-compose.gpu-host.yml --profile gpu --profile caddy \
  --env-file .env.gpu-host up -d

# CPU mode:
docker compose -f docker-compose.gpu-host.yml --profile cpu --profile caddy \
  --env-file .env.gpu-host up -d
```

Both the `gpu` and `cpu` services get the network alias `transcription`, so Caddy's config works with either.
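The flag-to-profile mapping can be sketched as follows (an illustrative helper, not the script's actual code; `setup-gpu-host.sh` is authoritative):

```shell
# Build the docker compose invocation from setup flags (illustrative)
compose_args() {
  mode="$1"      # "gpu" or "cpu" (--cpu selects the cpu profile)
  domain="$2"    # non-empty enables the caddy profile for HTTPS
  args="-f docker-compose.gpu-host.yml --profile $mode"
  if [ -n "$domain" ]; then
    args="$args --profile caddy"
  fi
  echo "docker compose $args --env-file .env.gpu-host up -d"
}

compose_args gpu gpu.example.com   # GPU mode behind Caddy
compose_args cpu ""                # CPU mode, no domain, no Caddy
```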
## Management

```bash
# View logs
docker compose -f docker-compose.gpu-host.yml --profile gpu logs -f gpu

# Restart
docker compose -f docker-compose.gpu-host.yml --profile gpu restart gpu

# Stop
docker compose -f docker-compose.gpu-host.yml --profile gpu --profile caddy down

# Re-run setup
./scripts/setup-gpu-host.sh [same flags]

# Rebuild after code changes
docker compose -f docker-compose.gpu-host.yml --profile gpu build gpu
docker compose -f docker-compose.gpu-host.yml --profile gpu up -d gpu
```

If you deployed with `--custom-ca`, include the CA override in manual commands:

```bash
docker compose -f docker-compose.gpu-host.yml -f docker-compose.gpu-ca.yml \
  --profile gpu logs -f gpu
```
## Troubleshooting

### GPU service won't start

Check the logs:

```bash
docker compose -f docker-compose.gpu-host.yml logs gpu
```

Common causes:

- NVIDIA driver not installed or `nvidia-container-toolkit` missing
- `HF_TOKEN` not set (diarization model download fails)
- Port already in use

### Reflector can't connect to the GPU host

From the Reflector machine:

```bash
# Test HTTPS connectivity
curl -v https://gpu.example.com/docs

# If using a custom CA, test with the CA given explicitly
curl --cacert /path/to/gpu-ca.crt https://gpu.internal/docs
```

From inside the Reflector container:

```bash
docker compose exec server python -c "
import httpx
r = httpx.get('https://gpu.internal/docs')
print(r.status_code)
"
```

### SSL: CERTIFICATE_VERIFY_FAILED

The Reflector backend doesn't trust the GPU host's CA. Fix:

```bash
# Re-run Reflector setup with the GPU host's CA
./scripts/setup-selfhosted.sh --hosted --extra-ca /path/to/gpu-ca.crt
```

### Diarization returns errors

- Accept the pyannote model licenses on HuggingFace:
  - https://huggingface.co/pyannote/speaker-diarization-3.1
  - https://huggingface.co/pyannote/segmentation-3.0
- Verify `HF_TOKEN` is set in `.env.gpu-host`