mirror of
https://github.com/Monadical-SAS/reflector.git
synced 2025-12-21 12:49:06 +00:00
138 lines
3.2 KiB
Markdown
138 lines
3.2 KiB
Markdown
# Local Development GPU Setup
|
|
|
|
Run transcription and diarization locally for development/testing.
|
|
|
|
> **For production deployment**, see the [Self-Hosted GPU Setup Guide](../../docs/docs/installation/self-hosted-gpu-setup.md).
|
|
|
|
## Prerequisites
|
|
|
|
1. **Python 3.12+** and **uv** package manager
|
|
2. **FFmpeg** installed and on PATH
|
|
3. **HuggingFace account** with access to pyannote models
|
|
|
|
### Accept Pyannote Licenses (Required)
|
|
|
|
Before first run, accept licenses for these gated models (logged into HuggingFace):
|
|
- https://hf.co/pyannote/speaker-diarization-3.1
|
|
- https://hf.co/pyannote/segmentation-3.0
|
|
|
|
## Quick Start
|
|
|
|
### 1. Install dependencies
|
|
|
|
```bash
|
|
cd gpu/self_hosted
|
|
uv sync
|
|
```
|
|
|
|
### 2. Start the GPU service
|
|
|
|
```bash
|
|
cd gpu/self_hosted
|
|
HF_TOKEN=<your-huggingface-token> \
|
|
REFLECTOR_GPU_APIKEY=dev-key-12345 \
|
|
.venv/bin/uvicorn main:app --host 0.0.0.0 --port 8000
|
|
```
|
|
|
|
Note: The `.env` file is NOT auto-loaded. Pass env vars explicitly or use:
|
|
```bash
|
|
export HF_TOKEN=<your-token>
|
|
export REFLECTOR_GPU_APIKEY=dev-key-12345
|
|
.venv/bin/uvicorn main:app --host 0.0.0.0 --port 8000
|
|
```
|
|
|
|
### 3. Configure Reflector to use local GPU
|
|
|
|
Edit `server/.env`:
|
|
|
|
```bash
|
|
# Transcription - local GPU service
|
|
TRANSCRIPT_BACKEND=modal
|
|
TRANSCRIPT_URL=http://host.docker.internal:8000
|
|
TRANSCRIPT_MODAL_API_KEY=dev-key-12345
|
|
|
|
# Diarization - local GPU service
|
|
DIARIZATION_BACKEND=modal
|
|
DIARIZATION_URL=http://host.docker.internal:8000
|
|
DIARIZATION_MODAL_API_KEY=dev-key-12345
|
|
```
|
|
|
|
Note: Use `host.docker.internal` because Reflector server runs in Docker.
|
|
|
|
### 4. Restart Reflector server
|
|
|
|
```bash
|
|
cd server
|
|
docker compose restart server worker
|
|
```
|
|
|
|
## Testing
|
|
|
|
### Test transcription
|
|
|
|
```bash
|
|
curl -s -X POST http://localhost:8000/v1/audio/transcriptions \
|
|
-H "Authorization: Bearer dev-key-12345" \
|
|
-F "file=@/path/to/audio.wav" \
|
|
-F "language=en"
|
|
```
|
|
|
|
### Test diarization
|
|
|
|
```bash
|
|
curl -s -X POST "http://localhost:8000/diarize?audio_file_url=<audio-url>" \
|
|
-H "Authorization: Bearer dev-key-12345"
|
|
```
|
|
|
|
## Platform Notes
|
|
|
|
### macOS (ARM)
|
|
|
|
Docker build fails - CUDA packages are x86_64 only. Use local Python instead:
|
|
```bash
|
|
uv sync
|
|
HF_TOKEN=xxx REFLECTOR_GPU_APIKEY=xxx .venv/bin/uvicorn main:app --host 0.0.0.0 --port 8000
|
|
```
|
|
|
|
### Linux with NVIDIA GPU
|
|
|
|
Docker works with CUDA acceleration:
|
|
```bash
|
|
docker compose up -d
|
|
```
|
|
|
|
### CPU-only
|
|
|
|
Works on any platform, just slower. PyTorch auto-detects and falls back to CPU.
|
|
|
|
## Switching Back to Modal.com
|
|
|
|
Edit `server/.env`:
|
|
|
|
```bash
|
|
TRANSCRIPT_BACKEND=modal
|
|
TRANSCRIPT_URL=https://monadical-sas--reflector-transcriber-parakeet-web.modal.run
|
|
TRANSCRIPT_MODAL_API_KEY=<modal-api-key>
|
|
|
|
DIARIZATION_BACKEND=modal
|
|
DIARIZATION_URL=https://monadical-sas--reflector-diarizer-web.modal.run
|
|
DIARIZATION_MODAL_API_KEY=<modal-api-key>
|
|
```
|
|
|
|
## Troubleshooting
|
|
|
|
### "Could not download pyannote pipeline"
|
|
- Accept model licenses at HuggingFace (see Prerequisites)
|
|
- Verify HF_TOKEN is set and valid
|
|
|
|
### Service won't start
|
|
- Check port 8000 is free: `lsof -i :8000`
|
|
- Kill orphan processes if needed
|
|
|
|
### Transcription returns empty text
|
|
- Ensure audio contains speech (not just tones/silence)
|
|
- Check audio format is supported (wav, mp3, etc.)
|
|
|
|
### Deprecation warnings from torchaudio/pyannote
|
|
- Safe to ignore - doesn't affect functionality
|