feat: add local pyannote file diarization processor (#858)

* feat: add local pyannote file diarization processor

Enables file diarization without Modal by using pyannote.audio locally.
Downloads model bundle from S3 on first use, caches locally, patches
config to use local paths. Set DIARIZATION_BACKEND=pyannote to enable.

* fix: standalone setup enables pyannote diarization and public mode

Replace DIARIZATION_ENABLED=false with DIARIZATION_BACKEND=pyannote so
file uploads get speaker diarization out of the box. Add PUBLIC_MODE=true
so unauthenticated users can list/browse transcripts.

* fix: touch env files before first compose_cmd in standalone setup

docker-compose.yml references www/.env.local as env_file, but the
setup script only creates it in step 4. compose_cmd calls in step 3
(Garage) fail on a fresh clone when the file doesn't exist yet.

* feat: standalone uses self-hosted GPU service for transcription+diarization

Replace in-process pyannote approach with self-hosted gpu/self_hosted/ service.
Same HTTP API as Modal — just TRANSCRIPT_URL/DIARIZATION_URL point to local container.

- Add gpu/self_hosted/Dockerfile.cpu (GPU Dockerfile minus NVIDIA CUDA)
- Add S3 model bundle fallback in diarizer.py when HF_TOKEN not set
- Add gpu service to docker-compose.standalone.yml with compose env overrides
- Fix /browse empty in PUBLIC_MODE (search+list queries filtered out roomless transcripts)
- Remove audio_diarization_pyannote.py, file_diarization_pyannote.py and tests
- Remove pyannote-audio from server local deps

* fix: allow unauthenticated GPU requests when no API key configured

OAuth2PasswordBearer with auto_error=True rejects requests without
Authorization header before apikey_auth can check if auth is needed.

* fix: rename standalone gpu service to cpu to match Dockerfile.cpu usage

* docs: add programmatic testing section and fix gpu->cpu naming in setup script/docs

- Add "Testing programmatically" section to standalone docs with curl commands
  for creating transcript, uploading audio, polling status, checking result
- Fix setup-standalone.sh to reference `cpu` service (was still `gpu` after rename)
- Update all docs references from gpu to cpu service naming

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
This commit is contained in:
2026-02-11 12:41:32 -05:00
committed by GitHub
parent ec4f356b4c
commit adc4c20bf4
12 changed files with 248 additions and 777 deletions

View File

@@ -184,8 +184,10 @@ ENVEOF
env_set "$SERVER_ENV" "CELERY_BROKER_URL" "redis://redis:6379/1"
env_set "$SERVER_ENV" "CELERY_RESULT_BACKEND" "redis://redis:6379/1"
env_set "$SERVER_ENV" "AUTH_BACKEND" "none"
env_set "$SERVER_ENV" "TRANSCRIPT_BACKEND" "whisper"
env_set "$SERVER_ENV" "DIARIZATION_ENABLED" "false"
env_set "$SERVER_ENV" "PUBLIC_MODE" "true"
# TRANSCRIPT_BACKEND, TRANSCRIPT_URL, DIARIZATION_BACKEND, DIARIZATION_URL
# are set via docker-compose.standalone.yml `environment:` overrides — not written here
# so we don't clobber the user's server/.env for non-standalone use.
env_set "$SERVER_ENV" "TRANSLATION_BACKEND" "passthrough"
env_set "$SERVER_ENV" "LLM_URL" "$LLM_URL_VALUE"
env_set "$SERVER_ENV" "LLM_MODEL" "$MODEL"
@@ -305,7 +307,7 @@ step_services() {
fi
# server runs alembic migrations on startup automatically (see runserver.sh)
compose_cmd up -d postgres redis garage server worker beat web
compose_cmd up -d postgres redis garage cpu server worker beat web
ok "Containers started"
info "Server is running migrations (alembic upgrade head)..."
}
@@ -316,6 +318,26 @@ step_services() {
step_health() {
info "Step 6: Health checks"
# CPU service may take a while on first start (model download + load).
# No host port exposed — check via docker exec.
info "Waiting for CPU service (first start downloads ~1GB of models)..."
local cpu_ok=false
for i in $(seq 1 120); do
if compose_cmd exec -T cpu curl -sf http://localhost:8000/docs > /dev/null 2>&1; then
cpu_ok=true
break
fi
echo -ne "\r Waiting for CPU service... ($i/120)"
sleep 5
done
echo ""
if [[ "$cpu_ok" == "true" ]]; then
ok "CPU service healthy (transcription + diarization)"
else
warn "CPU service not ready yet — it will keep loading in the background"
warn "Check with: docker compose logs cpu"
fi
wait_for_url "http://localhost:1250/health" "Server API" 60 3
echo ""
ok "Server API healthy"
@@ -356,6 +378,10 @@ main() {
LLM_URL_VALUE=""
OLLAMA_PROFILE=""
# docker-compose.yml may reference env_files that don't exist yet;
# touch them so compose_cmd works before the steps that populate them.
touch "$SERVER_ENV" "$WWW_ENV"
step_llm
echo ""
step_server_env