Compare commits

..

26 Commits

Author SHA1 Message Date
Igor Loskutov
26f1e5f6dd feat: reduce Hatchet payload size by removing words from topic chunk workflows
Remove ~6.5MB of redundant Word data from Hatchet task boundaries:
- Remove words from TopicChunkInput/TopicChunkResult (child workflow I/O)
- detect_topics maps words from local chunks by chunk_index instead
- TopicsResult carries empty transcript words (persisted to DB already)
- extract_subjects refetches topics from DB instead of task output
- Clear topics at detect_topics start for retry idempotency
2026-02-12 09:50:26 -05:00
b468427f1b feat: local llm support + standalone-script doc/draft (#856)
* feat: local LLM via Ollama + structured output response_format

- Add setup script (scripts/setup-local-llm.sh) for one-command Ollama setup
  Mac: native Metal GPU, Linux: containerized via docker-compose profiles
- Add ollama-gpu and ollama-cpu docker-compose profiles for Linux
- Add extra_hosts to server/hatchet-worker-llm for host.docker.internal
- Pass response_format JSON schema in StructuredOutputWorkflow.extract()
  enabling grammar-based constrained decoding on Ollama/llama.cpp/vLLM/OpenAI
- Update .env.example with Ollama as default LLM option
- Add Ollama PRD and local dev setup docs

* refactor: move Ollama services to docker-compose.standalone.yml

Ollama profiles (ollama-gpu, ollama-cpu) are only for Linux standalone
deployment. Mac devs never use them. Separate file keeps the main
compose clean and provides a natural home for future standalone services
(MinIO, etc.).

Linux: docker compose -f docker-compose.yml -f docker-compose.standalone.yml --profile ollama-gpu up -d
Mac: docker compose up -d (native Ollama, no standalone file needed)

* fix: correct PRD goal (demo/eval, not dev replacement) and processor naming

* chore: remove completed PRD, rename setup doc, drop response_format tests

- Remove docs/01_ollama.prd.md (implementation complete)
- Rename local-dev-setup.md -> standalone-local-setup.md
- Remove TestResponseFormat class from test_llm_retry.py

* docs: resolve standalone storage step — skip S3 for live-only mode

* docs: add TASKS.md for standalone env defaults + setup script work

* feat: add unified setup-local-dev.sh for standalone deployment

Single script takes fresh clone to working Reflector: Ollama/LLM setup,
env file generation (server/.env + www/.env.local), docker compose up,
health checks. No Hatchet in standalone — live pipeline is pure Celery.

* chore: rename to setup-standalone, remove redundant setup-local-llm.sh

* feat: add custom S3 endpoint support + Garage standalone storage

Add TRANSCRIPT_STORAGE_AWS_ENDPOINT_URL setting to enable S3-compatible
backends (Garage, MinIO). When set, uses path-style addressing and
routes all requests to the custom endpoint. When unset, AWS behavior
is unchanged.

- AwsStorage: accept aws_endpoint_url, pass to all 6 session.client()
  calls, configure path-style addressing and base_url
- Fix 4 direct AwsStorage constructions in Hatchet workflows to pass
  endpoint_url (would have silently targeted wrong endpoint)
- Standalone: add Garage service to docker-compose.standalone.yml,
  setup script initializes layout/bucket/key and writes credentials
- Fix compose_cmd() bug: Mac path was missing standalone yml
- garage.toml template with runtime secret generation via openssl

* fix: standalone setup — garage config, symlink handling, healthcheck

- garage.toml: fix rpc_secret field name (was secret_transmitter),
  move to top-level per Garage v1.1.0 spec, remove unused [s3_web]
- setup-standalone.sh: resolve symlinked .env files before writing,
  always ensure all standalone-critical vars via env_set,
  fix garage key create/info syntax (positional arg, not --name),
  avoid overwriting key secret with "(redacted)" on re-run,
  use compose_cmd in health check
- docker-compose.standalone.yml: fix garage healthcheck (no curl in
  image, use /garage stats instead)

* docs: update standalone md — symlink handling, garage config template

* docs: add troubleshooting section + port conflict check in setup script

Port conflicts from stale next dev / other worktree processes silently
shadow Docker container port mappings, causing env vars to appear ignored.

* fix: invalidate transcript query on STATUS websocket event

Without this, the processing page never redirects after completion
because the redirect logic watches the REST query data, not the
WebSocket status state.

Cherry-picked from feat-dag-progress (faec509a).

* fix: local env setup (#855)

* Ensure rate limit

* Increase nextjs compilation speed

* Fix daily no content handling

* Simplify daily webhook creation

* Fix webhook request validation

* feat: add local pyannote file diarization processor (#858)

* feat: add local pyannote file diarization processor

Enables file diarization without Modal by using pyannote.audio locally.
Downloads model bundle from S3 on first use, caches locally, patches
config to use local paths. Set DIARIZATION_BACKEND=pyannote to enable.

* fix: standalone setup enables pyannote diarization and public mode

Replace DIARIZATION_ENABLED=false with DIARIZATION_BACKEND=pyannote so
file uploads get speaker diarization out of the box. Add PUBLIC_MODE=true
so unauthenticated users can list/browse transcripts.

* fix: touch env files before first compose_cmd in standalone setup

docker-compose.yml references www/.env.local as env_file, but the
setup script only creates it in step 4. compose_cmd calls in step 3
(Garage) fail on a fresh clone when the file doesn't exist yet.

* feat: standalone uses self-hosted GPU service for transcription+diarization

Replace in-process pyannote approach with self-hosted gpu/self_hosted/ service.
Same HTTP API as Modal — just TRANSCRIPT_URL/DIARIZATION_URL point to local container.

- Add gpu/self_hosted/Dockerfile.cpu (GPU Dockerfile minus NVIDIA CUDA)
- Add S3 model bundle fallback in diarizer.py when HF_TOKEN not set
- Add gpu service to docker-compose.standalone.yml with compose env overrides
- Fix /browse empty in PUBLIC_MODE (search+list queries filtered out roomless transcripts)
- Remove audio_diarization_pyannote.py, file_diarization_pyannote.py and tests
- Remove pyannote-audio from server local deps

* fix: allow unauthenticated GPU requests when no API key configured

OAuth2PasswordBearer with auto_error=True rejects requests without
Authorization header before apikey_auth can check if auth is needed.

* fix: rename standalone gpu service to cpu to match Dockerfile.cpu usage

* docs: add programmatic testing section and fix gpu->cpu naming in setup script/docs

- Add "Testing programmatically" section to standalone docs with curl commands
  for creating transcript, uploading audio, polling status, checking result
- Fix setup-standalone.sh to reference `cpu` service (was still `gpu` after rename)
- Update all docs references from gpu to cpu service naming

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>

* Fix websocket disconnect errors

* Fix event loop is closed in Celery workers

* Allow reprocessing idle multitrack transcripts

* feat: add local pyannote file diarization processor

Enables file diarization without Modal by using pyannote.audio locally.
Downloads model bundle from S3 on first use, caches locally, patches
config to use local paths. Set DIARIZATION_BACKEND=pyannote to enable.

* feat: standalone uses self-hosted GPU service for transcription+diarization

Replace in-process pyannote approach with self-hosted gpu/self_hosted/ service.
Same HTTP API as Modal — just TRANSCRIPT_URL/DIARIZATION_URL point to local container.

- Add gpu/self_hosted/Dockerfile.cpu (GPU Dockerfile minus NVIDIA CUDA)
- Add S3 model bundle fallback in diarizer.py when HF_TOKEN not set
- Add gpu service to docker-compose.standalone.yml with compose env overrides
- Fix /browse empty in PUBLIC_MODE (search+list queries filtered out roomless transcripts)
- Remove audio_diarization_pyannote.py, file_diarization_pyannote.py and tests
- Remove pyannote-audio from server local deps

* fix: set source_kind to FILE on audio file upload

The upload endpoint left source_kind as the default LIVE even when
a file was uploaded. Now sets it to FILE when the upload completes.

* Add hatchet env vars

* fix: improve port conflict detection and ollama model check in standalone setup

- Filter OrbStack/Docker Desktop PIDs from port conflict check (false positives on Mac)
- Check all infra ports (5432, 6379, 3900, 3903) not just app ports
- Fix ollama model detection to match on name column only
- Document OrbStack and cross-project port conflicts in troubleshooting

* fix: processing page auto-redirect after file upload completes

Three fixes for the processing page not redirecting when status becomes "ended":

- Add useWebSockets to processing page so it receives STATUS events
- Remove OAuth2PasswordBearer from auth_none — broke WebSocket endpoints (500)
- Reconnect stale Redis in ws_manager when Celery worker reuses dead event loop

* fix: mock Celery broker in idle transcript validation test

test_validation_idle_transcript_with_recording_allowed called
validate_transcript_for_processing without mocking
task_is_scheduled_or_active, which attempts a real Celery
broker connection (AMQP port 5672). Other tests in the same
file already mock this — apply the same pattern here.

* Enable server host mode

* Fix webrtc connection

* Remove turbopack

* fix: standalone GPU service connectivity with host network mode

Server runs with network_mode: host and can't resolve Docker service
names. Publish cpu port as 8100 on host, point server at localhost:8100.
Worker stays on bridge network using cpu:8000. Add dummy
TRANSCRIPT_MODAL_API_KEY since OpenAI SDK requires it even for local
endpoints.

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
Co-authored-by: Sergey Mankovsky <sergey@mankovsky.dev>
2026-02-11 18:20:36 -05:00
cd2255cfbc chore(main): release 0.33.0 (#847) 2026-02-06 18:12:06 -05:00
15ab2e306e feat: Daily+hatchet default (#846)
* feat: set Daily as default video platform

Daily.co has been battle-tested and is ready to be the default.
Whereby remains available for rooms that explicitly set it.

* feat: enforce Hatchet for all multitrack processing

Remove use_celery option from rooms - multitrack (Daily) recordings
now always use Hatchet workflows. Celery remains for single-track
(Whereby) file processing only.

- Remove use_celery column from room table
- Simplify dispatch logic to always use Hatchet for multitracks
- Update tests to mock Hatchet instead of Celery

* fix: update whereby test to patch Hatchet instead of removed Celery import

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2026-02-05 18:38:08 -05:00
1ce1c7a910 fix: websocket tests (#825)
* fix websocket tests

* fix: restore timeout and fix celery test infrastructure

- Re-add timeout=1.0 to ws_manager pubsub loop (prevents CPU spin?)
- Use Redis for Celery tests (memory:// broker doesn't support chords)
- Add timeout param to in-memory subscriber mock
- Remove duplicate celery_includes fixture from rtc_ws tests

* fix: remove redundant inline imports in test files

* fix: update gitleaks ignore for moved s3_key line

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2026-02-05 14:23:31 -05:00
Rémi Pauchet
984795357e - fix nvidia repo blocked by apt (sha1) (#845)
- use build cache for apt and uv
- limit concurency for uv to prevent crashes with too many cores
2026-02-05 13:59:34 -05:00
fa3cf5da0f chore(main): release 0.32.2 (#842) 2026-02-03 22:05:22 -05:00
8707c6694a fix: use Daily API recording.duration as master source for transcript duration (#844)
Set duration early in get_participants from Daily API (seconds -> ms),
ensuring post_zulip has the value before mixdown_tracks completes.

Removes redundant duration update from mixdown_tracks.

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2026-02-03 17:15:03 -05:00
4acde4b7fd fix: increase TIMEOUT_MEDIUM from 2m to 5m for LLM tasks (#843)
Topic detection was timing out on longer transcripts when LLM
responses are slow. This affects detect_chunk_topic and other
LLM-calling tasks that use TIMEOUT_MEDIUM.

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2026-02-03 16:05:16 -05:00
a2ed7d60d5 fix: make caddy optional (#841) 2026-02-03 00:18:47 +01:00
a08f94a5bf chore(main): release 0.32.1 (#840) 2026-01-30 17:34:48 -05:00
Igor Loskutov
c05d1f03cd fix: match httpx pad with hatchet audio timeout 2026-01-30 15:56:18 -05:00
Igor Loskutov
23eb1371cb fix: daily multitrack pipeline finalze dependency fix 2026-01-30 15:19:27 -05:00
2592e369f6 chore(main): release 0.32.0 (#838) 2026-01-30 13:13:59 -05:00
7fde64e252 feat: modal padding (#837)
* Add Modal backend for audio padding

- Create reflector_padding.py Modal deployment (CPU-based)
- Add PaddingWorkflow with conditional Modal/local backend
- Update deploy-all.sh to include padding deployment

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2026-01-30 13:11:51 -05:00
2ca624f052 chore(main): release 0.31.0 (#835) 2026-01-26 13:07:29 -05:00
fc3ef6c893 feat: mixdown optional (#834)
* optional mixdown

* optional mixdown

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2026-01-23 15:51:18 -05:00
5d26461477 chore(main): release 0.30.0 (#832) 2026-01-23 13:58:33 -05:00
6c175a11d8 feat: brady bunch (#816)
* brady bunch PRD/tasks

* clean dead daily.co code

* brady bunch prototype (no-mistakes)

* brady bunch prototype (no-mistakes) review

* self-review

* daily poll time match (no-mistakes)

* daily poll self-review (no-mistakes)

* daily poll self-review (no-mistakes)

* daily co doc

* cleanup

* cleanup

* self-review (no-mistakes)

* self-review (no-mistakes)

* self-review

* self-review

* ui typefix

* dupe calls error handling proper

* daily reflector data model doc

* logging style fix

* migration merge

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2026-01-23 12:33:06 -05:00
6e786b7631 hatchet processing resilence several fixes (#831)
Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2026-01-22 19:03:33 -05:00
8fc8d8bf4a chore(main): release 0.29.0 (#828) 2026-01-22 12:52:39 -05:00
c723752b7e feat: set hatchet as default for multitracks (#822)
* set hatchet as default for multitracks

* fix: pipeline routing tests for hatchet-default branch

- Create room with use_celery=True to force Celery backend in tests
- Link transcript to room to enable multitrack pipeline routing
- Fixes test failures caused by missing HATCHET_CLIENT_TOKEN in test env

* Update server/reflector/services/transcript_process.py

Co-authored-by: pr-agent-monadical[bot] <198624643+pr-agent-monadical[bot]@users.noreply.github.com>

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
Co-authored-by: pr-agent-monadical[bot] <198624643+pr-agent-monadical[bot]@users.noreply.github.com>
2026-01-21 17:05:03 -05:00
4dc49e5b25 chore(main): release 0.28.1 (#827) 2026-01-21 15:30:42 -05:00
23d2bc283d fix: ics non-sync bugfix (#823)
* ics non-sync bugfix

* fix tests

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2026-01-21 15:10:19 -05:00
c8743fdf1c fix webhook tests (#826)
Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2026-01-21 14:31:20 -05:00
8a293882ad timeout to untighten ws python loop (#821)
Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2026-01-20 16:29:09 -05:00
104 changed files with 4791 additions and 1625 deletions

1
.gitignore vendored
View File

@@ -1,5 +1,6 @@
.DS_Store
server/.env
server/.env.production
.env
Caddyfile
server/exportdanswer

View File

@@ -3,3 +3,5 @@ docs/docs/installation/auth-setup.md:curl-auth-header:250
docs/docs/installation/daily-setup.md:curl-auth-header:277
gpu/self_hosted/DEV_SETUP.md:curl-auth-header:74
gpu/self_hosted/DEV_SETUP.md:curl-auth-header:83
server/reflector/worker/process.py:generic-api-key:465
server/reflector/worker/process.py:generic-api-key:594

View File

@@ -1,5 +1,69 @@
# Changelog
## [0.33.0](https://github.com/Monadical-SAS/reflector/compare/v0.32.2...v0.33.0) (2026-02-05)
### Features
* Daily+hatchet default ([#846](https://github.com/Monadical-SAS/reflector/issues/846)) ([15ab2e3](https://github.com/Monadical-SAS/reflector/commit/15ab2e306eacf575494b4b5d2b2ad779d44a1c7f))
### Bug Fixes
* websocket tests ([#825](https://github.com/Monadical-SAS/reflector/issues/825)) ([1ce1c7a](https://github.com/Monadical-SAS/reflector/commit/1ce1c7a910b6c374115d2437b17f9d288ef094dc))
## [0.32.2](https://github.com/Monadical-SAS/reflector/compare/v0.32.1...v0.32.2) (2026-02-03)
### Bug Fixes
* increase TIMEOUT_MEDIUM from 2m to 5m for LLM tasks ([#843](https://github.com/Monadical-SAS/reflector/issues/843)) ([4acde4b](https://github.com/Monadical-SAS/reflector/commit/4acde4b7fdef88cc02ca12cf38c9020b05ed96ac))
* make caddy optional ([#841](https://github.com/Monadical-SAS/reflector/issues/841)) ([a2ed7d6](https://github.com/Monadical-SAS/reflector/commit/a2ed7d60d557b551a5b64e4dfd909b63a791d9fc))
* use Daily API recording.duration as master source for transcript duration ([#844](https://github.com/Monadical-SAS/reflector/issues/844)) ([8707c66](https://github.com/Monadical-SAS/reflector/commit/8707c6694a80c939b6214bbc13331741f192e082))
## [0.32.1](https://github.com/Monadical-SAS/reflector/compare/v0.32.0...v0.32.1) (2026-01-30)
### Bug Fixes
* daily multitrack pipeline finalze dependency fix ([23eb137](https://github.com/Monadical-SAS/reflector/commit/23eb1371cb9348c4b81eb12ad506b582f8a4799e))
* match httpx pad with hatchet audio timeout ([c05d1f0](https://github.com/Monadical-SAS/reflector/commit/c05d1f03cd8369fc06efd455527e50246887efd0))
## [0.32.0](https://github.com/Monadical-SAS/reflector/compare/v0.31.0...v0.32.0) (2026-01-30)
### Features
* modal padding ([#837](https://github.com/Monadical-SAS/reflector/issues/837)) ([7fde64e](https://github.com/Monadical-SAS/reflector/commit/7fde64e2529a1d37b0f7507c62d983a7bd0b5b89))
## [0.31.0](https://github.com/Monadical-SAS/reflector/compare/v0.30.0...v0.31.0) (2026-01-23)
### Features
* mixdown optional ([#834](https://github.com/Monadical-SAS/reflector/issues/834)) ([fc3ef6c](https://github.com/Monadical-SAS/reflector/commit/fc3ef6c8933231c731fad84e7477a476a6220a5e))
## [0.30.0](https://github.com/Monadical-SAS/reflector/compare/v0.29.0...v0.30.0) (2026-01-23)
### Features
* brady bunch ([#816](https://github.com/Monadical-SAS/reflector/issues/816)) ([6c175a1](https://github.com/Monadical-SAS/reflector/commit/6c175a11d8a3745095bfad06a4ad3ccdfd278433))
## [0.29.0](https://github.com/Monadical-SAS/reflector/compare/v0.28.1...v0.29.0) (2026-01-21)
### Features
* set hatchet as default for multitracks ([#822](https://github.com/Monadical-SAS/reflector/issues/822)) ([c723752](https://github.com/Monadical-SAS/reflector/commit/c723752b7e15aa48a41ad22856f147a5517d3f46))
## [0.28.1](https://github.com/Monadical-SAS/reflector/compare/v0.28.0...v0.28.1) (2026-01-21)
### Bug Fixes
* ics non-sync bugfix ([#823](https://github.com/Monadical-SAS/reflector/issues/823)) ([23d2bc2](https://github.com/Monadical-SAS/reflector/commit/23d2bc283d4d02187b250d2055103e0374ee93d6))
## [0.28.0](https://github.com/Monadical-SAS/reflector/compare/v0.27.0...v0.28.0) (2026-01-20)

View File

@@ -1,6 +1,8 @@
# Reflector Caddyfile
# Replace example.com with your actual domains
# CORS is handled by the backend - Caddy just proxies
# Reflector Caddyfile (optional reverse proxy)
# Use this only when you run Caddy via: docker compose -f docker-compose.prod.yml --profile caddy up -d
# If Coolify, Traefik, or nginx already use ports 80/443, do NOT start Caddy; point your proxy at web:3000 and server:1250.
#
# Replace example.com with your actual domains. CORS is handled by the backend - Caddy just proxies.
#
# For environment variable substitution, set:
# FRONTEND_DOMAIN=app.example.com

View File

@@ -1,9 +1,14 @@
# Production Docker Compose configuration
# Usage: docker compose -f docker-compose.prod.yml up -d
#
# Caddy (reverse proxy on ports 80/443) is OPTIONAL and behind the "caddy" profile:
# - With Caddy (self-hosted, you manage SSL): docker compose -f docker-compose.prod.yml --profile caddy up -d
# - Without Caddy (Coolify/Traefik/nginx already on 80/443): docker compose -f docker-compose.prod.yml up -d
# Then point your proxy at web:3000 (frontend) and server:1250 (API).
#
# Prerequisites:
# 1. Copy .env.example to .env and configure for both server/ and www/
# 2. Copy Caddyfile.example to Caddyfile and edit with your domains
# 2. If using Caddy: copy Caddyfile.example to Caddyfile and edit your domains
# 3. Deploy Modal GPU functions (see gpu/modal_deployments/deploy-all.sh)
services:
@@ -84,6 +89,8 @@ services:
retries: 3
caddy:
profiles:
- caddy
image: caddy:2-alpine
restart: unless-stopped
ports:

View File

@@ -0,0 +1,120 @@
# Standalone services for fully local deployment (no external dependencies).
# Usage: docker compose -f docker-compose.yml -f docker-compose.standalone.yml up -d
#
# On Linux with NVIDIA GPU, also pass: --profile ollama-gpu
# On Linux without GPU: --profile ollama-cpu
# On Mac: Ollama runs natively (Metal GPU) — no profile needed, services here unused.
services:
garage:
image: dxflrs/garage:v1.1.0
ports:
- "3900:3900" # S3 API
- "3903:3903" # Admin API
volumes:
- garage_data:/var/lib/garage/data
- garage_meta:/var/lib/garage/meta
- ./data/garage.toml:/etc/garage.toml:ro
restart: unless-stopped
healthcheck:
test: ["CMD", "/garage", "stats"]
interval: 10s
timeout: 5s
retries: 5
start_period: 5s
ollama:
image: ollama/ollama:latest
profiles: ["ollama-gpu"]
ports:
- "11434:11434"
volumes:
- ollama_data:/root/.ollama
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: all
capabilities: [gpu]
restart: unless-stopped
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:11434/api/tags"]
interval: 10s
timeout: 5s
retries: 5
ollama-cpu:
image: ollama/ollama:latest
profiles: ["ollama-cpu"]
ports:
- "11434:11434"
volumes:
- ollama_data:/root/.ollama
restart: unless-stopped
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:11434/api/tags"]
interval: 10s
timeout: 5s
retries: 5
# Override server/worker/beat to use self-hosted GPU service for transcription+diarization.
# compose `environment:` overrides values from `env_file:` — no need to edit server/.env.
server:
environment:
TRANSCRIPT_BACKEND: modal
TRANSCRIPT_URL: http://localhost:8100
TRANSCRIPT_MODAL_API_KEY: local
DIARIZATION_BACKEND: modal
DIARIZATION_URL: http://localhost:8100
worker:
environment:
TRANSCRIPT_BACKEND: modal
TRANSCRIPT_URL: http://cpu:8000
TRANSCRIPT_MODAL_API_KEY: local
DIARIZATION_BACKEND: modal
DIARIZATION_URL: http://cpu:8000
cpu:
build:
context: ./gpu/self_hosted
dockerfile: Dockerfile.cpu
ports:
- "8100:8000"
volumes:
- gpu_cache:/root/.cache
restart: unless-stopped
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8000/docs"]
interval: 15s
timeout: 5s
retries: 10
start_period: 120s
gpu-nvidia:
build:
context: ./gpu/self_hosted
profiles: ["gpu-nvidia"]
volumes:
- gpu_cache:/root/.cache
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: all
capabilities: [gpu]
restart: unless-stopped
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8000/docs"]
interval: 15s
timeout: 5s
retries: 10
start_period: 120s
volumes:
garage_data:
garage_meta:
ollama_data:
gpu_cache:

View File

@@ -2,8 +2,7 @@ services:
server:
build:
context: server
ports:
- 1250:1250
network_mode: host
volumes:
- ./server/:/app/
- /app/.venv
@@ -11,6 +10,12 @@ services:
- ./server/.env
environment:
ENTRYPOINT: server
DATABASE_URL: postgresql+asyncpg://reflector:reflector@localhost:5432/reflector
REDIS_HOST: localhost
CELERY_BROKER_URL: redis://localhost:6379/1
CELERY_RESULT_BACKEND: redis://localhost:6379/1
HATCHET_CLIENT_SERVER_URL: http://localhost:8889
HATCHET_CLIENT_HOST_PORT: localhost:7078
worker:
build:
@@ -22,6 +27,11 @@ services:
- ./server/.env
environment:
ENTRYPOINT: worker
HATCHET_CLIENT_SERVER_URL: http://hatchet:8888
HATCHET_CLIENT_HOST_PORT: hatchet:7077
depends_on:
redis:
condition: service_started
beat:
build:
@@ -33,6 +43,9 @@ services:
- ./server/.env
environment:
ENTRYPOINT: beat
depends_on:
redis:
condition: service_started
hatchet-worker-cpu:
build:
@@ -44,6 +57,8 @@ services:
- ./server/.env
environment:
ENTRYPOINT: hatchet-worker-cpu
HATCHET_CLIENT_SERVER_URL: http://hatchet:8888
HATCHET_CLIENT_HOST_PORT: hatchet:7077
depends_on:
hatchet:
condition: service_healthy
@@ -57,6 +72,8 @@ services:
- ./server/.env
environment:
ENTRYPOINT: hatchet-worker-llm
HATCHET_CLIENT_SERVER_URL: http://hatchet:8888
HATCHET_CLIENT_HOST_PORT: hatchet:7077
depends_on:
hatchet:
condition: service_healthy
@@ -75,10 +92,16 @@ services:
volumes:
- ./www:/app/
- /app/node_modules
- next_cache:/app/.next
env_file:
- ./www/.env.local
environment:
- NODE_ENV=development
- SERVER_API_URL=http://host.docker.internal:1250
extra_hosts:
- "host.docker.internal:host-gateway"
depends_on:
- server
postgres:
image: postgres:17
@@ -94,13 +117,14 @@ services:
- ./server/docker/init-hatchet-db.sql:/docker-entrypoint-initdb.d/init-hatchet-db.sql:ro
healthcheck:
test: ["CMD-SHELL", "pg_isready -d reflector -U reflector"]
interval: 10s
timeout: 10s
retries: 5
start_period: 10s
interval: 5s
timeout: 5s
retries: 10
start_period: 15s
hatchet:
image: ghcr.io/hatchet-dev/hatchet/hatchet-lite:latest
restart: on-failure
ports:
- "8889:8888"
- "7078:7077"
@@ -108,7 +132,7 @@ services:
postgres:
condition: service_healthy
environment:
DATABASE_URL: "postgresql://reflector:reflector@postgres:5432/hatchet?sslmode=disable"
DATABASE_URL: "postgresql://reflector:reflector@postgres:5432/hatchet?sslmode=disable&connect_timeout=30"
SERVER_AUTH_COOKIE_DOMAIN: localhost
SERVER_AUTH_COOKIE_INSECURE: "t"
SERVER_GRPC_BIND_ADDRESS: "0.0.0.0"
@@ -128,6 +152,5 @@ services:
retries: 5
start_period: 30s
networks:
default:
attachable: true
volumes:
next_cache:

View File

@@ -11,15 +11,15 @@ This page documents the Docker Compose configuration for Reflector. For the comp
The `docker-compose.prod.yml` includes these services:
| Service | Image | Purpose |
|---------|-------|---------|
| `web` | `monadicalsas/reflector-frontend` | Next.js frontend |
| `server` | `monadicalsas/reflector-backend` | FastAPI backend |
| `worker` | `monadicalsas/reflector-backend` | Celery worker for background tasks |
| `beat` | `monadicalsas/reflector-backend` | Celery beat scheduler |
| `redis` | `redis:7.2-alpine` | Message broker and cache |
| `postgres` | `postgres:17-alpine` | Primary database |
| `caddy` | `caddy:2-alpine` | Reverse proxy with auto-SSL |
| Service | Image | Purpose |
| ---------- | --------------------------------- | --------------------------------------------------------------------------- |
| `web` | `monadicalsas/reflector-frontend` | Next.js frontend |
| `server` | `monadicalsas/reflector-backend` | FastAPI backend |
| `worker` | `monadicalsas/reflector-backend` | Celery worker for background tasks |
| `beat` | `monadicalsas/reflector-backend` | Celery beat scheduler |
| `redis` | `redis:7.2-alpine` | Message broker and cache |
| `postgres` | `postgres:17-alpine` | Primary database |
| `caddy` | `caddy:2-alpine` | Reverse proxy with auto-SSL (optional; see [Caddy profile](#caddy-profile)) |
## Environment Files
@@ -30,6 +30,7 @@ Reflector uses two separate environment files:
Used by: `server`, `worker`, `beat`
Key variables:
```env
# Database connection
DATABASE_URL=postgresql+asyncpg://reflector:reflector@postgres:5432/reflector
@@ -54,6 +55,7 @@ TRANSCRIPT_MODAL_API_KEY=...
Used by: `web`
Key variables:
```env
# Domain configuration
SITE_URL=https://app.example.com
@@ -70,26 +72,42 @@ Note: `API_URL` is used client-side (browser), `SERVER_API_URL` is used server-s
## Volumes
| Volume | Purpose |
|--------|---------|
| `redis_data` | Redis persistence |
| `postgres_data` | PostgreSQL data |
| `server_data` | Uploaded files, local storage |
| `caddy_data` | SSL certificates |
| `caddy_config` | Caddy configuration |
| Volume | Purpose |
| --------------- | ----------------------------- |
| `redis_data` | Redis persistence |
| `postgres_data` | PostgreSQL data |
| `server_data` | Uploaded files, local storage |
| `caddy_data` | SSL certificates |
| `caddy_config` | Caddy configuration |
## Network
All services share the default network. The network is marked `attachable: true` to allow external containers (like Authentik) to join.
## Caddy profile
Caddy (ports 80 and 443) is **optional** and behind the `caddy` profile so it does not conflict with an existing reverse proxy (e.g. Coolify, Traefik, nginx).
- **With Caddy** (you want Reflector to handle SSL):
`docker compose -f docker-compose.prod.yml --profile caddy up -d`
- **Without Caddy** (Coolify or another proxy already on 80/443):
`docker compose -f docker-compose.prod.yml up -d`
Then configure your proxy to send traffic to `web:3000` (frontend) and `server:1250` (API).
## Common Commands
### Start all services
```bash
# Without Caddy (e.g. when using Coolify)
docker compose -f docker-compose.prod.yml up -d
# With Caddy as reverse proxy
docker compose -f docker-compose.prod.yml --profile caddy up -d
```
### View logs
```bash
# All services
docker compose -f docker-compose.prod.yml logs -f
@@ -99,6 +117,7 @@ docker compose -f docker-compose.prod.yml logs server --tail 50
```
### Restart a service
```bash
# Quick restart (doesn't reload .env changes)
docker compose -f docker-compose.prod.yml restart server
@@ -108,27 +127,32 @@ docker compose -f docker-compose.prod.yml up -d server
```
### Run database migrations
```bash
docker compose -f docker-compose.prod.yml exec server uv run alembic upgrade head
```
### Access database
```bash
docker compose -f docker-compose.prod.yml exec postgres psql -U reflector
```
### Pull latest images
```bash
docker compose -f docker-compose.prod.yml pull
docker compose -f docker-compose.prod.yml up -d
```
### Stop all services
```bash
docker compose -f docker-compose.prod.yml down
```
### Full reset (WARNING: deletes data)
```bash
docker compose -f docker-compose.prod.yml down -v
```
@@ -187,6 +211,7 @@ The Caddyfile supports environment variable substitution:
Set `FRONTEND_DOMAIN` and `API_DOMAIN` environment variables, or edit the file directly.
### Reload Caddy after changes
```bash
docker compose -f docker-compose.prod.yml exec caddy caddy reload --config /etc/caddy/Caddyfile
```

View File

@@ -26,7 +26,7 @@ flowchart LR
Before starting, you need:
- **Production server** - 4+ cores, 8GB+ RAM, public IP
- **Production server** - 4+ cores, 8GB+ RAM, public IP
- **Two domain names** - e.g., `app.example.com` (frontend) and `api.example.com` (backend)
- **GPU processing** - Choose one:
- Modal.com account, OR
@@ -60,16 +60,17 @@ Type: A Name: api Value: <your-server-ip>
Reflector requires GPU processing for transcription and speaker diarization. Choose one option:
| | **Modal.com (Cloud)** | **Self-Hosted GPU** |
|---|---|---|
| | **Modal.com (Cloud)** | **Self-Hosted GPU** |
| ------------ | --------------------------------- | ---------------------------- |
| **Best for** | No GPU hardware, zero maintenance | Own GPU server, full control |
| **Pricing** | Pay-per-use | Fixed infrastructure cost |
| **Pricing** | Pay-per-use | Fixed infrastructure cost |
### Option A: Modal.com (Serverless Cloud GPU)
#### Accept HuggingFace Licenses
Visit both pages and click "Accept":
- https://huggingface.co/pyannote/speaker-diarization-3.1
- https://huggingface.co/pyannote/segmentation-3.0
@@ -179,6 +180,7 @@ Save these credentials - you'll need them in the next step.
## Configure Environment
Reflector has two env files:
- `server/.env` - Backend configuration
- `www/.env` - Frontend configuration
@@ -190,6 +192,7 @@ nano server/.env
```
**Required settings:**
```env
# Database (defaults work with docker-compose.prod.yml)
DATABASE_URL=postgresql+asyncpg://reflector:reflector@postgres:5432/reflector
@@ -249,6 +252,7 @@ nano www/.env
```
**Required settings:**
```env
# Your domains
SITE_URL=https://app.example.com
@@ -266,7 +270,11 @@ FEATURE_REQUIRE_LOGIN=false
---
## Configure Caddy
## Reverse proxy (Caddy or existing)
**If Coolify, Traefik, or nginx already use ports 80/443** (e.g. Coolify on your host): skip Caddy. Start the stack without the Caddy profile (see [Start Services](#start-services) below), then point your proxy at `web:3000` (frontend) and `server:1250` (API).
**If you want Reflector to provide the reverse proxy and SSL:**
```bash
cp Caddyfile.example Caddyfile
@@ -289,10 +297,18 @@ Replace `example.com` with your domains. The `{$VAR:default}` syntax uses Caddy'
## Start Services
**Without Caddy** (e.g. Coolify already on 80/443):
```bash
docker compose -f docker-compose.prod.yml up -d
```
**With Caddy** (Reflector handles SSL):
```bash
docker compose -f docker-compose.prod.yml --profile caddy up -d
```
Wait for containers to start (first run may take 1-2 minutes to pull images and initialize).
---
@@ -300,18 +316,21 @@ Wait for containers to start (first run may take 1-2 minutes to pull images and
## Verify Deployment
### Check services
```bash
docker compose -f docker-compose.prod.yml ps
# All should show "Up"
```
### Test API
```bash
curl https://api.example.com/health
# Should return: {"status":"healthy"}
```
### Test Frontend
- Visit https://app.example.com
- You should see the Reflector interface
- Try uploading an audio file to test transcription
@@ -327,6 +346,7 @@ By default, Reflector is open (no login required). **Authentication is required
See [Authentication Setup](./auth-setup) for full Authentik OAuth configuration.
Quick summary:
1. Deploy Authentik on your server
2. Create OAuth provider in Authentik
3. Extract public key for JWT verification
@@ -358,6 +378,7 @@ DAILYCO_STORAGE_AWS_ROLE_ARN=<arn:aws:iam::ACCOUNT:role/DailyCo>
```
Reload env and restart:
```bash
docker compose -f docker-compose.prod.yml up -d server worker
```
@@ -367,35 +388,43 @@ docker compose -f docker-compose.prod.yml up -d server worker
## Troubleshooting
### Check logs for errors
```bash
docker compose -f docker-compose.prod.yml logs server --tail 20
docker compose -f docker-compose.prod.yml logs worker --tail 20
```
### Services won't start
```bash
docker compose -f docker-compose.prod.yml logs
```
### CORS errors in browser
- Verify `CORS_ORIGIN` in `server/.env` matches your frontend domain exactly (including `https://`)
- Reload env: `docker compose -f docker-compose.prod.yml up -d server`
### SSL certificate errors
### SSL certificate errors (when using Caddy)
- Caddy auto-provisions Let's Encrypt certificates
- Ensure ports 80 and 443 are open
- Ensure ports 80 and 443 are open and not used by another proxy
- Check: `docker compose -f docker-compose.prod.yml logs caddy`
- If port 80 is already in use (e.g. by Coolify), run without Caddy: `docker compose -f docker-compose.prod.yml up -d` and use your existing proxy
### Transcription not working
- Check Modal dashboard: https://modal.com/apps
- Verify URLs in `server/.env` match deployed functions
- Check worker logs: `docker compose -f docker-compose.prod.yml logs worker`
### "Login required" but auth not configured
- Set `FEATURE_REQUIRE_LOGIN=false` in `www/.env`
- Rebuild frontend: `docker compose -f docker-compose.prod.yml up -d --force-recreate web`
### Database migrations or connectivity issues
Migrations run automatically on server startup. To check database connectivity or debug migration failures:
```bash
@@ -408,4 +437,3 @@ docker compose -f docker-compose.prod.yml exec server uv run python -c "from ref
# Manually run migrations (if needed)
docker compose -f docker-compose.prod.yml exec server uv run alembic upgrade head
```

View File

@@ -0,0 +1,214 @@
---
sidebar_position: 2
title: Standalone Local Setup
---
# Standalone Local Setup
**The goal**: a clueless user clones the repo, runs one script, and has a working Reflector instance locally. No cloud accounts, no API keys, no manual env file editing.
```bash
git clone https://github.com/monadical-sas/reflector.git
cd reflector
./scripts/setup-standalone.sh
```
The script is idempotent — safe to re-run at any time. It detects what's already set up and skips completed steps.
## Prerequisites
- Docker / OrbStack / Docker Desktop (any)
- Mac (Apple Silicon) or Linux
- 16GB+ RAM (32GB recommended for 14B LLM models)
- **Mac only**: [Ollama](https://ollama.com/download) installed (`brew install ollama`)
## What the script does
### 1. LLM inference via Ollama
**Mac**: starts Ollama natively (Metal GPU acceleration). Pulls the LLM model. Docker containers reach it via `host.docker.internal:11434`.
**Linux**: starts containerized Ollama via `docker-compose.standalone.yml` profile (`ollama-gpu` with NVIDIA, `ollama-cpu` without). Pulls model inside the container.
### 2. Environment files
Generates `server/.env` and `www/.env.local` with standalone defaults:
**`server/.env`** — key settings:
| Variable | Value | Why |
|----------|-------|-----|
| `DATABASE_URL` | `postgresql+asyncpg://...@postgres:5432/reflector` | Docker-internal hostname |
| `REDIS_HOST` | `redis` | Docker-internal hostname |
| `CELERY_BROKER_URL` | `redis://redis:6379/1` | Docker-internal hostname |
| `AUTH_BACKEND` | `none` | No Authentik in standalone |
| `TRANSCRIPT_BACKEND` | `modal` | HTTP API to self-hosted CPU service |
| `TRANSCRIPT_URL` | `http://cpu:8000` | Docker-internal CPU service |
| `DIARIZATION_BACKEND` | `modal` | HTTP API to self-hosted CPU service |
| `DIARIZATION_URL` | `http://cpu:8000` | Docker-internal CPU service |
| `TRANSLATION_BACKEND` | `passthrough` | No Modal |
| `LLM_URL` | `http://host.docker.internal:11434/v1` (Mac) | Ollama endpoint |
**`www/.env.local`** — key settings:
| Variable | Value |
|----------|-------|
| `API_URL` | `http://localhost:1250` |
| `SERVER_API_URL` | `http://server:1250` |
| `WEBSOCKET_URL` | `ws://localhost:1250` |
| `FEATURE_REQUIRE_LOGIN` | `false` |
| `NEXTAUTH_SECRET` | `standalone-dev-secret-not-for-production` |
If env files already exist (including symlinks from worktree setup), the script resolves symlinks and ensures all standalone-critical vars are set. Existing vars not related to standalone are preserved.
### 3. Object storage (Garage)
Standalone uses [Garage](https://garagehq.deuxfleurs.fr/) — a lightweight S3-compatible object store running in Docker. The setup script starts Garage, initializes the layout, creates a bucket and access key, and writes the credentials to `server/.env`.
**`server/.env`** — storage settings added by the script:
| Variable | Value | Why |
|----------|-------|-----|
| `TRANSCRIPT_STORAGE_BACKEND` | `aws` | Uses the S3-compatible storage driver |
| `TRANSCRIPT_STORAGE_AWS_ENDPOINT_URL` | `http://garage:3900` | Docker-internal Garage S3 API |
| `TRANSCRIPT_STORAGE_AWS_BUCKET_NAME` | `reflector-media` | Created by the script |
| `TRANSCRIPT_STORAGE_AWS_REGION` | `garage` | Must match Garage config |
| `TRANSCRIPT_STORAGE_AWS_ACCESS_KEY_ID` | *(auto-generated)* | Created by `garage key create` |
| `TRANSCRIPT_STORAGE_AWS_SECRET_ACCESS_KEY` | *(auto-generated)* | Created by `garage key create` |
The `TRANSCRIPT_STORAGE_AWS_ENDPOINT_URL` setting enables S3-compatible backends. When set, the storage driver uses path-style addressing and routes all requests to the custom endpoint. When unset (production AWS), behavior is unchanged.
Garage config template lives at `scripts/garage.toml`. The setup script generates `data/garage.toml` (gitignored) with a random RPC secret and mounts it read-only into the container. Single-node, `replication_factor=1`.
> **Note**: Presigned URLs embed the Garage Docker hostname (`http://garage:3900`). This is fine — the server proxies S3 responses to the browser. Modal GPU workers cannot reach internal Garage, but standalone doesn't use Modal.
### 4. Transcription and diarization
Standalone runs the self-hosted ML service (`gpu/self_hosted/`) in a CPU-only Docker container named `cpu`. This is the same FastAPI service used for Modal.com GPU deployments, but built with `Dockerfile.cpu` (no NVIDIA CUDA dependencies). The compose service is named `cpu` (not `gpu`) to make clear it runs without GPU acceleration; the source code lives in `gpu/self_hosted/` because it's shared with the GPU deployment.
The `modal` backend name is reused — it just means "HTTP API client". Setting `TRANSCRIPT_URL` / `DIARIZATION_URL` to `http://cpu:8000` routes requests to the local container instead of Modal.com.
On first start, the service downloads pyannote speaker diarization models (~1GB) from a public S3 bundle. Models are cached in a Docker volume (`gpu_cache`) so subsequent starts are fast. No HuggingFace token or API key needed.
> **Performance**: CPU-only transcription and diarization work but are slow (~15 min for a 3 min file). For faster processing on Linux with NVIDIA GPU, use `--profile gpu-nvidia` instead (see `docker-compose.standalone.yml`).
### 5. Docker services
```bash
docker compose up -d postgres redis garage cpu server worker beat web
```
All services start in a single command. Garage and `cpu` are already started by earlier steps but included for idempotency. No Hatchet in standalone mode — LLM processing (summaries, topics, titles) runs via Celery tasks.
### 6. Database migrations
Run automatically by the `server` container on startup (`runserver.sh` calls `alembic upgrade head`). No manual step needed.
### 7. Health check
Verifies:
- CPU service responds (transcription + diarization ready)
- Server responds at `http://localhost:1250/health`
- Frontend serves at `http://localhost:3000`
- LLM endpoint reachable from inside containers
## Services
| Service | Port | Purpose |
|---------|------|---------|
| `server` | 1250 | FastAPI backend (runs migrations on start) |
| `web` | 3000 | Next.js frontend |
| `postgres` | 5432 | PostgreSQL database |
| `redis` | 6379 | Cache + Celery broker |
| `garage` | 3900, 3903 | S3-compatible object storage (S3 API + admin API) |
| `cpu` | — | Self-hosted transcription + diarization (CPU-only) |
| `worker` | — | Celery worker (live pipeline post-processing) |
| `beat` | — | Celery beat (scheduled tasks) |
## Testing programmatically
After the setup script completes, verify the full pipeline (upload, transcription, diarization, LLM summary) via the API:
```bash
# 1. Create a transcript
TRANSCRIPT_ID=$(curl -s -X POST 'http://localhost:1250/v1/transcripts' \
-H 'Content-Type: application/json' \
-d '{"name":"test-upload"}' | python3 -c "import sys,json; print(json.load(sys.stdin)['id'])")
echo "Created: $TRANSCRIPT_ID"
# 2. Upload an audio file (single-chunk upload)
curl -s "http://localhost:1250/v1/transcripts/${TRANSCRIPT_ID}/record/upload?chunk_number=0&total_chunks=1" \
-X POST -F "chunk=@/path/to/audio.mp3"
# 3. Poll until processing completes (status: ended or error)
while true; do
STATUS=$(curl -s "http://localhost:1250/v1/transcripts/${TRANSCRIPT_ID}" \
| python3 -c "import sys,json; print(json.load(sys.stdin)['status'])")
echo "Status: $STATUS"
case "$STATUS" in ended|error) break;; esac
sleep 10
done
# 4. Check the result
curl -s "http://localhost:1250/v1/transcripts/${TRANSCRIPT_ID}" | python3 -m json.tool
```
Expected result: status `ended`, auto-generated `title`, `short_summary`, `long_summary`, and `transcript` text with `Speaker 0` / `Speaker 1` labels.
CPU-only processing is slow (~15 min for a 3 min audio file). Diarization finishes in ~3 min, transcription takes the rest.
## Troubleshooting
### Port conflicts (most common issue)
If the frontend or backend behaves unexpectedly (e.g., env vars seem ignored, changes don't take effect), **check for port conflicts first**:
```bash
# Check what's listening on key ports
lsof -i :3000 # frontend
lsof -i :1250 # backend
lsof -i :5432 # postgres
lsof -i :3900 # Garage S3 API
lsof -i :6379 # Redis
# Kill stale processes on a port
lsof -ti :3000 | xargs kill
```
Common causes:
- A stale `next dev` or `pnpm dev` process from another terminal/worktree
- Another Docker Compose project (different worktree) with containers on the same ports — the setup script only manages its own project; containers from other projects must be stopped manually (`docker ps` to find them, `docker stop` to kill them)
The setup script checks ports 3000, 1250, 5432, 6379, 3900, 3903 for conflicts before starting services. It ignores OrbStack/Docker Desktop port forwarding processes (which always bind these ports but are not real conflicts).
### OrbStack false port-conflict warnings (Mac)
If you use OrbStack as your Docker runtime, `lsof` will show OrbStack binding ports like 3000, 1250, etc. even when no containers are running. This is OrbStack's port forwarding mechanism — not a real conflict. The setup script filters these out automatically.
### Re-enabling authentication
Standalone runs without authentication (`FEATURE_REQUIRE_LOGIN=false`, `AUTH_BACKEND=none`). To re-enable:
1. In `www/.env.local`: set `FEATURE_REQUIRE_LOGIN=true`, uncomment `AUTHENTIK_ISSUER` and `AUTHENTIK_REFRESH_TOKEN_URL`
2. In `server/.env`: set `AUTH_BACKEND=authentik` (or your backend), configure `AUTH_JWT_AUDIENCE`
3. Restart: `docker compose -f docker-compose.yml -f docker-compose.standalone.yml up -d --force-recreate web server`
## What's NOT covered
These require external accounts and infrastructure that can't be scripted:
- **Live meeting rooms** — requires Daily.co account, S3 bucket, IAM roles
- **Authentication** — requires Authentik deployment and OAuth configuration
- **Hatchet workflows** — requires separate Hatchet setup for multitrack processing
- **Production deployment** — see [Deployment Guide](./overview)
## Current status
All steps implemented. The setup script handles everything end-to-end:
- Step 1 (Ollama/LLM) — implemented
- Step 2 (environment files) — implemented
- Step 3 (object storage / Garage) — implemented
- Step 4 (transcription/diarization) — implemented (self-hosted GPU service)
- Steps 5-7 (Docker, migrations, health) — implemented
- **Unified script**: `scripts/setup-standalone.sh`

View File

@@ -131,6 +131,15 @@ if [ -z "$DIARIZER_URL" ]; then
fi
echo " -> $DIARIZER_URL"
echo ""
echo "Deploying padding (CPU audio processing via Modal SDK)..."
modal deploy reflector_padding.py
if [ $? -ne 0 ]; then
echo "Error: Failed to deploy padding. Check Modal dashboard for details."
exit 1
fi
echo " -> reflector-padding.pad_track (Modal SDK function)"
# --- Output Configuration ---
echo ""
echo "=========================================="
@@ -147,4 +156,6 @@ echo ""
echo "DIARIZATION_BACKEND=modal"
echo "DIARIZATION_URL=$DIARIZER_URL"
echo "DIARIZATION_MODAL_API_KEY=$API_KEY"
echo ""
echo "# Padding uses Modal SDK (requires MODAL_TOKEN_ID/SECRET in worker containers)"
echo "# --- End Modal Configuration ---"

View File

@@ -0,0 +1,277 @@
"""
Reflector GPU backend - audio padding
======================================
CPU-intensive audio padding service for adding silence to audio tracks.
Uses PyAV filter graph (adelay) for precise track synchronization.
IMPORTANT: This padding logic is duplicated from server/reflector/utils/audio_padding.py
for Modal deployment isolation (Modal can't import from server/reflector/). If you modify
the PyAV filter graph or padding algorithm, you MUST update both:
- gpu/modal_deployments/reflector_padding.py (this file)
- server/reflector/utils/audio_padding.py
Constants duplicated from server/reflector/utils/audio_constants.py for same reason.
"""
import os
import tempfile
from fractions import Fraction
import math
import asyncio
import modal
S3_TIMEOUT = 60 # happens 2 times
PADDING_TIMEOUT = 600 + (S3_TIMEOUT * 2)
SCALEDOWN_WINDOW = 60 # The maximum duration (in seconds) that individual containers can remain idle when scaling down.
DISCONNECT_CHECK_INTERVAL = 2 # Check for client disconnect
app = modal.App("reflector-padding")
# CPU-based image
image = (
modal.Image.debian_slim(python_version="3.12")
.apt_install("ffmpeg") # Required by PyAV
.pip_install(
"av==13.1.0", # PyAV for audio processing
"requests==2.32.3", # HTTP for presigned URL downloads/uploads
"fastapi==0.115.12", # API framework
)
)
# ref B0F71CE8-FC59-4AA5-8414-DAFB836DB711
OPUS_STANDARD_SAMPLE_RATE = 48000
# ref B0F71CE8-FC59-4AA5-8414-DAFB836DB711
OPUS_DEFAULT_BIT_RATE = 128000
@app.function(
cpu=2.0,
timeout=PADDING_TIMEOUT,
scaledown_window=SCALEDOWN_WINDOW,
image=image,
)
@modal.asgi_app()
def web():
from fastapi import FastAPI, Request, HTTPException
from pydantic import BaseModel
class PaddingRequest(BaseModel):
track_url: str
output_url: str
start_time_seconds: float
track_index: int
class PaddingResponse(BaseModel):
size: int
cancelled: bool = False
web_app = FastAPI()
@web_app.post("/pad")
async def pad_track_endpoint(request: Request, req: PaddingRequest) -> PaddingResponse:
"""Modal web endpoint for padding audio tracks with disconnect detection.
"""
import logging
logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s")
logger = logging.getLogger(__name__)
if not req.track_url:
raise HTTPException(status_code=400, detail="track_url cannot be empty")
if not req.output_url:
raise HTTPException(status_code=400, detail="output_url cannot be empty")
if req.start_time_seconds <= 0:
raise HTTPException(status_code=400, detail=f"start_time_seconds must be positive, got {req.start_time_seconds}")
if req.start_time_seconds > 18000:
raise HTTPException(status_code=400, detail=f"start_time_seconds exceeds maximum 18000s (5 hours)")
logger.info(f"Padding request: track {req.track_index}, delay={req.start_time_seconds}s")
# Thread-safe cancellation flag shared between async disconnect checker and blocking thread
import threading
cancelled = threading.Event()
async def check_disconnect():
"""Background task to check for client disconnect every 2 seconds."""
while not cancelled.is_set():
await asyncio.sleep(DISCONNECT_CHECK_INTERVAL)
if await request.is_disconnected():
logger.warning("Client disconnected, setting cancellation flag")
cancelled.set()
break
# Start disconnect checker in background
disconnect_task = asyncio.create_task(check_disconnect())
try:
result = await asyncio.get_event_loop().run_in_executor(
None, _pad_track_blocking, req, cancelled, logger
)
return PaddingResponse(**result)
finally:
cancelled.set()
disconnect_task.cancel()
try:
await disconnect_task
except asyncio.CancelledError:
pass
def _pad_track_blocking(req, cancelled, logger) -> dict:
"""Blocking CPU-bound padding work with periodic cancellation checks.
Args:
cancelled: threading.Event for thread-safe cancellation signaling
"""
import av
import requests
from av.audio.resampler import AudioResampler
import time
temp_dir = tempfile.mkdtemp()
input_path = None
output_path = None
last_check = time.time()
try:
logger.info("Downloading track for padding")
response = requests.get(req.track_url, stream=True, timeout=S3_TIMEOUT)
response.raise_for_status()
input_path = os.path.join(temp_dir, "track.webm")
total_bytes = 0
chunk_count = 0
with open(input_path, "wb") as f:
for chunk in response.iter_content(chunk_size=8192):
if chunk:
f.write(chunk)
total_bytes += len(chunk)
chunk_count += 1
# Check for cancellation every arbitrary amount of chunks
if chunk_count % 12 == 0:
now = time.time()
if now - last_check >= DISCONNECT_CHECK_INTERVAL:
if cancelled.is_set():
logger.info("Cancelled during download, exiting early")
return {"size": 0, "cancelled": True}
last_check = now
logger.info(f"Track downloaded: {total_bytes} bytes")
if cancelled.is_set():
logger.info("Cancelled after download, exiting early")
return {"size": 0, "cancelled": True}
# Apply padding using PyAV
output_path = os.path.join(temp_dir, "padded.webm")
delay_ms = math.floor(req.start_time_seconds * 1000)
logger.info(f"Padding track {req.track_index} with {delay_ms}ms delay using PyAV")
in_container = av.open(input_path)
in_stream = next((s for s in in_container.streams if s.type == "audio"), None)
if in_stream is None:
raise ValueError("No audio stream in input")
with av.open(output_path, "w", format="webm") as out_container:
out_stream = out_container.add_stream("libopus", rate=OPUS_STANDARD_SAMPLE_RATE)
out_stream.bit_rate = OPUS_DEFAULT_BIT_RATE
graph = av.filter.Graph()
abuf_args = (
f"time_base=1/{OPUS_STANDARD_SAMPLE_RATE}:"
f"sample_rate={OPUS_STANDARD_SAMPLE_RATE}:"
f"sample_fmt=s16:"
f"channel_layout=stereo"
)
src = graph.add("abuffer", args=abuf_args, name="src")
aresample_f = graph.add("aresample", args="async=1", name="ares")
delays_arg = f"{delay_ms}|{delay_ms}"
adelay_f = graph.add("adelay", args=f"delays={delays_arg}:all=1", name="delay")
sink = graph.add("abuffersink", name="sink")
src.link_to(aresample_f)
aresample_f.link_to(adelay_f)
adelay_f.link_to(sink)
graph.configure()
resampler = AudioResampler(
format="s16", layout="stereo", rate=OPUS_STANDARD_SAMPLE_RATE
)
for frame in in_container.decode(in_stream):
# Check for cancellation periodically
now = time.time()
if now - last_check >= DISCONNECT_CHECK_INTERVAL:
if cancelled.is_set():
logger.info("Cancelled during processing, exiting early")
in_container.close()
return {"size": 0, "cancelled": True}
last_check = now
out_frames = resampler.resample(frame) or []
for rframe in out_frames:
rframe.sample_rate = OPUS_STANDARD_SAMPLE_RATE
rframe.time_base = Fraction(1, OPUS_STANDARD_SAMPLE_RATE)
src.push(rframe)
while True:
try:
f_out = sink.pull()
except Exception:
break
f_out.sample_rate = OPUS_STANDARD_SAMPLE_RATE
f_out.time_base = Fraction(1, OPUS_STANDARD_SAMPLE_RATE)
for packet in out_stream.encode(f_out):
out_container.mux(packet)
# Flush filter graph
src.push(None)
while True:
try:
f_out = sink.pull()
except Exception:
break
f_out.sample_rate = OPUS_STANDARD_SAMPLE_RATE
f_out.time_base = Fraction(1, OPUS_STANDARD_SAMPLE_RATE)
for packet in out_stream.encode(f_out):
out_container.mux(packet)
# Flush encoder
for packet in out_stream.encode(None):
out_container.mux(packet)
in_container.close()
file_size = os.path.getsize(output_path)
logger.info(f"Padding complete: {file_size} bytes")
logger.info("Uploading padded track to S3")
with open(output_path, "rb") as f:
upload_response = requests.put(req.output_url, data=f, timeout=S3_TIMEOUT)
upload_response.raise_for_status()
logger.info(f"Upload complete: {file_size} bytes")
return {"size": file_size}
finally:
if input_path and os.path.exists(input_path):
try:
os.unlink(input_path)
except Exception as e:
logger.warning(f"Failed to cleanup input file: {e}")
if output_path and os.path.exists(output_path):
try:
os.unlink(output_path)
except Exception as e:
logger.warning(f"Failed to cleanup output file: {e}")
try:
os.rmdir(temp_dir)
except Exception as e:
logger.warning(f"Failed to cleanup temp directory: {e}")
return web_app

View File

@@ -4,27 +4,31 @@ ENV PYTHONUNBUFFERED=1 \
UV_LINK_MODE=copy \
UV_NO_CACHE=1
# patch until nvidia updates the sha1 repo
ADD sequoia.config /etc/crypto-policies/back-ends/sequoia.config
WORKDIR /tmp
RUN apt-get update \
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
--mount=type=cache,target=/var/lib/apt,sharing=locked \
apt-get update \
&& apt-get install -y \
ffmpeg \
curl \
ca-certificates \
gnupg \
wget \
&& apt-get clean
wget
# Add NVIDIA CUDA repo for Debian 12 (bookworm) and install cuDNN 9 for CUDA 12
ADD https://developer.download.nvidia.com/compute/cuda/repos/debian12/x86_64/cuda-keyring_1.1-1_all.deb /cuda-keyring.deb
RUN dpkg -i /cuda-keyring.deb \
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
--mount=type=cache,target=/var/lib/apt,sharing=locked \
dpkg -i /cuda-keyring.deb \
&& rm /cuda-keyring.deb \
&& apt-get update \
&& apt-get install -y --no-install-recommends \
cuda-cudart-12-6 \
libcublas-12-6 \
libcudnn9-cuda-12 \
libcudnn9-dev-cuda-12 \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
libcudnn9-dev-cuda-12
ADD https://astral.sh/uv/install.sh /uv-installer.sh
RUN sh /uv-installer.sh && rm /uv-installer.sh
ENV PATH="/root/.local/bin/:$PATH"
@@ -39,6 +43,13 @@ COPY ./app /app/app
COPY ./main.py /app/
COPY ./runserver.sh /app/
# prevent uv failing with too many open files on big cpus
ENV UV_CONCURRENT_INSTALLS=16
# first install
RUN --mount=type=cache,target=/root/.cache/uv \
uv sync --compile-bytecode --locked
EXPOSE 8000
CMD ["sh", "/app/runserver.sh"]

View File

@@ -0,0 +1,39 @@
FROM python:3.12-slim
ENV PYTHONUNBUFFERED=1 \
UV_LINK_MODE=copy \
UV_NO_CACHE=1
WORKDIR /tmp
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
--mount=type=cache,target=/var/lib/apt,sharing=locked \
apt-get update \
&& apt-get install -y \
ffmpeg \
curl \
ca-certificates \
gnupg \
wget
ADD https://astral.sh/uv/install.sh /uv-installer.sh
RUN sh /uv-installer.sh && rm /uv-installer.sh
ENV PATH="/root/.local/bin/:$PATH"
RUN mkdir -p /app
WORKDIR /app
COPY pyproject.toml uv.lock /app/
COPY ./app /app/app
COPY ./main.py /app/
COPY ./runserver.sh /app/
# prevent uv failing with too many open files on big cpus
ENV UV_CONCURRENT_INSTALLS=16
# first install
RUN --mount=type=cache,target=/root/.cache/uv \
uv sync --compile-bytecode --locked
EXPOSE 8000
CMD ["sh", "/app/runserver.sh"]

View File

@@ -3,14 +3,14 @@ import os
from fastapi import Depends, HTTPException, status
from fastapi.security import OAuth2PasswordBearer
oauth2_scheme = OAuth2PasswordBearer(tokenUrl="token")
oauth2_scheme = OAuth2PasswordBearer(tokenUrl="token", auto_error=False)
def apikey_auth(apikey: str = Depends(oauth2_scheme)):
def apikey_auth(apikey: str | None = Depends(oauth2_scheme)):
required_key = os.environ.get("REFLECTOR_GPU_APIKEY")
if not required_key:
return
if apikey == required_key:
if apikey and apikey == required_key:
return
raise HTTPException(
status_code=status.HTTP_401_UNAUTHORIZED,

View File

@@ -1,10 +1,65 @@
import logging
import os
import tarfile
import threading
from pathlib import Path
from urllib.request import urlopen
import torch
import torchaudio
import yaml
from pyannote.audio import Pipeline
logger = logging.getLogger(__name__)
S3_BUNDLE_URL = "https://reflector-public.s3.us-east-1.amazonaws.com/pyannote-speaker-diarization-3.1.tar.gz"
BUNDLE_CACHE_DIR = Path("/root/.cache/pyannote-bundle")
def _ensure_model(cache_dir: Path) -> str:
"""Download and extract S3 model bundle if not cached."""
model_dir = cache_dir / "pyannote-speaker-diarization-3.1"
config_path = model_dir / "config.yaml"
if config_path.exists():
logger.info("Using cached model bundle at %s", model_dir)
return str(model_dir)
cache_dir.mkdir(parents=True, exist_ok=True)
tarball_path = cache_dir / "model.tar.gz"
logger.info("Downloading model bundle from %s", S3_BUNDLE_URL)
with urlopen(S3_BUNDLE_URL) as response, open(tarball_path, "wb") as f:
while chunk := response.read(8192):
f.write(chunk)
logger.info("Extracting model bundle")
with tarfile.open(tarball_path, "r:gz") as tar:
tar.extractall(path=cache_dir, filter="data")
tarball_path.unlink()
_patch_config(model_dir, cache_dir)
return str(model_dir)
def _patch_config(model_dir: Path, cache_dir: Path) -> None:
"""Rewrite config.yaml to reference local pytorch_model.bin paths."""
config_path = model_dir / "config.yaml"
with open(config_path) as f:
config = yaml.safe_load(f)
config["pipeline"]["params"]["segmentation"] = str(
cache_dir / "pyannote-segmentation-3.0" / "pytorch_model.bin"
)
config["pipeline"]["params"]["embedding"] = str(
cache_dir / "pyannote-wespeaker-voxceleb-resnet34-LM" / "pytorch_model.bin"
)
with open(config_path, "w") as f:
yaml.dump(config, f)
logger.info("Patched config.yaml with local model paths")
class PyannoteDiarizationService:
def __init__(self):
@@ -14,10 +69,20 @@ class PyannoteDiarizationService:
def load(self):
self._device = "cuda" if torch.cuda.is_available() else "cpu"
self._pipeline = Pipeline.from_pretrained(
"pyannote/speaker-diarization-3.1",
use_auth_token=os.environ.get("HF_TOKEN"),
)
hf_token = os.environ.get("HF_TOKEN")
if hf_token:
logger.info("Loading pyannote model from HuggingFace (HF_TOKEN set)")
self._pipeline = Pipeline.from_pretrained(
"pyannote/speaker-diarization-3.1",
use_auth_token=hf_token,
)
else:
logger.info("HF_TOKEN not set — loading model from S3 bundle")
model_path = _ensure_model(BUNDLE_CACHE_DIR)
config_path = Path(model_path) / "config.yaml"
self._pipeline = Pipeline.from_pretrained(str(config_path))
self._pipeline.to(torch.device(self._device))
def diarize_file(self, file_path: str, timestamp: float = 0.0) -> dict:

View File

@@ -0,0 +1,2 @@
[hash_algorithms]
sha1 = "always"

14
scripts/garage.toml Normal file
View File

@@ -0,0 +1,14 @@
metadata_dir = "/var/lib/garage/meta"
data_dir = "/var/lib/garage/data"
replication_factor = 1
rpc_secret = "__GARAGE_RPC_SECRET__"
rpc_bind_addr = "[::]:3901"
[s3_api]
api_bind_addr = "[::]:3900"
s3_region = "garage"
root_domain = ".s3.garage.localhost"
[admin]
api_bind_addr = "[::]:3903"

417
scripts/setup-standalone.sh Executable file
View File

@@ -0,0 +1,417 @@
#!/usr/bin/env bash
#
# Standalone local development setup for Reflector.
# Takes a fresh clone to a working instance — no cloud accounts, no API keys.
#
# Usage:
# ./scripts/setup-standalone.sh
#
# Idempotent — safe to re-run at any time.
#
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
ROOT_DIR="$(cd "$SCRIPT_DIR/.." && pwd)"
SERVER_ENV="$ROOT_DIR/server/.env"
WWW_ENV="$ROOT_DIR/www/.env.local"
MODEL="${LLM_MODEL:-qwen2.5:14b}"
OLLAMA_PORT="${OLLAMA_PORT:-11434}"
OS="$(uname -s)"
# --- Colors ---
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
CYAN='\033[0;36m'
NC='\033[0m'
info() { echo -e "${CYAN}==>${NC} $*"; }
ok() { echo -e "${GREEN}${NC} $*"; }
warn() { echo -e "${YELLOW} !${NC} $*"; }
err() { echo -e "${RED}${NC} $*" >&2; }
# --- Helpers ---
wait_for_url() {
local url="$1" label="$2" retries="${3:-30}" interval="${4:-2}"
for i in $(seq 1 "$retries"); do
if curl -sf "$url" > /dev/null 2>&1; then
return 0
fi
echo -ne "\r Waiting for $label... ($i/$retries)"
sleep "$interval"
done
echo ""
err "$label not responding at $url after $retries attempts"
return 1
}
env_has_key() {
local file="$1" key="$2"
grep -q "^${key}=" "$file" 2>/dev/null
}
env_set() {
local file="$1" key="$2" value="$3"
if env_has_key "$file" "$key"; then
# Replace existing value (portable sed)
if [[ "$OS" == "Darwin" ]]; then
sed -i '' "s|^${key}=.*|${key}=${value}|" "$file"
else
sed -i "s|^${key}=.*|${key}=${value}|" "$file"
fi
else
echo "${key}=${value}" >> "$file"
fi
}
resolve_symlink() {
local file="$1"
if [[ -L "$file" ]]; then
warn "$(basename "$file") is a symlink — creating standalone copy"
cp -L "$file" "$file.tmp"
rm "$file"
mv "$file.tmp" "$file"
fi
}
compose_cmd() {
local compose_files="-f $ROOT_DIR/docker-compose.yml -f $ROOT_DIR/docker-compose.standalone.yml"
if [[ "$OS" == "Linux" ]] && [[ -n "${OLLAMA_PROFILE:-}" ]]; then
docker compose $compose_files --profile "$OLLAMA_PROFILE" "$@"
else
docker compose $compose_files "$@"
fi
}
# =========================================================
# Step 1: LLM / Ollama
# =========================================================
step_llm() {
info "Step 1: LLM setup (Ollama + $MODEL)"
case "$OS" in
Darwin)
if ! command -v ollama &> /dev/null; then
err "Ollama not found. Install it:"
err " brew install ollama"
err " # or https://ollama.com/download"
exit 1
fi
# Start if not running
if ! curl -sf "http://localhost:$OLLAMA_PORT/api/tags" > /dev/null 2>&1; then
info "Starting Ollama..."
ollama serve &
disown
fi
wait_for_url "http://localhost:$OLLAMA_PORT/api/tags" "Ollama"
echo ""
# Pull model if not already present
if ollama list 2>/dev/null | awk '{print $1}' | grep -qx "$MODEL"; then
ok "Model $MODEL already pulled"
else
info "Pulling model $MODEL (this may take a while)..."
ollama pull "$MODEL"
fi
LLM_URL_VALUE="http://host.docker.internal:$OLLAMA_PORT/v1"
;;
Linux)
if command -v nvidia-smi &> /dev/null && nvidia-smi > /dev/null 2>&1; then
ok "NVIDIA GPU detected — using ollama-gpu profile"
OLLAMA_PROFILE="ollama-gpu"
OLLAMA_SVC="ollama"
LLM_URL_VALUE="http://ollama:$OLLAMA_PORT/v1"
else
warn "No NVIDIA GPU — using ollama-cpu profile"
OLLAMA_PROFILE="ollama-cpu"
OLLAMA_SVC="ollama-cpu"
LLM_URL_VALUE="http://ollama-cpu:$OLLAMA_PORT/v1"
fi
info "Starting Ollama container..."
compose_cmd up -d
wait_for_url "http://localhost:$OLLAMA_PORT/api/tags" "Ollama"
echo ""
# Pull model inside container
if compose_cmd exec "$OLLAMA_SVC" ollama list 2>/dev/null | awk '{print $1}' | grep -qx "$MODEL"; then
ok "Model $MODEL already pulled"
else
info "Pulling model $MODEL inside container (this may take a while)..."
compose_cmd exec "$OLLAMA_SVC" ollama pull "$MODEL"
fi
;;
*)
err "Unsupported OS: $OS"
exit 1
;;
esac
ok "LLM ready ($MODEL via Ollama)"
}
# =========================================================
# Step 2: Generate server/.env
# =========================================================
step_server_env() {
info "Step 2: Generating server/.env"
resolve_symlink "$SERVER_ENV"
if [[ -f "$SERVER_ENV" ]]; then
ok "server/.env already exists — ensuring standalone vars"
else
cat > "$SERVER_ENV" << 'ENVEOF'
# Generated by setup-standalone.sh — standalone local development
# Source of truth for settings: server/reflector/settings.py
ENVEOF
ok "Created server/.env"
fi
# Ensure all standalone-critical vars (appends if missing, replaces if present)
env_set "$SERVER_ENV" "DATABASE_URL" "postgresql+asyncpg://reflector:reflector@postgres:5432/reflector"
env_set "$SERVER_ENV" "REDIS_HOST" "redis"
env_set "$SERVER_ENV" "CELERY_BROKER_URL" "redis://redis:6379/1"
env_set "$SERVER_ENV" "CELERY_RESULT_BACKEND" "redis://redis:6379/1"
env_set "$SERVER_ENV" "AUTH_BACKEND" "none"
env_set "$SERVER_ENV" "PUBLIC_MODE" "true"
# TRANSCRIPT_BACKEND, TRANSCRIPT_URL, DIARIZATION_BACKEND, DIARIZATION_URL
# are set via docker-compose.standalone.yml `environment:` overrides — not written here
# so we don't clobber the user's server/.env for non-standalone use.
env_set "$SERVER_ENV" "TRANSLATION_BACKEND" "passthrough"
env_set "$SERVER_ENV" "LLM_URL" "$LLM_URL_VALUE"
env_set "$SERVER_ENV" "LLM_MODEL" "$MODEL"
env_set "$SERVER_ENV" "LLM_API_KEY" "not-needed"
ok "Standalone vars set (LLM_URL=$LLM_URL_VALUE)"
}
# =========================================================
# Step 3: Object storage (Garage)
# =========================================================
step_storage() {
info "Step 3: Object storage (Garage)"
# Generate garage.toml from template (fill in RPC secret)
GARAGE_TOML="$ROOT_DIR/scripts/garage.toml"
GARAGE_TOML_RUNTIME="$ROOT_DIR/data/garage.toml"
if [[ ! -f "$GARAGE_TOML_RUNTIME" ]]; then
mkdir -p "$ROOT_DIR/data"
RPC_SECRET=$(openssl rand -hex 32)
sed "s|__GARAGE_RPC_SECRET__|${RPC_SECRET}|" "$GARAGE_TOML" > "$GARAGE_TOML_RUNTIME"
fi
compose_cmd up -d garage
wait_for_url "http://localhost:3903/health" "Garage admin API"
echo ""
# Layout: get node ID, assign, apply (skip if already applied)
NODE_ID=$(compose_cmd exec -T garage /garage node id -q 2>/dev/null | tr -d '[:space:]')
LAYOUT_STATUS=$(compose_cmd exec -T garage /garage layout show 2>&1 || true)
if echo "$LAYOUT_STATUS" | grep -q "No nodes"; then
compose_cmd exec -T garage /garage layout assign "$NODE_ID" -c 1G -z dc1
compose_cmd exec -T garage /garage layout apply --version 1
fi
# Create bucket (idempotent — skip if exists)
if ! compose_cmd exec -T garage /garage bucket info reflector-media &>/dev/null; then
compose_cmd exec -T garage /garage bucket create reflector-media
fi
# Create key (idempotent — skip if exists)
CREATED_KEY=false
if compose_cmd exec -T garage /garage key info reflector &>/dev/null; then
ok "Key 'reflector' already exists"
else
KEY_OUTPUT=$(compose_cmd exec -T garage /garage key create reflector)
CREATED_KEY=true
fi
# Grant bucket permissions (idempotent)
compose_cmd exec -T garage /garage bucket allow reflector-media --read --write --key reflector
# Set env vars (only parse key on first create — key info redacts the secret)
env_set "$SERVER_ENV" "TRANSCRIPT_STORAGE_BACKEND" "aws"
env_set "$SERVER_ENV" "TRANSCRIPT_STORAGE_AWS_ENDPOINT_URL" "http://garage:3900"
env_set "$SERVER_ENV" "TRANSCRIPT_STORAGE_AWS_BUCKET_NAME" "reflector-media"
env_set "$SERVER_ENV" "TRANSCRIPT_STORAGE_AWS_REGION" "garage"
if [[ "$CREATED_KEY" == "true" ]]; then
KEY_ID=$(echo "$KEY_OUTPUT" | grep -i "key id" | awk '{print $NF}')
KEY_SECRET=$(echo "$KEY_OUTPUT" | grep -i "secret key" | awk '{print $NF}')
env_set "$SERVER_ENV" "TRANSCRIPT_STORAGE_AWS_ACCESS_KEY_ID" "$KEY_ID"
env_set "$SERVER_ENV" "TRANSCRIPT_STORAGE_AWS_SECRET_ACCESS_KEY" "$KEY_SECRET"
fi
ok "Object storage ready (Garage)"
}
# =========================================================
# Step 4: Generate www/.env.local
# =========================================================
step_www_env() {
info "Step 4: Generating www/.env.local"
resolve_symlink "$WWW_ENV"
if [[ -f "$WWW_ENV" ]]; then
ok "www/.env.local already exists — ensuring standalone vars"
else
cat > "$WWW_ENV" << 'ENVEOF'
# Generated by setup-standalone.sh — standalone local development
ENVEOF
ok "Created www/.env.local"
fi
env_set "$WWW_ENV" "SITE_URL" "http://localhost:3000"
env_set "$WWW_ENV" "NEXTAUTH_URL" "http://localhost:3000"
env_set "$WWW_ENV" "NEXTAUTH_SECRET" "standalone-dev-secret-not-for-production"
env_set "$WWW_ENV" "API_URL" "http://localhost:1250"
env_set "$WWW_ENV" "WEBSOCKET_URL" "ws://localhost:1250"
env_set "$WWW_ENV" "SERVER_API_URL" "http://server:1250"
env_set "$WWW_ENV" "FEATURE_REQUIRE_LOGIN" "false"
ok "Standalone www vars set"
}
# =========================================================
# Step 5: Start all services
# =========================================================
step_services() {
info "Step 5: Starting Docker services"
# Check for port conflicts — stale processes silently shadow Docker port mappings.
# OrbStack/Docker Desktop bind ports for forwarding; ignore those PIDs.
local ports_ok=true
for port in 3000 1250 5432 6379 3900 3903; do
local pids
pids=$(lsof -ti :"$port" 2>/dev/null || true)
for pid in $pids; do
local pname
pname=$(ps -p "$pid" -o comm= 2>/dev/null || true)
# OrbStack and Docker Desktop own port forwarding — not real conflicts
if [[ "$pname" == *"OrbStack"* ]] || [[ "$pname" == *"com.docker"* ]] || [[ "$pname" == *"vpnkit"* ]]; then
continue
fi
warn "Port $port already in use by PID $pid ($pname)"
warn "Kill it with: lsof -ti :$port | xargs kill"
ports_ok=false
done
done
if [[ "$ports_ok" == "false" ]]; then
warn "Port conflicts detected — Docker containers may not be reachable"
warn "Continuing anyway (services will start but may be shadowed)"
fi
# server runs alembic migrations on startup automatically (see runserver.sh)
compose_cmd up -d postgres redis garage cpu server worker beat web
ok "Containers started"
info "Server is running migrations (alembic upgrade head)..."
}
# =========================================================
# Step 6: Health checks
# =========================================================
step_health() {
info "Step 6: Health checks"
# CPU service may take a while on first start (model download + load).
# No host port exposed — check via docker exec.
info "Waiting for CPU service (first start downloads ~1GB of models)..."
local cpu_ok=false
for i in $(seq 1 120); do
if compose_cmd exec -T cpu curl -sf http://localhost:8000/docs > /dev/null 2>&1; then
cpu_ok=true
break
fi
echo -ne "\r Waiting for CPU service... ($i/120)"
sleep 5
done
echo ""
if [[ "$cpu_ok" == "true" ]]; then
ok "CPU service healthy (transcription + diarization)"
else
warn "CPU service not ready yet — it will keep loading in the background"
warn "Check with: docker compose logs cpu"
fi
wait_for_url "http://localhost:1250/health" "Server API" 60 3
echo ""
ok "Server API healthy"
wait_for_url "http://localhost:3000" "Frontend" 90 3
echo ""
ok "Frontend responding"
# Check LLM reachability from inside a container
if compose_cmd exec -T server \
curl -sf "$LLM_URL_VALUE/models" > /dev/null 2>&1; then
ok "LLM reachable from containers"
else
warn "LLM not reachable from containers at $LLM_URL_VALUE"
warn "Summaries/topics/titles won't work until LLM is accessible"
fi
}
# =========================================================
# Main
# =========================================================
main() {
echo ""
echo "=========================================="
echo " Reflector — Standalone Local Setup"
echo "=========================================="
echo ""
# Ensure we're in the repo root
if [[ ! -f "$ROOT_DIR/docker-compose.yml" ]]; then
err "docker-compose.yml not found in $ROOT_DIR"
err "Run this script from the repo root: ./scripts/setup-standalone.sh"
exit 1
fi
# LLM_URL_VALUE is set by step_llm, used by later steps
LLM_URL_VALUE=""
OLLAMA_PROFILE=""
# docker-compose.yml may reference env_files that don't exist yet;
# touch them so compose_cmd works before the steps that populate them.
touch "$SERVER_ENV" "$WWW_ENV"
step_llm
echo ""
step_server_env
echo ""
step_storage
echo ""
step_www_env
echo ""
step_services
echo ""
step_health
echo ""
echo "=========================================="
echo -e " ${GREEN}Reflector is running!${NC}"
echo "=========================================="
echo ""
echo " Frontend: http://localhost:3000"
echo " API: http://localhost:1250"
echo ""
echo " To stop: docker compose down"
echo " To re-run: ./scripts/setup-standalone.sh"
echo ""
}
main "$@"

View File

@@ -66,15 +66,22 @@ TRANSLATE_URL=https://monadical-sas--reflector-translator-web.modal.run
## LLM backend (Required)
##
## Responsible for generating titles, summaries, and topic detection
## Requires OpenAI API key
## Supports any OpenAI-compatible endpoint.
## =======================================================
## OpenAI API key - get from https://platform.openai.com/account/api-keys
LLM_API_KEY=sk-your-openai-api-key
LLM_MODEL=gpt-4o-mini
## --- Option A: Local LLM via Ollama (recommended for dev) ---
## Setup: ./scripts/setup-standalone.sh
## Mac: Ollama runs natively (Metal GPU). Containers reach it via host.docker.internal.
## Linux: docker compose --profile ollama-gpu up -d (or ollama-cpu for no GPU)
LLM_URL=http://host.docker.internal:11434/v1
LLM_MODEL=qwen2.5:14b
LLM_API_KEY=not-needed
## Linux with containerized Ollama: LLM_URL=http://ollama:11434/v1
## Optional: Custom endpoint (defaults to OpenAI)
# LLM_URL=https://api.openai.com/v1
## --- Option B: Remote/cloud LLM ---
#LLM_API_KEY=sk-your-openai-api-key
#LLM_MODEL=gpt-4o-mini
## LLM_URL defaults to OpenAI when unset
## Context size for summary generation (tokens)
LLM_CONTEXT_WINDOW=16000

View File

@@ -0,0 +1,496 @@
# Daily.co and Reflector Data Model
This document explains the data model relationships between Daily.co's API concepts and Reflector's database schema, clarifying common sources of confusion.
---
## Table of Contents
1. [Core Entities Overview](#core-entities-overview)
2. [Daily.co vs Reflector Terminology](#dailyco-vs-reflector-terminology)
3. [Entity Relationships](#entity-relationships)
4. [Recording Multiplicity](#recording-multiplicity)
5. [Session Identifiers Explained](#session-identifiers-explained)
6. [Time-Based Matching](#time-based-matching)
7. [Multitrack Recording Details](#multitrack-recording-details)
8. [Verified Example](#verified-example)
---
## Core Entities Overview
### Reflector's Four Primary Entities
```
┌─────────────────────────────────────────────────────────────────┐
│ Room (Reflector) │
│ - Persistent meeting template │
│ - User-created configuration │
│ - Example: "team-standup" │
└────────────────────┬────────────────────────────────────────────┘
│ 1:N
┌─────────────────────────────────────────────────────────────────┐
│ Meeting (Reflector) │
│ - Single session instance │
│ - Creates NEW Daily.co room with timestamp │
│ - Example: "team-standup-20260115120000" │
└────────────────────┬────────────────────────────────────────────┘
│ 1:N
┌─────────────────────────────────────────────────────────────────┐
│ Recording (Reflector + Daily.co) │
│ - One segment of audio/video │
│ - New recording created on stop/restart │
│ - track_keys: JSON array of S3 file paths │
└────────────────────┬────────────────────────────────────────────┘
│ 1:1
┌─────────────────────────────────────────────────────────────────┐
│ Transcript (Reflector) │
│ - Processed audio with transcription │
│ - Diarization, summaries, topics │
│ - One transcript per recording │
└─────────────────────────────────────────────────────────────────┘
```
---
## Daily.co vs Reflector Terminology
### Room
| Aspect | Daily.co | Reflector |
|--------|----------|-----------|
| **Definition** | Virtual meeting space on Daily.co platform | User-created meeting template/configuration |
| **Lifetime** | Configurable expiration | Persistent until user deletes |
| **Creation** | API call for each meeting | Pre-created by user once |
| **Reuse** | Can host multiple sessions | Generates new Daily.co room per meeting |
| **Name Format** | `room-name` (reusable) | `room-name` (base identifier) |
| **Timestamping** | Not required | Meeting adds timestamp: `{name}-YYYYMMDDHHMMSS` |
**Example:**
```
Reflector Room: "daily-private-igor" (persistent config)
↓ starts meeting
Daily.co Room: "daily-private-igor-20260110042117"
```
### Meeting
| Aspect | Daily.co | Reflector |
|--------|----------|-----------|
| **Definition** | Session that starts when first participant joins | Explicit database record of a session |
| **Identifier** | `mtgSessionId` (generated by Daily.co) | `meeting.id` (UUID, generated by Reflector) |
| **Creation** | Implicit (first participant join) | Explicit API call before participants join |
| **Purpose** | Tracks active session state | Links recordings, transcripts, participants |
| **Scope** | Per room instance | Per Reflector room + timestamp |
**Critical Limitation:** Daily.co's recordings API often does NOT return `mtgSessionId`, requiring time-based matching (see [Time-Based Matching](#time-based-matching)).
### Recording
| Aspect | Daily.co | Reflector |
|--------|----------|-----------|
| **Definition** | Audio/video files on S3 | Metadata + processing status |
| **Types** | `cloud` (composed video), `raw-tracks` (multitrack) | Stores references + `track_keys` array |
| **Multiplicity** | One recording object per start/stop cycle | One DB row per Daily.co recording object |
| **Identifier** | Daily.co `recording_id` | Same `recording_id` (stored in DB) |
| **Multitrack** | Array of `.webm` files (one per participant) | `track_keys` JSON array with S3 paths |
| **Linkage** | Via `room_name` + `start_ts` | FK `meeting_id` (set via time-based match) |
**Critical Behavior:** Recording **stops/restarts** create **separate recording objects** with unique IDs.
---
## Entity Relationships
### Database Schema Relationships
```sql
-- Simplified schema showing key relationships
TABLE room (
id VARCHAR PRIMARY KEY,
name VARCHAR UNIQUE,
platform VARCHAR -- 'whereby' | 'daily'
)
TABLE meeting (
id VARCHAR PRIMARY KEY,
room_id VARCHAR REFERENCES room(id) ON DELETE CASCADE, -- nullable
room_name VARCHAR, -- Daily.co room name (timestamped)
start_date TIMESTAMP,
platform VARCHAR
)
TABLE recording (
id VARCHAR PRIMARY KEY, -- Daily.co recording_id
meeting_id VARCHAR, -- FK to meeting (set via time-based match)
bucket_name VARCHAR,
object_key VARCHAR, -- S3 prefix
track_keys JSON, -- Array of S3 keys for multitrack
recorded_at TIMESTAMP
)
TABLE transcript (
id VARCHAR PRIMARY KEY,
recording_id VARCHAR, -- nullable FK
meeting_id VARCHAR, -- nullable FK
room_id VARCHAR, -- nullable FK
participants JSON, -- [{id, speaker, name, user_id}, ...]
title VARCHAR,
long_summary VARCHAR,
webvtt TEXT
)
```
**Relationship Cardinalities:**
```
1 Room → N Meetings
1 Meeting → N Recordings (common: 1-21 recordings per meeting)
1 Recording → 1 Transcript
1 Meeting → N Transcripts (via recordings)
```
---
## Recording Multiplicity
### Why Multiple Recordings Per Meeting?
Daily.co creates a **new recording object** (new ID, new files) whenever recording stops and restarts. This happens due to:
1. **Manual stop/start** - User clicks stop, then start recording again
2. **Network reconnection** - Participant drops, reconnects → triggers restart
3. **Participant rejoin** - Last participant leaves, new one joins → new session
---
## Session Identifiers Explained
### The Hidden Entity: Daily.co Meeting Session
Daily.co has an **implicit ephemeral entity** that sits between Room and Recording:
```
Daily.co Room: "daily-private-igor-20260110042117"
├─ Daily.co Meeting Session #1 (mtgSessionId: c04334de...)
│ └─ Recording #3 (f4a50f94) - 4s, 1 track
└─ Daily.co Meeting Session #2 (mtgSessionId: 4cdae3c0...)
├─ Recording #2 (b0fa94da) - 80s, 2 tracks ← recording stopped
└─ Recording #1 (05edf519) - 62s, 1 track ← then restarted
```
**Daily.co Meeting Session:**
- **Lifecycle:** Starts when first participant joins, ends when last participant leaves
- **Identifier:** `mtgSessionId` (generated by Daily.co)
- **Persistence:** Ephemeral - new ID if everyone leaves and someone rejoins
- **Relationship:** 1 Session → N Recordings (if recording stops/restarts during session)
**Key Insight:** Multiple recordings can share the same `mtgSessionId` if recording was stopped and restarted while participants remained connected.
### mtgSessionId (Meeting Session Identifier)
`mtgSessionId` identifies a **Daily.co meeting session** (not individual participants, not a room).
### session_id (Per-Participant)
**Different concept:** Per-participant connection identifier from webhooks.
**Reflector Tracking:** `daily_participant_session` table
```sql
TABLE daily_participant_session (
id VARCHAR PRIMARY KEY, -- {meeting_id}:{user_id}:{joined_at_ms}
meeting_id VARCHAR,
session_id VARCHAR, -- From webhook (per-participant)
user_id VARCHAR,
user_name VARCHAR,
joined_at TIMESTAMP,
left_at TIMESTAMP
)
```
---
## Time-Based Matching
### Problem Statement
Daily.co's recordings API does not reliably return `mtgSessionId`, making it impossible to directly link recordings to meetings via Daily.co's identifiers.
**Example API response:**
```json
{
"id": "recording-uuid",
"room_name": "daily-private-igor-20260110042117",
"start_ts": 1768018896,
"mtgSessionId": null Missing!
}
```
### Solution: Time-Based Matching
**Implementation:** `reflector/db/meetings.py:get_by_room_name_and_time()`
---
## Multitrack Recording Details
### track_keys JSON Array
**Schema:** `recording.track_keys` (JSON, nullable)
```sql
-- Example recording with 2 audio tracks
{
"id": "b0fa94da-73b5-4f95-9239-5216a682a505",
"track_keys": [
"igormonadical/daily-private-igor-20260110042117/1768018896877-890c0eae-e186-4534-a7bd-7c794b7d6d7f-cam-audio-1768018914565",
"igormonadical/daily-private-igor-20260110042117/1768018896877-9660e8e9-4297-4f17-951d-0b2bf2401803-cam-audio-1768018899286"
]
}
```
**Semantics:**
- `track_keys = null` → Not multitrack (cloud recording)
- `track_keys = []` → Multitrack recording with no audio captured (silence/muted)
- `track_keys = [...]` → Multitrack with N audio tracks
**Property:** `recording.is_multitrack` (Python)
```python
@property
def is_multitrack(self) -> bool:
return self.track_keys is not None and len(self.track_keys) > 0
```
### Track Filename Format
Daily.co multitrack filenames encode timing and participant information:
**Format:** `{recording_start_ts}-{participant_id}-cam-audio-{track_start_ts}`
**Example:** `1768018896877-890c0eae-e186-4534-a7bd-7c794b7d6d7f-cam-audio-1768018914565`
**Parsed Components:**
```python
# reflector/utils/daily.py:25-60
class DailyRecordingFilename(NamedTuple):
recording_start_ts: int # 1768018896877 (milliseconds)
participant_id: str # 890c0eae-e186-4534-a7bd-7c794b7d6d7f
track_start_ts: int # 1768018914565 (milliseconds)
```
**Note:** Browser downloads from S3 add `.webm` extension due to MIME headers, but S3 object keys have no extension.
### Video Track Filtering
Daily.co API returns both audio and video tracks, but Reflector only processes audio.
**Filtering Logic:** `reflector/worker/process.py:660`
```python
track_keys = [t.s3Key for t in recording.tracks if t.type == "audio"]
```
**Example API Response:**
```json
{
"tracks": [
{"type": "audio", "s3Key": "...cam-audio-1768018914565"},
{"type": "audio", "s3Key": "...cam-audio-1768018899286"},
{"type": "video", "s3Key": "...cam-video-1768018897095"} Filtered out
]
}
```
**Result:** Only 2 audio tracks stored in `recording.track_keys`, video track discarded.
**Rationale:** Reflector is audio transcription system; video not needed for processing.
### Track-to-Participant Mapping
**Flow:**
1. Daily.co webhook/polling provides `track_keys` array
2. Each track filename contains `participant_id`
3. Reflector queries Daily.co API: `GET /meetings/{mtgSessionId}/participants`
4. Maps `participant_id``user_name`
5. Stores in `transcript.participants` JSON:
```json
[
{
"id": "890c0eae-e186-4534-a7bd-7c794b7d6d7f",
"speaker": 0,
"name": "test2",
"user_id": "907f2cc1-eaab-435f-8ee2-09185f416b22"
},
{
"id": "9660e8e9-4297-4f17-951d-0b2bf2401803",
"speaker": 1,
"name": "test",
"user_id": "907f2cc1-eaab-435f-8ee2-09185f416b22"
}
]
```
**Diarization:** Multitrack recordings don't need speaker diarization AI — speaker identity comes from separate audio tracks.
---
## Example
### Meeting: daily-private-igor-20260110042117
**Context:** User conducted test recording with start/stop cycles, producing 3 recordings.
#### Database State
```sql
-- Meeting
id: 034804b8-cee2-4fb4-94d7-122f6f068a61
room_name: daily-private-igor-20260110042117
start_date: 2026-01-10 04:21:17+00
```
#### Daily.co API Response
```json
[
{
"id": "f4a50f94-053c-4f9d-bda6-78ad051fbc36",
"room_name": "daily-private-igor-20260110042117",
"start_ts": 1768018885,
"duration": 4,
"status": "finished",
"mtgSessionId": "c04334de-42a0-4c2a-96be-a49b068dca85",
"tracks": [
{"type": "audio", "s3Key": "...62e8f3ae...cam-audio-1768018885417"}
]
},
{
"id": "b0fa94da-73b5-4f95-9239-5216a682a505",
"room_name": "daily-private-igor-20260110042117",
"start_ts": 1768018896,
"duration": 80,
"status": "finished",
"mtgSessionId": "4cdae3c0-86cb-4578-8a6d-3a228bb48345",
"tracks": [
{"type": "audio", "s3Key": "...890c0eae...cam-audio-1768018914565"},
{"type": "audio", "s3Key": "...9660e8e9...cam-audio-1768018899286"},
{"type": "video", "s3Key": "...9660e8e9...cam-video-1768018897095"}
]
},
{
"id": "05edf519-9048-4b49-9a75-73e9826fd950",
"room_name": "daily-private-igor-20260110042117",
"start_ts": 1768018914,
"duration": 62,
"status": "finished",
"mtgSessionId": "4cdae3c0-86cb-4578-8a6d-3a228bb48345",
"tracks": [
{"type": "audio", "s3Key": "...890c0eae...cam-audio-1768018914948"}
]
}
]
```
**Key Observations:**
- 3 recording objects returned by Daily.co
- 2 different `mtgSessionId` values (2 different meeting instances)
- Recording #2 has 3 tracks (2 audio + 1 video)
- Timestamps: 1768018885 → 1768018896 (+11s) → 1768018914 (+18s)
#### Reflector Database
**Recordings:**
```
┌──────────────────────────────────────┬──────────────┬────────────┬──────────────────────────────────────┐
│ id │ track_count │ duration │ mtgSessionId │
├──────────────────────────────────────┼──────────────┼────────────┼──────────────────────────────────────┤
│ f4a50f94-053c-4f9d-bda6-78ad051fbc36 │ 1 │ 4s │ c04334de-42a0-4c2a-96be-a49b068dca85 │
│ b0fa94da-73b5-4f95-9239-5216a682a505 │ 2 (video=0) │ 80s │ 4cdae3c0-86cb-4578-8a6d-3a228bb48345 │
│ 05edf519-9048-4b49-9a75-73e9826fd950 │ 1 │ 62s │ 4cdae3c0-86cb-4578-8a6d-3a228bb48345 │
└──────────────────────────────────────┴──────────────┴────────────┴──────────────────────────────────────┘
```
**Note:** Recording #2 has 2 audio tracks (video filtered out), not 3.
**Transcripts:**
```
┌──────────────────────────────────────┬──────────────────────────────────────┬──────────────┬──────────────────────────────────────────────┐
│ id │ recording_id │ participants │ title │
├──────────────────────────────────────┼──────────────────────────────────────┼──────────────┼──────────────────────────────────────────────┤
│ 17149b1f-546c-4837-80a0-f8140bd16592 │ f4a50f94-053c-4f9d-bda6-78ad051fbc36 │ 1 (test) │ (empty - no speech) │
│ 49801332-3222-4c11-bdb2-375479fc87f2 │ b0fa94da-73b5-4f95-9239-5216a682a505 │ 2 (test, │ "Examination and Validation Procedures │
│ │ │ test2) │ Review" │
│ e5271e12-20fb-42d2-b5a8-21438abadef9 │ 05edf519-9048-4b49-9a75-73e9826fd950 │ 1 (test2) │ "Technical Sound Check Procedure Review" │
└──────────────────────────────────────┴──────────────────────────────────────┴──────────────┴──────────────────────────────────────────────┘
```
**Transcript Content:**
*Transcript #1* (17149b1f): Empty WebVTT (no audio captured)
*Transcript #2* (49801332):
```webvtt
WEBVTT
00:00:03.109 --> 00:00:05.589
<v Speaker1>Test, test, test. Test, test, test, test, test.
00:00:19.829 --> 00:00:22.710
<v Speaker0>Test test test test test test test test test test test.
```
**AI-Generated Summary:**
> "The meeting focused on the critical importance of rigorous testing for ensuring reliability and quality, with test and test2 emphasizing the need for a structured testing framework and meticulous documentation..."
*Transcript #3* (e5271e12):
```webvtt
WEBVTT
00:00:02.029 --> 00:00:04.910
<v Speaker0>Test, test, test, test, test, test, test, test, test, test, test.
```
#### Validation: track_keys → participants
**Recording #2 (b0fa94da) tracks:**
```json
[
".../890c0eae-e186-4534-a7bd-7c794b7d6d7f-cam-audio-...",
".../9660e8e9-4297-4f17-951d-0b2bf2401803-cam-audio-..."
]
```
**Transcript #2 (49801332) participants:**
```json
[
{"id": "890c0eae-e186-4534-a7bd-7c794b7d6d7f", "speaker": 0, "name": "test2"},
{"id": "9660e8e9-4297-4f17-951d-0b2bf2401803", "speaker": 1, "name": "test"}
]
```
### Data Flow
```
Daily.co API: 3 recordings
Polling: _poll_raw_tracks_recordings()
Worker: process_multitrack_recording.delay() × 3
DB: 3 recording rows created
Pipeline: Audio processing + transcription × 3
DB: 3 transcript rows created (1:1 with recordings)
UI: User sees 3 separate transcripts
```
**Result:** ✅ 1:1 Recording → Transcript relationship maintained.
---
**Document Version:** 1.0
**Last Verified:** 2026-01-15
**Data Source:** Production database + Daily.co API inspection

View File

@@ -0,0 +1,40 @@
"""add cloud recording support
Revision ID: 1b1e6a6fc465
Revises: bd3a729bb379
Create Date: 2026-01-09 17:17:33.535620
"""
from typing import Sequence, Union
import sqlalchemy as sa
from alembic import op
# revision identifiers, used by Alembic.
revision: str = "1b1e6a6fc465"
down_revision: Union[str, None] = "bd3a729bb379"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
with op.batch_alter_table("meeting", schema=None) as batch_op:
batch_op.add_column(
sa.Column("daily_composed_video_s3_key", sa.String(), nullable=True)
)
batch_op.add_column(
sa.Column("daily_composed_video_duration", sa.Integer(), nullable=True)
)
# ### end Alembic commands ###
def downgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
with op.batch_alter_table("meeting", schema=None) as batch_op:
batch_op.drop_column("daily_composed_video_duration")
batch_op.drop_column("daily_composed_video_s3_key")
# ### end Alembic commands ###

View File

@@ -0,0 +1,35 @@
"""drop_use_celery_column
Revision ID: 3aa20b96d963
Revises: e69f08ead8ea
Create Date: 2026-02-05 10:12:44.065279
"""
from typing import Sequence, Union
import sqlalchemy as sa
from alembic import op
# revision identifiers, used by Alembic.
revision: str = "3aa20b96d963"
down_revision: Union[str, None] = "e69f08ead8ea"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
with op.batch_alter_table("room", schema=None) as batch_op:
batch_op.drop_column("use_celery")
def downgrade() -> None:
with op.batch_alter_table("room", schema=None) as batch_op:
batch_op.add_column(
sa.Column(
"use_celery",
sa.Boolean(),
server_default=sa.text("false"),
nullable=False,
)
)

View File

@@ -0,0 +1,44 @@
"""replace_use_hatchet_with_use_celery
Revision ID: 80beb1ea3269
Revises: bd3a729bb379
Create Date: 2026-01-20 16:26:25.555869
"""
from typing import Sequence, Union
import sqlalchemy as sa
from alembic import op
# revision identifiers, used by Alembic.
revision: str = "80beb1ea3269"
down_revision: Union[str, None] = "bd3a729bb379"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
with op.batch_alter_table("room", schema=None) as batch_op:
batch_op.add_column(
sa.Column(
"use_celery",
sa.Boolean(),
server_default=sa.text("false"),
nullable=False,
)
)
batch_op.drop_column("use_hatchet")
def downgrade() -> None:
with op.batch_alter_table("room", schema=None) as batch_op:
batch_op.add_column(
sa.Column(
"use_hatchet",
sa.Boolean(),
server_default=sa.text("false"),
nullable=False,
)
)
batch_op.drop_column("use_celery")

View File

@@ -0,0 +1,23 @@
"""merge cloud recording and celery heads
Revision ID: e69f08ead8ea
Revises: 1b1e6a6fc465, 80beb1ea3269
Create Date: 2026-01-21 21:39:10.326841
"""
from typing import Sequence, Union
# revision identifiers, used by Alembic.
revision: str = "e69f08ead8ea"
down_revision: Union[str, None] = ("1b1e6a6fc465", "80beb1ea3269")
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
pass
def downgrade() -> None:
pass

View File

@@ -8,7 +8,7 @@ readme = "README.md"
dependencies = [
"aiohttp>=3.9.0",
"aiohttp-cors>=0.7.0",
"av>=10.0.0",
"av>=15.0.0",
"requests>=2.31.0",
"aiortc>=1.5.0",
"sortedcontainers>=2.4.0",
@@ -68,7 +68,6 @@ evaluation = [
"pydantic>=2.1.1",
]
local = [
"pyannote-audio>=3.3.2",
"faster-whisper>=0.10.0",
]
silero-vad = [

View File

@@ -22,6 +22,8 @@ def asynctask(f):
await database.disconnect()
coro = run_with_db()
if current_task:
return asyncio.run(coro)
try:
loop = asyncio.get_running_loop()
except RuntimeError:

View File

@@ -1,11 +1,5 @@
from typing import Annotated
from fastapi import Depends
from fastapi.security import OAuth2PasswordBearer
from pydantic import BaseModel
oauth2_scheme = OAuth2PasswordBearer(tokenUrl="token", auto_error=False)
class UserInfo(BaseModel):
sub: str
@@ -15,13 +9,13 @@ class AccessTokenInfo(BaseModel):
pass
def authenticated(token: Annotated[str, Depends(oauth2_scheme)]):
def authenticated():
return None
def current_user(token: Annotated[str, Depends(oauth2_scheme)]):
def current_user():
return None
def current_user_optional(token: Annotated[str, Depends(oauth2_scheme)]):
def current_user_optional():
return None

View File

@@ -3,7 +3,7 @@ Daily.co API Module
"""
# Client
from .client import DailyApiClient, DailyApiError
from .client import DailyApiClient, DailyApiError, RecordingType
# Request models
from .requests import (
@@ -64,6 +64,7 @@ __all__ = [
# Client
"DailyApiClient",
"DailyApiError",
"RecordingType",
# Requests
"CreateRoomRequest",
"RoomProperties",

View File

@@ -7,7 +7,8 @@ Reference: https://docs.daily.co/reference/rest-api
"""
from http import HTTPStatus
from typing import Any
from typing import Any, Literal
from uuid import UUID
import httpx
import structlog
@@ -32,6 +33,8 @@ from .responses import (
logger = structlog.get_logger(__name__)
RecordingType = Literal["cloud", "raw-tracks"]
class DailyApiError(Exception):
"""Daily.co API error with full request/response context."""
@@ -143,6 +146,8 @@ class DailyApiClient:
)
raise DailyApiError(operation, response)
if not response.content:
return {}
return response.json()
# ============================================================================
@@ -395,6 +400,38 @@ class DailyApiClient:
return [RecordingResponse(**r) for r in data["data"]]
async def start_recording(
self,
room_name: NonEmptyString,
recording_type: RecordingType,
instance_id: UUID,
) -> dict[str, Any]:
"""Start recording via REST API.
Reference: https://docs.daily.co/reference/rest-api/rooms/recordings/start
Args:
room_name: Daily.co room name
recording_type: Recording type
instance_id: UUID for this recording session
Returns:
Recording start confirmation from Daily.co API
Raises:
DailyApiError: If API request fails
"""
client = await self._get_client()
response = await client.post(
f"{self.base_url}/rooms/{room_name}/recordings/start",
headers=self.headers,
json={
"type": recording_type,
"instanceId": str(instance_id),
},
)
return await self._handle_response(response, "start_recording")
# ============================================================================
# MEETING TOKENS
# ============================================================================

View File

@@ -0,0 +1,37 @@
"""
Daily.co recording instanceId generation utilities.
Deterministic instance ID generation for cloud and raw-tracks recordings.
MUST match frontend logic
"""
from uuid import UUID, uuid5
from reflector.utils.string import NonEmptyString
# Namespace UUID for UUIDv5 generation of raw-tracks instanceIds
# DO NOT CHANGE: Breaks instanceId determinism across deployments and frontend/backend matching
RAW_TRACKS_NAMESPACE = UUID("a1b2c3d4-e5f6-7890-abcd-ef1234567890")
def generate_cloud_instance_id(meeting_id: NonEmptyString) -> UUID:
"""
Generate instanceId for cloud recording.
Cloud recordings use meeting ID directly as instanceId.
This ensures each meeting has one unique cloud recording.
"""
return UUID(meeting_id)
def generate_raw_tracks_instance_id(meeting_id: NonEmptyString) -> UUID:
"""
Generate instanceId for raw-tracks recording.
Raw-tracks recordings use UUIDv5(meeting_id, namespace) to ensure
different instanceId from cloud while remaining deterministic.
Daily.co requires cloud and raw-tracks to have different instanceIds
for concurrent recording.
"""
return uuid5(RAW_TRACKS_NAMESPACE, meeting_id)

View File

@@ -88,13 +88,6 @@ class MeetingTokenProperties(BaseModel):
is_owner: bool = Field(
default=False, description="Grant owner privileges to token holder"
)
start_cloud_recording: bool = Field(
default=False, description="Automatically start cloud recording on join"
)
start_cloud_recording_opts: dict | None = Field(
default=None,
description="Options for startRecording when start_cloud_recording is true (e.g., maxDuration)",
)
enable_recording_ui: bool = Field(
default=True, description="Show recording controls in UI"
)

View File

@@ -116,6 +116,7 @@ class RecordingS3Info(BaseModel):
bucket_name: NonEmptyString
bucket_region: NonEmptyString
key: NonEmptyString | None = None
endpoint: NonEmptyString | None = None
@@ -132,6 +133,9 @@ class RecordingResponse(BaseModel):
id: NonEmptyString = Field(description="Recording identifier")
room_name: NonEmptyString = Field(description="Room where recording occurred")
start_ts: int = Field(description="Recording start timestamp (Unix epoch seconds)")
type: Literal["cloud", "raw-tracks"] | None = Field(
None, description="Recording type (may be missing from API)"
)
status: RecordingStatus = Field(
description="Recording status ('in-progress' or 'finished')"
)
@@ -145,6 +149,9 @@ class RecordingResponse(BaseModel):
None, description="Token for sharing recording"
)
s3: RecordingS3Info | None = Field(None, description="S3 bucket information")
s3key: NonEmptyString | None = Field(
None, description="S3 key for cloud recordings (top-level field)"
)
tracks: list[DailyTrack] = Field(
default_factory=list,
description="Track list for raw-tracks recordings (always array, never null)",

View File

@@ -99,7 +99,7 @@ def extract_room_name(event: DailyWebhookEvent) -> str | None:
>>> event = DailyWebhookEvent(**webhook_payload)
>>> room_name = extract_room_name(event)
"""
room = event.payload.get("room_name")
room = event.payload.get("room_name") or event.payload.get("room")
# Ensure we return a string, not any falsy value that might be in payload
return room if isinstance(room, str) else None

View File

@@ -6,7 +6,7 @@ Reference: https://docs.daily.co/reference/rest-api/webhooks
from typing import Annotated, Any, Dict, Literal, Union
from pydantic import BaseModel, Field, field_validator
from pydantic import AliasChoices, BaseModel, ConfigDict, Field, field_validator
from reflector.utils.string import NonEmptyString
@@ -41,6 +41,8 @@ class DailyTrack(BaseModel):
Reference: https://docs.daily.co/reference/rest-api/recordings
"""
model_config = ConfigDict(extra="ignore")
type: Literal["audio", "video"]
s3Key: NonEmptyString = Field(description="S3 object key for the track file")
size: int = Field(description="File size in bytes")
@@ -54,6 +56,8 @@ class DailyWebhookEvent(BaseModel):
Reference: https://docs.daily.co/reference/rest-api/webhooks
"""
model_config = ConfigDict(extra="ignore")
version: NonEmptyString = Field(
description="Represents the version of the event. This uses semantic versioning to inform a consumer if the payload has introduced any breaking changes"
)
@@ -82,7 +86,13 @@ class ParticipantJoinedPayload(BaseModel):
Reference: https://docs.daily.co/reference/rest-api/webhooks/events/participant-joined
"""
room_name: NonEmptyString | None = Field(None, description="Daily.co room name")
model_config = ConfigDict(extra="ignore")
room_name: NonEmptyString | None = Field(
None,
description="Daily.co room name",
validation_alias=AliasChoices("room_name", "room"),
)
session_id: NonEmptyString = Field(description="Daily.co session identifier")
user_id: NonEmptyString = Field(description="User identifier (may be encoded)")
user_name: NonEmptyString | None = Field(None, description="User display name")
@@ -100,7 +110,13 @@ class ParticipantLeftPayload(BaseModel):
Reference: https://docs.daily.co/reference/rest-api/webhooks/events/participant-left
"""
room_name: NonEmptyString | None = Field(None, description="Daily.co room name")
model_config = ConfigDict(extra="ignore")
room_name: NonEmptyString | None = Field(
None,
description="Daily.co room name",
validation_alias=AliasChoices("room_name", "room"),
)
session_id: NonEmptyString = Field(description="Daily.co session identifier")
user_id: NonEmptyString = Field(description="User identifier (may be encoded)")
user_name: NonEmptyString | None = Field(None, description="User display name")
@@ -112,6 +128,9 @@ class ParticipantLeftPayload(BaseModel):
_normalize_joined_at = field_validator("joined_at", mode="before")(
normalize_timestamp_to_int
)
_normalize_duration = field_validator("duration", mode="before")(
normalize_timestamp_to_int
)
class RecordingStartedPayload(BaseModel):
@@ -121,6 +140,8 @@ class RecordingStartedPayload(BaseModel):
Reference: https://docs.daily.co/reference/rest-api/webhooks/events/recording-started
"""
model_config = ConfigDict(extra="ignore")
room_name: NonEmptyString | None = Field(None, description="Daily.co room name")
recording_id: NonEmptyString = Field(description="Recording identifier")
start_ts: int | None = Field(None, description="Recording start timestamp")
@@ -138,7 +159,9 @@ class RecordingReadyToDownloadPayload(BaseModel):
Reference: https://docs.daily.co/reference/rest-api/webhooks/events/recording-ready-to-download
"""
type: Literal["cloud", "raw-tracks"] = Field(
model_config = ConfigDict(extra="ignore")
type: Literal["cloud", "cloud-audio-only", "raw-tracks"] = Field(
description="The type of recording that was generated"
)
recording_id: NonEmptyString = Field(
@@ -153,8 +176,9 @@ class RecordingReadyToDownloadPayload(BaseModel):
status: Literal["finished"] = Field(
description="The status of the given recording (always 'finished' in ready-to-download webhook, see RecordingStatus in responses.py for full API statuses)"
)
max_participants: int = Field(
description="The number of participants on the call that were recorded"
max_participants: int | None = Field(
None,
description="The number of participants on the call that were recorded (optional; Daily may omit it in some webhook versions)",
)
duration: int = Field(description="The duration in seconds of the call")
s3_key: NonEmptyString = Field(
@@ -180,6 +204,8 @@ class RecordingErrorPayload(BaseModel):
Reference: https://docs.daily.co/reference/rest-api/webhooks/events/recording-error
"""
model_config = ConfigDict(extra="ignore")
action: Literal["clourd-recording-err", "cloud-recording-error"] = Field(
description="A string describing the event that was emitted (both variants are documented)"
)
@@ -200,6 +226,8 @@ class RecordingErrorPayload(BaseModel):
class ParticipantJoinedEvent(BaseModel):
model_config = ConfigDict(extra="ignore")
version: NonEmptyString
type: Literal["participant.joined"]
id: NonEmptyString
@@ -212,6 +240,8 @@ class ParticipantJoinedEvent(BaseModel):
class ParticipantLeftEvent(BaseModel):
model_config = ConfigDict(extra="ignore")
version: NonEmptyString
type: Literal["participant.left"]
id: NonEmptyString
@@ -224,6 +254,8 @@ class ParticipantLeftEvent(BaseModel):
class RecordingStartedEvent(BaseModel):
model_config = ConfigDict(extra="ignore")
version: NonEmptyString
type: Literal["recording.started"]
id: NonEmptyString
@@ -236,6 +268,8 @@ class RecordingStartedEvent(BaseModel):
class RecordingReadyEvent(BaseModel):
model_config = ConfigDict(extra="ignore")
version: NonEmptyString
type: Literal["recording.ready-to-download"]
id: NonEmptyString
@@ -248,6 +282,8 @@ class RecordingReadyEvent(BaseModel):
class RecordingErrorEvent(BaseModel):
model_config = ConfigDict(extra="ignore")
version: NonEmptyString
type: Literal["recording.error"]
id: NonEmptyString

View File

@@ -1,4 +1,4 @@
from datetime import datetime
from datetime import datetime, timedelta
from typing import Any, Literal
import sqlalchemy as sa
@@ -9,7 +9,7 @@ from reflector.db import get_database, metadata
from reflector.db.rooms import Room
from reflector.schemas.platform import WHEREBY_PLATFORM, Platform
from reflector.utils import generate_uuid4
from reflector.utils.string import assert_equal
from reflector.utils.string import NonEmptyString, assert_equal
meetings = sa.Table(
"meeting",
@@ -63,6 +63,9 @@ meetings = sa.Table(
nullable=False,
server_default=assert_equal(WHEREBY_PLATFORM, "whereby"),
),
# Daily.co composed video (Brady Bunch grid layout) - Daily.co only, not Whereby
sa.Column("daily_composed_video_s3_key", sa.String, nullable=True),
sa.Column("daily_composed_video_duration", sa.Integer, nullable=True),
sa.Index("idx_meeting_room_id", "room_id"),
sa.Index("idx_meeting_calendar_event", "calendar_event_id"),
)
@@ -110,6 +113,9 @@ class Meeting(BaseModel):
calendar_event_id: str | None = None
calendar_metadata: dict[str, Any] | None = None
platform: Platform = WHEREBY_PLATFORM
# Daily.co composed video (Brady Bunch grid) - Daily.co only
daily_composed_video_s3_key: str | None = None
daily_composed_video_duration: int | None = None
class MeetingController:
@@ -171,6 +177,90 @@ class MeetingController:
return None
return Meeting(**result)
async def get_by_room_name_all(self, room_name: str) -> list[Meeting]:
"""Get all meetings for a room name (not just most recent)."""
query = meetings.select().where(meetings.c.room_name == room_name)
results = await get_database().fetch_all(query)
return [Meeting(**r) for r in results]
async def get_by_room_name_and_time(
self,
room_name: NonEmptyString,
recording_start: datetime,
time_window_hours: int = 168,
) -> Meeting | None:
"""
Get meeting by room name closest to recording timestamp.
HACK ALERT: Daily.co doesn't return instanceId in recordings API response,
and mtgSessionId is separate from our instanceId. Time-based matching is
the least-bad workaround.
This handles edge case of duplicate room_name values in DB (race conditions,
double-clicks, etc.) by matching based on temporal proximity.
Algorithm:
1. Find meetings within time_window_hours of recording_start
2. Return meeting with start_date closest to recording_start
3. If tie, return first by meeting.id (deterministic)
Args:
room_name: Daily.co room name from recording
recording_start: Timezone-aware datetime from recording.start_ts
time_window_hours: Search window (default 168 = 1 week)
Returns:
Meeting closest to recording timestamp, or None if no matches
Failure modes:
- Multiple meetings in same room within ~5 minutes: picks closest
- All meetings outside time window: returns None
- Clock skew between Daily.co and DB: 1-week window tolerates this
Why 1 week window:
- Handles webhook failures (recording discovered days later)
- Tolerates clock skew
- Rejects unrelated meetings from weeks ago
"""
# Validate timezone-aware datetime
if recording_start.tzinfo is None:
raise ValueError(
f"recording_start must be timezone-aware, got naive datetime: {recording_start}"
)
window_start = recording_start - timedelta(hours=time_window_hours)
window_end = recording_start + timedelta(hours=time_window_hours)
query = (
meetings.select()
.where(
sa.and_(
meetings.c.room_name == room_name,
meetings.c.start_date >= window_start,
meetings.c.start_date <= window_end,
)
)
.order_by(meetings.c.start_date)
)
results = await get_database().fetch_all(query)
if not results:
return None
candidates = [Meeting(**r) for r in results]
# Find meeting with start_date closest to recording_start
closest = min(
candidates,
key=lambda m: (
abs((m.start_date - recording_start).total_seconds()),
m.id, # Tie-breaker: deterministic by UUID
),
)
return closest
async def get_active(self, room: Room, current_time: datetime) -> Meeting | None:
"""
Get latest active meeting for a room.
@@ -260,6 +350,44 @@ class MeetingController:
query = meetings.update().where(meetings.c.id == meeting_id).values(**kwargs)
await get_database().execute(query)
async def set_cloud_recording_if_missing(
self,
meeting_id: NonEmptyString,
s3_key: NonEmptyString,
duration: int,
) -> bool:
"""
Set cloud recording only if not already set.
Returns True if updated, False if already set.
Prevents webhook/polling race condition via atomic WHERE clause.
"""
# Check current value before update to detect actual change
meeting_before = await self.get_by_id(meeting_id)
if not meeting_before:
return False
was_null = meeting_before.daily_composed_video_s3_key is None
query = (
meetings.update()
.where(
sa.and_(
meetings.c.id == meeting_id,
meetings.c.daily_composed_video_s3_key.is_(None),
)
)
.values(
daily_composed_video_s3_key=s3_key,
daily_composed_video_duration=duration,
)
)
await get_database().execute(query)
# Return True only if value was NULL before (actual update occurred)
# If was_null=False, the WHERE clause prevented the update
return was_null
async def increment_num_clients(self, meeting_id: str) -> None:
"""Atomically increment participant count."""
query = (

View File

@@ -7,6 +7,7 @@ from sqlalchemy import or_
from reflector.db import get_database, metadata
from reflector.utils import generate_uuid4
from reflector.utils.string import NonEmptyString
recordings = sa.Table(
"recording",
@@ -71,6 +72,19 @@ class RecordingController:
query = recordings.delete().where(recordings.c.id == id)
await get_database().execute(query)
async def set_meeting_id(
self,
recording_id: NonEmptyString,
meeting_id: NonEmptyString,
) -> None:
"""Link recording to meeting."""
query = (
recordings.update()
.where(recordings.c.id == recording_id)
.values(meeting_id=meeting_id)
)
await get_database().execute(query)
# no check for existence
async def get_by_ids(self, recording_ids: list[str]) -> list[Recording]:
if not recording_ids:

View File

@@ -57,12 +57,6 @@ rooms = sqlalchemy.Table(
sqlalchemy.String,
nullable=False,
),
sqlalchemy.Column(
"use_hatchet",
sqlalchemy.Boolean,
nullable=False,
server_default=false(),
),
sqlalchemy.Column(
"skip_consent",
sqlalchemy.Boolean,
@@ -97,7 +91,6 @@ class Room(BaseModel):
ics_last_sync: datetime | None = None
ics_last_etag: str | None = None
platform: Platform = Field(default_factory=lambda: settings.DEFAULT_VIDEO_PLATFORM)
use_hatchet: bool = False
skip_consent: bool = False

View File

@@ -26,6 +26,7 @@ from reflector.db.rooms import rooms
from reflector.db.transcripts import SourceKind, TranscriptStatus, transcripts
from reflector.db.utils import is_postgresql
from reflector.logger import logger
from reflector.settings import settings
from reflector.utils.string import NonEmptyString, try_parse_non_empty_string
DEFAULT_SEARCH_LIMIT = 20
@@ -396,7 +397,7 @@ class SearchController:
transcripts.c.user_id == params.user_id, rooms.c.is_shared
)
)
else:
elif not settings.PUBLIC_MODE:
base_query = base_query.where(rooms.c.is_shared)
if params.room_id:
base_query = base_query.where(transcripts.c.room_id == params.room_id)

View File

@@ -406,7 +406,7 @@ class TranscriptController:
query = query.where(
or_(transcripts.c.user_id == user_id, rooms.c.is_shared)
)
else:
elif not settings.PUBLIC_MODE:
query = query.where(rooms.c.is_shared)
if source_kind:

View File

@@ -12,7 +12,9 @@ import threading
from hatchet_sdk import ClientConfig, Hatchet
from hatchet_sdk.clients.rest.models import V1TaskStatus
from hatchet_sdk.rate_limit import RateLimitDuration
from reflector.hatchet.constants import LLM_RATE_LIMIT_KEY, LLM_RATE_LIMIT_PER_SECOND
from reflector.logger import logger
from reflector.settings import settings
@@ -113,3 +115,26 @@ class HatchetClientManager:
"""Reset the client instance (for testing)."""
with cls._lock:
cls._instance = None
@classmethod
async def ensure_rate_limit(cls) -> None:
"""Ensure the LLM rate limit exists in Hatchet.
Uses the Hatchet SDK rate_limits client (aio_put). See:
https://docs.hatchet.run/sdks/python/feature-clients/rate_limits
"""
logger.info(
"[Hatchet] Ensuring rate limit exists",
rate_limit_key=LLM_RATE_LIMIT_KEY,
limit=LLM_RATE_LIMIT_PER_SECOND,
)
client = cls.get_client()
await client.rate_limits.aio_put(
key=LLM_RATE_LIMIT_KEY,
limit=LLM_RATE_LIMIT_PER_SECOND,
duration=RateLimitDuration.SECOND,
)
logger.info(
"[Hatchet] Rate limit put successfully",
rate_limit_key=LLM_RATE_LIMIT_KEY,
)

View File

@@ -35,7 +35,9 @@ LLM_RATE_LIMIT_PER_SECOND = 10
# Task execution timeouts (seconds)
TIMEOUT_SHORT = 60 # Quick operations: API calls, DB updates
TIMEOUT_MEDIUM = 120 # Single LLM calls, waveform generation
TIMEOUT_MEDIUM = (
300 # Single LLM calls, waveform generation (5m for slow LLM responses)
)
TIMEOUT_LONG = 180 # Action items (larger context LLM)
TIMEOUT_AUDIO = 300 # Audio processing: padding, mixdown
TIMEOUT_AUDIO = 720 # Audio processing: padding, mixdown
TIMEOUT_HEAVY = 600 # Transcription, fan-out LLM tasks

View File

@@ -12,14 +12,9 @@ from reflector.hatchet.workflows.daily_multitrack_pipeline import (
daily_multitrack_pipeline,
)
from reflector.logger import logger
from reflector.settings import settings
def main():
if not settings.HATCHET_ENABLED:
logger.error("HATCHET_ENABLED is False, not starting CPU workers")
return
hatchet = HatchetClientManager.get_client()
logger.info(

View File

@@ -3,6 +3,8 @@ LLM/I/O worker pool for all non-CPU tasks.
Handles: all tasks except mixdown_tracks (transcription, LLM inference, orchestration)
"""
import asyncio
from reflector.hatchet.client import HatchetClientManager
from reflector.hatchet.workflows.daily_multitrack_pipeline import (
daily_multitrack_pipeline,
@@ -11,7 +13,6 @@ from reflector.hatchet.workflows.subject_processing import subject_workflow
from reflector.hatchet.workflows.topic_chunk_processing import topic_chunk_workflow
from reflector.hatchet.workflows.track_processing import track_workflow
from reflector.logger import logger
from reflector.settings import settings
SLOTS = 10
WORKER_NAME = "llm-worker-pool"
@@ -19,12 +20,17 @@ POOL = "llm-io"
def main():
if not settings.HATCHET_ENABLED:
logger.error("HATCHET_ENABLED is False, not starting LLM workers")
return
hatchet = HatchetClientManager.get_client()
try:
asyncio.run(HatchetClientManager.ensure_rate_limit())
except Exception as e:
logger.warning(
"[Hatchet] Rate limit initialization failed, but continuing. "
"If workflows fail to register, rate limits may need to be created manually.",
error=str(e),
)
logger.info(
"Starting Hatchet LLM worker pool (all tasks except mixdown)",
worker_name=WORKER_NAME,

View File

@@ -171,11 +171,13 @@ async def set_workflow_error_status(transcript_id: NonEmptyString) -> bool:
def _spawn_storage():
"""Create fresh storage instance."""
# TODO: replace direct AwsStorage construction with get_transcripts_storage() factory
return AwsStorage(
aws_bucket_name=settings.TRANSCRIPT_STORAGE_AWS_BUCKET_NAME,
aws_region=settings.TRANSCRIPT_STORAGE_AWS_REGION,
aws_access_key_id=settings.TRANSCRIPT_STORAGE_AWS_ACCESS_KEY_ID,
aws_secret_access_key=settings.TRANSCRIPT_STORAGE_AWS_SECRET_ACCESS_KEY,
aws_endpoint_url=settings.TRANSCRIPT_STORAGE_AWS_ENDPOINT_URL,
)
@@ -322,6 +324,7 @@ async def get_participants(input: PipelineInput, ctx: Context) -> ParticipantsRe
mtg_session_id = recording.mtg_session_id
async with fresh_db_connection():
from reflector.db.transcripts import ( # noqa: PLC0415
TranscriptDuration,
TranscriptParticipant,
transcripts_controller,
)
@@ -330,15 +333,26 @@ async def get_participants(input: PipelineInput, ctx: Context) -> ParticipantsRe
if not transcript:
raise ValueError(f"Transcript {input.transcript_id} not found")
# Note: title NOT cleared - preserves existing titles
# Duration from Daily API (seconds -> milliseconds) - master source
duration_ms = recording.duration * 1000 if recording.duration else 0
await transcripts_controller.update(
transcript,
{
"events": [],
"topics": [],
"participants": [],
"duration": duration_ms,
},
)
await append_event_and_broadcast(
input.transcript_id,
transcript,
"DURATION",
TranscriptDuration(duration=duration_ms),
logger=logger,
)
mtg_session_id = assert_non_none_and_non_empty(
mtg_session_id, "mtg_session_id is required"
)
@@ -706,7 +720,6 @@ async def detect_topics(input: PipelineInput, ctx: Context) -> TopicsResult:
chunk_text=chunk["text"],
timestamp=chunk["timestamp"],
duration=chunk["duration"],
words=chunk["words"],
)
)
for chunk in chunks
@@ -718,31 +731,41 @@ async def detect_topics(input: PipelineInput, ctx: Context) -> TopicsResult:
TopicChunkResult(**result[TaskName.DETECT_CHUNK_TOPIC]) for result in results
]
# Build index-to-words map from local chunks (words not in child workflow results)
chunks_by_index = {chunk["index"]: chunk["words"] for chunk in chunks}
async with fresh_db_connection():
transcript = await transcripts_controller.get_by_id(input.transcript_id)
if not transcript:
raise ValueError(f"Transcript {input.transcript_id} not found")
# Clear topics for idempotency on retry (each topic gets a fresh UUID,
# so upsert_topic would append duplicates without this)
await transcripts_controller.update(transcript, {"topics": []})
for chunk in topic_chunks:
chunk_words = chunks_by_index[chunk.chunk_index]
topic = TranscriptTopic(
title=chunk.title,
summary=chunk.summary,
timestamp=chunk.timestamp,
transcript=" ".join(w.text for w in chunk.words),
words=chunk.words,
transcript=" ".join(w.text for w in chunk_words),
words=chunk_words,
)
await transcripts_controller.upsert_topic(transcript, topic)
await append_event_and_broadcast(
input.transcript_id, transcript, "TOPIC", topic, logger=logger
)
# Words omitted from TopicsResult — already persisted to DB above.
# Downstream tasks that need words refetch from DB.
topics_list = [
TitleSummary(
title=chunk.title,
summary=chunk.summary,
timestamp=chunk.timestamp,
duration=chunk.duration,
transcript=TranscriptType(words=chunk.words),
transcript=TranscriptType(words=[]),
)
for chunk in topic_chunks
]
@@ -828,9 +851,8 @@ async def extract_subjects(input: PipelineInput, ctx: Context) -> SubjectsResult
ctx.log(f"extract_subjects: starting for transcript_id={input.transcript_id}")
topics_result = ctx.task_output(detect_topics)
topics = topics_result.topics
if not topics:
if not topics_result.topics:
ctx.log("extract_subjects: no topics, returning empty subjects")
return SubjectsResult(
subjects=[],
@@ -843,11 +865,13 @@ async def extract_subjects(input: PipelineInput, ctx: Context) -> SubjectsResult
# sharing DB connections and LLM HTTP pools across forks
from reflector.db.transcripts import transcripts_controller # noqa: PLC0415
from reflector.llm import LLM # noqa: PLC0415
from reflector.processors.types import words_to_segments # noqa: PLC0415
async with fresh_db_connection():
transcript = await transcripts_controller.get_by_id(input.transcript_id)
# Build transcript text from topics (same logic as TranscriptFinalSummaryProcessor)
# Build transcript text from DB topics (words omitted from task output
# to reduce Hatchet payload size — refetch from DB where they were persisted)
speakermap = {}
if transcript and transcript.participants:
speakermap = {
@@ -857,8 +881,8 @@ async def extract_subjects(input: PipelineInput, ctx: Context) -> SubjectsResult
}
text_lines = []
for topic in topics:
for segment in topic.transcript.as_segments():
for db_topic in transcript.topics:
for segment in words_to_segments(db_topic.words):
name = speakermap.get(segment.speaker, f"Speaker {segment.speaker}")
text_lines.append(f"{name}: {segment.text}")
@@ -1095,7 +1119,7 @@ async def identify_action_items(
@daily_multitrack_pipeline.task(
parents=[generate_waveform, generate_title, generate_recap, identify_action_items],
parents=[process_tracks, generate_title, generate_recap, identify_action_items],
execution_timeout=timedelta(seconds=TIMEOUT_SHORT),
retries=3,
)
@@ -1108,12 +1132,8 @@ async def finalize(input: PipelineInput, ctx: Context) -> FinalizeResult:
"""
ctx.log("finalize: saving transcript and setting status to 'ended'")
mixdown_result = ctx.task_output(mixdown_tracks)
track_result = ctx.task_output(process_tracks)
duration = mixdown_result.duration
all_words = track_result.all_words
# Cleanup temporary padded S3 files (deferred until finalize for semantic parity with Celery)
created_padded_files = track_result.created_padded_files
if created_padded_files:
@@ -1133,7 +1153,6 @@ async def finalize(input: PipelineInput, ctx: Context) -> FinalizeResult:
async with fresh_db_connection():
from reflector.db.transcripts import ( # noqa: PLC0415
TranscriptDuration,
TranscriptText,
transcripts_controller,
)
@@ -1142,34 +1161,26 @@ async def finalize(input: PipelineInput, ctx: Context) -> FinalizeResult:
if transcript is None:
raise ValueError(f"Transcript {input.transcript_id} not found in database")
merged_transcript = TranscriptType(words=all_words, translation=None)
await append_event_and_broadcast(
input.transcript_id,
transcript,
"TRANSCRIPT",
TranscriptText(
text=merged_transcript.text,
translation=merged_transcript.translation,
text="",
translation=None,
),
logger=logger,
)
# Save duration and clear workflow_run_id (workflow completed successfully)
# Note: title/long_summary/short_summary already saved by their callbacks
# Clear workflow_run_id (workflow completed successfully)
# Note: title/long_summary/short_summary/duration already saved by their callbacks
await transcripts_controller.update(
transcript,
{
"duration": duration,
"workflow_run_id": None, # Clear on success - no need to resume
},
)
duration_data = TranscriptDuration(duration=duration)
await append_event_and_broadcast(
input.transcript_id, transcript, "DURATION", duration_data, logger=logger
)
await set_status_and_broadcast(input.transcript_id, "ended", logger=logger)
ctx.log(
@@ -1347,14 +1358,34 @@ async def send_webhook(input: PipelineInput, ctx: Context) -> WebhookResult:
f"participants={len(payload.transcript.participants)})"
)
response = await send_webhook_request(
url=room.webhook_url,
payload=payload,
event_type="transcript.completed",
webhook_secret=room.webhook_secret,
timeout=30.0,
)
try:
response = await send_webhook_request(
url=room.webhook_url,
payload=payload,
event_type="transcript.completed",
webhook_secret=room.webhook_secret,
timeout=30.0,
)
ctx.log(f"send_webhook complete: status_code={response.status_code}")
ctx.log(f"send_webhook complete: status_code={response.status_code}")
return WebhookResult(webhook_sent=True, response_code=response.status_code)
return WebhookResult(webhook_sent=True, response_code=response.status_code)
except httpx.HTTPStatusError as e:
ctx.log(
f"send_webhook failed (HTTP {e.response.status_code}), continuing anyway"
)
return WebhookResult(
webhook_sent=False, response_code=e.response.status_code
)
except httpx.ConnectError as e:
ctx.log(f"send_webhook failed (connection error), continuing anyway: {e}")
return WebhookResult(webhook_sent=False)
except httpx.TimeoutException as e:
ctx.log(f"send_webhook failed (timeout), continuing anyway: {e}")
return WebhookResult(webhook_sent=False)
except Exception as e:
ctx.log(f"send_webhook unexpected error, continuing anyway: {e}")
return WebhookResult(webhook_sent=False)

View File

@@ -95,7 +95,6 @@ class TopicChunkResult(BaseModel):
summary: str
timestamp: float
duration: float
words: list[Word]
class TopicsResult(BaseModel):

View File

@@ -0,0 +1,167 @@
"""
Hatchet child workflow: PaddingWorkflow
Handles individual audio track padding via Modal.com backend.
"""
from datetime import timedelta
import av
from hatchet_sdk import Context
from pydantic import BaseModel
from reflector.hatchet.client import HatchetClientManager
from reflector.hatchet.constants import TIMEOUT_AUDIO
from reflector.hatchet.workflows.models import PadTrackResult
from reflector.logger import logger
from reflector.utils.audio_constants import PRESIGNED_URL_EXPIRATION_SECONDS
from reflector.utils.audio_padding import extract_stream_start_time_from_container
class PaddingInput(BaseModel):
"""Input for individual track padding."""
track_index: int
s3_key: str
bucket_name: str
transcript_id: str
hatchet = HatchetClientManager.get_client()
padding_workflow = hatchet.workflow(
name="PaddingWorkflow", input_validator=PaddingInput
)
@padding_workflow.task(execution_timeout=timedelta(seconds=TIMEOUT_AUDIO), retries=3)
async def pad_track(input: PaddingInput, ctx: Context) -> PadTrackResult:
"""Pad audio track with silence based on WebM container start_time."""
ctx.log(f"pad_track: track {input.track_index}, s3_key={input.s3_key}")
logger.info(
"[Hatchet] pad_track",
track_index=input.track_index,
s3_key=input.s3_key,
transcript_id=input.transcript_id,
)
try:
# Create fresh storage instance to avoid aioboto3 fork issues
from reflector.settings import settings # noqa: PLC0415
from reflector.storage.storage_aws import AwsStorage # noqa: PLC0415
# TODO: replace direct AwsStorage construction with get_transcripts_storage() factory
storage = AwsStorage(
aws_bucket_name=settings.TRANSCRIPT_STORAGE_AWS_BUCKET_NAME,
aws_region=settings.TRANSCRIPT_STORAGE_AWS_REGION,
aws_access_key_id=settings.TRANSCRIPT_STORAGE_AWS_ACCESS_KEY_ID,
aws_secret_access_key=settings.TRANSCRIPT_STORAGE_AWS_SECRET_ACCESS_KEY,
aws_endpoint_url=settings.TRANSCRIPT_STORAGE_AWS_ENDPOINT_URL,
)
source_url = await storage.get_file_url(
input.s3_key,
operation="get_object",
expires_in=PRESIGNED_URL_EXPIRATION_SECONDS,
bucket=input.bucket_name,
)
# Extract start_time to determine if padding needed
with av.open(source_url) as in_container:
if in_container.duration:
try:
duration = timedelta(seconds=in_container.duration // 1_000_000)
ctx.log(
f"pad_track: track {input.track_index}, duration={duration}"
)
except (ValueError, TypeError, OverflowError) as e:
ctx.log(
f"pad_track: track {input.track_index}, duration error: {str(e)}"
)
start_time_seconds = extract_stream_start_time_from_container(
in_container, input.track_index, logger=logger
)
if start_time_seconds <= 0:
logger.info(
f"Track {input.track_index} requires no padding",
track_index=input.track_index,
)
return PadTrackResult(
padded_key=input.s3_key,
bucket_name=input.bucket_name,
size=0,
track_index=input.track_index,
)
storage_path = f"file_pipeline_hatchet/{input.transcript_id}/tracks/padded_{input.track_index}.webm"
# Presign PUT URL for output (Modal will upload directly)
output_url = await storage.get_file_url(
storage_path,
operation="put_object",
expires_in=PRESIGNED_URL_EXPIRATION_SECONDS,
)
import httpx # noqa: PLC0415
from reflector.processors.audio_padding_modal import ( # noqa: PLC0415
AudioPaddingModalProcessor,
)
try:
processor = AudioPaddingModalProcessor()
result = await processor.pad_track(
track_url=source_url,
output_url=output_url,
start_time_seconds=start_time_seconds,
track_index=input.track_index,
)
file_size = result.size
ctx.log(f"pad_track: Modal returned size={file_size}")
except httpx.HTTPStatusError as e:
error_detail = e.response.text if hasattr(e.response, "text") else str(e)
logger.error(
"[Hatchet] Modal padding HTTP error",
transcript_id=input.transcript_id,
track_index=input.track_index,
status_code=e.response.status_code if hasattr(e, "response") else None,
error=error_detail,
exc_info=True,
)
raise Exception(
f"Modal padding failed: HTTP {e.response.status_code}"
) from e
except httpx.TimeoutException as e:
logger.error(
"[Hatchet] Modal padding timeout",
transcript_id=input.transcript_id,
track_index=input.track_index,
error=str(e),
exc_info=True,
)
raise Exception("Modal padding timeout") from e
logger.info(
"[Hatchet] pad_track complete",
track_index=input.track_index,
padded_key=storage_path,
)
return PadTrackResult(
padded_key=storage_path,
bucket_name=None, # None = use default transcript storage bucket
size=file_size,
track_index=input.track_index,
)
except Exception as e:
logger.error(
"[Hatchet] pad_track failed",
transcript_id=input.transcript_id,
track_index=input.track_index,
error=str(e),
exc_info=True,
)
raise

View File

@@ -20,7 +20,6 @@ from reflector.hatchet.constants import LLM_RATE_LIMIT_KEY, TIMEOUT_MEDIUM
from reflector.hatchet.workflows.models import TopicChunkResult
from reflector.logger import logger
from reflector.processors.prompts import TOPIC_PROMPT
from reflector.processors.types import Word
class TopicChunkInput(BaseModel):
@@ -30,7 +29,6 @@ class TopicChunkInput(BaseModel):
chunk_text: str
timestamp: float
duration: float
words: list[Word]
hatchet = HatchetClientManager.get_client()
@@ -99,5 +97,4 @@ async def detect_chunk_topic(input: TopicChunkInput, ctx: Context) -> TopicChunk
summary=response.summary,
timestamp=input.timestamp,
duration=input.duration,
words=input.words,
)

View File

@@ -14,9 +14,7 @@ Hatchet workers run in forked processes; fresh imports per task ensure
storage/DB connections are not shared across forks.
"""
import tempfile
from datetime import timedelta
from pathlib import Path
import av
from hatchet_sdk import Context
@@ -27,10 +25,7 @@ from reflector.hatchet.constants import TIMEOUT_AUDIO, TIMEOUT_HEAVY
from reflector.hatchet.workflows.models import PadTrackResult, TranscribeTrackResult
from reflector.logger import logger
from reflector.utils.audio_constants import PRESIGNED_URL_EXPIRATION_SECONDS
from reflector.utils.audio_padding import (
apply_audio_padding_to_file,
extract_stream_start_time_from_container,
)
from reflector.utils.audio_padding import extract_stream_start_time_from_container
class TrackInput(BaseModel):
@@ -65,6 +60,7 @@ async def pad_track(input: TrackInput, ctx: Context) -> PadTrackResult:
try:
# Create fresh storage instance to avoid aioboto3 fork issues
# TODO: replace direct AwsStorage construction with get_transcripts_storage() factory
from reflector.settings import settings # noqa: PLC0415
from reflector.storage.storage_aws import AwsStorage # noqa: PLC0415
@@ -73,6 +69,7 @@ async def pad_track(input: TrackInput, ctx: Context) -> PadTrackResult:
aws_region=settings.TRANSCRIPT_STORAGE_AWS_REGION,
aws_access_key_id=settings.TRANSCRIPT_STORAGE_AWS_ACCESS_KEY_ID,
aws_secret_access_key=settings.TRANSCRIPT_STORAGE_AWS_SECRET_ACCESS_KEY,
aws_endpoint_url=settings.TRANSCRIPT_STORAGE_AWS_ENDPOINT_URL,
)
source_url = await storage.get_file_url(
@@ -83,63 +80,44 @@ async def pad_track(input: TrackInput, ctx: Context) -> PadTrackResult:
)
with av.open(source_url) as in_container:
if in_container.duration:
try:
duration = timedelta(seconds=in_container.duration // 1_000_000)
ctx.log(
f"pad_track: track {input.track_index}, duration={duration}"
)
except Exception:
ctx.log(f"pad_track: track {input.track_index}, duration=ERROR")
start_time_seconds = extract_stream_start_time_from_container(
in_container, input.track_index, logger=logger
)
# If no padding needed, return original S3 key
if start_time_seconds <= 0:
logger.info(
f"Track {input.track_index} requires no padding",
track_index=input.track_index,
)
return PadTrackResult(
padded_key=input.s3_key,
bucket_name=input.bucket_name,
size=0,
track_index=input.track_index,
)
# If no padding needed, return original S3 key
if start_time_seconds <= 0:
logger.info(
f"Track {input.track_index} requires no padding",
track_index=input.track_index,
)
return PadTrackResult(
padded_key=input.s3_key,
bucket_name=input.bucket_name,
size=0,
track_index=input.track_index,
)
with tempfile.NamedTemporaryFile(suffix=".webm", delete=False) as temp_file:
temp_path = temp_file.name
storage_path = f"file_pipeline_hatchet/{input.transcript_id}/tracks/padded_{input.track_index}.webm"
try:
apply_audio_padding_to_file(
in_container,
temp_path,
start_time_seconds,
input.track_index,
logger=logger,
)
# Presign PUT URL for output (Modal uploads directly)
output_url = await storage.get_file_url(
storage_path,
operation="put_object",
expires_in=PRESIGNED_URL_EXPIRATION_SECONDS,
)
file_size = Path(temp_path).stat().st_size
storage_path = f"file_pipeline_hatchet/{input.transcript_id}/tracks/padded_{input.track_index}.webm"
from reflector.processors.audio_padding_modal import ( # noqa: PLC0415
AudioPaddingModalProcessor,
)
logger.info(
f"About to upload padded track",
key=storage_path,
size=file_size,
)
with open(temp_path, "rb") as padded_file:
await storage.put_file(storage_path, padded_file)
logger.info(
f"Uploaded padded track to S3",
key=storage_path,
size=file_size,
)
finally:
Path(temp_path).unlink(missing_ok=True)
processor = AudioPaddingModalProcessor()
result = await processor.pad_track(
track_url=source_url,
output_url=output_url,
start_time_seconds=start_time_seconds,
track_index=input.track_index,
)
file_size = result.size
ctx.log(f"pad_track complete: track {input.track_index} -> {storage_path}")
logger.info(
@@ -183,6 +161,7 @@ async def transcribe_track(input: TrackInput, ctx: Context) -> TranscribeTrackRe
raise ValueError("Missing padded_key from pad_track")
# Presign URL on demand (avoids stale URLs on workflow replay)
# TODO: replace direct AwsStorage construction with get_transcripts_storage() factory
from reflector.settings import settings # noqa: PLC0415
from reflector.storage.storage_aws import AwsStorage # noqa: PLC0415
@@ -191,6 +170,7 @@ async def transcribe_track(input: TrackInput, ctx: Context) -> TranscribeTrackRe
aws_region=settings.TRANSCRIPT_STORAGE_AWS_REGION,
aws_access_key_id=settings.TRANSCRIPT_STORAGE_AWS_ACCESS_KEY_ID,
aws_secret_access_key=settings.TRANSCRIPT_STORAGE_AWS_SECRET_ACCESS_KEY,
aws_endpoint_url=settings.TRANSCRIPT_STORAGE_AWS_ENDPOINT_URL,
)
audio_url = await storage.get_file_url(

View File

@@ -144,7 +144,18 @@ class StructuredOutputWorkflow(Workflow, Generic[OutputT]):
)
# Network retries handled by OpenAILike (max_retries=3)
response = await Settings.llm.acomplete(json_prompt)
# response_format enables grammar-based constrained decoding on backends
# that support it (DMR/llama.cpp, vLLM, Ollama, OpenAI).
response = await Settings.llm.acomplete(
json_prompt,
response_format={
"type": "json_schema",
"json_schema": {
"name": self.output_cls.__name__,
"schema": self.output_cls.model_json_schema(),
},
},
)
return ExtractionDone(output=response.text)
@step

View File

@@ -1,74 +0,0 @@
import os
import torch
import torchaudio
from pyannote.audio import Pipeline
from reflector.processors.audio_diarization import AudioDiarizationProcessor
from reflector.processors.audio_diarization_auto import AudioDiarizationAutoProcessor
from reflector.processors.types import AudioDiarizationInput, DiarizationSegment
class AudioDiarizationPyannoteProcessor(AudioDiarizationProcessor):
"""Local diarization processor using pyannote.audio library"""
def __init__(
self,
model_name: str = "pyannote/speaker-diarization-3.1",
pyannote_auth_token: str | None = None,
device: str | None = None,
**kwargs,
):
super().__init__(**kwargs)
self.model_name = model_name
self.auth_token = pyannote_auth_token or os.environ.get("HF_TOKEN")
self.device = device
if device is None:
self.device = "cuda" if torch.cuda.is_available() else "cpu"
self.logger.info(f"Loading pyannote diarization model: {self.model_name}")
self.diarization_pipeline = Pipeline.from_pretrained(
self.model_name, use_auth_token=self.auth_token
)
self.diarization_pipeline.to(torch.device(self.device))
self.logger.info(f"Diarization model loaded on device: {self.device}")
async def _diarize(self, data: AudioDiarizationInput) -> list[DiarizationSegment]:
try:
# Load audio file (audio_url is assumed to be a local file path)
self.logger.info(f"Loading local audio file: {data.audio_url}")
waveform, sample_rate = torchaudio.load(data.audio_url)
audio_input = {"waveform": waveform, "sample_rate": sample_rate}
self.logger.info("Running speaker diarization")
diarization = self.diarization_pipeline(audio_input)
# Convert pyannote diarization output to our format
segments = []
for segment, _, speaker in diarization.itertracks(yield_label=True):
# Extract speaker number from label (e.g., "SPEAKER_00" -> 0)
speaker_id = 0
if speaker.startswith("SPEAKER_"):
try:
speaker_id = int(speaker.split("_")[-1])
except (ValueError, IndexError):
# Fallback to hash-based ID if parsing fails
speaker_id = hash(speaker) % 1000
segments.append(
{
"start": round(segment.start, 3),
"end": round(segment.end, 3),
"speaker": speaker_id,
}
)
self.logger.info(f"Diarization completed with {len(segments)} segments")
return segments
except Exception as e:
self.logger.exception(f"Diarization failed: {e}")
raise
AudioDiarizationAutoProcessor.register("pyannote", AudioDiarizationPyannoteProcessor)

View File

@@ -0,0 +1,113 @@
"""
Modal.com backend for audio padding.
"""
import asyncio
import os
import httpx
from pydantic import BaseModel
from reflector.hatchet.constants import TIMEOUT_AUDIO
from reflector.logger import logger
class PaddingResponse(BaseModel):
size: int
cancelled: bool = False
class AudioPaddingModalProcessor:
"""Audio padding processor using Modal.com CPU backend via HTTP."""
def __init__(
self, padding_url: str | None = None, modal_api_key: str | None = None
):
self.padding_url = padding_url or os.getenv("PADDING_URL")
if not self.padding_url:
raise ValueError(
"PADDING_URL required to use AudioPaddingModalProcessor. "
"Set PADDING_URL environment variable or pass padding_url parameter."
)
self.modal_api_key = modal_api_key or os.getenv("MODAL_API_KEY")
async def pad_track(
self,
track_url: str,
output_url: str,
start_time_seconds: float,
track_index: int,
) -> PaddingResponse:
"""Pad audio track with silence via Modal backend.
Args:
track_url: Presigned GET URL for source audio track
output_url: Presigned PUT URL for output WebM
start_time_seconds: Amount of silence to prepend
track_index: Track index for logging
"""
if not track_url:
raise ValueError("track_url cannot be empty")
if start_time_seconds <= 0:
raise ValueError(
f"start_time_seconds must be positive, got {start_time_seconds}"
)
log = logger.bind(track_index=track_index, padding_seconds=start_time_seconds)
log.info("Sending Modal padding HTTP request")
url = f"{self.padding_url}/pad"
headers = {}
if self.modal_api_key:
headers["Authorization"] = f"Bearer {self.modal_api_key}"
try:
async with httpx.AsyncClient(timeout=TIMEOUT_AUDIO) as client:
response = await client.post(
url,
headers=headers,
json={
"track_url": track_url,
"output_url": output_url,
"start_time_seconds": start_time_seconds,
"track_index": track_index,
},
follow_redirects=True,
)
if response.status_code != 200:
error_body = response.text
log.error(
"Modal padding API error",
status_code=response.status_code,
error_body=error_body,
)
response.raise_for_status()
result = response.json()
# Check if work was cancelled
if result.get("cancelled"):
log.warning("Modal padding was cancelled by disconnect detection")
raise asyncio.CancelledError(
"Padding cancelled due to client disconnect"
)
log.info("Modal padding complete", size=result["size"])
return PaddingResponse(**result)
except asyncio.CancelledError:
log.warning(
"Modal padding cancelled (Hatchet timeout, disconnect detected on Modal side)"
)
raise
except httpx.TimeoutException as e:
log.error("Modal padding timeout", error=str(e), exc_info=True)
raise Exception(f"Modal padding timeout: {e}") from e
except httpx.HTTPStatusError as e:
log.error("Modal padding HTTP error", error=str(e), exc_info=True)
raise Exception(f"Modal padding HTTP error: {e}") from e
except Exception as e:
log.error("Modal padding unexpected error", error=str(e), exc_info=True)
raise

View File

@@ -319,21 +319,6 @@ class ICSSyncService:
calendar = self.fetch_service.parse_ics(ics_content)
content_hash = hashlib.md5(ics_content.encode()).hexdigest()
if room.ics_last_etag == content_hash:
logger.info("No changes in ICS for room", room_id=room.id)
room_url = f"{settings.UI_BASE_URL}/{room.name}"
events, total_events = self.fetch_service.extract_room_events(
calendar, room.name, room_url
)
return {
"status": SyncStatus.UNCHANGED,
"hash": content_hash,
"events_found": len(events),
"total_events": total_events,
"events_created": 0,
"events_updated": 0,
"events_deleted": 0,
}
# Extract matching events
room_url = f"{settings.UI_BASE_URL}/{room.name}"
@@ -371,6 +356,44 @@ class ICSSyncService:
time_since_sync = datetime.now(timezone.utc) - room.ics_last_sync
return time_since_sync.total_seconds() >= room.ics_fetch_interval
def _event_data_changed(self, existing: CalendarEvent, new_data: EventData) -> bool:
"""Check if event data has changed by comparing relevant fields.
IMPORTANT: When adding fields to CalendarEvent/EventData, update this method
and the _COMPARED_FIELDS set below for runtime validation.
"""
# Fields that come from ICS and should trigger updates when changed
_COMPARED_FIELDS = {
"title",
"description",
"start_time",
"end_time",
"location",
"attendees",
"ics_raw_data",
}
# Runtime exhaustiveness check: ensure we're comparing all EventData fields
event_data_fields = set(EventData.__annotations__.keys()) - {"ics_uid"}
if event_data_fields != _COMPARED_FIELDS:
missing = event_data_fields - _COMPARED_FIELDS
extra = _COMPARED_FIELDS - event_data_fields
raise RuntimeError(
f"_event_data_changed() field mismatch: "
f"missing={missing}, extra={extra}. "
f"Update the comparison logic when adding/removing fields."
)
return (
existing.title != new_data["title"]
or existing.description != new_data["description"]
or existing.start_time != new_data["start_time"]
or existing.end_time != new_data["end_time"]
or existing.location != new_data["location"]
or existing.attendees != new_data["attendees"]
or existing.ics_raw_data != new_data["ics_raw_data"]
)
async def _sync_events_to_database(
self, room_id: str, events: list[EventData]
) -> SyncStats:
@@ -386,11 +409,14 @@ class ICSSyncService:
)
if existing:
updated += 1
# Only count as updated if data actually changed
if self._event_data_changed(existing, event_data):
updated += 1
await calendar_events_controller.upsert(calendar_event)
else:
created += 1
await calendar_events_controller.upsert(calendar_event)
await calendar_events_controller.upsert(calendar_event)
current_ics_uids.append(event_data["ics_uid"])
# Soft delete events that are no longer in calendar

View File

@@ -11,19 +11,14 @@ from typing import Literal, Union, assert_never
import celery
from celery.result import AsyncResult
from hatchet_sdk.clients.rest.exceptions import ApiException
from hatchet_sdk.clients.rest.exceptions import ApiException, NotFoundException
from hatchet_sdk.clients.rest.models import V1TaskStatus
from reflector.db.recordings import recordings_controller
from reflector.db.rooms import rooms_controller
from reflector.db.transcripts import Transcript, transcripts_controller
from reflector.hatchet.client import HatchetClientManager
from reflector.logger import logger
from reflector.pipelines.main_file_pipeline import task_pipeline_file_process
from reflector.pipelines.main_multitrack_pipeline import (
task_pipeline_multitrack_process,
)
from reflector.settings import settings
from reflector.utils.string import NonEmptyString
@@ -102,8 +97,11 @@ async def validate_transcript_for_processing(
if transcript.locked:
return ValidationLocked(detail="Recording is locked")
# hatchet is idempotent anyways + if it wasn't dispatched successfully
if transcript.status == "idle" and not settings.HATCHET_ENABLED:
if (
transcript.status == "idle"
and not transcript.workflow_run_id
and not transcript.recording_id
):
return ValidationNotReady(detail="Recording is not ready for processing")
# Check Celery tasks
@@ -116,7 +114,8 @@ async def validate_transcript_for_processing(
):
return ValidationAlreadyScheduled(detail="already running")
if settings.HATCHET_ENABLED and transcript.workflow_run_id:
# Check Hatchet workflow status if workflow_run_id exists
if transcript.workflow_run_id:
try:
status = await HatchetClientManager.get_workflow_run_status(
transcript.workflow_run_id
@@ -181,42 +180,24 @@ async def dispatch_transcript_processing(
Returns AsyncResult for Celery tasks, None for Hatchet workflows.
"""
if isinstance(config, MultitrackProcessingConfig):
# Check if room has use_hatchet=True (overrides env vars)
room_forces_hatchet = False
if config.room_id:
room = await rooms_controller.get_by_id(config.room_id)
room_forces_hatchet = room.use_hatchet if room else False
# Start durable workflow if enabled (Hatchet)
# and if room has use_hatchet=True
use_hatchet = settings.HATCHET_ENABLED and room_forces_hatchet
if room_forces_hatchet:
logger.info(
"Room forces Hatchet workflow",
room_id=config.room_id,
transcript_id=config.transcript_id,
# Multitrack processing always uses Hatchet (no Celery fallback)
# First check if we can replay (outside transaction since it's read-only)
transcript = await transcripts_controller.get_by_id(config.transcript_id)
if transcript and transcript.workflow_run_id and not force:
can_replay = await HatchetClientManager.can_replay(
transcript.workflow_run_id
)
if use_hatchet:
# First check if we can replay (outside transaction since it's read-only)
transcript = await transcripts_controller.get_by_id(config.transcript_id)
if transcript and transcript.workflow_run_id and not force:
can_replay = await HatchetClientManager.can_replay(
transcript.workflow_run_id
if can_replay:
await HatchetClientManager.replay_workflow(transcript.workflow_run_id)
logger.info(
"Replaying Hatchet workflow",
workflow_id=transcript.workflow_run_id,
)
if can_replay:
await HatchetClientManager.replay_workflow(
transcript.workflow_run_id
)
logger.info(
"Replaying Hatchet workflow",
workflow_id=transcript.workflow_run_id,
)
return None
else:
# Workflow exists but can't replay (CANCELLED, COMPLETED, etc.)
# Log and proceed to start new workflow
return None
else:
# Workflow can't replay (CANCELLED, COMPLETED, or 404 deleted)
# Log and proceed to start new workflow
try:
status = await HatchetClientManager.get_workflow_run_status(
transcript.workflow_run_id
)
@@ -225,68 +206,72 @@ async def dispatch_transcript_processing(
old_workflow_id=transcript.workflow_run_id,
old_status=status.value,
)
except NotFoundException:
# Workflow deleted from Hatchet but ID still in DB
logger.info(
"Old workflow not found in Hatchet, starting new",
old_workflow_id=transcript.workflow_run_id,
)
# Force: cancel old workflow if exists
if force and transcript and transcript.workflow_run_id:
# Force: cancel old workflow if exists
if force and transcript and transcript.workflow_run_id:
try:
await HatchetClientManager.cancel_workflow(transcript.workflow_run_id)
logger.info(
"Cancelled old workflow (--force)",
workflow_id=transcript.workflow_run_id,
)
await transcripts_controller.update(
transcript, {"workflow_run_id": None}
except NotFoundException:
logger.info(
"Old workflow already deleted (--force)",
workflow_id=transcript.workflow_run_id,
)
await transcripts_controller.update(transcript, {"workflow_run_id": None})
# Re-fetch and check for concurrent dispatch (optimistic approach).
# No database lock - worst case is duplicate dispatch, but Hatchet
# workflows are idempotent so this is acceptable.
transcript = await transcripts_controller.get_by_id(config.transcript_id)
if transcript and transcript.workflow_run_id:
# Another process started a workflow between validation and now
try:
status = await HatchetClientManager.get_workflow_run_status(
transcript.workflow_run_id
# Re-fetch and check for concurrent dispatch (optimistic approach).
# No database lock - worst case is duplicate dispatch, but Hatchet
# workflows are idempotent so this is acceptable.
transcript = await transcripts_controller.get_by_id(config.transcript_id)
if transcript and transcript.workflow_run_id:
# Another process started a workflow between validation and now
try:
status = await HatchetClientManager.get_workflow_run_status(
transcript.workflow_run_id
)
if status in (V1TaskStatus.RUNNING, V1TaskStatus.QUEUED):
logger.info(
"Concurrent workflow detected, skipping dispatch",
workflow_id=transcript.workflow_run_id,
)
if status in (V1TaskStatus.RUNNING, V1TaskStatus.QUEUED):
logger.info(
"Concurrent workflow detected, skipping dispatch",
workflow_id=transcript.workflow_run_id,
)
return None
except ApiException:
# Workflow might be gone (404) or API issue - proceed with new workflow
pass
return None
except ApiException:
# Workflow might be gone (404) or API issue - proceed with new workflow
pass
workflow_id = await HatchetClientManager.start_workflow(
workflow_name="DiarizationPipeline",
input_data={
"recording_id": config.recording_id,
"tracks": [{"s3_key": k} for k in config.track_keys],
"bucket_name": config.bucket_name,
"transcript_id": config.transcript_id,
"room_id": config.room_id,
},
additional_metadata={
"transcript_id": config.transcript_id,
"recording_id": config.recording_id,
"daily_recording_id": config.recording_id,
},
workflow_id = await HatchetClientManager.start_workflow(
workflow_name="DiarizationPipeline",
input_data={
"recording_id": config.recording_id,
"tracks": [{"s3_key": k} for k in config.track_keys],
"bucket_name": config.bucket_name,
"transcript_id": config.transcript_id,
"room_id": config.room_id,
},
additional_metadata={
"transcript_id": config.transcript_id,
"recording_id": config.recording_id,
"daily_recording_id": config.recording_id,
},
)
if transcript:
await transcripts_controller.update(
transcript, {"workflow_run_id": workflow_id}
)
if transcript:
await transcripts_controller.update(
transcript, {"workflow_run_id": workflow_id}
)
logger.info("Hatchet workflow dispatched", workflow_id=workflow_id)
return None
logger.info("Hatchet workflow dispatched", workflow_id=workflow_id)
return None
# Celery pipeline (durable workflows disabled)
return task_pipeline_multitrack_process.delay(
transcript_id=config.transcript_id,
bucket_name=config.bucket_name,
track_keys=config.track_keys,
)
elif isinstance(config, FileProcessingConfig):
return task_pipeline_file_process.delay(transcript_id=config.transcript_id)
else:

View File

@@ -1,7 +1,7 @@
from pydantic.types import PositiveInt
from pydantic_settings import BaseSettings, SettingsConfigDict
from reflector.schemas.platform import WHEREBY_PLATFORM, Platform
from reflector.schemas.platform import DAILY_PLATFORM, Platform
from reflector.utils.string import NonEmptyString
@@ -49,6 +49,7 @@ class Settings(BaseSettings):
TRANSCRIPT_STORAGE_AWS_REGION: str = "us-east-1"
TRANSCRIPT_STORAGE_AWS_ACCESS_KEY_ID: str | None = None
TRANSCRIPT_STORAGE_AWS_SECRET_ACCESS_KEY: str | None = None
TRANSCRIPT_STORAGE_AWS_ENDPOINT_URL: str | None = None
# Platform-specific recording storage (follows {PREFIX}_STORAGE_AWS_{CREDENTIAL} pattern)
# Whereby storage configuration
@@ -84,9 +85,7 @@ class Settings(BaseSettings):
)
# Diarization
# backends:
# - pyannote: in-process model loading (no HTTP, runs in same process)
# - modal: HTTP API client (works with Modal.com OR self-hosted gpu/self_hosted/)
# backend: modal — HTTP API client (works with Modal.com OR self-hosted gpu/self_hosted/)
DIARIZATION_ENABLED: bool = True
DIARIZATION_BACKEND: str = "modal"
DIARIZATION_URL: str | None = None
@@ -95,8 +94,9 @@ class Settings(BaseSettings):
# Diarization: modal backend
DIARIZATION_MODAL_API_KEY: str | None = None
# Diarization: local pyannote.audio
DIARIZATION_PYANNOTE_AUTH_TOKEN: str | None = None
# Audio Padding (Modal.com backend)
PADDING_URL: str | None = None
PADDING_MODAL_API_KEY: str | None = None
# Sentry
SENTRY_DSN: str | None = None
@@ -151,26 +151,17 @@ class Settings(BaseSettings):
None # Webhook UUID for this environment. Not used by production code
)
# Platform Configuration
DEFAULT_VIDEO_PLATFORM: Platform = WHEREBY_PLATFORM
DEFAULT_VIDEO_PLATFORM: Platform = DAILY_PLATFORM
# Zulip integration
ZULIP_REALM: str | None = None
ZULIP_API_KEY: str | None = None
ZULIP_BOT_EMAIL: str | None = None
# Durable workflow orchestration
# Provider: "hatchet" (or "none" to disable)
DURABLE_WORKFLOW_PROVIDER: str = "none"
# Hatchet workflow orchestration
# Hatchet workflow orchestration (always enabled for multitrack processing)
HATCHET_CLIENT_TOKEN: str | None = None
HATCHET_CLIENT_TLS_STRATEGY: str = "none" # none, tls, mtls
HATCHET_DEBUG: bool = False
@property
def HATCHET_ENABLED(self) -> bool:
"""True if Hatchet is the active provider."""
return self.DURABLE_WORKFLOW_PROVIDER == "hatchet"
settings = Settings()

View File

@@ -53,6 +53,7 @@ class AwsStorage(Storage):
aws_access_key_id: str | None = None,
aws_secret_access_key: str | None = None,
aws_role_arn: str | None = None,
aws_endpoint_url: str | None = None,
):
if not aws_bucket_name:
raise ValueError("Storage `aws_storage` require `aws_bucket_name`")
@@ -73,17 +74,26 @@ class AwsStorage(Storage):
self._access_key_id = aws_access_key_id
self._secret_access_key = aws_secret_access_key
self._role_arn = aws_role_arn
self._endpoint_url = aws_endpoint_url
self.aws_folder = ""
if "/" in aws_bucket_name:
self._bucket_name, self.aws_folder = aws_bucket_name.split("/", 1)
self.boto_config = Config(retries={"max_attempts": 3, "mode": "adaptive"})
config_kwargs: dict = {"retries": {"max_attempts": 3, "mode": "adaptive"}}
if aws_endpoint_url:
config_kwargs["s3"] = {"addressing_style": "path"}
self.boto_config = Config(**config_kwargs)
self.session = aioboto3.Session(
aws_access_key_id=aws_access_key_id,
aws_secret_access_key=aws_secret_access_key,
region_name=aws_region,
)
self.base_url = f"https://{self._bucket_name}.s3.amazonaws.com/"
if aws_endpoint_url:
self.base_url = f"{aws_endpoint_url}/{self._bucket_name}/"
else:
self.base_url = f"https://{self._bucket_name}.s3.amazonaws.com/"
# Implement credential properties
@property
@@ -139,7 +149,9 @@ class AwsStorage(Storage):
s3filename = f"{folder}/{filename}" if folder else filename
logger.info(f"Uploading {filename} to S3 {actual_bucket}/{folder}")
async with self.session.client("s3", config=self.boto_config) as client:
async with self.session.client(
"s3", config=self.boto_config, endpoint_url=self._endpoint_url
) as client:
if isinstance(data, bytes):
await client.put_object(Bucket=actual_bucket, Key=s3filename, Body=data)
else:
@@ -162,7 +174,9 @@ class AwsStorage(Storage):
actual_bucket = bucket or self._bucket_name
folder = self.aws_folder
s3filename = f"{folder}/{filename}" if folder else filename
async with self.session.client("s3", config=self.boto_config) as client:
async with self.session.client(
"s3", config=self.boto_config, endpoint_url=self._endpoint_url
) as client:
presigned_url = await client.generate_presigned_url(
operation,
Params={"Bucket": actual_bucket, "Key": s3filename},
@@ -177,7 +191,9 @@ class AwsStorage(Storage):
folder = self.aws_folder
logger.info(f"Deleting {filename} from S3 {actual_bucket}/{folder}")
s3filename = f"{folder}/{filename}" if folder else filename
async with self.session.client("s3", config=self.boto_config) as client:
async with self.session.client(
"s3", config=self.boto_config, endpoint_url=self._endpoint_url
) as client:
await client.delete_object(Bucket=actual_bucket, Key=s3filename)
@handle_s3_client_errors("download")
@@ -186,7 +202,9 @@ class AwsStorage(Storage):
folder = self.aws_folder
logger.info(f"Downloading {filename} from S3 {actual_bucket}/{folder}")
s3filename = f"{folder}/{filename}" if folder else filename
async with self.session.client("s3", config=self.boto_config) as client:
async with self.session.client(
"s3", config=self.boto_config, endpoint_url=self._endpoint_url
) as client:
response = await client.get_object(Bucket=actual_bucket, Key=s3filename)
return await response["Body"].read()
@@ -201,7 +219,9 @@ class AwsStorage(Storage):
logger.info(f"Listing objects from S3 {actual_bucket} with prefix '{s3prefix}'")
keys = []
async with self.session.client("s3", config=self.boto_config) as client:
async with self.session.client(
"s3", config=self.boto_config, endpoint_url=self._endpoint_url
) as client:
paginator = client.get_paginator("list_objects_v2")
async for page in paginator.paginate(Bucket=actual_bucket, Prefix=s3prefix):
if "Contents" in page:
@@ -227,7 +247,9 @@ class AwsStorage(Storage):
folder = self.aws_folder
logger.info(f"Streaming {filename} from S3 {actual_bucket}/{folder}")
s3filename = f"{folder}/{filename}" if folder else filename
async with self.session.client("s3", config=self.boto_config) as client:
async with self.session.client(
"s3", config=self.boto_config, endpoint_url=self._endpoint_url
) as client:
await client.download_fileobj(
Bucket=actual_bucket, Key=s3filename, Fileobj=fileobj
)

View File

@@ -5,7 +5,9 @@ Used by both Hatchet workflows and Celery pipelines for consistent audio encodin
"""
# Opus codec settings
# ref B0F71CE8-FC59-4AA5-8414-DAFB836DB711
OPUS_STANDARD_SAMPLE_RATE = 48000
# ref B0F71CE8-FC59-4AA5-8414-DAFB836DB711
OPUS_DEFAULT_BIT_RATE = 128000 # 128kbps for good speech quality
# S3 presigned URL expiration

View File

@@ -1,4 +1,5 @@
from datetime import datetime
from uuid import UUID
from reflector.dailyco_api import (
CreateMeetingTokenRequest,
@@ -12,9 +13,11 @@ from reflector.dailyco_api import (
RoomProperties,
verify_webhook_signature,
)
from reflector.dailyco_api import RecordingType as DailyRecordingType
from reflector.db.daily_participant_sessions import (
daily_participant_sessions_controller,
)
from reflector.db.meetings import meetings_controller
from reflector.db.rooms import Room
from reflector.logger import logger
from reflector.storage import get_dailyco_storage
@@ -58,10 +61,9 @@ class DailyClient(VideoPlatformClient):
enable_recording = None
if room.recording_type == self.RECORDING_LOCAL:
enable_recording = "local"
elif (
room.recording_type == self.RECORDING_CLOUD
): # daily "cloud" is not our "cloud"
enable_recording = "raw-tracks"
elif room.recording_type == self.RECORDING_CLOUD:
# Don't set enable_recording - recordings started via REST API (not auto-start)
enable_recording = None
properties = RoomProperties(
enable_recording=enable_recording,
@@ -106,8 +108,6 @@ class DailyClient(VideoPlatformClient):
Daily.co doesn't provide historical session API, so we query our database
where participant.joined/left webhooks are stored.
"""
from reflector.db.meetings import meetings_controller # noqa: PLC0415
meeting = await meetings_controller.get_by_room_name(room_name)
if not meeting:
return []
@@ -179,21 +179,14 @@ class DailyClient(VideoPlatformClient):
async def create_meeting_token(
self,
room_name: DailyRoomName,
start_cloud_recording: bool,
enable_recording_ui: bool,
user_id: NonEmptyString | None = None,
is_owner: bool = False,
max_recording_duration_seconds: int | None = None,
) -> NonEmptyString:
start_cloud_recording_opts = None
if start_cloud_recording and max_recording_duration_seconds:
start_cloud_recording_opts = {"maxDuration": max_recording_duration_seconds}
properties = MeetingTokenProperties(
room_name=room_name,
user_id=user_id,
start_cloud_recording=start_cloud_recording,
start_cloud_recording_opts=start_cloud_recording_opts,
enable_recording_ui=enable_recording_ui,
is_owner=is_owner,
)
@@ -201,6 +194,23 @@ class DailyClient(VideoPlatformClient):
result = await self._api_client.create_meeting_token(request)
return result.token
async def start_recording(
self,
room_name: DailyRoomName,
recording_type: DailyRecordingType,
instance_id: UUID,
) -> dict:
"""Start recording via Daily.co REST API.
Args:
instance_id: UUID for this recording session - one UUID per "room" in Daily (which is "meeting" in Reflector)
"""
return await self._api_client.start_recording(
room_name=room_name,
recording_type=recording_type,
instance_id=instance_id,
)
async def close(self):
"""Clean up API client resources."""
await self._api_client.close()

View File

@@ -19,6 +19,7 @@ from reflector.video_platforms.factory import create_platform_client
from reflector.worker.process import (
poll_daily_room_presence_task,
process_multitrack_recording,
store_cloud_recording,
)
router = APIRouter()
@@ -79,7 +80,14 @@ async def webhook(request: Request):
try:
event = event_adapter.validate_python(body_json)
except Exception as e:
logger.error("Failed to parse webhook event", error=str(e), body=body.decode())
err_detail = str(e)
if hasattr(e, "errors"):
err_detail = f"{err_detail}; errors={e.errors()!r}"
logger.error(
"Failed to parse webhook event",
error=err_detail,
body=body.decode(),
)
raise HTTPException(status_code=422, detail="Invalid event format")
match event:
@@ -174,46 +182,64 @@ async def _handle_recording_started(event: RecordingStartedEvent):
async def _handle_recording_ready(event: RecordingReadyEvent):
room_name = event.payload.room_name
recording_id = event.payload.recording_id
tracks = event.payload.tracks
if not tracks:
logger.warning(
"recording.ready-to-download: missing tracks",
room_name=room_name,
recording_id=recording_id,
payload=event.payload,
)
return
recording_type = event.payload.type
logger.info(
"Recording ready for download",
room_name=room_name,
recording_id=recording_id,
num_tracks=len(tracks),
recording_type=recording_type,
platform="daily",
)
bucket_name = settings.DAILYCO_STORAGE_AWS_BUCKET_NAME
if not bucket_name:
logger.error(
"DAILYCO_STORAGE_AWS_BUCKET_NAME not configured; cannot process Daily recording"
)
logger.error("DAILYCO_STORAGE_AWS_BUCKET_NAME not configured")
return
track_keys = [t.s3Key for t in tracks if t.type == "audio"]
if recording_type == "cloud":
await store_cloud_recording(
recording_id=recording_id,
room_name=room_name,
s3_key=event.payload.s3_key,
duration=event.payload.duration,
start_ts=event.payload.start_ts,
source="webhook",
)
logger.info(
"Recording webhook queuing processing",
recording_id=recording_id,
room_name=room_name,
)
elif recording_type == "raw-tracks":
tracks = event.payload.tracks
if not tracks:
logger.warning(
"raw-tracks recording: missing tracks array",
room_name=room_name,
recording_id=recording_id,
)
return
process_multitrack_recording.delay(
bucket_name=bucket_name,
daily_room_name=room_name,
recording_id=recording_id,
track_keys=track_keys,
)
track_keys = [t.s3Key for t in tracks if t.type == "audio"]
logger.info(
"Raw-tracks recording queuing processing",
recording_id=recording_id,
room_name=room_name,
num_tracks=len(track_keys),
)
process_multitrack_recording.delay(
bucket_name=bucket_name,
daily_room_name=room_name,
recording_id=recording_id,
track_keys=track_keys,
recording_start_ts=event.payload.start_ts,
)
else:
logger.warning(
"Unknown recording type",
recording_type=recording_type,
recording_id=recording_id,
)
async def _handle_recording_error(event: RecordingErrorEvent):

View File

@@ -1,16 +1,23 @@
import json
from datetime import datetime, timezone
from typing import Annotated, Optional
from typing import Annotated, Any, Optional
from uuid import UUID
from fastapi import APIRouter, Depends, HTTPException, Request
from pydantic import BaseModel
import reflector.auth as auth
from reflector.dailyco_api import RecordingType
from reflector.dailyco_api.client import DailyApiError
from reflector.db.meetings import (
MeetingConsent,
meeting_consent_controller,
meetings_controller,
)
from reflector.db.rooms import rooms_controller
from reflector.logger import logger
from reflector.utils.string import NonEmptyString
from reflector.video_platforms.factory import create_platform_client
router = APIRouter()
@@ -73,3 +80,72 @@ async def meeting_deactivate(
await meetings_controller.update_meeting(meeting_id, is_active=False)
return {"status": "success", "meeting_id": meeting_id}
class StartRecordingRequest(BaseModel):
type: RecordingType
instanceId: UUID
@router.post("/meetings/{meeting_id}/recordings/start")
async def start_recording(
meeting_id: NonEmptyString, body: StartRecordingRequest
) -> dict[str, Any]:
"""Start cloud or raw-tracks recording via Daily.co REST API.
Both cloud and raw-tracks are started via REST API to bypass enable_recording limitation of allowing only 1 recording at a time.
Uses different instanceIds for cloud vs raw-tracks (same won't work)
Note: No authentication required - anonymous users supported. TODO this is a DOS vector
"""
meeting = await meetings_controller.get_by_id(meeting_id)
if not meeting:
raise HTTPException(status_code=404, detail="Meeting not found")
log = logger.bind(
meeting_id=meeting_id,
room_name=meeting.room_name,
recording_type=body.type,
instance_id=body.instanceId,
)
try:
client = create_platform_client("daily")
result = await client.start_recording(
room_name=meeting.room_name,
recording_type=body.type,
instance_id=body.instanceId,
)
log.info(f"Started {body.type} recording via REST API")
return {"status": "ok", "result": result}
except DailyApiError as e:
# Parse Daily.co error response to detect "has an active stream"
try:
error_body = json.loads(e.response_body)
error_info = error_body.get("info", "")
# "has an active stream" means recording already started by another participant
# This is SUCCESS from business logic perspective - return 200
if "has an active stream" in error_info:
log.info(
f"{body.type} recording already active (started by another participant)"
)
return {"status": "already_active", "instanceId": str(body.instanceId)}
except (json.JSONDecodeError, KeyError):
pass # Fall through to error handling
# All other Daily.co API errors
log.error(f"Failed to start {body.type} recording", error=str(e))
raise HTTPException(
status_code=500, detail=f"Failed to start recording: {str(e)}"
)
except Exception as e:
# Non-Daily.co errors
log.error(f"Failed to start {body.type} recording", error=str(e))
raise HTTPException(
status_code=500, detail=f"Failed to start recording: {str(e)}"
)

View File

@@ -73,6 +73,8 @@ class Meeting(BaseModel):
calendar_event_id: str | None = None
calendar_metadata: dict[str, Any] | None = None
platform: Platform
daily_composed_video_s3_key: str | None = None
daily_composed_video_duration: int | None = None
class CreateRoom(BaseModel):
@@ -586,7 +588,6 @@ async def rooms_join_meeting(
)
token = await client.create_meeting_token(
meeting.room_name,
start_cloud_recording=meeting.recording_type == "cloud",
enable_recording_ui=enable_recording_ui,
user_id=user_id,
is_owner=user_id == room.user_id,

View File

@@ -5,7 +5,7 @@ from fastapi import APIRouter, Depends, HTTPException, UploadFile
from pydantic import BaseModel
import reflector.auth as auth
from reflector.db.transcripts import transcripts_controller
from reflector.db.transcripts import SourceKind, transcripts_controller
from reflector.pipelines.main_file_pipeline import task_pipeline_file_process
router = APIRouter()
@@ -88,8 +88,10 @@ async def transcript_record_upload(
finally:
container.close()
# set the status to "uploaded"
await transcripts_controller.update(transcript, {"status": "uploaded"})
# set the status to "uploaded" and mark as file source
await transcripts_controller.update(
transcript, {"status": "uploaded", "source_kind": SourceKind.FILE}
)
# launch a background task to process the file
task_pipeline_file_process.delay(transcript_id=transcript_id)

View File

@@ -1,6 +1,6 @@
from typing import Optional
from fastapi import APIRouter, WebSocket
from fastapi import APIRouter, WebSocket, WebSocketDisconnect
from reflector.auth.auth_jwt import JWTAuth # type: ignore
from reflector.db.users import user_controller
@@ -60,6 +60,8 @@ async def user_events_websocket(websocket: WebSocket):
try:
while True:
await websocket.receive()
except (RuntimeError, WebSocketDisconnect):
pass
finally:
if room_id:
await ws_manager.remove_user_from_room(room_id, websocket)

View File

@@ -6,6 +6,11 @@ from celery.schedules import crontab
from reflector.settings import settings
logger = structlog.get_logger(__name__)
# Polling intervals (seconds)
# Webhook-aware: 180s when webhook configured (backup mode), 15s when no webhook (primary discovery)
POLL_DAILY_RECORDINGS_INTERVAL_SEC = 180.0 if settings.DAILY_WEBHOOK_SECRET else 15.0
if celery.current_app.main != "default":
logger.info(f"Celery already configured ({celery.current_app})")
app = celery.current_app
@@ -44,7 +49,7 @@ else:
},
"poll_daily_recordings": {
"task": "reflector.worker.process.poll_daily_recordings",
"schedule": 180.0, # Every 3 minutes (configurable lookback window)
"schedule": POLL_DAILY_RECORDINGS_INTERVAL_SEC,
},
"trigger_daily_reconciliation": {
"task": "reflector.worker.process.trigger_daily_reconciliation",

View File

@@ -2,7 +2,7 @@ import json
import os
import re
from datetime import datetime, timezone
from typing import List
from typing import List, Literal
from urllib.parse import unquote
import av
@@ -27,9 +27,6 @@ from reflector.db.transcripts import (
from reflector.hatchet.client import HatchetClientManager
from reflector.pipelines.main_file_pipeline import task_pipeline_file_process
from reflector.pipelines.main_live_pipeline import asynctask
from reflector.pipelines.main_multitrack_pipeline import (
task_pipeline_multitrack_process,
)
from reflector.pipelines.topic_processing import EmptyPipeline
from reflector.processors import AudioFileWriterProcessor
from reflector.processors.audio_waveform_processor import AudioWaveformProcessor
@@ -42,6 +39,7 @@ from reflector.utils.daily import (
filter_cam_audio_tracks,
recording_lock_key,
)
from reflector.utils.string import NonEmptyString
from reflector.video_platforms.factory import create_platform_client
from reflector.video_platforms.whereby_utils import (
parse_whereby_recording_filename,
@@ -175,13 +173,18 @@ async def process_multitrack_recording(
daily_room_name: DailyRoomName,
recording_id: str,
track_keys: list[str],
recording_start_ts: int,
):
"""
Process raw-tracks (multitrack) recording from Daily.co.
"""
logger.info(
"Processing multitrack recording",
bucket=bucket_name,
room_name=daily_room_name,
recording_id=recording_id,
provided_keys=len(track_keys),
recording_start_ts=recording_start_ts,
)
if not track_keys:
@@ -212,7 +215,7 @@ async def process_multitrack_recording(
)
await _process_multitrack_recording_inner(
bucket_name, daily_room_name, recording_id, track_keys
bucket_name, daily_room_name, recording_id, track_keys, recording_start_ts
)
@@ -221,8 +224,18 @@ async def _process_multitrack_recording_inner(
daily_room_name: DailyRoomName,
recording_id: str,
track_keys: list[str],
recording_start_ts: int,
):
"""Inner function containing the actual processing logic."""
"""
Process multitrack recording (first time or reprocessing).
For first processing (webhook/polling):
- Uses recording_start_ts for time-based meeting matching (no instanceId available)
For reprocessing:
- Uses recording.meeting_id directly (already linked during first processing)
- recording_start_ts is ignored
"""
tz = timezone.utc
recorded_at = datetime.now(tz)
@@ -240,7 +253,53 @@ async def _process_multitrack_recording_inner(
exc_info=True,
)
meeting = await meetings_controller.get_by_room_name(daily_room_name)
# Check if recording already exists (reprocessing path)
recording = await recordings_controller.get_by_id(recording_id)
if recording and recording.meeting_id:
# Reprocessing: recording exists with meeting already linked
meeting = await meetings_controller.get_by_id(recording.meeting_id)
if not meeting:
logger.error(
"Reprocessing: meeting not found for recording - skipping",
meeting_id=recording.meeting_id,
recording_id=recording_id,
)
return
logger.info(
"Reprocessing: using existing recording.meeting_id",
recording_id=recording_id,
meeting_id=meeting.id,
room_name=daily_room_name,
)
else:
# First processing: recording doesn't exist, need time-based matching
# (Daily.co doesn't return instanceId in API, must match by timestamp)
recording_start = datetime.fromtimestamp(recording_start_ts, tz=timezone.utc)
meeting = await meetings_controller.get_by_room_name_and_time(
room_name=daily_room_name,
recording_start=recording_start,
time_window_hours=168, # 1 week
)
if not meeting:
logger.error(
"Raw-tracks: no meeting found within 1-week window (time-based match) - skipping",
recording_id=recording_id,
room_name=daily_room_name,
recording_start_ts=recording_start_ts,
recording_start=recording_start.isoformat(),
)
return # Skip processing, will retry on next poll
logger.info(
"First processing: found meeting via time-based matching",
meeting_id=meeting.id,
room_name=daily_room_name,
recording_id=recording_id,
time_delta_seconds=abs(
(meeting.start_date - recording_start).total_seconds()
),
)
room_name_base = extract_base_room_name(daily_room_name)
@@ -248,18 +307,8 @@ async def _process_multitrack_recording_inner(
if not room:
raise Exception(f"Room not found: {room_name_base}")
if not meeting:
raise Exception(f"Meeting not found: {room_name_base}")
logger.info(
"Found existing Meeting for recording",
meeting_id=meeting.id,
room_name=daily_room_name,
recording_id=recording_id,
)
recording = await recordings_controller.get_by_id(recording_id)
if not recording:
# Create recording (only happens during first processing)
object_key_dir = os.path.dirname(track_keys[0]) if track_keys else ""
recording = await recordings_controller.create(
Recording(
@@ -271,7 +320,19 @@ async def _process_multitrack_recording_inner(
track_keys=track_keys,
)
)
# else: Recording already exists; metadata set at creation time
elif not recording.meeting_id:
# Recording exists but meeting_id is null (failed first processing)
# Update with meeting from time-based matching
await recordings_controller.set_meeting_id(
recording_id=recording.id,
meeting_id=meeting.id,
)
recording.meeting_id = meeting.id
logger.info(
"Updated existing recording with meeting_id",
recording_id=recording.id,
meeting_id=meeting.id,
)
transcript = await transcripts_controller.get_by_recording_id(recording.id)
if not transcript:
@@ -287,48 +348,29 @@ async def _process_multitrack_recording_inner(
room_id=room.id,
)
use_hatchet = settings.HATCHET_ENABLED and room and room.use_hatchet
if room and room.use_hatchet and not settings.HATCHET_ENABLED:
logger.info(
"Room forces Hatchet workflow",
room_id=room.id,
transcript_id=transcript.id,
)
if use_hatchet:
workflow_id = await HatchetClientManager.start_workflow(
workflow_name="DiarizationPipeline",
input_data={
"recording_id": recording_id,
"tracks": [{"s3_key": k} for k in filter_cam_audio_tracks(track_keys)],
"bucket_name": bucket_name,
"transcript_id": transcript.id,
"room_id": room.id,
},
additional_metadata={
"transcript_id": transcript.id,
"recording_id": recording_id,
"daily_recording_id": recording_id,
},
)
logger.info(
"Started Hatchet workflow",
workflow_id=workflow_id,
transcript_id=transcript.id,
)
await transcripts_controller.update(
transcript, {"workflow_run_id": workflow_id}
)
return
# Celery pipeline (runs when durable workflows disabled)
task_pipeline_multitrack_process.delay(
transcript_id=transcript.id,
bucket_name=bucket_name,
track_keys=filter_cam_audio_tracks(track_keys),
# Multitrack processing always uses Hatchet (no Celery fallback)
workflow_id = await HatchetClientManager.start_workflow(
workflow_name="DiarizationPipeline",
input_data={
"recording_id": recording_id,
"tracks": [{"s3_key": k} for k in filter_cam_audio_tracks(track_keys)],
"bucket_name": bucket_name,
"transcript_id": transcript.id,
"room_id": room.id,
},
additional_metadata={
"transcript_id": transcript.id,
"recording_id": recording_id,
"daily_recording_id": recording_id,
},
)
logger.info(
"Started Hatchet workflow",
workflow_id=workflow_id,
transcript_id=transcript.id,
)
await transcripts_controller.update(transcript, {"workflow_run_id": workflow_id})
@shared_task
@@ -337,9 +379,11 @@ async def poll_daily_recordings():
"""Poll Daily.co API for recordings and process missing ones.
Fetches latest recordings from Daily.co API (default limit 100), compares with DB,
and queues processing for recordings not already in DB.
and stores/queues missing recordings:
- Cloud recordings: Store S3 key in meeting table
- Raw-tracks recordings: Queue multitrack processing
For each missing recording, uses audio tracks from API response.
Acts as fallback when webhooks active, primary discovery when webhooks unavailable.
Worker-level locking provides idempotency (see process_multitrack_recording).
"""
@@ -380,51 +424,222 @@ async def poll_daily_recordings():
)
return
recording_ids = [rec.id for rec in finished_recordings]
# Separate cloud and raw-tracks recordings
cloud_recordings = []
raw_tracks_recordings = []
for rec in finished_recordings:
if rec.type:
# Daily.co API returns null type - make sure this assumption stays
# If this logs, Daily.co API changed - we can remove inference logic.
recording_type = rec.type
logger.warning(
"Recording has explicit type field from Daily.co API (unexpected, API may have changed)",
recording_id=rec.id,
room_name=rec.room_name,
recording_type=recording_type,
has_s3key=bool(rec.s3key),
tracks_count=len(rec.tracks),
)
else:
# DAILY.CO API LIMITATION:
# GET /recordings response does NOT include type field.
# Daily.co docs mention type field exists, but API never returns it.
# Verified: 84 recordings from Nov 2025 - Jan 2026 ALL have type=None.
#
# This is not a recent API change - Daily.co has never returned type.
# Must infer from structural properties.
#
# Inference heuristic (reliable for finished recordings):
# - Has tracks array → raw-tracks
# - Has s3key but no tracks → cloud
# - Neither → failed/incomplete recording
if len(rec.tracks) > 0:
recording_type = "raw-tracks"
elif rec.s3key and len(rec.tracks) == 0:
recording_type = "cloud"
else:
logger.warning(
"Recording has no type, no s3key, and no tracks - likely failed recording",
recording_id=rec.id,
room_name=rec.room_name,
status=rec.status,
duration=rec.duration,
mtg_session_id=rec.mtgSessionId,
)
continue
if recording_type == "cloud":
cloud_recordings.append(rec)
else:
raw_tracks_recordings.append(rec)
logger.debug(
"Poll results",
total=len(finished_recordings),
cloud=len(cloud_recordings),
raw_tracks=len(raw_tracks_recordings),
)
# Process cloud recordings
await _poll_cloud_recordings(cloud_recordings)
# Process raw-tracks recordings
await _poll_raw_tracks_recordings(raw_tracks_recordings, bucket_name)
async def store_cloud_recording(
recording_id: NonEmptyString,
room_name: NonEmptyString,
s3_key: NonEmptyString,
duration: int,
start_ts: int,
source: Literal["webhook", "polling"],
) -> bool:
"""
Store cloud recording reference in meeting table.
Common function for both webhook and polling code paths.
Uses time-based matching to handle duplicate room_name values.
Args:
recording_id: Daily.co recording ID
room_name: Daily.co room name
s3_key: S3 key where recording is stored
duration: Recording duration in seconds
start_ts: Unix timestamp when recording started
source: "webhook" or "polling" (for logging)
Returns:
True if stored, False if skipped/failed
"""
recording_start = datetime.fromtimestamp(start_ts, tz=timezone.utc)
meeting = await meetings_controller.get_by_room_name_and_time(
room_name=room_name,
recording_start=recording_start,
time_window_hours=168, # 1 week
)
if not meeting:
logger.warning(
f"Cloud recording ({source}): no meeting found within 1-week window",
recording_id=recording_id,
room_name=room_name,
recording_start_ts=start_ts,
recording_start=recording_start.isoformat(),
)
return False
success = await meetings_controller.set_cloud_recording_if_missing(
meeting_id=meeting.id,
s3_key=s3_key,
duration=duration,
)
if not success:
logger.debug(
f"Cloud recording ({source}): already set (race lost)",
recording_id=recording_id,
room_name=room_name,
meeting_id=meeting.id,
)
return False
logger.info(
f"Cloud recording stored via {source} (time-based match)",
meeting_id=meeting.id,
recording_id=recording_id,
s3_key=s3_key,
duration=duration,
time_delta_seconds=abs((meeting.start_date - recording_start).total_seconds()),
)
return True
async def _poll_cloud_recordings(cloud_recordings: List[FinishedRecordingResponse]):
"""
Store cloud recordings missing from meeting table via polling.
Uses time-based matching via store_cloud_recording().
"""
if not cloud_recordings:
return
stored_count = 0
for recording in cloud_recordings:
# Extract S3 key from recording (cloud recordings use s3key field)
s3_key = recording.s3key or (recording.s3.key if recording.s3 else None)
if not s3_key:
logger.warning(
"Cloud recording: missing S3 key",
recording_id=recording.id,
room_name=recording.room_name,
)
continue
stored = await store_cloud_recording(
recording_id=recording.id,
room_name=recording.room_name,
s3_key=s3_key,
duration=recording.duration,
start_ts=recording.start_ts,
source="polling",
)
if stored:
stored_count += 1
logger.info(
"Cloud recording polling complete",
total=len(cloud_recordings),
stored=stored_count,
)
async def _poll_raw_tracks_recordings(
raw_tracks_recordings: List[FinishedRecordingResponse],
bucket_name: str,
):
"""Queue raw-tracks recordings missing from DB (existing logic)."""
if not raw_tracks_recordings:
return
recording_ids = [rec.id for rec in raw_tracks_recordings]
existing_recordings = await recordings_controller.get_by_ids(recording_ids)
existing_ids = {rec.id for rec in existing_recordings}
missing_recordings = [
rec for rec in finished_recordings if rec.id not in existing_ids
rec for rec in raw_tracks_recordings if rec.id not in existing_ids
]
if not missing_recordings:
logger.debug(
"All recordings already in DB",
api_count=len(finished_recordings),
"All raw-tracks recordings already in DB",
api_count=len(raw_tracks_recordings),
existing_count=len(existing_recordings),
)
return
logger.info(
"Found recordings missing from DB",
"Found raw-tracks recordings missing from DB",
missing_count=len(missing_recordings),
total_api_count=len(finished_recordings),
total_api_count=len(raw_tracks_recordings),
existing_count=len(existing_recordings),
)
for recording in missing_recordings:
if not recording.tracks:
if recording.status == "finished":
logger.warning(
"Finished recording has no tracks (no audio captured)",
recording_id=recording.id,
room_name=recording.room_name,
)
else:
logger.debug(
"No tracks in recording yet",
recording_id=recording.id,
room_name=recording.room_name,
status=recording.status,
)
logger.warning(
"Finished raw-tracks recording has no tracks (no audio captured)",
recording_id=recording.id,
room_name=recording.room_name,
)
continue
track_keys = [t.s3Key for t in recording.tracks if t.type == "audio"]
if not track_keys:
logger.warning(
"No audio tracks found in recording (only video tracks)",
"No audio tracks found in raw-tracks recording",
recording_id=recording.id,
room_name=recording.room_name,
total_tracks=len(recording.tracks),
@@ -432,7 +647,7 @@ async def poll_daily_recordings():
continue
logger.info(
"Queueing missing recording for processing",
"Queueing missing raw-tracks recording for processing",
recording_id=recording.id,
room_name=recording.room_name,
track_count=len(track_keys),
@@ -443,6 +658,7 @@ async def poll_daily_recordings():
daily_room_name=recording.room_name,
recording_id=recording.id,
track_keys=track_keys,
recording_start_ts=recording.start_ts,
)
@@ -810,7 +1026,6 @@ async def reprocess_failed_daily_recordings():
)
continue
# Fetch room to check use_hatchet flag
room = None
if meeting.room_id:
room = await rooms_controller.get_by_id(meeting.room_id)
@@ -834,61 +1049,43 @@ async def reprocess_failed_daily_recordings():
)
continue
use_hatchet = settings.HATCHET_ENABLED and room and room.use_hatchet
if use_hatchet:
# Hatchet requires a transcript for workflow_run_id tracking
if not transcript:
logger.warning(
"No transcript for Hatchet reprocessing, skipping",
recording_id=recording.id,
)
continue
workflow_id = await HatchetClientManager.start_workflow(
workflow_name="DiarizationPipeline",
input_data={
"recording_id": recording.id,
"tracks": [
{"s3_key": k}
for k in filter_cam_audio_tracks(recording.track_keys)
],
"bucket_name": bucket_name,
"transcript_id": transcript.id,
"room_id": room.id if room else None,
},
additional_metadata={
"transcript_id": transcript.id,
"recording_id": recording.id,
"reprocess": True,
},
)
await transcripts_controller.update(
transcript, {"workflow_run_id": workflow_id}
)
logger.info(
"Queued Daily recording for Hatchet reprocessing",
# Multitrack reprocessing always uses Hatchet (no Celery fallback)
if not transcript:
logger.warning(
"No transcript for Hatchet reprocessing, skipping",
recording_id=recording.id,
workflow_id=workflow_id,
room_name=meeting.room_name,
track_count=len(recording.track_keys),
)
else:
logger.info(
"Queueing Daily recording for Celery reprocessing",
recording_id=recording.id,
room_name=meeting.room_name,
track_count=len(recording.track_keys),
transcript_status=transcript.status if transcript else None,
)
continue
process_multitrack_recording.delay(
bucket_name=bucket_name,
daily_room_name=meeting.room_name,
recording_id=recording.id,
track_keys=recording.track_keys,
)
workflow_id = await HatchetClientManager.start_workflow(
workflow_name="DiarizationPipeline",
input_data={
"recording_id": recording.id,
"tracks": [
{"s3_key": k}
for k in filter_cam_audio_tracks(recording.track_keys)
],
"bucket_name": bucket_name,
"transcript_id": transcript.id,
"room_id": room.id if room else None,
},
additional_metadata={
"transcript_id": transcript.id,
"recording_id": recording.id,
"reprocess": True,
},
)
await transcripts_controller.update(
transcript, {"workflow_run_id": workflow_id}
)
logger.info(
"Queued Daily recording for Hatchet reprocessing",
recording_id=recording.id,
workflow_id=workflow_id,
room_name=meeting.room_name,
track_count=len(recording.track_keys),
)
reprocessed_count += 1

View File

@@ -11,7 +11,6 @@ broadcast messages to all connected websockets.
import asyncio
import json
import threading
import redis.asyncio as redis
from fastapi import WebSocket
@@ -49,7 +48,15 @@ class RedisPubSubManager:
if not self.redis_connection:
await self.connect()
message = json.dumps(message)
await self.redis_connection.publish(room_id, message)
try:
await self.redis_connection.publish(room_id, message)
except RuntimeError:
# Celery workers run each task in a new event loop (asyncio.run),
# which closes the previous loop. Cached Redis connection is dead.
# Reconnect on the current loop and retry.
self.redis_connection = None
await self.connect()
await self.redis_connection.publish(room_id, message)
async def subscribe(self, room_id: str) -> redis.Redis:
await self.pubsub.subscribe(room_id)
@@ -98,6 +105,7 @@ class WebsocketManager:
async def _pubsub_data_reader(self, pubsub_subscriber):
while True:
# timeout=1.0 prevents tight CPU loop when no messages available
message = await pubsub_subscriber.get_message(
ignore_subscribe_messages=True
)
@@ -109,29 +117,38 @@ class WebsocketManager:
await socket.send_json(data)
# Process-global singleton to ensure only one WebsocketManager instance exists.
# Multiple instances would cause resource leaks and CPU issues.
_ws_manager: WebsocketManager | None = None
def get_ws_manager() -> WebsocketManager:
"""
Returns the WebsocketManager instance for managing websockets.
Returns the global WebsocketManager singleton.
This function initializes and returns the WebsocketManager instance,
which is responsible for managing websockets and handling websocket
connections.
Creates instance on first call, subsequent calls return cached instance.
Thread-safe via GIL. Concurrent initialization may create duplicate
instances but last write wins (acceptable for this use case).
Returns:
WebsocketManager: The initialized WebsocketManager instance.
Raises:
ImportError: If the 'reflector.settings' module cannot be imported.
RedisConnectionError: If there is an error connecting to the Redis server.
WebsocketManager: The global WebsocketManager instance.
"""
local = threading.local()
if hasattr(local, "ws_manager"):
return local.ws_manager
global _ws_manager
if _ws_manager is not None:
return _ws_manager
# No lock needed - GIL makes this safe enough
# Worst case: race creates two instances, last assignment wins
pubsub_client = RedisPubSubManager(
host=settings.REDIS_HOST,
port=settings.REDIS_PORT,
)
ws_manager = WebsocketManager(pubsub_client=pubsub_client)
local.ws_manager = ws_manager
return ws_manager
_ws_manager = WebsocketManager(pubsub_client=pubsub_client)
return _ws_manager
def reset_ws_manager() -> None:
"""Reset singleton for testing. DO NOT use in production."""
global _ws_manager
_ws_manager = None

View File

@@ -15,8 +15,7 @@ from reflector.settings import settings
async def setup_webhook(webhook_url: str):
"""
Create or update Daily.co webhook for this environment using dailyco_api module.
Uses DAILY_WEBHOOK_UUID to identify existing webhook.
Create Daily.co webhook. Deletes any existing webhooks first, then creates the new one.
"""
if not settings.DAILY_API_KEY:
print("Error: DAILY_API_KEY not set")
@@ -35,79 +34,37 @@ async def setup_webhook(webhook_url: str):
]
async with DailyApiClient(api_key=settings.DAILY_API_KEY) as client:
webhook_uuid = settings.DAILY_WEBHOOK_UUID
webhooks = await client.list_webhooks()
for wh in webhooks:
await client.delete_webhook(wh.uuid)
print(f"Deleted webhook {wh.uuid}")
if webhook_uuid:
print(f"Updating existing webhook {webhook_uuid}...")
try:
# Note: Daily.co doesn't support PATCH well, so we delete + recreate
await client.delete_webhook(webhook_uuid)
print(f"Deleted old webhook {webhook_uuid}")
request = CreateWebhookRequest(
url=webhook_url,
eventTypes=event_types,
hmac=settings.DAILY_WEBHOOK_SECRET,
)
result = await client.create_webhook(request)
webhook_uuid = result.uuid
request = CreateWebhookRequest(
url=webhook_url,
eventTypes=event_types,
hmac=settings.DAILY_WEBHOOK_SECRET,
)
result = await client.create_webhook(request)
print(f"Created webhook {webhook_uuid} (state: {result.state})")
print(f" URL: {result.url}")
print(
f"✓ Created replacement webhook {result.uuid} (state: {result.state})"
)
print(f" URL: {result.url}")
env_file = Path(__file__).parent.parent / ".env"
if env_file.exists():
lines = env_file.read_text().splitlines()
updated = False
for i, line in enumerate(lines):
if line.startswith("DAILY_WEBHOOK_UUID="):
lines[i] = f"DAILY_WEBHOOK_UUID={webhook_uuid}"
updated = True
break
if not updated:
lines.append(f"DAILY_WEBHOOK_UUID={webhook_uuid}")
env_file.write_text("\n".join(lines) + "\n")
print("✓ Saved DAILY_WEBHOOK_UUID to .env")
webhook_uuid = result.uuid
except Exception as e:
if hasattr(e, "response") and e.response.status_code == 404:
print(f"Webhook {webhook_uuid} not found, creating new one...")
webhook_uuid = None # Fall through to creation
else:
print(f"Error updating webhook: {e}")
return 1
if not webhook_uuid:
print("Creating new webhook...")
request = CreateWebhookRequest(
url=webhook_url,
eventTypes=event_types,
hmac=settings.DAILY_WEBHOOK_SECRET,
)
result = await client.create_webhook(request)
webhook_uuid = result.uuid
print(f"✓ Created webhook {webhook_uuid} (state: {result.state})")
print(f" URL: {result.url}")
print()
print("=" * 60)
print("IMPORTANT: Add this to your environment variables:")
print("=" * 60)
print(f"DAILY_WEBHOOK_UUID: {webhook_uuid}")
print("=" * 60)
print()
# Try to write UUID to .env file
env_file = Path(__file__).parent.parent / ".env"
if env_file.exists():
lines = env_file.read_text().splitlines()
updated = False
# Update existing DAILY_WEBHOOK_UUID line or add it
for i, line in enumerate(lines):
if line.startswith("DAILY_WEBHOOK_UUID="):
lines[i] = f"DAILY_WEBHOOK_UUID={webhook_uuid}"
updated = True
break
if not updated:
lines.append(f"DAILY_WEBHOOK_UUID={webhook_uuid}")
env_file.write_text("\n".join(lines) + "\n")
print(f"✓ Also saved to local .env file")
else:
print(f"⚠ Local .env file not found - please add manually")
return 0
return 0
if __name__ == "__main__":
@@ -117,11 +74,7 @@ if __name__ == "__main__":
"Example: python recreate_daily_webhook.py https://example.com/v1/daily/webhook"
)
print()
print("Behavior:")
print(" - If DAILY_WEBHOOK_UUID set: Deletes old webhook, creates new one")
print(
" - If DAILY_WEBHOOK_UUID empty: Creates new webhook, saves UUID to .env"
)
print("Deletes all existing webhooks, then creates a new one.")
sys.exit(1)
sys.exit(asyncio.run(setup_webhook(sys.argv[1])))

View File

@@ -1,11 +1,10 @@
import os
from contextlib import asynccontextmanager
from tempfile import NamedTemporaryFile
from unittest.mock import patch
import pytest
from reflector.schemas.platform import WHEREBY_PLATFORM
from reflector.schemas.platform import DAILY_PLATFORM, WHEREBY_PLATFORM
@pytest.fixture(scope="session", autouse=True)
@@ -15,6 +14,7 @@ def register_mock_platform():
from reflector.video_platforms.registry import register_platform
register_platform(WHEREBY_PLATFORM, MockPlatformClient)
register_platform(DAILY_PLATFORM, MockPlatformClient)
yield
@@ -333,11 +333,14 @@ def celery_enable_logging():
@pytest.fixture(scope="session")
def celery_config():
with NamedTemporaryFile() as f:
yield {
"broker_url": "memory://",
"result_backend": f"db+sqlite:///{f.name}",
}
redis_host = os.environ.get("REDIS_HOST", "localhost")
redis_port = os.environ.get("REDIS_PORT", "6379")
# Use db 2 to avoid conflicts with main app
redis_url = f"redis://{redis_host}:{redis_port}/2"
yield {
"broker_url": redis_url,
"result_backend": redis_url,
}
@pytest.fixture(scope="session")
@@ -370,9 +373,12 @@ async def ws_manager_in_memory(monkeypatch):
def __init__(self, queue: asyncio.Queue):
self.queue = queue
async def get_message(self, ignore_subscribe_messages: bool = True):
async def get_message(
self, ignore_subscribe_messages: bool = True, timeout: float | None = None
):
wait_timeout = timeout if timeout is not None else 0.05
try:
return await asyncio.wait_for(self.queue.get(), timeout=0.05)
return await asyncio.wait_for(self.queue.get(), timeout=wait_timeout)
except Exception:
return None

View File

@@ -0,0 +1,147 @@
"""
Tests for Daily.co instanceId generation.
Verifies deterministic behavior and frontend/backend consistency.
"""
import pytest
from reflector.dailyco_api.instance_id import (
RAW_TRACKS_NAMESPACE,
generate_cloud_instance_id,
generate_raw_tracks_instance_id,
)
class TestInstanceIdDeterminism:
"""Test deterministic generation of instanceIds."""
def test_cloud_instance_id_is_meeting_id(self):
"""Cloud instanceId is meeting ID directly (implicitly tests determinism)."""
meeting_id = "550e8400-e29b-41d4-a716-446655440000"
result1 = generate_cloud_instance_id(meeting_id)
result2 = generate_cloud_instance_id(meeting_id)
assert str(result1) == meeting_id
assert result1 == result2
def test_raw_tracks_instance_id_deterministic(self):
"""Raw-tracks instanceId generation is deterministic."""
meeting_id = "550e8400-e29b-41d4-a716-446655440000"
result1 = generate_raw_tracks_instance_id(meeting_id)
result2 = generate_raw_tracks_instance_id(meeting_id)
assert result1 == result2
def test_raw_tracks_different_from_cloud(self):
"""Raw-tracks instanceId differs from cloud instanceId."""
meeting_id = "550e8400-e29b-41d4-a716-446655440000"
cloud_id = generate_cloud_instance_id(meeting_id)
raw_tracks_id = generate_raw_tracks_instance_id(meeting_id)
assert cloud_id != raw_tracks_id
def test_different_meetings_different_instance_ids(self):
"""Different meetings generate different instanceIds."""
meeting_id1 = "550e8400-e29b-41d4-a716-446655440000"
meeting_id2 = "6ba7b810-9dad-11d1-80b4-00c04fd430c8"
cloud1 = generate_cloud_instance_id(meeting_id1)
cloud2 = generate_cloud_instance_id(meeting_id2)
assert cloud1 != cloud2
raw1 = generate_raw_tracks_instance_id(meeting_id1)
raw2 = generate_raw_tracks_instance_id(meeting_id2)
assert raw1 != raw2
class TestFrontendBackendConsistency:
"""Test that backend matches frontend logic."""
def test_namespace_matches_frontend(self):
"""Namespace UUID matches frontend RAW_TRACKS_NAMESPACE constant."""
# From www/app/[roomName]/components/DailyRoom.tsx
frontend_namespace = "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
assert str(RAW_TRACKS_NAMESPACE) == frontend_namespace
def test_raw_tracks_generation_matches_frontend_logic(self):
"""Backend UUIDv5 generation matches frontend uuidv5() call."""
# Example meeting ID
meeting_id = "550e8400-e29b-41d4-a716-446655440000"
# Backend result
backend_result = generate_raw_tracks_instance_id(meeting_id)
# Expected result from frontend: uuidv5(meeting.id, RAW_TRACKS_NAMESPACE)
# Python uuid5 uses (namespace, name) argument order
# JavaScript uuid.v5(name, namespace) - same args, different order
# Frontend: uuidv5(meeting.id, "a1b2c3d4-e5f6-7890-abcd-ef1234567890")
# Backend: uuid5(UUID("a1b2c3d4-e5f6-7890-abcd-ef1234567890"), meeting.id)
# Verify it's a valid UUID (will raise if not)
assert len(str(backend_result)) == 36
assert backend_result.version == 5
class TestEdgeCases:
"""Test edge cases and error conditions."""
def test_invalid_uuid_format_raises(self):
"""Invalid UUID format raises ValueError."""
with pytest.raises(ValueError):
generate_cloud_instance_id("not-a-uuid")
def test_lowercase_uuid_normalized_for_cloud(self):
"""Cloud instanceId: lowercase/uppercase UUIDs produce same result."""
meeting_id_lower = "550e8400-e29b-41d4-a716-446655440000"
meeting_id_upper = "550E8400-E29B-41D4-A716-446655440000"
cloud_lower = generate_cloud_instance_id(meeting_id_lower)
cloud_upper = generate_cloud_instance_id(meeting_id_upper)
assert cloud_lower == cloud_upper
def test_uuid5_is_case_sensitive_warning(self):
"""
Documents uuid5 case sensitivity - different case UUIDs produce different hashes.
Not a problem: meeting.id always lowercase from DB and API.
Frontend generates raw-tracks instanceId from lowercase meeting.id.
Backend receives lowercase meeting_id when matching.
This test documents the behavior, not a requirement.
"""
meeting_id_lower = "550e8400-e29b-41d4-a716-446655440000"
meeting_id_upper = "550E8400-E29B-41D4-A716-446655440000"
raw_lower = generate_raw_tracks_instance_id(meeting_id_lower)
raw_upper = generate_raw_tracks_instance_id(meeting_id_upper)
assert raw_lower != raw_upper
class TestMtgSessionIdVsInstanceId:
"""
Documents that Daily.co's mtgSessionId differs from our instanceId.
Why this matters: We investigated using mtgSessionId for matching but discovered
it's Daily.co-generated and unrelated to instanceId we send. This test documents
that finding so we don't investigate it again.
Production data from 2026-01-13:
- Meeting ID: 4ad503b6-8189-4910-a8f7-68cdd1b7f990
- Cloud instanceId: 4ad503b6-8189-4910-a8f7-68cdd1b7f990 (same as meeting ID)
- Raw-tracks instanceId: 784b3af3-c7dd-57f0-ac54-2ee91c6927cb (UUIDv5 derived)
- Recording mtgSessionId: f25a2e09-740f-4932-9c0d-b1bebaa669c6 (different!)
Conclusion: Cannot use mtgSessionId for recording-to-meeting matching.
"""
def test_mtg_session_id_differs_from_our_instance_ids(self):
"""mtgSessionId (Daily.co) != instanceId (ours) for both cloud and raw-tracks."""
meeting_id = "4ad503b6-8189-4910-a8f7-68cdd1b7f990"
expected_raw_tracks_id = "784b3af3-c7dd-57f0-ac54-2ee91c6927cb"
mtg_session_id = "f25a2e09-740f-4932-9c0d-b1bebaa669c6"
cloud_instance_id = generate_cloud_instance_id(meeting_id)
raw_tracks_instance_id = generate_raw_tracks_instance_id(meeting_id)
assert str(cloud_instance_id) == meeting_id
assert str(raw_tracks_instance_id) == expected_raw_tracks_id
assert str(cloud_instance_id) != mtg_session_id
assert str(raw_tracks_instance_id) != mtg_session_id

View File

@@ -2,10 +2,9 @@
Tests for Hatchet workflow dispatch and routing logic.
These tests verify:
1. Routing to Hatchet when HATCHET_ENABLED=True
2. Replay logic for failed workflows
3. Force flag to cancel and restart
4. Validation prevents concurrent workflows
1. Hatchet workflow validation and replay logic
2. Force flag to cancel and restart workflows
3. Validation prevents concurrent workflows
"""
from unittest.mock import AsyncMock, patch
@@ -34,25 +33,22 @@ async def test_hatchet_validation_blocks_running_workflow():
workflow_run_id="running-workflow-123",
)
with patch("reflector.services.transcript_process.settings") as mock_settings:
mock_settings.HATCHET_ENABLED = True
with patch(
"reflector.services.transcript_process.HatchetClientManager"
) as mock_hatchet:
mock_hatchet.get_workflow_run_status = AsyncMock(
return_value=V1TaskStatus.RUNNING
)
with patch(
"reflector.services.transcript_process.HatchetClientManager"
) as mock_hatchet:
mock_hatchet.get_workflow_run_status = AsyncMock(
return_value=V1TaskStatus.RUNNING
)
"reflector.services.transcript_process.task_is_scheduled_or_active"
) as mock_celery_check:
mock_celery_check.return_value = False
with patch(
"reflector.services.transcript_process.task_is_scheduled_or_active"
) as mock_celery_check:
mock_celery_check.return_value = False
result = await validate_transcript_for_processing(mock_transcript)
result = await validate_transcript_for_processing(mock_transcript)
assert isinstance(result, ValidationAlreadyScheduled)
assert "running" in result.detail.lower()
assert isinstance(result, ValidationAlreadyScheduled)
assert "running" in result.detail.lower()
@pytest.mark.usefixtures("setup_database")
@@ -72,24 +68,21 @@ async def test_hatchet_validation_blocks_queued_workflow():
workflow_run_id="queued-workflow-123",
)
with patch("reflector.services.transcript_process.settings") as mock_settings:
mock_settings.HATCHET_ENABLED = True
with patch(
"reflector.services.transcript_process.HatchetClientManager"
) as mock_hatchet:
mock_hatchet.get_workflow_run_status = AsyncMock(
return_value=V1TaskStatus.QUEUED
)
with patch(
"reflector.services.transcript_process.HatchetClientManager"
) as mock_hatchet:
mock_hatchet.get_workflow_run_status = AsyncMock(
return_value=V1TaskStatus.QUEUED
)
"reflector.services.transcript_process.task_is_scheduled_or_active"
) as mock_celery_check:
mock_celery_check.return_value = False
with patch(
"reflector.services.transcript_process.task_is_scheduled_or_active"
) as mock_celery_check:
mock_celery_check.return_value = False
result = await validate_transcript_for_processing(mock_transcript)
result = await validate_transcript_for_processing(mock_transcript)
assert isinstance(result, ValidationAlreadyScheduled)
assert isinstance(result, ValidationAlreadyScheduled)
@pytest.mark.usefixtures("setup_database")
@@ -110,25 +103,22 @@ async def test_hatchet_validation_allows_failed_workflow():
recording_id="test-recording-id",
)
with patch("reflector.services.transcript_process.settings") as mock_settings:
mock_settings.HATCHET_ENABLED = True
with patch(
"reflector.services.transcript_process.HatchetClientManager"
) as mock_hatchet:
mock_hatchet.get_workflow_run_status = AsyncMock(
return_value=V1TaskStatus.FAILED
)
with patch(
"reflector.services.transcript_process.HatchetClientManager"
) as mock_hatchet:
mock_hatchet.get_workflow_run_status = AsyncMock(
return_value=V1TaskStatus.FAILED
)
"reflector.services.transcript_process.task_is_scheduled_or_active"
) as mock_celery_check:
mock_celery_check.return_value = False
with patch(
"reflector.services.transcript_process.task_is_scheduled_or_active"
) as mock_celery_check:
mock_celery_check.return_value = False
result = await validate_transcript_for_processing(mock_transcript)
result = await validate_transcript_for_processing(mock_transcript)
assert isinstance(result, ValidationOk)
assert result.transcript_id == "test-transcript-id"
assert isinstance(result, ValidationOk)
assert result.transcript_id == "test-transcript-id"
@pytest.mark.usefixtures("setup_database")
@@ -149,24 +139,21 @@ async def test_hatchet_validation_allows_completed_workflow():
recording_id="test-recording-id",
)
with patch("reflector.services.transcript_process.settings") as mock_settings:
mock_settings.HATCHET_ENABLED = True
with patch(
"reflector.services.transcript_process.HatchetClientManager"
) as mock_hatchet:
mock_hatchet.get_workflow_run_status = AsyncMock(
return_value=V1TaskStatus.COMPLETED
)
with patch(
"reflector.services.transcript_process.HatchetClientManager"
) as mock_hatchet:
mock_hatchet.get_workflow_run_status = AsyncMock(
return_value=V1TaskStatus.COMPLETED
)
"reflector.services.transcript_process.task_is_scheduled_or_active"
) as mock_celery_check:
mock_celery_check.return_value = False
with patch(
"reflector.services.transcript_process.task_is_scheduled_or_active"
) as mock_celery_check:
mock_celery_check.return_value = False
result = await validate_transcript_for_processing(mock_transcript)
result = await validate_transcript_for_processing(mock_transcript)
assert isinstance(result, ValidationOk)
assert isinstance(result, ValidationOk)
@pytest.mark.usefixtures("setup_database")
@@ -187,26 +174,23 @@ async def test_hatchet_validation_allows_when_status_check_fails():
recording_id="test-recording-id",
)
with patch("reflector.services.transcript_process.settings") as mock_settings:
mock_settings.HATCHET_ENABLED = True
with patch(
"reflector.services.transcript_process.HatchetClientManager"
) as mock_hatchet:
# Status check fails (workflow might be deleted)
mock_hatchet.get_workflow_run_status = AsyncMock(
side_effect=ApiException("Workflow not found")
)
with patch(
"reflector.services.transcript_process.HatchetClientManager"
) as mock_hatchet:
# Status check fails (workflow might be deleted)
mock_hatchet.get_workflow_run_status = AsyncMock(
side_effect=ApiException("Workflow not found")
)
"reflector.services.transcript_process.task_is_scheduled_or_active"
) as mock_celery_check:
mock_celery_check.return_value = False
with patch(
"reflector.services.transcript_process.task_is_scheduled_or_active"
) as mock_celery_check:
mock_celery_check.return_value = False
result = await validate_transcript_for_processing(mock_transcript)
result = await validate_transcript_for_processing(mock_transcript)
# Should allow processing when we can't get status
assert isinstance(result, ValidationOk)
# Should allow processing when we can't get status
assert isinstance(result, ValidationOk)
@pytest.mark.usefixtures("setup_database")
@@ -227,47 +211,11 @@ async def test_hatchet_validation_skipped_when_no_workflow_id():
recording_id="test-recording-id",
)
with patch("reflector.services.transcript_process.settings") as mock_settings:
mock_settings.HATCHET_ENABLED = True
with patch(
"reflector.services.transcript_process.HatchetClientManager"
) as mock_hatchet:
# Should not be called
mock_hatchet.get_workflow_run_status = AsyncMock()
with patch(
"reflector.services.transcript_process.task_is_scheduled_or_active"
) as mock_celery_check:
mock_celery_check.return_value = False
result = await validate_transcript_for_processing(mock_transcript)
# Should not check Hatchet status
mock_hatchet.get_workflow_run_status.assert_not_called()
assert isinstance(result, ValidationOk)
@pytest.mark.usefixtures("setup_database")
@pytest.mark.asyncio
async def test_hatchet_validation_skipped_when_disabled():
"""Test that Hatchet validation is skipped when HATCHET_ENABLED is False."""
from reflector.services.transcript_process import (
ValidationOk,
validate_transcript_for_processing,
)
mock_transcript = Transcript(
id="test-transcript-id",
name="Test",
status="uploaded",
source_kind="room",
workflow_run_id="some-workflow-123",
recording_id="test-recording-id",
)
with patch("reflector.services.transcript_process.settings") as mock_settings:
mock_settings.HATCHET_ENABLED = False # Hatchet disabled
with patch(
"reflector.services.transcript_process.HatchetClientManager"
) as mock_hatchet:
# Should not be called
mock_hatchet.get_workflow_run_status = AsyncMock()
with patch(
"reflector.services.transcript_process.task_is_scheduled_or_active"
@@ -276,7 +224,8 @@ async def test_hatchet_validation_skipped_when_disabled():
result = await validate_transcript_for_processing(mock_transcript)
# Should not check Hatchet at all
# Should not check Hatchet status
mock_hatchet.get_workflow_run_status.assert_not_called()
assert isinstance(result, ValidationOk)
@@ -306,7 +255,7 @@ async def test_validation_locked_transcript():
@pytest.mark.usefixtures("setup_database")
@pytest.mark.asyncio
async def test_validation_idle_transcript():
"""Test that validation rejects idle transcripts (not ready)."""
"""Test that validation rejects idle transcripts without recording (file upload not ready)."""
from reflector.services.transcript_process import (
ValidationNotReady,
validate_transcript_for_processing,
@@ -325,6 +274,34 @@ async def test_validation_idle_transcript():
assert "not ready" in result.detail.lower()
@pytest.mark.usefixtures("setup_database")
@pytest.mark.asyncio
async def test_validation_idle_transcript_with_recording_allowed():
"""Test that validation allows idle transcripts with recording_id (multitrack ready/retry)."""
from reflector.services.transcript_process import (
ValidationOk,
validate_transcript_for_processing,
)
mock_transcript = Transcript(
id="test-transcript-id",
name="Test",
status="idle",
source_kind="room",
recording_id="test-recording-id",
)
with patch(
"reflector.services.transcript_process.task_is_scheduled_or_active"
) as mock_celery_check:
mock_celery_check.return_value = False
result = await validate_transcript_for_processing(mock_transcript)
assert isinstance(result, ValidationOk)
assert result.recording_id == "test-recording-id"
@pytest.mark.usefixtures("setup_database")
@pytest.mark.asyncio
async def test_prepare_multitrack_config():

View File

@@ -0,0 +1,185 @@
"""
Tests for Hatchet payload thinning optimizations.
Verifies that:
1. TopicChunkInput no longer carries words
2. TopicChunkResult no longer carries words
3. words_to_segments() matches Transcript.as_segments(is_multitrack=False) — behavioral equivalence
for the extract_subjects refactoring
4. TopicsResult can be constructed with empty transcript words
"""
from reflector.hatchet.workflows.models import TopicChunkResult
from reflector.hatchet.workflows.topic_chunk_processing import TopicChunkInput
from reflector.processors.types import Word
def _make_words(speaker: int = 0, start: float = 0.0) -> list[Word]:
return [
Word(text="Hello", start=start, end=start + 0.5, speaker=speaker),
Word(text=" world.", start=start + 0.5, end=start + 1.0, speaker=speaker),
]
class TestTopicChunkInputNoWords:
"""TopicChunkInput must not have a words field."""
def test_no_words_field(self):
assert "words" not in TopicChunkInput.model_fields
def test_construction_without_words(self):
inp = TopicChunkInput(
chunk_index=0, chunk_text="Hello world.", timestamp=0.0, duration=1.0
)
assert inp.chunk_index == 0
assert inp.chunk_text == "Hello world."
def test_rejects_words_kwarg(self):
"""Passing words= should raise a validation error (field doesn't exist)."""
import pydantic
try:
TopicChunkInput(
chunk_index=0,
chunk_text="text",
timestamp=0.0,
duration=1.0,
words=_make_words(),
)
# If pydantic is configured to ignore extra, this won't raise.
# Verify the field is still absent from the model.
assert "words" not in TopicChunkInput.model_fields
except pydantic.ValidationError:
pass # Expected
class TestTopicChunkResultNoWords:
"""TopicChunkResult must not have a words field."""
def test_no_words_field(self):
assert "words" not in TopicChunkResult.model_fields
def test_construction_without_words(self):
result = TopicChunkResult(
chunk_index=0,
title="Test",
summary="Summary",
timestamp=0.0,
duration=1.0,
)
assert result.title == "Test"
assert result.chunk_index == 0
def test_serialization_roundtrip(self):
"""Serialized TopicChunkResult has no words key."""
result = TopicChunkResult(
chunk_index=0,
title="Test",
summary="Summary",
timestamp=0.0,
duration=1.0,
)
data = result.model_dump()
assert "words" not in data
reconstructed = TopicChunkResult(**data)
assert reconstructed == result
class TestWordsToSegmentsBehavioralEquivalence:
"""words_to_segments() must produce same output as Transcript.as_segments(is_multitrack=False).
This ensures the extract_subjects refactoring (from task output topic.transcript.as_segments()
to words_to_segments(db_topic.words)) preserves identical behavior.
"""
def test_single_speaker(self):
from reflector.processors.types import Transcript as TranscriptType
from reflector.processors.types import words_to_segments
words = _make_words(speaker=0)
direct = words_to_segments(words)
via_transcript = TranscriptType(words=words).as_segments(is_multitrack=False)
assert len(direct) == len(via_transcript)
for d, v in zip(direct, via_transcript):
assert d.text == v.text
assert d.speaker == v.speaker
assert d.start == v.start
assert d.end == v.end
def test_multiple_speakers(self):
from reflector.processors.types import Transcript as TranscriptType
from reflector.processors.types import words_to_segments
words = [
Word(text="Hello", start=0.0, end=0.5, speaker=0),
Word(text=" world.", start=0.5, end=1.0, speaker=0),
Word(text=" How", start=1.0, end=1.5, speaker=1),
Word(text=" are", start=1.5, end=2.0, speaker=1),
Word(text=" you?", start=2.0, end=2.5, speaker=1),
]
direct = words_to_segments(words)
via_transcript = TranscriptType(words=words).as_segments(is_multitrack=False)
assert len(direct) == len(via_transcript)
for d, v in zip(direct, via_transcript):
assert d.text == v.text
assert d.speaker == v.speaker
def test_empty_words(self):
from reflector.processors.types import Transcript as TranscriptType
from reflector.processors.types import words_to_segments
assert words_to_segments([]) == []
assert TranscriptType(words=[]).as_segments(is_multitrack=False) == []
class TestTopicsResultEmptyWords:
"""TopicsResult can carry topics with empty transcript words."""
def test_construction_with_empty_words(self):
from reflector.hatchet.workflows.models import TopicsResult
from reflector.processors.types import TitleSummary
from reflector.processors.types import Transcript as TranscriptType
topics = [
TitleSummary(
title="Topic A",
summary="Summary A",
timestamp=0.0,
duration=5.0,
transcript=TranscriptType(words=[]),
),
TitleSummary(
title="Topic B",
summary="Summary B",
timestamp=5.0,
duration=5.0,
transcript=TranscriptType(words=[]),
),
]
result = TopicsResult(topics=topics)
assert len(result.topics) == 2
for t in result.topics:
assert t.transcript.words == []
def test_serialization_roundtrip(self):
from reflector.hatchet.workflows.models import TopicsResult
from reflector.processors.types import TitleSummary
from reflector.processors.types import Transcript as TranscriptType
topics = [
TitleSummary(
title="Topic",
summary="Summary",
timestamp=0.0,
duration=1.0,
transcript=TranscriptType(words=[]),
)
]
result = TopicsResult(topics=topics)
data = result.model_dump()
reconstructed = TopicsResult(**data)
assert len(reconstructed.topics) == 1
assert reconstructed.topics[0].transcript.words == []

View File

@@ -189,14 +189,17 @@ async def test_ics_sync_service_sync_room_calendar():
assert events[0].ics_uid == "sync-event-1"
assert events[0].title == "Sync Test Meeting"
# Second sync with same content (should be unchanged)
# Second sync with same content (calendar unchanged, but sync always runs)
# Refresh room to get updated etag and force sync by setting old sync time
room = await rooms_controller.get_by_id(room.id)
await rooms_controller.update(
room, {"ics_last_sync": datetime.now(timezone.utc) - timedelta(minutes=10)}
)
result = await sync_service.sync_room_calendar(room)
assert result["status"] == "unchanged"
assert result["status"] == "success"
assert result["events_created"] == 0
assert result["events_updated"] == 0
assert result["events_deleted"] == 0
# Third sync with updated event
event["summary"] = "Updated Meeting Title"
@@ -288,3 +291,43 @@ async def test_ics_sync_service_error_handling():
result = await sync_service.sync_room_calendar(room)
assert result["status"] == "error"
assert "Network error" in result["error"]
@pytest.mark.asyncio
async def test_event_data_changed_exhaustiveness():
"""Test that _event_data_changed compares all EventData fields (except ics_uid).
This test ensures programmers don't forget to update the comparison logic
when adding new fields to EventData/CalendarEvent.
"""
from reflector.services.ics_sync import EventData
sync_service = ICSSyncService()
from reflector.db.calendar_events import CalendarEvent
now = datetime.now(timezone.utc)
event_data: EventData = {
"ics_uid": "test-123",
"title": "Test",
"description": "Desc",
"location": "Loc",
"start_time": now,
"end_time": now + timedelta(hours=1),
"attendees": [],
"ics_raw_data": "raw",
}
existing = CalendarEvent(
room_id="room1",
**event_data,
)
# Will raise RuntimeError if fields are missing from comparison
result = sync_service._event_data_changed(existing, event_data)
assert result is False
modified_data = event_data.copy()
modified_data["title"] = "Changed Title"
result = sync_service._event_data_changed(existing, modified_data)
assert result is True

View File

@@ -319,3 +319,51 @@ def test_aws_storage_constructor_rejects_mixed_auth():
aws_secret_access_key="test-secret",
aws_role_arn="arn:aws:iam::123456789012:role/test-role",
)
@pytest.mark.asyncio
async def test_aws_storage_custom_endpoint_url():
"""Test that custom endpoint_url configures path-style addressing and passes endpoint to client."""
storage = AwsStorage(
aws_bucket_name="reflector-media",
aws_region="garage",
aws_access_key_id="GKtest",
aws_secret_access_key="secret",
aws_endpoint_url="http://garage:3900",
)
assert storage._endpoint_url == "http://garage:3900"
assert storage.boto_config.s3["addressing_style"] == "path"
assert storage.base_url == "http://garage:3900/reflector-media/"
# retries config preserved (merge, not replace)
assert storage.boto_config.retries["max_attempts"] == 3
mock_client = AsyncMock()
mock_client.put_object = AsyncMock()
mock_client.__aenter__ = AsyncMock(return_value=mock_client)
mock_client.__aexit__ = AsyncMock(return_value=None)
mock_client.generate_presigned_url = AsyncMock(
return_value="http://garage:3900/reflector-media/test.txt"
)
with patch.object(
storage.session, "client", return_value=mock_client
) as mock_session_client:
await storage.put_file("test.txt", b"data")
mock_session_client.assert_called_with(
"s3", config=storage.boto_config, endpoint_url="http://garage:3900"
)
@pytest.mark.asyncio
async def test_aws_storage_none_endpoint_url():
"""Test that None endpoint preserves current AWS behavior."""
storage = AwsStorage(
aws_bucket_name="reflector-bucket",
aws_region="us-east-1",
aws_access_key_id="AKIAtest",
aws_secret_access_key="secret",
)
assert storage._endpoint_url is None
assert storage.base_url == "https://reflector-bucket.s3.amazonaws.com/"
# No s3 addressing_style override — boto_config should only have retries
assert not hasattr(storage.boto_config, "s3") or storage.boto_config.s3 is None

View File

@@ -0,0 +1,374 @@
"""
Integration tests for time-based meeting-to-recording matching.
Tests the critical path for matching Daily.co recordings to meetings when
API doesn't return instanceId.
"""
from datetime import datetime, timedelta, timezone
import pytest
from reflector.db.meetings import meetings_controller
from reflector.db.rooms import rooms_controller
@pytest.fixture
async def test_room():
"""Create a test room for meetings."""
room = await rooms_controller.add(
name="test-room-time",
user_id="test-user-id",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic",
is_shared=False,
platform="daily",
)
return room
@pytest.fixture
def base_time():
"""Fixed timestamp for deterministic tests."""
return datetime(2026, 1, 14, 9, 0, 0, tzinfo=timezone.utc)
class TestTimeBasedMatching:
"""Test get_by_room_name_and_time() matching logic."""
async def test_exact_time_match(self, test_room, base_time):
"""Recording timestamp exactly matches meeting start_date."""
meeting = await meetings_controller.create(
id="meeting-exact",
room_name="daily-test-20260114090000",
room_url="https://example.daily.co/test",
host_room_url="https://example.daily.co/test?t=host",
start_date=base_time,
end_date=base_time + timedelta(hours=1),
room=test_room,
)
result = await meetings_controller.get_by_room_name_and_time(
room_name="daily-test-20260114090000",
recording_start=base_time,
time_window_hours=168,
)
assert result is not None
assert result.id == meeting.id
async def test_recording_slightly_after_meeting_start(self, test_room, base_time):
"""Recording started 1 minute after meeting (participants joined late)."""
meeting = await meetings_controller.create(
id="meeting-late",
room_name="daily-test-20260114090100",
room_url="https://example.daily.co/test",
host_room_url="https://example.daily.co/test?t=host",
start_date=base_time,
end_date=base_time + timedelta(hours=1),
room=test_room,
)
recording_start = base_time + timedelta(minutes=1)
result = await meetings_controller.get_by_room_name_and_time(
room_name="daily-test-20260114090100",
recording_start=recording_start,
time_window_hours=168,
)
assert result is not None
assert result.id == meeting.id
async def test_duplicate_room_names_picks_closest(self, test_room, base_time):
"""
Two meetings with same room_name (duplicate/race condition).
Should pick closest by timestamp.
"""
meeting1 = await meetings_controller.create(
id="meeting-1-first",
room_name="daily-duplicate-room",
room_url="https://example.daily.co/test",
host_room_url="https://example.daily.co/test?t=host",
start_date=base_time,
end_date=base_time + timedelta(hours=1),
room=test_room,
)
meeting2 = await meetings_controller.create(
id="meeting-2-second",
room_name="daily-duplicate-room", # Same room_name!
room_url="https://example.daily.co/test",
host_room_url="https://example.daily.co/test?t=host",
start_date=base_time + timedelta(seconds=0.99), # 0.99s later
end_date=base_time + timedelta(hours=1),
room=test_room,
)
# Recording started 0.5s after meeting1
# Distance: meeting1 = 0.5s, meeting2 = 0.49s → meeting2 is closer
recording_start = base_time + timedelta(seconds=0.5)
result = await meetings_controller.get_by_room_name_and_time(
room_name="daily-duplicate-room",
recording_start=recording_start,
time_window_hours=168,
)
assert result is not None
assert result.id == meeting2.id # meeting2 is closer (0.49s vs 0.5s)
async def test_outside_time_window_returns_none(self, test_room, base_time):
"""Recording outside 1-week window returns None."""
await meetings_controller.create(
id="meeting-old",
room_name="daily-test-old",
room_url="https://example.daily.co/test",
host_room_url="https://example.daily.co/test?t=host",
start_date=base_time,
end_date=base_time + timedelta(hours=1),
room=test_room,
)
# Recording 8 days later (outside 7-day window)
recording_start = base_time + timedelta(days=8)
result = await meetings_controller.get_by_room_name_and_time(
room_name="daily-test-old",
recording_start=recording_start,
time_window_hours=168,
)
assert result is None
async def test_tie_breaker_deterministic(self, test_room, base_time):
"""When time delta identical, tie-breaker by meeting.id is deterministic."""
meeting_z = await meetings_controller.create(
id="zzz-last-uuid",
room_name="daily-test-tie",
room_url="https://example.daily.co/test",
host_room_url="https://example.daily.co/test?t=host",
start_date=base_time,
end_date=base_time + timedelta(hours=1),
room=test_room,
)
meeting_a = await meetings_controller.create(
id="aaa-first-uuid",
room_name="daily-test-tie",
room_url="https://example.daily.co/test",
host_room_url="https://example.daily.co/test?t=host",
start_date=base_time, # Exact same start_date
end_date=base_time + timedelta(hours=1),
room=test_room,
)
result = await meetings_controller.get_by_room_name_and_time(
room_name="daily-test-tie",
recording_start=base_time,
time_window_hours=168,
)
assert result is not None
# Tie-breaker: lexicographically first UUID
assert result.id == "aaa-first-uuid"
async def test_timezone_naive_datetime_raises(self, test_room, base_time):
"""Timezone-naive datetime raises ValueError."""
await meetings_controller.create(
id="meeting-tz",
room_name="daily-test-tz",
room_url="https://example.daily.co/test",
host_room_url="https://example.daily.co/test?t=host",
start_date=base_time,
end_date=base_time + timedelta(hours=1),
room=test_room,
)
# Naive datetime (no timezone)
naive_dt = datetime(2026, 1, 14, 9, 0, 0)
with pytest.raises(ValueError, match="timezone-aware"):
await meetings_controller.get_by_room_name_and_time(
room_name="daily-test-tz",
recording_start=naive_dt,
time_window_hours=168,
)
async def test_one_week_boundary_after_included(self, test_room, base_time):
"""Meeting 1-week AFTER recording is included (window_end boundary)."""
meeting_time = base_time + timedelta(hours=168)
await meetings_controller.create(
id="meeting-boundary-after",
room_name="daily-test-boundary-after",
room_url="https://example.daily.co/test",
host_room_url="https://example.daily.co/test?t=host",
start_date=meeting_time,
end_date=meeting_time + timedelta(hours=1),
room=test_room,
)
result = await meetings_controller.get_by_room_name_and_time(
room_name="daily-test-boundary-after",
recording_start=base_time,
time_window_hours=168,
)
assert result is not None
assert result.id == "meeting-boundary-after"
async def test_one_week_boundary_before_included(self, test_room, base_time):
"""Meeting 1-week BEFORE recording is included (window_start boundary)."""
meeting_time = base_time - timedelta(hours=168)
await meetings_controller.create(
id="meeting-boundary-before",
room_name="daily-test-boundary-before",
room_url="https://example.daily.co/test",
host_room_url="https://example.daily.co/test?t=host",
start_date=meeting_time,
end_date=meeting_time + timedelta(hours=1),
room=test_room,
)
result = await meetings_controller.get_by_room_name_and_time(
room_name="daily-test-boundary-before",
recording_start=base_time,
time_window_hours=168,
)
assert result is not None
assert result.id == "meeting-boundary-before"
async def test_recording_before_meeting_start(self, test_room, base_time):
"""Recording started before meeting (clock skew or early join)."""
await meetings_controller.create(
id="meeting-early",
room_name="daily-test-early",
room_url="https://example.daily.co/test",
host_room_url="https://example.daily.co/test?t=host",
start_date=base_time,
end_date=base_time + timedelta(hours=1),
room=test_room,
)
recording_start = base_time - timedelta(minutes=2)
result = await meetings_controller.get_by_room_name_and_time(
room_name="daily-test-early",
recording_start=recording_start,
time_window_hours=168,
)
assert result is not None
assert result.id == "meeting-early"
async def test_mixed_inside_outside_window(self, test_room, base_time):
"""Multiple meetings, only one inside window - returns the inside one."""
await meetings_controller.create(
id="meeting-old",
room_name="daily-test-mixed",
room_url="https://example.daily.co/test",
host_room_url="https://example.daily.co/test?t=host",
start_date=base_time - timedelta(days=10),
end_date=base_time - timedelta(days=10, hours=-1),
room=test_room,
)
await meetings_controller.create(
id="meeting-inside",
room_name="daily-test-mixed",
room_url="https://example.daily.co/test",
host_room_url="https://example.daily.co/test?t=host",
start_date=base_time - timedelta(days=2),
end_date=base_time - timedelta(days=2, hours=-1),
room=test_room,
)
await meetings_controller.create(
id="meeting-future",
room_name="daily-test-mixed",
room_url="https://example.daily.co/test",
host_room_url="https://example.daily.co/test?t=host",
start_date=base_time + timedelta(days=10),
end_date=base_time + timedelta(days=10, hours=1),
room=test_room,
)
result = await meetings_controller.get_by_room_name_and_time(
room_name="daily-test-mixed",
recording_start=base_time,
time_window_hours=168,
)
assert result is not None
assert result.id == "meeting-inside"
class TestAtomicCloudRecordingUpdate:
"""Test atomic update prevents race conditions."""
async def test_first_update_succeeds(self, test_room, base_time):
"""First call to set_cloud_recording_if_missing succeeds."""
meeting = await meetings_controller.create(
id="meeting-atomic-1",
room_name="daily-test-atomic",
room_url="https://example.daily.co/test",
host_room_url="https://example.daily.co/test?t=host",
start_date=base_time,
end_date=base_time + timedelta(hours=1),
room=test_room,
)
success = await meetings_controller.set_cloud_recording_if_missing(
meeting_id=meeting.id,
s3_key="first-s3-key",
duration=100,
)
assert success is True
updated = await meetings_controller.get_by_id(meeting.id)
assert updated.daily_composed_video_s3_key == "first-s3-key"
assert updated.daily_composed_video_duration == 100
async def test_second_update_fails_atomically(self, test_room, base_time):
"""Second call to update same meeting doesn't overwrite (atomic check)."""
meeting = await meetings_controller.create(
id="meeting-atomic-2",
room_name="daily-test-atomic2",
room_url="https://example.daily.co/test",
host_room_url="https://example.daily.co/test?t=host",
start_date=base_time,
end_date=base_time + timedelta(hours=1),
room=test_room,
)
success1 = await meetings_controller.set_cloud_recording_if_missing(
meeting_id=meeting.id,
s3_key="first-s3-key",
duration=100,
)
assert success1 is True
after_first = await meetings_controller.get_by_id(meeting.id)
assert after_first.daily_composed_video_s3_key == "first-s3-key"
success2 = await meetings_controller.set_cloud_recording_if_missing(
meeting_id=meeting.id,
s3_key="bucket/path/should-not-overwrite",
duration=200,
)
assert success2 is False
final = await meetings_controller.get_by_id(meeting.id)
assert final.daily_composed_video_s3_key == "first-s3-key"
assert final.daily_composed_video_duration == 100

View File

@@ -1,6 +1,6 @@
import asyncio
import time
from unittest.mock import patch
from unittest.mock import AsyncMock, patch
import pytest
from httpx import ASGITransport, AsyncClient
@@ -142,17 +142,17 @@ async def test_whereby_recording_uses_file_pipeline(client):
"reflector.services.transcript_process.task_pipeline_file_process"
) as mock_file_pipeline,
patch(
"reflector.services.transcript_process.task_pipeline_multitrack_process"
) as mock_multitrack_pipeline,
"reflector.services.transcript_process.HatchetClientManager"
) as mock_hatchet,
):
response = await client.post(f"/transcripts/{transcript.id}/process")
assert response.status_code == 200
assert response.json()["status"] == "ok"
# Whereby recordings should use file pipeline
# Whereby recordings should use file pipeline, not Hatchet
mock_file_pipeline.delay.assert_called_once_with(transcript_id=transcript.id)
mock_multitrack_pipeline.delay.assert_not_called()
mock_hatchet.start_workflow.assert_not_called()
@pytest.mark.usefixtures("setup_database")
@@ -162,9 +162,22 @@ async def test_dailyco_recording_uses_multitrack_pipeline(client):
from datetime import datetime, timezone
from reflector.db.recordings import Recording, recordings_controller
from reflector.db.rooms import rooms_controller
from reflector.db.transcripts import transcripts_controller
# Create transcript with Daily.co multitrack recording
room = await rooms_controller.add(
name="test-room",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
)
transcript = await transcripts_controller.add(
"",
source_kind="room",
@@ -172,6 +185,7 @@ async def test_dailyco_recording_uses_multitrack_pipeline(client):
target_language="en",
user_id="test-user",
share_mode="public",
room_id=room.id,
)
track_keys = [
@@ -197,18 +211,23 @@ async def test_dailyco_recording_uses_multitrack_pipeline(client):
"reflector.services.transcript_process.task_pipeline_file_process"
) as mock_file_pipeline,
patch(
"reflector.services.transcript_process.task_pipeline_multitrack_process"
) as mock_multitrack_pipeline,
"reflector.services.transcript_process.HatchetClientManager"
) as mock_hatchet,
):
mock_hatchet.start_workflow = AsyncMock(return_value="test-workflow-id")
response = await client.post(f"/transcripts/{transcript.id}/process")
assert response.status_code == 200
assert response.json()["status"] == "ok"
# Daily.co multitrack recordings should use multitrack pipeline
mock_multitrack_pipeline.delay.assert_called_once_with(
transcript_id=transcript.id,
bucket_name="daily-bucket",
track_keys=track_keys,
)
# Daily.co multitrack recordings should use Hatchet workflow
mock_hatchet.start_workflow.assert_called_once()
call_kwargs = mock_hatchet.start_workflow.call_args.kwargs
assert call_kwargs["workflow_name"] == "DiarizationPipeline"
assert call_kwargs["input_data"]["transcript_id"] == transcript.id
assert call_kwargs["input_data"]["bucket_name"] == "daily-bucket"
assert call_kwargs["input_data"]["tracks"] == [
{"s3_key": k} for k in track_keys
]
mock_file_pipeline.delay.assert_not_called()

View File

@@ -115,9 +115,7 @@ def appserver(tmpdir, setup_database, celery_session_app, celery_session_worker)
settings.DATA_DIR = DATA_DIR
@pytest.fixture(scope="session")
def celery_includes():
return ["reflector.pipelines.main_live_pipeline"]
# Using celery_includes from conftest.py which includes both pipelines
@pytest.mark.usefixtures("setup_database")

View File

@@ -56,7 +56,12 @@ def appserver_ws_user(setup_database):
if server_instance:
server_instance.should_exit = True
server_thread.join(timeout=30)
server_thread.join(timeout=2.0)
# Reset global singleton for test isolation
from reflector.ws_manager import reset_ws_manager
reset_ws_manager()
@pytest.fixture(autouse=True)
@@ -133,6 +138,8 @@ async def test_user_ws_accepts_valid_token_and_receives_events(appserver_ws_user
# Connect and then trigger an event via HTTP create
async with aconnect_ws(base_ws, subprotocols=subprotocols) as ws:
await asyncio.sleep(0.2)
# Emit an event to the user's room via a standard HTTP action
from httpx import AsyncClient
@@ -150,6 +157,7 @@ async def test_user_ws_accepts_valid_token_and_receives_events(appserver_ws_user
"email": "user-abc@example.com",
}
# Use in-memory client (global singleton makes it share ws_manager)
async with AsyncClient(app=app, base_url=f"http://{host}:{port}/v1") as ac:
# Create a transcript as this user so that the server publishes TRANSCRIPT_CREATED to user room
resp = await ac.post("/transcripts", json={"name": "WS Test"})

721
server/uv.lock generated
View File

@@ -159,21 +159,20 @@ wheels = [
[[package]]
name = "aiortc"
version = "1.13.0"
version = "1.14.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "aioice" },
{ name = "av" },
{ name = "cffi" },
{ name = "cryptography" },
{ name = "google-crc32c" },
{ name = "pyee" },
{ name = "pylibsrtp" },
{ name = "pyopenssl" },
]
sdist = { url = "https://files.pythonhosted.org/packages/62/03/bc947d74c548e0c17cf94e5d5bdacaed0ee9e5b2bb7b8b8cf1ac7a7c01ec/aiortc-1.13.0.tar.gz", hash = "sha256:5d209975c22d0910fb5a0f0e2caa828f2da966c53580f7c7170ac3a16a871620", size = 1179894 }
sdist = { url = "https://files.pythonhosted.org/packages/51/9c/4e027bfe0195de0442da301e2389329496745d40ae44d2d7c4571c4290ce/aiortc-1.14.0.tar.gz", hash = "sha256:adc8a67ace10a085721e588e06a00358ed8eaf5f6b62f0a95358ff45628dd762", size = 1180864 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/87/29/765633cab5f1888890f5f172d1d53009b9b14e079cdfa01a62d9896a9ea9/aiortc-1.13.0-py3-none-any.whl", hash = "sha256:9ccccec98796f6a96bd1c3dd437a06da7e0f57521c96bd56e4b965a91b03a0a0", size = 92910 },
{ url = "https://files.pythonhosted.org/packages/57/ab/31646a49209568cde3b97eeade0d28bb78b400e6645c56422c101df68932/aiortc-1.14.0-py3-none-any.whl", hash = "sha256:4b244d7e482f4e1f67e685b3468269628eca1ec91fa5b329ab517738cfca086e", size = 93183 },
]
[[package]]
@@ -236,12 +235,6 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/78/b6/6307fbef88d9b5ee7421e68d78a9f162e0da4900bc5f5793f6d3d0e34fb8/annotated_types-0.7.0-py3-none-any.whl", hash = "sha256:1f02e8b43a8fbbc3f3e0d4f0f4bfc8131bcb4eebe8849b8e5c773f3a1c582a53", size = 13643 },
]
[[package]]
name = "antlr4-python3-runtime"
version = "4.9.3"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/3e/38/7859ff46355f76f8d19459005ca000b6e7012f2f1ca597746cbcd1fbfe5e/antlr4-python3-runtime-4.9.3.tar.gz", hash = "sha256:f224469b4168294902bb1efa80a8bf7855f24c99aef99cbefc1bcd3cce77881b", size = 117034 }
[[package]]
name = "anyio"
version = "4.9.0"
@@ -268,21 +261,6 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/2f/f5/c36551e93acba41a59939ae6a0fb77ddb3f2e8e8caa716410c65f7341f72/asgi_lifespan-2.1.0-py3-none-any.whl", hash = "sha256:ed840706680e28428c01e14afb3875d7d76d3206f3d5b2f2294e059b5c23804f", size = 10895 },
]
[[package]]
name = "asteroid-filterbanks"
version = "0.4.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "numpy" },
{ name = "torch", version = "2.8.0", source = { registry = "https://download.pytorch.org/whl/cpu" }, marker = "sys_platform == 'darwin'" },
{ name = "torch", version = "2.8.0+cpu", source = { registry = "https://download.pytorch.org/whl/cpu" }, marker = "sys_platform != 'darwin'" },
{ name = "typing-extensions" },
]
sdist = { url = "https://files.pythonhosted.org/packages/90/fa/5c2be1f96dc179f83cdd3bb267edbd1f47d08f756785c016d5c2163901a7/asteroid-filterbanks-0.4.0.tar.gz", hash = "sha256:415f89d1dcf2b13b35f03f7a9370968ac4e6fa6800633c522dac992b283409b9", size = 24599 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/c5/7c/83ff6046176a675e6a1e8aeefed8892cd97fe7c46af93cc540d1b24b8323/asteroid_filterbanks-0.4.0-py3-none-any.whl", hash = "sha256:4932ac8b6acc6e08fb87cbe8ece84215b5a74eee284fe83acf3540a72a02eaf5", size = 29912 },
]
[[package]]
name = "async-timeout"
version = "5.0.1"
@@ -327,28 +305,24 @@ wheels = [
[[package]]
name = "av"
version = "14.4.0"
version = "16.1.0"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/86/f6/0b473dab52dfdea05f28f3578b1c56b6c796ce85e76951bab7c4e38d5a74/av-14.4.0.tar.gz", hash = "sha256:3ecbf803a7fdf67229c0edada0830d6bfaea4d10bfb24f0c3f4e607cd1064b42", size = 3892203 }
sdist = { url = "https://files.pythonhosted.org/packages/78/cd/3a83ffbc3cc25b39721d174487fb0d51a76582f4a1703f98e46170ce83d4/av-16.1.0.tar.gz", hash = "sha256:a094b4fd87a3721dacf02794d3d2c82b8d712c85b9534437e82a8a978c175ffd", size = 4285203 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/18/8a/d57418b686ffd05fabd5a0a9cfa97e63b38c35d7101af00e87c51c8cc43c/av-14.4.0-cp311-cp311-macosx_12_0_arm64.whl", hash = "sha256:5b21d5586a88b9fce0ab78e26bd1c38f8642f8e2aad5b35e619f4d202217c701", size = 19965048 },
{ url = "https://files.pythonhosted.org/packages/f5/aa/3f878b0301efe587e9b07bb773dd6b47ef44ca09a3cffb4af50c08a170f3/av-14.4.0-cp311-cp311-macosx_12_0_x86_64.whl", hash = "sha256:cf8762d90b0f94a20c9f6e25a94f1757db5a256707964dfd0b1d4403e7a16835", size = 23750064 },
{ url = "https://files.pythonhosted.org/packages/9a/b4/6fe94a31f9ed3a927daa72df67c7151968587106f30f9f8fcd792b186633/av-14.4.0-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:c0ac9f08920c7bbe0795319689d901e27cb3d7870b9a0acae3f26fc9daa801a6", size = 33648775 },
{ url = "https://files.pythonhosted.org/packages/6c/f3/7f3130753521d779450c935aec3f4beefc8d4645471159f27b54e896470c/av-14.4.0-cp311-cp311-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:a56d9ad2afdb638ec0404e962dc570960aae7e08ae331ad7ff70fbe99a6cf40e", size = 32216915 },
{ url = "https://files.pythonhosted.org/packages/f8/9a/8ffabfcafb42154b4b3a67d63f9b69e68fa8c34cb39ddd5cb813dd049ed4/av-14.4.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:6bed513cbcb3437d0ae47743edc1f5b4a113c0b66cdd4e1aafc533abf5b2fbf2", size = 35287279 },
{ url = "https://files.pythonhosted.org/packages/ad/11/7023ba0a2ca94a57aedf3114ab8cfcecb0819b50c30982a4c5be4d31df41/av-14.4.0-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:d030c2d3647931e53d51f2f6e0fcf465263e7acf9ec6e4faa8dbfc77975318c3", size = 36294683 },
{ url = "https://files.pythonhosted.org/packages/3d/fa/b8ac9636bd5034e2b899354468bef9f4dadb067420a16d8a493a514b7817/av-14.4.0-cp311-cp311-musllinux_1_2_i686.whl", hash = "sha256:1cc21582a4f606271d8c2036ec7a6247df0831050306c55cf8a905701d0f0474", size = 34552391 },
{ url = "https://files.pythonhosted.org/packages/fb/29/0db48079c207d1cba7a2783896db5aec3816e17de55942262c244dffbc0f/av-14.4.0-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:ce7c9cd452153d36f1b1478f904ed5f9ab191d76db873bdd3a597193290805d4", size = 37265250 },
{ url = "https://files.pythonhosted.org/packages/1c/55/715858c3feb7efa4d667ce83a829c8e6ee3862e297fb2b568da3f968639d/av-14.4.0-cp311-cp311-win_amd64.whl", hash = "sha256:fd261e31cc6b43ca722f80656c39934199d8f2eb391e0147e704b6226acebc29", size = 27925845 },
{ url = "https://files.pythonhosted.org/packages/a6/75/b8641653780336c90ba89e5352cac0afa6256a86a150c7703c0b38851c6d/av-14.4.0-cp312-cp312-macosx_12_0_arm64.whl", hash = "sha256:a53e682b239dd23b4e3bc9568cfb1168fc629ab01925fdb2e7556eb426339e94", size = 19954125 },
{ url = "https://files.pythonhosted.org/packages/99/e6/37fe6fa5853a48d54d749526365780a63a4bc530be6abf2115e3a21e292a/av-14.4.0-cp312-cp312-macosx_12_0_x86_64.whl", hash = "sha256:5aa0b901751a32703fa938d2155d56ce3faf3630e4a48d238b35d2f7e49e5395", size = 23751479 },
{ url = "https://files.pythonhosted.org/packages/f7/75/9a5f0e6bda5f513b62bafd1cff2b495441a8b07ab7fb7b8e62f0c0d1683f/av-14.4.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:a3b316fed3597675fe2aacfed34e25fc9d5bb0196dc8c0b014ae5ed4adda48de", size = 33801401 },
{ url = "https://files.pythonhosted.org/packages/6a/c9/e4df32a2ad1cb7f3a112d0ed610c5e43c89da80b63c60d60e3dc23793ec0/av-14.4.0-cp312-cp312-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:a587b5c5014c3c0e16143a0f8d99874e46b5d0c50db6111aa0b54206b5687c81", size = 32364330 },
{ url = "https://files.pythonhosted.org/packages/ca/f0/64e7444a41817fde49a07d0239c033f7e9280bec4a4bb4784f5c79af95e6/av-14.4.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:10d53f75e8ac1ec8877a551c0db32a83c0aaeae719d05285281eaaba211bbc30", size = 35519508 },
{ url = "https://files.pythonhosted.org/packages/c2/a8/a370099daa9033a3b6f9b9bd815304b3d8396907a14d09845f27467ba138/av-14.4.0-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:c8558cfde79dd8fc92d97c70e0f0fa8c94c7a66f68ae73afdf58598f0fe5e10d", size = 36448593 },
{ url = "https://files.pythonhosted.org/packages/27/bb/edb6ceff8fa7259cb6330c51dbfbc98dd1912bd6eb5f7bc05a4bb14a9d6e/av-14.4.0-cp312-cp312-musllinux_1_2_i686.whl", hash = "sha256:455b6410dea0ab2d30234ffb28df7d62ca3cdf10708528e247bec3a4cdcced09", size = 34701485 },
{ url = "https://files.pythonhosted.org/packages/a7/8a/957da1f581aa1faa9a5dfa8b47ca955edb47f2b76b949950933b457bfa1d/av-14.4.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:1661efbe9d975f927b8512d654704223d936f39016fad2ddab00aee7c40f412c", size = 37521981 },
{ url = "https://files.pythonhosted.org/packages/28/76/3f1cf0568592f100fd68eb40ed8c491ce95ca3c1378cc2d4c1f6d1bd295d/av-14.4.0-cp312-cp312-win_amd64.whl", hash = "sha256:fbbeef1f421a3461086853d6464ad5526b56ffe8ccb0ab3fd0a1f121dfbf26ad", size = 27925944 },
{ url = "https://files.pythonhosted.org/packages/48/d0/b71b65d1b36520dcb8291a2307d98b7fc12329a45614a303ff92ada4d723/av-16.1.0-cp311-cp311-macosx_11_0_x86_64.whl", hash = "sha256:e88ad64ee9d2b9c4c5d891f16c22ae78e725188b8926eb88187538d9dd0b232f", size = 26927747 },
{ url = "https://files.pythonhosted.org/packages/2f/79/720a5a6ccdee06eafa211b945b0a450e3a0b8fc3d12922f0f3c454d870d2/av-16.1.0-cp311-cp311-macosx_14_0_arm64.whl", hash = "sha256:cb296073fa6935724de72593800ba86ae49ed48af03960a4aee34f8a611f442b", size = 21492232 },
{ url = "https://files.pythonhosted.org/packages/8e/4f/a1ba8d922f2f6d1a3d52419463ef26dd6c4d43ee364164a71b424b5ae204/av-16.1.0-cp311-cp311-manylinux_2_28_aarch64.whl", hash = "sha256:720edd4d25aa73723c1532bb0597806d7b9af5ee34fc02358782c358cfe2f879", size = 39291737 },
{ url = "https://files.pythonhosted.org/packages/1a/31/fc62b9fe8738d2693e18d99f040b219e26e8df894c10d065f27c6b4f07e3/av-16.1.0-cp311-cp311-manylinux_2_28_x86_64.whl", hash = "sha256:c7f2bc703d0df260a1fdf4de4253c7f5500ca9fc57772ea241b0cb241bcf972e", size = 40846822 },
{ url = "https://files.pythonhosted.org/packages/53/10/ab446583dbce730000e8e6beec6ec3c2753e628c7f78f334a35cad0317f4/av-16.1.0-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:d69c393809babada7d54964d56099e4b30a3e1f8b5736ca5e27bd7be0e0f3c83", size = 40675604 },
{ url = "https://files.pythonhosted.org/packages/31/d7/1003be685277005f6d63fd9e64904ee222fe1f7a0ea70af313468bb597db/av-16.1.0-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:441892be28582356d53f282873c5a951592daaf71642c7f20165e3ddcb0b4c63", size = 42015955 },
{ url = "https://files.pythonhosted.org/packages/2f/4a/fa2a38ee9306bf4579f556f94ecbc757520652eb91294d2a99c7cf7623b9/av-16.1.0-cp311-cp311-win_amd64.whl", hash = "sha256:273a3e32de64819e4a1cd96341824299fe06f70c46f2288b5dc4173944f0fd62", size = 31750339 },
{ url = "https://files.pythonhosted.org/packages/9c/84/2535f55edcd426cebec02eb37b811b1b0c163f26b8d3f53b059e2ec32665/av-16.1.0-cp312-cp312-macosx_11_0_x86_64.whl", hash = "sha256:640f57b93f927fba8689f6966c956737ee95388a91bd0b8c8b5e0481f73513d6", size = 26945785 },
{ url = "https://files.pythonhosted.org/packages/b6/17/ffb940c9e490bf42e86db4db1ff426ee1559cd355a69609ec1efe4d3a9eb/av-16.1.0-cp312-cp312-macosx_14_0_arm64.whl", hash = "sha256:ae3fb658eec00852ebd7412fdc141f17f3ddce8afee2d2e1cf366263ad2a3b35", size = 21481147 },
{ url = "https://files.pythonhosted.org/packages/15/c1/e0d58003d2d83c3921887d5c8c9b8f5f7de9b58dc2194356a2656a45cfdc/av-16.1.0-cp312-cp312-manylinux_2_28_aarch64.whl", hash = "sha256:27ee558d9c02a142eebcbe55578a6d817fedfde42ff5676275504e16d07a7f86", size = 39517197 },
{ url = "https://files.pythonhosted.org/packages/32/77/787797b43475d1b90626af76f80bfb0c12cfec5e11eafcfc4151b8c80218/av-16.1.0-cp312-cp312-manylinux_2_28_x86_64.whl", hash = "sha256:7ae547f6d5fa31763f73900d43901e8c5fa6367bb9a9840978d57b5a7ae14ed2", size = 41174337 },
{ url = "https://files.pythonhosted.org/packages/8e/ac/d90df7f1e3b97fc5554cf45076df5045f1e0a6adf13899e10121229b826c/av-16.1.0-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:8cf065f9d438e1921dc31fc7aa045790b58aee71736897866420d80b5450f62a", size = 40817720 },
{ url = "https://files.pythonhosted.org/packages/80/6f/13c3a35f9dbcebafd03fe0c4cbd075d71ac8968ec849a3cfce406c35a9d2/av-16.1.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:a345877a9d3cc0f08e2bc4ec163ee83176864b92587afb9d08dff50f37a9a829", size = 42267396 },
{ url = "https://files.pythonhosted.org/packages/c8/b9/275df9607f7fb44317ccb1d4be74827185c0d410f52b6e2cd770fe209118/av-16.1.0-cp312-cp312-win_amd64.whl", hash = "sha256:f49243b1d27c91cd8c66fdba90a674e344eb8eb917264f36117bf2b6879118fd", size = 31752045 },
]
[[package]]
@@ -608,56 +582,6 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/a7/06/3d6badcf13db419e25b07041d9c7b4a2c331d3f4e7134445ec5df57714cd/coloredlogs-15.0.1-py2.py3-none-any.whl", hash = "sha256:612ee75c546f53e92e70049c9dbfcc18c935a2b9a53b66085ce9ef6a6e5c0934", size = 46018 },
]
[[package]]
name = "colorlog"
version = "6.9.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "colorama", marker = "sys_platform == 'win32'" },
]
sdist = { url = "https://files.pythonhosted.org/packages/d3/7a/359f4d5df2353f26172b3cc39ea32daa39af8de522205f512f458923e677/colorlog-6.9.0.tar.gz", hash = "sha256:bfba54a1b93b94f54e1f4fe48395725a3d92fd2a4af702f6bd70946bdc0c6ac2", size = 16624 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/e3/51/9b208e85196941db2f0654ad0357ca6388ab3ed67efdbfc799f35d1f83aa/colorlog-6.9.0-py3-none-any.whl", hash = "sha256:5906e71acd67cb07a71e779c47c4bcb45fb8c2993eebe9e5adcd6a6f1b283eff", size = 11424 },
]
[[package]]
name = "contourpy"
version = "1.3.3"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "numpy" },
]
sdist = { url = "https://files.pythonhosted.org/packages/58/01/1253e6698a07380cd31a736d248a3f2a50a7c88779a1813da27503cadc2a/contourpy-1.3.3.tar.gz", hash = "sha256:083e12155b210502d0bca491432bb04d56dc3432f95a979b429f2848c3dbe880", size = 13466174 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/91/2e/c4390a31919d8a78b90e8ecf87cd4b4c4f05a5b48d05ec17db8e5404c6f4/contourpy-1.3.3-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:709a48ef9a690e1343202916450bc48b9e51c049b089c7f79a267b46cffcdaa1", size = 288773 },
{ url = "https://files.pythonhosted.org/packages/0d/44/c4b0b6095fef4dc9c420e041799591e3b63e9619e3044f7f4f6c21c0ab24/contourpy-1.3.3-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:23416f38bfd74d5d28ab8429cc4d63fa67d5068bd711a85edb1c3fb0c3e2f381", size = 270149 },
{ url = "https://files.pythonhosted.org/packages/30/2e/dd4ced42fefac8470661d7cb7e264808425e6c5d56d175291e93890cce09/contourpy-1.3.3-cp311-cp311-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:929ddf8c4c7f348e4c0a5a3a714b5c8542ffaa8c22954862a46ca1813b667ee7", size = 329222 },
{ url = "https://files.pythonhosted.org/packages/f2/74/cc6ec2548e3d276c71389ea4802a774b7aa3558223b7bade3f25787fafc2/contourpy-1.3.3-cp311-cp311-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:9e999574eddae35f1312c2b4b717b7885d4edd6cb46700e04f7f02db454e67c1", size = 377234 },
{ url = "https://files.pythonhosted.org/packages/03/b3/64ef723029f917410f75c09da54254c5f9ea90ef89b143ccadb09df14c15/contourpy-1.3.3-cp311-cp311-manylinux_2_26_s390x.manylinux_2_28_s390x.whl", hash = "sha256:0bf67e0e3f482cb69779dd3061b534eb35ac9b17f163d851e2a547d56dba0a3a", size = 380555 },
{ url = "https://files.pythonhosted.org/packages/5f/4b/6157f24ca425b89fe2eb7e7be642375711ab671135be21e6faa100f7448c/contourpy-1.3.3-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:51e79c1f7470158e838808d4a996fa9bac72c498e93d8ebe5119bc1e6becb0db", size = 355238 },
{ url = "https://files.pythonhosted.org/packages/98/56/f914f0dd678480708a04cfd2206e7c382533249bc5001eb9f58aa693e200/contourpy-1.3.3-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:598c3aaece21c503615fd59c92a3598b428b2f01bfb4b8ca9c4edeecc2438620", size = 1326218 },
{ url = "https://files.pythonhosted.org/packages/fb/d7/4a972334a0c971acd5172389671113ae82aa7527073980c38d5868ff1161/contourpy-1.3.3-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:322ab1c99b008dad206d406bb61d014cf0174df491ae9d9d0fac6a6fda4f977f", size = 1392867 },
{ url = "https://files.pythonhosted.org/packages/75/3e/f2cc6cd56dc8cff46b1a56232eabc6feea52720083ea71ab15523daab796/contourpy-1.3.3-cp311-cp311-win32.whl", hash = "sha256:fd907ae12cd483cd83e414b12941c632a969171bf90fc937d0c9f268a31cafff", size = 183677 },
{ url = "https://files.pythonhosted.org/packages/98/4b/9bd370b004b5c9d8045c6c33cf65bae018b27aca550a3f657cdc99acdbd8/contourpy-1.3.3-cp311-cp311-win_amd64.whl", hash = "sha256:3519428f6be58431c56581f1694ba8e50626f2dd550af225f82fb5f5814d2a42", size = 225234 },
{ url = "https://files.pythonhosted.org/packages/d9/b6/71771e02c2e004450c12b1120a5f488cad2e4d5b590b1af8bad060360fe4/contourpy-1.3.3-cp311-cp311-win_arm64.whl", hash = "sha256:15ff10bfada4bf92ec8b31c62bf7c1834c244019b4a33095a68000d7075df470", size = 193123 },
{ url = "https://files.pythonhosted.org/packages/be/45/adfee365d9ea3d853550b2e735f9d66366701c65db7855cd07621732ccfc/contourpy-1.3.3-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:b08a32ea2f8e42cf1d4be3169a98dd4be32bafe4f22b6c4cb4ba810fa9e5d2cb", size = 293419 },
{ url = "https://files.pythonhosted.org/packages/53/3e/405b59cfa13021a56bba395a6b3aca8cec012b45bf177b0eaf7a202cde2c/contourpy-1.3.3-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:556dba8fb6f5d8742f2923fe9457dbdd51e1049c4a43fd3986a0b14a1d815fc6", size = 273979 },
{ url = "https://files.pythonhosted.org/packages/d4/1c/a12359b9b2ca3a845e8f7f9ac08bdf776114eb931392fcad91743e2ea17b/contourpy-1.3.3-cp312-cp312-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:92d9abc807cf7d0e047b95ca5d957cf4792fcd04e920ca70d48add15c1a90ea7", size = 332653 },
{ url = "https://files.pythonhosted.org/packages/63/12/897aeebfb475b7748ea67b61e045accdfcf0d971f8a588b67108ed7f5512/contourpy-1.3.3-cp312-cp312-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:b2e8faa0ed68cb29af51edd8e24798bb661eac3bd9f65420c1887b6ca89987c8", size = 379536 },
{ url = "https://files.pythonhosted.org/packages/43/8a/a8c584b82deb248930ce069e71576fc09bd7174bbd35183b7943fb1064fd/contourpy-1.3.3-cp312-cp312-manylinux_2_26_s390x.manylinux_2_28_s390x.whl", hash = "sha256:626d60935cf668e70a5ce6ff184fd713e9683fb458898e4249b63be9e28286ea", size = 384397 },
{ url = "https://files.pythonhosted.org/packages/cc/8f/ec6289987824b29529d0dfda0d74a07cec60e54b9c92f3c9da4c0ac732de/contourpy-1.3.3-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:4d00e655fcef08aba35ec9610536bfe90267d7ab5ba944f7032549c55a146da1", size = 362601 },
{ url = "https://files.pythonhosted.org/packages/05/0a/a3fe3be3ee2dceb3e615ebb4df97ae6f3828aa915d3e10549ce016302bd1/contourpy-1.3.3-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:451e71b5a7d597379ef572de31eeb909a87246974d960049a9848c3bc6c41bf7", size = 1331288 },
{ url = "https://files.pythonhosted.org/packages/33/1d/acad9bd4e97f13f3e2b18a3977fe1b4a37ecf3d38d815333980c6c72e963/contourpy-1.3.3-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:459c1f020cd59fcfe6650180678a9993932d80d44ccde1fa1868977438f0b411", size = 1403386 },
{ url = "https://files.pythonhosted.org/packages/cf/8f/5847f44a7fddf859704217a99a23a4f6417b10e5ab1256a179264561540e/contourpy-1.3.3-cp312-cp312-win32.whl", hash = "sha256:023b44101dfe49d7d53932be418477dba359649246075c996866106da069af69", size = 185018 },
{ url = "https://files.pythonhosted.org/packages/19/e8/6026ed58a64563186a9ee3f29f41261fd1828f527dd93d33b60feca63352/contourpy-1.3.3-cp312-cp312-win_amd64.whl", hash = "sha256:8153b8bfc11e1e4d75bcb0bff1db232f9e10b274e0929de9d608027e0d34ff8b", size = 226567 },
{ url = "https://files.pythonhosted.org/packages/d1/e2/f05240d2c39a1ed228d8328a78b6f44cd695f7ef47beb3e684cf93604f86/contourpy-1.3.3-cp312-cp312-win_arm64.whl", hash = "sha256:07ce5ed73ecdc4a03ffe3e1b3e3c1166db35ae7584be76f65dbbe28a7791b0cc", size = 193655 },
{ url = "https://files.pythonhosted.org/packages/a5/29/8dcfe16f0107943fa92388c23f6e05cff0ba58058c4c95b00280d4c75a14/contourpy-1.3.3-pp311-pypy311_pp73-macosx_10_15_x86_64.whl", hash = "sha256:cd5dfcaeb10f7b7f9dc8941717c6c2ade08f587be2226222c12b25f0483ed497", size = 278809 },
{ url = "https://files.pythonhosted.org/packages/85/a9/8b37ef4f7dafeb335daee3c8254645ef5725be4d9c6aa70b50ec46ef2f7e/contourpy-1.3.3-pp311-pypy311_pp73-macosx_11_0_arm64.whl", hash = "sha256:0c1fc238306b35f246d61a1d416a627348b5cf0648648a031e14bb8705fcdfe8", size = 261593 },
{ url = "https://files.pythonhosted.org/packages/0a/59/ebfb8c677c75605cc27f7122c90313fd2f375ff3c8d19a1694bda74aaa63/contourpy-1.3.3-pp311-pypy311_pp73-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:70f9aad7de812d6541d29d2bbf8feb22ff7e1c299523db288004e3157ff4674e", size = 302202 },
{ url = "https://files.pythonhosted.org/packages/3c/37/21972a15834d90bfbfb009b9d004779bd5a07a0ec0234e5ba8f64d5736f4/contourpy-1.3.3-pp311-pypy311_pp73-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:5ed3657edf08512fc3fe81b510e35c2012fbd3081d2e26160f27ca28affec989", size = 329207 },
{ url = "https://files.pythonhosted.org/packages/0c/58/bd257695f39d05594ca4ad60df5bcb7e32247f9951fd09a9b8edb82d1daa/contourpy-1.3.3-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:3d1a3799d62d45c18bafd41c5fa05120b96a28079f2393af559b843d1a966a77", size = 225315 },
]
[[package]]
name = "coverage"
version = "7.9.2"
@@ -758,15 +682,6 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/ec/4c/0ecd260233290bee4b2facec4d8e755e57d8781d68f276e1248433993c9f/ctranslate2-4.6.0-cp312-cp312-win_amd64.whl", hash = "sha256:511cdf810a5bf6a2cec735799e5cd47966e63f8f7688fdee1b97fed621abda00", size = 19470040 },
]
[[package]]
name = "cycler"
version = "0.12.1"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/a9/95/a3dbbb5028f35eafb79008e7522a75244477d2838f38cbb722248dabc2a8/cycler-0.12.1.tar.gz", hash = "sha256:88bb128f02ba341da8ef447245a9e138fae777f6a23943da4540077d3601eb1c", size = 7615 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/e7/05/c19819d5e3d95294a6f5947fb9b9629efb316b96de511b418c53d245aae6/cycler-0.12.1-py3-none-any.whl", hash = "sha256:85cef7cff222d8644161529808465972e51340599459b8ac3ccbac5a854e0d30", size = 8321 },
]
[[package]]
name = "databases"
version = "0.8.0"
@@ -879,12 +794,6 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/e3/26/57c6fb270950d476074c087527a558ccb6f4436657314bfb6cdf484114c4/docker-7.1.0-py3-none-any.whl", hash = "sha256:c96b93b7f0a746f9e77d325bcfb87422a3d8bd4f03136ae8a85b37f1898d5fc0", size = 147774 },
]
[[package]]
name = "docopt"
version = "0.6.2"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/a2/55/8f8cab2afd404cf578136ef2cc5dfb50baa1761b68c9da1fb1e4eed343c9/docopt-0.6.2.tar.gz", hash = "sha256:49b3a825280bd66b3aa83585ef59c4a8c82f2c8a522dbe754a8bc8d08c85c491", size = 25901 }
[[package]]
name = "ecdsa"
version = "0.19.1"
@@ -897,15 +806,6 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/cb/a3/460c57f094a4a165c84a1341c373b0a4f5ec6ac244b998d5021aade89b77/ecdsa-0.19.1-py2.py3-none-any.whl", hash = "sha256:30638e27cf77b7e15c4c4cc1973720149e1033827cfd00661ca5c8cc0cdb24c3", size = 150607 },
]
[[package]]
name = "einops"
version = "0.8.1"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/e5/81/df4fbe24dff8ba3934af99044188e20a98ed441ad17a274539b74e82e126/einops-0.8.1.tar.gz", hash = "sha256:de5d960a7a761225532e0f1959e5315ebeafc0cd43394732f103ca44b9837e84", size = 54805 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/87/62/9773de14fe6c45c23649e98b83231fffd7b9892b6cf863251dc2afa73643/einops-0.8.1-py3-none-any.whl", hash = "sha256:919387eb55330f5757c6bea9165c5ff5cfe63a642682ea788a6d472576d81737", size = 64359 },
]
[[package]]
name = "email-validator"
version = "2.2.0"
@@ -1039,31 +939,6 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/b8/25/155f9f080d5e4bc0082edfda032ea2bc2b8fab3f4d25d46c1e9dd22a1a89/flatbuffers-25.2.10-py2.py3-none-any.whl", hash = "sha256:ebba5f4d5ea615af3f7fd70fc310636fbb2bbd1f566ac0a23d98dd412de50051", size = 30953 },
]
[[package]]
name = "fonttools"
version = "4.59.2"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/0d/a5/fba25f9fbdab96e26dedcaeeba125e5f05a09043bf888e0305326e55685b/fonttools-4.59.2.tar.gz", hash = "sha256:e72c0749b06113f50bcb80332364c6be83a9582d6e3db3fe0b280f996dc2ef22", size = 3540889 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/f8/53/742fcd750ae0bdc74de4c0ff923111199cc2f90a4ee87aaddad505b6f477/fonttools-4.59.2-cp311-cp311-macosx_10_9_universal2.whl", hash = "sha256:511946e8d7ea5c0d6c7a53c4cb3ee48eda9ab9797cd9bf5d95829a398400354f", size = 2774961 },
{ url = "https://files.pythonhosted.org/packages/57/2a/976f5f9fa3b4dd911dc58d07358467bec20e813d933bc5d3db1a955dd456/fonttools-4.59.2-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:8e5e2682cf7be766d84f462ba8828d01e00c8751a8e8e7ce12d7784ccb69a30d", size = 2344690 },
{ url = "https://files.pythonhosted.org/packages/c1/8f/b7eefc274fcf370911e292e95565c8253b0b87c82a53919ab3c795a4f50e/fonttools-4.59.2-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:5729e12a982dba3eeae650de48b06f3b9ddb51e9aee2fcaf195b7d09a96250e2", size = 5026910 },
{ url = "https://files.pythonhosted.org/packages/69/95/864726eaa8f9d4e053d0c462e64d5830ec7c599cbdf1db9e40f25ca3972e/fonttools-4.59.2-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:c52694eae5d652361d59ecdb5a2246bff7cff13b6367a12da8499e9df56d148d", size = 4971031 },
{ url = "https://files.pythonhosted.org/packages/24/4c/b8c4735ebdea20696277c70c79e0de615dbe477834e5a7c2569aa1db4033/fonttools-4.59.2-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:f1f1bbc23ba1312bd8959896f46f667753b90216852d2a8cfa2d07e0cb234144", size = 5006112 },
{ url = "https://files.pythonhosted.org/packages/3b/23/f9ea29c292aa2fc1ea381b2e5621ac436d5e3e0a5dee24ffe5404e58eae8/fonttools-4.59.2-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:1a1bfe5378962825dabe741720885e8b9ae9745ec7ecc4a5ec1f1ce59a6062bf", size = 5117671 },
{ url = "https://files.pythonhosted.org/packages/ba/07/cfea304c555bf06e86071ff2a3916bc90f7c07ec85b23bab758d4908c33d/fonttools-4.59.2-cp311-cp311-win32.whl", hash = "sha256:e937790f3c2c18a1cbc7da101550a84319eb48023a715914477d2e7faeaba570", size = 2218157 },
{ url = "https://files.pythonhosted.org/packages/d7/de/35d839aa69db737a3f9f3a45000ca24721834d40118652a5775d5eca8ebb/fonttools-4.59.2-cp311-cp311-win_amd64.whl", hash = "sha256:9836394e2f4ce5f9c0a7690ee93bd90aa1adc6b054f1a57b562c5d242c903104", size = 2265846 },
{ url = "https://files.pythonhosted.org/packages/ba/3d/1f45db2df51e7bfa55492e8f23f383d372200be3a0ded4bf56a92753dd1f/fonttools-4.59.2-cp312-cp312-macosx_10_13_universal2.whl", hash = "sha256:82906d002c349cad647a7634b004825a7335f8159d0d035ae89253b4abf6f3ea", size = 2769711 },
{ url = "https://files.pythonhosted.org/packages/29/df/cd236ab32a8abfd11558f296e064424258db5edefd1279ffdbcfd4fd8b76/fonttools-4.59.2-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:a10c1bd7644dc58f8862d8ba0cf9fb7fef0af01ea184ba6ce3f50ab7dfe74d5a", size = 2340225 },
{ url = "https://files.pythonhosted.org/packages/98/12/b6f9f964fe6d4b4dd4406bcbd3328821c3de1f909ffc3ffa558fe72af48c/fonttools-4.59.2-cp312-cp312-manylinux1_x86_64.manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:738f31f23e0339785fd67652a94bc69ea49e413dfdb14dcb8c8ff383d249464e", size = 4912766 },
{ url = "https://files.pythonhosted.org/packages/73/78/82bde2f2d2c306ef3909b927363170b83df96171f74e0ccb47ad344563cd/fonttools-4.59.2-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:0ec99f9bdfee9cdb4a9172f9e8fd578cce5feb231f598909e0aecf5418da4f25", size = 4955178 },
{ url = "https://files.pythonhosted.org/packages/92/77/7de766afe2d31dda8ee46d7e479f35c7d48747e558961489a2d6e3a02bd4/fonttools-4.59.2-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:0476ea74161322e08c7a982f83558a2b81b491509984523a1a540baf8611cc31", size = 4897898 },
{ url = "https://files.pythonhosted.org/packages/c5/77/ce0e0b905d62a06415fda9f2b2e109a24a5db54a59502b769e9e297d2242/fonttools-4.59.2-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:95922a922daa1f77cc72611747c156cfb38030ead72436a2c551d30ecef519b9", size = 5049144 },
{ url = "https://files.pythonhosted.org/packages/d9/ea/870d93aefd23fff2e07cbeebdc332527868422a433c64062c09d4d5e7fe6/fonttools-4.59.2-cp312-cp312-win32.whl", hash = "sha256:39ad9612c6a622726a6a130e8ab15794558591f999673f1ee7d2f3d30f6a3e1c", size = 2206473 },
{ url = "https://files.pythonhosted.org/packages/61/c4/e44bad000c4a4bb2e9ca11491d266e857df98ab6d7428441b173f0fe2517/fonttools-4.59.2-cp312-cp312-win_amd64.whl", hash = "sha256:980fd7388e461b19a881d35013fec32c713ffea1fc37aef2f77d11f332dfd7da", size = 2254706 },
{ url = "https://files.pythonhosted.org/packages/65/a4/d2f7be3c86708912c02571db0b550121caab8cd88a3c0aacb9cfa15ea66e/fonttools-4.59.2-py3-none-any.whl", hash = "sha256:8bd0f759020e87bb5d323e6283914d9bf4ae35a7307dafb2cbd1e379e720ad37", size = 1132315 },
]
[[package]]
name = "frozenlist"
version = "1.7.0"
@@ -1116,11 +991,6 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/2f/e0/014d5d9d7a4564cf1c40b5039bc882db69fd881111e03ab3657ac0b218e2/fsspec-2025.7.0-py3-none-any.whl", hash = "sha256:8b012e39f63c7d5f10474de957f3ab793b47b45ae7d39f2fb735f8bbe25c0e21", size = 199597 },
]
[package.optional-dependencies]
http = [
{ name = "aiohttp" },
]
[[package]]
name = "google-crc32c"
version = "1.7.1"
@@ -1385,19 +1255,6 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/f0/0f/310fb31e39e2d734ccaa2c0fb981ee41f7bd5056ce9bc29b2248bd569169/humanfriendly-10.0-py2.py3-none-any.whl", hash = "sha256:1697e1a8a8f550fd43c2865cd84542fc175a61dcb779b6fee18cf6b6ccba1477", size = 86794 },
]
[[package]]
name = "hyperpyyaml"
version = "1.2.2"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "pyyaml" },
{ name = "ruamel-yaml" },
]
sdist = { url = "https://files.pythonhosted.org/packages/52/e3/3ac46d9a662b037f699a6948b39c8d03bfcff0b592335d5953ba0c55d453/HyperPyYAML-1.2.2.tar.gz", hash = "sha256:bdb734210d18770a262f500fe5755c7a44a5d3b91521b06e24f7a00a36ee0f87", size = 17085 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/33/c9/751b6401887f4b50f9307cc1e53d287b3dc77c375c126aeb6335aff73ccb/HyperPyYAML-1.2.2-py3-none-any.whl", hash = "sha256:3c5864bdc8864b2f0fbd7bc495e7e8fdf2dfd5dd80116f72da27ca96a128bdeb", size = 16118 },
]
[[package]]
name = "icalendar"
version = "6.3.1"
@@ -1540,55 +1397,6 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/01/0e/b27cdbaccf30b890c40ed1da9fd4a3593a5cf94dae54fb34f8a4b74fcd3f/jsonschema_specifications-2025.4.1-py3-none-any.whl", hash = "sha256:4653bffbd6584f7de83a67e0d620ef16900b390ddc7939d56684d6c81e33f1af", size = 18437 },
]
[[package]]
name = "julius"
version = "0.2.7"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "torch", version = "2.8.0", source = { registry = "https://download.pytorch.org/whl/cpu" }, marker = "sys_platform == 'darwin'" },
{ name = "torch", version = "2.8.0+cpu", source = { registry = "https://download.pytorch.org/whl/cpu" }, marker = "sys_platform != 'darwin'" },
]
sdist = { url = "https://files.pythonhosted.org/packages/a1/19/c9e1596b5572c786b93428d0904280e964c930fae7e6c9368ed9e1b63922/julius-0.2.7.tar.gz", hash = "sha256:3c0f5f5306d7d6016fcc95196b274cae6f07e2c9596eed314e4e7641554fbb08", size = 59640 }
[[package]]
name = "kiwisolver"
version = "1.4.9"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/5c/3c/85844f1b0feb11ee581ac23fe5fce65cd049a200c1446708cc1b7f922875/kiwisolver-1.4.9.tar.gz", hash = "sha256:c3b22c26c6fd6811b0ae8363b95ca8ce4ea3c202d3d0975b2914310ceb1bcc4d", size = 97564 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/6f/ab/c80b0d5a9d8a1a65f4f815f2afff9798b12c3b9f31f1d304dd233dd920e2/kiwisolver-1.4.9-cp311-cp311-macosx_10_9_universal2.whl", hash = "sha256:eb14a5da6dc7642b0f3a18f13654847cd8b7a2550e2645a5bda677862b03ba16", size = 124167 },
{ url = "https://files.pythonhosted.org/packages/a0/c0/27fe1a68a39cf62472a300e2879ffc13c0538546c359b86f149cc19f6ac3/kiwisolver-1.4.9-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:39a219e1c81ae3b103643d2aedb90f1ef22650deb266ff12a19e7773f3e5f089", size = 66579 },
{ url = "https://files.pythonhosted.org/packages/31/a2/a12a503ac1fd4943c50f9822678e8015a790a13b5490354c68afb8489814/kiwisolver-1.4.9-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:2405a7d98604b87f3fc28b1716783534b1b4b8510d8142adca34ee0bc3c87543", size = 65309 },
{ url = "https://files.pythonhosted.org/packages/66/e1/e533435c0be77c3f64040d68d7a657771194a63c279f55573188161e81ca/kiwisolver-1.4.9-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:dc1ae486f9abcef254b5618dfb4113dd49f94c68e3e027d03cf0143f3f772b61", size = 1435596 },
{ url = "https://files.pythonhosted.org/packages/67/1e/51b73c7347f9aabdc7215aa79e8b15299097dc2f8e67dee2b095faca9cb0/kiwisolver-1.4.9-cp311-cp311-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:8a1f570ce4d62d718dce3f179ee78dac3b545ac16c0c04bb363b7607a949c0d1", size = 1246548 },
{ url = "https://files.pythonhosted.org/packages/21/aa/72a1c5d1e430294f2d32adb9542719cfb441b5da368d09d268c7757af46c/kiwisolver-1.4.9-cp311-cp311-manylinux_2_24_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:cb27e7b78d716c591e88e0a09a2139c6577865d7f2e152488c2cc6257f460872", size = 1263618 },
{ url = "https://files.pythonhosted.org/packages/a3/af/db1509a9e79dbf4c260ce0cfa3903ea8945f6240e9e59d1e4deb731b1a40/kiwisolver-1.4.9-cp311-cp311-manylinux_2_24_s390x.manylinux_2_28_s390x.whl", hash = "sha256:15163165efc2f627eb9687ea5f3a28137217d217ac4024893d753f46bce9de26", size = 1317437 },
{ url = "https://files.pythonhosted.org/packages/e0/f2/3ea5ee5d52abacdd12013a94130436e19969fa183faa1e7c7fbc89e9a42f/kiwisolver-1.4.9-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:bdee92c56a71d2b24c33a7d4c2856bd6419d017e08caa7802d2963870e315028", size = 2195742 },
{ url = "https://files.pythonhosted.org/packages/6f/9b/1efdd3013c2d9a2566aa6a337e9923a00590c516add9a1e89a768a3eb2fc/kiwisolver-1.4.9-cp311-cp311-musllinux_1_2_ppc64le.whl", hash = "sha256:412f287c55a6f54b0650bd9b6dce5aceddb95864a1a90c87af16979d37c89771", size = 2290810 },
{ url = "https://files.pythonhosted.org/packages/fb/e5/cfdc36109ae4e67361f9bc5b41323648cb24a01b9ade18784657e022e65f/kiwisolver-1.4.9-cp311-cp311-musllinux_1_2_s390x.whl", hash = "sha256:2c93f00dcba2eea70af2be5f11a830a742fe6b579a1d4e00f47760ef13be247a", size = 2461579 },
{ url = "https://files.pythonhosted.org/packages/62/86/b589e5e86c7610842213994cdea5add00960076bef4ae290c5fa68589cac/kiwisolver-1.4.9-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:f117e1a089d9411663a3207ba874f31be9ac8eaa5b533787024dc07aeb74f464", size = 2268071 },
{ url = "https://files.pythonhosted.org/packages/3b/c6/f8df8509fd1eee6c622febe54384a96cfaf4d43bf2ccec7a0cc17e4715c9/kiwisolver-1.4.9-cp311-cp311-win_amd64.whl", hash = "sha256:be6a04e6c79819c9a8c2373317d19a96048e5a3f90bec587787e86a1153883c2", size = 73840 },
{ url = "https://files.pythonhosted.org/packages/e2/2d/16e0581daafd147bc11ac53f032a2b45eabac897f42a338d0a13c1e5c436/kiwisolver-1.4.9-cp311-cp311-win_arm64.whl", hash = "sha256:0ae37737256ba2de764ddc12aed4956460277f00c4996d51a197e72f62f5eec7", size = 65159 },
{ url = "https://files.pythonhosted.org/packages/86/c9/13573a747838aeb1c76e3267620daa054f4152444d1f3d1a2324b78255b5/kiwisolver-1.4.9-cp312-cp312-macosx_10_13_universal2.whl", hash = "sha256:ac5a486ac389dddcc5bef4f365b6ae3ffff2c433324fb38dd35e3fab7c957999", size = 123686 },
{ url = "https://files.pythonhosted.org/packages/51/ea/2ecf727927f103ffd1739271ca19c424d0e65ea473fbaeea1c014aea93f6/kiwisolver-1.4.9-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:f2ba92255faa7309d06fe44c3a4a97efe1c8d640c2a79a5ef728b685762a6fd2", size = 66460 },
{ url = "https://files.pythonhosted.org/packages/5b/5a/51f5464373ce2aeb5194508298a508b6f21d3867f499556263c64c621914/kiwisolver-1.4.9-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:4a2899935e724dd1074cb568ce7ac0dce28b2cd6ab539c8e001a8578eb106d14", size = 64952 },
{ url = "https://files.pythonhosted.org/packages/70/90/6d240beb0f24b74371762873e9b7f499f1e02166a2d9c5801f4dbf8fa12e/kiwisolver-1.4.9-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:f6008a4919fdbc0b0097089f67a1eb55d950ed7e90ce2cc3e640abadd2757a04", size = 1474756 },
{ url = "https://files.pythonhosted.org/packages/12/42/f36816eaf465220f683fb711efdd1bbf7a7005a2473d0e4ed421389bd26c/kiwisolver-1.4.9-cp312-cp312-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:67bb8b474b4181770f926f7b7d2f8c0248cbcb78b660fdd41a47054b28d2a752", size = 1276404 },
{ url = "https://files.pythonhosted.org/packages/2e/64/bc2de94800adc830c476dce44e9b40fd0809cddeef1fde9fcf0f73da301f/kiwisolver-1.4.9-cp312-cp312-manylinux_2_24_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:2327a4a30d3ee07d2fbe2e7933e8a37c591663b96ce42a00bc67461a87d7df77", size = 1294410 },
{ url = "https://files.pythonhosted.org/packages/5f/42/2dc82330a70aa8e55b6d395b11018045e58d0bb00834502bf11509f79091/kiwisolver-1.4.9-cp312-cp312-manylinux_2_24_s390x.manylinux_2_28_s390x.whl", hash = "sha256:7a08b491ec91b1d5053ac177afe5290adacf1f0f6307d771ccac5de30592d198", size = 1343631 },
{ url = "https://files.pythonhosted.org/packages/22/fd/f4c67a6ed1aab149ec5a8a401c323cee7a1cbe364381bb6c9c0d564e0e20/kiwisolver-1.4.9-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:d8fc5c867c22b828001b6a38d2eaeb88160bf5783c6cb4a5e440efc981ce286d", size = 2224963 },
{ url = "https://files.pythonhosted.org/packages/45/aa/76720bd4cb3713314677d9ec94dcc21ced3f1baf4830adde5bb9b2430a5f/kiwisolver-1.4.9-cp312-cp312-musllinux_1_2_ppc64le.whl", hash = "sha256:3b3115b2581ea35bb6d1f24a4c90af37e5d9b49dcff267eeed14c3893c5b86ab", size = 2321295 },
{ url = "https://files.pythonhosted.org/packages/80/19/d3ec0d9ab711242f56ae0dc2fc5d70e298bb4a1f9dfab44c027668c673a1/kiwisolver-1.4.9-cp312-cp312-musllinux_1_2_s390x.whl", hash = "sha256:858e4c22fb075920b96a291928cb7dea5644e94c0ee4fcd5af7e865655e4ccf2", size = 2487987 },
{ url = "https://files.pythonhosted.org/packages/39/e9/61e4813b2c97e86b6fdbd4dd824bf72d28bcd8d4849b8084a357bc0dd64d/kiwisolver-1.4.9-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:ed0fecd28cc62c54b262e3736f8bb2512d8dcfdc2bcf08be5f47f96bf405b145", size = 2291817 },
{ url = "https://files.pythonhosted.org/packages/a0/41/85d82b0291db7504da3c2defe35c9a8a5c9803a730f297bd823d11d5fb77/kiwisolver-1.4.9-cp312-cp312-win_amd64.whl", hash = "sha256:f68208a520c3d86ea51acf688a3e3002615a7f0238002cccc17affecc86a8a54", size = 73895 },
{ url = "https://files.pythonhosted.org/packages/e2/92/5f3068cf15ee5cb624a0c7596e67e2a0bb2adee33f71c379054a491d07da/kiwisolver-1.4.9-cp312-cp312-win_arm64.whl", hash = "sha256:2c1a4f57df73965f3f14df20b80ee29e6a7930a57d2d9e8491a25f676e197c60", size = 64992 },
{ url = "https://files.pythonhosted.org/packages/a3/0f/36d89194b5a32c054ce93e586d4049b6c2c22887b0eb229c61c68afd3078/kiwisolver-1.4.9-pp311-pypy311_pp73-macosx_10_15_x86_64.whl", hash = "sha256:720e05574713db64c356e86732c0f3c5252818d05f9df320f0ad8380641acea5", size = 60104 },
{ url = "https://files.pythonhosted.org/packages/52/ba/4ed75f59e4658fd21fe7dde1fee0ac397c678ec3befba3fe6482d987af87/kiwisolver-1.4.9-pp311-pypy311_pp73-macosx_11_0_arm64.whl", hash = "sha256:17680d737d5335b552994a2008fab4c851bcd7de33094a82067ef3a576ff02fa", size = 58592 },
{ url = "https://files.pythonhosted.org/packages/33/01/a8ea7c5ea32a9b45ceeaee051a04c8ed4320f5add3c51bfa20879b765b70/kiwisolver-1.4.9-pp311-pypy311_pp73-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:85b5352f94e490c028926ea567fc569c52ec79ce131dadb968d3853e809518c2", size = 80281 },
{ url = "https://files.pythonhosted.org/packages/da/e3/dbd2ecdce306f1d07a1aaf324817ee993aab7aee9db47ceac757deabafbe/kiwisolver-1.4.9-pp311-pypy311_pp73-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:464415881e4801295659462c49461a24fb107c140de781d55518c4b80cb6790f", size = 78009 },
{ url = "https://files.pythonhosted.org/packages/da/e9/0d4add7873a73e462aeb45c036a2dead2562b825aa46ba326727b3f31016/kiwisolver-1.4.9-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:fb940820c63a9590d31d88b815e7a3aa5915cad3ce735ab45f0c730b39547de1", size = 73929 },
]
[[package]]
name = "kombu"
version = "5.5.4"
@@ -1651,41 +1459,6 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/dc/1e/408fd10217eac0e43aea0604be22b4851a09e03d761d44d4ea12089dd70e/levenshtein-0.27.1-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:7987ef006a3cf56a4532bd4c90c2d3b7b4ca9ad3bf8ae1ee5713c4a3bdfda913", size = 98045 },
]
[[package]]
name = "lightning"
version = "2.5.5"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "fsspec", extra = ["http"] },
{ name = "lightning-utilities" },
{ name = "packaging" },
{ name = "pytorch-lightning" },
{ name = "pyyaml" },
{ name = "torch", version = "2.8.0", source = { registry = "https://download.pytorch.org/whl/cpu" }, marker = "sys_platform == 'darwin'" },
{ name = "torch", version = "2.8.0+cpu", source = { registry = "https://download.pytorch.org/whl/cpu" }, marker = "sys_platform != 'darwin'" },
{ name = "torchmetrics" },
{ name = "tqdm" },
{ name = "typing-extensions" },
]
sdist = { url = "https://files.pythonhosted.org/packages/0f/dd/86bb3bebadcdbc6e6e5a63657f0a03f74cd065b5ea965896679f76fec0b4/lightning-2.5.5.tar.gz", hash = "sha256:4d3d66c5b1481364a7e6a1ce8ddde1777a04fa740a3145ec218a9941aed7dd30", size = 640770 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/2e/d0/4b4fbafc3b18df91207a6e46782d9fd1905f9f45cb2c3b8dfbb239aef781/lightning-2.5.5-py3-none-any.whl", hash = "sha256:69eb248beadd7b600bf48eff00a0ec8af171ec7a678d23787c4aedf12e225e8f", size = 828490 },
]
[[package]]
name = "lightning-utilities"
version = "0.15.2"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "packaging" },
{ name = "setuptools" },
{ name = "typing-extensions" },
]
sdist = { url = "https://files.pythonhosted.org/packages/b8/39/6fc58ca81492db047149b4b8fd385aa1bfb8c28cd7cacb0c7eb0c44d842f/lightning_utilities-0.15.2.tar.gz", hash = "sha256:cdf12f530214a63dacefd713f180d1ecf5d165338101617b4742e8f22c032e24", size = 31090 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/de/73/3d757cb3fc16f0f9794dd289bcd0c4a031d9cf54d8137d6b984b2d02edf3/lightning_utilities-0.15.2-py3-none-any.whl", hash = "sha256:ad3ab1703775044bbf880dbf7ddaaac899396c96315f3aa1779cec9d618a9841", size = 29431 },
]
[[package]]
name = "llama-cloud"
version = "0.1.32"
@@ -2033,42 +1806,6 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/34/75/51952c7b2d3873b44a0028b1bd26a25078c18f92f256608e8d1dc61b39fd/marshmallow-3.26.1-py3-none-any.whl", hash = "sha256:3350409f20a70a7e4e11a27661187b77cdcaeb20abca41c1454fe33636bea09c", size = 50878 },
]
[[package]]
name = "matplotlib"
version = "3.10.6"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "contourpy" },
{ name = "cycler" },
{ name = "fonttools" },
{ name = "kiwisolver" },
{ name = "numpy" },
{ name = "packaging" },
{ name = "pillow" },
{ name = "pyparsing" },
{ name = "python-dateutil" },
]
sdist = { url = "https://files.pythonhosted.org/packages/a0/59/c3e6453a9676ffba145309a73c462bb407f4400de7de3f2b41af70720a3c/matplotlib-3.10.6.tar.gz", hash = "sha256:ec01b645840dd1996df21ee37f208cd8ba57644779fa20464010638013d3203c", size = 34804264 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/80/d6/5d3665aa44c49005aaacaa68ddea6fcb27345961cd538a98bb0177934ede/matplotlib-3.10.6-cp311-cp311-macosx_10_12_x86_64.whl", hash = "sha256:905b60d1cb0ee604ce65b297b61cf8be9f4e6cfecf95a3fe1c388b5266bc8f4f", size = 8257527 },
{ url = "https://files.pythonhosted.org/packages/8c/af/30ddefe19ca67eebd70047dabf50f899eaff6f3c5e6a1a7edaecaf63f794/matplotlib-3.10.6-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:7bac38d816637343e53d7185d0c66677ff30ffb131044a81898b5792c956ba76", size = 8119583 },
{ url = "https://files.pythonhosted.org/packages/d3/29/4a8650a3dcae97fa4f375d46efcb25920d67b512186f8a6788b896062a81/matplotlib-3.10.6-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:942a8de2b5bfff1de31d95722f702e2966b8a7e31f4e68f7cd963c7cd8861cf6", size = 8692682 },
{ url = "https://files.pythonhosted.org/packages/aa/d3/b793b9cb061cfd5d42ff0f69d1822f8d5dbc94e004618e48a97a8373179a/matplotlib-3.10.6-cp311-cp311-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:a3276c85370bc0dfca051ec65c5817d1e0f8f5ce1b7787528ec8ed2d524bbc2f", size = 9521065 },
{ url = "https://files.pythonhosted.org/packages/f7/c5/53de5629f223c1c66668d46ac2621961970d21916a4bc3862b174eb2a88f/matplotlib-3.10.6-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:9df5851b219225731f564e4b9e7f2ac1e13c9e6481f941b5631a0f8e2d9387ce", size = 9576888 },
{ url = "https://files.pythonhosted.org/packages/fc/8e/0a18d6d7d2d0a2e66585032a760d13662e5250c784d53ad50434e9560991/matplotlib-3.10.6-cp311-cp311-win_amd64.whl", hash = "sha256:abb5d9478625dd9c9eb51a06d39aae71eda749ae9b3138afb23eb38824026c7e", size = 8115158 },
{ url = "https://files.pythonhosted.org/packages/07/b3/1a5107bb66c261e23b9338070702597a2d374e5aa7004b7adfc754fbed02/matplotlib-3.10.6-cp311-cp311-win_arm64.whl", hash = "sha256:886f989ccfae63659183173bb3fced7fd65e9eb793c3cc21c273add368536951", size = 7992444 },
{ url = "https://files.pythonhosted.org/packages/ea/1a/7042f7430055d567cc3257ac409fcf608599ab27459457f13772c2d9778b/matplotlib-3.10.6-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:31ca662df6a80bd426f871105fdd69db7543e28e73a9f2afe80de7e531eb2347", size = 8272404 },
{ url = "https://files.pythonhosted.org/packages/a9/5d/1d5f33f5b43f4f9e69e6a5fe1fb9090936ae7bc8e2ff6158e7a76542633b/matplotlib-3.10.6-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:1678bb61d897bb4ac4757b5ecfb02bfb3fddf7f808000fb81e09c510712fda75", size = 8128262 },
{ url = "https://files.pythonhosted.org/packages/67/c3/135fdbbbf84e0979712df58e5e22b4f257b3f5e52a3c4aacf1b8abec0d09/matplotlib-3.10.6-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:56cd2d20842f58c03d2d6e6c1f1cf5548ad6f66b91e1e48f814e4fb5abd1cb95", size = 8697008 },
{ url = "https://files.pythonhosted.org/packages/9c/be/c443ea428fb2488a3ea7608714b1bd85a82738c45da21b447dc49e2f8e5d/matplotlib-3.10.6-cp312-cp312-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:662df55604a2f9a45435566d6e2660e41efe83cd94f4288dfbf1e6d1eae4b0bb", size = 9530166 },
{ url = "https://files.pythonhosted.org/packages/a9/35/48441422b044d74034aea2a3e0d1a49023f12150ebc58f16600132b9bbaf/matplotlib-3.10.6-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:08f141d55148cd1fc870c3387d70ca4df16dee10e909b3b038782bd4bda6ea07", size = 9593105 },
{ url = "https://files.pythonhosted.org/packages/45/c3/994ef20eb4154ab84cc08d033834555319e4af970165e6c8894050af0b3c/matplotlib-3.10.6-cp312-cp312-win_amd64.whl", hash = "sha256:590f5925c2d650b5c9d813c5b3b5fc53f2929c3f8ef463e4ecfa7e052044fb2b", size = 8122784 },
{ url = "https://files.pythonhosted.org/packages/57/b8/5c85d9ae0e40f04e71bedb053aada5d6bab1f9b5399a0937afb5d6b02d98/matplotlib-3.10.6-cp312-cp312-win_arm64.whl", hash = "sha256:f44c8d264a71609c79a78d50349e724f5d5fc3684ead7c2a473665ee63d868aa", size = 7992823 },
{ url = "https://files.pythonhosted.org/packages/12/bb/02c35a51484aae5f49bd29f091286e7af5f3f677a9736c58a92b3c78baeb/matplotlib-3.10.6-pp311-pypy311_pp73-macosx_10_15_x86_64.whl", hash = "sha256:f2d684c3204fa62421bbf770ddfebc6b50130f9cad65531eeba19236d73bb488", size = 8252296 },
{ url = "https://files.pythonhosted.org/packages/7d/85/41701e3092005aee9a2445f5ee3904d9dbd4a7df7a45905ffef29b7ef098/matplotlib-3.10.6-pp311-pypy311_pp73-macosx_11_0_arm64.whl", hash = "sha256:6f4a69196e663a41d12a728fab8751177215357906436804217d6d9cf0d4d6cf", size = 8116749 },
{ url = "https://files.pythonhosted.org/packages/16/53/8d8fa0ea32a8c8239e04d022f6c059ee5e1b77517769feccd50f1df43d6d/matplotlib-3.10.6-pp311-pypy311_pp73-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:4d6ca6ef03dfd269f4ead566ec6f3fb9becf8dab146fb999022ed85ee9f6b3eb", size = 8693933 },
]
[[package]]
name = "mdurl"
version = "0.1.2"
@@ -2210,19 +1947,6 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/48/6b/1c6b515a83d5564b1698a61efa245727c8feecf308f4091f565988519d20/numpy-2.3.1-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:e610832418a2bc09d974cc9fecebfa51e9532d6190223bc5ef6a7402ebf3b5cb", size = 12927246 },
]
[[package]]
name = "omegaconf"
version = "2.3.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "antlr4-python3-runtime" },
{ name = "pyyaml" },
]
sdist = { url = "https://files.pythonhosted.org/packages/09/48/6388f1bb9da707110532cb70ec4d2822858ddfb44f1cdf1233c20a80ea4b/omegaconf-2.3.0.tar.gz", hash = "sha256:d5d4b6d29955cc50ad50c46dc269bcd92c6e00f5f90d23ab5fee7bfca4ba4cc7", size = 3298120 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/e3/94/1843518e420fa3ed6919835845df698c7e27e183cb997394e4a670973a65/omegaconf-2.3.0-py3-none-any.whl", hash = "sha256:7b4df175cdb08ba400f45cae3bdcae7ba8365db4d165fc65fd04b050ab63b46b", size = 79500 },
]
[[package]]
name = "onnxruntime"
version = "1.22.1"
@@ -2265,24 +1989,6 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/8a/91/1f1cf577f745e956b276a8b1d3d76fa7a6ee0c2b05db3b001b900f2c71db/openai-1.97.0-py3-none-any.whl", hash = "sha256:a1c24d96f4609f3f7f51c9e1c2606d97cc6e334833438659cfd687e9c972c610", size = 764953 },
]
[[package]]
name = "optuna"
version = "4.5.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "alembic" },
{ name = "colorlog" },
{ name = "numpy" },
{ name = "packaging" },
{ name = "pyyaml" },
{ name = "sqlalchemy" },
{ name = "tqdm" },
]
sdist = { url = "https://files.pythonhosted.org/packages/53/a3/bcd1e5500de6ec794c085a277e5b624e60b4fac1790681d7cdbde25b93a2/optuna-4.5.0.tar.gz", hash = "sha256:264844da16dad744dea295057d8bc218646129c47567d52c35a201d9f99942ba", size = 472338 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/7f/12/cba81286cbaf0f0c3f0473846cfd992cb240bdcea816bf2ef7de8ed0f744/optuna-4.5.0-py3-none-any.whl", hash = "sha256:5b8a783e84e448b0742501bc27195344a28d2c77bd2feef5b558544d954851b0", size = 400872 },
]
[[package]]
name = "packaging"
version = "25.0"
@@ -2384,15 +2090,6 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/54/20/4d324d65cc6d9205fabedc306948156824eb9f0ee1633355a8f7ec5c66bf/pluggy-1.6.0-py3-none-any.whl", hash = "sha256:e920276dd6813095e9377c0bc5566d94c932c33b27a3e3945d8389c374dd4746", size = 20538 },
]
[[package]]
name = "primepy"
version = "1.3"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/35/77/0cfa1b4697cfb5336f3a96e8bc73327f64610be3a64c97275f1801afb395/primePy-1.3.tar.gz", hash = "sha256:25fd7e25344b0789a5984c75d89f054fcf1f180bef20c998e4befbac92de4669", size = 3914 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/74/c1/bb7e334135859c3a92ec399bc89293ea73f28e815e35b43929c8db6af030/primePy-1.3-py3-none-any.whl", hash = "sha256:5ed443718765be9bf7e2ff4c56cdff71b42140a15b39d054f9d99f0009e2317a", size = 4040 },
]
[[package]]
name = "prometheus-client"
version = "0.22.1"
@@ -2529,109 +2226,6 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/92/29/06261ea000e2dc1e22907dbbc483a1093665509ea586b29b8986a0e56733/psycopg2_binary-2.9.10-cp312-cp312-win_amd64.whl", hash = "sha256:18c5ee682b9c6dd3696dad6e54cc7ff3a1a9020df6a5c0f861ef8bfd338c3ca0", size = 1164031 },
]
[[package]]
name = "pyannote-audio"
version = "3.3.2"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "asteroid-filterbanks" },
{ name = "einops" },
{ name = "huggingface-hub" },
{ name = "lightning" },
{ name = "omegaconf" },
{ name = "pyannote-core" },
{ name = "pyannote-database" },
{ name = "pyannote-metrics" },
{ name = "pyannote-pipeline" },
{ name = "pytorch-metric-learning" },
{ name = "rich" },
{ name = "semver" },
{ name = "soundfile" },
{ name = "speechbrain" },
{ name = "tensorboardx" },
{ name = "torch", version = "2.8.0", source = { registry = "https://download.pytorch.org/whl/cpu" }, marker = "sys_platform == 'darwin'" },
{ name = "torch", version = "2.8.0+cpu", source = { registry = "https://download.pytorch.org/whl/cpu" }, marker = "sys_platform != 'darwin'" },
{ name = "torch-audiomentations" },
{ name = "torchaudio", version = "2.8.0", source = { registry = "https://download.pytorch.org/whl/cpu" }, marker = "(platform_machine == 'aarch64' and sys_platform == 'linux') or sys_platform == 'darwin'" },
{ name = "torchaudio", version = "2.8.0+cpu", source = { registry = "https://download.pytorch.org/whl/cpu" }, marker = "(platform_machine != 'aarch64' and sys_platform == 'linux') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
{ name = "torchmetrics" },
]
sdist = { url = "https://files.pythonhosted.org/packages/e9/00/3b96ca7ad0641e4f64cfaa2af153dc7da0998ff972280e1c1681b1fcc243/pyannote_audio-3.3.2.tar.gz", hash = "sha256:b2115e86b0db5faedb9f36ee1a150cebd07f7758e65e815accdac1a12ca9c777", size = 13664309 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/17/e6/76049470d90217f9a15a34abf3e92d782cabc3fb4ab27515c9baaa5495d1/pyannote.audio-3.3.2-py2.py3-none-any.whl", hash = "sha256:599c694acd5d193215147ff82d0bf638bb191204ed502bd9fde8ff582e20aa1c", size = 898707 },
{ url = "https://files.pythonhosted.org/packages/b7/9a/98a8992727e762b031ed30451d5726ece46cf8bb7b872a9dba5cef011e5d/pyannote_audio-3.3.2-py2.py3-none-any.whl", hash = "sha256:23e0dcedda920cb2e154e146bcd9663289ee7942d0e012663dad76f2e571ebeb", size = 897827 },
]
[[package]]
name = "pyannote-core"
version = "5.0.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "numpy" },
{ name = "scipy" },
{ name = "sortedcontainers" },
{ name = "typing-extensions" },
]
sdist = { url = "https://files.pythonhosted.org/packages/65/03/feaf7534206f02c75baf151ce4b8c322b402a6f477c2be82f69d9269cbe6/pyannote.core-5.0.0.tar.gz", hash = "sha256:1a55bcc8bd680ba6be5fa53efa3b6f3d2cdd67144c07b6b4d8d66d5cb0d2096f", size = 59247 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/84/c4/370bc8ba66815a5832ece753a1009388bb07ea353d21c83f2d5a1a436f2c/pyannote.core-5.0.0-py3-none-any.whl", hash = "sha256:04920a6754492242ce0dc6017545595ab643870fe69a994f20c1a5f2da0544d0", size = 58475 },
]
[[package]]
name = "pyannote-database"
version = "5.1.3"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "pandas" },
{ name = "pyannote-core" },
{ name = "pyyaml" },
{ name = "typer" },
]
sdist = { url = "https://files.pythonhosted.org/packages/a9/ae/de36413d69a46be87cb612ebbcdc4eacbeebce3bc809124603e44a88fe26/pyannote.database-5.1.3.tar.gz", hash = "sha256:0eaf64c1cc506718de60d2d702f1359b1ae7ff252ee3e4799f1c5e378cd52c31", size = 49957 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/a1/64/92d51a3a05615ba58be8ba62a43f9f9f952d9f3646f7e4fb7826e5a3a24e/pyannote.database-5.1.3-py3-none-any.whl", hash = "sha256:37887844c7dfbcc075cb591eddc00aff45fae1ed905344e1f43e0090e63bd40a", size = 48127 },
]
[[package]]
name = "pyannote-metrics"
version = "3.2.1"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "docopt" },
{ name = "matplotlib" },
{ name = "numpy" },
{ name = "pandas" },
{ name = "pyannote-core" },
{ name = "pyannote-database" },
{ name = "scikit-learn" },
{ name = "scipy" },
{ name = "sympy" },
{ name = "tabulate" },
]
sdist = { url = "https://files.pythonhosted.org/packages/39/2b/6c5f01d3c49aa1c160765946e23782ca6436ae8b9bc514b56319ff5f16e7/pyannote.metrics-3.2.1.tar.gz", hash = "sha256:08024255a3550e96a8e9da4f5f4af326886548480de891414567c8900920ee5c", size = 49086 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/6c/7d/035b370ab834b30e849fe9cd092b7bd7f321fcc4a2c56b84e96476b7ede5/pyannote.metrics-3.2.1-py3-none-any.whl", hash = "sha256:46be797cdade26c82773e5018659ae610145260069c7c5bf3d3c8a029ade8e22", size = 51386 },
]
[[package]]
name = "pyannote-pipeline"
version = "3.0.1"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "docopt" },
{ name = "filelock" },
{ name = "optuna" },
{ name = "pyannote-core" },
{ name = "pyannote-database" },
{ name = "pyyaml" },
{ name = "scikit-learn" },
{ name = "tqdm" },
]
sdist = { url = "https://files.pythonhosted.org/packages/35/04/4bcfe0dd588577a188328b806f3a7213d8cead0ce5fe5784d01fd57df93f/pyannote.pipeline-3.0.1.tar.gz", hash = "sha256:021794e26a2cf5d8fb5bb1835951e71f5fac33eb14e23dfb7468e16b1b805151", size = 34486 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/83/42/1bf7cbf061ed05c580bfb63bffdd3f3474cbd5c02bee4fac518eea9e9d9e/pyannote.pipeline-3.0.1-py3-none-any.whl", hash = "sha256:819bde4c4dd514f740f2373dfec794832b9fc8e346a35e43a7681625ee187393", size = 31517 },
]
[[package]]
name = "pyasn1"
version = "0.6.1"
@@ -2811,15 +2405,6 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/80/28/2659c02301b9500751f8d42f9a6632e1508aa5120de5e43042b8b30f8d5d/pyopenssl-25.1.0-py3-none-any.whl", hash = "sha256:2b11f239acc47ac2e5aca04fd7fa829800aeee22a2eb30d744572a157bd8a1ab", size = 56771 },
]
[[package]]
name = "pyparsing"
version = "3.2.3"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/bb/22/f1129e69d94ffff626bdb5c835506b3a5b4f3d070f17ea295e12c2c6f60f/pyparsing-3.2.3.tar.gz", hash = "sha256:b9c13f1ab8b3b542f72e28f634bad4de758ab3ce4546e4301970ad6fa77c38be", size = 1088608 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/05/e7/df2285f3d08fee213f2d041540fa4fc9ca6c2d44cf36d3a035bf2a8d2bcc/pyparsing-3.2.3-py3-none-any.whl", hash = "sha256:a749938e02d6fd0b59b356ca504a24982314bb090c383e3cf201c95ef7e2bfcf", size = 111120 },
]
[[package]]
name = "pypdf"
version = "5.8.0"
@@ -3027,42 +2612,6 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/45/58/38b5afbc1a800eeea951b9285d3912613f2603bdf897a4ab0f4bd7f405fc/python_multipart-0.0.20-py3-none-any.whl", hash = "sha256:8a62d3a8335e06589fe01f2a3e178cdcc632f3fbe0d492ad9ee0ec35aab1f104", size = 24546 },
]
[[package]]
name = "pytorch-lightning"
version = "2.5.5"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "fsspec", extra = ["http"] },
{ name = "lightning-utilities" },
{ name = "packaging" },
{ name = "pyyaml" },
{ name = "torch", version = "2.8.0", source = { registry = "https://download.pytorch.org/whl/cpu" }, marker = "sys_platform == 'darwin'" },
{ name = "torch", version = "2.8.0+cpu", source = { registry = "https://download.pytorch.org/whl/cpu" }, marker = "sys_platform != 'darwin'" },
{ name = "torchmetrics" },
{ name = "tqdm" },
{ name = "typing-extensions" },
]
sdist = { url = "https://files.pythonhosted.org/packages/16/78/bce84aab9a5b3b2e9d087d4f1a6be9b481adbfaac4903bc9daaaf09d49a3/pytorch_lightning-2.5.5.tar.gz", hash = "sha256:d6fc8173d1d6e49abfd16855ea05d2eb2415e68593f33d43e59028ecb4e64087", size = 643703 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/04/f6/99a5c66478f469598dee25b0e29b302b5bddd4e03ed0da79608ac964056e/pytorch_lightning-2.5.5-py3-none-any.whl", hash = "sha256:0b533991df2353c0c6ea9ca10a7d0728b73631fd61f5a15511b19bee2aef8af0", size = 832431 },
]
[[package]]
name = "pytorch-metric-learning"
version = "2.9.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "numpy" },
{ name = "scikit-learn" },
{ name = "torch", version = "2.8.0", source = { registry = "https://download.pytorch.org/whl/cpu" }, marker = "sys_platform == 'darwin'" },
{ name = "torch", version = "2.8.0+cpu", source = { registry = "https://download.pytorch.org/whl/cpu" }, marker = "sys_platform != 'darwin'" },
{ name = "tqdm" },
]
sdist = { url = "https://files.pythonhosted.org/packages/9b/80/6e61b1a91debf4c1b47d441f9a9d7fe2aabcdd9575ed70b2811474eb95c3/pytorch-metric-learning-2.9.0.tar.gz", hash = "sha256:27a626caf5e2876a0fd666605a78cb67ef7597e25d7a68c18053dd503830701f", size = 84530 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/46/7d/73ef5052f57b7720cad00e16598db3592a5ef4826745ffca67a2f085d4dc/pytorch_metric_learning-2.9.0-py3-none-any.whl", hash = "sha256:d51646006dc87168f00cf954785db133a4c5aac81253877248737aa42ef6432a", size = 127801 },
]
[[package]]
name = "pytz"
version = "2025.2"
@@ -3239,7 +2788,6 @@ evaluation = [
]
local = [
{ name = "faster-whisper" },
{ name = "pyannote-audio" },
]
silero-vad = [
{ name = "silero-vad" },
@@ -3267,7 +2815,7 @@ requires-dist = [
{ name = "aiohttp-cors", specifier = ">=0.7.0" },
{ name = "aiortc", specifier = ">=1.5.0" },
{ name = "alembic", specifier = ">=1.11.3" },
{ name = "av", specifier = ">=10.0.0" },
{ name = "av", specifier = ">=15.0.0" },
{ name = "celery", specifier = ">=5.3.4" },
{ name = "databases", extras = ["aiosqlite", "asyncpg"], specifier = ">=0.7.0" },
{ name = "fastapi", extras = ["standard"], specifier = ">=0.100.1" },
@@ -3312,10 +2860,7 @@ evaluation = [
{ name = "pydantic", specifier = ">=2.1.1" },
{ name = "tqdm", specifier = ">=4.66.0" },
]
local = [
{ name = "faster-whisper", specifier = ">=0.10.0" },
{ name = "pyannote-audio", specifier = ">=3.3.2" },
]
local = [{ name = "faster-whisper", specifier = ">=0.10.0" }]
silero-vad = [
{ name = "silero-vad", specifier = ">=5.1.2" },
{ name = "torch", specifier = ">=2.8.0", index = "https://download.pytorch.org/whl/cpu" },
@@ -3519,44 +3064,6 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/64/8d/0133e4eb4beed9e425d9a98ed6e081a55d195481b7632472be1af08d2f6b/rsa-4.9.1-py3-none-any.whl", hash = "sha256:68635866661c6836b8d39430f97a996acbd61bfa49406748ea243539fe239762", size = 34696 },
]
[[package]]
name = "ruamel-yaml"
version = "0.18.15"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "ruamel-yaml-clib", marker = "platform_python_implementation == 'CPython'" },
]
sdist = { url = "https://files.pythonhosted.org/packages/3e/db/f3950f5e5031b618aae9f423a39bf81a55c148aecd15a34527898e752cf4/ruamel.yaml-0.18.15.tar.gz", hash = "sha256:dbfca74b018c4c3fba0b9cc9ee33e53c371194a9000e694995e620490fd40700", size = 146865 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/d1/e5/f2a0621f1781b76a38194acae72f01e37b1941470407345b6e8653ad7640/ruamel.yaml-0.18.15-py3-none-any.whl", hash = "sha256:148f6488d698b7a5eded5ea793a025308b25eca97208181b6a026037f391f701", size = 119702 },
]
[[package]]
name = "ruamel-yaml-clib"
version = "0.2.12"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/20/84/80203abff8ea4993a87d823a5f632e4d92831ef75d404c9fc78d0176d2b5/ruamel.yaml.clib-0.2.12.tar.gz", hash = "sha256:6c8fbb13ec503f99a91901ab46e0b07ae7941cd527393187039aec586fdfd36f", size = 225315 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/fb/8f/683c6ad562f558cbc4f7c029abcd9599148c51c54b5ef0f24f2638da9fbb/ruamel.yaml.clib-0.2.12-cp311-cp311-macosx_13_0_arm64.whl", hash = "sha256:4a6679521a58256a90b0d89e03992c15144c5f3858f40d7c18886023d7943db6", size = 132224 },
{ url = "https://files.pythonhosted.org/packages/3c/d2/b79b7d695e2f21da020bd44c782490578f300dd44f0a4c57a92575758a76/ruamel.yaml.clib-0.2.12-cp311-cp311-manylinux2014_aarch64.whl", hash = "sha256:d84318609196d6bd6da0edfa25cedfbabd8dbde5140a0a23af29ad4b8f91fb1e", size = 641480 },
{ url = "https://files.pythonhosted.org/packages/68/6e/264c50ce2a31473a9fdbf4fa66ca9b2b17c7455b31ef585462343818bd6c/ruamel.yaml.clib-0.2.12-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:bb43a269eb827806502c7c8efb7ae7e9e9d0573257a46e8e952f4d4caba4f31e", size = 739068 },
{ url = "https://files.pythonhosted.org/packages/86/29/88c2567bc893c84d88b4c48027367c3562ae69121d568e8a3f3a8d363f4d/ruamel.yaml.clib-0.2.12-cp311-cp311-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:811ea1594b8a0fb466172c384267a4e5e367298af6b228931f273b111f17ef52", size = 703012 },
{ url = "https://files.pythonhosted.org/packages/11/46/879763c619b5470820f0cd6ca97d134771e502776bc2b844d2adb6e37753/ruamel.yaml.clib-0.2.12-cp311-cp311-musllinux_1_1_i686.whl", hash = "sha256:cf12567a7b565cbf65d438dec6cfbe2917d3c1bdddfce84a9930b7d35ea59642", size = 704352 },
{ url = "https://files.pythonhosted.org/packages/02/80/ece7e6034256a4186bbe50dee28cd032d816974941a6abf6a9d65e4228a7/ruamel.yaml.clib-0.2.12-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:7dd5adc8b930b12c8fc5b99e2d535a09889941aa0d0bd06f4749e9a9397c71d2", size = 737344 },
{ url = "https://files.pythonhosted.org/packages/f0/ca/e4106ac7e80efbabdf4bf91d3d32fc424e41418458251712f5672eada9ce/ruamel.yaml.clib-0.2.12-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:1492a6051dab8d912fc2adeef0e8c72216b24d57bd896ea607cb90bb0c4981d3", size = 714498 },
{ url = "https://files.pythonhosted.org/packages/67/58/b1f60a1d591b771298ffa0428237afb092c7f29ae23bad93420b1eb10703/ruamel.yaml.clib-0.2.12-cp311-cp311-win32.whl", hash = "sha256:bd0a08f0bab19093c54e18a14a10b4322e1eacc5217056f3c063bd2f59853ce4", size = 100205 },
{ url = "https://files.pythonhosted.org/packages/b4/4f/b52f634c9548a9291a70dfce26ca7ebce388235c93588a1068028ea23fcc/ruamel.yaml.clib-0.2.12-cp311-cp311-win_amd64.whl", hash = "sha256:a274fb2cb086c7a3dea4322ec27f4cb5cc4b6298adb583ab0e211a4682f241eb", size = 118185 },
{ url = "https://files.pythonhosted.org/packages/48/41/e7a405afbdc26af961678474a55373e1b323605a4f5e2ddd4a80ea80f628/ruamel.yaml.clib-0.2.12-cp312-cp312-macosx_14_0_arm64.whl", hash = "sha256:20b0f8dc160ba83b6dcc0e256846e1a02d044e13f7ea74a3d1d56ede4e48c632", size = 133433 },
{ url = "https://files.pythonhosted.org/packages/ec/b0/b850385604334c2ce90e3ee1013bd911aedf058a934905863a6ea95e9eb4/ruamel.yaml.clib-0.2.12-cp312-cp312-manylinux2014_aarch64.whl", hash = "sha256:943f32bc9dedb3abff9879edc134901df92cfce2c3d5c9348f172f62eb2d771d", size = 647362 },
{ url = "https://files.pythonhosted.org/packages/44/d0/3f68a86e006448fb6c005aee66565b9eb89014a70c491d70c08de597f8e4/ruamel.yaml.clib-0.2.12-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:95c3829bb364fdb8e0332c9931ecf57d9be3519241323c5274bd82f709cebc0c", size = 754118 },
{ url = "https://files.pythonhosted.org/packages/52/a9/d39f3c5ada0a3bb2870d7db41901125dbe2434fa4f12ca8c5b83a42d7c53/ruamel.yaml.clib-0.2.12-cp312-cp312-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:749c16fcc4a2b09f28843cda5a193e0283e47454b63ec4b81eaa2242f50e4ccd", size = 706497 },
{ url = "https://files.pythonhosted.org/packages/b0/fa/097e38135dadd9ac25aecf2a54be17ddf6e4c23e43d538492a90ab3d71c6/ruamel.yaml.clib-0.2.12-cp312-cp312-musllinux_1_1_i686.whl", hash = "sha256:bf165fef1f223beae7333275156ab2022cffe255dcc51c27f066b4370da81e31", size = 698042 },
{ url = "https://files.pythonhosted.org/packages/ec/d5/a659ca6f503b9379b930f13bc6b130c9f176469b73b9834296822a83a132/ruamel.yaml.clib-0.2.12-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:32621c177bbf782ca5a18ba4d7af0f1082a3f6e517ac2a18b3974d4edf349680", size = 745831 },
{ url = "https://files.pythonhosted.org/packages/db/5d/36619b61ffa2429eeaefaab4f3374666adf36ad8ac6330d855848d7d36fd/ruamel.yaml.clib-0.2.12-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:b82a7c94a498853aa0b272fd5bc67f29008da798d4f93a2f9f289feb8426a58d", size = 715692 },
{ url = "https://files.pythonhosted.org/packages/b1/82/85cb92f15a4231c89b95dfe08b09eb6adca929ef7df7e17ab59902b6f589/ruamel.yaml.clib-0.2.12-cp312-cp312-win32.whl", hash = "sha256:e8c4ebfcfd57177b572e2040777b8abc537cdef58a2120e830124946aa9b42c5", size = 98777 },
{ url = "https://files.pythonhosted.org/packages/d7/8f/c3654f6f1ddb75daf3922c3d8fc6005b1ab56671ad56ffb874d908bfa668/ruamel.yaml.clib-0.2.12-cp312-cp312-win_amd64.whl", hash = "sha256:0467c5965282c62203273b838ae77c0d29d7638c8a4e3a1c8bdd3602c10904e4", size = 115523 },
]
[[package]]
name = "s3transfer"
version = "0.13.0"
@@ -3591,68 +3098,6 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/69/e2/b011c38e5394c4c18fb5500778a55ec43ad6106126e74723ffaee246f56e/safetensors-0.5.3-cp38-abi3-win_amd64.whl", hash = "sha256:836cbbc320b47e80acd40e44c8682db0e8ad7123209f69b093def21ec7cafd11", size = 308878 },
]
[[package]]
name = "scikit-learn"
version = "1.7.1"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "joblib" },
{ name = "numpy" },
{ name = "scipy" },
{ name = "threadpoolctl" },
]
sdist = { url = "https://files.pythonhosted.org/packages/41/84/5f4af978fff619706b8961accac84780a6d298d82a8873446f72edb4ead0/scikit_learn-1.7.1.tar.gz", hash = "sha256:24b3f1e976a4665aa74ee0fcaac2b8fccc6ae77c8e07ab25da3ba6d3292b9802", size = 7190445 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/b4/bd/a23177930abd81b96daffa30ef9c54ddbf544d3226b8788ce4c3ef1067b4/scikit_learn-1.7.1-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:90c8494ea23e24c0fb371afc474618c1019dc152ce4a10e4607e62196113851b", size = 9334838 },
{ url = "https://files.pythonhosted.org/packages/8d/a1/d3a7628630a711e2ac0d1a482910da174b629f44e7dd8cfcd6924a4ef81a/scikit_learn-1.7.1-cp311-cp311-macosx_12_0_arm64.whl", hash = "sha256:bb870c0daf3bf3be145ec51df8ac84720d9972170786601039f024bf6d61a518", size = 8651241 },
{ url = "https://files.pythonhosted.org/packages/26/92/85ec172418f39474c1cd0221d611345d4f433fc4ee2fc68e01f524ccc4e4/scikit_learn-1.7.1-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:40daccd1b5623f39e8943ab39735cadf0bdce80e67cdca2adcb5426e987320a8", size = 9718677 },
{ url = "https://files.pythonhosted.org/packages/df/ce/abdb1dcbb1d2b66168ec43b23ee0cee356b4cc4100ddee3943934ebf1480/scikit_learn-1.7.1-cp311-cp311-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:30d1f413cfc0aa5a99132a554f1d80517563c34a9d3e7c118fde2d273c6fe0f7", size = 9511189 },
{ url = "https://files.pythonhosted.org/packages/b2/3b/47b5eaee01ef2b5a80ba3f7f6ecf79587cb458690857d4777bfd77371c6f/scikit_learn-1.7.1-cp311-cp311-win_amd64.whl", hash = "sha256:c711d652829a1805a95d7fe96654604a8f16eab5a9e9ad87b3e60173415cb650", size = 8914794 },
{ url = "https://files.pythonhosted.org/packages/cb/16/57f176585b35ed865f51b04117947fe20f130f78940c6477b6d66279c9c2/scikit_learn-1.7.1-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:3cee419b49b5bbae8796ecd690f97aa412ef1674410c23fc3257c6b8b85b8087", size = 9260431 },
{ url = "https://files.pythonhosted.org/packages/67/4e/899317092f5efcab0e9bc929e3391341cec8fb0e816c4789686770024580/scikit_learn-1.7.1-cp312-cp312-macosx_12_0_arm64.whl", hash = "sha256:2fd8b8d35817b0d9ebf0b576f7d5ffbbabdb55536b0655a8aaae629d7ffd2e1f", size = 8637191 },
{ url = "https://files.pythonhosted.org/packages/f3/1b/998312db6d361ded1dd56b457ada371a8d8d77ca2195a7d18fd8a1736f21/scikit_learn-1.7.1-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:588410fa19a96a69763202f1d6b7b91d5d7a5d73be36e189bc6396bfb355bd87", size = 9486346 },
{ url = "https://files.pythonhosted.org/packages/ad/09/a2aa0b4e644e5c4ede7006748f24e72863ba2ae71897fecfd832afea01b4/scikit_learn-1.7.1-cp312-cp312-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:e3142f0abe1ad1d1c31a2ae987621e41f6b578144a911ff4ac94781a583adad7", size = 9290988 },
{ url = "https://files.pythonhosted.org/packages/15/fa/c61a787e35f05f17fc10523f567677ec4eeee5f95aa4798dbbbcd9625617/scikit_learn-1.7.1-cp312-cp312-win_amd64.whl", hash = "sha256:3ddd9092c1bd469acab337d87930067c87eac6bd544f8d5027430983f1e1ae88", size = 8735568 },
]
[[package]]
name = "scipy"
version = "1.16.1"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "numpy" },
]
sdist = { url = "https://files.pythonhosted.org/packages/f5/4a/b927028464795439faec8eaf0b03b011005c487bb2d07409f28bf30879c4/scipy-1.16.1.tar.gz", hash = "sha256:44c76f9e8b6e8e488a586190ab38016e4ed2f8a038af7cd3defa903c0a2238b3", size = 30580861 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/da/91/812adc6f74409b461e3a5fa97f4f74c769016919203138a3bf6fc24ba4c5/scipy-1.16.1-cp311-cp311-macosx_10_14_x86_64.whl", hash = "sha256:c033fa32bab91dc98ca59d0cf23bb876454e2bb02cbe592d5023138778f70030", size = 36552519 },
{ url = "https://files.pythonhosted.org/packages/47/18/8e355edcf3b71418d9e9f9acd2708cc3a6c27e8f98fde0ac34b8a0b45407/scipy-1.16.1-cp311-cp311-macosx_12_0_arm64.whl", hash = "sha256:6e5c2f74e5df33479b5cd4e97a9104c511518fbd979aa9b8f6aec18b2e9ecae7", size = 28638010 },
{ url = "https://files.pythonhosted.org/packages/d9/eb/e931853058607bdfbc11b86df19ae7a08686121c203483f62f1ecae5989c/scipy-1.16.1-cp311-cp311-macosx_14_0_arm64.whl", hash = "sha256:0a55ffe0ba0f59666e90951971a884d1ff6f4ec3275a48f472cfb64175570f77", size = 20909790 },
{ url = "https://files.pythonhosted.org/packages/45/0c/be83a271d6e96750cd0be2e000f35ff18880a46f05ce8b5d3465dc0f7a2a/scipy-1.16.1-cp311-cp311-macosx_14_0_x86_64.whl", hash = "sha256:f8a5d6cd147acecc2603fbd382fed6c46f474cccfcf69ea32582e033fb54dcfe", size = 23513352 },
{ url = "https://files.pythonhosted.org/packages/7c/bf/fe6eb47e74f762f933cca962db7f2c7183acfdc4483bd1c3813cfe83e538/scipy-1.16.1-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:cb18899127278058bcc09e7b9966d41a5a43740b5bb8dcba401bd983f82e885b", size = 33534643 },
{ url = "https://files.pythonhosted.org/packages/bb/ba/63f402e74875486b87ec6506a4f93f6d8a0d94d10467280f3d9d7837ce3a/scipy-1.16.1-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:adccd93a2fa937a27aae826d33e3bfa5edf9aa672376a4852d23a7cd67a2e5b7", size = 35376776 },
{ url = "https://files.pythonhosted.org/packages/c3/b4/04eb9d39ec26a1b939689102da23d505ea16cdae3dbb18ffc53d1f831044/scipy-1.16.1-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:18aca1646a29ee9a0625a1be5637fa798d4d81fdf426481f06d69af828f16958", size = 35698906 },
{ url = "https://files.pythonhosted.org/packages/04/d6/bb5468da53321baeb001f6e4e0d9049eadd175a4a497709939128556e3ec/scipy-1.16.1-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:d85495cef541729a70cdddbbf3e6b903421bc1af3e8e3a9a72a06751f33b7c39", size = 38129275 },
{ url = "https://files.pythonhosted.org/packages/c4/94/994369978509f227cba7dfb9e623254d0d5559506fe994aef4bea3ed469c/scipy-1.16.1-cp311-cp311-win_amd64.whl", hash = "sha256:226652fca853008119c03a8ce71ffe1b3f6d2844cc1686e8f9806edafae68596", size = 38644572 },
{ url = "https://files.pythonhosted.org/packages/f8/d9/ec4864f5896232133f51382b54a08de91a9d1af7a76dfa372894026dfee2/scipy-1.16.1-cp312-cp312-macosx_10_14_x86_64.whl", hash = "sha256:81b433bbeaf35728dad619afc002db9b189e45eebe2cd676effe1fb93fef2b9c", size = 36575194 },
{ url = "https://files.pythonhosted.org/packages/5c/6d/40e81ecfb688e9d25d34a847dca361982a6addf8e31f0957b1a54fbfa994/scipy-1.16.1-cp312-cp312-macosx_12_0_arm64.whl", hash = "sha256:886cc81fdb4c6903a3bb0464047c25a6d1016fef77bb97949817d0c0d79f9e04", size = 28594590 },
{ url = "https://files.pythonhosted.org/packages/0e/37/9f65178edfcc629377ce9a64fc09baebea18c80a9e57ae09a52edf84880b/scipy-1.16.1-cp312-cp312-macosx_14_0_arm64.whl", hash = "sha256:15240c3aac087a522b4eaedb09f0ad061753c5eebf1ea430859e5bf8640d5919", size = 20866458 },
{ url = "https://files.pythonhosted.org/packages/2c/7b/749a66766871ea4cb1d1ea10f27004db63023074c22abed51f22f09770e0/scipy-1.16.1-cp312-cp312-macosx_14_0_x86_64.whl", hash = "sha256:65f81a25805f3659b48126b5053d9e823d3215e4a63730b5e1671852a1705921", size = 23539318 },
{ url = "https://files.pythonhosted.org/packages/c4/db/8d4afec60eb833a666434d4541a3151eedbf2494ea6d4d468cbe877f00cd/scipy-1.16.1-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:6c62eea7f607f122069b9bad3f99489ddca1a5173bef8a0c75555d7488b6f725", size = 33292899 },
{ url = "https://files.pythonhosted.org/packages/51/1e/79023ca3bbb13a015d7d2757ecca3b81293c663694c35d6541b4dca53e98/scipy-1.16.1-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:f965bbf3235b01c776115ab18f092a95aa74c271a52577bcb0563e85738fd618", size = 35162637 },
{ url = "https://files.pythonhosted.org/packages/b6/49/0648665f9c29fdaca4c679182eb972935b3b4f5ace41d323c32352f29816/scipy-1.16.1-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:f006e323874ffd0b0b816d8c6a8e7f9a73d55ab3b8c3f72b752b226d0e3ac83d", size = 35490507 },
{ url = "https://files.pythonhosted.org/packages/62/8f/66cbb9d6bbb18d8c658f774904f42a92078707a7c71e5347e8bf2f52bb89/scipy-1.16.1-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:e8fd15fc5085ab4cca74cb91fe0a4263b1f32e4420761ddae531ad60934c2119", size = 37923998 },
{ url = "https://files.pythonhosted.org/packages/14/c3/61f273ae550fbf1667675701112e380881905e28448c080b23b5a181df7c/scipy-1.16.1-cp312-cp312-win_amd64.whl", hash = "sha256:f7b8013c6c066609577d910d1a2a077021727af07b6fab0ee22c2f901f22352a", size = 38508060 },
]
[[package]]
name = "semver"
version = "3.0.4"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/72/d1/d3159231aec234a59dd7d601e9dd9fe96f3afff15efd33c1070019b26132/semver-3.0.4.tar.gz", hash = "sha256:afc7d8c584a5ed0a11033af086e8af226a9c0b206f313e0301f8dd7b6b589602", size = 269730 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/a6/24/4d91e05817e92e3a61c8a21e08fd0f390f5301f1c448b137c57c4bc6e543/semver-3.0.4-py3-none-any.whl", hash = "sha256:9c824d87ba7f7ab4a1890799cec8596f15c1241cb473404ea1cb0c55e4b04746", size = 17912 },
]
[[package]]
name = "sentencepiece"
version = "0.2.0"
@@ -3756,25 +3201,6 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/32/46/9cb0e58b2deb7f82b84065f37f3bffeb12413f947f9388e4cac22c4621ce/sortedcontainers-2.4.0-py2.py3-none-any.whl", hash = "sha256:a163dcaede0f1c021485e957a39245190e74249897e2ae4b2aa38595db237ee0", size = 29575 },
]
[[package]]
name = "soundfile"
version = "0.13.1"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "cffi" },
{ name = "numpy" },
]
sdist = { url = "https://files.pythonhosted.org/packages/e1/41/9b873a8c055582859b239be17902a85339bec6a30ad162f98c9b0288a2cc/soundfile-0.13.1.tar.gz", hash = "sha256:b2c68dab1e30297317080a5b43df57e302584c49e2942defdde0acccc53f0e5b", size = 46156 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/64/28/e2a36573ccbcf3d57c00626a21fe51989380636e821b341d36ccca0c1c3a/soundfile-0.13.1-py2.py3-none-any.whl", hash = "sha256:a23c717560da2cf4c7b5ae1142514e0fd82d6bbd9dfc93a50423447142f2c445", size = 25751 },
{ url = "https://files.pythonhosted.org/packages/ea/ab/73e97a5b3cc46bba7ff8650a1504348fa1863a6f9d57d7001c6b67c5f20e/soundfile-0.13.1-py2.py3-none-macosx_10_9_x86_64.whl", hash = "sha256:82dc664d19831933fe59adad199bf3945ad06d84bc111a5b4c0d3089a5b9ec33", size = 1142250 },
{ url = "https://files.pythonhosted.org/packages/a0/e5/58fd1a8d7b26fc113af244f966ee3aecf03cb9293cb935daaddc1e455e18/soundfile-0.13.1-py2.py3-none-macosx_11_0_arm64.whl", hash = "sha256:743f12c12c4054921e15736c6be09ac26b3b3d603aef6fd69f9dde68748f2593", size = 1101406 },
{ url = "https://files.pythonhosted.org/packages/58/ae/c0e4a53d77cf6e9a04179535766b3321b0b9ced5f70522e4caf9329f0046/soundfile-0.13.1-py2.py3-none-manylinux_2_28_aarch64.whl", hash = "sha256:9c9e855f5a4d06ce4213f31918653ab7de0c5a8d8107cd2427e44b42df547deb", size = 1235729 },
{ url = "https://files.pythonhosted.org/packages/57/5e/70bdd9579b35003a489fc850b5047beeda26328053ebadc1fb60f320f7db/soundfile-0.13.1-py2.py3-none-manylinux_2_28_x86_64.whl", hash = "sha256:03267c4e493315294834a0870f31dbb3b28a95561b80b134f0bd3cf2d5f0e618", size = 1313646 },
{ url = "https://files.pythonhosted.org/packages/fe/df/8c11dc4dfceda14e3003bb81a0d0edcaaf0796dd7b4f826ea3e532146bba/soundfile-0.13.1-py2.py3-none-win32.whl", hash = "sha256:c734564fab7c5ddf8e9be5bf70bab68042cd17e9c214c06e365e20d64f9a69d5", size = 899881 },
{ url = "https://files.pythonhosted.org/packages/14/e9/6b761de83277f2f02ded7e7ea6f07828ec78e4b229b80e4ca55dd205b9dc/soundfile-0.13.1-py2.py3-none-win_amd64.whl", hash = "sha256:1e70a05a0626524a69e9f0f4dd2ec174b4e9567f4d8b6c11d38b5c289be36ee9", size = 1019162 },
]
[[package]]
name = "soupsieve"
version = "2.7"
@@ -3784,29 +3210,6 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/e7/9c/0e6afc12c269578be5c0c1c9f4b49a8d32770a080260c333ac04cc1c832d/soupsieve-2.7-py3-none-any.whl", hash = "sha256:6e60cc5c1ffaf1cebcc12e8188320b72071e922c2e897f737cadce79ad5d30c4", size = 36677 },
]
[[package]]
name = "speechbrain"
version = "1.0.3"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "huggingface-hub" },
{ name = "hyperpyyaml" },
{ name = "joblib" },
{ name = "numpy" },
{ name = "packaging" },
{ name = "scipy" },
{ name = "sentencepiece" },
{ name = "torch", version = "2.8.0", source = { registry = "https://download.pytorch.org/whl/cpu" }, marker = "sys_platform == 'darwin'" },
{ name = "torch", version = "2.8.0+cpu", source = { registry = "https://download.pytorch.org/whl/cpu" }, marker = "sys_platform != 'darwin'" },
{ name = "torchaudio", version = "2.8.0", source = { registry = "https://download.pytorch.org/whl/cpu" }, marker = "(platform_machine == 'aarch64' and sys_platform == 'linux') or sys_platform == 'darwin'" },
{ name = "torchaudio", version = "2.8.0+cpu", source = { registry = "https://download.pytorch.org/whl/cpu" }, marker = "(platform_machine != 'aarch64' and sys_platform == 'linux') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
{ name = "tqdm" },
]
sdist = { url = "https://files.pythonhosted.org/packages/ab/10/87e666544a4e0cec7cbdc09f26948994831ae0f8bbc58de3bf53b68285ff/speechbrain-1.0.3.tar.gz", hash = "sha256:fcab3c6e90012cecb1eed40ea235733b550137e73da6bfa2340ba191ec714052", size = 747735 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/58/13/e61f1085aebee17d5fc2df19fcc5177c10379be52578afbecdd615a831c9/speechbrain-1.0.3-py3-none-any.whl", hash = "sha256:9859d4c1b1fb3af3b85523c0c89f52e45a04f305622ed55f31aa32dd2fba19e9", size = 864091 },
]
[[package]]
name = "sqlalchemy"
version = "1.4.54"
@@ -3888,15 +3291,6 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/a2/09/77d55d46fd61b4a135c444fc97158ef34a095e5681d0a6c10b75bf356191/sympy-1.14.0-py3-none-any.whl", hash = "sha256:e091cc3e99d2141a0ba2847328f5479b05d94a6635cb96148ccb3f34671bd8f5", size = 6299353 },
]
[[package]]
name = "tabulate"
version = "0.9.0"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/ec/fe/802052aecb21e3797b8f7902564ab6ea0d60ff8ca23952079064155d1ae1/tabulate-0.9.0.tar.gz", hash = "sha256:0095b12bf5966de529c0feb1fa08671671b3368eec77d7ef7ab114be2c068b3c", size = 81090 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/40/44/4a5f08c96eb108af5cb50b41f76142f0afa346dfa99d5296fe7202a11854/tabulate-0.9.0-py3-none-any.whl", hash = "sha256:024ca478df22e9340661486f85298cff5f6dcdba14f3813e8830015b9ed1948f", size = 35252 },
]
[[package]]
name = "tenacity"
version = "9.1.2"
@@ -3906,29 +3300,6 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/e5/30/643397144bfbfec6f6ef821f36f33e57d35946c44a2352d3c9f0ae847619/tenacity-9.1.2-py3-none-any.whl", hash = "sha256:f77bf36710d8b73a50b2dd155c97b870017ad21afe6ab300326b0371b3b05138", size = 28248 },
]
[[package]]
name = "tensorboardx"
version = "2.6.4"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "numpy" },
{ name = "packaging" },
{ name = "protobuf" },
]
sdist = { url = "https://files.pythonhosted.org/packages/2b/c5/d4cc6e293fb837aaf9f76dd7745476aeba8ef7ef5146c3b3f9ee375fe7a5/tensorboardx-2.6.4.tar.gz", hash = "sha256:b163ccb7798b31100b9f5fa4d6bc22dad362d7065c2f24b51e50731adde86828", size = 4769801 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/e0/1d/b5d63f1a6b824282b57f7b581810d20b7a28ca951f2d5b59f1eb0782c12b/tensorboardx-2.6.4-py3-none-any.whl", hash = "sha256:5970cf3a1f0a6a6e8b180ccf46f3fe832b8a25a70b86e5a237048a7c0beb18e2", size = 87201 },
]
[[package]]
name = "threadpoolctl"
version = "3.6.0"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/b7/4d/08c89e34946fce2aec4fbb45c9016efd5f4d7f24af8e5d93296e935631d8/threadpoolctl-3.6.0.tar.gz", hash = "sha256:8ab8b4aa3491d812b623328249fab5302a68d2d71745c8a4c719a2fcaba9f44e", size = 21274 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/32/d5/f9a850d79b0851d1d4ef6456097579a9005b31fea68726a4ae5f2d82ddd9/threadpoolctl-3.6.0-py3-none-any.whl", hash = "sha256:43a0b8fd5a2928500110039e43a5eed8480b918967083ea48dc3ab9f13c4a7fb", size = 18638 },
]
[[package]]
name = "tiktoken"
version = "0.9.0"
@@ -4069,40 +3440,6 @@ wheels = [
{ url = "https://download.pytorch.org/whl/cpu/torch-2.8.0%2Bcpu-cp312-cp312-win_arm64.whl", hash = "sha256:99fc421a5d234580e45957a7b02effbf3e1c884a5dd077afc85352c77bf41434" },
]
[[package]]
name = "torch-audiomentations"
version = "0.12.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "julius" },
{ name = "torch", version = "2.8.0", source = { registry = "https://download.pytorch.org/whl/cpu" }, marker = "sys_platform == 'darwin'" },
{ name = "torch", version = "2.8.0+cpu", source = { registry = "https://download.pytorch.org/whl/cpu" }, marker = "sys_platform != 'darwin'" },
{ name = "torch-pitch-shift" },
{ name = "torchaudio", version = "2.8.0", source = { registry = "https://download.pytorch.org/whl/cpu" }, marker = "(platform_machine == 'aarch64' and sys_platform == 'linux') or sys_platform == 'darwin'" },
{ name = "torchaudio", version = "2.8.0+cpu", source = { registry = "https://download.pytorch.org/whl/cpu" }, marker = "(platform_machine != 'aarch64' and sys_platform == 'linux') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
]
sdist = { url = "https://files.pythonhosted.org/packages/31/8d/2f8fd7e34c75f5ee8de4310c3bd3f22270acd44d1f809e2fe7c12fbf35f8/torch_audiomentations-0.12.0.tar.gz", hash = "sha256:b02d4c5eb86376986a53eb405cca5e34f370ea9284411237508e720c529f7888", size = 52094 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/21/9d/1ee04f49c15d2d632f6f7102061d7c07652858e6d91b58a091531034e84f/torch_audiomentations-0.12.0-py3-none-any.whl", hash = "sha256:1b80b91d2016ccf83979622cac8f702072a79b7dcc4c2bee40f00b26433a786b", size = 48506 },
]
[[package]]
name = "torch-pitch-shift"
version = "1.2.5"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "packaging" },
{ name = "primepy" },
{ name = "torch", version = "2.8.0", source = { registry = "https://download.pytorch.org/whl/cpu" }, marker = "sys_platform == 'darwin'" },
{ name = "torch", version = "2.8.0+cpu", source = { registry = "https://download.pytorch.org/whl/cpu" }, marker = "sys_platform != 'darwin'" },
{ name = "torchaudio", version = "2.8.0", source = { registry = "https://download.pytorch.org/whl/cpu" }, marker = "(platform_machine == 'aarch64' and sys_platform == 'linux') or sys_platform == 'darwin'" },
{ name = "torchaudio", version = "2.8.0+cpu", source = { registry = "https://download.pytorch.org/whl/cpu" }, marker = "(platform_machine != 'aarch64' and sys_platform == 'linux') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
]
sdist = { url = "https://files.pythonhosted.org/packages/79/a6/722a832bca75d5079f6731e005b3d0c2eec7c6c6863d030620952d143d57/torch_pitch_shift-1.2.5.tar.gz", hash = "sha256:6e1c7531f08d0f407a4c55e5ff8385a41355c5c5d27ab7fa08632e51defbd0ed", size = 4725 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/27/4c/96ac2a09efb56cc3c41fb3ce9b6f4d8c0604499f7481d4a13a7b03e21382/torch_pitch_shift-1.2.5-py3-none-any.whl", hash = "sha256:6f8500cbc13f1c98b11cde1805ce5084f82cdd195c285f34287541f168a7c6a7", size = 5005 },
]
[[package]]
name = "torchaudio"
version = "2.8.0"
@@ -4150,22 +3487,6 @@ wheels = [
{ url = "https://download.pytorch.org/whl/cpu/torchaudio-2.8.0%2Bcpu-cp312-cp312-win_amd64.whl", hash = "sha256:9b302192b570657c1cc787a4d487ae4bbb7f2aab1c01b1fcc46757e7f86f391e" },
]
[[package]]
name = "torchmetrics"
version = "1.8.2"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "lightning-utilities" },
{ name = "numpy" },
{ name = "packaging" },
{ name = "torch", version = "2.8.0", source = { registry = "https://download.pytorch.org/whl/cpu" }, marker = "sys_platform == 'darwin'" },
{ name = "torch", version = "2.8.0+cpu", source = { registry = "https://download.pytorch.org/whl/cpu" }, marker = "sys_platform != 'darwin'" },
]
sdist = { url = "https://files.pythonhosted.org/packages/85/2e/48a887a59ecc4a10ce9e8b35b3e3c5cef29d902c4eac143378526e7485cb/torchmetrics-1.8.2.tar.gz", hash = "sha256:cf64a901036bf107f17a524009eea7781c9c5315d130713aeca5747a686fe7a5", size = 580679 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/02/21/aa0f434434c48490f91b65962b1ce863fdcce63febc166ca9fe9d706c2b6/torchmetrics-1.8.2-py3-none-any.whl", hash = "sha256:08382fd96b923e39e904c4d570f3d49e2cc71ccabd2a94e0f895d1f0dac86242", size = 983161 },
]
[[package]]
name = "tqdm"
version = "4.67.1"

View File

@@ -302,10 +302,10 @@ export default function RoomsList() {
return;
}
const platform: "whereby" | "daily" | null =
const platform: "whereby" | "daily" =
room.platform === "whereby" || room.platform === "daily"
? room.platform
: null;
: "daily";
const roomData = {
name: room.name,

View File

@@ -16,6 +16,7 @@ import {
import { useError } from "../../../../(errors)/errorContext";
import { useRouter } from "next/navigation";
import { Box, Grid } from "@chakra-ui/react";
import { parseNonEmptyString } from "../../../../lib/utils";
export type TranscriptCorrect = {
params: Promise<{
@@ -25,8 +26,7 @@ export type TranscriptCorrect = {
export default function TranscriptCorrect(props: TranscriptCorrect) {
const params = use(props.params);
const { transcriptId } = params;
const transcriptId = parseNonEmptyString(params.transcriptId);
const updateTranscriptMutation = useTranscriptUpdate();
const transcript = useTranscriptGet(transcriptId);

View File

@@ -3,7 +3,8 @@ import React from "react";
import Markdown from "react-markdown";
import "../../../styles/markdown.css";
import type { components } from "../../../reflector-api";
type GetTranscript = components["schemas"]["GetTranscript"];
type GetTranscriptWithParticipants =
components["schemas"]["GetTranscriptWithParticipants"];
type GetTranscriptTopic = components["schemas"]["GetTranscriptTopic"];
import { useTranscriptUpdate } from "../../../lib/apiHooks";
import {
@@ -18,7 +19,7 @@ import { LuPen } from "react-icons/lu";
import { useError } from "../../../(errors)/errorContext";
type FinalSummaryProps = {
transcript: GetTranscript;
transcript: GetTranscriptWithParticipants;
topics: GetTranscriptTopic[];
onUpdate: (newSummary: string) => void;
finalSummaryRef: React.Dispatch<React.SetStateAction<HTMLDivElement | null>>;

View File

@@ -9,7 +9,9 @@ import React, { useEffect, useState, use } from "react";
import FinalSummary from "./finalSummary";
import TranscriptTitle from "../transcriptTitle";
import Player from "../player";
import { useWebSockets } from "../useWebSockets";
import { useRouter } from "next/navigation";
import { parseNonEmptyString } from "../../../lib/utils";
import {
Box,
Flex,
@@ -30,7 +32,7 @@ type TranscriptDetails = {
export default function TranscriptDetails(details: TranscriptDetails) {
const params = use(details.params);
const transcriptId = params.transcriptId;
const transcriptId = parseNonEmptyString(params.transcriptId);
const router = useRouter();
const statusToRedirect = [
"idle",
@@ -49,6 +51,7 @@ export default function TranscriptDetails(details: TranscriptDetails) {
transcriptId,
waiting || mp3.audioDeleted === true,
);
useWebSockets(transcriptId);
const useActiveTopic = useState<Topic | null>(null);
const [finalSummaryElement, setFinalSummaryElement] =
useState<HTMLDivElement | null>(null);

View File

@@ -10,6 +10,8 @@ import {
} from "@chakra-ui/react";
import { useRouter } from "next/navigation";
import { useTranscriptGet } from "../../../../lib/apiHooks";
import { parseNonEmptyString } from "../../../../lib/utils";
import { useWebSockets } from "../../useWebSockets";
type TranscriptProcessing = {
params: Promise<{
@@ -19,10 +21,11 @@ type TranscriptProcessing = {
export default function TranscriptProcessing(details: TranscriptProcessing) {
const params = use(details.params);
const transcriptId = params.transcriptId;
const transcriptId = parseNonEmptyString(params.transcriptId);
const router = useRouter();
const transcript = useTranscriptGet(transcriptId);
useWebSockets(transcriptId);
useEffect(() => {
const status = transcript.data?.status;

View File

@@ -12,6 +12,7 @@ import { Box, Text, Grid, Heading, VStack, Flex } from "@chakra-ui/react";
import LiveTrancription from "../../liveTranscription";
import { useTranscriptGet } from "../../../../lib/apiHooks";
import { TranscriptStatus } from "../../../../lib/transcript";
import { parseNonEmptyString } from "../../../../lib/utils";
type TranscriptDetails = {
params: Promise<{
@@ -21,13 +22,14 @@ type TranscriptDetails = {
const TranscriptRecord = (details: TranscriptDetails) => {
const params = use(details.params);
const transcript = useTranscriptGet(params.transcriptId);
const transcriptId = parseNonEmptyString(params.transcriptId);
const transcript = useTranscriptGet(transcriptId);
const [transcriptStarted, setTranscriptStarted] = useState(false);
const useActiveTopic = useState<Topic | null>(null);
const webSockets = useWebSockets(params.transcriptId);
const webSockets = useWebSockets(transcriptId);
const mp3 = useMp3(params.transcriptId, true);
const mp3 = useMp3(transcriptId, true);
const router = useRouter();

View File

@@ -7,6 +7,7 @@ import useMp3 from "../../useMp3";
import { Center, VStack, Text, Heading } from "@chakra-ui/react";
import FileUploadButton from "../../fileUploadButton";
import { useTranscriptGet } from "../../../../lib/apiHooks";
import { parseNonEmptyString } from "../../../../lib/utils";
type TranscriptUpload = {
params: Promise<{
@@ -16,12 +17,13 @@ type TranscriptUpload = {
const TranscriptUpload = (details: TranscriptUpload) => {
const params = use(details.params);
const transcript = useTranscriptGet(params.transcriptId);
const transcriptId = parseNonEmptyString(params.transcriptId);
const transcript = useTranscriptGet(transcriptId);
const [transcriptStarted, setTranscriptStarted] = useState(false);
const webSockets = useWebSockets(params.transcriptId);
const webSockets = useWebSockets(transcriptId);
const mp3 = useMp3(params.transcriptId, true);
const mp3 = useMp3(transcriptId, true);
const router = useRouter();

View File

@@ -2,10 +2,11 @@ import type { components } from "../../reflector-api";
import { useTranscriptCreate } from "../../lib/apiHooks";
type CreateTranscript = components["schemas"]["CreateTranscript"];
type GetTranscript = components["schemas"]["GetTranscript"];
type GetTranscriptWithParticipants =
components["schemas"]["GetTranscriptWithParticipants"];
type UseCreateTranscript = {
transcript: GetTranscript | null;
transcript: GetTranscriptWithParticipants | null;
loading: boolean;
error: Error | null;
create: (transcriptCreationDetails: CreateTranscript) => Promise<void>;

View File

@@ -2,7 +2,8 @@ import { useEffect, useState } from "react";
import { ShareMode, toShareMode } from "../../lib/shareMode";
import type { components } from "../../reflector-api";
type GetTranscript = components["schemas"]["GetTranscript"];
type GetTranscriptWithParticipants =
components["schemas"]["GetTranscriptWithParticipants"];
type GetTranscriptTopic = components["schemas"]["GetTranscriptTopic"];
type UpdateTranscript = components["schemas"]["UpdateTranscript"];
import {
@@ -27,7 +28,7 @@ import { featureEnabled } from "../../lib/features";
type ShareAndPrivacyProps = {
finalSummaryElement: HTMLDivElement | null;
transcript: GetTranscript;
transcript: GetTranscriptWithParticipants;
topics: GetTranscriptTopic[];
};

View File

@@ -1,7 +1,8 @@
import { useState, useEffect, useMemo } from "react";
import type { components } from "../../reflector-api";
type GetTranscript = components["schemas"]["GetTranscript"];
type GetTranscriptWithParticipants =
components["schemas"]["GetTranscriptWithParticipants"];
type GetTranscriptTopic = components["schemas"]["GetTranscriptTopic"];
import {
BoxProps,
@@ -26,7 +27,7 @@ import {
import { featureEnabled } from "../../lib/features";
type ShareZulipProps = {
transcript: GetTranscript;
transcript: GetTranscriptWithParticipants;
topics: GetTranscriptTopic[];
disabled: boolean;
};

View File

@@ -1,8 +1,10 @@
import { useState } from "react";
import type { components } from "../../reflector-api";
import { parseMaybeNonEmptyString } from "../../lib/utils";
type UpdateTranscript = components["schemas"]["UpdateTranscript"];
type GetTranscript = components["schemas"]["GetTranscript"];
type GetTranscriptWithParticipants =
components["schemas"]["GetTranscriptWithParticipants"];
type GetTranscriptTopic = components["schemas"]["GetTranscriptTopic"];
import {
useTranscriptUpdate,
@@ -20,7 +22,7 @@ type TranscriptTitle = {
onUpdate: (newTitle: string) => void;
// share props
transcript: GetTranscript | null;
transcript: GetTranscriptWithParticipants | null;
topics: GetTranscriptTopic[] | null;
finalSummaryElement: HTMLDivElement | null;
};
@@ -31,7 +33,7 @@ const TranscriptTitle = (props: TranscriptTitle) => {
const [isEditing, setIsEditing] = useState(false);
const updateTranscriptMutation = useTranscriptUpdate();
const participantsQuery = useTranscriptParticipants(
props.transcript?.id || null,
props.transcript?.id ? parseMaybeNonEmptyString(props.transcript.id) : null,
);
const updateTitle = async (newTitle: string, transcriptId: string) => {

View File

@@ -1,5 +1,6 @@
import { useEffect, useState } from "react";
import { useTranscriptGet } from "../../lib/apiHooks";
import { parseMaybeNonEmptyString } from "../../lib/utils";
import { useAuth } from "../../lib/AuthProvider";
import { API_URL } from "../../lib/apiClient";
@@ -27,7 +28,7 @@ const useMp3 = (transcriptId: string, waiting?: boolean): Mp3Response => {
data: transcript,
isLoading: transcriptMetadataLoading,
error: transcriptError,
} = useTranscriptGet(later ? null : transcriptId);
} = useTranscriptGet(later ? null : parseMaybeNonEmptyString(transcriptId));
const [serviceWorker, setServiceWorker] =
useState<ServiceWorkerRegistration | null>(null);

View File

@@ -1,6 +1,7 @@
import type { components } from "../../reflector-api";
type Participant = components["schemas"]["Participant"];
import { useTranscriptParticipants } from "../../lib/apiHooks";
import { parseMaybeNonEmptyString } from "../../lib/utils";
type ErrorParticipants = {
error: Error;
@@ -32,7 +33,7 @@ const useParticipants = (transcriptId: string): UseParticipants => {
isLoading: loading,
error,
refetch,
} = useTranscriptParticipants(transcriptId || null);
} = useTranscriptParticipants(parseMaybeNonEmptyString(transcriptId));
// Type-safe return based on state
if (error) {

View File

@@ -1,5 +1,6 @@
import type { components } from "../../reflector-api";
import { useTranscriptTopicsWithWordsPerSpeaker } from "../../lib/apiHooks";
import { parseMaybeNonEmptyString } from "../../lib/utils";
type GetTranscriptTopicWithWordsPerSpeaker =
components["schemas"]["GetTranscriptTopicWithWordsPerSpeaker"];
@@ -38,7 +39,7 @@ const useTopicWithWords = (
error,
refetch,
} = useTranscriptTopicsWithWordsPerSpeaker(
transcriptId || null,
parseMaybeNonEmptyString(transcriptId),
topicId || null,
);

View File

@@ -1,5 +1,6 @@
import { useTranscriptTopics } from "../../lib/apiHooks";
import type { components } from "../../reflector-api";
import { parseMaybeNonEmptyString } from "../../lib/utils";
type GetTranscriptTopic = components["schemas"]["GetTranscriptTopic"];
@@ -10,7 +11,11 @@ type TranscriptTopics = {
};
const useTopics = (id: string): TranscriptTopics => {
const { data: topics, isLoading: loading, error } = useTranscriptTopics(id);
const {
data: topics,
isLoading: loading,
error,
} = useTranscriptTopics(parseMaybeNonEmptyString(id));
return {
topics: topics || null,

View File

@@ -1,5 +1,6 @@
import type { components } from "../../reflector-api";
import { useTranscriptWaveform } from "../../lib/apiHooks";
import { parseMaybeNonEmptyString } from "../../lib/utils";
type AudioWaveform = components["schemas"]["AudioWaveform"];
@@ -14,7 +15,7 @@ const useWaveform = (id: string, skip: boolean): AudioWaveFormResponse => {
data: waveform,
isLoading: loading,
error,
} = useTranscriptWaveform(skip ? null : id);
} = useTranscriptWaveform(skip ? null : parseMaybeNonEmptyString(id));
return {
waveform: waveform || null,

View File

@@ -23,7 +23,16 @@ const useWebRTC = (
let p: Peer;
try {
p = new Peer({ initiator: true, stream: stream });
p = new Peer({
initiator: true,
stream: stream,
// Disable trickle ICE: single SDP exchange (offer + answer) with all candidates.
// Required for HTTP-based signaling; trickle needs WebSocket for candidate exchange.
trickle: false,
config: {
iceServers: [{ urls: "stun:stun.l.google.com:19302" }],
},
});
} catch (error) {
setError(error as Error, "Error creating WebRTC");
return;

View File

@@ -7,6 +7,12 @@ type GetTranscriptSegmentTopic =
components["schemas"]["GetTranscriptSegmentTopic"];
import { useQueryClient } from "@tanstack/react-query";
import { $api, WEBSOCKET_URL } from "../../lib/apiClient";
import {
invalidateTranscript,
invalidateTranscriptTopics,
invalidateTranscriptWaveform,
} from "../../lib/apiHooks";
import { NonEmptyString } from "../../lib/utils";
export type UseWebSockets = {
transcriptTextLive: string;
@@ -369,15 +375,10 @@ export const useWebSockets = (transcriptId: string | null): UseWebSockets => {
});
console.debug("TOPIC event:", message.data);
// Invalidate topics query to sync with WebSocket data
queryClient.invalidateQueries({
queryKey: $api.queryOptions(
"get",
"/v1/transcripts/{transcript_id}/topics",
{
params: { path: { transcript_id: transcriptId } },
},
).queryKey,
});
invalidateTranscriptTopics(
queryClient,
transcriptId as NonEmptyString,
);
break;
case "FINAL_SHORT_SUMMARY":
@@ -388,15 +389,7 @@ export const useWebSockets = (transcriptId: string | null): UseWebSockets => {
if (message.data) {
setFinalSummary(message.data);
// Invalidate transcript query to sync summary
queryClient.invalidateQueries({
queryKey: $api.queryOptions(
"get",
"/v1/transcripts/{transcript_id}",
{
params: { path: { transcript_id: transcriptId } },
},
).queryKey,
});
invalidateTranscript(queryClient, transcriptId as NonEmptyString);
}
break;
@@ -405,15 +398,7 @@ export const useWebSockets = (transcriptId: string | null): UseWebSockets => {
if (message.data) {
setTitle(message.data.title);
// Invalidate transcript query to sync title
queryClient.invalidateQueries({
queryKey: $api.queryOptions(
"get",
"/v1/transcripts/{transcript_id}",
{
params: { path: { transcript_id: transcriptId } },
},
).queryKey,
});
invalidateTranscript(queryClient, transcriptId as NonEmptyString);
}
break;
@@ -424,6 +409,10 @@ export const useWebSockets = (transcriptId: string | null): UseWebSockets => {
);
if (message.data) {
setWaveForm(message.data.waveform);
invalidateTranscriptWaveform(
queryClient,
transcriptId as NonEmptyString,
);
}
break;
case "DURATION":
@@ -442,6 +431,7 @@ export const useWebSockets = (transcriptId: string | null): UseWebSockets => {
);
}
setStatus(message.data);
invalidateTranscript(queryClient, transcriptId as NonEmptyString);
if (message.data.value === "ended") {
ws.close();
}

View File

@@ -26,7 +26,7 @@ import { useRouter } from "next/navigation";
import { formatDateTime, formatStartedAgo } from "../lib/timeUtils";
import MeetingMinimalHeader from "../components/MeetingMinimalHeader";
import { NonEmptyString } from "../lib/utils";
import { MeetingId } from "../lib/types";
import { MeetingId, assertMeetingId } from "../lib/types";
type Meeting = components["schemas"]["Meeting"];
@@ -315,7 +315,9 @@ export default function MeetingSelection({
variant="outline"
colorScheme="red"
size="md"
onClick={() => handleEndMeeting(meeting.id)}
onClick={() =>
handleEndMeeting(assertMeetingId(meeting.id))
}
loading={deactivateMeetingMutation.isPending}
>
<Icon as={LuX} me={2} />
@@ -460,7 +462,9 @@ export default function MeetingSelection({
variant="outline"
colorScheme="red"
size="md"
onClick={() => handleEndMeeting(meeting.id)}
onClick={() =>
handleEndMeeting(assertMeetingId(meeting.id))
}
loading={deactivateMeetingMutation.isPending}
>
<Icon as={LuX} me={2} />

View File

@@ -22,14 +22,29 @@ import DailyIframe, {
import type { components } from "../../reflector-api";
import { useAuth } from "../../lib/AuthProvider";
import { useConsentDialog } from "../../lib/consent";
import { useRoomJoinMeeting } from "../../lib/apiHooks";
import {
useRoomJoinMeeting,
useMeetingStartRecording,
} from "../../lib/apiHooks";
import { omit } from "remeda";
import { assertExists } from "../../lib/utils";
import { assertMeetingId } from "../../lib/types";
import {
assertExists,
NonEmptyString,
parseNonEmptyString,
} from "../../lib/utils";
import { assertMeetingId, DailyRecordingType } from "../../lib/types";
import { useUuidV5 } from "react-uuid-hook";
const CONSENT_BUTTON_ID = "recording-consent";
const RECORDING_INDICATOR_ID = "recording-indicator";
// Namespace UUID for UUIDv5 generation of raw-tracks instanceIds
// DO NOT CHANGE: Breaks instanceId determinism across deployments
const RAW_TRACKS_NAMESPACE = "a1b2c3d4-e5f6-7890-abcd-ef1234567890";
const RECORDING_START_DELAY_MS = 2000;
const RECORDING_START_MAX_RETRIES = 5;
type Meeting = components["schemas"]["Meeting"];
type Room = components["schemas"]["RoomDetails"];
@@ -73,9 +88,7 @@ const useFrame = (
cbs: {
onLeftMeeting: () => void;
onCustomButtonClick: (ev: DailyEventObjectCustomButtonClick) => void;
onJoinMeeting: (
startRecording: (args: { type: "raw-tracks" }) => void,
) => void;
onJoinMeeting: () => void;
},
) => {
const [{ frame, joined }, setState] = useState(USE_FRAME_INIT_STATE);
@@ -126,7 +139,7 @@ const useFrame = (
console.error("frame is null in joined-meeting callback");
return;
}
cbs.onJoinMeeting(frame.startRecording.bind(frame));
cbs.onJoinMeeting();
};
frame.on("joined-meeting", joinCb);
return () => {
@@ -173,8 +186,15 @@ export default function DailyRoom({ meeting, room }: DailyRoomProps) {
const authLastUserId = auth.lastUserId;
const [container, setContainer] = useState<HTMLDivElement | null>(null);
const joinMutation = useRoomJoinMeeting();
const startRecordingMutation = useMeetingStartRecording();
const [joinedMeeting, setJoinedMeeting] = useState<Meeting | null>(null);
// Generate deterministic instanceIds so all participants use SAME IDs
const cloudInstanceId = parseNonEmptyString(meeting.id);
const rawTracksInstanceId = parseNonEmptyString(
useUuidV5(meeting.id, RAW_TRACKS_NAMESPACE)[0],
);
const roomName = params?.roomName as string;
const {
@@ -228,19 +248,72 @@ export default function DailyRoom({ meeting, room }: DailyRoomProps) {
],
);
const handleFrameJoinMeeting = useCallback(
(startRecording: (args: { type: "raw-tracks" }) => void) => {
try {
if (meeting.recording_type === "cloud") {
console.log("Starting cloud recording");
startRecording({ type: "raw-tracks" });
}
} catch (error) {
console.error("Failed to start recording:", error);
}
},
[meeting.recording_type],
);
const handleFrameJoinMeeting = useCallback(() => {
if (meeting.recording_type === "cloud") {
console.log("Starting dual recording via REST API", {
cloudInstanceId,
rawTracksInstanceId,
});
// Start both cloud and raw-tracks via backend REST API (with retry on 404)
// Daily.co needs time to register call as "hosting" for REST API
const startRecordingWithRetry = (
type: DailyRecordingType,
instanceId: NonEmptyString,
attempt: number = 1,
) => {
setTimeout(() => {
startRecordingMutation.mutate(
{
params: {
path: {
meeting_id: meeting.id,
},
},
body: {
type,
instanceId,
},
},
{
onError: (error: any) => {
const errorText = error?.detail || error?.message || "";
const is404NotHosting = errorText.includes(
"does not seem to be hosting a call",
);
const isActiveStream = errorText.includes(
"has an active stream",
);
if (is404NotHosting && attempt < RECORDING_START_MAX_RETRIES) {
console.log(
`${type}: Call not hosting yet, retry ${attempt + 1}/${RECORDING_START_MAX_RETRIES} in ${RECORDING_START_DELAY_MS}ms...`,
);
startRecordingWithRetry(type, instanceId, attempt + 1);
} else if (isActiveStream) {
console.log(
`${type}: Recording already active (started by another participant)`,
);
} else {
console.error(`Failed to start ${type} recording:`, error);
}
},
},
);
}, RECORDING_START_DELAY_MS);
};
// Start both recordings
startRecordingWithRetry("cloud", cloudInstanceId);
startRecordingWithRetry("raw-tracks", rawTracksInstanceId);
}
}, [
meeting.recording_type,
meeting.id,
startRecordingMutation,
cloudInstanceId,
rawTracksInstanceId,
]);
const recordingIconUrl = useMemo(
() => new URL("/recording-icon.svg", window.location.origin),

View File

@@ -6,6 +6,7 @@ import { QueryClient, useQueryClient } from "@tanstack/react-query";
import type { components } from "../reflector-api";
import { useAuth } from "./AuthProvider";
import { MeetingId } from "./types";
import { NonEmptyString } from "./utils";
/*
* XXX error types returned from the hooks are not always correct; declared types are ValidationError but real type could be string or any other
@@ -103,7 +104,7 @@ export function useTranscriptProcess() {
});
}
export function useTranscriptGet(transcriptId: string | null) {
export function useTranscriptGet(transcriptId: NonEmptyString | null) {
return $api.useQuery(
"get",
"/v1/transcripts/{transcript_id}",
@@ -120,6 +121,16 @@ export function useTranscriptGet(transcriptId: string | null) {
);
}
export const invalidateTranscript = (
queryClient: QueryClient,
transcriptId: NonEmptyString,
) =>
queryClient.invalidateQueries({
queryKey: $api.queryOptions("get", "/v1/transcripts/{transcript_id}", {
params: { path: { transcript_id: transcriptId } },
}).queryKey,
});
export function useRoomGet(roomId: string | null) {
const { isAuthenticated } = useAuthReady();
@@ -297,7 +308,7 @@ export function useTranscriptUploadAudio() {
);
}
export function useTranscriptWaveform(transcriptId: string | null) {
export function useTranscriptWaveform(transcriptId: NonEmptyString | null) {
return $api.useQuery(
"get",
"/v1/transcripts/{transcript_id}/audio/waveform",
@@ -312,7 +323,21 @@ export function useTranscriptWaveform(transcriptId: string | null) {
);
}
export function useTranscriptMP3(transcriptId: string | null) {
export const invalidateTranscriptWaveform = (
queryClient: QueryClient,
transcriptId: NonEmptyString,
) =>
queryClient.invalidateQueries({
queryKey: $api.queryOptions(
"get",
"/v1/transcripts/{transcript_id}/audio/waveform",
{
params: { path: { transcript_id: transcriptId } },
},
).queryKey,
});
export function useTranscriptMP3(transcriptId: NonEmptyString | null) {
const { isAuthenticated } = useAuthReady();
return $api.useQuery(
@@ -329,7 +354,7 @@ export function useTranscriptMP3(transcriptId: string | null) {
);
}
export function useTranscriptTopics(transcriptId: string | null) {
export function useTranscriptTopics(transcriptId: NonEmptyString | null) {
return $api.useQuery(
"get",
"/v1/transcripts/{transcript_id}/topics",
@@ -344,7 +369,23 @@ export function useTranscriptTopics(transcriptId: string | null) {
);
}
export function useTranscriptTopicsWithWords(transcriptId: string | null) {
export const invalidateTranscriptTopics = (
queryClient: QueryClient,
transcriptId: NonEmptyString,
) =>
queryClient.invalidateQueries({
queryKey: $api.queryOptions(
"get",
"/v1/transcripts/{transcript_id}/topics",
{
params: { path: { transcript_id: transcriptId } },
},
).queryKey,
});
export function useTranscriptTopicsWithWords(
transcriptId: NonEmptyString | null,
) {
const { isAuthenticated } = useAuthReady();
return $api.useQuery(
@@ -362,7 +403,7 @@ export function useTranscriptTopicsWithWords(transcriptId: string | null) {
}
export function useTranscriptTopicsWithWordsPerSpeaker(
transcriptId: string | null,
transcriptId: NonEmptyString | null,
topicId: string | null,
) {
const { isAuthenticated } = useAuthReady();
@@ -384,7 +425,7 @@ export function useTranscriptTopicsWithWordsPerSpeaker(
);
}
export function useTranscriptParticipants(transcriptId: string | null) {
export function useTranscriptParticipants(transcriptId: NonEmptyString | null) {
const { isAuthenticated } = useAuthReady();
return $api.useQuery(
@@ -567,6 +608,20 @@ export function useTranscriptSpeakerMerge() {
);
}
export function useMeetingStartRecording() {
const { setError } = useError();
return $api.useMutation(
"post",
"/v1/meetings/{meeting_id}/recordings/start",
{
onError: (error) => {
setError(error as Error, "Failed to start recording");
},
},
);
}
export function useMeetingAudioConsent() {
const { setError } = useError();

View File

@@ -1,5 +1,6 @@
import { components } from "../reflector-api";
type ApiTranscriptStatus = components["schemas"]["GetTranscript"]["status"];
type ApiTranscriptStatus =
components["schemas"]["GetTranscriptWithParticipants"]["status"];
export type TranscriptStatus = ApiTranscriptStatus;

Some files were not shown because too many files have changed in this diff Show More