Compare commits

...

13 Commits

Author SHA1 Message Date
Juan Diego García
b570d202dc chore(main): release 0.43.0 (#940) 2026-03-31 19:27:00 -05:00
Juan Diego García
8c4f5e9c0f fix: cpu usage + email improvements (#944)
* fix: cpu usage on server ws manager, 100% to 0% on idle

* fix: change email icon to white and prefill email in daily room for authenticated users

* fix: improve email sending with full ts transcript
2026-03-31 16:34:10 -05:00
Juan Diego García
ec8b49738e feat: show trash for soft deleted transcripts and hard delete option (#942)
* feat: show trash for soft deleted transcripts and hard delete option

* fix: test fixtures

* docs: aws new permissions
2026-03-31 13:15:52 -05:00
Juan Diego García
cc9c5cd4a5 fix: add parakeet as default transcriber and fix diarizer image (#939) 2026-03-31 10:22:57 -05:00
Juan Diego García
61d6fbd344 chore(main): release 0.42.0 (#935) 2026-03-30 18:48:27 -05:00
Juan Diego García
7b3b5b9858 fix: remove share public from integration tests (#938) 2026-03-30 18:02:56 -05:00
Juan Diego García
a22789d548 fix: grpc tls for local hatchet (#937) 2026-03-30 17:46:23 -05:00
dependabot[bot]
e3cc646cf5 build(deps): bump the npm_and_yarn group across 1 directory with 2 updates (#934)
Bumps the npm_and_yarn group with 2 updates in the /docs directory: [brace-expansion](https://github.com/juliangruber/brace-expansion) and [path-to-regexp](https://github.com/pillarjs/path-to-regexp).


Updates `brace-expansion` from 1.1.12 to 1.1.13
- [Release notes](https://github.com/juliangruber/brace-expansion/releases)
- [Commits](https://github.com/juliangruber/brace-expansion/compare/v1.1.12...v1.1.13)

Updates `path-to-regexp` from 0.1.12 to 0.1.13
- [Release notes](https://github.com/pillarjs/path-to-regexp/releases)
- [Changelog](https://github.com/pillarjs/path-to-regexp/blob/v.0.1.13/History.md)
- [Commits](https://github.com/pillarjs/path-to-regexp/compare/v0.1.12...v.0.1.13)

---
updated-dependencies:
- dependency-name: brace-expansion
  dependency-version: 1.1.13
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: path-to-regexp
  dependency-version: 0.1.13
  dependency-type: indirect
  dependency-group: npm_and_yarn
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-30 17:38:52 -05:00
dependabot[bot]
778ff6268c build(deps): bump cryptography (#932)
Bumps the uv group with 1 update in the /server directory: [cryptography](https://github.com/pyca/cryptography).


Updates `cryptography` from 46.0.5 to 46.0.6
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/46.0.5...46.0.6)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-version: 46.0.6
  dependency-type: indirect
  dependency-group: uv
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-30 17:38:37 -05:00
Juan Diego García
d164e486cc feat: mixdown modal services + processor pattern (#936)
* allow memory flags and per service config

* feat: mixdown modal services + processor pattern
2026-03-30 17:38:23 -05:00
Juan Diego García
12bf0c2d77 feat: custom ca for caddy (#931)
* fix: send email on transcript page permissions fixed

* feat: custom ca for caddy
2026-03-30 11:42:39 -05:00
dependabot[bot]
bfaf4f403b build(deps): bump the uv group across 2 directories with 1 update (#930)
Bumps the uv group with 1 update in the /gpu/self_hosted directory: [requests](https://github.com/psf/requests).
Bumps the uv group with 1 update in the /server directory: [requests](https://github.com/psf/requests).


Updates `requests` from 2.32.5 to 2.33.0
- [Release notes](https://github.com/psf/requests/releases)
- [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md)
- [Commits](https://github.com/psf/requests/compare/v2.32.5...v2.33.0)

Updates `requests` from 2.32.4 to 2.33.0
- [Release notes](https://github.com/psf/requests/releases)
- [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md)
- [Commits](https://github.com/psf/requests/compare/v2.32.4...v2.33.0)

---
updated-dependencies:
- dependency-name: requests
  dependency-version: 2.33.0
  dependency-type: indirect
  dependency-group: uv
- dependency-name: requests
  dependency-version: 2.33.0
  dependency-type: direct:production
  dependency-group: uv
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-26 10:27:11 -05:00
dependabot[bot]
0258754a4c build(deps): bump picomatch (#929)
Bumps the npm_and_yarn group with 1 update in the /docs directory: [picomatch](https://github.com/micromatch/picomatch).


Updates `picomatch` from 2.3.1 to 2.3.2
- [Release notes](https://github.com/micromatch/picomatch/releases)
- [Changelog](https://github.com/micromatch/picomatch/blob/master/CHANGELOG.md)
- [Commits](https://github.com/micromatch/picomatch/compare/2.3.1...2.3.2)

---
updated-dependencies:
- dependency-name: picomatch
  dependency-version: 2.3.2
  dependency-type: indirect
  dependency-group: npm_and_yarn
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-26 10:26:16 -05:00
66 changed files with 4945 additions and 377 deletions

5
.gitignore vendored

@@ -24,5 +24,10 @@ www/.env.production
.secrets
opencode.json
certs/
docker-compose.ca.yml
docker-compose.gpu-ca.yml
Caddyfile.gpu-host
.env.gpu-host
vibedocs/
server/tests/integration/logs/


@@ -1,5 +1,32 @@
# Changelog
## [0.43.0](https://github.com/GreyhavenHQ/reflector/compare/v0.42.0...v0.43.0) (2026-03-31)
### Features
* show trash for soft deleted transcripts and hard delete option ([#942](https://github.com/GreyhavenHQ/reflector/issues/942)) ([ec8b497](https://github.com/GreyhavenHQ/reflector/commit/ec8b49738e8e76f6e5d2496a42cb454ef6c2d7c7))
### Bug Fixes
* add parakeet as default transcriber and fix diarizer image ([#939](https://github.com/GreyhavenHQ/reflector/issues/939)) ([cc9c5cd](https://github.com/GreyhavenHQ/reflector/commit/cc9c5cd4a5f4123ef957ad82461ca37a727d1ba6))
* cpu usage + email improvements ([#944](https://github.com/GreyhavenHQ/reflector/issues/944)) ([8c4f5e9](https://github.com/GreyhavenHQ/reflector/commit/8c4f5e9c0f893f4cb029595505b53136f04760f4))
## [0.42.0](https://github.com/GreyhavenHQ/reflector/compare/v0.41.0...v0.42.0) (2026-03-30)
### Features
* custom ca for caddy ([#931](https://github.com/GreyhavenHQ/reflector/issues/931)) ([12bf0c2](https://github.com/GreyhavenHQ/reflector/commit/12bf0c2d77f9915b79b1eb1decd77ed2dadbb31d))
* mixdown modal services + processor pattern ([#936](https://github.com/GreyhavenHQ/reflector/issues/936)) ([d164e48](https://github.com/GreyhavenHQ/reflector/commit/d164e486cc33ff8babf6cff6c163893cfc56fd76))
### Bug Fixes
* grpc tls for local hatchet ([#937](https://github.com/GreyhavenHQ/reflector/issues/937)) ([a22789d](https://github.com/GreyhavenHQ/reflector/commit/a22789d5486bf8b83e33ab2fb5eb3ee9799c6d47))
* remove share public from integration tests ([#938](https://github.com/GreyhavenHQ/reflector/issues/938)) ([7b3b5b9](https://github.com/GreyhavenHQ/reflector/commit/7b3b5b98586449afd0b6996ba9fd7aec8308bbc6))
## [0.41.0](https://github.com/GreyhavenHQ/reflector/compare/v0.40.0...v0.41.0) (2026-03-25)


@@ -193,6 +193,11 @@ Modal.com integration for scalable ML processing:
If you need to do any worker/pipeline related work, search for "Pipeline" classes and their "create" or "build" methods to find the main processor sequence. Look for task orchestration patterns (like "chord", "group", or "chain") to identify the post-processing flow with parallel execution chains. This will give you an abstract view of how the processing pipeline is organized.
## Documentation
- New documentation files go in `docsv2/`, not in `docs/docs/`.
- Existing `docs/` directory contains legacy Docusaurus docs.
## Code Style
- Always put imports at the top of the file. Let ruff/pre-commit handle sorting and formatting of imports.

106
docker-compose.gpu-host.yml Normal file

@@ -0,0 +1,106 @@
# Standalone GPU host for Reflector — transcription, diarization, translation.
#
# Usage: ./scripts/setup-gpu-host.sh [--domain DOMAIN] [--custom-ca PATH] [--api-key KEY] [--cpu]
# or: docker compose -f docker-compose.gpu-host.yml --profile gpu [--profile caddy] up -d
#
# Processing mode (pick ONE — mutually exclusive, both bind port 8000):
# --profile gpu NVIDIA GPU container (requires nvidia-container-toolkit)
# --profile cpu CPU-only container (no GPU required, slower)
#
# Optional:
# --profile caddy Caddy reverse proxy with HTTPS
#
# This file is checked into the repo. The setup script generates:
# - .env.gpu-host (HF_TOKEN, API key, port config)
# - Caddyfile.gpu-host (Caddy config, only with --domain)
# - docker-compose.gpu-ca.yml (CA cert mounts, only with --custom-ca)
services:
  # ===========================================================
  # GPU service — NVIDIA GPU accelerated
  # Activated with: --profile gpu
  # ===========================================================
  gpu:
    build:
      context: ./gpu/self_hosted
      dockerfile: Dockerfile
    profiles: [gpu]
    restart: unless-stopped
    ports:
      - "${GPU_HOST_PORT:-8000}:8000"
    environment:
      HF_TOKEN: ${HF_TOKEN:-}
      REFLECTOR_GPU_APIKEY: ${REFLECTOR_GPU_APIKEY:-}
    volumes:
      - gpu_cache:/root/.cache
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/docs"]
      interval: 15s
      timeout: 5s
      retries: 10
      start_period: 120s
    networks:
      default:
        aliases:
          - transcription

  # ===========================================================
  # CPU service — no GPU required, uses Dockerfile.cpu
  # Activated with: --profile cpu
  # Mutually exclusive with gpu (both bind port 8000)
  # ===========================================================
  cpu:
    build:
      context: ./gpu/self_hosted
      dockerfile: Dockerfile.cpu
    profiles: [cpu]
    restart: unless-stopped
    ports:
      - "${GPU_HOST_PORT:-8000}:8000"
    environment:
      HF_TOKEN: ${HF_TOKEN:-}
      REFLECTOR_GPU_APIKEY: ${REFLECTOR_GPU_APIKEY:-}
    volumes:
      - gpu_cache:/root/.cache
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/docs"]
      interval: 15s
      timeout: 5s
      retries: 10
      start_period: 120s
    networks:
      default:
        aliases:
          - transcription

  # ===========================================================
  # Caddy — reverse proxy with HTTPS (optional)
  # Activated with: --profile caddy
  # Proxies to "transcription" network alias (works for both gpu and cpu)
  # ===========================================================
  caddy:
    image: caddy:2-alpine
    profiles: [caddy]
    restart: unless-stopped
    ports:
      - "80:80"
      - "${CADDY_HTTPS_PORT:-443}:443"
    volumes:
      - ./Caddyfile.gpu-host:/etc/caddy/Caddyfile:ro
      - caddy_data:/data
      - caddy_config:/config

volumes:
  gpu_cache:
  caddy_data:
  caddy_config:


@@ -95,6 +95,12 @@ DAILYCO_STORAGE_AWS_BUCKET_NAME=<your-bucket-from-daily-setup>
DAILYCO_STORAGE_AWS_REGION=us-east-1
DAILYCO_STORAGE_AWS_ROLE_ARN=<your-role-arn-from-daily-setup>
# Worker credentials for reading/deleting recordings from Daily's S3 bucket.
# Required when transcript storage uses a different bucket or credentials
# (e.g., selfhosted with Garage or a separate S3 account).
DAILYCO_STORAGE_AWS_ACCESS_KEY_ID=<your-aws-access-key>
DAILYCO_STORAGE_AWS_SECRET_ACCESS_KEY=<your-aws-secret-key>
# Transcript storage (should already be configured from main setup)
# TRANSCRIPT_STORAGE_BACKEND=aws
# TRANSCRIPT_STORAGE_AWS_ACCESS_KEY_ID=<your-key>
@@ -103,6 +109,19 @@ DAILYCO_STORAGE_AWS_ROLE_ARN=<your-role-arn-from-daily-setup>
# TRANSCRIPT_STORAGE_AWS_REGION=<your-bucket-region>
```
:::info Two separate credential sets for Daily.co
- **`ROLE_ARN`** — Used by Daily's API to *write* recordings into your S3 bucket (configured via Daily dashboard).
- **`ACCESS_KEY_ID` / `SECRET_ACCESS_KEY`** — Used by Reflector workers to *read* recordings for transcription and *delete* them on consent denial or permanent transcript deletion.
Required IAM permissions for the worker key on the Daily recordings bucket:
- `s3:GetObject` — Download recording files for processing
- `s3:DeleteObject` — Remove files on consent denial, trash destroy, or data retention cleanup
- `s3:ListBucket` — Scan for recordings needing reprocessing
If the worker keys are not set, Reflector falls back to the transcript storage master key, which then needs cross-bucket access to the Daily bucket.
:::
---
## Restart Services

76
docs/pnpm-lock.yaml generated

@@ -701,6 +701,10 @@ packages:
resolution: {integrity: sha512-05WQkdpL9COIMz4LjTxGpPNCdlpyimKppYNoJ5Di5EUObifl8t4tuLuUBBZEpoLYOmfvIWrsp9fCl0HoPRVTdA==}
engines: {node: '>=6.9.0'}
'@babel/runtime@7.29.2':
resolution: {integrity: sha512-JiDShH45zKHWyGe4ZNVRrCjBz8Nh9TMmZG1kh4QTK8hCBTWBi8Da+i7s1fJw7/lYpM4ccepSNfqzZ/QvABBi5g==}
engines: {node: '>=6.9.0'}
'@babel/template@7.28.6':
resolution: {integrity: sha512-YA6Ma2KsCdGb+WC6UpBVFJGXL58MDA6oyONbjyF/+5sBgxY/dwkhLogbMT2GXXyU84/IhRw/2D1Os1B/giz+BQ==}
engines: {node: '>=6.9.0'}
@@ -1490,42 +1494,36 @@ packages:
engines: {node: '>= 10.0.0'}
cpu: [arm]
os: [linux]
libc: [glibc]
'@parcel/watcher-linux-arm-musl@2.5.6':
resolution: {integrity: sha512-Ve3gUCG57nuUUSyjBq/MAM0CzArtuIOxsBdQ+ftz6ho8n7s1i9E1Nmk/xmP323r2YL0SONs1EuwqBp2u1k5fxg==}
engines: {node: '>= 10.0.0'}
cpu: [arm]
os: [linux]
libc: [musl]
'@parcel/watcher-linux-arm64-glibc@2.5.6':
resolution: {integrity: sha512-f2g/DT3NhGPdBmMWYoxixqYr3v/UXcmLOYy16Bx0TM20Tchduwr4EaCbmxh1321TABqPGDpS8D/ggOTaljijOA==}
engines: {node: '>= 10.0.0'}
cpu: [arm64]
os: [linux]
libc: [glibc]
'@parcel/watcher-linux-arm64-musl@2.5.6':
resolution: {integrity: sha512-qb6naMDGlbCwdhLj6hgoVKJl2odL34z2sqkC7Z6kzir8b5W65WYDpLB6R06KabvZdgoHI/zxke4b3zR0wAbDTA==}
engines: {node: '>= 10.0.0'}
cpu: [arm64]
os: [linux]
libc: [musl]
'@parcel/watcher-linux-x64-glibc@2.5.6':
resolution: {integrity: sha512-kbT5wvNQlx7NaGjzPFu8nVIW1rWqV780O7ZtkjuWaPUgpv2NMFpjYERVi0UYj1msZNyCzGlaCWEtzc+exjMGbQ==}
engines: {node: '>= 10.0.0'}
cpu: [x64]
os: [linux]
libc: [glibc]
'@parcel/watcher-linux-x64-musl@2.5.6':
resolution: {integrity: sha512-1JRFeC+h7RdXwldHzTsmdtYR/Ku8SylLgTU/reMuqdVD7CtLwf0VR1FqeprZ0eHQkO0vqsbvFLXUmYm/uNKJBg==}
engines: {node: '>= 10.0.0'}
cpu: [x64]
os: [linux]
libc: [musl]
'@parcel/watcher-win32-arm64@2.5.6':
resolution: {integrity: sha512-3ukyebjc6eGlw9yRt678DxVF7rjXatWiHvTXqphZLvo7aC5NdEgFufVwjFfY51ijYEWpXbqF5jtrK275z52D4Q==}
@@ -2254,11 +2252,11 @@ packages:
resolution: {integrity: sha512-2hCgjEmP8YLWQ130n2FerGv7rYpfBmnmp9Uy2Le1vge6X3gZIfSmEzP5QTDElFxcvVcXlEn8Aq6MU/PZygIOog==}
engines: {node: '>=14.16'}
brace-expansion@1.1.12:
resolution: {integrity: sha512-9T9UjW3r0UW5c1Q7GTwllptXwhvYmEzFhzMfZ9H7FQWt+uZePjZPjBP/W1ZEyZ1twGWom5/56TF4lPcqjnDHcg==}
brace-expansion@1.1.13:
resolution: {integrity: sha512-9ZLprWS6EENmhEOpjCYW2c8VkmOvckIJZfkr7rBW6dObmfgJ/L1GpSYW5Hpo9lDz4D1+n0Ckz8rU7FwHDQiG/w==}
brace-expansion@2.0.2:
resolution: {integrity: sha512-Jt0vHyM+jmUBqojB7E1NIYadt0vI0Qxjxd2TErW94wDz+E2LAm5vKMXXwg6ZZBTHPuUlDgQHKXvjGBdfcF1ZDQ==}
brace-expansion@2.0.3:
resolution: {integrity: sha512-MCV/fYJEbqx68aE58kv2cA/kiky1G8vux3OR6/jbS+jIMe/6fJWa0DTzJU7dqijOWYwHi1t29FlfYI9uytqlpA==}
braces@3.0.3:
resolution: {integrity: sha512-yQbXgO/OSZVD2IsiLlro+7Hf6Q18EJrKSEsdoMzKePKXct3gvD8oLcOQdIzGupr5Fj+EDe8gO/lxc1BzfMpxvA==}
@@ -3410,8 +3408,8 @@ packages:
graphlib@2.1.8:
resolution: {integrity: sha512-jcLLfkpoVGmH7/InMC/1hIvOPSUh38oJtGhvrOFGzioE1DZ+0YW16RgmOJhHiuWTvGiJQ9Z1Ik43JvkRPRvE+A==}
gray-matter@https://codeload.github.com/jonschlinkert/gray-matter/tar.gz/234163e317c87fe031e9368ffabde9c9149ce3ec:
resolution: {tarball: https://codeload.github.com/jonschlinkert/gray-matter/tar.gz/234163e317c87fe031e9368ffabde9c9149ce3ec}
gray-matter@https://codeload.github.com/jonschlinkert/gray-matter/tar.gz/234163e:
resolution: {tarball: https://codeload.github.com/jonschlinkert/gray-matter/tar.gz/234163e}
version: 4.0.3
engines: {node: '>=6.0'}
@@ -4533,8 +4531,8 @@ packages:
path-parse@1.0.7:
resolution: {integrity: sha512-LDJzPVEEEPR+y48z93A0Ed0yXb8pAByGWo/k5YYdYgpY2/2EsOsksJrq7lOHxryrVOn1ejG6oAp8ahvOIQD8sw==}
path-to-regexp@0.1.12:
resolution: {integrity: sha512-RA1GjUVMnvYFxuqovrEqZoxxW5NUZqbwKtYz/Tt7nXerk0LbLblQmrsgdeOxV5SFHf0UDggjS/bSeOZwt1pmEQ==}
path-to-regexp@0.1.13:
resolution: {integrity: sha512-A/AGNMFN3c8bOlvV9RreMdrv7jsmF9XIfDeCd87+I8RNg6s78BhJxMu69NEMHBSJFxKidViTEdruRwEk/WIKqA==}
path-to-regexp@1.9.0:
resolution: {integrity: sha512-xIp7/apCFJuUHdDLWe8O1HIkb0kQrOMb/0u6FXQjemHn/ii5LrIzU6bdECnsiTF/GjZkMEKg1xdiZwNqDYlZ6g==}
@@ -4555,12 +4553,12 @@ packages:
picocolors@1.1.1:
resolution: {integrity: sha512-xceH2snhtb5M9liqDsmEw56le376mTZkEX/jEb/RxNFyegNul7eNslCXP9FDj/Lcu0X8KEyMceP2ntpaHrDEVA==}
picomatch@2.3.1:
resolution: {integrity: sha512-JU3teHTNjmE2VCGFzuY8EXzCDVwEqB2a8fsIvwaStHhAWJEeVd1o1QD80CU6+ZdEXXSLbSsuLwJjkCBWqRQUVA==}
picomatch@2.3.2:
resolution: {integrity: sha512-V7+vQEJ06Z+c5tSye8S+nHUfI51xoXIXjHQ99cQtKUkQqqO1kO/KCJUfZXuB47h/YBlDhah2H3hdUGXn8ie0oA==}
engines: {node: '>=8.6'}
picomatch@4.0.3:
resolution: {integrity: sha512-5gTmgEY/sqK6gFXLIsQNH19lWb4ebPDLA4SdLP7dsWkIXHWlG66oPuVvXSGFPppYZz8ZDZq0dYYrbHfBCVUb1Q==}
picomatch@4.0.4:
resolution: {integrity: sha512-QP88BAKvMam/3NxH6vj2o21R6MjxZUAd6nlwAS/pnGvN9IVLocLHxGYIzFhg6fUQ+5th6P4dv4eW9jX3DSIj7A==}
engines: {node: '>=12'}
pirates@4.0.7:
@@ -7024,6 +7022,8 @@ snapshots:
'@babel/runtime@7.28.6': {}
'@babel/runtime@7.29.2': {}
'@babel/template@7.28.6':
dependencies:
'@babel/code-frame': 7.29.0
@@ -8162,7 +8162,7 @@ snapshots:
fs-extra: 11.3.3
github-slugger: 1.5.0
globby: 11.1.0
gray-matter: https://codeload.github.com/jonschlinkert/gray-matter/tar.gz/234163e317c87fe031e9368ffabde9c9149ce3ec
gray-matter: https://codeload.github.com/jonschlinkert/gray-matter/tar.gz/234163e
jiti: 1.21.7
js-yaml: 4.1.1
lodash: 4.17.23
@@ -8473,7 +8473,7 @@ snapshots:
detect-libc: 2.1.2
is-glob: 4.0.3
node-addon-api: 7.1.1
picomatch: 4.0.3
picomatch: 4.0.4
optionalDependencies:
'@parcel/watcher-android-arm64': 2.5.6
'@parcel/watcher-darwin-arm64': 2.5.6
@@ -8645,7 +8645,7 @@ snapshots:
'@slorber/react-helmet-async@1.3.0(react-dom@19.2.4(react@19.2.4))(react@19.2.4)':
dependencies:
'@babel/runtime': 7.28.6
'@babel/runtime': 7.29.2
invariant: 2.2.4
prop-types: 15.8.1
react: 19.2.4
@@ -9244,7 +9244,7 @@ snapshots:
anymatch@3.1.3:
dependencies:
normalize-path: 3.0.0
picomatch: 2.3.1
picomatch: 2.3.2
arg@5.0.2: {}
@@ -9378,12 +9378,12 @@ snapshots:
widest-line: 4.0.1
wrap-ansi: 8.1.0
brace-expansion@1.1.12:
brace-expansion@1.1.13:
dependencies:
balanced-match: 1.0.2
concat-map: 0.0.1
brace-expansion@2.0.2:
brace-expansion@2.0.3:
dependencies:
balanced-match: 1.0.2
@@ -10436,7 +10436,7 @@ snapshots:
methods: 1.1.2
on-finished: 2.4.1
parseurl: 1.3.3
path-to-regexp: 0.1.12
path-to-regexp: 0.1.13
proxy-addr: 2.0.7
qs: 6.14.2
range-parser: 1.2.1
@@ -10485,9 +10485,9 @@ snapshots:
dependencies:
websocket-driver: 0.7.4
fdir@6.5.0(picomatch@4.0.3):
fdir@6.5.0(picomatch@4.0.4):
optionalDependencies:
picomatch: 4.0.3
picomatch: 4.0.4
feed@4.2.2:
dependencies:
@@ -10658,7 +10658,7 @@ snapshots:
dependencies:
lodash: 4.17.23
gray-matter@https://codeload.github.com/jonschlinkert/gray-matter/tar.gz/234163e317c87fe031e9368ffabde9c9149ce3ec:
gray-matter@https://codeload.github.com/jonschlinkert/gray-matter/tar.gz/234163e:
dependencies:
js-yaml: 4.1.1
kind-of: 6.0.3
@@ -11080,7 +11080,7 @@ snapshots:
chalk: 4.1.2
ci-info: 3.9.0
graceful-fs: 4.2.11
picomatch: 2.3.1
picomatch: 2.3.2
jest-worker@27.5.1:
dependencies:
@@ -11780,7 +11780,7 @@ snapshots:
micromatch@4.0.8:
dependencies:
braces: 3.0.3
picomatch: 2.3.1
picomatch: 2.3.2
mime-db@1.33.0: {}
@@ -11824,11 +11824,11 @@ snapshots:
minimatch@3.1.5:
dependencies:
brace-expansion: 1.1.12
brace-expansion: 1.1.13
minimatch@5.1.8:
dependencies:
brace-expansion: 2.0.2
brace-expansion: 2.0.3
minimist@1.2.8: {}
@@ -12127,7 +12127,7 @@ snapshots:
path-parse@1.0.7: {}
path-to-regexp@0.1.12: {}
path-to-regexp@0.1.13: {}
path-to-regexp@1.9.0:
dependencies:
@@ -12146,9 +12146,9 @@ snapshots:
picocolors@1.1.1: {}
picomatch@2.3.1: {}
picomatch@2.3.2: {}
picomatch@4.0.3: {}
picomatch@4.0.4: {}
pirates@4.0.7: {}
@@ -12852,7 +12852,7 @@ snapshots:
readdirp@3.6.0:
dependencies:
picomatch: 2.3.1
picomatch: 2.3.2
readdirp@4.1.2: {}
@@ -13510,8 +13510,8 @@ snapshots:
tinyglobby@0.2.15:
dependencies:
fdir: 6.5.0(picomatch@4.0.3)
picomatch: 4.0.3
fdir: 6.5.0(picomatch@4.0.4)
picomatch: 4.0.4
tinypool@1.1.1: {}

338
docsv2/custom-ca-setup.md Normal file

@@ -0,0 +1,338 @@
# Custom CA Certificate Setup
Use a private Certificate Authority (CA) with Reflector self-hosted deployments. This covers two scenarios:
1. **Custom local domain** — Serve Reflector over HTTPS on an internal domain (e.g., `reflector.local`) using certs signed by your own CA
2. **Backend CA trust** — Let Reflector's backend services (server, workers, GPU) make HTTPS calls to GPU, LLM, or other internal services behind your private CA
Both can be used independently or together.
## Quick Start
### Generate test certificates
```bash
./scripts/generate-certs.sh reflector.local
```
This creates `certs/` with:
- `ca.key` + `ca.crt` — Root CA (10-year validity)
- `server-key.pem` + `server.pem` — Server certificate (1-year, SAN: domain + localhost + 127.0.0.1)
### Deploy with custom CA + domain
```bash
# Add domain to /etc/hosts on the server (use 127.0.0.1 for local, or server LAN IP for network access)
echo "127.0.0.1 reflector.local" | sudo tee -a /etc/hosts
# Run setup — pass the certs directory
./scripts/setup-selfhosted.sh --gpu --caddy --domain reflector.local --custom-ca certs/
# Trust the CA on your machine (see "Trust the CA" section below)
```
### Deploy with CA trust only (GPU/LLM behind private CA)
```bash
# Only need the CA cert file — no Caddy TLS certs needed
./scripts/setup-selfhosted.sh --hosted --custom-ca /path/to/corporate-ca.crt
```
## How `--custom-ca` Works
The flag accepts a **directory** or a **single file**:
### Directory mode
```bash
--custom-ca certs/
```
Looks for these files by convention:
- `ca.crt` (required) — CA certificate to trust
- `server.pem` + `server-key.pem` (optional) — TLS certificate/key for Caddy
If `server.pem` + `server-key.pem` are found AND `--domain` is provided:
- Caddy serves HTTPS using those certs
- Backend containers trust the CA for outbound calls
If only `ca.crt` is found:
- Backend containers trust the CA for outbound calls
- Caddy is unaffected (uses Let's Encrypt, self-signed, or no Caddy)
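A directory that satisfies this convention looks like:
```
certs/
├── ca.crt           # required — CA certificate to trust
├── server.pem       # optional — TLS certificate for Caddy
└── server-key.pem   # optional — TLS private key for Caddy
```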
### Single file mode
```bash
--custom-ca /path/to/corporate-ca.crt
```
Only injects CA trust into backend containers. No Caddy TLS changes.
## Scenarios
### Scenario 1: Custom local domain
Your Reflector instance runs on an internal network. You want `https://reflector.local` with proper TLS (no browser warnings).
```bash
# 1. Generate certs
./scripts/generate-certs.sh reflector.local
# 2. Add to /etc/hosts on the server
echo "127.0.0.1 reflector.local" | sudo tee -a /etc/hosts
# 3. Deploy
./scripts/setup-selfhosted.sh --gpu --garage --caddy --domain reflector.local --custom-ca certs/
# 4. Trust the CA on your machine (see "Trust the CA" section below)
```
If other machines on the network need to access it, add the server's LAN IP to `/etc/hosts` on those machines instead:
```bash
echo "192.168.1.100 reflector.local" | sudo tee -a /etc/hosts
```
And include that IP as an extra SAN when generating certs:
```bash
./scripts/generate-certs.sh reflector.local "IP:192.168.1.100"
```
### Scenario 2: GPU/LLM behind corporate CA
Your GPU or LLM server (e.g., `https://gpu.internal.corp`) uses certificates signed by your corporate CA. Reflector's backend needs to trust that CA for outbound HTTPS calls.
```bash
# Get the CA certificate from your IT team (PEM format)
# Then deploy — Caddy can still use Let's Encrypt or self-signed
./scripts/setup-selfhosted.sh --hosted --garage --caddy --custom-ca /path/to/corporate-ca.crt
```
This works because:
- **TLS cert/key** = "this is my identity" — for Caddy to serve HTTPS to browsers
- **CA cert** = "I trust this authority" — for backend containers to verify outbound connections
Your Reflector frontend can use Let's Encrypt (public domain) or self-signed certs, while the backend trusts a completely different CA for GPU/LLM calls.
### Scenario 3: Both combined (same CA)
Custom domain + GPU/LLM all behind the same CA:
```bash
./scripts/generate-certs.sh reflector.local "DNS:gpu.local"
./scripts/setup-selfhosted.sh --gpu --garage --caddy --domain reflector.local --custom-ca certs/
```
### Scenario 4: Multiple CAs (local domain + remote GPU on different CA)
Your Reflector uses one CA for `reflector.local`, but the GPU host uses a different CA:
```bash
# Your local domain setup
./scripts/generate-certs.sh reflector.local
# Deploy with your CA + trust the GPU host's CA too
./scripts/setup-selfhosted.sh --hosted --garage --caddy \
--domain reflector.local \
--custom-ca certs/ \
--extra-ca /path/to/gpu-machine-ca.crt
```
`--extra-ca` appends additional CA certs to the trust bundle. Backend containers trust ALL CAs — your local domain AND the GPU host's certs both work.
You can repeat `--extra-ca` for multiple remote services:
```bash
--extra-ca /path/to/gpu-ca.crt --extra-ca /path/to/llm-ca.crt
```
For setting up a dedicated GPU host, see [Standalone GPU Host Setup](gpu-host-setup.md).
## Trust the CA on Client Machines
After deploying, clients need to trust the CA to avoid browser warnings.
### macOS
```bash
sudo security add-trusted-cert -d -r trustRoot \
-k /Library/Keychains/System.keychain certs/ca.crt
```
### Linux (Ubuntu/Debian)
```bash
sudo cp certs/ca.crt /usr/local/share/ca-certificates/reflector-ca.crt
sudo update-ca-certificates
```
### Linux (RHEL/Fedora)
```bash
sudo cp certs/ca.crt /etc/pki/ca-trust/source/anchors/reflector-ca.crt
sudo update-ca-trust
```
### Windows (PowerShell as admin)
```powershell
Import-Certificate -FilePath .\certs\ca.crt -CertStoreLocation Cert:\LocalMachine\Root
```
### Firefox (all platforms)
Firefox uses its own certificate store:
1. Settings > Privacy & Security > View Certificates
2. Authorities tab > Import
3. Select `ca.crt` and check "Trust this CA to identify websites"
## How It Works Internally
### Docker entrypoint CA injection
Each backend container (server, worker, beat, hatchet workers, GPU) has an entrypoint script (`docker-entrypoint.sh`) that:
1. Checks if a CA cert is mounted at `/usr/local/share/ca-certificates/custom-ca.crt`
2. If present, runs `update-ca-certificates` to create a **combined bundle** (system CAs + custom CA)
3. Sets environment variables so all Python/gRPC libraries use the combined bundle:
| Env var | Covers |
|---------|--------|
| `SSL_CERT_FILE` | httpx, OpenAI SDK, llama-index, Python ssl module |
| `REQUESTS_CA_BUNDLE` | requests library (transitive dependencies) |
| `CURL_CA_BUNDLE` | curl CLI (container healthchecks) |
Note: `GRPC_DEFAULT_SSL_ROOTS_FILE_PATH` is intentionally NOT set. Setting it causes grpcio to attempt TLS on internal Hatchet gRPC connections that run without TLS, resulting in handshake failures. The internal Hatchet connection uses `HATCHET_CLIENT_TLS_STRATEGY=none` (plaintext).
When no CA cert is mounted, the entrypoint is a no-op — containers behave exactly as before.
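In outline, the logic amounts to something like this. This is a sketch of the behavior described above, assuming a Debian-based image where `update-ca-certificates` writes the combined bundle to `/etc/ssl/certs/ca-certificates.crt`; the shipped `docker-entrypoint.sh` is the source of truth:
```bash
#!/bin/sh
# Sketch of the CA-injection entrypoint described above (illustrative).
CA_CERT=/usr/local/share/ca-certificates/custom-ca.crt
BUNDLE=/etc/ssl/certs/ca-certificates.crt

if [ -f "$CA_CERT" ]; then
    # Rebuild the system store: combined bundle = system CAs + custom CA
    update-ca-certificates
    # Point Python, requests, and curl at the combined bundle
    export SSL_CERT_FILE="$BUNDLE"
    export REQUESTS_CA_BUNDLE="$BUNDLE"
    export CURL_CA_BUNDLE="$BUNDLE"
    # GRPC_DEFAULT_SSL_ROOTS_FILE_PATH is intentionally left unset (see note above)
fi

exec "$@"  # hand off to the container's normal command
```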
### Why this replaces manual certifi patching
Previously, the workaround for trusting a private CA in Python was to patch certifi's bundle directly:
```bash
# OLD approach — fragile, do NOT use
cat custom-ca.crt >> $(python -c "import certifi; print(certifi.where())")
```
This breaks whenever certifi is updated (any `pip install`/`uv sync` overwrites the bundle and the CA is lost).
Our entrypoint approach is permanent because:
1. `SSL_CERT_FILE` is checked by Python's `ssl.create_default_context()` **before** falling back to `certifi.where()`. When set, certifi's bundle is never read.
2. `REQUESTS_CA_BUNDLE` similarly overrides certifi for the `requests` library.
3. The CA is injected at container startup (runtime), not baked into the Python environment. It survives image rebuilds, dependency updates, and `uv sync`.
```
Python SSL lookup chain:
ssl.create_default_context()
→ SSL_CERT_FILE env var? → YES → use combined bundle (system + custom CA) ✓
→ (certifi.where() is never reached)
```
This covers all outbound HTTPS calls: httpx (transcription, diarization, translation, webhooks), OpenAI SDK (transcription), llama-index (LLM/summarization), and requests (transitive dependencies).
### Compose override
The setup script generates `docker-compose.ca.yml` which mounts the CA cert into every backend container as a read-only bind mount. This file is:
- Only generated when `--custom-ca` is passed
- Deleted on re-runs without `--custom-ca` (prevents stale overrides)
- Added to `.gitignore`
### Node.js (frontend)
The web container uses `NODE_EXTRA_CA_CERTS` which **adds** to Node's trust store (unlike Python's `SSL_CERT_FILE` which replaces it). This is set via the compose override.
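The generated override is roughly this shape (a sketch only; the service list is abbreviated here, and the real file covers every backend container):
```yaml
# docker-compose.ca.yml — illustrative sketch, not the generated file verbatim
services:
  server:
    volumes:
      - ./certs/ca.crt:/usr/local/share/ca-certificates/custom-ca.crt:ro
  worker:
    volumes:
      - ./certs/ca.crt:/usr/local/share/ca-certificates/custom-ca.crt:ro
  web:
    environment:
      NODE_EXTRA_CA_CERTS: /usr/local/share/ca-certificates/custom-ca.crt
    volumes:
      - ./certs/ca.crt:/usr/local/share/ca-certificates/custom-ca.crt:ro
```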
## Generate Your Own CA (Manual)
If you prefer not to use `generate-certs.sh`:
```bash
# 1. Create CA
openssl genrsa -out ca.key 4096
openssl req -x509 -new -nodes -key ca.key -sha256 -days 3650 \
-out ca.crt -subj "/CN=My CA/O=My Organization"
# 2. Create server key
openssl genrsa -out server-key.pem 2048
# 3. Create CSR with SANs
openssl req -new -key server-key.pem -out server.csr \
-subj "/CN=reflector.local" \
-addext "subjectAltName=DNS:reflector.local,DNS:localhost,IP:127.0.0.1"
# 4. Sign with CA
openssl x509 -req -in server.csr -CA ca.crt -CAkey ca.key \
-CAcreateserial -out server.pem -days 365 -sha256 \
-copy_extensions copyall
# 5. Clean up
rm server.csr ca.srl
```
## Using Existing Corporate Certificates
If your organization already has a CA:
1. Get the CA certificate in PEM format from your IT team
2. If you have a PKCS#12 (.p12/.pfx) bundle, extract the CA cert:
```bash
openssl pkcs12 -in bundle.p12 -cacerts -nokeys -out ca.crt
```
3. If you have multiple intermediate CAs, concatenate them into one PEM file:
```bash
cat intermediate-ca.crt root-ca.crt > ca.crt
```
## Troubleshooting
### Browser: "Your connection is not private"
The CA is not trusted on the client machine. See "Trust the CA" section above.
Check certificate expiry:
```bash
openssl x509 -noout -dates -in certs/server.pem
```
### Backend: `SSL: CERTIFICATE_VERIFY_FAILED`
CA cert not mounted or not loaded. Check inside the container:
```bash
docker compose exec server env | grep SSL_CERT_FILE
docker compose exec server python -c "
import ssl, os
print('SSL_CERT_FILE:', os.environ.get('SSL_CERT_FILE', 'not set'))
ctx = ssl.create_default_context()
print('CA certs loaded:', ctx.cert_store_stats())
"
```
### Caddy: "certificate is not valid for any names"
Domain in Caddyfile doesn't match the certificate's SAN/CN. Check:
```bash
openssl x509 -noout -text -in certs/server.pem | grep -A1 "Subject Alternative Name"
```
### Certificate chain issues
If you have intermediate CAs, concatenate them into `server.pem`:
```bash
cat server-cert.pem intermediate-ca.pem > certs/server.pem
```
Verify the chain:
```bash
openssl verify -CAfile certs/ca.crt certs/server.pem
```
### Certificate renewal
Custom CA certs are NOT auto-renewed (unlike Let's Encrypt). Replace cert files and restart:
```bash
# Replace certs
cp new-server.pem certs/server.pem
cp new-server-key.pem certs/server-key.pem
# Restart Caddy to pick up new certs
docker compose restart caddy
```

294
docsv2/gpu-host-setup.md Normal file

@@ -0,0 +1,294 @@
# Standalone GPU Host Setup
Deploy Reflector's GPU transcription/diarization/translation service on a dedicated machine, separate from the main Reflector instance. Useful when:
- Your GPU machine is on a different network than the Reflector server
- You want to share one GPU service across multiple Reflector instances
- The GPU machine has special hardware/drivers that can't run the full stack
- You need to scale GPU processing independently
## Architecture
```
┌─────────────────────┐ HTTPS ┌────────────────────┐
│ Reflector Server │ ────────────────────── │ GPU Host │
│ (server, worker, │ TRANSCRIPT_URL │ (transcription, │
│ web, postgres, │ DIARIZATION_URL │ diarization, │
│ redis, hatchet) │ TRANSLATE_URL │ translation) │
│ │ │ │
│ setup-selfhosted.sh │ │ setup-gpu-host.sh │
│ --hosted │ │ │
└─────────────────────┘ └────────────────────┘
```
The GPU service is a standalone FastAPI app that exposes transcription, diarization, translation, and audio padding endpoints. It has **no dependencies** on PostgreSQL, Redis, Hatchet, or any other Reflector service.
## Quick Start
### On the GPU machine
```bash
git clone <reflector-repo>
cd reflector
# Set HuggingFace token (required for diarization models)
export HF_TOKEN=your-huggingface-token
# Deploy with HTTPS (Let's Encrypt)
./scripts/setup-gpu-host.sh --domain gpu.example.com --api-key my-secret-key
# Or deploy with custom CA
./scripts/generate-certs.sh gpu.local
./scripts/setup-gpu-host.sh --domain gpu.local --custom-ca certs/ --api-key my-secret-key
```
### On the Reflector machine
```bash
# If the GPU host uses a custom CA, trust it
./scripts/setup-selfhosted.sh --hosted --garage --caddy \
--extra-ca /path/to/gpu-machine-ca.crt
# Or if you already have --custom-ca for your local domain
./scripts/setup-selfhosted.sh --hosted --garage --caddy \
--domain reflector.local --custom-ca certs/ \
--extra-ca /path/to/gpu-machine-ca.crt
```
Then configure `server/.env` to point to the GPU host:
```bash
TRANSCRIPT_BACKEND=modal
TRANSCRIPT_URL=https://gpu.example.com
TRANSCRIPT_MODAL_API_KEY=my-secret-key
DIARIZATION_BACKEND=modal
DIARIZATION_URL=https://gpu.example.com
DIARIZATION_MODAL_API_KEY=my-secret-key
TRANSLATION_BACKEND=modal
TRANSLATE_URL=https://gpu.example.com
TRANSLATION_MODAL_API_KEY=my-secret-key
```
## Script Options
```
./scripts/setup-gpu-host.sh [OPTIONS]
Options:
--domain DOMAIN Domain name for HTTPS (Let's Encrypt or custom cert)
--custom-ca PATH Custom CA (directory or single PEM file)
--extra-ca FILE Additional CA cert to trust (repeatable)
--api-key KEY API key to protect the service (strongly recommended)
--cpu CPU-only mode (no NVIDIA GPU required)
--port PORT Host port (default: 443 with Caddy, 8000 without)
```
## Deployment Scenarios
### Public internet with Let's Encrypt
GPU machine has a public IP and domain:
```bash
./scripts/setup-gpu-host.sh --domain gpu.example.com --api-key my-secret-key
```
Requirements:
- DNS A record: `gpu.example.com` → GPU machine's public IP
- Ports 80 and 443 open
- Caddy auto-provisions Let's Encrypt certificate
### Internal network with custom CA
GPU machine on a private network:
```bash
# Generate certs on the GPU machine
./scripts/generate-certs.sh gpu.internal "IP:192.168.1.200"
# Deploy
./scripts/setup-gpu-host.sh --domain gpu.internal --custom-ca certs/ --api-key my-secret-key
```
On each machine that connects (including the Reflector server), add DNS:
```bash
echo "192.168.1.200 gpu.internal" | sudo tee -a /etc/hosts
```
### IP-only (no domain)
No domain needed — just use the machine's IP:
```bash
./scripts/setup-gpu-host.sh --api-key my-secret-key
```
Caddy is not used; the GPU service runs directly on port 8000 (HTTP). HTTPS requires a domain, so without one the Reflector machine connects via `http://<GPU_IP>:8000`.
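Once the service is up, a quick smoke test hits `/docs`, the same unauthenticated endpoint the compose healthcheck polls:
```bash
curl -sf http://localhost:8000/docs >/dev/null && echo "GPU service is up"
```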
### CPU-only (no NVIDIA GPU)
Works on any machine — transcription will be slower:
```bash
./scripts/setup-gpu-host.sh --cpu --domain gpu.example.com --api-key my-secret-key
```
## DNS Resolution
The Reflector server must be able to reach the GPU host by name or IP.
| Setup | DNS Method | TRANSCRIPT_URL example |
|-------|------------|----------------------|
| Public domain | DNS A record | `https://gpu.example.com` |
| Internal domain | `/etc/hosts` on both machines | `https://gpu.internal` |
| IP only | No DNS needed | `http://192.168.1.200:8000` |
For internal domains, add the GPU machine's IP to `/etc/hosts` on the Reflector machine:
```bash
echo "192.168.1.200 gpu.internal" | sudo tee -a /etc/hosts
```
If the Reflector server runs in Docker, the containers resolve DNS from the host (Docker's default DNS behavior). So adding to the host's `/etc/hosts` is sufficient.
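If you would rather not depend on the host's `/etc/hosts`, an `extra_hosts` entry pins the name inside the containers directly. A hypothetical compose override (service names and IP are placeholders for your deployment):
```yaml
# compose override sketch — adjust service names and IP to your setup
services:
  server:
    extra_hosts:
      - "gpu.internal:192.168.1.200"
  worker:
    extra_hosts:
      - "gpu.internal:192.168.1.200"
```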
## Multi-CA Setup
When your Reflector instance has its own CA (for `reflector.local`) and the GPU host has a different CA:
**On the GPU machine:**
```bash
./scripts/generate-certs.sh gpu.local
./scripts/setup-gpu-host.sh --domain gpu.local --custom-ca certs/ --api-key my-key
```
**On the Reflector machine:**
```bash
# Your local CA for reflector.local + the GPU host's CA
./scripts/setup-selfhosted.sh --hosted --garage --caddy \
--domain reflector.local \
--custom-ca certs/ \
--extra-ca /path/to/gpu-machine-ca.crt
```
The `--extra-ca` flag appends the GPU host's CA to the trust bundle. Backend containers trust both CAs — your local domain works AND outbound calls to the GPU host succeed.
You can repeat `--extra-ca` for multiple remote services:
```bash
--extra-ca /path/to/gpu-ca.crt --extra-ca /path/to/llm-ca.crt
```
## API Key Authentication
The GPU service uses Bearer token authentication via `REFLECTOR_GPU_APIKEY`:
```bash
# Test from the Reflector machine
curl -s https://gpu.example.com/docs # No auth needed for docs
curl -s -X POST https://gpu.example.com/v1/audio/transcriptions \
  -F "file=@audio.wav" \
  -H "Authorization: Bearer <my-secret-key>" # gitleaks:allow
```
If `REFLECTOR_GPU_APIKEY` is not set, the service accepts all requests (open access). Always use `--api-key` for internet-facing deployments.
The same key goes in Reflector's `server/.env` as `TRANSCRIPT_MODAL_API_KEY` and `DIARIZATION_MODAL_API_KEY`.
## Files
| File | Checked in? | Purpose |
|------|-------------|---------|
| `docker-compose.gpu-host.yml` | Yes | Static compose file with profiles (`gpu`, `cpu`, `caddy`) |
| `.env.gpu-host` | No (generated) | Environment variables (HF_TOKEN, API key, ports) |
| `Caddyfile.gpu-host` | No (generated) | Caddy config (only when using HTTPS) |
| `docker-compose.gpu-ca.yml` | No (generated) | CA cert mounts override (only with --custom-ca) |
| `certs/` | No (generated) | Staged certificates (when using --custom-ca) |
The compose file is checked into the repo — you can read it to understand exactly what runs. The script only generates env vars, Caddyfile, and CA overrides. Profiles control which service starts:
```bash
# What the script does under the hood:
docker compose -f docker-compose.gpu-host.yml --profile gpu --profile caddy \
--env-file .env.gpu-host up -d
# CPU mode:
docker compose -f docker-compose.gpu-host.yml --profile cpu --profile caddy \
--env-file .env.gpu-host up -d
```
Both `gpu` and `cpu` services get the network alias `transcription`, so Caddy's config works with either.
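As an illustration, the generated Caddyfile boils down to a reverse proxy on that alias. A sketch assuming a single domain; the generated `Caddyfile.gpu-host` is authoritative:
```
# Caddyfile.gpu-host — illustrative sketch only
gpu.example.com {
    reverse_proxy transcription:8000
}
```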
## Management
```bash
# View logs
docker compose -f docker-compose.gpu-host.yml --profile gpu logs -f gpu
# Restart
docker compose -f docker-compose.gpu-host.yml --profile gpu restart gpu
# Stop
docker compose -f docker-compose.gpu-host.yml --profile gpu --profile caddy down
# Re-run setup
./scripts/setup-gpu-host.sh [same flags]
# Rebuild after code changes
docker compose -f docker-compose.gpu-host.yml --profile gpu build gpu
docker compose -f docker-compose.gpu-host.yml --profile gpu up -d gpu
```
If you deployed with `--custom-ca`, include the CA override in manual commands:
```bash
docker compose -f docker-compose.gpu-host.yml -f docker-compose.gpu-ca.yml \
--profile gpu logs -f gpu
```
## Troubleshooting
### GPU service won't start
Check logs:
```bash
docker compose -f docker-compose.gpu-host.yml logs gpu
```
Common causes:
- NVIDIA driver not installed or `nvidia-container-toolkit` missing
- `HF_TOKEN` not set (diarization model download fails)
- Port already in use
### Reflector can't connect to GPU host
From the Reflector machine:
```bash
# Test HTTPS connectivity
curl -v https://gpu.example.com/docs
# If using custom CA, test with explicit CA
curl --cacert /path/to/gpu-ca.crt https://gpu.internal/docs
```
From inside the Reflector container:
```bash
docker compose exec server python -c "
import httpx
r = httpx.get('https://gpu.internal/docs')
print(r.status_code)
"
```
### SSL: CERTIFICATE_VERIFY_FAILED
The Reflector backend doesn't trust the GPU host's CA. Fix:
```bash
# Re-run Reflector setup with the GPU host's CA
./scripts/setup-selfhosted.sh --hosted --extra-ca /path/to/gpu-ca.crt
```
### Diarization returns errors
- Accept pyannote model licenses on HuggingFace:
- https://huggingface.co/pyannote/speaker-diarization-3.1
- https://huggingface.co/pyannote/segmentation-3.0
- Verify `HF_TOKEN` is set in `.env.gpu-host`


@@ -24,6 +24,8 @@ This document explains the internals of the self-hosted deployment: how the setu
The self-hosted deployment runs the entire Reflector platform on a single server using Docker Compose. A single bash script (`scripts/setup-selfhosted.sh`) handles all configuration and orchestration. The key design principles are:
- **One command to deploy** — flags select which features to enable
- **Config memory** — CLI args are saved to `data/.selfhosted-last-args`; re-run with no flags to replay
- **Per-service overrides** — individual ML backends (transcript, diarization, translation, padding, mixdown) can be overridden independently from the base mode
- **Idempotent** — safe to re-run without losing existing configuration
- **Profile-based composition** — Docker Compose profiles activate optional services
- **No external dependencies required** — with `--garage` and `--ollama-*`, everything runs locally
@@ -61,8 +63,9 @@ Creates or updates the backend environment file from `server/.env.selfhosted.exa
- **Infrastructure** — PostgreSQL URL, Redis host, Celery broker (all pointing to Docker-internal hostnames)
- **Public URLs** — `BASE_URL` and `CORS_ORIGIN` computed from the domain (if `--domain`), IP (if detected on Linux), or `localhost`
- **WebRTC** — `WEBRTC_HOST` set to the server's LAN IP so browsers can reach UDP ICE candidates
- **Specialized models** — always points to `http://transcription:8000` (the Docker network alias shared by GPU and CPU containers)
- **HuggingFace token** — prompts interactively for pyannote model access; writes to root `.env` so Docker Compose can inject it into GPU/CPU containers
- **ML backends (per-service)** — Each ML service (transcript, diarization, translation, padding, mixdown) is configured independently using "effective backends" (`EFF_TRANSCRIPT`, `EFF_DIARIZATION`, `EFF_TRANSLATION`, `EFF_PADDING`, `EFF_MIXDOWN`). These are resolved from the base mode default + any `--transcript`/`--diarization`/`--translation`/`--padding`/`--mixdown` overrides. For `modal` backends, the URL is `http://transcription:8000` (GPU mode), user-provided (hosted mode), or read from existing env (CPU mode with override). For CPU backends, no URL is needed (in-process). If a service is overridden to `modal` in CPU mode without a URL configured, the script warns the user to set `TRANSCRIPT_URL` in `server/.env`
- **CPU timeouts** — `TRANSCRIPT_FILE_TIMEOUT` and `DIARIZATION_FILE_TIMEOUT` are increased to 3600s only for services actually using CPU backends (whisper/pyannote), not blanket for the whole mode
- **HuggingFace token** — prompted when diarization uses `pyannote` (in-process) or when GPU mode is active (GPU container needs it). Writes to root `.env` so Docker Compose can inject it into GPU/CPU containers
- **LLM** — if `--ollama-*` is used, configures `LLM_URL` pointing to the Ollama container. Otherwise, warns that the user needs to configure an external LLM
- **Public mode** — sets `PUBLIC_MODE=true` so the app is accessible without authentication by default
- **Password auth** — if `--password` is passed, sets `AUTH_BACKEND=password`, `PUBLIC_MODE=false`, `ADMIN_EMAIL=admin@localhost`, and `ADMIN_PASSWORD_HASH` (the hash generated in Step 1). The admin user is provisioned in the database on container startup via `runserver.sh`
@@ -228,11 +231,19 @@ Both the `gpu` and `cpu` services define a Docker network alias of `transcriptio
Environment variables flow through multiple layers. Understanding this prevents confusion when debugging:
```
Flags (--gpu, --garage, etc.)
CLI args (--gpu, --garage, --padding modal, --mixdown modal, etc.)
├── setup-selfhosted.sh interprets flags
├── Config memory: saved to data/.selfhosted-last-args
│ (replayed on next run if no args provided)
├── setup-selfhosted.sh resolves effective backends:
│ EFF_TRANSCRIPT = override or base mode default
│ EFF_DIARIZATION = override or base mode default
│ EFF_TRANSLATION = override or base mode default
│ EFF_PADDING = override or base mode default
│ EFF_MIXDOWN = override or base mode default
│ │
│ ├── Writes server/.env (backend config)
│ ├── Writes server/.env (backend config, per-service backends)
│ ├── Writes www/.env (frontend config)
│ ├── Writes .env (HF_TOKEN for compose interpolation)
│ └── Writes Caddyfile (proxy routes)


@@ -70,7 +70,7 @@ That's it. The script generates env files, secrets, starts all containers, waits
## ML Processing Modes (Required)
Pick `--gpu`, `--cpu`, or `--hosted`. This determines how **transcription, diarization, translation, and audio padding** run:
Pick `--gpu`, `--cpu`, or `--hosted`. This determines how **transcription, diarization, translation, audio padding, and audio mixdown** run:
| Flag | What it does | Requires |
|------|-------------|----------|
@@ -158,6 +158,56 @@ Without `--caddy` or `--domain`, no ports are exposed. Point your own reverse pr
**Without a domain:** `--caddy` alone uses a self-signed certificate. Browsers will show a security warning that must be accepted.
## Per-Service Backend Overrides
Override individual ML services without changing the base mode. Useful when you want most services on one backend but need specific services on another.
| Flag | Valid backends | Default (`--gpu`/`--hosted`) | Default (`--cpu`) |
|------|---------------|------------------------------|-------------------|
| `--transcript BACKEND` | `whisper`, `modal` | `modal` | `whisper` |
| `--diarization BACKEND` | `pyannote`, `modal` | `modal` | `pyannote` |
| `--translation BACKEND` | `marian`, `modal`, `passthrough` | `modal` | `marian` |
| `--padding BACKEND` | `pyav`, `modal` | `modal` | `pyav` |
| `--mixdown BACKEND` | `pyav`, `modal` | `modal` | `pyav` |
**Examples:**
```bash
# CPU base, but use a remote modal service for padding only
./scripts/setup-selfhosted.sh --cpu --padding modal --garage --caddy
# GPU base, but skip translation entirely (passthrough)
./scripts/setup-selfhosted.sh --gpu --translation passthrough --garage --caddy
# CPU base with remote modal diarization and translation
./scripts/setup-selfhosted.sh --cpu --diarization modal --translation modal --garage
```
When overriding a service to `modal` in `--cpu` mode, the script will warn you to configure the service URL (`TRANSCRIPT_URL` etc.) in `server/.env` to point to your GPU service, then re-run.
When overriding a service to a CPU backend (e.g., `--transcript whisper`) in `--gpu` mode, that service runs in-process on the server/worker containers while the GPU container still serves the remaining `modal` services.
## Config Memory (No-Flag Re-run)
After a successful run, the script saves your CLI arguments to `data/.selfhosted-last-args`. On subsequent runs with no arguments, the saved configuration is automatically replayed:
```bash
# First run — saves the config
./scripts/setup-selfhosted.sh --gpu --ollama-gpu --garage --caddy
# Later re-runs — same config, no flags needed
./scripts/setup-selfhosted.sh
# => "No flags provided — replaying saved configuration:"
# => " --gpu --ollama-gpu --garage --caddy"
```
To change the configuration, pass new flags — they override and replace the saved config:
```bash
# Switch to CPU mode with overrides — this becomes the new saved config
./scripts/setup-selfhosted.sh --cpu --padding modal --garage --caddy
```
## What the Script Does
1. **Prerequisites check** — Docker, NVIDIA GPU (if needed), compose file exists
@@ -189,6 +239,8 @@ Without `--caddy` or `--domain`, no ports are exposed. Point your own reverse pr
| `TRANSCRIPT_URL` | Specialized model endpoint | `http://transcription:8000` |
| `PADDING_BACKEND` | Audio padding backend (`pyav` or `modal`) | `modal` (selfhosted), `pyav` (default) |
| `PADDING_URL` | Audio padding endpoint (when `PADDING_BACKEND=modal`) | `http://transcription:8000` |
| `MIXDOWN_BACKEND` | Audio mixdown backend (`pyav` or `modal`) | `modal` (selfhosted), `pyav` (default) |
| `MIXDOWN_URL` | Audio mixdown endpoint (when `MIXDOWN_BACKEND=modal`) | `http://transcription:8000` |
| `LLM_URL` | OpenAI-compatible LLM endpoint | Auto-set for Ollama modes |
| `LLM_API_KEY` | LLM API key | `not-needed` for Ollama |
| `LLM_MODEL` | LLM model name | `qwen2.5:14b` for Ollama (override with `--llm-model`) |
@@ -253,6 +305,48 @@ TRANSCRIPT_STORAGE_AWS_REGION=us-east-1
TRANSCRIPT_STORAGE_AWS_ENDPOINT_URL=http://minio:9000
```
### S3 IAM Permissions Reference
Reflector uses up to 3 separate S3 credential sets, each scoped to a specific bucket. When using AWS IAM in production, each key should have only the permissions it needs.
**Transcript storage key** (`TRANSCRIPT_STORAGE_AWS_*`) — the main bucket for processed files:
```json
{
"Effect": "Allow",
"Action": ["s3:PutObject", "s3:GetObject", "s3:DeleteObject", "s3:ListBucket"],
"Resource": ["arn:aws:s3:::reflector-media/*", "arn:aws:s3:::reflector-media"]
}
```
Used for: processed MP3 audio, waveform JSON, temporary pipeline files. Deletions happen during trash "Destroy", consent-denied cleanup, and public mode data retention.
**Daily.co worker key** (`DAILYCO_STORAGE_AWS_ACCESS_KEY_ID/SECRET_ACCESS_KEY`) — for reading and cleaning up Daily recordings:
```json
{
"Effect": "Allow",
"Action": ["s3:GetObject", "s3:DeleteObject", "s3:ListBucket"],
"Resource": ["arn:aws:s3:::your-daily-bucket/*", "arn:aws:s3:::your-daily-bucket"]
}
```
Used for: downloading multitrack recording files for processing, deleting track files and composed video on consent denial or trash destroy. No `s3:PutObject` needed — Daily's own API writes via the Role ARN.
**Whereby worker key** (`WHEREBY_STORAGE_AWS_ACCESS_KEY_ID/SECRET_ACCESS_KEY`) — same pattern as Daily:
```json
{
"Effect": "Allow",
"Action": ["s3:GetObject", "s3:DeleteObject", "s3:ListBucket"],
"Resource": ["arn:aws:s3:::your-whereby-bucket/*", "arn:aws:s3:::your-whereby-bucket"]
}
```
> **Fallback behavior:** If platform-specific worker keys are not set, Reflector falls back to the transcript storage master key with a bucket override. This means the master key would need cross-bucket access to the Daily/Whereby buckets. For least-privilege, configure platform-specific keys so each only accesses its own bucket.
> **Garage / single-bucket setups:** When using Garage or a single S3 bucket for everything, one master key with full permissions on that bucket is sufficient. The IAM scoping above only matters when using separate buckets per platform (typical in AWS production).
## What Authentication Enables
By default, Reflector runs in **public mode** (`AUTH_BACKEND=none`, `PUBLIC_MODE=true`) — anyone can create and view transcripts without logging in. Transcripts are anonymous (not linked to any user) and cannot be edited or deleted after creation.
@@ -576,9 +670,9 @@ docker compose -f docker-compose.selfhosted.yml exec gpu curl http://localhost:8
## Updating
```bash
# Option A: Pull latest prebuilt images and restart
# Option A: Pull latest prebuilt images and restart (replays saved config automatically)
docker compose -f docker-compose.selfhosted.yml down
./scripts/setup-selfhosted.sh <same-flags-as-before>
./scripts/setup-selfhosted.sh
# Option B: Build from source (after git pull) and restart
git pull
@@ -589,6 +683,8 @@ docker compose -f docker-compose.selfhosted.yml down
docker compose -f docker-compose.selfhosted.yml build gpu # or cpu
```
> **Note on config memory:** Running with no flags replays the saved config from your last run. Running with *any* flags replaces the saved config entirely — the script always saves the complete set of flags you provide. See [Config Memory](#config-memory-no-flag-re-run).
The setup script is idempotent — it won't overwrite existing secrets or env vars that are already set.
## Architecture Overview


@@ -114,8 +114,8 @@ modal secret create reflector-gpu REFLECTOR_GPU_APIKEY="$API_KEY"
# --- Deploy Functions ---
echo ""
echo "Deploying transcriber (Whisper)..."
TRANSCRIBER_URL=$(modal deploy reflector_transcriber.py 2>&1 | grep -o 'https://[^ ]*web.modal.run' | head -1)
echo "Deploying transcriber (Parakeet)..."
TRANSCRIBER_URL=$(modal deploy reflector_transcriber_parakeet.py 2>&1 | grep -o 'https://[^ ]*web.modal.run' | head -1)
if [ -z "$TRANSCRIBER_URL" ]; then
echo "Error: Failed to deploy transcriber. Check Modal dashboard for details."
exit 1
@@ -132,13 +132,22 @@ fi
echo " -> $DIARIZER_URL"
echo ""
echo "Deploying padding (CPU audio processing via Modal SDK)..."
modal deploy reflector_padding.py
if [ $? -ne 0 ]; then
echo "Deploying padding (CPU audio processing)..."
PADDING_URL=$(modal deploy reflector_padding.py 2>&1 | grep -o 'https://[^ ]*web.modal.run' | head -1)
if [ -z "$PADDING_URL" ]; then
echo "Error: Failed to deploy padding. Check Modal dashboard for details."
exit 1
fi
echo " -> reflector-padding.pad_track (Modal SDK function)"
echo " -> $PADDING_URL"
echo ""
echo "Deploying mixdown (CPU multi-track audio mixing)..."
MIXDOWN_URL=$(modal deploy reflector_mixdown.py 2>&1 | grep -o 'https://[^ ]*web.modal.run' | head -1)
if [ -z "$MIXDOWN_URL" ]; then
echo "Error: Failed to deploy mixdown. Check Modal dashboard for details."
exit 1
fi
echo " -> $MIXDOWN_URL"
# --- Output Configuration ---
echo ""
@@ -157,5 +166,11 @@ echo "DIARIZATION_BACKEND=modal"
echo "DIARIZATION_URL=$DIARIZER_URL"
echo "DIARIZATION_MODAL_API_KEY=$API_KEY"
echo ""
echo "# Padding uses Modal SDK (requires MODAL_TOKEN_ID/SECRET in worker containers)"
echo "PADDING_BACKEND=modal"
echo "PADDING_URL=$PADDING_URL"
echo "PADDING_MODAL_API_KEY=$API_KEY"
echo ""
echo "MIXDOWN_BACKEND=modal"
echo "MIXDOWN_URL=$MIXDOWN_URL"
echo "MIXDOWN_MODAL_API_KEY=$API_KEY"
echo "# --- End Modal Configuration ---"


@@ -113,12 +113,14 @@ def download_pyannote_audio():
diarizer_image = (
modal.Image.debian_slim(python_version="3.10")
modal.Image.from_registry(
"nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04", add_python="3.10"
)
.pip_install(
"pyannote.audio==3.1.0",
"requests",
"onnx",
"torchaudio",
"torchaudio==2.0.1",
"onnxruntime-gpu",
"torch==2.0.0",
"transformers==4.34.0",
@@ -133,14 +135,6 @@ diarizer_image = (
secrets=[modal.Secret.from_name("hf_token")],
)
.run_function(migrate_cache_llm)
.env(
{
"LD_LIBRARY_PATH": (
"/usr/local/lib/python3.10/site-packages/nvidia/cudnn/lib/:"
"/opt/conda/lib/python3.10/site-packages/nvidia/cublas/lib/"
)
}
)
)


@@ -0,0 +1,385 @@
"""
Reflector GPU backend - audio mixdown
=====================================
CPU-intensive multi-track audio mixdown service.
Mixes N audio tracks into a single MP3 using PyAV amix filter graph.
IMPORTANT: This mixdown logic is duplicated from server/reflector/utils/audio_mixdown.py
for Modal deployment isolation (Modal can't import from server/reflector/). If you modify
the PyAV filter graph or mixdown algorithm, you MUST update both:
- gpu/modal_deployments/reflector_mixdown.py (this file)
- server/reflector/utils/audio_mixdown.py
Constants duplicated from server/reflector/utils/audio_constants.py for same reason.
"""
import os
import tempfile
from fractions import Fraction
import asyncio
import modal
S3_TIMEOUT = 120 # Higher than padding (60s) — multiple track downloads
MIXDOWN_TIMEOUT = 1200 + (S3_TIMEOUT * 2) # 1440s total
SCALEDOWN_WINDOW = 60
DISCONNECT_CHECK_INTERVAL = 2
app = modal.App("reflector-mixdown")
# CPU-based image (mixdown is CPU-bound, no GPU needed)
image = (
modal.Image.debian_slim(python_version="3.12")
.apt_install("ffmpeg") # Required by PyAV
.pip_install(
"av==13.1.0", # PyAV for audio processing
"requests==2.32.3", # HTTP for presigned URL downloads/uploads
"fastapi==0.115.12", # API framework
)
)
@app.function(
cpu=4.0, # Higher than padding (2.0) for multi-track mixing
timeout=MIXDOWN_TIMEOUT,
scaledown_window=SCALEDOWN_WINDOW,
image=image,
secrets=[modal.Secret.from_name("reflector-gpu")],
)
@modal.asgi_app()
def web():
from fastapi import Depends, FastAPI, HTTPException, Request, status
from fastapi.security import OAuth2PasswordBearer
from pydantic import BaseModel
class MixdownRequest(BaseModel):
track_urls: list[str]
output_url: str
target_sample_rate: int | None = None
offsets_seconds: list[float] | None = None
class MixdownResponse(BaseModel):
size: int
duration_ms: float = 0.0
cancelled: bool = False
web_app = FastAPI()
oauth2_scheme = OAuth2PasswordBearer(tokenUrl="token")
def apikey_auth(apikey: str = Depends(oauth2_scheme)):
if apikey == os.environ["REFLECTOR_GPU_APIKEY"]:
return
raise HTTPException(
status_code=status.HTTP_401_UNAUTHORIZED,
detail="Invalid API key",
headers={"WWW-Authenticate": "Bearer"},
)
@web_app.post("/mixdown", dependencies=[Depends(apikey_auth)])
async def mixdown_endpoint(request: Request, req: MixdownRequest) -> MixdownResponse:
"""Modal web endpoint for mixing audio tracks with disconnect detection."""
import logging
import threading
logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s")
logger = logging.getLogger(__name__)
valid_urls = [u for u in req.track_urls if u]
if not valid_urls:
raise HTTPException(status_code=400, detail="No valid track URLs provided")
if req.offsets_seconds is not None:
if len(req.offsets_seconds) != len(req.track_urls):
raise HTTPException(
status_code=400,
detail=f"offsets_seconds length ({len(req.offsets_seconds)}) "
f"must match track_urls ({len(req.track_urls)})",
)
if any(o > 18000 for o in req.offsets_seconds):
raise HTTPException(status_code=400, detail="offsets_seconds exceeds maximum 18000s (5 hours)")
if not req.output_url:
raise HTTPException(status_code=400, detail="output_url cannot be empty")
logger.info(f"Mixdown request: {len(valid_urls)} tracks")
# Thread-safe cancellation flag
cancelled = threading.Event()
async def check_disconnect():
"""Background task to check for client disconnect."""
while not cancelled.is_set():
await asyncio.sleep(DISCONNECT_CHECK_INTERVAL)
if await request.is_disconnected():
logger.warning("Client disconnected, setting cancellation flag")
cancelled.set()
break
disconnect_task = asyncio.create_task(check_disconnect())
try:
result = await asyncio.get_event_loop().run_in_executor(
None, _mixdown_tracks_blocking, req, cancelled, logger
)
return MixdownResponse(**result)
finally:
cancelled.set()
disconnect_task.cancel()
try:
await disconnect_task
except asyncio.CancelledError:
pass
def _mixdown_tracks_blocking(req, cancelled, logger) -> dict:
"""Blocking CPU-bound mixdown work with periodic cancellation checks.
Downloads all tracks, builds PyAV amix filter graph, encodes to MP3,
and uploads the result to the presigned output URL.
"""
import av
import requests
from av.audio.resampler import AudioResampler
import time
temp_dir = tempfile.mkdtemp()
track_paths = []
output_path = None
last_check = time.time()
try:
# --- Download all tracks ---
valid_urls = [u for u in req.track_urls if u]
for i, url in enumerate(valid_urls):
if cancelled.is_set():
logger.info("Cancelled during download phase")
return {"size": 0, "duration_ms": 0.0, "cancelled": True}
logger.info(f"Downloading track {i}")
response = requests.get(url, stream=True, timeout=S3_TIMEOUT)
response.raise_for_status()
track_path = os.path.join(temp_dir, f"track_{i}.webm")
total_bytes = 0
chunk_count = 0
with open(track_path, "wb") as f:
for chunk in response.iter_content(chunk_size=8192):
if chunk:
f.write(chunk)
total_bytes += len(chunk)
chunk_count += 1
if chunk_count % 12 == 0:
now = time.time()
if now - last_check >= DISCONNECT_CHECK_INTERVAL:
if cancelled.is_set():
logger.info(f"Cancelled during track {i} download")
return {"size": 0, "duration_ms": 0.0, "cancelled": True}
last_check = now
track_paths.append(track_path)
logger.info(f"Track {i} downloaded: {total_bytes} bytes")
if not track_paths:
raise ValueError("No tracks downloaded")
# --- Detect sample rate ---
target_sample_rate = req.target_sample_rate
if target_sample_rate is None:
for path in track_paths:
try:
container = av.open(path)
for frame in container.decode(audio=0):
target_sample_rate = frame.sample_rate
container.close()
break
else:
container.close()
continue
break
except Exception:
continue
if target_sample_rate is None:
raise ValueError("Could not detect sample rate from any track")
logger.info(f"Target sample rate: {target_sample_rate}")
# --- Calculate per-input delays ---
input_offsets_seconds = None
if req.offsets_seconds is not None:
input_offsets_seconds = [
req.offsets_seconds[i] for i, url in enumerate(req.track_urls) if url
]
delays_ms = []
if input_offsets_seconds is not None:
base = min(input_offsets_seconds) if input_offsets_seconds else 0.0
delays_ms = [max(0, int(round((o - base) * 1000))) for o in input_offsets_seconds]
else:
delays_ms = [0 for _ in track_paths]
# --- Build filter graph ---
# N abuffer -> optional adelay -> amix -> aformat -> abuffersink
graph = av.filter.Graph()
inputs = []
for idx in range(len(track_paths)):
args = (
f"time_base=1/{target_sample_rate}:"
f"sample_rate={target_sample_rate}:"
f"sample_fmt=s32:"
f"channel_layout=stereo"
)
in_ctx = graph.add("abuffer", args=args, name=f"in{idx}")
inputs.append(in_ctx)
mixer = graph.add("amix", args=f"inputs={len(inputs)}:normalize=0", name="mix")
fmt = graph.add(
"aformat",
args=f"sample_fmts=s32:channel_layouts=stereo:sample_rates={target_sample_rate}",
name="fmt",
)
sink = graph.add("abuffersink", name="out")
for idx, in_ctx in enumerate(inputs):
delay_ms = delays_ms[idx] if idx < len(delays_ms) else 0
if delay_ms > 0:
adelay = graph.add(
"adelay",
args=f"delays={delay_ms}|{delay_ms}:all=1",
name=f"delay{idx}",
)
in_ctx.link_to(adelay)
adelay.link_to(mixer, 0, idx)
else:
in_ctx.link_to(mixer, 0, idx)
mixer.link_to(fmt)
fmt.link_to(sink)
graph.configure()
# --- Open all containers and decode ---
containers = []
output_path = os.path.join(temp_dir, "mixed.mp3")
try:
for path in track_paths:
containers.append(av.open(path))
decoders = [c.decode(audio=0) for c in containers]
active = [True] * len(decoders)
resamplers = [
AudioResampler(format="s32", layout="stereo", rate=target_sample_rate)
for _ in decoders
]
# Open output MP3
out_container = av.open(output_path, "w", format="mp3")
out_stream = out_container.add_stream("libmp3lame", rate=target_sample_rate)
total_duration = 0
while any(active):
# Check cancellation periodically
now = time.time()
if now - last_check >= DISCONNECT_CHECK_INTERVAL:
if cancelled.is_set():
logger.info("Cancelled during mixing")
out_container.close()
return {"size": 0, "duration_ms": 0.0, "cancelled": True}
last_check = now
for i, (dec, is_active) in enumerate(zip(decoders, active)):
if not is_active:
continue
try:
frame = next(dec)
except StopIteration:
active[i] = False
inputs[i].push(None)
continue
if frame.sample_rate != target_sample_rate:
continue
out_frames = resamplers[i].resample(frame) or []
for rf in out_frames:
rf.sample_rate = target_sample_rate
rf.time_base = Fraction(1, target_sample_rate)
inputs[i].push(rf)
while True:
try:
mixed = sink.pull()
except Exception:
break
mixed.sample_rate = target_sample_rate
mixed.time_base = Fraction(1, target_sample_rate)
for packet in out_stream.encode(mixed):
out_container.mux(packet)
total_duration += packet.duration
# Flush filter graph
while True:
try:
mixed = sink.pull()
except Exception:
break
mixed.sample_rate = target_sample_rate
mixed.time_base = Fraction(1, target_sample_rate)
for packet in out_stream.encode(mixed):
out_container.mux(packet)
total_duration += packet.duration
# Flush encoder
for packet in out_stream.encode(None):
out_container.mux(packet)
total_duration += packet.duration
# Calculate duration in ms
last_tb = out_stream.time_base
duration_ms = 0.0
if last_tb and total_duration > 0:
duration_ms = round(float(total_duration * last_tb * 1000), 2)
out_container.close()
finally:
for c in containers:
try:
c.close()
except Exception:
pass
file_size = os.path.getsize(output_path)
logger.info(f"Mixdown complete: {file_size} bytes, {duration_ms}ms")
if cancelled.is_set():
logger.info("Cancelled after mixing, before upload")
return {"size": 0, "duration_ms": 0.0, "cancelled": True}
# --- Upload result ---
logger.info("Uploading mixed audio to S3")
with open(output_path, "rb") as f:
upload_response = requests.put(req.output_url, data=f, timeout=S3_TIMEOUT)
upload_response.raise_for_status()
logger.info(f"Upload complete: {file_size} bytes")
return {"size": file_size, "duration_ms": duration_ms}
finally:
# Cleanup all temp files
for path in track_paths:
if os.path.exists(path):
try:
os.unlink(path)
except Exception as e:
logger.warning(f"Failed to cleanup track file: {e}")
if output_path and os.path.exists(output_path):
try:
os.unlink(output_path)
except Exception as e:
logger.warning(f"Failed to cleanup output file: {e}")
try:
os.rmdir(temp_dir)
except Exception as e:
logger.warning(f"Failed to cleanup temp directory: {e}")
return web_app
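A quick way to exercise the deployed endpoint is a curl smoke test. This is a sketch: MIXDOWN_URL and API_KEY come from the deploy script output above, and the presigned track/output URLs are placeholders you must supply yourself.

curl -X POST "$MIXDOWN_URL/mixdown" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "track_urls": ["https://example.com/presigned/track_0.webm"],
    "output_url": "https://example.com/presigned/mixed.mp3",
    "offsets_seconds": [0.0]
  }'

A successful run returns {"size": <bytes>, "duration_ms": <ms>, "cancelled": false}; dropping the connection mid-request exercises the disconnect-based cancellation path above.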

View File

@@ -52,10 +52,12 @@ OPUS_DEFAULT_BIT_RATE = 128000
timeout=PADDING_TIMEOUT,
scaledown_window=SCALEDOWN_WINDOW,
image=image,
secrets=[modal.Secret.from_name("reflector-gpu")],
)
@modal.asgi_app()
def web():
from fastapi import FastAPI, Request, HTTPException
from fastapi import Depends, FastAPI, HTTPException, Request, status
from fastapi.security import OAuth2PasswordBearer
from pydantic import BaseModel
class PaddingRequest(BaseModel):
@@ -70,7 +72,18 @@ def web():
web_app = FastAPI()
@web_app.post("/pad")
oauth2_scheme = OAuth2PasswordBearer(tokenUrl="token")
def apikey_auth(apikey: str = Depends(oauth2_scheme)):
if apikey == os.environ["REFLECTOR_GPU_APIKEY"]:
return
raise HTTPException(
status_code=status.HTTP_401_UNAUTHORIZED,
detail="Invalid API key",
headers={"WWW-Authenticate": "Bearer"},
)
@web_app.post("/pad", dependencies=[Depends(apikey_auth)])
async def pad_track_endpoint(request: Request, req: PaddingRequest) -> PaddingResponse:
"""Modal web endpoint for padding audio tracks with disconnect detection.
"""

View File

@@ -42,6 +42,7 @@ COPY pyproject.toml uv.lock /app/
COPY ./app /app/app
COPY ./main.py /app/
COPY ./runserver.sh /app/
COPY ./docker-entrypoint.sh /app/
# prevent uv failing with too many open files on big cpus
ENV UV_CONCURRENT_INSTALLS=16
@@ -52,6 +53,8 @@ RUN --mount=type=cache,target=/root/.cache/uv \
EXPOSE 8000
CMD ["sh", "/app/runserver.sh"]
RUN chmod +x /app/docker-entrypoint.sh
CMD ["sh", "/app/docker-entrypoint.sh"]

View File

@@ -26,6 +26,7 @@ COPY pyproject.toml uv.lock /app/
COPY ./app /app/app
COPY ./main.py /app/
COPY ./runserver.sh /app/
COPY ./docker-entrypoint.sh /app/
# prevent uv failing with too many open files on big cpus
ENV UV_CONCURRENT_INSTALLS=16
@@ -36,4 +37,6 @@ RUN --mount=type=cache,target=/root/.cache/uv \
EXPOSE 8000
CMD ["sh", "/app/runserver.sh"]
RUN chmod +x /app/docker-entrypoint.sh
CMD ["sh", "/app/docker-entrypoint.sh"]

View File

@@ -3,6 +3,7 @@ from contextlib import asynccontextmanager
from fastapi import FastAPI
from .routers.diarization import router as diarization_router
from .routers.mixdown import router as mixdown_router
from .routers.padding import router as padding_router
from .routers.transcription import router as transcription_router
from .routers.translation import router as translation_router
@@ -29,4 +30,5 @@ def create_app() -> FastAPI:
app.include_router(translation_router)
app.include_router(diarization_router)
app.include_router(padding_router)
app.include_router(mixdown_router)
return app

View File

@@ -0,0 +1,288 @@
"""
Audio mixdown endpoint for selfhosted GPU service.
CPU-intensive multi-track audio mixing service for combining N audio tracks
into a single MP3 using PyAV amix filter graph.
IMPORTANT: This mixdown logic is duplicated from server/reflector/utils/audio_mixdown.py
for deployment isolation (self_hosted can't import from server/reflector/). If you modify
the PyAV filter graph or mixdown algorithm, you MUST update both:
- gpu/self_hosted/app/routers/mixdown.py (this file)
- server/reflector/utils/audio_mixdown.py
Constants duplicated from server/reflector/utils/audio_constants.py for same reason.
"""
import logging
import os
import tempfile
from fractions import Fraction
import av
import requests
from av.audio.resampler import AudioResampler
from fastapi import APIRouter, Depends, HTTPException
from pydantic import BaseModel
from ..auth import apikey_auth
logger = logging.getLogger(__name__)
router = APIRouter(tags=["mixdown"])
S3_TIMEOUT = 120
class MixdownRequest(BaseModel):
track_urls: list[str]
output_url: str
target_sample_rate: int | None = None
offsets_seconds: list[float] | None = None
class MixdownResponse(BaseModel):
size: int
duration_ms: float = 0.0
cancelled: bool = False
@router.post("/mixdown", dependencies=[Depends(apikey_auth)], response_model=MixdownResponse)
def mixdown_tracks(req: MixdownRequest):
"""Mix multiple audio tracks into single MP3 using PyAV amix filter graph."""
valid_urls = [u for u in req.track_urls if u]
if not valid_urls:
raise HTTPException(status_code=400, detail="No valid track URLs provided")
if req.offsets_seconds is not None:
if len(req.offsets_seconds) != len(req.track_urls):
raise HTTPException(
status_code=400,
detail=f"offsets_seconds length ({len(req.offsets_seconds)}) "
f"must match track_urls ({len(req.track_urls)})",
)
if any(o > 18000 for o in req.offsets_seconds):
raise HTTPException(
status_code=400, detail="offsets_seconds exceeds maximum 18000s (5 hours)"
)
if not req.output_url:
raise HTTPException(status_code=400, detail="output_url cannot be empty")
logger.info("Mixdown request: %d tracks", len(valid_urls))
temp_dir = tempfile.mkdtemp()
track_paths = []
output_path = None
try:
# --- Download all tracks ---
for i, url in enumerate(valid_urls):
logger.info("Downloading track %d", i)
response = requests.get(url, stream=True, timeout=S3_TIMEOUT)
response.raise_for_status()
track_path = os.path.join(temp_dir, f"track_{i}.webm")
total_bytes = 0
with open(track_path, "wb") as f:
for chunk in response.iter_content(chunk_size=8192):
if chunk:
f.write(chunk)
total_bytes += len(chunk)
track_paths.append(track_path)
logger.info("Track %d downloaded: %d bytes", i, total_bytes)
if not track_paths:
raise HTTPException(status_code=400, detail="No tracks could be downloaded")
# --- Detect sample rate ---
target_sample_rate = req.target_sample_rate
if target_sample_rate is None:
for path in track_paths:
try:
container = av.open(path)
for frame in container.decode(audio=0):
target_sample_rate = frame.sample_rate
container.close()
break
else:
container.close()
continue
break
except Exception:
continue
if target_sample_rate is None:
raise HTTPException(
status_code=400, detail="Could not detect sample rate from any track"
)
logger.info("Target sample rate: %d", target_sample_rate)
# --- Calculate per-input delays ---
input_offsets_seconds = None
if req.offsets_seconds is not None:
input_offsets_seconds = [
req.offsets_seconds[i] for i, url in enumerate(req.track_urls) if url
]
delays_ms = []
if input_offsets_seconds is not None:
base = min(input_offsets_seconds) if input_offsets_seconds else 0.0
delays_ms = [max(0, int(round((o - base) * 1000))) for o in input_offsets_seconds]
else:
delays_ms = [0 for _ in track_paths]
# --- Build filter graph ---
# N abuffer -> optional adelay -> amix -> aformat -> abuffersink
graph = av.filter.Graph()
inputs = []
for idx in range(len(track_paths)):
args = (
f"time_base=1/{target_sample_rate}:"
f"sample_rate={target_sample_rate}:"
f"sample_fmt=s32:"
f"channel_layout=stereo"
)
in_ctx = graph.add("abuffer", args=args, name=f"in{idx}")
inputs.append(in_ctx)
mixer = graph.add("amix", args=f"inputs={len(inputs)}:normalize=0", name="mix")
fmt = graph.add(
"aformat",
args=f"sample_fmts=s32:channel_layouts=stereo:sample_rates={target_sample_rate}",
name="fmt",
)
sink = graph.add("abuffersink", name="out")
for idx, in_ctx in enumerate(inputs):
delay_ms = delays_ms[idx] if idx < len(delays_ms) else 0
if delay_ms > 0:
adelay = graph.add(
"adelay",
args=f"delays={delay_ms}|{delay_ms}:all=1",
name=f"delay{idx}",
)
in_ctx.link_to(adelay)
adelay.link_to(mixer, 0, idx)
else:
in_ctx.link_to(mixer, 0, idx)
mixer.link_to(fmt)
fmt.link_to(sink)
graph.configure()
# --- Open all containers and decode ---
containers = []
output_path = os.path.join(temp_dir, "mixed.mp3")
try:
for path in track_paths:
containers.append(av.open(path))
decoders = [c.decode(audio=0) for c in containers]
active = [True] * len(decoders)
resamplers = [
AudioResampler(format="s32", layout="stereo", rate=target_sample_rate)
for _ in decoders
]
# Open output MP3
out_container = av.open(output_path, "w", format="mp3")
out_stream = out_container.add_stream("libmp3lame", rate=target_sample_rate)
total_duration = 0
while any(active):
for i, (dec, is_active) in enumerate(zip(decoders, active)):
if not is_active:
continue
try:
frame = next(dec)
except StopIteration:
active[i] = False
inputs[i].push(None)
continue
if frame.sample_rate != target_sample_rate:
continue
out_frames = resamplers[i].resample(frame) or []
for rf in out_frames:
rf.sample_rate = target_sample_rate
rf.time_base = Fraction(1, target_sample_rate)
inputs[i].push(rf)
while True:
try:
mixed = sink.pull()
except Exception:
break
mixed.sample_rate = target_sample_rate
mixed.time_base = Fraction(1, target_sample_rate)
for packet in out_stream.encode(mixed):
out_container.mux(packet)
total_duration += packet.duration
# Flush filter graph
while True:
try:
mixed = sink.pull()
except Exception:
break
mixed.sample_rate = target_sample_rate
mixed.time_base = Fraction(1, target_sample_rate)
for packet in out_stream.encode(mixed):
out_container.mux(packet)
total_duration += packet.duration
# Flush encoder
for packet in out_stream.encode(None):
out_container.mux(packet)
total_duration += packet.duration
# Calculate duration in ms
last_tb = out_stream.time_base
duration_ms = 0.0
if last_tb and total_duration > 0:
duration_ms = round(float(total_duration * last_tb * 1000), 2)
out_container.close()
finally:
for c in containers:
try:
c.close()
except Exception:
pass
file_size = os.path.getsize(output_path)
logger.info("Mixdown complete: %d bytes, %.2fms", file_size, duration_ms)
# --- Upload result ---
logger.info("Uploading mixed audio to S3")
with open(output_path, "rb") as f:
upload_response = requests.put(req.output_url, data=f, timeout=S3_TIMEOUT)
upload_response.raise_for_status()
logger.info("Upload complete: %d bytes", file_size)
return MixdownResponse(size=file_size, duration_ms=duration_ms)
except HTTPException:
raise
except Exception as e:
logger.error("Mixdown failed: %s", e, exc_info=True)
raise HTTPException(status_code=500, detail=f"Mixdown failed: {e}") from e
finally:
for path in track_paths:
if os.path.exists(path):
try:
os.unlink(path)
except Exception as e:
logger.warning("Failed to cleanup track file: %s", e)
if output_path and os.path.exists(output_path):
try:
os.unlink(output_path)
except Exception as e:
logger.warning("Failed to cleanup output file: %s", e)
try:
os.rmdir(temp_dir)
except Exception as e:
logger.warning("Failed to cleanup temp directory: %s", e)

View File

@@ -0,0 +1,23 @@
#!/bin/sh
set -e
# Custom CA certificate injection
# If a CA cert is mounted at this path (via docker-compose.ca.yml),
# add it to the system trust store and configure all Python SSL libraries.
CUSTOM_CA_PATH="/usr/local/share/ca-certificates/custom-ca.crt"
if [ -s "$CUSTOM_CA_PATH" ]; then
echo "[entrypoint] Custom CA certificate detected, updating trust store..."
update-ca-certificates 2>/dev/null
# update-ca-certificates creates a combined bundle (system + custom CAs)
COMBINED_BUNDLE="/etc/ssl/certs/ca-certificates.crt"
export SSL_CERT_FILE="$COMBINED_BUNDLE"
export REQUESTS_CA_BUNDLE="$COMBINED_BUNDLE"
export CURL_CA_BUNDLE="$COMBINED_BUNDLE"
# Note: GRPC_DEFAULT_SSL_ROOTS_FILE_PATH is intentionally NOT set here.
# Setting it causes grpcio to attempt TLS on connections that may be plaintext.
echo "[entrypoint] CA trust store updated (SSL_CERT_FILE=$COMBINED_BUNDLE)"
fi
exec sh /app/runserver.sh
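To confirm the CA was picked up in a running container, one option (a sketch; the service name depends on your compose setup) is to check the mounted cert and the combined bundle. Note the exported SSL_CERT_FILE/REQUESTS_CA_BUNDLE variables only apply to the main process launched through the entrypoint, not to shells started with `docker compose exec`:

docker compose exec server sh -c \
  'ls -l /usr/local/share/ca-certificates/custom-ca.crt && grep -c "BEGIN CERTIFICATE" /etc/ssl/certs/ca-certificates.crt'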

View File

@@ -2153,7 +2153,7 @@ wheels = [
[[package]]
name = "requests"
version = "2.32.5"
version = "2.33.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "certifi" },
@@ -2161,9 +2161,9 @@ dependencies = [
{ name = "idna" },
{ name = "urllib3" },
]
sdist = { url = "https://files.pythonhosted.org/packages/c9/74/b3ff8e6c8446842c3f5c837e9c3dfcfe2018ea6ecef224c710c85ef728f4/requests-2.32.5.tar.gz", hash = "sha256:dbba0bac56e100853db0ea71b82b4dfd5fe2bf6d3754a8893c3af500cec7d7cf", size = 134517, upload-time = "2025-08-18T20:46:02.573Z" }
sdist = { url = "https://files.pythonhosted.org/packages/34/64/8860370b167a9721e8956ae116825caff829224fbca0ca6e7bf8ddef8430/requests-2.33.0.tar.gz", hash = "sha256:c7ebc5e8b0f21837386ad0e1c8fe8b829fa5f544d8df3b2253bff14ef29d7652", size = 134232, upload-time = "2026-03-25T15:10:41.586Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/1e/db/4254e3eabe8020b458f1a747140d32277ec7a271daf1d235b70dc0b4e6e3/requests-2.32.5-py3-none-any.whl", hash = "sha256:2462f94637a34fd532264295e186976db0f5d453d1cdd31473c85a6a161affb6", size = 64738, upload-time = "2025-08-18T20:46:00.542Z" },
{ url = "https://files.pythonhosted.org/packages/56/5d/c814546c2333ceea4ba42262d8c4d55763003e767fa169adc693bd524478/requests-2.33.0-py3-none-any.whl", hash = "sha256:3324635456fa185245e24865e810cecec7b4caf933d7eb133dcde67d48cee69b", size = 65017, upload-time = "2026-03-25T15:10:40.382Z" },
]
[[package]]


scripts/generate-certs.sh Executable file
View File

@@ -0,0 +1,130 @@
#!/usr/bin/env bash
#
# Generate a local CA and server certificate for Reflector self-hosted deployments.
#
# Usage:
# ./scripts/generate-certs.sh DOMAIN [EXTRA_SANS...]
#
# Examples:
# ./scripts/generate-certs.sh reflector.local
# ./scripts/generate-certs.sh reflector.local "DNS:gpu.local,IP:192.168.1.100"
#
# Generates in certs/:
# ca.key — CA private key (keep secret)
# ca.crt — CA certificate (distribute to clients)
# server-key.pem — Server private key
# server.pem — Server certificate (signed by CA)
#
# Then use with setup-selfhosted.sh:
# ./scripts/setup-selfhosted.sh --gpu --caddy --domain DOMAIN --custom-ca certs/
#
set -euo pipefail
DOMAIN="${1:?Usage: $0 DOMAIN [EXTRA_SANS...]}"
EXTRA_SANS="${2:-}"
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
CERTS_DIR="$(cd "$SCRIPT_DIR/.." && pwd)/certs"
# Colors
GREEN='\033[0;32m'
CYAN='\033[0;36m'
NC='\033[0m'
info() { echo -e "${CYAN}==>${NC} $*"; }
ok() { echo -e "${GREEN}✓${NC} $*"; }
# Check for openssl
if ! command -v openssl &>/dev/null; then
echo "Error: openssl is required but not found. Install it first." >&2
exit 1
fi
mkdir -p "$CERTS_DIR"
# Build SAN list
SAN_LIST="DNS:$DOMAIN,DNS:localhost,IP:127.0.0.1"
if [[ -n "$EXTRA_SANS" ]]; then
SAN_LIST="$SAN_LIST,$EXTRA_SANS"
fi
info "Generating CA and server certificate for: $DOMAIN"
echo " SANs: $SAN_LIST"
echo ""
# --- Step 1: Generate CA ---
if [[ -f "$CERTS_DIR/ca.key" ]] && [[ -f "$CERTS_DIR/ca.crt" ]]; then
ok "CA already exists at certs/ca.key + certs/ca.crt — reusing"
else
info "Generating CA key and certificate..."
openssl genrsa -out "$CERTS_DIR/ca.key" 4096 2>/dev/null
openssl req -x509 -new -nodes \
-key "$CERTS_DIR/ca.key" \
-sha256 -days 3650 \
-out "$CERTS_DIR/ca.crt" \
-subj "/CN=Reflector Local CA/O=Reflector Self-Hosted"
ok "CA certificate generated (valid for 10 years)"
fi
# --- Step 2: Generate server key ---
info "Generating server key..."
openssl genrsa -out "$CERTS_DIR/server-key.pem" 2048 2>/dev/null
ok "Server key generated"
# --- Step 3: Create CSR with SANs ---
info "Creating certificate signing request..."
openssl req -new \
-key "$CERTS_DIR/server-key.pem" \
-out "$CERTS_DIR/server.csr" \
-subj "/CN=$DOMAIN" \
-addext "subjectAltName=$SAN_LIST"
ok "CSR created"
# --- Step 4: Sign with CA ---
info "Signing server certificate with CA..."
openssl x509 -req \
-in "$CERTS_DIR/server.csr" \
-CA "$CERTS_DIR/ca.crt" \
-CAkey "$CERTS_DIR/ca.key" \
-CAcreateserial \
-out "$CERTS_DIR/server.pem" \
-days 365 -sha256 \
-copy_extensions copyall \
2>/dev/null
ok "Server certificate signed (valid for 1 year)"
# --- Cleanup ---
rm -f "$CERTS_DIR/server.csr" "$CERTS_DIR/ca.srl"
# --- Set permissions ---
chmod 644 "$CERTS_DIR/ca.crt" "$CERTS_DIR/server.pem"
chmod 600 "$CERTS_DIR/ca.key" "$CERTS_DIR/server-key.pem"
echo ""
echo "=========================================="
echo -e " ${GREEN}Certificates generated in certs/${NC}"
echo "=========================================="
echo ""
echo " certs/ca.key CA private key (keep secret)"
echo " certs/ca.crt CA certificate (distribute to clients)"
echo " certs/server-key.pem Server private key"
echo " certs/server.pem Server certificate for $DOMAIN"
echo ""
echo " SANs: $SAN_LIST"
echo ""
echo "Use with setup-selfhosted.sh:"
echo " ./scripts/setup-selfhosted.sh --gpu --caddy --domain $DOMAIN --custom-ca certs/"
echo ""
echo "Trust the CA on your machine:"
case "$(uname -s)" in
Darwin)
echo " sudo security add-trusted-cert -d -r trustRoot -k /Library/Keychains/System.keychain certs/ca.crt"
;;
Linux)
echo " sudo cp certs/ca.crt /usr/local/share/ca-certificates/reflector-ca.crt"
echo " sudo update-ca-certificates"
;;
*)
echo " See docsv2/custom-ca-setup.md for your platform"
;;
esac
echo ""

scripts/setup-gpu-host.sh Executable file
View File

@@ -0,0 +1,496 @@
#!/usr/bin/env bash
#
# Standalone GPU service setup for Reflector.
# Deploys ONLY the GPU transcription/diarization/translation service on a dedicated machine.
# The main Reflector instance connects to this machine over HTTPS.
#
# Usage:
# ./scripts/setup-gpu-host.sh [--domain DOMAIN] [--custom-ca PATH] [--extra-ca FILE] [--api-key KEY] [--cpu] [--build]
#
# Options:
# --domain DOMAIN Domain name for this GPU host (e.g., gpu.example.com)
# With --custom-ca: uses custom TLS cert. Without: uses Let's Encrypt.
# --custom-ca PATH Custom CA certificate (dir with ca.crt + server.pem + server-key.pem, or single PEM file)
# --extra-ca FILE Additional CA cert to trust (repeatable)
# --api-key KEY API key to protect the GPU service (recommended for internet-facing deployments)
# --cpu Use CPU-only Dockerfile (no NVIDIA GPU required)
# --build Build image from source (always on; no pre-built GPU image is published, flag kept for compatibility)
# --port PORT Host port to expose (default: 443 with Caddy, 8000 without)
#
# Examples:
# # GPU on LAN with custom CA
# ./scripts/generate-certs.sh gpu.local
# ./scripts/setup-gpu-host.sh --domain gpu.local --custom-ca certs/ --api-key my-secret-key
#
# # GPU on public internet with Let's Encrypt
# ./scripts/setup-gpu-host.sh --domain gpu.example.com --api-key my-secret-key
#
# # GPU on LAN, IP access only (self-signed cert)
# ./scripts/setup-gpu-host.sh --api-key my-secret-key
#
# # CPU-only mode (no NVIDIA GPU)
# ./scripts/setup-gpu-host.sh --cpu --api-key my-secret-key
#
# After setup, configure the main Reflector instance to use this GPU:
# In server/.env on the Reflector machine:
# TRANSCRIPT_BACKEND=modal
# TRANSCRIPT_URL=https://gpu.example.com
# TRANSCRIPT_MODAL_API_KEY=my-secret-key
# DIARIZATION_BACKEND=modal
# DIARIZATION_URL=https://gpu.example.com
# DIARIZATION_MODAL_API_KEY=my-secret-key
# TRANSLATION_BACKEND=modal
# TRANSLATE_URL=https://gpu.example.com
# TRANSLATION_MODAL_API_KEY=my-secret-key
#
# DNS Resolution:
# - Public domain: Create a DNS A record pointing to this machine's public IP.
# - Internal domain (e.g., gpu.local): Add to /etc/hosts on both machines:
# <GPU_MACHINE_IP> gpu.local
# - IP-only: Use the machine's IP directly in TRANSCRIPT_URL/DIARIZATION_URL.
# The Reflector backend must trust the CA or accept self-signed certs.
#
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
ROOT_DIR="$(cd "$SCRIPT_DIR/.." && pwd)"
GPU_DIR="$ROOT_DIR/gpu/self_hosted"
OS="$(uname -s)"
# --- Colors ---
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
CYAN='\033[0;36m'
NC='\033[0m'
info() { echo -e "${CYAN}==>${NC} $*"; }
ok() { echo -e "${GREEN}${NC} $*"; }
warn() { echo -e "${YELLOW} !${NC} $*"; }
err() { echo -e "${RED}${NC} $*" >&2; }
# --- Parse arguments ---
CUSTOM_DOMAIN=""
CUSTOM_CA=""
EXTRA_CA_FILES=()
API_KEY=""
USE_CPU=false
HOST_PORT=""
SKIP_NEXT=false
ARGS=("$@")
for i in "${!ARGS[@]}"; do
if [[ "$SKIP_NEXT" == "true" ]]; then
SKIP_NEXT=false
continue
fi
arg="${ARGS[$i]}"
case "$arg" in
--domain)
next_i=$((i + 1))
if [[ $next_i -ge ${#ARGS[@]} ]] || [[ "${ARGS[$next_i]}" == --* ]]; then
err "--domain requires a domain name"
exit 1
fi
CUSTOM_DOMAIN="${ARGS[$next_i]}"
SKIP_NEXT=true ;;
--custom-ca)
next_i=$((i + 1))
if [[ $next_i -ge ${#ARGS[@]} ]] || [[ "${ARGS[$next_i]}" == --* ]]; then
err "--custom-ca requires a path to a directory or PEM certificate file"
exit 1
fi
CUSTOM_CA="${ARGS[$next_i]}"
SKIP_NEXT=true ;;
--extra-ca)
next_i=$((i + 1))
if [[ $next_i -ge ${#ARGS[@]} ]] || [[ "${ARGS[$next_i]}" == --* ]]; then
err "--extra-ca requires a path to a PEM certificate file"
exit 1
fi
if [[ ! -f "${ARGS[$next_i]}" ]]; then
err "--extra-ca file not found: ${ARGS[$next_i]}"
exit 1
fi
EXTRA_CA_FILES+=("${ARGS[$next_i]}")
SKIP_NEXT=true ;;
--api-key)
next_i=$((i + 1))
if [[ $next_i -ge ${#ARGS[@]} ]] || [[ "${ARGS[$next_i]}" == --* ]]; then
err "--api-key requires a key value"
exit 1
fi
API_KEY="${ARGS[$next_i]}"
SKIP_NEXT=true ;;
--cpu)
USE_CPU=true ;;
--port)
next_i=$((i + 1))
if [[ $next_i -ge ${#ARGS[@]} ]] || [[ "${ARGS[$next_i]}" == --* ]]; then
err "--port requires a port number"
exit 1
fi
HOST_PORT="${ARGS[$next_i]}"
SKIP_NEXT=true ;;
--build)
;; # Always build from source for GPU, flag accepted for compatibility
*)
err "Unknown argument: $arg"
err "Usage: $0 [--domain DOMAIN] [--custom-ca PATH] [--extra-ca FILE] [--api-key KEY] [--cpu] [--port PORT]"
exit 1
;;
esac
done
# --- Resolve CA paths ---
CA_CERT_PATH=""
TLS_CERT_PATH=""
TLS_KEY_PATH=""
USE_CUSTOM_CA=false
USE_CADDY=false
if [[ -n "$CUSTOM_CA" ]] || [[ -n "${EXTRA_CA_FILES[0]+x}" ]]; then
USE_CUSTOM_CA=true
fi
if [[ -n "$CUSTOM_CA" ]]; then
CUSTOM_CA="${CUSTOM_CA%/}"
if [[ -d "$CUSTOM_CA" ]]; then
[[ -f "$CUSTOM_CA/ca.crt" ]] || { err "$CUSTOM_CA/ca.crt not found"; exit 1; }
CA_CERT_PATH="$CUSTOM_CA/ca.crt"
if [[ -f "$CUSTOM_CA/server.pem" ]] && [[ -f "$CUSTOM_CA/server-key.pem" ]]; then
TLS_CERT_PATH="$CUSTOM_CA/server.pem"
TLS_KEY_PATH="$CUSTOM_CA/server-key.pem"
elif [[ -f "$CUSTOM_CA/server.pem" ]] || [[ -f "$CUSTOM_CA/server-key.pem" ]]; then
warn "Found only one of server.pem/server-key.pem — both needed for TLS. Skipping."
fi
elif [[ -f "$CUSTOM_CA" ]]; then
CA_CERT_PATH="$CUSTOM_CA"
else
err "--custom-ca path not found: $CUSTOM_CA"
exit 1
fi
elif [[ -n "${EXTRA_CA_FILES[0]+x}" ]]; then
CA_CERT_PATH="${EXTRA_CA_FILES[0]}"
unset 'EXTRA_CA_FILES[0]'
EXTRA_CA_FILES=("${EXTRA_CA_FILES[@]+"${EXTRA_CA_FILES[@]}"}")
fi
# Caddy if we have a domain or TLS certs
if [[ -n "$CUSTOM_DOMAIN" ]] || [[ -n "$TLS_CERT_PATH" ]]; then
USE_CADDY=true
fi
# Default port
if [[ -z "$HOST_PORT" ]]; then
if [[ "$USE_CADDY" == "true" ]]; then
HOST_PORT="443"
else
HOST_PORT="8000"
fi
fi
# Detect primary IP
PRIMARY_IP=""
if [[ "$OS" == "Linux" ]]; then
PRIMARY_IP=$(hostname -I 2>/dev/null | awk '{print $1}' || true)
if [[ "$PRIMARY_IP" == "127."* ]] || [[ -z "$PRIMARY_IP" ]]; then
PRIMARY_IP=$(ip -4 route get 1 2>/dev/null | sed -n 's/.*src \([0-9.]*\).*/\1/p' || true)
fi
fi
# --- Display config ---
echo ""
echo "=========================================="
echo " Reflector — Standalone GPU Host Setup"
echo "=========================================="
echo ""
echo " Mode: $(if [[ "$USE_CPU" == "true" ]]; then echo "CPU-only"; else echo "NVIDIA GPU"; fi)"
echo " Caddy: $USE_CADDY"
[[ -n "$CUSTOM_DOMAIN" ]] && echo " Domain: $CUSTOM_DOMAIN"
[[ "$USE_CUSTOM_CA" == "true" ]] && echo " CA: Custom"
[[ -n "$TLS_CERT_PATH" ]] && echo " TLS: Custom cert"
[[ -n "$API_KEY" ]] && echo " Auth: API key protected"
[[ -z "$API_KEY" ]] && echo " Auth: NONE (open access — use --api-key for production!)"
echo " Port: $HOST_PORT"
echo ""
# --- Prerequisites ---
info "Checking prerequisites"
if ! command -v docker &>/dev/null; then
err "Docker not found. Install Docker first."
exit 1
fi
ok "Docker available"
if ! docker compose version &>/dev/null; then
err "Docker Compose V2 not found."
exit 1
fi
ok "Docker Compose V2 available"
if [[ "$USE_CPU" != "true" ]]; then
if ! docker info 2>/dev/null | grep -qi nvidia; then
warn "NVIDIA runtime not detected in Docker. GPU mode may fail."
warn "Install nvidia-container-toolkit if you have an NVIDIA GPU."
else
ok "NVIDIA Docker runtime available"
fi
fi
# --- Stage certificates ---
CERTS_DIR="$ROOT_DIR/certs"
if [[ "$USE_CUSTOM_CA" == "true" ]]; then
info "Staging certificates"
mkdir -p "$CERTS_DIR"
if [[ -n "$CA_CERT_PATH" ]]; then
local_ca_dest="$CERTS_DIR/ca.crt"
src_id=$(ls -i "$CA_CERT_PATH" 2>/dev/null | awk '{print $1}')
dst_id=$(ls -i "$local_ca_dest" 2>/dev/null | awk '{print $1}')
if [[ "$src_id" != "$dst_id" ]] || [[ -z "$dst_id" ]]; then
cp "$CA_CERT_PATH" "$local_ca_dest"
fi
chmod 644 "$local_ca_dest"
ok "CA certificate staged"
# Append extra CAs
for extra_ca in "${EXTRA_CA_FILES[@]+"${EXTRA_CA_FILES[@]}"}"; do
echo "" >> "$local_ca_dest"
cat "$extra_ca" >> "$local_ca_dest"
ok "Appended extra CA: $extra_ca"
done
fi
if [[ -n "$TLS_CERT_PATH" ]]; then
cert_dest="$CERTS_DIR/server.pem"
key_dest="$CERTS_DIR/server-key.pem"
src_id=$(ls -i "$TLS_CERT_PATH" 2>/dev/null | awk '{print $1}')
dst_id=$(ls -i "$cert_dest" 2>/dev/null | awk '{print $1}')
if [[ "$src_id" != "$dst_id" ]] || [[ -z "$dst_id" ]]; then
cp "$TLS_CERT_PATH" "$cert_dest"
cp "$TLS_KEY_PATH" "$key_dest"
fi
chmod 644 "$cert_dest"
chmod 600 "$key_dest"
ok "TLS cert/key staged"
fi
fi
# --- Build profiles and compose command ---
COMPOSE_FILE="$ROOT_DIR/docker-compose.gpu-host.yml"
COMPOSE_PROFILES=()
GPU_SERVICE="gpu"
if [[ "$USE_CPU" == "true" ]]; then
COMPOSE_PROFILES+=("cpu")
GPU_SERVICE="cpu"
else
COMPOSE_PROFILES+=("gpu")
fi
if [[ "$USE_CADDY" == "true" ]]; then
COMPOSE_PROFILES+=("caddy")
fi
# Compose command helper
compose_cmd() {
local profiles="" files="-f $COMPOSE_FILE"
if [[ "$USE_CUSTOM_CA" == "true" ]] && [[ -f "$ROOT_DIR/docker-compose.gpu-ca.yml" ]]; then
files="$files -f $ROOT_DIR/docker-compose.gpu-ca.yml"
fi
for p in "${COMPOSE_PROFILES[@]}"; do
profiles="$profiles --profile $p"
done
docker compose $files $profiles "$@"
}
# Generate CA compose override if needed (mounts certs into containers)
if [[ "$USE_CUSTOM_CA" == "true" ]]; then
info "Generating docker-compose.gpu-ca.yml override"
ca_override="$ROOT_DIR/docker-compose.gpu-ca.yml"
cat > "$ca_override" << 'CAEOF'
# Generated by setup-gpu-host.sh — custom CA trust.
# Do not edit manually; re-run setup-gpu-host.sh with --custom-ca to regenerate.
services:
gpu:
volumes:
- ./certs/ca.crt:/usr/local/share/ca-certificates/custom-ca.crt:ro
cpu:
volumes:
- ./certs/ca.crt:/usr/local/share/ca-certificates/custom-ca.crt:ro
CAEOF
if [[ -n "$TLS_CERT_PATH" ]]; then
cat >> "$ca_override" << 'CADDYCAEOF'
caddy:
volumes:
- ./certs:/etc/caddy/certs:ro
CADDYCAEOF
fi
ok "Generated docker-compose.gpu-ca.yml"
else
rm -f "$ROOT_DIR/docker-compose.gpu-ca.yml"
fi
# --- Generate Caddyfile ---
if [[ "$USE_CADDY" == "true" ]]; then
info "Generating Caddyfile.gpu-host"
CADDYFILE="$ROOT_DIR/Caddyfile.gpu-host"
if [[ -n "$TLS_CERT_PATH" ]] && [[ -n "$CUSTOM_DOMAIN" ]]; then
cat > "$CADDYFILE" << CADDYEOF
# Generated by setup-gpu-host.sh — Custom TLS cert for $CUSTOM_DOMAIN
$CUSTOM_DOMAIN {
tls /etc/caddy/certs/server.pem /etc/caddy/certs/server-key.pem
reverse_proxy transcription:8000
}
CADDYEOF
ok "Caddyfile: custom TLS for $CUSTOM_DOMAIN"
elif [[ -n "$CUSTOM_DOMAIN" ]]; then
cat > "$CADDYFILE" << CADDYEOF
# Generated by setup-gpu-host.sh — Let's Encrypt for $CUSTOM_DOMAIN
$CUSTOM_DOMAIN {
reverse_proxy transcription:8000
}
CADDYEOF
ok "Caddyfile: Let's Encrypt for $CUSTOM_DOMAIN"
else
cat > "$CADDYFILE" << 'CADDYEOF'
# Generated by setup-gpu-host.sh — self-signed cert for IP access
:443 {
tls internal
reverse_proxy transcription:8000
}
CADDYEOF
ok "Caddyfile: self-signed cert for IP access"
fi
fi
# --- Generate .env ---
info "Generating GPU service .env"
GPU_ENV="$ROOT_DIR/.env.gpu-host"
cat > "$GPU_ENV" << EOF
# Generated by setup-gpu-host.sh
# HuggingFace token for pyannote diarization models
HF_TOKEN=${HF_TOKEN:-}
# API key to protect the GPU service (set via --api-key)
REFLECTOR_GPU_APIKEY=${API_KEY:-}
# Port configuration
GPU_HOST_PORT=${HOST_PORT}
CADDY_HTTPS_PORT=${HOST_PORT}
EOF
if [[ -z "${HF_TOKEN:-}" ]]; then
warn "HF_TOKEN not set. Diarization requires a HuggingFace token."
warn "Set it: export HF_TOKEN=your-token-here and re-run, or edit .env.gpu-host"
fi
ok "Generated .env.gpu-host"
# --- Build and start ---
info "Building $GPU_SERVICE image (first build downloads ML models — may take a while)..."
compose_cmd --env-file "$GPU_ENV" build "$GPU_SERVICE"
ok "$GPU_SERVICE image built"
info "Starting services..."
compose_cmd --env-file "$GPU_ENV" up -d
ok "Services started"
# --- Wait for health ---
info "Waiting for GPU service to be healthy (model loading takes 1-2 minutes)..."
local_url="http://localhost:8000"
for i in $(seq 1 40); do
if curl -sf "$local_url/docs" >/dev/null 2>&1; then
ok "GPU service is healthy!"
break
fi
if [[ $i -eq 40 ]]; then
err "GPU service did not become healthy after 5 minutes."
err "Check logs: docker compose -f docker-compose.gpu-host.yml logs gpu"
exit 1
fi
sleep 8
done
# --- Summary ---
echo ""
echo "=========================================="
echo -e " ${GREEN}GPU service is running!${NC}"
echo "=========================================="
echo ""
if [[ "$USE_CADDY" == "true" ]]; then
if [[ -n "$CUSTOM_DOMAIN" ]]; then
echo " URL: https://$CUSTOM_DOMAIN"
elif [[ -n "$PRIMARY_IP" ]]; then
echo " URL: https://$PRIMARY_IP"
else
echo " URL: https://localhost"
fi
else
if [[ -n "$PRIMARY_IP" ]]; then
echo " URL: http://$PRIMARY_IP:$HOST_PORT"
else
echo " URL: http://localhost:$HOST_PORT"
fi
fi
echo " Health: curl \$(URL)/docs"
[[ -n "$API_KEY" ]] && echo " API key: $API_KEY"
echo ""
echo " Configure the main Reflector instance (in server/.env):"
echo ""
local_gpu_url=""
if [[ "$USE_CADDY" == "true" ]]; then
if [[ -n "$CUSTOM_DOMAIN" ]]; then
local_gpu_url="https://$CUSTOM_DOMAIN"
elif [[ -n "$PRIMARY_IP" ]]; then
local_gpu_url="https://$PRIMARY_IP"
else
local_gpu_url="https://localhost"
fi
else
if [[ -n "$PRIMARY_IP" ]]; then
local_gpu_url="http://$PRIMARY_IP:$HOST_PORT"
else
local_gpu_url="http://localhost:$HOST_PORT"
fi
fi
echo " TRANSCRIPT_BACKEND=modal"
echo " TRANSCRIPT_URL=$local_gpu_url"
[[ -n "$API_KEY" ]] && echo " TRANSCRIPT_MODAL_API_KEY=$API_KEY"
echo " DIARIZATION_BACKEND=modal"
echo " DIARIZATION_URL=$local_gpu_url"
[[ -n "$API_KEY" ]] && echo " DIARIZATION_MODAL_API_KEY=$API_KEY"
echo " TRANSLATION_BACKEND=modal"
echo " TRANSLATE_URL=$local_gpu_url"
[[ -n "$API_KEY" ]] && echo " TRANSLATION_MODAL_API_KEY=$API_KEY"
echo ""
if [[ "$USE_CUSTOM_CA" == "true" ]]; then
echo " The Reflector instance must also trust this CA."
echo " On the Reflector machine, run setup-selfhosted.sh with:"
echo " --extra-ca /path/to/this-machines-ca.crt"
echo ""
fi
echo " DNS Resolution:"
if [[ -n "$CUSTOM_DOMAIN" ]]; then
echo " Ensure '$CUSTOM_DOMAIN' resolves to this machine's IP."
echo " Public: Create a DNS A record."
echo " Internal: Add to /etc/hosts on the Reflector machine:"
echo " ${PRIMARY_IP:-<GPU_IP>} $CUSTOM_DOMAIN"
else
echo " Use this machine's IP directly in TRANSCRIPT_URL/DIARIZATION_URL."
fi
echo ""
echo " To stop: docker compose -f docker-compose.gpu-host.yml down"
echo " To re-run: ./scripts/setup-gpu-host.sh $*"
echo " Logs: docker compose -f docker-compose.gpu-host.yml logs -f gpu"
echo ""

View File

@@ -4,13 +4,21 @@
# Single script to configure and launch everything on one server.
#
# Usage:
# ./scripts/setup-selfhosted.sh <--gpu|--cpu|--hosted> [--ollama-gpu|--ollama-cpu] [--llm-model MODEL] [--garage] [--caddy] [--domain DOMAIN] [--password PASSWORD] [--build]
# ./scripts/setup-selfhosted.sh <--gpu|--cpu|--hosted> [options] [--transcript BACKEND] [--diarization BACKEND] [--translation BACKEND] [--padding BACKEND] [--mixdown BACKEND]
# ./scripts/setup-selfhosted.sh (re-run with saved config from last run)
#
# ML processing modes (pick ONE — required):
# ML processing modes (pick ONE — required on first run):
# --gpu NVIDIA GPU container for transcription/diarization/translation
# --cpu In-process CPU processing (no ML container, slower)
# --hosted Remote GPU service URL (no ML container)
#
# Per-service backend overrides (optional — override individual services from the base mode):
# --transcript BACKEND whisper | modal (default: whisper for --cpu, modal for --gpu/--hosted)
# --diarization BACKEND pyannote | modal (default: pyannote for --cpu, modal for --gpu/--hosted)
# --translation BACKEND marian | modal | passthrough (default: marian for --cpu, modal for --gpu/--hosted)
# --padding BACKEND pyav | modal (default: pyav for --cpu, modal for --gpu/--hosted)
# --mixdown BACKEND pyav | modal (default: pyav for --cpu, modal for --gpu/--hosted)
#
# Local LLM (optional — for summarization & topic detection):
# --ollama-gpu Local Ollama with NVIDIA GPU acceleration
# --ollama-cpu Local Ollama on CPU only
@@ -23,6 +31,13 @@
# --domain DOMAIN Use a real domain for Caddy (enables Let's Encrypt auto-HTTPS)
# Requires: DNS pointing to this server + ports 80/443 open
# Without --domain: Caddy uses self-signed cert for IP access
# --custom-ca PATH Custom CA certificate for private HTTPS services
# PATH can be a directory (containing ca.crt, optionally server.pem + server-key.pem)
# or a single PEM file (CA trust only, no Caddy TLS)
# With server.pem+server-key.pem: Caddy serves HTTPS using those certs (requires --domain)
# Without: only injects CA trust into backend containers for outbound calls
# --extra-ca FILE Additional CA cert to trust (can be repeated for multiple CAs)
# Appended to the CA bundle so backends trust multiple authorities
# --password PASS Enable password auth with admin@localhost user
# --build Build backend and frontend images from source instead of pulling
#
@@ -31,10 +46,17 @@
# ./scripts/setup-selfhosted.sh --gpu --ollama-gpu --garage --caddy --domain reflector.example.com
# ./scripts/setup-selfhosted.sh --cpu --ollama-cpu --garage --caddy
# ./scripts/setup-selfhosted.sh --hosted --garage --caddy
# ./scripts/setup-selfhosted.sh --cpu --padding modal --garage --caddy
# ./scripts/setup-selfhosted.sh --gpu --translation passthrough --garage --caddy
# ./scripts/setup-selfhosted.sh --cpu --diarization modal --translation modal --garage
# ./scripts/setup-selfhosted.sh --gpu --ollama-gpu --llm-model mistral --garage --caddy
# ./scripts/setup-selfhosted.sh --gpu --garage --caddy --password mysecretpass
# ./scripts/setup-selfhosted.sh --gpu --garage --caddy
# ./scripts/setup-selfhosted.sh --cpu
# ./scripts/setup-selfhosted.sh --gpu --caddy --domain reflector.local --custom-ca certs/
# ./scripts/setup-selfhosted.sh --hosted --custom-ca /path/to/corporate-ca.crt
# ./scripts/setup-selfhosted.sh # re-run with saved config
#
# Config memory: after a successful run, flags are saved to data/.selfhosted-last-args.
# Re-running with no arguments replays the saved configuration automatically.
#
# The script auto-detects Daily.co (DAILY_API_KEY) and Whereby (WHEREBY_API_KEY)
# from server/.env. If Daily.co is configured, Hatchet workflow services are
@@ -50,6 +72,7 @@ ROOT_DIR="$(cd "$SCRIPT_DIR/.." && pwd)"
COMPOSE_FILE="$ROOT_DIR/docker-compose.selfhosted.yml"
SERVER_ENV="$ROOT_DIR/server/.env"
WWW_ENV="$ROOT_DIR/www/.env"
LAST_ARGS_FILE="$ROOT_DIR/data/.selfhosted-last-args"
OLLAMA_MODEL="qwen2.5:14b"
OS="$(uname -s)"
@@ -154,18 +177,32 @@ env_set() {
}
compose_cmd() {
local profiles=""
local profiles="" files="-f $COMPOSE_FILE"
[[ "$USE_CUSTOM_CA" == "true" ]] && files="$files -f $ROOT_DIR/docker-compose.ca.yml"
for p in "${COMPOSE_PROFILES[@]}"; do
profiles="$profiles --profile $p"
done
docker compose -f "$COMPOSE_FILE" $profiles "$@"
docker compose $files $profiles "$@"
}
# Compose command with only garage profile (for garage-only operations before full stack start)
compose_garage_cmd() {
docker compose -f "$COMPOSE_FILE" --profile garage "$@"
local files="-f $COMPOSE_FILE"
[[ "$USE_CUSTOM_CA" == "true" ]] && files="$files -f $ROOT_DIR/docker-compose.ca.yml"
docker compose $files --profile garage "$@"
}
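# With --custom-ca, compose_cmd resolves to an invocation like this (a sketch;
# the profile list varies with --gpu/--cpu/--ollama-*/--garage/--caddy):
#   docker compose -f docker-compose.selfhosted.yml -f docker-compose.ca.yml \
#     --profile gpu --profile caddy up -d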
# --- Config memory: replay last args if none provided ---
if [[ $# -eq 0 ]] && [[ -f "$LAST_ARGS_FILE" ]]; then
SAVED_ARGS="$(cat "$LAST_ARGS_FILE")"
if [[ -n "$SAVED_ARGS" ]]; then
info "No flags provided — replaying saved configuration:"
info " $SAVED_ARGS"
echo ""
eval "set -- $SAVED_ARGS"
fi
fi
# --- Parse arguments ---
MODEL_MODE="" # gpu or cpu (required, mutually exclusive)
OLLAMA_MODE="" # ollama-gpu or ollama-cpu (optional)
@@ -174,6 +211,22 @@ USE_CADDY=false
CUSTOM_DOMAIN="" # optional domain for Let's Encrypt HTTPS
BUILD_IMAGES=false # build backend/frontend from source
ADMIN_PASSWORD="" # optional admin password for password auth
CUSTOM_CA="" # --custom-ca: path to dir or CA cert file
USE_CUSTOM_CA=false # derived flag: true when --custom-ca is provided
EXTRA_CA_FILES=() # --extra-ca: additional CA certs to trust (can be repeated)
OVERRIDE_TRANSCRIPT="" # per-service override: whisper | modal
OVERRIDE_DIARIZATION="" # per-service override: pyannote | modal
OVERRIDE_TRANSLATION="" # per-service override: marian | modal | passthrough
OVERRIDE_PADDING="" # per-service override: pyav | modal
OVERRIDE_MIXDOWN="" # per-service override: pyav | modal
# Validate per-service backend override values
validate_backend() {
local service="$1" value="$2"; shift 2; local valid=("$@")
for v in "${valid[@]}"; do [[ "$value" == "$v" ]] && return 0; done
err "--$service value '$value' is not valid. Choose one of: ${valid[*]}"
exit 1
}
SKIP_NEXT=false
ARGS=("$@")
@@ -227,24 +280,159 @@ for i in "${!ARGS[@]}"; do
CUSTOM_DOMAIN="${ARGS[$next_i]}"
USE_CADDY=true # --domain implies --caddy
SKIP_NEXT=true ;;
--custom-ca)
next_i=$((i + 1))
if [[ $next_i -ge ${#ARGS[@]} ]] || [[ "${ARGS[$next_i]}" == --* ]]; then
err "--custom-ca requires a path to a directory or PEM certificate file"
exit 1
fi
CUSTOM_CA="${ARGS[$next_i]}"
USE_CUSTOM_CA=true
SKIP_NEXT=true ;;
--extra-ca)
next_i=$((i + 1))
if [[ $next_i -ge ${#ARGS[@]} ]] || [[ "${ARGS[$next_i]}" == --* ]]; then
err "--extra-ca requires a path to a PEM certificate file"
exit 1
fi
extra_ca_file="${ARGS[$next_i]}"
if [[ ! -f "$extra_ca_file" ]]; then
err "--extra-ca file not found: $extra_ca_file"
exit 1
fi
EXTRA_CA_FILES+=("$extra_ca_file")
USE_CUSTOM_CA=true
SKIP_NEXT=true ;;
--transcript)
next_i=$((i + 1))
if [[ $next_i -ge ${#ARGS[@]} ]] || [[ "${ARGS[$next_i]}" == --* ]]; then
err "--transcript requires a backend (whisper | modal)"
exit 1
fi
validate_backend "transcript" "${ARGS[$next_i]}" whisper modal
OVERRIDE_TRANSCRIPT="${ARGS[$next_i]}"
SKIP_NEXT=true ;;
--diarization)
next_i=$((i + 1))
if [[ $next_i -ge ${#ARGS[@]} ]] || [[ "${ARGS[$next_i]}" == --* ]]; then
err "--diarization requires a backend (pyannote | modal)"
exit 1
fi
validate_backend "diarization" "${ARGS[$next_i]}" pyannote modal
OVERRIDE_DIARIZATION="${ARGS[$next_i]}"
SKIP_NEXT=true ;;
--translation)
next_i=$((i + 1))
if [[ $next_i -ge ${#ARGS[@]} ]] || [[ "${ARGS[$next_i]}" == --* ]]; then
err "--translation requires a backend (marian | modal | passthrough)"
exit 1
fi
validate_backend "translation" "${ARGS[$next_i]}" marian modal passthrough
OVERRIDE_TRANSLATION="${ARGS[$next_i]}"
SKIP_NEXT=true ;;
--padding)
next_i=$((i + 1))
if [[ $next_i -ge ${#ARGS[@]} ]] || [[ "${ARGS[$next_i]}" == --* ]]; then
err "--padding requires a backend (pyav | modal)"
exit 1
fi
validate_backend "padding" "${ARGS[$next_i]}" pyav modal
OVERRIDE_PADDING="${ARGS[$next_i]}"
SKIP_NEXT=true ;;
--mixdown)
next_i=$((i + 1))
if [[ $next_i -ge ${#ARGS[@]} ]] || [[ "${ARGS[$next_i]}" == --* ]]; then
err "--mixdown requires a backend (pyav | modal)"
exit 1
fi
validate_backend "mixdown" "${ARGS[$next_i]}" pyav modal
OVERRIDE_MIXDOWN="${ARGS[$next_i]}"
SKIP_NEXT=true ;;
*)
err "Unknown argument: $arg"
err "Usage: $0 <--gpu|--cpu|--hosted> [--ollama-gpu|--ollama-cpu] [--llm-model MODEL] [--garage] [--caddy] [--domain DOMAIN] [--password PASS] [--build]"
err "Usage: $0 <--gpu|--cpu|--hosted> [options] [--transcript BACKEND] [--diarization BACKEND] [--translation BACKEND] [--padding BACKEND] [--mixdown BACKEND]"
exit 1
;;
esac
done
# --- Save CLI args for config memory (re-run without flags) ---
if [[ $# -gt 0 ]]; then
mkdir -p "$ROOT_DIR/data"
printf '%q ' "$@" > "$LAST_ARGS_FILE"
fi
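# Example (hypothetical flags): running `./scripts/setup-selfhosted.sh --gpu --caddy`
# writes `--gpu --caddy ` to data/.selfhosted-last-args; %q keeps values containing
# spaces replayable through the `eval "set -- $SAVED_ARGS"` path above.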
# --- Resolve --custom-ca flag ---
CA_CERT_PATH="" # resolved path to CA certificate
TLS_CERT_PATH="" # resolved path to server cert (optional, for Caddy TLS)
TLS_KEY_PATH="" # resolved path to server key (optional, for Caddy TLS)
if [[ "$USE_CUSTOM_CA" == "true" ]]; then
# Strip trailing slashes to avoid double-slash paths
CUSTOM_CA="${CUSTOM_CA%/}"
if [[ -z "$CUSTOM_CA" ]] && [[ -n "${EXTRA_CA_FILES[0]+x}" ]]; then
# --extra-ca only (no --custom-ca): use first extra CA as the base
CA_CERT_PATH="${EXTRA_CA_FILES[0]}"
unset 'EXTRA_CA_FILES[0]'
EXTRA_CA_FILES=("${EXTRA_CA_FILES[@]+"${EXTRA_CA_FILES[@]}"}")
elif [[ -d "$CUSTOM_CA" ]]; then
# Directory mode: look for convention files
if [[ ! -f "$CUSTOM_CA/ca.crt" ]]; then
err "CA certificate not found: $CUSTOM_CA/ca.crt"
err "Directory must contain ca.crt (and optionally server.pem + server-key.pem)"
exit 1
fi
CA_CERT_PATH="$CUSTOM_CA/ca.crt"
# Server cert/key are optional — if both present, use for Caddy TLS
if [[ -f "$CUSTOM_CA/server.pem" ]] && [[ -f "$CUSTOM_CA/server-key.pem" ]]; then
TLS_CERT_PATH="$CUSTOM_CA/server.pem"
TLS_KEY_PATH="$CUSTOM_CA/server-key.pem"
elif [[ -f "$CUSTOM_CA/server.pem" ]] || [[ -f "$CUSTOM_CA/server-key.pem" ]]; then
warn "Found only one of server.pem/server-key.pem in $CUSTOM_CA — both are needed for Caddy TLS. Skipping."
fi
elif [[ -f "$CUSTOM_CA" ]]; then
# Single file mode: CA trust only (no Caddy TLS certs)
CA_CERT_PATH="$CUSTOM_CA"
else
err "--custom-ca path not found: $CUSTOM_CA"
exit 1
fi
# Validate PEM format
if ! head -1 "$CA_CERT_PATH" | grep -q "BEGIN"; then
err "CA certificate does not appear to be PEM format: $CA_CERT_PATH"
exit 1
fi
# If server cert/key found, require --domain and imply --caddy
if [[ -n "$TLS_CERT_PATH" ]]; then
if [[ -z "$CUSTOM_DOMAIN" ]]; then
err "Server cert/key found in $CUSTOM_CA but --domain not set."
err "Provide --domain to specify the domain name matching the certificate."
exit 1
fi
USE_CADDY=true # custom TLS certs imply --caddy
fi
fi
if [[ -z "$MODEL_MODE" ]]; then
err "No model mode specified. You must choose --gpu, --cpu, or --hosted."
err ""
err "Usage: $0 <--gpu|--cpu|--hosted> [--ollama-gpu|--ollama-cpu] [--llm-model MODEL] [--garage] [--caddy] [--domain DOMAIN] [--password PASS] [--build]"
err "Usage: $0 <--gpu|--cpu|--hosted> [options] [--transcript BACKEND] [--diarization BACKEND] [--translation BACKEND] [--padding BACKEND] [--mixdown BACKEND]"
err ""
err "ML processing modes (required):"
err " --gpu NVIDIA GPU container for transcription/diarization/translation"
err " --cpu In-process CPU processing (no ML container, slower)"
err " --hosted Remote GPU service URL (no ML container)"
err ""
err "Per-service backend overrides (optional — override individual services):"
err " --transcript BACKEND whisper | modal (default: whisper for --cpu, modal for --gpu/--hosted)"
err " --diarization BACKEND pyannote | modal (default: pyannote for --cpu, modal for --gpu/--hosted)"
err " --translation BACKEND marian | modal | passthrough (default: marian for --cpu, modal for --gpu/--hosted)"
err " --padding BACKEND pyav | modal (default: pyav for --cpu, modal for --gpu/--hosted)"
err " --mixdown BACKEND pyav | modal (default: pyav for --cpu, modal for --gpu/--hosted)"
err ""
err "Local LLM (optional):"
err " --ollama-gpu Local Ollama with GPU (for summarization/topics)"
err " --ollama-cpu Local Ollama on CPU (for summarization/topics)"
@@ -255,8 +443,12 @@ if [[ -z "$MODEL_MODE" ]]; then
err " --garage Local S3-compatible storage (Garage)"
err " --caddy Caddy reverse proxy with self-signed cert"
err " --domain DOMAIN Use a real domain with Let's Encrypt HTTPS (implies --caddy)"
err " --custom-ca PATH Custom CA cert (dir with ca.crt[+server.pem+server-key.pem] or single PEM file)"
err " --extra-ca FILE Additional CA cert to trust (repeatable for multiple CAs)"
err " --password PASS Enable password auth (admin@localhost) instead of public mode"
err " --build Build backend/frontend images from source instead of pulling"
err ""
err "Tip: After your first run, re-run with no flags to reuse the same configuration."
exit 1
fi
@@ -280,9 +472,38 @@ OLLAMA_SVC=""
[[ "$OLLAMA_MODE" == "ollama-gpu" ]] && USES_OLLAMA=true && OLLAMA_SVC="ollama"
[[ "$OLLAMA_MODE" == "ollama-cpu" ]] && USES_OLLAMA=true && OLLAMA_SVC="ollama-cpu"
# Resolve effective backend per service (override wins over base mode default)
case "$MODEL_MODE" in
gpu|hosted)
EFF_TRANSCRIPT="${OVERRIDE_TRANSCRIPT:-modal}"
EFF_DIARIZATION="${OVERRIDE_DIARIZATION:-modal}"
EFF_TRANSLATION="${OVERRIDE_TRANSLATION:-modal}"
EFF_PADDING="${OVERRIDE_PADDING:-modal}"
EFF_MIXDOWN="${OVERRIDE_MIXDOWN:-modal}"
;;
cpu)
EFF_TRANSCRIPT="${OVERRIDE_TRANSCRIPT:-whisper}"
EFF_DIARIZATION="${OVERRIDE_DIARIZATION:-pyannote}"
EFF_TRANSLATION="${OVERRIDE_TRANSLATION:-marian}"
EFF_PADDING="${OVERRIDE_PADDING:-pyav}"
EFF_MIXDOWN="${OVERRIDE_MIXDOWN:-pyav}"
;;
esac
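# Worked example (hypothetical flags): `--cpu --mixdown modal` keeps the CPU
# defaults (whisper/pyannote/marian/pyav) and only sets EFF_MIXDOWN=modal.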
# Check if any per-service overrides were provided
HAS_OVERRIDES=false
[[ -n "$OVERRIDE_TRANSCRIPT" ]] && HAS_OVERRIDES=true
[[ -n "$OVERRIDE_DIARIZATION" ]] && HAS_OVERRIDES=true
[[ -n "$OVERRIDE_TRANSLATION" ]] && HAS_OVERRIDES=true
[[ -n "$OVERRIDE_PADDING" ]] && HAS_OVERRIDES=true
[[ -n "$OVERRIDE_MIXDOWN" ]] && HAS_OVERRIDES=true
# Human-readable mode string for display
MODE_DISPLAY="$MODEL_MODE"
[[ -n "$OLLAMA_MODE" ]] && MODE_DISPLAY="$MODEL_MODE + $OLLAMA_MODE"
if [[ "$HAS_OVERRIDES" == "true" ]]; then
MODE_DISPLAY="$MODE_DISPLAY (overrides: transcript=$EFF_TRANSCRIPT, diarization=$EFF_DIARIZATION, translation=$EFF_TRANSLATION, padding=$EFF_PADDING, mixdown=$EFF_MIXDOWN)"
fi
# =========================================================
# Step 0: Prerequisites
@@ -366,6 +587,103 @@ print(f'pbkdf2:sha256:100000\$\$' + salt + '\$\$' + dk.hex())
ok "Secrets ready"
}
# =========================================================
# Step 1b: Custom CA certificate setup
# =========================================================
step_custom_ca() {
if [[ "$USE_CUSTOM_CA" != "true" ]]; then
# Clean up stale override from previous runs
rm -f "$ROOT_DIR/docker-compose.ca.yml"
return
fi
info "Configuring custom CA certificate"
local certs_dir="$ROOT_DIR/certs"
mkdir -p "$certs_dir"
# Stage CA certificate (skip copy if source and dest are the same file)
local ca_dest="$certs_dir/ca.crt"
local src_id dst_id
src_id=$(ls -i "$CA_CERT_PATH" 2>/dev/null | awk '{print $1}')
dst_id=$(ls -i "$ca_dest" 2>/dev/null | awk '{print $1}')
if [[ "$src_id" != "$dst_id" ]] || [[ -z "$dst_id" ]]; then
cp "$CA_CERT_PATH" "$ca_dest"
fi
chmod 644 "$ca_dest"
ok "CA certificate staged at certs/ca.crt"
# Append extra CA certs (--extra-ca flags)
for extra_ca in "${EXTRA_CA_FILES[@]+"${EXTRA_CA_FILES[@]}"}"; do
if ! head -1 "$extra_ca" | grep -q "BEGIN"; then
warn "Skipping $extra_ca — does not appear to be PEM format"
continue
fi
echo "" >> "$ca_dest"
cat "$extra_ca" >> "$ca_dest"
ok "Appended extra CA: $extra_ca"
done
# Stage TLS cert/key if present (for Caddy)
if [[ -n "$TLS_CERT_PATH" ]]; then
local cert_dest="$certs_dir/server.pem"
local key_dest="$certs_dir/server-key.pem"
src_id=$(ls -i "$TLS_CERT_PATH" 2>/dev/null | awk '{print $1}')
dst_id=$(ls -i "$cert_dest" 2>/dev/null | awk '{print $1}')
if [[ "$src_id" != "$dst_id" ]] || [[ -z "$dst_id" ]]; then
cp "$TLS_CERT_PATH" "$cert_dest"
cp "$TLS_KEY_PATH" "$key_dest"
fi
chmod 644 "$cert_dest"
chmod 600 "$key_dest"
ok "TLS cert/key staged at certs/server.pem, certs/server-key.pem"
fi
# Generate docker-compose.ca.yml override
local ca_override="$ROOT_DIR/docker-compose.ca.yml"
cat > "$ca_override" << 'CAEOF'
# Generated by setup-selfhosted.sh — custom CA trust for backend services.
# Do not edit manually; re-run setup-selfhosted.sh with --custom-ca to regenerate.
services:
server:
volumes:
- ./certs/ca.crt:/usr/local/share/ca-certificates/custom-ca.crt:ro
worker:
volumes:
- ./certs/ca.crt:/usr/local/share/ca-certificates/custom-ca.crt:ro
beat:
volumes:
- ./certs/ca.crt:/usr/local/share/ca-certificates/custom-ca.crt:ro
hatchet-worker-llm:
volumes:
- ./certs/ca.crt:/usr/local/share/ca-certificates/custom-ca.crt:ro
hatchet-worker-cpu:
volumes:
- ./certs/ca.crt:/usr/local/share/ca-certificates/custom-ca.crt:ro
gpu:
volumes:
- ./certs/ca.crt:/usr/local/share/ca-certificates/custom-ca.crt:ro
cpu:
volumes:
- ./certs/ca.crt:/usr/local/share/ca-certificates/custom-ca.crt:ro
web:
environment:
NODE_EXTRA_CA_CERTS: /usr/local/share/ca-certificates/custom-ca.crt
volumes:
- ./certs/ca.crt:/usr/local/share/ca-certificates/custom-ca.crt:ro
CAEOF
# If TLS cert/key present, also mount certs dir into Caddy
if [[ -n "$TLS_CERT_PATH" ]]; then
cat >> "$ca_override" << 'CADDYCAEOF'
caddy:
volumes:
- ./certs:/etc/caddy/certs:ro
CADDYCAEOF
fi
ok "Generated docker-compose.ca.yml override"
}
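As a quick smoke test (not part of the generated script; the path assumes the compose override above is applied, and the exec invocation is illustrative, e.g. docker compose -f docker-compose.selfhosted.yml exec server uv run python check_ca.py), you can confirm from inside a container that the mounted CA file is readable and parses as a certificate:

import ssl

# Load only the mounted custom CA; raises ssl.SSLError if the PEM is invalid.
ctx = ssl.create_default_context(
    cafile="/usr/local/share/ca-certificates/custom-ca.crt"
)
print(ctx.cert_store_stats())  # the loaded CA should show up in the counts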
# =========================================================
# Step 2: Generate server/.env
# =========================================================
@@ -432,54 +750,30 @@ step_server_env() {
env_set "$SERVER_ENV" "WEBRTC_HOST" "$PRIMARY_IP"
fi
# Specialized models — backend configuration per mode
# Specialized models — backend configuration per service
env_set "$SERVER_ENV" "DIARIZATION_ENABLED" "true"
# Resolve the URL for modal backends
local modal_url=""
case "$MODEL_MODE" in
gpu)
# GPU container aliased as "transcription" on the Docker network
env_set "$SERVER_ENV" "TRANSCRIPT_BACKEND" "modal"
env_set "$SERVER_ENV" "TRANSCRIPT_URL" "http://transcription:8000"
env_set "$SERVER_ENV" "TRANSCRIPT_MODAL_API_KEY" "selfhosted"
env_set "$SERVER_ENV" "DIARIZATION_BACKEND" "modal"
env_set "$SERVER_ENV" "DIARIZATION_URL" "http://transcription:8000"
env_set "$SERVER_ENV" "TRANSLATION_BACKEND" "modal"
env_set "$SERVER_ENV" "TRANSLATE_URL" "http://transcription:8000"
env_set "$SERVER_ENV" "PADDING_BACKEND" "modal"
env_set "$SERVER_ENV" "PADDING_URL" "http://transcription:8000"
ok "ML backends: GPU container (modal)"
;;
cpu)
# In-process backends — no ML service container needed
env_set "$SERVER_ENV" "TRANSCRIPT_BACKEND" "whisper"
env_set "$SERVER_ENV" "DIARIZATION_BACKEND" "pyannote"
env_set "$SERVER_ENV" "TRANSLATION_BACKEND" "marian"
env_set "$SERVER_ENV" "PADDING_BACKEND" "pyav"
ok "ML backends: in-process CPU (whisper/pyannote/marian/pyav)"
modal_url="http://transcription:8000"
;;
hosted)
# Remote GPU service — user provides URL
local gpu_url=""
if env_has_key "$SERVER_ENV" "TRANSCRIPT_URL"; then
gpu_url=$(env_get "$SERVER_ENV" "TRANSCRIPT_URL")
modal_url=$(env_get "$SERVER_ENV" "TRANSCRIPT_URL")
fi
if [[ -z "$gpu_url" ]] && [[ -t 0 ]]; then
if [[ -z "$modal_url" ]] && [[ -t 0 ]]; then
echo ""
info "Enter the URL of your remote GPU service (e.g. https://gpu.example.com)"
read -rp " GPU service URL: " gpu_url
read -rp " GPU service URL: " modal_url
fi
if [[ -z "$gpu_url" ]]; then
if [[ -z "$modal_url" ]]; then
err "GPU service URL required for --hosted mode."
err "Set TRANSCRIPT_URL in server/.env or provide it interactively."
exit 1
fi
env_set "$SERVER_ENV" "TRANSCRIPT_BACKEND" "modal"
env_set "$SERVER_ENV" "TRANSCRIPT_URL" "$gpu_url"
env_set "$SERVER_ENV" "DIARIZATION_BACKEND" "modal"
env_set "$SERVER_ENV" "DIARIZATION_URL" "$gpu_url"
env_set "$SERVER_ENV" "TRANSLATION_BACKEND" "modal"
env_set "$SERVER_ENV" "TRANSLATE_URL" "$gpu_url"
env_set "$SERVER_ENV" "PADDING_BACKEND" "modal"
env_set "$SERVER_ENV" "PADDING_URL" "$gpu_url"
# API key for remote service
local gpu_api_key=""
if env_has_key "$SERVER_ENV" "TRANSCRIPT_MODAL_API_KEY"; then
@@ -491,15 +785,106 @@ step_server_env() {
if [[ -n "$gpu_api_key" ]]; then
env_set "$SERVER_ENV" "TRANSCRIPT_MODAL_API_KEY" "$gpu_api_key"
fi
ok "ML backends: remote hosted ($gpu_url)"
;;
cpu)
# CPU mode: modal_url stays empty. If services are overridden to modal,
# the user must configure the URL (TRANSCRIPT_URL etc.) in server/.env manually.
# We intentionally do NOT read from existing env here to avoid overwriting
# per-service URLs with a stale TRANSCRIPT_URL from a previous --gpu run.
;;
esac
# Set each service backend independently using effective backends
# Transcript
case "$EFF_TRANSCRIPT" in
modal)
env_set "$SERVER_ENV" "TRANSCRIPT_BACKEND" "modal"
if [[ -n "$modal_url" ]]; then
env_set "$SERVER_ENV" "TRANSCRIPT_URL" "$modal_url"
fi
[[ "$MODEL_MODE" == "gpu" ]] && env_set "$SERVER_ENV" "TRANSCRIPT_MODAL_API_KEY" "selfhosted"
;;
whisper)
env_set "$SERVER_ENV" "TRANSCRIPT_BACKEND" "whisper"
;;
esac
# Diarization
case "$EFF_DIARIZATION" in
modal)
env_set "$SERVER_ENV" "DIARIZATION_BACKEND" "modal"
if [[ -n "$modal_url" ]]; then
env_set "$SERVER_ENV" "DIARIZATION_URL" "$modal_url"
fi
;;
pyannote)
env_set "$SERVER_ENV" "DIARIZATION_BACKEND" "pyannote"
;;
esac
# Translation
case "$EFF_TRANSLATION" in
modal)
env_set "$SERVER_ENV" "TRANSLATION_BACKEND" "modal"
if [[ -n "$modal_url" ]]; then
env_set "$SERVER_ENV" "TRANSLATE_URL" "$modal_url"
fi
;;
marian)
env_set "$SERVER_ENV" "TRANSLATION_BACKEND" "marian"
;;
passthrough)
env_set "$SERVER_ENV" "TRANSLATION_BACKEND" "passthrough"
;;
esac
# Padding
case "$EFF_PADDING" in
modal)
env_set "$SERVER_ENV" "PADDING_BACKEND" "modal"
if [[ -n "$modal_url" ]]; then
env_set "$SERVER_ENV" "PADDING_URL" "$modal_url"
fi
;;
pyav)
env_set "$SERVER_ENV" "PADDING_BACKEND" "pyav"
;;
esac
# Mixdown
case "$EFF_MIXDOWN" in
modal)
env_set "$SERVER_ENV" "MIXDOWN_BACKEND" "modal"
if [[ -n "$modal_url" ]]; then
env_set "$SERVER_ENV" "MIXDOWN_URL" "$modal_url"
fi
;;
pyav)
env_set "$SERVER_ENV" "MIXDOWN_BACKEND" "pyav"
;;
esac
# Warn about modal overrides in CPU mode that need URL configuration
if [[ "$MODEL_MODE" == "cpu" ]] && [[ -z "$modal_url" ]]; then
local needs_url=false
[[ "$EFF_TRANSCRIPT" == "modal" ]] && needs_url=true
[[ "$EFF_DIARIZATION" == "modal" ]] && needs_url=true
[[ "$EFF_TRANSLATION" == "modal" ]] && needs_url=true
[[ "$EFF_PADDING" == "modal" ]] && needs_url=true
[[ "$EFF_MIXDOWN" == "modal" ]] && needs_url=true
if [[ "$needs_url" == "true" ]]; then
warn "One or more services are set to 'modal' but no service URL is configured."
warn "Set TRANSCRIPT_URL (and optionally TRANSCRIPT_MODAL_API_KEY) in server/.env"
warn "to point to your GPU service, then re-run this script."
fi
fi
ok "ML backends: transcript=$EFF_TRANSCRIPT, diarization=$EFF_DIARIZATION, translation=$EFF_TRANSLATION, padding=$EFF_PADDING, mixdown=$EFF_MIXDOWN"
# HuggingFace token for gated models (pyannote diarization)
# --gpu: written to root .env (docker compose passes to GPU container)
# --cpu: written to both root .env and server/.env (in-process pyannote needs it)
# --hosted: not needed (remote service handles its own auth)
if [[ "$MODEL_MODE" != "hosted" ]]; then
# Needed when: GPU container is running (MODEL_MODE=gpu), or diarization uses pyannote in-process
# Not needed when: all modal services point to a remote hosted URL with its own auth
if [[ "$MODEL_MODE" == "gpu" ]] || [[ "$EFF_DIARIZATION" == "pyannote" ]]; then
local root_env="$ROOT_DIR/.env"
local current_hf_token="${HF_TOKEN:-}"
if [[ -f "$root_env" ]] && env_has_key "$root_env" "HF_TOKEN"; then
@@ -518,8 +903,8 @@ step_server_env() {
touch "$root_env"
env_set "$root_env" "HF_TOKEN" "$current_hf_token"
export HF_TOKEN="$current_hf_token"
# In CPU mode, server process needs HF_TOKEN directly
if [[ "$MODEL_MODE" == "cpu" ]]; then
# When diarization runs in-process (pyannote), server process needs HF_TOKEN directly
if [[ "$EFF_DIARIZATION" == "pyannote" ]]; then
env_set "$SERVER_ENV" "HF_TOKEN" "$current_hf_token"
fi
ok "HF_TOKEN configured"
@@ -552,11 +937,15 @@ step_server_env() {
fi
fi
# CPU mode: increase file processing timeouts (default 600s is too short for long audio on CPU)
if [[ "$MODEL_MODE" == "cpu" ]]; then
# Increase file processing timeouts for CPU backends (default 600s is too short for long audio on CPU)
if [[ "$EFF_TRANSCRIPT" == "whisper" ]]; then
env_set "$SERVER_ENV" "TRANSCRIPT_FILE_TIMEOUT" "3600"
fi
if [[ "$EFF_DIARIZATION" == "pyannote" ]]; then
env_set "$SERVER_ENV" "DIARIZATION_FILE_TIMEOUT" "3600"
ok "CPU mode — file processing timeouts set to 3600s (1 hour)"
fi
if [[ "$EFF_TRANSCRIPT" == "whisper" ]] || [[ "$EFF_DIARIZATION" == "pyannote" ]]; then
ok "CPU backend(s) detected — file processing timeouts set to 3600s (1 hour)"
fi
# Hatchet is always required (file, live, and multitrack pipelines all use it)
@@ -799,7 +1188,25 @@ step_caddyfile() {
rm -rf "$caddyfile"
fi
if [[ -n "$CUSTOM_DOMAIN" ]]; then
if [[ -n "$TLS_CERT_PATH" ]] && [[ -n "$CUSTOM_DOMAIN" ]]; then
# Custom domain with user-provided TLS certificate (from --custom-ca directory)
cat > "$caddyfile" << CADDYEOF
# Generated by setup-selfhosted.sh — Custom TLS cert for $CUSTOM_DOMAIN
$CUSTOM_DOMAIN {
tls /etc/caddy/certs/server.pem /etc/caddy/certs/server-key.pem
handle /v1/* {
reverse_proxy server:1250
}
handle /health {
reverse_proxy server:1250
}
handle {
reverse_proxy web:3000
}
}
CADDYEOF
ok "Created Caddyfile for $CUSTOM_DOMAIN (custom TLS certificate)"
elif [[ -n "$CUSTOM_DOMAIN" ]]; then
# Real domain: Caddy auto-provisions Let's Encrypt certificate
cat > "$caddyfile" << CADDYEOF
# Generated by setup-selfhosted.sh — Let's Encrypt HTTPS for $CUSTOM_DOMAIN
@@ -966,9 +1373,9 @@ step_health() {
warn "Check with: docker compose -f docker-compose.selfhosted.yml logs gpu"
fi
elif [[ "$MODEL_MODE" == "cpu" ]]; then
ok "CPU mode — ML processing runs in-process on server/worker (no separate service)"
ok "CPU mode — in-process backends run on server/worker (transcript=$EFF_TRANSCRIPT, diarization=$EFF_DIARIZATION, translation=$EFF_TRANSLATION, padding=$EFF_PADDING, mixdown=$EFF_MIXDOWN)"
elif [[ "$MODEL_MODE" == "hosted" ]]; then
ok "Hosted mode — ML processing via remote GPU service (no local health check)"
ok "Hosted mode — ML processing via remote GPU service (transcript=$EFF_TRANSCRIPT, diarization=$EFF_DIARIZATION, translation=$EFF_TRANSLATION, padding=$EFF_PADDING, mixdown=$EFF_MIXDOWN)"
fi
# Ollama (if applicable)
@@ -1166,10 +1573,16 @@ main() {
echo "=========================================="
echo ""
echo " Models: $MODEL_MODE"
if [[ "$HAS_OVERRIDES" == "true" ]]; then
echo " transcript=$EFF_TRANSCRIPT, diarization=$EFF_DIARIZATION"
echo " translation=$EFF_TRANSLATION, padding=$EFF_PADDING, mixdown=$EFF_MIXDOWN"
fi
echo " LLM: ${OLLAMA_MODE:-external}"
echo " Garage: $USE_GARAGE"
echo " Caddy: $USE_CADDY"
[[ -n "$CUSTOM_DOMAIN" ]] && echo " Domain: $CUSTOM_DOMAIN"
[[ "$USE_CUSTOM_CA" == "true" ]] && echo " CA: Custom ($CUSTOM_CA)"
[[ -n "$TLS_CERT_PATH" ]] && echo " TLS: Custom cert (from $CUSTOM_CA)"
[[ "$BUILD_IMAGES" == "true" ]] && echo " Build: from source"
echo ""
@@ -1200,6 +1613,8 @@ main() {
echo ""
step_secrets
echo ""
step_custom_ca
echo ""
step_server_env
echo ""
@@ -1274,7 +1689,13 @@ EOF
echo " API: server:1250 (or localhost:1250 from host)"
fi
echo ""
echo " Models: $MODEL_MODE (transcription/diarization/translation)"
if [[ "$HAS_OVERRIDES" == "true" ]]; then
echo " Models: $MODEL_MODE base + overrides"
echo " transcript=$EFF_TRANSCRIPT, diarization=$EFF_DIARIZATION"
echo " translation=$EFF_TRANSLATION, padding=$EFF_PADDING, mixdown=$EFF_MIXDOWN"
else
echo " Models: $MODEL_MODE (transcription/diarization/translation/padding)"
fi
[[ "$USE_GARAGE" == "true" ]] && echo " Storage: Garage (local S3)"
[[ "$USE_GARAGE" != "true" ]] && echo " Storage: External S3"
[[ "$USES_OLLAMA" == "true" ]] && echo " LLM: Ollama ($OLLAMA_MODEL) for summarization/topics"
@@ -1282,9 +1703,20 @@ EOF
[[ "$DAILY_DETECTED" == "true" ]] && echo " Video: Daily.co (live rooms + multitrack processing via Hatchet)"
[[ "$WHEREBY_DETECTED" == "true" ]] && echo " Video: Whereby (live rooms)"
[[ "$ANY_PLATFORM_DETECTED" != "true" ]] && echo " Video: None (rooms disabled)"
if [[ "$USE_CUSTOM_CA" == "true" ]]; then
echo " CA: Custom (certs/ca.crt)"
[[ -n "$TLS_CERT_PATH" ]] && echo " TLS: Custom cert (certs/server.pem)"
fi
echo ""
if [[ "$USE_CUSTOM_CA" == "true" ]]; then
echo " NOTE: Clients must trust the CA certificate to avoid browser warnings."
echo " CA cert location: certs/ca.crt"
echo " See docsv2/custom-ca-setup.md for instructions."
echo ""
fi
echo " To stop: docker compose -f docker-compose.selfhosted.yml down"
echo " To re-run: ./scripts/setup-selfhosted.sh $*"
echo " To re-run: ./scripts/setup-selfhosted.sh (replays saved config)"
echo " Last args: $*"
echo ""
}

View File

@@ -6,7 +6,7 @@ ENV PYTHONUNBUFFERED=1 \
# builder install base dependencies
WORKDIR /tmp
RUN apt-get update && apt-get install -y curl ffmpeg && apt-get clean
RUN apt-get update && apt-get install -y curl ffmpeg ca-certificates && apt-get clean
ADD https://astral.sh/uv/install.sh /uv-installer.sh
RUN sh /uv-installer.sh && rm /uv-installer.sh
ENV PATH="/root/.local/bin/:$PATH"
@@ -18,7 +18,7 @@ COPY pyproject.toml uv.lock README.md /app/
RUN uv sync --compile-bytecode --locked
# bootstrap
COPY alembic.ini runserver.sh /app/
COPY alembic.ini docker-entrypoint.sh runserver.sh /app/
COPY images /app/images
COPY migrations /app/migrations
COPY reflector /app/reflector
@@ -35,4 +35,6 @@ RUN if [ "$(uname -m)" = "aarch64" ] && [ ! -f /usr/lib/libgomp.so.1 ]; then \
# Pre-check just to make sure the image will not fail
RUN uv run python -c "import silero_vad.model"
CMD ["./runserver.sh"]
RUN chmod +x /app/docker-entrypoint.sh
CMD ["./docker-entrypoint.sh"]

View File

@@ -0,0 +1,25 @@
#!/bin/bash
set -e
# Custom CA certificate injection
# If a CA cert is mounted at this path (via docker-compose.ca.yml),
# add it to the system trust store and configure all Python SSL libraries.
CUSTOM_CA_PATH="/usr/local/share/ca-certificates/custom-ca.crt"
if [ -s "$CUSTOM_CA_PATH" ]; then
echo "[entrypoint] Custom CA certificate detected, updating trust store..."
update-ca-certificates 2>/dev/null
# update-ca-certificates creates a combined bundle (system + custom CAs)
COMBINED_BUNDLE="/etc/ssl/certs/ca-certificates.crt"
export SSL_CERT_FILE="$COMBINED_BUNDLE"
export REQUESTS_CA_BUNDLE="$COMBINED_BUNDLE"
export CURL_CA_BUNDLE="$COMBINED_BUNDLE"
# Note: GRPC_DEFAULT_SSL_ROOTS_FILE_PATH is intentionally NOT set here.
# Setting it causes grpcio to attempt TLS on internal Hatchet connections
# that run without TLS (SERVER_GRPC_INSECURE=t), resulting in handshake failures.
# If you need gRPC with custom CA, set GRPC_DEFAULT_SSL_ROOTS_FILE_PATH explicitly.
echo "[entrypoint] CA trust store updated (SSL_CERT_FILE=$COMBINED_BUNDLE)"
fi
exec ./runserver.sh
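The exported variables are the standard knobs for Python HTTP stacks; a minimal sketch of what they buy application code (the host name is a placeholder, and passing verify= explicitly covers httpx versions that pin their own certifi bundle):

import os

import httpx
import requests

url = "https://internal.example.com/health"  # placeholder internal service

# requests consults REQUESTS_CA_BUNDLE on its own (trust_env defaults to True).
requests.get(url)

# httpx may default to its bundled certifi store, so point it at the
# combined bundle explicitly to be safe.
httpx.get(url, verify=os.environ["SSL_CERT_FILE"])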

View File

@@ -120,7 +120,8 @@ class Meeting(BaseModel):
daily_composed_video_s3_key: str | None = None
daily_composed_video_duration: int | None = None
# Email recipients for transcript notification
email_recipients: list[str] | None = None
# Each entry is {"email": str, "include_link": bool} or a legacy plain str
email_recipients: list[dict | str] | None = None
class MeetingController:
@@ -399,15 +400,27 @@ class MeetingController:
async with get_database().transaction(isolation="serializable"):
yield
async def add_email_recipient(self, meeting_id: str, email: str) -> list[str]:
"""Add an email to the meeting's email_recipients list (no duplicates)."""
async def add_email_recipient(
self, meeting_id: str, email: str, *, include_link: bool = True
) -> list[dict]:
"""Add an email to the meeting's email_recipients list (no duplicates).
Each entry is stored as {"email": str, "include_link": bool}.
Legacy plain-string entries are normalised on read.
"""
async with self.transaction():
meeting = await self.get_by_id(meeting_id)
if not meeting:
raise ValueError(f"Meeting {meeting_id} not found")
current = meeting.email_recipients or []
if email not in current:
current.append(email)
# Normalise legacy string entries
current: list[dict] = [
entry
if isinstance(entry, dict)
else {"email": entry, "include_link": True}
for entry in (meeting.email_recipients or [])
]
if not any(r["email"] == email for r in current):
current.append({"email": email, "include_link": include_link})
await self.update_meeting(meeting_id, email_recipients=current)
return current
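For illustration (addresses made up), a legacy list mixing both shapes normalises like this on the next add:

# Stored by an older release:
# ["alice@example.com", {"email": "bob@example.com", "include_link": False}]
#
# After add_email_recipient(meeting_id, "carol@example.com"):
# [
#     {"email": "alice@example.com", "include_link": True},   # legacy str upgraded
#     {"email": "bob@example.com", "include_link": False},    # kept as stored
#     {"email": "carol@example.com", "include_link": True},   # new, default flag
# ]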

View File

@@ -78,6 +78,14 @@ class RecordingController:
)
await get_database().execute(query)
async def restore_by_id(self, id: str) -> None:
query = recordings.update().where(recordings.c.id == id).values(deleted_at=None)
await get_database().execute(query)
async def hard_delete_by_id(self, id: str) -> None:
query = recordings.delete().where(recordings.c.id == id)
await get_database().execute(query)
async def set_meeting_id(
self,
recording_id: NonEmptyString,

View File

@@ -138,6 +138,7 @@ class SearchParameters(BaseModel):
source_kind: SourceKind | None = None
from_datetime: datetime | None = None
to_datetime: datetime | None = None
include_deleted: bool = False
class SearchResultDB(BaseModel):
@@ -387,7 +388,10 @@ class SearchController:
transcripts.join(rooms, transcripts.c.room_id == rooms.c.id, isouter=True)
)
base_query = base_query.where(transcripts.c.deleted_at.is_(None))
if params.include_deleted:
base_query = base_query.where(transcripts.c.deleted_at.isnot(None))
else:
base_query = base_query.where(transcripts.c.deleted_at.is_(None))
if params.query_text is not None:
# because already initialized based on params.query_text presence above
@@ -396,7 +400,13 @@ class SearchController:
transcripts.c.search_vector_en.op("@@")(search_query)
)
if params.user_id:
if params.include_deleted:
# Trash view: only show user's own deleted transcripts.
# Defense-in-depth: require user_id to prevent leaking all users' trash.
if not params.user_id:
return [], 0
base_query = base_query.where(transcripts.c.user_id == params.user_id)
elif params.user_id:
base_query = base_query.where(
sqlalchemy.or_(
transcripts.c.user_id == params.user_id, rooms.c.is_shared
@@ -421,6 +431,8 @@ class SearchController:
if params.query_text is not None:
order_by = sqlalchemy.desc(sqlalchemy.text("rank"))
elif params.include_deleted:
order_by = sqlalchemy.desc(transcripts.c.deleted_at)
else:
order_by = sqlalchemy.desc(transcripts.c.created_at)
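Putting the pieces together, a hedged usage sketch of the trash path (identifiers from the diff; the user id is made up):

params = SearchParameters(user_id="user-123", include_deleted=True)
results, total = await search_controller.search_transcripts(params)
# Only user-123's own soft-deleted transcripts, ordered by deleted_at descending;
# the same call without a user_id short-circuits to ([], 0).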

View File

@@ -24,7 +24,7 @@ from reflector.db.utils import is_postgresql
from reflector.logger import logger
from reflector.processors.types import Word as ProcessorWord
from reflector.settings import settings
from reflector.storage import get_transcripts_storage
from reflector.storage import get_source_storage, get_transcripts_storage
from reflector.utils import generate_uuid4
from reflector.utils.webvtt import topics_to_webvtt
@@ -676,6 +676,126 @@ class TranscriptController:
)
await get_database().execute(query)
async def restore_by_id(
self,
transcript_id: str,
user_id: str | None = None,
) -> bool:
"""
Restore a soft-deleted transcript by clearing deleted_at.
Also restores the associated recording if present.
Returns True if the transcript was restored, False otherwise.
"""
transcript = await self.get_by_id(transcript_id)
if not transcript:
return False
if transcript.deleted_at is None:
return False
if user_id is not None and transcript.user_id != user_id:
return False
query = (
transcripts.update()
.where(transcripts.c.id == transcript_id)
.values(deleted_at=None)
)
await get_database().execute(query)
if transcript.recording_id:
try:
await recordings_controller.restore_by_id(transcript.recording_id)
except Exception as e:
logger.warning(
"Failed to restore recording",
exc_info=e,
recording_id=transcript.recording_id,
)
return True
async def hard_delete(self, transcript_id: str) -> None:
"""
Permanently delete a transcript, its recording, and all associated files.
Only deletes transcript-owned resources:
- Transcript row and recording row from DB (first, to make data inaccessible)
- Transcript audio in S3 storage
- Recording files in S3 (both object_key and track_keys, since a recording can have both)
- Local files (data_path directory)
Does NOT delete: meetings, consent records, rooms, or any shared entity.
Requires the transcript to be soft-deleted first (deleted_at must be set).
"""
transcript = await self.get_by_id(transcript_id)
if not transcript:
return
if transcript.deleted_at is None:
return
# Collect file references before deleting DB rows
recording = None
recording_storage = None
if transcript.recording_id:
recording = await recordings_controller.get_by_id(transcript.recording_id)
# Determine the correct storage backend for recording files.
# Recordings from different platforms (daily, whereby) live in
# platform-specific buckets with separate credentials.
if recording and recording.meeting_id:
from reflector.db.meetings import meetings_controller # noqa: PLC0415
meeting = await meetings_controller.get_by_id(recording.meeting_id)
if meeting:
recording_storage = get_source_storage(meeting.platform)
if recording_storage is None:
recording_storage = get_transcripts_storage()
# 1. Hard-delete DB rows first (makes data inaccessible immediately)
if recording:
await recordings_controller.hard_delete_by_id(recording.id)
await get_database().execute(
transcripts.delete().where(transcripts.c.id == transcript_id)
)
# 2. Delete transcript audio from S3 (always uses transcript storage)
transcript_storage = get_transcripts_storage()
if transcript.audio_location == "storage" and not transcript.audio_deleted:
try:
await transcript_storage.delete_file(transcript.storage_audio_path)
except Exception as e:
logger.warning(
"Failed to delete transcript audio from storage",
exc_info=e,
transcript_id=transcript_id,
path=transcript.storage_audio_path,
)
# 3. Delete recording files from S3 (both object_key and track_keys —
# a recording can have both, unlike consent cleanup which uses elif).
# Uses platform-specific storage resolved above.
if recording and recording.bucket_name and recording_storage:
keys_to_delete = []
if recording.track_keys:
keys_to_delete = recording.track_keys
if recording.object_key:
keys_to_delete.append(recording.object_key)
for key in keys_to_delete:
try:
await recording_storage.delete_file(
key, bucket=recording.bucket_name
)
except Exception as e:
logger.warning(
"Failed to delete recording file",
exc_info=e,
key=key,
bucket=recording.bucket_name,
)
# 4. Delete local files
transcript.unlink()
async def remove_by_recording_id(self, recording_id: str):
"""
Soft-delete a transcript by recording_id

View File

@@ -1,11 +1,13 @@
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from html import escape
import aiosmtplib
import structlog
from reflector.db.transcripts import Transcript
from reflector.db.transcripts import SourceKind, Transcript
from reflector.settings import settings
from reflector.utils.transcript_formats import transcript_to_text_timestamped
logger = structlog.get_logger(__name__)
@@ -18,35 +20,111 @@ def get_transcript_url(transcript: Transcript) -> str:
return f"{settings.UI_BASE_URL}/transcripts/{transcript.id}"
def _build_plain_text(transcript: Transcript, url: str) -> str:
def _get_timestamped_text(transcript: Transcript) -> str:
"""Build the full timestamped transcript text using existing utility."""
if not transcript.topics:
return ""
is_multitrack = transcript.source_kind == SourceKind.ROOM
return transcript_to_text_timestamped(
transcript.topics, transcript.participants, is_multitrack=is_multitrack
)
def _build_plain_text(transcript: Transcript, url: str, include_link: bool) -> str:
title = transcript.title or "Unnamed recording"
lines = [
f"Your transcript is ready: {title}",
"",
f"View it here: {url}",
]
lines = [f"Reflector: {title}", ""]
if transcript.short_summary:
lines.extend(["", "Summary:", transcript.short_summary])
lines.extend(["Summary:", transcript.short_summary, ""])
timestamped = _get_timestamped_text(transcript)
if timestamped:
lines.extend(["Transcript:", timestamped, ""])
if include_link:
lines.append(f"View transcript: {url}")
lines.append("")
lines.append(
"This email was sent because you requested to receive "
"the transcript from a meeting."
)
return "\n".join(lines)
def _build_html(transcript: Transcript, url: str) -> str:
title = transcript.title or "Unnamed recording"
def _build_html(transcript: Transcript, url: str, include_link: bool) -> str:
title = escape(transcript.title or "Unnamed recording")
summary_html = ""
if transcript.short_summary:
summary_html = f"<p style='color:#555;'>{transcript.short_summary}</p>"
summary_html = (
f'<p style="color:#555;margin-bottom:16px;">'
f"{escape(transcript.short_summary)}</p>"
)
transcript_html = ""
timestamped = _get_timestamped_text(transcript)
if timestamped:
# Build styled transcript lines
styled_lines = []
for line in timestamped.split("\n"):
if not line.strip():
continue
# Lines are formatted as "[MM:SS] Speaker: text"
if line.startswith("[") and "] " in line:
bracket_end = line.index("] ")
timestamp = escape(line[: bracket_end + 1])
rest = line[bracket_end + 2 :]
if ": " in rest:
colon_pos = rest.index(": ")
speaker = escape(rest[:colon_pos])
text = escape(rest[colon_pos + 2 :])
styled_lines.append(
f'<div style="margin-bottom:4px;">'
f'<span style="color:#888;font-size:12px;">{timestamp}</span> '
f"<strong>{speaker}:</strong> {text}</div>"
)
else:
styled_lines.append(
f'<div style="margin-bottom:4px;">{escape(line)}</div>'
)
else:
styled_lines.append(
f'<div style="margin-bottom:4px;">{escape(line)}</div>'
)
transcript_html = (
'<h3 style="margin-top:20px;margin-bottom:8px;">Transcript</h3>'
'<div style="background:#f7f7f7;padding:16px;border-radius:6px;'
'font-size:13px;line-height:1.6;max-height:600px;overflow-y:auto;">'
f"{''.join(styled_lines)}</div>"
)
link_html = ""
if include_link:
link_html = (
'<p style="margin-top:20px;">'
f'<a href="{url}" style="display:inline-block;padding:10px 20px;'
"background:#4A90D9;color:#fff;text-decoration:none;"
'border-radius:4px;">View Transcript</a></p>'
)
return f"""\
<div style="font-family:sans-serif;max-width:600px;margin:0 auto;">
<h2>Your transcript is ready</h2>
<p><strong>{title}</strong></p>
<h2 style="margin-bottom:4px;">{title}</h2>
{summary_html}
<p><a href="{url}" style="display:inline-block;padding:10px 20px;background:#4A90D9;color:#fff;text-decoration:none;border-radius:4px;">View Transcript</a></p>
<p style="color:#999;font-size:12px;">This email was sent because you requested to receive the transcript from a meeting.</p>
{transcript_html}
{link_html}
<p style="color:#999;font-size:12px;margin-top:20px;">This email was sent because you requested to receive the transcript from a meeting.</p>
</div>"""
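For reference, the line shape the styling loop above expects (the example line is made up):

line = "[03:15] Alice: Let's review the sprint board."
# -> timestamp "[03:15]", speaker "Alice", text "Let's review the sprint board.";
# lines not matching "[MM:SS] Speaker: text" fall through to the plain-div branch.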
async def send_transcript_email(to_emails: list[str], transcript: Transcript) -> int:
async def send_transcript_email(
to_emails: list[str],
transcript: Transcript,
*,
include_link: bool = True,
) -> int:
"""Send transcript notification to all emails. Returns count sent."""
if not is_email_configured() or not to_emails:
return 0
@@ -57,12 +135,12 @@ async def send_transcript_email(to_emails: list[str], transcript: Transcript) ->
for email_addr in to_emails:
msg = MIMEMultipart("alternative")
msg["Subject"] = f"Transcript Ready: {title}"
msg["Subject"] = f"Reflector: {title}"
msg["From"] = settings.SMTP_FROM_EMAIL
msg["To"] = email_addr
msg.attach(MIMEText(_build_plain_text(transcript, url), "plain"))
msg.attach(MIMEText(_build_html(transcript, url), "html"))
msg.attach(MIMEText(_build_plain_text(transcript, url, include_link), "plain"))
msg.attach(MIMEText(_build_html(transcript, url, include_link), "html"))
try:
await aiosmtplib.send(

View File

@@ -64,3 +64,9 @@ TIMEOUT_HEAVY = 1200 # Transcription, fan-out LLM tasks (Hatchet execution_time
TIMEOUT_HEAVY_HTTP = (
1150 # httpx timeout for transcribe_track — below 1200 so Hatchet doesn't race
)
TIMEOUT_EXTRA_HEAVY = (
3600 # Detect Topics, fan-out LLM tasks (Hatchet execution_timeout)
)
TIMEOUT_EXTRA_HEAVY_HTTP = (
3400 # httpx timeout for detect_topics — below 3600 so Hatchet doesn't race
)

View File

@@ -41,6 +41,7 @@ from reflector.hatchet.broadcast import (
from reflector.hatchet.client import HatchetClientManager
from reflector.hatchet.constants import (
TIMEOUT_AUDIO,
TIMEOUT_EXTRA_HEAVY,
TIMEOUT_HEAVY,
TIMEOUT_LONG,
TIMEOUT_MEDIUM,
@@ -84,7 +85,7 @@ from reflector.hatchet.workflows.topic_chunk_processing import (
from reflector.hatchet.workflows.track_processing import TrackInput, track_workflow
from reflector.logger import logger
from reflector.pipelines import topic_processing
from reflector.processors import AudioFileWriterProcessor
from reflector.processors.audio_mixdown_auto import AudioMixdownAutoProcessor
from reflector.processors.summary.models import ActionItemsResponse
from reflector.processors.summary.prompts import (
RECAP_PROMPT,
@@ -99,10 +100,6 @@ from reflector.utils.audio_constants import (
PRESIGNED_URL_EXPIRATION_SECONDS,
WAVEFORM_SEGMENTS,
)
from reflector.utils.audio_mixdown import (
detect_sample_rate_from_tracks,
mixdown_tracks_pyav,
)
from reflector.utils.audio_waveform import get_audio_waveform
from reflector.utils.daily import (
filter_cam_audio_tracks,
@@ -539,7 +536,7 @@ async def process_tracks(input: PipelineInput, ctx: Context) -> ProcessTracksRes
)
@with_error_handling(TaskName.MIXDOWN_TRACKS)
async def mixdown_tracks(input: PipelineInput, ctx: Context) -> MixdownResult:
"""Mix all padded tracks into single audio file using PyAV (same as Celery)."""
"""Mix all padded tracks into single audio file via configured backend."""
ctx.log("mixdown_tracks: mixing padded tracks into single audio file")
track_result = ctx.task_output(process_tracks)
@@ -579,37 +576,33 @@ async def mixdown_tracks(input: PipelineInput, ctx: Context) -> MixdownResult:
if not valid_urls:
raise ValueError("No valid padded tracks to mixdown")
target_sample_rate = detect_sample_rate_from_tracks(valid_urls, logger=logger)
if not target_sample_rate:
logger.error("Mixdown failed - no decodable audio frames found")
raise ValueError("No decodable audio frames in any track")
output_path = tempfile.mktemp(suffix=".mp3")
duration_ms_callback_capture_container = [0.0]
async def capture_duration(d):
duration_ms_callback_capture_container[0] = d
writer = AudioFileWriterProcessor(path=output_path, on_duration=capture_duration)
await mixdown_tracks_pyav(
valid_urls,
writer,
target_sample_rate,
offsets_seconds=None,
logger=logger,
progress_callback=make_audio_progress_logger(ctx, TaskName.MIXDOWN_TRACKS),
expected_duration_sec=recording_duration if recording_duration > 0 else None,
)
await writer.flush()
file_size = Path(output_path).stat().st_size
storage_path = f"{input.transcript_id}/audio.mp3"
with open(output_path, "rb") as mixed_file:
await storage.put_file(storage_path, mixed_file)
# Generate presigned PUT URL for the output (used by modal backend;
# pyav backend ignores it and writes locally instead)
output_url = await storage.get_file_url(
storage_path,
operation="put_object",
expires_in=PRESIGNED_URL_EXPIRATION_SECONDS,
)
Path(output_path).unlink(missing_ok=True)
processor = AudioMixdownAutoProcessor()
result = await processor.mixdown_tracks(
valid_urls, output_url, offsets_seconds=None
)
if result.output_path:
# Pyav backend wrote locally — upload to storage ourselves
output_file = Path(result.output_path)
with open(output_file, "rb") as mixed_file:
await storage.put_file(storage_path, mixed_file)
output_file.unlink(missing_ok=True)
# Clean up the temp directory the pyav processor created
try:
output_file.parent.rmdir()
except OSError:
pass
# else: modal backend already uploaded to output_url
async with fresh_db_connection():
from reflector.db.transcripts import transcripts_controller # noqa: PLC0415
@@ -620,11 +613,11 @@ async def mixdown_tracks(input: PipelineInput, ctx: Context) -> MixdownResult:
transcript, {"audio_location": "storage"}
)
ctx.log(f"mixdown_tracks complete: uploaded {file_size} bytes to {storage_path}")
ctx.log(f"mixdown_tracks complete: {result.size} bytes to {storage_path}")
return MixdownResult(
audio_key=storage_path,
duration=duration_ms_callback_capture_container[0],
duration=result.duration_ms,
tracks_mixed=len(valid_urls),
)
@@ -701,7 +694,7 @@ async def generate_waveform(input: PipelineInput, ctx: Context) -> WaveformResul
@daily_multitrack_pipeline.task(
parents=[process_tracks],
execution_timeout=timedelta(seconds=TIMEOUT_HEAVY),
execution_timeout=timedelta(seconds=TIMEOUT_EXTRA_HEAVY),
retries=3,
backoff_factor=2.0,
backoff_max_seconds=30,
@@ -1285,6 +1278,7 @@ async def cleanup_consent(input: PipelineInput, ctx: Context) -> ConsentResult:
return ConsentResult()
consent_denied = False
meeting = None
if transcript.meeting_id:
meeting = await meetings_controller.get_by_id(transcript.meeting_id)
if meeting:
@@ -1347,6 +1341,22 @@ async def cleanup_consent(input: PipelineInput, ctx: Context) -> ConsentResult:
logger.error(error_msg, exc_info=True)
deletion_errors.append(error_msg)
# Delete cloud video if present
if meeting and meeting.daily_composed_video_s3_key:
try:
source_storage = get_source_storage("daily")
await source_storage.delete_file(meeting.daily_composed_video_s3_key)
await meetings_controller.update_meeting(
meeting.id,
daily_composed_video_s3_key=None,
daily_composed_video_duration=None,
)
ctx.log(f"Deleted cloud video: {meeting.daily_composed_video_s3_key}")
except Exception as e:
error_msg = f"Failed to delete cloud video: {e}"
logger.error(error_msg, exc_info=True)
deletion_errors.append(error_msg)
if deletion_errors:
logger.warning(
"[Hatchet] cleanup_consent completed with errors",
@@ -1357,7 +1367,7 @@ async def cleanup_consent(input: PipelineInput, ctx: Context) -> ConsentResult:
ctx.log(f"cleanup_consent completed with {len(deletion_errors)} errors")
else:
await transcripts_controller.update(transcript, {"audio_deleted": True})
ctx.log("cleanup_consent: all audio deleted successfully")
ctx.log("cleanup_consent: all audio and video deleted successfully")
return ConsentResult()
@@ -1501,22 +1511,41 @@ async def send_email(input: PipelineInput, ctx: Context) -> EmailResult:
if recording and recording.meeting_id:
meeting = await meetings_controller.get_by_id(recording.meeting_id)
recipients = (
list(meeting.email_recipients)
# Normalise meeting recipients (legacy strings → dicts)
meeting_recipients: list[dict] = (
[
entry
if isinstance(entry, dict)
else {"email": entry, "include_link": True}
for entry in (meeting.email_recipients or [])
]
if meeting and meeting.email_recipients
else []
)
# Also check room-level email
# Room-level email always gets a link (room owner)
from reflector.db.rooms import rooms_controller # noqa: PLC0415
room_email = None
if transcript.room_id:
room = await rooms_controller.get_by_id(transcript.room_id)
if room and room.email_transcript_to:
if room.email_transcript_to not in recipients:
recipients.append(room.email_transcript_to)
room_email = room.email_transcript_to
if not recipients:
# Build two groups: with link and without link
with_link = [
r["email"] for r in meeting_recipients if r.get("include_link", True)
]
without_link = [
r["email"] for r in meeting_recipients if not r.get("include_link", True)
]
if room_email:
if room_email not in with_link:
with_link.append(room_email)
without_link = [e for e in without_link if e != room_email]
if not with_link and not without_link:
ctx.log("send_email skipped (no email recipients)")
return EmailResult(skipped=True)
@@ -1524,7 +1553,15 @@ async def send_email(input: PipelineInput, ctx: Context) -> EmailResult:
if meeting and meeting.email_recipients:
await transcripts_controller.update(transcript, {"share_mode": "public"})
count = await send_transcript_email(recipients, transcript)
count = 0
if with_link:
count += await send_transcript_email(
with_link, transcript, include_link=True
)
if without_link:
count += await send_transcript_email(
without_link, transcript, include_link=False
)
ctx.log(f"send_email complete: sent {count} emails")
return EmailResult(emails_sent=count)

View File

@@ -688,7 +688,10 @@ async def cleanup_consent(input: FilePipelineInput, ctx: Context) -> ConsentResu
)
from reflector.db.recordings import recordings_controller # noqa: PLC0415
from reflector.db.transcripts import transcripts_controller # noqa: PLC0415
from reflector.storage import get_transcripts_storage # noqa: PLC0415
from reflector.storage import ( # noqa: PLC0415
get_source_storage,
get_transcripts_storage,
)
transcript = await transcripts_controller.get_by_id(input.transcript_id)
if not transcript:
@@ -697,6 +700,7 @@ async def cleanup_consent(input: FilePipelineInput, ctx: Context) -> ConsentResu
consent_denied = False
recording = None
meeting = None
if transcript.recording_id:
recording = await recordings_controller.get_by_id(transcript.recording_id)
if recording and recording.meeting_id:
@@ -756,6 +760,22 @@ async def cleanup_consent(input: FilePipelineInput, ctx: Context) -> ConsentResu
logger.error(error_msg, exc_info=True)
deletion_errors.append(error_msg)
# Delete cloud video if present
if meeting and meeting.daily_composed_video_s3_key:
try:
source_storage = get_source_storage("daily")
await source_storage.delete_file(meeting.daily_composed_video_s3_key)
await meetings_controller.update_meeting(
meeting.id,
daily_composed_video_s3_key=None,
daily_composed_video_duration=None,
)
ctx.log(f"Deleted cloud video: {meeting.daily_composed_video_s3_key}")
except Exception as e:
error_msg = f"Failed to delete cloud video: {e}"
logger.error(error_msg, exc_info=True)
deletion_errors.append(error_msg)
if deletion_errors:
logger.warning(
"[Hatchet] cleanup_consent completed with errors",
@@ -764,7 +784,7 @@ async def cleanup_consent(input: FilePipelineInput, ctx: Context) -> ConsentResu
)
else:
await transcripts_controller.update(transcript, {"audio_deleted": True})
ctx.log("cleanup_consent: all audio deleted successfully")
ctx.log("cleanup_consent: all audio and video deleted successfully")
return ConsentResult()
@@ -896,22 +916,41 @@ async def send_email(input: FilePipelineInput, ctx: Context) -> EmailResult:
if recording and recording.meeting_id:
meeting = await meetings_controller.get_by_id(recording.meeting_id)
recipients = (
list(meeting.email_recipients)
# Normalise meeting recipients (legacy strings → dicts)
meeting_recipients: list[dict] = (
[
entry
if isinstance(entry, dict)
else {"email": entry, "include_link": True}
for entry in (meeting.email_recipients or [])
]
if meeting and meeting.email_recipients
else []
)
# Also check room-level email
# Room-level email always gets a link (room owner)
from reflector.db.rooms import rooms_controller # noqa: PLC0415
room_email = None
if transcript.room_id:
room = await rooms_controller.get_by_id(transcript.room_id)
if room and room.email_transcript_to:
if room.email_transcript_to not in recipients:
recipients.append(room.email_transcript_to)
room_email = room.email_transcript_to
if not recipients:
# Build two groups: with link and without link
with_link = [
r["email"] for r in meeting_recipients if r.get("include_link", True)
]
without_link = [
r["email"] for r in meeting_recipients if not r.get("include_link", True)
]
if room_email:
if room_email not in with_link:
with_link.append(room_email)
without_link = [e for e in without_link if e != room_email]
if not with_link and not without_link:
ctx.log("send_email skipped (no email recipients)")
return EmailResult(skipped=True)
@@ -919,7 +958,15 @@ async def send_email(input: FilePipelineInput, ctx: Context) -> EmailResult:
if meeting and meeting.email_recipients:
await transcripts_controller.update(transcript, {"share_mode": "public"})
count = await send_transcript_email(recipients, transcript)
count = 0
if with_link:
count += await send_transcript_email(
with_link, transcript, include_link=True
)
if without_link:
count += await send_transcript_email(
without_link, transcript, include_link=False
)
ctx.log(f"send_email complete: sent {count} emails")
return EmailResult(emails_sent=count)

View File

@@ -397,22 +397,41 @@ async def send_email(input: LivePostPipelineInput, ctx: Context) -> EmailResult:
if recording and recording.meeting_id:
meeting = await meetings_controller.get_by_id(recording.meeting_id)
recipients = (
list(meeting.email_recipients)
# Normalise meeting recipients (legacy strings → dicts)
meeting_recipients: list[dict] = (
[
entry
if isinstance(entry, dict)
else {"email": entry, "include_link": True}
for entry in (meeting.email_recipients or [])
]
if meeting and meeting.email_recipients
else []
)
# Also check room-level email
# Room-level email always gets a link (room owner)
from reflector.db.rooms import rooms_controller # noqa: PLC0415
room_email = None
if transcript.room_id:
room = await rooms_controller.get_by_id(transcript.room_id)
if room and room.email_transcript_to:
if room.email_transcript_to not in recipients:
recipients.append(room.email_transcript_to)
room_email = room.email_transcript_to
if not recipients:
# Build two groups: with link and without link
with_link = [
r["email"] for r in meeting_recipients if r.get("include_link", True)
]
without_link = [
r["email"] for r in meeting_recipients if not r.get("include_link", True)
]
if room_email:
if room_email not in with_link:
with_link.append(room_email)
without_link = [e for e in without_link if e != room_email]
if not with_link and not without_link:
ctx.log("send_email skipped (no email recipients)")
return EmailResult(skipped=True)
@@ -420,7 +439,15 @@ async def send_email(input: LivePostPipelineInput, ctx: Context) -> EmailResult:
if meeting and meeting.email_recipients:
await transcripts_controller.update(transcript, {"share_mode": "public"})
count = await send_transcript_email(recipients, transcript)
count = 0
if with_link:
count += await send_transcript_email(
with_link, transcript, include_link=True
)
if without_link:
count += await send_transcript_email(
without_link, transcript, include_link=False
)
ctx.log(f"send_email complete: sent {count} emails")
return EmailResult(emails_sent=count)

View File

@@ -61,7 +61,7 @@ from reflector.processors.types import (
)
from reflector.processors.types import Transcript as TranscriptProcessorType
from reflector.settings import settings
from reflector.storage import get_transcripts_storage
from reflector.storage import get_source_storage, get_transcripts_storage
from reflector.views.transcripts import GetTranscriptTopic
from reflector.ws_events import TranscriptEventName
from reflector.ws_manager import WebsocketManager, get_ws_manager
@@ -671,6 +671,22 @@ async def cleanup_consent(transcript: Transcript, logger: Logger):
logger.error(error_msg, exc_info=e)
deletion_errors.append(error_msg)
# Delete cloud video if present
if meeting and meeting.daily_composed_video_s3_key:
try:
source_storage = get_source_storage("daily")
await source_storage.delete_file(meeting.daily_composed_video_s3_key)
await meetings_controller.update_meeting(
meeting.id,
daily_composed_video_s3_key=None,
daily_composed_video_duration=None,
)
logger.info(f"Deleted cloud video: {meeting.daily_composed_video_s3_key}")
except Exception as e:
error_msg = f"Failed to delete cloud video: {e}"
logger.error(error_msg, exc_info=e)
deletion_errors.append(error_msg)
if deletion_errors:
logger.warning(
f"Consent cleanup completed with {len(deletion_errors)} errors",
@@ -678,7 +694,7 @@ async def cleanup_consent(transcript: Transcript, logger: Logger):
)
else:
await transcripts_controller.update(transcript, {"audio_deleted": True})
logger.info("Consent cleanup done - all audio deleted")
logger.info("Consent cleanup done - all audio and video deleted")
@get_transcript

View File

@@ -4,6 +4,8 @@ from .audio_diarization_auto import AudioDiarizationAutoProcessor # noqa: F401
from .audio_downscale import AudioDownscaleProcessor # noqa: F401
from .audio_file_writer import AudioFileWriterProcessor # noqa: F401
from .audio_merge import AudioMergeProcessor # noqa: F401
from .audio_mixdown import AudioMixdownProcessor # noqa: F401
from .audio_mixdown_auto import AudioMixdownAutoProcessor # noqa: F401
from .audio_padding import AudioPaddingProcessor # noqa: F401
from .audio_padding_auto import AudioPaddingAutoProcessor # noqa: F401
from .audio_transcript import AudioTranscriptProcessor # noqa: F401

View File

@@ -0,0 +1,27 @@
"""
Base class for audio mixdown processors.
"""
from pydantic import BaseModel
class MixdownResponse(BaseModel):
size: int
duration_ms: float = 0.0
cancelled: bool = False
output_path: str | None = (
None # Local file path (pyav sets this; modal leaves None)
)
class AudioMixdownProcessor:
"""Base class for audio mixdown processors."""
async def mixdown_tracks(
self,
track_urls: list[str],
output_url: str,
target_sample_rate: int | None = None,
offsets_seconds: list[float] | None = None,
) -> MixdownResponse:
raise NotImplementedError
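The output_path field encodes the split in responsibility between backends; a hedged sketch of the caller's side (upload_to_storage stands in for whatever storage call the pipeline makes, not a real helper):

resp = await processor.mixdown_tracks(track_urls, output_url)
if resp.output_path:
    # In-process backend wrote a local temp file: caller uploads and cleans up.
    upload_to_storage(resp.output_path)
else:
    # Remote backend already uploaded through the presigned output_url.
    pass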

View File

@@ -0,0 +1,32 @@
import importlib
from reflector.processors.audio_mixdown import AudioMixdownProcessor
from reflector.settings import settings
class AudioMixdownAutoProcessor(AudioMixdownProcessor):
_registry = {}
@classmethod
def register(cls, name, kclass):
cls._registry[name] = kclass
def __new__(cls, name: str | None = None, **kwargs):
if name is None:
name = settings.MIXDOWN_BACKEND
if name not in cls._registry:
module_name = f"reflector.processors.audio_mixdown_{name}"
importlib.import_module(module_name)
# gather specific configuration for the processor
# search `MIXDOWN_XXX_YYY`, push to constructor as `xxx_yyy`
config = {}
name_upper = name.upper()
settings_prefix = "MIXDOWN_"
config_prefix = f"{settings_prefix}{name_upper}_"
for key, value in settings:
if key.startswith(config_prefix):
config_name = key[len(settings_prefix) :].lower()
config[config_name] = value
return cls._registry[name](**config | kwargs)
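A hedged usage sketch of the registry (the explicit arguments are illustrative):

from reflector.processors.audio_mixdown_auto import AudioMixdownAutoProcessor

# Backend chosen by settings.MIXDOWN_BACKEND; matching settings are forwarded
# as constructor kwargs (e.g. MIXDOWN_MODAL_API_KEY -> modal_api_key).
processor = AudioMixdownAutoProcessor()

# Or pick the backend explicitly; keyword arguments win over settings-derived ones.
processor = AudioMixdownAutoProcessor(
    name="modal", mixdown_url="https://gpu.example.com"
)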

View File

@@ -0,0 +1,110 @@
"""
Modal.com backend for audio mixdown.
"""
import asyncio
import os
import httpx
from reflector.hatchet.constants import TIMEOUT_HEAVY_HTTP
from reflector.logger import logger
from reflector.processors.audio_mixdown import AudioMixdownProcessor, MixdownResponse
from reflector.processors.audio_mixdown_auto import AudioMixdownAutoProcessor
class AudioMixdownModalProcessor(AudioMixdownProcessor):
"""Audio mixdown processor using Modal.com/self-hosted backend via HTTP."""
def __init__(
self, mixdown_url: str | None = None, modal_api_key: str | None = None
):
self.mixdown_url = mixdown_url or os.getenv("MIXDOWN_URL")
if not self.mixdown_url:
raise ValueError(
"MIXDOWN_URL required to use AudioMixdownModalProcessor. "
"Set MIXDOWN_URL environment variable or pass mixdown_url parameter."
)
self.modal_api_key = modal_api_key or os.getenv("MODAL_API_KEY")
async def mixdown_tracks(
self,
track_urls: list[str],
output_url: str,
target_sample_rate: int | None = None,
offsets_seconds: list[float] | None = None,
) -> MixdownResponse:
"""Mix audio tracks via remote Modal/self-hosted backend.
Args:
track_urls: Presigned GET URLs for source audio tracks
output_url: Presigned PUT URL for output MP3
target_sample_rate: Sample rate for output (Hz), auto-detected if None
offsets_seconds: Optional per-track delays in seconds for alignment
"""
valid_count = len([u for u in track_urls if u])
log = logger.bind(track_count=valid_count)
log.info("Sending Modal mixdown HTTP request")
url = f"{self.mixdown_url}/mixdown"
headers = {}
if self.modal_api_key:
headers["Authorization"] = f"Bearer {self.modal_api_key}"
# Scale timeout with track count: base TIMEOUT_HEAVY_HTTP + 60s per track beyond 2
extra_timeout = max(0, (valid_count - 2)) * 60
timeout = TIMEOUT_HEAVY_HTTP + extra_timeout
try:
async with httpx.AsyncClient(timeout=timeout) as client:
response = await client.post(
url,
headers=headers,
json={
"track_urls": track_urls,
"output_url": output_url,
"target_sample_rate": target_sample_rate,
"offsets_seconds": offsets_seconds,
},
follow_redirects=True,
)
if response.status_code != 200:
error_body = response.text
log.error(
"Modal mixdown API error",
status_code=response.status_code,
error_body=error_body,
)
response.raise_for_status()
result = response.json()
# Check if work was cancelled
if result.get("cancelled"):
log.warning("Modal mixdown was cancelled by disconnect detection")
raise asyncio.CancelledError(
"Mixdown cancelled due to client disconnect"
)
log.info("Modal mixdown complete", size=result["size"])
return MixdownResponse(**result)
except asyncio.CancelledError:
log.warning(
"Modal mixdown cancelled (Hatchet timeout, disconnect detected on Modal side)"
)
raise
except httpx.TimeoutException as e:
log.error("Modal mixdown timeout", error=str(e), exc_info=True)
raise Exception(f"Modal mixdown timeout: {e}") from e
except httpx.HTTPStatusError as e:
log.error("Modal mixdown HTTP error", error=str(e), exc_info=True)
raise Exception(f"Modal mixdown HTTP error: {e}") from e
except Exception as e:
log.error("Modal mixdown unexpected error", error=str(e), exc_info=True)
raise
AudioMixdownAutoProcessor.register("modal", AudioMixdownModalProcessor)

View File

@@ -0,0 +1,101 @@
"""
PyAV audio mixdown processor.
Mixes N tracks in-process using the existing utility from reflector.utils.audio_mixdown.
Writes to a local temp file (does NOT upload to S3 — the pipeline handles upload).
"""
import os
import tempfile
from reflector.logger import logger
from reflector.processors.audio_file_writer import AudioFileWriterProcessor
from reflector.processors.audio_mixdown import AudioMixdownProcessor, MixdownResponse
from reflector.processors.audio_mixdown_auto import AudioMixdownAutoProcessor
from reflector.utils.audio_mixdown import (
detect_sample_rate_from_tracks,
mixdown_tracks_pyav,
)
class AudioMixdownPyavProcessor(AudioMixdownProcessor):
"""Audio mixdown processor using PyAV (no HTTP backend).
Writes the mixed output to a local temp file and returns its path
in MixdownResponse.output_path. The caller is responsible for
uploading the file and cleaning it up.
"""
async def mixdown_tracks(
self,
track_urls: list[str],
output_url: str,
target_sample_rate: int | None = None,
offsets_seconds: list[float] | None = None,
) -> MixdownResponse:
log = logger.bind(track_count=len(track_urls))
log.info("Starting local PyAV mixdown")
valid_urls = [url for url in track_urls if url]
if not valid_urls:
raise ValueError("No valid track URLs provided")
# Auto-detect sample rate if not provided
if target_sample_rate is None:
target_sample_rate = detect_sample_rate_from_tracks(
valid_urls, logger=logger
)
if not target_sample_rate:
raise ValueError("No decodable audio frames in any track")
# Write to temp MP3 file
temp_dir = tempfile.mkdtemp()
output_path = os.path.join(temp_dir, "mixed.mp3")
duration_ms_container = [0.0]
async def capture_duration(d):
duration_ms_container[0] = d
writer = AudioFileWriterProcessor(
path=output_path, on_duration=capture_duration
)
try:
await mixdown_tracks_pyav(
valid_urls,
writer,
target_sample_rate,
offsets_seconds=offsets_seconds,
logger=logger,
)
await writer.flush()
file_size = os.path.getsize(output_path)
log.info(
"Local mixdown complete",
size=file_size,
duration_ms=duration_ms_container[0],
)
return MixdownResponse(
size=file_size,
duration_ms=duration_ms_container[0],
output_path=output_path,
)
except Exception as e:
# Cleanup on failure
if os.path.exists(output_path):
try:
os.unlink(output_path)
except Exception:
pass
try:
os.rmdir(temp_dir)
except Exception:
pass
log.error("Local mixdown failed", error=str(e), exc_info=True)
raise
AudioMixdownAutoProcessor.register("pyav", AudioMixdownPyavProcessor)

View File

@@ -127,6 +127,14 @@ class Settings(BaseSettings):
PADDING_URL: str | None = None
PADDING_MODAL_API_KEY: str | None = None
# Audio Mixdown
# backends:
# - pyav: in-process PyAV mixdown (no HTTP, runs in same process)
# - modal: HTTP API client (works with Modal.com OR self-hosted gpu/self_hosted/)
MIXDOWN_BACKEND: str = "pyav"
MIXDOWN_URL: str | None = None
MIXDOWN_MODAL_API_KEY: str | None = None
# Sentry
SENTRY_DSN: str | None = None

View File

@@ -168,8 +168,9 @@ async def add_email_recipient(
if not meeting:
raise HTTPException(status_code=404, detail="Meeting not found")
include_link = user is not None
recipients = await meetings_controller.add_email_recipient(
meeting_id, request.email
meeting_id, request.email, include_link=include_link
)
return {"status": "success", "email_recipients": recipients}

View File

@@ -309,6 +309,7 @@ async def transcripts_search(
source_kind: Optional[SourceKind] = None,
from_datetime: SearchFromDatetimeParam = None,
to_datetime: SearchToDatetimeParam = None,
include_deleted: bool = False,
user: Annotated[
Optional[auth.UserInfo], Depends(auth.current_user_optional_if_public_mode)
] = None,
@@ -316,6 +317,12 @@ async def transcripts_search(
"""Full-text search across transcript titles and content."""
user_id = user["sub"] if user else None
if include_deleted and not user_id:
raise HTTPException(
status_code=401,
detail="Authentication required to view deleted transcripts",
)
if from_datetime and to_datetime and from_datetime > to_datetime:
raise HTTPException(
status_code=400, detail="'from' must be less than or equal to 'to'"
@@ -330,6 +337,7 @@ async def transcripts_search(
source_kind=source_kind,
from_datetime=from_datetime,
to_datetime=to_datetime,
include_deleted=include_deleted,
)
results, total = await search_controller.search_transcripts(search_params)
@@ -615,6 +623,54 @@ async def transcript_delete(
return DeletionStatus(status="ok")
@router.post("/transcripts/{transcript_id}/restore", response_model=DeletionStatus)
async def transcript_restore(
transcript_id: str,
user: Annotated[auth.UserInfo, Depends(auth.current_user)],
):
"""Restore a soft-deleted transcript."""
user_id = user["sub"]
transcript = await transcripts_controller.get_by_id(transcript_id)
if not transcript:
raise HTTPException(status_code=404, detail="Transcript not found")
if transcript.deleted_at is None:
raise HTTPException(status_code=400, detail="Transcript is not deleted")
if not transcripts_controller.user_can_mutate(transcript, user_id):
raise HTTPException(status_code=403, detail="Not authorized")
await transcripts_controller.restore_by_id(transcript.id, user_id=user_id)
await get_ws_manager().send_json(
room_id=f"user:{user_id}",
message={"event": "TRANSCRIPT_RESTORED", "data": {"id": transcript.id}},
)
return DeletionStatus(status="ok")
@router.delete("/transcripts/{transcript_id}/destroy", response_model=DeletionStatus)
async def transcript_destroy(
transcript_id: str,
user: Annotated[auth.UserInfo, Depends(auth.current_user)],
):
"""Permanently delete a transcript and all associated files."""
user_id = user["sub"]
transcript = await transcripts_controller.get_by_id(transcript_id)
if not transcript:
raise HTTPException(status_code=404, detail="Transcript not found")
if transcript.deleted_at is None:
raise HTTPException(
status_code=400, detail="Transcript must be soft-deleted first"
)
if not transcripts_controller.user_can_mutate(transcript, user_id):
raise HTTPException(status_code=403, detail="Not authorized")
await transcripts_controller.hard_delete(transcript.id)
await get_ws_manager().send_json(
room_id=f"user:{user_id}",
message={"event": "TRANSCRIPT_DELETED", "data": {"id": transcript.id}},
)
return DeletionStatus(status="ok")
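A hedged client-side sketch of the trash lifecycle these endpoints expose (base URL, token, transcript id, and the search route are placeholders inferred from the handlers above):

import httpx

client = httpx.Client(
    base_url="http://localhost:1250/v1",          # placeholder
    headers={"Authorization": "Bearer <token>"},  # placeholder
)
tid = "abc123"  # placeholder transcript id

# List the caller's own trash; include_deleted without auth returns 401.
client.get("/transcripts/search", params={"include_deleted": "true"})

# Either bring a soft-deleted transcript back...
client.post(f"/transcripts/{tid}/restore")

# ...or remove it permanently (400 unless it was soft-deleted first).
client.delete(f"/transcripts/{tid}/destroy")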
@router.get(
"/transcripts/{transcript_id}/topics",
response_model=list[GetTranscriptTopic],
@@ -700,8 +756,6 @@ async def transcript_post_to_zulip(
)
if not transcript:
raise HTTPException(status_code=404, detail="Transcript not found")
if not transcripts_controller.user_can_mutate(transcript, user_id):
raise HTTPException(status_code=403, detail="Not authorized")
content = get_zulip_message(transcript, include_topics)
message_updated = False
@@ -733,17 +787,17 @@ class SendEmailResponse(BaseModel):
async def transcript_send_email(
transcript_id: str,
request: SendEmailRequest,
user: Annotated[auth.UserInfo, Depends(auth.current_user)],
user: Annotated[Optional[auth.UserInfo], Depends(auth.current_user_optional)],
):
if not is_email_configured():
raise HTTPException(status_code=400, detail="Email not configured")
user_id = user["sub"]
user_id = user["sub"] if user else None
transcript = await transcripts_controller.get_by_id_for_http(
transcript_id, user_id=user_id
)
if not transcript:
raise HTTPException(status_code=404, detail="Transcript not found")
if not transcripts_controller.user_can_mutate(transcript, user_id):
raise HTTPException(status_code=403, detail="Not authorized")
sent = await send_transcript_email([request.email], transcript)
sent = await send_transcript_email(
[request.email], transcript, include_link=(transcript.share_mode == "public")
)
return SendEmailResponse(sent=sent)

View File

@@ -146,7 +146,6 @@ else:
app.conf.broker_connection_retry_on_startup = True
app.autodiscover_tasks(
[
"reflector.pipelines.main_live_pipeline",
"reflector.worker.healthcheck",
"reflector.worker.process",
"reflector.worker.cleanup",

View File

@@ -12,6 +12,7 @@ from celery import shared_task
from celery.utils.log import get_task_logger
from pydantic import ValidationError
from reflector.asynctask import asynctask
from reflector.dailyco_api import FinishedRecordingResponse, RecordingResponse
from reflector.db.daily_participant_sessions import (
DailyParticipantSession,
@@ -25,9 +26,6 @@ from reflector.db.transcripts import (
transcripts_controller,
)
from reflector.hatchet.client import HatchetClientManager
from reflector.pipelines.main_live_pipeline import asynctask
from reflector.pipelines.topic_processing import EmptyPipeline
from reflector.processors import AudioFileWriterProcessor
from reflector.processors.audio_waveform_processor import AudioWaveformProcessor
from reflector.redis_cache import RedisAsyncLock
from reflector.settings import settings
@@ -908,6 +906,11 @@ async def convert_audio_and_waveform(transcript) -> None:
transcript_id=transcript.id,
)
from reflector.pipelines.topic_processing import EmptyPipeline # noqa: PLC0415
from reflector.processors.audio_file_writer import (
AudioFileWriterProcessor, # noqa: PLC0415
)
upload_path = transcript.data_path / "upload.webm"
mp3_path = transcript.audio_mp3_filename

View File

@@ -8,8 +8,8 @@ import structlog
from celery import shared_task
from celery.utils.log import get_task_logger
from reflector.asynctask import asynctask
from reflector.db.rooms import rooms_controller
from reflector.pipelines.main_live_pipeline import asynctask
from reflector.utils.webhook import (
WebhookRoomPayload,
WebhookTestPayload,

View File

@@ -113,6 +113,7 @@ TranscriptWsEvent = Annotated[
UserEventName = Literal[
"TRANSCRIPT_CREATED",
"TRANSCRIPT_DELETED",
"TRANSCRIPT_RESTORED",
"TRANSCRIPT_STATUS",
"TRANSCRIPT_FINAL_TITLE",
"TRANSCRIPT_DURATION",
@@ -161,6 +162,15 @@ class UserWsTranscriptDeleted(BaseModel):
data: UserTranscriptDeletedData
class UserTranscriptRestoredData(BaseModel):
id: NonEmptyString
class UserWsTranscriptRestored(BaseModel):
event: Literal["TRANSCRIPT_RESTORED"] = "TRANSCRIPT_RESTORED"
data: UserTranscriptRestoredData
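The restored-event model slots into the UserWsEvent discriminated union below. A hedged sketch of how such a union is typically consumed with pydantic v2, assuming the elided Annotated metadata uses Field(discriminator="event") as is conventional:

from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field, TypeAdapter

class Deleted(BaseModel):
    event: Literal["TRANSCRIPT_DELETED"] = "TRANSCRIPT_DELETED"
    data: dict

class Restored(BaseModel):
    event: Literal["TRANSCRIPT_RESTORED"] = "TRANSCRIPT_RESTORED"
    data: dict

# The "event" field picks the concrete model at validation time.
Event = Annotated[Union[Deleted, Restored], Field(discriminator="event")]

msg = TypeAdapter(Event).validate_python(
    {"event": "TRANSCRIPT_RESTORED", "data": {"id": "tx-123"}}
)
assert isinstance(msg, Restored)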
class UserWsTranscriptStatus(BaseModel):
event: Literal["TRANSCRIPT_STATUS"] = "TRANSCRIPT_STATUS"
data: UserTranscriptStatusData
@@ -180,6 +190,7 @@ UserWsEvent = Annotated[
Union[
UserWsTranscriptCreated,
UserWsTranscriptDeleted,
UserWsTranscriptRestored,
UserWsTranscriptStatus,
UserWsTranscriptFinalTitle,
UserWsTranscriptDuration,

View File

@@ -107,7 +107,8 @@ class WebsocketManager:
while True:
# timeout=1.0 prevents tight CPU loop when no messages available
message = await pubsub_subscriber.get_message(
ignore_subscribe_messages=True
ignore_subscribe_messages=True,
timeout=1.0,
)
if message is not None:
room_id = message["channel"].decode("utf-8")
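For context, the fixed pattern in isolation: a runnable sketch using redis-py's asyncio PubSub, whose get_message accepts both keyword arguments shown in the diff:

import asyncio

import redis.asyncio as redis

async def listen(channel: str) -> None:
    client = redis.Redis()
    pubsub = client.pubsub()
    await pubsub.subscribe(channel)
    while True:
        # Without timeout=1.0, get_message returns immediately when the
        # channel is idle and the loop spins at 100% CPU; with it, the
        # call waits up to a second per iteration.
        message = await pubsub.get_message(
            ignore_subscribe_messages=True,
            timeout=1.0,
        )
        if message is not None:
            print(message["channel"].decode("utf-8"), message["data"])

asyncio.run(listen("user:demo"))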

View File

@@ -162,15 +162,9 @@ async def test_multitrack_pipeline_end_to_end(
), f"Expected at least 2 speakers for multitrack, got {len(participants)}"
# 7. Verify email transcript notification
# The send_email pipeline task should have:
# a) Set the transcript to public share_mode
# b) Sent an email to TEST_EMAIL via Mailpit
transcript_resp = await api_client.get(f"/transcripts/{transcript_id}")
transcript_resp.raise_for_status()
transcript_data = transcript_resp.json()
assert (
transcript_data.get("share_mode") == "public"
), "Transcript should be set to public when email recipients exist"
# The send_email pipeline task should have sent an email to TEST_EMAIL via Mailpit.
# Note: share_mode is only set to "public" when meeting has email_recipients;
# room-level emails do NOT change share_mode.
# Poll Mailpit for the delivered email (send_email task runs async after finalize)
messages = await poll_mailpit_messages(mailpit_client, TEST_EMAIL, max_wait=30)

server/tests/test_email.py (new file, 206 lines)
View File

@@ -0,0 +1,206 @@
"""Tests for reflector.email — transcript email composition and sending."""
from unittest.mock import AsyncMock, patch
import pytest
from reflector.db.transcripts import (
SourceKind,
Transcript,
TranscriptParticipant,
TranscriptTopic,
)
from reflector.email import (
_build_html,
_build_plain_text,
get_transcript_url,
send_transcript_email,
)
from reflector.processors.types import Word
def _make_transcript(
*,
title: str | None = "Weekly Standup",
short_summary: str | None = "Team discussed sprint progress.",
with_topics: bool = True,
share_mode: str = "private",
source_kind: SourceKind = SourceKind.FILE,
) -> Transcript:
topics = []
participants = []
if with_topics:
participants = [
TranscriptParticipant(id="p1", speaker=0, name="Alice"),
TranscriptParticipant(id="p2", speaker=1, name="Bob"),
]
topics = [
TranscriptTopic(
title="Intro",
summary="Greetings",
timestamp=0.0,
duration=10.0,
words=[
Word(text="Hello", start=0.0, end=0.5, speaker=0),
Word(text="everyone", start=0.5, end=1.0, speaker=0),
Word(text="Thanks", start=5.0, end=5.5, speaker=1),
Word(text="for", start=5.5, end=5.8, speaker=1),
Word(text="joining", start=5.8, end=6.2, speaker=1),
],
),
]
return Transcript(
id="tx-123",
title=title,
short_summary=short_summary,
topics=topics,
participants=participants,
share_mode=share_mode,
source_kind=source_kind,
)
URL = "http://localhost:3000/transcripts/tx-123"
class TestBuildPlainText:
def test_full_content_with_link(self):
t = _make_transcript()
text = _build_plain_text(t, URL, include_link=True)
assert text.startswith("Reflector: Weekly Standup")
assert "Team discussed sprint progress." in text
assert "[00:00] Alice:" in text
assert "[00:05] Bob:" in text
assert URL in text
def test_full_content_without_link(self):
t = _make_transcript()
text = _build_plain_text(t, URL, include_link=False)
assert "Reflector: Weekly Standup" in text
assert "Team discussed sprint progress." in text
assert "[00:00] Alice:" in text
assert URL not in text
def test_no_summary(self):
t = _make_transcript(short_summary=None)
text = _build_plain_text(t, URL, include_link=True)
assert "Summary:" not in text
assert "[00:00] Alice:" in text
def test_no_topics(self):
t = _make_transcript(with_topics=False)
text = _build_plain_text(t, URL, include_link=True)
assert "Transcript:" not in text
assert "Reflector: Weekly Standup" in text
def test_unnamed_recording(self):
t = _make_transcript(title=None)
text = _build_plain_text(t, URL, include_link=True)
assert "Reflector: Unnamed recording" in text
class TestBuildHtml:
def test_full_content_with_link(self):
t = _make_transcript()
html = _build_html(t, URL, include_link=True)
assert "Weekly Standup" in html
assert "Team discussed sprint progress." in html
assert "Alice" in html
assert "Bob" in html
assert URL in html
assert "View Transcript" in html
def test_full_content_without_link(self):
t = _make_transcript()
html = _build_html(t, URL, include_link=False)
assert "Weekly Standup" in html
assert "Alice" in html
assert URL not in html
assert "View Transcript" not in html
def test_no_summary(self):
t = _make_transcript(short_summary=None)
html = _build_html(t, URL, include_link=True)
assert "sprint progress" not in html
assert "Alice" in html
def test_no_topics(self):
t = _make_transcript(with_topics=False)
html = _build_html(t, URL, include_link=True)
# Loose check: the "View Transcript" link label itself contains the
# word "Transcript", so it is tolerated when the topics section is absent.
assert "Transcript" not in html or "View Transcript" in html
def test_html_escapes_title(self):
t = _make_transcript(title='<script>alert("xss")</script>')
html = _build_html(t, URL, include_link=True)
assert "<script>" not in html
assert "&lt;script&gt;" in html
class TestGetTranscriptUrl:
def test_url_format(self):
t = _make_transcript()
url = get_transcript_url(t)
assert url.endswith("/transcripts/tx-123")
class TestSendTranscriptEmail:
@pytest.mark.asyncio
async def test_include_link_default_true(self):
t = _make_transcript()
with (
patch("reflector.email.is_email_configured", return_value=True),
patch(
"reflector.email.aiosmtplib.send", new_callable=AsyncMock
) as mock_send,
):
count = await send_transcript_email(["a@test.com"], t)
assert count == 1
call_args = mock_send.call_args
msg = call_args[0][0]
assert msg["Subject"] == "Reflector: Weekly Standup"
# Default include_link=True, so HTML part should contain the URL
html_part = msg.get_payload()[1].get_payload()
assert "/transcripts/tx-123" in html_part
@pytest.mark.asyncio
async def test_include_link_false(self):
t = _make_transcript()
with (
patch("reflector.email.is_email_configured", return_value=True),
patch(
"reflector.email.aiosmtplib.send", new_callable=AsyncMock
) as mock_send,
):
count = await send_transcript_email(["a@test.com"], t, include_link=False)
assert count == 1
msg = mock_send.call_args[0][0]
html_part = msg.get_payload()[1].get_payload()
assert "/transcripts/tx-123" not in html_part
plain_part = msg.get_payload()[0].get_payload()
assert "/transcripts/tx-123" not in plain_part
@pytest.mark.asyncio
async def test_skips_when_not_configured(self):
t = _make_transcript()
with patch("reflector.email.is_email_configured", return_value=False):
count = await send_transcript_email(["a@test.com"], t)
assert count == 0
@pytest.mark.asyncio
async def test_skips_empty_recipients(self):
t = _make_transcript()
with patch("reflector.email.is_email_configured", return_value=True):
count = await send_transcript_email([], t)
assert count == 0
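The payload-index assertions above imply a multipart/alternative layout. A hedged reconstruction of the send path, assuming send_transcript_email attaches the plain part first and hands the message to aiosmtplib (hostname and port are placeholders):

from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

import aiosmtplib

async def send_plain_and_html(subject: str, sender: str, to: str,
                              plain: str, html: str) -> None:
    msg = MIMEMultipart("alternative")
    msg["Subject"] = subject
    msg["From"] = sender
    msg["To"] = to
    # Attach order fixes the payload indices the tests rely on:
    # get_payload()[0] is the plain part, get_payload()[1] the HTML part.
    msg.attach(MIMEText(plain, "plain"))
    msg.attach(MIMEText(html, "html"))
    await aiosmtplib.send(msg, hostname="localhost", port=1025)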

View File

@@ -1,5 +1,9 @@
from datetime import datetime, timedelta, timezone
from unittest.mock import AsyncMock, patch
import pytest
from reflector.db.meetings import meetings_controller
from reflector.db.recordings import Recording, recordings_controller
from reflector.db.rooms import rooms_controller
from reflector.db.transcripts import SourceKind, transcripts_controller
@@ -390,3 +394,463 @@ async def test_transcripts_list_filtered_by_room_id(authenticated_client, client
ids = [t["id"] for t in items]
assert in_room.id in ids
assert other.id not in ids
# ---------------------------------------------------------------------------
# Restore tests
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_transcript_restore(authenticated_client, client):
"""Soft-delete then restore, verify accessible again."""
response = await client.post("/transcripts", json={"name": "restore-me"})
assert response.status_code == 200
tid = response.json()["id"]
# Soft-delete
response = await client.delete(f"/transcripts/{tid}")
assert response.status_code == 200
# 404 while deleted
response = await client.get(f"/transcripts/{tid}")
assert response.status_code == 404
# Restore
response = await client.post(f"/transcripts/{tid}/restore")
assert response.status_code == 200
assert response.json()["status"] == "ok"
# Accessible again
response = await client.get(f"/transcripts/{tid}")
assert response.status_code == 200
assert response.json()["name"] == "restore-me"
# deleted_at is cleared
transcript = await transcripts_controller.get_by_id(tid)
assert transcript.deleted_at is None
@pytest.mark.asyncio
async def test_transcript_restore_recording_also_restored(authenticated_client, client):
"""Restoring a transcript also restores its recording."""
recording = await recordings_controller.create(
Recording(
bucket_name="test-bucket",
object_key="restore-test.mp4",
recorded_at=datetime.now(timezone.utc),
)
)
transcript = await transcripts_controller.add(
name="restore-with-recording",
source_kind=SourceKind.ROOM,
recording_id=recording.id,
user_id="randomuserid",
)
# Soft-delete
response = await client.delete(f"/transcripts/{transcript.id}")
assert response.status_code == 200
# Both should be soft-deleted
rec = await recordings_controller.get_by_id(recording.id)
assert rec.deleted_at is not None
# Restore
response = await client.post(f"/transcripts/{transcript.id}/restore")
assert response.status_code == 200
# Recording also restored
rec = await recordings_controller.get_by_id(recording.id)
assert rec.deleted_at is None
tr = await transcripts_controller.get_by_id(transcript.id)
assert tr.deleted_at is None
@pytest.mark.asyncio
async def test_transcript_restore_not_deleted(authenticated_client, client):
"""Restoring a non-deleted transcript returns 400."""
response = await client.post("/transcripts", json={"name": "not-deleted"})
assert response.status_code == 200
tid = response.json()["id"]
response = await client.post(f"/transcripts/{tid}/restore")
assert response.status_code == 400
@pytest.mark.asyncio
async def test_transcript_restore_not_found(authenticated_client, client):
"""Restoring a nonexistent transcript returns 404."""
response = await client.post("/transcripts/nonexistent-id/restore")
assert response.status_code == 404
@pytest.mark.asyncio
async def test_transcript_restore_forbidden(authenticated_client, client):
"""Cannot restore another user's deleted transcript."""
# Create transcript owned by a different user
transcript = await transcripts_controller.add(
name="other-user-restore",
source_kind=SourceKind.FILE,
user_id="some-other-user",
)
# Soft-delete directly in DB
await transcripts_controller.remove_by_id(transcript.id, user_id="some-other-user")
# Try to restore as randomuserid (authenticated_client)
response = await client.post(f"/transcripts/{transcript.id}/restore")
assert response.status_code == 403
# ---------------------------------------------------------------------------
# Destroy tests
# ---------------------------------------------------------------------------
@pytest.fixture
def mock_destroy_storage():
"""Mock storage backends so hard_delete doesn't require S3 credentials."""
with (
patch(
"reflector.db.transcripts.get_transcripts_storage",
return_value=AsyncMock(delete_file=AsyncMock()),
),
patch(
"reflector.db.transcripts.get_source_storage",
return_value=AsyncMock(delete_file=AsyncMock()),
),
):
yield
@pytest.mark.asyncio
async def test_transcript_destroy(authenticated_client, client, mock_destroy_storage):
"""Soft-delete then destroy, verify transcript gone from DB."""
response = await client.post("/transcripts", json={"name": "destroy-me"})
assert response.status_code == 200
tid = response.json()["id"]
# Soft-delete first
response = await client.delete(f"/transcripts/{tid}")
assert response.status_code == 200
# Destroy
response = await client.delete(f"/transcripts/{tid}/destroy")
assert response.status_code == 200
assert response.json()["status"] == "ok"
# Gone from DB entirely
transcript = await transcripts_controller.get_by_id(tid)
assert transcript is None
@pytest.mark.asyncio
async def test_transcript_destroy_not_soft_deleted(authenticated_client, client):
"""Cannot destroy a transcript that hasn't been soft-deleted."""
response = await client.post("/transcripts", json={"name": "not-soft-deleted"})
assert response.status_code == 200
tid = response.json()["id"]
response = await client.delete(f"/transcripts/{tid}/destroy")
assert response.status_code == 400
@pytest.mark.asyncio
async def test_transcript_destroy_with_recording(
authenticated_client, client, mock_destroy_storage
):
"""Destroying a transcript also hard-deletes its recording from DB."""
recording = await recordings_controller.create(
Recording(
bucket_name="test-bucket",
object_key="destroy-test.mp4",
recorded_at=datetime.now(timezone.utc),
)
)
transcript = await transcripts_controller.add(
name="destroy-with-recording",
source_kind=SourceKind.ROOM,
recording_id=recording.id,
user_id="randomuserid",
)
# Soft-delete
response = await client.delete(f"/transcripts/{transcript.id}")
assert response.status_code == 200
# Destroy
response = await client.delete(f"/transcripts/{transcript.id}/destroy")
assert response.status_code == 200
# Both gone from DB
assert await transcripts_controller.get_by_id(transcript.id) is None
assert await recordings_controller.get_by_id(recording.id) is None
@pytest.mark.asyncio
async def test_transcript_destroy_forbidden(authenticated_client, client):
"""Cannot destroy another user's deleted transcript."""
transcript = await transcripts_controller.add(
name="other-user-destroy",
source_kind=SourceKind.FILE,
user_id="some-other-user",
)
await transcripts_controller.remove_by_id(transcript.id, user_id="some-other-user")
# Try to destroy as randomuserid (authenticated_client)
response = await client.delete(f"/transcripts/{transcript.id}/destroy")
assert response.status_code == 403
# ---------------------------------------------------------------------------
# Isolation tests — verify unrelated data is NOT deleted
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_transcript_destroy_does_not_delete_meeting(
authenticated_client, client, mock_destroy_storage
):
"""Destroying a transcript must NOT delete its associated meeting."""
room = await rooms_controller.add(
name="room-for-meeting-isolation",
user_id="randomuserid",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
webhook_url="",
webhook_secret="",
)
now = datetime.now(timezone.utc)
meeting = await meetings_controller.create(
id="meeting-isolation-test",
room_name=room.name,
room_url="https://example.com/room",
host_room_url="https://example.com/room-host",
start_date=now,
end_date=now + timedelta(hours=1),
room=room,
)
recording = await recordings_controller.create(
Recording(
bucket_name="test-bucket",
object_key="meeting-iso.mp4",
recorded_at=now,
meeting_id=meeting.id,
)
)
transcript = await transcripts_controller.add(
name="transcript-with-meeting",
source_kind=SourceKind.ROOM,
recording_id=recording.id,
meeting_id=meeting.id,
room_id=room.id,
user_id="randomuserid",
)
# Soft-delete then destroy
await transcripts_controller.remove_by_id(transcript.id, user_id="randomuserid")
response = await client.delete(f"/transcripts/{transcript.id}/destroy")
assert response.status_code == 200
# Transcript and recording are gone
assert await transcripts_controller.get_by_id(transcript.id) is None
assert await recordings_controller.get_by_id(recording.id) is None
# Meeting still exists
m = await meetings_controller.get_by_id(meeting.id)
assert m is not None
assert m.id == meeting.id
@pytest.mark.asyncio
async def test_transcript_destroy_does_not_affect_other_transcripts(
authenticated_client, client, mock_destroy_storage
):
"""Destroying one transcript must not affect another transcript or its recording."""
user_id = "randomuserid"
rec1 = await recordings_controller.create(
Recording(
bucket_name="test-bucket",
object_key="sibling1.mp4",
recorded_at=datetime.now(timezone.utc),
)
)
rec2 = await recordings_controller.create(
Recording(
bucket_name="test-bucket",
object_key="sibling2.mp4",
recorded_at=datetime.now(timezone.utc),
)
)
t1 = await transcripts_controller.add(
name="sibling-1",
source_kind=SourceKind.FILE,
recording_id=rec1.id,
user_id=user_id,
)
t2 = await transcripts_controller.add(
name="sibling-2",
source_kind=SourceKind.FILE,
recording_id=rec2.id,
user_id=user_id,
)
# Soft-delete and destroy t1
await transcripts_controller.remove_by_id(t1.id, user_id=user_id)
response = await client.delete(f"/transcripts/{t1.id}/destroy")
assert response.status_code == 200
# t1 and rec1 gone
assert await transcripts_controller.get_by_id(t1.id) is None
assert await recordings_controller.get_by_id(rec1.id) is None
# t2 and rec2 untouched
t2_after = await transcripts_controller.get_by_id(t2.id)
assert t2_after is not None
assert t2_after.deleted_at is None
rec2_after = await recordings_controller.get_by_id(rec2.id)
assert rec2_after is not None
assert rec2_after.deleted_at is None
@pytest.mark.asyncio
async def test_transcript_destroy_meeting_with_multiple_transcripts(
authenticated_client, client, mock_destroy_storage
):
"""Destroying one transcript from a meeting must not affect the other
transcript, its recording, or the shared meeting."""
user_id = "randomuserid"
room = await rooms_controller.add(
name="room-multi-transcript",
user_id=user_id,
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
webhook_url="",
webhook_secret="",
)
now = datetime.now(timezone.utc)
meeting = await meetings_controller.create(
id="meeting-multi-transcript-test",
room_name=room.name,
room_url="https://example.com/room",
host_room_url="https://example.com/room-host",
start_date=now,
end_date=now + timedelta(hours=1),
room=room,
)
rec1 = await recordings_controller.create(
Recording(
bucket_name="test-bucket",
object_key="multi1.mp4",
recorded_at=now,
meeting_id=meeting.id,
)
)
rec2 = await recordings_controller.create(
Recording(
bucket_name="test-bucket",
object_key="multi2.mp4",
recorded_at=now,
meeting_id=meeting.id,
)
)
t1 = await transcripts_controller.add(
name="multi-t1",
source_kind=SourceKind.ROOM,
recording_id=rec1.id,
meeting_id=meeting.id,
room_id=room.id,
user_id=user_id,
)
t2 = await transcripts_controller.add(
name="multi-t2",
source_kind=SourceKind.ROOM,
recording_id=rec2.id,
meeting_id=meeting.id,
room_id=room.id,
user_id=user_id,
)
# Soft-delete and destroy t1
await transcripts_controller.remove_by_id(t1.id, user_id=user_id)
response = await client.delete(f"/transcripts/{t1.id}/destroy")
assert response.status_code == 200
# t1 + rec1 gone
assert await transcripts_controller.get_by_id(t1.id) is None
assert await recordings_controller.get_by_id(rec1.id) is None
# t2 + rec2 + meeting all still exist
assert (await transcripts_controller.get_by_id(t2.id)) is not None
assert (await recordings_controller.get_by_id(rec2.id)) is not None
assert (await meetings_controller.get_by_id(meeting.id)) is not None
# ---------------------------------------------------------------------------
# Search tests
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_search_include_deleted(authenticated_client, client):
"""Search with include_deleted=true returns only deleted transcripts."""
response = await client.post("/transcripts", json={"name": "search-deleted"})
assert response.status_code == 200
tid = response.json()["id"]
# Soft-delete
response = await client.delete(f"/transcripts/{tid}")
assert response.status_code == 200
# Normal search should not include it
response = await client.get("/transcripts/search", params={"q": ""})
assert response.status_code == 200
ids = [r["id"] for r in response.json()["results"]]
assert tid not in ids
# Search with include_deleted should include it
response = await client.get(
"/transcripts/search", params={"q": "", "include_deleted": True}
)
assert response.status_code == 200
ids = [r["id"] for r in response.json()["results"]]
assert tid in ids
@pytest.mark.asyncio
async def test_search_exclude_deleted_by_default(authenticated_client, client):
"""Normal search excludes deleted transcripts by default."""
response = await client.post(
"/transcripts", json={"name": "search-exclude-deleted"}
)
assert response.status_code == 200
tid = response.json()["id"]
# Verify it appears in search
response = await client.get("/transcripts/search", params={"q": ""})
assert response.status_code == 200
ids = [r["id"] for r in response.json()["results"]]
assert tid in ids
# Soft-delete
response = await client.delete(f"/transcripts/{tid}")
assert response.status_code == 200
# Verify it no longer appears in default search
response = await client.get("/transcripts/search", params={"q": ""})
assert response.status_code == 200
ids = [r["id"] for r in response.json()["results"]]
assert tid not in ids
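A hypothetical client sketch of the query the trash view builds from these semantics (paths and parameters mirror the tests above; auth is elided):

import httpx

async def list_trash(client: httpx.AsyncClient) -> list[str]:
    # include_deleted=true restricts results to soft-deleted transcripts;
    # without it, the same query excludes them.
    resp = await client.get(
        "/transcripts/search",
        params={"q": "", "include_deleted": "true"},
    )
    resp.raise_for_status()
    return [r["id"] for r in resp.json()["results"]]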

server/uv.lock (generated, 78 lines changed)
View File

@@ -776,47 +776,47 @@ toml = [
[[package]]
name = "cryptography"
version = "46.0.5"
version = "46.0.6"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "cffi", version = "2.0.0", source = { registry = "https://pypi.org/simple" }, marker = "platform_python_implementation != 'PyPy'" },
]
sdist = { url = "https://files.pythonhosted.org/packages/60/04/ee2a9e8542e4fa2773b81771ff8349ff19cdd56b7258a0cc442639052edb/cryptography-46.0.5.tar.gz", hash = "sha256:abace499247268e3757271b2f1e244b36b06f8515cf27c4d49468fc9eb16e93d", size = 750064, upload-time = "2026-02-10T19:18:38.255Z" }
sdist = { url = "https://files.pythonhosted.org/packages/a4/ba/04b1bd4218cbc58dc90ce967106d51582371b898690f3ae0402876cc4f34/cryptography-46.0.6.tar.gz", hash = "sha256:27550628a518c5c6c903d84f637fbecf287f6cb9ced3804838a1295dc1fd0759", size = 750542, upload-time = "2026-03-25T23:34:53.396Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/f7/81/b0bb27f2ba931a65409c6b8a8b358a7f03c0e46eceacddff55f7c84b1f3b/cryptography-46.0.5-cp311-abi3-macosx_10_9_universal2.whl", hash = "sha256:351695ada9ea9618b3500b490ad54c739860883df6c1f555e088eaf25b1bbaad", size = 7176289, upload-time = "2026-02-10T19:17:08.274Z" },
{ url = "https://files.pythonhosted.org/packages/ff/9e/6b4397a3e3d15123de3b1806ef342522393d50736c13b20ec4c9ea6693a6/cryptography-46.0.5-cp311-abi3-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:c18ff11e86df2e28854939acde2d003f7984f721eba450b56a200ad90eeb0e6b", size = 4275637, upload-time = "2026-02-10T19:17:10.53Z" },
{ url = "https://files.pythonhosted.org/packages/63/e7/471ab61099a3920b0c77852ea3f0ea611c9702f651600397ac567848b897/cryptography-46.0.5-cp311-abi3-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:4d7e3d356b8cd4ea5aff04f129d5f66ebdc7b6f8eae802b93739ed520c47c79b", size = 4424742, upload-time = "2026-02-10T19:17:12.388Z" },
{ url = "https://files.pythonhosted.org/packages/37/53/a18500f270342d66bf7e4d9f091114e31e5ee9e7375a5aba2e85a91e0044/cryptography-46.0.5-cp311-abi3-manylinux_2_28_aarch64.whl", hash = "sha256:50bfb6925eff619c9c023b967d5b77a54e04256c4281b0e21336a130cd7fc263", size = 4277528, upload-time = "2026-02-10T19:17:13.853Z" },
{ url = "https://files.pythonhosted.org/packages/22/29/c2e812ebc38c57b40e7c583895e73c8c5adb4d1e4a0cc4c5a4fdab2b1acc/cryptography-46.0.5-cp311-abi3-manylinux_2_28_ppc64le.whl", hash = "sha256:803812e111e75d1aa73690d2facc295eaefd4439be1023fefc4995eaea2af90d", size = 4947993, upload-time = "2026-02-10T19:17:15.618Z" },
{ url = "https://files.pythonhosted.org/packages/6b/e7/237155ae19a9023de7e30ec64e5d99a9431a567407ac21170a046d22a5a3/cryptography-46.0.5-cp311-abi3-manylinux_2_28_x86_64.whl", hash = "sha256:3ee190460e2fbe447175cda91b88b84ae8322a104fc27766ad09428754a618ed", size = 4456855, upload-time = "2026-02-10T19:17:17.221Z" },
{ url = "https://files.pythonhosted.org/packages/2d/87/fc628a7ad85b81206738abbd213b07702bcbdada1dd43f72236ef3cffbb5/cryptography-46.0.5-cp311-abi3-manylinux_2_31_armv7l.whl", hash = "sha256:f145bba11b878005c496e93e257c1e88f154d278d2638e6450d17e0f31e558d2", size = 3984635, upload-time = "2026-02-10T19:17:18.792Z" },
{ url = "https://files.pythonhosted.org/packages/84/29/65b55622bde135aedf4565dc509d99b560ee4095e56989e815f8fd2aa910/cryptography-46.0.5-cp311-abi3-manylinux_2_34_aarch64.whl", hash = "sha256:e9251e3be159d1020c4030bd2e5f84d6a43fe54b6c19c12f51cde9542a2817b2", size = 4277038, upload-time = "2026-02-10T19:17:20.256Z" },
{ url = "https://files.pythonhosted.org/packages/bc/36/45e76c68d7311432741faf1fbf7fac8a196a0a735ca21f504c75d37e2558/cryptography-46.0.5-cp311-abi3-manylinux_2_34_ppc64le.whl", hash = "sha256:47fb8a66058b80e509c47118ef8a75d14c455e81ac369050f20ba0d23e77fee0", size = 4912181, upload-time = "2026-02-10T19:17:21.825Z" },
{ url = "https://files.pythonhosted.org/packages/6d/1a/c1ba8fead184d6e3d5afcf03d569acac5ad063f3ac9fb7258af158f7e378/cryptography-46.0.5-cp311-abi3-manylinux_2_34_x86_64.whl", hash = "sha256:4c3341037c136030cb46e4b1e17b7418ea4cbd9dd207e4a6f3b2b24e0d4ac731", size = 4456482, upload-time = "2026-02-10T19:17:25.133Z" },
{ url = "https://files.pythonhosted.org/packages/f9/e5/3fb22e37f66827ced3b902cf895e6a6bc1d095b5b26be26bd13c441fdf19/cryptography-46.0.5-cp311-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:890bcb4abd5a2d3f852196437129eb3667d62630333aacc13dfd470fad3aaa82", size = 4405497, upload-time = "2026-02-10T19:17:26.66Z" },
{ url = "https://files.pythonhosted.org/packages/1a/df/9d58bb32b1121a8a2f27383fabae4d63080c7ca60b9b5c88be742be04ee7/cryptography-46.0.5-cp311-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:80a8d7bfdf38f87ca30a5391c0c9ce4ed2926918e017c29ddf643d0ed2778ea1", size = 4667819, upload-time = "2026-02-10T19:17:28.569Z" },
{ url = "https://files.pythonhosted.org/packages/ea/ed/325d2a490c5e94038cdb0117da9397ece1f11201f425c4e9c57fe5b9f08b/cryptography-46.0.5-cp311-abi3-win32.whl", hash = "sha256:60ee7e19e95104d4c03871d7d7dfb3d22ef8a9b9c6778c94e1c8fcc8365afd48", size = 3028230, upload-time = "2026-02-10T19:17:30.518Z" },
{ url = "https://files.pythonhosted.org/packages/e9/5a/ac0f49e48063ab4255d9e3b79f5def51697fce1a95ea1370f03dc9db76f6/cryptography-46.0.5-cp311-abi3-win_amd64.whl", hash = "sha256:38946c54b16c885c72c4f59846be9743d699eee2b69b6988e0a00a01f46a61a4", size = 3480909, upload-time = "2026-02-10T19:17:32.083Z" },
{ url = "https://files.pythonhosted.org/packages/e2/fa/a66aa722105ad6a458bebd64086ca2b72cdd361fed31763d20390f6f1389/cryptography-46.0.5-cp38-abi3-macosx_10_9_universal2.whl", hash = "sha256:4108d4c09fbbf2789d0c926eb4152ae1760d5a2d97612b92d508d96c861e4d31", size = 7170514, upload-time = "2026-02-10T19:17:56.267Z" },
{ url = "https://files.pythonhosted.org/packages/0f/04/c85bdeab78c8bc77b701bf0d9bdcf514c044e18a46dcff330df5448631b0/cryptography-46.0.5-cp38-abi3-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:7d1f30a86d2757199cb2d56e48cce14deddf1f9c95f1ef1b64ee91ea43fe2e18", size = 4275349, upload-time = "2026-02-10T19:17:58.419Z" },
{ url = "https://files.pythonhosted.org/packages/5c/32/9b87132a2f91ee7f5223b091dc963055503e9b442c98fc0b8a5ca765fab0/cryptography-46.0.5-cp38-abi3-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:039917b0dc418bb9f6edce8a906572d69e74bd330b0b3fea4f79dab7f8ddd235", size = 4420667, upload-time = "2026-02-10T19:18:00.619Z" },
{ url = "https://files.pythonhosted.org/packages/a1/a6/a7cb7010bec4b7c5692ca6f024150371b295ee1c108bdc1c400e4c44562b/cryptography-46.0.5-cp38-abi3-manylinux_2_28_aarch64.whl", hash = "sha256:ba2a27ff02f48193fc4daeadf8ad2590516fa3d0adeeb34336b96f7fa64c1e3a", size = 4276980, upload-time = "2026-02-10T19:18:02.379Z" },
{ url = "https://files.pythonhosted.org/packages/8e/7c/c4f45e0eeff9b91e3f12dbd0e165fcf2a38847288fcfd889deea99fb7b6d/cryptography-46.0.5-cp38-abi3-manylinux_2_28_ppc64le.whl", hash = "sha256:61aa400dce22cb001a98014f647dc21cda08f7915ceb95df0c9eaf84b4b6af76", size = 4939143, upload-time = "2026-02-10T19:18:03.964Z" },
{ url = "https://files.pythonhosted.org/packages/37/19/e1b8f964a834eddb44fa1b9a9976f4e414cbb7aa62809b6760c8803d22d1/cryptography-46.0.5-cp38-abi3-manylinux_2_28_x86_64.whl", hash = "sha256:3ce58ba46e1bc2aac4f7d9290223cead56743fa6ab94a5d53292ffaac6a91614", size = 4453674, upload-time = "2026-02-10T19:18:05.588Z" },
{ url = "https://files.pythonhosted.org/packages/db/ed/db15d3956f65264ca204625597c410d420e26530c4e2943e05a0d2f24d51/cryptography-46.0.5-cp38-abi3-manylinux_2_31_armv7l.whl", hash = "sha256:420d0e909050490d04359e7fdb5ed7e667ca5c3c402b809ae2563d7e66a92229", size = 3978801, upload-time = "2026-02-10T19:18:07.167Z" },
{ url = "https://files.pythonhosted.org/packages/41/e2/df40a31d82df0a70a0daf69791f91dbb70e47644c58581d654879b382d11/cryptography-46.0.5-cp38-abi3-manylinux_2_34_aarch64.whl", hash = "sha256:582f5fcd2afa31622f317f80426a027f30dc792e9c80ffee87b993200ea115f1", size = 4276755, upload-time = "2026-02-10T19:18:09.813Z" },
{ url = "https://files.pythonhosted.org/packages/33/45/726809d1176959f4a896b86907b98ff4391a8aa29c0aaaf9450a8a10630e/cryptography-46.0.5-cp38-abi3-manylinux_2_34_ppc64le.whl", hash = "sha256:bfd56bb4b37ed4f330b82402f6f435845a5f5648edf1ad497da51a8452d5d62d", size = 4901539, upload-time = "2026-02-10T19:18:11.263Z" },
{ url = "https://files.pythonhosted.org/packages/99/0f/a3076874e9c88ecb2ecc31382f6e7c21b428ede6f55aafa1aa272613e3cd/cryptography-46.0.5-cp38-abi3-manylinux_2_34_x86_64.whl", hash = "sha256:a3d507bb6a513ca96ba84443226af944b0f7f47dcc9a399d110cd6146481d24c", size = 4452794, upload-time = "2026-02-10T19:18:12.914Z" },
{ url = "https://files.pythonhosted.org/packages/02/ef/ffeb542d3683d24194a38f66ca17c0a4b8bf10631feef44a7ef64e631b1a/cryptography-46.0.5-cp38-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:9f16fbdf4da055efb21c22d81b89f155f02ba420558db21288b3d0035bafd5f4", size = 4404160, upload-time = "2026-02-10T19:18:14.375Z" },
{ url = "https://files.pythonhosted.org/packages/96/93/682d2b43c1d5f1406ed048f377c0fc9fc8f7b0447a478d5c65ab3d3a66eb/cryptography-46.0.5-cp38-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:ced80795227d70549a411a4ab66e8ce307899fad2220ce5ab2f296e687eacde9", size = 4667123, upload-time = "2026-02-10T19:18:15.886Z" },
{ url = "https://files.pythonhosted.org/packages/45/2d/9c5f2926cb5300a8eefc3f4f0b3f3df39db7f7ce40c8365444c49363cbda/cryptography-46.0.5-cp38-abi3-win32.whl", hash = "sha256:02f547fce831f5096c9a567fd41bc12ca8f11df260959ecc7c3202555cc47a72", size = 3010220, upload-time = "2026-02-10T19:18:17.361Z" },
{ url = "https://files.pythonhosted.org/packages/48/ef/0c2f4a8e31018a986949d34a01115dd057bf536905dca38897bacd21fac3/cryptography-46.0.5-cp38-abi3-win_amd64.whl", hash = "sha256:556e106ee01aa13484ce9b0239bca667be5004efb0aabbed28d353df86445595", size = 3467050, upload-time = "2026-02-10T19:18:18.899Z" },
{ url = "https://files.pythonhosted.org/packages/eb/dd/2d9fdb07cebdf3d51179730afb7d5e576153c6744c3ff8fded23030c204e/cryptography-46.0.5-pp311-pypy311_pp73-macosx_11_0_arm64.whl", hash = "sha256:3b4995dc971c9fb83c25aa44cf45f02ba86f71ee600d81091c2f0cbae116b06c", size = 3476964, upload-time = "2026-02-10T19:18:20.687Z" },
{ url = "https://files.pythonhosted.org/packages/e9/6f/6cc6cc9955caa6eaf83660b0da2b077c7fe8ff9950a3c5e45d605038d439/cryptography-46.0.5-pp311-pypy311_pp73-manylinux_2_28_aarch64.whl", hash = "sha256:bc84e875994c3b445871ea7181d424588171efec3e185dced958dad9e001950a", size = 4218321, upload-time = "2026-02-10T19:18:22.349Z" },
{ url = "https://files.pythonhosted.org/packages/3e/5d/c4da701939eeee699566a6c1367427ab91a8b7088cc2328c09dbee940415/cryptography-46.0.5-pp311-pypy311_pp73-manylinux_2_28_x86_64.whl", hash = "sha256:2ae6971afd6246710480e3f15824ed3029a60fc16991db250034efd0b9fb4356", size = 4381786, upload-time = "2026-02-10T19:18:24.529Z" },
{ url = "https://files.pythonhosted.org/packages/ac/97/a538654732974a94ff96c1db621fa464f455c02d4bb7d2652f4edc21d600/cryptography-46.0.5-pp311-pypy311_pp73-manylinux_2_34_aarch64.whl", hash = "sha256:d861ee9e76ace6cf36a6a89b959ec08e7bc2493ee39d07ffe5acb23ef46d27da", size = 4217990, upload-time = "2026-02-10T19:18:25.957Z" },
{ url = "https://files.pythonhosted.org/packages/ae/11/7e500d2dd3ba891197b9efd2da5454b74336d64a7cc419aa7327ab74e5f6/cryptography-46.0.5-pp311-pypy311_pp73-manylinux_2_34_x86_64.whl", hash = "sha256:2b7a67c9cd56372f3249b39699f2ad479f6991e62ea15800973b956f4b73e257", size = 4381252, upload-time = "2026-02-10T19:18:27.496Z" },
{ url = "https://files.pythonhosted.org/packages/bc/58/6b3d24e6b9bc474a2dcdee65dfd1f008867015408a271562e4b690561a4d/cryptography-46.0.5-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:8456928655f856c6e1533ff59d5be76578a7157224dbd9ce6872f25055ab9ab7", size = 3407605, upload-time = "2026-02-10T19:18:29.233Z" },
{ url = "https://files.pythonhosted.org/packages/47/23/9285e15e3bc57325b0a72e592921983a701efc1ee8f91c06c5f0235d86d9/cryptography-46.0.6-cp311-abi3-macosx_10_9_universal2.whl", hash = "sha256:64235194bad039a10bb6d2d930ab3323baaec67e2ce36215fd0952fad0930ca8", size = 7176401, upload-time = "2026-03-25T23:33:22.096Z" },
{ url = "https://files.pythonhosted.org/packages/60/f8/e61f8f13950ab6195b31913b42d39f0f9afc7d93f76710f299b5ec286ae6/cryptography-46.0.6-cp311-abi3-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:26031f1e5ca62fcb9d1fcb34b2b60b390d1aacaa15dc8b895a9ed00968b97b30", size = 4275275, upload-time = "2026-03-25T23:33:23.844Z" },
{ url = "https://files.pythonhosted.org/packages/19/69/732a736d12c2631e140be2348b4ad3d226302df63ef64d30dfdb8db7ad1c/cryptography-46.0.6-cp311-abi3-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:9a693028b9cbe51b5a1136232ee8f2bc242e4e19d456ded3fa7c86e43c713b4a", size = 4425320, upload-time = "2026-03-25T23:33:25.703Z" },
{ url = "https://files.pythonhosted.org/packages/d4/12/123be7292674abf76b21ac1fc0e1af50661f0e5b8f0ec8285faac18eb99e/cryptography-46.0.6-cp311-abi3-manylinux_2_28_aarch64.whl", hash = "sha256:67177e8a9f421aa2d3a170c3e56eca4e0128883cf52a071a7cbf53297f18b175", size = 4278082, upload-time = "2026-03-25T23:33:27.423Z" },
{ url = "https://files.pythonhosted.org/packages/5b/ba/d5e27f8d68c24951b0a484924a84c7cdaed7502bac9f18601cd357f8b1d2/cryptography-46.0.6-cp311-abi3-manylinux_2_28_ppc64le.whl", hash = "sha256:d9528b535a6c4f8ff37847144b8986a9a143585f0540fbcb1a98115b543aa463", size = 4926514, upload-time = "2026-03-25T23:33:29.206Z" },
{ url = "https://files.pythonhosted.org/packages/34/71/1ea5a7352ae516d5512d17babe7e1b87d9db5150b21f794b1377eac1edc0/cryptography-46.0.6-cp311-abi3-manylinux_2_28_x86_64.whl", hash = "sha256:22259338084d6ae497a19bae5d4c66b7ca1387d3264d1c2c0e72d9e9b6a77b97", size = 4457766, upload-time = "2026-03-25T23:33:30.834Z" },
{ url = "https://files.pythonhosted.org/packages/01/59/562be1e653accee4fdad92c7a2e88fced26b3fdfce144047519bbebc299e/cryptography-46.0.6-cp311-abi3-manylinux_2_31_armv7l.whl", hash = "sha256:760997a4b950ff00d418398ad73fbc91aa2894b5c1db7ccb45b4f68b42a63b3c", size = 3986535, upload-time = "2026-03-25T23:33:33.02Z" },
{ url = "https://files.pythonhosted.org/packages/d6/8b/b1ebfeb788bf4624d36e45ed2662b8bd43a05ff62157093c1539c1288a18/cryptography-46.0.6-cp311-abi3-manylinux_2_34_aarch64.whl", hash = "sha256:3dfa6567f2e9e4c5dceb8ccb5a708158a2a871052fa75c8b78cb0977063f1507", size = 4277618, upload-time = "2026-03-25T23:33:34.567Z" },
{ url = "https://files.pythonhosted.org/packages/dd/52/a005f8eabdb28df57c20f84c44d397a755782d6ff6d455f05baa2785bd91/cryptography-46.0.6-cp311-abi3-manylinux_2_34_ppc64le.whl", hash = "sha256:cdcd3edcbc5d55757e5f5f3d330dd00007ae463a7e7aa5bf132d1f22a4b62b19", size = 4890802, upload-time = "2026-03-25T23:33:37.034Z" },
{ url = "https://files.pythonhosted.org/packages/ec/4d/8e7d7245c79c617d08724e2efa397737715ca0ec830ecb3c91e547302555/cryptography-46.0.6-cp311-abi3-manylinux_2_34_x86_64.whl", hash = "sha256:d4e4aadb7fc1f88687f47ca20bb7227981b03afaae69287029da08096853b738", size = 4457425, upload-time = "2026-03-25T23:33:38.904Z" },
{ url = "https://files.pythonhosted.org/packages/1d/5c/f6c3596a1430cec6f949085f0e1a970638d76f81c3ea56d93d564d04c340/cryptography-46.0.6-cp311-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:2b417edbe8877cda9022dde3a008e2deb50be9c407eef034aeeb3a8b11d9db3c", size = 4405530, upload-time = "2026-03-25T23:33:40.842Z" },
{ url = "https://files.pythonhosted.org/packages/7e/c9/9f9cea13ee2dbde070424e0c4f621c091a91ffcc504ffea5e74f0e1daeff/cryptography-46.0.6-cp311-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:380343e0653b1c9d7e1f55b52aaa2dbb2fdf2730088d48c43ca1c7c0abb7cc2f", size = 4667896, upload-time = "2026-03-25T23:33:42.781Z" },
{ url = "https://files.pythonhosted.org/packages/ad/b5/1895bc0821226f129bc74d00eccfc6a5969e2028f8617c09790bf89c185e/cryptography-46.0.6-cp311-abi3-win32.whl", hash = "sha256:bcb87663e1f7b075e48c3be3ecb5f0b46c8fc50b50a97cf264e7f60242dca3f2", size = 3026348, upload-time = "2026-03-25T23:33:45.021Z" },
{ url = "https://files.pythonhosted.org/packages/c3/f8/c9bcbf0d3e6ad288b9d9aa0b1dee04b063d19e8c4f871855a03ab3a297ab/cryptography-46.0.6-cp311-abi3-win_amd64.whl", hash = "sha256:6739d56300662c468fddb0e5e291f9b4d084bead381667b9e654c7dd81705124", size = 3483896, upload-time = "2026-03-25T23:33:46.649Z" },
{ url = "https://files.pythonhosted.org/packages/c4/cc/f330e982852403da79008552de9906804568ae9230da8432f7496ce02b71/cryptography-46.0.6-cp38-abi3-macosx_10_9_universal2.whl", hash = "sha256:12cae594e9473bca1a7aceb90536060643128bb274fcea0fc459ab90f7d1ae7a", size = 7162776, upload-time = "2026-03-25T23:34:13.308Z" },
{ url = "https://files.pythonhosted.org/packages/49/b3/dc27efd8dcc4bff583b3f01d4a3943cd8b5821777a58b3a6a5f054d61b79/cryptography-46.0.6-cp38-abi3-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:639301950939d844a9e1c4464d7e07f902fe9a7f6b215bb0d4f28584729935d8", size = 4270529, upload-time = "2026-03-25T23:34:15.019Z" },
{ url = "https://files.pythonhosted.org/packages/e6/05/e8d0e6eb4f0d83365b3cb0e00eb3c484f7348db0266652ccd84632a3d58d/cryptography-46.0.6-cp38-abi3-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:ed3775295fb91f70b4027aeba878d79b3e55c0b3e97eaa4de71f8f23a9f2eb77", size = 4414827, upload-time = "2026-03-25T23:34:16.604Z" },
{ url = "https://files.pythonhosted.org/packages/2f/97/daba0f5d2dc6d855e2dcb70733c812558a7977a55dd4a6722756628c44d1/cryptography-46.0.6-cp38-abi3-manylinux_2_28_aarch64.whl", hash = "sha256:8927ccfbe967c7df312ade694f987e7e9e22b2425976ddbf28271d7e58845290", size = 4271265, upload-time = "2026-03-25T23:34:18.586Z" },
{ url = "https://files.pythonhosted.org/packages/89/06/fe1fce39a37ac452e58d04b43b0855261dac320a2ebf8f5260dd55b201a9/cryptography-46.0.6-cp38-abi3-manylinux_2_28_ppc64le.whl", hash = "sha256:b12c6b1e1651e42ab5de8b1e00dc3b6354fdfd778e7fa60541ddacc27cd21410", size = 4916800, upload-time = "2026-03-25T23:34:20.561Z" },
{ url = "https://files.pythonhosted.org/packages/ff/8a/b14f3101fe9c3592603339eb5d94046c3ce5f7fc76d6512a2d40efd9724e/cryptography-46.0.6-cp38-abi3-manylinux_2_28_x86_64.whl", hash = "sha256:063b67749f338ca9c5a0b7fe438a52c25f9526b851e24e6c9310e7195aad3b4d", size = 4448771, upload-time = "2026-03-25T23:34:22.406Z" },
{ url = "https://files.pythonhosted.org/packages/01/b3/0796998056a66d1973fd52ee89dc1bb3b6581960a91ad4ac705f182d398f/cryptography-46.0.6-cp38-abi3-manylinux_2_31_armv7l.whl", hash = "sha256:02fad249cb0e090b574e30b276a3da6a149e04ee2f049725b1f69e7b8351ec70", size = 3978333, upload-time = "2026-03-25T23:34:24.281Z" },
{ url = "https://files.pythonhosted.org/packages/c5/3d/db200af5a4ffd08918cd55c08399dc6c9c50b0bc72c00a3246e099d3a849/cryptography-46.0.6-cp38-abi3-manylinux_2_34_aarch64.whl", hash = "sha256:7e6142674f2a9291463e5e150090b95a8519b2fb6e6aaec8917dd8d094ce750d", size = 4271069, upload-time = "2026-03-25T23:34:25.895Z" },
{ url = "https://files.pythonhosted.org/packages/d7/18/61acfd5b414309d74ee838be321c636fe71815436f53c9f0334bf19064fa/cryptography-46.0.6-cp38-abi3-manylinux_2_34_ppc64le.whl", hash = "sha256:456b3215172aeefb9284550b162801d62f5f264a081049a3e94307fe20792cfa", size = 4878358, upload-time = "2026-03-25T23:34:27.67Z" },
{ url = "https://files.pythonhosted.org/packages/8b/65/5bf43286d566f8171917cae23ac6add941654ccf085d739195a4eacf1674/cryptography-46.0.6-cp38-abi3-manylinux_2_34_x86_64.whl", hash = "sha256:341359d6c9e68834e204ceaf25936dffeafea3829ab80e9503860dcc4f4dac58", size = 4448061, upload-time = "2026-03-25T23:34:29.375Z" },
{ url = "https://files.pythonhosted.org/packages/e0/25/7e49c0fa7205cf3597e525d156a6bce5b5c9de1fd7e8cb01120e459f205a/cryptography-46.0.6-cp38-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:9a9c42a2723999a710445bc0d974e345c32adfd8d2fac6d8a251fa829ad31cfb", size = 4399103, upload-time = "2026-03-25T23:34:32.036Z" },
{ url = "https://files.pythonhosted.org/packages/44/46/466269e833f1c4718d6cd496ffe20c56c9c8d013486ff66b4f69c302a68d/cryptography-46.0.6-cp38-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:6617f67b1606dfd9fe4dbfa354a9508d4a6d37afe30306fe6c101b7ce3274b72", size = 4659255, upload-time = "2026-03-25T23:34:33.679Z" },
{ url = "https://files.pythonhosted.org/packages/0a/09/ddc5f630cc32287d2c953fc5d32705e63ec73e37308e5120955316f53827/cryptography-46.0.6-cp38-abi3-win32.whl", hash = "sha256:7f6690b6c55e9c5332c0b59b9c8a3fb232ebf059094c17f9019a51e9827df91c", size = 3010660, upload-time = "2026-03-25T23:34:35.418Z" },
{ url = "https://files.pythonhosted.org/packages/1b/82/ca4893968aeb2709aacfb57a30dec6fa2ab25b10fa9f064b8882ce33f599/cryptography-46.0.6-cp38-abi3-win_amd64.whl", hash = "sha256:79e865c642cfc5c0b3eb12af83c35c5aeff4fa5c672dc28c43721c2c9fdd2f0f", size = 3471160, upload-time = "2026-03-25T23:34:37.191Z" },
{ url = "https://files.pythonhosted.org/packages/2e/84/7ccff00ced5bac74b775ce0beb7d1be4e8637536b522b5df9b73ada42da2/cryptography-46.0.6-pp311-pypy311_pp73-macosx_11_0_arm64.whl", hash = "sha256:2ea0f37e9a9cf0df2952893ad145fd9627d326a59daec9b0802480fa3bcd2ead", size = 3475444, upload-time = "2026-03-25T23:34:38.944Z" },
{ url = "https://files.pythonhosted.org/packages/bc/1f/4c926f50df7749f000f20eede0c896769509895e2648db5da0ed55db711d/cryptography-46.0.6-pp311-pypy311_pp73-manylinux_2_28_aarch64.whl", hash = "sha256:a3e84d5ec9ba01f8fd03802b2147ba77f0c8f2617b2aff254cedd551844209c8", size = 4218227, upload-time = "2026-03-25T23:34:40.871Z" },
{ url = "https://files.pythonhosted.org/packages/c6/65/707be3ffbd5f786028665c3223e86e11c4cda86023adbc56bd72b1b6bab5/cryptography-46.0.6-pp311-pypy311_pp73-manylinux_2_28_x86_64.whl", hash = "sha256:12f0fa16cc247b13c43d56d7b35287ff1569b5b1f4c5e87e92cc4fcc00cd10c0", size = 4381399, upload-time = "2026-03-25T23:34:42.609Z" },
{ url = "https://files.pythonhosted.org/packages/f3/6d/73557ed0ef7d73d04d9aba745d2c8e95218213687ee5e76b7d236a5030fc/cryptography-46.0.6-pp311-pypy311_pp73-manylinux_2_34_aarch64.whl", hash = "sha256:50575a76e2951fe7dbd1f56d181f8c5ceeeb075e9ff88e7ad997d2f42af06e7b", size = 4217595, upload-time = "2026-03-25T23:34:44.205Z" },
{ url = "https://files.pythonhosted.org/packages/9e/c5/e1594c4eec66a567c3ac4400008108a415808be2ce13dcb9a9045c92f1a0/cryptography-46.0.6-pp311-pypy311_pp73-manylinux_2_34_x86_64.whl", hash = "sha256:90e5f0a7b3be5f40c3a0a0eafb32c681d8d2c181fc2a1bdabe9b3f611d9f6b1a", size = 4380912, upload-time = "2026-03-25T23:34:46.328Z" },
{ url = "https://files.pythonhosted.org/packages/1a/89/843b53614b47f97fe1abc13f9a86efa5ec9e275292c457af1d4a60dc80e0/cryptography-46.0.6-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:6728c49e3b2c180ef26f8e9f0a883a2c585638db64cf265b49c9ba10652d430e", size = 3409955, upload-time = "2026-03-25T23:34:48.465Z" },
]
[[package]]
@@ -3543,7 +3543,7 @@ wheels = [
[[package]]
name = "requests"
version = "2.32.4"
version = "2.33.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "certifi" },
@@ -3551,9 +3551,9 @@ dependencies = [
{ name = "idna" },
{ name = "urllib3" },
]
sdist = { url = "https://files.pythonhosted.org/packages/e1/0a/929373653770d8a0d7ea76c37de6e41f11eb07559b103b1c02cafb3f7cf8/requests-2.32.4.tar.gz", hash = "sha256:27d0316682c8a29834d3264820024b62a36942083d52caf2f14c0591336d3422", size = 135258, upload-time = "2025-06-09T16:43:07.34Z" }
sdist = { url = "https://files.pythonhosted.org/packages/34/64/8860370b167a9721e8956ae116825caff829224fbca0ca6e7bf8ddef8430/requests-2.33.0.tar.gz", hash = "sha256:c7ebc5e8b0f21837386ad0e1c8fe8b829fa5f544d8df3b2253bff14ef29d7652", size = 134232, upload-time = "2026-03-25T15:10:41.586Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/7c/e4/56027c4a6b4ae70ca9de302488c5ca95ad4a39e190093d6c1a8ace08341b/requests-2.32.4-py3-none-any.whl", hash = "sha256:27babd3cda2a6d50b30443204ee89830707d396671944c998b5975b031ac2b2c", size = 64847, upload-time = "2025-06-09T16:43:05.728Z" },
{ url = "https://files.pythonhosted.org/packages/56/5d/c814546c2333ceea4ba42262d8c4d55763003e767fa169adc693bd524478/requests-2.33.0-py3-none-any.whl", hash = "sha256:3324635456fa185245e24865e810cecec7b4caf933d7eb133dcde67d48cee69b", size = 65017, upload-time = "2026-03-25T15:10:40.382Z" },
]
[[package]]

View File

@@ -34,11 +34,11 @@ export default function DeleteTranscriptDialog({
<Dialog.Positioner>
<Dialog.Content>
<Dialog.Header fontSize="lg" fontWeight="bold">
Delete transcript
Move to Trash
</Dialog.Header>
<Dialog.Body>
Are you sure you want to delete this transcript? This action cannot
be undone.
This transcript will be moved to the trash. You can restore it later
from the Trash view.
{title && (
<Text mt={3} fontWeight="600">
{title}
@@ -71,7 +71,7 @@ export default function DeleteTranscriptDialog({
ml={3}
disabled={!!isLoading}
>
Delete
Move to Trash
</Button>
</Dialog.Footer>
</Dialog.Content>

View File

@@ -0,0 +1,83 @@
import React from "react";
import { Button, Dialog, Text } from "@chakra-ui/react";
interface DestroyTranscriptDialogProps {
isOpen: boolean;
onClose: () => void;
onConfirm: () => void;
cancelRef: React.RefObject<any>;
isLoading?: boolean;
title?: string;
date?: string;
source?: string;
}
export default function DestroyTranscriptDialog({
isOpen,
onClose,
onConfirm,
cancelRef,
isLoading,
title,
date,
source,
}: DestroyTranscriptDialogProps) {
return (
<Dialog.Root
open={isOpen}
onOpenChange={(e) => {
if (!e.open) onClose();
}}
initialFocusEl={() => cancelRef.current}
>
<Dialog.Backdrop />
<Dialog.Positioner>
<Dialog.Content>
<Dialog.Header fontSize="lg" fontWeight="bold">
Permanently Destroy Transcript
</Dialog.Header>
<Dialog.Body>
<Text color="red.600" fontWeight="medium">
This will permanently delete this transcript and all its
associated audio files. This action cannot be undone.
</Text>
{title && (
<Text mt={3} fontWeight="600">
{title}
</Text>
)}
{date && (
<Text color="gray.600" fontSize="sm">
Date: {date}
</Text>
)}
{source && (
<Text color="gray.600" fontSize="sm">
Source: {source}
</Text>
)}
</Dialog.Body>
<Dialog.Footer>
<Button
ref={cancelRef as any}
onClick={onClose}
disabled={!!isLoading}
variant="outline"
colorPalette="gray"
>
Cancel
</Button>
<Button
colorPalette="red"
onClick={onConfirm}
ml={3}
disabled={!!isLoading}
>
Destroy
</Button>
</Dialog.Footer>
</Dialog.Content>
</Dialog.Positioner>
</Dialog.Root>
);
}

View File

@@ -1,8 +1,9 @@
"use client";
import React from "react";
import { Box, Stack, Link, Heading } from "@chakra-ui/react";
import { Box, Stack, Link, Heading, Flex } from "@chakra-ui/react";
import NextLink from "next/link";
import { LuTrash2 } from "react-icons/lu";
import type { components } from "../../../reflector-api";
type Room = components["schemas"]["Room"];
@@ -13,6 +14,9 @@ interface FilterSidebarProps {
selectedSourceKind: SourceKind | null;
selectedRoomId: string;
onFilterChange: (sourceKind: SourceKind | null, roomId: string) => void;
isTrashView: boolean;
onTrashClick: () => void;
isAuthenticated: boolean;
}
export default function FilterSidebar({
@@ -20,6 +24,9 @@ export default function FilterSidebar({
selectedSourceKind,
selectedRoomId,
onFilterChange,
isTrashView,
onTrashClick,
isAuthenticated,
}: FilterSidebarProps) {
const myRooms = rooms.filter((room) => !room.is_shared);
const sharedRooms = rooms.filter((room) => room.is_shared);
@@ -32,8 +39,14 @@ export default function FilterSidebar({
fontSize="sm"
href="#"
onClick={() => onFilterChange(null, "")}
color={selectedSourceKind === null ? "blue.500" : "gray.600"}
fontWeight={selectedSourceKind === null ? "bold" : "normal"}
color={
!isTrashView && selectedSourceKind === null
? "blue.500"
: "gray.600"
}
fontWeight={
!isTrashView && selectedSourceKind === null ? "bold" : "normal"
}
>
All Transcripts
</Link>
@@ -51,12 +64,16 @@ export default function FilterSidebar({
href="#"
onClick={() => onFilterChange("room", room.id)}
color={
selectedSourceKind === "room" && selectedRoomId === room.id
!isTrashView &&
selectedSourceKind === "room" &&
selectedRoomId === room.id
? "blue.500"
: "gray.600"
}
fontWeight={
selectedSourceKind === "room" && selectedRoomId === room.id
!isTrashView &&
selectedSourceKind === "room" &&
selectedRoomId === room.id
? "bold"
: "normal"
}
@@ -79,12 +96,16 @@ export default function FilterSidebar({
href="#"
onClick={() => onFilterChange("room" as SourceKind, room.id)}
color={
selectedSourceKind === "room" && selectedRoomId === room.id
!isTrashView &&
selectedSourceKind === "room" &&
selectedRoomId === room.id
? "blue.500"
: "gray.600"
}
fontWeight={
selectedSourceKind === "room" && selectedRoomId === room.id
!isTrashView &&
selectedSourceKind === "room" &&
selectedRoomId === room.id
? "bold"
: "normal"
}
@@ -101,9 +122,15 @@ export default function FilterSidebar({
as={NextLink}
href="#"
onClick={() => onFilterChange("live", "")}
color={selectedSourceKind === "live" ? "blue.500" : "gray.600"}
color={
!isTrashView && selectedSourceKind === "live"
? "blue.500"
: "gray.600"
}
_hover={{ color: "blue.300" }}
fontWeight={selectedSourceKind === "live" ? "bold" : "normal"}
fontWeight={
!isTrashView && selectedSourceKind === "live" ? "bold" : "normal"
}
fontSize="sm"
>
Live Transcripts
@@ -112,13 +139,39 @@ export default function FilterSidebar({
as={NextLink}
href="#"
onClick={() => onFilterChange("file", "")}
color={selectedSourceKind === "file" ? "blue.500" : "gray.600"}
color={
!isTrashView && selectedSourceKind === "file"
? "blue.500"
: "gray.600"
}
_hover={{ color: "blue.300" }}
fontWeight={selectedSourceKind === "file" ? "bold" : "normal"}
fontWeight={
!isTrashView && selectedSourceKind === "file" ? "bold" : "normal"
}
fontSize="sm"
>
Uploaded Files
</Link>
{isAuthenticated && (
<>
<Box borderBottomWidth="1px" my={2} />
<Link
as={NextLink}
href="#"
onClick={onTrashClick}
color={isTrashView ? "red.600" : "red.500"}
_hover={{ color: "red.400" }}
fontWeight={isTrashView ? "bold" : "normal"}
fontSize="sm"
>
<Flex align="center" gap={1}>
<LuTrash2 />
Trash
</Flex>
</Link>
</>
)}
</Stack>
</Box>
);

View File

@@ -1,17 +1,21 @@
import React from "react";
import { IconButton, Icon, Menu } from "@chakra-ui/react";
import { LuMenu, LuTrash, LuRotateCw } from "react-icons/lu";
import { IconButton, Menu } from "@chakra-ui/react";
import { LuMenu, LuTrash, LuRotateCw, LuUndo2 } from "react-icons/lu";
interface TranscriptActionsMenuProps {
transcriptId: string;
onDelete: (transcriptId: string) => void;
onReprocess: (transcriptId: string) => void;
onDelete?: (transcriptId: string) => void;
onReprocess?: (transcriptId: string) => void;
onRestore?: (transcriptId: string) => void;
onDestroy?: (transcriptId: string) => void;
}
export default function TranscriptActionsMenu({
transcriptId,
onDelete,
onReprocess,
onRestore,
onDestroy,
}: TranscriptActionsMenuProps) {
return (
<Menu.Root closeOnSelect={true} lazyMount={true}>
@@ -22,21 +26,42 @@ export default function TranscriptActionsMenu({
</Menu.Trigger>
<Menu.Positioner>
<Menu.Content>
<Menu.Item
value="reprocess"
onClick={() => onReprocess(transcriptId)}
>
<LuRotateCw /> Reprocess
</Menu.Item>
<Menu.Item
value="delete"
onClick={(e) => {
e.stopPropagation();
onDelete(transcriptId);
}}
>
<LuTrash /> Delete
</Menu.Item>
{onReprocess && (
<Menu.Item
value="reprocess"
onClick={() => onReprocess(transcriptId)}
>
<LuRotateCw /> Reprocess
</Menu.Item>
)}
{onDelete && (
<Menu.Item
value="delete"
onClick={(e) => {
e.stopPropagation();
onDelete(transcriptId);
}}
>
<LuTrash /> Delete
</Menu.Item>
)}
{onRestore && (
<Menu.Item value="restore" onClick={() => onRestore(transcriptId)}>
<LuUndo2 /> Restore
</Menu.Item>
)}
{onDestroy && (
<Menu.Item
value="destroy"
color="red.500"
onClick={(e) => {
e.stopPropagation();
onDestroy(transcriptId);
}}
>
<LuTrash /> Destroy
</Menu.Item>
)}
</Menu.Content>
</Menu.Positioner>
</Menu.Root>

View File

@@ -29,8 +29,11 @@ interface TranscriptCardsProps {
results: SearchResult[];
query: string;
isLoading?: boolean;
onDelete: (transcriptId: string) => void;
onReprocess: (transcriptId: string) => void;
isTrash?: boolean;
onDelete?: (transcriptId: string) => void;
onReprocess?: (transcriptId: string) => void;
onRestore?: (transcriptId: string) => void;
onDestroy?: (transcriptId: string) => void;
}
function highlightText(text: string, query: string): React.ReactNode {
@@ -102,13 +105,19 @@ const transcriptHref = (
function TranscriptCard({
result,
query,
isTrash,
onDelete,
onReprocess,
onRestore,
onDestroy,
}: {
result: SearchResult;
query: string;
onDelete: (transcriptId: string) => void;
onReprocess: (transcriptId: string) => void;
isTrash?: boolean;
onDelete?: (transcriptId: string) => void;
onReprocess?: (transcriptId: string) => void;
onRestore?: (transcriptId: string) => void;
onDestroy?: (transcriptId: string) => void;
}) {
const [isExpanded, setIsExpanded] = useState(false);
@@ -136,22 +145,36 @@ function TranscriptCard({
};
return (
<Box borderWidth={1} p={4} borderRadius="md" fontSize="sm">
<Box
borderWidth={1}
p={4}
borderRadius="md"
fontSize="sm"
borderLeftWidth={isTrash ? "3px" : 1}
borderLeftColor={isTrash ? "red.400" : undefined}
bg={isTrash ? "gray.50" : undefined}
>
<Flex justify="space-between" alignItems="flex-start" gap="2">
<Box>
<TranscriptStatusIcon status={result.status} />
</Box>
<Box flex="1">
{/* Title with highlighting and text fragment for deep linking */}
<Link
as={NextLink}
href={transcriptHref(result.id, mainSnippet, query)}
fontWeight="600"
display="block"
mb={2}
>
{highlightText(resultTitle, query)}
</Link>
{/* Title — plain text in trash (deleted transcripts return 404) */}
{isTrash ? (
<Text fontWeight="600" mb={2} color="gray.600">
{highlightText(resultTitle, query)}
</Text>
) : (
<Link
as={NextLink}
href={transcriptHref(result.id, mainSnippet, query)}
fontWeight="600"
display="block"
mb={2}
>
{highlightText(resultTitle, query)}
</Link>
)}
{/* Metadata - Horizontal on desktop, vertical on mobile */}
<Flex
@@ -272,8 +295,10 @@ function TranscriptCard({
</Box>
<TranscriptActionsMenu
transcriptId={result.id}
onDelete={onDelete}
onReprocess={onReprocess}
onDelete={isTrash ? undefined : onDelete}
onReprocess={isTrash ? undefined : onReprocess}
onRestore={isTrash ? onRestore : undefined}
onDestroy={isTrash ? onDestroy : undefined}
/>
</Flex>
</Box>
@@ -284,8 +309,11 @@ export default function TranscriptCards({
results,
query,
isLoading,
isTrash,
onDelete,
onReprocess,
onRestore,
onDestroy,
}: TranscriptCardsProps) {
return (
<Box position="relative">
@@ -315,8 +343,11 @@ export default function TranscriptCards({
key={result.id}
result={result}
query={query}
isTrash={isTrash}
onDelete={onDelete}
onReprocess={onReprocess}
onRestore={onRestore}
onDestroy={onDestroy}
/>
))}
</Stack>

View File

@@ -19,6 +19,7 @@ import {
parseAsStringLiteral,
} from "nuqs";
import { LuX } from "react-icons/lu";
import { toaster } from "../../components/ui/toaster";
import type { components } from "../../reflector-api";
type Room = components["schemas"]["Room"];
@@ -29,6 +30,9 @@ import {
useTranscriptsSearch,
useTranscriptDelete,
useTranscriptProcess,
useTranscriptRestore,
useTranscriptDestroy,
useAuthReady,
} from "../../lib/apiHooks";
import FilterSidebar from "./_components/FilterSidebar";
import Pagination, {
@@ -40,6 +44,7 @@ import Pagination, {
} from "./_components/Pagination";
import TranscriptCards from "./_components/TranscriptCards";
import DeleteTranscriptDialog from "./_components/DeleteTranscriptDialog";
import DestroyTranscriptDialog from "./_components/DestroyTranscriptDialog";
import { formatLocalDate } from "../../lib/time";
import { RECORD_A_MEETING_URL } from "../../api/urls";
import { useUserName } from "../../lib/useUserName";
@@ -175,14 +180,17 @@ const UnderSearchFormFilterIndicators: React.FC<{
const EmptyResult: React.FC<{
searchQuery: string;
}> = ({ searchQuery }) => {
isTrash?: boolean;
}> = ({ searchQuery, isTrash }) => {
return (
<Flex flexDir="column" alignItems="center" justifyContent="center" py={8}>
<Text textAlign="center">
{searchQuery
? `No results found for "${searchQuery}". Try adjusting your search terms.`
: "No transcripts found, but you can "}
{!searchQuery && (
{isTrash
? "Trash is empty."
: searchQuery
? `No results found for "${searchQuery}". Try adjusting your search terms.`
: "No transcripts found, but you can "}
{!isTrash && !searchQuery && (
<>
<Link href={RECORD_A_MEETING_URL} color="blue.500">
record a meeting
@@ -196,6 +204,8 @@ const EmptyResult: React.FC<{
};
export default function TranscriptBrowser() {
const { isAuthenticated } = useAuthReady();
const [urlSearchQuery, setUrlSearchQuery] = useQueryState(
"q",
parseAsString.withDefault("").withOptions({ shallow: false }),
@@ -216,6 +226,12 @@ export default function TranscriptBrowser() {
parseAsString.withDefault("").withOptions({ shallow: false }),
);
const [urlTrash, setUrlTrash] = useQueryState(
"trash",
parseAsStringLiteral(["1"] as const).withOptions({ shallow: false }),
);
const isTrashView = urlTrash === "1";
const [urlPage, setPage] = useQueryState(
"page",
parseAsInteger.withDefault(1).withOptions({ shallow: false }),
@@ -231,7 +247,7 @@ export default function TranscriptBrowser() {
return;
}
_setSafePage(maybePage.value);
}, [urlPage]);
}, [urlPage, setPage]);
const pageSize = 20;
@@ -240,11 +256,12 @@ export default function TranscriptBrowser() {
() => ({
q: urlSearchQuery,
extras: {
room_id: urlRoomId || undefined,
source_kind: urlSourceKind || undefined,
room_id: isTrashView ? undefined : urlRoomId || undefined,
source_kind: isTrashView ? undefined : urlSourceKind || undefined,
include_deleted: isTrashView ? true : undefined,
},
}),
[urlSearchQuery, urlRoomId, urlSourceKind],
[urlSearchQuery, urlRoomId, urlSourceKind, isTrashView],
);
const {
@@ -266,35 +283,55 @@ export default function TranscriptBrowser() {
const totalPages = getTotalPages(totalResults, pageSize);
// reset pagination when search results change (detected by total change; good enough approximation)
// reset pagination when search filters change
useEffect(() => {
// operation is idempotent
setPage(FIRST_PAGE).then(() => {});
}, [JSON.stringify(searchFilters)]);
}, [searchFilters, setPage]);
const userName = useUserName();
const [deletionLoading, setDeletionLoading] = useState(false);
const [actionLoading, setActionLoading] = useState(false);
const cancelRef = React.useRef(null);
const destroyCancelRef = React.useRef(null);
// Delete (soft-delete / move to trash)
const [transcriptToDeleteId, setTranscriptToDeleteId] =
React.useState<string>();
// Destroy (hard-delete)
const [transcriptToDestroyId, setTranscriptToDestroyId] =
React.useState<string>();
const handleFilterTranscripts = (
sourceKind: SourceKind | null,
roomId: string,
) => {
if (isTrashView) {
setUrlTrash(null);
}
setUrlSourceKind(sourceKind);
setUrlRoomId(roomId);
setPage(1);
};
const handleTrashClick = () => {
setUrlTrash(isTrashView ? null : "1");
setUrlSourceKind(null);
setUrlRoomId(null);
setPage(1);
};
const onCloseDeletion = () => setTranscriptToDeleteId(undefined);
const onCloseDestroy = () => setTranscriptToDestroyId(undefined);
const deleteTranscript = useTranscriptDelete();
const processTranscript = useTranscriptProcess();
const restoreTranscript = useTranscriptRestore();
const destroyTranscript = useTranscriptDestroy();
const confirmDeleteTranscript = (transcriptId: string) => {
if (deletionLoading) return;
setDeletionLoading(true);
if (actionLoading) return;
setActionLoading(true);
deleteTranscript.mutate(
{
params: {
@@ -303,12 +340,12 @@ export default function TranscriptBrowser() {
},
{
onSuccess: () => {
setDeletionLoading(false);
setActionLoading(false);
onCloseDeletion();
reloadSearch();
},
onError: () => {
setDeletionLoading(false);
setActionLoading(false);
},
},
);
@@ -322,18 +359,83 @@ export default function TranscriptBrowser() {
});
};
const handleRestoreTranscript = (transcriptId: string) => {
if (actionLoading) return;
setActionLoading(true);
restoreTranscript.mutate(
{
params: {
path: { transcript_id: transcriptId },
},
},
{
onSuccess: () => {
setActionLoading(false);
reloadSearch();
toaster.create({
duration: 3000,
render: () => (
<Box bg="green.500" color="white" px={4} py={3} borderRadius="md">
<Text fontWeight="bold">Transcript restored</Text>
</Box>
),
});
},
onError: () => {
setActionLoading(false);
},
},
);
};
const confirmDestroyTranscript = (transcriptId: string) => {
if (actionLoading) return;
setActionLoading(true);
destroyTranscript.mutate(
{
params: {
path: { transcript_id: transcriptId },
},
},
{
onSuccess: () => {
setActionLoading(false);
onCloseDestroy();
reloadSearch();
},
onError: () => {
setActionLoading(false);
},
},
);
};
// Dialog data for delete
const transcriptToDelete = results?.find(
(i) => i.id === transcriptToDeleteId,
);
const dialogTitle = transcriptToDelete?.title || "Unnamed Transcript";
const dialogDate = transcriptToDelete?.created_at
const deleteDialogTitle = transcriptToDelete?.title || "Unnamed Transcript";
const deleteDialogDate = transcriptToDelete?.created_at
? formatLocalDate(transcriptToDelete.created_at)
: undefined;
const dialogSource =
const deleteDialogSource =
transcriptToDelete?.source_kind === "room" && transcriptToDelete?.room_id
? transcriptToDelete.room_name || transcriptToDelete.room_id
: transcriptToDelete?.source_kind;
// Dialog data for destroy
const transcriptToDestroy = results?.find(
(i) => i.id === transcriptToDestroyId,
);
const destroyDialogTitle = transcriptToDestroy?.title || "Unnamed Transcript";
const destroyDialogDate = transcriptToDestroy?.created_at
? formatLocalDate(transcriptToDestroy.created_at)
: undefined;
const destroyDialogSource =
transcriptToDestroy?.source_kind === "room" && transcriptToDestroy?.room_id
? transcriptToDestroy.room_name || transcriptToDestroy.room_id
: transcriptToDestroy?.source_kind;
if (searchLoading && results.length === 0) {
return (
<Flex
@@ -361,17 +463,24 @@ export default function TranscriptBrowser() {
mb={4}
>
<Heading size="lg">
{userName ? `${userName}'s Transcriptions` : "Your Transcriptions"}{" "}
{(searchLoading || deletionLoading) && <Spinner size="sm" />}
{isTrashView
? "Trash"
: userName
? `${userName}'s Transcriptions`
: "Your Transcriptions"}{" "}
{(searchLoading || actionLoading) && <Spinner size="sm" />}
</Heading>
</Flex>
<Flex flexDir={{ base: "column", md: "row" }}>
<FilterSidebar
rooms={rooms}
selectedSourceKind={urlSourceKind}
selectedRoomId={urlRoomId}
selectedSourceKind={isTrashView ? null : urlSourceKind}
selectedRoomId={isTrashView ? "" : urlRoomId}
onFilterChange={handleFilterTranscripts}
isTrashView={isTrashView}
onTrashClick={handleTrashClick}
isAuthenticated={isAuthenticated}
/>
<Flex
@@ -384,8 +493,8 @@ export default function TranscriptBrowser() {
>
<SearchForm
setPage={setPage}
sourceKind={urlSourceKind}
roomId={urlRoomId}
sourceKind={isTrashView ? null : urlSourceKind}
roomId={isTrashView ? null : urlRoomId}
searchQuery={urlSearchQuery}
setSearchQuery={setUrlSearchQuery}
setSourceKind={setUrlSourceKind}
@@ -406,12 +515,15 @@ export default function TranscriptBrowser() {
results={results}
query={urlSearchQuery}
isLoading={searchLoading}
onDelete={setTranscriptToDeleteId}
onReprocess={handleProcessTranscript}
isTrash={isTrashView}
onDelete={isTrashView ? undefined : setTranscriptToDeleteId}
onReprocess={isTrashView ? undefined : handleProcessTranscript}
onRestore={isTrashView ? handleRestoreTranscript : undefined}
onDestroy={isTrashView ? setTranscriptToDestroyId : undefined}
/>
{!searchLoading && results.length === 0 && (
<EmptyResult searchQuery={urlSearchQuery} />
<EmptyResult searchQuery={urlSearchQuery} isTrash={isTrashView} />
)}
</Flex>
</Flex>
@@ -423,10 +535,24 @@ export default function TranscriptBrowser() {
transcriptToDeleteId && confirmDeleteTranscript(transcriptToDeleteId)
}
cancelRef={cancelRef}
isLoading={deletionLoading}
title={dialogTitle}
date={dialogDate}
source={dialogSource}
isLoading={actionLoading}
title={deleteDialogTitle}
date={deleteDialogDate}
source={deleteDialogSource}
/>
<DestroyTranscriptDialog
isOpen={!!transcriptToDestroyId}
onClose={onCloseDestroy}
onConfirm={() =>
transcriptToDestroyId &&
confirmDestroyTranscript(transcriptToDestroyId)
}
cancelRef={destroyCancelRef}
isLoading={actionLoading}
title={destroyDialogTitle}
date={destroyDialogDate}
source={destroyDialogSource}
/>
</Flex>
);
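For reference, a hedged sketch of the `?trash=1` wiring in isolation, assuming nuqs' parseAsStringLiteral as used in the page above. The literal parser accepts only "1", so any other value in the URL parses to null and the normal list renders.

import { useQueryState, parseAsStringLiteral } from "nuqs";

export function useTrashView() {
  const [trash, setTrash] = useQueryState(
    "trash",
    parseAsStringLiteral(["1"] as const).withOptions({ shallow: false }),
  );
  const isTrashView = trash === "1";
  // Setting null removes the param from the URL entirely, so non-trash
  // URLs stay clean instead of carrying trash=0.
  const toggle = () => setTrash(isTrashView ? null : "1");
  return { isTrashView, toggle };
}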

View File

@@ -178,14 +178,11 @@ export default function ShareAndPrivacy(props: ShareAndPrivacyProps) {
<ShareZulip
transcript={props.transcript}
topics={props.topics}
disabled={toShareMode(shareMode.value) === "private"}
disabled={toShareMode(shareMode.value) === "public"}
/>
)}
{emailEnabled && (
<ShareEmail
transcript={props.transcript}
disabled={toShareMode(shareMode.value) === "private"}
/>
<ShareEmail transcript={props.transcript} disabled={false} />
)}
<ShareCopy
finalSummaryElement={props.finalSummaryElement}

View File

@@ -212,8 +212,13 @@ export default function DailyRoom({ meeting, room }: DailyRoomProps) {
const showConsentModalRef = useRef(showConsentModal);
showConsentModalRef.current = showConsentModal;
const userEmail =
auth.status === "authenticated" || auth.status === "refreshing"
? auth.user.email
: null;
const { showEmailModal } = useEmailTranscriptDialog({
meetingId: assertMeetingId(meeting.id),
userEmail,
});
const showEmailModalRef = useRef(showEmailModal);
showEmailModalRef.current = showEmailModal;

View File

@@ -136,6 +136,7 @@ export function UserEventsProvider({
switch (msg.event) {
case "TRANSCRIPT_CREATED":
case "TRANSCRIPT_DELETED":
case "TRANSCRIPT_RESTORED":
case "TRANSCRIPT_STATUS":
case "TRANSCRIPT_FINAL_TITLE":
case "TRANSCRIPT_DURATION":

View File

@@ -57,6 +57,7 @@ export function useTranscriptsSearch(
offset?: number;
room_id?: string;
source_kind?: SourceKind;
include_deleted?: boolean;
} = {},
) {
return $api.useQuery(
@@ -70,6 +71,7 @@ export function useTranscriptsSearch(
offset: options.offset,
room_id: options.room_id,
source_kind: options.source_kind,
include_deleted: options.include_deleted,
},
},
},
@@ -105,6 +107,38 @@ export function useTranscriptProcess() {
});
}
export function useTranscriptRestore() {
const { setError } = useError();
const queryClient = useQueryClient();
return $api.useMutation("post", "/v1/transcripts/{transcript_id}/restore", {
onSuccess: () => {
return queryClient.invalidateQueries({
queryKey: ["get", TRANSCRIPT_SEARCH_URL],
});
},
onError: (error) => {
setError(error as Error, "There was an error restoring the transcript");
},
});
}
export function useTranscriptDestroy() {
const { setError } = useError();
const queryClient = useQueryClient();
return $api.useMutation("delete", "/v1/transcripts/{transcript_id}/destroy", {
onSuccess: () => {
return queryClient.invalidateQueries({
queryKey: ["get", TRANSCRIPT_SEARCH_URL],
});
},
onError: (error) => {
setError(error as Error, "There was an error destroying the transcript");
},
});
}
const ACTIVE_TRANSCRIPT_STATUSES = new Set<TranscriptStatus>([
"processing",
"uploaded",

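A minimal usage sketch for the two new mutation hooks, mirroring the mutate call shape from the page component above; the wrapper name and the reloadSearch callback are illustrative.

import { useTranscriptRestore, useTranscriptDestroy } from "../../lib/apiHooks";

export function useTrashActions(reloadSearch: () => void) {
  const restore = useTranscriptRestore();
  const destroy = useTranscriptDestroy();

  // Both endpoints take the transcript id as a path parameter, and the
  // hooks already invalidate the search query on success; reloadSearch
  // here just forces an immediate refetch of the current page.
  const restoreOne = (transcriptId: string) =>
    restore.mutate(
      { params: { path: { transcript_id: transcriptId } } },
      { onSuccess: () => reloadSearch() },
    );

  const destroyOne = (transcriptId: string) =>
    destroy.mutate(
      { params: { path: { transcript_id: transcriptId } } },
      { onSuccess: () => reloadSearch() },
    );

  return { restoreOne, destroyOne };
}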
View File

@@ -6,13 +6,15 @@ import { Box, Button, Input, Text, VStack, HStack } from "@chakra-ui/react";
interface EmailTranscriptDialogProps {
onSubmit: (email: string) => void;
onDismiss: () => void;
initialEmail?: string;
}
export function EmailTranscriptDialog({
onSubmit,
onDismiss,
initialEmail,
}: EmailTranscriptDialogProps) {
const [email, setEmail] = useState("");
const [email, setEmail] = useState(initialEmail ?? "");
const [inputEl, setInputEl] = useState<HTMLInputElement | null>(null);
useEffect(() => {

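One caveat worth noting about useState(initialEmail ?? ""): the initializer runs only on first mount, so a changed initialEmail will not update an already-mounted dialog. That appears fine here, since the toaster renders a fresh dialog each time it is shown, but if the prefill ever had to track a changing prop, remounting via key is one option. A sketch under that assumption; the host component and import path are illustrative:

import React from "react";
import { EmailTranscriptDialog } from "./EmailTranscriptDialog"; // path illustrative

function EmailDialogHost({ userEmail }: { userEmail: string | null }) {
  return (
    <EmailTranscriptDialog
      key={userEmail ?? "anonymous"} // new key forces a remount, re-seeding the field
      initialEmail={userEmail ?? undefined}
      onSubmit={() => {}}
      onDismiss={() => {}}
    />
  );
}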
View File

@@ -11,10 +11,12 @@ const TOAST_CHECK_INTERVAL_MS = 100;
type UseEmailTranscriptDialogParams = {
meetingId: MeetingId;
userEmail?: string | null;
};
export function useEmailTranscriptDialog({
meetingId,
userEmail,
}: UseEmailTranscriptDialogParams) {
const [modalOpen, setModalOpen] = useState(false);
const addEmailMutation = useMeetingAddEmailRecipient();
@@ -83,6 +85,7 @@ export function useEmailTranscriptDialog({
duration: null,
render: ({ dismiss }) => (
<EmailTranscriptDialog
initialEmail={userEmail ?? undefined}
onSubmit={(email) => {
handleSubmitEmail(email);
dismiss();
@@ -120,7 +123,7 @@ export function useEmailTranscriptDialog({
}
}, TOAST_CHECK_INTERVAL_MS);
});
}, [handleSubmitEmail, modalOpen]);
}, [handleSubmitEmail, modalOpen, userEmail]);
return {
showEmailModal,

View File

@@ -388,6 +388,46 @@ export interface paths {
patch: operations["v1_transcript_update"];
trace?: never;
};
"/v1/transcripts/{transcript_id}/restore": {
parameters: {
query?: never;
header?: never;
path?: never;
cookie?: never;
};
get?: never;
put?: never;
/**
* Transcript Restore
* @description Restore a soft-deleted transcript.
*/
post: operations["v1_transcript_restore"];
delete?: never;
options?: never;
head?: never;
patch?: never;
trace?: never;
};
"/v1/transcripts/{transcript_id}/destroy": {
parameters: {
query?: never;
header?: never;
path?: never;
cookie?: never;
};
get?: never;
put?: never;
post?: never;
/**
* Transcript Destroy
* @description Permanently delete a transcript and all associated files.
*/
delete: operations["v1_transcript_destroy"];
options?: never;
head?: never;
patch?: never;
trace?: never;
};
"/v1/transcripts/{transcript_id}/topics": {
parameters: {
query?: never;
@@ -2391,6 +2431,14 @@ export interface components {
*/
title: string;
};
/** UserTranscriptRestoredData */
UserTranscriptRestoredData: {
/**
* Id
* @description A non-empty string
*/
id: string;
};
/** UserTranscriptStatusData */
UserTranscriptStatusData: {
/**
@@ -2446,6 +2494,15 @@ export interface components {
event: "TRANSCRIPT_FINAL_TITLE";
data: components["schemas"]["UserTranscriptFinalTitleData"];
};
/** UserWsTranscriptRestored */
UserWsTranscriptRestored: {
/**
* @description discriminator enum property added by openapi-typescript
* @enum {string}
*/
event: "TRANSCRIPT_RESTORED";
data: components["schemas"]["UserTranscriptRestoredData"];
};
/** UserWsTranscriptStatus */
UserWsTranscriptStatus: {
/**
@@ -3293,6 +3350,7 @@ export interface operations {
from?: string | null;
/** @description Filter transcripts created on or before this datetime (ISO 8601 with timezone) */
to?: string | null;
include_deleted?: boolean;
};
header?: never;
path?: never;
@@ -3427,6 +3485,68 @@ export interface operations {
};
};
};
v1_transcript_restore: {
parameters: {
query?: never;
header?: never;
path: {
transcript_id: string;
};
cookie?: never;
};
requestBody?: never;
responses: {
/** @description Successful Response */
200: {
headers: {
[name: string]: unknown;
};
content: {
"application/json": components["schemas"]["DeletionStatus"];
};
};
/** @description Validation Error */
422: {
headers: {
[name: string]: unknown;
};
content: {
"application/json": components["schemas"]["HTTPValidationError"];
};
};
};
};
v1_transcript_destroy: {
parameters: {
query?: never;
header?: never;
path: {
transcript_id: string;
};
cookie?: never;
};
requestBody?: never;
responses: {
/** @description Successful Response */
200: {
headers: {
[name: string]: unknown;
};
content: {
"application/json": components["schemas"]["DeletionStatus"];
};
};
/** @description Validation Error */
422: {
headers: {
[name: string]: unknown;
};
content: {
"application/json": components["schemas"]["HTTPValidationError"];
};
};
};
};
v1_transcript_get_topics: {
parameters: {
query?: never;
@@ -3995,9 +4115,7 @@ export interface operations {
};
v1_transcript_get_video_url: {
parameters: {
query?: {
token?: string | null;
};
query?: never;
header?: never;
path: {
transcript_id: string;
@@ -4254,6 +4372,7 @@ export interface operations {
"application/json":
| components["schemas"]["UserWsTranscriptCreated"]
| components["schemas"]["UserWsTranscriptDeleted"]
| components["schemas"]["UserWsTranscriptRestored"]
| components["schemas"]["UserWsTranscriptStatus"]
| components["schemas"]["UserWsTranscriptFinalTitle"]
| components["schemas"]["UserWsTranscriptDuration"];

View File

@@ -1,4 +1,4 @@
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="white" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
<rect x="2" y="4" width="20" height="16" rx="2"/>
<path d="m22 7-8.97 5.7a1.94 1.94 0 0 1-2.06 0L2 7"/>
</svg>

Before: 274 B
After: 267 B