Compare commits

...

20 Commits

Author SHA1 Message Date
34a3f5618c chore(main): release 0.17.0 (#717) 2025-11-12 21:25:59 -05:00
Igor Monadical
1473fd82dc feat: daily.co support as alternative to whereby (#691)
* llm instructions

* vibe dailyco

* vibe dailyco

* doc update (vibe)

* dont show recording ui on call

* stub processor (vibe)

* stub processor (vibe) self-review

* stub processor (vibe) self-review

* chore(main): release 0.14.0 (#670)

* Add multitrack pipeline

* Mixdown audio tracks

* Mixdown with pyav filter graph

* Trigger multitrack processing for daily recordings

* apply platform from envs in priority: non-dry

* Use explicit track keys for processing

* Align tracks of a multitrack recording

* Generate waveforms for the mixed audio

* Emit multriack pipeline events

* Fix multitrack pipeline track alignment

* dailico docs

* Enable multitrack reprocessing

* modal temp files uniform names, cleanup. remove llm temporary docs

* docs cleanup

* dont proceed with raw recordings if any of the downloads fail

* dry transcription pipelines

* remove is_miltitrack

* comments

* explicit dailyco room name

* docs

* remove stub data/method

* frontend daily/whereby code self-review (no-mistake)

* frontend daily/whereby code self-review (no-mistakes)

* frontend daily/whereby code self-review (no-mistakes)

* consent cleanup for multitrack (no-mistakes)

* llm fun

* remove extra comments

* fix tests

* merge migrations

* Store participant names

* Get participants by meeting session id

* pop back main branch migration

* s3 paddington (no-mistakes)

* comment

* pr comments

* pr comments

* pr comments

* platform / meeting cleanup

* Use participant names in summary generation

* platform assignment to meeting at controller level

* pr comment

* room playform properly default none

* room playform properly default none

* restore migration lost

* streaming WIP

* extract storage / use common storage / proper env vars for storage

* fix mocks tests

* remove fall back

* streaming for multifile

* cenrtal storage abstraction (no-mistakes)

* remove dead code / vars

* Set participant user id for authenticated users

* whereby recording name parsing fix

* whereby recording name parsing fix

* more file stream

* storage dry + tests

* remove homemade boto3 streaming and use proper boto

* update migration guide

* webhook creation script - print uuid

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
Co-authored-by: Mathieu Virbel <mat@meltingrocks.com>
Co-authored-by: Sergey Mankovsky <sergey@monadical.com>
2025-11-12 21:21:16 -05:00
372202b0e1 feat: add API key management UI (#716)
* feat: add API key management UI

- Created settings page for users to create, view, and delete API keys
- Added Settings link to app navigation header
- Fixed delete operation return value handling in backend to properly handle asyncpg's None response

* feat: replace browser confirm with dialog for API key deletion

- Added Chakra UI Dialog component for better UX when confirming API key deletion
- Implemented proper focus management with cancelRef for accessibility
- Replaced native browser confirm() with controlled dialog state

* style: format API keys page with consistent line breaks

* feat: auto-select API key text for easier copying

- Added automatic text selection after API key creation to streamline the copy workflow
- Applied className to Code component for DOM targeting

* feat: improve API keys page layout and responsiveness

- Reduced max width from 1200px to 800px for better readability
- Added explicit width constraint to ensure consistent sizing across viewports

* refactor: remove redundant comments from API keys page
2025-11-10 18:25:08 -05:00
Igor Monadical
d20aac66c4 ui search pagination 2+page re-search fix (#714)
Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-11-10 14:18:41 -05:00
dc4b737daa chore(main): release 0.16.0 (#711) 2025-10-24 16:18:49 -06:00
Igor Monadical
0baff7abf7 transcript ui copy button placement (#712)
Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-10-24 16:52:02 -04:00
Igor Monadical
962c40e2b6 feat: search date filter (#710)
* search date filter

* search date filter

* search date filter

* search date filter

* pr comment

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-10-23 20:16:43 -04:00
Igor Monadical
3c4b9f2103 chore: error reporting and naming (#708)
* chore: error reporting and naming

* chore: error reporting and naming

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-10-22 13:45:08 -04:00
Igor Monadical
c6c035aacf removal of email-verified from /me (#707)
Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-10-21 14:49:33 -04:00
c086b91445 chore(main): release 0.15.0 (#706) 2025-10-21 08:30:22 -06:00
Igor Monadical
9a258abc02 feat: api tokens (#705)
* feat: api tokens (vibe)

* self-review

* remove token terminology + pr comments (vibe)

* return email_verified

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-10-20 12:55:25 -04:00
af86c47f1d chore(main): release 0.14.0 (#670) 2025-10-08 14:57:31 -06:00
5f6910e513 feat: Add calendar event data to transcript webhook payload (#689)
* feat: add calendar event data to transcript webhook payload and implement get_by_id method

* Update server/reflector/worker/webhook.py

Co-authored-by: pr-agent-monadical[bot] <198624643+pr-agent-monadical[bot]@users.noreply.github.com>

* Update server/reflector/worker/webhook.py

Co-authored-by: pr-agent-monadical[bot] <198624643+pr-agent-monadical[bot]@users.noreply.github.com>

* style: format conditional time fields with line breaks for better readability

* docs: add calendar event fields to transcript.completed webhook payload schema

---------

Co-authored-by: pr-agent-monadical[bot] <198624643+pr-agent-monadical[bot]@users.noreply.github.com>
2025-10-08 11:11:57 -05:00
9a71af145e fix: update transcript list on reprocess (#676)
* Update transcript list on reprocess

* Fix transcript create

* Fix multiple sockets issue

* Pass token in sec websocket protocol

* userEvent parse example

* transcript list invalidation non-abstraction

* Emit only relevant events to the user room

* Add ws close code const

* Refactor user websocket endpoint

* Refactor user events provider

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-10-07 19:11:30 +02:00
eef6dc3903 fix: upgrade nemo toolkit (#678) 2025-10-07 16:45:02 +02:00
Igor Monadical
1dee255fed parakeet endpoint doc (#679)
Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-10-07 10:41:01 -04:00
5d98754305 fix: security review (#656)
* Add security review doc

* Add tests to reproduce security issues

* Fix security issues

* Fix tests

* Set auth auth backend for tests

* Fix ics api tests

* Fix transcript mutate check

* Update frontent env var names

* Remove permissions doc
2025-09-29 23:07:49 +02:00
Igor Monadical
969bd84fcc feat: container build for www / github (#672)
Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-09-24 12:27:45 -04:00
Igor Monadical
36608849ec fix: restore feature boolean logic (#671)
Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-09-24 11:57:49 -04:00
Igor Monadical
5bf64b5a41 feat: docker-compose for production frontend (#664)
* docker-compose for production frontend

* fix: Remove external Redis port mapping for Coolify compatibility

Redis should only be accessible within the internal Docker network in Coolify deployments to avoid port conflicts with other applications.

* fix: Remove external port mapping for web service in Coolify

Coolify handles port exposure through its proxy (Traefik), so services should not expose ports directly in the docker-compose file.

* server side client envs

* missing vars

* nextjs experimental

* fix claude 'fix'

* remove build env vars compose

* docker

* remove ports for coolify

* review

* cleanup

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-09-24 11:15:27 -04:00
135 changed files with 8017 additions and 804 deletions

View File

@@ -1,4 +1,4 @@
name: Deploy to Amazon ECS
name: Build container/push to container registry
on: [workflow_dispatch]

57
.github/workflows/docker-frontend.yml vendored Normal file
View File

@@ -0,0 +1,57 @@
name: Build and Push Frontend Docker Image
on:
push:
branches:
- main
paths:
- 'www/**'
- '.github/workflows/docker-frontend.yml'
workflow_dispatch:
env:
REGISTRY: ghcr.io
IMAGE_NAME: ${{ github.repository }}-frontend
jobs:
build-and-push:
runs-on: ubuntu-latest
permissions:
contents: read
packages: write
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Log in to GitHub Container Registry
uses: docker/login-action@v3
with:
registry: ${{ env.REGISTRY }}
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Extract metadata
id: meta
uses: docker/metadata-action@v5
with:
images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
tags: |
type=ref,event=branch
type=sha,prefix={{branch}}-
type=raw,value=latest,enable={{is_default_branch}}
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Build and push Docker image
uses: docker/build-push-action@v5
with:
context: ./www
file: ./www/Dockerfile
push: true
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
cache-from: type=gha
cache-to: type=gha,mode=max
platforms: linux/amd64,linux/arm64

View File

@@ -1,5 +1,44 @@
# Changelog
## [0.17.0](https://github.com/Monadical-SAS/reflector/compare/v0.16.0...v0.17.0) (2025-11-13)
### Features
* add API key management UI ([#716](https://github.com/Monadical-SAS/reflector/issues/716)) ([372202b](https://github.com/Monadical-SAS/reflector/commit/372202b0e1a86823900b0aa77be1bfbc2893d8a1))
* daily.co support as alternative to whereby ([#691](https://github.com/Monadical-SAS/reflector/issues/691)) ([1473fd8](https://github.com/Monadical-SAS/reflector/commit/1473fd82dc472c394cbaa2987212ad662a74bcac))
## [0.16.0](https://github.com/Monadical-SAS/reflector/compare/v0.15.0...v0.16.0) (2025-10-24)
### Features
* search date filter ([#710](https://github.com/Monadical-SAS/reflector/issues/710)) ([962c40e](https://github.com/Monadical-SAS/reflector/commit/962c40e2b6428ac42fd10aea926782d7a6f3f902))
## [0.15.0](https://github.com/Monadical-SAS/reflector/compare/v0.14.0...v0.15.0) (2025-10-20)
### Features
* api tokens ([#705](https://github.com/Monadical-SAS/reflector/issues/705)) ([9a258ab](https://github.com/Monadical-SAS/reflector/commit/9a258abc0209b0ac3799532a507ea6a9125d703a))
## [0.14.0](https://github.com/Monadical-SAS/reflector/compare/v0.13.1...v0.14.0) (2025-10-08)
### Features
* Add calendar event data to transcript webhook payload ([#689](https://github.com/Monadical-SAS/reflector/issues/689)) ([5f6910e](https://github.com/Monadical-SAS/reflector/commit/5f6910e5131b7f28f86c9ecdcc57fed8412ee3cd))
* container build for www / github ([#672](https://github.com/Monadical-SAS/reflector/issues/672)) ([969bd84](https://github.com/Monadical-SAS/reflector/commit/969bd84fcc14851d1a101412a0ba115f1b7cde82))
* docker-compose for production frontend ([#664](https://github.com/Monadical-SAS/reflector/issues/664)) ([5bf64b5](https://github.com/Monadical-SAS/reflector/commit/5bf64b5a41f64535e22849b4bb11734d4dbb4aae))
### Bug Fixes
* restore feature boolean logic ([#671](https://github.com/Monadical-SAS/reflector/issues/671)) ([3660884](https://github.com/Monadical-SAS/reflector/commit/36608849ec64e953e3be456172502762e3c33df9))
* security review ([#656](https://github.com/Monadical-SAS/reflector/issues/656)) ([5d98754](https://github.com/Monadical-SAS/reflector/commit/5d98754305c6c540dd194dda268544f6d88bfaf8))
* update transcript list on reprocess ([#676](https://github.com/Monadical-SAS/reflector/issues/676)) ([9a71af1](https://github.com/Monadical-SAS/reflector/commit/9a71af145ee9b833078c78d0c684590ab12e9f0e))
* upgrade nemo toolkit ([#678](https://github.com/Monadical-SAS/reflector/issues/678)) ([eef6dc3](https://github.com/Monadical-SAS/reflector/commit/eef6dc39037329b65804297786d852dddb0557f9))
## [0.13.1](https://github.com/Monadical-SAS/reflector/compare/v0.13.0...v0.13.1) (2025-09-22)

View File

@@ -151,7 +151,7 @@ All endpoints prefixed `/v1/`:
**Frontend** (`www/.env`):
- `NEXTAUTH_URL`, `NEXTAUTH_SECRET` - Authentication configuration
- `NEXT_PUBLIC_REFLECTOR_API_URL` - Backend API endpoint
- `REFLECTOR_API_URL` - Backend API endpoint
- `REFLECTOR_DOMAIN_CONFIG` - Feature flags and domain settings
## Testing Strategy

View File

@@ -168,6 +168,13 @@ You can manually process an audio file by calling the process tool:
uv run python -m reflector.tools.process path/to/audio.wav
```
## Build-time env variables
Next.js projects are more used to NEXT_PUBLIC_ prefixed buildtime vars. We don't have those for the reason we need to serve a ccustomizable prebuild docker container.
Instead, all the variables are runtime. Variables needed to the frontend are served to the frontend app at initial render.
It also means there's no static prebuild and no static files to serve for js/html.
## Feature Flags
@@ -177,24 +184,24 @@ Reflector uses environment variable-based feature flags to control application f
| Feature Flag | Environment Variable |
|-------------|---------------------|
| `requireLogin` | `NEXT_PUBLIC_FEATURE_REQUIRE_LOGIN` |
| `privacy` | `NEXT_PUBLIC_FEATURE_PRIVACY` |
| `browse` | `NEXT_PUBLIC_FEATURE_BROWSE` |
| `sendToZulip` | `NEXT_PUBLIC_FEATURE_SEND_TO_ZULIP` |
| `rooms` | `NEXT_PUBLIC_FEATURE_ROOMS` |
| `requireLogin` | `FEATURE_REQUIRE_LOGIN` |
| `privacy` | `FEATURE_PRIVACY` |
| `browse` | `FEATURE_BROWSE` |
| `sendToZulip` | `FEATURE_SEND_TO_ZULIP` |
| `rooms` | `FEATURE_ROOMS` |
### Setting Feature Flags
Feature flags are controlled via environment variables using the pattern `NEXT_PUBLIC_FEATURE_{FEATURE_NAME}` where `{FEATURE_NAME}` is the SCREAMING_SNAKE_CASE version of the feature name.
Feature flags are controlled via environment variables using the pattern `FEATURE_{FEATURE_NAME}` where `{FEATURE_NAME}` is the SCREAMING_SNAKE_CASE version of the feature name.
**Examples:**
```bash
# Enable user authentication requirement
NEXT_PUBLIC_FEATURE_REQUIRE_LOGIN=true
FEATURE_REQUIRE_LOGIN=true
# Disable browse functionality
NEXT_PUBLIC_FEATURE_BROWSE=false
FEATURE_BROWSE=false
# Enable Zulip integration
NEXT_PUBLIC_FEATURE_SEND_TO_ZULIP=true
FEATURE_SEND_TO_ZULIP=true
```

39
docker-compose.prod.yml Normal file
View File

@@ -0,0 +1,39 @@
# Production Docker Compose configuration for Frontend
# Usage: docker compose -f docker-compose.prod.yml up -d
services:
web:
build:
context: ./www
dockerfile: Dockerfile
image: reflector-frontend:latest
environment:
- KV_URL=${KV_URL:-redis://redis:6379}
- SITE_URL=${SITE_URL}
- API_URL=${API_URL}
- WEBSOCKET_URL=${WEBSOCKET_URL}
- NEXTAUTH_URL=${NEXTAUTH_URL:-http://localhost:3000}
- NEXTAUTH_SECRET=${NEXTAUTH_SECRET:-changeme-in-production}
- AUTHENTIK_ISSUER=${AUTHENTIK_ISSUER}
- AUTHENTIK_CLIENT_ID=${AUTHENTIK_CLIENT_ID}
- AUTHENTIK_CLIENT_SECRET=${AUTHENTIK_CLIENT_SECRET}
- AUTHENTIK_REFRESH_TOKEN_URL=${AUTHENTIK_REFRESH_TOKEN_URL}
- SENTRY_DSN=${SENTRY_DSN}
- SENTRY_IGNORE_API_RESOLUTION_ERROR=${SENTRY_IGNORE_API_RESOLUTION_ERROR:-1}
depends_on:
- redis
restart: unless-stopped
redis:
image: redis:7.2-alpine
restart: unless-stopped
healthcheck:
test: ["CMD", "redis-cli", "ping"]
interval: 30s
timeout: 3s
retries: 3
volumes:
- redis_data:/data
volumes:
redis_data:

View File

@@ -39,7 +39,7 @@ services:
ports:
- 6379:6379
web:
image: node:18
image: node:22-alpine
ports:
- "3000:3000"
command: sh -c "corepack enable && pnpm install && pnpm dev"
@@ -50,6 +50,8 @@ services:
- /app/node_modules
env_file:
- ./www/.env.local
environment:
- NODE_ENV=development
postgres:
image: postgres:17

View File

@@ -77,7 +77,7 @@ image = (
.pip_install(
"hf_transfer==0.1.9",
"huggingface_hub[hf-xet]==0.31.2",
"nemo_toolkit[asr]==2.3.0",
"nemo_toolkit[asr]==2.5.0",
"cuda-python==12.8.0",
"fastapi==0.115.12",
"numpy<2",

View File

@@ -1,3 +1,29 @@
## API Key Management
### Finding Your User ID
```bash
# Get your OAuth sub (user ID) - requires authentication
curl -H "Authorization: Bearer <your_jwt>" http://localhost:1250/v1/me
# Returns: {"sub": "your-oauth-sub-here", "email": "...", ...}
```
### Creating API Keys
```bash
curl -X POST http://localhost:1250/v1/user/api-keys \
-H "Authorization: Bearer <your_jwt>" \
-H "Content-Type: application/json" \
-d '{"name": "My API Key"}'
```
### Using API Keys
```bash
# Use X-API-Key header instead of Authorization
curl -H "X-API-Key: <your_api_key>" http://localhost:1250/v1/transcripts
```
## AWS S3/SQS usage clarification
Whereby.com uploads recordings directly to our S3 bucket when meetings end.

View File

@@ -0,0 +1,234 @@
# Reflector Architecture: Whereby + Daily.co Recording Storage
## System Overview
```mermaid
graph TB
subgraph "Actors"
APP[Our App<br/>Reflector]
WHEREBY[Whereby Service<br/>External]
DAILY[Daily.co Service<br/>External]
end
subgraph "AWS S3 Buckets"
TRANSCRIPT_BUCKET[Transcript Bucket<br/>reflector-transcripts<br/>Output: Processed MP3s]
WHEREBY_BUCKET[Whereby Bucket<br/>reflector-whereby-recordings<br/>Input: Raw MP4s]
DAILY_BUCKET[Daily.co Bucket<br/>reflector-dailyco-recordings<br/>Input: Raw WebM tracks]
end
subgraph "AWS Infrastructure"
SQS[SQS Queue<br/>Whereby notifications]
end
subgraph "Database"
DB[(PostgreSQL<br/>Recordings, Transcripts, Meetings)]
end
APP -->|Write processed| TRANSCRIPT_BUCKET
APP -->|Read/Delete| WHEREBY_BUCKET
APP -->|Read/Delete| DAILY_BUCKET
APP -->|Poll| SQS
APP -->|Store metadata| DB
WHEREBY -->|Write recordings| WHEREBY_BUCKET
WHEREBY_BUCKET -->|S3 Event| SQS
WHEREBY -->|Participant webhooks<br/>room.client.joined/left| APP
DAILY -->|Write recordings| DAILY_BUCKET
DAILY -->|Recording webhook<br/>recording.ready-to-download| APP
```
**Note on Webhook vs S3 Event for Recording Processing:**
- **Whereby**: Uses S3 Events → SQS for recording availability (S3 as source of truth, no race conditions)
- **Daily.co**: Uses webhooks for recording availability (more immediate, built-in reliability)
- **Both**: Use webhooks for participant tracking (real-time updates)
## Credentials & Permissions
```mermaid
graph LR
subgraph "Master Credentials"
MASTER[TRANSCRIPT_STORAGE_AWS_*<br/>Access Key ID + Secret]
end
subgraph "Whereby Upload Credentials"
WHEREBY_CREDS[AWS_WHEREBY_ACCESS_KEY_*<br/>Access Key ID + Secret]
end
subgraph "Daily.co Upload Role"
DAILY_ROLE[DAILY_STORAGE_AWS_ROLE_ARN<br/>IAM Role ARN]
end
subgraph "Our App Uses"
MASTER -->|Read/Write/Delete| TRANSCRIPT_BUCKET[Transcript Bucket]
MASTER -->|Read/Delete| WHEREBY_BUCKET[Whereby Bucket]
MASTER -->|Read/Delete| DAILY_BUCKET[Daily.co Bucket]
MASTER -->|Poll/Delete| SQS[SQS Queue]
end
subgraph "We Give To Services"
WHEREBY_CREDS -->|Passed in API call| WHEREBY_SERVICE[Whereby Service]
WHEREBY_SERVICE -->|Write Only| WHEREBY_BUCKET
DAILY_ROLE -->|Passed in API call| DAILY_SERVICE[Daily.co Service]
DAILY_SERVICE -->|Assume Role| DAILY_ROLE
DAILY_SERVICE -->|Write Only| DAILY_BUCKET
end
```
# Video Platform Recording Integration
This document explains how Reflector receives and identifies multitrack audio recordings from different video platforms.
## Platform Comparison
| Platform | Delivery Method | Track Identification |
|----------|----------------|---------------------|
| **Daily.co** | Webhook | Explicit track list in payload |
| **Whereby** | SQS (S3 notifications) | Single file per notification |
---
## Daily.co (Webhook-based)
Daily.co uses **webhooks** to notify Reflector when recordings are ready.
### How It Works
1. **Daily.co sends webhook** when recording is ready
- Event type: `recording.ready-to-download`
- Endpoint: `/v1/daily/webhook` (`reflector/views/daily.py:46-102`)
2. **Webhook payload explicitly includes track list**:
```json
{
"recording_id": "7443ee0a-dab1-40eb-b316-33d6c0d5ff88",
"room_name": "daily-20251020193458",
"tracks": [
{
"type": "audio",
"s3Key": "monadical/daily-20251020193458/1760988935484-52f7f48b-fbab-431f-9a50-87b9abfc8255-cam-audio-1760988935922",
"size": 831843
},
{
"type": "audio",
"s3Key": "monadical/daily-20251020193458/1760988935484-a37c35e3-6f8e-4274-a482-e9d0f102a732-cam-audio-1760988943823",
"size": 408438
},
{
"type": "video",
"s3Key": "monadical/daily-20251020193458/...-video.webm",
"size": 30000000
}
]
}
```
3. **System extracts audio tracks** (`daily.py:211`):
```python
track_keys = [t.s3Key for t in tracks if t.type == "audio"]
```
4. **Triggers multitrack processing** (`daily.py:213-218`):
```python
process_multitrack_recording.delay(
bucket_name=bucket_name, # reflector-dailyco-local
room_name=room_name, # daily-20251020193458
recording_id=recording_id, # 7443ee0a-dab1-40eb-b316-33d6c0d5ff88
track_keys=track_keys # Only audio s3Keys
)
```
### Key Advantage: No Ambiguity
Even though multiple meetings may share the same S3 bucket/folder (`monadical/`), **there's no ambiguity** because:
- Each webhook payload contains the exact `s3Key` list for that specific `recording_id`
- No need to scan folders or guess which files belong together
- Each track's s3Key includes the room timestamp subfolder (e.g., `daily-20251020193458/`)
The room name includes timestamp (`daily-20251020193458`) to keep recordings organized, but **the webhook's explicit track list is what prevents mixing files from different meetings**.
### Track Timeline Extraction
Daily.co provides timing information in two places:
**1. PyAV WebM Metadata (current approach)**:
```python
# Read from WebM container stream metadata
stream.start_time = 8.130s # Meeting-relative timing
```
**2. Filename Timestamps (alternative approach, commit 3bae9076)**:
```
Filename format: {recording_start_ts}-{uuid}-cam-audio-{track_start_ts}.webm
Example: 1760988935484-52f7f48b-fbab-431f-9a50-87b9abfc8255-cam-audio-1760988935922.webm
Parse timestamps:
- recording_start_ts: 1760988935484 (Unix ms)
- track_start_ts: 1760988935922 (Unix ms)
- offset: (1760988935922 - 1760988935484) / 1000 = 0.438s
```
**Time Difference (PyAV vs Filename)**:
```
Track 0:
Filename offset: 438ms
PyAV metadata: 229ms
Difference: 209ms
Track 1:
Filename offset: 8339ms
PyAV metadata: 8130ms
Difference: 209ms
```
**Consistent 209ms delta** suggests network/encoding delay between file upload initiation (filename) and actual audio stream start (metadata).
**Current implementation uses PyAV metadata** because:
- More accurate (represents when audio actually started)
- Padding BEFORE transcription produces correct Whisper timestamps automatically
- No manual offset adjustment needed during transcript merge
### Why Re-encoding During Padding
Padding coincidentally involves re-encoding, which is important for Daily.co + Whisper:
**Problem:** Daily.co skips frames in recordings when microphone is muted or paused
- WebM containers have gaps where audio frames should be
- Whisper doesn't understand these gaps and produces incorrect timestamps
- Example: 5s of audio with 2s muted → file has frames only for 3s, Whisper thinks duration is 3s
**Solution:** Re-encoding via PyAV filter graph (`adelay` + `aresample`)
- Restores missing frames as silence
- Produces continuous audio stream without gaps
- Whisper now sees correct duration and produces accurate timestamps
**Why combined with padding:**
- Already re-encoding for padding (adding initial silence)
- More performant to do both operations in single PyAV pipeline
- Padded values needed for mixdown anyway (creating final MP3)
Implementation: `main_multitrack_pipeline.py:_apply_audio_padding_streaming()`
---
## Whereby (SQS-based)
Whereby uses **AWS SQS** (via S3 notifications) to notify Reflector when files are uploaded.
### How It Works
1. **Whereby uploads recording** to S3
2. **S3 sends notification** to SQS queue (one notification per file)
3. **Reflector polls SQS queue** (`worker/process.py:process_messages()`)
4. **System processes single file** (`worker/process.py:process_recording()`)
### Key Difference from Daily.co
**Whereby (SQS):** System receives S3 notification "file X was created" - only knows about one file at a time, would need to scan folder to find related files
**Daily.co (Webhook):** Daily explicitly tells system which files belong together in the webhook payload
---

View File

@@ -14,7 +14,7 @@ Webhooks are configured at the room level with two fields:
### `transcript.completed`
Triggered when a transcript has been fully processed, including transcription, diarization, summarization, and topic detection.
Triggered when a transcript has been fully processed, including transcription, diarization, summarization, topic detection and calendar event integration.
### `test`
@@ -128,6 +128,27 @@ This event includes a convenient URL for accessing the transcript:
"room": {
"id": "room-789",
"name": "Product Team Room"
},
"calendar_event": {
"id": "calendar-event-123",
"ics_uid": "event-123",
"title": "Q3 Product Planning Meeting",
"start_time": "2025-08-27T12:00:00Z",
"end_time": "2025-08-27T12:30:00Z",
"description": "Team discussed Q3 product roadmap, prioritizing mobile app features and API improvements.",
"location": "Conference Room 1",
"attendees": [
{
"id": "participant-1",
"name": "John Doe",
"speaker": "Speaker 1"
},
{
"id": "participant-2",
"name": "Jane Smith",
"speaker": "Speaker 2"
}
]
}
}
```

View File

@@ -27,7 +27,7 @@ AUTH_JWT_AUDIENCE=
#TRANSCRIPT_MODAL_API_KEY=xxxxx
TRANSCRIPT_BACKEND=modal
TRANSCRIPT_URL=https://monadical-sas--reflector-transcriber-web.modal.run
TRANSCRIPT_URL=https://monadical-sas--reflector-transcriber-parakeet-web.modal.run
TRANSCRIPT_MODAL_API_KEY=
## =======================================================
@@ -71,3 +71,30 @@ DIARIZATION_URL=https://monadical-sas--reflector-diarizer-web.modal.run
## Sentry DSN configuration
#SENTRY_DSN=
## =======================================================
## Video Platform Configuration
## =======================================================
## Whereby
#WHEREBY_API_KEY=your-whereby-api-key
#WHEREBY_WEBHOOK_SECRET=your-whereby-webhook-secret
#WHEREBY_STORAGE_AWS_ACCESS_KEY_ID=your-aws-key
#WHEREBY_STORAGE_AWS_SECRET_ACCESS_KEY=your-aws-secret
#AWS_PROCESS_RECORDING_QUEUE_URL=https://sqs.us-west-2.amazonaws.com/...
## Daily.co
#DAILY_API_KEY=your-daily-api-key
#DAILY_WEBHOOK_SECRET=your-daily-webhook-secret
#DAILY_SUBDOMAIN=your-subdomain
#DAILY_WEBHOOK_UUID= # Auto-populated by recreate_daily_webhook.py script
#DAILYCO_STORAGE_AWS_ROLE_ARN=... # IAM role ARN for Daily.co S3 access
#DAILYCO_STORAGE_AWS_BUCKET_NAME=reflector-dailyco
#DAILYCO_STORAGE_AWS_REGION=us-west-2
## Whereby (optional separate bucket)
#WHEREBY_STORAGE_AWS_BUCKET_NAME=reflector-whereby
#WHEREBY_STORAGE_AWS_REGION=us-east-1
## Platform Configuration
#DEFAULT_VIDEO_PLATFORM=whereby # Default platform for new rooms

View File

@@ -0,0 +1,50 @@
"""add_platform_support
Revision ID: 1e49625677e4
Revises: 9e3f7b2a4c8e
Create Date: 2025-10-08 13:17:29.943612
"""
from typing import Sequence, Union
import sqlalchemy as sa
from alembic import op
# revision identifiers, used by Alembic.
revision: str = "1e49625677e4"
down_revision: Union[str, None] = "9e3f7b2a4c8e"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
"""Add platform field with default 'whereby' for backward compatibility."""
with op.batch_alter_table("room", schema=None) as batch_op:
batch_op.add_column(
sa.Column(
"platform",
sa.String(),
nullable=True,
server_default=None,
)
)
with op.batch_alter_table("meeting", schema=None) as batch_op:
batch_op.add_column(
sa.Column(
"platform",
sa.String(),
nullable=False,
server_default="whereby",
)
)
def downgrade() -> None:
"""Remove platform field."""
with op.batch_alter_table("meeting", schema=None) as batch_op:
batch_op.drop_column("platform")
with op.batch_alter_table("room", schema=None) as batch_op:
batch_op.drop_column("platform")

View File

@@ -0,0 +1,38 @@
"""add user api keys
Revision ID: 9e3f7b2a4c8e
Revises: dc035ff72fd5
Create Date: 2025-10-17 00:00:00.000000
"""
from typing import Sequence, Union
import sqlalchemy as sa
from alembic import op
# revision identifiers, used by Alembic.
revision: str = "9e3f7b2a4c8e"
down_revision: Union[str, None] = "dc035ff72fd5"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
op.create_table(
"user_api_key",
sa.Column("id", sa.String(), nullable=False),
sa.Column("user_id", sa.String(), nullable=False),
sa.Column("key_hash", sa.String(), nullable=False),
sa.Column("name", sa.String(), nullable=True),
sa.Column("created_at", sa.DateTime(timezone=True), nullable=False),
sa.PrimaryKeyConstraint("id"),
)
with op.batch_alter_table("user_api_key", schema=None) as batch_op:
batch_op.create_index("idx_user_api_key_hash", ["key_hash"], unique=True)
batch_op.create_index("idx_user_api_key_user_id", ["user_id"], unique=False)
def downgrade() -> None:
op.drop_table("user_api_key")

View File

@@ -0,0 +1,28 @@
"""add_track_keys
Revision ID: f8294b31f022
Revises: 1e49625677e4
Create Date: 2025-10-27 18:52:17.589167
"""
from typing import Sequence, Union
import sqlalchemy as sa
from alembic import op
# revision identifiers, used by Alembic.
revision: str = "f8294b31f022"
down_revision: Union[str, None] = "1e49625677e4"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
with op.batch_alter_table("recording", schema=None) as batch_op:
batch_op.add_column(sa.Column("track_keys", sa.JSON(), nullable=True))
def downgrade() -> None:
with op.batch_alter_table("recording", schema=None) as batch_op:
batch_op.drop_column("track_keys")

View File

@@ -112,6 +112,7 @@ source = ["reflector"]
[tool.pytest_env]
ENVIRONMENT = "pytest"
DATABASE_URL = "postgresql://test_user:test_password@localhost:15432/reflector_test"
AUTH_BACKEND = "jwt"
[tool.pytest.ini_options]
addopts = "-ra -q --disable-pytest-warnings --cov --cov-report html -v"

View File

@@ -12,6 +12,7 @@ from reflector.events import subscribers_shutdown, subscribers_startup
from reflector.logger import logger
from reflector.metrics import metrics_init
from reflector.settings import settings
from reflector.views.daily import router as daily_router
from reflector.views.meetings import router as meetings_router
from reflector.views.rooms import router as rooms_router
from reflector.views.rtc_offer import router as rtc_offer_router
@@ -26,6 +27,8 @@ from reflector.views.transcripts_upload import router as transcripts_upload_rout
from reflector.views.transcripts_webrtc import router as transcripts_webrtc_router
from reflector.views.transcripts_websocket import router as transcripts_websocket_router
from reflector.views.user import router as user_router
from reflector.views.user_api_keys import router as user_api_keys_router
from reflector.views.user_websocket import router as user_ws_router
from reflector.views.whereby import router as whereby_router
from reflector.views.zulip import router as zulip_router
@@ -65,6 +68,12 @@ app.add_middleware(
allow_headers=["*"],
)
@app.get("/health")
async def health():
return {"status": "healthy"}
# metrics
instrumentator = Instrumentator(
excluded_handlers=["/docs", "/metrics"],
@@ -84,8 +93,11 @@ app.include_router(transcripts_websocket_router, prefix="/v1")
app.include_router(transcripts_webrtc_router, prefix="/v1")
app.include_router(transcripts_process_router, prefix="/v1")
app.include_router(user_router, prefix="/v1")
app.include_router(user_api_keys_router, prefix="/v1")
app.include_router(user_ws_router, prefix="/v1")
app.include_router(zulip_router, prefix="/v1")
app.include_router(whereby_router, prefix="/v1")
app.include_router(daily_router, prefix="/v1/daily")
add_pagination(app)
# prepare celery

View File

@@ -1,14 +1,16 @@
from typing import Annotated, Optional
from typing import Annotated, List, Optional
from fastapi import Depends, HTTPException
from fastapi.security import OAuth2PasswordBearer
from fastapi.security import APIKeyHeader, OAuth2PasswordBearer
from jose import JWTError, jwt
from pydantic import BaseModel
from reflector.db.user_api_keys import user_api_keys_controller
from reflector.logger import logger
from reflector.settings import settings
oauth2_scheme = OAuth2PasswordBearer(tokenUrl="token", auto_error=False)
api_key_header = APIKeyHeader(name="X-API-Key", auto_error=False)
jwt_public_key = open(f"reflector/auth/jwt/keys/{settings.AUTH_JWT_PUBLIC_KEY}").read()
jwt_algorithm = settings.AUTH_JWT_ALGORITHM
@@ -26,7 +28,7 @@ class JWTException(Exception):
class UserInfo(BaseModel):
sub: str
email: str
email: Optional[str] = None
def __getitem__(self, key):
return getattr(self, key)
@@ -58,34 +60,53 @@ def authenticated(token: Annotated[str, Depends(oauth2_scheme)]):
return None
def current_user(
token: Annotated[Optional[str], Depends(oauth2_scheme)],
jwtauth: JWTAuth = Depends(),
):
if token is None:
raise HTTPException(status_code=401, detail="Not authenticated")
try:
payload = jwtauth.verify_token(token)
sub = payload["sub"]
email = payload["email"]
return UserInfo(sub=sub, email=email)
except JWTError as e:
logger.error(f"JWT error: {e}")
raise HTTPException(status_code=401, detail="Invalid authentication")
async def _authenticate_user(
jwt_token: Optional[str],
api_key: Optional[str],
jwtauth: JWTAuth,
) -> UserInfo | None:
user_infos: List[UserInfo] = []
if api_key:
user_api_key = await user_api_keys_controller.verify_key(api_key)
if user_api_key:
user_infos.append(UserInfo(sub=user_api_key.user_id, email=None))
if jwt_token:
try:
payload = jwtauth.verify_token(jwt_token)
sub = payload["sub"]
email = payload["email"]
user_infos.append(UserInfo(sub=sub, email=email))
except JWTError as e:
logger.error(f"JWT error: {e}")
raise HTTPException(status_code=401, detail="Invalid authentication")
def current_user_optional(
token: Annotated[Optional[str], Depends(oauth2_scheme)],
jwtauth: JWTAuth = Depends(),
):
# we accept no token, but if one is provided, it must be a valid one.
if token is None:
if len(user_infos) == 0:
return None
try:
payload = jwtauth.verify_token(token)
sub = payload["sub"]
email = payload["email"]
return UserInfo(sub=sub, email=email)
except JWTError as e:
logger.error(f"JWT error: {e}")
raise HTTPException(status_code=401, detail="Invalid authentication")
if len(set([x.sub for x in user_infos])) > 1:
raise JWTException(
status_code=401,
detail="Invalid authentication: more than one user provided",
)
return user_infos[0]
async def current_user(
jwt_token: Annotated[Optional[str], Depends(oauth2_scheme)],
api_key: Annotated[Optional[str], Depends(api_key_header)],
jwtauth: JWTAuth = Depends(),
):
user = await _authenticate_user(jwt_token, api_key, jwtauth)
if user is None:
raise HTTPException(status_code=401, detail="Not authenticated")
return user
async def current_user_optional(
jwt_token: Annotated[Optional[str], Depends(oauth2_scheme)],
api_key: Annotated[Optional[str], Depends(api_key_header)],
jwtauth: JWTAuth = Depends(),
):
return await _authenticate_user(jwt_token, api_key, jwtauth)

View File

@@ -29,6 +29,7 @@ import reflector.db.meetings # noqa
import reflector.db.recordings # noqa
import reflector.db.rooms # noqa
import reflector.db.transcripts # noqa
import reflector.db.user_api_keys # noqa
kwargs = {}
if "postgres" not in settings.DATABASE_URL:

View File

@@ -104,6 +104,11 @@ class CalendarEventController:
results = await get_database().fetch_all(query)
return [CalendarEvent(**result) for result in results]
async def get_by_id(self, event_id: str) -> CalendarEvent | None:
query = calendar_events.select().where(calendar_events.c.id == event_id)
result = await get_database().fetch_one(query)
return CalendarEvent(**result) if result else None
async def get_by_ics_uid(self, room_id: str, ics_uid: str) -> CalendarEvent | None:
query = calendar_events.select().where(
sa.and_(

View File

@@ -7,7 +7,10 @@ from sqlalchemy.dialects.postgresql import JSONB
from reflector.db import get_database, metadata
from reflector.db.rooms import Room
from reflector.schemas.platform import WHEREBY_PLATFORM, Platform
from reflector.utils import generate_uuid4
from reflector.utils.string import assert_equal
from reflector.video_platforms.factory import get_platform
meetings = sa.Table(
"meeting",
@@ -55,6 +58,12 @@ meetings = sa.Table(
),
),
sa.Column("calendar_metadata", JSONB),
sa.Column(
"platform",
sa.String,
nullable=False,
server_default=assert_equal(WHEREBY_PLATFORM, "whereby"),
),
sa.Index("idx_meeting_room_id", "room_id"),
sa.Index("idx_meeting_calendar_event", "calendar_event_id"),
)
@@ -94,13 +103,14 @@ class Meeting(BaseModel):
is_locked: bool = False
room_mode: Literal["normal", "group"] = "normal"
recording_type: Literal["none", "local", "cloud"] = "cloud"
recording_trigger: Literal[
recording_trigger: Literal[ # whereby-specific
"none", "prompt", "automatic", "automatic-2nd-participant"
] = "automatic-2nd-participant"
num_clients: int = 0
is_active: bool = True
calendar_event_id: str | None = None
calendar_metadata: dict[str, Any] | None = None
platform: Platform = WHEREBY_PLATFORM
class MeetingController:
@@ -130,6 +140,7 @@ class MeetingController:
recording_trigger=room.recording_trigger,
calendar_event_id=calendar_event_id,
calendar_metadata=calendar_metadata,
platform=get_platform(room.platform),
)
query = meetings.insert().values(**meeting.model_dump())
await get_database().execute(query)
@@ -137,7 +148,8 @@ class MeetingController:
async def get_all_active(self) -> list[Meeting]:
query = meetings.select().where(meetings.c.is_active)
return await get_database().fetch_all(query)
results = await get_database().fetch_all(query)
return [Meeting(**result) for result in results]
async def get_by_room_name(
self,
@@ -147,16 +159,14 @@ class MeetingController:
Get a meeting by room name.
For backward compatibility, returns the most recent meeting.
"""
end_date = getattr(meetings.c, "end_date")
query = (
meetings.select()
.where(meetings.c.room_name == room_name)
.order_by(end_date.desc())
.order_by(meetings.c.end_date.desc())
)
result = await get_database().fetch_one(query)
if not result:
return None
return Meeting(**result)
async def get_active(self, room: Room, current_time: datetime) -> Meeting | None:
@@ -179,7 +189,6 @@ class MeetingController:
result = await get_database().fetch_one(query)
if not result:
return None
return Meeting(**result)
async def get_all_active_for_room(
@@ -219,17 +228,27 @@ class MeetingController:
return None
return Meeting(**result)
async def get_by_id(self, meeting_id: str, **kwargs) -> Meeting | None:
async def get_by_id(
self, meeting_id: str, room: Room | None = None
) -> Meeting | None:
query = meetings.select().where(meetings.c.id == meeting_id)
if room:
query = query.where(meetings.c.room_id == room.id)
result = await get_database().fetch_one(query)
if not result:
return None
return Meeting(**result)
async def get_by_calendar_event(self, calendar_event_id: str) -> Meeting | None:
async def get_by_calendar_event(
self, calendar_event_id: str, room: Room
) -> Meeting | None:
query = meetings.select().where(
meetings.c.calendar_event_id == calendar_event_id
)
if room:
query = query.where(meetings.c.room_id == room.id)
result = await get_database().fetch_one(query)
if not result:
return None
@@ -239,6 +258,28 @@ class MeetingController:
query = meetings.update().where(meetings.c.id == meeting_id).values(**kwargs)
await get_database().execute(query)
async def increment_num_clients(self, meeting_id: str) -> None:
"""Atomically increment participant count."""
query = (
meetings.update()
.where(meetings.c.id == meeting_id)
.values(num_clients=meetings.c.num_clients + 1)
)
await get_database().execute(query)
async def decrement_num_clients(self, meeting_id: str) -> None:
"""Atomically decrement participant count (min 0)."""
query = (
meetings.update()
.where(meetings.c.id == meeting_id)
.values(
num_clients=sa.case(
(meetings.c.num_clients > 0, meetings.c.num_clients - 1), else_=0
)
)
)
await get_database().execute(query)
class MeetingConsentController:
async def get_by_meeting_id(self, meeting_id: str) -> list[MeetingConsent]:

View File

@@ -21,6 +21,7 @@ recordings = sa.Table(
server_default="pending",
),
sa.Column("meeting_id", sa.String),
sa.Column("track_keys", sa.JSON, nullable=True),
sa.Index("idx_recording_meeting_id", "meeting_id"),
)
@@ -28,10 +29,13 @@ recordings = sa.Table(
class Recording(BaseModel):
id: str = Field(default_factory=generate_uuid4)
bucket_name: str
# for single-track
object_key: str
recorded_at: datetime
status: Literal["pending", "processing", "completed", "failed"] = "pending"
meeting_id: str | None = None
# for multitrack reprocessing
track_keys: list[str] | None = None
class RecordingController:

View File

@@ -9,6 +9,7 @@ from pydantic import BaseModel, Field
from sqlalchemy.sql import false, or_
from reflector.db import get_database, metadata
from reflector.schemas.platform import Platform
from reflector.utils import generate_uuid4
rooms = sqlalchemy.Table(
@@ -50,6 +51,12 @@ rooms = sqlalchemy.Table(
),
sqlalchemy.Column("ics_last_sync", sqlalchemy.DateTime(timezone=True)),
sqlalchemy.Column("ics_last_etag", sqlalchemy.Text),
sqlalchemy.Column(
"platform",
sqlalchemy.String,
nullable=True,
server_default=None,
),
sqlalchemy.Index("idx_room_is_shared", "is_shared"),
sqlalchemy.Index("idx_room_ics_enabled", "ics_enabled"),
)
@@ -66,7 +73,7 @@ class Room(BaseModel):
is_locked: bool = False
room_mode: Literal["normal", "group"] = "normal"
recording_type: Literal["none", "local", "cloud"] = "cloud"
recording_trigger: Literal[
recording_trigger: Literal[ # whereby-specific
"none", "prompt", "automatic", "automatic-2nd-participant"
] = "automatic-2nd-participant"
is_shared: bool = False
@@ -77,6 +84,7 @@ class Room(BaseModel):
ics_enabled: bool = False
ics_last_sync: datetime | None = None
ics_last_etag: str | None = None
platform: Platform | None = None
class RoomController:
@@ -130,6 +138,7 @@ class RoomController:
ics_url: str | None = None,
ics_fetch_interval: int = 300,
ics_enabled: bool = False,
platform: Platform | None = None,
):
"""
Add a new room
@@ -153,6 +162,7 @@ class RoomController:
ics_url=ics_url,
ics_fetch_interval=ics_fetch_interval,
ics_enabled=ics_enabled,
platform=platform,
)
query = rooms.insert().values(**room.model_dump())
try:

View File

@@ -135,6 +135,8 @@ class SearchParameters(BaseModel):
user_id: str | None = None
room_id: str | None = None
source_kind: SourceKind | None = None
from_datetime: datetime | None = None
to_datetime: datetime | None = None
class SearchResultDB(BaseModel):
@@ -402,6 +404,14 @@ class SearchController:
base_query = base_query.where(
transcripts.c.source_kind == params.source_kind
)
if params.from_datetime:
base_query = base_query.where(
transcripts.c.created_at >= params.from_datetime
)
if params.to_datetime:
base_query = base_query.where(
transcripts.c.created_at <= params.to_datetime
)
if params.query_text is not None:
order_by = sqlalchemy.desc(sqlalchemy.text("rank"))

View File

@@ -21,7 +21,7 @@ from reflector.db.utils import is_postgresql
from reflector.logger import logger
from reflector.processors.types import Word as ProcessorWord
from reflector.settings import settings
from reflector.storage import get_recordings_storage, get_transcripts_storage
from reflector.storage import get_transcripts_storage
from reflector.utils import generate_uuid4
from reflector.utils.webvtt import topics_to_webvtt
@@ -186,6 +186,7 @@ class TranscriptParticipant(BaseModel):
id: str = Field(default_factory=generate_uuid4)
speaker: int | None
name: str
user_id: str | None = None
class Transcript(BaseModel):
@@ -623,7 +624,9 @@ class TranscriptController:
)
if recording:
try:
await get_recordings_storage().delete_file(recording.object_key)
await get_transcripts_storage().delete_file(
recording.object_key, bucket=recording.bucket_name
)
except Exception as e:
logger.warning(
"Failed to delete recording object from S3",
@@ -647,6 +650,19 @@ class TranscriptController:
query = transcripts.delete().where(transcripts.c.recording_id == recording_id)
await get_database().execute(query)
@staticmethod
def user_can_mutate(transcript: Transcript, user_id: str | None) -> bool:
"""
Returns True if the given user is allowed to modify the transcript.
Policy:
- Anonymous transcripts (user_id is None) cannot be modified via API
- Only the owner (matching user_id) can modify their transcript
"""
if transcript.user_id is None:
return False
return user_id and transcript.user_id == user_id
@asynccontextmanager
async def transaction(self):
"""
@@ -712,11 +728,13 @@ class TranscriptController:
"""
Download audio from storage
"""
transcript.audio_mp3_filename.write_bytes(
await get_transcripts_storage().get_file(
transcript.storage_audio_path,
)
)
storage = get_transcripts_storage()
try:
with open(transcript.audio_mp3_filename, "wb") as f:
await storage.stream_to_fileobj(transcript.storage_audio_path, f)
except Exception:
transcript.audio_mp3_filename.unlink(missing_ok=True)
raise
async def upsert_participant(
self,

View File

@@ -0,0 +1,91 @@
import hmac
import secrets
from datetime import datetime, timezone
from hashlib import sha256
import sqlalchemy
from pydantic import BaseModel, Field
from reflector.db import get_database, metadata
from reflector.settings import settings
from reflector.utils import generate_uuid4
from reflector.utils.string import NonEmptyString
user_api_keys = sqlalchemy.Table(
"user_api_key",
metadata,
sqlalchemy.Column("id", sqlalchemy.String, primary_key=True),
sqlalchemy.Column("user_id", sqlalchemy.String, nullable=False),
sqlalchemy.Column("key_hash", sqlalchemy.String, nullable=False),
sqlalchemy.Column("name", sqlalchemy.String, nullable=True),
sqlalchemy.Column("created_at", sqlalchemy.DateTime(timezone=True), nullable=False),
sqlalchemy.Index("idx_user_api_key_hash", "key_hash", unique=True),
sqlalchemy.Index("idx_user_api_key_user_id", "user_id"),
)
class UserApiKey(BaseModel):
id: NonEmptyString = Field(default_factory=generate_uuid4)
user_id: NonEmptyString
key_hash: NonEmptyString
name: NonEmptyString | None = None
created_at: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))
class UserApiKeyController:
@staticmethod
def generate_key() -> NonEmptyString:
return secrets.token_urlsafe(48)
@staticmethod
def hash_key(key: NonEmptyString) -> str:
return hmac.new(
settings.SECRET_KEY.encode(), key.encode(), digestmod=sha256
).hexdigest()
@classmethod
async def create_key(
cls,
user_id: NonEmptyString,
name: NonEmptyString | None = None,
) -> tuple[UserApiKey, NonEmptyString]:
plaintext = cls.generate_key()
api_key = UserApiKey(
user_id=user_id,
key_hash=cls.hash_key(plaintext),
name=name,
)
query = user_api_keys.insert().values(**api_key.model_dump())
await get_database().execute(query)
return api_key, plaintext
@classmethod
async def verify_key(cls, plaintext_key: NonEmptyString) -> UserApiKey | None:
key_hash = cls.hash_key(plaintext_key)
query = user_api_keys.select().where(
user_api_keys.c.key_hash == key_hash,
)
result = await get_database().fetch_one(query)
return UserApiKey(**result) if result else None
@staticmethod
async def list_by_user_id(user_id: NonEmptyString) -> list[UserApiKey]:
query = (
user_api_keys.select()
.where(user_api_keys.c.user_id == user_id)
.order_by(user_api_keys.c.created_at.desc())
)
results = await get_database().fetch_all(query)
return [UserApiKey(**r) for r in results]
@staticmethod
async def delete_key(key_id: NonEmptyString, user_id: NonEmptyString) -> bool:
query = user_api_keys.delete().where(
(user_api_keys.c.id == key_id) & (user_api_keys.c.user_id == user_id)
)
result = await get_database().execute(query)
# asyncpg returns None for DELETE, consider it success if no exception
return result is None or result > 0
user_api_keys_controller = UserApiKeyController()

View File

@@ -0,0 +1 @@
"""Pipeline modules for audio processing."""

View File

@@ -23,23 +23,18 @@ from reflector.db.transcripts import (
transcripts_controller,
)
from reflector.logger import logger
from reflector.pipelines import topic_processing
from reflector.pipelines.main_live_pipeline import (
PipelineMainBase,
broadcast_to_sockets,
task_cleanup_consent,
task_pipeline_post_to_zulip,
)
from reflector.processors import (
AudioFileWriterProcessor,
TranscriptFinalSummaryProcessor,
TranscriptFinalTitleProcessor,
TranscriptTopicDetectorProcessor,
)
from reflector.pipelines.transcription_helpers import transcribe_file_with_processor
from reflector.processors import AudioFileWriterProcessor
from reflector.processors.audio_waveform_processor import AudioWaveformProcessor
from reflector.processors.file_diarization import FileDiarizationInput
from reflector.processors.file_diarization_auto import FileDiarizationAutoProcessor
from reflector.processors.file_transcript import FileTranscriptInput
from reflector.processors.file_transcript_auto import FileTranscriptAutoProcessor
from reflector.processors.transcript_diarization_assembler import (
TranscriptDiarizationAssemblerInput,
TranscriptDiarizationAssemblerProcessor,
@@ -56,19 +51,6 @@ from reflector.storage import get_transcripts_storage
from reflector.worker.webhook import send_transcript_webhook
class EmptyPipeline:
"""Empty pipeline for processors that need a pipeline reference"""
def __init__(self, logger: structlog.BoundLogger):
self.logger = logger
def get_pref(self, k, d=None):
return d
async def emit(self, event):
pass
class PipelineMainFile(PipelineMainBase):
"""
Optimized file processing pipeline.
@@ -81,7 +63,7 @@ class PipelineMainFile(PipelineMainBase):
def __init__(self, transcript_id: str):
super().__init__(transcript_id=transcript_id)
self.logger = logger.bind(transcript_id=self.transcript_id)
self.empty_pipeline = EmptyPipeline(logger=self.logger)
self.empty_pipeline = topic_processing.EmptyPipeline(logger=self.logger)
def _handle_gather_exceptions(self, results: list, operation: str) -> None:
"""Handle exceptions from asyncio.gather with return_exceptions=True"""
@@ -131,7 +113,7 @@ class PipelineMainFile(PipelineMainBase):
self.logger.info("File pipeline complete")
await transcripts_controller.set_status(transcript.id, "ended")
await self.set_status(transcript.id, "ended")
async def extract_and_write_audio(
self, file_path: Path, transcript: Transcript
@@ -262,24 +244,7 @@ class PipelineMainFile(PipelineMainBase):
async def transcribe_file(self, audio_url: str, language: str) -> TranscriptType:
"""Transcribe complete file"""
processor = FileTranscriptAutoProcessor()
input_data = FileTranscriptInput(audio_url=audio_url, language=language)
# Store result for retrieval
result: TranscriptType | None = None
async def capture_result(transcript):
nonlocal result
result = transcript
processor.on(capture_result)
await processor.push(input_data)
await processor.flush()
if not result:
raise ValueError("No transcript captured")
return result
return await transcribe_file_with_processor(audio_url, language)
async def diarize_file(self, audio_url: str) -> list[DiarizationSegment] | None:
"""Get diarization for file"""
@@ -322,63 +287,31 @@ class PipelineMainFile(PipelineMainBase):
async def detect_topics(
self, transcript: TranscriptType, target_language: str
) -> list[TitleSummary]:
"""Detect topics from complete transcript"""
chunk_size = 300
topics: list[TitleSummary] = []
async def on_topic(topic: TitleSummary):
topics.append(topic)
return await self.on_topic(topic)
topic_detector = TranscriptTopicDetectorProcessor(callback=on_topic)
topic_detector.set_pipeline(self.empty_pipeline)
for i in range(0, len(transcript.words), chunk_size):
chunk_words = transcript.words[i : i + chunk_size]
if not chunk_words:
continue
chunk_transcript = TranscriptType(
words=chunk_words, translation=transcript.translation
)
await topic_detector.push(chunk_transcript)
await topic_detector.flush()
return topics
return await topic_processing.detect_topics(
transcript,
target_language,
on_topic_callback=self.on_topic,
empty_pipeline=self.empty_pipeline,
)
async def generate_title(self, topics: list[TitleSummary]):
"""Generate title from topics"""
if not topics:
self.logger.warning("No topics for title generation")
return
processor = TranscriptFinalTitleProcessor(callback=self.on_title)
processor.set_pipeline(self.empty_pipeline)
for topic in topics:
await processor.push(topic)
await processor.flush()
return await topic_processing.generate_title(
topics,
on_title_callback=self.on_title,
empty_pipeline=self.empty_pipeline,
logger=self.logger,
)
async def generate_summaries(self, topics: list[TitleSummary]):
"""Generate long and short summaries from topics"""
if not topics:
self.logger.warning("No topics for summary generation")
return
transcript = await self.get_transcript()
processor = TranscriptFinalSummaryProcessor(
transcript=transcript,
callback=self.on_long_summary,
on_short_summary=self.on_short_summary,
return await topic_processing.generate_summaries(
topics,
transcript,
on_long_summary_callback=self.on_long_summary,
on_short_summary_callback=self.on_short_summary,
empty_pipeline=self.empty_pipeline,
logger=self.logger,
)
processor.set_pipeline(self.empty_pipeline)
for topic in topics:
await processor.push(topic)
await processor.flush()
@shared_task
@@ -426,7 +359,12 @@ async def task_pipeline_file_process(*, transcript_id: str):
await pipeline.process(audio_file)
except Exception:
except Exception as e:
logger.error(
f"File pipeline failed for transcript {transcript_id}: {type(e).__name__}: {str(e)}",
exc_info=True,
transcript_id=transcript_id,
)
await pipeline.set_status(transcript_id, "error")
raise

View File

@@ -17,7 +17,6 @@ from contextlib import asynccontextmanager
from typing import Generic
import av
import boto3
from celery import chord, current_task, group, shared_task
from pydantic import BaseModel
from structlog import BoundLogger as Logger
@@ -85,6 +84,20 @@ def broadcast_to_sockets(func):
message=resp.model_dump(mode="json"),
)
transcript = await transcripts_controller.get_by_id(self.transcript_id)
if transcript and transcript.user_id:
# Emit only relevant events to the user room to avoid noisy updates.
# Allowed: STATUS, FINAL_TITLE, DURATION. All are prefixed with TRANSCRIPT_
allowed_user_events = {"STATUS", "FINAL_TITLE", "DURATION"}
if resp.event in allowed_user_events:
await self.ws_manager.send_json(
room_id=f"user:{transcript.user_id}",
message={
"event": f"TRANSCRIPT_{resp.event}",
"data": {"id": self.transcript_id, **resp.data},
},
)
return wrapper
@@ -570,6 +583,7 @@ async def cleanup_consent(transcript: Transcript, logger: Logger):
consent_denied = False
recording = None
meeting = None
try:
if transcript.recording_id:
recording = await recordings_controller.get_by_id(transcript.recording_id)
@@ -580,8 +594,8 @@ async def cleanup_consent(transcript: Transcript, logger: Logger):
meeting.id
)
except Exception as e:
logger.error(f"Failed to get fetch consent: {e}", exc_info=e)
consent_denied = True
logger.error(f"Failed to fetch consent: {e}", exc_info=e)
raise
if not consent_denied:
logger.info("Consent approved, keeping all files")
@@ -589,25 +603,24 @@ async def cleanup_consent(transcript: Transcript, logger: Logger):
logger.info("Consent denied, cleaning up all related audio files")
if recording and recording.bucket_name and recording.object_key:
s3_whereby = boto3.client(
"s3",
aws_access_key_id=settings.AWS_WHEREBY_ACCESS_KEY_ID,
aws_secret_access_key=settings.AWS_WHEREBY_ACCESS_KEY_SECRET,
)
try:
s3_whereby.delete_object(
Bucket=recording.bucket_name, Key=recording.object_key
)
logger.info(
f"Deleted original Whereby recording: {recording.bucket_name}/{recording.object_key}"
)
except Exception as e:
logger.error(f"Failed to delete Whereby recording: {e}", exc_info=e)
deletion_errors = []
if recording and recording.bucket_name:
keys_to_delete = []
if recording.track_keys:
keys_to_delete = recording.track_keys
elif recording.object_key:
keys_to_delete = [recording.object_key]
master_storage = get_transcripts_storage()
for key in keys_to_delete:
try:
await master_storage.delete_file(key, bucket=recording.bucket_name)
logger.info(f"Deleted recording file: {recording.bucket_name}/{key}")
except Exception as e:
error_msg = f"Failed to delete {key}: {e}"
logger.error(error_msg, exc_info=e)
deletion_errors.append(error_msg)
# non-transactional, files marked for deletion not actually deleted is possible
await transcripts_controller.update(transcript, {"audio_deleted": True})
# 2. Delete processed audio from transcript storage S3 bucket
if transcript.audio_location == "storage":
storage = get_transcripts_storage()
try:
@@ -616,18 +629,28 @@ async def cleanup_consent(transcript: Transcript, logger: Logger):
f"Deleted processed audio from storage: {transcript.storage_audio_path}"
)
except Exception as e:
logger.error(f"Failed to delete processed audio: {e}", exc_info=e)
error_msg = f"Failed to delete processed audio: {e}"
logger.error(error_msg, exc_info=e)
deletion_errors.append(error_msg)
# 3. Delete local audio files
try:
if hasattr(transcript, "audio_mp3_filename") and transcript.audio_mp3_filename:
transcript.audio_mp3_filename.unlink(missing_ok=True)
if hasattr(transcript, "audio_wav_filename") and transcript.audio_wav_filename:
transcript.audio_wav_filename.unlink(missing_ok=True)
except Exception as e:
logger.error(f"Failed to delete local audio files: {e}", exc_info=e)
error_msg = f"Failed to delete local audio files: {e}"
logger.error(error_msg, exc_info=e)
deletion_errors.append(error_msg)
logger.info("Consent cleanup done")
if deletion_errors:
logger.warning(
f"Consent cleanup completed with {len(deletion_errors)} errors",
errors=deletion_errors,
)
else:
await transcripts_controller.update(transcript, {"audio_deleted": True})
logger.info("Consent cleanup done - all audio deleted")
@get_transcript

View File

@@ -0,0 +1,694 @@
import asyncio
import math
import tempfile
from fractions import Fraction
from pathlib import Path
import av
from av.audio.resampler import AudioResampler
from celery import chain, shared_task
from reflector.asynctask import asynctask
from reflector.db.transcripts import (
TranscriptStatus,
TranscriptWaveform,
transcripts_controller,
)
from reflector.logger import logger
from reflector.pipelines import topic_processing
from reflector.pipelines.main_file_pipeline import task_send_webhook_if_needed
from reflector.pipelines.main_live_pipeline import (
PipelineMainBase,
broadcast_to_sockets,
task_cleanup_consent,
task_pipeline_post_to_zulip,
)
from reflector.pipelines.transcription_helpers import transcribe_file_with_processor
from reflector.processors import AudioFileWriterProcessor
from reflector.processors.audio_waveform_processor import AudioWaveformProcessor
from reflector.processors.types import TitleSummary
from reflector.processors.types import Transcript as TranscriptType
from reflector.storage import Storage, get_transcripts_storage
from reflector.utils.string import NonEmptyString
# Audio encoding constants
OPUS_STANDARD_SAMPLE_RATE = 48000
OPUS_DEFAULT_BIT_RATE = 128000
# Storage operation constants
PRESIGNED_URL_EXPIRATION_SECONDS = 7200 # 2 hours
class PipelineMainMultitrack(PipelineMainBase):
def __init__(self, transcript_id: str):
super().__init__(transcript_id=transcript_id)
self.logger = logger.bind(transcript_id=self.transcript_id)
self.empty_pipeline = topic_processing.EmptyPipeline(logger=self.logger)
async def pad_track_for_transcription(
self,
track_url: NonEmptyString,
track_idx: int,
storage: Storage,
) -> NonEmptyString:
"""
Pad a single track with silence based on stream metadata start_time.
Downloads from S3 presigned URL, processes via PyAV using tempfile, uploads to S3.
Returns presigned URL of padded track (or original URL if no padding needed).
Memory usage:
- Pattern: fixed_overhead(2-5MB) for PyAV codec/filters
- PyAV streams input efficiently (no full download, verified)
- Output written to tempfile (disk-based, not memory)
- Upload streams from file handle (boto3 chunks, typically 5-10MB)
Daily.co raw-tracks timing - Two approaches:
CURRENT APPROACH (PyAV metadata):
The WebM stream.start_time field encodes MEETING-RELATIVE timing:
- t=0: When Daily.co recording started (first participant joined)
- start_time=8.13s: This participant's track began 8.13s after recording started
- Purpose: Enables track alignment without external manifest files
This is NOT:
- Stream-internal offset (first packet timestamp relative to stream start)
- Absolute/wall-clock time
- Recording duration
ALTERNATIVE APPROACH (filename parsing):
Daily.co filenames contain Unix timestamps (milliseconds):
Format: {recording_start_ts}-{participant_id}-cam-audio-{track_start_ts}.webm
Example: 1760988935484-52f7f48b-fbab-431f-9a50-87b9abfc8255-cam-audio-1760988935922.webm
Can calculate offset: (track_start_ts - recording_start_ts) / 1000
- Track 0: (1760988935922 - 1760988935484) / 1000 = 0.438s
- Track 1: (1760988943823 - 1760988935484) / 1000 = 8.339s
TIME DIFFERENCE: PyAV metadata vs filename timestamps differ by ~209ms:
- Track 0: filename=438ms, metadata=229ms (diff: 209ms)
- Track 1: filename=8339ms, metadata=8130ms (diff: 209ms)
Consistent delta suggests network/encoding delay. PyAV metadata is ground truth
(represents when audio stream actually started vs when file upload initiated).
Example with 2 participants:
Track A: start_time=0.2s → Joined 200ms after recording began
Track B: start_time=8.1s → Joined 8.1 seconds later
After padding:
Track A: [0.2s silence] + [speech...]
Track B: [8.1s silence] + [speech...]
Whisper transcription timestamps are now synchronized:
Track A word at 5.0s → happened at meeting t=5.0s
Track B word at 10.0s → happened at meeting t=10.0s
Merging just sorts by timestamp - no offset calculation needed.
Padding coincidentally involves re-encoding. It's important when we work with Daily.co + Whisper.
This is because Daily.co returns recordings with skipped frames e.g. when microphone muted.
Daily.co doesn't understand those frames and ignores them, causing timestamp issues in transcription.
Re-encoding restores those frames. We do padding and re-encoding together just because it's convenient and more performant:
we need padded values for mix mp3 anyways
"""
transcript = await self.get_transcript()
try:
# PyAV streams input from S3 URL efficiently (2-5MB fixed overhead for codec/filters)
with av.open(track_url) as in_container:
start_time_seconds = self._extract_stream_start_time_from_container(
in_container, track_idx
)
if start_time_seconds <= 0:
self.logger.info(
f"Track {track_idx} requires no padding (start_time={start_time_seconds}s)",
track_idx=track_idx,
)
return track_url
# Use tempfile instead of BytesIO for better memory efficiency
# Reduces peak memory usage during encoding/upload
with tempfile.NamedTemporaryFile(
suffix=".webm", delete=False
) as temp_file:
temp_path = temp_file.name
try:
self._apply_audio_padding_to_file(
in_container, temp_path, start_time_seconds, track_idx
)
storage_path = (
f"file_pipeline/{transcript.id}/tracks/padded_{track_idx}.webm"
)
# Upload using file handle for streaming
with open(temp_path, "rb") as padded_file:
await storage.put_file(storage_path, padded_file)
finally:
# Clean up temp file
Path(temp_path).unlink(missing_ok=True)
padded_url = await storage.get_file_url(
storage_path,
operation="get_object",
expires_in=PRESIGNED_URL_EXPIRATION_SECONDS,
)
self.logger.info(
f"Successfully padded track {track_idx}",
track_idx=track_idx,
start_time_seconds=start_time_seconds,
padded_url=padded_url,
)
return padded_url
except Exception as e:
self.logger.error(
f"Failed to process track {track_idx}",
track_idx=track_idx,
url=track_url,
error=str(e),
exc_info=True,
)
raise Exception(
f"Track {track_idx} padding failed - transcript would have incorrect timestamps"
) from e
def _extract_stream_start_time_from_container(
self, container, track_idx: int
) -> float:
"""
Extract meeting-relative start time from WebM stream metadata.
Uses PyAV to read stream.start_time from WebM container.
More accurate than filename timestamps by ~209ms due to network/encoding delays.
"""
start_time_seconds = 0.0
try:
audio_streams = [s for s in container.streams if s.type == "audio"]
stream = audio_streams[0] if audio_streams else container.streams[0]
# 1) Try stream-level start_time (most reliable for Daily.co tracks)
if stream.start_time is not None and stream.time_base is not None:
start_time_seconds = float(stream.start_time * stream.time_base)
# 2) Fallback to container-level start_time (in av.time_base units)
if (start_time_seconds <= 0) and (container.start_time is not None):
start_time_seconds = float(container.start_time * av.time_base)
# 3) Fallback to first packet DTS in stream.time_base
if start_time_seconds <= 0:
for packet in container.demux(stream):
if packet.dts is not None:
start_time_seconds = float(packet.dts * stream.time_base)
break
except Exception as e:
self.logger.warning(
"PyAV metadata read failed; assuming 0 start_time",
track_idx=track_idx,
error=str(e),
)
start_time_seconds = 0.0
self.logger.info(
f"Track {track_idx} stream metadata: start_time={start_time_seconds:.3f}s",
track_idx=track_idx,
)
return start_time_seconds
def _apply_audio_padding_to_file(
self,
in_container,
output_path: str,
start_time_seconds: float,
track_idx: int,
) -> None:
"""Apply silence padding to audio track using PyAV filter graph, writing to file"""
delay_ms = math.floor(start_time_seconds * 1000)
self.logger.info(
f"Padding track {track_idx} with {delay_ms}ms delay using PyAV",
track_idx=track_idx,
delay_ms=delay_ms,
)
try:
with av.open(output_path, "w", format="webm") as out_container:
in_stream = next(
(s for s in in_container.streams if s.type == "audio"), None
)
if in_stream is None:
raise Exception("No audio stream in input")
out_stream = out_container.add_stream(
"libopus", rate=OPUS_STANDARD_SAMPLE_RATE
)
out_stream.bit_rate = OPUS_DEFAULT_BIT_RATE
graph = av.filter.Graph()
abuf_args = (
f"time_base=1/{OPUS_STANDARD_SAMPLE_RATE}:"
f"sample_rate={OPUS_STANDARD_SAMPLE_RATE}:"
f"sample_fmt=s16:"
f"channel_layout=stereo"
)
src = graph.add("abuffer", args=abuf_args, name="src")
aresample_f = graph.add("aresample", args="async=1", name="ares")
# adelay requires one delay value per channel separated by '|'
delays_arg = f"{delay_ms}|{delay_ms}"
adelay_f = graph.add(
"adelay", args=f"delays={delays_arg}:all=1", name="delay"
)
sink = graph.add("abuffersink", name="sink")
src.link_to(aresample_f)
aresample_f.link_to(adelay_f)
adelay_f.link_to(sink)
graph.configure()
resampler = AudioResampler(
format="s16", layout="stereo", rate=OPUS_STANDARD_SAMPLE_RATE
)
# Decode -> resample -> push through graph -> encode Opus
for frame in in_container.decode(in_stream):
out_frames = resampler.resample(frame) or []
for rframe in out_frames:
rframe.sample_rate = OPUS_STANDARD_SAMPLE_RATE
rframe.time_base = Fraction(1, OPUS_STANDARD_SAMPLE_RATE)
src.push(rframe)
while True:
try:
f_out = sink.pull()
except Exception:
break
f_out.sample_rate = OPUS_STANDARD_SAMPLE_RATE
f_out.time_base = Fraction(1, OPUS_STANDARD_SAMPLE_RATE)
for packet in out_stream.encode(f_out):
out_container.mux(packet)
src.push(None)
while True:
try:
f_out = sink.pull()
except Exception:
break
f_out.sample_rate = OPUS_STANDARD_SAMPLE_RATE
f_out.time_base = Fraction(1, OPUS_STANDARD_SAMPLE_RATE)
for packet in out_stream.encode(f_out):
out_container.mux(packet)
for packet in out_stream.encode(None):
out_container.mux(packet)
except Exception as e:
self.logger.error(
"PyAV padding failed for track",
track_idx=track_idx,
delay_ms=delay_ms,
error=str(e),
exc_info=True,
)
raise
async def mixdown_tracks(
self,
track_urls: list[str],
writer: AudioFileWriterProcessor,
offsets_seconds: list[float] | None = None,
) -> None:
"""Multi-track mixdown using PyAV filter graph (amix), reading from S3 presigned URLs"""
target_sample_rate: int | None = None
for url in track_urls:
if not url:
continue
container = None
try:
container = av.open(url)
for frame in container.decode(audio=0):
target_sample_rate = frame.sample_rate
break
except Exception:
continue
finally:
if container is not None:
container.close()
if target_sample_rate:
break
if not target_sample_rate:
self.logger.error("Mixdown failed - no decodable audio frames found")
raise Exception("Mixdown failed: No decodable audio frames in any track")
# Build PyAV filter graph:
# N abuffer (s32/stereo)
# -> optional adelay per input (for alignment)
# -> amix (s32)
# -> aformat(s16)
# -> sink
graph = av.filter.Graph()
inputs = []
valid_track_urls = [url for url in track_urls if url]
input_offsets_seconds = None
if offsets_seconds is not None:
input_offsets_seconds = [
offsets_seconds[i] for i, url in enumerate(track_urls) if url
]
for idx, url in enumerate(valid_track_urls):
args = (
f"time_base=1/{target_sample_rate}:"
f"sample_rate={target_sample_rate}:"
f"sample_fmt=s32:"
f"channel_layout=stereo"
)
in_ctx = graph.add("abuffer", args=args, name=f"in{idx}")
inputs.append(in_ctx)
if not inputs:
self.logger.error("Mixdown failed - no valid inputs for graph")
raise Exception("Mixdown failed: No valid inputs for filter graph")
mixer = graph.add("amix", args=f"inputs={len(inputs)}:normalize=0", name="mix")
fmt = graph.add(
"aformat",
args=(
f"sample_fmts=s32:channel_layouts=stereo:sample_rates={target_sample_rate}"
),
name="fmt",
)
sink = graph.add("abuffersink", name="out")
# Optional per-input delay before mixing
delays_ms: list[int] = []
if input_offsets_seconds is not None:
base = min(input_offsets_seconds) if input_offsets_seconds else 0.0
delays_ms = [
max(0, int(round((o - base) * 1000))) for o in input_offsets_seconds
]
else:
delays_ms = [0 for _ in inputs]
for idx, in_ctx in enumerate(inputs):
delay_ms = delays_ms[idx] if idx < len(delays_ms) else 0
if delay_ms > 0:
# adelay requires one value per channel; use same for stereo
adelay = graph.add(
"adelay",
args=f"delays={delay_ms}|{delay_ms}:all=1",
name=f"delay{idx}",
)
in_ctx.link_to(adelay)
adelay.link_to(mixer, 0, idx)
else:
in_ctx.link_to(mixer, 0, idx)
mixer.link_to(fmt)
fmt.link_to(sink)
graph.configure()
containers = []
try:
# Open all containers with cleanup guaranteed
for i, url in enumerate(valid_track_urls):
try:
c = av.open(url)
containers.append(c)
except Exception as e:
self.logger.warning(
"Mixdown: failed to open container from URL",
input=i,
url=url,
error=str(e),
)
if not containers:
self.logger.error("Mixdown failed - no valid containers opened")
raise Exception("Mixdown failed: Could not open any track containers")
decoders = [c.decode(audio=0) for c in containers]
active = [True] * len(decoders)
resamplers = [
AudioResampler(format="s32", layout="stereo", rate=target_sample_rate)
for _ in decoders
]
while any(active):
for i, (dec, is_active) in enumerate(zip(decoders, active)):
if not is_active:
continue
try:
frame = next(dec)
except StopIteration:
active[i] = False
continue
if frame.sample_rate != target_sample_rate:
continue
out_frames = resamplers[i].resample(frame) or []
for rf in out_frames:
rf.sample_rate = target_sample_rate
rf.time_base = Fraction(1, target_sample_rate)
inputs[i].push(rf)
while True:
try:
mixed = sink.pull()
except Exception:
break
mixed.sample_rate = target_sample_rate
mixed.time_base = Fraction(1, target_sample_rate)
await writer.push(mixed)
for in_ctx in inputs:
in_ctx.push(None)
while True:
try:
mixed = sink.pull()
except Exception:
break
mixed.sample_rate = target_sample_rate
mixed.time_base = Fraction(1, target_sample_rate)
await writer.push(mixed)
finally:
# Cleanup all containers, even if processing failed
for c in containers:
if c is not None:
try:
c.close()
except Exception:
pass # Best effort cleanup
@broadcast_to_sockets
async def set_status(self, transcript_id: str, status: TranscriptStatus):
async with self.lock_transaction():
return await transcripts_controller.set_status(transcript_id, status)
async def on_waveform(self, data):
async with self.transaction():
waveform = TranscriptWaveform(waveform=data)
transcript = await self.get_transcript()
return await transcripts_controller.append_event(
transcript=transcript, event="WAVEFORM", data=waveform
)
async def process(self, bucket_name: str, track_keys: list[str]):
transcript = await self.get_transcript()
async with self.transaction():
await transcripts_controller.update(
transcript,
{
"events": [],
"topics": [],
},
)
source_storage = get_transcripts_storage()
transcript_storage = source_storage
track_urls: list[str] = []
for key in track_keys:
url = await source_storage.get_file_url(
key,
operation="get_object",
expires_in=PRESIGNED_URL_EXPIRATION_SECONDS,
bucket=bucket_name,
)
track_urls.append(url)
self.logger.info(
f"Generated presigned URL for track from {bucket_name}",
key=key,
)
created_padded_files = set()
padded_track_urls: list[str] = []
for idx, url in enumerate(track_urls):
padded_url = await self.pad_track_for_transcription(
url, idx, transcript_storage
)
padded_track_urls.append(padded_url)
if padded_url != url:
storage_path = f"file_pipeline/{transcript.id}/tracks/padded_{idx}.webm"
created_padded_files.add(storage_path)
self.logger.info(f"Track {idx} processed, padded URL: {padded_url}")
transcript.data_path.mkdir(parents=True, exist_ok=True)
mp3_writer = AudioFileWriterProcessor(
path=str(transcript.audio_mp3_filename),
on_duration=self.on_duration,
)
await self.mixdown_tracks(padded_track_urls, mp3_writer, offsets_seconds=None)
await mp3_writer.flush()
if not transcript.audio_mp3_filename.exists():
raise Exception(
"Mixdown failed - no MP3 file generated. Cannot proceed without playable audio."
)
storage_path = f"{transcript.id}/audio.mp3"
# Use file handle streaming to avoid loading entire MP3 into memory
mp3_size = transcript.audio_mp3_filename.stat().st_size
with open(transcript.audio_mp3_filename, "rb") as mp3_file:
await transcript_storage.put_file(storage_path, mp3_file)
mp3_url = await transcript_storage.get_file_url(storage_path)
await transcripts_controller.update(transcript, {"audio_location": "storage"})
self.logger.info(
f"Uploaded mixed audio to storage",
storage_path=storage_path,
size=mp3_size,
url=mp3_url,
)
self.logger.info("Generating waveform from mixed audio")
waveform_processor = AudioWaveformProcessor(
audio_path=transcript.audio_mp3_filename,
waveform_path=transcript.audio_waveform_filename,
on_waveform=self.on_waveform,
)
waveform_processor.set_pipeline(self.empty_pipeline)
await waveform_processor.flush()
self.logger.info("Waveform generated successfully")
speaker_transcripts: list[TranscriptType] = []
for idx, padded_url in enumerate(padded_track_urls):
if not padded_url:
continue
t = await self.transcribe_file(padded_url, transcript.source_language)
if not t.words:
continue
for w in t.words:
w.speaker = idx
speaker_transcripts.append(t)
self.logger.info(
f"Track {idx} transcribed successfully with {len(t.words)} words",
track_idx=idx,
)
valid_track_count = len([url for url in padded_track_urls if url])
if valid_track_count > 0 and len(speaker_transcripts) != valid_track_count:
raise Exception(
f"Only {len(speaker_transcripts)}/{valid_track_count} tracks transcribed successfully. "
f"All tracks must succeed to avoid incomplete transcripts."
)
if not speaker_transcripts:
raise Exception("No valid track transcriptions")
self.logger.info(f"Cleaning up {len(created_padded_files)} temporary S3 files")
cleanup_tasks = []
for storage_path in created_padded_files:
cleanup_tasks.append(transcript_storage.delete_file(storage_path))
if cleanup_tasks:
cleanup_results = await asyncio.gather(
*cleanup_tasks, return_exceptions=True
)
for storage_path, result in zip(created_padded_files, cleanup_results):
if isinstance(result, Exception):
self.logger.warning(
"Failed to cleanup temporary padded track",
storage_path=storage_path,
error=str(result),
)
merged_words = []
for t in speaker_transcripts:
merged_words.extend(t.words)
merged_words.sort(
key=lambda w: w.start if hasattr(w, "start") and w.start is not None else 0
)
merged_transcript = TranscriptType(words=merged_words, translation=None)
await self.on_transcript(merged_transcript)
topics = await self.detect_topics(merged_transcript, transcript.target_language)
await asyncio.gather(
self.generate_title(topics),
self.generate_summaries(topics),
return_exceptions=False,
)
await self.set_status(transcript.id, "ended")
async def transcribe_file(self, audio_url: str, language: str) -> TranscriptType:
return await transcribe_file_with_processor(audio_url, language)
async def detect_topics(
self, transcript: TranscriptType, target_language: str
) -> list[TitleSummary]:
return await topic_processing.detect_topics(
transcript,
target_language,
on_topic_callback=self.on_topic,
empty_pipeline=self.empty_pipeline,
)
async def generate_title(self, topics: list[TitleSummary]):
return await topic_processing.generate_title(
topics,
on_title_callback=self.on_title,
empty_pipeline=self.empty_pipeline,
logger=self.logger,
)
async def generate_summaries(self, topics: list[TitleSummary]):
transcript = await self.get_transcript()
return await topic_processing.generate_summaries(
topics,
transcript,
on_long_summary_callback=self.on_long_summary,
on_short_summary_callback=self.on_short_summary,
empty_pipeline=self.empty_pipeline,
logger=self.logger,
)
@shared_task
@asynctask
async def task_pipeline_multitrack_process(
*, transcript_id: str, bucket_name: str, track_keys: list[str]
):
pipeline = PipelineMainMultitrack(transcript_id=transcript_id)
try:
await pipeline.set_status(transcript_id, "processing")
await pipeline.process(bucket_name, track_keys)
except Exception:
await pipeline.set_status(transcript_id, "error")
raise
post_chain = chain(
task_cleanup_consent.si(transcript_id=transcript_id),
task_pipeline_post_to_zulip.si(transcript_id=transcript_id),
task_send_webhook_if_needed.si(transcript_id=transcript_id),
)
post_chain.delay()

View File

@@ -0,0 +1,109 @@
"""
Topic processing utilities
==========================
Shared topic detection, title generation, and summarization logic
used across file and multitrack pipelines.
"""
from typing import Callable
import structlog
from reflector.db.transcripts import Transcript
from reflector.processors import (
TranscriptFinalSummaryProcessor,
TranscriptFinalTitleProcessor,
TranscriptTopicDetectorProcessor,
)
from reflector.processors.types import TitleSummary
from reflector.processors.types import Transcript as TranscriptType
class EmptyPipeline:
def __init__(self, logger: structlog.BoundLogger):
self.logger = logger
def get_pref(self, k, d=None):
return d
async def emit(self, event):
pass
async def detect_topics(
transcript: TranscriptType,
target_language: str,
*,
on_topic_callback: Callable,
empty_pipeline: EmptyPipeline,
) -> list[TitleSummary]:
chunk_size = 300
topics: list[TitleSummary] = []
async def on_topic(topic: TitleSummary):
topics.append(topic)
return await on_topic_callback(topic)
topic_detector = TranscriptTopicDetectorProcessor(callback=on_topic)
topic_detector.set_pipeline(empty_pipeline)
for i in range(0, len(transcript.words), chunk_size):
chunk_words = transcript.words[i : i + chunk_size]
if not chunk_words:
continue
chunk_transcript = TranscriptType(
words=chunk_words, translation=transcript.translation
)
await topic_detector.push(chunk_transcript)
await topic_detector.flush()
return topics
async def generate_title(
topics: list[TitleSummary],
*,
on_title_callback: Callable,
empty_pipeline: EmptyPipeline,
logger: structlog.BoundLogger,
):
if not topics:
logger.warning("No topics for title generation")
return
processor = TranscriptFinalTitleProcessor(callback=on_title_callback)
processor.set_pipeline(empty_pipeline)
for topic in topics:
await processor.push(topic)
await processor.flush()
async def generate_summaries(
topics: list[TitleSummary],
transcript: Transcript,
*,
on_long_summary_callback: Callable,
on_short_summary_callback: Callable,
empty_pipeline: EmptyPipeline,
logger: structlog.BoundLogger,
):
if not topics:
logger.warning("No topics for summary generation")
return
processor = TranscriptFinalSummaryProcessor(
transcript=transcript,
callback=on_long_summary_callback,
on_short_summary=on_short_summary_callback,
)
processor.set_pipeline(empty_pipeline)
for topic in topics:
await processor.push(topic)
await processor.flush()

View File

@@ -0,0 +1,34 @@
from reflector.processors.file_transcript import FileTranscriptInput
from reflector.processors.file_transcript_auto import FileTranscriptAutoProcessor
from reflector.processors.types import Transcript as TranscriptType
async def transcribe_file_with_processor(
audio_url: str,
language: str,
processor_name: str | None = None,
) -> TranscriptType:
processor = (
FileTranscriptAutoProcessor(name=processor_name)
if processor_name
else FileTranscriptAutoProcessor()
)
input_data = FileTranscriptInput(audio_url=audio_url, language=language)
result: TranscriptType | None = None
async def capture_result(transcript):
nonlocal result
result = transcript
processor.on(capture_result)
await processor.push(input_data)
await processor.flush()
if not result:
processor_label = processor_name or "default"
raise ValueError(
f"No transcript captured from {processor_label} processor for audio: {audio_url}"
)
return result

View File

@@ -56,6 +56,16 @@ class FileTranscriptModalProcessor(FileTranscriptProcessor):
},
follow_redirects=True,
)
if response.status_code != 200:
error_body = response.text
self.logger.error(
"Modal API error",
audio_url=data.audio_url,
status_code=response.status_code,
error_body=error_body,
)
response.raise_for_status()
result = response.json()

View File

@@ -165,6 +165,7 @@ class SummaryBuilder:
self.llm: LLM = llm
self.model_name: str = llm.model_name
self.logger = logger or structlog.get_logger()
self.participant_instructions: str | None = None
if filename:
self.read_transcript_from_file(filename)
@@ -191,14 +192,61 @@ class SummaryBuilder:
self, prompt: str, output_cls: Type[T], tone_name: str | None = None
) -> T:
"""Generic function to get structured output from LLM for non-function-calling models."""
# Add participant instructions to the prompt if available
enhanced_prompt = self._enhance_prompt_with_participants(prompt)
return await self.llm.get_structured_response(
prompt, [self.transcript], output_cls, tone_name=tone_name
enhanced_prompt, [self.transcript], output_cls, tone_name=tone_name
)
async def _get_response(
self, prompt: str, texts: list[str], tone_name: str | None = None
) -> str:
"""Get text response with automatic participant instructions injection."""
enhanced_prompt = self._enhance_prompt_with_participants(prompt)
return await self.llm.get_response(enhanced_prompt, texts, tone_name=tone_name)
def _enhance_prompt_with_participants(self, prompt: str) -> str:
"""Add participant instructions to any prompt if participants are known."""
if self.participant_instructions:
self.logger.debug("Adding participant instructions to prompt")
return f"{prompt}\n\n{self.participant_instructions}"
return prompt
# ----------------------------------------------------------------------------
# Participants
# ----------------------------------------------------------------------------
def set_known_participants(self, participants: list[str]) -> None:
"""
Set known participants directly without LLM identification.
This is used when participants are already identified and stored.
They are appended at the end of the transcript, providing more context for the assistant.
"""
if not participants:
self.logger.warning("No participants provided")
return
self.logger.info(
"Using known participants",
participants=participants,
)
participants_md = self.format_list_md(participants)
self.transcript += f"\n\n# Participants\n\n{participants_md}"
# Set instructions that will be automatically added to all prompts
participants_list = ", ".join(participants)
self.participant_instructions = dedent(
f"""
# IMPORTANT: Participant Names
The following participants are identified in this conversation: {participants_list}
You MUST use these specific participant names when referring to people in your response.
Do NOT use generic terms like "a participant", "someone", "attendee", "Speaker 1", "Speaker 2", etc.
Always refer to people by their actual names (e.g., "John suggested..." not "A participant suggested...").
"""
).strip()
async def identify_participants(self) -> None:
"""
From a transcript, try to identify the participants using TreeSummarize with structured output.
@@ -232,6 +280,19 @@ class SummaryBuilder:
if unique_participants:
participants_md = self.format_list_md(unique_participants)
self.transcript += f"\n\n# Participants\n\n{participants_md}"
# Set instructions that will be automatically added to all prompts
participants_list = ", ".join(unique_participants)
self.participant_instructions = dedent(
f"""
# IMPORTANT: Participant Names
The following participants are identified in this conversation: {participants_list}
You MUST use these specific participant names when referring to people in your response.
Do NOT use generic terms like "a participant", "someone", "attendee", "Speaker 1", "Speaker 2", etc.
Always refer to people by their actual names (e.g., "John suggested..." not "A participant suggested...").
"""
).strip()
else:
self.logger.warning("No participants identified in the transcript")
@@ -318,13 +379,13 @@ class SummaryBuilder:
for subject in self.subjects:
detailed_prompt = DETAILED_SUBJECT_PROMPT_TEMPLATE.format(subject=subject)
detailed_response = await self.llm.get_response(
detailed_response = await self._get_response(
detailed_prompt, [self.transcript], tone_name="Topic assistant"
)
paragraph_prompt = PARAGRAPH_SUMMARY_PROMPT
paragraph_response = await self.llm.get_response(
paragraph_response = await self._get_response(
paragraph_prompt, [str(detailed_response)], tone_name="Topic summarizer"
)
@@ -345,7 +406,7 @@ class SummaryBuilder:
recap_prompt = RECAP_PROMPT
recap_response = await self.llm.get_response(
recap_response = await self._get_response(
recap_prompt, [summaries_text], tone_name="Recap summarizer"
)

View File

@@ -26,7 +26,25 @@ class TranscriptFinalSummaryProcessor(Processor):
async def get_summary_builder(self, text) -> SummaryBuilder:
builder = SummaryBuilder(self.llm, logger=self.logger)
builder.set_transcript(text)
await builder.identify_participants()
# Use known participants if available, otherwise identify them
if self.transcript and self.transcript.participants:
# Extract participant names from the stored participants
participant_names = [p.name for p in self.transcript.participants if p.name]
if participant_names:
self.logger.info(
f"Using {len(participant_names)} known participants from transcript"
)
builder.set_known_participants(participant_names)
else:
self.logger.info(
"Participants field exists but is empty, identifying participants"
)
await builder.identify_participants()
else:
self.logger.info("No participants stored, identifying participants")
await builder.identify_participants()
await builder.generate_summary()
return builder
@@ -49,18 +67,30 @@ class TranscriptFinalSummaryProcessor(Processor):
speakermap = {}
if self.transcript:
speakermap = {
participant["speaker"]: participant["name"]
for participant in self.transcript.participants
p.speaker: p.name
for p in (self.transcript.participants or [])
if p.speaker is not None and p.name
}
self.logger.info(
f"Built speaker map with {len(speakermap)} participants",
speakermap=speakermap,
)
# build the transcript as a single string
# XXX: unsure if the participants name as replaced directly in speaker ?
# Replace speaker IDs with actual participant names if available
text_transcript = []
unique_speakers = set()
for topic in self.chunks:
for segment in topic.transcript.as_segments():
name = speakermap.get(segment.speaker, f"Speaker {segment.speaker}")
unique_speakers.add((segment.speaker, name))
text_transcript.append(f"{name}: {segment.text}")
self.logger.info(
f"Built transcript with {len(unique_speakers)} unique speakers",
speakers=list(unique_speakers),
)
text_transcript = "\n".join(text_transcript)
last_chunk = self.chunks[-1]

View File

@@ -1,6 +1,6 @@
from textwrap import dedent
from pydantic import BaseModel, Field
from pydantic import AliasChoices, BaseModel, Field
from reflector.llm import LLM
from reflector.processors.base import Processor
@@ -34,8 +34,14 @@ TOPIC_PROMPT = dedent(
class TopicResponse(BaseModel):
"""Structured response for topic detection"""
title: str = Field(description="A descriptive title for the topic being discussed")
summary: str = Field(description="A concise 1-2 sentence summary of the discussion")
title: str = Field(
description="A descriptive title for the topic being discussed",
validation_alias=AliasChoices("title", "Title"),
)
summary: str = Field(
description="A concise 1-2 sentence summary of the discussion",
validation_alias=AliasChoices("summary", "Summary"),
)
class TranscriptTopicDetectorProcessor(Processor):

View File

@@ -0,0 +1,5 @@
from typing import Literal
Platform = Literal["whereby", "daily"]
WHEREBY_PLATFORM: Platform = "whereby"
DAILY_PLATFORM: Platform = "daily"

View File

@@ -1,6 +1,7 @@
from pydantic.types import PositiveInt
from pydantic_settings import BaseSettings, SettingsConfigDict
from reflector.schemas.platform import WHEREBY_PLATFORM, Platform
from reflector.utils.string import NonEmptyString
@@ -47,14 +48,17 @@ class Settings(BaseSettings):
TRANSCRIPT_STORAGE_AWS_ACCESS_KEY_ID: str | None = None
TRANSCRIPT_STORAGE_AWS_SECRET_ACCESS_KEY: str | None = None
# Recording storage
RECORDING_STORAGE_BACKEND: str | None = None
# Platform-specific recording storage (follows {PREFIX}_STORAGE_AWS_{CREDENTIAL} pattern)
# Whereby storage configuration
WHEREBY_STORAGE_AWS_BUCKET_NAME: str | None = None
WHEREBY_STORAGE_AWS_REGION: str | None = None
WHEREBY_STORAGE_AWS_ACCESS_KEY_ID: str | None = None
WHEREBY_STORAGE_AWS_SECRET_ACCESS_KEY: str | None = None
# Recording storage configuration for AWS
RECORDING_STORAGE_AWS_BUCKET_NAME: str = "recording-bucket"
RECORDING_STORAGE_AWS_REGION: str = "us-east-1"
RECORDING_STORAGE_AWS_ACCESS_KEY_ID: str | None = None
RECORDING_STORAGE_AWS_SECRET_ACCESS_KEY: str | None = None
# Daily.co storage configuration
DAILYCO_STORAGE_AWS_BUCKET_NAME: str | None = None
DAILYCO_STORAGE_AWS_REGION: str | None = None
DAILYCO_STORAGE_AWS_ROLE_ARN: str | None = None
# Translate into the target language
TRANSLATION_BACKEND: str = "passthrough"
@@ -124,11 +128,20 @@ class Settings(BaseSettings):
WHEREBY_API_URL: str = "https://api.whereby.dev/v1"
WHEREBY_API_KEY: NonEmptyString | None = None
WHEREBY_WEBHOOK_SECRET: str | None = None
AWS_WHEREBY_ACCESS_KEY_ID: str | None = None
AWS_WHEREBY_ACCESS_KEY_SECRET: str | None = None
AWS_PROCESS_RECORDING_QUEUE_URL: str | None = None
SQS_POLLING_TIMEOUT_SECONDS: int = 60
# Daily.co integration
DAILY_API_KEY: str | None = None
DAILY_WEBHOOK_SECRET: str | None = None
DAILY_SUBDOMAIN: str | None = None
DAILY_WEBHOOK_UUID: str | None = (
None # Webhook UUID for this environment. Not used by production code
)
# Platform Configuration
DEFAULT_VIDEO_PLATFORM: Platform = WHEREBY_PLATFORM
# Zulip integration
ZULIP_REALM: str | None = None
ZULIP_API_KEY: str | None = None

View File

@@ -3,6 +3,13 @@ from reflector.settings import settings
def get_transcripts_storage() -> Storage:
"""
Get storage for processed transcript files (master credentials).
Also use this for ALL our file operations with bucket override:
master = get_transcripts_storage()
master.delete_file(key, bucket=recording.bucket_name)
"""
assert settings.TRANSCRIPT_STORAGE_BACKEND
return Storage.get_instance(
name=settings.TRANSCRIPT_STORAGE_BACKEND,
@@ -10,8 +17,53 @@ def get_transcripts_storage() -> Storage:
)
def get_recordings_storage() -> Storage:
def get_whereby_storage() -> Storage:
"""
Get storage config for Whereby (for passing to Whereby API).
Usage:
whereby_storage = get_whereby_storage()
key_id, secret = whereby_storage.key_credentials
whereby_api.create_meeting(
bucket=whereby_storage.bucket_name,
access_key_id=key_id,
secret=secret,
)
Do NOT use for our file operations - use get_transcripts_storage() instead.
"""
if not settings.WHEREBY_STORAGE_AWS_BUCKET_NAME:
raise ValueError(
"WHEREBY_STORAGE_AWS_BUCKET_NAME required for Whereby with AWS storage"
)
return Storage.get_instance(
name=settings.RECORDING_STORAGE_BACKEND,
settings_prefix="RECORDING_STORAGE_",
name="aws",
settings_prefix="WHEREBY_STORAGE_",
)
def get_dailyco_storage() -> Storage:
"""
Get storage config for Daily.co (for passing to Daily API).
Usage:
daily_storage = get_dailyco_storage()
daily_api.create_meeting(
bucket=daily_storage.bucket_name,
region=daily_storage.region,
role_arn=daily_storage.role_credential,
)
Do NOT use for our file operations - use get_transcripts_storage() instead.
"""
# Fail fast if platform-specific config missing
if not settings.DAILYCO_STORAGE_AWS_BUCKET_NAME:
raise ValueError(
"DAILYCO_STORAGE_AWS_BUCKET_NAME required for Daily.co with AWS storage"
)
return Storage.get_instance(
name="aws",
settings_prefix="DAILYCO_STORAGE_",
)

View File

@@ -1,10 +1,23 @@
import importlib
from typing import BinaryIO, Union
from pydantic import BaseModel
from reflector.settings import settings
class StorageError(Exception):
"""Base exception for storage operations."""
pass
class StoragePermissionError(StorageError):
"""Exception raised when storage operation fails due to permission issues."""
pass
class FileResult(BaseModel):
filename: str
url: str
@@ -36,26 +49,113 @@ class Storage:
return cls._registry[name](**config)
async def put_file(self, filename: str, data: bytes) -> FileResult:
return await self._put_file(filename, data)
async def _put_file(self, filename: str, data: bytes) -> FileResult:
# Credential properties for API passthrough
@property
def bucket_name(self) -> str:
"""Default bucket name for this storage instance."""
raise NotImplementedError
async def delete_file(self, filename: str):
return await self._delete_file(filename)
async def _delete_file(self, filename: str):
@property
def region(self) -> str:
"""AWS region for this storage instance."""
raise NotImplementedError
async def get_file_url(self, filename: str) -> str:
return await self._get_file_url(filename)
@property
def access_key_id(self) -> str | None:
"""AWS access key ID (None for role-based auth). Prefer key_credentials property."""
return None
async def _get_file_url(self, filename: str) -> str:
@property
def secret_access_key(self) -> str | None:
"""AWS secret access key (None for role-based auth). Prefer key_credentials property."""
return None
@property
def role_arn(self) -> str | None:
"""AWS IAM role ARN for role-based auth (None for key-based auth). Prefer role_credential property."""
return None
@property
def key_credentials(self) -> tuple[str, str]:
"""
Get (access_key_id, secret_access_key) for key-based auth.
Raises ValueError if storage uses IAM role instead.
"""
raise NotImplementedError
async def get_file(self, filename: str):
return await self._get_file(filename)
async def _get_file(self, filename: str):
@property
def role_credential(self) -> str:
"""
Get IAM role ARN for role-based auth.
Raises ValueError if storage uses access keys instead.
"""
raise NotImplementedError
async def put_file(
self, filename: str, data: Union[bytes, BinaryIO], *, bucket: str | None = None
) -> FileResult:
"""Upload data. bucket: override instance default if provided."""
return await self._put_file(filename, data, bucket=bucket)
async def _put_file(
self, filename: str, data: Union[bytes, BinaryIO], *, bucket: str | None = None
) -> FileResult:
raise NotImplementedError
async def delete_file(self, filename: str, *, bucket: str | None = None):
"""Delete file. bucket: override instance default if provided."""
return await self._delete_file(filename, bucket=bucket)
async def _delete_file(self, filename: str, *, bucket: str | None = None):
raise NotImplementedError
async def get_file_url(
self,
filename: str,
operation: str = "get_object",
expires_in: int = 3600,
*,
bucket: str | None = None,
) -> str:
"""Generate presigned URL. bucket: override instance default if provided."""
return await self._get_file_url(filename, operation, expires_in, bucket=bucket)
async def _get_file_url(
self,
filename: str,
operation: str = "get_object",
expires_in: int = 3600,
*,
bucket: str | None = None,
) -> str:
raise NotImplementedError
async def get_file(self, filename: str, *, bucket: str | None = None):
"""Download file. bucket: override instance default if provided."""
return await self._get_file(filename, bucket=bucket)
async def _get_file(self, filename: str, *, bucket: str | None = None):
raise NotImplementedError
async def list_objects(
self, prefix: str = "", *, bucket: str | None = None
) -> list[str]:
"""List object keys. bucket: override instance default if provided."""
return await self._list_objects(prefix, bucket=bucket)
async def _list_objects(
self, prefix: str = "", *, bucket: str | None = None
) -> list[str]:
raise NotImplementedError
async def stream_to_fileobj(
self, filename: str, fileobj: BinaryIO, *, bucket: str | None = None
):
"""Stream file directly to file object without loading into memory.
bucket: override instance default if provided."""
return await self._stream_to_fileobj(filename, fileobj, bucket=bucket)
async def _stream_to_fileobj(
self, filename: str, fileobj: BinaryIO, *, bucket: str | None = None
):
raise NotImplementedError

View File

@@ -1,79 +1,236 @@
from functools import wraps
from typing import BinaryIO, Union
import aioboto3
from botocore.config import Config
from botocore.exceptions import ClientError
from reflector.logger import logger
from reflector.storage.base import FileResult, Storage
from reflector.storage.base import FileResult, Storage, StoragePermissionError
def handle_s3_client_errors(operation_name: str):
"""Decorator to handle S3 ClientError with bucket-aware messaging.
Args:
operation_name: Human-readable operation name for error messages (e.g., "upload", "delete")
"""
def decorator(func):
@wraps(func)
async def wrapper(self, *args, **kwargs):
bucket = kwargs.get("bucket")
try:
return await func(self, *args, **kwargs)
except ClientError as e:
error_code = e.response.get("Error", {}).get("Code")
if error_code in ("AccessDenied", "NoSuchBucket"):
actual_bucket = bucket or self._bucket_name
bucket_context = (
f"overridden bucket '{actual_bucket}'"
if bucket
else f"default bucket '{actual_bucket}'"
)
raise StoragePermissionError(
f"S3 {operation_name} failed for {bucket_context}: {error_code}. "
f"Check TRANSCRIPT_STORAGE_AWS_* credentials have permission."
) from e
raise
return wrapper
return decorator
class AwsStorage(Storage):
"""AWS S3 storage with bucket override for multi-platform recording architecture.
Master credentials access all buckets via optional bucket parameter in operations."""
def __init__(
self,
aws_access_key_id: str,
aws_secret_access_key: str,
aws_bucket_name: str,
aws_region: str,
aws_access_key_id: str | None = None,
aws_secret_access_key: str | None = None,
aws_role_arn: str | None = None,
):
if not aws_access_key_id:
raise ValueError("Storage `aws_storage` require `aws_access_key_id`")
if not aws_secret_access_key:
raise ValueError("Storage `aws_storage` require `aws_secret_access_key`")
if not aws_bucket_name:
raise ValueError("Storage `aws_storage` require `aws_bucket_name`")
if not aws_region:
raise ValueError("Storage `aws_storage` require `aws_region`")
if not aws_access_key_id and not aws_role_arn:
raise ValueError(
"Storage `aws_storage` require either `aws_access_key_id` or `aws_role_arn`"
)
if aws_role_arn and (aws_access_key_id or aws_secret_access_key):
raise ValueError(
"Storage `aws_storage` cannot use both `aws_role_arn` and access keys"
)
super().__init__()
self.aws_bucket_name = aws_bucket_name
self._bucket_name = aws_bucket_name
self._region = aws_region
self._access_key_id = aws_access_key_id
self._secret_access_key = aws_secret_access_key
self._role_arn = aws_role_arn
self.aws_folder = ""
if "/" in aws_bucket_name:
self.aws_bucket_name, self.aws_folder = aws_bucket_name.split("/", 1)
self._bucket_name, self.aws_folder = aws_bucket_name.split("/", 1)
self.boto_config = Config(retries={"max_attempts": 3, "mode": "adaptive"})
self.session = aioboto3.Session(
aws_access_key_id=aws_access_key_id,
aws_secret_access_key=aws_secret_access_key,
region_name=aws_region,
)
self.base_url = f"https://{aws_bucket_name}.s3.amazonaws.com/"
self.base_url = f"https://{self._bucket_name}.s3.amazonaws.com/"
async def _put_file(self, filename: str, data: bytes) -> FileResult:
bucket = self.aws_bucket_name
folder = self.aws_folder
logger.info(f"Uploading {filename} to S3 {bucket}/{folder}")
s3filename = f"{folder}/{filename}" if folder else filename
async with self.session.client("s3") as client:
await client.put_object(
Bucket=bucket,
Key=s3filename,
Body=data,
# Implement credential properties
@property
def bucket_name(self) -> str:
return self._bucket_name
@property
def region(self) -> str:
return self._region
@property
def access_key_id(self) -> str | None:
return self._access_key_id
@property
def secret_access_key(self) -> str | None:
return self._secret_access_key
@property
def role_arn(self) -> str | None:
return self._role_arn
@property
def key_credentials(self) -> tuple[str, str]:
"""Get (access_key_id, secret_access_key) for key-based auth."""
if self._role_arn:
raise ValueError(
"Storage uses IAM role authentication. "
"Use role_credential property instead of key_credentials."
)
if not self._access_key_id or not self._secret_access_key:
raise ValueError("Storage access key credentials not configured")
return (self._access_key_id, self._secret_access_key)
async def _get_file_url(self, filename: str) -> FileResult:
bucket = self.aws_bucket_name
@property
def role_credential(self) -> str:
"""Get IAM role ARN for role-based auth."""
if self._access_key_id or self._secret_access_key:
raise ValueError(
"Storage uses access key authentication. "
"Use key_credentials property instead of role_credential."
)
if not self._role_arn:
raise ValueError("Storage IAM role ARN not configured")
return self._role_arn
@handle_s3_client_errors("upload")
async def _put_file(
self, filename: str, data: Union[bytes, BinaryIO], *, bucket: str | None = None
) -> FileResult:
actual_bucket = bucket or self._bucket_name
folder = self.aws_folder
s3filename = f"{folder}/{filename}" if folder else filename
async with self.session.client("s3") as client:
logger.info(f"Uploading {filename} to S3 {actual_bucket}/{folder}")
async with self.session.client("s3", config=self.boto_config) as client:
if isinstance(data, bytes):
await client.put_object(Bucket=actual_bucket, Key=s3filename, Body=data)
else:
# boto3 reads file-like object in chunks
# avoids creating extra memory copy vs bytes.getvalue() approach
await client.upload_fileobj(data, Bucket=actual_bucket, Key=s3filename)
url = await self._get_file_url(filename, bucket=bucket)
return FileResult(filename=filename, url=url)
@handle_s3_client_errors("presign")
async def _get_file_url(
self,
filename: str,
operation: str = "get_object",
expires_in: int = 3600,
*,
bucket: str | None = None,
) -> str:
actual_bucket = bucket or self._bucket_name
folder = self.aws_folder
s3filename = f"{folder}/{filename}" if folder else filename
async with self.session.client("s3", config=self.boto_config) as client:
presigned_url = await client.generate_presigned_url(
"get_object",
Params={"Bucket": bucket, "Key": s3filename},
ExpiresIn=3600,
operation,
Params={"Bucket": actual_bucket, "Key": s3filename},
ExpiresIn=expires_in,
)
return presigned_url
async def _delete_file(self, filename: str):
bucket = self.aws_bucket_name
@handle_s3_client_errors("delete")
async def _delete_file(self, filename: str, *, bucket: str | None = None):
actual_bucket = bucket or self._bucket_name
folder = self.aws_folder
logger.info(f"Deleting {filename} from S3 {bucket}/{folder}")
logger.info(f"Deleting {filename} from S3 {actual_bucket}/{folder}")
s3filename = f"{folder}/{filename}" if folder else filename
async with self.session.client("s3") as client:
await client.delete_object(Bucket=bucket, Key=s3filename)
async with self.session.client("s3", config=self.boto_config) as client:
await client.delete_object(Bucket=actual_bucket, Key=s3filename)
async def _get_file(self, filename: str):
bucket = self.aws_bucket_name
@handle_s3_client_errors("download")
async def _get_file(self, filename: str, *, bucket: str | None = None):
actual_bucket = bucket or self._bucket_name
folder = self.aws_folder
logger.info(f"Downloading {filename} from S3 {bucket}/{folder}")
logger.info(f"Downloading {filename} from S3 {actual_bucket}/{folder}")
s3filename = f"{folder}/{filename}" if folder else filename
async with self.session.client("s3") as client:
response = await client.get_object(Bucket=bucket, Key=s3filename)
async with self.session.client("s3", config=self.boto_config) as client:
response = await client.get_object(Bucket=actual_bucket, Key=s3filename)
return await response["Body"].read()
@handle_s3_client_errors("list_objects")
async def _list_objects(
self, prefix: str = "", *, bucket: str | None = None
) -> list[str]:
actual_bucket = bucket or self._bucket_name
folder = self.aws_folder
# Combine folder and prefix
s3prefix = f"{folder}/{prefix}" if folder else prefix
logger.info(f"Listing objects from S3 {actual_bucket} with prefix '{s3prefix}'")
keys = []
async with self.session.client("s3", config=self.boto_config) as client:
paginator = client.get_paginator("list_objects_v2")
async for page in paginator.paginate(Bucket=actual_bucket, Prefix=s3prefix):
if "Contents" in page:
for obj in page["Contents"]:
# Strip folder prefix from keys if present
key = obj["Key"]
if folder:
if key.startswith(f"{folder}/"):
key = key[len(folder) + 1 :]
elif key == folder:
# Skip folder marker itself
continue
keys.append(key)
return keys
@handle_s3_client_errors("stream")
async def _stream_to_fileobj(
self, filename: str, fileobj: BinaryIO, *, bucket: str | None = None
):
"""Stream file from S3 directly to file object without loading into memory."""
actual_bucket = bucket or self._bucket_name
folder = self.aws_folder
logger.info(f"Streaming {filename} from S3 {actual_bucket}/{folder}")
s3filename = f"{folder}/{filename}" if folder else filename
async with self.session.client("s3", config=self.boto_config) as client:
await client.download_fileobj(
Bucket=actual_bucket, Key=s3filename, Fileobj=fileobj
)
Storage.register("aws", AwsStorage)

View File

@@ -0,0 +1,26 @@
from reflector.utils.string import NonEmptyString
DailyRoomName = str
def extract_base_room_name(daily_room_name: DailyRoomName) -> NonEmptyString:
"""
Extract base room name from Daily.co timestamped room name.
Daily.co creates rooms with timestamp suffix: {base_name}-YYYYMMDDHHMMSS
This function removes the timestamp to get the original room name.
Examples:
"daily-20251020193458""daily"
"daily-2-20251020193458""daily-2"
"my-room-name-20251020193458""my-room-name"
Args:
daily_room_name: Full Daily.co room name with optional timestamp
Returns:
Base room name without timestamp suffix
"""
base_name = daily_room_name.rsplit("-", 1)[0]
assert base_name, f"Extracted base name is empty from: {daily_room_name}"
return base_name

View File

@@ -0,0 +1,9 @@
from datetime import datetime, timezone
def parse_datetime_with_timezone(iso_string: str) -> datetime:
"""Parse ISO datetime string and ensure timezone awareness (defaults to UTC if naive)."""
dt = datetime.fromisoformat(iso_string)
if dt.tzinfo is None:
dt = dt.replace(tzinfo=timezone.utc)
return dt

View File

@@ -1,4 +1,4 @@
from typing import Annotated
from typing import Annotated, TypeVar
from pydantic import Field, TypeAdapter, constr
@@ -21,3 +21,12 @@ def try_parse_non_empty_string(s: str) -> NonEmptyString | None:
if not s:
return None
return parse_non_empty_string(s)
T = TypeVar("T", bound=str)
def assert_equal[T](s1: T, s2: T) -> T:
if s1 != s2:
raise ValueError(f"assert_equal: {s1} != {s2}")
return s1

View File

@@ -0,0 +1,37 @@
"""URL manipulation utilities."""
from urllib.parse import parse_qs, urlencode, urlparse, urlunparse
def add_query_param(url: str, key: str, value: str) -> str:
"""
Add or update a query parameter in a URL.
Properly handles URLs with or without existing query parameters,
preserving fragments and encoding special characters.
Args:
url: The URL to modify
key: The query parameter name
value: The query parameter value
Returns:
The URL with the query parameter added or updated
Examples:
>>> add_query_param("https://example.com/room", "t", "token123")
'https://example.com/room?t=token123'
>>> add_query_param("https://example.com/room?existing=param", "t", "token123")
'https://example.com/room?existing=param&t=token123'
"""
parsed = urlparse(url)
query_params = parse_qs(parsed.query, keep_blank_values=True)
query_params[key] = [value]
new_query = urlencode(query_params, doseq=True)
new_parsed = parsed._replace(query=new_query)
return urlunparse(new_parsed)

View File

@@ -0,0 +1,11 @@
from .base import VideoPlatformClient
from .models import MeetingData, VideoPlatformConfig
from .registry import get_platform_client, register_platform
__all__ = [
"VideoPlatformClient",
"VideoPlatformConfig",
"MeetingData",
"get_platform_client",
"register_platform",
]

View File

@@ -0,0 +1,54 @@
from abc import ABC, abstractmethod
from datetime import datetime
from typing import TYPE_CHECKING, Any, Dict, List, Optional
from ..schemas.platform import Platform
from ..utils.string import NonEmptyString
from .models import MeetingData, VideoPlatformConfig
if TYPE_CHECKING:
from reflector.db.rooms import Room
# separator doesn't guarantee there's no more "ROOM_PREFIX_SEPARATOR" strings in room name
ROOM_PREFIX_SEPARATOR = "-"
class VideoPlatformClient(ABC):
PLATFORM_NAME: Platform
def __init__(self, config: VideoPlatformConfig):
self.config = config
@abstractmethod
async def create_meeting(
self, room_name_prefix: NonEmptyString, end_date: datetime, room: "Room"
) -> MeetingData:
pass
@abstractmethod
async def get_room_sessions(self, room_name: str) -> List[Any] | None:
pass
@abstractmethod
async def delete_room(self, room_name: str) -> bool:
pass
@abstractmethod
async def upload_logo(self, room_name: str, logo_path: str) -> bool:
pass
@abstractmethod
def verify_webhook_signature(
self, body: bytes, signature: str, timestamp: Optional[str] = None
) -> bool:
pass
def format_recording_config(self, room: "Room") -> Dict[str, Any]:
if room.recording_type == "cloud" and self.config.s3_bucket:
return {
"type": room.recording_type,
"bucket": self.config.s3_bucket,
"region": self.config.s3_region,
"trigger": room.recording_trigger,
}
return {"type": room.recording_type}

View File

@@ -0,0 +1,198 @@
import base64
import hmac
from datetime import datetime
from hashlib import sha256
from http import HTTPStatus
from typing import Any, Dict, List, Optional
import httpx
from reflector.db.rooms import Room
from reflector.logger import logger
from reflector.storage import get_dailyco_storage
from ..schemas.platform import Platform
from ..utils.daily import DailyRoomName
from ..utils.string import NonEmptyString
from .base import ROOM_PREFIX_SEPARATOR, VideoPlatformClient
from .models import MeetingData, RecordingType, VideoPlatformConfig
class DailyClient(VideoPlatformClient):
PLATFORM_NAME: Platform = "daily"
TIMEOUT = 10
BASE_URL = "https://api.daily.co/v1"
TIMESTAMP_FORMAT = "%Y%m%d%H%M%S"
RECORDING_NONE: RecordingType = "none"
RECORDING_CLOUD: RecordingType = "cloud"
def __init__(self, config: VideoPlatformConfig):
super().__init__(config)
self.headers = {
"Authorization": f"Bearer {config.api_key}",
"Content-Type": "application/json",
}
async def create_meeting(
self, room_name_prefix: NonEmptyString, end_date: datetime, room: Room
) -> MeetingData:
"""
Daily.co rooms vs meetings:
- We create a NEW Daily.co room for each Reflector meeting
- Daily.co meeting/session starts automatically when first participant joins
- Room auto-deletes after exp time
- Meeting.room_name stores the timestamped Daily.co room name
"""
timestamp = datetime.now().strftime(self.TIMESTAMP_FORMAT)
room_name = f"{room_name_prefix}{ROOM_PREFIX_SEPARATOR}{timestamp}"
data = {
"name": room_name,
"privacy": "private" if room.is_locked else "public",
"properties": {
"enable_recording": "raw-tracks"
if room.recording_type != self.RECORDING_NONE
else False,
"enable_chat": True,
"enable_screenshare": True,
"start_video_off": False,
"start_audio_off": False,
"exp": int(end_date.timestamp()),
},
}
# Get storage config for passing to Daily API
daily_storage = get_dailyco_storage()
assert daily_storage.bucket_name, "S3 bucket must be configured"
data["properties"]["recordings_bucket"] = {
"bucket_name": daily_storage.bucket_name,
"bucket_region": daily_storage.region,
"assume_role_arn": daily_storage.role_credential,
"allow_api_access": True,
}
async with httpx.AsyncClient() as client:
response = await client.post(
f"{self.BASE_URL}/rooms",
headers=self.headers,
json=data,
timeout=self.TIMEOUT,
)
if response.status_code >= 400:
logger.error(
"Daily.co API error",
status_code=response.status_code,
response_body=response.text,
request_data=data,
)
response.raise_for_status()
result = response.json()
room_url = result["url"]
return MeetingData(
meeting_id=result["id"],
room_name=result["name"],
room_url=room_url,
host_room_url=room_url,
platform=self.PLATFORM_NAME,
extra_data=result,
)
async def get_room_sessions(self, room_name: str) -> List[Any] | None:
# no such api
return None
async def get_room_presence(self, room_name: str) -> Dict[str, Any]:
async with httpx.AsyncClient() as client:
response = await client.get(
f"{self.BASE_URL}/rooms/{room_name}/presence",
headers=self.headers,
timeout=self.TIMEOUT,
)
response.raise_for_status()
return response.json()
async def get_meeting_participants(self, meeting_id: str) -> Dict[str, Any]:
async with httpx.AsyncClient() as client:
response = await client.get(
f"{self.BASE_URL}/meetings/{meeting_id}/participants",
headers=self.headers,
timeout=self.TIMEOUT,
)
response.raise_for_status()
return response.json()
async def get_recording(self, recording_id: str) -> Dict[str, Any]:
async with httpx.AsyncClient() as client:
response = await client.get(
f"{self.BASE_URL}/recordings/{recording_id}",
headers=self.headers,
timeout=self.TIMEOUT,
)
response.raise_for_status()
return response.json()
async def delete_room(self, room_name: str) -> bool:
async with httpx.AsyncClient() as client:
response = await client.delete(
f"{self.BASE_URL}/rooms/{room_name}",
headers=self.headers,
timeout=self.TIMEOUT,
)
return response.status_code in (HTTPStatus.OK, HTTPStatus.NOT_FOUND)
async def upload_logo(self, room_name: str, logo_path: str) -> bool:
return True
def verify_webhook_signature(
self, body: bytes, signature: str, timestamp: Optional[str] = None
) -> bool:
"""Verify Daily.co webhook signature.
Daily.co uses:
- X-Webhook-Signature header
- X-Webhook-Timestamp header
- Signature format: HMAC-SHA256(base64_decode(secret), timestamp + '.' + body)
- Result is base64 encoded
"""
if not signature or not timestamp:
return False
try:
secret_bytes = base64.b64decode(self.config.webhook_secret)
signed_content = timestamp.encode() + b"." + body
expected = hmac.new(secret_bytes, signed_content, sha256).digest()
expected_b64 = base64.b64encode(expected).decode()
return hmac.compare_digest(expected_b64, signature)
except Exception as e:
logger.error("Daily.co webhook signature verification failed", exc_info=e)
return False
async def create_meeting_token(
self,
room_name: DailyRoomName,
enable_recording: bool,
user_id: Optional[str] = None,
) -> str:
data = {"properties": {"room_name": room_name}}
if enable_recording:
data["properties"]["start_cloud_recording"] = True
data["properties"]["enable_recording_ui"] = False
if user_id:
data["properties"]["user_id"] = user_id
async with httpx.AsyncClient() as client:
response = await client.post(
f"{self.BASE_URL}/meeting-tokens",
headers=self.headers,
json=data,
timeout=self.TIMEOUT,
)
response.raise_for_status()
return response.json()["token"]

View File

@@ -0,0 +1,62 @@
from typing import Optional
from reflector.settings import settings
from reflector.storage import get_dailyco_storage, get_whereby_storage
from ..schemas.platform import WHEREBY_PLATFORM, Platform
from .base import VideoPlatformClient, VideoPlatformConfig
from .registry import get_platform_client
def get_platform_config(platform: Platform) -> VideoPlatformConfig:
if platform == WHEREBY_PLATFORM:
if not settings.WHEREBY_API_KEY:
raise ValueError(
"WHEREBY_API_KEY is required when platform='whereby'. "
"Set WHEREBY_API_KEY environment variable."
)
whereby_storage = get_whereby_storage()
key_id, secret = whereby_storage.key_credentials
return VideoPlatformConfig(
api_key=settings.WHEREBY_API_KEY,
webhook_secret=settings.WHEREBY_WEBHOOK_SECRET or "",
api_url=settings.WHEREBY_API_URL,
s3_bucket=whereby_storage.bucket_name,
s3_region=whereby_storage.region,
aws_access_key_id=key_id,
aws_access_key_secret=secret,
)
elif platform == "daily":
if not settings.DAILY_API_KEY:
raise ValueError(
"DAILY_API_KEY is required when platform='daily'. "
"Set DAILY_API_KEY environment variable."
)
if not settings.DAILY_SUBDOMAIN:
raise ValueError(
"DAILY_SUBDOMAIN is required when platform='daily'. "
"Set DAILY_SUBDOMAIN environment variable."
)
daily_storage = get_dailyco_storage()
return VideoPlatformConfig(
api_key=settings.DAILY_API_KEY,
webhook_secret=settings.DAILY_WEBHOOK_SECRET or "",
subdomain=settings.DAILY_SUBDOMAIN,
s3_bucket=daily_storage.bucket_name,
s3_region=daily_storage.region,
aws_role_arn=daily_storage.role_credential,
)
else:
raise ValueError(f"Unknown platform: {platform}")
def create_platform_client(platform: Platform) -> VideoPlatformClient:
config = get_platform_config(platform)
return get_platform_client(platform, config)
def get_platform(room_platform: Optional[Platform] = None) -> Platform:
if room_platform:
return room_platform
return settings.DEFAULT_VIDEO_PLATFORM

View File

@@ -0,0 +1,40 @@
from typing import Any, Dict, Literal, Optional
from pydantic import BaseModel, Field
from reflector.schemas.platform import WHEREBY_PLATFORM, Platform
RecordingType = Literal["none", "local", "cloud"]
class MeetingData(BaseModel):
platform: Platform
meeting_id: str = Field(description="Platform-specific meeting identifier")
room_url: str = Field(description="URL for participants to join")
host_room_url: str = Field(description="URL for hosts (may be same as room_url)")
room_name: str = Field(description="Human-readable room name")
extra_data: Dict[str, Any] = Field(default_factory=dict)
class Config:
json_schema_extra = {
"example": {
"platform": WHEREBY_PLATFORM,
"meeting_id": "12345678",
"room_url": "https://subdomain.whereby.com/room-20251008120000",
"host_room_url": "https://subdomain.whereby.com/room-20251008120000?roomKey=abc123",
"room_name": "room-20251008120000",
}
}
class VideoPlatformConfig(BaseModel):
api_key: str
webhook_secret: str
api_url: Optional[str] = None
subdomain: Optional[str] = None # Whereby/Daily subdomain
s3_bucket: Optional[str] = None
s3_region: Optional[str] = None
# Whereby uses access keys, Daily uses IAM role
aws_access_key_id: Optional[str] = None
aws_access_key_secret: Optional[str] = None
aws_role_arn: Optional[str] = None

View File

@@ -0,0 +1,35 @@
from typing import Dict, Type
from ..schemas.platform import DAILY_PLATFORM, WHEREBY_PLATFORM, Platform
from .base import VideoPlatformClient, VideoPlatformConfig
_PLATFORMS: Dict[Platform, Type[VideoPlatformClient]] = {}
def register_platform(name: Platform, client_class: Type[VideoPlatformClient]):
_PLATFORMS[name] = client_class
def get_platform_client(
platform: Platform, config: VideoPlatformConfig
) -> VideoPlatformClient:
if platform not in _PLATFORMS:
raise ValueError(f"Unknown video platform: {platform}")
client_class = _PLATFORMS[platform]
return client_class(config)
def get_available_platforms() -> list[Platform]:
return list(_PLATFORMS.keys())
def _register_builtin_platforms():
from .daily import DailyClient # noqa: PLC0415
from .whereby import WherebyClient # noqa: PLC0415
register_platform(WHEREBY_PLATFORM, WherebyClient)
register_platform(DAILY_PLATFORM, DailyClient)
_register_builtin_platforms()

View File

@@ -0,0 +1,141 @@
import hmac
import json
import re
import time
from datetime import datetime
from hashlib import sha256
from typing import Any, Dict, Optional
import httpx
from reflector.db.rooms import Room
from reflector.storage import get_whereby_storage
from ..schemas.platform import WHEREBY_PLATFORM, Platform
from ..utils.string import NonEmptyString
from .base import (
MeetingData,
VideoPlatformClient,
VideoPlatformConfig,
)
from .whereby_utils import whereby_room_name_prefix
class WherebyClient(VideoPlatformClient):
PLATFORM_NAME: Platform = WHEREBY_PLATFORM
TIMEOUT = 10 # seconds
MAX_ELAPSED_TIME = 60 * 1000 # 1 minute in milliseconds
def __init__(self, config: VideoPlatformConfig):
super().__init__(config)
self.headers = {
"Content-Type": "application/json; charset=utf-8",
"Authorization": f"Bearer {config.api_key}",
}
async def create_meeting(
self, room_name_prefix: NonEmptyString, end_date: datetime, room: Room
) -> MeetingData:
data = {
"isLocked": room.is_locked,
"roomNamePrefix": whereby_room_name_prefix(room_name_prefix),
"roomNamePattern": "uuid",
"roomMode": room.room_mode,
"endDate": end_date.isoformat(),
"fields": ["hostRoomUrl"],
}
if room.recording_type == "cloud":
# Get storage config for passing credentials to Whereby API
whereby_storage = get_whereby_storage()
key_id, secret = whereby_storage.key_credentials
data["recording"] = {
"type": room.recording_type,
"destination": {
"provider": "s3",
"bucket": whereby_storage.bucket_name,
"accessKeyId": key_id,
"accessKeySecret": secret,
"fileFormat": "mp4",
},
"startTrigger": room.recording_trigger,
}
async with httpx.AsyncClient() as client:
response = await client.post(
f"{self.config.api_url}/meetings",
headers=self.headers,
json=data,
timeout=self.TIMEOUT,
)
response.raise_for_status()
result = response.json()
return MeetingData(
meeting_id=result["meetingId"],
room_name=result["roomName"],
room_url=result["roomUrl"],
host_room_url=result["hostRoomUrl"],
platform=self.PLATFORM_NAME,
extra_data=result,
)
async def get_room_sessions(self, room_name: str) -> Dict[str, Any]:
async with httpx.AsyncClient() as client:
response = await client.get(
f"{self.config.api_url}/insights/room-sessions?roomName={room_name}",
headers=self.headers,
timeout=self.TIMEOUT,
)
response.raise_for_status()
return response.json().get("results", [])
async def delete_room(self, room_name: str) -> bool:
return True
async def upload_logo(self, room_name: str, logo_path: str) -> bool:
async with httpx.AsyncClient() as client:
with open(logo_path, "rb") as f:
response = await client.put(
f"{self.config.api_url}/rooms/{room_name}/theme/logo",
headers={
"Authorization": f"Bearer {self.config.api_key}",
},
timeout=self.TIMEOUT,
files={"image": f},
)
response.raise_for_status()
return True
def verify_webhook_signature(
self, body: bytes, signature: str, timestamp: Optional[str] = None
) -> bool:
if not signature:
return False
matches = re.match(r"t=(.*),v1=(.*)", signature)
if not matches:
return False
ts, sig = matches.groups()
current_time = int(time.time() * 1000)
diff_time = current_time - int(ts) * 1000
if diff_time >= self.MAX_ELAPSED_TIME:
return False
body_dict = json.loads(body)
signed_payload = f"{ts}.{json.dumps(body_dict, separators=(',', ':'))}"
hmac_obj = hmac.new(
self.config.webhook_secret.encode("utf-8"),
signed_payload.encode("utf-8"),
sha256,
)
expected_signature = hmac_obj.hexdigest()
try:
return hmac.compare_digest(
expected_signature.encode("utf-8"), sig.encode("utf-8")
)
except Exception:
return False

View File

@@ -0,0 +1,38 @@
import re
from datetime import datetime
from reflector.utils.datetime import parse_datetime_with_timezone
from reflector.utils.string import NonEmptyString, parse_non_empty_string
from reflector.video_platforms.base import ROOM_PREFIX_SEPARATOR
def parse_whereby_recording_filename(
object_key: NonEmptyString,
) -> (NonEmptyString, datetime):
filename = parse_non_empty_string(object_key.rsplit(".", 1)[0])
timestamp_pattern = r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z)"
match = re.search(timestamp_pattern, filename)
if not match:
raise ValueError(f"No ISO timestamp found in filename: {object_key}")
timestamp_str = match.group(1)
timestamp_start = match.start(1)
room_name_part = filename[:timestamp_start]
if room_name_part.endswith(ROOM_PREFIX_SEPARATOR):
room_name_part = room_name_part[: -len(ROOM_PREFIX_SEPARATOR)]
else:
raise ValueError(
f"room name {room_name_part} doesnt have {ROOM_PREFIX_SEPARATOR} at the end of filename: {object_key}"
)
return parse_non_empty_string(room_name_part), parse_datetime_with_timezone(
timestamp_str
)
def whereby_room_name_prefix(room_name_prefix: NonEmptyString) -> NonEmptyString:
return room_name_prefix + ROOM_PREFIX_SEPARATOR
# room name comes with "/" from whereby api but lacks "/" e.g. in recording filenames
def room_name_to_whereby_api_room_name(room_name: NonEmptyString) -> NonEmptyString:
return f"/{room_name}"

View File

@@ -0,0 +1,233 @@
import json
from typing import Any, Dict, Literal
from fastapi import APIRouter, HTTPException, Request
from pydantic import BaseModel
from reflector.db.meetings import meetings_controller
from reflector.logger import logger as _logger
from reflector.settings import settings
from reflector.utils.daily import DailyRoomName
from reflector.video_platforms.factory import create_platform_client
from reflector.worker.process import process_multitrack_recording
router = APIRouter()
logger = _logger.bind(platform="daily")
class DailyTrack(BaseModel):
type: Literal["audio", "video"]
s3Key: str
size: int
class DailyWebhookEvent(BaseModel):
version: str
type: str
id: str
payload: Dict[str, Any]
event_ts: float
def _extract_room_name(event: DailyWebhookEvent) -> DailyRoomName | None:
"""Extract room name from Daily event payload.
Daily.co API inconsistency:
- participant.* events use "room" field
- recording.* events use "room_name" field
"""
return event.payload.get("room_name") or event.payload.get("room")
@router.post("/webhook")
async def webhook(request: Request):
"""Handle Daily webhook events.
Daily.co circuit-breaker: After 3+ failed responses (4xx/5xx), webhook
state→FAILED, stops sending events. Reset: scripts/recreate_daily_webhook.py
"""
body = await request.body()
signature = request.headers.get("X-Webhook-Signature", "")
timestamp = request.headers.get("X-Webhook-Timestamp", "")
client = create_platform_client("daily")
# TEMPORARY: Bypass signature check for testing
# TODO: Remove this after testing is complete
BYPASS_FOR_TESTING = True
if not BYPASS_FOR_TESTING:
if not client.verify_webhook_signature(body, signature, timestamp):
logger.warning(
"Invalid webhook signature",
signature=signature,
timestamp=timestamp,
has_body=bool(body),
)
raise HTTPException(status_code=401, detail="Invalid webhook signature")
try:
body_json = json.loads(body)
except json.JSONDecodeError:
raise HTTPException(status_code=422, detail="Invalid JSON")
if body_json.get("test") == "test":
logger.info("Received Daily webhook test event")
return {"status": "ok"}
# Parse as actual event
try:
event = DailyWebhookEvent(**body_json)
except Exception as e:
logger.error("Failed to parse webhook event", error=str(e), body=body.decode())
raise HTTPException(status_code=422, detail="Invalid event format")
# Handle participant events
if event.type == "participant.joined":
await _handle_participant_joined(event)
elif event.type == "participant.left":
await _handle_participant_left(event)
elif event.type == "recording.started":
await _handle_recording_started(event)
elif event.type == "recording.ready-to-download":
await _handle_recording_ready(event)
elif event.type == "recording.error":
await _handle_recording_error(event)
else:
logger.warning(
"Unhandled Daily webhook event type",
event_type=event.type,
payload=event.payload,
)
return {"status": "ok"}
async def _handle_participant_joined(event: DailyWebhookEvent):
daily_room_name = _extract_room_name(event)
if not daily_room_name:
logger.warning("participant.joined: no room in payload", payload=event.payload)
return
meeting = await meetings_controller.get_by_room_name(daily_room_name)
if meeting:
await meetings_controller.increment_num_clients(meeting.id)
logger.info(
"Participant joined",
meeting_id=meeting.id,
room_name=daily_room_name,
recording_type=meeting.recording_type,
recording_trigger=meeting.recording_trigger,
)
else:
logger.warning(
"participant.joined: meeting not found", room_name=daily_room_name
)
async def _handle_participant_left(event: DailyWebhookEvent):
room_name = _extract_room_name(event)
if not room_name:
return
meeting = await meetings_controller.get_by_room_name(room_name)
if meeting:
await meetings_controller.decrement_num_clients(meeting.id)
async def _handle_recording_started(event: DailyWebhookEvent):
room_name = _extract_room_name(event)
if not room_name:
logger.warning(
"recording.started: no room_name in payload", payload=event.payload
)
return
meeting = await meetings_controller.get_by_room_name(room_name)
if meeting:
logger.info(
"Recording started",
meeting_id=meeting.id,
room_name=room_name,
recording_id=event.payload.get("recording_id"),
platform="daily",
)
else:
logger.warning("recording.started: meeting not found", room_name=room_name)
async def _handle_recording_ready(event: DailyWebhookEvent):
"""Handle recording ready for download event.
Daily.co webhook payload for raw-tracks recordings:
{
"recording_id": "...",
"room_name": "test2-20251009192341",
"tracks": [
{"type": "audio", "s3Key": "monadical/test2-.../uuid-cam-audio-123.webm", "size": 400000},
{"type": "video", "s3Key": "monadical/test2-.../uuid-cam-video-456.webm", "size": 30000000}
]
}
"""
room_name = _extract_room_name(event)
recording_id = event.payload.get("recording_id")
tracks_raw = event.payload.get("tracks", [])
if not room_name or not tracks_raw:
logger.warning(
"recording.ready-to-download: missing room_name or tracks",
room_name=room_name,
has_tracks=bool(tracks_raw),
payload=event.payload,
)
return
try:
tracks = [DailyTrack(**t) for t in tracks_raw]
except Exception as e:
logger.error(
"recording.ready-to-download: invalid tracks structure",
error=str(e),
tracks=tracks_raw,
)
return
logger.info(
"Recording ready for download",
room_name=room_name,
recording_id=recording_id,
num_tracks=len(tracks),
platform="daily",
)
bucket_name = settings.DAILYCO_STORAGE_AWS_BUCKET_NAME
if not bucket_name:
logger.error(
"DAILYCO_STORAGE_AWS_BUCKET_NAME not configured; cannot process Daily recording"
)
return
track_keys = [t.s3Key for t in tracks if t.type == "audio"]
process_multitrack_recording.delay(
bucket_name=bucket_name,
daily_room_name=room_name,
recording_id=recording_id,
track_keys=track_keys,
)
async def _handle_recording_error(event: DailyWebhookEvent):
room_name = _extract_room_name(event)
error = event.payload.get("error", "Unknown error")
if room_name:
meeting = await meetings_controller.get_by_room_name(room_name)
if meeting:
logger.error(
"Recording error",
meeting_id=meeting.id,
room_name=room_name,
error=error,
platform="daily",
)

View File

@@ -15,9 +15,14 @@ from reflector.db.calendar_events import calendar_events_controller
from reflector.db.meetings import meetings_controller
from reflector.db.rooms import rooms_controller
from reflector.redis_cache import RedisAsyncLock
from reflector.schemas.platform import Platform
from reflector.services.ics_sync import ics_sync_service
from reflector.settings import settings
from reflector.whereby import create_meeting, upload_logo
from reflector.utils.url import add_query_param
from reflector.video_platforms.factory import (
create_platform_client,
get_platform,
)
from reflector.worker.webhook import test_webhook
logger = logging.getLogger(__name__)
@@ -41,6 +46,7 @@ class Room(BaseModel):
ics_enabled: bool = False
ics_last_sync: Optional[datetime] = None
ics_last_etag: Optional[str] = None
platform: Platform
class RoomDetails(Room):
@@ -68,6 +74,7 @@ class Meeting(BaseModel):
is_active: bool = True
calendar_event_id: str | None = None
calendar_metadata: dict[str, Any] | None = None
platform: Platform
class CreateRoom(BaseModel):
@@ -85,6 +92,7 @@ class CreateRoom(BaseModel):
ics_url: Optional[str] = None
ics_fetch_interval: int = 300
ics_enabled: bool = False
platform: Optional[Platform] = None
class UpdateRoom(BaseModel):
@@ -102,6 +110,7 @@ class UpdateRoom(BaseModel):
ics_url: Optional[str] = None
ics_fetch_interval: Optional[int] = None
ics_enabled: Optional[bool] = None
platform: Optional[Platform] = None
class CreateRoomMeeting(BaseModel):
@@ -165,14 +174,6 @@ class CalendarEventResponse(BaseModel):
router = APIRouter()
def parse_datetime_with_timezone(iso_string: str) -> datetime:
"""Parse ISO datetime string and ensure timezone awareness (defaults to UTC if naive)."""
dt = datetime.fromisoformat(iso_string)
if dt.tzinfo is None:
dt = dt.replace(tzinfo=timezone.utc)
return dt
@router.get("/rooms", response_model=Page[RoomDetails])
async def rooms_list(
user: Annotated[Optional[auth.UserInfo], Depends(auth.current_user_optional)],
@@ -182,13 +183,18 @@ async def rooms_list(
user_id = user["sub"] if user else None
return await apaginate(
paginated = await apaginate(
get_database(),
await rooms_controller.get_all(
user_id=user_id, order_by="-created_at", return_query=True
),
)
for room in paginated.items:
room.platform = get_platform(room.platform)
return paginated
@router.get("/rooms/{room_id}", response_model=RoomDetails)
async def rooms_get(
@@ -199,6 +205,9 @@ async def rooms_get(
room = await rooms_controller.get_by_id_for_http(room_id, user_id=user_id)
if not room:
raise HTTPException(status_code=404, detail="Room not found")
if not room.is_shared and (user_id is None or room.user_id != user_id):
raise HTTPException(status_code=403, detail="Room access denied")
room.platform = get_platform(room.platform)
return room
@@ -212,26 +221,25 @@ async def rooms_get_by_name(
if not room:
raise HTTPException(status_code=404, detail="Room not found")
# Convert to RoomDetails format (add webhook fields if user is owner)
room_dict = room.__dict__.copy()
if user_id == room.user_id:
# User is owner, include webhook details if available
room_dict["webhook_url"] = getattr(room, "webhook_url", None)
room_dict["webhook_secret"] = getattr(room, "webhook_secret", None)
else:
# Non-owner, hide webhook details
room_dict["webhook_url"] = None
room_dict["webhook_secret"] = None
room_dict["platform"] = get_platform(room.platform)
return RoomDetails(**room_dict)
@router.post("/rooms", response_model=Room)
async def rooms_create(
room: CreateRoom,
user: Annotated[Optional[auth.UserInfo], Depends(auth.current_user_optional)],
user: Annotated[auth.UserInfo, Depends(auth.current_user)],
):
user_id = user["sub"] if user else None
user_id = user["sub"]
return await rooms_controller.add(
name=room.name,
@@ -249,6 +257,7 @@ async def rooms_create(
ics_url=room.ics_url,
ics_fetch_interval=room.ics_fetch_interval,
ics_enabled=room.ics_enabled,
platform=room.platform,
)
@@ -256,26 +265,31 @@ async def rooms_create(
async def rooms_update(
room_id: str,
info: UpdateRoom,
user: Annotated[Optional[auth.UserInfo], Depends(auth.current_user_optional)],
user: Annotated[auth.UserInfo, Depends(auth.current_user)],
):
user_id = user["sub"] if user else None
user_id = user["sub"]
room = await rooms_controller.get_by_id_for_http(room_id, user_id=user_id)
if not room:
raise HTTPException(status_code=404, detail="Room not found")
if room.user_id != user_id:
raise HTTPException(status_code=403, detail="Not authorized")
values = info.dict(exclude_unset=True)
await rooms_controller.update(room, values)
room.platform = get_platform(room.platform)
return room
@router.delete("/rooms/{room_id}", response_model=DeletionStatus)
async def rooms_delete(
room_id: str,
user: Annotated[Optional[auth.UserInfo], Depends(auth.current_user_optional)],
user: Annotated[auth.UserInfo, Depends(auth.current_user)],
):
user_id = user["sub"] if user else None
room = await rooms_controller.get_by_id(room_id, user_id=user_id)
user_id = user["sub"]
room = await rooms_controller.get_by_id(room_id)
if not room:
raise HTTPException(status_code=404, detail="Room not found")
if room.user_id != user_id:
raise HTTPException(status_code=403, detail="Not authorized")
await rooms_controller.remove_by_id(room.id, user_id=user_id)
return DeletionStatus(status="ok")
@@ -309,19 +323,22 @@ async def rooms_create_meeting(
if meeting is None:
end_date = current_time + timedelta(hours=8)
whereby_meeting = await create_meeting("", end_date=end_date, room=room)
platform = get_platform(room.platform)
client = create_platform_client(platform)
await upload_logo(whereby_meeting["roomName"], "./images/logo.png")
meeting_data = await client.create_meeting(
room.name, end_date=end_date, room=room
)
await client.upload_logo(meeting_data.room_name, "./images/logo.png")
meeting = await meetings_controller.create(
id=whereby_meeting["meetingId"],
room_name=whereby_meeting["roomName"],
room_url=whereby_meeting["roomUrl"],
host_room_url=whereby_meeting["hostRoomUrl"],
start_date=parse_datetime_with_timezone(
whereby_meeting["startDate"]
),
end_date=parse_datetime_with_timezone(whereby_meeting["endDate"]),
id=meeting_data.meeting_id,
room_name=meeting_data.room_name,
room_url=meeting_data.room_url,
host_room_url=meeting_data.host_room_url,
start_date=current_time,
end_date=end_date,
room=room,
)
except LockError:
@@ -330,6 +347,18 @@ async def rooms_create_meeting(
status_code=503, detail="Meeting creation in progress, please try again"
)
if meeting.platform == "daily" and room.recording_trigger != "none":
client = create_platform_client(meeting.platform)
token = await client.create_meeting_token(
meeting.room_name,
enable_recording=True,
user_id=user_id,
)
meeting = meeting.model_copy()
meeting.room_url = add_query_param(meeting.room_url, "t", token)
if meeting.host_room_url:
meeting.host_room_url = add_query_param(meeting.host_room_url, "t", token)
if user_id != room.user_id:
meeting.host_room_url = ""
@@ -339,16 +368,16 @@ async def rooms_create_meeting(
@router.post("/rooms/{room_id}/webhook/test", response_model=WebhookTestResult)
async def rooms_test_webhook(
room_id: str,
user: Annotated[Optional[auth.UserInfo], Depends(auth.current_user_optional)],
user: Annotated[auth.UserInfo, Depends(auth.current_user)],
):
"""Test webhook configuration by sending a sample payload."""
user_id = user["sub"] if user else None
user_id = user["sub"]
room = await rooms_controller.get_by_id(room_id)
if not room:
raise HTTPException(status_code=404, detail="Room not found")
if user_id and room.user_id != user_id:
if room.user_id != user_id:
raise HTTPException(
status_code=403, detail="Not authorized to test this room's webhook"
)
@@ -484,7 +513,10 @@ async def rooms_list_active_meetings(
room=room, current_time=current_time
)
# Hide host URLs from non-owners
effective_platform = get_platform(room.platform)
for meeting in meetings:
meeting.platform = effective_platform
if user_id != room.user_id:
for meeting in meetings:
meeting.host_room_url = ""
@@ -505,15 +537,10 @@ async def rooms_get_meeting(
if not room:
raise HTTPException(status_code=404, detail="Room not found")
meeting = await meetings_controller.get_by_id(meeting_id)
meeting = await meetings_controller.get_by_id(meeting_id, room=room)
if not meeting:
raise HTTPException(status_code=404, detail="Meeting not found")
if meeting.room_id != room.id:
raise HTTPException(
status_code=403, detail="Meeting does not belong to this room"
)
if user_id != room.user_id and not room.is_shared:
meeting.host_room_url = ""
@@ -532,16 +559,11 @@ async def rooms_join_meeting(
if not room:
raise HTTPException(status_code=404, detail="Room not found")
meeting = await meetings_controller.get_by_id(meeting_id)
meeting = await meetings_controller.get_by_id(meeting_id, room=room)
if not meeting:
raise HTTPException(status_code=404, detail="Meeting not found")
if meeting.room_id != room.id:
raise HTTPException(
status_code=403, detail="Meeting does not belong to this room"
)
if not meeting.is_active:
raise HTTPException(status_code=400, detail="Meeting is not active")
@@ -549,7 +571,6 @@ async def rooms_join_meeting(
if meeting.end_date <= current_time:
raise HTTPException(status_code=400, detail="Meeting has ended")
# Hide host URL from non-owners
if user_id != room.user_id:
meeting.host_room_url = ""

View File

@@ -5,12 +5,10 @@ from fastapi import APIRouter, Depends, HTTPException, Query
from fastapi_pagination import Page
from fastapi_pagination.ext.databases import apaginate
from jose import jwt
from pydantic import BaseModel, Field, constr, field_serializer
from pydantic import AwareDatetime, BaseModel, Field, constr, field_serializer
import reflector.auth as auth
from reflector.db import get_database
from reflector.db.meetings import meetings_controller
from reflector.db.rooms import rooms_controller
from reflector.db.search import (
DEFAULT_SEARCH_LIMIT,
SearchLimit,
@@ -34,6 +32,7 @@ from reflector.db.transcripts import (
from reflector.processors.types import Transcript as ProcessorTranscript
from reflector.processors.types import Word
from reflector.settings import settings
from reflector.ws_manager import get_ws_manager
from reflector.zulip import (
InvalidMessageError,
get_zulip_message,
@@ -134,6 +133,21 @@ SearchOffsetParam = Annotated[
SearchOffsetBase, Query(description="Number of results to skip")
]
SearchFromDatetimeParam = Annotated[
AwareDatetime | None,
Query(
alias="from",
description="Filter transcripts created on or after this datetime (ISO 8601 with timezone)",
),
]
SearchToDatetimeParam = Annotated[
AwareDatetime | None,
Query(
alias="to",
description="Filter transcripts created on or before this datetime (ISO 8601 with timezone)",
),
]
class SearchResponse(BaseModel):
results: list[SearchResult]
@@ -175,18 +189,23 @@ async def transcripts_search(
offset: SearchOffsetParam = 0,
room_id: Optional[str] = None,
source_kind: Optional[SourceKind] = None,
from_datetime: SearchFromDatetimeParam = None,
to_datetime: SearchToDatetimeParam = None,
user: Annotated[
Optional[auth.UserInfo], Depends(auth.current_user_optional)
] = None,
):
"""
Full-text search across transcript titles and content.
"""
"""Full-text search across transcript titles and content."""
if not user and not settings.PUBLIC_MODE:
raise HTTPException(status_code=401, detail="Not authenticated")
user_id = user["sub"] if user else None
if from_datetime and to_datetime and from_datetime > to_datetime:
raise HTTPException(
status_code=400, detail="'from' must be less than or equal to 'to'"
)
search_params = SearchParameters(
query_text=parse_search_query_param(q),
limit=limit,
@@ -194,6 +213,8 @@ async def transcripts_search(
user_id=user_id,
room_id=room_id,
source_kind=source_kind,
from_datetime=from_datetime,
to_datetime=to_datetime,
)
results, total = await search_controller.search_transcripts(search_params)
@@ -213,7 +234,7 @@ async def transcripts_create(
user: Annotated[Optional[auth.UserInfo], Depends(auth.current_user_optional)],
):
user_id = user["sub"] if user else None
return await transcripts_controller.add(
transcript = await transcripts_controller.add(
info.name,
source_kind=info.source_kind or SourceKind.LIVE,
source_language=info.source_language,
@@ -221,6 +242,14 @@ async def transcripts_create(
user_id=user_id,
)
if user_id:
await get_ws_manager().send_json(
room_id=f"user:{user_id}",
message={"event": "TRANSCRIPT_CREATED", "data": {"id": transcript.id}},
)
return transcript
# ==============================================================
# Single transcript
@@ -344,12 +373,14 @@ async def transcript_get(
async def transcript_update(
transcript_id: str,
info: UpdateTranscript,
user: Annotated[Optional[auth.UserInfo], Depends(auth.current_user_optional)],
user: Annotated[auth.UserInfo, Depends(auth.current_user)],
):
user_id = user["sub"] if user else None
user_id = user["sub"]
transcript = await transcripts_controller.get_by_id_for_http(
transcript_id, user_id=user_id
)
if not transcripts_controller.user_can_mutate(transcript, user_id):
raise HTTPException(status_code=403, detail="Not authorized")
values = info.dict(exclude_unset=True)
updated_transcript = await transcripts_controller.update(transcript, values)
return updated_transcript
@@ -358,20 +389,20 @@ async def transcript_update(
@router.delete("/transcripts/{transcript_id}", response_model=DeletionStatus)
async def transcript_delete(
transcript_id: str,
user: Annotated[Optional[auth.UserInfo], Depends(auth.current_user_optional)],
user: Annotated[auth.UserInfo, Depends(auth.current_user)],
):
user_id = user["sub"] if user else None
user_id = user["sub"]
transcript = await transcripts_controller.get_by_id(transcript_id)
if not transcript:
raise HTTPException(status_code=404, detail="Transcript not found")
if transcript.meeting_id:
meeting = await meetings_controller.get_by_id(transcript.meeting_id)
room = await rooms_controller.get_by_id(meeting.room_id)
if room.is_shared:
user_id = None
if not transcripts_controller.user_can_mutate(transcript, user_id):
raise HTTPException(status_code=403, detail="Not authorized")
await transcripts_controller.remove_by_id(transcript.id, user_id=user_id)
await get_ws_manager().send_json(
room_id=f"user:{user_id}",
message={"event": "TRANSCRIPT_DELETED", "data": {"id": transcript.id}},
)
return DeletionStatus(status="ok")
@@ -443,15 +474,16 @@ async def transcript_post_to_zulip(
stream: str,
topic: str,
include_topics: bool,
user: Annotated[Optional[auth.UserInfo], Depends(auth.current_user_optional)],
user: Annotated[auth.UserInfo, Depends(auth.current_user)],
):
user_id = user["sub"] if user else None
user_id = user["sub"]
transcript = await transcripts_controller.get_by_id_for_http(
transcript_id, user_id=user_id
)
if not transcript:
raise HTTPException(status_code=404, detail="Transcript not found")
if not transcripts_controller.user_can_mutate(transcript, user_id):
raise HTTPException(status_code=403, detail="Not authorized")
content = get_zulip_message(transcript, include_topics)
message_updated = False

View File

@@ -56,12 +56,14 @@ async def transcript_get_participants(
async def transcript_add_participant(
transcript_id: str,
participant: CreateParticipant,
user: Annotated[Optional[auth.UserInfo], Depends(auth.current_user_optional)],
user: Annotated[auth.UserInfo, Depends(auth.current_user)],
) -> Participant:
user_id = user["sub"] if user else None
user_id = user["sub"]
transcript = await transcripts_controller.get_by_id_for_http(
transcript_id, user_id=user_id
)
if transcript.user_id is not None and transcript.user_id != user_id:
raise HTTPException(status_code=403, detail="Not authorized")
# ensure the speaker is unique
if participant.speaker is not None and transcript.participants is not None:
@@ -101,12 +103,14 @@ async def transcript_update_participant(
transcript_id: str,
participant_id: str,
participant: UpdateParticipant,
user: Annotated[Optional[auth.UserInfo], Depends(auth.current_user_optional)],
user: Annotated[auth.UserInfo, Depends(auth.current_user)],
) -> Participant:
user_id = user["sub"] if user else None
user_id = user["sub"]
transcript = await transcripts_controller.get_by_id_for_http(
transcript_id, user_id=user_id
)
if transcript.user_id is not None and transcript.user_id != user_id:
raise HTTPException(status_code=403, detail="Not authorized")
# ensure the speaker is unique
for p in transcript.participants:
@@ -138,11 +142,13 @@ async def transcript_update_participant(
async def transcript_delete_participant(
transcript_id: str,
participant_id: str,
user: Annotated[Optional[auth.UserInfo], Depends(auth.current_user_optional)],
user: Annotated[auth.UserInfo, Depends(auth.current_user)],
) -> DeletionStatus:
user_id = user["sub"] if user else None
user_id = user["sub"]
transcript = await transcripts_controller.get_by_id_for_http(
transcript_id, user_id=user_id
)
if transcript.user_id is not None and transcript.user_id != user_id:
raise HTTPException(status_code=403, detail="Not authorized")
await transcripts_controller.delete_participant(transcript, participant_id)
return DeletionStatus(status="ok")

View File

@@ -5,8 +5,12 @@ from fastapi import APIRouter, Depends, HTTPException
from pydantic import BaseModel
import reflector.auth as auth
from reflector.db.recordings import recordings_controller
from reflector.db.transcripts import transcripts_controller
from reflector.pipelines.main_file_pipeline import task_pipeline_file_process
from reflector.pipelines.main_multitrack_pipeline import (
task_pipeline_multitrack_process,
)
router = APIRouter()
@@ -33,14 +37,35 @@ async def transcript_process(
status_code=400, detail="Recording is not ready for processing"
)
# avoid duplicate scheduling for either pipeline
if task_is_scheduled_or_active(
"reflector.pipelines.main_file_pipeline.task_pipeline_file_process",
transcript_id=transcript_id,
) or task_is_scheduled_or_active(
"reflector.pipelines.main_multitrack_pipeline.task_pipeline_multitrack_process",
transcript_id=transcript_id,
):
return ProcessStatus(status="already running")
# schedule a background task process the file
task_pipeline_file_process.delay(transcript_id=transcript_id)
# Determine processing mode strictly from DB to avoid S3 scans
bucket_name = None
track_keys: list[str] = []
if transcript.recording_id:
recording = await recordings_controller.get_by_id(transcript.recording_id)
if recording:
bucket_name = recording.bucket_name
track_keys = list(getattr(recording, "track_keys", []) or [])
if bucket_name:
task_pipeline_multitrack_process.delay(
transcript_id=transcript_id,
bucket_name=bucket_name,
track_keys=track_keys,
)
else:
# Default single-file pipeline
task_pipeline_file_process.delay(transcript_id=transcript_id)
return ProcessStatus(status="ok")

View File

@@ -35,12 +35,14 @@ class SpeakerMerge(BaseModel):
async def transcript_assign_speaker(
transcript_id: str,
assignment: SpeakerAssignment,
user: Annotated[Optional[auth.UserInfo], Depends(auth.current_user_optional)],
user: Annotated[auth.UserInfo, Depends(auth.current_user)],
) -> SpeakerAssignmentStatus:
user_id = user["sub"] if user else None
user_id = user["sub"]
transcript = await transcripts_controller.get_by_id_for_http(
transcript_id, user_id=user_id
)
if transcript.user_id is not None and transcript.user_id != user_id:
raise HTTPException(status_code=403, detail="Not authorized")
if not transcript:
raise HTTPException(status_code=404, detail="Transcript not found")
@@ -113,12 +115,14 @@ async def transcript_assign_speaker(
async def transcript_merge_speaker(
transcript_id: str,
merge: SpeakerMerge,
user: Annotated[Optional[auth.UserInfo], Depends(auth.current_user_optional)],
user: Annotated[auth.UserInfo, Depends(auth.current_user)],
) -> SpeakerAssignmentStatus:
user_id = user["sub"] if user else None
user_id = user["sub"]
transcript = await transcripts_controller.get_by_id_for_http(
transcript_id, user_id=user_id
)
if transcript.user_id is not None and transcript.user_id != user_id:
raise HTTPException(status_code=403, detail="Not authorized")
if not transcript:
raise HTTPException(status_code=404, detail="Transcript not found")

View File

@@ -4,8 +4,11 @@ Transcripts websocket API
"""
from fastapi import APIRouter, HTTPException, WebSocket, WebSocketDisconnect
from typing import Optional
from fastapi import APIRouter, Depends, HTTPException, WebSocket, WebSocketDisconnect
import reflector.auth as auth
from reflector.db.transcripts import transcripts_controller
from reflector.ws_manager import get_ws_manager
@@ -21,10 +24,12 @@ async def transcript_get_websocket_events(transcript_id: str):
async def transcript_events_websocket(
transcript_id: str,
websocket: WebSocket,
# user: Annotated[Optional[auth.UserInfo], Depends(auth.current_user_optional)],
user: Optional[auth.UserInfo] = Depends(auth.current_user_optional),
):
# user_id = user["sub"] if user else None
transcript = await transcripts_controller.get_by_id(transcript_id)
user_id = user["sub"] if user else None
transcript = await transcripts_controller.get_by_id_for_http(
transcript_id, user_id=user_id
)
if not transcript:
raise HTTPException(status_code=404, detail="Transcript not found")

View File

@@ -11,7 +11,6 @@ router = APIRouter()
class UserInfo(BaseModel):
sub: str
email: Optional[str]
email_verified: Optional[bool]
@router.get("/me")

View File

@@ -0,0 +1,62 @@
from datetime import datetime
from typing import Annotated
import structlog
from fastapi import APIRouter, Depends, HTTPException
from pydantic import BaseModel
import reflector.auth as auth
from reflector.db.user_api_keys import user_api_keys_controller
from reflector.utils.string import NonEmptyString
router = APIRouter()
logger = structlog.get_logger(__name__)
class CreateApiKeyRequest(BaseModel):
name: NonEmptyString | None = None
class ApiKeyResponse(BaseModel):
id: NonEmptyString
user_id: NonEmptyString
name: NonEmptyString | None
created_at: datetime
class CreateApiKeyResponse(ApiKeyResponse):
key: NonEmptyString
@router.post("/user/api-keys", response_model=CreateApiKeyResponse)
async def create_api_key(
req: CreateApiKeyRequest,
user: Annotated[auth.UserInfo, Depends(auth.current_user)],
):
api_key_model, plaintext = await user_api_keys_controller.create_key(
user_id=user["sub"],
name=req.name,
)
return CreateApiKeyResponse(
**api_key_model.model_dump(),
key=plaintext,
)
@router.get("/user/api-keys", response_model=list[ApiKeyResponse])
async def list_api_keys(
user: Annotated[auth.UserInfo, Depends(auth.current_user)],
):
api_keys = await user_api_keys_controller.list_by_user_id(user["sub"])
return [ApiKeyResponse(**k.model_dump()) for k in api_keys]
@router.delete("/user/api-keys/{key_id}")
async def delete_api_key(
key_id: NonEmptyString,
user: Annotated[auth.UserInfo, Depends(auth.current_user)],
):
deleted = await user_api_keys_controller.delete_key(key_id, user["sub"])
if not deleted:
raise HTTPException(status_code=404)
return {"status": "ok"}

View File

@@ -0,0 +1,53 @@
from typing import Optional
from fastapi import APIRouter, WebSocket
from reflector.auth.auth_jwt import JWTAuth # type: ignore
from reflector.ws_manager import get_ws_manager
router = APIRouter()
# Close code for unauthorized WebSocket connections
UNAUTHORISED = 4401
@router.websocket("/events")
async def user_events_websocket(websocket: WebSocket):
# Browser can't send Authorization header for WS; use subprotocol: ["bearer", token]
raw_subprotocol = websocket.headers.get("sec-websocket-protocol") or ""
parts = [p.strip() for p in raw_subprotocol.split(",") if p.strip()]
token: Optional[str] = None
negotiated_subprotocol: Optional[str] = None
if len(parts) >= 2 and parts[0].lower() == "bearer":
negotiated_subprotocol = "bearer"
token = parts[1]
user_id: Optional[str] = None
if not token:
await websocket.close(code=UNAUTHORISED)
return
try:
payload = JWTAuth().verify_token(token)
user_id = payload.get("sub")
except Exception:
await websocket.close(code=UNAUTHORISED)
return
if not user_id:
await websocket.close(code=UNAUTHORISED)
return
room_id = f"user:{user_id}"
ws_manager = get_ws_manager()
await ws_manager.add_user_to_room(
room_id, websocket, subprotocol=negotiated_subprotocol
)
try:
while True:
await websocket.receive()
finally:
if room_id:
await ws_manager.remove_user_from_room(room_id, websocket)

View File

@@ -1,114 +0,0 @@
import logging
from datetime import datetime
import httpx
from reflector.db.rooms import Room
from reflector.settings import settings
from reflector.utils.string import parse_non_empty_string
logger = logging.getLogger(__name__)
def _get_headers():
api_key = parse_non_empty_string(
settings.WHEREBY_API_KEY, "WHEREBY_API_KEY value is required."
)
return {
"Content-Type": "application/json; charset=utf-8",
"Authorization": f"Bearer {api_key}",
}
TIMEOUT = 10 # seconds
def _get_whereby_s3_auth():
errors = []
try:
bucket_name = parse_non_empty_string(
settings.RECORDING_STORAGE_AWS_BUCKET_NAME,
"RECORDING_STORAGE_AWS_BUCKET_NAME value is required.",
)
except Exception as e:
errors.append(e)
try:
key_id = parse_non_empty_string(
settings.AWS_WHEREBY_ACCESS_KEY_ID,
"AWS_WHEREBY_ACCESS_KEY_ID value is required.",
)
except Exception as e:
errors.append(e)
try:
key_secret = parse_non_empty_string(
settings.AWS_WHEREBY_ACCESS_KEY_SECRET,
"AWS_WHEREBY_ACCESS_KEY_SECRET value is required.",
)
except Exception as e:
errors.append(e)
if len(errors) > 0:
raise Exception(
f"Failed to get Whereby auth settings: {', '.join(str(e) for e in errors)}"
)
return bucket_name, key_id, key_secret
async def create_meeting(room_name_prefix: str, end_date: datetime, room: Room):
s3_bucket_name, s3_key_id, s3_key_secret = _get_whereby_s3_auth()
data = {
"isLocked": room.is_locked,
"roomNamePrefix": room_name_prefix,
"roomNamePattern": "uuid",
"roomMode": room.room_mode,
"endDate": end_date.isoformat(),
"recording": {
"type": room.recording_type,
"destination": {
"provider": "s3",
"bucket": s3_bucket_name,
"accessKeyId": s3_key_id,
"accessKeySecret": s3_key_secret,
"fileFormat": "mp4",
},
"startTrigger": room.recording_trigger,
},
"fields": ["hostRoomUrl"],
}
async with httpx.AsyncClient() as client:
response = await client.post(
f"{settings.WHEREBY_API_URL}/meetings",
headers=_get_headers(),
json=data,
timeout=TIMEOUT,
)
if response.status_code == 403:
logger.warning(
f"Failed to create meeting: access denied on Whereby: {response.text}"
)
response.raise_for_status()
return response.json()
async def get_room_sessions(room_name: str):
async with httpx.AsyncClient() as client:
response = await client.get(
f"{settings.WHEREBY_API_URL}/insights/room-sessions?roomName={room_name}",
headers=_get_headers(),
timeout=TIMEOUT,
)
response.raise_for_status()
return response.json()
async def upload_logo(room_name: str, logo_path: str):
async with httpx.AsyncClient() as client:
with open(logo_path, "rb") as f:
response = await client.put(
f"{settings.WHEREBY_API_URL}/rooms{room_name}/theme/logo",
headers={
"Authorization": f"Bearer {settings.WHEREBY_API_KEY}",
},
timeout=TIMEOUT,
files={"image": f},
)
response.raise_for_status()

View File

@@ -19,7 +19,7 @@ from reflector.db.meetings import meetings
from reflector.db.recordings import recordings
from reflector.db.transcripts import transcripts, transcripts_controller
from reflector.settings import settings
from reflector.storage import get_recordings_storage
from reflector.storage import get_transcripts_storage
logger = structlog.get_logger(__name__)
@@ -53,8 +53,8 @@ async def delete_single_transcript(
)
if recording:
try:
await get_recordings_storage().delete_file(
recording["object_key"]
await get_transcripts_storage().delete_file(
recording["object_key"], bucket=recording["bucket_name"]
)
except Exception as storage_error:
logger.warning(

View File

@@ -7,10 +7,10 @@ from celery.utils.log import get_task_logger
from reflector.asynctask import asynctask
from reflector.db.calendar_events import calendar_events_controller
from reflector.db.meetings import meetings_controller
from reflector.db.rooms import rooms_controller
from reflector.db.rooms import Room, rooms_controller
from reflector.redis_cache import RedisAsyncLock
from reflector.services.ics_sync import SyncStatus, ics_sync_service
from reflector.whereby import create_meeting, upload_logo
from reflector.video_platforms.factory import create_platform_client, get_platform
logger = structlog.wrap_logger(get_task_logger(__name__))
@@ -86,17 +86,17 @@ def _should_sync(room) -> bool:
MEETING_DEFAULT_DURATION = timedelta(hours=1)
async def create_upcoming_meetings_for_event(event, create_window, room_id, room):
async def create_upcoming_meetings_for_event(event, create_window, room: Room):
if event.start_time <= create_window:
return
existing_meeting = await meetings_controller.get_by_calendar_event(event.id)
existing_meeting = await meetings_controller.get_by_calendar_event(event.id, room)
if existing_meeting:
return
logger.info(
"Pre-creating meeting for calendar event",
room_id=room_id,
room_id=room.id,
event_id=event.id,
event_title=event.title,
)
@@ -104,20 +104,22 @@ async def create_upcoming_meetings_for_event(event, create_window, room_id, room
try:
end_date = event.end_time or (event.start_time + MEETING_DEFAULT_DURATION)
whereby_meeting = await create_meeting(
client = create_platform_client(get_platform(room.platform))
meeting_data = await client.create_meeting(
"",
end_date=end_date,
room=room,
)
await upload_logo(whereby_meeting["roomName"], "./images/logo.png")
await client.upload_logo(meeting_data.room_name, "./images/logo.png")
meeting = await meetings_controller.create(
id=whereby_meeting["meetingId"],
room_name=whereby_meeting["roomName"],
room_url=whereby_meeting["roomUrl"],
host_room_url=whereby_meeting["hostRoomUrl"],
start_date=datetime.fromisoformat(whereby_meeting["startDate"]),
end_date=datetime.fromisoformat(whereby_meeting["endDate"]),
id=meeting_data.meeting_id,
room_name=meeting_data.room_name,
room_url=meeting_data.room_url,
host_room_url=meeting_data.host_room_url,
start_date=event.start_time,
end_date=end_date,
room=room,
calendar_event_id=event.id,
calendar_metadata={
@@ -136,7 +138,7 @@ async def create_upcoming_meetings_for_event(event, create_window, room_id, room
except Exception as e:
logger.error(
"Failed to pre-create meeting",
room_id=room_id,
room_id=room.id,
event_id=event.id,
error=str(e),
)
@@ -166,9 +168,7 @@ async def create_upcoming_meetings():
)
for event in events:
await create_upcoming_meetings_for_event(
event, create_window, room.id, room
)
await create_upcoming_meetings_for_event(event, create_window, room)
logger.info("Completed pre-creation check for upcoming meetings")
except Exception as e:

View File

@@ -1,5 +1,6 @@
import json
import os
import re
from datetime import datetime, timezone
from urllib.parse import unquote
@@ -14,24 +15,32 @@ from redis.exceptions import LockError
from reflector.db.meetings import meetings_controller
from reflector.db.recordings import Recording, recordings_controller
from reflector.db.rooms import rooms_controller
from reflector.db.transcripts import SourceKind, transcripts_controller
from reflector.db.transcripts import (
SourceKind,
TranscriptParticipant,
transcripts_controller,
)
from reflector.pipelines.main_file_pipeline import task_pipeline_file_process
from reflector.pipelines.main_live_pipeline import asynctask
from reflector.pipelines.main_multitrack_pipeline import (
task_pipeline_multitrack_process,
)
from reflector.pipelines.topic_processing import EmptyPipeline
from reflector.processors import AudioFileWriterProcessor
from reflector.processors.audio_waveform_processor import AudioWaveformProcessor
from reflector.redis_cache import get_redis_client
from reflector.settings import settings
from reflector.whereby import get_room_sessions
from reflector.storage import get_transcripts_storage
from reflector.utils.daily import DailyRoomName, extract_base_room_name
from reflector.video_platforms.factory import create_platform_client
from reflector.video_platforms.whereby_utils import (
parse_whereby_recording_filename,
room_name_to_whereby_api_room_name,
)
logger = structlog.wrap_logger(get_task_logger(__name__))
def parse_datetime_with_timezone(iso_string: str) -> datetime:
"""Parse ISO datetime string and ensure timezone awareness (defaults to UTC if naive)."""
dt = datetime.fromisoformat(iso_string)
if dt.tzinfo is None:
dt = dt.replace(tzinfo=timezone.utc)
return dt
@shared_task
def process_messages():
queue_url = settings.AWS_PROCESS_RECORDING_QUEUE_URL
@@ -73,14 +82,16 @@ def process_messages():
logger.error("process_messages", error=str(e))
# only whereby supported.
@shared_task
@asynctask
async def process_recording(bucket_name: str, object_key: str):
logger.info("Processing recording: %s/%s", bucket_name, object_key)
# extract a guid and a datetime from the object key
room_name = f"/{object_key[:36]}"
recorded_at = parse_datetime_with_timezone(object_key[37:57])
room_name_part, recorded_at = parse_whereby_recording_filename(object_key)
# we store whereby api room names, NOT whereby room names
room_name = room_name_to_whereby_api_room_name(room_name_part)
meeting = await meetings_controller.get_by_room_name(room_name)
room = await rooms_controller.get_by_id(meeting.room_id)
@@ -102,6 +113,7 @@ async def process_recording(bucket_name: str, object_key: str):
transcript,
{
"topics": [],
"participants": [],
},
)
else:
@@ -121,15 +133,15 @@ async def process_recording(bucket_name: str, object_key: str):
upload_filename = transcript.data_path / f"upload{extension}"
upload_filename.parent.mkdir(parents=True, exist_ok=True)
s3 = boto3.client(
"s3",
region_name=settings.TRANSCRIPT_STORAGE_AWS_REGION,
aws_access_key_id=settings.TRANSCRIPT_STORAGE_AWS_ACCESS_KEY_ID,
aws_secret_access_key=settings.TRANSCRIPT_STORAGE_AWS_SECRET_ACCESS_KEY,
)
storage = get_transcripts_storage()
with open(upload_filename, "wb") as f:
s3.download_fileobj(bucket_name, object_key, f)
try:
with open(upload_filename, "wb") as f:
await storage.stream_to_fileobj(object_key, f, bucket=bucket_name)
except Exception:
# Clean up partial file on stream failure
upload_filename.unlink(missing_ok=True)
raise
container = av.open(upload_filename.as_posix())
try:
@@ -146,6 +158,165 @@ async def process_recording(bucket_name: str, object_key: str):
task_pipeline_file_process.delay(transcript_id=transcript.id)
@shared_task
@asynctask
async def process_multitrack_recording(
bucket_name: str,
daily_room_name: DailyRoomName,
recording_id: str,
track_keys: list[str],
):
logger.info(
"Processing multitrack recording",
bucket=bucket_name,
room_name=daily_room_name,
recording_id=recording_id,
provided_keys=len(track_keys),
)
if not track_keys:
logger.warning("No audio track keys provided")
return
tz = timezone.utc
recorded_at = datetime.now(tz)
try:
if track_keys:
folder = os.path.basename(os.path.dirname(track_keys[0]))
ts_match = re.search(r"(\d{14})$", folder)
if ts_match:
ts = ts_match.group(1)
recorded_at = datetime.strptime(ts, "%Y%m%d%H%M%S").replace(tzinfo=tz)
except Exception as e:
logger.warning(
f"Could not parse recorded_at from keys, using now() {recorded_at}",
e,
exc_info=True,
)
meeting = await meetings_controller.get_by_room_name(daily_room_name)
room_name_base = extract_base_room_name(daily_room_name)
room = await rooms_controller.get_by_name(room_name_base)
if not room:
raise Exception(f"Room not found: {room_name_base}")
if not meeting:
raise Exception(f"Meeting not found: {room_name_base}")
logger.info(
"Found existing Meeting for recording",
meeting_id=meeting.id,
room_name=daily_room_name,
recording_id=recording_id,
)
recording = await recordings_controller.get_by_id(recording_id)
if not recording:
object_key_dir = os.path.dirname(track_keys[0]) if track_keys else ""
recording = await recordings_controller.create(
Recording(
id=recording_id,
bucket_name=bucket_name,
object_key=object_key_dir,
recorded_at=recorded_at,
meeting_id=meeting.id,
track_keys=track_keys,
)
)
else:
# Recording already exists; assume metadata was set at creation time
pass
transcript = await transcripts_controller.get_by_recording_id(recording.id)
if transcript:
await transcripts_controller.update(
transcript,
{
"topics": [],
"participants": [],
},
)
else:
transcript = await transcripts_controller.add(
"",
source_kind=SourceKind.ROOM,
source_language="en",
target_language="en",
user_id=room.user_id,
recording_id=recording.id,
share_mode="public",
meeting_id=meeting.id,
room_id=room.id,
)
try:
daily_client = create_platform_client("daily")
id_to_name = {}
id_to_user_id = {}
mtg_session_id = None
try:
rec_details = await daily_client.get_recording(recording_id)
mtg_session_id = rec_details.get("mtgSessionId")
except Exception as e:
logger.warning(
"Failed to fetch Daily recording details",
error=str(e),
recording_id=recording_id,
exc_info=True,
)
if mtg_session_id:
try:
payload = await daily_client.get_meeting_participants(mtg_session_id)
for p in payload.get("data", []):
pid = p.get("participant_id")
name = p.get("user_name")
user_id = p.get("user_id")
if pid and name:
id_to_name[pid] = name
if pid and user_id:
id_to_user_id[pid] = user_id
except Exception as e:
logger.warning(
"Failed to fetch Daily meeting participants",
error=str(e),
mtg_session_id=mtg_session_id,
exc_info=True,
)
else:
logger.warning(
"No mtgSessionId found for recording; participant names may be generic",
recording_id=recording_id,
)
for idx, key in enumerate(track_keys):
base = os.path.basename(key)
m = re.search(r"\d{13,}-([0-9a-fA-F-]{36})-cam-audio-", base)
participant_id = m.group(1) if m else None
default_name = f"Speaker {idx}"
name = id_to_name.get(participant_id, default_name)
user_id = id_to_user_id.get(participant_id)
participant = TranscriptParticipant(
id=participant_id, speaker=idx, name=name, user_id=user_id
)
await transcripts_controller.upsert_participant(transcript, participant)
except Exception as e:
logger.warning("Failed to map participant names", error=str(e), exc_info=True)
task_pipeline_multitrack_process.delay(
transcript_id=transcript.id,
bucket_name=bucket_name,
track_keys=track_keys,
)
@shared_task
@asynctask
async def process_meetings():
@@ -164,7 +335,7 @@ async def process_meetings():
Uses distributed locking to prevent race conditions when multiple workers
process the same meeting simultaneously.
"""
logger.info("Processing meetings")
logger.debug("Processing meetings")
meetings = await meetings_controller.get_all_active()
current_time = datetime.now(timezone.utc)
redis_client = get_redis_client()
@@ -189,7 +360,8 @@ async def process_meetings():
end_date = end_date.replace(tzinfo=timezone.utc)
# This API call could be slow, extend lock if needed
response = await get_room_sessions(meeting.room_name)
client = create_platform_client(meeting.platform)
room_sessions = await client.get_room_sessions(meeting.room_name)
try:
# Extend lock after slow operation to ensure we still hold it
@@ -198,7 +370,6 @@ async def process_meetings():
logger_.warning("Lost lock for meeting, skipping")
continue
room_sessions = response.get("results", [])
has_active_sessions = room_sessions and any(
rs["endedAt"] is None for rs in room_sessions
)
@@ -231,69 +402,120 @@ async def process_meetings():
except LockError:
pass # Lock already released or expired
logger.info(
logger.debug(
"Processed meetings finished",
processed_count=processed_count,
skipped_count=skipped_count,
)
async def convert_audio_and_waveform(transcript) -> None:
"""Convert WebM to MP3 and generate waveform for Daily.co recordings.
This bypasses the full file pipeline which would overwrite stub data.
"""
try:
logger.info(
"Converting audio to MP3 and generating waveform",
transcript_id=transcript.id,
)
upload_path = transcript.data_path / "upload.webm"
mp3_path = transcript.audio_mp3_filename
# Convert WebM to MP3
mp3_writer = AudioFileWriterProcessor(path=mp3_path)
container = av.open(str(upload_path))
for frame in container.decode(audio=0):
await mp3_writer.push(frame)
await mp3_writer.flush()
container.close()
logger.info(
"Converted WebM to MP3",
transcript_id=transcript.id,
mp3_size=mp3_path.stat().st_size,
)
waveform_processor = AudioWaveformProcessor(
audio_path=mp3_path,
waveform_path=transcript.audio_waveform_filename,
)
waveform_processor.set_pipeline(EmptyPipeline(logger))
await waveform_processor.flush()
logger.info(
"Generated waveform",
transcript_id=transcript.id,
waveform_path=transcript.audio_waveform_filename,
)
# Update transcript status to ended (successful)
await transcripts_controller.update(transcript, {"status": "ended"})
except Exception as e:
logger.error(
"Failed to convert audio or generate waveform",
transcript_id=transcript.id,
error=str(e),
)
# Keep status as uploaded even if conversion fails
pass
@shared_task
@asynctask
async def reprocess_failed_recordings():
"""
Find recordings in the S3 bucket and check if they have proper transcriptions.
Find recordings in Whereby S3 bucket and check if they have proper transcriptions.
If not, requeue them for processing.
"""
logger.info("Checking for recordings that need processing or reprocessing")
s3 = boto3.client(
"s3",
region_name=settings.TRANSCRIPT_STORAGE_AWS_REGION,
aws_access_key_id=settings.TRANSCRIPT_STORAGE_AWS_ACCESS_KEY_ID,
aws_secret_access_key=settings.TRANSCRIPT_STORAGE_AWS_SECRET_ACCESS_KEY,
)
Note: Daily.co recordings are processed via webhooks, not this cron job.
"""
logger.info("Checking Whereby recordings that need processing or reprocessing")
if not settings.WHEREBY_STORAGE_AWS_BUCKET_NAME:
raise ValueError(
"WHEREBY_STORAGE_AWS_BUCKET_NAME required for Whereby recording reprocessing. "
"Set WHEREBY_STORAGE_AWS_BUCKET_NAME environment variable."
)
storage = get_transcripts_storage()
bucket_name = settings.WHEREBY_STORAGE_AWS_BUCKET_NAME
reprocessed_count = 0
try:
paginator = s3.get_paginator("list_objects_v2")
bucket_name = settings.RECORDING_STORAGE_AWS_BUCKET_NAME
pages = paginator.paginate(Bucket=bucket_name)
object_keys = await storage.list_objects(prefix="", bucket=bucket_name)
for page in pages:
if "Contents" not in page:
for object_key in object_keys:
if not object_key.endswith(".mp4"):
continue
for obj in page["Contents"]:
object_key = obj["Key"]
recording = await recordings_controller.get_by_object_key(
bucket_name, object_key
)
if not recording:
logger.info(f"Queueing recording for processing: {object_key}")
process_recording.delay(bucket_name, object_key)
reprocessed_count += 1
continue
if not (object_key.endswith(".mp4")):
continue
recording = await recordings_controller.get_by_object_key(
bucket_name, object_key
transcript = None
try:
transcript = await transcripts_controller.get_by_recording_id(
recording.id
)
except ValidationError:
await transcripts_controller.remove_by_recording_id(recording.id)
logger.warning(
f"Removed invalid transcript for recording: {recording.id}"
)
if not recording:
logger.info(f"Queueing recording for processing: {object_key}")
process_recording.delay(bucket_name, object_key)
reprocessed_count += 1
continue
transcript = None
try:
transcript = await transcripts_controller.get_by_recording_id(
recording.id
)
except ValidationError:
await transcripts_controller.remove_by_recording_id(recording.id)
logger.warning(
f"Removed invalid transcript for recording: {recording.id}"
)
if transcript is None or transcript.status == "error":
logger.info(f"Queueing recording for processing: {object_key}")
process_recording.delay(bucket_name, object_key)
reprocessed_count += 1
if transcript is None or transcript.status == "error":
logger.info(f"Queueing recording for processing: {object_key}")
process_recording.delay(bucket_name, object_key)
reprocessed_count += 1
except Exception as e:
logger.error(f"Error checking S3 bucket: {str(e)}")

View File

@@ -11,6 +11,8 @@ import structlog
from celery import shared_task
from celery.utils.log import get_task_logger
from reflector.db.calendar_events import calendar_events_controller
from reflector.db.meetings import meetings_controller
from reflector.db.rooms import rooms_controller
from reflector.db.transcripts import transcripts_controller
from reflector.pipelines.main_live_pipeline import asynctask
@@ -84,6 +86,18 @@ async def send_transcript_webhook(
}
)
# Fetch meeting and calendar event if they exist
calendar_event = None
try:
if transcript.meeting_id:
meeting = await meetings_controller.get_by_id(transcript.meeting_id)
if meeting and meeting.calendar_event_id:
calendar_event = await calendar_events_controller.get_by_id(
meeting.calendar_event_id
)
except Exception as e:
logger.error("Error fetching meeting or calendar event", error=str(e))
# Build webhook payload
frontend_url = f"{settings.UI_BASE_URL}/transcripts/{transcript.id}"
participants = [
@@ -116,6 +130,33 @@ async def send_transcript_webhook(
},
}
# Always include calendar_event field, even if no event is present
payload_data["calendar_event"] = {}
# Add calendar event data if present
if calendar_event:
calendar_data = {
"id": calendar_event.id,
"ics_uid": calendar_event.ics_uid,
"title": calendar_event.title,
"start_time": calendar_event.start_time.isoformat()
if calendar_event.start_time
else None,
"end_time": calendar_event.end_time.isoformat()
if calendar_event.end_time
else None,
}
# Add optional fields only if they exist
if calendar_event.description:
calendar_data["description"] = calendar_event.description
if calendar_event.location:
calendar_data["location"] = calendar_event.location
if calendar_event.attendees:
calendar_data["attendees"] = calendar_event.attendees
payload_data["calendar_event"] = calendar_data
# Convert to JSON
payload_json = json.dumps(payload_data, separators=(",", ":"))
payload_bytes = payload_json.encode("utf-8")

View File

@@ -65,8 +65,13 @@ class WebsocketManager:
self.tasks: dict = {}
self.pubsub_client = pubsub_client
async def add_user_to_room(self, room_id: str, websocket: WebSocket) -> None:
await websocket.accept()
async def add_user_to_room(
self, room_id: str, websocket: WebSocket, subprotocol: str | None = None
) -> None:
if subprotocol:
await websocket.accept(subprotocol=subprotocol)
else:
await websocket.accept()
if room_id in self.rooms:
self.rooms[room_id].append(websocket)

View File

@@ -0,0 +1,123 @@
#!/usr/bin/env python3
import asyncio
import sys
from pathlib import Path
sys.path.insert(0, str(Path(__file__).parent.parent))
import httpx
from reflector.settings import settings
async def setup_webhook(webhook_url: str):
"""
Create or update Daily.co webhook for this environment.
Uses DAILY_WEBHOOK_UUID to identify existing webhook.
"""
if not settings.DAILY_API_KEY:
print("Error: DAILY_API_KEY not set")
return 1
headers = {
"Authorization": f"Bearer {settings.DAILY_API_KEY}",
"Content-Type": "application/json",
}
webhook_data = {
"url": webhook_url,
"eventTypes": [
"participant.joined",
"participant.left",
"recording.started",
"recording.ready-to-download",
"recording.error",
],
"hmac": settings.DAILY_WEBHOOK_SECRET,
}
async with httpx.AsyncClient() as client:
webhook_uuid = settings.DAILY_WEBHOOK_UUID
if webhook_uuid:
# Update existing webhook
print(f"Updating existing webhook {webhook_uuid}...")
try:
resp = await client.patch(
f"https://api.daily.co/v1/webhooks/{webhook_uuid}",
headers=headers,
json=webhook_data,
)
resp.raise_for_status()
result = resp.json()
print(f"✓ Updated webhook {result['uuid']} (state: {result['state']})")
print(f" URL: {result['url']}")
return 0
except httpx.HTTPStatusError as e:
if e.response.status_code == 404:
print(f"Webhook {webhook_uuid} not found, creating new one...")
webhook_uuid = None # Fall through to creation
else:
print(f"Error updating webhook: {e}")
return 1
if not webhook_uuid:
# Create new webhook
print("Creating new webhook...")
resp = await client.post(
"https://api.daily.co/v1/webhooks", headers=headers, json=webhook_data
)
resp.raise_for_status()
result = resp.json()
webhook_uuid = result["uuid"]
print(f"✓ Created webhook {webhook_uuid} (state: {result['state']})")
print(f" URL: {result['url']}")
print()
print("=" * 60)
print("IMPORTANT: Add this to your environment variables:")
print("=" * 60)
print(f"DAILY_WEBHOOK_UUID: {webhook_uuid}")
print("=" * 60)
print()
# Try to write UUID to .env file
env_file = Path(__file__).parent.parent / ".env"
if env_file.exists():
lines = env_file.read_text().splitlines()
updated = False
# Update existing DAILY_WEBHOOK_UUID line or add it
for i, line in enumerate(lines):
if line.startswith("DAILY_WEBHOOK_UUID="):
lines[i] = f"DAILY_WEBHOOK_UUID={webhook_uuid}"
updated = True
break
if not updated:
lines.append(f"DAILY_WEBHOOK_UUID={webhook_uuid}")
env_file.write_text("\n".join(lines) + "\n")
print(f"✓ Also saved to local .env file")
else:
print(f"⚠ Local .env file not found - please add manually")
return 0
if __name__ == "__main__":
if len(sys.argv) != 2:
print("Usage: python recreate_daily_webhook.py <webhook_url>")
print(
"Example: python recreate_daily_webhook.py https://example.com/v1/daily/webhook"
)
print()
print("Behavior:")
print(" - If DAILY_WEBHOOK_UUID set: Updates existing webhook")
print(
" - If DAILY_WEBHOOK_UUID empty: Creates new webhook, saves UUID to .env"
)
sys.exit(1)
sys.exit(asyncio.run(setup_webhook(sys.argv[1])))

View File

@@ -1,9 +1,22 @@
import os
from contextlib import asynccontextmanager
from tempfile import NamedTemporaryFile
from unittest.mock import patch
import pytest
from reflector.schemas.platform import WHEREBY_PLATFORM
@pytest.fixture(scope="session", autouse=True)
def register_mock_platform():
from mocks.mock_platform import MockPlatformClient
from reflector.video_platforms.registry import register_platform
register_platform(WHEREBY_PLATFORM, MockPlatformClient)
yield
@pytest.fixture(scope="session", autouse=True)
def settings_configuration():
@@ -337,6 +350,166 @@ async def client():
yield ac
@pytest.fixture(autouse=True)
async def ws_manager_in_memory(monkeypatch):
"""Replace Redis-based WS manager with an in-memory implementation for tests."""
import asyncio
import json
from reflector.ws_manager import WebsocketManager
class _InMemorySubscriber:
def __init__(self, queue: asyncio.Queue):
self.queue = queue
async def get_message(self, ignore_subscribe_messages: bool = True):
try:
return await asyncio.wait_for(self.queue.get(), timeout=0.05)
except Exception:
return None
class InMemoryPubSubManager:
def __init__(self):
self.queues: dict[str, asyncio.Queue] = {}
self.connected = False
async def connect(self) -> None:
self.connected = True
async def disconnect(self) -> None:
self.connected = False
async def send_json(self, room_id: str, message: dict) -> None:
if room_id not in self.queues:
self.queues[room_id] = asyncio.Queue()
payload = json.dumps(message).encode("utf-8")
await self.queues[room_id].put(
{"channel": room_id.encode("utf-8"), "data": payload}
)
async def subscribe(self, room_id: str):
if room_id not in self.queues:
self.queues[room_id] = asyncio.Queue()
return _InMemorySubscriber(self.queues[room_id])
async def unsubscribe(self, room_id: str) -> None:
# keep queue for potential later resubscribe within same test
pass
pubsub = InMemoryPubSubManager()
ws_manager = WebsocketManager(pubsub_client=pubsub)
def _get_ws_manager():
return ws_manager
# Patch all places that imported get_ws_manager at import time
monkeypatch.setattr("reflector.ws_manager.get_ws_manager", _get_ws_manager)
monkeypatch.setattr(
"reflector.pipelines.main_live_pipeline.get_ws_manager", _get_ws_manager
)
monkeypatch.setattr(
"reflector.views.transcripts_websocket.get_ws_manager", _get_ws_manager
)
monkeypatch.setattr(
"reflector.views.user_websocket.get_ws_manager", _get_ws_manager
)
monkeypatch.setattr("reflector.views.transcripts.get_ws_manager", _get_ws_manager)
# Websocket auth: avoid OAuth2 on websocket dependencies; allow anonymous
import reflector.auth as auth
# Ensure FastAPI uses our override for routes that captured the original callable
from reflector.app import app as fastapi_app
try:
fastapi_app.dependency_overrides[auth.current_user_optional] = lambda: None
except Exception:
pass
# Stub Redis cache used by profanity filter to avoid external Redis
from reflector import redis_cache as rc
class _FakeRedis:
def __init__(self):
self._data = {}
def get(self, key):
value = self._data.get(key)
if value is None:
return None
if isinstance(value, bytes):
return value
return str(value).encode("utf-8")
def setex(self, key, duration, value):
# ignore duration for tests
if isinstance(value, bytes):
self._data[key] = value
else:
self._data[key] = str(value).encode("utf-8")
fake_redises: dict[int, _FakeRedis] = {}
def _get_redis_client(db=0):
if db not in fake_redises:
fake_redises[db] = _FakeRedis()
return fake_redises[db]
monkeypatch.setattr(rc, "get_redis_client", _get_redis_client)
yield
@pytest.fixture
@pytest.mark.asyncio
async def authenticated_client():
async with authenticated_client_ctx():
yield
@pytest.fixture
@pytest.mark.asyncio
async def authenticated_client2():
async with authenticated_client2_ctx():
yield
@asynccontextmanager
async def authenticated_client_ctx():
from reflector.app import app
from reflector.auth import current_user, current_user_optional
app.dependency_overrides[current_user] = lambda: {
"sub": "randomuserid",
"email": "test@mail.com",
}
app.dependency_overrides[current_user_optional] = lambda: {
"sub": "randomuserid",
"email": "test@mail.com",
}
yield
del app.dependency_overrides[current_user]
del app.dependency_overrides[current_user_optional]
@asynccontextmanager
async def authenticated_client2_ctx():
from reflector.app import app
from reflector.auth import current_user, current_user_optional
app.dependency_overrides[current_user] = lambda: {
"sub": "randomuserid2",
"email": "test@mail.com",
}
app.dependency_overrides[current_user_optional] = lambda: {
"sub": "randomuserid2",
"email": "test@mail.com",
}
yield
del app.dependency_overrides[current_user]
del app.dependency_overrides[current_user_optional]
@pytest.fixture(scope="session")
def fake_mp3_upload():
with patch(

View File

View File

@@ -0,0 +1,112 @@
import uuid
from datetime import datetime
from typing import Any, Dict, Literal, Optional
from reflector.db.rooms import Room
from reflector.video_platforms.base import (
ROOM_PREFIX_SEPARATOR,
MeetingData,
VideoPlatformClient,
VideoPlatformConfig,
)
MockPlatform = Literal["mock"]
class MockPlatformClient(VideoPlatformClient):
PLATFORM_NAME: MockPlatform = "mock"
def __init__(self, config: VideoPlatformConfig):
super().__init__(config)
self._rooms: Dict[str, Dict[str, Any]] = {}
self._webhook_calls: list[Dict[str, Any]] = []
async def create_meeting(
self, room_name_prefix: str, end_date: datetime, room: Room
) -> MeetingData:
meeting_id = str(uuid.uuid4())
room_name = f"{room_name_prefix}{ROOM_PREFIX_SEPARATOR}{meeting_id[:8]}"
room_url = f"https://mock.video/{room_name}"
host_room_url = f"{room_url}?host=true"
self._rooms[room_name] = {
"id": meeting_id,
"name": room_name,
"url": room_url,
"host_url": host_room_url,
"end_date": end_date,
"room": room,
"participants": [],
"is_active": True,
}
return MeetingData.model_construct(
meeting_id=meeting_id,
room_name=room_name,
room_url=room_url,
host_room_url=host_room_url,
platform="whereby",
extra_data={"mock": True},
)
async def get_room_sessions(self, room_name: str) -> Dict[str, Any]:
if room_name not in self._rooms:
return {"error": "Room not found"}
room_data = self._rooms[room_name]
return {
"roomName": room_name,
"sessions": [
{
"sessionId": room_data["id"],
"startTime": datetime.utcnow().isoformat(),
"participants": room_data["participants"],
"isActive": room_data["is_active"],
}
],
}
async def delete_room(self, room_name: str) -> bool:
if room_name in self._rooms:
self._rooms[room_name]["is_active"] = False
return True
return False
async def upload_logo(self, room_name: str, logo_path: str) -> bool:
if room_name in self._rooms:
self._rooms[room_name]["logo_path"] = logo_path
return True
return False
def verify_webhook_signature(
self, body: bytes, signature: str, timestamp: Optional[str] = None
) -> bool:
return signature == "valid"
def add_participant(
self, room_name: str, participant_id: str, participant_name: str
):
if room_name in self._rooms:
self._rooms[room_name]["participants"].append(
{
"id": participant_id,
"name": participant_name,
"joined_at": datetime.utcnow().isoformat(),
}
)
def trigger_webhook(self, event_type: str, data: Dict[str, Any]):
self._webhook_calls.append(
{
"type": event_type,
"data": data,
"timestamp": datetime.utcnow().isoformat(),
}
)
def get_webhook_calls(self) -> list[Dict[str, Any]]:
return self._webhook_calls.copy()
def clear_data(self):
self._rooms.clear()
self._webhook_calls.clear()

View File

@@ -139,14 +139,10 @@ async def test_cleanup_deletes_associated_meeting_and_recording():
mock_settings.PUBLIC_DATA_RETENTION_DAYS = 7
# Mock storage deletion
with patch("reflector.db.transcripts.get_transcripts_storage") as mock_storage:
with patch("reflector.worker.cleanup.get_transcripts_storage") as mock_storage:
mock_storage.return_value.delete_file = AsyncMock()
with patch(
"reflector.worker.cleanup.get_recordings_storage"
) as mock_rec_storage:
mock_rec_storage.return_value.delete_file = AsyncMock()
result = await cleanup_old_public_data()
result = await cleanup_old_public_data()
# Check results
assert result["transcripts_deleted"] == 1

View File

@@ -0,0 +1,330 @@
from datetime import datetime, timezone
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
from reflector.db.meetings import (
MeetingConsent,
meeting_consent_controller,
meetings_controller,
)
from reflector.db.recordings import Recording, recordings_controller
from reflector.db.rooms import rooms_controller
from reflector.db.transcripts import SourceKind, transcripts_controller
from reflector.pipelines.main_live_pipeline import cleanup_consent
@pytest.mark.asyncio
async def test_consent_cleanup_deletes_multitrack_files():
room = await rooms_controller.add(
name="Test Room",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic",
is_shared=False,
platform="daily",
)
# Create meeting
meeting = await meetings_controller.create(
id="test-multitrack-meeting",
room_name="test-room-20250101120000",
room_url="https://test.daily.co/test-room",
host_room_url="https://test.daily.co/test-room",
start_date=datetime.now(timezone.utc),
end_date=datetime.now(timezone.utc),
room=room,
)
track_keys = [
"recordings/test-room-20250101120000/track-0.webm",
"recordings/test-room-20250101120000/track-1.webm",
"recordings/test-room-20250101120000/track-2.webm",
]
recording = await recordings_controller.create(
Recording(
bucket_name="test-bucket",
object_key="recordings/test-room-20250101120000", # Folder path
recorded_at=datetime.now(timezone.utc),
meeting_id=meeting.id,
track_keys=track_keys,
)
)
# Create transcript
transcript = await transcripts_controller.add(
name="Test Multitrack Transcript",
source_kind=SourceKind.ROOM,
recording_id=recording.id,
meeting_id=meeting.id,
)
# Add consent denial
await meeting_consent_controller.upsert(
MeetingConsent(
meeting_id=meeting.id,
user_id="test-user",
consent_given=False,
consent_timestamp=datetime.now(timezone.utc),
)
)
# Mock get_transcripts_storage (master credentials with bucket override)
with patch(
"reflector.pipelines.main_live_pipeline.get_transcripts_storage"
) as mock_get_transcripts_storage:
mock_master_storage = MagicMock()
mock_master_storage.delete_file = AsyncMock()
mock_get_transcripts_storage.return_value = mock_master_storage
await cleanup_consent(transcript_id=transcript.id)
# Verify master storage was used with bucket override for all track keys
assert mock_master_storage.delete_file.call_count == 3
deleted_keys = []
for call_args in mock_master_storage.delete_file.call_args_list:
key = call_args[0][0]
bucket_kwarg = call_args[1].get("bucket")
deleted_keys.append(key)
assert bucket_kwarg == "test-bucket" # Verify bucket override!
assert set(deleted_keys) == set(track_keys)
updated_transcript = await transcripts_controller.get_by_id(transcript.id)
assert updated_transcript.audio_deleted is True
@pytest.mark.asyncio
async def test_consent_cleanup_handles_missing_track_keys():
room = await rooms_controller.add(
name="Test Room 2",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic",
is_shared=False,
platform="daily",
)
# Create meeting
meeting = await meetings_controller.create(
id="test-multitrack-meeting-2",
room_name="test-room-20250101120001",
room_url="https://test.daily.co/test-room-2",
host_room_url="https://test.daily.co/test-room-2",
start_date=datetime.now(timezone.utc),
end_date=datetime.now(timezone.utc),
room=room,
)
recording = await recordings_controller.create(
Recording(
bucket_name="test-bucket",
object_key="recordings/old-style-recording.mp4",
recorded_at=datetime.now(timezone.utc),
meeting_id=meeting.id,
track_keys=None,
)
)
transcript = await transcripts_controller.add(
name="Test Old-Style Transcript",
source_kind=SourceKind.ROOM,
recording_id=recording.id,
meeting_id=meeting.id,
)
# Add consent denial
await meeting_consent_controller.upsert(
MeetingConsent(
meeting_id=meeting.id,
user_id="test-user-2",
consent_given=False,
consent_timestamp=datetime.now(timezone.utc),
)
)
# Mock get_transcripts_storage (master credentials with bucket override)
with patch(
"reflector.pipelines.main_live_pipeline.get_transcripts_storage"
) as mock_get_transcripts_storage:
mock_master_storage = MagicMock()
mock_master_storage.delete_file = AsyncMock()
mock_get_transcripts_storage.return_value = mock_master_storage
await cleanup_consent(transcript_id=transcript.id)
# Verify master storage was used with bucket override
assert mock_master_storage.delete_file.call_count == 1
call_args = mock_master_storage.delete_file.call_args
assert call_args[0][0] == recording.object_key
assert call_args[1].get("bucket") == "test-bucket" # Verify bucket override!
@pytest.mark.asyncio
async def test_consent_cleanup_empty_track_keys_falls_back():
room = await rooms_controller.add(
name="Test Room 3",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic",
is_shared=False,
platform="daily",
)
# Create meeting
meeting = await meetings_controller.create(
id="test-multitrack-meeting-3",
room_name="test-room-20250101120002",
room_url="https://test.daily.co/test-room-3",
host_room_url="https://test.daily.co/test-room-3",
start_date=datetime.now(timezone.utc),
end_date=datetime.now(timezone.utc),
room=room,
)
recording = await recordings_controller.create(
Recording(
bucket_name="test-bucket",
object_key="recordings/fallback-recording.mp4",
recorded_at=datetime.now(timezone.utc),
meeting_id=meeting.id,
track_keys=[],
)
)
transcript = await transcripts_controller.add(
name="Test Empty Track Keys Transcript",
source_kind=SourceKind.ROOM,
recording_id=recording.id,
meeting_id=meeting.id,
)
# Add consent denial
await meeting_consent_controller.upsert(
MeetingConsent(
meeting_id=meeting.id,
user_id="test-user-3",
consent_given=False,
consent_timestamp=datetime.now(timezone.utc),
)
)
# Mock get_transcripts_storage (master credentials with bucket override)
with patch(
"reflector.pipelines.main_live_pipeline.get_transcripts_storage"
) as mock_get_transcripts_storage:
mock_master_storage = MagicMock()
mock_master_storage.delete_file = AsyncMock()
mock_get_transcripts_storage.return_value = mock_master_storage
# Run cleanup
await cleanup_consent(transcript_id=transcript.id)
# Verify master storage was used with bucket override
assert mock_master_storage.delete_file.call_count == 1
call_args = mock_master_storage.delete_file.call_args
assert call_args[0][0] == recording.object_key
assert call_args[1].get("bucket") == "test-bucket" # Verify bucket override!
@pytest.mark.asyncio
async def test_consent_cleanup_partial_failure_doesnt_mark_deleted():
room = await rooms_controller.add(
name="Test Room 4",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic",
is_shared=False,
platform="daily",
)
# Create meeting
meeting = await meetings_controller.create(
id="test-multitrack-meeting-4",
room_name="test-room-20250101120003",
room_url="https://test.daily.co/test-room-4",
host_room_url="https://test.daily.co/test-room-4",
start_date=datetime.now(timezone.utc),
end_date=datetime.now(timezone.utc),
room=room,
)
track_keys = [
"recordings/test-room-20250101120003/track-0.webm",
"recordings/test-room-20250101120003/track-1.webm",
"recordings/test-room-20250101120003/track-2.webm",
]
recording = await recordings_controller.create(
Recording(
bucket_name="test-bucket",
object_key="recordings/test-room-20250101120003",
recorded_at=datetime.now(timezone.utc),
meeting_id=meeting.id,
track_keys=track_keys,
)
)
# Create transcript
transcript = await transcripts_controller.add(
name="Test Partial Failure Transcript",
source_kind=SourceKind.ROOM,
recording_id=recording.id,
meeting_id=meeting.id,
)
# Add consent denial
await meeting_consent_controller.upsert(
MeetingConsent(
meeting_id=meeting.id,
user_id="test-user-4",
consent_given=False,
consent_timestamp=datetime.now(timezone.utc),
)
)
# Mock get_transcripts_storage (master credentials with bucket override) with partial failure
with patch(
"reflector.pipelines.main_live_pipeline.get_transcripts_storage"
) as mock_get_transcripts_storage:
mock_master_storage = MagicMock()
call_count = 0
async def delete_side_effect(key, bucket=None):
nonlocal call_count
call_count += 1
if call_count == 2:
raise Exception("S3 deletion failed")
mock_master_storage.delete_file = AsyncMock(side_effect=delete_side_effect)
mock_get_transcripts_storage.return_value = mock_master_storage
await cleanup_consent(transcript_id=transcript.id)
# Verify master storage was called with bucket override
assert mock_master_storage.delete_file.call_count == 3
updated_transcript = await transcripts_controller.get_by_id(transcript.id)
assert (
updated_transcript.audio_deleted is None
or updated_transcript.audio_deleted is False
)

View File

@@ -127,18 +127,27 @@ async def mock_storage():
from reflector.storage.base import Storage
class TestStorage(Storage):
async def _put_file(self, path, data):
async def _put_file(self, path, data, bucket=None):
return None
async def _get_file_url(self, path):
async def _get_file_url(
self,
path,
operation: str = "get_object",
expires_in: int = 3600,
bucket=None,
):
return f"http://test-storage/{path}"
async def _get_file(self, path):
async def _get_file(self, path, bucket=None):
return b"test_audio_data"
async def _delete_file(self, path):
async def _delete_file(self, path, bucket=None):
return None
async def _stream_to_fileobj(self, path, fileobj, bucket=None):
fileobj.write(b"test_audio_data")
storage = TestStorage()
# Add mock tracking for verification
storage._put_file = AsyncMock(side_effect=storage._put_file)
@@ -181,7 +190,7 @@ async def mock_waveform_processor():
async def mock_topic_detector():
"""Mock TranscriptTopicDetectorProcessor"""
with patch(
"reflector.pipelines.main_file_pipeline.TranscriptTopicDetectorProcessor"
"reflector.pipelines.topic_processing.TranscriptTopicDetectorProcessor"
) as mock_topic_class:
mock_topic = AsyncMock()
mock_topic.set_pipeline = MagicMock()
@@ -218,7 +227,7 @@ async def mock_topic_detector():
async def mock_title_processor():
"""Mock TranscriptFinalTitleProcessor"""
with patch(
"reflector.pipelines.main_file_pipeline.TranscriptFinalTitleProcessor"
"reflector.pipelines.topic_processing.TranscriptFinalTitleProcessor"
) as mock_title_class:
mock_title = AsyncMock()
mock_title.set_pipeline = MagicMock()
@@ -247,7 +256,7 @@ async def mock_title_processor():
async def mock_summary_processor():
"""Mock TranscriptFinalSummaryProcessor"""
with patch(
"reflector.pipelines.main_file_pipeline.TranscriptFinalSummaryProcessor"
"reflector.pipelines.topic_processing.TranscriptFinalSummaryProcessor"
) as mock_summary_class:
mock_summary = AsyncMock()
mock_summary.set_pipeline = MagicMock()

View File

@@ -11,14 +11,21 @@ from reflector.db.rooms import rooms_controller
@pytest.fixture
async def authenticated_client(client):
from reflector.app import app
from reflector.auth import current_user_optional
from reflector.auth import current_user, current_user_optional
app.dependency_overrides[current_user] = lambda: {
"sub": "test-user",
"email": "test@example.com",
}
app.dependency_overrides[current_user_optional] = lambda: {
"sub": "test-user",
"email": "test@example.com",
}
yield client
del app.dependency_overrides[current_user_optional]
try:
yield client
finally:
del app.dependency_overrides[current_user]
del app.dependency_overrides[current_user_optional]
@pytest.mark.asyncio
@@ -41,6 +48,7 @@ async def test_create_room_with_ics_fields(authenticated_client):
"ics_url": "https://calendar.example.com/test.ics",
"ics_fetch_interval": 600,
"ics_enabled": True,
"platform": "daily",
},
)
assert response.status_code == 200
@@ -68,6 +76,7 @@ async def test_update_room_ics_configuration(authenticated_client):
"is_shared": False,
"webhook_url": "",
"webhook_secret": "",
"platform": "daily",
},
)
assert response.status_code == 200
@@ -104,6 +113,7 @@ async def test_trigger_ics_sync(authenticated_client):
is_shared=False,
ics_url="https://calendar.example.com/api.ics",
ics_enabled=True,
platform="daily",
)
cal = Calendar()
@@ -147,6 +157,7 @@ async def test_trigger_ics_sync_unauthorized(client):
is_shared=False,
ics_url="https://calendar.example.com/api.ics",
ics_enabled=True,
platform="daily",
)
response = await client.post(f"/rooms/{room.name}/ics/sync")
@@ -169,6 +180,7 @@ async def test_trigger_ics_sync_not_configured(authenticated_client):
recording_trigger="automatic-2nd-participant",
is_shared=False,
ics_enabled=False,
platform="daily",
)
response = await client.post(f"/rooms/{room.name}/ics/sync")
@@ -193,6 +205,7 @@ async def test_get_ics_status(authenticated_client):
ics_url="https://calendar.example.com/status.ics",
ics_enabled=True,
ics_fetch_interval=300,
platform="daily",
)
now = datetime.now(timezone.utc)
@@ -224,6 +237,7 @@ async def test_get_ics_status_unauthorized(client):
is_shared=False,
ics_url="https://calendar.example.com/status.ics",
ics_enabled=True,
platform="daily",
)
response = await client.get(f"/rooms/{room.name}/ics/status")
@@ -245,6 +259,7 @@ async def test_list_room_meetings(authenticated_client):
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
platform="daily",
)
now = datetime.now(timezone.utc)
@@ -291,6 +306,7 @@ async def test_list_room_meetings_non_owner(client):
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
platform="daily",
)
event = CalendarEvent(
@@ -327,6 +343,7 @@ async def test_list_upcoming_meetings(authenticated_client):
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
platform="daily",
)
now = datetime.now(timezone.utc)

View File

@@ -0,0 +1,256 @@
from datetime import datetime, timedelta, timezone
import pytest
from reflector.db import get_database
from reflector.db.search import SearchParameters, search_controller
from reflector.db.transcripts import SourceKind, transcripts
@pytest.mark.asyncio
class TestDateRangeIntegration:
async def setup_test_transcripts(self):
# Use a test user_id that will match in our search parameters
test_user_id = "test-user-123"
test_data = [
{
"id": "test-before-range",
"created_at": datetime(2024, 1, 15, tzinfo=timezone.utc),
"title": "Before Range Transcript",
"user_id": test_user_id,
},
{
"id": "test-start-boundary",
"created_at": datetime(2024, 6, 1, tzinfo=timezone.utc),
"title": "Start Boundary Transcript",
"user_id": test_user_id,
},
{
"id": "test-middle-range",
"created_at": datetime(2024, 6, 15, tzinfo=timezone.utc),
"title": "Middle Range Transcript",
"user_id": test_user_id,
},
{
"id": "test-end-boundary",
"created_at": datetime(2024, 6, 30, 23, 59, 59, tzinfo=timezone.utc),
"title": "End Boundary Transcript",
"user_id": test_user_id,
},
{
"id": "test-after-range",
"created_at": datetime(2024, 12, 31, tzinfo=timezone.utc),
"title": "After Range Transcript",
"user_id": test_user_id,
},
]
for data in test_data:
full_data = {
"id": data["id"],
"name": data["id"],
"status": "ended",
"locked": False,
"duration": 60.0,
"created_at": data["created_at"],
"title": data["title"],
"short_summary": "Test summary",
"long_summary": "Test long summary",
"share_mode": "public",
"source_kind": SourceKind.FILE,
"audio_deleted": False,
"reviewed": False,
"user_id": data["user_id"],
}
await get_database().execute(transcripts.insert().values(**full_data))
return test_data
async def cleanup_test_transcripts(self, test_data):
"""Clean up test transcripts."""
for data in test_data:
await get_database().execute(
transcripts.delete().where(transcripts.c.id == data["id"])
)
@pytest.mark.asyncio
async def test_filter_with_from_datetime_only(self):
"""Test filtering with only from_datetime parameter."""
test_data = await self.setup_test_transcripts()
test_user_id = "test-user-123"
try:
params = SearchParameters(
query_text=None,
from_datetime=datetime(2024, 6, 1, tzinfo=timezone.utc),
to_datetime=None,
user_id=test_user_id,
)
results, total = await search_controller.search_transcripts(params)
# Should include: start_boundary, middle, end_boundary, after
result_ids = [r.id for r in results]
assert "test-before-range" not in result_ids
assert "test-start-boundary" in result_ids
assert "test-middle-range" in result_ids
assert "test-end-boundary" in result_ids
assert "test-after-range" in result_ids
finally:
await self.cleanup_test_transcripts(test_data)
@pytest.mark.asyncio
async def test_filter_with_to_datetime_only(self):
"""Test filtering with only to_datetime parameter."""
test_data = await self.setup_test_transcripts()
test_user_id = "test-user-123"
try:
params = SearchParameters(
query_text=None,
from_datetime=None,
to_datetime=datetime(2024, 6, 30, tzinfo=timezone.utc),
user_id=test_user_id,
)
results, total = await search_controller.search_transcripts(params)
result_ids = [r.id for r in results]
assert "test-before-range" in result_ids
assert "test-start-boundary" in result_ids
assert "test-middle-range" in result_ids
assert "test-end-boundary" not in result_ids
assert "test-after-range" not in result_ids
finally:
await self.cleanup_test_transcripts(test_data)
@pytest.mark.asyncio
async def test_filter_with_both_datetimes(self):
test_data = await self.setup_test_transcripts()
test_user_id = "test-user-123"
try:
params = SearchParameters(
query_text=None,
from_datetime=datetime(2024, 6, 1, tzinfo=timezone.utc),
to_datetime=datetime(
2024, 7, 1, tzinfo=timezone.utc
), # Inclusive of 6/30
user_id=test_user_id,
)
results, total = await search_controller.search_transcripts(params)
result_ids = [r.id for r in results]
assert "test-before-range" not in result_ids
assert "test-start-boundary" in result_ids
assert "test-middle-range" in result_ids
assert "test-end-boundary" in result_ids
assert "test-after-range" not in result_ids
finally:
await self.cleanup_test_transcripts(test_data)
@pytest.mark.asyncio
async def test_date_filter_with_room_and_source_kind(self):
test_data = await self.setup_test_transcripts()
test_user_id = "test-user-123"
try:
params = SearchParameters(
query_text=None,
from_datetime=datetime(2024, 6, 1, tzinfo=timezone.utc),
to_datetime=datetime(2024, 7, 1, tzinfo=timezone.utc),
source_kind=SourceKind.FILE,
room_id=None,
user_id=test_user_id,
)
results, total = await search_controller.search_transcripts(params)
for result in results:
assert result.source_kind == SourceKind.FILE
assert result.created_at >= datetime(2024, 6, 1, tzinfo=timezone.utc)
assert result.created_at <= datetime(2024, 7, 1, tzinfo=timezone.utc)
finally:
await self.cleanup_test_transcripts(test_data)
@pytest.mark.asyncio
async def test_empty_results_for_future_dates(self):
test_data = await self.setup_test_transcripts()
test_user_id = "test-user-123"
try:
params = SearchParameters(
query_text=None,
from_datetime=datetime(2099, 1, 1, tzinfo=timezone.utc),
to_datetime=datetime(2099, 12, 31, tzinfo=timezone.utc),
user_id=test_user_id,
)
results, total = await search_controller.search_transcripts(params)
assert results == []
assert total == 0
finally:
await self.cleanup_test_transcripts(test_data)
@pytest.mark.asyncio
async def test_date_only_input_handling(self):
test_data = await self.setup_test_transcripts()
test_user_id = "test-user-123"
try:
# Pydantic will parse date-only strings to datetime at midnight
from_dt = datetime(2024, 6, 15, 0, 0, 0, tzinfo=timezone.utc)
to_dt = datetime(2024, 6, 16, 0, 0, 0, tzinfo=timezone.utc)
params = SearchParameters(
query_text=None,
from_datetime=from_dt,
to_datetime=to_dt,
user_id=test_user_id,
)
results, total = await search_controller.search_transcripts(params)
result_ids = [r.id for r in results]
assert "test-middle-range" in result_ids
assert "test-before-range" not in result_ids
assert "test-after-range" not in result_ids
finally:
await self.cleanup_test_transcripts(test_data)
class TestDateValidationEdgeCases:
"""Edge case tests for datetime validation."""
def test_timezone_aware_comparison(self):
"""Test that timezone-aware comparisons work correctly."""
# PST time (UTC-8)
pst = timezone(timedelta(hours=-8))
pst_dt = datetime(2024, 6, 15, 8, 0, 0, tzinfo=pst)
# UTC time equivalent (8AM PST = 4PM UTC)
utc_dt = datetime(2024, 6, 15, 16, 0, 0, tzinfo=timezone.utc)
assert pst_dt == utc_dt
def test_mixed_timezone_input(self):
"""Test handling mixed timezone inputs."""
pst = timezone(timedelta(hours=-8))
ist = timezone(timedelta(hours=5, minutes=30))
from_date = datetime(2024, 6, 15, 0, 0, 0, tzinfo=pst) # PST midnight
to_date = datetime(2024, 6, 15, 23, 59, 59, tzinfo=ist) # IST end of day
assert from_date.tzinfo is not None
assert to_date.tzinfo is not None
assert from_date < to_date

View File

@@ -0,0 +1,384 @@
import asyncio
import shutil
import threading
import time
from pathlib import Path
import pytest
from httpx_ws import aconnect_ws
from uvicorn import Config, Server
from reflector import zulip as zulip_module
from reflector.app import app
from reflector.db import get_database
from reflector.db.meetings import meetings_controller
from reflector.db.rooms import Room, rooms_controller
from reflector.db.transcripts import (
SourceKind,
TranscriptTopic,
transcripts_controller,
)
from reflector.processors.types import Word
from reflector.settings import settings
from reflector.views.transcripts import create_access_token
@pytest.mark.asyncio
async def test_anonymous_cannot_delete_transcript_in_shared_room(client):
# Create a shared room with a fake owner id so meeting has a room_id
room = await rooms_controller.add(
name="shared-room-test",
user_id="owner-1",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=True,
webhook_url="",
webhook_secret="",
)
# Create a meeting for that room (so transcript.meeting_id links to the shared room)
meeting = await meetings_controller.create(
id="meeting-sec-test",
room_name="room-sec-test",
room_url="room-url",
host_room_url="host-url",
start_date=Room.model_fields["created_at"].default_factory(),
end_date=Room.model_fields["created_at"].default_factory(),
room=room,
)
# Create a transcript owned by someone else and link it to meeting
t = await transcripts_controller.add(
name="to-delete",
source_kind=SourceKind.LIVE,
user_id="owner-2",
meeting_id=meeting.id,
room_id=room.id,
share_mode="private",
)
# Anonymous DELETE should be rejected
del_resp = await client.delete(f"/transcripts/{t.id}")
assert del_resp.status_code == 401, del_resp.text
@pytest.mark.asyncio
async def test_anonymous_cannot_mutate_participants_on_public_transcript(client):
# Create a public transcript with no owner
t = await transcripts_controller.add(
name="public-transcript",
source_kind=SourceKind.LIVE,
user_id=None,
share_mode="public",
)
# Anonymous POST participant must be rejected
resp = await client.post(
f"/transcripts/{t.id}/participants",
json={"name": "AnonUser", "speaker": 0},
)
assert resp.status_code == 401, resp.text
@pytest.mark.asyncio
async def test_anonymous_cannot_update_and_delete_room(client):
# Create room as owner id "owner-3" via controller
room = await rooms_controller.add(
name="room-anon-update-delete",
user_id="owner-3",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
webhook_url="",
webhook_secret="",
)
# Anonymous PATCH via API (no auth)
resp = await client.patch(
f"/rooms/{room.id}",
json={
"name": "room-anon-updated",
"zulip_auto_post": False,
"zulip_stream": "",
"zulip_topic": "",
"is_locked": False,
"room_mode": "normal",
"recording_type": "cloud",
"recording_trigger": "automatic-2nd-participant",
"is_shared": False,
"webhook_url": "",
"webhook_secret": "",
},
)
# Expect authentication required
assert resp.status_code == 401, resp.text
# Anonymous DELETE via API
del_resp = await client.delete(f"/rooms/{room.id}")
assert del_resp.status_code == 401, del_resp.text
@pytest.mark.asyncio
async def test_anonymous_cannot_post_transcript_to_zulip(client, monkeypatch):
# Create a public transcript with some content
t = await transcripts_controller.add(
name="zulip-public",
source_kind=SourceKind.LIVE,
user_id=None,
share_mode="public",
)
# Mock send/update calls
def _fake_send_message_to_zulip(stream, topic, content):
return {"id": 12345}
async def _fake_update_message(message_id, stream, topic, content):
return {"result": "success"}
monkeypatch.setattr(
zulip_module, "send_message_to_zulip", _fake_send_message_to_zulip
)
monkeypatch.setattr(zulip_module, "update_zulip_message", _fake_update_message)
# Anonymous POST to Zulip endpoint
resp = await client.post(
f"/transcripts/{t.id}/zulip",
params={"stream": "general", "topic": "Updates", "include_topics": False},
)
assert resp.status_code == 401, resp.text
@pytest.mark.asyncio
async def test_anonymous_cannot_assign_speaker_on_public_transcript(client):
# Create public transcript
t = await transcripts_controller.add(
name="public-assign",
source_kind=SourceKind.LIVE,
user_id=None,
share_mode="public",
)
# Add a topic with words to be reassigned
topic = TranscriptTopic(
title="T1",
summary="S1",
timestamp=0.0,
transcript="Hello",
words=[Word(start=0.0, end=1.0, text="Hello", speaker=0)],
)
transcript = await transcripts_controller.get_by_id(t.id)
await transcripts_controller.upsert_topic(transcript, topic)
# Anonymous assign speaker over time range covering the word
resp = await client.patch(
f"/transcripts/{t.id}/speaker/assign",
json={
"speaker": 1,
"timestamp_from": 0.0,
"timestamp_to": 1.0,
},
)
assert resp.status_code == 401, resp.text
# Minimal server fixture for websocket tests
@pytest.fixture
def appserver_ws_simple(setup_database):
host = "127.0.0.1"
port = 1256
server_started = threading.Event()
server_exception = None
server_instance = None
def run_server():
nonlocal server_exception, server_instance
try:
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
config = Config(app=app, host=host, port=port, loop=loop)
server_instance = Server(config)
async def start_server():
database = get_database()
await database.connect()
try:
await server_instance.serve()
finally:
await database.disconnect()
server_started.set()
loop.run_until_complete(start_server())
except Exception as e:
server_exception = e
server_started.set()
finally:
loop.close()
server_thread = threading.Thread(target=run_server, daemon=True)
server_thread.start()
server_started.wait(timeout=30)
if server_exception:
raise server_exception
time.sleep(0.5)
yield host, port
if server_instance:
server_instance.should_exit = True
server_thread.join(timeout=30)
@pytest.mark.asyncio
async def test_websocket_denies_anonymous_on_private_transcript(appserver_ws_simple):
host, port = appserver_ws_simple
# Create a private transcript owned by someone
t = await transcripts_controller.add(
name="private-ws",
source_kind=SourceKind.LIVE,
user_id="owner-x",
share_mode="private",
)
base_url = f"http://{host}:{port}/v1"
# Anonymous connect should be denied
with pytest.raises(Exception):
async with aconnect_ws(f"{base_url}/transcripts/{t.id}/events") as ws:
await ws.close()
@pytest.mark.asyncio
async def test_anonymous_cannot_update_public_transcript(client):
t = await transcripts_controller.add(
name="update-me",
source_kind=SourceKind.LIVE,
user_id=None,
share_mode="public",
)
resp = await client.patch(
f"/transcripts/{t.id}",
json={"title": "New Title From Anonymous"},
)
assert resp.status_code == 401, resp.text
@pytest.mark.asyncio
async def test_anonymous_cannot_get_nonshared_room_by_id(client):
room = await rooms_controller.add(
name="private-room-exposed",
user_id="owner-z",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
webhook_url="",
webhook_secret="",
)
resp = await client.get(f"/rooms/{room.id}")
assert resp.status_code == 403, resp.text
@pytest.mark.asyncio
async def test_anonymous_cannot_call_rooms_webhook_test(client):
room = await rooms_controller.add(
name="room-webhook-test",
user_id="owner-y",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
webhook_url="http://localhost.invalid/webhook",
webhook_secret="secret",
)
# Anonymous caller
resp = await client.post(f"/rooms/{room.id}/webhook/test")
assert resp.status_code == 401, resp.text
@pytest.mark.asyncio
async def test_anonymous_cannot_create_room(client):
payload = {
"name": "room-create-auth-required",
"zulip_auto_post": False,
"zulip_stream": "",
"zulip_topic": "",
"is_locked": False,
"room_mode": "normal",
"recording_type": "cloud",
"recording_trigger": "automatic-2nd-participant",
"is_shared": False,
"webhook_url": "",
"webhook_secret": "",
}
resp = await client.post("/rooms", json=payload)
assert resp.status_code == 401, resp.text
@pytest.mark.asyncio
async def test_list_search_401_when_public_mode_false(client, monkeypatch):
monkeypatch.setattr(settings, "PUBLIC_MODE", False)
resp = await client.get("/transcripts")
assert resp.status_code == 401
resp = await client.get("/transcripts/search", params={"q": "hello"})
assert resp.status_code == 401
@pytest.mark.asyncio
async def test_audio_mp3_requires_token_for_owned_transcript(
client, tmpdir, monkeypatch
):
# Use temp data dir
monkeypatch.setattr(settings, "DATA_DIR", Path(tmpdir).as_posix())
# Create owner transcript and attach a local mp3
t = await transcripts_controller.add(
name="owned-audio",
source_kind=SourceKind.LIVE,
user_id="owner-a",
share_mode="private",
)
tr = await transcripts_controller.get_by_id(t.id)
await transcripts_controller.update(tr, {"status": "ended"})
# copy fixture audio to transcript path
audio_path = Path(__file__).parent / "records" / "test_mathieu_hello.mp3"
tr.audio_mp3_filename.parent.mkdir(parents=True, exist_ok=True)
shutil.copy(audio_path, tr.audio_mp3_filename)
# Anonymous GET without token should be 403 or 404 depending on access; we call mp3
resp = await client.get(f"/transcripts/{t.id}/audio/mp3")
assert resp.status_code == 403
# With token should succeed
token = create_access_token(
{"sub": tr.user_id}, expires_delta=__import__("datetime").timedelta(minutes=15)
)
resp2 = await client.get(f"/transcripts/{t.id}/audio/mp3", params={"token": token})
assert resp2.status_code == 200

View File

@@ -0,0 +1,321 @@
"""Tests for storage abstraction layer."""
import io
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
from botocore.exceptions import ClientError
from reflector.storage.base import StoragePermissionError
from reflector.storage.storage_aws import AwsStorage
@pytest.mark.asyncio
async def test_aws_storage_stream_to_fileobj():
"""Test that AWS storage can stream directly to a file object without loading into memory."""
# Setup
storage = AwsStorage(
aws_bucket_name="test-bucket",
aws_region="us-east-1",
aws_access_key_id="test-key",
aws_secret_access_key="test-secret",
)
# Mock download_fileobj to write data
async def mock_download(Bucket, Key, Fileobj, **kwargs):
Fileobj.write(b"chunk1chunk2")
mock_client = AsyncMock()
mock_client.download_fileobj = AsyncMock(side_effect=mock_download)
mock_client.__aenter__ = AsyncMock(return_value=mock_client)
mock_client.__aexit__ = AsyncMock(return_value=None)
# Patch the session client
with patch.object(storage.session, "client", return_value=mock_client):
# Create a file-like object to stream to
output = io.BytesIO()
# Act - stream to file object
await storage.stream_to_fileobj("test-file.mp4", output, bucket="test-bucket")
# Assert
mock_client.download_fileobj.assert_called_once_with(
Bucket="test-bucket", Key="test-file.mp4", Fileobj=output
)
# Check that data was written to output
output.seek(0)
assert output.read() == b"chunk1chunk2"
@pytest.mark.asyncio
async def test_aws_storage_stream_to_fileobj_with_folder():
"""Test streaming with folder prefix in bucket name."""
storage = AwsStorage(
aws_bucket_name="test-bucket/recordings",
aws_region="us-east-1",
aws_access_key_id="test-key",
aws_secret_access_key="test-secret",
)
async def mock_download(Bucket, Key, Fileobj, **kwargs):
Fileobj.write(b"data")
mock_client = AsyncMock()
mock_client.download_fileobj = AsyncMock(side_effect=mock_download)
mock_client.__aenter__ = AsyncMock(return_value=mock_client)
mock_client.__aexit__ = AsyncMock(return_value=None)
with patch.object(storage.session, "client", return_value=mock_client):
output = io.BytesIO()
await storage.stream_to_fileobj("file.mp4", output, bucket="other-bucket")
# Should use folder prefix from instance config
mock_client.download_fileobj.assert_called_once_with(
Bucket="other-bucket", Key="recordings/file.mp4", Fileobj=output
)
@pytest.mark.asyncio
async def test_storage_base_class_stream_to_fileobj():
"""Test that base Storage class has stream_to_fileobj method."""
from reflector.storage.base import Storage
# Verify method exists in base class
assert hasattr(Storage, "stream_to_fileobj")
# Create a mock storage instance
storage = MagicMock(spec=Storage)
storage.stream_to_fileobj = AsyncMock()
# Should be callable
await storage.stream_to_fileobj("file.mp4", io.BytesIO())
storage.stream_to_fileobj.assert_called_once()
@pytest.mark.asyncio
async def test_aws_storage_stream_uses_download_fileobj():
"""Test that download_fileobj is called correctly."""
storage = AwsStorage(
aws_bucket_name="test-bucket",
aws_region="us-east-1",
aws_access_key_id="test-key",
aws_secret_access_key="test-secret",
)
async def mock_download(Bucket, Key, Fileobj, **kwargs):
Fileobj.write(b"data")
mock_client = AsyncMock()
mock_client.download_fileobj = AsyncMock(side_effect=mock_download)
mock_client.__aenter__ = AsyncMock(return_value=mock_client)
mock_client.__aexit__ = AsyncMock(return_value=None)
with patch.object(storage.session, "client", return_value=mock_client):
output = io.BytesIO()
await storage.stream_to_fileobj("test.mp4", output)
# Verify download_fileobj was called with correct parameters
mock_client.download_fileobj.assert_called_once_with(
Bucket="test-bucket", Key="test.mp4", Fileobj=output
)
@pytest.mark.asyncio
async def test_aws_storage_handles_access_denied_error():
"""Test that AccessDenied errors are caught and wrapped in StoragePermissionError."""
storage = AwsStorage(
aws_bucket_name="test-bucket",
aws_region="us-east-1",
aws_access_key_id="test-key",
aws_secret_access_key="test-secret",
)
# Mock ClientError with AccessDenied
error_response = {"Error": {"Code": "AccessDenied", "Message": "Access Denied"}}
mock_client = AsyncMock()
mock_client.put_object = AsyncMock(
side_effect=ClientError(error_response, "PutObject")
)
mock_client.__aenter__ = AsyncMock(return_value=mock_client)
mock_client.__aexit__ = AsyncMock(return_value=None)
with patch.object(storage.session, "client", return_value=mock_client):
with pytest.raises(StoragePermissionError) as exc_info:
await storage.put_file("test.txt", b"data")
# Verify error message contains expected information
error_msg = str(exc_info.value)
assert "AccessDenied" in error_msg
assert "default bucket 'test-bucket'" in error_msg
assert "S3 upload failed" in error_msg
@pytest.mark.asyncio
async def test_aws_storage_handles_no_such_bucket_error():
"""Test that NoSuchBucket errors are caught and wrapped in StoragePermissionError."""
storage = AwsStorage(
aws_bucket_name="test-bucket",
aws_region="us-east-1",
aws_access_key_id="test-key",
aws_secret_access_key="test-secret",
)
# Mock ClientError with NoSuchBucket
error_response = {
"Error": {
"Code": "NoSuchBucket",
"Message": "The specified bucket does not exist",
}
}
mock_client = AsyncMock()
mock_client.delete_object = AsyncMock(
side_effect=ClientError(error_response, "DeleteObject")
)
mock_client.__aenter__ = AsyncMock(return_value=mock_client)
mock_client.__aexit__ = AsyncMock(return_value=None)
with patch.object(storage.session, "client", return_value=mock_client):
with pytest.raises(StoragePermissionError) as exc_info:
await storage.delete_file("test.txt")
# Verify error message contains expected information
error_msg = str(exc_info.value)
assert "NoSuchBucket" in error_msg
assert "default bucket 'test-bucket'" in error_msg
assert "S3 delete failed" in error_msg
@pytest.mark.asyncio
async def test_aws_storage_error_message_with_bucket_override():
"""Test that error messages correctly show overridden bucket."""
storage = AwsStorage(
aws_bucket_name="default-bucket",
aws_region="us-east-1",
aws_access_key_id="test-key",
aws_secret_access_key="test-secret",
)
# Mock ClientError with AccessDenied
error_response = {"Error": {"Code": "AccessDenied", "Message": "Access Denied"}}
mock_client = AsyncMock()
mock_client.get_object = AsyncMock(
side_effect=ClientError(error_response, "GetObject")
)
mock_client.__aenter__ = AsyncMock(return_value=mock_client)
mock_client.__aexit__ = AsyncMock(return_value=None)
with patch.object(storage.session, "client", return_value=mock_client):
with pytest.raises(StoragePermissionError) as exc_info:
await storage.get_file("test.txt", bucket="override-bucket")
# Verify error message shows overridden bucket, not default
error_msg = str(exc_info.value)
assert "overridden bucket 'override-bucket'" in error_msg
assert "default-bucket" not in error_msg
assert "S3 download failed" in error_msg
@pytest.mark.asyncio
async def test_aws_storage_reraises_non_handled_errors():
"""Test that non-AccessDenied/NoSuchBucket errors are re-raised as-is."""
storage = AwsStorage(
aws_bucket_name="test-bucket",
aws_region="us-east-1",
aws_access_key_id="test-key",
aws_secret_access_key="test-secret",
)
# Mock ClientError with different error code
error_response = {
"Error": {"Code": "InternalError", "Message": "Internal Server Error"}
}
mock_client = AsyncMock()
mock_client.put_object = AsyncMock(
side_effect=ClientError(error_response, "PutObject")
)
mock_client.__aenter__ = AsyncMock(return_value=mock_client)
mock_client.__aexit__ = AsyncMock(return_value=None)
with patch.object(storage.session, "client", return_value=mock_client):
# Should raise ClientError, not StoragePermissionError
with pytest.raises(ClientError) as exc_info:
await storage.put_file("test.txt", b"data")
# Verify it's the original ClientError
assert exc_info.value.response["Error"]["Code"] == "InternalError"
@pytest.mark.asyncio
async def test_aws_storage_presign_url_handles_errors():
"""Test that presigned URL generation handles permission errors."""
storage = AwsStorage(
aws_bucket_name="test-bucket",
aws_region="us-east-1",
aws_access_key_id="test-key",
aws_secret_access_key="test-secret",
)
# Mock ClientError with AccessDenied during presign operation
error_response = {"Error": {"Code": "AccessDenied", "Message": "Access Denied"}}
mock_client = AsyncMock()
mock_client.generate_presigned_url = AsyncMock(
side_effect=ClientError(error_response, "GeneratePresignedUrl")
)
mock_client.__aenter__ = AsyncMock(return_value=mock_client)
mock_client.__aexit__ = AsyncMock(return_value=None)
with patch.object(storage.session, "client", return_value=mock_client):
with pytest.raises(StoragePermissionError) as exc_info:
await storage.get_file_url("test.txt")
# Verify error message
error_msg = str(exc_info.value)
assert "S3 presign failed" in error_msg
assert "AccessDenied" in error_msg
@pytest.mark.asyncio
async def test_aws_storage_list_objects_handles_errors():
"""Test that list_objects handles permission errors."""
storage = AwsStorage(
aws_bucket_name="test-bucket",
aws_region="us-east-1",
aws_access_key_id="test-key",
aws_secret_access_key="test-secret",
)
# Mock ClientError during list operation
error_response = {"Error": {"Code": "AccessDenied", "Message": "Access Denied"}}
mock_paginator = MagicMock()
async def mock_paginate(*args, **kwargs):
raise ClientError(error_response, "ListObjectsV2")
yield # Make it an async generator
mock_paginator.paginate = mock_paginate
mock_client = AsyncMock()
mock_client.get_paginator = MagicMock(return_value=mock_paginator)
mock_client.__aenter__ = AsyncMock(return_value=mock_client)
mock_client.__aexit__ = AsyncMock(return_value=None)
with patch.object(storage.session, "client", return_value=mock_client):
with pytest.raises(StoragePermissionError) as exc_info:
await storage.list_objects(prefix="test/")
error_msg = str(exc_info.value)
assert "S3 list_objects failed" in error_msg
assert "AccessDenied" in error_msg
def test_aws_storage_constructor_rejects_mixed_auth():
"""Test that constructor rejects both role_arn and access keys."""
with pytest.raises(ValueError, match="cannot use both.*role_arn.*access keys"):
AwsStorage(
aws_bucket_name="test-bucket",
aws_region="us-east-1",
aws_access_key_id="test-key",
aws_secret_access_key="test-secret",
aws_role_arn="arn:aws:iam::123456789012:role/test-role",
)

View File

@@ -1,5 +1,3 @@
from contextlib import asynccontextmanager
import pytest
@@ -19,7 +17,7 @@ async def test_transcript_create(client):
@pytest.mark.asyncio
async def test_transcript_get_update_name(client):
async def test_transcript_get_update_name(authenticated_client, client):
response = await client.post("/transcripts", json={"name": "test"})
assert response.status_code == 200
assert response.json()["name"] == "test"
@@ -40,7 +38,7 @@ async def test_transcript_get_update_name(client):
@pytest.mark.asyncio
async def test_transcript_get_update_locked(client):
async def test_transcript_get_update_locked(authenticated_client, client):
response = await client.post("/transcripts", json={"name": "test"})
assert response.status_code == 200
assert response.json()["locked"] is False
@@ -61,7 +59,7 @@ async def test_transcript_get_update_locked(client):
@pytest.mark.asyncio
async def test_transcript_get_update_summary(client):
async def test_transcript_get_update_summary(authenticated_client, client):
response = await client.post("/transcripts", json={"name": "test"})
assert response.status_code == 200
assert response.json()["long_summary"] is None
@@ -89,7 +87,7 @@ async def test_transcript_get_update_summary(client):
@pytest.mark.asyncio
async def test_transcript_get_update_title(client):
async def test_transcript_get_update_title(authenticated_client, client):
response = await client.post("/transcripts", json={"name": "test"})
assert response.status_code == 200
assert response.json()["title"] is None
@@ -127,56 +125,6 @@ async def test_transcripts_list_anonymous(client):
settings.PUBLIC_MODE = False
@asynccontextmanager
async def authenticated_client_ctx():
from reflector.app import app
from reflector.auth import current_user, current_user_optional
app.dependency_overrides[current_user] = lambda: {
"sub": "randomuserid",
"email": "test@mail.com",
}
app.dependency_overrides[current_user_optional] = lambda: {
"sub": "randomuserid",
"email": "test@mail.com",
}
yield
del app.dependency_overrides[current_user]
del app.dependency_overrides[current_user_optional]
@asynccontextmanager
async def authenticated_client2_ctx():
from reflector.app import app
from reflector.auth import current_user, current_user_optional
app.dependency_overrides[current_user] = lambda: {
"sub": "randomuserid2",
"email": "test@mail.com",
}
app.dependency_overrides[current_user_optional] = lambda: {
"sub": "randomuserid2",
"email": "test@mail.com",
}
yield
del app.dependency_overrides[current_user]
del app.dependency_overrides[current_user_optional]
@pytest.fixture
@pytest.mark.asyncio
async def authenticated_client():
async with authenticated_client_ctx():
yield
@pytest.fixture
@pytest.mark.asyncio
async def authenticated_client2():
async with authenticated_client2_ctx():
yield
@pytest.mark.asyncio
async def test_transcripts_list_authenticated(authenticated_client, client):
# XXX this test is a bit fragile, as it depends on the storage which
@@ -199,7 +147,7 @@ async def test_transcripts_list_authenticated(authenticated_client, client):
@pytest.mark.asyncio
async def test_transcript_delete(client):
async def test_transcript_delete(authenticated_client, client):
response = await client.post("/transcripts", json={"name": "testdel1"})
assert response.status_code == 200
assert response.json()["name"] == "testdel1"
@@ -214,7 +162,7 @@ async def test_transcript_delete(client):
@pytest.mark.asyncio
async def test_transcript_mark_reviewed(client):
async def test_transcript_mark_reviewed(authenticated_client, client):
response = await client.post("/transcripts", json={"name": "test"})
assert response.status_code == 200
assert response.json()["name"] == "test"

View File

@@ -111,7 +111,9 @@ async def test_transcript_audio_download_range_with_seek(
@pytest.mark.asyncio
async def test_transcript_delete_with_audio(fake_transcript, client):
async def test_transcript_delete_with_audio(
authenticated_client, fake_transcript, client
):
response = await client.delete(f"/transcripts/{fake_transcript.id}")
assert response.status_code == 200
assert response.json()["status"] == "ok"

View File

@@ -2,7 +2,7 @@ import pytest
@pytest.mark.asyncio
async def test_transcript_participants(client):
async def test_transcript_participants(authenticated_client, client):
response = await client.post("/transcripts", json={"name": "test"})
assert response.status_code == 200
assert response.json()["participants"] == []
@@ -39,7 +39,7 @@ async def test_transcript_participants(client):
@pytest.mark.asyncio
async def test_transcript_participants_same_speaker(client):
async def test_transcript_participants_same_speaker(authenticated_client, client):
response = await client.post("/transcripts", json={"name": "test"})
assert response.status_code == 200
assert response.json()["participants"] == []
@@ -62,7 +62,7 @@ async def test_transcript_participants_same_speaker(client):
@pytest.mark.asyncio
async def test_transcript_participants_update_name(client):
async def test_transcript_participants_update_name(authenticated_client, client):
response = await client.post("/transcripts", json={"name": "test"})
assert response.status_code == 200
assert response.json()["participants"] == []
@@ -100,7 +100,7 @@ async def test_transcript_participants_update_name(client):
@pytest.mark.asyncio
async def test_transcript_participants_update_speaker(client):
async def test_transcript_participants_update_speaker(authenticated_client, client):
response = await client.post("/transcripts", json={"name": "test"})
assert response.status_code == 200
assert response.json()["participants"] == []

View File

@@ -22,13 +22,16 @@ async def test_recording_deleted_with_transcript():
recording_id=recording.id,
)
with patch("reflector.db.transcripts.get_recordings_storage") as mock_get_storage:
with patch("reflector.db.transcripts.get_transcripts_storage") as mock_get_storage:
storage_instance = mock_get_storage.return_value
storage_instance.delete_file = AsyncMock()
await transcripts_controller.remove_by_id(transcript.id)
storage_instance.delete_file.assert_awaited_once_with(recording.object_key)
# Should be called with bucket override
storage_instance.delete_file.assert_awaited_once_with(
recording.object_key, bucket=recording.bucket_name
)
assert await recordings_controller.get_by_id(recording.id) is None
assert await transcripts_controller.get_by_id(transcript.id) is None

View File

@@ -2,7 +2,9 @@ import pytest
@pytest.mark.asyncio
async def test_transcript_reassign_speaker(fake_transcript_with_topics, client):
async def test_transcript_reassign_speaker(
authenticated_client, fake_transcript_with_topics, client
):
transcript_id = fake_transcript_with_topics.id
# check the transcript exists
@@ -114,7 +116,9 @@ async def test_transcript_reassign_speaker(fake_transcript_with_topics, client):
@pytest.mark.asyncio
async def test_transcript_merge_speaker(fake_transcript_with_topics, client):
async def test_transcript_merge_speaker(
authenticated_client, fake_transcript_with_topics, client
):
transcript_id = fake_transcript_with_topics.id
# check the transcript exists
@@ -181,7 +185,7 @@ async def test_transcript_merge_speaker(fake_transcript_with_topics, client):
@pytest.mark.asyncio
async def test_transcript_reassign_with_participant(
fake_transcript_with_topics, client
authenticated_client, fake_transcript_with_topics, client
):
transcript_id = fake_transcript_with_topics.id
@@ -347,7 +351,9 @@ async def test_transcript_reassign_with_participant(
@pytest.mark.asyncio
async def test_transcript_reassign_edge_cases(fake_transcript_with_topics, client):
async def test_transcript_reassign_edge_cases(
authenticated_client, fake_transcript_with_topics, client
):
transcript_id = fake_transcript_with_topics.id
# check the transcript exists

View File

@@ -0,0 +1,70 @@
import pytest
from reflector.db.user_api_keys import user_api_keys_controller
@pytest.mark.asyncio
async def test_api_key_creation_and_verification():
api_key_model, plaintext = await user_api_keys_controller.create_key(
user_id="test_user",
name="Test API Key",
)
verified = await user_api_keys_controller.verify_key(plaintext)
assert verified is not None
assert verified.user_id == "test_user"
assert verified.name == "Test API Key"
invalid = await user_api_keys_controller.verify_key("fake_key")
assert invalid is None
@pytest.mark.asyncio
async def test_api_key_hashing():
_, plaintext = await user_api_keys_controller.create_key(
user_id="test_user_2",
)
api_keys = await user_api_keys_controller.list_by_user_id("test_user_2")
assert len(api_keys) == 1
assert api_keys[0].key_hash != plaintext
@pytest.mark.asyncio
async def test_generate_api_key_uniqueness():
key1 = user_api_keys_controller.generate_key()
key2 = user_api_keys_controller.generate_key()
assert key1 != key2
@pytest.mark.asyncio
async def test_hash_api_key_deterministic():
key = "test_key_123"
hash1 = user_api_keys_controller.hash_key(key)
hash2 = user_api_keys_controller.hash_key(key)
assert hash1 == hash2
@pytest.mark.asyncio
async def test_get_by_user_id_empty():
api_keys = await user_api_keys_controller.list_by_user_id("nonexistent_user")
assert api_keys == []
@pytest.mark.asyncio
async def test_get_by_user_id_multiple():
user_id = "multi_key_user"
_, plaintext1 = await user_api_keys_controller.create_key(
user_id=user_id,
name="API Key 1",
)
_, plaintext2 = await user_api_keys_controller.create_key(
user_id=user_id,
name="API Key 2",
)
api_keys = await user_api_keys_controller.list_by_user_id(user_id)
assert len(api_keys) == 2
names = {k.name for k in api_keys}
assert names == {"API Key 1", "API Key 2"}

View File

@@ -0,0 +1,156 @@
import asyncio
import threading
import time
import pytest
from httpx_ws import aconnect_ws
from uvicorn import Config, Server
@pytest.fixture
def appserver_ws_user(setup_database):
from reflector.app import app
from reflector.db import get_database
host = "127.0.0.1"
port = 1257
server_started = threading.Event()
server_exception = None
server_instance = None
def run_server():
nonlocal server_exception, server_instance
try:
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
config = Config(app=app, host=host, port=port, loop=loop)
server_instance = Server(config)
async def start_server():
database = get_database()
await database.connect()
try:
await server_instance.serve()
finally:
await database.disconnect()
server_started.set()
loop.run_until_complete(start_server())
except Exception as e:
server_exception = e
server_started.set()
finally:
loop.close()
server_thread = threading.Thread(target=run_server, daemon=True)
server_thread.start()
server_started.wait(timeout=30)
if server_exception:
raise server_exception
time.sleep(0.5)
yield host, port
if server_instance:
server_instance.should_exit = True
server_thread.join(timeout=30)
@pytest.fixture(autouse=True)
def patch_jwt_verification(monkeypatch):
"""Patch JWT verification to accept HS256 tokens signed with SECRET_KEY for tests."""
from jose import jwt
from reflector.settings import settings
def _verify_token(self, token: str):
# Do not validate audience in tests
return jwt.decode(token, settings.SECRET_KEY, algorithms=["HS256"]) # type: ignore[arg-type]
monkeypatch.setattr(
"reflector.auth.auth_jwt.JWTAuth.verify_token", _verify_token, raising=True
)
def _make_dummy_jwt(sub: str = "user123") -> str:
# Create a short HS256 JWT using the app secret to pass verification in tests
from datetime import datetime, timedelta, timezone
from jose import jwt
from reflector.settings import settings
payload = {
"sub": sub,
"email": f"{sub}@example.com",
"exp": datetime.now(timezone.utc) + timedelta(minutes=5),
}
# Note: production uses RS256 public key verification; tests can sign with SECRET_KEY
return jwt.encode(payload, settings.SECRET_KEY, algorithm="HS256")
@pytest.mark.asyncio
async def test_user_ws_rejects_missing_subprotocol(appserver_ws_user):
host, port = appserver_ws_user
base_ws = f"http://{host}:{port}/v1/events"
# No subprotocol/header with token
with pytest.raises(Exception):
async with aconnect_ws(base_ws) as ws: # type: ignore
# Should close during handshake; if not, close explicitly
await ws.close()
@pytest.mark.asyncio
async def test_user_ws_rejects_invalid_token(appserver_ws_user):
host, port = appserver_ws_user
base_ws = f"http://{host}:{port}/v1/events"
# Send wrong token via WebSocket subprotocols
protocols = ["bearer", "totally-invalid-token"]
with pytest.raises(Exception):
async with aconnect_ws(base_ws, subprotocols=protocols) as ws: # type: ignore
await ws.close()
@pytest.mark.asyncio
async def test_user_ws_accepts_valid_token_and_receives_events(appserver_ws_user):
host, port = appserver_ws_user
base_ws = f"http://{host}:{port}/v1/events"
token = _make_dummy_jwt("user-abc")
subprotocols = ["bearer", token]
# Connect and then trigger an event via HTTP create
async with aconnect_ws(base_ws, subprotocols=subprotocols) as ws:
# Emit an event to the user's room via a standard HTTP action
from httpx import AsyncClient
from reflector.app import app
from reflector.auth import current_user, current_user_optional
# Override auth dependencies so HTTP request is performed as the same user
app.dependency_overrides[current_user] = lambda: {
"sub": "user-abc",
"email": "user-abc@example.com",
}
app.dependency_overrides[current_user_optional] = lambda: {
"sub": "user-abc",
"email": "user-abc@example.com",
}
async with AsyncClient(app=app, base_url=f"http://{host}:{port}/v1") as ac:
# Create a transcript as this user so that the server publishes TRANSCRIPT_CREATED to user room
resp = await ac.post("/transcripts", json={"name": "WS Test"})
assert resp.status_code == 200
# Receive the published event
msg = await ws.receive_json()
assert msg["event"] == "TRANSCRIPT_CREATED"
assert "id" in msg["data"]
# Clean overrides
del app.dependency_overrides[current_user]
del app.dependency_overrides[current_user_optional]

View File

@@ -0,0 +1,17 @@
import pytest
from reflector.utils.daily import extract_base_room_name
@pytest.mark.parametrize(
"daily_room_name,expected",
[
("daily-20251020193458", "daily"),
("daily-2-20251020193458", "daily-2"),
("my-room-name-20251020193458", "my-room-name"),
("room-with-numbers-123-20251020193458", "room-with-numbers-123"),
("x-20251020193458", "x"),
],
)
def test_extract_base_room_name(daily_room_name, expected):
assert extract_base_room_name(daily_room_name) == expected

View File

@@ -0,0 +1,63 @@
"""Tests for URL utility functions."""
from reflector.utils.url import add_query_param
class TestAddQueryParam:
"""Test the add_query_param function."""
def test_add_param_to_url_without_query(self):
"""Should add query param with ? to URL without existing params."""
url = "https://example.com/room"
result = add_query_param(url, "t", "token123")
assert result == "https://example.com/room?t=token123"
def test_add_param_to_url_with_existing_query(self):
"""Should add query param with & to URL with existing params."""
url = "https://example.com/room?existing=param"
result = add_query_param(url, "t", "token123")
assert result == "https://example.com/room?existing=param&t=token123"
def test_add_param_to_url_with_multiple_existing_params(self):
"""Should add query param to URL with multiple existing params."""
url = "https://example.com/room?param1=value1&param2=value2"
result = add_query_param(url, "t", "token123")
assert (
result == "https://example.com/room?param1=value1&param2=value2&t=token123"
)
def test_add_param_with_special_characters(self):
"""Should properly encode special characters in param value."""
url = "https://example.com/room"
result = add_query_param(url, "name", "hello world")
assert result == "https://example.com/room?name=hello+world"
def test_add_param_to_url_with_fragment(self):
"""Should preserve URL fragment when adding query param."""
url = "https://example.com/room#section"
result = add_query_param(url, "t", "token123")
assert result == "https://example.com/room?t=token123#section"
def test_add_param_to_url_with_query_and_fragment(self):
"""Should preserve fragment when adding param to URL with existing query."""
url = "https://example.com/room?existing=param#section"
result = add_query_param(url, "t", "token123")
assert result == "https://example.com/room?existing=param&t=token123#section"
def test_add_param_overwrites_existing_param(self):
"""Should overwrite existing param with same name."""
url = "https://example.com/room?t=oldtoken"
result = add_query_param(url, "t", "newtoken")
assert result == "https://example.com/room?t=newtoken"
def test_url_without_scheme(self):
"""Should handle URLs without scheme (relative URLs)."""
url = "/room/path"
result = add_query_param(url, "t", "token123")
assert result == "/room/path?t=token123"
def test_empty_url(self):
"""Should handle empty URL."""
url = ""
result = add_query_param(url, "t", "token123")
assert result == "?t=token123"

View File

@@ -0,0 +1,58 @@
"""Tests for video_platforms.factory module."""
from unittest.mock import patch
from reflector.video_platforms.factory import get_platform
class TestGetPlatformF:
"""Test suite for get_platform function."""
@patch("reflector.video_platforms.factory.settings")
def test_with_room_platform(self, mock_settings):
"""When room_platform provided, should return room_platform."""
mock_settings.DEFAULT_VIDEO_PLATFORM = "whereby"
# Should return the room's platform when provided
assert get_platform(room_platform="daily") == "daily"
assert get_platform(room_platform="whereby") == "whereby"
@patch("reflector.video_platforms.factory.settings")
def test_without_room_platform_uses_default(self, mock_settings):
"""When no room_platform, should return DEFAULT_VIDEO_PLATFORM."""
mock_settings.DEFAULT_VIDEO_PLATFORM = "whereby"
# Should return default when room_platform is None
assert get_platform(room_platform=None) == "whereby"
@patch("reflector.video_platforms.factory.settings")
def test_with_daily_default(self, mock_settings):
"""When DEFAULT_VIDEO_PLATFORM is 'daily', should return 'daily' when no room_platform."""
mock_settings.DEFAULT_VIDEO_PLATFORM = "daily"
# Should return default 'daily' when room_platform is None
assert get_platform(room_platform=None) == "daily"
@patch("reflector.video_platforms.factory.settings")
def test_no_room_id_provided(self, mock_settings):
"""Should work correctly even when room_id is not provided."""
mock_settings.DEFAULT_VIDEO_PLATFORM = "whereby"
# Should use room_platform when provided
assert get_platform(room_platform="daily") == "daily"
# Should use default when room_platform not provided
assert get_platform(room_platform=None) == "whereby"
@patch("reflector.video_platforms.factory.settings")
def test_room_platform_always_takes_precedence(self, mock_settings):
"""room_platform should always be used when provided."""
mock_settings.DEFAULT_VIDEO_PLATFORM = "whereby"
# room_platform should take precedence over default
assert get_platform(room_platform="daily") == "daily"
assert get_platform(room_platform="whereby") == "whereby"
# Different default shouldn't matter when room_platform provided
mock_settings.DEFAULT_VIDEO_PLATFORM = "daily"
assert get_platform(room_platform="whereby") == "whereby"

14
www/.dockerignore Normal file
View File

@@ -0,0 +1,14 @@
.env
.env.*
.env.local
.env.development
.env.production
node_modules
.next
.git
.gitignore
*.md
.DS_Store
coverage
.pnpm-store
*.log

View File

@@ -1,9 +1,5 @@
# Environment
ENVIRONMENT=development
NEXT_PUBLIC_ENV=development
# Site Configuration
NEXT_PUBLIC_SITE_URL=http://localhost:3000
SITE_URL=http://localhost:3000
# Nextauth envs
# not used in app code but in lib code
@@ -18,16 +14,16 @@ AUTHENTIK_CLIENT_ID=your-client-id-here
AUTHENTIK_CLIENT_SECRET=your-client-secret-here
# Feature Flags
# NEXT_PUBLIC_FEATURE_REQUIRE_LOGIN=true
# NEXT_PUBLIC_FEATURE_PRIVACY=false
# NEXT_PUBLIC_FEATURE_BROWSE=true
# NEXT_PUBLIC_FEATURE_SEND_TO_ZULIP=true
# NEXT_PUBLIC_FEATURE_ROOMS=true
# FEATURE_REQUIRE_LOGIN=true
# FEATURE_PRIVACY=false
# FEATURE_BROWSE=true
# FEATURE_SEND_TO_ZULIP=true
# FEATURE_ROOMS=true
# API URLs
NEXT_PUBLIC_API_URL=http://127.0.0.1:1250
NEXT_PUBLIC_WEBSOCKET_URL=ws://127.0.0.1:1250
NEXT_PUBLIC_AUTH_CALLBACK_URL=http://localhost:3000/auth-callback
API_URL=http://127.0.0.1:1250
WEBSOCKET_URL=ws://127.0.0.1:1250
AUTH_CALLBACK_URL=http://localhost:3000/auth-callback
# Sentry
# SENTRY_DSN=https://your-dsn@sentry.io/project-id

81
www/DOCKER_README.md Normal file
View File

@@ -0,0 +1,81 @@
# Docker Production Build Guide
## Overview
The Docker image builds without any environment variables and requires all configuration to be provided at runtime.
## Environment Variables (ALL Runtime)
### Required Runtime Variables
```bash
API_URL # Backend API URL (e.g., https://api.example.com)
WEBSOCKET_URL # WebSocket URL (e.g., wss://api.example.com)
NEXTAUTH_URL # NextAuth base URL (e.g., https://app.example.com)
NEXTAUTH_SECRET # Random secret for NextAuth (generate with: openssl rand -base64 32)
KV_URL # Redis URL (e.g., redis://redis:6379)
```
### Optional Runtime Variables
```bash
SITE_URL # Frontend URL (defaults to NEXTAUTH_URL)
AUTHENTIK_ISSUER # OAuth issuer URL
AUTHENTIK_CLIENT_ID # OAuth client ID
AUTHENTIK_CLIENT_SECRET # OAuth client secret
AUTHENTIK_REFRESH_TOKEN_URL # OAuth token refresh URL
FEATURE_REQUIRE_LOGIN=false # Require authentication
FEATURE_PRIVACY=true # Enable privacy features
FEATURE_BROWSE=true # Enable browsing features
FEATURE_SEND_TO_ZULIP=false # Enable Zulip integration
FEATURE_ROOMS=true # Enable rooms feature
SENTRY_DSN # Sentry error tracking
AUTH_CALLBACK_URL # OAuth callback URL
```
## Building the Image
### Option 1: Using Docker Compose
1. Build the image (no environment variables needed):
```bash
docker compose -f docker-compose.prod.yml build
```
2. Create a `.env` file with runtime variables
3. Run with environment variables:
```bash
docker compose -f docker-compose.prod.yml --env-file .env up -d
```
### Option 2: Using Docker CLI
1. Build the image (no build args):
```bash
docker build -t reflector-frontend:latest ./www
```
2. Run with environment variables:
```bash
docker run -d \
-p 3000:3000 \
-e API_URL=https://api.example.com \
-e WEBSOCKET_URL=wss://api.example.com \
-e NEXTAUTH_URL=https://app.example.com \
-e NEXTAUTH_SECRET=your-secret \
-e KV_URL=redis://redis:6379 \
-e AUTHENTIK_ISSUER=https://auth.example.com/application/o/reflector \
-e AUTHENTIK_CLIENT_ID=your-client-id \
-e AUTHENTIK_CLIENT_SECRET=your-client-secret \
-e AUTHENTIK_REFRESH_TOKEN_URL=https://auth.example.com/application/o/token/ \
-e FEATURE_REQUIRE_LOGIN=true \
reflector-frontend:latest
```

View File

@@ -24,7 +24,8 @@ COPY --link . .
ENV NEXT_TELEMETRY_DISABLED 1
# If using npm comment out above and use below instead
RUN pnpm build
# next.js has the feature of excluding build step planned https://github.com/vercel/next.js/discussions/46544
RUN pnpm build-production
# RUN npm run build
# Production image, copy all the files and run next
@@ -51,6 +52,10 @@ USER nextjs
EXPOSE 3000
ENV PORT 3000
ENV HOSTNAME localhost
ENV HOSTNAME 0.0.0.0
HEALTHCHECK --interval=30s --timeout=3s --start-period=40s --retries=3 \
CMD wget --no-verbose --tries=1 --spider http://127.0.0.1:3000/api/health \
|| exit 1
CMD ["node", "server.js"]

View File

@@ -1,5 +1,5 @@
"use client";
import React, { useState, useEffect } from "react";
import React, { useState, useEffect, useMemo } from "react";
import {
Flex,
Spinner,
@@ -235,15 +235,26 @@ export default function TranscriptBrowser() {
const pageSize = 20;
// must be json-able
const searchFilters = useMemo(
() => ({
q: urlSearchQuery,
extras: {
room_id: urlRoomId || undefined,
source_kind: urlSourceKind || undefined,
},
}),
[urlSearchQuery, urlRoomId, urlSourceKind],
);
const {
data: searchData,
isLoading: searchLoading,
refetch: reloadSearch,
} = useTranscriptsSearch(urlSearchQuery, {
} = useTranscriptsSearch(searchFilters.q, {
limit: pageSize,
offset: paginationPageTo0Based(page) * pageSize,
room_id: urlRoomId || undefined,
source_kind: urlSourceKind || undefined,
...searchFilters.extras,
});
const results = searchData?.results || [];
@@ -255,6 +266,12 @@ export default function TranscriptBrowser() {
const totalPages = getTotalPages(totalResults, pageSize);
// reset pagination when search results change (detected by total change; good enough approximation)
useEffect(() => {
// operation is idempotent
setPage(FIRST_PAGE).then(() => {});
}, [JSON.stringify(searchFilters)]);
const userName = useUserName();
const [deletionLoading, setDeletionLoading] = useState(false);
const cancelRef = React.useRef(null);

View File

@@ -78,6 +78,14 @@ export default async function AppLayout({
)}
{featureEnabled("requireLogin") ? (
<>
&nbsp;·&nbsp;
<Link
href="/settings/api-keys"
as={NextLink}
className="font-light px-2"
>
Settings
</Link>
&nbsp;·&nbsp;
<UserInfo />
</>

View File

@@ -200,7 +200,13 @@ export default function ICSSettings({
<HStack gap={0} position="relative" width="100%">
<Input
ref={roomUrlInputRef}
value={roomAbsoluteUrl(parseNonEmptyString(roomName))}
value={roomAbsoluteUrl(
parseNonEmptyString(
roomName,
true,
"panic! roomName is required",
),
)}
readOnly
onClick={handleRoomUrlClick}
cursor="pointer"

View File

@@ -274,15 +274,31 @@ export function RoomTable({
<IconButton
aria-label="Force sync calendar"
onClick={() =>
handleForceSync(parseNonEmptyString(room.name))
handleForceSync(
parseNonEmptyString(
room.name,
true,
"panic! room.name is required",
),
)
}
size="sm"
variant="ghost"
disabled={syncingRooms.has(
parseNonEmptyString(room.name),
parseNonEmptyString(
room.name,
true,
"panic! room.name is required",
),
)}
>
{syncingRooms.has(parseNonEmptyString(room.name)) ? (
{syncingRooms.has(
parseNonEmptyString(
room.name,
true,
"panic! room.name is required",
),
) ? (
<Spinner size="sm" />
) : (
<CalendarSyncIcon />
@@ -297,7 +313,13 @@ export function RoomTable({
<IconButton
aria-label="Copy URL"
onClick={() =>
onCopyUrl(parseNonEmptyString(room.name))
onCopyUrl(
parseNonEmptyString(
room.name,
true,
"panic! room.name is required",
),
)
}
size="sm"
variant="ghost"

View File

@@ -833,7 +833,13 @@ export default function RoomsList() {
<Field.Root>
<ICSSettings
roomName={
room.name ? parseNonEmptyString(room.name) : null
room.name
? parseNonEmptyString(
room.name,
true,
"panic! room.name required",
)
: null
}
icsUrl={room.icsUrl}
icsEnabled={room.icsEnabled}

View File

@@ -0,0 +1,341 @@
"use client";
import React, { useState, useRef } from "react";
import {
Box,
Button,
Heading,
Stack,
Text,
Input,
Table,
Flex,
IconButton,
Code,
Dialog,
} from "@chakra-ui/react";
import { LuTrash2, LuCopy, LuPlus } from "react-icons/lu";
import { useQueryClient } from "@tanstack/react-query";
import { $api } from "../../../lib/apiClient";
import { toaster } from "../../../components/ui/toaster";
interface CreateApiKeyResponse {
id: string;
user_id: string;
name: string | null;
created_at: string;
key: string;
}
export default function ApiKeysPage() {
const [newKeyName, setNewKeyName] = useState("");
const [isCreating, setIsCreating] = useState(false);
const [createdKey, setCreatedKey] = useState<CreateApiKeyResponse | null>(
null,
);
const [keyToDelete, setKeyToDelete] = useState<string | null>(null);
const queryClient = useQueryClient();
const cancelRef = useRef<HTMLButtonElement>(null);
const { data: apiKeys, isLoading } = $api.useQuery(
"get",
"/v1/user/api-keys",
);
const createKeyMutation = $api.useMutation("post", "/v1/user/api-keys", {
onSuccess: (data) => {
setCreatedKey(data);
setNewKeyName("");
setIsCreating(false);
queryClient.invalidateQueries({ queryKey: ["get", "/v1/user/api-keys"] });
toaster.create({
duration: 5000,
render: () => (
<Box bg="green.500" color="white" px={4} py={3} borderRadius="md">
<Text fontWeight="bold">API key created</Text>
<Text fontSize="sm">
Make sure to copy it now - you won't see it again!
</Text>
</Box>
),
});
setTimeout(() => {
const keyElement = document.querySelector(".api-key-code");
if (keyElement) {
const range = document.createRange();
range.selectNodeContents(keyElement);
const selection = window.getSelection();
selection?.removeAllRanges();
selection?.addRange(range);
}
}, 100);
},
onError: () => {
toaster.create({
duration: 3000,
render: () => (
<Box bg="red.500" color="white" px={4} py={3} borderRadius="md">
<Text fontWeight="bold">Error</Text>
<Text fontSize="sm">Failed to create API key</Text>
</Box>
),
});
},
});
const deleteKeyMutation = $api.useMutation(
"delete",
"/v1/user/api-keys/{key_id}",
{
onSuccess: () => {
queryClient.invalidateQueries({
queryKey: ["get", "/v1/user/api-keys"],
});
toaster.create({
duration: 3000,
render: () => (
<Box bg="green.500" color="white" px={4} py={3} borderRadius="md">
<Text fontWeight="bold">API key deleted</Text>
</Box>
),
});
},
onError: () => {
toaster.create({
duration: 3000,
render: () => (
<Box bg="red.500" color="white" px={4} py={3} borderRadius="md">
<Text fontWeight="bold">Error</Text>
<Text fontSize="sm">Failed to delete API key</Text>
</Box>
),
});
},
},
);
const handleCreateKey = () => {
createKeyMutation.mutate({
body: { name: newKeyName || null },
});
};
const handleCopyKey = (key: string) => {
navigator.clipboard.writeText(key);
toaster.create({
duration: 2000,
render: () => (
<Box bg="green.500" color="white" px={4} py={3} borderRadius="md">
<Text fontWeight="bold">Copied to clipboard</Text>
</Box>
),
});
};
const handleDeleteRequest = (keyId: string) => {
setKeyToDelete(keyId);
};
const confirmDelete = () => {
if (keyToDelete) {
deleteKeyMutation.mutate({
params: { path: { key_id: keyToDelete } },
});
setKeyToDelete(null);
}
};
const formatDate = (dateString: string) => {
return new Date(dateString).toLocaleDateString("en-US", {
year: "numeric",
month: "short",
day: "numeric",
hour: "2-digit",
minute: "2-digit",
});
};
return (
<Box maxW="800px" w="100%" mx="auto" p={8}>
<Heading mb={2}>API Keys</Heading>
<Text color="gray.600" mb={6}>
Manage your API keys for programmatic access to Reflector
</Text>
{/* Show newly created key */}
{createdKey && (
<Box
mb={6}
p={4}
bg="green.50"
borderWidth={1}
borderColor="green.200"
borderRadius="md"
>
<Flex justify="space-between" align="start" mb={2}>
<Heading size="sm" color="green.800">
API Key Created
</Heading>
<Button
size="sm"
variant="ghost"
onClick={() => setCreatedKey(null)}
>
×
</Button>
</Flex>
<Text mb={2} fontSize="sm" color="green.700">
Make sure to copy your API key now. You won't be able to see it
again!
</Text>
<Flex gap={2} align="center">
<Code
p={2}
flex={1}
fontSize="sm"
bg="white"
className="api-key-code"
>
{createdKey.key}
</Code>
<IconButton
aria-label="Copy API key"
size="sm"
onClick={() => handleCopyKey(createdKey.key)}
>
<LuCopy />
</IconButton>
</Flex>
</Box>
)}
{/* Create new key */}
<Box mb={8} p={6} borderWidth={1} borderRadius="md">
<Heading size="md" mb={4}>
Create New API Key
</Heading>
{!isCreating ? (
<Button onClick={() => setIsCreating(true)} colorPalette="blue">
<LuPlus /> Create API Key
</Button>
) : (
<Stack gap={4}>
<Box>
<Text mb={2}>Name (optional)</Text>
<Input
placeholder="e.g., Production API Key"
value={newKeyName}
onChange={(e) => setNewKeyName(e.target.value)}
/>
</Box>
<Flex gap={2}>
<Button
onClick={handleCreateKey}
colorPalette="blue"
loading={createKeyMutation.isPending}
>
Create
</Button>
<Button
onClick={() => {
setIsCreating(false);
setNewKeyName("");
}}
variant="outline"
>
Cancel
</Button>
</Flex>
</Stack>
)}
</Box>
{/* List of API keys */}
<Box>
<Heading size="md" mb={4}>
Your API Keys
</Heading>
{isLoading ? (
<Text>Loading...</Text>
) : !apiKeys || apiKeys.length === 0 ? (
<Text color="gray.600">
No API keys yet. Create one to get started.
</Text>
) : (
<Table.Root>
<Table.Header>
<Table.Row>
<Table.ColumnHeader>Name</Table.ColumnHeader>
<Table.ColumnHeader>Created</Table.ColumnHeader>
<Table.ColumnHeader>Actions</Table.ColumnHeader>
</Table.Row>
</Table.Header>
<Table.Body>
{apiKeys.map((key) => (
<Table.Row key={key.id}>
<Table.Cell>
{key.name || <Text color="gray.500">Unnamed</Text>}
</Table.Cell>
<Table.Cell>{formatDate(key.created_at)}</Table.Cell>
<Table.Cell>
<IconButton
aria-label="Delete API key"
size="sm"
colorPalette="red"
variant="ghost"
onClick={() => handleDeleteRequest(key.id)}
loading={
deleteKeyMutation.isPending &&
deleteKeyMutation.variables?.params?.path?.key_id ===
key.id
}
>
<LuTrash2 />
</IconButton>
</Table.Cell>
</Table.Row>
))}
</Table.Body>
</Table.Root>
)}
</Box>
{/* Delete confirmation dialog */}
<Dialog.Root
open={!!keyToDelete}
onOpenChange={(e) => {
if (!e.open) setKeyToDelete(null);
}}
initialFocusEl={() => cancelRef.current}
>
<Dialog.Backdrop />
<Dialog.Positioner>
<Dialog.Content>
<Dialog.Header fontSize="lg" fontWeight="bold">
Delete API Key
</Dialog.Header>
<Dialog.Body>
<Text>
Are you sure you want to delete this API key? This action cannot
be undone.
</Text>
</Dialog.Body>
<Dialog.Footer>
<Button
ref={cancelRef}
onClick={() => setKeyToDelete(null)}
variant="outline"
colorPalette="gray"
>
Cancel
</Button>
<Button colorPalette="red" onClick={confirmDelete} ml={3}>
Delete
</Button>
</Dialog.Footer>
</Dialog.Content>
</Dialog.Positioner>
</Dialog.Root>
</Box>
);
}

Some files were not shown because too many files have changed in this diff Show More