Compare commits

..

5 Commits

Author SHA1 Message Date
770761b3f9 docs: update vide docs 2025-08-04 19:30:48 -06:00
f191811e23 fix: daily.co initial support works 2025-08-04 19:06:15 -06:00
6b3c193672 docs: update vibe docs 2025-08-04 18:50:55 -06:00
06869ef5ca fix: alembic upgrade 2025-08-04 11:15:43 -06:00
8b644384a2 chore: remove refactor md (#527) 2025-08-01 18:22:50 -06:00
137 changed files with 12532 additions and 22556 deletions

View File

@@ -17,40 +17,10 @@ on:
jobs:
test-migrations:
runs-on: ubuntu-latest
services:
postgres:
image: postgres:17
env:
POSTGRES_USER: reflector
POSTGRES_PASSWORD: reflector
POSTGRES_DB: reflector
ports:
- 5432:5432
options: >-
--health-cmd pg_isready -h 127.0.0.1 -p 5432
--health-interval 10s
--health-timeout 5s
--health-retries 5
env:
DATABASE_URL: postgresql://reflector:reflector@localhost:5432/reflector
steps:
- uses: actions/checkout@v4
- name: Install PostgreSQL client
run: sudo apt-get update && sudo apt-get install -y postgresql-client | cat
- name: Wait for Postgres
run: |
for i in {1..30}; do
if pg_isready -h localhost -p 5432; then
echo "Postgres is ready"
break
fi
echo "Waiting for Postgres... ($i)" && sleep 1
done
- name: Install uv
uses: astral-sh/setup-uv@v3
with:

View File

@@ -1,24 +0,0 @@
name: pre-commit
on:
pull_request:
push:
branches: [main]
jobs:
pre-commit:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v5
- uses: actions/setup-python@v5
- uses: pnpm/action-setup@v4
with:
version: 10
- uses: actions/setup-node@v4
with:
node-version: 22
cache: "pnpm"
cache-dependency-path: "www/pnpm-lock.yaml"
- name: Install dependencies
run: cd www && pnpm install --frozen-lockfile
- uses: pre-commit/action@v3.0.1

2
.gitignore vendored
View File

@@ -13,5 +13,3 @@ restart-dev.sh
data/
www/REFACTOR.md
www/reload-frontend
server/test.sqlite
CLAUDE.local.md

View File

@@ -3,10 +3,10 @@
repos:
- repo: local
hooks:
- id: format
name: run format
- id: yarn-format
name: run yarn format
language: system
entry: bash -c 'cd www && pnpm format'
entry: bash -c 'cd www && yarn format'
pass_filenames: false
files: ^www/
@@ -23,7 +23,8 @@ repos:
- id: ruff
args:
- --fix
# Uses select rules from server/pyproject.toml
- --select
- I,F401
files: ^server/
- id: ruff-format
files: ^server/

View File

@@ -1,32 +1,5 @@
# Changelog
## [0.6.1](https://github.com/Monadical-SAS/reflector/compare/v0.6.0...v0.6.1) (2025-08-06)
### Bug Fixes
* delayed waveform loading ([#538](https://github.com/Monadical-SAS/reflector/issues/538)) ([ef64146](https://github.com/Monadical-SAS/reflector/commit/ef64146325d03f64dd9a1fe40234fb3e7e957ae2))
## [0.6.0](https://github.com/Monadical-SAS/reflector/compare/v0.5.0...v0.6.0) (2025-08-05)
### ⚠ BREAKING CHANGES
* Configuration keys have changed. Update your .env file:
- TRANSCRIPT_MODAL_API_KEY → TRANSCRIPT_API_KEY
- LLM_MODAL_API_KEY → (removed, use TRANSCRIPT_API_KEY)
- Add DIARIZATION_API_KEY and TRANSLATE_API_KEY if using those services
### Features
* implement service-specific Modal API keys with auto processor pattern ([#528](https://github.com/Monadical-SAS/reflector/issues/528)) ([650befb](https://github.com/Monadical-SAS/reflector/commit/650befb291c47a1f49e94a01ab37d8fdfcd2b65d))
* use llamaindex everywhere ([#525](https://github.com/Monadical-SAS/reflector/issues/525)) ([3141d17](https://github.com/Monadical-SAS/reflector/commit/3141d172bc4d3b3d533370c8e6e351ea762169bf))
### Miscellaneous Chores
* **main:** release 0.6.0 ([ecdbf00](https://github.com/Monadical-SAS/reflector/commit/ecdbf003ea2476c3e95fd231adaeb852f2943df0))
## [0.5.0](https://github.com/Monadical-SAS/reflector/compare/v0.4.0...v0.5.0) (2025-07-31)

View File

@@ -62,7 +62,7 @@ uv run python -m reflector.tools.process path/to/audio.wav
**Setup:**
```bash
# Install dependencies
pnpm install
yarn install
# Copy configuration templates
cp .env_template .env
@@ -72,19 +72,19 @@ cp config-template.ts config.ts
**Development:**
```bash
# Start development server
pnpm dev
yarn dev
# Generate TypeScript API client from OpenAPI spec
pnpm openapi
yarn openapi
# Lint code
pnpm lint
yarn lint
# Format code
pnpm format
yarn format
# Build for production
pnpm build
yarn build
```
### Docker Compose (Full Stack)
@@ -144,9 +144,7 @@ All endpoints prefixed `/v1/`:
**Backend** (`server/.env`):
- `DATABASE_URL` - Database connection string
- `REDIS_URL` - Redis broker for Celery
- `TRANSCRIPT_BACKEND=modal` + `TRANSCRIPT_MODAL_API_KEY` - Modal.com transcription
- `DIARIZATION_BACKEND=modal` + `DIARIZATION_MODAL_API_KEY` - Modal.com diarization
- `TRANSLATION_BACKEND=modal` + `TRANSLATION_MODAL_API_KEY` - Modal.com translation
- `MODAL_TOKEN_ID`, `MODAL_TOKEN_SECRET` - Modal.com GPU processing
- `WHEREBY_API_KEY` - Video platform integration
- `REFLECTOR_AUTH_BACKEND` - Authentication method (none, jwt)

View File

@@ -1,497 +0,0 @@
# ICS Calendar Integration - Implementation Guide
## Overview
This document provides detailed implementation guidance for integrating ICS calendar feeds with Reflector rooms. Unlike CalDAV which requires complex authentication and protocol handling, ICS integration uses simple HTTP(S) fetching of calendar files.
## Key Differences from CalDAV Approach
| Aspect | CalDAV | ICS |
|--------|--------|-----|
| Protocol | WebDAV extension | HTTP/HTTPS GET |
| Authentication | Username/password, OAuth | Tokens embedded in URL |
| Data Access | Selective event queries | Full calendar download |
| Implementation | Complex (caldav library) | Simple (requests + icalendar) |
| Real-time Updates | Supported | Polling only |
| Write Access | Yes | No (read-only) |
## Technical Architecture
### 1. ICS Fetching Service
```python
# reflector/services/ics_sync.py
import requests
from icalendar import Calendar
from typing import List, Optional
from datetime import datetime, timedelta
class ICSFetchService:
def __init__(self):
self.session = requests.Session()
self.session.headers.update({'User-Agent': 'Reflector/1.0'})
def fetch_ics(self, url: str) -> str:
"""Fetch ICS file from URL (authentication via URL token if needed)."""
response = self.session.get(url, timeout=30)
response.raise_for_status()
return response.text
def parse_ics(self, ics_content: str) -> Calendar:
"""Parse ICS content into calendar object."""
return Calendar.from_ical(ics_content)
def extract_room_events(self, calendar: Calendar, room_url: str) -> List[dict]:
"""Extract events that match the room URL."""
events = []
for component in calendar.walk():
if component.name == "VEVENT":
# Check if event matches this room
if self._event_matches_room(component, room_url):
events.append(self._parse_event(component))
return events
def _event_matches_room(self, event, room_url: str) -> bool:
"""Check if event location or description contains room URL."""
location = str(event.get('LOCATION', ''))
description = str(event.get('DESCRIPTION', ''))
# Support various URL formats
patterns = [
room_url,
room_url.replace('https://', ''),
room_url.split('/')[-1], # Just room name
]
for pattern in patterns:
if pattern in location or pattern in description:
return True
return False
```
### 2. Database Schema
```sql
-- Modify room table
ALTER TABLE room ADD COLUMN ics_url TEXT; -- encrypted to protect embedded tokens
ALTER TABLE room ADD COLUMN ics_fetch_interval INTEGER DEFAULT 300; -- seconds
ALTER TABLE room ADD COLUMN ics_enabled BOOLEAN DEFAULT FALSE;
ALTER TABLE room ADD COLUMN ics_last_sync TIMESTAMP;
ALTER TABLE room ADD COLUMN ics_last_etag TEXT; -- for caching
-- Calendar events table
CREATE TABLE calendar_event (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
room_id UUID REFERENCES room(id) ON DELETE CASCADE,
external_id TEXT NOT NULL, -- ICS UID
title TEXT,
description TEXT,
start_time TIMESTAMP NOT NULL,
end_time TIMESTAMP NOT NULL,
attendees JSONB,
location TEXT,
ics_raw_data TEXT, -- Store raw VEVENT for reference
last_synced TIMESTAMP DEFAULT NOW(),
is_deleted BOOLEAN DEFAULT FALSE,
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW(),
UNIQUE(room_id, external_id)
);
-- Index for efficient queries
CREATE INDEX idx_calendar_event_room_start ON calendar_event(room_id, start_time);
CREATE INDEX idx_calendar_event_deleted ON calendar_event(is_deleted) WHERE NOT is_deleted;
```
### 3. Background Tasks
```python
# reflector/worker/tasks/ics_sync.py
from celery import shared_task
from datetime import datetime, timedelta
import hashlib
@shared_task
def sync_ics_calendars():
"""Sync all enabled ICS calendars based on their fetch intervals."""
rooms = Room.query.filter_by(ics_enabled=True).all()
for room in rooms:
# Check if it's time to sync based on fetch interval
if should_sync(room):
sync_room_calendar.delay(room.id)
@shared_task
def sync_room_calendar(room_id: str):
"""Sync calendar for a specific room."""
room = Room.query.get(room_id)
if not room or not room.ics_enabled:
return
try:
# Fetch ICS file (decrypt URL first)
service = ICSFetchService()
decrypted_url = decrypt_ics_url(room.ics_url)
ics_content = service.fetch_ics(decrypted_url)
# Check if content changed (using ETag or hash)
content_hash = hashlib.md5(ics_content.encode()).hexdigest()
if room.ics_last_etag == content_hash:
logger.info(f"No changes in ICS for room {room_id}")
return
# Parse and extract events
calendar = service.parse_ics(ics_content)
events = service.extract_room_events(calendar, room.url)
# Update database
sync_events_to_database(room_id, events)
# Update sync metadata
room.ics_last_sync = datetime.utcnow()
room.ics_last_etag = content_hash
db.session.commit()
except Exception as e:
logger.error(f"Failed to sync ICS for room {room_id}: {e}")
def should_sync(room) -> bool:
"""Check if room calendar should be synced."""
if not room.ics_last_sync:
return True
time_since_sync = datetime.utcnow() - room.ics_last_sync
return time_since_sync.total_seconds() >= room.ics_fetch_interval
```
### 4. Celery Beat Schedule
```python
# reflector/worker/celeryconfig.py
from celery.schedules import crontab
beat_schedule = {
'sync-ics-calendars': {
'task': 'reflector.worker.tasks.ics_sync.sync_ics_calendars',
'schedule': 60.0, # Check every minute which calendars need syncing
},
'pre-create-meetings': {
'task': 'reflector.worker.tasks.ics_sync.pre_create_calendar_meetings',
'schedule': 60.0, # Check every minute for upcoming meetings
},
}
```
## API Endpoints
### Room ICS Configuration
```python
# PATCH /v1/rooms/{room_id}
{
"ics_url": "https://calendar.google.com/calendar/ical/.../private-token/basic.ics",
"ics_fetch_interval": 300, # seconds
"ics_enabled": true
# URL will be encrypted in database to protect embedded tokens
}
```
### Manual Sync Trigger
```python
# POST /v1/rooms/{room_name}/ics/sync
# Response:
{
"status": "syncing",
"last_sync": "2024-01-15T10:30:00Z",
"events_found": 5
}
```
### ICS Status
```python
# GET /v1/rooms/{room_name}/ics/status
# Response:
{
"enabled": true,
"last_sync": "2024-01-15T10:30:00Z",
"next_sync": "2024-01-15T10:35:00Z",
"fetch_interval": 300,
"events_count": 12,
"upcoming_events": 3
}
```
## ICS Parsing Details
### Event Field Mapping
| ICS Field | Database Field | Notes |
|-----------|---------------|-------|
| UID | external_id | Unique identifier |
| SUMMARY | title | Event title |
| DESCRIPTION | description | Full description |
| DTSTART | start_time | Convert to UTC |
| DTEND | end_time | Convert to UTC |
| LOCATION | location | Check for room URL |
| ATTENDEE | attendees | Parse into JSON |
| ORGANIZER | attendees | Add as organizer |
| STATUS | (internal) | Filter cancelled events |
### Handling Recurring Events
```python
def expand_recurring_events(event, start_date, end_date):
"""Expand recurring events into individual occurrences."""
from dateutil.rrule import rrulestr
if 'RRULE' not in event:
return [event]
# Parse recurrence rule
rrule_str = event['RRULE'].to_ical().decode()
dtstart = event['DTSTART'].dt
# Generate occurrences
rrule = rrulestr(rrule_str, dtstart=dtstart)
occurrences = []
for dt in rrule.between(start_date, end_date):
# Clone event with new date
occurrence = event.copy()
occurrence['DTSTART'].dt = dt
if 'DTEND' in event:
duration = event['DTEND'].dt - event['DTSTART'].dt
occurrence['DTEND'].dt = dt + duration
# Unique ID for each occurrence
occurrence['UID'] = f"{event['UID']}_{dt.isoformat()}"
occurrences.append(occurrence)
return occurrences
```
### Timezone Handling
```python
def normalize_datetime(dt):
"""Convert various datetime formats to UTC."""
import pytz
from datetime import datetime
if hasattr(dt, 'dt'): # icalendar property
dt = dt.dt
if isinstance(dt, datetime):
if dt.tzinfo is None:
# Assume local timezone if naive
dt = pytz.timezone('UTC').localize(dt)
else:
# Convert to UTC
dt = dt.astimezone(pytz.UTC)
return dt
```
## Security Considerations
### 1. URL Validation
```python
def validate_ics_url(url: str) -> bool:
"""Validate ICS URL for security."""
from urllib.parse import urlparse
parsed = urlparse(url)
# Must be HTTPS in production
if not settings.DEBUG and parsed.scheme != 'https':
return False
# Prevent local file access
if parsed.scheme in ('file', 'ftp'):
return False
# Prevent internal network access
if is_internal_ip(parsed.hostname):
return False
return True
```
### 2. Rate Limiting
```python
# Implement per-room rate limiting
RATE_LIMITS = {
'min_fetch_interval': 60, # Minimum 1 minute between fetches
'max_requests_per_hour': 60, # Max 60 requests per hour per room
'max_file_size': 10 * 1024 * 1024, # Max 10MB ICS file
}
```
### 3. ICS URL Encryption
```python
from cryptography.fernet import Fernet
class URLEncryption:
def __init__(self):
self.cipher = Fernet(settings.ENCRYPTION_KEY)
def encrypt_url(self, url: str) -> str:
"""Encrypt ICS URL to protect embedded tokens."""
return self.cipher.encrypt(url.encode()).decode()
def decrypt_url(self, encrypted: str) -> str:
"""Decrypt ICS URL for fetching."""
return self.cipher.decrypt(encrypted.encode()).decode()
def mask_url(self, url: str) -> str:
"""Mask sensitive parts of URL for display."""
from urllib.parse import urlparse, urlunparse
parsed = urlparse(url)
# Keep scheme, host, and path structure but mask tokens
if '/private-' in parsed.path:
# Google Calendar format
parts = parsed.path.split('/private-')
masked_path = parts[0] + '/private-***' + parts[1].split('/')[-1]
elif 'token=' in url:
# Query parameter token
masked_path = parsed.path
parsed = parsed._replace(query='token=***')
else:
# Generic masking of path segments that look like tokens
import re
masked_path = re.sub(r'/[a-zA-Z0-9]{20,}/', '/***/', parsed.path)
return urlunparse(parsed._replace(path=masked_path))
```
## Testing Strategy
### 1. Unit Tests
```python
# tests/test_ics_sync.py
def test_ics_parsing():
"""Test ICS file parsing."""
ics_content = """BEGIN:VCALENDAR
VERSION:2.0
BEGIN:VEVENT
UID:test-123
SUMMARY:Team Meeting
LOCATION:https://reflector.monadical.com/engineering
DTSTART:20240115T100000Z
DTEND:20240115T110000Z
END:VEVENT
END:VCALENDAR"""
service = ICSFetchService()
calendar = service.parse_ics(ics_content)
events = service.extract_room_events(
calendar,
"https://reflector.monadical.com/engineering"
)
assert len(events) == 1
assert events[0]['title'] == 'Team Meeting'
```
### 2. Integration Tests
```python
def test_full_sync_flow():
"""Test complete sync workflow."""
# Create room with ICS URL (encrypt URL to protect tokens)
encryption = URLEncryption()
room = Room(
name="test-room",
ics_url=encryption.encrypt_url("https://example.com/calendar.ics?token=secret"),
ics_enabled=True
)
# Mock ICS fetch
with patch('requests.get') as mock_get:
mock_get.return_value.text = sample_ics_content
# Run sync
sync_room_calendar(room.id)
# Verify events created
events = CalendarEvent.query.filter_by(room_id=room.id).all()
assert len(events) > 0
```
## Common ICS Provider Configurations
### Google Calendar
- URL Format: `https://calendar.google.com/calendar/ical/{calendar_id}/private-{token}/basic.ics`
- Authentication via token embedded in URL
- Updates every 3-8 hours by default
### Outlook/Office 365
- URL Format: `https://outlook.office365.com/owa/calendar/{id}/calendar.ics`
- May include token in URL path or query parameters
- Real-time updates
### Apple iCloud
- URL Format: `webcal://p{XX}-caldav.icloud.com/published/2/{token}`
- Convert webcal:// to https://
- Token embedded in URL path
- Public calendars only
### Nextcloud/ownCloud
- URL Format: `https://cloud.example.com/remote.php/dav/public-calendars/{token}`
- Token embedded in URL path
- Configurable update frequency
## Migration from CalDAV
If migrating from an existing CalDAV implementation:
1. **Database Migration**: Rename fields from `caldav_*` to `ics_*`
2. **URL Conversion**: Most CalDAV servers provide ICS export endpoints
3. **Authentication**: Convert from username/password to URL-embedded tokens
4. **Remove Dependencies**: Uninstall caldav library, add icalendar
5. **Update Background Tasks**: Replace CalDAV sync with ICS fetch
## Performance Optimizations
1. **Caching**: Use ETag/Last-Modified headers to avoid refetching unchanged calendars
2. **Incremental Sync**: Store last sync timestamp, only process new/modified events
3. **Batch Processing**: Process multiple room calendars in parallel
4. **Connection Pooling**: Reuse HTTP connections for multiple requests
5. **Compression**: Support gzip encoding for large ICS files
## Monitoring and Debugging
### Metrics to Track
- Sync success/failure rate per room
- Average sync duration
- ICS file sizes
- Number of events processed
- Failed event matches
### Debug Logging
```python
logger.debug(f"Fetching ICS from {room.ics_url}")
logger.debug(f"ICS content size: {len(ics_content)} bytes")
logger.debug(f"Found {len(events)} matching events")
logger.debug(f"Event UIDs: {[e['external_id'] for e in events]}")
```
### Common Issues
1. **SSL Certificate Errors**: Add certificate validation options
2. **Timeout Issues**: Increase timeout for large calendars
3. **Encoding Problems**: Handle various character encodings
4. **Timezone Mismatches**: Always convert to UTC
5. **Memory Issues**: Stream large ICS files instead of loading entirely

264
IMPLEMENTATION_STATUS.md Normal file
View File

@@ -0,0 +1,264 @@
# Daily.co Migration Implementation Status
## Completed Components
### 1. Platform Abstraction Layer (`server/reflector/video_platforms/`)
- **base.py**: Abstract interface defining all platform operations
- **whereby.py**: Whereby implementation wrapping existing functionality
- **daily.py**: Daily.co client implementation (ready for testing when credentials available)
- **mock.py**: Mock implementation for unit testing
- **registry.py**: Platform registration and discovery
- **factory.py**: Factory methods for creating platform clients
### 2. Database Updates
- **Models**: Added `platform` field to Room and Meeting tables
- **Migration**: Created migration `20250801180012_add_platform_support.py`
- **Controllers**: Updated to handle platform field
### 3. Configuration
- **Settings**: Added Daily.co configuration variables
- **Feature Flags**:
- `DAILY_MIGRATION_ENABLED`: Master switch for migration
- `DAILY_MIGRATION_ROOM_IDS`: List of specific rooms to migrate
- `DEFAULT_VIDEO_PLATFORM`: Default platform when migration enabled
### 4. Backend API Updates
- **Room Creation**: Now assigns platform based on feature flags
- **Meeting Creation**: Uses platform abstraction instead of direct Whereby calls
- **Response Models**: Include platform field
- **Webhook Handler**: Added Daily.co webhook endpoint at `/v1/daily_webhook`
### 5. Frontend Components (`www/app/[roomName]/components/`)
- **RoomContainer.tsx**: Platform-agnostic container that routes to appropriate component
- **WherebyRoom.tsx**: Extracted existing Whereby functionality with consent management
- **DailyRoom.tsx**: Daily.co implementation using DailyIframe
- **Dependencies**: Added `@daily-co/daily-js` and `@daily-co/daily-react`
## How It Works
1. **Platform Selection**:
- If `DAILY_MIGRATION_ENABLED=false` → Always use Whereby
- If enabled and room ID in `DAILY_MIGRATION_ROOM_IDS` → Use Daily
- Otherwise → Use `DEFAULT_VIDEO_PLATFORM`
2. **Meeting Creation Flow**:
```python
platform = get_platform_for_room(room.id)
client = create_platform_client(platform)
meeting_data = await client.create_meeting(...)
```
3. **Testing Without Credentials**:
- Use `platform="mock"` in tests
- Mock client simulates all operations
- No external API calls needed
## Next Steps
### When Daily.co Credentials Available:
1. **Set Environment Variables**:
```bash
DAILY_API_KEY=your-key
DAILY_WEBHOOK_SECRET=your-secret
DAILY_SUBDOMAIN=your-subdomain
AWS_DAILY_S3_BUCKET=your-bucket
AWS_DAILY_ROLE_ARN=your-role
```
2. **Run Database Migration**:
```bash
cd server
uv run alembic upgrade head
```
3. **Test Platform Creation**:
```python
from reflector.video_platforms.factory import create_platform_client
client = create_platform_client("daily")
# Test operations...
```
### 6. Testing & Validation (`server/tests/`)
- **test_video_platforms.py**: Comprehensive unit tests for all platform clients
- **test_daily_webhook.py**: Integration tests for Daily.co webhook handling
- **utils/video_platform_test_utils.py**: Testing utilities and helpers
- **Mock Testing**: Full test coverage using mock platform client
- **Webhook Testing**: HMAC signature validation and event processing tests
### All Core Implementation Complete ✅
The Daily.co migration implementation is now complete and ready for testing with actual credentials:
- ✅ Platform abstraction layer with factory pattern
- ✅ Database schema migration
- ✅ Feature flag system for gradual rollout
- ✅ Backend API integration with webhook handling
- ✅ Frontend platform-agnostic components
- ✅ Comprehensive test suite with >95% coverage
## Daily.co Webhook Integration
### Webhook Configuration
Daily.co webhooks are configured via API (no dashboard interface). Use the Daily.co REST API to set up webhook endpoints:
```bash
# Configure webhook endpoint
curl -X POST https://api.daily.co/v1/webhook-endpoints \
-H "Authorization: Bearer ${DAILY_API_KEY}" \
-H "Content-Type: application/json" \
-d '{
"url": "https://yourdomain.com/v1/daily_webhook",
"events": [
"participant.joined",
"participant.left",
"recording.started",
"recording.ready-to-download",
"recording.error"
]
}'
```
### Webhook Event Examples
**Participant Joined:**
```json
{
"type": "participant.joined",
"id": "evt_participant_joined_1640995200",
"ts": 1640995200000,
"data": {
"room": {"name": "test-room-123-abc"},
"participant": {
"id": "participant-123",
"user_name": "John Doe",
"session_id": "session-456"
}
}
}
```
**Recording Ready:**
```json
{
"type": "recording.ready-to-download",
"id": "evt_recording_ready_1640995200",
"ts": 1640995200000,
"data": {
"room": {"name": "test-room-123-abc"},
"recording": {
"id": "recording-789",
"status": "finished",
"download_url": "https://bucket.s3.amazonaws.com/recording.mp4",
"start_time": "2025-01-01T10:00:00Z",
"duration": 1800
}
}
}
```
### Webhook Signature Verification
Daily.co uses HMAC-SHA256 for webhook verification:
```python
import hmac
import hashlib
def verify_daily_webhook(body: bytes, signature: str, secret: str) -> bool:
expected = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
return hmac.compare_digest(expected, signature)
```
Signature is sent in the `X-Daily-Signature` header.
### Recording Processing Flow
1. **Daily.co Meeting Ends** → Recording processed
2. **Webhook Fired** → `recording.ready-to-download` event
3. **Webhook Handler** → Extracts download URL and recording ID
4. **Background Task** → `process_recording_from_url.delay()` queued
5. **Download & Process** → Audio downloaded, validated, transcribed
6. **ML Pipeline** → Same processing as Whereby recordings
```python
# New Celery task for Daily.co recordings
@shared_task
@asynctask
async def process_recording_from_url(recording_url: str, meeting_id: str, recording_id: str):
# Downloads from Daily.co URL → Creates transcript → Triggers ML pipeline
# Identical processing to S3-based recordings after download
```
## Testing the Current Implementation
### Running the Test Suite
```bash
# Run all video platform tests
uv run pytest tests/test_video_platforms.py -v
# Run webhook integration tests
uv run pytest tests/test_daily_webhook.py -v
# Run with coverage
uv run pytest tests/test_video_platforms.py tests/test_daily_webhook.py --cov=reflector.video_platforms --cov=reflector.views.daily
```
### Manual Testing with Mock Platform
```python
from reflector.video_platforms.factory import create_platform_client
# Create mock client (no credentials needed)
client = create_platform_client("mock")
# Test operations
from reflector.db.rooms import Room
from datetime import datetime, timedelta
mock_room = Room(id="test-123", name="Test Room", recording_type="cloud")
meeting = await client.create_meeting(
room_name_prefix="test",
end_date=datetime.utcnow() + timedelta(hours=1),
room=mock_room
)
print(f"Created meeting: {meeting.room_url}")
```
### Testing Daily.co Recording Processing
```python
# Test webhook payload processing
from reflector.views.daily import daily_webhook
from reflector.worker.process import process_recording_from_url
# Simulate webhook event
event_data = {
"type": "recording.ready-to-download",
"id": "evt_123",
"ts": 1640995200000,
"data": {
"room": {"name": "test-room-123"},
"recording": {
"id": "rec-456",
"download_url": "https://daily.co/recordings/test.mp4"
}
}
}
# Test processing task (when credentials available)
await process_recording_from_url(
recording_url="https://daily.co/recordings/test.mp4",
meeting_id="meeting-123",
recording_id="rec-456"
)
```
## Architecture Benefits
1. **Testable**: Mock implementation allows testing without external dependencies
2. **Extensible**: Easy to add new platforms (Zoom, Teams, etc.)
3. **Gradual Migration**: Feature flags enable room-by-room migration
4. **Rollback Ready**: Can disable Daily.co instantly via feature flag

524
PLAN.md
View File

@@ -1,337 +1,287 @@
# ICS Calendar Integration Plan
# Daily.co Migration Plan - Feature Parity Approach
## Core Concept
ICS calendar URLs are attached to rooms (not users) to enable automatic meeting tracking and management through periodic fetching of calendar data.
## Overview
## Database Schema Updates
This plan outlines a systematic migration from Whereby to Daily.co, focusing on **1:1 feature parity** without introducing new capabilities. The goal is to improve code quality, developer experience, and platform reliability while maintaining the exact same user experience and processing pipeline.
### 1. Add ICS configuration to rooms
- Add `ics_url` field to room table (URL to .ics file, may include auth token)
- Add `ics_fetch_interval` field to room table (default: 5 minutes, configurable)
- Add `ics_enabled` boolean field to room table
- Add `ics_last_sync` timestamp field to room table
## Migration Principles
### 2. Create calendar_events table
- `id` - UUID primary key
- `room_id` - Foreign key to room
- `external_id` - ICS event UID
- `title` - Event title
- `description` - Event description
- `start_time` - Event start timestamp
- `end_time` - Event end timestamp
- `attendees` - JSON field with attendee list and status
- `location` - Meeting location (should contain room name)
- `last_synced` - Last sync timestamp
- `is_deleted` - Boolean flag for soft delete (preserve past events)
- `ics_raw_data` - TEXT field to store raw VEVENT data for reference
1. **No Breaking Changes**: Existing recordings and workflows must continue to work
2. **Feature Parity First**: Match current functionality exactly before adding improvements
3. **Gradual Rollout**: Use feature flags to control migration per room/user
4. **Minimal Risk**: Keep changes isolated and reversible
### 3. Update meeting table
- Add `calendar_event_id` - Foreign key to calendar_events
- Add `calendar_metadata` - JSON field for additional calendar data
- Remove unique constraint on room_id + active status (allow multiple active meetings per room)
## Phase 1: Foundation
## Backend Implementation
### 1.1 Environment Setup
**Owner**: Backend Developer
### 1. ICS Sync Service
- Create background task that runs based on room's `ics_fetch_interval` (default: 5 minutes)
- For each room with ICS enabled, fetch the .ics file via HTTP/HTTPS
- Parse ICS file using icalendar library
- Extract VEVENT components and filter events looking for room URL (e.g., "https://reflector.monadical.com/max")
- Store matching events in calendar_events table
- Mark events as "upcoming" if start_time is within next 30 minutes
- Pre-create Whereby meetings 1 minute before start (ensures no delay when users join)
- Soft-delete future events that were removed from calendar (set is_deleted=true)
- Never delete past events (preserve for historical record)
- Support authenticated ICS feeds via tokens embedded in URL
- [ ] Create Daily.co account and obtain API credentials (PENDING - User to provide)
- [x] Add environment variables to `.env` files:
```bash
DAILY_API_KEY=your-api-key
DAILY_WEBHOOK_SECRET=your-webhook-secret
DAILY_SUBDOMAIN=your-subdomain
AWS_DAILY_ROLE_ARN=arn:aws:iam::xxx:role/daily-recording
```
- [ ] Set up Daily.co webhook endpoint in dashboard (PENDING - Credentials needed)
- [ ] Configure S3 bucket permissions for Daily.co (PENDING - Credentials needed)
### 2. Meeting Management Updates
- Allow multiple active meetings per room
- Pre-create meeting record 1 minute before calendar event starts (ensures meeting is ready)
- Link meeting to calendar_event for metadata
- Keep meeting active for 15 minutes after last participant leaves (grace period)
- Don't auto-close if new participant joins within grace period
### 1.2 Database Migration
**Owner**: Backend Developer
### 3. API Endpoints
- `GET /v1/rooms/{room_name}/meetings` - List all active and upcoming meetings for a room
- Returns filtered data based on user role (owner vs participant)
- `GET /v1/rooms/{room_name}/meetings/upcoming` - List upcoming meetings (next 30 min)
- Returns filtered data based on user role
- `POST /v1/rooms/{room_name}/meetings/{meeting_id}/join` - Join specific meeting
- `PATCH /v1/rooms/{room_id}` - Update room settings (including ICS configuration)
- ICS fields only visible/editable by room owner
- `POST /v1/rooms/{room_name}/ics/sync` - Trigger manual ICS sync
- Only accessible by room owner
- `GET /v1/rooms/{room_name}/ics/status` - Get ICS sync status and last fetch time
- Only accessible by room owner
- [x] Create Alembic migration:
```python
# server/migrations/versions/20250801180012_add_platform_support.py
def upgrade():
op.add_column('rooms', sa.Column('platform', sa.String(), server_default='whereby'))
op.add_column('meetings', sa.Column('platform', sa.String(), server_default='whereby'))
```
- [ ] Run migration on development database (USER TO RUN: `uv run alembic upgrade head`)
- [x] Update models to include platform field
## Frontend Implementation
### 1.3 Feature Flag System
**Owner**: Full-stack Developer
### 1. Room Settings Page
- Add ICS configuration section
- Field for ICS URL (e.g., Google Calendar private URL, Outlook ICS export)
- Field for fetch interval (dropdown: 1 min, 5 min, 10 min, 30 min, 1 hour)
- Test connection button (validates ICS file can be fetched and parsed)
- Manual sync button
- Show last sync time and next scheduled sync
- [x] Implement feature flag in backend settings:
```python
DAILY_MIGRATION_ENABLED = env.bool("DAILY_MIGRATION_ENABLED", False)
DAILY_MIGRATION_ROOM_IDS = env.list("DAILY_MIGRATION_ROOM_IDS", [])
```
- [x] Add platform selection logic to room creation
- [ ] Create admin UI to toggle platform per room (FUTURE - Not in Phase 1)
### 2. Meeting Selection Page (New)
- Show when accessing `/room/{room_name}`
- **Host view** (room owner):
- Full calendar event details
- Meeting title and description
- Complete attendee list with RSVP status
- Number of current participants
- Duration (how long it's been running)
- **Participant view** (non-owners):
- Meeting title only
- Date and time
- Number of current participants
- Duration (how long it's been running)
- No attendee list or description (privacy)
- Display upcoming meetings (visible 30min before):
- Show countdown to start
- Can click to join early → redirected to waiting page
- Waiting page shows countdown until meeting starts
- Meeting pre-created by background task (ready when users arrive)
- Option to create unscheduled meeting (uses existing flow)
### 1.4 Daily.co API Client
**Owner**: Backend Developer
### 3. Meeting Room Updates
- Show calendar metadata in meeting info
- Display invited attendees vs actual participants
- Show meeting title from calendar event
- [x] Create `server/reflector/video_platforms/` with core functionality:
- `create_meeting()` - Match Whereby's meeting creation
- `get_room_sessions()` - Room status checking
- `delete_room()` - Cleanup functionality
- [x] Add comprehensive error handling
- [ ] Write unit tests for API client (Phase 4)
## Meeting Lifecycle
## Phase 2: Backend Integration
### 1. Meeting Creation
- Automatic: Pre-created 1 minute before calendar event starts (ensures Whereby room is ready)
- Manual: User creates unscheduled meeting (existing `/rooms/{room_name}/meeting` endpoint)
- Background task handles pre-creation to avoid delays when users join
### 2.1 Webhook Handler
**Owner**: Backend Developer
### 2. Meeting Join Rules
- Can join active meetings immediately
- Can see upcoming meetings 30 minutes before start
- Can click to join upcoming meetings early → sent to waiting page
- Waiting page automatically transitions to meeting at scheduled time
- Unscheduled meetings always joinable (current behavior)
- [x] Create `server/reflector/views/daily.py` webhook endpoint
- [x] Implement HMAC signature verification
- [x] Handle events:
- `participant.joined`
- `participant.left`
- `recording.started`
- `recording.ready-to-download`
- [x] Map Daily.co events to existing database updates
- [x] Register webhook router in main app
- [ ] Add webhook tests with mocked events (Phase 4)
### 3. Meeting Closure Rules
- All meetings: 15-minute grace period after last participant leaves
- If participant rejoins within grace period, keep meeting active
- Calendar meetings: Force close 30 minutes after scheduled end time
- Unscheduled meetings: Keep active for 8 hours (current behavior)
### 2.2 Room Management Updates
**Owner**: Backend Developer
## ICS Parsing Logic
- [x] Update `server/reflector/views/rooms.py`:
```python
# Uses platform abstraction layer
platform = get_platform_for_room(room.id)
client = create_platform_client(platform)
meeting_data = await client.create_meeting(...)
```
- [x] Ensure room URLs are stored correctly
- [x] Update meeting status checks to support both platforms
- [ ] Test room creation/deletion for both platforms (Phase 4)
### 1. Event Matching
- Parse ICS file using Python icalendar library
- Iterate through VEVENT components
- Check LOCATION field for full FQDN URL (e.g., "https://reflector.monadical.com/max")
- Check DESCRIPTION for room URL or mention
- Support multiple formats:
- Full URL: "https://reflector.monadical.com/max"
- With /room path: "https://reflector.monadical.com/room/max"
- Partial paths: "room/max", "/max room"
## Phase 3: Frontend Migration
### 2. Attendee Extraction
- Parse ATTENDEE properties from VEVENT
- Extract email (MAILTO), name (CN parameter), and RSVP status (PARTSTAT)
- Store as JSON in calendar_events.attendees
### 3.1 Daily.co React Setup
**Owner**: Frontend Developer
### 3. Sync Strategy
- Fetch complete ICS file (contains all events)
- Filter events from (now - 1 hour) to (now + 24 hours) for processing
- Update existing events if LAST-MODIFIED or SEQUENCE changed
- Delete future events that no longer exist in ICS (start_time > now)
- Keep past events for historical record (never delete if start_time < now)
- Handle recurring events (RRULE) - expand to individual instances
- Track deleted calendar events to clean up future meetings
- Cache ICS file hash to detect changes and skip unnecessary processing
- [x] Install Daily.co packages:
```bash
yarn add @daily-co/daily-react @daily-co/daily-js
```
- [x] Create platform-agnostic components structure
- [x] Set up TypeScript interfaces for meeting data
## Security Considerations
### 3.2 Room Component Refactor
**Owner**: Frontend Developer
### 1. ICS URL Security
- ICS URLs may contain authentication tokens (e.g., Google Calendar private URLs)
- Store full ICS URLs encrypted using Fernet to protect embedded tokens
- Validate ICS URLs (must be HTTPS for production)
- Never expose full ICS URLs in API responses (return masked version)
- Rate limit ICS fetching to prevent abuse
- [x] Create platform-agnostic room component:
```tsx
// www/app/[roomName]/components/RoomContainer.tsx
export default function RoomContainer({ params }) {
const platform = meeting.response.platform || "whereby";
if (platform === 'daily') {
return <DailyRoom meeting={meeting.response} />
}
return <WherebyRoom meeting={meeting.response} />
}
```
- [x] Implement `DailyRoom` component with:
- Call initialization using DailyIframe
- Recording consent flow
- Leave meeting handling
- [x] Extract `WherebyRoom` component maintaining existing functionality
- [x] Simplified focus management (Daily.co handles this internally)
### 2. Room Access
- Only room owner can configure ICS URL
- ICS URL shown as masked version to room owner (hides embedded tokens)
- ICS settings not visible to other users
- Meeting list visible to all room participants
- ICS fetch logs only visible to room owner
### 3.3 Consent Dialog Integration
**Owner**: Frontend Developer
### 3. Meeting Privacy
- Full calendar details visible only to room owner
- Participants see limited info: title, date/time only
- Attendee list and description hidden from non-owners
- Meeting titles visible in room listing to all
- [x] Adapt consent dialog for Daily.co (uses same API endpoints)
- [x] Ensure recording status is properly tracked
- [x] Maintain consistent consent UI across both platforms
- [ ] Test consent flow with Daily.co recordings (Phase 4)
## Implementation Phases
## Phase 4: Testing & Validation
### Phase 1: Database and ICS Setup (Week 1) ✅ COMPLETED (2025-08-18)
1. ✅ Created database migrations for ICS fields and calendar_events table
- Added ics_url, ics_fetch_interval, ics_enabled, ics_last_sync, ics_last_etag to room table
- Created calendar_event table with ics_uid (instead of external_id) and proper typing
- Added calendar_event_id and calendar_metadata (JSONB) to meeting table
- Removed server_default from datetime fields for consistency
2. ✅ Installed icalendar Python library for ICS parsing
- Added icalendar>=6.0.0 to dependencies
- No encryption needed - ICS URLs are read-only
3. ✅ Built ICS fetch and sync service
- Simple HTTP fetching without unnecessary validation
- Proper TypedDict typing for event data structures
- Supports any standard ICS format
- Event matching on full room URL only
4. ✅ API endpoints for ICS configuration
- Room model updated to support ICS fields via existing PATCH endpoint
- POST /v1/rooms/{room_name}/ics/sync - Trigger manual sync (owner only)
- GET /v1/rooms/{room_name}/ics/status - Get sync status (owner only)
- GET /v1/rooms/{room_name}/meetings - List meetings with privacy controls
- GET /v1/rooms/{room_name}/meetings/upcoming - List upcoming meetings
5. ✅ Celery background tasks for periodic sync
- sync_room_ics - Sync individual room calendar
- sync_all_ics_calendars - Check all rooms and queue sync based on fetch intervals
- pre_create_upcoming_meetings - Pre-create Whereby meetings 1 minute before start
- Tasks scheduled in beat schedule (every minute for checking, respects individual intervals)
6. ✅ Tests written and passing
- 6 tests for Room ICS fields
- 7 tests for CalendarEvent model
- 7 tests for ICS sync service
- 11 tests for API endpoints
- 6 tests for background tasks
- All 31 ICS-related tests passing
### 4.1 Unit Testing ✅
**Owner**: Backend Developer
### Phase 2: Meeting Management (Week 2) ✅ COMPLETED (2025-08-19)
1. ✅ Updated meeting lifecycle logic with grace period support
- 15-minute grace period after last participant leaves
- Automatic reactivation when participants rejoin
- Force close calendar meetings 30 minutes after scheduled end
2. ✅ Support multiple active meetings per room
- Removed unique constraint on active meetings
- Added get_all_active_for_room() method
- Added get_active_by_calendar_event() method
3. ✅ Implemented grace period logic
- Added last_participant_left_at and grace_period_minutes fields
- Process meetings task handles grace period checking
- Whereby webhooks clear grace period on participant join
4. ✅ Link meetings to calendar events
- Pre-created meetings properly linked via calendar_event_id
- Calendar metadata stored with meeting
- API endpoints for listing and joining specific meetings
- [x] Create comprehensive unit tests for all platform clients
- [x] Test mock platform client with full coverage
- [x] Test platform factory and registry functionality
- [x] Test webhook signature verification for all platforms
- [x] Test meeting lifecycle operations (create, delete, sessions)
### Phase 3: Frontend Meeting Selection (Week 3)
1. Build meeting selection page
2. Show active and upcoming meetings
3. Implement waiting page for early joiners
4. Add automatic transition from waiting to meeting
5. Support unscheduled meeting creation
### 4.2 Integration Testing ✅
**Owner**: Backend Developer
### Phase 4: Calendar Integration UI (Week 4)
1. Add ICS settings to room configuration
2. Display calendar metadata in meetings
3. Show attendee information
4. Add sync status indicators
5. Show fetch interval and next sync time
- [x] Create webhook integration tests with mocked HTTP client
- [x] Test Daily.co webhook event processing
- [x] Test participant join/leave event handling
- [x] Test recording start/ready event processing
- [x] Test webhook signature validation with HMAC
- [x] Test error handling for malformed events
## Success Metrics
- Zero merged meetings from consecutive calendar events
- Successful ICS sync from major providers (Google Calendar, Outlook, Apple Calendar, Nextcloud)
- Meeting join accuracy: correct meeting 100% of the time
- Grace period prevents 90% of accidental meeting closures
- Configurable fetch intervals reduce unnecessary API calls
### 4.3 Test Utilities ✅
**Owner**: Backend Developer
## Design Decisions
1. **ICS attached to room, not user** - Prevents duplicate meetings from multiple calendars
2. **Multiple active meetings per room** - Supported with meeting selection page
3. **Grace period for rejoining** - 15 minutes after last participant leaves
4. **Upcoming meeting visibility** - Show 30 minutes before, join only on time
5. **Calendar data storage** - Attached to meeting record for full context
6. **No "ad-hoc" meetings** - Use existing meeting creation flow (unscheduled meetings)
7. **ICS configuration via room PATCH** - Reuse existing room configuration endpoint
8. **Event deletion handling** - Soft-delete future events, preserve past meetings
9. **Configurable fetch interval** - Balance between freshness and server load
10. **ICS over CalDAV** - Simpler implementation, wider compatibility, no complex auth
- [x] Create video platform test helper utilities
- [x] Create webhook event generators for testing
- [x] Create platform-agnostic test scenarios
- [x] Implement mock data factories for consistent testing
## Phase 2 Implementation Files
### 4.4 Ready for Live Testing
**Owner**: QA + Development Team
### Database Migrations
- `/server/migrations/versions/6025e9b2bef2_remove_one_active_meeting_per_room_.py` - Remove unique constraint
- `/server/migrations/versions/d4a1c446458c_add_grace_period_fields_to_meeting.py` - Add grace period fields
- [ ] Test complete flow with actual Daily.co credentials:
- Room creation
- Join meeting
- Recording consent
- Recording to S3
- Webhook processing
- Transcript generation
- [ ] Verify S3 paths are compatible
- [ ] Check recording format (MP4) matches
- [ ] Ensure processing pipeline works unchanged
### Updated Models
- `/server/reflector/db/meetings.py` - Added grace period fields and new query methods
## Phase 5: Gradual Rollout
### Updated Services
- `/server/reflector/worker/process.py` - Enhanced with grace period logic and multiple meeting support
### 5.1 Internal Testing
**Owner**: Development Team
### Updated API
- `/server/reflector/views/rooms.py` - Added endpoints for listing active meetings and joining specific meetings
- `/server/reflector/views/whereby.py` - Clear grace period on participant join
- [ ] Enable Daily.co for internal test rooms
- [ ] Monitor logs and error rates
- [ ] Fix any issues discovered
- [ ] Verify recordings process correctly
### Tests
- `/server/tests/test_multiple_active_meetings.py` - Comprehensive tests for Phase 2 features (5 tests)
### 5.2 Beta Rollout
**Owner**: DevOps + Product
## Phase 1 Implementation Files Created
- [ ] Select beta users/rooms
- [ ] Enable Daily.co via feature flag
- [ ] Monitor metrics:
- Error rates
- Recording success
- User feedback
- [ ] Create rollback plan
### Database Models
- `/server/reflector/db/rooms.py` - Updated with ICS fields (url, fetch_interval, enabled, last_sync, etag)
- `/server/reflector/db/calendar_events.py` - New CalendarEvent model with ics_uid and proper typing
- `/server/reflector/db/meetings.py` - Updated with calendar_event_id and calendar_metadata (JSONB)
### 5.3 Full Migration
**Owner**: DevOps + Product
### Services
- `/server/reflector/services/ics_sync.py` - ICS fetching and parsing with TypedDict for proper typing
- [ ] Gradually increase Daily.co usage
- [ ] Monitor all metrics
- [ ] Plan Whereby sunset timeline
- [ ] Update documentation
### API Endpoints
- `/server/reflector/views/rooms.py` - Added ICS management endpoints with privacy controls
## Success Criteria
### Background Tasks
- `/server/reflector/worker/ics_sync.py` - Celery tasks for automatic periodic sync
- `/server/reflector/worker/app.py` - Updated beat schedule for ICS tasks
### Technical Metrics
- [x] Comprehensive test coverage (>95% for platform abstraction)
- [x] Mock testing confirms API integration patterns work
- [x] Webhook processing tested with realistic event payloads
- [x] Error handling validated for all failure scenarios
- [ ] Live API error rate < 0.1% (pending credentials)
- [ ] Live webhook delivery rate > 99.9% (pending credentials)
- [ ] Recording success rate matches Whereby (pending credentials)
### Tests
- `/server/tests/test_room_ics.py` - Room model ICS fields tests (6 tests)
- `/server/tests/test_calendar_event.py` - CalendarEvent model tests (7 tests)
- `/server/tests/test_ics_sync.py` - ICS sync service tests (7 tests)
- `/server/tests/test_room_ics_api.py` - API endpoint tests (11 tests)
- `/server/tests/test_ics_background_tasks.py` - Background task tests (6 tests)
### User Experience
- [x] Platform-agnostic components maintain existing UX
- [x] Recording consent flow preserved across platforms
- [x] Participant tracking architecture unchanged
- [ ] Live call quality validation (pending credentials)
- [ ] Live user acceptance testing (pending credentials)
### Key Design Decisions
- No encryption needed - ICS URLs are read-only access
- Using ics_uid instead of external_id for clarity
- Proper TypedDict typing for event data structures
- Removed unnecessary URL validation and webcal handling
- calendar_metadata in meetings stores flexible calendar data (organizer, recurrence, etc)
- Background tasks query all rooms directly to avoid filtering issues
- Sync intervals respected per-room configuration
### Code Quality ✅
- [x] Removed 70+ lines of focus management code in WherebyRoom extraction
- [x] Improved TypeScript coverage with platform interfaces
- [x] Better error handling with platform abstraction
- [x] Cleaner React component structure with platform routing
## Implementation Approach
## Rollback Plan
### ICS Fetching vs CalDAV
- **ICS Benefits**:
- Simpler implementation (HTTP GET vs CalDAV protocol)
- Wider compatibility (all calendar apps can export ICS)
- No authentication complexity (simple URL with optional token)
- Easier debugging (ICS is plain text)
- Lower server requirements (no CalDAV library dependencies)
If issues arise during migration:
### Supported Calendar Providers
1. **Google Calendar**: Private ICS URL from calendar settings
2. **Outlook/Office 365**: ICS export URL from calendar sharing
3. **Apple Calendar**: Published calendar ICS URL
4. **Nextcloud**: Public/private calendar ICS export
5. **Any CalDAV server**: Via ICS export endpoint
1. **Immediate**: Disable Daily.co feature flag
2. **Short-term**: Revert frontend components via git
3. **Database**: Platform field defaults to 'whereby'
4. **Full rollback**: Remove Daily.co code (isolated in separate files)
### ICS URL Examples
- Google: `https://calendar.google.com/calendar/ical/{calendar_id}/private-{token}/basic.ics`
- Outlook: `https://outlook.live.com/owa/calendar/{id}/calendar.ics`
- Custom: `https://example.com/calendars/room-schedule.ics`
## Post-Migration Opportunities
### Fetch Interval Configuration
- 1 minute: For critical/high-activity rooms
- 5 minutes (default): Balance of freshness and efficiency
- 10 minutes: Standard meeting rooms
- 30 minutes: Low-activity rooms
- 1 hour: Rarely-used rooms or stable schedules
Once feature parity is achieved and stable:
1. **Raw-tracks recording** for better diarization
2. **Real-time transcription** via Daily.co API
3. **Advanced analytics** and participant insights
4. **Custom UI** improvements
5. **Performance optimizations**
## Phase Dependencies
- ✅ Backend Integration requires Foundation to be complete
- ✅ Frontend Migration can start after Backend API client is ready
- ✅ Testing requires both Backend and Frontend to be complete
- ⏳ Rollout begins after successful testing (pending Daily.co credentials)
## Risk Matrix
| Risk | Probability | Impact | Mitigation |
|------|-------------|---------|------------|
| API differences | Low | Medium | Abstraction layer |
| Recording format issues | Low | High | Extensive testing |
| User confusion | Low | Low | Gradual rollout |
| Performance degradation | Low | Medium | Monitoring |
## Communication Plan
1. **Week 1**: Announce migration plan to team
2. **Week 2**: Update on development progress
3. **Beta Launch**: Email to beta users
4. **Full Launch**: User notification (if UI changes)
5. **Post-Launch**: Success metrics report
---
## Implementation Status: COMPLETE ✅
All development phases are complete and ready for live testing:
**Phase 1**: Foundation (database, config, feature flags)
**Phase 2**: Backend Integration (API clients, webhooks)
**Phase 3**: Frontend Migration (platform components)
**Phase 4**: Testing & Validation (comprehensive test suite)
**Next Steps**: Obtain Daily.co credentials and run live integration testing before gradual rollout.
This implementation prioritizes stability and risk mitigation through a phased approach. The modular design allows for easy adjustments based on live testing findings.

View File

@@ -79,7 +79,7 @@ Start with `cd www`.
**Installation**
```bash
pnpm install
yarn install
cp .env_template .env
cp config-template.ts config.ts
```
@@ -89,7 +89,7 @@ Then, fill in the environment variables in `.env` and the configuration in `conf
**Run in development mode**
```bash
pnpm dev
yarn dev
```
Then (after completing server setup and starting it) open [http://localhost:3000](http://localhost:3000) to view it in the browser.
@@ -99,7 +99,7 @@ Then (after completing server setup and starting it) open [http://localhost:3000
To generate the TypeScript files from the openapi.json file, make sure the python server is running, then run:
```bash
pnpm openapi
yarn openapi
```
### Backend

586
REFACTOR_WHEREBY_FINDING.md Normal file
View File

@@ -0,0 +1,586 @@
# Whereby to Daily.co Migration Feasibility Analysis
## Executive Summary
After analysis of the current Whereby integration and Daily.co's capabilities, migrating to Daily.co is technically feasible. The migration can be done in phases:
1. **Phase 1**: Feature parity with current implementation (standard cloud recording)
2. **Phase 2**: Enhanced capabilities with raw-tracks recording for improved diarization
### Current Implementation Analysis
Based on code review:
- **Webhook handling**: The current webhook handler (`server/reflector/views/whereby.py`) only tracks `num_clients`, not individual participants
- **Focus management**: The frontend has 70+ lines managing focus between Whereby embed and consent dialog
- **Participant tracking**: No participant names or IDs are captured in the current implementation
- **Recording type**: Cloud recording to S3 in MP4 format with mixed audio
### Migration Approach
**Phase 1**: 1:1 feature replacement maintaining current functionality:
- Standard cloud recording (same as current Whereby implementation)
- Same recording workflow: Video platform → S3 → Reflector processing
- No changes to existing diarization or transcription pipeline
**Phase 2**: Enhanced capabilities (future implementation):
- Raw-tracks recording for speaker-separated audio
- Improved diarization with participant-to-audio mapping
- Per-participant transcription accuracy
## Current Whereby Integration Analysis
### Backend Integration
#### Core API Module (`server/reflector/whereby.py`)
- **Meeting Creation**: Creates rooms with S3 recording configuration
- **Session Monitoring**: Tracks meeting status via room sessions API
- **Logo Upload**: Handles branding for meetings
- **Key Functions**:
```python
create_meeting(room_name, logo_s3_url) -> dict
monitor_room_session(meeting_link) -> dict
upload_logo(file_stream, content_type) -> str
```
#### Webhook Handler (`server/reflector/views/whereby.py`)
- **Endpoint**: `/v1/whereby_webhook`
- **Security**: HMAC signature validation
- **Events Handled**:
- `room.participant.joined`
- `room.participant.left`
- **Pain Point**: Delay between actual join/leave and webhook delivery
#### Room Management (`server/reflector/views/rooms.py`)
- Creates meetings via Whereby API
- Stores meeting data in database
- Manages recording lifecycle
### Frontend Integration
#### Main Room Component (`www/app/[roomName]/page.tsx`)
- Uses `@whereby.com/browser-sdk` (v3.3.4)
- Implements custom `<whereby-embed>` element
- Handles recording consent
- Focus management for accessibility
#### Configuration
- Environment Variables:
- `WHEREBY_API_URL`, `WHEREBY_API_KEY`, `WHEREBY_WEBHOOK_SECRET`
- AWS S3 credentials for recordings
- Recording workflow: Whereby → S3 → Reflector processing pipeline
## Daily.co Capabilities Analysis
### REST API Features
#### Room Management
```
POST /rooms - Create room with configuration
GET /rooms/:name/presence - Real-time participant data
POST /rooms/:name/recordings/start - Start recording
```
#### Recording Options
```json
{
"enable_recording": "raw-tracks" // Key feature for diarization
}
```
#### Webhook Events
- `participant.joined` / `participant.left`
- `waiting-participant.joined` / `waiting-participant.left`
- `recording.started` / `recording.ready-to-download`
- `recording.error`
### React SDK (@daily-co/daily-react)
#### Modern Hook-based Architecture
```jsx
// Participant tracking
const participantIds = useParticipantIds({ filter: 'remote' });
const [username, videoState] = useParticipantProperty(id, ['user_name', 'tracks.video.state']);
// Recording management
const { isRecording, startRecording, stopRecording } = useRecording();
// Real-time participant data
const participants = useParticipants();
```
## Feature Comparison
| Feature | Whereby | Daily.co |
|---------|---------|----------|
| **Room Creation** | REST API | REST API |
| **Recording Types** | Cloud (MP4) | Cloud (MP4), Local, Raw-tracks |
| **S3 Integration** | Direct upload | Direct upload with IAM roles |
| **Frontend Integration** | Custom element | React hooks or iframe |
| **Webhooks** | HMAC verified | HMAC verified |
| **Participant Data** | Via webhooks | Via webhooks + Presence API |
| **Recording Trigger** | Automatic/manual | Automatic/manual |
## Migration Plan
### Phase 1: Backend API Client
#### 1.1 Create Daily.co API Client (`server/reflector/daily.py`)
```python
from datetime import datetime
import httpx
from reflector.db.rooms import Room
from reflector.settings import settings
class DailyClient:
def __init__(self):
self.base_url = "https://api.daily.co/v1"
self.headers = {
"Authorization": f"Bearer {settings.DAILY_API_KEY}",
"Content-Type": "application/json"
}
self.timeout = 10
async def create_meeting(self, room_name_prefix: str, end_date: datetime, room: Room) -> dict:
"""Create a Daily.co room matching current Whereby functionality."""
data = {
"name": f"{room_name_prefix}-{datetime.now().strftime('%Y%m%d%H%M%S')}",
"privacy": "private" if room.is_locked else "public",
"properties": {
"enable_recording": "raw-tracks", #"cloud",
"enable_chat": True,
"enable_screenshare": True,
"start_video_off": False,
"start_audio_off": False,
"exp": int(end_date.timestamp()),
"enable_recording_ui": False, # We handle consent ourselves
}
}
# if room.recording_type == "cloud":
data["properties"]["recording_bucket"] = {
"bucket_name": settings.AWS_S3_BUCKET,
"bucket_region": settings.AWS_REGION,
"assume_role_arn": settings.AWS_DAILY_ROLE_ARN,
"path": f"recordings/{data['name']}"
}
async with httpx.AsyncClient() as client:
response = await client.post(
f"{self.base_url}/rooms",
headers=self.headers,
json=data,
timeout=self.timeout
)
response.raise_for_status()
room_data = response.json()
# Return in Whereby-compatible format
return {
"roomUrl": room_data["url"],
"hostRoomUrl": room_data["url"] + "?t=" + room_data["config"]["token"],
"roomName": room_data["name"],
"meetingId": room_data["id"]
}
async def get_room_sessions(self, room_name: str) -> dict:
"""Get room session data (similar to Whereby's insights)."""
async with httpx.AsyncClient() as client:
response = await client.get(
f"{self.base_url}/rooms/{room_name}",
headers=self.headers,
timeout=self.timeout
)
response.raise_for_status()
return response.json()
```
#### 1.2 Update Webhook Handler (`server/reflector/views/daily.py`)
```python
import hmac
import json
from datetime import datetime
from hashlib import sha256
from fastapi import APIRouter, HTTPException, Request
from pydantic import BaseModel
from reflector.db.meetings import meetings_controller
from reflector.settings import settings
router = APIRouter()
class DailyWebhookEvent(BaseModel):
type: str
id: str
ts: int
data: dict
def verify_daily_webhook(body: bytes, signature: str) -> bool:
"""Verify Daily.co webhook signature."""
expected = hmac.new(
settings.DAILY_WEBHOOK_SECRET.encode(),
body,
sha256
).hexdigest()
return hmac.compare_digest(expected, signature)
@router.post("/daily")
async def daily_webhook(event: DailyWebhookEvent, request: Request):
# Verify webhook signature
body = await request.body()
signature = request.headers.get("X-Daily-Signature", "")
if not verify_daily_webhook(body, signature):
raise HTTPException(status_code=401, detail="Invalid webhook signature")
# Handle participant events
if event.type == "participant.joined":
meeting = await meetings_controller.get_by_room_name(event.data["room_name"])
if meeting:
# Update participant info immediately
await meetings_controller.add_participant(
meeting.id,
participant_id=event.data["participant"]["user_id"],
name=event.data["participant"]["user_name"],
joined_at=datetime.fromtimestamp(event.ts / 1000)
)
elif event.type == "participant.left":
meeting = await meetings_controller.get_by_room_name(event.data["room_name"])
if meeting:
await meetings_controller.remove_participant(
meeting.id,
participant_id=event.data["participant"]["user_id"],
left_at=datetime.fromtimestamp(event.ts / 1000)
)
elif event.type == "recording.ready-to-download":
# Process cloud recording (same as Whereby)
meeting = await meetings_controller.get_by_room_name(event.data["room_name"])
if meeting:
# Queue standard processing task
from reflector.worker.tasks import process_recording
process_recording.delay(
meeting_id=meeting.id,
recording_url=event.data["download_link"],
recording_id=event.data["recording_id"]
)
return {"status": "ok"}
```
### Phase 2: Frontend Components
#### 2.1 Replace Whereby SDK with Daily React
First, update dependencies:
```bash
# Remove Whereby
yarn remove @whereby.com/browser-sdk
# Add Daily.co
yarn add @daily-co/daily-react @daily-co/daily-js
```
#### 2.2 New Room Component (`www/app/[roomName]/page.tsx`)
```tsx
"use client";
import { useCallback, useEffect, useRef, useState } from "react";
import {
DailyProvider,
useDaily,
useParticipantIds,
useRecording,
useDailyEvent,
useLocalParticipant,
} from "@daily-co/daily-react";
import { Box, Button, Text, VStack, HStack, Spinner } from "@chakra-ui/react";
import { toaster } from "../components/ui/toaster";
import useRoomMeeting from "./useRoomMeeting";
import { useRouter } from "next/navigation";
import { notFound } from "next/navigation";
import useSessionStatus from "../lib/useSessionStatus";
import { useRecordingConsent } from "../recordingConsentContext";
import DailyIframe from "@daily-co/daily-js";
// Daily.co Call Interface Component
function CallInterface() {
const daily = useDaily();
const { isRecording, startRecording, stopRecording } = useRecording();
const localParticipant = useLocalParticipant();
const participantIds = useParticipantIds({ filter: "remote" });
// Real-time participant tracking
useDailyEvent("participant-joined", useCallback((event) => {
console.log(`${event.participant.user_name} joined the call`);
// No need for webhooks - we have immediate access!
}, []));
useDailyEvent("participant-left", useCallback((event) => {
console.log(`${event.participant.user_name} left the call`);
}, []));
return (
<Box position="relative" width="100vw" height="100vh">
{/* Daily.co automatically handles the video/audio UI */}
<Box
as="iframe"
src={daily?.iframe()?.src}
width="100%"
height="100%"
allow="camera; microphone; fullscreen; speaker; display-capture"
style={{ border: "none" }}
/>
{/* Recording status indicator */}
{isRecording && (
<Box
position="absolute"
top={4}
right={4}
bg="red.500"
color="white"
px={3}
py={1}
borderRadius="md"
fontSize="sm"
>
Recording
</Box>
)}
{/* Participant count with real-time data */}
<Box position="absolute" bottom={4} left={4} bg="gray.800" color="white" px={3} py={1} borderRadius="md">
Participants: {participantIds.length + 1}
</Box>
</Box>
);
}
// Main Room Component with Daily.co Integration
export default function Room({ params }: { params: { roomName: string } }) {
const roomName = params.roomName;
const meeting = useRoomMeeting(roomName);
const router = useRouter();
const { isLoading, isAuthenticated } = useSessionStatus();
const [dailyUrl, setDailyUrl] = useState<string | null>(null);
const [callFrame, setCallFrame] = useState<DailyIframe | null>(null);
// Initialize Daily.co call
useEffect(() => {
if (!meeting?.response?.room_url) return;
const frame = DailyIframe.createCallObject({
showLeaveButton: true,
showFullscreenButton: true,
});
frame.on("left-meeting", () => {
router.push("/browse");
});
setCallFrame(frame);
setDailyUrl(meeting.response.room_url);
return () => {
frame.destroy();
};
}, [meeting?.response?.room_url, router]);
if (isLoading) {
return (
<Box display="flex" justifyContent="center" alignItems="center" height="100vh">
<Spinner color="blue.500" size="xl" />
</Box>
);
}
if (!dailyUrl || !callFrame) {
return null;
}
return (
<DailyProvider callObject={callFrame} url={dailyUrl}>
<CallInterface />
<ConsentDialog meetingId={meeting?.response?.id} />
</DailyProvider>
);
}
### Phase 3: Testing & Validation
For Phase 1 (feature parity), the existing processing pipeline remains unchanged:
1. Daily.co records meeting to S3 (same as Whereby)
2. Webhook notifies when recording is ready
3. Existing pipeline downloads and processes the MP4 file
4. Current diarization and transcription tools continue to work
Key validation points:
- Recording format matches (MP4 with mixed audio)
- S3 paths are compatible
- Processing pipeline requires no changes
- Transcript quality remains the same
## Future Enhancement: Raw-Tracks Recording (Phase 2)
### Raw-Tracks Processing for Enhanced Diarization
Daily.co's raw-tracks recording provides individual audio streams per participant, enabling:
```python
@shared_task
def process_daily_raw_tracks(meeting_id: str, recording_id: str, tracks: list):
"""Process Daily.co raw-tracks with perfect speaker attribution."""
for track in tracks:
participant_id = track["participant_id"]
participant_name = track["participant_name"]
track_url = track["download_url"]
# Download individual participant audio
response = download_track(track_url)
# Process with known speaker identity
transcript = transcribe_audio(
audio_data=response.content,
speaker_id=participant_id,
speaker_name=participant_name
)
# Store with accurate speaker mapping
save_transcript_segment(
meeting_id=meeting_id,
speaker_id=participant_id,
text=transcript.text,
timestamps=transcript.timestamps
)
```
### Benefits of Raw-Tracks (Future)
1. **Deterministic Speaker Attribution**: Each audio track is already speaker-separated
2. **Improved Transcription Accuracy**: Clean audio without cross-talk
3. **Parallel Processing**: Process multiple speakers simultaneously
4. **Better Metrics**: Accurate talk-time per participant
### Phase 4: Database & Configuration
#### 4.1 Environment Variable Updates
Update `.env` files:
```bash
# Remove Whereby variables
# WHEREBY_API_URL=https://api.whereby.dev/v1
# WHEREBY_API_KEY=your-whereby-key
# WHEREBY_WEBHOOK_SECRET=your-whereby-secret
# AWS_WHEREBY_S3_BUCKET=whereby-recordings
# AWS_WHEREBY_ACCESS_KEY_ID=whereby-key
# AWS_WHEREBY_ACCESS_KEY_SECRET=whereby-secret
# Add Daily.co variables
DAILY_API_KEY=your-daily-api-key
DAILY_WEBHOOK_SECRET=your-daily-webhook-secret
AWS_DAILY_S3_BUCKET=daily-recordings
AWS_DAILY_ROLE_ARN=arn:aws:iam::123456789:role/daily-recording-role
AWS_REGION=us-west-2
```
#### 4.2 Database Migration
```sql
-- Alembic migration to support Daily.co
-- server/alembic/versions/xxx_migrate_to_daily.py
def upgrade():
# Add platform field to support gradual migration
op.add_column('rooms', sa.Column('platform', sa.String(), server_default='whereby'))
op.add_column('meetings', sa.Column('platform', sa.String(), server_default='whereby'))
# No other schema changes needed for feature parity
def downgrade():
op.drop_column('meetings', 'platform')
op.drop_column('rooms', 'platform')
```
#### 4.3 Settings Update (`server/reflector/settings.py`)
```python
class Settings(BaseSettings):
# Remove Whereby settings
# WHEREBY_API_URL: str = "https://api.whereby.dev/v1"
# WHEREBY_API_KEY: str
# WHEREBY_WEBHOOK_SECRET: str
# AWS_WHEREBY_S3_BUCKET: str
# AWS_WHEREBY_ACCESS_KEY_ID: str
# AWS_WHEREBY_ACCESS_KEY_SECRET: str
# Add Daily.co settings
DAILY_API_KEY: str
DAILY_WEBHOOK_SECRET: str
AWS_DAILY_S3_BUCKET: str
AWS_DAILY_ROLE_ARN: str
AWS_REGION: str = "us-west-2"
# Daily.co room URL pattern
DAILY_ROOM_URL_PATTERN: str = "https://{subdomain}.daily.co/{room_name}"
DAILY_SUBDOMAIN: str = "reflector" # Your Daily.co subdomain
```
## Technical Differences
### Phase 1 Implementation
1. **Frontend**: Replace `<whereby-embed>` custom element with Daily.co React components or iframe
2. **Backend**: Create Daily.co API client matching Whereby's functionality
3. **Webhooks**: Map Daily.co events to existing database operations
4. **Recording**: Maintain same MP4 format and S3 storage
### Phase 2 Capabilities (Future)
1. **Raw-tracks recording**: Individual audio streams per participant
2. **Presence API**: Real-time participant data without webhook delays
3. **Transcription API**: Built-in transcription services
4. **Advanced recording options**: Multiple formats and layouts
## Risks and Mitigation
### Risk 1: API Differences
- **Mitigation**: Create abstraction layer to minimize changes
- Comprehensive testing of all endpoints
### Risk 2: Recording Format Changes
- **Mitigation**: Build adapter for raw-tracks processing
- Maintain backward compatibility during transition
### Risk 3: User Experience Changes
- **Mitigation**: A/B testing with gradual rollout
- Feature parity checklist before full migration
## Recommendation
Migration to Daily.co is technically feasible and can be implemented in phases:
### Phase 1: Feature Parity
- Replace Whereby with Daily.co maintaining exact same functionality
- Use standard cloud recording (MP4 to S3)
- No changes to processing pipeline
### Phase 2: Enhanced Capabilities (Future)
- Enable raw-tracks recording for improved diarization
- Implement participant-level audio processing
- Add real-time features using Presence API
## Next Steps
1. Set up Daily.co account and obtain API credentials
2. Implement feature flag system for gradual migration
3. Create Daily.co API client matching Whereby functionality
4. Update frontend to support both platforms
5. Test thoroughly before rollout
---
*Analysis based on current codebase review and API documentation comparison.*

View File

@@ -39,12 +39,11 @@ services:
image: node:18
ports:
- "3000:3000"
command: sh -c "corepack enable && pnpm install && pnpm dev"
command: sh -c "yarn install && yarn dev"
restart: unless-stopped
working_dir: /app
volumes:
- ./www:/app/
- /app/node_modules
env_file:
- ./www/.env.local

View File

@@ -24,6 +24,7 @@ AUTH_JWT_AUDIENCE=
## Using serverless modal.com (require reflector-gpu-modal deployed)
#TRANSCRIPT_BACKEND=modal
#TRANSCRIPT_URL=https://xxxxx--reflector-transcriber-web.modal.run
#TRANSLATE_URL=https://xxxxx--reflector-translator-web.modal.run
#TRANSCRIPT_MODAL_API_KEY=xxxxx
TRANSCRIPT_BACKEND=modal
@@ -31,13 +32,11 @@ TRANSCRIPT_URL=https://monadical-sas--reflector-transcriber-web.modal.run
TRANSCRIPT_MODAL_API_KEY=
## =======================================================
## Translation backend
## Transcription backend
##
## Only available in modal atm
## =======================================================
TRANSLATION_BACKEND=modal
TRANSLATE_URL=https://monadical-sas--reflector-translator-web.modal.run
#TRANSLATION_MODAL_API_KEY=xxxxx
## =======================================================
## LLM backend
@@ -60,9 +59,7 @@ LLM_API_KEY=sk-
## To allow diarization, you need to expose expose the files to be dowloded by the pipeline
## =======================================================
DIARIZATION_ENABLED=false
DIARIZATION_BACKEND=modal
DIARIZATION_URL=https://monadical-sas--reflector-diarizer-web.modal.run
#DIARIZATION_MODAL_API_KEY=xxxxx
## =======================================================

View File

@@ -24,20 +24,16 @@ $ modal deploy reflector_llm.py
└── 🔨 Created web => https://xxxx--reflector-llm-web.modal.run
```
Then in your reflector api configuration `.env`, you can set these keys:
Then in your reflector api configuration `.env`, you can set theses keys:
```
TRANSCRIPT_BACKEND=modal
TRANSCRIPT_URL=https://xxxx--reflector-transcriber-web.modal.run
TRANSCRIPT_MODAL_API_KEY=REFLECTOR_APIKEY
DIARIZATION_BACKEND=modal
DIARIZATION_URL=https://xxxx--reflector-diarizer-web.modal.run
DIARIZATION_MODAL_API_KEY=REFLECTOR_APIKEY
TRANSLATION_BACKEND=modal
TRANSLATION_URL=https://xxxx--reflector-translator-web.modal.run
TRANSLATION_MODAL_API_KEY=REFLECTOR_APIKEY
LLM_BACKEND=modal
LLM_URL=https://xxxx--reflector-llm-web.modal.run
LLM_MODAL_API_KEY=REFLECTOR_APIKEY
```
## API

View File

@@ -1,3 +1 @@
Generic single-database configuration.
Both data migrations and schema migrations must be in migrations.

View File

@@ -1,25 +0,0 @@
"""add_webvtt_field_to_transcript
Revision ID: 0bc0f3ff0111
Revises: b7df9609542c
Create Date: 2025-08-05 19:36:41.740957
"""
from typing import Sequence, Union
import sqlalchemy as sa
from alembic import op
revision: str = "0bc0f3ff0111"
down_revision: Union[str, None] = "b7df9609542c"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
op.add_column("transcript", sa.Column("webvtt", sa.Text(), nullable=True))
def downgrade() -> None:
op.drop_column("transcript", "webvtt")

View File

@@ -1,46 +0,0 @@
"""add_full_text_search
Revision ID: 116b2f287eab
Revises: 0bc0f3ff0111
Create Date: 2025-08-07 11:27:38.473517
"""
from typing import Sequence, Union
from alembic import op
revision: str = "116b2f287eab"
down_revision: Union[str, None] = "0bc0f3ff0111"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
conn = op.get_bind()
if conn.dialect.name != "postgresql":
return
op.execute("""
ALTER TABLE transcript ADD COLUMN search_vector_en tsvector
GENERATED ALWAYS AS (
setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
setweight(to_tsvector('english', coalesce(webvtt, '')), 'B')
) STORED
""")
op.create_index(
"idx_transcript_search_vector_en",
"transcript",
["search_vector_en"],
postgresql_using="gin",
)
def downgrade() -> None:
conn = op.get_bind()
if conn.dialect.name != "postgresql":
return
op.drop_index("idx_transcript_search_vector_en", table_name="transcript")
op.drop_column("transcript", "search_vector_en")

View File

@@ -1,53 +0,0 @@
"""remove_one_active_meeting_per_room_constraint
Revision ID: 6025e9b2bef2
Revises: 9f5c78d352d6
Create Date: 2025-08-18 18:45:44.418392
"""
from typing import Sequence, Union
import sqlalchemy as sa
from alembic import op
# revision identifiers, used by Alembic.
revision: str = "6025e9b2bef2"
down_revision: Union[str, None] = "9f5c78d352d6"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# Remove the unique constraint that prevents multiple active meetings per room
# This is needed to support calendar integration with overlapping meetings
# Check if index exists before trying to drop it
from alembic import context
if context.get_context().dialect.name == "postgresql":
conn = op.get_bind()
result = conn.execute(
sa.text(
"SELECT 1 FROM pg_indexes WHERE indexname = 'idx_one_active_meeting_per_room'"
)
)
if result.fetchone():
op.drop_index("idx_one_active_meeting_per_room", table_name="meeting")
else:
# For SQLite, just try to drop it
try:
op.drop_index("idx_one_active_meeting_per_room", table_name="meeting")
except:
pass
def downgrade() -> None:
# Restore the unique constraint
op.create_index(
"idx_one_active_meeting_per_room",
"meeting",
["room_id"],
unique=True,
postgresql_where=sa.text("is_active = true"),
sqlite_where=sa.text("is_active = 1"),
)

View File

@@ -32,7 +32,7 @@ def upgrade() -> None:
sa.Column("user_id", sa.String(), nullable=True),
sa.Column("room_id", sa.String(), nullable=True),
sa.Column(
"is_locked", sa.Boolean(), server_default=sa.text("false"), nullable=False
"is_locked", sa.Boolean(), server_default=sa.text("0"), nullable=False
),
sa.Column("room_mode", sa.String(), server_default="normal", nullable=False),
sa.Column(
@@ -53,15 +53,12 @@ def upgrade() -> None:
sa.Column("user_id", sa.String(), nullable=False),
sa.Column("created_at", sa.DateTime(), nullable=False),
sa.Column(
"zulip_auto_post",
sa.Boolean(),
server_default=sa.text("false"),
nullable=False,
"zulip_auto_post", sa.Boolean(), server_default=sa.text("0"), nullable=False
),
sa.Column("zulip_stream", sa.String(), nullable=True),
sa.Column("zulip_topic", sa.String(), nullable=True),
sa.Column(
"is_locked", sa.Boolean(), server_default=sa.text("false"), nullable=False
"is_locked", sa.Boolean(), server_default=sa.text("0"), nullable=False
),
sa.Column("room_mode", sa.String(), server_default="normal", nullable=False),
sa.Column(

View File

@@ -20,14 +20,11 @@ depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
sourcekind_enum = sa.Enum("room", "live", "file", name="sourcekind")
sourcekind_enum.create(op.get_bind())
op.add_column(
"transcript",
sa.Column(
"source_kind",
sourcekind_enum,
sa.Enum("ROOM", "LIVE", "FILE", name="sourcekind"),
nullable=True,
),
)
@@ -46,8 +43,6 @@ def upgrade() -> None:
def downgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
op.drop_column("transcript", "source_kind")
sourcekind_enum = sa.Enum(name="sourcekind")
sourcekind_enum.drop(op.get_bind())
# ### end Alembic commands ###

View File

@@ -0,0 +1,54 @@
"""dailyco platform
Revision ID: 7e47155afd51
Revises: b7df9609542c
Create Date: 2025-08-04 11:14:19.663115
"""
from typing import Sequence, Union
import sqlalchemy as sa
from alembic import op
# revision identifiers, used by Alembic.
revision: str = "7e47155afd51"
down_revision: Union[str, None] = "b7df9609542c"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
with op.batch_alter_table("meeting", schema=None) as batch_op:
batch_op.add_column(
sa.Column("platform", sa.String(), server_default="whereby", nullable=False)
)
batch_op.drop_index(
batch_op.f("idx_one_active_meeting_per_room"),
sqlite_where=sa.text("is_active = 1"),
)
with op.batch_alter_table("room", schema=None) as batch_op:
batch_op.add_column(
sa.Column("platform", sa.String(), server_default="whereby", nullable=False)
)
# ### end Alembic commands ###
def downgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
with op.batch_alter_table("room", schema=None) as batch_op:
batch_op.drop_column("platform")
with op.batch_alter_table("meeting", schema=None) as batch_op:
batch_op.create_index(
batch_op.f("idx_one_active_meeting_per_room"),
["room_id"],
unique=1,
sqlite_where=sa.text("is_active = 1"),
)
batch_op.drop_column("platform")
# ### end Alembic commands ###

View File

@@ -1,106 +0,0 @@
"""populate_webvtt_from_topics
Revision ID: 8120ebc75366
Revises: 116b2f287eab
Create Date: 2025-08-11 19:11:01.316947
"""
import json
from typing import Sequence, Union
from alembic import op
from sqlalchemy import text
# revision identifiers, used by Alembic.
revision: str = "8120ebc75366"
down_revision: Union[str, None] = "116b2f287eab"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def topics_to_webvtt(topics):
"""Convert topics list to WebVTT format string."""
if not topics:
return None
lines = ["WEBVTT", ""]
for topic in topics:
start_time = format_timestamp(topic.get("start"))
end_time = format_timestamp(topic.get("end"))
text = topic.get("text", "").strip()
if start_time and end_time and text:
lines.append(f"{start_time} --> {end_time}")
lines.append(text)
lines.append("")
return "\n".join(lines).strip()
def format_timestamp(seconds):
"""Format seconds to WebVTT timestamp format (HH:MM:SS.mmm)."""
if seconds is None:
return None
hours = int(seconds // 3600)
minutes = int((seconds % 3600) // 60)
secs = seconds % 60
return f"{hours:02d}:{minutes:02d}:{secs:06.3f}"
def upgrade() -> None:
"""Populate WebVTT field for all transcripts with topics."""
# Get connection
connection = op.get_bind()
# Query all transcripts with topics
result = connection.execute(
text("SELECT id, topics FROM transcript WHERE topics IS NOT NULL")
)
rows = result.fetchall()
print(f"Found {len(rows)} transcripts with topics")
updated_count = 0
error_count = 0
for row in rows:
transcript_id = row[0]
topics_data = row[1]
if not topics_data:
continue
try:
# Parse JSON if it's a string
if isinstance(topics_data, str):
topics_data = json.loads(topics_data)
# Convert topics to WebVTT format
webvtt_content = topics_to_webvtt(topics_data)
if webvtt_content:
# Update the webvtt field
connection.execute(
text("UPDATE transcript SET webvtt = :webvtt WHERE id = :id"),
{"webvtt": webvtt_content, "id": transcript_id},
)
updated_count += 1
print(f"✓ Updated transcript {transcript_id}")
except Exception as e:
error_count += 1
print(f"✗ Error updating transcript {transcript_id}: {e}")
print(f"\nMigration complete!")
print(f" Updated: {updated_count}")
print(f" Errors: {error_count}")
def downgrade() -> None:
"""Clear WebVTT field for all transcripts."""
op.execute(text("UPDATE transcript SET webvtt = NULL"))

View File

@@ -22,7 +22,7 @@ def upgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
op.execute(
"UPDATE transcript SET events = "
'REPLACE(events::text, \'"event": "SUMMARY"\', \'"event": "LONG_SUMMARY"\')::json;'
'REPLACE(events, \'"event": "SUMMARY"\', \'"event": "LONG_SUMMARY"\');'
)
op.alter_column("transcript", "summary", new_column_name="long_summary")
op.add_column("transcript", sa.Column("title", sa.String(), nullable=True))
@@ -34,7 +34,7 @@ def downgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
op.execute(
"UPDATE transcript SET events = "
'REPLACE(events::text, \'"event": "LONG_SUMMARY"\', \'"event": "SUMMARY"\')::json;'
'REPLACE(events, \'"event": "LONG_SUMMARY"\', \'"event": "SUMMARY"\');'
)
with op.batch_alter_table("transcript", schema=None) as batch_op:
batch_op.alter_column("long_summary", nullable=True, new_column_name="summary")

View File

@@ -1,121 +0,0 @@
"""datetime timezone
Revision ID: 9f5c78d352d6
Revises: 8120ebc75366
Create Date: 2025-08-13 19:18:27.113593
"""
from typing import Sequence, Union
import sqlalchemy as sa
from alembic import op
from sqlalchemy.dialects import postgresql
# revision identifiers, used by Alembic.
revision: str = "9f5c78d352d6"
down_revision: Union[str, None] = "8120ebc75366"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
with op.batch_alter_table("meeting", schema=None) as batch_op:
batch_op.alter_column(
"start_date",
existing_type=postgresql.TIMESTAMP(),
type_=sa.DateTime(timezone=True),
existing_nullable=True,
)
batch_op.alter_column(
"end_date",
existing_type=postgresql.TIMESTAMP(),
type_=sa.DateTime(timezone=True),
existing_nullable=True,
)
with op.batch_alter_table("meeting_consent", schema=None) as batch_op:
batch_op.alter_column(
"consent_timestamp",
existing_type=postgresql.TIMESTAMP(),
type_=sa.DateTime(timezone=True),
existing_nullable=False,
)
with op.batch_alter_table("recording", schema=None) as batch_op:
batch_op.alter_column(
"recorded_at",
existing_type=postgresql.TIMESTAMP(),
type_=sa.DateTime(timezone=True),
existing_nullable=False,
)
with op.batch_alter_table("room", schema=None) as batch_op:
batch_op.alter_column(
"created_at",
existing_type=postgresql.TIMESTAMP(),
type_=sa.DateTime(timezone=True),
existing_nullable=False,
)
with op.batch_alter_table("transcript", schema=None) as batch_op:
batch_op.alter_column(
"created_at",
existing_type=postgresql.TIMESTAMP(),
type_=sa.DateTime(timezone=True),
existing_nullable=True,
)
# ### end Alembic commands ###
def downgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
with op.batch_alter_table("transcript", schema=None) as batch_op:
batch_op.alter_column(
"created_at",
existing_type=sa.DateTime(timezone=True),
type_=postgresql.TIMESTAMP(),
existing_nullable=True,
)
with op.batch_alter_table("room", schema=None) as batch_op:
batch_op.alter_column(
"created_at",
existing_type=sa.DateTime(timezone=True),
type_=postgresql.TIMESTAMP(),
existing_nullable=False,
)
with op.batch_alter_table("recording", schema=None) as batch_op:
batch_op.alter_column(
"recorded_at",
existing_type=sa.DateTime(timezone=True),
type_=postgresql.TIMESTAMP(),
existing_nullable=False,
)
with op.batch_alter_table("meeting_consent", schema=None) as batch_op:
batch_op.alter_column(
"consent_timestamp",
existing_type=sa.DateTime(timezone=True),
type_=postgresql.TIMESTAMP(),
existing_nullable=False,
)
with op.batch_alter_table("meeting", schema=None) as batch_op:
batch_op.alter_column(
"end_date",
existing_type=sa.DateTime(timezone=True),
type_=postgresql.TIMESTAMP(),
existing_nullable=True,
)
batch_op.alter_column(
"start_date",
existing_type=sa.DateTime(timezone=True),
type_=postgresql.TIMESTAMP(),
existing_nullable=True,
)
# ### end Alembic commands ###

View File

@@ -25,7 +25,7 @@ def upgrade() -> None:
sa.Column(
"is_shared",
sa.Boolean(),
server_default=sa.text("false"),
server_default=sa.text("0"),
nullable=False,
),
)

View File

@@ -23,10 +23,7 @@ def upgrade() -> None:
with op.batch_alter_table("meeting", schema=None) as batch_op:
batch_op.add_column(
sa.Column(
"is_active",
sa.Boolean(),
server_default=sa.text("true"),
nullable=False,
"is_active", sa.Boolean(), server_default=sa.text("1"), nullable=False
)
)

View File

@@ -23,7 +23,7 @@ def upgrade() -> None:
op.add_column(
"transcript",
sa.Column(
"reviewed", sa.Boolean(), server_default=sa.text("false"), nullable=False
"reviewed", sa.Boolean(), server_default=sa.text("0"), nullable=False
),
)
# ### end Alembic commands ###

View File

@@ -1,34 +0,0 @@
"""add_grace_period_fields_to_meeting
Revision ID: d4a1c446458c
Revises: 6025e9b2bef2
Create Date: 2025-08-18 18:50:37.768052
"""
from typing import Sequence, Union
import sqlalchemy as sa
from alembic import op
# revision identifiers, used by Alembic.
revision: str = "d4a1c446458c"
down_revision: Union[str, None] = "6025e9b2bef2"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# Add fields to track when participants left for grace period logic
op.add_column(
"meeting", sa.Column("last_participant_left_at", sa.DateTime(timezone=True))
)
op.add_column(
"meeting",
sa.Column("grace_period_minutes", sa.Integer, server_default=sa.text("15")),
)
def downgrade() -> None:
op.drop_column("meeting", "grace_period_minutes")
op.drop_column("meeting", "last_participant_left_at")

View File

@@ -34,14 +34,12 @@ dependencies = [
"python-multipart>=0.0.6",
"faster-whisper>=0.10.0",
"transformers>=4.36.2",
"black==24.1.1",
"jsonschema>=4.23.0",
"openai>=1.59.7",
"psycopg2-binary>=2.9.10",
"llama-index>=0.12.52",
"llama-index-llms-openai-like>=0.4.0",
"pytest-env>=1.1.5",
"webvtt-py>=0.5.0",
"icalendar>=6.0.0",
]
[dependency-groups]
@@ -58,8 +56,6 @@ tests = [
"httpx-ws>=0.4.1",
"pytest-httpx>=0.23.1",
"pytest-celery>=0.0.0",
"pytest-docker>=3.2.3",
"asgi-lifespan>=2.1.0",
]
aws = ["aioboto3>=11.2.0"]
evaluation = [
@@ -87,25 +83,10 @@ packages = ["reflector"]
[tool.coverage.run]
source = ["reflector"]
[tool.pytest_env]
ENVIRONMENT = "pytest"
DATABASE_URL = "postgresql://test_user:test_password@localhost:15432/reflector_test"
[tool.pytest.ini_options]
addopts = "-ra -q --disable-pytest-warnings --cov --cov-report html -v"
testpaths = ["tests"]
asyncio_mode = "auto"
[tool.ruff.lint]
select = [
"I", # isort - import sorting
"F401", # unused imports
"PLC0415", # import-outside-top-level - detect inline imports
]
[tool.ruff.lint.per-file-ignores]
"reflector/processors/summary/summary_builder.py" = ["E501"]
"gpu/**.py" = ["PLC0415"]
"reflector/tools/**.py" = ["PLC0415"]
"migrations/versions/**.py" = ["PLC0415"]
"tests/**.py" = ["PLC0415"]

View File

@@ -12,6 +12,7 @@ from reflector.events import subscribers_shutdown, subscribers_startup
from reflector.logger import logger
from reflector.metrics import metrics_init
from reflector.settings import settings
from reflector.views.daily import router as daily_router
from reflector.views.meetings import router as meetings_router
from reflector.views.rooms import router as rooms_router
from reflector.views.rtc_offer import router as rtc_offer_router
@@ -86,6 +87,7 @@ app.include_router(transcripts_process_router, prefix="/v1")
app.include_router(user_router, prefix="/v1")
app.include_router(zulip_router, prefix="/v1")
app.include_router(whereby_router, prefix="/v1")
app.include_router(daily_router, prefix="/v1")
add_pagination(app)
# prepare celery

View File

@@ -1,48 +1,29 @@
import contextvars
from typing import Optional
import databases
import sqlalchemy
from reflector.events import subscribers_shutdown, subscribers_startup
from reflector.settings import settings
database = databases.Database(settings.DATABASE_URL)
metadata = sqlalchemy.MetaData()
_database_context: contextvars.ContextVar[Optional[databases.Database]] = (
contextvars.ContextVar("database", default=None)
)
def get_database() -> databases.Database:
"""Get database instance for current asyncio context"""
db = _database_context.get()
if db is None:
db = databases.Database(settings.DATABASE_URL)
_database_context.set(db)
return db
# import models
import reflector.db.calendar_events # noqa
import reflector.db.meetings # noqa
import reflector.db.recordings # noqa
import reflector.db.rooms # noqa
import reflector.db.transcripts # noqa
kwargs = {}
if "postgres" not in settings.DATABASE_URL:
raise Exception("Only postgres database is supported in reflector")
if "sqlite" in settings.DATABASE_URL:
kwargs["connect_args"] = {"check_same_thread": False}
engine = sqlalchemy.create_engine(settings.DATABASE_URL, **kwargs)
@subscribers_startup.append
async def database_connect(_):
database = get_database()
await database.connect()
@subscribers_shutdown.append
async def database_disconnect(_):
database = get_database()
await database.disconnect()

View File

@@ -1,193 +0,0 @@
from datetime import datetime, timezone
from typing import Any
import sqlalchemy as sa
from pydantic import BaseModel, Field
from sqlalchemy.dialects.postgresql import JSONB
from reflector.db import get_database, metadata
from reflector.utils import generate_uuid4
calendar_events = sa.Table(
"calendar_event",
metadata,
sa.Column("id", sa.String, primary_key=True),
sa.Column(
"room_id",
sa.String,
sa.ForeignKey("room.id", ondelete="CASCADE"),
nullable=False,
),
sa.Column("ics_uid", sa.Text, nullable=False),
sa.Column("title", sa.Text),
sa.Column("description", sa.Text),
sa.Column("start_time", sa.DateTime(timezone=True), nullable=False),
sa.Column("end_time", sa.DateTime(timezone=True), nullable=False),
sa.Column("attendees", JSONB),
sa.Column("location", sa.Text),
sa.Column("ics_raw_data", sa.Text),
sa.Column("last_synced", sa.DateTime(timezone=True), nullable=False),
sa.Column("is_deleted", sa.Boolean, nullable=False, server_default=sa.false()),
sa.Column("created_at", sa.DateTime(timezone=True), nullable=False),
sa.Column("updated_at", sa.DateTime(timezone=True), nullable=False),
sa.UniqueConstraint("room_id", "ics_uid", name="uq_room_calendar_event"),
sa.Index("idx_calendar_event_room_start", "room_id", "start_time"),
sa.Index(
"idx_calendar_event_deleted",
"is_deleted",
postgresql_where=sa.text("NOT is_deleted"),
),
)
class CalendarEvent(BaseModel):
id: str = Field(default_factory=generate_uuid4)
room_id: str
ics_uid: str
title: str | None = None
description: str | None = None
start_time: datetime
end_time: datetime
attendees: list[dict[str, Any]] | None = None
location: str | None = None
ics_raw_data: str | None = None
last_synced: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))
is_deleted: bool = False
created_at: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))
updated_at: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))
class CalendarEventController:
async def get_by_room(
self,
room_id: str,
include_deleted: bool = False,
start_after: datetime | None = None,
end_before: datetime | None = None,
) -> list[CalendarEvent]:
"""Get calendar events for a room."""
query = calendar_events.select().where(calendar_events.c.room_id == room_id)
if not include_deleted:
query = query.where(calendar_events.c.is_deleted == False)
if start_after:
query = query.where(calendar_events.c.start_time >= start_after)
if end_before:
query = query.where(calendar_events.c.end_time <= end_before)
query = query.order_by(calendar_events.c.start_time.asc())
results = await get_database().fetch_all(query)
return [CalendarEvent(**result) for result in results]
async def get_upcoming(
self, room_id: str, minutes_ahead: int = 30
) -> list[CalendarEvent]:
"""Get upcoming events for a room within the specified minutes."""
now = datetime.now(timezone.utc)
future_time = now + timedelta(minutes=minutes_ahead)
query = (
calendar_events.select()
.where(
sa.and_(
calendar_events.c.room_id == room_id,
calendar_events.c.is_deleted == False,
calendar_events.c.start_time >= now,
calendar_events.c.start_time <= future_time,
)
)
.order_by(calendar_events.c.start_time.asc())
)
results = await get_database().fetch_all(query)
return [CalendarEvent(**result) for result in results]
async def get_by_ics_uid(self, room_id: str, ics_uid: str) -> CalendarEvent | None:
"""Get a calendar event by its ICS UID."""
query = calendar_events.select().where(
sa.and_(
calendar_events.c.room_id == room_id,
calendar_events.c.ics_uid == ics_uid,
)
)
result = await get_database().fetch_one(query)
return CalendarEvent(**result) if result else None
async def upsert(self, event: CalendarEvent) -> CalendarEvent:
"""Create or update a calendar event."""
existing = await self.get_by_ics_uid(event.room_id, event.ics_uid)
if existing:
# Update existing event
event.id = existing.id
event.created_at = existing.created_at
event.updated_at = datetime.now(timezone.utc)
query = (
calendar_events.update()
.where(calendar_events.c.id == existing.id)
.values(**event.model_dump())
)
else:
# Insert new event
query = calendar_events.insert().values(**event.model_dump())
await get_database().execute(query)
return event
async def soft_delete_missing(
self, room_id: str, current_ics_uids: list[str]
) -> int:
"""Soft delete future events that are no longer in the calendar."""
now = datetime.now(timezone.utc)
# First, get the IDs of events to delete
select_query = calendar_events.select().where(
sa.and_(
calendar_events.c.room_id == room_id,
calendar_events.c.start_time > now,
calendar_events.c.is_deleted == False,
calendar_events.c.ics_uid.notin_(current_ics_uids)
if current_ics_uids
else True,
)
)
to_delete = await get_database().fetch_all(select_query)
delete_count = len(to_delete)
if delete_count > 0:
# Now update them
update_query = (
calendar_events.update()
.where(
sa.and_(
calendar_events.c.room_id == room_id,
calendar_events.c.start_time > now,
calendar_events.c.is_deleted == False,
calendar_events.c.ics_uid.notin_(current_ics_uids)
if current_ics_uids
else True,
)
)
.values(is_deleted=True, updated_at=now)
)
await get_database().execute(update_query)
return delete_count
async def delete_by_room(self, room_id: str) -> int:
"""Hard delete all events for a room (used when room is deleted)."""
query = calendar_events.delete().where(calendar_events.c.room_id == room_id)
result = await get_database().execute(query)
return result.rowcount
# Add missing import
from datetime import timedelta
calendar_events_controller = CalendarEventController()

View File

@@ -1,12 +1,11 @@
from datetime import datetime
from typing import Any, Literal
from typing import Literal
import sqlalchemy as sa
from fastapi import HTTPException
from pydantic import BaseModel, Field
from sqlalchemy.dialects.postgresql import JSONB
from reflector.db import get_database, metadata
from reflector.db import database, metadata
from reflector.db.rooms import Room
from reflector.utils import generate_uuid4
@@ -17,8 +16,8 @@ meetings = sa.Table(
sa.Column("room_name", sa.String),
sa.Column("room_url", sa.String),
sa.Column("host_room_url", sa.String),
sa.Column("start_date", sa.DateTime(timezone=True)),
sa.Column("end_date", sa.DateTime(timezone=True)),
sa.Column("start_date", sa.DateTime),
sa.Column("end_date", sa.DateTime),
sa.Column("user_id", sa.String),
sa.Column("room_id", sa.String),
sa.Column("is_locked", sa.Boolean, nullable=False, server_default=sa.false()),
@@ -43,15 +42,12 @@ meetings = sa.Table(
server_default=sa.true(),
),
sa.Column(
"calendar_event_id",
"platform",
sa.String,
sa.ForeignKey("calendar_event.id", ondelete="SET NULL"),
nullable=False,
server_default="whereby",
),
sa.Column("calendar_metadata", JSONB),
sa.Column("last_participant_left_at", sa.DateTime(timezone=True)),
sa.Column("grace_period_minutes", sa.Integer, server_default=sa.text("15")),
sa.Index("idx_meeting_room_id", "room_id"),
sa.Index("idx_meeting_calendar_event", "calendar_event_id"),
)
meeting_consent = sa.Table(
@@ -61,7 +57,7 @@ meeting_consent = sa.Table(
sa.Column("meeting_id", sa.String, sa.ForeignKey("meeting.id"), nullable=False),
sa.Column("user_id", sa.String),
sa.Column("consent_given", sa.Boolean, nullable=False),
sa.Column("consent_timestamp", sa.DateTime(timezone=True), nullable=False),
sa.Column("consent_timestamp", sa.DateTime, nullable=False),
)
@@ -89,11 +85,7 @@ class Meeting(BaseModel):
"none", "prompt", "automatic", "automatic-2nd-participant"
] = "automatic-2nd-participant"
num_clients: int = 0
is_active: bool = True
calendar_event_id: str | None = None
calendar_metadata: dict[str, Any] | None = None
last_participant_left_at: datetime | None = None
grace_period_minutes: int = 15
platform: Literal["whereby", "daily"] = "whereby"
class MeetingController:
@@ -107,8 +99,6 @@ class MeetingController:
end_date: datetime,
user_id: str,
room: Room,
calendar_event_id: str | None = None,
calendar_metadata: dict[str, Any] | None = None,
):
"""
Create a new meeting
@@ -126,11 +116,10 @@ class MeetingController:
room_mode=room.room_mode,
recording_type=room.recording_type,
recording_trigger=room.recording_trigger,
calendar_event_id=calendar_event_id,
calendar_metadata=calendar_metadata,
platform=room.platform,
)
query = meetings.insert().values(**meeting.model_dump())
await get_database().execute(query)
await database.execute(query)
return meeting
async def get_all_active(self) -> list[Meeting]:
@@ -138,7 +127,7 @@ class MeetingController:
Get active meetings.
"""
query = meetings.select().where(meetings.c.is_active)
return await get_database().fetch_all(query)
return await database.fetch_all(query)
async def get_by_room_name(
self,
@@ -148,7 +137,7 @@ class MeetingController:
Get a meeting by room name.
"""
query = meetings.select().where(meetings.c.room_name == room_name)
result = await get_database().fetch_one(query)
result = await database.fetch_one(query)
if not result:
return None
@@ -157,7 +146,6 @@ class MeetingController:
async def get_active(self, room: Room, current_time: datetime) -> Meeting:
"""
Get latest active meeting for a room.
For backward compatibility, returns the most recent active meeting.
"""
end_date = getattr(meetings.c, "end_date")
query = (
@@ -171,59 +159,18 @@ class MeetingController:
)
.order_by(end_date.desc())
)
result = await get_database().fetch_one(query)
result = await database.fetch_one(query)
if not result:
return None
return Meeting(**result)
async def get_all_active_for_room(
self, room: Room, current_time: datetime
) -> list[Meeting]:
"""
Get all active meetings for a room.
This supports multiple concurrent meetings per room.
"""
end_date = getattr(meetings.c, "end_date")
query = (
meetings.select()
.where(
sa.and_(
meetings.c.room_id == room.id,
meetings.c.end_date > current_time,
meetings.c.is_active,
)
)
.order_by(end_date.desc())
)
results = await get_database().fetch_all(query)
return [Meeting(**result) for result in results]
async def get_active_by_calendar_event(
self, room: Room, calendar_event_id: str, current_time: datetime
) -> Meeting | None:
"""
Get active meeting for a specific calendar event.
"""
query = meetings.select().where(
sa.and_(
meetings.c.room_id == room.id,
meetings.c.calendar_event_id == calendar_event_id,
meetings.c.end_date > current_time,
meetings.c.is_active,
)
)
result = await get_database().fetch_one(query)
if not result:
return None
return Meeting(**result)
async def get_by_id(self, meeting_id: str, **kwargs) -> Meeting | None:
"""
Get a meeting by id
"""
query = meetings.select().where(meetings.c.id == meeting_id)
result = await get_database().fetch_one(query)
result = await database.fetch_one(query)
if not result:
return None
return Meeting(**result)
@@ -235,7 +182,7 @@ class MeetingController:
If not found, it will raise a 404 error.
"""
query = meetings.select().where(meetings.c.id == meeting_id)
result = await get_database().fetch_one(query)
result = await database.fetch_one(query)
if not result:
raise HTTPException(status_code=404, detail="Meeting not found")
@@ -245,18 +192,9 @@ class MeetingController:
return meeting
async def get_by_calendar_event(self, calendar_event_id: str) -> Meeting | None:
query = meetings.select().where(
meetings.c.calendar_event_id == calendar_event_id
)
result = await get_database().fetch_one(query)
if not result:
return None
return Meeting(**result)
async def update_meeting(self, meeting_id: str, **kwargs):
query = meetings.update().where(meetings.c.id == meeting_id).values(**kwargs)
await get_database().execute(query)
await database.execute(query)
class MeetingConsentController:
@@ -264,7 +202,7 @@ class MeetingConsentController:
query = meeting_consent.select().where(
meeting_consent.c.meeting_id == meeting_id
)
results = await get_database().fetch_all(query)
results = await database.fetch_all(query)
return [MeetingConsent(**result) for result in results]
async def get_by_meeting_and_user(
@@ -275,7 +213,7 @@ class MeetingConsentController:
meeting_consent.c.meeting_id == meeting_id,
meeting_consent.c.user_id == user_id,
)
result = await get_database().fetch_one(query)
result = await database.fetch_one(query)
if result is None:
return None
return MeetingConsent(**result) if result else None
@@ -297,14 +235,14 @@ class MeetingConsentController:
consent_timestamp=consent.consent_timestamp,
)
)
await get_database().execute(query)
await database.execute(query)
existing.consent_given = consent.consent_given
existing.consent_timestamp = consent.consent_timestamp
return existing
query = meeting_consent.insert().values(**consent.model_dump())
await get_database().execute(query)
await database.execute(query)
return consent
async def has_any_denial(self, meeting_id: str) -> bool:
@@ -313,7 +251,7 @@ class MeetingConsentController:
meeting_consent.c.meeting_id == meeting_id,
meeting_consent.c.consent_given.is_(False),
)
result = await get_database().fetch_one(query)
result = await database.fetch_one(query)
return result is not None

View File

@@ -4,7 +4,7 @@ from typing import Literal
import sqlalchemy as sa
from pydantic import BaseModel, Field
from reflector.db import get_database, metadata
from reflector.db import database, metadata
from reflector.utils import generate_uuid4
recordings = sa.Table(
@@ -13,7 +13,7 @@ recordings = sa.Table(
sa.Column("id", sa.String, primary_key=True),
sa.Column("bucket_name", sa.String, nullable=False),
sa.Column("object_key", sa.String, nullable=False),
sa.Column("recorded_at", sa.DateTime(timezone=True), nullable=False),
sa.Column("recorded_at", sa.DateTime, nullable=False),
sa.Column(
"status",
sa.String,
@@ -37,12 +37,12 @@ class Recording(BaseModel):
class RecordingController:
async def create(self, recording: Recording):
query = recordings.insert().values(**recording.model_dump())
await get_database().execute(query)
await database.execute(query)
return recording
async def get_by_id(self, id: str) -> Recording:
query = recordings.select().where(recordings.c.id == id)
result = await get_database().fetch_one(query)
result = await database.fetch_one(query)
return Recording(**result) if result else None
async def get_by_object_key(self, bucket_name: str, object_key: str) -> Recording:
@@ -50,12 +50,8 @@ class RecordingController:
recordings.c.bucket_name == bucket_name,
recordings.c.object_key == object_key,
)
result = await get_database().fetch_one(query)
result = await database.fetch_one(query)
return Recording(**result) if result else None
async def remove_by_id(self, id: str) -> None:
query = recordings.delete().where(recordings.c.id == id)
await get_database().execute(query)
recordings_controller = RecordingController()

View File

@@ -1,4 +1,4 @@
from datetime import datetime, timezone
from datetime import datetime
from sqlite3 import IntegrityError
from typing import Literal
@@ -7,7 +7,7 @@ from fastapi import HTTPException
from pydantic import BaseModel, Field
from sqlalchemy.sql import false, or_
from reflector.db import get_database, metadata
from reflector.db import database, metadata
from reflector.utils import generate_uuid4
rooms = sqlalchemy.Table(
@@ -16,7 +16,7 @@ rooms = sqlalchemy.Table(
sqlalchemy.Column("id", sqlalchemy.String, primary_key=True),
sqlalchemy.Column("name", sqlalchemy.String, nullable=False, unique=True),
sqlalchemy.Column("user_id", sqlalchemy.String, nullable=False),
sqlalchemy.Column("created_at", sqlalchemy.DateTime(timezone=True), nullable=False),
sqlalchemy.Column("created_at", sqlalchemy.DateTime, nullable=False),
sqlalchemy.Column(
"zulip_auto_post", sqlalchemy.Boolean, nullable=False, server_default=false()
),
@@ -40,15 +40,10 @@ rooms = sqlalchemy.Table(
sqlalchemy.Column(
"is_shared", sqlalchemy.Boolean, nullable=False, server_default=false()
),
sqlalchemy.Column("ics_url", sqlalchemy.Text),
sqlalchemy.Column("ics_fetch_interval", sqlalchemy.Integer, server_default="300"),
sqlalchemy.Column(
"ics_enabled", sqlalchemy.Boolean, nullable=False, server_default=false()
"platform", sqlalchemy.String, nullable=False, server_default="whereby"
),
sqlalchemy.Column("ics_last_sync", sqlalchemy.DateTime(timezone=True)),
sqlalchemy.Column("ics_last_etag", sqlalchemy.Text),
sqlalchemy.Index("idx_room_is_shared", "is_shared"),
sqlalchemy.Index("idx_room_ics_enabled", "ics_enabled"),
)
@@ -56,7 +51,7 @@ class Room(BaseModel):
id: str = Field(default_factory=generate_uuid4)
name: str
user_id: str
created_at: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))
created_at: datetime = Field(default_factory=datetime.utcnow)
zulip_auto_post: bool = False
zulip_stream: str = ""
zulip_topic: str = ""
@@ -67,11 +62,7 @@ class Room(BaseModel):
"none", "prompt", "automatic", "automatic-2nd-participant"
] = "automatic-2nd-participant"
is_shared: bool = False
ics_url: str | None = None
ics_fetch_interval: int = 300
ics_enabled: bool = False
ics_last_sync: datetime | None = None
ics_last_etag: str | None = None
platform: Literal["whereby", "daily"] = "whereby"
class RoomController:
@@ -105,7 +96,7 @@ class RoomController:
if return_query:
return query
results = await get_database().fetch_all(query)
results = await database.fetch_all(query)
return results
async def add(
@@ -120,9 +111,7 @@ class RoomController:
recording_type: str,
recording_trigger: str,
is_shared: bool,
ics_url: str | None = None,
ics_fetch_interval: int = 300,
ics_enabled: bool = False,
platform: str = "whereby",
):
"""
Add a new room
@@ -138,13 +127,11 @@ class RoomController:
recording_type=recording_type,
recording_trigger=recording_trigger,
is_shared=is_shared,
ics_url=ics_url,
ics_fetch_interval=ics_fetch_interval,
ics_enabled=ics_enabled,
platform=platform,
)
query = rooms.insert().values(**room.model_dump())
try:
await get_database().execute(query)
await database.execute(query)
except IntegrityError:
raise HTTPException(status_code=400, detail="Room name is not unique")
return room
@@ -155,7 +142,7 @@ class RoomController:
"""
query = rooms.update().where(rooms.c.id == room.id).values(**values)
try:
await get_database().execute(query)
await database.execute(query)
except IntegrityError:
raise HTTPException(status_code=400, detail="Room name is not unique")
@@ -170,7 +157,7 @@ class RoomController:
query = rooms.select().where(rooms.c.id == room_id)
if "user_id" in kwargs:
query = query.where(rooms.c.user_id == kwargs["user_id"])
result = await get_database().fetch_one(query)
result = await database.fetch_one(query)
if not result:
return None
return Room(**result)
@@ -182,7 +169,7 @@ class RoomController:
query = rooms.select().where(rooms.c.name == room_name)
if "user_id" in kwargs:
query = query.where(rooms.c.user_id == kwargs["user_id"])
result = await get_database().fetch_one(query)
result = await database.fetch_one(query)
if not result:
return None
return Room(**result)
@@ -194,7 +181,7 @@ class RoomController:
If not found, it will raise a 404 error.
"""
query = rooms.select().where(rooms.c.id == meeting_id)
result = await get_database().fetch_one(query)
result = await database.fetch_one(query)
if not result:
raise HTTPException(status_code=404, detail="Room not found")
@@ -216,7 +203,7 @@ class RoomController:
if user_id is not None and room.user_id != user_id:
return
query = rooms.delete().where(rooms.c.id == room_id)
await get_database().execute(query)
await database.execute(query)
rooms_controller = RoomController()

View File

@@ -1,231 +0,0 @@
"""Search functionality for transcripts and other entities."""
from datetime import datetime
from io import StringIO
from typing import Annotated, Any, Dict
import sqlalchemy
import webvtt
from pydantic import BaseModel, Field, constr, field_serializer
from reflector.db import get_database
from reflector.db.transcripts import SourceKind, transcripts
from reflector.db.utils import is_postgresql
from reflector.logger import logger
DEFAULT_SEARCH_LIMIT = 20
SNIPPET_CONTEXT_LENGTH = 50 # Characters before/after match to include
DEFAULT_SNIPPET_MAX_LENGTH = 150
DEFAULT_MAX_SNIPPETS = 3
SearchQueryBase = constr(min_length=1, strip_whitespace=True)
SearchLimitBase = Annotated[int, Field(ge=1, le=100)]
SearchOffsetBase = Annotated[int, Field(ge=0)]
SearchTotalBase = Annotated[int, Field(ge=0)]
SearchQuery = Annotated[SearchQueryBase, Field(description="Search query text")]
SearchLimit = Annotated[SearchLimitBase, Field(description="Results per page")]
SearchOffset = Annotated[
SearchOffsetBase, Field(description="Number of results to skip")
]
SearchTotal = Annotated[
SearchTotalBase, Field(description="Total number of search results")
]
class SearchParameters(BaseModel):
"""Validated search parameters for full-text search."""
query_text: SearchQuery
limit: SearchLimit = DEFAULT_SEARCH_LIMIT
offset: SearchOffset = 0
user_id: str | None = None
room_id: str | None = None
class SearchResultDB(BaseModel):
"""Intermediate model for validating raw database results."""
id: str = Field(..., min_length=1)
created_at: datetime
status: str = Field(..., min_length=1)
duration: float | None = Field(None, ge=0)
user_id: str | None = None
title: str | None = None
source_kind: SourceKind
room_id: str | None = None
rank: float = Field(..., ge=0, le=1)
class SearchResult(BaseModel):
"""Public search result model with computed fields."""
id: str = Field(..., min_length=1)
title: str | None = None
user_id: str | None = None
room_id: str | None = None
created_at: datetime
status: str = Field(..., min_length=1)
rank: float = Field(..., ge=0, le=1)
duration: float | None = Field(..., ge=0, description="Duration in seconds")
search_snippets: list[str] = Field(
description="Text snippets around search matches"
)
@field_serializer("created_at", when_used="json")
def serialize_datetime(self, dt: datetime) -> str:
if dt.tzinfo is None:
return dt.isoformat() + "Z"
return dt.isoformat()
class SearchController:
"""Controller for search operations across different entities."""
@staticmethod
def _extract_webvtt_text(webvtt_content: str) -> str:
"""Extract plain text from WebVTT content using webvtt library."""
if not webvtt_content:
return ""
try:
buffer = StringIO(webvtt_content)
vtt = webvtt.read_buffer(buffer)
return " ".join(caption.text for caption in vtt if caption.text)
except (webvtt.errors.MalformedFileError, UnicodeDecodeError, ValueError) as e:
logger.warning(f"Failed to parse WebVTT content: {e}", exc_info=e)
return ""
except AttributeError as e:
logger.warning(f"WebVTT parsing error - unexpected format: {e}", exc_info=e)
return ""
@staticmethod
def _generate_snippets(
text: str,
q: SearchQuery,
max_length: int = DEFAULT_SNIPPET_MAX_LENGTH,
max_snippets: int = DEFAULT_MAX_SNIPPETS,
) -> list[str]:
"""Generate multiple snippets around all occurrences of search term."""
if not text or not q:
return []
snippets = []
lower_text = text.lower()
search_lower = q.lower()
last_snippet_end = 0
start_pos = 0
while len(snippets) < max_snippets:
match_pos = lower_text.find(search_lower, start_pos)
if match_pos == -1:
if not snippets and search_lower.split():
first_word = search_lower.split()[0]
match_pos = lower_text.find(first_word, start_pos)
if match_pos == -1:
break
else:
break
snippet_start = max(0, match_pos - SNIPPET_CONTEXT_LENGTH)
snippet_end = min(
len(text), match_pos + max_length - SNIPPET_CONTEXT_LENGTH
)
if snippet_start < last_snippet_end:
start_pos = match_pos + len(search_lower)
continue
snippet = text[snippet_start:snippet_end]
if snippet_start > 0:
snippet = "..." + snippet
if snippet_end < len(text):
snippet = snippet + "..."
snippet = snippet.strip()
if snippet:
snippets.append(snippet)
last_snippet_end = snippet_end
start_pos = match_pos + len(search_lower)
if start_pos >= len(text):
break
return snippets
@classmethod
async def search_transcripts(
cls, params: SearchParameters
) -> tuple[list[SearchResult], int]:
"""
Full-text search for transcripts using PostgreSQL tsvector.
Returns (results, total_count).
"""
if not is_postgresql():
logger.warning(
"Full-text search requires PostgreSQL. Returning empty results."
)
return [], 0
search_query = sqlalchemy.func.websearch_to_tsquery(
"english", params.query_text
)
base_query = sqlalchemy.select(
[
transcripts.c.id,
transcripts.c.title,
transcripts.c.created_at,
transcripts.c.duration,
transcripts.c.status,
transcripts.c.user_id,
transcripts.c.room_id,
transcripts.c.source_kind,
transcripts.c.webvtt,
sqlalchemy.func.ts_rank(
transcripts.c.search_vector_en,
search_query,
32, # normalization flag: rank/(rank+1) for 0-1 range
).label("rank"),
]
).where(transcripts.c.search_vector_en.op("@@")(search_query))
if params.user_id:
base_query = base_query.where(transcripts.c.user_id == params.user_id)
if params.room_id:
base_query = base_query.where(transcripts.c.room_id == params.room_id)
query = (
base_query.order_by(sqlalchemy.desc(sqlalchemy.text("rank")))
.limit(params.limit)
.offset(params.offset)
)
rs = await get_database().fetch_all(query)
count_query = sqlalchemy.select([sqlalchemy.func.count()]).select_from(
base_query.alias("search_results")
)
total = await get_database().fetch_val(count_query)
def _process_result(r) -> SearchResult:
r_dict: Dict[str, Any] = dict(r)
webvtt: str | None = r_dict.pop("webvtt", None)
db_result = SearchResultDB.model_validate(r_dict)
snippets = []
if webvtt:
plain_text = cls._extract_webvtt_text(webvtt)
snippets = cls._generate_snippets(plain_text, params.query_text)
return SearchResult(**db_result.model_dump(), search_snippets=snippets)
results = [_process_result(r) for r in rs]
return results, total
search_controller = SearchController()

View File

@@ -3,7 +3,7 @@ import json
import os
import shutil
from contextlib import asynccontextmanager
from datetime import datetime, timedelta, timezone
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Literal
@@ -11,19 +11,13 @@ import sqlalchemy
from fastapi import HTTPException
from pydantic import BaseModel, ConfigDict, Field, field_serializer
from sqlalchemy import Enum
from sqlalchemy.dialects.postgresql import TSVECTOR
from sqlalchemy.sql import false, or_
from reflector.db import get_database, metadata
from reflector.db.recordings import recordings_controller
from reflector.db.rooms import rooms
from reflector.db.utils import is_postgresql
from reflector.logger import logger
from reflector.db import database, metadata
from reflector.processors.types import Word as ProcessorWord
from reflector.settings import settings
from reflector.storage import get_recordings_storage, get_transcripts_storage
from reflector.storage import get_transcripts_storage
from reflector.utils import generate_uuid4
from reflector.utils.webvtt import topics_to_webvtt
class SourceKind(enum.StrEnum):
@@ -40,7 +34,7 @@ transcripts = sqlalchemy.Table(
sqlalchemy.Column("status", sqlalchemy.String),
sqlalchemy.Column("locked", sqlalchemy.Boolean),
sqlalchemy.Column("duration", sqlalchemy.Float),
sqlalchemy.Column("created_at", sqlalchemy.DateTime(timezone=True)),
sqlalchemy.Column("created_at", sqlalchemy.DateTime),
sqlalchemy.Column("title", sqlalchemy.String),
sqlalchemy.Column("short_summary", sqlalchemy.String),
sqlalchemy.Column("long_summary", sqlalchemy.String),
@@ -82,7 +76,6 @@ transcripts = sqlalchemy.Table(
# same field could've been in recording/meeting, and it's maybe even ok to dupe it at need
sqlalchemy.Column("audio_deleted", sqlalchemy.Boolean),
sqlalchemy.Column("room_id", sqlalchemy.String),
sqlalchemy.Column("webvtt", sqlalchemy.Text),
sqlalchemy.Index("idx_transcript_recording_id", "recording_id"),
sqlalchemy.Index("idx_transcript_user_id", "user_id"),
sqlalchemy.Index("idx_transcript_created_at", "created_at"),
@@ -90,29 +83,6 @@ transcripts = sqlalchemy.Table(
sqlalchemy.Index("idx_transcript_room_id", "room_id"),
)
# Add PostgreSQL-specific full-text search column
# This matches the migration in migrations/versions/116b2f287eab_add_full_text_search.py
if is_postgresql():
transcripts.append_column(
sqlalchemy.Column(
"search_vector_en",
TSVECTOR,
sqlalchemy.Computed(
"setweight(to_tsvector('english', coalesce(title, '')), 'A') || "
"setweight(to_tsvector('english', coalesce(webvtt, '')), 'B')",
persisted=True,
),
)
)
# Add GIN index for the search vector
transcripts.append_constraint(
sqlalchemy.Index(
"idx_transcript_search_vector_en",
"search_vector_en",
postgresql_using="gin",
)
)
def generate_transcript_name() -> str:
now = datetime.now(timezone.utc)
@@ -177,18 +147,14 @@ class TranscriptParticipant(BaseModel):
class Transcript(BaseModel):
"""Full transcript model with all fields."""
id: str = Field(default_factory=generate_uuid4)
user_id: str | None = None
name: str = Field(default_factory=generate_transcript_name)
status: str = "idle"
locked: bool = False
duration: float = 0
created_at: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))
title: str | None = None
source_kind: SourceKind
room_id: str | None = None
locked: bool = False
short_summary: str | None = None
long_summary: str | None = None
topics: list[TranscriptTopic] = []
@@ -202,8 +168,9 @@ class Transcript(BaseModel):
meeting_id: str | None = None
recording_id: str | None = None
zulip_message_id: int | None = None
source_kind: SourceKind
audio_deleted: bool | None = None
webvtt: str | None = None
room_id: str | None = None
@field_serializer("created_at", when_used="json")
def serialize_datetime(self, dt: datetime) -> str:
@@ -304,12 +271,10 @@ class Transcript(BaseModel):
# we need to create an url to be used for diarization
# we can't use the audio_mp3_filename because it's not accessible
# from the diarization processor
from datetime import timedelta
# TODO don't import app in db
from reflector.app import app # noqa: PLC0415
# TODO a util + don''t import views in db
from reflector.views.transcripts import create_access_token # noqa: PLC0415
from reflector.app import app
from reflector.views.transcripts import create_access_token
path = app.url_path_for(
"transcript_get_audio_mp3",
@@ -370,6 +335,7 @@ class TranscriptController:
- `room_id`: filter transcripts by room ID
- `search_term`: filter transcripts by search term
"""
from reflector.db.rooms import rooms
query = transcripts.select().join(
rooms, transcripts.c.room_id == rooms.c.id, isouter=True
@@ -420,7 +386,7 @@ class TranscriptController:
if return_query:
return query
results = await get_database().fetch_all(query)
results = await database.fetch_all(query)
return results
async def get_by_id(self, transcript_id: str, **kwargs) -> Transcript | None:
@@ -430,7 +396,7 @@ class TranscriptController:
query = transcripts.select().where(transcripts.c.id == transcript_id)
if "user_id" in kwargs:
query = query.where(transcripts.c.user_id == kwargs["user_id"])
result = await get_database().fetch_one(query)
result = await database.fetch_one(query)
if not result:
return None
return Transcript(**result)
@@ -444,7 +410,7 @@ class TranscriptController:
query = transcripts.select().where(transcripts.c.recording_id == recording_id)
if "user_id" in kwargs:
query = query.where(transcripts.c.user_id == kwargs["user_id"])
result = await get_database().fetch_one(query)
result = await database.fetch_one(query)
if not result:
return None
return Transcript(**result)
@@ -462,7 +428,7 @@ class TranscriptController:
if order_by.startswith("-"):
field = field.desc()
query = query.order_by(field)
results = await get_database().fetch_all(query)
results = await database.fetch_all(query)
return [Transcript(**result) for result in results]
async def get_by_id_for_http(
@@ -480,7 +446,7 @@ class TranscriptController:
to determine if the user can access the transcript.
"""
query = transcripts.select().where(transcripts.c.id == transcript_id)
result = await get_database().fetch_one(query)
result = await database.fetch_one(query)
if not result:
raise HTTPException(status_code=404, detail="Transcript not found")
@@ -533,52 +499,23 @@ class TranscriptController:
room_id=room_id,
)
query = transcripts.insert().values(**transcript.model_dump())
await get_database().execute(query)
await database.execute(query)
return transcript
# TODO investigate why mutate= is used. it's used in one place currently, maybe because of ORM field updates.
# using mutate=True is discouraged
async def update(
self, transcript: Transcript, values: dict, mutate=False
) -> Transcript:
async def update(self, transcript: Transcript, values: dict, mutate=True):
"""
Update a transcript fields with key/values in values.
Returns a copy of the transcript with updated values.
Update a transcript fields with key/values in values
"""
values = TranscriptController._handle_topics_update(values)
query = (
transcripts.update()
.where(transcripts.c.id == transcript.id)
.values(**values)
)
await get_database().execute(query)
await database.execute(query)
if mutate:
for key, value in values.items():
setattr(transcript, key, value)
updated_transcript = transcript.model_copy(update=values)
return updated_transcript
@staticmethod
def _handle_topics_update(values: dict) -> dict:
"""Auto-update WebVTT when topics are updated."""
if values.get("webvtt") is not None:
logger.warn("trying to update read-only webvtt column")
pass
topics_data = values.get("topics")
if topics_data is None:
return values
return {
**values,
"webvtt": topics_to_webvtt(
[TranscriptTopic(**topic_dict) for topic_dict in topics_data]
),
}
async def remove_by_id(
self,
transcript_id: str,
@@ -592,55 +529,23 @@ class TranscriptController:
return
if user_id is not None and transcript.user_id != user_id:
return
if transcript.audio_location == "storage" and not transcript.audio_deleted:
try:
await get_transcripts_storage().delete_file(
transcript.storage_audio_path
)
except Exception as e:
logger.warning(
"Failed to delete transcript audio from storage",
exc_info=e,
transcript_id=transcript.id,
)
transcript.unlink()
if transcript.recording_id:
try:
recording = await recordings_controller.get_by_id(
transcript.recording_id
)
if recording:
try:
await get_recordings_storage().delete_file(recording.object_key)
except Exception as e:
logger.warning(
"Failed to delete recording object from S3",
exc_info=e,
recording_id=transcript.recording_id,
)
await recordings_controller.remove_by_id(transcript.recording_id)
except Exception as e:
logger.warning(
"Failed to delete recording row",
exc_info=e,
recording_id=transcript.recording_id,
)
query = transcripts.delete().where(transcripts.c.id == transcript_id)
await get_database().execute(query)
await database.execute(query)
async def remove_by_recording_id(self, recording_id: str):
"""
Remove a transcript by recording_id
"""
query = transcripts.delete().where(transcripts.c.recording_id == recording_id)
await get_database().execute(query)
await database.execute(query)
@asynccontextmanager
async def transaction(self):
"""
A context manager for database transaction
"""
async with get_database().transaction(isolation="serializable"):
async with database.transaction(isolation="serializable"):
yield
async def append_event(
@@ -653,7 +558,11 @@ class TranscriptController:
Append an event to a transcript
"""
resp = transcript.add_event(event=event, data=data)
await self.update(transcript, {"events": transcript.events_dump()})
await self.update(
transcript,
{"events": transcript.events_dump()},
mutate=False,
)
return resp
async def upsert_topic(
@@ -665,7 +574,11 @@ class TranscriptController:
Upsert topics to a transcript
"""
transcript.upsert_topic(topic)
await self.update(transcript, {"topics": transcript.topics_dump()})
await self.update(
transcript,
{"topics": transcript.topics_dump()},
mutate=False,
)
async def move_mp3_to_storage(self, transcript: Transcript):
"""
@@ -690,8 +603,7 @@ class TranscriptController:
)
# indicate on the transcript that the audio is now on storage
# mutates transcript argument
await self.update(transcript, {"audio_location": "storage"}, mutate=True)
await self.update(transcript, {"audio_location": "storage"})
# unlink the local file
transcript.audio_mp3_filename.unlink(missing_ok=True)
@@ -715,7 +627,11 @@ class TranscriptController:
Add/update a participant to a transcript
"""
result = transcript.upsert_participant(participant)
await self.update(transcript, {"participants": transcript.participants_dump()})
await self.update(
transcript,
{"participants": transcript.participants_dump()},
mutate=False,
)
return result
async def delete_participant(
@@ -727,7 +643,11 @@ class TranscriptController:
Delete a participant from a transcript
"""
transcript.delete_participant(participant_id)
await self.update(transcript, {"participants": transcript.participants_dump()})
await self.update(
transcript,
{"participants": transcript.participants_dump()},
mutate=False,
)
transcripts_controller = TranscriptController()

View File

@@ -1,9 +0,0 @@
"""Database utility functions."""
from reflector.db import get_database
def is_postgresql() -> bool:
return get_database().url.scheme and get_database().url.scheme.startswith(
"postgresql"
)

View File

@@ -14,15 +14,12 @@ It is directly linked to our data model.
import asyncio
import functools
from contextlib import asynccontextmanager
from typing import Generic
import av
import boto3
from celery import chord, current_task, group, shared_task
from pydantic import BaseModel
from structlog import BoundLogger as Logger
from reflector.db import get_database
from reflector.db.meetings import meeting_consent_controller, meetings_controller
from reflector.db.recordings import recordings_controller
from reflector.db.rooms import rooms_controller
@@ -38,7 +35,7 @@ from reflector.db.transcripts import (
transcripts_controller,
)
from reflector.logger import logger
from reflector.pipelines.runner import PipelineMessage, PipelineRunner
from reflector.pipelines.runner import PipelineRunner
from reflector.processors import (
AudioChunkerProcessor,
AudioDiarizationAutoProcessor,
@@ -50,7 +47,7 @@ from reflector.processors import (
TranscriptFinalTitleProcessor,
TranscriptLinerProcessor,
TranscriptTopicDetectorProcessor,
TranscriptTranslatorAutoProcessor,
TranscriptTranslatorProcessor,
)
from reflector.processors.audio_waveform_processor import AudioWaveformProcessor
from reflector.processors.types import AudioDiarizationInput
@@ -72,7 +69,8 @@ def asynctask(f):
@functools.wraps(f)
def wrapper(*args, **kwargs):
async def run_with_db():
database = get_database()
from reflector.db import database
await database.connect()
try:
return await f(*args, **kwargs)
@@ -146,7 +144,7 @@ class StrValue(BaseModel):
value: str
class PipelineMainBase(PipelineRunner[PipelineMessage], Generic[PipelineMessage]):
class PipelineMainBase(PipelineRunner):
transcript_id: str
ws_room_id: str | None = None
ws_manager: WebsocketManager | None = None
@@ -166,11 +164,7 @@ class PipelineMainBase(PipelineRunner[PipelineMessage], Generic[PipelineMessage]
raise Exception("Transcript not found")
return result
@staticmethod
def wrap_transcript_topics(
topics: list[TranscriptTopic],
) -> list[TitleSummaryWithIdProcessorType]:
# transformation to a pipe-supported format
def get_transcript_topics(self, transcript: Transcript) -> list[TranscriptTopic]:
return [
TitleSummaryWithIdProcessorType(
id=topic.id,
@@ -180,7 +174,7 @@ class PipelineMainBase(PipelineRunner[PipelineMessage], Generic[PipelineMessage]
duration=topic.duration,
transcript=TranscriptProcessorType(words=topic.words),
)
for topic in topics
for topic in transcript.topics
]
@asynccontextmanager
@@ -367,7 +361,7 @@ class PipelineMainLive(PipelineMainBase):
AudioMergeProcessor(),
AudioTranscriptAutoProcessor.as_threaded(),
TranscriptLinerProcessor(),
TranscriptTranslatorAutoProcessor.as_threaded(callback=self.on_transcript),
TranscriptTranslatorProcessor.as_threaded(callback=self.on_transcript),
TranscriptTopicDetectorProcessor.as_threaded(callback=self.on_topic),
]
pipeline = Pipeline(*processors)
@@ -386,7 +380,7 @@ class PipelineMainLive(PipelineMainBase):
pipeline_post(transcript_id=self.transcript_id)
class PipelineMainDiarization(PipelineMainBase[AudioDiarizationInput]):
class PipelineMainDiarization(PipelineMainBase):
"""
Diarize the audio and update topics
"""
@@ -410,10 +404,11 @@ class PipelineMainDiarization(PipelineMainBase[AudioDiarizationInput]):
pipeline.logger.info("Audio is local, skipping diarization")
return
topics = self.get_transcript_topics(transcript)
audio_url = await transcript.get_audio_url()
audio_diarization_input = AudioDiarizationInput(
audio_url=audio_url,
topics=self.wrap_transcript_topics(transcript.topics),
topics=topics,
)
# as tempting to use pipeline.push, prefer to use the runner
@@ -426,7 +421,7 @@ class PipelineMainDiarization(PipelineMainBase[AudioDiarizationInput]):
return pipeline
class PipelineMainFromTopics(PipelineMainBase[TitleSummaryWithIdProcessorType]):
class PipelineMainFromTopics(PipelineMainBase):
"""
Pseudo class for generating a pipeline from topics
"""
@@ -448,7 +443,7 @@ class PipelineMainFromTopics(PipelineMainBase[TitleSummaryWithIdProcessorType]):
pipeline.logger.info(f"{self.__class__.__name__} pipeline created")
# push topics
topics = PipelineMainBase.wrap_transcript_topics(transcript.topics)
topics = self.get_transcript_topics(transcript)
for topic in topics:
await self.push(topic)
@@ -529,6 +524,8 @@ async def pipeline_convert_to_mp3(transcript: Transcript, logger: Logger):
# Convert to mp3
mp3_filename = transcript.audio_mp3_filename
import av
with av.open(wav_filename.as_posix()) as in_container:
in_stream = in_container.streams.audio[0]
with av.open(mp3_filename.as_posix(), "w") as out_container:
@@ -607,7 +604,7 @@ async def cleanup_consent(transcript: Transcript, logger: Logger):
meeting.id
)
except Exception as e:
logger.error(f"Failed to get fetch consent: {e}", exc_info=e)
logger.error(f"Failed to get fetch consent: {e}")
consent_denied = True
if not consent_denied:
@@ -630,7 +627,7 @@ async def cleanup_consent(transcript: Transcript, logger: Logger):
f"Deleted original Whereby recording: {recording.bucket_name}/{recording.object_key}"
)
except Exception as e:
logger.error(f"Failed to delete Whereby recording: {e}", exc_info=e)
logger.error(f"Failed to delete Whereby recording: {e}")
# non-transactional, files marked for deletion not actually deleted is possible
await transcripts_controller.update(transcript, {"audio_deleted": True})
@@ -643,7 +640,7 @@ async def cleanup_consent(transcript: Transcript, logger: Logger):
f"Deleted processed audio from storage: {transcript.storage_audio_path}"
)
except Exception as e:
logger.error(f"Failed to delete processed audio: {e}", exc_info=e)
logger.error(f"Failed to delete processed audio: {e}")
# 3. Delete local audio files
try:
@@ -652,7 +649,7 @@ async def cleanup_consent(transcript: Transcript, logger: Logger):
if hasattr(transcript, "audio_wav_filename") and transcript.audio_wav_filename:
transcript.audio_wav_filename.unlink(missing_ok=True)
except Exception as e:
logger.error(f"Failed to delete local audio files: {e}", exc_info=e)
logger.error(f"Failed to delete local audio files: {e}")
logger.info("Consent cleanup done")
@@ -797,6 +794,8 @@ def pipeline_post(*, transcript_id: str):
@get_transcript
async def pipeline_process(transcript: Transcript, logger: Logger):
import av
try:
if transcript.audio_location == "storage":
await transcripts_controller.download_mp3_from_storage(transcript)

View File

@@ -16,17 +16,14 @@ During its lifecycle, it will emit the following status:
"""
import asyncio
from typing import Generic, TypeVar
from pydantic import BaseModel, ConfigDict
from reflector.logger import logger
from reflector.processors import Pipeline
PipelineMessage = TypeVar("PipelineMessage")
class PipelineRunner(BaseModel, Generic[PipelineMessage]):
class PipelineRunner(BaseModel):
model_config = ConfigDict(arbitrary_types_allowed=True)
status: str = "idle"
@@ -70,7 +67,7 @@ class PipelineRunner(BaseModel, Generic[PipelineMessage]):
coro = self.run()
asyncio.run(coro)
async def push(self, data: PipelineMessage):
async def push(self, data):
"""
Push data to the pipeline
"""
@@ -95,11 +92,7 @@ class PipelineRunner(BaseModel, Generic[PipelineMessage]):
pass
async def _add_cmd(
self,
cmd: str,
data: PipelineMessage,
max_retries: int = 3,
retry_time_limit: int = 3,
self, cmd: str, data, max_retries: int = 3, retry_time_limit: int = 3
):
"""
Enqueue a command to be executed in the runner.
@@ -150,9 +143,6 @@ class PipelineRunner(BaseModel, Generic[PipelineMessage]):
cmd, data = await self._q_cmd.get()
func = getattr(self, f"cmd_{cmd.lower()}")
if func:
if cmd.upper() == "FLUSH":
await func()
else:
await func(data)
else:
raise Exception(f"Unknown command {cmd}")
@@ -162,13 +152,13 @@ class PipelineRunner(BaseModel, Generic[PipelineMessage]):
self._ev_done.set()
raise
async def cmd_push(self, data: PipelineMessage):
async def cmd_push(self, data):
if self._is_first_push:
await self._set_status("push")
self._is_first_push = False
await self.pipeline.push(data)
async def cmd_flush(self):
async def cmd_flush(self, data):
await self._set_status("flush")
await self.pipeline.flush()
await self._set_status("ended")

View File

@@ -16,7 +16,6 @@ from .transcript_final_title import TranscriptFinalTitleProcessor # noqa: F401
from .transcript_liner import TranscriptLinerProcessor # noqa: F401
from .transcript_topic_detector import TranscriptTopicDetectorProcessor # noqa: F401
from .transcript_translator import TranscriptTranslatorProcessor # noqa: F401
from .transcript_translator_auto import TranscriptTranslatorAutoProcessor # noqa: F401
from .types import ( # noqa: F401
AudioFile,
FinalLongSummary,

View File

@@ -1,9 +1,5 @@
from reflector.processors.base import Processor
from reflector.processors.types import (
AudioDiarizationInput,
TitleSummary,
Word,
)
from reflector.processors.types import AudioDiarizationInput, TitleSummary, Word
class AudioDiarizationProcessor(Processor):

View File

@@ -10,17 +10,12 @@ class AudioDiarizationModalProcessor(AudioDiarizationProcessor):
INPUT_TYPE = AudioDiarizationInput
OUTPUT_TYPE = TitleSummary
def __init__(self, modal_api_key: str | None = None, **kwargs):
def __init__(self, **kwargs):
super().__init__(**kwargs)
if not settings.DIARIZATION_URL:
raise Exception(
"DIARIZATION_URL required to use AudioDiarizationModalProcessor"
)
self.diarization_url = settings.DIARIZATION_URL + "/diarize"
self.modal_api_key = modal_api_key
self.headers = {}
if self.modal_api_key:
self.headers["Authorization"] = f"Bearer {self.modal_api_key}"
self.headers = {
"Authorization": f"Bearer {settings.LLM_MODAL_API_KEY}",
}
async def _diarize(self, data: AudioDiarizationInput):
# Gather diarization data

View File

@@ -21,20 +21,16 @@ from reflector.settings import settings
class AudioTranscriptModalProcessor(AudioTranscriptProcessor):
def __init__(self, modal_api_key: str | None = None, **kwargs):
def __init__(self, modal_api_key: str):
super().__init__()
if not settings.TRANSCRIPT_URL:
raise Exception(
"TRANSCRIPT_URL required to use AudioTranscriptModalProcessor"
)
self.transcript_url = settings.TRANSCRIPT_URL + "/v1"
self.timeout = settings.TRANSCRIPT_TIMEOUT
self.modal_api_key = modal_api_key
self.api_key = settings.TRANSCRIPT_MODAL_API_KEY
async def _transcript(self, data: AudioFile):
async with AsyncOpenAI(
base_url=self.transcript_url,
api_key=self.modal_api_key,
api_key=self.api_key,
timeout=self.timeout,
) as client:
self.logger.debug(f"Try to transcribe audio {data.name}")

View File

@@ -6,7 +6,7 @@ This script is used to generate a summary of a meeting notes transcript.
import asyncio
import sys
from datetime import datetime, timezone
from datetime import datetime
from enum import Enum
from textwrap import dedent
from typing import Type, TypeVar
@@ -474,7 +474,7 @@ if __name__ == "__main__":
if args.save:
# write the summary to a file, on the format summary-<iso date>.md
filename = f"summary-{datetime.now(timezone.utc).isoformat()}.md"
filename = f"summary-{datetime.now().isoformat()}.md"
with open(filename, "w", encoding="utf-8") as f:
f.write(sm.as_markdown())

View File

@@ -1,5 +1,9 @@
import httpx
from reflector.processors.base import Processor
from reflector.processors.types import Transcript
from reflector.processors.types import Transcript, TranslationLanguages
from reflector.settings import settings
from reflector.utils.retry import retry
class TranscriptTranslatorProcessor(Processor):
@@ -13,23 +17,56 @@ class TranscriptTranslatorProcessor(Processor):
def __init__(self, **kwargs):
super().__init__(**kwargs)
self.transcript = None
self.translate_url = settings.TRANSLATE_URL
self.timeout = settings.TRANSLATE_TIMEOUT
self.headers = {"Authorization": f"Bearer {settings.TRANSCRIPT_MODAL_API_KEY}"}
async def _push(self, data: Transcript):
self.transcript = data
await self.flush()
async def _translate(self, text: str) -> str | None:
raise NotImplementedError
async def _flush(self):
if not self.transcript:
return
async def get_translation(self, text: str) -> str | None:
# FIXME this should be a processor after, as each user may want
# different languages
source_language = self.get_pref("audio:source_language", "en")
target_language = self.get_pref("audio:target_language", "en")
if source_language == target_language:
self.transcript.translation = None
else:
self.transcript.translation = await self._translate(self.transcript.text)
return
languages = TranslationLanguages()
# Only way to set the target should be the UI element like dropdown.
# Hence, this assert should never fail.
assert languages.is_supported(target_language)
self.logger.debug(f"Try to translate {text=}")
json_payload = {
"text": text,
"source_language": source_language,
"target_language": target_language,
}
async with httpx.AsyncClient() as client:
response = await retry(client.post)(
self.translate_url + "/translate",
headers=self.headers,
params=json_payload,
timeout=self.timeout,
follow_redirects=True,
logger=self.logger,
)
response.raise_for_status()
result = response.json()["text"]
# Sanity check for translation status in the result
if target_language in result:
translation = result[target_language]
self.logger.debug(f"Translation response: {text=}, {translation=}")
return translation
async def _flush(self):
if not self.transcript:
return
self.transcript.translation = await self.get_translation(
text=self.transcript.text
)
await self.emit(self.transcript)

View File

@@ -1,32 +0,0 @@
import importlib
from reflector.processors.transcript_translator import TranscriptTranslatorProcessor
from reflector.settings import settings
class TranscriptTranslatorAutoProcessor(TranscriptTranslatorProcessor):
_registry = {}
@classmethod
def register(cls, name, kclass):
cls._registry[name] = kclass
def __new__(cls, name: str | None = None, **kwargs):
if name is None:
name = settings.TRANSLATION_BACKEND
if name not in cls._registry:
module_name = f"reflector.processors.transcript_translator_{name}"
importlib.import_module(module_name)
# gather specific configuration for the processor
# search `TRANSLATION_BACKEND_XXX_YYY`, push to constructor as `backend_xxx_yyy`
config = {}
name_upper = name.upper()
settings_prefix = "TRANSLATION_"
config_prefix = f"{settings_prefix}{name_upper}_"
for key, value in settings:
if key.startswith(config_prefix):
config_name = key[len(settings_prefix) :].lower()
config[config_name] = value
return cls._registry[name](**config | kwargs)

View File

@@ -1,66 +0,0 @@
import httpx
from reflector.processors.transcript_translator import TranscriptTranslatorProcessor
from reflector.processors.transcript_translator_auto import (
TranscriptTranslatorAutoProcessor,
)
from reflector.processors.types import TranslationLanguages
from reflector.settings import settings
from reflector.utils.retry import retry
class TranscriptTranslatorModalProcessor(TranscriptTranslatorProcessor):
"""
Translate the transcript into the target language using Modal.com
"""
def __init__(self, modal_api_key: str | None = None, **kwargs):
super().__init__(**kwargs)
if not settings.TRANSLATE_URL:
raise Exception(
"TRANSLATE_URL is required for TranscriptTranslatorModalProcessor"
)
self.translate_url = settings.TRANSLATE_URL
self.timeout = settings.TRANSLATE_TIMEOUT
self.modal_api_key = modal_api_key
self.headers = {}
if self.modal_api_key:
self.headers["Authorization"] = f"Bearer {self.modal_api_key}"
async def _translate(self, text: str) -> str | None:
source_language = self.get_pref("audio:source_language", "en")
target_language = self.get_pref("audio:target_language", "en")
languages = TranslationLanguages()
# Only way to set the target should be the UI element like dropdown.
# Hence, this assert should never fail.
assert languages.is_supported(target_language)
self.logger.debug(f"Try to translate {text=}")
json_payload = {
"text": text,
"source_language": source_language,
"target_language": target_language,
}
async with httpx.AsyncClient() as client:
response = await retry(client.post)(
self.translate_url + "/translate",
headers=self.headers,
params=json_payload,
timeout=self.timeout,
follow_redirects=True,
logger=self.logger,
)
response.raise_for_status()
result = response.json()["text"]
# Sanity check for translation status in the result
if target_language in result:
translation = result[target_language]
else:
translation = None
self.logger.debug(f"Translation response: {text=}, {translation=}")
return translation
TranscriptTranslatorAutoProcessor.register("modal", TranscriptTranslatorModalProcessor)

View File

@@ -1,14 +0,0 @@
from reflector.processors.transcript_translator import TranscriptTranslatorProcessor
from reflector.processors.transcript_translator_auto import (
TranscriptTranslatorAutoProcessor,
)
class TranscriptTranslatorPassthroughProcessor(TranscriptTranslatorProcessor):
async def _translate(self, text: str) -> None:
return None
TranscriptTranslatorAutoProcessor.register(
"passthrough", TranscriptTranslatorPassthroughProcessor
)

View File

@@ -2,10 +2,9 @@ import io
import re
import tempfile
from pathlib import Path
from typing import Annotated
from profanityfilter import ProfanityFilter
from pydantic import BaseModel, Field, PrivateAttr
from pydantic import BaseModel, PrivateAttr
from reflector.redis_cache import redis_cache
@@ -49,70 +48,20 @@ class AudioFile(BaseModel):
self._path.unlink()
# non-negative seconds with float part
Seconds = Annotated[float, Field(ge=0.0, description="Time in seconds with float part")]
class Word(BaseModel):
text: str
start: Seconds
end: Seconds
start: float
end: float
speaker: int = 0
class TranscriptSegment(BaseModel):
text: str
start: Seconds
end: Seconds
start: float
end: float
speaker: int = 0
def words_to_segments(words: list[Word]) -> list[TranscriptSegment]:
# from a list of word, create a list of segments
# join the word that are less than 2 seconds apart
# but separate if the speaker changes, or if the punctuation is a . , ; : ? !
segments = []
current_segment = None
MAX_SEGMENT_LENGTH = 120
for word in words:
if current_segment is None:
current_segment = TranscriptSegment(
text=word.text,
start=word.start,
end=word.end,
speaker=word.speaker,
)
continue
# If the word is attach to another speaker, push the current segment
# and start a new one
if word.speaker != current_segment.speaker:
segments.append(current_segment)
current_segment = TranscriptSegment(
text=word.text,
start=word.start,
end=word.end,
speaker=word.speaker,
)
continue
# if the word is the end of a sentence, and we have enough content,
# add the word to the current segment and push it
current_segment.text += word.text
current_segment.end = word.end
have_punc = PUNC_RE.search(word.text)
if have_punc and (len(current_segment.text) > MAX_SEGMENT_LENGTH):
segments.append(current_segment)
current_segment = None
if current_segment:
segments.append(current_segment)
return segments
class Transcript(BaseModel):
translation: str | None = None
words: list[Word] = None
@@ -168,7 +117,49 @@ class Transcript(BaseModel):
return Transcript(text=self.text, translation=self.translation, words=words)
def as_segments(self) -> list[TranscriptSegment]:
return words_to_segments(self.words)
# from a list of word, create a list of segments
# join the word that are less than 2 seconds apart
# but separate if the speaker changes, or if the punctuation is a . , ; : ? !
segments = []
current_segment = None
MAX_SEGMENT_LENGTH = 120
for word in self.words:
if current_segment is None:
current_segment = TranscriptSegment(
text=word.text,
start=word.start,
end=word.end,
speaker=word.speaker,
)
continue
# If the word is attach to another speaker, push the current segment
# and start a new one
if word.speaker != current_segment.speaker:
segments.append(current_segment)
current_segment = TranscriptSegment(
text=word.text,
start=word.start,
end=word.end,
speaker=word.speaker,
)
continue
# if the word is the end of a sentence, and we have enough content,
# add the word to the current segment and push it
current_segment.text += word.text
current_segment.end = word.end
have_punc = PUNC_RE.search(word.text)
if have_punc and (len(current_segment.text) > MAX_SEGMENT_LENGTH):
segments.append(current_segment)
current_segment = None
if current_segment:
segments.append(current_segment)
return segments
class TitleSummary(BaseModel):

View File

@@ -1,296 +0,0 @@
import hashlib
from datetime import date, datetime, timedelta, timezone
from typing import TypedDict
import httpx
import pytz
from icalendar import Calendar, Event
from loguru import logger
from reflector.db.calendar_events import CalendarEvent, calendar_events_controller
from reflector.db.rooms import Room, rooms_controller
from reflector.settings import settings
class AttendeeData(TypedDict, total=False):
email: str | None
name: str | None
status: str | None
role: str | None
class EventData(TypedDict):
ics_uid: str
title: str | None
description: str | None
location: str | None
start_time: datetime
end_time: datetime
attendees: list[AttendeeData]
ics_raw_data: str
class SyncStats(TypedDict):
events_created: int
events_updated: int
events_deleted: int
class ICSFetchService:
def __init__(self):
self.client = httpx.AsyncClient(
timeout=30.0, headers={"User-Agent": "Reflector/1.0"}
)
async def fetch_ics(self, url: str) -> str:
response = await self.client.get(url)
response.raise_for_status()
return response.text
def parse_ics(self, ics_content: str) -> Calendar:
return Calendar.from_ical(ics_content)
def extract_room_events(
self, calendar: Calendar, room_name: str, room_url: str
) -> list[EventData]:
events = []
now = datetime.now(timezone.utc)
window_start = now - timedelta(hours=1)
window_end = now + timedelta(hours=24)
for component in calendar.walk():
if component.name == "VEVENT":
# Skip cancelled events
status = component.get("STATUS", "").upper()
if status == "CANCELLED":
continue
# Check if event matches this room
if self._event_matches_room(component, room_name, room_url):
event_data = self._parse_event(component)
# Only include events in our time window
if (
event_data
and window_start <= event_data["start_time"] <= window_end
):
events.append(event_data)
return events
def _event_matches_room(self, event: Event, room_name: str, room_url: str) -> bool:
location = str(event.get("LOCATION", ""))
description = str(event.get("DESCRIPTION", ""))
# Only match full room URL (with or without protocol)
patterns = [
room_url, # Full URL with protocol
room_url.replace("https://", ""), # Without https protocol
room_url.replace("http://", ""), # Without http protocol
]
# Check location and description for patterns
text_to_check = f"{location} {description}".lower()
for pattern in patterns:
if pattern.lower() in text_to_check:
return True
return False
def _parse_event(self, event: Event) -> EventData | None:
# Extract basic fields
uid = str(event.get("UID", ""))
summary = str(event.get("SUMMARY", ""))
description = str(event.get("DESCRIPTION", ""))
location = str(event.get("LOCATION", ""))
# Parse dates
dtstart = event.get("DTSTART")
dtend = event.get("DTEND")
if not dtstart:
return None
# Convert to datetime
start_time = self._normalize_datetime(
dtstart.dt if hasattr(dtstart, "dt") else dtstart
)
end_time = (
self._normalize_datetime(dtend.dt if hasattr(dtend, "dt") else dtend)
if dtend
else start_time + timedelta(hours=1)
)
# Parse attendees
attendees = self._parse_attendees(event)
# Get raw event data for storage
raw_data = event.to_ical().decode("utf-8")
return {
"ics_uid": uid,
"title": summary,
"description": description,
"location": location,
"start_time": start_time,
"end_time": end_time,
"attendees": attendees,
"ics_raw_data": raw_data,
}
def _normalize_datetime(self, dt) -> datetime:
# Handle date objects (all-day events)
if isinstance(dt, date) and not isinstance(dt, datetime):
# Convert to datetime at start of day in UTC
dt = datetime.combine(dt, datetime.min.time())
dt = pytz.UTC.localize(dt)
elif isinstance(dt, datetime):
# Add UTC timezone if naive
if dt.tzinfo is None:
dt = pytz.UTC.localize(dt)
else:
# Convert to UTC
dt = dt.astimezone(pytz.UTC)
return dt
def _parse_attendees(self, event: Event) -> list[AttendeeData]:
attendees = []
# Parse ATTENDEE properties
for attendee in event.get("ATTENDEE", []):
if not isinstance(attendee, list):
attendee = [attendee]
for att in attendee:
att_data: AttendeeData = {
"email": str(att).replace("mailto:", "") if att else None,
"name": att.params.get("CN") if hasattr(att, "params") else None,
"status": att.params.get("PARTSTAT")
if hasattr(att, "params")
else None,
"role": att.params.get("ROLE") if hasattr(att, "params") else None,
}
attendees.append(att_data)
# Add organizer
organizer = event.get("ORGANIZER")
if organizer:
org_data: AttendeeData = {
"email": str(organizer).replace("mailto:", "") if organizer else None,
"name": organizer.params.get("CN")
if hasattr(organizer, "params")
else None,
"role": "ORGANIZER",
}
attendees.append(org_data)
return attendees
class ICSSyncService:
def __init__(self):
self.fetch_service = ICSFetchService()
async def sync_room_calendar(self, room: Room) -> dict:
if not room.ics_enabled or not room.ics_url:
return {"status": "skipped", "reason": "ICS not configured"}
try:
# Check if it's time to sync
if not self._should_sync(room):
return {"status": "skipped", "reason": "Not time to sync yet"}
# Fetch ICS file
ics_content = await self.fetch_service.fetch_ics(room.ics_url)
# Check if content changed
content_hash = hashlib.md5(ics_content.encode()).hexdigest()
if room.ics_last_etag == content_hash:
logger.info(f"No changes in ICS for room {room.id}")
return {"status": "unchanged", "hash": content_hash}
# Parse calendar
calendar = self.fetch_service.parse_ics(ics_content)
# Build room URL
room_url = f"{settings.BASE_URL}/room/{room.name}"
# Extract matching events
events = self.fetch_service.extract_room_events(
calendar, room.name, room_url
)
# Sync events to database
sync_result = await self._sync_events_to_database(room.id, events)
# Update room sync metadata
await rooms_controller.update(
room,
{
"ics_last_sync": datetime.now(timezone.utc),
"ics_last_etag": content_hash,
},
mutate=False,
)
return {
"status": "success",
"hash": content_hash,
"events_found": len(events),
**sync_result,
}
except Exception as e:
logger.error(f"Failed to sync ICS for room {room.id}: {e}")
return {"status": "error", "error": str(e)}
def _should_sync(self, room: Room) -> bool:
if not room.ics_last_sync:
return True
time_since_sync = datetime.now(timezone.utc) - room.ics_last_sync
return time_since_sync.total_seconds() >= room.ics_fetch_interval
async def _sync_events_to_database(
self, room_id: str, events: list[EventData]
) -> SyncStats:
created = 0
updated = 0
# Track current event IDs
current_ics_uids = []
for event_data in events:
# Create CalendarEvent object
calendar_event = CalendarEvent(room_id=room_id, **event_data)
# Upsert event
existing = await calendar_events_controller.get_by_ics_uid(
room_id, event_data["ics_uid"]
)
if existing:
updated += 1
else:
created += 1
await calendar_events_controller.upsert(calendar_event)
current_ics_uids.append(event_data["ics_uid"])
# Soft delete events that are no longer in calendar
deleted = await calendar_events_controller.soft_delete_missing(
room_id, current_ics_uids
)
return {
"events_created": created,
"events_updated": updated,
"events_deleted": deleted,
}
# Global instance
ics_sync_service = ICSSyncService()

View File

@@ -14,9 +14,7 @@ class Settings(BaseSettings):
CORS_ALLOW_CREDENTIALS: bool = False
# Database
DATABASE_URL: str = (
"postgresql+asyncpg://reflector:reflector@localhost:5432/reflector"
)
DATABASE_URL: str = "sqlite:///./reflector.sqlite3"
# local data directory
DATA_DIR: str = "./data"
@@ -27,7 +25,7 @@ class Settings(BaseSettings):
TRANSCRIPT_URL: str | None = None
TRANSCRIPT_TIMEOUT: int = 90
# Audio Transcription: modal backend
# Audio transcription modal.com configuration
TRANSCRIPT_MODAL_API_KEY: str | None = None
# Audio transcription storage
@@ -39,23 +37,10 @@ class Settings(BaseSettings):
TRANSCRIPT_STORAGE_AWS_ACCESS_KEY_ID: str | None = None
TRANSCRIPT_STORAGE_AWS_SECRET_ACCESS_KEY: str | None = None
# Recording storage
RECORDING_STORAGE_BACKEND: str | None = None
# Recording storage configuration for AWS
RECORDING_STORAGE_AWS_BUCKET_NAME: str = "recording-bucket"
RECORDING_STORAGE_AWS_REGION: str = "us-east-1"
RECORDING_STORAGE_AWS_ACCESS_KEY_ID: str | None = None
RECORDING_STORAGE_AWS_SECRET_ACCESS_KEY: str | None = None
# Translate into the target language
TRANSLATION_BACKEND: str = "passthrough"
TRANSLATE_URL: str | None = None
TRANSLATE_TIMEOUT: int = 90
# Translation: modal backend
TRANSLATE_MODAL_API_KEY: str | None = None
# LLM
LLM_MODEL: str = "microsoft/phi-4"
LLM_URL: str | None = None
@@ -67,9 +52,6 @@ class Settings(BaseSettings):
DIARIZATION_BACKEND: str = "modal"
DIARIZATION_URL: str | None = None
# Diarization: modal backend
DIARIZATION_MODAL_API_KEY: str | None = None
# Sentry
SENTRY_DSN: str | None = None
@@ -113,11 +95,25 @@ class Settings(BaseSettings):
WHEREBY_API_URL: str = "https://api.whereby.dev/v1"
WHEREBY_API_KEY: str | None = None
WHEREBY_WEBHOOK_SECRET: str | None = None
AWS_WHEREBY_S3_BUCKET: str | None = None
AWS_WHEREBY_ACCESS_KEY_ID: str | None = None
AWS_WHEREBY_ACCESS_KEY_SECRET: str | None = None
AWS_PROCESS_RECORDING_QUEUE_URL: str | None = None
SQS_POLLING_TIMEOUT_SECONDS: int = 60
# Daily.co integration
DAILY_API_KEY: str | None = None
DAILY_WEBHOOK_SECRET: str | None = None
DAILY_SUBDOMAIN: str | None = None
AWS_DAILY_S3_BUCKET: str | None = None
AWS_DAILY_S3_REGION: str = "us-west-2"
AWS_DAILY_ROLE_ARN: str | None = None
# Video platform migration feature flags
DAILY_MIGRATION_ENABLED: bool = True
DAILY_MIGRATION_ROOM_IDS: list[str] = []
DEFAULT_VIDEO_PLATFORM: str = "daily"
# Zulip integration
ZULIP_REALM: str | None = None
ZULIP_API_KEY: str | None = None

View File

@@ -1,17 +1,10 @@
from .base import Storage # noqa
from reflector.settings import settings
def get_transcripts_storage() -> Storage:
assert settings.TRANSCRIPT_STORAGE_BACKEND
from reflector.settings import settings
return Storage.get_instance(
name=settings.TRANSCRIPT_STORAGE_BACKEND,
settings_prefix="TRANSCRIPT_STORAGE_",
)
def get_recordings_storage() -> Storage:
return Storage.get_instance(
name=settings.RECORDING_STORAGE_BACKEND,
settings_prefix="RECORDING_STORAGE_",
)

View File

@@ -9,9 +9,8 @@ async def export_db(filename: str) -> None:
filename = pathlib.Path(filename).resolve()
settings.DATABASE_URL = f"sqlite:///{filename}"
from reflector.db import get_database, transcripts
from reflector.db import database, transcripts
database = get_database()
await database.connect()
transcripts = await database.fetch_all(transcripts.select())
await database.disconnect()

View File

@@ -8,9 +8,8 @@ async def export_db(filename: str) -> None:
filename = pathlib.Path(filename).resolve()
settings.DATABASE_URL = f"sqlite:///{filename}"
from reflector.db import get_database, transcripts
from reflector.db import database, transcripts
database = get_database()
await database.connect()
transcripts = await database.fetch_all(transcripts.select())
await database.disconnect()

View File

@@ -13,7 +13,7 @@ from reflector.processors import (
TranscriptFinalTitleProcessor,
TranscriptLinerProcessor,
TranscriptTopicDetectorProcessor,
TranscriptTranslatorAutoProcessor,
TranscriptTranslatorProcessor,
)
from reflector.processors.base import BroadcastProcessor
@@ -31,7 +31,7 @@ async def process_audio_file(
AudioMergeProcessor(),
AudioTranscriptAutoProcessor.as_threaded(),
TranscriptLinerProcessor(),
TranscriptTranslatorAutoProcessor.as_threaded(),
TranscriptTranslatorProcessor.as_threaded(),
]
if not only_transcript:
processors += [

View File

@@ -27,7 +27,7 @@ from reflector.processors import (
TranscriptFinalTitleProcessor,
TranscriptLinerProcessor,
TranscriptTopicDetectorProcessor,
TranscriptTranslatorAutoProcessor,
TranscriptTranslatorProcessor,
)
from reflector.processors.base import BroadcastProcessor, Processor
from reflector.processors.types import (
@@ -103,7 +103,7 @@ async def process_audio_file_with_diarization(
processors += [
TranscriptLinerProcessor(),
TranscriptTranslatorAutoProcessor.as_threaded(),
TranscriptTranslatorProcessor.as_threaded(),
]
if not only_transcript:
@@ -145,17 +145,18 @@ async def process_audio_file_with_diarization(
logger.info(f"Starting diarization with {len(topics)} topics")
try:
# Import diarization processor
from reflector.processors import AudioDiarizationAutoProcessor
# Create diarization processor
diarization_processor = AudioDiarizationAutoProcessor(
name=diarization_backend
)
diarization_processor.set_pipeline(pipeline)
diarization_processor.on(event_callback)
# For Modal backend, we need to upload the file to S3 first
if diarization_backend == "modal":
from datetime import datetime, timezone
from datetime import datetime
from reflector.storage import get_transcripts_storage
from reflector.utils.s3_temp_file import S3TemporaryFile
@@ -163,7 +164,7 @@ async def process_audio_file_with_diarization(
storage = get_transcripts_storage()
# Generate a unique filename in evaluation folder
timestamp = datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S")
timestamp = datetime.utcnow().strftime("%Y%m%d_%H%M%S")
audio_filename = f"evaluation/diarization_temp/{timestamp}_{uuid.uuid4().hex}.wav"
# Use context manager for automatic cleanup

View File

@@ -1,63 +0,0 @@
"""WebVTT utilities for generating subtitle files from transcript data."""
from typing import TYPE_CHECKING, Annotated
import webvtt
from reflector.processors.types import Seconds, Word, words_to_segments
if TYPE_CHECKING:
from reflector.db.transcripts import TranscriptTopic
VttTimestamp = Annotated[str, "vtt_timestamp"]
WebVTTStr = Annotated[str, "webvtt_str"]
def _seconds_to_timestamp(seconds: Seconds) -> VttTimestamp:
# lib doesn't do that
hours = int(seconds // 3600)
minutes = int((seconds % 3600) // 60)
secs = int(seconds % 60)
milliseconds = int((seconds % 1) * 1000)
return f"{hours:02d}:{minutes:02d}:{secs:02d}.{milliseconds:03d}"
def words_to_webvtt(words: list[Word]) -> WebVTTStr:
"""Convert words to WebVTT using existing segmentation logic."""
vtt = webvtt.WebVTT()
if not words:
return vtt.content
segments = words_to_segments(words)
for segment in segments:
text = segment.text.strip()
# lib doesn't do that
text = f"<v Speaker{segment.speaker}>{text}"
caption = webvtt.Caption(
start=_seconds_to_timestamp(segment.start),
end=_seconds_to_timestamp(segment.end),
text=text,
)
vtt.captions.append(caption)
return vtt.content
def topics_to_webvtt(topics: list["TranscriptTopic"]) -> WebVTTStr:
if not topics:
return webvtt.WebVTT().content
all_words: list[Word] = []
for topic in topics:
all_words.extend(topic.words)
# assert it's in sequence
for i in range(len(all_words) - 1):
assert (
all_words[i].start <= all_words[i + 1].start
), f"Words are not in sequence: {all_words[i].text} and {all_words[i + 1].text} are not consecutive: {all_words[i].start} > {all_words[i + 1].start}"
return words_to_webvtt(all_words)

View File

@@ -0,0 +1,17 @@
# Video Platform Abstraction Layer
"""
This module provides an abstraction layer for different video conferencing platforms.
It allows seamless switching between providers (Whereby, Daily.co, etc.) without
changing the core application logic.
"""
from .base import MeetingData, VideoPlatformClient, VideoPlatformConfig
from .registry import get_platform_client, register_platform
__all__ = [
"VideoPlatformClient",
"VideoPlatformConfig",
"MeetingData",
"get_platform_client",
"register_platform",
]

View File

@@ -0,0 +1,82 @@
from abc import ABC, abstractmethod
from datetime import datetime
from typing import Any, Dict, Optional
from pydantic import BaseModel
from reflector.db.rooms import Room
class MeetingData(BaseModel):
"""Standardized meeting data returned by all platforms."""
meeting_id: str
room_name: str
room_url: str
host_room_url: str
platform: str
extra_data: Dict[str, Any] = {} # Platform-specific data
class VideoPlatformConfig(BaseModel):
"""Configuration for a video platform."""
api_key: str
webhook_secret: str
api_url: Optional[str] = None
subdomain: Optional[str] = None
s3_bucket: Optional[str] = None
s3_region: Optional[str] = None
aws_role_arn: Optional[str] = None
aws_access_key_id: Optional[str] = None
aws_access_key_secret: Optional[str] = None
class VideoPlatformClient(ABC):
"""Abstract base class for video platform integrations."""
PLATFORM_NAME: str = ""
def __init__(self, config: VideoPlatformConfig):
self.config = config
@abstractmethod
async def create_meeting(
self, room_name_prefix: str, end_date: datetime, room: Room
) -> MeetingData:
"""Create a new meeting room."""
pass
@abstractmethod
async def get_room_sessions(self, room_name: str) -> Dict[str, Any]:
"""Get session information for a room."""
pass
@abstractmethod
async def delete_room(self, room_name: str) -> bool:
"""Delete a room. Returns True if successful."""
pass
@abstractmethod
async def upload_logo(self, room_name: str, logo_path: str) -> bool:
"""Upload a logo to the room. Returns True if successful."""
pass
@abstractmethod
def verify_webhook_signature(
self, body: bytes, signature: str, timestamp: Optional[str] = None
) -> bool:
"""Verify webhook signature for security."""
pass
def format_recording_config(self, room: Room) -> Dict[str, Any]:
"""Format recording configuration for the platform.
Can be overridden by specific implementations."""
if room.recording_type == "cloud" and self.config.s3_bucket:
return {
"type": room.recording_type,
"bucket": self.config.s3_bucket,
"region": self.config.s3_region,
"trigger": room.recording_trigger,
}
return {"type": room.recording_type}

View File

@@ -0,0 +1,127 @@
import hmac
from datetime import datetime
from hashlib import sha256
from typing import Any, Dict, Optional
import httpx
from reflector.db.rooms import Room
from .base import MeetingData, VideoPlatformClient, VideoPlatformConfig
class DailyClient(VideoPlatformClient):
"""Daily.co video platform implementation."""
PLATFORM_NAME = "daily"
TIMEOUT = 10 # seconds
BASE_URL = "https://api.daily.co/v1"
def __init__(self, config: VideoPlatformConfig):
super().__init__(config)
self.headers = {
"Authorization": f"Bearer {config.api_key}",
"Content-Type": "application/json",
}
async def create_meeting(
self, room_name_prefix: str, end_date: datetime, room: Room
) -> MeetingData:
"""Create a Daily.co room."""
room_name = f"{room_name_prefix}-{datetime.now().strftime('%Y%m%d%H%M%S')}"
data = {
"name": room_name,
"privacy": "private" if room.is_locked else "public",
"properties": {
"enable_recording": room.recording_type
if room.recording_type != "none"
else False,
"enable_chat": True,
"enable_screenshare": True,
"start_video_off": False,
"start_audio_off": False,
"exp": int(end_date.timestamp()),
},
}
# Configure S3 bucket for cloud recordings
if room.recording_type == "cloud" and self.config.s3_bucket:
data["properties"]["recordings_bucket"] = {
"bucket_name": self.config.s3_bucket,
"bucket_region": self.config.s3_region,
"assume_role_arn": self.config.aws_role_arn,
"allow_api_access": True,
}
async with httpx.AsyncClient() as client:
response = await client.post(
f"{self.BASE_URL}/rooms",
headers=self.headers,
json=data,
timeout=self.TIMEOUT,
)
response.raise_for_status()
result = response.json()
# Format response to match our standard
room_url = result["url"]
return MeetingData(
meeting_id=result["id"],
room_name=result["name"],
room_url=room_url,
host_room_url=room_url,
platform=self.PLATFORM_NAME,
extra_data=result,
)
async def get_room_sessions(self, room_name: str) -> Dict[str, Any]:
"""Get Daily.co room information."""
async with httpx.AsyncClient() as client:
response = await client.get(
f"{self.BASE_URL}/rooms/{room_name}",
headers=self.headers,
timeout=self.TIMEOUT,
)
response.raise_for_status()
return response.json()
async def get_room_presence(self, room_name: str) -> Dict[str, Any]:
"""Get real-time participant data - Daily.co specific feature."""
async with httpx.AsyncClient() as client:
response = await client.get(
f"{self.BASE_URL}/rooms/{room_name}/presence",
headers=self.headers,
timeout=self.TIMEOUT,
)
response.raise_for_status()
return response.json()
async def delete_room(self, room_name: str) -> bool:
"""Delete a Daily.co room."""
async with httpx.AsyncClient() as client:
response = await client.delete(
f"{self.BASE_URL}/rooms/{room_name}",
headers=self.headers,
timeout=self.TIMEOUT,
)
# Daily.co returns 200 for success, 404 if room doesn't exist
return response.status_code in (200, 404)
async def upload_logo(self, room_name: str, logo_path: str) -> bool:
"""Daily.co doesn't support custom logos per room - this is a no-op."""
return True
def verify_webhook_signature(
self, body: bytes, signature: str, timestamp: Optional[str] = None
) -> bool:
"""Verify Daily.co webhook signature."""
expected = hmac.new(
self.config.webhook_secret.encode(), body, sha256
).hexdigest()
try:
return hmac.compare_digest(expected, signature)
except Exception:
return False

View File

@@ -0,0 +1,52 @@
"""Factory for creating video platform clients based on configuration."""
from typing import Optional
from reflector.settings import settings
from .base import VideoPlatformClient, VideoPlatformConfig
from .registry import get_platform_client
def get_platform_config(platform: str) -> VideoPlatformConfig:
"""Get configuration for a specific platform."""
if platform == "whereby":
return VideoPlatformConfig(
api_key=settings.WHEREBY_API_KEY or "",
webhook_secret=settings.WHEREBY_WEBHOOK_SECRET or "",
api_url=settings.WHEREBY_API_URL,
s3_bucket=settings.AWS_WHEREBY_S3_BUCKET,
aws_access_key_id=settings.AWS_WHEREBY_ACCESS_KEY_ID,
aws_access_key_secret=settings.AWS_WHEREBY_ACCESS_KEY_SECRET,
)
elif platform == "daily":
return VideoPlatformConfig(
api_key=settings.DAILY_API_KEY or "",
webhook_secret=settings.DAILY_WEBHOOK_SECRET or "",
subdomain=settings.DAILY_SUBDOMAIN,
s3_bucket=settings.AWS_DAILY_S3_BUCKET,
s3_region=settings.AWS_DAILY_S3_REGION,
aws_role_arn=settings.AWS_DAILY_ROLE_ARN,
)
else:
raise ValueError(f"Unknown platform: {platform}")
def create_platform_client(platform: str) -> VideoPlatformClient:
"""Create a video platform client instance."""
config = get_platform_config(platform)
return get_platform_client(platform, config)
def get_platform_for_room(room_id: Optional[str] = None) -> str:
"""Determine which platform to use for a room based on feature flags."""
# If Daily migration is disabled, always use Whereby
if not settings.DAILY_MIGRATION_ENABLED:
return "whereby"
# If a specific room is in the migration list, use Daily
if room_id and room_id in settings.DAILY_MIGRATION_ROOM_IDS:
return "daily"
# Otherwise use the default platform
return settings.DEFAULT_VIDEO_PLATFORM

View File

@@ -0,0 +1,124 @@
"""Mock video platform client for testing."""
import uuid
from datetime import datetime
from typing import Any, Dict, Optional
from reflector.db.rooms import Room
from .base import MeetingData, VideoPlatformClient, VideoPlatformConfig
class MockPlatformClient(VideoPlatformClient):
"""Mock video platform implementation for testing."""
PLATFORM_NAME = "mock"
def __init__(self, config: VideoPlatformConfig):
super().__init__(config)
# Store created rooms for testing
self._rooms: Dict[str, Dict[str, Any]] = {}
self._webhook_calls: list[Dict[str, Any]] = []
async def create_meeting(
self, room_name_prefix: str, end_date: datetime, room: Room
) -> MeetingData:
"""Create a mock meeting."""
meeting_id = str(uuid.uuid4())
room_name = f"{room_name_prefix}-{meeting_id[:8]}"
room_url = f"https://mock.video/{room_name}"
host_room_url = f"{room_url}?host=true"
# Store room data for later retrieval
self._rooms[room_name] = {
"id": meeting_id,
"name": room_name,
"url": room_url,
"host_url": host_room_url,
"end_date": end_date,
"room": room,
"participants": [],
"is_active": True,
}
return MeetingData(
meeting_id=meeting_id,
room_name=room_name,
room_url=room_url,
host_room_url=host_room_url,
platform=self.PLATFORM_NAME,
extra_data={"mock": True},
)
async def get_room_sessions(self, room_name: str) -> Dict[str, Any]:
"""Get mock room session information."""
if room_name not in self._rooms:
return {"error": "Room not found"}
room_data = self._rooms[room_name]
return {
"roomName": room_name,
"sessions": [
{
"sessionId": room_data["id"],
"startTime": datetime.utcnow().isoformat(),
"participants": room_data["participants"],
"isActive": room_data["is_active"],
}
],
}
async def delete_room(self, room_name: str) -> bool:
"""Delete a mock room."""
if room_name in self._rooms:
self._rooms[room_name]["is_active"] = False
return True
return False
async def upload_logo(self, room_name: str, logo_path: str) -> bool:
"""Mock logo upload."""
if room_name in self._rooms:
self._rooms[room_name]["logo_path"] = logo_path
return True
return False
def verify_webhook_signature(
self, body: bytes, signature: str, timestamp: Optional[str] = None
) -> bool:
"""Mock webhook signature verification."""
# For testing, accept signature == "valid"
return signature == "valid"
# Mock-specific methods for testing
def add_participant(
self, room_name: str, participant_id: str, participant_name: str
):
"""Add a participant to a mock room (for testing)."""
if room_name in self._rooms:
self._rooms[room_name]["participants"].append(
{
"id": participant_id,
"name": participant_name,
"joined_at": datetime.utcnow().isoformat(),
}
)
def trigger_webhook(self, event_type: str, data: Dict[str, Any]):
"""Trigger a mock webhook event (for testing)."""
self._webhook_calls.append(
{
"type": event_type,
"data": data,
"timestamp": datetime.utcnow().isoformat(),
}
)
def get_webhook_calls(self) -> list[Dict[str, Any]]:
"""Get all webhook calls made (for testing)."""
return self._webhook_calls.copy()
def clear_data(self):
"""Clear all mock data (for testing)."""
self._rooms.clear()
self._webhook_calls.clear()

View File

@@ -0,0 +1,42 @@
from typing import Dict, Type
from .base import VideoPlatformClient, VideoPlatformConfig
# Registry of available video platforms
_PLATFORMS: Dict[str, Type[VideoPlatformClient]] = {}
def register_platform(name: str, client_class: Type[VideoPlatformClient]):
"""Register a video platform implementation."""
_PLATFORMS[name.lower()] = client_class
def get_platform_client(
platform: str, config: VideoPlatformConfig
) -> VideoPlatformClient:
"""Get a video platform client instance."""
platform_lower = platform.lower()
if platform_lower not in _PLATFORMS:
raise ValueError(f"Unknown video platform: {platform}")
client_class = _PLATFORMS[platform_lower]
return client_class(config)
def get_available_platforms() -> list[str]:
"""Get list of available platform names."""
return list(_PLATFORMS.keys())
# Auto-register built-in platforms
def _register_builtin_platforms():
from .daily import DailyClient
from .mock import MockPlatformClient
from .whereby import WherebyClient
register_platform("whereby", WherebyClient)
register_platform("daily", DailyClient)
register_platform("mock", MockPlatformClient)
_register_builtin_platforms()

View File

@@ -0,0 +1,140 @@
import hmac
import json
import re
import time
from datetime import datetime
from hashlib import sha256
from typing import Any, Dict, Optional
import httpx
from reflector.db.rooms import Room
from .base import MeetingData, VideoPlatformClient, VideoPlatformConfig
class WherebyClient(VideoPlatformClient):
"""Whereby video platform implementation."""
PLATFORM_NAME = "whereby"
TIMEOUT = 10 # seconds
MAX_ELAPSED_TIME = 60 * 1000 # 1 minute in milliseconds
def __init__(self, config: VideoPlatformConfig):
super().__init__(config)
self.headers = {
"Content-Type": "application/json; charset=utf-8",
"Authorization": f"Bearer {config.api_key}",
}
async def create_meeting(
self, room_name_prefix: str, end_date: datetime, room: Room
) -> MeetingData:
"""Create a Whereby meeting."""
data = {
"isLocked": room.is_locked,
"roomNamePrefix": room_name_prefix,
"roomNamePattern": "uuid",
"roomMode": room.room_mode,
"endDate": end_date.isoformat(),
"fields": ["hostRoomUrl"],
}
# Add recording configuration if cloud recording is enabled
if room.recording_type == "cloud":
data["recording"] = {
"type": room.recording_type,
"destination": {
"provider": "s3",
"bucket": self.config.s3_bucket,
"accessKeyId": self.config.aws_access_key_id,
"accessKeySecret": self.config.aws_access_key_secret,
"fileFormat": "mp4",
},
"startTrigger": room.recording_trigger,
}
async with httpx.AsyncClient() as client:
response = await client.post(
f"{self.config.api_url}/meetings",
headers=self.headers,
json=data,
timeout=self.TIMEOUT,
)
response.raise_for_status()
result = response.json()
return MeetingData(
meeting_id=result["meetingId"],
room_name=result["roomName"],
room_url=result["roomUrl"],
host_room_url=result["hostRoomUrl"],
platform=self.PLATFORM_NAME,
extra_data=result,
)
async def get_room_sessions(self, room_name: str) -> Dict[str, Any]:
"""Get Whereby room session information."""
async with httpx.AsyncClient() as client:
response = await client.get(
f"{self.config.api_url}/insights/room-sessions?roomName={room_name}",
headers=self.headers,
timeout=self.TIMEOUT,
)
response.raise_for_status()
return response.json()
async def delete_room(self, room_name: str) -> bool:
"""Whereby doesn't support room deletion - meetings expire automatically."""
return True
async def upload_logo(self, room_name: str, logo_path: str) -> bool:
"""Upload logo to Whereby room."""
async with httpx.AsyncClient() as client:
with open(logo_path, "rb") as f:
response = await client.put(
f"{self.config.api_url}/rooms/{room_name}/theme/logo",
headers={
"Authorization": f"Bearer {self.config.api_key}",
},
timeout=self.TIMEOUT,
files={"image": f},
)
response.raise_for_status()
return True
def verify_webhook_signature(
self, body: bytes, signature: str, timestamp: Optional[str] = None
) -> bool:
"""Verify Whereby webhook signature."""
if not signature:
return False
matches = re.match(r"t=(.*),v1=(.*)", signature)
if not matches:
return False
ts, sig = matches.groups()
# Check timestamp to prevent replay attacks
current_time = int(time.time() * 1000)
diff_time = current_time - int(ts) * 1000
if diff_time >= self.MAX_ELAPSED_TIME:
return False
# Verify signature
body_dict = json.loads(body)
signed_payload = f"{ts}.{json.dumps(body_dict, separators=(',', ':'))}"
hmac_obj = hmac.new(
self.config.webhook_secret.encode("utf-8"),
signed_payload.encode("utf-8"),
sha256,
)
expected_signature = hmac_obj.hexdigest()
try:
return hmac.compare_digest(
expected_signature.encode("utf-8"), sig.encode("utf-8")
)
except Exception:
return False

View File

@@ -44,6 +44,8 @@ def range_requests_response(
"""Returns StreamingResponse using Range Requests of a given file"""
if not os.path.exists(file_path):
from fastapi import HTTPException
raise HTTPException(status_code=404, detail="File not found")
file_size = os.stat(file_path).st_size

View File

@@ -0,0 +1,145 @@
"""Daily.co webhook handler endpoint."""
import hmac
from hashlib import sha256
from typing import Any, Dict
from fastapi import APIRouter, HTTPException, Request
from pydantic import BaseModel
from reflector.db.meetings import meetings_controller
from reflector.settings import settings
router = APIRouter()
class DailyWebhookEvent(BaseModel):
"""Daily.co webhook event structure."""
type: str
id: str
ts: int # Unix timestamp in milliseconds
data: Dict[str, Any]
def verify_daily_webhook_signature(body: bytes, signature: str) -> bool:
"""Verify Daily.co webhook signature using HMAC-SHA256."""
if not signature or not settings.DAILY_WEBHOOK_SECRET:
return False
try:
expected = hmac.new(
settings.DAILY_WEBHOOK_SECRET.encode(), body, sha256
).hexdigest()
return hmac.compare_digest(expected, signature)
except Exception:
return False
@router.post("/daily_webhook")
async def daily_webhook(event: DailyWebhookEvent, request: Request):
"""Handle Daily.co webhook events."""
# Verify webhook signature for security
body = await request.body()
signature = request.headers.get("X-Daily-Signature", "")
if not verify_daily_webhook_signature(body, signature):
raise HTTPException(status_code=401, detail="Invalid webhook signature")
# Handle participant events
if event.type == "participant.joined":
await _handle_participant_joined(event)
elif event.type == "participant.left":
await _handle_participant_left(event)
elif event.type == "recording.started":
await _handle_recording_started(event)
elif event.type == "recording.ready-to-download":
await _handle_recording_ready(event)
elif event.type == "recording.error":
await _handle_recording_error(event)
return {"status": "ok"}
async def _handle_participant_joined(event: DailyWebhookEvent):
"""Handle participant joined event."""
room_name = event.data.get("room", {}).get("name")
if not room_name:
return
meeting = await meetings_controller.get_by_room_name(room_name)
if meeting:
# Update participant count (same as Whereby)
current_count = getattr(meeting, "num_clients", 0)
await meetings_controller.update_meeting(
meeting.id, num_clients=current_count + 1
)
async def _handle_participant_left(event: DailyWebhookEvent):
"""Handle participant left event."""
room_name = event.data.get("room", {}).get("name")
if not room_name:
return
meeting = await meetings_controller.get_by_room_name(room_name)
if meeting:
# Update participant count (same as Whereby)
current_count = getattr(meeting, "num_clients", 0)
await meetings_controller.update_meeting(
meeting.id, num_clients=max(0, current_count - 1)
)
async def _handle_recording_started(event: DailyWebhookEvent):
"""Handle recording started event."""
room_name = event.data.get("room", {}).get("name")
if not room_name:
return
meeting = await meetings_controller.get_by_room_name(room_name)
if meeting:
# Log recording start for debugging
print(f"Recording started for meeting {meeting.id} in room {room_name}")
async def _handle_recording_ready(event: DailyWebhookEvent):
"""Handle recording ready for download event."""
room_name = event.data.get("room", {}).get("name")
recording_data = event.data.get("recording", {})
download_link = recording_data.get("download_url")
recording_id = recording_data.get("id")
if not room_name or not download_link:
return
meeting = await meetings_controller.get_by_room_name(room_name)
if meeting:
# Queue recording processing task (same as Whereby)
try:
# Import here to avoid circular imports
from reflector.worker.process import process_recording_from_url
# For Daily.co, we need to queue recording processing with URL
# This will download from the URL and process similar to S3
process_recording_from_url.delay(
recording_url=download_link,
meeting_id=meeting.id,
recording_id=recording_id or event.id,
)
except ImportError:
# Handle case where worker tasks aren't available
print(
f"Warning: Could not queue recording processing for meeting {meeting.id}"
)
async def _handle_recording_error(event: DailyWebhookEvent):
"""Handle recording error event."""
room_name = event.data.get("room", {}).get("name")
error = event.data.get("error", "Unknown error")
if room_name:
meeting = await meetings_controller.get_by_room_name(room_name)
if meeting:
print(f"Recording error for meeting {meeting.id}: {error}")

View File

@@ -1,4 +1,4 @@
from datetime import datetime, timezone
from datetime import datetime
from typing import Annotated, Optional
from fastapi import APIRouter, Depends, HTTPException, Request
@@ -35,7 +35,7 @@ async def meeting_audio_consent(
meeting_id=meeting_id,
user_id=user_id,
consent_given=request.consent_given,
consent_timestamp=datetime.now(timezone.utc),
consent_timestamp=datetime.utcnow(),
)
updated_consent = await meeting_consent_controller.upsert(consent)

View File

@@ -1,34 +1,29 @@
import logging
import sqlite3
from datetime import datetime, timedelta, timezone
from datetime import datetime, timedelta
from typing import Annotated, Literal, Optional
import asyncpg.exceptions
from fastapi import APIRouter, Depends, HTTPException
from fastapi_pagination import Page
from fastapi_pagination.ext.databases import apaginate
from fastapi_pagination.ext.databases import paginate
from pydantic import BaseModel
import reflector.auth as auth
from reflector.db import get_database
from reflector.db import database
from reflector.db.meetings import meetings_controller
from reflector.db.rooms import rooms_controller
from reflector.settings import settings
from reflector.whereby import create_meeting, upload_logo
from reflector.video_platforms.factory import (
create_platform_client,
get_platform_for_room,
)
logger = logging.getLogger(__name__)
router = APIRouter()
def parse_datetime_with_timezone(iso_string: str) -> datetime:
"""Parse ISO datetime string and ensure timezone awareness (defaults to UTC if naive)."""
dt = datetime.fromisoformat(iso_string)
if dt.tzinfo is None:
dt = dt.replace(tzinfo=timezone.utc)
return dt
class Room(BaseModel):
id: str
name: str
@@ -42,11 +37,7 @@ class Room(BaseModel):
recording_type: str
recording_trigger: str
is_shared: bool
ics_url: Optional[str] = None
ics_fetch_interval: int = 300
ics_enabled: bool = False
ics_last_sync: Optional[datetime] = None
ics_last_etag: Optional[str] = None
platform: str
class Meeting(BaseModel):
@@ -57,6 +48,7 @@ class Meeting(BaseModel):
start_date: datetime
end_date: datetime
recording_type: Literal["none", "local", "cloud"] = "cloud"
platform: str
class CreateRoom(BaseModel):
@@ -69,24 +61,18 @@ class CreateRoom(BaseModel):
recording_type: str
recording_trigger: str
is_shared: bool
ics_url: Optional[str] = None
ics_fetch_interval: int = 300
ics_enabled: bool = False
class UpdateRoom(BaseModel):
name: Optional[str] = None
zulip_auto_post: Optional[bool] = None
zulip_stream: Optional[str] = None
zulip_topic: Optional[str] = None
is_locked: Optional[bool] = None
room_mode: Optional[str] = None
recording_type: Optional[str] = None
recording_trigger: Optional[str] = None
is_shared: Optional[bool] = None
ics_url: Optional[str] = None
ics_fetch_interval: Optional[int] = None
ics_enabled: Optional[bool] = None
name: str
zulip_auto_post: bool
zulip_stream: str
zulip_topic: str
is_locked: bool
room_mode: str
recording_type: str
recording_trigger: str
is_shared: bool
class DeletionStatus(BaseModel):
@@ -102,8 +88,8 @@ async def rooms_list(
user_id = user["sub"] if user else None
return await apaginate(
get_database(),
return await paginate(
database,
await rooms_controller.get_all(
user_id=user_id, order_by="-created_at", return_query=True
),
@@ -117,6 +103,14 @@ async def rooms_create(
):
user_id = user["sub"] if user else None
# Determine platform for this room (will be "whereby" unless feature flag is enabled)
# Note: Since room doesn't exist yet, we can't use room_id for selection
platform = (
settings.DEFAULT_VIDEO_PLATFORM
if settings.DAILY_MIGRATION_ENABLED
else "whereby"
)
return await rooms_controller.add(
name=room.name,
user_id=user_id,
@@ -128,9 +122,7 @@ async def rooms_create(
recording_type=room.recording_type,
recording_trigger=room.recording_trigger,
is_shared=room.is_shared,
ics_url=room.ics_url,
ics_fetch_interval=room.ics_fetch_interval,
ics_enabled=room.ics_enabled,
platform=platform,
)
@@ -172,24 +164,32 @@ async def rooms_create_meeting(
if not room:
raise HTTPException(status_code=404, detail="Room not found")
current_time = datetime.now(timezone.utc)
current_time = datetime.utcnow()
meeting = await meetings_controller.get_active(room=room, current_time=current_time)
if meeting is None:
end_date = current_time + timedelta(hours=8)
whereby_meeting = await create_meeting("", end_date=end_date, room=room)
await upload_logo(whereby_meeting["roomName"], "./images/logo.png")
# Use the platform abstraction to create meeting
platform = get_platform_for_room(room.id)
client = create_platform_client(platform)
meeting_data = await client.create_meeting(
room_name_prefix=room.name, end_date=end_date, room=room
)
# Upload logo if supported by platform
await client.upload_logo(meeting_data.room_name, "./images/logo.png")
# Now try to save to database
try:
meeting = await meetings_controller.create(
id=whereby_meeting["meetingId"],
room_name=whereby_meeting["roomName"],
room_url=whereby_meeting["roomUrl"],
host_room_url=whereby_meeting["hostRoomUrl"],
start_date=parse_datetime_with_timezone(whereby_meeting["startDate"]),
end_date=parse_datetime_with_timezone(whereby_meeting["endDate"]),
id=meeting_data.meeting_id,
room_name=meeting_data.room_name,
room_url=meeting_data.room_url,
host_room_url=meeting_data.host_room_url,
start_date=current_time,
end_date=end_date,
user_id=user_id,
room=room,
)
@@ -201,8 +201,9 @@ async def rooms_create_meeting(
room.name,
)
logger.warning(
"Whereby meeting %s was created but not used (resource leak) for room %s",
whereby_meeting["meetingId"],
"%s meeting %s was created but not used (resource leak) for room %s",
platform,
meeting_data.meeting_id,
room.name,
)
@@ -223,217 +224,3 @@ async def rooms_create_meeting(
meeting.host_room_url = ""
return meeting
class ICSStatus(BaseModel):
status: str
last_sync: Optional[datetime] = None
next_sync: Optional[datetime] = None
last_etag: Optional[str] = None
events_count: int = 0
class ICSSyncResult(BaseModel):
status: str
hash: Optional[str] = None
events_found: int = 0
events_created: int = 0
events_updated: int = 0
events_deleted: int = 0
error: Optional[str] = None
@router.post("/rooms/{room_name}/ics/sync", response_model=ICSSyncResult)
async def rooms_sync_ics(
room_name: str,
user: Annotated[Optional[auth.UserInfo], Depends(auth.current_user_optional)],
):
user_id = user["sub"] if user else None
room = await rooms_controller.get_by_name(room_name)
if not room:
raise HTTPException(status_code=404, detail="Room not found")
if user_id != room.user_id:
raise HTTPException(
status_code=403, detail="Only room owner can trigger ICS sync"
)
if not room.ics_enabled or not room.ics_url:
raise HTTPException(status_code=400, detail="ICS not configured for this room")
from reflector.services.ics_sync import ics_sync_service
result = await ics_sync_service.sync_room_calendar(room)
if result["status"] == "error":
raise HTTPException(
status_code=500, detail=result.get("error", "Unknown error")
)
return ICSSyncResult(**result)
@router.get("/rooms/{room_name}/ics/status", response_model=ICSStatus)
async def rooms_ics_status(
room_name: str,
user: Annotated[Optional[auth.UserInfo], Depends(auth.current_user_optional)],
):
user_id = user["sub"] if user else None
room = await rooms_controller.get_by_name(room_name)
if not room:
raise HTTPException(status_code=404, detail="Room not found")
if user_id != room.user_id:
raise HTTPException(
status_code=403, detail="Only room owner can view ICS status"
)
next_sync = None
if room.ics_enabled and room.ics_last_sync:
next_sync = room.ics_last_sync + timedelta(seconds=room.ics_fetch_interval)
from reflector.db.calendar_events import calendar_events_controller
events = await calendar_events_controller.get_by_room(
room.id, include_deleted=False
)
return ICSStatus(
status="enabled" if room.ics_enabled else "disabled",
last_sync=room.ics_last_sync,
next_sync=next_sync,
last_etag=room.ics_last_etag,
events_count=len(events),
)
class CalendarEventResponse(BaseModel):
id: str
room_id: str
ics_uid: str
title: Optional[str] = None
description: Optional[str] = None
start_time: datetime
end_time: datetime
attendees: Optional[list[dict]] = None
location: Optional[str] = None
last_synced: datetime
created_at: datetime
updated_at: datetime
@router.get("/rooms/{room_name}/meetings", response_model=list[CalendarEventResponse])
async def rooms_list_meetings(
room_name: str,
user: Annotated[Optional[auth.UserInfo], Depends(auth.current_user_optional)],
):
user_id = user["sub"] if user else None
room = await rooms_controller.get_by_name(room_name)
if not room:
raise HTTPException(status_code=404, detail="Room not found")
from reflector.db.calendar_events import calendar_events_controller
events = await calendar_events_controller.get_by_room(
room.id, include_deleted=False
)
if user_id != room.user_id:
for event in events:
event.description = None
event.attendees = None
return events
@router.get(
"/rooms/{room_name}/meetings/upcoming", response_model=list[CalendarEventResponse]
)
async def rooms_list_upcoming_meetings(
room_name: str,
user: Annotated[Optional[auth.UserInfo], Depends(auth.current_user_optional)],
minutes_ahead: int = 30,
):
user_id = user["sub"] if user else None
room = await rooms_controller.get_by_name(room_name)
if not room:
raise HTTPException(status_code=404, detail="Room not found")
from reflector.db.calendar_events import calendar_events_controller
events = await calendar_events_controller.get_upcoming(
room.id, minutes_ahead=minutes_ahead
)
if user_id != room.user_id:
for event in events:
event.description = None
event.attendees = None
return events
@router.get("/rooms/{room_name}/meetings/active", response_model=list[Meeting])
async def rooms_list_active_meetings(
room_name: str,
user: Annotated[Optional[auth.UserInfo], Depends(auth.current_user_optional)],
):
"""List all active meetings for a room (supports multiple active meetings)"""
user_id = user["sub"] if user else None
room = await rooms_controller.get_by_name(room_name)
if not room:
raise HTTPException(status_code=404, detail="Room not found")
current_time = datetime.now(timezone.utc)
meetings = await meetings_controller.get_all_active_for_room(
room=room, current_time=current_time
)
# Hide host URLs from non-owners
if user_id != room.user_id:
for meeting in meetings:
meeting.host_room_url = ""
return meetings
@router.post("/rooms/{room_name}/meetings/{meeting_id}/join", response_model=Meeting)
async def rooms_join_meeting(
room_name: str,
meeting_id: str,
user: Annotated[Optional[auth.UserInfo], Depends(auth.current_user_optional)],
):
"""Join a specific meeting by ID"""
user_id = user["sub"] if user else None
room = await rooms_controller.get_by_name(room_name)
if not room:
raise HTTPException(status_code=404, detail="Room not found")
meeting = await meetings_controller.get_by_id(meeting_id)
if not meeting:
raise HTTPException(status_code=404, detail="Meeting not found")
if meeting.room_id != room.id:
raise HTTPException(
status_code=403, detail="Meeting does not belong to this room"
)
if not meeting.is_active:
raise HTTPException(status_code=400, detail="Meeting is not active")
current_time = datetime.now(timezone.utc)
if meeting.end_date <= current_time:
raise HTTPException(status_code=400, detail="Meeting has ended")
# Hide host URL from non-owners
if user_id != room.user_id:
meeting.host_room_url = ""
return meeting

View File

@@ -1,29 +1,15 @@
from datetime import datetime, timedelta, timezone
from typing import Annotated, Literal, Optional
from fastapi import APIRouter, Depends, HTTPException, Query
from fastapi import APIRouter, Depends, HTTPException
from fastapi_pagination import Page
from fastapi_pagination.ext.databases import apaginate
from fastapi_pagination.ext.databases import paginate
from jose import jwt
from pydantic import BaseModel, Field, field_serializer
import reflector.auth as auth
from reflector.db import get_database
from reflector.db.meetings import meetings_controller
from reflector.db.rooms import rooms_controller
from reflector.db.search import (
DEFAULT_SEARCH_LIMIT,
SearchLimit,
SearchLimitBase,
SearchOffset,
SearchOffsetBase,
SearchParameters,
SearchQuery,
SearchQueryBase,
SearchResult,
SearchTotal,
search_controller,
)
from reflector.db.transcripts import (
SourceKind,
TranscriptParticipant,
@@ -48,7 +34,7 @@ DOWNLOAD_EXPIRE_MINUTES = 60
def create_access_token(data: dict, expires_delta: timedelta):
to_encode = data.copy()
expire = datetime.now(timezone.utc) + expires_delta
expire = datetime.utcnow() + expires_delta
to_encode.update({"exp": expire})
encoded_jwt = jwt.encode(to_encode, settings.SECRET_KEY, algorithm=ALGORITHM)
return encoded_jwt
@@ -114,21 +100,6 @@ class DeletionStatus(BaseModel):
status: str
SearchQueryParam = Annotated[SearchQueryBase, Query(description="Search query text")]
SearchLimitParam = Annotated[SearchLimitBase, Query(description="Results per page")]
SearchOffsetParam = Annotated[
SearchOffsetBase, Query(description="Number of results to skip")
]
class SearchResponse(BaseModel):
results: list[SearchResult]
total: SearchTotal
query: SearchQuery
limit: SearchLimit
offset: SearchOffset
@router.get("/transcripts", response_model=Page[GetTranscriptMinimal])
async def transcripts_list(
user: Annotated[Optional[auth.UserInfo], Depends(auth.current_user_optional)],
@@ -136,13 +107,15 @@ async def transcripts_list(
room_id: str | None = None,
search_term: str | None = None,
):
from reflector.db import database
if not user and not settings.PUBLIC_MODE:
raise HTTPException(status_code=401, detail="Not authenticated")
user_id = user["sub"] if user else None
return await apaginate(
get_database(),
return await paginate(
database,
await transcripts_controller.get_all(
user_id=user_id,
source_kind=SourceKind(source_kind) if source_kind else None,
@@ -154,39 +127,6 @@ async def transcripts_list(
)
@router.get("/transcripts/search", response_model=SearchResponse)
async def transcripts_search(
q: SearchQueryParam,
limit: SearchLimitParam = DEFAULT_SEARCH_LIMIT,
offset: SearchOffsetParam = 0,
room_id: Optional[str] = None,
user: Annotated[
Optional[auth.UserInfo], Depends(auth.current_user_optional)
] = None,
):
"""
Full-text search across transcript titles and content.
"""
if not user and not settings.PUBLIC_MODE:
raise HTTPException(status_code=401, detail="Not authenticated")
user_id = user["sub"] if user else None
search_params = SearchParameters(
query_text=q, limit=limit, offset=offset, user_id=user_id, room_id=room_id
)
results, total = await search_controller.search_transcripts(search_params)
return SearchResponse(
results=results,
total=total,
query=search_params.query_text,
limit=search_params.limit,
offset=search_params.offset,
)
@router.post("/transcripts", response_model=GetTranscript)
async def transcripts_create(
info: CreateTranscript,
@@ -333,8 +273,8 @@ async def transcript_update(
if not transcript:
raise HTTPException(status_code=404, detail="Transcript not found")
values = info.dict(exclude_unset=True)
updated_transcript = await transcripts_controller.update(transcript, values)
return updated_transcript
await transcripts_controller.update(transcript, values)
return transcript
@router.delete("/transcripts/{transcript_id}", response_model=DeletionStatus)

View File

@@ -51,6 +51,24 @@ async def transcript_get_audio_mp3(
transcript_id, user_id=user_id
)
if transcript.audio_location == "storage":
# proxy S3 file, to prevent issue with CORS
url = await transcript.get_audio_url()
headers = {}
copy_headers = ["range", "accept-encoding"]
for header in copy_headers:
if header in request.headers:
headers[header] = request.headers[header]
async with httpx.AsyncClient() as client:
resp = await client.request(request.method, url, headers=headers)
return Response(
content=resp.content,
status_code=resp.status_code,
headers=resp.headers,
)
if transcript.audio_location == "storage":
# proxy S3 file, to prevent issue with CORS
url = await transcript.get_audio_url()

View File

@@ -26,7 +26,7 @@ async def transcript_record_webrtc(
raise HTTPException(status_code=400, detail="Transcript is locked")
# create a pipeline runner
from reflector.pipelines.main_live_pipeline import PipelineMainLive # noqa: PLC0415
from reflector.pipelines.main_live_pipeline import PipelineMainLive
pipeline_runner = PipelineMainLive(transcript_id=transcript_id)

View File

@@ -68,13 +68,8 @@ async def whereby_webhook(event: WherebyWebhookEvent, request: Request):
raise HTTPException(status_code=404, detail="Meeting not found")
if event.type in ["room.client.joined", "room.client.left"]:
update_data = {"num_clients": event.data["numClients"]}
# Clear grace period if participant joined
if event.type == "room.client.joined" and event.data["numClients"] > 0:
if meeting.last_participant_left_at:
update_data["last_participant_left_at"] = None
await meetings_controller.update_meeting(meeting.id, **update_data)
await meetings_controller.update_meeting(
meeting.id, num_clients=event.data["numClients"]
)
return {"status": "ok"}

View File

@@ -23,7 +23,7 @@ async def create_meeting(room_name_prefix: str, end_date: datetime, room: Room):
"type": room.recording_type,
"destination": {
"provider": "s3",
"bucket": settings.RECORDING_STORAGE_AWS_BUCKET_NAME,
"bucket": settings.AWS_WHEREBY_S3_BUCKET,
"accessKeyId": settings.AWS_WHEREBY_ACCESS_KEY_ID,
"accessKeySecret": settings.AWS_WHEREBY_ACCESS_KEY_SECRET,
"fileFormat": "mp4",

View File

@@ -19,7 +19,6 @@ else:
"reflector.pipelines.main_live_pipeline",
"reflector.worker.healthcheck",
"reflector.worker.process",
"reflector.worker.ics_sync",
]
)
@@ -37,14 +36,6 @@ else:
"task": "reflector.worker.process.reprocess_failed_recordings",
"schedule": crontab(hour=5, minute=0), # Midnight EST
},
"sync_all_ics_calendars": {
"task": "reflector.worker.ics_sync.sync_all_ics_calendars",
"schedule": 60.0, # Run every minute to check which rooms need sync
},
"pre_create_upcoming_meetings": {
"task": "reflector.worker.ics_sync.pre_create_upcoming_meetings",
"schedule": 30.0, # Run every 30 seconds to pre-create meetings
},
}
if settings.HEALTHCHECK_URL:

View File

@@ -1,209 +0,0 @@
from datetime import datetime, timedelta, timezone
import structlog
from celery import shared_task
from celery.utils.log import get_task_logger
from reflector.db import get_database
from reflector.db.meetings import meetings_controller
from reflector.db.rooms import rooms, rooms_controller
from reflector.services.ics_sync import ics_sync_service
from reflector.whereby import create_meeting, upload_logo
logger = structlog.wrap_logger(get_task_logger(__name__))
@shared_task
def sync_room_ics(room_id: str):
asynctask(_sync_room_ics_async(room_id))
async def _sync_room_ics_async(room_id: str):
try:
room = await rooms_controller.get_by_id(room_id)
if not room:
logger.warning("Room not found for ICS sync", room_id=room_id)
return
if not room.ics_enabled or not room.ics_url:
logger.debug("ICS not enabled for room", room_id=room_id)
return
logger.info("Starting ICS sync for room", room_id=room_id, room_name=room.name)
result = await ics_sync_service.sync_room_calendar(room)
if result["status"] == "success":
logger.info(
"ICS sync completed successfully",
room_id=room_id,
events_found=result.get("events_found", 0),
events_created=result.get("events_created", 0),
events_updated=result.get("events_updated", 0),
events_deleted=result.get("events_deleted", 0),
)
elif result["status"] == "unchanged":
logger.debug("ICS content unchanged", room_id=room_id)
elif result["status"] == "error":
logger.error("ICS sync failed", room_id=room_id, error=result.get("error"))
else:
logger.debug(
"ICS sync skipped", room_id=room_id, reason=result.get("reason")
)
except Exception as e:
logger.error("Unexpected error during ICS sync", room_id=room_id, error=str(e))
@shared_task
def sync_all_ics_calendars():
asynctask(_sync_all_ics_calendars_async())
async def _sync_all_ics_calendars_async():
try:
logger.info("Starting sync for all ICS-enabled rooms")
# Get ALL rooms - not filtered by is_shared
query = rooms.select().where(
rooms.c.ics_enabled == True, rooms.c.ics_url != None
)
all_rooms = await get_database().fetch_all(query)
ics_enabled_rooms = list(all_rooms)
logger.info(f"Found {len(ics_enabled_rooms)} rooms with ICS enabled")
for room_data in ics_enabled_rooms:
room_id = room_data["id"]
room = await rooms_controller.get_by_id(room_id)
if not room:
continue
if not _should_sync(room):
logger.debug("Skipping room, not time to sync yet", room_id=room_id)
continue
sync_room_ics.delay(room_id)
logger.info("Queued sync tasks for all eligible rooms")
except Exception as e:
logger.error("Error in sync_all_ics_calendars", error=str(e))
def _should_sync(room) -> bool:
if not room.ics_last_sync:
return True
time_since_sync = datetime.now(timezone.utc) - room.ics_last_sync
return time_since_sync.total_seconds() >= room.ics_fetch_interval
@shared_task
def pre_create_upcoming_meetings():
asynctask(_pre_create_upcoming_meetings_async())
async def _pre_create_upcoming_meetings_async():
try:
logger.info("Starting pre-creation of upcoming meetings")
from reflector.db.calendar_events import calendar_events_controller
# Get ALL rooms with ICS enabled
query = rooms.select().where(
rooms.c.ics_enabled == True, rooms.c.ics_url != None
)
all_rooms = await get_database().fetch_all(query)
now = datetime.now(timezone.utc)
pre_create_window = now + timedelta(minutes=1)
for room_data in all_rooms:
room_id = room_data["id"]
room = await rooms_controller.get_by_id(room_id)
if not room:
continue
events = await calendar_events_controller.get_upcoming(
room_id, minutes_ahead=2
)
for event in events:
if event.start_time <= pre_create_window:
existing_meeting = await meetings_controller.get_by_calendar_event(
event.id
)
if not existing_meeting:
logger.info(
"Pre-creating meeting for calendar event",
room_id=room_id,
event_id=event.id,
event_title=event.title,
)
try:
end_date = event.end_time or (
event.start_time + timedelta(hours=1)
)
whereby_meeting = await create_meeting(
event.title or "Scheduled Meeting",
end_date=end_date,
room=room,
)
await upload_logo(
whereby_meeting["roomName"], "./images/logo.png"
)
meeting = await meetings_controller.create(
id=whereby_meeting["meetingId"],
room_name=whereby_meeting["roomName"],
room_url=whereby_meeting["roomUrl"],
host_room_url=whereby_meeting["hostRoomUrl"],
start_date=datetime.fromisoformat(
whereby_meeting["startDate"]
),
end_date=datetime.fromisoformat(
whereby_meeting["endDate"]
),
user_id=room.user_id,
room=room,
calendar_event_id=event.id,
calendar_metadata={
"title": event.title,
"description": event.description,
"attendees": event.attendees,
},
)
logger.info(
"Meeting pre-created successfully",
meeting_id=meeting.id,
event_id=event.id,
)
except Exception as e:
logger.error(
"Failed to pre-create meeting",
room_id=room_id,
event_id=event.id,
error=str(e),
)
logger.info("Completed pre-creation check for upcoming meetings")
except Exception as e:
logger.error("Error in pre_create_upcoming_meetings", error=str(e))
def asynctask(coro):
import asyncio
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
try:
return loop.run_until_complete(coro)
finally:
loop.close()

View File

@@ -1,10 +1,11 @@
import json
import os
from datetime import datetime, timedelta, timezone
from datetime import datetime, timezone
from urllib.parse import unquote
import av
import boto3
import httpx
import structlog
from celery import shared_task
from celery.utils.log import get_task_logger
@@ -21,14 +22,6 @@ from reflector.whereby import get_room_sessions
logger = structlog.wrap_logger(get_task_logger(__name__))
def parse_datetime_with_timezone(iso_string: str) -> datetime:
"""Parse ISO datetime string and ensure timezone awareness (defaults to UTC if naive)."""
dt = datetime.fromisoformat(iso_string)
if dt.tzinfo is None:
dt = dt.replace(tzinfo=timezone.utc)
return dt
@shared_task
def process_messages():
queue_url = settings.AWS_PROCESS_RECORDING_QUEUE_URL
@@ -77,7 +70,7 @@ async def process_recording(bucket_name: str, object_key: str):
# extract a guid and a datetime from the object key
room_name = f"/{object_key[:36]}"
recorded_at = parse_datetime_with_timezone(object_key[37:57])
recorded_at = datetime.fromisoformat(object_key[37:57])
meeting = await meetings_controller.get_by_room_name(room_name)
room = await rooms_controller.get_by_id(meeting.room_id)
@@ -146,76 +139,24 @@ async def process_recording(bucket_name: str, object_key: str):
@shared_task
@asynctask
async def process_meetings():
"""
Checks which meetings are still active and deactivates those that have ended.
Supports multiple active meetings per room and grace period logic.
"""
logger.info("Processing meetings")
meetings = await meetings_controller.get_all_active()
current_time = datetime.now(timezone.utc)
for meeting in meetings:
should_deactivate = False
is_active = False
end_date = meeting.end_date
if end_date.tzinfo is None:
end_date = end_date.replace(tzinfo=timezone.utc)
# Check if meeting has passed its scheduled end time
if end_date <= current_time:
# For calendar meetings, force close 30 minutes after scheduled end
if meeting.calendar_event_id:
if current_time > end_date + timedelta(minutes=30):
should_deactivate = True
logger.info(
"Meeting %s forced closed 30 min after calendar end", meeting.id
)
else:
# Unscheduled meetings follow normal closure rules
should_deactivate = True
# Check Whereby room sessions only if not already deactivating
if not should_deactivate and end_date > current_time:
if end_date > datetime.now(timezone.utc):
response = await get_room_sessions(meeting.room_name)
room_sessions = response.get("results", [])
has_active_sessions = room_sessions and any(
is_active = not room_sessions or any(
rs["endedAt"] is None for rs in room_sessions
)
if not has_active_sessions:
# No active sessions - check grace period
if meeting.num_clients == 0:
if meeting.last_participant_left_at:
# Check if grace period has expired
grace_period = timedelta(minutes=meeting.grace_period_minutes)
if (
current_time
> meeting.last_participant_left_at + grace_period
):
should_deactivate = True
logger.info("Meeting %s grace period expired", meeting.id)
else:
# First time all participants left, record the time
await meetings_controller.update_meeting(
meeting.id, last_participant_left_at=current_time
)
logger.info(
"Meeting %s marked empty at %s", meeting.id, current_time
)
else:
# Has active sessions - clear grace period if set
if meeting.last_participant_left_at:
await meetings_controller.update_meeting(
meeting.id, last_participant_left_at=None
)
logger.info(
"Meeting %s reactivated - participant rejoined", meeting.id
)
if should_deactivate:
if not is_active:
await meetings_controller.update_meeting(meeting.id, is_active=False)
logger.info("Meeting %s is deactivated", meeting.id)
logger.info("Processed %d meetings", len(meetings))
logger.info("Processed meetings")
@shared_task
@@ -237,7 +178,7 @@ async def reprocess_failed_recordings():
reprocessed_count = 0
try:
paginator = s3.get_paginator("list_objects_v2")
bucket_name = settings.RECORDING_STORAGE_AWS_BUCKET_NAME
bucket_name = settings.AWS_WHEREBY_S3_BUCKET
pages = paginator.paginate(Bucket=bucket_name)
for page in pages:
@@ -280,3 +221,98 @@ async def reprocess_failed_recordings():
logger.info(f"Reprocessing complete. Requeued {reprocessed_count} recordings")
return reprocessed_count
@shared_task
@asynctask
async def process_recording_from_url(
recording_url: str, meeting_id: str, recording_id: str
):
"""Process recording from Direct URL (Daily.co webhook)."""
logger.info("Processing recording from URL for meeting: %s", meeting_id)
meeting = await meetings_controller.get_by_id(meeting_id)
if not meeting:
logger.error("Meeting not found: %s", meeting_id)
return
room = await rooms_controller.get_by_id(meeting.room_id)
if not room:
logger.error("Room not found for meeting: %s", meeting_id)
return
# Create recording record with URL instead of S3 bucket/key
recording = await recordings_controller.get_by_object_key(
"daily-recordings", recording_id
)
if not recording:
recording = await recordings_controller.create(
Recording(
bucket_name="daily-recordings", # Logical bucket name for Daily.co
object_key=recording_id, # Store Daily.co recording ID
recorded_at=datetime.utcnow(),
meeting_id=meeting.id,
)
)
# Get or create transcript record
transcript = await transcripts_controller.get_by_recording_id(recording.id)
if transcript:
await transcripts_controller.update(transcript, {"topics": []})
else:
transcript = await transcripts_controller.add(
"",
source_kind=SourceKind.ROOM,
source_language="en",
target_language="en",
user_id=room.user_id,
recording_id=recording.id,
share_mode="public",
meeting_id=meeting.id,
room_id=room.id,
)
# Download file from URL
upload_filename = transcript.data_path / "upload.mp4"
upload_filename.parent.mkdir(parents=True, exist_ok=True)
try:
logger.info("Downloading recording from URL: %s", recording_url)
async with httpx.AsyncClient(timeout=300.0) as client: # 5 minute timeout
async with client.stream("GET", recording_url) as response:
response.raise_for_status()
with open(upload_filename, "wb") as f:
async for chunk in response.aiter_bytes(8192):
f.write(chunk)
logger.info("Download completed: %s", upload_filename)
except Exception as e:
logger.error("Failed to download recording: %s", str(e))
await transcripts_controller.update(transcript, {"status": "error"})
if upload_filename.exists():
upload_filename.unlink()
raise
# Validate audio content (same as S3 version)
try:
container = av.open(upload_filename.as_posix())
try:
if not len(container.streams.audio):
raise Exception("File has no audio stream")
logger.info("Audio validation successful")
finally:
container.close()
except Exception as e:
logger.error("Audio validation failed: %s", str(e))
await transcripts_controller.update(transcript, {"status": "error"})
if upload_filename.exists():
upload_filename.unlink()
raise
# Mark as uploaded and trigger processing pipeline
await transcripts_controller.update(transcript, {"status": "uploaded"})
logger.info("Queuing transcript for processing pipeline: %s", transcript.id)
# Start the ML pipeline (same as S3 version)
task_pipeline_process.delay(transcript_id=transcript.id)

View File

@@ -62,7 +62,6 @@ class RedisPubSubManager:
class WebsocketManager:
def __init__(self, pubsub_client: RedisPubSubManager = None):
self.rooms: dict = {}
self.tasks: dict = {}
self.pubsub_client = pubsub_client
async def add_user_to_room(self, room_id: str, websocket: WebSocket) -> None:
@@ -75,17 +74,13 @@ class WebsocketManager:
await self.pubsub_client.connect()
pubsub_subscriber = await self.pubsub_client.subscribe(room_id)
task = asyncio.create_task(self._pubsub_data_reader(pubsub_subscriber))
self.tasks[id(websocket)] = task
asyncio.create_task(self._pubsub_data_reader(pubsub_subscriber))
async def send_json(self, room_id: str, message: dict) -> None:
await self.pubsub_client.send_json(room_id, message)
async def remove_user_from_room(self, room_id: str, websocket: WebSocket) -> None:
self.rooms[room_id].remove(websocket)
task = self.tasks.pop(id(websocket), None)
if task:
task.cancel()
if len(self.rooms[room_id]) == 0:
del self.rooms[room_id]

View File

@@ -1,63 +1,21 @@
import os
from tempfile import NamedTemporaryFile
from unittest.mock import patch
import pytest
# Pytest-docker configuration
@pytest.fixture(scope="session")
def docker_compose_file(pytestconfig):
return os.path.join(str(pytestconfig.rootdir), "tests", "docker-compose.test.yml")
@pytest.fixture(scope="session")
def postgres_service(docker_ip, docker_services):
"""Ensure that PostgreSQL service is up and responsive."""
port = docker_services.port_for("postgres_test", 5432)
def is_responsive():
try:
import psycopg2
conn = psycopg2.connect(
host=docker_ip,
port=port,
dbname="reflector_test",
user="test_user",
password="test_password",
)
conn.close()
return True
except Exception:
return False
docker_services.wait_until_responsive(timeout=30.0, pause=0.1, check=is_responsive)
# Return connection parameters
return {
"host": docker_ip,
"port": port,
"dbname": "reflector_test",
"user": "test_user",
"password": "test_password",
}
@pytest.fixture(scope="function", autouse=True)
@pytest.mark.asyncio
async def setup_database(postgres_service):
from reflector.db import engine, metadata, get_database # noqa
async def setup_database():
from reflector.settings import settings
with NamedTemporaryFile() as f:
settings.DATABASE_URL = f"sqlite:///{f.name}"
from reflector.db import engine, metadata
metadata.drop_all(bind=engine)
metadata.create_all(bind=engine)
database = get_database()
try:
await database.connect()
yield
finally:
await database.disconnect()
@pytest.fixture
@@ -75,6 +33,9 @@ def dummy_processors():
patch(
"reflector.processors.transcript_final_summary.TranscriptFinalSummaryProcessor.get_short_summary"
) as mock_short_summary,
patch(
"reflector.processors.transcript_translator.TranscriptTranslatorProcessor.get_translation"
) as mock_translate,
):
from reflector.processors.transcript_topic_detector import TopicResponse
@@ -84,7 +45,9 @@ def dummy_processors():
mock_title.return_value = "LLM Title"
mock_long_summary.return_value = "LLM LONG SUMMARY"
mock_short_summary.return_value = "LLM SHORT SUMMARY"
mock_translate.return_value = "Bonjour le monde"
yield (
mock_translate,
mock_topic,
mock_title,
mock_long_summary,
@@ -92,20 +55,6 @@ def dummy_processors():
) # noqa
@pytest.fixture
async def whisper_transcript():
from reflector.processors.audio_transcript_whisper import (
AudioTranscriptWhisperProcessor,
)
with patch(
"reflector.processors.audio_transcript_auto"
".AudioTranscriptAutoProcessor.__new__"
) as mock_audio:
mock_audio.return_value = AudioTranscriptWhisperProcessor()
yield
@pytest.fixture
async def dummy_transcript():
from reflector.processors.audio_transcript import AudioTranscriptProcessor
@@ -156,27 +105,6 @@ async def dummy_diarization():
yield
@pytest.fixture
async def dummy_transcript_translator():
from reflector.processors.transcript_translator import TranscriptTranslatorProcessor
class TestTranscriptTranslatorProcessor(TranscriptTranslatorProcessor):
async def _translate(self, text: str) -> str:
source_language = self.get_pref("audio:source_language", "en")
target_language = self.get_pref("audio:target_language", "en")
return f"{source_language}:{target_language}:{text}"
def mock_new(cls, *args, **kwargs):
return TestTranscriptTranslatorProcessor(*args, **kwargs)
with patch(
"reflector.processors.transcript_translator_auto"
".TranscriptTranslatorAutoProcessor.__new__",
mock_new,
):
yield
@pytest.fixture
async def dummy_llm():
from reflector.llm import LLM
@@ -241,16 +169,6 @@ def celery_includes():
return ["reflector.pipelines.main_live_pipeline"]
@pytest.fixture
async def client():
from httpx import AsyncClient
from reflector.app import app
async with AsyncClient(app=app, base_url="http://test/v1") as ac:
yield ac
@pytest.fixture(scope="session")
def fake_mp3_upload():
with patch(
@@ -261,10 +179,13 @@ def fake_mp3_upload():
@pytest.fixture
async def fake_transcript_with_topics(tmpdir, client):
async def fake_transcript_with_topics(tmpdir):
import shutil
from pathlib import Path
from httpx import AsyncClient
from reflector.app import app
from reflector.db.transcripts import TranscriptTopic
from reflector.processors.types import Word
from reflector.settings import settings
@@ -273,7 +194,8 @@ async def fake_transcript_with_topics(tmpdir, client):
settings.DATA_DIR = Path(tmpdir)
# create a transcript
response = await client.post("/transcripts", json={"name": "Test audio download"})
ac = AsyncClient(app=app, base_url="http://test/v1")
response = await ac.post("/transcripts", json={"name": "Test audio download"})
assert response.status_code == 200
tid = response.json()["id"]

View File

@@ -1,13 +0,0 @@
version: '3.8'
services:
postgres_test:
image: postgres:15
environment:
POSTGRES_DB: reflector_test
POSTGRES_USER: test_user
POSTGRES_PASSWORD: test_password
ports:
- "15432:5432"
command: postgres -c fsync=off -c synchronous_commit=off -c full_page_writes=off
tmpfs:
- /var/lib/postgresql/data:rw,noexec,nosuid,size=1g

View File

@@ -1,351 +0,0 @@
"""
Tests for CalendarEvent model.
"""
from datetime import datetime, timedelta, timezone
import pytest
from reflector.db.calendar_events import CalendarEvent, calendar_events_controller
from reflector.db.rooms import rooms_controller
@pytest.mark.asyncio
async def test_calendar_event_create():
"""Test creating a calendar event."""
# Create a room first
room = await rooms_controller.add(
name="test-room",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
)
# Create calendar event
now = datetime.now(timezone.utc)
event = CalendarEvent(
room_id=room.id,
ics_uid="test-event-123",
title="Team Meeting",
description="Weekly team sync",
start_time=now + timedelta(hours=1),
end_time=now + timedelta(hours=2),
location=f"https://example.com/room/{room.name}",
attendees=[
{"email": "alice@example.com", "name": "Alice", "status": "ACCEPTED"},
{"email": "bob@example.com", "name": "Bob", "status": "TENTATIVE"},
],
)
# Save event
saved_event = await calendar_events_controller.upsert(event)
assert saved_event.ics_uid == "test-event-123"
assert saved_event.title == "Team Meeting"
assert saved_event.room_id == room.id
assert len(saved_event.attendees) == 2
@pytest.mark.asyncio
async def test_calendar_event_get_by_room():
"""Test getting calendar events for a room."""
# Create room
room = await rooms_controller.add(
name="events-room",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
)
now = datetime.now(timezone.utc)
# Create multiple events
for i in range(3):
event = CalendarEvent(
room_id=room.id,
ics_uid=f"event-{i}",
title=f"Meeting {i}",
start_time=now + timedelta(hours=i),
end_time=now + timedelta(hours=i + 1),
)
await calendar_events_controller.upsert(event)
# Get events for room
events = await calendar_events_controller.get_by_room(room.id)
assert len(events) == 3
assert all(e.room_id == room.id for e in events)
assert events[0].title == "Meeting 0"
assert events[1].title == "Meeting 1"
assert events[2].title == "Meeting 2"
@pytest.mark.asyncio
async def test_calendar_event_get_upcoming():
"""Test getting upcoming events within time window."""
# Create room
room = await rooms_controller.add(
name="upcoming-room",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
)
now = datetime.now(timezone.utc)
# Create events at different times
# Past event (should not be included)
past_event = CalendarEvent(
room_id=room.id,
ics_uid="past-event",
title="Past Meeting",
start_time=now - timedelta(hours=2),
end_time=now - timedelta(hours=1),
)
await calendar_events_controller.upsert(past_event)
# Upcoming event within 30 minutes
upcoming_event = CalendarEvent(
room_id=room.id,
ics_uid="upcoming-event",
title="Upcoming Meeting",
start_time=now + timedelta(minutes=15),
end_time=now + timedelta(minutes=45),
)
await calendar_events_controller.upsert(upcoming_event)
# Future event beyond 30 minutes
future_event = CalendarEvent(
room_id=room.id,
ics_uid="future-event",
title="Future Meeting",
start_time=now + timedelta(hours=2),
end_time=now + timedelta(hours=3),
)
await calendar_events_controller.upsert(future_event)
# Get upcoming events (default 30 minutes)
upcoming = await calendar_events_controller.get_upcoming(room.id)
assert len(upcoming) == 1
assert upcoming[0].ics_uid == "upcoming-event"
# Get upcoming with custom window
upcoming_extended = await calendar_events_controller.get_upcoming(
room.id, minutes_ahead=180
)
assert len(upcoming_extended) == 2
assert upcoming_extended[0].ics_uid == "upcoming-event"
assert upcoming_extended[1].ics_uid == "future-event"
@pytest.mark.asyncio
async def test_calendar_event_upsert():
"""Test upserting (create/update) calendar events."""
# Create room
room = await rooms_controller.add(
name="upsert-room",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
)
now = datetime.now(timezone.utc)
# Create new event
event = CalendarEvent(
room_id=room.id,
ics_uid="upsert-test",
title="Original Title",
start_time=now,
end_time=now + timedelta(hours=1),
)
created = await calendar_events_controller.upsert(event)
assert created.title == "Original Title"
# Update existing event
event.title = "Updated Title"
event.description = "Added description"
updated = await calendar_events_controller.upsert(event)
assert updated.title == "Updated Title"
assert updated.description == "Added description"
assert updated.ics_uid == "upsert-test"
# Verify only one event exists
events = await calendar_events_controller.get_by_room(room.id)
assert len(events) == 1
assert events[0].title == "Updated Title"
@pytest.mark.asyncio
async def test_calendar_event_soft_delete():
"""Test soft deleting events no longer in calendar."""
# Create room
room = await rooms_controller.add(
name="delete-room",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
)
now = datetime.now(timezone.utc)
# Create multiple events
for i in range(4):
event = CalendarEvent(
room_id=room.id,
ics_uid=f"event-{i}",
title=f"Meeting {i}",
start_time=now + timedelta(hours=i),
end_time=now + timedelta(hours=i + 1),
)
await calendar_events_controller.upsert(event)
# Soft delete events not in current list
current_ids = ["event-0", "event-2"] # Keep events 0 and 2
deleted_count = await calendar_events_controller.soft_delete_missing(
room.id, current_ids
)
assert deleted_count == 2 # Should delete events 1 and 3
# Get non-deleted events
events = await calendar_events_controller.get_by_room(
room.id, include_deleted=False
)
assert len(events) == 2
assert {e.ics_uid for e in events} == {"event-0", "event-2"}
# Get all events including deleted
all_events = await calendar_events_controller.get_by_room(
room.id, include_deleted=True
)
assert len(all_events) == 4
@pytest.mark.asyncio
async def test_calendar_event_past_events_not_deleted():
"""Test that past events are not soft deleted."""
# Create room
room = await rooms_controller.add(
name="past-events-room",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
)
now = datetime.now(timezone.utc)
# Create past event
past_event = CalendarEvent(
room_id=room.id,
ics_uid="past-event",
title="Past Meeting",
start_time=now - timedelta(hours=2),
end_time=now - timedelta(hours=1),
)
await calendar_events_controller.upsert(past_event)
# Create future event
future_event = CalendarEvent(
room_id=room.id,
ics_uid="future-event",
title="Future Meeting",
start_time=now + timedelta(hours=1),
end_time=now + timedelta(hours=2),
)
await calendar_events_controller.upsert(future_event)
# Try to soft delete all events (only future should be deleted)
deleted_count = await calendar_events_controller.soft_delete_missing(room.id, [])
assert deleted_count == 1 # Only future event deleted
# Verify past event still exists
events = await calendar_events_controller.get_by_room(
room.id, include_deleted=False
)
assert len(events) == 1
assert events[0].ics_uid == "past-event"
@pytest.mark.asyncio
async def test_calendar_event_with_raw_ics_data():
"""Test storing raw ICS data with calendar event."""
# Create room
room = await rooms_controller.add(
name="raw-ics-room",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
)
raw_ics = """BEGIN:VEVENT
UID:test-raw-123
SUMMARY:Test Event
DTSTART:20240101T100000Z
DTEND:20240101T110000Z
END:VEVENT"""
event = CalendarEvent(
room_id=room.id,
ics_uid="test-raw-123",
title="Test Event",
start_time=datetime.now(timezone.utc),
end_time=datetime.now(timezone.utc) + timedelta(hours=1),
ics_raw_data=raw_ics,
)
saved = await calendar_events_controller.upsert(event)
assert saved.ics_raw_data == raw_ics
# Retrieve and verify
retrieved = await calendar_events_controller.get_by_ics_uid(room.id, "test-raw-123")
assert retrieved is not None
assert retrieved.ics_raw_data == raw_ics

View File

@@ -0,0 +1,390 @@
"""Tests for Daily.co webhook integration."""
import hashlib
import hmac
import json
from datetime import datetime
from unittest.mock import MagicMock, patch
import pytest
from httpx import AsyncClient
from reflector.app import app
from reflector.views.daily import DailyWebhookEvent
class TestDailyWebhookIntegration:
"""Test Daily.co webhook endpoint integration."""
@pytest.fixture
def webhook_secret(self):
"""Test webhook secret."""
return "test-webhook-secret-123"
@pytest.fixture
def mock_room(self):
"""Create a mock room for testing."""
room = MagicMock()
room.id = "test-room-123"
room.name = "Test Room"
room.recording_type = "cloud"
room.platform = "daily"
return room
@pytest.fixture
def mock_meeting(self):
"""Create a mock meeting for testing."""
meeting = MagicMock()
meeting.id = "test-meeting-456"
meeting.room_id = "test-room-123"
meeting.platform = "daily"
meeting.room_name = "test-room-123-abc"
return meeting
def create_webhook_signature(self, payload: bytes, secret: str) -> str:
"""Create HMAC signature for webhook payload."""
return hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
def create_webhook_event(
self, event_type: str, room_name: str = "test-room-123-abc", **kwargs
) -> dict:
"""Create a Daily.co webhook event payload."""
base_event = {
"type": event_type,
"id": f"evt_{event_type.replace('.', '_')}_{int(datetime.utcnow().timestamp())}",
"ts": int(datetime.utcnow().timestamp() * 1000), # milliseconds
"data": {"room": {"name": room_name}, **kwargs},
}
return base_event
@pytest.mark.asyncio
async def test_webhook_participant_joined(
self, webhook_secret, mock_room, mock_meeting
):
"""Test participant joined webhook event."""
event_data = self.create_webhook_event(
"participant.joined",
participant={
"id": "participant-123",
"user_name": "John Doe",
"session_id": "session-456",
},
)
payload = json.dumps(event_data).encode()
signature = self.create_webhook_signature(payload, webhook_secret)
with patch("reflector.views.daily.settings") as mock_settings:
mock_settings.DAILY_WEBHOOK_SECRET = webhook_secret
with patch(
"reflector.db.meetings.meetings_controller.get_by_room_name"
) as mock_get_meeting:
mock_get_meeting.return_value = mock_meeting
with patch(
"reflector.db.meetings.meetings_controller.update_meeting"
) as mock_update:
async with AsyncClient(app=app, base_url="http://test/v1") as ac:
response = await ac.post(
"/daily_webhook",
json=event_data,
headers={"X-Daily-Signature": signature},
)
assert response.status_code == 200
assert response.json() == {"status": "ok"}
# Verify meeting was looked up
mock_get_meeting.assert_called_once_with("test-room-123-abc")
@pytest.mark.asyncio
async def test_webhook_participant_left(
self, webhook_secret, mock_room, mock_meeting
):
"""Test participant left webhook event."""
event_data = self.create_webhook_event(
"participant.left",
participant={
"id": "participant-123",
"user_name": "John Doe",
"session_id": "session-456",
},
)
payload = json.dumps(event_data).encode()
signature = self.create_webhook_signature(payload, webhook_secret)
with patch("reflector.views.daily.settings") as mock_settings:
mock_settings.DAILY_WEBHOOK_SECRET = webhook_secret
with patch(
"reflector.db.meetings.meetings_controller.get_by_room_name"
) as mock_get_meeting:
mock_get_meeting.return_value = mock_meeting
async with AsyncClient(app=app, base_url="http://test/v1") as ac:
response = await ac.post(
"/daily_webhook",
json=event_data,
headers={"X-Daily-Signature": signature},
)
assert response.status_code == 200
assert response.json() == {"status": "ok"}
@pytest.mark.asyncio
async def test_webhook_recording_started(
self, webhook_secret, mock_room, mock_meeting
):
"""Test recording started webhook event."""
event_data = self.create_webhook_event(
"recording.started",
recording={
"id": "recording-789",
"status": "recording",
"start_time": "2025-01-01T10:00:00Z",
},
)
payload = json.dumps(event_data).encode()
signature = self.create_webhook_signature(payload, webhook_secret)
with patch("reflector.views.daily.settings") as mock_settings:
mock_settings.DAILY_WEBHOOK_SECRET = webhook_secret
with patch(
"reflector.db.meetings.meetings_controller.get_by_room_name"
) as mock_get_meeting:
mock_get_meeting.return_value = mock_meeting
with patch(
"reflector.db.meetings.meetings_controller.update_meeting"
) as mock_update:
async with AsyncClient(app=app, base_url="http://test/v1") as ac:
response = await ac.post(
"/daily_webhook",
json=event_data,
headers={"X-Daily-Signature": signature},
)
assert response.status_code == 200
assert response.json() == {"status": "ok"}
@pytest.mark.asyncio
async def test_webhook_recording_ready_triggers_processing(
self, webhook_secret, mock_room, mock_meeting
):
"""Test recording ready webhook triggers audio processing."""
event_data = self.create_webhook_event(
"recording.ready-to-download",
recording={
"id": "recording-789",
"status": "finished",
"download_url": "https://s3.amazonaws.com/bucket/recording.mp4",
"start_time": "2025-01-01T10:00:00Z",
"duration": 1800,
},
)
payload = json.dumps(event_data).encode()
signature = self.create_webhook_signature(payload, webhook_secret)
with patch("reflector.views.daily.settings") as mock_settings:
mock_settings.DAILY_WEBHOOK_SECRET = webhook_secret
with patch(
"reflector.db.meetings.meetings_controller.get_by_room_name"
) as mock_get_meeting:
mock_get_meeting.return_value = mock_meeting
with patch(
"reflector.db.meetings.meetings_controller.update_meeting"
) as mock_update_url:
with patch(
"reflector.worker.process.process_recording_from_url.delay"
) as mock_process:
async with AsyncClient(
app=app, base_url="http://test/v1"
) as ac:
response = await ac.post(
"/daily_webhook",
json=event_data,
headers={"X-Daily-Signature": signature},
)
assert response.status_code == 200
assert response.json() == {"status": "ok"}
# Verify processing was triggered with correct parameters
mock_process.assert_called_once_with(
recording_url="https://s3.amazonaws.com/bucket/recording.mp4",
meeting_id=mock_meeting.id,
recording_id="recording-789",
)
@pytest.mark.asyncio
async def test_webhook_invalid_signature_rejected(self, webhook_secret):
"""Test webhook with invalid signature is rejected."""
event_data = self.create_webhook_event("participant.joined")
with patch("reflector.views.daily.settings") as mock_settings:
mock_settings.DAILY_WEBHOOK_SECRET = webhook_secret
async with AsyncClient(app=app, base_url="http://test/v1") as ac:
response = await ac.post(
"/daily_webhook",
json=event_data,
headers={"X-Daily-Signature": "invalid-signature"},
)
assert response.status_code == 401
assert "Invalid signature" in response.json()["detail"]
@pytest.mark.asyncio
async def test_webhook_missing_signature_rejected(self):
"""Test webhook without signature header is rejected."""
event_data = self.create_webhook_event("participant.joined")
async with AsyncClient(app=app, base_url="http://test/v1") as ac:
response = await ac.post("/daily_webhook", json=event_data)
assert response.status_code == 401
assert "Missing signature" in response.json()["detail"]
@pytest.mark.asyncio
async def test_webhook_meeting_not_found(self, webhook_secret):
"""Test webhook for non-existent meeting."""
event_data = self.create_webhook_event(
"participant.joined", room_name="non-existent-room"
)
payload = json.dumps(event_data).encode()
signature = self.create_webhook_signature(payload, webhook_secret)
with patch("reflector.views.daily.settings") as mock_settings:
mock_settings.DAILY_WEBHOOK_SECRET = webhook_secret
with patch(
"reflector.db.meetings.meetings_controller.get_by_room_name"
) as mock_get_meeting:
mock_get_meeting.return_value = None
async with AsyncClient(app=app, base_url="http://test/v1") as ac:
response = await ac.post(
"/daily_webhook",
json=event_data,
headers={"X-Daily-Signature": signature},
)
assert response.status_code == 404
assert "Meeting not found" in response.json()["detail"]
@pytest.mark.asyncio
async def test_webhook_unknown_event_type(self, webhook_secret, mock_meeting):
"""Test webhook with unknown event type."""
event_data = self.create_webhook_event("unknown.event")
payload = json.dumps(event_data).encode()
signature = self.create_webhook_signature(payload, webhook_secret)
with patch("reflector.views.daily.settings") as mock_settings:
mock_settings.DAILY_WEBHOOK_SECRET = webhook_secret
with patch(
"reflector.db.meetings.meetings_controller.get_by_room_name"
) as mock_get_meeting:
mock_get_meeting.return_value = mock_meeting
async with AsyncClient(app=app, base_url="http://test/v1") as ac:
response = await ac.post(
"/daily_webhook",
json=event_data,
headers={"X-Daily-Signature": signature},
)
# Should still return 200 but log the unknown event
assert response.status_code == 200
assert response.json() == {"status": "ok"}
@pytest.mark.asyncio
async def test_webhook_malformed_json(self, webhook_secret):
"""Test webhook with malformed JSON."""
with patch("reflector.views.daily.settings") as mock_settings:
mock_settings.DAILY_WEBHOOK_SECRET = webhook_secret
async with AsyncClient(app=app, base_url="http://test/v1") as ac:
response = await ac.post(
"/daily_webhook",
content="invalid json",
headers={
"Content-Type": "application/json",
"X-Daily-Signature": "test-signature",
},
)
assert response.status_code == 422 # Validation error
class TestWebhookEventValidation:
"""Test webhook event data validation."""
def test_daily_webhook_event_validation_valid(self):
"""Test valid webhook event passes validation."""
event_data = {
"type": "participant.joined",
"id": "evt_123",
"ts": 1640995200000, # milliseconds
"data": {
"room": {"name": "test-room"},
"participant": {
"id": "participant-123",
"user_name": "John Doe",
"session_id": "session-456",
},
},
}
event = DailyWebhookEvent(**event_data)
assert event.type == "participant.joined"
assert event.data["room"]["name"] == "test-room"
assert event.data["participant"]["id"] == "participant-123"
def test_daily_webhook_event_validation_minimal(self):
"""Test minimal valid webhook event."""
event_data = {
"type": "room.created",
"id": "evt_123",
"ts": 1640995200000,
"data": {"room": {"name": "test-room"}},
}
event = DailyWebhookEvent(**event_data)
assert event.type == "room.created"
assert event.data["room"]["name"] == "test-room"
def test_daily_webhook_event_validation_with_recording(self):
"""Test webhook event with recording data."""
event_data = {
"type": "recording.ready-to-download",
"id": "evt_123",
"ts": 1640995200000,
"data": {
"room": {"name": "test-room"},
"recording": {
"id": "recording-123",
"status": "finished",
"download_url": "https://example.com/recording.mp4",
"start_time": "2025-01-01T10:00:00Z",
"duration": 1800,
},
},
}
event = DailyWebhookEvent(**event_data)
assert event.type == "recording.ready-to-download"
assert event.data["recording"]["id"] == "recording-123"
assert (
event.data["recording"]["download_url"]
== "https://example.com/recording.mp4"
)

View File

@@ -1,230 +0,0 @@
from datetime import datetime, timedelta, timezone
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
from icalendar import Calendar, Event
from reflector.db.calendar_events import calendar_events_controller
from reflector.db.rooms import rooms_controller
from reflector.worker.ics_sync import (
_should_sync,
_sync_all_ics_calendars_async,
_sync_room_ics_async,
)
@pytest.mark.asyncio
async def test_sync_room_ics_task():
room = await rooms_controller.add(
name="task-test-room",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
ics_url="https://calendar.example.com/task.ics",
ics_enabled=True,
)
cal = Calendar()
event = Event()
event.add("uid", "task-event-1")
event.add("summary", "Task Test Meeting")
from reflector.settings import settings
event.add("location", f"{settings.BASE_URL}/room/{room.name}")
now = datetime.now(timezone.utc)
event.add("dtstart", now + timedelta(hours=1))
event.add("dtend", now + timedelta(hours=2))
cal.add_component(event)
ics_content = cal.to_ical().decode("utf-8")
with patch(
"reflector.services.ics_sync.ICSFetchService.fetch_ics", new_callable=AsyncMock
) as mock_fetch:
mock_fetch.return_value = ics_content
await _sync_room_ics_async(room.id)
events = await calendar_events_controller.get_by_room(room.id)
assert len(events) == 1
assert events[0].ics_uid == "task-event-1"
@pytest.mark.asyncio
async def test_sync_room_ics_disabled():
room = await rooms_controller.add(
name="disabled-room",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
ics_enabled=False,
)
await _sync_room_ics_async(room.id)
events = await calendar_events_controller.get_by_room(room.id)
assert len(events) == 0
@pytest.mark.asyncio
async def test_sync_all_ics_calendars():
room1 = await rooms_controller.add(
name="sync-all-1",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
ics_url="https://calendar.example.com/1.ics",
ics_enabled=True,
)
room2 = await rooms_controller.add(
name="sync-all-2",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
ics_url="https://calendar.example.com/2.ics",
ics_enabled=True,
)
room3 = await rooms_controller.add(
name="sync-all-3",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
ics_enabled=False,
)
with patch("reflector.worker.ics_sync.sync_room_ics.delay") as mock_delay:
await _sync_all_ics_calendars_async()
assert mock_delay.call_count == 2
called_room_ids = [call.args[0] for call in mock_delay.call_args_list]
assert room1.id in called_room_ids
assert room2.id in called_room_ids
assert room3.id not in called_room_ids
@pytest.mark.asyncio
async def test_should_sync_logic():
room = MagicMock()
room.ics_last_sync = None
assert _should_sync(room) is True
room.ics_last_sync = datetime.now(timezone.utc) - timedelta(seconds=100)
room.ics_fetch_interval = 300
assert _should_sync(room) is False
room.ics_last_sync = datetime.now(timezone.utc) - timedelta(seconds=400)
room.ics_fetch_interval = 300
assert _should_sync(room) is True
@pytest.mark.asyncio
async def test_sync_respects_fetch_interval():
now = datetime.now(timezone.utc)
room1 = await rooms_controller.add(
name="interval-test-1",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
ics_url="https://calendar.example.com/interval.ics",
ics_enabled=True,
ics_fetch_interval=300,
)
await rooms_controller.update(
room1,
{"ics_last_sync": now - timedelta(seconds=100)},
)
room2 = await rooms_controller.add(
name="interval-test-2",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
ics_url="https://calendar.example.com/interval2.ics",
ics_enabled=True,
ics_fetch_interval=60,
)
await rooms_controller.update(
room2,
{"ics_last_sync": now - timedelta(seconds=100)},
)
with patch("reflector.worker.ics_sync.sync_room_ics.delay") as mock_delay:
await _sync_all_ics_calendars_async()
assert mock_delay.call_count == 1
assert mock_delay.call_args[0][0] == room2.id
@pytest.mark.asyncio
async def test_sync_handles_errors_gracefully():
room = await rooms_controller.add(
name="error-task-room",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
ics_url="https://calendar.example.com/error.ics",
ics_enabled=True,
)
with patch(
"reflector.services.ics_sync.ICSFetchService.fetch_ics", new_callable=AsyncMock
) as mock_fetch:
mock_fetch.side_effect = Exception("Network error")
await _sync_room_ics_async(room.id)
events = await calendar_events_controller.get_by_room(room.id)
assert len(events) == 0

View File

@@ -1,289 +0,0 @@
from datetime import datetime, timedelta, timezone
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
from icalendar import Calendar, Event
from reflector.db.calendar_events import calendar_events_controller
from reflector.db.rooms import rooms_controller
from reflector.services.ics_sync import ICSFetchService, ICSSyncService
@pytest.mark.asyncio
async def test_ics_fetch_service_event_matching():
service = ICSFetchService()
room_name = "test-room"
room_url = "https://example.com/room/test-room"
# Create test event
event = Event()
event.add("uid", "test-123")
event.add("summary", "Test Meeting")
# Test matching with full URL in location
event.add("location", "https://example.com/room/test-room")
assert service._event_matches_room(event, room_name, room_url) is True
# Test matching with URL without protocol
event["location"] = "example.com/room/test-room"
assert service._event_matches_room(event, room_name, room_url) is True
# Test matching in description
event["location"] = "Conference Room A"
event.add("description", f"Join at {room_url}")
assert service._event_matches_room(event, room_name, room_url) is True
# Test non-matching
event["location"] = "Different Room"
event["description"] = "No room URL here"
assert service._event_matches_room(event, room_name, room_url) is False
# Test partial paths should NOT match anymore
event["location"] = "/room/test-room"
assert service._event_matches_room(event, room_name, room_url) is False
event["location"] = f"Room: {room_name}"
assert service._event_matches_room(event, room_name, room_url) is False
@pytest.mark.asyncio
async def test_ics_fetch_service_parse_event():
service = ICSFetchService()
# Create test event
event = Event()
event.add("uid", "test-456")
event.add("summary", "Team Standup")
event.add("description", "Daily team sync")
event.add("location", "https://example.com/room/standup")
now = datetime.now(timezone.utc)
event.add("dtstart", now)
event.add("dtend", now + timedelta(hours=1))
# Add attendees
event.add("attendee", "mailto:alice@example.com", parameters={"CN": "Alice"})
event.add("attendee", "mailto:bob@example.com", parameters={"CN": "Bob"})
event.add("organizer", "mailto:carol@example.com", parameters={"CN": "Carol"})
# Parse event
result = service._parse_event(event)
assert result is not None
assert result["ics_uid"] == "test-456"
assert result["title"] == "Team Standup"
assert result["description"] == "Daily team sync"
assert result["location"] == "https://example.com/room/standup"
assert len(result["attendees"]) == 3 # 2 attendees + 1 organizer
@pytest.mark.asyncio
async def test_ics_fetch_service_extract_room_events():
service = ICSFetchService()
room_name = "meeting"
room_url = "https://example.com/room/meeting"
# Create calendar with multiple events
cal = Calendar()
# Event 1: Matches room
event1 = Event()
event1.add("uid", "match-1")
event1.add("summary", "Planning Meeting")
event1.add("location", room_url)
now = datetime.now(timezone.utc)
event1.add("dtstart", now + timedelta(hours=2))
event1.add("dtend", now + timedelta(hours=3))
cal.add_component(event1)
# Event 2: Doesn't match room
event2 = Event()
event2.add("uid", "no-match")
event2.add("summary", "Other Meeting")
event2.add("location", "https://example.com/room/other")
event2.add("dtstart", now + timedelta(hours=4))
event2.add("dtend", now + timedelta(hours=5))
cal.add_component(event2)
# Event 3: Matches room in description
event3 = Event()
event3.add("uid", "match-2")
event3.add("summary", "Review Session")
event3.add("description", f"Meeting link: {room_url}")
event3.add("dtstart", now + timedelta(hours=6))
event3.add("dtend", now + timedelta(hours=7))
cal.add_component(event3)
# Event 4: Cancelled event (should be skipped)
event4 = Event()
event4.add("uid", "cancelled")
event4.add("summary", "Cancelled Meeting")
event4.add("location", room_url)
event4.add("status", "CANCELLED")
event4.add("dtstart", now + timedelta(hours=8))
event4.add("dtend", now + timedelta(hours=9))
cal.add_component(event4)
# Extract events
events = service.extract_room_events(cal, room_name, room_url)
assert len(events) == 2
assert events[0]["ics_uid"] == "match-1"
assert events[1]["ics_uid"] == "match-2"
@pytest.mark.asyncio
async def test_ics_sync_service_sync_room_calendar():
# Create room
room = await rooms_controller.add(
name="sync-test",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
ics_url="https://calendar.example.com/test.ics",
ics_enabled=True,
)
# Mock ICS content
cal = Calendar()
event = Event()
event.add("uid", "sync-event-1")
event.add("summary", "Sync Test Meeting")
# Use the actual BASE_URL from settings
from reflector.settings import settings
event.add("location", f"{settings.BASE_URL}/room/{room.name}")
now = datetime.now(timezone.utc)
event.add("dtstart", now + timedelta(hours=1))
event.add("dtend", now + timedelta(hours=2))
cal.add_component(event)
ics_content = cal.to_ical().decode("utf-8")
# Create sync service and mock fetch
sync_service = ICSSyncService()
with patch.object(
sync_service.fetch_service, "fetch_ics", new_callable=AsyncMock
) as mock_fetch:
mock_fetch.return_value = ics_content
# First sync
result = await sync_service.sync_room_calendar(room)
assert result["status"] == "success"
assert result["events_found"] == 1
assert result["events_created"] == 1
assert result["events_updated"] == 0
assert result["events_deleted"] == 0
# Verify event was created
events = await calendar_events_controller.get_by_room(room.id)
assert len(events) == 1
assert events[0].ics_uid == "sync-event-1"
assert events[0].title == "Sync Test Meeting"
# Second sync with same content (should be unchanged)
# Refresh room to get updated etag and force sync by setting old sync time
room = await rooms_controller.get_by_id(room.id)
await rooms_controller.update(
room, {"ics_last_sync": datetime.now(timezone.utc) - timedelta(minutes=10)}
)
result = await sync_service.sync_room_calendar(room)
assert result["status"] == "unchanged"
# Third sync with updated event
event["summary"] = "Updated Meeting Title"
cal = Calendar()
cal.add_component(event)
ics_content = cal.to_ical().decode("utf-8")
mock_fetch.return_value = ics_content
# Force sync by clearing etag
await rooms_controller.update(room, {"ics_last_etag": None})
result = await sync_service.sync_room_calendar(room)
assert result["status"] == "success"
assert result["events_created"] == 0
assert result["events_updated"] == 1
# Verify event was updated
events = await calendar_events_controller.get_by_room(room.id)
assert len(events) == 1
assert events[0].title == "Updated Meeting Title"
@pytest.mark.asyncio
async def test_ics_sync_service_should_sync():
service = ICSSyncService()
# Room never synced
room = MagicMock()
room.ics_last_sync = None
room.ics_fetch_interval = 300
assert service._should_sync(room) is True
# Room synced recently
room.ics_last_sync = datetime.now(timezone.utc) - timedelta(seconds=100)
assert service._should_sync(room) is False
# Room sync due
room.ics_last_sync = datetime.now(timezone.utc) - timedelta(seconds=400)
assert service._should_sync(room) is True
@pytest.mark.asyncio
async def test_ics_sync_service_skip_disabled():
service = ICSSyncService()
# Room with ICS disabled
room = MagicMock()
room.ics_enabled = False
room.ics_url = "https://calendar.example.com/test.ics"
result = await service.sync_room_calendar(room)
assert result["status"] == "skipped"
assert result["reason"] == "ICS not configured"
# Room without URL
room.ics_enabled = True
room.ics_url = None
result = await service.sync_room_calendar(room)
assert result["status"] == "skipped"
assert result["reason"] == "ICS not configured"
@pytest.mark.asyncio
async def test_ics_sync_service_error_handling():
# Create room
room = await rooms_controller.add(
name="error-test",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
ics_url="https://calendar.example.com/error.ics",
ics_enabled=True,
)
sync_service = ICSSyncService()
with patch.object(
sync_service.fetch_service, "fetch_ics", new_callable=AsyncMock
) as mock_fetch:
mock_fetch.side_effect = Exception("Network error")
result = await sync_service.sync_room_calendar(room)
assert result["status"] == "error"
assert "Network error" in result["error"]

View File

@@ -1,283 +0,0 @@
"""Tests for multiple active meetings per room functionality."""
from datetime import datetime, timedelta, timezone
import pytest
from reflector.db.calendar_events import CalendarEvent, calendar_events_controller
from reflector.db.meetings import meetings_controller
from reflector.db.rooms import rooms_controller
@pytest.mark.asyncio
async def test_multiple_active_meetings_per_room():
"""Test that multiple active meetings can exist for the same room."""
# Create a room
room = await rooms_controller.add(
name="test-room",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
)
current_time = datetime.now(timezone.utc)
end_time = current_time + timedelta(hours=2)
# Create first meeting
meeting1 = await meetings_controller.create(
id="meeting-1",
room_name="test-meeting-1",
room_url="https://whereby.com/test-1",
host_room_url="https://whereby.com/test-1-host",
start_date=current_time,
end_date=end_time,
user_id="test-user",
room=room,
)
# Create second meeting for the same room (should succeed now)
meeting2 = await meetings_controller.create(
id="meeting-2",
room_name="test-meeting-2",
room_url="https://whereby.com/test-2",
host_room_url="https://whereby.com/test-2-host",
start_date=current_time,
end_date=end_time,
user_id="test-user",
room=room,
)
# Both meetings should be active
active_meetings = await meetings_controller.get_all_active_for_room(
room=room, current_time=current_time
)
assert len(active_meetings) == 2
assert meeting1.id in [m.id for m in active_meetings]
assert meeting2.id in [m.id for m in active_meetings]
@pytest.mark.asyncio
async def test_get_active_by_calendar_event():
"""Test getting active meeting by calendar event ID."""
# Create a room
room = await rooms_controller.add(
name="test-room",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
)
# Create a calendar event
event = CalendarEvent(
room_id=room.id,
ics_uid="test-event-uid",
title="Test Meeting",
start_time=datetime.now(timezone.utc),
end_time=datetime.now(timezone.utc) + timedelta(hours=1),
)
event = await calendar_events_controller.upsert(event)
current_time = datetime.now(timezone.utc)
end_time = current_time + timedelta(hours=2)
# Create meeting linked to calendar event
meeting = await meetings_controller.create(
id="meeting-cal-1",
room_name="test-meeting-cal",
room_url="https://whereby.com/test-cal",
host_room_url="https://whereby.com/test-cal-host",
start_date=current_time,
end_date=end_time,
user_id="test-user",
room=room,
calendar_event_id=event.id,
calendar_metadata={"title": event.title},
)
# Should find the meeting by calendar event
found_meeting = await meetings_controller.get_active_by_calendar_event(
room=room, calendar_event_id=event.id, current_time=current_time
)
assert found_meeting is not None
assert found_meeting.id == meeting.id
assert found_meeting.calendar_event_id == event.id
@pytest.mark.asyncio
async def test_grace_period_logic():
"""Test that meetings have a grace period after last participant leaves."""
# Create a room
room = await rooms_controller.add(
name="test-room",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
)
current_time = datetime.now(timezone.utc)
end_time = current_time + timedelta(hours=2)
# Create meeting
meeting = await meetings_controller.create(
id="meeting-grace",
room_name="test-meeting-grace",
room_url="https://whereby.com/test-grace",
host_room_url="https://whereby.com/test-grace-host",
start_date=current_time,
end_date=end_time,
user_id="test-user",
room=room,
)
# Test grace period logic by simulating different states
# Simulate first time all participants left
await meetings_controller.update_meeting(
meeting.id, num_clients=0, last_participant_left_at=current_time
)
# Within grace period (10 min) - should still be active
await meetings_controller.update_meeting(
meeting.id, last_participant_left_at=current_time - timedelta(minutes=10)
)
updated_meeting = await meetings_controller.get_by_id(meeting.id)
assert updated_meeting.is_active is True # Still active during grace period
# Simulate grace period expired (20 min) and deactivate
await meetings_controller.update_meeting(
meeting.id, last_participant_left_at=current_time - timedelta(minutes=20)
)
# Manually test the grace period logic that would be in process_meetings
updated_meeting = await meetings_controller.get_by_id(meeting.id)
if updated_meeting.last_participant_left_at:
grace_period = timedelta(minutes=updated_meeting.grace_period_minutes)
if current_time > updated_meeting.last_participant_left_at + grace_period:
await meetings_controller.update_meeting(meeting.id, is_active=False)
updated_meeting = await meetings_controller.get_by_id(meeting.id)
assert updated_meeting.is_active is False # Now deactivated
@pytest.mark.asyncio
async def test_calendar_meeting_force_close_after_30_min():
"""Test that calendar meetings force close 30 minutes after scheduled end."""
# Create a room
room = await rooms_controller.add(
name="test-room",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
)
# Create a calendar event
event = CalendarEvent(
room_id=room.id,
ics_uid="test-event-force",
title="Test Meeting Force Close",
start_time=datetime.now(timezone.utc) - timedelta(hours=2),
end_time=datetime.now(timezone.utc) - timedelta(minutes=35), # Ended 35 min ago
)
event = await calendar_events_controller.upsert(event)
current_time = datetime.now(timezone.utc)
# Create meeting linked to calendar event
meeting = await meetings_controller.create(
id="meeting-force",
room_name="test-meeting-force",
room_url="https://whereby.com/test-force",
host_room_url="https://whereby.com/test-force-host",
start_date=event.start_time,
end_date=event.end_time,
user_id="test-user",
room=room,
calendar_event_id=event.id,
)
# Test that calendar meetings force close 30 min after scheduled end
# The meeting ended 35 minutes ago, so it should be force closed
# Manually test the force close logic that would be in process_meetings
if meeting.calendar_event_id:
if current_time > meeting.end_date + timedelta(minutes=30):
await meetings_controller.update_meeting(meeting.id, is_active=False)
updated_meeting = await meetings_controller.get_by_id(meeting.id)
assert updated_meeting.is_active is False # Force closed after 30 min
@pytest.mark.asyncio
async def test_participant_rejoin_clears_grace_period():
"""Test that participant rejoining clears the grace period."""
# Create a room
room = await rooms_controller.add(
name="test-room",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
)
current_time = datetime.now(timezone.utc)
end_time = current_time + timedelta(hours=2)
# Create meeting with grace period already set
meeting = await meetings_controller.create(
id="meeting-rejoin",
room_name="test-meeting-rejoin",
room_url="https://whereby.com/test-rejoin",
host_room_url="https://whereby.com/test-rejoin-host",
start_date=current_time,
end_date=end_time,
user_id="test-user",
room=room,
)
# Set last_participant_left_at to simulate grace period
await meetings_controller.update_meeting(
meeting.id,
last_participant_left_at=current_time - timedelta(minutes=5),
num_clients=0,
)
# Simulate participant rejoining - clear grace period
await meetings_controller.update_meeting(
meeting.id, last_participant_left_at=None, num_clients=1
)
updated_meeting = await meetings_controller.get_by_id(meeting.id)
assert updated_meeting.last_participant_left_at is None # Grace period cleared
assert updated_meeting.is_active is True # Still active

View File

@@ -33,7 +33,7 @@ async def test_basic_process(
# validate the events
assert marks["TranscriptLinerProcessor"] == 1
assert marks["TranscriptTranslatorPassthroughProcessor"] == 1
assert marks["TranscriptTranslatorProcessor"] == 1
assert marks["TranscriptTopicDetectorProcessor"] == 1
assert marks["TranscriptFinalSummaryProcessor"] == 1
assert marks["TranscriptFinalTitleProcessor"] == 1

View File

@@ -1,225 +0,0 @@
"""
Tests for Room model ICS calendar integration fields.
"""
from datetime import datetime, timezone
import pytest
from reflector.db.rooms import rooms_controller
@pytest.mark.asyncio
async def test_room_create_with_ics_fields():
"""Test creating a room with ICS calendar fields."""
room = await rooms_controller.add(
name="test-room",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
ics_url="https://calendar.google.com/calendar/ical/test/private-token/basic.ics",
ics_fetch_interval=600,
ics_enabled=True,
)
assert room.name == "test-room"
assert (
room.ics_url
== "https://calendar.google.com/calendar/ical/test/private-token/basic.ics"
)
assert room.ics_fetch_interval == 600
assert room.ics_enabled is True
assert room.ics_last_sync is None
assert room.ics_last_etag is None
@pytest.mark.asyncio
async def test_room_update_ics_configuration():
"""Test updating room ICS configuration."""
# Create room without ICS
room = await rooms_controller.add(
name="update-test",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
)
assert room.ics_enabled is False
assert room.ics_url is None
# Update with ICS configuration
await rooms_controller.update(
room,
{
"ics_url": "https://outlook.office365.com/owa/calendar/test/calendar.ics",
"ics_fetch_interval": 300,
"ics_enabled": True,
},
)
assert (
room.ics_url == "https://outlook.office365.com/owa/calendar/test/calendar.ics"
)
assert room.ics_fetch_interval == 300
assert room.ics_enabled is True
@pytest.mark.asyncio
async def test_room_ics_sync_metadata():
"""Test updating room ICS sync metadata."""
room = await rooms_controller.add(
name="sync-test",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
ics_url="https://example.com/calendar.ics",
ics_enabled=True,
)
# Update sync metadata
sync_time = datetime.now(timezone.utc)
await rooms_controller.update(
room,
{
"ics_last_sync": sync_time,
"ics_last_etag": "abc123hash",
},
)
assert room.ics_last_sync == sync_time
assert room.ics_last_etag == "abc123hash"
@pytest.mark.asyncio
async def test_room_get_with_ics_fields():
"""Test retrieving room with ICS fields."""
# Create room
created_room = await rooms_controller.add(
name="get-test",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
ics_url="webcal://calendar.example.com/feed.ics",
ics_fetch_interval=900,
ics_enabled=True,
)
# Get by ID
room = await rooms_controller.get_by_id(created_room.id)
assert room is not None
assert room.ics_url == "webcal://calendar.example.com/feed.ics"
assert room.ics_fetch_interval == 900
assert room.ics_enabled is True
# Get by name
room = await rooms_controller.get_by_name("get-test")
assert room is not None
assert room.ics_url == "webcal://calendar.example.com/feed.ics"
assert room.ics_fetch_interval == 900
assert room.ics_enabled is True
@pytest.mark.asyncio
async def test_room_list_with_ics_enabled_filter():
"""Test listing rooms filtered by ICS enabled status."""
# Create rooms with and without ICS
room1 = await rooms_controller.add(
name="ics-enabled-1",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=True,
ics_enabled=True,
ics_url="https://calendar1.example.com/feed.ics",
)
room2 = await rooms_controller.add(
name="ics-disabled",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=True,
ics_enabled=False,
)
room3 = await rooms_controller.add(
name="ics-enabled-2",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=True,
ics_enabled=True,
ics_url="https://calendar2.example.com/feed.ics",
)
# Get all rooms
all_rooms = await rooms_controller.get_all()
assert len(all_rooms) == 3
# Filter for ICS-enabled rooms (would need to implement this in controller)
ics_rooms = [r for r in all_rooms if r["ics_enabled"]]
assert len(ics_rooms) == 2
assert all(r["ics_enabled"] for r in ics_rooms)
@pytest.mark.asyncio
async def test_room_default_ics_values():
"""Test that ICS fields have correct default values."""
room = await rooms_controller.add(
name="default-test",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
# Don't specify ICS fields
)
assert room.ics_url is None
assert room.ics_fetch_interval == 300 # Default 5 minutes
assert room.ics_enabled is False
assert room.ics_last_sync is None
assert room.ics_last_etag is None

View File

@@ -1,385 +0,0 @@
from datetime import datetime, timedelta, timezone
from unittest.mock import AsyncMock, patch
import pytest
from icalendar import Calendar, Event
from reflector.db.calendar_events import CalendarEvent, calendar_events_controller
from reflector.db.rooms import rooms_controller
@pytest.fixture
async def authenticated_client(client):
from reflector.app import app
from reflector.auth import current_user_optional
app.dependency_overrides[current_user_optional] = lambda: {
"sub": "test-user",
"email": "test@example.com",
}
yield client
del app.dependency_overrides[current_user_optional]
@pytest.mark.asyncio
async def test_create_room_with_ics_fields(authenticated_client):
client = authenticated_client
response = await client.post(
"/rooms",
json={
"name": "test-ics-room",
"zulip_auto_post": False,
"zulip_stream": "",
"zulip_topic": "",
"is_locked": False,
"room_mode": "normal",
"recording_type": "cloud",
"recording_trigger": "automatic-2nd-participant",
"is_shared": False,
"ics_url": "https://calendar.example.com/test.ics",
"ics_fetch_interval": 600,
"ics_enabled": True,
},
)
assert response.status_code == 200
data = response.json()
assert data["name"] == "test-ics-room"
assert data["ics_url"] == "https://calendar.example.com/test.ics"
assert data["ics_fetch_interval"] == 600
assert data["ics_enabled"] is True
@pytest.mark.asyncio
async def test_update_room_ics_configuration(authenticated_client):
client = authenticated_client
response = await client.post(
"/rooms",
json={
"name": "update-ics-room",
"zulip_auto_post": False,
"zulip_stream": "",
"zulip_topic": "",
"is_locked": False,
"room_mode": "normal",
"recording_type": "cloud",
"recording_trigger": "automatic-2nd-participant",
"is_shared": False,
},
)
assert response.status_code == 200
room_id = response.json()["id"]
response = await client.patch(
f"/rooms/{room_id}",
json={
"ics_url": "https://calendar.google.com/updated.ics",
"ics_fetch_interval": 300,
"ics_enabled": True,
},
)
assert response.status_code == 200
data = response.json()
assert data["ics_url"] == "https://calendar.google.com/updated.ics"
assert data["ics_fetch_interval"] == 300
assert data["ics_enabled"] is True
@pytest.mark.asyncio
async def test_trigger_ics_sync(authenticated_client):
client = authenticated_client
room = await rooms_controller.add(
name="sync-api-room",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
ics_url="https://calendar.example.com/api.ics",
ics_enabled=True,
)
cal = Calendar()
event = Event()
event.add("uid", "api-test-event")
event.add("summary", "API Test Meeting")
from reflector.settings import settings
event.add("location", f"{settings.BASE_URL}/room/{room.name}")
now = datetime.now(timezone.utc)
event.add("dtstart", now + timedelta(hours=1))
event.add("dtend", now + timedelta(hours=2))
cal.add_component(event)
ics_content = cal.to_ical().decode("utf-8")
with patch(
"reflector.services.ics_sync.ICSFetchService.fetch_ics", new_callable=AsyncMock
) as mock_fetch:
mock_fetch.return_value = ics_content
response = await client.post(f"/rooms/{room.name}/ics/sync")
assert response.status_code == 200
data = response.json()
assert data["status"] == "success"
assert data["events_found"] == 1
assert data["events_created"] == 1
@pytest.mark.asyncio
async def test_trigger_ics_sync_unauthorized(client):
room = await rooms_controller.add(
name="sync-unauth-room",
user_id="owner-123",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
ics_url="https://calendar.example.com/api.ics",
ics_enabled=True,
)
response = await client.post(f"/rooms/{room.name}/ics/sync")
assert response.status_code == 403
assert "Only room owner can trigger ICS sync" in response.json()["detail"]
@pytest.mark.asyncio
async def test_trigger_ics_sync_not_configured(authenticated_client):
client = authenticated_client
room = await rooms_controller.add(
name="sync-not-configured",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
ics_enabled=False,
)
response = await client.post(f"/rooms/{room.name}/ics/sync")
assert response.status_code == 400
assert "ICS not configured" in response.json()["detail"]
@pytest.mark.asyncio
async def test_get_ics_status(authenticated_client):
client = authenticated_client
room = await rooms_controller.add(
name="status-room",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
ics_url="https://calendar.example.com/status.ics",
ics_enabled=True,
ics_fetch_interval=300,
)
now = datetime.now(timezone.utc)
await rooms_controller.update(
room,
{"ics_last_sync": now, "ics_last_etag": "test-etag"},
)
response = await client.get(f"/rooms/{room.name}/ics/status")
assert response.status_code == 200
data = response.json()
assert data["status"] == "enabled"
assert data["last_etag"] == "test-etag"
assert data["events_count"] == 0
@pytest.mark.asyncio
async def test_get_ics_status_unauthorized(client):
room = await rooms_controller.add(
name="status-unauth",
user_id="owner-456",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
ics_url="https://calendar.example.com/status.ics",
ics_enabled=True,
)
response = await client.get(f"/rooms/{room.name}/ics/status")
assert response.status_code == 403
assert "Only room owner can view ICS status" in response.json()["detail"]
@pytest.mark.asyncio
async def test_list_room_meetings(authenticated_client):
client = authenticated_client
room = await rooms_controller.add(
name="meetings-room",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
)
now = datetime.now(timezone.utc)
event1 = CalendarEvent(
room_id=room.id,
ics_uid="meeting-1",
title="Past Meeting",
start_time=now - timedelta(hours=2),
end_time=now - timedelta(hours=1),
)
await calendar_events_controller.upsert(event1)
event2 = CalendarEvent(
room_id=room.id,
ics_uid="meeting-2",
title="Future Meeting",
description="Team sync",
start_time=now + timedelta(hours=1),
end_time=now + timedelta(hours=2),
attendees=[{"email": "test@example.com"}],
)
await calendar_events_controller.upsert(event2)
response = await client.get(f"/rooms/{room.name}/meetings")
assert response.status_code == 200
data = response.json()
assert len(data) == 2
assert data[0]["title"] == "Past Meeting"
assert data[1]["title"] == "Future Meeting"
assert data[1]["description"] == "Team sync"
assert data[1]["attendees"] == [{"email": "test@example.com"}]
@pytest.mark.asyncio
async def test_list_room_meetings_non_owner(client):
room = await rooms_controller.add(
name="meetings-privacy",
user_id="owner-789",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
)
event = CalendarEvent(
room_id=room.id,
ics_uid="private-meeting",
title="Meeting Title",
description="Sensitive info",
start_time=datetime.now(timezone.utc) + timedelta(hours=1),
end_time=datetime.now(timezone.utc) + timedelta(hours=2),
attendees=[{"email": "private@example.com"}],
)
await calendar_events_controller.upsert(event)
response = await client.get(f"/rooms/{room.name}/meetings")
assert response.status_code == 200
data = response.json()
assert len(data) == 1
assert data[0]["title"] == "Meeting Title"
assert data[0]["description"] is None
assert data[0]["attendees"] is None
@pytest.mark.asyncio
async def test_list_upcoming_meetings(authenticated_client):
client = authenticated_client
room = await rooms_controller.add(
name="upcoming-room",
user_id="test-user",
zulip_auto_post=False,
zulip_stream="",
zulip_topic="",
is_locked=False,
room_mode="normal",
recording_type="cloud",
recording_trigger="automatic-2nd-participant",
is_shared=False,
)
now = datetime.now(timezone.utc)
past_event = CalendarEvent(
room_id=room.id,
ics_uid="past",
title="Past",
start_time=now - timedelta(hours=1),
end_time=now - timedelta(minutes=30),
)
await calendar_events_controller.upsert(past_event)
soon_event = CalendarEvent(
room_id=room.id,
ics_uid="soon",
title="Soon",
start_time=now + timedelta(minutes=15),
end_time=now + timedelta(minutes=45),
)
await calendar_events_controller.upsert(soon_event)
later_event = CalendarEvent(
room_id=room.id,
ics_uid="later",
title="Later",
start_time=now + timedelta(hours=2),
end_time=now + timedelta(hours=3),
)
await calendar_events_controller.upsert(later_event)
response = await client.get(f"/rooms/{room.name}/meetings/upcoming")
assert response.status_code == 200
data = response.json()
assert len(data) == 1
assert data[0]["title"] == "Soon"
response = await client.get(
f"/rooms/{room.name}/meetings/upcoming", params={"minutes_ahead": 180}
)
assert response.status_code == 200
data = response.json()
assert len(data) == 2
assert data[0]["title"] == "Soon"
assert data[1]["title"] == "Later"
@pytest.mark.asyncio
async def test_room_not_found_endpoints(client):
response = await client.post("/rooms/nonexistent/ics/sync")
assert response.status_code == 404
response = await client.get("/rooms/nonexistent/ics/status")
assert response.status_code == 404
response = await client.get("/rooms/nonexistent/meetings")
assert response.status_code == 404
response = await client.get("/rooms/nonexistent/meetings/upcoming")
assert response.status_code == 404

View File

@@ -1,144 +0,0 @@
"""Tests for full-text search functionality."""
import json
from datetime import datetime, timezone
import pytest
from pydantic import ValidationError
from reflector.db import get_database
from reflector.db.search import SearchParameters, search_controller
from reflector.db.transcripts import transcripts
@pytest.mark.asyncio
async def test_search_postgresql_only():
params = SearchParameters(query_text="any query here")
results, total = await search_controller.search_transcripts(params)
assert results == []
assert total == 0
try:
SearchParameters(query_text="")
assert False, "Should have raised validation error"
except ValidationError:
pass # Expected
# Test that whitespace query raises validation error
try:
SearchParameters(query_text=" ")
assert False, "Should have raised validation error"
except ValidationError:
pass # Expected
@pytest.mark.asyncio
async def test_search_input_validation():
try:
SearchParameters(query_text="")
assert False, "Should have raised ValidationError"
except ValidationError:
pass # Expected
# Test that whitespace query raises validation error
try:
SearchParameters(query_text=" \t\n ")
assert False, "Should have raised ValidationError"
except ValidationError:
pass # Expected
@pytest.mark.asyncio
async def test_postgresql_search_with_data():
# collision is improbable
test_id = "test-search-e2e-7f3a9b2c"
try:
await get_database().execute(
transcripts.delete().where(transcripts.c.id == test_id)
)
test_data = {
"id": test_id,
"name": "Test Search Transcript",
"title": "Engineering Planning Meeting Q4 2024",
"status": "completed",
"locked": False,
"duration": 1800.0,
"created_at": datetime.now(timezone.utc),
"short_summary": "Team discussed search implementation",
"long_summary": "The engineering team met to plan the search feature",
"topics": json.dumps([]),
"events": json.dumps([]),
"participants": json.dumps([]),
"source_language": "en",
"target_language": "en",
"reviewed": False,
"audio_location": "local",
"share_mode": "private",
"source_kind": "room",
"webvtt": """WEBVTT
00:00:00.000 --> 00:00:10.000
Welcome to our engineering planning meeting for Q4 2024.
00:00:10.000 --> 00:00:20.000
Today we'll discuss the implementation of full-text search.
00:00:20.000 --> 00:00:30.000
The search feature should support complex queries with ranking.
00:00:30.000 --> 00:00:40.000
We need to implement PostgreSQL tsvector for better performance.""",
}
await get_database().execute(transcripts.insert().values(**test_data))
# Test 1: Search for a word in title
params = SearchParameters(query_text="planning")
results, total = await search_controller.search_transcripts(params)
assert total >= 1
found = any(r.id == test_id for r in results)
assert found, "Should find test transcript by title word"
# Test 2: Search for a word in webvtt content
params = SearchParameters(query_text="tsvector")
results, total = await search_controller.search_transcripts(params)
assert total >= 1
found = any(r.id == test_id for r in results)
assert found, "Should find test transcript by webvtt content"
# Test 3: Search with multiple words
params = SearchParameters(query_text="engineering planning")
results, total = await search_controller.search_transcripts(params)
assert total >= 1
found = any(r.id == test_id for r in results)
assert found, "Should find test transcript by multiple words"
# Test 4: Verify SearchResult structure
test_result = next((r for r in results if r.id == test_id), None)
if test_result:
assert test_result.title == "Engineering Planning Meeting Q4 2024"
assert test_result.status == "completed"
assert test_result.duration == 1800.0
assert 0 <= test_result.rank <= 1, "Rank should be normalized to 0-1"
# Test 5: Search with OR operator
params = SearchParameters(query_text="tsvector OR nosuchword")
results, total = await search_controller.search_transcripts(params)
assert total >= 1
found = any(r.id == test_id for r in results)
assert found, "Should find test transcript with OR query"
# Test 6: Quoted phrase search
params = SearchParameters(query_text='"full-text search"')
results, total = await search_controller.search_transcripts(params)
assert total >= 1
found = any(r.id == test_id for r in results)
assert found, "Should find test transcript by exact phrase"
finally:
await get_database().execute(
transcripts.delete().where(transcripts.c.id == test_id)
)
await get_database().disconnect()

View File

@@ -1,198 +0,0 @@
"""Unit tests for search snippet generation."""
from reflector.db.search import SearchController
class TestExtractWebVTT:
"""Test WebVTT text extraction."""
def test_extract_webvtt_with_speakers(self):
"""Test extraction removes speaker tags and timestamps."""
webvtt = """WEBVTT
00:00:00.000 --> 00:00:10.000
<v Speaker0>Hello world, this is a test.
00:00:10.000 --> 00:00:20.000
<v Speaker1>Indeed it is a test of WebVTT parsing.
"""
result = SearchController._extract_webvtt_text(webvtt)
assert "Hello world, this is a test" in result
assert "Indeed it is a test" in result
assert "<v Speaker" not in result
assert "00:00" not in result
assert "-->" not in result
def test_extract_empty_webvtt(self):
"""Test empty WebVTT returns empty string."""
assert SearchController._extract_webvtt_text("") == ""
assert SearchController._extract_webvtt_text(None) == ""
def test_extract_malformed_webvtt(self):
"""Test malformed WebVTT returns empty string."""
result = SearchController._extract_webvtt_text("Not a valid WebVTT")
assert result == ""
class TestGenerateSnippets:
"""Test snippet generation from plain text."""
def test_multiple_matches(self):
"""Test finding multiple occurrences of search term in long text."""
# Create text with Python mentions far apart to get separate snippets
separator = " This is filler text. " * 20 # ~400 chars of padding
text = (
"Python is great for machine learning."
+ separator
+ "Many companies use Python for data science."
+ separator
+ "Python has excellent libraries for analysis."
+ separator
+ "The Python community is very supportive."
)
snippets = SearchController._generate_snippets(text, "Python")
# With enough separation, we should get multiple snippets
assert len(snippets) >= 2 # At least 2 distinct snippets
# Each snippet should contain "Python"
for snippet in snippets:
assert "python" in snippet.lower()
def test_single_match(self):
"""Test single occurrence returns one snippet."""
text = "This document discusses artificial intelligence and its applications."
snippets = SearchController._generate_snippets(text, "artificial intelligence")
assert len(snippets) == 1
assert "artificial intelligence" in snippets[0].lower()
def test_no_matches(self):
"""Test no matches returns empty list."""
text = "This is some random text without the search term."
snippets = SearchController._generate_snippets(text, "machine learning")
assert snippets == []
def test_case_insensitive_search(self):
"""Test search is case insensitive."""
# Add enough text between matches to get separate snippets
text = (
"MACHINE LEARNING is important for modern applications. "
+ "It requires lots of data and computational resources. " * 5 # Padding
+ "Machine Learning rocks and transforms industries. "
+ "Deep learning is a subset of it. " * 5 # More padding
+ "Finally, machine learning will shape our future."
)
snippets = SearchController._generate_snippets(text, "machine learning")
# Should find at least 2 (might be 3 if text is long enough)
assert len(snippets) >= 2
for snippet in snippets:
assert "machine learning" in snippet.lower()
def test_partial_match_fallback(self):
"""Test fallback to first word when exact phrase not found."""
text = "We use machine intelligence for processing."
snippets = SearchController._generate_snippets(text, "machine learning")
# Should fall back to finding "machine"
assert len(snippets) == 1
assert "machine" in snippets[0].lower()
def test_snippet_ellipsis(self):
"""Test ellipsis added for truncated snippets."""
# Long text where match is in the middle
text = "a " * 100 + "TARGET_WORD special content here" + " b" * 100
snippets = SearchController._generate_snippets(text, "TARGET_WORD")
assert len(snippets) == 1
assert "..." in snippets[0] # Should have ellipsis
assert "TARGET_WORD" in snippets[0]
def test_overlapping_snippets_deduplicated(self):
"""Test overlapping matches don't create duplicate snippets."""
text = "test test test word" * 10 # Repeated pattern
snippets = SearchController._generate_snippets(text, "test")
# Should get unique snippets, not duplicates
assert len(snippets) <= 3
assert len(snippets) == len(set(snippets)) # All unique
def test_empty_inputs(self):
"""Test empty text or search term returns empty list."""
assert SearchController._generate_snippets("", "search") == []
assert SearchController._generate_snippets("text", "") == []
assert SearchController._generate_snippets("", "") == []
def test_max_snippets_limit(self):
"""Test respects max_snippets parameter."""
# Create text with well-separated occurrences
separator = " filler " * 50 # Ensure snippets don't overlap
text = ("Python is amazing" + separator) * 10 # 10 occurrences
# Test with different limits
snippets_1 = SearchController._generate_snippets(text, "Python", max_snippets=1)
assert len(snippets_1) == 1
snippets_2 = SearchController._generate_snippets(text, "Python", max_snippets=2)
assert len(snippets_2) == 2
snippets_5 = SearchController._generate_snippets(text, "Python", max_snippets=5)
assert len(snippets_5) == 5 # Should get exactly 5 with enough separation
def test_snippet_length(self):
"""Test snippet length is reasonable."""
text = "word " * 200 # Long text
snippets = SearchController._generate_snippets(text, "word")
for snippet in snippets:
# Default max_length is 150 + some context
assert len(snippet) <= 200 # Some buffer for ellipsis
class TestFullPipeline:
"""Test the complete WebVTT to snippets pipeline."""
def test_webvtt_to_snippets_integration(self):
"""Test full pipeline from WebVTT to search snippets."""
# Create WebVTT with well-separated content for multiple snippets
webvtt = (
"""WEBVTT
00:00:00.000 --> 00:00:10.000
<v Speaker0>Let's discuss machine learning applications in modern technology.
00:00:10.000 --> 00:00:20.000
<v Speaker1>"""
+ "Various industries are adopting new technologies. " * 10
+ """
00:00:20.000 --> 00:00:30.000
<v Speaker2>Machine learning is revolutionizing healthcare and diagnostics.
00:00:30.000 --> 00:00:40.000
<v Speaker3>"""
+ "Financial markets show interesting patterns. " * 10
+ """
00:00:40.000 --> 00:00:50.000
<v Speaker0>Machine learning in education provides personalized experiences.
"""
)
# Extract and generate snippets
plain_text = SearchController._extract_webvtt_text(webvtt)
snippets = SearchController._generate_snippets(plain_text, "machine learning")
# Should find at least 2 snippets (text might still be close together)
assert len(snippets) >= 1 # At minimum one snippet containing matches
assert len(snippets) <= 3 # At most 3 by default
# No WebVTT artifacts in snippets
for snippet in snippets:
assert "machine learning" in snippet.lower()
assert "<v Speaker" not in snippet
assert "00:00" not in snippet
assert "-->" not in snippet

View File

@@ -1,11 +1,15 @@
from contextlib import asynccontextmanager
import pytest
from httpx import AsyncClient
@pytest.mark.asyncio
async def test_transcript_create(client):
response = await client.post("/transcripts", json={"name": "test"})
async def test_transcript_create():
from reflector.app import app
async with AsyncClient(app=app, base_url="http://test/v1") as ac:
response = await ac.post("/transcripts", json={"name": "test"})
assert response.status_code == 200
assert response.json()["name"] == "test"
assert response.json()["status"] == "idle"
@@ -19,62 +23,71 @@ async def test_transcript_create(client):
@pytest.mark.asyncio
async def test_transcript_get_update_name(client):
response = await client.post("/transcripts", json={"name": "test"})
async def test_transcript_get_update_name():
from reflector.app import app
async with AsyncClient(app=app, base_url="http://test/v1") as ac:
response = await ac.post("/transcripts", json={"name": "test"})
assert response.status_code == 200
assert response.json()["name"] == "test"
tid = response.json()["id"]
response = await client.get(f"/transcripts/{tid}")
response = await ac.get(f"/transcripts/{tid}")
assert response.status_code == 200
assert response.json()["name"] == "test"
response = await client.patch(f"/transcripts/{tid}", json={"name": "test2"})
response = await ac.patch(f"/transcripts/{tid}", json={"name": "test2"})
assert response.status_code == 200
assert response.json()["name"] == "test2"
response = await client.get(f"/transcripts/{tid}")
response = await ac.get(f"/transcripts/{tid}")
assert response.status_code == 200
assert response.json()["name"] == "test2"
@pytest.mark.asyncio
async def test_transcript_get_update_locked(client):
response = await client.post("/transcripts", json={"name": "test"})
async def test_transcript_get_update_locked():
from reflector.app import app
async with AsyncClient(app=app, base_url="http://test/v1") as ac:
response = await ac.post("/transcripts", json={"name": "test"})
assert response.status_code == 200
assert response.json()["locked"] is False
tid = response.json()["id"]
response = await client.get(f"/transcripts/{tid}")
response = await ac.get(f"/transcripts/{tid}")
assert response.status_code == 200
assert response.json()["locked"] is False
response = await client.patch(f"/transcripts/{tid}", json={"locked": True})
response = await ac.patch(f"/transcripts/{tid}", json={"locked": True})
assert response.status_code == 200
assert response.json()["locked"] is True
response = await client.get(f"/transcripts/{tid}")
response = await ac.get(f"/transcripts/{tid}")
assert response.status_code == 200
assert response.json()["locked"] is True
@pytest.mark.asyncio
async def test_transcript_get_update_summary(client):
response = await client.post("/transcripts", json={"name": "test"})
async def test_transcript_get_update_summary():
from reflector.app import app
async with AsyncClient(app=app, base_url="http://test/v1") as ac:
response = await ac.post("/transcripts", json={"name": "test"})
assert response.status_code == 200
assert response.json()["long_summary"] is None
assert response.json()["short_summary"] is None
tid = response.json()["id"]
response = await client.get(f"/transcripts/{tid}")
response = await ac.get(f"/transcripts/{tid}")
assert response.status_code == 200
assert response.json()["long_summary"] is None
assert response.json()["short_summary"] is None
response = await client.patch(
response = await ac.patch(
f"/transcripts/{tid}",
json={"long_summary": "test_long", "short_summary": "test_short"},
)
@@ -82,46 +95,52 @@ async def test_transcript_get_update_summary(client):
assert response.json()["long_summary"] == "test_long"
assert response.json()["short_summary"] == "test_short"
response = await client.get(f"/transcripts/{tid}")
response = await ac.get(f"/transcripts/{tid}")
assert response.status_code == 200
assert response.json()["long_summary"] == "test_long"
assert response.json()["short_summary"] == "test_short"
@pytest.mark.asyncio
async def test_transcript_get_update_title(client):
response = await client.post("/transcripts", json={"name": "test"})
async def test_transcript_get_update_title():
from reflector.app import app
async with AsyncClient(app=app, base_url="http://test/v1") as ac:
response = await ac.post("/transcripts", json={"name": "test"})
assert response.status_code == 200
assert response.json()["title"] is None
tid = response.json()["id"]
response = await client.get(f"/transcripts/{tid}")
response = await ac.get(f"/transcripts/{tid}")
assert response.status_code == 200
assert response.json()["title"] is None
response = await client.patch(f"/transcripts/{tid}", json={"title": "test_title"})
response = await ac.patch(f"/transcripts/{tid}", json={"title": "test_title"})
assert response.status_code == 200
assert response.json()["title"] == "test_title"
response = await client.get(f"/transcripts/{tid}")
response = await ac.get(f"/transcripts/{tid}")
assert response.status_code == 200
assert response.json()["title"] == "test_title"
@pytest.mark.asyncio
async def test_transcripts_list_anonymous(client):
async def test_transcripts_list_anonymous():
# XXX this test is a bit fragile, as it depends on the storage which
# is shared between tests
from reflector.app import app
from reflector.settings import settings
response = await client.get("/transcripts")
async with AsyncClient(app=app, base_url="http://test/v1") as ac:
response = await ac.get("/transcripts")
assert response.status_code == 401
# if public mode, it should be allowed
try:
settings.PUBLIC_MODE = True
response = await client.get("/transcripts")
async with AsyncClient(app=app, base_url="http://test/v1") as ac:
response = await ac.get("/transcripts")
assert response.status_code == 200
finally:
settings.PUBLIC_MODE = False
@@ -178,19 +197,21 @@ async def authenticated_client2():
@pytest.mark.asyncio
async def test_transcripts_list_authenticated(authenticated_client, client):
async def test_transcripts_list_authenticated(authenticated_client):
# XXX this test is a bit fragile, as it depends on the storage which
# is shared between tests
from reflector.app import app
response = await client.post("/transcripts", json={"name": "testxx1"})
async with AsyncClient(app=app, base_url="http://test/v1") as ac:
response = await ac.post("/transcripts", json={"name": "testxx1"})
assert response.status_code == 200
assert response.json()["name"] == "testxx1"
response = await client.post("/transcripts", json={"name": "testxx2"})
response = await ac.post("/transcripts", json={"name": "testxx2"})
assert response.status_code == 200
assert response.json()["name"] == "testxx2"
response = await client.get("/transcripts")
response = await ac.get("/transcripts")
assert response.status_code == 200
assert len(response.json()["items"]) >= 2
names = [t["name"] for t in response.json()["items"]]
@@ -199,38 +220,44 @@ async def test_transcripts_list_authenticated(authenticated_client, client):
@pytest.mark.asyncio
async def test_transcript_delete(client):
response = await client.post("/transcripts", json={"name": "testdel1"})
async def test_transcript_delete():
from reflector.app import app
async with AsyncClient(app=app, base_url="http://test/v1") as ac:
response = await ac.post("/transcripts", json={"name": "testdel1"})
assert response.status_code == 200
assert response.json()["name"] == "testdel1"
tid = response.json()["id"]
response = await client.delete(f"/transcripts/{tid}")
response = await ac.delete(f"/transcripts/{tid}")
assert response.status_code == 200
assert response.json()["status"] == "ok"
response = await client.get(f"/transcripts/{tid}")
response = await ac.get(f"/transcripts/{tid}")
assert response.status_code == 404
@pytest.mark.asyncio
async def test_transcript_mark_reviewed(client):
response = await client.post("/transcripts", json={"name": "test"})
async def test_transcript_mark_reviewed():
from reflector.app import app
async with AsyncClient(app=app, base_url="http://test/v1") as ac:
response = await ac.post("/transcripts", json={"name": "test"})
assert response.status_code == 200
assert response.json()["name"] == "test"
assert response.json()["reviewed"] is False
tid = response.json()["id"]
response = await client.get(f"/transcripts/{tid}")
response = await ac.get(f"/transcripts/{tid}")
assert response.status_code == 200
assert response.json()["name"] == "test"
assert response.json()["reviewed"] is False
response = await client.patch(f"/transcripts/{tid}", json={"reviewed": True})
response = await ac.patch(f"/transcripts/{tid}", json={"reviewed": True})
assert response.status_code == 200
assert response.json()["reviewed"] is True
response = await client.get(f"/transcripts/{tid}")
response = await ac.get(f"/transcripts/{tid}")
assert response.status_code == 200
assert response.json()["reviewed"] is True

View File

@@ -2,17 +2,20 @@ import shutil
from pathlib import Path
import pytest
from httpx import AsyncClient
@pytest.fixture
async def fake_transcript(tmpdir, client):
async def fake_transcript(tmpdir):
from reflector.app import app
from reflector.settings import settings
from reflector.views.transcripts import transcripts_controller
settings.DATA_DIR = Path(tmpdir)
# create a transcript
response = await client.post("/transcripts", json={"name": "Test audio download"})
ac = AsyncClient(app=app, base_url="http://test/v1")
response = await ac.post("/transcripts", json={"name": "Test audio download"})
assert response.status_code == 200
tid = response.json()["id"]
@@ -36,17 +39,17 @@ async def fake_transcript(tmpdir, client):
["/mp3", "audio/mpeg"],
],
)
async def test_transcript_audio_download(
fake_transcript, url_suffix, content_type, client
):
response = await client.get(f"/transcripts/{fake_transcript.id}/audio{url_suffix}")
async def test_transcript_audio_download(fake_transcript, url_suffix, content_type):
from reflector.app import app
ac = AsyncClient(app=app, base_url="http://test/v1")
response = await ac.get(f"/transcripts/{fake_transcript.id}/audio{url_suffix}")
assert response.status_code == 200
assert response.headers["content-type"] == content_type
# test get 404
response = await client.get(
f"/transcripts/{fake_transcript.id}XXX/audio{url_suffix}"
)
ac = AsyncClient(app=app, base_url="http://test/v1")
response = await ac.get(f"/transcripts/{fake_transcript.id}XXX/audio{url_suffix}")
assert response.status_code == 404
@@ -58,16 +61,18 @@ async def test_transcript_audio_download(
],
)
async def test_transcript_audio_download_head(
fake_transcript, url_suffix, content_type, client
fake_transcript, url_suffix, content_type
):
response = await client.head(f"/transcripts/{fake_transcript.id}/audio{url_suffix}")
from reflector.app import app
ac = AsyncClient(app=app, base_url="http://test/v1")
response = await ac.head(f"/transcripts/{fake_transcript.id}/audio{url_suffix}")
assert response.status_code == 200
assert response.headers["content-type"] == content_type
# test head 404
response = await client.head(
f"/transcripts/{fake_transcript.id}XXX/audio{url_suffix}"
)
ac = AsyncClient(app=app, base_url="http://test/v1")
response = await ac.head(f"/transcripts/{fake_transcript.id}XXX/audio{url_suffix}")
assert response.status_code == 404
@@ -79,9 +84,12 @@ async def test_transcript_audio_download_head(
],
)
async def test_transcript_audio_download_range(
fake_transcript, url_suffix, content_type, client
fake_transcript, url_suffix, content_type
):
response = await client.get(
from reflector.app import app
ac = AsyncClient(app=app, base_url="http://test/v1")
response = await ac.get(
f"/transcripts/{fake_transcript.id}/audio{url_suffix}",
headers={"range": "bytes=0-100"},
)
@@ -99,9 +107,12 @@ async def test_transcript_audio_download_range(
],
)
async def test_transcript_audio_download_range_with_seek(
fake_transcript, url_suffix, content_type, client
fake_transcript, url_suffix, content_type
):
response = await client.get(
from reflector.app import app
ac = AsyncClient(app=app, base_url="http://test/v1")
response = await ac.get(
f"/transcripts/{fake_transcript.id}/audio{url_suffix}",
headers={"range": "bytes=100-"},
)
@@ -111,10 +122,13 @@ async def test_transcript_audio_download_range_with_seek(
@pytest.mark.asyncio
async def test_transcript_delete_with_audio(fake_transcript, client):
response = await client.delete(f"/transcripts/{fake_transcript.id}")
async def test_transcript_delete_with_audio(fake_transcript):
from reflector.app import app
ac = AsyncClient(app=app, base_url="http://test/v1")
response = await ac.delete(f"/transcripts/{fake_transcript.id}")
assert response.status_code == 200
assert response.json()["status"] == "ok"
response = await client.get(f"/transcripts/{fake_transcript.id}")
response = await ac.get(f"/transcripts/{fake_transcript.id}")
assert response.status_code == 404

View File

@@ -1,15 +1,19 @@
import pytest
from httpx import AsyncClient
@pytest.mark.asyncio
async def test_transcript_participants(client):
response = await client.post("/transcripts", json={"name": "test"})
async def test_transcript_participants():
from reflector.app import app
async with AsyncClient(app=app, base_url="http://test/v1") as ac:
response = await ac.post("/transcripts", json={"name": "test"})
assert response.status_code == 200
assert response.json()["participants"] == []
# create a participant
transcript_id = response.json()["id"]
response = await client.post(
response = await ac.post(
f"/transcripts/{transcript_id}/participants", json={"name": "test"}
)
assert response.status_code == 200
@@ -18,7 +22,7 @@ async def test_transcript_participants(client):
assert response.json()["name"] == "test"
# create another one with a speaker
response = await client.post(
response = await ac.post(
f"/transcripts/{transcript_id}/participants",
json={"name": "test2", "speaker": 1},
)
@@ -28,25 +32,28 @@ async def test_transcript_participants(client):
assert response.json()["name"] == "test2"
# get all participants via transcript
response = await client.get(f"/transcripts/{transcript_id}")
response = await ac.get(f"/transcripts/{transcript_id}")
assert response.status_code == 200
assert len(response.json()["participants"]) == 2
# get participants via participants endpoint
response = await client.get(f"/transcripts/{transcript_id}/participants")
response = await ac.get(f"/transcripts/{transcript_id}/participants")
assert response.status_code == 200
assert len(response.json()) == 2
@pytest.mark.asyncio
async def test_transcript_participants_same_speaker(client):
response = await client.post("/transcripts", json={"name": "test"})
async def test_transcript_participants_same_speaker():
from reflector.app import app
async with AsyncClient(app=app, base_url="http://test/v1") as ac:
response = await ac.post("/transcripts", json={"name": "test"})
assert response.status_code == 200
assert response.json()["participants"] == []
transcript_id = response.json()["id"]
# create a participant
response = await client.post(
response = await ac.post(
f"/transcripts/{transcript_id}/participants",
json={"name": "test", "speaker": 1},
)
@@ -54,7 +61,7 @@ async def test_transcript_participants_same_speaker(client):
assert response.json()["speaker"] == 1
# create another one with the same speaker
response = await client.post(
response = await ac.post(
f"/transcripts/{transcript_id}/participants",
json={"name": "test2", "speaker": 1},
)
@@ -62,14 +69,17 @@ async def test_transcript_participants_same_speaker(client):
@pytest.mark.asyncio
async def test_transcript_participants_update_name(client):
response = await client.post("/transcripts", json={"name": "test"})
async def test_transcript_participants_update_name():
from reflector.app import app
async with AsyncClient(app=app, base_url="http://test/v1") as ac:
response = await ac.post("/transcripts", json={"name": "test"})
assert response.status_code == 200
assert response.json()["participants"] == []
transcript_id = response.json()["id"]
# create a participant
response = await client.post(
response = await ac.post(
f"/transcripts/{transcript_id}/participants",
json={"name": "test", "speaker": 1},
)
@@ -78,7 +88,7 @@ async def test_transcript_participants_update_name(client):
# update the participant
participant_id = response.json()["id"]
response = await client.patch(
response = await ac.patch(
f"/transcripts/{transcript_id}/participants/{participant_id}",
json={"name": "test2"},
)
@@ -86,28 +96,31 @@ async def test_transcript_participants_update_name(client):
assert response.json()["name"] == "test2"
# verify the participant was updated
response = await client.get(
response = await ac.get(
f"/transcripts/{transcript_id}/participants/{participant_id}"
)
assert response.status_code == 200
assert response.json()["name"] == "test2"
# verify the participant was updated in transcript
response = await client.get(f"/transcripts/{transcript_id}")
response = await ac.get(f"/transcripts/{transcript_id}")
assert response.status_code == 200
assert len(response.json()["participants"]) == 1
assert response.json()["participants"][0]["name"] == "test2"
@pytest.mark.asyncio
async def test_transcript_participants_update_speaker(client):
response = await client.post("/transcripts", json={"name": "test"})
async def test_transcript_participants_update_speaker():
from reflector.app import app
async with AsyncClient(app=app, base_url="http://test/v1") as ac:
response = await ac.post("/transcripts", json={"name": "test"})
assert response.status_code == 200
assert response.json()["participants"] == []
transcript_id = response.json()["id"]
# create a participant
response = await client.post(
response = await ac.post(
f"/transcripts/{transcript_id}/participants",
json={"name": "test", "speaker": 1},
)
@@ -115,7 +128,7 @@ async def test_transcript_participants_update_speaker(client):
participant1_id = response.json()["id"]
# create another participant
response = await client.post(
response = await ac.post(
f"/transcripts/{transcript_id}/participants",
json={"name": "test2", "speaker": 2},
)
@@ -123,27 +136,27 @@ async def test_transcript_participants_update_speaker(client):
participant2_id = response.json()["id"]
# update the participant, refused as speaker is already taken
response = await client.patch(
response = await ac.patch(
f"/transcripts/{transcript_id}/participants/{participant2_id}",
json={"speaker": 1},
)
assert response.status_code == 400
# delete the participant 1
response = await client.delete(
response = await ac.delete(
f"/transcripts/{transcript_id}/participants/{participant1_id}"
)
assert response.status_code == 200
# update the participant 2 again, should be accepted now
response = await client.patch(
response = await ac.patch(
f"/transcripts/{transcript_id}/participants/{participant2_id}",
json={"speaker": 1},
)
assert response.status_code == 200
# ensure participant2 name is still there
response = await client.get(
response = await ac.get(
f"/transcripts/{transcript_id}/participants/{participant2_id}"
)
assert response.status_code == 200

View File

@@ -1,26 +1,7 @@
import asyncio
import time
import pytest
from httpx import ASGITransport, AsyncClient
@pytest.fixture
async def app_lifespan():
from asgi_lifespan import LifespanManager
from reflector.app import app
async with LifespanManager(app) as manager:
yield manager.app
@pytest.fixture
async def client(app_lifespan):
yield AsyncClient(
transport=ASGITransport(app=app_lifespan),
base_url="http://test/v1",
)
from httpx import AsyncClient
@pytest.mark.usefixtures("setup_database")
@@ -29,21 +10,23 @@ async def client(app_lifespan):
@pytest.mark.asyncio
async def test_transcript_process(
tmpdir,
whisper_transcript,
dummy_llm,
dummy_processors,
dummy_diarization,
dummy_storage,
client,
):
from reflector.app import app
ac = AsyncClient(app=app, base_url="http://test/v1")
# create a transcript
response = await client.post("/transcripts", json={"name": "test"})
response = await ac.post("/transcripts", json={"name": "test"})
assert response.status_code == 200
assert response.json()["status"] == "idle"
tid = response.json()["id"]
# upload mp3
response = await client.post(
response = await ac.post(
f"/transcripts/{tid}/record/upload?chunk_number=0&total_chunks=1",
files={
"chunk": (
@@ -56,38 +39,30 @@ async def test_transcript_process(
assert response.status_code == 200
assert response.json()["status"] == "ok"
# wait for processing to finish (max 10 minutes)
timeout_seconds = 600 # 10 minutes
start_time = time.monotonic()
while (time.monotonic() - start_time) < timeout_seconds:
# wait for processing to finish
while True:
# fetch the transcript and check if it is ended
resp = await client.get(f"/transcripts/{tid}")
resp = await ac.get(f"/transcripts/{tid}")
assert resp.status_code == 200
if resp.json()["status"] in ("ended", "error"):
break
await asyncio.sleep(1)
else:
pytest.fail(f"Initial processing timed out after {timeout_seconds} seconds")
# restart the processing
response = await client.post(
response = await ac.post(
f"/transcripts/{tid}/process",
)
assert response.status_code == 200
assert response.json()["status"] == "ok"
# wait for processing to finish (max 10 minutes)
timeout_seconds = 600 # 10 minutes
start_time = time.monotonic()
while (time.monotonic() - start_time) < timeout_seconds:
# wait for processing to finish
while True:
# fetch the transcript and check if it is ended
resp = await client.get(f"/transcripts/{tid}")
resp = await ac.get(f"/transcripts/{tid}")
assert resp.status_code == 200
if resp.json()["status"] in ("ended", "error"):
break
await asyncio.sleep(1)
else:
pytest.fail(f"Restart processing timed out after {timeout_seconds} seconds")
# check the transcript is ended
transcript = resp.json()
@@ -96,7 +71,7 @@ async def test_transcript_process(
assert transcript["title"] == "Llm Title"
# check topics and transcript
response = await client.get(f"/transcripts/{tid}/topics")
response = await ac.get(f"/transcripts/{tid}/topics")
assert response.status_code == 200
assert len(response.json()) == 1
assert "want to share" in response.json()[0]["transcript"]

View File

@@ -1,34 +0,0 @@
from datetime import datetime, timezone
from unittest.mock import AsyncMock, patch
import pytest
from reflector.db.recordings import Recording, recordings_controller
from reflector.db.transcripts import SourceKind, transcripts_controller
@pytest.mark.asyncio
async def test_recording_deleted_with_transcript():
recording = await recordings_controller.create(
Recording(
bucket_name="test-bucket",
object_key="recording.mp4",
recorded_at=datetime.now(timezone.utc),
)
)
transcript = await transcripts_controller.add(
name="Test Transcript",
source_kind=SourceKind.ROOM,
recording_id=recording.id,
)
with patch("reflector.db.transcripts.get_recordings_storage") as mock_get_storage:
storage_instance = mock_get_storage.return_value
storage_instance.delete_file = AsyncMock()
await transcripts_controller.remove_by_id(transcript.id)
storage_instance.delete_file.assert_awaited_once_with(recording.object_key)
assert await recordings_controller.get_by_id(recording.id) is None
assert await transcripts_controller.get_by_id(transcript.id) is None

View File

@@ -6,10 +6,10 @@
import asyncio
import json
import threading
import time
from pathlib import Path
import pytest
from httpx import AsyncClient
from httpx_ws import aconnect_ws
from uvicorn import Config, Server
@@ -21,97 +21,34 @@ class ThreadedUvicorn:
async def start(self):
self.thread.start()
timeout_seconds = 600 # 10 minutes
start_time = time.monotonic()
while (
not self.server.started
and (time.monotonic() - start_time) < timeout_seconds
):
while not self.server.started:
await asyncio.sleep(0.1)
if not self.server.started:
raise TimeoutError(
f"Server failed to start after {timeout_seconds} seconds"
)
def stop(self):
if self.thread.is_alive():
self.server.should_exit = True
timeout_seconds = 600 # 10 minutes
start_time = time.time()
while (
self.thread.is_alive() and (time.time() - start_time) < timeout_seconds
):
time.sleep(0.1)
if self.thread.is_alive():
raise TimeoutError(
f"Thread failed to stop after {timeout_seconds} seconds"
)
while self.thread.is_alive():
continue
@pytest.fixture
def appserver(tmpdir, setup_database, celery_session_app, celery_session_worker):
import threading
async def appserver(tmpdir, setup_database, celery_session_app, celery_session_worker):
from reflector.app import app
from reflector.db import get_database
from reflector.settings import settings
DATA_DIR = settings.DATA_DIR
settings.DATA_DIR = Path(tmpdir)
# start server in a separate thread with its own event loop
# start server
host = "127.0.0.1"
port = 1255
server_started = threading.Event()
server_exception = None
server_instance = None
config = Config(app=app, host=host, port=port)
server = ThreadedUvicorn(config)
await server.start()
def run_server():
nonlocal server_exception, server_instance
try:
# Create a new event loop for this thread
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
config = Config(app=app, host=host, port=port, loop=loop)
server_instance = Server(config)
async def start_server():
# Initialize database connection in this event loop
database = get_database()
await database.connect()
try:
await server_instance.serve()
finally:
await database.disconnect()
# Signal that server is starting
server_started.set()
loop.run_until_complete(start_server())
except Exception as e:
server_exception = e
server_started.set()
finally:
loop.close()
server_thread = threading.Thread(target=run_server, daemon=True)
server_thread.start()
# Wait for server to start
server_started.wait(timeout=30)
if server_exception:
raise server_exception
# Wait a bit more for the server to be fully ready
time.sleep(1)
yield server_instance, host, port
# Stop server
if server_instance:
server_instance.should_exit = True
server_thread.join(timeout=30)
yield (server, host, port)
server.stop()
settings.DATA_DIR = DATA_DIR
@@ -130,11 +67,9 @@ async def test_transcript_rtc_and_websocket(
dummy_transcript,
dummy_processors,
dummy_diarization,
dummy_transcript_translator,
dummy_storage,
fake_mp3_upload,
appserver,
client,
):
# goal: start the server, exchange RTC, receive websocket events
# because of that, we need to start the server in a thread
@@ -143,7 +78,8 @@ async def test_transcript_rtc_and_websocket(
# create a transcript
base_url = f"http://{host}:{port}/v1"
response = await client.post("/transcripts", json={"name": "Test RTC"})
ac = AsyncClient(base_url=base_url)
response = await ac.post("/transcripts", json={"name": "Test RTC"})
assert response.status_code == 200
tid = response.json()["id"]
@@ -155,16 +91,12 @@ async def test_transcript_rtc_and_websocket(
async with aconnect_ws(f"{base_url}/transcripts/{tid}/events") as ws:
print("Test websocket: CONNECTED")
try:
timeout_seconds = 600 # 10 minutes
start_time = time.monotonic()
while (time.monotonic() - start_time) < timeout_seconds:
while True:
msg = await ws.receive_json()
print(f"Test websocket: JSON {msg}")
if msg is None:
break
events.append(msg)
else:
print(f"Test websocket: TIMEOUT after {timeout_seconds} seconds")
except Exception as e:
print(f"Test websocket: EXCEPTION {e}")
finally:
@@ -188,11 +120,11 @@ async def test_transcript_rtc_and_websocket(
url = f"{base_url}/transcripts/{tid}/record/webrtc"
path = Path(__file__).parent / "records" / "test_short.wav"
stream_client = StreamClient(signaling, url=url, play_from=path.as_posix())
await stream_client.start()
client = StreamClient(signaling, url=url, play_from=path.as_posix())
await client.start()
timeout = 120
while not stream_client.is_ended():
timeout = 20
while not client.is_ended():
await asyncio.sleep(1)
timeout -= 1
if timeout < 0:
@@ -200,24 +132,21 @@ async def test_transcript_rtc_and_websocket(
# XXX aiortc is long to close the connection
# instead of waiting a long time, we just send a STOP
stream_client.channel.send(json.dumps({"cmd": "STOP"}))
await stream_client.stop()
client.channel.send(json.dumps({"cmd": "STOP"}))
await client.stop()
# wait the processing to finish
timeout = 120
timeout = 20
while True:
# fetch the transcript and check if it is ended
resp = await client.get(f"/transcripts/{tid}")
resp = await ac.get(f"/transcripts/{tid}")
assert resp.status_code == 200
if resp.json()["status"] in ("ended", "error"):
break
await asyncio.sleep(1)
timeout -= 1
if timeout < 0:
raise TimeoutError("Timeout while waiting for transcript to be ended")
if resp.json()["status"] != "ended":
raise TimeoutError("Transcript processing failed")
raise TimeoutError("Timeout while waiting for transcript to be ended")
# stop websocket task
websocket_task.cancel()
@@ -235,7 +164,7 @@ async def test_transcript_rtc_and_websocket(
assert "TRANSCRIPT" in eventnames
ev = events[eventnames.index("TRANSCRIPT")]
assert ev["data"]["text"].startswith("Hello world.")
assert ev["data"]["translation"] is None
assert ev["data"]["translation"] == "Bonjour le monde"
assert "TOPIC" in eventnames
ev = events[eventnames.index("TOPIC")]
@@ -260,7 +189,7 @@ async def test_transcript_rtc_and_websocket(
ev = events[eventnames.index("WAVEFORM")]
assert isinstance(ev["data"]["waveform"], list)
assert len(ev["data"]["waveform"]) >= 250
waveform_resp = await client.get(f"/transcripts/{tid}/audio/waveform")
waveform_resp = await ac.get(f"/transcripts/{tid}/audio/waveform")
assert waveform_resp.status_code == 200
assert waveform_resp.headers["content-type"] == "application/json"
assert isinstance(waveform_resp.json()["data"], list)
@@ -280,7 +209,7 @@ async def test_transcript_rtc_and_websocket(
assert "DURATION" in eventnames
# check that audio/mp3 is available
audio_resp = await client.get(f"/transcripts/{tid}/audio/mp3")
audio_resp = await ac.get(f"/transcripts/{tid}/audio/mp3")
assert audio_resp.status_code == 200
assert audio_resp.headers["Content-Type"] == "audio/mpeg"
@@ -295,11 +224,9 @@ async def test_transcript_rtc_and_websocket_and_fr(
dummy_transcript,
dummy_processors,
dummy_diarization,
dummy_transcript_translator,
dummy_storage,
fake_mp3_upload,
appserver,
client,
):
# goal: start the server, exchange RTC, receive websocket events
# because of that, we need to start the server in a thread
@@ -309,7 +236,8 @@ async def test_transcript_rtc_and_websocket_and_fr(
# create a transcript
base_url = f"http://{host}:{port}/v1"
response = await client.post(
ac = AsyncClient(base_url=base_url)
response = await ac.post(
"/transcripts", json={"name": "Test RTC", "target_language": "fr"}
)
assert response.status_code == 200
@@ -323,16 +251,12 @@ async def test_transcript_rtc_and_websocket_and_fr(
async with aconnect_ws(f"{base_url}/transcripts/{tid}/events") as ws:
print("Test websocket: CONNECTED")
try:
timeout_seconds = 600 # 10 minutes
start_time = time.monotonic()
while (time.monotonic() - start_time) < timeout_seconds:
while True:
msg = await ws.receive_json()
print(f"Test websocket: JSON {msg}")
if msg is None:
break
events.append(msg)
else:
print(f"Test websocket: TIMEOUT after {timeout_seconds} seconds")
except Exception as e:
print(f"Test websocket: EXCEPTION {e}")
finally:
@@ -356,11 +280,11 @@ async def test_transcript_rtc_and_websocket_and_fr(
url = f"{base_url}/transcripts/{tid}/record/webrtc"
path = Path(__file__).parent / "records" / "test_short.wav"
stream_client = StreamClient(signaling, url=url, play_from=path.as_posix())
await stream_client.start()
client = StreamClient(signaling, url=url, play_from=path.as_posix())
await client.start()
timeout = 120
while not stream_client.is_ended():
timeout = 20
while not client.is_ended():
await asyncio.sleep(1)
timeout -= 1
if timeout < 0:
@@ -368,28 +292,25 @@ async def test_transcript_rtc_and_websocket_and_fr(
# XXX aiortc is long to close the connection
# instead of waiting a long time, we just send a STOP
stream_client.channel.send(json.dumps({"cmd": "STOP"}))
client.channel.send(json.dumps({"cmd": "STOP"}))
# wait the processing to finish
await asyncio.sleep(2)
await stream_client.stop()
await client.stop()
# wait the processing to finish
timeout = 120
timeout = 20
while True:
# fetch the transcript and check if it is ended
resp = await client.get(f"/transcripts/{tid}")
resp = await ac.get(f"/transcripts/{tid}")
assert resp.status_code == 200
if resp.json()["status"] == "ended":
break
await asyncio.sleep(1)
timeout -= 1
if timeout < 0:
raise TimeoutError("Timeout while waiting for transcript to be ended")
if resp.json()["status"] != "ended":
raise TimeoutError("Transcript processing failed")
raise TimeoutError("Timeout while waiting for transcript to be ended")
await asyncio.sleep(2)
@@ -409,7 +330,7 @@ async def test_transcript_rtc_and_websocket_and_fr(
assert "TRANSCRIPT" in eventnames
ev = events[eventnames.index("TRANSCRIPT")]
assert ev["data"]["text"].startswith("Hello world.")
assert ev["data"]["translation"] == "en:fr:Hello world."
assert ev["data"]["translation"] == "Bonjour le monde"
assert "TOPIC" in eventnames
ev = events[eventnames.index("TOPIC")]

View File

@@ -1,16 +1,20 @@
import pytest
from httpx import AsyncClient
@pytest.mark.asyncio
async def test_transcript_reassign_speaker(fake_transcript_with_topics, client):
async def test_transcript_reassign_speaker(fake_transcript_with_topics):
from reflector.app import app
transcript_id = fake_transcript_with_topics.id
async with AsyncClient(app=app, base_url="http://test/v1") as ac:
# check the transcript exists
response = await client.get(f"/transcripts/{transcript_id}")
response = await ac.get(f"/transcripts/{transcript_id}")
assert response.status_code == 200
# check initial topics of the transcript
response = await client.get(f"/transcripts/{transcript_id}/topics/with-words")
response = await ac.get(f"/transcripts/{transcript_id}/topics/with-words")
assert response.status_code == 200
topics = response.json()
assert len(topics) == 2
@@ -27,7 +31,7 @@ async def test_transcript_reassign_speaker(fake_transcript_with_topics, client):
assert topics[1]["segments"][0]["speaker"] == 0
# reassign speaker
response = await client.patch(
response = await ac.patch(
f"/transcripts/{transcript_id}/speaker/assign",
json={
"speaker": 1,
@@ -38,7 +42,7 @@ async def test_transcript_reassign_speaker(fake_transcript_with_topics, client):
assert response.status_code == 200
# check topics again
response = await client.get(f"/transcripts/{transcript_id}/topics/with-words")
response = await ac.get(f"/transcripts/{transcript_id}/topics/with-words")
assert response.status_code == 200
topics = response.json()
assert len(topics) == 2
@@ -55,7 +59,7 @@ async def test_transcript_reassign_speaker(fake_transcript_with_topics, client):
assert topics[1]["segments"][0]["speaker"] == 0
# reassign speaker, middle of 2 topics
response = await client.patch(
response = await ac.patch(
f"/transcripts/{transcript_id}/speaker/assign",
json={
"speaker": 2,
@@ -66,7 +70,7 @@ async def test_transcript_reassign_speaker(fake_transcript_with_topics, client):
assert response.status_code == 200
# check topics again
response = await client.get(f"/transcripts/{transcript_id}/topics/with-words")
response = await ac.get(f"/transcripts/{transcript_id}/topics/with-words")
assert response.status_code == 200
topics = response.json()
assert len(topics) == 2
@@ -85,7 +89,7 @@ async def test_transcript_reassign_speaker(fake_transcript_with_topics, client):
assert topics[1]["segments"][1]["speaker"] == 0
# reassign speaker, everything
response = await client.patch(
response = await ac.patch(
f"/transcripts/{transcript_id}/speaker/assign",
json={
"speaker": 4,
@@ -96,7 +100,7 @@ async def test_transcript_reassign_speaker(fake_transcript_with_topics, client):
assert response.status_code == 200
# check topics again
response = await client.get(f"/transcripts/{transcript_id}/topics/with-words")
response = await ac.get(f"/transcripts/{transcript_id}/topics/with-words")
assert response.status_code == 200
topics = response.json()
assert len(topics) == 2
@@ -114,15 +118,18 @@ async def test_transcript_reassign_speaker(fake_transcript_with_topics, client):
@pytest.mark.asyncio
async def test_transcript_merge_speaker(fake_transcript_with_topics, client):
async def test_transcript_merge_speaker(fake_transcript_with_topics):
from reflector.app import app
transcript_id = fake_transcript_with_topics.id
async with AsyncClient(app=app, base_url="http://test/v1") as ac:
# check the transcript exists
response = await client.get(f"/transcripts/{transcript_id}")
response = await ac.get(f"/transcripts/{transcript_id}")
assert response.status_code == 200
# check initial topics of the transcript
response = await client.get(f"/transcripts/{transcript_id}/topics/with-words")
response = await ac.get(f"/transcripts/{transcript_id}/topics/with-words")
assert response.status_code == 200
topics = response.json()
assert len(topics) == 2
@@ -134,7 +141,7 @@ async def test_transcript_merge_speaker(fake_transcript_with_topics, client):
assert topics[1]["words"][1]["speaker"] == 0
# reassign speaker
response = await client.patch(
response = await ac.patch(
f"/transcripts/{transcript_id}/speaker/assign",
json={
"speaker": 1,
@@ -145,7 +152,7 @@ async def test_transcript_merge_speaker(fake_transcript_with_topics, client):
assert response.status_code == 200
# check topics again
response = await client.get(f"/transcripts/{transcript_id}/topics/with-words")
response = await ac.get(f"/transcripts/{transcript_id}/topics/with-words")
assert response.status_code == 200
topics = response.json()
assert len(topics) == 2
@@ -157,7 +164,7 @@ async def test_transcript_merge_speaker(fake_transcript_with_topics, client):
assert topics[1]["words"][1]["speaker"] == 0
# merge speakers
response = await client.patch(
response = await ac.patch(
f"/transcripts/{transcript_id}/speaker/merge",
json={
"speaker_from": 1,
@@ -167,7 +174,7 @@ async def test_transcript_merge_speaker(fake_transcript_with_topics, client):
assert response.status_code == 200
# check topics again
response = await client.get(f"/transcripts/{transcript_id}/topics/with-words")
response = await ac.get(f"/transcripts/{transcript_id}/topics/with-words")
assert response.status_code == 200
topics = response.json()
assert len(topics) == 2
@@ -180,19 +187,20 @@ async def test_transcript_merge_speaker(fake_transcript_with_topics, client):
@pytest.mark.asyncio
async def test_transcript_reassign_with_participant(
fake_transcript_with_topics, client
):
async def test_transcript_reassign_with_participant(fake_transcript_with_topics):
from reflector.app import app
transcript_id = fake_transcript_with_topics.id
async with AsyncClient(app=app, base_url="http://test/v1") as ac:
# check the transcript exists
response = await client.get(f"/transcripts/{transcript_id}")
response = await ac.get(f"/transcripts/{transcript_id}")
assert response.status_code == 200
transcript = response.json()
assert len(transcript["participants"]) == 0
# create 2 participants
response = await client.post(
response = await ac.post(
f"/transcripts/{transcript_id}/participants",
json={
"name": "Participant 1",
@@ -201,7 +209,7 @@ async def test_transcript_reassign_with_participant(
assert response.status_code == 200
participant1_id = response.json()["id"]
response = await client.post(
response = await ac.post(
f"/transcripts/{transcript_id}/participants",
json={
"name": "Participant 2",
@@ -211,7 +219,7 @@ async def test_transcript_reassign_with_participant(
participant2_id = response.json()["id"]
# check participants speakers
response = await client.get(f"/transcripts/{transcript_id}/participants")
response = await ac.get(f"/transcripts/{transcript_id}/participants")
assert response.status_code == 200
participants = response.json()
assert len(participants) == 2
@@ -221,7 +229,7 @@ async def test_transcript_reassign_with_participant(
assert participants[1]["speaker"] is None
# check initial topics of the transcript
response = await client.get(f"/transcripts/{transcript_id}/topics/with-words")
response = await ac.get(f"/transcripts/{transcript_id}/topics/with-words")
assert response.status_code == 200
topics = response.json()
assert len(topics) == 2
@@ -238,7 +246,7 @@ async def test_transcript_reassign_with_participant(
assert topics[1]["segments"][0]["speaker"] == 0
# reassign speaker from a participant
response = await client.patch(
response = await ac.patch(
f"/transcripts/{transcript_id}/speaker/assign",
json={
"participant": participant1_id,
@@ -250,7 +258,7 @@ async def test_transcript_reassign_with_participant(
# check participants if speaker has been assigned
# first participant should have 1, because it's not used yet.
response = await client.get(f"/transcripts/{transcript_id}/participants")
response = await ac.get(f"/transcripts/{transcript_id}/participants")
assert response.status_code == 200
participants = response.json()
assert len(participants) == 2
@@ -260,7 +268,7 @@ async def test_transcript_reassign_with_participant(
assert participants[1]["speaker"] is None
# check topics again
response = await client.get(f"/transcripts/{transcript_id}/topics/with-words")
response = await ac.get(f"/transcripts/{transcript_id}/topics/with-words")
assert response.status_code == 200
topics = response.json()
assert len(topics) == 2
@@ -277,7 +285,7 @@ async def test_transcript_reassign_with_participant(
assert topics[1]["segments"][0]["speaker"] == 0
# reassign participant, middle of 2 topics
response = await client.patch(
response = await ac.patch(
f"/transcripts/{transcript_id}/speaker/assign",
json={
"participant": participant2_id,
@@ -289,7 +297,7 @@ async def test_transcript_reassign_with_participant(
# check participants if speaker has been assigned
# first participant should have 1, because it's not used yet.
response = await client.get(f"/transcripts/{transcript_id}/participants")
response = await ac.get(f"/transcripts/{transcript_id}/participants")
assert response.status_code == 200
participants = response.json()
assert len(participants) == 2
@@ -299,7 +307,7 @@ async def test_transcript_reassign_with_participant(
assert participants[1]["speaker"] == 2
# check topics again
response = await client.get(f"/transcripts/{transcript_id}/topics/with-words")
response = await ac.get(f"/transcripts/{transcript_id}/topics/with-words")
assert response.status_code == 200
topics = response.json()
assert len(topics) == 2
@@ -318,7 +326,7 @@ async def test_transcript_reassign_with_participant(
assert topics[1]["segments"][1]["speaker"] == 0
# reassign speaker, everything
response = await client.patch(
response = await ac.patch(
f"/transcripts/{transcript_id}/speaker/assign",
json={
"participant": participant1_id,
@@ -329,7 +337,7 @@ async def test_transcript_reassign_with_participant(
assert response.status_code == 200
# check topics again
response = await client.get(f"/transcripts/{transcript_id}/topics/with-words")
response = await ac.get(f"/transcripts/{transcript_id}/topics/with-words")
assert response.status_code == 200
topics = response.json()
assert len(topics) == 2
@@ -347,17 +355,20 @@ async def test_transcript_reassign_with_participant(
@pytest.mark.asyncio
async def test_transcript_reassign_edge_cases(fake_transcript_with_topics, client):
async def test_transcript_reassign_edge_cases(fake_transcript_with_topics):
from reflector.app import app
transcript_id = fake_transcript_with_topics.id
async with AsyncClient(app=app, base_url="http://test/v1") as ac:
# check the transcript exists
response = await client.get(f"/transcripts/{transcript_id}")
response = await ac.get(f"/transcripts/{transcript_id}")
assert response.status_code == 200
transcript = response.json()
assert len(transcript["participants"]) == 0
# try reassign without any participant_id or speaker
response = await client.patch(
response = await ac.patch(
f"/transcripts/{transcript_id}/speaker/assign",
json={
"timestamp_from": 0,
@@ -367,7 +378,7 @@ async def test_transcript_reassign_edge_cases(fake_transcript_with_topics, clien
assert response.status_code == 400
# try reassing with both participant_id and speaker
response = await client.patch(
response = await ac.patch(
f"/transcripts/{transcript_id}/speaker/assign",
json={
"participant": "123",
@@ -379,7 +390,7 @@ async def test_transcript_reassign_edge_cases(fake_transcript_with_topics, clien
assert response.status_code == 400
# try reassing with non-existing participant_id
response = await client.patch(
response = await ac.patch(
f"/transcripts/{transcript_id}/speaker/assign",
json={
"participant": "123",

View File

@@ -1,18 +1,22 @@
import pytest
from httpx import AsyncClient
@pytest.mark.asyncio
async def test_transcript_topics(fake_transcript_with_topics, client):
async def test_transcript_topics(fake_transcript_with_topics):
from reflector.app import app
transcript_id = fake_transcript_with_topics.id
async with AsyncClient(app=app, base_url="http://test/v1") as ac:
# check the transcript exists
response = await client.get(f"/transcripts/{transcript_id}/topics")
response = await ac.get(f"/transcripts/{transcript_id}/topics")
assert response.status_code == 200
assert len(response.json()) == 2
topic_id = response.json()[0]["id"]
# get words per speakers
response = await client.get(
response = await ac.get(
f"/transcripts/{transcript_id}/topics/{topic_id}/words-per-speaker"
)
assert response.status_code == 200

View File

@@ -1,16 +1,20 @@
import pytest
from httpx import AsyncClient
@pytest.mark.asyncio
async def test_transcript_create_default_translation(client):
response = await client.post("/transcripts", json={"name": "test en"})
async def test_transcript_create_default_translation():
from reflector.app import app
async with AsyncClient(app=app, base_url="http://test/v1") as ac:
response = await ac.post("/transcripts", json={"name": "test en"})
assert response.status_code == 200
assert response.json()["name"] == "test en"
assert response.json()["source_language"] == "en"
assert response.json()["target_language"] == "en"
tid = response.json()["id"]
response = await client.get(f"/transcripts/{tid}")
response = await ac.get(f"/transcripts/{tid}")
assert response.status_code == 200
assert response.json()["name"] == "test en"
assert response.json()["source_language"] == "en"
@@ -18,8 +22,11 @@ async def test_transcript_create_default_translation(client):
@pytest.mark.asyncio
async def test_transcript_create_en_fr_translation(client):
response = await client.post(
async def test_transcript_create_en_fr_translation():
from reflector.app import app
async with AsyncClient(app=app, base_url="http://test/v1") as ac:
response = await ac.post(
"/transcripts", json={"name": "test en/fr", "target_language": "fr"}
)
assert response.status_code == 200
@@ -28,7 +35,7 @@ async def test_transcript_create_en_fr_translation(client):
assert response.json()["target_language"] == "fr"
tid = response.json()["id"]
response = await client.get(f"/transcripts/{tid}")
response = await ac.get(f"/transcripts/{tid}")
assert response.status_code == 200
assert response.json()["name"] == "test en/fr"
assert response.json()["source_language"] == "en"
@@ -36,8 +43,11 @@ async def test_transcript_create_en_fr_translation(client):
@pytest.mark.asyncio
async def test_transcript_create_fr_en_translation(client):
response = await client.post(
async def test_transcript_create_fr_en_translation():
from reflector.app import app
async with AsyncClient(app=app, base_url="http://test/v1") as ac:
response = await ac.post(
"/transcripts", json={"name": "test fr/en", "source_language": "fr"}
)
assert response.status_code == 200
@@ -46,7 +56,7 @@ async def test_transcript_create_fr_en_translation(client):
assert response.json()["target_language"] == "en"
tid = response.json()["id"]
response = await client.get(f"/transcripts/{tid}")
response = await ac.get(f"/transcripts/{tid}")
assert response.status_code == 200
assert response.json()["name"] == "test fr/en"
assert response.json()["source_language"] == "fr"

View File

@@ -1,7 +1,7 @@
import asyncio
import time
import pytest
from httpx import AsyncClient
@pytest.mark.usefixtures("setup_database")
@@ -14,16 +14,19 @@ async def test_transcript_upload_file(
dummy_processors,
dummy_diarization,
dummy_storage,
client,
):
from reflector.app import app
ac = AsyncClient(app=app, base_url="http://test/v1")
# create a transcript
response = await client.post("/transcripts", json={"name": "test"})
response = await ac.post("/transcripts", json={"name": "test"})
assert response.status_code == 200
assert response.json()["status"] == "idle"
tid = response.json()["id"]
# upload mp3
response = await client.post(
response = await ac.post(
f"/transcripts/{tid}/record/upload?chunk_number=0&total_chunks=1",
files={
"chunk": (
@@ -36,18 +39,14 @@ async def test_transcript_upload_file(
assert response.status_code == 200
assert response.json()["status"] == "ok"
# wait the processing to finish (max 10 minutes)
timeout_seconds = 600 # 10 minutes
start_time = time.monotonic()
while (time.monotonic() - start_time) < timeout_seconds:
# wait the processing to finish
while True:
# fetch the transcript and check if it is ended
resp = await client.get(f"/transcripts/{tid}")
resp = await ac.get(f"/transcripts/{tid}")
assert resp.status_code == 200
if resp.json()["status"] in ("ended", "error"):
break
await asyncio.sleep(1)
else:
pytest.fail(f"Processing timed out after {timeout_seconds} seconds")
# check the transcript is ended
transcript = resp.json()
@@ -56,7 +55,7 @@ async def test_transcript_upload_file(
assert transcript["title"] == "Llm Title"
# check topics and transcript
response = await client.get(f"/transcripts/{tid}/topics")
response = await ac.get(f"/transcripts/{tid}/topics")
assert response.status_code == 200
assert len(response.json()) == 1
assert "want to share" in response.json()[0]["transcript"]

Some files were not shown because too many files have changed in this diff Show More