Compare commits

132 Commits

Author SHA1 Message Date
e91979abbc feat: use jitsi file system 2025-09-17 15:16:03 -06:00
95e8011975 Merge main into jisti-integration branch
- Resolved conflicts in server/reflector/views/rooms.py to keep platform-agnostic approach
- Resolved conflicts in www/app/[roomName]/page.tsx to keep VideoPlatformEmbed approach
- Accepted main's version of generated API files (schemas.gen.ts, services.gen.ts, types.gen.ts)
- Removed config-template.ts as per main branch changes
2025-09-15 12:53:49 -06:00
c546e69739 fix: zulip stream and topic selection in share dialog (#644)
* fix: zulip stream and topic selection in share dialog

Replace useListCollection with createListCollection to match the working
room edit implementation. This ensures collections update when data loads,
fixing the issue where streams and topics wouldn't appear until navigation.

* fix: wrap createListCollection in useMemo to prevent recreation on every render

Both streamCollection and topicCollection are now memoized to improve performance
and prevent unnecessary re-renders of Combobox components
2025-09-15 12:34:51 -06:00
Igor Monadical
3f1fe8c9bf chore: remove timeout-based auth session logic (#649)
* remove timeout-based auth session logic

* remove timeout-based auth session logic

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-09-15 14:19:10 -04:00
5f143fe364 fix: zulip and consent handler on the file pipeline (#645) 2025-09-15 10:49:20 -06:00
Igor Monadical
79f161436e chore: meeting user id removal and room id requirement (#635)
* chore: remove meeting user id and make meeting room id required

* meeting room_id optional

* orphaned meeting room ids DATA migration

* ci fix

* fix meeting_room_id_fkey downgrade

* fix migration rollback

* fix: put index back (meeting room id)

* fix: put index back (meeting room id)

* fix: put index back (meeting room id)

* remove noop migrations

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-09-12 13:07:58 -04:00
Igor Monadical
5cba5d310d chore: sentry and nextjs major bumps (#633)
* chore: remove nextjs-config

* build fix

* sentry update

* nextjs update

* feature flags doc

* update readme

* explicit nextjs env vars + remove feature-unrelated things and obsolete vars from config

* full config removal

* remove force-dynamic from pages

* compile fix

* restore claude-deleted tests

* no sentry backward compat

* better .env.example

* AUTHENTIK_REFRESH_TOKEN_URL not so required

* accommodate auth system to requiredLogin feature

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-09-12 12:41:44 -04:00
43ea9349f5 chore(main): release 0.10.0 (#616) 2025-09-11 20:57:19 -06:00
Igor Monadical
b3a8e9739d chore: whereby & s3 settings env error reporting (#637)
Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-09-11 17:52:34 -04:00
Igor Monadical
369ecdff13 feat: replace nextjs-config with environment variables (#632)
* chore: remove nextjs-config

* build fix

* update readme

* explicit nextjs env vars + remove feature-unrelated things and obsolete vars from config

* full config removal

* remove force-dynamic from pages

* compile fix

* restore claude-deleted tests

* better .env.example

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-09-11 11:20:41 -04:00
fc363bd49b fix: missing follow_redirects=True on modal endpoint (#630) 2025-09-10 08:15:47 -06:00
Igor Monadical
962038ee3f fix: auth post (#627)
Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-09-09 16:46:57 -04:00
Igor Monadical
3b85ff3bdf fix: auth post (#626)
Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-09-09 16:27:46 -04:00
Igor Monadical
cde99ca271 fix: auth post (#624)
Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-09-09 15:48:07 -04:00
Igor Monadical
f81fe9948a fix: anonymous users transcript permissions (#621)
* fix: public transcript visibility

* fix: transcript permissions frontend

* dead code removal

* chore: remove unused code

* fix search tests

* fix search tests

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-09-09 10:50:29 -04:00
Igor Monadical
5a5b323382 fix: sync backend and frontend token refresh logic (#614)
* sync backend and frontend token refresh logic

* return react strict mode

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-09-08 10:40:18 -04:00
02a3938822 chore(main): release 0.9.0 (#603) 2025-09-05 22:50:10 -06:00
Igor Monadical
7f5a4c9ddc fix: token refresh locking (#613)
* fix: kv use tls explicit

* fix: token refresh locking

* remove logs

* compile fix

* compile fix

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-09-05 23:03:24 -04:00
Igor Monadical
08d88ec349 fix: kv use tls explicit (#610)
Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-09-05 18:39:32 -04:00
Igor Monadical
c4d2825c81 feat: frontend openapi react query (#606)
* refactor: migrate from @hey-api/openapi-ts to openapi-react-query

- Replace @hey-api/openapi-ts with openapi-typescript and openapi-react-query
- Generate TypeScript types from OpenAPI spec
- Set up React Query infrastructure with QueryClientProvider
- Migrate all API hooks to use React Query patterns
- Maintain backward compatibility for existing components
- Remove old API infrastructure and dependencies

* fix: resolve import errors and add missing api hooks

- Create constants.ts for RECORD_A_MEETING_URL
- Add api-types.ts for backward compatible type exports
- Update all imports from deleted api folder to new locations
- Add missing React Query hooks for rooms and zulip operations
- Create useApi compatibility layer for unmigrated components

* feat: migrate components to React Query hooks

- Add comprehensive API hooks for all operations
- Migrate rooms page to use React Query mutations
- Update transcript title component to use mutation hook
- Refactor share/privacy component with proper error handling
- Remove direct API client usage in favor of hooks

* feat: complete migration from @hey-api/openapi-ts to openapi-react-query

- Migrated all components from useApi compatibility layer to direct React Query hooks
- Added new hooks for participant operations, room meetings, and speaker operations
- Updated all imports from old api module to api-types
- Fixed TypeScript types and API endpoint signatures
- Removed deprecated useApi.ts compatibility layer
- Fixed SourceKind enum values to match OpenAPI spec
- Added @ts-ignore for Zulip endpoints not in OpenAPI spec yet
- Fixed all compilation errors and type issues

* fix: authentication flow with React Query migration

- Fix middleware management in apiClient to properly handle auth tokens
- Update ApiAuthProvider to correctly configure base URL and auth
- Add missing NextAuth API route handler at app/api/auth/[...nextauth]/route.ts
- Remove middleware ejection attempts (not supported by openapi-fetch)
- Use global variables to store current auth token and API URL
- Setup middleware once on initialization instead of repeatedly adding

This fixes the login/logout flow that was broken after migrating from
the useApi compatibility layer to native React Query hooks.

* fix: prevent unauthorized API calls before authentication

- Add global AuthGuard component to handle authentication at layout level
- Make all API query hooks conditional on authentication status
- Define public routes (like /transcripts/new) that don't require auth
- Fix login flow to use NextAuth signIn instead of non-existent /login route
- Prevent 401 errors by waiting for auth token before making API calls

Previously, all routes under (app) were publicly accessible with each page
handling auth individually. Now authentication is enforced globally while
still allowing specific routes to remain public.

* refactor: remove redundant client-side AuthGuard

The authentication is already properly handled by Next.js middleware
in middleware.ts with LOGIN_REQUIRED_PAGES. The middleware approach is
superior as it:
- Provides server-side protection before page loads
- Prevents flash of unauthorized content
- Centralizes auth logic in one place
- Better performance (no client-side JS needed)

Keep the API hooks conditional to prevent 401 errors before token is ready.

* fix: use direct status check for API query authentication

Changed all query hooks to use direct `status === "authenticated"` check
instead of derived `isAuthenticated && !isLoading` to avoid race conditions
where queries might fire before the authentication token is properly set.

This prevents the brief 401 errors that occur on page refresh when the
session is being restored.

* fix: correct content-type header for FormData uploads

Previously, the API client was setting a default Content-Type of application/json
for all requests, which broke file uploads that need multipart/form-data.

Now the client only sets application/json when the body is not FormData,
allowing FormData to automatically set the correct multipart boundary.

* fix: resolve authentication race condition with React Query

Previously, API calls were being made before the auth token was configured,
causing initial 401 errors that would retry with 200 after token setup.

Changes:
- Add global auth readiness tracking in apiClient
- Create useAuthReady hook that checks both session and token state
- Update all API hooks to use isAuthReady instead of just session status
- Add AuthWrapper component at layout level for consistent loading UX
- Show spinner while authentication initializes across all pages

This ensures API calls only fire after authentication is fully configured,
eliminating the 401/retry pattern and improving user experience.

* refactor: clean up api-hooks.ts comments and improve search invalidation

- Remove redundant function category comments (exports are self-explanatory)
- Remove obvious inline comments for query invalidation
- Fix search endpoint invalidation to clear all queries regardless of parameters

* refactor: remove api-types.ts compatibility layer

- Migrated all 29 files from api-types.ts to use reflector-api.d.ts directly
- Removed $SourceKind manual enum in favor of OpenAPI-generated types
- Fixed unrelated Spinner component TypeScript error in AuthWrapper.tsx
- All imports now use: import type { components } from "path/to/reflector-api"
- Deleted api-types.ts file completely

* refactor: rename api-hooks.ts to apiHooks.ts for consistency

- Renamed api-hooks.ts to apiHooks.ts to follow camelCase convention
- Updated all 21 import statements across the codebase
- Maintains consistency with other non-component files (apiClient.tsx, useAuthReady.ts, etc.)
- Follows established naming pattern: PascalCase for components, camelCase for utilities/hooks

* chore: add .playwright-mcp to .gitignore

* refactor: remove SK helper object and use inline type casting in FilterSidebar

Replace the SK (SourceKind) helper object with direct inline type casting
to simplify the code and reduce unnecessary abstraction.

* chore: clean up migration comments from React Query refactoring

- Remove temporary "// Use new React Query hooks" comments
- Remove "// React Query hooks" comments from browse and rooms pages
- Update package.json script name from codegen to openapi for consistency

* refactor: remove Redis dependencies from frontend authentication

- Replace Redis/Redlock with in-memory cache for token management
- Remove @vercel/kv, ioredis, and redlock dependencies from package.json
- Implement simple lock mechanism for concurrent token refresh prevention
- Use Map-based cache with TTL for token storage
- Maintain same authentication flow without external dependencies

This simplifies the infrastructure requirements and removes the need for
Redis while maintaining the same functionality through in-memory caching.

* fix: add staleTime to prevent stale cross-tab data

* fix: remove infinite re-render loop in useSessionAccessToken

The hook was maintaining redundant local state that caused re-renders
on every update, which triggered NextAuth to continuously refetch the
session, resulting in hundreds of POST requests to /api/auth/session.

Simplified the hook to directly return session values without
unnecessary state duplication.

* fix: handle undefined access tokens in auth.ts

Added fallback to empty string for potentially undefined access_token
and refresh_token from NextAuth account object to satisfy
JWTWithAccessToken type requirements.

* Igor/mathieu/frontend openapi react query (#597)

* small typing

* typing fixes

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>

* self-review-fix

* authReady callback simplify

* fix auth

* fix compose

* room detail page fix

* compile fix

* room edit fix

* normalize auth provider

* granular room edit state management

* cover TODOs + cross-tab cache

* session auto refresh blink

* schema generator error type doc

* protect from zombie auth

* clarify access token refresh logic a bit

* remove react-query tab sharing cache

* remove react-query tab sharing cache

* websocket dupe react devmode protection

* invalidate room on room update

* redis cache

* test ts server

* ci randomness

* less edgy config (ci)

* less edgy config (ci)

* less edgy config (ci)

* ci randomness

* ci randomness

* ci randomness

* ci randomness

* less edgy config (ci)

* added vs edited room state cleanup

* file upload real-time state management fix

* prettier auth state ternary

* prettier auth state ternary

* proper api address from env

* INTERVAL_REFRESH_MS

* node version 20 for tests

* github debug

* github debug

* github debug

* github debug

* github debug

* github debug

* github debug

* github debug

* github debug

* github debug

* github debug

* CI debug

* CI debug

* nextjs magic

* nextjs magic

* doc

* client-side stale auth soft safety net

---------

Co-authored-by: Mathieu Virbel <mat@meltingrocks.com>
Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-09-05 16:01:31 -06:00
0663700a61 fix: align whisper transcriber api with parakeet (#602)
* Documents transcriber api

* Update whisper transcriber api to match parakeet

* Update api transcription spec

* Return 400 for unsupported file type

* Add params to api spec

* Update whisper transcriber implementation to match parakeet
2025-09-05 10:52:14 +02:00
293f7d4f1f feat: implement frontend video platform configuration and abstraction
- Add NEXT_PUBLIC_VIDEO_PLATFORM environment variable support
- Create video platform abstraction layer with factory pattern
- Implement Whereby and Jitsi platform providers
- Update room meeting page to use platform-agnostic component
- Add platform display in room management (cards and table views)
- Support single platform per deployment configuration
- Maintain backward compatibility with existing Whereby integration

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-09-04 12:21:51 -06:00
dc82f8bb3b fix: source kind for file processing (#601) 2025-09-04 08:42:21 -06:00
41224a424c docs: move platform-jitsi.md to docs/ directory 2025-09-02 18:28:50 -06:00
dd0089906f fix: replace datetime.utcnow() with datetime.now(tz=timezone.utc) in Jitsi health check 2025-09-02 18:25:55 -06:00
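The change above swaps a naive timestamp for a timezone-aware one. A minimal sketch of the difference, using only the standard library. `utc_now` and `is_stale` are hypothetical helper names chosen for illustration and are not taken from the Jitsi health check itself:

```python
from datetime import datetime, timedelta, timezone


def utc_now() -> datetime:
    """Return the current time as a timezone-aware UTC datetime."""
    return datetime.now(tz=timezone.utc)


def is_stale(last_seen: datetime, max_age: timedelta) -> bool:
    """Hypothetical helper: True when last_seen is older than max_age."""
    return utc_now() - last_seen > max_age


aware = utc_now()
# Dropping tzinfo reproduces the kind of value datetime.utcnow() returns:
# the same wall-clock fields, but no attached timezone.
naive = aware.replace(tzinfo=None)

print(aware.tzinfo is timezone.utc)  # True
print(naive.tzinfo is None)          # True

# Aware and naive values cannot be subtracted or ordered against each other.
try:
    aware - naive
except TypeError:
    print("naive and aware datetimes do not mix")
```

Keeping every timestamp aware avoids that `TypeError` at comparison time and makes serialized values carry an explicit UTC offset.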
fa559b1970 feat: update and expand video platform tests
- Update existing tests for StrEnum instead of string literals
- Add comprehensive WherebyClient tests with HTTP mocking
- Add webhook event storage tests for participant and recording events
- Add typing overload tests for create_platform_client factory
- Update webhook test paths to new video_platforms router locations
- Fix mock ordering and parameter issues in async tests
- Test all platform client functionality including signature verification
- Verify webhook event storage with proper timestamp handling

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-09-02 18:16:41 -06:00
c26ce65083 feat: update Jitsi documentation with webhook events storage system
- Add comprehensive webhook event storage documentation
- Document event structure and JSON storage in meetings table
- Add practical webhook testing examples with proper signature generation
- Include detailed troubleshooting for webhook signature verification issues
- Add webhook event payload examples for all supported event types
- Document event storage verification and database querying methods
- Enhance existing webhook configuration with real-world examples

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-09-02 18:09:44 -06:00
52eff2acc0 feat: clean up legacy code and remove excessive documentation
- Remove excessive inline comments from meeting creation flow
- Remove verbose docstrings from simple property methods and basic functions
- Clean up obvious comments like 'Generate JWT tokens', 'Build room URLs'
- Remove unnecessary explanatory comments in platform clients
- Keep only essential documentation for complex logic
- Simplify race condition handling comments
- Remove excessive method documentation for simple operations

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-09-02 18:08:12 -06:00
7875ec3432 feat: move platform routers to video_platforms folders
- Move Jitsi router from views/jitsi.py to video_platforms/jitsi/router.py
- Move Whereby router from views/whereby.py to video_platforms/whereby/router.py
- Update __init__.py files to export routers from platform packages
- Update app.py imports to use video_platforms instead of views
- Remove old view files after successful migration
- Maintain exact same API endpoint paths (/v1/jitsi, /v1/whereby)

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-09-02 18:05:04 -06:00
398be06fad feat: add typing overloads and clean up platform client factory
- Add typing overloads to get_platform_client for JitsiClient and WherebyClient return types
- Add overloads to create_platform_client in factory for better IDE support
- Remove PyJWT fallback imports from views/rooms.py
- Remove platform defaults from CreateRoom and UpdateRoom models
- Clean up legacy whereby fallback code in meeting creation
- Use direct platform client access instead of conditional fallbacks

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-09-02 18:02:43 -06:00
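For readers unfamiliar with typing overloads on a factory, the idea in the commit above can be sketched as follows. This is a simplified illustration: the two client classes are empty stand-ins, and the real `create_platform_client` signature (which likely also takes configuration) is not shown in this log, so the parameter list here is an assumption.

```python
from typing import Literal, Union, overload


class WherebyClient:
    """Stand-in for the real Whereby client (hypothetical, no behavior)."""


class JitsiClient:
    """Stand-in for the real Jitsi client (hypothetical, no behavior)."""


# The overloads are seen only by type checkers and IDEs: they map each
# literal platform name to the concrete client type that is returned.
@overload
def create_platform_client(platform: Literal["whereby"]) -> WherebyClient: ...
@overload
def create_platform_client(platform: Literal["jitsi"]) -> JitsiClient: ...


def create_platform_client(platform: str) -> Union[WherebyClient, JitsiClient]:
    # Single runtime implementation behind both overloads.
    if platform == "whereby":
        return WherebyClient()
    if platform == "jitsi":
        return JitsiClient()
    raise ValueError(f"unknown platform: {platform!r}")


client = create_platform_client("jitsi")  # inferred as JitsiClient
print(type(client).__name__)
```

With the overloads in place, a call such as `create_platform_client("jitsi")` is typed as `JitsiClient` rather than as the union, which is what gives the IDE support mentioned in the commit.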
da700069d9 Add webhook events storage to meetings model
- Add events column as JSON type to meetings table with default empty array
- Add events: List[Dict[str, Any]] field to Meeting model
- Create migration 2890b5104577 for events column and apply successfully
- Add MeetingController helper methods for event storage:
  - add_event() for generic event storage with timestamps
  - participant_joined(), participant_left() for participant tracking
  - recording_started(), recording_stopped() for recording events
  - get_events() for event retrieval
- Update Jitsi webhook endpoints to store events:
  - Store participant join/leave events with data and timestamps
  - Store recording start/stop events from Prosody webhooks
  - Store recording completion events from Jibri finalize script
- Events stored with type, timestamp, and data for webhook history tracking
- Fix linting and formatting issues

Addresses PR feedback point 12: save webhook events in meetings events field
2025-09-02 17:53:35 -06:00
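The commit above describes events stored with a type, a timestamp, and a data payload. A rough sketch of that shape, kept independent of the database layer. The function signatures, the `participant_id` key, and the event type strings are assumptions made for illustration; only the three-field structure comes from the commit message.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional

Event = Dict[str, Any]


def add_event(events: List[Event], event_type: str, data: Dict[str, Any]) -> List[Event]:
    """Return a new list with one more event; the input list is not mutated."""
    event = {
        "type": event_type,
        "timestamp": datetime.now(tz=timezone.utc).isoformat(),
        "data": data,
    }
    return [*events, event]


def participant_joined(events: List[Event], participant_id: str) -> List[Event]:
    """Convenience wrapper, mirroring the helper named in the commit."""
    return add_event(events, "participant_joined", {"participant_id": participant_id})


def get_events(events: List[Event], event_type: Optional[str] = None) -> List[Event]:
    """Return all events, or only those matching event_type."""
    if event_type is None:
        return list(events)
    return [e for e in events if e["type"] == event_type]


history: List[Event] = []  # matches the "default empty array" of the column
history = participant_joined(history, "user-1")
history = add_event(history, "recording_started", {})
print([e["type"] for e in get_events(history)])
```

Because every entry is plain JSON-compatible data, a list like `history` can be written to a JSON column without further conversion.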
51229a1790 Fix Jitsi client issues and create typed meeting data
- Remove 'transcription': True from JWT features in _generate_jwt
- Replace int(time.time()) with generate_uuid4() for room naming to avoid conflicts
- Replace datetime.utcnow() with datetime.now(tz=timezone.utc) for proper timezone handling
- Create JitsiMeetingData(MeetingData) class with typed extra_data properties
- Update PLATFORM_NAME = VideoPlatform.JITSI to use enum
- Update create_meeting to return JitsiMeetingData instance with proper typing
- Fix get_room_sessions mock to use timezone-aware datetime
- Export JitsiMeetingData from jitsi module

Addresses PR feedback points 4, 5, 6, 10: remove transcription features, use UUID,
proper datetime handling, and typed meeting data
2025-09-02 17:44:04 -06:00
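The room-naming change in the commit above can be illustrated with a short sketch. `generate_uuid4()` is the project's own helper and is not shown in this log, so the standard library `uuid.uuid4()` is used here instead; `make_room_name` and the prefix format are hypothetical.

```python
import uuid


def make_room_name(prefix: str) -> str:
    """Build a room name whose suffix is a random UUID.

    A suffix based on int(time.time()) has one-second resolution, so two
    rooms created for the same prefix within the same second would get the
    same name. A random UUID makes such a collision practically impossible.
    """
    return f"{prefix}-{uuid.uuid4()}"


first = make_room_name("standup")
second = make_room_name("standup")
print(first != second)  # True
```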
2d2c23f7cc Create video_platforms/whereby structure and WherebyClient
- Create video_platforms/whereby/ directory with __init__.py, client.py, tasks.py
- Implement WherebyClient inheriting from VideoPlatformClient interface
- Move all functions from whereby.py into WherebyClient methods
- Use VideoPlatform.WHEREBY enum for PLATFORM_NAME
- Register WherebyClient in platform registry
- Update factory.py to include S3 bucket config for whereby
- Update worker process to use platform abstraction for get_room_sessions
- Preserve exact API behavior for meeting activity detection
- Maintain AWS S3 configuration handling in WherebyClient
- Fix linting and formatting issues

Addresses PR feedback point 7: implement video_platforms/whereby structure
Note: whereby.py kept for legacy fallback until task 7 cleanup
2025-09-02 17:40:32 -06:00
0acb9cac79 Replace Literal with VideoPlatform StrEnum for platform field
- Create VideoPlatform StrEnum with WHEREBY and JITSI values
- Update rooms.py and meetings.py to use VideoPlatform enum
- Update views/rooms.py and video_platforms/factory.py to use enum values
- Generate new migration with proper server_default='whereby'
- Apply migration successfully with backward compatibility
- Fix linting and formatting issues

Addresses PR feedback point 1: use StrEnum instead of Literal[]
2025-09-02 17:36:14 -06:00
d861d92cc2 docs: add comprehensive Jitsi Meet integration user guide
- Complete end-user configuration guide for self-hosted Jitsi Meet
- Covers installation, JWT authentication, and Prosody configuration
- Webhook event handling with mod_event_sync setup
- Jibri recording service configuration and finalize script
- Room creation, JWT token management, and security best practices
- Comprehensive troubleshooting with debug commands and solutions
- Performance optimization and scaling considerations
- Migration guidance from Whereby platform

🤖 Generated with Claude Code
2025-09-02 17:07:09 -06:00
24ff83a2ec docs: add comprehensive Whereby integration user guide
- Complete end-user configuration guide for Whereby video platform
- Covers account setup, API key generation, and webhook configuration
- AWS S3 storage setup with IAM permissions and security best practices
- Room creation, recording options, and meeting feature configuration
- Troubleshooting guide with common issues and debug commands
- Security considerations and performance optimization tips
- Migration guidance from other platforms

🤖 Generated with Claude Code
2025-09-02 17:05:40 -06:00
249234238c feat: add comprehensive video platform test suite
- Created complete test coverage for video platform abstraction
- Tests for base classes, JitsiClient implementation, and platform registry
- JWT generation tests with proper mocking and error scenarios
- Webhook signature verification tests (valid/invalid/missing secret)
- Platform factory tests for Jitsi and Whereby configuration
- Registry tests for platform registration and client creation
- Webhook endpoint tests with signature verification and error cases
- Integration tests for rooms endpoint with platform abstraction
- 24 comprehensive test cases covering all video platform functionality
- All tests passing with proper mocking and isolation

🤖 Generated with Claude Code
2025-09-02 16:54:58 -06:00
42a603d5c3 feat: add PyJWT dependency and finalize Jitsi integration
- Added PyJWT>=2.8.0 to pyproject.toml dependencies
- Installed dependency via uv sync successfully
- Verified JWT generation functionality works correctly
- Confirmed platform factory creates JitsiClient instances
- Validated database migrations applied (platform fields available)
- Tested webhook endpoints are registered and functional
- Verified FastAPI app starts without errors with full integration
- All integration tests pass - Jitsi platform fully functional

🤖 Generated with Claude Code
2025-09-02 16:28:44 -06:00
6d2092f950 feat: create comprehensive Jitsi integration documentation
- Added complete end-user configuration guide at server/platform-jitsi.md
- Covers prerequisites, environment setup, and Jitsi Meet configuration
- Includes JWT authentication, Jibri recording, and Prosody event-sync setup
- Provides troubleshooting guide with common issues and solutions
- Documents security best practices and performance optimization
- Includes testing procedures and migration guidance from Whereby
- Ready for production deployment with step-by-step instructions
- Uses environment variable placeholders for security

🤖 Generated with Claude Code
2025-09-02 16:24:47 -06:00
f2bb6aaecb feat: update rooms.py to use video platform abstraction
- Added platform field to Room, CreateRoom, and UpdateRoom models
- Updated rooms_create function to pass platform parameter
- Rewrote rooms_create_meeting to use platform factory pattern
- Added graceful fallback to legacy whereby implementation
- Maintained API compatibility and error handling patterns
- Prepared for multi-platform support (Whereby/Jitsi)

🤖 Generated with Claude Code
2025-09-02 16:21:58 -06:00
2b136ac7b0 feat: create Jitsi webhook endpoints for event handling
- Added comprehensive Jitsi webhook endpoint in views/jitsi.py
- Handles Prosody event-sync events (muc-occupant-joined/left)
- Implements participant counting following whereby.py pattern
- Added Jibri recording completion webhook endpoint
- Includes signature verification with fallback when platform client unavailable
- Registered router in app.py for /v1/jitsi endpoints
- Added health check endpoint for webhook configuration

🤖 Generated with Claude Code
2025-09-02 16:19:54 -06:00
3f4fc26483 feat: register Jitsi platform in video platforms factory and registry
- Added JitsiClient registration to platform registry
- Enables dynamic platform selection through factory pattern
- Factory configuration already supports Jitsi settings
- Platform abstraction layer now supports both Whereby and Jitsi

🤖 Generated with Claude Code
2025-09-02 16:17:32 -06:00
8e5ef5bca6 feat: implement JitsiClient with JWT authentication
Complete implementation of JitsiClient following VideoPlatformClient interface
with JWT-based room access control and webhook signature verification.

- Add JWT token generation with proper payload structure
- Implement unique room name generation with timestamp
- Create separate user/host JWT tokens with moderator permissions
- Build secure room URLs with embedded JWT parameters
- Add HMAC-SHA256 webhook signature verification for Prosody events
- Implement all abstract methods with Jitsi-specific behavior
- Include comprehensive typing and error handling

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-09-02 16:15:49 -06:00
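The HMAC-SHA256 webhook signature check mentioned in the commit above could look roughly like this. It is a sketch under stated assumptions: the signature is taken to be the hex digest of the raw request body, and the function names are hypothetical. How Prosody actually transmits the signature (header name, encoding) is not specified in this log.

```python
import hashlib
import hmac


def sign_payload(secret: str, body: bytes) -> str:
    """Compute the hex-encoded HMAC-SHA256 of the raw request body."""
    return hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()


def verify_signature(secret: str, body: bytes, signature: str) -> bool:
    """Check a received signature against the expected one.

    hmac.compare_digest compares in constant time, so the check does not
    reveal how many leading characters of a forged signature were correct.
    Both sides are encoded to bytes so non-ASCII input cannot raise.
    """
    expected = sign_payload(secret, body)
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))


secret = "example-secret"  # placeholder value, not a real credential
body = b'{"event": "muc-occupant-joined"}'
signature = sign_payload(secret, body)

print(verify_signature(secret, body, signature))         # True
print(verify_signature(secret, body + b" ", signature))  # False
```

The signature must be computed over the exact bytes received. Re-serializing parsed JSON before verifying can change whitespace or key order and make a valid signature fail.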
d49fdcb38d feat: create video platforms architecture with Jitsi directory structure
Create complete video platforms abstraction layer following daily.co branch
pattern with Jitsi-specific directory structure.

- Add video_platforms base module with abstract classes
- Create VideoPlatformClient, MeetingData, VideoPlatformConfig interfaces
- Add platform registry system for client management
- Create factory pattern for platform client creation
- Add Jitsi directory structure with __init__.py, tasks.py, client.py
- Configure Jitsi platform in factory with JWT-based authentication

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-09-02 16:14:42 -06:00
d42380abf1 feat: add Jitsi configuration settings
Add comprehensive Jitsi Meet configuration settings to settings.py
following the same pattern as WHEREBY settings.

- Add JITSI_DOMAIN with meet.jit.si default
- Add JITSI_JWT_SECRET for JWT token signing
- Add JITSI_WEBHOOK_SECRET for webhook validation
- Add JITSI_APP_ID, JITSI_JWT_ISSUER, JITSI_JWT_AUDIENCE for JWT configuration
- Follow consistent naming and typing patterns

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-09-02 16:11:00 -06:00
cf64e1a3d9 feat: add database migration for platform field
Generate Alembic migration to add platform column to rooms and meetings
tables enabling multi-platform video conferencing support.

- Add platform column to meeting table with whereby default
- Add platform column to room table with whereby default
- Migration tested successfully with alembic upgrade head

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-09-02 16:10:11 -06:00
ea53ca7000 feat: add platform field to Room and Meeting models
Add platform column to rooms and meetings database tables with Literal typing
to support multiple video conferencing platforms (whereby, jitsi).

- Add platform column to rooms/meetings SQLAlchemy tables with whereby default
- Update Room/Meeting Pydantic models with platform field and Literal typing
- Modify RoomController.add() to accept platform parameter
- Update MeetingController.create() to copy platform from room

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-09-02 16:08:38 -06:00
457823e1c1 chore(main): release 0.8.2 (#595) 2025-09-01 19:09:09 -06:00
Igor Monadical
695d1a957d fix: search-logspam (#593)
* fix: search-logspam

* llm comment

* fix tests

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-08-29 18:55:51 -04:00
ccffdba75b chore(main): release 0.8.1 (#591) 2025-08-29 11:56:11 -06:00
84a381220b fix: allow null for webhook secret/url (#590) 2025-08-29 11:55:18 -06:00
5f2f0e9317 chore(main): release 0.8.0 (#579) 2025-08-29 11:34:24 -06:00
88ed7cfa78 feat(rooms): add webhook for transcript completion (#578)
* feat(rooms): add webhook notifications for transcript completion

- Add webhook_url and webhook_secret fields to rooms table
- Create Celery task with 24-hour retry window using exponential backoff
- Send transcript metadata, diarized text, topics, and summaries via webhook
- Add HMAC signature verification for webhook security
- Add test endpoint POST /v1/rooms/{room_id}/webhook/test
- Update frontend with webhook configuration UI and test button
- Auto-generate webhook secret if not provided
- Trigger webhook after successful file pipeline processing for room recordings

* style: linting

* fix: remove unwanted files

* fix: update openapi gen

* fix: self-review

* docs: add comprehensive webhook documentation

- Document webhook configuration, events, and payloads
- Include transcript.completed and test event examples
- Add security considerations and best practices
- Provide example webhook receiver implementation
- Document retry policy and signature verification

* fix: remove audio_mp3_url from webhook payload

- Remove audio download URL generation from webhook
- Update documentation to reflect the change
- Keep only frontend_url for accessing transcripts

* docs: remove unwanted section

* fix: correct API method name and type imports for rooms

- Fix v1RoomsRetrieve to v1RoomsGet
- Update Room type to RoomDetails throughout frontend
- Fix type imports in useRoomList, RoomList, RoomTable, and RoomCards

* feat: add show/hide toggle for webhook secret field

- Add eye icon button to reveal/hide webhook secret when editing
- Show password dots when webhook secret is hidden
- Reset visibility state when opening/closing dialog
- Only show toggle button when editing existing room with secret

* fix: resolve event loop conflict in webhook test endpoint

- Extract webhook test logic into shared async function
- Call async function directly from FastAPI endpoint
- Keep Celery task wrapper for background processing
- Fixes RuntimeError: event loop already running
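
The event loop conflict described above can be reproduced in isolation. The sketch below is an analogy only: `send_test_webhook` is a hypothetical stand-in for the shared async function, and `asyncio.run` is used to trigger the nested-loop error, which may differ from the exact call that failed in the endpoint.

```python
import asyncio


async def send_test_webhook(url: str) -> str:
    """Hypothetical stand-in for the shared async webhook test function."""
    await asyncio.sleep(0)  # placeholder for the real HTTP call
    return f"sent to {url}"


async def endpoint_broken(url: str) -> str:
    # Starting a second loop from inside a running one is rejected by asyncio.
    coro = send_test_webhook(url)
    try:
        return asyncio.run(coro)
    except RuntimeError as exc:
        coro.close()  # avoid a "never awaited" warning
        return f"error: {exc}"


async def endpoint_fixed(url: str) -> str:
    # Already inside the loop, so simply await the coroutine.
    return await send_test_webhook(url)
```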

* refactor: remove unnecessary Celery task for webhook testing

- Webhook testing is synchronous and provides immediate feedback
- No need for background processing via Celery
- Keep only the async function called directly from API endpoint

* feat: improve webhook test error messages and display

- Show HTTP status code in error messages
- Parse JSON error responses to extract meaningful messages
- Improved UI layout for webhook test results
- Added colored background for success/error states
- Better text wrapping for long error messages

* docs: adjust doc

* fix: review

* fix: update attempts to match close to 24h
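
How many attempts cover a 24-hour window depends on the backoff parameters. The helper below is a sketch for checking that arithmetic; the base delay, growth factor, and cap in the usage note are assumptions, not the values used by the Celery task.

```python
def retries_to_cover(window_s: float, base_s: float, factor: float, cap_s: float) -> int:
    """Count retries needed until the summed delays reach window_s."""
    total = 0.0
    delay = base_s
    retries = 0
    while total < window_s:
        total += min(delay, cap_s)  # each wait is capped
        delay *= factor
        retries += 1
    return retries
```

With a 60 s base delay that doubles up to a one-hour cap, `retries_to_cover(24 * 3600, 60, 2, 3600)` returns 29.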

* fix: add event_id

* fix: changed to uuid, to have a new event_id when reprocessing

* style: linting

* fix: alembic revision
2025-08-29 10:07:49 -06:00
6f0c7c1a5e feat(cleanup): add automatic data retention for public instances (#574)
* feat(cleanup): add automatic data retention for public instances

- Add Celery task to clean up anonymous data after configurable retention period
- Delete transcripts, meetings, and orphaned recordings older than retention days
- Only runs when PUBLIC_MODE is enabled to prevent accidental data loss
- Properly removes all associated files (local and S3 storage)
- Add manual cleanup tool for testing and intervention
- Configure retention via PUBLIC_DATA_RETENTION_DAYS setting (default: 7 days)
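
The retention rule above boils down to a cutoff comparison guarded by the public-mode flag. A minimal sketch, assuming records are plain dicts and that "anonymous" means `user_id` is `None`; both are illustrative assumptions rather than the actual schema.

```python
from datetime import datetime, timedelta, timezone

PUBLIC_DATA_RETENTION_DAYS = 7  # default stated in the commit message


def retention_cutoff(now: datetime, days: int = PUBLIC_DATA_RETENTION_DAYS) -> datetime:
    """Anything created before this instant is eligible for cleanup."""
    return now - timedelta(days=days)


def select_expired(records: list[dict], now: datetime, public_mode: bool) -> list[dict]:
    """Return anonymous records older than the cutoff; nothing unless public_mode."""
    if not public_mode:
        return []
    cutoff = retention_cutoff(now)
    return [r for r in records if r["user_id"] is None and r["created_at"] < cutoff]
```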

Fixes #571

* fix: apply pre-commit formatting fixes

* fix: properly delete recording files from storage during cleanup

- Add storage deletion for orphaned recordings in both cleanup task and manual tool
- Delete from storage before removing database records
- Log warnings if storage deletion fails but continue with database cleanup

* Apply suggestion from @pr-agent-monadical[bot]

Co-authored-by: pr-agent-monadical[bot] <198624643+pr-agent-monadical[bot]@users.noreply.github.com>

* Apply suggestion from @pr-agent-monadical[bot]

Co-authored-by: pr-agent-monadical[bot] <198624643+pr-agent-monadical[bot]@users.noreply.github.com>

* refactor: cleanup_old_data for better logging

* fix: linting

* test: fix meeting cleanup test to not require room controller

- Simplify test by directly inserting meetings into database
- Remove dependency on non-existent rooms_controller.create method
- Tests now pass successfully

* fix: linting

* refactor: simplify cleanup tool to use worker implementation

- Remove duplicate cleanup logic from manual tool
- Use the same _cleanup_old_public_data function from worker
- Remove dry-run feature as requested
- Prevent code duplication and ensure consistency
- Update documentation to reflect changes

* refactor: split cleanup worker into smaller functions

- Move all imports to the top of the file
- Extract cleanup logic into separate functions:
  - cleanup_old_transcripts()
  - cleanup_old_meetings()
  - cleanup_orphaned_recordings()
  - log_cleanup_results()
- Make code more maintainable and testable
- Add days parameter support to Celery task
- Update manual tool to work with refactored code

* feat: add TypedDict typing for cleanup stats

- Add CleanupStats TypedDict for better type safety
- Update all function signatures to use proper typing
- Add return type annotations to _cleanup_old_public_data
- Improves code maintainability and IDE support

* feat: add CASCADE DELETE to meeting_consent foreign key

- Add ondelete="CASCADE" to meeting_consent.meeting_id foreign key
- Generate and apply migration to update existing constraint
- Remove manual consent deletion from cleanup code
- Add unit test to verify CASCADE DELETE behavior
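
The effect of the CASCADE DELETE constraint can be shown in a self-contained way. The sketch below uses an in-memory SQLite database purely for demonstration (the project itself targets PostgreSQL), and the column layout is a simplified assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this per connection
conn.execute("CREATE TABLE meeting (id TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE meeting_consent ("
    " id TEXT PRIMARY KEY,"
    " meeting_id TEXT NOT NULL REFERENCES meeting(id) ON DELETE CASCADE)"
)
conn.execute("INSERT INTO meeting VALUES ('m1')")
conn.execute("INSERT INTO meeting_consent VALUES ('c1', 'm1')")

# Deleting the meeting removes its consent rows without any manual cleanup.
conn.execute("DELETE FROM meeting WHERE id = 'm1'")
remaining = conn.execute("SELECT COUNT(*) FROM meeting_consent").fetchone()[0]
```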

* style: linting

* fix: alembic migration branchpoint

* fix: correct downgrade constraint name in CASCADE DELETE migration

* fix: regenerate CASCADE DELETE migration with proper constraint names

- Delete problematic migration and regenerate with correct names
- Use explicit constraint name in both upgrade and downgrade
- Ensure migration works bidirectionally
- All tests passing including CASCADE DELETE test

* style: linting

* refactor: simplify cleanup to use transcripts as entry point

- Remove orphaned_recordings cleanup (not part of this PR scope)
- Remove separate old_meetings cleanup
- Transcripts are now the main entry point for cleanup
- Associated meetings and recordings are deleted with their transcript
- Use single database connection for all operations
- Update tests to reflect new approach

* refactor: cleanup and rename functions for clarity

- Rename _cleanup_old_public_data to cleanup_old_public_data (make public)
- Rename celery task to cleanup_old_public_data_task for clarity
- Update docstrings and improve code organization
- Remove unnecessary comments and simplify deletion logic
- Update tests to use new function names
- All tests passing

* style: linting

* style: typing and review

* fix: add transaction on cleanup_single_transcript

* fix: naming

---------

Co-authored-by: pr-agent-monadical[bot] <198624643+pr-agent-monadical[bot]@users.noreply.github.com>
2025-08-29 08:47:14 -06:00
9dfd76996f fix: file pipeline status reporting and websocket updates (#589)
* feat: use file pipeline for upload and reprocess action

* fix: make file pipeline correctly report status events

* fix: duplication of transcripts_controller

* fix: tests

* test: fix file upload test

* test: fix reprocess

* fix: also patch from main_file_pipeline

(how the patch is applied unfortunately depends on how the file is imported)
2025-08-29 00:58:14 -06:00
55cc8637c6 ci: restrict workflow execution to main branch and add concurrency (#586)
* ci: try adding concurrency

* ci: restrict push on main branch

* ci: fix concurrency key

* ci: fix build concurrency

* refactor: apply suggestion from @pr-agent-monadical[bot]

Co-authored-by: pr-agent-monadical[bot] <198624643+pr-agent-monadical[bot]@users.noreply.github.com>

---------

Co-authored-by: pr-agent-monadical[bot] <198624643+pr-agent-monadical[bot]@users.noreply.github.com>
2025-08-28 16:43:17 -06:00
f5331a2107 style: more type annotations to parakeet transcriber (#581)
* feat: add comprehensive type annotations to Parakeet transcriber

- Add TypedDict for WordTiming with word, start, end fields
- Add NamedTuple for TimeSegment, AudioSegment, and TranscriptResult
- Add type hints to all generator functions (vad_segment_generator, batch_speech_segments, etc.)
- Add enforce_word_timing_constraints function to prevent word timing overlaps
- Refactor batch_segment_to_audio_segment to reuse pad_audio function
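
One way to read the overlap constraint above: no word may end after the next word starts. The sketch below reuses the `WordTiming` and `enforce_word_timing_constraints` names from this commit, but the body is an assumption about the behavior, not the actual implementation.

```python
from typing import TypedDict


class WordTiming(TypedDict):
    word: str
    start: float
    end: float


def enforce_word_timing_constraints(words: list[WordTiming]) -> list[WordTiming]:
    """Clamp each word's end so it never runs past the next word's start."""
    fixed: list[WordTiming] = []
    for i, w in enumerate(words):
        end = w["end"]
        if i + 1 < len(words):
            end = min(end, words[i + 1]["start"])
        fixed.append({"word": w["word"], "start": w["start"], "end": end})
    return fixed
```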

* doc: add note about space
2025-08-28 12:22:07 -06:00
Igor Loskutov
124ce03bf8 fix: Igor/evaluation (#575)
* fix: impossible import error (#563)

* evaluation cli - database events experiment

* hallucinations

* evaluation - unhallucinate

* evaluation - unhallucinate

* roll back reliability link

* self reviewio

* lint

* self review

* add file pipeline to cli

* add file pipeline to cli + sorting

* remove cli tests

* remove ai comments

* comments
2025-08-28 12:07:34 -04:00
7030e0f236 fix: optimize parakeet transcription batching algorithm (#577)
* refactor: optimize transcription batching to accumulate speech segments

- Changed VAD segment generator to return full audio array instead of segments
- Removed segment filtering step
- Modified batch_segments to accumulate maximum speech including silence
- Transcribe larger continuous chunks instead of individual speech segments
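
The accumulation strategy above can be sketched as a greedy merge: keep extending the current batch across silence until adding the next speech segment would exceed a maximum duration. The `TimeSegment` fields and the `max_duration` parameter are assumptions for illustration.

```python
from typing import NamedTuple


class TimeSegment(NamedTuple):
    start: float
    end: float


def batch_segments(segments: list[TimeSegment], max_duration: float) -> list[TimeSegment]:
    """Merge consecutive speech segments, silence included, up to max_duration."""
    batches: list[TimeSegment] = []
    current = None
    for seg in segments:
        if current is not None and seg.end - current.start <= max_duration:
            current = TimeSegment(current.start, seg.end)  # extend over the gap
        else:
            if current is not None:
                batches.append(current)
            current = seg
    if current is not None:
        batches.append(current)
    return batches
```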

* fix: correct transcribe_batch call to use list and fix batch unpacking

* fix: simplify

* fix: remove unused variables

* fix: add typing
2025-08-27 10:32:04 -06:00
37f0110892 doc: update local model readme 2025-08-22 17:50:24 -06:00
cf2896a7f4 doc: update readme about installation instructions
Add a note about installation instructions being inaccurate.
2025-08-22 17:48:35 -06:00
aabf2c2572 chore(main): release 0.7.3 (#565) 2025-08-22 16:35:52 -06:00
6a7b08f016 doc: change readme intro 2025-08-22 16:26:25 -06:00
e2736563d9 doc: update readme with new images 2025-08-22 16:15:54 -06:00
0f54b7782d chore: ignore www/.env.[development,production] 2025-08-22 14:41:09 -06:00
359280dd34 fix: cleaned repo, and get git-leaks clean 2025-08-22 11:51:34 -06:00
9265d201b5 fix: restore previous behavior on live pipeline + audio downscaler (#561)
This commit restores the original behavior with frame cutting. While
silero is used on our gpu for files, it looks like it's not working well on
the live pipeline. To be investigated, but at the moment, what we keep
is:

- refactored to extract the downscale for further processing in the
pipeline
- remove any downscale implementation from audio_chunker and audio_merge
- removed batching from audio_merge too for now
2025-08-22 10:49:26 -06:00
52f9f533d7 chore(main): release 0.7.2 (#559) 2025-08-21 21:00:05 -06:00
0c3878ac3c fix: docker image not loading libgomp.so.1 for torch (#560)
On ARM64, the docker image crashes because torch cannot load libgomp.so.1
-- it looks like pytorch does not install the same packages depending on the
platform.

AMD64:

/app/.venv/lib/python3.12/site-packages/torch/lib/libgomp.so.1
/app/.venv/lib/python3.12/site-packages/ctranslate2.libs/libgomp-a34b3233.so.1.0.0
/app/.venv/lib/python3.12/site-packages/scikit_learn.libs/libgomp-a34b3233.so.1.0.0

ARM64:

/app/.venv/lib/python3.12/site-packages/ctranslate2.libs/libgomp-d22c30c5.so.1.0.0
/app/.venv/lib/python3.12/site-packages/scikit_learn.libs/libgomp-947d5fa1.so.1.0.0
/app/.venv/lib/python3.12/site-packages/torch.libs/libgomp-947d5fa1.so.1.0.0
2025-08-21 16:41:35 -06:00
Igor Loskutov
d70beee51b fix: include shared rooms to search (#558)
* include shared rooms to search

* tests vibe

* tests vibe

* tests vibe

* tests vibe

* tests vibe

* tests vibe

* tests vibe

* remove tests, that's too much
2025-08-21 14:52:29 -04:00
bc5b351d2b chore(main): release 0.7.1 (#557) 2025-08-20 23:23:27 -06:00
Igor Loskutov
07981e8090 fix: webvtt db null expectation mismatch (#556) 2025-08-20 23:22:41 -06:00
7e366f6338 chore(main): release 0.7.0 (#541) 2025-08-20 22:24:36 -06:00
7592679a35 build: separate silero-vad and force torch to be resolved without nvidia (#555)
* build: separate silero-vad and force torch to be resolved without nvidia

* build: also add torchaudio as cpu version
2025-08-20 22:23:48 -06:00
af16178f86 ci: use github-token to get around potential api throttling + rework dockerfile (#554)
* ci: use github-token to get around potential api throttling

* build: put pyannote-audio separate to the project

* fix: now that we have a readme, use it

* build: add UV_NO_CACHE
2025-08-20 21:59:29 -06:00
3ea7f6b7b6 feat: pipeline improvement with file processing, parakeet, silero-vad (#540)
* feat: improve pipeline threading, and transcriber (parakeet and silero vad)

* refactor: remove whisperx, implement parakeet

* refactor: make audio_chunker smarter and wait for speech, instead of fixed frame

* refactor: make audio merge always downscale the audio to 16k for transcription

* refactor: make the audio transcript modal accept batches

* refactor: improve type safety and remove prometheus metrics

- Add DiarizationSegment TypedDict for proper diarization typing
- Replace List/Optional with modern Python list/| None syntax
- Remove all Prometheus metrics from TranscriptDiarizationAssemblerProcessor
- Add comprehensive file processing pipeline with parallel execution
- Update processor imports and type annotations throughout
- Implement optimized file pipeline as default in process.py tool

* refactor: convert FileDiarizationProcessor I/O types to BaseModel

Update FileDiarizationInput and FileDiarizationOutput to inherit from
BaseModel instead of plain classes, following the standard pattern
used by other processors in the codebase.

* test: add tests for file transcript and diarization with pytest-recording

* build: add pytest-recording

* feat: add local pyannote for testing

* fix: replace PyAV AudioResampler with torchaudio for reliable audio processing

- Replace problematic PyAV AudioResampler that was causing ValueError: [Errno 22] Invalid argument
- Use torchaudio.functional.resample for robust sample rate conversion
- Optimize processing: skip conversion for already 16kHz mono audio
- Add direct WAV writing with Python wave module for better performance
- Consolidate duplicate downsample checks for cleaner code
- Maintain list[av.AudioFrame] input interface
- Required for Silero VAD which needs 16kHz mono audio
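
The "direct WAV writing" step above relies only on the standard library. A minimal sketch, assuming the samples are already 16 kHz mono signed 16-bit integers; the function name is hypothetical and the resampling itself is left out.

```python
import io
import struct
import wave


def write_wav_16k_mono(samples: list[int]) -> bytes:
    """Encode signed 16-bit samples as a 16 kHz mono WAV file in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)      # mono
        wav.setsampwidth(2)      # 16-bit PCM
        wav.setframerate(16000)  # rate Silero VAD needs per the commit
        wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return buf.getvalue()
```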

* fix: replace PyAV AudioResampler with torchaudio solution

- Resolves ValueError: [Errno 22] Invalid argument in AudioMergeProcessor
- Replaces problematic PyAV AudioResampler with torchaudio.functional.resample
- Optimizes processing to skip unnecessary conversions when audio is already 16kHz mono
- Uses direct WAV writing with Python's wave module for better performance
- Fixes test_basic_process to disable diarization (pyannote dependency not installed)
- Updates test expectations to match actual processor behavior
- Removes unused pydub dependency from pyproject.toml
- Adds comprehensive TEST_ANALYSIS.md documenting test suite status

* feat: add parameterized test for both diarization modes

- Adds @pytest.mark.parametrize to test_basic_process with enable_diarization=[False, True]
- Test with diarization=False always passes (tests core AudioMergeProcessor functionality)
- Test with diarization=True gracefully skips when pyannote.audio is not installed
- Provides comprehensive test coverage for both pipeline configurations

* fix: resolve pipeline property naming conflict in AudioDiarizationPyannoteProcessor

- Renames 'pipeline' property to 'diarization_pipeline' to avoid conflict with base Processor.pipeline attribute
- Fixes AttributeError: 'property 'pipeline' object has no setter' when set_pipeline() is called
- Updates property usage in _diarize method to use new name
- Now correctly supports pipeline initialization for diarization processing
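
The naming conflict above is easy to reproduce: a read-only property in a subclass blocks the plain attribute assignment done by the base class. The classes below are simplified stand-ins, not the real processors.

```python
class Processor:
    """Simplified stand-in for the base class; not the real implementation."""

    def set_pipeline(self, pipeline: object) -> None:
        self.pipeline = pipeline  # plain attribute assignment


class BrokenDiarizer(Processor):
    @property
    def pipeline(self) -> str:  # read-only property shadows the attribute
        return "pyannote"


class FixedDiarizer(Processor):
    @property
    def diarization_pipeline(self) -> str:  # renamed, no longer collides
        return "pyannote"
```

Calling `set_pipeline()` on `BrokenDiarizer` raises `AttributeError`, while `FixedDiarizer` accepts it.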

* fix: add local for pyannote

* test: add diarization test

* fix: resample on audio merge now working

* fix: correctly restore timestamp

* fix: display exception in a threaded processor if that happens

* Update pyproject.toml

* ci: remove option

* ci: update astral-sh/setup-uv

* test: add monadical url for pytest-recording

* refactor: remove previous version

* build: move faster whisper to local dep

* test: fix missing import

* refactor: improve main_file_pipeline organization and error handling

- Move all imports to the top of the file
- Create unified EmptyPipeline class to replace duplicate mock pipeline code
- Remove timeout and fallback logic - let processors handle their own retries
- Fix error handling to raise any exception from parallel tasks
- Add proper type hints and validation for captured results

* fix: wrong function

* fix: remove task_done

* feat: add configurable file processing timeouts for modal processors

- Add TRANSCRIPT_FILE_TIMEOUT setting (default: 600s) for file transcription
- Add DIARIZATION_FILE_TIMEOUT setting (default: 600s) for file diarization
- Replace hardcoded timeout=600 with configurable settings in modal processors
- Allows customization of timeout values via environment variables
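
Reading such a timeout from the environment with a 600-second fallback could be sketched as follows. The project likely routes this through its settings object; the plain `os.environ` helper here is a hypothetical simplification.

```python
import os


def read_timeout(name: str, default: float = 600.0) -> float:
    """Read a timeout in seconds from the environment, falling back to default."""
    raw = os.environ.get(name)
    return float(raw) if raw else default
```

For example, `read_timeout("TRANSCRIPT_FILE_TIMEOUT")` yields 600.0 when the variable is unset.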

* fix: use logger

* fix: worker process meetings now use file pipeline

* fix: topic not gathered

* refactor: remove prepare(), pipeline now works

* refactor: implement many review comments from Igor

* test: add test for test_pipeline_main_file

* refactor: remove doc

* doc: add doc

* ci: update build to use native arm64 builder

* fix: merge fixes

* refactor: changes from Igor review + add test (not by default) to test gpu modal part

* ci: update to our own runner linux-amd64

* ci: try using suggested mode=min

* fix: update diarizer for latest modal, and use volume

* fix: modal file extension detection

* fix: put the diarizer as A100
2025-08-20 20:07:19 -06:00
Igor Loskutov
009590c080 feat: search frontend (#551)
* feat: better highlight

* feat(search): add long_summary to search vector for improved search results

- Update search vector to include long_summary with weight B (between title A and webvtt C)
- Modify SearchController to fetch long_summary and prioritize its snippets
- Generate snippets from long_summary first (max 2), then from webvtt for remaining slots
- Add comprehensive tests for long_summary search functionality
- Create migration to update search_vector_en column in PostgreSQL

This improves search quality by including summarized content which often contains
key topics and themes that may not be explicitly mentioned in the transcript.
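
The snippet ordering described above (summary first, capped at two, then transcript snippets for the remaining slots) can be sketched like this. The function name and the flat lists of strings are assumptions; the constant name matches the one mentioned later in this commit.

```python
LONG_SUMMARY_MAX_SNIPPETS = 2


def pick_snippets(summary_hits: list[str], webvtt_hits: list[str], limit: int) -> list[str]:
    """Take summary snippets first (capped), then fill the rest from the transcript."""
    chosen = list(summary_hits[: min(LONG_SUMMARY_MAX_SNIPPETS, limit)])
    chosen.extend(webvtt_hits[: limit - len(chosen)])
    return chosen
```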

* fix: address code review feedback for search enhancements

- Fix test file inconsistencies by removing references to non-existent model fields
  - Comment out tests for unimplemented features (room_ids, status filters, date ranges)
  - Update tests to only use currently available fields (room_id singular, no room_name/processing_status)
  - Mark future functionality tests with @pytest.mark.skip

- Make snippet counts configurable
  - Add LONG_SUMMARY_MAX_SNIPPETS constant (default: 2)
  - Replace hardcoded value with configurable constant

- Improve error handling consistency in WebVTT parsing
  - Use different log levels for different error types (debug for malformed, warning for decode, error for unexpected)
  - Add catch-all exception handler for unexpected errors
  - Include stack trace for critical errors

All existing tests pass with these changes.

* fix: correct datetime test to include required duration field

* feat: better highlight

* feat: search room names

* feat: acknowledge deleted room

* feat: search filters fix and rank removal

* chore: minor refactoring

* feat: better matches frontend

* chore: self-review (vibe)

* chore: self-review WIP

* chore: self-review WIP

* chore: self-review WIP

* chore: self-review WIP

* chore: self-review WIP

* chore: self-review WIP

* chore: self-review WIP

* remove swc (vibe)

* search url query sync (vibe)

* search url query sync (vibe)

* better casts and cap while

* PR review + simplify frontend hook

* pr: remove search db timeouts

* cleanup tests

* tests cleanup

* frontend cleanup

* index declarations

* refactor frontend (self-review)

* fix search pagination

* clear "x" for search input

* pagination max pages fix

* chore: cleanup

* cleanup

* cleanup

* cleanup

* cleanup

* cleanup

* cleanup

* cleanup

* lockfile

* pr review
2025-08-20 20:56:45 -04:00
Igor Loskutov
fe5d344cff diarization cli: throw on modal errors (#553) 2025-08-20 10:21:52 -04:00
Igor Loskutov
86455ce573 chore: type fixes (#544)
* chore: type fixes

* chore: type fixes
2025-08-18 16:31:23 -04:00
2fccd81bcd fix: use structlog not logging (#550) 2025-08-15 15:41:23 -06:00
1311714451 ci: add pre-commit hook and fix linting issues (#545)
* style: deactivate PLC0415 only on part that it's ok

+ re-run pre-commit run --all

* ci: add pre-commit hook

* build: move from yarn to pnpm

* build: move from yarn to pnpm

* build: fix node-version

* ci: install pnpm prior to node (?)

* build: update deps and pnpm trying to fix vercel build

* feat: docker www corepack

* style: pre-commit

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-08-14 20:59:54 -06:00
b9d891d342 feat: delete recording with transcript (#547)
* Delete recording with transcript

* Delete confirmation dialog

* Use aws storage abstraction for recording deletion

* Test recording deleted with transcript

* Use get transcript storage

* Fix the test

* Add env vars for recording storage
2025-08-14 20:45:30 +02:00
9eab952c63 feat: postgresql migration and removal of sqlite in pytest (#546)
* feat: remove support of sqlite, 100% postgres

* fix: more migration and make datetime timezone aware in postgres

* fix: change how the database is obtained, and use contextvar to have a different instance per event loop

* test: properly use client fixture that handles lifetime/database connection

* fix: add missing client fixture parameters to test functions

This commit fixes NameError issues where test functions were trying to use
the 'client' fixture but didn't have it as a parameter. The changes include:

1. Added 'client' parameter to test functions in:
   - test_transcripts_audio_download.py (6 functions including fixture)
   - test_transcripts_speaker.py (3 functions)
   - test_transcripts_upload.py (1 function)
   - test_transcripts_rtc_ws.py (2 functions + appserver fixture)

2. Resolved naming conflicts in test_transcripts_rtc_ws.py where both HTTP
   client and StreamClient were using variable name 'client'. StreamClient
   instances are now named 'stream_client' to avoid conflicts.

3. Added missing 'from reflector.app import app' import in rtc_ws tests.

Background: Previously implemented contextvars solution with get_database()
function resolves asyncio event loop conflicts in Celery tasks. The global
client fixture was also created to replace manual AsyncClient instances,
ensuring proper FastAPI application lifecycle management and database
connections during tests.
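
The contextvars approach mentioned above can be illustrated in isolation. This is a sketch of the general technique only: `Database` is a hypothetical placeholder, and the real `get_database()` may differ in how it creates and connects the instance.

```python
import asyncio
import contextvars


class Database:
    """Hypothetical placeholder for a real connection object."""


_database = contextvars.ContextVar("database", default=None)


def get_database() -> Database:
    """Return the Database bound to the current context, creating it on first use."""
    db = _database.get()
    if db is None:
        db = Database()
        _database.set(db)
    return db


async def task_body() -> Database:
    first = get_database()
    assert first is get_database()  # stable within one context
    return first
```

Each `asyncio.run(task_body())` call runs in its own copied context, so separate event loops end up with separate `Database` instances.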

All tests now pass except for 2 pre-existing RTC WebSocket test failures
related to asyncpg connection issues unrelated to these fixes.

* fix: ensure task are correctly closed

* fix: make separate event loop for the live server

* fix: make default settings point at postgres

* build: remove pytest-docker deps out of dev, just tests group
2025-08-14 11:40:52 -06:00
Igor Loskutov
6fb5cb21c2 feat: search backend (#537)
* docs: transient docs

* chore: cleanup

* webvtt WIP

* webvtt field

* chore: webvtt tests comments

* chore: remove useless tests

* feat: search TASK.md

* feat: full text search by title/webvtt

* chore: search api task

* feat: search api

* feat: search API

* chore: rm task md

* chore: roll back unnecessary validators

* chore: pr review WIP

* chore: pr review WIP

* chore: pr review

* chore: top imports

* feat: better lint + ci

* feat: better lint + ci

* feat: better lint + ci

* feat: better lint + ci

* chore: lint

* chore: lint

* fix: db datetime definitions

* fix: flush() params

* fix: update transcript mutability expectation / test

* fix: update transcript mutability expectation / test

* chore: auto review

* chore: new controller extraction

* chore: new controller extraction

* chore: cleanup

* chore: review WIP

* chore: pr WIP

* chore: remove ci lint

* chore: openapi regeneration

* chore: openapi regeneration

* chore: postgres test doc

* fix: .dockerignore for arm binaries

* fix: .dockerignore for arm binaries

* fix: cap test loops

* fix: cap test loops

* fix: cap test loops

* fix: get_transcript_topics

* chore: remove flow.md docs and claude guidance

* chore: remove claude.md db doc

* chore: remove claude.md db doc

* chore: remove claude.md db doc

* chore: remove claude.md db doc
2025-08-13 10:03:38 -04:00
Igor Loskutov
a42ed12982 fix: evaluation cli event wrap (#536)
* fix: evaluation cli event wrap

* fix: evaluation cli event wrap

* chore: remove unrelated change

* chore: rollback claude.md changes
2025-08-11 19:28:52 -04:00
1aa52a99b6 chore(main): release 0.6.1 (#539) 2025-08-06 19:38:43 -06:00
dependabot[bot]
2a97290f2e build(deps): bump the npm_and_yarn group across 1 directory with 7 updates (#535)
Bumps the npm_and_yarn group with 6 updates in the /www directory:

| Package | From | To |
| --- | --- | --- |
| [axios](https://github.com/axios/axios) | `1.6.2` | `1.8.2` |
| [postcss](https://github.com/postcss/postcss) | `8.4.25` | `8.4.31` |
| [braces](https://github.com/micromatch/braces) | `3.0.2` | `3.0.3` |
| [cross-spawn](https://github.com/moxystudio/node-cross-spawn) | `7.0.3` | `7.0.6` |
| [micromatch](https://github.com/micromatch/micromatch) | `4.0.5` | `4.0.8` |
| [nanoid](https://github.com/ai/nanoid) | `3.3.6` | `3.3.11` |



Updates `axios` from 1.6.2 to 1.8.2
- [Release notes](https://github.com/axios/axios/releases)
- [Changelog](https://github.com/axios/axios/blob/v1.x/CHANGELOG.md)
- [Commits](https://github.com/axios/axios/compare/v1.6.2...v1.8.2)

Updates `postcss` from 8.4.25 to 8.4.31
- [Release notes](https://github.com/postcss/postcss/releases)
- [Changelog](https://github.com/postcss/postcss/blob/main/CHANGELOG.md)
- [Commits](https://github.com/postcss/postcss/compare/8.4.25...8.4.31)

Updates `braces` from 3.0.2 to 3.0.3
- [Changelog](https://github.com/micromatch/braces/blob/master/CHANGELOG.md)
- [Commits](https://github.com/micromatch/braces/compare/3.0.2...3.0.3)

Updates `cross-spawn` from 7.0.3 to 7.0.6
- [Changelog](https://github.com/moxystudio/node-cross-spawn/blob/master/CHANGELOG.md)
- [Commits](https://github.com/moxystudio/node-cross-spawn/compare/v7.0.3...v7.0.6)

Updates `follow-redirects` from 1.15.2 to 1.15.6
- [Release notes](https://github.com/follow-redirects/follow-redirects/releases)
- [Commits](https://github.com/follow-redirects/follow-redirects/compare/v1.15.2...v1.15.6)

Updates `micromatch` from 4.0.5 to 4.0.8
- [Release notes](https://github.com/micromatch/micromatch/releases)
- [Changelog](https://github.com/micromatch/micromatch/blob/master/CHANGELOG.md)
- [Commits](https://github.com/micromatch/micromatch/compare/4.0.5...4.0.8)

Updates `nanoid` from 3.3.6 to 3.3.11
- [Release notes](https://github.com/ai/nanoid/releases)
- [Changelog](https://github.com/ai/nanoid/blob/main/CHANGELOG.md)
- [Commits](https://github.com/ai/nanoid/compare/3.3.6...3.3.11)

---
updated-dependencies:
- dependency-name: axios
  dependency-version: 1.8.2
  dependency-type: direct:production
  dependency-group: npm_and_yarn
- dependency-name: postcss
  dependency-version: 8.4.31
  dependency-type: direct:production
  dependency-group: npm_and_yarn
- dependency-name: braces
  dependency-version: 3.0.3
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: cross-spawn
  dependency-version: 7.0.6
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: follow-redirects
  dependency-version: 1.15.6
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: micromatch
  dependency-version: 4.0.8
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: nanoid
  dependency-version: 3.3.11
  dependency-type: indirect
  dependency-group: npm_and_yarn
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-06 10:23:48 -06:00
7963cc8a52 fix: delayed waveform loading (#538) 2025-08-06 10:22:51 -06:00
d12424848d chore: remove black (#534) 2025-08-05 12:07:53 -06:00
dependabot[bot]
6e765875d5 build(deps): bump @babel/runtime (#530)
Bumps the npm_and_yarn group with 1 update in the /www directory: [@babel/runtime](https://github.com/babel/babel/tree/HEAD/packages/babel-runtime).


Updates `@babel/runtime` from 7.23.6 to 7.28.2
- [Release notes](https://github.com/babel/babel/releases)
- [Changelog](https://github.com/babel/babel/blob/main/CHANGELOG.md)
- [Commits](https://github.com/babel/babel/commits/v7.28.2/packages/babel-runtime)

---
updated-dependencies:
- dependency-name: "@babel/runtime"
  dependency-version: 7.28.2
  dependency-type: indirect
  dependency-group: npm_and_yarn
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-05 11:41:34 -06:00
dependabot[bot]
e0f4acf28b build(deps): bump form-data (#531)
Bumps the npm_and_yarn group with 1 update in the /www directory: [form-data](https://github.com/form-data/form-data).


Updates `form-data` from 4.0.0 to 4.0.4
- [Release notes](https://github.com/form-data/form-data/releases)
- [Changelog](https://github.com/form-data/form-data/blob/master/CHANGELOG.md)
- [Commits](https://github.com/form-data/form-data/compare/v4.0.0...v4.0.4)

---
updated-dependencies:
- dependency-name: form-data
  dependency-version: 4.0.4
  dependency-type: indirect
  dependency-group: npm_and_yarn
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-05 11:41:25 -06:00
dependabot[bot]
12359ea4eb build(deps): bump next (#533)
Bumps the npm_and_yarn group with 1 update in the /www directory: [next](https://github.com/vercel/next.js).


Updates `next` from 14.2.7 to 14.2.30
- [Release notes](https://github.com/vercel/next.js/releases)
- [Changelog](https://github.com/vercel/next.js/blob/canary/release.js)
- [Commits](https://github.com/vercel/next.js/compare/v14.2.7...v14.2.30)

---
updated-dependencies:
- dependency-name: next
  dependency-version: 14.2.30
  dependency-type: direct:production
  dependency-group: npm_and_yarn
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-05 11:41:10 -06:00
267b7401ea chore(main): release 0.6.0 (#526) 2025-08-04 18:04:10 -06:00
aea9de393c chore(main): release 0.6.0
Release-As: 0.6.0
2025-08-04 18:02:19 -06:00
dc177af3ff feat: implement service-specific Modal API keys with auto processor pattern (#528)
* fix: refactor modal API key configuration for better separation of concerns

- Split generic MODAL_API_KEY into service-specific keys:
  - TRANSCRIPT_API_KEY for transcription service
  - DIARIZATION_API_KEY for diarization service
  - TRANSLATE_API_KEY for translation service
- Remove deprecated *_MODAL_API_KEY settings
- Add proper validation to ensure URLs are set when using modal processors
- Update README with new configuration format

BREAKING CHANGE: Configuration keys have changed. Update your .env file:
- TRANSCRIPT_MODAL_API_KEY → TRANSCRIPT_API_KEY
- LLM_MODAL_API_KEY → (removed, use TRANSCRIPT_API_KEY)
- Add DIARIZATION_API_KEY and TRANSLATE_API_KEY if using those services

* fix: update Modal backend configuration to use service-specific API keys

- Changed from generic MODAL_API_KEY to service-specific keys:
  - TRANSCRIPT_MODAL_API_KEY for transcription
  - DIARIZATION_MODAL_API_KEY for diarization
  - TRANSLATION_MODAL_API_KEY for translation
- Updated audio_transcript_modal.py and audio_diarization_modal.py to use modal_api_key parameter
- Updated documentation in README.md, CLAUDE.md, and env.example

* feat: implement auto/modal pattern for translation processor

- Created TranscriptTranslatorAutoProcessor following the same pattern as transcript/diarization
- Created TranscriptTranslatorModalProcessor with TRANSLATION_MODAL_API_KEY support
- Added TRANSLATION_BACKEND setting (defaults to "modal")
- Updated all imports to use TranscriptTranslatorAutoProcessor instead of TranscriptTranslatorProcessor
- Updated env.example with TRANSLATION_BACKEND and TRANSLATION_MODAL_API_KEY
- Updated test to expect TranscriptTranslatorModalProcessor name
- All tests passing

* refactor: simplify transcript_translator base class to match other processors

- Moved all implementation from base class to modal processor
- Base class now only defines abstract _translate method
- Follows the same minimal pattern as audio_diarization and audio_transcript base classes
- Updated test mock to use _translate instead of get_translation
- All tests passing

* chore: clean up settings and improve type annotations

- Remove deprecated generic API key variables from settings
- Add comments to group Modal-specific settings
- Improve type annotations for modal_api_key parameters

* fix: typing

* fix: passing key to openai

* test: fix rtc test failing due to change on transcript

It also correctly sets up the database from sqlite, in case our configuration
is set up for postgres.

* ci: deactivate translation backend by default

* test: fix modal->mock

* refactor: implementing igor review, mock to passthrough
2025-08-04 12:07:30 -06:00
5bd8233657 chore: remove refactor md (#527) 2025-08-01 16:33:40 -06:00
28ac031ff6 feat: use llamaindex everywhere (#525)
* feat: use llamaindex for transcript final title too

* refactor: removed llm backend, replaced with one single class+llamaindex

* refactor: self-review

* fix: typing

* fix: tests

* refactor: extract clean_title and add tests

* test: fix

* test: remove ensure_casing/nltk

* fix: tiny mistake
2025-08-01 12:13:00 -06:00
1878834ce6 chore(main): release 0.5.0 (#521) 2025-07-31 20:11:41 -06:00
f5b82d44e3 style: use ruff for linting and formatting (#524) 2025-07-31 17:57:43 -06:00
ad56165b54 fix: remove unused settings and utils files (#522)
* fix: remove unused settings and utils files

* fix: remove migration done

* fix: remove outdated scripts

* fix: removing deployment of hermes, not used anymore

* fix: partially remove secret, still have to understand frontend.
2025-07-31 17:45:48 -06:00
4ee19ed015 ci: update pull request template (#523) 2025-07-31 17:45:19 -06:00
406164033d feat: new summary using phi-4 and llama-index (#519)
* feat: add litellm backend implementation

* refactor: improve generate/completion methods for base LLM

* refactor: remove tokenizer logic

* style: apply code formatting

* fix: remove hallucinations from LLM responses

* refactor: comprehensive LLM and summarization rework

* chore: remove debug code

* feat: add structured output support to LiteLLM

* refactor: apply self-review improvements

* docs: add model structured output comments

* docs: update model structured output comments

* style: apply linting and formatting fixes

* fix: resolve type logic bug

* refactor: apply PR review feedback

* refactor: apply additional PR review feedback

* refactor: apply final PR review feedback

* fix: improve schema passing for LLMs without structured output

* feat: add PR comments and logger improvements

* docs: update README and add HTTP logging

* feat: improve HTTP logging

* feat: add summary chunking functionality

* fix: resolve title generation runtime issues

* refactor: apply self-review improvements

* style: apply linting and formatting

* feat: implement LiteLLM class structure

* style: apply linting and formatting fixes

* docs: env template model name fix

* chore: remove older litellm class

* chore: format

* refactor: simplify OpenAILLM

* refactor: OpenAILLM tokenizer

* refactor: self-review

* refactor: self-review

* refactor: self-review

* chore: format

* chore: remove LLM_USE_STRUCTURED_OUTPUT from envs

* chore: roll back migration lint changes

* chore: roll back migration lint changes

* fix: make summary llm configuration optional for the tests

* fix: missing f-string

* fix: tweak the prompt for summary title

* feat: try llamaindex for summarization

* fix: complete refactor of summary builder using llamaindex and structured output when possible

* fix: separate prompt as constant

* fix: typings

* fix: enhance prompt to prevent mentioning other subjects while summarizing one

* fix: various changes after self-review

* fix: from igor review

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-07-31 15:29:29 -06:00
81d316cb56 ci: remove conventional commit for ci (#520)
Since we now squash-merge, the conventional commit format is only required for
the title of the PR
2025-07-31 15:19:16 -06:00
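As a rough illustration of what a conventional-commit check on a PR title looks like, here is a small sketch. It only approximates the Conventional Commits header format; the actual CI check is delegated to a GitHub Action (see the "Lint PR" workflow in this diff), whose exact rules may differ. The type list and the `is_conventional_title` helper are assumptions made for this example.

```python
import re

# Commonly used Conventional Commits types (an assumed list, not the CI config).
CONVENTIONAL_TYPES = (
    "build", "chore", "ci", "docs", "feat", "fix",
    "perf", "refactor", "revert", "style", "test",
)

# type, optional (scope), optional "!" for breaking changes, then ": subject"
TITLE_RE = re.compile(
    r"^(?P<type>" + "|".join(CONVENTIONAL_TYPES) + r")"
    r"(?:\((?P<scope>[^()\s]+)\))?"
    r"(?P<breaking>!)?"
    r": (?P<subject>\S.*)$"
)


def is_conventional_title(title: str) -> bool:
    """Return True when the title looks like a conventional commit header."""
    return TITLE_RE.match(title) is not None


for title in (
    "feat: use llamaindex everywhere (#525)",
    "chore(main): release 0.5.0 (#521)",
    "Merge main into jisti-integration branch",
):
    print(title, "->", is_conventional_title(title))
```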
db3beae5cd chore(main): release 0.4.0 (#510) 2025-07-25 19:09:57 -06:00
Igor Loskutov
03b9a18c1b fix: remove faulty import Meeting (#512)
* fix: remove faulty import Meeting

* fix: remove faulty import Meeting
2025-07-25 17:48:10 -04:00
Igor Loskutov
7e3027adb6 fix: room concurrency (theoretically) (#511)
* fix: room concurrency (theoretically)

* cleanup

* cleanup
2025-07-25 17:37:51 -04:00
Igor Loskutov
27b43d85ab feat: Diarization cli (#509)
* diarisation cli

* feat: s3 upload for modal diarisation cli call

* chore: cleanup

* chore: s3 cleanup improvement

* chore: lint

* chore: cleanup

* chore: cleanup

* chore: cleanup

* chore: cleanup
2025-07-25 16:24:06 -04:00
2289a1a231 chore(main): release 0.3.2 (#506) 2025-07-22 19:15:47 -06:00
d0e130eb13 fix: match font size for the filter sidebar (#507) 2025-07-22 14:59:23 -06:00
24fabe3e86 fix: whereby consent not displaying (#505) 2025-07-22 12:20:26 -06:00
6fedbbe63f chore(main): release 0.3.1 (#503) 2025-07-21 22:52:21 -06:00
b39175cdc9 fix: remove primary color for room action menu (#504) 2025-07-21 22:45:26 -06:00
2a2af5fff2 fix: remove fief out of the source code (#502)
* fix: remove fief out of the source code

* fix: remove corresponding test about migration
2025-07-21 21:09:05 -06:00
ad44492cae chore(main): release 0.3.0 (#501) 2025-07-21 19:14:15 -06:00
901a239952 feat: migrate from chakra 2 to chakra 3 (#500)
* feat: separate page into different components, greatly improving the loading and reactivity

* fix: various fixes

* feat: migrate to Chakra UI v3 - update theme, fix deprecated props

- Add whiteAlpha color palette with semantic tokens
- Update button recipe with fontWeight 600 and hover states
- Move Poppins font from theme to HTML tag className
- Fix deprecated props: isDisabled→disabled, align→alignItems/textAlign
- Remove button.css as styles are now handled by Chakra v3

* fix: complete Chakra UI v3 deprecated prop migrations

- Replace all isDisabled with disabled
- Replace all isChecked with checked
- Replace all isLoading with loading
- Replace all isOpen with open
- Replace all noOfLines with lineClamp
- Replace all align with alignItems on Flex/Stack components
- Replace all justify with justifyContent on Flex/Stack components
- Update temporary Select components to use new prop names
- Update REFACTOR2.md with completion status

* fix: add value prop to Menu.Item for proper hover states in Chakra v3

* fix: update browse page components for Chakra UI v3 compatibility

- Fix FilterSidebar status filter styling and prop usage
- Update Pagination component to use new Chakra v3 props and structure
- Refactor TranscriptTable to use modern Chakra patterns
- Clean up browse page layout and props
- Remove unused import from transcripts API view
- Enhance theme with additional semantic color tokens

* fix: polish browse page UI for Chakra v3

- Add rounded corners to FilterSidebar
- Adjust responsive breakpoints from md to lg for table/card view
- Add consistent font weights to table headers
- Improve card view typography and spacing
- Fix padding and margins for better mobile experience
- Remove unused table recipe from theme

* fix: padding

* fix: rework transcript page

* fix: more tidy layout for topic

* fix: share and privacy using chakra3 select

* fix: fix share and privacy select, now working, with closing dialog

* fix: complete Chakra UI v3 migration for share components and fix all TypeScript errors

- Refactor shareZulip.tsx to integrate modal content directly
- Replace react-select-search with Chakra UI v3 Select components using collection pattern
- Convert all Checkbox components to use v3 composable structure (Checkbox.Root, etc.)
- Fix Card components to use Card.Root and Card.Body
- Replace deprecated textColor prop with color prop
- Update Menu components to use v3 namespace pattern (Menu.Root, Menu.Trigger, etc.)
- Remove unused AlertDialog imports
- Fix useDisclosure hook changes (isOpen -> open)
- Replace UnorderedList with List.Root and ListItem with List.Item
- Fix Skeleton components by removing isLoaded prop and using conditional rendering
- Update Button variants to valid v3 options
- Fix Spinner props (remove thickness, speed, emptyColor)
- Update toast API to use custom toaster component
- Fix Progress components and FormControl to Field.Root
- Update Alert to use compound component pattern
- Remove shareModal.tsx file after integration

* fix: bring back topic list

* fix: normalize menu item

* fix: migrate rooms page to Chakra UI v3 pattern

- Updated layout to match browse page with Flex container and proper spacing
- Migrated add/edit room modal from custom HTML to Chakra UI v3 Dialog component
- Replaced all Select components with Chakra UI v3 Select using createListCollection
- Replaced FormControl/FormLabel/FormHelperText with Field.Root/Field.Label/Field.HelperText
- Removed inline styles and used Chakra props (mr={2} instead of style={{ marginRight: "8px" }})
- Fixed TypeScript interfaces removing OptionBase extension
- Fixed theme.ts accordion anatomy import issue

* refactor: convert rooms list to table view with responsive design

- Create RoomTable component for desktop view showing room details in columns
- Create RoomCards component for mobile/tablet responsive view
- Refactor RoomList to use table/card components based on screen size
- Display Zulip configuration, room size, and recording settings in table
- Remove unused RoomItem component
- Import Room type from API for proper typing

* refactor: extract RoomActionsMenu component to eliminate duplication

- Create RoomActionsMenu component for consistent room action menus
- Update RoomCards and RoomTable to use the new shared component
- Remove duplicated menu code from both components

* feat: add icons to TranscriptActionsMenu for consistency

- Add FaTrash icon for Delete action with red color
- Add FaArrowsRotate icon for Reprocess action
- Matches the pattern established in RoomActionsMenu

* refactor: update icons from Font Awesome to Lucide React

- Replace FaEllipsisVertical with LuMenu in menu triggers
- Replace FaLink with LuLink for copy URL buttons
- Replace FaPencil with LuPen for edit actions
- Replace FaTrash with LuTrash for delete actions
- Replace FaArrowsRotate with LuRotateCw for reprocess action
- Consistent icon library usage across all components

* refactor: little pass on the icons

* fix: lu icon

* fix: primary for button

* fix: recording page with mic selection

* fix: also fix duration

* fix: use combobox for share zulip

* fix: use proper theming for button, variant was not recognized

* fix: room actions menu

* fix: remove other variant primary left.
2025-07-21 16:16:12 -06:00
d77b5611f8 chore(main): release 0.2.1 (#499) 2025-07-17 20:19:56 -06:00
fc38345d65 fix: separate browsing page into different components, limit to 10 by default (#498)
* feat: limit the amount of transcripts to 10 by default

* feat: separate page into different components, greatly improving the
loading and reactivity

* fix: current implementation immediately invokes the onDelete and
onReprocess

From pr-agent-monadical: Suggestion: The current implementation
immediately invokes the onDelete and onReprocess functions when the
component renders, rather than when the menu items are clicked. This can
cause unexpected behavior and potential memory leaks. Use callback
functions that only execute when the menu items are actually clicked.
[possible issue, importance: 9]
2025-07-17 20:18:00 -06:00
5a1d662dc4 chore(main): release 0.2.0 (#497) 2025-07-17 15:55:19 -06:00
033bd4bc48 feat: improve transcript listing with room_id (#496)
Added a new room_id field on transcripts; room_id and meeting_id are now set
on each transcript. The listing uses this field, so the URL is now
very fast.
2025-07-17 15:43:36 -06:00
0eb670ca19 fix: don't attempt to load waveform/mp3 if audio was deleted (#495) 2025-07-17 10:04:59 -06:00
4a340c797b chore(main): release 0.1.1 (#494) 2025-07-16 21:43:53 -06:00
c1e10f4dab fix: process meetings with utc (#493) 2025-07-16 21:39:16 -06:00
2516d4085f fix: postgres database not connecting in worker (#492)
stacks-reflector-worker-1  | [2025-07-17 02:18:21,234: ERROR/ForkPoolWorker-2] Task reflector.worker.process.process_meetings[8e763caf-be8a-4272-8793-7b918e4e3922] raised unexpected: AssertionError('DatabaseBackend is not running')
stacks-reflector-worker-1  | Traceback (most recent call last):
stacks-reflector-worker-1  |   File "/app/.venv/lib/python3.12/site-packages/celery/app/trace.py", line 453, in trace_task
stacks-reflector-worker-1  |     R = retval = fun(*args, **kwargs)
stacks-reflector-worker-1  |                  ^^^^^^^^^^^^^^^^^^^^
stacks-reflector-worker-1  |   File "/app/.venv/lib/python3.12/site-packages/celery/app/trace.py", line 736, in __protected_call__
stacks-reflector-worker-1  |     return self.run(*args, **kwargs)
stacks-reflector-worker-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^
stacks-reflector-worker-1  |   File "/app/reflector/pipelines/main_live_pipeline.py", line 81, in wrapper
stacks-reflector-worker-1  |     return asyncio.run(coro)
stacks-reflector-worker-1  |            ^^^^^^^^^^^^^^^^^
stacks-reflector-worker-1  |   File "/usr/local/lib/python3.12/asyncio/runners.py", line 195, in run
stacks-reflector-worker-1  |     return runner.run(main)
stacks-reflector-worker-1  |            ^^^^^^^^^^^^^^^^
stacks-reflector-worker-1  |   File "/usr/local/lib/python3.12/asyncio/runners.py", line 118, in run
stacks-reflector-worker-1  |     return self._loop.run_until_complete(task)
stacks-reflector-worker-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
stacks-reflector-worker-1  |   File "/usr/local/lib/python3.12/asyncio/base_events.py", line 691, in run_until_complete
stacks-reflector-worker-1  |     return future.result()
stacks-reflector-worker-1  |            ^^^^^^^^^^^^^^^
stacks-reflector-worker-1  |   File "/app/reflector/worker/process.py", line 139, in process_meetings
stacks-reflector-worker-1  |     meetings = await meetings_controller.get_all_active()
stacks-reflector-worker-1  |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
stacks-reflector-worker-1  |   File "/app/reflector/db/meetings.py", line 121, in get_all_active
stacks-reflector-worker-1  |     return await database.fetch_all(query)
stacks-reflector-worker-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
stacks-reflector-worker-1  |   File "/app/.venv/lib/python3.12/site-packages/databases/core.py", line 173, in fetch_all
stacks-reflector-worker-1  |     async with self.connection() as connection:
stacks-reflector-worker-1  |                ^^^^^^^^^^^^^^^^^
stacks-reflector-worker-1  |   File "/app/.venv/lib/python3.12/site-packages/databases/core.py", line 267, in __aenter__
stacks-reflector-worker-1  |     raise e
stacks-reflector-worker-1  |   File "/app/.venv/lib/python3.12/site-packages/databases/core.py", line 264, in __aenter__
stacks-reflector-worker-1  |     await self._connection.acquire()
stacks-reflector-worker-1  |   File "/app/.venv/lib/python3.12/site-packages/databases/backends/postgres.py", line 169, in acquire
stacks-reflector-worker-1  |     assert self._database._pool is not None, "DatabaseBackend is not running"
stacks-reflector-worker-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
stacks-reflector-worker-1  | AssertionError: DatabaseBackend is not running
2025-07-16 21:09:51 -06:00
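The traceback above shows a query being issued from a task before the database connection pool exists. One common way to avoid this class of error is to connect inside the same event loop that runs the task body. The sketch below illustrates that idea only; it is not the fix applied in this commit. `FakeDatabase` is a stand-in that imitates the `connect`/`disconnect`/`fetch_all` shape of `databases.Database`, and the `asynctask` decorator name and the query string are hypothetical.

```python
import asyncio
import functools


class FakeDatabase:
    """Stand-in for a databases.Database-like object (for this sketch only)."""

    def __init__(self):
        self.is_connected = False

    async def connect(self):
        self.is_connected = True

    async def disconnect(self):
        self.is_connected = False

    async def fetch_all(self, query):
        # Mirrors the assertion seen in the traceback.
        assert self.is_connected, "DatabaseBackend is not running"
        return [{"query": query}]


database = FakeDatabase()


def asynctask(func):
    """Run an async task body in a fresh event loop with the DB connected."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        async def run():
            await database.connect()
            try:
                return await func(*args, **kwargs)
            finally:
                # Always release the connection, even if the task fails.
                await database.disconnect()

        return asyncio.run(run())

    return wrapper


@asynctask
async def process_meetings():
    return await database.fetch_all("SELECT * FROM meeting WHERE is_active")


print(process_meetings())
```

Connecting inside `run()` matters because `asyncio.run` creates a new event loop on every call, so a pool opened in another loop (or never opened in a forked worker process) is not usable from the task.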
4d21fd1754 refactor: migration from sqlite to postgres with migration script (#483) 2025-07-16 19:38:33 -06:00
b05fc9c36a fix: rename averaged_perceptron_tagger to averaged_perceptron_tagger_eng (#491) 2025-07-16 19:13:20 -06:00
0e2ae5fca8 fix: punkt -> punkt_tab + pre-download nltk packages to prevent runtime not working (#489) 2025-07-16 18:58:57 -06:00
86ce68651f build: move to uv (#488)
* build: move to uv

* build: add packages declaration

* build: move to python 3.12, as sentencepiece does not work on 3.13

* ci: remove pre-commit check, will be done in another branch.

* ci: fix name checkout

* ci: update lock and dockerfile

* test: remove event_loop, not needed in python 3.12

* test: updated test due to av returning AudioFrame with 4096 samples instead of 1024

* build: prevent using fastapi cli, because there is no way to set default port

I don't want to pass --port 1250 every time, so this goes back to the previous
approach. I deactivated auto-reload for production.

* ci: remove main.py

* test: fix quirk with httpx
2025-07-16 18:10:11 -06:00
4895160181 docs: update readme with screenshots 2025-07-16 08:44:30 -06:00
d3498ae669 docs: add AGPL-v3 license and update README (#487) 2025-07-16 08:31:55 -06:00
4764dfc219 ci: add conventional commits checks to the repo (#486) 2025-07-16 08:31:31 -06:00
9b67deb9fe ci: add release-please workflow (#485) 2025-07-16 08:09:57 -06:00
aea8773057 chore: remove old non-working code (#484) 2025-07-16 13:47:42 +00:00
438 changed files with 47607 additions and 33408 deletions

@@ -1,19 +1,21 @@
## ⚠️ Insert the PR TITLE replacing this text ⚠️
<!--- Provide a general summary of your changes in the Title above -->
⚠️ Describe your PR replacing this text. Post screenshots or videos whenever possible. ⚠️
## Description
<!--- Describe your changes in detail -->
### Checklist
## Related Issue
<!--- This project only accepts pull requests related to open issues -->
<!--- If suggesting a new feature or change, please discuss it in an issue first -->
<!--- If fixing a bug, there should be an issue describing it with steps to reproduce -->
<!--- Please link to the issue here: -->
- [ ] My branch is updated with main (mandatory)
- [ ] I wrote unit tests for this (if applies)
- [ ] I have included migrations and tested them locally (if applies)
- [ ] I have manually tested this feature locally
## Motivation and Context
<!--- Why is this change required? What problem does it solve? -->
<!--- If it fixes an open issue, please link to the issue here. -->
> IMPORTANT: Remember that you are responsible for merging this PR after it's been reviewed, and once deployed
> you should perform manual testing to make sure everything went smoothly.
### Urgency
- [ ] Urgent (deploy ASAP)
- [ ] Non-urgent (deploying in next release is ok)
## How Has This Been Tested?
<!--- Please describe in detail how you tested your changes. -->
<!--- Include details of your testing environment, and the tests you ran to -->
<!--- see how your change affects other areas of the code, etc. -->
## Screenshots (if appropriate):

@@ -0,0 +1,21 @@
name: "Lint PR"
on:
pull_request_target:
types:
- opened
- edited
- synchronize
- reopened
permissions:
pull-requests: read
jobs:
main:
name: Validate PR title
runs-on: ubuntu-latest
steps:
- uses: amannn/action-semantic-pull-request@v5
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

@@ -2,6 +2,8 @@ name: Test Database Migrations
on:
push:
branches:
- main
paths:
- "server/migrations/**"
- "server/reflector/db/**"
@@ -17,39 +19,63 @@ on:
jobs:
test-migrations:
runs-on: ubuntu-latest
concurrency:
group: db-ubuntu-latest-${{ github.ref }}
cancel-in-progress: true
services:
postgres:
image: postgres:17
env:
POSTGRES_USER: reflector
POSTGRES_PASSWORD: reflector
POSTGRES_DB: reflector
ports:
- 5432:5432
options: >-
--health-cmd pg_isready -h 127.0.0.1 -p 5432
--health-interval 10s
--health-timeout 5s
--health-retries 5
env:
DATABASE_URL: postgresql://reflector:reflector@localhost:5432/reflector
steps:
- uses: actions/checkout@v4
- name: Install poetry
run: pipx install poetry
- name: Install PostgreSQL client
run: sudo apt-get update && sudo apt-get install -y postgresql-client | cat
- name: Set up Python 3.x
uses: actions/setup-python@v4
with:
python-version: "3.11"
cache: "poetry"
cache-dependency-path: "server/poetry.lock"
- name: Install requirements
working-directory: ./server
- name: Wait for Postgres
run: |
poetry install --no-root
for i in {1..30}; do
if pg_isready -h localhost -p 5432; then
echo "Postgres is ready"
break
fi
echo "Waiting for Postgres... ($i)" && sleep 1
done
- name: Install uv
uses: astral-sh/setup-uv@v3
with:
enable-cache: true
working-directory: server
- name: Test migrations from scratch
working-directory: ./server
working-directory: server
run: |
echo "Testing migrations from clean database..."
poetry run alembic upgrade head
uv run alembic upgrade head
echo "✅ Fresh migration successful"
- name: Test migration rollback and re-apply
working-directory: ./server
working-directory: server
run: |
echo "Testing rollback to base..."
poetry run alembic downgrade base
uv run alembic downgrade base
echo "✅ Rollback successful"
echo "Testing re-apply of all migrations..."
poetry run alembic upgrade head
uv run alembic upgrade head
echo "✅ Re-apply successful"

@@ -8,18 +8,30 @@ env:
ECR_REPOSITORY: reflector
jobs:
deploy:
runs-on: ubuntu-latest
build:
strategy:
matrix:
include:
- platform: linux/amd64
runner: linux-amd64
arch: amd64
- platform: linux/arm64
runner: linux-arm64
arch: arm64
runs-on: ${{ matrix.runner }}
permissions:
deployments: write
contents: read
outputs:
registry: ${{ steps.login-ecr.outputs.registry }}
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- name: Configure AWS credentials
uses: aws-actions/configure-aws-credentials@0e613a0980cbf65ed5b322eb7a1e075d28913a83
uses: aws-actions/configure-aws-credentials@v4
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
@@ -27,21 +39,52 @@ jobs:
- name: Login to Amazon ECR
id: login-ecr
uses: aws-actions/amazon-ecr-login@62f4f872db3836360b72999f4b87f1ff13310f3a
- name: Set up QEMU
uses: docker/setup-qemu-action@v2
uses: aws-actions/amazon-ecr-login@v2
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v2
uses: docker/setup-buildx-action@v3
- name: Build and push
id: docker_build
uses: docker/build-push-action@v4
- name: Build and push ${{ matrix.arch }}
uses: docker/build-push-action@v5
with:
context: server
platforms: linux/amd64,linux/arm64
platforms: ${{ matrix.platform }}
push: true
tags: ${{ steps.login-ecr.outputs.registry }}/${{ env.ECR_REPOSITORY }}:latest
cache-from: type=gha
cache-to: type=gha,mode=max
tags: ${{ steps.login-ecr.outputs.registry }}/${{ env.ECR_REPOSITORY }}:latest-${{ matrix.arch }}
cache-from: type=gha,scope=${{ matrix.arch }}
cache-to: type=gha,mode=max,scope=${{ matrix.arch }}
github-token: ${{ secrets.GHA_CACHE_TOKEN }}
provenance: false
create-manifest:
runs-on: ubuntu-latest
needs: [build]
permissions:
deployments: write
contents: read
steps:
- name: Configure AWS credentials
uses: aws-actions/configure-aws-credentials@v4
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: ${{ env.AWS_REGION }}
- name: Login to Amazon ECR
uses: aws-actions/amazon-ecr-login@v2
- name: Create and push multi-arch manifest
run: |
# Get the registry URL (since we can't easily access job outputs in matrix)
ECR_REGISTRY=$(aws ecr describe-registry --query 'registryId' --output text).dkr.ecr.${{ env.AWS_REGION }}.amazonaws.com
docker manifest create \
$ECR_REGISTRY/${{ env.ECR_REPOSITORY }}:latest \
$ECR_REGISTRY/${{ env.ECR_REPOSITORY }}:latest-amd64 \
$ECR_REGISTRY/${{ env.ECR_REPOSITORY }}:latest-arm64
docker manifest push $ECR_REGISTRY/${{ env.ECR_REPOSITORY }}:latest
echo "✅ Multi-arch manifest pushed: $ECR_REGISTRY/${{ env.ECR_REPOSITORY }}:latest"

24
.github/workflows/pre-commit.yml vendored Normal file

@@ -0,0 +1,24 @@
name: pre-commit
on:
pull_request:
push:
branches: [main]
jobs:
pre-commit:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v5
- uses: actions/setup-python@v5
- uses: pnpm/action-setup@v4
with:
version: 10
- uses: actions/setup-node@v4
with:
node-version: 22
cache: "pnpm"
cache-dependency-path: "www/pnpm-lock.yaml"
- name: Install dependencies
run: cd www && pnpm install --frozen-lockfile
- uses: pre-commit/action@v3.0.1

19
.github/workflows/release-please.yml vendored Normal file
View File

@@ -0,0 +1,19 @@
on:
push:
branches:
- main
permissions:
contents: write
pull-requests: write
name: release-please
jobs:
release-please:
runs-on: ubuntu-latest
steps:
- uses: googleapis/release-please-action@v4
with:
token: ${{ secrets.MY_RELEASE_PLEASE_TOKEN }}
release-type: simple

45
.github/workflows/test_next_server.yml vendored Normal file
View File

@@ -0,0 +1,45 @@
name: Test Next Server
on:
pull_request:
paths:
- "www/**"
push:
branches:
- main
paths:
- "www/**"
jobs:
test-next-server:
runs-on: ubuntu-latest
defaults:
run:
working-directory: ./www
steps:
- uses: actions/checkout@v4
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: '20'
- name: Install pnpm
uses: pnpm/action-setup@v4
with:
version: 8
- name: Setup Node.js cache
uses: actions/setup-node@v4
with:
node-version: '20'
cache: 'pnpm'
cache-dependency-path: './www/pnpm-lock.yaml'
- name: Install dependencies
run: pnpm install
- name: Run tests
run: pnpm test

@@ -5,77 +5,66 @@ on:
paths:
- "server/**"
push:
branches:
- main
paths:
- "server/**"
jobs:
pytest:
runs-on: ubuntu-latest
concurrency:
group: pytest-${{ github.ref }}
cancel-in-progress: true
services:
redis:
image: redis:6
ports:
- 6379:6379
steps:
- uses: actions/checkout@v3
- name: Install poetry
run: pipx install poetry
- name: Set up Python 3.x
uses: actions/setup-python@v4
- uses: actions/checkout@v4
- name: Install uv
uses: astral-sh/setup-uv@v6
with:
python-version: "3.11"
cache: "poetry"
cache-dependency-path: "server/poetry.lock"
- name: Install requirements
run: |
cd server
poetry install --no-root
enable-cache: true
working-directory: server
- name: Tests
run: |
cd server
poetry run python -m pytest -v tests
uv run -m pytest -v tests
formatting:
runs-on: ubuntu-latest
docker-amd64:
runs-on: linux-amd64
concurrency:
group: docker-amd64-${{ github.ref }}
cancel-in-progress: true
steps:
- uses: actions/checkout@v3
- name: Set up Python 3.x
uses: actions/setup-python@v4
with:
python-version: 3.11
- name: Validate formatting
run: |
pip install black
cd server
black --check reflector tests
linting:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Set up Python 3.x
uses: actions/setup-python@v4
with:
python-version: 3.11
- name: Validate formatting
run: |
pip install ruff
cd server
ruff check reflector tests
docker:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Set up QEMU
uses: docker/setup-qemu-action@v2
- uses: actions/checkout@v4
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v2
- name: Build and push
id: docker_build
uses: docker/build-push-action@v4
uses: docker/setup-buildx-action@v3
- name: Build AMD64
uses: docker/build-push-action@v6
with:
context: server
platforms: linux/amd64,linux/arm64
cache-from: type=gha
cache-to: type=gha,mode=max
platforms: linux/amd64
cache-from: type=gha,scope=amd64
cache-to: type=gha,mode=max,scope=amd64
github-token: ${{ secrets.GHA_CACHE_TOKEN }}
docker-arm64:
runs-on: linux-arm64
concurrency:
group: docker-arm64-${{ github.ref }}
cancel-in-progress: true
steps:
- uses: actions/checkout@v4
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Build ARM64
uses: docker/build-push-action@v6
with:
context: server
platforms: linux/arm64
cache-from: type=gha,scope=arm64
cache-to: type=gha,mode=max,scope=arm64
github-token: ${{ secrets.GHA_CACHE_TOKEN }}

10
.gitignore vendored

@@ -9,4 +9,12 @@ dump.rdb
ngrok.log
.claude/settings.local.json
restart-dev.sh
*.log
*.log
data/
www/REFACTOR.md
www/reload-frontend
server/test.sqlite
CLAUDE.local.md
www/.env.development
www/.env.production
.playwright-mcp

1
.gitleaksignore Normal file
View File

@@ -0,0 +1 @@
b9d891d3424f371642cb032ecfd0e2564470a72c:server/tests/test_transcripts_recording_deletion.py:generic-api-key:15

@@ -3,10 +3,10 @@
repos:
- repo: local
hooks:
- id: yarn-format
name: run yarn format
- id: format
name: run format
language: system
entry: bash -c 'cd www && yarn format'
entry: bash -c 'cd www && pnpm format'
pass_filenames: false
files: ^www/
@@ -15,25 +15,20 @@ repos:
hooks:
- id: debug-statements
- id: trailing-whitespace
exclude: ^server/trials
- id: detect-private-key
- repo: https://github.com/psf/black
rev: 24.1.1
hooks:
- id: black
files: ^server/(reflector|tests)/
- repo: https://github.com/pycqa/isort
rev: 5.12.0
hooks:
- id: isort
name: isort (python)
files: ^server/(gpu|evaluate|reflector)/
args: [ "--profile", "black", "--filter-files" ]
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.6.5
rev: v0.8.2
hooks:
- id: ruff
files: ^server/(reflector|tests)/
args:
- --fix
# Uses select rules from server/pyproject.toml
files: ^server/
- id: ruff-format
files: ^server/
- repo: https://github.com/gitleaks/gitleaks
rev: v8.28.0
hooks:
- id: gitleaks

@@ -1 +0,0 @@
3.11.6

206
CHANGELOG.md Normal file
View File

@@ -0,0 +1,206 @@
# Changelog
## [0.10.0](https://github.com/Monadical-SAS/reflector/compare/v0.9.0...v0.10.0) (2025-09-11)
### Features
* replace nextjs-config with environment variables ([#632](https://github.com/Monadical-SAS/reflector/issues/632)) ([369ecdf](https://github.com/Monadical-SAS/reflector/commit/369ecdff13f3862d926a9c0b87df52c9d94c4dde))
### Bug Fixes
* anonymous users transcript permissions ([#621](https://github.com/Monadical-SAS/reflector/issues/621)) ([f81fe99](https://github.com/Monadical-SAS/reflector/commit/f81fe9948a9237b3e0001b2d8ca84f54d76878f9))
* auth post ([#624](https://github.com/Monadical-SAS/reflector/issues/624)) ([cde99ca](https://github.com/Monadical-SAS/reflector/commit/cde99ca2716f84ba26798f289047732f0448742e))
* auth post ([#626](https://github.com/Monadical-SAS/reflector/issues/626)) ([3b85ff3](https://github.com/Monadical-SAS/reflector/commit/3b85ff3bdf4fb053b103070646811bc990c0e70a))
* auth post ([#627](https://github.com/Monadical-SAS/reflector/issues/627)) ([962038e](https://github.com/Monadical-SAS/reflector/commit/962038ee3f2a555dc3c03856be0e4409456e0996))
* missing follow_redirects=True on modal endpoint ([#630](https://github.com/Monadical-SAS/reflector/issues/630)) ([fc363bd](https://github.com/Monadical-SAS/reflector/commit/fc363bd49b17b075e64f9186e5e0185abc325ea7))
* sync backend and frontend token refresh logic ([#614](https://github.com/Monadical-SAS/reflector/issues/614)) ([5a5b323](https://github.com/Monadical-SAS/reflector/commit/5a5b3233820df9536da75e87ce6184a983d4713a))
## [0.9.0](https://github.com/Monadical-SAS/reflector/compare/v0.8.2...v0.9.0) (2025-09-06)
### Features
* frontend openapi react query ([#606](https://github.com/Monadical-SAS/reflector/issues/606)) ([c4d2825](https://github.com/Monadical-SAS/reflector/commit/c4d2825c81f81ad8835629fbf6ea8c7383f8c31b))
### Bug Fixes
* align whisper transcriber api with parakeet ([#602](https://github.com/Monadical-SAS/reflector/issues/602)) ([0663700](https://github.com/Monadical-SAS/reflector/commit/0663700a615a4af69a03c96c410f049e23ec9443))
* kv use tls explicit ([#610](https://github.com/Monadical-SAS/reflector/issues/610)) ([08d88ec](https://github.com/Monadical-SAS/reflector/commit/08d88ec349f38b0d13e0fa4cb73486c8dfd31836))
* source kind for file processing ([#601](https://github.com/Monadical-SAS/reflector/issues/601)) ([dc82f8b](https://github.com/Monadical-SAS/reflector/commit/dc82f8bb3bdf3ab3d4088e592a30fd63907319e1))
* token refresh locking ([#613](https://github.com/Monadical-SAS/reflector/issues/613)) ([7f5a4c9](https://github.com/Monadical-SAS/reflector/commit/7f5a4c9ddc7fd098860c8bdda2ca3b57f63ded2f))
## [0.8.2](https://github.com/Monadical-SAS/reflector/compare/v0.8.1...v0.8.2) (2025-08-29)
### Bug Fixes
* search-logspam ([#593](https://github.com/Monadical-SAS/reflector/issues/593)) ([695d1a9](https://github.com/Monadical-SAS/reflector/commit/695d1a957d4cd862753049f9beed88836cabd5ab))
## [0.8.1](https://github.com/Monadical-SAS/reflector/compare/v0.8.0...v0.8.1) (2025-08-29)
### Bug Fixes
* make webhook secret/url allowing null ([#590](https://github.com/Monadical-SAS/reflector/issues/590)) ([84a3812](https://github.com/Monadical-SAS/reflector/commit/84a381220bc606231d08d6f71d4babc818fa3c75))
## [0.8.0](https://github.com/Monadical-SAS/reflector/compare/v0.7.3...v0.8.0) (2025-08-29)
### Features
* **cleanup:** add automatic data retention for public instances ([#574](https://github.com/Monadical-SAS/reflector/issues/574)) ([6f0c7c1](https://github.com/Monadical-SAS/reflector/commit/6f0c7c1a5e751713366886c8e764c2009e12ba72))
* **rooms:** add webhook for transcript completion ([#578](https://github.com/Monadical-SAS/reflector/issues/578)) ([88ed7cf](https://github.com/Monadical-SAS/reflector/commit/88ed7cfa7804794b9b54cad4c3facc8a98cf85fd))
### Bug Fixes
* file pipeline status reporting and websocket updates ([#589](https://github.com/Monadical-SAS/reflector/issues/589)) ([9dfd769](https://github.com/Monadical-SAS/reflector/commit/9dfd76996f851cc52be54feea078adbc0816dc57))
* Igor/evaluation ([#575](https://github.com/Monadical-SAS/reflector/issues/575)) ([124ce03](https://github.com/Monadical-SAS/reflector/commit/124ce03bf86044c18313d27228a25da4bc20c9c5))
* optimize parakeet transcription batching algorithm ([#577](https://github.com/Monadical-SAS/reflector/issues/577)) ([7030e0f](https://github.com/Monadical-SAS/reflector/commit/7030e0f23649a8cf6c1eb6d5889684a41ce849ec))
## [0.7.3](https://github.com/Monadical-SAS/reflector/compare/v0.7.2...v0.7.3) (2025-08-22)
### Bug Fixes
* cleaned repo, and get git-leaks clean ([359280d](https://github.com/Monadical-SAS/reflector/commit/359280dd340433ba4402ed69034094884c825e67))
* restore previous behavior on live pipeline + audio downscaler ([#561](https://github.com/Monadical-SAS/reflector/issues/561)) ([9265d20](https://github.com/Monadical-SAS/reflector/commit/9265d201b590d23c628c5f19251b70f473859043))
## [0.7.2](https://github.com/Monadical-SAS/reflector/compare/v0.7.1...v0.7.2) (2025-08-21)
### Bug Fixes
* docker image not loading libgomp.so.1 for torch ([#560](https://github.com/Monadical-SAS/reflector/issues/560)) ([773fccd](https://github.com/Monadical-SAS/reflector/commit/773fccd93e887c3493abc2e4a4864dddce610177))
* include shared rooms to search ([#558](https://github.com/Monadical-SAS/reflector/issues/558)) ([499eced](https://github.com/Monadical-SAS/reflector/commit/499eced3360b84fb3a90e1c8a3b554290d21adc2))
## [0.7.1](https://github.com/Monadical-SAS/reflector/compare/v0.7.0...v0.7.1) (2025-08-21)
### Bug Fixes
* webvtt db null expectation mismatch ([#556](https://github.com/Monadical-SAS/reflector/issues/556)) ([e67ad1a](https://github.com/Monadical-SAS/reflector/commit/e67ad1a4a2054467bfeb1e0258fbac5868aaaf21))
## [0.7.0](https://github.com/Monadical-SAS/reflector/compare/v0.6.1...v0.7.0) (2025-08-21)
### Features
* delete recording with transcript ([#547](https://github.com/Monadical-SAS/reflector/issues/547)) ([99cc984](https://github.com/Monadical-SAS/reflector/commit/99cc9840b3f5de01e0adfbfae93234042d706d13))
* pipeline improvement with file processing, parakeet, silero-vad ([#540](https://github.com/Monadical-SAS/reflector/issues/540)) ([bcc29c9](https://github.com/Monadical-SAS/reflector/commit/bcc29c9e0050ae215f89d460e9d645aaf6a5e486))
* postgresql migration and removal of sqlite in pytest ([#546](https://github.com/Monadical-SAS/reflector/issues/546)) ([cd1990f](https://github.com/Monadical-SAS/reflector/commit/cd1990f8f0fe1503ef5069512f33777a73a93d7f))
* search backend ([#537](https://github.com/Monadical-SAS/reflector/issues/537)) ([5f9b892](https://github.com/Monadical-SAS/reflector/commit/5f9b89260c9ef7f3c921319719467df22830453f))
* search frontend ([#551](https://github.com/Monadical-SAS/reflector/issues/551)) ([3657242](https://github.com/Monadical-SAS/reflector/commit/365724271ca6e615e3425125a69ae2b46ce39285))
### Bug Fixes
* evaluation cli event wrap ([#536](https://github.com/Monadical-SAS/reflector/issues/536)) ([941c3db](https://github.com/Monadical-SAS/reflector/commit/941c3db0bdacc7b61fea412f3746cc5a7cb67836))
* use structlog not logging ([#550](https://github.com/Monadical-SAS/reflector/issues/550)) ([27e2f81](https://github.com/Monadical-SAS/reflector/commit/27e2f81fda5232e53edc729d3e99c5ef03adbfe9))
## [0.6.1](https://github.com/Monadical-SAS/reflector/compare/v0.6.0...v0.6.1) (2025-08-06)
### Bug Fixes
* delayed waveform loading ([#538](https://github.com/Monadical-SAS/reflector/issues/538)) ([ef64146](https://github.com/Monadical-SAS/reflector/commit/ef64146325d03f64dd9a1fe40234fb3e7e957ae2))
## [0.6.0](https://github.com/Monadical-SAS/reflector/compare/v0.5.0...v0.6.0) (2025-08-05)
### ⚠ BREAKING CHANGES
* Configuration keys have changed. Update your .env file:
- TRANSCRIPT_MODAL_API_KEY → TRANSCRIPT_API_KEY
- LLM_MODAL_API_KEY → (removed, use TRANSCRIPT_API_KEY)
- Add DIARIZATION_API_KEY and TRANSLATE_API_KEY if using those services
### Features
* implement service-specific Modal API keys with auto processor pattern ([#528](https://github.com/Monadical-SAS/reflector/issues/528)) ([650befb](https://github.com/Monadical-SAS/reflector/commit/650befb291c47a1f49e94a01ab37d8fdfcd2b65d))
* use llamaindex everywhere ([#525](https://github.com/Monadical-SAS/reflector/issues/525)) ([3141d17](https://github.com/Monadical-SAS/reflector/commit/3141d172bc4d3b3d533370c8e6e351ea762169bf))
### Miscellaneous Chores
* **main:** release 0.6.0 ([ecdbf00](https://github.com/Monadical-SAS/reflector/commit/ecdbf003ea2476c3e95fd231adaeb852f2943df0))
## [0.5.0](https://github.com/Monadical-SAS/reflector/compare/v0.4.0...v0.5.0) (2025-07-31)
### Features
* new summary using phi-4 and llama-index ([#519](https://github.com/Monadical-SAS/reflector/issues/519)) ([1bf9ce0](https://github.com/Monadical-SAS/reflector/commit/1bf9ce07c12f87f89e68a1dbb3b2c96c5ee62466))
### Bug Fixes
* remove unused settings and utils files ([#522](https://github.com/Monadical-SAS/reflector/issues/522)) ([2af4790](https://github.com/Monadical-SAS/reflector/commit/2af4790e4be9e588f282fbc1bb171c88a03d6479))
## [0.4.0](https://github.com/Monadical-SAS/reflector/compare/v0.3.2...v0.4.0) (2025-07-25)
### Features
* Diarization cli ([#509](https://github.com/Monadical-SAS/reflector/issues/509)) ([ffc8003](https://github.com/Monadical-SAS/reflector/commit/ffc8003e6dad236930a27d0fe3e2f2adfb793890))
### Bug Fixes
* remove faulty import Meeting ([#512](https://github.com/Monadical-SAS/reflector/issues/512)) ([0e68c79](https://github.com/Monadical-SAS/reflector/commit/0e68c798434e1b481f9482cc3a4702ea00365df4))
* room concurrency (theoretically) ([#511](https://github.com/Monadical-SAS/reflector/issues/511)) ([7bb3676](https://github.com/Monadical-SAS/reflector/commit/7bb367653afeb2778cff697a0eb217abf0b81b84))
## [0.3.2](https://github.com/Monadical-SAS/reflector/compare/v0.3.1...v0.3.2) (2025-07-22)
### Bug Fixes
* match font size for the filter sidebar ([#507](https://github.com/Monadical-SAS/reflector/issues/507)) ([4b8ba5d](https://github.com/Monadical-SAS/reflector/commit/4b8ba5db1733557e27b098ad3d1cdecadf97ae52))
* whereby consent not displaying ([#505](https://github.com/Monadical-SAS/reflector/issues/505)) ([1120552](https://github.com/Monadical-SAS/reflector/commit/1120552c2c83d084d3a39272ad49b6aeda1af98f))
## [0.3.1](https://github.com/Monadical-SAS/reflector/compare/v0.3.0...v0.3.1) (2025-07-22)
### Bug Fixes
* remove fief out of the source code ([#502](https://github.com/Monadical-SAS/reflector/issues/502)) ([890dd15](https://github.com/Monadical-SAS/reflector/commit/890dd15ba5a2be10dbb841e9aeb75d377885f4af))
* remove primary color for room action menu ([#504](https://github.com/Monadical-SAS/reflector/issues/504)) ([2e33f89](https://github.com/Monadical-SAS/reflector/commit/2e33f89c0f9e5fbaafa80e8d2ae9788450ea2f31))
## [0.3.0](https://github.com/Monadical-SAS/reflector/compare/v0.2.1...v0.3.0) (2025-07-21)
### Features
* migrate from chakra 2 to chakra 3 ([#500](https://github.com/Monadical-SAS/reflector/issues/500)) ([a858464](https://github.com/Monadical-SAS/reflector/commit/a858464c7a80e5497acf801d933bf04092f8b526))
## [0.2.1](https://github.com/Monadical-SAS/reflector/compare/v0.2.0...v0.2.1) (2025-07-18)
### Bug Fixes
* separate browsing page into different components, limit to 10 by default ([#498](https://github.com/Monadical-SAS/reflector/issues/498)) ([c752da6](https://github.com/Monadical-SAS/reflector/commit/c752da6b97c96318aff079a5b2a6eceadfbfcad1))
## [0.2.0](https://github.com/Monadical-SAS/reflector/compare/0.1.1...v0.2.0) (2025-07-17)
### Features
* improve transcript listing with room_id ([#496](https://github.com/Monadical-SAS/reflector/issues/496)) ([d2b5de5](https://github.com/Monadical-SAS/reflector/commit/d2b5de543fc0617fc220caa6a8a290e4040cb10b))
### Bug Fixes
* don't attempt to load waveform/mp3 if audio was deleted ([#495](https://github.com/Monadical-SAS/reflector/issues/495)) ([f4578a7](https://github.com/Monadical-SAS/reflector/commit/f4578a743fd0f20312fbd242fa9cccdfaeb20a9e))
## [0.1.1](https://github.com/Monadical-SAS/reflector/compare/0.1.0...v0.1.1) (2025-07-17)
### Bug Fixes
* postgres database not connecting in worker ([#492](https://github.com/Monadical-SAS/reflector/issues/492)) ([123d09f](https://github.com/Monadical-SAS/reflector/commit/123d09fdacef7f5a84541cf01732d4f5b6b9d2d0))
* process meetings with utc ([#493](https://github.com/Monadical-SAS/reflector/issues/493)) ([f3c85e1](https://github.com/Monadical-SAS/reflector/commit/f3c85e1eb97cd893840125ed056dcb290fccb612))
* punkt -> punkt_tab + pre-download nltk packages to prevent runtime not working ([#489](https://github.com/Monadical-SAS/reflector/issues/489)) ([c22487b](https://github.com/Monadical-SAS/reflector/commit/c22487b41f311a3fdba2eac04c7637bd396cccee))
* rename averaged_perceptron_tagger to averaged_perceptron_tagger_eng ([#491](https://github.com/Monadical-SAS/reflector/issues/491)) ([a7b7846](https://github.com/Monadical-SAS/reflector/commit/a7b78462419b3af81c6dbf1ddfccb3d532f660a3))

179
CLAUDE.md Normal file

@@ -0,0 +1,179 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
Reflector is an AI-powered audio transcription and meeting analysis platform with real-time processing capabilities. The system consists of:
- **Frontend**: Next.js 14 React application (`www/`) with Chakra UI, real-time WebSocket integration
- **Backend**: Python FastAPI server (`server/`) with async database operations and background processing
- **Processing**: GPU-accelerated ML pipeline for transcription, diarization, summarization via Modal.com
- **Infrastructure**: Redis, PostgreSQL/SQLite, Celery workers, WebRTC streaming
## Development Commands
### Backend (Python) - `cd server/`
**Setup and Dependencies:**
```bash
# Install dependencies
uv sync
# Database migrations (first run or schema changes)
uv run alembic upgrade head
# Start services
docker compose up -d redis
```
**Development:**
```bash
# Start FastAPI server
uv run -m reflector.app --reload
# Start Celery worker for background tasks
uv run celery -A reflector.worker.app worker --loglevel=info
# Start Celery beat scheduler (optional, for cron jobs)
uv run celery -A reflector.worker.app beat
```
**Testing:**
```bash
# Run all tests with coverage
uv run pytest
# Run specific test file
uv run pytest tests/test_transcripts.py
# Run tests with verbose output
uv run pytest -v
```
**Process Audio Files:**
```bash
# Process local audio file manually
uv run python -m reflector.tools.process path/to/audio.wav
```
### Frontend (Next.js) - `cd www/`
**Setup:**
```bash
# Install dependencies
pnpm install
# Copy configuration templates
cp .env_template .env
```
**Development:**
```bash
# Start development server
pnpm dev
# Generate TypeScript API client from OpenAPI spec
pnpm openapi
# Lint code
pnpm lint
# Format code
pnpm format
# Build for production
pnpm build
```
### Docker Compose (Full Stack)
```bash
# Start all services
docker compose up -d
# Start specific services
docker compose up -d redis server worker
```
## Architecture Overview
### Backend Processing Pipeline
The audio processing follows a modular pipeline architecture:
1. **Audio Input**: WebRTC streaming, file upload, or cloud recording ingestion
2. **Chunking**: Audio split into processable segments (`AudioChunkerProcessor`)
3. **Transcription**: Whisper or Modal.com GPU processing (`AudioTranscriptAutoProcessor`)
4. **Diarization**: Speaker identification (`AudioDiarizationAutoProcessor`)
5. **Text Processing**: Formatting, translation, topic detection
6. **Summarization**: AI-powered summaries and title generation
7. **Storage**: Database persistence with optional S3 backend
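The staged design above can be illustrated with a minimal sketch. It does not use Reflector's real classes: `Pipeline`, `chunk`, and `transcribe` are hypothetical names, and the "transcription" is a placeholder. The point is only that each stage consumes the previous stage's output.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Pipeline:
    """Hypothetical, simplified stand-in for a processor chain."""

    stages: List[Callable[[list], list]] = field(default_factory=list)

    def run(self, data: list) -> list:
        # Feed the output of each stage into the next one, in order.
        for stage in self.stages:
            data = stage(data)
        return data


def chunk(samples: list, size: int = 4) -> list:
    # Split the input into fixed-size segments (the last one may be shorter).
    return [samples[i:i + size] for i in range(0, len(samples), size)]


def transcribe(chunks: list) -> list:
    # Placeholder: a real stage would call a speech-to-text model here.
    return [f"segment of {len(c)} samples" for c in chunks]


pipeline = Pipeline(stages=[chunk, transcribe])
result = pipeline.run(list(range(10)))
```

Because every stage shares the same list-in, list-out shape in this sketch, a stage can be swapped or inserted without touching the others.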
### Database Models
Core entities:
- `transcript`: Main table with processing results, summaries, topics, participants
- `meeting`: Live meeting sessions with consent management
- `room`: Virtual meeting spaces with configuration
- `recording`: Audio/video file metadata and processing status
### API Structure
All endpoints prefixed `/v1/`:
- `transcripts/` - CRUD operations for transcripts
- `transcripts_audio/` - Audio streaming and download
- `transcripts_webrtc/` - Real-time WebRTC endpoints
- `transcripts_websocket/` - WebSocket for live updates
- `meetings/` - Meeting lifecycle management
- `rooms/` - Virtual room management
### Frontend Architecture
- **App Router**: Next.js 14 with route groups for organization
- **State**: React Context pattern, no Redux
- **Real-time**: WebSocket integration for live transcription updates
- **Auth**: NextAuth.js with Authentik OAuth/OIDC provider
- **UI**: Chakra UI components with Tailwind CSS utilities
## Key Configuration
### Environment Variables
**Backend** (`server/.env`):
- `DATABASE_URL` - Database connection string
- `REDIS_URL` - Redis broker for Celery
- `TRANSCRIPT_BACKEND=modal` + `TRANSCRIPT_API_KEY` - Modal.com transcription
- `DIARIZATION_BACKEND=modal` + `DIARIZATION_API_KEY` - Modal.com diarization
- `TRANSLATION_BACKEND=modal` + `TRANSLATE_API_KEY` - Modal.com translation
- `WHEREBY_API_KEY` - Video platform integration
- `REFLECTOR_AUTH_BACKEND` - Authentication method (none, jwt)
**Frontend** (`www/.env`):
- `NEXTAUTH_URL`, `NEXTAUTH_SECRET` - Authentication configuration
- `NEXT_PUBLIC_REFLECTOR_API_URL` - Backend API endpoint
- `REFLECTOR_DOMAIN_CONFIG` - Feature flags and domain settings
## Testing Strategy
- **Backend**: pytest with async support, HTTP client mocking, audio processing tests
- **Frontend**: No current test suite - opportunities for Jest/React Testing Library
- **Coverage**: Backend maintains test coverage reports in `htmlcov/`
## GPU Processing
Modal.com integration for scalable ML processing:
- Deploy changes: `modal run server/gpu/path/to/model.py`
- Requires Modal account with `REFLECTOR_GPU_APIKEY` secret
- Fallback to local processing when Modal unavailable
## Common Issues
- **Permissions**: Browser microphone access required in System Preferences
- **Audio Routing**: Use BlackHole (Mac) for merging multiple audio sources
- **WebRTC**: Ensure proper CORS configuration for cross-origin streaming
- **Database**: Run `uv run alembic upgrade head` after pulling schema changes
## Pipeline/worker related info
If you need to do any worker/pipeline-related work, search for "Pipeline" classes and their "create" or "build" methods to find the main processor sequence. Look for task orchestration patterns (like "chord", "group", or "chain") to identify the post-processing flow with parallel execution chains. This will give you an abstract view of how the processing pipeline is organized.

9
LICENSE Normal file
View File

@@ -0,0 +1,9 @@
MIT License
Copyright (c) 2025 Monadical SAS
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

267
README.md
View File

@@ -1,48 +1,69 @@
<div align="center">
<img width="100" alt="image" src="https://github.com/user-attachments/assets/66fb367b-2c89-4516-9912-f47ac59c6a7f"/>
# Reflector
Reflector Audio Management and Analysis is a cutting-edge web application under development by Monadical. It utilizes AI to record meetings, providing a permanent record with transcripts, translations, and automated summaries.
Reflector is an AI-powered audio transcription and meeting analysis platform that provides real-time transcription, speaker diarization, translation and summarization for audio content and live meetings. It works 100% with local models (whisper/parakeet, pyannote, seamless-m4t, and your local llm like phi-4).
[![Tests](https://github.com/monadical-sas/reflector/actions/workflows/test_server.yml/badge.svg?branch=main&event=push)](https://github.com/monadical-sas/reflector/actions/workflows/test_server.yml)
[![License: MIT](https://img.shields.io/badge/license-MIT-green.svg)](https://opensource.org/licenses/MIT)
</div>
</div>
<table>
<tr>
<td>
<a href="https://github.com/user-attachments/assets/21f5597c-2930-4899-a154-f7bd61a59e97">
<img width="700" alt="image" src="https://github.com/user-attachments/assets/21f5597c-2930-4899-a154-f7bd61a59e97" />
</a>
</td>
<td>
<a href="https://github.com/user-attachments/assets/f6b9399a-5e51-4bae-b807-59128d0a940c">
<img width="700" alt="image" src="https://github.com/user-attachments/assets/f6b9399a-5e51-4bae-b807-59128d0a940c" />
</a>
</td>
<td>
<a href="https://github.com/user-attachments/assets/a42ce460-c1fd-4489-a995-270516193897">
<img width="700" alt="image" src="https://github.com/user-attachments/assets/a42ce460-c1fd-4489-a995-270516193897" />
</a>
</td>
<td>
<a href="https://github.com/user-attachments/assets/21929f6d-c309-42fe-9c11-f1299e50fbd4">
<img width="700" alt="image" src="https://github.com/user-attachments/assets/21929f6d-c309-42fe-9c11-f1299e50fbd4" />
</a>
</td>
</tr>
</table>
## What is Reflector?
Reflector is a web application that utilizes local models to process audio content, providing:
- **Real-time Transcription**: Convert speech to text using [Whisper](https://github.com/openai/whisper) (multi-language) or [Parakeet](https://huggingface.co/nvidia/parakeet-tdt-0.6b-v2) (English) models
- **Speaker Diarization**: Identify and label different speakers using [Pyannote](https://github.com/pyannote/pyannote-audio) 3.1
- **Live Translation**: Translate audio content in real-time to many languages with [Facebook Seamless-M4T](https://github.com/facebookresearch/seamless_communication)
- **Topic Detection & Summarization**: Extract key topics and generate concise summaries using LLMs
- **Meeting Recording**: Create permanent records of meetings with searchable transcripts
Currently, we provide a [modal.com](https://modal.com/) GPU template for deployment.
## Background
The project architecture consists of three primary components:
- **Front-End**: NextJS React project hosted on Vercel, located in `www/`.
- **Back-End**: Python server that offers an API and data persistence, found in `server/`.
- **GPU implementation**: Provides services such as speech-to-text transcription, topic generation, automated summaries, and translations. The most reliable option is a Modal deployment.
- **Front-End**: NextJS React project hosted on Vercel, located in `www/`.
- **GPU implementation**: Providing services such as speech-to-text transcription, topic generation, automated summaries, and translations.
It also uses https://github.com/fief-dev for authentication, and Vercel for deployment and configuration of the front-end.
It also uses authentik for authentication if activated.
## Table of Contents
## Contribution Guidelines
- [Reflector](#reflector)
- [Table of Contents](#table-of-contents)
- [Miscellaneous](#miscellaneous)
- [Contribution Guidelines](#contribution-guidelines)
- [How to Install Blackhole (Mac Only)](#how-to-install-blackhole-mac-only)
- [Front-End](#front-end)
- [Installation](#installation)
- [Run the Application](#run-the-application)
- [OpenAPI Code Generation](#openapi-code-generation)
- [Back-End](#back-end)
- [Installation](#installation-1)
- [Start the API/Backend](#start-the-apibackend)
- [Redis (Mac)](#redis-mac)
- [Redis (Windows)](#redis-windows)
- [Update the database schema (run on first install, and after each pull containing a migration)](#update-the-database-schema-run-on-first-install-and-after-each-pull-containing-a-migration)
- [Main Server](#main-server)
- [Crontab (optional)](#crontab-optional)
- [Using docker](#using-docker)
- [Using local GPT4All](#using-local-gpt4all)
- [Using local files](#using-local-files)
- [AI Models](#ai-models)
All new contributions should be made in a separate branch and go through a Pull Request.
[Conventional commits](https://www.conventionalcommits.org/en/v1.0.0/) must be used for the PR title and commits.
## Miscellaneous
## Usage
### Contribution Guidelines
All new contributions should be made in a separate branch. Before any code is merged into `main`, it requires a code review.
### Usage instructions
To record both your voice and the meeting you're taking part in, you need :
To record both your voice and the meeting you're taking part in, you need:
- For an in-person meeting, make sure your microphone is in range of all participants.
- If using several microphones, make sure to merge the audio feeds into one with an external tool.
@@ -66,156 +87,114 @@ Note: We currently do not have instructions for Windows users.
- Then go to `System Preferences -> Sound` and choose the devices created from the Output and Input tabs.
- The input from your local microphone and the browser-run meeting should be aggregated into one virtual stream to listen to, and the output should be fed back to your specified output devices if everything is configured properly.
## Front-End
## Installation
*Note: we're working toward a better installation process; these instructions may not be accurate for now.*
### Frontend
Start with `cd www`.
### Installation
To install the application, run:
**Installation**
```bash
yarn install
cp .env_template .env
cp config-template.ts config.ts
pnpm install
cp .env.example .env
```
Then, fill in the environment variables in `.env` and the configuration in `config.ts` as needed. If you are unsure on how to proceed, ask in Zulip.
Then, fill in the environment variables in `.env` as needed. If you are unsure on how to proceed, ask in Zulip.
### Run the Application
To run the application in development mode, run:
**Run in development mode**
```bash
yarn dev
pnpm dev
```
Then (after completing server setup and starting it) open [http://localhost:3000](http://localhost:3000) to view it in the browser.
### OpenAPI Code Generation
**OpenAPI Code Generation**
To generate the TypeScript files from the openapi.json file, make sure the python server is running, then run:
```bash
yarn openapi
pnpm openapi
```
## Back-End
### Backend
Start with `cd server`.
### Quick-run instructions (only if you installed everything already)
```bash
redis-server # Mac
docker compose up -d redis # Windows
poetry run celery -A reflector.worker.app worker --loglevel=info
poetry run python -m reflector.app
```
### Installation
Download [Python 3.11 from the official website](https://www.python.org/downloads/) and ensure you have version 3.11 by running `python --version`.
Run:
```bash
python --version # It should say 3.11
pip install poetry
poetry install --no-root
cp .env_template .env
```
Then fill `.env` with the omitted values (ask in Zulip). At the moment of this writing, the only value omitted is `AUTH_FIEF_CLIENT_SECRET`.
### Start the API/Backend
Start the background worker:
```bash
poetry run celery -A reflector.worker.app worker --loglevel=info
```
### Redis (Mac)
```bash
yarn add redis
poetry run celery -A reflector.worker.app worker --loglevel=info
redis-server
```
### Redis (Windows)
**Option 1**
**Run in development mode**
```bash
docker compose up -d redis
# on the first run, or if the schemas changed
uv run alembic upgrade head
# start the worker
uv run celery -A reflector.worker.app worker --loglevel=info
# start the app
uv run -m reflector.app --reload
```
**Option 2**
Then fill `.env` with the omitted values (ask in Zulip).
Install:
- [Git for Windows](https://gitforwindows.org/)
- [Windows Subsystem for Linux (WSL)](https://docs.microsoft.com/en-us/windows/wsl/install)
- Install your preferred Linux distribution via the Microsoft Store (e.g., Ubuntu).
Open your Linux distribution and update the package list:
```bash
sudo apt update
sudo apt install redis-server
redis-server
```
## Update the database schema (run on first install, and after each pull containing a migration)
```bash
poetry run alembic heads
```
## Main Server
```bash
poetry run python -m reflector.app
```
### Crontab (optional)
**Crontab (optional)**
For crontab (only healthcheck for now), start the celery beat (you don't need it on your local dev environment):
```bash
poetry run celery -A reflector.worker.app beat
uv run celery -A reflector.worker.app beat
```
#### Using docker
### GPU models
Use:
Currently, Reflector relies heavily on custom local models deployed on Modal. All the microservices are available in `server/gpu/`.
```bash
docker-compose up server
```
### Using local GPT4All
- Start GPT4All with any model you want
- Ensure the API server is activated in GPT4all
- Run with: `LLM_BACKEND=openai LLM_URL=http://localhost:4891/v1/completions LLM_OPENAI_MODEL="GPT4All Falcon" python -m reflector.app`
### Using local files
```
poetry run python -m reflector.tools.process path/to/audio.wav
```
## AI Models
### Modal
To deploy llm changes to modal, you need.
To deploy llm changes to modal, you need:
- a modal account
- set up the required secret in your modal account (REFLECTOR_GPU_APIKEY)
- install the modal cli
- connect your modal cli to your account if not done previously
- `modal run path/to/required/llm`
_(Documentation for this section is pending.)_
## Using local files
You can manually process an audio file by calling the process tool:
```bash
uv run python -m reflector.tools.process path/to/audio.wav
```
## Feature Flags
Reflector uses environment variable-based feature flags to control application functionality. These flags allow you to enable or disable features without code changes.
### Available Feature Flags
| Feature Flag | Environment Variable |
|-------------|---------------------|
| `requireLogin` | `NEXT_PUBLIC_FEATURE_REQUIRE_LOGIN` |
| `privacy` | `NEXT_PUBLIC_FEATURE_PRIVACY` |
| `browse` | `NEXT_PUBLIC_FEATURE_BROWSE` |
| `sendToZulip` | `NEXT_PUBLIC_FEATURE_SEND_TO_ZULIP` |
| `rooms` | `NEXT_PUBLIC_FEATURE_ROOMS` |
### Setting Feature Flags
Feature flags are controlled via environment variables using the pattern `NEXT_PUBLIC_FEATURE_{FEATURE_NAME}` where `{FEATURE_NAME}` is the SCREAMING_SNAKE_CASE version of the feature name.
**Examples:**
```bash
# Enable user authentication requirement
NEXT_PUBLIC_FEATURE_REQUIRE_LOGIN=true
# Disable browse functionality
NEXT_PUBLIC_FEATURE_BROWSE=false
# Enable Zulip integration
NEXT_PUBLIC_FEATURE_SEND_TO_ZULIP=true
```
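The naming pattern can be checked mechanically. The sketch below derives the environment variable name from a camelCase feature name; `feature_env_var` is a hypothetical helper written for illustration, not a function from the Reflector codebase.

```python
import re


def feature_env_var(feature_name: str) -> str:
    """Map a camelCase feature name to its NEXT_PUBLIC_FEATURE_* variable name."""
    # Insert an underscore at every lower-to-upper boundary, then upper-case.
    snake = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", feature_name)
    return f"NEXT_PUBLIC_FEATURE_{snake.upper()}"


names = {name: feature_env_var(name) for name in ("requireLogin", "sendToZulip", "browse")}
```

For the flags listed in the table, this reproduces the documented variable names, for example `sendToZulip` maps to `NEXT_PUBLIC_FEATURE_SEND_TO_ZULIP`.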


@@ -6,6 +6,7 @@ services:
- 1250:1250
volumes:
- ./server/:/app/
- /app/.venv
env_file:
- ./server/.env
environment:
@@ -16,6 +17,7 @@ services:
context: server
volumes:
- ./server/:/app/
- /app/.venv
env_file:
- ./server/.env
environment:
@@ -26,6 +28,7 @@ services:
context: server
volumes:
- ./server/:/app/
- /app/.venv
env_file:
- ./server/.env
environment:
@@ -39,10 +42,26 @@ services:
image: node:18
ports:
- "3000:3000"
command: sh -c "yarn install && yarn dev"
command: sh -c "corepack enable && pnpm install && pnpm dev"
restart: unless-stopped
working_dir: /app
volumes:
- ./www:/app/
- /app/node_modules
env_file:
- ./www/.env.local
postgres:
image: postgres:17
ports:
- 5432:5432
environment:
POSTGRES_USER: reflector
POSTGRES_PASSWORD: reflector
POSTGRES_DB: reflector
volumes:
- ./data/postgres:/var/lib/postgresql/data
networks:
default:
attachable: true

369
docs/jitsi.md Normal file
View File

@@ -0,0 +1,369 @@
# Jitsi Integration for Reflector
This document contains research and planning notes for integrating Jitsi Meet as a replacement for Whereby in Reflector.
## Overview
Jitsi Meet is an open-source video conferencing solution that can replace Whereby in Reflector, providing:
- Cost reduction (no per-minute charges)
- Direct recording access via Jibri
- Real-time event webhooks
- Full customization and control
## Current Whereby Integration Analysis
### Architecture
1. **Room Creation**: User creates a "room" template in Reflector DB with settings
2. **Meeting Creation**: `/rooms/{room_name}/meeting` endpoint calls Whereby API to create meeting
3. **Recording**: Whereby handles recording automatically to S3 bucket
4. **Webhooks**: Whereby sends events for participant tracking
### Database Structure
```python
# Room = Template/Configuration (field types are assumed for illustration)
class Room:
    id: str
    name: str
    user_id: str
    recording_type: str  # e.g. cloud
    recording_trigger: str  # e.g. automatic-2nd-participant
    webhook_url: str | None
    webhook_secret: str | None

# Meeting = Actual Whereby Meeting Instance
class Meeting:
    id: str  # Whereby meetingId
    room_name: str  # Generated by Whereby
    room_url: str  # Whereby URL
    host_room_url: str  # Whereby host URL
    num_clients: int  # Updated via webhooks
```
## Jitsi Components
### Core Architecture
- **Jitsi Meet**: Web frontend (React)
- **Prosody**: XMPP server for messaging/rooms
- **Jicofo**: Conference focus (orchestration)
- **JVB**: Videobridge (media routing)
- **Jibri**: Recording service
- **Jigasi**: SIP gateway (optional, for phone dial-in)
### Exposure Requirements
- **Web service**: 443/80 (frontend)
- **JVB**: 10000/UDP (media streams) - **MUST EXPOSE**
- **Prosody**: 5280 (BOSH/WebSocket) - can proxy via web
- **Jicofo, Jibri, Jigasi**: Internal only
## Recording with Jibri
### How Jibri Works
- Each Jibri instance handles **one recording at a time**
- Records mixed audio/video to MP4 format
- Uses Chrome headless + ffmpeg for capture
- Supports finalize scripts for post-processing
### Jibri Pool for Scaling
- Multiple Jibri instances join "jibribrewery" MUC
- Jicofo distributes recording requests to available instances
- Automatic load balancing and failover
```yaml
# Multiple Jibri instances
jibri1:
  environment:
    - JIBRI_INSTANCE_ID=jibri1
    - JIBRI_BREWERY_MUC=jibribrewery
jibri2:
  environment:
    - JIBRI_INSTANCE_ID=jibri2
    - JIBRI_BREWERY_MUC=jibribrewery
```
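Since each instance handles one recording at a time, the pool size caps the number of concurrent recordings. The toy model below illustrates that constraint for capacity planning; `JibriPool` is a hypothetical class and does not describe how Jicofo actually selects instances.

```python
from typing import Dict, Iterable, Optional


class JibriPool:
    """Toy model: each instance records at most one room at a time."""

    def __init__(self, instance_ids: Iterable[str]) -> None:
        # None means the instance is idle.
        self.busy: Dict[str, Optional[str]] = {i: None for i in instance_ids}

    def start_recording(self, room: str) -> Optional[str]:
        # Hand the request to the first idle instance, if any.
        for instance_id, current in self.busy.items():
            if current is None:
                self.busy[instance_id] = room
                return instance_id
        return None  # pool exhausted: the request cannot be served right now

    def stop_recording(self, instance_id: str) -> None:
        self.busy[instance_id] = None


pool = JibriPool(["jibri1", "jibri2"])
first = pool.start_recording("room-a")
second = pool.start_recording("room-b")
third = pool.start_recording("room-c")  # no idle instance left
```

With two instances, the third simultaneous request gets no recorder until one of the first two recordings stops.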
### Recording Automation Options
1. **Environment Variables**: `ENABLE_RECORDING=1`, `AUTO_RECORDING=1`
2. **URL Parameters**: `?config.autoRecord=true`
3. **JWT Token**: Include recording permissions in JWT
4. **API Control**: `api.executeCommand('startRecording')`
### Post-Processing Integration
```bash
#!/bin/bash
# finalize.sh - runs after recording completion
# Note: the argument layout below is an assumption of these notes; verify it against your Jibri version.
RECORDING_FILE="$1"
MEETING_METADATA="$2"
ROOM_NAME="$3"
# Copy to Reflector-accessible location
cp "$RECORDING_FILE" /shared/reflector-uploads/
# Trigger Reflector processing
curl -X POST "http://reflector-api:8000/v1/transcripts/process" \
  -H "Content-Type: application/json" \
  -d "{
    \"file_path\": \"/shared/reflector-uploads/$(basename "$RECORDING_FILE")\",
    \"room_name\": \"$ROOM_NAME\",
    \"source\": \"jitsi\"
  }"
```
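Building the JSON body by string interpolation breaks as soon as a file or room name contains a quote or backslash. A minimal sketch of the same payload built with a JSON encoder is shown below; `build_process_payload` is a hypothetical helper, and the upload directory and field names are taken from the script in these notes rather than from an existing Reflector endpoint.

```python
import json
import posixpath


def build_process_payload(
    recording_file: str,
    room_name: str,
    upload_dir: str = "/shared/reflector-uploads",
) -> str:
    """Return the JSON body the finalize step would POST to Reflector."""
    # Keep only the file name; the file is assumed to have been copied to upload_dir.
    file_path = posixpath.join(upload_dir, posixpath.basename(recording_file))
    # json.dumps escapes quotes and backslashes that would break a hand-built string.
    return json.dumps(
        {"file_path": file_path, "room_name": room_name, "source": "jitsi"}
    )


body = build_process_payload('/recordings/abc/my "demo".mp4', "reflector-room-1")
```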
## React Integration
### Official React SDK
```bash
npm i @jitsi/react-sdk
```
```jsx
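import { JitsiMeeting } from '@jitsi/react-sdk'

// Hypothetical wrapper component; domain, room name and JWT are placeholders.
export function ReflectorMeeting() {
  return (
    <JitsiMeeting
      domain="your-jitsi.domain"
      roomName="meeting-room"
      jwt="your-jwt-token"
      configOverwrite={{
        startWithAudioMuted: true,
        fileRecordingsEnabled: true,
        autoRecord: true, // as listed in these notes; not verified against the Jitsi config
      }}
      onApiReady={(api) => {
        // Events are delivered through the iframe API object rather than props
        api.addListener('participantJoined', (participant) => {
          // Track participant events
        })
        api.addListener('recordingStatusChanged', (status) => {
          // Handle recording events
        })
      }}
    />
  )
}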
```
## Authentication & Room Control
### JWT-Based Access Control
```python
import jwt  # PyJWT

# JITSI_JWT_SECRET must match the JWT_APP_SECRET configured on the Jitsi side
def generate_jitsi_jwt(payload):
    return jwt.encode({
        "aud": "jitsi",
        "iss": "reflector",
        "sub": "reflector-user",
        "room": payload["room"],
        "exp": int(payload["exp"].timestamp()),
        "context": {
            "user": {
                "name": payload["user_name"],
                "moderator": payload.get("moderator", False)
            },
            "features": {
                "recording": payload.get("recording", True)
            }
        }
    }, JITSI_JWT_SECRET, algorithm="HS256")
```
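To make the token format concrete, the sketch below signs and verifies an HS256 JWT using only the standard library. It is an illustration of the mechanism, not a replacement for a JWT library: `sign_hs256` and `verify_hs256` are hypothetical helpers, and the verification checks the signature only, not `exp` or `aud`.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    # JWTs use URL-safe base64 without padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _mac(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256).digest()


def sign_hs256(claims: dict, secret: str) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    return f"{signing_input}.{_b64url(_mac(signing_input, secret))}"


def verify_hs256(token: str, secret: str) -> dict:
    signing_input, _, signature = token.rpartition(".")
    # Constant-time comparison of the expected and presented signatures.
    if not hmac.compare_digest(_mac(signing_input, secret), _b64url_decode(signature)):
        raise ValueError("invalid signature")
    return json.loads(_b64url_decode(signing_input.split(".")[1]))


token = sign_hs256({"iss": "reflector", "room": "reflector-demo"}, "your-secret-key")
claims = verify_hs256(token, "your-secret-key")
```

A token signed with a different secret fails verification, which is what lets the Jitsi side reject rooms that Reflector did not authorize.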
### Prevent Anonymous Room Creation
```bash
# Environment configuration
ENABLE_AUTH=1
ENABLE_GUESTS=0
AUTH_TYPE=jwt
JWT_APP_ID=reflector
JWT_APP_SECRET=your-secret-key
```
## Webhook Integration
### Real-time Events via Prosody
Custom event-sync module can send webhooks for:
- Participant join/leave
- Recording start/stop
- Room creation/destruction
- Mute/unmute events
```lua
-- mod_event_sync.lua
module:hook("muc-occupant-joined", function(event)
    send_event({
        type = "participant_joined",
        room = event.room.jid,
        participant = {
            nick = event.occupant.nick,
            jid = event.occupant.jid,
        },
        timestamp = os.time(),
    });
end);
```
### Jibri Recording Webhooks
```bash
# Environment variable
JIBRI_WEBHOOK_SUBSCRIBERS=https://your-reflector.com/webhooks/jibri
```
## Proposed Reflector Integration
### Modified Database Schema
```python
from typing import Optional

from pydantic import BaseModel


class Meeting(BaseModel):
    id: str  # Our generated meeting ID
    room_name: str  # Generated: reflector-{room.name}-{timestamp}
    room_url: str  # https://jitsi.domain/room_name?jwt=token
    host_room_url: str  # Same but with moderator JWT
    # Add Jitsi-specific fields
    jitsi_jwt: str  # JWT token
    jitsi_room_id: str  # Internal room identifier
    recording_status: str  # pending, recording, completed
    recording_file_path: Optional[str] = None
```
### API Replacement
```python
# Replace whereby.py with jitsi.py
import time
import uuid
from datetime import datetime


async def create_meeting(room_name_prefix: str, end_date: datetime, room: Room):
    # Generate unique room name
    jitsi_room = f"reflector-{room.name}-{int(time.time())}"
    # Generate JWT tokens (generate_jwt is assumed to be a thin
    # keyword-argument wrapper around generate_jitsi_jwt)
    user_jwt = generate_jwt(room=jitsi_room, moderator=False, exp=end_date)
    host_jwt = generate_jwt(room=jitsi_room, moderator=True, exp=end_date)
    return {
        "meetingId": str(uuid.uuid4()),  # Our ID
        "roomName": jitsi_room,
        "roomUrl": f"https://jitsi.domain/{jitsi_room}?jwt={user_jwt}",
        "hostRoomUrl": f"https://jitsi.domain/{jitsi_room}?jwt={host_jwt}",
        "startDate": datetime.now().isoformat(),
        "endDate": end_date.isoformat(),
    }
```
### Webhook Endpoints
```python
# Replace whereby webhook with jitsi webhooks
from fastapi import HTTPException


@router.post("/jitsi/events")
async def jitsi_events_webhook(event_data: dict):
    event_type = event_data.get("event")
    room_name = event_data.get("room", "").split("@")[0]
    meeting = await Meeting.get_by_room(room_name)
    if meeting is None:
        raise HTTPException(status_code=404, detail="Meeting not found")
    if event_type == "muc-occupant-joined":
        # Update participant count
        meeting.num_clients += 1
    elif event_type == "muc-occupant-left":
        meeting.num_clients = max(0, meeting.num_clients - 1)
    elif event_type == "jibri-recording-on":
        meeting.recording_status = "recording"
    elif event_type == "jibri-recording-off":
        meeting.recording_status = "processing"
        # .delay() (Celery-style) enqueues the task and returns at once;
        # its result is not awaitable
        process_meeting_recording.delay(meeting.id)


@router.post("/jibri/recording-complete")
async def recording_complete(data: dict):
    # Handle finalize script webhook
    room_name = data.get("room_name")
    file_path = data.get("file_path")
    meeting = await Meeting.get_by_room(room_name)
    if meeting is None:
        raise HTTPException(status_code=404, detail="Meeting not found")
    meeting.recording_file_path = file_path
    meeting.recording_status = "completed"
    # Start Reflector processing
    await process_recording_for_transcription(meeting.id, file_path)
```
## Deployment with Docker
### Official docker-jitsi-meet
```bash
# Download official release
wget $(wget -q -O - https://api.github.com/repos/jitsi/docker-jitsi-meet/releases/latest | grep zip | cut -d\" -f4)
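# Setup (run inside the extracted release directory)
cp env.example .env # gen-passwords.sh writes into this file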
mkdir -p ~/.jitsi-meet-cfg/{web,transcripts,prosody/config,prosody/prosody-plugins-custom,jicofo,jvb,jigasi,jibri}
./gen-passwords.sh # Generate secure passwords
docker compose up -d
```
### Coolify Integration
```yaml
services:
  web:
    ports: ["80:80", "443:443"]
  jvb:
    ports: ["10000:10000/udp"] # Must expose for media
  jibri1:
    environment:
      - JIBRI_INSTANCE_ID=jibri1
      - JIBRI_FINALIZE_RECORDING_SCRIPT_PATH=/config/finalize.sh
  jibri2:
    environment:
      - JIBRI_INSTANCE_ID=jibri2
```
## Benefits vs Whereby
### Cost & Control
- **No per-minute charges** - significant cost savings
- **Full recording control** - direct file access
- **Custom branding** - complete UI control
- **Self-hosted** - no vendor lock-in
### Technical Advantages
- **Real-time events** - immediate webhook notifications
- **Rich participant metadata** - detailed tracking
- **JWT security** - token-based access with expiration
- **Multiple recording formats** - audio-only options
- **Scalable architecture** - horizontal Jibri scaling
### Integration Benefits
- **Same API surface** - minimal changes to existing code
- **React SDK** - better frontend integration
- **Direct processing** - no S3 download delays
- **Event-driven architecture** - better real-time capabilities
## Implementation Plan
1. **Deploy Jitsi Stack** - Set up docker-jitsi-meet with multiple Jibri instances
2. **Create jitsi.py** - Replace whereby.py with Jitsi API functions
3. **Update Database** - Add Jitsi-specific fields to Meeting model
4. **Webhook Integration** - Replace Whereby webhooks with Jitsi events
5. **Frontend Updates** - Replace Whereby embed with Jitsi React SDK
6. **Testing & Migration** - Gradual rollout with fallback to Whereby
## Recording Limitations & Considerations
### Current Limitations
- **Mixed audio only** - Jibri doesn't separate participant tracks natively
- **One recording per Jibri** - requires multiple instances for concurrent recordings
- **Chrome dependency** - Jibri uses headless Chrome for recording
### Metadata Capabilities
- **Participant join/leave timestamps** - via webhooks
- **Speaking time tracking** - via audio level events
- **Meeting duration** - precise timing
- **Room-specific data** - custom metadata in JWT
### Alternative Recording Methods
- **Local recording** - browser-based, per-participant
- **Custom recording** - lib-jitsi-meet for individual streams
- **Third-party solutions** - Recall.ai, Otter.ai integrations
## Security Considerations
### JWT Configuration
- **Room-specific tokens** - limit access to specific rooms
- **Time-based expiration** - automatic cleanup
- **Feature permissions** - control recording, moderation rights
- **User identification** - embed user metadata in tokens
### Access Control
- **No anonymous rooms** - all rooms require valid JWT
- **API-only creation** - prevent direct room access
- **Webhook verification** - HMAC signature validation
## Next Steps
1. **Deploy test Jitsi instance** - validate recording pipeline
2. **Prototype jitsi.py** - create equivalent API functions
3. **Test webhook integration** - ensure event delivery works
4. **Performance testing** - validate multiple concurrent recordings
5. **Migration strategy** - plan gradual transition from Whereby
---
*This document serves as the comprehensive planning and research notes for Jitsi integration in Reflector. It should be updated as implementation progresses and new insights are discovered.*

720
docs/video-jitsi.md Normal file

@@ -0,0 +1,720 @@
# Jitsi Meet Integration Configuration Guide
This guide explains how to configure Reflector to use your self-hosted Jitsi Meet installation for video meetings, recording, and participant tracking.
## Overview
Jitsi Meet is an open-source video conferencing platform that can be self-hosted. Reflector integrates with Jitsi Meet to:
- Create secure meeting rooms with JWT authentication
- Track participant join/leave events via Prosody webhooks
- Record meetings using Jibri recording service
- Process recordings for transcription and analysis
## Requirements
### Self-Hosted Jitsi Meet
You need a complete Jitsi Meet installation including:
1. **Jitsi Meet Web Interface** - The main meeting interface
2. **Prosody XMPP Server** - Handles room management and authentication
3. **Jicofo (JItsi COnference FOcus)** - Manages media sessions
4. **Jitsi Videobridge (JVB)** - Handles WebRTC media routing
5. **Jibri Recording Service** - Records meetings (optional but recommended)
### System Requirements
- **Domain with SSL Certificate** - Required for WebRTC functionality
- **Prosody mod_event_sync** - For webhook event handling
- **JWT Authentication** - For secure room access control
- **Storage Solution** - For recording files (local or cloud)
## Configuration Variables
Add the following environment variables to your Reflector `.env` file:
### Required Variables
```bash
# Jitsi Meet Domain (without https://)
JITSI_DOMAIN=meet.example.com
# JWT Secret for room authentication (generate with: openssl rand -hex 32)
JITSI_JWT_SECRET=your-64-character-hex-secret-here
# Webhook secret for event handling (generate with: openssl rand -hex 16)
JITSI_WEBHOOK_SECRET=your-32-character-hex-secret-here
```
### Optional Variables
```bash
# Application identifier (should match Jitsi configuration)
JITSI_APP_ID=reflector
# JWT issuer and audience (should match Jitsi configuration)
JITSI_JWT_ISSUER=reflector
JITSI_JWT_AUDIENCE=jitsi
```
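One way these variables could be read and validated at startup is sketched below. `JitsiSettings` and `load_jitsi_settings` are hypothetical names chosen for illustration, not Reflector's actual settings module; the defaults simply mirror the optional values listed above.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class JitsiSettings:
    """Hypothetical container for the JITSI_* variables described above."""
    domain: str
    jwt_secret: str
    webhook_secret: str
    app_id: str = "reflector"
    jwt_issuer: str = "reflector"
    jwt_audience: str = "jitsi"


def load_jitsi_settings(env: Mapping[str, str] = os.environ) -> JitsiSettings:
    """Read the JITSI_* variables, failing fast if a required one is missing."""
    required = ("JITSI_DOMAIN", "JITSI_JWT_SECRET", "JITSI_WEBHOOK_SECRET")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing required variables: {', '.join(missing)}")
    domain = env["JITSI_DOMAIN"]
    # The guide asks for the bare host name, without https://
    if "://" in domain:
        raise RuntimeError("JITSI_DOMAIN must not include a scheme")
    return JitsiSettings(
        domain=domain,
        jwt_secret=env["JITSI_JWT_SECRET"],
        webhook_secret=env["JITSI_WEBHOOK_SECRET"],
        app_id=env.get("JITSI_APP_ID", "reflector"),
        jwt_issuer=env.get("JITSI_JWT_ISSUER", "reflector"),
        jwt_audience=env.get("JITSI_JWT_AUDIENCE", "jitsi"),
    )
```

Failing at startup rather than at the first meeting request makes a missing secret visible immediately instead of surfacing later as an HTTP 500.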
## Installation Steps
### 1. Jitsi Meet Server Installation
#### Quick Installation (Ubuntu/Debian)
```bash
# Add Jitsi repository
curl -fsSL https://download.jitsi.org/jitsi-key.gpg.key | sudo gpg --dearmor -o /usr/share/keyrings/jitsi-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/jitsi-keyring.gpg] https://download.jitsi.org stable/" | sudo tee /etc/apt/sources.list.d/jitsi-stable.list
# Install Jitsi Meet
sudo apt update
sudo apt install jitsi-meet
# Configure SSL certificate
sudo /usr/share/jitsi-meet/scripts/install-letsencrypt-cert.sh
```
#### Docker Installation
```bash
# Clone Jitsi Docker repository
git clone https://github.com/jitsi/docker-jitsi-meet
cd docker-jitsi-meet
# Copy environment template
cp env.example .env
# Edit configuration
nano .env
# Generate strong internal passwords (written into .env)
./gen-passwords.sh
# Start services
docker compose up -d
```
### 2. JWT Authentication Setup
#### Update Prosody Configuration
Edit `/etc/prosody/conf.d/your-domain.cfg.lua`:
```lua
VirtualHost "meet.example.com"
    authentication = "token"
    app_id = "reflector"
    app_secret = "your-jwt-secret-here"
    -- Do not force TLS on client-to-server connections
    c2s_require_encryption = false
    admins = { "focusUser@auth.meet.example.com" }
    modules_enabled = {
        "bosh";
        "pubsub";
        "ping";
        "roster";
        "saslauth";
        "tls";
        "dialback";
        "disco";
        "carbons";
        "pep";
        "private";
        "blocklist";
        "vcard";
        "version";
        "uptime";
        "time";
        "register";
        "admin_adhoc";
        "token_verification";
        "event_sync"; -- Required for webhooks
    }
```
#### Configure Jitsi Meet Interface
Edit `/etc/jitsi/meet/your-domain-config.js`:
```javascript
var config = {
    hosts: {
        domain: 'meet.example.com',
        muc: 'conference.meet.example.com'
    },
    // Enable JWT authentication
    enableUserRolesBasedOnToken: true,
    // Recording configuration
    fileRecordingsEnabled: true,
    liveStreamingEnabled: false,
    // Reflector integration settings
    prejoinPageEnabled: true,
    requireDisplayName: true
};
```
### 3. Webhook Event Configuration
#### Install Event Sync Module
```bash
# Download the module
cd /usr/share/jitsi-meet/prosody-plugins/
wget https://raw.githubusercontent.com/jitsi-contrib/prosody-plugins/main/mod_event_sync.lua
```
#### Configure Event Sync
Add to your Prosody configuration:
```lua
Component "conference.meet.example.com" "muc"
    storage = "memory"
    modules_enabled = {
        "muc_meeting_id";
        "muc_domain_mapper";
        "polls";
        "event_sync"; -- Enable event sync
    }
    -- Event sync webhook configuration
    event_sync_url = "https://your-reflector-domain.com/v1/jitsi/events"
    event_sync_secret = "your-webhook-secret-here"
    -- Events to track
    event_sync_events = {
        "muc-occupant-joined",
        "muc-occupant-left",
        "jibri-recording-on",
        "jibri-recording-off"
    }
```
#### Webhook Event Payload Examples
**Participant Joined Event:**
```json
{
  "event": "muc-occupant-joined",
  "room": "reflector-my-room-uuid123",
  "timestamp": "2025-01-15T10:30:00.000Z",
  "data": {
    "occupant_id": "participant-456",
    "nick": "John Doe",
    "role": "participant",
    "affiliation": "none"
  }
}
```
**Recording Started Event:**
```json
{
  "event": "jibri-recording-on",
  "room": "reflector-my-room-uuid123",
  "timestamp": "2025-01-15T10:32:00.000Z",
  "data": {
    "recording_id": "rec-789",
    "initiator": "moderator-123"
  }
}
```
**Recording Completed Event:**
```json
{
  "room_name": "reflector-my-room-uuid123",
  "recording_file": "/var/recordings/rec-789.mp4",
  "recording_status": "completed",
  "timestamp": "2025-01-15T11:15:00.000Z"
}
```
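A receiver for the event payloads above might normalise them before doing anything else. The following is a minimal sketch, assuming the field names shown in the examples; `JitsiEvent` and `parse_jitsi_event` are hypothetical names, and stripping a MUC domain from `room` is a defensive assumption in case Prosody sends a full room JID.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict


@dataclass
class JitsiEvent:
    """Hypothetical normalised form of an event-sync webhook payload."""
    event: str
    room: str
    timestamp: datetime
    data: Dict[str, Any]


def parse_jitsi_event(raw_body: bytes) -> JitsiEvent:
    """Parse the raw request body into a JitsiEvent."""
    payload = json.loads(raw_body)
    # The room may arrive as a full MUC JID (room@conference.domain);
    # keep only the local part so it matches the stored room name.
    room = payload["room"].split("@")[0]
    # Older Python versions reject a trailing "Z" in fromisoformat().
    timestamp = datetime.fromisoformat(payload["timestamp"].replace("Z", "+00:00"))
    return JitsiEvent(
        event=payload["event"],
        room=room,
        timestamp=timestamp,
        data=payload.get("data", {}),
    )
```

Parsing from the raw bytes, rather than from an already-decoded object, keeps the exact body available for signature verification.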
### 4. Jibri Recording Setup (Optional)
#### Install Jibri
```bash
# Install Jibri package
sudo apt install jibri
# Create recording directory
sudo mkdir -p /var/recordings
sudo chown jibri:jibri /var/recordings
```
#### Configure Jibri
Edit `/etc/jitsi/jibri/jibri.conf`:
```hocon
jibri {
  recording {
    recordings-directory = "/var/recordings"
    finalize-script = "/opt/jitsi/jibri/finalize.sh"
  }
  api {
    xmpp {
      environments = [{
        name = "prod environment"
        xmpp-server-hosts = ["meet.example.com"]
        xmpp-domain = "meet.example.com"
        control-muc {
          domain = "internal.auth.meet.example.com"
          room-name = "JibriBrewery"
          nickname = "jibri-nickname"
        }
        control-login {
          domain = "auth.meet.example.com"
          username = "jibri"
          password = "jibri-password"
        }
      }]
    }
  }
}
```
#### Create Finalize Script
Create `/opt/jitsi/jibri/finalize.sh`:
```bash
#!/bin/bash
# Jibri finalize script for Reflector integration
RECORDING_FILE="$1"
ROOM_NAME="$2"
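REFLECTOR_API_URL="${REFLECTOR_API_URL:-http://localhost:1250}"
# Abort early if the signing secret is missing from Jibri's environment
: "${JITSI_WEBHOOK_SECRET:?JITSI_WEBHOOK_SECRET must be set}"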
# Prepare webhook payload
TIMESTAMP=$(date -u +%Y-%m-%dT%H:%M:%S.%3NZ)
PAYLOAD=$(cat <<EOF
{
"room_name": "$ROOM_NAME",
"recording_file": "$RECORDING_FILE",
"recording_status": "completed",
"timestamp": "$TIMESTAMP"
}
EOF
)
# Generate signature
SIGNATURE=$(echo -n "$PAYLOAD" | openssl dgst -sha256 -hmac "$JITSI_WEBHOOK_SECRET" | cut -d' ' -f2)
# Send webhook to Reflector
curl --fail -X POST "$REFLECTOR_API_URL/v1/jibri/recording-complete" \
  -H "Content-Type: application/json" \
  -H "X-Jitsi-Signature: $SIGNATURE" \
  -d "$PAYLOAD" \
  && echo "Recording finalization webhook sent for room: $ROOM_NAME"
```
Make executable:
```bash
sudo chmod +x /opt/jitsi/jibri/finalize.sh
```
### 5. Restart Services
After configuration changes:
```bash
sudo systemctl restart prosody
sudo systemctl restart jicofo
sudo systemctl restart jitsi-videobridge2
sudo systemctl restart jibri
sudo systemctl restart nginx
```
## Room Configuration
### Creating Jitsi Rooms
Create a Reflector room that uses the Jitsi platform:
```bash
curl -X POST "https://your-reflector-domain.com/v1/rooms" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $AUTH_TOKEN" \
-d '{
"name": "my-jitsi-room",
"platform": "jitsi",
"recording_type": "cloud",
"recording_trigger": "automatic-2nd-participant",
"is_locked": false,
"room_mode": "normal"
}'
```
### Meeting Creation
Meetings automatically use JWT authentication:
```bash
curl -X POST "https://your-reflector-domain.com/v1/rooms/my-jitsi-room/meeting" \
-H "Authorization: Bearer $AUTH_TOKEN"
```
Response includes JWT-authenticated URLs:
```json
{
  "id": "meeting-uuid",
  "room_name": "reflector-my-jitsi-room-123456",
  "room_url": "https://meet.example.com/reflector-my-jitsi-room-123456?jwt=user-token",
  "host_room_url": "https://meet.example.com/reflector-my-jitsi-room-123456?jwt=moderator-token"
}
```
## Features and Capabilities
### JWT Authentication
Reflector automatically generates JWT tokens with:
- **Room Access Control** - Secure room entry
- **User Roles** - Moderator vs participant permissions
- **Expiration** - Configurable token lifetime (default 8 hours)
- **Custom Claims** - Room-specific metadata
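To make the token contents concrete, here is a minimal standard-library sketch of an HS256 room token with the 8-hour default lifetime. It is an illustration only: `build_room_token` is a hypothetical helper, the claim layout follows common Jitsi token conventions rather than Reflector's actual implementation, and a maintained JWT library is the safer choice in production.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(raw: bytes) -> str:
    """Base64url-encode without padding, as JWT requires."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def build_room_token(
    secret: str,
    room: str,
    user_name: str,
    moderator: bool,
    lifetime_seconds: int = 8 * 3600,  # default lifetime: 8 hours
    now: Optional[int] = None,
    issuer: str = "reflector",
    audience: str = "jitsi",
) -> str:
    """Return a signed HS256 JWT restricted to one room (hypothetical helper)."""
    issued_at = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {
        "aud": audience,
        "iss": issuer,
        "room": room,  # room access control
        "exp": issued_at + lifetime_seconds,  # expiration
        "context": {"user": {"name": user_name, "moderator": moderator}},
    }
    segments = [
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    ]
    signing_input = ".".join(segments)
    signature = hmac.new(
        secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return f"{signing_input}.{_b64url(signature)}"
```

The moderator and participant URLs differ only in the `moderator` flag inside the signed claims, which is why the secret must match the one Prosody uses for verification.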
### Recording Options
**Recording Types:**
- `"none"` - No recording
- `"local"` - Local Jibri recording
- `"cloud"` - Cloud recording (requires external storage)
**Recording Triggers:**
- `"none"` - Manual recording only
- `"prompt"` - Prompt users to start
- `"automatic"` - Start immediately
- `"automatic-2nd-participant"` - Start when 2nd person joins
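The trigger semantics above could be expressed as a small decision function. This is a sketch under stated assumptions: `RecordingAction` and `recording_action` are hypothetical names, and the participant-count thresholds are inferred from the trigger descriptions rather than taken from Reflector's code.

```python
from enum import Enum


class RecordingAction(Enum):
    NOTHING = "nothing"
    PROMPT = "prompt"
    START = "start"


def recording_action(
    trigger: str, participant_count: int, already_recording: bool
) -> RecordingAction:
    """Decide what to do about recording when the participant count changes."""
    if already_recording or participant_count < 1:
        return RecordingAction.NOTHING
    if trigger == "automatic":
        # Start as soon as anyone is in the room
        return RecordingAction.START
    if trigger == "automatic-2nd-participant":
        if participant_count >= 2:
            return RecordingAction.START
        return RecordingAction.NOTHING
    if trigger == "prompt":
        return RecordingAction.PROMPT
    if trigger == "none":
        # Manual recording only
        return RecordingAction.NOTHING
    raise ValueError(f"Unknown recording trigger: {trigger!r}")
```

Such a function would typically be called from the `muc-occupant-joined` handler, with the result deciding whether to issue a start-recording request.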
### Event Tracking and Storage
Reflector automatically stores all webhook events in the `meeting` table for comprehensive meeting analytics:
**Supported Event Types:**
- `muc-occupant-joined` - Participant joined the meeting
- `muc-occupant-left` - Participant left the meeting
- `jibri-recording-on` - Recording started
- `jibri-recording-off` - Recording stopped
- `recording_completed` - Recording file ready for processing
**Event Storage Structure:**
Each webhook event is stored as a JSON object in the `meeting.events` column:
```json
{
  "type": "muc-occupant-joined",
  "timestamp": "2025-01-15T10:30:00.123456Z",
  "data": {
    "timestamp": "2025-01-15T10:30:00Z",
    "user_id": "participant-123",
    "display_name": "John Doe"
  }
}
```
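The stored structure can be sketched as a thin wrapper around the incoming payload, with a receive timestamp added at the outer level. The helper names below (`make_stored_event`, `current_participants`) are hypothetical; the second function shows how a participant count could be derived from the stored list, assuming every join and leave is delivered exactly once.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


def make_stored_event(
    event_type: str, data: Dict[str, Any], received_at: Optional[datetime] = None
) -> Dict[str, Any]:
    """Wrap a webhook payload in the stored-event shape shown above."""
    received_at = received_at or datetime.now(timezone.utc)
    return {
        "type": event_type,
        # Outer timestamp: when Reflector received the event (UTC, "Z" suffix)
        "timestamp": received_at.isoformat().replace("+00:00", "Z"),
        "data": data,
    }


def current_participants(events: List[Dict[str, Any]]) -> int:
    """Replay join/leave events in order to derive the participant count."""
    count = 0
    for event in events:
        if event["type"] == "muc-occupant-joined":
            count += 1
        elif event["type"] == "muc-occupant-left":
            # Never go negative if a leave arrives without a matching join
            count = max(0, count - 1)
    return count
```

Keeping the raw `data` untouched means later analytics can use fields that were not anticipated when the event was stored.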
**Querying Stored Events:**
```sql
-- Get all events for a meeting
SELECT events FROM meeting WHERE id = 'meeting-uuid';
-- Count participant joins (PostgreSQL 12+)
SELECT jsonb_array_length(
  jsonb_path_query_array(events::jsonb, '$[*] ? (@.type == "muc-occupant-joined")')
) AS total_joins FROM meeting WHERE id = 'meeting-uuid';
```
## Testing and Verification
### Health Check
Test Jitsi webhook integration:
```bash
curl "https://your-reflector-domain.com/v1/jitsi/health"
```
Expected response:
```json
{
  "status": "ok",
  "service": "jitsi-webhooks",
  "timestamp": "2025-01-15T10:30:00.000Z",
  "webhook_secret_configured": true
}
```
### JWT Token Testing
Verify JWT generation works:
```bash
# Create a test meeting and extract its participant URL
MEETING_URL=$(curl -s -X POST "https://your-reflector-domain.com/v1/rooms/test-room/meeting" \
  -H "Authorization: Bearer $AUTH_TOKEN" | jq -r '.room_url')
echo "Test meeting URL: $MEETING_URL"
```
### Webhook Testing
#### Manual Webhook Event Testing
Test participant join event:
```bash
# Generate proper signature
PAYLOAD='{"event":"muc-occupant-joined","room":"reflector-test-room-uuid","timestamp":"2025-01-15T10:30:00.000Z","data":{"user_id":"test-user","display_name":"Test User"}}'
SIGNATURE=$(echo -n "$PAYLOAD" | openssl dgst -sha256 -hmac "$JITSI_WEBHOOK_SECRET" | cut -d' ' -f2)
curl -X POST "https://your-reflector-domain.com/v1/jitsi/events" \
-H "Content-Type: application/json" \
-H "X-Jitsi-Signature: $SIGNATURE" \
-d "$PAYLOAD"
```
Expected response:
```json
{
  "status": "ok",
  "event": "muc-occupant-joined",
  "room": "reflector-test-room-uuid"
}
```
#### Recording Webhook Testing
Test recording completion event:
```bash
PAYLOAD='{"room_name":"reflector-test-room-uuid","recording_file":"/recordings/test.mp4","recording_status":"completed","timestamp":"2025-01-15T10:30:00.000Z"}'
SIGNATURE=$(echo -n "$PAYLOAD" | openssl dgst -sha256 -hmac "$JITSI_WEBHOOK_SECRET" | cut -d' ' -f2)
curl -X POST "https://your-reflector-domain.com/v1/jibri/recording-complete" \
-H "Content-Type: application/json" \
-H "X-Jitsi-Signature: $SIGNATURE" \
-d "$PAYLOAD"
```
#### Event Storage Verification
Verify events were stored:
```bash
# Check meeting events via API (requires authentication)
curl -H "Authorization: Bearer $AUTH_TOKEN" \
"https://your-reflector-domain.com/v1/meetings/{meeting-id}"
```
## Troubleshooting
### Common Issues
#### JWT Authentication Failures
**Symptoms**: Users cannot join rooms, "Authentication failed" errors
**Solutions**:
1. Verify `JITSI_JWT_SECRET` matches Prosody configuration
2. Check JWT token hasn't expired (default 8 hours)
3. Ensure system clocks are synchronized between servers
4. Validate JWT issuer/audience configuration matches
**Debug JWT tokens**:
```bash
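# Decode JWT payload (base64url: map -_ back to +/ and restore stripped padding)
SEGMENT=$(echo "JWT_TOKEN_HERE" | cut -d'.' -f2 | tr '_-' '/+')
while [ $(( ${#SEGMENT} % 4 )) -ne 0 ]; do SEGMENT="${SEGMENT}="; done
echo "$SEGMENT" | base64 -d | jq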
```
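The same inspection can be done in Python, which also makes the expiry check from the solutions list explicit. This is a debugging sketch only: `decode_jwt_payload` and `seconds_until_expiry` are hypothetical helpers, and the payload is decoded without verifying the signature.

```python
import base64
import json
import time
from typing import Any, Dict, Optional


def decode_jwt_payload(token: str) -> Dict[str, Any]:
    """Decode the claims of a JWT WITHOUT verifying it (debugging only)."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("A JWT must have three dot-separated segments")
    segment = parts[1]
    # JWT segments are base64url without padding; restore it before decoding
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def seconds_until_expiry(claims: Dict[str, Any], now: Optional[float] = None) -> float:
    """Positive while the token is valid, negative once it has expired."""
    now = time.time() if now is None else now
    return claims["exp"] - now
```

A negative result on one machine but not another usually points at the clock-synchronisation issue mentioned above rather than at the token itself.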
#### Webhook Events Not Received
**Symptoms**: Participant counts not updating, no recording events
**Solutions**:
1. Verify `mod_event_sync` is loaded in Prosody
2. Check webhook URL is accessible from Jitsi server
3. Validate webhook signature generation
4. Review Prosody and Reflector logs
**Debug webhook connectivity**:
```bash
# Test from Jitsi server
curl -v "https://your-reflector-domain.com/v1/jitsi/health"
# Check Prosody logs
sudo tail -f /var/log/prosody/prosody.log
```
#### Webhook Signature Verification Issues
**Symptoms**: HTTP 401 "Invalid webhook signature" errors
**Solutions**:
1. Verify webhook secret matches between Jitsi and Reflector
2. Check payload encoding (no extra whitespace)
3. Ensure proper HMAC-SHA256 signature generation
**Debug signature generation**:
```bash
# Test signature manually
PAYLOAD='{"event":"test","room":"test","timestamp":"2025-01-15T10:30:00.000Z","data":{}}'
SECRET="your-webhook-secret-here"
# Generate signature (should match X-Jitsi-Signature header)
echo -n "$PAYLOAD" | openssl dgst -sha256 -hmac "$SECRET" | cut -d' ' -f2
# Test with curl
curl -X POST "https://your-reflector-domain.com/v1/jitsi/events" \
-H "Content-Type: application/json" \
-H "X-Jitsi-Signature: $(echo -n "$PAYLOAD" | openssl dgst -sha256 -hmac "$SECRET" | cut -d' ' -f2)" \
-d "$PAYLOAD" -v
```
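On the receiving side, the check that mirrors the `openssl` command is an HMAC-SHA256 over the exact request bytes, compared in constant time. The sketch below assumes the signature arrives hex-encoded in `X-Jitsi-Signature`, as in the curl examples; `sign_payload` and `is_valid_signature` are hypothetical names, not Reflector's actual functions.

```python
import hashlib
import hmac


def sign_payload(raw_body: bytes, secret: str) -> str:
    """Hex HMAC-SHA256 of the raw body, matching `openssl dgst -sha256 -hmac`."""
    return hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()


def is_valid_signature(raw_body: bytes, header_value: str, secret: str) -> bool:
    """Compare the received signature against the expected one in constant time."""
    expected = sign_payload(raw_body, secret)
    received = header_value.strip().lower()
    # Compare as bytes so unexpected non-ASCII input cannot raise TypeError
    return hmac.compare_digest(
        expected.encode("ascii"), received.encode("utf-8")
    )
```

The signature must be computed over the bytes as received. Re-serialising parsed JSON can change whitespace or key order, which is the "extra whitespace" failure listed in the solutions above.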
#### Event Storage Problems
**Symptoms**: Events received but not stored in database
**Solutions**:
1. Check database connectivity and permissions
2. Verify meeting exists before event processing
3. Review Reflector application logs
4. Ensure JSON column support in database
**Debug event storage**:
```bash
# Check meeting exists
curl -H "Authorization: Bearer $AUTH_TOKEN" \
"https://your-reflector-domain.com/v1/meetings/{meeting-id}"
# Monitor database queries (if using PostgreSQL)
sudo -u postgres psql -c "SELECT * FROM pg_stat_activity WHERE query LIKE '%meeting%';"
# Check Reflector logs for event processing
sudo journalctl -u reflector -f | grep -E "(event|webhook|jitsi)"
```
#### Recording Issues
**Symptoms**: Recordings not starting, finalize script errors
**Solutions**:
1. Verify Jibri service status: `sudo systemctl status jibri`
2. Check recording directory permissions: `/var/recordings`
3. Validate finalize script execution permissions
4. Monitor Jibri logs: `sudo journalctl -u jibri -f`
**Test finalize script**:
```bash
sudo -u jibri /opt/jitsi/jibri/finalize.sh "/test/recording.mp4" "test-room"
```
#### Meeting Creation Failures
**Symptoms**: HTTP 500 errors when creating meetings
**Solutions**:
1. Check Reflector logs for JWT generation errors
2. Verify all required environment variables are set
3. Ensure Jitsi domain is accessible from Reflector
4. Test JWT secret configuration
### Debug Commands
```bash
# Verify Prosody configuration
sudo prosodyctl check config
# Check Jitsi services status
sudo systemctl status prosody jicofo jitsi-videobridge2
# Test JWT generation
curl -X POST "https://your-reflector-domain.com/v1/rooms/test/meeting" \
-H "Authorization: Bearer $AUTH_TOKEN" -v
# Monitor webhook events
sudo tail -f /var/log/reflector/app.log | grep jitsi
# Check SSL certificates
sudo certbot certificates
```
### Performance Optimization
#### Scaling Considerations
**Single Server Limits:**
- ~50 concurrent participants per JVB instance
- ~10 concurrent Jibri recordings, each requiring its own Jibri instance
- CPU and bandwidth become bottlenecks
**Multi-Server Setup:**
- Multiple JVB instances for scaling
- Dedicated Jibri recording servers
- Load balancing for high availability
#### Resource Monitoring
```bash
# Monitor JVB performance
sudo systemctl status jitsi-videobridge2
sudo journalctl -u jitsi-videobridge2 -f
# Check Prosody connections (admin shell, Prosody 0.12+)
sudo prosodyctl shell
> c2s:show()
> muc:list("conference.meet.example.com")
```
## Security Best Practices
### JWT Security
- Use strong, unique secrets (32+ characters)
- Rotate JWT secrets regularly
- Implement proper token expiration
- Never log or expose JWT tokens
### Network Security
- Use HTTPS/WSS for all communications
- Implement proper firewall rules
- Consider VPN for server-to-server communication
- Monitor for unauthorized access attempts
### Recording Security
- Encrypt recordings at rest
- Implement access controls for recording files
- Regular security audits of file permissions
- Comply with data protection regulations
## Migration from Whereby
If migrating from Whereby to Jitsi:
1. **Parallel Setup** - Configure Jitsi alongside existing Whereby
2. **Room Migration** - Update room platform field to "jitsi"
3. **Test Integration** - Verify meeting creation and webhooks
4. **User Training** - Different UI and feature set
5. **Monitor Performance** - Watch for issues during transition
6. **Cleanup** - Remove Whereby configuration when stable
## Support and Resources
### Jitsi Community Resources
- **Documentation**: [jitsi.github.io/handbook](https://jitsi.github.io/handbook/)
- **Community Forum**: [community.jitsi.org](https://community.jitsi.org/)
- **GitHub Issues**: [github.com/jitsi/jitsi-meet](https://github.com/jitsi/jitsi-meet)
### Professional Support
- **8x8 Commercial Support** - Professional Jitsi hosting and support
- **Community Consulting** - Third-party Jitsi implementation services
### Monitoring and Maintenance
- Monitor system resources (CPU, memory, bandwidth)
- Regular security updates for all components
- Backup configuration files and certificates
- Test disaster recovery procedures

276
docs/video-whereby.md Normal file

@@ -0,0 +1,276 @@
# Whereby Integration Configuration Guide
This guide explains how to configure Reflector to use Whereby as your video meeting platform for room creation, recording, and participant tracking.
## Overview
Whereby is a browser-based video meeting platform that provides hosted meeting rooms with recording capabilities. Reflector integrates with Whereby's API to:
- Create secure meeting rooms with custom branding
- Handle participant join/leave events via webhooks
- Automatically record meetings to AWS S3 storage
- Track meeting sessions and participant counts
## Requirements
### Whereby Account Setup
1. **Whereby Account**: Sign up for a Whereby business account at [whereby.com](https://whereby.com/business)
2. **API Access**: Request API access from Whereby support (required for programmatic room creation)
3. **Webhook Configuration**: Configure webhooks in your Whereby dashboard to point to your Reflector instance
### AWS S3 Storage
Whereby requires AWS S3 for recording storage. You need:
- AWS account with S3 access
- Dedicated S3 bucket for Whereby recordings
- AWS IAM credentials with S3 write permissions
## Configuration Variables
Add the following environment variables to your Reflector `.env` file:
### Required Variables
```bash
# Whereby API Configuration
WHEREBY_API_KEY=your-whereby-jwt-api-key
WHEREBY_WEBHOOK_SECRET=your-webhook-secret-from-whereby
# AWS S3 Storage for Recordings
AWS_WHEREBY_ACCESS_KEY_ID=your-aws-access-key
AWS_WHEREBY_ACCESS_KEY_SECRET=your-aws-secret-key
RECORDING_STORAGE_AWS_BUCKET_NAME=your-s3-bucket-name
```
### Optional Variables
```bash
# Whereby API URL (defaults to production)
WHEREBY_API_URL=https://api.whereby.dev/v1
# SQS Configuration (for recording processing)
AWS_PROCESS_RECORDING_QUEUE_URL=https://sqs.region.amazonaws.com/account/queue
SQS_POLLING_TIMEOUT_SECONDS=60
```
## Configuration Steps
### 1. Whereby API Key Setup
1. **Contact Whereby Support** to request API access for your account
2. **Generate JWT Token** in your Whereby dashboard under API settings
3. **Copy the JWT token** and set it as `WHEREBY_API_KEY` in your environment
The API key is a JWT token that looks like:
```
eyJ[...truncated JWT token...]
```
### 2. Webhook Configuration
1. **Access Whereby Dashboard** and navigate to webhook settings
2. **Set Webhook URL** to your Reflector instance:
```
https://your-reflector-domain.com/v1/whereby
```
3. **Configure Events** to send the following event types:
- `room.client.joined` - When participants join
- `room.client.left` - When participants leave
4. **Generate Webhook Secret** and set it as `WHEREBY_WEBHOOK_SECRET`
5. **Save Configuration** in your Whereby dashboard
### 3. AWS S3 Storage Setup
1. **Create S3 Bucket** dedicated for Whereby recordings
2. **Create IAM User** with programmatic access
3. **Attach S3 Policy** with the following permissions:
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:PutObjectAcl",
        "s3:GetObject"
      ],
      "Resource": "arn:aws:s3:::your-bucket-name/*"
    }
  ]
}
```
4. **Configure Environment Variables** with the IAM credentials
### 4. Room Configuration
When creating rooms in Reflector, set the platform to use Whereby:
```bash
curl -X POST "https://your-reflector-domain.com/v1/rooms" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $AUTH_TOKEN" \
-d '{
"name": "my-whereby-room",
"platform": "whereby",
"recording_type": "cloud",
"recording_trigger": "automatic-2nd-participant",
"is_locked": false,
"room_mode": "normal"
}'
```
## Meeting Features
### Recording Options
Whereby supports three recording types:
- **`none`**: No recording
- **`local`**: Local recording (not recommended for production)
- **`cloud`**: Cloud recording to S3 (recommended)
### Recording Triggers
Control when recordings start:
- **`none`**: No automatic recording
- **`prompt`**: Prompt users to start recording
- **`automatic`**: Start immediately when meeting begins
- **`automatic-2nd-participant`**: Start when second participant joins
### Room Modes
- **`normal`**: Standard meeting room
- **`group`**: Group meeting with advanced features
## Webhook Event Handling
Reflector automatically handles these Whereby webhook events:
### Participant Tracking
```json
{
  "type": "room.client.joined",
  "data": {
    "meetingId": "room-uuid",
    "numClients": 2
  }
}
```
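Because the payload already carries the current client count, a handler can store that value directly instead of incrementing a counter. The following is a sketch, assuming `room.client.left` events have the same shape as the `room.client.joined` example; `apply_whereby_event` is a hypothetical name and the dictionary stands in for the meeting table.

```python
from typing import Any, Dict

# Event types configured in the Whereby dashboard
PARTICIPANT_EVENTS = {"room.client.joined", "room.client.left"}


def apply_whereby_event(
    num_clients_by_meeting: Dict[str, int], event: Dict[str, Any]
) -> bool:
    """Update the participant count for a meeting; return True if handled."""
    if event.get("type") not in PARTICIPANT_EVENTS:
        return False
    data = event["data"]
    # Whereby reports the absolute count, so overwrite rather than increment.
    num_clients_by_meeting[data["meetingId"]] = int(data["numClients"])
    return True
```

Overwriting with the reported value keeps the count correct even if an earlier webhook was dropped or delivered twice.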
### Recording Events
Whereby sends recording completion events that trigger Reflector's processing pipeline:
- Audio transcription
- Speaker diarization
- Summary generation
## Troubleshooting
### Common Issues
#### API Authentication Errors
**Symptoms**: 401 Unauthorized errors when creating meetings
**Solutions**:
1. Verify your `WHEREBY_API_KEY` is correct and not expired
2. Ensure you have API access enabled on your Whereby account
3. Contact Whereby support if API access is not available
#### Webhook Signature Validation Failed
**Symptoms**: Webhook events rejected with 401 errors
**Solutions**:
1. Verify `WHEREBY_WEBHOOK_SECRET` matches your Whereby dashboard configuration
2. Check webhook URL is correctly configured in Whereby dashboard
3. Ensure webhook endpoint is accessible from Whereby servers
#### Recording Upload Failures
**Symptoms**: Recordings not appearing in S3 bucket
**Solutions**:
1. Verify AWS credentials have S3 write permissions
2. Check S3 bucket name is correct and accessible
3. Ensure AWS region settings match your bucket location
4. Review AWS CloudTrail logs for permission issues
#### Participant Count Not Updating
**Symptoms**: Meeting participant counts remain at 0
**Solutions**:
1. Verify webhook events are being received at `/v1/whereby`
2. Check webhook signature validation is passing
3. Ensure meeting IDs match between Whereby and Reflector database
### Debug Commands
```bash
# Test Whereby API connectivity
curl -H "Authorization: Bearer $WHEREBY_API_KEY" \
https://api.whereby.dev/v1/meetings
# Check webhook endpoint health
curl https://your-reflector-domain.com/v1/whereby/health
# Verify S3 bucket access
aws s3 ls s3://your-bucket-name --profile whereby-user
```
## Security Considerations
### API Key Security
- Store API keys securely using environment variables
- Rotate API keys regularly
- Never commit API keys to version control
- Use separate keys for development and production
### Webhook Security
- Always validate webhook signatures using HMAC-SHA256
- Use HTTPS for all webhook endpoints
- Implement rate limiting on webhook endpoints
- Monitor webhook events for suspicious activity
### Recording Privacy
- Ensure S3 bucket access is restricted to authorized users
- Consider encryption at rest for sensitive recordings
- Implement retention policies for recorded content
- Comply with data protection regulations (GDPR, etc.)
## Performance Optimization
### Meeting Scaling
- Monitor concurrent meeting limits on your Whereby plan
- Implement meeting cleanup for expired sessions
- Use appropriate room modes for different use cases
### Recording Processing
- Configure SQS for asynchronous recording processing
- Monitor S3 storage usage and costs
- Implement automatic cleanup of processed recordings
### Webhook Reliability
- Implement webhook retry mechanisms
- Monitor webhook delivery success rates
- Log webhook events for debugging and auditing
## Migration from Other Platforms
If migrating from another video platform:
1. **Update Room Configuration**: Change existing rooms to use `"platform": "whereby"`
2. **Configure Webhooks**: Set up Whereby webhook endpoints
3. **Test Integration**: Verify meeting creation and event handling
4. **Monitor Performance**: Watch for any issues during transition
5. **Update Documentation**: Inform users of any workflow changes
## Support
For Whereby-specific issues:
- **Whereby Support**: [whereby.com/support](https://whereby.com/support)
- **API Documentation**: [whereby.dev](https://whereby.dev)
- **Status Page**: [status.whereby.com](https://status.whereby.com)
For Reflector integration issues:
- Check application logs for error details
- Verify environment variable configuration
- Test webhook connectivity and authentication
- Review AWS permissions and S3 access

docs/video_platforms.md Normal file

@@ -0,0 +1,474 @@
# Video Platforms Architecture (PR #529 Analysis)
This document analyzes the video platforms refactoring implemented in PR #529 for the Daily.co integration and uses it as a blueprint for extending support to Jitsi and other video conferencing platforms.
## Overview
The video platforms refactoring introduces a clean abstraction layer that allows Reflector to support multiple video conferencing providers (Whereby, Daily.co, etc.) without changing core application logic. This architecture enables:
- Seamless switching between video platforms
- Platform-specific feature support
- Isolated platform code organization
- Consistent API surface across platforms
- Feature flags for gradual migration
## Architecture Components
### 1. **Directory Structure**
```
server/reflector/video_platforms/
├── __init__.py # Public API exports
├── base.py # Abstract base classes
├── factory.py # Platform client factory
├── registry.py # Platform registration system
├── whereby.py # Whereby implementation
├── daily.py # Daily.co implementation
└── mock.py # Testing implementation
```
### 2. **Core Abstract Classes**
#### `VideoPlatformClient` (base.py)
Abstract base class defining the interface all platforms must implement:
```python
from abc import ABC, abstractmethod
from datetime import datetime
from typing import Any, Dict, Optional


# Room and MeetingData come from the surrounding reflector modules.
class VideoPlatformClient(ABC):
    PLATFORM_NAME: str = ""

    @abstractmethod
    async def create_meeting(self, room_name_prefix: str, end_date: datetime, room: Room) -> MeetingData: ...

    @abstractmethod
    async def get_room_sessions(self, room_name: str) -> Dict[str, Any]: ...

    @abstractmethod
    async def delete_room(self, room_name: str) -> bool: ...

    @abstractmethod
    async def upload_logo(self, room_name: str, logo_path: str) -> bool: ...

    @abstractmethod
    def verify_webhook_signature(self, body: bytes, signature: str, timestamp: Optional[str] = None) -> bool: ...
```
#### `MeetingData` (base.py)
Standardized meeting data structure returned by all platforms:
```python
class MeetingData(BaseModel):
    meeting_id: str
    room_name: str
    room_url: str
    host_room_url: str
    platform: str
    extra_data: Dict[str, Any] = {}  # Platform-specific data
```
#### `VideoPlatformConfig` (base.py)
Unified configuration structure for all platforms:
```python
class VideoPlatformConfig(BaseModel):
    api_key: str
    webhook_secret: str
    api_url: Optional[str] = None
    subdomain: Optional[str] = None
    s3_bucket: Optional[str] = None
    s3_region: Optional[str] = None
    aws_role_arn: Optional[str] = None
    aws_access_key_id: Optional[str] = None
    aws_access_key_secret: Optional[str] = None
```
### 3. **Platform Registration System**
#### Registry Pattern (registry.py)
- Automatic registration of built-in platforms
- Runtime platform discovery
- Type-safe client instantiation
```python
# Auto-registration of platforms
_PLATFORMS: Dict[str, Type[VideoPlatformClient]] = {}

# Signatures only; bodies elided
def register_platform(name: str, client_class: Type[VideoPlatformClient]): ...

def get_platform_client(platform: str, config: VideoPlatformConfig) -> VideoPlatformClient: ...
```
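
How the two registry functions could fit together is sketched below. The bodies are assumptions, since the source lists signatures only; `VideoPlatformClient` and `MockClient` are simplified stand-ins, and a plain `dict` replaces `VideoPlatformConfig` to keep the example self-contained.

```python
from typing import Dict, Type


class VideoPlatformClient:
    """Simplified stand-in for the abstract base class in base.py."""

    PLATFORM_NAME = ""

    def __init__(self, config: dict) -> None:
        self.config = config


class MockClient(VideoPlatformClient):
    PLATFORM_NAME = "mock"


_PLATFORMS: Dict[str, Type[VideoPlatformClient]] = {}


def register_platform(name: str, client_class: Type[VideoPlatformClient]) -> None:
    # Store under a lower-cased key so lookups are case-insensitive.
    _PLATFORMS[name.lower()] = client_class


def get_platform_client(platform: str, config: dict) -> VideoPlatformClient:
    try:
        client_class = _PLATFORMS[platform.lower()]
    except KeyError:
        raise ValueError(f"Unknown video platform: {platform!r}") from None
    return client_class(config)


register_platform("mock", MockClient)
client = get_platform_client("mock", {"api_key": "test"})
print(type(client).__name__)
```

Raising on an unknown name, rather than falling back to a default client, keeps a misconfigured `platform` value from silently creating meetings on the wrong provider.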
#### Factory System (factory.py)
- Configuration management per platform
- Platform selection logic
- Feature flag integration
```python
def get_platform_for_room(room_id: Optional[str] = None) -> str:
    """Determine which platform to use based on feature flags."""
    if not settings.DAILY_MIGRATION_ENABLED:
        return "whereby"
    if room_id and room_id in settings.DAILY_MIGRATION_ROOM_IDS:
        return "daily"
    return settings.DEFAULT_VIDEO_PLATFORM
```
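
The selection logic above can be exercised in isolation. In this sketch the `Settings` dataclass is a hypothetical stand-in for Reflector's settings object, its default values are chosen for the example, and `settings` is passed as a parameter (unlike the original, which reads a module-level object) so each case can be shown side by side.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Settings:
    """Hypothetical stand-in for the application settings."""

    DAILY_MIGRATION_ENABLED: bool = True
    DAILY_MIGRATION_ROOM_IDS: List[str] = field(default_factory=list)
    DEFAULT_VIDEO_PLATFORM: str = "whereby"


def get_platform_for_room(settings: Settings, room_id: Optional[str] = None) -> str:
    """Same decision order as the factory function shown above."""
    if not settings.DAILY_MIGRATION_ENABLED:
        return "whereby"
    if room_id and room_id in settings.DAILY_MIGRATION_ROOM_IDS:
        return "daily"
    return settings.DEFAULT_VIDEO_PLATFORM


disabled = Settings(DAILY_MIGRATION_ENABLED=False, DAILY_MIGRATION_ROOM_IDS=["room-a"])
partial = Settings(DAILY_MIGRATION_ROOM_IDS=["room-a"])

print(get_platform_for_room(disabled, "room-a"))  # flag off: always Whereby
print(get_platform_for_room(partial, "room-a"))   # listed room: migrated
print(get_platform_for_room(partial, "room-b"))   # unlisted room: default platform
```

Note that the third case depends entirely on `DEFAULT_VIDEO_PLATFORM`: if it is set to `"daily"`, every room uses Daily.co once the migration flag is on, whether or not it appears in the room list.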
### 4. **Database Schema Changes**
#### Room Model Updates
Added `platform` field to track which video platform each room uses:
```python
# Database Schema
platform_column = sqlalchemy.Column(
    "platform",
    sqlalchemy.String,
    nullable=False,
    server_default="whereby",
)

# Pydantic Model
class Room(BaseModel):
    platform: Literal["whereby", "daily"] = "whereby"
```
#### Meeting Model Updates
Added `platform` field to meetings for tracking and debugging:
```python
# Database Schema
platform_column = sqlalchemy.Column(
    "platform",
    sqlalchemy.String,
    nullable=False,
    server_default="whereby",
)

# Pydantic Model
class Meeting(BaseModel):
    platform: Literal["whereby", "daily"] = "whereby"
```
**Key Decision**: No platform-specific fields were added to models. Instead, the `extra_data` field in `MeetingData` handles platform-specific information, following the user's rule of using generic `provider_data` as JSON if needed.
### 5. **Settings Configuration**
#### Feature Flags
```python
# Migration control
DAILY_MIGRATION_ENABLED: bool = True
DAILY_MIGRATION_ROOM_IDS: list[str] = []
DEFAULT_VIDEO_PLATFORM: str = "daily"
# Daily.co specific settings
DAILY_API_KEY: str | None = None
DAILY_WEBHOOK_SECRET: str | None = None
DAILY_SUBDOMAIN: str | None = None
AWS_DAILY_S3_BUCKET: str | None = None
AWS_DAILY_S3_REGION: str = "us-west-2"
AWS_DAILY_ROLE_ARN: str | None = None
```
#### Configuration Pattern
Each platform gets its own configuration namespace while sharing common patterns:
```python
def get_platform_config(platform: str) -> VideoPlatformConfig:
    if platform == "whereby":
        return VideoPlatformConfig(
            api_key=settings.WHEREBY_API_KEY or "",
            webhook_secret=settings.WHEREBY_WEBHOOK_SECRET or "",
            # ... whereby-specific config
        )
    elif platform == "daily":
        return VideoPlatformConfig(
            api_key=settings.DAILY_API_KEY or "",
            webhook_secret=settings.DAILY_WEBHOOK_SECRET or "",
            # ... daily-specific config
        )
```
### 6. **API Integration Updates**
#### Room Creation (views/rooms.py)
Updated to use platform factory instead of direct Whereby calls:
```python
@router.post("/rooms/{room_name}/meeting")
async def rooms_create_meeting(room_name: str, user: UserInfo):
    # OLD: Direct Whereby integration
    # whereby_meeting = await create_meeting("", end_date=end_date, room=room)

    # NEW: Platform abstraction
    platform = get_platform_for_room(room.id)
    client = create_platform_client(platform)
    meeting_data = await client.create_meeting(
        room_name_prefix=room.name, end_date=end_date, room=room
    )
    await client.upload_logo(meeting_data.room_name, "./images/logo.png")
```
### 7. **Webhook Handling**
#### Separate Webhook Endpoints
Each platform gets its own webhook endpoint with platform-specific signature verification:
```python
# views/daily.py
@router.post("/daily_webhook")
async def daily_webhook(event: DailyWebhookEvent, request: Request):
    # Verify Daily.co signature
    body = await request.body()
    signature = request.headers.get("X-Daily-Signature", "")
    if not verify_daily_webhook_signature(body, signature):
        raise HTTPException(status_code=401)

    # Handle platform-specific events
    if event.type == "participant.joined":
        await _handle_participant_joined(event)
```
#### Consistent Event Handling
Despite different event formats, the core business logic remains the same:
```python
async def _handle_participant_joined(event):
    room_name = event.data.get("room", {}).get("name")  # Daily.co format
    meeting = await meetings_controller.get_by_room_name(room_name)
    if meeting:
        current_count = getattr(meeting, "num_clients", 0)
        await meetings_controller.update_meeting(
            meeting.id, num_clients=current_count + 1
        )
```
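
One way to keep the business logic identical across providers is to normalise each payload to a room name before it reaches the shared handler. The helper below is a hypothetical sketch: the function name is not from the PR, the Daily.co shape follows the handler above, and the Jitsi shape follows the Prosody room JID format used later in this document.

```python
from typing import Any, Dict, Optional


def extract_room_name(platform: str, payload: Dict[str, Any]) -> Optional[str]:
    """Return the room name from a webhook payload, or None if it is absent."""
    if platform == "daily":
        # Daily.co nests the room under data.room.name.
        return payload.get("data", {}).get("room", {}).get("name") or None
    if platform == "jitsi":
        # Prosody reports a room JID such as "standup@conference.meet.jitsi";
        # the part before "@" is the room name.
        return payload.get("room", "").split("@")[0] or None
    raise ValueError(f"Unsupported platform: {platform!r}")


daily_event = {"type": "participant.joined", "data": {"room": {"name": "standup"}}}
jitsi_event = {"event": "muc-occupant-joined", "room": "standup@conference.meet.jitsi"}

print(extract_room_name("daily", daily_event))
print(extract_room_name("jitsi", jitsi_event))
```

With the room name extracted up front, the participant-count update needs no knowledge of which provider sent the event.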
### 8. **Worker Task Integration**
#### New Task for Daily.co Recording Processing
Added platform-specific recording processing while maintaining the same pipeline:
```python
@shared_task
@asynctask
async def process_recording_from_url(recording_url: str, meeting_id: str, recording_id: str):
    """Process recording from a direct URL (Daily.co webhook)."""
    logger.info("Processing recording from URL for meeting: %s", meeting_id)
    # Uses same processing pipeline as Whereby S3 recordings
```
**Key Decision**: Worker tasks remain in main worker module but could be moved to platform-specific folders as suggested by the user.
### 9. **Testing Infrastructure**
#### Comprehensive Test Suite
- Unit tests for each platform client
- Integration tests for platform switching
- Mock platform for testing without external dependencies
- Webhook signature verification tests
```python
class TestPlatformIntegration:
    """Integration tests for platform switching."""

    async def test_platform_switching_preserves_interface(self):
        """Test that different platforms provide consistent interface."""
        # Test both Mock and Daily platforms return MeetingData objects
        # with consistent fields
```
## Implementation Patterns for Jitsi Integration
Based on the Daily.co implementation, here's how Jitsi should be integrated:
### 1. **Jitsi Client Implementation**
```python
# video_platforms/jitsi.py
class JitsiClient(VideoPlatformClient):
    PLATFORM_NAME = "jitsi"

    async def create_meeting(self, room_name_prefix: str, end_date: datetime, room: Room) -> MeetingData:
        # Generate unique room name
        jitsi_room = f"reflector-{room.name}-{int(time.time())}"

        # Generate JWT tokens
        user_jwt = self._generate_jwt(room=jitsi_room, moderator=False, exp=end_date)
        host_jwt = self._generate_jwt(room=jitsi_room, moderator=True, exp=end_date)

        return MeetingData(
            meeting_id=generate_uuid4(),
            room_name=jitsi_room,
            room_url=f"https://jitsi.domain/{jitsi_room}?jwt={user_jwt}",
            host_room_url=f"https://jitsi.domain/{jitsi_room}?jwt={host_jwt}",
            platform=self.PLATFORM_NAME,
            extra_data={"user_jwt": user_jwt, "host_jwt": host_jwt},
        )
```
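
The client above calls `_generate_jwt` without showing it. The sketch below illustrates how an HS256 token is assembled, using only the standard library. The claim layout (`room`, `exp`, `context.user.moderator`) is an assumption for illustration; a real Jitsi deployment typically requires further claims such as issuer and audience, and a maintained JWT library would normally be preferable to hand-rolled signing.

```python
import base64
import hashlib
import hmac
import json
from datetime import datetime, timedelta, timezone


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def generate_jwt(secret: str, room: str, moderator: bool, exp: datetime) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {
        "room": room,
        "exp": int(exp.timestamp()),
        # Assumed claim layout; check what your Jitsi deployment expects.
        "context": {"user": {"moderator": moderator}},
    }
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(
        secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return f"{signing_input}.{_b64url(signature)}"


# Use a timezone-aware expiry so timestamp() does not depend on local time.
expiry = datetime.now(timezone.utc) + timedelta(hours=1)
token = generate_jwt("example-secret", "reflector-demo", moderator=True, exp=expiry)
print(token.count("."))  # a JWT has three dot-separated segments
```

Because the token ends up in the room URL, anyone who obtains the host URL gains moderator rights until `exp`, so short expiry times are worth considering.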
### 2. **Settings Integration**
```python
# settings.py
JITSI_DOMAIN: str = "meet.jit.si"
JITSI_JWT_SECRET: str | None = None
JITSI_WEBHOOK_SECRET: str | None = None
JITSI_API_URL: str | None = None # If using Jitsi API
```
### 3. **Factory Registration**
```python
# registry.py
def _register_builtin_platforms():
    from .jitsi import JitsiClient

    register_platform("jitsi", JitsiClient)


# factory.py
def get_platform_config(platform: str) -> VideoPlatformConfig:
    # ... existing "whereby" and "daily" branches ...
    elif platform == "jitsi":
        return VideoPlatformConfig(
            api_key="",  # Jitsi may not need API key
            webhook_secret=settings.JITSI_WEBHOOK_SECRET or "",
            api_url=settings.JITSI_API_URL,
        )
```
### 4. **Webhook Integration**
```python
# views/jitsi.py
@router.post("/jitsi/events")
async def jitsi_events_webhook(event_data: dict):
    # Handle Prosody event-sync webhook format
    event_type = event_data.get("event")
    room_name = event_data.get("room", "").split("@")[0]
    if event_type == "muc-occupant-joined":
        # Same participant handling logic as other platforms
        ...
```
## Key Benefits of This Architecture
### 1. **Isolation and Organization**
- Platform-specific code contained in separate modules
- No platform logic leaking into core application
- Easy to add/remove platforms without affecting others
### 2. **Consistent Interface**
- All platforms implement the same abstract methods
- Standardized `MeetingData` structure
- Uniform error handling and logging
### 3. **Gradual Migration Support**
- Feature flags for controlled rollouts
- Room-specific platform selection
- Fallback mechanisms for platform failures
### 4. **Configuration Management**
- Centralized settings per platform
- Consistent naming patterns
- Environment-based configuration
### 5. **Testing and Quality**
- Mock platform for testing
- Comprehensive test coverage
- Platform-specific test utilities
## Migration Strategy Applied
The Daily.co implementation demonstrates a careful migration approach:
### 1. **Backward Compatibility**
- The `platform` column and model field default to "whereby" (`server_default="whereby"`)
- Existing rooms continue using Whereby unless explicitly migrated
- Same API endpoints and response formats
### 2. **Feature Flag Control**
```python
# Gradual rollout control
DAILY_MIGRATION_ENABLED: bool = True
DAILY_MIGRATION_ROOM_IDS: list[str] = [] # Specific rooms to migrate
DEFAULT_VIDEO_PLATFORM: str = "daily" # New rooms default
```
### 3. **Data Integrity**
- Platform field tracks which service each room/meeting uses
- No data loss during migration
- Platform-specific data preserved in `extra_data`
### 4. **Monitoring and Rollback**
- Comprehensive logging of platform selection
- Easy rollback by changing feature flags
- Platform-specific error tracking
## Recommendations for Jitsi Integration
Based on this analysis and the user's requirements:
### 1. **Follow the Pattern**
- Create `video_platforms/jitsi/` directory with:
  - `client.py` - Main JitsiClient implementation
  - `tasks.py` - Jitsi-specific worker tasks
  - `__init__.py` - Module exports
### 2. **Settings Organization**
- Use `JITSI_*` prefix for all Jitsi settings
- Follow the same configuration pattern as Daily.co
- Support both environment variables and config files
### 3. **Generic Database Fields**
- Avoid platform-specific columns in database
- Use `provider_data` JSON field if platform-specific data needed
- Keep `platform` field as simple string identifier
### 4. **Worker Task Migration**
According to user requirements, migrate platform-specific tasks:
```
video_platforms/
├── whereby/
│   ├── client.py (moved from whereby.py)
│   └── tasks.py (moved from worker/whereby_tasks.py)
├── daily/
│   ├── client.py (moved from daily.py)
│   └── tasks.py (moved from worker/daily_tasks.py)
└── jitsi/
    ├── client.py (new JitsiClient)
    └── tasks.py (new Jitsi recording tasks)
```
### 5. **Webhook Architecture**
- Create `views/jitsi.py` for Jitsi-specific webhooks
- Follow the same signature verification pattern
- Reuse existing participant tracking logic
## Implementation Checklist for Jitsi
- [ ] Create `video_platforms/jitsi/` directory structure
- [ ] Implement `JitsiClient` following the abstract interface
- [ ] Add Jitsi settings to configuration
- [ ] Register Jitsi platform in factory/registry
- [ ] Create Jitsi webhook endpoint
- [ ] Implement JWT token generation for room access
- [ ] Add Jitsi recording processing tasks
- [ ] Create comprehensive test suite
- [ ] Update database migrations for platform field and extend the `platform` `Literal` on `Room` and `Meeting` to include `"jitsi"`
- [ ] Document Jitsi-specific configuration
## Conclusion
The video platforms refactoring in PR #529 provides a solid foundation for adding Jitsi support: platform code is isolated in its own modules, every client implements the same interface, and new platforms are added through the registry rather than by changing core logic. The Daily.co implementation demonstrates how to add a new platform while maintaining backward compatibility and providing gradual migration capabilities.
The pattern should be directly applicable to Jitsi integration, with the main differences being:
- JWT-based authentication instead of API keys
- Different webhook event formats
- Jibri recording pipeline integration
- Self-hosted deployment considerations
This architecture successfully achieves the user's goals of:
1. Settings-based configuration
2. Generic database fields (no provider-specific columns)
3. Platform isolation in separate directories
4. Worker task organization within platform folders


@@ -1,21 +0,0 @@
TRANSCRIPT_BACKEND=modal
TRANSCRIPT_URL=https://monadical-sas--reflector-transcriber-web.modal.run
TRANSCRIPT_MODAL_API_KEY=***REMOVED***
LLM_BACKEND=modal
LLM_URL=https://monadical-sas--reflector-llm-web.modal.run
LLM_MODAL_API_KEY=***REMOVED***
AUTH_BACKEND=fief
AUTH_FIEF_URL=https://auth.reflector.media/reflector-local
AUTH_FIEF_CLIENT_ID=***REMOVED***
AUTH_FIEF_CLIENT_SECRET=<ask in zulip> <-----------------------------------------------------------------------------------------
TRANSLATE_URL=https://monadical-sas--reflector-translator-web.modal.run
ZEPHYR_LLM_URL=https://monadical-sas--reflector-llm-zephyr-web.modal.run
DIARIZATION_URL=https://monadical-sas--reflector-diarizer-web.modal.run
BASE_URL=https://xxxxx.ngrok.app
DIARIZATION_ENABLED=false
SQS_POLLING_TIMEOUT_SECONDS=60

server/.gitignore vendored

@@ -176,7 +176,9 @@ artefacts/
audio_*.wav
# ignore local database
reflector.sqlite3
*.sqlite3
*.db
data/
dump.rdb


@@ -1 +1 @@
3.11.6
3.12


@@ -1,30 +1,41 @@
FROM python:3.11-slim as base
FROM python:3.12-slim
ENV PIP_DEFAULT_TIMEOUT=100 \
PIP_DISABLE_PIP_VERSION_CHECK=1 \
PIP_NO_CACHE_DIR=1 \
PYTHONDONTWRITEBYTECODE=1 \
PYTHONUNBUFFERED=1 \
POETRY_VERSION=1.3.1
ENV PYTHONUNBUFFERED=1 \
UV_LINK_MODE=copy \
UV_NO_CACHE=1
# builder install base dependencies
FROM base AS builder
WORKDIR /tmp
RUN pip install "poetry==$POETRY_VERSION"
RUN python -m venv /venv
RUN apt-get update && apt-get install -y curl && apt-get clean
ADD https://astral.sh/uv/install.sh /uv-installer.sh
RUN sh /uv-installer.sh && rm /uv-installer.sh
ENV PATH="/root/.local/bin/:$PATH"
# install application dependencies
COPY pyproject.toml poetry.lock /tmp
RUN . /venv/bin/activate && poetry config virtualenvs.create false
RUN . /venv/bin/activate && poetry install --only main,aws --no-root --no-interaction --no-ansi
RUN mkdir -p /app
WORKDIR /app
COPY pyproject.toml uv.lock README.md /app/
RUN uv sync --compile-bytecode --locked
# pre-download nltk packages
RUN uv run python -c "import nltk; nltk.download('punkt_tab'); nltk.download('averaged_perceptron_tagger_eng')"
# bootstrap
FROM base AS final
COPY --from=builder /venv /venv
RUN mkdir -p /app
COPY reflector /app/reflector
COPY migrations /app/migrations
COPY images /app/images
COPY alembic.ini runserver.sh /app/
COPY images /app/images
COPY migrations /app/migrations
COPY reflector /app/reflector
WORKDIR /app
# Create symlink for libgomp if it doesn't exist (for ARM64 compatibility)
RUN if [ "$(uname -m)" = "aarch64" ] && [ ! -f /usr/lib/libgomp.so.1 ]; then \
LIBGOMP_PATH=$(find /app/.venv/lib -path "*/torch.libs/libgomp*.so.*" 2>/dev/null | head -n1); \
if [ -n "$LIBGOMP_PATH" ]; then \
ln -sf "$LIBGOMP_PATH" /usr/lib/libgomp.so.1; \
fi \
fi
# Pre-check just to make sure the image will not fail
RUN uv run python -c "import silero_vad.model"
CMD ["./runserver.sh"]


@@ -20,3 +20,25 @@ Polls SQS every 60 seconds via /server/reflector/worker/process.py:24-62:
# Every 60 seconds, check for new recordings
sqs = boto3.client("sqs", ...)
response = sqs.receive_message(QueueUrl=queue_url, ...)
# Requeue
```bash
uv run /app/requeue_uploaded_file.py TRANSCRIPT_ID
```
## Pipeline Management
### Continue stuck pipeline from final summaries (identify_participants) step:
```bash
uv run python -c "from reflector.pipelines.main_live_pipeline import task_pipeline_final_summaries; result = task_pipeline_final_summaries.delay(transcript_id='TRANSCRIPT_ID'); print(f'Task queued: {result.id}')"
```
### Run full post-processing pipeline (continues to completion):
```bash
uv run python -c "from reflector.pipelines.main_live_pipeline import pipeline_post; pipeline_post(transcript_id='TRANSCRIPT_ID')"
```
.


@@ -0,0 +1,212 @@
# Event Logger for Docker-Jitsi-Meet
A Prosody module that logs Jitsi meeting events to JSONL files alongside recordings, enabling complete participant tracking and speaker statistics.
## Prerequisites
- Running docker-jitsi-meet installation
- Jibri configured for recording
## Installation
### Step 1: Copy the Module
Copy the Prosody module to your custom plugins directory:
```bash
# Create the directory if it doesn't exist
mkdir -p ~/.jitsi-meet-cfg/prosody/prosody-plugins-custom
# Copy the module
cp mod_event_logger.lua ~/.jitsi-meet-cfg/prosody/prosody-plugins-custom/
```
### Step 2: Update Your .env File
Add or modify these variables in your `.env` file:
```bash
# If XMPP_MUC_MODULES already exists, append event_logger
# Example: XMPP_MUC_MODULES=existing_module,event_logger
XMPP_MUC_MODULES=event_logger
# Optional: Configure the module (these are defaults)
JIBRI_RECORDINGS_PATH=/config/recordings
JIBRI_LOG_SPEAKER_STATS=true
JIBRI_SPEAKER_STATS_INTERVAL=10
```
**Important**: If you already have `XMPP_MUC_MODULES` defined, add `event_logger` to the comma-separated list:
```bash
# Existing modules + our module
XMPP_MUC_MODULES=mod_info,mod_alert,event_logger
```
### Step 3: Modify docker-compose.yml
Add a shared recordings volume so Prosody can write events alongside Jibri recordings:
```yaml
services:
  prosody:
    # ... existing configuration ...
    volumes:
      - ${CONFIG}/prosody/config:/config:Z
      - ${CONFIG}/prosody/prosody-plugins-custom:/prosody-plugins-custom:Z
      - ${CONFIG}/recordings:/config/recordings:Z # Add this line
    environment:
      # Add if not using .env file
      - XMPP_MUC_MODULES=${XMPP_MUC_MODULES:-event_logger}
      - JIBRI_RECORDINGS_PATH=/config/recordings
  jibri:
    # ... existing configuration ...
    volumes:
      - ${CONFIG}/jibri:/config:Z
      - ${CONFIG}/recordings:/config/recordings:Z # Add this line
    environment:
      # For Reflector webhook integration (optional)
      - REFLECTOR_WEBHOOK_URL=${REFLECTOR_WEBHOOK_URL:-}
      - JIBRI_FINALIZE_RECORDING_SCRIPT_PATH=/config/finalize.sh
```
### Step 4: Add Finalize Script (Optional - For Reflector Integration)
If you want to notify Reflector when recordings complete:
```bash
# Copy the finalize script
cp finalize.sh ~/.jitsi-meet-cfg/jibri/finalize.sh
chmod +x ~/.jitsi-meet-cfg/jibri/finalize.sh
# Add to .env
REFLECTOR_WEBHOOK_URL=http://your-reflector-api:8000
```
### Step 5: Restart Services
```bash
docker-compose down
docker-compose up -d
```
## What Gets Created
After a recording, you'll find in `~/.jitsi-meet-cfg/recordings/{session-id}/`:
- `recording.mp4` - The video recording (created by Jibri)
- `metadata.json` - Basic metadata (created by Jibri)
- `events.jsonl` - Complete participant timeline (created by this module)
## Event Format
Each line in `events.jsonl` is a JSON object:
```json
{"type":"room_created","timestamp":1234567890,"room_name":"TestRoom","room_jid":"testroom@conference.meet.jitsi","meeting_url":"https://meet.jitsi/TestRoom"}
{"type":"recording_started","timestamp":1234567891,"room_name":"TestRoom","session_id":"20240115120000_TestRoom","jibri_jid":"jibri@recorder.meet.jitsi"}
{"type":"participant_joined","timestamp":1234567892,"room_name":"TestRoom","participant":{"jid":"user1@meet.jitsi/web","nick":"John Doe","id":"user1@meet.jitsi","is_moderator":false}}
{"type":"speaker_active","timestamp":1234567895,"room_name":"TestRoom","speaker_jid":"user1@meet.jitsi","speaker_nick":"John Doe","duration":10}
{"type":"participant_left","timestamp":1234567920,"room_name":"TestRoom","participant":{"jid":"user1@meet.jitsi/web","nick":"John Doe","duration_seconds":28}}
{"type":"recording_stopped","timestamp":1234567950,"room_name":"TestRoom","session_id":"20240115120000_TestRoom","meeting_url":"https://meet.jitsi/TestRoom"}
```
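
A consumer of `events.jsonl` might derive per-participant presence from the join and leave events. The sketch below assumes the event shapes shown in the sample above (a `participant.jid` field and integer `timestamp` values); the function name is hypothetical and other event types are simply ignored.

```python
import json
from typing import Dict, Iterable


def participant_durations(lines: Iterable[str]) -> Dict[str, int]:
    """Sum the seconds each participant JID spent in the room."""
    joined_at: Dict[str, int] = {}
    totals: Dict[str, int] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the file
        event = json.loads(line)
        jid = (event.get("participant") or {}).get("jid")
        if jid is None:
            continue  # room and recording events carry no participant
        if event["type"] == "participant_joined":
            joined_at[jid] = event["timestamp"]
        elif event["type"] == "participant_left" and jid in joined_at:
            elapsed = event["timestamp"] - joined_at.pop(jid)
            totals[jid] = totals.get(jid, 0) + elapsed
    return totals


sample = [
    '{"type":"recording_started","timestamp":1234567891,"room_name":"TestRoom"}',
    '{"type":"participant_joined","timestamp":1234567892,"participant":{"jid":"user1@meet.jitsi/web"}}',
    '{"type":"participant_left","timestamp":1234567920,"participant":{"jid":"user1@meet.jitsi/web"}}',
]
print(participant_durations(sample))
```

Participants who were already present when recording started, or who never left before it stopped, have no matching pair of events and would need to be handled separately.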
## Configuration Options
All configuration can be done via environment variables:
| Environment Variable | Default | Description |
|---------------------|---------|-------------|
| `JIBRI_RECORDINGS_PATH` | `/config/recordings` | Path where recordings are stored |
| `JIBRI_LOG_SPEAKER_STATS` | `true` | Enable speaker statistics logging |
| `JIBRI_SPEAKER_STATS_INTERVAL` | `10` | Seconds between speaker stats updates |
## Verifying Installation
Check that the module is loaded:
```bash
docker-compose logs prosody | grep "Event Logger"
# Should see: "Event Logger loaded - writing to /config/recordings"
```
Check for events after a recording:
```bash
ls -la ~/.jitsi-meet-cfg/recordings/*/events.jsonl
cat ~/.jitsi-meet-cfg/recordings/*/events.jsonl | jq .
```
## Troubleshooting
### No events.jsonl file created
1. **Check module is enabled**:
   ```bash
   docker-compose exec prosody grep -r "event_logger" /config
   ```
2. **Verify volume permissions**:
   ```bash
   docker-compose exec prosody ls -la /config/recordings
   ```
3. **Check Prosody logs for errors**:
   ```bash
   docker-compose logs prosody | grep -i error
   ```
### Module not loading
1. **Verify file exists in container**:
   ```bash
   docker-compose exec prosody ls -la /prosody-plugins-custom/
   ```
2. **Check XMPP_MUC_MODULES format** (must be comma-separated, no spaces):
   - ✅ Correct: `XMPP_MUC_MODULES=mod1,mod2,event_logger`
   - ❌ Wrong: `XMPP_MUC_MODULES=mod1, mod2, event_logger`
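
If the `.env` file is generated by a script, the comma-separated format can be enforced when the value is built. The helper below is a hypothetical sketch, not part of this repository: it strips stray spaces, drops empty entries and avoids adding `event_logger` twice.

```python
def append_module(current: str, module: str = "event_logger") -> str:
    """Return a comma-separated module list (no spaces) that includes module."""
    modules = [name.strip() for name in current.split(",") if name.strip()]
    if module not in modules:
        modules.append(module)
    return ",".join(modules)


print(append_module("mod1, mod2"))
print(append_module(""))
print(append_module("mod1,event_logger"))
```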
## Common docker-compose.yml Patterns
### Minimal Addition (if you trust defaults)
```yaml
services:
  prosody:
    volumes:
      - ${CONFIG}/recordings:/config/recordings:Z # Just add this
```
### Full Configuration
```yaml
services:
  prosody:
    volumes:
      - ${CONFIG}/prosody/config:/config:Z
      - ${CONFIG}/prosody/prosody-plugins-custom:/prosody-plugins-custom:Z
      - ${CONFIG}/recordings:/config/recordings:Z
    environment:
      - XMPP_MUC_MODULES=event_logger
      - JIBRI_RECORDINGS_PATH=/config/recordings
      - JIBRI_LOG_SPEAKER_STATS=true
      - JIBRI_SPEAKER_STATS_INTERVAL=10
  jibri:
    volumes:
      - ${CONFIG}/jibri:/config:Z
      - ${CONFIG}/recordings:/config/recordings:Z
    environment:
      - JIBRI_RECORDING_DIR=/config/recordings
      - JIBRI_FINALIZE_RECORDING_SCRIPT_PATH=/config/finalize.sh
```
## Integration with Reflector
The finalize.sh script will automatically notify Reflector when a recording completes if `REFLECTOR_WEBHOOK_URL` is set. Reflector will receive:
```json
{
"session_id": "20240115120000_TestRoom",
"path": "20240115120000_TestRoom",
"meeting_url": "https://meet.jitsi/TestRoom"
}
```
Reflector then processes the recording along with the complete participant timeline from `events.jsonl`.
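
On the receiving side, the notification can be validated before any file is touched. The sketch below is an assumption about how a consumer might parse the payload, not Reflector's actual handler: the class name, the `events_file` helper and the `/config/recordings` root are illustrative.

```python
import json
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class RecordingReady:
    session_id: str
    path: str
    meeting_url: str

    @classmethod
    def from_json(cls, raw: str) -> "RecordingReady":
        data = json.loads(raw)
        missing = [key for key in ("session_id", "path", "meeting_url") if key not in data]
        if missing:
            raise ValueError(f"missing fields: {', '.join(missing)}")
        return cls(str(data["session_id"]), str(data["path"]), str(data["meeting_url"]))

    def events_file(self, recordings_root: str) -> str:
        # Reject separators and parent references so the path cannot escape the root.
        if "/" in self.path or self.path in ("", ".", ".."):
            raise ValueError(f"unsafe recording path: {self.path!r}")
        return str(PurePosixPath(recordings_root) / self.path / "events.jsonl")


body = '{"session_id": "20240115120000_TestRoom", "path": "20240115120000_TestRoom", "meeting_url": "https://meet.jitsi/TestRoom"}'
notification = RecordingReady.from_json(body)
print(notification.events_file("/config/recordings"))
```

Checking `path` matters because the value arrives over HTTP and is later joined onto a filesystem root.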


@@ -0,0 +1,49 @@
#!/bin/bash
# Jibri finalize script to notify Reflector when recording is complete
# This script is called by Jibri with the recording directory as argument
RECORDING_PATH="$1"
SESSION_ID=$(basename "$RECORDING_PATH")
METADATA_FILE="$RECORDING_PATH/metadata.json"
# Extract meeting URL from Jibri's metadata
MEETING_URL=""
if [ -f "$METADATA_FILE" ]; then
# "// empty" yields an empty string instead of the literal "null" when the key is missing
MEETING_URL=$(jq -r '.meeting_url // empty' "$METADATA_FILE" 2>/dev/null || echo "")
fi
echo "[$(date)] Recording finalized: $RECORDING_PATH"
echo "[$(date)] Session ID: $SESSION_ID"
echo "[$(date)] Meeting URL: $MEETING_URL"
# Check if events.jsonl was created by our Prosody module
if [ -f "$RECORDING_PATH/events.jsonl" ]; then
EVENT_COUNT=$(wc -l < "$RECORDING_PATH/events.jsonl")
echo "[$(date)] Found events.jsonl with $EVENT_COUNT events"
else
echo "[$(date)] Warning: No events.jsonl found"
fi
# Notify Reflector if webhook URL is configured
if [ -n "$REFLECTOR_WEBHOOK_URL" ]; then
echo "[$(date)] Notifying Reflector at: $REFLECTOR_WEBHOOK_URL"
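# Build the JSON body with jq so quotes or backslashes in the values are escaped
PAYLOAD=$(jq -n --arg session_id "$SESSION_ID" --arg meeting_url "$MEETING_URL" \
'{session_id: $session_id, path: $session_id, meeting_url: $meeting_url}')
RESPONSE=$(curl -s -w "\n%{http_code}" -X POST "$REFLECTOR_WEBHOOK_URL/api/v1/jibri/recording-ready" \
-H "Content-Type: application/json" \
-d "$PAYLOAD")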
HTTP_CODE=$(echo "$RESPONSE" | tail -n1)
BODY=$(echo "$RESPONSE" | sed '$d')
if [ "$HTTP_CODE" = "200" ]; then
echo "[$(date)] Reflector notified successfully"
echo "[$(date)] Response: $BODY"
else
echo "[$(date)] Failed to notify Reflector. HTTP code: $HTTP_CODE"
echo "[$(date)] Response: $BODY"
fi
else
echo "[$(date)] No REFLECTOR_WEBHOOK_URL configured, skipping notification"
fi
echo "[$(date)] Finalize script completed"


@@ -0,0 +1,372 @@
local json = require "util.json"
local st = require "util.stanza"
local jid_bare = require "util.jid".bare
local recordings_path = os.getenv("JIBRI_RECORDINGS_PATH") or
module:get_option_string("jibri_recordings_path", "/recordings")
-- room_jid -> { session_id, participants = {jid -> info} }
local active_recordings = {}
-- room_jid -> { participants = {jid -> info}, created_at }
local room_states = {}
local function get_timestamp()
return os.time()
end
local function write_event(session_id, event)
if not session_id then
module:log("warn", "No session_id for event: %s", event.type)
return
end
local session_dir = string.format("%s/%s", recordings_path, session_id)
local event_file = string.format("%s/events.jsonl", session_dir)
module:log("info", "Writing event %s to %s", event.type, event_file)
-- Create directory
local mkdir_cmd = string.format("mkdir -p '%s' 2>&1", session_dir)
local mkdir_result = os.execute(mkdir_cmd)
module:log("debug", "mkdir result: %s", tostring(mkdir_result))
local file, err = io.open(event_file, "a")
if file then
local json_str = json.encode(event)
file:write(json_str .. "\n")
file:close()
module:log("info", "Successfully wrote event %s", event.type)
else
module:log("error", "Failed to write event to %s: %s", event_file, err)
end
end
local function extract_participant_info(occupant)
local info = {
jid = occupant.jid,
bare_jid = occupant.bare_jid,
nick = occupant.nick,
display_name = nil,
role = occupant.role
}
local presence = occupant:get_presence()
if presence then
local nick_element = presence:get_child("nick", "http://jabber.org/protocol/nick")
if nick_element then
info.display_name = nick_element:get_text()
end
local identity = presence:get_child("identity")
if identity then
local user = identity:get_child("user")
if user then
local name = user:get_child("name")
if name then
info.display_name = name:get_text()
end
local id_element = user:get_child("id")
if id_element then
info.id = id_element:get_text()
end
end
end
if not info.display_name and occupant.nick then
local _, _, resource = occupant.nick:match("([^@]+)@([^/]+)/(.+)")
if resource then
info.display_name = resource
end
end
end
return info
end
local function get_room_participant_count(room)
local count = 0
for _ in room:each_occupant() do
count = count + 1
end
return count
end
local function snapshot_room_participants(room)
local participants = {}
local total = 0
local skipped = 0
module:log("info", "Snapshotting room participants")
for _, occupant in room:each_occupant() do
total = total + 1
-- Skip recorders (Jibri)
if occupant.bare_jid and (occupant.bare_jid:match("^recorder@") or
occupant.bare_jid:match("^jibri@")) then
skipped = skipped + 1
else
local info = extract_participant_info(occupant)
participants[occupant.jid] = info
module:log("debug", "Added participant: %s", info.display_name or info.bare_jid)
end
end
module:log("info", "Snapshot: %d total, %d participants", total, total - skipped)
return participants
end
-- Import utility functions if available
local util = module:require "util";
local get_room_from_jid = util.get_room_from_jid;
local room_jid_match_rewrite = util.room_jid_match_rewrite;
-- Main IQ handler for Jibri stanzas
module:hook("pre-iq/full", function(event)
local stanza = event.stanza
if stanza.name ~= "iq" then
return
end
local jibri = stanza:get_child('jibri', 'http://jitsi.org/protocol/jibri')
if not jibri then
return
end
module:log("info", "=== Jibri IQ intercepted ===")
local action = jibri.attr.action
local session_id = jibri.attr.session_id
local room_jid = jibri.attr.room
local recording_mode = jibri.attr.recording_mode
local app_data = jibri.attr.app_data
module:log("info", "Jibri %s - session: %s, room: %s, mode: %s",
action or "?", session_id or "?", room_jid or "?", recording_mode or "?")
if not room_jid or not session_id then
module:log("warn", "Missing room_jid or session_id")
return
end
-- Get the room using util function
local room = get_room_from_jid(room_jid_match_rewrite(jid_bare(stanza.attr.to)))
if not room then
-- Try with the room_jid directly
room = get_room_from_jid(room_jid)
end
if not room then
module:log("error", "Room not found for jid: %s", room_jid)
return
end
module:log("info", "Room found: %s", room:get_name() or room_jid)
if action == "start" then
module:log("info", "Recording START for session %s", session_id)
-- Count and snapshot participants
local participant_count = get_room_participant_count(room)
local participants = snapshot_room_participants(room)
local participant_list = {}
for _, info in pairs(participants) do
table.insert(participant_list, info)
end
active_recordings[room_jid] = {
session_id = session_id,
participants = participants,
started_at = get_timestamp()
}
write_event(session_id, {
type = "recording_started",
timestamp = get_timestamp(),
room_jid = room_jid,
room_name = room:get_name(),
session_id = session_id,
recording_mode = recording_mode,
app_data = app_data,
participant_count = participant_count,
participants_at_start = participant_list
})
elseif action == "stop" then
module:log("info", "Recording STOP for session %s", session_id)
local recording = active_recordings[room_jid]
if recording and recording.session_id == session_id then
write_event(session_id, {
type = "recording_stopped",
timestamp = get_timestamp(),
room_jid = room_jid,
room_name = room:get_name(),
session_id = session_id,
duration = get_timestamp() - recording.started_at,
participant_count = get_room_participant_count(room)
})
active_recordings[room_jid] = nil
else
module:log("warn", "No active recording found for room %s", room_jid)
end
end
end);
-- Room and participant event hooks
local function setup_room_hooks(host_module)
module:log("info", "Setting up room hooks on %s", host_module.host or "unknown")
-- Room created
host_module:hook("muc-room-created", function(event)
local room = event.room
local room_jid = room.jid
room_states[room_jid] = {
participants = {},
created_at = get_timestamp()
}
module:log("info", "Room created: %s", room_jid)
end)
-- Room destroyed
host_module:hook("muc-room-destroyed", function(event)
local room = event.room
local room_jid = room.jid
room_states[room_jid] = nil
active_recordings[room_jid] = nil
module:log("info", "Room destroyed: %s", room_jid)
end)
-- Occupant joined
host_module:hook("muc-occupant-joined", function(event)
local room = event.room
local occupant = event.occupant
local room_jid = room.jid
-- Skip recorders
if occupant.bare_jid and (occupant.bare_jid:match("^recorder@") or
occupant.bare_jid:match("^jibri@")) then
return
end
local participant_info = extract_participant_info(occupant)
-- Update room state
if room_states[room_jid] then
room_states[room_jid].participants[occupant.jid] = participant_info
end
-- Log to active recording if exists
local recording = active_recordings[room_jid]
if recording then
recording.participants[occupant.jid] = participant_info
write_event(recording.session_id, {
type = "participant_joined",
timestamp = get_timestamp(),
room_jid = room_jid,
room_name = room:get_name(),
participant = participant_info,
participant_count = get_room_participant_count(room)
})
end
module:log("info", "Participant joined %s: %s (%d total)",
room:get_name() or room_jid,
participant_info.display_name or participant_info.bare_jid,
get_room_participant_count(room))
end)
-- Occupant left
host_module:hook("muc-occupant-left", function(event)
local room = event.room
local occupant = event.occupant
local room_jid = room.jid
-- Skip recorders
if occupant.bare_jid and (occupant.bare_jid:match("^recorder@") or
occupant.bare_jid:match("^jibri@")) then
return
end
local participant_info = extract_participant_info(occupant)
-- Update room state
if room_states[room_jid] then
room_states[room_jid].participants[occupant.jid] = nil
end
-- Log to active recording if exists
local recording = active_recordings[room_jid]
if recording then
if recording.participants[occupant.jid] then
recording.participants[occupant.jid] = nil
end
write_event(recording.session_id, {
type = "participant_left",
timestamp = get_timestamp(),
room_jid = room_jid,
room_name = room:get_name(),
participant = participant_info,
participant_count = get_room_participant_count(room)
})
end
module:log("info", "Participant left %s: %s (%d remaining)",
room:get_name() or room_jid,
participant_info.display_name or participant_info.bare_jid,
get_room_participant_count(room))
end)
end
-- Module initialization
local current_host = module:get_host()
local host_type = module:get_host_type()
module:log("info", "Event Logger loading on %s (type: %s)", current_host, host_type or "unknown")
module:log("info", "Recording path: %s", recordings_path)
-- Setup room hooks based on host type
if host_type == "component" and current_host:match("^[^.]+%.") then
setup_room_hooks(module)
else
-- Try to find and hook to MUC component
local process_host_module = util.process_host_module
local muc_component_host = module:get_option_string("muc_component") or
module:get_option_string("main_muc")
if not muc_component_host then
local possible_hosts = {
"muc." .. current_host,
"conference." .. current_host,
"rooms." .. current_host
}
for _, host in ipairs(possible_hosts) do
if prosody.hosts[host] then
muc_component_host = host
module:log("info", "Auto-detected MUC component: %s", muc_component_host)
break
end
end
end
if muc_component_host then
process_host_module(muc_component_host, function(host_module, host)
module:log("info", "Hooking to MUC events on %s", host)
setup_room_hooks(host_module)
end)
else
module:log("error", "Could not find MUC component")
end
end

View File

@@ -0,0 +1,95 @@
# Data Retention and Cleanup
## Overview
For public instances of Reflector, a data retention policy is automatically enforced to delete anonymous user data after a configurable period (default: 7 days). This ensures compliance with privacy expectations and prevents unbounded storage growth.
## Configuration
### Environment Variables
- `PUBLIC_MODE` (bool): Must be set to `true` to enable automatic cleanup
- `PUBLIC_DATA_RETENTION_DAYS` (int): Number of days to retain anonymous data (default: 7)
### What Gets Deleted
When data reaches the retention period, the following items are automatically removed:
1. **Transcripts** from anonymous users (where `user_id` is NULL):
- Database records
- Local files (audio.wav, audio.mp3, audio.json waveform)
- Storage files (cloud storage if configured)
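The eligibility rule described above can be sketched as follows. This is a minimal illustration, not the actual worker code: the function name `is_eligible_for_cleanup` and its parameters are hypothetical, and it assumes a record's age is measured from a timezone-aware creation timestamp.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_eligible_for_cleanup(
    created_at: datetime,
    user_id: Optional[str],
    retention_days: int = 7,
    now: Optional[datetime] = None,
) -> bool:
    """Return True when a record is anonymous and older than the retention period.

    Hypothetical helper: the real task in reflector.worker.cleanup may differ.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    # Only anonymous data (user_id is NULL/None) is ever considered for deletion.
    return user_id is None and created_at < cutoff
```

Passing `now` explicitly keeps the check deterministic in tests; in normal use it defaults to the current UTC time.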
## Automatic Cleanup
### Celery Beat Schedule
When `PUBLIC_MODE=true`, a Celery beat task runs daily at 3 AM to clean up old data:
```python
# Automatically scheduled when PUBLIC_MODE=true
"cleanup_old_public_data": {
"task": "reflector.worker.cleanup.cleanup_old_public_data",
"schedule": crontab(hour=3, minute=0), # Daily at 3 AM
}
```
### Running the Worker
Ensure both Celery worker and beat scheduler are running:
```bash
# Start Celery worker
uv run celery -A reflector.worker.app worker --loglevel=info
# Start Celery beat scheduler (in another terminal)
uv run celery -A reflector.worker.app beat
```
## Manual Cleanup
For testing or manual intervention, use the cleanup tool:
```bash
# Delete data older than 7 days (default)
uv run python -m reflector.tools.cleanup_old_data
# Delete data older than 30 days
uv run python -m reflector.tools.cleanup_old_data --days 30
```
Note: The manual tool uses the same implementation as the Celery worker task to ensure consistency.
## Important Notes
1. **User Data Deletion**: Only anonymous data (where `user_id` is NULL) is deleted. Authenticated user data is preserved.
2. **Storage Cleanup**: The cleanup removes local files and, when cloud storage is configured, the corresponding cloud storage files.
3. **Error Handling**: If individual deletions fail, the cleanup continues and logs errors. Failed deletions are reported in the task output.
4. **Public Instance Only**: The automatic cleanup task only runs when `PUBLIC_MODE=true` to prevent accidental data loss in private deployments.
## Testing
Run the cleanup tests:
```bash
uv run pytest tests/test_cleanup.py -v
```
## Monitoring
Check Celery logs for cleanup task execution:
```bash
# Look for cleanup task logs
grep "cleanup_old_public_data" celery.log
grep "Starting cleanup of old public data" celery.log
```
Task statistics are logged after each run:
- Number of transcripts deleted
- Number of meetings deleted
- Number of orphaned recordings deleted
- Any errors encountered

View File

@@ -0,0 +1,194 @@
## Reflector GPU Transcription API (Specification)
This document defines the Reflector GPU transcription API that all implementations must adhere to. Current implementations include NVIDIA Parakeet (NeMo) and Whisper (faster-whisper), both deployed on Modal.com. The API surface and response shapes are OpenAI/Whisper-compatible, so clients can switch implementations by changing only the base URL.
### Base URL and Authentication
- Example base URLs (Modal web endpoints):
- Parakeet: `https://<account>--reflector-transcriber-parakeet-web.modal.run`
- Whisper: `https://<account>--reflector-transcriber-web.modal.run`
- All endpoints are served under `/v1` and require a Bearer token:
```
Authorization: Bearer <REFLECTOR_GPU_APIKEY>
```
Note: To switch implementations, deploy the desired variant and point `TRANSCRIPT_URL` to its base URL. The API is identical.
### Supported file types
`mp3, mp4, mpeg, mpga, m4a, wav, webm`
### Models and languages
- Parakeet (NVIDIA NeMo): default `nvidia/parakeet-tdt-0.6b-v2`
- Language support: only `en`. Other languages return HTTP 400.
- Whisper (faster-whisper): default `large-v2` (or deployment-specific)
- Language support: multilingual (per Whisper model capabilities).
Note: The `model` parameter is accepted by all implementations for interface parity. Some backends may treat it as informational.
### Endpoints
#### POST /v1/audio/transcriptions
Transcribe one or more uploaded audio files.
Request: multipart/form-data
- `file` (File) — optional. Single file to transcribe.
- `files` (File[]) — optional. One or more files to transcribe.
- `model` (string) — optional. Defaults to the implementation-specific model (see above).
- `language` (string) — optional, defaults to `en`.
- Parakeet: only `en` is accepted; other values return HTTP 400
- Whisper: model-dependent; typically multilingual
- `batch` (boolean) — optional, defaults to `false`.
Notes:
- Provide either `file` or `files`, not both. If neither is provided, HTTP 400.
- `batch` requires `files`; using `batch=true` without `files` returns HTTP 400.
- Response shape for multiple files is the same regardless of `batch`.
- Files sent to this endpoint are processed in a single pass (no VAD/chunking). This is intended for short clips (roughly ≤ 30s; depends on GPU memory/model). For longer audio, prefer `/v1/audio/transcriptions-from-url` which supports VAD-based chunking.
Responses
Single file response:
```json
{
"text": "transcribed text",
"words": [
{ "word": "hello", "start": 0.0, "end": 0.5 },
{ "word": "world", "start": 0.5, "end": 1.0 }
],
"filename": "audio.mp3"
}
```
Multiple files response:
```json
{
"results": [
{"filename": "a1.mp3", "text": "...", "words": [...]},
{"filename": "a2.mp3", "text": "...", "words": [...]}
]
}
```
Notes:
- Word objects always include keys: `word`, `start`, `end`.
- Some implementations may include a trailing space in `word` to match Whisper tokenization behavior; clients should trim if needed.
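A client that wants consistent word strings across backends might normalize them as in the sketch below. The helper name `normalize_words` is hypothetical; it only assumes the `word`/`start`/`end` keys documented above.

```python
from typing import Dict, List, Union

Word = Dict[str, Union[str, float]]


def normalize_words(words: List[Word]) -> List[Word]:
    """Strip surrounding whitespace from each `word`, leaving timings unchanged."""
    return [
        {"word": str(w["word"]).strip(), "start": w["start"], "end": w["end"]}
        for w in words
    ]
```

Returning new dictionaries rather than mutating the input keeps the raw API response available for debugging.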
Example curl (single file):
```bash
curl -X POST \
-H "Authorization: Bearer $REFLECTOR_GPU_APIKEY" \
-F "file=@/path/to/audio.mp3" \
-F "language=en" \
"$BASE_URL/v1/audio/transcriptions"
```
Example curl (multiple files, batch):
```bash
curl -X POST \
-H "Authorization: Bearer $REFLECTOR_GPU_APIKEY" \
-F "files=@/path/a1.mp3" -F "files=@/path/a2.mp3" \
-F "batch=true" -F "language=en" \
"$BASE_URL/v1/audio/transcriptions"
```
#### POST /v1/audio/transcriptions-from-url
Transcribe a single remote audio file by URL.
Request: application/json
Body parameters:
- `audio_file_url` (string) — required. URL of the audio file to transcribe.
- `model` (string) — optional. Defaults to the implementation-specific model (see above).
- `language` (string) — optional, defaults to `en`. Parakeet only accepts `en`.
- `timestamp_offset` (number) — optional, defaults to `0.0`. Added to each word's `start`/`end` in the response.
```json
{
"audio_file_url": "https://example.com/audio.mp3",
"model": "nvidia/parakeet-tdt-0.6b-v2",
"language": "en",
"timestamp_offset": 0.0
}
```
Response:
```json
{
"text": "transcribed text",
"words": [
{ "word": "hello", "start": 10.0, "end": 10.5 },
{ "word": "world", "start": 10.5, "end": 11.0 }
]
}
```
Notes:
- `timestamp_offset` is added to each word's `start`/`end` in the response.
- Implementations may perform VAD-based chunking and batching for long-form audio; word timings are adjusted accordingly.
Example curl:
```bash
curl -X POST \
-H "Authorization: Bearer $REFLECTOR_GPU_APIKEY" \
-H "Content-Type: application/json" \
-d '{
"audio_file_url": "https://example.com/audio.mp3",
"language": "en",
"timestamp_offset": 0
}' \
"$BASE_URL/v1/audio/transcriptions-from-url"
```
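The same request can be built from Python using only the standard library. This is a sketch under stated assumptions: the function name `build_transcription_request` is hypothetical, and the base URL and API key come from your own deployment.

```python
import json
import urllib.request


def build_transcription_request(
    base_url: str,
    api_key: str,
    audio_file_url: str,
    language: str = "en",
    timestamp_offset: float = 0.0,
) -> urllib.request.Request:
    """Build a POST request for /v1/audio/transcriptions-from-url."""
    body = json.dumps(
        {
            "audio_file_url": audio_file_url,
            "language": language,
            "timestamp_offset": timestamp_offset,
        }
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/audio/transcriptions-from-url",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Sending the request requires a live deployment, for example:
# req = build_transcription_request(BASE_URL, API_KEY, "https://example.com/audio.mp3")
# with urllib.request.urlopen(req, timeout=300) as resp:
#     result = json.load(resp)  # {"text": ..., "words": [...]}
```

The `timeout` value in the commented usage is an arbitrary placeholder; long-form audio may need more time depending on the deployment.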
### Error handling
- 400 Bad Request
- Parakeet: `language` other than `en`
- Missing required parameters (`file`/`files` for upload; `audio_file_url` for URL endpoint)
- Unsupported file extension
- 401 Unauthorized
- Missing or invalid Bearer token
- 404 Not Found
- The file referenced by `audio_file_url` does not exist
### Implementation details
- GPUs: A10G for small-file/live, L40S for large-file URL transcription (subject to deployment)
- VAD chunking and segment batching; word timings adjusted and overlapping ends constrained
- Pads very short segments (< 0.5s) to avoid model crashes on some backends
### Server configuration (Reflector API)
Set the Reflector server to use the Modal backend and point `TRANSCRIPT_URL` to your chosen deployment:
```
TRANSCRIPT_BACKEND=modal
TRANSCRIPT_URL=https://<account>--reflector-transcriber-parakeet-web.modal.run
TRANSCRIPT_MODAL_API_KEY=<REFLECTOR_GPU_APIKEY>
```
### Conformance tests
Use the pytest-based conformance tests to validate any new implementation (including self-hosted) against this spec:
```
TRANSCRIPT_URL=https://<your-deployment-base> \
TRANSCRIPT_MODAL_API_KEY=your-api-key \
uv run -m pytest -m gpu_modal --no-cov server/tests/test_gpu_modal_transcript.py
```

View File

@@ -0,0 +1,493 @@
# Jitsi Integration Configuration Guide
This guide provides step-by-step instructions for configuring Reflector to work with a self-hosted Jitsi Meet installation for video meetings and recording.
## Prerequisites
Before configuring Jitsi integration, ensure you have:
- **Self-hosted Jitsi Meet installation** (version 2.0.8922 or later recommended)
- **Jibri recording service** configured and running
- **Prosody XMPP server** with mod_event_sync module installed
- **Docker or system deployment** of Reflector with access to environment variables
- **SSL certificates** for secure communication between services
## Environment Configuration
Add the following environment variables to your Reflector deployment:
### Required Settings
```bash
# Jitsi Meet domain (without https://)
JITSI_DOMAIN=meet.example.com
# JWT secret for room authentication (generate with: openssl rand -hex 32)
JITSI_JWT_SECRET=your-64-character-hex-secret-here
# Webhook secret for secure event handling (generate with: openssl rand -hex 16)
JITSI_WEBHOOK_SECRET=your-32-character-hex-secret-here
# Application identifier (should match Jitsi configuration)
JITSI_APP_ID=reflector
# JWT issuer and audience (should match Jitsi configuration)
JITSI_JWT_ISSUER=reflector
JITSI_JWT_AUDIENCE=jitsi
```
### Example .env Configuration
```bash
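# Add to your server/.env file
# Most .env loaders do not run shell command substitution, so paste the
# literal output of the openssl commands rather than $(...) expressions.
JITSI_DOMAIN=meet.mycompany.com
JITSI_JWT_SECRET=<output of: openssl rand -hex 32>
JITSI_WEBHOOK_SECRET=<output of: openssl rand -hex 16>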
JITSI_APP_ID=reflector
JITSI_JWT_ISSUER=reflector
JITSI_JWT_AUDIENCE=jitsi
```
## Jitsi Meet Server Configuration
### 1. JWT Authentication Setup
Edit `/etc/prosody/conf.d/[YOUR_DOMAIN].cfg.lua`:
```lua
VirtualHost "meet.example.com"
authentication = "token"
app_id = "reflector"
app_secret = "your-jwt-secret-here"
-- Do not require encrypted client-to-server (c2s) connections
c2s_require_encryption = false
admins = { "focusUser@auth.meet.example.com" }
modules_enabled = {
"bosh";
"pubsub";
"ping";
"roster";
"saslauth";
"tls";
"dialback";
"disco";
"carbons";
"pep";
"private";
"blocklist";
"vcard";
"version";
"uptime";
"time";
"register";
"admin_adhoc";
"token_verification";
"event_sync"; -- Required for webhook events
}
```
### 2. Room Access Control
Edit `/etc/jitsi/meet/meet.example.com-config.js`:
```javascript
var config = {
hosts: {
domain: 'meet.example.com',
muc: 'conference.meet.example.com'
},
// Enable JWT authentication
enableUserRolesBasedOnToken: true,
// Recording configuration
fileRecordingsEnabled: true,
liveStreamingEnabled: false,
// Reflector-specific settings
prejoinPageEnabled: true,
requireDisplayName: true,
};
```
### 3. Interface Configuration
Edit `/usr/share/jitsi-meet/interface_config.js`:
```javascript
var interfaceConfig = {
// Customize for Reflector branding
APP_NAME: 'Reflector Meeting',
DEFAULT_WELCOME_PAGE_LOGO_URL: 'https://your-domain.com/logo.png',
// Toolbar buttons to display (remove entries to hide them)
TOOLBAR_BUTTONS: [
'microphone', 'camera', 'closedcaptions', 'desktop',
'fullscreen', 'fodeviceselection', 'hangup',
'chat', 'recording', 'livestreaming', 'etherpad',
'sharedvideo', 'settings', 'raisehand', 'videoquality',
'filmstrip', 'invite', 'feedback', 'stats', 'shortcuts',
'tileview', 'videobackgroundblur', 'download', 'help',
'mute-everyone'
]
};
```
## Jibri Configuration
### 1. Recording Service Setup
Edit `/etc/jitsi/jibri/jibri.conf`:
```hocon
jibri {
recording {
recordings-directory = "/var/recordings"
finalize-script = "/opt/jitsi/jibri/finalize.sh"
}
api {
xmpp {
environments = [{
name = "prod environment"
xmpp-server-hosts = ["meet.example.com"]
xmpp-domain = "meet.example.com"
control-muc {
domain = "internal.auth.meet.example.com"
room-name = "JibriBrewery"
nickname = "jibri-nickname"
}
control-login {
domain = "auth.meet.example.com"
username = "jibri"
password = "jibri-password"
}
}]
}
}
}
```
### 2. Finalize Script Setup
Create `/opt/jitsi/jibri/finalize.sh`:
```bash
#!/bin/bash
# Jibri finalize script for Reflector integration
RECORDING_FILE="$1"
ROOM_NAME="$2"
REFLECTOR_API_URL="${REFLECTOR_API_URL:-http://localhost:1250}"
WEBHOOK_SECRET="${JITSI_WEBHOOK_SECRET}"
# Generate webhook signature
generate_signature() {
local payload="$1"
echo -n "$payload" | openssl dgst -sha256 -hmac "$WEBHOOK_SECRET" | cut -d' ' -f2
}
# Prepare webhook payload
TIMESTAMP=$(date -u +%Y-%m-%dT%H:%M:%S.%3NZ)
PAYLOAD=$(cat <<EOF
{
"room_name": "$ROOM_NAME",
"recording_file": "$RECORDING_FILE",
"recording_status": "completed",
"timestamp": "$TIMESTAMP"
}
EOF
)
# Generate signature
SIGNATURE=$(generate_signature "$PAYLOAD")
# Send webhook to Reflector
curl -X POST "$REFLECTOR_API_URL/v1/jibri/recording-complete" \
-H "Content-Type: application/json" \
-H "X-Jitsi-Signature: $SIGNATURE" \
-d "$PAYLOAD" \
--max-time 30
echo "Recording finalization webhook sent for room: $ROOM_NAME"
```
Make the script executable:
```bash
chmod +x /opt/jitsi/jibri/finalize.sh
```
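On the receiving side, the signature produced by the finalize script can be checked by recomputing the HMAC over the raw request body. The sketch below shows the idea; the function name `verify_jitsi_signature` is hypothetical, and it assumes the header carries a hex-encoded HMAC-SHA256 digest, as generated by the `openssl dgst` call in the script. How Reflector's endpoint actually validates the header is not shown in this guide.

```python
import hashlib
import hmac


def verify_jitsi_signature(raw_body: bytes, signature_header: str, secret: str) -> bool:
    """Compare the X-Jitsi-Signature header with an HMAC-SHA256 of the raw body."""
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    received = signature_header.strip().lower()
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))
```

The check must run on the exact bytes received, before any JSON parsing or re-serialization, because whitespace changes alter the digest.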
## Prosody Event Configuration
### 1. Event-Sync Module Installation
Install the mod_event_sync module:
```bash
# Download the module
cd /usr/share/jitsi-meet/prosody-plugins/
wget https://raw.githubusercontent.com/jitsi-contrib/prosody-plugins/main/mod_event_sync.lua
# Or if using git
git clone https://github.com/jitsi-contrib/prosody-plugins.git
cp prosody-plugins/mod_event_sync.lua /usr/share/jitsi-meet/prosody-plugins/
```
### 2. Webhook Configuration
Add to `/etc/prosody/conf.d/[YOUR_DOMAIN].cfg.lua`:
```lua
Component "conference.meet.example.com" "muc"
storage = "memory"
modules_enabled = {
"muc_meeting_id";
"muc_domain_mapper";
"polls";
"event_sync"; -- Enable event sync
}
-- Event sync webhook configuration
event_sync_url = "https://your-reflector-domain.com/v1/jitsi/events"
event_sync_secret = "your-webhook-secret-here"
-- Events to track
event_sync_events = {
"muc-occupant-joined",
"muc-occupant-left",
"jibri-recording-on",
"jibri-recording-off"
}
```
### 3. Restart Services
After configuration changes, restart all services:
```bash
systemctl restart prosody
systemctl restart jicofo
systemctl restart jitsi-videobridge2
systemctl restart jibri
systemctl restart nginx
```
## Reflector Room Configuration
### 1. Create Jitsi Room
When creating rooms in Reflector, set the platform field:
```bash
curl -X POST "https://your-reflector-domain.com/v1/rooms" \
-H "Authorization: Bearer $AUTH_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"name": "my-jitsi-room",
"platform": "jitsi",
"recording_type": "cloud",
"recording_trigger": "automatic-2nd-participant",
"is_locked": false,
"room_mode": "normal"
}'
```
### 2. Meeting Creation
Meetings will automatically use Jitsi when the room platform is set to "jitsi":
```bash
curl -X POST "https://your-reflector-domain.com/v1/rooms/my-jitsi-room/meeting" \
-H "Authorization: Bearer $AUTH_TOKEN"
```
## Testing the Integration
### 1. Health Check
Verify Jitsi webhook configuration:
```bash
curl "https://your-reflector-domain.com/v1/jitsi/health"
```
Expected response:
```json
{
"status": "ok",
"service": "jitsi-webhooks",
"timestamp": "2025-01-15T10:30:00.000Z",
"webhook_secret_configured": true
}
```
### 2. Room Creation Test
1. Create a Jitsi room via Reflector API
2. Start a meeting - should generate Jitsi Meet URL with JWT token
3. Join with multiple participants - should trigger participant events
4. Start recording - should trigger Jibri recording workflow
### 3. Webhook Event Test
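Send a sample event to the webhook endpoint and monitor the Reflector logs to confirm it is received. The `test-signature` value is a placeholder; if signature validation is enforced, replace it with a valid HMAC of the request body: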
```bash
# Send a sample participant event
curl -X POST "https://your-reflector-domain.com/v1/jitsi/events" \
-H "Content-Type: application/json" \
-H "X-Jitsi-Signature: test-signature" \
-d '{
"event": "muc-occupant-joined",
"room": "test-room-name",
"timestamp": "2025-01-15T10:30:00.000Z",
"data": {}
}'
```
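A valid signature for such a test event could be computed as in the following sketch. It assumes the same scheme as the finalize script (hex HMAC-SHA256 of the exact request body, keyed with `JITSI_WEBHOOK_SECRET`); the function name `sign_test_event` is hypothetical.

```python
import hashlib
import hmac
import json
from typing import Any, Dict, Tuple


def sign_test_event(secret: str, event: Dict[str, Any]) -> Tuple[bytes, str]:
    """Serialize an event and return (body, hex signature) for X-Jitsi-Signature."""
    body = json.dumps(event, separators=(",", ":")).encode("utf-8")
    signature = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return body, signature
```

Send the returned `body` bytes unchanged as the request payload; signing one serialization and sending another produces a mismatch.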
## Troubleshooting
### Common Issues
#### JWT Authentication Failures
**Symptoms:** Users can't join rooms, "Authentication failed" errors
**Solutions:**
1. Verify JWT secret matches between Jitsi and Reflector
2. Check JWT token expiration (default 8 hours)
3. Ensure system clocks are synchronized
4. Validate JWT issuer/audience configuration
```bash
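# Debug JWT tokens (the payload is base64url-encoded without padding)
payload=$(echo "JWT_TOKEN_HERE" | cut -d'.' -f2 | tr '_-' '/+')
while [ $(( ${#payload} % 4 )) -ne 0 ]; do payload="${payload}="; done
echo "$payload" | base64 -d | jq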
```
#### Webhook Events Not Received
**Symptoms:** Participant counts not updating, recording events missing
**Solutions:**
1. Verify event_sync module is loaded in Prosody
2. Check webhook URL accessibility from Jitsi server
3. Validate webhook signature generation
4. Review Prosody and Reflector logs
```bash
# Test webhook connectivity
curl -v "https://your-reflector-domain.com/v1/jitsi/health"
# Check Prosody logs
tail -f /var/log/prosody/prosody.log
# Check Reflector logs
docker logs your-reflector-container
```
#### Recording Issues
**Symptoms:** Recordings not starting, finalize script errors
**Solutions:**
1. Verify Jibri service status and configuration
2. Check recording directory permissions
3. Validate finalize script execution permissions
4. Monitor Jibri logs for errors
```bash
# Check Jibri status
systemctl status jibri
# Test finalize script
sudo -u jibri /opt/jitsi/jibri/finalize.sh "/test/recording.mp4" "test-room"
# Check Jibri logs
journalctl -u jibri -f
```
### Debug Commands
```bash
# Verify Jitsi configuration
prosodyctl check config
# Test JWT generation
curl -X POST "https://your-reflector-domain.com/v1/rooms/test/meeting" \
-H "Authorization: Bearer $TOKEN" -v
# Monitor webhook events
tail -f /var/log/reflector/app.log | grep jitsi
# Check room participant counts
curl "https://your-reflector-domain.com/v1/rooms" \
-H "Authorization: Bearer $TOKEN" | jq '.data[].num_clients'
```
### Performance Optimization
#### For High-Concurrency Usage
1. **Jitsi Videobridge Tuning:**
```bash
# /etc/jitsi/videobridge/sip-communicator.properties
org.jitsi.videobridge.STATISTICS_INTERVAL=5000
org.jitsi.videobridge.load.INITIAL_STREAM_LIMIT=50
```
2. **Database Connection Pooling:**
```python
# In your Reflector settings
DATABASE_POOL_SIZE=20
DATABASE_MAX_OVERFLOW=30
```
3. **Redis Configuration:**
```bash
# For webhook event caching
REDIS_URL=redis://localhost:6379/1
WEBHOOK_EVENT_TTL=3600
```
## Security Considerations
### Network Security
- Use HTTPS/WSS for all communications
- Implement proper firewall rules
- Consider VPN for server-to-server communication
### Authentication Security
- Rotate JWT secrets regularly
- Use strong webhook secrets (32+ characters)
- Implement rate limiting on webhook endpoints
### Recording Security
- Encrypt recordings at rest
- Implement access controls for recording files
- Regular security audits of file permissions
## Support
For additional support:
1. **Reflector Issues:** Check GitHub issues or create new ones
2. **Jitsi Community:** [Community Forum](https://community.jitsi.org/)
3. **Documentation:** [Jitsi Developer Guide](https://jitsi.github.io/handbook/)
## Migration from Whereby
If migrating from Whereby integration:
1. Update existing rooms to use "jitsi" platform
2. Verify webhook configurations are updated
3. Test recording workflows thoroughly
4. Monitor participant event accuracy
5. Update any custom integrations using meeting APIs
The platform abstraction layer ensures smooth migration with minimal API changes.

212
server/docs/webhook.md Normal file
View File

@@ -0,0 +1,212 @@
# Reflector Webhook Documentation
## Overview
Reflector can send webhook notifications to external systems when transcript processing is completed. Webhooks are configured per room and are triggered automatically after a transcript is successfully processed.
## Configuration
Webhooks are configured at the room level with two fields:
- `webhook_url`: The HTTPS endpoint to receive webhook notifications
- `webhook_secret`: Optional secret key for HMAC signature verification (auto-generated if not provided)
## Events
### `transcript.completed`
Triggered when a transcript has been fully processed, including transcription, diarization, summarization, and topic detection.
### `test`
A test event that can be triggered manually to verify webhook configuration.
## Webhook Request Format
### Headers
All webhook requests include the following headers:
| Header | Description | Example |
|--------|-------------|---------|
| `Content-Type` | Always `application/json` | `application/json` |
| `User-Agent` | Identifies Reflector as the source | `Reflector-Webhook/1.0` |
| `X-Webhook-Event` | The event type | `transcript.completed` or `test` |
| `X-Webhook-Retry` | Current retry attempt number | `0`, `1`, `2`... |
| `X-Webhook-Signature` | HMAC signature (if secret configured) | `t=1735306800,v1=abc123...` |
### Signature Verification
If a webhook secret is configured, Reflector includes an HMAC-SHA256 signature in the `X-Webhook-Signature` header so that you can verify the webhook's authenticity.
The signature format is: `t={timestamp},v1={signature}`
To verify the signature:
1. Extract the timestamp and signature from the header
2. Create the signed payload: `{timestamp}.{request_body}`
3. Compute HMAC-SHA256 of the signed payload using your webhook secret
4. Compare the computed signature with the received signature
Example verification (Python):
```python
import hmac
import hashlib
def verify_webhook_signature(payload: bytes, signature_header: str, secret: str) -> bool:
# Parse header: "t=1735306800,v1=abc123..."
parts = dict(part.split("=", 1) for part in signature_header.split(","))
timestamp = parts["t"]
received_signature = parts["v1"]
# Create signed payload
signed_payload = f"{timestamp}.{payload.decode('utf-8')}"
# Compute expected signature
expected_signature = hmac.new(
secret.encode("utf-8"),
signed_payload.encode("utf-8"),
hashlib.sha256
).hexdigest()
# Compare signatures
return hmac.compare_digest(expected_signature, received_signature)
```
## Event Payloads
### `transcript.completed` Event
This event includes a convenient URL for accessing the transcript:
- `frontend_url`: Direct link to view the transcript in the web interface
```json
{
"event": "transcript.completed",
"event_id": "transcript.completed-abc-123-def-456",
"timestamp": "2025-08-27T12:34:56.789012Z",
"transcript": {
"id": "abc-123-def-456",
"room_id": "room-789",
"created_at": "2025-08-27T12:00:00Z",
"duration": 1800.5,
"title": "Q3 Product Planning Meeting",
"short_summary": "Team discussed Q3 product roadmap, prioritizing mobile app features and API improvements.",
"long_summary": "The product team met to finalize the Q3 roadmap. Key decisions included...",
"webvtt": "WEBVTT\n\n00:00:00.000 --> 00:00:05.000\n<v Speaker 1>Welcome everyone to today's meeting...",
"topics": [
{
"title": "Introduction and Agenda",
"summary": "Meeting kickoff with agenda review",
"timestamp": 0.0,
"duration": 120.0,
"webvtt": "WEBVTT\n\n00:00:00.000 --> 00:00:05.000\n<v Speaker 1>Welcome everyone..."
},
{
"title": "Mobile App Features Discussion",
"summary": "Team reviewed proposed mobile app features for Q3",
"timestamp": 120.0,
"duration": 600.0,
"webvtt": "WEBVTT\n\n00:02:00.000 --> 00:02:10.000\n<v Speaker 2>Let's talk about the mobile app..."
}
],
"participants": [
{
"id": "participant-1",
"name": "John Doe",
"speaker": "Speaker 1"
},
{
"id": "participant-2",
"name": "Jane Smith",
"speaker": "Speaker 2"
}
],
"source_language": "en",
"target_language": "en",
"status": "completed",
"frontend_url": "https://app.reflector.com/transcripts/abc-123-def-456"
},
"room": {
"id": "room-789",
"name": "Product Team Room"
}
}
```
### `test` Event
```json
{
"event": "test",
"event_id": "test.2025-08-27T12:34:56.789012Z",
"timestamp": "2025-08-27T12:34:56.789012Z",
"message": "This is a test webhook from Reflector",
"room": {
"id": "room-789",
"name": "Product Team Room"
}
}
```
## Retry Policy
Webhooks are delivered with automatic retry logic to handle transient failures. When a webhook delivery fails due to server errors or network issues, Reflector will automatically retry the delivery multiple times over an extended period.
### Retry Mechanism
Reflector implements an exponential backoff strategy for webhook retries:
- **Initial retry delay**: 60 seconds after the first failure
- **Exponential backoff**: Each subsequent retry waits approximately twice as long as the previous one
- **Maximum retry interval**: 1 hour (backoff is capped at this duration)
- **Maximum retry attempts**: 30 attempts total
- **Total retry duration**: Retries continue for approximately 24 hours
### How Retries Work
When a webhook fails, Reflector will:
1. Wait 60 seconds, then retry (attempt #1)
2. If it fails again, wait ~2 minutes, then retry (attempt #2)
3. Continue doubling the wait time up to a maximum of 1 hour between attempts
4. Keep retrying at 1-hour intervals until successful or 30 attempts are exhausted
The `X-Webhook-Retry` header indicates the current retry attempt number (0 for the initial attempt, 1 for first retry, etc.), allowing your endpoint to track retry attempts.
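The retry timing can be approximated with a short calculation. The sketch below assumes exact doubling from 60 seconds, a hard cap of 3600 seconds, and that all 30 attempts are retries; the real scheduler only doubles "approximately", so the numbers are indicative. The function name `retry_delays` is hypothetical.

```python
from typing import List


def retry_delays(max_retries: int = 30, initial: int = 60, cap: int = 3600) -> List[int]:
    """Delay in seconds before each retry: doubling from `initial`, capped at `cap`."""
    return [min(initial * 2 ** n, cap) for n in range(max_retries)]


delays = retry_delays()
total_hours = sum(delays) / 3600
```

Under these assumptions the first six delays are below the cap and the remaining 24 retries are spaced one hour apart, for a total of roughly 25 hours, which is in line with the approximate 24-hour retry window stated above.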
### Retry Behavior by HTTP Status Code
| Status Code | Behavior |
|-------------|----------|
| 2xx (Success) | No retry, webhook marked as delivered |
| 4xx (Client Error) | No retry, request is considered permanently failed |
| 5xx (Server Error) | Automatic retry with exponential backoff |
| Network/Timeout Error | Automatic retry with exponential backoff |
**Important Notes:**
- Webhook requests time out after 30 seconds. If your endpoint takes longer to respond, the delivery is treated as a timeout error and retried.
- During the retry period (~24 hours), you may receive the same webhook multiple times if your endpoint experiences intermittent failures.
- There is no mechanism to manually retry failed webhooks after the retry period expires.
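Because the same webhook may arrive more than once, receivers usually need to be idempotent. One simple approach, sketched below, is to remember the `event_id` of each processed event and skip repeats. The class name `WebhookDeduplicator` is hypothetical, and the in-memory set is an assumption for illustration; a real receiver would typically use a persistent store so that deduplication survives restarts.

```python
from typing import Set


class WebhookDeduplicator:
    """Tracks processed event IDs so that redelivered webhooks are handled once."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()

    def should_process(self, event_id: str) -> bool:
        """Return True the first time an event_id is seen, False afterwards."""
        if event_id in self._seen:
            return False
        self._seen.add(event_id)
        return True
```

A handler would call `should_process(payload["event_id"])` after verifying the signature and return a 2xx response in both cases, so that Reflector stops retrying a delivery that has already been handled.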
## Testing Webhooks
You can test your webhook configuration before processing transcripts:
```http
POST /v1/rooms/{room_id}/webhook/test
```
Response:
```json
{
"success": true,
"status_code": 200,
"message": "Webhook test successful",
"response_preview": "OK"
}
```
Or in case of failure:
```json
{
"success": false,
"error": "Webhook request timed out (10 seconds)"
}
```

View File

@@ -7,11 +7,9 @@
## User authentication
## =======================================================
## Using fief (fief.dev)
AUTH_BACKEND=fief
AUTH_FIEF_URL=https://auth.reflector.media/reflector-local
AUTH_FIEF_CLIENT_ID=***REMOVED***
AUTH_FIEF_CLIENT_SECRET=<ask in zulip>
## Using jwt/authentik
AUTH_BACKEND=jwt
AUTH_JWT_AUDIENCE=
## =======================================================
## Transcription backend
@@ -22,24 +20,24 @@ AUTH_FIEF_CLIENT_SECRET=<ask in zulip>
## Using local whisper
#TRANSCRIPT_BACKEND=whisper
#WHISPER_MODEL_SIZE=tiny
## Using serverless modal.com (require reflector-gpu-modal deployed)
#TRANSCRIPT_BACKEND=modal
#TRANSCRIPT_URL=https://xxxxx--reflector-transcriber-web.modal.run
#TRANSLATE_URL=https://xxxxx--reflector-translator-web.modal.run
#TRANSCRIPT_MODAL_API_KEY=xxxxx
TRANSCRIPT_BACKEND=modal
TRANSCRIPT_URL=https://monadical-sas--reflector-transcriber-web.modal.run
TRANSCRIPT_MODAL_API_KEY=***REMOVED***
TRANSCRIPT_MODAL_API_KEY=
## =======================================================
## Transcription backend
## Translation backend
##
## Only available in modal atm
## =======================================================
TRANSLATION_BACKEND=modal
TRANSLATE_URL=https://monadical-sas--reflector-translator-web.modal.run
#TRANSLATION_MODAL_API_KEY=xxxxx
## =======================================================
## LLM backend
@@ -49,28 +47,11 @@ TRANSLATE_URL=https://monadical-sas--reflector-translator-web.modal.run
## llm backend implementation
## =======================================================
## Using serverless modal.com (require reflector-gpu-modal deployed)
LLM_BACKEND=modal
LLM_URL=https://monadical-sas--reflector-llm-web.modal.run
LLM_MODAL_API_KEY=***REMOVED***
ZEPHYR_LLM_URL=https://monadical-sas--reflector-llm-zephyr-web.modal.run
## Using OpenAI
#LLM_BACKEND=openai
#LLM_OPENAI_KEY=xxx
#LLM_OPENAI_MODEL=gpt-3.5-turbo
## Using GPT4ALL
#LLM_BACKEND=openai
#LLM_URL=http://localhost:4891/v1/completions
#LLM_OPENAI_MODEL="GPT4All Falcon"
## Default LLM MODEL NAME
#DEFAULT_LLM=lmsys/vicuna-13b-v1.5
## Cache directory to store models
CACHE_DIR=data
## Context size for summary generation (tokens)
# LLM_MODEL=microsoft/phi-4
LLM_CONTEXT_WINDOW=16000
LLM_URL=
LLM_API_KEY=sk-
## =======================================================
## Diarization
@@ -79,7 +60,9 @@ CACHE_DIR=data
## To allow diarization, you need to expose the files to be downloaded by the pipeline
## =======================================================
DIARIZATION_ENABLED=false
DIARIZATION_BACKEND=modal
DIARIZATION_URL=https://monadical-sas--reflector-diarizer-web.modal.run
#DIARIZATION_MODAL_API_KEY=xxxxx
## =======================================================
@@ -88,4 +71,3 @@ DIARIZATION_URL=https://monadical-sas--reflector-diarizer-web.modal.run
## Sentry DSN configuration
#SENTRY_DSN=


@@ -1,204 +0,0 @@
import re
from pathlib import Path
from typing import Any, Iterator, List

from jiwer import wer
from Levenshtein import distance
from pydantic import BaseModel, Field, field_validator
from tqdm.auto import tqdm
from whisper.normalizers import EnglishTextNormalizer


class EvaluationResult(BaseModel):
    """
    Result object of the model evaluation
    """

    accuracy: float = Field(default=0.0)
    total_test_samples: int = Field(default=0)


class EvaluationTestSample(BaseModel):
    """
    Represents one test sample
    """

    reference_text: str
    predicted_text: str

    def update(self, reference_text: str, predicted_text: str) -> None:
        self.reference_text = reference_text
        self.predicted_text = predicted_text


class TestDatasetLoader(BaseModel):
    """
    Test samples loader
    """

    test_dir: Path = Field(default=Path(__file__).parent)
    total_samples: int = Field(default=0)

    @field_validator("test_dir")
    @classmethod
    def validate_file_path(cls, path):
        """
        Check the file path
        """
        if not path.exists():
            raise ValueError("Path does not exist")
        return path

    def _load_test_data(self) -> Iterator[tuple[Path, Path]]:
        """
        Pair each predicted-text file with its reference file and
        yield (reference path, predicted path)
        """
        predicted_dir = self.test_dir / "predicted_texts"
        reference_dir = self.test_dir / "reference_texts"
        # Restart the count so a second pass does not double it
        self.total_samples = 0

        # iterdir() already yields paths that include the directory
        for pred_file_path in predicted_dir.iterdir():
            match = re.search(r"(\d+)\.txt$", pred_file_path.name)
            if match:
                sample_id = match.group(1)
                ref_file_path = reference_dir / f"ref_sample_{sample_id}.txt"
                if ref_file_path.exists():
                    self.total_samples += 1
                    yield ref_file_path, pred_file_path

    def __iter__(self) -> Iterator[EvaluationTestSample]:
        """
        Iter method for the test loader
        """
        # Unpack in the same (reference, predicted) order that
        # _load_test_data yields
        for ref_file_path, pred_file_path in self._load_test_data():
            with open(pred_file_path, "r", encoding="utf-8") as file:
                pred_text = file.read()
            with open(ref_file_path, "r", encoding="utf-8") as file:
                ref_text = file.read()
            yield EvaluationTestSample(reference_text=ref_text, predicted_text=pred_text)


class EvaluationConfig(BaseModel):
    """
    Model for evaluation parameters
    """

    insertion_penalty: int = Field(default=1)
    substitution_penalty: int = Field(default=1)
    deletion_penalty: int = Field(default=1)
    normalizer: Any = Field(default=EnglishTextNormalizer())
    test_directory: str = Field(default=str(Path(__file__).parent))


class ModelEvaluator:
    """
    Class that comprises all model evaluation related processes and methods
    """

    # The 2 popular methods of WER differ slightly. More dimensions of accuracy
    # will be added. For now, the average of these 2 will serve as the metric.

    def __init__(self, **kwargs):
        self.evaluation_config = EvaluationConfig(**kwargs)
        self.test_dataset_loader = TestDatasetLoader(test_dir=self.evaluation_config.test_directory)
        # Per-instance state, so separate evaluators do not share results
        self.evaluation_result = EvaluationResult()
        self.WER_LEVENSHTEIN = []
        self.WER_JIWER = []

    def __repr__(self):
        return f"ModelEvaluator({self.evaluation_config})"

    def describe(self) -> dict:
        """
        Returns the parameters defining the evaluator
        """
        return self.evaluation_config.model_dump()

    def _normalize(self, sample: EvaluationTestSample) -> None:
        """
        Normalize both reference and predicted text
        """
        sample.update(
            self.evaluation_config.normalizer(sample.reference_text),
            self.evaluation_config.normalizer(sample.predicted_text),
        )

    def _calculate_wer(self, sample: EvaluationTestSample) -> float:
        """
        Based on weights for (insert, delete, substitute), calculate
        the Word Error Rate
        """
        # Split on whitespace so the distance counts word edits
        # rather than character edits
        reference_words = sample.reference_text.split()
        predicted_words = sample.predicted_text.split()
        levenshtein_distance = distance(
            s1=reference_words,
            s2=predicted_words,
            weights=(
                self.evaluation_config.insertion_penalty,
                self.evaluation_config.deletion_penalty,
                self.evaluation_config.substitution_penalty,
            ),
        )
        return levenshtein_distance / len(reference_words)

    def _calculate_wers(self) -> None:
        """
        Compute WER
        """
        # Start from empty lists so a recalculation does not
        # count earlier samples twice
        self.WER_LEVENSHTEIN = []
        self.WER_JIWER = []
        for sample in tqdm(self.test_dataset_loader, desc="Evaluating"):
            self._normalize(sample)
            no_of_words = len(sample.reference_text.split())
            wer_item_l = {
                "wer": self._calculate_wer(sample),
                "no_of_words": no_of_words,
            }
            wer_item_j = {
                "wer": wer(sample.reference_text, sample.predicted_text),
                "no_of_words": no_of_words,
            }
            self.WER_LEVENSHTEIN.append(wer_item_l)
            self.WER_JIWER.append(wer_item_j)

    def _calculate_weighted_wer(self, wers: List[dict]) -> float:
        """
        Calculate the weighted WER from WER
        """
        total_wer = 0.0
        total_words = 0.0
        for item in wers:
            total_wer += item["no_of_words"] * item["wer"]
            total_words += item["no_of_words"]
        return total_wer / total_words

    def _calculate_model_accuracy(self) -> None:
        """
        Compute model accuracy
        """
        self._calculate_wers()
        weighted_wer_levenshtein = self._calculate_weighted_wer(self.WER_LEVENSHTEIN)
        weighted_wer_jiwer = self._calculate_weighted_wer(self.WER_JIWER)

        final_weighted_wer = (weighted_wer_levenshtein + weighted_wer_jiwer) / 2
        self.evaluation_result.accuracy = (1 - final_weighted_wer) * 100

    def evaluate(self, recalculate: bool = False) -> EvaluationResult:
        """
        Triggers the model evaluation
        """
        if not self.evaluation_result.accuracy or recalculate:
            self._calculate_model_accuracy()
        return EvaluationResult(
            accuracy=self.evaluation_result.accuracy,
            total_test_samples=self.test_dataset_loader.total_samples,
        )


if __name__ == "__main__":
    eval_config = {"insertion_penalty": 1, "deletion_penalty": 2, "substitution_penalty": 1}

    evaluator = ModelEvaluator(**eval_config)
    evaluation = evaluator.evaluate()

    print(evaluator)
    print(evaluation)
    print("Model accuracy : {:.2f} %".format(evaluation.accuracy))



@@ -1,620 +0,0 @@
Technologies ticker symbol w-e-l-l on
the TSX recently reported its 2023 q1
results beating the streets consensus
estimate for revenue and adjusted ebitda
and in a report issued this week Raymond
James analyst said quote we're impressed
by Wells capacity to drive powerful
growth across its diverse business units
in the absence of M&A joining me today
is CEO Hamed Shahbazi to look at what's
next for well health good to see you sir
how are you great to see you Richard
thanks very much for having me great to
have you uh congratulations on your 17th
consecutive quarter of record Revenue
can you share some insights into what's
Driven these results historically and in
the past quarter as well
yeah thank you we we're very excited
about our uh q1 2023 results and as you
mentioned uh we've had a long you know
successful uh string of of uh you know
continued growth and record growth
um we also had accelerating organic
growth and I think um a big part of the
success of our franchise here is the
incredibly sticky and predictable
Revenue that we have you know well over
90 percent of our business is either highly
reoccurring as in uh the you know highly
predictable uh results of our two-sided
network of patients and providers or
truly recurring as in scheduled or
subscribed revenues and this allows us
to essentially make sure that that uh
you know we're on track it obviously you
know like any other business things
happen uh and sometimes it's hard to
meet those results but what's really
being unique about our platform is we do
have exposure to all kinds of different
aspects of healthcare you know we have
primary care and specialized care
on both sides of the Border in the US
and Canada so we have exposure to
different types of business models we
have exposure to the U.S payer Network
which has higher per unit economics than
Canada and of course the stability and
uh and and sort of higher Fidelity uh
kind of Collections and revenue cycle
process that Canada has over the United
States where you don't have to kind of
deal with all of that uh at that payment
noise so just a lot of I think strength
built into the platform because of the
diversity of different Healthcare
businesses that we support
and uh where do you see Well's future
growth coming from which part of the
business uh excites you the most right
now yeah well look the centrifugal force
of well is the healthcare provider and
we exist to uh Tech enable and
ameliorate the business of that of that
Tech of that healthcare provider uh and
and and that's what we're laser focused
on and and what we're seeing is
providers not wanting to run businesses
anymore it's very simple and so we have
a digital platform and providers can
either acquire what they want and need
from our digital platform and implement
it themselves
or they can decide that they don't want
to run a business anymore they don't
want to configure and manage technology
which is becoming a bigger and bigger
part of their world every single day and
when we see what we've seen with that
Dynamic is that uh is that a lot of them
are now just wanting to work in a place
where where all the technology is
configured for them it's wrapped around
them and they have a competent operating
partner that is supporting the organ the
the practice uh and and taking care of
the front office in the back office so
that they can focus on providing care
this results in them seeing more
patients uh and and being happier
because you know they became doctors to
see patients not so they can manage uh
workers and and deal with HR issues and
deal with labs and all that kind of
stuff excellent and I know too that
Acquisitions have played a key role in
well can you share any insights into how
the Acquisitions fit into Wells growth
strategy
sure in in look in 2020 and 2021 we did
a lot of Acquisitions in 2022 we took a
bit of a breather and we've really
focused on integration and I think
that's one of the reasons why you saw
this accelerating organic growth we
really were able to demonstrate that we
could bring together the different
elements of our technology platform we
started to sell bundles we started to
really derive Synergy uh and activate uh
you know more sales as a result of
selling uh all the different products
and services with one voice with One
Vision uh so we made it easier for
providers to use their technology and I
think that was a big reason uh for our
growth now M&A as you know we're a capital
allocation company we're never far from
it and so we did continue to have you
know tuck-ins here and there and in fact
today uh we announced that we've
acquired uh the Alberta operations of uh
MCI Onehealth another publicly traded
company uh who was looking to raise
funds to support their business we're
very pleased with with this acquisition
it just demonstrates our continued
discipline these are you know great
primary care clinics in in Canada right
in the greater Calgary area and uh uh
you know just allows us to grow our
footprint in Alberta which is an
important Province for us and it it's
it's if you look at the price if you
look at what we're getting uh you know
it's just demonstrative of our continued
uh discipline and just you know a few
days ago at our conference call I
mentioned uh that we had you know a
really strong lineup of Acquisitions uh
and you know they're starting to uh uh I
think uh come to fruition for us
a company on the grown-up question I you
recently announced a new AI investment
program last month what specific areas
of healthcare technology or AI are you
focusing on and what's the strategy when
it comes to AI
yes uh look AI as as I'm sure you're
aware is it's become you know really uh
an incredibly important topic in in all
aspects of of business and and you know
not just business socially as well
everyone's talking about uh this this
new breakthrough disruptive technology
the large language models and generative
AI
um I mean look AI uh has been about a 80
year old overnight success a lot of
people have been working on this for a
long time generative AI is just sort of
you know the culmination of a lot of
things coming together and working uh
but it has uncorked enormous uh
Innovation and and we think that um this
there's a very good news story about
this in healthcare particularly where we
were looking to look we were looking to
unlock uh the value of of the data that
that we all produce every single day
um as as humans and and so we've
established an AI investment program
because no one company can can tackle
all of these Innovations themselves and
what well has done too is it's taken a
very much an ecosystem approach by
establishing its apps.health Marketplace
and so we're very excited about not only
uh allocating Capital into promising
young AI companies that are focused on
digital health and solving Healthcare
problems but also giving them access to
um you know safely and securely to our
provider Network to our uh you know to
to our Outpatient Clinic Network which
is the largest owned and operated
Network in Canada by far uh so
um and and when these and it's it was
remarkable when we announced this
program we've had just in the in the
first uh week to 10 days we've had over
a hundred uh inbound prospects come in
uh that that wanted to you know
collaborate with us and again I don't
think that's necessarily for the money
you know we're saying we would invest a
minimum of a quarter of a million
dollars you know a lot of them will
likely be higher than a quarter of a
million dollars
so it's not life-changing money but but
our structural advantages and and and
the benefits that we have in the Well
Network those are extremely hard to come
by uh and I think and I think uh uh
you'll see us uh you know help some of
these companies uh succeed and they will
help us drive uh you know more
Innovation to that helps the provider
but speaking of this very interesting AI
I know your company just launched well
AI voice this is super interesting tell
me what it is and the impact it could
have on health care providers
yeah thanks for uh asking Richard our
providers uh are thrilled with this you
know we've we've had a number of of of
our own well providers testing this
technology and it it it really feels
like magic to them it's essentially an
ambient AI powered scribe so it's a it's
a service that with the consent of the
parties involved listens to the
conversation between a patient and
provider and then uh essentially
condenses that into a medically relevant
note for the chart files uh typically
that is a lengthy process a doctor has
to transcribe notes then review those
notes and make sure that uh a a a a
appropriate medically oriented and
structured note is is is uh prepared and
put into the chart and that could take
you know sometimes more than more time
than the actual consultation uh time and
so we believe that on average if it's
used regularly and consistently this can
give providers back at least a third of
their day
um and and it's it's just a game changer
uh and and uh we have now gone into
General release with this product it's
widely available in Canada uh it has
been integrated into our EMR which makes
it even more valuable tools like this
are going to start popping up but if
they're not integrated into your
practice management system then you have
to kind of have data in in more than one
place and and move that around a little
bit which which makes it a little bit
more difficult especially with HIPAA
requirements and and regulations so
again I think this is the first of many
types of different products and services
that allow doctors to place more
emphasis and focus on the patient
experience instead of having their head
in a laptop and looking at you once in a
while they'll be looking at you and
speaking to their practice management
system and I think this you know think
about it as Alexa for for our doctors uh
you know this this ability to speak uh
and and have you know uh you know Voice
driven AI assistant that does things
like this I think are going to be you
know incredibly helpful and valuable uh
for for healthcare providers
super fascinating I mean we're just
hearing you know more about AI maybe AI
for the first time but here you are with
a product already on the market in the
in the healthcare field that's going to
be pretty attractive to be out there uh
right ahead of many other people right
thank you Richard thanks for that
recognition that's been Our intention we
we want to demonstrate that we uh you
know that we're all in on ensuring that
technology that benefits providers uh is
is is accelerated and uh de-risked and
provided uh you know um in in a timely
way you know providers need this help we
we have a healthcare crisis in the
country that is generally characterized
as a as a lack of doctors and so imagine
if we can get our doctors to be 20 or 30
percent more productive through the use
of these types of tools well they're
going to just see more patients and and
that's going to help all of us and uh
and look if you step back Wells business
model is all about having exposure to
the success of doctors and doing our
best to help them be more successful
because we're in a revenue share
relationship with most of the doctors
that we work with and so this uh this is
good for the ecosystem it's great for
the provider and it's great for well as
well super fascinating Hamed Shahbazi
CEO Well Health Technologies ticker
w-e-l-l great to catch up again thank
you sir
thank you Richard appreciate you having
me
[Music]
thank you


@@ -1,970 +0,0 @@
learning medicine is hard work osmosis
makes it easy it takes our lectures and
notes to create a personalized study
plan with exclusive videos practice
questions and flashcards and so much
more try it free today
in diabetes mellitus your body has
trouble moving glucose which is the type
of sugar from your blood into your cells
this leads to high levels of glucose in
your blood and not enough of it in your
cells and remember that your cells need
glucose as a source of energy so not
letting the glucose enter means that the
cells starve for energy despite having
glucose right on their doorstep in
general the body controls how much
glucose is in the blood relative to how
much gets into the cells with two
hormones insulin and glucagon insulin is
used to reduce blood glucose levels and
glucagon is used to increase blood
glucose levels both of these hormones
are produced by clusters of cells in the
pancreas called islets of langerhans
insulin is secreted by beta cells in the
center of these islets and glucagon is
secreted by alpha cells in the periphery
of the islets insulin reduces the amount
of glucose in the blood by binding to
insulin receptors embedded in the cell
membrane of various insulin responsive
tissues like muscle cells and adipose
tissue when activated the insulin
receptors cause vesicles containing
glucose transporter that are inside the
cell to fuse with the cell membrane
allowing glucose to be transported into
the cell glucagon does exactly the
opposite it raises the blood glucose
levels by getting the liver to generate
new molecules of glucose from other
molecules and also break down glycogen
into glucose so that it can all get
dumped into the blood diabetes mellitus
is diagnosed when blood glucose levels
get too high and this is seen among 10
percent of the world population there
are two types of diabetes type 1 and
type 2 and the main difference between
them is the underlying mechanism that
causes the blood glucose levels to rise
about 10% of people with diabetes have
type 1 and the remaining 90% of people
with diabetes have type 2 let's start
with type 1 diabetes mellitus sometimes
just called type 1 diabetes in this
situation the body doesn't make enough
insulin the reason this happens is that
in type 1 diabetes there's a type 4
hypersensitivity response or a cell
mediated immune response where a
person's own T cells attack
the pancreas as a quick review remember
that the immune system has T cells that
react to all sorts of antigens which are
usually small peptides polysaccharides
or lipids and that some of these
antigens are part of our own body cells
it doesn't make sense to allow T cells
that will attack our own cells to hang
around so there's this process to
eliminate them called self tolerance in
type 1 diabetes there's a genetic
abnormality that causes a loss of self
tolerance among T cells that
specifically target the beta cell
antigens losing self tolerance means
that these T cells are allowed to
recruit other immune cells and
coordinate an attack on these beta cells
losing beta cells means less insulin and
less insulin means that glucose piles up
in the blood because it can't enter the
body's cells one really important group
of genes involved in regulation of the
immune response is the human leukocyte
antigen system or HLA system even though
it's called a system it's basically this
group of genes on chromosome 6 that
encode the major histocompatibility
complex or MHC which is a protein that's
extremely important in helping the
immune system recognize foreign
molecules as well as maintaining self
tolerance MHC is like the serving
platter that antigens are presented to
the immune cells on interestingly people
with type 1 diabetes often have specific
HLA genes in common with each other one
called
HLA dr3 and another called HLA dr4 but
this is just a genetic clue right
because not everyone with HLA dr3 and
HLA dr4 develops diabetes in diabetes
mellitus type 1 destruction of beta
cells usually starts early in life but
sometimes up to 90% of the beta cells
are destroyed before symptoms crop up
four clinical symptoms of uncontrolled
diabetes that all sound similar are
polyphagia glycosuria polyuria and
polydipsia let's go through them one by
one even though there's a lot of glucose
in the blood it cannot get into the
cells which leaves cells starved for
energy so in response adipose tissue
starts breaking down fat called
lipolysis
and muscle tissue starts breaking down
proteins both of which results in weight
loss for someone with uncontrolled
diabetes this catabolic state leaves
people feeling hungry
also known as polyphagia phagia means
eating and poly means a lot now with
high glucose levels that means that when
blood gets filtered through the kidneys
some of it starts to spill into the
urine called glycosuria glycos refers to
glucose and uria the urine since glucose
is osmotically active water tends to
follow it resulting in an increase in
urination or polyuria poly again refers
to a lot and uria again refers to urine
finally because there's so much
urination people with uncontrolled
diabetes become dehydrated and thirsty
or polydipsia poly means a lot and dipsia
means thirst even though people with
diabetes are not able to produce their
own insulin they can still respond to
insulin so treatment involves lifelong
insulin therapy to regulate their blood
glucose levels and basically enable
their cells to use glucose
one really serious complication with
type 1 diabetes is called diabetic
ketoacidosis or DKA to understand it
let's go back to the process of
lipolysis where fat is broken down into
free fatty acids after that happens the
liver turns the fatty acids into ketone
bodies like acetoacetic acid and beta
hydroxybutyric acid acetoacetic
acid is a keto acid because it has a
ketone group and a carboxylic acid group
beta hydroxybutyric acid on the other
hand even though it's still one of the
ketone bodies isn't technically a keto
acid since its ketone group has been
reduced to a hydroxyl group these ketone
bodies are important because they can be
used by cells for energy but they also
increase the acidity of the blood which
is why it's called ketoacidosis and the
blood becoming really acidic can have
major effects throughout the body
individuals can develop Kussmaul
respiration which is a deep and labored
breathing as the body tries to move
carbon dioxide out of the blood in an
effort to reduce its acidity cells also
have a transporter that exchanges
hydrogen ions or protons for potassium
when the blood gets acidic it's by
definition loaded with protons that get
sent into cells while potassium gets
sent into the fluid outside cells
another thing to keep in mind is that in
addition to helping glucose enter cells
insulin stimulates the sodium potassium
ATPases which help potassium get
into the cells and so without insulin
more potassium stays in the fluid
outside cells both of these mechanisms
lead to increased potassium in the fluid
outside cells which quickly makes it
into the blood and causes hyperkalemia
the potassium is then excreted so over
time even though the blood potassium
levels remain high overall stores of
potassium in the body which include
potassium inside cells starts to run low
individuals will also have a high anion
gap which reflects a large difference in
the unmeasured negative and positive
ions in the serum largely due to the
build-up of ketoacids
diabetic ketoacidosis can happen even in
people who have already been diagnosed
with diabetes and currently have some
sort of insulin therapy
in states of stress like an infection
the body releases epinephrine which in
turn stimulates the release of glucagon
too much glucagon can tip the delicate
hormonal balance of glucagon and insulin
in favor of elevating blood sugars and
can lead to a cascade of events we just
described increased glucose in the blood
loss of glucose in the urine loss of
water dehydration and in parallel a
need for alternative energy generation
of ketone bodies and ketoacidosis
interestingly both ketone bodies break
down into acetone and escape as a gas by
getting breathed out the lungs which
gives a sweet fruity smell to a
person's breath in general though that's
the only sweet thing about this illness
which also causes nausea vomiting and if
severe mental status changes and acute
cerebral edema
treatment of a DKA episode involves
giving plenty of fluids which helps with
dehydration insulin which helps lower
blood glucose levels and replacement of
electrolytes like potassium all of which
help to reverse the acidosis now let's
switch gears and talk about type 2
diabetes which is where the body makes
insulin but the tissues don't respond as
well to it the exact reason why cells
don't respond isn't fully understood
essentially the body's providing the
normal amount of insulin but the cells
don't move their glucose transporters to
their membrane in response which
remember is needed for the glucose to
get into the cells these cells therefore
have insulin resistance some risk
factors for insulin resistance are
obesity lack of exercise and
hypertension the exact mechanisms are
still being explored for example an
excess of adipose tissue or fat is
thought to cause the release of free
fatty acids and so-called adipokines
which are signaling molecules that can
cause inflammation which seems related
to insulin resistance
however many people that are obese are
not diabetic so genetic factors probably
play a major role as well we see this
when we look at twin studies as well
where having a twin with type 2 diabetes
increases the risk of developing type 2
diabetes completely independently of
other environmental risk factors in type
2 diabetes since tissues don't respond
as well to normal levels of insulin the
body ends up producing more insulin in
order to get the same effect and move
glucose out of the blood
they do this through beta cell
hyperplasia an increased number of beta
cells and beta cell hypertrophy where
they actually grow in size all in this
attempt to pump out more insulin this
works for a while and by keeping insulin
levels higher than normal blood glucose
levels can be kept normal called
normoglycemia now along with insulin beta
cells also secrete islet amyloid
polypeptide or amylin so while beta
cells are cranking out insulin they also
secrete an increased amount of amylin
over time amylin builds up and aggregates
in the islets this beta cell
compensation though is not sustainable
and over time those maxed out beta cells
get exhausted and they become
dysfunctional and undergo hypotrophy
and get smaller as well as hypoplasia
and die off as beta cells are lost and
insulin levels decrease glucose levels
in the blood start to increase and
patients develop hyperglycemia which
leads to similar clinical signs that we
mentioned before like polyphagia
glycosuria polyuria polydipsia but
unlike type 1 diabetes there's generally
some circulating insulin in type 2
diabetes from the beta cells that are
trying to compensate for the insulin
resistance this means that the insulin
glucagon balance is such that diabetic
ketoacidosis does not usually develop
having said that a complication called
hyperosmolar hyperglycemic state or HHS
is much more common in type 2 diabetes
than type 1 diabetes and it causes
increased plasma osmolarity due to
extreme dehydration and concentration of
the blood to help understand this
remember that glucose is a polar
molecule that cannot passively diffuse
across cell membranes which means that
it acts as a solute so when levels of
glucose are super high in the blood
meaning it's a hyperosmolar State water
starts to leave the body cells and enter
the blood vessels leaving the cells
relatively dry and shriveled rather than
plump and juicy blood vessels that are
full of water lead to increased
urination and total body dehydration and
this is a very serious situation because
the dehydration of the body's cells and
in particular the brain can cause a
number of symptoms including mental
status changes in HHS you can sometimes
see mild ketonemia and acidosis but
not to the extent that it's seen in DKA
and in DKA you can see some
hyperosmolarity so there's definitely overlap
between these two syndromes
besides type 1 and type 2 diabetes there
are also a couple other subtypes of
diabetes mellitus gestational diabetes
is when pregnant women have increased
blood glucose which is particularly
during the third trimester although
ultimately unknown the cause is thought
to be related to pregnancy hormones that
interfere with insulin's action on
insulin receptors also sometimes people
can develop drug-induced diabetes which
is where medications have side effects
that tend to increase blood glucose
levels the mechanism for both of these
is thought to be related to insulin
resistance like type 2 diabetes rather
than an autoimmune destruction process
like in type 1 diabetes diagnosing type
1 or type 2 diabetes is done by getting
a sense for how much glucose is floating
around in the blood and has specific
standards that the World Health
Organization uses very commonly a
fasting glucose test is taken where the
person doesn't eat or drink except
water that's okay for a total of eight
hours and then has their blood tested
for glucose levels levels of 100
milligrams per deciliter to 125
milligrams per deciliter indicate
pre-diabetes and 126 milligrams per
deciliter or higher indicates diabetes a
non-fasting or random glucose test can be
done at any time with 200 milligrams per
deciliter or higher being a red flag for
diabetes another test is called an oral
glucose tolerance test where a person is
given glucose and then blood samples are
taken at time intervals to figure out
how well it's being cleared from the
blood the most important interval being
two hours later levels of 140 milligrams
per deciliter to 199 milligrams per
deciliter indicate pre-diabetes
and 200 or above indicates diabetes
another thing to know is that when blood
glucose levels get high the glucose can
also stick to proteins that are floating
around in the blood or in cells so that
brings us to another type of test that
can be done which is the HbA1c test
which tests for the proportion of
hemoglobin in red blood cells that has
glucose stuck to it called glycated
hemoglobin HbA1c levels of 5.7% to 6.4%
indicate pre-diabetes
and 6.5 percent or higher indicates
diabetes this proportion of glycated
hemoglobin doesn't change day to day so
it gives a sense for whether the blood
glucose levels have been high over the
past two to three months finally we have
the c-peptide test which tests for
byproducts of insulin production if the
level of c-peptide is low or absent it
means the pancreas is no longer
producing enough insulin and the glucose
cannot enter the cells
for type one diabetes insulin is the
only treatment option for type 2
diabetes on the other hand lifestyle
changes like weight loss and exercise
along with a healthy diet and an oral
anti-diabetic medication like metformin
and several other classes can sometimes
be enough to reverse some of that
insulin resistance and keep blood sugar
levels in check however if oral
anti-diabetic medications fail type 2
diabetes can also be treated with
insulin something to bear in mind is
that insulin treatment comes with a risk
of hypoglycemia especially if insulin is
taken without a meal symptoms of
hypoglycemia can be mild like weakness
hunger and shaking but they can progress
to a loss of consciousness and seizures
in severe cases in mild cases drinking
juices or eating candy or sugar might be
enough to bring blood sugar up but in
severe cases intravenous glucose should
be given as soon as possible
the FDA has also recently approved
intranasal glucagon as a treatment for
severe hypoglycemia all right now over
time high glucose levels can cause
damage to tiny blood vessels called the
microvasculature in arterioles a
process called hyaline
arteriolosclerosis is where the walls of
the arterioles develop hyaline deposits
which are deposits of proteins and these
make them hard and inflexible in
capillaries the basement membrane can
thicken and make it difficult for oxygen
to easily move from the capillary to the
tissues causing hypoxia
one of the most significant effects is
that diabetes increases the risk of
medium and large arterial wall damage
and subsequent atherosclerosis which can
lead to heart attacks and strokes which
are major causes of morbidity and
mortality for patients with diabetes in
the eyes diabetes can lead to
retinopathy and evidence of that can be
seen on a fundoscopic exam that shows
cotton-wool spots or flame hemorrhages
and can eventually cause blindness in
the kidneys the afferent and efferent
arterioles as well as the glomerulus
itself can get damaged which can lead to
nephrotic syndrome that slowly
diminishes the kidney's ability to filter
blood over time and can ultimately lead
to dialysis diabetes can also affect the
function of nerves causing symptoms like
a decrease in sensation in the toes and
fingers sometimes called a stocking
glove distribution as well as causing the
autonomic nervous system to malfunction
and that system controls a number of
body functions
everything from sweating to passing gas
finally both the poor blood supply and
nerve damage can lead to ulcers
typically on the feet that don't heal
quickly and can get pretty severe and
need to be amputated these are some of
the complications of uncontrolled
diabetes which is why it's important to
diagnose and control diabetes through a
healthy lifestyle medications to reduce
insulin resistance and even insulin
therapy if beta cells have been
exhausted while type 1 diabetes cannot
be prevented type 2 diabetes can in fact
many people with diabetes can control
their blood sugar levels really
effectively and live a full and active
life without any of the complications
thanks for watching if you're interested
in a deeper dive on this topic take a
look at osmosis.org where we have
flashcards questions and other awesome
tools to help you learn medicine
you

@@ -3,8 +3,10 @@
This repository holds an API for the GPU implementation of the Reflector API service,
and uses [Modal.com](https://modal.com)
- `reflector_llm.py` - LLM API
- `reflector_transcriber.py` - Transcription API
- `reflector_diarizer.py` - Diarization API
- `reflector_transcriber.py` - Transcription API (Whisper)
- `reflector_transcriber_parakeet.py` - Transcription API (NVIDIA Parakeet)
- `reflector_translator.py` - Translation API
## Modal.com deployment
@@ -18,21 +20,29 @@ $ modal deploy reflector_transcriber.py
...
└── 🔨 Created web => https://xxxx--reflector-transcriber-web.modal.run
$ modal deploy reflector_transcriber_parakeet.py
...
└── 🔨 Created web => https://xxxx--reflector-transcriber-parakeet-web.modal.run
$ modal deploy reflector_llm.py
...
└── 🔨 Created web => https://xxxx--reflector-llm-web.modal.run
```
Then in your reflector api configuration `.env`, you can set theses keys:
Then in your reflector api configuration `.env`, you can set these keys:
```
TRANSCRIPT_BACKEND=modal
TRANSCRIPT_URL=https://xxxx--reflector-transcriber-web.modal.run
TRANSCRIPT_MODAL_API_KEY=REFLECTOR_APIKEY
LLM_BACKEND=modal
LLM_URL=https://xxxx--reflector-llm-web.modal.run
LLM_MODAL_API_KEY=REFLECTOR_APIKEY
DIARIZATION_BACKEND=modal
DIARIZATION_URL=https://xxxx--reflector-diarizer-web.modal.run
DIARIZATION_MODAL_API_KEY=REFLECTOR_APIKEY
TRANSLATION_BACKEND=modal
TRANSLATION_URL=https://xxxx--reflector-translator-web.modal.run
TRANSLATION_MODAL_API_KEY=REFLECTOR_APIKEY
```
## API
@@ -63,6 +73,86 @@ Authorization: bearer <REFLECTOR_APIKEY>
### Transcription
#### Parakeet Transcriber (`reflector_transcriber_parakeet.py`)
NVIDIA Parakeet is a state-of-the-art ASR model optimized for real-time transcription with superior word-level timestamps.
**GPU Configuration:**
- **A10G GPU** - Used for `/v1/audio/transcriptions` endpoint (small files, live transcription)
- Higher concurrency (max_inputs=10)
- Optimized for multiple small audio files
- Supports batch processing for efficiency
- **L40S GPU** - Used for `/v1/audio/transcriptions-from-url` endpoint (large files)
- Lower concurrency but more powerful processing
- Optimized for single large audio files
- VAD-based chunking for long-form audio
##### `/v1/audio/transcriptions` - Small file transcription
**request** (multipart/form-data)
- `file` or `files[]` - audio file(s) to transcribe
- `model` - model name (default: `nvidia/parakeet-tdt-0.6b-v2`)
- `language` - language code (default: `en`)
- `batch` - whether to use batch processing for multiple files (default: `true`)
**response**
```json
{
"text": "transcribed text",
"words": [
{"word": "hello", "start": 0.0, "end": 0.5},
{"word": "world", "start": 0.5, "end": 1.0}
],
"filename": "audio.mp3"
}
```
For multiple files with batch=true:
```json
{
"results": [
{
"filename": "audio1.mp3",
"text": "transcribed text",
"words": [...]
},
{
"filename": "audio2.mp3",
"text": "transcribed text",
"words": [...]
}
]
}
```
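A client may want to treat the single-file response and the batched `results` response the same way. The following is a minimal sketch, assuming only the field names shown in the examples above; `normalize_transcription_response` is a hypothetical helper and not part of the Reflector codebase.

```python
from typing import Any


def normalize_transcription_response(payload: dict[str, Any]) -> list[dict[str, Any]]:
    """Return one entry per file for either response shape (hypothetical helper)."""
    # Batched responses wrap the per-file objects in a "results" list;
    # single-file responses are the per-file object itself.
    items = payload["results"] if "results" in payload else [payload]

    normalized = []
    for item in items:
        normalized.append(
            {
                "filename": item.get("filename"),
                "text": item.get("text", ""),
                "words": list(item.get("words", [])),
            }
        )
    return normalized
```

Callers can then iterate over the returned list without checking which shape the server sent.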
##### `/v1/audio/transcriptions-from-url` - Large file transcription
**request** (application/json)
```json
{
"audio_file_url": "https://example.com/audio.mp3",
"model": "nvidia/parakeet-tdt-0.6b-v2",
"language": "en",
"timestamp_offset": 0.0
}
```
**response**
```json
{
"text": "transcribed text from large file",
"words": [
{"word": "hello", "start": 0.0, "end": 0.5},
{"word": "world", "start": 0.5, "end": 1.0}
]
}
```
**Supported file types:** mp3, mp4, mpeg, mpga, m4a, wav, webm
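As an illustration of the request above, here is a minimal sketch of a client-side request for `/v1/audio/transcriptions-from-url`, using only the Python standard library. The helper name `build_transcribe_from_url_request` and the `base_url` and `api_key` parameters are assumptions for the example; the body fields and the bearer `Authorization` header follow what this README documents.

```python
import json
import urllib.request


def build_transcribe_from_url_request(
    base_url: str,
    api_key: str,
    audio_file_url: str,
    language: str = "en",
    timestamp_offset: float = 0.0,
    model: str = "nvidia/parakeet-tdt-0.6b-v2",
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the large-file endpoint."""
    body = {
        "audio_file_url": audio_file_url,
        "model": model,
        "language": language,
        "timestamp_offset": timestamp_offset,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/audio/transcriptions-from-url",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Sending is left to the caller, for example:
#   with urllib.request.urlopen(req) as resp:
#       result = json.load(resp)
```

Building the request separately from sending it keeps the payload easy to inspect or log before any GPU time is spent.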
#### Whisper Transcriber (`reflector_transcriber.py`)
`POST /transcribe`
**request** (multipart/form-data)

@@ -4,14 +4,80 @@ Reflector GPU backend - diarizer
"""
import os
import uuid
from typing import Mapping, NewType
from urllib.parse import urlparse
import modal.gpu
from modal import App, Image, Secret, asgi_app, enter, method
from pydantic import BaseModel
import modal
PYANNOTE_MODEL_NAME: str = "pyannote/speaker-diarization-3.1"
MODEL_DIR = "/root/diarization_models"
app = App(name="reflector-diarizer")
UPLOADS_PATH = "/uploads"
SUPPORTED_FILE_EXTENSIONS = ["mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"]
DiarizerUniqFilename = NewType("DiarizerUniqFilename", str)
AudioFileExtension = NewType("AudioFileExtension", str)
app = modal.App(name="reflector-diarizer")
# Volume for temporary file uploads
upload_volume = modal.Volume.from_name("diarizer-uploads", create_if_missing=True)
def detect_audio_format(url: str, headers: Mapping[str, str]) -> AudioFileExtension:
parsed_url = urlparse(url)
url_path = parsed_url.path
for ext in SUPPORTED_FILE_EXTENSIONS:
if url_path.lower().endswith(f".{ext}"):
return AudioFileExtension(ext)
content_type = headers.get("content-type", "").lower()
if "audio/mpeg" in content_type or "audio/mp3" in content_type:
return AudioFileExtension("mp3")
if "audio/wav" in content_type:
return AudioFileExtension("wav")
if "audio/mp4" in content_type:
return AudioFileExtension("mp4")
raise ValueError(
f"Unsupported audio format for URL: {url}. "
f"Supported extensions: {', '.join(SUPPORTED_FILE_EXTENSIONS)}"
)
def download_audio_to_volume(
audio_file_url: str,
) -> tuple[DiarizerUniqFilename, AudioFileExtension]:
import requests
from fastapi import HTTPException
print(f"Checking audio file at: {audio_file_url}")
response = requests.head(audio_file_url, allow_redirects=True)
if response.status_code == 404:
raise HTTPException(status_code=404, detail="Audio file not found")
print(f"Downloading audio file from: {audio_file_url}")
response = requests.get(audio_file_url, allow_redirects=True)
if response.status_code != 200:
print(f"Download failed with status {response.status_code}: {response.text}")
raise HTTPException(
status_code=response.status_code,
detail=f"Failed to download audio file: {response.status_code}",
)
audio_suffix = detect_audio_format(audio_file_url, response.headers)
unique_filename = DiarizerUniqFilename(f"{uuid.uuid4()}.{audio_suffix}")
file_path = f"{UPLOADS_PATH}/{unique_filename}"
print(f"Writing file to: {file_path} (size: {len(response.content)} bytes)")
with open(file_path, "wb") as f:
f.write(response.content)
upload_volume.commit()
print(f"File saved as: {unique_filename}")
return unique_filename, audio_suffix
def migrate_cache_llm():
@@ -39,7 +105,7 @@ def download_pyannote_audio():
diarizer_image = (
Image.debian_slim(python_version="3.10.8")
modal.Image.debian_slim(python_version="3.10.8")
.pip_install(
"pyannote.audio==3.1.0",
"requests",
@@ -55,7 +121,8 @@ diarizer_image = (
"hf-transfer",
)
.run_function(
download_pyannote_audio, secrets=[Secret.from_name("my-huggingface-secret")]
download_pyannote_audio,
secrets=[modal.Secret.from_name("hf_token")],
)
.run_function(migrate_cache_llm)
.env(
@@ -70,53 +137,60 @@ diarizer_image = (
@app.cls(
gpu=modal.gpu.A100(size="40GB"),
gpu="A100",
timeout=60 * 30,
scaledown_window=60,
allow_concurrent_inputs=1,
image=diarizer_image,
volumes={UPLOADS_PATH: upload_volume},
enable_memory_snapshot=True,
experimental_options={"enable_gpu_snapshot": True},
secrets=[
modal.Secret.from_name("hf_token"),
],
)
@modal.concurrent(max_inputs=1)
class Diarizer:
@enter()
@modal.enter(snap=True)
def enter(self):
import torch
from pyannote.audio import Pipeline
self.use_gpu = torch.cuda.is_available()
self.device = "cuda" if self.use_gpu else "cpu"
print(f"Using device: {self.device}")
self.diarization_pipeline = Pipeline.from_pretrained(
PYANNOTE_MODEL_NAME, cache_dir=MODEL_DIR
PYANNOTE_MODEL_NAME,
cache_dir=MODEL_DIR,
use_auth_token=os.environ["HF_TOKEN"],
)
self.diarization_pipeline.to(torch.device(self.device))
@method()
def diarize(self, audio_data: str, audio_suffix: str, timestamp: float):
import tempfile
@modal.method()
def diarize(self, filename: str, timestamp: float = 0.0):
import torchaudio
with tempfile.NamedTemporaryFile("wb+", suffix=f".{audio_suffix}") as fp:
fp.write(audio_data)
upload_volume.reload()
print("Diarizing audio")
waveform, sample_rate = torchaudio.load(fp.name)
diarization = self.diarization_pipeline(
{"waveform": waveform, "sample_rate": sample_rate}
file_path = f"{UPLOADS_PATH}/{filename}"
if not os.path.exists(file_path):
raise FileNotFoundError(f"File not found: {file_path}")
print(f"Diarizing audio from: {file_path}")
waveform, sample_rate = torchaudio.load(file_path)
diarization = self.diarization_pipeline(
{"waveform": waveform, "sample_rate": sample_rate}
)
words = []
for diarization_segment, _, speaker in diarization.itertracks(yield_label=True):
words.append(
{
"start": round(timestamp + diarization_segment.start, 3),
"end": round(timestamp + diarization_segment.end, 3),
"speaker": int(speaker[-2:]),
}
)
words = []
for diarization_segment, _, speaker in diarization.itertracks(
yield_label=True
):
words.append(
{
"start": round(timestamp + diarization_segment.start, 3),
"end": round(timestamp + diarization_segment.end, 3),
"speaker": int(speaker[-2:]),
}
)
print("Diarization complete")
return {"diarization": words}
print("Diarization complete")
return {"diarization": words}
# -------------------------------------------------------------------
@@ -127,17 +201,18 @@ class Diarizer:
@app.function(
timeout=60 * 10,
scaledown_window=60 * 3,
allow_concurrent_inputs=40,
secrets=[
Secret.from_name("reflector-gpu"),
modal.Secret.from_name("reflector-gpu"),
],
volumes={UPLOADS_PATH: upload_volume},
image=diarizer_image,
)
@asgi_app()
@modal.concurrent(max_inputs=40)
@modal.asgi_app()
def web():
import requests
from fastapi import Depends, FastAPI, HTTPException, status
from fastapi.security import OAuth2PasswordBearer
from pydantic import BaseModel
diarizerstub = Diarizer()
@@ -153,35 +228,26 @@ def web():
headers={"WWW-Authenticate": "Bearer"},
)
def validate_audio_file(audio_file_url: str):
# Check if the audio file exists
response = requests.head(audio_file_url, allow_redirects=True)
if response.status_code == 404:
raise HTTPException(
status_code=response.status_code,
detail="The audio file does not exist.",
)
class DiarizationResponse(BaseModel):
result: dict
@app.post(
"/diarize", dependencies=[Depends(apikey_auth), Depends(validate_audio_file)]
)
def diarize(
audio_file_url: str, timestamp: float = 0.0
) -> HTTPException | DiarizationResponse:
# Currently the uploaded files are in mp3 format
audio_suffix = "mp3"
@app.post("/diarize", dependencies=[Depends(apikey_auth)])
def diarize(audio_file_url: str, timestamp: float = 0.0) -> DiarizationResponse:
unique_filename, audio_suffix = download_audio_to_volume(audio_file_url)
print("Downloading audio file")
response = requests.get(audio_file_url, allow_redirects=True)
print("Audio file downloaded successfully")
func = diarizerstub.diarize.spawn(
audio_data=response.content, audio_suffix=audio_suffix, timestamp=timestamp
)
result = func.get()
return result
try:
func = diarizerstub.diarize.spawn(
filename=unique_filename, timestamp=timestamp
)
result = func.get()
return result
finally:
try:
file_path = f"{UPLOADS_PATH}/{unique_filename}"
print(f"Deleting file: {file_path}")
os.remove(file_path)
upload_volume.commit()
except Exception as e:
print(f"Error cleaning up {unique_filename}: {e}")
return app

@@ -1,214 +0,0 @@
"""
Reflector GPU backend - LLM
===========================
"""
import json
import os
import threading
from typing import Optional
import modal
from modal import App, Image, Secret, asgi_app, enter, exit, method
# LLM
LLM_MODEL: str = "lmsys/vicuna-13b-v1.5"
LLM_LOW_CPU_MEM_USAGE: bool = True
LLM_TORCH_DTYPE: str = "bfloat16"
LLM_MAX_NEW_TOKENS: int = 300
IMAGE_MODEL_DIR = "/root/llm_models"
app = App(name="reflector-llm")
def download_llm():
from huggingface_hub import snapshot_download
print("Downloading LLM model")
snapshot_download(LLM_MODEL, cache_dir=IMAGE_MODEL_DIR)
print("LLM model downloaded")
def migrate_cache_llm():
"""
XXX The cache for model files in Transformers v4.22.0 has been updated.
Migrating your old cache. This is a one-time only operation. You can
interrupt this and resume the migration later on by calling
`transformers.utils.move_cache()`.
"""
from transformers.utils.hub import move_cache
print("Moving LLM cache")
move_cache(cache_dir=IMAGE_MODEL_DIR, new_cache_dir=IMAGE_MODEL_DIR)
print("LLM cache moved")
llm_image = (
Image.debian_slim(python_version="3.10.8")
.apt_install("git")
.pip_install(
"transformers",
"torch",
"sentencepiece",
"protobuf",
"jsonformer==0.12.0",
"accelerate==0.21.0",
"einops==0.6.1",
"hf-transfer~=0.1",
"huggingface_hub==0.16.4",
)
.env({"HF_HUB_ENABLE_HF_TRANSFER": "1"})
.run_function(download_llm)
.run_function(migrate_cache_llm)
)
@app.cls(
gpu="A100",
timeout=60 * 5,
scaledown_window=60 * 5,
allow_concurrent_inputs=15,
image=llm_image,
)
class LLM:
@enter()
def enter(self):
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig
print("Instance llm model")
model = AutoModelForCausalLM.from_pretrained(
LLM_MODEL,
torch_dtype=getattr(torch, LLM_TORCH_DTYPE),
low_cpu_mem_usage=LLM_LOW_CPU_MEM_USAGE,
cache_dir=IMAGE_MODEL_DIR,
local_files_only=True,
)
# JSONFormer doesn't yet support generation configs
print("Instance llm generation config")
model.config.max_new_tokens = LLM_MAX_NEW_TOKENS
# generation configuration
gen_cfg = GenerationConfig.from_model_config(model.config)
gen_cfg.max_new_tokens = LLM_MAX_NEW_TOKENS
# load tokenizer
print("Instance llm tokenizer")
tokenizer = AutoTokenizer.from_pretrained(
LLM_MODEL, cache_dir=IMAGE_MODEL_DIR, local_files_only=True
)
# move model to gpu
print("Move llm model to GPU")
model = model.cuda()
print("Warmup llm done")
self.model = model
self.tokenizer = tokenizer
self.gen_cfg = gen_cfg
self.GenerationConfig = GenerationConfig
self.lock = threading.Lock()
@exit()
def exit():
print("Exit llm")
@method()
def generate(
self, prompt: str, gen_schema: str | None, gen_cfg: str | None
) -> dict:
"""
Perform a generation action using the LLM
"""
print(f"Generate {prompt=}")
if gen_cfg:
gen_cfg = self.GenerationConfig.from_dict(json.loads(gen_cfg))
else:
gen_cfg = self.gen_cfg
# If a gen_schema is given, conform to gen_schema
with self.lock:
if gen_schema:
import jsonformer
print(f"Schema {gen_schema=}")
jsonformer_llm = jsonformer.Jsonformer(
model=self.model,
tokenizer=self.tokenizer,
json_schema=json.loads(gen_schema),
prompt=prompt,
max_string_token_length=gen_cfg.max_new_tokens,
)
response = jsonformer_llm()
else:
# If no gen_schema, perform prompt only generation
# tokenize prompt
input_ids = self.tokenizer.encode(prompt, return_tensors="pt").to(
self.model.device
)
output = self.model.generate(input_ids, generation_config=gen_cfg)
# decode output
response = self.tokenizer.decode(
output[0].cpu(), skip_special_tokens=True
)
response = response[len(prompt) :]
print(f"Generated {response=}")
return {"text": response}
# -------------------------------------------------------------------
# Web API
# -------------------------------------------------------------------
@app.function(
scaledown_window=60 * 10,
timeout=60 * 5,
allow_concurrent_inputs=45,
secrets=[
Secret.from_name("reflector-gpu"),
],
)
@asgi_app()
def web():
from fastapi import Depends, FastAPI, HTTPException, status
from fastapi.security import OAuth2PasswordBearer
from pydantic import BaseModel
llmstub = LLM()
app = FastAPI()
oauth2_scheme = OAuth2PasswordBearer(tokenUrl="token")
def apikey_auth(apikey: str = Depends(oauth2_scheme)):
if apikey != os.environ["REFLECTOR_GPU_APIKEY"]:
raise HTTPException(
status_code=status.HTTP_401_UNAUTHORIZED,
detail="Invalid API key",
headers={"WWW-Authenticate": "Bearer"},
)
class LLMRequest(BaseModel):
prompt: str
gen_schema: Optional[dict] = None
gen_cfg: Optional[dict] = None
@app.post("/llm", dependencies=[Depends(apikey_auth)])
def llm(
req: LLMRequest,
):
gen_schema = json.dumps(req.gen_schema) if req.gen_schema else None
gen_cfg = json.dumps(req.gen_cfg) if req.gen_cfg else None
func = llmstub.generate.spawn(
prompt=req.prompt, gen_schema=gen_schema, gen_cfg=gen_cfg
)
result = func.get()
return result
return app

@@ -1,220 +0,0 @@
"""
Reflector GPU backend - LLM
===========================
"""
import json
import os
import threading
from typing import Optional
import modal
from modal import App, Image, Secret, asgi_app, enter, exit, method
# LLM
LLM_MODEL: str = "HuggingFaceH4/zephyr-7b-alpha"
LLM_LOW_CPU_MEM_USAGE: bool = True
LLM_TORCH_DTYPE: str = "bfloat16"
LLM_MAX_NEW_TOKENS: int = 300
IMAGE_MODEL_DIR = "/root/llm_models/zephyr"
app = App(name="reflector-llm-zephyr")
def download_llm():
from huggingface_hub import snapshot_download
print("Downloading LLM model")
snapshot_download(LLM_MODEL, cache_dir=IMAGE_MODEL_DIR)
print("LLM model downloaded")
def migrate_cache_llm():
"""
XXX The cache for model files in Transformers v4.22.0 has been updated.
Migrating your old cache. This is a one-time only operation. You can
interrupt this and resume the migration later on by calling
`transformers.utils.move_cache()`.
"""
from transformers.utils.hub import move_cache
print("Moving LLM cache")
move_cache(cache_dir=IMAGE_MODEL_DIR, new_cache_dir=IMAGE_MODEL_DIR)
print("LLM cache moved")
llm_image = (
Image.debian_slim(python_version="3.10.8")
.apt_install("git")
.pip_install(
"transformers==4.34.0",
"torch",
"sentencepiece",
"protobuf",
"jsonformer==0.12.0",
"accelerate==0.21.0",
"einops==0.6.1",
"hf-transfer~=0.1",
"huggingface_hub==0.16.4",
)
.env({"HF_HUB_ENABLE_HF_TRANSFER": "1"})
.run_function(download_llm)
.run_function(migrate_cache_llm)
)
@app.cls(
gpu="A10G",
timeout=60 * 5,
scaledown_window=60 * 5,
allow_concurrent_inputs=10,
image=llm_image,
)
class LLM:
@enter()
def enter(self):
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig
print("Instance llm model")
model = AutoModelForCausalLM.from_pretrained(
LLM_MODEL,
torch_dtype=getattr(torch, LLM_TORCH_DTYPE),
low_cpu_mem_usage=LLM_LOW_CPU_MEM_USAGE,
cache_dir=IMAGE_MODEL_DIR,
local_files_only=True,
)
# JSONFormer doesn't yet support generation configs
print("Instance llm generation config")
model.config.max_new_tokens = LLM_MAX_NEW_TOKENS
# generation configuration
gen_cfg = GenerationConfig.from_model_config(model.config)
gen_cfg.max_new_tokens = LLM_MAX_NEW_TOKENS
# load tokenizer
print("Instance llm tokenizer")
tokenizer = AutoTokenizer.from_pretrained(
LLM_MODEL, cache_dir=IMAGE_MODEL_DIR, local_files_only=True
)
gen_cfg.pad_token_id = tokenizer.eos_token_id
gen_cfg.eos_token_id = tokenizer.eos_token_id
tokenizer.pad_token = tokenizer.eos_token
model.config.pad_token_id = tokenizer.eos_token_id
# move model to gpu
print("Move llm model to GPU")
model = model.cuda()
print("Warmup llm done")
self.model = model
self.tokenizer = tokenizer
self.gen_cfg = gen_cfg
self.GenerationConfig = GenerationConfig
self.lock = threading.Lock()
@exit()
def exit():
print("Exit llm")
@method()
def generate(
self, prompt: str, gen_schema: str | None, gen_cfg: str | None
) -> dict:
"""
Perform a generation action using the LLM
"""
print(f"Generate {prompt=}")
if gen_cfg:
gen_cfg = self.GenerationConfig.from_dict(json.loads(gen_cfg))
gen_cfg.pad_token_id = self.tokenizer.eos_token_id
gen_cfg.eos_token_id = self.tokenizer.eos_token_id
else:
gen_cfg = self.gen_cfg
# If a gen_schema is given, conform to gen_schema
with self.lock:
if gen_schema:
import jsonformer
print(f"Schema {gen_schema=}")
jsonformer_llm = jsonformer.Jsonformer(
model=self.model,
tokenizer=self.tokenizer,
json_schema=json.loads(gen_schema),
prompt=prompt,
max_string_token_length=gen_cfg.max_new_tokens,
)
response = jsonformer_llm()
else:
# If no gen_schema, perform prompt only generation
# tokenize prompt
input_ids = self.tokenizer.encode(prompt, return_tensors="pt").to(
self.model.device
)
output = self.model.generate(input_ids, generation_config=gen_cfg)
# decode output
response = self.tokenizer.decode(
output[0].cpu(), skip_special_tokens=True
)
response = response[len(prompt) :]
response = {"long_summary": response}
print(f"Generated {response=}")
return {"text": response}
# -------------------------------------------------------------------
# Web API
# -------------------------------------------------------------------
@app.function(
scaledown_window=60 * 10,
timeout=60 * 5,
allow_concurrent_inputs=30,
secrets=[
Secret.from_name("reflector-gpu"),
],
)
@asgi_app()
def web():
from fastapi import Depends, FastAPI, HTTPException, status
from fastapi.security import OAuth2PasswordBearer
from pydantic import BaseModel
llmstub = LLM()
app = FastAPI()
oauth2_scheme = OAuth2PasswordBearer(tokenUrl="token")
def apikey_auth(apikey: str = Depends(oauth2_scheme)):
if apikey != os.environ["REFLECTOR_GPU_APIKEY"]:
raise HTTPException(
status_code=status.HTTP_401_UNAUTHORIZED,
detail="Invalid API key",
headers={"WWW-Authenticate": "Bearer"},
)
class LLMRequest(BaseModel):
prompt: str
gen_schema: Optional[dict] = None
gen_cfg: Optional[dict] = None
@app.post("/llm", dependencies=[Depends(apikey_auth)])
def llm(
req: LLMRequest,
):
gen_schema = json.dumps(req.gen_schema) if req.gen_schema else None
gen_cfg = json.dumps(req.gen_cfg) if req.gen_cfg else None
func = llmstub.generate.spawn(
prompt=req.prompt, gen_schema=gen_schema, gen_cfg=gen_cfg
)
result = func.get()
return result
return app

@@ -1,41 +1,78 @@
import os
import tempfile
import sys
import threading
import uuid
from typing import Generator, Mapping, NamedTuple, NewType, TypedDict
from urllib.parse import urlparse
import modal
from pydantic import BaseModel
MODELS_DIR = "/models"
MODEL_NAME = "large-v2"
MODEL_COMPUTE_TYPE: str = "float16"
MODEL_NUM_WORKERS: int = 1
MINUTES = 60 # seconds
SAMPLERATE = 16000
UPLOADS_PATH = "/uploads"
CACHE_PATH = "/models"
SUPPORTED_FILE_EXTENSIONS = ["mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"]
VAD_CONFIG = {
"batch_max_duration": 30.0,
"silence_padding": 0.5,
"window_size": 512,
}
volume = modal.Volume.from_name("models", create_if_missing=True)
WhisperUniqFilename = NewType("WhisperUniqFilename", str)
AudioFileExtension = NewType("AudioFileExtension", str)
app = modal.App("reflector-transcriber")
model_cache = modal.Volume.from_name("models", create_if_missing=True)
upload_volume = modal.Volume.from_name("whisper-uploads", create_if_missing=True)
class TimeSegment(NamedTuple):
"""Represents a time segment with start and end times."""
start: float
end: float
class AudioSegment(NamedTuple):
"""Represents an audio segment with timing and audio data."""
start: float
end: float
audio: any
class TranscriptResult(NamedTuple):
"""Represents a transcription result with text and word timings."""
text: str
words: list["WordTiming"]
class WordTiming(TypedDict):
"""Represents a word with its timing information."""
word: str
start: float
end: float
def download_model():
from faster_whisper import download_model
volume.reload()
model_cache.reload()
download_model(MODEL_NAME, cache_dir=MODELS_DIR)
download_model(MODEL_NAME, cache_dir=CACHE_PATH)
volume.commit()
model_cache.commit()
image = (
modal.Image.debian_slim(python_version="3.12")
.pip_install(
"huggingface_hub==0.27.1",
"hf-transfer==0.1.9",
"torch==2.5.1",
"faster-whisper==1.1.1",
)
.env(
{
"HF_HUB_ENABLE_HF_TRANSFER": "1",
@@ -45,19 +82,98 @@ image = (
),
}
)
.run_function(download_model, volumes={MODELS_DIR: volume})
.apt_install("ffmpeg")
.pip_install(
"huggingface_hub==0.27.1",
"hf-transfer==0.1.9",
"torch==2.5.1",
"faster-whisper==1.1.1",
"fastapi==0.115.12",
"requests",
"librosa==0.10.1",
"numpy<2",
"silero-vad==5.1.0",
)
.run_function(download_model, volumes={CACHE_PATH: model_cache})
)
def detect_audio_format(url: str, headers: Mapping[str, str]) -> AudioFileExtension:
parsed_url = urlparse(url)
url_path = parsed_url.path
for ext in SUPPORTED_FILE_EXTENSIONS:
if url_path.lower().endswith(f".{ext}"):
return AudioFileExtension(ext)
content_type = headers.get("content-type", "").lower()
if "audio/mpeg" in content_type or "audio/mp3" in content_type:
return AudioFileExtension("mp3")
if "audio/wav" in content_type:
return AudioFileExtension("wav")
if "audio/mp4" in content_type:
return AudioFileExtension("mp4")
raise ValueError(
f"Unsupported audio format for URL: {url}. "
f"Supported extensions: {', '.join(SUPPORTED_FILE_EXTENSIONS)}"
)
def download_audio_to_volume(
audio_file_url: str,
) -> tuple[WhisperUniqFilename, AudioFileExtension]:
import requests
from fastapi import HTTPException
response = requests.head(audio_file_url, allow_redirects=True)
if response.status_code == 404:
raise HTTPException(status_code=404, detail="Audio file not found")
response = requests.get(audio_file_url, allow_redirects=True)
response.raise_for_status()
audio_suffix = detect_audio_format(audio_file_url, response.headers)
unique_filename = WhisperUniqFilename(f"{uuid.uuid4()}.{audio_suffix}")
file_path = f"{UPLOADS_PATH}/{unique_filename}"
with open(file_path, "wb") as f:
f.write(response.content)
upload_volume.commit()
return unique_filename, audio_suffix
def pad_audio(audio_array, sample_rate: int = SAMPLERATE):
"""Add 0.5s of silence if audio is shorter than the silence_padding window.
Whisper does not require this strictly, but aligning behavior with Parakeet
avoids edge-case crashes on extremely short inputs and makes comparisons easier.
"""
import numpy as np
audio_duration = len(audio_array) / sample_rate
if audio_duration < VAD_CONFIG["silence_padding"]:
silence_samples = int(sample_rate * VAD_CONFIG["silence_padding"])
silence = np.zeros(silence_samples, dtype=np.float32)
return np.concatenate([audio_array, silence])
return audio_array
@app.cls(
gpu="A10G",
timeout=5 * MINUTES,
scaledown_window=5 * MINUTES,
allow_concurrent_inputs=6,
image=image,
volumes={MODELS_DIR: volume},
volumes={CACHE_PATH: model_cache, UPLOADS_PATH: upload_volume},
)
class Transcriber:
@modal.concurrent(max_inputs=10)
class TranscriberWhisperLive:
"""Live transcriber class for small audio segments (A10G).
Mirrors the Parakeet live class API but uses Faster-Whisper under the hood.
"""
@modal.enter()
def enter(self):
import faster_whisper
@@ -71,23 +187,200 @@ class Transcriber:
device=self.device,
compute_type=MODEL_COMPUTE_TYPE,
num_workers=MODEL_NUM_WORKERS,
download_root=MODELS_DIR,
download_root=CACHE_PATH,
local_files_only=True,
)
print(f"Model is on device: {self.device}")
@modal.method()
def transcribe_segment(
self,
audio_data: str,
audio_suffix: str,
language: str,
filename: str,
language: str = "en",
):
with tempfile.NamedTemporaryFile("wb+", suffix=f".{audio_suffix}") as fp:
fp.write(audio_data)
"""Transcribe a single uploaded audio file by filename."""
upload_volume.reload()
file_path = f"{UPLOADS_PATH}/{filename}"
if not os.path.exists(file_path):
raise FileNotFoundError(f"File not found: {file_path}")
with self.lock:
with NoStdStreams():
segments, _ = self.model.transcribe(
file_path,
language=language,
beam_size=5,
word_timestamps=True,
vad_filter=True,
vad_parameters={"min_silence_duration_ms": 500},
)
segments = list(segments)
text = "".join(segment.text for segment in segments).strip()
words = [
{
"word": word.word,
"start": round(float(word.start), 2),
"end": round(float(word.end), 2),
}
for segment in segments
for word in segment.words
]
return {"text": text, "words": words}
@modal.method()
def transcribe_batch(
self,
filenames: list[str],
language: str = "en",
):
"""Transcribe multiple uploaded audio files and return per-file results."""
upload_volume.reload()
results = []
for filename in filenames:
file_path = f"{UPLOADS_PATH}/{filename}"
if not os.path.exists(file_path):
raise FileNotFoundError(f"Batch file not found: {file_path}")
with self.lock:
with NoStdStreams():
segments, _ = self.model.transcribe(
file_path,
language=language,
beam_size=5,
word_timestamps=True,
vad_filter=True,
vad_parameters={"min_silence_duration_ms": 500},
)
segments = list(segments)
text = "".join(seg.text for seg in segments).strip()
words = [
{
"word": w.word,
"start": round(float(w.start), 2),
"end": round(float(w.end), 2),
}
for seg in segments
for w in seg.words
]
results.append(
{
"filename": filename,
"text": text,
"words": words,
}
)
return results
@app.cls(
gpu="L40S",
timeout=15 * MINUTES,
image=image,
volumes={CACHE_PATH: model_cache, UPLOADS_PATH: upload_volume},
)
class TranscriberWhisperFile:
"""File transcriber for larger/longer audio, using VAD-driven batching (L40S)."""
@modal.enter()
def enter(self):
import faster_whisper
import torch
from silero_vad import load_silero_vad
self.lock = threading.Lock()
self.use_gpu = torch.cuda.is_available()
self.device = "cuda" if self.use_gpu else "cpu"
self.model = faster_whisper.WhisperModel(
MODEL_NAME,
device=self.device,
compute_type=MODEL_COMPUTE_TYPE,
num_workers=MODEL_NUM_WORKERS,
download_root=CACHE_PATH,
local_files_only=True,
)
self.vad_model = load_silero_vad(onnx=False)
@modal.method()
def transcribe_segment(
self, filename: str, timestamp_offset: float = 0.0, language: str = "en"
):
import librosa
import numpy as np
from silero_vad import VADIterator
def vad_segments(
audio_array,
sample_rate: int = SAMPLERATE,
window_size: int = VAD_CONFIG["window_size"],
) -> Generator[TimeSegment, None, None]:
"""Generate speech segments as TimeSegment using Silero VAD."""
iterator = VADIterator(self.vad_model, sampling_rate=sample_rate)
start = None
for i in range(0, len(audio_array), window_size):
chunk = audio_array[i : i + window_size]
if len(chunk) < window_size:
chunk = np.pad(
chunk, (0, window_size - len(chunk)), mode="constant"
)
speech = iterator(chunk)
if not speech:
continue
if "start" in speech:
start = speech["start"]
continue
if "end" in speech and start is not None:
end = speech["end"]
yield TimeSegment(
start / float(SAMPLERATE), end / float(SAMPLERATE)
)
start = None
iterator.reset_states()
upload_volume.reload()
file_path = f"{UPLOADS_PATH}/{filename}"
if not os.path.exists(file_path):
raise FileNotFoundError(f"File not found: {file_path}")
audio_array, _sr = librosa.load(file_path, sr=SAMPLERATE, mono=True)
# Batch segments up to ~30s windows by merging contiguous VAD segments
merged_batches: list[TimeSegment] = []
batch_start = None
batch_end = None
max_duration = VAD_CONFIG["batch_max_duration"]
for segment in vad_segments(audio_array):
seg_start, seg_end = segment.start, segment.end
if batch_start is None:
batch_start, batch_end = seg_start, seg_end
continue
if seg_end - batch_start <= max_duration:
batch_end = seg_end
else:
merged_batches.append(TimeSegment(batch_start, batch_end))
batch_start, batch_end = seg_start, seg_end
if batch_start is not None and batch_end is not None:
merged_batches.append(TimeSegment(batch_start, batch_end))
all_text = []
all_words = []
for segment in merged_batches:
start_time, end_time = segment.start, segment.end
s_idx = int(start_time * SAMPLERATE)
e_idx = int(end_time * SAMPLERATE)
segment = audio_array[s_idx:e_idx]
segment = pad_audio(segment, SAMPLERATE)
with self.lock:
segments, _ = self.model.transcribe(
fp.name,
segment,
language=language,
beam_size=5,
word_timestamps=True,
@@ -96,66 +389,220 @@ class Transcriber:
)
segments = list(segments)
text = "".join(segment.text for segment in segments)
text = "".join(seg.text for seg in segments).strip()
words = [
{"word": word.word, "start": word.start, "end": word.end}
for segment in segments
for word in segment.words
{
"word": w.word,
"start": round(float(w.start) + start_time + timestamp_offset, 2),
"end": round(float(w.end) + start_time + timestamp_offset, 2),
}
for seg in segments
for w in seg.words
]
if text:
all_text.append(text)
all_words.extend(words)
return {"text": text, "words": words}
return {"text": " ".join(all_text), "words": all_words}
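The batching loop above merges contiguous VAD segments into windows of at most `batch_max_duration` seconds before transcription. A minimal standalone sketch of that merge step, assuming segments are plain start/end pairs in seconds; the function name `merge_segments` and the sample values are hypothetical and not part of the diff:

```python
from typing import Iterable, Iterator, NamedTuple


class TimeSegment(NamedTuple):
    start: float
    end: float


def merge_segments(
    segments: Iterable[TimeSegment], max_duration: float
) -> Iterator[TimeSegment]:
    """Merge consecutive segments while the merged span stays within max_duration."""
    batch_start = None
    batch_end = None
    for seg in segments:
        if batch_start is None:
            # First segment opens the first batch.
            batch_start, batch_end = seg.start, seg.end
            continue
        if seg.end - batch_start <= max_duration:
            # Still within the window: extend the current batch (silence included).
            batch_end = seg.end
        else:
            # Window exceeded: emit the current batch and open a new one.
            yield TimeSegment(batch_start, batch_end)
            batch_start, batch_end = seg.start, seg.end
    if batch_start is not None:
        yield TimeSegment(batch_start, batch_end)


raw = [
    TimeSegment(0, 2),
    TimeSegment(3, 5),
    TimeSegment(6, 8),
    TimeSegment(10, 11),
    TimeSegment(12, 15),
    TimeSegment(17, 19),
    TimeSegment(20, 22),
]
batches = list(merge_segments(raw, max_duration=10))
```

In this sketch a single VAD segment longer than `max_duration` is emitted unchanged, so the limit acts as a merge threshold rather than a hard cap on batch length.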
def detect_audio_format(url: str, headers: dict) -> str:
from urllib.parse import urlparse
from fastapi import HTTPException
url_path = urlparse(url).path
for ext in SUPPORTED_FILE_EXTENSIONS:
if url_path.lower().endswith(f".{ext}"):
return ext
content_type = headers.get("content-type", "").lower()
if "audio/mpeg" in content_type or "audio/mp3" in content_type:
return "mp3"
if "audio/wav" in content_type:
return "wav"
if "audio/mp4" in content_type:
return "mp4"
raise HTTPException(
status_code=400,
detail=(
f"Unsupported audio format for URL. Supported extensions: {', '.join(SUPPORTED_FILE_EXTENSIONS)}"
),
)
def download_audio_to_volume(audio_file_url: str) -> tuple[str, str]:
import requests
from fastapi import HTTPException
response = requests.head(audio_file_url, allow_redirects=True)
if response.status_code == 404:
raise HTTPException(status_code=404, detail="Audio file not found")
response = requests.get(audio_file_url, allow_redirects=True)
response.raise_for_status()
audio_suffix = detect_audio_format(audio_file_url, response.headers)
unique_filename = f"{uuid.uuid4()}.{audio_suffix}"
file_path = f"{UPLOADS_PATH}/{unique_filename}"
with open(file_path, "wb") as f:
f.write(response.content)
upload_volume.commit()
return unique_filename, audio_suffix
@app.function(
scaledown_window=60,
timeout=60,
allow_concurrent_inputs=40,
timeout=600,
secrets=[
modal.Secret.from_name("reflector-gpu"),
],
volumes={MODELS_DIR: volume},
volumes={CACHE_PATH: model_cache, UPLOADS_PATH: upload_volume},
image=image,
)
@modal.concurrent(max_inputs=40)
@modal.asgi_app()
def web():
from fastapi import Body, Depends, FastAPI, HTTPException, UploadFile, status
from fastapi import (
Body,
Depends,
FastAPI,
Form,
HTTPException,
UploadFile,
status,
)
from fastapi.security import OAuth2PasswordBearer
from typing_extensions import Annotated
transcriber = Transcriber()
transcriber_live = TranscriberWhisperLive()
transcriber_file = TranscriberWhisperFile()
app = FastAPI()
oauth2_scheme = OAuth2PasswordBearer(tokenUrl="token")
supported_file_types = ["mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"]
def apikey_auth(apikey: str = Depends(oauth2_scheme)):
if apikey != os.environ["REFLECTOR_GPU_APIKEY"]:
raise HTTPException(
status_code=status.HTTP_401_UNAUTHORIZED,
detail="Invalid API key",
headers={"WWW-Authenticate": "Bearer"},
)
if apikey == os.environ["REFLECTOR_GPU_APIKEY"]:
return
raise HTTPException(
status_code=status.HTTP_401_UNAUTHORIZED,
detail="Invalid API key",
headers={"WWW-Authenticate": "Bearer"},
)
class TranscriptResponse(BaseModel):
result: dict
class TranscriptResponse(dict):
pass
@app.post("/v1/audio/transcriptions", dependencies=[Depends(apikey_auth)])
def transcribe(
file: UploadFile,
model: str = "whisper-1",
language: Annotated[str, Body(...)] = "en",
) -> TranscriptResponse:
audio_data = file.file.read()
audio_suffix = file.filename.split(".")[-1]
assert audio_suffix in supported_file_types
file: UploadFile = None,
files: list[UploadFile] | None = None,
model: str = Form(MODEL_NAME),
language: str = Form("en"),
batch: bool = Form(False),
):
if not file and not files:
raise HTTPException(
status_code=400, detail="Either 'file' or 'files' parameter is required"
)
if batch and not files:
raise HTTPException(
status_code=400, detail="Batch transcription requires 'files'"
)
func = transcriber.transcribe_segment.spawn(
audio_data=audio_data,
audio_suffix=audio_suffix,
language=language,
)
result = func.get()
return result
upload_files = [file] if file else files
uploaded_filenames: list[str] = []
for upload_file in upload_files:
audio_suffix = upload_file.filename.split(".")[-1]
if audio_suffix not in SUPPORTED_FILE_EXTENSIONS:
raise HTTPException(
status_code=400,
detail=(
f"Unsupported audio format. Supported extensions: {', '.join(SUPPORTED_FILE_EXTENSIONS)}"
),
)
unique_filename = f"{uuid.uuid4()}.{audio_suffix}"
file_path = f"{UPLOADS_PATH}/{unique_filename}"
with open(file_path, "wb") as f:
content = upload_file.file.read()
f.write(content)
uploaded_filenames.append(unique_filename)
upload_volume.commit()
try:
if batch and len(upload_files) > 1:
func = transcriber_live.transcribe_batch.spawn(
filenames=uploaded_filenames,
language=language,
)
results = func.get()
return {"results": results}
results = []
for filename in uploaded_filenames:
func = transcriber_live.transcribe_segment.spawn(
filename=filename,
language=language,
)
result = func.get()
result["filename"] = filename
results.append(result)
return {"results": results} if len(results) > 1 else results[0]
finally:
for filename in uploaded_filenames:
try:
file_path = f"{UPLOADS_PATH}/{filename}"
os.remove(file_path)
except Exception:
pass
upload_volume.commit()
@app.post("/v1/audio/transcriptions-from-url", dependencies=[Depends(apikey_auth)])
def transcribe_from_url(
audio_file_url: str = Body(
..., description="URL of the audio file to transcribe"
),
model: str = Body(MODEL_NAME),
language: str = Body("en"),
timestamp_offset: float = Body(0.0),
):
unique_filename, _audio_suffix = download_audio_to_volume(audio_file_url)
try:
func = transcriber_file.transcribe_segment.spawn(
filename=unique_filename,
timestamp_offset=timestamp_offset,
language=language,
)
result = func.get()
return result
finally:
try:
file_path = f"{UPLOADS_PATH}/{unique_filename}"
os.remove(file_path)
upload_volume.commit()
except Exception:
pass
return app
class NoStdStreams:
def __init__(self):
self.devnull = open(os.devnull, "w")
def __enter__(self):
self._stdout, self._stderr = sys.stdout, sys.stderr
self._stdout.flush()
self._stderr.flush()
sys.stdout, sys.stderr = self.devnull, self.devnull
def __exit__(self, exc_type, exc_value, traceback):
sys.stdout, sys.stderr = self._stdout, self._stderr
self.devnull.close()
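`NoStdStreams` silences library output by swapping `sys.stdout` and `sys.stderr` for the duration of a `with` block. A roughly equivalent sketch built on the standard library's `contextlib` helpers, offered only as an illustration; the name `no_std_streams` is hypothetical and does not appear in the diff:

```python
import contextlib
import io
import os


@contextlib.contextmanager
def no_std_streams():
    """Redirect Python-level stdout and stderr to os.devnull inside the with-block."""
    with open(os.devnull, "w") as devnull:
        with contextlib.redirect_stdout(devnull), contextlib.redirect_stderr(devnull):
            yield


# Demonstration: record what reaches stdout around the silenced block.
captured = io.StringIO()
with contextlib.redirect_stdout(captured):
    print("before")
    with no_std_streams():
        print("noisy library output")  # discarded
    print("after")

output = captured.getvalue()
```

Like the class in the diff, this only redirects Python's `sys.stdout` and `sys.stderr` objects; output written directly to the underlying file descriptors by native code is not affected.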


@@ -0,0 +1,658 @@
import logging
import os
import sys
import threading
import uuid
from typing import Generator, Mapping, NamedTuple, NewType, TypedDict
from urllib.parse import urlparse
import modal
MODEL_NAME = "nvidia/parakeet-tdt-0.6b-v2"
SUPPORTED_FILE_EXTENSIONS = ["mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"]
SAMPLERATE = 16000
UPLOADS_PATH = "/uploads"
CACHE_PATH = "/cache"
VAD_CONFIG = {
"batch_max_duration": 30.0,
"silence_padding": 0.5,
"window_size": 512,
}
ParakeetUniqFilename = NewType("ParakeetUniqFilename", str)
AudioFileExtension = NewType("AudioFileExtension", str)
class TimeSegment(NamedTuple):
"""Represents a time segment with start and end times."""
start: float
end: float
class AudioSegment(NamedTuple):
"""Represents an audio segment with timing and audio data."""
start: float
end: float
audio: object  # NumPy array of samples; the builtin any() is a function, not a type
class TranscriptResult(NamedTuple):
"""Represents a transcription result with text and word timings."""
text: str
words: list["WordTiming"]
class WordTiming(TypedDict):
"""Represents a word with its timing information."""
word: str
start: float
end: float
app = modal.App("reflector-transcriber-parakeet")
# Volume for caching model weights
model_cache = modal.Volume.from_name("parakeet-model-cache", create_if_missing=True)
# Volume for temporary file uploads
upload_volume = modal.Volume.from_name("parakeet-uploads", create_if_missing=True)
image = (
modal.Image.from_registry(
"nvidia/cuda:12.8.0-cudnn-devel-ubuntu22.04", add_python="3.12"
)
.env(
{
"HF_HUB_ENABLE_HF_TRANSFER": "1",
"HF_HOME": "/cache",
"DEBIAN_FRONTEND": "noninteractive",
"CXX": "g++",
"CC": "g++",
}
)
.apt_install("ffmpeg")
.pip_install(
"hf_transfer==0.1.9",
"huggingface_hub[hf-xet]==0.31.2",
"nemo_toolkit[asr]==2.3.0",
"cuda-python==12.8.0",
"fastapi==0.115.12",
"numpy<2",
"librosa==0.10.1",
"requests",
"silero-vad==5.1.0",
"torch",
)
.entrypoint([]) # silence chatty logs by container on start
)
def detect_audio_format(url: str, headers: Mapping[str, str]) -> AudioFileExtension:
parsed_url = urlparse(url)
url_path = parsed_url.path
for ext in SUPPORTED_FILE_EXTENSIONS:
if url_path.lower().endswith(f".{ext}"):
return AudioFileExtension(ext)
content_type = headers.get("content-type", "").lower()
if "audio/mpeg" in content_type or "audio/mp3" in content_type:
return AudioFileExtension("mp3")
if "audio/wav" in content_type:
return AudioFileExtension("wav")
if "audio/mp4" in content_type:
return AudioFileExtension("mp4")
raise ValueError(
f"Unsupported audio format for URL: {url}. "
f"Supported extensions: {', '.join(SUPPORTED_FILE_EXTENSIONS)}"
)
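`detect_audio_format` checks the extension of the URL path first and falls back to the `Content-Type` header only when the path has no recognized extension. A self-contained sketch of that precedence, using a trimmed copy of the function; the name `guess_extension` and the example URLs are hypothetical:

```python
from typing import Mapping
from urllib.parse import urlparse

SUPPORTED_FILE_EXTENSIONS = ["mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"]


def guess_extension(url: str, headers: Mapping[str, str]) -> str:
    # 1. Prefer the extension found in the URL path (case-insensitive).
    path = urlparse(url).path.lower()
    for ext in SUPPORTED_FILE_EXTENSIONS:
        if path.endswith(f".{ext}"):
            return ext
    # 2. Otherwise fall back to the Content-Type header.
    content_type = headers.get("content-type", "").lower()
    if "audio/mpeg" in content_type or "audio/mp3" in content_type:
        return "mp3"
    if "audio/wav" in content_type:
        return "wav"
    if "audio/mp4" in content_type:
        return "mp4"
    raise ValueError(f"Unsupported audio format for URL: {url}")


# The query string is ignored because only the parsed path is inspected.
by_path = guess_extension("https://example.com/a/talk.WAV?sig=abc", {})
# No extension in the path, so the header decides.
by_header = guess_extension(
    "https://example.com/download/42", {"content-type": "audio/mpeg"}
)
```

With a plain `dict`, the header key must be the lowercase `content-type`. Response headers from `requests` are case-insensitive, so the original call site does not have this restriction.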
def download_audio_to_volume(
audio_file_url: str,
) -> tuple[ParakeetUniqFilename, AudioFileExtension]:
import requests
from fastapi import HTTPException
response = requests.head(audio_file_url, allow_redirects=True)
if response.status_code == 404:
raise HTTPException(status_code=404, detail="Audio file not found")
response = requests.get(audio_file_url, allow_redirects=True)
response.raise_for_status()
audio_suffix = detect_audio_format(audio_file_url, response.headers)
unique_filename = ParakeetUniqFilename(f"{uuid.uuid4()}.{audio_suffix}")
file_path = f"{UPLOADS_PATH}/{unique_filename}"
with open(file_path, "wb") as f:
f.write(response.content)
upload_volume.commit()
return unique_filename, audio_suffix
def pad_audio(audio_array, sample_rate: int = SAMPLERATE):
"""Add 0.5 seconds of silence if audio is less than 500ms.
This is a workaround for a Parakeet bug where very short audio (<500ms) causes:
ValueError: `char_offsets`: [] and `processed_tokens`: [157, 834, 834, 841]
have to be of the same length
See: https://github.com/NVIDIA/NeMo/issues/8451
"""
import numpy as np
audio_duration = len(audio_array) / sample_rate
if audio_duration < 0.5:
silence_samples = int(sample_rate * 0.5)
silence = np.zeros(silence_samples, dtype=np.float32)
return np.concatenate([audio_array, silence])
return audio_array
@app.cls(
gpu="A10G",
timeout=600,
scaledown_window=300,
image=image,
volumes={CACHE_PATH: model_cache, UPLOADS_PATH: upload_volume},
enable_memory_snapshot=True,
experimental_options={"enable_gpu_snapshot": True},
)
@modal.concurrent(max_inputs=10)
class TranscriberParakeetLive:
@modal.enter(snap=True)
def enter(self):
import nemo.collections.asr as nemo_asr
logging.getLogger("nemo_logger").setLevel(logging.CRITICAL)
self.lock = threading.Lock()
self.model = nemo_asr.models.ASRModel.from_pretrained(model_name=MODEL_NAME)
device = next(self.model.parameters()).device
print(f"Model is on device: {device}")
@modal.method()
def transcribe_segment(
self,
filename: str,
):
import librosa
upload_volume.reload()
file_path = f"{UPLOADS_PATH}/{filename}"
if not os.path.exists(file_path):
raise FileNotFoundError(f"File not found: {file_path}")
audio_array, sample_rate = librosa.load(file_path, sr=SAMPLERATE, mono=True)
padded_audio = pad_audio(audio_array, sample_rate)
with self.lock:
with NoStdStreams():
(output,) = self.model.transcribe([padded_audio], timestamps=True)
text = output.text.strip()
words: list[WordTiming] = [
WordTiming(
# XXX the space added here matches Whisper's output:
# Whisper adds a space to each word, while Parakeet does not
word=word_info["word"] + " ",
start=round(word_info["start"], 2),
end=round(word_info["end"], 2),
)
for word_info in output.timestamp["word"]
]
return {"text": text, "words": words}
@modal.method()
def transcribe_batch(
self,
filenames: list[str],
):
import librosa
upload_volume.reload()
results = []
audio_arrays = []
# Load all audio files with padding
for filename in filenames:
file_path = f"{UPLOADS_PATH}/{filename}"
if not os.path.exists(file_path):
raise FileNotFoundError(f"Batch file not found: {file_path}")
audio_array, sample_rate = librosa.load(file_path, sr=SAMPLERATE, mono=True)
padded_audio = pad_audio(audio_array, sample_rate)
audio_arrays.append(padded_audio)
with self.lock:
with NoStdStreams():
outputs = self.model.transcribe(audio_arrays, timestamps=True)
# Process results for each file
for i, (filename, output) in enumerate(zip(filenames, outputs)):
text = output.text.strip()
words: list[WordTiming] = [
WordTiming(
word=word_info["word"] + " ",
start=round(word_info["start"], 2),
end=round(word_info["end"], 2),
)
for word_info in output.timestamp["word"]
]
results.append(
{
"filename": filename,
"text": text,
"words": words,
}
)
return results
# L40S class for file transcription (bigger files)
@app.cls(
gpu="L40S",
timeout=900,
image=image,
volumes={CACHE_PATH: model_cache, UPLOADS_PATH: upload_volume},
enable_memory_snapshot=True,
experimental_options={"enable_gpu_snapshot": True},
)
class TranscriberParakeetFile:
@modal.enter(snap=True)
def enter(self):
import nemo.collections.asr as nemo_asr
import torch
from silero_vad import load_silero_vad
logging.getLogger("nemo_logger").setLevel(logging.CRITICAL)
self.model = nemo_asr.models.ASRModel.from_pretrained(model_name=MODEL_NAME)
device = next(self.model.parameters()).device
print(f"Model is on device: {device}")
torch.set_num_threads(1)
self.vad_model = load_silero_vad(onnx=False)
print("Silero VAD initialized")
@modal.method()
def transcribe_segment(
self,
filename: str,
timestamp_offset: float = 0.0,
):
import librosa
import numpy as np
from silero_vad import VADIterator
def load_and_convert_audio(file_path):
audio_array, sample_rate = librosa.load(file_path, sr=SAMPLERATE, mono=True)
return audio_array
def vad_segment_generator(
audio_array,
) -> Generator[TimeSegment, None, None]:
"""Generate speech segments using VAD with start/end sample indices"""
vad_iterator = VADIterator(self.vad_model, sampling_rate=SAMPLERATE)
window_size = VAD_CONFIG["window_size"]
start = None
for i in range(0, len(audio_array), window_size):
chunk = audio_array[i : i + window_size]
if len(chunk) < window_size:
chunk = np.pad(
chunk, (0, window_size - len(chunk)), mode="constant"
)
speech_dict = vad_iterator(chunk)
if not speech_dict:
continue
if "start" in speech_dict:
start = speech_dict["start"]
continue
if "end" in speech_dict and start is not None:
end = speech_dict["end"]
start_time = start / float(SAMPLERATE)
end_time = end / float(SAMPLERATE)
yield TimeSegment(start_time, end_time)
start = None
vad_iterator.reset_states()
def batch_speech_segments(
segments: Generator[TimeSegment, None, None], max_duration: float
) -> Generator[TimeSegment, None, None]:
"""
Input segments:
[0-2] [3-5] [6-8] [10-11] [12-15] [17-19] [20-22]
↓ (max_duration=10)
Output batches:
[0-8] [10-19] [20-22]
Note: silences are kept for better transcription. The previous implementation
passed segments separately, but the output was less accurate.
"""
batch_start_time = None
batch_end_time = None
for segment in segments:
start_time, end_time = segment.start, segment.end
if batch_start_time is None or batch_end_time is None:
batch_start_time = start_time
batch_end_time = end_time
continue
total_duration = end_time - batch_start_time
if total_duration <= max_duration:
batch_end_time = end_time
continue
yield TimeSegment(batch_start_time, batch_end_time)
batch_start_time = start_time
batch_end_time = end_time
if batch_start_time is None or batch_end_time is None:
return
yield TimeSegment(batch_start_time, batch_end_time)
def batch_segment_to_audio_segment(
segments: Generator[TimeSegment, None, None],
audio_array,
) -> Generator[AudioSegment, None, None]:
"""Extract audio segments and apply padding for Parakeet compatibility.
Uses pad_audio to ensure segments are at least 0.5s long, preventing
Parakeet crashes. This padding may cause slight timing overlaps between
segments, which are corrected by enforce_word_timing_constraints.
"""
for segment in segments:
start_time, end_time = segment.start, segment.end
start_sample = int(start_time * SAMPLERATE)
end_sample = int(end_time * SAMPLERATE)
audio_segment = audio_array[start_sample:end_sample]
padded_segment = pad_audio(audio_segment, SAMPLERATE)
yield AudioSegment(start_time, end_time, padded_segment)
def transcribe_batch(model, audio_segments: list) -> list:
with NoStdStreams():
outputs = model.transcribe(audio_segments, timestamps=True)
return outputs
def enforce_word_timing_constraints(
words: list[WordTiming],
) -> list[WordTiming]:
"""Enforce that word end times don't exceed the start time of the next word.
Due to silence padding added in batch_segment_to_audio_segment for better
transcription accuracy, word timings from different segments may overlap.
This function ensures there are no overlaps by adjusting end times.
"""
if len(words) <= 1:
return words
enforced_words = []
for i, word in enumerate(words):
enforced_word = word.copy()
if i < len(words) - 1:
next_start = words[i + 1]["start"]
if enforced_word["end"] > next_start:
enforced_word["end"] = next_start
enforced_words.append(enforced_word)
return enforced_words
def emit_results(
results: list,
segments_info: list[AudioSegment],
) -> Generator[TranscriptResult, None, None]:
"""Yield transcribed text and word timings from model output, adjusting timestamps to absolute positions."""
for i, (output, segment) in enumerate(zip(results, segments_info)):
start_time, end_time = segment.start, segment.end
text = output.text.strip()
words: list[WordTiming] = [
WordTiming(
word=word_info["word"] + " ",
start=round(
word_info["start"] + start_time + timestamp_offset, 2
),
end=round(word_info["end"] + start_time + timestamp_offset, 2),
)
for word_info in output.timestamp["word"]
]
yield TranscriptResult(text, words)
upload_volume.reload()
file_path = f"{UPLOADS_PATH}/{filename}"
if not os.path.exists(file_path):
raise FileNotFoundError(f"File not found: {file_path}")
audio_array = load_and_convert_audio(file_path)
total_duration = len(audio_array) / float(SAMPLERATE)
all_text_parts: list[str] = []
all_words: list[WordTiming] = []
raw_segments = vad_segment_generator(audio_array)
speech_segments = batch_speech_segments(
raw_segments,
VAD_CONFIG["batch_max_duration"],
)
audio_segments = batch_segment_to_audio_segment(speech_segments, audio_array)
for batch in audio_segments:
audio_segment = batch.audio
results = transcribe_batch(self.model, [audio_segment])
for result in emit_results(
results,
[batch],
):
if not result.text:
continue
all_text_parts.append(result.text)
all_words.extend(result.words)
all_words = enforce_word_timing_constraints(all_words)
combined_text = " ".join(all_text_parts)
return {"text": combined_text, "words": all_words}
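Word timestamps come back relative to each audio batch, so the method above shifts them by the batch start and `timestamp_offset`, then clamps any overlap introduced by silence padding. A small sketch of both steps on plain dictionaries; the helper names `shift_words` and `clamp_overlaps` and the sample values are hypothetical:

```python
def shift_words(words, batch_start, timestamp_offset=0.0):
    """Convert batch-relative word timings to absolute positions in seconds."""
    return [
        {
            "word": w["word"],
            "start": round(w["start"] + batch_start + timestamp_offset, 2),
            "end": round(w["end"] + batch_start + timestamp_offset, 2),
        }
        for w in words
    ]


def clamp_overlaps(words):
    """Ensure no word ends after the next word starts."""
    clamped = []
    for i, word in enumerate(words):
        fixed = dict(word)  # copy so the input list is left untouched
        if i < len(words) - 1 and fixed["end"] > words[i + 1]["start"]:
            fixed["end"] = words[i + 1]["start"]
        clamped.append(fixed)
    return clamped


# Two batches whose word timings overlap once shifted to absolute time.
first = shift_words([{"word": "hello ", "start": 0.0, "end": 0.5}], batch_start=1.0)
second = shift_words([{"word": "world ", "start": 0.0, "end": 0.25}], batch_start=1.25)
words = clamp_overlaps(first + second)
```

Only end times are adjusted in this sketch, so start times stay exactly as the model reported them after shifting.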
@app.function(
scaledown_window=60,
timeout=600,
secrets=[
modal.Secret.from_name("reflector-gpu"),
],
volumes={CACHE_PATH: model_cache, UPLOADS_PATH: upload_volume},
image=image,
)
@modal.concurrent(max_inputs=40)
@modal.asgi_app()
def web():
import os
import uuid
from fastapi import (
Body,
Depends,
FastAPI,
Form,
HTTPException,
UploadFile,
status,
)
from fastapi.security import OAuth2PasswordBearer
from pydantic import BaseModel
transcriber_live = TranscriberParakeetLive()
transcriber_file = TranscriberParakeetFile()
app = FastAPI()
oauth2_scheme = OAuth2PasswordBearer(tokenUrl="token")
def apikey_auth(apikey: str = Depends(oauth2_scheme)):
if apikey == os.environ["REFLECTOR_GPU_APIKEY"]:
return
raise HTTPException(
status_code=status.HTTP_401_UNAUTHORIZED,
detail="Invalid API key",
headers={"WWW-Authenticate": "Bearer"},
)
class TranscriptResponse(BaseModel):
result: dict
@app.post("/v1/audio/transcriptions", dependencies=[Depends(apikey_auth)])
def transcribe(
file: UploadFile = None,
files: list[UploadFile] | None = None,
model: str = Form(MODEL_NAME),
language: str = Form("en"),
batch: bool = Form(False),
):
# Parakeet only supports English
if language != "en":
raise HTTPException(
status_code=400,
detail=f"Parakeet model only supports English. Got language='{language}'",
)
# Handle both single file and multiple files
if not file and not files:
raise HTTPException(
status_code=400, detail="Either 'file' or 'files' parameter is required"
)
if batch and not files:
raise HTTPException(
status_code=400, detail="Batch transcription requires 'files'"
)
upload_files = [file] if file else files
# Upload files to volume
uploaded_filenames = []
for upload_file in upload_files:
audio_suffix = upload_file.filename.split(".")[-1]
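# Reject unsupported formats with a 400 instead of an assertion error
if audio_suffix not in SUPPORTED_FILE_EXTENSIONS:
raise HTTPException(
status_code=400,
detail=(
f"Unsupported audio format. Supported extensions: {', '.join(SUPPORTED_FILE_EXTENSIONS)}"
),
)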
# Generate unique filename
unique_filename = f"{uuid.uuid4()}.{audio_suffix}"
file_path = f"{UPLOADS_PATH}/{unique_filename}"
print(f"Writing file to: {file_path}")
with open(file_path, "wb") as f:
content = upload_file.file.read()
f.write(content)
uploaded_filenames.append(unique_filename)
upload_volume.commit()
try:
# Use A10G live transcriber for per-file transcription
if batch and len(upload_files) > 1:
# Use batch transcription
func = transcriber_live.transcribe_batch.spawn(
filenames=uploaded_filenames,
)
results = func.get()
return {"results": results}
# Per-file transcription
results = []
for filename in uploaded_filenames:
func = transcriber_live.transcribe_segment.spawn(
filename=filename,
)
result = func.get()
result["filename"] = filename
results.append(result)
return {"results": results} if len(results) > 1 else results[0]
finally:
for filename in uploaded_filenames:
try:
file_path = f"{UPLOADS_PATH}/{filename}"
print(f"Deleting file: {file_path}")
os.remove(file_path)
except Exception as e:
print(f"Error deleting {filename}: {e}")
upload_volume.commit()
@app.post("/v1/audio/transcriptions-from-url", dependencies=[Depends(apikey_auth)])
def transcribe_from_url(
audio_file_url: str = Body(
..., description="URL of the audio file to transcribe"
),
model: str = Body(MODEL_NAME),
language: str = Body("en", description="Language code (only 'en' supported)"),
timestamp_offset: float = Body(0.0),
):
# Parakeet only supports English
if language != "en":
raise HTTPException(
status_code=400,
detail=f"Parakeet model only supports English. Got language='{language}'",
)
unique_filename, audio_suffix = download_audio_to_volume(audio_file_url)
try:
func = transcriber_file.transcribe_segment.spawn(
filename=unique_filename,
timestamp_offset=timestamp_offset,
)
result = func.get()
return result
finally:
try:
file_path = f"{UPLOADS_PATH}/{unique_filename}"
print(f"Deleting file: {file_path}")
os.remove(file_path)
upload_volume.commit()
except Exception as e:
print(f"Error cleaning up {unique_filename}: {e}")
return app
class NoStdStreams:
def __init__(self):
self.devnull = open(os.devnull, "w")
def __enter__(self):
self._stdout, self._stderr = sys.stdout, sys.stderr
self._stdout.flush()
self._stderr.flush()
sys.stdout, sys.stderr = self.devnull, self.devnull
def __exit__(self, exc_type, exc_value, traceback):
sys.stdout, sys.stderr = self._stdout, self._stderr
self.devnull.close()


@@ -1,171 +0,0 @@
# # Run an OpenAI-Compatible vLLM Server
import modal
MODELS_DIR = "/llamas"
MODEL_NAME = "NousResearch/Hermes-3-Llama-3.1-8B"
N_GPU = 1
def download_llm():
from huggingface_hub import snapshot_download
print("Downloading LLM model")
snapshot_download(
MODEL_NAME,
local_dir=f"{MODELS_DIR}/{MODEL_NAME}",
ignore_patterns=[
"*.pt",
"*.bin",
"*.pth",
"original/*",
], # Ensure safetensors
)
print("LLM model downloaded")
def move_cache():
from transformers.utils import move_cache as transformers_move_cache
transformers_move_cache()
vllm_image = (
modal.Image.debian_slim(python_version="3.10")
.pip_install("vllm==0.5.3post1")
.env({"HF_HUB_ENABLE_HF_TRANSFER": "1"})
.pip_install(
# "accelerate==0.34.2",
"einops==0.8.0",
"hf-transfer~=0.1",
)
.run_function(download_llm)
.run_function(move_cache)
.pip_install(
"bitsandbytes>=0.42.9",
)
)
app = modal.App("reflector-vllm-hermes3")
@app.function(
image=vllm_image,
gpu=modal.gpu.A100(count=N_GPU, size="40GB"),
timeout=60 * 5,
scaledown_window=60 * 5,
allow_concurrent_inputs=100,
secrets=[
modal.Secret.from_name("reflector-gpu"),
],
)
@modal.asgi_app()
def serve():
import os
import fastapi
import vllm.entrypoints.openai.api_server as api_server
from vllm.engine.arg_utils import AsyncEngineArgs
from vllm.engine.async_llm_engine import AsyncLLMEngine
from vllm.entrypoints.logger import RequestLogger
from vllm.entrypoints.openai.serving_chat import OpenAIServingChat
from vllm.entrypoints.openai.serving_completion import OpenAIServingCompletion
from vllm.usage.usage_lib import UsageContext
TOKEN = os.environ["REFLECTOR_GPU_APIKEY"]
# create a fastAPI app that uses vLLM's OpenAI-compatible router
web_app = fastapi.FastAPI(
title=f"OpenAI-compatible {MODEL_NAME} server",
description="Run an OpenAI-compatible LLM server with vLLM on modal.com",
version="0.0.1",
docs_url="/docs",
)
# security: CORS middleware for external requests
http_bearer = fastapi.security.HTTPBearer(
scheme_name="Bearer Token",
description="See code for authentication details.",
)
web_app.add_middleware(
fastapi.middleware.cors.CORSMiddleware,
allow_origins=["*"],
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
# security: inject dependency on authed routes
async def is_authenticated(api_key: str = fastapi.Security(http_bearer)):
if api_key.credentials != TOKEN:
raise fastapi.HTTPException(
status_code=fastapi.status.HTTP_401_UNAUTHORIZED,
detail="Invalid authentication credentials",
)
return {"username": "authenticated_user"}
router = fastapi.APIRouter(dependencies=[fastapi.Depends(is_authenticated)])
# wrap vllm's router in auth router
router.include_router(api_server.router)
# add authed vllm to our fastAPI app
web_app.include_router(router)
engine_args = AsyncEngineArgs(
model=MODELS_DIR + "/" + MODEL_NAME,
tensor_parallel_size=N_GPU,
gpu_memory_utilization=0.90,
# max_model_len=8096,
enforce_eager=False, # capture the graph for faster inference, but slower cold starts (30s > 20s)
# --- 4 bits load
# quantization="bitsandbytes",
# load_format="bitsandbytes",
)
engine = AsyncLLMEngine.from_engine_args(
engine_args, usage_context=UsageContext.OPENAI_API_SERVER
)
model_config = get_model_config(engine)
request_logger = RequestLogger(max_log_len=2048)
api_server.openai_serving_chat = OpenAIServingChat(
engine,
model_config=model_config,
served_model_names=[MODEL_NAME],
chat_template=None,
response_role="assistant",
lora_modules=[],
prompt_adapters=[],
request_logger=request_logger,
)
api_server.openai_serving_completion = OpenAIServingCompletion(
engine,
model_config=model_config,
served_model_names=[MODEL_NAME],
lora_modules=[],
prompt_adapters=[],
request_logger=request_logger,
)
return web_app
def get_model_config(engine):
import asyncio
try: # adapted from vLLM source -- https://github.com/vllm-project/vllm/blob/507ef787d85dec24490069ffceacbd6b161f4f72/vllm/entrypoints/openai/api_server.py#L235C1-L247C1
event_loop = asyncio.get_running_loop()
except RuntimeError:
event_loop = None
if event_loop is not None and event_loop.is_running():
# If the current is instanced by Ray Serve,
# there is already a running event loop
model_config = event_loop.run_until_complete(engine.get_model_config())
else:
# When using single vLLM without engine_use_ray
model_config = asyncio.run(engine.get_model_config())
return model_config


@@ -1 +1,3 @@
Generic single-database configuration.
Generic single-database configuration.
Both data migrations and schema migrations must be in migrations.


@@ -1,9 +1,10 @@
from logging.config import fileConfig
from alembic import context
from sqlalchemy import engine_from_config, pool
from reflector.db import metadata
from reflector.settings import settings
from sqlalchemy import engine_from_config, pool
# this is the Alembic Config object, which provides
# access to the values within the .ini file in use.

View File

@@ -0,0 +1,36 @@
"""Add webhook fields to rooms
Revision ID: 0194f65cd6d3
Revises: 5a8907fd1d78
Create Date: 2025-08-27 09:03:19.610995
"""
from typing import Sequence, Union
import sqlalchemy as sa
from alembic import op
# revision identifiers, used by Alembic.
revision: str = "0194f65cd6d3"
down_revision: Union[str, None] = "5a8907fd1d78"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
with op.batch_alter_table("room", schema=None) as batch_op:
batch_op.add_column(sa.Column("webhook_url", sa.String(), nullable=True))
batch_op.add_column(sa.Column("webhook_secret", sa.String(), nullable=True))
# ### end Alembic commands ###
def downgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
with op.batch_alter_table("room", schema=None) as batch_op:
batch_op.drop_column("webhook_secret")
batch_op.drop_column("webhook_url")
# ### end Alembic commands ###

View File

@@ -8,7 +8,6 @@ Create Date: 2024-09-24 16:12:56.944133
from typing import Sequence, Union
import sqlalchemy as sa
from alembic import op
# revision identifiers, used by Alembic.

View File

@@ -0,0 +1,64 @@
"""add_long_summary_to_search_vector
Revision ID: 0ab2d7ffaa16
Revises: b1c33bd09963
Create Date: 2025-08-15 13:27:52.680211
"""
from typing import Sequence, Union
from alembic import op
# revision identifiers, used by Alembic.
revision: str = "0ab2d7ffaa16"
down_revision: Union[str, None] = "b1c33bd09963"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# Drop the existing search vector column and index
op.drop_index("idx_transcript_search_vector_en", table_name="transcript")
op.drop_column("transcript", "search_vector_en")
# Recreate the search vector column with long_summary included
op.execute("""
ALTER TABLE transcript ADD COLUMN search_vector_en tsvector
GENERATED ALWAYS AS (
setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
setweight(to_tsvector('english', coalesce(long_summary, '')), 'B') ||
setweight(to_tsvector('english', coalesce(webvtt, '')), 'C')
) STORED
""")
# Recreate the GIN index for the search vector
op.create_index(
"idx_transcript_search_vector_en",
"transcript",
["search_vector_en"],
postgresql_using="gin",
)
def downgrade() -> None:
# Drop the updated search vector column and index
op.drop_index("idx_transcript_search_vector_en", table_name="transcript")
op.drop_column("transcript", "search_vector_en")
# Recreate the original search vector column without long_summary
op.execute("""
ALTER TABLE transcript ADD COLUMN search_vector_en tsvector
GENERATED ALWAYS AS (
setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
setweight(to_tsvector('english', coalesce(webvtt, '')), 'B')
) STORED
""")
# Recreate the GIN index for the search vector
op.create_index(
"idx_transcript_search_vector_en",
"transcript",
["search_vector_en"],
postgresql_using="gin",
)

View File

@@ -0,0 +1,25 @@
"""add_webvtt_field_to_transcript
Revision ID: 0bc0f3ff0111
Revises: b7df9609542c
Create Date: 2025-08-05 19:36:41.740957
"""
from typing import Sequence, Union
import sqlalchemy as sa
from alembic import op
revision: str = "0bc0f3ff0111"
down_revision: Union[str, None] = "b7df9609542c"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
op.add_column("transcript", sa.Column("webvtt", sa.Text(), nullable=True))
def downgrade() -> None:
op.drop_column("transcript", "webvtt")

View File

@@ -0,0 +1,36 @@
"""remove user_id from meeting table
Revision ID: 0ce521cda2ee
Revises: 6dec9fb5b46c
Create Date: 2025-09-10 12:40:55.688899
"""
from typing import Sequence, Union
import sqlalchemy as sa
from alembic import op
# revision identifiers, used by Alembic.
revision: str = "0ce521cda2ee"
down_revision: Union[str, None] = "6dec9fb5b46c"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
with op.batch_alter_table("meeting", schema=None) as batch_op:
batch_op.drop_column("user_id")
# ### end Alembic commands ###
def downgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
with op.batch_alter_table("meeting", schema=None) as batch_op:
batch_op.add_column(
sa.Column("user_id", sa.VARCHAR(), autoincrement=False, nullable=True)
)
# ### end Alembic commands ###

View File

@@ -5,11 +5,11 @@ Revises: f819277e5169
Create Date: 2023-11-07 11:12:21.614198
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
from alembic import op
# revision identifiers, used by Alembic.
revision: str = "0fea6d96b096"

View File

@@ -0,0 +1,46 @@
"""add_full_text_search
Revision ID: 116b2f287eab
Revises: 0bc0f3ff0111
Create Date: 2025-08-07 11:27:38.473517
"""
from typing import Sequence, Union
from alembic import op
revision: str = "116b2f287eab"
down_revision: Union[str, None] = "0bc0f3ff0111"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
conn = op.get_bind()
if conn.dialect.name != "postgresql":
return
op.execute("""
ALTER TABLE transcript ADD COLUMN search_vector_en tsvector
GENERATED ALWAYS AS (
setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
setweight(to_tsvector('english', coalesce(webvtt, '')), 'B')
) STORED
""")
op.create_index(
"idx_transcript_search_vector_en",
"transcript",
["search_vector_en"],
postgresql_using="gin",
)
def downgrade() -> None:
conn = op.get_bind()
if conn.dialect.name != "postgresql":
return
op.drop_index("idx_transcript_search_vector_en", table_name="transcript")
op.drop_column("transcript", "search_vector_en")

View File

@@ -5,26 +5,26 @@ Revises: 0fea6d96b096
Create Date: 2023-11-30 15:56:03.341466
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
from alembic import op
# revision identifiers, used by Alembic.
revision: str = '125031f7cb78'
down_revision: Union[str, None] = '0fea6d96b096'
revision: str = "125031f7cb78"
down_revision: Union[str, None] = "0fea6d96b096"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
op.add_column('transcript', sa.Column('participants', sa.JSON(), nullable=True))
op.add_column("transcript", sa.Column("participants", sa.JSON(), nullable=True))
# ### end Alembic commands ###
def downgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
op.drop_column('transcript', 'participants')
op.drop_column("transcript", "participants")
# ### end Alembic commands ###

View File

@@ -5,6 +5,7 @@ Revises: f819277e5169
Create Date: 2025-06-17 14:00:03.000000
"""
from typing import Sequence, Union
import sqlalchemy as sa
@@ -19,16 +20,16 @@ depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
op.create_table(
'meeting_consent',
sa.Column('id', sa.String(), nullable=False),
sa.Column('meeting_id', sa.String(), nullable=False),
sa.Column('user_id', sa.String(), nullable=True),
sa.Column('consent_given', sa.Boolean(), nullable=False),
sa.Column('consent_timestamp', sa.DateTime(), nullable=False),
sa.PrimaryKeyConstraint('id'),
sa.ForeignKeyConstraint(['meeting_id'], ['meeting.id']),
"meeting_consent",
sa.Column("id", sa.String(), nullable=False),
sa.Column("meeting_id", sa.String(), nullable=False),
sa.Column("user_id", sa.String(), nullable=True),
sa.Column("consent_given", sa.Boolean(), nullable=False),
sa.Column("consent_timestamp", sa.DateTime(), nullable=False),
sa.PrimaryKeyConstraint("id"),
sa.ForeignKeyConstraint(["meeting_id"], ["meeting.id"]),
)
def downgrade() -> None:
op.drop_table('meeting_consent')
op.drop_table("meeting_consent")

View File

@@ -5,6 +5,7 @@ Revises: 20250617140003
Create Date: 2025-06-18 14:00:00.000000
"""
from typing import Sequence, Union
import sqlalchemy as sa
@@ -22,4 +23,4 @@ def upgrade() -> None:
def downgrade() -> None:
op.drop_column("transcript", "audio_deleted")
op.drop_column("transcript", "audio_deleted")

View File

@@ -0,0 +1,38 @@
"""Add events column to meetings table
Revision ID: 2890b5104577
Revises: 6e6ea8e607c5
Create Date: 2025-09-02 17:51:41.620777
"""
from typing import Sequence, Union
import sqlalchemy as sa
from alembic import op
# revision identifiers, used by Alembic.
revision: str = "2890b5104577"
down_revision: Union[str, None] = "6e6ea8e607c5"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
with op.batch_alter_table("meeting", schema=None) as batch_op:
batch_op.add_column(
sa.Column(
"events", sa.JSON(), server_default=sa.text("'[]'"), nullable=False
)
)
# ### end Alembic commands ###
def downgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
with op.batch_alter_table("meeting", schema=None) as batch_op:
batch_op.drop_column("events")
# ### end Alembic commands ###

View File

@@ -0,0 +1,32 @@
"""clean up orphaned room_id references in meeting table
Revision ID: 2ae3db106d4e
Revises: def1b5867d4c
Create Date: 2025-09-11 10:35:15.759967
"""
from typing import Sequence, Union
from alembic import op
# revision identifiers, used by Alembic.
revision: str = "2ae3db106d4e"
down_revision: Union[str, None] = "def1b5867d4c"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# Set room_id to NULL for meetings that reference non-existent rooms
op.execute("""
UPDATE meeting
SET room_id = NULL
WHERE room_id IS NOT NULL
AND room_id NOT IN (SELECT id FROM room WHERE id IS NOT NULL)
""")
def downgrade() -> None:
# Cannot restore orphaned references - no operation needed
pass
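
The effect of the cleanup statement can be checked against a throwaway database. This is a minimal sketch using an in-memory SQLite database with simplified room and meeting tables (the two-column schemas and the sample rows are assumptions for illustration, not the real schema). The IS NOT NULL filter in the subquery matters because NOT IN compared against a set containing NULL never evaluates to true.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE room (id TEXT PRIMARY KEY);
    CREATE TABLE meeting (id TEXT PRIMARY KEY, room_id TEXT);
    INSERT INTO room (id) VALUES ('r1');
    INSERT INTO meeting (id, room_id)
    VALUES ('m1', 'r1'), ('m2', 'gone'), ('m3', NULL);
    """
)

# Same statement as in upgrade(): detach meetings whose room no longer exists.
conn.execute(
    """
    UPDATE meeting
    SET room_id = NULL
    WHERE room_id IS NOT NULL
    AND room_id NOT IN (SELECT id FROM room WHERE id IS NOT NULL)
    """
)

rows = conn.execute("SELECT id, room_id FROM meeting ORDER BY id").fetchall()
```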

View File

@@ -5,36 +5,40 @@ Revises: ccd68dc784ff
Create Date: 2025-07-15 16:53:40.397394
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
from alembic import op
# revision identifiers, used by Alembic.
revision: str = '2cf0b60a9d34'
down_revision: Union[str, None] = 'ccd68dc784ff'
revision: str = "2cf0b60a9d34"
down_revision: Union[str, None] = "ccd68dc784ff"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
with op.batch_alter_table('transcript', schema=None) as batch_op:
batch_op.alter_column('duration',
existing_type=sa.INTEGER(),
type_=sa.Float(),
existing_nullable=True)
with op.batch_alter_table("transcript", schema=None) as batch_op:
batch_op.alter_column(
"duration",
existing_type=sa.INTEGER(),
type_=sa.Float(),
existing_nullable=True,
)
# ### end Alembic commands ###
def downgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
with op.batch_alter_table('transcript', schema=None) as batch_op:
batch_op.alter_column('duration',
existing_type=sa.Float(),
type_=sa.INTEGER(),
existing_nullable=True)
with op.batch_alter_table("transcript", schema=None) as batch_op:
batch_op.alter_column(
"duration",
existing_type=sa.Float(),
type_=sa.INTEGER(),
existing_nullable=True,
)
# ### end Alembic commands ###

View File

@@ -5,17 +5,17 @@ Revises: 9920ecfe2735
Create Date: 2023-11-02 19:53:09.116240
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
from sqlalchemy.sql import table, column
from alembic import op
from sqlalchemy import select
from sqlalchemy.sql import column, table
# revision identifiers, used by Alembic.
revision: str = '38a927dcb099'
down_revision: Union[str, None] = '9920ecfe2735'
revision: str = "38a927dcb099"
down_revision: Union[str, None] = "9920ecfe2735"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None

View File

@@ -5,13 +5,13 @@ Revises: 38a927dcb099
Create Date: 2023-11-10 18:12:17.886522
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
from sqlalchemy.sql import table, column
from alembic import op
from sqlalchemy import select
from sqlalchemy.sql import column, table
# revision identifiers, used by Alembic.
revision: str = "4814901632bc"
@@ -24,9 +24,11 @@ def upgrade() -> None:
# for all the transcripts, calculate the duration from the mp3
# and update the duration column
from pathlib import Path
from reflector.settings import settings
import av
from reflector.settings import settings
bind = op.get_bind()
transcript = table(
"transcript", column("id", sa.String), column("duration", sa.Float)

View File

@@ -5,14 +5,11 @@ Revises:
Create Date: 2023-08-29 10:54:45.142974
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision: str = '543ed284d69a'
revision: str = "543ed284d69a"
down_revision: Union[str, None] = None
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None

View File

@@ -0,0 +1,50 @@
"""add cascade delete to meeting consent foreign key
Revision ID: 5a8907fd1d78
Revises: 0ab2d7ffaa16
Create Date: 2025-08-26 17:26:50.945491
"""
from typing import Sequence, Union
from alembic import op
# revision identifiers, used by Alembic.
revision: str = "5a8907fd1d78"
down_revision: Union[str, None] = "0ab2d7ffaa16"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
with op.batch_alter_table("meeting_consent", schema=None) as batch_op:
batch_op.drop_constraint(
batch_op.f("meeting_consent_meeting_id_fkey"), type_="foreignkey"
)
batch_op.create_foreign_key(
batch_op.f("meeting_consent_meeting_id_fkey"),
"meeting",
["meeting_id"],
["id"],
ondelete="CASCADE",
)
# ### end Alembic commands ###
def downgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
with op.batch_alter_table("meeting_consent", schema=None) as batch_op:
batch_op.drop_constraint(
batch_op.f("meeting_consent_meeting_id_fkey"), type_="foreignkey"
)
batch_op.create_foreign_key(
batch_op.f("meeting_consent_meeting_id_fkey"),
"meeting",
["meeting_id"],
["id"],
)
# ### end Alembic commands ###

View File

@@ -0,0 +1,28 @@
"""webhook url and secret null by default
Revision ID: 61882a919591
Revises: 0194f65cd6d3
Create Date: 2025-08-29 11:46:36.738091
"""
from typing import Sequence, Union
# revision identifiers, used by Alembic.
revision: str = "61882a919591"
down_revision: Union[str, None] = "0194f65cd6d3"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
pass
# ### end Alembic commands ###
def downgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
pass
# ### end Alembic commands ###

View File

@@ -8,9 +8,8 @@ Create Date: 2025-06-27 09:04:21.006823
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
from alembic import op
# revision identifiers, used by Alembic.
revision: str = "62dea3db63a5"
@@ -33,7 +32,7 @@ def upgrade() -> None:
sa.Column("user_id", sa.String(), nullable=True),
sa.Column("room_id", sa.String(), nullable=True),
sa.Column(
"is_locked", sa.Boolean(), server_default=sa.text("0"), nullable=False
"is_locked", sa.Boolean(), server_default=sa.text("false"), nullable=False
),
sa.Column("room_mode", sa.String(), server_default="normal", nullable=False),
sa.Column(
@@ -54,12 +53,15 @@ def upgrade() -> None:
sa.Column("user_id", sa.String(), nullable=False),
sa.Column("created_at", sa.DateTime(), nullable=False),
sa.Column(
"zulip_auto_post", sa.Boolean(), server_default=sa.text("0"), nullable=False
"zulip_auto_post",
sa.Boolean(),
server_default=sa.text("false"),
nullable=False,
),
sa.Column("zulip_stream", sa.String(), nullable=True),
sa.Column("zulip_topic", sa.String(), nullable=True),
sa.Column(
"is_locked", sa.Boolean(), server_default=sa.text("0"), nullable=False
"is_locked", sa.Boolean(), server_default=sa.text("false"), nullable=False
),
sa.Column("room_mode", sa.String(), server_default="normal", nullable=False),
sa.Column(

View File

@@ -0,0 +1,38 @@
"""make meeting room_id required and add foreign key
Revision ID: 6dec9fb5b46c
Revises: 61882a919591
Create Date: 2025-09-10 10:47:06.006819
"""
from typing import Sequence, Union
import sqlalchemy as sa
from alembic import op
# revision identifiers, used by Alembic.
revision: str = "6dec9fb5b46c"
down_revision: Union[str, None] = "61882a919591"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
with op.batch_alter_table("meeting", schema=None) as batch_op:
batch_op.alter_column("room_id", existing_type=sa.VARCHAR(), nullable=False)
batch_op.create_foreign_key(
None, "room", ["room_id"], ["id"], ondelete="CASCADE"
)
# ### end Alembic commands ###
def downgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
with op.batch_alter_table("meeting", schema=None) as batch_op:
batch_op.drop_constraint("meeting_room_id_fkey", type_="foreignkey")
batch_op.alter_column("room_id", existing_type=sa.VARCHAR(), nullable=True)
# ### end Alembic commands ###

View File

@@ -0,0 +1,44 @@
"""Add VideoPlatform enum for rooms and meetings
Revision ID: 6e6ea8e607c5
Revises: 61882a919591
Create Date: 2025-09-02 17:33:21.022214
"""
from typing import Sequence, Union
import sqlalchemy as sa
from alembic import op
# revision identifiers, used by Alembic.
revision: str = "6e6ea8e607c5"
down_revision: Union[str, None] = "61882a919591"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
with op.batch_alter_table("meeting", schema=None) as batch_op:
batch_op.add_column(
sa.Column("platform", sa.String(), server_default="whereby", nullable=False)
)
with op.batch_alter_table("room", schema=None) as batch_op:
batch_op.add_column(
sa.Column("platform", sa.String(), server_default="whereby", nullable=False)
)
# ### end Alembic commands ###
def downgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
with op.batch_alter_table("room", schema=None) as batch_op:
batch_op.drop_column("platform")
with op.batch_alter_table("meeting", schema=None) as batch_op:
batch_op.drop_column("platform")
# ### end Alembic commands ###

View File

@@ -20,11 +20,14 @@ depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
sourcekind_enum = sa.Enum("room", "live", "file", name="sourcekind")
sourcekind_enum.create(op.get_bind())
op.add_column(
"transcript",
sa.Column(
"source_kind",
sa.Enum("ROOM", "LIVE", "FILE", name="sourcekind"),
sourcekind_enum,
nullable=True,
),
)
@@ -43,6 +46,8 @@ def upgrade() -> None:
def downgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
op.drop_column("transcript", "source_kind")
sourcekind_enum = sa.Enum(name="sourcekind")
sourcekind_enum.drop(op.get_bind())
# ### end Alembic commands ###

View File

@@ -5,26 +5,28 @@ Revises: 62dea3db63a5
Create Date: 2024-09-06 14:02:06.649665
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
from alembic import op
# revision identifiers, used by Alembic.
revision: str = '764ce6db4388'
down_revision: Union[str, None] = '62dea3db63a5'
revision: str = "764ce6db4388"
down_revision: Union[str, None] = "62dea3db63a5"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
op.add_column('transcript', sa.Column('zulip_message_id', sa.Integer(), nullable=True))
op.add_column(
"transcript", sa.Column("zulip_message_id", sa.Integer(), nullable=True)
)
# ### end Alembic commands ###
def downgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
op.drop_column('transcript', 'zulip_message_id')
op.drop_column("transcript", "zulip_message_id")
# ### end Alembic commands ###

View File

@@ -0,0 +1,106 @@
"""populate_webvtt_from_topics
Revision ID: 8120ebc75366
Revises: 116b2f287eab
Create Date: 2025-08-11 19:11:01.316947
"""
import json
from typing import Sequence, Union
from alembic import op
from sqlalchemy import text
# revision identifiers, used by Alembic.
revision: str = "8120ebc75366"
down_revision: Union[str, None] = "116b2f287eab"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def topics_to_webvtt(topics):
"""Convert topics list to WebVTT format string."""
if not topics:
return None
lines = ["WEBVTT", ""]
for topic in topics:
start_time = format_timestamp(topic.get("start"))
end_time = format_timestamp(topic.get("end"))
text = topic.get("text", "").strip()
if start_time and end_time and text:
lines.append(f"{start_time} --> {end_time}")
lines.append(text)
lines.append("")
return "\n".join(lines).strip()
def format_timestamp(seconds):
"""Format seconds to WebVTT timestamp format (HH:MM:SS.mmm)."""
if seconds is None:
return None
hours = int(seconds // 3600)
minutes = int((seconds % 3600) // 60)
secs = seconds % 60
return f"{hours:02d}:{minutes:02d}:{secs:06.3f}"
def upgrade() -> None:
"""Populate WebVTT field for all transcripts with topics."""
# Get connection
connection = op.get_bind()
# Query all transcripts with topics
result = connection.execute(
text("SELECT id, topics FROM transcript WHERE topics IS NOT NULL")
)
rows = result.fetchall()
print(f"Found {len(rows)} transcripts with topics")
updated_count = 0
error_count = 0
for row in rows:
transcript_id = row[0]
topics_data = row[1]
if not topics_data:
continue
try:
# Parse JSON if it's a string
if isinstance(topics_data, str):
topics_data = json.loads(topics_data)
# Convert topics to WebVTT format
webvtt_content = topics_to_webvtt(topics_data)
if webvtt_content:
# Update the webvtt field
connection.execute(
text("UPDATE transcript SET webvtt = :webvtt WHERE id = :id"),
{"webvtt": webvtt_content, "id": transcript_id},
)
updated_count += 1
print(f"✓ Updated transcript {transcript_id}")
except Exception as e:
error_count += 1
print(f"✗ Error updating transcript {transcript_id}: {e}")
print(f"\nMigration complete!")
print(f" Updated: {updated_count}")
print(f" Errors: {error_count}")
def downgrade() -> None:
"""Clear WebVTT field for all transcripts."""
op.execute(text("UPDATE transcript SET webvtt = NULL"))
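
The two helpers above can be exercised without a database. The sketch below copies format_timestamp and topics_to_webvtt from the migration and runs them on made-up topics (the sample data is an assumption; only the start, end, and text keys are taken from the code above). Topics with a missing timestamp or empty text are skipped.

```python
def format_timestamp(seconds):
    """Format seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    if seconds is None:
        return None
    hours = int(seconds // 3600)
    minutes = int((seconds % 3600) // 60)
    secs = seconds % 60
    return f"{hours:02d}:{minutes:02d}:{secs:06.3f}"


def topics_to_webvtt(topics):
    """Convert a list of topic dicts to a WebVTT string."""
    if not topics:
        return None
    lines = ["WEBVTT", ""]
    for topic in topics:
        start_time = format_timestamp(topic.get("start"))
        end_time = format_timestamp(topic.get("end"))
        text = topic.get("text", "").strip()
        if start_time and end_time and text:
            lines.append(f"{start_time} --> {end_time}")
            lines.append(text)
            lines.append("")
    return "\n".join(lines).strip()


# Made-up sample data for illustration
sample_topics = [
    {"start": 0, "end": 4.5, "text": "Welcome everyone"},
    {"start": 3661.5, "end": None, "text": "Skipped: no end time"},
]
webvtt = topics_to_webvtt(sample_topics)
```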

View File

@@ -9,8 +9,6 @@ Create Date: 2025-07-15 19:30:19.876332
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision: str = "88d292678ba2"
@@ -21,7 +19,7 @@ depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
import json
import re
from sqlalchemy import text
# Get database connection
@@ -58,7 +56,9 @@ def upgrade() -> None:
fixed_events = json.dumps(jevents)
assert "NaN" not in fixed_events
except (json.JSONDecodeError, AssertionError) as e:
print(f"Warning: Invalid JSON for transcript {transcript_id}, skipping: {e}")
print(
f"Warning: Invalid JSON for transcript {transcript_id}, skipping: {e}"
)
continue
# Update the record with fixed JSON
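
The hunk above only shows the validation step, not how the NaN values are repaired. As one possible approach, and not necessarily the one this migration uses, the sketch below parses the stored text with Python's json module (which accepts the non-standard NaN literal by default) and replaces NaN floats with null before re-serializing. The replace_nan helper and the sample payload are hypothetical.

```python
import json
import math


def replace_nan(value):
    """Recursively replace float NaN with None (hypothetical helper)."""
    if isinstance(value, float) and math.isnan(value):
        return None
    if isinstance(value, list):
        return [replace_nan(v) for v in value]
    if isinstance(value, dict):
        return {k: replace_nan(v) for k, v in value.items()}
    return value


# Made-up payload containing the invalid NaN literal
raw = '[{"event": "TOPIC", "data": {"score": NaN}}]'
jevents = json.loads(raw)
fixed_events = json.dumps(replace_nan(jevents))
```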

View File

@@ -5,13 +5,13 @@ Revises: 99365b0cd87b
Create Date: 2023-11-02 18:55:17.019498
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
from sqlalchemy.sql import table, column
from alembic import op
from sqlalchemy import select
from sqlalchemy.sql import column, table
# revision identifiers, used by Alembic.
revision: str = "9920ecfe2735"

View File

@@ -8,8 +8,8 @@ Create Date: 2023-09-01 20:19:47.216334
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
from alembic import op
# revision identifiers, used by Alembic.
revision: str = "99365b0cd87b"
@@ -22,7 +22,7 @@ def upgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
op.execute(
"UPDATE transcript SET events = "
'REPLACE(events, \'"event": "SUMMARY"\', \'"event": "LONG_SUMMARY"\');'
'REPLACE(events::text, \'"event": "SUMMARY"\', \'"event": "LONG_SUMMARY"\')::json;'
)
op.alter_column("transcript", "summary", new_column_name="long_summary")
op.add_column("transcript", sa.Column("title", sa.String(), nullable=True))
@@ -34,7 +34,7 @@ def downgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
op.execute(
"UPDATE transcript SET events = "
'REPLACE(events, \'"event": "LONG_SUMMARY"\', \'"event": "SUMMARY"\');'
'REPLACE(events::text, \'"event": "LONG_SUMMARY"\', \'"event": "SUMMARY"\')::json;'
)
with op.batch_alter_table("transcript", schema=None) as batch_op:
batch_op.alter_column("long_summary", nullable=True, new_column_name="summary")
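
The rename above works on the serialized text of the events column. The sketch below imitates it with Python's str.replace standing in for SQL REPLACE (an illustration under that assumption, with made-up events). It also shows that the match depends on the exact spacing of the stored text: a compact serialization without the space after the colon is left unchanged.

```python
import json

# Made-up events, serialized with the default ", " and ": " separators
serialized = json.dumps([{"event": "SUMMARY"}, {"event": "TOPIC"}])
upgraded = serialized.replace('"event": "SUMMARY"', '"event": "LONG_SUMMARY"')
upgraded_events = json.loads(upgraded)

# Compact serialization: the pattern with a space no longer matches
compact = json.dumps([{"event": "SUMMARY"}], separators=(",", ":"))
compact_after = compact.replace('"event": "SUMMARY"', '"event": "LONG_SUMMARY"')
```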

View File

@@ -0,0 +1,121 @@
"""datetime timezone
Revision ID: 9f5c78d352d6
Revises: 8120ebc75366
Create Date: 2025-08-13 19:18:27.113593
"""
from typing import Sequence, Union
import sqlalchemy as sa
from alembic import op
from sqlalchemy.dialects import postgresql
# revision identifiers, used by Alembic.
revision: str = "9f5c78d352d6"
down_revision: Union[str, None] = "8120ebc75366"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
with op.batch_alter_table("meeting", schema=None) as batch_op:
batch_op.alter_column(
"start_date",
existing_type=postgresql.TIMESTAMP(),
type_=sa.DateTime(timezone=True),
existing_nullable=True,
)
batch_op.alter_column(
"end_date",
existing_type=postgresql.TIMESTAMP(),
type_=sa.DateTime(timezone=True),
existing_nullable=True,
)
with op.batch_alter_table("meeting_consent", schema=None) as batch_op:
batch_op.alter_column(
"consent_timestamp",
existing_type=postgresql.TIMESTAMP(),
type_=sa.DateTime(timezone=True),
existing_nullable=False,
)
with op.batch_alter_table("recording", schema=None) as batch_op:
batch_op.alter_column(
"recorded_at",
existing_type=postgresql.TIMESTAMP(),
type_=sa.DateTime(timezone=True),
existing_nullable=False,
)
with op.batch_alter_table("room", schema=None) as batch_op:
batch_op.alter_column(
"created_at",
existing_type=postgresql.TIMESTAMP(),
type_=sa.DateTime(timezone=True),
existing_nullable=False,
)
with op.batch_alter_table("transcript", schema=None) as batch_op:
batch_op.alter_column(
"created_at",
existing_type=postgresql.TIMESTAMP(),
type_=sa.DateTime(timezone=True),
existing_nullable=True,
)
# ### end Alembic commands ###
def downgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
with op.batch_alter_table("transcript", schema=None) as batch_op:
batch_op.alter_column(
"created_at",
existing_type=sa.DateTime(timezone=True),
type_=postgresql.TIMESTAMP(),
existing_nullable=True,
)
with op.batch_alter_table("room", schema=None) as batch_op:
batch_op.alter_column(
"created_at",
existing_type=sa.DateTime(timezone=True),
type_=postgresql.TIMESTAMP(),
existing_nullable=False,
)
with op.batch_alter_table("recording", schema=None) as batch_op:
batch_op.alter_column(
"recorded_at",
existing_type=sa.DateTime(timezone=True),
type_=postgresql.TIMESTAMP(),
existing_nullable=False,
)
with op.batch_alter_table("meeting_consent", schema=None) as batch_op:
batch_op.alter_column(
"consent_timestamp",
existing_type=sa.DateTime(timezone=True),
type_=postgresql.TIMESTAMP(),
existing_nullable=False,
)
with op.batch_alter_table("meeting", schema=None) as batch_op:
batch_op.alter_column(
"end_date",
existing_type=sa.DateTime(timezone=True),
type_=postgresql.TIMESTAMP(),
existing_nullable=True,
)
batch_op.alter_column(
"start_date",
existing_type=sa.DateTime(timezone=True),
type_=postgresql.TIMESTAMP(),
existing_nullable=True,
)
# ### end Alembic commands ###

View File

@@ -25,7 +25,7 @@ def upgrade() -> None:
sa.Column(
"is_shared",
sa.Boolean(),
server_default=sa.text("0"),
server_default=sa.text("false"),
nullable=False,
),
)

View File

@@ -9,8 +9,6 @@ Create Date: 2025-07-15 20:09:40.253018
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
from sqlalchemy.dialects import postgresql
# revision identifiers, used by Alembic.
revision: str = "a9c9c229ee36"

View File

@@ -5,30 +5,37 @@ Revises: 6ea59639f30e
Create Date: 2025-01-28 10:06:50.446233
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
from alembic import op
# revision identifiers, used by Alembic.
revision: str = 'b0e5f7876032'
down_revision: Union[str, None] = '6ea59639f30e'
revision: str = "b0e5f7876032"
down_revision: Union[str, None] = "6ea59639f30e"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
with op.batch_alter_table('meeting', schema=None) as batch_op:
batch_op.add_column(sa.Column('is_active', sa.Boolean(), server_default=sa.text('1'), nullable=False))
with op.batch_alter_table("meeting", schema=None) as batch_op:
batch_op.add_column(
sa.Column(
"is_active",
sa.Boolean(),
server_default=sa.text("true"),
nullable=False,
)
)
# ### end Alembic commands ###
def downgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
with op.batch_alter_table('meeting', schema=None) as batch_op:
batch_op.drop_column('is_active')
with op.batch_alter_table("meeting", schema=None) as batch_op:
batch_op.drop_column("is_active")
# ### end Alembic commands ###

View File

@@ -0,0 +1,41 @@
"""add_search_optimization_indexes
Revision ID: b1c33bd09963
Revises: 9f5c78d352d6
Create Date: 2025-08-14 17:26:02.117408
"""
from typing import Sequence, Union
from alembic import op
# revision identifiers, used by Alembic.
revision: str = "b1c33bd09963"
down_revision: Union[str, None] = "9f5c78d352d6"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# Add indexes for actual search filtering patterns used in frontend
# Based on /browse page filters: room_id and source_kind
# Index for room_id + created_at (for room-specific searches with date ordering)
op.create_index(
"idx_transcript_room_id_created_at",
"transcript",
["room_id", "created_at"],
if_not_exists=True,
)
# Index for source_kind alone (actively used filter in frontend)
op.create_index(
"idx_transcript_source_kind", "transcript", ["source_kind"], if_not_exists=True
)
def downgrade() -> None:
# Remove the indexes in reverse order
op.drop_index("idx_transcript_source_kind", "transcript", if_exists=True)
op.drop_index("idx_transcript_room_id_created_at", "transcript", if_exists=True)

View File

@@ -8,9 +8,8 @@ Create Date: 2025-06-27 08:57:16.306940
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
from alembic import op
# revision identifiers, used by Alembic.
revision: str = "b3df9681cae9"

View File

@@ -8,9 +8,8 @@ Create Date: 2024-10-11 13:45:28.914902
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
from alembic import op
# revision identifiers, used by Alembic.
revision: str = "b469348df210"

View File

@@ -0,0 +1,35 @@
"""add_unique_constraint_one_active_meeting_per_room
Revision ID: b7df9609542c
Revises: d7fbb74b673b
Create Date: 2025-07-25 16:27:06.959868
"""
from typing import Sequence, Union
import sqlalchemy as sa
from alembic import op
# revision identifiers, used by Alembic.
revision: str = "b7df9609542c"
down_revision: Union[str, None] = "d7fbb74b673b"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# Create a partial unique index that ensures only one active meeting per room
# This works for both PostgreSQL and SQLite
op.create_index(
"idx_one_active_meeting_per_room",
"meeting",
["room_id"],
unique=True,
postgresql_where=sa.text("is_active = true"),
sqlite_where=sa.text("is_active = 1"),
)
def downgrade() -> None:
op.drop_index("idx_one_active_meeting_per_room", table_name="meeting")
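
Review note: the partial unique index above can be tried outside Alembic. The following is a minimal sketch, assuming a simplified `meeting` table with only `id`, `room_id`, and `is_active` (the real table has more columns) and using the standard-library `sqlite3` module rather than the project's database layer. The function name `demo_one_active_meeting_per_room` is hypothetical.

```python
import sqlite3


def demo_one_active_meeting_per_room() -> list[str]:
    """Insert sample meetings and report which ones the index accepts."""
    conn = sqlite3.connect(":memory:")
    # Simplified stand-in for the real meeting table (assumption).
    conn.execute(
        "CREATE TABLE meeting ("
        "id TEXT PRIMARY KEY, room_id TEXT, is_active BOOLEAN NOT NULL)"
    )
    # Same shape as the migration's index: unique on room_id,
    # but only over rows where is_active is true.
    conn.execute(
        "CREATE UNIQUE INDEX idx_one_active_meeting_per_room "
        "ON meeting (room_id) WHERE is_active = 1"
    )

    rows = [
        ("m1", "room-a", 1),  # first active meeting in room-a
        ("m2", "room-a", 0),  # inactive rows are outside the index
        ("m3", "room-a", 1),  # second active meeting in room-a
        ("m4", "room-b", 1),  # a different room is unaffected
    ]
    log = []
    for row in rows:
        try:
            conn.execute("INSERT INTO meeting VALUES (?, ?, ?)", row)
            log.append(f"{row[0]}: accepted")
        except sqlite3.IntegrityError:
            log.append(f"{row[0]}: rejected")
    conn.close()
    return log


if __name__ == "__main__":
    for line in demo_one_active_meeting_per_room():
        print(line)
```

Only `m3` should be rejected, since it would be a second active meeting in `room-a`; inactive meetings never collide because they are not part of the index.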

@@ -5,25 +5,31 @@ Revises: 125031f7cb78
Create Date: 2023-12-13 15:37:51.303970
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
from alembic import op
# revision identifiers, used by Alembic.
revision: str = 'b9348748bbbc'
down_revision: Union[str, None] = '125031f7cb78'
revision: str = "b9348748bbbc"
down_revision: Union[str, None] = "125031f7cb78"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
op.add_column('transcript', sa.Column('reviewed', sa.Boolean(), server_default=sa.text('0'), nullable=False))
op.add_column(
"transcript",
sa.Column(
"reviewed", sa.Boolean(), server_default=sa.text("false"), nullable=False
),
)
# ### end Alembic commands ###
def downgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
op.drop_column('transcript', 'reviewed')
op.drop_column("transcript", "reviewed")
# ### end Alembic commands ###

@@ -9,8 +9,6 @@ Create Date: 2025-07-15 11:48:42.854741
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision: str = "ccd68dc784ff"

@@ -8,9 +8,8 @@ Create Date: 2025-06-27 09:27:25.302152
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
from alembic import op
# revision identifiers, used by Alembic.
revision: str = "d3ff3a39297f"

@@ -0,0 +1,59 @@
"""Add room_id to transcript
Revision ID: d7fbb74b673b
Revises: a9c9c229ee36
Create Date: 2025-07-17 12:00:00.000000
"""
from typing import Sequence, Union
import sqlalchemy as sa
from alembic import op
# revision identifiers, used by Alembic.
revision: str = "d7fbb74b673b"
down_revision: Union[str, None] = "a9c9c229ee36"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# Add room_id column to transcript table
op.add_column("transcript", sa.Column("room_id", sa.String(), nullable=True))
# Add index for room_id for better query performance
op.create_index("idx_transcript_room_id", "transcript", ["room_id"])
# Populate room_id for existing ROOM-type transcripts
# This joins through recording -> meeting -> room to get the room_id
op.execute("""
UPDATE transcript AS t
SET room_id = r.id
FROM recording rec
JOIN meeting m ON rec.meeting_id = m.id
JOIN room r ON m.room_id = r.id
WHERE t.recording_id = rec.id
AND t.source_kind = 'room'
AND t.room_id IS NULL
""")
# Fix missing meeting_id for ROOM-type transcripts
# The meeting_id field exists but was never populated
op.execute("""
UPDATE transcript AS t
SET meeting_id = rec.meeting_id
FROM recording rec
WHERE t.recording_id = rec.id
AND t.source_kind = 'room'
AND t.meeting_id IS NULL
AND rec.meeting_id IS NOT NULL
""")
def downgrade() -> None:
# Drop the index first
op.drop_index("idx_transcript_room_id", "transcript")
# Drop the room_id column
op.drop_column("transcript", "room_id")
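
Review note: the `room_id` backfill above walks recording -> meeting -> room. The following is a rough sketch of the same join path on an in-memory SQLite database. It deliberately swaps the migration's `UPDATE ... FROM` for a correlated subquery, which is an assumption made for portability and not what the migration runs. The table layouts, sample rows, and the names `build_demo_db` and `backfill_room_ids` are hypothetical.

```python
import sqlite3

# Correlated-subquery variant of the backfill (not the migration's own SQL).
BACKFILL_ROOM_ID = """
UPDATE transcript
SET room_id = (
    SELECT r.id
    FROM recording rec
    JOIN meeting m ON rec.meeting_id = m.id
    JOIN room r ON m.room_id = r.id
    WHERE rec.id = transcript.recording_id
)
WHERE source_kind = 'room'
  AND room_id IS NULL
  AND EXISTS (
    SELECT 1
    FROM recording rec
    JOIN meeting m ON rec.meeting_id = m.id
    JOIN room r ON m.room_id = r.id
    WHERE rec.id = transcript.recording_id
)
"""


def build_demo_db() -> sqlite3.Connection:
    """Create minimal stand-in tables and a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE room (id TEXT PRIMARY KEY);
        CREATE TABLE meeting (id TEXT PRIMARY KEY, room_id TEXT);
        CREATE TABLE recording (id TEXT PRIMARY KEY, meeting_id TEXT);
        CREATE TABLE transcript (
            id TEXT PRIMARY KEY,
            recording_id TEXT,
            source_kind TEXT,
            room_id TEXT
        );
        INSERT INTO room VALUES ('r1');
        INSERT INTO meeting VALUES ('m1', 'r1');
        INSERT INTO recording VALUES ('rec1', 'm1');
        INSERT INTO transcript VALUES ('t1', 'rec1', 'room', NULL);
        INSERT INTO transcript VALUES ('t2', 'rec1', 'file', NULL);
        INSERT INTO transcript VALUES ('t3', NULL, 'room', NULL);
        """
    )
    return conn


def backfill_room_ids(conn: sqlite3.Connection) -> int:
    """Run the backfill and return the number of transcripts updated."""
    cursor = conn.execute(BACKFILL_ROOM_ID)
    conn.commit()
    return cursor.rowcount


if __name__ == "__main__":
    db = build_demo_db()
    print("updated:", backfill_room_ids(db))
    print(db.execute("SELECT id, room_id FROM transcript ORDER BY id").fetchall())
```

In this sample only `t1` qualifies: `t2` is not a room transcript and `t3` has no recording to join through, so both keep a NULL `room_id`.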

@@ -0,0 +1,34 @@
"""make meeting room_id nullable but keep foreign key
Revision ID: def1b5867d4c
Revises: 0ce521cda2ee
Create Date: 2025-09-11 09:42:18.697264
"""
from typing import Sequence, Union
import sqlalchemy as sa
from alembic import op
# revision identifiers, used by Alembic.
revision: str = "def1b5867d4c"
down_revision: Union[str, None] = "0ce521cda2ee"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
with op.batch_alter_table("meeting", schema=None) as batch_op:
batch_op.alter_column("room_id", existing_type=sa.VARCHAR(), nullable=True)
# ### end Alembic commands ###
def downgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
with op.batch_alter_table("meeting", schema=None) as batch_op:
batch_op.alter_column("room_id", existing_type=sa.VARCHAR(), nullable=False)
# ### end Alembic commands ###

@@ -5,11 +5,11 @@ Revises: 4814901632bc
Create Date: 2023-11-16 10:29:09.351664
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
from alembic import op
# revision identifiers, used by Alembic.
revision: str = "f819277e5169"

File diff suppressed because one or more lines are too long

File diff suppressed because it is too large

4607
server/poetry.lock generated

File diff suppressed because it is too large

@@ -1,84 +1,138 @@
[tool.poetry]
name = "reflector-server"
[project]
name = "reflector"
version = "0.1.0"
description = ""
authors = ["Monadical team <ops@monadical.com>"]
authors = [{ name = "Monadical team", email = "ops@monadical.com" }]
requires-python = ">=3.11, <3.13"
readme = "README.md"
packages = []
dependencies = [
"aiohttp>=3.9.0",
"aiohttp-cors>=0.7.0",
"av>=10.0.0",
"requests>=2.31.0",
"aiortc>=1.5.0",
"sortedcontainers>=2.4.0",
"loguru>=0.7.0",
"pydantic-settings>=2.0.2",
"structlog>=23.1.0",
"uvicorn[standard]>=0.23.1",
"fastapi[standard]>=0.100.1",
"sentry-sdk[fastapi]>=1.29.2",
"httpx>=0.24.1",
"fastapi-pagination>=0.12.6",
"databases[aiosqlite, asyncpg]>=0.7.0",
"sqlalchemy<1.5",
"alembic>=1.11.3",
"nltk>=3.8.1",
"prometheus-fastapi-instrumentator>=6.1.0",
"sentencepiece>=0.1.99",
"protobuf>=4.24.3",
"profanityfilter>=2.0.6",
"celery>=5.3.4",
"redis>=5.0.1",
"python-jose[cryptography]>=3.3.0",
"python-multipart>=0.0.6",
"transformers>=4.36.2",
"jsonschema>=4.23.0",
"openai>=1.59.7",
"psycopg2-binary>=2.9.10",
"llama-index>=0.12.52",
"llama-index-llms-openai-like>=0.4.0",
"pytest-env>=1.1.5",
"webvtt-py>=0.5.0",
"PyJWT>=2.8.0",
]
[tool.poetry.dependencies]
python = "^3.11"
aiohttp = "^3.9.0"
aiohttp-cors = "^0.7.0"
av = "^10.0.0"
requests = "^2.31.0"
aiortc = "^1.5.0"
sortedcontainers = "^2.4.0"
loguru = "^0.7.0"
pydantic-settings = "^2.0.2"
structlog = "^23.1.0"
uvicorn = {extras = ["standard"], version = "^0.23.1"}
fastapi = "^0.100.1"
sentry-sdk = {extras = ["fastapi"], version = "^1.29.2"}
httpx = "^0.24.1"
fastapi-pagination = "^0.12.6"
databases = {extras = ["aiosqlite", "asyncpg"], version = "^0.7.0"}
sqlalchemy = "<1.5"
fief-client = {extras = ["fastapi"], version = "^0.17.0"}
alembic = "^1.11.3"
nltk = "^3.8.1"
prometheus-fastapi-instrumentator = "^6.1.0"
sentencepiece = "^0.1.99"
protobuf = "^4.24.3"
profanityfilter = "^2.0.6"
celery = "^5.3.4"
redis = "^5.0.1"
python-jose = {extras = ["cryptography"], version = "^3.3.0"}
python-multipart = "^0.0.6"
faster-whisper = "^0.10.0"
transformers = "^4.36.2"
black = "24.1.1"
jsonschema = "^4.23.0"
openai = "^1.59.7"
[dependency-groups]
dev = [
"black>=24.1.1",
"stamina>=23.1.0",
"pyinstrument>=4.6.1",
]
tests = [
"pytest-cov>=4.1.0",
"pytest-aiohttp>=1.0.4",
"pytest-asyncio>=0.21.1",
"pytest>=7.4.0",
"httpx-ws>=0.4.1",
"pytest-httpx>=0.23.1",
"pytest-celery>=0.0.0",
"pytest-recording>=0.13.4",
"pytest-docker>=3.2.3",
"asgi-lifespan>=2.1.0",
]
aws = ["aioboto3>=11.2.0"]
evaluation = [
"jiwer>=3.0.2",
"levenshtein>=0.21.1",
"tqdm>=4.66.0",
"pydantic>=2.1.1",
]
local = [
"pyannote-audio>=3.3.2",
"faster-whisper>=0.10.0",
]
silero-vad = [
"silero-vad>=5.1.2",
"torch>=2.8.0",
"torchaudio>=2.8.0",
]
[tool.uv]
default-groups = [
"dev",
"tests",
"aws",
"evaluation",
"local",
"silero-vad"
]
[tool.poetry.group.dev.dependencies]
black = "^24.1.1"
stamina = "^23.1.0"
pyinstrument = "^4.6.1"
[[tool.uv.index]]
name = "pytorch-cpu"
url = "https://download.pytorch.org/whl/cpu"
explicit = true
[tool.poetry.group.tests.dependencies]
pytest-cov = "^4.1.0"
pytest-aiohttp = "^1.0.4"
pytest-asyncio = "^0.21.1"
pytest = "^7.4.0"
httpx-ws = "^0.4.1"
pytest-httpx = "^0.23.1"
pytest-celery = "^0.0.0"
[tool.poetry.group.aws.dependencies]
aioboto3 = "^11.2.0"
[tool.poetry.group.evaluation.dependencies]
jiwer = "^3.0.2"
levenshtein = "^0.21.1"
tqdm = "^4.66.0"
pydantic = "^2.1.1"
[tool.uv.sources]
torch = [
{ index = "pytorch-cpu" },
]
torchaudio = [
{ index = "pytorch-cpu" },
]
[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
requires = ["hatchling"]
build-backend = "hatchling.build"
[tool.hatch.build.targets.wheel]
packages = ["reflector"]
[tool.coverage.run]
source = ["reflector"]
[tool.pytest_env]
ENVIRONMENT = "pytest"
DATABASE_URL = "postgresql://test_user:test_password@localhost:15432/reflector_test"
[tool.pytest.ini_options]
addopts = "-ra -q --disable-pytest-warnings --cov --cov-report html -v"
testpaths = ["tests"]
asyncio_mode = "auto"
markers = [
"gpu_modal: mark test to run only with GPU Modal endpoints (deselect with '-m \"not gpu_modal\"')",
]
[tool.ruff.lint]
select = [
"I", # isort - import sorting
"F401", # unused imports
"PLC0415", # import-outside-top-level - detect inline imports
]
[tool.ruff.lint.per-file-ignores]
"reflector/processors/summary/summary_builder.py" = ["E501"]
"gpu/**.py" = ["PLC0415"]
"reflector/tools/**.py" = ["PLC0415"]
"migrations/versions/**.py" = ["PLC0415"]
"tests/**.py" = ["PLC0415"]

@@ -1,34 +0,0 @@
import os
import subprocess
import sys
from loguru import logger
# Get the input file name from the command line argument
input_file = sys.argv[1]
# example use: python 0-reflector-local.py input.m4a agenda.txt
# Get the agenda file name from the command line argument if provided
if len(sys.argv) > 2:
agenda_file = sys.argv[2]
else:
agenda_file = "agenda.txt"
# example use: python 0-reflector-local.py input.m4a my_agenda.txt
# Check if the agenda file exists
if not os.path.exists(agenda_file):
logger.error("agenda_file is missing")
# Check if the input file is .m4a, if so convert to .mp4
if input_file.endswith(".m4a"):
subprocess.run(["ffmpeg", "-i", input_file, f"{input_file}.mp4"])
input_file = f"{input_file}.mp4"
# Run the first script to generate the transcript
subprocess.run(["python3", "1-transcript-generator.py", input_file, f"{input_file}_transcript.txt"])
# Run the second script to compare the transcript to the agenda
subprocess.run(["python3", "2-agenda-transcript-diff.py", agenda_file, f"{input_file}_transcript.txt"])
# Run the third script to summarize the transcript
subprocess.run(["python3", "3-transcript-summarizer.py", f"{input_file}_transcript.txt", f"{input_file}_summary.txt"])

@@ -1,62 +0,0 @@
import argparse
import os
import moviepy.editor
import whisper
from loguru import logger
WHISPER_MODEL_SIZE = "base"
def init_argparse() -> argparse.ArgumentParser:
parser = argparse.ArgumentParser(
usage="%(prog)s <LOCATION> <OUTPUT>",
description="Creates a transcript of a video or audio file using the OpenAI Whisper model"
)
parser.add_argument("location", help="Location of the media file")
parser.add_argument("output", help="Output file path")
return parser
def main():
import sys
sys.setrecursionlimit(10000)
parser = init_argparse()
args = parser.parse_args()
media_file = args.location
logger.info(f"Processing file: {media_file}")
# Check if the media file is a valid audio or video file
if os.path.isfile(media_file) and not media_file.endswith(
('.mp3', '.wav', '.ogg', '.flac', '.mp4', '.avi', '.flv')):
logger.error(f"Invalid file format: {media_file}")
return
# If the media file we just retrieved is an audio file then skip extraction step
audio_filename = media_file
logger.info(f"Found audio-only file, skipping audio extraction")
audio = moviepy.editor.AudioFileClip(audio_filename)
logger.info("Selected extracted audio")
# Transcribe the audio file using the OpenAI Whisper model
logger.info("Loading Whisper speech-to-text model")
whisper_model = whisper.load_model(WHISPER_MODEL_SIZE)
logger.info(f"Transcribing file: {media_file}")
whisper_result = whisper_model.transcribe(media_file)
logger.info("Finished transcribing file")
# Save the transcript to the specified file.
logger.info(f"Saving transcript to: {args.output}")
transcript_file = open(args.output, "w")
transcript_file.write(whisper_result["text"])
transcript_file.close()
if __name__ == "__main__":
main()

@@ -1,68 +0,0 @@
import argparse
import spacy
from loguru import logger
# Define the paths for agenda and transcription files
def init_argparse() -> argparse.ArgumentParser:
parser = argparse.ArgumentParser(
usage="%(prog)s <AGENDA> <TRANSCRIPTION>",
description="Compares the transcript of a video or audio file to an agenda using the SpaCy model"
)
parser.add_argument("agenda", help="Location of the agenda file")
parser.add_argument("transcription", help="Location of the transcription file")
return parser
args = init_argparse().parse_args()
agenda_path = args.agenda
transcription_path = args.transcription
# Load the spaCy model and add the sentencizer
spaCy_model = "en_core_web_md"
nlp = spacy.load(spaCy_model)
nlp.add_pipe('sentencizer')
logger.info("Loaded spaCy model " + spaCy_model)
# Load the agenda
with open(agenda_path, "r") as f:
agenda = [line.strip() for line in f.readlines() if line.strip()]
logger.info("Loaded agenda items")
# Load the transcription
with open(transcription_path, "r") as f:
transcription = f.read()
logger.info("Loaded transcription")
# Tokenize the transcription using spaCy
doc_transcription = nlp(transcription)
logger.info("Tokenized transcription")
# Find the items covered in the transcription
covered_items = {}
for item in agenda:
item_doc = nlp(item)
for sent in doc_transcription.sents:
if not sent or not all(token.has_vector for token in sent):
# Skip an empty span or one containing a token without a word vector
continue
similarity = sent.similarity(item_doc)
similarity_threshold = 0.7
if similarity > similarity_threshold: # Set the threshold to determine what is considered a match
covered_items[item] = True
break
# Count the number of items covered and calculate the percentage
num_covered_items = sum(covered_items.values())
percentage_covered = num_covered_items / len(agenda) * 100
# Print the results
print("💬 Agenda items covered in the transcription:")
for item in agenda:
if item in covered_items and covered_items[item]:
print("", item)
else:
print("", item)
print("📊 Coverage: {:.2f}%".format(percentage_covered))
logger.info("Finished comparing agenda to transcription with similarity threshold of " + str(similarity_threshold))

@@ -1,94 +0,0 @@
import argparse
import nltk
nltk.download('stopwords')
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize, sent_tokenize
from heapq import nlargest
from loguru import logger
# Function to initialize the argument parser
def init_argparse():
parser = argparse.ArgumentParser(
usage="%(prog)s <TRANSCRIPT> <SUMMARY>",
description="Summarization"
)
parser.add_argument("transcript", type=str, default="transcript.txt", help="Path to the input transcript file")
parser.add_argument("summary", type=str, default="summary.txt", help="Path to the output summary file")
parser.add_argument("--num_sentences", type=int, default=5, help="Number of sentences to include in the summary")
return parser
# Function to read the input transcript file
def read_transcript(file_path):
with open(file_path, "r") as file:
transcript = file.read()
return transcript
# Function to preprocess the text by removing stop words and special characters
def preprocess_text(text):
stop_words = set(stopwords.words('english'))
words = word_tokenize(text)
words = [w.lower() for w in words if w.isalpha() and w.lower() not in stop_words]
return words
# Function to score each sentence based on the frequency of its words and return the top sentences
def summarize_text(text, num_sentences):
# Tokenize the text into sentences
sentences = sent_tokenize(text)
# Preprocess the text by removing stop words and special characters
words = preprocess_text(text)
# Calculate the frequency of each word in the text
word_freq = nltk.FreqDist(words)
# Calculate the score for each sentence based on the frequency of its words
sentence_scores = {}
for i, sentence in enumerate(sentences):
sentence_words = preprocess_text(sentence)
for word in sentence_words:
if word in word_freq:
if i not in sentence_scores:
sentence_scores[i] = word_freq[word]
else:
sentence_scores[i] += word_freq[word]
# Select the top sentences based on their scores
top_sentences = nlargest(num_sentences, sentence_scores, key=sentence_scores.get)
# Sort the top sentences in the order they appeared in the original text
summary_sent = sorted(top_sentences)
summary = [sentences[i] for i in summary_sent]
return " ".join(summary)
def main():
# Initialize the argument parser and parse the arguments
parser = init_argparse()
args = parser.parse_args()
# Read the input transcript file
logger.info(f"Reading transcript from: {args.transcript}")
transcript = read_transcript(args.transcript)
# Summarize the transcript using the nltk library
logger.info("Summarizing transcript")
summary = summarize_text(transcript, args.num_sentences)
# Write the summary to the output file
logger.info(f"Writing summary to: {args.summary}")
with open(args.summary, "w") as f:
f.write("Summary of: " + args.transcript + "\n\n")
f.write(summary)
logger.info("Summarization completed")
if __name__ == "__main__":
main()
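
Review note: the removed summarizer scores each sentence by the summed frequency of its non-stop-words, keeps the top N, and emits them in original order. The following is a dependency-free sketch of that idea, assuming regex-based tokenization and a tiny hand-written stop-word list in place of NLTK's tokenizers and corpus. `STOP_WORDS`, `tokenize`, and `summarize` are hypothetical names, and its output will differ from the NLTK version on real transcripts.

```python
import re
from collections import Counter
from heapq import nlargest

# Tiny illustrative stop-word list (assumption; NLTK's list is much larger).
STOP_WORDS = frozenset(
    {"a", "an", "and", "in", "is", "it", "of", "that", "the", "to"}
)


def tokenize(text: str) -> list[str]:
    """Lowercase alphabetic words with stop words removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def summarize(text: str, num_sentences: int) -> str:
    """Return the highest-scoring sentences in their original order."""
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    word_freq = Counter(tokenize(text))

    # Score = sum of document-wide frequencies of the sentence's words.
    scores = {}
    for index, sentence in enumerate(sentences):
        score = sum(word_freq[word] for word in tokenize(sentence))
        if score:
            scores[index] = score

    top = nlargest(num_sentences, scores, key=scores.get)
    return " ".join(sentences[i] for i in sorted(top))


if __name__ == "__main__":
    print(summarize("Cats sleep. Cats and dogs play. Birds fly.", 2))
```

Because scores are plain sums, longer sentences tend to win; normalizing by sentence length would be one way to offset that, though the removed script did not do so.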

@@ -1,4 +0,0 @@
# Deloitte HR @ NYS Cybersecurity Conference
- ways to retain and grow your workforce
- how to enable cybersecurity professionals to do their best work
- low-budget activities that can be implemented starting tomorrow

Some files were not shown because too many files have changed in this diff