reflector

mirror of https://github.com/Monadical-SAS/reflector.git synced 2026-02-04 09:56:47 +00:00

Author	SHA1	Message	Date
Mathieu Virbel	9dfd76996f	fix: file pipeline status reporting and websocket updates (#589 ) * feat: use file pipeline for upload and reprocess action * fix: make file pipeline correctly report status events * fix: duplication of transcripts_controller * fix: tests * test: fix file upload test * test: fix reprocess * fix: also patch from main_file_pipeline (how patch is done is dependent of file import unfortunately)	2025-08-29 00:58:14 -06:00
Mathieu Virbel	55cc8637c6	ci: restrict workflow execution to main branch and add concurrency (#586 ) * ci: try adding concurrency * ci: restrict push on main branch * ci: fix concurrency key * ci: fix build concurrency * refactor: apply suggestion from @pr-agent-monadical[bot] Co-authored-by: pr-agent-monadical[bot] <198624643+pr-agent-monadical[bot]@users.noreply.github.com> --------- Co-authored-by: pr-agent-monadical[bot] <198624643+pr-agent-monadical[bot]@users.noreply.github.com>	2025-08-28 16:43:17 -06:00
Mathieu Virbel	f5331a2107	style: more type annotations to parakeet transcriber (#581 ) * feat: add comprehensive type annotations to Parakeet transcriber - Add TypedDict for WordTiming with word, start, end fields - Add NamedTuple for TimeSegment, AudioSegment, and TranscriptResult - Add type hints to all generator functions (vad_segment_generator, batch_speech_segments, etc.) - Add enforce_word_timing_constraints function to prevent word timing overlaps - Refactor batch_segment_to_audio_segment to reuse pad_audio function * doc: add note about space	2025-08-28 12:22:07 -06:00
Igor Loskutov	124ce03bf8	fix: Igor/evaluation (#575 ) * fix: impossible import error (#563) * evaluation cli - database events experiment * hallucinations * evaluation - unhallucinate * evaluation - unhallucinate * roll back reliability link * self reviewio * lint * self review * add file pipeline to cli * add file pipeline to cli + sorting * remove cli tests * remove ai comments * comments	2025-08-28 12:07:34 -04:00
Mathieu Virbel	7030e0f236	fix: optimize parakeet transcription batching algorithm (#577 ) * refactor: optimize transcription batching to accumulate speech segments - Changed VAD segment generator to return full audio array instead of segments - Removed segment filtering step - Modified batch_segments to accumulate maximum speech including silence - Transcribe larger continuous chunks instead of individual speech segments * fix: correct transcribe_batch call to use list and fix batch unpacking * fix: simplify * fix: remove unused variables * fix: add typing	2025-08-27 10:32:04 -06:00
Mathieu Virbel	37f0110892	doc: update local model readme	2025-08-22 17:50:24 -06:00
Mathieu Virbel	cf2896a7f4	doc: update readme about installation instructions Add a note about installation instructions being inaccurate.	2025-08-22 17:48:35 -06:00
Mathieu Virbel	aabf2c2572	chore(main): release 0.7.3 (#565 ) v0.7.3	2025-08-22 16:35:52 -06:00
Mathieu Virbel	6a7b08f016	doc: change readme intro	2025-08-22 16:26:25 -06:00
Mathieu Virbel	e2736563d9	doc: update readme with new images	2025-08-22 16:15:54 -06:00
Mathieu Virbel	0f54b7782d	chore: ignore www/.env.[development,production]	2025-08-22 14:41:09 -06:00
Mathieu Virbel	359280dd34	fix: cleaned repo, and get git-leaks clean	2025-08-22 11:51:34 -06:00
Mathieu Virbel	9265d201b5	fix: restore previous behavior on live pipeline + audio downscaler (#561 ) This commit restores the original behavior with frame cutting. While silero is used on our gpu for files, it looks like it's not working great on the live pipeline. To be investigated, but at the moment, what we keep is: - refactored to extract the downscale for further processing in the pipeline - remove any downscale implementation from audio_chunker and audio_merge - removed batching from audio_merge too for now	2025-08-22 10:49:26 -06:00
Mathieu Virbel	52f9f533d7	chore(main): release 0.7.2 (#559 ) v0.7.2	2025-08-21 21:00:05 -06:00
Mathieu Virbel	0c3878ac3c	fix: docker image not loading libgomp.so.1 for torch (#560 ) On ARM64, the docker image crashes because torch cannot load libgomp.so.1 -- It looks like pytorch does not install the same packages depending on the platform. AMD64: /app/.venv/lib/python3.12/site-packages/torch/lib/libgomp.so.1 /app/.venv/lib/python3.12/site-packages/ctranslate2.libs/libgomp-a34b3233.so.1.0.0 /app/.venv/lib/python3.12/site-packages/scikit_learn.libs/libgomp-a34b3233.so.1.0.0 ARM64: /app/.venv/lib/python3.12/site-packages/ctranslate2.libs/libgomp-d22c30c5.so.1.0.0 /app/.venv/lib/python3.12/site-packages/scikit_learn.libs/libgomp-947d5fa1.so.1.0.0 /app/.venv/lib/python3.12/site-packages/torch.libs/libgomp-947d5fa1.so.1.0.0	2025-08-21 16:41:35 -06:00
Igor Loskutov	d70beee51b	fix: include shared rooms to search (#558 ) * include shared rooms to search * tests vibe * tests vibe * tests vibe * tests vibe * tests vibe * tests vibe * tests vibe * remove tests, thats too much	2025-08-21 14:52:29 -04:00
Mathieu Virbel	bc5b351d2b	chore(main): release 0.7.1 (#557 ) v0.7.1	2025-08-20 23:23:27 -06:00
Igor Loskutov	07981e8090	fix: webvtt db null expectation mismatch (#556 )	2025-08-20 23:22:41 -06:00
Mathieu Virbel	7e366f6338	chore(main): release 0.7.0 (#541 ) v0.7.0	2025-08-20 22:24:36 -06:00
Mathieu Virbel	7592679a35	build: separate silero-vad and force torch to be resolved without nvidia (#555 ) * build: separate silero-vad and force torch to be resolved without nvidia * build: also add torchaudio as cpu version	2025-08-20 22:23:48 -06:00
Mathieu Virbel	af16178f86	ci: use github-token to get around potential api throttling + rework dockerfile (#554 ) * ci: use github-token to get around potential api throttling * build: put pyannote-audio separate to the project * fix: now that we have a readme, use it * build: add UV_NO_CACHE	2025-08-20 21:59:29 -06:00
Mathieu Virbel	3ea7f6b7b6	feat: pipeline improvement with file processing, parakeet, silero-vad (#540 ) * feat: improve pipeline threading, and transcriber (parakeet and silero vad) * refactor: remove whisperx, implement parakeet * refactor: make audio_chunker more smart and wait for speech, instead of fixed frame * refactor: make audio merge to always downscale the audio to 16k for transcription * refactor: make the audio transcript modal accepting batches * refactor: improve type safety and remove prometheus metrics - Add DiarizationSegment TypedDict for proper diarization typing - Replace List/Optional with modern Python list/\| None syntax - Remove all Prometheus metrics from TranscriptDiarizationAssemblerProcessor - Add comprehensive file processing pipeline with parallel execution - Update processor imports and type annotations throughout - Implement optimized file pipeline as default in process.py tool * refactor: convert FileDiarizationProcessor I/O types to BaseModel Update FileDiarizationInput and FileDiarizationOutput to inherit from BaseModel instead of plain classes, following the standard pattern used by other processors in the codebase. 
* test: add tests for file transcript and diarization with pytest-recording * build: add pytest-recording * feat: add local pyannote for testing * fix: replace PyAV AudioResampler with torchaudio for reliable audio processing - Replace problematic PyAV AudioResampler that was causing ValueError: [Errno 22] Invalid argument - Use torchaudio.functional.resample for robust sample rate conversion - Optimize processing: skip conversion for already 16kHz mono audio - Add direct WAV writing with Python wave module for better performance - Consolidate duplicate downsample checks for cleaner code - Maintain list[av.AudioFrame] input interface - Required for Silero VAD which needs 16kHz mono audio * fix: replace PyAV AudioResampler with torchaudio solution - Resolves ValueError: [Errno 22] Invalid argument in AudioMergeProcessor - Replaces problematic PyAV AudioResampler with torchaudio.functional.resample - Optimizes processing to skip unnecessary conversions when audio is already 16kHz mono - Uses direct WAV writing with Python's wave module for better performance - Fixes test_basic_process to disable diarization (pyannote dependency not installed) - Updates test expectations to match actual processor behavior - Removes unused pydub dependency from pyproject.toml - Adds comprehensive TEST_ANALYSIS.md documenting test suite status * feat: add parameterized test for both diarization modes - Adds @pytest.mark.parametrize to test_basic_process with enable_diarization=[False, True] - Test with diarization=False always passes (tests core AudioMergeProcessor functionality) - Test with diarization=True gracefully skips when pyannote.audio is not installed - Provides comprehensive test coverage for both pipeline configurations * fix: resolve pipeline property naming conflict in AudioDiarizationPyannoteProcessor - Renames 'pipeline' property to 'diarization_pipeline' to avoid conflict with base Processor.pipeline attribute - Fixes AttributeError: 'property 'pipeline' object has no 
setter' when set_pipeline() is called - Updates property usage in _diarize method to use new name - Now correctly supports pipeline initialization for diarization processing * fix: add local for pyannote * test: add diarization test * fix: resample on audio merge now working * fix: correctly restore timestamp * fix: display exception in a threaded processor if that happen * Update pyproject.toml * ci: remove option * ci: update astral-sh/setup-uv * test: add monadical url for pytest-recording * refactor: remove previous version * build: move faster whisper to local dep * test: fix missing import * refactor: improve main_file_pipeline organization and error handling - Move all imports to the top of the file - Create unified EmptyPipeline class to replace duplicate mock pipeline code - Remove timeout and fallback logic - let processors handle their own retries - Fix error handling to raise any exception from parallel tasks - Add proper type hints and validation for captured results * fix: wrong function * fix: remove task_done * feat: add configurable file processing timeouts for modal processors - Add TRANSCRIPT_FILE_TIMEOUT setting (default: 600s) for file transcription - Add DIARIZATION_FILE_TIMEOUT setting (default: 600s) for file diarization - Replace hardcoded timeout=600 with configurable settings in modal processors - Allows customization of timeout values via environment variables * fix: use logger * fix: worker process meetings now use file pipeline * fix: topic not gathered * refactor: remove prepare(), pipeline now work * refactor: implement many review from Igor * test: add test for test_pipeline_main_file * refactor: remove doc * doc: add doc * ci: update build to use native arm64 builder * fix: merge fixes * refactor: changes from Igor review + add test (not by default) to test gpu modal part * ci: update to our own runner linux-amd64 * ci: try using suggested mode=min * fix: update diarizer for latest modal, and use volume * fix: modal file extension 
detection * fix: put the diarizer as A100	2025-08-20 20:07:19 -06:00
Igor Loskutov	009590c080	feat: search frontend (#551 ) * feat: better highlight * feat(search): add long_summary to search vector for improved search results - Update search vector to include long_summary with weight B (between title A and webvtt C) - Modify SearchController to fetch long_summary and prioritize its snippets - Generate snippets from long_summary first (max 2), then from webvtt for remaining slots - Add comprehensive tests for long_summary search functionality - Create migration to update search_vector_en column in PostgreSQL This improves search quality by including summarized content which often contains key topics and themes that may not be explicitly mentioned in the transcript. * fix: address code review feedback for search enhancements - Fix test file inconsistencies by removing references to non-existent model fields - Comment out tests for unimplemented features (room_ids, status filters, date ranges) - Update tests to only use currently available fields (room_id singular, no room_name/processing_status) - Mark future functionality tests with @pytest.mark.skip - Make snippet counts configurable - Add LONG_SUMMARY_MAX_SNIPPETS constant (default: 2) - Replace hardcoded value with configurable constant - Improve error handling consistency in WebVTT parsing - Use different log levels for different error types (debug for malformed, warning for decode, error for unexpected) - Add catch-all exception handler for unexpected errors - Include stack trace for critical errors All existing tests pass with these changes. 
* fix: correct datetime test to include required duration field * feat: better highlight * feat: search room names * feat: acknowledge deleted room * feat: search filters fix and rank removal * chore: minor refactoring * feat: better matches frontend * chore: self-review (vibe) * chore: self-review WIP * chore: self-review WIP * chore: self-review WIP * chore: self-review WIP * chore: self-review WIP * chore: self-review WIP * chore: self-review WIP * remove swc (vibe) * search url query sync (vibe) * search url query sync (vibe) * better casts and cap while * PR review + simplify frontend hook * pr: remove search db timeouts * cleanup tests * tests cleanup * frontend cleanup * index declarations * refactor frontend (self-review) * fix search pagination * clear "x" for search input * pagination max pages fix * chore: cleanup * cleanup * cleanup * cleanup * cleanup * cleanup * cleanup * cleanup * lockfile * pr review	2025-08-20 20:56:45 -04:00
Igor Loskutov	fe5d344cff	diarization cli: throw on modal errors (#553 )	2025-08-20 10:21:52 -04:00
Igor Loskutov	86455ce573	chore: type fixes (#544 ) * chore: type fixes * chore: type fixes	2025-08-18 16:31:23 -04:00
Mathieu Virbel	2fccd81bcd	fix: use structlog not logging (#550 )	2025-08-15 15:41:23 -06:00
Mathieu Virbel	1311714451	ci: add pre-commit hook and fix linting issues (#545 ) * style: deactivate PLC0415 only on part that it's ok + re-run pre-commit run --all * ci: add pre-commit hook * build: move from yarn to pnpm * build: move from yarn to pnpm * build: fix node-version * ci: install pnpm prior node (?) * build: update deps and pnpm trying to fix vercel build * feat: docker www corepack * style: pre-commit --------- Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>	2025-08-14 20:59:54 -06:00
Sergey Mankovsky	b9d891d342	feat: delete recording with transcript (#547 ) * Delete recording with transcript * Delete confirmation dialog * Use aws storage abstraction for recording deletion * Test recording deleted with transcript * Use get transcript storage * Fix the test * Add env vars for recording storage	2025-08-14 20:45:30 +02:00
Mathieu Virbel	9eab952c63	feat: postgresql migration and removal of sqlite in pytest (#546 ) * feat: remove support of sqlite, 100% postgres * fix: more migration and make datetime timezone aware in postgres * fix: change how database is get, and use contextvar to have difference instance between different loops * test: properly use client fixture that handle lifetime/database connection * fix: add missing client fixture parameters to test functions This commit fixes NameError issues where test functions were trying to use the 'client' fixture but didn't have it as a parameter. The changes include: 1. Added 'client' parameter to test functions in: - test_transcripts_audio_download.py (6 functions including fixture) - test_transcripts_speaker.py (3 functions) - test_transcripts_upload.py (1 function) - test_transcripts_rtc_ws.py (2 functions + appserver fixture) 2. Resolved naming conflicts in test_transcripts_rtc_ws.py where both HTTP client and StreamClient were using variable name 'client'. StreamClient instances are now named 'stream_client' to avoid conflicts. 3. Added missing 'from reflector.app import app' import in rtc_ws tests. Background: Previously implemented contextvars solution with get_database() function resolves asyncio event loop conflicts in Celery tasks. The global client fixture was also created to replace manual AsyncClient instances, ensuring proper FastAPI application lifecycle management and database connections during tests. All tests now pass except for 2 pre-existing RTC WebSocket test failures related to asyncpg connection issues unrelated to these fixes. * fix: ensure task are correctly closed * fix: make separate event loop for the live server * fix: make default settings pointing at postgres * build: remove pytest-docker deps out of dev, just tests group	2025-08-14 11:40:52 -06:00
Igor Loskutov	6fb5cb21c2	feat: search backend (#537 ) * docs: transient docs * chore: cleanup * webvtt WIP * webvtt field * chore: webvtt tests comments * chore: remove useless tests * feat: search TASK.md * feat: full text search by title/webvtt * chore: search api task * feat: search api * feat: search API * chore: rm task md * chore: roll back unnecessary validators * chore: pr review WIP * chore: pr review WIP * chore: pr review * chore: top imports * feat: better lint + ci * feat: better lint + ci * feat: better lint + ci * feat: better lint + ci * chore: lint * chore: lint * fix: db datetime definitions * fix: flush() params * fix: update transcript mutability expectation / test * fix: update transcript mutability expectation / test * chore: auto review * chore: new controller extraction * chore: new controller extraction * chore: cleanup * chore: review WIP * chore: pr WIP * chore: remove ci lint * chore: openapi regeneration * chore: openapi regeneration * chore: postgres test doc * fix: .dockerignore for arm binaries * fix: .dockerignore for arm binaries * fix: cap test loops * fix: cap test loops * fix: cap test loops * fix: get_transcript_topics * chore: remove flow.md docs and claude guidance * chore: remove claude.md db doc * chore: remove claude.md db doc * chore: remove claude.md db doc * chore: remove claude.md db doc	2025-08-13 10:03:38 -04:00
Igor Loskutov	a42ed12982	fix: evaluation cli event wrap (#536 ) * fix: evaluation cli event wrap * fix: evaluation cli event wrap * chore: remove unrelated change * chore: rollback claude.md changes	2025-08-11 19:28:52 -04:00
Mathieu Virbel	1aa52a99b6	chore(main): release 0.6.1 (#539 ) v0.6.1	2025-08-06 19:38:43 -06:00
dependabot[bot]	2a97290f2e	build(deps): bump the npm_and_yarn group across 1 directory with 7 updates (#535 ) Bumps the npm_and_yarn group with 6 updates in the /www directory: \| Package \| From \| To \| \| --- \| --- \| --- \| \| [axios](https://github.com/axios/axios) \| `1.6.2` \| `1.8.2` \| \| [postcss](https://github.com/postcss/postcss) \| `8.4.25` \| `8.4.31` \| \| [braces](https://github.com/micromatch/braces) \| `3.0.2` \| `3.0.3` \| \| [cross-spawn](https://github.com/moxystudio/node-cross-spawn) \| `7.0.3` \| `7.0.6` \| \| [micromatch](https://github.com/micromatch/micromatch) \| `4.0.5` \| `4.0.8` \| \| [nanoid](https://github.com/ai/nanoid) \| `3.3.6` \| `3.3.11` \| Updates `axios` from 1.6.2 to 1.8.2 - [Release notes](https://github.com/axios/axios/releases) - [Changelog](https://github.com/axios/axios/blob/v1.x/CHANGELOG.md) - [Commits](https://github.com/axios/axios/compare/v1.6.2...v1.8.2) Updates `postcss` from 8.4.25 to 8.4.31 - [Release notes](https://github.com/postcss/postcss/releases) - [Changelog](https://github.com/postcss/postcss/blob/main/CHANGELOG.md) - [Commits](https://github.com/postcss/postcss/compare/8.4.25...8.4.31) Updates `braces` from 3.0.2 to 3.0.3 - [Changelog](https://github.com/micromatch/braces/blob/master/CHANGELOG.md) - [Commits](https://github.com/micromatch/braces/compare/3.0.2...3.0.3) Updates `cross-spawn` from 7.0.3 to 7.0.6 - [Changelog](https://github.com/moxystudio/node-cross-spawn/blob/master/CHANGELOG.md) - [Commits](https://github.com/moxystudio/node-cross-spawn/compare/v7.0.3...v7.0.6) Updates `follow-redirects` from 1.15.2 to 1.15.6 - [Release notes](https://github.com/follow-redirects/follow-redirects/releases) - [Commits](https://github.com/follow-redirects/follow-redirects/compare/v1.15.2...v1.15.6) Updates `micromatch` from 4.0.5 to 4.0.8 - [Release notes](https://github.com/micromatch/micromatch/releases) - [Changelog](https://github.com/micromatch/micromatch/blob/master/CHANGELOG.md) - 
[Commits](https://github.com/micromatch/micromatch/compare/4.0.5...4.0.8) Updates `nanoid` from 3.3.6 to 3.3.11 - [Release notes](https://github.com/ai/nanoid/releases) - [Changelog](https://github.com/ai/nanoid/blob/main/CHANGELOG.md) - [Commits](https://github.com/ai/nanoid/compare/3.3.6...3.3.11) --- updated-dependencies: - dependency-name: axios dependency-version: 1.8.2 dependency-type: direct:production dependency-group: npm_and_yarn - dependency-name: postcss dependency-version: 8.4.31 dependency-type: direct:production dependency-group: npm_and_yarn - dependency-name: braces dependency-version: 3.0.3 dependency-type: indirect dependency-group: npm_and_yarn - dependency-name: cross-spawn dependency-version: 7.0.6 dependency-type: indirect dependency-group: npm_and_yarn - dependency-name: follow-redirects dependency-version: 1.15.6 dependency-type: indirect dependency-group: npm_and_yarn - dependency-name: micromatch dependency-version: 4.0.8 dependency-type: indirect dependency-group: npm_and_yarn - dependency-name: nanoid dependency-version: 3.3.11 dependency-type: indirect dependency-group: npm_and_yarn ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-08-06 10:23:48 -06:00
Mathieu Virbel	7963cc8a52	fix: delayed waveform loading (#538 )	2025-08-06 10:22:51 -06:00
Mathieu Virbel	d12424848d	chore: remove black (#534 )	2025-08-05 12:07:53 -06:00
dependabot[bot]	6e765875d5	build(deps): bump @babel/runtime (#530 ) Bumps the npm_and_yarn group with 1 update in the /www directory: [@babel/runtime](https://github.com/babel/babel/tree/HEAD/packages/babel-runtime). Updates `@babel/runtime` from 7.23.6 to 7.28.2 - [Release notes](https://github.com/babel/babel/releases) - [Changelog](https://github.com/babel/babel/blob/main/CHANGELOG.md) - [Commits](https://github.com/babel/babel/commits/v7.28.2/packages/babel-runtime) --- updated-dependencies: - dependency-name: "@babel/runtime" dependency-version: 7.28.2 dependency-type: indirect dependency-group: npm_and_yarn ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-08-05 11:41:34 -06:00
dependabot[bot]	e0f4acf28b	build(deps): bump form-data (#531 ) Bumps the npm_and_yarn group with 1 update in the /www directory: [form-data](https://github.com/form-data/form-data). Updates `form-data` from 4.0.0 to 4.0.4 - [Release notes](https://github.com/form-data/form-data/releases) - [Changelog](https://github.com/form-data/form-data/blob/master/CHANGELOG.md) - [Commits](https://github.com/form-data/form-data/compare/v4.0.0...v4.0.4) --- updated-dependencies: - dependency-name: form-data dependency-version: 4.0.4 dependency-type: indirect dependency-group: npm_and_yarn ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-08-05 11:41:25 -06:00
dependabot[bot]	12359ea4eb	build(deps): bump next (#533 ) Bumps the npm_and_yarn group with 1 update in the /www directory: [next](https://github.com/vercel/next.js). Updates `next` from 14.2.7 to 14.2.30 - [Release notes](https://github.com/vercel/next.js/releases) - [Changelog](https://github.com/vercel/next.js/blob/canary/release.js) - [Commits](https://github.com/vercel/next.js/compare/v14.2.7...v14.2.30) --- updated-dependencies: - dependency-name: next dependency-version: 14.2.30 dependency-type: direct:production dependency-group: npm_and_yarn ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-08-05 11:41:10 -06:00
Mathieu Virbel	267b7401ea	chore(main): release 0.6.0 (#526 ) v0.6.0	2025-08-04 18:04:10 -06:00
Mathieu Virbel	aea9de393c	chore(main): release 0.6.0 Release-As: 0.6.0	2025-08-04 18:02:19 -06:00
Mathieu Virbel	dc177af3ff	feat: implement service-specific Modal API keys with auto processor pattern (#528 ) * fix: refactor modal API key configuration for better separation of concerns - Split generic MODAL_API_KEY into service-specific keys: - TRANSCRIPT_API_KEY for transcription service - DIARIZATION_API_KEY for diarization service - TRANSLATE_API_KEY for translation service - Remove deprecated _MODAL_API_KEY settings - Add proper validation to ensure URLs are set when using modal processors - Update README with new configuration format BREAKING CHANGE: Configuration keys have changed. Update your .env file: - TRANSCRIPT_MODAL_API_KEY → TRANSCRIPT_API_KEY - LLM_MODAL_API_KEY → (removed, use TRANSCRIPT_API_KEY) - Add DIARIZATION_API_KEY and TRANSLATE_API_KEY if using those services fix: update Modal backend configuration to use service-specific API keys - Changed from generic MODAL_API_KEY to service-specific keys: - TRANSCRIPT_MODAL_API_KEY for transcription - DIARIZATION_MODAL_API_KEY for diarization - TRANSLATION_MODAL_API_KEY for translation - Updated audio_transcript_modal.py and audio_diarization_modal.py to use modal_api_key parameter - Updated documentation in README.md, CLAUDE.md, and env.example * feat: implement auto/modal pattern for translation processor - Created TranscriptTranslatorAutoProcessor following the same pattern as transcript/diarization - Created TranscriptTranslatorModalProcessor with TRANSLATION_MODAL_API_KEY support - Added TRANSLATION_BACKEND setting (defaults to "modal") - Updated all imports to use TranscriptTranslatorAutoProcessor instead of TranscriptTranslatorProcessor - Updated env.example with TRANSLATION_BACKEND and TRANSLATION_MODAL_API_KEY - Updated test to expect TranscriptTranslatorModalProcessor name - All tests passing * refactor: simplify transcript_translator base class to match other processors - Moved all implementation from base class to modal processor - Base class now only defines abstract _translate method - 
Follows the same minimal pattern as audio_diarization and audio_transcript base classes - Updated test mock to use _translate instead of get_translation - All tests passing * chore: clean up settings and improve type annotations - Remove deprecated generic API key variables from settings - Add comments to group Modal-specific settings - Improve type annotations for modal_api_key parameters * fix: typing * fix: passing key to openai * test: fix rtc test failing due to change on transcript It also correctly setup database from sqlite, in case our configuration is setup to postgres. * ci: deactivate translation backend by default * test: fix modal->mock * refactor: implementing igor review, mock to passthrough	2025-08-04 12:07:30 -06:00
Mathieu Virbel	5bd8233657	chore: remove refactor md (#527 )	2025-08-01 16:33:40 -06:00
Mathieu Virbel	28ac031ff6	feat: use llamaindex everywhere (#525 ) * feat: use llamaindex for transcript final title too * refactor: removed llm backend, replaced with one single class+llamaindex * refactor: self-review * fix: typing * fix: tests * refactor: extract clean_title and add tests * test: fix * test: remove ensure_casing/nltk * fix: tiny mistake	2025-08-01 12:13:00 -06:00
Mathieu Virbel	1878834ce6	chore(main): release 0.5.0 (#521 ) v0.5.0	2025-07-31 20:11:41 -06:00
Mathieu Virbel	f5b82d44e3	style: use ruff for linting and formatting (#524 )	2025-07-31 17:57:43 -06:00
Mathieu Virbel	ad56165b54	fix: remove unused settings and utils files (#522 ) * fix: remove unused settings and utils files * fix: remove migration done * fix: remove outdated scripts * fix: removing deployment of hermes, not used anymore * fix: partially remove secret, still have to understand frontend.	2025-07-31 17:45:48 -06:00
Mathieu Virbel	4ee19ed015	ci: update pull request template (#523 )	2025-07-31 17:45:19 -06:00
Mathieu Virbel	406164033d	feat: new summary using phi-4 and llama-index (#519 ) * feat: add litellm backend implementation * refactor: improve generate/completion methods for base LLM * refactor: remove tokenizer logic * style: apply code formatting * fix: remove hallucinations from LLM responses * refactor: comprehensive LLM and summarization rework * chore: remove debug code * feat: add structured output support to LiteLLM * refactor: apply self-review improvements * docs: add model structured output comments * docs: update model structured output comments * style: apply linting and formatting fixes * fix: resolve type logic bug * refactor: apply PR review feedback * refactor: apply additional PR review feedback * refactor: apply final PR review feedback * fix: improve schema passing for LLMs without structured output * feat: add PR comments and logger improvements * docs: update README and add HTTP logging * feat: improve HTTP logging * feat: add summary chunking functionality * fix: resolve title generation runtime issues * refactor: apply self-review improvements * style: apply linting and formatting * feat: implement LiteLLM class structure * style: apply linting and formatting fixes * docs: env template model name fix * chore: remove older litellm class * chore: format * refactor: simplify OpenAILLM * refactor: OpenAILLM tokenizer * refactor: self-review * refactor: self-review * refactor: self-review * chore: format * chore: remove LLM_USE_STRUCTURED_OUTPUT from envs * chore: roll back migration lint changes * chore: roll back migration lint changes * fix: make summary llm configuration optional for the tests * fix: missing f-string * fix: tweak the prompt for summary title * feat: try llamaindex for summarization * fix: complete refactor of summary builder using llamaindex and structured output when possible * fix: separate prompt as constant * fix: typings * fix: enhance prompt to prevent mentioning others subject while summarize one * fix: various 
changes after self-review * fix: from igor review --------- Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>	2025-07-31 15:29:29 -06:00
Mathieu Virbel	81d316cb56	ci: remove conventional commit for ci (#520 ) As we now squash merge, only the conventional commit is required for the title of the PR	2025-07-31 15:19:16 -06:00
Mathieu Virbel	db3beae5cd	chore(main): release 0.4.0 (#510 ) v0.4.0	2025-07-25 19:09:57 -06:00

1260 Commits