reflector

mirror of https://github.com/Monadical-SAS/reflector.git synced 2025-12-20 20:29:06 +00:00

Author	SHA1	Message	Date
Igor Loskutov	a5124b599d	node version 20 for tests	2025-09-04 20:49:11 -04:00
Igor Loskutov	790a61be0d	less edgy config (ci)	2025-09-04 12:32:15 -04:00
Igor Loskutov	41c92b8aeb	ci randomness	2025-09-04 12:02:57 -04:00
Igor Loskutov	2811540d9a	less edgy config (ci)	2025-09-04 11:45:24 -04:00
Igor Loskutov	c28af33b25	ci randomness	2025-09-04 11:27:43 -04:00
Igor Loskutov	f0eba2b2cd	test ts server	2025-09-04 10:54:09 -04:00
Mathieu Virbel	55cc8637c6	ci: restrict workflow execution to main branch and add concurrency (#586) * ci: try adding concurrency * ci: restrict push on main branch * ci: fix concurrency key * ci: fix build concurrency * refactor: apply suggestion from @pr-agent-monadical[bot] Co-authored-by: pr-agent-monadical[bot] <198624643+pr-agent-monadical[bot]@users.noreply.github.com> --------- Co-authored-by: pr-agent-monadical[bot] <198624643+pr-agent-monadical[bot]@users.noreply.github.com>	2025-08-28 16:43:17 -06:00
Mathieu Virbel	af16178f86	ci: use github-token to get around potential api throttling + rework dockerfile (#554) * ci: use github-token to get around potential api throttling * build: put pyannote-audio separate to the project * fix: now that we have a readme, use it * build: add UV_NO_CACHE	2025-08-20 21:59:29 -06:00
Mathieu Virbel	3ea7f6b7b6	feat: pipeline improvement with file processing, parakeet, silero-vad (#540) * feat: improve pipeline threading, and transcriber (parakeet and silero vad) * refactor: remove whisperx, implement parakeet * refactor: make audio_chunker more smart and wait for speech, instead of fixed frame * refactor: make audio merge to always downscale the audio to 16k for transcription * refactor: make the audio transcript modal accepting batches * refactor: improve type safety and remove prometheus metrics - Add DiarizationSegment TypedDict for proper diarization typing - Replace List/Optional with modern Python list/| None syntax - Remove all Prometheus metrics from TranscriptDiarizationAssemblerProcessor - Add comprehensive file processing pipeline with parallel execution - Update processor imports and type annotations throughout - Implement optimized file pipeline as default in process.py tool * refactor: convert FileDiarizationProcessor I/O types to BaseModel Update FileDiarizationInput and FileDiarizationOutput to inherit from BaseModel instead of plain classes, following the standard pattern used by other processors in the codebase. * test: add tests for file transcript and diarization with pytest-recording * build: add pytest-recording * feat: add local pyannote for testing * fix: replace PyAV AudioResampler with torchaudio for reliable audio processing - Replace problematic PyAV AudioResampler that was causing ValueError: [Errno 22] Invalid argument - Use torchaudio.functional.resample for robust sample rate conversion - Optimize processing: skip conversion for already 16kHz mono audio - Add direct WAV writing with Python wave module for better performance - Consolidate duplicate downsample checks for cleaner code - Maintain list[av.AudioFrame] input interface - Required for Silero VAD which needs 16kHz mono audio * fix: replace PyAV AudioResampler with torchaudio solution - Resolves ValueError: [Errno 22] Invalid argument in AudioMergeProcessor - Replaces problematic PyAV AudioResampler with torchaudio.functional.resample - Optimizes processing to skip unnecessary conversions when audio is already 16kHz mono - Uses direct WAV writing with Python's wave module for better performance - Fixes test_basic_process to disable diarization (pyannote dependency not installed) - Updates test expectations to match actual processor behavior - Removes unused pydub dependency from pyproject.toml - Adds comprehensive TEST_ANALYSIS.md documenting test suite status * feat: add parameterized test for both diarization modes - Adds @pytest.mark.parametrize to test_basic_process with enable_diarization=[False, True] - Test with diarization=False always passes (tests core AudioMergeProcessor functionality) - Test with diarization=True gracefully skips when pyannote.audio is not installed - Provides comprehensive test coverage for both pipeline configurations * fix: resolve pipeline property naming conflict in AudioDiarizationPyannoteProcessor - Renames 'pipeline' property to 'diarization_pipeline' to avoid conflict with base Processor.pipeline attribute - Fixes AttributeError: 'property 'pipeline' object has no setter' when set_pipeline() is called - Updates property usage in _diarize method to use new name - Now correctly supports pipeline initialization for diarization processing * fix: add local for pyannote * test: add diarization test * fix: resample on audio merge now working * fix: correctly restore timestamp * fix: display exception in a threaded processor if that happen * Update pyproject.toml * ci: remove option * ci: update astral-sh/setup-uv * test: add monadical url for pytest-recording * refactor: remove previous version * build: move faster whisper to local dep * test: fix missing import * refactor: improve main_file_pipeline organization and error handling - Move all imports to the top of the file - Create unified EmptyPipeline class to replace duplicate mock pipeline code - Remove timeout and fallback logic - let processors handle their own retries - Fix error handling to raise any exception from parallel tasks - Add proper type hints and validation for captured results * fix: wrong function * fix: remove task_done * feat: add configurable file processing timeouts for modal processors - Add TRANSCRIPT_FILE_TIMEOUT setting (default: 600s) for file transcription - Add DIARIZATION_FILE_TIMEOUT setting (default: 600s) for file diarization - Replace hardcoded timeout=600 with configurable settings in modal processors - Allows customization of timeout values via environment variables * fix: use logger * fix: worker process meetings now use file pipeline * fix: topic not gathered * refactor: remove prepare(), pipeline now work * refactor: implement many review from Igor * test: add test for test_pipeline_main_file * refactor: remove doc * doc: add doc * ci: update build to use native arm64 builder * fix: merge fixes * refactor: changes from Igor review + add test (not by default) to test gpu modal part * ci: update to our own runner linux-amd64 * ci: try using suggested mode=min * fix: update diarizer for latest modal, and use volume * fix: modal file extension detection * fix: put the diarizer as A100	2025-08-20 20:07:19 -06:00
Mathieu Virbel	1311714451	ci: add pre-commit hook and fix linting issues (#545) * style: deactivate PLC0415 only on part that it's ok + re-run pre-commit run --all * ci: add pre-commit hook * build: move from yarn to pnpm * build: move from yarn to pnpm * build: fix node-version * ci: install pnpm prior node (?) * build: update deps and pnpm trying to fix vercel build * feat: docker www corepack * style: pre-commit --------- Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>	2025-08-14 20:59:54 -06:00
Mathieu Virbel	9eab952c63	feat: postgresql migration and removal of sqlite in pytest (#546) * feat: remove support of sqlite, 100% postgres * fix: more migration and make datetime timezone aware in postgres * fix: change how the database is retrieved, and use contextvar to have a different instance between different loops * test: properly use client fixture that handles lifetime/database connection * fix: add missing client fixture parameters to test functions This commit fixes NameError issues where test functions were trying to use the 'client' fixture but didn't have it as a parameter. The changes include: 1. Added 'client' parameter to test functions in: - test_transcripts_audio_download.py (6 functions including fixture) - test_transcripts_speaker.py (3 functions) - test_transcripts_upload.py (1 function) - test_transcripts_rtc_ws.py (2 functions + appserver fixture) 2. Resolved naming conflicts in test_transcripts_rtc_ws.py where both HTTP client and StreamClient were using variable name 'client'. StreamClient instances are now named 'stream_client' to avoid conflicts. 3. Added missing 'from reflector.app import app' import in rtc_ws tests. Background: Previously implemented contextvars solution with get_database() function resolves asyncio event loop conflicts in Celery tasks. The global client fixture was also created to replace manual AsyncClient instances, ensuring proper FastAPI application lifecycle management and database connections during tests. All tests now pass except for 2 pre-existing RTC WebSocket test failures related to asyncpg connection issues unrelated to these fixes. * fix: ensure tasks are correctly closed * fix: make separate event loop for the live server * fix: make default settings pointing at postgres * build: remove pytest-docker deps out of dev, just tests group	2025-08-14 11:40:52 -06:00
Mathieu Virbel	4ee19ed015	ci: update pull request template (#523)	2025-07-31 17:45:19 -06:00
Mathieu Virbel	81d316cb56	ci: remove conventional commit for ci (#520) As we now squash merge, only the conventional commit is required for the title of the PR	2025-07-31 15:19:16 -06:00
Mathieu Virbel	86ce68651f	build: move to uv (#488) * build: move to uv * build: add packages declaration * build: move to python 3.12, as sentencepiece does not work on 3.13 * ci: remove pre-commit check, will be done in another branch. * ci: fix name checkout * ci: update lock and dockerfile * test: remove event_loop, not needed in python 3.12 * test: updated test due to av returning AudioFrame with 4096 samples instead of 1024 * build: prevent using fastapi cli, because there is no way to set default port I don't want to pass --port 1250 every time, so back on previous approach. I deactivated auto-reload for production. * ci: remove main.py * test: fix quirk with httpx	2025-07-16 18:10:11 -06:00
Mathieu Virbel	4764dfc219	ci: add conventional commits checks to the repo (#486)	2025-07-16 08:31:31 -06:00
Mathieu Virbel	9b67deb9fe	ci: add release-please workflow (#485)	2025-07-16 08:09:57 -06:00
Mathieu Virbel	3d370336cc	fix: alembic migrations (#470) * fix: alembic migrations This commit fixes all the migrations that were half-baked, due to auto creation in the db init before. The process was to checkout at the commit where the migration was created, and use --autogenerate to regenerate at the state of the migration. 4 migrations were fixed. It also includes a workflow to ensure migrations apply correctly. * fix: db migration check * fix: nullable on meeting_consent * fix: try fixing tests	2025-06-27 12:03:10 -06:00
Sergey Mankovsky	1e69214cdc	Don't install current project	2025-01-20 11:56:55 +01:00
Sergey Mankovsky	5007bd7875	Fix formatting	2024-07-15 11:29:25 +02:00
Andreas Bonini	2e1b4c2c68	Update deploy.yml	2023-11-16 09:19:24 +07:00
Mathieu Virbel	7fca7ae287	ci: add redis service required for celery	2023-11-02 17:39:21 +01:00
Mathieu Virbel	5a92080e8c	ci: fix path inclusion on test_server	2023-11-02 12:24:52 +01:00
Mathieu Virbel	a5b9e75e21	ci: include only server This prevents the tests from running when the readme is changed.	2023-11-01 15:12:10 +01:00
Mathieu Virbel	c5297be924	gh: use poetry cache from setup-python and remove old deps (#281) * gh: use poetry cache from setup-python and remove old deps * gh: use pipx and not setup-poetry, as per setup-python example * server: remove pyaudio unused in current reflector	2023-10-13 15:29:54 +02:00
Mathieu Virbel	0e4ef90e62	add github cache for docker	2023-10-13 12:03:51 +02:00
Mathieu Virbel	362a3d5589	docker: fix build for arm64 This was broken with safetensors dependencies required by torch	2023-10-12 20:28:39 +02:00
Mathieu Virbel	63c1cbe89b	Update deploy.yml to include arm64	2023-10-12 09:59:43 +02:00
Mathieu Virbel	60c6acbe68	gh: do not auto redeploy to ecs, manual dispatch only for now	2023-08-18 16:02:44 +02:00
Mathieu Virbel	98375d5c2c	ci: add manual deploy for server	2023-08-15 09:50:13 +02:00
Mathieu Virbel	1f8e4200fd	tests: rework tests and fix bugs along the way	2023-08-01 16:05:48 +02:00
Mathieu Virbel	dd4ae24852	gh: add server workflow	2023-07-27 18:30:49 +02:00
Andreas Bonini	314321c603	Update pull_request_template.md	2023-07-27 18:34:42 +07:00
Koper	46d7269279	PR Template	2023-07-27 18:29:37 +07:00

33 Commits