Reflector
Reflector Audio Management and Analysis is a cutting-edge web application under development by Monadical. It utilizes AI to record meetings, providing a permanent record with transcripts, translations, and automated summaries.
Screenshots
Background
The project architecture consists of three primary components:
- Front-End: NextJS React project hosted on Vercel, located in www/.
- Back-End: Python server that offers an API and data persistence, found in server/.
- GPU implementation: provides services such as speech-to-text transcription, topic generation, automated summaries, and translations. The most reliable option is the Modal deployment.
It also uses authentik for authentication if activated, and Vercel for deployment and configuration of the front-end.
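For orientation, here is a simplified sketch of the repository layout implied by the paths above (illustrative only; the actual tree contains more than this):

```
www/          # Front-End: NextJS React project (deployed on Vercel)
server/       # Back-End: Python API server, worker, and tools
server/gpu/   # GPU micro-services (transcription, summaries, translations) deployed on Modal
```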
Contribution Guidelines
All new contributions should be made in a separate branch and go through a Pull Request. Conventional Commits must be used for the PR title and commits.
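For example, Conventional Commits titles follow a type: description pattern (the subjects below are purely illustrative):

```
feat: add speaker labels to the transcript view
fix(server): handle empty audio uploads
chore: update dependencies
```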
Usage
To record both your voice and the meeting you're taking part in:
- For an in-person meeting, make sure your microphone is in range of all participants.
- If using several microphones, make sure to merge the audio feeds into one with an external tool.
- For an online meeting, if you do not use headphones, your microphone should be able to pick up both your voice and the audio feed of the meeting.
- If you want to use headphones, you need to merge the audio feeds with an external tool.
Permissions:
You may have to grant your browser permission to access the microphone in order to record audio:
- System Preferences -> Privacy & Security -> Microphone
- System Preferences -> Privacy & Security -> Accessibility

You will be prompted to grant these permissions when you try to connect.
How to Install Blackhole (Mac Only)
This is an external tool for merging the audio feeds as explained in the previous section of this document. Note: We currently do not have instructions for Windows users.
- Install BlackHole-2ch (2 channels are enough) using one of the two installation options listed.
- Set up an "Aggregate Device" to route web audio and the local microphone input.
- Set up a "Multi-Output Device".
- Then go to System Preferences -> Sound and choose the devices you created from the Output and Input tabs.
- If everything is configured properly, the input from your local microphone and the browser-run meeting is aggregated into one virtual stream you can listen to, and the output is fed back to your specified output devices.
Installation
Frontend
Start with cd www.
Installation
```
yarn install
cp .env_template .env
cp config-template.ts config.ts
```
Then, fill in the environment variables in .env and the configuration in config.ts as needed. If you are unsure how to proceed, ask in Zulip.
Run in development mode
```
yarn dev
```
Then (after completing server setup and starting it) open http://localhost:3000 to view it in the browser.
OpenAPI Code Generation
To generate the TypeScript files from the openapi.json file, make sure the Python server is running, then run:
```
yarn openapi
```
Backend
Start with cd server.
Run in development mode
```
docker compose up -d redis
# on the first run, or if the schemas changed
uv run alembic upgrade head
# start the worker
uv run celery -A reflector.worker.app worker --loglevel=info
# start the app
uv run -m reflector.app --reload
```
Then fill .env with the omitted values (ask in Zulip).
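As an illustration of what some of those values look like, here is a minimal sketch of the Modal-related keys, assuming the service-specific naming used elsewhere in this project (check the server's example environment file for the authoritative names and defaults):

```
# sketch only: confirm the exact key names against the server's example env file
TRANSCRIPT_MODAL_API_KEY=<key for the transcription service>
DIARIZATION_MODAL_API_KEY=<key for the diarization service>
TRANSLATION_MODAL_API_KEY=<key for the translation service>
```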
Crontab (optional)
For cron-style jobs (only a healthcheck for now), start celery beat (you don't need it in your local dev environment):
```
uv run celery -A reflector.worker.app beat
```
GPU models
Currently, Reflector heavily uses custom local models, deployed on Modal. All the micro-services are available in server/gpu/.
To deploy LLM changes to Modal, you need to:
- have a Modal account
- set up the required secret in your Modal account (REFLECTOR_GPU_APIKEY)
- install the Modal CLI
- connect your Modal CLI to your account, if not done previously

Then run:
```
modal run path/to/required/llm
```
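A sketch of those steps with the Modal CLI (the secret name below is hypothetical, and the exact service file to run depends on what you changed under server/gpu/):

```
pip install modal            # install the Modal CLI
modal token new              # connect the CLI to your Modal account
# create the secret mentioned above; the secret name "reflector-gpu" is hypothetical
modal secret create reflector-gpu REFLECTOR_GPU_APIKEY=<your-api-key>
# run one of the GPU services from server/gpu/
modal run server/gpu/<service>.py
```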
Using local files
You can manually process an audio file by calling the process tool:
```
uv run python -m reflector.tools.process path/to/audio.wav
```
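For example, to process several recordings in a row, the same command can be wrapped in a small shell loop (assuming your recordings are audio files in a local directory):

```
for f in recordings/*.wav; do
  uv run python -m reflector.tools.process "$f"
done
```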