Commit Graph

12 Commits

Author SHA1 Message Date
dc177af3ff feat: implement service-specific Modal API keys with auto processor pattern (#528)
* fix: refactor modal API key configuration for better separation of concerns

- Split generic MODAL_API_KEY into service-specific keys:
  - TRANSCRIPT_API_KEY for transcription service
  - DIARIZATION_API_KEY for diarization service
  - TRANSLATE_API_KEY for translation service
- Remove deprecated *_MODAL_API_KEY settings
- Add proper validation to ensure URLs are set when using modal processors
- Update README with new configuration format

BREAKING CHANGE: Configuration keys have changed. Update your .env file:
- TRANSCRIPT_MODAL_API_KEY → TRANSCRIPT_API_KEY
- LLM_MODAL_API_KEY → (removed, use TRANSCRIPT_API_KEY)
- Add DIARIZATION_API_KEY and TRANSLATE_API_KEY if using those services

* fix: update Modal backend configuration to use service-specific API keys

- Changed from generic MODAL_API_KEY to service-specific keys:
  - TRANSCRIPT_MODAL_API_KEY for transcription
  - DIARIZATION_MODAL_API_KEY for diarization
  - TRANSLATION_MODAL_API_KEY for translation
- Updated audio_transcript_modal.py and audio_diarization_modal.py to use modal_api_key parameter
- Updated documentation in README.md, CLAUDE.md, and env.example

* feat: implement auto/modal pattern for translation processor

- Created TranscriptTranslatorAutoProcessor following the same pattern as transcript/diarization
- Created TranscriptTranslatorModalProcessor with TRANSLATION_MODAL_API_KEY support
- Added TRANSLATION_BACKEND setting (defaults to "modal")
- Updated all imports to use TranscriptTranslatorAutoProcessor instead of TranscriptTranslatorProcessor
- Updated env.example with TRANSLATION_BACKEND and TRANSLATION_MODAL_API_KEY
- Updated test to expect TranscriptTranslatorModalProcessor name
- All tests passing

* refactor: simplify transcript_translator base class to match other processors

- Moved all implementation from base class to modal processor
- Base class now only defines abstract _translate method
- Follows the same minimal pattern as audio_diarization and audio_transcript base classes
- Updated test mock to use _translate instead of get_translation
- All tests passing

* chore: clean up settings and improve type annotations

- Remove deprecated generic API key variables from settings
- Add comments to group Modal-specific settings
- Improve type annotations for modal_api_key parameters

* fix: typing

* fix: passing key to openai

* test: fix rtc test failing due to change on transcript

It also correctly setup database from sqlite, in case our configuration
is setup to postgres.

* ci: deactivate translation backend by default

* test: fix modal->mock

* refactor: implementing igor review, mock to passthrough
2025-08-04 12:07:30 -06:00
28ac031ff6 feat: use llamaindex everywhere (#525)
* feat: use llamaindex for transcript final title too

* refactor: removed llm backend, replaced with one single class+llamaindex

* refactor: self-review

* fix: typing

* fix: tests

* refactor: extract clean_title and add tests

* test: fix

* test: remove ensure_casing/nltk

* fix: tiny mistake
2025-08-01 12:13:00 -06:00
ad56165b54 fix: remove unused settings and utils files (#522)
* fix: remove unused settings and utils files

* fix: remove migration done

* fix: remove outdated scripts

* fix: removing deployment of hermes, not used anymore

* fix: partially remove secret, still have to understand frontend.
2025-07-31 17:45:48 -06:00
406164033d feat: new summary using phi-4 and llama-index (#519)
* feat: add litellm backend implementation

* refactor: improve generate/completion methods for base LLM

* refactor: remove tokenizer logic

* style: apply code formatting

* fix: remove hallucinations from LLM responses

* refactor: comprehensive LLM and summarization rework

* chore: remove debug code

* feat: add structured output support to LiteLLM

* refactor: apply self-review improvements

* docs: add model structured output comments

* docs: update model structured output comments

* style: apply linting and formatting fixes

* fix: resolve type logic bug

* refactor: apply PR review feedback

* refactor: apply additional PR review feedback

* refactor: apply final PR review feedback

* fix: improve schema passing for LLMs without structured output

* feat: add PR comments and logger improvements

* docs: update README and add HTTP logging

* feat: improve HTTP logging

* feat: add summary chunking functionality

* fix: resolve title generation runtime issues

* refactor: apply self-review improvements

* style: apply linting and formatting

* feat: implement LiteLLM class structure

* style: apply linting and formatting fixes

* docs: env template model name fix

* chore: remove older litellm class

* chore: format

* refactor: simplify OpenAILLM

* refactor: OpenAILLM tokenizer

* refactor: self-review

* refactor: self-review

* refactor: self-review

* chore: format

* chore: remove LLM_USE_STRUCTURED_OUTPUT from envs

* chore: roll back migration lint changes

* chore: roll back migration lint changes

* fix: make summary llm configuration optional for the tests

* fix: missing f-string

* fix: tweak the prompt for summary title

* feat: try llamaindex for summarization

* fix: complete refactor of summary builder using llamaindex and structured output when possible

* fix: separate prompt as constant

* fix: typings

* fix: enhance prompt to prevent mentioning others subject while summarize one

* fix: various changes after self-review

* fix: from igor review

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-07-31 15:29:29 -06:00
cfb1b2f9bc Upgrade modal apps 2025-03-25 11:09:01 +01:00
163d4a6e4a Refactor transcribe segment 2025-01-20 12:46:20 +01:00
99ff06ff17 OpenAI compatible transcription api 2025-01-20 12:27:58 +01:00
7ff201f3ff Fix model download 2024-12-27 14:23:03 +01:00
895ba36cb9 fix: modal upgrade (#421) 2024-10-01 16:39:24 +02:00
5267ab2d37 feat: retake summary using NousResearch/Hermes-3-Llama-3.1-8B model (#415)
This feature a new modal endpoint, and a complete new way to build the
summary.

## SummaryBuilder

The summary builder is based on conversational model, where an exchange
between the model and the user is made. This allow more context
inclusion and a better respect of the rules.

It requires an endpoint with OpenAI-like completions endpoint
(/v1/chat/completions)

## vLLM Hermes3

Unlike previous deployment, this one use vLLM, which gives OpenAI-like
completions endpoint out of the box. It could also handle guided JSON
generation, so jsonformer is not needed. But, the model is quite good to
follow JSON schema if asked in the prompt.

## Conversion of long/short into summary builder

The builder is identifying participants, find key subjects, get a
summary for each, then get a quick recap.

The quick recap is used as a short_summary, while the markdown including
the quick recap + key subjects + summaries are used for the
long_summary.

This is why the nextjs component has to be updated, to correctly style
h1 and keep the new line of the markdown.
2024-09-14 02:28:38 +02:00
Sara
004787c055 upgrade modal 2024-08-12 12:24:14 +02:00
projects-g
06b0abaf62 deployment fix (#364) 2024-06-20 12:07:28 +05:30