feat: implement service-specific Modal API keys with auto processor pattern (#528)

* fix: refactor modal API key configuration for better separation of concerns

- Split generic MODAL_API_KEY into service-specific keys:
  - TRANSCRIPT_API_KEY for transcription service
  - DIARIZATION_API_KEY for diarization service
  - TRANSLATE_API_KEY for translation service
- Remove deprecated *_MODAL_API_KEY settings
- Add proper validation to ensure URLs are set when using modal processors
- Update README with new configuration format

BREAKING CHANGE: Configuration keys have changed. Update your .env file:
- TRANSCRIPT_MODAL_API_KEY → TRANSCRIPT_API_KEY
- LLM_MODAL_API_KEY → (removed, use TRANSCRIPT_API_KEY)
- Add DIARIZATION_API_KEY and TRANSLATE_API_KEY if using those services

* fix: update Modal backend configuration to use service-specific API keys

- Changed from generic MODAL_API_KEY to service-specific keys:
  - TRANSCRIPT_MODAL_API_KEY for transcription
  - DIARIZATION_MODAL_API_KEY for diarization
  - TRANSLATION_MODAL_API_KEY for translation
- Updated audio_transcript_modal.py and audio_diarization_modal.py to use modal_api_key parameter
- Updated documentation in README.md, CLAUDE.md, and env.example

* feat: implement auto/modal pattern for translation processor

- Created TranscriptTranslatorAutoProcessor following the same pattern as transcript/diarization
- Created TranscriptTranslatorModalProcessor with TRANSLATION_MODAL_API_KEY support
- Added TRANSLATION_BACKEND setting (defaults to "modal")
- Updated all imports to use TranscriptTranslatorAutoProcessor instead of TranscriptTranslatorProcessor
- Updated env.example with TRANSLATION_BACKEND and TRANSLATION_MODAL_API_KEY
- Updated test to expect TranscriptTranslatorModalProcessor name
- All tests passing

* refactor: simplify transcript_translator base class to match other processors

- Moved all implementation from base class to modal processor
- Base class now only defines abstract _translate method
- Follows the same minimal pattern as audio_diarization and audio_transcript base classes
- Updated test mock to use _translate instead of get_translation
- All tests passing

* chore: clean up settings and improve type annotations

- Remove deprecated generic API key variables from settings
- Add comments to group Modal-specific settings
- Improve type annotations for modal_api_key parameters

* fix: typing

* fix: passing key to openai

* test: fix rtc test failing due to change on transcript

It also correctly setup database from sqlite, in case our configuration
is setup to postgres.

* ci: deactivate translation backend by default

* test: fix modal->mock

* refactor: implementing igor review, mock to passthrough
This commit is contained in:
2025-08-04 12:07:30 -06:00
committed by GitHub
parent 5bd8233657
commit dc177af3ff
20 changed files with 220 additions and 85 deletions

View File

@@ -1,9 +1,5 @@
import httpx
from reflector.processors.base import Processor
from reflector.processors.types import Transcript, TranslationLanguages
from reflector.settings import settings
from reflector.utils.retry import retry
from reflector.processors.types import Transcript
class TranscriptTranslatorProcessor(Processor):
@@ -17,56 +13,23 @@ class TranscriptTranslatorProcessor(Processor):
def __init__(self, **kwargs):
super().__init__(**kwargs)
self.transcript = None
self.translate_url = settings.TRANSLATE_URL
self.timeout = settings.TRANSLATE_TIMEOUT
self.headers = {"Authorization": f"Bearer {settings.TRANSCRIPT_MODAL_API_KEY}"}
async def _push(self, data: Transcript):
self.transcript = data
await self.flush()
async def get_translation(self, text: str) -> str | None:
# FIXME this should be a processor after, as each user may want
# different languages
source_language = self.get_pref("audio:source_language", "en")
target_language = self.get_pref("audio:target_language", "en")
if source_language == target_language:
return
languages = TranslationLanguages()
# Only way to set the target should be the UI element like dropdown.
# Hence, this assert should never fail.
assert languages.is_supported(target_language)
self.logger.debug(f"Try to translate {text=}")
json_payload = {
"text": text,
"source_language": source_language,
"target_language": target_language,
}
async with httpx.AsyncClient() as client:
response = await retry(client.post)(
self.translate_url + "/translate",
headers=self.headers,
params=json_payload,
timeout=self.timeout,
follow_redirects=True,
logger=self.logger,
)
response.raise_for_status()
result = response.json()["text"]
# Sanity check for translation status in the result
if target_language in result:
translation = result[target_language]
self.logger.debug(f"Translation response: {text=}, {translation=}")
return translation
async def _translate(self, text: str) -> str | None:
raise NotImplementedError
async def _flush(self):
if not self.transcript:
return
self.transcript.translation = await self.get_translation(
text=self.transcript.text
)
source_language = self.get_pref("audio:source_language", "en")
target_language = self.get_pref("audio:target_language", "en")
if source_language == target_language:
self.transcript.translation = None
else:
self.transcript.translation = await self._translate(self.transcript.text)
await self.emit(self.transcript)