Reflector GPU implementation - Transcription and LLM

This repository holds the API for the GPU implementation of the Reflector API service, and uses Modal.com

  • reflector_diarizer.py - Diarization API
  • reflector_transcriber.py - Transcription API (Whisper)
  • reflector_transcriber_parakeet.py - Transcription API (NVIDIA Parakeet)
  • reflector_translator.py - Translation API

Modal.com deployment

Create a Modal secret named reflector-gpu. It should contain a REFLECTOR_APIKEY environment variable with the API key as its value.
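
For example, the secret can be created with the Modal CLI (or through the Modal dashboard):

$ modal secret create reflector-gpu REFLECTOR_APIKEY=<your-api-key>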

Deployment is done with the Modal CLI, one app per service:

$ modal deploy reflector_transcriber.py
...
└── 🔨 Created web => https://xxxx--reflector-transcriber-web.modal.run

$ modal deploy reflector_transcriber_parakeet.py
...
└── 🔨 Created web => https://xxxx--reflector-transcriber-parakeet-web.modal.run

$ modal deploy reflector_llm.py
...
└── 🔨 Created web => https://xxxx--reflector-llm-web.modal.run

Then, in your Reflector API configuration (.env), set these keys:

TRANSCRIPT_BACKEND=modal
TRANSCRIPT_URL=https://xxxx--reflector-transcriber-web.modal.run
TRANSCRIPT_MODAL_API_KEY=REFLECTOR_APIKEY

DIARIZATION_BACKEND=modal
DIARIZATION_URL=https://xxxx--reflector-diarizer-web.modal.run
DIARIZATION_MODAL_API_KEY=REFLECTOR_APIKEY

TRANSLATION_BACKEND=modal
TRANSLATION_URL=https://xxxx--reflector-translator-web.modal.run
TRANSLATION_MODAL_API_KEY=REFLECTOR_APIKEY

API

The API key must be passed in the Authorization header, using the bearer scheme.

Authorization: bearer <REFLECTOR_APIKEY>

LLM

POST /llm

request

{
    "prompt": "xxx"
}

response

{
    "text": "xxx completed"
}
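
As a minimal sketch, the endpoint can be called with Python's requests library; the URL is the one printed by modal deploy, and the key is the value stored in the reflector-gpu secret:

import requests

LLM_URL = "https://xxxx--reflector-llm-web.modal.run/llm"  # printed by `modal deploy reflector_llm.py`
API_KEY = "<REFLECTOR_APIKEY>"  # value stored in the reflector-gpu secret

resp = requests.post(
    LLM_URL,
    headers={"Authorization": f"bearer {API_KEY}"},
    json={"prompt": "xxx"},
)
resp.raise_for_status()
print(resp.json()["text"])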

Transcription

Parakeet Transcriber (reflector_transcriber_parakeet.py)

NVIDIA Parakeet is a state-of-the-art ASR model optimized for real-time transcription with superior word-level timestamps.

GPU Configuration:

  • A10G GPU - Used for /v1/audio/transcriptions endpoint (small files, live transcription)

    • Higher concurrency (max_inputs=10)
    • Optimized for multiple small audio files
    • Supports batch processing for efficiency
  • L40S GPU - Used for /v1/audio/transcriptions-from-url endpoint (large files)

    • Lower concurrency but more powerful processing
    • Optimized for single large audio files
    • VAD-based chunking for long-form audio

/v1/audio/transcriptions - Small file transcription

request (multipart/form-data)

  • file or files[] - audio file(s) to transcribe
  • model - model name (default: nvidia/parakeet-tdt-0.6b-v2)
  • language - language code (default: en)
  • batch - whether to use batch processing for multiple files (default: true)

response

{
    "text": "transcribed text",
    "words": [
        {"word": "hello", "start": 0.0, "end": 0.5},
        {"word": "world", "start": 0.5, "end": 1.0}
    ],
    "filename": "audio.mp3"
}

For multiple files with batch=true:

{
    "results": [
        {
            "filename": "audio1.mp3",
            "text": "transcribed text",
            "words": [...]
        },
        {
            "filename": "audio2.mp3",
            "text": "transcribed text",
            "words": [...]
        }
    ]
}
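
As a sketch with Python's requests, using the field names described above (URL and key are placeholders):

import requests

URL = "https://xxxx--reflector-transcriber-parakeet-web.modal.run/v1/audio/transcriptions"
HEADERS = {"Authorization": "bearer <REFLECTOR_APIKEY>"}

# Single small file, sent under the "file" field.
with open("audio.mp3", "rb") as f:
    resp = requests.post(
        URL,
        headers=HEADERS,
        files={"file": ("audio.mp3", f, "audio/mpeg")},
        data={"language": "en"},
    )
resp.raise_for_status()
print(resp.json()["text"])

# Multiple files, sent under "files[]" with batch processing enabled.
files = [
    ("files[]", ("audio1.mp3", open("audio1.mp3", "rb"), "audio/mpeg")),
    ("files[]", ("audio2.mp3", open("audio2.mp3", "rb"), "audio/mpeg")),
]
resp = requests.post(URL, headers=HEADERS, files=files, data={"language": "en", "batch": "true"})
resp.raise_for_status()
for result in resp.json()["results"]:
    print(result["filename"], result["text"])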

/v1/audio/transcriptions-from-url - Large file transcription

request (application/json)

{
    "audio_file_url": "https://example.com/audio.mp3",
    "model": "nvidia/parakeet-tdt-0.6b-v2",
    "language": "en",
    "timestamp_offset": 0.0
}

response

{
    "text": "transcribed text from large file",
    "words": [
        {"word": "hello", "start": 0.0, "end": 0.5},
        {"word": "world", "start": 0.5, "end": 1.0}
    ]
}

Supported file types: mp3, mp4, mpeg, mpga, m4a, wav, webm
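
A sketch of the URL-based endpoint, again with requests (the body mirrors the request example above):

import requests

URL = "https://xxxx--reflector-transcriber-parakeet-web.modal.run/v1/audio/transcriptions-from-url"

resp = requests.post(
    URL,
    headers={"Authorization": "bearer <REFLECTOR_APIKEY>"},
    json={
        "audio_file_url": "https://example.com/audio.mp3",
        "model": "nvidia/parakeet-tdt-0.6b-v2",
        "language": "en",
        "timestamp_offset": 0.0,
    },
)
resp.raise_for_status()
for word in resp.json()["words"]:
    print(f'{word["start"]:.2f}-{word["end"]:.2f} {word["word"]}')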

Whisper Transcriber (reflector_transcriber.py)

POST /transcribe

request (multipart/form-data)

  • file - audio file
  • language - language code (e.g. en)

response

{
    "text": "xxx",
    "words": [
        {"text": "xxx", "start": 0.0, "end": 1.0}
    ]
}
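
As with the Parakeet endpoints, a minimal sketch with requests (URL and key are placeholders):

import requests

URL = "https://xxxx--reflector-transcriber-web.modal.run/transcribe"

with open("audio.wav", "rb") as f:
    resp = requests.post(
        URL,
        headers={"Authorization": "bearer <REFLECTOR_APIKEY>"},
        files={"file": ("audio.wav", f, "audio/wav")},
        data={"language": "en"},
    )
resp.raise_for_status()
print(resp.json()["text"])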