mirror of
https://github.com/Monadical-SAS/reflector.git
synced 2026-03-28 01:46:46 +00:00
* feat: local LLM via Ollama + structured output response_format
- Add setup script (scripts/setup-local-llm.sh) for one-command Ollama setup
Mac: native Metal GPU, Linux: containerized via docker-compose profiles
- Add ollama-gpu and ollama-cpu docker-compose profiles for Linux
- Add extra_hosts to server/hatchet-worker-llm for host.docker.internal
- Pass response_format JSON schema in StructuredOutputWorkflow.extract()
enabling grammar-based constrained decoding on Ollama/llama.cpp/vLLM/OpenAI
- Update .env.example with Ollama as default LLM option
- Add Ollama PRD and local dev setup docs
* refactor: move Ollama services to docker-compose.standalone.yml
Ollama profiles (ollama-gpu, ollama-cpu) are only for Linux standalone
deployment. Mac devs never use them. Separate file keeps the main
compose clean and provides a natural home for future standalone services
(MinIO, etc.).
Linux: docker compose -f docker-compose.yml -f docker-compose.standalone.yml --profile ollama-gpu up -d
Mac: docker compose up -d (native Ollama, no standalone file needed)
* fix: correct PRD goal (demo/eval, not dev replacement) and processor naming
* chore: remove completed PRD, rename setup doc, drop response_format tests
- Remove docs/01_ollama.prd.md (implementation complete)
- Rename local-dev-setup.md -> standalone-local-setup.md
- Remove TestResponseFormat class from test_llm_retry.py
* docs: resolve standalone storage step — skip S3 for live-only mode
* docs: add TASKS.md for standalone env defaults + setup script work
* feat: add unified setup-local-dev.sh for standalone deployment
Single script takes fresh clone to working Reflector: Ollama/LLM setup,
env file generation (server/.env + www/.env.local), docker compose up,
health checks. No Hatchet in standalone — live pipeline is pure Celery.
* chore: rename to setup-standalone, remove redundant setup-local-llm.sh
* feat: add custom S3 endpoint support + Garage standalone storage
Add TRANSCRIPT_STORAGE_AWS_ENDPOINT_URL setting to enable S3-compatible
backends (Garage, MinIO). When set, uses path-style addressing and
routes all requests to the custom endpoint. When unset, AWS behavior
is unchanged.
- AwsStorage: accept aws_endpoint_url, pass to all 6 session.client()
calls, configure path-style addressing and base_url
- Fix 4 direct AwsStorage constructions in Hatchet workflows to pass
endpoint_url (would have silently targeted wrong endpoint)
- Standalone: add Garage service to docker-compose.standalone.yml,
setup script initializes layout/bucket/key and writes credentials
- Fix compose_cmd() bug: Mac path was missing standalone yml
- garage.toml template with runtime secret generation via openssl
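The endpoint switch described in this entry can be sketched roughly as below. This is an illustration only, not the AwsStorage code: `s3_client_options` and `object_url` are hypothetical helpers, the `s3_config` key is a placeholder for what would be a `botocore.config.Config(s3={"addressing_style": "path"})` object in real code, and the `garage:3900` host in the usage note is an assumed example.

```python
from typing import Any, Optional


def s3_client_options(endpoint_url: Optional[str]) -> dict[str, Any]:
    """Hypothetical helper: choose S3 client overrides from an optional endpoint.

    With no endpoint configured, return no overrides so default AWS behavior
    is unchanged. With an endpoint (Garage, MinIO, ...), route requests to it
    and request path-style addressing.
    """
    if not endpoint_url:
        return {}
    return {
        "endpoint_url": endpoint_url,
        # Placeholder key (assumption): real code would wrap this dict in
        # botocore.config.Config(s3=...) and pass it as `config=`.
        "s3_config": {"addressing_style": "path"},
    }


def object_url(endpoint_url: str, bucket: str, key: str) -> str:
    """Build a path-style object URL: <endpoint>/<bucket>/<key>."""
    return f"{endpoint_url.rstrip('/')}/{bucket}/{key}"
```

For example, `object_url("http://garage:3900/", "reflector", "a/b.mp4")` yields a URL whose bucket sits in the path rather than in the hostname, which is what S3-compatible backends without wildcard DNS typically need.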
* fix: standalone setup — garage config, symlink handling, healthcheck
- garage.toml: fix rpc_secret field name (was secret_transmitter),
move to top-level per Garage v1.1.0 spec, remove unused [s3_web]
- setup-standalone.sh: resolve symlinked .env files before writing,
always ensure all standalone-critical vars via env_set,
fix garage key create/info syntax (positional arg, not --name),
avoid overwriting key secret with "(redacted)" on re-run,
use compose_cmd in health check
- docker-compose.standalone.yml: fix garage healthcheck (no curl in
image, use /garage stats instead)
* docs: update standalone md — symlink handling, garage config template
* docs: add troubleshooting section + port conflict check in setup script
Port conflicts from stale next dev / other worktree processes silently
shadow Docker container port mappings, causing env vars to appear ignored.
* fix: invalidate transcript query on STATUS websocket event
Without this, the processing page never redirects after completion
because the redirect logic watches the REST query data, not the
WebSocket status state.
Cherry-picked from feat-dag-progress (faec509a).
* fix: local env setup (#855)
* Ensure rate limit
* Increase nextjs compilation speed
* Fix daily no content handling
* Simplify daily webhook creation
* Fix webhook request validation
* feat: add local pyannote file diarization processor (#858)
* feat: add local pyannote file diarization processor
Enables file diarization without Modal by using pyannote.audio locally.
Downloads model bundle from S3 on first use, caches locally, patches
config to use local paths. Set DIARIZATION_BACKEND=pyannote to enable.
* fix: standalone setup enables pyannote diarization and public mode
Replace DIARIZATION_ENABLED=false with DIARIZATION_BACKEND=pyannote so
file uploads get speaker diarization out of the box. Add PUBLIC_MODE=true
so unauthenticated users can list/browse transcripts.
* fix: touch env files before first compose_cmd in standalone setup
docker-compose.yml references www/.env.local as env_file, but the
setup script only creates it in step 4. compose_cmd calls in step 3
(Garage) fail on a fresh clone when the file doesn't exist yet.
* feat: standalone uses self-hosted GPU service for transcription+diarization
Replace in-process pyannote approach with self-hosted gpu/self_hosted/ service.
Same HTTP API as Modal — just TRANSCRIPT_URL/DIARIZATION_URL point to local container.
- Add gpu/self_hosted/Dockerfile.cpu (GPU Dockerfile minus NVIDIA CUDA)
- Add S3 model bundle fallback in diarizer.py when HF_TOKEN not set
- Add gpu service to docker-compose.standalone.yml with compose env overrides
- Fix /browse empty in PUBLIC_MODE (search+list queries filtered out roomless transcripts)
- Remove audio_diarization_pyannote.py, file_diarization_pyannote.py and tests
- Remove pyannote-audio from server local deps
* fix: allow unauthenticated GPU requests when no API key configured
OAuth2PasswordBearer with auto_error=True rejects requests without
Authorization header before apikey_auth can check if auth is needed.
* fix: rename standalone gpu service to cpu to match Dockerfile.cpu usage
* docs: add programmatic testing section and fix gpu->cpu naming in setup script/docs
- Add "Testing programmatically" section to standalone docs with curl commands
for creating transcript, uploading audio, polling status, checking result
- Fix setup-standalone.sh to reference `cpu` service (was still `gpu` after rename)
- Update all docs references from gpu to cpu service naming
---------
Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
* Fix websocket disconnect errors
* Fix event loop is closed in Celery workers
* Allow reprocessing idle multitrack transcripts
* fix: set source_kind to FILE on audio file upload
The upload endpoint left source_kind as the default LIVE even when
a file was uploaded. Now sets it to FILE when the upload completes.
* Add hatchet env vars
* fix: improve port conflict detection and ollama model check in standalone setup
- Filter OrbStack/Docker Desktop PIDs from port conflict check (false positives on Mac)
- Check all infra ports (5432, 6379, 3900, 3903) not just app ports
- Fix ollama model detection to match on name column only
- Document OrbStack and cross-project port conflicts in troubleshooting
* fix: processing page auto-redirect after file upload completes
Three fixes for the processing page not redirecting when status becomes "ended":
- Add useWebSockets to processing page so it receives STATUS events
- Remove OAuth2PasswordBearer from auth_none — broke WebSocket endpoints (500)
- Reconnect stale Redis in ws_manager when Celery worker reuses dead event loop
* fix: mock Celery broker in idle transcript validation test
test_validation_idle_transcript_with_recording_allowed called
validate_transcript_for_processing without mocking
task_is_scheduled_or_active, which attempts a real Celery
broker connection (AMQP port 5672). Other tests in the same
file already mock this — apply the same pattern here.
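The mocking pattern referred to in this entry might look like the sketch below. Everything here is a simplified stand-in rather than the repository's test: `Broker`, `validate`, and the `"file_process"` task name are hypothetical, and only `unittest.mock.patch.object` is the real API being demonstrated.

```python
import asyncio
from types import SimpleNamespace
from unittest.mock import patch


class Broker:
    """Hypothetical stand-in for the Celery inspection helper."""

    @staticmethod
    def task_is_scheduled_or_active(task_name: str, **kwargs) -> bool:
        # The real helper queries running workers through the broker.
        # Failing loudly here shows what an unmocked unit test would hit.
        raise ConnectionError("no broker available in unit tests")


async def validate(transcript) -> str:
    """Simplified stand-in for the validation function under test."""
    if Broker.task_is_scheduled_or_active(
        "file_process", transcript_id=transcript.id
    ):
        return "already_scheduled"
    return "ok"


def validate_with_mocked_broker() -> str:
    transcript = SimpleNamespace(id="t1")
    # Replace the broker lookup for the duration of the call only.
    with patch.object(
        Broker, "task_is_scheduled_or_active", return_value=False
    ) as mocked:
        result = asyncio.run(validate(transcript))
    mocked.assert_called_once_with("file_process", transcript_id="t1")
    return result
```

Patching the lookup keeps the test independent of any running broker while still asserting which task name and keyword arguments the validation asks about.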
* Enable server host mode
* Fix webrtc connection
* Remove turbopack
* fix: standalone GPU service connectivity with host network mode
Server runs with network_mode: host and can't resolve Docker service
names. Publish cpu port as 8100 on host, point server at localhost:8100.
Worker stays on bridge network using cpu:8000. Add dummy
TRANSCRIPT_MODAL_API_KEY since OpenAI SDK requires it even for local
endpoints.
---------
Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
Co-authored-by: Sergey Mankovsky <sergey@mankovsky.dev>
293 lines
10 KiB
Python
"""
Transcript processing service - shared logic for HTTP endpoints and Celery tasks.

This module provides result-based error handling that works in both contexts:
- HTTP endpoint: converts errors to HTTPException
- Celery task: converts errors to Exception
"""

from dataclasses import dataclass
from typing import Literal, Union, assert_never

import celery
from celery.result import AsyncResult
from hatchet_sdk.clients.rest.exceptions import ApiException, NotFoundException
from hatchet_sdk.clients.rest.models import V1TaskStatus

from reflector.db.recordings import recordings_controller
from reflector.db.transcripts import Transcript, transcripts_controller
from reflector.hatchet.client import HatchetClientManager
from reflector.logger import logger
from reflector.pipelines.main_file_pipeline import task_pipeline_file_process
from reflector.utils.string import NonEmptyString


@dataclass
class ProcessError:
    detail: NonEmptyString


@dataclass
class FileProcessingConfig:
    transcript_id: NonEmptyString
    mode: Literal["file"] = "file"


@dataclass
class MultitrackProcessingConfig:
    transcript_id: NonEmptyString
    bucket_name: NonEmptyString
    track_keys: list[str]
    recording_id: NonEmptyString | None = None
    room_id: NonEmptyString | None = None
    mode: Literal["multitrack"] = "multitrack"


ProcessingConfig = Union[FileProcessingConfig, MultitrackProcessingConfig]
PrepareResult = Union[ProcessingConfig, ProcessError]


@dataclass
class ValidationOk:
    # a transcript currently doesn't always have a recording_id
    recording_id: NonEmptyString | None
    transcript_id: NonEmptyString
    room_id: NonEmptyString | None = None


@dataclass
class ValidationLocked:
    detail: NonEmptyString


@dataclass
class ValidationNotReady:
    detail: NonEmptyString


@dataclass
class ValidationAlreadyScheduled:
    detail: NonEmptyString


ValidationError = Union[
    ValidationNotReady, ValidationLocked, ValidationAlreadyScheduled
]
ValidationResult = Union[ValidationOk, ValidationError]


@dataclass
class DispatchOk:
    status: Literal["ok"] = "ok"


@dataclass
class DispatchAlreadyRunning:
    status: Literal["already_running"] = "already_running"


DispatchResult = Union[
    DispatchOk, DispatchAlreadyRunning, ProcessError, ValidationError
]


async def validate_transcript_for_processing(
    transcript: Transcript,
) -> ValidationResult:
    if transcript.locked:
        return ValidationLocked(detail="Recording is locked")

    if (
        transcript.status == "idle"
        and not transcript.workflow_run_id
        and not transcript.recording_id
    ):
        return ValidationNotReady(detail="Recording is not ready for processing")

    # Check Celery tasks
    if task_is_scheduled_or_active(
        "reflector.pipelines.main_file_pipeline.task_pipeline_file_process",
        transcript_id=transcript.id,
    ) or task_is_scheduled_or_active(
        "reflector.pipelines.main_multitrack_pipeline.task_pipeline_multitrack_process",
        transcript_id=transcript.id,
    ):
        return ValidationAlreadyScheduled(detail="already running")

    # Check Hatchet workflow status if workflow_run_id exists
    if transcript.workflow_run_id:
        try:
            status = await HatchetClientManager.get_workflow_run_status(
                transcript.workflow_run_id
            )
            if status in (V1TaskStatus.RUNNING, V1TaskStatus.QUEUED):
                return ValidationAlreadyScheduled(
                    detail="Hatchet workflow already running"
                )
        except ApiException:
            # Workflow might be gone (404) or API issue - allow processing
            pass

    return ValidationOk(
        recording_id=transcript.recording_id,
        transcript_id=transcript.id,
        room_id=transcript.room_id,
    )


async def prepare_transcript_processing(validation: ValidationOk) -> PrepareResult:
    """
    Determine processing mode from transcript/recording data.
    """
    bucket_name: str | None = None
    track_keys: list[str] | None = None
    recording_id: str | None = validation.recording_id

    if validation.recording_id:
        recording = await recordings_controller.get_by_id(validation.recording_id)
        if recording:
            bucket_name = recording.bucket_name
            track_keys = recording.track_keys

            if track_keys is not None and len(track_keys) == 0:
                return ProcessError(
                    detail="No track keys found, must be either > 0 or None",
                )
            if track_keys is not None and not bucket_name:
                return ProcessError(
                    detail="Bucket name must be specified",
                )

    if track_keys:
        return MultitrackProcessingConfig(
            bucket_name=bucket_name,  # type: ignore (validated above)
            track_keys=track_keys,
            transcript_id=validation.transcript_id,
            recording_id=recording_id,
            room_id=validation.room_id,
        )

    return FileProcessingConfig(
        transcript_id=validation.transcript_id,
    )


async def dispatch_transcript_processing(
    config: ProcessingConfig, force: bool = False
) -> AsyncResult | None:
    """Dispatch transcript processing to appropriate backend (Hatchet or Celery).

    Returns AsyncResult for Celery tasks, None for Hatchet workflows.
    """
    if isinstance(config, MultitrackProcessingConfig):
        # Multitrack processing always uses Hatchet (no Celery fallback)
        # First check if we can replay (outside transaction since it's read-only)
        transcript = await transcripts_controller.get_by_id(config.transcript_id)
        if transcript and transcript.workflow_run_id and not force:
            can_replay = await HatchetClientManager.can_replay(
                transcript.workflow_run_id
            )
            if can_replay:
                await HatchetClientManager.replay_workflow(transcript.workflow_run_id)
                logger.info(
                    "Replaying Hatchet workflow",
                    workflow_id=transcript.workflow_run_id,
                )
                return None
            else:
                # Workflow can't replay (CANCELLED, COMPLETED, or 404 deleted)
                # Log and proceed to start new workflow
                try:
                    status = await HatchetClientManager.get_workflow_run_status(
                        transcript.workflow_run_id
                    )
                    logger.info(
                        "Old workflow not replayable, starting new",
                        old_workflow_id=transcript.workflow_run_id,
                        old_status=status.value,
                    )
                except NotFoundException:
                    # Workflow deleted from Hatchet but ID still in DB
                    logger.info(
                        "Old workflow not found in Hatchet, starting new",
                        old_workflow_id=transcript.workflow_run_id,
                    )

        # Force: cancel old workflow if exists
        if force and transcript and transcript.workflow_run_id:
            try:
                await HatchetClientManager.cancel_workflow(transcript.workflow_run_id)
                logger.info(
                    "Cancelled old workflow (--force)",
                    workflow_id=transcript.workflow_run_id,
                )
            except NotFoundException:
                logger.info(
                    "Old workflow already deleted (--force)",
                    workflow_id=transcript.workflow_run_id,
                )
            await transcripts_controller.update(transcript, {"workflow_run_id": None})

        # Re-fetch and check for concurrent dispatch (optimistic approach).
        # No database lock - worst case is duplicate dispatch, but Hatchet
        # workflows are idempotent so this is acceptable.
        transcript = await transcripts_controller.get_by_id(config.transcript_id)
        if transcript and transcript.workflow_run_id:
            # Another process started a workflow between validation and now
            try:
                status = await HatchetClientManager.get_workflow_run_status(
                    transcript.workflow_run_id
                )
                if status in (V1TaskStatus.RUNNING, V1TaskStatus.QUEUED):
                    logger.info(
                        "Concurrent workflow detected, skipping dispatch",
                        workflow_id=transcript.workflow_run_id,
                    )
                    return None
            except ApiException:
                # Workflow might be gone (404) or API issue - proceed with new workflow
                pass

        workflow_id = await HatchetClientManager.start_workflow(
            workflow_name="DiarizationPipeline",
            input_data={
                "recording_id": config.recording_id,
                "tracks": [{"s3_key": k} for k in config.track_keys],
                "bucket_name": config.bucket_name,
                "transcript_id": config.transcript_id,
                "room_id": config.room_id,
            },
            additional_metadata={
                "transcript_id": config.transcript_id,
                "recording_id": config.recording_id,
                "daily_recording_id": config.recording_id,
            },
        )

        if transcript:
            await transcripts_controller.update(
                transcript, {"workflow_run_id": workflow_id}
            )

        logger.info("Hatchet workflow dispatched", workflow_id=workflow_id)
        return None

    elif isinstance(config, FileProcessingConfig):
        return task_pipeline_file_process.delay(transcript_id=config.transcript_id)
    else:
        assert_never(config)


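def task_is_scheduled_or_active(task_name: str, **kwargs) -> bool:
    inspect = celery.current_app.control.inspect()

    # inspect.scheduled()/active() return None when no worker replies.
    scheduled = inspect.scheduled() or {}
    active = inspect.active() or {}

    # Scan both maps separately: merging them with `|` would drop a worker's
    # scheduled tasks whenever the same worker also reports active tasks.
    for tasks_by_worker in (scheduled, active):
        for tasks in tasks_by_worker.values():
            for task in tasks:
                # Scheduled entries nest the task info under "request".
                info = task.get("request", task)
                if info.get("name") == task_name and info.get("kwargs") == kwargs:
                    return True

    return False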
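The module docstring says the same result values are turned into HTTPException in endpoints and into plain Exception in Celery tasks. A minimal sketch of that idea follows, under stated assumptions: the dataclasses are trimmed copies of the ones in this file, `HttpError` is a hypothetical stand-in for a web framework's HTTP exception, and the 400 status code is an arbitrary choice for illustration.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class ValidationOk:
    transcript_id: str


@dataclass
class ValidationLocked:
    detail: str


@dataclass
class ValidationNotReady:
    detail: str


ValidationResult = Union[ValidationOk, ValidationLocked, ValidationNotReady]


class HttpError(Exception):
    """Hypothetical stand-in for a framework HTTP exception."""

    def __init__(self, status_code: int, detail: str):
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def unwrap_for_http(result: ValidationResult) -> ValidationOk:
    # Endpoint context: surface the failure as an HTTP error.
    if isinstance(result, ValidationOk):
        return result
    raise HttpError(400, result.detail)  # status code is an assumption


def unwrap_for_task(result: ValidationResult) -> ValidationOk:
    # Background-task context: surface the failure as a plain exception.
    if isinstance(result, ValidationOk):
        return result
    raise Exception(result.detail)
```

Because the validation function itself only returns values, each caller decides how a failure should surface, and the shared logic never needs to import a web framework.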