mirror of
https://github.com/Monadical-SAS/reflector.git
synced 2025-12-22 13:19:05 +00:00
3.2 KiB
3.2 KiB
Durable Workflow Migration Tasks
This document defines atomic, isolated work items for migrating the Daily.co multitrack diarization pipeline from Celery to durable workflow orchestration using Hatchet.
Provider Selection
# .env
DURABLE_WORKFLOW_PROVIDER=none # Celery only (default)
DURABLE_WORKFLOW_PROVIDER=hatchet # Use Hatchet
DURABLE_WORKFLOW_SHADOW_MODE=true # Run both Hatchet + Celery (for comparison)
Task Index
| ID | Title | Status |
|---|---|---|
| INFRA-001 | Add container to docker-compose | Done |
| INFRA-002 | Create Python client wrapper | Done |
| INFRA-003 | Add environment configuration | Done |
| TASK-001 | Create workflow definition | Done |
| TASK-002 | get_recording task | Done |
| TASK-003 | get_participants task | Done |
| TASK-004 | pad_track task | Done |
| TASK-005 | mixdown_tracks task | Done |
| TASK-006 | generate_waveform task | Done |
| TASK-007 | transcribe_track task | Done |
| TASK-008 | merge_transcripts task | Done (in process_tracks) |
| TASK-009 | detect_topics task | Done |
| TASK-010 | generate_title task | Done |
| TASK-011 | generate_summary task | Done |
| TASK-012 | finalize task | Done |
| TASK-013 | cleanup_consent task | Done |
| TASK-014 | post_zulip task | Done |
| TASK-015 | send_webhook task | Done |
| EVENT-001 | Progress WebSocket events | Done |
| INTEG-001 | Pipeline trigger integration | Done |
| SHADOW-001 | Shadow mode toggle | Done |
| TEST-001 | Integration tests | Pending |
| TEST-002 | E2E workflow test | Pending |
| CUTOVER-001 | Production cutover | Pending |
| CLEANUP-001 | Remove Celery code | Pending |
File Structure
server/reflector/hatchet/
├── client.py # SDK wrapper
├── progress.py # WebSocket progress emission
├── run_workers.py # Worker startup
└── workflows/
├── diarization_pipeline.py # Main workflow with all tasks
└── track_processing.py # Child workflow (pad + transcribe)
Remaining Work
TEST-001: Integration Tests
- Test each task with mocked external services
- Test error handling and retries
TEST-002: E2E Workflow Test
- Complete workflow run with real Daily.co recording
- Verify output matches Celery pipeline
- Performance comparison
CUTOVER-001: Production Cutover
- Deploy with
DURABLE_WORKFLOW_PROVIDER=hatchet - Monitor for failures
- Compare results with shadow mode if needed
CLEANUP-001: Remove Celery Code
- Remove
main_multitrack_pipeline.py - Remove Celery task triggers
- Update documentation
Known Issues
Hatchet
- See
HATCHET_LLM_OBSERVATIONS.mdfor debugging notes - SDK v1.21+ API changes (breaking)
- JWT token Docker networking issues
- Worker appears hung without debug mode
- Workflow replay is version-locked (use --force to run latest code)
Quick Start
Hatchet
# Start infrastructure
docker compose up -d hatchet hatchet-worker
# Workers auto-register on startup
Trigger Workflow
# Set provider in .env
DURABLE_WORKFLOW_PROVIDER=hatchet
# Process a Daily.co recording via webhook or API
# The pipeline trigger automatically uses the configured provider