Commit Graph

507 Commits

Author SHA1 Message Date
Igor Loskutov
7dfb37154d Fix critical data flow and concurrency bugs
- Add empty padded_tracks guard in process_transcriptions
- Fix created_padded_files: use list instead of set to preserve order for zip cleanup
- Document size=0 contract in PadTrackResult (size=0 means original key, not padded)
- Remove redundant ctx.log in padding_workflow
2026-01-23 16:47:11 -05:00
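The list-versus-set point above can be sketched as follows. This is a minimal illustration, not the repository's code: `track_keys` and `padded_name` are hypothetical names, and the only claim is that pairing by position with `zip` needs an ordered container.

```python
# Hypothetical sketch: why created_padded_files should be a list when it is
# later zipped with the tracks it was built from.
track_keys = ["track-0.webm", "track-1.webm", "track-2.webm"]


def padded_name(key: str) -> str:
    # Assumed naming scheme, for illustration only.
    return key.replace(".webm", ".padded.webm")


# A list keeps insertion order, so position i always refers to track i.
created_padded_files: list[str] = []
for key in track_keys:
    created_padded_files.append(padded_name(key))

# zip pairs elements by position. A set does not promise any iteration order,
# so the same zip over a set could pair a track with another track's file.
cleanup_pairs = list(zip(track_keys, created_padded_files))
```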
Igor Loskutov
67679e90b2 Revert waveform dependency - allow background completion
Waveform generation can complete after transcript marked "ended".
User can see transcript immediately while waveform finishes in background.
2026-01-23 16:42:36 -05:00
Igor Loskutov
aa4c368479 Fix critical bugs from refactoring
- Fix empty transcript broadcast (was sending text="", should send merged_transcript.text)
- Restore generate_waveform to finalize parents (finalize must wait for waveform)
2026-01-23 16:40:57 -05:00
Igor Loskutov
deb5ed6010 Fix: Preserve track_index explicitly in PaddedTrackInfo
- Add track_index to PaddedTrackInfo model
- Preserve track_index from PadTrackResult when building padded_tracks list
- Use explicit track_index instead of enumerate in process_transcriptions
- Removes fragile ordering assumption
2026-01-23 16:36:16 -05:00
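A minimal sketch of the ordering assumption this commit removes. The dataclass below is a hypothetical stand-in for the real `PaddedTrackInfo` model, which likely carries more fields; it only shows how `enumerate` silently relabels tracks when the list is reordered or a track is missing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PaddedTrackInfo:
    # Hypothetical stand-in; only the fields needed for the illustration.
    key: str
    track_index: int


def indices_via_enumerate(tracks: list[PaddedTrackInfo]) -> list[int]:
    # Fragile: the index is whatever position the item happens to occupy.
    return [position for position, _ in enumerate(tracks)]


def indices_explicit(tracks: list[PaddedTrackInfo]) -> list[int]:
    # Robust: the index travels with the item.
    return [track.track_index for track in tracks]


# Track 1 is missing and the remaining results arrived out of order.
padded_tracks = [
    PaddedTrackInfo(key="c.webm", track_index=2),
    PaddedTrackInfo(key="a.webm", track_index=0),
]

positional = indices_via_enumerate(padded_tracks)
explicit = indices_explicit(padded_tracks)
```

Here `positional` relabels the tracks as 0 and 1, while `explicit` keeps the original 2 and 0.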
Igor Loskutov
30b28eed3b Merge main into feature/split-padding-transcription 2026-01-23 16:20:39 -05:00
Igor Loskutov
1b33fba3ba Fix: Move padding_workflow to LLM worker for parallel execution
Critical bug fix: padding_workflow was registered on CPU worker (slots=1),
causing all padding tasks to run serially instead of in parallel.

Changes:
- Moved padding_workflow from run_workers_cpu.py to run_workers_llm.py
- LLM worker has slots=10, allowing up to 10 parallel padding operations
- Padding is I/O-bound (S3 download/upload), not CPU-intensive
- CPU worker now handles only mixdown_tracks (compute-heavy, serialized)

Impact:
- Before: 4 tracks × 5s padding = 20s serial execution
- After: 4 tracks × 5s padding = ~5s parallel execution (4 concurrent)
- Restores intended performance benefit of the refactoring
2026-01-23 16:05:43 -05:00
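The slots effect described above can be sketched with plain asyncio. This is not Hatchet's API: an `asyncio.Semaphore` stands in for a worker's slot count, and `run_padding` is a hypothetical helper that reports how many padding tasks ran at once.

```python
import asyncio


async def run_padding(track_count: int, slots: int, seconds_per_track: float = 0.01) -> int:
    """Run track_count fake padding tasks and return the peak concurrency."""
    semaphore = asyncio.Semaphore(slots)  # stands in for the worker's slots
    running = 0
    peak = 0

    async def pad_one() -> None:
        nonlocal running, peak
        async with semaphore:
            running += 1
            peak = max(peak, running)
            # Stands in for the I/O-bound S3 download/upload.
            await asyncio.sleep(seconds_per_track)
            running -= 1

    await asyncio.gather(*(pad_one() for _ in range(track_count)))
    return peak


# slots=1 mirrors the CPU worker: tasks run one at a time.
cpu_worker_peak = asyncio.run(run_padding(track_count=4, slots=1))
# slots=10 mirrors the LLM worker: all four tracks overlap.
llm_worker_peak = asyncio.run(run_padding(track_count=4, slots=10))
```

With one slot the four tasks queue behind each other, so total time is the sum of the task times. With ten slots the total approaches the longest single task, which matches the 20s versus ~5s figures in the commit message.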
fc3ef6c893 feat: mixdown optional (#834)
* optional mixdown

* optional mixdown

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2026-01-23 15:51:18 -05:00
6c175a11d8 feat: brady bunch (#816)
* brady bunch PRD/tasks

* clean dead daily.co code

* brady bunch prototype (no-mistakes)

* brady bunch prototype (no-mistakes) review

* self-review

* daily poll time match (no-mistakes)

* daily poll self-review (no-mistakes)

* daily poll self-review (no-mistakes)

* daily co doc

* cleanup

* cleanup

* self-review (no-mistakes)

* self-review (no-mistakes)

* self-review

* self-review

* ui typefix

* dupe calls error handling proper

* daily reflector data model doc

* logging style fix

* migration merge

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2026-01-23 12:33:06 -05:00
6e786b7631 hatchet processing resilience several fixes (#831)
Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2026-01-22 19:03:33 -05:00
c723752b7e feat: set hatchet as default for multitracks (#822)
* set hatchet as default for multitracks

* fix: pipeline routing tests for hatchet-default branch

- Create room with use_celery=True to force Celery backend in tests
- Link transcript to room to enable multitrack pipeline routing
- Fixes test failures caused by missing HATCHET_CLIENT_TOKEN in test env

* Update server/reflector/services/transcript_process.py

Co-authored-by: pr-agent-monadical[bot] <198624643+pr-agent-monadical[bot]@users.noreply.github.com>

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
Co-authored-by: pr-agent-monadical[bot] <198624643+pr-agent-monadical[bot]@users.noreply.github.com>
2026-01-21 17:05:03 -05:00
Igor Loskutov
3ce279daa4 Split padding and transcription into separate workflow steps
- Split process_tracks into process_paddings + process_transcriptions
- Create PaddingWorkflow and TranscriptionWorkflow as separate child workflows
- Update dependency: mixdown_tracks now depends on process_paddings (not process_transcriptions)
- Performance: mixdown starts ~295s earlier (after padding completes, not after transcription)

Changes:
- New: padding_workflow.py, transcription_workflow.py
- Modified: daily_multitrack_pipeline.py (new tasks, updated dependencies)
- Modified: models.py (new ProcessPaddingsResult, ProcessTranscriptionsResult, deleted dead ProcessTracksResult)
- Modified: constants.py (new task names)
- Modified: run_workers_cpu.py, run_workers_llm.py (workflow registration)
- Deleted: track_processing.py

Code quality fixes:
- Removed redundant comments and verbose docstrings
- Added language validation in process_transcriptions
- Improved error logging with full context (transcript_id, track_index)
- Fixed log accuracy bugs (use correct counts)
- Updated worker pool documentation
2026-01-21 16:53:06 -05:00
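A rough sketch of the dependency change, using asyncio tasks rather than the real workflow definitions. The task names follow the commit message, the sleep durations are arbitrary, and it assumes that process_transcriptions also waits on process_paddings.

```python
import asyncio


async def pipeline() -> list[str]:
    """Record the order of events when mixdown depends only on padding."""
    events: list[str] = []

    async def process_paddings() -> None:
        await asyncio.sleep(0.01)  # short: padding is quick
        events.append("paddings done")

    async def process_transcriptions(paddings: asyncio.Task) -> None:
        await paddings  # assumed dependency on padded tracks
        await asyncio.sleep(0.05)  # long: transcription is the slow step
        events.append("transcriptions done")

    async def mixdown_tracks(paddings: asyncio.Task) -> None:
        await paddings  # new dependency: padding only
        events.append("mixdown started")

    paddings = asyncio.create_task(process_paddings())
    await asyncio.gather(
        process_transcriptions(paddings),
        mixdown_tracks(paddings),
    )
    return events


events = asyncio.run(pipeline())
```

Because mixdown no longer waits for transcription, "mixdown started" is recorded before "transcriptions done". The gap between the two corresponds to the time saved.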
23d2bc283d fix: ics non-sync bugfix (#823)
* ics non-sync bugfix

* fix tests

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2026-01-21 15:10:19 -05:00
Igor Loskutov
01650be787 fix tests 2026-01-21 15:04:05 -05:00
f00c16a41c Merge branch 'main' into fix/ics-window-bug 2026-01-21 14:38:36 -05:00
c8743fdf1c fix webhook tests (#826)
Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2026-01-21 14:31:20 -05:00
859df5513e Merge branch 'main' into fix/ics-window-bug 2026-01-21 08:47:34 -05:00
Igor Loskutov
2af9918979 ics non-sync bugfix 2026-01-20 16:56:06 -05:00
8a293882ad timeout to untighten ws python loop (#821)
Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2026-01-20 16:29:09 -05:00
3b6540eae5 feat: worker affinity (#819)
* worker affinity

* worker affinity

* worker affinity

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2026-01-20 12:27:16 -05:00
Igor Loskutov
7ca9cad937 docs 2026-01-09 15:37:15 -05:00
Igor Loskutov
3be7fc0b9a 200ms webm daily doc 2026-01-09 10:54:12 -05:00
407c15299f docs: docs website + installation (#778)
* feat: WIP doc (vibe started and iterated)

* install from scratch docs

* caddyfile.example

* gitignore

* authentik script

* authentik script

* authentik script

* llm doc

* authentik ongoing

* more daily setup logs

* doc website

* gpu self hosted setup guide (no-mistakes)

* doc review round

* doc review round

* doc review round

* update doc site sidebars

* feat(docs): add mermaid diagram support

* docs polishing

* live pipeline doc

* move pipeline dev docs to dev docs location

* doc pr review iteration

* dockerfile healthcheck

* docs/pr-comments

* remove jwt comment

* llm suggestion

* pr comments

* pr comments

* document auto migrations

* cleanup docs

---------

Co-authored-by: Mathieu Virbel <mat@meltingrocks.com>
Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2026-01-06 17:25:02 -05:00
e644d6497b correct workflow name for hatchet (#815)
Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-12-29 20:36:36 -05:00
5f7b1ff1a6 fix: webhook parity, pipeline rename, waveform constant fix (#806)
* pipeline fixes: whereby Hatchet preparation

* send_webhook fixes

* cleanup

* self-review

* comment

* webhook util functions: less dependencies

* remove comment

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-12-26 18:00:32 -05:00
2d0df48767 feat: devex/hatchet log progress track (#813)
* progress track for some hatchet tasks

* remove inline imports / type fixes

* progress callback for mixdown - move to a function

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-12-26 14:10:21 -05:00
5baa6dd92e pipeline type fixes (#812)
Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-12-26 11:28:43 -05:00
bab1e2d537 dynamic mixdown hatchet (#811)
Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-12-23 19:48:16 -05:00
e886153ae1 fix hatchet parallel syntax (#810)
Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-12-23 18:45:06 -05:00
7b352f465e dont always enable hatchet (#809)
* dont always enable hatchet

* fix hatchet worker params

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-12-23 18:15:33 -05:00
3cf9757ac2 diarization flow - parallelize better (#808)
Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-12-23 17:35:43 -05:00
d9d3938192 better hatchet concurrency limits (#807)
Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-12-23 17:26:23 -05:00
594bcc09e0 feat: parallelize hatchet (#804)
* parallelize hatchet (no-mistakes)

* dry (no-mistakes) (minimal)

* comments

* self-review

* self-review

* self-review

* self-review

* pr comments

* pr comments

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-12-23 11:03:36 -05:00
1dac999b56 feat: durable (#794)
* durable (no-mistakes)

* hatchet no-mistake

* hatchet no-mistake

* hatchet no-mistake, better logging

* remove conductor and add hatchet tests (no-mistakes)

* self-review (no-mistakes)

* hatchet logs

* remove shadow mode for hatchet

* and add hatchet processor setting to room

* .

* cleanup

* hatchet init db

* self-review (no-mistakes)

* self-review (no-mistakes)

* hatchet: restore zulip report

* self-review round

* self-review round

* self-review round

* dry hatchet with celery

* dry hatchet with celery - 2

* self-review round

* more NES instead of str

* self-review wip

* self-review round

* self-review round

* self-review round

* can_replay cancelled

* add forgotten file

* pr autoreviewer fixes

* better log webhook events

* durable_started return

* migration sync

* latest changes feature parity

* migration merge

* pr review

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-12-22 12:09:20 -05:00
f580b996ee feat: increase daily recording max duration (#801)
* increase daily recording max duration

* recording end time: 3h min

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-12-22 09:02:14 -05:00
225783496f feat: consent disable feature (#799)
* consent disable feature (no-mistakes)

* sync migration

* consent disable refactor

* daily backend code refactor

* consent skip feature

* consent skip feature

* no forced whereby recording indicator

* active meetings type precision

* cleanup

* cleanup

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-12-22 08:47:07 -05:00
964cd78bb6 feat: identify action items (#790)
* Identify action items

* Add action items to mock summary

* Add action items validator

* Remove final prefix from action items

* Make on action items callback required

* Don't mutate action items response

* Assign action items to none on error

* Use timeout constant

* Exclude action items from transcript list
2025-12-18 21:13:47 +01:00
5f458aa4a7 fix: automatically reprocess daily recordings (#797)
* Automatically reprocess recordings

* Restore the comments

* Remove redundant check

* Fix indent

* Add comment about cyclic import
2025-12-18 21:10:04 +01:00
5f7dfadabd fix: retry on workflow timeout (#798) 2025-12-18 20:49:06 +01:00
c62e3c0753 incorporate daily api undocumented feature (#796)
Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-12-17 09:51:55 -05:00
0eba147018 fix: populate room_name in transcript GET endpoint (#783)
Fixes monadical/internalai#14
2025-12-11 12:37:59 +01:00
61f0e29d4c feat: llm retries (#739)
* llm retries no-mistakes

* self-review (no-mistakes)

* self-review (no-mistakes)

* bigger retry intervals by default

* tests and dry

* restore to main state

* parse retries

* json retries (no-mistakes)

* json retries (no-mistakes)

* json retries (no-mistakes)

* json retries (no-mistakes) self-review

* additional network retry test

* more lindt

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-12-05 12:08:21 -05:00
ec17ed7b58 fix: celery inspect bug sidestep in restart script (#766)
* celery bug sidestep

* Update server/reflector/services/transcript_process.py

Co-authored-by: pr-agent-monadical[bot] <198624643+pr-agent-monadical[bot]@users.noreply.github.com>

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
Co-authored-by: pr-agent-monadical[bot] <198624643+pr-agent-monadical[bot]@users.noreply.github.com>
2025-12-04 09:22:51 -05:00
d3a5cd12d2 fix: return participant emails from transcript endpoint (#769)
* Return participant emails from transcript endpoint

* Fix broken test
2025-12-03 16:47:56 +01:00
bd5df1ce2e fix: Multitrack mixdown optimisation 2 (#764)
* Revert "fix: Skip mixdown for multitrack (#760)"

This reverts commit b51b7aa917.

* multitrack mixdown optimisation

* return the "good" ui part of "skip mixdown"

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-12-02 17:10:06 -05:00
28f87c09dc fix: align daily room settings (#759)
* Switch platform ui

* Update room settings based on platform

* Add local and none recording options to daily

* Don't create tokens for unauthentikated users

* Enable knocking for private rooms

* Create new meeting on room settings change

* Always use 2-200 option for daily

* Show recording start trigger for daily

* Fix broken test
2025-12-02 09:06:36 +01:00
b51b7aa917 fix: Skip mixdown for multitrack (#760)
* multitrack mixdown optimisation

* skip mixdown for multitrack

* skip mixdown for multitrack

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-12-01 23:35:12 -05:00
7f0b728991 fix: participants update from daily (#749)
* Fix participants update from daily

* Use track keys from params
2025-11-27 16:53:26 +01:00
d63040e2fd feat: Multitrack segmentation (#747)
* segmentation multitrack (no-mistakes)

* segmentation multitrack (no-mistakes)

* self review

* self review

* recording poll daily doc

* filter cam_audio tracks to remove screensharing from daily processing

* pr review

---------

Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-11-26 16:21:32 -05:00
f6ca07505f feat: add transcript format parameter to GET endpoint (#709)
* feat: add transcript format parameter to GET endpoint

Add transcript_format query parameter to /v1/transcripts/{id} endpoint
with support for multiple output formats using discriminated unions.

Formats supported:
- text: Plain speaker dialogue (default)
- text-timestamped: Dialogue with [MM:SS] timestamps
- webvtt-named: WebVTT subtitles with participant names
- json: Structured segments with full metadata

Response models use Pydantic discriminated unions with transcript_format
as discriminator field. POST/PATCH endpoints return GetTranscriptWithParticipants
for minimal responses. GET endpoint returns format-specific models.

* Copy transcript format

* Regenerate types

* Fix transcript formats

* Don't throw inside try

* Remove any type

* Toast share copy errors

* transcript_format exhaustiveness and python idiomatic assert_never

* format_timestamp_mmss clear type definition

* Rename seconds_to_timestamp

* Test transcript format with overlapping speakers

* exact match for vtt multispeaker test

---------

Co-authored-by: Sergey Mankovsky <sergey@monadical.com>
Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-11-26 18:51:14 +01:00
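The four formats listed in this commit can be sketched with the standard library only. This is an illustration, not the endpoint's code: `Segment`, `format_mmss`, `format_vtt_time` and `render` are hypothetical names, the exact output layout is assumed, and the real implementation uses Pydantic discriminated unions rather than a plain function.

```python
import json
from dataclasses import asdict, dataclass
from typing import Literal

TranscriptFormat = Literal["text", "text-timestamped", "webvtt-named", "json"]


@dataclass(frozen=True)
class Segment:
    # Hypothetical segment shape, for illustration only.
    speaker: str
    start: float  # seconds
    end: float  # seconds
    text: str


def format_mmss(seconds: float) -> str:
    """Format seconds as a [MM:SS] timestamp."""
    whole = int(seconds)
    return f"[{whole // 60:02d}:{whole % 60:02d}]"


def format_vtt_time(seconds: float) -> str:
    """Format seconds as a WebVTT hh:mm:ss.ttt timestamp."""
    millis = round(seconds * 1000)
    hours, rest = divmod(millis, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def render(segments: list[Segment], transcript_format: TranscriptFormat = "text") -> str:
    if transcript_format == "text":
        return "\n".join(f"{s.speaker}: {s.text}" for s in segments)
    if transcript_format == "text-timestamped":
        return "\n".join(
            f"{format_mmss(s.start)} {s.speaker}: {s.text}" for s in segments
        )
    if transcript_format == "webvtt-named":
        cues = [
            f"{format_vtt_time(s.start)} --> {format_vtt_time(s.end)}\n"
            f"<v {s.speaker}>{s.text}"
            for s in segments
        ]
        return "WEBVTT\n\n" + "\n\n".join(cues)
    if transcript_format == "json":
        return json.dumps([asdict(s) for s in segments])
    # Fail loudly on an unknown format instead of silently falling through.
    raise ValueError(f"unsupported transcript_format: {transcript_format!r}")


segments = [
    Segment(speaker="Alice", start=5.0, end=7.5, text="Hello"),
    Segment(speaker="Bob", start=65.2, end=66.0, text="Hi"),
]
plain = render(segments)
timestamped = render(segments, "text-timestamped")
```

The final `raise` plays the role that `assert_never` plays in the commit: every format is handled explicitly, and an unexpected value is an error rather than a default.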
0b2c82227d is_owner pass for dailyco (#745)
Co-authored-by: Igor Loskutov <igor.loskutoff@gmail.com>
2025-11-25 22:41:54 -05:00