200ms webm daily doc

Igor Loskutov
2026-01-09 10:54:12 -05:00
parent 1f2aeff8cc
commit 3be7fc0b9a
3 changed files with 13 additions and 9 deletions

@@ -176,15 +176,17 @@ Parse timestamps:
 Track 0:
 Filename offset: 438ms
 PyAV metadata: 229ms
-Difference: 209ms
+Difference: ~200ms
 Track 1:
 Filename offset: 8339ms
 PyAV metadata: 8130ms
-Difference: 209ms
+Difference: ~200ms
``` ```
-**Consistent 209ms delta** suggests network/encoding delay between file upload initiation (filename) and actual audio stream start (metadata).
+**Consistent ~200ms delta** suggests network/encoding delay between file upload initiation (filename) and actual audio stream start (metadata).
+**Note:** The ~200ms difference observed in this test recording is not crucial for timing accuracy. Either method (filename timestamps or PyAV metadata) works well for multi-track alignment. Filename timestamps are preferable as they are better officially documented by Daily.co.
 **Current implementation uses PyAV metadata** because:
 - More accurate (represents when audio actually started)

@@ -91,12 +91,12 @@ class PipelineMainMultitrack(PipelineMainBase):
 - Track 0: (1760988935922 - 1760988935484) / 1000 = 0.438s
 - Track 1: (1760988943823 - 1760988935484) / 1000 = 8.339s
-TIME DIFFERENCE: PyAV metadata vs filename timestamps differ by ~209ms:
-- Track 0: filename=438ms, metadata=229ms (diff: 209ms)
-- Track 1: filename=8339ms, metadata=8130ms (diff: 209ms)
-Consistent delta suggests network/encoding delay. PyAV metadata is ground truth
-(represents when audio stream actually started vs when file upload initiated).
+TIME DIFFERENCE: PyAV metadata vs filename timestamps differ by ~200ms:
+- Track 0: filename=438ms, metadata=229ms (diff: ~200ms)
+- Track 1: filename=8339ms, metadata=8130ms (diff: ~200ms)
+Note: The ~200ms difference isn't crucial - either method works for alignment.
+Filename timestamps are preferable due to being better officially documented.
 Example with 2 participants:
 Track A: start_time=0.2s → Joined 200ms after recording began

@@ -25,7 +25,9 @@ def extract_stream_start_time_from_container(
 """Extract meeting-relative start time from WebM stream metadata.
 Uses PyAV to read stream.start_time from WebM container.
-More accurate than filename timestamps by ~209ms due to network/encoding delays.
+Note: Differs from filename timestamps by ~200ms in test recordings, but this difference
+is not crucial - either method works. Filename timestamps are preferable due to being
+better officially documented by Daily.co.
 Args:
 container: PyAV container opened from audio file/URL
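
The body of `extract_stream_start_time_from_container` is not shown in this diff. As a rough sketch of the conversion it presumably relies on: PyAV reports `stream.start_time` as an integer count of `stream.time_base` units (a `Fraction`), so seconds are the product of the two. The helper name below is hypothetical, and returning 0.0 for a missing `start_time` is an assumption rather than the commit's behavior.

```python
from fractions import Fraction


def start_time_to_seconds(start_time, time_base):
    """Convert a stream start_time (in time_base units) to seconds.

    Hypothetical helper: `start_time` and `time_base` stand in for the
    values PyAV exposes as stream.start_time and stream.time_base.
    """
    if start_time is None:
        # Assumption: treat a missing start_time as "started at 0".
        return 0.0
    return float(start_time * time_base)


# WebM streams commonly use a 1/1000 time base, so start_time is in ms.
print(start_time_to_seconds(229, Fraction(1, 1000)))
print(start_time_to_seconds(8130, Fraction(1, 1000)))
```

With a real file, the two arguments would come from the first audio stream of the opened container, for example `container.streams.audio[0]`.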