stub processor (vibe)

Igor Loskutov
2025-10-10 18:05:31 -04:00
parent 4c523c8eec
commit f945f84be9
4 changed files with 515 additions and 34 deletions


@@ -1,5 +1,23 @@
# Daily.co Integration Test Plan
## ⚠️ IMPORTANT: Stub Implementation
**This test validates Daily.co webhook integration with MOCK transcription data.**
The actual audio/video files are recorded to S3, but transcription/diarization is NOT performed. Instead:
- A **stub processor** generates a fake transcript with predetermined text ("The Great Fish Eating Argument")
- All database entities (recording, transcript, topics, participants, words) are created with **fake "fish" conversation data**
- This allows testing the complete webhook → database flow WITHOUT expensive GPU processing
**Expected transcript content:**
- Title: "The Great Fish Eating Argument"
- Participants: "Fish Eater" (speaker 0), "Annoying Person" (speaker 1)
- Transcription: Nonsensical argument about eating fish (see `reflector/worker/daily_stub_data.py`)
**Next implementation step:** Replace stub with real transcription pipeline (download tracks from S3, merge audio, run Whisper/diarization).
---
## Prerequisites
**1. Environment Variables** (check in `.env.development.local`):
@@ -126,38 +144,79 @@ curl -s -X GET "https://api.daily.co/v1/rooms/$ROOM_NAME" \
## Test 4: Browser UI Test (Playwright MCP)
**Using Claude Code MCP tools:**
**Load room:**
```javascript
await page.goto('http://localhost:3000/test2');
await new Promise(f => setTimeout(f, 12000)); // Wait for load
```
Use: mcp__playwright__browser_navigate
Input: {"url": "http://localhost:3000/test2"}
Then wait 12 seconds for iframe to load
```
**Verify Daily.co iframe loaded:**
```javascript
const iframes = document.querySelectorAll('iframe');
// Expected: 1 iframe with src containing "monadical.daily.co"
```
Use: mcp__playwright__browser_snapshot
Expected in snapshot:
- iframe element with src containing "monadical.daily.co"
- Daily.co pre-call UI visible
```
**Take screenshot:**
```javascript
await page.screenshot({ path: 'test2-before-join.png' });
// Expected: Daily.co pre-call UI visible
```
Use: mcp__playwright__browser_take_screenshot
Input: {"filename": "test2-before-join.png"}
Expected: Daily.co pre-call UI with "Join" button visible
```
**Join meeting:**
```javascript
await page.locator('iframe').contentFrame().getByRole('button', { name: 'Join' }).click();
await new Promise(f => setTimeout(f, 5000));
```
Note: Daily.co iframe interaction requires clicking inside iframe.
Use: mcp__playwright__browser_click
Input: {"element": "Join button in Daily.co iframe", "ref": "<ref-from-snapshot>"}
Then wait 5 seconds for call to connect
```
**Verify in-call:**
```javascript
await page.screenshot({ path: 'test2-in-call.png' });
// Expected: "Waiting for others to join" or participant video visible
```
Use: mcp__playwright__browser_take_screenshot
Input: {"filename": "test2-in-call.png"}
Expected: "Waiting for others to join" or participant video visible
```
**Leave meeting:**
```
Use: mcp__playwright__browser_click
Input: {"element": "Leave button in Daily.co iframe", "ref": "<ref-from-snapshot>"}
```
---
**Alternative: JavaScript snippets (for manual testing):**
```javascript
await page.goto('http://localhost:3000/test2');
await new Promise(f => setTimeout(f, 12000)); // Wait for load
// Verify iframe
const iframes = document.querySelectorAll('iframe');
// Expected: 1 iframe with src containing "monadical.daily.co"
// Screenshot
await page.screenshot({ path: 'test2-before-join.png' });
// Join
await page.locator('iframe').contentFrame().getByRole('button', { name: 'Join' }).click();
await new Promise(f => setTimeout(f, 5000));
// In-call screenshot
await page.screenshot({ path: 'test2-in-call.png' });
// Leave
await page.locator('iframe').contentFrame().getByRole('button', { name: 'Leave' }).click();
```
@@ -250,7 +309,142 @@ Tracks: 2 files
---
## Test 7: Recording Type Verification
## Test 7: Database Check - Recording and Transcript
**Check recording created:**
```bash
docker-compose exec -T postgres psql -U reflector -d reflector -c \
"SELECT id, bucket_name, object_key, status, meeting_id, recorded_at
FROM recording
ORDER BY recorded_at DESC LIMIT 1;"
```
**Expected:**
```
id: <recording-id-from-webhook>
bucket_name: reflector-dailyco-local
object_key: monadical/test2-<timestamp>/<recording-timestamp>-<uuid>-cam-audio-<track-start>.webm
status: completed
meeting_id: <meeting-id>
recorded_at: <recent-timestamp>
```
**Check transcript created:**
```bash
docker-compose exec -T postgres psql -U reflector -d reflector -c \
"SELECT id, title, status, duration, recording_id, meeting_id, room_id
FROM transcript
ORDER BY created_at DESC LIMIT 1;"
```
**Expected:**
```
id: <transcript-id>
title: The Great Fish Eating Argument
status: ended
duration: ~200-300 seconds (depends on fish text parsing)
recording_id: <same-as-recording-id-above>
meeting_id: <meeting-id>
room_id: 552640fd-16f2-4162-9526-8cf40cd2357e
```
**Check transcript topics (stub data):**
```bash
TRANSCRIPT_ID=$(docker-compose exec -T postgres psql -U reflector -d reflector -t -A -c \
"SELECT id FROM transcript ORDER BY created_at DESC LIMIT 1;" | tr -d '[:space:]')
docker-compose exec -T postgres psql -U reflector -d reflector -c \
"SELECT
jsonb_array_length(topics) as num_topics,
jsonb_array_length(participants) as num_participants,
short_summary,
title
FROM transcript
WHERE id = '$TRANSCRIPT_ID';"
```
**Expected:**
```
num_topics: 3
num_participants: 2
short_summary: Two people argue about eating fish
title: The Great Fish Eating Argument
```
**Check topics contain fish text:**
```bash
docker-compose exec -T postgres psql -U reflector -d reflector -c \
"SELECT topics->0->'title', topics->0->'summary', topics->0->'transcript'
FROM transcript
ORDER BY created_at DESC LIMIT 1;" | head -20
```
**Expected output should contain:**
```
Fish Argument Part 1
Argument about eating fish continues (part 1)
Fish for dinner are nothing wrong with you? There's nothing...
```
**Check participants:**
```bash
docker-compose exec -T postgres psql -U reflector -d reflector -t -A -c \
"SELECT participants FROM transcript ORDER BY created_at DESC LIMIT 1;" \
| python3 -c "import sys, json; data=json.loads(sys.stdin.read()); print(json.dumps(data, indent=2))"
```
**Expected:**
```json
[
{
"id": "<uuid>",
"speaker": 0,
"name": "Fish Eater"
},
{
"id": "<uuid>",
"speaker": 1,
"name": "Annoying Person"
}
]
```
**Check word-level data:**
```bash
docker-compose exec -T postgres psql -U reflector -d reflector -c \
"SELECT jsonb_array_length(topics->0->'words') as num_words_first_topic
FROM transcript
ORDER BY created_at DESC LIMIT 1;"
```
**Expected:**
```
num_words_first_topic: ~100-150 (varies based on topic chunking)
```
**Verify speaker diarization in words:**
```bash
docker-compose exec -T postgres psql -U reflector -d reflector -c \
"SELECT
topics->0->'words'->0->>'text' as first_word,
topics->0->'words'->0->>'speaker' as speaker,
topics->0->'words'->0->>'start' as start_time,
topics->0->'words'->0->>'end' as end_time
FROM transcript
ORDER BY created_at DESC LIMIT 1;"
```
**Expected:**
```
first_word: Fish
speaker: 1 (the first sentence contains "wrong with you", which the stub assigns to speaker 1)
start_time: 0.0
end_time: 0.5 (0.3s base + 4 characters x 0.05s)
```
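The word-level checks above can also be scripted once the `words` array of a topic has been loaded into Python. The following is a minimal sketch, not part of the commit: the helper name `check_words` is hypothetical, and the sample values simply follow the stub's timing formula (0.3s plus 0.05s per character).

```python
from typing import Any


def check_words(words: list[dict[str, Any]]) -> list[str]:
    """Return a list of problems found in a topic's word list (empty if none)."""
    problems = []
    previous_end = 0.0
    for index, word in enumerate(words):
        # The stub only ever emits speakers 0 and 1.
        if word["speaker"] not in (0, 1):
            problems.append(f"word {index}: unexpected speaker {word['speaker']}")
        # Every word must have a positive duration.
        if word["end"] <= word["start"]:
            problems.append(f"word {index}: end is not after start")
        # Words are laid out back to back, so they must never overlap.
        if word["start"] < previous_end - 1e-9:
            problems.append(f"word {index}: overlaps the previous word")
        previous_end = word["end"]
    return problems


# Illustrative sample shaped like the stub output.
sample = [
    {"text": "Fish ", "start": 0.0, "end": 0.5, "speaker": 1},
    {"text": "for ", "start": 0.5, "end": 0.95, "speaker": 1},
]
print(check_words(sample))
```

An empty list means the timestamps are monotonic and every speaker ID is one of the two stub participants.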
---
## Test 8: Recording Type Verification
**Check what Daily.co received:**
```bash
@@ -367,3 +561,10 @@ Recording: raw-tracks
- [x] S3 contains 2 files: audio (.webm) and video (.webm)
- [x] S3 path: `monadical/test2-{timestamp}/{recording-start-ts}-{participant-uuid}-cam-{audio|video}-{track-start-ts}`
- [x] Database `num_clients` increments/decrements correctly
- [x] **Database recording entry created** with correct S3 path and status `completed`
- [x] **Database transcript entry created** with status `ended`
- [x] **Transcript has stub data**: title "The Great Fish Eating Argument"
- [x] **Transcript has 3 topics** about fish argument
- [x] **Transcript has 2 participants**: "Fish Eater" (speaker 0) and "Annoying Person" (speaker 1)
- [x] **Topics contain word-level data** with timestamps and speaker IDs
- [x] **Total duration** ~200-300 seconds based on fish text parsing
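
The S3 path convention in the checklist can be verified programmatically. This is a rough sketch under stated assumptions: both timestamps are taken to be plain digit strings, the participant ID a canonical 36-character UUID, and the `.webm` suffix optional. The pattern, the function name `parse_track_key`, and the example key are all hypothetical and not taken from the codebase.

```python
import re
from typing import Optional

# Assumed layout: {prefix}/{room}/{recording-start}-{participant-uuid}-cam-{kind}-{track-start}[.webm]
TRACK_KEY = re.compile(
    r"^(?P<prefix>[^/]+)/(?P<room>[^/]+)/"
    r"(?P<recording_start>\d+)-"
    r"(?P<participant>[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})-cam-"
    r"(?P<kind>audio|video)-"
    r"(?P<track_start>\d+)(?:\.webm)?$"
)


def parse_track_key(key: str) -> Optional[dict[str, str]]:
    """Split a raw-tracks S3 key into its named parts, or return None if it does not match."""
    match = TRACK_KEY.match(key)
    return match.groupdict() if match else None


# Made-up key for illustration only.
example = (
    "monadical/test2-20251009192341/"
    "1760037820000-11111111-2222-3333-4444-555555555555-cam-audio-1760037821000.webm"
)
parsed = parse_track_key(example)
print(parsed["room"], parsed["kind"])
```

If real keys use a different timestamp or ID format, only the regular expression needs adjusting.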


@@ -142,31 +142,54 @@ async def _handle_recording_started(event: DailyWebhookEvent):
async def _handle_recording_ready(event: DailyWebhookEvent):
"""Handle recording ready for download event."""
"""Handle recording ready for download event.
Daily.co webhook payload for raw-tracks recordings:
{
"recording_id": "...",
"room_name": "test2-20251009192341",
"tracks": [
{"type": "audio", "s3Key": "monadical/test2-.../uuid-cam-audio-123.webm", "size": 400000},
{"type": "video", "s3Key": "monadical/test2-.../uuid-cam-video-456.webm", "size": 30000000}
]
}
"""
room_name = _extract_room_name(event)
recording_id = event.payload.get("recording_id")
download_link = event.payload.get("download_link")
tracks = event.payload.get("tracks", [])
if not room_name or not download_link:
if not room_name or not tracks:
logger.warning(
"recording.ready-to-download: missing room_name or tracks",
room_name=room_name,
has_tracks=bool(tracks),
payload=event.payload,
)
return
meeting = await meetings_controller.get_by_room_name(room_name)
if meeting:
try:
from reflector.worker.process import process_recording_from_url
if not meeting:
logger.warning(
"recording.ready-to-download: meeting not found", room_name=room_name
)
return
process_recording_from_url.delay(
recording_url=download_link,
meeting_id=meeting.id,
recording_id=recording_id or event.id,
)
except ImportError:
logger.warning(
"Could not queue recording processing",
meeting_id=meeting.id,
room_name=room_name,
platform="daily",
)
logger.info(
"Recording ready for download",
meeting_id=meeting.id,
room_name=room_name,
recording_id=recording_id,
num_tracks=len(tracks),
platform="daily",
)
from reflector.worker.process import process_daily_recording
process_daily_recording.delay(
meeting_id=meeting.id,
recording_id=recording_id or event.id,
tracks=tracks,
)
async def _handle_recording_error(event: DailyWebhookEvent):
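
Review note: the handler above forwards `tracks` to the worker unchanged, and the worker later picks the first audio track. As a sketch of how the payload could be pre-filtered, assuming the payload shape shown in the docstring (`type`, `s3Key`, `size`), something like the following would separate usable audio tracks from everything else. The helper name `split_tracks` is hypothetical and does not exist in the codebase.

```python
from typing import Any


def split_tracks(payload: dict[str, Any]) -> tuple[list[dict], list[dict]]:
    """Return (audio_tracks, other_tracks) from a raw-tracks webhook payload."""
    tracks = payload.get("tracks") or []
    # Keep only audio tracks that actually carry an S3 key.
    audio = [t for t in tracks if t.get("type") == "audio" and t.get("s3Key")]
    other = [t for t in tracks if t not in audio]
    return audio, other


# Payload copied from the docstring example (paths are elided there as well).
payload = {
    "recording_id": "...",
    "room_name": "test2-20251009192341",
    "tracks": [
        {"type": "audio", "s3Key": "monadical/test2-.../uuid-cam-audio-123.webm", "size": 400000},
        {"type": "video", "s3Key": "monadical/test2-.../uuid-cam-video-456.webm", "size": 30000000},
    ],
}
audio, other = split_tracks(payload)
print(len(audio), len(other))
```

Filtering early would let the handler log a precise warning when a payload has tracks but none of them is audio, instead of failing later inside the task.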


@@ -0,0 +1,168 @@
"""Stub data for Daily.co testing - Fish conversation"""
import re
from reflector.utils import generate_uuid4
# The fish argument text - 2 speakers arguing about eating fish
FISH_TEXT = """Fish for dinner are nothing wrong with you? There's nothing wrong with me. Wrong with you? Would you shut up? There's nothing wrong with me. I'm just trying to. There's nothing wrong with me. I'm trying to eat a fish. Wrong with you trying to eat a fish and it falls off the plate. Would you shut up? You're bothering me. More than a fish is bothering me. Would you shut up and leave me alone? What's your problem? I'm just trying to eat a fish is wrong with you. I'm only trying to eat a fish. Would you shut up? Wrong with you. There's nothing wrong with me. There's nothing wrong with me. Wrong with you. There's nothing wrong with me. Wrong with you. There's nothing wrong with me. Would you shut up and let me eat my fish? Wrong with you. Shut up! What is wrong with you? Would you just shut up? What's your problem? Would you shut up with you? What is wrong with you? Wrong with me? I'm just trying to get my attention. Did you shut up? You're bothering me. Would you shut up? You're beginning to bug me. What's your problem? Just trying to eat my fish. Stay on the plate. Would you shut up? Just trying to eat my fish.
I'm gonna hit you with my problem. You're worse than this fish. You're more of a problem than a fish. What's your problem? Would you shut up? Would you shut your mouth? I want to eat my fish. Shut up! I can't even think. What's your problem? Trying to eat my fish is wrong with you. I don't have a problem. What is wrong with you? I have a problem. What's your problem? I don't have a problem. Can't you hear me with you? Can't you hear me? I don't have a problem. I want to eat my fish. Your problem? Just want to eat. What is wrong with you? Shut up! What is wrong with you? You just shut up! What's your problem? What is wrong with you anyway? What is wrong with you? I won't stay on the plate. You shut up! What is wrong with you? Would you just shut up? Let me eat my fish. What's your problem? Shut up and leave me alone! I can't even think. Wrong with you. I don't have a problem. Problem? I don't have a problem. Wrong with you. I don't have a problem with you. That's your problem. Don't have a problem? I want to eat my fish.
What is wrong with you? What's your problem? Problem? I just want to eat my fish. Wrong with you. What's wrong with you? I don't have a problem. You shut up! What's wrong with you? Just shut up! What's wrong with you? Shut up! What is wrong with you? I'm trying to eat a fish. I'm trying to eat a fish and it falls off the plate. Would you shut up? What is wrong with you? Would you shut up? Is wrong with you? Would you just shut up? What is wrong with you? Would you just shut? Is wrong with you? What's your problem? You just shut. What is wrong with you? Trying to eat my fish. Would you be quiet? What's your problem? Would you just shut up? Eat my fish. I can't even eat it. Don't stay on the plate. What's your problem? Would you shut up? What is wrong with you? What is wrong with you? Would you just shut up? What's your problem? What is wrong with you? I'm gonna hit you with my fish if you don't shut up. What's your problem? Would you shut up? What's wrong with you? What is wrong? Shut up! What's your problem?"""
def parse_fish_text():
"""Parse fish text into words with timestamps and speakers.
Returns a list of words: [{"text": str, "start": float, "end": float, "speaker": int}]
Speaker assignment heuristic:
- Speaker 0 (eating fish): "fish", "eat", "trying", "problem", "I"
- Speaker 1 (annoying): "wrong with you", "shut up", "What's your problem"
"""
# Split into sentences (rough)
sentences = re.split(r"([.!?])", FISH_TEXT)
# Reconstruct sentences with punctuation
full_sentences = []
for i in range(0, len(sentences) - 1, 2):
if sentences[i].strip():
full_sentences.append(
sentences[i].strip()
+ (sentences[i + 1] if i + 1 < len(sentences) else "")
)
words = []
current_time = 0.0
for sentence in full_sentences:
if not sentence.strip():
continue
# Determine speaker based on content
sentence_lower = sentence.lower()
# Speaker 1 patterns (annoying person)
if any(
p in sentence_lower
for p in [
"wrong with you",
"shut up",
"what's your problem",
"what is wrong",
"would you shut",
"you shut",
]
):
speaker = 1
# Speaker 0 patterns (trying to eat)
elif any(
p in sentence_lower
for p in [
"i'm trying",
"i'm just",
"i want to eat",
"eat my fish",
"trying to eat",
"nothing wrong with me",
"i don't have a problem",
"just trying",
"leave me alone",
"can't even",
"i'm gonna hit",
]
):
speaker = 0
# Default: alternate or use context
else:
# For short phrases, guess based on keywords
if "fish" in sentence_lower and "eat" in sentence_lower:
speaker = 0
elif "problem" in sentence_lower and "your" not in sentence_lower:
speaker = 0
else:
speaker = 1
# Split sentence into words
sentence_words = sentence.split()
for word in sentence_words:
word_duration = 0.3 + (len(word) * 0.05) # 0.3s base + 0.05s per character (punctuation included)
words.append(
{
"text": word + " ", # Add space
"start": current_time,
"end": current_time + word_duration,
"speaker": speaker,
}
)
current_time += word_duration
return words
def generate_fake_topics(words):
"""Generate fake topics from words.
Splits the words into 3 roughly equal topics by word count (not by timestamp).
"""
if not words:
return []
total_duration = words[-1]["end"]
chunk_size = len(words) // 3
topics = []
for i in range(3):
start_idx = i * chunk_size
end_idx = (i + 1) * chunk_size if i < 2 else len(words)
if start_idx >= len(words):
break
chunk_words = words[start_idx:end_idx]
topic = {
"id": generate_uuid4(),
"title": f"Fish Argument Part {i+1}",
"summary": f"Argument about eating fish continues (part {i+1})",
"timestamp": chunk_words[0]["start"],
"duration": chunk_words[-1]["end"] - chunk_words[0]["start"],
"transcript": "".join(w["text"] for w in chunk_words),
"words": chunk_words,
}
topics.append(topic)
return topics
def generate_fake_participants():
"""Generate fake participants."""
return [
{"id": generate_uuid4(), "speaker": 0, "name": "Fish Eater"},
{"id": generate_uuid4(), "speaker": 1, "name": "Annoying Person"},
]
def get_stub_transcript_data():
"""Get complete stub transcript data for Daily.co testing.
Returns dict with topics, participants, title, summaries, duration.
"""
words = parse_fish_text()
topics = generate_fake_topics(words)
participants = generate_fake_participants()
return {
"topics": topics,
"participants": participants,
"title": "The Great Fish Eating Argument",
"short_summary": "Two people argue about eating fish",
"long_summary": "An extended argument between someone trying to eat fish and another person who won't stop asking what's wrong. The fish keeps falling off the plate.",
"duration": words[-1]["end"] if words else 0.0,
}
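
Review note: the sentence reconstruction in `parse_fish_text` relies on `re.split` with a capturing group, which keeps each delimiter as its own list element. The sketch below isolates that technique together with the per-word timing formula so both can be checked on a short input. The function names `split_sentences` and `word_duration` are illustrative and do not exist in the module.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Split text on . ! ? while keeping the punctuation attached to each sentence."""
    # With a capturing group, re.split returns [body, delim, body, delim, ..., tail].
    parts = re.split(r"([.!?])", text)
    sentences = []
    for i in range(0, len(parts) - 1, 2):
        body = parts[i].strip()
        if body:
            sentences.append(body + parts[i + 1])
    return sentences


def word_duration(word: str) -> float:
    """Same formula as the stub: 0.3s base plus 0.05s per character."""
    return 0.3 + len(word) * 0.05


print(split_sentences("Would you shut up? Shut up! I want to eat my fish."))
print(word_duration("Fish"))
```

Because the duration grows with word length, longer tokens such as "bothering" take well over half a second, which is worth keeping in mind when estimating the total transcript duration.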


@@ -238,6 +238,95 @@ async def process_meetings():
)
@shared_task
@asynctask
async def process_daily_recording(meeting_id: str, recording_id: str, tracks: list):
"""Stub processor for Daily.co recordings - writes fake transcription/diarization.
Args:
meeting_id: Meeting ID
recording_id: Recording ID from Daily.co webhook
tracks: List of track dicts from Daily.co webhook
[{type: 'audio'|'video', s3Key: str, size: int}, ...]
"""
logger.info(
"Processing Daily.co recording (STUB)",
meeting_id=meeting_id,
recording_id=recording_id,
num_tracks=len(tracks),
)
meeting = await meetings_controller.get_by_id(meeting_id)
if not meeting:
raise Exception(f"Meeting {meeting_id} not found")
room = await rooms_controller.get_by_id(meeting.room_id)
# Find first audio track for Recording entity
audio_track = next((t for t in tracks if t["type"] == "audio"), None)
if not audio_track:
raise Exception(f"No audio tracks found in {len(tracks)} tracks")
# Create Recording entry
recording = await recordings_controller.create(
Recording(
id=recording_id,
bucket_name=settings.AWS_DAILY_S3_BUCKET,
object_key=audio_track["s3Key"],
recorded_at=datetime.now(timezone.utc),
meeting_id=meeting.id,
status="completed",
)
)
logger.info(
"Created recording",
recording_id=recording.id,
s3_key=audio_track["s3Key"],
)
# Create Transcript entry
transcript = await transcripts_controller.add(
"",
source_kind=SourceKind.ROOM,
source_language="en",
target_language="en",
user_id=room.user_id,
recording_id=recording.id,
share_mode="public",
meeting_id=meeting.id,
room_id=room.id,
)
logger.info("Created transcript", transcript_id=transcript.id)
# Generate fake data (fish argument)
from reflector.worker.daily_stub_data import get_stub_transcript_data
stub_data = get_stub_transcript_data()
# Update transcript with fake data
await transcripts_controller.update(
transcript,
{
"topics": stub_data["topics"],
"participants": stub_data["participants"],
"title": stub_data["title"],
"short_summary": stub_data["short_summary"],
"long_summary": stub_data["long_summary"],
"duration": stub_data["duration"],
"status": "ended",
},
)
logger.info(
"Daily.co recording processed (STUB)",
transcript_id=transcript.id,
duration=stub_data["duration"],
num_topics=len(stub_data["topics"]),
)
@shared_task
@asynctask
async def reprocess_failed_recordings():