stub processor (vibe)

2025-12-21 04:39:06 +00:00 · 2025-10-10 18:05:31 -04:00
parent 4c523c8eec
commit f945f84be9
4 changed files with 515 additions and 34 deletions
--- a/server/DAILYCO_TEST.md
+++ b/server/DAILYCO_TEST.md
@@ -1,5 +1,23 @@
 # Daily.co Integration Test Plan

+## ⚠️ IMPORTANT: Stub Implementation
+
+**This test validates Daily.co webhook integration with MOCK transcription data.**
+
+The actual audio/video files are recorded to S3, but transcription/diarization is NOT performed. Instead:
+- A **stub processor** generates a fake transcript with predetermined text ("The Great Fish Eating Argument")
+- All database entities (recording, transcript, topics, participants, words) are created with **fake "fish" conversation data**
+- This allows testing the complete webhook → database flow WITHOUT expensive GPU processing
+
+**Expected transcript content:**
+- Title: "The Great Fish Eating Argument"
+- Participants: "Fish Eater" (speaker 0), "Annoying Person" (speaker 1)
+- Transcription: Nonsensical argument about eating fish (see `reflector/worker/daily_stub_data.py`)
+
+**Next implementation step:** Replace stub with real transcription pipeline (download tracks from S3, merge audio, run Whisper/diarization).
+
+---
+
 ## Prerequisites

 **1. Environment Variables** (check in `.env.development.local`):
@@ -126,38 +144,79 @@ curl -s -X GET "https://api.daily.co/v1/rooms/$ROOM_NAME" \

 ## Test 4: Browser UI Test (Playwright MCP)

+**Using Claude Code MCP tools:**
+
 **Load room:**
-```javascript
-await page.goto('http://localhost:3000/test2');
-await new Promise(f => setTimeout(f, 12000));  // Wait for load
+```
+Use: mcp__playwright__browser_navigate
+Input: {"url": "http://localhost:3000/test2"}
+
+Then wait 12 seconds for iframe to load
 ```

 **Verify Daily.co iframe loaded:**
-```javascript
-const iframes = document.querySelectorAll('iframe');
-// Expected: 1 iframe with src containing "monadical.daily.co"
+```
+Use: mcp__playwright__browser_snapshot
+
+Expected in snapshot:
+- iframe element with src containing "monadical.daily.co"
+- Daily.co pre-call UI visible
 ```

 **Take screenshot:**
-```javascript
-await page.screenshot({ path: 'test2-before-join.png' });
-// Expected: Daily.co pre-call UI visible
+```
+Use: mcp__playwright__browser_take_screenshot
+Input: {"filename": "test2-before-join.png"}
+
+Expected: Daily.co pre-call UI with "Join" button visible
 ```

 **Join meeting:**
-```javascript
-await page.locator('iframe').contentFrame().getByRole('button', { name: 'Join' }).click();
-await new Promise(f => setTimeout(f, 5000));
+```
+Note: Daily.co iframe interaction requires clicking inside iframe.
+Use: mcp__playwright__browser_click
+Input: {"element": "Join button in Daily.co iframe", "ref": "<ref-from-snapshot>"}
+
+Then wait 5 seconds for call to connect
 ```

 **Verify in-call:**
-```javascript
-await page.screenshot({ path: 'test2-in-call.png' });
-// Expected: "Waiting for others to join" or participant video visible
+```
+Use: mcp__playwright__browser_take_screenshot
+Input: {"filename": "test2-in-call.png"}
+
+Expected: "Waiting for others to join" or participant video visible
 ```

 **Leave meeting:**
+```
+Use: mcp__playwright__browser_click
+Input: {"element": "Leave button in Daily.co iframe", "ref": "<ref-from-snapshot>"}
+```
+
+---
+
+**Alternative: JavaScript snippets (for manual testing):**
+
 ```javascript
+await page.goto('http://localhost:3000/test2');
+await new Promise(f => setTimeout(f, 12000));  // Wait for load
+
+// Verify iframe (count via the Playwright locator; `document` is not defined in this context)
+const iframeCount = await page.locator('iframe').count();
+// Expected: 1 iframe with src containing "monadical.daily.co"
+
+// Screenshot
+await page.screenshot({ path: 'test2-before-join.png' });
+
+// Join
+await page.locator('iframe').contentFrame().getByRole('button', { name: 'Join' }).click();
+await new Promise(f => setTimeout(f, 5000));
+
+// In-call screenshot
+await page.screenshot({ path: 'test2-in-call.png' });
+
+// Leave
 await page.locator('iframe').contentFrame().getByRole('button', { name: 'Leave' }).click();
 ```

@@ -250,7 +309,142 @@ Tracks: 2 files

 ---

-## Test 7: Recording Type Verification
+## Test 7: Database Check - Recording and Transcript
+
+**Check recording created:**
+```bash
+docker-compose exec -T postgres psql -U reflector -d reflector -c \
+  "SELECT id, bucket_name, object_key, status, meeting_id, recorded_at
+   FROM recording
+   ORDER BY recorded_at DESC LIMIT 1;"
+```
+
+**Expected:**
+```
+id: <recording-id-from-webhook>
+bucket_name: reflector-dailyco-local
+object_key: monadical/test2-<timestamp>/<recording-timestamp>-<uuid>-cam-audio-<track-start>.webm
+status: completed
+meeting_id: <meeting-id>
+recorded_at: <recent-timestamp>
+```
+
+**Check transcript created:**
+```bash
+docker-compose exec -T postgres psql -U reflector -d reflector -c \
+  "SELECT id, title, status, duration, recording_id, meeting_id, room_id
+   FROM transcript
+   ORDER BY created_at DESC LIMIT 1;"
+```
+
+**Expected:**
+```
+id: <transcript-id>
+title: The Great Fish Eating Argument
+status: ended
+duration: ~200-300 seconds (depends on fish text parsing)
+recording_id: <same-as-recording-id-above>
+meeting_id: <meeting-id>
+room_id: 552640fd-16f2-4162-9526-8cf40cd2357e
+```
+
+**Check transcript topics (stub data):**
+```bash
+TRANSCRIPT_ID=$(docker-compose exec -T postgres psql -U reflector -d reflector -t -A -c \
+  "SELECT id FROM transcript ORDER BY created_at DESC LIMIT 1;" | tr -d '[:space:]')
+
+docker-compose exec -T postgres psql -U reflector -d reflector -c \
+  "SELECT
+     jsonb_array_length(topics) as num_topics,
+     jsonb_array_length(participants) as num_participants,
+     short_summary,
+     title
+   FROM transcript
+   WHERE id = '$TRANSCRIPT_ID';"
+```
+
+**Expected:**
+```
+num_topics: 3
+num_participants: 2
+short_summary: Two people argue about eating fish
+title: The Great Fish Eating Argument
+```
+
+**Check topics contain fish text:**
+```bash
+docker-compose exec -T postgres psql -U reflector -d reflector -c \
+  "SELECT topics->0->'title', topics->0->'summary', topics->0->'transcript'
+   FROM transcript
+   ORDER BY created_at DESC LIMIT 1;" | head -20
+```
+
+**Expected output should contain:**
+```
+Fish Argument Part 1
+Argument about eating fish continues (part 1)
+Fish for dinner are nothing wrong with you? There's nothing...
+```
+
+**Check participants:**
+```bash
+docker-compose exec -T postgres psql -U reflector -d reflector -t -A -c \
+  "SELECT participants FROM transcript ORDER BY created_at DESC LIMIT 1;" \
+  | python3 -c "import sys, json; data=json.loads(sys.stdin.read()); print(json.dumps(data, indent=2))"
+```
+
+**Expected:**
+```json
+[
+  {
+    "id": "<uuid>",
+    "speaker": 0,
+    "name": "Fish Eater"
+  },
+  {
+    "id": "<uuid>",
+    "speaker": 1,
+    "name": "Annoying Person"
+  }
+]
+```
+
+**Check word-level data:**
+```bash
+docker-compose exec -T postgres psql -U reflector -d reflector -c \
+  "SELECT jsonb_array_length(topics->0->'words') as num_words_first_topic
+   FROM transcript
+   ORDER BY created_at DESC LIMIT 1;"
+```
+
+**Expected:**
+```
+num_words_first_topic: ~210 (the stub splits roughly 630 words evenly across 3 topics)
+```
+
+**Verify speaker diarization in words:**
+```bash
+docker-compose exec -T postgres psql -U reflector -d reflector -c \
+  "SELECT
+     topics->0->'words'->0->>'text' as first_word,
+     topics->0->'words'->0->>'speaker' as speaker,
+     topics->0->'words'->0->>'start' as start_time,
+     topics->0->'words'->0->>'end' as end_time
+   FROM transcript
+   ORDER BY created_at DESC LIMIT 1;"
+```
+
+**Expected:**
+```
+first_word: Fish
+speaker: 1 (the first sentence contains "wrong with you")
+start_time: 0.0
+end_time: 0.5 (0.3 + 4 characters * 0.05)
+```
+
+---
+
+## Test 8: Recording Type Verification

 **Check what Daily.co received:**
 ```bash
@@ -367,3 +561,10 @@ Recording: raw-tracks
 - [x] S3 contains 2 files: audio (.webm) and video (.webm)
 - [x] S3 path: `monadical/test2-{timestamp}/{recording-start-ts}-{participant-uuid}-cam-{audio|video}-{track-start-ts}`
 - [x] Database `num_clients` increments/decrements correctly
+- [x] **Database recording entry created** with correct S3 path and status `completed`
+- [x] **Database transcript entry created** with status `ended`
+- [x] **Transcript has stub data**: title "The Great Fish Eating Argument"
+- [x] **Transcript has 3 topics** about fish argument
+- [x] **Transcript has 2 participants**: "Fish Eater" (speaker 0) and "Annoying Person" (speaker 1)
+- [x] **Topics contain word-level data** with timestamps and speaker IDs
+- [x] **Total duration** ~200-300 seconds based on fish text parsing
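Reviewer note: the checklist above could also be checked programmatically. The following is a minimal sketch, not part of this commit. It assumes the transcript row has already been loaded into a plain dict with `title`, `status`, `topics`, and `participants` keys; the function name `check_stub_transcript` and the dict shape are hypothetical.

```python
EXPECTED_TITLE = "The Great Fish Eating Argument"
EXPECTED_NAMES = {0: "Fish Eater", 1: "Annoying Person"}
REQUIRED_WORD_KEYS = {"text", "start", "end", "speaker"}


def check_stub_transcript(row: dict) -> list:
    """Return a list of human-readable problems; an empty list means the row matches."""
    problems = []
    if row.get("title") != EXPECTED_TITLE:
        problems.append("unexpected title")
    if row.get("status") != "ended":
        problems.append("status is not 'ended'")

    topics = row.get("topics") or []
    if len(topics) != 3:
        problems.append(f"expected 3 topics, got {len(topics)}")

    # Map speaker index -> name and compare with the stub participants.
    names = {p.get("speaker"): p.get("name") for p in row.get("participants") or []}
    if names != EXPECTED_NAMES:
        problems.append("unexpected participants")

    # Every word should carry text, timing, and a speaker id.
    for i, topic in enumerate(topics):
        for word in topic.get("words") or []:
            if not REQUIRED_WORD_KEYS <= set(word):
                problems.append(f"topic {i}: word missing timing or speaker fields")
                break
    return problems


# Hypothetical row, shaped like the stub data described in the test plan.
good_word = {"text": "Fish ", "start": 0.0, "end": 0.5, "speaker": 1}
good_row = {
    "title": EXPECTED_TITLE,
    "status": "ended",
    "topics": [{"words": [good_word]} for _ in range(3)],
    "participants": [
        {"id": "a", "speaker": 0, "name": "Fish Eater"},
        {"id": "b", "speaker": 1, "name": "Annoying Person"},
    ],
}
print(check_stub_transcript(good_row))
```

Returning a list of problems rather than raising on the first mismatch makes it easier to see every failed checklist item in one run.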
--- a/server/reflector/views/daily.py
+++ b/server/reflector/views/daily.py
@@ -142,31 +142,54 @@ async def _handle_recording_started(event: DailyWebhookEvent):


 async def _handle_recording_ready(event: DailyWebhookEvent):
-    """Handle recording ready for download event."""
+    """Handle recording ready for download event.
+
+    Daily.co webhook payload for raw-tracks recordings:
+    {
+      "recording_id": "...",
+      "room_name": "test2-20251009192341",
+      "tracks": [
+        {"type": "audio", "s3Key": "monadical/test2-.../uuid-cam-audio-123.webm", "size": 400000},
+        {"type": "video", "s3Key": "monadical/test2-.../uuid-cam-video-456.webm", "size": 30000000}
+      ]
+    }
+    """
    room_name = _extract_room_name(event)
    recording_id = event.payload.get("recording_id")
-    download_link = event.payload.get("download_link")
+    tracks = event.payload.get("tracks", [])

-    if not room_name or not download_link:
+    if not room_name or not tracks:
+        logger.warning(
+            "recording.ready-to-download: missing room_name or tracks",
+            room_name=room_name,
+            has_tracks=bool(tracks),
+            payload=event.payload,
+        )
        return

    meeting = await meetings_controller.get_by_room_name(room_name)
-    if meeting:
-        try:
-            from reflector.worker.process import process_recording_from_url
+    if not meeting:
+        logger.warning(
+            "recording.ready-to-download: meeting not found", room_name=room_name
+        )
+        return

-            process_recording_from_url.delay(
-                recording_url=download_link,
-                meeting_id=meeting.id,
-                recording_id=recording_id or event.id,
-            )
-        except ImportError:
-            logger.warning(
-                "Could not queue recording processing",
-                meeting_id=meeting.id,
-                room_name=room_name,
-                platform="daily",
-            )
+    logger.info(
+        "Recording ready for download",
+        meeting_id=meeting.id,
+        room_name=room_name,
+        recording_id=recording_id,
+        num_tracks=len(tracks),
+        platform="daily",
+    )
+
+    from reflector.worker.process import process_daily_recording
+
+    process_daily_recording.delay(
+        meeting_id=meeting.id,
+        recording_id=recording_id or event.id,
+        tracks=tracks,
+    )


 async def _handle_recording_error(event: DailyWebhookEvent):
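Reviewer note: the handler forwards `tracks` unvalidated, and the worker later indexes `t["type"]` and `audio_track["s3Key"]` directly. A tolerant lookup might look like the sketch below. It assumes only the payload shape shown in the docstring above; the helper name `first_audio_track` and the sample values are hypothetical and are not part of this commit.

```python
from typing import Any, Optional


def first_audio_track(tracks: list) -> Optional[dict]:
    """Return the first dict track with type "audio" and a non-empty s3Key, else None."""
    for track in tracks:
        if not isinstance(track, dict):
            continue  # skip malformed entries instead of raising
        if track.get("type") == "audio" and track.get("s3Key"):
            return track
    return None


# Sample payload following the documented shape (values are made up).
payload: dict = {
    "recording_id": "rec-1",
    "room_name": "test2-20251009192341",
    "tracks": [
        {"type": "video", "s3Key": "example/video.webm", "size": 30000000},
        {"type": "audio", "s3Key": "example/audio.webm", "size": 400000},
    ],
}

audio: Any = first_audio_track(payload.get("tracks", []))
print(audio["s3Key"] if audio else "no audio track")
```

Using `.get` means a track without a `type` or `s3Key` field is skipped rather than causing a `KeyError` inside the Celery task.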
--- a/server/reflector/worker/daily_stub_data.py
+++ b/server/reflector/worker/daily_stub_data.py
@@ -0,0 +1,168 @@
+"""Stub data for Daily.co testing - Fish conversation"""
+
+import re
+
+from reflector.utils import generate_uuid4
+
+# The fish argument text - 2 speakers arguing about eating fish
+FISH_TEXT = """Fish for dinner are nothing wrong with you? There's nothing wrong with me. Wrong with you? Would you shut up? There's nothing wrong with me. I'm just trying to. There's nothing wrong with me. I'm trying to eat a fish. Wrong with you trying to eat a fish and it falls off the plate. Would you shut up? You're bothering me. More than a fish is bothering me. Would you shut up and leave me alone? What's your problem? I'm just trying to eat a fish is wrong with you. I'm only trying to eat a fish. Would you shut up? Wrong with you. There's nothing wrong with me. There's nothing wrong with me. Wrong with you. There's nothing wrong with me. Wrong with you. There's nothing wrong with me. Would you shut up and let me eat my fish? Wrong with you. Shut up! What is wrong with you? Would you just shut up? What's your problem? Would you shut up with you? What is wrong with you? Wrong with me? I'm just trying to get my attention. Did you shut up? You're bothering me. Would you shut up? You're beginning to bug me. What's your problem? Just trying to eat my fish. Stay on the plate. Would you shut up? Just trying to eat my fish.
+
+I'm gonna hit you with my problem. You're worse than this fish. You're more of a problem than a fish. What's your problem? Would you shut up? Would you shut your mouth? I want to eat my fish. Shut up! I can't even think. What's your problem? Trying to eat my fish is wrong with you. I don't have a problem. What is wrong with you? I have a problem. What's your problem? I don't have a problem. Can't you hear me with you? Can't you hear me? I don't have a problem. I want to eat my fish. Your problem? Just want to eat. What is wrong with you? Shut up! What is wrong with you? You just shut up! What's your problem? What is wrong with you anyway? What is wrong with you? I won't stay on the plate. You shut up! What is wrong with you? Would you just shut up? Let me eat my fish. What's your problem? Shut up and leave me alone! I can't even think. Wrong with you. I don't have a problem. Problem? I don't have a problem. Wrong with you. I don't have a problem with you. That's your problem. Don't have a problem? I want to eat my fish.
+
+What is wrong with you? What's your problem? Problem? I just want to eat my fish. Wrong with you. What's wrong with you? I don't have a problem. You shut up! What's wrong with you? Just shut up! What's wrong with you? Shut up! What is wrong with you? I'm trying to eat a fish. I'm trying to eat a fish and it falls off the plate. Would you shut up? What is wrong with you? Would you shut up? Is wrong with you? Would you just shut up? What is wrong with you? Would you just shut? Is wrong with you? What's your problem? You just shut. What is wrong with you? Trying to eat my fish. Would you be quiet? What's your problem? Would you just shut up? Eat my fish. I can't even eat it. Don't stay on the plate. What's your problem? Would you shut up? What is wrong with you? What is wrong with you? Would you just shut up? What's your problem? What is wrong with you? I'm gonna hit you with my fish if you don't shut up. What's your problem? Would you shut up? What's wrong with you? What is wrong? Shut up! What's your problem?"""
+
+
+def parse_fish_text():
+    """Parse fish text into words with timestamps and speakers.
+
+    Returns a list of words: [{"text": str, "start": float, "end": float, "speaker": int}]
+
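+    Speaker assignment heuristic (speaker 1 patterns are checked first and win on overlap):
+    - Speaker 1 (annoying): "wrong with you", "shut up", "what's your problem", "what is wrong"
+    - Speaker 0 (eating fish): "i'm trying", "eat my fish", "nothing wrong with me", "leave me alone"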
+    """
+
+    # Split into sentences (rough)
+    sentences = re.split(r"([.!?])", FISH_TEXT)
+
+    # Reconstruct sentences with punctuation
+    full_sentences = []
+    for i in range(0, len(sentences) - 1, 2):
+        if sentences[i].strip():
+            full_sentences.append(
+                sentences[i].strip()
+                + (sentences[i + 1] if i + 1 < len(sentences) else "")
+            )
+
+    words = []
+    current_time = 0.0
+
+    for sentence in full_sentences:
+        if not sentence.strip():
+            continue
+
+        # Determine speaker based on content
+        sentence_lower = sentence.lower()
+
+        # Speaker 1 patterns (annoying person)
+        if any(
+            p in sentence_lower
+            for p in [
+                "wrong with you",
+                "shut up",
+                "what's your problem",
+                "what is wrong",
+                "would you shut",
+                "you shut",
+            ]
+        ):
+            speaker = 1
+        # Speaker 0 patterns (trying to eat)
+        elif any(
+            p in sentence_lower
+            for p in [
+                "i'm trying",
+                "i'm just",
+                "i want to eat",
+                "eat my fish",
+                "trying to eat",
+                "nothing wrong with me",
+                "i don't have a problem",
+                "just trying",
+                "leave me alone",
+                "can't even",
+                "i'm gonna hit",
+            ]
+        ):
+            speaker = 0
+        # Fallback: guess from keywords, defaulting to speaker 1
+        else:
+            # For short phrases, guess based on keywords
+            if "fish" in sentence_lower and "eat" in sentence_lower:
+                speaker = 0
+            elif "problem" in sentence_lower and "your" not in sentence_lower:
+                speaker = 0
+            else:
+                speaker = 1
+
+        # Split sentence into words
+        sentence_words = sentence.split()
+        for word in sentence_words:
+            word_duration = 0.3 + (len(word) * 0.05)  # 0.3s base + 0.05s per character (punctuation included)
+
+            words.append(
+                {
+                    "text": word + " ",  # Add space
+                    "start": current_time,
+                    "end": current_time + word_duration,
+                    "speaker": speaker,
+                }
+            )
+
+            current_time += word_duration
+
+    return words
+
+
+def generate_fake_topics(words):
+    """Generate fake topics from words.
+
+    Splits the words into 3 roughly equal chunks by word count (not by timestamp).
+    """
+    if not words:
+        return []
+
+    total_duration = words[-1]["end"]
+    chunk_size = len(words) // 3
+
+    topics = []
+
+    for i in range(3):
+        start_idx = i * chunk_size
+        end_idx = (i + 1) * chunk_size if i < 2 else len(words)
+
+        if start_idx >= len(words):
+            break
+
+        chunk_words = words[start_idx:end_idx]
+
+        topic = {
+            "id": generate_uuid4(),
+            "title": f"Fish Argument Part {i+1}",
+            "summary": f"Argument about eating fish continues (part {i+1})",
+            "timestamp": chunk_words[0]["start"],
+            "duration": chunk_words[-1]["end"] - chunk_words[0]["start"],
+            "transcript": "".join(w["text"] for w in chunk_words),
+            "words": chunk_words,
+        }
+
+        topics.append(topic)
+
+    return topics
+
+
+def generate_fake_participants():
+    """Generate fake participants."""
+    return [
+        {"id": generate_uuid4(), "speaker": 0, "name": "Fish Eater"},
+        {"id": generate_uuid4(), "speaker": 1, "name": "Annoying Person"},
+    ]
+
+
+def get_stub_transcript_data():
+    """Get complete stub transcript data for Daily.co testing.
+
+    Returns dict with topics, participants, title, summaries, duration.
+    """
+    words = parse_fish_text()
+    topics = generate_fake_topics(words)
+    participants = generate_fake_participants()
+
+    return {
+        "topics": topics,
+        "participants": participants,
+        "title": "The Great Fish Eating Argument",
+        "short_summary": "Two people argue about eating fish",
+        "long_summary": "An extended argument between someone trying to eat fish and another person who won't stop asking what's wrong. The fish keeps falling off the plate.",
+        "duration": words[-1]["end"] if words else 0.0,
+    }
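Reviewer note: the timing model and the three-way split in `daily_stub_data.py` can be exercised in isolation. The sketch below re-implements both under the same formula (0.3 s base plus 0.05 s per character) so the expected values in the test plan can be checked by hand. The names `timed_words` and `split_in_three` are hypothetical stand-ins, not the functions from this commit, and unlike `generate_fake_topics` the split here drops empty chunks when there are fewer than three words.

```python
def word_duration(word: str) -> float:
    """0.3 s base plus 0.05 s per character, punctuation included."""
    return 0.3 + len(word) * 0.05


def timed_words(sentence: str, speaker: int, start: float = 0.0) -> list:
    """Assign contiguous start/end times to each whitespace-separated word."""
    words = []
    current = start
    for word in sentence.split():
        duration = word_duration(word)
        words.append(
            {"text": word + " ", "start": current, "end": current + duration, "speaker": speaker}
        )
        current += duration
    return words


def split_in_three(words: list) -> list:
    """Split by word count into up to 3 chunks; the last chunk takes the remainder."""
    chunk = len(words) // 3
    chunks = [
        words[i * chunk : (i + 1) * chunk if i < 2 else len(words)] for i in range(3)
    ]
    return [c for c in chunks if c]  # drop empty chunks for very short inputs


words = timed_words("Fish for dinner are nothing wrong with you?", speaker=1)
chunks = split_in_three(words)
print(len(words), [len(c) for c in chunks])
```

With the eight-word first sentence, the chunks hold 2, 2, and 4 words, which shows how the remainder always lands in the last topic.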
--- a/server/reflector/worker/process.py
+++ b/server/reflector/worker/process.py
@@ -238,6 +238,95 @@ async def process_meetings():
    )


+@shared_task
+@asynctask
+async def process_daily_recording(meeting_id: str, recording_id: str, tracks: list):
+    """Stub processor for Daily.co recordings - writes fake transcription/diarization.
+
+    Args:
+        meeting_id: Meeting ID
+        recording_id: Recording ID from Daily.co webhook
+        tracks: List of track dicts from Daily.co webhook
+                [{type: 'audio'|'video', s3Key: str, size: int}, ...]
+    """
+    logger.info(
+        "Processing Daily.co recording (STUB)",
+        meeting_id=meeting_id,
+        recording_id=recording_id,
+        num_tracks=len(tracks),
+    )
+
+    meeting = await meetings_controller.get_by_id(meeting_id)
+    if not meeting:
+        raise Exception(f"Meeting {meeting_id} not found")
+
+    room = await rooms_controller.get_by_id(meeting.room_id)
+
+    # Find first audio track for Recording entity
+    audio_track = next((t for t in tracks if t["type"] == "audio"), None)
+    if not audio_track:
+        raise Exception(f"No audio tracks found in {len(tracks)} tracks")
+
+    # Create Recording entry
+    recording = await recordings_controller.create(
+        Recording(
+            id=recording_id,
+            bucket_name=settings.AWS_DAILY_S3_BUCKET,
+            object_key=audio_track["s3Key"],
+            recorded_at=datetime.now(timezone.utc),
+            meeting_id=meeting.id,
+            status="completed",
+        )
+    )
+
+    logger.info(
+        "Created recording",
+        recording_id=recording.id,
+        s3_key=audio_track["s3Key"],
+    )
+
+    # Create Transcript entry
+    transcript = await transcripts_controller.add(
+        "",
+        source_kind=SourceKind.ROOM,
+        source_language="en",
+        target_language="en",
+        user_id=room.user_id,
+        recording_id=recording.id,
+        share_mode="public",
+        meeting_id=meeting.id,
+        room_id=room.id,
+    )
+
+    logger.info("Created transcript", transcript_id=transcript.id)
+
+    # Generate fake data (fish argument)
+    from reflector.worker.daily_stub_data import get_stub_transcript_data
+
+    stub_data = get_stub_transcript_data()
+
+    # Update transcript with fake data
+    await transcripts_controller.update(
+        transcript,
+        {
+            "topics": stub_data["topics"],
+            "participants": stub_data["participants"],
+            "title": stub_data["title"],
+            "short_summary": stub_data["short_summary"],
+            "long_summary": stub_data["long_summary"],
+            "duration": stub_data["duration"],
+            "status": "ended",
+        },
+    )
+
+    logger.info(
+        "Daily.co recording processed (STUB)",
+        transcript_id=transcript.id,
+        duration=stub_data["duration"],
+        num_topics=len(stub_data["topics"]),
+    )
+
+
@shared_task
@asynctask
 async def reprocess_failed_recordings():