# Codebase Review Guide: Audio Storage Consent Implementation

This guide walks through the relevant parts of the codebase for implementing the audio storage consent flow.

**Important**: This implementation works with post-processing deletion, not real-time recording control, due to Whereby integration constraints.

## System Reality: Recording Detection Constraints

**Critical Understanding**:
- **No real-time recording detection** - System only discovers recordings after they complete via SQS polling (60+ second delay)
- **Cannot stop recordings in progress** - Whereby controls recording entirely based on room configuration
- **Limited webhooks** - Only `room.client.joined/left` events available, no recording events
- **Post-processing intervention only** - Can only mark recordings for deletion during SQS processing

## 1. Current Consent Implementation (TO BE REMOVED)

### File: `www/app/[roomName]/page.tsx`

**Purpose:** Room entry page with blocking consent dialog

**Key Areas:**
- **Line 24:** `const [consentGiven, setConsentGiven] = useState(null);`
- **Lines 34-36:** `handleConsent` function that sets consent state
- **Lines 80-124:** Consent UI blocking room entry
- **Line 80:** `if (!isAuthenticated && !consentGiven)` - blocking condition

**Current Logic:**
```typescript
// Lines 99-111: Consent request UI
{consentGiven === null ? (
  <>
    This meeting may be recorded. Do you consent to being recorded?
  </>
) : (
  // Lines 114-120: Rejection message
  You cannot join the meeting without consenting...
)}
```

**What to Change:** Remove entire consent blocking logic, allow direct room entry.

---

## 2. Whereby Integration Reality

### File: `www/app/[roomName]/page.tsx`

**Purpose:** Main room page where video call happens via whereby-embed

**Key Whereby Integration:**
- **Line 129:** `<whereby-embed>` element - this IS the video call
- **Lines 26-28:** Room URL from meeting API
- **Lines 48-57:** Event listeners for whereby events

**What Happens:**
1. `useRoomMeeting()` calls backend to create/get Whereby meeting
2. Whereby automatically records based on room `recording_trigger` configuration
3. **NO real-time recording status** - system doesn't know when recording starts/stops

### File: `www/app/[roomName]/useRoomMeeting.tsx`

**Purpose:** Creates or retrieves Whereby meeting for room

**Key Flow:**
- **Line 48:** Calls `v1RoomsCreateMeeting({ roomName })`
- **Lines 49-52:** Returns meeting with `room_url` and `host_room_url`
- Meeting includes recording configuration from room settings

**What to Add:** Consent dialog overlay on the whereby-embed - always ask for consent regardless of meeting configuration (simplified approach).

---

## 3. Recording Discovery System (POST-PROCESSING ONLY)

### File: `server/reflector/worker/process.py`

**Purpose:** Discovers recordings after they complete via SQS polling

**Key Areas:**
- **Lines 24-62:** `process_messages()` - polls SQS every 60 seconds
- **Lines 66-133:** `process_recording()` - processes discovered recording files
- **Lines 69-71:** Extracts meeting info from S3 object key format

**Current Discovery Flow:**
```python
# Lines 69-71: Parse S3 object key
room_name = f"/{object_key[:36]}"  # First 36 chars = room GUID
recorded_at = datetime.fromisoformat(object_key[37:57])  # Timestamp

# Lines 73-74: Link to meeting
meeting = await meetings_controller.get_by_room_name(room_name)
room = await rooms_controller.get_by_id(meeting.room_id)
```

**What to Add:** Consent checking after transcript processing - always create transcript first, then delete only audio files if consent denied.

### File: `server/reflector/worker/app.py`

**Purpose:** Celery task scheduling

**Key Schedule:**
- **Lines 26-29:** `process_messages` runs every 60 seconds
- **Lines 30-33:** `process_meetings` runs every 60 seconds to check meeting status

**Reality:** consent must be requested during the meeting, not based on recording detection.

---

## 4. Meeting-Based Consent Timing

### File: `server/reflector/views/whereby.py`

**Purpose:** Whereby webhook handler - receives participant join/leave events

**Key Areas:**
- **Lines 69-72:** Handles `room.client.joined` and `room.client.left` events
- **Line 71:** Updates `num_clients` count in meeting record

**Current Logic:**
```python
# Lines 69-72: Participant tracking
if event.type in ["room.client.joined", "room.client.left"]:
    await meetings_controller.update_meeting(
        meeting.id, num_clients=event.data["numClients"]
    )
```

**What to Add:** ALWAYS ask for consent - no triggers, no conditions. Simple list field to track who denied consent.

### File: `server/reflector/db/meetings.py`

**Purpose:** Meeting database model and recording configuration

**Key Recording Config:**
- **Lines 56-59:** Recording trigger options:
  - `"automatic"` - Recording starts immediately
  - `"automatic-2nd-participant"` (default) - Recording starts when 2nd person joins
  - `"prompt"` - Manual recording start
  - `"none"` - No recording

**Current Meeting Model:**
```python
# Lines 56-59: Recording configuration
recording_type: Literal["none", "local", "cloud"] = "cloud"
recording_trigger: Literal[
    "none", "prompt", "automatic", "automatic-2nd-participant"
] = "automatic-2nd-participant"
```

**What to Add:** Dictionary field `participant_consent_responses: dict[str, bool]` in Meeting model to store {user_id: true/false}. ALWAYS ask for consent - no complex logic.

---

## 5. Consent Implementation (NO WebSockets Needed)

**Consent is meeting-level, not transcript-level** - WebSocket events are for transcript processing, not consent.

### Simple Consent Flow:
1. **Frontend**: Show consent dialog when meeting loads
2. **User Response**: Direct API call to `/meetings/{meeting_id}/consent`
3. **Backend**: Store response in meeting record
4. **SQS Processing**: Check consent during recording processing

**No WebSocket events needed** - consent is a simple API interaction, not real-time transcript data.
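
The storage and lookup steps of this flow can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's code: `MeetingRecord`, `record_consent`, and `should_delete_audio` are hypothetical names, and only the `participant_consent_responses: dict[str, bool]` field comes from this guide. It also assumes that a single denial is enough to mark the audio for deletion.

```python
from dataclasses import dataclass, field


@dataclass
class MeetingRecord:
    """Hypothetical stand-in for the Meeting model (only the consent field is from the guide)."""

    id: str
    # Maps user_id -> True (consent given) / False (consent denied)
    participant_consent_responses: dict[str, bool] = field(default_factory=dict)


def record_consent(meeting: MeetingRecord, user_id: str, consent_given: bool) -> None:
    """Store (or overwrite) one participant's answer on the meeting record."""
    meeting.participant_consent_responses[user_id] = consent_given


def should_delete_audio(meeting: MeetingRecord) -> bool:
    """Assumed policy: delete audio if at least one participant denied consent."""
    return any(not given for given in meeting.participant_consent_responses.values())
```

Because responses are keyed by `user_id`, a participant who changes their answer simply overwrites the earlier one, and the check during SQS processing only has to read the final state of the dictionary.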

---

## 6. Backend WebSocket System

### File: `server/reflector/views/transcripts_websocket.py`

**Purpose:** Server-side WebSocket endpoint for real-time events

**Key Areas:**
- **Lines 19-55:** `transcript_events_websocket` function
- **Line 32:** Room ID format: `room_id = f"ts:{transcript_id}"`
- **Lines 37-44:** Initial event sending to new connections
- **Lines 42-43:** Filtering events: `if name in ("TRANSCRIPT", "STATUS"): continue`

**Current Flow:**
1. WebSocket connects to `/transcripts/{transcript_id}/events`
2. Server adds user to Redis room `ts:{transcript_id}`
3. Server sends historical events (except TRANSCRIPT/STATUS)
4. Server waits for new events via Redis pub/sub

**What to Add:** Handle new consent events in the message flow.

### File: `server/reflector/ws_manager.py`

**Purpose:** Redis pub/sub WebSocket management

**Key Areas:**
- **Lines 61-99:** `WebsocketManager` class
- **Lines 78-79:** `send_json` method for broadcasting
- **Lines 88-98:** `_pubsub_data_reader` for distributing messages

**Broadcasting Pattern:**
```python
# Line 78: How to broadcast to all users in a room
async def send_json(self, room_id: str, message: dict) -> None:
    await self.pubsub_client.send_json(room_id, message)
```

**What to Use:** This system for broadcasting consent requests and responses.

---

## 7. Database Models and Migrations

### File: `server/reflector/db/transcripts.py`

**Purpose:** Transcript database model and controller

**Key Areas:**
- **Lines 28-73:** `transcripts` SQLAlchemy table definition
- **Lines 149-172:** `Transcript` Pydantic model
- **Lines 304-614:** `TranscriptController` class with database operations

**Current Schema Fields:**
```python
# Lines 31-72: Key existing columns
sqlalchemy.Column("id", sqlalchemy.String, primary_key=True),
sqlalchemy.Column("status", sqlalchemy.String),
sqlalchemy.Column("duration", sqlalchemy.Integer),
sqlalchemy.Column("locked", sqlalchemy.Boolean),
sqlalchemy.Column("audio_location", sqlalchemy.String, server_default="local"),
# ... more columns
```

**Audio File Management:**
- **Lines 225-230:** Audio file path properties
- **Lines 252-284:** `get_audio_url` method for accessing audio
- **Lines 554-571:** `move_mp3_to_storage` for cloud storage

**What to Add:** New columns for consent tracking and deletion marking.

### File: `server/migrations/versions/b9348748bbbc_reviewed.py`

**Purpose:** Example migration pattern for adding boolean columns

**Pattern:**
```python
# Lines 20-23: Adding boolean column with default
def upgrade() -> None:
    op.add_column('transcript', sa.Column('reviewed', sa.Boolean(), server_default=sa.text('0'), nullable=False))

def downgrade() -> None:
    op.drop_column('transcript', 'reviewed')
```

**What to Follow:** This pattern for adding consent columns.

---

## 8. API Endpoint Patterns

### File: `server/reflector/views/transcripts.py`

**Purpose:** REST API endpoints for transcript operations

**Key Areas:**
- **Lines 29-30:** Router setup: `router = APIRouter()`
- **Lines 70-85:** `CreateTranscript` and `UpdateTranscript` models
- **Lines 122-135:** Example POST endpoint: `transcripts_create`

**Endpoint Pattern:**
```python
# Lines 122-135: Standard endpoint structure
@router.post("/transcripts", response_model=GetTranscript)
async def transcripts_create(
    info: CreateTranscript,
    user: Annotated[Optional[auth.UserInfo], Depends(auth.current_user_optional)],
):
    user_id = user["sub"] if user else None
    return await transcripts_controller.add(...)
```

**Authentication Pattern:**
- **Line 125:** Optional user authentication dependency
- **Line 127:** Extract user ID: `user_id = user["sub"] if user else None`

**What to Follow:** This pattern for new consent endpoint.

---

## 9. Live Pipeline System

### File: `server/reflector/pipelines/main_live_pipeline.py`

**Purpose:** Real-time processing pipeline during recording

**Key Areas:**
- **Lines 80-96:** `@broadcast_to_sockets` decorator for WebSocket events
- **Lines 98-104:** `@get_transcript` decorator for database access
- **Line 56:** WebSocket manager import: `from reflector.ws_manager import get_ws_manager`

**Event Broadcasting Pattern:**
```python
# Lines 80-95: Decorator for broadcasting events
def broadcast_to_sockets(func):
    async def wrapper(self, *args, **kwargs):
        resp = await func(self, *args, **kwargs)
        if resp is None:
            return
        await self.ws_manager.send_json(
            room_id=self.ws_room_id,
            message=resp.model_dump(mode="json"),
        )
    return wrapper
```

---

## 10. Modal/Dialog Patterns

### File: `www/app/(app)/transcripts/[transcriptId]/shareModal.tsx`

**Purpose:** Example modal implementation using fixed overlay

**Key Areas:**
- **Lines 105-176:** Modal implementation using `fixed inset-0` overlay
- **Lines 107-108:** Overlay styling: `fixed inset-0 bg-gray-600 bg-opacity-50`
- **Lines 152-170:** Button patterns for actions

**Modal Structure:**
```typescript
// Lines 105-109: Modal overlay and container
{props.show && (
  // Modal content...
)}
```

### File: `www/app/(app)/transcripts/shareAndPrivacy.tsx`

**Purpose:** Example using Chakra UI Modal components

**Key Areas:**
- **Lines 10-16:** Chakra UI Modal imports
- **Lines 86-100:** Chakra Modal structure

**Chakra Modal Pattern:**
```typescript
// Lines 86-94: Chakra UI Modal structure
<Modal isOpen={showModal} onClose={() => setShowModal(false)} size={"xl"}>
  Share
  // Modal content...
</Modal>
```

**What to Choose:** Either pattern works - fixed overlay for simple cases, Chakra UI for consistent styling.

---

## 11. Audio File Management

### File: `server/reflector/db/transcripts.py`

**Purpose:** Audio file storage and access

**Key Methods:**
- **Lines 225-230:** File path properties
  - `audio_wav_filename`: Local WAV file path
  - `audio_mp3_filename`: Local MP3 file path
  - `storage_audio_path`: Cloud storage path
- **Lines 252-284:** `get_audio_url()` - Generate access URL
- **Lines 554-571:** `move_mp3_to_storage()` - Move to cloud
- **Lines 572-580:** `download_mp3_from_storage()` - Download from cloud

**File Path Properties:**
```python
# Lines 225-230: Audio file locations
@property
def audio_wav_filename(self):
    return self.data_path / "audio.wav"

@property
def audio_mp3_filename(self):
    return self.data_path / "audio.mp3"
```

**Storage Logic:**
- **Line 253:** Local files: `if self.audio_location == "local"`
- **Line 255:** Cloud storage: `elif self.audio_location == "storage"`

**What to Modify:** Add deletion logic and update `get_audio_url` to handle deleted files.

---

## 12. Review Checklist

Before implementing, manually review these areas with the **meeting-based consent** approach:

### Frontend Changes
- [ ] **Room Entry**: Remove consent blocking in `www/app/[roomName]/page.tsx:80-124`
- [ ] **Meeting UI**: Add consent dialog overlay on `whereby-embed` in `www/app/[roomName]/page.tsx:126+`
- [ ] **Meeting Hook**: Update `www/app/[roomName]/useRoomMeeting.tsx` to provide meeting data for consent
- [ ] **WebSocket Events**: Add consent event handlers (meeting-based, not transcript-based)
- [ ] **User Identification**: Add browser fingerprinting for anonymous users

### Backend Changes - Meeting Scope
- [ ] **Database**: Create `meeting_consent` table migration following `server/migrations/versions/b9348748bbbc_reviewed.py` pattern
- [ ] **Meeting Model**: Add consent tracking in `server/reflector/db/meetings.py`
- [ ] **Recording Model**: Add deletion flags in `server/reflector/db/recordings.py`
- [ ] **API**: Add meeting consent endpoint in `server/reflector/views/meetings.py`
- [ ] **Whereby Webhook**: Update `server/reflector/views/whereby.py` to trigger consent based on participant count
- [ ] **SQS Processing**: Update `server/reflector/worker/process.py` to check consent before processing recordings

### Critical Integration Points
- [ ] **Consent Timing**: ALWAYS ask for consent - no conditions, no triggers, no participant count checks
- [ ] **SQS Processing**: Always create transcript first, then delete only audio files if consent denied
- [ ] **Meeting Scoping**: All consent tracking uses `meeting_id`, not `room_id` (rooms are reused)
- [ ] **Post-Processing Only**: No real-time recording control - all intervention happens during SQS processing

### Testing Strategy
- [ ] **Multiple Participants**: Test consent collection from multiple users in same meeting
- [ ] **Room Reuse**: Verify consent doesn't affect other meetings in same room
- [ ] **Recording Triggers**: Test different `recording_trigger` configurations
- [ ] **SQS Deletion**: Verify recordings are deleted from S3 when consent denied
- [ ] **Timing Edge Cases**: Test consent given after recording already started

**Reality Check**: This implementation works with **post-processing deletion only**. We cannot stop recordings in progress or detect exactly when they start. Consent timing is estimated based on meeting configuration and participant events.
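
The post-processing deletion step can be illustrated with a small sketch for the local audio files. It rests on several assumptions: `delete_audio_if_denied` and `AUDIO_FILENAMES` are hypothetical names, the file names `audio.wav` and `audio.mp3` are taken from the path properties quoted earlier in this guide, any single denial triggers deletion, and a meeting with no recorded responses keeps its audio. Deleting the copy in cloud storage is not covered here.

```python
from pathlib import Path

# Local audio file names as listed in this guide's path properties.
AUDIO_FILENAMES = ("audio.wav", "audio.mp3")


def delete_audio_if_denied(data_path: Path, consent_responses: dict[str, bool]) -> list[str]:
    """Remove local audio files when any participant denied consent.

    Returns the names of the files that were actually removed. Everything
    else in data_path (for example the transcript) is left untouched.
    """
    # Assumption: no denial (including no responses at all) means keep the audio.
    if all(consent_responses.values()):
        return []

    deleted = []
    for name in AUDIO_FILENAMES:
        audio_file = data_path / name
        if audio_file.exists():
            audio_file.unlink()
            deleted.append(name)
    return deleted
```

Running this only after the transcript has been written keeps the "transcript first, audio deletion second" ordering from the checklist: the function never touches anything except the two audio files.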