reflector/GUIDE.md at 7bb2962f94944bbf5e0296ec815acc271c96717b

mirror of https://github.com/Monadical-SAS/reflector.git synced 2025-12-20 20:29:06 +00:00

Igor Loskutov 7bb2962f94 consent preparation

2025-06-17 12:18:41 -04:00

Codebase Review Guide: Audio Storage Consent Implementation

This guide walks through the parts of the codebase relevant to implementing the audio storage consent flow. Important: because of Whereby integration constraints, this implementation relies on post-processing deletion, not real-time recording control.

System Reality: Recording Detection Constraints

Critical Understanding:

No real-time recording detection - System only discovers recordings after they complete via SQS polling (60+ second delay)
Cannot stop recordings in progress - Whereby controls recording entirely based on room configuration
Limited webhooks - Only room.client.joined/left events available, no recording events
Post-processing intervention only - Can only mark recordings for deletion during SQS processing

1. Current Consent Implementation (TO BE REMOVED)

File: `www/app/[roomName]/page.tsx`

Purpose: Room entry page with blocking consent dialog

Key Areas:

Line 24: const [consentGiven, setConsentGiven] = useState<boolean | null>(null);
Lines 34-36: handleConsent function that sets consent state
Lines 80-124: Consent UI blocking room entry
Line 80: if (!isAuthenticated && !consentGiven) - blocking condition

Current Logic:

// Lines 99-111: Consent request UI
{consentGiven === null ? (
  <>
    <Text fontSize="lg" fontWeight="bold">
      This meeting may be recorded. Do you consent to being recorded?
    </Text>
    <HStack spacing={4}>
      <Button variant="outline" onClick={() => handleConsent(false)}>
        No, I do not consent
      </Button>
      <Button colorScheme="blue" onClick={() => handleConsent(true)}>
        Yes, I consent
      </Button>
    </HStack>
  </>
) : (
  // Lines 114-120: Rejection message
  <Text>You cannot join the meeting without consenting...</Text>
)}

What to Change: Remove the entire consent-blocking logic and allow direct room entry.

2. Whereby Integration Reality

File: `www/app/[roomName]/page.tsx`

Purpose: Main room page where video call happens via whereby-embed

Key Whereby Integration:

Line 129: <whereby-embed> element - this IS the video call
Lines 26-28: Room URL from meeting API
Lines 48-57: Event listeners for whereby events

What Happens:

useRoomMeeting() calls backend to create/get Whereby meeting
Whereby automatically records based on room recording_trigger configuration
NO real-time recording status - system doesn't know when recording starts/stops

File: `www/app/[roomName]/useRoomMeeting.tsx`

Purpose: Creates or retrieves Whereby meeting for room

Key Flow:

Line 48: Calls v1RoomsCreateMeeting({ roomName })
Lines 49-52: Returns meeting with room_url and host_room_url
Meeting includes recording configuration from room settings

What to Add: Consent dialog overlay on the whereby-embed - always ask for consent regardless of meeting configuration (simplified approach).

3. Recording Discovery System (POST-PROCESSING ONLY)

File: `server/reflector/worker/process.py`

Purpose: Discovers recordings after they complete via SQS polling

Key Areas:

Lines 24-62: process_messages() - polls SQS every 60 seconds
Lines 66-133: process_recording() - processes discovered recording files
Lines 69-71: Extracts meeting info from S3 object key format

Current Discovery Flow:

# Lines 69-71: Parse S3 object key
room_name = f"/{object_key[:36]}"  # First 36 chars = room GUID
recorded_at = datetime.fromisoformat(object_key[37:57])  # Timestamp

# Lines 73-74: Link to meeting
meeting = await meetings_controller.get_by_room_name(room_name)
room = await rooms_controller.get_by_id(meeting.room_id)

What to Add: Consent checking after transcript processing - always create transcript first, then delete only audio files if consent denied.
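The fixed-offset slicing in the discovery flow above can be tried in isolation. The following is a minimal sketch, not the code in `process.py`: it assumes the object key is a 36-character room GUID, one separator character, and a 20-character ISO timestamp, followed by some suffix. The helper name `parse_recording_key`, the sample key, and its `.mp4` suffix are hypothetical.

```python
from datetime import datetime
from typing import NamedTuple


class RecordingKey(NamedTuple):
    room_name: str
    recorded_at: datetime


def parse_recording_key(object_key: str) -> RecordingKey:
    """Split an S3 object key into a room name and a recording timestamp.

    Assumed layout (hypothetical): 36-char GUID, one separator character,
    a 20-char ISO-8601 timestamp, then an arbitrary suffix.
    """
    if len(object_key) < 57:
        raise ValueError(f"object key too short: {object_key!r}")
    room_name = f"/{object_key[:36]}"  # same slice as the guide's excerpt
    raw_timestamp = object_key[37:57]
    # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11,
    # so normalise it to an explicit UTC offset first.
    if raw_timestamp.endswith("Z"):
        raw_timestamp = raw_timestamp[:-1] + "+00:00"
    return RecordingKey(room_name, datetime.fromisoformat(raw_timestamp))


# Hypothetical key, for illustration only.
sample_key = "123e4567-e89b-12d3-a456-426614174000-2025-06-17T12:18:41Z.mp4"
parsed = parse_recording_key(sample_key)
```

Validating the key length up front makes a malformed key fail loudly instead of producing a truncated room name that matches no meeting.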

File: `server/reflector/worker/app.py`

Purpose: Celery task scheduling

Key Schedule:

Lines 26-29: process_messages runs every 60 seconds
Lines 30-33: process_meetings runs every 60 seconds to check meeting status

Reality: consent must be requested during the meeting, not based on recording detection.
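For orientation, the two periodic tasks listed for `worker/app.py` could be written as a Celery `beat_schedule`-style mapping like the one below. This is only a sketch of the shape: the dotted task paths are assumptions derived from the file names in this guide, and the real schedule may be declared differently.

```python
# Shape of a Celery ``beat_schedule`` mapping. In a real app it would be
# assigned to ``app.conf.beat_schedule``; here it is a plain dict so the
# sketch runs without Celery installed.
POLL_INTERVAL_SECONDS = 60.0

beat_schedule = {
    "process_messages": {
        # Assumed task path (hypothetical): polls SQS for finished recordings.
        "task": "reflector.worker.process.process_messages",
        "schedule": POLL_INTERVAL_SECONDS,
    },
    "process_meetings": {
        # Assumed task path (hypothetical): refreshes meeting status.
        "task": "reflector.worker.process.process_meetings",
        "schedule": POLL_INTERVAL_SECONDS,
    },
}
```

Because discovery only happens on these ticks, a recording can sit unnoticed for up to one polling interval after it lands, which is why consent cannot be tied to recording detection.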

4. Meeting-Based Consent Timing

File: `server/reflector/views/whereby.py`

Purpose: Whereby webhook handler - receives participant join/leave events

Key Areas:

Lines 69-72: Handles room.client.joined and room.client.left events
Line 71: Updates num_clients count in meeting record

Current Logic:

# Lines 69-72: Participant tracking
if event.type in ["room.client.joined", "room.client.left"]:
    await meetings_controller.update_meeting(
        meeting.id, num_clients=event.data["numClients"]
    )

What to Add: ALWAYS ask for consent - no triggers, no conditions. Simple list field to track who denied consent.

File: `server/reflector/db/meetings.py`

Purpose: Meeting database model and recording configuration

Key Recording Config:

Lines 56-59: Recording trigger options:
- "automatic" - Recording starts immediately
- "automatic-2nd-participant" (default) - Recording starts when 2nd person joins
- "prompt" - Manual recording start
- "none" - No recording

Current Meeting Model:

# Lines 56-59: Recording configuration
recording_type: Literal["none", "local", "cloud"] = "cloud"
recording_trigger: Literal[
    "none", "prompt", "automatic", "automatic-2nd-participant"
] = "automatic-2nd-participant"

What to Add: Dictionary field participant_consent_responses: dict[str, bool] in Meeting model to store {user_id: true/false}. ALWAYS ask for consent - no complex logic.
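One possible shape for the consent field is sketched below with a plain dataclass rather than the project's actual Meeting model. The class name `MeetingConsent` and its helper methods are hypothetical, and the rule that a single denial is enough to require audio deletion is an assumption, not something the codebase confirms.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MeetingConsent:
    """Stand-in for the consent-related part of the Meeting model."""

    meeting_id: str
    # Maps user_id -> True (consented) / False (denied).
    participant_consent_responses: Dict[str, bool] = field(default_factory=dict)

    def record_response(self, user_id: str, consent_given: bool) -> None:
        # A later answer from the same user overwrites the earlier one.
        self.participant_consent_responses[user_id] = consent_given

    def denied_user_ids(self) -> List[str]:
        return [
            user_id
            for user_id, consented in self.participant_consent_responses.items()
            if not consented
        ]

    def audio_must_be_deleted(self) -> bool:
        # Assumption: one denial is enough to require deleting the audio.
        return bool(self.denied_user_ids())


meeting = MeetingConsent(meeting_id="meeting-1")
meeting.record_response("user-a", True)
meeting.record_response("user-b", False)
```

Keying the dictionary by `user_id` keeps one answer per participant, so a user who changes their mind simply replaces their earlier response.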

5. Consent Implementation (NO WebSockets Needed)

Consent is meeting-level, not transcript-level - WebSocket events are for transcript processing, not consent.

Simple Consent Flow:

Frontend: Show consent dialog when meeting loads
User Response: Direct API call to /meetings/{meeting_id}/consent
Backend: Store response in meeting record
SQS Processing: Check consent during recording processing

No WebSocket events needed - consent is a simple API interaction, not real-time transcript data.
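The flow above can be sketched end to end without any web framework. In this illustration an in-memory dictionary stands in for the meeting table, and `submit_consent` and `plan_recording_processing` are hypothetical names. The decision rule (any explicit denial triggers audio deletion, while missing answers do not) is an assumption.

```python
from typing import Dict, List

# In-memory stand-in for the meeting table: meeting_id -> {user_id: consent}.
_consent_store: Dict[str, Dict[str, bool]] = {}


def submit_consent(meeting_id: str, user_id: str, consent_given: bool) -> Dict[str, object]:
    """What a POST /meetings/{meeting_id}/consent handler might do."""
    _consent_store.setdefault(meeting_id, {})[user_id] = consent_given
    return {"meeting_id": meeting_id, "user_id": user_id, "consent_given": consent_given}


def plan_recording_processing(meeting_id: str) -> List[str]:
    """Return the ordered steps the SQS worker would take for one recording."""
    steps = ["create_transcript"]  # the transcript is always produced first
    responses = _consent_store.get(meeting_id, {})
    # Assumption: any explicit denial triggers deletion; silence does not.
    if any(answer is False for answer in responses.values()):
        steps.append("delete_audio")
    return steps


submit_consent("meeting-1", "alice", True)
submit_consent("meeting-1", "bob", False)
submit_consent("meeting-2", "carol", True)
```

Because responses are keyed by meeting rather than by room, a denial in `meeting-1` has no effect on `meeting-2`, even if both took place in the same reused room.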

4. Backend WebSocket System

File: `server/reflector/views/transcripts_websocket.py`

Purpose: Server-side WebSocket endpoint for real-time events

Key Areas:

Lines 19-55: transcript_events_websocket function
Line 32: Room ID format: room_id = f"ts:{transcript_id}"
Lines 37-44: Initial event sending to new connections
Lines 42-43: Filtering events: if name in ("TRANSCRIPT", "STATUS"): continue

Current Flow:

WebSocket connects to /transcripts/{transcript_id}/events
Server adds user to Redis room ts:{transcript_id}
Server sends historical events (except TRANSCRIPT/STATUS)
Server waits for new events via Redis pub/sub

What to Add: Handle new consent events in the message flow.
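The replay step in the current flow amounts to a filter over stored events. A minimal sketch follows, assuming each stored event is a dictionary with an `"event"` name key. That shape, the helper names, and the sample event names other than TRANSCRIPT and STATUS are hypothetical.

```python
from typing import Dict, Iterable, Iterator

# Event names that are not replayed to a newly connected client.
SKIPPED_ON_REPLAY = ("TRANSCRIPT", "STATUS")


def room_id_for(transcript_id: str) -> str:
    """Build the room ID in the format the guide quotes."""
    return f"ts:{transcript_id}"


def events_to_replay(events: Iterable[Dict[str, object]]) -> Iterator[Dict[str, object]]:
    """Yield the historical events a new connection should receive."""
    for event in events:
        if event["event"] in SKIPPED_ON_REPLAY:
            continue
        yield event


# Hypothetical history, for illustration only.
history = [
    {"event": "TRANSCRIPT", "data": {"text": "hello"}},
    {"event": "TOPIC", "data": {"title": "Intro"}},
    {"event": "STATUS", "data": {"value": "processing"}},
    {"event": "DURATION", "data": {"duration": 12.5}},
]
replayed_names = [event["event"] for event in events_to_replay(history)]
```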

File: `server/reflector/ws_manager.py`

Purpose: Redis pub/sub WebSocket management

Key Areas:

Lines 61-99: WebsocketManager class
Lines 78-79: send_json method for broadcasting
Lines 88-98: _pubsub_data_reader for distributing messages

Broadcasting Pattern:

# Line 78: How to broadcast to all users in a room
async def send_json(self, room_id: str, message: dict) -> None:
    await self.pubsub_client.send_json(room_id, message)

What to Use: This system for broadcasting consent requests and responses.

5. Database Models and Migrations

File: `server/reflector/db/transcripts.py`

Purpose: Transcript database model and controller

Key Areas:

Lines 28-73: transcripts SQLAlchemy table definition
Lines 149-172: Transcript Pydantic model
Lines 304-614: TranscriptController class with database operations

Current Schema Fields:

# Lines 31-72: Key existing columns
sqlalchemy.Column("id", sqlalchemy.String, primary_key=True),
sqlalchemy.Column("status", sqlalchemy.String),
sqlalchemy.Column("duration", sqlalchemy.Integer),
sqlalchemy.Column("locked", sqlalchemy.Boolean),
sqlalchemy.Column("audio_location", sqlalchemy.String, server_default="local"),
# ... more columns

Audio File Management:

Lines 225-230: Audio file path properties
Lines 252-284: get_audio_url method for accessing audio
Lines 554-571: move_mp3_to_storage for cloud storage

What to Add: New columns for consent tracking and deletion marking.

File: `server/migrations/versions/b9348748bbbc_reviewed.py`

Purpose: Example migration pattern for adding boolean columns

Pattern:

# Lines 20-23: Adding boolean column with default
def upgrade() -> None:
    op.add_column('transcript', sa.Column('reviewed', sa.Boolean(), 
                 server_default=sa.text('0'), nullable=False))

def downgrade() -> None:
    op.drop_column('transcript', 'reviewed')

What to Follow: This pattern for adding consent columns.

6. API Endpoint Patterns

File: `server/reflector/views/transcripts.py`

Purpose: REST API endpoints for transcript operations

Key Areas:

Lines 29-30: Router setup: router = APIRouter()
Lines 70-85: CreateTranscript and UpdateTranscript models
Lines 122-135: Example POST endpoint: transcripts_create

Endpoint Pattern:

# Lines 122-135: Standard endpoint structure
@router.post("/transcripts", response_model=GetTranscript)
async def transcripts_create(
    info: CreateTranscript,
    user: Annotated[Optional[auth.UserInfo], Depends(auth.current_user_optional)],
):
    user_id = user["sub"] if user else None
    return await transcripts_controller.add(...)

Authentication Pattern:

Line 125: Optional user authentication dependency
Line 127: Extract user ID: user_id = user["sub"] if user else None

What to Follow: This pattern for new consent endpoint.

7. Live Pipeline System

File: `server/reflector/pipelines/main_live_pipeline.py`

Purpose: Real-time processing pipeline during recording

Key Areas:

Lines 80-96: @broadcast_to_sockets decorator for WebSocket events
Lines 98-104: @get_transcript decorator for database access
Line 56: WebSocket manager import: from reflector.ws_manager import get_ws_manager

Event Broadcasting Pattern:

# Lines 80-95: Decorator for broadcasting events
def broadcast_to_sockets(func):
    async def wrapper(self, *args, **kwargs):
        resp = await func(self, *args, **kwargs)
        if resp is None:
            return
        await self.ws_manager.send_json(
            room_id=self.ws_room_id,
            message=resp.model_dump(mode="json"),
        )
    return wrapper
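To see the decorator pattern above in action without Redis or Pydantic, the sketch below wires it to two stand-ins: `RecordingWsManager` records broadcasts in a list, and `Event` mimics only the `model_dump()` method. Both classes, the `Pipeline` class, and the `on_status` method are hypothetical. `functools.wraps` is an addition that the quoted excerpt does not show.

```python
import asyncio
import functools
from typing import Any, Dict, List, Optional, Tuple


class RecordingWsManager:
    """Test double that records broadcasts instead of publishing to Redis."""

    def __init__(self) -> None:
        self.sent: List[Tuple[str, Dict[str, Any]]] = []

    async def send_json(self, room_id: str, message: Dict[str, Any]) -> None:
        self.sent.append((room_id, message))


class Event:
    """Minimal stand-in for a Pydantic model exposing model_dump()."""

    def __init__(self, event: str, data: Dict[str, Any]) -> None:
        self.event = event
        self.data = data

    def model_dump(self, mode: str = "python") -> Dict[str, Any]:
        return {"event": self.event, "data": self.data}


def broadcast_to_sockets(func):
    @functools.wraps(func)  # keeps the wrapped method's name for debugging
    async def wrapper(self, *args, **kwargs):
        resp = await func(self, *args, **kwargs)
        if resp is None:
            return  # nothing to broadcast
        await self.ws_manager.send_json(
            room_id=self.ws_room_id,
            message=resp.model_dump(mode="json"),
        )

    return wrapper


class Pipeline:
    def __init__(self, transcript_id: str, ws_manager: RecordingWsManager) -> None:
        self.ws_room_id = f"ts:{transcript_id}"
        self.ws_manager = ws_manager

    @broadcast_to_sockets
    async def on_status(self, status: Optional[str]) -> Optional[Event]:
        if status is None:
            return None
        return Event("STATUS", {"value": status})


manager = RecordingWsManager()
pipeline = Pipeline("abc", manager)
asyncio.run(pipeline.on_status("processing"))  # broadcast once
asyncio.run(pipeline.on_status(None))          # skipped: handler returned None
```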

8. Modal/Dialog Patterns

File: `www/app/(app)/transcripts/[transcriptId]/shareModal.tsx`

Purpose: Example modal implementation using fixed overlay

Key Areas:

Lines 105-176: Modal implementation using fixed inset-0 overlay
Lines 107-108: Overlay styling: fixed inset-0 bg-gray-600 bg-opacity-50
Lines 152-170: Button patterns for actions

Modal Structure:

// Lines 105-109: Modal overlay and container
<div className="absolute">
  {props.show && (
    <div className="fixed inset-0 bg-gray-600 bg-opacity-50 overflow-y-auto h-full w-full z-50">
      <div className="relative top-20 mx-auto p-5 w-96 shadow-lg rounded-md bg-white">
        {/* Modal content... */}
      </div>
    </div>
  )}
</div>

File: `www/app/(app)/transcripts/shareAndPrivacy.tsx`

Purpose: Example using Chakra UI Modal components

Key Areas:

Lines 10-16: Chakra UI Modal imports
Lines 86-100: Chakra Modal structure

Chakra Modal Pattern:

// Lines 86-94: Chakra UI Modal structure
<Modal isOpen={!!showModal} onClose={() => setShowModal(false)} size={"xl"}>
  <ModalOverlay />
  <ModalContent>
    <ModalHeader>Share</ModalHeader>
    <ModalBody>
      {/* Modal content... */}
    </ModalBody>
  </ModalContent>
</Modal>

What to Choose: Either pattern works - fixed overlay for simple cases, Chakra UI for consistent styling.

9. Audio File Management

File: `server/reflector/db/transcripts.py`

Purpose: Audio file storage and access

Key Methods:

Lines 225-230: File path properties
- audio_wav_filename: Local WAV file path
- audio_mp3_filename: Local MP3 file path
- storage_audio_path: Cloud storage path
Lines 252-284: get_audio_url() - Generate access URL
Lines 554-571: move_mp3_to_storage() - Move to cloud
Lines 572-580: download_mp3_from_storage() - Download from cloud

File Path Properties:

# Lines 225-230: Audio file locations
@property
def audio_wav_filename(self):
    return self.data_path / "audio.wav"

@property  
def audio_mp3_filename(self):
    return self.data_path / "audio.mp3"

Storage Logic:

Line 253: Local files: if self.audio_location == "local"
Line 255: Cloud storage: elif self.audio_location == "storage"

What to Modify: Add deletion logic and update get_audio_url to handle deleted files.
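One way the deletion handling could look is sketched below on a small stand-in class, not the real Transcript model. The `audio_deleted` flag and the `delete_local_audio` method are hypothetical, the local URL is simplified to a file path, and the cloud-storage branch is deliberately left out.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class TranscriptAudio:
    data_path: Path
    audio_location: str = "local"  # "local" or "storage", as in the guide
    audio_deleted: bool = False    # hypothetical new flag

    @property
    def audio_wav_filename(self) -> Path:
        return self.data_path / "audio.wav"

    @property
    def audio_mp3_filename(self) -> Path:
        return self.data_path / "audio.mp3"

    def delete_local_audio(self) -> None:
        """Remove local audio files and remember that they are gone."""
        for path in (self.audio_wav_filename, self.audio_mp3_filename):
            if path.exists():
                path.unlink()
        self.audio_deleted = True

    def get_audio_url(self) -> Optional[str]:
        if self.audio_deleted:
            return None  # callers must handle "no audio available"
        if self.audio_location == "local":
            return str(self.audio_mp3_filename)
        # Cloud case left out: the real method would ask the storage backend.
        raise NotImplementedError("storage lookup is outside this sketch")


with tempfile.TemporaryDirectory() as tmp:
    audio = TranscriptAudio(data_path=Path(tmp))
    audio.audio_mp3_filename.write_bytes(b"fake mp3 bytes")
    url_before = audio.get_audio_url()
    audio.delete_local_audio()
    url_after = audio.get_audio_url()
    file_still_exists = audio.audio_mp3_filename.exists()
```

Returning `None` rather than raising keeps the transcript itself usable after the audio is removed; the frontend can hide the player when no URL comes back.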

10. Review Checklist

Before implementing, manually review these areas with the meeting-based consent approach:

Frontend Changes

Room Entry: Remove consent blocking in www/app/[roomName]/page.tsx:80-124
Meeting UI: Add consent dialog overlay on whereby-embed in www/app/[roomName]/page.tsx:126+
Meeting Hook: Update www/app/[roomName]/useRoomMeeting.tsx to provide meeting data for consent
Consent Submission: Send the user's response with a direct API call to /meetings/{meeting_id}/consent (meeting-based, not transcript-based; no WebSocket events needed)
User Identification: Add browser fingerprinting for anonymous users

Backend Changes - Meeting Scope

Database: Create meeting_consent table migration following server/migrations/versions/b9348748bbbc_reviewed.py pattern
Meeting Model: Add consent tracking in server/reflector/db/meetings.py
Recording Model: Add deletion flags in server/reflector/db/recordings.py
API: Add meeting consent endpoint in server/reflector/views/meetings.py
Whereby Webhook: Keep server/reflector/views/whereby.py limited to participant tracking; consent is always requested and is not triggered by participant count
SQS Processing: Update server/reflector/worker/process.py to check consent during recording processing, after the transcript is created

Critical Integration Points

Consent Timing: ALWAYS ask for consent - no conditions, no triggers, no participant count checks
SQS Processing: Always create transcript first, then delete only audio files if consent denied
Meeting Scoping: All consent tracking uses meeting_id, not room_id (rooms are reused)
Post-Processing Only: No real-time recording control - all intervention happens during SQS processing

Testing Strategy

Multiple Participants: Test consent collection from multiple users in same meeting
Room Reuse: Verify consent doesn't affect other meetings in same room
Recording Triggers: Test different recording_trigger configurations
SQS Deletion: Verify recordings are deleted from S3 when consent denied
Timing Edge Cases: Test consent given after recording already started

Reality Check: This implementation works with post-processing deletion only. We cannot stop recordings in progress or detect exactly when they start, so consent is always requested when the meeting loads rather than timed to recording events.
