reflector/PLAN.md at 5b1f11047c3dbf85649aa1b6ef6bb4a0149221be

mirror of https://github.com/Monadical-SAS/reflector.git synced 2025-12-20 20:29:06 +00:00

Igor Loskutov 91c7c8b83a meeting consent vibe

2025-06-17 16:30:23 -04:00

Overview

Move the consent prompt from room entry to the recording session. The prompt asks specifically about audio storage, and transcription continues regardless of the response.

Implementation Phases

Phase 1: Database Schema Changes

Meeting Consent Table: server/migrations/versions/[timestamp]_add_meeting_consent_table.py

Create new table for meeting-scoped consent (rooms are reused, consent is per-meeting):

import sqlalchemy as sa
from alembic import op


def upgrade() -> None:
    op.create_table('meeting_consent',
        sa.Column('id', sa.String(), nullable=False),
        sa.Column('meeting_id', sa.String(), nullable=False),
        sa.Column('user_identifier', sa.String(), nullable=False),  # IP, session, or user ID
        sa.Column('consent_given', sa.Boolean(), nullable=False),
        sa.Column('consent_timestamp', sa.DateTime(), nullable=False),
        sa.Column('user_agent', sa.String(), nullable=True),  # matches MeetingConsent.user_agent
        sa.PrimaryKeyConstraint('id'),
        sa.ForeignKeyConstraint(['meeting_id'], ['meeting.id']),
    )


def downgrade() -> None:
    op.drop_table('meeting_consent')

Update Models: server/reflector/db/meetings.py and server/reflector/db/recordings.py

# New model for meeting consent
class MeetingConsent(BaseModel):
    id: str = Field(default_factory=generate_uuid4)
    meeting_id: str
    user_identifier: str
    consent_given: bool
    consent_timestamp: datetime
    user_agent: str | None = None

Phase 2: Backend API Changes

New Consent Endpoint: server/reflector/views/meetings.py

Meeting-based consent endpoint (since consent is per meeting session):

class MeetingConsentRequest(BaseModel):
    consent_given: bool
    user_identifier: str  # IP, session ID, or user ID
    
@router.post("/meetings/{meeting_id}/consent")
async def meeting_audio_consent(
    meeting_id: str,
    request: MeetingConsentRequest,
    user_request: Request,
):
    meeting = await meetings_controller.get_by_id(meeting_id)
    if not meeting:
        raise HTTPException(status_code=404, detail="Meeting not found")
        
    # Store consent in meeting_consent table
    consent = MeetingConsent(
        meeting_id=meeting_id,
        user_identifier=request.user_identifier,
        consent_given=request.consent_given,
        consent_timestamp=datetime.utcnow(),
        user_agent=user_request.headers.get("user-agent")
    )
    
    await meeting_consent_controller.create(consent)
    
    # Broadcast consent event via WebSocket to room participants
    ws_manager = get_ws_manager()
    await ws_manager.send_json(
        room_id=f"meeting:{meeting_id}",
        message={
            "event": "CONSENT_RESPONSE",
            "data": {
                "meeting_id": meeting_id,
                "consent_given": request.consent_given,
                "user_identifier": request.user_identifier
            }
        }
    )
    
    return {"status": "success", "consent_id": consent.id}
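
The broadcast payload built in the endpoint above can be factored into a small pure function, which lets the message shape be checked without a live socket. This is only a minimal sketch: `build_consent_event` is a hypothetical helper name, not something that exists in the codebase.

```python
import json
from typing import Any


def build_consent_event(
    meeting_id: str, consent_given: bool, user_identifier: str
) -> tuple[str, dict[str, Any]]:
    """Return (room_id, message) in the shape the consent endpoint broadcasts.

    Hypothetical helper: it only mirrors the payload from the endpoint above.
    """
    room_id = f"meeting:{meeting_id}"
    message = {
        "event": "CONSENT_RESPONSE",
        "data": {
            "meeting_id": meeting_id,
            "consent_given": consent_given,
            "user_identifier": user_identifier,
        },
    }
    # Fail early if the payload is not JSON-serializable.
    json.dumps(message)
    return room_id, message
```

The endpoint could then call `ws_manager.send_json(room_id=room_id, message=message)` with the returned pair.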

Phase 3: WebSocket Event System

Consent Communication: Use direct API calls instead of WebSocket events

Since consent is meeting-level (not transcript-level), use direct API calls:

Frontend shows consent dialog immediately when meeting loads
User response sent directly to /meetings/{meeting_id}/consent endpoint
No need for new WebSocket events - keep it simple

Consent Request: ALWAYS ask - no conditions

// Frontend: Show consent dialog immediately when meeting loads
useEffect(() => {
  if (meeting?.id) {
    // ALWAYS show consent dialog - no conditions
    showConsentDialog(meeting.id);
  }
}, [meeting?.id]);

// Backend: consent is stored in the meeting_consent table
// (a separate table keeps the schema properly normalized)

Phase 4: Frontend Changes

Remove Room Entry Consent: www/app/[roomName]/page.tsx

Remove lines 24, 34-36, 80-124:

// Remove these lines:
const [consentGiven, setConsentGiven] = useState<boolean | null>(null);
const handleConsent = (consent: boolean) => { setConsentGiven(consent); };
// Remove entire consent UI block (lines 80-124)

// Simplify render condition:
if (!isAuthenticated) {
  // Show loading or direct room entry, no consent check
}

Add Consent Dialog Component: www/app/(app)/rooms/audioConsentDialog.tsx

Based on shareModal.tsx patterns:

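import {
  Button, HStack, Modal, ModalBody, ModalContent,
  ModalHeader, ModalOverlay, Text,
} from "@chakra-ui/react";

interface AudioConsentDialogProps {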
  isOpen: boolean;
  onClose: () => void;
  onConsent: (given: boolean) => void;
}

const AudioConsentDialog = ({ isOpen, onClose, onConsent }: AudioConsentDialogProps) => {
  return (
    <Modal isOpen={isOpen} onClose={onClose} closeOnOverlayClick={false}>
      <ModalOverlay />
      <ModalContent>
        <ModalHeader>Audio Storage Consent</ModalHeader>
        <ModalBody>
          <Text mb={4}>
            Do you consent to storing this audio recording? 
            The transcript will be generated regardless of your choice.
          </Text>
          <HStack spacing={4}>
            <Button colorScheme="green" onClick={() => onConsent(true)}>
              Yes, store the audio
            </Button>
            <Button colorScheme="red" onClick={() => onConsent(false)}>
              No, delete after transcription
            </Button>
          </HStack>
        </ModalBody>
      </ModalContent>
    </Modal>
  );
};

Update Recording Interface: www/app/(app)/transcripts/[transcriptId]/record/page.tsx

Add consent dialog state and handling:

const [showConsentDialog, setShowConsentDialog] = useState(false);
const [consentStatus, setConsentStatus] = useState<string>('');

// Add to existing WebSocket event handlers
const handleConsentRequest = () => {
  setShowConsentDialog(true);
};

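const handleConsentResponse = async (consentGiven: boolean) => {
  // Call API endpoint.
  // NOTE: Phase 2 defines the endpoint per meeting (/meetings/{meeting_id}/consent)
  // and its request body also requires user_identifier; this transcript-scoped
  // call has to be aligned with that endpoint.
  await api.v1TranscriptAudioConsent({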
    transcriptId: details.params.transcriptId,
    requestBody: { consent_given: consentGiven }
  });
  setShowConsentDialog(false);
  setConsentStatus(consentGiven ? 'given' : 'denied');
};

Phase 5: SQS Processing Integration

Consent Check During Recording Processing: server/reflector/worker/process.py

Update process_recording() to check consent before processing:

@shared_task
@asynctask
async def process_recording(bucket_name: str, object_key: str):
    logger.info("Processing recording: %s/%s", bucket_name, object_key)

    # Extract meeting info from S3 object key
    room_name = f"/{object_key[:36]}"
    recorded_at = datetime.fromisoformat(object_key[37:57])

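    meeting = await meetings_controller.get_by_room_name(room_name)
    # `room` is needed below for room.user_id; this lookup assumes
    # rooms_controller.get_by_id and meeting.room_id are available
    room = await rooms_controller.get_by_id(meeting.room_id)
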
    recording = await recordings_controller.get_by_object_key(bucket_name, object_key)
    if not recording:
        recording = await recordings_controller.create(
            Recording(
                bucket_name=bucket_name,
                object_key=object_key,
                recorded_at=recorded_at,
                meeting_id=meeting.id
            )
        )
    
    # ALWAYS create transcript first (regardless of consent)
    transcript = await transcripts_controller.get_by_recording_id(recording.id)
    if transcript:
        await transcripts_controller.update(transcript, {"topics": []})
    else:
        transcript = await transcripts_controller.add(
            "", source_kind=SourceKind.ROOM, source_language="en", 
            target_language="en", user_id=room.user_id, 
            recording_id=recording.id, share_mode="public"
        )
    
    # Process transcript normally (transcription, topics, summaries)
    _, extension = os.path.splitext(object_key)
    upload_filename = transcript.data_path / f"upload{extension}"
    # ... continue with full transcript processing ...
    # Check if any participant denied consent using meeting_consent_controller
    should_delete = await meeting_consent_controller.has_any_denial(meeting.id)
    # AFTER transcript processing is complete, delete audio if consent denied
    if should_delete:
        logger.info(f"Deleting audio files for {object_key} due to consent denial")
        await delete_audio_files_only(transcript, bucket_name, object_key)
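
The ordering in `process_recording()` is the core of the plan: the transcript is always produced, and the consent check only decides what happens to the audio afterwards. Below is a minimal, self-contained sketch of that control flow. The real controllers are replaced by hypothetical stand-in callables (`transcribe`, `has_any_denial`, `delete_audio`), which are assumptions for illustration only.

```python
import asyncio
from typing import Awaitable, Callable


async def process_with_consent(
    meeting_id: str,
    transcribe: Callable[[], Awaitable[str]],
    has_any_denial: Callable[[str], Awaitable[bool]],
    delete_audio: Callable[[], Awaitable[None]],
) -> tuple[str, bool]:
    """Transcribe first, then delete audio only if someone denied consent.

    Returns (transcript_text, audio_deleted).
    """
    # Step 1: always transcribe, regardless of consent.
    transcript = await transcribe()

    # Step 2: consult consent only after transcription has finished.
    if await has_any_denial(meeting_id):
        await delete_audio()
        return transcript, True
    return transcript, False


def run_demo(denied: bool) -> tuple[str, bool, list[str]]:
    """Drive the flow with in-memory fakes and record the call order."""
    calls: list[str] = []

    async def transcribe() -> str:
        calls.append("transcribe")
        return "hello world"

    async def has_any_denial(_meeting_id: str) -> bool:
        calls.append("check_consent")
        return denied

    async def delete_audio() -> None:
        calls.append("delete_audio")

    transcript, deleted = asyncio.run(
        process_with_consent("m1", transcribe, has_any_denial, delete_audio)
    )
    return transcript, deleted, calls
```

Calling `run_demo(denied=True)` and `run_demo(denied=False)` exercises both branches; the transcript is returned in either case.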

Audio Deletion Function (AFTER transcript processing):

async def delete_audio_files_only(transcript: Transcript, bucket_name: str, object_key: str):
    """Delete ONLY audio files from all locations, keep transcript data"""
    
    try:
        # 1. Delete original Whereby recording from S3
        s3_whereby = boto3.client(
            "s3",
            aws_access_key_id=settings.AWS_WHEREBY_ACCESS_KEY_ID,
            aws_secret_access_key=settings.AWS_WHEREBY_ACCESS_KEY_SECRET,
        )
        s3_whereby.delete_object(Bucket=bucket_name, Key=object_key)
        logger.info(f"Deleted original Whereby recording: {bucket_name}/{object_key}")
        
        # 2. Delete processed audio from transcript storage S3 bucket
        if transcript.audio_location == "storage":
            storage = get_storage()
            await storage.delete_file(transcript.storage_audio_path)
            logger.info(f"Deleted processed audio from storage: {transcript.storage_audio_path}")
        
        # 3. Delete local audio files (if any remain)
        transcript.audio_mp3_filename.unlink(missing_ok=True)
        transcript.audio_wav_filename.unlink(missing_ok=True)
        (transcript.data_path / "upload.mp4").unlink(missing_ok=True)
        
        # 4. Update transcript to reflect audio deletion (keep all other data)
        await transcripts_controller.update(transcript, {
            'audio_location_deleted': True
        })
        
        logger.info(f"Deleted all audio files for transcript {transcript.id}, kept transcript data")
        
    except Exception as e:
        logger.error(f"Failed to delete audio files for {object_key}: {str(e)}")
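
Step 3 of the function hard-codes `upload.mp4`, while `process_recording()` names the upload after the extension of `object_key`. Below is a minimal sketch of the local cleanup step that derives the upload name the same way and leaves every non-audio file untouched. `delete_local_audio` is a hypothetical helper, and `audio_files` stands in for the transcript's mp3/wav paths.

```python
import os
from pathlib import Path


def delete_local_audio(
    data_path: Path, object_key: str, audio_files: list[Path]
) -> list[Path]:
    """Delete local audio files only; return the paths actually removed.

    Hypothetical helper: the upload file name follows `upload{extension}`,
    with the extension taken from the S3 object key.
    """
    _, extension = os.path.splitext(object_key)
    candidates = [*audio_files, data_path / f"upload{extension}"]

    removed: list[Path] = []
    for path in candidates:
        if path.exists():
            removed.append(path)
        # missing_ok=True keeps the step idempotent when a task is retried.
        path.unlink(missing_ok=True)
    return removed
```

Because missing files are ignored, running the cleanup a second time is a no-op and returns an empty list.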

Meeting Consent Controller: server/reflector/db/meeting_consent.py

class MeetingConsentController:
    async def create(self, consent: MeetingConsent):
        query = meeting_consent.insert().values(**consent.model_dump())
        await database.execute(query)
        return consent
    
    async def get_by_meeting_id(self, meeting_id: str) -> list[MeetingConsent]:
        query = meeting_consent.select().where(meeting_consent.c.meeting_id == meeting_id)
        results = await database.fetch_all(query)
        return [MeetingConsent(**result) for result in results]
    
    async def has_any_denial(self, meeting_id: str) -> bool:
        """Check if any participant denied consent for this meeting"""
        query = meeting_consent.select().where(
            meeting_consent.c.meeting_id == meeting_id,
            meeting_consent.c.consent_given.is_(False),
        )
        result = await database.fetch_one(query)
        return result is not None


# Shared instance used by the consent endpoint and process_recording()
meeting_consent_controller = MeetingConsentController()
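
The `has_any_denial` query treats a meeting with no recorded responses as having no denial, so audio is kept unless at least one participant explicitly declines. Below is a minimal sketch of the same query against an in-memory SQLite table. Plain `sqlite3` is used only to keep the example self-contained, whereas the controller above goes through the project's async database layer.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory table with the columns the denial check uses."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE meeting_consent (
            id TEXT PRIMARY KEY,
            meeting_id TEXT NOT NULL,
            user_identifier TEXT NOT NULL,
            consent_given BOOLEAN NOT NULL
        )
        """
    )
    return conn


def has_any_denial(conn: sqlite3.Connection, meeting_id: str) -> bool:
    """True if at least one participant denied consent for this meeting."""
    row = conn.execute(
        "SELECT 1 FROM meeting_consent "
        "WHERE meeting_id = ? AND consent_given = 0 LIMIT 1",
        (meeting_id,),
    ).fetchone()
    return row is not None


conn = make_db()
conn.executemany(
    "INSERT INTO meeting_consent VALUES (?, ?, ?, ?)",
    [
        ("c1", "m1", "alice", True),
        ("c2", "m1", "bob", False),  # a single denial is enough
        ("c3", "m2", "carol", True),
    ],
)
```

Here meeting `m1` has one denial, `m2` has only approvals, and a meeting with no rows at all (for example `m3`) also counts as "no denial".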

Phase 6: Testing Strategy

Unit Tests:

Test consent API endpoint
Test WebSocket event broadcasting
Test audio deletion logic
Test consent status tracking

Integration Tests:

Test full consent flow during recording
Test multiple participants consent handling
Test recording continuation regardless of consent
Test audio file cleanup

Manual Testing:

Join room without consent (should work)
Receive consent request during recording
Verify transcription continues regardless of consent choice
Verify audio deletion when consent denied
Verify audio preservation when consent given

Phase 7: Deployment Considerations

Database Migration:

# Run migration
alembic upgrade head

Rollback Plan:

Keep old consent logic behind a feature flag
Database migration includes downgrade function
Frontend can toggle between old/new consent flows

Monitoring:

Track consent request rates
Monitor audio deletion operations
Alert on consent-related errors

Implementation Order

Database migration - Foundation for all changes
Backend API endpoints - Core consent handling logic
WebSocket event system - Real-time consent communication
Remove room entry consent - Unblock room joining
Add recording consent dialog - New consent UI
Audio deletion logic - Cleanup mechanism
Testing and deployment - Validation and rollout

Risk Mitigation

Feature flags for gradual rollout
Comprehensive logging for consent operations
Rollback plan if consent flow breaks
Audio file backup before deletion (configurable)
Legal review of consent language and timing

This plan maintains backward compatibility while implementing the new consent flow without interrupting core recording functionality.

Extra notes

Room creator must not be asked for consent

Audio Storage Consent Implementation Plan