reflector/PLAN.md

# Audio Storage Consent Implementation Plan

## Overview
Move the consent prompt from room entry to the recording session. The prompt asks only about audio storage; transcription continues regardless of the response.

## Implementation Phases

### Phase 1: Database Schema Changes

**Meeting Consent Table:** `server/migrations/versions/[timestamp]_add_meeting_consent_table.py`

Create new table for meeting-scoped consent (rooms are reused, consent is per-meeting):

```python
import sqlalchemy as sa
from alembic import op


def upgrade() -> None:
    op.create_table('meeting_consent',
        sa.Column('id', sa.String(), nullable=False),
        sa.Column('meeting_id', sa.String(), nullable=False),
        sa.Column('user_identifier', sa.String(), nullable=False),  # IP, session, or user ID
        sa.Column('consent_given', sa.Boolean(), nullable=False),
        sa.Column('consent_timestamp', sa.DateTime(), nullable=False),
        sa.Column('user_agent', sa.String(), nullable=True),  # matches MeetingConsent.user_agent
        sa.PrimaryKeyConstraint('id'),
        sa.ForeignKeyConstraint(['meeting_id'], ['meeting.id']),
    )


def downgrade() -> None:
    op.drop_table('meeting_consent')
```

**Update Models:** `server/reflector/db/meetings.py` and `server/reflector/db/recordings.py`

```python
# New model for meeting consent
class MeetingConsent(BaseModel):
    id: str = Field(default_factory=generate_uuid4)
    meeting_id: str
    user_identifier: str
    consent_given: bool
    consent_timestamp: datetime
    user_agent: str | None = None
```

### Phase 2: Backend API Changes

**New Consent Endpoint:** `server/reflector/views/meetings.py`

Meeting-based consent endpoint (since consent is per meeting session):

```python
class MeetingConsentRequest(BaseModel):
    consent_given: bool
    user_identifier: str  # IP, session ID, or user ID

@router.post("/meetings/{meeting_id}/consent")
async def meeting_audio_consent(
    meeting_id: str,
    request: MeetingConsentRequest,
    user_request: Request,
):
    meeting = await meetings_controller.get_by_id(meeting_id)
    if not meeting:
        raise HTTPException(status_code=404, detail="Meeting not found")

    # Store consent in meeting_consent table
    consent = MeetingConsent(
        meeting_id=meeting_id,
        user_identifier=request.user_identifier,
        consent_given=request.consent_given,
        consent_timestamp=datetime.utcnow(),
        user_agent=user_request.headers.get("user-agent")
    )

    await meeting_consent_controller.create(consent)

    # Broadcast consent event via WebSocket to room participants
    ws_manager = get_ws_manager()
    await ws_manager.send_json(
        room_id=f"meeting:{meeting_id}",
        message={
            "event": "CONSENT_RESPONSE",
            "data": {
                "meeting_id": meeting_id,
                "consent_given": request.consent_given,
                "user_identifier": request.user_identifier
            }
        }
    )

    return {"status": "success", "consent_id": consent.id}
```

### Phase 3: WebSocket Event System

**Consent Communication:** Use direct API calls instead of WebSocket events

Since consent is meeting-level (not transcript-level), use direct API calls:
- Frontend shows consent dialog immediately when meeting loads
- User response sent directly to `/meetings/{meeting_id}/consent` endpoint
- No need for new WebSocket events - keep it simple

**Consent Request:** ALWAYS ask - no conditions

```ts
// Frontend: Show consent dialog immediately when meeting loads
useEffect(() => {
  if (meeting?.id) {
    // ALWAYS show consent dialog - no conditions
    showConsentDialog(meeting.id);
  }
}, [meeting?.id]);

// Backend: Consent storage using meeting_consent table
// Use meeting_consent table for proper normalization
```

### Phase 4: Frontend Changes

**Remove Room Entry Consent:** `www/app/[roomName]/page.tsx`

Remove lines 24, 34-36, 80-124:
```typescript
// Remove these lines:
const [consentGiven, setConsentGiven] = useState<boolean | null>(null);
const handleConsent = (consent: boolean) => { setConsentGiven(consent); };
// Remove entire consent UI block (lines 80-124)

// Simplify render condition:
if (!isAuthenticated) {
  // Show loading or direct room entry, no consent check
}
```

**Add Consent Dialog Component:** `www/app/(app)/rooms/audioConsentDialog.tsx`

Based on `shareModal.tsx` patterns:

```typescript
interface AudioConsentDialogProps {
  isOpen: boolean;
  onClose: () => void;
  onConsent: (given: boolean) => void;
}

const AudioConsentDialog = ({ isOpen, onClose, onConsent }: AudioConsentDialogProps) => {
  return (
    <Modal isOpen={isOpen} onClose={onClose} closeOnOverlayClick={false}>
      <ModalOverlay />
      <ModalContent>
        <ModalHeader>Audio Storage Consent</ModalHeader>
        <ModalBody>
          <Text mb={4}>
            Do you consent to storing this audio recording?
            The transcript will be generated regardless of your choice.
          </Text>
          <HStack spacing={4}>
            <Button colorScheme="green" onClick={() => onConsent(true)}>
              Yes, store the audio
            </Button>
            <Button colorScheme="red" onClick={() => onConsent(false)}>
              No, delete after transcription
            </Button>
          </HStack>
        </ModalBody>
      </ModalContent>
    </Modal>
  );
};
```

**Update Recording Interface:** `www/app/(app)/transcripts/[transcriptId]/record/page.tsx`

Add consent dialog state and handling:

```typescript
const [showConsentDialog, setShowConsentDialog] = useState(false);
const [consentStatus, setConsentStatus] = useState<string>('');

// Add to existing WebSocket event handlers
const handleConsentRequest = () => {
  setShowConsentDialog(true);
};

const handleConsentResponse = async (consentGiven: boolean) => {
  // Call API endpoint
  await api.v1TranscriptAudioConsent({
    transcriptId: details.params.transcriptId,
    requestBody: { consent_given: consentGiven }
  });
  setShowConsentDialog(false);
  setConsentStatus(consentGiven ? 'given' : 'denied');
};
```


### Phase 5: SQS Processing Integration

**Consent Check During Recording Processing:** `server/reflector/worker/process.py`

Update `process_recording()` to always produce the transcript first, then check consent and delete the audio if any participant denied it:

```python
@shared_task
@asynctask
async def process_recording(bucket_name: str, object_key: str):
    logger.info("Processing recording: %s/%s", bucket_name, object_key)

    # Extract meeting info from S3 object key
    room_name = f"/{object_key[:36]}"
    recorded_at = datetime.fromisoformat(object_key[37:57])

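    meeting = await meetings_controller.get_by_room_name(room_name)
    # Look up the room for this meeting (needed for room.user_id below);
    # rooms_controller.get_by_id and meeting.room_id are assumed to exist
    room = await rooms_controller.get_by_id(meeting.room_id)
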
    recording = await recordings_controller.get_by_object_key(bucket_name, object_key)
    if not recording:
        recording = await recordings_controller.create(
            Recording(
                bucket_name=bucket_name,
                object_key=object_key,
                recorded_at=recorded_at,
                meeting_id=meeting.id
            )
        )

    # ALWAYS create transcript first (regardless of consent)
    transcript = await transcripts_controller.get_by_recording_id(recording.id)
    if transcript:
        await transcripts_controller.update(transcript, {"topics": []})
    else:
        transcript = await transcripts_controller.add(
            "", source_kind=SourceKind.ROOM, source_language="en",
            target_language="en", user_id=room.user_id,
            recording_id=recording.id, share_mode="public"
        )

    # Process transcript normally (transcription, topics, summaries)
    _, extension = os.path.splitext(object_key)
    upload_filename = transcript.data_path / f"upload{extension}"
    # ... continue with full transcript processing ...
    # Check if any participant denied consent using meeting_consent_controller
    should_delete = await meeting_consent_controller.has_any_denial(meeting.id)
    # AFTER transcript processing is complete, delete audio if consent denied
    if should_delete:
        logger.info(f"Deleting audio files for {object_key} due to consent denial")
        await delete_audio_files_only(transcript, bucket_name, object_key)

```

**Audio Deletion Function (AFTER transcript processing):**

```python
async def delete_audio_files_only(transcript: Transcript, bucket_name: str, object_key: str):
    """Delete ONLY audio files from all locations, keep transcript data"""

    try:
        # 1. Delete original Whereby recording from S3
        s3_whereby = boto3.client(
            "s3",
            aws_access_key_id=settings.AWS_WHEREBY_ACCESS_KEY_ID,
            aws_secret_access_key=settings.AWS_WHEREBY_ACCESS_KEY_SECRET,
        )
        s3_whereby.delete_object(Bucket=bucket_name, Key=object_key)
        logger.info(f"Deleted original Whereby recording: {bucket_name}/{object_key}")

        # 2. Delete processed audio from transcript storage S3 bucket
        if transcript.audio_location == "storage":
            storage = get_storage()
            await storage.delete_file(transcript.storage_audio_path)
            logger.info(f"Deleted processed audio from storage: {transcript.storage_audio_path}")

        # 3. Delete local audio files (if any remain)
        transcript.audio_mp3_filename.unlink(missing_ok=True)
        transcript.audio_wav_filename.unlink(missing_ok=True)
        (transcript.data_path / "upload.mp4").unlink(missing_ok=True)

        # 4. Update transcript to reflect audio deletion (keep all other data)
        await transcripts_controller.update(transcript, {
            'audio_location_deleted': True
        })

        logger.info(f"Deleted all audio files for transcript {transcript.id}, kept transcript data")

    except Exception as e:
        logger.error(f"Failed to delete audio files for {object_key}: {str(e)}")
```
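The local cleanup in step 3 is the part that must leave transcript data untouched. The following is a minimal sketch of that step in isolation, assuming the audio files live directly under the transcript's data directory; `AUDIO_NAMES`, `delete_local_audio`, and `demo` are hypothetical names used only for illustration, and the file names are placeholders rather than the real `audio_mp3_filename` / `audio_wav_filename` values:

```python
import tempfile
from pathlib import Path

# Assumed audio file names; the real paths come from the Transcript model.
AUDIO_NAMES = ("audio.mp3", "audio.wav", "upload.mp4")


def delete_local_audio(data_path: Path) -> list[str]:
    """Delete only known audio files under data_path and return their names."""
    removed = []
    for name in AUDIO_NAMES:
        target = data_path / name
        if target.exists():
            target.unlink()
            removed.append(name)
    return removed


def demo() -> tuple[list[str], list[str]]:
    """Create a throwaway directory, run the cleanup, and report what is left."""
    with tempfile.TemporaryDirectory() as tmp:
        data_path = Path(tmp)
        for name in ("audio.mp3", "upload.mp4", "transcript.json"):
            (data_path / name).write_text("placeholder")
        removed = delete_local_audio(data_path)
        remaining = sorted(p.name for p in data_path.iterdir())
    return removed, remaining
```

Note that the upload file is written with the extension taken from the S3 object key, so a fixed `upload.mp4` name may miss recordings with a different extension.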

**Meeting Consent Controller:** `server/reflector/db/meeting_consent.py`


```python
class MeetingConsentController:
    async def create(self, consent: MeetingConsent):
        query = meeting_consent.insert().values(**consent.model_dump())
        await database.execute(query)
        return consent

    async def get_by_meeting_id(self, meeting_id: str) -> list[MeetingConsent]:
        query = meeting_consent.select().where(meeting_consent.c.meeting_id == meeting_id)
        results = await database.fetch_all(query)
        return [MeetingConsent(**result) for result in results]

    async def has_any_denial(self, meeting_id: str) -> bool:
        """Check if any participant denied consent for this meeting"""
        query = meeting_consent.select().where(
            meeting_consent.c.meeting_id == meeting_id,
            meeting_consent.c.consent_given.is_(False),
        )
        result = await database.fetch_one(query)
        return result is not None


# Module-level instance used by the consent endpoint and the worker
meeting_consent_controller = MeetingConsentController()
```
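The "any denial" rule can be checked without the application database. The following is a minimal sketch, assuming a simplified `meeting_consent` schema in an in-memory SQLite database; `make_demo_db` and the sample rows are hypothetical and exist only for illustration:

```python
import sqlite3


def has_any_denial(conn: sqlite3.Connection, meeting_id: str) -> bool:
    """Return True if at least one consent row for the meeting is a denial."""
    row = conn.execute(
        "SELECT 1 FROM meeting_consent "
        "WHERE meeting_id = ? AND consent_given = 0 LIMIT 1",
        (meeting_id,),
    ).fetchone()
    return row is not None


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory table with two meetings (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE meeting_consent ("
        "id TEXT PRIMARY KEY, meeting_id TEXT NOT NULL, "
        "user_identifier TEXT NOT NULL, consent_given INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO meeting_consent VALUES (?, ?, ?, ?)",
        [
            ("c1", "meeting-a", "alice", 1),  # consented
            ("c2", "meeting-a", "bob", 0),    # denied
            ("c3", "meeting-b", "carol", 1),  # consented
        ],
    )
    return conn
```

With this rule, one denial is enough to trigger deletion, and a meeting with no consent rows at all keeps its audio. A participant who denies and later consents still counts as a denial, since every response is stored as a separate row.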

### Phase 6: Testing Strategy

**Unit Tests:**
- Test consent API endpoint
- Test WebSocket event broadcasting
- Test audio deletion logic
- Test consent status tracking

**Integration Tests:**
- Test full consent flow during recording
- Test multiple participants consent handling
- Test recording continuation regardless of consent
- Test audio file cleanup

**Manual Testing:**
- Join room without consent (should work)
- Receive consent request during recording
- Verify transcription continues regardless of consent choice
- Verify audio deletion when consent denied
- Verify audio preservation when consent given

### Phase 7: Deployment Considerations

**Database Migration:**
```bash
# Run migration
alembic upgrade head
```

**Rollback Plan:**
- Keep old consent logic in feature flag
- Database migration includes downgrade function
- Frontend can toggle between old/new consent flows

**Monitoring:**
- Track consent request rates
- Monitor audio deletion operations
- Alert on consent-related errors

## Implementation Order

1. **Database migration** - Foundation for all changes
2. **Backend API endpoints** - Core consent handling logic
3. **WebSocket event system** - Real-time consent communication
4. **Remove room entry consent** - Unblock room joining
5. **Add recording consent dialog** - New consent UI
6. **Audio deletion logic** - Cleanup mechanism
7. **Testing and deployment** - Validation and rollout

## Risk Mitigation

- **Feature flags** for gradual rollout
- **Comprehensive logging** for consent operations
- **Rollback plan** if consent flow breaks
- **Audio file backup** before deletion (configurable)
- **Legal review** of consent language and timing

This plan maintains backward compatibility while implementing the new consent flow without interrupting core recording functionality.

## Extra notes

The room creator must not be asked for consent.
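One possible way to express this exemption is a small guard evaluated before the dialog is shown. This is only a sketch: `should_ask_for_consent` and its parameters are hypothetical, and it assumes the room owner's user ID is available wherever the check runs:

```python
from typing import Optional


def should_ask_for_consent(
    participant_user_id: Optional[str], room_owner_id: str
) -> bool:
    """Ask every participant except the authenticated room creator."""
    # Anonymous participants have no user ID, so they are always asked.
    if participant_user_id is None:
        return True
    return participant_user_id != room_owner_id
```

Under this guard the "always ask" rule from Phase 3 applies to everyone but the creator.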