# Transcript Formats The Reflector API provides multiple output formats for transcript data through the `transcript_format` query parameter on the GET `/v1/transcripts/{id}` endpoint. ## Overview When retrieving a transcript, you can specify the desired format using the `transcript_format` query parameter. The API supports four formats optimized for different use cases: - **text** - Plain text with speaker names (default) - **text-timestamped** - Timestamped text with speaker names - **webvtt-named** - WebVTT subtitle format with participant names - **json** - Structured JSON segments with full metadata All formats include participant information when available, resolving speaker IDs to actual names. ## Query Parameter Usage ``` GET /v1/transcripts/{id}?transcript_format={format} ``` ### Parameters - `transcript_format` (optional): The desired output format - Type: `"text" | "text-timestamped" | "webvtt-named" | "json"` - Default: `"text"` ## Format Descriptions ### Text Format (`text`) **Use case:** Simple, human-readable transcript for display or export. **Format:** Speaker names followed by their dialogue, one line per segment. **Example:** ``` John Smith: Hello everyone Jane Doe: Hi there John Smith: How are you today? ``` **Request:** ```bash GET /v1/transcripts/{id}?transcript_format=text ``` **Response:** ```json { "id": "transcript_123", "name": "Meeting Recording", "transcript_format": "text", "transcript": "John Smith: Hello everyone\nJane Doe: Hi there\nJohn Smith: How are you today?", "participants": [ {"id": "p1", "speaker": 0, "name": "John Smith"}, {"id": "p2", "speaker": 1, "name": "Jane Doe"} ], ... } ``` ### Text Timestamped Format (`text-timestamped`) **Use case:** Transcript with timing information for navigation or reference. **Format:** `[MM:SS]` timestamp prefix before each speaker and dialogue. **Example:** ``` [00:00] John Smith: Hello everyone [00:05] Jane Doe: Hi there [00:12] John Smith: How are you today? ``` **Request:** ```bash GET /v1/transcripts/{id}?transcript_format=text-timestamped ``` **Response:** ```json { "id": "transcript_123", "name": "Meeting Recording", "transcript_format": "text-timestamped", "transcript": "[00:00] John Smith: Hello everyone\n[00:05] Jane Doe: Hi there\n[00:12] John Smith: How are you today?", "participants": [ {"id": "p1", "speaker": 0, "name": "John Smith"}, {"id": "p2", "speaker": 1, "name": "Jane Doe"} ], ... } ``` ### WebVTT Named Format (`webvtt-named`) **Use case:** Subtitle files for video players, accessibility tools, or video editing. **Format:** Standard WebVTT subtitle format with voice tags using participant names. **Example:** ``` WEBVTT 00:00:00.000 --> 00:00:05.000 Hello everyone 00:00:05.000 --> 00:00:12.000 Hi there 00:00:12.000 --> 00:00:18.000 How are you today? ``` **Request:** ```bash GET /v1/transcripts/{id}?transcript_format=webvtt-named ``` **Response:** ```json { "id": "transcript_123", "name": "Meeting Recording", "transcript_format": "webvtt-named", "transcript": "WEBVTT\n\n00:00:00.000 --> 00:00:05.000\nHello everyone\n\n...", "participants": [ {"id": "p1", "speaker": 0, "name": "John Smith"}, {"id": "p2", "speaker": 1, "name": "Jane Doe"} ], ... } ``` ### JSON Format (`json`) **Use case:** Programmatic access with full timing and speaker metadata. **Format:** Array of segment objects with speaker information, text content, and precise timing. **Example:** ```json [ { "speaker": 0, "speaker_name": "John Smith", "text": "Hello everyone", "start": 0.0, "end": 5.0 }, { "speaker": 1, "speaker_name": "Jane Doe", "text": "Hi there", "start": 5.0, "end": 12.0 }, { "speaker": 0, "speaker_name": "John Smith", "text": "How are you today?", "start": 12.0, "end": 18.0 } ] ``` **Request:** ```bash GET /v1/transcripts/{id}?transcript_format=json ``` **Response:** ```json { "id": "transcript_123", "name": "Meeting Recording", "transcript_format": "json", "transcript": [ { "speaker": 0, "speaker_name": "John Smith", "text": "Hello everyone", "start": 0.0, "end": 5.0 }, { "speaker": 1, "speaker_name": "Jane Doe", "text": "Hi there", "start": 5.0, "end": 12.0 } ], "participants": [ {"id": "p1", "speaker": 0, "name": "John Smith"}, {"id": "p2", "speaker": 1, "name": "Jane Doe"} ], ... } ``` ## Response Structure All formats return the same base transcript metadata with an additional `transcript_format` field and format-specific `transcript` field: ### Common Fields - `id`: Transcript identifier - `user_id`: Owner user ID (if authenticated) - `name`: Transcript name - `status`: Processing status - `locked`: Whether transcript is locked for editing - `duration`: Total duration in seconds - `title`: Auto-generated or custom title - `short_summary`: Brief summary - `long_summary`: Detailed summary - `created_at`: Creation timestamp - `share_mode`: Access control setting - `source_language`: Original audio language - `target_language`: Translation target language - `reviewed`: Whether transcript has been reviewed - `meeting_id`: Associated meeting ID (if applicable) - `source_kind`: Source type (live, file, room) - `room_id`: Associated room ID (if applicable) - `audio_deleted`: Whether audio has been deleted - `participants`: Array of participant objects with speaker mappings ### Format-Specific Fields - `transcript_format`: The format identifier (discriminator field) - `transcript`: The formatted transcript content (string for text/webvtt formats, array for json format) ## Speaker Name Resolution All formats resolve speaker IDs to participant names when available: - If a participant exists for the speaker ID, their name is used - If no participant exists, a default name like "Speaker 0" is generated - Speaker IDs are integers (0, 1, 2, etc.) assigned during diarization