feat: add WebVTT context generation to chat WebSocket endpoint

- Import topics_to_webvtt_named and recordings controller
- Add _get_is_multitrack helper function
- Generate WebVTT context on WebSocket connection
- Add get_context message type to retrieve WebVTT
- Maintain backward compatibility with echo for other messages
- Add test fixture and test for WebVTT context generation

Implements task fn-1.2: WebVTT context generation for transcript chat
Author: Igor Loskutov
Date: 2026-01-12 18:21:10 -05:00
Parent: 7ca9cad937
Commit: 316f7b316d
41 changed files with 10730 additions and 0 deletions

.flow/specs/fn-1.1.md

@@ -0,0 +1,52 @@
# Task 1: WebSocket Endpoint Skeleton
**File:** `server/reflector/views/transcripts_chat.py`
**Lines:** ~30
**Dependencies:** None
## Objective
Create basic WebSocket endpoint with auth and connection handling.
## Implementation
```python
from typing import Optional

from fastapi import APIRouter, Depends, WebSocket, WebSocketDisconnect

import reflector.auth as auth
from reflector.db.transcripts import transcripts_controller

router = APIRouter()


@router.websocket("/transcripts/{transcript_id}/chat")
async def transcript_chat_websocket(
    transcript_id: str,
    websocket: WebSocket,
    user: Optional[auth.UserInfo] = Depends(auth.current_user_optional),
):
    # 1. Auth check (raises 404 if the transcript is missing or not visible)
    user_id = user["sub"] if user else None
    transcript = await transcripts_controller.get_by_id_for_http(
        transcript_id, user_id
    )

    # 2. Accept connection
    await websocket.accept()

    try:
        # 3. Basic message loop (stub)
        while True:
            data = await websocket.receive_json()
            await websocket.send_json({"type": "echo", "data": data})
    except WebSocketDisconnect:
        pass
```
## Validation
- [ ] Endpoint accessible at `ws://localhost:1250/v1/transcripts/{id}/chat`
- [ ] Auth check executes (404 if transcript not found)
- [ ] Connection accepts
- [ ] Echo messages back to client
- [ ] Disconnect handled gracefully
## Notes
- Test with `websocat` or browser WebSocket client
- Don't add LLM yet, just echo
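For a scripted alternative to `websocat`, a minimal smoke test with the `websockets` package might look like this (the port and transcript id are placeholders):
```python
# Hypothetical echo smoke test; assumes the server runs on localhost:1250
# and "abc123" is a valid transcript id.
import asyncio
import json

import websockets


async def main():
    uri = "ws://localhost:1250/v1/transcripts/abc123/chat"
    async with websockets.connect(uri) as ws:
        await ws.send(json.dumps({"type": "ping"}))
        reply = json.loads(await ws.recv())
        assert reply == {"type": "echo", "data": {"type": "ping"}}


asyncio.run(main())
```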

.flow/specs/fn-1.2.md

@@ -0,0 +1,43 @@
# Task 2: WebVTT Context Generation
**File:** `server/reflector/views/transcripts_chat.py` (modify)
**Lines:** ~15
**Dependencies:** Task 1
## Objective
Generate WebVTT transcript context on connection.
## Implementation
```python
import logging

from reflector.utils.transcript_formats import topics_to_webvtt_named
from reflector.views.transcripts import _get_is_multitrack

logger = logging.getLogger(__name__)

# Add after websocket.accept():

# Get WebVTT context
is_multitrack = await _get_is_multitrack(transcript)
webvtt = topics_to_webvtt_named(
    transcript.topics,
    transcript.participants,
    is_multitrack,
)

# Truncate if needed, logging when it happens (see Notes)
truncated = len(webvtt) > 15000
if truncated:
    logger.info(
        "WebVTT context truncated to 15k chars for transcript %s", transcript_id
    )
webvtt_truncated = webvtt[:15000]

# Send to client for verification
await websocket.send_json({
    "type": "context",
    "webvtt": webvtt_truncated,
    "truncated": truncated,
})
```
## Validation
- [ ] WebVTT generated on connection
- [ ] Truncated to 15k chars if needed
- [ ] Client receives context message
- [ ] Format matches WebVTT spec (timestamps, speaker names)
## Notes
- Log if truncation occurs
- Keep echo functionality for testing
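The commit adds a test for this behavior; a sketch of what it could look like with Starlette's test client (the `client` and `transcript` fixtures are assumptions):
```python
# Sketch only: assumes a pytest `client` fixture wrapping the FastAPI app
# and a `transcript` fixture whose topics are already populated.
def test_chat_sends_webvtt_context(client, transcript):
    with client.websocket_connect(f"/v1/transcripts/{transcript.id}/chat") as ws:
        msg = ws.receive_json()
        assert msg["type"] == "context"
        assert msg["webvtt"].startswith("WEBVTT")  # WebVTT files open with this header
        assert len(msg["webvtt"]) <= 15000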

.flow/specs/fn-1.3.md

@@ -0,0 +1,67 @@
# Task 3: LLM Streaming Integration
**File:** `server/reflector/views/transcripts_chat.py` (modify)
**Lines:** ~35
**Dependencies:** Task 2
## Objective
Integrate LLM streaming with conversation management.
## Implementation
```python
import logging

from llama_index.core import Settings
from llama_index.core.llms import ChatMessage

from reflector.llm import LLM
from reflector.settings import settings

logger = logging.getLogger(__name__)

# After WebVTT generation:

# Configure LLM (instantiating LLM() sets up Settings.llm)
llm = LLM(settings=settings, temperature=0.7)

# System message
system_msg = f"""You are analyzing this meeting transcript (WebVTT):
{webvtt_truncated}
Answer questions about content, speakers, timeline. Include timestamps when relevant."""

# Conversation history (plain dicts; converted to ChatMessage at call time)
conversation_history = [{"role": "system", "content": system_msg}]

# Replace echo loop with:
try:
    while True:
        data = await websocket.receive_json()
        if data["type"] != "message":
            continue

        # Add user message
        conversation_history.append({"role": "user", "content": data["text"]})

        # Stream LLM response. astream_chat() is awaited to obtain the async
        # generator, and it expects ChatMessage objects rather than dicts.
        messages = [
            ChatMessage(role=m["role"], content=m["content"])
            for m in conversation_history
        ]
        stream = await Settings.llm.astream_chat(messages)
        assistant_msg = ""
        async for chunk in stream:
            token = chunk.delta or ""
            await websocket.send_json({"type": "token", "text": token})
            assistant_msg += token

        # Save assistant response
        conversation_history.append({"role": "assistant", "content": assistant_msg})
        await websocket.send_json({"type": "done"})
except WebSocketDisconnect:
    pass
except Exception as e:
    logger.exception("transcript chat error")
    await websocket.send_json({"type": "error", "message": str(e)})
```
## Validation
- [ ] LLM responds to user messages
- [ ] Tokens stream incrementally
- [ ] Conversation history maintained
- [ ] `done` message sent after completion
- [ ] Errors caught and sent to client
## Notes
- Test with: "What was discussed?"
- Verify timestamps appear in responses
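A scripted version of that manual test, reassembling the streamed tokens into one answer (port and transcript id are placeholders; the initial `context` message from Task 2 is simply skipped by the loop):
```python
# Sketch: send one question and collect the streamed reply.
import asyncio
import json

import websockets


async def ask(question: str) -> str:
    uri = "ws://localhost:1250/v1/transcripts/abc123/chat"  # placeholder id
    async with websockets.connect(uri) as ws:
        await ws.send(json.dumps({"type": "message", "text": question}))
        answer = ""
        while True:
            msg = json.loads(await ws.recv())
            if msg["type"] == "token":
                answer += msg["text"]
            elif msg["type"] == "done":
                return answer
            elif msg["type"] == "error":
                raise RuntimeError(msg["message"])


print(asyncio.run(ask("What was discussed?")))
```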

.flow/specs/fn-1.4.md

@@ -0,0 +1,22 @@
# Task 4: Register WebSocket Route
**File:** `server/reflector/app.py` (modify)
**Lines:** ~3
**Dependencies:** Task 3
## Objective
Register chat router in FastAPI app.
## Implementation
```python
# Add import
from reflector.views.transcripts_chat import router as transcripts_chat_router
# Add to route registration section
app.include_router(transcripts_chat_router, prefix="/v1", tags=["transcripts"])
```
## Validation
- [ ] Route registered on the app (note: WebSocket routes are not listed in the OpenAPI `/docs`; check `app.routes` instead, see the sketch below)
- [ ] WebSocket endpoint accessible from frontend
- [ ] No import errors
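Since WebSocket endpoints do not appear in the OpenAPI schema, registration can be verified against the route table directly; a sketch:
```python
# Sketch: confirm the chat route is registered, including the /v1 prefix.
from fastapi.routing import APIWebSocketRoute

from reflector.app import app

ws_paths = [r.path for r in app.routes if isinstance(r, APIWebSocketRoute)]
assert "/v1/transcripts/{transcript_id}/chat" in ws_paths
```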

.flow/specs/fn-1.5.md

@@ -0,0 +1,102 @@
# Task 5: Frontend WebSocket Hook
**File:** `www/app/(app)/transcripts/useTranscriptChat.ts`
**Lines:** ~60
**Dependencies:** Task 1 (protocol defined)
## Objective
Create React hook for WebSocket chat communication.
## Implementation
```typescript
import { useEffect, useState, useRef } from "react"
import { WEBSOCKET_URL } from "../../lib/apiClient"

type Message = {
  id: string
  role: "user" | "assistant"
  text: string
  timestamp: Date
}

export const useTranscriptChat = (transcriptId: string) => {
  const [messages, setMessages] = useState<Message[]>([])
  const [isStreaming, setIsStreaming] = useState(false)
  const [currentStreamingText, setCurrentStreamingText] = useState("")
  const wsRef = useRef<WebSocket | null>(null)
  // Accumulate streamed tokens in a ref so the "done" handler reads the
  // latest text without listing currentStreamingText in the effect deps
  // (which would tear down and reconnect the socket on every token).
  const streamingTextRef = useRef("")

  useEffect(() => {
    const ws = new WebSocket(
      `${WEBSOCKET_URL}/v1/transcripts/${transcriptId}/chat`
    )
    wsRef.current = ws
    ws.onopen = () => console.log("Chat WebSocket connected")
    ws.onmessage = (event) => {
      const msg = JSON.parse(event.data)
      switch (msg.type) {
        case "token":
          setIsStreaming(true)
          streamingTextRef.current += msg.text
          setCurrentStreamingText(streamingTextRef.current)
          break
        case "done":
          setMessages((prev) => [
            ...prev,
            {
              id: Date.now().toString(),
              role: "assistant",
              text: streamingTextRef.current,
              timestamp: new Date(),
            },
          ])
          streamingTextRef.current = ""
          setCurrentStreamingText("")
          setIsStreaming(false)
          break
        case "error":
          console.error("Chat error:", msg.message)
          setIsStreaming(false)
          break
      }
    }
    ws.onerror = (error) => console.error("WebSocket error:", error)
    ws.onclose = () => console.log("Chat WebSocket closed")
    return () => ws.close()
  }, [transcriptId])

  const sendMessage = (text: string) => {
    if (!wsRef.current) return
    setMessages((prev) => [
      ...prev,
      {
        id: Date.now().toString(),
        role: "user",
        text,
        timestamp: new Date(),
      },
    ])
    wsRef.current.send(JSON.stringify({ type: "message", text }))
  }

  return { messages, sendMessage, isStreaming, currentStreamingText }
}
```
## Validation
- [ ] Hook connects to WebSocket
- [ ] Sends messages to server
- [ ] Receives streaming tokens
- [ ] Accumulates tokens into messages
- [ ] Handles done/error events
- [ ] Closes connection on unmount
## Notes
- Test with browser console first
- Verify message format matches backend protocol

.flow/specs/fn-1.6.md

@@ -0,0 +1,124 @@
# Task 6: Chat Dialog Component
**File:** `www/app/(app)/transcripts/TranscriptChatModal.tsx`
**Lines:** ~90
**Dependencies:** Task 5 (hook interface)
## Objective
Create Chakra UI v3 Dialog component for chat interface.
## Implementation
```typescript
"use client"
import { useState } from "react"
import { Dialog, Box, Input, IconButton } from "@chakra-ui/react"
import { MessageCircle } from "lucide-react"
type Message = {
id: string
role: "user" | "assistant"
text: string
timestamp: Date
}
interface TranscriptChatModalProps {
open: boolean
onClose: () => void
messages: Message[]
sendMessage: (text: string) => void
isStreaming: boolean
currentStreamingText: string
}
export function TranscriptChatModal({
open,
onClose,
messages,
sendMessage,
isStreaming,
currentStreamingText,
}: TranscriptChatModalProps) {
const [input, setInput] = useState("")
const handleSend = () => {
if (!input.trim()) return
sendMessage(input)
setInput("")
}
return (
<Dialog.Root open={open} onOpenChange={(e) => !e.open && onClose()}>
<Dialog.Backdrop />
<Dialog.Positioner>
<Dialog.Content maxW="500px" h="600px">
<Dialog.Header>Transcript Chat</Dialog.Header>
<Dialog.Body overflowY="auto">
{messages.map((msg) => (
<Box
key={msg.id}
p={3}
mb={2}
bg={msg.role === "user" ? "blue.50" : "gray.50"}
borderRadius="md"
>
{msg.text}
</Box>
))}
{isStreaming && (
<Box p={3} bg="gray.50" borderRadius="md">
{currentStreamingText}
<Box as="span" className="animate-pulse">
</Box>
</Box>
)}
</Dialog.Body>
<Dialog.Footer>
<Input
value={input}
onChange={(e) => setInput(e.target.value)}
onKeyDown={(e) => e.key === "Enter" && handleSend()}
placeholder="Ask about transcript..."
disabled={isStreaming}
/>
</Dialog.Footer>
</Dialog.Content>
</Dialog.Positioner>
</Dialog.Root>
)
}
export function TranscriptChatButton({ onClick }: { onClick: () => void }) {
return (
<IconButton
position="fixed"
bottom="24px"
right="24px"
onClick={onClick}
size="lg"
colorPalette="blue"
borderRadius="full"
aria-label="Open chat"
>
<MessageCircle />
</IconButton>
)
}
```
## Validation
- [ ] Dialog opens/closes correctly
- [ ] Messages display (user: blue, assistant: gray)
- [ ] Streaming text shows with cursor
- [ ] Input disabled during streaming
- [ ] Enter key sends message
- [ ] Dialog scrolls with content
- [ ] Floating button positioned correctly
## Notes
- Test with mock data before connecting hook
- Verify Chakra v3 Dialog.Root API

.flow/specs/fn-1.7.md

@@ -0,0 +1,54 @@
# Task 7: Integrate into Transcript Page
**File:** `www/app/(app)/transcripts/[transcriptId]/page.tsx` (modify)
**Lines:** ~15
**Dependencies:** Task 5, Task 6
## Objective
Add chat components to transcript detail page.
## Implementation
```typescript
// Add imports
import { useDisclosure } from "@chakra-ui/react"
import {
TranscriptChatModal,
TranscriptChatButton,
} from "../TranscriptChatModal"
import { useTranscriptChat } from "../useTranscriptChat"
// Inside component:
export default function TranscriptDetails(details: TranscriptDetails) {
const params = use(details.params)
const transcriptId = params.transcriptId
// Add chat state
const { open, onOpen, onClose } = useDisclosure()
const chat = useTranscriptChat(transcriptId)
return (
<>
{/* Existing Grid with transcript content */}
<Grid templateColumns="1fr" templateRows="auto minmax(0, 1fr)" /* ... */>
{/* ... existing content ... */}
</Grid>
{/* Chat interface */}
<TranscriptChatModal open={open} onClose={onClose} {...chat} />
<TranscriptChatButton onClick={onOpen} />
</>
)
}
```
## Validation
- [ ] Button appears on transcript page
- [ ] Clicking button opens dialog
- [ ] Chat works end-to-end
- [ ] Dialog closes properly
- [ ] No layout conflicts with existing UI
- [ ] Button doesn't overlap other elements
## Notes
- Test on different transcript pages
- Verify z-index for button and dialog

.flow/specs/fn-1.8.md

@@ -0,0 +1,47 @@
# Task 8: End-to-End Testing
**File:** N/A (testing)
**Lines:** 0
**Dependencies:** All tasks (1-7)
## Objective
Validate complete feature functionality.
## Test Scenarios
### 1. Basic Flow
- [ ] Navigate to transcript page
- [ ] Click floating button
- [ ] Dialog opens with "Transcript Chat" header
- [ ] Type "What was discussed?"
- [ ] Press Enter
- [ ] Streaming response appears token-by-token
- [ ] Response completes with relevant content
- [ ] Ask follow-up question
- [ ] Conversation context maintained
### 2. Edge Cases
- [ ] Empty message (doesn't send)
- [ ] Very long transcript (>15k chars truncated)
- [ ] Network disconnect (graceful error)
- [ ] Multiple rapid messages (queued correctly)
- [ ] Close dialog mid-stream (conversation cleared)
- [ ] Reopen dialog (fresh conversation)
### 3. Auth
- [ ] Works with logged-in user
- [ ] Works with anonymous user
- [ ] Private transcript blocked for wrong user
### 4. UI/UX
- [ ] Button doesn't cover other UI elements
- [ ] Dialog scrolls properly
- [ ] Streaming cursor visible
- [ ] Input disabled during streaming
- [ ] Messages clearly distinguished (user vs assistant)
## Bugs to Watch
- WebSocket connection leaks (check browser devtools)
- Streaming text accumulation bugs
- Race conditions on rapid messages
- Memory leaks from conversation history

.flow/specs/fn-1.md

@@ -0,0 +1,439 @@
# PRD: Transcript Chat Assistant (POC)
## Research Complete
**Backend Infrastructure:**
- LLM configured: `reflector/llm.py` using llama-index's `OpenAILike`
- Streaming support: `Settings.llm.astream_chat()` available (configured by LLM class)
- WebSocket infrastructure: Redis pub/sub via `ws_manager`
- Existing pattern: `/v1/transcripts/{transcript_id}/events` WebSocket (broadcast-only)
**Frontend Infrastructure:**
- `useWebSockets` hook pattern established
- Chakra UI v3 with Dialog.Root API
- lucide-react icons available
**Decision: Use existing WebSocket + custom chat UI**
---
## Architecture
```
Frontend Backend (FastAPI)
┌──────────────────┐ ┌────────────────────────────┐
│ Transcript Page │ │ /v1/transcripts/{id}/chat │
│ │ │ │
│ ┌──────────────┐ │ │ WebSocket Endpoint │
│ │ Chat Dialog │ │◄──WebSocket│ (bidirectional) │
│ │ │ │────────────┤ 1. Auth check │
│ │ - Messages │ │ send msg │ 2. Get WebVTT transcript │
│ │ - Input │ │ │ 3. Build conversation │
│ │ - Streaming │ │◄───────────┤ 4. Call astream_chat() │
│ └──────────────┘ │ stream │ 5. Stream tokens via WS │
│ useTranscriptChat│ response │ │
└──────────────────┘ │ ┌────────────────────────┐ │
│ │ LLM (llama-index) │ │
│ │ Settings.llm │ │
│ │ astream_chat() │ │
│ └────────────────────────┘ │
│ │
│ Existing: │
│ - topics_to_webvtt_named() │
└────────────────────────────┘
```
**Note:** This WebSocket is bidirectional (client→server messages) unlike existing broadcast-only pattern (`/events` endpoint).
---
## Components
### Backend
**1. WebSocket Endpoint** (`server/reflector/views/transcripts_chat.py`)
```python
@router.websocket("/transcripts/{transcript_id}/chat")
async def transcript_chat_websocket(
    transcript_id: str,
    websocket: WebSocket,
    user: Optional[auth.UserInfo] = Depends(auth.current_user_optional),
):
    # 1. Auth check
    user_id = user["sub"] if user else None
    transcript = await transcripts_controller.get_by_id_for_http(transcript_id, user_id)

    # 2. Accept WebSocket
    await websocket.accept()

    # 3. Get WebVTT context
    webvtt = topics_to_webvtt_named(
        transcript.topics,
        transcript.participants,
        await _get_is_multitrack(transcript),
    )

    # 4. Configure LLM (sets up Settings.llm with session tracking)
    llm = LLM(settings=settings, temperature=0.7)

    # 5. System message. The truncation comment stays outside the f-string
    # so it does not leak into the prompt text.
    system_msg = f"""You are analyzing this meeting transcript (WebVTT):
{webvtt[:15000]}
Answer questions about content, speakers, timeline. Include timestamps when relevant."""

    # 6. Conversation loop
    conversation_history = [{"role": "system", "content": system_msg}]
    try:
        while True:
            # Receive user message
            data = await websocket.receive_json()
            if data["type"] != "message":
                continue
            conversation_history.append({"role": "user", "content": data["text"]})

            # Stream LLM response (astream_chat is awaited to get the async
            # generator and expects ChatMessage objects, not dicts)
            messages = [
                ChatMessage(role=m["role"], content=m["content"])
                for m in conversation_history
            ]
            stream = await Settings.llm.astream_chat(messages)
            assistant_msg = ""
            async for chunk in stream:
                token = chunk.delta or ""
                await websocket.send_json({"type": "token", "text": token})
                assistant_msg += token

            conversation_history.append({"role": "assistant", "content": assistant_msg})
            await websocket.send_json({"type": "done"})
    except WebSocketDisconnect:
        pass
    except Exception as e:
        await websocket.send_json({"type": "error", "message": str(e)})
```
**Message Protocol:**
```typescript
// Client → Server
{type: "message", text: "What was discussed?"}
// Server → Client (streaming)
{type: "token", text: "At "}
{type: "token", text: "01:23"}
...
{type: "done"}
{type: "error", message: "..."} // on errors
```
### Frontend
**2. Chat Hook** (`www/app/(app)/transcripts/useTranscriptChat.ts`)
```typescript
export const useTranscriptChat = (transcriptId: string) => {
  const [messages, setMessages] = useState<Message[]>([])
  const [isStreaming, setIsStreaming] = useState(false)
  const [currentStreamingText, setCurrentStreamingText] = useState("")
  const wsRef = useRef<WebSocket | null>(null)
  // Ref avoids a stale closure: "done" must read the full streamed text
  // without adding currentStreamingText to the effect deps (reconnect loop)
  const streamingTextRef = useRef("")
  useEffect(() => {
    const ws = new WebSocket(`${WEBSOCKET_URL}/v1/transcripts/${transcriptId}/chat`)
    wsRef.current = ws
    ws.onopen = () => console.log("Chat WebSocket connected")
    ws.onmessage = (event) => {
      const msg = JSON.parse(event.data)
      switch (msg.type) {
        case "token":
          setIsStreaming(true)
          streamingTextRef.current += msg.text
          setCurrentStreamingText(streamingTextRef.current)
          break
        case "done":
          setMessages(prev => [...prev, {
            id: Date.now().toString(),
            role: "assistant",
            text: streamingTextRef.current,
            timestamp: new Date()
          }])
          streamingTextRef.current = ""
          setCurrentStreamingText("")
          setIsStreaming(false)
          break
        case "error":
          console.error("Chat error:", msg.message)
          setIsStreaming(false)
          break
      }
    }
    ws.onerror = (error) => console.error("WebSocket error:", error)
    ws.onclose = () => console.log("Chat WebSocket closed")
    return () => ws.close()
  }, [transcriptId])
  const sendMessage = (text: string) => {
    if (!wsRef.current) return
    setMessages(prev => [...prev, {
      id: Date.now().toString(),
      role: "user",
      text,
      timestamp: new Date()
    }])
    wsRef.current.send(JSON.stringify({type: "message", text}))
  }
  return {messages, sendMessage, isStreaming, currentStreamingText}
}
```
**3. Chat Dialog** (`www/app/(app)/transcripts/TranscriptChatModal.tsx`)
```tsx
import { useState } from "react"
import { Dialog, Box, Input, IconButton } from "@chakra-ui/react"
import { MessageCircle } from "lucide-react"
interface TranscriptChatModalProps {
open: boolean
onClose: () => void
messages: Message[]
sendMessage: (text: string) => void
isStreaming: boolean
currentStreamingText: string
}
export function TranscriptChatModal({
open,
onClose,
messages,
sendMessage,
isStreaming,
currentStreamingText
}: TranscriptChatModalProps) {
const [input, setInput] = useState("")
const handleSend = () => {
if (!input.trim()) return
sendMessage(input)
setInput("")
}
return (
<Dialog.Root open={open} onOpenChange={(e) => !e.open && onClose()}>
<Dialog.Backdrop />
<Dialog.Positioner>
<Dialog.Content maxW="500px" h="600px">
<Dialog.Header>Transcript Chat</Dialog.Header>
<Dialog.Body overflowY="auto">
{messages.map(msg => (
<Box
key={msg.id}
p={3}
mb={2}
bg={msg.role === "user" ? "blue.50" : "gray.50"}
borderRadius="md"
>
{msg.text}
</Box>
))}
{isStreaming && (
<Box p={3} bg="gray.50" borderRadius="md">
{currentStreamingText}
<Box as="span" className="animate-pulse"></Box>
</Box>
)}
</Dialog.Body>
<Dialog.Footer>
<Input
value={input}
onChange={(e) => setInput(e.target.value)}
onKeyDown={(e) => e.key === "Enter" && handleSend()}
placeholder="Ask about transcript..."
disabled={isStreaming}
/>
</Dialog.Footer>
</Dialog.Content>
</Dialog.Positioner>
</Dialog.Root>
)
}
// Floating button
export function TranscriptChatButton({ onClick }: { onClick: () => void }) {
return (
<IconButton
position="fixed"
bottom="24px"
right="24px"
onClick={onClick}
size="lg"
colorPalette="blue"
borderRadius="full"
aria-label="Open chat"
>
<MessageCircle />
</IconButton>
)
}
```
**4. Integration** (Modify `/transcripts/[transcriptId]/page.tsx`)
```tsx
import { useDisclosure } from "@chakra-ui/react"
import { TranscriptChatModal, TranscriptChatButton } from "../TranscriptChatModal"
import { useTranscriptChat } from "../useTranscriptChat"
export default function TranscriptDetails(details: TranscriptDetails) {
const params = use(details.params)
const transcriptId = params.transcriptId
const { open, onOpen, onClose } = useDisclosure()
const chat = useTranscriptChat(transcriptId)
return (
<>
{/* Existing transcript UI */}
<Grid templateColumns="1fr" /* ... */>
{/* ... existing content ... */}
</Grid>
{/* Chat interface */}
<TranscriptChatModal
open={open}
onClose={onClose}
{...chat}
/>
<TranscriptChatButton onClick={onOpen} />
</>
)
}
```
---
## Data Structures
```typescript
type Message = {
id: string
role: "user" | "assistant"
text: string
timestamp: Date
}
```
---
## API Specifications
### WebSocket Endpoint
**URL:** `ws://localhost:1250/v1/transcripts/{transcript_id}/chat`
**Auth:** Optional user (same as existing endpoints)
**Client → Server:**
```json
{"type": "message", "text": "What was discussed?"}
```
**Server → Client:**
```json
{"type": "token", "text": "chunk"}
{"type": "done"}
{"type": "error", "message": "error text"}
```
---
## Implementation Notes
**LLM Integration:**
- Instantiate `LLM()` to configure `Settings.llm` with session tracking
- Use `Settings.llm.astream_chat()` directly for streaming
- Chunks have `.delta` property with token text
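A condensed sketch of that call pattern (assumes `Settings.llm` was configured by instantiating `LLM()`; `astream_chat` is awaited to obtain the async generator and takes `ChatMessage` objects):
```python
from llama_index.core import Settings
from llama_index.core.llms import ChatMessage


async def stream_reply(history: list[dict]) -> str:
    # Convert plain dict history to ChatMessage objects before the call
    messages = [ChatMessage(role=m["role"], content=m["content"]) for m in history]
    stream = await Settings.llm.astream_chat(messages)
    reply = ""
    async for chunk in stream:
        reply += chunk.delta or ""  # .delta carries the new token text
    return reply
```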
**WebVTT Context:**
- Reuse `topics_to_webvtt_named()` utility
- Truncate to ~15k chars if needed (known limitation for POC)
- Include in system message
**Conversation State:**
- Store in-memory in WebSocket handler (ephemeral)
- Clear on disconnect
- No persistence (out of scope)
**Error Handling:**
- Basic try/catch with error message to client
- Log errors server-side
---
## File Structure
```
server/reflector/views/
└── transcripts_chat.py # New: ~80 lines
www/app/(app)/transcripts/
├── [transcriptId]/
│ └── page.tsx # Modified: +10 lines
├── useTranscriptChat.ts # New: ~60 lines
└── TranscriptChatModal.tsx # New: ~80 lines
```
**Total:** ~230 lines of code
---
## Dependencies
**Backend:** None (all existing)
**Frontend:** None (Chakra UI + lucide-react already installed)
---
## Out of Scope (POC)
- ❌ Message persistence/history
- ❌ Context window optimization
- ❌ Sentence buffering (token-by-token is fine)
- ❌ Rate limiting beyond auth
- ❌ Tool calling
- ❌ RAG/vector search
**Known Limitations:**
- Long transcripts (>15k chars) will be truncated
- Conversation lost on disconnect
- No error recovery/retry
---
## Acceptance Criteria
- [ ] Floating button on transcript page
- [ ] Click opens dialog with chat interface
- [ ] Send message, receive streaming response
- [ ] LLM has WebVTT transcript context
- [ ] Auth works (optional user)
- [ ] Dialog closes, conversation cleared
- [ ] Works with configured OpenAI-compatible LLM
---
## References
- [LlamaIndex Streaming](https://docs.llamaindex.ai/en/stable/module_guides/deploying/query_engine/streaming/)
- [LlamaIndex OpenAILike](https://docs.llamaindex.ai/en/stable/api_reference/llms/openai_like/)
- [FastAPI WebSocket](https://fastapi.tiangolo.com/advanced/websockets/)