This commit is contained in:
Igor Loskutov
2025-09-02 12:04:30 -04:00
17 changed files with 971 additions and 53 deletions

View File

@@ -1,5 +1,34 @@
# Changelog
## [0.8.2](https://github.com/Monadical-SAS/reflector/compare/v0.8.1...v0.8.2) (2025-08-29)
### Bug Fixes
* search-logspam ([#593](https://github.com/Monadical-SAS/reflector/issues/593)) ([695d1a9](https://github.com/Monadical-SAS/reflector/commit/695d1a957d4cd862753049f9beed88836cabd5ab))
## [0.8.1](https://github.com/Monadical-SAS/reflector/compare/v0.8.0...v0.8.1) (2025-08-29)
### Bug Fixes
* make webhook secret/url allowing null ([#590](https://github.com/Monadical-SAS/reflector/issues/590)) ([84a3812](https://github.com/Monadical-SAS/reflector/commit/84a381220bc606231d08d6f71d4babc818fa3c75))
## [0.8.0](https://github.com/Monadical-SAS/reflector/compare/v0.7.3...v0.8.0) (2025-08-29)
### Features
* **cleanup:** add automatic data retention for public instances ([#574](https://github.com/Monadical-SAS/reflector/issues/574)) ([6f0c7c1](https://github.com/Monadical-SAS/reflector/commit/6f0c7c1a5e751713366886c8e764c2009e12ba72))
* **rooms:** add webhook for transcript completion ([#578](https://github.com/Monadical-SAS/reflector/issues/578)) ([88ed7cf](https://github.com/Monadical-SAS/reflector/commit/88ed7cfa7804794b9b54cad4c3facc8a98cf85fd))
### Bug Fixes
* file pipeline status reporting and websocket updates ([#589](https://github.com/Monadical-SAS/reflector/issues/589)) ([9dfd769](https://github.com/Monadical-SAS/reflector/commit/9dfd76996f851cc52be54feea078adbc0816dc57))
* Igor/evaluation ([#575](https://github.com/Monadical-SAS/reflector/issues/575)) ([124ce03](https://github.com/Monadical-SAS/reflector/commit/124ce03bf86044c18313d27228a25da4bc20c9c5))
* optimize parakeet transcription batching algorithm ([#577](https://github.com/Monadical-SAS/reflector/issues/577)) ([7030e0f](https://github.com/Monadical-SAS/reflector/commit/7030e0f23649a8cf6c1eb6d5889684a41ce849ec))
## [0.7.3](https://github.com/Monadical-SAS/reflector/compare/v0.7.2...v0.7.3) (2025-08-22)

212
server/docs/webhook.md Normal file
View File

@@ -0,0 +1,212 @@
# Reflector Webhook Documentation
## Overview
Reflector supports webhook notifications to notify external systems when transcript processing is completed. Webhooks can be configured per room and are triggered automatically after a transcript is successfully processed.
## Configuration
Webhooks are configured at the room level with two fields:
- `webhook_url`: The HTTPS endpoint to receive webhook notifications
- `webhook_secret`: Optional secret key for HMAC signature verification (auto-generated if not provided)
## Events
### `transcript.completed`
Triggered when a transcript has been fully processed, including transcription, diarization, summarization, and topic detection.
### `test`
A test event that can be triggered manually to verify webhook configuration.
## Webhook Request Format
### Headers
All webhook requests include the following headers:
| Header | Description | Example |
|--------|-------------|---------|
| `Content-Type` | Always `application/json` | `application/json` |
| `User-Agent` | Identifies Reflector as the source | `Reflector-Webhook/1.0` |
| `X-Webhook-Event` | The event type | `transcript.completed` or `test` |
| `X-Webhook-Retry` | Current retry attempt number | `0`, `1`, `2`... |
| `X-Webhook-Signature` | HMAC signature (if secret configured) | `t=1735306800,v1=abc123...` |
### Signature Verification
If a webhook secret is configured, Reflector includes an HMAC-SHA256 signature in the `X-Webhook-Signature` header to verify the webhook authenticity.
The signature format is: `t={timestamp},v1={signature}`
To verify the signature:
1. Extract the timestamp and signature from the header
2. Create the signed payload: `{timestamp}.{request_body}`
3. Compute HMAC-SHA256 of the signed payload using your webhook secret
4. Compare the computed signature with the received signature
Example verification (Python):
```python
import hmac
import hashlib
def verify_webhook_signature(payload: bytes, signature_header: str, secret: str) -> bool:
# Parse header: "t=1735306800,v1=abc123..."
parts = dict(part.split("=") for part in signature_header.split(","))
timestamp = parts["t"]
received_signature = parts["v1"]
# Create signed payload
signed_payload = f"{timestamp}.{payload.decode('utf-8')}"
# Compute expected signature
expected_signature = hmac.new(
secret.encode("utf-8"),
signed_payload.encode("utf-8"),
hashlib.sha256
).hexdigest()
# Compare signatures
return hmac.compare_digest(expected_signature, received_signature)
```
## Event Payloads
### `transcript.completed` Event
This event includes a convenient URL for accessing the transcript:
- `frontend_url`: Direct link to view the transcript in the web interface
```json
{
"event": "transcript.completed",
"event_id": "transcript.completed-abc-123-def-456",
"timestamp": "2025-08-27T12:34:56.789012Z",
"transcript": {
"id": "abc-123-def-456",
"room_id": "room-789",
"created_at": "2025-08-27T12:00:00Z",
"duration": 1800.5,
"title": "Q3 Product Planning Meeting",
"short_summary": "Team discussed Q3 product roadmap, prioritizing mobile app features and API improvements.",
"long_summary": "The product team met to finalize the Q3 roadmap. Key decisions included...",
"webvtt": "WEBVTT\n\n00:00:00.000 --> 00:00:05.000\n<v Speaker 1>Welcome everyone to today's meeting...",
"topics": [
{
"title": "Introduction and Agenda",
"summary": "Meeting kickoff with agenda review",
"timestamp": 0.0,
"duration": 120.0,
"webvtt": "WEBVTT\n\n00:00:00.000 --> 00:00:05.000\n<v Speaker 1>Welcome everyone..."
},
{
"title": "Mobile App Features Discussion",
"summary": "Team reviewed proposed mobile app features for Q3",
"timestamp": 120.0,
"duration": 600.0,
"webvtt": "WEBVTT\n\n00:02:00.000 --> 00:02:10.000\n<v Speaker 2>Let's talk about the mobile app..."
}
],
"participants": [
{
"id": "participant-1",
"name": "John Doe",
"speaker": "Speaker 1"
},
{
"id": "participant-2",
"name": "Jane Smith",
"speaker": "Speaker 2"
}
],
"source_language": "en",
"target_language": "en",
"status": "completed",
"frontend_url": "https://app.reflector.com/transcripts/abc-123-def-456"
},
"room": {
"id": "room-789",
"name": "Product Team Room"
}
}
```
### `test` Event
```json
{
"event": "test",
"event_id": "test.2025-08-27T12:34:56.789012Z",
"timestamp": "2025-08-27T12:34:56.789012Z",
"message": "This is a test webhook from Reflector",
"room": {
"id": "room-789",
"name": "Product Team Room"
}
}
```
## Retry Policy
Webhooks are delivered with automatic retry logic to handle transient failures. When a webhook delivery fails due to server errors or network issues, Reflector will automatically retry the delivery multiple times over an extended period.
### Retry Mechanism
Reflector implements an exponential backoff strategy for webhook retries:
- **Initial retry delay**: 60 seconds after the first failure
- **Exponential backoff**: Each subsequent retry waits approximately twice as long as the previous one
- **Maximum retry interval**: 1 hour (backoff is capped at this duration)
- **Maximum retry attempts**: 30 attempts total
- **Total retry duration**: Retries continue for approximately 24 hours
### How Retries Work
When a webhook fails, Reflector will:
1. Wait 60 seconds, then retry (attempt #1)
2. If it fails again, wait ~2 minutes, then retry (attempt #2)
3. Continue doubling the wait time up to a maximum of 1 hour between attempts
4. Keep retrying at 1-hour intervals until successful or 30 attempts are exhausted
The `X-Webhook-Retry` header indicates the current retry attempt number (0 for the initial attempt, 1 for first retry, etc.), allowing your endpoint to track retry attempts.
### Retry Behavior by HTTP Status Code
| Status Code | Behavior |
|-------------|----------|
| 2xx (Success) | No retry, webhook marked as delivered |
| 4xx (Client Error) | No retry, request is considered permanently failed |
| 5xx (Server Error) | Automatic retry with exponential backoff |
| Network/Timeout Error | Automatic retry with exponential backoff |
**Important Notes:**
- Webhooks timeout after 30 seconds. If your endpoint takes longer to respond, it will be considered a timeout error and retried.
- During the retry period (~24 hours), you may receive the same webhook multiple times if your endpoint experiences intermittent failures.
- There is no mechanism to manually retry failed webhooks after the retry period expires.
## Testing Webhooks
You can test your webhook configuration before processing transcripts:
```http
POST /v1/rooms/{room_id}/webhook/test
```
Response:
```json
{
"success": true,
"status_code": 200,
"message": "Webhook test successful",
"response_preview": "OK"
}
```
Or in case of failure:
```json
{
"success": false,
"error": "Webhook request timed out (10 seconds)"
}
```

View File

@@ -0,0 +1,36 @@
"""Add webhook fields to rooms
Revision ID: 0194f65cd6d3
Revises: 5a8907fd1d78
Create Date: 2025-08-27 09:03:19.610995
"""
from typing import Sequence, Union
import sqlalchemy as sa
from alembic import op
# revision identifiers, used by Alembic.
revision: str = "0194f65cd6d3"
down_revision: Union[str, None] = "5a8907fd1d78"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
with op.batch_alter_table("room", schema=None) as batch_op:
batch_op.add_column(sa.Column("webhook_url", sa.String(), nullable=True))
batch_op.add_column(sa.Column("webhook_secret", sa.String(), nullable=True))
# ### end Alembic commands ###
def downgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
with op.batch_alter_table("room", schema=None) as batch_op:
batch_op.drop_column("webhook_secret")
batch_op.drop_column("webhook_url")
# ### end Alembic commands ###

View File

@@ -0,0 +1,28 @@
"""webhook url and secret null by default
Revision ID: 61882a919591
Revises: 0194f65cd6d3
Create Date: 2025-08-29 11:46:36.738091
"""
from typing import Sequence, Union
# revision identifiers, used by Alembic.
revision: str = "61882a919591"
down_revision: Union[str, None] = "0194f65cd6d3"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
pass
# ### end Alembic commands ###
def downgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
pass
# ### end Alembic commands ###

View File

@@ -1,3 +1,4 @@
import secrets
from datetime import datetime, timezone
from sqlite3 import IntegrityError
from typing import Literal
@@ -40,6 +41,8 @@ rooms = sqlalchemy.Table(
sqlalchemy.Column(
"is_shared", sqlalchemy.Boolean, nullable=False, server_default=false()
),
sqlalchemy.Column("webhook_url", sqlalchemy.String, nullable=True),
sqlalchemy.Column("webhook_secret", sqlalchemy.String, nullable=True),
sqlalchemy.Index("idx_room_is_shared", "is_shared"),
)
@@ -59,6 +62,8 @@ class Room(BaseModel):
"none", "prompt", "automatic", "automatic-2nd-participant"
] = "automatic-2nd-participant"
is_shared: bool = False
webhook_url: str | None = None
webhook_secret: str | None = None
class RoomController:
@@ -107,10 +112,15 @@ class RoomController:
recording_type: str,
recording_trigger: str,
is_shared: bool,
webhook_url: str = "",
webhook_secret: str = "",
):
"""
Add a new room
"""
if webhook_url and not webhook_secret:
webhook_secret = secrets.token_urlsafe(32)
room = Room(
name=name,
user_id=user_id,
@@ -122,6 +132,8 @@ class RoomController:
recording_type=recording_type,
recording_trigger=recording_trigger,
is_shared=is_shared,
webhook_url=webhook_url,
webhook_secret=webhook_secret,
)
query = rooms.insert().values(**room.model_dump())
try:
@@ -134,6 +146,9 @@ class RoomController:
"""
Update a room fields with key/values in values
"""
if values.get("webhook_url") and not values.get("webhook_secret"):
values["webhook_secret"] = secrets.token_urlsafe(32)
query = rooms.update().where(rooms.c.id == room.id).values(**values)
try:
await get_database().execute(query)

View File

@@ -8,12 +8,14 @@ from typing import Annotated, Any, Dict, Iterator
import sqlalchemy
import webvtt
from databases.interfaces import Record as DbRecord
from fastapi import HTTPException
from pydantic import (
BaseModel,
Field,
NonNegativeFloat,
NonNegativeInt,
TypeAdapter,
ValidationError,
constr,
field_serializer,
@@ -24,6 +26,7 @@ from reflector.db.rooms import rooms
from reflector.db.transcripts import SourceKind, transcripts
from reflector.db.utils import is_postgresql
from reflector.logger import logger
from reflector.utils.string import NonEmptyString, try_parse_non_empty_string
DEFAULT_SEARCH_LIMIT = 20
SNIPPET_CONTEXT_LENGTH = 50 # Characters before/after match to include
@@ -31,12 +34,13 @@ DEFAULT_SNIPPET_MAX_LENGTH = NonNegativeInt(150)
DEFAULT_MAX_SNIPPETS = NonNegativeInt(3)
LONG_SUMMARY_MAX_SNIPPETS = 2
SearchQueryBase = constr(min_length=0, strip_whitespace=True)
SearchQueryBase = constr(min_length=1, strip_whitespace=True)
SearchLimitBase = Annotated[int, Field(ge=1, le=100)]
SearchOffsetBase = Annotated[int, Field(ge=0)]
SearchTotalBase = Annotated[int, Field(ge=0)]
SearchQuery = Annotated[SearchQueryBase, Field(description="Search query text")]
search_query_adapter = TypeAdapter(SearchQuery)
SearchLimit = Annotated[SearchLimitBase, Field(description="Results per page")]
SearchOffset = Annotated[
SearchOffsetBase, Field(description="Number of results to skip")
@@ -88,7 +92,7 @@ class WebVTTProcessor:
@staticmethod
def generate_snippets(
webvtt_content: WebVTTContent,
query: str,
query: SearchQuery,
max_snippets: NonNegativeInt = DEFAULT_MAX_SNIPPETS,
) -> list[str]:
"""Generate snippets from WebVTT content."""
@@ -125,7 +129,7 @@ class SnippetCandidate:
class SearchParameters(BaseModel):
"""Validated search parameters for full-text search."""
query_text: SearchQuery
query_text: SearchQuery | None = None
limit: SearchLimit = DEFAULT_SEARCH_LIMIT
offset: SearchOffset = 0
user_id: str | None = None
@@ -199,15 +203,13 @@ class SnippetGenerator:
prev_start = start
@staticmethod
def count_matches(text: str, query: str) -> NonNegativeInt:
def count_matches(text: str, query: SearchQuery) -> NonNegativeInt:
"""Count total number of matches for a query in text."""
ZERO = NonNegativeInt(0)
if not text:
logger.warning("Empty text for search query in count_matches")
return ZERO
if not query:
logger.warning("Empty query for search text in count_matches")
return ZERO
assert query is not None
return NonNegativeInt(
sum(1 for _ in SnippetGenerator.find_all_matches(text, query))
)
@@ -243,13 +245,14 @@ class SnippetGenerator:
@staticmethod
def generate(
text: str,
query: str,
query: SearchQuery,
max_length: NonNegativeInt = DEFAULT_SNIPPET_MAX_LENGTH,
max_snippets: NonNegativeInt = DEFAULT_MAX_SNIPPETS,
) -> list[str]:
"""Generate snippets from text."""
if not text or not query:
logger.warning("Empty text or query for generate_snippets")
assert query is not None
if not text:
logger.warning("Empty text for generate_snippets")
return []
candidates = (
@@ -270,7 +273,7 @@ class SnippetGenerator:
@staticmethod
def from_summary(
summary: str,
query: str,
query: SearchQuery,
max_snippets: NonNegativeInt = LONG_SUMMARY_MAX_SNIPPETS,
) -> list[str]:
"""Generate snippets from summary text."""
@@ -278,9 +281,9 @@ class SnippetGenerator:
@staticmethod
def combine_sources(
summary: str | None,
summary: NonEmptyString | None,
webvtt: WebVTTContent | None,
query: str,
query: SearchQuery,
max_total: NonNegativeInt = DEFAULT_MAX_SNIPPETS,
) -> tuple[list[str], NonNegativeInt]:
"""Combine snippets from multiple sources and return total match count.
@@ -289,6 +292,11 @@ class SnippetGenerator:
snippets can be empty for real in case of e.g. title match
"""
assert (
summary is not None or webvtt is not None
), "At least one source must be present"
webvtt_matches = 0
summary_matches = 0
@@ -355,8 +363,8 @@ class SearchController:
else_=rooms.c.name,
).label("room_name"),
]
if params.query_text:
search_query = None
if params.query_text is not None:
search_query = sqlalchemy.func.websearch_to_tsquery(
"english", params.query_text
)
@@ -373,7 +381,9 @@ class SearchController:
transcripts.join(rooms, transcripts.c.room_id == rooms.c.id, isouter=True)
)
if params.query_text:
if params.query_text is not None:
# because already initialized based on params.query_text presence above
assert search_query is not None
base_query = base_query.where(
transcripts.c.search_vector_en.op("@@")(search_query)
)
@@ -393,7 +403,7 @@ class SearchController:
transcripts.c.source_kind == params.source_kind
)
if params.query_text:
if params.query_text is not None:
order_by = sqlalchemy.desc(sqlalchemy.text("rank"))
else:
order_by = sqlalchemy.desc(transcripts.c.created_at)
@@ -407,20 +417,30 @@ class SearchController:
)
total = await get_database().fetch_val(count_query)
def _process_result(r) -> SearchResult:
def _process_result(r: DbRecord) -> SearchResult:
r_dict: Dict[str, Any] = dict(r)
webvtt_raw: str | None = r_dict.pop("webvtt", None)
webvtt: WebVTTContent | None
if webvtt_raw:
webvtt = WebVTTProcessor.parse(webvtt_raw)
else:
webvtt = None
long_summary: str | None = r_dict.pop("long_summary", None)
long_summary_r: str | None = r_dict.pop("long_summary", None)
long_summary: NonEmptyString = try_parse_non_empty_string(long_summary_r)
room_name: str | None = r_dict.pop("room_name", None)
db_result = SearchResultDB.model_validate(r_dict)
snippets, total_match_count = SnippetGenerator.combine_sources(
at_least_one_source = webvtt is not None or long_summary is not None
has_query = params.query_text is not None
snippets, total_match_count = (
SnippetGenerator.combine_sources(
long_summary, webvtt, params.query_text, DEFAULT_MAX_SNIPPETS
)
if has_query and at_least_one_source
else ([], 0)
)
return SearchResult(
**db_result.model_dump(),

View File

@@ -7,6 +7,7 @@ Uses parallel processing for transcription, diarization, and waveform generation
"""
import asyncio
import uuid
from pathlib import Path
import av
@@ -14,7 +15,9 @@ import structlog
from celery import shared_task
from reflector.asynctask import asynctask
from reflector.db.rooms import rooms_controller
from reflector.db.transcripts import (
SourceKind,
Transcript,
TranscriptStatus,
transcripts_controller,
@@ -48,6 +51,7 @@ from reflector.processors.types import (
)
from reflector.settings import settings
from reflector.storage import get_transcripts_storage
from reflector.worker.webhook import send_transcript_webhook
class EmptyPipeline:
@@ -385,7 +389,6 @@ async def task_pipeline_file_process(*, transcript_id: str):
raise Exception(f"Transcript {transcript_id} not found")
pipeline = PipelineMainFile(transcript_id=transcript_id)
try:
await pipeline.set_status(transcript_id, "processing")
@@ -402,3 +405,17 @@ async def task_pipeline_file_process(*, transcript_id: str):
except Exception:
await pipeline.set_status(transcript_id, "error")
raise
# Trigger webhook if this is a room recording with webhook configured
if transcript.source_kind == SourceKind.ROOM and transcript.room_id:
room = await rooms_controller.get_by_id(transcript.room_id)
if room and room.webhook_url:
logger.info(
"Dispatching webhook task",
transcript_id=transcript_id,
room_id=room.id,
webhook_url=room.webhook_url,
)
send_transcript_webhook.delay(
transcript_id, room.id, event_id=uuid.uuid4().hex
)

View File

@@ -0,0 +1,20 @@
from typing import Annotated
from pydantic import Field, TypeAdapter, constr
NonEmptyStringBase = constr(min_length=1, strip_whitespace=False)
NonEmptyString = Annotated[
NonEmptyStringBase,
Field(description="A non-empty string", min_length=1),
]
non_empty_string_adapter = TypeAdapter(NonEmptyString)
def parse_non_empty_string(s: str) -> NonEmptyString:
return non_empty_string_adapter.validate_python(s)
def try_parse_non_empty_string(s: str) -> NonEmptyString | None:
if not s:
return None
return parse_non_empty_string(s)

View File

@@ -15,6 +15,7 @@ from reflector.db.meetings import meetings_controller
from reflector.db.rooms import rooms_controller
from reflector.settings import settings
from reflector.whereby import create_meeting, upload_logo
from reflector.worker.webhook import test_webhook
logger = logging.getLogger(__name__)
@@ -44,6 +45,11 @@ class Room(BaseModel):
is_shared: bool
class RoomDetails(Room):
webhook_url: str | None
webhook_secret: str | None
class Meeting(BaseModel):
id: str
room_name: str
@@ -64,6 +70,8 @@ class CreateRoom(BaseModel):
recording_type: str
recording_trigger: str
is_shared: bool
webhook_url: str
webhook_secret: str
class UpdateRoom(BaseModel):
@@ -76,16 +84,26 @@ class UpdateRoom(BaseModel):
recording_type: str
recording_trigger: str
is_shared: bool
webhook_url: str
webhook_secret: str
class DeletionStatus(BaseModel):
status: str
@router.get("/rooms", response_model=Page[Room])
class WebhookTestResult(BaseModel):
success: bool
message: str = ""
error: str = ""
status_code: int | None = None
response_preview: str | None = None
@router.get("/rooms", response_model=Page[RoomDetails])
async def rooms_list(
user: Annotated[Optional[auth.UserInfo], Depends(auth.current_user_optional)],
) -> list[Room]:
) -> list[RoomDetails]:
if not user and not settings.PUBLIC_MODE:
raise HTTPException(status_code=401, detail="Not authenticated")
@@ -99,6 +117,18 @@ async def rooms_list(
)
@router.get("/rooms/{room_id}", response_model=RoomDetails)
async def rooms_get(
room_id: str,
user: Annotated[Optional[auth.UserInfo], Depends(auth.current_user_optional)],
):
user_id = user["sub"] if user else None
room = await rooms_controller.get_by_id_for_http(room_id, user_id=user_id)
if not room:
raise HTTPException(status_code=404, detail="Room not found")
return room
@router.post("/rooms", response_model=Room)
async def rooms_create(
room: CreateRoom,
@@ -117,10 +147,12 @@ async def rooms_create(
recording_type=room.recording_type,
recording_trigger=room.recording_trigger,
is_shared=room.is_shared,
webhook_url=room.webhook_url,
webhook_secret=room.webhook_secret,
)
@router.patch("/rooms/{room_id}", response_model=Room)
@router.patch("/rooms/{room_id}", response_model=RoomDetails)
async def rooms_update(
room_id: str,
info: UpdateRoom,
@@ -209,3 +241,24 @@ async def rooms_create_meeting(
meeting.host_room_url = ""
return meeting
@router.post("/rooms/{room_id}/webhook/test", response_model=WebhookTestResult)
async def rooms_test_webhook(
room_id: str,
user: Annotated[Optional[auth.UserInfo], Depends(auth.current_user_optional)],
):
"""Test webhook configuration by sending a sample payload."""
user_id = user["sub"] if user else None
room = await rooms_controller.get_by_id(room_id)
if not room:
raise HTTPException(status_code=404, detail="Room not found")
if user_id and room.user_id != user_id:
raise HTTPException(
status_code=403, detail="Not authorized to test this room's webhook"
)
result = await test_webhook(room_id)
return WebhookTestResult(**result)

View File

@@ -5,7 +5,7 @@ from fastapi import APIRouter, Depends, HTTPException, Query
from fastapi_pagination import Page
from fastapi_pagination.ext.databases import apaginate
from jose import jwt
from pydantic import BaseModel, Field, field_serializer
from pydantic import BaseModel, Field, constr, field_serializer
import reflector.auth as auth
from reflector.db import get_database
@@ -19,10 +19,10 @@ from reflector.db.search import (
SearchOffsetBase,
SearchParameters,
SearchQuery,
SearchQueryBase,
SearchResult,
SearchTotal,
search_controller,
search_query_adapter,
)
from reflector.db.transcripts import (
SourceKind,
@@ -114,7 +114,19 @@ class DeletionStatus(BaseModel):
status: str
SearchQueryParam = Annotated[SearchQueryBase, Query(description="Search query text")]
SearchQueryParamBase = constr(min_length=0, strip_whitespace=True)
SearchQueryParam = Annotated[
SearchQueryParamBase, Query(description="Search query text")
]
# http and api standards accept "q="; we would like to handle it as the absence of query, not as "empty string query"
def parse_search_query_param(q: SearchQueryParam) -> SearchQuery | None:
if q == "":
return None
return search_query_adapter.validate_python(q)
SearchLimitParam = Annotated[SearchLimitBase, Query(description="Results per page")]
SearchOffsetParam = Annotated[
SearchOffsetBase, Query(description="Number of results to skip")
@@ -124,7 +136,7 @@ SearchOffsetParam = Annotated[
class SearchResponse(BaseModel):
results: list[SearchResult]
total: SearchTotal
query: SearchQuery
query: SearchQuery | None = None
limit: SearchLimit
offset: SearchOffset
@@ -174,7 +186,7 @@ async def transcripts_search(
user_id = user["sub"] if user else None
search_params = SearchParameters(
query_text=q,
query_text=parse_search_query_param(q),
limit=limit,
offset=offset,
user_id=user_id,

View File

@@ -0,0 +1,258 @@
"""Webhook task for sending transcript notifications."""
import hashlib
import hmac
import json
import uuid
from datetime import datetime, timezone
import httpx
import structlog
from celery import shared_task
from celery.utils.log import get_task_logger
from reflector.db.rooms import rooms_controller
from reflector.db.transcripts import transcripts_controller
from reflector.pipelines.main_live_pipeline import asynctask
from reflector.settings import settings
from reflector.utils.webvtt import topics_to_webvtt
logger = structlog.wrap_logger(get_task_logger(__name__))
def generate_webhook_signature(payload: bytes, secret: str, timestamp: str) -> str:
"""Generate HMAC signature for webhook payload."""
signed_payload = f"{timestamp}.{payload.decode('utf-8')}"
hmac_obj = hmac.new(
secret.encode("utf-8"),
signed_payload.encode("utf-8"),
hashlib.sha256,
)
return hmac_obj.hexdigest()
@shared_task(
bind=True,
max_retries=30,
default_retry_delay=60,
retry_backoff=True,
retry_backoff_max=3600, # Max 1 hour between retries
)
@asynctask
async def send_transcript_webhook(
self,
transcript_id: str,
room_id: str,
event_id: str,
):
log = logger.bind(
transcript_id=transcript_id,
room_id=room_id,
retry_count=self.request.retries,
)
try:
# Fetch transcript and room
transcript = await transcripts_controller.get_by_id(transcript_id)
if not transcript:
log.error("Transcript not found, skipping webhook")
return
room = await rooms_controller.get_by_id(room_id)
if not room:
log.error("Room not found, skipping webhook")
return
if not room.webhook_url:
log.info("No webhook URL configured for room, skipping")
return
# Generate WebVTT content from topics
topics_data = []
if transcript.topics:
# Build topics data with diarized content per topic
for topic in transcript.topics:
topic_webvtt = topics_to_webvtt([topic]) if topic.words else ""
topics_data.append(
{
"title": topic.title,
"summary": topic.summary,
"timestamp": topic.timestamp,
"duration": topic.duration,
"webvtt": topic_webvtt,
}
)
# Build webhook payload
frontend_url = f"{settings.UI_BASE_URL}/transcripts/{transcript.id}"
participants = [
{"id": p.id, "name": p.name, "speaker": p.speaker}
for p in (transcript.participants or [])
]
payload_data = {
"event": "transcript.completed",
"event_id": event_id,
"timestamp": datetime.now(timezone.utc).isoformat(),
"transcript": {
"id": transcript.id,
"room_id": transcript.room_id,
"created_at": transcript.created_at.isoformat(),
"duration": transcript.duration,
"title": transcript.title,
"short_summary": transcript.short_summary,
"long_summary": transcript.long_summary,
"webvtt": transcript.webvtt,
"topics": topics_data,
"participants": participants,
"source_language": transcript.source_language,
"target_language": transcript.target_language,
"status": transcript.status,
"frontend_url": frontend_url,
},
"room": {
"id": room.id,
"name": room.name,
},
}
# Convert to JSON
payload_json = json.dumps(payload_data, separators=(",", ":"))
payload_bytes = payload_json.encode("utf-8")
# Generate signature if secret is configured
headers = {
"Content-Type": "application/json",
"User-Agent": "Reflector-Webhook/1.0",
"X-Webhook-Event": "transcript.completed",
"X-Webhook-Retry": str(self.request.retries),
}
if room.webhook_secret:
timestamp = str(int(datetime.now(timezone.utc).timestamp()))
signature = generate_webhook_signature(
payload_bytes, room.webhook_secret, timestamp
)
headers["X-Webhook-Signature"] = f"t={timestamp},v1={signature}"
# Send webhook with timeout
async with httpx.AsyncClient(timeout=30.0) as client:
log.info(
"Sending webhook",
url=room.webhook_url,
payload_size=len(payload_bytes),
)
response = await client.post(
room.webhook_url,
content=payload_bytes,
headers=headers,
)
response.raise_for_status()
log.info(
"Webhook sent successfully",
status_code=response.status_code,
response_size=len(response.content),
)
except httpx.HTTPStatusError as e:
log.error(
"Webhook failed with HTTP error",
status_code=e.response.status_code,
response_text=e.response.text[:500], # First 500 chars
)
# Don't retry on client errors (4xx)
if 400 <= e.response.status_code < 500:
log.error("Client error, not retrying")
return
# Retry on server errors (5xx)
raise self.retry(exc=e)
except (httpx.ConnectError, httpx.TimeoutException) as e:
# Retry on network errors
log.error("Webhook failed with connection error", error=str(e))
raise self.retry(exc=e)
except Exception as e:
# Retry on unexpected errors
log.exception("Unexpected error in webhook task", error=str(e))
raise self.retry(exc=e)
async def test_webhook(room_id: str) -> dict:
"""
Test webhook configuration by sending a sample payload.
Returns immediately with success/failure status.
This is the shared implementation used by both the API endpoint and Celery task.
"""
try:
room = await rooms_controller.get_by_id(room_id)
if not room:
return {"success": False, "error": "Room not found"}
if not room.webhook_url:
return {"success": False, "error": "No webhook URL configured"}
now = (datetime.now(timezone.utc).isoformat(),)
payload_data = {
"event": "test",
"event_id": uuid.uuid4().hex,
"timestamp": now,
"message": "This is a test webhook from Reflector",
"room": {
"id": room.id,
"name": room.name,
},
}
payload_json = json.dumps(payload_data, separators=(",", ":"))
payload_bytes = payload_json.encode("utf-8")
# Generate headers with signature
headers = {
"Content-Type": "application/json",
"User-Agent": "Reflector-Webhook/1.0",
"X-Webhook-Event": "test",
}
if room.webhook_secret:
timestamp = str(int(datetime.now(timezone.utc).timestamp()))
signature = generate_webhook_signature(
payload_bytes, room.webhook_secret, timestamp
)
headers["X-Webhook-Signature"] = f"t={timestamp},v1={signature}"
# Send test webhook with short timeout
async with httpx.AsyncClient(timeout=10.0) as client:
response = await client.post(
room.webhook_url,
content=payload_bytes,
headers=headers,
)
return {
"success": response.is_success,
"status_code": response.status_code,
"message": f"Webhook test {'successful' if response.is_success else 'failed'}",
"response_preview": response.text if response.text else None,
}
except httpx.TimeoutException:
return {
"success": False,
"error": "Webhook request timed out (10 seconds)",
}
except httpx.ConnectError as e:
return {
"success": False,
"error": f"Could not connect to webhook URL: {str(e)}",
}
except Exception as e:
return {
"success": False,
"error": f"Unexpected error: {str(e)}",
}

View File

@@ -23,7 +23,7 @@ async def test_search_postgresql_only():
assert results == []
assert total == 0
params_empty = SearchParameters(query_text="")
params_empty = SearchParameters(query_text=None)
results_empty, total_empty = await search_controller.search_transcripts(
params_empty
)
@@ -34,7 +34,7 @@ async def test_search_postgresql_only():
@pytest.mark.asyncio
async def test_search_with_empty_query():
"""Test that empty query returns all transcripts."""
params = SearchParameters(query_text="")
params = SearchParameters(query_text=None)
results, total = await search_controller.search_transcripts(params)
assert isinstance(results, list)

View File

@@ -1,5 +1,7 @@
"""Unit tests for search snippet generation."""
import pytest
from reflector.db.search import (
SnippetCandidate,
SnippetGenerator,
@@ -512,11 +514,9 @@ data visualization and data storage"""
)
assert webvtt_count == 3
snippets_empty, count_empty = SnippetGenerator.combine_sources(
None, None, "data", max_total=3
)
assert snippets_empty == []
assert count_empty == 0
# combine_sources requires at least one source to be present
with pytest.raises(AssertionError, match="At least one source must be present"):
SnippetGenerator.combine_sources(None, None, "data", max_total=3)
def test_edge_cases(self):
"""Test edge cases for the pure functions."""

View File

@@ -11,11 +11,15 @@ import {
Input,
Select,
Spinner,
IconButton,
createListCollection,
useDisclosure,
} from "@chakra-ui/react";
import { useEffect, useState } from "react";
import { LuEye, LuEyeOff } from "react-icons/lu";
import useApi from "../../lib/useApi";
import useRoomList from "./useRoomList";
import { ApiError, RoomDetails } from "../../api";
import type { components } from "../../reflector-api";
import {
useRoomCreate,
@@ -63,6 +67,8 @@ const roomInitialState = {
recordingType: "cloud",
recordingTrigger: "automatic-2nd-participant",
isShared: false,
webhookUrl: "",
webhookSecret: "",
};
export default function RoomsList() {
@@ -89,6 +95,19 @@ export default function RoomsList() {
const [nameError, setNameError] = useState("");
const [linkCopied, setLinkCopied] = useState("");
const [selectedStreamId, setSelectedStreamId] = useState<number | null>(null);
const [testingWebhook, setTestingWebhook] = useState(false);
const [webhookTestResult, setWebhookTestResult] = useState<string | null>(
null,
);
const [showWebhookSecret, setShowWebhookSecret] = useState(false);
interface Stream {
stream_id: number;
name: string;
}
interface Topic {
name: string;
}
const createRoomMutation = useRoomCreate();
const updateRoomMutation = useRoomUpdate();
@@ -135,6 +154,69 @@ export default function RoomsList() {
}, 2000);
};
const handleCloseDialog = () => {
setShowWebhookSecret(false);
setWebhookTestResult(null);
onClose();
};
const handleTestWebhook = async () => {
if (!room.webhookUrl || !editRoomId) {
setWebhookTestResult("Please enter a webhook URL first");
return;
}
setTestingWebhook(true);
setWebhookTestResult(null);
try {
const response = await api?.v1RoomsTestWebhook({
roomId: editRoomId,
});
if (response?.success) {
setWebhookTestResult(
`✅ Webhook test successful! Status: ${response.status_code}`,
);
} else {
let errorMsg = `❌ Webhook test failed`;
if (response?.status_code) {
errorMsg += ` (Status: ${response.status_code})`;
}
if (response?.error) {
errorMsg += `: ${response.error}`;
} else if (response?.response_preview) {
// Try to parse and extract meaningful error from response
// Specific to N8N at the moment, as there is no specification for that
// We could just display as is, but decided here to dig a little bit more.
try {
const preview = JSON.parse(response.response_preview);
if (preview.message) {
errorMsg += `: ${preview.message}`;
}
} catch {
// If not JSON, just show the preview text (truncated)
const previewText = response.response_preview.substring(0, 150);
errorMsg += `: ${previewText}`;
}
} else if (response?.message) {
errorMsg += `: ${response.message}`;
}
setWebhookTestResult(errorMsg);
}
} catch (error) {
console.error("Error testing webhook:", error);
setWebhookTestResult("❌ Failed to test webhook. Please check your URL.");
} finally {
setTestingWebhook(false);
}
// Clear result after 5 seconds
setTimeout(() => {
setWebhookTestResult(null);
}, 5000);
};
const handleSaveRoom = async () => {
try {
if (RESERVED_PATHS.includes(room.name)) {
@@ -152,6 +234,8 @@ export default function RoomsList() {
recording_type: room.recordingType,
recording_trigger: room.recordingTrigger,
is_shared: room.isShared,
webhook_url: room.webhookUrl,
webhook_secret: room.webhookSecret,
};
if (isEditing) {
@@ -173,6 +257,7 @@ export default function RoomsList() {
setNameError("");
refetch();
onClose();
handleCloseDialog();
} catch (err: any) {
if (
err?.status === 400 &&
@@ -187,7 +272,32 @@ export default function RoomsList() {
}
};
const handleEditRoom = (roomId, roomData) => {
const handleEditRoom = async (roomId, roomData) => {
// Reset states
setShowWebhookSecret(false);
setWebhookTestResult(null);
// Fetch full room details to get webhook fields
try {
const detailedRoom = await api?.v1RoomsGet({ roomId });
if (detailedRoom) {
setRoom({
name: detailedRoom.name,
zulipAutoPost: detailedRoom.zulip_auto_post,
zulipStream: detailedRoom.zulip_stream,
zulipTopic: detailedRoom.zulip_topic,
isLocked: detailedRoom.is_locked,
roomMode: detailedRoom.room_mode,
recordingType: detailedRoom.recording_type,
recordingTrigger: detailedRoom.recording_trigger,
isShared: detailedRoom.is_shared,
webhookUrl: detailedRoom.webhook_url || "",
webhookSecret: detailedRoom.webhook_secret || "",
});
}
} catch (error) {
console.error("Failed to fetch room details, using list data:", error);
// Fallback to using the data from the list
setRoom({
name: roomData.name,
zulipAutoPost: roomData.zulip_auto_post,
@@ -198,7 +308,10 @@ export default function RoomsList() {
recordingType: roomData.recording_type,
recordingTrigger: roomData.recording_trigger,
isShared: roomData.is_shared,
webhookUrl: roomData.webhook_url || "",
webhookSecret: roomData.webhook_secret || "",
});
}
setEditRoomId(roomId);
setIsEditing(true);
setNameError("");
@@ -233,9 +346,9 @@ export default function RoomsList() {
});
};
const myRooms: Room[] =
const myRooms: RoomDetails[] =
response?.items.filter((roomData) => !roomData.is_shared) || [];
const sharedRooms: Room[] =
const sharedRooms: RoomDetails[] =
response?.items.filter((roomData) => roomData.is_shared) || [];
if (loading && !response)
@@ -270,6 +383,8 @@ export default function RoomsList() {
setIsEditing(false);
setRoom(roomInitialState);
setNameError("");
setShowWebhookSecret(false);
setWebhookTestResult(null);
onOpen();
}}
>
@@ -279,7 +394,7 @@ export default function RoomsList() {
<Dialog.Root
open={open}
onOpenChange={(e) => (e.open ? onOpen() : onClose())}
onOpenChange={(e) => (e.open ? onOpen() : handleCloseDialog())}
size="lg"
>
<Dialog.Backdrop />
@@ -516,6 +631,109 @@ export default function RoomsList() {
</Select.Positioner>
</Select.Root>
</Field.Root>
{/* Webhook Configuration Section */}
<Field.Root mt={8}>
<Field.Label>Webhook URL</Field.Label>
<Input
name="webhookUrl"
type="url"
placeholder="https://example.com/webhook"
value={room.webhookUrl}
onChange={handleRoomChange}
/>
<Field.HelperText>
Optional: URL to receive notifications when transcripts are
ready
</Field.HelperText>
</Field.Root>
{room.webhookUrl && (
<>
<Field.Root mt={4}>
<Field.Label>Webhook Secret</Field.Label>
<Flex gap={2}>
<Input
name="webhookSecret"
type={showWebhookSecret ? "text" : "password"}
value={room.webhookSecret}
onChange={handleRoomChange}
placeholder={
isEditing && room.webhookSecret
? "••••••••"
: "Leave empty to auto-generate"
}
flex="1"
/>
{isEditing && room.webhookSecret && (
<IconButton
size="sm"
variant="ghost"
aria-label={
showWebhookSecret ? "Hide secret" : "Show secret"
}
onClick={() =>
setShowWebhookSecret(!showWebhookSecret)
}
>
{showWebhookSecret ? <LuEyeOff /> : <LuEye />}
</IconButton>
)}
</Flex>
<Field.HelperText>
Used for HMAC signature verification (auto-generated if
left empty)
</Field.HelperText>
</Field.Root>
{isEditing && (
<>
<Flex
mt={2}
gap={2}
alignItems="flex-start"
direction="column"
>
<Button
size="sm"
variant="outline"
onClick={handleTestWebhook}
disabled={testingWebhook || !room.webhookUrl}
>
{testingWebhook ? (
<>
<Spinner size="xs" mr={2} />
Testing...
</>
) : (
"Test Webhook"
)}
</Button>
{webhookTestResult && (
<div
style={{
fontSize: "14px",
wordBreak: "break-word",
maxWidth: "100%",
padding: "8px",
borderRadius: "4px",
backgroundColor: webhookTestResult.startsWith(
"✅",
)
? "#f0fdf4"
: "#fef2f2",
border: `1px solid ${webhookTestResult.startsWith("✅") ? "#86efac" : "#fca5a5"}`,
}}
>
{webhookTestResult}
</div>
)}
</Flex>
</>
)}
</>
)}
<Field.Root mt={4}>
<Checkbox.Root
name="isShared"
@@ -540,7 +758,7 @@ export default function RoomsList() {
</Field.Root>
</Dialog.Body>
<Dialog.Footer>
<Button variant="ghost" onClick={onClose}>
<Button variant="ghost" onClick={handleCloseDialog}>
Cancel
</Button>
<Button

View File

View File

0
www/app/api/types.gen.ts Normal file
View File