reflector/server/reflector/tools/exportdanswer.py
Mathieu Virbel 9eab952c63 feat: postgresql migration and removal of sqlite in pytest (#546)
* feat: remove support for sqlite, 100% postgres

* fix: more migrations, and make datetimes timezone-aware in postgres

* fix: change how the database is retrieved, and use a contextvar so each event loop gets its own instance

* test: properly use the client fixture that handles app lifecycle and database connections

* fix: add missing client fixture parameters to test functions

This commit fixes NameError issues where test functions were trying to use
the 'client' fixture but didn't have it as a parameter (a sketch of such a
fixture follows the list). The changes include:

1. Added 'client' parameter to test functions in:
   - test_transcripts_audio_download.py (6 functions including fixture)
   - test_transcripts_speaker.py (3 functions)
   - test_transcripts_upload.py (1 function)
   - test_transcripts_rtc_ws.py (2 functions + appserver fixture)

2. Resolved naming conflicts in test_transcripts_rtc_ws.py, where both the
   HTTP client and the StreamClient used the variable name 'client'.
   StreamClient instances are now named 'stream_client' to avoid the clash.

3. Added missing 'from reflector.app import app' import in rtc_ws tests.
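
For illustration, such a shared 'client' fixture could look roughly like
the sketch below; the LifespanManager usage and the fixture body are
assumptions for illustration, not the repository's actual code:

    import pytest_asyncio
    from asgi_lifespan import LifespanManager
    from httpx import ASGITransport, AsyncClient

    from reflector.app import app

    @pytest_asyncio.fixture
    async def client():
        # Run the FastAPI lifespan (startup/shutdown, including the
        # database connection) around each test that uses the fixture.
        async with LifespanManager(app):
            transport = ASGITransport(app=app)
            async with AsyncClient(transport=transport, base_url="http://test") as c:
                yield c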

Background: the previously implemented contextvars solution, with its
get_database() function, resolves asyncio event loop conflicts in Celery
tasks. The global client fixture was also created to replace manual
AsyncClient instances, ensuring proper FastAPI application lifecycle
management and database connections during tests.
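
A rough sketch of that contextvar pattern (module layout and names other
than get_database() and settings.DATABASE_URL are assumptions; the
`databases` package is inferred from the connect/fetch_all usage in the
file below):

    from contextvars import ContextVar
    from typing import Optional

    from databases import Database

    from reflector.settings import settings

    _database: ContextVar[Optional[Database]] = ContextVar("database", default=None)

    def get_database() -> Database:
        # A fresh context (e.g. a Celery task driving its own asyncio
        # loop) starts with None and builds its own Database instance,
        # so connections are never shared across event loops.
        database = _database.get()
        if database is None:
            database = Database(settings.DATABASE_URL)
            _database.set(database)
        return database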

All tests now pass, except for two pre-existing RTC WebSocket test failures
caused by asyncpg connection issues unrelated to these fixes.

* fix: ensure tasks are correctly closed

* fix: make separate event loop for the live server

* fix: make default settings point at postgres

* build: move pytest-docker deps out of dev, into just the tests group
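
For the default-settings change, a plausible shape is a pydantic settings
class whose DATABASE_URL now defaults to a postgres DSN; this is a sketch
only, the actual settings module and DSN are not shown here:

    from pydantic_settings import BaseSettings

    class Settings(BaseSettings):
        # Default now targets postgres instead of sqlite; overridable
        # via the DATABASE_URL environment variable.
        DATABASE_URL: str = "postgresql://reflector:reflector@localhost:5432/reflector"

    settings = Settings()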
2025-08-14 11:40:52 -06:00

72 lines
2.5 KiB
Python

import json
import pathlib
from datetime import timedelta


async def export_db(filename: str) -> None:
    from reflector.settings import settings

    # Point the app settings at the standalone sqlite file before the
    # database module is imported, so it picks up this URL.
    filename = pathlib.Path(filename).resolve()
    settings.DATABASE_URL = f"sqlite:///{filename}"

    from reflector.db import get_database, transcripts

    database = get_database()
    await database.connect()
    # Renamed from `transcripts` to avoid shadowing the imported table.
    rows = await database.fetch_all(transcripts.select())
    await database.disconnect()

    def export_transcript(transcript, output_dir):
        # One text file per topic, prefixed with a JSON metadata line in
        # the format Danswer expects for indexing.
        for topic in transcript.topics:
            metadata = {
                "link": f"https://reflector.media/transcripts/{transcript.id}#topic:{topic['id']},timestamp:{topic['timestamp']}",
                "transcript_id": transcript.id,
                "transcript_created_at": transcript.created_at.isoformat(),
                "topic_id": topic["id"],
                "topic_relative_timestamp": topic["timestamp"],
                "topic_created_at": (
                    transcript.created_at + timedelta(seconds=topic["timestamp"])
                ).isoformat(),
                "topic_title": topic["title"],
            }
            j_metadata = json.dumps(metadata)
            # export transcript
            output = output_dir / f"{transcript.id}-topic-{topic['id']}.txt"
            with open(output, "w", encoding="utf8") as fd:
                fd.write(f"#DANSWER_METADATA={j_metadata}\n")
                fd.write("\n")
                fd.write(f"# {topic['title']}\n")
                fd.write("\n")
                fd.write(f"{topic['transcript']}\n")

        # # export summary
        # output = output_dir / f"{transcript.id}-summary.txt"
        # metadata = {
        #     "link": f"https://reflector.media/transcripts/{transcript.id}",
        #     "rfl_id": transcript.id,
        # }
        # j_metadata = json.dumps(metadata)
        # with open(output, "w", encoding="utf8") as fd:
        #     fd.write(f"#DANSWER_METADATA={j_metadata}\n")
        #     fd.write("\n")
        #     fd.write("# Summary\n")
        #     fd.write("\n")
        #     fd.write(f"{transcript.long_summary}\n")

    output_dir = pathlib.Path("exportdanswer")
    # Create the export directory up front; open() above would otherwise
    # fail with FileNotFoundError on a fresh checkout.
    output_dir.mkdir(exist_ok=True)
    for transcript in rows:
        export_transcript(transcript, output_dir)


if __name__ == "__main__":
    import argparse
    import asyncio

    parser = argparse.ArgumentParser()
    parser.add_argument("database", help="SQLite database file")
    args = parser.parse_args()
    asyncio.run(export_db(args.database))
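
Each exported file thus starts with a "#DANSWER_METADATA={...}" line,
followed by the topic title and the topic transcript. Presumably the tool
is run against a standalone SQLite snapshot of the database, e.g. (the
invocation path and file name are illustrative):

    python -m reflector.tools.exportdanswer ./reflector.sqlite3

which writes the per-topic files into an exportdanswer/ directory under
the current working directory.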