Commit Graph

31 Commits

Author SHA1 Message Date
5267ab2d37 feat: retake summary using NousResearch/Hermes-3-Llama-3.1-8B model (#415)
This feature a new modal endpoint, and a complete new way to build the
summary.

## SummaryBuilder

The summary builder is based on conversational model, where an exchange
between the model and the user is made. This allow more context
inclusion and a better respect of the rules.

It requires an endpoint with OpenAI-like completions endpoint
(/v1/chat/completions)

## vLLM Hermes3

Unlike previous deployment, this one use vLLM, which gives OpenAI-like
completions endpoint out of the box. It could also handle guided JSON
generation, so jsonformer is not needed. But, the model is quite good to
follow JSON schema if asked in the prompt.

## Conversion of long/short into summary builder

The builder is identifying participants, find key subjects, get a
summary for each, then get a quick recap.

The quick recap is used as a short_summary, while the markdown including
the quick recap + key subjects + summaries are used for the
long_summary.

This is why the nextjs component has to be updated, to correctly style
h1 and keep the new line of the markdown.
2024-09-14 02:28:38 +02:00
projects-g
72b22d1005 Update all modal deployments and change seamless configuration due to changes in src repo (#353)
* update all modal deployments and change seamless configuration due to change in src repo

* add fixture
2024-04-16 21:12:24 +05:30
f7f67521fc server: try reconcile both tests webrtc and upload with celery worker 2023-12-13 11:25:46 +01:00
99b973f36f server: fix tests 2023-11-22 14:41:40 +01:00
Sara
a846e38fbd fix waveform in pipeline 2023-11-17 13:38:32 +01:00
Sara
1fc261a669 try to move waveform to pipeline 2023-11-15 20:30:00 +01:00
e18a7c8d4e server: correctly save duration, when filewriter is finished 2023-11-11 01:00:09 +01:00
eb76cd9bcd server/www: rename topic text field to transcript
This aleviate the current issue with vercel deployment
2023-11-02 19:59:56 +01:00
4da890b95f server: add dummy diarization and fixes instanciation 2023-11-02 17:39:21 +01:00
d8a842f099 server: full diarization processor implementation based on gokul app 2023-11-02 17:39:21 +01:00
07c4d080c2 server: refactor with diarization, logic works 2023-11-02 17:39:21 +01:00
projects-g
1d92d43fe0 New summary (#283)
* handover final summary to Zephyr deployment

* fix display error

* push new summary feature

* fix failing test case

* Added markdown support for final summary

* update UI render issue

* retain sentence tokenizer call

---------

Co-authored-by: Koper <andreas@monadical.com>
2023-10-13 22:53:29 +05:30
4e40cc511a server: create fixture for starting the server, and always close server even if one test fail 2023-10-13 15:01:58 +02:00
Koper
149342f854 Fix unit tests 2023-10-13 10:42:52 +01:00
projects-g
6a43297309 Translation enhancements (#247) 2023-09-26 19:49:54 +05:30
Gokul Mohanarangan
0b00881ce4 update tests: LLM mock to return LLM TITLE for all cases 2023-09-25 10:22:41 +05:30
2b9eef6131 server: use mp3 as default for audio storage
Closes #223
2023-09-13 17:26:03 +02:00
projects-g
9fe261406c Feature additions (#210)
* initial

* add LLM features

* update LLM logic

* update llm functions: change control flow

* add generation config

* update return types

* update processors and tests

* update rtc_offer

* revert new title processor change

* fix unit tests

* add comments and fix HTTP 500

* adjust prompt

* test with reflector app

* revert new event for final title

* update

* move onus onto processors

* move onus onto processors

* stash

* add provision for gen config

* dynamically pack the LLM input using context length

* tune final summary params

* update consolidated class structures

* update consolidated class structures

* update precommit

* add broadcast processors

* working baseline

* Organize LLMParams

* minor fixes

* minor fixes

* minor fixes

* fix unit tests

* fix unit tests

* fix unit tests

* update tests

* update tests

* edit pipeline response events

* update summary return types

* configure tests

* alembic db migration

* change LLM response flow

* edit main llm functions

* edit main llm functions

* change llm name and gen cf

* Update transcript_topic_detector.py

* PR review comments

* checkpoint before db event migration

* update DB migration of past events

* update DB migration of past events

* edit LLM classes

* Delete unwanted file

* remove List typing

* remove List typing

* update oobabooga API call

* topic enhancements

* update UI event handling

* move ensure_casing to llm base

* update tests

* update tests
2023-09-13 11:26:08 +05:30
68dce235ec server: pass source and target language from api to pipeline 2023-08-29 11:16:23 +02:00
5c9adb2664 server: fixes tests 2023-08-18 10:23:15 +02:00
f26b6d4621 Merge branch 'main' into feat-user-auth-fief 2023-08-18 10:20:44 +02:00
Gokul Mohanarangan
b08724a191 correct schema typing from str to dict 2023-08-17 20:57:31 +05:30
Gokul Mohanarangan
a98a9853be PR review comments 2023-08-17 14:42:45 +05:30
Gokul Mohanarangan
17b850951a pull from main 2023-08-17 09:38:35 +05:30
Gokul Mohanarangan
eb13a7bd64 make schema optional argument 2023-08-17 09:23:14 +05:30
e12f9afe7b server: implement user authentication (none by default) 2023-08-16 17:24:05 +02:00
a809e5e734 server: implement wav/mp3 audio download
If set, will save audio transcription to disk.
MP3 conversion is on-request, but cached to disk as well only if it is successfull.

Closes #148
2023-08-16 09:34:26 +02:00
Mathieu Virbel
26e34aec2d server: ensure transcript status model is updated + tests 2023-08-09 11:23:28 +02:00
Mathieu Virbel
a9e0c9aa03 server: implement status update in model and websocket 2023-08-09 11:21:48 +02:00
Mathieu Virbel
7f807c8f5f server: implement FINAL_SUMMARY for websocket + update tests and fix flush 2023-08-08 19:32:20 +02:00
Mathieu Virbel
96f52c631a api: implement first server API + tests 2023-08-04 20:06:43 +02:00