Commit Graph

33 Commits

Author SHA1 Message Date
5267ab2d37 feat: retake summary using NousResearch/Hermes-3-Llama-3.1-8B model (#415)
This introduces a new modal endpoint and a completely new way to build the
summary.

## SummaryBuilder

The summary builder is based on a conversational model, where an exchange
between the model and the user takes place. This allows more context to be
included and better adherence to the rules.

It requires an endpoint exposing an OpenAI-like chat completions API
(/v1/chat/completions)
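As a rough illustration, the payload sent to such an endpoint looks like the sketch below. The model default, prompts, and `build_chat_request` helper are illustrative assumptions, not the project's actual code.

```python
import json

def build_chat_request(system_prompt: str, user_prompt: str,
                       model: str = "NousResearch/Hermes-3-Llama-3.1-8B") -> dict:
    """Assemble the JSON body an OpenAI-like /v1/chat/completions endpoint expects."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
        "temperature": 0.2,
    }

# Hypothetical exchange: the conversational style lets each turn add context.
payload = build_chat_request(
    "You are a meeting summarizer. Follow the rules strictly.",
    "Summarize the following transcript: ...",
)
body = json.dumps(payload)
```

Each follow-up turn appends to the same `messages` list, which is what makes the conversational approach carry more context than a single prompt.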

## vLLM Hermes3

Unlike the previous deployment, this one uses vLLM, which provides an
OpenAI-like completions endpoint out of the box. It can also handle guided
JSON generation, so jsonformer is no longer needed. That said, the model is
quite good at following a JSON schema when asked in the prompt.
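A minimal sketch of the prompt-based alternative: embed the schema in the prompt and validate the reply. The schema, helper names, and the fake reply are illustrative, not the deployment's actual prompts.

```python
import json

# Illustrative schema for a summary reply (an assumption, not the real one).
SCHEMA = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "summary": {"type": "string"},
    },
    "required": ["title", "summary"],
}

def schema_prompt(instruction: str) -> str:
    """Embed the schema in the prompt so the model answers in JSON."""
    return (
        f"{instruction}\n"
        "Answer with a single JSON object matching this schema:\n"
        f"{json.dumps(SCHEMA)}"
    )

def parse_reply(reply: str) -> dict:
    """Parse the model reply and check the required keys are present."""
    data = json.loads(reply)
    missing = [k for k in SCHEMA["required"] if k not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return data

# Fake reply standing in for the model output:
result = parse_reply('{"title": "Weekly sync", "summary": "Discussed the roadmap."}')
```

vLLM's guided JSON mode enforces the schema at decode time instead; the prompt-based variant relies on the model's own schema-following ability plus this validation step.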

## Conversion of long/short into summary builder

The builder identifies participants, finds key subjects, gets a summary for
each, then produces a quick recap.

The quick recap is used as the short_summary, while the markdown combining
the quick recap, key subjects, and summaries is used for the long_summary.
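The assembly above can be sketched roughly as follows; the function and field names are hypothetical stand-ins for the actual SummaryBuilder internals.

```python
def assemble_summaries(quick_recap: str, subjects: dict[str, str]) -> tuple[str, str]:
    """Return (short_summary, long_summary) from the recap and per-subject summaries."""
    short_summary = quick_recap
    # The long summary is markdown: recap first, then one section per key subject.
    parts = ["# Quick recap", quick_recap, ""]
    for subject, summary in subjects.items():
        parts += [f"## {subject}", summary, ""]
    long_summary = "\n".join(parts).rstrip() + "\n"
    return short_summary, long_summary

short, long_md = assemble_summaries(
    "The team agreed on the release date.",
    {"Release planning": "The release was scheduled for Friday."},
)
```

The markdown headings and newlines produced here are what the updated frontend component must render correctly.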

This is why the Next.js component has to be updated, to correctly style h1
elements and keep the newlines of the markdown.
2024-09-14 02:28:38 +02:00
4dbcb80228 server: remove reference to banana.dev 2023-11-15 19:49:32 +01:00
c87c30d339 hotfix/server: add follow_redirect on modal 2023-11-02 19:09:13 +01:00
projects-g
1d92d43fe0 New summary (#283)
* handover final summary to Zephyr deployment

* fix display error

* push new summary feature

* fix failing test case

* Added markdown support for final summary

* update UI render issue

* retain sentence tokenizer call

---------

Co-authored-by: Koper <andreas@monadical.com>
2023-10-13 22:53:29 +05:30
projects-g
bbe63ad407 Fix extra space between some tokens (punctuations) (#267)
* ensure uptime for reflector.media

* remove extra space before punct

* update detokenizer method

* create detokenizer property

* merge conflict
2023-10-12 10:42:19 +05:30
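The punctuation-spacing fix above can be sketched as a small detokenizer: join the tokens, then drop the space a tokenizer leaves before punctuation. The regex and function name are illustrative, not the repository's actual detokenizer.

```python
import re

def detokenize(tokens: list[str]) -> str:
    """Join tokens, then remove spaces that precede punctuation marks."""
    text = " ".join(tokens)
    # "Hello , world !" -> "Hello, world!"
    return re.sub(r"\s+([,.;:!?])", r"\1", text)

sentence = detokenize(["Hello", ",", "world", "!"])
```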
47f7e1836e server: remove warmup methods everywhere 2023-10-06 13:59:17 -04:00
projects-g
6a43297309 Translation enhancements (#247) 2023-09-26 19:49:54 +05:30
projects-g
3a374ea593 Delete server/reflector/llm/llm_params_cod.py 2023-09-25 09:46:00 +05:30
Gokul Mohanarangan
117acfacae update comment 2023-09-25 09:43:02 +05:30
Gokul Mohanarangan
80fd5e6176 update llm params 2023-09-22 07:49:41 +05:30
Gokul Mohanarangan
009d52ea23 update casing and trimming 2023-09-22 07:29:01 +05:30
Gokul Mohanarangan
ab41ce90e8 add profanity filter, post-process topic/title 2023-09-21 11:12:00 +05:30
fb93c55993 server: fix nltk download 2023-09-13 11:40:39 +02:00
Gokul Mohanarangan
ed83236145 remove cache dir 2023-09-13 14:41:38 +05:30
Gokul Mohanarangan
9a10eef789 add nltk lookup path 2023-09-13 14:13:31 +05:30
projects-g
9fe261406c Feature additions (#210)
* initial

* add LLM features

* update LLM logic

* update llm functions: change control flow

* add generation config

* update return types

* update processors and tests

* update rtc_offer

* revert new title processor change

* fix unit tests

* add comments and fix HTTP 500

* adjust prompt

* test with reflector app

* revert new event for final title

* update

* move onus onto processors

* move onus onto processors

* stash

* add provision for gen config

* dynamically pack the LLM input using context length

* tune final summary params

* update consolidated class structures

* update consolidated class structures

* update precommit

* add broadcast processors

* working baseline

* Organize LLMParams

* minor fixes

* minor fixes

* minor fixes

* fix unit tests

* fix unit tests

* fix unit tests

* update tests

* update tests

* edit pipeline response events

* update summary return types

* configure tests

* alembic db migration

* change LLM response flow

* edit main llm functions

* edit main llm functions

* change llm name and gen cf

* Update transcript_topic_detector.py

* PR review comments

* checkpoint before db event migration

* update DB migration of past events

* update DB migration of past events

* edit LLM classes

* Delete unwanted file

* remove List typing

* remove List typing

* update oobabooga API call

* topic enhancements

* update UI event handling

* move ensure_casing to llm base

* update tests

* update tests
2023-09-13 11:26:08 +05:30
60edca6366 server: add prometheus instrumentation 2023-09-12 13:11:13 +02:00
Gokul Mohanarangan
2d686da15c pass schema as dict 2023-08-17 21:51:44 +05:30
Gokul Mohanarangan
b08724a191 correct schema typing from str to dict 2023-08-17 20:57:31 +05:30
Gokul Mohanarangan
a98a9853be PR review comments 2023-08-17 14:42:45 +05:30
Gokul Mohanarangan
eb13a7bd64 make schema optional argument 2023-08-17 09:23:14 +05:30
Gokul Mohanarangan
5f79e04642 make schema optional for all LLMs 2023-08-16 22:37:20 +05:30
Mathieu Virbel
01806ce037 server: remove warmup, increase LLM timeout for now 2023-08-11 19:56:39 +02:00
Mathieu Virbel
82ce8202bd server: improve llm warmup exception handling
If the LLM is stuck warming up or an exception happens in the pipeline, the processor responsible for the exception fails and there is no fallback, so audio continues to arrive but no processing happens. While this should be handled properly, especially after a disconnection, for now we should ignore LLM warmup issues and just keep going.

Closes #140
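A minimal sketch of the "ignore warmup issues and keep going" behavior described above; `safe_warmup` and the failing callable are illustrative stand-ins, not the real processor code.

```python
import logging

logger = logging.getLogger("pipeline")

def safe_warmup(warmup) -> bool:
    """Run a warmup callable, swallowing failures so the pipeline keeps going."""
    try:
        warmup()
        return True
    except Exception:
        # Log and continue instead of leaving the processor dead while
        # audio keeps arriving with no fallback.
        logger.exception("LLM warmup failed, continuing without it")
        return False

def failing_warmup():
    raise TimeoutError("stuck warming up")

ok = safe_warmup(failing_warmup)
```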
2023-08-11 19:33:07 +02:00
Mathieu Virbel
a06056f9bc server: fixes initial timeout for llm warmup 2023-08-11 15:45:28 +02:00
Mathieu Virbel
38a5ee0da2 server: implement warmup event for llm and transcription 2023-08-11 15:32:41 +02:00
Mathieu Virbel
445d3c1221 server: implement modal backend for llm and transcription 2023-08-11 12:43:09 +02:00
7d40305737 Implement retry that automatically detects httpx and backoff (#119)
* server: implement retry that automatically detects httpx and backoff

Closes #118

* server: fix formatting
2023-08-08 14:03:36 +02:00
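The retry behavior can be sketched in pure stdlib as exponential backoff with jitter; the actual change wires this up for httpx via the `backoff` library, which is not shown here, and the helper below is an illustrative assumption.

```python
import random
import time

def retry(func, retries: int = 3, base_delay: float = 0.01):
    """Call func, retrying on exception with exponential backoff plus jitter."""
    for attempt in range(retries):
        try:
            return func()
        except Exception:
            if attempt == retries - 1:
                raise  # out of attempts: propagate the last error
            time.sleep(base_delay * (2 ** attempt) * (1 + random.random()))

# Hypothetical flaky call that succeeds on the third attempt.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"

result = retry(flaky)
```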
Mathieu Virbel
dce92e0cf7 server: fixes pipeline logger not transmitted to processors
Closes #110
2023-08-04 12:02:18 +02:00
d94e2911c3 Serverless GPU support on banana.dev (#106)
* serverless: implement banana backend for both audio and LLM

Related to monadical-sas/reflector-gpu-banana project

* serverless: got llm working on banana !

* tests: fixes

* serverless: fix dockerfile to use fastapi server + httpx
2023-08-04 10:24:11 +02:00
Mathieu Virbel
d320558cc9 server/rtc: fix topic output 2023-08-01 19:12:51 +02:00
Mathieu Virbel
1f8e4200fd tests: rework tests and fixes bugs along the way 2023-08-01 16:05:48 +02:00
Mathieu Virbel
42f1442e56 server: introduce LLM backends 2023-08-01 14:23:34 +02:00