5267ab2d37
feat: retake summary using NousResearch/Hermes-3-Llama-3.1-8B model ( #415 )
...
This features a new Modal endpoint and a completely new way to build the summary.
## SummaryBuilder
The summary builder is based on a conversational model: an exchange between the model and the user is made turn by turn. This allows more context to be included and the rules to be followed more closely.
It requires an endpoint exposing an OpenAI-like chat completions API (/v1/chat/completions).
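A minimal sketch of the kind of exchange the builder performs, assuming a plain httpx client; the endpoint URL, model reference, and prompts are illustrative, not the actual SummaryBuilder code:

```python
import httpx

# Illustrative only: the URL is an assumption, not the real deployment.
LLM_URL = "https://example.modal.run/v1/chat/completions"
MODEL = "NousResearch/Hermes-3-Llama-3.1-8B"

def chat(messages: list[dict]) -> str:
    """Send the conversation so far, return the model's reply."""
    response = httpx.post(
        LLM_URL,
        json={"model": MODEL, "messages": messages},
        timeout=120,
        follow_redirects=True,
    )
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]

# Multi-turn exchange: every answer is fed back into the context,
# so each later question builds on the previous ones.
messages = [{"role": "system", "content": "You summarize meeting transcripts."}]
for question in (
    "Here is the transcript: ... Who are the participants?",
    "What are the key subjects discussed?",
):
    messages.append({"role": "user", "content": question})
    messages.append({"role": "assistant", "content": chat(messages)})
```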
## vLLM Hermes3
Unlike previous deployments, this one uses vLLM, which provides an OpenAI-like completions endpoint out of the box. It can also handle guided JSON generation, so jsonformer is not needed; in practice the model is quite good at following a JSON schema when it is asked for in the prompt.
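As a sketch of the guided route, vLLM's OpenAI-compatible server accepts a `guided_json` field that constrains decoding to a schema; the schema, prompt, and URL below are assumptions for illustration:

```python
import json
import httpx

# Illustrative schema: the real field names may differ.
TOPIC_SCHEMA = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "summary": {"type": "string"},
    },
    "required": ["title", "summary"],
}

payload = {
    "model": "NousResearch/Hermes-3-Llama-3.1-8B",
    "messages": [{"role": "user", "content": "Summarize this topic as JSON: ..."}],
    # vLLM extension: constrain decoding to the schema. Optional here,
    # since the model also follows a schema stated in the prompt.
    "guided_json": TOPIC_SCHEMA,
}

# URL is an assumption, as above.
response = httpx.post("https://example.modal.run/v1/chat/completions",
                      json=payload, timeout=120)
topic = json.loads(response.json()["choices"][0]["message"]["content"])
```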
## Conversion of long/short into summary builder
The builder identifies participants, finds key subjects, gets a summary for each, then gets a quick recap.
The quick recap is used as the short_summary, while the markdown combining the quick recap, key subjects, and summaries is used for the long_summary.
This is why the Next.js component has to be updated, to correctly style h1 elements and keep the newlines of the markdown.
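A sketch of how the two fields could be assembled from the builder's intermediate results; the exact markdown layout and variable names are assumptions:

```python
# Intermediate results produced by the builder (values illustrative).
quick_recap = "The team agreed to ship the new summary pipeline."
subjects = [
    ("Deployment", "vLLM serves Hermes-3 behind a Modal endpoint."),
    ("UI", "The Next.js component needs h1 styling and newline fixes."),
]

# The quick recap alone becomes the short summary.
short_summary = quick_recap

# The long summary is markdown: recap, then one section per key subject.
parts = ["# Quick recap", "", quick_recap, "", "# Key subjects"]
for title, summary in subjects:
    parts += ["", f"## {title}", "", summary]
long_summary = "\n".join(parts)
```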
2024-09-14 02:28:38 +02:00
4dbcb80228
server: remove reference to banana.dev
2023-11-15 19:49:32 +01:00
c87c30d339
hotfix/server: add follow_redirect on modal
2023-11-02 19:09:13 +01:00
projects-g
1d92d43fe0
New summary ( #283 )
...
* handover final summary to Zephyr deployment
* fix display error
* push new summary feature
* fix failing test case
* Added markdown support for final summary
* update UI render issue
* retain sentence tokenizer call
---------
Co-authored-by: Koper <andreas@monadical.com>
2023-10-13 22:53:29 +05:30
projects-g
bbe63ad407
Fix extra space between some tokens (punctuations) ( #267 )
...
* ensure uptime for reflector.media
* remove extra space before punct
* update detokenizer method
* create detokenizer property
* merge conflict
2023-10-12 10:42:19 +05:30
47f7e1836e
server: remove warmup methods everywhere
2023-10-06 13:59:17 -04:00
projects-g
6a43297309
Translation enhancements ( #247 )
2023-09-26 19:49:54 +05:30
projects-g
3a374ea593
Delete server/reflector/llm/llm_params_cod.py
2023-09-25 09:46:00 +05:30
Gokul Mohanarangan
117acfacae
update comment
2023-09-25 09:43:02 +05:30
Gokul Mohanarangan
80fd5e6176
update llm params
2023-09-22 07:49:41 +05:30
Gokul Mohanarangan
009d52ea23
update casing and trimming
2023-09-22 07:29:01 +05:30
Gokul Mohanarangan
ab41ce90e8
add profanity filter, post-process topic/title
2023-09-21 11:12:00 +05:30
fb93c55993
server: fix nltk download
2023-09-13 11:40:39 +02:00
Gokul Mohanarangan
ed83236145
remove cache dir
2023-09-13 14:41:38 +05:30
Gokul Mohanarangan
9a10eef789
add nltk lookup path
2023-09-13 14:13:31 +05:30
projects-g
9fe261406c
Feature additions ( #210 )
...
* initial
* add LLM features
* update LLM logic
* update llm functions: change control flow
* add generation config
* update return types
* update processors and tests
* update rtc_offer
* revert new title processor change
* fix unit tests
* add comments and fix HTTP 500
* adjust prompt
* test with reflector app
* revert new event for final title
* update
* move onus onto processors
* move onus onto processors
* stash
* add provision for gen config
* dynamically pack the LLM input using context length
* tune final summary params
* update consolidated class structures
* update consolidated class structures
* update precommit
* add broadcast processors
* working baseline
* Organize LLMParams
* minor fixes
* minor fixes
* minor fixes
* fix unit tests
* fix unit tests
* fix unit tests
* update tests
* update tests
* edit pipeline response events
* update summary return types
* configure tests
* alembic db migration
* change LLM response flow
* edit main llm functions
* edit main llm functions
* change llm name and gen cf
* Update transcript_topic_detector.py
* PR review comments
* checkpoint before db event migration
* update DB migration of past events
* update DB migration of past events
* edit LLM classes
* Delete unwanted file
* remove List typing
* remove List typing
* update oobabooga API call
* topic enhancements
* update UI event handling
* move ensure_casing to llm base
* update tests
* update tests
2023-09-13 11:26:08 +05:30
60edca6366
server: add prometheus instrumentation
2023-09-12 13:11:13 +02:00
Gokul Mohanarangan
2d686da15c
pass schema as dict
2023-08-17 21:51:44 +05:30
Gokul Mohanarangan
b08724a191
correct schema typing from str to dict
2023-08-17 20:57:31 +05:30
Gokul Mohanarangan
a98a9853be
PR review comments
2023-08-17 14:42:45 +05:30
Gokul Mohanarangan
eb13a7bd64
make schema optional argument
2023-08-17 09:23:14 +05:30
Gokul Mohanarangan
5f79e04642
make schema optional for all LLMs
2023-08-16 22:37:20 +05:30
Mathieu Virbel
01806ce037
server: remove warmup, increase LLM timeout for now
2023-08-11 19:56:39 +02:00
Mathieu Virbel
82ce8202bd
server: improve llm warmup exception handling
...
If the LLM is stuck warming up, or an exception happens in the pipeline, the processor responsible for the exception fails and there is no fallback: audio continues to arrive, but no processing happens. While this should eventually be handled properly, especially after a disconnection, for now we should ignore LLM warmup issues and just keep going (see the sketch below).
Closes #140
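A minimal sketch of that best-effort behavior; the `llm.warmup()` helper is hypothetical, not the project's actual processor code:

```python
import logging

logger = logging.getLogger(__name__)

async def warmup_llm(llm) -> None:
    """Best-effort warmup: a failure here must not break the pipeline."""
    try:
        await llm.warmup()  # hypothetical warmup call
    except Exception:
        # Swallow the error: audio keeps arriving, and the first real
        # request simply pays the cold-start cost instead.
        logger.exception("LLM warmup failed, continuing anyway")
```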
2023-08-11 19:33:07 +02:00
Mathieu Virbel
a06056f9bc
server: fixes initial timeout for llm warmup
2023-08-11 15:45:28 +02:00
Mathieu Virbel
38a5ee0da2
server: implement warmup event for llm and transcription
2023-08-11 15:32:41 +02:00
Mathieu Virbel
445d3c1221
server: implement modal backend for llm and transcription
2023-08-11 12:43:09 +02:00
7d40305737
Implement retry that automatically detects httpx and backoff ( #119 )
...
* server: implement retry that automatically detects httpx and backoff (see the sketch below)
Closes #118
* server: fix formatting
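One plausible shape for such a retry helper, combining the backoff library with httpx's exception types; this is a sketch under those assumptions, not the repository's actual implementation:

```python
import backoff
import httpx

# Exponential backoff on transient httpx failures, up to 5 attempts.
retry = backoff.on_exception(
    backoff.expo,
    (httpx.TransportError, httpx.HTTPStatusError),
    max_tries=5,
)

@retry
def fetch(url: str) -> dict:
    response = httpx.get(url, timeout=30)
    response.raise_for_status()  # raises HTTPStatusError on 4xx/5xx
    return response.json()
```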
2023-08-08 14:03:36 +02:00
Mathieu Virbel
dce92e0cf7
server: fixes pipeline logger not transmitted to processors
...
Closes #110
2023-08-04 12:02:18 +02:00
d94e2911c3
Serverless GPU support on banana.dev ( #106 )
...
* serverless: implement banana backend for both audio and LLM
Related to monadical-sas/reflector-gpu-banana project
* serverless: got llm working on banana!
* tests: fixes
* serverless: fix dockerfile to use fastapi server + httpx
2023-08-04 10:24:11 +02:00
Mathieu Virbel
d320558cc9
server/rtc: fix topic output
2023-08-01 19:12:51 +02:00
Mathieu Virbel
1f8e4200fd
tests: rework tests and fix bugs along the way
2023-08-01 16:05:48 +02:00
Mathieu Virbel
42f1442e56
server: introduce LLM backends
2023-08-01 14:23:34 +02:00