Commit Graph

44 Commits

Author SHA1 Message Date
7ff201f3ff Fix model download 2024-12-27 14:23:03 +01:00
895ba36cb9 fix: modal upgrade (#421) 2024-10-01 16:39:24 +02:00
5267ab2d37 feat: retake summary using NousResearch/Hermes-3-Llama-3.1-8B model (#415)
This adds a new Modal endpoint and a completely new way of building the
summary.

## SummaryBuilder

The summary builder is based on a conversational model, where an exchange
takes place between the model and the user. This allows more context to be
included and better adherence to the rules.

It requires an endpoint exposing an OpenAI-like chat completions API
(/v1/chat/completions); a request sketch follows this entry.

## vLLM Hermes3

Unlike the previous deployment, this one uses vLLM, which provides an
OpenAI-like completions endpoint out of the box. vLLM can also handle guided
JSON generation, so jsonformer is not needed; in practice, the model follows a
JSON schema well when asked to in the prompt.

## Conversion of long/short into summary builder

The builder identifies the participants, finds the key subjects, generates a
summary for each, then produces a quick recap.

The quick recap is used as the short_summary, while the markdown combining
the quick recap, key subjects, and summaries is used for the long_summary.

This is why the Next.js component has to be updated: to correctly style h1
and preserve the newlines in the markdown.
2024-09-14 02:28:38 +02:00
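As a reference for the flow described above, here is a minimal sketch of a conversational exchange against an OpenAI-compatible /v1/chat/completions endpoint such as the one vLLM serves. The base URL, helper name, prompt wording, and request parameters are assumptions for illustration, not the repository's actual client code.

```python
# Illustrative sketch only: the endpoint URL, helper, and prompts are assumed.
import requests

BASE_URL = "https://example-vllm-endpoint.modal.run"  # hypothetical deployment URL


def chat(messages, temperature=0.2):
    """Send one conversational turn and return the assistant's reply text."""
    resp = requests.post(
        f"{BASE_URL}/v1/chat/completions",
        json={
            "model": "NousResearch/Hermes-3-Llama-3.1-8B",
            "messages": messages,
            "temperature": temperature,
        },
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]


# A multi-turn exchange roughly like the one described: identify participants,
# extract the key subjects, then ask for a quick recap, requesting structured
# output in the prompt rather than relying on guided JSON generation.
history = [
    {"role": "system", "content": "You summarize meeting transcripts."},
    {"role": "user", "content": "Transcript:\n...\nList the participants as a JSON array."},
]
participants = chat(history)
history += [
    {"role": "assistant", "content": participants},
    {"role": "user", "content": "Now give a one-paragraph quick recap of the key subjects."},
]
quick_recap = chat(history)
```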
Sara
004787c055 upgrade modal 2024-08-12 12:24:14 +02:00
projects-g
06b0abaf62 deployment fix (#364) 2024-06-20 12:07:28 +05:30
projects-g
63502becd6 Move HF_token to modal secret (#354)
* update all modal deployments and change seamless configuration due to change in src repo

* add fixture

* move token to secret (a minimal sketch follows this entry)
2024-04-19 10:30:45 +05:30
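For illustration, a minimal sketch of what reading a Hugging Face token from a Modal secret can look like; the app name, secret name, and environment variable key are assumptions, not the repository's actual configuration.

```python
# Sketch only: app name, secret name, and env var key are assumed.
import os

import modal

app = modal.App("reflector-gpu-example")  # hypothetical app name


@app.function(secrets=[modal.Secret.from_name("huggingface")])
def download_model():
    # Keys defined in the Modal secret are exposed as environment variables
    # inside the container; this assumes the secret defines HF_TOKEN.
    token = os.environ["HF_TOKEN"]
    print("HF token present:", bool(token))
```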
projects-g
72b22d1005 Update all modal deployments and change seamless configuration due to changes in src repo (#353)
* update all modal deployments and change seamless configuration due to change in src repo

* add fixture
2024-04-16 21:12:24 +05:30
8b1b71940f hotfix/server: update diarization settings to increase timeout, reduce idle timeout to the minimum 2023-11-30 19:25:09 +01:00
projects-g
eae01c1495 Change diarization internal flow (#320)
* change diarization internal flow
2023-11-30 22:00:06 +05:30
projects-g
5cb132cac7 fix loading shards from local cache (#313) 2023-11-08 22:02:48 +05:30
Gokul Mohanarangan
894c989d60 update language codes 2023-10-14 17:35:30 +05:30
Sara
90c6824f52 replace two letter codes with three letter codes 2023-10-13 23:36:02 +02:00
9269db74c0 gpu: update format + country code list from two- to three-letter codes 2023-10-13 23:33:37 +02:00
6c1869b79a gpu: improve concurrency on modal - coauthored with Gokul (#286) 2023-10-13 21:15:57 +02:00
projects-g
1d92d43fe0 New summary (#283)
* handover final summary to Zephyr deployment

* fix display error

* push new summary feature

* fix failing test case

* Added markdown support for final summary

* update UI render issue

* retain sentence tokenizer call

---------

Co-authored-by: Koper <andreas@monadical.com>
2023-10-13 22:53:29 +05:30
projects-g
628c69f81c Separate out transcription and translation into own Modal deployments (#268)
* abstract transcript/translate into separate GPU apps

* update app names

* update transformers library version

* update env.example file
2023-10-13 22:01:21 +05:30
47f7e1836e server: remove warmup methods everywhere 2023-10-06 13:59:17 -04:00
projects-g
c9f613aff5 Revert GPU/Container retention settings for modal apps (#260) 2023-10-03 09:54:32 +05:30
projects-g
6a43297309 Translation enhancements (#247) 2023-09-26 19:49:54 +05:30
Gokul Mohanarangan
d7ed93ae3e fix runtime download by creating specific storage paths for models 2023-09-25 09:34:42 +05:30
Gokul Mohanarangan
19dfb1d027 Upgrade to a bigger translation model 2023-09-20 20:02:52 +05:30
projects-g
9fe261406c Feature additions (#210)
* initial

* add LLM features

* update LLM logic

* update llm functions: change control flow

* add generation config

* update return types

* update processors and tests

* update rtc_offer

* revert new title processor change

* fix unit tests

* add comments and fix HTTP 500

* adjust prompt

* test with reflector app

* revert new event for final title

* update

* move onus onto processors

* move onus onto processors

* stash

* add provision for gen config

* dynamically pack the LLM input using context length

* tune final summary params

* update consolidated class structures

* update consolidated class structures

* update precommit

* add broadcast processors

* working baseline

* Organize LLMParams

* minor fixes

* minor fixes

* minor fixes

* fix unit tests

* fix unit tests

* fix unit tests

* update tests

* update tests

* edit pipeline response events

* update summary return types

* configure tests

* alembic db migration

* change LLM response flow

* edit main llm functions

* edit main llm functions

* change llm name and gen cf

* Update transcript_topic_detector.py

* PR review comments

* checkpoint before db event migration

* update DB migration of past events

* update DB migration of past events

* edit LLM classes

* Delete unwanted file

* remove List typing

* remove List typing

* update oobabooga API call

* topic enhancements

* update UI event handling

* move ensure_casing to llm base

* update tests

* update tests
2023-09-13 11:26:08 +05:30
Gokul Mohanarangan
9a7b89adaa keep models in cache and load from cache 2023-09-08 10:05:17 +05:30
Gokul Mohanarangan
2bed312e64 persistent model storage 2023-09-08 00:22:38 +05:30
Gokul Mohanarangan
e613157fd6 update to use cache dir 2023-09-05 14:28:48 +05:30
Gokul Mohanarangan
6b84bbb4f6 download transcriber model 2023-09-05 12:52:07 +05:30
Gokul Mohanarangan
61e24969e4 change model download 2023-08-30 13:00:42 +05:30
Gokul Mohanarangan
012390d0aa backup 2023-08-30 10:43:51 +05:30
Gokul Mohanarangan
e4fe3dfd3a remove print 2023-08-28 15:28:53 +05:30
Gokul Mohanarangan
a5b8849e5e change modal app name 2023-08-28 15:22:26 +05:30
Gokul Mohanarangan
d92a0de56c update HTTP POST 2023-08-28 15:19:36 +05:30
Gokul Mohanarangan
ebbe01f282 update fixes 2023-08-28 14:32:21 +05:30
Gokul Mohanarangan
49d6e2d1dc return both en and fr in transcription 2023-08-28 14:25:44 +05:30
d76bb83fe0 modal: fix schema passing issue with shadowing BaseModel.schema default 2023-08-22 17:10:36 +02:00
Gokul Mohanarangan
a0ea32db8a review comments 2023-08-21 13:50:59 +05:30
Gokul Mohanarangan
acdd5f7dab update 2023-08-21 12:53:49 +05:30
Gokul Mohanarangan
5b0883730f translation update 2023-08-21 11:46:28 +05:30
Gokul Mohanarangan
2d686da15c pass schema as dict 2023-08-17 21:51:44 +05:30
Gokul Mohanarangan
9103c8cca8 remove ast 2023-08-17 15:15:43 +05:30
Gokul Mohanarangan
2e48f89fdc add comments and log 2023-08-17 09:33:59 +05:30
Gokul Mohanarangan
eb13a7bd64 make schema optional argument 2023-08-17 09:23:14 +05:30
Gokul Mohanarangan
5f79e04642 make schema optional for all LLMs 2023-08-16 22:37:20 +05:30
Gokul Mohanarangan
0cdd7037fb wrap JSONFormer around LLM 2023-08-16 14:03:25 +05:30
Gokul Mohanarangan
2f0e9a51f7 integrate reflector-gpu-modal repo 2023-08-16 13:28:23 +05:30