server: dockerize the server and update documentation

Mathieu Virbel
2023-07-27 12:18:49 +02:00
parent 305db8a3e4
commit b5e0baa6c8
6 changed files with 1784 additions and 47 deletions

1
server/.python-version Normal file

@@ -0,0 +1 @@
3.11

30
server/Dockerfile Normal file

@@ -0,0 +1,30 @@
FROM python:3.11-slim AS base
ENV PIP_DEFAULT_TIMEOUT=100 \
    PIP_DISABLE_PIP_VERSION_CHECK=1 \
    PIP_NO_CACHE_DIR=1 \
    PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    POETRY_VERSION=1.3.1

# install packages needed for base
# RUN apt-get update && apt-get install -y make gettext

# builder
FROM base AS builder
WORKDIR /tmp
# RUN apt-get install -y build-essential libffi-dev zlib1g-dev
COPY pyproject.toml poetry.lock /tmp/
RUN pip install "poetry==$POETRY_VERSION"
RUN python -m venv /venv
RUN . /venv/bin/activate && poetry config virtualenvs.create false
RUN . /venv/bin/activate && poetry install --only main --no-root --no-interaction --no-ansi

# bootstrap
FROM base AS final
COPY --from=builder /venv /venv
RUN mkdir -p /app
COPY server.py reflector_dataclasses.py /app/
COPY utils /app/utils
WORKDIR /app
CMD ["/venv/bin/python", "server.py"]


@@ -1,5 +1,49 @@
# Reflector
The Reflector server is currently responsible for audio transcription and summarization.
_The project is moving fast; the documentation is currently unstable and outdated_
## Server
We currently use oobabooga as an LLM backend.
### Using docker
Create a `.env` with
```
LLM_URL=http://IP:PORT/api/v1/generate
```
Then start with:
```
$ docker-compose up server
```
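Before starting the server, you can sanity-check that the LLM backend is reachable (a rough sketch assuming the oobabooga text-generation-webui API; the request fields shown are an assumption and may differ between backend versions):
```
$ curl -s http://IP:PORT/api/v1/generate \
    -H "Content-Type: application/json" \
    -d '{"prompt": "Hello", "max_new_tokens": 16}'
```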
### Using local environment
Install the dependencies with poetry:
```
$ poetry install
```
Then run the server:
```
# With a config.ini
$ poetry run python server.py
# Within a poetry env
$ poetry shell
$ LLM_URL=http://.../api/v1/generate python server.py
```
# Old documentation
This is the code base for the Reflector demo (formerly called agenda-talk-diff) for the leads: Troy Web Consulting
panel (A Chat with AWS about AI: Real AI/ML AWS projects and what you should know) on 6/14 at 4:30 PM.
@@ -16,10 +60,10 @@ script is run. You need AWS_ACCESS_KEY / AWS_SECRET_KEY to authenticate your cal
For AWS S3 Web UI,
1) Login to AWS management console.
2) Search for S3 in the search bar at the top.
3) Navigate to list the buckets under the current account, if needed and choose your bucket [```reflector-bucket```]
4) You should be able to see items in the bucket. You can upload/download files here directly.
1. Log in to the AWS Management Console.
2. Search for S3 in the search bar at the top.
3. Navigate to the list of buckets under the current account, if needed, and choose your bucket [```reflector-bucket```]
4. You should be able to see items in the bucket. You can upload/download files here directly.
For CLI,
Refer to the FILE UTIL section below.
@@ -32,90 +76,92 @@ pipeline workflow in the script, you can do so by :
Upload:
``` python3 file_util.py upload <object_name_in_S3_bucket>```
` python3 file_util.py upload <object_name_in_S3_bucket>`
Download:
``` python3 file_util.py download <object_name_in_S3_bucket>```
` python3 file_util.py download <object_name_in_S3_bucket>`
If you want to access the S3 artefacts from another machine, you can either use the Python file_util with the commands
mentioned above or simply use the GUI of the AWS Management Console.
To set up:
1) Check values in config.ini file. Specifically add your OPENAI_APIKEY if you plan to use OpenAI API requests.
2) Run ``` export KMP_DUPLICATE_LIB_OK=True``` in
1. Check the values in the config.ini file. Specifically, add your OPENAI_APIKEY if you plan to use OpenAI API requests.
2. Run ` export KMP_DUPLICATE_LIB_OK=True` in
Terminal. [This is taken care of in code but is not reflected yet; will fix this issue later.]
NOTE: If you don't have portaudio installed already, run ```brew install portaudio```
NOTE: If you don't have portaudio installed already, run `brew install portaudio`
3) Run the script setup_depedencies.sh.
3. Run the script setup_dependencies.sh.
``` chmod +x setup_dependencies.sh ```
`chmod +x setup_dependencies.sh`
``` sh setup_dependencies.sh <ENV>```
` sh setup_dependencies.sh <ENV>`
ENV refers to the intended environment for JAX. JAX is available in several
variants, [CPU | GPU | Colab TPU | Google Cloud TPU]
```ENV``` is :
`ENV` is:
cpu -> JAX CPU installation
cuda11 -> JAX CUDA 11.x version
cuda12 -> JAX CUDA 12.x version (Core Weave has CUDA 12 version, can check with ```nvidia-smi```)
cuda12 -> JAX CUDA 12.x version (Core Weave has CUDA 12 version, can check with `nvidia-smi`)
```sh setup_dependencies.sh cuda12```
4) If not already done, install ffmpeg. ```brew install ffmpeg```
4. If not already done, install ffmpeg. `brew install ffmpeg`
For NLTK SSL error,
check [here](https://stackoverflow.com/questions/38916452/nltk-download-ssl-certificate-verify-failed)
5) Run the Whisper-JAX pipeline. Currently, the repo can take a Youtube video and transcribes/summarizes it.
5. Run the Whisper-JAX pipeline. Currently, the repo can take a YouTube video and transcribe/summarize it.
``` python3 whisjax.py "https://www.youtube.com/watch?v=ihf0S97oxuQ"```
` python3 whisjax.py "https://www.youtube.com/watch?v=ihf0S97oxuQ"`
You can even run it on a local file or a file in your configured S3 bucket.
``` python3 whisjax.py "startup.mp4"```
` python3 whisjax.py "startup.mp4"`
The script will take care of a few cases like a YouTube file, local file, video file, audio-only file,
file in S3, etc. If the local file is not present, it can automatically take the file from S3.
**OFFLINE WORKFLOW:**
1) Specify the input source file] from a local, youtube link or upload to S3 if needed and pass it as input to the
1. Specify the input source file from a local path or a YouTube link, or upload it to S3 if needed, and pass it as input to the
script. If the source file is in
```.m4a``` format, it will get converted to ```.mp4``` automatically.
2) Keep the agenda header topics in a local file named ```agenda-headers.txt```. This needs to be present where the
`.m4a` format, it will get converted to `.mp4` automatically.
2. Keep the agenda header topics in a local file named `agenda-headers.txt`. This needs to be present where the
script is run.
This version of the pipeline compares covered agenda topics using agenda headers in the following format.
1) ```agenda_topic : <short description>```
3) Check all the values in ```config.ini```. You need to predefine 2 categories for which you need to scatter plot the
1. `agenda_topic : <short description>`
3. Check all the values in `config.ini`. You need to predefine 2 categories for which you need to scatter plot the
topic modelling visualization in the config file. This is the default visualization. But, from the dataframe artefact
called
```df_<timestamp>.pkl``` , you can load the df and choose different topics to plot. You can filter using certain
`df_<timestamp>.pkl`, you can load the df and choose different topics to plot. You can filter using certain
words to search for the
transcriptions, and you can see the top influencers and characteristics in each topic we have chosen to plot in the
interactive HTML document. I have added a new Jupyter notebook that gives the base template to play around with,
named
```Viz_experiments.ipynb```.
4) Run the script. The script automatically transcribes, summarizes and creates a scatter plot of words & topics in the
`Viz_experiments.ipynb`.
4. Run the script. The script automatically transcribes, summarizes and creates a scatter plot of words & topics in the
form of an interactive
HTML file and a sample word cloud, and uploads them to the S3 bucket.
5) Additional artefacts pushed to S3:
1) HTML visualization file
2) pandas df in pickle format for others to collaborate and make their own visualizations
3) Summary, transcript and transcript with timestamps file in text format.
5. Additional artefacts pushed to S3:
1. HTML visualization file
2. pandas df in pickle format for others to collaborate and make their own visualizations
3. Summary, transcript and transcript with timestamps file in text format.
The script also creates 2 types of mappings.
1) Timestamp -> The top 2 matched agenda topic
2) Topic -> All matched timestamps in the transcription
1. Timestamp -> The top 2 matched agenda topics
2. Topic -> All matched timestamps in the transcription
Other visualizations can be planned based on available artefacts, or new ones can be created. Refer to the
section ```Viz-experiments```.
section `Viz-experiments`.
**Visualization experiments:**
@@ -135,21 +181,22 @@ We need a way to route audio from an application opened via the browser, ex. "W
microphone input which you will be using for speaking. We
use [Blackhole](https://github.com/ExistentialAudio/BlackHole).
1) Install Blackhole-2ch (2 ch is enough) by 1 of 2 options listed.
2) Setup [Aggregate device](https://github.com/ExistentialAudio/BlackHole/wiki/Aggregate-Device) to route web audio and
1. Install BlackHole-2ch (2 ch is enough) using one of the two options listed.
2. Set up [Aggregate device](https://github.com/ExistentialAudio/BlackHole/wiki/Aggregate-Device) to route web audio and
local microphone input.
Be sure to mirror the settings given ![here](./images/aggregate_input.png)
3) Setup [Multi-Output device](https://github.com/ExistentialAudio/BlackHole/wiki/Multi-Output-Device)
3. Set up [Multi-Output device](https://github.com/ExistentialAudio/BlackHole/wiki/Multi-Output-Device)
Refer ![here](./images/multi-output.png)
4) Set the aggregator input device name created in step 2 in config.ini as ```BLACKHOLE_INPUT_AGGREGATOR_DEVICE_NAME```
4. Set the aggregator input device name created in step 2 in config.ini as `BLACKHOLE_INPUT_AGGREGATOR_DEVICE_NAME`
5) Then goto ``` System Preferences -> Sound ``` and choose the devices created from the Output and
5. Then go to `System Preferences -> Sound` and choose the devices created from the Output and
Input tabs.
6) The input from your local microphone, the browser run meeting should be aggregated into one virtual stream to listen
6. The input from your local microphone and the browser-run meeting should be aggregated into one virtual stream to listen
to
and the output should be fed back to your specified output devices if everything is configured properly. Check this
before trying out the trial.
@@ -157,18 +204,18 @@ use [Blackhole](https://github.com/ExistentialAudio/BlackHole).
**Permissions:**
You may have to add permission for "Terminal"/code editors [PyCharm/VSCode, etc.] to access the microphone and record audio in
```System Preferences -> Privacy & Security -> Microphone```,
```System Preferences -> Privacy & Security -> Accessibility```,
```System Preferences -> Privacy & Security -> Input Monitoring```.
`System Preferences -> Privacy & Security -> Microphone`,
`System Preferences -> Privacy & Security -> Accessibility`,
`System Preferences -> Privacy & Security -> Input Monitoring`.
From the reflector root folder,
run ```python3 whisjax_realtime.py```
run `python3 whisjax_realtime.py`
The transcription text should be written to ```real_time_transcription_<timestamp>.txt```.
The transcription text should be written to `real_time_transcription_<timestamp>.txt`.
NEXT STEPS:
1) Create a RunPod setup for this feature (mentioned in 1 & 2) and test it end-to-end
2) Perform Speaker Diarization using Whisper-JAX
3) Based on the feasibility of the above points, explore suitable visualizations for transcription & summarization.
1. Create a RunPod setup for this feature (mentioned in 1 & 2) and test it end-to-end
2. Perform Speaker Diarization using Whisper-JAX
3. Based on the feasibility of the above points, explore suitable visualizations for transcription & summarization.

14
server/docker-compose.yml Normal file

@@ -0,0 +1,14 @@
version: "3.9"
services:
  server:
    build:
      context: .
    ports:
      - 1250:1250
    environment:
      LLM_URL: "${LLM_URL}"
    volumes:
      - model-cache:/root/.cache

volumes:
  model-cache:
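As a usage sketch (assuming the `.env` file described in the README above): the named `model-cache` volume maps `/root/.cache`, which should keep downloaded models (e.g., the faster-whisper model) across container rebuilds.

```
$ echo "LLM_URL=http://IP:PORT/api/v1/generate" > .env
$ docker-compose up --build server
```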

1622
server/poetry.lock generated Normal file

File diff suppressed because it is too large

23
server/pyproject.toml Normal file

@@ -0,0 +1,23 @@
[tool.poetry]
name = "reflector-server"
version = "0.1.0"
description = ""
authors = ["Monadical team <ops@monadical.com>"]
readme = "README.md"
packages = [{include = "reflector_server"}]

[tool.poetry.dependencies]
python = "^3.11"
aiohttp = "^3.8.5"
aiohttp-cors = "^0.7.0"
av = "^10.0.0"
requests = "^2.31.0"
aiortc = "^1.5.0"
faster-whisper = "^0.7.1"
sortedcontainers = "^2.4.0"
loguru = "^0.7.0"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"