# Reflector
This is the codebase for the Reflector demo (formerly called agenda-talk-diff) for the leads: Troy Web Consulting panel (A Chat with AWS about AI: Real AI/ML AWS projects and what you should know) on 6/14 at 4:30 PM.

The target deliverable is a local-first live transcription and visualization tool that compares a discussion's target agenda/objectives against the actual discussion as it happens.
To set up:

- Check the values in the `config.ini` file. Specifically, add your `OPENAI_APIKEY`.
- Run `export KMP_DUPLICATE_LIB_OK=True` in Terminal. [This is taken care of in code, but it is not taking effect; will fix this issue later.]
- Run the script `setup_dependencies.sh`:

  ```
  chmod +x setup_dependencies.sh
  sh setup_dependencies.sh <ENV>
  ```

  `ENV` refers to the intended environment for JAX. JAX is available in several variants: [CPU | GPU | Colab TPU | Google Cloud TPU]

  - `cpu` -> JAX CPU installation
  - `cuda11` -> JAX CUDA 11.x version
  - `cuda12` -> JAX CUDA 12.x version (CoreWeave has the CUDA 12 version; you can check with `nvidia-smi`)

  For example: `sh setup_dependencies.sh cuda12`
- Run the Whisper-JAX pipeline. Currently, the repo takes a YouTube video and transcribes/summarizes it:

  ```
  python3 whisjax.py "https://www.youtube.com/watch?v=ihf0S97oxuQ" --transcript transcript.txt summary.txt
  ```
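For orientation, here is a minimal sketch of what a pipeline like `whisjax.py` does end to end, assuming yt-dlp for the YouTube download, Whisper-JAX for transcription, and the OpenAI chat API for summarization. The config section name, model checkpoint, and library choices are assumptions and may differ from the actual script.

```python
# Illustrative sketch only -- names, sections, and libraries are assumptions,
# not the actual whisjax.py code.
import configparser
import os

import openai
import yt_dlp
from whisper_jax import FlaxWhisperPipline

# In-code workaround mentioned in the setup steps above.
os.environ.setdefault("KMP_DUPLICATE_LIB_OK", "True")

# 1. Read the OpenAI key from config.ini (the [DEFAULT] section is assumed).
config = configparser.ConfigParser()
config.read("config.ini")
openai.api_key = config["DEFAULT"]["OPENAI_APIKEY"]

# 2. Download the audio track of the YouTube video (requires ffmpeg).
url = "https://www.youtube.com/watch?v=ihf0S97oxuQ"
ydl_opts = {
    "format": "bestaudio/best",
    "outtmpl": "audio.%(ext)s",
    "postprocessors": [{"key": "FFmpegExtractAudio", "preferredcodec": "mp3"}],
}
with yt_dlp.YoutubeDL(ydl_opts) as ydl:
    ydl.download([url])

# 3. Transcribe with Whisper-JAX (checkpoint choice is an assumption).
pipe = FlaxWhisperPipline("openai/whisper-large-v2")
transcript = pipe("audio.mp3", task="transcribe")["text"]
with open("transcript.txt", "w") as f:
    f.write(transcript)

# 4. Summarize with the OpenAI chat API (openai<1.0 style; a real script
#    would chunk transcripts that exceed the model's context window).
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user",
               "content": "Summarize this transcript:\n" + transcript}],
)
with open("summary.txt", "w") as f:
    f.write(response.choices[0].message.content)
```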
NEXT STEPS:
- Run this demo on a local Mac M1 to test flow and observe the performance
- Create a pipeline that listens to microphone audio in chunks and performs transcription in real time (and also summarizes it efficiently); a rough sketch follows this list
- Create a RunPod setup for this feature (mentioned in the first two points) and test it end-to-end
- Perform Speaker Diarization using Whisper-JAX
- Based on the feasibility of the above points, explore suitable visualizations for transcription & summarization.
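Below is a minimal sketch of the microphone-based real-time transcription idea from the list above, assuming the `sounddevice` package for audio capture and the same Whisper-JAX pipeline. The chunk length, checkpoint, and the dict input form (which mirrors the Hugging Face ASR pipeline) are assumptions, not repo code.

```python
# Hypothetical sketch: capture fixed-length microphone chunks and transcribe
# each one with Whisper-JAX. All names and parameters here are placeholders.
import sounddevice as sd
from whisper_jax import FlaxWhisperPipline

SAMPLE_RATE = 16_000      # Whisper models expect 16 kHz mono audio
CHUNK_SECONDS = 30        # transcribe in 30-second windows

pipe = FlaxWhisperPipline("openai/whisper-large-v2")

while True:
    # Block until one chunk has been captured from the default microphone.
    audio = sd.rec(int(CHUNK_SECONDS * SAMPLE_RATE),
                   samplerate=SAMPLE_RATE, channels=1, dtype="float32")
    sd.wait()
    # Pass the raw samples and their sampling rate to the pipeline.
    chunk_text = pipe({"array": audio.squeeze(),
                       "sampling_rate": SAMPLE_RATE})["text"]
    print(chunk_text)     # feed this into the live summarizer/visualizer
```

A production version would likely overlap chunks or use voice-activity detection so that words are not cut off at chunk boundaries, and would summarize incrementally rather than per chunk.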