diff --git a/README.md b/README.md
index c1b28b5c..12e4ef15 100644
--- a/README.md
+++ b/README.md
@@ -32,11 +32,11 @@ To setup,
 5) Run the Whisper-JAX pipeline. Currently, the repo can take a Youtube video and transcribes/summarizes it.
-``` python3 whisjax.py "https://www.youtube.com/watch?v=ihf0S97oxuQ" --transcript transcript.txt summary.txt ```
+``` python3 whisjax.py "https://www.youtube.com/watch?v=ihf0S97oxuQ"```
 You can even run it on local file or a file in your configured S3 bucket.
-``` python3 whisjax.py "startup.mp4" --transcript transcript.txt summary.txt ```
+``` python3 whisjax.py "startup.mp4"```
 The script will take care of a few cases like youtube file, local file, video file, audio-only file, file in S3, etc. If local file is not present, it can automatically take the file from S3.
@@ -85,7 +85,7 @@ mentioned above or simply use the GUI of AWS Management Console.
 1) ```agenda_topic : ```
 3) Check all the values in ```config.ini```. You need to predefine 2 categories for which you need to scatter plot the topic modelling visualization in the config file. This is the default visualization. But, from the dataframe artefact called
- ```df.pkl``` , you can load the df and choose different topics to plot. You can filter using certain words to search for the
+ ```df_.pkl``` , you can load the df and choose different topics to plot. You can filter using certain words to search for the
 transcriptions and you can see the top influencers and characteristic in each topic we have chosen to plot in the interactive HTML document. I have added a new jupyter notebook that gives the base template to play around with, named ```Viz_experiments.ipynb```.