- Get the data from https://www.dropbox.com/sh/x3zpttp7bjevb3r/AAAeFLnIeBMBXa9DNQD4a8TOa?e=2&dl=0 and put the contents inside `data/raw/NSVA_Data/NSVA_Data`.
- Run `webscraper.py`.
- Take note that the games have been filtered to 'dal' games only, where 'dal' refers to the Dallas Mavericks.
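The 'dal'-only filter can be sketched as a simple membership check on the NSVA game IDs. The IDs below are illustrative, and the real filtering happens inside `webscraper.py`:

```python
# Sketch: keep only Dallas ('dal') games, home or away.
# Game-ID format follows the NSVA files, e.g. "0021800013-dal-vs-phx".
game_ids = [
    "0021800013-dal-vs-phx",
    "0021800021-bos-vs-nyk",
    "0021800044-lal-vs-dal",
]

# Split on "-" so 'dal' only matches a team code, not a substring elsewhere.
dal_games = [g for g in game_ids if "dal" in g.split("-")]
print(dal_games)
```

Both home and away Dallas games are kept, since the team code may appear on either side of `vs`.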
- Run `pip install -r requirements.txt`.
- Create a `.env` file and place it under `/src`; it must contain `OPENAI_API_KEY=<API_KEY>`.
- Place the video files extracted by the webscraper in the directory `/data/raw/NSVA_Video/`.
- Place the webscraper-generated metadata file under `/data/processed/final_results_{game}.csv`.
- The output commentary will be placed in `/data/text/GPT4o/{game}_commentary_results.csv`.

Note: `{game}` refers to the game_id, which is available in the NSVA file, e.g. `0021800013-dal-vs-phx`.
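Filling in the `{game}` placeholder is straightforward string formatting; here is a minimal sketch that builds the paths above from a game_id (the assumption that the ID encodes a game number plus the two team codes is inferred from the example, not documented):

```python
game_id = "0021800013-dal-vs-phx"  # example game_id from the NSVA file

# Paths used by the pipeline, per the steps above.
metadata_path = f"data/processed/final_results_{game_id}.csv"
commentary_path = f"data/text/GPT4o/{game_id}_commentary_results.csv"

# The ID appears to be an NBA game number followed by the two team codes.
nba_game_no, team_a, _, team_b = game_id.split("-")
print(metadata_path)
print(commentary_path)
```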
- Run `pip install -r requirements.txt`.
- Determine the number of action recognition classes (currently 5, based on the dataset).
- Finetune based on the specified hyperparameters (e.g. learning rate, epochs, etc.).
- The default finetuning unfreezes the last 3 layers and uses the standard cross-entropy loss.
- Go to `notebooks/inference.ipynb`.
- Create your own `generate_captions` model.
- In the last cell of the notebook, run the for loop and make minor changes as needed; the video data used can be found in the gdrive link here.
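The default finetuning setup (freeze everything, unfreeze the last 3 layers, cross-entropy loss) can be sketched in PyTorch. The backbone below is a stand-in; the real model and its hyperparameters come from the finetuning code:

```python
import torch
import torch.nn as nn

NUM_CLASSES = 5  # action recognition classes, per the dataset

# Illustrative backbone; the actual model is the one being finetuned.
model = nn.Sequential(
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Linear(64, 32),
    nn.ReLU(),
    nn.Linear(32, NUM_CLASSES),
)

# Freeze all parameters, then unfreeze the last 3 layers (the default here).
for p in model.parameters():
    p.requires_grad = False
for layer in list(model)[-3:]:
    for p in layer.parameters():
        p.requires_grad = True

criterion = nn.CrossEntropyLoss()  # standard cross-entropy loss
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)

# One illustrative training step on random data.
logits = model(torch.randn(4, 128))
loss = criterion(logits, torch.randint(0, NUM_CLASSES, (4,)))
loss.backward()
optimizer.step()
```

Only the unfrozen layers receive gradient updates, so the earlier layers keep their pretrained weights.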
- Install the required library: `pip install langchain-openai==0.1.7`, or just run `pip install -r requirements.txt`.
- Since the model runs locally, follow the installation instructions for LM Studio / LocalAI.
- Run `python src/text_personification.py` from the root directory (or anywhere; the working directory does not matter).
- The script will output both the personified text and the token usage.
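The personification step can be sketched with a stub standing in for the model call; the real script presumably routes a prompt through `langchain-openai` or the local LM Studio / LocalAI endpoint, and the function name and persona below are hypothetical:

```python
def personify(commentary: str, persona: str) -> dict:
    """Stubbed personification: the real script sends `prompt` to an LLM."""
    prompt = (
        f"Rewrite the following basketball commentary in the voice of "
        f"{persona}:\n{commentary}"
    )
    # Stub response in place of an actual model call.
    text = f"[{persona}] {commentary}"
    # Token usage here is a rough word count; the real script reports the
    # provider's token counts alongside the personified text.
    usage = {
        "prompt_tokens": len(prompt.split()),
        "completion_tokens": len(text.split()),
    }
    return {"text": text, "usage": usage}

result = personify("Doncic drains a three at the buzzer!", "an excited announcer")
print(result["text"])
print(result["usage"])
```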
- Download the models from https://huggingface.co/enlyth/baj-tts/tree/main/models and put them inside the `models` directory.
- Install the requirements: `pip install TTS==0.22.0`, or just run `pip install -r requirements.txt`.
- Run `python src/tts.py` from the root directory.
- Check the generated `.wav` file in the `output` directory.
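To sanity-check a generated file, the stdlib `wave` module is enough. This sketch writes a one-second test tone into `output/` and reads its parameters back; the filename is illustrative, and the real file comes from `src/tts.py`:

```python
import math
import struct
import wave
from pathlib import Path

out_dir = Path("output")
out_dir.mkdir(exist_ok=True)
path = out_dir / "sample.wav"

rate, freq = 22050, 440.0  # one second of a 440 Hz tone
with wave.open(str(path), "wb") as w:
    w.setnchannels(1)   # mono
    w.setsampwidth(2)   # 16-bit samples
    w.setframerate(rate)
    w.writeframes(b"".join(
        struct.pack("<h", int(32767 * 0.3 * math.sin(2 * math.pi * freq * i / rate)))
        for i in range(rate)
    ))

# Read the header back, as you would for the TTS output.
with wave.open(str(path), "rb") as w:
    duration = w.getnframes() / w.getframerate()
    channels = w.getnchannels()
print(f"{path}: {duration:.2f}s, {channels} channel(s)")
```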
This is the main program. Given the captions of the videos, the models will run and return a `.wav` file containing the personified caption as speech.

- Run `python main.py` from the root directory.