- Get the data from https://www.dropbox.com/sh/x3zpttp7bjevb3r/AAAeFLnIeBMBXa9DNQD4a8TOa?e=2&dl=0 and put the contents inside `data/raw/NSVA_Data/NSVA_Data`.
- Run `webscraper.py`.
- Take note that the games have been filtered to 'dal' games only, where 'dal' refers to the Dallas Mavericks.
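The 'dal'-only filter can be sketched as a simple membership check on the NSVA game IDs. The IDs below are illustrative, and the real filtering happens inside `webscraper.py`:

```python
# Sketch: keep only Dallas ('dal') games, home or away.
# Game-ID format follows the NSVA files, e.g. "0021800013-dal-vs-phx".
game_ids = [
    "0021800013-dal-vs-phx",
    "0021800021-bos-vs-nyk",
    "0021800044-lal-vs-dal",
]

# Split on "-" so 'dal' only matches a team code, not a substring elsewhere.
dal_games = [g for g in game_ids if "dal" in g.split("-")]
print(dal_games)
```

Both home and away Dallas games are kept, since the team code may appear on either side of `vs`.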
- Run `pip install -r requirements.txt`.
- Create a `.env` file and place it under `/src`; it must contain `OPENAI_API_KEY=<API_KEY>`.
- Place the video files extracted by the webscraper in the directory `/data/raw/NSVA_Video/`.
- Place the webscraper-generated metadata file under `/data/processed/final_results_{game}.csv`.
- The output commentary will be placed in `/data/text/GPT4o/{game}_commentary_results.csv`.

Note: `{game}` refers to the game_id, which is available in the NSVA file, e.g. `0021800013-dal-vs-phx`.
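Filling in the `{game}` placeholder is straightforward string formatting; here is a minimal sketch that builds the paths above from a game_id (the assumption that the ID encodes a game number plus the two team codes is inferred from the example, not documented):

```python
game_id = "0021800013-dal-vs-phx"  # example game_id from the NSVA file

# Paths used by the pipeline, per the steps above.
metadata_path = f"data/processed/final_results_{game_id}.csv"
commentary_path = f"data/text/GPT4o/{game_id}_commentary_results.csv"

# The ID appears to be an NBA game number followed by the two team codes.
nba_game_no, team_a, _, team_b = game_id.split("-")
print(metadata_path)
print(commentary_path)
```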
- Run `pip install -r requirements.txt`.
- Determine the number of action recognition classes (currently 5, based on the dataset).
- Finetune based on the specified hyperparameters (e.g. learning rate, epochs, etc.).
- The default finetuning unfreezes the last 3 layers and uses the standard cross-entropy loss.
- Go to `notebooks/inference.ipynb`.
- Create your own `generate_captions` model.
- In the last cell of the notebook, run the for loop and make minor changes as needed; the video data used can be found in the gdrive link here.
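The default finetuning setup (freeze everything, unfreeze the last 3 layers, cross-entropy loss) can be sketched in PyTorch. The backbone below is a stand-in; the real model and its hyperparameters come from the finetuning code:

```python
import torch
import torch.nn as nn

NUM_CLASSES = 5  # action recognition classes, per the dataset

# Illustrative backbone; the actual model is the one being finetuned.
model = nn.Sequential(
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Linear(64, 32),
    nn.ReLU(),
    nn.Linear(32, NUM_CLASSES),
)

# Freeze all parameters, then unfreeze the last 3 layers (the default here).
for p in model.parameters():
    p.requires_grad = False
for layer in list(model)[-3:]:
    for p in layer.parameters():
        p.requires_grad = True

criterion = nn.CrossEntropyLoss()  # standard cross-entropy loss
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)

# One illustrative training step on random data.
logits = model(torch.randn(4, 128))
loss = criterion(logits, torch.randint(0, NUM_CLASSES, (4,)))
loss.backward()
optimizer.step()
```

Only the unfrozen layers receive gradient updates, so the earlier layers keep their pretrained weights.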
- Install the required library: `pip install langchain-openai==0.1.7`, or just run `pip install -r requirements.txt`.
- Since the model runs locally, follow the installation instructions for LM Studio / LocalAI.
- Run `python src/text_personification.py` from the root directory (or anywhere; the working directory does not matter).
- The script will output both the personified text and the token usage.
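The personification step can be sketched with a stub standing in for the model call; the real script presumably routes a prompt through `langchain-openai` or the local LM Studio / LocalAI endpoint, and the function name and persona below are hypothetical:

```python
def personify(commentary: str, persona: str) -> dict:
    """Stubbed personification: the real script sends `prompt` to an LLM."""
    prompt = (
        f"Rewrite the following basketball commentary in the voice of "
        f"{persona}:\n{commentary}"
    )
    # Stub response in place of an actual model call.
    text = f"[{persona}] {commentary}"
    # Token usage here is a rough word count; the real script reports the
    # provider's token counts alongside the personified text.
    usage = {
        "prompt_tokens": len(prompt.split()),
        "completion_tokens": len(text.split()),
    }
    return {"text": text, "usage": usage}

result = personify("Doncic drains a three at the buzzer!", "an excited announcer")
print(result["text"])
print(result["usage"])
```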
- Download the models from https://huggingface.co/enlyth/baj-tts/tree/main/models and put them inside the `models` directory.
- Install the requirements: `pip install TTS==0.22.0`, or just run `pip install -r requirements.txt`.
- Run `python src/tts.py` from the root directory.
- Check the generated `.wav` file in the `output` directory.
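To sanity-check a generated file, the stdlib `wave` module is enough. This sketch writes a one-second test tone into `output/` and reads its parameters back; the filename is illustrative, and the real file comes from `src/tts.py`:

```python
import math
import struct
import wave
from pathlib import Path

out_dir = Path("output")
out_dir.mkdir(exist_ok=True)
path = out_dir / "sample.wav"

rate, freq = 22050, 440.0  # one second of a 440 Hz tone
with wave.open(str(path), "wb") as w:
    w.setnchannels(1)   # mono
    w.setsampwidth(2)   # 16-bit samples
    w.setframerate(rate)
    w.writeframes(b"".join(
        struct.pack("<h", int(32767 * 0.3 * math.sin(2 * math.pi * freq * i / rate)))
        for i in range(rate)
    ))

# Read the header back, as you would for the TTS output.
with wave.open(str(path), "rb") as w:
    duration = w.getnframes() / w.getframerate()
    channels = w.getnchannels()
print(f"{path}: {duration:.2f}s, {channels} channel(s)")
```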
This is the main program. Given the captions of the videos, the models will run and return a `.wav` file containing the personified caption as speech.

- Run `python main.py` from the root directory.