This is a video-processing service for tracking hockey players in videos of hockey games. It was developed for the Adaptive Hockey Federation (hockey for people with disabilities).
The main parts of the service:
- receiving the query from the backend
- downloading the video
- converting the video with ffmpeg
- detecting frames without a hockey game (such as advertisements) so that they can be excluded from processing
- tracking players
- recognizing players' numbers
- collecting the data and preparing the JSON response for the backend
Fig. 1 - The service scheme
This repo contains the following branches:
- main - the scripts and instructions needed to run the service
- gradio_app - scripts for a demo application based on Gradio
- notebooks - experimental notebooks and research, including:
  - experiments with tracking
  - experiments with video processing
  - training the number recognition model
Clone the repo
git clone [email protected]:fedor-konovalenko/a_hockey.git -b main
cd a_hockey/app
pip install -r requirements.txt
Download the pretrained weights and the auxiliary script for the DEVA tracker
download weights with these links:
to app/src/weights folder
download file with this link:
to app/ folder
Check that you have compatible versions installed:
- CUDA Drivers
- torch
- torchvision
The service was tested with:
- torch==2.0.1+cu117 / torchvision==0.15.2+cu117
- torch==1.13.1+cu116 / torchvision==0.14.1+cu116
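The version check above can be scripted. The following is a minimal sketch that reads the installed package versions through the standard library and compares them with the two tested combinations listed above. The helper names `installed_version` and `is_tested_pair` are hypothetical, and the sketch does not check the CUDA driver.

```python
from importlib.metadata import PackageNotFoundError, version

# (torch, torchvision) pairs listed above as tested with the service.
TESTED_PAIRS = {
    ("2.0.1+cu117", "0.15.2+cu117"),
    ("1.13.1+cu116", "0.14.1+cu116"),
}


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def is_tested_pair(torch_version, torchvision_version):
    """True if the pair is one of the combinations the service was tested with."""
    return (torch_version, torchvision_version) in TESTED_PAIRS


if __name__ == "__main__":
    torch_v = installed_version("torch")
    vision_v = installed_version("torchvision")
    print(f"torch: {torch_v}, torchvision: {vision_v}")
    print("tested combination:", is_tested_pair(torch_v, vision_v))
```

Other version combinations may work as well; the check only reports whether the installed pair matches one that was tested.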
Install Grounding DINO and Segment Anything Model
This project uses Grounded Segment Anything as the image-processing model. It consists of two components: Grounding DINO for zero-shot detection and the Segment Anything Model (SAM) for converting boxes into segmentation masks.
cd ..
git clone https://github.com/hkchengrex/Grounded-Segment-Anything
export CUDA_HOME=/usr/local/cuda
export BUILD_WITH_CUDA=True
export AM_I_DOCKER=False
cd Grounded-Segment-Anything
pip uninstall -y GroundingDINO
pip install -e GroundingDINO
pip install -q -e segment_anything
Install DEVA, download the pretrained weights and replace the result_utils.py script
cd ..
git clone https://github.com/hkchengrex/Tracking-Anything-with-DEVA
cd Tracking-Anything-with-DEVA
pip install -q -e .
wget -q -P ./saves/ https://github.com/hkchengrex/Tracking-Anything-with-DEVA/releases/download/v1.0/DEVA-propagation.pth
wget -q -P ./saves/ https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth
wget -q -P ./saves/ https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
wget -q -P ./saves/ https://github.com/hkchengrex/Tracking-Anything-with-DEVA/releases/download/v1.0/GroundingDINO_SwinT_OGC.py
cd ..
cd app
mv result_utils.py ../Tracking-Anything-with-DEVA/deva/inference/result_utils.py
Service Structure
After the steps above, the folder structure should look like this:
a_hockey
│   README.md
│   .gitignore
│
├───app
│   │   requirements.txt
│   │   Makefile
│   │   Dockerfile
│   │
│   └───src
│       │   app.py
│       │   clear_game.py
│       │   recognition.py
│       │   tracking.py
│       │   utils.py
│       │
│       ├───weights
│       └───test
│
├───Tracking-Anything-with-DEVA
│   │   ...
│
└───Grounded-Segment-Anything
    │   ...
Run the FastAPI app
cd src
python3 app.py
The FastAPI application will then be available at http://localhost:8000/
Test scripts that simulate the requests are available in the app/src/test folder
Two POST requests are available:
Processing Request
POST request that downloads, cleans and processes the video and prepares a .json file with the tracking results. The tracking result is saved in .json format in the temporary directory /app/src/recognition and is returned as a JSON response.
The request structure:
{"game_id": int,
"game_link": str,
"token": str,
"player_ids": [[int, int, ...], [int, int, ...]],
"player_numbers": [[int, int, ...], [int, int, ...]],
"team_ids": [int, int]}
Example test script:
import requests
import json


def main():
    # Load the request body from a JSON file
    with open("test_query_process.json", "r") as fid:
        data = json.load(fid)
    # Send the processing request to the locally running service
    r = requests.post("http://localhost:8000/process", json=data)
    if r.status_code != 200:
        print(r.status_code)
    print(r.json())


if __name__ == "__main__":
    main()
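Before a processing request is sent, the payload can be checked against the request structure shown above. The sketch below is only an illustration: `validate_process_request` and `is_int_matrix` are hypothetical helpers, not part of the service, and the check that `player_ids` and `player_numbers` have the same length per team is an assumption about how the two fields relate.

```python
def is_int_matrix(value, rows):
    """True if value is a list of `rows` lists that contain only ints."""
    return (
        isinstance(value, list)
        and len(value) == rows
        and all(
            isinstance(row, list) and all(type(x) is int for x in row)
            for row in value
        )
    )


def validate_process_request(payload):
    """Return a list of problems found in the payload (empty if none)."""
    problems = []
    if type(payload.get("game_id")) is not int:
        problems.append("game_id must be an int")
    for key in ("game_link", "token"):
        if not isinstance(payload.get(key), str):
            problems.append(f"{key} must be a str")
    team_ids = payload.get("team_ids")
    if not (
        isinstance(team_ids, list)
        and len(team_ids) == 2
        and all(type(t) is int for t in team_ids)
    ):
        problems.append("team_ids must be a list of two ints")
    for key in ("player_ids", "player_numbers"):
        if not is_int_matrix(payload.get(key), rows=2):
            problems.append(f"{key} must be two lists of ints")
    # Assumption: ids and numbers are parallel lists, one entry per player.
    ids, numbers = payload.get("player_ids"), payload.get("player_numbers")
    if is_int_matrix(ids, 2) and is_int_matrix(numbers, 2):
        for team, (id_row, number_row) in enumerate(zip(ids, numbers)):
            if len(id_row) != len(number_row):
                problems.append(
                    f"team {team}: player_ids and player_numbers differ in length"
                )
    return problems
```

An empty list means the payload has the expected shape; the service itself may apply different or stricter rules.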
After the video has been processed, the following response is returned:
{"game_link": str,
 "token": str,
 "players": [{"player_id": int,
              "team_id": int,
              "number": int,
              "frames": [int, int, ...]},
             ...
            ],
 "player_numbers": [[int, int, ...], [int, int, ...]],
 "team_ids": [int, int]}
Cleaning Request
Strongly recommended after each use of the service. Removes all content from the temporary service directories.
Example test script:
import requests


def main():
    # Ask the locally running service to empty its temporary directories
    r = requests.post("http://localhost:8000/clean")
    if r.status_code != 200:
        print(r.status_code)
    print(r.json())


if __name__ == "__main__":
    main()
Example response:
{"Removed": str,
"Objects": int,
"Size": str}
docker pull fdkonovalenko/adaptive_hockey:latest
docker run --rm -it -p 8010:8000 --gpus all --name hockey fdkonovalenko/adaptive_hockey:latest
The image is based on the nvidia/cuda:11.6.2-cudnn8-devel-ubuntu20.04 image.
It was tested on the following hardware configuration:
- Tesla T4
- NVIDIA-SMI 510.47.03
- Driver Version: 510.47.03
- CUDA Version: 11.6
If you want to rebuild the image, keep in mind that the image must be built on a machine with a GPU, not just run on one. This is a GroundingDINO peculiarity that has been described in an issue.
Class.method | Parameters | Returns | Comments |
---|---|---|---|
Helper | input_dir: str, convert_dir: str | | Class for preparing the video. Required parameters: paths to the directories for the downloaded video and for the converted video |
Helper.download_file | link: str, token: str | str | Downloads the video from Yandex Disk, returns the raw video name |
Helper.convert_file | video_name: str | str | Converts the video with ffmpeg, returns the name of the converted video |
ClearGame | convert_dir: str, clear_dir: str | | Class for finding frames without a hockey game. Required parameters: paths to the directory with the converted video and to the directory for saving the results |
ClearGame.get_advertising_frames | video_name: str | str | Uses an image-to-text model to find frames without a hockey game, writes these frames to a .json file and returns its name |
Tracking | convert_dir: str, clear_dir: str, final_dir: str | | Class for tracking players with DEVA. Required parameters: paths to the directories with the converted video, with the frames without a game, and for the tracking results |
Tracking.get_bbox_track | video_name: str | str | Tracks players, writes the tracked objects and their frames to a .json file and returns the file name |
Numbers | input_dir: str, clear_dir: str, output_dir: str, emb_mode: str | | Class for recognizing numbers. Required parameters: paths to the directories with the converted video, with the frames without a game, and for the recognition results, plus the embedding model mode (ResNet or DinoV2) |
Numbers.predict_after | class_threshold: float, ann_path: str, video_path: str, tms: list, box_min_size: int | list | Recognizes numbers on tracked objects, compares them with the team lists, writes the results to a .json file and returns a list of dictionaries |
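The methods in the table can be chained in the order given in the list of service parts. The sketch below shows one way to do that, under two assumptions: the name returned by `Helper.convert_file` is what the later methods expect as `video_name`, and the steps run strictly one after another. `run_pipeline` is a hypothetical helper. `Numbers.predict_after` is left out because the meaning of its arguments is not described here. The stand-in classes in the demo only mimic the documented method names so that the sketch runs without the service installed.

```python
def run_pipeline(helper, clear_game, tracking, link, token):
    """Chain the documented methods; every object is passed in already constructed."""
    raw_name = helper.download_file(link, token)  # raw video name
    video_name = helper.convert_file(raw_name)  # converted video name
    ads_json = clear_game.get_advertising_frames(video_name)  # frames without a game
    tracks_json = tracking.get_bbox_track(video_name)  # tracked objects
    return {"video": video_name, "advertising": ads_json, "tracks": tracks_json}


if __name__ == "__main__":
    # Hypothetical stand-ins with the same method names as in the table.
    class FakeHelper:
        def download_file(self, link, token):
            return "raw.mp4"

        def convert_file(self, video_name):
            return "converted_" + video_name

    class FakeClearGame:
        def get_advertising_frames(self, video_name):
            return video_name + ".ads.json"

    class FakeTracking:
        def get_bbox_track(self, video_name):
            return video_name + ".tracks.json"

    print(run_pipeline(FakeHelper(), FakeClearGame(), FakeTracking(), "link", "token"))
```

With the real service, the three objects would be created with the directory paths listed in the Parameters column of the table.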