We take advantage of the synchronization among multi-view videos and conduct robust Multi-View Practice (MVP) for driving action localization. To avoid overfitting, we fine-tune SlowFast, pre-trained on Kinetics-700, as the feature extractor. The features of each view are then passed to ActionFormer to generate candidate action proposals. To localize all actions precisely, we design elaborate post-processing, including model voting, threshold filtering, and duplicate removal.
More details can be found in our workshop paper: MVP: Robust Multi-View Practice for Driving Action Localization. arXiv
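As one illustration of the model-voting idea, proposals from different models or views can be matched by temporal IoU and kept only when enough models agree. The IoU threshold, the vote count, and the boundary averaging below are assumptions for illustration, not the exact recipe from the paper:

```python
# Illustrative sketch of model voting over action proposals from
# multiple models/views. The matching rule and thresholds are
# assumptions; the concrete scheme is described in the MVP paper.
from typing import List, Tuple

Segment = Tuple[float, float, float]  # (t_start, t_end, score)


def temporal_iou(a: Segment, b: Segment) -> float:
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def vote(proposals_per_model: List[List[Segment]],
         iou_thr: float = 0.5, min_votes: int = 2) -> List[Segment]:
    """Keep segments supported by >= min_votes distinct models,
    averaging the boundaries of matched segments."""
    clusters: List[List[Tuple[int, Segment]]] = []  # (model_idx, segment)
    for m, proposals in enumerate(proposals_per_model):
        for seg in proposals:
            for cluster in clusters:
                if temporal_iou(seg, cluster[0][1]) >= iou_thr:
                    cluster.append((m, seg))
                    break
            else:
                clusters.append([(m, seg)])
    merged: List[Segment] = []
    for cluster in clusters:
        if len({m for m, _ in cluster}) >= min_votes:  # distinct-model votes
            segs = [s for _, s in cluster]
            n = len(segs)
            merged.append((sum(s[0] for s in segs) / n,
                           sum(s[1] for s in segs) / n,
                           max(s[2] for s in segs)))
    return merged
```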
There are three steps, and the details are explained in the README.md under the corresponding folder:
- pre-processing/gen_frame: In this folder, we decode the original videos into frames (a minimal sketch follows this list).
- pre-processing/gen_json_and_fea: In this folder, we convert the temporal-location labels of action segments to JSON format and collect meta info for each video, such as duration and duration_frame.
- pre-processing/generate_cls_csv: In this folder, we convert the classification labels of action segments to CSV format.
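The sketch below shows one way the frame-decoding and meta-collection stages could look, using OpenCV. The output layout, file names, and any meta fields other than duration and duration_frame are assumptions for illustration; the authoritative scripts live under pre-processing/.

```python
# Minimal sketch of frame decoding and meta-info collection.
# Paths and field names other than "duration" / "duration_frame"
# are assumptions; see pre-processing/README.md for the real scripts.
import json
import os

import cv2  # pip install opencv-python


def video_to_frames(video_path: str, out_dir: str) -> dict:
    """Decode a video into JPEG frames and return its meta info."""
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    n_frames = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        cv2.imwrite(os.path.join(out_dir, f"img_{n_frames:06d}.jpg"), frame)
        n_frames += 1
    cap.release()
    return {
        "duration": n_frames / fps if fps else 0.0,  # seconds
        "duration_frame": n_frames,
        "fps": fps,  # assumed extra field
    }


if __name__ == "__main__":
    # Hypothetical input video and output directory.
    meta = {"video_1": video_to_frames("video_1.mp4", "frames/video_1")}
    with open("meta.json", "w") as f:
        json.dump(meta, f, indent=2)
```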
There are five steps:
- classification/README.md (Train model): Train the basic classification model for action segments.
- classification/README.md (Inference model to extract features): Use the trained classification model to extract snippet features.
- proposal_extract/README.md (Train and inference model): Use the snippet features to train the temporal localization model, and run inference on the test set to generate proposals.
- proposal_extract/README.md (Convert format): Convert the proposals from pkl to csv format (a minimal sketch follows this list).
- classification/README.md (Inference model to predict proposals): Classify the generated proposals.
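The format conversion in step four could look like the sketch below. The proposal fields (video_id, t_start, t_end, score) are assumptions made for illustration; the real schema is documented in proposal_extract/README.md.

```python
# Minimal sketch of the pkl -> csv proposal conversion.
# Field names are assumptions; check proposal_extract/README.md
# for the actual proposal format.
import csv
import pickle


def pkl_to_csv(pkl_path: str, csv_path: str) -> None:
    with open(pkl_path, "rb") as f:
        proposals = pickle.load(f)  # assumed: a list of dicts, one per proposal
    with open(csv_path, "w", newline="") as f:
        writer = csv.DictWriter(
            f, fieldnames=["video_id", "t_start", "t_end", "score"])
        writer.writeheader()
        for p in proposals:
            writer.writerow({k: p[k] for k in writer.fieldnames})


if __name__ == "__main__":
    pkl_to_csv("proposals.pkl", "proposals.csv")  # hypothetical paths
```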
Conduct post-processing following post-processing/README.md.
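As an illustration of the threshold-filtering and duplicate-removal steps, the sketch below drops low-confidence proposals and then applies greedy 1-D temporal NMS. The threshold values are assumptions; the full procedure, including model voting, is defined in post-processing/README.md and the paper.

```python
# Minimal sketch of threshold filtering + duplicate removal via
# temporal NMS. Threshold values are assumptions for illustration.
from typing import List, Tuple

Segment = Tuple[float, float, float]  # (t_start, t_end, score)


def temporal_iou(a: Segment, b: Segment) -> float:
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def post_process(segments: List[Segment],
                 score_thr: float = 0.3,  # assumed score threshold
                 iou_thr: float = 0.5) -> List[Segment]:
    # 1) Threshold filtering: drop low-confidence proposals.
    kept = [s for s in segments if s[2] >= score_thr]
    # 2) Duplicate removal: greedy temporal NMS, highest score first.
    kept.sort(key=lambda s: s[2], reverse=True)
    result: List[Segment] = []
    for s in kept:
        if all(temporal_iou(s, r) < iou_thr for r in result):
            result.append(s)
    return result
```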
This project is released under the MIT license. Please see the LICENSE file for more information.
We are grateful to the organizers for providing us the opportunity to explore models on real multi-view driving videos. We believe this work will promote the development of AI City and autonomous driving.
In addition, our code is built upon UniFormer, SlowFast, and ActionFormer.