Audio-WestlakeU/ATST-RCT

ATST-RCT: For DCASE 2022 task4 challenge.

This is the official implementation of ATST-RCT.

Introduction

ATST is a self-supervised pretraining model designed for clip-level audio tasks. Please refer to the official ATST page for more information.

RCT is a semi-supervised learning scheme designed for sound event detection. Please refer to the official RCT page for more information.

Training

The training and validation data come from the DCASE2022 Task 4 DESED dataset. Downloading DESED is tedious, and not all of the data is publicly accessible. You can ask the DCASE committee for help obtaining the full dataset. Note that your test results may differ if your validation set is incomplete.
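Since an incomplete validation set can change the reported results, it may help to check how many clips are missing before training. The following is a minimal sketch, not part of this repo: the function name `find_missing_clips` is hypothetical, and it assumes the metadata is a tab-separated file with a `filename` column and that all audio clips sit directly in one folder.

```python
import csv
from pathlib import Path


def find_missing_clips(metadata_tsv, audio_dir):
    """Return the sorted filenames listed in the TSV but absent from audio_dir.

    Assumption: the TSV has a header row containing a `filename` column,
    and a clip may appear on several rows (one row per annotated event).
    """
    with open(metadata_tsv, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        listed = {row["filename"] for row in reader}
    # Only regular files directly inside audio_dir are considered present
    present = {path.name for path in Path(audio_dir).iterdir() if path.is_file()}
    return sorted(listed - present)
```

Calling `find_missing_clips` with your own validation metadata file and audio folder returns an empty list when the download is complete.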

To train the model, please first get the baseline architecture of DCASE2022 task 4 by:

git clone git@github.com:DCASE-REPO/DESED_task.git

Don't forget to set up your environment according to their requirements and install any required packages. Also remember to change the dataset paths to your own.

Then overwrite the files in the cloned official DESED_task repo with the ATST-RCT code from this repo.
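The overwrite step can be sketched in Python as follows. This is only an illustration: the helper name `overlay_tree` is hypothetical, and which folder of the baseline clone should receive the files is an assumption you need to check against both repositories. It needs Python 3.8 or newer for `dirs_exist_ok`.

```python
import shutil
from pathlib import Path


def overlay_tree(source_dir, target_dir):
    """Copy all files under source_dir into target_dir, overwriting same-named files.

    Files that exist only in target_dir are left untouched.
    Returns the sorted relative paths (POSIX style) of the copied files.
    """
    source = Path(source_dir)
    target = Path(target_dir)
    if not source.is_dir():
        raise NotADirectoryError(str(source))
    # Skip version-control metadata so the baseline clone keeps its own history
    shutil.copytree(
        source,
        target,
        dirs_exist_ok=True,
        ignore=shutil.ignore_patterns(".git"),
    )
    copied = []
    for path in source.rglob("*"):
        relative = path.relative_to(source)
        if path.is_file() and ".git" not in relative.parts:
            copied.append(relative.as_posix())
    return sorted(copied)
```

Pass the local checkout of this repo as `source_dir` and the matching folder of the DESED_task clone as `target_dir`. The returned list is a quick way to review which baseline files were replaced.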

You can download the pretrained ATST model from the following link (password: 2022):

https://pan.baidu.com/s/1Nh6Na1azs6lNKPBstBiStw 

Also change the path to the pretrained model in the configuration file. Then, to train your own model, run:

python train_fusion_rct.py
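Because training depends on several hand-edited paths (the dataset folders and the pretrained ATST checkpoint), a quick check before launching can save a failed run. The sketch below is an assumption-laden illustration: the helper `missing_paths` and the key names in the example are hypothetical, and it works on a configuration that has already been loaded into nested dictionaries.

```python
from pathlib import Path


def missing_paths(config, keys_to_check):
    """Return {dotted_key: value} for config entries whose path does not exist.

    `config` is a nested dict; each entry of `keys_to_check` is a dotted key
    such as "data.audio_folder" (hypothetical name) pointing at a path string.
    """
    missing = {}
    for dotted_key in keys_to_check:
        value = config
        # Walk down the nested dict one key segment at a time
        for part in dotted_key.split("."):
            value = value[part]
        if not Path(value).exists():
            missing[dotted_key] = value
    return missing
```

For example, `missing_paths(cfg, ["data.audio_folder", "model.pretrained_ckpt"])` would return an empty dict once both (hypothetical) entries point at real locations. Replace the keys with the ones actually used in your configuration file.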

Results

The challenge results have not been published yet; please refer to the official challenge page.

Reference

[1] DESED Dataset: https://github.com/turpaultn/DESED

[2] DCASE2022 Task4 baseline: https://github.com/DCASE-REPO/DESED_task

[3] FilterAug: https://github.com/frednam93/FilterAugSED
