Decoupled Multimodal Distilling for Speech Emotion Recognition

This is one of the models we proposed for Advanced Projects at the Quality and Usability Lab at TU Berlin. We refer to this paper and follow the approach from GitHub/DMD.

The data files, consisting of the MOSI and MOSEI datasets, can be found here.

The datasets should be placed in the folder ./dataset.

By default, the trained model will be saved in the folder ./pt. Our trained model can be downloaded from the pt folder.

If the validation loss stops improving during training, the last model (its layers and weights) will additionally be saved to ./dmd.pth, as sketched below. Before testing the model, set the path of the trained model in run.py (line 174). The results are saved in the folder ./result.
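
The checkpointing and loading flow described above can be summarized with the minimal PyTorch sketch below. This is an illustration only: the model, validation step, and the file name best_model.pth are placeholders and are not taken from run.py.

```python
import os
import torch
import torch.nn as nn

model = nn.Linear(10, 1)                      # stand-in for the actual DMD model
os.makedirs("./pt", exist_ok=True)            # default checkpoint folder

def validate(m: nn.Module) -> float:
    # Placeholder validation step returning a dummy loss value.
    with torch.no_grad():
        x = torch.randn(4, 10)
        return nn.functional.mse_loss(m(x), torch.zeros(4, 1)).item()

best_val_loss = float("inf")
for epoch in range(3):                        # illustrative number of epochs
    val_loss = validate(model)
    if val_loss < best_val_loss:              # improvement: keep best weights in ./pt
        best_val_loss = val_loss
        torch.save(model.state_dict(), "./pt/best_model.pth")

# In the original setup the last model is saved to ./dmd.pth once the
# validation loss stops improving; here it is simply saved after the loop.
torch.save(model.state_dict(), "./dmd.pth")

# Before testing, load the checkpoint whose path is set in run.py (line 174).
model.load_state_dict(torch.load("./dmd.pth"))
model.eval()
```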

We run the code on Google Colab, using the notebook Unimse_Submission.ipynb.
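
For reference, a Colab cell driving such a pipeline might look like the sketch below; the repository URL, requirements file, and entry point are placeholders, not the actual values used in Unimse_Submission.ipynb.

```python
# Hypothetical Colab cell; replace <user>/<repo> with the actual repository.
import subprocess

subprocess.run(["git", "clone", "https://github.com/<user>/<repo>.git"], check=True)
subprocess.run(["pip", "install", "-r", "<repo>/requirements.txt"], check=True)

# Train and evaluate; checkpoints go to ./pt and metrics to ./result as above.
subprocess.run(["python", "run.py"], cwd="<repo>", check=True)
```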

Current results:

  • Acc_2: 0.8430, F1_score: 0.8437, Acc_7: 0.4548, MAE: 0.7334, Loss: 0.7334

Goals:

  • Prepare the IEMOCAP or EMODB datasets and train the model on them
  • Fine-tune the model