Official code for A Segment Level Approach to Speech Emotion Recognition using Transfer Learning, ACPR 2019

sourav22899/segment-ser

segment-ser

This is the official code repo for A Segment Level Approach to Speech Emotion Recognition using Transfer Learning, ACPR 2019. In this paper, we propose a speech emotion recognition system that predicts emotions for multiple segments of a single audio clip, unlike conventional emotion recognition models, which predict the emotion of an entire audio clip directly. The proposed system consists of a pre-trained deep convolutional neural network (CNN), the Google VGGish model, followed by a single-layer neural network that predicts the emotion class of each audio segment. When trained and evaluated on the IEMOCAP audio-only dataset, the proposed model attains an accuracy of 68.7% in classifying the data into one of four emotional classes (angry, happy, sad and neutral), surpassing the current state-of-the-art models.
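The segment-level idea described above can be illustrated with a minimal NumPy sketch. It is not the code from this repo: `fake_embed` is a hypothetical stand-in for the pre-trained VGGish network (which maps roughly 0.96 s of 16 kHz audio to a 128-dimensional embedding), the weights `W` and `b` are random rather than trained, and averaging the segment probabilities to obtain a clip-level label is an assumption made only for illustration.

```python
import numpy as np

EMOTIONS = ["angry", "happy", "sad", "neutral"]
SAMPLE_RATE = 16000        # VGGish expects 16 kHz mono audio
SEGMENT_SECONDS = 0.96     # length of one VGGish input window
EMBED_DIM = 128            # size of a VGGish embedding


def split_into_segments(waveform, sample_rate=SAMPLE_RATE, seconds=SEGMENT_SECONDS):
    """Cut a 1-D waveform into non-overlapping fixed-length segments."""
    seg_len = int(round(sample_rate * seconds))
    n_segments = len(waveform) // seg_len
    # Drop the trailing samples that do not fill a whole segment.
    return waveform[: n_segments * seg_len].reshape(n_segments, seg_len)


def fake_embed(segments, rng):
    """Hypothetical placeholder for VGGish: a random linear projection to 128-d."""
    proj = rng.standard_normal((segments.shape[1], EMBED_DIM))
    return segments @ proj / np.sqrt(segments.shape[1])


def softmax(z):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def predict_clip(embeddings, W, b):
    """Single-layer classifier per segment, then an assumed mean aggregation."""
    seg_probs = softmax(embeddings @ W + b)   # one distribution per segment
    clip_probs = seg_probs.mean(axis=0)       # assumption: average over segments
    return seg_probs, EMOTIONS[int(clip_probs.argmax())]


rng = np.random.default_rng(0)
waveform = rng.standard_normal(SAMPLE_RATE * 3)   # 3 s of synthetic audio
segments = split_into_segments(waveform)
embeddings = fake_embed(segments, rng)

# Untrained, randomly initialised weights; a real model would learn these.
W = rng.standard_normal((EMBED_DIM, len(EMOTIONS))) * 0.01
b = np.zeros(len(EMOTIONS))

seg_probs, label = predict_clip(embeddings, W, b)
print(segments.shape, seg_probs.shape, label)
```

With real data, `fake_embed` would be replaced by the actual VGGish embeddings, and `W` and `b` would be learned from the segment-level emotion labels.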

Requirements

These are all easily installable via pip, e.g., pip install numpy

To run the code, the following two data files need to be downloaded:

Citation

If you find this repo useful in your research, please consider citing the following paper:

@inproceedings{sahoo2019segment,
  title={A Segment Level Approach to Speech Emotion Recognition Using Transfer Learning},
  author={Sahoo, Sourav and Kumar, Puneet and Raman, Balasubramanian and Roy, Partha Pratim},
  booktitle={Asian Conference on Pattern Recognition},
  pages={435--448},
  year={2019},
  organization={Springer}
}
