This is a part of the Mock-Buddy project, used to detect speech confidence. A CNN architecture is used to build the model that classifies emotions. The ensemble model is built with TensorFlow using the bootstrap aggregation (bagging) approach.
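As a rough sketch of how such a bagged ensemble of CNNs can be wired up with TensorFlow/Keras (the layer sizes, `n_models`, and input shape here are illustrative assumptions, not the exact configuration used in `train.ipynb`):

```python
import numpy as np
import tensorflow as tf

def build_cnn(input_shape, n_classes):
    """Small 1D-CNN emotion classifier (illustrative architecture)."""
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=input_shape),
        tf.keras.layers.Conv1D(64, 5, activation="relu"),
        tf.keras.layers.MaxPooling1D(2),
        tf.keras.layers.Conv1D(128, 5, activation="relu"),
        tf.keras.layers.GlobalAveragePooling1D(),
        tf.keras.layers.Dense(n_classes, activation="softmax"),
    ])

def train_bagged_ensemble(x_train, y_train, n_models=5, epochs=30):
    """Bootstrap aggregation: each member trains on a resampled copy of the data."""
    models = []
    n = len(x_train)
    for _ in range(n_models):
        idx = np.random.choice(n, size=n, replace=True)  # bootstrap sample with replacement
        model = build_cnn(x_train.shape[1:], y_train.shape[1])
        model.compile(optimizer="adam",
                      loss="categorical_crossentropy",
                      metrics=["accuracy"])
        model.fit(x_train[idx], y_train[idx], epochs=epochs, verbose=0)
        models.append(model)
    return models

def ensemble_predict(models, x):
    """Aggregate by averaging the softmax outputs of all ensemble members."""
    probs = np.mean([m.predict(x, verbose=0) for m in models], axis=0)
    return np.argmax(probs, axis=1)
```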
- Python 3.7 or newer
- You can download the datasets (RAVDESS and TESS) from the links listed below.
- Run `data.ipynb` to preprocess the data.
- Run `train.ipynb` to start training.
- `ser_pred.ipynb` has a template for using the model inside other applications, as sketched below.
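A minimal sketch of what such a template might look like; the model path, MFCC settings, and emotion label order are assumptions and should be matched to your own training setup:

```python
import librosa
import numpy as np
import tensorflow as tf

# Assumed label order and model path -- adjust to match your training run.
EMOTIONS = ["angry", "disgust", "fear", "happy", "neutral", "sad", "surprise"]
model = tf.keras.models.load_model("saved_model/ser_cnn")

def predict_emotion(wav_path, sr=22050, n_mfcc=40):
    """Extract MFCC features from an audio file and classify the emotion."""
    audio, _ = librosa.load(wav_path, sr=sr)
    mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=n_mfcc)
    features = np.mean(mfcc.T, axis=0)                 # average over time frames
    features = features[np.newaxis, ..., np.newaxis]   # add batch and channel dims
    probs = model.predict(features, verbose=0)[0]
    return EMOTIONS[int(np.argmax(probs))]

print(predict_emotion("sample.wav"))
```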
👤 Karthick T. Sharma
- GitHub: @Karthick47v2
- LinkedIn: @Karthick47
The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) can be downloaded free of charge at https://zenodo.org/record/1188976.
@data{SP2/E8H2MF_2020,
author = {Pichora-Fuller, M. Kathleen and Dupuis, Kate},
publisher = {Scholars Portal Dataverse},
title = "{Toronto emotional speech set (TESS)}",
year = {2020},
version = {DRAFT VERSION},
doi = {10.5683/SP2/E8H2MF},
url = {https://doi.org/10.5683/SP2/E8H2MF}
}
@inproceedings{Vlasenko_combiningframe,
author = {Vlasenko, Bogdan and Schuller, Bjorn and Wendemuth, Andreas and Rigoll, Gerhard},
year = {2007},
month = {01},
pages = {2249-2252},
title = {Combining frame and turn-level information for robust recognition of emotions within speech},
booktitle = {Proceedings of Interspeech}
}
Contributions, issues, and feature requests are welcome!
Feel free to check the issues page.
Give a ⭐️ if this project helped you!