Merge pull request #41 from RosePY/master

Add Quechua-SER corpus
SuperKogito · Jun 3, 2024 · 4a1fc5e
2 parents 606b510 + 0c6a5b4

Showing 3 changed files with 17 additions and 1 deletion.
diff --git a/README.md b/README.md
@@ -1,8 +1,9 @@
-***Spoken Emotion Recognition Datasets:*** *A collection of datasets (count=43) for the purpose of emotion recognition/detection in speech.
+***Spoken Emotion Recognition Datasets:*** *A collection of datasets (count=44) for the purpose of emotion recognition/detection in speech.
 The table is chronologically ordered and includes a description of the content of each dataset along with the emotions included.
 The table can be browsed, sorted and searched under https://superkogito.github.io/SER-datasets/*
 | Dataset                                                                                                                                           | Year            | Content                                                                                                                                                                   | Emotions                                                                                                                                                                                                                                                                     | Format                        | Size                    | Language                                                          | Paper                                                                                                                                                                                                                                                                                                                                                     | Access                    | License                                                                                                                                   |
 |:--------------------------------------------------------------------------------------------------------------------------------------------------|:----------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------|:------------------------|:------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:--------------------------|:------------------------------------------------------------------------------------------------------------------------------------------|
+| <sub>[Quechua-SER](https://figshare.com/articles/media/Quechua_Collao_for_Speech_Emotion_Recognition/20292516)</sub>                              | <sub>2022</sub> | <sub>12420 audio recordings (~15 hours) and their transcriptions by 7 native speakers.</sub>                                                                              | <sub>Emotional labels using dimensions: valence, arousal, and dominance.</sub>                                                                                                                                                                                               | <sub>Audio</sub>              | <sub>3.53 GB</sub>      | <sub>Quechua Collao</sub>                                         | <sub>[A speech corpus of Quechua Collao for automatic dimensional emotion recognition](https://www.nature.com/articles/s41597-022-01855-9)</sub>                                                                                                                                                                                                          | <sub>Open</sub>           | <sub>[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)</sub>                                                                      |
 | <sub>[MESD](https://data.mendeley.com/datasets/cy34mh68j9/5)</sub>                                                                                | <sub>2022</sub> | <sub>864 audio files of single-word emotional utterances with Mexican cultural shaping.</sub>                                                                             | <sub>6 emotions provides single-word utterances for anger, disgust, fear, happiness, neutral, and sadness.</sub>                                                                                                                                                             | <sub>Audio</sub>              | <sub>0,097 GB</sub>     | <sub>Spanish (Mexican)</sub>                                      | <sub>[The Mexican Emotional Speech Database (MESD): elaboration and assessment based on machine learning](https://pubmed.ncbi.nlm.nih.gov/34891601/)</sub>                                                                                                                                                                                                | <sub>Open</sub>           | <sub>[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)</sub>                                                                      |
 | <sub>[SyntAct](https://zenodo.org/record/6573016#.ZAjy_9LMJpj)</sub>                                                                              | <sub>2022</sub> | <sub>Synthesized database of three basic emotions and neutral expression based on rule-based manipulation for a diphone synthesizer which we release to the public </sub> | <sub>997 utterances including 6 emotions: angry, bored, happy, neutral, sad and scared</sub>                                                                                                                                                                                 | <sub>Audio</sub>              | <sub>941 MB</sub>       | <sub>German</sub>                                                 | <sub>[SyntAct: A Synthesized Database of Basic Emotions](http://felix.syntheticspeech.de/publications/synthetic_database.pdf)</sub>                                                                                                                                                                                                                       | <sub>Open</sub>           | <sub>[CC BY-SA 4.0](https://creativecommons.org/licenses/by/4.0)</sub>                                                                    |
 | <sub>[MLEnd](https://www.kaggle.com/datasets/jesusrequena/mlend-spoken-numerals)</sub>                                                            | <sub>2021</sub> | <sub>~32700 audio recordings files produced by 154 speakers. Each audio recording corresponds to one English numeral (from "zero" to "billion")</sub>                     | <sub>Intonations: neutral, bored, excited and question</sub>                                                                                                                                                                                                                 | <sub>Audio</sub>              | <sub>2.27 GB</sub>      | <sub>--</sub>                                                     | <sub>--</sub>                                                                                                                                                                                                                                                                                                                                             | <sub>Open</sub>           | <sub>Unknown</sub>                                                                                                                        |

diff --git a/src/ser-datasets.csv b/src/ser-datasets.csv
@@ -1,4 +1,5 @@
 Dataset,Year,Content,Emotions,Format,Size,Language,Paper,Access,License
+`Quechua-SER <https://figshare.com/articles/media/Quechua_Collao_for_Speech_Emotion_Recognition/20292516>`_,2022,12420 audio recordings (~15 hours) and their transcriptions by 7 native speakers.,"Emotional labels using dimensions: valence, arousal, and dominance.",Audio,3.53 GB,Quechua Collao,`A speech corpus of Quechua Collao for automatic dimensional emotion recognition <https://www.nature.com/articles/s41597-022-01855-9>`_,Open,`CC BY 4.0 <https://creativecommons.org/licenses/by/4.0/>`_
 `MESD <https://data.mendeley.com/datasets/cy34mh68j9/5>`_,2022,864 audio files of single-word emotional utterances with Mexican cultural shaping.,"6 emotions provides single-word utterances for anger, disgust, fear, happiness, neutral, and sadness.",Audio,"0,097 GB",Spanish (Mexican),`The Mexican Emotional Speech Database (MESD): elaboration and assessment based on machine learning <https://pubmed.ncbi.nlm.nih.gov/34891601/>`_,Open,`CC BY 4.0 <https://creativecommons.org/licenses/by/4.0/>`_
 `SyntAct <https://zenodo.org/record/6573016#.ZAjy_9LMJpj>`_,2022,Synthesized database of three basic emotions and neutral expression based on rule-based manipulation for a diphone synthesizer which we release to the public ,"997 utterances including 6 emotions: angry, bored, happy, neutral, sad and scared",Audio,941 MB,German,`SyntAct: A Synthesized Database of Basic Emotions <http://felix.syntheticspeech.de/publications/synthetic_database.pdf>`_,Open,`CC BY-SA 4.0 <https://creativecommons.org/licenses/by/4.0>`_
 `MLEnd <https://www.kaggle.com/datasets/jesusrequena/mlend-spoken-numerals>`_,2021,"~32700 audio recordings files produced by 154 speakers. Each audio recording corresponds to one English numeral (from ""zero"" to ""billion"")","Intonations: neutral, bored, excited and question",Audio,2.27 GB,--,--,Open,Unknown
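
The CSV row added in this hunk stores links inline as reStructuredText hyperlinks (`` `text <url>`_ ``), while the entry this commit adds to `src/ser-datasets.json` keeps each name and its link in separate fields (`Dataset-link`, `Paper-link`, `License-link`). The following is a minimal sketch of how one form could be derived from the other. It is not taken from the repository: the helper names `rst_link` and `entry_to_csv_row` are hypothetical, and the plain-text fallback for entries without a link is an assumption.

```python
import csv
import io

# Column order taken from the header line of src/ser-datasets.csv.
COLUMNS = ["Dataset", "Year", "Content", "Emotions", "Format",
           "Size", "Language", "Paper", "Access", "License"]


def rst_link(text, url):
    """Return an RST hyperlink, or the bare text when no URL is given (assumed fallback)."""
    return f"`{text} <{url}>`_" if url else text


def entry_to_csv_row(name, entry):
    """Render one JSON-style entry as a single CSV line (hypothetical helper)."""
    row = dict(entry)
    row["Dataset"] = rst_link(name, entry.get("Dataset-link"))
    row["Paper"] = rst_link(entry["Paper"], entry.get("Paper-link"))
    row["License"] = rst_link(entry["License"], entry.get("License-link"))
    buf = io.StringIO()
    # The default QUOTE_MINIMAL dialect quotes only fields that contain commas or quotes.
    csv.writer(buf, lineterminator="\n").writerow([row[c] for c in COLUMNS])
    return buf.getvalue().rstrip("\n")


# Field values copied from the Quechua-SER entry in this commit.
quechua = {
    "Year": 2022,
    "Content": "12420 audio recordings (~15 hours) and their transcriptions by 7 native speakers.",
    "Emotions": "Emotional labels using dimensions: valence, arousal, and dominance.",
    "Format": "Audio",
    "Size": "3.53 GB",
    "Language": "Quechua Collao",
    "Paper": "A speech corpus of Quechua Collao for automatic dimensional emotion recognition",
    "Access": "Open",
    "License": "CC BY 4.0",
    "Dataset-link": "https://figshare.com/articles/media/Quechua_Collao_for_Speech_Emotion_Recognition/20292516",
    "Paper-link": "https://www.nature.com/articles/s41597-022-01855-9",
    "License-link": "https://creativecommons.org/licenses/by/4.0/",
}

row = entry_to_csv_row("Quechua-SER", quechua)
print(row)
```

Only the `Emotions` field contains commas, so it is the only field the writer wraps in double quotes, which matches the quoting seen in the added CSV line.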

diff --git a/src/ser-datasets.json b/src/ser-datasets.json
@@ -1,4 +1,18 @@
 {
+    "Quechua-SER": {
+        "Year": 2022,
+        "Content": "12420 audio recordings (~15 hours) and their transcriptions by 7 native speakers.",
+        "Emotions": "Emotional labels using dimensions: valence, arousal, and dominance.",
+        "Format": "Audio",
+        "Size": "3.53 GB",
+        "Language": "Quechua Collao",
+        "Paper": "A speech corpus of Quechua Collao for automatic dimensional emotion recognition",
+        "Access": "Open",
+        "License": "CC BY 4.0",
+        "Dataset-link": "https://figshare.com/articles/media/Quechua_Collao_for_Speech_Emotion_Recognition/20292516",
+        "Paper-link": "https://www.nature.com/articles/s41597-022-01855-9",
+        "License-link": "https://creativecommons.org/licenses/by/4.0/"
+    },
     "MESD": {
         "Year": 2022,
         "Content": "864 audio files of single-word emotional utterances with Mexican cultural shaping.",