EmoNet-Face: An Expert-Annotated Benchmark for Synthetic Emotion Recognition
Effective human-AI interaction relies on AI's ability to accurately perceive and interpret human emotions. However, current benchmarks for vision and vision-language models are severely limited: they offer a narrow emotional spectrum, overlook nuanced states (e.g., bitterness, intoxication), and fail to distinguish subtle differences between related feelings (e.g., shame vs. embarrassment). Existing datasets often use uncontrolled imagery with occluded faces and lack demographic diversity, risking significant bias.
EmoNet-Face addresses these critical gaps with a comprehensive benchmark suite featuring:
- A novel 40-category emotion taxonomy, meticulously derived from foundational research to capture finer details of human emotional experiences.
- Three large-scale, AI-generated datasets with explicit, full-face expressions and controlled demographic balance across ethnicity, age, and gender.
- Rigorous, multi-expert annotations for both training and high-fidelity evaluation.
- The Empathic Insight Face model, achieving human-expert-level performance on our benchmark.
The publicly released EmoNet-Face suite—taxonomy, datasets, and model—provides a robust foundation for developing and evaluating AI systems with a deeper understanding of human emotions.
The repository is organized as follows (a loading sketch follows the list):

- data/
  - binary.csv: Dataset file (without images) for binary emotion analysis.
  - hq.csv: High-quality dataset file (without images) for analysis.
  - guide.json: Guide and metadata for the datasets.
- inference/
  - vlm-inference-prompt-multi-shot.ipynb: Notebook for multi-shot VLM inference.
  - vlm-inference-prompt-zero-shot.ipynb: Notebook for zero-shot VLM inference.
- statistics/
  - analysis-binary.ipynb: Analysis notebook for the binary dataset.
  - analysis-hq.ipynb: Analysis notebook for the high-quality dataset.
  - descriptive.ipynb: Descriptive statistics notebook.
  - worldmap.ipynb: Notebook for world map visualizations.
- README.md: This file.
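As a minimal sketch of loading the annotation files with pandas (paths are relative to the repository root; the column schema is documented in guide.json, so inspect it rather than assuming field names):

```python
import pandas as pd

# The CSVs contain labels and metadata only; the images themselves are
# distributed via the Hugging Face datasets listed below.
binary = pd.read_csv("data/binary.csv")
hq = pd.read_csv("data/hq.csv")

# Check the actual schema before relying on any particular column name;
# guide.json documents the fields.
print(binary.columns.tolist())
print(hq.columns.tolist())
print(f"binary: {len(binary)} rows, hq: {len(hq)} rows")
```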
The datasets for this project are hosted on Hugging Face and are designed to address the limitations of prior work by providing explicit, full-face expressions and balanced demographic representation:
- EmoNet-Face HQ: 2,500 expert-annotated images covering 40 emotion categories (test set), with rigorous multi-expert annotation and demographic control.
- EmoNet-Face Binary: 19,999 images with binary expert annotations (for fine-tuning), also demographically balanced.
- EmoNet-Face Big: Over 400,000 images with weak emotion labels (training set), generated with explicit, full-face expressions and demographic diversity.
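A minimal sketch of pulling one of these datasets with the Hugging Face datasets library; the repository ID and split name below are placeholders, so substitute the actual paths from the dataset links:

```python
from datasets import load_dataset

# Placeholder repository ID and split -- replace with the actual Hugging Face
# path and split of the EmoNet-Face dataset you need (HQ, Binary, or Big).
hq = load_dataset("ORG/EmoNet-Face-HQ", split="train")

example = hq[0]
print(example.keys())  # inspect the real fields before assuming a schema
```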
We provide two inference models for emotion recognition, trained in-house and collectively referred to as Empathic Insight Face. Both achieve human-expert-level performance on the EmoNet-Face benchmark:
- Empathic Insight Face Small (Colab Notebook)
- Empathic Insight Face Large (Colab Notebook)
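For orientation, here is a sketch of the two-stage pipeline such models commonly use: embed a face image with a CLIP/SigLIP-style vision backbone, then score the embedding with per-emotion heads over the 40-category taxonomy. The checkpoint name and the untrained scoring head below are illustrative assumptions; the authoritative loading and inference code is in the Colab notebooks above.

```python
import torch
from PIL import Image
from transformers import AutoModel, AutoProcessor

# Illustrative backbone choice -- the encoder actually used by
# Empathic Insight Face is specified in the Colab notebooks.
ckpt = "google/siglip-so400m-patch14-384"
processor = AutoProcessor.from_pretrained(ckpt)
backbone = AutoModel.from_pretrained(ckpt)

image = Image.open("face_example.png")  # hypothetical input image
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    embedding = backbone.get_image_features(**inputs)  # shape: (1, hidden_dim)

# Hypothetical scoring head: one rating per emotion category. The real heads
# and their trained weights ship with the released models.
emotion_heads = torch.nn.Linear(embedding.shape[-1], 40)
scores = emotion_heads(embedding)
print(scores.shape)  # torch.Size([1, 40])
```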
- Code: MIT License
- Datasets: Creative Commons Attribution 4.0 International (CC BY 4.0)
See LICENSE for details.
If you use this repository or the models, please cite our paper:
```bibtex
@misc{emonetface2025,
  title={EmoNet-Face: An Expert-Annotated Benchmark for Synthetic Emotion Recognition},
  author={Christoph Schuhmann and Robert Kaczmarczyk and Gollam Rabby and Felix Friedrich and Maurice Kraus and Krishna Kalyan and Kourosh Nadi and Huu Nguyen and Kristian Kersting and Sören Auer},
  year={2025},
  eprint={2505.20033},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2505.20033},
}
```
We gratefully acknowledge the support of Intel (oneAPI Center of Excellence), DFKI, Nous Research (providing cluster access and compute), TU Darmstadt, TIB—Leibniz Information Centre for Science and Technology, and Hessian.AI (providing compute and helpful discussions), as well as the open-source community for contributing to emotional AI.
This work benefited from the ICT-48 Network of AI Research Excellence Center “TAILOR” (EU Horizon 2020, GA No 952215), the Hessian research priority program LOEWE within the project WhiteBox, the HMWK cluster projects “Adaptive Mind” and “Third Wave of AI”, and from the NHR4CES. Furthermore, this work was partly funded by the Federal Ministry of Education and Research (BMBF) project “XEI” (FKZ 01IS24079B).