CREMA-D: Crowd-Sourced Emotional Multimodal Actors Dataset
University of Pennsylvania · Ursinus College · +1 more institution
Abstract
People convey their emotional state in their face and voice. We present an audio-visual data set uniquely suited for the study of multi-modal emotion expression and perception. The data set consists of facial and vocal emotional expressions in sentences spoken in a range of basic emotional states (happy, sad, anger, fear, disgust, and neutral). 7,442 clips of 91 actors with diverse ethnic backgrounds were rated by multiple raters in three modalities: audio, visual, and audio-visual. Categorical emotion labels and real-value intensity values for the perceived emotion were collected using crowd-sourcing from 2,443 raters. The human recognition of intended emotion for the audio-only, visual-only, and audio-visual…
Citation impact
- FWCI: 3.91
- Percentile: 100%
- References: 63
Authors: 6
Topics & keywords
- Disgust
- Anger
- Emotion perception
- Psychology
- Happiness
- Perception
- Set (abstract data type)
- Cognitive psychology