End-to-End Multimodal Emotion Recognition Using Deep Neural Networks

Imperial College London · Goldsmiths University of London · +2 more institutions

Indexed in: arXiv, Crossref

Abstract

Automatic affect recognition is a challenging task because emotions can be expressed through various modalities. Applications can be found in many domains, including multimedia retrieval and human-computer interaction. In recent years, deep neural networks have been used with great success in determining emotional states. Inspired by this success, we propose an emotion recognition system using auditory and visual modalities. To capture the emotional content for various styles of speaking, robust features need to be extracted. To this end, we utilize a convolutional neural network (CNN) to extract features from the speech, while for the visual modality a deep residual network of 50 layers is used. In…

Citation impact

  • Total citations: 722
  • FWCI: 49.27
  • Percentile: 100%
  • References: 62

Authors

5

Topics & keywords

Keywords
  • Computer science
  • Artificial intelligence
  • Convolutional neural network
  • Deep learning
  • Modalities
  • Feature extraction
  • Context (archaeology)
  • Modality (human–computer interaction)
UN Sustainable Development Goals
  • Quality Education