Adieu features? End-to-end speech emotion recognition using a deep convolutional recurrent network
Imperial College London · Technical University of Munich · +2 more institutions
Abstract
The automatic recognition of spontaneous emotions from speech is a challenging task. On the one hand, acoustic features need to be robust enough to capture the emotional content for various styles of speaking, and on the other, machine learning algorithms need to be insensitive to outliers while being able to model the context. Whereas the latter has been tackled by the use of Long Short-Term Memory (LSTM) networks, the former is still under very active investigation, even though more than a decade of research has provided a large set of acoustic descriptors. In this paper, we propose a solution to the problem of ‘context-aware’ emotionally relevant feature extraction, by combining Convolutional Neural…
Citation impact
- FWCI: 109.70
- Percentile: 100%
- References: 42
Authors
- 7
Topics & keywords
- End-to-end principle
- Computer science
- Deep learning
- Speech recognition
- Artificial intelligence