Article · Jan 1, 2024 · Gold OA

emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

Indexed in Crossref

Abstract

We propose emotion2vec, a universal speech emotion representation model. emotion2vec is pre-trained on open-source unlabeled emotion data through self-supervised online distillation, combining an utterance-level loss and a frame-level loss during pre-training. emotion2vec outperforms state-of-the-art pre-trained universal models and emotion specialist models on the mainstream IEMOCAP dataset while training only linear layers for the speech emotion recognition task. In addition, emotion2vec shows consistent improvements on speech emotion recognition datasets in 10 different languages. emotion2vec also shows excellent results on other emotion tasks, such as song emotion recognition, emotion prediction in conversation, and…
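As an illustration of what "combining utterance-level loss and frame-level loss" in online distillation could look like, the following is a minimal sketch and not the authors' implementation. It assumes a student/teacher pair whose frame outputs are already computed, a mean-squared-error objective, mean pooling for the utterance-level vector, a weighting factor `alpha`, and an exponential-moving-average teacher update with decay `tau`; all of these names and choices are hypothetical and are not stated in the abstract.

```python
from typing import List, Sequence

Frames = List[List[float]]  # one feature vector per frame (assumed layout)


def mean_pool(frames: Frames) -> List[float]:
    """Average frame vectors over time to get one utterance-level vector."""
    n = len(frames)
    return [sum(col) / n for col in zip(*frames)]


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def frame_loss(student: Frames, teacher: Frames, masked: Sequence[int]) -> float:
    """Frame-level term: average MSE over the masked frame positions only."""
    return sum(mse(student[i], teacher[i]) for i in masked) / len(masked)


def utterance_loss(student: Frames, teacher: Frames) -> float:
    """Utterance-level term: MSE between mean-pooled student and teacher outputs."""
    return mse(mean_pool(student), mean_pool(teacher))


def combined_loss(student: Frames, teacher: Frames,
                  masked: Sequence[int], alpha: float = 1.0) -> float:
    """Weighted sum of the two terms; alpha is an assumed hyperparameter."""
    return frame_loss(student, teacher, masked) + alpha * utterance_loss(student, teacher)


def ema_update(teacher_w: Sequence[float], student_w: Sequence[float],
               tau: float = 0.999) -> List[float]:
    """Online distillation step (assumed): teacher weights track the student via EMA."""
    return [tau * t + (1.0 - tau) * s for t, s in zip(teacher_w, student_w)]
```

In a training loop under these assumptions, `combined_loss` would be minimized with respect to the student only, and `ema_update` would be applied to the teacher weights after each student step.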

Citation impact

Total citations: 116
FWCI: 37.18
Percentile: 100%
References: 0

Authors

7

Topics & keywords

Keywords
  • Computer science
  • Training (meteorology)
  • Speech recognition
  • Representation (politics)
  • Artificial intelligence
  • Emotion recognition
  • Self representation
  • Natural language processing