Speaker Diarization: A Review of Recent Research

Telefónica (Spain) · EURECOM · +3 more institutions

Indexed in Crossref

Abstract

Speaker diarization is the task of determining “who spoke when?” in an audio or video recording that contains an unknown amount of speech and also an unknown number of speakers. Initially, it was proposed as a research topic related to automatic speech recognition, where speaker diarization serves as an upstream processing step. Over recent years, however, speaker diarization has become an important key technology for many tasks, such as navigation, retrieval, or higher level inference on audio data. Accordingly, many important improvements in accuracy and robustness have been reported in journals and conferences in the area. The application domains, from broadcast news, to lectures and meetings, vary greatly…

Citation impact

  • Total citations: 678
  • FWCI: 32.19
  • Percentile: 100%
  • References: 165

Authors

6

Topics & keywords

Keywords
  • Speaker diarisation
  • Computer science
  • NIST
  • Transcription (linguistics)
  • Speech recognition
  • Speaker recognition
  • Inference
  • Speech technology
UN Sustainable Development Goals
  • Quality Education

Funding