Speaker Diarization: A Review of Recent Research
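Telefónica (Spain) · EURECOM · Université d'Avignon et des Pays de Vaucluse · International Computer Science Institute · University of California, Berkeley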
Abstract
Speaker diarization is the task of determining “who spoke when?” in an audio or video recording that contains an unknown amount of speech and an unknown number of speakers. Initially, it was proposed as a research topic related to automatic speech recognition, where speaker diarization serves as an upstream processing step. Over recent years, however, speaker diarization has become a key technology for many tasks, such as navigation, retrieval, or higher-level inference on audio data. Accordingly, many important improvements in accuracy and robustness have been reported in journals and conferences in the area. The application domains, from broadcast news to lectures and meetings, vary greatly…
Citation impact
- FWCI: 32.19
- Percentile: 100%
- References: 165
Authors (6)
- Xavier Anguera (corresponding author)
  Telefónica (Spain)
- Simon Bozonnet
  EURECOM
- Nicholas Evans
  Université d'Avignon et des Pays de Vaucluse, EURECOM
- Corinne Fredouille
  Université d'Avignon et des Pays de Vaucluse, International Computer Science Institute
- Gerald Friedland
  International Computer Science Institute, University of California, Berkeley
Topics & keywords
- Speaker diarisation
- Computer science
- NIST
- Transcription (linguistics)
- Speech recognition
- Speaker recognition
- Inference
- Speech technology
- Quality Education