Preprint | Aug 16, 2017 | Green OA
VoxCeleb: A Large-Scale Speaker Identification Dataset
Indexed in: arXiv, Crossref
Abstract
Most existing datasets for speaker identification contain samples obtained under quite constrained conditions, and are usually hand-annotated, hence limited in size. The goal of this paper is to generate a large-scale text-independent speaker identification dataset collected ‘in the wild’. We make two contributions. First, we propose a fully automated pipeline based on computer vision techniques to create the dataset from open-source media. Our pipeline involves obtaining videos from YouTube; performing active speaker verification using a two-stream synchronization Convolutional Neural Network (CNN); and confirming the identity of the speaker using CNN-based facial recognition. We use this pipeline to…
Citation impact
- Total citations: 2,121
- FWCI: 90.54
- Percentile: 100%
- References: 4
Authors: 3
Topics & keywords
- Computer science
- Pipeline (software)
- Convolutional neural network
- Identification (biology)
- Speaker recognition
- Speaker diarisation
- Speech recognition
- Identity (music)
UN Sustainable Development Goals
- Quality Education