MUSAN: A Music, Speech, and Noise Corpus
Indexed in: arXiv, DataCite
Abstract
This report introduces a new corpus of music, speech, and noise. The dataset is suitable for training models for voice activity detection (VAD) and music/speech discrimination. The corpus is released under a flexible Creative Commons license. It consists of music from several genres, speech from twelve languages, and a wide assortment of technical and non-technical noises. We demonstrate the use of this corpus for music/speech discrimination on broadcast news and for VAD in speaker identification.
Citation impact
- Total citations: 922
- FWCI: not available
- Percentile: not available
- References: 5
Authors: 3
Topics & keywords
Keywords
- Speech recognition
- Computer science
- License
- Speech corpus
- Noise (video)
- Voice activity detection
- Identification (biology)
- Natural language processing
UN Sustainable Development Goals
- Peace, Justice and Strong Institutions