Preprint · arXiv (Cornell University) · Jan 1, 2015 · Green OA

MUSAN: A Music, Speech, and Noise Corpus

Indexed in: arXiv, DataCite

Abstract

This report introduces a new corpus of music, speech, and noise. This dataset is suitable for training models for voice activity detection (VAD) and music/speech discrimination. Our corpus is released under a flexible Creative Commons license. The dataset consists of music from several genres, speech from twelve languages, and a wide assortment of technical and non-technical noises. We demonstrate the use of this corpus for music/speech discrimination on broadcast news and for VAD in speaker identification.

Citation impact

Total citations: 922
References: 5

Authors

3

Topics & keywords

Keywords
  • Speech recognition
  • Computer science
  • License
  • Speech corpus
  • Noise (video)
  • Voice activity detection
  • Identification (biology)
  • Natural language processing
UN Sustainable Development Goals
  • Peace, Justice and Strong Institutions