MUSAN: A Music, Speech, and Noise Corpus

Snyder, David; Chen, Guoguo; Povey, Daniel

doi:10.48550/arxiv.1510.08484

Preprint, arXiv (Cornell University), Jan 1, 2015, Green OA

Indexed in: arXiv, DataCite

Abstract

This report introduces a new corpus of music, speech, and noise. The dataset is suitable for training models for voice activity detection (VAD) and music/speech discrimination. Our corpus is released under a flexible Creative Commons license. The dataset consists of music from several genres, speech from twelve languages, and a wide assortment of technical and non-technical noises. We demonstrate the use of this corpus for music/speech discrimination on broadcast news and for VAD in speaker identification.

Citation impact

Total citations: 922

References: 5

Authors: 3

Topics & keywords

Keywords

Speech recognition
Computer science
License
Speech corpus
Noise (video)
Voice activity detection
Identification (biology)
Natural language processing

UN Sustainable Development Goals

Peace, Justice and strong institutions
