An audio-visual corpus for speech perception and automatic speech recognition

Cooke, Martin; Barker, Jon; Cunningham, Stuart; Shao, Xu

doi:10.1121/1.2229005

Article · The Journal of the Acoustical Society of America · Nov 1, 2006 · Closed access

University of Sheffield

Indexed in: Crossref, PubMed

Abstract

An audio-visual corpus has been collected to support the use of common material in speech perception and automatic speech recognition studies. The corpus consists of high-quality audio and video recordings of 1000 sentences spoken by each of 34 talkers. Sentences are simple, syntactically identical phrases such as "place green at B 4 now". Intelligibility tests using the audio signals suggest that the material is easily identifiable in quiet and in low levels of stationary noise. The annotated corpus is available on the web for research use.
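
As a rough illustration of the sentence structure described in the abstract, the following sketch splits the example phrase into six positional slots and computes the total number of recordings implied by 34 talkers and 1000 sentences each. The slot names and the function name `parse_sentence` are assumptions introduced here for readability; the abstract itself states only that the sentences are syntactically identical.

```python
# Slot names are illustrative labels (an assumption), not taken from the abstract.
SLOTS = ("command", "color", "preposition", "letter", "digit", "adverb")


def parse_sentence(sentence: str) -> dict:
    """Split a six-word sentence into named positional slots."""
    words = sentence.split()
    if len(words) != len(SLOTS):
        raise ValueError(f"expected {len(SLOTS)} words, got {len(words)}")
    return dict(zip(SLOTS, words))


# Corpus size figures as stated in the abstract.
TALKERS = 34
SENTENCES_PER_TALKER = 1000
TOTAL_UTTERANCES = TALKERS * SENTENCES_PER_TALKER

# Example phrase quoted in the abstract.
parsed = parse_sentence("place green at B 4 now")
print(parsed)
print(TOTAL_UTTERANCES)
```

Because every sentence shares the same word order, a fixed positional mapping like this is enough to label each word; no grammar parser is needed.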

Citation impact

Total citations: 1,165

FWCI: 9.65
Percentile: 100%
References: 16

Keywords

Intelligibility (philosophy)
Computer science
Speech recognition
QUIET
Speech corpus
Perception
Audio mining
Audio visual

Funding

University of Sheffield