The ICSI Meeting Corpus

Janin, Adam; Baron, Don; Edwards, Jane A.; Ellis, Daniel P. W.; Gelbart, David; Morgan, N.; Peskin, Barbara; Pfau, Thilo; Shriberg, E.; Stolcke, Andreas; Wooters, Chuck

doi:10.1109/icassp.2003.1198793

Article · 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03) · Nov 20, 2003 · Closed access

International Computer Science Institute · University of California, Berkeley · +3 more institutions

Indexed in Crossref

Abstract

We have collected a corpus of data from natural meetings that occurred at the International Computer Science Institute (ICSI) in Berkeley, California over the last three years. The corpus contains audio recorded simultaneously from head-worn and table-top microphones, word-level transcripts of meetings, and various metadata on participants, meetings, and hardware. Such a corpus supports work in automatic speech recognition, noise robustness, dialog modeling, prosody, rich transcription, information retrieval, and more. We present details on the contents of the corpus, as well as rationales for the decisions that led to its configuration. The corpus was delivered to the Linguistic Data Consortium (LDC).

Citation impact

Total citations: 672

FWCI: 41.27
Percentile: 100%
References: 8

Authors: 11

Topics & keywords

Keywords

Computer science
Metadata
Natural language processing
Prosody
Transcription (linguistics)
Speech corpus
Artificial intelligence
Speech recognition

Funding

UO
University of Washington