A study on data augmentation of reverberant speech for robust speech recognition

Ko, Tom; Peddinti, Vijayaditya; Povey, Daniel; Seltzer, Michael L.; Khudanpur, Sanjeev

doi:10.1109/icassp.2017.7953152

Article · Mar 1, 2017 · Closed access

Huawei Technologies (China) · Institute for Language and Speech Processing · +2 more institutions

Indexed in Crossref

Abstract

The environmental robustness of DNN-based acoustic models can be significantly improved by using multi-condition training data. However, as data collection is a costly proposition, simulation of the desired conditions is a frequently adopted strategy. In this paper we detail a data augmentation approach for far-field ASR. We examine the impact of using simulated room impulse responses (RIRs), as real RIRs can be difficult to acquire, and also the effect of adding point-source noises. We find that the performance gap between using simulated and real RIRs can be eliminated when point-source noises are added. Further we show that the trained acoustic models not only perform well in the distant-talking scenario…
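The augmentation idea in the abstract (reverberating clean speech with an RIR and mixing in point-source noise) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the toy exponentially decaying noise RIR (a stand-in for a properly simulated or measured RIR), the choice to convolve the noise with its own RIR, and the SNR-based mixing are all assumptions made for illustration.

```python
import numpy as np


def simulate_rir(sample_rate=16000, rt60=0.5, seed=0):
    """Toy RIR (assumption): white noise with an exponential decay envelope.

    This is NOT a room simulator; it only mimics a 60 dB decay over rt60 seconds.
    """
    rng = np.random.default_rng(seed)
    n = int(rt60 * sample_rate)
    t = np.arange(n) / sample_rate
    # Amplitude falls by a factor of 10^-3 (60 dB) at t = rt60.
    decay = np.exp(-3.0 * np.log(10.0) * t / rt60)
    rir = rng.standard_normal(n) * decay
    rir[0] = 1.0  # direct-path component
    return rir / np.max(np.abs(rir))


def reverberate(signal, rir):
    """Convolve a signal with an RIR and trim to the original length."""
    return np.convolve(signal, rir)[: len(signal)]


def add_noise_at_snr(signal, noise, snr_db):
    """Scale noise so the mixture has the requested signal-to-noise ratio."""
    noise = np.resize(noise, len(signal))  # repeat or truncate to match length
    signal_power = np.mean(signal ** 2)
    noise_power = np.mean(noise ** 2)
    scale = np.sqrt(signal_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return signal + scale * noise


def augment(speech, speech_rir, noise, noise_rir, snr_db):
    """Reverberate speech, reverberate a point-source noise, then mix them."""
    rev_speech = reverberate(speech, speech_rir)
    # Assumption: the point source sits elsewhere in the room, so it gets its own RIR.
    rev_noise = reverberate(noise, noise_rir)
    return add_noise_at_snr(rev_speech, rev_noise, snr_db)
```

In practice the random arrays used to exercise this sketch would be replaced by real waveforms, and the toy RIR by simulated or measured room impulse responses sampled at the same rate as the speech.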

Citation impact

Total citations: 914
FWCI: 45.73
Percentile: 100%
References: 23

Authors: 5

Topics & keywords

Keywords

Robustness (evolution)
Computer science
Speech recognition
Acoustic model
Impulse response
Training set
Impulse (physics)
Field (mathematics)

UN Sustainable Development Goals

Life on Land
