A Regression Approach to Speech Enhancement Based on Deep Neural Networks

Xu, Yong; Du, Jun; Dai, Li-Rong; Lee, Chin‐Hui

doi:10.1109/taslp.2014.2364452

Article · IEEE/ACM Transactions on Audio, Speech, and Language Processing · Oct 21, 2014 · Closed access

University of Science and Technology of China · Georgia Institute of Technology

Indexed in Crossref

Abstract

In contrast to conventional minimum mean square error (MMSE)-based noise reduction techniques, we propose a supervised method that enhances speech by finding a mapping function between noisy and clean speech signals based on deep neural networks (DNNs). To handle a wide range of additive noises in real-world situations, a large training set that encompasses many possible combinations of speech and noise types is first designed. A DNN architecture is then employed as a nonlinear regression function to ensure a powerful modeling capability. Several techniques are also proposed to improve the DNN-based speech enhancement system, including global variance equalization to…

Citation impact

Total citations: 1,407

FWCI: 53.05
Percentile: 100%
References: 63

Authors: 4

Keywords

Computer science
Speech enhancement
Smoothing
Noise (video)
Dropout (neural networks)
Artificial neural network
Speech recognition
Minimum mean square error

UN Sustainable Development Goals

Peace, Justice and Strong Institutions

Funding

National Natural Science Foundation of China
Awards: 61305002, 61273264