Phase-sensitive and recognition-boosted speech separation using deep recurrent neural networks

Erdoğan, Hakan; Hershey, John R.; Watanabe, Shinji; Le Roux, Jonathan

doi:10.1109/icassp.2015.7178061

Article · Apr 1, 2015 · Closed access

Hakan Erdoğan · John R. Hershey · Shinji Watanabe · Jonathan Le Roux

Mitsubishi Electric (United States) · Sabancı Üniversitesi

Indexed in Crossref

Abstract

Separation of speech embedded in non-stationary interference is a challenging problem that has recently seen dramatic improvements using deep network-based methods. Previous work has shown that estimating a masking function to be applied to the noisy spectrum is a viable approach that can be improved by using a signal-approximation based objective function. Better modeling of dynamics through deep recurrent networks has also been shown to improve performance. Here we pursue both of these directions. We develop a phase-sensitive objective function based on the signal-to-noise ratio (SNR) of the reconstructed signal, and show that in experiments it yields uniformly better results in terms of signal-to-distortion…
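The contrast the abstract draws between a magnitude-only signal-approximation objective and a phase-sensitive one can be illustrated with a minimal NumPy sketch. The abstract is truncated here and gives no formula, so everything below is an assumption for illustration: the phase-sensitive loss is taken to be the squared error between the masked noisy spectrum and the clean spectrum in the complex domain, the function names are hypothetical, and random complex arrays stand in for STFT frames. No recurrent network is involved; the masks are computed directly from the known clean signal.

```python
import numpy as np

# Illustrative sketch only: names, loss forms, and toy data are assumptions,
# not taken from the paper.

def magnitude_loss(mask, noisy, clean):
    """Magnitude-only signal approximation: compares magnitudes, ignores phase."""
    return float(np.mean((mask * np.abs(noisy) - np.abs(clean)) ** 2))

def phase_sensitive_loss(mask, noisy, clean):
    """Squared error in the complex domain, so phase mismatch is penalized."""
    return float(np.mean(np.abs(mask * noisy - clean) ** 2))

def snr_db(estimate, clean):
    """SNR in dB of a reconstructed spectrum against the clean spectrum."""
    err = estimate - clean
    return float(10.0 * np.log10(np.sum(np.abs(clean) ** 2) / np.sum(np.abs(err) ** 2)))

def amplitude_mask(noisy, clean):
    """Oracle mask from the magnitude ratio, truncated to [0, 1]."""
    return np.clip(np.abs(clean) / np.abs(noisy), 0.0, 1.0)

def phase_sensitive_mask(noisy, clean):
    """Oracle real-valued mask minimizing |mask * noisy - clean|^2 per bin,
    truncated to [0, 1]."""
    return np.clip(np.real(clean * np.conj(noisy)) / np.abs(noisy) ** 2, 0.0, 1.0)

# Toy complex "spectra" standing in for STFT frames (frames x frequency bins).
rng = np.random.default_rng(0)
shape = (64, 129)
clean = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
noise = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
noisy = clean + noise

for name, make_mask in [("amplitude", amplitude_mask),
                        ("phase-sensitive", phase_sensitive_mask)]:
    mask = make_mask(noisy, clean)
    print(name,
          "| phase-sensitive loss:", phase_sensitive_loss(mask, noisy, clean),
          "| SNR (dB):", snr_db(mask * noisy, clean))
```

Under these assumptions, the truncated phase-sensitive mask can never give a lower reconstruction SNR than the truncated amplitude mask, because in each bin it is the best real value in [0, 1] for the complex squared error. That is only an oracle-mask property of this toy setup and says nothing about the trained networks evaluated in the paper.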

Citation impact

Total citations: 701

FWCI: 43.76
Percentile: 100%
References: 23

Authors: 4

Topics & keywords

Keywords

Computer science
Speech recognition
Artificial neural network
Artificial intelligence
Signal-to-noise ratio (imaging)
Speech enhancement
SIGNAL (programming language)
Distortion (music)

UN Sustainable Development Goals

Peace, Justice and Strong Institutions
