DCCRN: Deep Complex Convolution Recurrent Network for Phase-Aware Speech Enhancement

Hu, Yanxin; Liu, Yun; Lv, Shubo; Xing, Mengtao; Zhang, Shimin; Fu, Yihui; Wu, Jian; Zhang, Bihong; Xie, Lei

doi:10.21437/interspeech.2020-2537

Article · Oct 25, 2020 · Closed access

Northwestern Polytechnical University · Sohu (China)

Indexed in Crossref

Abstract

Speech enhancement has benefited from the success of deep learning in terms of intelligibility and perceptual quality. Conventional time-frequency (TF) domain methods focus on predicting TF-masks or the speech spectrum via a naive convolutional neural network (CNN) or recurrent neural network (RNN). Some recent studies use the complex-valued spectrogram as a training target but train in a real-valued network, predicting either the magnitude and phase components or the real and imaginary parts. In particular, the convolution recurrent network (CRN) integrates a convolutional encoder-decoder (CED) structure and long short-term memory (LSTM), which has been proven to be helpful for complex targets. In order to train the complex…
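The distinction the abstract draws between magnitude-oriented masks and complex (real and imaginary) targets can be illustrated on a single TF bin. The sketch below is not the paper's model: the function names and the bin values are hypothetical, chosen only to show that a real-valued gain leaves the noisy phase untouched, while a complex-valued mask can correct both magnitude and phase.

```python
import cmath


def apply_magnitude_mask(noisy_bin: complex, gain: float) -> complex:
    """Real-valued mask: rescales the magnitude, keeps the noisy phase."""
    return gain * noisy_bin


def ideal_complex_mask(clean_bin: complex, noisy_bin: complex) -> complex:
    """Complex mask that maps the noisy bin exactly onto the clean bin."""
    return clean_bin / noisy_bin


def apply_complex_mask(noisy_bin: complex, mask: complex) -> complex:
    """Complex mask: complex multiplication changes magnitude and phase."""
    return mask * noisy_bin


# Hypothetical values for one TF bin (illustration only).
clean = 1.0 + 1.0j
noise = 0.5 - 0.8j
noisy = clean + noise

# Magnitude-only enhancement: correct magnitude, but still the noisy phase.
gain = abs(clean) / abs(noisy)
mag_enhanced = apply_magnitude_mask(noisy, gain)

# Complex enhancement: both magnitude and phase match the clean bin.
cplx_enhanced = apply_complex_mask(noisy, ideal_complex_mask(clean, noisy))

phase_error_mag = abs(cmath.phase(mag_enhanced) - cmath.phase(clean))
phase_error_cplx = abs(cmath.phase(cplx_enhanced) - cmath.phase(clean))
```

In this toy case `phase_error_mag` stays well above zero while `phase_error_cplx` is zero up to floating-point error, which is the motivation for phase-aware targets in general terms.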

Citation impact

Total citations: 681
FWCI: 55.76
Percentile: 100%
References: 32

Authors: 9

Topics & keywords

Keywords

Computer science
Convolution (computer science)
Phase (matter)
Speech enhancement
Speech recognition
Artificial intelligence
Artificial neural network
Physics

UN Sustainable Development Goals

Peace, Justice and strong institutions
