DCCRN: Deep Complex Convolution Recurrent Network for Phase-Aware Speech Enhancement
Northwestern Polytechnical University · Sohu (China)
Abstract
Speech enhancement has benefited from the success of deep learning in terms of intelligibility and perceptual quality. Conventional time-frequency (TF) domain methods focus on predicting TF masks or the speech spectrum via a naive convolutional neural network (CNN) or recurrent neural network (RNN). Some recent studies use the complex-valued spectrogram as a training target but train in a real-valued network, predicting the magnitude and phase components or the real and imaginary parts, respectively. In particular, the convolution recurrent network (CRN) integrates a convolutional encoder-decoder (CED) structure and long short-term memory (LSTM), which has been proven to be helpful for complex targets. In order to train the complex…
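The abstract mentions two ways of describing a complex-valued spectrogram: magnitude and phase, or real and imaginary parts. The following is a minimal illustrative sketch of that equivalence, together with an example of applying a complex-valued TF mask. It assumes NumPy, and the function names `to_polar` and `apply_complex_mask`, the array shapes, and the constant mask are all hypothetical choices for illustration; it is not the network or training procedure described in the paper.

```python
import numpy as np

def to_polar(spec):
    """Split a complex spectrogram into magnitude and phase (hypothetical helper)."""
    return np.abs(spec), np.angle(spec)

def apply_complex_mask(noisy, mask):
    """Apply a complex-valued TF mask by element-wise complex multiplication.

    Assumption for illustration: the mask has the same shape as the spectrogram.
    """
    return noisy * mask

# Toy "noisy" spectrogram: 4 frequency bins x 3 frames (arbitrary sizes).
rng = np.random.default_rng(0)
noisy = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))

# Real/imaginary view and magnitude/phase view carry the same information.
real, imag = noisy.real, noisy.imag
mag, phase = to_polar(noisy)
rebuilt = mag * np.exp(1j * phase)

# A constant mask that halves the magnitude and rotates the phase by 90 degrees.
mask = 0.5 * np.exp(1j * np.pi / 2) * np.ones_like(noisy)
enhanced = apply_complex_mask(noisy, mask)
```

Because the mask is complex, it changes both the magnitude and the phase of each TF bin, whereas a real-valued mask would only rescale the magnitude.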
Citation impact
- FWCI: 55.76
- Percentile: 100%
- References: 32
Authors
- 9
Topics & keywords
- Computer science
- Convolution (computer science)
- Phase (matter)
- Speech enhancement
- Speech recognition
- Artificial intelligence
- Artificial neural network
- Physics
- Peace, Justice and strong institutions