Dual-Path RNN: Efficient Long Sequence Modeling for Time-Domain Single-Channel Speech Separation

Luo, Yi; Chen, Zhuo; Yoshioka, Takuya

doi:10.1109/icassp40776.2020.9054266

Article · Apr 9, 2020 · Closed access

Columbia University · Microsoft (United States)

Indexed in Crossref

Abstract

Recent studies in deep learning-based speech separation have proven the superiority of time-domain approaches over conventional time-frequency-based methods. Unlike the time-frequency domain approaches, time-domain separation systems often receive input sequences consisting of a huge number of time steps, which introduces challenges for modeling extremely long sequences. Conventional recurrent neural networks (RNNs) are not effective for modeling such long sequences due to optimization difficulties, while one-dimensional convolutional neural networks (1-D CNNs) cannot perform utterance-level sequence modeling when their receptive fields are smaller than the sequence length. In this paper, we propose dual-path…
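The receptive-field limitation mentioned in the abstract can be illustrated with a small calculation. The sketch below is not taken from the paper: the layer count, kernel size, dilation pattern, sampling rate, and encoder window are all hypothetical values chosen for illustration. It uses the standard formula for the receptive field of stacked stride-1 dilated 1-D convolutions and compares it with the number of frames in a short utterance.

```python
from typing import Sequence


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Receptive field, in time steps, of stacked stride-1 dilated 1-D convolutions."""
    rf = 1
    for d in dilations:
        # Each layer widens the field by (kernel_size - 1) * dilation steps.
        rf += (kernel_size - 1) * d
    return rf


def num_frames(num_samples: int, window: int, hop: int) -> int:
    """Number of frames produced by a sliding window without padding."""
    return (num_samples - window) // hop + 1


# Hypothetical CNN: 8 layers, kernel size 3, dilations 1, 2, 4, ..., 128.
dilations = [2 ** i for i in range(8)]
rf = receptive_field(3, dilations)

# Hypothetical input: 4 s of 8 kHz audio, 2-sample window, 1-sample hop.
frames = num_frames(4 * 8000, window=2, hop=1)

# True only if a single output step can "see" the whole utterance.
covers_utterance = rf >= frames
print(rf, frames, covers_utterance)
```

Under these assumed settings the receptive field spans only a small fraction of the input frames, which is the situation the abstract describes as preventing utterance-level modeling.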

Citation impact

Total citations: 765

FWCI: 67.01
Percentile: 100%
References: 46

Topics & keywords

Keywords

Recurrent neural network
Computer science
Sequence (biology)
Algorithm
Path (computing)
Deep learning
Frequency domain
Artificial intelligence