Complex Ratio Masking for Monaural Speech Separation

Williamson, Donald S.; Wang, Yuxuan; Wang, DeLiang

doi:10.1109/taslp.2015.2512042

Article · IEEE/ACM Transactions on Audio, Speech, and Language Processing · Dec 23, 2015 · Green OA

The Ohio State University · Google (United States)

Indexed in: Crossref, PubMed

Abstract

Speech separation systems usually operate on the short-time Fourier transform (STFT) of noisy speech, and enhance only the magnitude spectrum while leaving the phase spectrum unchanged. This is done because there was a belief that the phase spectrum is unimportant for speech enhancement. Recent studies, however, suggest that phase is important for perceptual quality, leading some researchers to consider magnitude and phase spectrum enhancements. We present a supervised monaural speech separation approach that simultaneously enhances the magnitude and phase spectra by operating in the complex domain. Our approach uses a deep neural network to estimate the real and imaginary components of the ideal ratio mask…
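As an illustration of what a complex-domain mask does, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the mask is the element-wise complex ratio of the clean-speech STFT to the noisy-speech STFT, so that multiplying the noisy STFT by the mask corrects magnitude and phase together. The function names, the `eps` stabilizer, and the random stand-in spectrograms are hypothetical; the neural network that would estimate the mask from noisy features is omitted, and any further processing of the training target described in the full paper is not reproduced here.

```python
import numpy as np


def complex_ratio_mask(clean_stft, noisy_stft, eps=1e-12):
    """Return a complex mask M with M * noisy_stft ~= clean_stft (element-wise).

    Hypothetical helper: computes S / Y via the real and imaginary parts,
    which are the two quantities a network would be trained to estimate.
    """
    y_r, y_i = noisy_stft.real, noisy_stft.imag
    s_r, s_i = clean_stft.real, clean_stft.imag
    denom = y_r ** 2 + y_i ** 2 + eps  # |Y|^2, stabilized (eps is an assumption)
    mask_real = (y_r * s_r + y_i * s_i) / denom
    mask_imag = (y_r * s_i - y_i * s_r) / denom
    return mask_real + 1j * mask_imag


def apply_mask(mask, noisy_stft):
    """Complex multiplication adjusts magnitude and phase at once."""
    return mask * noisy_stft


# Random stand-ins for STFT matrices (frequency bins x time frames).
rng = np.random.default_rng(0)
clean = rng.standard_normal((4, 5)) + 1j * rng.standard_normal((4, 5))
noise = rng.standard_normal((4, 5)) + 1j * rng.standard_normal((4, 5))
noisy = clean + noise

mask = complex_ratio_mask(clean, noisy)
recovered = apply_mask(mask, noisy)

# A magnitude-only mask keeps the noisy phase, so it cannot match the clean STFT.
magnitude_only = np.abs(mask) * noisy
```

With an oracle mask like this one, `recovered` matches `clean` up to numerical error, whereas `magnitude_only` does not, which is the gap that motivates enhancing phase as well as magnitude.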

Citation impact

Total citations: 823

FWCI: 24.34
Percentile: 100%
References: 41

Authors: 3

Topics & keywords

Keywords

PESQ
Monaural
Computer science
Speech recognition
Short-time Fourier transform
Speech enhancement
Masking (illustration)
Artificial intelligence

UN Sustainable Development Goals

Peace, Justice and strong institutions
