Complex Ratio Masking for Monaural Speech Separation

The Ohio State University · Google (United States)

Indexed in: Crossref, PubMed

Abstract

Speech separation systems usually operate on the short-time Fourier transform (STFT) of noisy speech, and enhance only the magnitude spectrum while leaving the phase spectrum unchanged. This is done because there was a belief that the phase spectrum is unimportant for speech enhancement. Recent studies, however, suggest that phase is important for perceptual quality, leading some researchers to consider magnitude and phase spectrum enhancements. We present a supervised monaural speech separation approach that simultaneously enhances the magnitude and phase spectra by operating in the complex domain. Our approach uses a deep neural network to estimate the real and imaginary components of the ideal ratio mask…
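The idea of masking in the complex domain can be illustrated with a small sketch. The abstract is truncated here, so the mask definition below is an assumption: it takes the mask to be the complex quotient of the clean and noisy STFT bins, so that multiplying the mask by the noisy bin recovers the clean bin. The function names (`complex_ratio_mask`, `apply_mask`), the `eps` guard, and the toy values are hypothetical and are not taken from the paper; no neural network is involved, only the target a network might be trained to estimate.

```python
import cmath

def complex_ratio_mask(clean, noisy, eps=1e-12):
    """Return a complex mask M per bin such that M * noisy is close to clean.

    Assumed definition (plain complex division S / Y), written out in
    real and imaginary parts. eps only guards against division by zero.
    """
    mask = []
    for s, y in zip(clean, noisy):
        denom = y.real ** 2 + y.imag ** 2 + eps
        m_real = (y.real * s.real + y.imag * s.imag) / denom
        m_imag = (y.real * s.imag - y.imag * s.real) / denom
        mask.append(complex(m_real, m_imag))
    return mask

def apply_mask(mask, noisy):
    """Apply the mask by complex multiplication, bin by bin."""
    return [m * y for m, y in zip(mask, noisy)]

# Toy time-frequency bins (made-up numbers standing in for STFT values).
clean = [complex(1.0, 0.5), complex(-0.3, 0.8), complex(0.0, -1.2)]
noise = [complex(0.2, -0.4), complex(0.5, 0.1), complex(-0.7, 0.3)]
noisy = [s + n for s, n in zip(clean, noise)]

# Complex masking: both magnitude and phase are corrected.
mask = complex_ratio_mask(clean, noisy)
estimate = apply_mask(mask, noisy)
max_error = max(abs(e - s) for e, s in zip(estimate, clean))

# Magnitude-only enhancement for comparison: even with the exact clean
# magnitude, the noisy phase is kept, so a residual error remains.
mag_only = [cmath.rect(abs(s), cmath.phase(y)) for s, y in zip(clean, noisy)]
mag_only_error = max(abs(e - s) for e, s in zip(mag_only, clean))
```

In this toy setting the complex mask reconstructs the clean bins up to rounding, while the magnitude-only estimate retains an error caused purely by the unchanged phase, which is the gap the abstract's approach targets.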

Citation impact

Total citations: 823
FWCI: 24.34
Percentile: 100%
References: 41

Authors

3

Topics & keywords

Keywords
  • PESQ
  • Monaural
  • Computer science
  • Speech recognition
  • Short-time Fourier transform
  • Speech enhancement
  • Masking (illustration)
  • Artificial intelligence
UN Sustainable Development Goals
  • Peace, Justice and strong institutions

Funding