Deep Learning for Audio Signal Processing

Purwins, H.‐G.; Li, Bo; Virtanen, Tuomas; Schlüter, Jan; Chang, Shuo-Yiin; Sainath, Tara N.

doi:10.1109/jstsp.2019.2908700

Article · IEEE Journal of Selected Topics in Signal Processing · Apr 1, 2019 · Green OA

Aalborg University · Google (United States) · +5 more institutions

Indexed in: arXiv, Crossref

Abstract

Given the recent surge in developments of deep learning, this paper provides a review of state-of-the-art deep learning techniques for audio signal processing. Speech, music, and environmental sound processing are considered side by side, in order to point out similarities and differences between the domains, highlighting general methods, problems, key references, and potential for cross-fertilization between areas. The dominant feature representations (in particular, log-mel spectra and raw waveform) and deep learning models are reviewed, including convolutional neural networks, variants of the long short-term memory architecture, as well as more audio-specific neural network models. Subsequently,…

Citation impact

Total citations: 815

FWCI: 59.09
Percentile: 100%
References: 221

Authors: 6

Topics & keywords

Keywords

Computer science
Deep learning
Audio signal processing
Speech recognition
Audio signal
Artificial intelligence
Convolutional neural network
Speech processing

Funding

Conseil National de la Recherche Scientifique
Award: INS2I 2018