Deep Learning for Audio Signal Processing
Aalborg University · Google (United States) · +5 more institutions
Abstract
Given the recent surge in developments of deep learning, this paper provides a review of the state-of-the-art deep learning techniques for audio signal processing. Speech, music, and environmental sound processing are considered side-by-side, in order to point out similarities and differences between the domains, highlighting general methods, problems, key references, and potential for cross-fertilization between areas. The dominant feature representations (in particular, log-mel spectra and raw waveform) and deep learning models are reviewed, including convolutional neural networks, variants of the long short-term memory architecture, as well as more audio-specific neural network models. Subsequently,…
Citation impact
- FWCI: 59.09
- Percentile: 100%
- References: 221
Authors (6)
- H.‐G. Purwins (corresponding author)
Aalborg University
- Bo Li
Google (United States)
- Tuomas Virtanen
Tampere University
- Jan Schlüter
Centre National de la Recherche Scientifique, Université de Toulon, Austrian Research Institute for Artificial Intelligence, Laboratoire d’Informatique et Systèmes
- Shuo-Yiin Chang
Google (United States)
Topics & keywords
- Computer science
- Deep learning
- Audio signal processing
- Speech recognition
- Audio signal
- Artificial intelligence
- Convolutional neural network
- Speech processing