Environmental sound classification with convolutional neural networks

Piczak, Karol J.

doi:10.1109/mlsp.2015.7324337

Article · Sep 1, 2015 · Closed access

Karol J. Piczak

Warsaw University of Technology

Indexed in Crossref

Abstract

This paper evaluates the potential of convolutional neural networks in classifying short audio clips of environmental sounds. A deep model consisting of 2 convolutional layers with max-pooling and 2 fully connected layers is trained on a low level representation of audio data (segmented spectrograms) with deltas. The accuracy of the network is evaluated on 3 public datasets of environmental and urban recordings. The model outperforms baseline implementations relying on mel-frequency cepstral coefficients and achieves results comparable to other state-of-the-art approaches.
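The input representation named in the abstract (segmented spectrograms with deltas) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration only: the abstract does not state the segment length, the hop size, or the delta formula, so the code uses the common regression-based delta with a hypothetical window `width`, and the helper names `deltas`, `segment`, and `two_channel_segments` are invented for this example.

```python
from typing import List, Tuple

# A spectrogram as a list of time frames, each a list of band energies.
Spectrogram = List[List[float]]


def deltas(spec: Spectrogram, width: int = 2) -> Spectrogram:
    """Regression-based deltas along the time axis.

    Assumption: the common formula
    d_t = sum_n n * (c[t+n] - c[t-n]) / (2 * sum_n n^2), n = 1..width,
    with edge frames repeated as padding. The abstract does not say
    which delta definition the paper uses.
    """
    n_frames = len(spec)
    n_bands = len(spec[0])
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out: Spectrogram = []
    for t in range(n_frames):
        row = []
        for b in range(n_bands):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = spec[min(t + n, n_frames - 1)][b]
                behind = spec[max(t - n, 0)][b]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def segment(spec: Spectrogram, length: int, hop: int) -> List[Spectrogram]:
    """Cut a spectrogram into fixed-length segments, dropping a trailing partial one."""
    return [spec[s:s + length] for s in range(0, len(spec) - length + 1, hop)]


def two_channel_segments(
    spec: Spectrogram, length: int, hop: int, width: int = 2
) -> List[Tuple[Spectrogram, Spectrogram]]:
    """Pair each spectrogram segment with the matching segment of its deltas."""
    d = deltas(spec, width)
    return list(zip(segment(spec, length, hop), segment(d, length, hop)))
```

Each returned pair could serve as a two-channel input (spectrogram plus deltas) to a convolutional network; the values of `length`, `hop`, and `width` would need to be taken from the paper itself.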

Citation impact

Total citations: 904

FWCI: 34.20
Percentile: 100%
References: 54

Authors

Karol J. Piczak (corresponding author)
Warsaw University of Technology

Topics & keywords

Keywords

Spectrogram
Pooling
Computer science
Convolutional neural network
Mel-frequency cepstrum
Speech recognition
Artificial intelligence
Pattern recognition (psychology)

UN Sustainable Development Goals

Sustainable cities and communities
