Article · Sep 1, 2015 · Closed access

Environmental sound classification with convolutional neural networks

Warsaw University of Technology

Indexed in Crossref

Abstract

This paper evaluates the potential of convolutional neural networks for classifying short audio clips of environmental sounds. A deep model consisting of 2 convolutional layers with max-pooling and 2 fully connected layers is trained on a low-level representation of audio data (segmented spectrograms) with deltas. The accuracy of the network is evaluated on 3 public datasets of environmental and urban recordings. The model outperforms baseline implementations relying on mel-frequency cepstral coefficients and achieves results comparable to other state-of-the-art approaches.
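
The input representation named in the abstract (segmented spectrograms with deltas) can be illustrated with a minimal pure-Python sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the function names `delta`, `segment`, and `make_inputs`, the regression-style delta formula with edge clamping, the window width of 2, and the toy segment length and hop. The abstract does not state which delta formula, segment length, or overlap the paper uses.

```python
def delta(frames, width=2):
    """Regression-style delta along the time axis (assumed formula).

    frames: list of time frames, each a list of per-band values.
    Frames past either edge are clamped to the first/last frame.
    """
    n_frames = len(frames)
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out = []
    for t in range(n_frames):
        row = []
        for b in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, n_frames - 1)][b]
                behind = frames[max(t - n, 0)][b]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def segment(frames, seg_len, hop):
    """Cut a frame sequence into fixed-length, possibly overlapping segments."""
    last_start = len(frames) - seg_len
    return [frames[s:s + seg_len] for s in range(0, last_start + 1, hop)]


def make_inputs(spec, seg_len=4, hop=2):
    """Pair each spectrogram segment with its delta segment (two channels).

    seg_len and hop are toy values chosen for this sketch only.
    """
    deltas = delta(spec)
    return list(zip(segment(spec, seg_len, hop), segment(deltas, seg_len, hop)))


# Toy "spectrogram": 10 frames, 3 bands, values rising linearly over time.
spec = [[float(t)] * 3 for t in range(10)]
inputs = make_inputs(spec)
```

Each element of `inputs` is a pair (spectrogram segment, delta segment), which is the kind of two-channel patch a convolutional network could consume. On the linear ramp above, the interior delta values equal the slope of the ramp, which is a quick sanity check for the formula.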

Citation impact

Total citations: 904
FWCI: 34.20
Percentile: 100%
References: 54

Authors

1

Topics & keywords

Keywords
  • Spectrogram
  • Pooling
  • Computer science
  • Convolutional neural network
  • Mel-frequency cepstrum
  • Speech recognition
  • Artificial intelligence
  • Pattern recognition (psychology)
UN Sustainable Development Goals
  • Sustainable cities and communities