Article · Mar 1, 2017 · Closed access

Audio Set: An ontology and human-labeled dataset for audio events

Google (United States)

Indexed in Crossref

Abstract

Audio event recognition, the human-like ability to identify and relate sounds from audio, is a nascent problem in machine perception. Comparable problems such as object detection in images have reaped enormous benefits from comprehensive datasets - principally ImageNet. This paper describes the creation of Audio Set, a large-scale dataset of manually-annotated audio events that endeavors to bridge the gap in data availability between image and audio research. Using a carefully structured hierarchical ontology of 632 audio classes guided by the literature and manual curation, we collect data from human labelers to probe the presence of specific audio classes in 10 second segments of YouTube videos. Segments are…

Citation impact

Total citations: 2,950
FWCI: 126.38
Percentile: 100%
References: 25

Authors

8

Topics & keywords

Keywords
  • Computer science
  • Metadata
  • Event (particle physics)
  • Ontology
  • Set (abstract data type)
  • Context (archaeology)
  • Audio signal processing
  • Audio signal