Article | Jul 1, 2017 | Closed access

YouTube-BoundingBoxes: A Large High-Precision Human-Annotated Data Set for Object Detection in Video

Google (United States)

Indexed in Crossref

Abstract

We introduce a new large-scale data set of video URLs with densely-sampled object bounding box annotations called YouTube-BoundingBoxes (YT-BB). The data set consists of approximately 380,000 video segments about 19s long, automatically selected to feature objects in natural settings without editing or post-processing, with a recording quality often akin to that of a hand-held cell phone camera. The objects represent a subset of the COCO [32] label set. All video segments were human-annotated with high-precision classification labels and bounding boxes at 1 frame per second. The use of a cascade of increasingly precise human annotations ensures a label accuracy above 95% for every class and tight bounding…

Citation impact

Total citations: 564
FWCI: 15.16
Percentile: 100%
References: 62

Authors

5

Topics & keywords

Keywords
  • Computer science
  • Artificial intelligence
  • Minimum bounding box
  • Data set
  • Set (abstract data type)
  • Video tracking
  • Frame (networking)
  • Object (grammar)