Article | Jul 1, 2017 | Closed access
YouTube-BoundingBoxes: A Large High-Precision Human-Annotated Data Set for Object Detection in Video
Indexed in Crossref
Abstract
We introduce a new large-scale data set of video URLs with densely-sampled object bounding box annotations called YouTube-BoundingBoxes (YT-BB). The data set consists of approximately 380,000 video segments about 19s long, automatically selected to feature objects in natural settings without editing or post-processing, with a recording quality often akin to that of a hand-held cell phone camera. The objects represent a subset of the COCO [32] label set. All video segments were human-annotated with high-precision classification labels and bounding boxes at 1 frame per second. The use of a cascade of increasingly precise human annotations ensures a label accuracy above 95% for every class and tight bounding…
Citation impact
- Total citations: 564
- FWCI: 15.16
- Percentile: 100%
- References: 62
Authors: 5
Topics & keywords
Keywords
- Computer science
- Artificial intelligence
- Minimum bounding box
- Data set
- Set (abstract data type)
- Video tracking
- Frame (networking)
- Object (grammar)