YouTube-8M: A Large-Scale Video Classification Benchmark
Indexed in: arXiv, DataCite
Abstract
Many recent advancements in Computer Vision are attributed to large datasets. Open-source software packages for Machine Learning and inexpensive commodity hardware have reduced the barrier of entry for exploring novel approaches at scale. It is possible to train models over millions of examples within a few days. Although large-scale datasets exist for image understanding, such as ImageNet, there are no comparable size video classification datasets. In this paper, we introduce YouTube-8M, the largest multi-label video classification dataset, composed of ~8 million videos (500K hours of video), annotated with a vocabulary of 4800 visual entities. To get the videos and their labels, we used a YouTube video…
Citation impact
921 total citations
- FWCI: —
- Percentile: —
- References: 32
Authors: 7
Topics & keywords
Keywords
- Computer science
- Artificial intelligence
- Metadata
- Benchmark (surveying)
- Frame (networking)
- Machine learning
- Annotation
- Variety (cybernetics)