YouTube-8M: A Large-Scale Video Classification Benchmark

Abu-El-Haija, Sami; Kothari, Nisarg; Lee, Joonseok; Natsev, Paul; Toderici, George; Varadarajan, Balakrishnan; Vijayanarasimhan, Sudheendra

doi:10.48550/arxiv.1609.08675

Preprint, arXiv (Cornell University), Sep 27, 2016, Green OA

Indexed in: arXiv, DataCite

Abstract

Many recent advancements in Computer Vision are attributed to large datasets. Open-source software packages for Machine Learning and inexpensive commodity hardware have lowered the barrier to entry for exploring novel approaches at scale. It is possible to train models over millions of examples within a few days. Although large-scale datasets exist for image understanding, such as ImageNet, there are no video classification datasets of comparable size. In this paper, we introduce YouTube-8M, the largest multi-label video classification dataset, composed of ~8 million videos (500K hours of video), annotated with a vocabulary of 4800 visual entities. To get the videos and their labels, we used a YouTube video…

Citation impact

Total citations: 921

References: 32

Authors: 7

Topics & keywords

Keywords

Computer science
Artificial intelligence
Metadata
Benchmark (surveying)
Frame (networking)
Machine learning
Annotation
Variety (cybernetics)
