LAION-5B: An open large-scale dataset for training next generation image-text models

Schuhmann, Christoph; Beaumont, Romain; Vencu, Richard; Gordon, Cade; Wightman, Ross; Cherti, Mehdi; Coombes, Theo; Katta, Aarush; Mullis, Clayton; Wortsman, Mitchell; Schramowski, Patrick; Kundurthy, Srivatsa; Crowson, Katherine; Schmidt, Ludwig; Kaczmarczyk, Robert; Jitsev, Jenia

doi:10.48550/arxiv.2210.08402

Preprint | arXiv (Cornell University) | Oct 16, 2022 | Green OA

Indexed in: arXiv, DataCite

Abstract

Groundbreaking language-vision architectures like CLIP and DALL-E proved the utility of training on large amounts of noisy image-text data, without relying on expensive accurate labels used in standard vision unimodal supervised learning. The resulting models showed capabilities of strong text-guided image generation and transfer to downstream tasks, while performing remarkably at zero-shot classification with noteworthy out-of-distribution robustness. Since then, large-scale language-vision models like ALIGN, BASIC, GLIDE, Flamingo and Imagen made further improvements. Studying the training and capabilities of such models requires datasets containing billions of image-text pairs. Until now, no datasets of…

Citation impact

1,036 total citations

Authors: 16

Topics & keywords

Keywords

Computer science
Robustness (evolution)
Artificial intelligence
Image (mathematics)
Scale (ratio)
Modal
Machine learning
Data mining

UN Sustainable Development Goals

Quality Education
