DeViSE: A Deep Visual-Semantic Embedding Model

Frome, Andrea; Corrado, Greg S.; Shlens, Jon; Bengio, Samy; Dean, Jeff; Ranzato, Marc’Aurelio; Mikolov, Tomáš

Article · Dec 5, 2013 · Closed access

Abstract

Modern visual recognition systems are often limited in their ability to scale to large numbers of object categories. This limitation is in part due to the increasing difficulty of acquiring sufficient training data in the form of labeled images as the number of object categories grows. One remedy is to leverage data from other sources – such as text data – both to train visual models and to constrain their predictions. In this paper we present a new deep visual-semantic embedding model trained to identify visual objects using both labeled image data as well as semantic information gleaned from unannotated text. We demonstrate that this model matches state-of-the-art performance on the 1000-class ImageNet…

Citation impact

Total citations: 2,062

FWCI: 157.16
Percentile: 100%
References: 21

Authors: 7

Topics & keywords

Keywords

Computer science
Leverage (statistics)
Embedding
Artificial intelligence
Class (philosophy)
Object (grammar)
Visualization
Cognitive neuroscience of visual object recognition

UN Sustainable Development Goals

Quality Education
