Grounded Compositional Semantics for Finding and Describing Images with Sentences
Laboratoire d'Informatique de Paris-Nord · Stanford University · Google (United States)
Abstract
Previous work on Recursive Neural Networks (RNNs) shows that these models can produce compositional feature vectors for accurately representing and classifying sentences or images. However, the sentence vectors of previous models cannot accurately represent visually grounded meaning. We introduce the DT-RNN model which uses dependency trees to embed sentences into a vector space in order to retrieve images that are described by those sentences. Unlike previous RNN-based models which use constituency trees, DT-RNNs naturally focus on the action and agents in a sentence. They are better able to abstract from the details of word order and syntactic expression. DT-RNNs outperform other recursive and recurrent…
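The idea in the abstract, composing a sentence vector along a dependency tree and then ranking images by how well they match it, can be illustrated with a toy sketch. This is not the authors' model: the dimension `DIM`, the random weights `w_word` and `w_child`, the single shared child matrix, the `tanh` nonlinearity, the averaging over children, and the inner-product scoring are all assumptions made for illustration, and the image vectors are made-up placeholders.

```python
import math
import random

DIM = 4  # hypothetical embedding size, chosen only for this sketch


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def random_matrix(rng, rows, cols):
    """Small random weights; a trained model would learn these."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class Node:
    """A dependency-tree node: a head word plus its dependents."""

    def __init__(self, word, children=()):
        self.word = word
        self.children = list(children)


def embed(node, word_vecs, w_word, w_child):
    """Compose a vector for the subtree rooted at `node`, bottom-up.

    Children are summed, so the result does not depend on their order
    (up to floating-point rounding). The root, typically the main verb,
    yields the sentence vector.
    """
    total = matvec(w_word, word_vecs[node.word])
    for child in node.children:
        child_vec = embed(child, word_vecs, w_word, w_child)
        total = [t + c for t, c in zip(total, matvec(w_child, child_vec))]
    scale = 1.0 / (1 + len(node.children))
    return [math.tanh(scale * t) for t in total]


def rank_images(sentence_vec, image_vecs):
    """Return image names sorted by inner product with the sentence vector."""
    scores = {
        name: sum(a * b for a, b in zip(sentence_vec, vec))
        for name, vec in image_vecs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


rng = random.Random(0)
vocab = ["a", "dog", "chases", "ball"]
word_vecs = {w: [rng.uniform(-1, 1) for _ in range(DIM)] for w in vocab}
w_word = random_matrix(rng, DIM, DIM)
w_child = random_matrix(rng, DIM, DIM)

# "a dog chases a ball": the verb is the root, its arguments are dependents.
tree = Node("chases", [Node("dog", [Node("a")]), Node("ball", [Node("a")])])
sentence_vec = embed(tree, word_vecs, w_word, w_child)

# Placeholder image vectors; a real system would map image features
# into the same space as the sentence vectors.
image_vecs = {
    name: [rng.uniform(-1, 1) for _ in range(DIM)]
    for name in ("img_1", "img_2", "img_3")
}
ranking = rank_images(sentence_vec, image_vecs)
```

Because the root of a dependency tree is usually the main verb, the final vector is anchored on the action and its arguments, which is the property the abstract highlights.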
Citation impact
- FWCI: 84.03
- Percentile: 100%
- References: 42
Authors (5)
- Richard Socher (corresponding)
  Laboratoire d'Informatique de Paris-Nord, Stanford University
- Andrej Karpathy (corresponding)
  Laboratoire d'Informatique de Paris-Nord, Stanford University
- Quoc V. Le (corresponding)
  Google (United States)
- Christopher D. Manning (corresponding)
  Laboratoire d'Informatique de Paris-Nord, Stanford University
- Andrew Y. Ng (corresponding)
  Laboratoire d'Informatique de Paris-Nord, Stanford University
Topics & keywords
- Computer science
- Sentence
- Recurrent neural network
- Artificial intelligence
- Natural language processing
- Dependency (UML)
- Word (group theory)
- Feature (linguistics)
- Quality Education