Visual Translation Embedding Network for Visual Relation Detection
Columbia University · National University of Singapore
Abstract
Visual relations, such as "person ride bike" and "bike next to car", offer a comprehensive understanding of the scene in an image and have already proven useful in connecting computer vision and natural language. However, because of the combinatorial complexity of modeling subject-predicate-object relation triplets, very little work has been done on localizing and predicting visual relations. Inspired by recent advances in relational representation learning for knowledge bases and in convolutional object detection networks, we propose a Visual Translation Embedding network (VTransE) for visual relation detection. VTransE places objects in a low-dimensional relation space where a relation can be modeled as…
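As a rough illustration of a translation embedding in general (the idea from knowledge-base representation learning that the model's name refers to), a relation triplet can be scored by how closely subject + predicate lands on object in the embedding space. This is a minimal sketch, not the paper's implementation: the 2-D vectors, the function names `translation_score` and `rank_predicates`, and the Euclidean scoring rule are all assumptions chosen for illustration.

```python
import math


def translation_score(subject, predicate, obj):
    """Euclidean distance between (subject + predicate) and object.

    Lower means the triplet fits the translation assumption better.
    Hypothetical helper, not taken from the paper.
    """
    return math.sqrt(
        sum((s + p - o) ** 2 for s, p, o in zip(subject, predicate, obj))
    )


def rank_predicates(subject, obj, predicates):
    """Return predicate names sorted from best fit to worst fit."""
    return sorted(
        predicates,
        key=lambda name: translation_score(subject, predicates[name], obj),
    )


# Toy 2-D embeddings, invented for illustration only.
person = [1.0, 0.0]
bike = [1.0, 1.0]
predicates = {
    "ride": [0.0, 1.0],
    "next to": [1.0, 0.0],
}

best = rank_predicates(person, bike, predicates)[0]
print(best)
```

With these toy vectors, person + ride lands exactly on bike, so "ride" ranks above "next to". In a trained model the embeddings would be learned rather than set by hand.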
Citation impact
- FWCI: 25.41
- Percentile: 100%
- References: 59
Authors: 4
Topics & keywords
- Computer science
- Relation (database)
- Artificial intelligence
- Embedding
- Inference
- Relationship extraction
- Spatial relation
- Convolutional neural network
- Quality Education