Framing Image Description as a Ranking Task: Data, Models and Evaluation Metrics
University of Illinois Urbana-Champaign
Abstract
The ability to associate images with natural language sentences that describe what is depicted in them is a hallmark of image understanding, and a prerequisite for applications such as sentence-based image search. In analogy to image search, we propose to frame sentence-based image annotation as the task of ranking a given pool of captions. We introduce a new benchmark collection for sentence-based image description and search, consisting of 8,000 images that are each paired with five different captions which provide clear descriptions of the salient entities and events. We introduce a number of systems that perform quite well on this task, even though they are only based on features that can be obtained with…
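The ranking framing described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the toy feature vectors are made up, cosine similarity is a stand-in scoring function rather than the paper's models, and recall@k is a common retrieval metric that the excerpt above does not itself name. The helper names `cosine`, `rank_captions`, and `recall_at_k` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_captions(image_vec, caption_vecs):
    """Return caption indices ordered from best to worst match for the image."""
    scores = [cosine(image_vec, c) for c in caption_vecs]
    return sorted(range(len(caption_vecs)), key=lambda i: scores[i], reverse=True)


def recall_at_k(ranking, gold_index, k):
    """1.0 if the gold caption appears in the top k of the ranking, else 0.0."""
    return 1.0 if gold_index in ranking[:k] else 0.0


# Toy data (assumed): one image and a pool of three candidate captions,
# all embedded in the same made-up 3-dimensional feature space.
image_vec = [1.0, 0.0, 1.0]
caption_vecs = [
    [0.0, 1.0, 0.0],  # unrelated caption
    [1.0, 0.0, 0.9],  # caption written for this image
    [0.5, 0.5, 0.5],  # partially related caption
]

ranking = rank_captions(image_vec, caption_vecs)
print(ranking)                     # [1, 2, 0]
print(recall_at_k(ranking, 1, 1))  # 1.0
```

Scoring a system this way only needs the position of each image's own caption in the ranked pool, which is what makes a fixed caption pool convenient for automatic evaluation.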
Citation impact
- FWCI: 61.31
- Percentile: 100%
- References: 103
Authors: 3
Topics & keywords
- Computer science
- Sentence
- Ranking (information retrieval)
- Artificial intelligence
- Natural language processing
- Information retrieval
- Salient
- Task (project management)
- Quality Education