Preprint · Oct 1, 2017 · Closed access

Boosting Image Captioning with Attributes

Microsoft Research Asia (China) · University of Science and Technology of China · +1 more institution

Indexed in Crossref

Abstract

Automatically describing an image in natural language has been an emerging challenge in both computer vision and natural language processing. In this paper, we present Long Short-Term Memory with Attributes (LSTM-A), a novel architecture that integrates attributes into the successful Convolutional Neural Networks (CNNs) plus Recurrent Neural Networks (RNNs) image captioning framework by training them in an end-to-end manner. In particular, the learning of attributes is strengthened by integrating inter-attribute correlations into Multiple Instance Learning (MIL). To incorporate attributes into captioning, we construct variants of architectures by feeding image representations and attributes into…

Citation impact

Total citations: 761
FWCI: 36.66
Percentile: 100%
References: 62

Authors

5

Topics & keywords

Keywords
  • Closed captioning
  • Computer science
  • Boosting (machine learning)
  • Artificial intelligence
  • Recurrent neural network
  • Convolutional neural network
  • Image (mathematics)
  • Natural language
UN Sustainable Development Goals
  • Quality Education