Long-term Recurrent Convolutional Networks for Visual Recognition and Description

Donahue, Jeff; Hendricks, Lisa Anne; Guadarrama, Sergio; Rohrbach, Marcus; Venugopalan, Subhashini; Saenko, Kate; Darrell, Trevor

doi:10.21236/ada623249

Preprint | Nov 17, 2014 | Closed access

Indexed in Crossref

Abstract

Models based on deep convolutional networks have dominated recent image interpretation tasks; we investigate whether models which are also recurrent are effective for tasks involving sequences, visual and otherwise. We describe a class of recurrent convolutional architectures which is end-to-end trainable and suitable for large-scale visual understanding tasks, and demonstrate the value of these models for activity recognition, image captioning, and video description. In contrast to previous models which assume a fixed visual representation or perform simple temporal averaging for sequential processing, recurrent convolutional models are "doubly deep" in that they learn compositional representations in space and…

Citation impact

Total citations: 1,067

FWCI: —
Percentile: —
References: 94

Authors: 7

Topics & keywords

Keywords

Term (time)
Computer science
Convolutional neural network
Artificial intelligence
Pattern recognition (psychology)
Physics

UN Sustainable Development Goals

Quality Education
