Long-Term Recurrent Convolutional Networks for Visual Recognition and Description
University of California, Berkeley · International Computer Science Institute · +2 more institutions
Abstract
Models based on deep convolutional networks have dominated recent image interpretation tasks; we investigate whether models which are also recurrent are effective for tasks involving sequences, visual and otherwise. We describe a class of recurrent convolutional architectures which is end-to-end trainable and suitable for large-scale visual understanding tasks, and demonstrate the value of these models for activity recognition, image captioning, and video description. In contrast to previous models which assume a fixed visual representation or perform simple temporal averaging for sequential processing, recurrent convolutional models are "doubly deep" in that they learn compositional representations in space…
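The contrast the abstract draws between simple temporal averaging and recurrent sequence processing can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`frame_features`, `lstm_step`, `recurrent_summary`, `averaged_summary`) are hypothetical, a single linear layer with ReLU stands in for a convolutional network, a standard LSTM cell is assumed as the recurrent unit, and all weights are random and untrained. The point is only structural: averaging per-frame features discards frame order, while a recurrent pass over the same features is generally sensitive to it.

```python
import math
import random

def frame_features(frame, weights):
    """Stand-in for a per-frame CNN (assumption): one linear layer plus ReLU."""
    return [max(0.0, sum(w * x for w, x in zip(row, frame))) for row in weights]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h, c, params):
    """One standard LSTM step; params maps gate name -> (W_x, W_h, b)."""
    def gate(name, act):
        W_x, W_h, b = params[name]
        return [
            act(sum(wx * xi for wx, xi in zip(W_x[j], x))
                + sum(wh * hi for wh, hi in zip(W_h[j], h))
                + b[j])
            for j in range(len(h))
        ]
    i = gate("input", sigmoid)
    f = gate("forget", sigmoid)
    o = gate("output", sigmoid)
    g = gate("cell", math.tanh)
    # Cell state mixes the previous state with the new candidate values.
    c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
    h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
    return h_new, c_new

def random_matrix(rng, rows, cols):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def make_params(rng, feat_dim, hidden_dim):
    """Random, untrained LSTM weights (illustrative only)."""
    return {
        name: (random_matrix(rng, hidden_dim, feat_dim),
               random_matrix(rng, hidden_dim, hidden_dim),
               [0.0] * hidden_dim)
        for name in ("input", "forget", "output", "cell")
    }

def recurrent_summary(frames, cnn_weights, params, hidden_dim):
    """Feed per-frame features through the LSTM in order; return the last hidden state."""
    h, c = [0.0] * hidden_dim, [0.0] * hidden_dim
    for frame in frames:
        h, c = lstm_step(frame_features(frame, cnn_weights), h, c, params)
    return h

def averaged_summary(frames, cnn_weights):
    """Baseline: average the per-frame features over time, ignoring frame order."""
    feats = [frame_features(f, cnn_weights) for f in frames]
    return [sum(col) / len(feats) for col in zip(*feats)]

rng = random.Random(0)
PIXELS, FEAT, HIDDEN = 6, 4, 3  # toy sizes, chosen arbitrarily
cnn_weights = random_matrix(rng, FEAT, PIXELS)
params = make_params(rng, FEAT, HIDDEN)
frames = [[rng.uniform(0.0, 1.0) for _ in range(PIXELS)] for _ in range(5)]
reversed_frames = frames[::-1]

avg_fwd = averaged_summary(frames, cnn_weights)
avg_rev = averaged_summary(reversed_frames, cnn_weights)
rec_fwd = recurrent_summary(frames, cnn_weights, params, HIDDEN)
rec_rev = recurrent_summary(reversed_frames, cnn_weights, params, HIDDEN)
```

Comparing `avg_fwd` with `avg_rev` and `rec_fwd` with `rec_rev` shows the difference in kind: the averaged summaries agree up to floating-point rounding, whereas the recurrent summaries would be expected to differ when the frames are reversed. In an end-to-end trainable model of the kind the abstract describes, the feature extractor and the recurrent unit would be learned jointly rather than fixed at random values as here.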
Citation impact
- FWCI: 80.08
- Percentile: 100%
- References: 111
Authors (7)
- Jeff Donahue (corresponding author)
  University of California, Berkeley
- Lisa Anne Hendricks
  University of California, Berkeley
- Marcus Rohrbach
  University of California, Berkeley; International Computer Science Institute
- Subhashini Venugopalan
  The University of Texas at Austin
- Sergio Guadarrama
  University of California, Berkeley
Topics & keywords
- Computer science
- Artificial intelligence
- Convolutional neural network
- Recurrent neural network
- Pattern recognition (psychology)
- Representation (politics)
- Deep learning
- Differentiable function
- Quality Education