Long-term recurrent convolutional networks for visual recognition and description
University of California, Berkeley · International Computer Science Institute · +2 more institutions
Abstract
Models based on deep convolutional networks have dominated recent image interpretation tasks; we investigate whether models which are also recurrent, or “temporally deep”, are effective for tasks involving sequences, visual and otherwise. We develop a novel recurrent convolutional architecture suitable for large-scale visual learning which is end-to-end trainable, and demonstrate the value of these models on benchmark video recognition tasks, image description and retrieval problems, and video narration challenges. In contrast to current models which assume a fixed spatio-temporal receptive field or simple temporal averaging for sequential processing, recurrent convolutional models are “doubly deep” in that…
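The contrast the abstract draws between simple temporal averaging and recurrent processing can be illustrated with a small NumPy sketch. This is not the authors' architecture: the one-layer "encoder", the plain tanh recurrence, and every name and size below are hypothetical stand-ins chosen only to show that averaging per-frame features ignores frame order while a recurrent summary does not.

```python
import numpy as np

def frame_features(frames, w_conv):
    # Hypothetical stand-in for a convolutional encoder:
    # one linear map followed by ReLU, applied to each frame independently.
    return np.maximum(frames @ w_conv, 0.0)

def temporal_average(features):
    # Order-invariant baseline: mean of per-frame features over time.
    return features.mean(axis=0)

def recurrent_summary(features, w_in, w_rec):
    # Simple tanh recurrence (an assumption, not the paper's recurrent unit):
    # the hidden state is updated frame by frame, so order matters.
    h = np.zeros(w_rec.shape[0])
    for x_t in features:
        h = np.tanh(x_t @ w_in + h @ w_rec)
    return h

rng = np.random.default_rng(0)
T, D, F, H = 6, 32, 16, 8  # frames, input dim, feature dim, hidden dim (all illustrative)
frames = rng.normal(size=(T, D))
w_conv = rng.normal(size=(D, F)) * 0.1
w_in = rng.normal(size=(F, H)) * 0.1
w_rec = rng.normal(size=(H, H)) * 0.1

feats = frame_features(frames, w_conv)
feats_reversed = feats[::-1]  # same frames, reversed in time

# Averaging gives the same summary for both orderings.
avg_same = np.allclose(temporal_average(feats), temporal_average(feats_reversed))
# The recurrent summary differs once the frame order changes.
rec_same = np.allclose(
    recurrent_summary(feats, w_in, w_rec),
    recurrent_summary(feats_reversed, w_in, w_rec),
)
print("average invariant to order:", avg_same)
print("recurrent summary invariant to order:", rec_same)
```

In this toy setting the averaged representation cannot distinguish a clip from its time-reversed copy, whereas the recurrent summary can, which is the kind of sequence sensitivity the abstract attributes to recurrent convolutional models.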
Citation impact
- FWCI: 355.97
- Percentile: 100%
- References: 78
Authors (7)
- Jeff Donahue (corresponding author)
  University of California, Berkeley
- Lisa Anne Hendricks
  International Computer Science Institute, University of California, Berkeley
- Sergio Guadarrama
  University of California, Berkeley, International Computer Science Institute
- Marcus Rohrbach
  University of California, Berkeley, International Computer Science Institute
- Subhashini Venugopalan
  The University of Texas at Austin
Topics & keywords
- Computer science
- Artificial intelligence
- Benchmark (surveying)
- Deep learning
- Convolutional neural network
- Recurrent neural network
- Pattern recognition (psychology)
- Machine learning
- Quality Education