DyTox: Transformers for Continual Learning with DYnamic TOken eXpansion

Douillard, Arthur; Ramé, Alexandre; Couairon, Guillaume; Cord, Matthieu

doi:10.1109/cvpr52688.2022.00907

Article · 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) · Jun 1, 2022 · Green OA

Arthur Douillard · Alexandre Ramé · Guillaume Couairon · Matthieu Cord

Sorbonne Université · Valeo (France)

Indexed in Crossref

Abstract

Deep network architectures struggle to continually learn new tasks without forgetting the previous tasks. A recent trend indicates that dynamic architectures based on an expansion of the parameters can reduce catastrophic forgetting efficiently in continual learning. However, existing approaches often require a task identifier at test-time, need complex tuning to balance the growing number of parameters, and barely share any information across tasks. As a result, they struggle to scale to a large number of tasks without significant overhead. In this paper, we propose a transformer architecture based on a dedicated encoder/decoder framework. Critically, the encoder and decoder are shared among all tasks.…

Citation impact

Total citations: 320

FWCI: 31.37
Percentile: 100%
References: 125

Authors: 4

Topics & keywords

Keywords

Computer science
Forgetting
Encoder
Transformer
Identifier
Distributed computing
Artificial intelligence
Computer network
