Beyond Self-Attention: External Attention Using Two Linear Layers for Visual Tasks

Tsinghua University

Indexed in: Crossref, PubMed

Abstract

Attention mechanisms, especially self-attention, have played an increasingly important role in deep feature representation for visual tasks. Self-attention updates the feature at each position by computing a weighted sum of features using pair-wise affinities across all positions to capture the long-range dependency within a single sample. However, self-attention has quadratic complexity and ignores potential correlation between different samples. This article proposes a novel attention mechanism which we call external attention, based on two external, small, learnable, shared memories, which can be implemented easily by simply using two cascaded linear layers and two normalization layers; it conveniently…
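The mechanism the abstract describes (two small shared memories applied as two cascaded linear layers, with normalization in between) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the truncated abstract does not say which two normalization steps are used, so the sketch assumes a softmax across positions followed by an L1 normalization across memory slots. The names `external_attention`, `mem_k`, `mem_v`, and the sizes `N`, `d`, `S` are hypothetical, and the memories are filled with random values here rather than learned.

```python
import math
import random


def softmax_over_positions(attn):
    """Softmax down each column: normalize across the N positions per memory slot."""
    columns = []
    for col in zip(*attn):
        peak = max(col)  # subtract the max for numerical stability
        exps = [math.exp(v - peak) for v in col]
        total = sum(exps)
        columns.append([e / total for e in exps])
    # Transpose back to N x S.
    return [list(row) for row in zip(*columns)]


def l1_normalize_rows(attn, eps=1e-12):
    """Rescale each row so its S entries sum to (approximately) one."""
    normalized = []
    for row in attn:
        total = sum(row) + eps
        normalized.append([v / total for v in row])
    return normalized


def external_attention(features, mem_k, mem_v):
    """features: N x d, mem_k: S x d, mem_v: S x d. Returns (N x d output, N x S map)."""
    # First linear layer: affinity of every position with every memory slot.
    attn = [[sum(f * k for f, k in zip(feat, slot)) for slot in mem_k]
            for feat in features]
    # Two normalization steps (ASSUMED: softmax over positions, then L1 over slots).
    attn = l1_normalize_rows(softmax_over_positions(attn))
    # Second linear layer: each output is a weighted sum of the value-memory rows.
    d = len(mem_v[0])
    out = [[sum(a * slot[j] for a, slot in zip(row, mem_v)) for j in range(d)]
           for row in attn]
    return out, attn


def rand_matrix(rows, cols):
    """Illustrative stand-in for learned parameters or input features."""
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
N, d, S = 6, 4, 3  # positions, feature width, memory slots (all illustrative)
features = rand_matrix(N, d)
mem_k = rand_matrix(S, d)  # in a trained model these would be learned and shared
mem_v = rand_matrix(S, d)
out, attn = external_attention(features, mem_k, mem_v)
print(len(out), len(out[0]))    # 6 4
print(len(attn), len(attn[0]))  # 6 3
```

In this sketch the affinity map has shape N x S rather than the N x N map that pair-wise self-attention builds, so for a fixed number of memory slots its cost grows linearly with the number of positions.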

Citation impact

Total citations: 620
FWCI: 60.63
Percentile: 100%
References: 120

Authors

4 authors

Topics & keywords

Keywords
  • Computer science
  • Artificial intelligence
  • Pattern recognition (psychology)
  • Normalization (sociology)
  • Segmentation
  • Feature (linguistics)