Article · Jun 1, 2023 · Closed access
BiFormer: Vision Transformer with Bi-Level Routing Attention
Indexed in Crossref
Abstract
As the core building block of vision transformers, attention is a powerful tool to capture long-range dependency. However, such power comes at a cost: it incurs a huge computation burden and heavy memory footprint as pairwise token interaction across all spatial locations is computed. A series of works attempt to alleviate this problem by introducing handcrafted and content-agnostic sparsity into attention, such as restricting the attention operation to be inside local windows, axial stripes, or dilated windows. In contrast to these approaches, we propose a novel dynamic sparse attention via bi-level routing to enable a more flexible allocation of computations with content awareness. Specifically, for a query,…
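The cost argument in the abstract can be made concrete with a small count of query-key pairs. The following is a minimal sketch, not the paper's method: the function names, the 56×56 feature map, and the window size of 7 are illustrative assumptions, and only full attention and fixed local-window attention (one of the handcrafted patterns the abstract mentions) are compared.

```python
def full_attention_pairs(height: int, width: int) -> int:
    """Query-key pairs when every token attends to every token."""
    n = height * width          # number of tokens on the feature map
    return n * n                # quadratic in the number of tokens


def window_attention_pairs(height: int, width: int, window: int) -> int:
    """Query-key pairs when each token attends only inside its own
    non-overlapping window of size window x window (assumed layout)."""
    if height % window or width % window:
        raise ValueError("feature map must be divisible by the window size")
    n = height * width
    return n * window * window  # each token sees window*window keys


if __name__ == "__main__":
    # Hypothetical sizes chosen only for illustration.
    h, w, win = 56, 56, 7
    full = full_attention_pairs(h, w)
    local = window_attention_pairs(h, w, win)
    print(f"full attention pairs:   {full}")
    print(f"window attention pairs: {local}")
    print(f"reduction factor:       {full // local}")
```

The count shows why restricting attention helps with cost, but a fixed window is chosen without looking at the content. That is the limitation the abstract contrasts with its dynamic, content-aware sparse attention.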
Citation impact
- Total citations: 1,033
- FWCI: 117.59
- Percentile: 100%
- References: 67
Authors: 5
Topics & keywords
Topics
Keywords
- Computer science
- Memory footprint
- Security token
- Computation
- Transformer
- Segmentation
- Artificial intelligence
- Parallel computing