Self-Supervised Pre-Training of Swin Transformers for 3D Medical Image Analysis

Tang, Yucheng; Yang, Dong; Li, Wenqi; Roth, Holger R.; Landman, Bennett A.; Xu, Daguang; Nath, Vishwesh; Hatamizadeh, Ali

doi:10.1109/cvpr52688.2022.02007

Article · 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) · Jun 1, 2022 · Closed access

Yucheng Tang, Dong Yang, Wenqi Li, Holger R. Roth, Bennett A. Landman

Vanderbilt University

Indexed in Crossref

Abstract

Vision Transformers (ViTs) have shown great performance in self-supervised learning of global and local representations that can be transferred to downstream applications. Inspired by these results, we introduce a novel self-supervised learning framework with tailored proxy tasks for medical image analysis. Specifically, we propose: (i) a new 3D transformer-based model, dubbed Swin UNEt TRansformers (Swin UNETR), with a hierarchical encoder for self-supervised pre-training; (ii) tailored proxy tasks for learning the underlying pattern of human anatomy. We demonstrate successful pre-training of the proposed model on 5,050 publicly available computed tomography (CT) images from various body organs. The…

Citation impact

Total citations: 751

FWCI: 39.60
Percentile: 100%
References: 84

Authors: 8

Topics & keywords

Keywords

Artificial intelligence
Computer science
Segmentation
Transformer
Encoder
Machine learning
Engineering
