Swin Transformer Embedding UNet for Remote Sensing Image Semantic Segmentation

He, Xin; Zhou, Yong; Zhao, Jiaqi; Zhang, Di; Yao, Rui; Xue, Yong

doi:10.1109/tgrs.2022.3144165

Article · IEEE Transactions on Geoscience and Remote Sensing · Jan 1, 2022 · Closed access

Ministry of Education of the People's Republic of China · China University of Mining and Technology · +1 more institution

Indexed in Crossref

Abstract

Global context information is essential for the semantic segmentation of remote sensing (RS) images. However, most existing methods rely on convolutional neural networks (CNNs), which struggle to capture global context directly because the convolution operation is local. Inspired by the Swin transformer and its powerful global modeling capabilities, we propose a novel semantic segmentation framework for RS images called ST-UNet, which embeds the Swin transformer into the classical CNN-based U-shaped network (UNet). ST-UNet adopts a novel dual encoder structure in which the Swin transformer and the CNN run in parallel. First, we propose a spatial interaction module (SIM), which encodes spatial information in…
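
The abstract's point about the locality of convolution can be illustrated with a small receptive-field calculation. The sketch below is not taken from the paper: the function name `receptive_field` and the two layer stacks are hypothetical, chosen only to show how slowly the region seen by one output unit grows when convolutions are stacked, whereas self-attention relates all positions within its attention scope in a single layer.

```python
def receptive_field(layers):
    """Return the receptive field (in input pixels, along one axis) of a
    stack of convolution layers.

    `layers` is a list of (kernel_size, stride) pairs applied in order.
    Dilation is assumed to be 1 for every layer.
    """
    rf = 1    # one output unit initially covers a single input pixel
    jump = 1  # spacing, in input pixels, between adjacent units at this depth
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf


# Hypothetical stacks for illustration only, not the ST-UNet configuration.
three_convs = [(3, 1)] * 3    # three 3x3 convolutions, stride 1
downsampling = [(3, 2)] * 4   # four 3x3 convolutions, stride 2

print(receptive_field(three_convs))   # 7
print(receptive_field(downsampling))  # 31
```

Even four strided 3x3 layers cover only 31 pixels along each axis, a small fraction of a typical RS image tile, which is consistent with the motivation the abstract gives for adding a transformer branch.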

Citation impact

Total citations: 490

FWCI: 51.30
Percentile: 100%
References: 82

Authors: 6

Topics & keywords

Keywords

Computer science
Encoder
Transformer
Artificial intelligence
Segmentation
Upsampling
Embedding
Convolutional neural network
