CMTFNet: CNN and Multiscale Transformer Fusion Network for Remote-Sensing Image Semantic Segmentation

Wu, Honglin; Huang, Peng; Zhang, Min; Tang, Wenlong; Yu, Xinyu

doi:10.1109/tgrs.2023.3314641

Article | IEEE Transactions on Geoscience and Remote Sensing | Jan 1, 2023 | Closed access

Changsha University of Science and Technology

Indexed in Crossref

Abstract

Convolutional neural networks (CNNs) are powerful at extracting local information but lack the ability to model long-range dependencies. In contrast, the transformer relies on multihead self-attention mechanisms to effectively extract global contextual information and thus model long-range dependencies. In this paper, we propose a novel encoder-decoder structured semantic segmentation network, named the CNN and multiscale transformer fusion network (CMTFNet), to extract and fuse local information and multiscale global contextual information from high-resolution remote sensing images. Specifically, to further process the output features from the CNN encoder, we build a transformer decoder based on the multiscale…
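
The contrast the abstract draws between local convolution and global self-attention can be illustrated with a small sketch. This is not CMTFNet's encoder or decoder: it assumes a 1D sequence of tokens standing in for flattened feature-map positions, a single attention head with identity query/key/value projections, and a fixed smoothing kernel. The names `self_attention`, `local_conv1d`, and `tokens` are hypothetical and chosen only for illustration.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: identity projections (Q = K = V = tokens), so the sketch
    shows only the mixing pattern, not a trained layer.
    Returns (outputs, attention_weights).
    """
    n, d = len(tokens), len(tokens[0])
    scale = 1.0 / math.sqrt(d)
    outputs, weights = [], []
    for q in tokens:
        # Every query is scored against every token in the sequence.
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in tokens]
        w = softmax(scores)
        weights.append(w)
        outputs.append([sum(w[j] * tokens[j][c] for j in range(n))
                        for c in range(d)])
    return outputs, weights

def local_conv1d(tokens, kernel=(0.25, 0.5, 0.25)):
    """Per-channel 1D convolution with zero padding.

    Each output position only sees its immediate neighbours.
    """
    n, d = len(tokens), len(tokens[0])
    radius = len(kernel) // 2
    outputs = []
    for i in range(n):
        vec = [0.0] * d
        for k, wk in enumerate(kernel):
            j = i + k - radius
            if 0 <= j < n:
                for c in range(d):
                    vec[c] += wk * tokens[j][c]
        outputs.append(vec)
    return outputs

# Two similar tokens at opposite ends of a short sequence.
tokens = [[1.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [1.0, 0.0]]
attn_out, attn_w = self_attention(tokens)
conv_out = local_conv1d(tokens)
```

In this toy setup, the attention weights for position 0 give the distant token at position 4 the same weight as position 0 itself, while the convolution output at position 0 depends only on positions 0 and 1. A real layer would use learned projections, multiple heads, and 2D feature maps.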

Citation impact

Total citations: 300

FWCI: 45.60
Percentile: 100%
References: 58

Authors: 5

Topics & keywords

Keywords

Computer science
Encoder
Artificial intelligence
Convolutional neural network
Segmentation
Transformer
Pattern recognition (psychology)
Computer vision

Funding

Education Department of Hunan Province
Award: 21B0329