Article · IEEE Transactions on Geoscience and Remote Sensing · Jan 1, 2024 · Closed access

A Multilevel Multimodal Fusion Transformer for Remote Sensing Semantic Segmentation

Chinese University of Hong Kong, Shenzhen · Wuhan University of Science and Technology

Indexed in Crossref

Abstract

Accurate semantic segmentation of remote sensing data plays a crucial role in the success of geoscience research and applications. Recently, multimodal fusion-based segmentation models have attracted much attention due to their outstanding performance compared with conventional single-modal techniques. However, most of these models perform their fusion operation using convolutional neural networks (CNNs) or the vision transformer (ViT), resulting in insufficient local-global contextual modeling and representative capabilities. In this work, a multilevel multimodal fusion scheme called FTransUNet is proposed to provide a robust and effective multimodal fusion backbone for semantic segmentation by integrating…

Citation impact

Total citations: 208
FWCI: 61.39
Percentile: 100%
References: 65

Authors

4

Topics & keywords

Keywords
  • Computer science
  • Segmentation
  • Fusion
  • Artificial intelligence
  • Transformer
  • Computer vision
  • Remote sensing
  • Image segmentation

Funding