Transformer and CNN Hybrid Deep Neural Network for Semantic Segmentation of Very-High-Resolution Remote Sensing Imagery
Wuhan University · State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing
Abstract
This article presents a transformer and convolutional neural network (CNN) hybrid deep neural network for semantic segmentation of very-high-resolution (VHR) remote sensing imagery. The model follows an encoder–decoder structure. The encoder uses the Swin transformer, a recent general-purpose backbone, to extract features and to better model long-range spatial dependencies. The decoder draws on blocks and strategies that have proven effective in CNN-based models for remote sensing image segmentation. In the middle of the framework, an atrous spatial pyramid pooling block based on depthwise separable convolution (SASPP) is applied to obtain multiscale context. A U-shaped decoder is used to gradually…
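The two ideas behind the SASPP block named in the abstract, atrous (dilated) convolution and depthwise separable convolution, can be illustrated with simple arithmetic. The following is a minimal sketch, not the authors' implementation: the channel widths, kernel size, and dilation rates are hypothetical values chosen for illustration (the abstract does not state them), and the helper names are invented here.

```python
def conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def separable_conv_params(c_in, c_out, k):
    """Weight count of a depthwise k x k convolution (one filter per
    input channel) followed by a 1 x 1 pointwise convolution."""
    return c_in * k * k + c_in * c_out


def effective_kernel(k, rate):
    """Spatial span covered by a k-tap kernel dilated by `rate`."""
    return k + (k - 1) * (rate - 1)


# Assumed settings for illustration only; not taken from the paper.
C_IN, C_OUT, K = 256, 256, 3
RATES = (6, 12, 18)

standard = conv_params(C_IN, C_OUT, K)
separable = separable_conv_params(C_IN, C_OUT, K)
spans = {rate: effective_kernel(K, rate) for rate in RATES}

print(f"standard 3x3 branch:  {standard} weights")
print(f"separable 3x3 branch: {separable} weights")
for rate, span in spans.items():
    print(f"dilation {rate}: covers a {span} x {span} window")
```

Under these assumed settings, each separable branch needs roughly one ninth of the weights of a standard branch, while dilation widens the window each branch sees without adding weights. That is the general motivation for combining the two in a pyramid of parallel branches.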
Citation impact
- FWCI: 30.34
- Percentile: 100%
- References: 54
Authors (6)
- Cheng Zhang (corresponding author)
Wuhan University, State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing
- Wanshou Jiang
Wuhan University, State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing
- Yuan Zhang
Wuhan University, State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing
- Wei Wang
Wuhan University, State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing
- Qing Zhao
Wuhan University, State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing
Topics & keywords
- Computer science
- Artificial intelligence
- Segmentation
- Convolutional neural network
- Encoder
- Pyramid (geometry)
- Pattern recognition (psychology)
- Deep learning