Article · Jun 16, 2024 · Closed access

SkySense: A Multi-Modal Remote Sensing Foundation Model Towards Universal Interpretation for Earth Observation Imagery

Wuhan University

Indexed in Crossref

Abstract

Prior studies on Remote Sensing Foundation Model (RSFM) reveal immense potential towards a generic model for Earth Observation. Nevertheless, these works primarily focus on a single modality without temporal and geo-context modeling, hampering their capabilities for diverse tasks. In this study, we present SkySense, a generic billion-scale model, pretrained on a curated multimodal Remote Sensing Imagery (RSI) dataset with 21.5 million temporal sequences. SkySense incorporates a factorized multimodal spatiotemporal encoder taking temporal sequences of optical and Synthetic Aperture Radar (SAR) data as input. This encoder is pretrained by our proposed Multi-Granularity Contrastive Learning to learn…

Citation impact

  • Total citations: 174
  • FWCI: 51.02
  • Percentile: 100%
  • References: 84

Authors

16

Topics & keywords

Keywords
  • Modal
  • Remote sensing
  • Foundation (evidence)
  • Earth (classical element)
  • Interpretation (philosophy)
  • Earth observation
  • Computer science
  • Artificial intelligence
UN Sustainable Development Goals
  • Climate action