SkySense++
SkySense++ is a semantic-enhanced multi-modal remote sensing foundation model for Earth observation. It fuses high-resolution optical imagery (HR), Sentinel-2 (S2), and Sentinel-1 SAR (S1) through independent backbones, an optional modality-completion VAE, and a shared transformer fusion encoder.
Primary use: representation extraction. The pretrained backbones produce rich feature representations for downstream tasks (classification, segmentation, regression). Extract features_hr, features_s2, features_s1, or features_fusion and feed them to your task-specific head. Fine-tuning on your target dataset is required. See the main SkySensePlusPlus repository for pretraining, 1-shot, and finetuning workflows.
Model Metadata
| Attribute | Value |
|---|---|
| Model type | Multi-modal segmentation (HR + S2 + S1) |
| Paper | SkySense++: A Semantic-Enhanced Multi-Modal Remote Sensing Foundation Model Beyond SkySense for Earth Observation |
| Publication | Nature Machine Intelligence, 2025 |
| License | Apache-2.0 |
| Input modalities | High-resolution optical, Sentinel-2, Sentinel-1 |
| Output | Semantic segmentation (65 classes) |
| Checkpoint contents | Backbone weights only; segmentation head not pretrained |
| HR input size | 512×512 |
| S2/S1 patch size | 16×16 |
Model Variants
| Variant | Path | Sources | Use Modal VAE | Description |
|---|---|---|---|---|
| full (default) | . | hr, s2, s1 | Yes | All three modalities with VAE completion |
| hr | hr/ | hr | No | High-resolution optical only |
| s2 | s2/ | s2 | No | Sentinel-2 only |
| s1 | s1/ | s1 | No | Sentinel-1 only |
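The variant table can be captured as a small lookup so that scripts pick the right directory by name. This is a minimal sketch: `VARIANTS` and `variant_path` are hypothetical names that are not part of the repository, and the paths are taken from the table above.

```python
from pathlib import PurePosixPath

# Variant table from this README: name -> (subfolder, sources, uses modality VAE)
VARIANTS = {
    "full": (".", ("hr", "s2", "s1"), True),
    "hr": ("hr", ("hr",), False),
    "s2": ("s2", ("s2",), False),
    "s1": ("s1", ("s1",), False),
}


def variant_path(root: str, variant: str = "full") -> str:
    """Return the directory to load for a variant (hypothetical helper)."""
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant {variant!r}; expected one of {sorted(VARIANTS)}")
    subfolder = VARIANTS[variant][0]
    root_path = PurePosixPath(root)
    # The full variant lives at the repository root, the others in a subfolder.
    return str(root_path if subfolder == "." else root_path / subfolder)


print(variant_path("path/to/SkySensepp"))        # path/to/SkySensepp
print(variant_path("path/to/SkySensepp", "s2"))  # path/to/SkySensepp/s2
```

The returned string is what would be passed as the first argument to `AutoModel.from_pretrained(..., trust_remote_code=True)`.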
Repository structure (full variant, diffusers layout)
.
├── config.json
├── model.safetensors
├── modality_vae/                 # VAE subfolder (diffusers standard)
│   ├── config.json
│   └── diffusion_pytorch_model.safetensors
├── modeling_skysensepp.py
├── configuration_skysensepp.py
├── pipeline_skysensepp.py
├── sky_sensepp_impl/             # ModalityCompletionVAE, ModalityCompletionVAEPipeline in necks/
└── hr/, s2/, s1/                 # Single-modality variants
The VAE loads automatically from the modality_vae/ subfolder. A legacy modality_vae.safetensors file at the repository root is also supported. To migrate a legacy checkpoint to the subfolder layout, run:
python tools/split_vae_from_checkpoint.py --model-dir path/to/model --migrate
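Before running the migration it can help to check which layout a local model directory already uses. The following is a rough sketch using only the file names listed in the repository structure above; `detect_vae_layout` is a hypothetical helper, not something shipped with the repository.

```python
from pathlib import Path


def detect_vae_layout(model_dir: str) -> str:
    """Classify where the modality VAE weights live (hypothetical helper).

    Returns "subfolder" for the diffusers layout, "legacy" for a root-level
    modality_vae.safetensors, and "missing" if neither is found.
    """
    root = Path(model_dir)
    sub = root / "modality_vae"
    has_config = (sub / "config.json").is_file()
    has_weights = (sub / "diffusion_pytorch_model.safetensors").is_file()
    if has_config and has_weights:
        return "subfolder"
    if (root / "modality_vae.safetensors").is_file():
        return "legacy"
    return "missing"
```

A result of "legacy" suggests the directory is a candidate for the migration command above; "missing" is expected for the single-modality variants, which do not use the VAE.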
Installation
pip install transformers torch safetensors diffusers
The modality VAE uses diffusers VQModel. Legacy checkpoints (ConvVQVAEv2) load via backward-compatible fallback.
Usage
Diffusers-style loading and inference
The VAE follows the diffusers layout: model in a modality_vae/ subfolder with config.json and diffusion_pytorch_model.safetensors. Load and run inference like this:
import torch
from transformers import AutoModel
# Load full model (VAE auto-loads from modality_vae/ subfolder, diffusers-style)
model = AutoModel.from_pretrained("path/to/SkySensepp", trust_remote_code=True)
model = model.eval().to("cuda")
# Prepare inputs
hr_img = torch.randn(1, 3, 512, 512, device="cuda")
s2_img = torch.randn(1, 10, 2, 256, 256, device="cuda") # B, 10 bands, S steps, H, W
s1_img = torch.randn(1, 2, 2, 256, 256, device="cuda") # B, 2 bands, S steps, H, W
modalities = torch.ones(1, 3, dtype=torch.bool, device="cuda") # [hr, s2, s1] present
# Inference
with torch.no_grad():
    out = model(
        hr_img=hr_img,
        s2_img=s2_img,
        s1_img=s1_img,
        modality_flag_hr=modalities[:, :1],
        modality_flag_s2=modalities[:, 1:2],
        modality_flag_s1=modalities[:, 2:],
        return_features=True,
    )
features_fusion = out["features_fusion"]
logits_hr = out.get("logits_hr")
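The example above marks all three modalities as present. When a sample lacks one of them, the flag row has to reflect that. Below is a small sketch of building such a row, assuming the `[hr, s2, s1]` column order from the comment in the example; `modality_flags` is a hypothetical helper, and this README does not say what image tensor should be passed for an absent modality.

```python
MODALITY_ORDER = ("hr", "s2", "s1")  # order taken from the "[hr, s2, s1] present" comment


def modality_flags(available):
    """Return one [hr, s2, s1] boolean row for a sample (hypothetical helper)."""
    unknown = set(available) - set(MODALITY_ORDER)
    if unknown:
        raise ValueError(f"unknown modalities: {sorted(unknown)}")
    return [name in available for name in MODALITY_ORDER]


row = modality_flags({"hr", "s2"})
print(row)  # [True, True, False]
# With PyTorch installed, a batch of rows converts directly:
#   modalities = torch.tensor([row], dtype=torch.bool, device="cuda")
```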
Load VAE component only (diffusers-style)
import torch
from sky_sensepp_impl.necks import ModalityCompletionVAE
# Load VAE from subfolder (same pattern as diffusers Stable Diffusion VAE)
vae = ModalityCompletionVAE.from_pretrained(
    "path/to/SkySensepp",
    subfolder="modality_vae",
)
vae = vae.eval().to("cuda")
# Run modality completion on backbone features (e.g. 2816-d, 16×16)
feat_hr = torch.randn(1, 2816, 16, 16, device="cuda")
feat_s2 = torch.randn(1, 2816, 16, 16, device="cuda")
feat_s1 = torch.randn(1, 2816, 16, 16, device="cuda")
modality_info = torch.ones(1, 3, dtype=torch.bool, device="cuda")
with torch.no_grad():
    out = vae(feat_hr, feat_s2, feat_s1, modality_info)
hr_out = out["hr_out"]
s2_out = out["s2_out"]
s1_out = out["s1_out"]
ModalityCompletionVAEPipeline (modular, diffusers-style)
from sky_sensepp_impl.necks import ModalityCompletionVAE, ModalityCompletionVAEPipeline
# Load pipeline (VAE from modality_vae/ subfolder)
pipe = ModalityCompletionVAEPipeline.from_pretrained(
    "path/to/SkySensepp",
    subfolder="modality_vae",
)
pipe = pipe.to("cuda")
# Inference on features
out = pipe(
    feat_hr=feat_hr,
    feat_s2=feat_s2,
    feat_s1=feat_s1,
    modality_info=modality_info,
)
hr_out, s2_out, s1_out = out["hr_out"], out["s2_out"], out["s1_out"]
# Modular: inject custom VAE
custom_vae = ModalityCompletionVAE.from_pretrained("path/to/custom_vae")
pipe = ModalityCompletionVAEPipeline.from_pretrained("path/to/SkySensepp", vae=custom_vae)
# Or swap components after load
pipe.register_components(vae=custom_vae)
Load model and attach VAE manually
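from transformers import AutoModel

# Load the model, then attach the VAE from the modality_vae/ subfolder
model = AutoModel.from_pretrained("path/to/SkySensepp", trust_remote_code=True)
model.load_vae(
    pretrained_model_name_or_path="path/to/SkySensepp",
    subfolder="modality_vae",
)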
Variants (single-modality, no VAE)
model = AutoModel.from_pretrained("path/to/SkySensepp/hr", trust_remote_code=True)
model = AutoModel.from_pretrained("path/to/SkySensepp/s2", trust_remote_code=True)
model = AutoModel.from_pretrained("path/to/SkySensepp/s1", trust_remote_code=True)
Representation shapes (HR-only)
| Output | Shape | Description |
|---|---|---|
| features_hr[i] | multi-scale | Backbone features at 4 scales (stage 0–3) |
| features_fusion | (B, 1024, H, W) | Fused spatial representation for downstream head |
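Since the checkpoint ships backbone weights only, a task-specific head has to be trained on top of these features. The sketch below shows one very simple option, a 1×1-convolution classifier over `features_fusion` followed by bilinear upsampling. It is illustrative and not the head used in the SkySensePlusPlus repository: `LinearSegHead` is a hypothetical name, the 65-class default comes from the metadata table, and the 16×16 feature grid and 64×64 output size are assumptions chosen to keep the example small.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LinearSegHead(nn.Module):
    """1x1-conv classifier over fused features (illustrative, not the repository's head)."""

    def __init__(self, in_channels: int = 1024, num_classes: int = 65):
        super().__init__()
        self.classifier = nn.Conv2d(in_channels, num_classes, kernel_size=1)

    def forward(self, features: torch.Tensor, out_size: tuple[int, int]) -> torch.Tensor:
        logits = self.classifier(features)  # (B, num_classes, H, W)
        # Upsample the coarse logits to the label resolution.
        return F.interpolate(logits, size=out_size, mode="bilinear", align_corners=False)


# Random tensor standing in for out["features_fusion"]; the 16x16 grid is an assumption.
features_fusion = torch.randn(2, 1024, 16, 16)
head = LinearSegHead()
logits = head(features_fusion, out_size=(64, 64))
print(logits.shape)  # torch.Size([2, 65, 64, 64])
```

In a real fine-tuning run, `out_size` would typically be the spatial size of the label mask, and the logits would feed a per-pixel loss such as `nn.CrossEntropyLoss`.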
Input Formats
| Modality | Shape | Description |
|---|---|---|
| hr_img | (B, 3, H, W) | RGB high-res, H=W=512 typical |
| s2_img | (B, 10, S, H, W) | Sentinel-2, 10 bands, S time steps |
| s1_img | (B, 2, S, H, W) | Sentinel-1 VV/VH, S time steps |
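A quick shape check before calling the model can catch a missing time axis or a wrong band count early. The sketch below encodes the table above as plain tuples. `check_input_shapes` is a hypothetical helper, and the requirement that all inputs share the same batch size is an assumption rather than something this README states.

```python
EXPECTED = {
    # name: (number of dims, channel count)
    "hr_img": (4, 3),   # (B, 3, H, W)
    "s2_img": (5, 10),  # (B, 10, S, H, W)
    "s1_img": (5, 2),   # (B, 2, S, H, W)
}


def check_input_shapes(**shapes):
    """Return a list of problems found in the given shape tuples (hypothetical helper)."""
    problems = []
    batch_sizes = set()
    for name, shape in shapes.items():
        if name not in EXPECTED:
            problems.append(f"{name}: unknown input")
            continue
        ndim, channels = EXPECTED[name]
        if len(shape) != ndim:
            problems.append(f"{name}: expected {ndim} dims, got {len(shape)}")
            continue
        if shape[1] != channels:
            problems.append(f"{name}: expected {channels} channels, got {shape[1]}")
        batch_sizes.add(shape[0])
    if len(batch_sizes) > 1:  # assumption: all modalities share one batch size
        problems.append(f"batch sizes differ: {sorted(batch_sizes)}")
    return problems


ok = check_input_shapes(
    hr_img=(1, 3, 512, 512),
    s2_img=(1, 10, 2, 256, 256),
    s1_img=(1, 2, 2, 256, 256),
)
print(ok)  # []
print(check_input_shapes(s2_img=(1, 10, 256, 256)))  # ['s2_img: expected 5 dims, got 4']
```

With tensors in hand, the shapes can be passed as `tuple(hr_img.shape)` and so on.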
Citation
@article{skysensepp2025,
title={SkySense++: A Semantic-Enhanced Multi-Modal Remote Sensing Foundation Model Beyond SkySense for Earth Observation},
journal={Nature Machine Intelligence},
year={2025},
url={https://www.nature.com/articles/s42256-025-01078-8}
}