pi05_real_pk_sharp

Fine-tuned pi0.5 vision-language-action (VLA) model for real robot manipulation.

Task

  • Task: Pass Knife
  • Training data: Sharp-end mode only
  • Dataset: real_pass_knife_sharp
  • Robot: Franka Panda (7-DOF)
  • Cameras: Base RGB + Wrist RGB (256x256)

Training Configuration

Parameter          Value
Base model         pi0.5 (PaliGemma 2B + Gemma 2B action expert)
Total parameters   ~3.35B
Action dimension   32
Action horizon     10
Batch size         16
Training steps     5,000
Learning rate      Cosine decay: warmup=500, peak=5e-5, end=5e-6
Optimizer          AdamW (gradient clip norm=1.0)
Base weights       gs://openpi-assets/checkpoints/pi05_base/params
GPUs               8x NVIDIA A100
Normalization      Quantile normalization
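The learning-rate schedule in the table (linear warmup to the peak, then cosine decay to the end value) can be sketched as below. This is a minimal illustration using the values from the table; the exact openpi implementation may differ in details such as the warmup start value.

```python
import math

# Values from the training configuration table above.
WARMUP_STEPS = 500
PEAK_LR = 5e-5
END_LR = 5e-6
TOTAL_STEPS = 5_000

def learning_rate(step: int) -> float:
    """Linear warmup to PEAK_LR, then cosine decay to END_LR."""
    if step < WARMUP_STEPS:
        # Linear warmup from 0 to the peak.
        return PEAK_LR * step / WARMUP_STEPS
    # Cosine decay over the remaining steps.
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    cosine = 0.5 * (1 + math.cos(math.pi * progress))
    return END_LR + (PEAK_LR - END_LR) * cosine

print(learning_rate(500))   # 5e-05 (peak, end of warmup)
print(learning_rate(5000))  # 5e-06 (end value)
```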

Included Checkpoints

  • Step 4000: loss = 0.0030
  • Step 4999: loss = 0.0023

Loss Curve

Step   Loss
0      0.1074
1000   0.0120
2000   0.0076
3000   0.0052
4000   0.0030
4900   0.0023

Usage with openpi

# Add config name to openpi training config, then:
from openpi.training.config import get_config
config = get_config("pi05_real_pk_sharp")

# For inference, load the params checkpoint:
# checkpoint_path = "path/to/step_XXXX/params"
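For inference, the policy expects observations matching the camera and robot setup above. The sketch below builds a dummy observation; the observation key names and the `create_trained_policy` call in the comments are assumptions based on openpi's conventions, not verified against this checkpoint.

```python
import numpy as np

def make_observation() -> dict:
    """Build a dummy observation for the Franka Panda setup described above.

    Key names are assumptions; check the openpi config for the exact schema.
    """
    return {
        "observation/image": np.zeros((256, 256, 3), dtype=np.uint8),        # base RGB
        "observation/wrist_image": np.zeros((256, 256, 3), dtype=np.uint8),  # wrist RGB
        "observation/state": np.zeros(7, dtype=np.float32),                  # 7-DOF joints
        "prompt": "pass the knife",
    }

# With openpi installed (not run here):
# from openpi.training.config import get_config
# from openpi.policies import policy_config
# policy = policy_config.create_trained_policy(
#     get_config("pi05_real_pk_sharp"), "path/to/step_XXXX")
# actions = policy.infer(make_observation())["actions"]  # action chunk

obs = make_observation()
print(obs["observation/image"].shape)  # (256, 256, 3)
```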

Part of Mode Editing Research

This checkpoint is part of the "Don't Filter Your Data, Edit Your Policy" project (CoRL 2026), investigating post-hoc behavior mode editing for robot policies using Classifier-Guided Distillation (CG-Distill).

  • Mixed models are trained on demonstrations containing all behavioral modes
  • Mode-specific models are trained on single-mode filtered data
  • CG-Distill edited models (coming soon) use classifier gradients to steer mixed models toward specific modes at zero inference cost
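The steering idea behind CG-Distill can be illustrated with a toy 1-D example: samples from a mixed, bimodal action distribution are nudged toward one mode by ascending the gradient of a classifier's log-probability for that mode. This is only an illustration of classifier guidance, not the project's actual (unreleased) implementation; the classifier and all constants below are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Mixed policy: half the actions near -1 (one mode), half near +1 ("sharp" mode).
actions = np.concatenate([rng.normal(-1, 0.1, 500), rng.normal(+1, 0.1, 500)])

def grad_log_p_sharp(a, k=4.0):
    """Gradient of log sigmoid(k*a): a toy classifier favoring a > 0."""
    return k * (1.0 - 1.0 / (1.0 + np.exp(-k * a)))

# Guidance: a few gradient-ascent steps on the classifier log-probability.
steered = actions.copy()
for _ in range(50):
    steered += 0.05 * grad_log_p_sharp(steered)

print(round(float(actions.mean()), 1))  # ~0.0: modes cancel out
print(bool(steered.mean() > 0.5))       # True: mass moved to the "sharp" mode
```

In CG-Distill, the guided behavior would then be distilled into a student policy, which is why the edited model runs at zero additional inference cost.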