# pi05_real_pk_sharp
Fine-tuned pi0.5 vision-language-action (VLA) model for real robot manipulation.
## Task
- Task: Pass Knife
- Training data: Sharp-end mode only
- Dataset: real_pass_knife_sharp
- Robot: Franka Panda (7-DOF)
- Cameras: Base RGB + Wrist RGB (256x256)
## Training Configuration
| Parameter | Value |
|---|---|
| Base model | pi0.5 (PaliGemma 2B + Gemma 2B action expert) |
| Total parameters | ~3.35B |
| Action dimension | 32 |
| Action horizon | 10 |
| Batch size | 16 |
| Training steps | 5,000 |
| Learning rate | Cosine decay: warmup=500, peak=5e-5, end=5e-6 |
| Optimizer | AdamW (gradient clip norm=1.0) |
| Base weights | gs://openpi-assets/checkpoints/pi05_base/params |
| GPUs | 8x NVIDIA A100 |
| Normalization | Quantile normalization |
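The learning-rate row can be made concrete with a small sketch. This is a plain-Python illustration of linear warmup followed by cosine decay using the values from the table (warmup=500 steps, peak=5e-5, end=5e-6 over 5,000 steps); it is an assumption about the schedule's shape, not openpi's exact implementation.

```python
import math

def lr_schedule(step, warmup=500, peak=5e-5, end=5e-6, total=5000):
    """Linear warmup to `peak`, then cosine decay to `end`.
    Sketch of the schedule in the table above; openpi's own
    schedule may differ in edge-case details."""
    if step < warmup:
        # Linear warmup from ~0 up to the peak learning rate.
        return peak * (step + 1) / warmup
    # Cosine decay from peak (at end of warmup) to end (at `total`).
    frac = (step - warmup) / max(total - warmup, 1)
    return end + 0.5 * (peak - end) * (1 + math.cos(math.pi * frac))
```

For example, `lr_schedule(500)` returns the peak rate 5e-5 and `lr_schedule(5000)` returns the final rate 5e-6.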
## Included Checkpoints
- Step 4000: loss = 0.0030
- Step 4999: loss = 0.0023
## Loss Curve
| Step | Loss |
|---|---|
| 0 | 0.1074 |
| 1000 | 0.0120 |
| 2000 | 0.0076 |
| 3000 | 0.0052 |
| 4000 | 0.0030 |
| 4900 | 0.0023 |
## Usage with openpi

```python
# Add the config name to your openpi training config, then:
from openpi.training.config import get_config

config = get_config("pi05_real_pk_sharp")

# For inference, load the params checkpoint:
# checkpoint_path = "path/to/step_XXXX/params"
```
## Part of Mode Editing Research
This checkpoint is part of the "Don't Filter Your Data, Edit Your Policy" project (CoRL 2026), investigating post-hoc behavior mode editing for robot policies using Classifier-Guided Distillation (CG-Distill).
- Mixed models are trained on demonstrations containing all behavioral modes
- Mode-specific models are trained on single-mode filtered data
- CG-Distill edited models (coming soon) use classifier gradients to steer mixed models toward specific modes at zero inference cost
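The classifier-gradient idea behind the editing approach can be illustrated with a toy 1-D example. This is purely illustrative and is not the CG-Distill implementation: a "mixed policy" has two action modes (at a = -1 and a = +1), and adding the gradient of a mode classifier's log-probability to the policy's score steers sampling toward the chosen mode. All names, distributions, and constants here are hypothetical.

```python
import numpy as np

def mixed_score(a):
    # Score (d/da log p) of an equal mixture of N(-1, 0.1) and N(+1, 0.1),
    # standing in for a policy trained on both behavioral modes.
    w = np.array([0.5, 0.5]); mu = np.array([-1.0, 1.0]); var = 0.1
    p = w * np.exp(-(a - mu) ** 2 / (2 * var))
    return np.sum(p * (mu - a) / var) / np.sum(p)

def classifier_grad(a, target=1.0, sharpness=5.0):
    # Gradient of log sigmoid(sharpness * target * a): a toy mode
    # classifier that prefers actions with sign(a) == sign(target).
    z = sharpness * target * a
    return sharpness * target * (1.0 - 1.0 / (1.0 + np.exp(-z)))

def guided_ascent(a0, scale=1.0, steps=200, lr=0.05):
    # Gradient ascent on (policy log-density + scale * classifier log-prob):
    # the classifier term breaks the tie between the two modes.
    a = a0
    for _ in range(steps):
        a += lr * (mixed_score(a) + scale * classifier_grad(a))
    return a
```

Starting from a = 0, exactly between the two modes, the guided ascent lands near the a = +1 mode: the classifier gradient supplies the initial push, after which the policy's own score pulls the sample into that mode. The actual method edits a VLA policy rather than a 1-D density, but the steering mechanism is analogous.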