# pi05_real_pk_sharp
Fine-tuned pi0.5 vision-language-action (VLA) model for real robot manipulation.
## Task
- Task: Pass Knife
- Training data: Sharp-end mode only
- Dataset: real_pass_knife_sharp
- Robot: Franka Panda (7-DOF)
- Cameras: Base RGB + Wrist RGB (256x256)
## Training Configuration
| Parameter | Value |
|---|---|
| Base model | pi0.5 (PaliGemma 2B + Gemma 2B action expert) |
| Total parameters | ~3.35B |
| Action dimension | 32 |
| Action horizon | 10 |
| Batch size | 16 |
| Training steps | 5,000 |
| Learning rate | Cosine decay: warmup=500, peak=5e-5, end=5e-6 |
| Optimizer | AdamW (gradient clip norm=1.0) |
| Base weights | gs://openpi-assets/checkpoints/pi05_base/params |
| GPUs | 8x NVIDIA A100 |
| Normalization | Quantile normalization |
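The learning-rate row can be made concrete with a small sketch. This is a plain-Python illustration of linear warmup followed by cosine decay using the values from the table (warmup=500 steps, peak=5e-5, end=5e-6 over 5,000 steps); it is an assumption about the schedule's shape, not openpi's exact implementation.

```python
import math

def lr_schedule(step, warmup=500, peak=5e-5, end=5e-6, total=5000):
    """Linear warmup to `peak`, then cosine decay to `end`.
    Sketch of the schedule in the table above; openpi's own
    schedule may differ in edge-case details."""
    if step < warmup:
        # Linear warmup from ~0 up to the peak learning rate.
        return peak * (step + 1) / warmup
    # Cosine decay from peak (at end of warmup) to end (at `total`).
    frac = (step - warmup) / max(total - warmup, 1)
    return end + 0.5 * (peak - end) * (1 + math.cos(math.pi * frac))
```

For example, `lr_schedule(500)` returns the peak rate 5e-5 and `lr_schedule(5000)` returns the final rate 5e-6.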
## Included Checkpoints
- Step 4000: loss = 0.0030
- Step 4999: loss = 0.0023
## Loss Curve
| Step | Loss |
|---|---|
| 0 | 0.1074 |
| 1000 | 0.0120 |
| 2000 | 0.0076 |
| 3000 | 0.0052 |
| 4000 | 0.0030 |
| 4900 | 0.0023 |
## Usage with openpi

```python
# Add the config name to your openpi training config, then:
from openpi.training.config import get_config

config = get_config("pi05_real_pk_sharp")

# For inference, load the params checkpoint:
# checkpoint_path = "path/to/step_XXXX/params"
```
## Part of Mode Editing Research
This checkpoint is part of the "Don't Filter Your Data, Edit Your Policy" project (CoRL 2026), investigating post-hoc behavior mode editing for robot policies using Classifier-Guided Distillation (CG-Distill).
- Mixed models are trained on demonstrations containing all behavioral modes
- Mode-specific models are trained on single-mode filtered data
- CG-Distill edited models (coming soon) use classifier gradients to steer mixed models toward specific modes at zero inference cost
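The classifier-gradient idea behind the editing approach can be illustrated with a toy 1-D example. This is purely illustrative and is not the CG-Distill implementation: a "mixed policy" has two action modes (at a = -1 and a = +1), and adding the gradient of a mode classifier's log-probability to the policy's score steers sampling toward the chosen mode. All names, distributions, and constants here are hypothetical.

```python
import numpy as np

def mixed_score(a):
    # Score (d/da log p) of an equal mixture of N(-1, 0.1) and N(+1, 0.1),
    # standing in for a policy trained on both behavioral modes.
    w = np.array([0.5, 0.5]); mu = np.array([-1.0, 1.0]); var = 0.1
    p = w * np.exp(-(a - mu) ** 2 / (2 * var))
    return np.sum(p * (mu - a) / var) / np.sum(p)

def classifier_grad(a, target=1.0, sharpness=5.0):
    # Gradient of log sigmoid(sharpness * target * a): a toy mode
    # classifier that prefers actions with sign(a) == sign(target).
    z = sharpness * target * a
    return sharpness * target * (1.0 - 1.0 / (1.0 + np.exp(-z)))

def guided_ascent(a0, scale=1.0, steps=200, lr=0.05):
    # Gradient ascent on (policy log-density + scale * classifier log-prob):
    # the classifier term breaks the tie between the two modes.
    a = a0
    for _ in range(steps):
        a += lr * (mixed_score(a) + scale * classifier_grad(a))
    return a
```

Starting from a = 0, exactly between the two modes, the guided ascent lands near the a = +1 mode: the classifier gradient supplies the initial push, after which the policy's own score pulls the sample into that mode. The actual method edits a VLA policy rather than a 1-D density, but the steering mechanism is analogous.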