Aurora: Model Weights

Pretrained weights for "Aurora: Unified Video Editing with a Tool-Using Agent" (arXiv:2605.18748). Code: github.com/yeates/Aurora · Project page: yeates.github.io/Aurora-Page

This repository bundles the two trained Aurora components, laid out to drop straight into the code repository's models/ directory:

huggingface-cli download yeates/aurora-weights --local-dir models

Contents

| Path | Component | Notes |
| --- | --- | --- |
| aurora_editor.safetensors | Video editor | trained dit + mllm.context_projector + ref_vae_condition (~9.4 GB, bf16) |
| aurora_agent_vlm/ | Agent planner adapter | PEFT LoRA (r=32, alpha=64) on Qwen/Qwen3-VL-8B-Instruct |

Editor: aurora_editor.safetensors

A partial checkpoint containing only the trained Aurora modules:

  • dit.*: the WAN2.2-TI2V-5B diffusion transformer (fine-tuned)
  • mllm.context_projector.*: projects frozen Qwen3.5-4B hidden states into DiT width
  • ref_vae_condition.*: multi-reference conditioning with per-reference index embedding

The checkpoint cannot be used standalone: it is loaded on top of the frozen backbones (WAN2.2-TI2V-5B, WAN2.2 VAE, and Qwen3.5-4B). One checkpoint covers source-conditioned (s2v), video-to-video (v2v), and reference-conditioned (sv2v) editing.
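To confirm that a downloaded file really contains only the three module groups listed above, the tensor names can be read from the safetensors header without loading any weights. The following is a minimal sketch using only the standard library; the helper names `read_tensor_names` and `count_by_prefix` are hypothetical and not part of the Aurora code repository.

```python
import json
import struct

# Module prefixes named in this README.
EXPECTED_PREFIXES = ("dit.", "mllm.context_projector.", "ref_vae_condition.")


def read_tensor_names(path):
    """Return tensor names from a .safetensors header without loading weights."""
    with open(path, "rb") as f:
        # The file starts with an unsigned 64-bit little-endian header length,
        # followed by that many bytes of UTF-8 JSON describing each tensor.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional free-form entry, not a tensor.
    return [name for name in header if name != "__metadata__"]


def count_by_prefix(names, prefixes=EXPECTED_PREFIXES):
    """Count tensors under each expected prefix; everything else is 'other'."""
    counts = {prefix: 0 for prefix in prefixes}
    counts["other"] = 0
    for name in names:
        match = next((p for p in prefixes if name.startswith(p)), "other")
        counts[match] += 1
    return counts
```

Calling `count_by_prefix(read_tensor_names("models/aurora_editor.safetensors"))` should report a nonzero count for each of the three prefixes. A nonzero "other" count would suggest the file holds more than the partial checkpoint described here.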

Agent adapter: aurora_agent_vlm/

PEFT LoRA adapter for the tool-using planner. It targets the base model Qwen/Qwen3-VL-8B-Instruct with r=32 and lora_alpha=64, applied to the attention and MLP projections. adapter_config.json records base_model_name_or_path = Qwen/Qwen3-VL-8B-Instruct.
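A quick way to verify that a downloaded adapter matches these values is to read adapter_config.json directly. The sketch below assumes the standard PEFT field names (`r`, `lora_alpha`, `base_model_name_or_path`); the helper `check_adapter_config` is hypothetical. It also reports the LoRA scaling factor lora_alpha / r, which is 2.0 for the values above under plain (non rank-stabilized) LoRA.

```python
import json
from pathlib import Path

# Values stated in this README for the agent adapter.
EXPECTED = {
    "base_model_name_or_path": "Qwen/Qwen3-VL-8B-Instruct",
    "r": 32,
    "lora_alpha": 64,
}


def check_adapter_config(adapter_dir, expected=EXPECTED):
    """Compare adapter_config.json against the expected values.

    Returns (mismatches, scaling). mismatches maps each differing key to a
    (found, expected) pair; scaling is lora_alpha / r, or None if either
    field is missing.
    """
    config_path = Path(adapter_dir) / "adapter_config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))
    mismatches = {
        key: (config.get(key), want)
        for key, want in expected.items()
        if config.get(key) != want
    }
    # Plain LoRA scales the low-rank update by lora_alpha / r. This assumes
    # the adapter does not use rank-stabilized scaling.
    r, alpha = config.get("r"), config.get("lora_alpha")
    scaling = alpha / r if r and alpha is not None else None
    return mismatches, scaling
```

For example, `check_adapter_config("models/aurora_agent_vlm")` should return an empty mismatch dictionary if the adapter matches the description in this section.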

Usage

After downloading into models/ and installing the code repository:

# Editor
from evaluation.pipeline_loader import load_v2_pipeline
pipe = load_v2_pipeline("models/aurora_editor.safetensors", device="cuda:0", ref_max_items=8)

# Agent planner (LoRA merged at load)
import aurora.agent
agent = aurora.agent.AgentVLM("models/Qwen3-VL-8B-Instruct", "models/aurora_agent_vlm", device="cuda:0")

You also need the frozen backbones under models/ (WAN2.2-TI2V-5B, Wan2.2_VAE.pth, Qwen3.5-4B, Qwen3-VL-8B-Instruct); see the code repository's Model Zoo. The full inference recipe (3-pass CFG defaults, per-benchmark commands) is in the repository README.
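Because the loaders depend on several separately downloaded pieces, a preflight check can catch a missing backbone before any GPU memory is allocated. This is a minimal sketch; the helper `missing_entries` is hypothetical, and it assumes every entry sits directly under models/ with the names used in this README. The Model Zoo in the code repository is the authority on the actual layout.

```python
from pathlib import Path

# Entries named in this README. Placing each one directly under models/ is an
# assumption; adjust the names if the Model Zoo specifies a different layout.
REQUIRED = (
    "aurora_editor.safetensors",
    "aurora_agent_vlm",
    "WAN2.2-TI2V-5B",
    "Wan2.2_VAE.pth",
    "Qwen3.5-4B",
    "Qwen3-VL-8B-Instruct",
)


def missing_entries(models_dir="models", required=REQUIRED):
    """Return the required files or directories that are absent from models_dir."""
    root = Path(models_dir)
    return [name for name in required if not (root / name).exists()]
```

An empty list from `missing_entries("models")` means all six entries are present; otherwise the returned names indicate what still needs to be downloaded.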

License

MIT (Aurora weights). The WAN2.2-TI2V-5B / WAN2.2 VAE / Qwen3.5-4B / Qwen3-VL-8B-Instruct backbones carry their own respective licenses.

Citation

@article{yu2026aurora,
  title={Aurora: Unified Video Editing with a Tool-Using Agent},
  author={Yu, Yongsheng and Zeng, Ziyun and Xiao, Zhiyuan and Zhou, Zhenghong and Hua, Hang and Xiong, Wei and Luo, Jiebo},
  journal={arXiv preprint arXiv:2605.18748},
  year={2026}
}