Exploring MLLM-Diffusion Information Transfer with MetaCanvas Paper • 2512.11464 • Published Dec 12, 2025 • 13
WorldLens: Full-Spectrum Evaluations of Driving World Models in Real World Paper • 2512.10958 • Published Dec 11, 2025
HiStream: Efficient High-Resolution Video Generation via Redundancy-Eliminated Streaming Paper • 2512.21338 • Published Dec 2025 • 22
Vision-Language-Action Models for Autonomous Driving: Past, Present, and Future Paper • 2512.16760 • Published Dec 18, 2025 • 14
VLSA: Vision-Language-Action Models with Plug-and-Play Safety Constraint Layer Paper • 2512.11891 • Published Dec 9, 2025 • 9
Multi-Modal Data-Efficient 3D Scene Understanding for Autonomous Driving Paper • 2405.05258 • Published May 8, 2024
Beyond One Shot, Beyond One Perspective: Cross-View and Long-Horizon Distillation for Better LiDAR Representations Paper • 2507.05260 • Published Jul 7, 2025
An Empirical Study of Training State-of-the-Art LiDAR Segmentation Models Paper • 2405.14870 • Published May 23, 2024
Veila: Panoramic LiDAR Generation from a Monocular RGB Image Paper • 2508.03690 • Published Aug 5, 2025
SuperFlow++: Enhanced Spatiotemporal Consistency for Cross-Modal Data Pretraining Paper • 2503.19912 • Published Mar 25, 2025
Point Transformer V3 Extreme: 1st Place Solution for 2024 Waymo Open Dataset Challenge in Semantic Segmentation Paper • 2407.15282 • Published Jul 21, 2024
SEE4D: Pose-Free 4D Generation via Auto-Regressive Video Inpainting Paper • 2510.26796 • Published Oct 30, 2025 • 1
RynnVLA-002: A Unified Vision-Language-Action and World Model Paper • 2511.17502 • Published Nov 21, 2025 • 26
Uni-MMMU: A Massive Multi-discipline Multimodal Unified Benchmark Paper • 2510.13759 • Published Oct 15, 2025 • 11
Simulating the Visual World with Artificial Intelligence: A Roadmap Paper • 2511.08585 • Published Nov 11, 2025 • 30
The Quest for Generalizable Motion Generation: Data, Model, and Evaluation Paper • 2510.26794 • Published Oct 30, 2025 • 27