CoF-T2I: Video Models as Pure Visual Reasoners for Text-to-Image Generation Paper • 2601.10061 • Published 11 days ago • 30
LLaDA2.0: Scaling Up Diffusion Language Models to 100B Paper • 2512.15745 • Published Dec 10, 2025 • 80
StereoPilot: Learning Unified and Efficient Stereo Conversion via Generative Priors Paper • 2512.16915 • Published Dec 18, 2025 • 38
Alchemist: Unlocking Efficiency in Text-to-Image Model Training via Meta-Gradient Data Selection Paper • 2512.16905 • Published Dec 18, 2025 • 32
MemFlow: Flowing Adaptive Memory for Consistent and Efficient Long Video Narratives Paper • 2512.14699 • Published Dec 16, 2025 • 28
UnityVideo: Unified Multi-Modal Multi-Task Learning for Enhancing World-Aware Video Generation Paper • 2512.07831 • Published Dec 8, 2025 • 17
Video-as-Answer: Predict and Generate Next Video Event with Joint-GRPO Paper • 2511.16669 • Published Nov 20, 2025 • 32
Video-as-Answer: Predict and Generate Next Video Event with Joint-GRPO Paper • 2511.16669 • Published Nov 20, 2025 • 32
Video-as-Answer: Predict and Generate Next Video Event with Joint-GRPO Paper • 2511.16669 • Published Nov 20, 2025 • 32 • 3
PhysMaster: Mastering Physical Representation for Video Generation via Reinforcement Learning Paper • 2510.13809 • Published Oct 15, 2025 • 38
VideoCanvas: Unified Video Completion from Arbitrary Spatiotemporal Patches via In-Context Conditioning Paper • 2510.08555 • Published Oct 9, 2025 • 63