VideoMLA: Low-Rank Latent KV Cache for Minute-Scale Autoregressive Video Diffusion Paper • 2605.30351 • Published 7 days ago • 25
AdaState: Self-Evolving Anchors for Streaming Video Generation Paper • 2605.30349 • Published 7 days ago • 12
AdaState: Self-Evolving Anchors for Streaming Video Generation Paper • 2605.30349 • Published 7 days ago • 12
Infinity-RoPE: Action-Controllable Infinite Video Generation Emerges From Autoregressive Self-Rollout Paper • 2511.20649 • Published Nov 25, 2025 • 52
LayerComposer: Interactive Personalized T2I via Spatially-Aware Layered Canvas Paper • 2510.20820 • Published Oct 23, 2025 • 11
LayerComposer: Interactive Personalized T2I via Spatially-Aware Layered Canvas Paper • 2510.20820 • Published Oct 23, 2025 • 11
Canvas-to-Image: Compositional Image Generation with Multimodal Controls Paper • 2511.21691 • Published Nov 26, 2025 • 36
Canvas-to-Image: Compositional Image Generation with Multimodal Controls Paper • 2511.21691 • Published Nov 26, 2025 • 36
Canvas-to-Image: Compositional Image Generation with Multimodal Controls Paper • 2511.21691 • Published Nov 26, 2025 • 36 • 6
How Well Does GPT-4o Understand Vision? Evaluating Multimodal Foundation Models on Standard Computer Vision Tasks Paper • 2507.01955 • Published Jul 2, 2025 • 36
LoRAShop: Training-Free Multi-Concept Image Generation and Editing with Rectified Flow Transformers Paper • 2505.23758 • Published May 29, 2025 • 22
LoRAShop: Training-Free Multi-Concept Image Generation and Editing with Rectified Flow Transformers Paper • 2505.23758 • Published May 29, 2025 • 22
LoRAShop: Training-Free Multi-Concept Image Generation and Editing with Rectified Flow Transformers Paper • 2505.23758 • Published May 29, 2025 • 22 • 3
RoPECraft: Training-Free Motion Transfer with Trajectory-Guided RoPE Optimization on Diffusion Transformers Paper • 2505.13344 • Published May 19, 2025 • 7
Image-to-Image Translation with Disentangled Latent Vectors for Face Editing Paper • 2301.04628 • Published Jan 11, 2023