Squeezing Capacity from Multimodal Large Language Models for Subject-driven Generation Paper • 2605.26111 • Published 26 days ago • 12
UniFusion: Vision-Language Model as Unified Encoder in Image Generation Paper • 2510.12789 • Published Oct 14, 2025 • 19