VidTwin: Video VAE with Decoupled Structure and Dynamics Paper • 2412.17726 • Published Dec 23, 2024 • 9
MineWorld: a Real-Time and Open-Source Interactive World Model on Minecraft Paper • 2504.08388 • Published Apr 11, 2025 • 42
HiTVideo: Hierarchical Tokenizers for Enhancing Text-to-Video Generation with Autoregressive Large Language Models Paper • 2503.11513 • Published Mar 14, 2025
Reinforcement Learning with Inverse Rewards for World Model Post-training Paper • 2509.23958 • Published Sep 28, 2025
Memory Forcing: Spatio-Temporal Memory for Consistent Scene Generation on Minecraft Paper • 2510.03198 • Published Oct 3, 2025
Geometry Forcing: Marrying Video Diffusion and 3D Representation for Consistent World Modeling Paper • 2507.07982 • Published Jul 10, 2025 • 34
Playing with Transformer at 30+ FPS via Next-Frame Diffusion Paper • 2506.01380 • Published Jun 2, 2025 • 2
Make Your Actor Talk: Generalizable and High-Fidelity Lip Sync with Motion and Appearance Disentanglement Paper • 2406.08096 • Published Jun 12, 2024
IGOR: Image-GOal Representations are the Atomic Control Units for Foundation Models in Embodied AI Paper • 2411.00785 • Published Oct 17, 2024 • 8
Memories are One-to-Many Mapping Alleviators in Talking Face Generation Paper • 2212.05005 • Published Dec 9, 2022
End-to-End Rate-Distortion Optimized 3D Gaussian Representation Paper • 2406.01597 • Published Apr 9, 2024
DAE-Talker: High Fidelity Speech-Driven Talking Face Generation with Diffusion Autoencoder Paper • 2303.17550 • Published Mar 30, 2023
Compositional 3D-aware Video Generation with LLM Director Paper • 2409.00558 • Published Aug 31, 2024 • 15
HiFace: High-Fidelity 3D Face Reconstruction by Learning Static and Dynamic Details Paper • 2303.11225 • Published Mar 20, 2023 • 1
UniEdit: A Unified Tuning-Free Framework for Video Motion and Appearance Editing Paper • 2402.13185 • Published Feb 20, 2024
InstructAvatar: Text-Guided Emotion and Motion Control for Avatar Generation Paper • 2405.15758 • Published May 24, 2024 • 1