scene4D
4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large
Language Models
Paper
• 2503.10437
• Published
• 34
Open-Sora 2.0: Training a Commercial-Level Video Generation Model in
$200k
Paper
• 2503.09642
• Published
• 20
VGGT: Visual Geometry Grounded Transformer
Paper
• 2503.11651
• Published
• 35
1000+ FPS 4D Gaussian Splatting for Dynamic Scene Rendering
Paper
• 2503.16422
• Published
• 14
SynCity: Training-Free Generation of 3D Worlds
Paper
• 2503.16420
• Published
• 27
M3: 3D-Spatial MultiModal Memory
Paper
• 2503.16413
• Published
• 15
MetaSpatial: Reinforcing 3D Spatial Reasoning in VLMs for the Metaverse
Paper
• 2503.18470
• Published
• 3
Any6D: Model-free 6D Pose Estimation of Novel Objects
Paper
• 2503.18673
• Published
• 3
FirePlace: Geometric Refinements of LLM Common Sense Reasoning for 3D
Object Placement
Paper
• 2503.04919
• Published
• 8
FlexWorld: Progressively Expanding 3D Scenes for Flexiable-View
Synthesis
Paper
• 2503.13265
• Published
• 15
Feature4X: Bridging Any Monocular Video to 4D Agentic AI with Versatile
Gaussian Feature Fields
Paper
• 2503.20776
• Published
• 10
Segment Any Motion in Videos
Paper
• 2503.22268
• Published
• 19
Free4D: Tuning-free 4D Scene Generation with Spatial-Temporal
Consistency
Paper
• 2503.20785
• Published
• 22
VideoScene: Distilling Video Diffusion Model to Generate 3D Scenes in
One Step
Paper
• 2504.01956
• Published
• 41
TAPIP3D: Tracking Any Point in Persistent 3D Geometry
Paper
• 2504.14717
• Published
• 8
Towards Understanding Camera Motions in Any Video
Paper
• 2504.15376
• Published
• 155
EmbodiedGen: Towards a Generative 3D World Engine for Embodied
Intelligence
Paper
• 2506.10600
• Published
• 8
StreamSplat: Towards Online Dynamic 3D Reconstruction from Uncalibrated
Video Streams
Paper
• 2506.08862
• Published
• 6
PlayerOne: Egocentric World Simulator
Paper
• 2506.09995
• Published
• 34
π^3: Scalable Permutation-Equivariant Visual Geometry Learning
Paper
• 2507.13347
• Published
• 66
SpatialTrackerV2: 3D Point Tracking Made Easy
Paper
• 2507.12462
• Published
• 19
PhysX: Physical-Grounded 3D Asset Generation
Paper
• 2507.12465
• Published
• 44
Streaming 4D Visual Geometry Transformer
Paper
• 2507.11539
• Published
• 15
Yume: An Interactive World Generation Model
Paper
• 2507.17744
• Published
• 91
Reconstructing 4D Spatial Intelligence: A Survey
Paper
• 2507.21045
• Published
• 38
HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D
Worlds from Words or Pixels
Paper
• 2507.21809
• Published
• 140
NeRF Is a Valuable Assistant for 3D Gaussian Splatting
Paper
• 2507.23374
• Published
• 12
DreamScene: 3D Gaussian-based End-to-end Text-to-3D Scene Generation
Paper
• 2507.13985
• Published
• 7
Matrix-3D: Omnidirectional Explorable 3D World Generation
Paper
• 2508.08086
• Published
• 76
UniEgoMotion: A Unified Model for Egocentric Motion Reconstruction,
Forecasting, and Generation
Paper
• 2508.01126
• Published
• 6
G-CUT3R: Guided 3D Reconstruction with Camera and Depth Prior
Integration
Paper
• 2508.11379
• Published
• 12
STream3R: Scalable Sequential 3D Reconstruction with Causal Transformer
Paper
• 2508.10893
• Published
• 31
MeshSplat: Generalizable Sparse-View Surface Reconstruction via Gaussian
Splatting
Paper
• 2508.17811
• Published
• 7
Pixie: Fast and Generalizable Supervised Learning of 3D Physics from
Pixels
Paper
• 2508.17437
• Published
• 37
DA^2: Depth Anything in Any Direction
Paper
• 2509.26618
• Published
• 26
TTT3R: 3D Reconstruction as Test-Time Training
Paper
• 2509.26645
• Published
• 15
Game-TARS: Pretrained Foundation Models for Scalable Generalist
Multimodal Game Agents
Paper
• 2510.23691
• Published
• 54