scene4D - a PandaQQ Collection

updated Oct 30, 2025

4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models

Paper • 2503.10437 • Published Mar 13, 2025 • 34
Open-Sora 2.0: Training a Commercial-Level Video Generation Model in $200k

Paper • 2503.09642 • Published Mar 12, 2025 • 20
VGGT: Visual Geometry Grounded Transformer

Paper • 2503.11651 • Published Mar 14, 2025 • 40
1000+ FPS 4D Gaussian Splatting for Dynamic Scene Rendering

Paper • 2503.16422 • Published Mar 20, 2025 • 16
SynCity: Training-Free Generation of 3D Worlds

Paper • 2503.16420 • Published Mar 20, 2025 • 27
M3: 3D-Spatial MultiModal Memory

Paper • 2503.16413 • Published Mar 20, 2025 • 15
MetaSpatial: Reinforcing 3D Spatial Reasoning in VLMs for the Metaverse

Paper • 2503.18470 • Published Mar 24, 2025 • 4
Any6D: Model-free 6D Pose Estimation of Novel Objects

Paper • 2503.18673 • Published Mar 24, 2025 • 3
FirePlace: Geometric Refinements of LLM Common Sense Reasoning for 3D Object Placement

Paper • 2503.04919 • Published Mar 6, 2025 • 8
FlexWorld: Progressively Expanding 3D Scenes for Flexiable-View Synthesis

Paper • 2503.13265 • Published Mar 17, 2025 • 15
Feature4X: Bridging Any Monocular Video to 4D Agentic AI with Versatile Gaussian Feature Fields

Paper • 2503.20776 • Published Mar 26, 2025 • 10
Segment Any Motion in Videos

Paper • 2503.22268 • Published Mar 28, 2025 • 19
Free4D: Tuning-free 4D Scene Generation with Spatial-Temporal Consistency

Paper • 2503.20785 • Published Mar 26, 2025 • 22
VideoScene: Distilling Video Diffusion Model to Generate 3D Scenes in One Step

Paper • 2504.01956 • Published Apr 2, 2025 • 41
TAPIP3D: Tracking Any Point in Persistent 3D Geometry

Paper • 2504.14717 • Published Apr 20, 2025 • 8
Towards Understanding Camera Motions in Any Video

Paper • 2504.15376 • Published Apr 21, 2025 • 157
EmbodiedGen: Towards a Generative 3D World Engine for Embodied Intelligence

Paper • 2506.10600 • Published Jun 12, 2025 • 8
StreamSplat: Towards Online Dynamic 3D Reconstruction from Uncalibrated Video Streams

Paper • 2506.08862 • Published Jun 10, 2025 • 6
PlayerOne: Egocentric World Simulator

Paper • 2506.09995 • Published Jun 11, 2025 • 34
π^3: Scalable Permutation-Equivariant Visual Geometry Learning

Paper • 2507.13347 • Published Jul 17, 2025 • 67
SpatialTrackerV2: 3D Point Tracking Made Easy

Paper • 2507.12462 • Published Jul 16, 2025 • 19
PhysX: Physical-Grounded 3D Asset Generation

Paper • 2507.12465 • Published Jul 16, 2025 • 45
Streaming 4D Visual Geometry Transformer

Paper • 2507.11539 • Published Jul 15, 2025 • 15
Yume: An Interactive World Generation Model

Paper • 2507.17744 • Published Jul 23, 2025 • 92
Reconstructing 4D Spatial Intelligence: A Survey

Paper • 2507.21045 • Published Jul 28, 2025 • 38
HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels

Paper • 2507.21809 • Published Jul 29, 2025 • 142
NeRF Is a Valuable Assistant for 3D Gaussian Splatting

Paper • 2507.23374 • Published Jul 31, 2025 • 12
DreamScene: 3D Gaussian-based End-to-end Text-to-3D Scene Generation

Paper • 2507.13985 • Published Jul 18, 2025 • 7
Matrix-3D: Omnidirectional Explorable 3D World Generation

Paper • 2508.08086 • Published Aug 11, 2025 • 76
UniEgoMotion: A Unified Model for Egocentric Motion Reconstruction, Forecasting, and Generation

Paper • 2508.01126 • Published Aug 2, 2025 • 6
G-CUT3R: Guided 3D Reconstruction with Camera and Depth Prior Integration

Paper • 2508.11379 • Published Aug 15, 2025 • 12
STream3R: Scalable Sequential 3D Reconstruction with Causal Transformer

Paper • 2508.10893 • Published Aug 14, 2025 • 31
MeshSplat: Generalizable Sparse-View Surface Reconstruction via Gaussian Splatting

Paper • 2508.17811 • Published Aug 25, 2025 • 7
Pixie: Fast and Generalizable Supervised Learning of 3D Physics from Pixels

Paper • 2508.17437 • Published Aug 20, 2025 • 37
DA^2: Depth Anything in Any Direction

Paper • 2509.26618 • Published Sep 30, 2025 • 26
TTT3R: 3D Reconstruction as Test-Time Training

Paper • 2509.26645 • Published Sep 30, 2025 • 15
Game-TARS: Pretrained Foundation Models for Scalable Generalist Multimodal Game Agents

Paper • 2510.23691 • Published Oct 27, 2025 • 57