Video Depth Anything — Large
Mirror of depth-anything/Video-Depth-Anything-Large for use with ComfyUI-FFMPEGA.
What is Video Depth Anything?
Video Depth Anything is a state-of-the-art model for temporally consistent monocular depth estimation in videos. It extends Depth Anything V2 with temporal modules for smooth, flicker-free depth maps across video frames.
Key features:
- Temporal consistency — smooth depth maps without frame-to-frame flickering
- Multiple encoder sizes — Small (24.8M), Base (97.5M), and Large (335.3M) variants
- Apache 2.0 license — fully open source
- Colormap output — supports multiple colormap visualizations (inferno, magma, plasma, etc.)
Files
model.safetensors
config.json
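To check a downloaded `model.safetensors` without loading the weights, the file header can be read directly. The sketch below relies only on the published safetensors layout (an 8-byte little-endian header length followed by a JSON header). The helper names `read_safetensors_header` and `count_parameters` are illustrative and are not part of ComfyUI-FFMPEGA or any library.

```python
import json
import struct
from math import prod


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file as a dict."""
    with open(path, "rb") as f:
        # First 8 bytes: little-endian unsigned 64-bit length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len).decode("utf-8"))


def count_parameters(header):
    """Sum the element counts of every tensor listed in the header."""
    return sum(
        prod(entry["shape"])
        for name, entry in header.items()
        if name != "__metadata__"  # optional free-form metadata, not a tensor
    )
```

Assuming the manual download location used in this README, `count_parameters(read_safetensors_header("./video_depth_anything/model.safetensors"))` would give the total number of stored tensor elements, which can be compared against the parameter count listed for the Large variant.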
Usage
With ComfyUI-FFMPEGA (recommended)
- Set `no_llm_mode` to `video_depth` on the FFMPEG Agent node
- Select encoder size (`small`, `base`, `large`) under Advanced Options
- Choose colormap for visualization
- The model auto-downloads on first use
Manual download
huggingface-cli download AEmotionStudio/Video-Depth-Anything-Large --local-dir ./video_depth_anything
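The same download can be scripted from Python. This is a minimal sketch that assumes the `huggingface_hub` package is installed; `download_model` is a hypothetical wrapper name, while `snapshot_download` is the library function it calls.

```python
REPO_ID = "AEmotionStudio/Video-Depth-Anything-Large"
LOCAL_DIR = "./video_depth_anything"


def download_model(repo_id=REPO_ID, local_dir=LOCAL_DIR):
    """Download the full repository snapshot and return its local path."""
    # Imported lazily so this module still loads if huggingface_hub is missing.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=local_dir)
```

Calling `download_model()` should place `model.safetensors` and `config.json` under `./video_depth_anything`, mirroring the CLI command.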
Available Sizes
| Variant | Parameters | Size | Speed |
|---|---|---|---|
| Small | 24.8M | ~102 MB | Fastest |
| Base | 97.5M | ~390 MB | Balanced |
| Large | 335.3M | ~670 MB | Best quality |
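As a rough sanity check on the table, checkpoint size can be estimated as parameter count times bytes per parameter. The sketch below assumes weights-only files stored at 4 bytes (fp32) or 2 bytes (fp16) per parameter; the stored precision of each variant is an assumption here, not something this README states.

```python
# Bytes per parameter for two common storage precisions (assumed, not confirmed).
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}

# Parameter counts in millions, taken from the table above.
VARIANTS = {"Small": 24.8, "Base": 97.5, "Large": 335.3}


def estimated_size_mb(params_millions, dtype):
    """Weights-only size in MB (10^6 bytes): parameters x bytes per parameter."""
    return params_millions * BYTES_PER_PARAM[dtype]


for name, params in VARIANTS.items():
    fp32 = estimated_size_mb(params, "fp32")
    fp16 = estimated_size_mb(params, "fp16")
    print(f"{name}: fp32 ~{fp32:.0f} MB, fp16 ~{fp16:.0f} MB")
```

Under these assumptions, the listed Small and Base sizes sit close to the fp32 estimates, while the listed Large size sits close to the fp16 estimate. Treat that as an observation from the arithmetic only; actual files also carry header and metadata overhead.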
License
Apache 2.0 — see the upstream repository for full license terms.
Credits
- Original model by: Depth Anything team
- Paper: "Video Depth Anything: Consistent Depth Estimation for Super-Long Videos"
- Upstream HuggingFace: depth-anything/Video-Depth-Anything-Large
- Redistributed by: Æmotion Studio for use with ComfyUI-FFMPEGA