Video Depth Anything — Small

Mirror of depth-anything/Video-Depth-Anything-Small for use with ComfyUI-FFMPEGA.

What is Video Depth Anything?

Video Depth Anything is a state-of-the-art model for temporally consistent monocular depth estimation in video. It extends Depth Anything V2 with temporal modules that keep depth maps smooth and flicker-free from one frame to the next.

Key features:

  • Temporal consistency — smooth depth maps without frame-to-frame flickering
  • Multiple encoder sizes — Small (24.8M), Base, and Large variants
  • Apache 2.0 license — fully open source
  • Colormap output — supports multiple colormap visualizations (inferno, magma, plasma, etc.)
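To make "temporal consistency" concrete, here is a minimal sketch of a crude flicker proxy: the mean absolute difference between consecutive depth frames. It is not the metric used by the model's authors, and the function name is hypothetical. It is only meaningful when comparing two depth sequences of the same clip, because real scene motion also changes depth between frames.

```python
from typing import Sequence

# A depth frame is a 2D grid of per-pixel depth values (rows of floats).
Frame = Sequence[Sequence[float]]


def mean_abs_frame_diff(frames: Sequence[Frame]) -> float:
    """Average absolute per-pixel change between consecutive frames.

    Hypothetical helper for illustration: lower values mean less
    frame-to-frame change, which on the same clip suggests less flicker.
    """
    if len(frames) < 2:
        return 0.0
    total = 0.0
    count = 0
    # Pair each frame with the one that follows it.
    for prev, curr in zip(frames, frames[1:]):
        for row_prev, row_curr in zip(prev, curr):
            for a, b in zip(row_prev, row_curr):
                total += abs(a - b)
                count += 1
    return total / count if count else 0.0
```

Running this on the output of a per-frame depth model and on the output of Video Depth Anything for the same clip gives a rough, relative sense of how much flicker each produces.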

Files

model.safetensors
config.json
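The tensor shapes in model.safetensors can be inspected without loading any weights, because a safetensors file starts with an 8-byte little-endian header length followed by a JSON header. The sketch below uses only the standard library; the function names are hypothetical, and the resulting count covers every stored tensor (including any non-trainable buffers), so it may differ slightly from the quoted parameter count.

```python
import json
import struct
from math import prod


def read_safetensors_header(path):
    """Return the JSON header of a safetensors file as a dict."""
    with open(path, "rb") as f:
        # First 8 bytes: unsigned 64-bit little-endian header size.
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len).decode("utf-8"))


def count_elements(header):
    """Sum the element counts of all tensors listed in the header."""
    return sum(
        prod(entry["shape"])  # prod([]) == 1, which is right for scalars
        for name, entry in header.items()
        if name != "__metadata__"  # optional free-form metadata, not a tensor
    )
```

For example, `count_elements(read_safetensors_header("model.safetensors"))` returns the total number of stored values in the checkpoint.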

Usage

With ComfyUI-FFMPEGA (recommended)

  1. Set no_llm_mode to video_depth on the FFMPEG Agent node
  2. Select an encoder size (small, base, large) under Advanced Options
  3. Choose a colormap for visualization
  4. The model auto-downloads on first use

Manual download

huggingface-cli download AEmotionStudio/Video-Depth-Anything-Small --local-dir ./video_depth_anything
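The same download can be sketched from Python with `snapshot_download` from the `huggingface_hub` package, assuming that package is installed. The helper names below are hypothetical, and the expected file list simply mirrors the Files section of this card.

```python
from pathlib import Path

REPO_ID = "AEmotionStudio/Video-Depth-Anything-Small"
# Files listed in this model card; adjust if the repository changes.
EXPECTED_FILES = ("model.safetensors", "config.json")


def download_model(local_dir="./video_depth_anything"):
    """Download the repository snapshot into local_dir and return its path."""
    # Imported lazily so missing_files() works without huggingface_hub.
    from huggingface_hub import snapshot_download

    return Path(snapshot_download(repo_id=REPO_ID, local_dir=local_dir))


def missing_files(local_dir):
    """Return the expected files that are not present in local_dir."""
    root = Path(local_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

After `download_model()` finishes, `missing_files("./video_depth_anything")` should return an empty list if both files arrived.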

Available Sizes

Variant   Parameters   Size      Speed
Small     24.8M        ~102 MB   Fastest
Base      97.5M        ~390 MB   Balanced
Large     335.3M       ~670 MB   Best quality
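As a quick arithmetic check on the figures above, dividing file size by parameter count gives the approximate storage per parameter. The sketch assumes MB means 10^6 bytes and that file size is dominated by weights. The result (about 4 bytes for Small and Base, about 2 for Large) would be consistent with 32-bit and 16-bit storage respectively, but that reading is an inference from the numbers, not something this card states.

```python
# (parameters, approximate file size in bytes), copied from the table above.
VARIANTS = {
    "small": (24.8e6, 102e6),
    "base": (97.5e6, 390e6),
    "large": (335.3e6, 670e6),
}


def bytes_per_parameter(variant):
    """Approximate on-disk bytes per parameter for a listed variant."""
    params, size_bytes = VARIANTS[variant]
    return size_bytes / params


for name in VARIANTS:
    print(f"{name}: {bytes_per_parameter(name):.1f} bytes/param")
# small: 4.1 bytes/param
# base: 4.0 bytes/param
# large: 2.0 bytes/param
```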

License

Apache 2.0 — see the upstream repository for full license terms.

Credits
