Regarding those "bad outputs, worse quality issues": please check your pipelines. Here is an overview of dos and don'ts

#85

by Sikaworld1990 - opened 3 days ago

Sikaworld1990

https://www.reddit.com/r/StableDiffusion/comments/1ro8qdi/ltx_23_official_workflows_and_pipelines_comparison/#:~:text=0.0%20(Distilled%20weights%20baked%20in).%20Stage%201%20Guidance%2C%20MultiModalGuider%20(nodes%20from%20ComfyUI%2DLTXVideo

RuneXX

Owner 3 days ago

edited 3 days ago

Those are mostly for the DEV workflows ;-)
Only one workflow here uses the DEV model.

And the dev workflow is definitely open to experimental tweaks: the multimodal CFG nodes (independent CFG for audio and video) and trying out different samplers.
Personally I think res_2s gives a slightly "overbaked" look, but on the flip side it works with fewer steps (so for the entry in the chart above with only 15 steps on the dev model, res_2s would be the choice for sure).
Alternatively, do more steps, 20-30 or more, and use euler or similar. Speed-wise they might end up taking about the same time, since res_2s can be a bit slow.
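
A rough back-of-the-envelope sketch of why the two options can end up costing about the same. It assumes res_2s does two model evaluations per step (the "2s" read as two stages) and euler does one, and that CFG doubles the evaluations per step; the numbers in `EVALS_PER_STEP` and the helper `model_evals` are illustrative assumptions, not measured values.

```python
# Assumed cost model (not measured): model evaluations per sampler step.
EVALS_PER_STEP = {"euler": 1, "res_2s": 2}


def model_evals(sampler: str, steps: int, uses_cfg: bool = True) -> int:
    """Estimate total model evaluations for one sampling run.

    Assumes CFG needs a conditional and an unconditional pass per
    evaluation, i.e. it doubles the cost.
    """
    cfg_factor = 2 if uses_cfg else 1
    return steps * EVALS_PER_STEP[sampler] * cfg_factor


res_2s_cost = model_evals("res_2s", steps=15)
euler_cost = model_evals("euler", steps=30)
print(f"res_2s @ 15 steps: {res_2s_cost} evals")
print(f"euler  @ 30 steps: {euler_cost} evals")
```

Under these assumptions 15 steps of res_2s and 30 steps of euler come out at the same number of model evaluations, so the choice is more about the look than about speed.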

But yes, the dev workflow is a bit less set in stone than the distilled model, which is "click and go" and gives decent results ;-)

RuneXX

Owner 3 days ago

edited 3 days ago

One interesting part of the chart above is perhaps the "balanced" middle one, with the DEV model, no distilled LoRA at stage 1, and very low CFG for audio.
Haven't tried that combo before. An audio CFG of only 1 looks a bit strange (usually it's 7.0 with the dev workflow), but maybe it's all OK ;-)
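
For context on why an audio CFG of 1 stands out, here is a minimal sketch of the standard classifier-free guidance formula applied to a single number. It is only the textbook formula; how the multimodal guider nodes combine audio and video guidance internally is not shown here, and `cfg_combine` and the sample values are made up for illustration.

```python
def cfg_combine(uncond: float, cond: float, scale: float) -> float:
    """Standard classifier-free guidance: push the prediction away from
    the unconditional output toward the conditional one."""
    return uncond + scale * (cond - uncond)


# Made-up scalar stand-ins for the model's two predictions.
uncond, cond = 0.25, 0.5

audio_cfg_1 = cfg_combine(uncond, cond, scale=1.0)  # equals cond: no extra push
audio_cfg_7 = cfg_combine(uncond, cond, scale=7.0)  # strongly amplified

print(audio_cfg_1, audio_cfg_7)
```

With a scale of 1.0 the formula just returns the conditional prediction, so guidance is effectively switched off for that modality, while 7.0 amplifies the difference a lot.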
