LTX 2.3 I2V-T2V Basic ID-Lora Workflow with reference audio By RuneXX

#70

by shep211 - opened 20 days ago

This is not working for me. What am I doing wrong? I upload a WAV file, but the output does not use that voice. I am using the LTX-2.3_-_I2V_T2V_Basic_ID-Lora_reference_audio.json workflow.

shep211

20 days ago

The WAV I used is from this video:

https://packaged-media.redd.it/5d433ftqndrg1/pb/m2-res_720p.mp4?m=DASHPlaylist.mpd&c=wh_ben_en&var=sgpssan&v=1&e=1774738800&s=6af54c37564ea683f31aa7af514956d982c67339

shep211

19 days ago

I'm running Windows, and I think that has something to do with it. Is anyone else running Windows who has this working?

RuneXX

Owner 19 days ago

I'm on Windows too ;-) I'll check whether anything has changed or broken.

shep211

19 days ago

Are you running portable or desktop?

shep211

19 days ago

I got it working. I had to replace all the files, not just nodes_lt.py.

https://github.com/Comfy-Org/ComfyUI/issues/13194

RuneXX

Owner 19 days ago

Ah, so I guess you are on the desktop version of ComfyUI.
Updates roll out a bit more slowly on the desktop version, but the ID-LoRA update will come eventually ;-)

malfy

16 days ago

Hey RuneXX, do you know, is this ID-LoRA for audio only or is it both audio and visual ID?

M-Mxm

15 days ago

Great workflows, RuneXX, thanks a lot. Which of your workflows do you actually find better for this purpose? LTX-2.3_-_I2V_T2V_Basic_ID-Lora_reference_audio.json, LTX-2.3-I2V_T2V_Talking_Avatar(voice_clone_with_Fish-Audio-Pro).json or LTX-2.3_-I2V_T2V_Talking_Avatar(voice_clone_with_Qwen-TTS).json?

I have tried the ID-LoRA and Fish Audio workflows. I find that ID-LoRA gives better cohesion and more natural facial movement. Fish Audio has slightly better voice quality and sounds less robotic, but its facial movement is significantly worse.

By the way, do you know a way to improve mouth and tooth quality?

RuneXX

Owner 15 days ago

> Hey RuneXX, do you know, is this ID-LoRA for audio only or is it both audio and visual ID?

It's an audio-only LoRA, meant to keep a voice consistent across multiple videos.

RuneXX

Owner 15 days ago

edited 15 days ago

> By the way, do you know a way to improve mouth and tooth quality?

You could try more steps or a higher resolution.
And yes, the ID-LoRA is trained on expressions (face and voice), so it might well be more expressive.
But you could probably make up for that with Fish Audio and the prompt, using prompts like "she talks with an expressive face and gesticulates as she talks with an emotional tone..." or something else that suits your input ;-) LTX loves a good prompt that describes how to act and the sequence of the acting.
