
Help with Character LoRA Training for Anima Preview2 – Poor Generalization, Only Frontal ID-Style Shot Matches Target Character

#119
by brendonlpbesler - opened

Hello everyone, I need help troubleshooting my character LoRA training for the Anima Preview2 model, using DiffPipe Forge as the training tool.
Dataset Details
I prepared 46 character images, all at 1024×1024 resolution, with the following breakdown:
3 images: left half-profile at 45°
3 images: right half-profile at 45°
3 images: standard frontal face upper-body shot
6 images: standard full-body sitting pose
16 images: face close-ups (covering 45°, 75°, 90° profile, and frontal angles)
3 images: slightly top-down frontal face shot
6 images: shots with subtle natural minor movements
3 images: full left profile
3 images: full right profile
In total, 16 core face-focused shots and 30 generalization shots.
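For anyone trying to reproduce this setup: the trainconfig.toml below points at a separate dataset.toml, which is where the image folder and bucketing are described in diffusion-pipe-style trainers. The sketch below is hypothetical — the key names (`resolutions`, `enable_ar_bucket`, `num_ar_buckets`, `[[directory]]`, `path`, `num_repeats`) follow diffusion-pipe conventions and are assumptions, not taken from the post, so check them against your tool's documentation:

```toml
# Hypothetical dataset.toml sketch (diffusion-pipe-style key names, unverified
# for DiffPipe Forge specifically -- confirm against your trainer's docs).
resolutions = [1024]        # all images are 1024x1024
enable_ar_bucket = true
num_ar_buckets = 3          # corresponds to "Bucket count: 3" in the post

[[directory]]
path = '/path/to/character_images'   # 46 images plus matching .txt captions
num_repeats = 1                      # corresponds to "Repeat count: 1"
```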
Full Training Parameters
Bucket count: 3
Repeat count: 1
Tag shuffle versions: 3
LLM Adapter learning rate: 0
Training epochs: 30
Batch size per GPU: 1
Warmup steps: 75
Image batch size (hybrid training): 1
Gradient accumulation steps: 2
Gradient clipping: 1
Learning rate scheduler: linear
Optimizer: AdamW
Learning rate: 2e-5 (0.00002)
Weight decay (regularization): 0.01
AdamW beta1: 0.9
AdamW beta2: 0.99
AdamW epsilon: 1e-8
LoRA rank: 32
Max training steps: 1500
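For reference, the schedule above can be sanity-checked with a little arithmetic. This sketch assumes that each tag-shuffle version counts as a separate training item and that `max_steps` counts optimizer steps (one per `gradient_accumulation_steps` micro-batches) — both assumptions, since the post doesn't say how DiffPipe Forge counts steps:

```python
# Back-of-envelope check of the training schedule above.
images = 46            # dataset size
shuffle_versions = 3   # "Tag shuffle versions: 3" (assumed to multiply items)
micro_batch = 1        # batch size per GPU
grad_accum = 2         # gradient accumulation steps
epochs = 30

items_per_epoch = images * shuffle_versions                 # 138 items
optim_steps_per_epoch = items_per_epoch // (micro_batch * grad_accum)  # 69
steps_for_all_epochs = epochs * optim_steps_per_epoch       # 2070

print(items_per_epoch, optim_steps_per_epoch, steps_for_all_epochs)
# Under these assumptions, max_steps = 1500 would cap training
# at roughly 1500 / 69 ~= 21 epochs, not the configured 30.
```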
The Problem
After training, the LoRA only produces an accurate likeness of the target character when generating a frontal, ID-style headshot. Generations at any other angle, pose, or composition fail to resemble the target character at all.
Could anyone help point out what's wrong with my parameter settings, and how I should adjust them to fix this generalization issue? Any advice is greatly appreciated!

This is my trainconfig.toml:

epochs = 30
micro_batch_size_per_gpu = 1
gradient_accumulation_steps = 2
warmup_steps = 75
output_dir = 'D:/ComfyUI-aki-v2/trainui/DiffPipeForge_v1.3.3_Full_Portable/output/20260405_22-37-44/mylora'
dataset = 'D:/ComfyUI-aki-v2/trainui/DiffPipeForge_v1.3.3_Full_Portable/output/20260405_22-37-44/dataset.toml'
save_dtype = 'bfloat16'
partition_method = 'parameters'
activation_checkpointing = true
pipeline_stages = 1
caching_batch_size = 1
save_every_n_epochs = 5
checkpoint_every_n_minutes = 120
gradient_clipping = 1.0
lr_scheduler = 'linear'
max_steps = 1500
map_num_proc = 6
steps_per_print = 10
save_every_n_steps = 250

[model]
type = 'anima'
dtype = 'bfloat16'
transformer_path = 'D:/ComfyUI-aki-v2/ComfyUI-aki-v2/ComfyUI/models/diffusion_models/anima-preview2.safetensors'
vae_path = 'D:/ComfyUI-aki-v2/ComfyUI-aki-v2/ComfyUI/models/vae/anima preview/qwen_image_vae.safetensors'
llm_path = 'D:/ComfyUI-aki-v2/ComfyUI-aki-v2/ComfyUI/models/text_encoders/qwen_3_06b_base.safetensors'
llm_adapter_lr = 0.0

[optimizer]
type = 'adamw_optimi'
lr = 2e-5
weight_decay = 0.01
eps = 1e-8
betas = [0.9, 0.99]

[adapter]
type = 'lora'
rank = 32
dtype = 'bfloat16'
dropout = 0.0

[monitoring]
enable_wandb = false

Up the learning rate. I use 0.000122 for 32 dim; however, that's with batch size 1 and 1 gradient accumulation step. Higher batch/accumulation should use a lower lr, but I have no idea how that scales. For your accumulation of 2, try 0.00005. If you start getting burned-looking images, that was too high.
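Since the right scaling is admittedly unknown, the practical move is to sweep a few candidates around the suggested value. This is a tiny hypothetical helper (not from either poster's workflow) that generates a geometric sweep of learning rates to try, one short run each:

```python
def lr_sweep(center: float, factor: float = 2.0, n: int = 2) -> list[float]:
    """Geometric sweep of candidate learning rates around a center value.

    Returns 2*n + 1 values: center / factor**n ... center ... center * factor**n.
    """
    return [center * factor**k for k in range(-n, n + 1)]

# Sweep around the suggested 5e-5; pick the highest lr that doesn't
# produce burned-looking samples.
candidates = lr_sweep(5e-5)
print(candidates)  # e.g. candidates from 1.25e-5 up to 2e-4
```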
Also, do you mind showing the dataset?

Also also, preview 3 is out.
