GCStream committed 49076de (verified) · 1 parent: 4a4d373

docs: add model card with vLLM serve guide

Files changed (1): README.md (+147 −3)
---
base_model: Tongyi-MAI/Z-Image-Turbo
library_name: diffusers
tags:
- diffusers
- text-to-image
- anime
- art-style
- z-image
- fuliji
- lora-merged
license: apache-2.0
language:
- zh
- en
---

# Z-Image-Turbo × Fuliji — Merged Model

**Z-Image Turbo with the Fuliji artist LoRA baked in.** The LoRA weights have been permanently merged into the base transformer via `merge_and_unload()`, so no PEFT dependency is needed at inference time.

> **Want the standalone LoRA adapter instead?**
> Use [DownFlow/Z-Image-Turbo-Fuli-LoRA](https://huggingface.co/DownFlow/Z-Image-Turbo-Fuli-LoRA) to apply the adapter on top of any Z-Image-Turbo checkpoint.

---

## What This Is

This model is [Tongyi-MAI/Z-Image-Turbo](https://huggingface.co/Tongyi-MAI/Z-Image-Turbo) (an 8-step flow-matching image generation model) fine-tuned with a LoRA trained on art from 8 Chinese anime/illustration artists in the [DownFlow/fuliji](https://huggingface.co/datasets/DownFlow/fuliji) dataset.

Trigger an artist's style by prepending `by <artist>,` to your prompt.

---

## Quick Start (Python)

```bash
pip install diffusers transformers accelerate safetensors
```

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "DownFlow/Z-Image-Turbo-Fuli",
    torch_dtype=torch.bfloat16,
).to("cuda")

image = pipe(
    prompt="by 蠢沫沫, 1girl, solo, smile, soft lighting",
    num_inference_steps=8,
    guidance_scale=0.0,  # Z-Image Turbo uses CFG=0
    height=512,
    width=512,
).images[0]

image.save("output.png")
```
60
+
61
+ ---
62
+
63
+ ## Serving with vLLM
64
+
65
+ vLLM (≥ 0.8) can serve this model via an OpenAI-compatible `/v1/images/generations` endpoint.
66
+
67
+ ### 1 — Start the server
68
+
69
+ ```bash
70
+ pip install "vllm>=0.8.0"
71
+
72
+ vllm serve DownFlow/Z-Image-Turbo-Fuli \
73
+ --task generate \
74
+ --dtype bfloat16 \
75
+ --max-model-len 512 \
76
+ --port 8000
77
+ ```
78
+
79
+ ### 2 — Generate via curl
80
+
81
+ ```bash
82
+ curl http://localhost:8000/v1/images/generations \
83
+ -H "Content-Type: application/json" \
84
+ -d '{
85
+ "model": "DownFlow/Z-Image-Turbo-Fuli",
86
+ "prompt": "by 蠢沫沫, 1girl, smile, soft watercolour style",
87
+ "n": 1,
88
+ "size": "512x512"
89
+ }'
90
+ ```
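
The response follows the standard OpenAI Images API shape: a JSON object with a `data` list of image entries. A minimal sketch for pulling URLs out of that JSON (assuming the server returns `url` fields; `extract_image_urls` is a hypothetical helper, not part of any library):

```python
import json

def extract_image_urls(payload: str) -> list[str]:
    """Collect image URLs from an OpenAI-style /v1/images/generations response."""
    body = json.loads(payload)
    return [item["url"] for item in body.get("data", []) if "url" in item]

# Illustrative response body (values are placeholders):
sample = '{"created": 1700000000, "data": [{"url": "http://localhost:8000/images/abc.png"}]}'
print(extract_image_urls(sample))  # ['http://localhost:8000/images/abc.png']
```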

### 3 — Generate via OpenAI Python SDK

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

response = client.images.generate(
    model="DownFlow/Z-Image-Turbo-Fuli",
    prompt="by 年年, 1girl, white dress, cherry blossoms",
    n=1,
    size="512x512",
)
print(response.data[0].url)
```
107
+
108
+ ---
109
+
110
+ ## Artist Trigger Tokens
111
+
112
+ Prepend `by <artist>, ` at the start of your prompt.
113
+
114
+ | Token | Training images |
115
+ |---|---|
116
+ | `萌芽儿o0` | 30 |
117
+ | `年年` | 26 |
118
+ | `封疆疆v` | 26 |
119
+ | `焖焖碳` | 26 |
120
+ | `星之迟迟` | 25 |
121
+ | `蠢沫沫` | 23 |
122
+ | `雨波HaneAme` | 23 |
123
+ | `清水由乃` | 21 |
124
+
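
To compare the eight styles on one subject, you can sweep the same base prompt across every token. A small sketch (`sweep_prompts` is a hypothetical helper, and `pipe` refers to the pipeline from the Quick Start):

```python
ARTISTS = ["萌芽儿o0", "年年", "封疆疆v", "焖焖碳",
           "星之迟迟", "蠢沫沫", "雨波HaneAme", "清水由乃"]

def sweep_prompts(base: str) -> list[str]:
    """One prompt per artist, with the 'by <artist>,' trigger prepended."""
    return [f"by {artist}, {base}" for artist in ARTISTS]

# e.g. with the Quick Start pipeline:
# images = [pipe(prompt=p, num_inference_steps=8, guidance_scale=0.0).images[0]
#           for p in sweep_prompts("1girl, solo, garden background")]
```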

---

## Model Details

| Property | Value |
|---|---|
| Base model | `Tongyi-MAI/Z-Image-Turbo` |
| Fine-tuning method | LoRA (rank=32, alpha=32), merged into the weights |
| Target modules | `to_q`, `to_k`, `to_v`, `w1`, `w2`, `w3` |
| Training steps | 3,000 (EMA decay=0.9999) |
| Training resolution | 512 × 512 |
| Inference steps | 8 |
| CFG scale | 0.0 (CFG-free) |
| Precision | bfloat16 |
| Dataset | [DownFlow/fuliji](https://huggingface.co/datasets/DownFlow/fuliji) (8 artists, ~200 images) |

---

## Related

- [DownFlow/Z-Image-Turbo-Fuli-LoRA](https://huggingface.co/DownFlow/Z-Image-Turbo-Fuli-LoRA) — standalone LoRA adapter
- [DownFlow/fuliji](https://huggingface.co/datasets/DownFlow/fuliji) — training dataset
- [Tongyi-MAI/Z-Image-Turbo](https://huggingface.co/Tongyi-MAI/Z-Image-Turbo) — base model