NguyenDinhHieu committed
Commit 94ef14e · verified · 1 Parent(s): 6d766bc

EquiFashionModel

Files changed (1)
  1. README.md +0 -28
README.md CHANGED
@@ -24,8 +24,6 @@ Tran Minh Khuong, Nguyen Dinh Hieu [0009-0002-6683-8036], Ngo Dinh Hoang Minh, N
  **Institution:** FPT University, Hanoi, Vietnam
  📧 khuongtmhe180089@fpt.edu.vn, hieundhe180318@fpt.edu.vn, minhndhhe182227@fpt.edu.vn, bachndhe173222@fpt.edu.vn, hungpd2@fe.edu.vn
 
- ---
-
  ## 🧩 Overview
 
  **EquiFashion** is a hybrid *GAN–Diffusion* framework that reconciles the long-standing trade-off between **stylistic diversity** and **photorealistic fidelity** in generative fashion design.
@@ -34,15 +32,11 @@ It integrates a GAN-based ideation branch for creative exploration and a diffusi
  > 🎨 Try the live demo here:
  > 👉 [EquiFashion Demo on Hugging Face Spaces](https://huggingface.co/spaces/NguyenDinhHieu/EquiFashion)
 
- ---
-
  ## 🎯 Motivation
 
  Fashion design requires models that are simultaneously **creative**, **robust**, and **trustworthy**.
  While GANs generate diverse styles but lack stability, and Diffusion Models produce realism but constrain creativity, **EquiFashion** bridges both worlds—achieving controlled diversity, semantic alignment, and realistic garment rendering.
 
- ---
-
  ## 🧱 Architecture Overview
 
  | Component | Description |
@@ -53,8 +47,6 @@ While GANs generate diverse styles but lack stability, and Diffusion Models prod
  | **Semantic-Bundled Attention** | Couples adjective–noun pairs (e.g., “red collar”) for coherent attribute localization. |
  | **Pose-Guided Conditioning** | Aligns garments naturally to human body structure using OpenPose keypoints. |
 
- ---
-
  ## 🧮 Training Configuration
 
  | Setting | Value |
@@ -70,8 +62,6 @@ While GANs generate diverse styles but lack stability, and Diffusion Models prod
  | Timesteps (T) | 8 |
  | Fusion Decay (γ) | 0.7 |
 
- ---
-
  ## 🧠 Core Equation
 
  The total loss combines autoencoding, adversarial, semantic, and perceptual components:
@@ -80,8 +70,6 @@ The total loss combines autoencoding, adversarial, semantic, and perceptual comp
  L_{total} = λ_{AE}L_{AE} + λ_{cons}L_{cons} + λ_{bundle}L_{bundle} + λ_{comp}L_{comp} + λ_G(L_G + λ_{MS}L_{MS}) + λ_{den}L_{denoise} + λ_{rob}L_{rob} + λ_{perc}L_{perc}
  \]
 
- ---
-
  ## 📊 Quantitative Results
 
  | Metric | Value | Benchmark |
@@ -92,16 +80,12 @@ L_{total} = λ_{AE}L_{AE} + λ_{cons}L_{cons} + λ_{bundle}L_{bundle} + λ_{comp
  | Coverage ↑ | **92.8%** | – |
  | Inference Time | **3.8 s / sample (512×512, A100, FP16)** | – |
 
- ---
-
  ## 🖼️ Visual Results
 
  | Input Pose | Generated Outfit |
  |-------------|------------------|
  | ![](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusion_pose.png) | ![](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/fashion_diffusion.png) |
 
- ---
-
  ## 📦 Dataset: **EquiFashion-DB**
 
  | Property | Description |
@@ -113,8 +97,6 @@ L_{total} = λ_{AE}L_{AE} + λ_{cons}L_{cons} + λ_{bundle}L_{bundle} + λ_{comp
  | Key Feature | Noise-aware text, balanced demographics |
  | Purpose | Training + robust benchmarking for generative fashion |
 
- ---
-
  ## 🚀 Usage Example
 
  ```python
@@ -133,8 +115,6 @@ model.eval()
  prompt = "long-sleeve floral dress with tied waist, elegant, 8k detail"
  ```
 
- ---
-
  ## 💡 Citation
 
  If you use this model or dataset, please cite:
@@ -149,8 +129,6 @@ If you use this model or dataset, please cite:
  }
  ```
 
- ---
-
  ## 🧩 File Descriptions
 
  | File | Description |
@@ -161,8 +139,6 @@ If you use this model or dataset, please cite:
  | `app.py` | Gradio demo UI |
  | `utils/configs/cldm_v2.yaml` | Architecture configuration |
 
- ---
-
  ## 📚 References
 
  1. Zhu et al. *Be Your Own Prada* (ICCV 2017)
@@ -176,14 +152,10 @@ If you use this model or dataset, please cite:
  9. Baldrati et al. *Multimodal Garment Designer* (ICCV 2023)
  10. Rombach et al. *Latent Diffusion Models* (CVPR 2022)
 
- ---
-
  ## 🪪 License
  Released under the **MIT License**.
  You may use, modify, and distribute the model and dataset with attribution.
 
- ---
-
  ## 🧩 Acknowledgment
  Developed by **FPT University AI Research Group**, Hanoi, Vietnam
  as part of the **EquiAI Research Suite** on fairness, robustness, and trustworthy generative AI.
 
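
Note on the Core Equation shown in the diff: the total loss is a weighted sum of eight terms, with the multi-scale term `L_MS` nested inside the adversarial group scaled by `λ_G`. The following is a minimal sketch of that arithmetic only, not the repository's training code. The names `LossWeights` and `total_loss` are hypothetical, and the default weights of 1.0 are placeholders, since the README hunks visible here do not list the λ values.

```python
from dataclasses import dataclass


@dataclass
class LossWeights:
    """Hypothetical container for the lambda coefficients.

    The defaults are placeholders; the README diff does not state the real values.
    """
    ae: float = 1.0
    cons: float = 1.0
    bundle: float = 1.0
    comp: float = 1.0
    g: float = 1.0
    ms: float = 1.0
    den: float = 1.0
    rob: float = 1.0
    perc: float = 1.0


def total_loss(terms: dict, w: LossWeights) -> float:
    """Combine per-term losses following the README's L_total formula.

    `terms` maps the subscript of each loss in the equation
    ("AE", "cons", "bundle", "comp", "G", "MS", "denoise", "rob", "perc")
    to its scalar value.
    """
    # lambda_G scales the whole adversarial group, including the lambda_MS-weighted term.
    adversarial = w.g * (terms["G"] + w.ms * terms["MS"])
    return (
        w.ae * terms["AE"]
        + w.cons * terms["cons"]
        + w.bundle * terms["bundle"]
        + w.comp * terms["comp"]
        + adversarial
        + w.den * terms["denoise"]
        + w.rob * terms["rob"]
        + w.perc * terms["perc"]
    )
```

Because `λ_MS` sits inside the parentheses, changing `λ_G` rescales both `L_G` and `L_MS` together, while `λ_MS` only adjusts their relative balance.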