# EquiFashionModel

Tran Minh Khuong, Nguyen Dinh Hieu [0009-0002-6683-8036], Ngo Dinh Hoang Minh, N
**Institution:** FPT University, Hanoi, Vietnam
📧 khuongtmhe180089@fpt.edu.vn, hieundhe180318@fpt.edu.vn, minhndhhe182227@fpt.edu.vn, bachndhe173222@fpt.edu.vn, hungpd2@fe.edu.vn

---

## 🧩 Overview

**EquiFashion** is a hybrid *GAN–Diffusion* framework that reconciles the long-standing trade-off between **stylistic diversity** and **photorealistic fidelity** in generative fashion design.
It integrates a GAN-based ideation branch for creative exploration and a diffusion-based refinement branch for photorealistic rendering.

> 🎨 Try the live demo here:
> 👉 [EquiFashion Demo on Hugging Face Spaces](https://huggingface.co/spaces/NguyenDinhHieu/EquiFashion)

---

## 🎯 Motivation

Fashion design requires models that are simultaneously **creative**, **robust**, and **trustworthy**.
GANs generate diverse styles but lack stability, while diffusion models render realistic garments but constrain creativity; **EquiFashion** bridges both worlds, achieving controlled diversity, semantic alignment, and realistic garment rendering.

---

## 🧱 Architecture Overview

| Component | Description |
|-----------|-------------|
| **Semantic-Bundled Attention** | Couples adjective–noun pairs (e.g., “red collar”) for coherent attribute localization. |
| **Pose-Guided Conditioning** | Aligns garments naturally to human body structure using OpenPose keypoints. |
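
The bundling idea can be illustrated with a toy sketch. Everything below (the `ADJECTIVES` set, the function names, and the adjective-followed-by-noun heuristic) is illustrative only; the actual model would apply this coupling to text-encoder tokens inside cross-attention:

```python
import numpy as np

# Toy vocabulary of attribute adjectives; a real system would use a POS tagger.
ADJECTIVES = {"red", "long", "floral", "tied", "elegant"}

def bundle_pairs(tokens):
    """Group each adjective with the noun that follows it (e.g. 'red collar')."""
    bundles, i = [], 0
    while i < len(tokens):
        if tokens[i] in ADJECTIVES and i + 1 < len(tokens):
            bundles.append((i, i + 1))  # (adjective index, noun index)
            i += 2
        else:
            i += 1
    return bundles

def bundle_mask(tokens):
    """Boolean mask coupling bundled token pairs for attention."""
    n = len(tokens)
    mask = np.eye(n, dtype=bool)          # every token attends to itself
    for a, b in bundle_pairs(tokens):
        mask[a, b] = mask[b, a] = True    # couple adjective and noun
    return mask

tokens = "red collar with long sleeves".split()
print(bundle_pairs(tokens))  # → [(0, 1), (3, 4)]
```

Such a mask could then boost (or restrict) cross-attention between the bundled tokens so an attribute and its garment part are localized together.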

---

## 🧮 Training Configuration

| Setting | Value |
|---------|-------|
| Timesteps (T) | 8 |
| Fusion Decay (γ) | 0.7 |
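
How the fusion decay is applied is not spelled out in this card; one plausible reading, sketched here as a purely hypothetical schedule, is that the GAN branch's contribution decays as γ^t across the T = 8 timesteps while the diffusion branch takes over:

```python
GAMMA, T = 0.7, 8  # values from the table above

def fusion_weights(gamma=GAMMA, steps=T):
    """Hypothetical schedule: the GAN branch's weight decays as gamma**t,
    handing control to the diffusion branch over the timesteps."""
    return [(gamma ** t, 1 - gamma ** t) for t in range(steps)]

for t, (w_gan, w_diff) in enumerate(fusion_weights()):
    print(f"t={t}: GAN {w_gan:.3f}, diffusion {w_diff:.3f}")
```

At t = 0 the GAN branch dominates (weight 1.0); by t = 7 its weight has decayed to roughly 0.08.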

---

## 🧠 Core Equation

The total loss combines autoencoding, adversarial, semantic, and perceptual components:

\[
L_{total} = \lambda_{AE} L_{AE} + \lambda_{cons} L_{cons} + \lambda_{bundle} L_{bundle} + \lambda_{comp} L_{comp} + \lambda_G (L_G + \lambda_{MS} L_{MS}) + \lambda_{den} L_{denoise} + \lambda_{rob} L_{rob} + \lambda_{perc} L_{perc}
\]
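
Transcribed directly into code, the weighted sum looks like this (the λ values below are placeholders, since the card does not list the actual weights):

```python
def total_loss(losses, lambdas):
    """Weighted sum of the loss terms in the equation above.
    `losses` and `lambdas` map term names to scalars; the multi-scale
    term L_MS nests inside the generator weight lambda_G, as in the equation."""
    return (lambdas["AE"]     * losses["AE"]
          + lambdas["cons"]   * losses["cons"]
          + lambdas["bundle"] * losses["bundle"]
          + lambdas["comp"]   * losses["comp"]
          + lambdas["G"]      * (losses["G"] + lambdas["MS"] * losses["MS"])
          + lambdas["den"]    * losses["denoise"]
          + lambdas["rob"]    * losses["rob"]
          + lambdas["perc"]   * losses["perc"])

# Illustrative values only; the actual lambda weights are not given in this card.
ones = {k: 1.0 for k in ["AE", "cons", "bundle", "comp", "G", "MS", "den", "rob", "perc"]}
losses = {k: 1.0 for k in ["AE", "cons", "bundle", "comp", "G", "MS", "denoise", "rob", "perc"]}
print(total_loss(losses, ones))  # → 9.0
```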

---

## 📊 Quantitative Results

| Metric | Value | Benchmark |
|--------|-------|-----------|
| Coverage ↑ | **92.8%** | – |
| Inference Time | **3.8 s / sample (512×512, A100, FP16)** | – |
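
"Coverage" is presumably the k-nearest-neighbour metric of Naeem et al. (2020): the fraction of real samples whose k-NN radius contains at least one generated sample. A sketch under that assumption:

```python
import numpy as np

def coverage(real, fake, k=5):
    """Fraction of real samples whose k-NN radius (measured among the
    real samples themselves) contains at least one generated sample."""
    d_real = np.linalg.norm(real[:, None] - real[None, :], axis=-1)
    radii = np.sort(d_real, axis=1)[:, k]   # column 0 is the self-distance
    d_rf = np.linalg.norm(real[:, None] - fake[None, :], axis=-1)
    return float((d_rf.min(axis=1) <= radii).mean())

rng = np.random.default_rng(0)
real = rng.normal(size=(64, 8))
print(coverage(real, real.copy()))  # perfect generator: coverage = 1.0
```

In practice the metric is computed on feature embeddings of the images rather than raw pixels.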

---

## 🖼️ Visual Results

| Input Pose | Generated Outfit |
|-------------|------------------|
| *(image omitted)* | *(image omitted)* |

---

## 📦 Dataset: **EquiFashion-DB**

| Property | Description |
|----------|-------------|
| Key Feature | Noise-aware text, balanced demographics |
| Purpose | Training + robust benchmarking for generative fashion |

---

## 🚀 Usage Example

```python
# (model construction and checkpoint loading omitted in this card)
model.eval()

prompt = "long-sleeve floral dress with tied waist, elegant, 8k detail"
```
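
Besides the text prompt, the model is conditioned on OpenPose keypoints (see Architecture Overview). This card does not show how they are encoded; a common approach, sketched here with assumed defaults (normalized coordinates, Gaussian blobs, hypothetical function name), rasterizes each keypoint into a heatmap channel:

```python
import numpy as np

def keypoints_to_heatmaps(keypoints, size=64, sigma=1.5):
    """Rasterize (x, y) keypoints in [0, 1] into per-joint Gaussian heatmaps.
    Returns an array of shape (num_keypoints, size, size)."""
    ys, xs = np.mgrid[0:size, 0:size]
    maps = np.zeros((len(keypoints), size, size), dtype=np.float32)
    for i, (x, y) in enumerate(keypoints):
        cx, cy = x * (size - 1), y * (size - 1)
        maps[i] = np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))
    return maps

# Two toy keypoints (e.g. neck and left shoulder in normalized coordinates).
heat = keypoints_to_heatmaps([(0.5, 0.2), (0.35, 0.3)])
print(heat.shape)  # → (2, 64, 64)
```

The exact resolution and channel layout the model expects would be determined by `utils/configs/cldm_v2.yaml`.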

---

## 💡 Citation

If you use this model or dataset, please cite:

```bibtex
}
```

---

## 🧩 File Descriptions

| File | Description |
|------|-------------|
| `app.py` | Gradio demo UI |
| `utils/configs/cldm_v2.yaml` | Architecture configuration |

---

## 📚 References

1. Zhu et al. *Be Your Own Prada* (ICCV 2017)
9. Baldrati et al. *Multimodal Garment Designer* (ICCV 2023)
10. Rombach et al. *Latent Diffusion Models* (CVPR 2022)

---

## 🪪 License

Released under the **MIT License**.
You may use, modify, and distribute the model and dataset with attribution.

---

## 🧩 Acknowledgment

Developed by **FPT University AI Research Group**, Hanoi, Vietnam,
as part of the **EquiAI Research Suite** on fairness, robustness, and trustworthy generative AI.