
πŸ›‘οΈ Red Team Framework: Image Protection Against AI Manipulation

A comprehensive red team pipeline for evaluating adversarial perturbation-based image protection methods against deepfake generation and AI manipulation.

Overview

This framework implements and benchmarks three state-of-the-art image protection methods:

| Method | Paper | Venue | Focus |
|---|---|---|---|
| FaceShield | arXiv:2412.09921 | ICCV 2025 | Face protection against DM- and GAN-based deepfakes |
| DiffusionGuard | arXiv:2410.05694 | ICLR 2025 | Protection against diffusion-based inpainting |
| VGMShield | arXiv:2402.13126 | arXiv 2024 | Video generative model misuse prevention |

Architecture

```
┌──────────────────────────────────────────────────────────┐
│                   RED TEAM PIPELINE                      │
│                                                          │
│  ┌──────────┐   ┌──────────────┐   ┌────────────────┐    │
│  │  INPUT   │──▶│  PROTECTION  │──▶│  DEEPFAKE      │    │
│  │  Media   │   │  Module      │   │  Generation    │    │
│  └──────────┘   │  (Paper X)   │   │  (Attack)      │    │
│                 └──────────────┘   └───────┬────────┘    │
│                                            │             │
│                 ┌──────────────┐   ┌───────▼────────┐    │
│                 │  REPORTING   │◀──│  EVALUATION    │    │
│                 │  & Analysis  │   │  & Metrics     │    │
│                 └──────────────┘   └────────────────┘    │
└──────────────────────────────────────────────────────────┘
```

Pipeline Workflow

  1. Input: Original face/image to protect
  2. Protection: Apply protection method (FaceShield / DiffusionGuard / VGMShield)
  3. Attack: Attempt deepfake generation on protected image
  4. Evaluation: Measure protection success via multiple metrics
  5. Report: Compare methods across dimensions
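The five stages above compose as a nested loop over images, protection methods, and attacks. A minimal, self-contained sketch of that structure (the `protect`, `run_attack`, and `evaluate` stubs below are illustrative placeholders, not this framework's actual API):

```python
# Sketch of the red-team workflow: input -> protect -> attack -> evaluate.
# The three stubs stand in for the real protection/attack/metric modules.

METHODS = ["faceshield", "diffusionguard", "vgmshield"]
ATTACKS = ["face_swap", "inpainting"]

def protect(image, method):
    return f"{image}+{method}"          # stand-in for a perturbed image

def run_attack(protected, attack):
    return f"{protected}->{attack}"     # stand-in for a deepfake output

def evaluate(original, protected, fake):
    return {"lpips": 0.0, "psnr": 0.0}  # stand-in metric dictionary

def run_pipeline(images):
    """Run every (method, attack) pair on every input image."""
    results = {}
    for image in images:                              # 1. Input
        for method in METHODS:
            protected = protect(image, method)        # 2. Protection
            for attack in ATTACKS:
                fake = run_attack(protected, attack)  # 3. Attack
                results[(image, method, attack)] = (
                    evaluate(image, protected, fake)  # 4. Evaluation
                )
    return results                                    # 5. Report: compare entries
```

Each result key identifies one (image, method, attack) cell, which is what the reporting stage aggregates when comparing methods across dimensions.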

Quick Start

Using Docker Compose (Recommended)

```bash
# Build all services
docker-compose build

# Run the full red team pipeline
docker-compose run pipeline python run_pipeline.py \
    --input_dir ./data/test_images \
    --methods faceshield diffusionguard vgmshield \
    --attacks face_swap inpainting \
    --output_dir ./results

# Run individual protection modules
docker-compose run faceshield python protect.py --image input.png
docker-compose run diffusionguard python protect.py --image input.png --mask mask.png
docker-compose run vgmshield python protect.py --image input.png
```

Using Individual Docker Images

```bash
# FaceShield
cd modules/faceshield
docker build -t redteam-faceshield .
docker run --gpus all -v $(pwd)/data:/data redteam-faceshield \
    python protect.py --image /data/input.png --output /data/protected.png

# DiffusionGuard
cd modules/diffusionguard
docker build -t redteam-diffusionguard .
docker run --gpus all -v $(pwd)/data:/data redteam-diffusionguard \
    python protect.py --image /data/input.png --mask /data/mask.png --output /data/protected.png

# VGMShield
cd modules/vgmshield
docker build -t redteam-vgmshield .
docker run --gpus all -v $(pwd)/data:/data redteam-vgmshield \
    python protect.py --image /data/input.png --output /data/protected.png
```

Evaluation Metrics

| Metric | Description | Used By |
|---|---|---|
| ISM (Identity Score Matching) | Face identity similarity between source & deepfake output | FaceShield |
| LPIPS | Perceptual distance between images | All |
| PSNR | Peak signal-to-noise ratio | All |
| SSIM | Structural similarity index | FaceShield |
| CLIP Dir. Sim. | Alignment between edit direction and text | DiffusionGuard |
| ImageReward | Human-aligned quality assessment | DiffusionGuard |
| FID | Fréchet Inception Distance | All |
| L2 Distance | Pixel-level difference | All |
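PSNR and L2 distance from the table are simple pixel-space computations. A minimal sketch with NumPy (function names are illustrative, not the framework's API; the learned metrics such as LPIPS, FID, and ImageReward require their respective model checkpoints):

```python
import numpy as np

def psnr(a, b, peak=255.0):
    # Peak signal-to-noise ratio (dB) between two same-shape uint8 images.
    mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

def l2_distance(a, b):
    # Pixel-level L2 distance, normalized by the number of pixels.
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sqrt((diff ** 2).sum()) / diff.size)
```

Higher PSNR between the original and protected image means the perturbation is less visible; lower similarity between the source identity and the deepfake output means the protection succeeded.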

Pre-processing Robustness Tests

The framework tests protection robustness against common pre-processing attacks:

  • JPEG Compression (Q=75, Q=50, Q=25)
  • Gaussian Blur (σ=1.0, 2.0, 3.0)
  • Resize & Restore (50%, 75%)
  • Center Crop & Resize
  • AdverseCleaner (algorithmic purification)
  • Random Noise Addition (σ=0.01, 0.05)
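Three of these transforms can be sketched with Pillow; this is an assumed approximation of what the framework applies, and its exact implementations may differ:

```python
import io
from PIL import Image, ImageFilter

def jpeg_compress(img: Image.Image, quality: int = 75) -> Image.Image:
    # Lossy JPEG round-trip at the given quality factor (Q=75/50/25 above).
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf).convert("RGB")

def gaussian_blur(img: Image.Image, sigma: float = 1.0) -> Image.Image:
    # Gaussian blur; Pillow's radius parameter plays the role of sigma here.
    return img.filter(ImageFilter.GaussianBlur(radius=sigma))

def resize_restore(img: Image.Image, scale: float = 0.5) -> Image.Image:
    # Downscale to `scale` of the original size, then restore the resolution.
    w, h = img.size
    small = img.resize((max(1, int(w * scale)), max(1, int(h * scale))))
    return small.resize((w, h))
```

A protection method is considered robust if its perturbation still degrades deepfake generation after the protected image has passed through transforms like these.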

Hardware Requirements

| Method | Min VRAM | Recommended GPU | Time per Image |
|---|---|---|---|
| FaceShield | 8 GB | RTX A6000 (48 GB) | ~30 s (30 iters) |
| DiffusionGuard | 12 GB | RTX 3090 (24 GB) | ~90 s (800 iters) |
| VGMShield (Prevention) | 16 GB | A100 (80 GB) | ~5 min (1000 iters) |
| Full Pipeline | 24 GB | A100 (80 GB) | ~10 min per method |

Comprehensive Analysis Report

See ANALYSIS_REPORT.md for the full comparative analysis with quantitative results from all papers.

Citation

```bibtex
@InProceedings{Jeong_2025_ICCV,
    title={FaceShield: Defending Facial Image against Deepfake Threats},
    author={Jeong, Jaehwan and In, Sumin and Kim, Sieun and Shin, Hannie and Jeong, Jongheon and Yoon, Sang Ho and Chung, Jaewook and Kim, Sangpil},
    booktitle={ICCV},
    year={2025}
}

@InProceedings{Choi_2025_ICLR,
    title={DiffusionGuard: A Robust Defense Against Malicious Diffusion-based Image Editing},
    author={Choi, June Suk and Lee, Kyungmin and Jeong, Jongheon and Xie, Saining and Shin, Jinwoo and Lee, Kimin},
    booktitle={ICLR},
    year={2025}
}

@article{pang2024vgmshield,
    title={VGMShield: Mitigating Misuse of Video Generative Models},
    author={Pang, Yan and Zhang, Yang and Wang, Tianhao},
    journal={arXiv preprint arXiv:2402.13126},
    year={2024}
}
```