
πŸ›‘οΈ Red Team Framework: Image Protection Against AI Manipulation

A comprehensive red team pipeline for evaluating adversarial perturbation-based image protection methods against deepfake generation and AI manipulation.

Overview

This framework implements and benchmarks three state-of-the-art image protection methods:

| Method | Paper | Venue | Focus |
|---|---|---|---|
| FaceShield | arXiv:2412.09921 | ICCV 2025 | Face protection against DM- and GAN-based deepfakes |
| DiffusionGuard | arXiv:2410.05694 | ICLR 2025 | Protection against diffusion-based inpainting |
| VGMShield | arXiv:2402.13126 | arXiv 2024 | Video generative model misuse prevention |

Architecture

```
┌──────────────────────────────────────────────────────────┐
│                   RED TEAM PIPELINE                      │
│                                                          │
│  ┌──────────┐   ┌──────────────┐   ┌────────────────┐    │
│  │  INPUT   │──▶│  PROTECTION  │──▶│  DEEPFAKE      │    │
│  │  Media   │   │  Module      │   │  Generation    │    │
│  └──────────┘   │  (Paper X)   │   │  (Attack)      │    │
│                 └──────────────┘   └───────┬────────┘    │
│                                            │             │
│                 ┌──────────────┐   ┌───────▼────────┐    │
│                 │  REPORTING   │◀──│  EVALUATION    │    │
│                 │  & Analysis  │   │  & Metrics     │    │
│                 └──────────────┘   └────────────────┘    │
└──────────────────────────────────────────────────────────┘
```

Pipeline Workflow

  1. Input: Original face/image to protect
  2. Protection: Apply protection method (FaceShield / DiffusionGuard / VGMShield)
  3. Attack: Attempt deepfake generation on protected image
  4. Evaluation: Measure protection success via multiple metrics
  5. Report: Compare methods across dimensions
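The five stages above compose as a nested loop over images, protection methods, and attacks. A minimal, self-contained sketch of that structure (the `protect`, `run_attack`, and `evaluate` stubs below are illustrative placeholders, not this framework's actual API):

```python
# Sketch of the red-team workflow: input -> protect -> attack -> evaluate.
# The three stubs stand in for the real protection/attack/metric modules.

METHODS = ["faceshield", "diffusionguard", "vgmshield"]
ATTACKS = ["face_swap", "inpainting"]

def protect(image, method):
    return f"{image}+{method}"          # stand-in for a perturbed image

def run_attack(protected, attack):
    return f"{protected}->{attack}"     # stand-in for a deepfake output

def evaluate(original, protected, fake):
    return {"lpips": 0.0, "psnr": 0.0}  # stand-in metric dictionary

def run_pipeline(images):
    """Run every (method, attack) pair on every input image."""
    results = {}
    for image in images:                              # 1. Input
        for method in METHODS:
            protected = protect(image, method)        # 2. Protection
            for attack in ATTACKS:
                fake = run_attack(protected, attack)  # 3. Attack
                results[(image, method, attack)] = (
                    evaluate(image, protected, fake)  # 4. Evaluation
                )
    return results                                    # 5. Report: compare entries
```

Each result key identifies one (image, method, attack) cell, which is what the reporting stage aggregates when comparing methods across dimensions.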

Quick Start

Using Docker Compose (Recommended)

```bash
# Build all services
docker-compose build

# Run the full red team pipeline
docker-compose run pipeline python run_pipeline.py \
    --input_dir ./data/test_images \
    --methods faceshield diffusionguard vgmshield \
    --attacks face_swap inpainting \
    --output_dir ./results

# Run individual protection modules
docker-compose run faceshield python protect.py --image input.png
docker-compose run diffusionguard python protect.py --image input.png --mask mask.png
docker-compose run vgmshield python protect.py --image input.png
```

Using Individual Docker Images

```bash
# FaceShield
cd modules/faceshield
docker build -t redteam-faceshield .
docker run --gpus all -v $(pwd)/data:/data redteam-faceshield \
    python protect.py --image /data/input.png --output /data/protected.png

# DiffusionGuard
cd modules/diffusionguard
docker build -t redteam-diffusionguard .
docker run --gpus all -v $(pwd)/data:/data redteam-diffusionguard \
    python protect.py --image /data/input.png --mask /data/mask.png --output /data/protected.png

# VGMShield
cd modules/vgmshield
docker build -t redteam-vgmshield .
docker run --gpus all -v $(pwd)/data:/data redteam-vgmshield \
    python protect.py --image /data/input.png --output /data/protected.png
```

Evaluation Metrics

| Metric | Description | Used By |
|---|---|---|
| ISM (Identity Score Matching) | Face identity similarity between source & deepfake output | FaceShield |
| LPIPS | Perceptual distance between images | All |
| PSNR | Peak signal-to-noise ratio | All |
| SSIM | Structural similarity index | FaceShield |
| CLIP Dir. Sim. | Alignment between edit direction and text | DiffusionGuard |
| ImageReward | Human-aligned quality assessment | DiffusionGuard |
| FID | Fréchet Inception Distance | All |
| L2 Distance | Pixel-level difference | All |
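PSNR and L2 distance from the table are simple pixel-space computations. A minimal sketch with NumPy (function names are illustrative, not the framework's API; the learned metrics such as LPIPS, FID, and ImageReward require their respective model checkpoints):

```python
import numpy as np

def psnr(a, b, peak=255.0):
    # Peak signal-to-noise ratio (dB) between two same-shape uint8 images.
    mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

def l2_distance(a, b):
    # Pixel-level L2 distance, normalized by the number of pixels.
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sqrt((diff ** 2).sum()) / diff.size)
```

Higher PSNR between the original and protected image means the perturbation is less visible; lower similarity between the source identity and the deepfake output means the protection succeeded.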

Pre-processing Robustness Tests

The framework tests protection robustness against common pre-processing attacks:

  • JPEG Compression (Q=75, Q=50, Q=25)
  • Gaussian Blur (σ=1.0, 2.0, 3.0)
  • Resize & Restore (50%, 75%)
  • Center Crop & Resize
  • AdverseCleaner (algorithmic purification)
  • Random Noise Addition (σ=0.01, 0.05)
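Three of these transforms can be sketched with Pillow; this is an assumed approximation of what the framework applies, and its exact implementations may differ:

```python
import io
from PIL import Image, ImageFilter

def jpeg_compress(img: Image.Image, quality: int = 75) -> Image.Image:
    # Lossy JPEG round-trip at the given quality factor (Q=75/50/25 above).
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf).convert("RGB")

def gaussian_blur(img: Image.Image, sigma: float = 1.0) -> Image.Image:
    # Gaussian blur; Pillow's radius parameter plays the role of sigma here.
    return img.filter(ImageFilter.GaussianBlur(radius=sigma))

def resize_restore(img: Image.Image, scale: float = 0.5) -> Image.Image:
    # Downscale to `scale` of the original size, then restore the resolution.
    w, h = img.size
    small = img.resize((max(1, int(w * scale)), max(1, int(h * scale))))
    return small.resize((w, h))
```

A protection method is considered robust if its perturbation still degrades deepfake generation after the protected image has passed through transforms like these.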

Hardware Requirements

| Method | Min VRAM | Recommended GPU | Time per Image |
|---|---|---|---|
| FaceShield | 8 GB | RTX A6000 (48 GB) | ~30 s (30 iters) |
| DiffusionGuard | 12 GB | RTX 3090 (24 GB) | ~90 s (800 iters) |
| VGMShield (Prevention) | 16 GB | A100 (80 GB) | ~5 min (1000 iters) |
| Full Pipeline | 24 GB | A100 (80 GB) | ~10 min per method |

Comprehensive Analysis Report

See ANALYSIS_REPORT.md for the full comparative analysis with quantitative results from all papers.

Citation

```bibtex
@InProceedings{Jeong_2025_ICCV,
    title={FaceShield: Defending Facial Image against Deepfake Threats},
    author={Jeong, Jaehwan and In, Sumin and Kim, Sieun and Shin, Hannie and Jeong, Jongheon and Yoon, Sang Ho and Chung, Jaewook and Kim, Sangpil},
    booktitle={ICCV},
    year={2025}
}

@InProceedings{Choi_2025_ICLR,
    title={DiffusionGuard: A Robust Defense Against Malicious Diffusion-based Image Editing},
    author={Choi, June Suk and Lee, Kyungmin and Jeong, Jongheon and Xie, Saining and Shin, Jinwoo and Lee, Kimin},
    booktitle={ICLR},
    year={2025}
}

@article{pang2024vgmshield,
    title={VGMShield: Mitigating Misuse of Video Generative Models},
    author={Pang, Yan and Zhang, Yang and Wang, Tianhao},
    journal={arXiv preprint arXiv:2402.13126},
    year={2024}
}
```