A collection of papers and resources about Machine Unlearning on Vision Generative Models (VGMs) and Vision Language Models (VLMs).
Another collection of Vision Language Models and Vision Generative Models can be found here.
- High-Resolution Image Synthesis with Latent Diffusion Models, Rombach et al., Stable Diffusion, CVPR 2022, 2021-12
- Midjourney, Midjourney, Commercial Image Generation Service, Website, 2023
- Regulating ChatGPT and other large generative AI models, Hacker et al., Regulating ChatGPT, FAccT 2023, 2023-12
- Safe Latent Diffusion: Mitigating Inappropriate Degeneration in Diffusion Models, Schramowski et al., Safe Latent Diffusion, Code, CVPR 2023, 2022-11
- Discovering Universal Semantic Triggers for Text-to-Image Synthesis, Zhai et al., Universal Semantic Triggers, No Code Available, ArXiv, 2024-02
- Users of Midjourney text-to-image site claim issues with new update, News, The Street, 2023-12
- Understanding and Mitigating Copying in Diffusion Models, Somepalli et al., Code, NeurIPS 2023, 2023-05
- Exploiting Cultural Biases via Homoglyphs in Text-to-Image Synthesis, Struppek et al., Homoglyphs, Code, Journal of Artificial Intelligence Research, 2022-09
- Concept Decomposition for Visual Exploration and Inspiration, Vinker et al., Concept Decomposition, Code, ACM Transactions on Graphics (TOG), 2023-05
- On the Trustworthiness Landscape of State-of-the-art Generative Models: A Survey and Outlook, Fan et al., Trustworthiness Survey, ArXiv, 2023-07
- How AI reduces the world to stereotypes, Turk, News, Rest of World, 2023-10
- Security and Privacy on Generative Data in AIGC: A Survey, Wang et al., Security and Privacy Survey, ArXiv, 2023-09
- Trustworthy Large Models in Vision: A Survey, Guo et al., Trustworthy Vision Models Survey, ArXiv, 2023-11
- Multilingual Text-to-Image Generation Magnifies Gender Stereotypes and Prompt Engineering May Not Help You, Friedrich et al., Multilingual Bias, ArXiv, 2024-01
- Extracting Training Data from Diffusion Models, Carlini et al., Diffusion Data Extraction (sketch below), USENIX Security 2023, 2023-01
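
The Carlini et al. entry above is the canonical extraction result, and its core test is simple to sketch: sample many images for one prompt and count near-duplicate pairs, since a tight cluster of near-identical generations signals memorized training data. This is a minimal, hypothetical illustration; `generate` and `embed` are assumed placeholders (e.g., a diffusion pipeline and a CLIP image encoder), not a specific library's API.

```python
# Minimal sketch of the near-duplicate test behind training-data extraction
# (Carlini et al., 2023). `generate` and `embed` are hypothetical placeholders.
import itertools
import torch
import torch.nn.functional as F

def memorization_score(generate, embed, prompt, n=16, sim_thresh=0.95):
    images = [generate(prompt) for _ in range(n)]       # n samples for one prompt
    feats = F.normalize(torch.stack([embed(img) for img in images]), dim=-1)
    sims = feats @ feats.T                              # pairwise cosine similarity
    dup_pairs = sum(
        1 for i, j in itertools.combinations(range(n), 2)
        if sims[i, j].item() > sim_thresh
    )
    return dup_pairs  # many near-identical pairs -> prompt likely regurgitates training data
```
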
- Rethinking Machine Unlearning for Large Language Models, Liu et al., ArXiv, 2024-02
- Threats, Attacks, and Defenses in Machine Unlearning: A Survey, Liu et al., Machine Unlearning Survey, ArXiv, 2024-03
- Machine Unlearning: Taxonomy, Metrics, Applications, Challenges, and Prospects, ArXiv, 2024-03
- Machine Unlearning for Image-to-Image Generative Models, Li et al., Machine Unlearning for Image-to-Image, Code, ArXiv, 2024-02
- Safe Latent Diffusion: Mitigating Inappropriate Degeneration in Diffusion Models, Schramowski et al., Safe Latent Diffusion, Code, CVPR 2023, 2022-11
- Forgedit: Text-guided Image Editing via Learning and Forgetting, Zhang et al., Latent vector subtraction, projection, and editing, Code, ArXiv, 2023-09
- Prompting4Debugging: Red-Teaming Text-to-Image Diffusion Models by Finding Problematic Prompts, Chin et al., Finding problematic prompts, No Code Available, ArXiv, 2023-09
- Prompt-Specific Poisoning Attacks on Text-to-Image Generative Models, Shan et al., Prompt Poisoning, No Code Available, ArXiv, 2023-10
- Removing NSFW Concepts from Vision-and-Language Models for Text-to-Image Retrieval and Generation, Poppi et al., Clip-based Prompt engineering, Code, ArXiv, 2023-11
- Removing Undesirable Concepts in Text-to-Image Generative Models with Learnable Prompts, Bui et al., Learnable Prompts, ArXiv, 2024-03
- Unlearning Backdoor Threats: Enhancing Backdoor Defense in Multimodal Contrastive Learning via Local Token Unlearning, Liang et al., Local Token Unlearning, ArXiv, 2024-03
- SalUn: Empowering Machine Unlearning via Gradient-Based Weight Saliency in Both Image Classification and Generation, Fan et al., Weight Saliency, Code, ICLR 2024, 2023-10
- Forget-Me-Not: Learning to Forget in Text-to-Image Diffusion Models, Zhang et al., Attention Re-steering Loss, Code, ArXiv, 2023-03
- Exploiting Cultural Biases via Homoglyphs in Text-to-Image Synthesis, Struppek et al., Homoglyphs, Code, Journal of Artificial Intelligence Research, 2022-09
- Towards Safe Self-Distillation of Internet-Scale Text-to-Image Diffusion Models, Kim et al., Safe Self-Distillation, Code, ArXiv, 2023-07
- Editing Massive Concepts in Text-to-Image Diffusion Models, Xiong et al., Massive Concept Editing, Code, ArXiv, 2024-03
- To Generate or Not? Safety-Driven Unlearned Diffusion Models Are Still Easy To Generate Unsafe Images ... For Now, Zhang et al., UnlearnDiff, Code, ArXiv, 2023-10
- Ablating concepts in text-to-image diffusion models, Kumari et al., Concept Ablation, Code, ICCV 2023, 2023-03
- Erasing Concepts from Diffusion Models, Gandikota et al., Concept Erasure (see the sketch below), Code, ICCV 2023, 2023-03
- Circumventing Concept Erasure Methods For Text-To-Image Generative Models, Pham et al., Concept Erasure, ICLR 2024, 2023-08
- Unified Concept Editing in Diffusion Models, Gandikota et al., Unified Concept Editing, Code, WACV 2024, 2023-08
- Concept Sliders: LoRA Adaptors for Precise Control in Diffusion Models, Gandikota et al., Concept Sliders, ArXiv, 2023-11
- Receler: Reliable Concept Erasing of Text-to-Image Diffusion Models via Lightweight Erasers, Huang et al., Lightweight Erasers, ArXiv, 2023-11
- All but One: Surgical Concept Erasing with Model Preservation in Text-to-Image Diffusion Models, Hong et al., Surgical Concept Erasing, ArXiv, 2023-12
- Get What You Want, Not What You Don't: Image Content Suppression for Text-to-Image Diffusion Models, Li et al., Image Content Suppression, ArXiv, 2024-02
- Separable Multi-Concept Erasure from Diffusion Models, Zhao et al., Multi-Concept Erasure, Code, ArXiv, 2024-02
- Editing Massive Concepts in Text-to-Image Diffusion Models, Xiong et al., Massive Concept Editing, Code, ArXiv, 2024-03
- Continual Learning for Forgetting in Deep Generative Models, Heng et al., Continual Unlearning, ICML 2023 Workshop on Deployable Generative AI, 2023-06
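
To make the erasure entries above concrete, here is a minimal sketch of the negative-guidance fine-tuning objective from Erasing Concepts from Diffusion Models (Gandikota et al.): a trainable U-Net is pushed toward what a frozen copy predicts when guided *away* from the target concept. `unet`, `frozen_unet`, and `embed_prompt` are assumed stand-ins for a Stable-Diffusion-style pipeline, not a specific library's API.

```python
# Minimal sketch of the ESD negative-guidance objective (Gandikota et al., 2023).
# `unet`, `frozen_unet`, and `embed_prompt` are hypothetical stand-ins.
import torch
import torch.nn.functional as F

def erase_step(unet, frozen_unet, embed_prompt, x_t, t, concept, eta=1.0):
    """One fine-tuning step steering the trainable U-Net away from `concept`."""
    c = embed_prompt(concept)  # text embedding of the concept to erase
    with torch.no_grad():
        eps_uncond = frozen_unet(x_t, t, cond=None)  # unconditional noise prediction
        eps_cond = frozen_unet(x_t, t, cond=c)       # concept-conditioned prediction
        # Negative guidance: move the target *away* from the concept direction.
        target = eps_uncond - eta * (eps_cond - eps_uncond)
    eps_pred = unet(x_t, t, cond=c)  # trainable model, conditioned on the concept
    return F.mse_loss(eps_pred, target)
```

Roughly speaking, many of the other entries in this block vary what gets updated under a similar target: cross-attention weights only (Forget-Me-Not, UCE), LoRA adaptors (Concept Sliders), or closed-form weight edits rather than gradient fine-tuning.
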
- Conceptual 12M, Changpinyo et al., CVPR 2021, 2021-09
- COYO, kakaobrain, COYO, ArXiv, 2022-08
- TedBench, Kawar et al., TedBench, ArXiv, 2022-10
- LAION-5B: An open large-scale dataset for training next generation image-text models, Schuhmann et al., LAION-5B, NeurIPS 2022, 2022-10
- Inappropriate Image Prompts (I2P), Schramowski et al., I2P (loading sketch below), CVPR 2023, 2023-01
- ConceptBench, Zhang et al., ConceptBench, ArXiv, 2023-03
- MAGBIG, Friedrich et al., Gender Bias Dataset, ArXiv, 2024-01
- UnlearnCanvas, Zhang et al., UnlearnCanvas, ArXiv, 2024-03
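
For the I2P entry above, the prompts are commonly pulled straight from the Hugging Face Hub. A minimal sketch, assuming the public `AIML-TUDA/i2p` repo id, a `train` split, and a `prompt` column (check the dataset card before relying on any of these):

```python
# Sketch: load I2P prompts for evaluating an unlearned model.
# The repo id "AIML-TUDA/i2p" and the "prompt" column are assumptions.
from datasets import load_dataset

i2p = load_dataset("AIML-TUDA/i2p", split="train")
prompts = [row["prompt"] for row in i2p]
print(f"{len(prompts)} inappropriate prompts, e.g.: {prompts[0]!r}")
```
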
- Déjà Vu Memorization in Vision-Language Models, Jayaraman et al., Déjà Vu Memorization, ArXiv, 2024-02
- Are Diffusion Models Vulnerable to Membership Inference Attacks?, Duan et al., Membership Inference Attacks (sketch at the end of this list), Code, ICML 2023, 2023-02
- SneakyPrompt: Jailbreaking Text-to-image Generative Models, Yang et al., SneakyPrompt, Code, IEEE SSP 2024, 2023-05
- Can Sensitive Information Be Deleted From LLMs? Objectives for Defending Against Extraction Attacks, Patil et al., Defending Against Extraction Attacks, Code, ICLR 2024, 2023-09
- Ring-A-Bell! How Reliable are Concept Removal Methods for Diffusion Models?, Tsai et al., Red-teaming tool for T2I diffusion models, ICLR 2024, 2023-10
- To Generate or Not? Safety-Driven Unlearned Diffusion Models Are Still Easy To Generate Unsafe Images ... For Now, Zhang et al., UnlearnDiff, Code, ArXiv, 2023-10
- MMA-Diffusion: MultiModal Attack on Diffusion Models, Yang et al., MMA-Diffusion, ArXiv, 2023-11
- The Stronger the Diffusion Model, the Easier the Backdoor: Data Poisoning to Induce Copyright Breaches Without Adjusting Finetuning Pipeline, Wang et al., Data Poisoning for Copyright Breaches, ArXiv, 2024-01
- Discovering Universal Semantic Triggers for Text-to-Image Synthesis, Zhai et al., Universal Semantic Triggers, No Code Available, ArXiv, 2024-02
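
Finally, a minimal sketch of the loss-thresholding idea behind diffusion membership inference (Duan et al., above): training members tend to have lower denoising error at a fixed timestep than non-members. `unet` and `scheduler` are hypothetical placeholders, and the published attack (SecMI) uses a more careful deterministic error estimate; this only illustrates the basic signal.

```python
# Minimal sketch of loss-threshold membership inference on a diffusion model,
# in the spirit of Duan et al. (ICML 2023). `unet` and `scheduler` are
# hypothetical placeholders for a diffusion pipeline.
import torch
import torch.nn.functional as F

@torch.no_grad()
def denoise_error(unet, scheduler, x0, cond, t):
    noise = torch.randn_like(x0)
    x_t = scheduler.add_noise(x0, noise, t)  # forward-diffuse the candidate image
    eps_pred = unet(x_t, t, cond=cond)
    return F.mse_loss(eps_pred, noise).item()  # members tend to score lower

def is_member(unet, scheduler, x0, cond, t, threshold):
    # `threshold` is calibrated on images with known membership status.
    return denoise_error(unet, scheduler, x0, cond, t) < threshold
```
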