
Awesome Image Editing

This is the GitHub repository for our work "A Survey of Multimodal-Guided Image Editing with Text-to-Image Diffusion Models".

Editing Tasks Discussed in Our Survey

*(Figure: the editing tasks covered by the survey.)*

Unified Framework

*(Figure: the survey's unified framework.)*
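
The 🔀 tag on each training-free entry below places the method in this framework. Roughly (see the survey for the formal definitions): $F_{inv}^T$ denotes tuning-based inversion (fitting a text embedding or model weights to the source image) and $F_{inv}^F$ forward-based inversion (e.g., DDIM inversion of the source into noise), while the superscript on $F_{edit}$ names the editing algorithm: Norm (plain re-denoising), Attn (attention modification), Blend (latent blending), Score (score/energy guidance), and Optim (latent optimization).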

Table of contents

Content-Aware Editing


Content-Free Editing


Experiment and Data


Object Manipulation and Attribute Manipulation:

1. Training-Free Approaches

📄 UniTune: Text-Driven Image Editing by Fine Tuning a Diffusion Model on a Single Image | 📖 TOG 2023 | 🔀 $F_{inv}^T+F_{edit}^{Norm}$ | 🌐 Code

📄 Highly Personalized Text Embedding for Image Manipulation by Stable Diffusion | 📖 arXiv 2023 | 🔀 $F_{inv}^T+F_{edit}^{Norm}$ | 🌐 Code

📄 Imagic: Text-Based Real Image Editing with Diffusion Models | 📖 CVPR 2023 | 🔀 $F_{inv}^T+F_{edit}^{Blend}$ | 🌐 Code

📄 Forgedit: Text Guided Image Editing via Learning and Forgetting | 📖 arXiv 2023 | 🔀 $F_{inv}^T+F_{edit}^{Blend}$ | 🌐 Code

📄 Doubly Abductive Counterfactual Inference for Text-based Image Editing | 📖 CVPR 2024 | 🔀 $F_{inv}^T+F_{edit}^{Blend}$ | 🌐 Code

📄 SINE: SINgle Image Editing with Text-to-Image Diffusion Models | 📖 CVPR 2023 | 🔀 $F_{inv}^T+F_{edit}^{Score}$ | 🌐 Code

📄 EDICT: Exact Diffusion Inversion via Coupled Transformations | 📖 CVPR 2023 | 🔀 $F_{inv}^F+F_{edit}^{Norm}$ | 🌐 Code

📄 Exact Diffusion Inversion via Bi-directional Integration Approximation | 📖 arXiv 2023 | 🔀 $F_{inv}^F+F_{edit}^{Norm}$ | 🌐 Code

📄 Effective Real Image Editing with Accelerated Iterative Diffusion Inversion | 📖 ICCV 2023 | 🔀 $F_{inv}^F+F_{edit}^{Norm}$ | 🌐 Code

📄 Null-text Inversion for Editing Real Images using Guided Diffusion Models | 📖 CVPR 2023 | 🔀 $F_{inv}^F+F_{edit}^{Attn}$ | 🌐 Code

📄 Negative-prompt Inversion: Fast Image Inversion for Editing with Text-guided Diffusion Models | 📖 arXiv 2023 | 🔀 $F_{inv}^F+F_{edit}^{Attn}$ | 🌐 Code

📄 ProxEdit: Improving Tuning-Free Real Image Editing with Proximal Guidance | 📖 WACV 2024 | 🔀 $F_{inv}^F+F_{edit}^{Attn}$ | 🌐 Code

📄 Fixed-point Inversion for Text-to-Image Diffusion Models | 📖 arXiv 2023 | 🔀 $F_{inv}^F+F_{edit}^{Attn}$ | 🌐 Code

📄 PnP Inversion: Boosting Diffusion-based Editing with 3 Lines of Code | 📖 ICLR 2024 | 🔀 $F_{inv}^F+F_{edit}^{Attn}$ | 🌐 Code

📄 Dynamic Prompt Learning: Addressing Cross-Attention Leakage for Text-Based Image Editing | 📖 NeurIPS 2023 | 🔀 $F_{inv}^F+F_{edit}^{Attn}$ | 🌐 Code

📄 An Edit Friendly DDPM Noise Space: Inversion and Manipulations | 📖 CVPR 2024 | 🔀 $F_{inv}^F+F_{edit}^{Attn}$ | 🌐 Code

📄 Prompt-to-Prompt Image Editing with Cross-Attention Control | 📖 ICLR 2023 | 🔀 $F_{inv}^F+F_{edit}^{Attn}$ | 🌐 Code

📄 Plug-and-Play Diffusion Features for Text-Driven Image-to-Image Translation | 📖 CVPR 2023 | 🔀 $F_{inv}^F+F_{edit}^{Attn}$ | 🌐 Code

📄 Towards Understanding Cross and Self-Attention in Stable Diffusion for Text-Guided Image Editing | 📖 CVPR 2024 | 🔀 $F_{inv}^F+F_{edit}^{Attn}$ | 🌐 Code

📄 StyleDiffusion: Prompt-Embedding Inversion for Text-Based Editing | 📖 arXiv 2023 | 🔀 $F_{inv}^F+F_{edit}^{Attn}$ | 🌐 Code

📄 Prompt Tuning Inversion for Text-Driven Image Editing Using Diffusion Models | 📖 ICCV 2023 | 🔀 $F_{inv}^F+F_{edit}^{Blend}$ | 🌐 Code

📄 Object-aware Inversion and Reassembly for Image Editing | 📖 ICLR 2024 | 🔀 $F_{inv}^F+F_{edit}^{Blend}$ | 🌐 Code

📄 DiffEdit: Diffusion-based Semantic Image Editing with Mask Guidance | 📖 ICLR 2023 | 🔀 $F_{inv}^F+F_{edit}^{Blend}$ | 🌐 Code

📄 PFB-Diff: Progressive Feature Blending Diffusion for Text-driven Image Editing | 📖 arXiv 2023 | 🔀 $F_{inv}^F+F_{edit}^{Blend}$ | 🌐 Code

📄 Uncovering the Disentanglement Capability in Text-to-Image Diffusion Models | 📖 CVPR 2023 | 🔀 $F_{inv}^F+F_{edit}^{Blend}$ | 🌐 Code

📄 Noise Map Guidance: Inversion with Spatial Context for Real Image Editing | 📖 ICLR 2024 | 🔀 $F_{inv}^F+F_{edit}^{Score}$ | 🌐 Code

📄 Zero-shot Image-to-Image Translation (pix2pix-zero) | 📖 SIGGRAPH 2023 | 🔀 $F_{inv}^F+F_{edit}^{Score}$ | 🌐 Code

📄 SEGA: Instructing Diffusion using Semantic Dimensions | 📖 NeurIPS 2023 | 🔀 $F_{inv}^F+F_{edit}^{Score}$ | 🌐 Code

📄 The Stable Artist: Steering Semantics in Diffusion Latent Space | 📖 arXiv 2022 | 🔀 $F_{inv}^F+F_{edit}^{Score}$ | 🌐 Code

📄 LEDITS: Real Image Editing with DDPM Inversion and Semantic Guidance | 📖 arXiv 2023 | 🔀 $F_{inv}^F+F_{edit}^{Score}$ | 🌐 Code

📄 LEDITS++: Limitless Image Editing using Text-to-Image Models | 📖 CVPR 2024 | 🔀 $F_{inv}^F+F_{edit}^{Score}$ | 🌐 Code

📄 MagicRemover: Tuning-free Text-guided Image Inpainting with Diffusion Models | 📖 ICLR 2024 | 🔀 $F_{inv}^F+F_{edit}^{Score}$ | 🌐 Code

📄 Region-Aware Diffusion for Zero-shot Text-driven Image Editing | 📖 arXiv 2023 | 🔀 $F_{inv}^F+F_{edit}^{Optim}$ | 🌐 Code

📄 Delta Denoising Score | 📖 ICCV 2023 | 🔀 $F_{inv}^F+F_{edit}^{Optim}$ | 🌐 Code

📄 Contrastive Denoising Score for Text-guided Latent Diffusion Image Editing | 📖 CVPR 2024 | 🔀 $F_{inv}^F+F_{edit}^{Optim}$ | 🌐 Code

📄 Ground-A-Score: Scaling Up the Score Distillation for Multi-Attribute Editing | 📖 arXiv 2024 | 🔀 $F_{inv}^F+F_{edit}^{Optim}$ | 🌐 Code

📄 Custom-Edit: Text-Guided Image Editing with Customized Diffusion Models | 📖 CVPR 2023 | 🔀 $F_{inv}^T+F_{inv}^F+F_{edit}^{Attn}$ | 🌐 Code

📄 Photoswap: Personalized Subject Swapping in Images | 📖 NeurIPS 2023 | 🔀 $F_{inv}^T+F_{inv}^F+F_{edit}^{Attn}$ | 🌐 Code

📄 DreamEdit: Subject-driven Image Editing | 📖 TMLR 2023 | 🔀 $F_{inv}^T+F_{inv}^F+F_{edit}^{Blend}$ | 🌐 Code
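
Most entries above share a two-stage recipe: invert the real image into a noise latent, then re-denoise it under the target prompt, with the Attn/Blend/Score/Optim families differing only in how the second stage is steered. Below is a minimal sketch of the plainest combination, $F_{inv}^F+F_{edit}^{Norm}$ (DDIM inversion plus straight resampling), using the diffusers library; the checkpoint id, file names, and prompts are placeholders, and a recent diffusers release is assumed.

```python
import numpy as np
import torch
from PIL import Image
from diffusers import DDIMInverseScheduler, DDIMScheduler, StableDiffusionPipeline

device = "cuda"
pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5").to(device)
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
inverse = DDIMInverseScheduler.from_config(pipe.scheduler.config)

@torch.no_grad()
def image_to_latents(path):
    # Encode the source image into the VAE latent space.
    img = Image.open(path).convert("RGB").resize((512, 512))
    x = torch.from_numpy(np.array(img)).float() / 127.5 - 1.0
    x = x.permute(2, 0, 1)[None].to(device)
    return pipe.vae.encode(x).latent_dist.mean * pipe.vae.config.scaling_factor

@torch.no_grad()
def ddim_invert(latents, prompt, steps=50):
    # F_inv^F: run DDIM "backwards", image latent -> approximate noise latent.
    emb, _ = pipe.encode_prompt(prompt, device, 1, False)
    inverse.set_timesteps(steps, device=device)
    for t in inverse.timesteps:
        eps = pipe.unet(latents, t, encoder_hidden_states=emb).sample
        latents = inverse.step(eps, t, latents).prev_sample
    return latents

noise = ddim_invert(image_to_latents("cat.png"), "a photo of a cat")
# F_edit^Norm: resample from the inverted noise under the edited prompt.
# Low guidance keeps the result close to the inversion trajectory.
edited = pipe("a photo of a tiger", latents=noise,
              num_inference_steps=50, guidance_scale=2.0).images[0]
edited.save("edited.png")
```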

2. Training-Based Approaches

📄 InstructPix2Pix: Learning to Follow Image Editing Instructions | 📖 CVPR 2023 | 🌐 Code

📄 MagicBrush: A Manually Annotated Dataset for Instruction-Guided Image Editing | 📖 NeurIPS 2023 | 🌐 Code

📄 HIVE: Harnessing Human Feedback for Instructional Visual Editing | 📖 arXiv 2023 | 🌐 Code

📄 Emu Edit: Precise Image Editing via Recognition and Generation Tasks | 📖 arXiv 2023 | 🌐 Code

📄 Guiding Instruction-based Image Editing via Multimodal Large Language Models | 📖 ICLR 2024 | 🌐 Code

📄 SmartEdit: Exploring Complex Instruction-based Image Editing with Multimodal Large Language Models | 📖 CVPR 2024 | 🌐 Code

📄 Referring Image Editing: Object-level Image Editing via Referring Expressions | 📖 CVPR 2024 | 🌐 Code
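
The training-based entries above drop inversion entirely: a model fine-tuned on (source image, instruction, edited image) triplets edits in one conditioned sampling pass. A minimal usage sketch of the released InstructPix2Pix checkpoint via diffusers (the image path and instruction are placeholders):

```python
import torch
from PIL import Image
from diffusers import StableDiffusionInstructPix2PixPipeline

pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
    "timbrooks/instruct-pix2pix", torch_dtype=torch.float16).to("cuda")

image = Image.open("input.png").convert("RGB")
edited = pipe(
    "make it look like a watercolor painting",  # free-form instruction
    image=image,
    num_inference_steps=20,
    image_guidance_scale=1.5,  # fidelity to the input image
    guidance_scale=7.5,        # adherence to the instruction
).images[0]
edited.save("edited.png")
```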


Attribute Manipulation:

1. Training-Free Approaches

📄 KV Inversion: KV Embeddings Learning for Text-Conditioned Real Image Action Editing | 📖 PRCV 2023 | 🔀 $F_{inv}^F+F_{edit}^{Attn}$ | 🌐 Code

📄 Localizing Object-level Shape Variations with Text-to-Image Diffusion Models | 📖 ICCV 2023 | 🔀 $F_{inv}^F+F_{edit}^{Attn}$ | 🌐 Code

📄 MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing | 📖 ICCV 2023 | 🔀 $F_{inv}^F+F_{edit}^{Attn}$ | 🌐 Code

📄 Tuning-Free Inversion-Enhanced Control for Consistent Image Editing | 📖 AAAI 2024 | 🔀 $F_{inv}^F+F_{edit}^{Attn}$ | 🌐 Code

📄 Cross-Image Attention for Zero-Shot Appearance Transfer | 📖 SIGGRAPH 2024 | 🔀 $F_{inv}^F+F_{edit}^{Attn}$ | 🌐 Code
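
All five entries above steer the edit through the UNet's self-attention: keys and values recorded while generating or reconstructing the source are re-injected while denoising the edit, so appearance stays consistent as pose or layout changes. Below is a compressed sketch of that idea using the diffusers attention-processor hook; it is not a faithful MasaCtrl implementation (the real methods restrict injection to particular layers and timesteps, and start from an inverted real image rather than a seeded generation).

```python
import torch
from diffusers import StableDiffusionPipeline

class KVInjectProcessor:
    """Record self-attention K/V on a 'store' pass; reuse them on 'inject'."""
    def __init__(self):
        self.mode, self.bank = "store", []

    def __call__(self, attn, hidden_states, encoder_hidden_states=None,
                 attention_mask=None, **kwargs):
        is_self = encoder_hidden_states is None
        ctx = hidden_states if is_self else encoder_hidden_states
        q = attn.head_to_batch_dim(attn.to_q(hidden_states))
        k = attn.head_to_batch_dim(attn.to_k(ctx))
        v = attn.head_to_batch_dim(attn.to_v(ctx))
        if is_self and self.mode == "store":
            self.bank.append((k, v))
        elif is_self and self.mode == "inject":
            k, v = self.bank.pop(0)  # keys/values from the source pass
        out = torch.bmm(attn.get_attention_scores(q, k, attention_mask), v)
        out = attn.batch_to_head_dim(out)
        return attn.to_out[1](attn.to_out[0](out))  # linear proj + dropout

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5").to("cuda")
proc = KVInjectProcessor()
pipe.unet.set_attn_processor(proc)

gen = torch.Generator("cuda").manual_seed(0)
proc.mode = "store"   # source pass: record every self-attention K/V
pipe("a photo of a standing cat", generator=gen, num_inference_steps=20)
gen = torch.Generator("cuda").manual_seed(0)
proc.mode = "inject"  # edit pass: same seed, new prompt, injected K/V
edited = pipe("a photo of a jumping cat", generator=gen,
              num_inference_steps=20).images[0]
```

Storing every step's K/V is memory-hungry; that is one reason the papers above inject only in decoder self-attention layers and only after an initial warm-up phase.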

2. Training-Based Approaches


Spatial Transformation:

1. Training-Free Approaches

📄 DesignEdit: Multi-Layered Latent Decomposition and Fusion for Unified & Accurate Image Editing | 📖 arXiv 2024 | 🔀 $F_{inv}^F+F_{edit}^{Attn}$ | 🌐 Code

📄 Diffusion Self-Guidance for Controllable Image Generation | 📖 NeurIPS 2023 | 🔀 $F_{inv}^F+F_{edit}^{Score}$ | 🌐 Code

📄 DragonDiffusion: Enabling Drag-style Manipulation on Diffusion Models | 📖 ICLR 2024 | 🔀 $F_{inv}^F+F_{edit}^{Score}$ | 🌐 Code

📄 DragDiffusion: Harnessing Diffusion Models for Interactive Point-based Image Editing | 📖 CVPR 2024 | 🔀 $F_{inv}^T+F_{inv}^F+F_{edit}^{Optim}$ | 🌐 Code

📄 DiffEditor: Boosting Accuracy and Flexibility on Diffusion-based Image Editing | 📖 CVPR 2024 | 🔀 $F_{inv}^T+F_{inv}^F+F_{edit}^{Score}$ | 🌐 Code
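
The Score-tagged entries above steer each denoising step with the gradient of an editing energy rather than attention surgery. The sketch below shows only that score-composition step; `editing_energy` is a hypothetical stand-in defined directly on the latent (Self-Guidance and DragonDiffusion compute feature- and attention-based energies between handle and target regions), so only the guidance mechanics carry over.

```python
import torch

def guided_step(pipe, latents, t, prompt_embeds, editing_energy, weight=1.0):
    """One denoising step with an extra energy-gradient (score) term."""
    latents = latents.detach().requires_grad_(True)
    with torch.no_grad():
        eps = pipe.unet(latents, t, encoder_hidden_states=prompt_embeds).sample
    # Gradient of the editing energy with respect to the current latent...
    grad = torch.autograd.grad(editing_energy(latents), latents)[0]
    # ...is added to the predicted noise, so sampling drifts downhill on the energy.
    eps = eps + weight * grad
    return pipe.scheduler.step(eps, t, latents.detach()).prev_sample

# Toy energy: suppress activity in the left half of the latent. A drag edit
# would instead match UNet features at user-chosen handle/target points.
toy_energy = lambda z: z[..., : z.shape[-1] // 2].pow(2).mean()
```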

2. Training-Based Approaches


Inpainting:

1. Training-Free Approaches

📄 HD-Painter: High-Resolution and Prompt-Faithful Text-Guided Image Inpainting with Diffusion Models | 📖 arXiv 2023 | 🔀 $F_{inv}^F+F_{edit}^{Attn}$ | 🌐 Code

📄 TF-ICON: Diffusion-Based Training-Free Cross-Domain Image Composition | 📖 ICCV 2023 | 🔀 $F_{inv}^F+F_{edit}^{Blend}$ | 🌐 Code

📄 Blended Latent Diffusion | 📖 TOG 2023 | 🔀 $F_{inv}^F+F_{edit}^{Blend}$ | 🌐 Code

📄 High-Resolution Image Editing via Multi-Stage Blended Diffusion | 📖 arXiv 2022 | 🔀 $F_{inv}^F+F_{edit}^{Blend}$ | 🌐 Code

📄 Differential Diffusion: Giving Each Pixel Its Strength | 📖 arXiv 2023 | 🔀 $F_{inv}^F+F_{edit}^{Blend}$ | 🌐 Code

📄 Tuning-Free Image Customization with Image and Text Guidance | 📖 CVPR 2024 | 🔀 $F_{inv}^F+F_{edit}^{Blend}$ | 🌐 Code
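
The Blend-tagged entries share one mechanism, introduced by Blended (Latent) Diffusion: at every step the freshly denoised latent is composited with a matching noised copy of the source, so only the masked region is actually edited. A minimal sketch of that loop (the pipeline, source latents, binary mask, and prompt embeddings are assumed given; mask = 1 marks the editable region):

```python
import torch

@torch.no_grad()
def blended_denoise(pipe, src_latents, mask, prompt_embeds, steps=50):
    # Blended-latent loop: denoise everywhere, keep the source outside the mask.
    pipe.scheduler.set_timesteps(steps, device=src_latents.device)
    ts = pipe.scheduler.timesteps
    latents = torch.randn_like(src_latents)
    for i, t in enumerate(ts):
        eps = pipe.unet(latents, t, encoder_hidden_states=prompt_embeds).sample
        latents = pipe.scheduler.step(eps, t, latents).prev_sample
        if i + 1 < len(ts):
            # Noise the source down to the next timestep before pasting it back.
            noisy_src = pipe.scheduler.add_noise(
                src_latents, torch.randn_like(src_latents), ts[i + 1])
        else:
            noisy_src = src_latents  # last step: the clean source itself
        latents = mask * latents + (1 - mask) * noisy_src
    return latents
```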

2. Training-Based Approaches

📄 Imagen Editor and EditBench: Advancing and Evaluating Text-Guided Image Inpainting | 📖 CVPR 2023 | 🌐 Code

📄 SmartBrush: Text and Shape Guided Object Inpainting with Diffusion Model | 📖 CVPR 2023 | 🌐 Code

📄 A Task is Worth One Word: Learning with Task Prompts for High-Quality Versatile Image Inpainting | 📖 arXiv 2023 | 🌐 Code

📄 Paint by Example: Exemplar-based Image Editing with Diffusion Models | 📖 CVPR 2023 | 🌐 Code

📄 ObjectStitch: Object Compositing with Diffusion Model | 📖 CVPR 2023 | 🌐 Code

📄 Reference-based Image Composition with Sketch via Structure-aware Diffusion Model | 📖 CVPR 2023 | 🌐 Code

📄 Paste, Inpaint and Harmonize via Denoising: Subject-Driven Image Editing with Pre-Trained Diffusion Model | 📖 ICASSP 2024 | 🌐 Code

📄 AnyDoor: Zero-shot Object-level Image Customization | 📖 CVPR 2024 | 🌐 Code
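
Their training-based counterparts instead fine-tune the UNet with the mask and masked image as extra input channels, so filling a hole is a single conditioned sampling pass. Minimal diffusers usage with the standard Stable Diffusion inpainting checkpoint (paths and prompt are placeholders):

```python
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16).to("cuda")

image = Image.open("room.png").convert("RGB").resize((512, 512))
mask = Image.open("mask.png").convert("L").resize((512, 512))  # white = repaint
result = pipe("a leather armchair", image=image, mask_image=mask,
              num_inference_steps=30).images[0]
result.save("inpainted.png")
```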


Style Change:

1. Training-Free Approaches

📄 Inversion-Based Style Transfer with Diffusion Models | 📖 CVPR 2023 | 🔀 $F_{inv}^T+F_{inv}^F+F_{edit}^{Attn}$ | 🌐 Code

📄 Style Injection in Diffusion: A Training-free Approach for Adapting Large-scale Diffusion Models for Style Transfer | 📖 CVPR 2024 | 🔀 $F_{inv}^F+F_{edit}^{Attn}$ | 🌐 Code

📄 Z*: Zero-shot Style Transfer via Attention Rearrangement | 📖 arXiv 2023 | 🔀 $F_{inv}^F+F_{edit}^{Attn}$ | 🌐 Code

2. Training-Based Approaches


Image Translation:

1. Training-Free Approaches

📄 FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition | 📖 CVPR 2024 | 🌐 Code

2. Training-Based Approaches

📄 Adding Conditional Control to Text-to-Image Diffusion Models | 📖 ICCV 2023 | 🌐 Code

📄 T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models | 📖 AAAI 2024 | 🌐 Code

📄 SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing | 📖 CVPR 2024 | 🌐 Code

📄 Cocktail: Mixing Multi-Modality Controls for Text-Conditional Image Generation | 📖 NeurIPS 2023 | 🌐 Code

📄 Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Model | 📖 NeurIPS 2023 | 🌐 Code

📄 CycleNet: Rethinking Cycle Consistency in Text-Guided Diffusion for Image Manipulation | 📖 NeurIPS 2023 | 🌐 Code

📄 One-Step Image Translation with Text-to-Image Models | 📖 arXiv 2024 | 🌐 Code
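
ControlNet and its successors above attach a trainable copy of the UNet encoder that consumes a spatial condition (edges, depth, pose) and adds its features to the frozen backbone, turning image translation into conditional generation. Minimal diffusers usage with a Canny-edge ControlNet (checkpoint ids and paths are placeholders):

```python
import cv2
import numpy as np
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet,
    torch_dtype=torch.float16).to("cuda")

# Extract the spatial condition: a Canny edge map of the source image.
edges = cv2.Canny(np.array(Image.open("input.png").convert("RGB")), 100, 200)
cond = Image.fromarray(np.stack([edges] * 3, axis=-1))

out = pipe("a watercolor landscape", image=cond,
           num_inference_steps=30).images[0]
out.save("translated.png")
```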


Subject-Driven Customization:

1. Training-Free Approaches

📄 An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion | 📖 ICLR 2023 | 🔀 $F_{inv}^T+F_{edit}^{Norm}$ | 🌐 Code

📄 DreamArtist: Towards Controllable One-Shot Text-to-Image Generation via Positive-Negative Prompt-Tuning | 📖 arXiv 2022 | 🔀 $F_{inv}^T+F_{edit}^{Norm}$ | 🌐 Code

📄 P+: Extended Textual Conditioning in Text-to-Image Generation | 📖 arXiv 2023 | 🔀 $F_{inv}^T+F_{edit}^{Norm}$ | 🌐 Code

📄 A Neural Space-Time Representation for Text-to-Image Personalization | 📖 TOG 2023 | 🔀 $F_{inv}^T+F_{edit}^{Norm}$ | 🌐 Code

📄 DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation | 📖 CVPR 2023 | 🔀 $F_{inv}^T+F_{edit}^{Norm}$ | 🌐 Code

📄 A Data Perspective on Enhanced Identity Preservation for Diffusion Personalization | 📖 ICLR 2024 | 🔀 $F_{inv}^T+F_{edit}^{Norm}$ | 🌐 Code

📄 FaceChain-SuDe: Building Derived Class to Inherit Category Attributes for One-shot Subject-Driven Generation | 📖 CVPR 2024 | 🔀 $F_{inv}^T+F_{edit}^{Norm}$ | 🌐 Code

📄 Multi-Concept Customization of Text-to-Image Diffusion | 📖 CVPR 2023 | 🔀 $F_{inv}^T+F_{edit}^{Norm}$ | 🌐 Code

📄 Cones: Concept Neurons in Diffusion Models for Customized Generation | 📖 ICML 2023 | 🔀 $F_{inv}^T+F_{edit}^{Norm}$ | 🌐 Code

📄 SVDiff: Compact Parameter Space for Diffusion Fine-Tuning | 📖 ICCV 2023 | 🔀 $F_{inv}^T+F_{edit}^{Norm}$ | 🌐 Code

📄 Low-Rank Adaptation for Fast Text-to-Image Diffusion Fine-Tuning | 📖 | 🔀 $F_{inv}^T+F_{edit}^{Norm}$ | 🌐 Code

📄 A Closer Look at Parameter-Efficient Tuning in Diffusion Models | 📖 arXiv 2023 | 🔀 $F_{inv}^T+F_{edit}^{Norm}$ | 🌐 Code

📄 Break-A-Scene: Extracting Multiple Concepts from a Single Image | 📖 SIGGRAPH Asia 2023 | 🔀 $F_{inv}^T+F_{edit}^{Norm}$ | 🌐 Code

📄 CLiC: Concept Learning in Context | 📖 arXiv 2023 | 🔀 $F_{inv}^T+F_{edit}^{Norm}$ | 🌐 Code

📄 DisenBooth: Disentangled Parameter-Efficient Tuning for Subject-Driven Text-to-Image Generation | 📖 arXiv 2023 | 🔀 $F_{inv}^T+F_{edit}^{Norm}$ | 🌐 Code

📄 Decoupled Textual Embeddings for Customized Image Generation | 📖 AAAI 2024 | 🔀 $F_{inv}^T+F_{edit}^{Norm}$ | 🌐 Code

📄 ViCo: Detail-Preserving Visual Condition for Personalized Text-to-Image Generation | 📖 arXiv 2023 | 🔀 $F_{inv}^T+F_{edit}^{Attn}$ | 🌐 Code

📄 DreamMatcher: Appearance Matching Self-Attention for Semantically-Consistent Text-to-Image Personalization | 📖 CVPR 2024 | 🔀 $F_{inv}^T+F_{edit}^{Attn}$ | 🌐 Code

📄 Pick-and-Draw: Training-free Semantic Guidance for Text-to-Image Personalization | 📖 arXiv 2024 | 🔀 $F_{inv}^F+F_{edit}^{Optim}$ | 🌐 Code
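
Almost every entry above builds on the Textual Inversion recipe: freeze the generator, register a placeholder token, and fit only its embedding row with the ordinary diffusion loss on a few subject photos (DreamBooth-style variants unfreeze model weights as well). Below is a compressed sketch of that core update; data loading, real hyperparameters, and each paper's regularizers are omitted, and `latents` are VAE-encoded training images as in the inversion sketch further up.

```python
import torch
import torch.nn.functional as F
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5").to("cuda")
pipe.unet.requires_grad_(False)
pipe.vae.requires_grad_(False)
pipe.text_encoder.requires_grad_(False)

# Register the placeholder token; only its embedding row will be learned.
pipe.tokenizer.add_tokens("<my-subject>")
pipe.text_encoder.resize_token_embeddings(len(pipe.tokenizer))
token_id = pipe.tokenizer.convert_tokens_to_ids("<my-subject>")
embeddings = pipe.text_encoder.get_input_embeddings().weight
embeddings.requires_grad_(True)
optimizer = torch.optim.AdamW([embeddings], lr=5e-4, weight_decay=0.0)

def train_step(latents):  # latents: (1, 4, 64, 64) encoding of a subject photo
    ids = pipe.tokenizer("a photo of <my-subject>", padding="max_length",
                         max_length=77, return_tensors="pt").input_ids.to("cuda")
    emb = pipe.text_encoder(ids)[0]
    noise = torch.randn_like(latents)
    t = torch.randint(0, 1000, (1,), device="cuda")
    noisy = pipe.scheduler.add_noise(latents, noise, t)
    loss = F.mse_loss(pipe.unet(noisy, t, encoder_hidden_states=emb).sample, noise)
    loss.backward()
    # Zero out gradients for every row except the new token's.
    embeddings.grad[:token_id].zero_()
    embeddings.grad[token_id + 1:].zero_()
    optimizer.step()
    optimizer.zero_grad()
```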

2. Training-Based Approaches

📄 Taming Encoder for Zero Fine-tuning Image Customization with Text-to-Image Diffusion Models | 📖 ICLR 2024 | 🌐 Code

📄 InstantBooth: Personalized Text-to-Image Generation without Test-Time Finetuning | 📖 CVPR 2024 | 🌐 Code

📄 Encoder-based Domain Tuning for Fast Personalization of Text-to-Image Models | 📖 arXiv 2023 | 🌐 Code

📄 Enhancing Detail Preservation for Customized Text-to-Image Generation: A Regularization-Free Approach | 📖 ICLR 2024 | 🌐 Code

📄 FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention | 📖 arXiv 2023 | 🌐 Code

📄 PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding | 📖 arXiv 2023 | 🌐 Code

📄 PhotoVerse: Tuning-Free Image Customization with Text-to-Image Diffusion Models | 📖 arXiv 2023 | 🌐 Code

📄 InstantID: Zero-shot Identity-Preserving Generation in Seconds | 📖 arXiv 2024 | 🌐 Code

📄 ELITE: Encoding Visual Concepts into Textual Embeddings for Customized Text-to-Image Generation | 📖 ICCV 2023 | 🌐 Code

📄 BLIP-Diffusion: Pre-trained Subject Representation for Controllable Text-to-Image Generation and Editing | 📖 NeurIPS 2023 | 🌐 Code

📄 Domain-Agnostic Tuning-Encoder for Fast Personalization of Text-To-Image Models | 📖 SIGGRAPH 2023 | 🌐 Code

📄 Unified Multi-Modal Latent Diffusion for Joint Subject and Text Conditional Image Generation | 📖 arXiv 2023 | 🌐 Code

📄 Subject-Diffusion: Open Domain Personalized Text-to-Image Generation without Test-time Fine-tuning | 📖 arXiv 2023 | 🌐 Code

📄 Instruct-Imagen: Image Generation with Multi-modal Instruction | 📖 arXiv 2024 | 🌐 Code


Attribute-Driven Customization:

1. Training-Free Approaches

📄 ProSpect: Prompt Spectrum for Attribute-Aware Personalization of Diffusion Models | 📖 arXiv 2023 | 🔀 $F_{inv}^T+F_{edit}^{Norm}$ | 🌐 Code

📄 An Image is Worth Multiple Words: Multi-attribute Inversion for Constrained Text-to-Image Synthesis | 📖 arXiv 2023 | 🔀 $F_{inv}^T+F_{edit}^{Norm}$ | 🌐 Code

📄 Concept Decomposition for Visual Exploration and Inspiration | 📖 TOG 2023 | 🔀 $F_{inv}^T+F_{edit}^{Norm}$ | 🌐 Code

📄 ReVersion: Diffusion-Based Relation Inversion from Images | 📖 arXiv 2023 | 🔀 $F_{inv}^T+F_{edit}^{Norm}$ | 🌐 Code

📄 Learning Disentangled Identifiers for Action-Customized Text-to-Image Generation | 📖 arXiv 2023 | 🔀 $F_{inv}^T+F_{edit}^{Norm}$ | 🌐 Code

📄 Lego: Learning to Disentangle and Invert Concepts Beyond Object Appearance in Text-to-Image Diffusion Models | 📖 arXiv 2023 | 🔀 $F_{inv}^T+F_{edit}^{Norm}$ | 🌐 Code

📄 StyleDrop: Text-to-Image Generation in Any Style | 📖 NeurIPS 2023 | 🔀 $F_{inv}^T+F_{edit}^{Norm}$ | 🌐 Code

2. Training-Based Approaches

📄 ArtAdapter: Text-to-Image Style Transfer using Multi-Level Style Encoder and Explicit Adaptation | 📖 arXiv 2023 | 🌐 Code

📄 DreamCreature: Crafting Photorealistic Virtual Creatures from Imagination | 📖 arXiv 2023 | 🌐 Code

📄 Language-Informed Visual Concept Learning | 📖 ICLR 2024 | 🌐 Code

📄 pOps: Photo-Inspired Diffusion Operators | 📖 arXiv 2024 | 🌐 Code