
Add support for loading single file CLIPEmbedding models #6813

Open · lstein wants to merge 3 commits into main
Conversation

@lstein (Collaborator) commented on Sep 4, 2024

Summary

We're starting to see fine-tuned CLIPEmbedding models for improved FLUX performance appearing in the wild, for example https://huggingface.co/zer0int/CLIP-GmP-ViT-L-14. However, these fine-tunes are distributed as single "checkpoint"-style files rather than as Transformers-compatible folders. This PR adds support for installing and loading these models.
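For illustration, a minimal sketch of what loading such a single-file checkpoint involves, assuming (as the "HF" variants in that repo do) that the tensor names already match the Transformers CLIPTextModel layout. The filename is hypothetical and this is not the loader code in this PR:

```python
from safetensors.torch import load_file
from transformers import CLIPTextConfig, CLIPTextModel

# Hypothetical filename; substitute whichever "HF" variant was downloaded.
checkpoint_path = "clip-gmp-vit-l-14-te-only-hf.safetensors"

# Build an empty ViT-L/14 text encoder from the standard config, then load
# the single-file weights into it. strict=False tolerates incidental buffers
# (e.g. position_ids) that some checkpoints omit or include.
config = CLIPTextConfig.from_pretrained("openai/clip-vit-large-patch14")
model = CLIPTextModel(config)
model.load_state_dict(load_file(checkpoint_path), strict=False)
```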

Related Issues / Discussions

There is a problem with this implementation. The CLIP text embedder needs two models: the encoder and the tokenizer. When FLUX support was added to InvokeAI, these two models were grouped together under a single folder and treated as two submodels:

└── clip-vit-large-patch14
    ├── text_encoder
    │   ├── config.json
    │   └── model.safetensors
    └── tokenizer
        ├── merges.txt
        ├── special_tokens_map.json
        ├── tokenizer_config.json
        └── vocab.json
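For reference, a folder laid out this way loads with stock Transformers calls; a minimal sketch (the local path is illustrative):

```python
from transformers import CLIPTextModel, CLIPTokenizer

model_dir = "models/clip-vit-large-patch14"  # hypothetical local install path

# Each submodel lives in its own subfolder of the same model directory.
text_encoder = CLIPTextModel.from_pretrained(model_dir, subfolder="text_encoder")
tokenizer = CLIPTokenizer.from_pretrained(model_dir, subfolder="tokenizer")
```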

However, the single-file format contains just the text encoder, not the auxiliary files needed for the tokenizer. As a workaround, when a single-file CLIPEmbed model's tokenizer is requested, I call CLIPTokenizer.from_pretrained() to download the tokenizer from the InvokeAI/clip-vit-large-patch14 HF repository. Once downloaded, it is cached in the HuggingFace cache, so subsequent accesses do not require the network. This is preferable to loading the tokenizer from the locally installed clip model because (1) there is no guarantee that it has been installed previously; and (2) doing so would be incredibly ugly, requiring the low-level loader to communicate with the high-level model manager.
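A minimal sketch of that fallback. The helper name is hypothetical, and the subfolder argument assumes the repo layout shown above; only the CLIPTokenizer.from_pretrained() call reflects the mechanism described:

```python
from transformers import CLIPTokenizer

def get_tokenizer_for_single_file_clip() -> CLIPTokenizer:
    """Hypothetical helper: fetch the tokenizer for a single-file CLIPEmbed model."""
    # The checkpoint carries only encoder weights, so the tokenizer comes from
    # a known-good HF repo. The first call hits the network; later calls are
    # served from the local HuggingFace cache.
    return CLIPTokenizer.from_pretrained(
        "InvokeAI/clip-vit-large-patch14", subfolder="tokenizer"
    )
```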

The main downside is that the first time the tokenizer is needed, the backend will hit the network, which is something we are trying to avoid (see PR #6740).

QA Instructions

Use the model manager tab to install one of the "HF"-format CLIPTextModel checkpoints located at https://huggingface.co/zer0int/CLIP-GmP-ViT-L-14, then try to render with it. The Tokenizer and TextEncoder should load and run successfully.

Merge Plan

Merge when approved.

Checklist

  • The PR has a short but descriptive title, suitable for a changelog
  • Tests added / updated (if applicable)
  • Documentation added / updated (if applicable)

@github-actions bot added the python (PRs that change python files) and backend (PRs that change backend files) labels on Sep 4, 2024