
Fix Flux: clip_l support (SD3/3.5 improvements included) #397

Open
wants to merge 11 commits into master

Conversation

@stduhpf (Contributor) commented Sep 4, 2024

Fixes #396

I made the CLIP backend skip the final text projection when the text_projection tensor doesn't exist. This is mathematically equivalent to replacing text_projection with the identity matrix (i.e. torch.eye()).
I also replaced the matrix multiplication with a bias-less linear layer when text_projection does exist (somehow this changes the outcome).

Flux.1 Schnell (q3_k):

[image comparison: master vs PR outputs, using clip-L and ViT-L-14 text encoders]

SD3 2B (q8_0):

[image comparison: master vs PR outputs]

SD3.5 Large Turbo (q4_1):

[image comparison: master vs PR outputs]

@stduhpf (Contributor, Author) commented Sep 5, 2024

If anyone knows how this is all supposed to work, feel free to improve the code.

@stduhpf stduhpf marked this pull request as draft September 13, 2024 10:05
@stduhpf (Contributor, Author) commented Sep 13, 2024

I'm pretty sure this is not the way it's supposed to work, so I'm converting this PR to a draft until I figure out how to make it work properly. Right now, I think this is only adding some "noise" to the t5 prompt.

@stduhpf (Contributor, Author) commented Sep 13, 2024

OK, I believe I got it now.

@stduhpf stduhpf marked this pull request as ready for review September 13, 2024 16:33
@stduhpf (Contributor, Author) commented Sep 13, 2024

Hard-coding the prompt for clip to always be "Painting, in the style of starry night by Van Gogh", while keeping the same example prompt for T5 ("a lovely cat holding a sign says 'flux.cpp'") now gives this result:
[test output image]
I'm 99.99% certain this PR is now ready for merge.

Patch for hardcoded Clip prompt
diff --git a/conditioner.hpp b/conditioner.hpp
index 8a710d1..75efb08 100644
--- a/conditioner.hpp
+++ b/conditioner.hpp
@@ -1038,9 +1038,10 @@ struct FluxCLIPEmbedder : public Conditioner {
         std::vector<float> t5_weights;
         for (const auto& item : parsed_attention) {
             const std::string& curr_text = item.first;
+            const std::string& curr_text_l = "Painting, in the style of starry night by Van Gogh";
             float curr_weight            = item.second;
 
-            std::vector<int> curr_tokens = clip_l_tokenizer.encode(curr_text, on_new_token_cb);
+            std::vector<int> curr_tokens = clip_l_tokenizer.encode(curr_text_l, on_new_token_cb);
             clip_l_tokens.insert(clip_l_tokens.end(), curr_tokens.begin(), curr_tokens.end());
             clip_l_weights.insert(clip_l_weights.end(), curr_tokens.size(), curr_weight);
 

Edit: I tried this test with my previous attempt at fixing clip (e6314d3) and I got a similar result. It was actually working-ish even though it was definitely not implemented like in the official Flux inference code.

Comparison

(all of those with the same hardcoded prompt for clip_l)

[image comparison: PR (af4f83f) vs previous attempt (d7679c9) vs master]

@stduhpf stduhpf changed the title Fix Flux: clip_l support Fix Flux: clip_l support (SD3 improvements included) Sep 13, 2024
@stduhpf (Contributor, Author) commented Sep 18, 2024

@leejet does this look good enough to merge?

conditioner.hpp Outdated
@@ -1073,7 +1065,7 @@ struct FluxCLIPEmbedder : public Conditioner {
return {{clip_l_tokens, clip_l_weights}, {t5_tokens, t5_weights}};
}

SDCondition get_learned_condition_common(ggml_context* work_ctx,
SDCondition get_learned_condition_common(ggml_context* work_ctx,
A Contributor commented: extra spaces?

clip.hpp Outdated
@@ -711,7 +711,11 @@ class CLIPTextModel : public GGMLBlock {
if (return_pooled) {
auto text_projection = params["text_projection"];
ggml_tensor* pooled = ggml_view_1d(ctx, x, hidden_size, x->nb[1] * max_token_idx);
pooled = ggml_mul_mat(ctx, ggml_cont(ctx, ggml_transpose(ctx, text_projection)), pooled);
if(text_projection != NULL){
A Contributor commented: try to fit the formatting in more with the surrounding code (spaces).

@stduhpf (Contributor, Author) commented Oct 24, 2024

This breaks SD3.5 support somehow!

@stduhpf stduhpf marked this pull request as draft October 24, 2024 14:24
@stduhpf stduhpf mentioned this pull request Oct 24, 2024
@stduhpf (Contributor, Author) commented Oct 24, 2024

OK, it works again now.

@stduhpf stduhpf marked this pull request as ready for review October 24, 2024 16:08
@stduhpf stduhpf changed the title Fix Flux: clip_l support (SD3 improvements included) Fix Flux: clip_l support (SD3/3.5 improvements included) Oct 24, 2024
Successfully merging this pull request may close these issues.

Bug: Clip-L does absolutely nothing with Flux models.