Which cross-attention optimization technique is best? Could someone please clarify when to use it and why? #846
-
simple answer - it depends :) first came split-attention, then InvokeAI's optimization, then Doggettx. those three are largely platform-agnostic, so they can be used regardless of hardware. xFormers is heavily optimized for torch+CUDA on NVIDIA, so if you have that, it used to be the best choice. and the separate deterministic option is just that - optional: since cross-attention optimization can lead to non-deterministic results (tiny differences in images even if the settings are the same), enabling it gives you deterministic mode at a small cost. if there were a simple answer, it would be pre-set and there wouldn't be a configurable option.
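for context, here is a minimal sketch of how the same choices look in code, using the diffusers API rather than the web UI's own option names (the model id is just an example, and the fallback logic is an assumption about how one might wire it up, not the UI's actual implementation): prefer xFormers on NVIDIA + CUDA, fall back to platform-agnostic SDP otherwise, and optionally flip PyTorch's deterministic switches for reproducibility at a small speed cost.

```python
import torch
from diffusers import StableDiffusionPipeline

# Example model; any SD checkpoint works the same way.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# xFormers: fastest path on NVIDIA + CUDA when the package is installed.
try:
    pipe.enable_xformers_memory_efficient_attention()
except Exception:
    # Fall back to PyTorch scaled-dot-product attention (the default in
    # recent torch/diffusers versions), which is platform-agnostic.
    pass

# Optional: trade a little speed for run-to-run reproducibility, since the
# optimized attention kernels are not bit-exact between runs.
torch.use_deterministic_algorithms(True, warn_only=True)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False
```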
-
On an RX 6900 XT I find that sub-quadratic is the slowest but uses the least VRAM. That is useful when you want to generate something high-resolution without running into OOM errors. In my specific case it would actually be great if it could switch between SDP and sub-quadratic depending on predicted memory usage, or simply above a certain resolution.
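as a rough illustration of that "switch by resolution" idea, here is a minimal sketch assuming a diffusers pipeline: the 768x768 threshold is an arbitrary placeholder rather than a measured crossover point, and attention slicing stands in for the sub-quadratic kernel only because it is the low-memory switch diffusers exposes directly.

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

def generate(prompt: str, width: int, height: int):
    # Self-attention cost in the UNet grows with the square of the number of
    # latent tokens, roughly (width * height / 64) ** 2 for SD 1.x, so large
    # images are where the low-memory path pays off.
    if width * height > 768 * 768:
        pipe.enable_attention_slicing()   # slower, much lower peak VRAM
    else:
        pipe.disable_attention_slicing()  # fastest path (SDP / xFormers)
    return pipe(prompt, width=width, height=height).images[0]

image = generate("a lighthouse at dusk", width=1024, height=1024)
```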
-
simple question, can someone explain it to me, please?