
Redo custom attention processor to support other attention types #6550

Open · wants to merge 1 commit into base: main

Conversation

Contributor

@StAlKeR7779 StAlKeR7779 commented Jun 27, 2024

Summary

The current custom attention processor implements only the torch-sdp attention type, so whenever an IP-Adapter or a regional prompt is used, we override the model to run torch-sdp attention.
The new attention processor combines the four attention processors (normal, sliced, xformers, torch-sdp) by moving the parts of attention that differ (mask preparation and the attention computation itself) into a separate function call, where the implementation required for the configured type is executed.
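The dispatch idea described above can be sketched in pure Python. All names here are hypothetical and the math is a reference scaled-dot-product implementation, not the actual InvokeAI code: a real processor would call `torch.nn.functional.scaled_dot_product_attention` or the xformers kernels inside the type-specific branch, while the shared mask preparation and projection logic stays in the common path.

```python
import math


def _softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def _sdp_attention(q, k, v):
    # Reference scaled dot-product attention on lists of row vectors.
    # Stands in for the "normal" path (and for torch-sdp/xformers,
    # which would call their library kernels in real code).
    d = len(q[0])
    out = []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        weights = _softmax(scores)
        out.append(
            [sum(w * vj[c] for w, vj in zip(weights, v)) for c in range(len(v[0]))]
        )
    return out


def _sliced_attention(q, k, v, slice_size=1):
    # Process query rows in slices to bound peak memory; each row of the
    # output is independent, so slicing does not change the result.
    out = []
    for start in range(0, len(q), slice_size):
        out.extend(_sdp_attention(q[start:start + slice_size], k, v))
    return out


def run_attention(q, k, v, attention_type="torch-sdp"):
    # Hypothetical dispatch point: the shared processor code calls this
    # once mask preparation is done, and only this part varies by type.
    if attention_type in ("normal", "torch-sdp", "xformers"):
        return _sdp_attention(q, k, v)
    if attention_type == "sliced":
        return _sliced_attention(q, k, v)
    raise ValueError(f"unknown attention_type: {attention_type}")
```

Because only this one call varies, features like IP-Adapter and regional prompting can live in the shared processor code and still honor the user's configured attention type.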

Related Issues / Discussions

None

QA Instructions

Change `attention_type` in `invokeai.yaml`, then run a generation with an IP-Adapter or a regional prompt.
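For reference, a config fragment along these lines selects the attention type (the exact set of accepted values depends on the InvokeAI version; this is an illustrative example, not a verbatim copy of any shipped config):

```yaml
# invokeai.yaml (fragment)
attention_type: sliced   # e.g. normal, xformers, sliced, torch-sdp
```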

Merge Plan

None?

Checklist

  • The PR has a short but descriptive title, suitable for a changelog
  • Tests added / updated (if applicable)
  • Documentation added / updated (if applicable)

@dunkeroni @RyanJDick

@github-actions github-actions bot added python PRs that change python files backend PRs that change backend files labels Jun 27, 2024
@StAlKeR7779 StAlKeR7779 marked this pull request as ready for review June 27, 2024 14:02
@RyanJDick
Collaborator

I haven't looked at the code yet, but do you know if there are still use cases for attention processors other than Torch 2.0 SDP? Based on the benchmarking that diffusers has done, it seems like the all-around best choice. But maybe there are still reasons to use other implementations, e.g. very-low-VRAM systems?

@StAlKeR7779
Contributor Author

I thought roughly the same:
  • normal - generally no need for it
  • xformers - if torch-sdp is on par or even faster, then it too can be removed
  • sliced - yes, it's suitable for low-memory situations, and I think it's the main attention type for MPS

@psychedelicious
Collaborator

On CUDA, torch's SDP was faster than xformers for me when I last checked, a month or so back. IIRC it was just a couple of percent faster.
