
Load non-quantized T5 encoder using the same dtype the model is saved in #7140

Merged — 2 commits merged into invoke-ai:main on Oct 22, 2024

Conversation

Vargol
Contributor

@Vargol Vargol commented Oct 17, 2024

Summary

While testing the possibility of using GGUF to load the T5 encoder, I noticed that an fp16 T5 encoder in GGUF format wasn't using the extreme amount of memory on macOS that the base encoder was, despite the base encoder being bf16.
I tried forcing the base encoder to load as bfloat16 and that had the same effect: memory usage dropped, and model load times fell by 60–80%.

The transformers documentation says that by default a model is loaded in the torch default dtype, not the dtype it was saved in.
So I changed the load to use `torch_dtype="auto"`, which loads the model in its saved dtype.
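To make the dtype rule concrete, here is a minimal sketch of the selection logic described above. `resolve_load_dtype` is a hypothetical helper written for illustration, not part of the transformers API; it only mimics the documented behaviour of `from_pretrained`'s `torch_dtype` argument.

```python
def resolve_load_dtype(requested, checkpoint_dtype, torch_default="float32"):
    """Return the dtype a model ends up in after loading.

    requested: the torch_dtype argument ("auto", an explicit dtype name
               such as "bfloat16", or None for the default behaviour).
    checkpoint_dtype: the dtype the weights were saved in (from config.json).
    """
    if requested is None:
        # No hint given: weights are cast to the torch default dtype.
        return torch_default
    if requested == "auto":
        # "auto" honours the dtype recorded in the checkpoint config.
        return checkpoint_dtype
    # An explicit dtype overrides both.
    return requested

# A bf16 T5 encoder loaded without torch_dtype ends up in float32,
# roughly doubling memory; with "auto" it stays bf16.
print(resolve_load_dtype(None, "bfloat16"))    # float32
print(resolve_load_dtype("auto", "bfloat16"))  # bfloat16
```

This is why passing `"auto"` cuts memory use for a bf16 checkpoint: the weights are no longer upcast to the 32-bit torch default at load time.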

Related Issues / Discussions

No issue was raised, since macOS requires #7113 to use Flux by default (upgrading to the torch nightlies works as well), and ideally you'd want to combine this change with that PR.
I did bring this up on Discord: https://discord.com/channels/1020123559063990373/1049495067846524939/1296111527044186144

QA Instructions

Tested on macOS with and without #7113.
Needs testing on non-macOS systems to make sure it doesn't break anything.
There may be some differences in generated images if the same dtype change happens there,
as float32 and bfloat16 calculations differ, and hence so do the resulting images.

Merge Plan

Should be a straightforward merge, as it's only a one-line change.

@github-actions github-actions bot added python PRs that change python files backend PRs that change backend files labels Oct 17, 2024
Collaborator

@brandonrising brandonrising left a comment


Seems to be working as described for me, have any concerns with this @RyanJDick ?

@psychedelicious psychedelicious merged commit 24f9b46 into invoke-ai:main Oct 22, 2024
14 checks passed