Load single-file checkpoints directly without conversion #6510
…erformance is poor
I did a first pass over the code. I still need to find some time to test it all.
The issue that you called out where the DB migration requires the convert_cache_path config value is quite awkward. Generally, this will be an issue anytime a migration depends on a config value.
A few general strategies come to mind:
- If a migration references a config value, then to be rigorous we should never remove that config value. Of course, we could remove it as part of a hard cutoff at some point in the future.
- Use our current migration pattern to handle both config / DB migrations in the same place. I.e. it would become more of an app migration system rather than a DB migration system. We already do some file cleanup in our migrations, so it's not a huge stretch to handle config migrations there as well. I haven't looked at the implications for the app launch logic though.
- Instruct users on how to manually remove their convert cache rather than doing it in a migration.
What do you think about all of this? I think I saw a PR at one point where you were exploring config migration patterns, so you have probably spent more time thinking about this than me.
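To make the second strategy concrete, here is a minimal sketch of an "app migration" step that handles filesystem cleanup and config cleanup in one place. It is not the project's migrator: `AppState`, `remove_convert_cache`, `STEPS`, `run_migrations`, and the `app_version` table are all hypothetical names, and the config is modeled as a plain dict.

```python
import shutil
import sqlite3
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Callable


@dataclass
class AppState:
    """Everything a migration step may touch (hypothetical container)."""
    db: sqlite3.Connection
    config: dict
    root: Path


# A migration step mutates AppState in place.
Step = Callable[[AppState], None]


def remove_convert_cache(state: AppState) -> None:
    # Read the config value, use it, and only then drop it, so the
    # filesystem cleanup never runs after the value has disappeared.
    rel = state.config.get("convert_cache_dir", "models/.convert_cache")
    shutil.rmtree(state.root / rel, ignore_errors=True)
    state.config.pop("convert_cache_dir", None)
    state.config.pop("convert_cache", None)


# Assumed registry: version number -> step.
STEPS = {1: remove_convert_cache}


def run_migrations(state: AppState) -> None:
    state.db.execute(
        "CREATE TABLE IF NOT EXISTS app_version (version INTEGER NOT NULL)"
    )
    row = state.db.execute("SELECT MAX(version) FROM app_version").fetchone()
    current = row[0] or 0  # MAX() is NULL on an empty table
    for version in sorted(v for v in STEPS if v > current):
        STEPS[version](state)
        state.db.execute(
            "INSERT INTO app_version (version) VALUES (?)", (version,)
        )
    state.db.commit()
```

Because the step owns both the cleanup and the removal of the config keys, the config value can be dropped as soon as the step has run, which avoids the "never remove a config value" constraint from the first strategy.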
At one point I had written a system for registering config migrations using decorators in which each migration step was executed by a separate function decorated with |
…sion.py Co-authored-by: Ryan Dick <[email protected]>
…sion.py Co-authored-by: Ryan Dick <[email protected]>
…keAI into lstein/feat/load-one-file
Maybe we should handle config migration in the DB migrator, since it's doing more than DB migrations now? I'd prefer to work through that separately from this PR, though.
As I review again, I'm realizing that we don't really have a strategy for config version numbers. It seems like the config version was in sync with app versions at one point. Now it's neither in sync nor following semver.
Given this plus the awkwardness with handling convert_cache_dir, maybe we should just leave both convert_cache and convert_cache_dir as deprecated fields, and figure out a proper solution in a future PR. (I'm also not opposed to removing convert_cache now - it doesn't make the situation any worse.)
I agree. We need to work out a system for coordinating the app, config file, and database schema version numbers. We should probably implement a single migration system that is adapted for config schema, db schema and root filesystem migrations. What would happen if we had a rule that we use the app version for the config and db schemas? The main complication would be that each time we make a breaking schema change, we would have to update the app version within the PR. Given that PRs are merged at different times and in different orders, this would be a management headache.
I did a bunch of manual testing. Here are the results.
Test plan
Issues encountered that were not introduced by this PR:
Issues introduced by this PR:
There is a reproducibility problem with some models before/after this PR. This issue did not seem to affect all models.
I just noticed that we are still on |
Agreed. For the reason you've described, I think at the very least we'd want an internal app version separate from the app release version. There might be reasons for further splitting which systems get versioned together, but I'll save that discussion for another place.
There are some significant changes involved in moving to 0.28.0, and these have gone into PR #6512 for separate review. Now that you've done all this testing I'm not comfortable including them in this PR. For example, with 0.28 and higher, the |
3dbcf9a to 5c8cf99
With that PR, this SDXL VAE is recognized and loaded. However, it looks washed out and I don't know if there's something wrong with the model or with the loading. I also tried with the older diffusers conversion method, and got the same washed out result, so I tend to think it is a model problem.
Yes, it is in diffusers format. To load it we would need to put it into a folder, rename it
This seems to be a controlnet format that I haven't run into yet. It will need a bit of new backend support: e.g. https://github.com/kohya-ss/ControlNet-LLLite-ComfyUI. There are a total of five LLLite controlnet models on civitai. It looks kind of nice. Do you think we should implement support, or wait until someone makes a feature request?
I noticed that sometimes I didn't get the right VAE when remixing. Thanks for figuring out the way to reproduce the problem. I'll make this a bug report.
I was unable to reproduce this. However, I think there is a problem at the frontend level in which the human-readable name and description of starter models are not passed to the installer, so that you get the automatically-inferred model name and description.
Indeed. I'll make this a bug report.
@RyanJDick Are there any remaining issues?
I think the coordinated config, filesystem and db migration system needs its own PR.
Summary
This PR takes advantage of the new diffusers load_single_file() method to load single-file checkpoints (e.g. .safetensors) directly. It completely removes the convert cache directory, avoiding model duplication and conserving the user's disk space. Single file loaders for main models, VAEs, and controlnets have been implemented.
Manual conversion from single files to diffusers directories is still supported. As before, this is a one-way operation which deletes the original .safetensors file (if it is in the models directory).
Loading is nice and fast with no discernible difference in rendering performance. The main drawback of single file loading is that the entire pipeline is loaded into RAM at once, causing a surge in RAM usage compared to loading a diffusers submodel. However, after loading all pipeline components into RAM, its submodels are cached individually, and subsequent load calls will return the cached submodel as long as it remains cached. There may be a performance degradation if the user has set the RAM cache size to be too small for a whole pipeline to fit into. Under these circumstances a single-file pipeline will be re-read from disk each time it is needed.
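The caching behaviour described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not the model manager's cache: `SubmodelCache`, its LRU eviction policy, and the size units are all hypothetical.

```python
from collections import OrderedDict


class SubmodelCache:
    """Toy LRU cache keyed by (model, submodel); sizes are arbitrary units."""

    def __init__(self, max_size: int) -> None:
        self.max_size = max_size
        self.disk_reads = 0  # how many times a whole checkpoint file was read
        self._items = OrderedDict()

    def _put(self, key: tuple, size: int) -> None:
        self._items[key] = size
        self._items.move_to_end(key)
        # Evict least recently used entries until the cache fits again.
        while sum(self._items.values()) > self.max_size and len(self._items) > 1:
            self._items.popitem(last=False)

    def load(self, model: str, submodel: str, sizes: dict) -> None:
        key = (model, submodel)
        if key in self._items:
            self._items.move_to_end(key)  # cache hit: no disk access
            return
        # Cache miss: the single-file checkpoint is read in full, then every
        # submodel is cached individually. The requested submodel is stored
        # last so that it is the most recently used entry.
        self.disk_reads += 1
        for name in sorted(sizes, key=lambda n: n == submodel):
            self._put((model, name), sizes[name])


sizes = {"text_encoder": 1, "unet": 5, "vae": 1}
big, small = SubmodelCache(max_size=10), SubmodelCache(max_size=5)
for cache in (big, small):
    for part in ("unet", "vae", "text_encoder"):
        cache.load("some-model", part, sizes)
print(big.disk_reads, small.disk_reads)
```

With a cache large enough for the whole pipeline, the file is read once and the later submodel requests are cache hits. With the undersized cache, each request evicts the other submodels, so the file is read again every time.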
It might be a good idea for the manager to produce a warning message when the cache size is too small to hold a whole pipeline. Let me know if you think this should be implemented.
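One possible shape for such a warning, as a rough sketch: `warn_if_cache_too_small` and the logger name are hypothetical, and the pipeline size is assumed to be supplied by the caller (for example, estimated from the checkpoint's size on disk, which is itself an assumption).

```python
import logging

logger = logging.getLogger("model_cache_sketch")  # hypothetical logger name
GB = 2**30


def warn_if_cache_too_small(pipeline_bytes: int, ram_cache_gb: float) -> bool:
    """Log a warning and return True if the pipeline cannot fit in the cache."""
    cache_bytes = int(ram_cache_gb * GB)
    if pipeline_bytes > cache_bytes:
        logger.warning(
            "RAM cache (%.1f GB) is smaller than the pipeline (%.1f GB); "
            "the checkpoint will be re-read from disk each time it is needed.",
            ram_cache_gb,
            pipeline_bytes / GB,
        )
        return True
    return False
```

Returning a boolean as well as logging keeps the check easy to test and lets a caller decide whether to surface the message in the UI.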
Related Issues / Discussions
See the Discord thread starting at: https://discord.com/channels/1020123559063990373/1049495067846524939/1239774642449813514
QA Instructions
models/.convert_cache directory, migrate the database schema to v12, and upgrade the config schema to 4.0.2. These migrations remove the convert cache directory and the config_cache size configuration option. They do not remove the config_cache_dir configuration option because the database migration needs to know where the user stored the converted files in order to delete them!
ram configuration option to a low value.
Merge Plan
Squash merge when approved.
Checklist