Regional Prompting (Node only) #5916
Conversation
Force-pushed from 5be7ca4 to 86ca81c.
Impressive work. A couple of questions.
- We now have several masks being used at multiple places, including the bitmap inpainting mask, the boolean mask used in regional prompting, and the segment anything mask used in wip: Segment Anything #5829 . Are these compatible with each other? For example, can I use a segment anything mask as a regional prompting mask?
- What is the use case for a rectangular prompting mask?
Re: Masks - That is the intent. They should just be alpha-channel-only images, interchangeable in different contexts. The system does not currently seem to handle "Mask" & "Image" as distinct types, but more as useful labels of the same type of object. The Rectangular Mask is primarily for creating a Mask from a workflow without importing an existing image. It's obviously not the ideal UX, but we currently lack a better interface in the Workflow editor if the user doesn't want to leave it to create the mask. The nature of regional prompting ultimately doesn't require an exact shape, so the rectangle is "good enough" for many use-cases.
From a quick skim, all of our new mask formats are likely not compatible. I'll look into fixing that.
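As a rough sketch of what such a rectangle mask amounts to, assuming masks are single-channel images where nonzero pixels mark the prompted region (the function name and parameters below are hypothetical, not the actual node API):

```python
# Hypothetical sketch: build a single-channel ("L") mask image where the rectangle
# is filled (255) and everything outside it is empty (0).
from PIL import Image, ImageDraw

def make_rectangle_mask(width: int, height: int, x: int, y: int, rect_w: int, rect_h: int) -> Image.Image:
    mask = Image.new("L", (width, height), 0)  # 0 = outside the prompted region
    draw = ImageDraw.Draw(mask)
    draw.rectangle([x, y, x + rect_w - 1, y + rect_h - 1], fill=255)  # 255 = inside the region
    return mask

# Example: a 512x512 mask whose top-left 256x256 quadrant is the prompted region.
mask = make_rectangle_mask(512, 512, 0, 0, 256, 256)
```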
Force-pushed from 4355842 to 9da0336.
…eLatents invocation, and add handling of the conditioning region masks in DenoiseLatents.
…essor2_0 from diffusers.
…nProcessor2_0 modules.
Force-pushed from 9da0336 to 4f97192.
I started a thread here to discuss consolidating the mask types: https://discord.com/channels/1020123559063990373/1225113849489915925/1225113851121504416
I've got a branch with a WIP UI (not hooked up to any graphs). It outputs a segmented mask image like this, which I'm using for testing. I've added a couple of nodes to this branch:
I haven't been able to generate with regional prompts, though. When I recreate @RyanJDick's example workflow, I get this error:
I adapted the workflow to SD1.5 and get the same error: Rectangle Regional Prompts - SD1.5.json
I get a different error if I use the new nodes I created. Here's the workflow: Arbitrary Regional Prompts.json And the error:
I think I need a "global" positive prompt, so I added that and got a different error: Arbitrary Regional Prompts with global prompt.json
I had merged …. I tried with …. Maybe I'm missing some detail in the workflow, and/or creating the mask tensors incorrectly in my node.
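As a rough sketch of how a color-segmented mask image like the one above could be turned into per-region boolean masks, assuming each unique color marks one region (this is an assumption about the intended node behavior, not its actual implementation):

```python
import numpy as np
import torch
from PIL import Image

def split_segmented_mask(path: str) -> list[torch.Tensor]:
    """Split a color-coded segmentation image into one boolean mask per unique color."""
    rgb = np.array(Image.open(path).convert("RGB"))
    colors = np.unique(rgb.reshape(-1, 3), axis=0)  # one row per distinct color
    masks = []
    for color in colors:
        region = np.all(rgb == color, axis=-1)       # HxW boolean array for this color
        masks.append(torch.from_numpy(region))       # boolean mask tensor for this region
    return masks
```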
Here's the very rough regional prompt UI (no feedback needed at this point, it's just functional enough for testing): Screen.Recording.2024-04-08.at.6.34.03.pm.mov
…mAttnProcessor2_0. This fixes a bug in CustomAttnProcessor2_0 that was being triggered when peft was not installed. The bug was present in a block of code that was previously copied from diffusers. The bug seems to have been introduced during diffusers' migration to PEFT for their LoRA handling. The upstream bug was fixed in huggingface/diffusers@531e719.
@psychedelicious I have addressed your first error (…). Moving on to the others now.
…Also, added a clearer error message in case the same error is introduced in the future.
@psychedelicious The errors coming from your new nodes are addressed in 826f3d6
…ts to a standardized format.
This reverts commit 3a531c5.
Confirmed errors are resolved.
SD1.5 is pretty iffy in terms of putting the prompted things in the specified regions, but SDXL works great.
I've reverted my added node. It was built on incorrect assumptions about how regional prompting worked. I'll add a different node for arbitrary mask images in the PR that adds the UI.
# At some point, someone decided that schedulers that accept a generator should use the original seed with
# all bits flipped. I don't know the original rationale for this, but now we must keep it like this for
# reproducibility.
scheduler_step_kwargs = {"generator": torch.Generator(device=device).manual_seed(seed ^ 0xFFFFFFFF)}
Stochastic schedulers needed an explicitly seeded generator passed in for reproducibility. Nobody wanted to add a new stochastic_scheduler_seed or generator field to DenoiseLatentsInvocation, but they figured out how to get a seed from one of its input LatentFields. However, I feared that re-using the same seed that had already been used to generate that input would result in the same tensor, which is not the random behavior those schedulers expect. So the compromise was: when you have an operation that needs a seed, but the only seed defined has already been used earlier in the execution graph, you do some reproducible transformation like this to get a new seed for the new operation.

FWIW, I recommend factoring this seed ^ 0xFFFFFFFF up out of this method to the place that calls get_scheduler() and init_scheduler(). That makes this method simpler: the seed passed to init_scheduler is, in fact, the seed used for it; and the quirky logic about deriving the scheduler_seed goes with the code that's trying to pull a seed out of the InvocationContext's LatentFields.
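A minimal sketch of what that factoring could look like, using illustrative names rather than the actual Invoke method signatures:

```python
import torch

def make_scheduler_step_kwargs(device: torch.device, scheduler_seed: int) -> dict:
    # The seed passed in is used as-is; no hidden transformation happens in here.
    return {"generator": torch.Generator(device=device).manual_seed(scheduler_seed)}

# Caller: the quirky derivation lives next to the code that pulls `seed` out of the
# input latents, keeping the historical bit-flip for reproducibility.
seed = 1234  # e.g. the seed recovered from the input LatentsField
scheduler_step_kwargs = make_scheduler_step_kwargs(torch.device("cpu"), seed ^ 0xFFFFFFFF)
```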
What type of PR is this? (check all applicable)
Have you discussed this change with the InvokeAI team?
Have you updated all relevant documentation?
Description
This branch makes the following changes to support regional prompting:
- Adds a mask input to the compel prompt invocations (for both SD and SDXL)

Here's a sample workflow using the new nodes:
Note about Memory Usage:
When IP-Adapter and/or regional prompting are used, we use a custom attention processor. This attention processor does not currently support xformers or sliced attention, so it will use more memory than those optimized implementations when they are enabled.
The custom attention processor currently uses torch.scaled_dot_product_attention().
If there is enough demand, we could add support for xformers and sliced implementations. But, it probably makes sense to re-think our attention configuration strategy in the context of the latest improvements to torch (which supports low-memory and flash-attention modes).
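For readers unfamiliar with the mechanism, here is a heavily simplified sketch of how a regional mask can be injected into scaled-dot-product attention as an additive bias. This illustrates the general technique only, not the PR's CustomAttnProcessor2_0 implementation:

```python
import torch
import torch.nn.functional as F

def masked_cross_attention(q, k, v, region_mask):
    """
    q: (batch, heads, image_tokens, dim); k, v: (batch, heads, text_tokens, dim).
    region_mask: (batch, image_tokens, text_tokens) boolean; True where an image position
    may attend to that text token (i.e. the position lies inside the token's region).
    Assumes every image position is allowed to attend to at least one text token.
    """
    # Turn the boolean mask into an additive bias: 0 inside the region, -inf outside,
    # so disallowed positions get ~zero attention weight after the softmax.
    bias = torch.zeros_like(region_mask, dtype=q.dtype)
    bias = bias.masked_fill(~region_mask, float("-inf")).unsqueeze(1)  # broadcast over heads
    return F.scaled_dot_product_attention(q, k, v, attn_mask=bias)
```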
QA Instructions, Screenshots, Recordings
Completed Tests
Basic functionality:
Speed regression tests (run on an RTX4090):
Compatibility:
Remaining tests
Added/updated tests?