Support Flux IP Adapter #10261
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Small nits. Looks good otherwise.
```python
@slow
@require_big_gpu_with_torch_cuda
@pytest.mark.big_gpu_with_torch_cuda
class FluxIPAdapterPipelineSlowTests(unittest.TestCase):
```
@hlky Could we add a fast test using something similar to what's been done here?
diffusers/tests/pipelines/test_pipelines_common.py
Lines 269 to 301 in 9020086
```python
def _modify_inputs_for_ip_adapter_test(self, inputs: Dict[str, Any]):
    parameters = inspect.signature(self.pipeline_class.__call__).parameters
    if "image" in parameters.keys() and "strength" in parameters.keys():
        inputs["num_inference_steps"] = 4

    inputs["output_type"] = "np"
    inputs["return_dict"] = False
    return inputs

def test_ip_adapter(self, expected_max_diff: float = 1e-4, expected_pipe_slice=None):
    r"""Tests for IP-Adapter.

    The following scenarios are tested:
      - Single IP-Adapter with scale=0 should produce same output as no IP-Adapter.
      - Multi IP-Adapter with scale=0 should produce same output as no IP-Adapter.
      - Single IP-Adapter with scale!=0 should produce different output compared to no IP-Adapter.
      - Multi IP-Adapter with scale!=0 should produce different output compared to no IP-Adapter.
    """
    # Raising the tolerance for this test when it's run on a CPU because we
    # compare against static slices and that can be shaky (with a VVVV low probability).
    expected_max_diff = 9e-4 if torch_device == "cpu" else expected_max_diff

    components = self.get_dummy_components()
    pipe = self.pipeline_class(**components).to(torch_device)
    pipe.set_progress_bar_config(disable=None)
    cross_attention_dim = pipe.unet.config.get("cross_attention_dim", 32)

    # forward pass without ip adapter
    inputs = self._modify_inputs_for_ip_adapter_test(self.get_dummy_inputs(torch_device))
    if expected_pipe_slice is None:
        output_without_adapter = pipe(**inputs)[0]
    else:
        output_without_adapter = expected_pipe_slice
```
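For reference, the pass/fail criterion in these tests boils down to a max absolute difference check between output slices. A minimal sketch of that comparison (the array values below are made up for illustration, not taken from the test suite):

```python
import numpy as np

def max_diff(a: np.ndarray, b: np.ndarray) -> float:
    """Max absolute elementwise difference, as used for slice comparisons."""
    return float(np.abs(a - b).max())

# With scale=0 the IP-Adapter path should be a no-op, so the output should
# match the no-adapter output within tolerance; with scale!=0 it should differ.
expected_max_diff = 1e-4
out_without_adapter = np.array([0.5, 0.25, 0.75])      # hypothetical slice
out_scale_zero = np.array([0.50002, 0.24999, 0.75001])  # hypothetical slice
assert max_diff(out_without_adapter, out_scale_zero) < expected_max_diff
```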
Done.
Left one comment, looks good otherwise!
* Flux IP-Adapter
* test cfg
* make style
* temp remove copied from
* fix test
* fix test
* v2
* fix
* make style
* temp remove copied from
* Apply suggestions from code review

Co-authored-by: YiYi Xu <[email protected]>

* Move encoder_hid_proj to inside FluxTransformer2DModel
* merge
* separate encode_prompt, add copied from, image_encoder offload
* make
* fix test
* fix
* Update src/diffusers/pipelines/flux/pipeline_flux.py
* test_flux_prompt_embeds change not needed
* true_cfg -> true_cfg_scale
* fix merge conflict
* test_flux_ip_adapter_inference
* add fast test
* FluxIPAdapterMixin not test mixin
* Update pipeline_flux.py

---------

Co-authored-by: YiYi Xu <[email protected]>
What does this PR do?
Adds support for XLabs Flux IP Adapter.
Example
flux-ip-adapter-v2
Details
Note: `true_cfg=1.0` is important, and `strength` is sensitive; a fixed strength may not work. See here for more strength schedules — good results will require experimentation with strength schedules and the start/stop values. Results also vary with the input image; I had no success with the statue image used for the v1 test. Multiple input images are not yet supported (dev note: apply `torch.mean` to the batch of `image_embeds` and to `ip_attention`).

Notes
- Set `--timestep_to_start_cfg` greater than the number of steps to disable CFG. Similar to the `pipeline_flux_with_cfg` community example, except we run positive and negative separately.
- The image encoder is loaded with `load_ip_adapter`. `load_ip_adapter` supports `image_encoder_pretrained_model_name_or_path`, e.g. `"openai/clip-vit-large-patch14"`, rather than just `image_encoder_folder`, and also supports `image_encoder_dtype` with default `torch.float16`.
- Changes are needed in `FluxTransformerBlock` because of where `ip_attention` is applied to the `hidden_states`; see here in the original codebase.
- flux-ip-adapter-v2 will be fixed and tested shortly.

Fixes #9825
Fixes #9403
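Since fixed strength may not work well, one way to experiment is with a per-step strength schedule gated by start/stop values. This is only an illustrative sketch — the function name, signature, and schedule shape are my own, not part of this PR:

```python
def ramp_schedule(num_steps: int, strength: float = 0.6,
                  start: float = 0.1, stop: float = 0.7) -> list:
    """Hypothetical per-step IP-Adapter strength schedule: zero outside the
    [start, stop] fraction of the sampling run, constant strength inside."""
    schedule = []
    for i in range(num_steps):
        frac = i / max(num_steps - 1, 1)  # progress through sampling, 0..1
        schedule.append(strength if start <= frac <= stop else 0.0)
    return schedule

# e.g. 10 steps: adapter inactive at the very start and near the end
print(ramp_schedule(10))
```

A schedule like this could be fed step-by-step to the pipeline's adapter scale; which shape works best will depend on the input image, per the note above.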
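The dev note above suggests supporting multiple input images by averaging with `torch.mean`. A minimal sketch of that idea on dummy tensors (the shapes are illustrative, not taken from the pipeline):

```python
import torch

# Hypothetical embeddings for 3 input images, each of shape (num_tokens, dim)
image_embeds = torch.randn(3, 4, 8)

# Collapse the image batch into a single embedding by averaging over the
# batch dimension, as the dev note suggests; the same reduction would be
# applied to ip_attention.
mean_embeds = torch.mean(image_embeds, dim=0)
print(mean_embeds.shape)  # torch.Size([4, 8])
```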
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@sayakpaul @yiyixuxu @DN6