Stable Diffusion 3 local support #1018

andrewfrench · 2024-07-25T00:02:31Z

I have read and agree to the contributing guidelines for submitting new pull requests.
Docs
Tests

Describe your changes

This PR introduces drivers that can be used to generate images using Stable Diffusion 3 locally.

A model-agnostic HuggingFaceDiffusionPipelineImageGenerationDriver creates a HuggingFace diffusers pipeline and runs inference on it. Three new model drivers (StableDiffusion3PipelineImageGenerationModelDriver, StableDiffusion3Img2ImgPipelineImageGenerationModelDriver, and StableDiffusion3ControlNetPipelineImageGenerationDriver) extend BaseDiffusionPipelineImageGenerationModelDriver to specify how to prepare the inference pipeline and how to format pipeline inputs.

andrewfrench · 2024-07-25T02:23:01Z

...mage_generation_model/stable_diffusion_3_img_2_img_pipeline_image_generation_model_driver.py

+            raise NotImplementedError(
+                "StableDiffusion3Img2ImgPipeline does not yet support loading from a single file."
+            )


For eventual ComfyUI convenience, we accept three model input types:

path to a single file containing a model

path to a directory containing model files

HuggingFace model repo name

The one exception to this is the StableDiffusion3Img2ImgPipeline, which doesn't support .from_single_file(). Models can still be loaded by path to a local directory or by model repo name (and not downloaded again if they're cached locally).
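The dispatch described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `choose_loader` and its `supports_single_file` flag are hypothetical names, and the function only returns the name of the loader (`from_single_file` or `from_pretrained`) rather than calling diffusers.

```python
import os
import tempfile


def choose_loader(model: str, supports_single_file: bool = True) -> str:
    """Return the name of the pipeline loader to use for `model`.

    Hypothetical helper for illustration only.
    """
    if os.path.isfile(model):
        # `model` is a path to a single file containing a model.
        if not supports_single_file:
            raise NotImplementedError(
                "This pipeline does not support loading from a single file."
            )
        return "from_single_file"
    # Otherwise `model` is a local directory or a HuggingFace model repo name.
    return "from_pretrained"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        checkpoint = os.path.join(directory, "model.safetensors")
        open(checkpoint, "w").close()
        print(choose_loader(checkpoint))
        print(choose_loader(directory))
    # Placeholder repo name, not a real model.
    print(choose_loader("some-org/some-model"))
```

Keeping the check in one place means a pipeline without single-file support only needs to flip one flag instead of duplicating the path logic.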

codecov · 2024-07-25T02:26:46Z

Codecov Report

Attention: Patch coverage is 92.65537% with 13 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
...le_diffusion_3_image_generation_pipeline_driver.py	86.79%	1 Missing and 6 partials ⚠️
...on/huggingface_pipeline_image_generation_driver.py	94.11%	1 Missing and 1 partial ⚠️
...n_3_controlnet_image_generation_pipeline_driver.py	95.34%	0 Missing and 2 partials ⚠️
...on_3_img_2_img_image_generation_pipeline_driver.py	94.44%	0 Missing and 2 partials ⚠️

collindutter · 2024-07-25T18:29:37Z

pyproject.toml

@@ -134,6 +138,14 @@ drivers-observability-datadog = [
    "opentelemetry-exporter-otlp-proto-http",
 ]

+drivers-imagegen-huggingface = [


Should be named drivers-image-generation-huggingface

collindutter · 2024-07-25T18:30:37Z

...tape/drivers/image_generation_model/base_diffusion_pipeline_image_generation_model_driver.py

+    @abstractmethod
+    def get_output_image_dimensions(self) -> Optional[tuple[int, int]]: ...


Should maybe be a @property?
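For reference, an abstract property version might look like the sketch below. The class names and the `width`/`height` fields are made up for illustration and are not taken from the PR.

```python
from abc import ABC, abstractmethod
from typing import Optional


class BasePipelineDriverSketch(ABC):
    """Illustrative base class, not the class from this PR."""

    @property
    @abstractmethod
    def output_image_dimensions(self) -> Optional[tuple[int, int]]:
        """Return (width, height), or None to use the pipeline default."""
        ...


class FixedSizeDriverSketch(BasePipelineDriverSketch):
    def __init__(self, width: Optional[int] = None, height: Optional[int] = None) -> None:
        self.width = width
        self.height = height

    @property
    def output_image_dimensions(self) -> Optional[tuple[int, int]]:
        # Only report dimensions when both values are set.
        if self.width is None or self.height is None:
            return None
        return (self.width, self.height)
```

Callers would then read `driver.output_image_dimensions` without parentheses, and subclasses that forget to define it still fail at instantiation time.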

collindutter · 2024-07-25T21:10:55Z

griptape/drivers/image_generation/huggingface_diffusion_pipeline_image_generation_driver.py

+
+
+@define
+class HuggingFaceDiffusionPipelineImageGenerationDriver(BaseImageGenerationDriver, ABC):


Should this subclass BaseImageGenerationDriver instead?

I assume you mean BaseMultiModelImageGenerationDriver. That was the original intention, but the Pipeline drivers require a substantially different interface than those that inherit from BaseImageGenerationModelDriver as required by the BaseMultiModelImageGenerationDriver.

collindutter · 2024-07-25T21:12:53Z

griptape/drivers/image_generation/huggingface_diffusion_pipeline_image_generation_driver.py

+
+
+@define
+class HuggingFaceDiffusionPipelineImageGenerationDriver(BaseImageGenerationDriver, ABC):


Wondering if we should just rename to HuggingFacePipelineImageGenerationDriver for similarity with the HuggingFacePipelinePromptDriver that uses transformers.

collindutter · 2024-07-25T21:19:29Z

...age_generation_model/stable_diffusion_3_controlnet_pipeline_image_generation_model_driver.py

+        # as a path to a local file or as a HuggingFace model repo name.
+        # We use the from_single_file method if the model is a local file and the
+        # from_pretrained method if the model is a local directory or hosted on HuggingFace.
+        sd3_controlnet_model = import_optional_dependency("diffusers.models.controlnet_sd3").SD3ControlNetModel


Imports should happen at the top of the function as if it were a regular import.

collindutter · 2024-07-25T21:19:42Z

...age_generation_model/stable_diffusion_3_controlnet_pipeline_image_generation_model_driver.py

+        sd3_controlnet_pipeline = import_optional_dependency(
+            "diffusers.pipelines.controlnet_sd3.pipeline_stable_diffusion_3_controlnet"
+        ).StableDiffusion3ControlNetPipeline


Imports should happen at the top of the function as if it were a regular import.
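A rough sketch of the suggested layout follows. The `import_optional_dependency` defined here is a stand-in written for this example (its behavior is assumed, not copied from the library), and the stdlib `json` module stands in for diffusers so the snippet runs anywhere.

```python
import importlib
from types import ModuleType


def import_optional_dependency(name: str) -> ModuleType:
    """Stand-in helper: import a module by name or raise a clearer error."""
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        raise ImportError(f"Optional dependency '{name}' is not installed.") from exc


def prepare_pipeline_sketch(payload: dict) -> str:
    # Resolve optional dependencies first, as if they were regular imports.
    json_module = import_optional_dependency("json")  # stdlib stand-in

    # The function's actual logic comes after all imports.
    return json_module.dumps(payload, sort_keys=True)
```

Grouping the lookups at the top makes a missing dependency fail before any work is done and keeps the function body easier to scan.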

collindutter · 2024-07-25T21:22:49Z

...age_generation_model/stable_diffusion_3_controlnet_pipeline_image_generation_model_driver.py

+        ).StableDiffusion3ControlNetPipeline
+        if os.path.isfile(model):
+            pipeline = sd3_controlnet_pipeline.from_single_file(model, **pipeline_params)
+


Weird that the autoformatter didn't format away this whitespace

collindutter · 2024-07-25T21:23:17Z

...tape/drivers/image_generation_model/base_diffusion_pipeline_image_generation_model_driver.py

+    def prepare_pipeline(self, model: str, device: Optional[str]) -> Any: ...
+
+    @abstractmethod
+    def make_image_param(self, image: Optional[Image]) -> Optional[dict[str, Image]]: ...


Why is image optional if every implementation seems to require it?

collindutter · 2024-07-25T21:24:44Z

.../drivers/image_generation_model/stable_diffusion_3_pipeline_image_generation_model_driver.py

+        return pipeline
+
+    def make_image_param(self, image: Optional[Image]) -> Optional[dict[str, Image]]:
+        return None


I guess this is why image is optional, but this feels odd.

This is odd, and a half-measure. Zooming out, I think the time has come to admit that the design where each driver implements a suite of common image generation types (prompt, variation, inpainting, outpainting) isn't working so well: it forces NotImplementedErrors and backwards workarounds like this. We can move this conversation to another venue, but we should maybe consider making image generation drivers more granular.

Can you please create a ticket to track this work? This feels like an important refactor pre-1.0.

collindutter · 2024-07-25T22:42:04Z

...age_generation_model/stable_diffusion_3_controlnet_pipeline_image_generation_model_driver.py

+if TYPE_CHECKING:
+    from PIL.Image import Image
+else:
+    StableDiffusion3ControlNetPipeline = import_optional_dependency(


Having 3 in the file/class name feels odd. Maybe let's remove it?

Stable Diffusion's pipelines include versions in their class names and are distinct enough that we would need a driver for each, much in the same way that ControlNet and Img2Img are distinct here. Amaru pointed out that Stable Diffusion 3 is still quite new and many people still prefer their Stable Diffusion XL workflows. To support this, we'd need another driver typed for StableDiffusionXL (same story for SD1.5 and SD2). I don't like the aesthetics of it, but this leaves some space for other SD versions.

andrewfrench · 2024-07-26T16:11:54Z

It seemed like I was trying to fit a square peg (model drivers that were actually specific to pipeline types, not models) into a round hole (model drivers meant to define interactions with a model), so I refactored these out to a new type, BaseImageGenerationPipelineDriver, which encapsulates the specific preparation and input handling for diffusion pipeline image generation flows.
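As a rough picture of that shape (a sketch only, reusing the `prepare_pipeline` and `make_image_param` method names from the diff above), a pipeline-typed base class with one subclass per pipeline type could look like this. The `Sketch` class names, the dict standing in for a real pipeline object, and the `"image"` input key are assumptions.

```python
from abc import ABC, abstractmethod
from typing import Any, Optional


class BaseImageGenerationPipelineDriverSketch(ABC):
    """Illustrative base: one subclass per pipeline type, not per model."""

    @abstractmethod
    def prepare_pipeline(self, model: str, device: Optional[str]) -> Any: ...

    @abstractmethod
    def make_image_param(self, image: Optional[Any]) -> Optional[dict[str, Any]]: ...


class TextToImagePipelineDriverSketch(BaseImageGenerationPipelineDriverSketch):
    def prepare_pipeline(self, model: str, device: Optional[str]) -> Any:
        # A dict stands in for a real pipeline object.
        return {"kind": "text-to-image", "model": model, "device": device}

    def make_image_param(self, image: Optional[Any]) -> Optional[dict[str, Any]]:
        # Text-to-image takes no input image.
        return None


class Img2ImgPipelineDriverSketch(BaseImageGenerationPipelineDriverSketch):
    def prepare_pipeline(self, model: str, device: Optional[str]) -> Any:
        return {"kind": "img2img", "model": model, "device": device}

    def make_image_param(self, image: Optional[Any]) -> Optional[dict[str, Any]]:
        if image is None:
            raise ValueError("An input image is required for img2img.")
        # The "image" key is an assumed input name.
        return {"image": image}
```

The generation driver can then stay model-agnostic: it asks the pipeline driver for a prepared pipeline and for any extra image inputs, and merges those into the call.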

andrewfrench added 5 commits July 24, 2024 17:08

Add 🤗 diffusers image generation driver

031aabd

Add image generation model drivers for diffusers

1e4f3c5

Add driver for sd3+controlnet

413bbb4

Export new drivers

71461bf

Optional dependencies update

10b3702

andrewfrench force-pushed the sd3-local branch from 5c4f3c8 to 10b3702 Compare July 25, 2024 00:42

andrewfrench changed the title from "Sd3 local" to "Stable Diffusion 3 Local support" on Jul 25, 2024

andrewfrench changed the title from "Stable Diffusion 3 Local support" to "Stable Diffusion 3 local support" on Jul 25, 2024

andrewfrench added 3 commits July 24, 2024 19:01

Import fixes

e81b195

Linter fixes, pyright fixes

9d9f385

Fix make format artifact

a4e2f89

andrewfrench commented Jul 25, 2024

andrewfrench added 3 commits July 24, 2024 19:39

Docstrings

e6bc346

Add unit tests, small fixes

2742140

Update img2img input field name

e9e9ddf

andrewfrench marked this pull request as ready for review July 25, 2024 20:03

collindutter reviewed Jul 25, 2024

andrewfrench added 5 commits July 25, 2024 16:37

Address comments, update poetry.lock

3d87cb7

whoops, fix tests

2157096

Configurable image output format

6e74c29

Update docstring

4df8dde

Refactor pipeline drivers

7f0c211

collindutter previously approved these changes Jul 26, 2024

andrewfrench added 2 commits July 26, 2024 16:29

model_driver -> pipeline_driver

d84c724

Downgrade torch

3180a18

andrewfrench dismissed collindutter’s stale review via 3180a18 July 26, 2024 23:31

Expose memory saving pipeline options

77304f6

andrewfrench added 7 commits July 27, 2024 10:06

Mark torch as optional again

1296a46

Support from_single_file for img2img pipeline

40a6e8a

Add tests for new options

1e72051

Transfer fork to griptape-ai

ce89d33

git@ -> https://

7b5950a

poetry lock --no-update

25c2d39

Update reference links

1aaeef6

andrewfrench force-pushed the sd3-local branch from e162440 to 1aaeef6 Compare July 28, 2024 02:52

andrewfrench requested a review from collindutter July 28, 2024 03:05

andrewfrench and others added 2 commits July 28, 2024 16:14

Test coverage

313f8a0

Merge branch 'dev' into sd3-local

f8d1e09

collindutter approved these changes Jul 29, 2024

andrewfrench merged commit 9f9ac91 into dev Jul 29, 2024
13 checks passed

andrewfrench deleted the sd3-local branch July 29, 2024 19:30

collindutter pushed a commit that referenced this pull request Aug 2, 2024

Stable Diffusion 3 local support (#1018)

56b4104

andrewfrench mentioned this pull request Oct 14, 2024

Explore support for local StableDiffusion models #504

Closed

1 task
