
adds the pipeline for pixart alpha controlnet #8857

Open
wants to merge 8 commits into main

Conversation

@raulc0399 commented Jul 12, 2024

This PR adds the ControlNet pipeline for the PixArt-alpha diffusion model.

The following example uses HED edges to control the generation.

import torch
import torchvision.transforms as T
import torchvision.transforms.functional as TF

from diffusers.models import PixArtControlNetAdapterModel
from diffusers.pipelines import PixArtAlphaControlnetPipeline, get_closest_hw
import PIL.Image as Image

from controlnet_aux import HEDdetector

input_image_path = "asset/images/controlnet/car.jpg"
given_image = Image.open(input_image_path)

path_to_controlnet = "raulc0399/pixart-alpha-hed-controlnet"
prompt = "modern car, city in background, clear sky, sunny day"

weight_dtype = torch.float16
image_size = 1024

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

controlnet = PixArtControlNetAdapterModel.from_pretrained(
    path_to_controlnet,
    torch_dtype=weight_dtype,
    use_safetensors=True,
).to(device)

pipe = PixArtAlphaControlnetPipeline.from_pretrained(
    "PixArt-alpha/PixArt-XL-2-1024-MS",
    controlnet=controlnet,
    torch_dtype=weight_dtype,
    use_safetensors=True,
).to(device)

# preprocess image, generate HED edge
hed = HEDdetector.from_pretrained("lllyasviel/Annotators")

# compute the closest supported height/width for the requested image size
width, height = get_closest_hw(given_image.size[0], given_image.size[1], image_size)

condition_transform = T.Compose([
    T.Lambda(lambda img: img.convert('RGB')),
    T.Resize(int(min(height, width))),
    T.CenterCrop([int(height), int(width)]),
    T.ToTensor()
])

# the transform takes the input image and returns a tensor; convert back
# to PIL before passing it to the HED detector
control_image = condition_transform(given_image)
hed_edge = hed(TF.to_pil_image(control_image), detect_resolution=image_size, image_resolution=image_size)

with torch.no_grad():
    out = pipe(
        prompt=prompt,
        image=hed_edge,
        num_inference_steps=14,
        guidance_scale=4.5,
        height=image_size,
        width=image_size,
    )

    out.images[0].save("./output.jpg")

Here are some images: the original image, the control image, and the generated image.

Who can review?

@yiyixuxu @lawrence-cj

@yiyixuxu (Collaborator)

Is this the checkpoint? https://huggingface.co/PixArt-alpha/PixArt-ControlNet
I don't see any downloads; not sure if it's tracking correctly.

Is this PixArt-alpha ControlNet used a lot in the community? If not, maybe we can make it a community pipeline to start with?

also cc @asomoza

@raulc0399 (Author)

@yiyixuxu That is the PixArt ControlNet model for HED conditioning, as uploaded by the authors of PixArt.
For this pipeline I converted the ControlNet layers to safetensors, uploaded here: https://huggingface.co/raulc0399/pixart-alpha-hed-controlnet

They can be used with this pipeline.
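For reference, the conversion step amounts to something like this (a minimal sketch, not the exact script used; the input file name and the possible nesting under a "state_dict" key are assumptions):

import torch
from safetensors.torch import save_file

# load the authors' original ControlNet checkpoint (file name assumed)
state_dict = torch.load("PixArt-XL-2-1024-ControlNet.pth", map_location="cpu")

# some checkpoints nest the weights under a "state_dict" key (assumption)
if "state_dict" in state_dict:
    state_dict = state_dict["state_dict"]

# re-save the tensors in the safetensors format that from_pretrained expects
save_file(state_dict, "diffusion_pytorch_model.safetensors")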

@asomoza (Member) commented Jul 17, 2024

Why does it have its own implementation of the HED detector? Does it not work with the regular one that everyone uses? Have you tested it with the one from the controlnet_aux library?

@raulc0399 (Author)

@asomoza

Why does it have its own implementation of the HED detector? Does it not work with the regular one that everyone uses? Have you tested it with the one from the controlnet_aux library?

The sample above just used the HED class that the authors had in their repository, which is what was used to train their HED ControlNet.

But I just checked, and it seems to be the same as, or rather adapted from, the one in controlnet_aux.

@asomoza (Member) commented Jul 17, 2024

Thanks, I'll give it a test later. I was asking because, if it was trained with a custom HED detector that produces different results than the default one, it will be really hard for people to use.

It would be nice if you could post some results (images) in the PR description.

@raulc0399 (Author) commented Jul 17, 2024

Thanks, I'll give it a test later. I was asking because, if it was trained with a custom HED detector that produces different results than the default one, it will be really hard for people to use.

Using the HED from controlnet_aux, it loses some quality.
I will try some more tests with that one.

But I also have a training script that I am testing before creating a PR:
https://github.com/raulc0399/PixArt-alpha/blob/master_train_controlnet_diffusers/controlnet/train_pixart_controlnet_hf.py

It can be used to train further models.

It would be nice if you could post some results (images) in the PR description.

Will do.

@raulc0399 (Author)

I have to correct my previous comment: I was using the default params for HED, which resized the image to 512. If I use 1024 instead, it works as it should.
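For anyone hitting the same thing, passing the resolution explicitly avoids the default 512 downscale (a minimal sketch using controlnet_aux, reusing the parameters from the example above):

from controlnet_aux import HEDdetector
from PIL import Image

hed = HEDdetector.from_pretrained("lllyasviel/Annotators")
given_image = Image.open("asset/images/controlnet/car.jpg")

# the detector defaults to detect_resolution=512 and image_resolution=512,
# which silently downscales the condition image; pass 1024 explicitly so it
# matches the pipeline resolution
hed_edge = hed(given_image, detect_resolution=1024, image_resolution=1024)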

@asomoza (Member) commented Jul 18, 2024

Thanks, the results look nice. Since we only have one ControlNet, maybe do what @yiyixuxu suggested: let's start with a community pipeline first and then, as it gets traction and we have more ControlNets, move it to core.

@raulc0399 (Author)

@asomoza

Thanks, the results look nice. Since we only have one ControlNet, maybe do what @yiyixuxu suggested: let's start with a community pipeline first and then, as it gets traction and we have more ControlNets, move it to core.

OK, I will move it to the examples folder and put the training script there as well.
I have done some initial tests on the "fusing/fill50k" dataset to validate that it works.
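For context, that dataset pairs each target image with a conditioning image and a caption (a minimal sketch; the column names follow the public dataset card):

from datasets import load_dataset

# fusing/fill50k: 50k synthetic circle-filling pairs, commonly used to
# sanity-check ControlNet training code
ds = load_dataset("fusing/fill50k", split="train")
print(ds.column_names)  # ['image', 'conditioning_image', 'text']

sample = ds[0]
sample["image"].save("fill50k_target.png")
sample["conditioning_image"].save("fill50k_condition.png")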

@raulc0399 (Author)

@yiyixuxu @asomoza
I have moved everything to the examples folder.
I have also added the training script, together with sh files for starting the training and for running the pipeline.

@@ -0,0 +1,292 @@
from typing import Any, Dict, Optional
Collaborator

I think maybe the pipeline can go to the examples/community folder; the training script can stay in the examples/pixart folder.

cc @sayakpaul

Member

Okay with that plan.

@raulc0399 (Author)

@yiyixuxu
The last commit moves the pipeline, and the example of how to run it, to examples/community.

@yiyixuxu (Collaborator) left a comment

I left some comments, thanks!

from diffusers.models.modeling_utils import ModelMixin
from diffusers.models.modeling_outputs import Transformer2DModelOutput

class PixArtControlNetAdapterBlock(nn.Module):
Collaborator

I think we need to copy-paste all the model code into the pipeline file so that the pipeline will be able to run, no?

@raulc0399 (Author) commented Jul 24, 2024

The pipeline code changes the sys path, so it runs:
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
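So a script next to the community pipeline can import the adapter model without installing anything (a minimal sketch; the exact module path under examples/ is an assumption):

import os
import sys

# put the parent examples/ folder (two levels up from this file) on sys.path
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

# hypothetical module path; adjust to wherever the adapter file actually lives
from pixart.controlnet_pixart_alpha import PixArtControlNetAdapterModel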

@@ -0,0 +1,81 @@
import sys
Collaborator

@raulc0399 (Author)

I have added the section.
