
Onnx Pipeline: Inference for text to image conversion #3380

Closed

Conversation

@saikrishna2893 commented May 10, 2023

Initial version of an ONNX-based inference pipeline for text-to-image conversion, based on Stable Diffusion models.
Performance information: tested on Raphael (AMD 7600X) and Raptor Lake (i5-13000K).
ONNX inference performance: InvokeAI-pipelines.pdf

Sample output from execution of the ONNX pipeline on Raptor Lake (i5-13000K), CPU:
[image]

Sample output from execution of the PyTorch pipeline on Raptor Lake (i5-13000K), CPU:
[image]

@hipsterusername (Member)

@saikrishna2893 - Thanks for the contribution!

To confirm, is this integrated into the new Nodes pipelines that are being worked on in Main?

@lstein (Collaborator) left a comment

I really appreciate the work that went into this.

I'm sad to say that this will have to be modified in order to work with nodes. In particular, CLI.py is going to disappear from the repository soon. Please take a look at the invokeai/app tree, in particular invokeai/app/invocation/latents.py, to understand how the new text-to-image inference system works.
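For readers unfamiliar with the nodes system: an invocation is a pydantic model with an `invoke()` method that the graph executor calls. The sketch below shows only the general shape, assuming the baseinvocation module layout in the invokeai/app tree; the class name, fields, and body are hypothetical, not taken from latents.py:

```python
# Illustrative sketch of the node/invocation pattern, not actual InvokeAI
# source; OnnxTextToImageInvocation and its fields are hypothetical names.
from pydantic import Field

from invokeai.app.invocations.baseinvocation import (
    BaseInvocation,
    InvocationContext,
)
from invokeai.app.invocations.image import ImageOutput


class OnnxTextToImageInvocation(BaseInvocation):
    """Hypothetical ONNX text-to-image node."""

    type: str = "onnx_txt2img"
    prompt: str = Field(default="", description="Text prompt")
    steps: int = Field(default=30, description="Number of denoising steps")

    def invoke(self, context: InvocationContext) -> ImageOutput:
        # Run the ONNX pipeline here, hand the resulting image to the
        # app's image service, and wrap the result in an ImageOutput.
        raise NotImplementedError
```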

@lstein (Collaborator) commented May 10, 2023

> Initial version of an ONNX-based inference pipeline for text-to-image conversion, based on Stable Diffusion models. Performance information: tested on Raphael (AMD 7600X) and Raptor Lake (i5-13000K). ONNX inference performance: InvokeAI-pipelines.pdf
>
> Sample output from execution of the ONNX pipeline on Raptor Lake (i5-13000K): [image]
>
> Sample output from execution of the PyTorch pipeline on Raptor Lake (i5-13000K): [image]

Does the ONNX pipeline take advantage of CUDA, and if so, how does it perform?

@lstein (Collaborator) commented May 10, 2023

Also note the CI failures.

@saikrishna2893 (Author)

> Does the ONNX pipeline take advantage of CUDA, and if so, how does it perform?

The ONNX pipeline currently uses the CPU as its device. The pipeline makes use of the OpenVINO execution provider for optimized inference.
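For context, execution-provider selection in onnxruntime looks roughly like the sketch below. This is illustrative, not code from this PR: the model path is a placeholder, and the OpenVINO provider is only available in the onnxruntime-openvino build.

```python
# Illustrative sketch, not code from this PR: selecting execution
# providers in onnxruntime. "unet.onnx" is a placeholder path; the
# runtime falls back to later providers in the list if one is missing.
import onnxruntime as ort

session = ort.InferenceSession(
    "unet.onnx",
    providers=["OpenVINOExecutionProvider", "CPUExecutionProvider"],
)
print(session.get_providers())  # shows which providers actually loaded

# With the onnxruntime-gpu build, CUDA would be requested the same way:
# providers=["CUDAExecutionProvider", "CPUExecutionProvider"]
```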

@saikrishna2893 (Author) commented May 16, 2023

> I really appreciate the work that went into this.
>
> I'm sad to say that this will have to be modified in order to work with nodes. In particular, CLI.py is going to disappear from the repository soon. Please take a look at the invokeai/app tree, in particular invokeai/app/invocation/latents.py, to understand how the new text-to-image inference system works.

@lstein, can you point us to any documentation on using the app and node structure, along with example commands to run and test? We have done a code walkthrough of invokeai/app, but some of its workings are still unclear. We have checked PR #3180, the description on the discussion page, and other PRs, and have seen some commands that use pipes to create multiple inference sessions. Any further information on this would be helpful. Thanks.

We faced errors when running the following commands:

  1. `txt2img --prompt "an old man reading newspaper"`: produced the error quoted below
  2. `t2l --prompt "an old man reading newspaper" | l2i`: Warning --> Invalid command.
```
(test_Invoekai) C:\Users\user1\Invokeai\InvokeAI>python scripts\invoke-new.py
[16-05-2023 03:48:08]::[InvokeAI]::INFO --> Patchmatch initialized
[16-05-2023 03:48:08]::[InvokeAI]::INFO --> Initializing, be patient...
[16-05-2023 03:48:08]::[InvokeAI]::INFO --> Initialization file C:\Users\mcw\invokeai\invokeai.init found. Loading...
[16-05-2023 03:48:08]::[InvokeAI]::INFO --> InvokeAI, version 3.0.0+a0
[16-05-2023 03:48:08]::[InvokeAI]::INFO --> InvokeAI runtime directory is "C:\Users\mcw\invokeai"
[16-05-2023 03:48:08]::[InvokeAI]::INFO --> Model manager initialized
[16-05-2023 03:48:08]::[InvokeAI]::INFO --> GFPGAN Initialized
[16-05-2023 03:48:08]::[InvokeAI]::INFO --> CodeFormer Initialized
[16-05-2023 03:48:08]::[InvokeAI]::INFO --> Face restoration initialized
invoke> txt2img --prompt "an old man reading newspaper"
[16-05-2023 03:48:14]::[InvokeAI]::INFO --> Loading diffusers model from runwayml/stable-diffusion-v1-5
[16-05-2023 03:48:14]::[InvokeAI]::DEBUG --> Using more accurate float32 precision
[16-05-2023 03:48:14]::[InvokeAI]::DEBUG --> Loading diffusers VAE from stabilityai/sd-vae-ft-mse
[16-05-2023 03:48:14]::[InvokeAI]::DEBUG --> Using more accurate float32 precision
[16-05-2023 03:48:16]::[InvokeAI]::DEBUG --> Default image dimensions = 512 x 512
[16-05-2023 03:48:16]::[InvokeAI]::INFO --> Loading embeddings from C:\Users\mcw\Documents\InvokeAI_org\invokeai\embeddings
[16-05-2023 03:48:16]::[InvokeAI]::INFO --> Textual inversion triggers:
[16-05-2023 03:48:16]::[InvokeAI]::INFO --> Model loaded in 2.13s
Generating:   0%|                                                                                                                                | 0/1 [00:00<?, ?it/s]
[16-05-2023 03:48:17]::[InvokeAI]::ERROR --> Error in node fe27d5da-b317-43e1-851e-5ec112abcdb8 (source node 0): Traceback (most recent call last):
  File "C:\Users\user1\Invokeai\InvokeAI\invokeai\app\services\processor.py", line 70, in __process
    outputs = invocation.invoke(
  File "C:\Users\user1\Invokeai\InvokeAI\invokeai\app\invocations\generate.py", line 92, in invoke
    generate_output = next(outputs)
  File "C:\Users\user1\Invokeai\InvokeAI\invokeai\backend\generator\base.py", line 144, in generate
    results = generator.generate(prompt,
  File "C:\Users\user1\Invokeai\InvokeAI\invokeai\backend\generator\base.py", line 374, in generate
    image = make_image(x_T, seed)
  File "C:\Users\user1\Invokeai\InvokeAI\invokeai\backend\generator\txt2img.py", line 65, in make_image
    pipeline_output = pipeline.image_from_embeddings(
  File "C:\Users\user1\Invokeai\InvokeAI\invokeai\backend\stable_diffusion\diffusers_pipeline.py", line 480, in image_from_embeddings
    result_latents, result_attention_map_saver = self.latents_from_embeddings(
  File "C:\Users\user1\Invokeai\InvokeAI\invokeai\backend\stable_diffusion\diffusers_pipeline.py", line 523, in latents_from_embeddings
    result: PipelineIntermediateState = infer_latents_from_embeddings(
  File "C:\Users\user1\Invokeai\InvokeAI\invokeai\backend\stable_diffusion\diffusers_pipeline.py", line 207, in __call__
    callback(result)
  File "C:\Users\user1\Invokeai\InvokeAI\invokeai\app\invocations\generate.py", line 66, in dispatch_progress
    stable_diffusion_step_callback(
  File "C:\Users\user1\Invokeai\InvokeAI\invokeai\app\util\step_callback.py", line 40, in stable_diffusion_step_callback
    image = Generator.sample_to_lowres_estimated_image(sample)
  File "C:\Users\user1\Invokeai\InvokeAI\invokeai\backend\generator\base.py", line 514, in sample_to_lowres_estimated_image
    latent_image = samples[0].permute(1, 2, 0) @ v1_5_latent_rgb_factors
RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'

[16-05-2023 03:48:17]::[InvokeAI]::WARNING --> Session error: creating a new session
```
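A side note on that final RuntimeError: PyTorch builds of that era had no float16 matmul kernel on CPU, so the low-res preview projection fails when the latents are half precision. A minimal reproduction with stand-in tensors (not InvokeAI code), plus the usual float32 cast that avoids it:

```python
import torch

# Stand-in tensors, shaped like SD latents and the latent-to-RGB
# projection matrix; these are illustrative, not values from InvokeAI.
sample = torch.randn(4, 64, 64, dtype=torch.float16)
rgb_factors = torch.randn(4, 3, dtype=torch.float16)

# On PyTorch versions without CPU half-precision matmul support,
# this reproduces the error in the traceback above:
#   sample.permute(1, 2, 0) @ rgb_factors
#   -> RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'

# Casting to float32 before the matmul sidesteps it:
latent_image = sample.permute(1, 2, 0).float() @ rgb_factors.float()
print(latent_image.shape)  # torch.Size([64, 64, 3])
```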

@hipsterusername (Member)

Hello! Following up on the request from discussions with @lalith-mcw here - CCing @lstein and @StAlKeR7779 for visibility.

To confirm, is there a reason you're looking for CLI documentation? I ask because 3.0 supports a graph-based API that can be accessed via its OpenAPI documentation.

It may be easier to get direct implementation support if you join Discord. That is where we offer live dev feedback and Q&A, and where a number of folks find implementation guidance.

In any case, you'll likely need the following guidance, which I believe @lstein and/or @StAlKeR7779 can provide more details on:

  • Using the new Model Manager (Model Manager rewrite #3335) to incorporate the ONNX model configuration.
  • Updating node logic in latents.py or in the diffusers pipeline to support the new model format.

If you reach out to me on Discord, I can create a channel for us to discuss this project. Thanks again for your and the team's support!

@lalith-mcw

> To confirm, is there a reason you're looking for CLI documentation? I ask because 3.0 supports a graph-based API that can be accessed via its OpenAPI documentation.

Currently in this PR we provide an option for the user to select the model type, torch or onnx. This was implemented in args.py and cli.py under the previous format, and we will try to integrate the same with the graph-based node invocations. I'll start a chat on Discord, thanks.
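For reference, the backend selection described above could be exposed along these lines. This is a hypothetical sketch: the `--model_type` flag name is assumed, not taken from this PR's args.py.

```python
# Hypothetical sketch of a torch/onnx backend switch; the flag name
# --model_type is assumed, not taken from this PR's args.py.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--model_type",
    choices=["torch", "onnx"],
    default="torch",
    help="Inference backend to use for text-to-image generation",
)
args = parser.parse_args()
print(f"Selected backend: {args.model_type}")
```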

@psychedelicious (Collaborator)

Hi @saikrishna2893, we have implemented ONNX support in #3562. It is integrated into the nodes backend, but support is limited to text-to-image only for now.
