
Too hard to run on win, lol(now available run on windows with custom ComfyUI node) #36

Open
zmwv823 opened this issue Nov 24, 2024 · 20 comments
Labels: documentation (Improvements or additions to documentation), fixed (fix a bug)

Comments

zmwv823 commented Nov 24, 2024

Tried playing with it in ComfyUI and wrote some simple init code.
By offloading the text encoder and VAE (prompt processing and latent decode can be separated in Comfy), only 3.5 GB of VRAM is required during generation; the text encoder needs 5 GB.
But it gets stuck at latent decode; it seems to need a Triton compile, or something else is wrong.
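For context, the offload pattern described above is roughly the following. This is a minimal sketch only: `pipe` stands for the Sana pipeline object from sana_pipeline.py, `pipe.text_encoder`/`pipe.vae` are assumed attribute names, and `encode_prompt` is a hypothetical stand-in for the prompt_process stage.

```python
import torch

def encode_prompt_offloaded(pipe, prompt):
    # keep the ~5 GB text encoder on the GPU only while embedding the prompt
    pipe.text_encoder.to("cuda")
    cond = pipe.encode_prompt(prompt)  # hypothetical helper for the prompt_process stage
    pipe.text_encoder.to("cpu")
    torch.cuda.empty_cache()
    return cond

def decode_latent_offloaded(pipe, latent):
    # move the DC-AE decoder in only for the final decode; this is the step
    # that currently fails inside the Triton RMS norm kernel (see traceback below)
    pipe.vae.to("cuda")
    image = pipe.vae.decode(latent.detach() / pipe.vae.cfg.scaling_factor)
    pipe.vae.to("cpu")
    torch.cuda.empty_cache()
    return image
```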

[screenshot]

  File "D:\AI\ComfyUI_windows_portable\ComfyUI\execution.py", line 169, in _map_node_over_list
    process_inputs(input_dict, i)
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\execution.py", line 158, in process_inputs
    results.append(getattr(obj, func)(**inputs))
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-UL\Image_Generation\Sana\nodes.py", line 46, in sampler
    results = model['pipe'](
              ^^^^^^^^^^^^^^
  File "D:\AI\ComfyUI_windows_portable\py311\Lib\site-packages\torch\nn\modules\module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\ComfyUI_windows_portable\py311\Lib\site-packages\torch\nn\modules\module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\ComfyUI_windows_portable\py311\Lib\site-packages\torch\utils\_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-UL\Image_Generation\Sana\sana_pipeline.py", line 333, in forward
    sample = vae_decode(self.config.vae.vae_type, self.vae, sample)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-UL\Image_Generation\Sana\diffusion\model\builder.py", line 133, in vae_decode
    samples = ae.decode(latent.detach() / ae.cfg.scaling_factor)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-UL\Image_Generation\Sana\diffusion\model\dc_ae\efficientvit\models\efficientvit\dc_ae.py", line 446, in decode
    x = self.decoder(x)
        ^^^^^^^^^^^^^^^
  File "D:\AI\ComfyUI_windows_portable\py311\Lib\site-packages\torch\nn\modules\module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\ComfyUI_windows_portable\py311\Lib\site-packages\torch\nn\modules\module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-UL\Image_Generation\Sana\diffusion\model\dc_ae\efficientvit\models\efficientvit\dc_ae.py", line 414, in forward
    x = stage(x)
        ^^^^^^^^
  File "D:\AI\ComfyUI_windows_portable\py311\Lib\site-packages\torch\nn\modules\module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\ComfyUI_windows_portable\py311\Lib\site-packages\torch\nn\modules\module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-UL\Image_Generation\Sana\diffusion\model\dc_ae\efficientvit\models\nn\ops.py", line 834, in forward
    x = op(x)
        ^^^^^
  File "D:\AI\ComfyUI_windows_portable\py311\Lib\site-packages\torch\nn\modules\module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\ComfyUI_windows_portable\py311\Lib\site-packages\torch\nn\modules\module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-UL\Image_Generation\Sana\diffusion\model\dc_ae\efficientvit\models\nn\ops.py", line 743, in forward
    x = self.context_module(x)
        ^^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\ComfyUI_windows_portable\py311\Lib\site-packages\torch\nn\modules\module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\ComfyUI_windows_portable\py311\Lib\site-packages\torch\nn\modules\module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-UL\Image_Generation\Sana\diffusion\model\dc_ae\efficientvit\models\nn\ops.py", line 780, in forward
    res = self.forward_main(x) + self.shortcut(x)
          ^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-UL\Image_Generation\Sana\diffusion\model\dc_ae\efficientvit\models\nn\ops.py", line 770, in forward_main
    return self.main(x)
           ^^^^^^^^^^^^
  File "D:\AI\ComfyUI_windows_portable\py311\Lib\site-packages\torch\nn\modules\module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\ComfyUI_windows_portable\py311\Lib\site-packages\torch\nn\modules\module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-UL\Image_Generation\Sana\diffusion\model\dc_ae\efficientvit\models\nn\ops.py", line 682, in forward
    out = self.proj(out)
          ^^^^^^^^^^^^^^
  File "D:\AI\ComfyUI_windows_portable\py311\Lib\site-packages\torch\nn\modules\module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\ComfyUI_windows_portable\py311\Lib\site-packages\torch\nn\modules\module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-UL\Image_Generation\Sana\diffusion\model\dc_ae\efficientvit\models\nn\ops.py", line 91, in forward
    x = self.norm(x)
        ^^^^^^^^^^^^
  File "D:\AI\ComfyUI_windows_portable\py311\Lib\site-packages\torch\nn\modules\module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\ComfyUI_windows_portable\py311\Lib\site-packages\torch\nn\modules\module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-UL\Image_Generation\Sana\diffusion\model\dc_ae\efficientvit\models\nn\norm.py", line 40, in forward
    return TritonRMSNorm2dFunc.apply(x, self.weight, self.bias, self.eps)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\ComfyUI_windows_portable\py311\Lib\site-packages\torch\autograd\function.py", line 575, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-UL\Image_Generation\Sana\diffusion\model\dc_ae\efficientvit\models\nn\triton_rms_norm.py", line 146, in forward
    _rms_norm_2d_fwd_fused[(M * num_blocks,)](  #
  File "D:\AI\ComfyUI_windows_portable\py311\Lib\site-packages\triton\runtime\jit.py", line 345, in <lambda>
    return lambda *args, **kwargs: self.run(grid=grid, warmup=False, *args, **kwargs)
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\ComfyUI_windows_portable\py311\Lib\site-packages\triton\runtime\jit.py", line 607, in run
    device = driver.active.get_current_device()
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\ComfyUI_windows_portable\py311\Lib\site-packages\triton\runtime\driver.py", line 23, in __getattr__
    self._initialize_obj()
  File "D:\AI\ComfyUI_windows_portable\py311\Lib\site-packages\triton\runtime\driver.py", line 20, in _initialize_obj
    self._obj = self._init_fn()
                ^^^^^^^^^^^^^^^
  File "D:\AI\ComfyUI_windows_portable\py311\Lib\site-packages\triton\runtime\driver.py", line 9, in _create_driver
    return actives[0]()
           ^^^^^^^^^^^^
  File "D:\AI\ComfyUI_windows_portable\py311\Lib\site-packages\triton\backends\nvidia\driver.py", line 414, in __init__
    self.utils = CudaUtils()  # TODO: make static
                 ^^^^^^^^^^^
  File "D:\AI\ComfyUI_windows_portable\py311\Lib\site-packages\triton\backends\nvidia\driver.py", line 92, in __init__
    mod = compile_module_from_src(Path(os.path.join(dirname, "driver.c")).read_text(), "cuda_utils")
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\ComfyUI_windows_portable\py311\Lib\site-packages\triton\backends\nvidia\driver.py", line 69, in compile_module_from_src
    so = _build(name, src_path, tmpdir, library_dirs(), include_dir, libraries)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\ComfyUI_windows_portable\py311\Lib\site-packages\triton\runtime\build.py", line 71, in _build
    ret = subprocess.check_call(cc_cmd)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\ComfyUI_windows_portable\py311\Lib\subprocess.py", line 413, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['C:\\Program Files\\Microsoft Visual Studio\\2022\\Community\\VC\\Tools\\MSVC\\14.42.34433\\bin\\Hostx64\\x64\\cl.exe', 'C:\\Users\\pc\\AppData\\Local\\Temp\\tmpj8dgtq34\\main.c', '/nologo', '/O2', '/LD', '/wd4819', '/ID:\\AI\\ComfyUI_windows_portable\\py311\\Lib\\site-packages\\triton\\backends\\nvidia\\include', '/IC:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.6\\include', '/IC:\\Users\\pc\\AppData\\Local\\Temp\\tmpj8dgtq34', '/ID:\\AI\\ComfyUI_windows_portable\\py311\\Include', '/IC:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.22621.0\\shared', '/IC:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.22621.0\\ucrt', '/IC:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.22621.0\\um', '/link', '/LIBPATH:D:\\AI\\ComfyUI_windows_portable\\py311\\Lib\\site-packages\\triton\\backends\\nvidia\\lib', '/LIBPATH:C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.6\\lib\\x64', '/LIBPATH:D:\\AI\\ComfyUI_windows_portable\\py311\\libs', '/LIBPATH:D:\\AI\\ComfyUI_windows_portable\\py311\\libs', '/LIBPATH:D:\\AI\\ComfyUI_windows_portable\\py311\\libs', '/LIBPATH:C:\\Python311\\libs', '/LIBPATH:C:\\Program Files (x86)\\Windows Kits\\10\\Lib\\10.0.22621.0\\ucrt\\x64', '/LIBPATH:C:\\Program Files (x86)\\Windows Kits\\10\\Lib\\10.0.22621.0\\um\\x64', '/LIBPATH:D:\\AI\\ComfyUI_windows_portable\\py311\\libs', '/LIBPATH:D:\\AI\\ComfyUI_windows_portable\\py311\\libs', '/LIBPATH:D:\\AI\\ComfyUI_windows_portable\\py311\\libs', '/LIBPATH:C:\\Python311\\libs', '/LIBPATH:C:\\Program Files (x86)\\Windows Kits\\10\\Lib\\10.0.22621.0\\ucrt\\x64', '/LIBPATH:C:\\Program Files (x86)\\Windows Kits\\10\\Lib\\10.0.22621.0\\um\\x64', '/LIBPATH:D:\\AI\\ComfyUI_windows_portable\\py311\\libs', '/LIBPATH:D:\\AI\\ComfyUI_windows_portable\\py311\\libs', '/LIBPATH:D:\\AI\\ComfyUI_windows_portable\\py311\\libs', '/LIBPATH:C:\\Python311\\libs', '/LIBPATH:C:\\Program Files (x86)\\Windows Kits\\10\\Lib\\10.0.22621.0\\ucrt\\x64', '/LIBPATH:C:\\Program Files (x86)\\Windows Kits\\10\\Lib\\10.0.22621.0\\um\\x64', 'cuda.lib', '/OUT:C:\\Users\\pc\\AppData\\Local\\Temp\\tmpj8dgtq34\\cuda_utils.cp311-win_amd64.pyd']' returned non-zero exit status 2.

Prompt executed in 36.08 seconds
nitinmukesh commented Nov 24, 2024

Very easy to run on Windows as a standalone installation

#27 (comment)

zmwv823 (Author) commented Nov 24, 2024

Very easy to run on Windows as a standalone installation

#27 (comment)

It needs a VS environment set up, so it's not easy.
I have VS installed, but it seems the environment isn't set up for Triton.
And I'm using the embedded Python environment that ships with ComfyUI.
It isn't like other VAEs that just load and decode the latent.
I wonder if there is a way to decode the latent directly without a Triton compile.

@nitinmukesh

If you had seen the video you would find out that you don't need to build Triton. :)

zmwv823 (Author) commented Nov 24, 2024

If you had seen the video you would find out that you don't need to build Triton. :)

I installed Triton weeks ago (Triton 3.1, Python 3.11, CUDA 12.6).
I will never again try to build wheels such as triton, flash_attention, xformers, etc. on my PC after trying it once (hours passed and the process seemed never-ending).

Thanks a lot, though it doesn't work.
Maybe I need to downgrade my CUDA to 12.4 to try Triton 3.0.

@mp3pintyo

That's why I skipped the Windows installation and created a RunPod template right away. :D That way I can access it from anywhere, even from a mobile phone.

@lawrence-cj (Collaborator)

Once our autoencoder is merged into diffusers, there will be no need for Triton at inference. Refer to: huggingface/diffusers#9708
Besides, we will try to remove the dependency on Triton in our repo so that it can run without it.

@FurkanGozukara

It works with Triton on Windows; I am using it right now.

You have to install a pre-compiled Triton wheel.

Check out my article: https://www.linkedin.com/pulse/nvidia-labs-developed-sana-model-weights-gradio-demo-app-g%C3%B6z%C3%BCkara-gxirf/?trackingId=55yg59jISbuecGZgE8F8NA%3D%3D

@lawrence-cj (Collaborator)

Seems you are trying to use Sana in ComfyUI. Is any open-source code available now? Maybe we can collaborate on it. @zmwv823

nitinmukesh commented Nov 24, 2024

@lawrence-cj

May I ask you to share the code to use this (with Diffusers)? I can't wait for it to be merged and want to test it on my local machine.
I will build from source with the PR:
huggingface/diffusers#9708

P.S. I'm not a developer; I will just run the code.

lawrence-cj (Collaborator) commented Nov 24, 2024

@nitinmukesh Of course. There may be some changes in the final version, but you can try this file:

https://github.com/lawrence-cj/diffusers/blob/Sana/sana.py

The script to convert a Sana .pth checkpoint to a diffusers safetensors checkpoint is also available:
SanaPipeline: https://github.com/huggingface/diffusers/pull/9982/files#diff-124e39f314758010671d24c3d0495f679aba2a7b69124018f198ebccf1854233
SanaPAGPipeline: https://github.com/huggingface/diffusers/pull/9982/files#diff-3217f6d4123a2ffff1921a1c687f16ee13c6ddd78d788f08ad30d1c64817349b
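For anyone following along, a minimal usage sketch against that branch might look like the following. The model path, dtype, and call arguments are assumptions and may differ from the final merged API:

```python
import torch
from diffusers import SanaPipeline  # assumes the Sana branch of diffusers is installed

# "path/to/converted-sana" is a placeholder for a checkpoint converted with the scripts above
pipe = SanaPipeline.from_pretrained("path/to/converted-sana", torch_dtype=torch.bfloat16).to("cuda")

image = pipe(
    prompt="a cyberpunk cat with a neon sign that says Sana",
    num_inference_steps=20,
    guidance_scale=5.0,
).images[0]
image.save("sana.png")
```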

@lawrence-cj (Collaborator)

We removed the dependency on Triton in the DC-AE decoder. Please give it a try and see whether the decode process now runs smoothly in your ComfyUI code. @zmwv823

Refer to: #38
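For reference, a pure-PyTorch replacement for the Triton RMS norm kernel could look roughly like this. This is only a sketch, not the actual change in #38; it assumes TritonRMSNorm2dFunc normalizes an NCHW tensor over its channel dimension with a learnable per-channel weight and bias:

```python
import torch
import torch.nn as nn

class RMSNorm2d(nn.Module):
    """Channel-wise RMS norm for NCHW tensors, implemented in plain PyTorch."""

    def __init__(self, num_channels: int, eps: float = 1e-5, bias: bool = True):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(num_channels))
        self.bias = nn.Parameter(torch.zeros(num_channels)) if bias else None

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        dtype = x.dtype
        x = x.float()
        # normalize each spatial position by the RMS over its channels
        x = x * torch.rsqrt(x.pow(2).mean(dim=1, keepdim=True) + self.eps)
        x = x.to(dtype) * self.weight.view(1, -1, 1, 1)
        if self.bias is not None:
            x = x + self.bias.view(1, -1, 1, 1)
        return x
```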

zmwv823 (Author) commented Nov 24, 2024

We removed the dependency on Triton in the DC-AE decoder. Please give it a try and see whether the decode process now runs smoothly in your ComfyUI code. @zmwv823

Refer to: #38

Thanks a lot, the VAE decode now works without Triton.
I will try rewriting the node once the diffusers PR is merged; in that case, only the diffusers module will be needed.
I'm not familiar with the Comfy native sampler, so it's just a wrapper node for personal testing.
[screenshot]

zmwv823 closed this as completed Nov 24, 2024
@lawrence-cj (Collaborator)

Is your ComfyUI code that supports Sana publicly available? I'm looking into ComfyUI recently.

zmwv823 (Author) commented Nov 24, 2024

Is your ComfyUI code that supports Sana publicly available? I'm looking into ComfyUI recently.

Not ready yet; the text_encode step still needs to be split out.
And it's a wrapper node: all of the processing code comes from your project, which is totally different from the Comfy native process.
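As a rough illustration of what splitting the text encoder into its own node could look like, here is a minimal ComfyUI custom-node skeleton. The class layout follows the standard custom-node convention; `run_sana_text_encoder` is a hypothetical helper standing in for whatever the wrapper currently does during prompt processing:

```python
class SanaTextEncode:
    """Standalone prompt-encoding node, so the text encoder can be offloaded separately."""

    @classmethod
    def INPUT_TYPES(cls):
        return {"required": {"prompt": ("STRING", {"multiline": True})}}

    RETURN_TYPES = ("CONDITIONING",)
    FUNCTION = "encode"
    CATEGORY = "Sana"

    def encode(self, prompt):
        # hypothetical helper: run the Sana text encoder and return its embeddings
        cond = run_sana_text_encoder(prompt)
        return (cond,)

NODE_CLASS_MAPPINGS = {"SanaTextEncode": SanaTextEncode}
```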

lawrence-cj reopened this Nov 24, 2024
lawrence-cj changed the title from "Too hard to run on win, lol" to "Too hard to run on win, lol(now available run on windows with custom ComfyUI node)" Nov 24, 2024
lawrence-cj added the documentation and fixed labels Nov 24, 2024
zmwv823 (Author) commented Nov 25, 2024

Is your ComfyUI code that supports Sana publicly available? I'm looking into ComfyUI recently.

A ComfyUI custom node with lots of bugs:

* https://github.com/zmwv823/ConfyUI-Sana

nitinmukesh commented Nov 25, 2024

Hello @lawrence-cj

I tried the diffusers version.

Does it support memory optimizations? I got these errors:
AttributeError: 'DCAE' object has no attribute 'enable_tiling'
AttributeError: 'DCAE' object has no attribute 'enable_slicing'

and with sequential offload:
RuntimeError: Tensor.item() cannot be called on meta tensors

import torch
# DCAE and SanaPipeline are assumed to be imported from the diffusers Sana branch;
# dtype and device are defined earlier in the script

vae = DCAE.from_pretrained(
    "output",
    subfolder="vae",
    torch_dtype=torch.float32,
)

# diffusers Sana pipeline
pipe = SanaPipeline.from_pretrained(
    "output",
    torch_dtype=dtype,
    use_safetensors=True,
    vae=vae,
).to(device)

# Enable memory optimizations
pipe.vae.enable_tiling()
pipe.vae.enable_slicing()

# Enable CPU offloading
pipe.enable_sequential_cpu_offload()
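One possible workaround for the two AttributeErrors (untested, just a defensive guard) is to only call the optimizations the VAE class actually exposes:

```python
# only enable the memory optimizations that this VAE implementation supports
if hasattr(pipe.vae, "enable_tiling"):
    pipe.vae.enable_tiling()
if hasattr(pipe.vae, "enable_slicing"):
    pipe.vae.enable_slicing()
```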

@CRCODE22

Is your ComfyUI code that supports Sana publicly available? I'm looking into ComfyUI recently.

A ComfyUI custom node with lots of bugs:

* https://github.com/zmwv823/ConfyUI-Sana

Once it works in ComfyUI, please let me know. I think this method is at least a way to get it working under Windows 11 soon. ComfyUI works great here.

@lawrence-cj (Collaborator)

@zmwv823, so cool. I will look into it soon and retain your contribution if I open a PR for ComfyUI.

zmwv823 (Author) commented Nov 26, 2024

@zmwv823, so cool. I will look into it soon and retain your contribution if I open a PR for ComfyUI.

I strongly recommend looking into this repo: https://github.com/city96/ComfyUI_ExtraModels.
It has a Comfy native sampler for PixArt models with LoRA and ControlNet support.

zmwv823 closed this as completed Nov 26, 2024
@lawrence-cj (Collaborator)

Yeah, we know this project, and we have collaborated before. I'll also mention this project.

BTW, I'll re-open this once ComfyUI is supported.

lawrence-cj reopened this Nov 26, 2024