
comfyui: init #268378

Open

wants to merge 4 commits into master from the comfyui branch

Conversation

fazo96
Contributor

@fazo96 fazo96 commented Nov 18, 2023

Description of changes

This adds ComfyUI, an advanced web-based Stable Diffusion UI. Fixes #227006

I originally planned to add a service in a follow-up; this PR now includes a service for running the web app, plus a system to manage "custom nodes" (ComfyUI plugins), and packages one custom node that I use.

ComfyUI does not publish versioned releases, so I am setting the version based on the pinned commit, as recommended by @rnhmjoj.
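
For context, the usual nixpkgs convention for pinning an upstream that has no releases looks roughly like this (the date, rev, and hash below are placeholders, not the actual values used in this PR):

  version = "unstable-2023-11-18";

  src = fetchFromGitHub {
    owner = "comfyanonymous";
    repo = "ComfyUI";
    rev = "0000000000000000000000000000000000000000"; # the pinned commit (placeholder)
    hash = lib.fakeHash;                               # replace with the real hash after the first build
  };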

Things done

Due to having to build torch, the build takes a very long time. I plan to let this build tomorrow while I am not using the computer, then fix any issues with the derivation after that.

I changed base so I could leverage the package cache and got it to build and run. I also tested the service.

Some recently merged changes allow this to run with CUDA or ROCm support automatically, depending on the nixpkgs configuration. I am trying to detect that configuration so I can automatically pass --cpu when both are off; otherwise it crashes on startup.

On aarch64-darwin, ComfyUI can leverage Apple's build of torch for hardware acceleration (similar to CUDA and ROCm), but I am not sure whether torch is packaged properly for this on Darwin.

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 23.11 Release Notes (or backporting 23.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

@ofborg ofborg bot added 8.has: package (new) This PR adds a new package 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 1-10 10.rebuild-linux: 1 labels Nov 18, 2023
@github-actions github-actions bot added 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: module (update) This PR changes an existing module in `nixos/` labels Nov 19, 2023
pkgs/applications/misc/comfyui/default.nix: several outdated review threads (resolved)
pkgs/top-level/all-packages.nix: outdated review thread (resolved)
@fazo96 fazo96 force-pushed the comfyui branch 2 times, most recently from 28f6409 to 1cb8fe7 Compare November 19, 2023 13:51
@ofborg ofborg bot added 8.has: clean-up 11.by: package-maintainer This PR was created by the maintainer of the package it changes labels Nov 19, 2023
description = mdDoc "Set the listen port for the Web UI and API.";
};

extraArgs = mkOption {
Contributor

This could be types.attrsOf types.str and merged with the attrset on line 124.
In this way the option would allow for better composability.
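
A rough sketch of that suggestion (the defaultArgs name and the final merge line are illustrative, not the module's actual code):

  extraArgs = mkOption {
    type = types.attrsOf types.str;
    default = { };
    example = { preview-method = "auto"; };
    description = mdDoc "Extra command-line arguments passed to ComfyUI.";
  };

  # elsewhere in the module, merged with the service's own arguments and rendered once:
  args = lib.cli.toGNUCommandLineShell { } (defaultArgs // cfg.extraArgs);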

@fazo96
Contributor Author

fazo96 commented Nov 19, 2023

I am currently stuck on a problem: I have a ROCm-compatible AMD graphics card and have been using ComfyUI from a nix-shell so far, using the torchWithRocm package for hardware acceleration.

In this package and service, I made sure the default torch package can be replaced, via an option or an override, with torchWithRocm or torchWithCuda to enable hardware acceleration.

However, when I try to build the derivation using torchWithRocm I get the following error:

error: collision between `/nix/store/d2a3974p032ln2kjqrlrx9ycy2s1lisf-python3.11-torch-2.0.1/bin/torchrun' and `/nix/store/s0xfhdvmxy6l0qrdy1qb8af0fqri2jzi-python3.11-torch-2.0.1/bin/torchrun'

I also tried switching to buildPythonPackage to build the derivation, similar to how my nix-shell does it, but then I got the same error, just formatted more clearly:

pythonCatchConflictsPhase
Found duplicated packages in closure for dependency 'torch': 
        torch 2.0.1 (/nix/store/s0xfhdvmxy6l0qrdy1qb8af0fqri2jzi-python3.11-torch-2.0.1/lib/python3.11/site-packages/torch-2.0.1.dist-info)
        torch 2.0.1 (/nix/store/d2a3974p032ln2kjqrlrx9ycy2s1lisf-python3.11-torch-2.0.1/lib/python3.11/site-packages/torch-2.0.1.dist-info)

Package duplicates found in closure, see above. Usually this happens if two packages depend on different version of the same dependency.

It looks like the problem is that two different builds of the same torch version end up in the same python environment.

I then tried manually overriding all the python dependencies that use torch so that they would use the torch passed to the comfyui package. This triggered a bunch of compilations instead of using the Hydra cache, so I gave up.

With the following shell.nix I can run the application and nothing gets compiled by nix:

with import <nixpkgs> {};
with python311Packages;

buildPythonPackage rec {
  pname = "comfyui";
  version = "master";
  src = ./.;
  propagatedBuildInputs = [
    torchWithRocm
    torchsde
    torchvision
    torchaudio
    transformers
    safetensors
    aiohttp
    accelerate
    einops
    pyyaml
    pillow
    scipy
    psutil
    tqdm
    # Custom nodes dependencies
    opencv4
    imageio-ffmpeg
  ];
  shellHook = ''
    python main.py --listen --enable-cors-header --preview-method auto
  '';
}

I also tried the enableRocm nixpkgs config, but that just triggers a bunch of recompilations that take a very long time.

My Nix knowledge is limited to some chatting with @rnhmjoj, searching the Nixpkgs manual, and reading the source of other packages.

I think we'll need some help to understand why the package isn't working with torchWithRocm as we are stuck on this.

Using it with --cpu (or useCPU = true; when using the service) seems to work, at least.


useCPU = mkOption {
type = types.bool;
default = false; #!config.cudaSupport && !config.rocmSupport;
Contributor

Suggested change
default = false; #!config.cudaSupport && !config.rocmSupport;
default = with nixpkgs.config; !(cudaSupport || rocmSupport);

@ofborg ofborg bot added 10.rebuild-darwin: 1-10 10.rebuild-darwin: 1 and removed 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin labels Nov 20, 2023
@h7x4 h7x4 added the 8.has: module (new) This PR adds a module in `nixos/` label Nov 20, 2023
@fazo96
Contributor Author

fazo96 commented Nov 21, 2023

@rnhmjoj and I weren't able to verify that this works with hardware acceleration. We don't have NVIDIA hardware to test on, and we could not get AMD acceleration via ROCm to work.

The closest we got was using torchWithRocm and overriding the derivations of all comfyui dependencies that depend on torch to use torchWithRocm, which triggered many recompilations, at least one of which failed.

However, the new package and service work fine with the base torch package and the --cpu flag, so I think it's valuable to merge it as-is.

EDIT: I found some relevant work (#268919 #268736 #268746), so I will give running this package on torchWithRocm another shot once that area has stabilized.

@fazo96 fazo96 marked this pull request as ready for review November 21, 2023 21:39
Comment on lines 69 to 117
installPhase = ''
runHook preInstall
echo "Preparing bin folder"
mkdir -p $out/bin/
echo "Copying comfyui files"
cp -r $src/comfy $out/
cp -r $src/comfy_extras $out/
Contributor

I like this alternative much better.
Even better, submit a patch along with what SomeoneSerge said in the other comment and just fetch that patch so we don't have to include extra files in nixpkgs.

Comment on lines 89 to 140
echo "Patching python code..."
substituteInPlace $out/folder_paths.py --replace "if not os.path.exists(input_directory):" "if False:"
substituteInPlace $out/nodes.py --replace "os.listdir(custom_node_path)" "os.listdir(os.path.realpath(custom_node_path))"
Contributor

It's very much desirable to submit a PR fixing these crimes upstream, and include the patch linking the PR here

Contributor

@Madouura Madouura left a comment

Hi, I'm the person behind most of the ROCm stuff on nixpkgs nowadays.
Can't believe we're already at this point, great work!
There's a few problems though, mostly outlined in the comments below.
The ROCm version is likely not working because it needs torchvision and torchaudio to be the ROCm variants. https://github.com/comfyanonymous/ComfyUI#amd-gpus-linux-only
From what I can tell, there's no setup.py or pyproject.toml in the project, which is irritating, because otherwise we could add this to python-packages instead. I'm unsure if there's a better alternative to what you're currently doing in this PR. See: #268378 (comment) and #268378 (comment)
I may be missing some things, but overall aside from the things I've commented on, it looks good!

package = mkOption {
type = types.package;
default = pkgs.comfyui;
defaultText = "pkgs.comfyui";
Contributor

@Madouura Madouura Nov 21, 2023

Suggested change
defaultText = "pkgs.comfyui";
defaultText = literalExpression "pkgs.comfyui";
example = literalExpression "pkgs.comfyui-rocm";

nixos/modules/services/web-apps/comfyui.nix: outdated review thread (resolved)
Comment on lines 20 to 319
ultimate-sd-upscale = mkComfyUICustomNodes {
pname = "ultimate-sd-upscale";
version = "unstable-2023-08-16";

src = fetchFromGitHub {
owner = "ssitu";
repo = "ComfyUI_UltimateSDUpscale";
rev = "6ea48202a76ccf5904ddfa85f826efa80dd50520";
hash = "sha256-fUZ0AO+LARa+x30Iu+8jvEDun4T3f9G0DOlB2XNxY9Q=";
fetchSubmodules = true;
};
Contributor

Can we move this to its own package?

Contributor Author

I might need some help to figure out where I can define the mkComfyUICustomNodes function so that I can then import it from the .nix file that defines each custom node.

Contributor

@Madouura Madouura Nov 23, 2023

I think I misread the code here on my first review pass.
This is probably fine as it is, and we can safely ignore my comment here.
May want to consider patching shebangs with patchShebangs on the output however.
Just if you see anything particularly bad like #!/bin/bash, #!/bin/python3, etc.
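
A minimal sketch of that, assuming mkComfyUICustomNodes wraps stdenv.mkDerivation:

  postFixup = ''
    # rewrite shebangs like #!/bin/bash or #!/bin/python3 to store paths
    patchShebangs $out
  '';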

@@ -0,0 +1,103 @@
{ lib
, python311
Contributor

Suggested change
, python311
, python3Packages

Contributor

Okay, so since we can't move this to python-packages, pinning the version (python311) makes sense here.
However, this needs to be changed to python311Packages instead, again for splicing reasons.

pkgs/by-name/co/comfyui/package.nix: outdated review thread (resolved)
Comment on lines 35 to 48
pythonEnv = (python311.withPackages (ps: [ torch ] ++ [
ps.torchsde
ps.torchvision
ps.torchaudio
ps.transformers
ps.safetensors
ps.accelerate
ps.aiohttp
ps.einops
ps.pyyaml
ps.pillow
ps.scipy
ps.psutil
ps.tqdm
Contributor

Suggested change
pythonEnv = (python311.withPackages (ps: [ torch ] ++ [
ps.torchsde
ps.torchvision
ps.torchaudio
ps.transformers
ps.safetensors
ps.accelerate
ps.aiohttp
ps.einops
ps.pyyaml
ps.pillow
ps.scipy
ps.psutil
ps.tqdm
pythonEnv = (python3Packages.python.withPackages (ps: with ps; [
(
if gpuBackend == "cuda"
then torchWithCuda
else if gpuBackend == "rocm"
then torchWithRocm
else torch
)
torchsde
torchvision
torchaudio
transformers
safetensors
accelerate
aiohttp
einops
pyyaml
pillow
scipy
psutil
tqdm

Many of the torch-related packages here likely need rocm variants.

Contributor Author

Yes, off the top of my head I think torchsde, torchvision, torchaudio, transformers, safetensors, and accelerate all depend on torch and therefore need some kind of ROCm variant.

I have tried doing an override on them here, to pass the correct torch reference, but it didn't seem like the right way to do it.

I am thinking that we either need to declare a variant for each one, like we do for torch, or I will need to do an override here and pass the appropriate gpuBackend. However, I am not sure which one I should do.
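
As a sketch, the override route could look roughly like this inside the package (whether each of these packages actually accepts a torch argument would need checking):

  torch' =
    if gpuBackend == "cuda" then python311.pkgs.torchWithCuda
    else if gpuBackend == "rocm" then python311.pkgs.torchWithRocm
    else python311.pkgs.torch;

  pythonEnv = python311.withPackages (ps: [
    torch'
    (ps.torchvision.override { torch = torch'; })
    (ps.torchaudio.override { torch = torch'; })
    (ps.transformers.override { torch = torch'; })
    # ...and similarly for the other torch-dependent packages
  ]);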

@@ -6955,6 +6955,8 @@ with pkgs;

colormake = callPackage ../development/tools/build-managers/colormake { };

comfyui-custom-nodes = recurseIntoAttrs (callPackage ../by-name/co/comfyui/custom-nodes.nix { });
Contributor

Suggested change
comfyui-custom-nodes = recurseIntoAttrs (callPackage ../by-name/co/comfyui/custom-nodes.nix { });
comfyui-cuda = callPackage ../by-name/co/comfyui/package.nix { gpuBackend = "cuda"; };
comfyui-rocm = callPackage ../by-name/co/comfyui/package.nix { gpuBackend = "rocm"; };
comfyui-custom-nodes = recurseIntoAttrs (callPackage ../by-name/co/comfyui/custom-nodes.nix { });

I suggest adding the variants to make sure hydra caches them. (In comfyui-cuda's case, so we don't have to override comfyui for packages that want a cuda variant.)
We don't have to do this with the unadulterated comfyui, since the by-name implementation takes care of it.

Contributor

Note, we can also do this.

Suggested change
comfyui-custom-nodes = recurseIntoAttrs (callPackage ../by-name/co/comfyui/custom-nodes.nix { });
comfyui = callPackage ../by-name/co/comfyui/package.nix { };
comfyui-cuda = comfyui.override { gpuBackend = "cuda"; };
comfyui-rocm = comfyui.override { gpuBackend = "rocm"; };
comfyui-custom-nodes = recurseIntoAttrs (callPackage ../by-name/co/comfyui/custom-nodes.nix { });

Contributor Author

@Madouura looks much better, but shouldn't we use the naming scheme comfyWithRocm to match torchWithRocm?

Contributor

The withRocm and withCuda naming is arguably outdated.
We should probably eventually change it to torch-rocm (or torch-hip) and torch-cuda.

Contributor

I think the correct way to handle this is to introduce pkgsRocm/pkgsCuda, similar to pkgsStatic. Then we just assign comfyui-rocm = pkgsRocm.comfyui (modulo handling the infinite recursion).

Contributor

We may want to consider something like pkgsGPU.(rocm|cuda).

Contributor

@rnhmjoj rnhmjoj Nov 24, 2023

That wouldn't be too hard, would simplify a LOT of things, and would actually make this PR simpler since we could just put comfyui under pkgs(Cuda|Rocm).

Anyway, I think it's unreasonable to ask that every package that depends on torch declare two variants.
Why is that? To make Hydra build and cache both versions? Aren't both unfree and thus not cached anyway?

Contributor

ROCm is free now!

Contributor

But yeah, otherwise I don't see much point in those attributes. Personally, I only use nix (build|run) -f '<nixpkgs>' --arg config '{ cudaSupport = X; rocmSupport = Y; }' ..., instead of the xWithY attributes

Contributor

@rnhmjoj rnhmjoj Nov 24, 2023

Or alternatively, making it easy to override what torch is with an overlay. (it's not easy, right now).

@Madouura
Contributor

Madouura commented Nov 21, 2023

I should mention that we should probably not use python311.pkgs.
That locks the derivation into that specific version, which is why I suggested using python3Packages.
If there are incompatibilities, we can use python3Packages.pythonOlder and/or python3Packages.pythonAtLeast.
If we use @SomeoneSerge's derivation alternative, we can also just put this into the python packages directory instead. I just noticed that python-packages has its own by-name directory now! That's much preferable.
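
For reference, such a guard inside a python package usually looks like this, with pythonOlder taken from the python package set's arguments (the version bound here is purely illustrative):

  disabled = pythonOlder "3.10"; # skip this package on older interpreters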

};

preview-method = mkOption {
type = types.nullOr types.oneOf [
Contributor

This should probably be an enum and put in parentheses. Could be the cause of your manual build error.

i.e. types.nullOr (types.enum [...])


... wow this is what fixed it for me! Thank you!!! Once I had this sorted, it was just adding some description fields as needed.
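
Presumably the fixed option ended up looking something like this (the enum values listed are only illustrative):

  preview-method = mkOption {
    type = types.nullOr (types.enum [ "auto" "latent2rgb" "taesd" ]);
    default = null;
    description = mdDoc "Preview method for the web UI (null falls back to ComfyUI's default).";
  };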

extraArgs = mkOption {
type = types.attrsOf types.str;
default = {};
example = "--preview-method auto";
Contributor

This example has type str but the option type is attrsOf str


Good catch. Corrected!

@LoganBarnett

Thanks to @oddlama (to whom I owe many gratitude units, from this and from agenix-rekey), the manual building issue is now fixed.

My biggest hurdle now is pytorch-bin. I found the pytorch state-of-the-union post I said I'd lost! I'll try going through it and see if I can make some headway. Hopefully, if I'm still stuck after that, I can put forth some more intelligible and actionable questions.

Comment on lines 71 to 73
(transformers.override { torch = pytorch-bin; })
(safetensors.override { torch = pytorch-bin; })
(accelerate.override { torch = pytorch-bin; })
Contributor

Regarding pytorch-bin: we should abstain from referring to pytorch-bin directly and from using ad hoc overrides like that. Instead, we should consistently refer to Nixpkgs' attributes, so that users can expect their overlays to actually have an effect.

If a particular user decides to use pytorch-bin, they'll just import nixpkgs with something like { overlays = [ (final: prev: { pythonPackagesExtensions = [ (py-fin: py-pre: { torch = py-fin.pytorch-bin; }) ]; }) ]; }


Thanks! Yes this is the guidance I was looking for because I knew it was wrong. I'll give this a try and see if it still works for me.


Hmm, this hasn't worked for me in the end. I've spun my wheels quite a bit on getting this working. Currently I'm stuck on this when using this with pytorch-bin / torch-bin as above:

       error: function 'anonymous lambda' called with unexpected argument 'cudaSupport'

       at /nix/store/c7xdr8fi9vbjmp2srk61cidhc9azb9yf-source/pkgs/development/python-modules/torch/bin.nix:1:1:

And building torch just remains out of reach for me. The exact error seems to vary with the exact nixpkgs commit I'm using. It almost looks like a hardware error, but I've exercised the machine more consistently and extensively than building torch does (such as actually using it). The build machine has 32 GB of RAM and it has only used about 200 GB of its 1 TB of disk. Some advice suggests resource exhaustion.

I'm kind of stuck on this. I'll keep trying things.

FWIW I periodically rebase atop master so at least I'm not chasing last month's fixed bugs.


I was bit by a variant of #138943 (which I think is a separate but related issue from #255070). I've worked past it now. My latest force-push removes all of this torch-bin forcing since I have demonstrated everything works without it for myself. Thanks!


executable = writers.writeDashBin "comfyui" ''
cd $out && \
${pythonEnv}/bin/python comfyui \
Contributor

Was comfyui meant to be a module name (as when used with -m ...)? These are ideally done using pyproject.toml and entrypoints


I admit I am new to the Python ecosystem. I'll make it a point to take a look.

Contributor

I'm a bit lost, what exactly is the purpose of this? Why not call fetchurl directly?


I neglected to document this, and will do so the next chance I get to work on it. ComfyUI cares about the file names, particularly the extensions; it uses them to decide how to handle each file. With a plain fetchurl, Nix was storing the file with the hash as its name. Much of the model handling is about that, but also about leaving files in your store whose names give some idea of what they contain. I also noticed that an auth token is frequently required for a lot of models in various places, so that's where the bearerFile stuff comes in.

I'm not particularly attached to the approach, and it's possible I've missed some handy helpers or patterns in the process. It's also possible my findings were due to some bungle that I've fixed along the way.

Contributor

I'll try to stick to this thread now.

ComfyUI cares about the file names - particularly the extensions

I see. Canonicalizing names ad hoc like you do is surely one way to go. Ideally we'd want this canonicalization scheme to be stable (as in, we'd rather choose one and never change it). An alternative could be to use plain fetchurl and then symlink the artifacts in a separate derivation.

need ... some sort of auth token

It appears that bearer-based authentication support in fetchurl is something that already came up outside this PR (e.g. https://discourse.nixos.org/t/how-to-fetchurl-with-credentials/11994). I think it may be a bit out of scope for this PR to be taking care of this: this probably belongs in fetchurl itself, and the comfyui code should better be oblivious as to whether fetching the artifact involved any authentication.

Maybe we should have a closer look at this and open a tracking issue

This would be a path to the file, so it wouldn't be in your Nix store

Note that this would require special configuration on the host side, like extra-sandbox-paths.

Is there a general pattern established on how to get files named from fetchurl?

Note that custom names affect hashes too.


Apologies if I'm being overly verbose/pedantic: It sounds like you're saying we shouldn't be trying to store a derivation by any kind of filename since changing the filename (an immaterial aspect of the file from Nix's perspective) causes a re-download and duplicate storage of the file - thus we should not do that. Is that correct?

I'm fine with that if it's the case, and the filename symlinked in the end is the only thing we care about.

I am happy to remove this utility entirely, and simply suggest in documentation how one might use a bearer token. I'd come across the netrc stuff before (in that very post I think) - it may be my ignorance but it seemed like a less ideal way to go about handling a sensitive token, especially when there may be multiple of them needed. I'm using agenix-rekey to manage my secrets currently but I understand not everyone can/should do that.


This file and its function have been removed. The new models look like this:

      upscale_models = {
        "real-esrgan-4xplus.pth" = pkgs.fetchurl {
          url = "https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.0/RealESRGAN_x4plus.pth";
          sha256 = "sha256-T6DTiQX3WsButJp5UbQmZwAhvjAYJl/RkdISXfnWgvE=";
        };
        "stable-diffusion-4x-upscaler.safetensors" = pkgs.fetchurl {
          url = "https://huggingface.co/stabilityai/stable-diffusion-x4-upscaler/resolve/a44206cef52cdd20c6a32fea6257860474f0be43/x4-upscaler-ema.safetensors?download=true";
          sha256 = "35c01d6160bdfe6644b0aee52ac2667da2f40a33a5d1ef12bbd011d059057bc6";
        };
        "kim2091-4k-ultrasharp.pth" = pkgs.fetchurl {
          url = "https://huggingface.co/Kim2091/UltraSharp/resolve/b32be1b8c6f81d1f6ef8ce11f016aad3a139d1e0/4x-UltraSharp.pth";
          sha256 = "sha256-1UUMW6f2bilb9iROW6olbMcUnwOImHAgeEQaVDs6jW4=";
        };
      };
      vae = {
        "sdxl_vae.safetensors" = pkgs.fetchurl {
          url = "https://civitai.com/api/download/models/290640?type=VAE";
          sha256 = "1qf65fia7g0ammwjw2vw1yhijw5kd2c54ksv3d64mgw6inplamr3";
        };
      };

So this could come from any source now, and not just fetchurl, let alone the half-baked version I had. Thanks for pushing for this!


@lboklin I don't know how closely you've been tracking the work here but this is a noteworthy deviation from the prior interface.


I admit I've been sort of tunnel-visioning and doing my own thing. I only recently added some custom nodes from this PR, but otherwise my implementation is diverging a fair bit.

I deviated from the original interface along the way as well.

I flattened the hierarchy, turned the attribute-based install path heuristic into an installPath attribute of the model, and instead of using namespacing I added model type (and base model) as extra attributes to enable filtering (if a bit more involved than tab-completing in the repl).

One problem is that fetchurl (IIRC both the builtin and <nix/fetchurl.nix>) would fail to automatically derive a proper name from some civitai URLs: it uses the URL basename, which for civitai URLs is not only useless but also contains illegal characters like ?, so name had to be passed in explicitly. Does fetchurl here not have that problem?
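
For what it's worth, pkgs.fetchurl does accept an explicit name, so a model with an awkward URL can be pinned like this (hash shortened to a placeholder):

  "sdxl_vae.safetensors" = pkgs.fetchurl {
    name = "sdxl_vae.safetensors"; # overrides the URL basename used for the store path
    url = "https://civitai.com/api/download/models/290640?type=VAE";
    sha256 = "..."; # placeholder
  };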

Comment on lines 10 to 13
name = (lib.replaceStrings
["?" "&" ":" "/"]
["__questionmark__" "__ampersand__" "__colon__" "__slash__"]
url
Contributor

Note that this custom name will affect the outputHash. In particular, if you download the same safetensors file from two different URLs they'll end up in two different store paths


Good catch! Is there a general pattern established on how to get files named from fetchurl? I've only found a function that was removed from nixpkgs, but I can't remember what it was at the moment.

Comment on lines 50 to 51
"--header" "@${bearerFile}"
"--location"
Contributor

I didn't find any usages of bearerFile. Could you elaborate on its purpose too? Also note that if this is some kind of a secret you probably don't want it to go through the nix store


Indeed, but some sites require an auth token. This would be a path to the file, so it wouldn't be in your Nix store. I should document this better though. I'm also open to better approaches.

Comment on lines +10 to +16
, gpuBackend ? (
if config.cudaSupport
then "cuda"
else if config.rocmSupport
then "rocm"
else "none"
)
Contributor

This isn't used anywhere else, so Torch isn't getting compiled with CUDA and comfyui crashes at startup. Is there something blocking this?


This may be vestigial, but I'm not sure. I have older code doing things like using torchWithCuda if the gpuBackend is "cuda" for example. I don't think that's right, because then you can't override your own torch (like via torch-bin). This is a place where contributor documentation/advice could use some help.

In addition, I'm mostly blocked on $TIME and I can't even get this building presently because of torch via #268378 (comment).


@wyndon you might try again since my latest force push. It brings back in the stuff using gpuBackend. I don't have it completely verified to work though due to what I believe are unrelated problems.


I've worked past my issue and the fix does indeed work for me.

lboklin pushed a commit to lboklin/nixified-ai that referenced this pull request Jun 27, 2024
@chayleaf
Contributor

Feel free to use my ComfyUI derivation and my Maubot module for inspiration.

@lboklin

lboklin commented Jun 28, 2024

I also want to remind everyone of the existence of mine. It's quite usable.

@LoganBarnett

As far as I know, #268378 (comment) (use buildPythonApplication) is the last remaining feedback item to address. It relates to #268378 (comment) and I imagine they'll probably get addressed simultaneously.

Contributor

Hi! I just want to mention that this is quite a hefty (almost overwhelming 🙃) PR and personally I haven't even got around to read most of the diff. There is a lot to review if we wish to ensure that this is maintainable and will actually stay in Nixpkgs long-term. I would like this PR to see its logical conclusion, and to be honest I think that you'd speed the process up if you split this into smaller chunks (e.g. a PR for the package, a PR for the module)

}:

let

Contributor

Unneeded empty line

customNodesCollection = (
linkFarm "comfyui-custom-nodes" (builtins.map (pkg: { name = pkg.pname; path = pkg; }) customNodes)
);
in stdenv.mkDerivation rec {
Contributor

I think we should use the finalAttrs pattern here.

The finalAttrs pattern will let you remove the rec keyword (see its implementation in Nix).

Why is this pattern preferred to rec?

Let's take this simple code example:

mkDerivation rec {
   foo = 1;
   bar = foo + 1;
}

and then .overrideAttrs(old: { foo = 2; }), you'll get { foo = 2; bar = 2; } while with finalAttrs pattern, it would work correctly because it's a real fixed point.
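
For contrast, a minimal sketch of the same derivation with the finalAttrs pattern:

  mkDerivation (finalAttrs: {
    foo = 1;
    bar = finalAttrs.foo + 1; # refers to the final, possibly-overridden value of foo
  })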

Let me share a couple of useful links regarding the finalAttrs pattern:

  1. History: https://discourse.nixos.org/t/avoid-rec-expresions-in-nixpkgs/8293
  2. Documentation: https://nixos.org/manual/nixpkgs/unstable/#mkderivation-recursive-attributes
  3. Recent example of implementation: https://github.com/NixOS/nixpkgs/compare/17f96f7b978e61576cfe16136eb418f74fab9952..9e6ea843e473d34d4f379b8b0d8ef0425a06defe

Feel free to reach out if you need some assistance.


# outputs = ["" "extra_model_paths.yaml" "inputs" "outputs"];

meta = with lib; {
Contributor

I think you can get rid of lib here and add a meta.mainProgram too.

Contributor

$ grep -r "meta = {" pkgs | wc -l
13691
$ grep -r "meta = with lib; {" pkgs | wc -l
23145

both options are fine, but with lib; is more frequent

Contributor

Technically yes; however, the trend is to avoid prefixing sets with with .... It confuses static analysis tools and LSPs.

Contributor

Can't find the reference right now, but we are in the process of pruning instances of the meta = with lib; ... abuse. We still use the more local expressions like platforms = with lib.platforms; ... though, I think

@@ -0,0 +1,139 @@
{ config, lib, options, pkgs, ...}:

with lib;
Contributor

Indeed, with lib; needs to be removed.
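
One common replacement is an explicit inherit at the top of the module (the exact names pulled from lib depend on what the module actually uses; these are illustrative):

  { config, lib, options, pkgs, ... }:

  let
    inherit (lib) mkEnableOption mkIf mkOption types;
  in
  {
    # module body unchanged, now using mkOption etc. without `with lib;`
  }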

@Azeirah

Azeirah commented Aug 8, 2024

Is this still being worked on? I would really like to try ComfyUI with the recently released FLUX model. I have an AMD 7900 XTX, so I need to use ROCm.

I set nixpkgs.config.rocmSupport = true in my configuration.

I'm trying to install this using an overlay. In short, what I did was:

  1. Create an overlay file for comfy
final: prev: {
  comfyui = (import final.comfyui-preview {
    system = final.system;
    config = final.config;
  }).comfyui;
}
  2. Add comfyui to my flake's inputs
  # ...
  inputs = {
    #...
    comfyui-preview.url = "github:fazo96/nixpkgs/comfyui";
  };

  outputs = { self, nixpkgs, comfyui-preview, ...}@inputs: {
    # ...
    overlays.default = final: prev: {
      comfyui = (import comfyui-preview {
        system = final.system;
        config = final.config;
      }).comfyui;
    };
  }

I also added comfyui to my environment.systemPackages

While installing, I'm getting the following issue:

$ sudo nixos-rebuild test --flake /etc/nixos#lb-workstation
warning: Git tree '/etc/nixos' is dirty
building the system configuration...
warning: Git tree '/etc/nixos' is dirty
error:
       … while calling the 'head' builtin

         at /nix/store/qmh8bas1qni03drm0lnjas2azh7h87cn-source/lib/attrsets.nix:1575:11:

         1574|         || pred here (elemAt values 1) (head values) then
         1575|           head values
             |           ^
         1576|         else

       … while evaluating the attribute 'value'

         at /nix/store/qmh8bas1qni03drm0lnjas2azh7h87cn-source/lib/modules.nix:821:9:

          820|     in warnDeprecation opt //
          821|       { value = addErrorContext "while evaluating the option `${showOption loc}':" value;
             |         ^
          822|         inherit (res.defsFinal') highestPrio;

       (stack trace truncated; use '--show-trace' to show the full trace)

       error: 'miopengemm' has been deprecated.
       It is still available for some time as part of rocmPackages_5.

I believe ROCm was updated to version 6 earlier this year, and I believe this means that ROCm 5 support has been dropped in my version of Nixpkgs?

@adrian-gierakowski
Contributor

@Azeirah this works: nixified-ai/flake#94

@LoganBarnett

@Azeirah from my end this is presently on pause. There's a few review items left, some trivial and some potentially large. It's also been requested to move this into two separate PRs. None of the asks are unreasonable but $life demands have become sufficiently high.

I don't have means of testing ROCm support but that may change in the next few weeks. That said, your problem there seems more related to ROCm than to anything the comfyui Nix package is doing. If you believe that to be incorrect, then I'd need more info to go off of until I am able to test on my own hardware.

@Azeirah

Azeirah commented Sep 3, 2024

@adrian-gierakowski

I've been able to run comfyui itself with both this package and the flake you mentioned, but I'm unable to get either to work with ROCm. I do have llama.cpp running with ROCm using their flake, so at the very least I know nix + ROCm is possible. I do understand that llama.cpp uses ROCm directly and doesn't depend on pytorch, however.

I'm still trying to get this to work with ROCm, but it's difficult. python311Packages has a torchWithRocm package, but that package is marked as broken.

I can add allowBroken, but then I get another message stating that miopengemm is deprecated.

The broken marker reads: "Rocm support is currently broken because `rocmPackages.hipblaslt` is unpackaged. (2024-06-09)"

@Madouura Do you have any suggestions for getting pytorch rocm to work with this project as-is?

$ sudo nixos-rebuild test --flake /etc/nixos#lb-workstation                                                                                                                                                    
warning: Git tree '/etc/nixos' is dirty
building the system configuration...
warning: Git tree '/etc/nixos' is dirty
error:
       … while calling the 'head' builtin

         at /nix/store/qmh8bas1qni03drm0lnjas2azh7h87cn-source/lib/attrsets.nix:1575:11:

         1574|         || pred here (elemAt values 1) (head values) then
         1575|           head values
             |           ^
         1576|         else

       … while evaluating the attribute 'value'

         at /nix/store/qmh8bas1qni03drm0lnjas2azh7h87cn-source/lib/modules.nix:821:9:

          820|     in warnDeprecation opt //
          821|       { value = addErrorContext "while evaluating the option `${showOption loc}':" value;
             |         ^
          822|         inherit (res.defsFinal') highestPrio;

       (stack trace truncated; use '--show-trace' to show the full trace)

       error: 'miopengemm' has been deprecated.
       It is still available for some time as part of rocmPackages_5.

What confuses me is that I think this set-up is using rocmPackages_5, right? In rocmPackages/5/ there is a miopengemm folder, in rocmPackages/6/ there is not. So I think this error comes from somehow using rocmPackages/6?

I don't quite understand if the path in nixpkgs /pkgs/development/python-modules/ == python311Packages. Where does python311Packages come from in nixpkgs? Because the /pkgs/development/python-modules/torch/default.nix clearly states it's using rocmPackages_5, not rocmPackages_6, so I assume python311Packages does not resolve to this file?

@LoganBarnett

@Azeirah I've finally gotten a chance to sit down and try this out on an AMD GPU I have (it didn't have NixOS, which is another story). I'm seeing the exact same error you are. My attempts so far to override torch with torchWithRocm have failed. I see @fazo96 was tracking this same problem in earlier work, and I don't think it got anywhere, though it was due to compilation times and not build failures, I think? There's a lot of history in this PR :D

#268736 implies that we can just use torchWithRocm, but I'm not sure how. I've tried a few different things, with this one being the most promising:

  nixpkgs.overlays = [
    (final: prev: {
      pythonPackagesExtensions = [(py-final: py-prev: {
        transformers = py-prev.transformers.override {
          torch = py-prev.torchWithRocm;
        };
        safetensors = py-prev.safetensors.override {
          torch = py-prev.torchWithRocm;
        };
        accelerate = py-prev.accelerate.override {
          torch = py-prev.torchWithRocm;
        };
        torchaudio = py-prev.torchaudio.override {
          torch = py-prev.torchWithRocm;
        };
      })];
  })
];

@SomeoneSerge you posted about how torchWithRocm can build in #268736 - is that feasible here? Am I simply doing this incorrectly?

@SomeoneSerge
Contributor

Can't reply to all of the preceding messages yet, but regarding the very last one:

  1. python3Packages.torch: broken with config.rocmSupport = true #268736 is about the "global" config option (nix-build "<nixpkgs>" --arg config '{ rocmSupport = true; }' -A ...) and nix eval errors (trivially fixable, I expect)
  2. torchWithRocm is actually currently (last I checked) broken because it needs hipBLASLt: [Tracking] ROCm packages #197885 (shouldn't be hard, but not trivial)
  3. the overlay you're building could maybe be just fi: pre: { pythonPackagesExtensions = pre.pythonPackagesExtensions ++ [ (py-fi: py-pre: { torch = fi.torchWithRocm; }) ]; }; if torchWithRocm is defined as torch.override and causes infinite recursion (I forget if it is), maybe you could just copy-paste the whole callPackage expression from python-packages.nix (if you're using channels/impure eval it's callPackage <nixpkgs/pkgs/development/python-modules/torch> { /* stuff from torchWithRocm callPackage expr */ })

If that looks messy, it's because this stuff needs more love and more refactoring.
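
Spelled out with matching brackets, the overlay from point 3 would look roughly like this (py-prev is used for torchWithRocm to sidestep the infinite-recursion case mentioned above):

  final: prev: {
    pythonPackagesExtensions = prev.pythonPackagesExtensions ++ [
      (py-final: py-prev: {
        # make every package in the python set see the ROCm build of torch
        torch = py-prev.torchWithRocm;
      })
    ];
  }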

@Azeirah

Azeirah commented Oct 16, 2024

(quoting @LoganBarnett's comment above about overriding torch with torchWithRocm)

Hi! While it's not exactly what you're looking for here, I think it's useful enough to mention. I got ComfyUI working on Nix in a matter of minutes via a few shortcuts.

If you're interested, let me know. I'll grab my shovel and go start digging for nuggets in my OS config.

It has to do with distrobox, and I know distrobox is sacrilege in the Church of Nix, but I just think it's a beautiful case of endosymbiosis.

@wegank wegank added the 2.status: merge conflict This PR has merge conflicts with the target branch label Nov 9, 2024
Labels
2.status: merge conflict This PR has merge conflicts with the target branch 6.topic: cuda Parallel computing platform and API 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 6.topic: rocm 8.has: clean-up 8.has: maintainer-list (update) This PR changes `maintainers/maintainer-list.nix` 8.has: module (new) This PR adds a module in `nixos/` 8.has: module (update) This PR changes an existing module in `nixos/` 8.has: package (new) This PR adds a new package 10.rebuild-darwin: 1-10 10.rebuild-linux: 1-10 11.by: package-maintainer This PR was created by the maintainer of the package it changes
Projects
Status: 📋 The forgotten
Development

Successfully merging this pull request may close these issues.

Package request: comfyui