Converting a SD 3.5 model from PyTorch to Core ML causes ValueError #51

justoneplanet · 2024-12-22T09:51:40Z

Thank you for developing these useful tools! I would like to report an issue. I ran the following command as described in README.md.

python -m python.src.diffusionkit.tests.torch2coreml.test_mmdit --sd3-ckpt-path stabilityai/stable-diffusion-3.5-medium --model-version 2b -o ./tmp --latent-size 64

It then failed unexpectedly with the following error.

INFO:__main__:Initializing SD3 model
INFO:__main__:Initialized.
INFO:__main__:Loading SD3 model checkpoint from ~/.cache/huggingface/hub/models--stabilityai--stable-diffusion-3.5-medium/snapshots/b940f670f0eda2d07fbb75229e779da1ad11eb80/sd3.5_medium.safetensors
INFO:diffusionkit.torch.model_io:Loading state_dict into nn.Module with  635 parameter tensors totaling 2084877376 parameters from ~/.cache/huggingface/hub/models--stabilityai--stable-diffusion-3.5-medium/snapshots/b940f670f0eda2d07fbb75229e779da1ad11eb80/sd3.5_medium.safetensors
INFO:diffusionkit.torch.model_io:Loaded state dict with 783 tensors totaling 2408206912 parameters
E
======================================================================
ERROR: setUpClass (__main__.TestSD3MMDiT)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "~/path-to/DiffusionKit/python/src/diffusionkit/tests/torch2coreml/test_mmdit.py", line 82, in setUpClass
    _load_mmdit_weights(cls.test_torch_model, TEST_SD3_CKPT_PATH)
  File "~/.pyenv/versions/3.10.11/lib/python3.10/site-packages/diffusionkit/torch/model_io.py", line 85, in _load_mmdit_weights
    raise ValueError(
ValueError: Total number of parameters in state_dict (2469663936) does not match the number of parameters in the module (2084877376)

----------------------------------------------------------------------
Ran 0 tests in 7.204s

FAILED (errors=1)

I don't think the torch/mmdit implementation is compatible with SD 3.5. Did I run the wrong command, or is there a solution for this?
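One way to narrow down a mismatch like this is to compare tensor names and shapes on both sides, not just the totals. The log above reports 783 tensors in the checkpoint against 635 in the module, which suggests whole tensors are missing on the module side. Below is a minimal sketch of such a comparison. The key names and shapes are made up for illustration and are not taken from the SD 3.5 checkpoint; the helper names `count_params` and `diff_shapes` are hypothetical and not part of DiffusionKit.

```python
from math import prod


def count_params(shapes):
    """Total element count for a mapping of tensor name -> shape tuple."""
    return sum(prod(shape) for shape in shapes.values())


def diff_shapes(ckpt_shapes, module_shapes):
    """List keys found on only one side, plus shared keys whose shapes differ."""
    ckpt_keys, module_keys = set(ckpt_shapes), set(module_shapes)
    return {
        "only_in_checkpoint": sorted(ckpt_keys - module_keys),
        "only_in_module": sorted(module_keys - ckpt_keys),
        "shape_mismatch": sorted(
            key
            for key in ckpt_keys & module_keys
            if tuple(ckpt_shapes[key]) != tuple(module_shapes[key])
        ),
    }


# Hypothetical toy data: the checkpoint carries one extra attention tensor
# that the module does not define.
ckpt = {
    "blocks.0.attn.qkv.weight": (12, 4),
    "blocks.0.attn2.qkv.weight": (12, 4),
    "final.weight": (4, 4),
}
module = {
    "blocks.0.attn.qkv.weight": (12, 4),
    "final.weight": (4, 4),
}

report = diff_shapes(ckpt, module)
print(count_params(ckpt), count_params(module))
print(report["only_in_checkpoint"])
```

With a real PyTorch module, the module-side mapping could be built with `{k: tuple(v.shape) for k, v in model.state_dict().items()}`. The actual loader may rename checkpoint keys before loading, so a raw diff like this can overstate the differences.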


arda-argmax · 2024-12-23T17:46:04Z

Hello,

If you want to run stabilityai/stable-diffusion-3.5-medium on Mac, you can use the MLX pipeline. As you mentioned, our torch/mmdit implementation is incompatible with stabilityai/stable-diffusion-3.5-medium; we only updated the MLX implementation.

Contributions are always welcome from anyone interested in helping out with the torch implementation.

justoneplanet · 2024-12-24T04:48:32Z

Hi, thank you for the response! I understand. It seems that some classes and mechanisms, such as dual attention, have not been ported yet, so it might take some time to do so.

Is there a reason why the diffusers classes such as SD3Transformer2DModel and JointTransformerBlock have not been adopted? I mean, instead of the following part of test_mmdit.py,

        cls.test_torch_model = (
            mmdit.MMDiT(mmdit.SD35_2b)
            .to(TEST_DEV)
            .to(TEST_TORCH_DTYPE)
            .eval()
        )

I thought the SD3Transformer2DModel class might work like this:

        from diffusers import SD3Transformer2DModel
        cls.test_torch_model = (
            SD3Transformer2DModel()
            .to(TEST_DEV)
            .to(TEST_TORCH_DTYPE)
            .eval()
        )

arda-argmax · 2024-12-24T06:55:23Z

We aim to maximize performance on Apple Silicon, which requires modifications to the PyTorch model. For example, we change nn.Linear layers to nn.Conv2d and apply other optimizations tailored to Apple Silicon's architecture. Additionally, the I/O of the models may not align with the existing Core ML pipeline because we implemented the mmdit module independently from HuggingFace's Diffusers library.
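To illustrate the kind of modification mentioned above, here is a minimal sketch of swapping an nn.Linear for an equivalent 1x1 nn.Conv2d. The helper name `linear_to_conv2d` and the (batch, channels, 1, sequence) tensor layout are assumptions made for this example; the thread does not confirm that DiffusionKit performs the conversion exactly this way.

```python
import torch
import torch.nn as nn


def linear_to_conv2d(linear: nn.Linear) -> nn.Conv2d:
    """Build a 1x1 Conv2d that computes the same projection as `linear`."""
    conv = nn.Conv2d(
        linear.in_features,
        linear.out_features,
        kernel_size=1,
        bias=linear.bias is not None,
    )
    with torch.no_grad():
        # Linear weight is (out, in); a 1x1 conv weight is (out, in, 1, 1).
        conv.weight.copy_(linear.weight[:, :, None, None])
        if linear.bias is not None:
            conv.bias.copy_(linear.bias)
    return conv


torch.manual_seed(0)
lin = nn.Linear(8, 16)
conv = linear_to_conv2d(lin)

x = torch.randn(2, 5, 8)                  # (batch, sequence, channels)
x_4d = x.transpose(1, 2).unsqueeze(2)     # (batch, channels, 1, sequence)

y_lin = lin(x)
y_conv = conv(x_4d).squeeze(2).transpose(1, 2)  # back to (batch, sequence, channels)
print(torch.allclose(y_lin, y_conv, atol=1e-5))
```

The two layers are numerically equivalent, but the converted model expects a different input layout. That is one reason the I/O of an independently written module may not match a pipeline built around another implementation.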

While we haven't tested the SD3Transformer2DModel class ourselves, you’re welcome to try it and see how it performs.

justoneplanet · 2024-12-24T10:34:35Z

Thank you for the explanation.
That makes sense. As you mentioned, even if the conversion is possible, there is no guarantee that a model generated from the diffusers classes can run with the Swift pipeline classes of apple/ml-stable-diffusion.
Let me think about it for a while.

arda-argmax added the "help wanted" label (Extra attention is needed) on Dec 23, 2024
