## Description

### Is there an existing issue for this problem?

- [x] I have searched the existing issues

### Operating system

Linux

### GPU vendor

AMD (ROCm)

### GPU model

RX 7800 XT

### GPU VRAM

16GB

### Version number

4.2.0a4

### Browser

LibreWolf 125.0.2-1
### Python dependencies

```json
{
  "accelerate": "0.29.2",
  "compel": "2.0.2",
  "cuda": null,
  "diffusers": "0.27.2",
  "numpy": "1.26.4",
  "opencv": "4.9.0.80",
  "onnx": "1.15.0",
  "pillow": "10.3.0",
  "python": "3.11.9",
  "torch": "2.2.2+rocm5.6",
  "torchvision": "0.17.2+rocm5.6",
  "transformers": "4.39.3",
  "xformers": null
}
```
### What happened

When trying to generate an image, InvokeAI allocated a huge amount of VRAM and then failed because it could not allocate any more to actually generate the image.

Having recently moved from Nvidia (a card with only 11 GB of VRAM) to AMD, I find this unusual: I was only generating an 832x1480 image, which is not particularly large — at least, not large enough to trigger OOM on my Nvidia card in other Stable Diffusion front-ends before I tried InvokeAI today.
### What you expected to happen

I expected the image to generate without issue, probably even with VRAM to spare.
### How to reproduce the problem

- Edit the `invoke.sh` script, changing line 41 to `HSA_OVERRIDE_GFX_VERSION=11.0.0 invokeai-web $PARAMS`, to get past an initial segfault bug when attempting generations
- Use this model (~2 GBs)
- Use this VAE (~385.9 MBs)
- Set aspect ratio to `9:16`
- Set width to `832`
- Set height to `1480`
- Enable High Resolution Fix (or don't; the result is the same either way)
- If you enabled the High Resolution Fix, set denoise to `0.55` and set the upscaler to `ESRGAN`
- Set the scheduler to `DPM++ 2M Karras`
- Set steps to `25`
- Set CFG Scale to `7.5`
- Leave the VAE at the default precision (`FP32`)
- Set Clip Skip to `2`
- Leave CFG Rescale Multiplier at `0`, or follow the tooltip and set it to `0.7` (the result is the same regardless)
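For reference, the `invoke.sh` edit from the first step looks roughly like this (the line number and surrounding script may differ between versions; the commented-out allocator setting is the fragmentation workaround the OOM message below suggests, which I have not confirmed helps):

```shell
# invoke.sh, around line 41: force ROCm to treat the RX 7800 XT (gfx1101)
# as gfx1100 so the prebuilt HIP kernels load instead of segfaulting.
HSA_OVERRIDE_GFX_VERSION=11.0.0 invokeai-web $PARAMS

# Optional, untested: the OOM error message suggests this allocator
# setting to reduce fragmentation.
# PYTORCH_HIP_ALLOC_CONF=expandable_segments:True invokeai-web $PARAMS
```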
### Additional context

- Specific Linux distro: Gentoo (LLVM17-built)
- Kernel: 6.9.0-rc6-tkg-eevdf-gentoo-llvm-zen2
- BLAS implementation: OpenBLAS

Terminal output:
Generate images with a browser-based interface
>> patchmatch.patch_match: ERROR - patchmatch failed to load or compile (/usr/lib64/libtiff.so.6: undefined symbol: jpeg12_read_raw_data, version LIBJPEG_6.2).
>> patchmatch.patch_match: INFO - Refer to https://invoke-ai.github.io/InvokeAI/installation/060_INSTALL_PATCHMATCH/ for installation instructions.
[2024-05-02 07:37:46,993]::[InvokeAI]::INFO --> Patchmatch not loaded (nonfatal)
[2024-05-02 07:38:06,846]::[InvokeAI]::INFO --> Using torch device: AMD Radeon RX 7800 XT
[2024-05-02 07:38:07,024]::[InvokeAI]::INFO --> cuDNN version: 2020000
[2024-05-02 07:38:07,038]::[uvicorn.error]::INFO --> Started server process [19373]
[2024-05-02 07:38:07,038]::[uvicorn.error]::INFO --> Waiting for application startup.
[2024-05-02 07:38:07,038]::[InvokeAI]::INFO --> InvokeAI version 4.2.0a4
[2024-05-02 07:38:07,039]::[InvokeAI]::INFO --> Root directory = /mnt/chonker/InvokeAI/InstallDir
[2024-05-02 07:38:07,039]::[InvokeAI]::INFO --> Initializing database at /mnt/chonker/InvokeAI/InstallDir/databases/invokeai.db
[2024-05-02 07:38:07,277]::[InvokeAI]::INFO --> Pruned 1 finished queue items
[2024-05-02 07:38:07,752]::[InvokeAI]::INFO --> Cleaned database (freed 0.02MB)
[2024-05-02 07:38:07,752]::[uvicorn.error]::INFO --> Application startup complete.
[2024-05-02 07:38:07,752]::[uvicorn.error]::INFO --> Uvicorn running on http://127.0.0.1:9090 (Press CTRL+C to quit)
[2024-05-02 07:38:09,825]::[uvicorn.access]::INFO --> 127.0.0.1:58476 - "GET /ws/socket.io/?EIO=4&transport=polling&t=OyvJ-gP HTTP/1.1" 200
[2024-05-02 07:38:09,830]::[uvicorn.access]::INFO --> 127.0.0.1:58476 - "POST /ws/socket.io/?EIO=4&transport=polling&t=OyvJ-gf&sid=REhucWSM_N8uuUmfAAAA HTTP/1.1" 200
[2024-05-02 07:38:09,831]::[uvicorn.error]::INFO --> ('127.0.0.1', 58488) - "WebSocket /ws/socket.io/?EIO=4&transport=websocket&sid=REhucWSM_N8uuUmfAAAA" [accepted]
[2024-05-02 07:38:09,832]::[uvicorn.error]::INFO --> connection open
[2024-05-02 07:38:09,832]::[uvicorn.access]::INFO --> 127.0.0.1:58494 - "GET /ws/socket.io/?EIO=4&transport=polling&t=OyvJ-gf.0&sid=REhucWSM_N8uuUmfAAAA HTTP/1.1" 200
[2024-05-02 07:38:09,836]::[uvicorn.access]::INFO --> 127.0.0.1:58476 - "GET /ws/socket.io/?EIO=4&transport=polling&t=OyvJ-gf.1&sid=REhucWSM_N8uuUmfAAAA HTTP/1.1" 200
[2024-05-02 07:38:09,864]::[uvicorn.access]::INFO --> 127.0.0.1:58476 - "GET /api/v1/queue/default/status HTTP/1.1" 200
[2024-05-02 07:38:10,080]::[uvicorn.access]::INFO --> 127.0.0.1:58476 - "GET /api/v1/images/?board_id=none&categories=control&categories=mask&categories=user&categories=other&is_intermediate=false&limit=0&offset=0 HTTP/1.1" 200
[2024-05-02 07:38:10,081]::[uvicorn.access]::INFO --> 127.0.0.1:58494 - "GET /api/v1/app/config HTTP/1.1" 200
[2024-05-02 07:38:10,082]::[uvicorn.access]::INFO --> 127.0.0.1:58476 - "GET /api/v1/images/?board_id=none&categories=general&is_intermediate=false&limit=0&offset=0 HTTP/1.1" 200
[2024-05-02 07:38:10,082]::[uvicorn.access]::INFO --> 127.0.0.1:58494 - "GET /api/v1/images/intermediates HTTP/1.1" 200
[2024-05-02 07:38:10,083]::[uvicorn.access]::INFO --> 127.0.0.1:58506 - "GET /api/v1/app/version HTTP/1.1" 200
[2024-05-02 07:38:10,083]::[uvicorn.access]::INFO --> 127.0.0.1:58510 - "GET /api/v1/boards/?all=true HTTP/1.1" 200
[2024-05-02 07:38:10,084]::[uvicorn.access]::INFO --> 127.0.0.1:58524 - "GET /api/v1/images/?board_id=none&categories=general&is_intermediate=false&limit=100&offset=0 HTTP/1.1" 200
[2024-05-02 07:38:10,093]::[uvicorn.access]::INFO --> 127.0.0.1:58536 - "GET /api/v1/app/app_deps HTTP/1.1" 200
[2024-05-02 07:38:10,094]::[uvicorn.access]::INFO --> 127.0.0.1:58476 - "GET /api/v1/queue/default/list HTTP/1.1" 200
[2024-05-02 07:38:10,095]::[uvicorn.access]::INFO --> 127.0.0.1:58494 - "GET /api/v1/queue/default/status HTTP/1.1" 200
[2024-05-02 07:38:19,190]::[uvicorn.access]::INFO --> 127.0.0.1:40554 - "POST /api/v1/queue/default/enqueue_batch HTTP/1.1" 200
[2024-05-02 07:38:19,410]::[uvicorn.access]::INFO --> 127.0.0.1:40554 - "GET /api/v1/queue/default/status HTTP/1.1" 200
[2024-05-02 07:38:19,444]::[uvicorn.access]::INFO --> 127.0.0.1:40568 - "GET /api/v1/queue/default/list HTTP/1.1" 200
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:09<00:00, 2.57it/s]
/mnt/chonker/InvokeAI/InstallDir/.venv/lib/python3.11/site-packages/lightning_utilities/core/imports.py:14: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
import pkg_resources
/mnt/chonker/InvokeAI/InstallDir/.venv/lib/python3.11/site-packages/pkg_resources/__init__.py:2832: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('google')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
declare_namespace(pkg)
/mnt/chonker/InvokeAI/InstallDir/.venv/lib/python3.11/site-packages/lightning_fabric/__init__.py:40: Deprecated call to `pkg_resources.declare_namespace('lightning_fabric')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
/mnt/chonker/InvokeAI/InstallDir/.venv/lib/python3.11/site-packages/pytorch_lightning/__init__.py:37: Deprecated call to `pkg_resources.declare_namespace('pytorch_lightning')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
[2024-05-02 07:38:52,148]::[ModelLoadService]::INFO --> Converting /mnt/chonker/InvokeAI/InstallDir/models/sd-1/vae/kl-f8-anime2.ckpt to diffusers format
[2024-05-02 07:39:05,988]::[uvicorn.access]::INFO --> 127.0.0.1:33102 - "GET /api/v1/images/i/b38fe7ca-e4a0-404c-ba73-6a5f59acd186.png HTTP/1.1" 200
[2024-05-02 07:39:06,196]::[InvokeAI]::INFO --> Downloading RealESRGAN_x4plus.pth...
RealESRGAN_x4plus.pth: 67.1MiB [01:12, 929kiB/s]
Upscaling: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:12<00:00, 6.08s/it]
[2024-05-02 07:40:32,518]::[uvicorn.access]::INFO --> 127.0.0.1:37426 - "GET /api/v1/images/i/f9bc272e-5d2e-4db7-91b6-618951310362.png HTTP/1.1" 200
[2024-05-02 07:40:32,728]::[uvicorn.access]::INFO --> 127.0.0.1:37426 - "GET /api/v1/images/i/08b0fd84-8637-4d7c-981b-db15358bc173.png HTTP/1.1" 200
0%| | 0/14 [00:03<?, ?it/s]
[2024-05-02 07:40:52,353]::[InvokeAI]::ERROR --> Error while invoking session f0d54825-89e2-4f9f-8acb-4e24b2f43737, invocation 18dd1847-bf84-4d6b-9269-f76177444e74 (denoise_latents):
HIP out of memory. Tried to allocate 11.03 GiB. GPU 0 has a total capacity of 15.98 GiB of which 2.48 GiB is free. Of the allocated memory 12.84 GiB is allocated by PyTorch, and 48.98 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_HIP_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
[2024-05-02 07:40:52,353]::[InvokeAI]::ERROR --> Traceback (most recent call last):
File "/mnt/chonker/InvokeAI/InstallDir/.venv/lib/python3.11/site-packages/invokeai/app/services/session_processor/session_processor_default.py", line 185, in _process
outputs = self._invocation.invoke_internal(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/chonker/InvokeAI/InstallDir/.venv/lib/python3.11/site-packages/invokeai/app/invocations/baseinvocation.py", line 281, in invoke_internal
output = self.invoke(context)
^^^^^^^^^^^^^^^^^^^^
File "/mnt/chonker/InvokeAI/InstallDir/.venv/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/mnt/chonker/InvokeAI/InstallDir/.venv/lib/python3.11/site-packages/invokeai/app/invocations/latent.py", line 991, in invoke
result_latents = pipeline.latents_from_embeddings(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/chonker/InvokeAI/InstallDir/.venv/lib/python3.11/site-packages/invokeai/backend/stable_diffusion/diffusers_pipeline.py", line 339, in latents_from_embeddings
latents = self.generate_latents_from_embeddings(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/chonker/InvokeAI/InstallDir/.venv/lib/python3.11/site-packages/invokeai/backend/stable_diffusion/diffusers_pipeline.py", line 419, in generate_latents_from_embeddings
step_output = self.step(
^^^^^^^^^^
File "/mnt/chonker/InvokeAI/InstallDir/.venv/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/mnt/chonker/InvokeAI/InstallDir/.venv/lib/python3.11/site-packages/invokeai/backend/stable_diffusion/diffusers_pipeline.py", line 517, in step
uc_noise_pred, c_noise_pred = self.invokeai_diffuser.do_unet_step(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/chonker/InvokeAI/InstallDir/.venv/lib/python3.11/site-packages/invokeai/backend/stable_diffusion/diffusion/shared_invokeai_diffusion.py", line 199, in do_unet_step
) = self._apply_standard_conditioning(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/chonker/InvokeAI/InstallDir/.venv/lib/python3.11/site-packages/invokeai/backend/stable_diffusion/diffusion/shared_invokeai_diffusion.py", line 343, in _apply_standard_conditioning
both_results = self.model_forward_callback(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/chonker/InvokeAI/InstallDir/.venv/lib/python3.11/site-packages/invokeai/backend/stable_diffusion/diffusers_pipeline.py", line 590, in _unet_forward
return self.unet(
^^^^^^^^^^
File "/mnt/chonker/InvokeAI/InstallDir/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/chonker/InvokeAI/InstallDir/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/chonker/InvokeAI/InstallDir/.venv/lib/python3.11/site-packages/diffusers/models/unets/unet_2d_condition.py", line 1216, in forward
sample, res_samples = downsample_block(
^^^^^^^^^^^^^^^^^
File "/mnt/chonker/InvokeAI/InstallDir/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/chonker/InvokeAI/InstallDir/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/chonker/InvokeAI/InstallDir/.venv/lib/python3.11/site-packages/diffusers/models/unets/unet_2d_blocks.py", line 1279, in forward
hidden_states = attn(
^^^^^
File "/mnt/chonker/InvokeAI/InstallDir/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/chonker/InvokeAI/InstallDir/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/chonker/InvokeAI/InstallDir/.venv/lib/python3.11/site-packages/diffusers/models/transformers/transformer_2d.py", line 397, in forward
hidden_states = block(
^^^^^^
File "/mnt/chonker/InvokeAI/InstallDir/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/chonker/InvokeAI/InstallDir/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/chonker/InvokeAI/InstallDir/.venv/lib/python3.11/site-packages/diffusers/models/attention.py", line 329, in forward
attn_output = self.attn1(
^^^^^^^^^^^
File "/mnt/chonker/InvokeAI/InstallDir/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/chonker/InvokeAI/InstallDir/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/chonker/InvokeAI/InstallDir/.venv/lib/python3.11/site-packages/diffusers/models/attention_processor.py", line 522, in forward
return self.processor(
^^^^^^^^^^^^^^^
File "/mnt/chonker/InvokeAI/InstallDir/.venv/lib/python3.11/site-packages/diffusers/models/attention_processor.py", line 1279, in __call__
hidden_states = F.scaled_dot_product_attention(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
torch.cuda.OutOfMemoryError: HIP out of memory. Tried to allocate 11.03 GiB. GPU 0 has a total capacity of 15.98 GiB of which 2.48 GiB is free. Of the allocated memory 12.84 GiB is allocated by PyTorch, and 48.98 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_HIP_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
[2024-05-02 07:40:52,354]::[InvokeAI]::INFO --> Graph stats: f0d54825-89e2-4f9f-8acb-4e24b2f43737
Node Calls Seconds VRAM Used
main_model_loader 1 0.001s 0.000G
clip_skip 1 0.000s 0.000G
compel 2 5.175s 0.343G
collect 2 0.000s 0.236G
noise 2 0.004s 0.294G
denoise_latents 2 27.174s 12.907G
core_metadata 1 0.000s 1.615G
vae_loader 1 0.000s 1.615G
l2i 1 18.170s 3.224G
esrgan 1 86.134s 7.383G
img_resize 1 0.535s 0.294G
i2l 1 15.543s 3.693G
TOTAL GRAPH EXECUTION TIME: 152.736s
TOTAL GRAPH WALL TIME: 152.744s
RAM used by InvokeAI process: 3.69G (+2.874G)
RAM used to load models: 3.97G
VRAM in use: 1.615G
RAM cache statistics:
Model cache hits: 10
Model cache misses: 5
Models cached: 5
Models cleared from cache: 0
Cache high water mark: 1.99/7.50G
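A back-of-the-envelope calculation (my own guess, not confirmed) suggests the 11.03 GiB allocation is exactly one full self-attention score matrix materialized in fp16 at the high-res-fix resolution, assuming the high-res pass doubles both dimensions to 1664x2960 and `scaled_dot_product_attention` falls back to the naive math backend on ROCm:

```python
# Hypothetical sketch: estimate the memory needed to materialize one
# N x N self-attention score matrix in an SD1.5 UNet at the resolution
# the High Resolution Fix would produce. Assumes a 2x upscale of
# 832x1480 and fp16 activations.

width, height = 832 * 2, 1480 * 2             # assumed 2x high-res-fix upscale
latent_tokens = (width // 8) * (height // 8)  # VAE downsamples 8x per side
bytes_per_elem = 2                            # fp16

# Naive (non-fused) attention materializes the full N x N score matrix.
score_matrix_bytes = latent_tokens ** 2 * bytes_per_elem
print(f"{score_matrix_bytes / 2**30:.2f} GiB")  # prints 11.03 GiB
```

The result matches the "Tried to allocate 11.03 GiB" in the traceback, which is why I suspect the fused attention kernel is not being used on this GPU.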
### Discord username

No response