
tts-1-hd doesn't work with Radeon 6800XT and ROCm 5.7 #31

Open
abdulocracy opened this issue Jul 10, 2024 · 11 comments
Labels: enhancement (New feature or request), help wanted (Extra attention is needed)

Comments

@abdulocracy

I'm running the ROCm Docker image, but it looks like it's only using the CPU when I try to use the tts-1-hd model. From the logs:

2024-07-10 22:08:17.843 | INFO     | __main__:__init__:58 - Loading model xtts to cpu
@matatonic (Owner)

I'm sorry to hear that; I'm not able to test this myself. Are you using docker-compose.rocm.yaml? Please share some more details about how you're running it (docker compose) and the card you have.

@abdulocracy (Author)

I'm running it with Podman, using the same config as the docker-compose.rocm.yaml file.

I'm running a Radeon 6800XT.

I also run Ollama in Podman on this machine and it's able to use ROCm with just the two devices exposed, nothing else needed.
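For reference, "the two devices" here presumably means the usual ROCm device nodes, /dev/kfd and /dev/dri. A minimal sketch of such an invocation (the image name is a placeholder, not the actual Ollama command):

# sketch: expose the ROCm kernel driver and render nodes to the container
sudo podman run --device /dev/kfd --device /dev/dri <image>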

@matatonic (Owner)

Thanks for the reply, I'm currently traveling with limited internet access but I will try to get to this as soon as I can.

@matatonic (Owner)

I don't see the Radeon 6800XT listed as supported for ROCm 5.7; you may need to build your own image with an older version of ROCm PyTorch. They do support the "AMD Radeon™ Pro W6800", but I don't think that's the same card; only the 7800XT series is supported by 5.7. I am not able to test this and am not really sure this is correct. I'll have to look at how Ollama does it later.

@matatonic (Owner)

The 6800XT seems to require some trickery; try setting the following environment variable:

HSA_OVERRIDE_GFX_VERSION=10.3.0

From comments in: https://www.reddit.com/r/Amd/comments/179dncu/amd_rocm_pytorch_now_supported_with_the_radeon_rx/
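A sketch of a one-off test with the override set (untested on this card; the device flags and image name are assumptions based on a typical ROCm container setup, not the project's exact compose config):

# untested sketch: run the ROCm image once with the GFX override applied
sudo podman run --device /dev/kfd --device /dev/dri \
  -e HSA_OVERRIDE_GFX_VERSION=10.3.0 <openedai-speech-rocm-image>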

@abdulocracy (Author)

Unfortunately this env var doesn't seem to fix it.

@abdulocracy (Author)

I think this is caused by openedai-speech using ROCm 5.7, while Ollama runs ROCm 6, which works with my GPU.

@matatonic (Owner)

matatonic commented Jul 13, 2024

Unfortunately, this means that support won't officially land in the pre-built image for a while. ROCm 6.0 brings in and requires torch 2.3, which so far is causing a lot of dependency and build problems (and the image is over 10GB, so it won't fit on GitHub). You could try changing the requirements file yourself (from 5.7 to 6.0), but you may need to install a lot of extra stuff, like the nvidia cuda-toolkit, various developer tools, etc.
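If you want to experiment, the swap would look roughly like this (untested; whether the rest of the stack tolerates the torch 2.3 wheels is exactly the open question):

# untested: install torch/torchaudio from the ROCm 6.0 wheel index instead of 5.7
pip install --index-url https://download.pytorch.org/whl/rocm6.0 torch torchaudio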

I'll leave this open for now, but it probably will not get fixed until the whole stack is upgraded to torch 2.3.

matatonic changed the title from "tts-1-hd doesn't work with ROCM" to "tts-1-hd doesn't work with Radeon 6800XT and ROCm 5.7" on Jul 13, 2024
@stormymeadow

Hello.

First of all thank you for making this project. I have been looking for a TTS extension for Open WebUI and this project fits perfectly.

I wanted to share my observations that might help solve the issue. On my AMD card I managed to run xtts with ROCm from a Podman container, but I had to make some changes.
I made two variants: one with the current python:3.11-slim image and one with rocm/pytorch.

Branch: Experimental ROCm variants on AMD 7700 XT

Start Podman

sudo podman-compose -f <compose-file> up

Changes

  • Added - label=disable under security_opt in the docker-compose.
  • Removed --mount=type=cache,target=/root/.cache/pip
  • On python:3.11-slim, installed ROCm
  • Moved the ROCm PyTorch pip install to the front of the pip installs (see the sketch after the requirements file below):
    pip install --index-url https://download.pytorch.org/whl/rocm5.7 -r requirements-rocm-torch.txt
  • Set HSA_OVERRIDE_GFX_VERSION=11.0.0

requirements-rocm-torch.txt

torch
torchaudio
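Putting it together, the install order looks roughly like this (a sketch; requirements.txt stands in here for whatever installs the project's remaining dependencies):

# sketch: install the ROCm torch wheels first so nothing later drags in the CUDA build
pip install --index-url https://download.pytorch.org/whl/rocm5.7 -r requirements-rocm-torch.txt
pip install -r requirements.txt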

Details

  • - label=disable got around some permission issues. There might be better solutions for this.

  • Pip cache caused some errors for me, so I took it out.

  • I had to take the --index-url part out of the requirements file, as pip didn't interpret it there. I checked the nightly too; the --pre flag also can't be in there. When I tried to install it as the last package, it wasn't installed properly (I had the CUDA version instead of ROCm). Maybe other packages pull in a different PyTorch and somehow conflict, so I installed it before the other pip packages.

  • On python:3.11-slim as the base I couldn't install ROCm at first. The amdgpu-install from the package-manager installation got stuck on some gzip/zstd issue with initramfs, which leads down a rabbit hole of debugging. Thankfully I found this blog, which had a reference to a deb-package ROCm install Debian guide, and got that installed. The ROCm installation then got stuck at rocm-gdb, which has dependencies on python3.10; I couldn't get around it, so I decided to exclude it from the installation:

# Fetch and install the amdgpu-install helper package
RUN wget https://repo.radeon.com/amdgpu-install/6.1.3/ubuntu/jammy/amdgpu-install_6.1.60103-1_all.deb
RUN yes Y | apt-get install -y ./amdgpu-install_6.1.60103-1_all.deb && apt update
# Install the rocm meta-package's dependencies, skipping rocm-developer-tools
RUN apt-get install -y $(apt-cache depends rocm | tail -n +2 | sed "s/Depends://g" | grep -v rocm-developer-tools)
# Install rocm-developer-tools' dependencies, skipping the debugger/profiler parts (rocm-gdb pulls in python3.10)
RUN apt-get install -y $(apt-cache depends rocm-developer-tools | tail -n +2 | sed "s/Depends://g" | grep -vE "tracer|debug|gdb|dbg|profile")
  • HSA_OVERRIDE_GFX_VERSION - the version matters. I run whisper independently on the host: 10.3.0 causes a segmentation fault, while 11.0.0 works fine, though this might differ with other cards.
    When the variable is missing:
ERROR    | __main__:generator:331 - Exception: RuntimeError('HIP error: invalid device function\nHIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.\nFor debugging consider passing AMD_SERIALIZE_KERNEL=3.\nCompile with `TORCH_USE_HIP_DSA` to enable device-side assertions.\n')
  • With rocm/pytorch as the base image, ROCm is already installed. The default torch has a weird version, 2.3.0a0+gitae01701, which didn't work for me, so I had to reinstall the ROCm PyTorch.

Note: Older amdgpu-install (5.7) doesn't seem to provide a rocm meta-package that can be installed through apt-get install, so apt-cache depends will fail.

Debugging

Enter the container:

sudo podman exec -it <container-id> bash

If rocminfo (or /opt/rocm-6.1.2/bin/rocminfo) runs successfully, that's a good sign that ROCm is available.
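It can also show which gfx target the card reports, which is useful when picking an HSA_OVERRIDE_GFX_VERSION value:

# list the gfx targets rocminfo reports for the installed GPUs
rocminfo | grep gfx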

In python:

import torch
torch.__version__          # the version suffix indicates a ROCm or CUDA build
torch.cuda.is_available()  # should return True on a working ROCm setup
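The same check as a one-liner from the container shell:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"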


@ChrisDeadman

I can confirm that uninstalling the CUDA version of torch and torchaudio, then installing the ROCm versions, fixed it for me as well (why is the CUDA version installed in the ROCm container in the first place?).
I am using an RX 6900 XT, hence I had to set HSA_OVERRIDE_GFX_VERSION=10.3.0.
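For reference, the swap inside the container looks roughly like this (a sketch; the rocm5.7 index matches the image discussed in this thread, adjust to your ROCm version):

# sketch: remove the CUDA wheels, then install the ROCm builds in their place
pip uninstall -y torch torchaudio
pip install --index-url https://download.pytorch.org/whl/rocm5.7 torch torchaudio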

But I don't see much of a performance difference; it's pretty much the same performance I get with my Ryzen 5800X3D, which leads to annoying pauses between sentences 🫤

In conclusion, I would say that at least for the 6000 series it doesn't make much sense, since it reduces the VRAM available for LLMs and doesn't result in better performance.

@parkingmeter

I really appreciate the work you put in on that branch, @stormymeadow! On my machine, running on the GPU is a significant improvement.
CPU: AMD Ryzen 9 5900X
GPU: AMD ATI Radeon RX 6900 XT

matatonic added the enhancement (New feature or request) and help wanted (Extra attention is needed) labels on Aug 27, 2024