
Gradio only using cpu #725

Closed
4 tasks done
Voltage71 opened this issue Jan 16, 2025 · 5 comments
Labels
help wanted Extra attention is needed

Comments

@Voltage71

Checks

  • This template is only for usage issues encountered.
  • I have thoroughly reviewed the project documentation but couldn't find information to solve my problem.
  • I have searched for existing issues, including closed ones, and couldn't find a solution.
  • I confirm that I am using English to submit this report in order to facilitate communication.

Environment Details

OS: Windows 11
CPU: Ryzen 7 5800x
GPU: RX 6950 XT
Torch: 2.5.1+rocm6.2

When I try to synthesize text, it shows 1 step and takes up to 10 minutes for a single sentence. My CPU utilization jumps between 60% and 70% while my GPU stays at 2%.

I have tried this setup on my laptop, which runs Windows 11 with a GTX 1650 Max-Q, and it also uses 100% of the CPU instead of the GPU.

What am I doing wrong? I pasted the messages it printed on startup; I'm not sure what to change or where.
P.S. I am pretty new at all of this.

Steps to Reproduce

I created a new conda environment and installed torch with:

pip install torch==2.5.1+rocm6.2 torchaudio==2.5.1+rocm6.2 --extra-index-url https://download.pytorch.org/whl/rocm6.2

I then ran pip install git+https://github.com/SWivid/F5-TTS.git

I'm using the f5-tts_infer-gradio command after activating the conda environment. When I hit Synthesize, I get the following messages before it starts:

"FutureWarning: The input name inputs is deprecated. Please make sure to use input_features instead.
warnings.warn(
You have passed task=transcribe, but also have set forced_decoder_ids to [[1, None], [2, 50360]] which creates a conflict. forced_decoder_ids will be ignored in favor of task=transcribe.
Passing a tuple of past_key_values is deprecated and will be removed in Transformers v4.43.0. You should pass an instance of EncoderDecoderCache instead, e.g. past_key_values=EncoderDecoderCache.from_legacy_cache(past_key_values).
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results."
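The warnings quoted above come from the Whisper transcription step in Transformers and are mostly harmless; the real symptom is that inference runs on the CPU. A minimal sketch for checking whether PyTorch can see the GPU at all, assuming it is run inside the same conda environment that launches f5-tts_infer-gradio:

```python
# Sanity check (a sketch, not part of the original report): does the
# PyTorch installed in this environment actually see a GPU?
import importlib.util

if importlib.util.find_spec("torch") is None:
    print("torch is not installed in this environment")
else:
    import torch
    print("torch version:", torch.__version__)
    # ROCm builds reuse the torch.cuda API, so this call is the right
    # check for AMD GPUs as well as Nvidia ones.
    print("GPU available:", torch.cuda.is_available())
    if torch.cuda.is_available():
        print("device:", torch.cuda.get_device_name(0))
```

If this prints False, inference will silently fall back to the CPU no matter what the Gradio app is asked to do.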

✔️ Expected Behavior

No response

❌ Actual Behavior

No response

Voltage71 added the help wanted label on Jan 16, 2025
@SWivid
Owner

SWivid commented Jan 16, 2025

I created a new conda environment and installed torch with pip install torch==2.5.1+rocm6.2 torchaudio==2.5.1+rocm6.2 --extra-index-url https://download.pytorch.org/whl/rocm6.2

#671: try Linux or WSL,

or use the ONNX version for Windows

@Voltage71
Author

I created a new conda environment and installed torch with pip install torch==2.5.1+rocm6.2 torchaudio==2.5.1+rocm6.2 --extra-index-url https://download.pytorch.org/whl/rocm6.2

#671: try Linux or WSL,

or use the ONNX version for Windows

Do you think this would work on a Linux VM?

@SWivid
Owner

SWivid commented Jan 16, 2025

Do you think this would work on a Linux VM?

The issue linked above says it worked.

@Voltage71
Author

Voltage71 commented Jan 23, 2025

Do you think this would work on a Linux VM?

The issue linked above says it worked.

I borrowed an RTX 3060 12GB from a friend. I uninstalled all the AMD GPU drivers and installed the latest Nvidia drivers. I also installed CUDA 11.8 and verified the installation by running nvcc --version.

I then did

pip install torch==2.3.0+cu118 torchaudio==2.3.0+cu118 --extra-index-url https://download.pytorch.org/whl/cu118

And that installed successfully. I started up F5 and loaded in the sample. I tried generating text, and the output is:

Device set to use cpu
C:\Users\User\.conda\envs\f5-tts\lib\site-packages\transformers\models\whisper\generation_whisper.py:573: FutureWarning: The input name inputs is deprecated. Please make sure to use input_features instead.
warnings.warn(
You have passed task=transcribe, but also have set forced_decoder_ids to [[1, None], [2, 50360]] which creates a conflict. forced_decoder_ids will be ignored in favor of task=transcribe.
Passing a tuple of past_key_values is deprecated and will be removed in Transformers v4.43.0. You should pass an instance of EncoderDecoderCache instead, e.g. past_key_values=EncoderDecoderCache.from_legacy_cache(past_key_values).
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results.

My CPU usage is 60% while my GPU usage is at 2%.
What am I missing?
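One common cause of a "Device set to use cpu" line with an Nvidia card present is that a later pip install (for example, of F5-TTS and its dependencies) replaced the +cu118 wheel with a CPU-only torch build. A minimal sketch for checking which build is actually installed, assuming the same conda environment:

```python
# Sketch: check whether the installed torch wheel is a CUDA build or
# CPU-only. A CPU-only wheel reports torch.version.cuda as None even
# when a GPU and driver are present on the machine.
import importlib.util

if importlib.util.find_spec("torch") is None:
    print("torch is not installed in this environment")
else:
    import torch
    print("torch build:", torch.__version__)       # cu118 wheels look like "2.3.0+cu118"
    print("built with CUDA:", torch.version.cuda)  # "11.8" expected; None means CPU-only
    print("GPU visible:", torch.cuda.is_available())
```

If torch.version.cuda prints None, reinstalling torch with the cu118 extra index URL after installing F5-TTS should restore GPU support.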

@Voltage71
Author

Got it working!

The only thing I did was reinstall it using Pinokio.
