Export to ONNX Format #214
@DakeQQ Thank you very much! Thanks a lot! |
Thank you! It's working, and I can even use onnxruntime-directml (package) to run this on my AMD GPU! For that, the provider of ort_session_A and ort_session_C needs to be forced to ['CPUExecutionProvider'], but ort_session_B can use ['DmlExecutionProvider', 'CPUExecutionProvider'], and it's blazing fast compared to CPU. I'm facing a problem though: the outputs are always in Chinese... What do I need to change in 'Export_F5.py' to make this work for English? |
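For reference, a minimal sketch of the provider split described above, assuming three exported sessions as in the export script; the .onnx file names are placeholders, not the project's actual names:

```python
import onnxruntime

session_opts = onnxruntime.SessionOptions()

# The lightweight pre/post-processing graphs stay on CPU (they account for <1% of runtime).
ort_session_A = onnxruntime.InferenceSession(
    "model_A.onnx", sess_options=session_opts, providers=['CPUExecutionProvider'])
ort_session_C = onnxruntime.InferenceSession(
    "model_C.onnx", sess_options=session_opts, providers=['CPUExecutionProvider'])

# The heavy transformer graph goes to DirectML, with CPU as a fallback.
ort_session_B = onnxruntime.InferenceSession(
    "model_B.onnx", sess_options=session_opts,
    providers=['DmlExecutionProvider', 'CPUExecutionProvider'])
```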
Thank you for your testing. However, the setup for the English version may need to be answered by the original author of the F5-TTS project. The code for ONNX export and execution is based on the original work. According to my tests, ort_session_A and ort_session_C together take up less than 1% of the time cost, while ort_session_B occupies the majority of the time. |
Yes, and that is why inference speed is pretty much not affected by setting those to CPU. ort_session_B is what matters, and it runs fine on AMD GPUs using onnxruntime-directml! Anyway, I've tried messing around with the vocab and of course the reference audio and text, but the speaker always tries to speak Chinese, even when the ref text+audio and gen_text are in English. It may be worth noting that this has nothing to do with the fact I'm using DirectML, because it also happened before I even tried that. Looking forward to getting this working in English... @SWivid please check this out when you have time. Thanks once again! |
Hello~ The issue with the English voice should have been resolved. Please try again using the latest F5-TTS-ONNX version. @GreenLandisaLie |
It's working now in both Chinese and English! Thanks! @SWivid Maybe it's worth adding an 'ONNX' branch at https://huggingface.co/SWivid/F5-TTS/tree/main. |
@GreenLandisaLie Yes, the onnx version is great! Maybe better for @DakeQQ to do that? |
Can someone share the ONNX export? I would love to try it out! |
If anyone would be willing to run me through how to do this and get it working on my Win10 5700xt, I would be eternally grateful. (Well, at least until the next TTS upgrade comes out.) |
@KungFuFurniture see this repo |
Yes I saw that, cloned the repo, changed some path directories in the export.py... But now I'm lost. I am really new to all this (maybe a year or so) so I am not 100% on what I am getting wrong.
This is my error message. |
Regarding the error message: first, we use
(We may have accidentally deleted some code. Please fetch the latest code and try again.) |
So I did a complete start-over. Grabbed a fresh F5, a fresh venv, grabbed the link above, changed the file locations from user Dake... It seems my file structure and some names are a bit different, and I believe that is getting me into some trouble. For example:
load_checkpoints is in utils_infer, not models.utils, in my version of the F5 repo. But I believe I have found most of those things. Now I am stuck here:
I mean, I have the config and pytorch_model but I can't figure out where to put them. I have tried about 16 different folders, from a cached huggingface folder to the aforementioned infer folder. I dunno. I don't know anything about vocos, and its little brick road is far from Yellow. I fell out of Kansas quick. |
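A hedged sketch of the import-path difference mentioned above; the module and function names follow the comment and recent F5-TTS layouts, and may differ by version:

```python
# Newer F5-TTS versions ship the checkpoint loader in infer/utils_infer.py,
# while older export scripts import it from model/utils.py.
try:
    from f5_tts.infer.utils_infer import load_checkpoint  # newer layout
except ImportError:
    from f5_tts.model.utils import load_checkpoint        # older layout expected by the export script
```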
replace
|
Alright, making progress. Thank you for the help. After defining the local_path, I got the DiT uncond error again. Compared the two dit.py files; they are the same. So it did copy. I ran it again... and got a different error.
As you can see in the path, the model is there, the module is within it, and so are the functions we are after. So I added the following line to dit.py, as I used that once in a different project to resolve a similar "can't find the module" issue.
That did not help...
But hey, new errors are progress, right? |
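The exact line added to dit.py is not preserved above; a common workaround for this kind of "can't find the module" error is a sys.path shim like the following, which is purely a hypothetical reconstruction, not the project's fix:

```python
# Append the repository root to sys.path so sibling packages resolve when
# dit.py is imported from outside the package. Placed at the top of the file.
import os
import sys

sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
```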
error due to literally |
Git pulled, got an update... Same thing
|
The point is: |
@KungFuFurniture Just want to add one important thing: if you want to run this on an AMD GPU you might need to do the following. PS: this is how I did it a week ago, but the Export_F5.py file has been changed many times since then, so this might no longer work. Additionally, at the time, the Export_F5.py file did not contain the necessary audio transformations that allow for invalid-format .wav reference audio files, so I had to copy-paste those from the original code. You might or might not need to do this as well. Good luck :D Hopefully someone will release the converted .onnx models with a pipeline for them, so it will be easy to use in the future. |
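The reference-audio transformations mentioned above are not reproduced here; a hedged sketch of the kind of preprocessing usually involved (downmix to mono and resample; the 24 kHz target rate and the file path are assumptions):

```python
import torchaudio

waveform, sr = torchaudio.load("reference.wav")  # placeholder path to the reference clip
if waveform.shape[0] > 1:
    waveform = waveform.mean(dim=0, keepdim=True)  # downmix stereo to mono
if sr != 24000:
    waveform = torchaudio.transforms.Resample(orig_freq=sr, new_freq=24000)(waveform)
```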
@KungFuFurniture Please note that we use a modified vocos loading method, via the following code at line 52:
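The snippet referenced at line 52 is not preserved above; a hedged sketch of loading Vocos from a local folder instead of the Hugging Face hub (local_path is a placeholder for wherever config.yaml and pytorch_model.bin were downloaded):

```python
import torch
from vocos import Vocos

local_path = "./vocos-mel-24khz"  # placeholder folder containing config.yaml and pytorch_model.bin
vocos = Vocos.from_hparams(f"{local_path}/config.yaml")
state_dict = torch.load(f"{local_path}/pytorch_model.bin", map_location="cpu")
vocos.load_state_dict(state_dict)
vocos.eval()
```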
First, let me say to everyone: thank you for the help. So here is where I am. The export seems to have worked, and I can still run the app, and it works. But it works exactly the same, i.e. not using the GPU (AMD 5700xt). That is, I am sure, a result of what Green mentioned about adjustments to app.py. I feel like such a kindergartner in college. I am so far in over my head, gang. I learned Python from YouTube, lol. I know nothing about onnx or torch except that they help make the magic work. So, any suggestions on what to do next...? Again, all help is super appreciated. And I get it if you don't have time to educate me. Cheers to all. |
@KungFuFurniture
Additionally, set the ONNX Runtime log level to 0 or 1 with |
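The exact setting trails off above; ONNX Runtime exposes its log verbosity through SessionOptions, so a hedged sketch would be:

```python
import onnxruntime

session_opts = onnxruntime.SessionOptions()
# 0 = VERBOSE, 1 = INFO, 2 = WARNING (default), 3 = ERROR, 4 = FATAL.
# Lower values show what the runtime does with each node and execution provider.
session_opts.log_severity_level = 1
```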
It looks like the repo has changed a lot since the last ONNX export attempt.
Any ideas? |
@amblamps This should be fixed by @DakeQQ, many thanks! See F5-TTS/src/f5_tts/model/modules.py, lines 30 to 143, at commit 4a69e6b.
|
@amblamps
|
Thanks! That worked. |
Has anyone shared a recent ONNX export and code for inference? |
@DakeQQ Do any other modifications need to be made to the script to export the E2 TTS model aside from pointing it to the correct checkpoint? |
We have not yet attempted to export the E2-TTS model. If its function call path is the same as that of F5-TTS, theoretically, only modifying the model file path would be necessary to make the corresponding adjustments. However, the actual situation may be more complex, so we currently do not have specific plans to export E2-TTS in ONNX format. |
There still seem to be issues with the mel params; has anyone been able to export recently? |
@smickovskid What mel parameter issues are you encountering? |
Getting the same issue that @amblamps encountered
I am using a custom fine-tuned model, I also ran This is my
I am running Edit: I've changed it to model.load_state_dict(checkpoint["model_state_dict"], strict=False) and it passes now, but it fails further down the line with
|
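A hedged sketch of the workaround described above, assuming `model` is the F5-TTS transformer instantiated by the export script and that the checkpoint key follows the commenter's example; with strict=False, keys that do not match the graph are skipped instead of raising:

```python
import torch

checkpoint = torch.load("model_custom.pt", map_location="cpu")  # placeholder checkpoint path
state_dict = checkpoint.get("model_state_dict", checkpoint)     # fall back to a raw state dict
missing, unexpected = model.load_state_dict(state_dict, strict=False)
print("missing keys:", missing)
print("unexpected keys:", unexpected)  # worth inspecting: silently skipped weights can break output quality
```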
@smickovskid |
Hey @DakeQQ, sorry for the late response. Yeah that fixed it! Thanks for all the help. |
Can you provide a step-by-step guide for this? Do I clone that repository over this one? Do I run both in the same venv? Initially, I only copied "F5-TTS-ONNX-Inference.py" from that repository into f5-tts, activated its venv, downloaded the provided onnx models, changed the directory names, etc. When I just run the inference it automatically works on CPU, but of course no GPU support (Windows 10, RX 6600). If I just add 'DmlExecutionProvider', 'CPUExecutionProvider' to the ort_session_B providers I get this: AttributeError: module 'onnxruntime' has no attribute 'set_seed'. Of course I uninstalled onnxruntime and installed onnxruntime-gpu. Edit: reinstalled onnxruntime; it is working, but of course no GPU support. |
Perhaps try |
Finally! Uninstalled all onnxruntime packages and only installed onnxruntime-directml. Also changed ort_session_B = onnxruntime.InferenceSession(onnx_model_B, sess_options=session_opts, providers=ORT_Accelerate_Providers.append('CPUExecutionProvider')) to ort_session_B = onnxruntime.InferenceSession(onnx_model_B, sess_options=session_opts, providers=['DmlExecutionProvider']) (I was just adding DmlExecutionProvider inside the parentheses before). The speedup is almost 4.5x over CPU. Now the only thing left for me to solve is how to convert to ONNX so that longer sample sizes can be used; right now it is limited to around ten seconds. And it seems to only use around 2 GB of GPU memory, so it can clearly do better with the GPU. |
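One likely reason the earlier form fell back to CPU: list.append() returns None in Python, so providers=ORT_Accelerate_Providers.append('CPUExecutionProvider') passes None to the session. Building the list first avoids that; a minimal sketch (the model path is a placeholder):

```python
import onnxruntime

providers = ['DmlExecutionProvider', 'CPUExecutionProvider']
session_opts = onnxruntime.SessionOptions()

ort_session_B = onnxruntime.InferenceSession(
    "model_B.onnx",                 # placeholder for the exported transformer graph
    sess_options=session_opts,
    providers=providers,
)
print(ort_session_B.get_providers())  # confirm 'DmlExecutionProvider' is actually active
```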
Also changed the
That's what I needed!! Roaring on the GPU now!! SWEET!!! Happy Thanksgiving to those it applies to. |
Has ONNX improved the inference speed of the model? |
it "improved" it over cpu for sure :) Most probably still slower than cuda but since we amd users can't use it "easily" onnx is ok. |
Thank you very much for your reply |
Raised a PR on how to use this with an AMD GPU. It just works with standard Torch on Linux. |
Has anyone deployed F5-TTS on Qualcomm chips? |
Hey @patientx, can you share any timing benchmarks? How much time does it take to generate? |