Feature request: Ensemble (multi-model combination) mode #12

c469591 · 2023-09-01T04:16:29Z

Hello,
is it possible to add the functionality of combining multiple models,
similar to UVR's Ensemble Mode? And can we specify the way of combination,
like choosing Min Spec, Max Spec, Average in UVR? Thank you.
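
For readers unfamiliar with those three options, here is a minimal sketch of what such combination modes could look like when applied per time-frequency bin of a spectrogram. It is an illustration only, not UVR's or audio-separator's implementation: the function name `combine_spectrograms`, the mode strings, and the nested-list spectrogram layout are all assumptions made for this example.

```python
from typing import List, Sequence

# Assumed layout for this sketch: spectrogram[frame][frequency_bin] -> complex value
Spectrogram = List[List[complex]]


def combine_spectrograms(specs: Sequence[Spectrogram], mode: str = "average") -> Spectrogram:
    """Combine same-shaped spectrograms bin by bin (hypothetical helper).

    mode:
      "min_spec" - keep the candidate with the smallest magnitude in each bin
      "max_spec" - keep the candidate with the largest magnitude in each bin
      "average"  - take the complex mean of all candidates in each bin
    """
    if not specs:
        raise ValueError("need at least one spectrogram")

    combined: Spectrogram = []
    for t in range(len(specs[0])):
        frame: List[complex] = []
        for f in range(len(specs[0][t])):
            candidates = [spec[t][f] for spec in specs]
            if mode == "min_spec":
                frame.append(min(candidates, key=abs))
            elif mode == "max_spec":
                frame.append(max(candidates, key=abs))
            elif mode == "average":
                frame.append(sum(candidates) / len(candidates))
            else:
                raise ValueError(f"unknown mode: {mode}")
        combined.append(frame)
    return combined
```

Intuitively, a minimum-magnitude rule keeps only what every model agrees is present, a maximum-magnitude rule keeps anything any model found, and averaging sits in between.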

hijaek · 2023-09-11T23:19:35Z

I was looking for the same

beveradb · 2023-09-12T07:00:50Z

It's certainly possible!

I'm personally not keen on diving back into the UVR code again any time soon to figure out how those features are implemented, but PRs are very much welcome on this repo and I'd happily pair with anyone interested to help them get up to speed with it :)

Most of the core logic in this project was cherry picked from https://github.com/Anjok07/ultimatevocalremovergui/blob/master/separate.py

c469591 · 2023-09-12T15:29:38Z

Hi,
I noticed that currently only MDX models are supported. Is it possible to add support for VR models? The VR models for noise reduction are very useful. Thank you!

beveradb · 2023-09-14T09:04:44Z

Anyone is welcome to submit pull requests to this repo :)

c469591 · 2023-09-14T10:46:20Z

Thank you.

beveradb · 2024-02-05T02:47:07Z

Hey folks, FYI I've been working on adding support for VR models this week, and I released audio-separator version 0.14 earlier today with initial support for VR models!

Please give it a try and see if it works for you!

I'm still working on documentation, tests and some packaging issues but the package on PyPI should "just work".

There's a new CLI parameter audio-separator --list_models which just prints all the models which are supported out of the box, and the interface has changed slightly (you now specify model filename with extension too).

I will inevitably be working on "ensemble mode" and model chaining functionality later this month, as I've been contracted to add support for stem splitting (which kinda goes hand in hand with that).

That said, it's already pretty easy to use audio-separator with multiple models in a row as the output filenames are consistent so you can easily script it to process a file with one model after another!
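
A rough sketch of what such a script could look like, for anyone who wants to try it. Everything here is an assumption for illustration: `run_model_chain` is a hypothetical helper, and `separate_fn` stands in for whatever you use to actually run audio-separator (the CLI via `subprocess`, or the Python package) and return the output paths keyed by stem name. The real output filename pattern is deliberately not hard-coded.

```python
from typing import Callable, Dict, List

# Hypothetical callback: (input_path, model_filename) -> {stem_name: output_path}
SeparateFn = Callable[[str, str], Dict[str, str]]


def run_model_chain(
    input_path: str,
    model_filenames: List[str],
    separate_fn: SeparateFn,
    stem_to_forward: str = "Vocals",
) -> List[Dict[str, str]]:
    """Run models one after another, feeding one chosen stem into the next model.

    Returns the stem-to-path mapping produced at each step, in order.
    """
    current_input = input_path
    history: List[Dict[str, str]] = []
    for model_filename in model_filenames:
        outputs = separate_fn(current_input, model_filename)
        if stem_to_forward not in outputs:
            raise KeyError(
                f"{model_filename} did not produce a '{stem_to_forward}' stem"
            )
        history.append(outputs)
        # The selected stem from this step becomes the input of the next step.
        current_input = outputs[stem_to_forward]
    return history
```

Passing the separation step in as a callback keeps the chaining logic independent of how the tool is invoked, so it can be exercised with a stub before pointing it at real models.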

c469591 · 2024-02-05T05:31:59Z

Hello, I am very pleased and excited that we can now use VR models from the CLI as well. Thank you so much.
I was wondering if it would be possible to add a merging feature in the future, similar to UVR, which can combine the output files produced by different models. This could greatly improve the sound quality of the extracted files.

beveradb · 2024-02-05T05:40:29Z

Yes, I plan to implement that - hopefully later this month! 😄

c469591 · 2024-02-05T13:07:04Z

Thank you! I'd like to ask a question that's been asked many times before: does MDX now support the 23c model?

beveradb · 2024-02-05T23:32:18Z

> Thank you! I'd like to ask a question that's been asked many times before: does MDX now support the 23c model?

I'm afraid not quite yet (that's still on my TODO list), but it's not far away now; I intend to implement that later this week or next!

beveradb · 2024-02-26T19:06:17Z

Hey @c469591, the latest version of audio-separator now supports MDX, VR and Demucs models.
I haven't yet finished implementation of the checkpoint models (MDX23C) but I plan to add that later this week.

I'm actually not very familiar with the ensemble mode in UVR; I'll try and dig into it and understand exactly what it's doing later this week too.
However, would you be able to explain what it does from your perspective, or provide any example audio files where it produces better results than a single model? Seeing great results from something motivates me to implement it!

Thank you!

c469591 · 2024-02-27T00:15:35Z

Hello,
Based on my years of experience using UVR, the ensemble mode roughly works like this; it consists of several steps.
The first step is to run each selected model individually.
For example, if I have chosen the 23c from MDX and the 5_HP-Karaoke-UVR from VR, UVR will first run 23c to generate separate accompaniment and vocal files, then it will run 5_HP-Karaoke-UVR to produce another set of accompaniment and vocal files.
Next, all the accompaniment tracks are merged into one file using a method that I'm not aware of; similarly for the vocal files.
I speculate that it might use some strategy like audio phase cancellation to nullify identical sounds across multiple tracks while merging different sounds together—though this is just an unfounded guess.
In the end, after merging, you get an accompaniment with harmonies because 5_HP-Karaoke-UVR includes harmonies.
Additionally, since 23c processes richer instrumental details in its accompaniments, you end up with a result that's generally better than what you'd get from any single model alone.
Of course, if one of the models didn't completely eliminate vocals from its track, those remnants would also be included in the final mixed-down accompaniment file.
That's my understanding of ensemble mode—I hope this helps you!
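
To make the steps above concrete, here is a small sketch of the "run every model, then merge per stem" flow. It assumes each model's result is already available as a dict of stem name to mono samples, and it merges by plain sample averaging, which is only one possible strategy and not necessarily what UVR does. `ensemble_average` and the `Stems` type are hypothetical names for this example.

```python
from typing import Dict, List

# Assumed representation for this sketch: stem name -> mono samples in [-1.0, 1.0]
Stems = Dict[str, List[float]]


def ensemble_average(per_model_stems: List[Stems]) -> Stems:
    """Merge the outputs of several models by averaging each stem sample by sample."""
    if not per_model_stems:
        raise ValueError("need the output of at least one model")

    merged: Stems = {}
    for stem_name in per_model_stems[0]:
        # Collect this stem (e.g. all the accompaniment tracks) from every model.
        tracks = [stems[stem_name] for stems in per_model_stems if stem_name in stems]
        # Truncate to the shortest track so every index exists in all tracks.
        length = min(len(track) for track in tracks)
        merged[stem_name] = [
            sum(track[i] for track in tracks) / len(tracks) for i in range(length)
        ]
    return merged
```

With averaging, anything only one model leaves in its accompaniment (harmonies, or leftover lead vocals) survives in the merged file at reduced level, which matches the behaviour described above.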

beveradb · 2024-03-15T04:44:13Z

Hey @c469591 , thanks for the write up above, that actually does help me understand the motivation a lot!

I haven't yet gotten around to working on Ensemble mode, but I wanted to give you a heads up that as of version 0.16.2 or higher, audio-separator does now support MDXC models and the VIP models from UVR.

What you've described does actually sound like something I'd like to be able to use myself (I value separation which retains harmonies / backing vocals for the karaoke tracks I make; so far I've mostly been using UVR_MDXNET_KARA_2.onnx on its own), so I'm motivated to get it working so I too can have that kind of combination of 23c + 5_HP-Karaoke.

I just can't promise when I'll get around to it as my hobby time is limited!

c469591 · 2024-03-15T14:11:05Z

Hi @beveradb,
I'm glad my write-up was helpful to you. I look forward to seeing this feature completed. Thank you for your hard work and contribution!

JackismyShephard · 2024-07-18T13:28:16Z

@beveradb are there any updates on ensemble mode?

beveradb · 2024-07-18T13:40:21Z

Afraid not @JackismyShephard ; to be honest new feature development for audio-separator is something I'm unlikely to be independently motivated to do as my hobby time is limited and I've been pretty happy with my results from audio-separator as it is already for my use case ( https://create.karaokehunt.com )

That said, I would still like to give it a try, I just need a bit of extra help / motivation. If you'd be willing to help / interested in pairing on it some time feel free to email me with a good date/time for a zoom/meet and that'll probably be the thing to get it started 🙏

JackismyShephard · 2024-07-18T16:08:26Z

@beveradb Completely understandable. The karaoke app looks interesting.

It might be interesting to work together on the ensemble mode or other features to add to this project, as I see it has a lot of potential. However, I am a bit busy with my own project (https://github.com/JackismyShephard/ultimate-rvc) as well as my day job, so not sure how much time I have left 😄

beveradb · 2024-07-18T17:35:33Z

No worries, well feel free to email me at [email protected] if you ever have a little free time and wanna pair on getting ensemble mode working :)

beveradb added the enhancement New feature or request label Sep 22, 2023

beveradb added the help wanted Extra attention is needed label Dec 21, 2023

beveradb changed the title ~~Function request, can the multi-model combination function be added?~~ Feature request: Ensemble (multi-model combination) mode Feb 26, 2024
