How to detect noise, breaks, and multiple speakers in an audio file? #673

TomSuen · 2024-12-27T09:26:31Z

Checks

This template is only for questions, not feature requests or bug reports.
I have thoroughly reviewed the project documentation and read the related paper(s).
I have searched for existing issues, including closed ones, and found no similar questions.
I confirm that I am using English to submit this report in order to facilitate communication.

Question details

Hello, I am not from the audio field. I have a reference audio clip from which I have removed the BGM and, to some extent, the reverberation, but the voice cloning result is still poor when I use it as input. Is there a better way to detect whether the reference audio contains noise, distortion, or multiple people speaking?

sam4muzix · 2024-12-27T14:58:04Z

Removing BGM and reverb from audio also strips out many frequency ranges, which makes the result difficult for the model to analyse. So it's better to use a different dataset that contains only voice. Whisper can still transcribe the processed audio, but for audio cloning a raw, voice-only dataset is recommended.
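
For the noise and distortion part of the question, a rough screening pass can be sketched with NumPy alone. This is only an illustrative sketch, not something taken from this project: the function names (`clipping_ratio`, `frame_energies_db`, `rough_snr_db`), the 0.99 clipping threshold, and the 1024-sample frame length are all assumptions, and the input is assumed to be mono float samples normalized to [-1, 1]. It does not detect multiple speakers; that would likely need a separate speaker diarization model.

```python
import numpy as np


def clipping_ratio(samples, threshold=0.99):
    """Fraction of samples at or above `threshold` of full scale.

    A high value hints at clipping distortion. The threshold is an assumption.
    """
    samples = np.asarray(samples, dtype=np.float64)
    if samples.size == 0:
        return 0.0
    return float(np.mean(np.abs(samples) >= threshold))


def frame_energies_db(samples, frame_len=1024):
    """Mean power of each non-overlapping frame, in dB."""
    samples = np.asarray(samples, dtype=np.float64)
    n_frames = samples.size // frame_len
    frames = samples[: n_frames * frame_len].reshape(n_frames, frame_len)
    power = np.mean(frames ** 2, axis=1)
    # Small constant avoids log10(0) on digital silence.
    return 10.0 * np.log10(power + 1e-12)


def rough_snr_db(samples, frame_len=1024):
    """Level of the loudest frames minus the quietest frames, in dB.

    Assumes the clip contains pauses, so the quietest frames
    approximate the noise floor. This is a heuristic, not a true SNR.
    """
    db = frame_energies_db(samples, frame_len)
    if db.size == 0:
        raise ValueError("clip is shorter than one frame")
    return float(np.percentile(db, 95) - np.percentile(db, 5))
```

A clip with a low `rough_snr_db` or a noticeable `clipping_ratio` is probably a weak reference, but the cut-off values would have to be tuned by listening to a few known good and bad clips.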

mlndlesslydev · 2024-12-27T20:49:21Z

This project helped me a lot, but it's a bit of a pain to install and it only works on Linux.

TomSuen added the question Further information is requested label Dec 27, 2024
