General questions #5

Open
betegon opened this issue Mar 5, 2020 · 5 comments

betegon commented Mar 5, 2020

Hi @vbelz,

First of all, thank you for your work. I have tried denoising some audio and it worked very well, but I have a few questions.

Quoted from README:

Specify how many frames you want to create as nb_samples in args.py (or pass it as argument from the terminal) I let nb_samples=50 by default for the demo but for production I would recommend having 40 000 or more.

1. What is exactly nb_samples?

2. Are the weights provided by you from nb_samples=50?

3. Should I resample the audio to 8 kHz before denoising, or is that done inside the network? Also, should I do it for training?

4. I want to tweak it to be a better denoiser for background noise rather than specific sounds. What are your thoughts on this? I have a dataset with clean samples and background noise samples. Will it work if I train it on that? Which hyperparameters should I use?

Thank you so much and sorry for bothering you!

Owner

vbelz commented Mar 6, 2020 via email

Author

betegon commented Mar 16, 2020

Hi @vbelz, thank you for your kind and quick response.

I have been working on creating the data necessary to train it, and I have a few more questions (sorry for bothering you).

I have approximately 10 hours of audio, and when I try to create the dataset, I get the following error, raised in the function numpy_audio_to_matrix_spectrogram:

m_mag_db = np.zeros((nb_audio, dim_square_spec, dim_square_spec))
MemoryError: Unable to allocate array with shape (37028, 257, 257) and data type float64
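
As a rough check on the error above, the size of the requested array can be estimated from its shape and dtype. This is only a minimal sketch: `array_nbytes` and `ITEMSIZE` are hypothetical names for illustration, not part of the repository, and the shape is simply the one reported in the traceback.

```python
import math

# Bytes per element for the two dtypes discussed in this thread
# (these match NumPy's itemsize for float64 and float32).
ITEMSIZE = {"float64": 8, "float32": 4}


def array_nbytes(shape, dtype="float64"):
    """Return the number of bytes a dense array of this shape and dtype needs."""
    return math.prod(shape) * ITEMSIZE[dtype]


shape = (37028, 257, 257)  # shape reported in the MemoryError
for dtype in ("float64", "float32"):
    gib = array_nbytes(shape, dtype) / 2**30
    print(f"{dtype}: {gib:.1f} GiB")
```

Under these assumptions a single float64 array of that shape needs roughly 18 GiB, and float32 halves that, so processing the audio in smaller batches may still be necessary on a typical machine.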

Also, I have the following questions:

1.) I have used dimensions of 256x256, as I have downsampled the audio to 16 kHz instead of 8 kHz. The window I have used is 16128 samples long, which is slightly more than one second. Do you think this is a correct approach? Your window was 64 samples longer than one second at 8 kHz, so I scaled it for 16 kHz. Also, the problem I am facing is that the size I get when preparing the dataset is 256x257 (the dimensions that librosa.stft returns). I don't know why it isn't 256x256, as my parameters are: hop_length_fft = 63, n_fft = 510, frame_length = 16128 and hop_length_frame = 16128. This gives 16128/63 = 256, so I don't know where the 257th column comes from.

2.) Why should the window be around one second long? Would performance improve if it were smaller or bigger?

3.) Do you think there will be any major loss of performance from decreasing the precision to 32-bit (i.e. numpy datatype = 'float32')?

4.) It looks like you are cropping all audio files, as you don't include their last window. Therefore, I have added zero padding to the end of each audio file to fill the last window. What do you think about this?

5.) I have concatenated the audio files one after another so they keep the audio structure. Is there a special reason to create a random order? You use the function blend_noise_randomly to do this.

6.) What's the difference between frame_length and hop_length_frame? I think they refer to the same parameter: the sliding window size for the STFT, which is by definition the frame_length.
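
The shape arithmetic behind questions 1, 4 and 6 can be sketched as follows. This is a hedged illustration, not the repository's code: `stft_shape` and `count_frames` are hypothetical helpers, and the sketch assumes librosa.stft runs in its default centered mode, which pads n_fft // 2 samples on each side before framing. It also assumes frame_length is the size of each audio window and hop_length_frame is the step between the starts of consecutive windows.

```python
def stft_shape(n_samples, n_fft, hop_length, center=True):
    """Shape (freq_bins, time_frames) of a one-sided STFT.

    Assumes centered mode pads n_fft // 2 samples on each side before framing.
    """
    padded = n_samples + 2 * (n_fft // 2) if center else n_samples
    freq_bins = 1 + n_fft // 2
    time_frames = 1 + (padded - n_fft) // hop_length
    return freq_bins, time_frames


def count_frames(n_samples, frame_length, hop_length_frame, pad_last=False):
    """Number of windows of frame_length samples, stepping by hop_length_frame.

    With pad_last=True, a trailing partial window is kept (zero-padded)
    instead of being dropped.
    """
    if n_samples < frame_length:
        return 1 if (pad_last and n_samples > 0) else 0
    steps, remainder = divmod(n_samples - frame_length, hop_length_frame)
    return 1 + steps + (1 if pad_last and remainder else 0)


print(stft_shape(16128, 510, 63))  # (256, 257): even n_fft adds one column
print(stft_shape(8064, 255, 63))   # (128, 128): odd n_fft, 8 kHz settings
print(stft_shape(16128, 511, 63))  # (256, 256): odd n_fft at 16 kHz

# 40000 samples, window of 16128 samples:
print(count_frames(40000, 16128, 16128))                 # 2, tail is dropped
print(count_frames(40000, 16128, 16128, pad_last=True))  # 3, tail is padded
print(count_frames(40000, 16128, 8064))                  # 3, windows overlap by half
```

Under these assumptions, the extra column comes from the centered padding combined with an even n_fft, and an odd value such as n_fft = 511 would give a square 256x256 output. When hop_length_frame equals frame_length the windows do not overlap; a smaller hop produces overlapping windows and therefore more training frames from the same audio.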

Thanks a lot for your time and effort

cheers.

@Vishesh813

@vbelz please help on this.

@vaishalibhardwaj

@vbelz please help on this.

Hello Vishesh,
As I am new to this project, could you guide me on how to get to the denoised output?
Note: I do not have any GPUs on my computer.

Vishesh813 commented Jul 29, 2020 via email
