bug in generate_validate_data #2

ghost · 2019-11-26T09:56:40Z

Hi,
I tried running generate_validate_data.py as described in the instructions.
It exits with an error: on line 131, the call to "augment.SNR_adjusting" expects two input arguments, speech and noise, but only speech is supplied. Perhaps it is a matter of the package version. I added the noise argument and it finished running successfully, and I see that some files were created.
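
For anyone hitting the same error, here is a minimal sketch of what a speech-plus-noise SNR adjustment typically does: it scales the noise so that the speech-to-noise power ratio reaches a target value, then mixes the two. The function name `mix_at_snr`, its `snr_db` parameter, and the synthetic signals are assumptions made for illustration; this is not the repository's `augment.SNR_adjusting` implementation.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech/noise power ratio equals `snr_db`, then mix.

    Hypothetical helper for illustration; not the repository's function.
    """
    speech = np.asarray(speech, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)[: len(speech)]
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    if noise_power == 0:
        raise ValueError("noise is silent; cannot set an SNR")
    # Target noise power is speech_power / 10^(snr_db / 10).
    gain = np.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    scaled_noise = gain * noise
    return speech + scaled_noise, scaled_noise


rng = np.random.default_rng(0)
speech = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)  # synthetic stand-in for speech
noise = rng.standard_normal(16000)
mixture, scaled_noise = mix_at_snr(speech, noise, snr_db=5.0)
achieved_snr_db = 10 * np.log10(np.mean(speech ** 2) / np.mean(scaled_noise ** 2))
print(f"achieved SNR: {achieved_snr_db:.2f} dB")
```

Calling such a function with only the speech signal would fail in the same way as the error reported here, since the noise is needed to compute the scaling gain.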

It would be very useful if you could specify the versions of the packages you used, and the outputs that should be expected from each of the running stages.
Thank you.

dreamibor · 2019-11-27T14:24:57Z

Hi,
I got the same error. Did you manage to solve it?
Thanks!

AkojimaSLP · 2019-11-28T15:22:39Z

Hi, thank you for your question.

generate_validate_data.py bug: You are correct. I fixed generate_validate_data.py.
Versions of the packages: I added the package versions to the README.
Outputs that should be expected from each of the running stages: I'll add this soon.

Sorry for the trouble.

regards,

ghost · 2019-12-01T09:04:11Z

Thank you for the fix :-)

It works fine now.
Regarding packages: it seems that the "cython" and "matplotlib" packages are required too.
I would also appreciate it if you could specify the Python version you used (if I figured it out correctly, it is python=3.5, is that right?).
That would be very helpful indeed.

Currently I have managed to run 'generate_validate_data.py' and 'train.py' successfully.
I ran them using the sample files that you supplied. I saw that '.npy' files were created in the folders 'validation_features' and 'model'.
Running 'predict.py' generated two figures (see attached picture), but it seems that it got stuck and did not finish running.

dreamibor · 2019-12-02T14:37:06Z

Just closing the two image windows should let the program finish.

ghost · 2019-12-03T13:05:04Z

Closing the two windows indeed created two result 'wav' files. However, the result file named 'enhacement_all_channels.wav' still sounds pretty noisy to me... :-(
More or less as noisy as the wav files in the folder '\dataset\data_for_beamforming'...
Am I missing something?

AkojimaSLP · 2019-12-04T14:22:31Z

Thank you for the report.

cython and matplotlib: I added the package versions to the README.

Python version: I use 3.6.7.

I hope this helps.

regards.

AkojimaSLP · 2019-12-04T14:24:30Z

> Closing the two windows indeed created two result 'wav' files. However, the result file named 'enhacement_all_channels.wav' still sounds pretty noisy to me... :-(
> More or less as noisy as the wav files in the folder '\dataset\data_for_beamforming'...
> Am I missing something?

Thank you for your question.
Which model did you use: the pre-trained model or a model you trained yourself?

regards.

ghost · 2019-12-04T16:21:51Z

Thank you for answering and for making these changes.
Regarding your question: I simply ran the Python programs according to the instructions that you kindly supplied in the README file (first 'generate_validate_data.py', then 'train.py' and last 'predict.py'). I used the files that existed in the git version of the code (flac + wav).

I didn't know that there is an option to run it with a pre-trained model. You didn't supply such a model, so I can't use it...?

After running 'train.py' I got a result file named 'neaural_mask_estimator0.hdf5.data-00000-of-00001'. If the number in the name is supposed to indicate the number of epochs, then I assume it's not working correctly...
Since the training folder holds only about 10 speech files and 9 noise files, perhaps the problem is that the training set is too small?...

Actually my goal is to run it for single-channel speech enhancement. It was one of the options mentioned in the article [2]. But it seems that the software is designed for multi-channel enhancement...
Am I correct?

Thank you again for your patience. :-)

AkojimaSLP · 2019-12-05T14:25:22Z

Thank you for your reply.

pre-train model

Sorry for the confusion. In predict.py, there is a parameter "WEIGHT_PATH", which is defined as a variable in the program.

./model/194sequence_false_e1.hdf5 is the pre-trained model. If you want to predict using your own trained model, please change the model name, for example to "neaural_mask_estimator0.hdf5".
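
The file name reported earlier in this thread ('neaural_mask_estimator0.hdf5.data-00000-of-00001') looks like a TensorFlow-checkpoint-style shard, in which case the path to use would be the prefix without the '.data-*' suffix. Below is a small sketch for checking that a candidate WEIGHT_PATH actually has weights on disk before running predict.py. The helper names `weights_exist` and `pick_weight_path` are hypothetical and not part of the repository, and the checkpoint-format reading is an assumption based only on the file name.

```python
import glob
import os


def weights_exist(path):
    """True if `path` is a file, or a prefix that has checkpoint-style shards."""
    if os.path.isfile(path):
        return True
    # TensorFlow-format checkpoints are stored as <prefix>.index plus
    # <prefix>.data-XXXXX-of-XXXXX, so look for the shard files.
    return bool(glob.glob(glob.escape(path) + ".data-*"))


def pick_weight_path(candidates):
    """Return the first candidate that has weights on disk, else None."""
    for path in candidates:
        if weights_exist(path):
            return path
    return None


# File names are taken from this thread; the ./model/ location of the
# self-trained weights is an assumption.
WEIGHT_PATH = pick_weight_path([
    "./model/neaural_mask_estimator0.hdf5",
    "./model/194sequence_false_e1.hdf5",
])
print("using weights:", WEIGHT_PATH)
```

Listing the self-trained model first means it is preferred when present, with the pre-trained file as a fallback.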

model

Your setup is working correctly; however, my sample data is minimal, as you know.
So please try training the model with more noise and speech data.
Also, the parameters in train.py are only for testing.
Please set appropriate parameters (for example, EPOCH=20, NUMBER_OF_UTTERNACE=15).

single or multi

You are correct. This software is designed for multi-channel input.
Basically, the TF mask is estimated in order to design the beamformer (beamforming is multi-channel signal processing).
However, you can also perform single-channel speech enhancement using the predicted TF mask.
The method is easy, because you just multiply the TF mask by the speech in the time-frequency domain.
I prepared a single-channel speech enhancement sample for you as ./predict_single.py (sorry, it is a lazy implementation).
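
The mask-multiplication idea described above can be sketched as follows. This is only an illustration, assuming the mask is a real-valued array in [0, 1] with the same frequency-by-frame shape as the complex STFT of the noisy signal. The function `apply_tf_mask` and the toy arrays are hypothetical, and this is not the code in ./predict_single.py.

```python
import numpy as np


def apply_tf_mask(noisy_stft, mask):
    """Element-wise multiply a complex STFT by a real TF mask.

    Each bin's magnitude is scaled by the mask; the noisy phase is kept.
    """
    noisy_stft = np.asarray(noisy_stft)
    mask = np.clip(np.asarray(mask, dtype=np.float64), 0.0, 1.0)
    if noisy_stft.shape != mask.shape:
        raise ValueError("mask and STFT must have the same shape")
    return noisy_stft * mask


# Toy example: 3 frequency bins x 2 frames.
noisy_stft = np.array([[1 + 1j, 2 + 0j],
                       [0 + 2j, 1 - 1j],
                       [3 + 0j, 0 + 1j]])
mask = np.array([[1.0, 0.5],
                 [0.0, 1.0],
                 [0.5, 0.0]])
enhanced_stft = apply_tf_mask(noisy_stft, mask)
print(np.abs(enhanced_stft))
```

The enhanced waveform would then be obtained with an inverse STFT that uses the same window and hop settings as the forward transform.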

regards,
