
Hyper Parameter Exploration #14

Open
grezesf opened this issue Feb 22, 2017 · 4 comments

grezesf (Contributor) commented Feb 22, 2017

Run experiments constantly, exploring the hyper-parameter space:
parameters and scope to be determined shortly.
(possibly use hyperas: https://github.com/maxpumperla/hyperas)

grezesf self-assigned this Feb 22, 2017
grezesf (Contributor, Author) commented Mar 7, 2017

I've (finally) gotten hyper-parameter search to work. Here is the possible search space. Before I launch it on the server, which should I remove?

hyper-parameter space

    # size of LSTM output: [256, 512, 1024, 2048]
    # make LSTM bidirectional or not
    # number of LSTM layers: [1, 2, 3, more?]
    # merge_mode for bidirectional LSTM: ['sum', 'mul', 'concat', 'ave', None]
    # activation function of output Dense layer: [softmax, softplus, softsign, relu, tanh, sigmoid, hard_sigmoid, linear]
    # loss for whole model: [mean_squared_error (mse), mean_absolute_error (mae),
    #   mean_absolute_percentage_error (mape), mean_squared_logarithmic_error (msle),
    #   squared_hinge, hinge, binary_crossentropy, kullback_leibler_divergence,
    #   poisson, cosine_proximity]
    # optimizer: [SGD, RMSprop, Adagrad, Adadelta, Adam, Adamax, Nadam]
    # batch_size: [32, 64, 128, 256, 512]
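
For reference, here is a minimal sketch of how part of this search space could be written with hyperas. The synthetic data loader, input shape (50 timesteps, 129 frequency bins), and epoch count are placeholder assumptions rather than the project's actual values, and the layer-count search is omitted for brevity:

    from hyperopt import Trials, STATUS_OK, tpe
    from hyperas import optim
    from hyperas.distributions import choice


    def data():
        # synthetic placeholder data; the real loader would return
        # (mixture features, target masks) for train and validation
        import numpy as np
        X = np.random.rand(100, 50, 129).astype('float32')
        Y = np.random.rand(100, 50, 129).astype('float32')
        return X[:80], Y[:80], X[80:], Y[80:]


    def model(X_train, Y_train, X_val, Y_val):
        from keras.models import Sequential
        from keras.layers import LSTM, Dense, Bidirectional, TimeDistributed

        net = Sequential()
        # hyperas fills each {{choice(...)}} with one sampled value per trial
        net.add(Bidirectional(
            LSTM({{choice([256, 512, 1024, 2048])}}, return_sequences=True),
            merge_mode={{choice(['sum', 'mul', 'concat', 'ave'])}},
            input_shape=(50, 129)))
        net.add(TimeDistributed(
            Dense(129, activation={{choice(['sigmoid', 'hard_sigmoid'])}})))
        net.compile(
            loss='mse',
            optimizer={{choice(['sgd', 'rmsprop', 'adagrad', 'adadelta',
                                'adam', 'adamax', 'nadam'])}})
        # 'epochs' is 'nb_epoch' in Keras 1
        net.fit(X_train, Y_train,
                batch_size={{choice([32, 64, 128, 256, 512])}},
                epochs=2, validation_data=(X_val, Y_val), verbose=0)
        val_loss = net.evaluate(X_val, Y_val, verbose=0)
        return {'loss': val_loss, 'status': STATUS_OK, 'model': net}


    if __name__ == '__main__':
        best_run, best_model = optim.minimize(model=model, data=data,
                                              algo=tpe.suggest, max_evals=10,
                                              trials=Trials())
        print(best_run)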

mim (Contributor) commented Mar 8, 2017

What does no merge (merge_mode=None) for the bidirectional LSTM mean?

For the loss for the whole model, I thought we were either using the mask-aware loss or the phase-aware loss, right?

And for the activation function of the output, if it is predicting a mask, it should be sigmoid.

The other parameters look good for searching.

grezesf (Contributor, Author) commented Mar 8, 2017

  1. "If None, the outputs will not be combined, they will be returned as a list." (I have to admit I'm not 100% clear on how bidirectional networks function)
  2. The model right now is mask-aware, but I guess there is more than 1 way to compute a loss between a predicted and target mask. MSE corresponds to the Erdogan paper.
  3. I'll restrict to sigmoid and hard sigmoid
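
To make the merge_mode options concrete, here is a small sketch of how Keras combines the forward and backward LSTM outputs; the shapes (50 timesteps, 129 bins, 256 units) are placeholder assumptions:

    from keras.models import Sequential
    from keras.layers import LSTM, Bidirectional

    # 'concat' stacks the two directions, so the feature dim doubles
    m = Sequential()
    m.add(Bidirectional(LSTM(256, return_sequences=True),
                        merge_mode='concat', input_shape=(50, 129)))
    print(m.output_shape)  # (None, 50, 512)

    # 'sum', 'mul', and 'ave' combine element-wise, keeping the dim
    m = Sequential()
    m.add(Bidirectional(LSTM(256, return_sequences=True),
                        merge_mode='sum', input_shape=(50, 129)))
    print(m.output_shape)  # (None, 50, 256)

    # merge_mode=None returns the two directions as a list of tensors,
    # which a Sequential stack can't feed into a single Dense layer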

mim (Contributor) commented Mar 8, 2017

  1. Try the other combinations, but not None.
  2. I think the loss between the predicted and target mask should be cross entropy, and the loss between the masked noisy speech and the clean speech should be MSE (see the sketch after this list).
  3. Sounds good.
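
A minimal sketch of those two losses, assuming a Keras backend, a model that outputs a time-frequency mask in [0, 1], and a hypothetical mix_mag tensor holding the noisy-mixture magnitudes (which would have to be routed into the loss, e.g. through the functional API):

    from keras import backend as K

    def mask_cross_entropy(mask_true, mask_pred):
        # cross entropy between the target and predicted masks
        eps = K.epsilon()
        mask_pred = K.clip(mask_pred, eps, 1.0 - eps)
        return -K.mean(mask_true * K.log(mask_pred)
                       + (1.0 - mask_true) * K.log(1.0 - mask_pred), axis=-1)

    def signal_approximation_mse(mix_mag):
        # MSE between the masked noisy speech and the clean speech:
        # mask_pred * mix_mag is the estimate, and mask_true * mix_mag
        # stands in for the clean-speech magnitudes
        def loss(mask_true, mask_pred):
            return K.mean(K.square((mask_pred - mask_true) * mix_mag), axis=-1)
        return loss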
