Experiments, Benchmark and Hyperoptimizer #54

Open
ASalvail opened this issue Sep 30, 2015 · 1 comment

Comments

@ASalvail
Member

I've come to a point where I'm doing something that I thought would be fairly simple, but which turns out to be a lot more complicated than expected with our current code. I need your thoughts on the matter.

I have several models I want to benchmark against one another, on several benchmarks they will all compete on. To do that, I need to (see the sketch after this list):

  • Iterate through the benchmarks;
  • Iterate through the models;
  • Find the best hyper-parameters;
  • Test from a few seeds and see the results;
  • Keep track of how each model performed;
  • Output appropriate plots.
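
A self-contained sketch of that loop in Python, with placeholder benchmark names, models and helpers; everything here (`hyperopt_search`, `train_and_evaluate`, the model/benchmark names) is hypothetical and only meant to show the shape of the workflow:

```python
import numpy as np

def hyperopt_search(model, benchmark):
    """Stand-in for a hyper-parameter search; returns a fixed setting."""
    return {"learning_rate": 0.01, "hidden_size": 100}

def train_and_evaluate(model, hparams, benchmark, seed):
    """Stand-in for training `model` on `benchmark` with the given seed."""
    rng = np.random.RandomState(seed)
    return rng.uniform(0, 1)  # pretend this is a validation error

benchmarks = ["mnist", "binarized_mnist"]
models = ["mlp", "convnet"]
seeds = [1234, 5678, 9012]

results = {}
for benchmark in benchmarks:                      # iterate through the benchmarks
    for model in models:                          # iterate through the models
        best_hparams = hyperopt_search(model, benchmark)  # find the best hyper-parameters
        scores = [train_and_evaluate(model, best_hparams, benchmark, seed)
                  for seed in seeds]              # test from a few seeds
        results[(benchmark, model)] = scores      # keep track of how each model performed

for (benchmark, model), scores in sorted(results.items()):  # output/plot the results
    print("{:>16} | {:<8} | mean error = {:.3f}".format(benchmark, model, np.mean(scores)))
```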

As it is, the library is set up to facilitate the training of a single model on a single dataset with a single batch of hyperparameters. I want something higher-level. We could even think about integrating the long-awaited Spearmint.

Some ideas, all mixed together:

  • Create three more levels of training: Experiment, Benchmark and Hyperoptimizer (I'm certainly open to other names); a rough skeleton follows this list.
  • The Experiment is a collection of Benchmarks. It runs them, then aggregates data about them.
  • The Benchmark is used to pit models against each other and find the best-performing one. To do so, it trains and then collects appropriate data about each of them.
  • The Hyperoptimizer is used to find the best hyperparameters for a model to perform a specific task.
  • Some way to add test hyperparameters. You know, just to make sure the code can run through the whole script without crashing.
  • Deal with the issue of the RNG. As of now, nothing is really reproducible. I suggest a hard reset of the numpy/theano/blocks RNGs at some level (and it's not clear which one... all of them?).
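
Here is a rough skeleton of what the three levels could look like, with a hard RNG reset thrown in. All class and method names are placeholders, not a proposal for the actual API, and only numpy seeding is shown; the Theano/Blocks RNGs would need the same treatment:

```python
import numpy as np

def reset_rngs(seed):
    """Hard reset of the RNGs so a run is reproducible (numpy only here)."""
    np.random.seed(seed)

class Hyperoptimizer(object):
    """Finds the best hyper-parameters for one model on one task."""
    def __init__(self, model_factory, task, search_space):
        self.model_factory = model_factory
        self.task = task
        self.search_space = search_space  # iterable of hyper-parameter dicts

    def optimize(self):
        best_hparams, best_score = None, np.inf
        for hparams in self.search_space:       # grid or random search
            reset_rngs(seed=1234)               # same seed for every candidate
            score = self.task.evaluate(self.model_factory(**hparams))
            if score < best_score:
                best_hparams, best_score = hparams, score
        return best_hparams, best_score

class Benchmark(object):
    """Pits several models against each other on one task."""
    def __init__(self, task, hyperoptimizers):
        self.task = task
        self.hyperoptimizers = hyperoptimizers

    def run(self):
        return {h.model_factory.__name__: h.optimize()
                for h in self.hyperoptimizers}

class Experiment(object):
    """Runs a collection of Benchmarks and aggregates their results."""
    def __init__(self, benchmarks):
        self.benchmarks = benchmarks  # dict of name -> Benchmark

    def run(self):
        return {name: benchmark.run()
                for name, benchmark in self.benchmarks.items()}
```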

For now, a grid/random search for the Hyperoptimizer would be great. The Benchmark and Experiment would need more definite roles and types of data to collect. This brings me to the how part.
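
A minimal sketch of what grid and random search could look like for the Hyperoptimizer, assuming the search space is given as a dict of lists; the objective at the bottom is a dummy, just to show the plumbing:

```python
import itertools
import random

def grid_search(space):
    """Yield every combination of the values in `space` (dict of lists)."""
    keys = sorted(space)
    for values in itertools.product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))

def random_search(space, n_trials, seed=1234):
    """Yield `n_trials` random combinations drawn from `space`."""
    rng = random.Random(seed)
    keys = sorted(space)
    for _ in range(n_trials):
        yield {k: rng.choice(space[k]) for k in keys}

if __name__ == "__main__":
    space = {"learning_rate": [0.1, 0.01, 0.001], "hidden_size": [50, 100, 200]}
    # Dummy objective: pick the combination minimizing this made-up quantity.
    best = min(grid_search(space),
               key=lambda hparams: hparams["learning_rate"] * hparams["hidden_size"])
    print(best)
```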

We will need a way to configure an experiment. The more I work on problems, the more I'm tempted to write a more general class to set and access those parameters. However, it would be a real pain to make it general enough.
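
For the sake of discussion, one possible shape for such a class: a thin wrapper around a dict with defaults and attribute access. Nothing like this exists in the library yet, and the parameter names are made up:

```python
class ExperimentConfig(object):
    """Holds experiment-level parameters, with defaults and attribute access."""
    _defaults = {"nb_seeds": 3, "max_epochs": 100, "hyperopt": "grid"}

    def __init__(self, **kwargs):
        self._params = dict(self._defaults)
        self._params.update(kwargs)

    def __getattr__(self, name):
        params = self.__dict__.get("_params", {})
        if name in params:
            return params[name]
        raise AttributeError(name)

    def to_dict(self):
        """Useful for logging the exact configuration next to the results."""
        return dict(self._params)

config = ExperimentConfig(max_epochs=10)
print(config.max_epochs, config.nb_seeds)  # 10 3
```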

We also need a standard way to collect data. I can easily see an Experiment telling a Benchmark what data to collect, which would then cascade all the way to the Trainer and its Tasks. I don't want to have to hardcode everything every time, even though for now it's the easiest solution.
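
One way that cascade could work: the upper level hands down a list of metric names, the lower levels report everything they know, and a collector keeps only what was asked for. The `MetricCollector` name and wiring are hypothetical:

```python
class MetricCollector(object):
    """Accumulates only the metrics that the upper level asked for."""
    def __init__(self, wanted_metrics):
        self.wanted_metrics = set(wanted_metrics)
        self.records = []

    def record(self, **metrics):
        kept = {k: v for k, v in metrics.items() if k in self.wanted_metrics}
        if kept:
            self.records.append(kept)

# The Experiment decides what to collect...
collector = MetricCollector(["valid_error", "training_time"])

# ...and the Trainer/Tasks report everything they know; the collector
# filters out what was not requested (here, `gradient_norm` is dropped).
collector.record(valid_error=0.12, gradient_norm=3.4, training_time=42.0)
print(collector.records)  # [{'valid_error': 0.12, 'training_time': 42.0}]
```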

I think those classes, if done right, have the potential to make SL a much greater help for scientific reproducibility, and they would considerably speed up the process of getting experiments done.

@ASalvail
Member Author

This replaces issue #44.
