Define the Optimizer.py #7
@ASalvail I moved some of your comments into the original post (because it seems I can do that!). I'm not familiar with SAG. Knowing the example is not enough, you need the id because you keep a history of the past gradients for each example. Is that it? I think the term … For instance, examples of reusable and combinable …
So, what I have in mind for the core of a first-order optimizer (e.g. SGD) is something that looks like this:
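(A minimal sketch only; the `Loss` and `UpdateRule` interfaces used here are hypothetical, not final.)

```python
# Sketch of a first-order optimizer core (hypothetical interfaces):
# the Loss produces gradients, each UpdateRule modifies the direction,
# and the parameters are moved along the final direction.
class SGD(object):
    def __init__(self, loss, update_rules, learning_rate=0.01):
        self.loss = loss                  # builds the objective and its gradients
        self.update_rules = update_rules  # e.g. momentum, decreasing learning rate
        self.learning_rate = learning_rate

    def update(self, params, batch):
        directions = self.loss.gradients(params, batch)  # dict: param -> gradient
        for rule in self.update_rules:
            directions = rule.apply(directions)          # modify the direction
        for name in params:
            params[name] -= self.learning_rate * directions[name]
```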
I can see ADAGRAD, Adam, Adadelta being called optimizers. They would inherit from SGD (or maybe a new class …). So users would only have to specify … What do you think?
@MarcCote That's exactly how SAG proceeds: it stores the gradient of all examples in order to get its gradient average computation right. Those modifiers could be useful as building blocks for the optimizer, but I don't think it'd be useful to use them outside of it. If you want a new fancy optimizer, subclass it.
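To make that bookkeeping concrete, a toy sketch (hypothetical names): SAG keeps the last gradient computed for every example, indexed by example id, and steps along the average of the stored gradients.

```python
# Toy sketch of SAG-style bookkeeping (hypothetical names, scalar parameter):
# each example id maps to the last gradient computed for that example;
# examples not seen yet implicitly contribute a zero gradient.
def sag_update(param, example_id, grad_fn, stored_grads, n_examples, lr=0.01):
    stored_grads[example_id] = grad_fn(param, example_id)      # refresh this example's entry
    avg_grad = sum(stored_grads.values()) / float(n_examples)  # average over the dataset
    return param - lr * avg_grad
```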
Definitely, the notion of optimizer is somewhat fuzzy and so is the class `Optimizer`. We should attempt to clarify the definitions we are going to use in the library.
Definitions (to be added in a wiki)

- `Trainer`: manages the optimization procedure of a model on a particular dataset.
- `Optimizer`: optimizes a certain objective function (e.g. a loss) by updating some parameters (the ones used in the computation of the objective function, i.e. the parameters of the model).
- `UpdateRule`: something that modifies a direction (often the gradient) in order to update some parameters.
- `BatchScheduler`: manages the batches (nb. of examples, order of the examples) to give to the learn function.
- `Loss`: is responsible for outputting the Theano graph corresponding to the desired loss function to be optimized by the `Optimizer`. It takes as inputs a `Model` and a `Dataset`.
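To see how these pieces could fit together, here is a rough sketch (hypothetical interfaces, not a final API): the `Trainer` drives the epochs, the `BatchScheduler` yields the batches, and the `Optimizer` performs one parameter update per batch.

```python
# Rough sketch of the proposed separation of concerns (hypothetical interfaces).
class Trainer(object):
    def __init__(self, model, optimizer, batch_scheduler, nb_epochs=10):
        self.model = model                      # owns the parameters
        self.optimizer = optimizer              # e.g. the SGD sketched above
        self.batch_scheduler = batch_scheduler  # nb. of examples, their order
        self.nb_epochs = nb_epochs

    def train(self):
        for _ in range(self.nb_epochs):
            for batch in self.batch_scheduler:
                self.optimizer.update(self.model.params, batch)  # one update per batch
```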
Some questions
- … the `Loss` class.
- … a `BatchScheduler` for that.
- … an `UpdateRule`.
- … `UpdateRule`, or create a special `UpdateRule` that will combine them as the user wants? Right now, we blindly apply them one after the other. (See the sketch after this list.)
- … the `SMART-optim` module.
- Should the `Optimizer` be the one computing `nb_updates_per_epoch`? No, a `BatchScheduler` should do it.
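A sketch of the "special `UpdateRule` that combines them" option (hypothetical API):

```python
# Hypothetical sketch: a composite UpdateRule that lets the user choose
# explicitly how other rules are combined, rather than the optimizer applying
# whatever rules it holds in a fixed order.
class ChainedUpdateRule(object):
    def __init__(self, *rules):
        self.rules = rules

    def apply(self, directions):
        for rule in self.rules:               # each rule sees the previous rule's output
            directions = rule.apply(directions)
        return directions

# Usage (Momentum and GradientClipping are hypothetical rules):
# sgd.update_rules = [ChainedUpdateRule(Momentum(0.9), GradientClipping(1.0))]
```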
Suggestions

- A `Loss` class that will be provided to the optimizer. This class could know about the model and the dataset, and provide the necessary symbolic variables (maybe it should build the `givens` for the Theano function). See the sketch below.
- `update_rules.apply` in `SGD` should be moved inside `Optimizer`. The same goes for calls to `param_modifier.apply`.
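A rough sketch of what such a `Loss` class could look like (the `Model` and `Dataset` interfaces used here, `build_objective`, `inputs_shared` and `targets_shared`, are hypothetical placeholders):

```python
import theano.tensor as T

# Rough sketch (hypothetical Model/Dataset interfaces): the Loss owns the
# symbolic graph, exposes the gradients, and builds the givens mapping so the
# Optimizer can compile a Theano function over shared dataset variables.
class Loss(object):
    def __init__(self, model, dataset):
        self.inputs = T.matrix('inputs')
        self.targets = T.matrix('targets')
        # Hypothetical Model API: returns a scalar Theano expression.
        self.objective = model.build_objective(self.inputs, self.targets)
        self.dataset = dataset

    def gradients(self, params):
        # One symbolic gradient per parameter, keyed by the parameter itself.
        return dict(zip(params, T.grad(self.objective, wrt=params)))

    def givens(self, start, end):
        # Map the symbolic inputs to a slice of the dataset's shared variables
        # (hypothetical Dataset attributes: inputs_shared, targets_shared).
        return {self.inputs: self.dataset.inputs_shared[start:end],
                self.targets: self.dataset.targets_shared[start:end]}
```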