Let's say I have a whole wrapped network made with nn, called 'model', and I used the …
Now I want to train this model. Since my usual training procedure (with the mini-batch closure) is already written and ready, I would like to keep most of it and only exploit autograd's very easy way of getting the gradients.
So let me compare the two methods, so that you can tell me whether this is actually possible.
The usual:
local feval = function(x)
   if x ~= parameters then parameters:copy(x) end
   -- reset gradients
   gradParameters:zero()
   -- f is the average of all criterions
   local f = 0
   -- evaluate function for complete mini batch
   for i = 1, #inputs do
      -- estimate f
      local output = model:forward(inputs[i])
      local err = criterion:forward(output, targets[i])
      f = f + err
      -- estimate df/dW
      local df_do = criterion:backward(output, targets[i])
      model:backward(inputs[i], df_do)
   end
   -- normalize gradients and f(X)
   gradParameters:div(#inputs)
   f = f / #inputs
   -- return f and df/dX
   return f, gradParameters
end
So, using autograd while making the smallest possible changes, it would become:
-- create closure to evaluate f(X) and df/dX
local feval = function(x)
   -- get new parameters
   if x ~= parameters then parameters:copy(x) end
   -- reset gradients
   gradParameters:zero()
   -- f is the average of all criterions
   local f = 0
   -- evaluate function for complete mini batch
   for i = 1, #inputs do
      -- estimate f
      local df_do, err, output = df(params, inputs[i], targets[i])
      f = f + err
      model:backward(inputs[i], df_do)
   end
   -- normalize gradients and f(X)
   gradParameters:div(#inputs)
   f = f / #inputs
   -- return f and df/dX
   return f, gradParameters
end
And then I would go on using the optim module in the classical way.
Is this not possible, or not recommended?
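For reference, here is a minimal sketch (not from this thread, and untested against it) of one way autograd's gradients can be fed to an optim-style closure. It assumes only the basic torch-autograd API, where dLossFn = autograd(lossFn) returns the gradients with respect to lossFn's first argument followed by lossFn's own return values, and it swaps in a toy linear model written directly as an autograd function instead of a functionalized nn model. All names here (flatParams, flatGrads, paramViews, lossFn, dLossFn) are invented for the example.

local torch    = require 'torch'
local optim    = require 'optim'
local autograd = require 'autograd'

local inputSize, outputSize = 10, 2

-- one flat tensor for optim, plus structured views of the same storage for autograd
local nW         = inputSize * outputSize
local flatParams = torch.randn(nW + outputSize):mul(0.1)
local flatGrads  = torch.zeros(nW + outputSize)
local paramViews = {
   W = flatParams:narrow(1, 1, nW):view(outputSize, inputSize),
   b = flatParams:narrow(1, nW + 1, outputSize),
}

-- squared-error loss for a single sample, written with operations autograd can trace
local lossFn = function(params, x, y)
   local pred = params.W * x + params.b
   local err  = pred - y
   return torch.sum(torch.cmul(err, err))
end
local dLossFn = autograd(lossFn)

-- toy mini batch stored like in the closures above: Lua tables of per-sample tensors
local inputs, targets = {}, {}
for i = 1, 16 do
   inputs[i]  = torch.randn(inputSize)
   targets[i] = torch.randn(outputSize)
end

-- optim-style closure: autograd replaces the criterion/model backward calls
local feval = function(x)
   if x ~= flatParams then flatParams:copy(x) end
   flatGrads:zero()
   local f = 0
   for i = 1, #inputs do
      local grads, err = dLossFn(paramViews, inputs[i], targets[i])
      f = f + err
      -- accumulate the structured gradients into the flat tensor optim expects
      flatGrads:narrow(1, 1, nW):view(outputSize, inputSize):add(grads.W)
      flatGrads:narrow(1, nW + 1, outputSize):add(grads.b)
   end
   -- normalize gradients and f(X), then hand both back to optim
   flatGrads:div(#inputs)
   return f / #inputs, flatGrads
end

local optimState = { learningRate = 1e-2 }
for iter = 1, 100 do
   optim.sgd(feval, flatParams, optimState)
end

The point of this layout is that optim keeps updating a single flat tensor while autograd differentiates with respect to a table of views into that same storage, so the parameters never need to be copied back and forth; only the gradients are flattened once per call. If autograd.functionalize is used on an existing nn model, its function and parameter table would presumably take the place of lossFn and paramViews, but I have not verified that part.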
@synchro-- were you successful in doing this? I am mixing optim with wrapped nn modules and getting the following error:
/Graph.lua:40: bad argument #2 to 'fn' (expecting number or torch.DoubleTensor or torch.DoubleStorage at /tmp/luarocks_torch-scm-1-9261/torch7/generic/Tensor.c:1125)
I have actually never tried it in the end. I dropped the project because I was working on something else, but it's something I could try in the next few weeks, let's say.
Keep me up to date if you manage to use optim that way.
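For completeness: the torch-autograd examples I have seen sidestep optim altogether and apply the update by hand over the parameter table. A minimal sketch of that, reusing the hypothetical paramViews, dLossFn, inputs and targets names from the sketch further up:

local learningRate = 1e-2
-- one sample, one plain SGD step: walk the gradient table and update each parameter tensor in place
local grads, loss = dLossFn(paramViews, inputs[1], targets[1])
for name, p in pairs(paramViews) do
   p:add(-learningRate, grads[name])
end

As far as I remember, that manual loop is how the torch-autograd README trains its models; mixing in optim mainly buys the fancier update rules (adagrad, adam, ...) at the cost of the flat-tensor bookkeeping shown above.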