This repository has been archived by the owner on Nov 1, 2021. It is now read-only.

autograd.optim wrapper does not update state value #121

Open
hbqjzj opened this issue May 10, 2016 · 5 comments

Comments

@hbqjzj

hbqjzj commented May 10, 2016

It seems that the autograd.optim wrapper doesn't update the "state" value. It returns "states", but it is impossible to use it iteratively.

local function wrap(optimfn)
   return function(fn, state, params)
      local states = { }
      local flatParams = util.sortedFlatten(params)
      for i = 1, #flatParams do
         states[i] = util.deepCopy(state)   --this is a deep copy of state, so it is not updated
      end
      return function(...)
         local out = {fn(params, ...)}
         local grads, loss = out[1], out[2]
         local flatGrads = util.sortedFlatten(grads)
         for i = 1, #flatGrads do
            local grad = flatGrads[i]
            optimfn(function()
               return loss, grad
            end, flatParams[i], states[i])
         end
         return table.unpack(out)
      end, states   --now, states is a table of states, which is impossible to pass back to the wrapper
   end
end
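For context, the deep copy means that after wrapping, the caller's `state` table and the per-tensor `states[i]` tables are independent. A minimal plain-Lua sketch of that behavior (with a toy `deepCopy` standing in for `util.deepCopy`):

```lua
-- Toy stand-in for util.deepCopy: recursively copies a table.
local function deepCopy(t)
   if type(t) ~= 'table' then return t end
   local copy = {}
   for k, v in pairs(t) do copy[k] = deepCopy(v) end
   return copy
end

local state = {learningRate = 1e-2}
local states = {}
for i = 1, 3 do
   states[i] = deepCopy(state)   -- one independent copy per weight tensor
end

-- Mutating the original state table does not reach the copies:
state.learningRate = 1e-3
print(states[1].learningRate)  -- still 0.01
```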
@ghostcow
Contributor

ghostcow commented Jul 7, 2016

That's fine: the wrapper returns the states table it uses so you can manually change things per weight tensor (if you wish), but you don't actually pass it back to any function.
It's already saved inside the wrapper closure.

Here's how to use the optim wrapper:

  • optim/init.lua calls wrap() exactly once on initialization (per optimization method):
for k, v in pairs(require 'optim') do
   opt[k] = wrap(v)
end

return opt
  • you call autograd.optim.sgd(df,state,params) ONCE to get the optimizing function:
local df = autograd(f, {optimize = true})
local state = {learningRate=1e-2}
local optimizer, states = autograd.optim.sgd(df, state, params)
  • you then use optimizer to update your weights at each iteration:
local grads, loss = optimizer(data, target)

*Example adapted from optim tests here and here
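Concretely, the returned closure keeps a reference to `states`, so whatever the inner optim function writes into `states[i]` (e.g. an `evalCounter` used for learning-rate decay) persists across calls, and you never need to pass the states back in. A stripped-down plain-Lua sketch of this pattern (toy names, no torch):

```lua
-- Toy stand-in for an optim function: bumps a counter in its state table,
-- the way optim.sgd bumps state.evalCounter.
local function toyOptim(state)
   state.evalCounter = (state.evalCounter or 0) + 1
end

-- Stripped-down version of what the wrapper returns.
local function makeOptimizer(state, nParams)
   local states = {}
   for i = 1, nParams do
      states[i] = {learningRate = state.learningRate}
   end
   return function()
      for i = 1, nParams do
         toyOptim(states[i])   -- the closure sees states by reference
      end
   end, states
end

local optimizer, states = makeOptimizer({learningRate = 1e-2}, 2)
optimizer()
optimizer()
print(states[1].evalCounter)  -- 2: the state persisted across calls
```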

@hbqjzj
Author

hbqjzj commented Jul 7, 2016

The first test case works because the learning rate does not decay. However, in most cases we want the learning rate to decay, and that requires the iteration count, which is stored in the states variable. The moment matrices are also stored in states. The call

local grads, loss = optimizer(data, target)

will always assume the iteration count is zero (or one) and that the moment matrices are empty on every iteration.

@szagoruyko
Contributor

@eugenium there doesn't seem to be an issue. state is created once and then kept as a local variable in optimizer.
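In other words, Lua closures capture upvalues by reference, so a table created once at wrap time keeps accumulating updates on every call. A tiny illustration:

```lua
local function makeCounter()
   local state = {n = 0}        -- created once, captured by the closure
   return function()
      state.n = state.n + 1     -- mutated in place on every call
      return state.n
   end, state
end

local step, state = makeCounter()
step(); step(); step()
print(state.n)  -- 3
```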

@eugenium

Ah yeah, I see now. Thanks.

@synchro--

@ghostcow So, is there a complete working example (like the mnist one) on how to use Optim?
