This repository has been archived by the owner on Nov 1, 2021. It is now read-only.
I have encountered weird behavior when turning on optimization.
The following code computes a simple MLP with cross-entropy loss. The loss can be computed in two equivalent ways (`CrossEntropyCriterion`, or `LogSoftMax` followed by `ClassNLLCriterion`). With optimization turned on, the two unexpectedly produce different results. With optimization turned off, the printouts are the same. They are also the same when I move the definition of `df` into the loop with optimization on (i.e. the gradient function is recomputed each iteration).
PS: A piggy-backed issue is that `loss.crossEntropy` doesn't support batch mode, because `util.logSumExp` doesn't support it.
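For reference, here is a rough sketch of what a batch-aware `logSumExp` could look like. This is only an illustration of the missing behavior, not the library's actual implementation; it assumes plain Torch tensor ops (reductions over the last dimension keep that dimension with size 1):

```lua
local torch = require 'torch'

-- Sketch: logSumExp that handles both a 1D vector and a batch
-- (reducing over the last dimension), using the standard max-shift
-- trick for numerical stability.
local function logSumExp(x)
   if x:dim() == 1 then
      local m = torch.max(x)                     -- scalar max
      return m + math.log(torch.sum(torch.exp(x - m)))
   end
   local d = x:dim()
   local m = torch.max(x, d)                     -- per-row max, size (..., 1)
   local shifted = torch.exp(x - m:expandAs(x))  -- subtract row max before exp
   return m + torch.log(torch.sum(shifted, d))   -- size (..., 1) result
end
```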
Thank you for your help.
```lua
local t = require 'torch'
local grad = require 'autograd'

t.manualSeed(11)
grad.optimize(true) -- COMMENT ME OUT

local params = {
   W = {
      t.randn(50, 50),
      t.randn(50, 10),
   }
}

local ces = grad.nn.CrossEntropyCriterion()
local cnl = grad.nn.ClassNLLCriterion()
local lsm = grad.nn.LogSoftMax()

-- loss via CrossEntropyCriterion
local f = function(params, x, y)
   local h1 = t.tanh(x * params.W[1])
   local h2 = t.tanh(h1 * params.W[2])
   return ces(h2, y)
end

-- same loss via LogSoftMax + ClassNLLCriterion
local g = function(params, x, y)
   local h1 = t.tanh(x * params.W[1])
   local h2 = t.tanh(h1 * params.W[2])
   return cnl(lsm(h2), y)
end

local df = grad(f) -- OR MOVE ME INTO THE LOOP
local dg = grad(g)

local inputs = t.Tensor(100, 50):normal(0, 1)
local targets = t.Tensor(100):fill(1)

for i = 1, 10 do
   local graddF, lossdF = df(params, inputs, targets)
   local graddG, lossdG = dg(params, inputs, targets)
   print(lossdF, graddF.W[1]:norm(), graddF.W[2]:norm())
   print(lossdG, graddG.W[1]:norm(), graddG.W[2]:norm())
   params.W[1]:add(-1e-3, graddF.W[1])
   params.W[2]:add(-1e-3, graddF.W[2])
end
```
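The workaround mentioned above (results match even with optimization on) just moves the `grad(f)` call inside the loop, so the gradient function is recompiled each iteration. A minimal sketch of the changed loop, assuming the same `f`, `params`, `inputs`, and `targets` as above:

```lua
for i = 1, 10 do
   -- Workaround sketch: recompiling df each iteration avoids the
   -- stale optimized code path, at the cost of recompilation overhead.
   local df = grad(f)
   local graddF, lossdF = df(params, inputs, targets)
   print(lossdF, graddF.W[1]:norm(), graddF.W[2]:norm())
   params.W[1]:add(-1e-3, graddF.W[1])
   params.W[2]:add(-1e-3, graddF.W[2])
end
```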
This bugged me again. `CrossEntropyCriterion` cannot be used with `optimize = true`; I remember trying to debug it a few months ago without any luck.