
Selective Weight Vector for Loss Func #39

Open
ncalik opened this issue May 10, 2018 · 2 comments

Comments

@ncalik

ncalik commented May 10, 2018

Hi @jotaf98,
How can we get the max-score indices from the logits? I want to calculate a verification loss for the classified person.
For example, let scr be the network's logits; for a given person, fea_vect is multiplied by the corresponding row W(maxInd, :):

[~, maxInd] = max(scr, [], 3);
loss_2 = (tanh(W(maxInd, :) * fea_vect) - lbl).^2

where W = Param('value', randn(numPers, feaVectLength)).

But I get an error when using max:
Error using Layer/max
Too many output arguments.

Also, will W be trainable through the selected rows?

@jotaf98
Collaborator

jotaf98 commented May 10, 2018

Hi, you're right that the 2nd output is missing from the overloaded @max (this output is not differentiable, so I didn't think of it). Anyway, you can work around it by creating a non-differentiable layer based on Matlab's @max with 2 outputs:

[~, maxInd] = Layer.create(@max, {scr, [], 3}, 'numInputDer', 0)

(numInputDer is the number of input derivatives, which is 0 for non-differentiable functions)
And yes, gradients will be back-propagated to the selected elements of W :)
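Putting it together with the loss from your first post, an untested sketch would be something like:

W = Param('value', randn(numPers, feaVectLength));

% non-differentiable layer wrapping Matlab's two-output max
[~, maxInd] = Layer.create(@max, {scr, [], 3}, 'numInputDer', 0);

% only the selected row of W receives gradients through this expression
loss_2 = (tanh(W(maxInd, :) * fea_vect) - lbl).^2;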

@ncalik
Author

ncalik commented May 12, 2018

That's a very cool solution!
First, I wrote a function in the style of mcnExtraLayers:

function [y, dzdw] = z_maxx(fea, w, scr, lbl, varargin)
  sz_fea = size(fea);
  sz_wei = size(w);
  fea_mat = squeeze(fea);
  scr_mat = squeeze(scr);

  [~, dzdy] = vl_argparsepos(struct(), varargin);
  if isempty(dzdy)
    % forward pass
    [~, maxInd] = max(scr(:));
    val = 1 ./ (1 + exp(w(maxInd, :) * fea(:)));
    y = (val - lbl).^2;
  else
    % backward pass (placeholder derivatives, just to test the wiring)
    [~, maxInd] = max(scr(:));
    dzdw = w;
    y = fea;
    % size(dzdw)
    % y = {dzdf, dzdw};
  end
end

Then I called it with: customLoss = Layer.fromFunction(@z_maxx, 'numInputDer', 2);
I didn't write the true derivatives; instead, I returned the same parameters just to test the function, and it worked.
So, what is the difference between Layer.fromFunction and Layer.create? Can I use one instead of the other?

Also, I want to let you know that defining the derivatives as y = {dzdf, dzdw} doesn't work in eval mode. vl_nnaxpy uses the same format, so I couldn't compile it. When I define the derivatives as separate output arguments, like [y, dzdw], it works. I also mentioned this problem in #31.

I have one more question (I suppose I'm the autonn user who bothers you the most :)).
In a ResNet we sum y = F(x) + x. Now I want to try a maxout operation instead: if F(x)'s channels are C1, C2, C3, ..., CN and x's channels are G1, G2, G3, ..., GN, the new tensor should be y = C1, G1, C2, G2, C3, G3, ..., CN, GN, and then I will apply maxout with groups [2 N]. How can I concatenate these two tensors in this interleaved format? (A rough sketch of the interleaving I mean is below.)
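In plain Matlab (outside autonn), one way to get that interleaving, assuming a and b are H x W x C x N arrays holding F(x) and x, would be:

[h, w, c, n] = size(a);   % assumes a and b have the same size
y = cat(3, reshape(a, h, w, 1, c, n), reshape(b, h, w, 1, c, n));  % H x W x 2 x C x N
y = reshape(y, h, w, 2*c, n);   % H x W x 2C x N, channels interleaved a1, b1, a2, b2, ...

I guess this could then be wrapped as a custom layer (e.g. with Layer.fromFunction) so autonn can use it, but I'm not sure that's the right way.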

Many thanks !!
