
Loss Function in Twenty_Newsgroups #94

Open
ghost opened this issue Feb 21, 2019 · 1 comment

Comments


ghost commented Feb 21, 2019

I’m trying to adapt this example to my own tweet data, but I got really confused by the loss function calculation.
The computation in the code is not identical to the formula in the paper.
[two images: the loss function formula from the paper]
The two images above show the loss function formula from the paper, but the code only computes the first part of the sum, L^d. There is a dubious calculation of l (underlined in the image below) which I think might be the second part, but it is never added to the loss before loss.backward().
[image: the training code, with the calculation of l underlined]
I also had to multiply the original prior term by -clambda (right under the red line in the image above), because the value returned by model.prior() is actually the negative of formula (5) without the lambda factor.
I don’t know Chainer, so it is hard to tell whether l correctly computes the second part of the sum and could simply be added to the loss before backward(). It is also unclear what the fraction variable is for, and whether it should multiply the l part of the sum as well. I did try adding l to the loss without multiplying it by fraction, but did not get the expected result.
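For what it’s worth, here is how I picture the combination the paper describes, as a minimal Python sketch. All names (l_doc, l_word_terms, prior, clambda, fraction) are assumptions mirroring the discussion above, not the repository’s actual API; the minus sign on the prior follows my observation that model.prior() returns the negative of formula (5).

```python
# Hedged sketch of how the total objective might be assembled before
# backward(). Every name here is an assumption taken from this issue,
# not the repository's actual API.

def total_loss(l_doc, l_word_terms, prior, clambda=1.0, fraction=1.0):
    """Combine L^d, the summed L^w terms, and the Dirichlet prior.

    model.prior() appears to return the *negative* of formula (5)
    without the lambda factor, hence the minus sign on the prior term.
    Whether `fraction` should also scale the word part is exactly the
    open question in this issue.
    """
    l_word = sum(l_word_terms)  # second part of the sum in the paper
    return fraction * (l_doc + l_word) - clambda * prior
```

This is only a sketch of the arithmetic I expect, not a claim about what the Chainer code actually does.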


ghost commented Feb 21, 2019

In the calculation of l:
[image: code setting self.sampler]

Here above, it seems self.sampler is set to an instance of NegativeSampling.

[image: the for loop calling self.sampler]

Here above, inside the for loop, loss is assigned the return value of calling NegativeSampling on each iteration. Does that discard the previous value each time, even though the formula in the paper says the terms should be added together? Or does some property of self.sampler make them accumulate?
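To illustrate what I mean about accumulating versus overwriting, here is a toy sketch. The sampler here is just a stand-in callable, not Chainer’s actual NegativeSampling link:

```python
# Toy illustration of the question: assigning inside the loop keeps only
# the last pair's loss, while summing matches the sum over context pairs
# in the paper. `sampler` is a stand-in callable, not Chainer's object.

def summed_loss(sampler, pairs):
    loss = 0.0
    for pivot, target in pairs:
        loss = loss + sampler(pivot, target)  # accumulates every term
    return loss

def last_only_loss(sampler, pairs):
    loss = None
    for pivot, target in pairs:
        loss = sampler(pivot, target)  # discards the previous value
    return loss
```

If NegativeSampling does not accumulate anything internally, the second pattern would keep only the final term, which is what worries me about the loop in the screenshot.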
