
loss calculation of multi-label classification. #33

Closed · aiwithshekhar opened this issue May 5, 2020 · 8 comments
@aiwithshekhar commented May 5, 2020

  1. For multi-label classification, while calculating the loss for the COCO dataset, why is it multiplied by the number of classes (80.0)? Is it a weighting parameter for class imbalance?
    loss = criteria(output, target.float()) * 80.0

  2. For calculating precision and recall, should we use cumulative TP & FP as described here:
    https://github.com/rafaelpadilla/Object-Detection-Metrics

@sacmehta (Owner) commented May 5, 2020

  1. 80 is used to scale the loss; otherwise the loss value is too small (see the sketch below).
  2. We are using cumulative values.
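
For context, a minimal sketch of the scaling, assuming a standard BCEWithLogitsLoss criterion as implied by the snippet in the question (the shapes and batch size here are illustrative):

    import torch
    import torch.nn as nn

    NUM_CLASSES = 80  # COCO has 80 object categories

    # Multi-label setup: an image may contain several classes, so each class
    # gets an independent sigmoid + binary cross-entropy (not a softmax).
    criteria = nn.BCEWithLogitsLoss()  # default reduction='mean'

    output = torch.randn(32, NUM_CLASSES)            # raw logits for a batch
    target = torch.randint(0, 2, (32, NUM_CLASSES))  # multi-hot labels

    # reduction='mean' averages over the batch AND all 80 classes, so the raw
    # value is tiny; multiplying by the class count turns it into a per-image
    # sum over classes (averaged over the batch), giving a usable magnitude.
    loss = criteria(output, target.float()) * 80.0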

sacmehta closed this as completed May 5, 2020
@aiwithshekhar (Author) commented May 9, 2020

Thanks for clarifying, Sachin. One more thing: the loss we get is the 'mean' over a batch, and we multiply it (losses.update) by the batch size to recover the batch loss, which is fine. But precision and recall are already computed over a batch, so why are they also multiplied (prec.update, rec.update) by the batch size, as shown in the code below?

losses.update(float(loss), input.size(0))
prec.update(float(this_prec), input.size(0))
rec.update(float(this_rec), input.size(0))

@sacmehta (Owner) commented May 9, 2020

There is nothing fancy here. If you don't do this, then the statistics computed for the entire dataset from batch-wise statistics may differ from the sample-wise statistics (though not by a huge margin).

The mean of sample-wise statistics is not necessarily equal to the mean of batch-wise statistics.

For example, say you have the values [1, 2, 3, 4, 5, 6] and a batch size of 4, so your two batches are [1, 2, 3, 4] and [5, 6] (the last batch is truncated). The sample-wise mean is 3.5, while the unweighted mean of the batch means is (mean([1, 2, 3, 4]) + mean([5, 6])) / 2 = (2.5 + 5.5) / 2 = 4.0.
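
To make this concrete, a minimal sketch with an AverageMeter-style helper; the class below is illustrative, but its update(value, n) signature is assumed to match the losses.update / prec.update / rec.update calls quoted above:

    class AverageMeter:
        """Running average weighted by the number of samples in each update."""
        def __init__(self):
            self.sum = 0.0
            self.count = 0

        def update(self, value, n=1):
            self.sum += value * n  # undo the per-batch mean: weight by batch size
            self.count += n

        @property
        def avg(self):
            return self.sum / self.count

    values = [1, 2, 3, 4, 5, 6]          # per-sample statistics
    batches = [values[:4], values[4:]]   # batch size 4 -> last batch truncated

    # Unweighted mean of batch means: (2.5 + 5.5) / 2 = 4.0 (biased)
    naive = sum(sum(b) / len(b) for b in batches) / len(batches)

    # Weighted by batch size: (2.5*4 + 5.5*2) / 6 = 3.5 (matches sample-wise mean)
    meter = AverageMeter()
    for b in batches:
        meter.update(sum(b) / len(b), len(b))

    print(naive, meter.avg)  # 4.0 3.5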

Hope this helps

@aiwithshekhar (Author)

Thanks for such a detailed reply. I didn't want to ask about object detection in this thread, but I also didn't want to open a new issue.

  1. The priors per feature-map location for SSD300 should be [4, 6, 6, 6, 4, 4], which yields 8732 default boxes, but in your implementation it's [6, 6, 6, 6, 6, 6], which yields 11640 (the sketch below shows where both totals come from).

Is this done intentionally? If yes, what are the benefits?
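
For reference, a minimal sketch of where the two totals come from, using the standard SSD300 feature-map grid sizes (38, 19, 10, 5, 3, 1); the totals depend on these grid sizes, not on the per-location counts alone:

    # Standard SSD300 feature-map grid sizes at the six detection scales.
    feature_maps = [38, 19, 10, 5, 3, 1]

    def total_priors(priors_per_location):
        # Each cell of an f x f grid gets k default boxes at that scale.
        return sum(f * f * k for f, k in zip(feature_maps, priors_per_location))

    print(total_priors([4, 6, 6, 6, 4, 4]))  # 8732  (original SSD paper)
    print(total_priors([6, 6, 6, 6, 6, 6]))  # 11640 (6 priors at every scale)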

@sacmehta (Owner)

I was too lazy to tune these priors, so I used the same number across all scales.

If you tune these, your object detection will be much faster.

@sacmehta (Owner)

Also, feel free to create a pull request to merge your changes.

@aiwithshekhar (Author)

Thanks for replying, sure thing!

@aiwithshekhar (Author)

I have raised a pull request; please check whether the code is correct.
#34 (comment)
