Cross entropy loss(es) should document hard assumption of labels summing up to 1.0 #19316

Closed
hmeine opened this issue Mar 15, 2024 · 3 comments

hmeine commented Mar 15, 2024

For legacy reasons, we have been using keras.losses.categorical_crossentropy() (with from_logits=False) in our own loss wrapper, which also supports sample weights (for 3D medical images). After a version upgrade (some time between TF 2.9 and 2.14), we found that this no longer worked, and it took us quite some time to discover that there is nowadays a _get_logits() function that switches to the preferred softmax_cross_entropy_with_logits() function, even though we did not ask for that at all! On the one hand, that's great; on the other hand, that loss has stricter requirements: in particular, it assumes the target vectors to "be probability distributions" (i.e., to sum up to 1.0).

I came here to suggest that the relatively weak statement "We expect labels to be provided in a one_hot representation." be reworded into something stronger, such as "This function assumes that the targets are provided in a one_hot representation, and they have to sum up to 1.0."

Maybe the sneaky "we're trying to reveal your logits and might strip your softmax" should also be explicitly mentioned.
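
For readers following the thread, here is a minimal NumPy sketch of why the sum-to-1.0 assumption can matter. It does not call Keras or TensorFlow. It assumes that a fused softmax cross-entropy op uses the shortcut gradient `softmax(logits) - labels`, which is only exact when the labels sum to 1.0. All function names below (`cross_entropy`, `exact_grad`, `shortcut_grad`, `numeric_grad`) are hypothetical helpers written for this illustration, and the weighted target is just one possible way a wrapper might fold sample weights into the labels.

```python
import numpy as np

def log_softmax(logits):
    # Numerically stable log-softmax for a 1-D vector.
    shifted = logits - logits.max()
    return shifted - np.log(np.exp(shifted).sum())

def cross_entropy(targets, logits):
    # Plain definition: -sum_i t_i * log softmax(z)_i
    return -(targets * log_softmax(logits)).sum()

def exact_grad(targets, logits):
    # True gradient w.r.t. logits: softmax(z) * sum(t) - t
    probs = np.exp(log_softmax(logits))
    return probs * targets.sum() - targets

def shortcut_grad(targets, logits):
    # Shortcut that is only exact when sum(t) == 1 (assumed fused-op behavior).
    probs = np.exp(log_softmax(logits))
    return probs - targets

def numeric_grad(targets, logits, eps=1e-6):
    # Central finite differences as an independent reference.
    grad = np.zeros_like(logits)
    for i in range(logits.size):
        step = np.zeros_like(logits)
        step[i] = eps
        grad[i] = (cross_entropy(targets, logits + step)
                   - cross_entropy(targets, logits - step)) / (2 * eps)
    return grad

logits = np.array([2.0, 0.5, -1.0])
one_hot = np.array([0.0, 1.0, 0.0])
weighted = 3.0 * one_hot  # hypothetical: a sample weight folded into the target

# With a proper one-hot target, both gradients agree with the reference.
print(np.allclose(numeric_grad(one_hot, logits), shortcut_grad(one_hot, logits), atol=1e-5))
# With a target that sums to 3.0, only the exact gradient matches.
print(np.allclose(numeric_grad(weighted, logits), exact_grad(weighted, logits), atol=1e-5))
print(np.allclose(numeric_grad(weighted, logits), shortcut_grad(weighted, logits), atol=1e-3))
```

In this sketch the forward loss value is the same formula either way; the discrepancy shows up only in the gradient, which may explain why such a problem is hard to spot.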

SuryanarayanaY added the type:docs (Need to modify the documentation) label Mar 18, 2024

SuryanarayanaY (Contributor) commented

Hi @hmeine ,

If we pass from_logits=False to the loss function, it assumes that the values are normalized so that they sum to a probability of 1, right? Using from_logits=True gives much better numerical stability. IMO, since you are using from_logits=False, this should not fail. Could you please provide a minimal code snippet so that we can verify it?
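
The numerical stability point can be illustrated with a small NumPy sketch. It is not the Keras implementation: `loss_from_probs` and `loss_from_logits` are hypothetical helpers, and the clipping constant `1e-7` is an assumption chosen to mimic a typical epsilon. The sketch compares computing the loss from already-softmaxed probabilities (with clipping) against computing it directly from logits via log-softmax.

```python
import numpy as np

def softmax(logits):
    # Stable softmax for a 1-D vector.
    e = np.exp(logits - logits.max())
    return e / e.sum()

def loss_from_probs(targets, probs, eps=1e-7):
    # Probability path: clip to avoid log(0), then -sum(t * log(p)).
    # eps is an assumed value for this illustration.
    probs = np.clip(probs, eps, 1.0 - eps)
    return -(targets * np.log(probs)).sum()

def loss_from_logits(targets, logits):
    # Logits path: log-softmax computed without ever forming tiny probabilities.
    shifted = logits - logits.max()
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return -(targets * log_probs).sum()

logits = np.array([40.0, 0.0, -40.0])
targets = np.array([0.0, 0.0, 1.0])  # the true class has a very low score

print(loss_from_probs(targets, softmax(logits)))  # saturates near -log(1e-7)
print(loss_from_logits(targets, logits))          # close to 80
```

Here the probability path saturates at the clipping threshold, while the logits path still reports the full loss of about 80.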

github-actions bot commented Apr 3, 2024

This issue is stale because it has been open for 14 days with no activity. It will be closed if no further activity occurs. Thank you.

github-actions bot added the stale label Apr 3, 2024

This issue was closed because it has been inactive for 28 days. Please reopen if you'd like to work on this further.
