MetricCollection.update gives identical results #2916

Open
Boltzmachine opened this issue Jan 24, 2025 · 4 comments · May be fixed by #2944
Labels
bug / fix Something isn't working help wanted Extra attention is needed v1.4.x

Comments

Boltzmachine commented Jan 24, 2025

🐛 Bug

MetricCollection.update gives identical results

To Reproduce

from torchmetrics import MetricCollection
from torchmetrics.text import BLEUScore

# Four BLEU metrics that differ only in their n-gram order (first argument, n_gram)
scores = MetricCollection({
    "bleu-1": BLEUScore(1),
    "bleu-2": BLEUScore(2),
    "bleu-3": BLEUScore(3),
    "bleu-4": BLEUScore(4),
})

preds = ['the cat is on the mat']
target = [['there is a cat on the mat', 'a cat is on the mat']]

scores.update(preds, target)

print(scores.compute())

This gives the following result:

{'bleu-1': tensor(0.8333), 'bleu-2': tensor(0.8333), 'bleu-3': tensor(0.8333), 'bleu-4': tensor(0.8333)}

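which is incorrect: all four metrics report the BLEU-1 value, even though each was configured with a different n-gram order.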

Expected behavior

{'bleu-1': tensor(0.8333), 'bleu-2': tensor(0.8165), 'bleu-3': tensor(0.7937), 'bleu-4': tensor(0.7598)}
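
The expected values can be checked independently of torchmetrics. The following is a minimal plain-Python sketch of unsmoothed BLEU (clipped n-gram precision, uniform weights, brevity penalty against the closest reference length). The helper names `ngram_counts` and `bleu` are made up for this illustration and are not part of any library; whitespace tokenization is an assumption.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(pred, refs, max_n):
    """Unsmoothed sentence-level BLEU with uniform weights (illustrative only)."""
    pred_tokens = pred.split()
    ref_tokens = [r.split() for r in refs]

    log_sum = 0.0
    for n in range(1, max_n + 1):
        pred_counts = ngram_counts(pred_tokens, n)
        # For each n-gram, the largest count seen in any single reference.
        max_ref = Counter()
        for tokens in ref_tokens:
            for gram, count in ngram_counts(tokens, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(count, max_ref[gram]) for gram, count in pred_counts.items())
        total = sum(pred_counts.values())
        if total == 0 or clipped == 0:
            return 0.0  # no smoothing: one zero precision zeroes the score
        log_sum += math.log(clipped / total) / max_n

    # Brevity penalty against the reference length closest to the prediction.
    pred_len = len(pred_tokens)
    ref_len = min((len(t) for t in ref_tokens), key=lambda l: (abs(l - pred_len), l))
    penalty = 1.0 if pred_len > ref_len else math.exp(1 - ref_len / pred_len)
    return penalty * math.exp(log_sum)


pred = "the cat is on the mat"
refs = ["there is a cat on the mat", "a cat is on the mat"]
expected = {f"bleu-{n}": round(bleu(pred, refs, n), 4) for n in range(1, 5)}
print(expected)
# {'bleu-1': 0.8333, 'bleu-2': 0.8165, 'bleu-3': 0.7937, 'bleu-4': 0.7598}
```

The per-order precisions here are 5/6, 4/5, 3/4 and 2/3, and the brevity penalty is 1 because one reference has the same length as the prediction.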

Environment

torchmetrics==1.4.1
torch==2.2.2

@Boltzmachine Boltzmachine added bug / fix Something isn't working help wanted Extra attention is needed labels Jan 24, 2025

Hi! Thanks for your contribution! Great first issue!

@Boltzmachine Boltzmachine changed the title MetricsCollection.update gives identical results MetricCollection.update gives identical results Jan 24, 2025
@Boltzmachine
Author

It looks like compute_groups=True is the root cause. I think this argument is really dangerous.

@Borda Borda added the v1.4.x label Jan 25, 2025
@Boltzmachine
Author

Could anyone take a look? This is a really serious bug and can lead to incorrect results in research.

rbedyakin commented Feb 4, 2025

I suspect the error is in this line:

return state1.shape == state2.shape and allclose(state1, state2)

If state1[key_0] is equal to state2[key_0], the function returns True without checking any of the other keys.
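
The effect described above can be reproduced with a simplified model in plain Python. This is not the torchmetrics source: the two functions and the state dictionaries are hypothetical, with state names and values chosen to resemble what BLEU-1 and BLEU-2 might hold after the update in the report.

```python
def buggy_states_equal(state1, state2):
    """Returns inside the loop, so only the first key is ever compared."""
    for key in state1:
        return state1[key] == state2[key]
    return True


def strict_states_equal(state1, state2):
    """Compares every key before declaring the two states equal."""
    if state1.keys() != state2.keys():
        return False
    return all(state1[key] == state2[key] for key in state1)


# Hypothetical states: the length counters agree, the n-gram statistics do not.
bleu1_state = {"preds_len": 6, "target_len": 6, "numerator": [5], "denominator": [6]}
bleu2_state = {"preds_len": 6, "target_len": 6, "numerator": [5, 4], "denominator": [6, 5]}

print(buggy_states_equal(bleu1_state, bleu2_state))   # True
print(strict_states_equal(bleu1_state, bleu2_state))  # False
```

Under this model, two metrics whose first state happens to match would be placed in the same compute group and share one state, which would explain why every entry reports the BLEU-1 value.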

@Borda Borda linked a pull request Feb 4, 2025 that will close this issue