Implement clustering accuracy #2767

moetayuko · 2024-10-03T12:34:44Z

🚀 Feature

Motivation

Clustering accuracy is a popular metric for evaluating clustering results. It extends classification accuracy by first using the Hungarian algorithm to align the predicted pseudo-labels with the ground-truth labels, and then computing accuracy under that alignment.
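For reference, the metric is commonly written as below. The notation is an assumption introduced here for illustration and is not taken from this issue: $y_i$ is the ground-truth label of sample $i$, $c_i$ is its predicted cluster id, $n$ is the number of samples, and $m$ ranges over one-to-one mappings from cluster ids to labels.

```latex
\mathrm{ACC} = \max_{m} \frac{1}{n} \sum_{i=1}^{n} \mathbb{1}\{\, y_i = m(c_i) \,\}
```

The maximization over $m$ is the alignment step that the Hungarian algorithm solves.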

Current implementations of clustering accuracy use either scipy.optimize.linear_sum_assignment or the munkres package for the Hungarian algorithm. I'm not sure whether such dependencies are allowed in torchmetrics; if not, a custom implementation would need to be added.
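A minimal sketch of the scipy-based approach, assuming both label arrays hold non-negative integers. The function name `clustering_accuracy` is hypothetical, and this is not the torchmetrics implementation.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def clustering_accuracy(target, preds):
    """Accuracy under the best one-to-one mapping of cluster ids to labels.

    Hypothetical helper for illustration; assumes non-negative integer labels.
    """
    target = np.asarray(target)
    preds = np.asarray(preds)
    # Size the square contingency table to cover every label and cluster id.
    n = int(max(target.max(), preds.max())) + 1
    # contingency[i, j] = number of samples with target i and predicted cluster j
    contingency = np.zeros((n, n), dtype=np.int64)
    np.add.at(contingency, (target, preds), 1)
    # Find the one-to-one matching that maximizes the total matched count.
    rows, cols = linear_sum_assignment(contingency, maximize=True)
    return contingency[rows, cols].sum() / target.size


print(clustering_accuracy([1, 1, 0, 0], [0, 0, 1, 1]))
```

In the example call the two clusters are simply swapped relative to the labels, so the aligned accuracy is perfect even though plain classification accuracy would be zero.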

Pitch

Implement clustering accuracy in torchmetrics.clustering


github-actions · 2024-10-03T12:35:07Z

Hi! Thanks for your contribution! Great first issue!

SkafteNicki · 2024-10-08T17:58:03Z

@moetayuko thanks for opening this issue. Do you have a reference to a source (possibly a research paper) that describes the metric in detail?

moetayuko · 2024-10-09T07:16:48Z

> @moetayuko thanks for opening this issue. Do you have a reference to a source (possibly a research paper) that describes the metric in detail?

Sec. 6.2.1 of https://arxiv.org/abs/2206.07579

FYI here are some random implementations I found:
https://github.com/bdy9527/SDCN/blob/da6bb007b7d07362ac04db3e146b4944e2acb883/evaluation.py#L9
https://github.com/google-research/google-research/blob/cc787a6a513cb6d2042cd6e286e2cbe42c41a863/stacked_capsule_autoencoders/capsules/eval.py#L29
https://github.com/aeon-toolkit/aeon/blob/01e8424df1bee9e614a25a25b5268383a8fd9893/aeon/performance_metrics/clustering.py#L12

SkafteNicki · 2024-10-11T12:09:57Z

@moetayuko thanks for the references, they really helped me understand how the metric is intended to work.
Hopefully I will have time to fully implement the metric in the next couple of days. I already have the logic figured out, using https://github.com/ivan-chai/torch-linear-assignment to solve the linear sum assignment problem:

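import torch
from torchmetrics.functional.classification import multiclass_confusion_matrix

# pip install git+https://github.com/ivan-chai/torch-linear-assignment.git@main
from torch_linear_assignment import batch_linear_assignment

preds = torch.tensor([0, 0, 1, 1])
target = torch.tensor([1, 1, 0, 0])

# Rows index the target labels, columns index the predicted cluster ids.
confmat = multiclass_confusion_matrix(preds, target, num_classes=5)
print(confmat)

# batch_linear_assignment expects a leading batch dimension.
confmat = confmat[None]

# Turn counts into costs so that the minimum-cost assignment maximizes the matched counts.
assignment = batch_linear_assignment(confmat.max() - confmat)
print(assignment)

confmat = confmat[0]

# Pick the matched entry in each row; their sum is the number of correctly aligned samples.
tps = confmat[torch.arange(confmat.size(0)), assignment.flatten()]

acc = tps.sum() / len(preds)
print(acc)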

moetayuko added the enhancement New feature or request label Oct 3, 2024

SkafteNicki added this to the future milestone Oct 8, 2024

SkafteNicki added the New metric label Oct 8, 2024

SkafteNicki linked a pull request Oct 12, 2024 that will close this issue

New metric: Cluster Accuracy #2777

Open

4 tasks

SkafteNicki self-assigned this Oct 22, 2024

SkafteNicki modified the milestones: future, v1.6.0 Oct 22, 2024
