
get_groups fails to distinguish a Random classifier from real classifiers #11

andreplima opened this issue Mar 5, 2025 · 0 comments
Ideally, any comparison framework should be able to distinguish a Random classifier from real ones on the basis of their accuracy results, even when the data does not quite satisfy the sample-size requirements pointed out in Demsar (2006), namely N > 10 datasets and k > 5 classifiers.

The data:

classifier_name   BRDT   BRRF     DT  Linear    MLP  Polygrid     RF  Random  Ridge
dataset_name
cancer (mc)      0.908  0.924  0.909   0.930  0.941     0.950  0.841   0.490  0.934
iris (mc)        0.936  0.899  0.943   0.799  0.710     0.981  0.767   0.367  0.846
penguins (mc)    0.932  0.825  0.922   0.988  0.822     0.991  0.778   0.323  0.983
wine (mc)        0.833  0.977  0.892   0.973  0.962     0.994  0.918   0.333  0.990

produces the correct ranking,

Average rank per model, according to accuracy
   Polygrid     1.00
   Ridge        3.25
   Linear       4.00
   BRRF         4.50
   DT           5.00
   BRDT         5.50
   MLP          5.50
   RF           7.25
   Random       9.00

but all classifiers are included in a single group:

Groups of models with statistically indistinguishible performance:
  Group 1: ['BRDT', 'BRRF', 'DT', 'Linear', 'MLP', 'Polygrid', 'RF', 'Random', 'Ridge']
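For reference, a quick check on the table above, assuming get_groups follows the Friedman/Nemenyi procedure from Demsar (2006). This is a sketch, not the library's own code: if get_groups uses a different post-hoc test, its internal numbers will differ. Even so, the Friedman omnibus test rejects at alpha = 0.05, and the 8.00 rank gap between Polygrid and Random exceeds the Nemenyi critical difference (~6.01 for k = 9, N = 4), so at least those two should land in different groups.

```python
# Reproduce the report: Friedman omnibus test plus Nemenyi critical
# difference on the accuracy table above. Assumes the Demsar (2006)
# procedure; the actual get_groups internals may differ.
import numpy as np
from scipy import stats

models = ["BRDT", "BRRF", "DT", "Linear", "MLP", "Polygrid", "RF", "Random", "Ridge"]
acc = np.array([
    [0.908, 0.924, 0.909, 0.930, 0.941, 0.950, 0.841, 0.490, 0.934],  # cancer (mc)
    [0.936, 0.899, 0.943, 0.799, 0.710, 0.981, 0.767, 0.367, 0.846],  # iris (mc)
    [0.932, 0.825, 0.922, 0.988, 0.822, 0.991, 0.778, 0.323, 0.983],  # penguins (mc)
    [0.833, 0.977, 0.892, 0.973, 0.962, 0.994, 0.918, 0.333, 0.990],  # wine (mc)
])
n_datasets, k = acc.shape

# Average rank per model (rank 1 = best accuracy on a dataset);
# reproduces the ranking printed above (Polygrid 1.00 ... Random 9.00).
avg_ranks = stats.rankdata(-acc, axis=1).mean(axis=0)

# Friedman omnibus test: one sample per model, one observation per dataset.
stat, p = stats.friedmanchisquare(*acc.T)
print(f"Friedman chi2 = {stat:.2f}, p = {p:.4f}")  # rejects at alpha = 0.05

# Nemenyi critical difference: CD = q_alpha * sqrt(k(k+1) / (6N)),
# with q_alpha = 3.102 for k = 9 at alpha = 0.05 (Table 5 in Demsar, 2006).
q_alpha = 3.102
cd = q_alpha * np.sqrt(k * (k + 1) / (6 * n_datasets))

# The Polygrid-vs-Random rank gap exceeds the CD, so even the
# conservative Nemenyi test should separate them into distinct groups.
gap = avg_ranks[models.index("Random")] - avg_ranks[models.index("Polygrid")]
print(f"CD = {cd:.2f}, rank gap Polygrid vs Random = {gap:.2f}")
```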
