
feat: Enhance skore.cross_validate #544

Closed · 5 tasks done
sylvaincom opened this issue Oct 21, 2024 · 5 comments
Labels: epic (This issue represents major product increments)

sylvaincom (Contributor) commented Oct 21, 2024

`skore.cross_validate`

First PR of this new feature: #443, cc @MarieS-WiMLDS and @augustebaum. Issue: #383.
Some issues arose in #443 (comment) and have been split out into this new GitHub issue. The issue is restated here:


[Screenshot of the original issue text, 2024-10-21]

EDIT: after a discussion with @augustebaum, maybe we can keep both; in any case, we should document somewhere what is actually behind `test_score`.
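For reference, a minimal sketch with plain scikit-learn (not the skore wrapper, whose internals may differ) of what is behind `test_score`: with no explicit `scoring`, it is simply the estimator's default `.score`, which is R² for a regressor, hence the redundancy with `test_r2`.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_validate

X, y = make_regression(n_samples=100, n_features=5, random_state=0)

# Explicit scoring: the result dict contains "test_r2".
print(cross_validate(LinearRegression(), X, y, cv=5, scoring=["r2"]).keys())
# dict_keys(['fit_time', 'score_time', 'test_r2'])

# Default scoring: the result dict contains "test_score", computed with the
# estimator's default .score(), i.e. R² for LinearRegression -- same values.
print(cross_validate(LinearRegression(), X, y, cv=5).keys())
# dict_keys(['fit_time', 'score_time', 'test_score'])
```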

For the fit time:
[Screenshot of the fit-time plot, 2024-10-17]

  • We must specify the units of the fit time (seconds, for example)
  • We must distinguish scores (the higher, the better) from errors (the higher, the worse). For example, the plot below shows an error, but the title is still "Cross-validation scores" and the y-label is "Score", which is misleading:
[Screenshot of the misleading error plot, 2024-10-17]

Note: `neg_mean_squared_error` is actually meant to be negative and to behave like a score (the higher, the better). Still, I believe naming these "results" is better.

Maybe we can use a title like "Cross-validation results" with the y-label "Results", so that it is appropriate for scores, errors, and times alike.
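A minimal illustration of that sign convention with plain scikit-learn (not the skore wrapper): the MSE is negated per fold so that "greater is better" still holds, which is exactly why a neutral "results" wording is safer than "scores".

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_validate

X, y = make_regression(n_samples=100, n_features=5, noise=10.0, random_state=0)
cv_results = cross_validate(
    LinearRegression(), X, y, cv=5, scoring="neg_mean_squared_error"
)

# All values are <= 0: the mean squared error is negated so that higher is
# still better, even though the underlying quantity is an error.
print(cv_results["test_score"])
```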

EDIT by @augustebaum: Done

  • We should switch to plotly before the first release, because data scientists are much more familiar with plotly than with altair. See dropdowns in plotly.

EDIT: This is the new display; it does not have a dropdown menu but does the same thing, and it lets us compare the training and testing times side by side, which is very useful. If we had the train score, we could also compare the train and test scores together. (See the recording and the sketch below.)

[Screen recording: Enregistrement.de.l.ecran.2024-10-17.a.17.46.57.mov]
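For illustration only, a rough plotly sketch of this kind of side-by-side timing display, built on plain scikit-learn output (the actual skore display may differ):

```python
import plotly.graph_objects as go
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_validate

X, y = make_regression(n_samples=1000, n_features=20, random_state=0)
cv_results = cross_validate(LinearRegression(), X, y, cv=5)

# One grouped bar per fold: fit time next to score time, both in seconds.
folds = [f"Fold {i}" for i in range(len(cv_results["fit_time"]))]
fig = go.Figure(
    data=[
        go.Bar(name="fit_time", x=folds, y=cv_results["fit_time"]),
        go.Bar(name="score_time", x=folds, y=cv_results["score_time"]),
    ]
)
fig.update_layout(
    barmode="group",
    title="Cross-validation results",
    xaxis_title="Fold",
    yaxis_title="Time (seconds)",
)
fig.show()
```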
MarieS-WiMLDS (Contributor) commented Oct 21, 2024

Another short enhancement:

augustebaum (Contributor) commented Oct 21, 2024

To be honest, this issue is getting a bit big. Can we create individual issues?

MarieS-WiMLDS (Contributor) commented:
@augustebaum I created 3 smaller issues.
@sylvaincom, I didn't create an issue about "For linear regression, it is redundant to say `test_r2` and `test_score`" because I don't know what kind of design solution you had in mind, nor one for the train scores, for the same reason. I'll let you do it :)!

sylvaincom (Contributor, Author) commented:

Created two issues out of this larger issue, which I would consider a small epic.

augustebaum (Contributor) commented:

Now that this has been divided up, I'll go ahead and close this.

augustebaum closed this as not planned on Oct 23, 2024.