Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revisit CrossValidateReporter.__init__ parameters #953

Open
glemaitre opened this issue Dec 13, 2024 · 7 comments
Open

Revisit CrossValidateReporter.__init__ parameters #953

glemaitre opened this issue Dec 13, 2024 · 7 comments

Comments

@glemaitre
Copy link
Member

glemaitre commented Dec 13, 2024

Right now, CrossValidateReporter.__init__ is taking *args and **kwargs as parameters.

It is pretty hard for user to infer what are the parameters in such case. We should avoid to have these generic signature.

Since we are wrapping the sklearn.model_selection.cross_validate, we can make an editorial choice of the parameters that we want to expose.

Currently, the signature is

sklearn.model_selection.cross_validate(
    estimator, X, y=None, *, groups=None, scoring=None,
    cv=None, n_jobs=None, verbose=0, fit_params=None,
    params=None, pre_dispatch='2*n_jobs', return_train_score=False,
    return_estimator=False, return_indices=False, error_score=nan
)

For instance, there is no need to request the user any of the return_* values since we are going to persist them to make any computation.

We would almost request to not accept groups or fit_params since it would lead to bugs and we should encourage the usage of the metadata routing in this case.

I would also not expose the verbose option because we should come with a nicer way to observe the progress of the computation. This is related with #949.

We should probably also simplify and not expose pre-dispatch or error_score, the first one being a low level component while the second one should be highlighted from the report that we provide.

Therefore, I would think that estimator, X, y, scoring, cv, params, and n_jobs are the sufficient parameters for this class. We should therefore explicitly document these parameters without redirecting to the scikit-learn documentation (when developing in and editor/notebook, I don't really want to be redirected on an external webpage).

@thomass-dev
Copy link
Collaborator

In line with a comment on the original PR: #871 (comment)

@augustebaum
Copy link
Contributor

I'm happy to copy-paste cross_validate's arguments to avoid the *args, **kwargs, but changing the argument list further than that might go against the "drop-in replacement" requirement of the cross-validation feature. @MarieS-WiMLDS What are your thoughts on this?

@tuscland tuscland added needs-triage This has been recently submitted and needs attention user-reported labels Dec 17, 2024
@MarieS-WiMLDS
Copy link
Contributor

It does go against the drop-in replacement indeed. However, it would be more user-friendly to go by this suggestion. I've been convinced that everything that we would remove in this proposed solution is edge-case, or shouldn't be used. Therefore, let's go :)!

@MarieS-WiMLDS MarieS-WiMLDS removed their assignment Dec 17, 2024
@MarieS-WiMLDS MarieS-WiMLDS removed the needs-triage This has been recently submitted and needs attention label Dec 17, 2024
@thomass-dev
Copy link
Collaborator

thomass-dev commented Dec 17, 2024

@augustebaum i don't understand why you said that changing the argument list is against the "drop-in replacement".
As long as we keep the same positional arguments, we can remove as many kwargs as we want using **kwargs.

For instance:

def skore.cross_validate(estimator, X, y=None, *, error_score=nan, **kwargs)

@MarieS-WiMLDS
Copy link
Contributor

other similar request: #1000. Therefore increasing prio to P1.

@tuscland
Copy link
Member

tuscland commented Jan 7, 2025

@glemaitre can we take the opportunity of including this issue in the upcoming CrossValidationReport?

@glemaitre
Copy link
Member Author

Yes. I think that it will be part of the revamp.

@tuscland tuscland added this to the Skore 0.7 milestone Jan 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants