
API: naming and functions #32

Open
mathurinm opened this issue Nov 9, 2020 · 4 comments

@mathurinm
Collaborator

As we saw in #30 and with @ksehic, a newcomer's life could be made easier.

I would like to discuss a few API changes:

  • Naming: from my (personal) user perspective it's surprising that sparse_ho.models.Lasso has a celer.Lasso as an attribute.
    Since models.Lasso sets the alpha, I would name this class LassoGradSearch, or LassoGS, or LassoAuto, or something else (similarly to sklearn.linear_model.LassoCV)
  • exposure of the estimator: in the newly named LassoGradSearch, estimator should be made private. It's only called for its solver, and I don't think the user should be aware of it.
  • the user should not have to pass estimator when creating LassoGradSearch. It should be created in the init (if possible with sklearn conventions, otherwise on the first call to fit). Conceptually, this makes the instantiation of the class easier.
  • LassoGradSearch should expose a fit and a predict (so the user does not need to access its estimator), and pass sklearn.utils.estimator_checks.check_estimator
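A minimal sketch of what the proposed wrapper could look like, assuming the names `LassoGradSearch` and `_InnerLasso` (both hypothetical; `_InnerLasso` is a trivial stand-in for celer.Lasso so the snippet is self-contained):

```python
class _InnerLasso:
    """Stand-in for celer.Lasso: 'fits' a trivial mean model."""
    def fit(self, X, y):
        self.coef_ = sum(y) / len(y)  # placeholder fit, not a real Lasso
        return self

    def predict(self, X):
        return [self.coef_ for _ in X]


class LassoGradSearch:
    """Hypothetical wrapper: the inner estimator stays private,
    and the wrapper itself exposes fit/predict."""
    def __init__(self, alpha=1.0):
        # only hyperparameters in __init__, per sklearn conventions
        self.alpha = alpha

    def fit(self, X, y):
        # inner estimator created lazily here, not passed by the user
        self._estimator = _InnerLasso().fit(X, y)
        return self

    def predict(self, X):
        # delegate, so the user never touches self._estimator
        return self._estimator.predict(X)


model = LassoGradSearch(alpha=0.5).fit([[0.0], [1.0]], [1.0, 3.0])
print(model.predict([[2.0]]))  # -> [2.0]
```

Keeping `__init__` to hyperparameters only and creating fitted attributes inside `fit` is also what `check_estimator` expects.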

@agramfort @josephsalmon what do you think ?

@agramfort
Collaborator

agramfort commented Nov 9, 2020 via email

@mathurinm
Collaborator Author

mathurinm commented Jan 26, 2021

Given the past work, I think it should be doable to have something along the lines of:

criterion = HeldOutMSE(idx_train, idx_val)  # better yet, train_frac=0.75 if possible with CV
algo = ImplicitForward()
monitor_grad = Monitor()  # this could go inside the model too, I think?
optimizer = LineSearch(n_outer=10)
model = LassoAuto(estimator=estimator, criterion=criterion, alpha=alpha0)
# in my ideal world we would not pass estimator (sklearn LassoCV does not need sklearn Lasso), but with sklearn
# conventions I don't know what's possible. Instantiate it at the first fit call, maybe, like sklearn does for coef_?
model.fit(X, y, algo=algo, optimizer=optimizer, search='grad', callback=None)  # grad_search happens here; do we
# even need to specify it? i.e., is there an alternative? for grid search, use sklearn
model.predict(X_test)

My two issues with the current design are that we have to instantiate six objects before fitting (estimator, model, monitor, algo, criterion, optimizer), and that we must go through estimator to predict at the end, while it should remain hidden from the user (it's not obvious that it is modified by grad_search).
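One way to cut down the object count, sketched here with hypothetical defaults (the class name `LassoAuto` and the string placeholders are illustrative, not the real sparse_ho objects): make criterion, algo, and optimizer optional keyword arguments with sensible defaults, and build the monitor and estimator inside fit.

```python
class LassoAuto:
    """Sketch: defaults collapse six user-built objects into optional kwargs."""
    def __init__(self, criterion=None, algo=None, optimizer=None):
        # string placeholders stand in for HeldOutMSE(), ImplicitForward(),
        # LineSearch() default instances
        self.criterion = criterion if criterion is not None else "HeldOutMSE"
        self.algo = algo if algo is not None else "ImplicitForward"
        self.optimizer = optimizer if optimizer is not None else "LineSearch"

    def fit(self, X, y):
        # monitor and inner estimator are implementation details: built here,
        # never passed by the user
        self._monitor = {"objs": []}
        self._estimator = object()  # stand-in for the inner solver
        return self


model = LassoAuto().fit([[0.0]], [0.0])
print(model.criterion)  # -> HeldOutMSE
```

With this shape, the common case is `LassoAuto().fit(X, y)`, and power users can still inject custom criterion/algo/optimizer objects.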

Does this sound doable the way I propose it, @QB3 @Klopfe? It may help with refitting in #86

@QB3
Owner

QB3 commented Jan 26, 2021

While we are discussing the API, it seems that instantiating HeldOutMSE with idx_train and idx_val is counterintuitive.
What do you think HeldOutMSE should take as arguments? @agramfort @josephsalmon @Klopfe @mathurinm ?
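One option, sketched below under the assumption that a `train_frac` alternative (as floated above) is acceptable: accept either explicit index lists or a fraction that is resolved against the data size at fit time. The `resolve` method name and the seeding are hypothetical.

```python
import random


class HeldOutMSE:
    """Sketch: accept explicit indices OR a train_frac resolved later."""
    def __init__(self, idx_train=None, idx_val=None, train_frac=None):
        if (idx_train is None) != (idx_val is None):
            raise ValueError("pass both idx_train and idx_val, or neither")
        self.idx_train = idx_train
        self.idx_val = idx_val
        self.train_frac = train_frac

    def resolve(self, n_samples, seed=0):
        """Return (train, val) index lists for n_samples points."""
        if self.idx_train is not None:
            return self.idx_train, self.idx_val
        rng = random.Random(seed)  # deterministic split for reproducibility
        idx = list(range(n_samples))
        rng.shuffle(idx)
        cut = int(self.train_frac * n_samples)
        return idx[:cut], idx[cut:]


crit = HeldOutMSE(train_frac=0.75)
train, val = crit.resolve(8)
print(len(train), len(val))  # -> 6 2
```

This keeps backward compatibility with the current (idx_train, idx_val) signature while letting newcomers write `HeldOutMSE(train_frac=0.75)` without precomputing indices.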

@agramfort
Collaborator

agramfort commented Jan 26, 2021 via email
