surv.aorsf: Workaround for leaf_min_events = X should be <= Y (number of events divided by 2) in tuning setting? #383
Comments
I like the idea of tuning |
Looking over rfsrc and ranger, I don't think they have parameters equivalent to Searching for the |
Just to revive this a little: In #384 I noticed that the API is kind of.. odd? In addition to that, I realized that the motivation for this entire thing is our survival benchmark, where we use this additional transformation (see in context):
.extra_trafo = function(x, param_set) {
  x$surv.aorsf.split_min_obs = x$surv.aorsf.leaf_min_events + 5L
  x
}
Which introduces a dependency of We could, of course, a) scrap the @bcjaeger happy for your input here --- I also realize that I probably should have checked in with you anyway after migrating the benchmark from |
Hello! I really appreciate the work that has gone in to accommodating If it makes sense to pursue these features, one thing that could be done to make the API less awkward (maybe) is find a safe upper bound for |
Description
In `surv.aorsf`, when the `leaf_min_events` parameter is tuned, the allowed values depend on the number of events in the respective task (or the subset of the task used for training). This leads to errors in our benchmark, where we tune `leaf_min_events` in the range of 5 through 50: for tasks with few observations, resampling triggers the error message above, e.g. `leaf_min_events = 25 should be <= 20 (number of events divided by 2)`.
In practice we encapsulate the learner and use a fallback (KM) to impute results, but of course it would be better to not even attempt to evaluate "invalid" hyperparameter configurations.
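
To make the constraint concrete, here is a small base-R sketch that checks which candidate values would pass for a given number of events. The helper `valid_leaf_min_events()` is hypothetical: it only mirrors the rule quoted in the error message (`leaf_min_events` must be at most the number of events divided by 2) and is not a function from aorsf or mlr3. The event count of 40 is an assumed example chosen to match the quoted error.

```r
# Hypothetical helper (not part of aorsf or mlr3): keeps only the candidates
# that satisfy the rule from the error message,
# leaf_min_events <= (number of events) / 2.
valid_leaf_min_events <- function(candidates, n_events) {
  candidates[candidates <= n_events / 2]
}

# Assumed example: a training subset with 40 events, so the bound is 20,
# as in "leaf_min_events = 25 should be <= 20".
candidates <- seq(5L, 50L, by = 5L)
valid <- valid_leaf_min_events(candidates, n_events = 40L)
print(valid)
```

With a fixed tuning range of 5 through 50, most of the range becomes invalid on a subset this small, which is why the error shows up during resampling rather than on the full task.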
A common example for a data-dependent parameter is `mtry`, for which we have introduced the `mtry.ratio` proxy parameter to tune `mtry` on a scale from 0 to 1 (as a fraction of `nfeatures`) rather than from 1 to `nfeatures`.
I am now wondering if it makes sense to introduce a similar proxy parameter here, or if we can get away with an `.extra_trafo` of some sort (but I don't think `.extra_trafo` has access to the necessary information?).
@bcjaeger, if you have any insights here, let us know!
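
One way such a proxy could look, by analogy with `mtry.ratio`, is sketched below in base R. The function name `leaf_min_events_from_ratio()`, the lower bound of 1, and the rounding are all assumptions made for illustration; nothing here is an existing aorsf or mlr3extralearners API. The idea is to tune a ratio in [0, 1] and convert it to an integer only once the number of events in the training data is known.

```r
# Hypothetical proxy mapping (illustration only, not an existing API):
# converts a ratio in [0, 1] to a leaf_min_events value that respects the
# bound from the error message, leaf_min_events <= n_events / 2.
leaf_min_events_from_ratio <- function(ratio, n_events) {
  stopifnot(ratio >= 0, ratio <= 1, n_events >= 2)
  upper <- floor(n_events / 2)               # largest allowed value
  as.integer(max(1L, floor(ratio * upper)))  # assumed lower bound of 1
}

# Assumed example values:
half <- leaf_min_events_from_ratio(0.5, n_events = 40L)  # half of the maximum
full <- leaf_min_events_from_ratio(1.0, n_events = 41L)  # capped at floor(41 / 2)
print(c(half, full))
```

Because the conversion needs the event count of the actual training subset, it would have to happen inside the learner at training time rather than in the search space definition, which matches the doubt raised above about `.extra_trafo`.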
Below is a reprex with an `AutoTuner` setup reproducing the error message, with the tuning spaces copied verbatim from our benchmark.
Reproducible example
Created on 2024-09-09 with reprex v2.1.1
Session info