
[FEATURE] Parallelized Grouped Predictor #711

Open
AhmedThahir opened this issue Nov 4, 2024 · 7 comments
Labels
enhancement New feature or request

Comments

@AhmedThahir

Context
GroupedPredictor, GroupedRegressor, and GroupedClassifier are all single-threaded.

Problem
If there are many groups, each with many rows, training time becomes high.

Request
GroupedPredictor, GroupedRegressor, and GroupedClassifier could be parallelized with joblib by accepting n_jobs as a keyword argument.
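
For illustration, a minimal sketch of what parallel per-group fitting with joblib could look like; `fit_group`, the toy dataframe, and the dispatch logic are hypothetical, not part of the current scikit-lego API:

```python
# Hypothetical sketch of the requested behaviour, not the existing API:
# fit one clone of the base estimator per group in parallel via joblib.
import pandas as pd
from joblib import Parallel, delayed
from sklearn.base import clone
from sklearn.linear_model import LinearRegression


def fit_group(estimator, X_g, y_g):
    # clone so every group gets its own independently fitted estimator
    return clone(estimator).fit(X_g, y_g)


X = pd.DataFrame({"g": ["A", "A", "B", "B"], "x": [1.0, 2.0, 3.0, 4.0]})
y = pd.Series([1.0, 2.0, 3.0, 4.0])

group_indices = X.groupby("g").indices  # group label -> positional row indices
fitted_models = Parallel(n_jobs=-1)(
    delayed(fit_group)(LinearRegression(), X.iloc[idx][["x"]], y.iloc[idx])
    for idx in group_indices.values()
)
per_group_estimators = dict(zip(group_indices, fitted_models))  # {"A": ..., "B": ...}
```

At prediction time each row would then be dispatched to the estimator of its group.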

AhmedThahir added the enhancement label Nov 4, 2024
@FBruzzesi
Collaborator

Hey @AhmedThahir, that's definitely a reasonable request. We have such an option for OrdinalClassifier. My only concern is that the majority of estimators also allow fitting in parallel, yet I don't see anything wrong in letting the user decide at which level the parallelism should happen.

@AhmedThahir
Author

If the user specifies n_jobs=-1 for both the estimators and the meta-estimator, joblib will automatically handle the parallelism, i.e. it will use as many workers as are available.

If the user specifies anything other than -1, 1, or None, then it is implied that they are cognisant of the parallelism and any associated issues that may arise.

Hence, there will be no downsides to such a feature.
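
A toy illustration of that nesting with plain joblib (no scikit-lego involved), where the outer call stands in for the meta-estimator's n_jobs and the inner call for the base estimator's:

```python
from joblib import Parallel, delayed


def inner_work(g):
    # stands in for a base estimator that parallelizes its own fit
    return sum(Parallel(n_jobs=-1)(delayed(pow)(g, k) for k in range(3)))


# stands in for the meta-estimator looping over the groups
results = Parallel(n_jobs=-1)(delayed(inner_work)(g) for g in range(4))
print(results)  # [1, 3, 7, 13]
```

As noted above, joblib itself decides how the two levels share the available workers, so requesting -1 at both levels needs no extra handling from the user.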

@FBruzzesi
Collaborator

Hey @AhmedThahir, it turns out that I completely forgot we already support such a feature in HierarchicalClassifier and HierarchicalRegressor, which in general have a better design than the Grouped estimators. We could not change the grouped ones because we did not want to break the API too badly.

If you can, moving to one of those should address the problem for you. If you cannot, then I will consider adding parallelization to the grouped estimators.
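
For reference, a minimal sketch of how that might look; the sklego.meta import path and the exact parameter names are assumptions, not verified against the docs:

```python
from sklearn.linear_model import LinearRegression
from sklego.meta import HierarchicalRegressor  # import path assumed

model = HierarchicalRegressor(
    estimator=LinearRegression(),
    groups=["g_1", "g_2"],
    n_jobs=-1,  # the parallel option mentioned above; parameter name assumed
)
```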

@AhmedThahir
Author

AhmedThahir commented Nov 7, 2024

Hi @FBruzzesi

I just checked out HierarchicalClassifier and HierarchicalRegressor. From what I understand, they fit an estimator for every level of the group columns:

```python
{
    (1,): LogisticRegression(),           # global estimator
    (1, 'A'): LogisticRegression(),       # estimator for `g_1 = 'A'`
    (1, 'B'): LogisticRegression(),       # estimator for `g_1 = 'B'`
    (1, 'A', 'X'): LogisticRegression(),  # estimator for `(g_1, g_2) = ('A', 'X')`
    (1, 'A', 'Y'): LogisticRegression(),  # estimator for `(g_1, g_2) = ('A', 'Y')`
    (1, 'B', 'W'): LogisticRegression(),  # estimator for `(g_1, g_2) = ('B', 'W')`
    (1, 'B', 'Z'): LogisticRegression(),  # estimator for `(g_1, g_2) = ('B', 'Z')`
}
```

However, this does not seem to be what I require: I just want to fit estimators for the lowest-level groups, without fitting one for each level.

If you do not want to make any changes to the API at the moment, could you share the code so that I can use it in my projects as a custom class?

In the long run, I still believe that adding parallelization to the grouped estimators as part of the main API will be useful.

@FBruzzesi
Collaborator

You can fit only the lowest level by specifying fallback_method="raise", shrinkage=None. From the docs:

[...] This means that an estimator is fitted for each level of the group columns.

The only exception to that is when shrinkage=None and fallback_method="raise", in which case only one estimator per group value is fitted.

Please let me know if that's what you are looking for.

If you would still need a global fallback at prediction time, then it could be worth adding that to the Hierarchical API.

In the meantime, I will assess how feasible it is to parallelize the grouped estimators.
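
Putting this comment's options together, a minimal sketch of fitting only the lowest-level groups (import path assumed; the shrinkage and fallback_method values are taken from this comment and the quoted docs):

```python
from sklearn.linear_model import LogisticRegression
from sklego.meta import HierarchicalClassifier  # import path assumed

# Per the docs quoted above: with shrinkage=None and fallback_method="raise",
# only one estimator per group value is fitted; by the name, "raise"
# presumably errors on unseen groups instead of falling back.
model = HierarchicalClassifier(
    estimator=LogisticRegression(),
    groups=["g_1", "g_2"],
    shrinkage=None,
    fallback_method="raise",
)
```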

@AhmedThahir
Author

Thank you! Got it :)

  1. Is there a way to do the "global fallback at prediction time"?
  2. Doesn't this make GroupedPredictor redundant?

@FBruzzesi
Collaborator

  1. I don't think it is possible to do so in Hierarchical at the moment, but it could be in scope for the shrinkage=None case.

  2. Quoting a comment directly:

  • The GroupedPredictor refactoring was a first attempt to fix the issue reported here, as well as to expand its functionality. Since this was getting a bit messy, we decided to follow another approach, which led us to the next two points.
  • The GroupedPredictor patch follows your suggestion to raise an error if shrinkage is used in combination with a classification task. This is a somewhat breaking change which is already in the main branch.
  • In HierarchicalPredictor and HierarchicalTransformer we set up the discussion on how to deal with that, as well as on adding other features to meta-estimators that are somewhat similar to the Grouped* ones.
