
[FEATURE] Parallelized Grouped Predictor #711

Open
AhmedThahir opened this issue Nov 4, 2024 · 7 comments
Labels
enhancement New feature or request

Comments

@AhmedThahir

Context
GroupedPredictor, GroupedRegressor, and GroupedClassifier are all single-threaded.

Problem
If there are many groups, each with many rows, training time becomes high.

Request
GroupedPredictor, GroupedRegressor, and GroupedClassifier could be parallelized with joblib by accepting n_jobs as a keyword argument.
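
For illustration, a minimal sketch of what parallel per-group fitting with joblib could look like; `fit_group`, the toy dataframe, and the dispatch logic are hypothetical, not part of the current scikit-lego API:

```python
# Hypothetical sketch of the requested behaviour, not the existing API:
# fit one clone of the base estimator per group in parallel via joblib.
import pandas as pd
from joblib import Parallel, delayed
from sklearn.base import clone
from sklearn.linear_model import LinearRegression


def fit_group(estimator, X_g, y_g):
    # clone so every group gets its own independently fitted estimator
    return clone(estimator).fit(X_g, y_g)


X = pd.DataFrame({"g": ["A", "A", "B", "B"], "x": [1.0, 2.0, 3.0, 4.0]})
y = pd.Series([1.0, 2.0, 3.0, 4.0])

group_indices = X.groupby("g").indices  # group label -> positional row indices
fitted_models = Parallel(n_jobs=-1)(
    delayed(fit_group)(LinearRegression(), X.iloc[idx][["x"]], y.iloc[idx])
    for idx in group_indices.values()
)
per_group_estimators = dict(zip(group_indices, fitted_models))  # {"A": ..., "B": ...}
```

At prediction time each row would then be dispatched to the estimator of its group.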

AhmedThahir added the enhancement label Nov 4, 2024
@FBruzzesi
Collaborator

Hey @AhmedThahir, that's definitely a reasonable request. We have such an option for OrdinalClassifier. My only concern is that the majority of estimators also allow fitting in parallel, yet I don't see anything wrong in letting the user decide at which level the parallelism should happen.

@AhmedThahir
Author

If the user specifies n_jobs=-1 for both the estimators and the meta-estimator, joblib will automatically handle the parallelism, i.e. it will use as many workers as are available.

If the user specifies anything other than -1, 1, or None, then it is implied that they are cognisant of the parallelism and any associated issues that may arise.

Hence, there will be no downsides to such a feature.
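
A toy illustration of that nesting with plain joblib (no scikit-lego involved), where the outer call stands in for the meta-estimator's n_jobs and the inner call for the base estimator's:

```python
from joblib import Parallel, delayed


def inner_work(g):
    # stands in for a base estimator that parallelizes its own fit
    return sum(Parallel(n_jobs=-1)(delayed(pow)(g, k) for k in range(3)))


# stands in for the meta-estimator looping over the groups
results = Parallel(n_jobs=-1)(delayed(inner_work)(g) for g in range(4))
print(results)  # [1, 3, 7, 13]
```

As noted above, joblib itself decides how the two levels share the available workers, so requesting -1 at both levels needs no extra handling from the user.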

@FBruzzesi
Collaborator

Hey @AhmedThahir, it turns out that I completely forgot we already support such a feature in HierarchicalClassifier and HierarchicalRegressor, which in general have a better design than the Grouped estimators. We could not change the grouped ones because we did not want to break the API too badly.

If you can, moving to one of those should address the problem for you. If you cannot, then I will consider adding parallelization to the grouped estimators.
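
For reference, a minimal sketch of how that might look; the sklego.meta import path and the exact parameter names are assumptions, not verified against the docs:

```python
from sklearn.linear_model import LinearRegression
from sklego.meta import HierarchicalRegressor  # import path assumed

model = HierarchicalRegressor(
    estimator=LinearRegression(),
    groups=["g_1", "g_2"],
    n_jobs=-1,  # the parallel option mentioned above; parameter name assumed
)
```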

@AhmedThahir
Author

AhmedThahir commented Nov 7, 2024

Hi @FBruzzesi

I just checked out HierarchicalClassifier and HierarchicalRegressor. From what I understand, they fit an estimator for every level of the group columns:

```python
{
    (1,): LogisticRegression(),           # global estimator
    (1, 'A'): LogisticRegression(),       # estimator for `g_1 = 'A'`
    (1, 'B'): LogisticRegression(),       # estimator for `g_1 = 'B'`
    (1, 'A', 'X'): LogisticRegression(),  # estimator for `(g_1, g_2) = ('A', 'X')`
    (1, 'A', 'Y'): LogisticRegression(),  # estimator for `(g_1, g_2) = ('A', 'Y')`
    (1, 'B', 'W'): LogisticRegression(),  # estimator for `(g_1, g_2) = ('B', 'W')`
    (1, 'B', 'Z'): LogisticRegression(),  # estimator for `(g_1, g_2) = ('B', 'Z')`
}
```

However, this does not seem to be what I require: I just want to fit estimators for the lowest-level groups, without fitting one for each level.

If you do not want to make any changes to the API at the moment, could you share the code so that I can use it in my projects as a custom class?

In the long run, I still believe that adding parallelization to the grouped estimators as part of the main API will be useful.

@FBruzzesi
Collaborator

You can fit only the lowest level by specifying fallback_method="raise", shrinkage=None. From the docs:

[...] This means that an estimator is fitted for each level of the group columns.

The only exception to that is when shrinkage=None and fallback_method="raise", in which case only one estimator per group value is fitted.

Please let me know if that's what you are looking for.

If you would still need a global fallback at prediction time, then it could be worth adding that to the Hierarchical API.

In the meantime, I will assess how feasible it is to parallelize the grouped estimators.
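
Putting this comment's options together, a minimal sketch of fitting only the lowest-level groups (import path assumed; the shrinkage and fallback_method values are taken from this comment and the quoted docs):

```python
from sklearn.linear_model import LogisticRegression
from sklego.meta import HierarchicalClassifier  # import path assumed

# Per the docs quoted above: with shrinkage=None and fallback_method="raise",
# only one estimator per group value is fitted; by the name, "raise"
# presumably errors on unseen groups instead of falling back.
model = HierarchicalClassifier(
    estimator=LogisticRegression(),
    groups=["g_1", "g_2"],
    shrinkage=None,
    fallback_method="raise",
)
```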

@AhmedThahir
Author

Thank you! Got it :)

  1. Is there a way to do the "global fallback at prediction time"?
  2. Doesn't this make GroupedPredictor redundant?

@FBruzzesi
Collaborator

  1. I don't think it is possible to do so in Hierarchical at the moment, but it could be in scope for the shrinkage=None case.

  2. Quoting a comment directly:

  • The GroupedPredictor refactoring was a first attempt to fix the issue reported here, as well as to expand its functionality. Since this was getting a bit messy, we decided to follow another approach, which led us to the next two points.
  • The GroupedPredictor patch follows your suggestion to raise an error if shrinkage is used in combination with a classification task. This is a somewhat breaking change which is already in the main branch.
  • In HierarchicalPredictor and HierarchicalTransformer we set up the discussion on how to deal with that, as well as on adding other features to meta-estimators that are somewhat similar to the Grouped* ones.
