[FEATURE] Parallelized Grouped Predictor #711
Comments
Hey @AhmedThahir, that's definitely a reasonable request. We have such an option for …
If the user specifies n_jobs=-1 for the estimators and the meta estimator, joblib will automatically handle the parallelism: i.e., it will use as many threads as are available. If the user specifies anything other than -1, 1, or None, it is implied that they are cognisant of the parallelism and any issues that may arise from it. Hence, there should be no downsides to such a feature.
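For reference, a minimal sketch of the joblib convention described above; this is plain joblib, independent of scikit-lego:

```python
from math import sqrt
from joblib import Parallel, delayed

# n_jobs=-1 lets joblib use all available workers; n_jobs=1 (or None)
# runs sequentially. Any other positive integer caps the worker count.
results = Parallel(n_jobs=-1)(delayed(sqrt)(i) for i in range(10))
```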
Hey @AhmedThahir, it turns out that I completely forgot we already support such a feature in HierarchicalClassifier and HierarchicalRegressor, which in general have a better design than the Grouped estimators. We could not change the grouped ones because we did not want to break the API too badly. If you can, moving to one of those should address the problem for you. If you cannot, I will consider adding parallelization to the grouped estimators.
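For anyone landing here, a hedged usage sketch of the hierarchical route; the import path and the estimator/groups/n_jobs arguments shown are assumptions, so verify the exact signature against the scikit-lego documentation:

```python
# Hedged sketch: assumes HierarchicalRegressor lives in sklego.meta and
# accepts `estimator`, `groups`, and `n_jobs`; check the docs to confirm.
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklego.meta import HierarchicalRegressor

df = pd.DataFrame({
    "country": ["NL", "NL", "BE", "BE"],
    "city": ["Amsterdam", "Utrecht", "Antwerp", "Ghent"],
    "x": [1.0, 2.0, 3.0, 4.0],
    "y": [1.1, 2.1, 2.9, 4.2],
})

model = HierarchicalRegressor(
    estimator=LinearRegression(),
    groups=["country", "city"],  # one model per level of the hierarchy
    n_jobs=-1,                   # parallelize the per-group fits via joblib
)
model.fit(df[["country", "city", "x"]], df["y"])
```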
Hi @FBruzzesi, just checked out HierarchicalClassifier and HierarchicalRegressor.
However, this does not seem to be what I require: I just want to fit the lowest-level groups, without fitting a model for each level. If you do not want to make any changes to the API currently, could you share the code so that I may use it in my projects as a custom class? In the long run, I still believe that adding parallelization to the grouped estimators as part of the main API would be useful.
You can fit the lowest level only by specifying …
Please let me know if that's what you are looking for. If you would still require a global fallback at prediction time, it could be worth adding that to the … . In the meantime, I will assess how feasible it is to parallelize the grouped estimators.
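Since the exact snippet was lost above, here is a rough custom-class sketch of the idea being discussed: fit only the lowest-level groups and fall back to a single global model at prediction time. This is my own illustration, not scikit-lego code; the class and its parameters are hypothetical:

```python
# Hypothetical sketch (not sklego API): per-group models at the lowest
# level, with a global model as the fallback for unseen groups.
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, RegressorMixin, clone


class GroupedWithGlobalFallback(BaseEstimator, RegressorMixin):
    def __init__(self, estimator, group_col):
        self.estimator = estimator
        self.group_col = group_col

    def fit(self, X, y):
        X = X.copy()
        X["__y__"] = np.asarray(y)
        self.feature_cols_ = [
            c for c in X.columns if c not in (self.group_col, "__y__")
        ]
        # One clone of the estimator per group, lowest level only.
        self.models_ = {
            g: clone(self.estimator).fit(sub[self.feature_cols_], sub["__y__"])
            for g, sub in X.groupby(self.group_col)
        }
        # A single global model acts as the prediction-time fallback.
        self.global_model_ = clone(self.estimator).fit(
            X[self.feature_cols_], X["__y__"]
        )
        return self

    def predict(self, X):
        # Assumes X has a unique index so group-wise writes line up.
        preds = pd.Series(index=X.index, dtype=float)
        for g, sub in X.groupby(self.group_col):
            model = self.models_.get(g, self.global_model_)  # fallback here
            preds.loc[sub.index] = model.predict(sub[self.feature_cols_])
        return preds.to_numpy()
```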
Thank you! Got it :)
Context
GroupedPredictor, GroupedRegressor, and GroupedClassifier are all single-threaded.
Problem
If there are many groups, each with many rows, training time becomes high.
Request
GroupedPredictor, GroupedRegressor, and GroupedClassifier could be parallelized using joblib by accepting n_jobs as a kwarg.
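To make the request concrete, a hedged sketch of what the parallelized fit could look like internally, using only joblib's Parallel/delayed; `fit_groups` and its parameters are hypothetical, not part of the sklego API:

```python
# Illustrative only: parallelize a per-group fit by threading an
# n_jobs kwarg through to joblib, mirroring the scikit-learn convention.
import numpy as np
from joblib import Parallel, delayed
from sklearn.base import clone


def fit_groups(estimator, X, y, group_col, n_jobs=None):
    """Fit one clone of `estimator` per group, optionally in parallel.

    n_jobs follows joblib semantics: -1 uses all cores, None/1 runs
    sequentially, any other positive integer caps the worker count.
    """
    X = X.copy()
    X["__y__"] = np.asarray(y)
    feature_cols = [c for c in X.columns if c not in (group_col, "__y__")]

    def _fit_one(group_key, sub):
        return group_key, clone(estimator).fit(sub[feature_cols], sub["__y__"])

    fitted = Parallel(n_jobs=n_jobs)(
        delayed(_fit_one)(g, sub) for g, sub in X.groupby(group_col)
    )
    return dict(fitted)
```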