
Integration for FLAML #566

Open
alexk101 opened this issue May 12, 2023 · 5 comments

@alexk101

I would love to see integration of Microsoft's FLAML hyperparameter optimizer!

daavoo (Contributor) commented May 15, 2023

Hi @alexk101 , are you using FLAML alongside an existing ML framework?

@daavoo added the "A: frameworks" (Area: ML Framework integration) and "feature request" labels on May 15, 2023
alexk101 (Author) commented May 15, 2023

Yes. Currently we are using LightGBM for our models. My workaround has been to run the hyperparameter optimization outside of DVCLive and then retrain a LightGBM model, which is supported, with the best values, as shown below.

from flaml import AutoML
from dvclive import Live
from dvclive.lgbm import DVCLiveCallback
import lightgbm as lgb

# Hyperparameter search with FLAML (not captured by DVCLive)
automl = AutoML(**auto_ml_opts)
automl.fit(
    X_train=X_train.to_numpy(),
    y_train=y_train.to_numpy(),
    X_val=X_test.to_numpy(),
    y_val=y_test.to_numpy(),
    time_budget=TIME,
    estimator_list=['lgbm'],
    task=model_type
)

# Best hyperparameters found for the LightGBM estimator
starting_points = automl.best_config_per_estimator['lgbm']

# Retrain a LightGBM model with the best values, tracked by DVCLive
with Live(str(output / 'dvclive')) as live:
    gbm = lgb.LGBMRegressor(
        **starting_points,
        gpu_platform_id=1,
        gpu_device_id=0
    )
    gbm.fit(
        X_train.to_numpy(), y_train.to_numpy(),
        eval_set=[(X_test.to_numpy(), y_test.to_numpy())],
        callbacks=[
            lgb.early_stopping(20),
            DVCLiveCallback(live=live, save_dvc_exp=True)
        ],
        eval_metric="rmse"
    )

This is not ideal, since it doesn't capture the hyperparameter optimization, but it is minimally sufficient for what we are doing at the moment. I saw that Optuna was supported, so I thought it might be reasonable to request FLAML as another option.

daavoo (Contributor) commented May 15, 2023

> This is not ideal, since it doesn't capture the hyperparameter optimization, but it is minimally sufficient for what we are doing at the moment. I saw that Optuna was supported, so I thought it might be reasonable to request FLAML as another option.

That makes sense; I will see what kind of integration could be done.

I was asking because, even for Optuna, I personally find it more convenient to either use DVCLive manually or use the ML framework integration directly.
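To illustrate the manual option, here is a rough, untested sketch that logs the winning trial after the search in your snippet finishes. It assumes automl is the fitted AutoML object from above; best_config, best_loss, and best_estimator are documented flaml.AutoML attributes:

from dvclive import Live

# Rough sketch: manually log the FLAML search result with DVCLive.
# `automl` is the fitted AutoML object from the snippet above.
with Live(save_dvc_exp=True) as live:
    live.log_params(automl.best_config)              # winning hyperparameters
    live.log_metric("best_loss", automl.best_loss)   # best validation loss found
    live.log_param("estimator", automl.best_estimator)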

And for the framework integration, it looks like in FLAML you can pass a callback to fit, like:

from flaml import AutoML
from dvclive.lgbm import DVCLiveCallback

automl = AutoML(**auto_ml_opts)
automl.fit(
    X_train=X_train.to_numpy(),
    y_train=y_train.to_numpy(),
    X_val=X_test.to_numpy(),
    y_val=y_test.to_numpy(),
    time_budget=TIME,
    estimator_list=['lgbm'],
    task=model_type,
    callbacks=[DVCLiveCallback(save_dvc_exp=True)]
)

And every iteration of the hyperparameter optimization would create a DVC experiment.

It also looks like you could customize it further with a custom learner class, as in https://microsoft.github.io/FLAML/docs/Examples/AutoML-for-LightGBM/#create-a-customized-lightgbm-learner-with-a-custom-objective-function
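For instance, a rough, untested sketch along those lines could register a custom learner that injects the DVCLive callback into every trial. The LGBMEstimator import path and whether FLAML forwards extra fit kwargs such as callbacks to LightGBM are assumptions on my side:

from flaml import AutoML
from flaml.model import LGBMEstimator  # may live under flaml.automl.model in newer releases
from dvclive.lgbm import DVCLiveCallback


class DVCLiveLGBM(LGBMEstimator):
    """Hypothetical learner that adds the DVCLive callback to each trial's fit."""

    def fit(self, X_train, y_train, **kwargs):
        # Assumes FLAML forwards these kwargs to the underlying LightGBM fit.
        callbacks = list(kwargs.pop("callbacks", None) or [])
        callbacks.append(DVCLiveCallback(save_dvc_exp=True))
        return super().fit(X_train, y_train, callbacks=callbacks, **kwargs)


automl = AutoML(**auto_ml_opts)
automl.add_learner(learner_name="dvclive_lgbm", learner_class=DVCLiveLGBM)
automl.fit(
    X_train=X_train.to_numpy(),
    y_train=y_train.to_numpy(),
    time_budget=TIME,
    estimator_list=["dvclive_lgbm"],
    task=model_type,
)

Each trial would then save its own DVC experiment, similar to the callback approach above.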

@alexk101 (Author)

Awesome. I will look into those custom callbacks further. Thanks @daavoo!

@dberenbaum (Collaborator)

It would be great to document this in https://dvc.org/doc/dvclive/ml-frameworks.
