Ever wondered which Bayesian optimization framework to use for your project? We try to help you with that :)
This repository provides a general comparison of different Bayesian optimization frameworks.
We used Hugging Face's Spotify tracks dataset. The objective is to predict the popularity of a track, a 10-class classification problem. We performed a short EDA on the dataset: huggingface__spotify_tracks/eda.ipynb.
- Number of rows: 114,000
- Number of columns: 21
- Target feature: popularity
- Number of distinct target values: 10
We used different machine learning models to showcase the results.
Baseline | baselines.ipynb

Performance Results: RandomForestClassifier

| Metric | Train | Test | Delta(train, test) |
|---|---|---|---|
| Accuracy | 0.9867 | 0.5579 | -0.4289 |
| Precision | 0.9817 | 0.6632 | -0.3185 |
| Recall | 0.9590 | 0.5680 | -0.3910 |
| F1-Score | 0.9698 | 0.6044 | -0.3653 |

Tuning time: none (no tuning was performed)
Optuna | optuna.ipynb

50 trials

Optuna Results

| Metric | Train | Test | Delta(train, test) |
|---|---|---|---|
| Accuracy | 0.5235 | 0.5239 | 0.0004 |
| Precision | 0.2844 | 0.2842 | -0.0002 |
| Recall | 0.2990 | 0.2993 | 0.0003 |
| F1-Score | 0.2264 | 0.2273 | 0.0009 |

Tuning time: 6 min 24 sec
BayesSearchCV | bayessearchcv.ipynb

TODO
BayesianOptimization | bayesianoptimization.ipynb

5 trials

BayesianOptimization Results

| Metric | Train | Test | Delta(train, test) |
|---|---|---|---|
| Accuracy | 0.4859 | 0.4874 | 0.0015 |
| Precision | 0.2789 | 0.2779 | -0.0010 |
| Recall | 0.2744 | 0.2753 | 0.0010 |
| F1-Score | 0.2060 | 0.2066 | 0.0006 |

Tuning time: 0 min 23 sec
hyperopt | hyperopt.ipynb

5 trials

Hyperopt Results

| Metric | Train | Test | Delta(train, test) |
|---|---|---|---|
| Accuracy | 0.2058 | 0.2058 | -0.0000 |
| Precision | 0.0206 | 0.0206 | -0.0000 |
| Recall | 0.1000 | 0.1000 | 0.0000 |
| F1-Score | 0.0341 | 0.0341 | -0.0000 |

Tuning time: 0 min 30 sec
gp_minimize | gpminimize.ipynb

TODO
| Library 🤖 | Tune Time ⌛ | Precision | Recall | F1-Score | Notebook 📕 |
|---|---|---|---|---|---|
| Baseline | None | Train: 1.000 Test: 0.862 | Train: 1.000 Test: 0.863 | Train: 1.000 Test: 0.862 | baselines.ipynb |
| Optuna | 1m9s | Train: 0.818 Test: 0.778 | Train: 0.821 Test: 0.786 | Train: 0.813 Test: 0.772 | optuna.ipynb |
| BayesianOptimization | | Train: Test: | Train: Test: | Train: Test: | bayesianoptimization.ipynb |
| BayesSearchCV | | Train: Test: | Train: Test: | Train: Test: | bayessearchcv.ipynb |
| hyperopt | | Train: Test: | Train: Test: | Train: Test: | hyperopt.ipynb |
| gp_minimize | | Train: Test: | Train: Test: | Train: Test: | gpminimize.ipynb |
To create the conda environment, run the following command:
conda env create -f environment.yml
1️⃣ Optuna
# Pip
$ pip install optuna
# Conda
$ conda install -c conda-forge optuna
In Optuna you define an objective function, create a study via `create_study`, and optimize the objective. An example implementation looks as follows:
import optuna
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

def objective(trial):
    # Define the hyperparameter search space.
    n_estimators = trial.suggest_int('n_estimators', 50, 300)
    # ...

    # Build and fit the classifier.
    clf = RandomForestClassifier(n_estimators=n_estimators,
                                 # ...
                                 )
    clf.fit(X_train, y_train)

    # Evaluate on the validation set; Optuna maximizes this return value.
    y_pred = clf.predict(X_val)
    acc = accuracy_score(y_val, y_pred)
    return acc

# Create a study and run the Optuna optimization.
study = optuna.create_study(study_name='my_optuna_study',
                            direction='maximize')
study.optimize(objective,
               n_trials=100)

# Obtain the best parameters found.
best_params = study.best_params
While tuning, Optuna emits logs in the following format. The excerpt below shows the 1st, 150th, and last trial:
[I 2023-08-07 16:07:39,539] A new study created in memory with name: my_optuna_study
[I 2023-08-07 16:07:39,809] Trial 0 finished with value: 0.1864406779661017 and parameters: {'n_estimators': 158, 'max_depth': 10, 'min_samples_split': 0.578748660317236, 'min_samples_leaf': 0.12410371173766938}. Best is trial 0 with value: 0.1864406779661017.
...
[I 2023-08-07 16:08:21,456] Trial 150 finished with value: 0.1694915254237288 and parameters: {'n_estimators': 271, 'max_depth': 3, 'min_samples_split': 0.38567485569136783, 'min_samples_leaf': 0.13389811287208112}. Best is trial 55 with value: 0.22033898305084745.
...
[I 2023-08-07 16:09:02,867] Trial 299 finished with value: 0.1864406779661017 and parameters: {'n_estimators': 139, 'max_depth': 8, 'min_samples_split': 0.4866643968671017, 'min_samples_leaf': 0.19301363057724275}. Best is trial 55 with value: 0.22033898305084745.
2️⃣ BayesianOptimization
# Pip
$ pip install bayesian-optimization
# Conda
$ conda install -c conda-forge bayesian-optimization
3️⃣ BayesSearchCV
# Pip
$ pip install scikit-optimize
# Conda
$ conda install -c conda-forge scikit-optimize
4️⃣ hyperopt
# Pip
$ pip install hyperopt
# Conda
$ conda install -c conda-forge hyperopt
5️⃣ gp_minimize
# Pip
$ pip install scikit-optimize
# Conda
$ conda install -c conda-forge scikit-optimize