Nested Cross-Validation for Bayesian Optimized Gradient Boosting

Description

A Python implementation that unifies Nested K-Fold Cross-Validation, Bayesian Hyperparameter Optimization, and Gradient Boosting. Designed for rapid prototyping on small to mid-sized data sets that fit in memory. It obtains high-quality predictions quickly by abstracting away tedious hyperparameter tuning and implementation details in favor of usability and implementation speed. Bayesian Hyperparameter Optimization uses the Tree-structured Parzen Estimator (TPE) from the Hyperopt package. Gradient Boosting can be conducted in one of three ways: XGBoost, LightGBM, or CatBoost. XGBoost is applied using traditional Gradient Tree Boosting (GTB). LightGBM is applied using its novel Gradient-based One-Side Sampling (GOSS). CatBoost is applied using its novel Ordered Boosting. NestedHyperBoost can be applied to regression, multi-class classification, and binary classification problems.
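
To make the nested structure concrete, here is a minimal, standard-library-only sketch of nested k-fold cross-validation. It is not the package's implementation: the toy constant-prediction model, the `shrink` hyperparameter, and every function name below are hypothetical, and a plain search over a short candidate list stands in for Bayesian optimization. The point is only the control flow: the inner loop tunes on the outer-training portion, and the outer loop scores the tuned model on data the tuning never saw.

```python
import random
from statistics import mean


def k_fold_indices(n, k, seed=0):
    """Shuffle range(n) and deal the indices into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit(y_train, shrink):
    """Toy model (hypothetical): a constant prediction, the training mean shrunk toward 0."""
    return mean(y_train) * (1.0 - shrink)


def mse(y_test, pred):
    """Mean squared error of a constant prediction."""
    return mean((v - pred) ** 2 for v in y_test)


def cross_val_mse(y, k, shrink, seed=0):
    """Plain k-fold CV error of one hyperparameter value on the values y."""
    folds = k_fold_indices(len(y), k, seed)
    errors = []
    for i, test_idx in enumerate(folds):
        train = [y[j] for f, fold in enumerate(folds) if f != i for j in fold]
        test = [y[j] for j in test_idx]
        errors.append(mse(test, fit(train, shrink)))
    return mean(errors)


def nested_cv(y, k_outer, k_inner, candidates):
    """Return (outer-fold errors, hyperparameter chosen in each outer fold)."""
    outer_folds = k_fold_indices(len(y), k_outer, seed=1)
    outer_errors, chosen = [], []
    for i, test_idx in enumerate(outer_folds):
        train = [y[j] for f, fold in enumerate(outer_folds) if f != i for j in fold]
        test = [y[j] for j in test_idx]
        # Inner loop: tune using only the outer-training portion.
        best = min(candidates, key=lambda s: cross_val_mse(train, k_inner, s))
        # Outer loop: refit with the chosen value, score on held-out data.
        outer_errors.append(mse(test, fit(train, best)))
        chosen.append(best)
    return outer_errors, chosen


rng = random.Random(42)
y = [rng.gauss(10.0, 1.0) for _ in range(100)]
candidates = [0.0, 0.1, 0.5]
outer_errors, chosen = nested_cv(y, k_outer=5, k_inner=5, candidates=candidates)
print(len(outer_errors), chosen)
```

The mean of `outer_errors` estimates generalization error without the optimistic bias that arises when the same folds are used both to tune and to evaluate.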

Features

Consistent syntax across all Gradient Boosting methods.
Supported Gradient Boosting methods: XGBoost, LightGBM, CatBoost.
Returns a custom object that includes common performance metrics and plots.
Developed for readability, maintainability, and future improvement.

Requirements

Python 3
NumPy
Pandas
Matplotlib
scikit-learn
Hyperopt
XGBoost
LightGBM
CatBoost

Installation

## install pypi release
pip install nestedhyperboost

## install developer version
pip install git+https://github.com/nickkunz/nestedhyperboost.git

Usage

## load libraries
from nestedhyperboost import xgboost
from sklearn import datasets
import pandas

## load data
data_sklearn = datasets.load_iris()
data = pandas.DataFrame(data_sklearn.data, columns = data_sklearn.feature_names)
data['target'] = pandas.Series(data_sklearn.target)

## conduct nestedhyperboost
results = xgboost.xgb_ncv_classifier(
    data = data,
    y = 'target',
    k_inner = 5,
    k_outer = 5,
    n_evals = 10
)

## preview results
results.accu_mean()
results.conf_mtrx()
results.prfs_mean()

## preview plots
results.feat_plot()

## model and params
model = results.model
params = results.params

License

Contributions

NestedHyperBoost is open for improvements and maintenance. Your help is valued to make the package better for everyone.

References

Bergstra, J., Bardenet, R., Bengio, Y., Kegl, B. (2011). Algorithms for Hyper-Parameter Optimization. https://papers.nips.cc/paper/4443-algorithms-for-hyper-parameter-optimization.pdf.

Bergstra, J., Yamins, D., Cox, D. D. (2013). Making a Science of Model Search: Hyperparameter Optimization in Hundreds of Dimensions for Vision Architectures. Proceedings of the 30th International Conference on Machine Learning. 28:I115–I123. http://proceedings.mlr.press/v28/bergstra13.pdf.

Chen, T., Guestrin, C. (2016). XGBoost: A Scalable Tree Boosting System. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 785–794. https://www.kdd.org/kdd2016/papers/files/rfp0697-chenAemb.pdf.

Ke, G., Meng, Q., Finley, T., et al. (2017). LightGBM: A Highly Efficient Gradient Boosting Decision Tree. Proceedings of the 31st International Conference on Neural Information Processing Systems. 3146–3154. https://papers.nips.cc/paper/6907-lightgbm-a-highly-efficient-gradient-boosting-decision-tree.pdf.

Prokhorenkova, L., Gusev, G., Vorobev, A., et al. (2018). CatBoost: Unbiased Boosting with Categorical Features. Proceedings of the 32nd International Conference on Neural Information Processing Systems. 6639–6649. http://learningsys.org/nips17/assets/papers/paper_11.pdf.
