
legate-boost

A gradient boosting machine (GBM) implementation on Legate. The primary goal of legate-boost is to provide a state-of-the-art distributed GBM implementation on Legate, capable of running on CPUs or GPUs at supercomputer scale.

API Documentation

For developers - see contributing

Example

Run with the legate launcher

legate example_script.py

# example_script.py
import cunumeric as cn
import legateboost as lb

X = cn.random.random((1000, 10))
y = cn.random.random(X.shape[0])
model = lb.LBRegressor(verbose=1, n_estimators=100, random_state=0, max_depth=2).fit(
    X, y
)

Features

Probabilistic regression

legate-boost can learn distributions for continuous data. This is useful in cases where simply predicting the mean does not carry enough information about the training data:

[Figure: probabilistic regression example]

The above example can be found here: examples/probabilistic_regression.
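To illustrate why the mean alone can be insufficient, here is a minimal sketch in plain Python. It does not use legate-boost; the helper names `fit_normal` and `gaussian_nll` and the toy data are made up for this illustration. Two targets share the same mean but differ widely in spread, and a fitted normal distribution captures that difference.

```python
import math


def fit_normal(y):
    """Maximum-likelihood mean and standard deviation of a sample."""
    mean = sum(y) / len(y)
    var = sum((v - mean) ** 2 for v in y) / len(y)
    return mean, math.sqrt(var)


def gaussian_nll(y, mean, std):
    """Average negative log-likelihood of y under Normal(mean, std)."""
    return sum(
        0.5 * math.log(2 * math.pi * std**2) + (v - mean) ** 2 / (2 * std**2)
        for v in y
    ) / len(y)


# Toy targets (hypothetical data): identical means, very different spread.
narrow = [4.5, 5.0, 5.5, 5.0]
wide = [1.0, 9.0, 3.0, 7.0]

narrow_mean, narrow_std = fit_normal(narrow)
wide_mean, wide_std = fit_normal(wide)

# A mean-only prediction cannot tell these targets apart ...
print(narrow_mean, wide_mean)
# ... while the fitted standard deviations differ by roughly a factor of nine.
print(narrow_std, wide_std)
# Scoring the wide data with the narrow spread is penalised by the likelihood.
print(gaussian_nll(wide, wide_mean, wide_std))
print(gaussian_nll(wide, narrow_mean, narrow_std))
```

A probabilistic model is typically trained by minimising a likelihood-based loss like the one above, so it is rewarded for getting the spread right as well as the centre.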

Batch training

legate-boost can train on datasets that do not fit into memory by splitting the dataset into batches and training the model with partial_fit.

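total_estimators = 100
estimators_per_batch = 10  # illustrative value
# train_batches is assumed to be a list of (X, y) pairs prepared beforehand
n_batches = len(train_batches)
model = lb.LBRegressor(n_estimators=estimators_per_batch)
for i in range(total_estimators // estimators_per_batch):
    # cycle through the batches, training on one batch per call
    X_batch, y_batch = train_batches[i % n_batches]
    model.partial_fit(
        X_batch,
        y_batch,
    )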

[Figure: batch training example]

The above example can be found here: examples/batch_training.
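The batch loop can also be sketched without Legate. The toy booster below is a plain-Python stand-in, not legate-boost's implementation: `ToyBooster`, `fit_stump`, the learning rate, and the synthetic data are all assumptions made for illustration. Each `partial_fit` call adds a fixed number of decision stumps fitted to the residuals of the current batch only.

```python
def fit_stump(X, residual):
    """Best single-split stump on 1-D inputs, chosen by squared error."""
    best = None
    for t in sorted(set(X)):
        left = [r for x, r in zip(X, residual) if x <= t]
        right = [r for x, r in zip(X, residual) if x > t]
        lv = sum(left) / len(left)
        rv = sum(right) / len(right) if right else lv
        err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lv, rv)
    return best[1:]  # (threshold, left_value, right_value)


class ToyBooster:
    """Toy squared-error booster with an incremental partial_fit."""

    def __init__(self, n_estimators, learning_rate=0.5):
        self.n_estimators = n_estimators
        self.learning_rate = learning_rate
        self.stumps = []

    def predict(self, X):
        return [
            sum(self.learning_rate * (lv if x <= t else rv) for t, lv, rv in self.stumps)
            for x in X
        ]

    def partial_fit(self, X, y):
        # Add n_estimators stumps, each fitted to this batch's residuals.
        for _ in range(self.n_estimators):
            pred = self.predict(X)
            residual = [yi - pi for yi, pi in zip(y, pred)]
            self.stumps.append(fit_stump(X, residual))
        return self


def mse(y_true, y_pred):
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


# Synthetic 1-D data, split into interleaved batches (hypothetical setup).
X = [i / 100 for i in range(100)]
y = [x * x for x in X]
n_batches = 4
train_batches = [(X[i::n_batches], y[i::n_batches]) for i in range(n_batches)]

total_estimators = 40
estimators_per_batch = 10
model = ToyBooster(n_estimators=estimators_per_batch)
for i in range(total_estimators // estimators_per_batch):
    X_batch, y_batch = train_batches[i % n_batches]
    model.partial_fit(X_batch, y_batch)

baseline_mse = mse(y, [0.0] * len(y))
final_mse = mse(y, model.predict(X))
print(len(model.stumps), baseline_mse, final_mse)
```

Only one batch is touched per call, so the full dataset never has to be held at once; the ensemble itself is the state carried from batch to batch.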

Different model types

legate-boost supports tree models, linear models, kernel ridge regression models, custom user models, and any combination of these.

The following example shows a model combining linear and decision tree base learners on a synthetic dataset.

model = lb.LBRegressor(base_models=(lb.models.Linear(), lb.models.Tree(max_depth=1),), **params).fit(X, y)

[Figure: linear and decision tree base learners on a synthetic dataset]

The second example shows a model combining kernel ridge regression and decision tree base learners on the wine quality dataset.

model = lb.LBRegressor(base_models=(lb.models.KRR(sigma=0.5), lb.models.Tree(max_depth=5),), **params).fit(X, y)

[Figure: kernel ridge regression and decision tree base learners on the wine quality dataset]
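To show why mixing base learners can help, here is a minimal plain-Python sketch. It is not legate-boost code: the helpers `fit_linear`, `fit_stump`, and `boost` are hypothetical, and cycling through the learners one per round is an assumption chosen for simplicity. The target is a linear trend plus a jump, which neither a line nor a single split describes well on its own.

```python
def fit_linear(X, r):
    """Least-squares line through (X, r); returns a prediction function."""
    n = len(X)
    mx, mr = sum(X) / n, sum(r) / n
    sxx = sum((x - mx) ** 2 for x in X)
    a = sum((x - mx) * (ri - mr) for x, ri in zip(X, r)) / sxx if sxx else 0.0
    b = mr - a * mx
    return lambda x: a * x + b


def fit_stump(X, r):
    """Best single-split stump on 1-D inputs; returns a prediction function."""
    best = None
    for t in sorted(set(X)):
        left = [ri for x, ri in zip(X, r) if x <= t]
        right = [ri for x, ri in zip(X, r) if x > t]
        lv = sum(left) / len(left)
        rv = sum(right) / len(right) if right else lv
        err = sum((ri - lv) ** 2 for ri in left) + sum((ri - rv) ** 2 for ri in right)
        if best is None or err < best[0]:
            best = (err, t, lv, rv)
    _, t, lv, rv = best
    return lambda x: lv if x <= t else rv


def boost(X, y, base_learners, n_rounds, learning_rate=0.5):
    """Boost on squared error, cycling through base_learners one per round."""
    ensemble = []
    pred = [0.0] * len(X)
    for i in range(n_rounds):
        fit = base_learners[i % len(base_learners)]
        residual = [yi - pi for yi, pi in zip(y, pred)]
        f = fit(X, residual)
        ensemble.append(f)
        pred = [pi + learning_rate * f(x) for pi, x in zip(pred, X)]
    return lambda x: sum(learning_rate * f(x) for f in ensemble)


def mse(model, X, y):
    return sum((model(x) - yi) ** 2 for x, yi in zip(X, y)) / len(X)


# Synthetic target (hypothetical): linear trend plus a jump at x = 0.5.
X = [i / 50 for i in range(50)]
y = [2 * x + (1.0 if x > 0.5 else 0.0) for x in X]

linear_only = boost(X, y, (fit_linear,), n_rounds=20)
combined = boost(X, y, (fit_linear, fit_stump), n_rounds=20)

mse_linear = mse(linear_only, X, y)
mse_combined = mse(combined, X, y)
print(mse_linear, mse_combined)
```

In this sketch the linear learner absorbs the trend while the stump absorbs the jump, so the combined ensemble reaches a lower training error than the linear-only one with the same number of rounds.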

Installation

If you already have cunumeric and legate-core installed, run the following:

pip install \
    --no-build-isolation \
    --no-deps \
    .

For more details on customizing the build and setting up a development environment, see contributing.md.
