In Rust, we are required to specify the element type of the ndarray passed by the Python user, e.g. PyReadonlyArray2<f64>, where f64 is the element type of the ndarray. What happens if the design matrix X contains features of different data types?
In the original PyMC-BART implementation, the Python user can pass the following types:
X : PyTensor Variable, Pandas/Polars DataFrame or numpy array
The covariate matrix.
However, in the underlying Python code we convert X to a NumPy array and cast it to float, so every column (feature) ends up with a float dtype.
This can have implications in the underlying Rust code. For example, instead of having to define enums for different split value thresholds, we know a priori that all split value thresholds will be f64 due to the type cast at the Python level.
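To make the trade-off concrete, here is a minimal sketch of the two designs. The names SplitValue and SplitNode and the goes_left method are hypothetical, chosen only for illustration; they are not taken from the existing code. The enum is what a mixed-type X might require, while the plain f64 field is what the Python-level cast allows.

```rust
// Hypothetical alternative: one threshold variant per feature dtype.
// This is only needed if X keeps mixed types on the Rust side.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum SplitValue {
    Float(f64),
    Int(i32),
}

// With the cast done in Python, every threshold can be a plain f64.
// (Struct and field names are assumptions for this sketch.)
#[derive(Debug, Clone, Copy, PartialEq)]
struct SplitNode {
    feature: usize, // column index into a row of X
    threshold: f64, // always f64 because X was cast to float
}

impl SplitNode {
    // Returns true when the row falls on the left side of the split.
    fn goes_left(&self, row: &[f64]) -> bool {
        row[self.feature] <= self.threshold
    }
}
```

With the f64-only version there is no need to match on a variant at every split, at the cost of losing the original dtype information on the Rust side.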
Furthermore, there are different SplitRules, such as ContinuousSplit and OneHotSplit, defined for different feature data types. If the type cast on X is performed at the Python level, then at the Rust level we will need to cast f64 to i32, apply the split rule, and then cast back to f64, which is not that big of a deal.
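A rough sketch of the cast round trip, assuming categorical features arrive as integer codes stored in f64. The trait shape, the method name goes_left, and the helper distinct_categories are assumptions made for this example, not the actual API.

```rust
// Hypothetical trait; the real SplitRule interface may look different.
trait SplitRule {
    // Returns true when `value` belongs to the left child.
    fn goes_left(&self, value: f64, threshold: f64) -> bool;
}

struct ContinuousSplit;
struct OneHotSplit;

impl SplitRule for ContinuousSplit {
    fn goes_left(&self, value: f64, threshold: f64) -> bool {
        value <= threshold
    }
}

impl SplitRule for OneHotSplit {
    fn goes_left(&self, value: f64, threshold: f64) -> bool {
        // Compare category codes as integers to avoid float equality.
        // Note: `as i32` truncates toward zero, saturates at the i32
        // bounds, and maps NaN to 0, so inputs should be integral codes.
        (value as i32) == (threshold as i32)
    }
}

// Collects the distinct category codes of a column: f64 -> i32 to sort
// and deduplicate, then i32 -> f64 so thresholds stay f64 everywhere.
fn distinct_categories(column: &[f64]) -> Vec<f64> {
    let mut codes: Vec<i32> = column.iter().map(|&v| v as i32).collect();
    codes.sort_unstable();
    codes.dedup();
    codes.into_iter().map(f64::from).collect()
}
```

For example, OneHotSplit.goes_left(3.0, 3.0) is true and OneHotSplit.goes_left(2.0, 3.0) is false. The casts are cheap, which supports the point that doing the type cast in Python is not a big deal for the Rust side.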