whoisroop changed the title from Training error for "LINEAR" & "LGB" models to Training error for "LINEAR" model on Mar 4, 2024
Hi Roop,
Thank you for opening this issue. Decision-tree-based models can work with NaNs natively, so I would suggest using xgboost or another GBDT model if possible. If not, I would happily review your PR to fix the issue! I think adding an optional preprocessing step to impute or drop data would be useful!
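As a rough illustration of the optional preprocessing step suggested above, here is a minimal sketch using pandas. The helper name `handle_missing`, its `strategy` parameter, and the example columns are hypothetical and not part of any existing API; it only shows the two options mentioned (dropping rows or imputing values).

```python
import numpy as np
import pandas as pd


def handle_missing(data: pd.DataFrame, strategy: str = "drop") -> pd.DataFrame:
    """Hypothetical helper: return a copy of `data` that contains no NaNs."""
    if strategy == "drop":
        # Remove every row that has at least one missing value
        return data.dropna(axis=0, how="any")
    if strategy == "impute":
        # Fill each numeric column with its own mean (NaNs are skipped in the mean)
        return data.fillna(data.mean(numeric_only=True))
    raise ValueError(f"Unknown strategy: {strategy!r}")


# Made-up example data with one missing temperature reading
example_data = pd.DataFrame(
    {"load": [1.0, 2.0, 3.0, 4.0], "temp": [10.0, np.nan, 14.0, 12.0]}
)
dropped = handle_missing(example_data, "drop")
imputed = handle_missing(example_data, "impute")
```

Dropping is the simpler option but discards time steps. Mean imputation keeps every row, at the cost of inventing values, so which one is appropriate probably depends on how much data is missing.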
Pipeline Configuration:
pj = PredictionJobDataClass(
    id=102,
    model='linear',
    quantiles=[0.1, 0.3, 0.5, 0.7, 0.9],
    forecast_type="load",
    lat=19.0760,
    lon=72.8777,
    horizon_minutes=24*60,
    resolution_minutes=60,
    name="Mumbai",
    # hyper_params={},
    # feature_names=None,
    default_modelspecs=None,
    save_train_forecasts=True,
)
Training Forecast:
start = time.time()
train_model_pipeline(
    pj,
    train_data,
    check_old_model_age=False,
    mlflow_tracking_uri="./mlflow_trained_models",
    artifact_folder="./mlflow_artifacts",
)
end = time.time()
ERROR:
ValueError: Input X contains NaN.
LinearRegression does not accept missing values encoded as NaN natively. For supervised learning, you might want to consider sklearn.ensemble.HistGradientBoostingClassifier and Regressor which accept missing values encoded as NaNs natively. Alternatively, it is possible to preprocess the data, for instance by using an imputer transformer in a pipeline or drop samples with missing values. See https://scikit-learn.org/stable/modules/impute.html You can find a list of all estimators that handle NaN values at the following page: https://scikit-learn.org/stable/modules/impute.html#estimators-that-handle-nan-values