ValueError when using [GlobalSklearnTransformer+PowerTransformer] #475

Closed
wasf84 opened this issue Feb 6, 2025 · 1 comment · Fixed by #477
wasf84 commented Feb 6, 2025

What happened + What you expected to happen

Hello! o/

I was reading the manual to learn about new ways to transform data. However, when I attempt to apply the 'GlobalSklearnTransformer' with a 'PowerTransformer', I consistently encounter an error. All I did was try to apply the example shown in the manual here https://nixtlaverse.nixtla.io/mlforecast/target_transforms.html to my data, but the error persists. This example from the manual is exactly what I want to accomplish with my data—it would be perfect for my use case.

simple_train.xlsx


ValueError Traceback (most recent call last)
Cell In[184], line 1
----> 1 df_cv = mlf.cross_validation(
2 df = simple_train[["ds", "unique_id", "y"]],
3 n_windows = 5,
4 h = 15,
5 step_size = 15,
6 refit = True,
7 static_features = [],
8 )

File ~/bin/miniconda3/envs/py311/lib/python3.11/site-packages/mlforecast/forecast.py:958, in MLForecast.cross_validation(self, df, n_windows, h, id_col, time_col, target_col, step_size, static_features, dropna, keep_last_n, refit, max_horizon, before_predict_callback, after_predict_callback, prediction_intervals, level, input_size, fitted, as_numpy, weight_col)
956 else:
957 X_df = None
--> 958 y_pred = self.predict(
959 h=h,
960 before_predict_callback=before_predict_callback,
961 after_predict_callback=after_predict_callback,
962 new_df=train if not should_fit else None,
963 level=level,
964 X_df=X_df,
965 )
966 y_pred = ufp.join(y_pred, cutoffs, on=id_col, how="left")
967 result = ufp.join(
968 valid[[id_col, time_col, target_col]],
969 y_pred,
970 on=[id_col, time_col],
971 )

File ~/bin/miniconda3/envs/py311/lib/python3.11/site-packages/mlforecast/forecast.py:738, in MLForecast.predict(self, h, before_predict_callback, after_predict_callback, new_df, level, X_df, ids)
735 else:
736 ts = self.ts
--> 738 forecasts = ts.predict(
739 models=self.models_,
740 horizon=h,
741 before_predict_callback=before_predict_callback,
742 after_predict_callback=after_predict_callback,
743 X_df=X_df,
744 ids=ids,
745 )
746 if level is not None:
747 if self._cs_df is None:

File ~/bin/miniconda3/envs/py311/lib/python3.11/site-packages/mlforecast/core.py:856, in TimeSeries.predict(self, models, horizon, before_predict_callback, after_predict_callback, X_df, ids)
854 preds = ufp.assign_columns(preds, col, ga.data)
855 else:
--> 856 preds = tfm.inverse_transform(preds)
857 return preds

File ~/bin/miniconda3/envs/py311/lib/python3.11/site-packages/mlforecast/target_transforms.py:333, in GlobalSklearnTransformer.inverse_transform(self, df)
329 df = ufp.copy_if_pandas(df, deep=False)
330 cols_to_transform = [
331 c for c in df.columns if c not in (self.id_col, self.time_col)
332 ]
--> 333 transformed = self.transformer_.inverse_transform(
334 df[cols_to_transform].to_numpy()
335 )
336 return ufp.assign_columns(df, cols_to_transform, transformed)

File ~/bin/miniconda3/envs/py311/lib/python3.11/site-packages/sklearn/preprocessing/_data.py:3379, in PowerTransformer.inverse_transform(self, X)
3348 """Apply the inverse power transformation using the fitted lambdas.
3349
3350 The inverse of the Box-Cox transformation is given by::
(...)
3376 The original data.
3377 """
3378 check_is_fitted(self)
-> 3379 X = self._check_input(X, in_fit=False, check_shape=True)
3381 if self.standardize:
3382 X = self._scaler.inverse_transform(X)

File ~/bin/miniconda3/envs/py311/lib/python3.11/site-packages/sklearn/preprocessing/_data.py:3513, in PowerTransformer._check_input(self, X, in_fit, check_positive, check_shape)
3495 def _check_input(self, X, in_fit, check_positive=False, check_shape=False):
3496 """Validate the input before fit and transform.
3497
3498 Parameters
(...)
3511 If True, check that n_features matches the length of self.lambdas_
3512 """
-> 3513 X = self._validate_data(
3514 X,
3515 ensure_2d=True,
3516 dtype=FLOAT_DTYPES,
3517 force_writeable=True,
3518 copy=self.copy,
3519 force_all_finite="allow-nan",
3520 reset=in_fit,
3521 )
3523 with warnings.catch_warnings():
3524 warnings.filterwarnings("ignore", r"All-NaN (slice|axis) encountered")

File ~/bin/miniconda3/envs/py311/lib/python3.11/site-packages/sklearn/base.py:654, in BaseEstimator._validate_data(self, X, y, reset, validate_separately, cast_to_ndarray, **check_params)
651 out = X, y
653 if not no_val_X and check_params.get("ensure_2d", True):
--> 654 self._check_n_features(X, reset=reset)
656 return out

File ~/bin/miniconda3/envs/py311/lib/python3.11/site-packages/sklearn/base.py:443, in BaseEstimator._check_n_features(self, X, reset)
440 return
442 if n_features != self.n_features_in_:
--> 443 raise ValueError(
444 f"X has {n_features} features, but {self.__class__.__name__} "
445 f"is expecting {self.n_features_in_} features as input."
446 )

ValueError: X has 6 features, but PowerTransformer is expecting 1 features as input.
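
For context, the mismatch in the last frame can be reproduced with scikit-learn alone. The sketch below is only an illustration under assumptions: the synthetic data and the six-column `preds` array (standing in for one prediction column per model) are made up and are not taken from simple_train.xlsx. It fits a PowerTransformer on a single column, as happens for `y`, and then calls `inverse_transform` on six columns at once.

```python
import numpy as np
from sklearn.preprocessing import PowerTransformer

# Synthetic single-column target (assumption: stands in for the "y" column)
rng = np.random.default_rng(0)
y = rng.lognormal(size=(100, 1))

# The transformer is fitted on exactly one feature
pt = PowerTransformer(method="yeo-johnson", standardize=False).fit(y)

# Hypothetical stand-in for the predictions of six models: six columns
preds = np.repeat(pt.transform(y[:15]), 6, axis=1)

try:
    pt.inverse_transform(preds)  # six features passed to a one-feature transformer
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(preds.shape, error_message)
```

Since the list of models in the reproduction script has six entries, the "6 features" in the message most likely corresponds to the six prediction columns being passed to the transformer together.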

Versions / Dependencies

Python
3.11.11

Zorin OS
17.2

mlforecast
1.0.1

sklearn
1.5.2

pandas
2.2.3

numpy
1.26.4

scipy
1.14.0

Reproduction script

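import pandas as pd
from catboost import CatBoostRegressor
from lightgbm import LGBMRegressor
from mlforecast import MLForecast
from mlforecast.target_transforms import GlobalSklearnTransformer
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PowerTransformer
from xgboost import XGBRegressor

# Assumption: simple_train is loaded from the attached simple_train.xlsx
simple_train = pd.read_excel("simple_train.xlsx")

# Global Yeo-Johnson transform of the target, as in the manual's example
sk_yeojohnson = PowerTransformer(method='yeo-johnson', standardize=False)
yeojohnson_global = GlobalSklearnTransformer(sk_yeojohnson)

mdl_lr = LinearRegression()
mdl_cb = CatBoostRegressor(loss_function="MAE", verbose=False, allow_writing_files=False, has_time=True)
mdl_xgb = XGBRegressor(verbosity=0)
mdl_lgbm = LGBMRegressor(objective="regression_l1", verbosity=-1)
mdl_rf = RandomForestRegressor(verbose=0, criterion="friedman_mse")
mdl_gb = GradientBoostingRegressor(verbose=0, loss="absolute_error")

# Six models, so the predictions have six columns
models_ml = [mdl_lr, mdl_cb, mdl_xgb, mdl_lgbm, mdl_rf, mdl_gb]

mlf = MLForecast(models=models_ml, freq='D', lags=list(range(1, 31)), target_transforms=[yeojohnson_global])

# Raises the ValueError shown above on mlforecast 1.0.1
df_cv = mlf.cross_validation(df=simple_train[["ds", "unique_id", "y"]], n_windows=5, h=15, step_size=15, refit=True, static_features=[])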
Issue Severity

None

wasf84 added the bug label Feb 6, 2025
jmoralez changed the title from "[<Library component: Model|Core|etc...>] ValueError when using [GlobalSklearnTransformer+PowerTransformer]" to "ValueError when using [GlobalSklearnTransformer+PowerTransformer]" Feb 6, 2025

jmoralez (Member) commented Feb 7, 2025

Hey. Thanks for the report, a fix will be released soon.
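
Until a release with the fix is installed, one possible stopgap is to transform `y` manually before fitting (without passing `target_transforms`) and then invert the prediction columns one at a time, so the transformer only ever sees a single feature. The sketch below is an assumption-laden illustration, not the change made in #477: `inverse_per_column` is a hypothetical helper that is not part of mlforecast, and the small `preds` frame with two model columns is made-up data.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import PowerTransformer


def inverse_per_column(transformer, preds, id_col="unique_id", time_col="ds"):
    """Hypothetical helper: invert each prediction column separately."""
    out = preds.copy()
    for col in out.columns:
        if col in (id_col, time_col):
            continue  # leave identifier and timestamp columns untouched
        values = out[[col]].to_numpy(dtype=float)  # shape (n, 1)
        out[col] = transformer.inverse_transform(values)[:, 0]
    return out


# Made-up target and a transformer fitted on that single column
rng = np.random.default_rng(0)
y = rng.lognormal(size=(50, 1))
pt = PowerTransformer(method="yeo-johnson", standardize=False).fit(y)
y_t = pt.transform(y)[:3, 0]

# Made-up predictions on the transformed scale, one column per model
preds = pd.DataFrame({
    "unique_id": ["a"] * 3,
    "ds": pd.date_range("2025-01-01", periods=3, freq="D"),
    "LinearRegression": y_t,
    "XGBRegressor": y_t,
})

restored = inverse_per_column(pt, preds)
print(restored)
```

The default column names match the `unique_id` and `ds` columns used in the report; other names can be passed through the `id_col` and `time_col` arguments.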
