AutoML: CI trips with ValueError: Input contains NaN. #298

Closed
amotl opened this issue Feb 13, 2024 · 8 comments

amotl commented Feb 13, 2024

This was originally reported in GH-170, an issue that mixed up several problems. Let's track it separately here.

Problem

The AutoML job on CI occasionally fails with the following error.

FAILED test.py::test_file[automl_timeseries_forecasting_with_pycaret.py] - ValueError: Input contains NaN.
self = <joblib.parallel.BatchCompletionCallBack object at 0x7f4f737cb910>

    def _return_or_raise(self):
        try:
            if self.status == TASK_ERROR:
>               raise self._result
E               ValueError: Input contains NaN.

-- https://github.com/crate/cratedb-examples/actions/runs/7884792002/job/21514554253#step:6:1146
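
Because the traceback surfaces inside a joblib worker, it does not say which values were NaN. A minimal sketch of a pre-flight check that could be run on the series before it is handed to the trainers is shown here. The helper names `find_nan_positions` and `assert_no_nan` are hypothetical and are not part of the example script or of PyCaret.

```python
import math


def find_nan_positions(values):
    """Return the indices of all NaN entries in a sequence of numbers."""
    return [
        index
        for index, value in enumerate(values)
        if isinstance(value, float) and math.isnan(value)
    ]


def assert_no_nan(values, name="input"):
    """Raise a ValueError naming the first few NaN positions, if any exist."""
    positions = find_nan_positions(values)
    if positions:
        raise ValueError(f"{name} contains NaN at positions {positions[:10]}")


# Hypothetical usage: fail early, with a location, instead of deep inside a worker.
series = [1.0, 2.5, float("nan"), 4.0]
print(find_nan_positions(series))
```

A check like this only narrows down whether the NaN is already present in the input data. It does not help if the NaN is produced later, during model fitting or scoring.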

Outlook

@andnig shared his suggestions at #170 (comment) already. Maybe you can add them here instead?

amotl commented Feb 13, 2024

Recommendation

@andnig suggested:

To move forward, you could use a different model for the test run, one with a lower MASE.

Thanks!

Rationale

If I look at the failed run, I see that the esm model has an incredibly high MASE and RMSSE. This mostly indicates that the model is not well suited to the data. I suggested it because it is very lightweight, but it seems to be too lightweight 😓
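
For context, MASE (mean absolute scaled error) divides the forecast's mean absolute error by the mean absolute error of a naive forecast on the training data, so values far above 1 mean the model does much worse than simply repeating the previous observation. A minimal sketch of the standard definition follows. The function name `mase` and the sample numbers are illustrative only, and this is not how PyCaret computes the metric internally.

```python
def mase(y_train, y_true, y_pred, m=1):
    """Mean absolute scaled error with an in-sample naive forecast of lag m."""
    # Scale: mean absolute error of predicting y[t] with y[t - m] on the training data.
    naive_errors = [abs(y_train[i] - y_train[i - m]) for i in range(m, len(y_train))]
    if not naive_errors:
        raise ValueError("training series is too short for the chosen lag")
    scale = sum(naive_errors) / len(naive_errors)
    if scale == 0:
        # A constant training series makes the metric undefined.
        raise ValueError("naive forecast error is zero, MASE is undefined")
    # Mean absolute error of the actual forecast on the held-out data.
    mae = sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)
    return mae / scale


# Illustrative numbers: the naive in-sample error is 1.0, the forecast MAE is 1.0.
print(mase([1.0, 2.0, 3.0, 4.0], [5.0, 6.0], [5.0, 8.0]))
```

Note that the scale term can become zero for a constant training window, which makes the metric undefined. Whether that plays any role in the CI failure is not established in this thread.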

amotl commented Feb 14, 2024

Hi again. GH-300 changes the script to use only a single model, "ets_cds_dt". Unfortunately, it still trips on CI.

andnig commented Feb 14, 2024

Wasn't the script about using 3 models? I think the later benchmarking operations need at least 3 models, don't they?
Using 1 model without adjusting the later call will probably cause the trainers to fail.
But you'd also see this locally, not only on CI.
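
One way to make such a mismatch obvious is to check the number of selected models before the later steps run. A minimal sketch follows, assuming the selection step may return either a list of models or a single bare model object. The helper `require_models` and the threshold of 3 are hypothetical and are not taken from the example script.

```python
def require_models(models, minimum=3):
    """Fail early with a clear message when too few models were selected."""
    # Assumption: a single selected model may arrive as a bare object, not a list.
    if not isinstance(models, (list, tuple)):
        models = [models]
    if len(models) < minimum:
        raise ValueError(
            f"expected at least {minimum} models for benchmarking, got {len(models)}"
        )
    return list(models)


# Hypothetical usage with placeholder model names instead of fitted estimators.
selected = require_models(["model_a", "model_b", "model_c"])
print(len(selected))
```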

amotl commented Feb 14, 2024

Ah all right. That looks like I didn't know what I was doing at all. Thanks!

@seut added and removed the bug label on Feb 20, 2024

amotl commented Feb 27, 2024

Currently, we see no problems on CI in this regard. Therefore, I am closing the issue. Thanks for your support, @andnig!

@amotl closed this as completed on Feb 27, 2024

amotl commented Apr 11, 2024

The problem still happens occasionally, so re-opening.

-- https://github.com/crate/cratedb-examples/actions/runs/8644976949/job/23701224841#step:6:1137

@amotl reopened this on Apr 11, 2024

amotl commented Apr 14, 2024

amotl commented Jan 21, 2025

Hi. AutoML/PyCaret is still flaky. I've converged this into:

We will need to migrate off PyCaret anyway.

@amotl closed this as not planned on Jan 21, 2025