
🔒 🤖 CI Update lock files for array-api CI build(s) 🔒 🤖 #29373


scikit-learn-bot (Contributor)

Update lock files.

Note

If the CI tasks fail, create a new branch based on this PR and add the required fixes to that branch.

github-actions bot commented Jul 1, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: da8dee7. Link to the linter CI: here

@betatim (Member) commented Jul 1, 2024

I started the CUDA CI for this PR: https://github.com/scikit-learn/scikit-learn/actions/runs/9740561717 (it failed because it couldn't check out the commit; investigating that now). Now https://github.com/scikit-learn/scikit-learn/actions/runs/9740690111/job/26878320047 is running.

I was expecting to see an automatically triggered run from the same bot that made this PR. There is https://github.com/scikit-learn/scikit-learn/actions/runs/9738462572, but that seems to have run on a4ebe19, which is the latest commit on main, not one that contains the updated lock file.

Edit: you have to paste the full commit hash 68f5f9bf93e23b0fd2fe746ce25cf0c63c57d55e, not just 68f5f9b.

@lesteve (Member) commented Jul 1, 2024

There seem to be genuine errors. It would probably be good to look at #29276 first, since there seem to be some array API issues there as well.

@betatim (Member) commented Jul 2, 2024

Some (or even all?) of the errors in the CI should be fixed by #29336.

I think that if we click the "Update branch" button at the bottom of this PR, we will pull in those changes from main and can retest this PR.

@lesteve (Member) commented Jul 2, 2024

I clicked "Update branch button" and triggered the GPU workflow, let's see 😉 https://github.com/scikit-learn/scikit-learn/actions/runs/9758399657/job/26932855916

@lesteve (Member) commented Jul 2, 2024

You may want to double-check that I triggered the GPU workflow correctly, since this is the first time I have done it 😉

@lesteve (Member) commented Jul 2, 2024

Amongst other errors, there seem to be the same `TypeError: array iteration is not allowed in array-api-strict` errors as in #29276, which are likely due to the array-api-strict update.
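For context, here is a minimal sketch of the kind of code that triggers this error (an illustration, assuming an array-api-strict version that disallows iteration, per the error message above):

```python
import array_api_strict as xp

a = xp.asarray([1.0, 2.0, 3.0])

# Anything that iterates over the array (a for-loop, list(a), tuple(a),
# or implicit unpacking) raises
# "TypeError: array iteration is not allowed in array-api-strict".
try:
    list(a)
except TypeError as exc:
    print(exc)
```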

@betatim (Member) commented Jul 12, 2024

I'm a bit puzzled by this. Trying to reproduce this on a machine with a GPU, I used `conda-lock install -n scikit-learn-debug-202407 build_tools/github/pylatest_conda_forge_cuda_array-api_linux-64_conda.lock`, then built scikit-learn with `pip install --verbose --no-build-isolation --editable . --config-settings editable-verbose=true`. However, now when I `import sklearn` I get the following:

```
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/nfs/thead/git/scikit-learn/sklearn/__init__.py", line 84, in <module>
    from .base import clone
  File "/home/nfs/thead/git/scikit-learn/sklearn/base.py", line 19, in <module>
    from .utils._estimator_html_repr import _HTMLDocumentationLinkMixin, estimator_html_repr
  File "/home/nfs/thead/git/scikit-learn/sklearn/utils/__init__.py", line 20, in <module>
    from ._indexing import (
  File "/home/nfs/thead/git/scikit-learn/sklearn/utils/_indexing.py", line 12, in <module>
    from .extmath import _approximate_mode
  File "/home/nfs/thead/git/scikit-learn/sklearn/utils/extmath.py", line 16, in <module>
    from .sparsefuncs_fast import csr_row_norms
  File "sklearn/utils/sparsefuncs_fast.pyx", line 1, in init sklearn.utils.sparsefuncs_fast
    """Utilities to work with sparse matrices and arrays written in Cython."""
ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
```

If anyone has an idea, let me know.

@lesteve (Member) commented Jul 12, 2024

`make clean` and `make dev` again?

FWIW I am able to reproduce on a machine with a GPU from the lock-file (but not on Colab for some reason ...)

The two values in each of the failing assertions:

```
(Pdb) result.dtype
dtype('float64')
(Pdb) result_xp_np.dtype
dtype('float32')
```

```
FAILED sklearn/tests/test_common.py::test_estimators[LinearDiscriminantAnalysis()-check_array_api_input(array_namespace=torch,dtype_name=float32,device=cpu)] - AssertionError
FAILED sklearn/tests/test_common.py::test_estimators[LinearDiscriminantAnalysis()-check_array_api_input(array_namespace=torch,dtype_name=float32,device=cuda)] - AssertionError
FAILED sklearn/tests/test_common.py::test_estimators[PCA()-check_array_api_input(array_namespace=torch,dtype_name=float32,device=cpu)] - AssertionError
FAILED sklearn/tests/test_common.py::test_estimators[PCA()-check_array_api_input(array_namespace=torch,dtype_name=float32,device=cuda)] - AssertionError
= 4 failed, 1460 passed, 178 skipped, 34087 deselected, 1027 warnings in 64.55s (0:01:04) =
```

Since there are errors with torch on CPU, I guess this is also reproducible without a GPU? Edit: yes, I can use the lock-file on a machine without a GPU and I get the failure for torch on CPU.

One of the stack traces:

```
_ test_estimators[PCA()-check_array_api_input(array_namespace=torch,dtype_name=float32,device=cpu)] _

estimator = PCA()
check = functools.partial(<function check_array_api_input at 0x7f272a82b6a0>, 'PCA', array_namespace='torch', dtype_name='float32', device='cpu')
request = <FixtureRequest for <Function test_estimators[PCA()-check_array_api_input(array_namespace=torch,dtype_name=float32,device=cpu)]>>

    @parametrize_with_checks(list(chain(_tested_estimators(), _generate_pipeline())))
    def test_estimators(estimator, check, request):
        # Common tests for estimator instances
        with ignore_warnings(
            category=(FutureWarning, ConvergenceWarning, UserWarning, LinAlgWarning)
        ):
            _set_checking_parameters(estimator)
>           check(estimator)

sklearn/tests/test_common.py:169: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

name = 'PCA', estimator_orig = PCA(), array_namespace = 'torch', device = 'cpu'
dtype_name = 'float32', check_values = False

    def check_array_api_input(
        name,
        estimator_orig,
        array_namespace,
        device=None,
        dtype_name="float64",
        check_values=False,
    ):
        """Check that the estimator can work consistently with the Array API
    
        By default, this just checks that the types and shapes of the arrays are
        consistent with calling the same estimator with numpy arrays.
    
        When check_values is True, it also checks that calling the estimator on the
        array_api Array gives the same results as ndarrays.
        """
        xp = _array_api_for_tests(array_namespace, device)
    
        X, y = make_classification(random_state=42)
        X = X.astype(dtype_name, copy=False)
    
        X = _enforce_estimator_tags_X(estimator_orig, X)
        y = _enforce_estimator_tags_y(estimator_orig, y)
    
        est = clone(estimator_orig)
    
        X_xp = xp.asarray(X, device=device)
        y_xp = xp.asarray(y, device=device)
    
        est.fit(X, y)
    
        array_attributes = {
            key: value for key, value in vars(est).items() if isinstance(value, np.ndarray)
        }
    
        est_xp = clone(est)
        with config_context(array_api_dispatch=True):
            est_xp.fit(X_xp, y_xp)
            input_ns = get_namespace(X_xp)[0].__name__
    
        # Fitted attributes which are arrays must have the same
        # namespace as the one of the training data.
        for key, attribute in array_attributes.items():
            est_xp_param = getattr(est_xp, key)
            with config_context(array_api_dispatch=True):
                attribute_ns = get_namespace(est_xp_param)[0].__name__
            assert attribute_ns == input_ns, (
                f"'{key}' attribute is in wrong namespace, expected {input_ns} "
                f"got {attribute_ns}"
            )
    
            assert array_device(est_xp_param) == array_device(X_xp)
    
            est_xp_param_np = _convert_to_numpy(est_xp_param, xp=xp)
            if check_values:
                assert_allclose(
                    attribute,
                    est_xp_param_np,
                    err_msg=f"{key} not the same",
                    atol=_atol_for_type(X.dtype),
                )
            else:
                assert attribute.shape == est_xp_param_np.shape
                assert attribute.dtype == est_xp_param_np.dtype
    
        # Check estimator methods, if supported, give the same results
        methods = (
            "score",
            "score_samples",
            "decision_function",
            "predict",
            "predict_log_proba",
            "predict_proba",
            "transform",
        )
    
        for method_name in methods:
            method = getattr(est, method_name, None)
            if method is None:
                continue
    
            if method_name == "score":
                result = method(X, y)
                with config_context(array_api_dispatch=True):
                    result_xp = getattr(est_xp, method_name)(X_xp, y_xp)
                # score typically returns a Python float
                assert isinstance(result, float)
                assert isinstance(result_xp, float)
                if check_values:
                    assert abs(result - result_xp) < _atol_for_type(X.dtype)
                continue
            else:
                result = method(X)
                with config_context(array_api_dispatch=True):
                    result_xp = getattr(est_xp, method_name)(X_xp)
    
            with config_context(array_api_dispatch=True):
                result_ns = get_namespace(result_xp)[0].__name__
            assert result_ns == input_ns, (
                f"'{method}' output is in wrong namespace, expected {input_ns}, "
                f"got {result_ns}."
            )
    
            assert array_device(result_xp) == array_device(X_xp)
            result_xp_np = _convert_to_numpy(result_xp, xp=xp)
    
            if check_values:
                assert_allclose(
                    result,
                    result_xp_np,
                    err_msg=f"{method} did not the return the same result",
                    atol=_atol_for_type(X.dtype),
                )
            else:
                if hasattr(result, "shape"):
                    assert result.shape == result_xp_np.shape
>                   assert result.dtype == result_xp_np.dtype
E                   AssertionError
```

@betatim (Member) commented Jul 12, 2024

I fixed my issue with `git clean -xdf`; there were a lot of stale generated .c files, probably because I hadn't used this particular checkout in a long, long time.

@lesteve (Member) commented Jul 12, 2024

> Probably because I hadn't used this particular checkout in a long, long time.

Ah, the good old setuptools days 😉

@betatim (Member) commented Jul 12, 2024

The problem might be this line:

```python
xp.asarray(0.0, device=device(exp_var)),
```

If you change it to:

```python
exp_var_diff = xp.where(
    exp_var > self.noise_variance_,
    exp_var_diff,
    xp.asarray(0.0, device=device(exp_var), dtype=exp_var.dtype),
)
```

the test passes again. I think it makes sense. Interesting that we didn't find this earlier. And it is another reason why it would be good for `xp.where` to support scalar arguments.
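To see why the explicit `dtype` matters, here is a minimal sketch of the dtype promotion involved, illustrated with NumPy (the variable values are made up for the example; the pattern mirrors the `xp.where` call above):

```python
import numpy as np

exp_var_diff = np.asarray([0.5, -0.1], dtype=np.float32)
cond = np.asarray([True, False])

# np.asarray(0.0) defaults to float64, so np.where promotes the
# float32 input to float64 and the result dtype no longer matches it.
promoted = np.where(cond, exp_var_diff, np.asarray(0.0))
print(promoted.dtype)  # float64

# Passing the input's dtype explicitly keeps everything in float32,
# which is what a float32-fitted estimator is expected to return.
fixed = np.where(cond, exp_var_diff, np.asarray(0.0, dtype=exp_var_diff.dtype))
print(fixed.dtype)  # float32
```

That would explain the float64-vs-float32 assertion failures reported above.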

@lesteve (Member) commented Jul 15, 2024

Closing since the issue has been fixed in #29488.

@lesteve closed this Jul 15, 2024