[BUG] cudf_udf nightly tests failing due to no attribute __pyx_capi__ #11693

Open
jlowe opened this issue Nov 5, 2024 · 3 comments

@jlowe
Member

jlowe commented Nov 5, 2024

Nightly cudf_udf test builds recently started failing with exceptions like the following:

INFO: Process 1469 found CUDA visible device(s): 0
Traceback (most recent call last):
  File "/home/...../jars/rapids-4-spark_2.12-24.12.0-SNAPSHOT-cuda11.jar/rapids/daemon.py", line 131, in manager
  File "/home/...../jars/rapids-4-spark_2.12-24.12.0-SNAPSHOT-cuda11.jar/rapids/worker.py", line 37, in initialize_gpu_mem
    from cudf import rmm
  File "/opt/conda/lib/python3.10/site-packages/cudf/__init__.py", line 19, in <module>
    _setup_numba()
  File "/opt/conda/lib/python3.10/site-packages/cudf/utils/_numba.py", line 121, in _setup_numba
    shim_ptx_cuda_version = _get_cuda_build_version()
  File "/opt/conda/lib/python3.10/site-packages/cudf/utils/_numba.py", line 16, in _get_cuda_build_version
    from cudf._lib import strings_udf
  File "/opt/conda/lib/python3.10/site-packages/cudf/_lib/__init__.py", line 4, in <module>
    from . import (
  File "avro.pyx", line 1, in init cudf._lib.avro
  File "utils.pyx", line 1, in init cudf._lib.utils
  File "column.pyx", line 1, in init cudf._lib.column
  File "/opt/conda/lib/python3.10/site-packages/rmm/__init__.py", line 17, in <module>
    from rmm import mr
  File "/opt/conda/lib/python3.10/site-packages/rmm/mr.py", line 14, in <module>
    from rmm.pylibrmm.memory_resource import (
  File "/opt/conda/lib/python3.10/site-packages/rmm/pylibrmm/__init__.py", line 15, in <module>
    from .device_buffer import DeviceBuffer
  File "device_buffer.pyx", line 1, in init rmm.pylibrmm.device_buffer
AttributeError: module 'cuda.ccudart' has no attribute '__pyx_capi__'
INFO: Process 1503 found CUDA visible device(s): 0
24/11/05 14:10:10 ERROR Executor: Exception in task 2.0 in stage 1.0 (TID 8)
java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:392)
        at org.apache.spark.api.python.PythonWorkerFactory.createSocket$1(PythonWorkerFactory.scala:121)
        at org.apache.spark.api.python.PythonWorkerFactory.liftedTree1$1(PythonWorkerFactory.scala:137)
        at org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:136)
        at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:106)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:121)
        at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:162)
        at org.apache.spark.sql.rapids.execution.python.GpuArrowEvalPythonExec.$anonfun$internalDoExecuteColumnar$2(GpuArrowEvalPythonExec.scala:456)
        at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2(RDD.scala:863)
        at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2$adapted(RDD.scala:863)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
[.....]
@jlowe added the "? - Needs Triage" and "bug" labels Nov 5, 2024
@leofang
Member

leofang commented Nov 5, 2024

This is tracked in NVIDIA/cuda-python#215. We are working on it. For the time being please downgrade your cuda-python version as instructed there.
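
For anyone trying to confirm whether an environment is affected, here is a minimal diagnostic sketch. It relies on two things visible in the traceback above: the failure happens while the compiled module `rmm.pylibrmm.device_buffer` initializes, and the missing attribute is `__pyx_capi__` on `cuda.ccudart`. Cython uses `__pyx_capi__` to share C-level symbols between extension modules, so a module without it cannot be used that way by other compiled extensions. The helper names `has_cython_capi` and `installed_version` are made up for this example and are not part of cudf, rmm, or cuda-python.

```python
import importlib
from importlib import metadata
from typing import Optional


def has_cython_capi(module_name: str) -> Optional[bool]:
    """Report whether a module exposes Cython's __pyx_capi__ attribute.

    Returns None if the module cannot be imported at all.
    (Hypothetical helper, written only for this diagnostic.)
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return hasattr(module, "__pyx_capi__")


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Module and distribution names are taken from the traceback and comments.
    print("cuda-python version:", installed_version("cuda-python"))
    print("cuda.ccudart has __pyx_capi__:", has_cython_capi("cuda.ccudart"))
```

If the second line prints `False`, the environment will likely hit the same `AttributeError` when rmm is imported. The check only inspects the attribute and does not load cudf or rmm itself.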

@mattahrens
Collaborator

We need to update our test environment to pin the version of cuda-python.
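
One way a test environment could enforce such a pin is a small pre-flight guard, sketched below. This is an illustration only: the thread does not state which cuda-python version to pin to, so the upper bound is read from an environment variable, and the variable name `CUDA_PYTHON_UPPER_BOUND` is invented for this example. The comparison handles only plain dotted numeric versions such as `1.2.3`; pre-release or local version suffixes would need a proper version parser.

```python
import os
from typing import Tuple


def version_tuple(version: str) -> Tuple[int, ...]:
    """Parse a plain dotted numeric version such as '1.2.3' into (1, 2, 3)."""
    return tuple(int(part) for part in version.split("."))


def is_below(installed: str, upper_bound: str) -> bool:
    """Return True if `installed` is strictly lower than `upper_bound`."""
    a, b = version_tuple(installed), version_tuple(upper_bound)
    # Pad the shorter tuple with zeros so '1.3' and '1.3.0' compare equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a < b


if __name__ == "__main__":
    from importlib import metadata

    # Hypothetical variable name; the real bound must come from the
    # upstream instructions, which this thread does not quote.
    bound = os.environ.get("CUDA_PYTHON_UPPER_BOUND")
    if bound is None:
        print("no upper bound configured; skipping check")
    else:
        try:
            installed = metadata.version("cuda-python")
        except metadata.PackageNotFoundError:
            raise SystemExit("cuda-python is not installed")
        if not is_below(installed, bound):
            raise SystemExit(f"cuda-python {installed} is not below {bound}")
        print(f"cuda-python {installed} satisfies < {bound}")
```

Running the guard before the test suite makes a drifting dependency fail fast with a clear message rather than as an `EOFException` from a crashed Python worker.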

@mattahrens removed the "? - Needs Triage" label Nov 5, 2024
@pxLi
Collaborator

pxLi commented Nov 6, 2024

> We need to update our test environment to pin the version of cuda-python.

The cudf-udf pipeline is designed to monitor nightly CUDF-py changes.
I recommend keeping it running against the latest nightly CUDF build unless we decide not to wait for the fix in this release. Thanks.

@pxLi assigned pxLi and unassigned GaryShen2008 Nov 6, 2024