Nightly cudf_udf test builds recently started failing with exceptions like the following:
INFO: Process 1469 found CUDA visible device(s): 0
Traceback (most recent call last):
File "/home/...../jars/rapids-4-spark_2.12-24.12.0-SNAPSHOT-cuda11.jar/rapids/daemon.py", line 131, in manag
er
File "/home/...../jars/rapids-4-spark_2.12-24.12.0-SNAPSHOT-cuda11.jar/rapids/worker.py", line 37, in initia
lize_gpu_mem
from cudf import rmm
File "/opt/conda/lib/python3.10/site-packages/cudf/__init__.py", line 19, in <module>
_setup_numba()
File "/opt/conda/lib/python3.10/site-packages/cudf/utils/_numba.py", line 121, in _setup_numba
shim_ptx_cuda_version = _get_cuda_build_version()
File "/opt/conda/lib/python3.10/site-packages/cudf/utils/_numba.py", line 16, in _get_cuda_build_version
from cudf._lib import strings_udf
File "/opt/conda/lib/python3.10/site-packages/cudf/_lib/__init__.py", line 4, in <module>
from . import (
File "avro.pyx", line 1, in init cudf._lib.avro
File "utils.pyx", line 1, in init cudf._lib.utils
File "column.pyx", line 1, in init cudf._lib.column
File "/opt/conda/lib/python3.10/site-packages/rmm/__init__.py", line 17, in <module>
from rmm import mr
File "/opt/conda/lib/python3.10/site-packages/rmm/mr.py", line 14, in <module>
from rmm.pylibrmm.memory_resource import (
File "/opt/conda/lib/python3.10/site-packages/rmm/pylibrmm/__init__.py", line 15, in <module>
from .device_buffer import DeviceBuffer
File "device_buffer.pyx", line 1, in init rmm.pylibrmm.device_buffer
AttributeError: module 'cuda.ccudart' has no attribute '__pyx_capi__'
INFO: Process 1503 found CUDA visible device(s): 0
24/11/05 14:10:10 ERROR Executor: Exception in task 2.0 in stage 1.0 (TID 8)
java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at org.apache.spark.api.python.PythonWorkerFactory.createSocket$1(PythonWorkerFactory.scala:121)
at org.apache.spark.api.python.PythonWorkerFactory.liftedTree1$1(PythonWorkerFactory.scala:137)
at org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:136)
at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:106)
at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:121)
at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:162)
at org.apache.spark.sql.rapids.execution.python.GpuArrowEvalPythonExec.$anonfun$internalDoExecuteColumnar$2(GpuArrowEvalPythonExec.scala:456)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2(RDD.scala:863)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2$adapted(RDD.scala:863)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
[.....]
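For reference, the failure is not Spark-specific: it surfaces as soon as rmm is imported in the worker's Python environment. A minimal diagnostic sketch (assumptions: run in the same conda environment the cudf_udf workers use, and cuda-python/rmm expose `__version__`) would be:

```python
# Diagnostic sketch (hypothetical): confirm the cuda-python / rmm mismatch
# outside of Spark, in the same environment the failing executor uses.
import cuda
print("cuda-python version:", cuda.__version__)

# Importing rmm exercises the same Cython init path (device_buffer.pyx) that
# fails in daemon.py; with an incompatible cuda-python it should raise
# AttributeError: module 'cuda.ccudart' has no attribute '__pyx_capi__'
import rmm
print("rmm imported OK:", rmm.__version__)
```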
We need to update our test environment to pin the version of cuda-python.
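As a sketch of what that pin might look like (assuming the test image installs its Python dependencies via pip; the version bound below is a placeholder, not a verified cutoff), the constraint could be added to the environment's requirements:

```
# Hypothetical pin for the cudf_udf test environment; the upper bound is a
# placeholder and should be set to the last cuda-python release known to work
# with the nightly rmm/cuDF packages.
cuda-python<12.6.1
```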
The cudf-udf pipeline is designed to monitor nightly CUDF-py changes.
I recommend keeping it running against the latest nightly CUDF build unless we decide not to wait for the fix in this release. Thanks!