I previously posted about an issue related to static default policies in the oneapi/dpl headers, which was fixed by defining the macro ONEDPL_USE_PREDEFINED_POLICIES to 0.
I have now hit the same issue in a slightly different way, which again produces the error:
UR CUDA ERROR:
Value: 3
Name: CUDA_ERROR_NOT_INITIALIZED
Description: initialization error
Function: setContext
Source Location: /tmp/tmp.HhqPmzG672/intel-llvm-mirror/build/_deps/unified-runtime-src/source/adapters/cuda/context.hpp:142
Native API failed. Native API returns: -5 (PI_ERROR_OUT_OF_RESOURCES)
UR CUDA ERROR:
Value: 3
Name: CUDA_ERROR_NOT_INITIALIZED
Description: initialization error
Function: setContext
Source Location: /tmp/tmp.HhqPmzG672/intel-llvm-mirror/build/_deps/unified-runtime-src/source/adapters/cuda/context.hpp:142
Native API failed. Native API returns: -5 (PI_ERROR_OUT_OF_RESOURCES)
Without the code in the second set of braces, the program no longer produces any runtime errors. Crucially, if no call to oneapi::dpl::reduce is made, the program also produces no runtime errors. This suggests to me that this function somehow leaks shared pointers to a sycl::queue, or something similar, which would prevent the release of the underlying CUDA context.
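To make the hypothesis above concrete, here is a minimal standard-C++ sketch of how a static cache holding a shared pointer can keep a resource alive after the scope that created it has ended. Everything in it (FakeContext, FakeQueue, cached_context, leaky_reduce, contexts_alive_after_scope) is a hypothetical stand-in invented for illustration; it does not use SYCL or oneDPL and makes no claim about how oneapi::dpl::reduce is actually implemented.

```cpp
#include <cassert>
#include <memory>

// Counter of currently alive contexts (function-local static, so no C++17 needed).
inline int& live_contexts() {
    static int n = 0;
    return n;
}

// Hypothetical stand-in for a backend context (not a real SYCL or CUDA type).
struct FakeContext {
    FakeContext() { ++live_contexts(); }
    ~FakeContext() { --live_contexts(); }
};

// Hypothetical stand-in for a queue that shares ownership of its context.
struct FakeQueue {
    std::shared_ptr<FakeContext> ctx = std::make_shared<FakeContext>();
};

// Hypothetical library-side cache with static storage duration.
// Anything stored here outlives every local scope in the program.
inline std::shared_ptr<FakeContext>& cached_context() {
    static std::shared_ptr<FakeContext> cache;
    return cache;
}

// An "algorithm" that quietly keeps a reference to the queue's context.
inline void leaky_reduce(const FakeQueue& q) { cached_context() = q.ctx; }

// Returns how many contexts are still alive after the queue's scope has ended.
inline int contexts_alive_after_scope(bool call_leaky) {
    cached_context().reset();  // start from an empty cache
    {
        FakeQueue q;
        assert(live_contexts() == 1);
        if (call_leaky) leaky_reduce(q);
    }  // q is destroyed here
    return live_contexts();
}
```

In this sketch the context is released at the closing brace only when leaky_reduce is never called; once the static cache holds a copy of the shared pointer, the context survives until the cache is reset or static destruction runs. If something similar happens in the real program, the context could be torn down in an order that the backend does not expect.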
UR CUDA ERROR:
Value: 3
Name: CUDA_ERROR_NOT_INITIALIZED
Description: initialization error
Function: setContext
Source Location: /tmp/tmp.IpEAV9Rdzp/intel-llvm-mirror/build/_deps/unified-runtime-src/source/adapters/cuda/context.hpp:142
Native API failed. Native API returns: -5 (PI_ERROR_OUT_OF_RESOURCES)
UR CUDA ERROR:
Value: 3
Name: CUDA_ERROR_NOT_INITIALIZED
Description: initialization error
Function: setContext
Source Location: /tmp/tmp.IpEAV9Rdzp/intel-llvm-mirror/build/_deps/unified-runtime-src/source/adapters/cuda/context.hpp:142
Native API failed. Native API returns: -5 (PI_ERROR_OUT_OF_RESOURCES)
This seems to be a general problem with the SYCL CUDA backend rather than an issue within oneDPL, so I would recommend filing an issue at https://github.com/intel/llvm.