CUDA Python is a standard set of low-level interfaces that provides full coverage of, and access to, the CUDA host APIs from Python. Check out the Overview for the workflow and performance results.
CUDA Python can be installed from:
- PyPI
- Conda (nvidia channel)
- Source builds
These options differ in ways that are described further in the Installation documentation. Each package guarantees minor version compatibility.
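Whichever route is used, it can help to confirm which build the interpreter actually sees. The following is a minimal sketch using only the standard library. It assumes the distribution is registered under the name `cuda-python` (the name used on PyPI); the helper `installed_version` is a hypothetical name chosen for illustration, not part of CUDA Python.

```python
from importlib import metadata


def installed_version(dist_name="cuda-python"):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


if __name__ == "__main__":
    version = installed_version()
    print(version if version else "cuda-python is not installed in this environment")
```

Because the lookup reads package metadata rather than importing the module, it should work even when the CUDA driver is not present on the machine.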
CUDA Python is supported on all platforms that CUDA supports. Specific dependencies are as follows:
- Driver: Linux (450.80.02 or later), Windows (456.38 or later)
- CUDA Toolkit 12.0 to 12.6
Only the NVRTC redistributable component of the CUDA Toolkit is required. The Installation Guides in the CUDA Toolkit Documentation can be used for guidance. Note that the NVRTC component can be obtained via PyPI, Conda, or the local installer.
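The minimum driver versions listed above can be checked programmatically. The sketch below compares a driver version string against those minimums. The names `MIN_DRIVER`, `parse_version`, and `driver_is_supported` are hypothetical helpers for illustration; obtaining the driver version string itself is left to the caller.

```python
import platform

# Minimum driver versions quoted in the dependency list above.
MIN_DRIVER = {"Linux": "450.80.02", "Windows": "456.38"}


def parse_version(text):
    """Turn a dotted version such as '450.80.02' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_is_supported(driver_version, system=None):
    """Return True if driver_version meets the listed minimum for the OS."""
    system = system or platform.system()
    minimum = MIN_DRIVER.get(system)
    if minimum is None:
        raise ValueError(f"no minimum driver version listed for {system!r}")
    # Tuples compare element by element, so (450, 80, 2) < (535, 104, 5).
    return parse_version(driver_version) >= parse_version(minimum)
```

On a machine with the NVIDIA driver installed, the version string is typically available from `nvidia-smi --query-gpu=driver_version --format=csv,noheader`.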
CUDA Python follows NEP 29 for its supported Python version guarantee.
Before support for a version is dropped, an issue will be raised to gather feedback.
Source builds work for multiple Python versions; however, pre-built PyPI and Conda packages are only provided for a subset:
- Python 3.9 to 3.12
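A quick way to tell whether the running interpreter falls in the pre-built range above, or whether a source build is likely needed, is sketched here. `PREBUILT_MIN`, `PREBUILT_MAX`, and `has_prebuilt_package` are hypothetical names for illustration, and the bounds are copied from the list above rather than queried from any package index.

```python
import sys

# Range of Python versions with pre-built packages, per the list above.
PREBUILT_MIN = (3, 9)
PREBUILT_MAX = (3, 12)


def has_prebuilt_package(version_info=None):
    """Return True if (major, minor) falls inside the pre-built package range."""
    major_minor = tuple((version_info or sys.version_info)[:2])
    return PREBUILT_MIN <= major_minor <= PREBUILT_MAX


if __name__ == "__main__":
    if has_prebuilt_package():
        print("A pre-built package should be available for this interpreter.")
    else:
        print("This interpreter is outside the pre-built range; consider a source build.")
```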
Latest dependencies can be found in requirements.txt.
You can run the included tests with:
python -m pytest tests/
You can run the benchmark-only tests with:
python -m pytest --benchmark-only benchmarks/
You can run the included examples with:
python -m pytest examples/
CUDA Samples rewritten using CUDA Python are found in examples.
Additional custom examples are also included:
- examples/extra/jit_program_test.py: Demonstrates the use of the API to compile and launch a kernel on the device. Includes device memory allocation / deallocation, transfers between host and device, creation and usage of streams, and context management.
- examples/extra/numba_emm_plugin.py: Implements a Numba External Memory Management plugin, showing that this CUDA Python Driver API can coexist with other wrappers of the driver API.
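To give a feel for the compile step that the first example covers, here is a minimal sketch of compiling a kernel to PTX with the NVRTC bindings. It is not the contents of `jit_program_test.py`. It assumes a CUDA Python 12.x install where `from cuda import nvrtc` is available, and it assumes the `compute_75` architecture purely as a placeholder. The kernel and the helpers `check_status` and `compile_to_ptx` are hypothetical names. Launching the kernel requires a device and the driver API, and is left out.

```python
# Hypothetical kernel used only for this sketch.
SAXPY_SOURCE = r"""
extern "C" __global__
void saxpy(float a, const float *x, const float *y, float *out, size_t n)
{
    size_t tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid < n) {
        out[tid] = a * x[tid] + y[tid];
    }
}
"""


def check_status(result):
    """Unpack a (status, *values) tuple as returned by the bindings.

    Raises if the status is nonzero (the success code is 0); otherwise
    returns the single value, or a tuple of the remaining values.
    """
    err, *values = result
    if int(err) != 0:
        raise RuntimeError(f"call failed with status {err!r}")
    return values[0] if len(values) == 1 else tuple(values)


def compile_to_ptx(source, name=b"kernel.cu", arch=b"compute_75"):
    """Compile CUDA C++ source text to PTX bytes using NVRTC."""
    # Imported lazily so this module loads even without CUDA Python installed.
    from cuda import nvrtc

    prog = check_status(nvrtc.nvrtcCreateProgram(source.encode(), name, 0, [], []))
    options = [b"--gpu-architecture=" + arch]
    check_status(nvrtc.nvrtcCompileProgram(prog, len(options), options))
    ptx_size = check_status(nvrtc.nvrtcGetPTXSize(prog))
    ptx = b" " * ptx_size  # buffer that NVRTC fills with the PTX text
    check_status(nvrtc.nvrtcGetPTX(prog, ptx))
    return ptx
```

Calling `compile_to_ptx(SAXPY_SOURCE, b"saxpy.cu")` should only need the NVRTC component, not a GPU. Each binding call returns its status as the first tuple element, which is why every call is routed through `check_status`.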