Releases: NVIDIA/k8s-device-plugin
v0.15.0-rc.1
What's Changed
- Import GPU Feature Discovery into the GPU Device Plugin repo. This means that the same version and container image is used for both components.
- Add tooling to create a kind cluster for local development and testing.
- Update `go-gpuallocator` dependency to migrate away from the deprecated `gpu-monitoring-tools` NVML bindings.
- Remove `legacyDaemonsetAPI` config option. This was only required for k8s versions < 1.16.
- Add support for MPS sharing.
- Bump CUDA base image version to 12.3.1
Full Changelog: v0.14.0...v0.15.0-rc.1
v0.14.4
What's Changed
- Update to refactored go-gpuallocator code. This permanently fixes the NVML_NVLINK_MAX_LINKS value addressed in a hotfix in v0.14.3. This also addresses a bug due to uninitialized NVML when calling go-gpuallocator.
Full Changelog: v0.14.3...v0.14.4
v0.14.3
Bug fixes
- Patched vendored `NVML_NVLINK_MAX_LINKS` to 18 to support devices with 18 NVLinks
Dependency updates
- Bumped CUDA base images version to 12.3.0
Full Changelog: v0.14.2...v0.14.3
v0.14.2
v0.14.1
This release fixes bugs and bumps dependencies.
Bug fixes
- Fixed parsing of `deviceListStrategy` in device plugin config (#410)
Dependency Updates
- Updated CUDA Base Image to 12.2.0
- Update GPU Feature Discovery version to `v0.8.1`
- Update Node Feature Discovery to `v0.13.2`
- Updated Go dependencies.
Full Changelog: v0.14.0...v0.14.1
v0.14.0
Full Changelog: v0.13.0...v0.14.0
Changes
- Promote `v0.14.0-rc.3` to `v0.14.0`
- Bumped `nvidia-container-toolkit` dependency to latest version for newer CDI spec generation code
- Updated GFD subchart to version v0.8.0
Changes from v0.14.0-rc.3
- Removed the `--cdi-enabled` config option and instead trigger CDI injection based on `cdi-annotation` strategy.
- Bumped `go-nvlib` dependency to latest version to support new MIG profiles.
- Added `cdi-annotation-prefix` config option to control how CDI annotations are generated.
- Renamed `driver-root-ctr-path` config option added in `v0.14.0-rc.1` to `container-driver-root`.
- Updated GFD subchart to version v0.8.0-rc.2
Changes from v0.14.0-rc.2
- Fix bug from v0.14.0-rc.1 when using cdi-enabled=false
Changes from v0.14.0-rc.1
- Added --cdi-enabled flag to GPU Device Plugin. With this enabled, the device plugin will generate CDI specifications for available NVIDIA devices. Allocation will add CDI annotations (`cdi.k8s.io/*`) to the response. These are read by a CDI-enabled runtime to make the required modifications to a container being created.
- Updated GFD subchart to version 0.8.0-rc.1
- Bumped Golang version to 1.20.1
- Bumped CUDA base images version to 12.1.0
- Switched to klog for logging
- Added a static deployment file for Microshift
Note:
The container image `nvcr.io/nvidia/k8s-device-plugin:v0.14.0-ubi8` contains the following high-severity CVEs:
- CVE-2023-0286 - Vulnerability found in os package type (rpm) - openssl-libs
- CVE-2023-24329 - Vulnerability found in os package type (rpm) - platform-python and python3-libs
v0.14.0-rc.3
Full Changelog: v0.14.0-rc.2...v0.14.0-rc.3
Changes
- Removed the `--cdi-enabled` config option and instead trigger CDI injection based on `cdi-annotation` strategy.
- Bumped `go-nvlib` dependency to latest version to support new MIG profiles.
- Added `cdi-annotation-prefix` config option to control how CDI annotations are generated.
- Renamed `driver-root-ctr-path` config option added in `v0.14.0-rc.1` to `container-driver-root`.
- Updated GFD subchart to version v0.8.0-rc.2
v0.14.0-rc.2
Full Changelog: v0.14.0-rc.1...v0.14.0-rc.2
Changes
- Fix bug from v0.14.0-rc.1 when using cdi-enabled=false
v0.14.0-rc.1
Full Changelog: v0.13.0...v0.14.0-rc.1
Changes
- Added --cdi-enabled flag to GPU Device Plugin. With this enabled, the device plugin will generate CDI specifications for available NVIDIA devices. Allocation will add CDI annotations (`cdi.k8s.io/*`) to the response. These are read by a CDI-enabled runtime to make the required modifications to a container being created.
- Updated GFD subchart to version 0.8.0-rc.1
- Bumped Golang version to 1.20.1
- Bumped CUDA base images version to 12.1.0
- Switched to klog for logging
- Added a static deployment file for Microshift
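The CDI annotation mechanism described in this release can be illustrated with a minimal Python sketch. It is not the plugin's code (the plugin is written in Go). Only the `cdi.k8s.io/` prefix comes from the notes above; the key layout (`<prefix><plugin>_<request id>`), the comma-separated value, the `nvidia.com/gpu=<id>` device-name format, and the function and parameter names are assumptions made for illustration.

```python
# Prefix taken from the release note; everything else here is an assumption.
CDI_ANNOTATION_PREFIX = "cdi.k8s.io/"


def build_cdi_annotations(plugin_name, request_id, device_ids,
                          prefix=CDI_ANNOTATION_PREFIX,
                          device_kind="nvidia.com/gpu"):
    """Return a dict of annotations for one allocation (hypothetical helper).

    The key is assumed to be "<prefix><plugin_name>_<request_id>" and the
    value a comma-separated list of "<device_kind>=<device_id>" names.
    """
    if not device_ids:
        # Nothing was allocated, so there is nothing to annotate.
        return {}
    key = f"{prefix}{plugin_name}_{request_id}"
    value = ",".join(f"{device_kind}={device_id}" for device_id in device_ids)
    return {key: value}


annotations = build_cdi_annotations("nvidia-device-plugin", "req-1", ["0", "1"])
print(annotations)
```

A CDI-enabled runtime would read annotations of this shape and resolve each named device against the generated CDI specifications. The `prefix` parameter is left configurable in the sketch because the prefix need not be fixed.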
v0.13.0
Full Changelog: v0.12.2...v0.13.0
Changes
- Skip `NVIDIA DGX Display` devices when generating labels.
- Fail on startup if no valid resources are detected
- Bump GFD subchart to version 0.7.0
Changes from v0.13.0-rc.3
- Use `nodeAffinity` instead of `nodeSelector` by default in daemonsets
- Add `machine-file-path` option to GFD config flags
- Mount `/sys` instead of `/sys/class/dmi/id/product_name` in GPU Feature Discovery daemonset
- Bump GFD subchart to version 0.7.0-rc.3
Changes from v0.13.0-rc.2
- Bump cuda base image to 11.8.0
- Use consistent indentation in YAML manifests
- Fix bug from v0.13.0-rc.1 when using mig-strategy="mixed"
- Add logged error message if setting up health checks fails
- Support MIG devices with 1g.10gb+me profile
- Distribute replicas evenly across GPUs during allocation
- Bump GFD subchart to version 0.7.0-rc.2
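One way to picture "distribute replicas evenly across GPUs during allocation" is a greedy policy that always takes the next replica from the GPU with the most free replicas. The Python sketch below illustrates that idea only; it is not the algorithm the plugin implements, and the function name, the replica ID strings, and the tie-breaking rule (lowest GPU ID first) are hypothetical.

```python
def allocate_evenly(free_replicas, count):
    """Pick `count` replicas, spreading them across GPUs (illustrative only).

    free_replicas maps a GPU ID to the list of its free replica IDs.
    Each step takes one replica from the GPU that currently has the most
    free replicas; ties go to the lowest GPU ID.
    """
    # Copy the lists so the caller's data is not modified.
    pools = {gpu: list(replicas) for gpu, replicas in free_replicas.items()}
    total_free = sum(len(replicas) for replicas in pools.values())
    if count > total_free:
        raise ValueError(f"requested {count} replicas, only {total_free} free")

    chosen = []
    for _ in range(count):
        # sorted() fixes the iteration order; max() keeps the first maximum.
        gpu = max(sorted(pools), key=lambda g: len(pools[g]))
        chosen.append(pools[gpu].pop(0))
    return chosen


free = {
    "gpu0": ["gpu0-r0", "gpu0-r1"],
    "gpu1": ["gpu1-r0", "gpu1-r1"],
}
print(allocate_evenly(free, 2))  # ['gpu0-r0', 'gpu1-r0']
```

With two GPUs that each have two free replicas, a request for two replicas lands on both GPUs rather than filling one GPU first.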
Changes from v0.13.0-rc.1
- Improve health checks to detect errors when waiting on device events
- Log ECC error events detected during health check
- Add the Git SHA to version information for the CLI and container images
- Use NVML interfaces from go-nvlib to query devices
- Refactor plugin creation from resources
- Add a CUDA-based resource manager that can be used to expose integrated devices on Tegra-based systems
- Bump GFD subchart to version 0.7.0-rc.1
Note:
The container image `nvcr.io/nvidia/k8s-device-plugin:v0.13.0-ubi8` contains the following high-severity CVEs:
- CVE-2022-42898 - Vulnerability found in os package type (rpm) - krb5-libs