Update dep and rpm packaging
Signed-off-by: Evan Lezar <[email protected]>
elezar committed Apr 18, 2024
1 parent 99631f4 commit 5a4aebc
Showing 8 changed files with 377 additions and 163 deletions.
107 changes: 107 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,107 @@
# NVIDIA MIG Manager Changelog

## v0.6.0
- Update to latest CUDA base image 12.3.2
- Migrate to using github.com/NVIDIA/go-nvlib
- Bump Golang version to 1.20.5
- Bump nvidia-ctk version used by k8s-mig-manager to 1.14.6
- Update vendored go dependencies
- Minor code improvements and refactoring

## v0.5.5
- Update to latest CUDA base image 12.2.2

## v0.5.4
- Update MIG config for Hopper with device ID of H100 80GB HBM3 SKU

## v0.5.3
- Update to latest CUDA image 12.2.0
- Update example config for Hopper with H100 NVL and H800 NVL

## v0.5.2
- Update to latest CUDA image 12.1.0
- Update k8s-mig-manager to support CDI
- Add two new example configs for the newly supported profiles on A100
- Update MIG profile code to rely on go-nvlib
- Update vendored go-nvlib to latest
- Update NVML wrapper to include MIG profiles from NVML v12.0

## v0.5.1
- Update to latest CUDA image 12.0.1
- Add newer MIG profiles supported with NVML 12.0 to default config.yaml files
- Add profiles with media extensions for A30-24GB to default config.yaml files
- Add H100 and H800 profiles to default config.yaml files
- Add A800 profiles to default config.yaml files
- Update all calls to enumerate GPUs to use NVML or PCI as appropriate
- Bump vendored go-nvml to v12.0
- Bump Golang version to 1.20.1

## v0.5.0
- Bump CUDA base image to 11.7.1
- Remove CUDA compat libs from mig-manager in favor of libs installed by the Driver
- Use symlink for config.yaml instead of static config file
- Add k8s-mig-manager-example for Hopper
- Update k8s-mig-manager-example with standalone RBAC objects
- Explicitly delete pods launched by operator validator before reconfig
- Allow missing GPUClients file in k8s-mig-manager
- Add hooks-minimal.yaml that gets linked if on Hopper or above
- Use symlink for hooks.yaml instead of static config file
- Update install script to use go 1.16.4
- Update hooks.sh to split out start/stop of k8s services from k8s pods
- Explicitly clear all MIG configurations before disabling MIG mode

## v0.4.3
- Update calculation for GB in MIG profile name
- Make the systemd-mig-manager a dependency of systemd-resolved.service

## v0.4.2
- Update CUDA image to 11.7.0
- Add extra assert in k8s-mig-manager to double check mig-mode change applied
- Update mig-manager image to use NGC DL license

## v0.4.1
- Keep NVML alive across all mig-parted commands (except GPU reset)
- Remove unnecessary services from hooks.sh

## v0.4.0
- Update nvidia-mig-parted.sh to include MIG_PARTED_CHECKPOINT_FILE
- Add checkpoint / restore commands to mig-parted CLI
- Update golang version to 1.16.4
- Support instantiation of *_PROFILE_6_SLICE GIs and CIs
- Update cyrus-sasl-lib to address CVE-2022-24407
- Add support for MIG profiles with +me as an attribute extension
- Support Compute Instances in mig-parted config such that CI != GI
- Update go-nvml to v0.11.6
- Change semantics of 'all' to mean 'all-mig-capable' in mig-parted config

## v0.3.0
- k8s-mig-manager: Add support for multi-arch images
- k8s-mig-manager: Handle eviction of NVSM pod when applying MIG changes

## v0.2.0
- nvidia-mig-parted: Support passing newer GI and CI profile enums on older drivers
- k8s-mig-manager: Rename nvcr.io/nvidia to nvcr.io/nvidia/cloud-native
- k8s-mig-manager: Add support for pre-installed drivers
- systemd-mig-manager: Update logic to remove 'containerd' containers in utils.sh
- systemd-mig-manager: Update logic to shutdown only active systemd services in list
- ci-infrastructure: Rework build and CI to align with other projects
- ci-infrastructure: Use pulse instead of contamer for scans

## v0.1.3
- Add default configs for the PG506-96GB card
- Remove CombinedMigManager and add wrappers for Mode/Config Managers
- Add a function to check the minimum NVML version required
- Add SystemGetNVMLVersion() to the NVML interface
- Fix small bug in assert logic for non MIG-capable GPUs

## v0.1.2
- Do not start nvidia-mig-manager.service when installing the .deb
- Restore lost assert_gpu_reset_available() function
- Add nvidia-dcgm.service to driver_services array
- Split dcgm, and dcgm-exporter in k8s-mig-manager

## v0.1.1
- Update packaged config.yaml to include more supported devices

## v0.1.0
- Initial release of rpm package for v0.1.0
@@ -15,29 +15,63 @@
# build go binary
ARG BASE_IMAGE=undefined
ARG GOLANG_VERSION=undefined
FROM golang:${GOLANG_VERSION} AS go-build
FROM ${BASE_IMAGE} as go-build

ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y --no-install-recommends \
wget \
ca-certificates \
git \
build-essential \
dh-make \
fakeroot \
devscripts \
lsb-release && \
rm -rf /var/lib/apt/lists/*

ARG GOLANG_VERSION=0.0.0
RUN set -eux; \
\
arch="$(uname -m)"; \
case "${arch##*-}" in \
x86_64 | amd64) ARCH='amd64' ;; \
ppc64el | ppc64le) ARCH='ppc64le' ;; \
aarch64 | arm64) ARCH='arm64' ;; \
*) echo "unsupported architecture" ; exit 1 ;; \
esac; \
wget -nv -O - https://storage.googleapis.com/golang/go${GOLANG_VERSION}.linux-${ARCH}.tar.gz \
| tar -C /usr/local -xz
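
The architecture mapping in the step above can be sketched as a standalone shell snippet. This is only an illustration, assuming the same three supported architectures; the function names `go_arch` and `go_tarball_url` are hypothetical and do not appear in the commit.

```shell
#!/bin/sh
# Hypothetical helper: map a machine name (as printed by `uname -m`)
# to the architecture suffix used in Go release archive names.
go_arch() {
    case "$1" in
        x86_64 | amd64)    echo amd64 ;;
        ppc64el | ppc64le) echo ppc64le ;;
        aarch64 | arm64)   echo arm64 ;;
        *) echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: build the download URL for a Go version ($1)
# and a machine name ($2), using the URL pattern from the Dockerfile.
go_tarball_url() {
    arch="$(go_arch "$2")" || return 1
    echo "https://storage.googleapis.com/golang/go$1.linux-${arch}.tar.gz"
}
```

Calling `go_tarball_url 1.20.5 "$(uname -m)"` prints the URL that the `wget` step would fetch for that Go version on the current machine, and the function returns a non-zero status for any architecture outside the list.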

ENV GOPATH /go
ENV PATH $GOPATH/bin:/usr/local/go/bin:$PATH

WORKDIR /build
COPY . .
RUN go build -o /artifacts/nvidia-mig-parted ./cmd/nvidia-mig-parted

RUN mkdir /artifacts
ARG VERSION="N/A"
ARG GIT_COMMIT="unknown"
RUN make PREFIX=/artifacts cmds

# build package
FROM ${BASE_IMAGE}
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y devscripts debhelper

# envs for packaging
ENV DEBFULLNAME "NVIDIA CORPORATION"
ENV DEBEMAIL "[email protected]"
ARG PACKAGE_NAME=undefined
ARG PACKAGE_VERSION=undefined
ARG PACKAGE_REVISION=undefined
ENV PACKAGE_NAME ${PACKAGE_NAME}
ENV PACKAGE_VERSION ${PACKAGE_VERSION}
ENV PACKAGE_REVISION ${PACKAGE_REVISION}
ENV PACKAGE_VERSION_STRING "${PACKAGE_VERSION}-${PACKAGE_REVISION}"
ENV SECTION ""

# working directory
ENV PWD=/tmp/${PACKAGE_NAME}-${PACKAGE_VERSION}
WORKDIR ${PWD}
WORKDIR /tmp/${PACKAGE_NAME}-${PACKAGE_VERSION_STRING}

# sources
COPY ./LICENSE .
@@ -49,8 +83,11 @@ COPY ./deployments/systemd/packages/debian/Makefile .
# output directory
RUN mkdir -p /dist

# Check that the latest changelog entry matches the current version info
RUN if [ "${PACKAGE_VERSION}-${PACKAGE_REVISION}" != "$(dpkg-parsechangelog --show-field=Version)" ]; then exit 1; fi
RUN dch --create --package="${PACKAGE_NAME}" \
--newversion "${PACKAGE_VERSION_STRING##v}" \
"See https://github.com/NVIDIA/mig-parted/-/blob/${GIT_COMMIT}/CHANGELOG.md for the changelog" && \
dch -r "" && \
if [ "${PACKAGE_VERSION_STRING##v}" != "$(dpkg-parsechangelog --show-field=Version)" ]; then exit 1; fi
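
The version check above relies on the shell expansion `${VAR##v}`, which removes a leading `v` so that a tag-style version such as `v0.6.0-1` is compared in the form Debian changelogs use. A minimal sketch of that comparison follows, assuming the changelog version is passed in as a plain string rather than read with `dpkg-parsechangelog`; the function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: strip one leading "v" from a version string.
deb_version() {
    echo "${1##v}"
}

# Hypothetical helper: succeed only if the package version string ($1),
# with its leading "v" removed, equals the changelog version ($2).
check_version() {
    expected="$(deb_version "$1")"
    actual="$2"
    if [ "$expected" != "$actual" ]; then
        echo "version mismatch: expected $expected, got $actual" >&2
        return 1
    fi
}
```

For example, `check_version v0.6.0-1 0.6.0-1` succeeds, while `check_version v0.6.0-1 0.5.5-1` reports a mismatch and returns a non-zero status, which is the condition under which the `RUN` step exits with 1.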

# build command
CMD export DISTRIB=$(lsb_release -c -s) && \
@@ -15,11 +15,40 @@
# build go binary
ARG BASE_IMAGE=undefined
ARG GOLANG_VERSION=undefined
FROM golang:${GOLANG_VERSION} AS go-build
FROM ${BASE_IMAGE} as go-build

RUN yum install -y \
ca-certificates \
gcc \
wget \
git \
make \
rpm-build && \
rm -rf /var/cache/yum/*

ARG GOLANG_VERSION=0.0.0
RUN set -eux; \
\
arch="$(uname -m)"; \
case "${arch##*-}" in \
x86_64 | amd64) ARCH='amd64' ;; \
ppc64el | ppc64le) ARCH='ppc64le' ;; \
aarch64 | arm64) ARCH='arm64' ;; \
*) echo "unsupported architecture"; exit 1 ;; \
esac; \
wget -nv -O - https://storage.googleapis.com/golang/go${GOLANG_VERSION}.linux-${ARCH}.tar.gz \
| tar -C /usr/local -xz

ENV GOPATH /go
ENV PATH $GOPATH/bin:/usr/local/go/bin:$PATH

WORKDIR /build
COPY . .
RUN go build -o /artifacts/nvidia-mig-parted ./cmd/nvidia-mig-parted

RUN mkdir /artifacts
ARG VERSION="N/A"
ARG GIT_COMMIT="unknown"
RUN make PREFIX=/artifacts cmds

# build package
FROM ${BASE_IMAGE}
@@ -32,10 +61,10 @@ ARG PACKAGE_REVISION=undefined
ENV PACKAGE_NAME ${PACKAGE_NAME}
ENV PACKAGE_VERSION ${PACKAGE_VERSION}
ENV PACKAGE_REVISION ${PACKAGE_REVISION}
ENV PACKAGE_VERSION_STRING "${PACKAGE_VERSION}-${PACKAGE_REVISION}"

# working directory
ENV PWD=/tmp/${PACKAGE_NAME}-${PACKAGE_VERSION}
WORKDIR ${PWD}
WORKDIR /tmp/${PACKAGE_NAME}-${PACKAGE_VERSION_STRING}

# specs
RUN mkdir -p ./SPECS
@@ -59,5 +88,7 @@ CMD arch=$(uname -m) && \
-D "_topdir ${PWD}" \
-D "version ${PACKAGE_VERSION}" \
-D "revision ${PACKAGE_REVISION}" \
-D "git_commit ${GIT_COMMIT}" \
-D "release_date $(date +'%a %b %d %Y')" \
SPECS/${PACKAGE_NAME}.spec && \
mv RPMS/$arch/*.rpm /dist
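
The new `release_date` define passes a date formatted as `%a %b %d %Y`, the weekday, month, day, and year layout used in RPM `%changelog` entries. A small sketch of producing that string is shown below. Pinning `LC_ALL=C` is an assumption added here so that day and month names stay in English regardless of the build host's locale; the commit itself calls `date` without it, and the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print today's date in the layout passed to
# rpmbuild as release_date, for example "Thu Apr 18 2024".
# LC_ALL=C is an assumption of this sketch, not part of the commit.
rpm_changelog_date() {
    LC_ALL=C date +'%a %b %d %Y'
}

rpm_changelog_date
```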
63 changes: 53 additions & 10 deletions deployments/systemd/packages/Dockerfile.tarball
@@ -13,28 +13,71 @@
# limitations under the License.

# build go binary
ARG BASE_IMAGE=undefined
ARG GOLANG_VERSION=undefined
FROM golang:${GOLANG_VERSION} AS go-build
FROM ${BASE_IMAGE} as go-build

ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y --no-install-recommends \
wget \
ca-certificates \
git \
build-essential \
dh-make \
fakeroot \
devscripts \
lsb-release && \
rm -rf /var/lib/apt/lists/*

ARG GOLANG_VERSION=0.0.0
RUN set -eux; \
\
arch="$(uname -m)"; \
case "${arch##*-}" in \
x86_64 | amd64) ARCH='amd64' ;; \
ppc64el | ppc64le) ARCH='ppc64le' ;; \
aarch64 | arm64) ARCH='arm64' ;; \
*) echo "unsupported architecture" ; exit 1 ;; \
esac; \
wget -nv -O - https://storage.googleapis.com/golang/go${GOLANG_VERSION}.linux-${ARCH}.tar.gz \
| tar -C /usr/local -xz

ENV GOPATH /go
ENV PATH $GOPATH/bin:/usr/local/go/bin:$PATH

WORKDIR /build
COPY . .

RUN mkdir /artifacts
ARG VERSION="N/A"
ARG GIT_COMMIT="unknown"
RUN make PREFIX=/artifacts cmds

# build package
FROM ${BASE_IMAGE}
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y devscripts debhelper

# envs for packaging
ENV DEBFULLNAME "NVIDIA CORPORATION"
ENV DEBEMAIL "[email protected]"
ARG PACKAGE_NAME=undefined
ARG PACKAGE_VERSION=undefined
ARG PACKAGE_REVISION=undefined
ENV PACKAGE_NAME ${PACKAGE_NAME}
ENV PACKAGE_VERSION ${PACKAGE_VERSION}
ENV PACKAGE_REVISION ${PACKAGE_REVISION}
ENV PACKAGE_VERSION_STRING "${PACKAGE_VERSION}-${PACKAGE_REVISION}"
ENV SECTION ""

# destination to put tarball files
ENV DESTDIR=/${PACKAGE_NAME}-${PACKAGE_VERSION}-${PACKAGE_REVISION}

# working directory
WORKDIR /build
COPY . .
# destination to put tarball files
WORKDIR /${PACKAGE_NAME}-${PACKAGE_VERSION_STRING}
ENV DESTDIR=/${PACKAGE_NAME}-${PACKAGE_VERSION_STRING}

# collect tarball files
RUN mkdir -p ${DESTDIR}
RUN go build -o ${DESTDIR}/nvidia-mig-parted ./cmd/nvidia-mig-parted
COPY ./LICENSE ${DESTDIR}
COPY ./LICENSE .
COPY --from=go-build /artifacts/nvidia-mig-parted .
COPY ./deployments/systemd/packages/tarball/install.sh ${DESTDIR}
COPY ./deployments/systemd/config-default.yaml ${DESTDIR}
COPY ./deployments/systemd/hooks.sh ${DESTDIR}
@@ -47,7 +90,7 @@ COPY ./deployments/systemd/service.sh ${DESTDIR}
COPY ./deployments/systemd/uninstall.sh ${DESTDIR}
COPY ./deployments/systemd/utils.sh ${DESTDIR}

# output directory for final tarball
# output directory
RUN mkdir -p /dist

# build command