GitHub - mindprince/nvidia_gpu_prometheus_exporter: NVIDIA GPU Prometheus Exporter

NVIDIA GPU Prometheus Exporter

This is a Prometheus Exporter for exporting NVIDIA GPU metrics. It uses the Go bindings for NVIDIA Management Library (NVML) which is a C-based API that can be used for monitoring NVIDIA GPU devices. Unlike some other similar exporters, it does not call the nvidia-smi binary.

Building

The repository includes nvml.h, so the build environment needs no special setup. Running go get should be enough to build the exporter binary.

go get github.com/mindprince/nvidia_gpu_prometheus_exporter

Running

The exporter requires the following:

access to the NVML library (libnvidia-ml.so.1).
access to the GPU devices.

To make sure that the exporter can access the NVML libraries, either add them to the search path for shared libraries or set LD_LIBRARY_PATH to point to their location.
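As a minimal sketch of the LD_LIBRARY_PATH option, the snippet below looks for libnvidia-ml.so.1 in a list of candidate directories and shows, in comments, how the result could be handed to the exporter. The helper name find_nvml_dir and the candidate directories are assumptions made for illustration and are not part of the exporter; the commented launch line also assumes the binary built by go get is on your PATH.

```shell
# find_nvml_dir DIR...: print the first directory that contains
# libnvidia-ml.so.1 and return 0; return 1 if none of them does.
# (Hypothetical helper, not shipped with the exporter.)
find_nvml_dir() {
  for dir in "$@"; do
    if [ -e "$dir/libnvidia-ml.so.1" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
  done
  return 1
}

# Example use. The candidate directories are guesses; adjust them to
# wherever your driver installation places the NVML library.
#   nvml_dir="$(find_nvml_dir /usr/lib/x86_64-linux-gnu /usr/lib64)"
#   LD_LIBRARY_PATH="$nvml_dir" nvidia_gpu_prometheus_exporter
```

If the helper prints nothing and returns a non-zero status, none of the listed directories holds the library and the exporter would most likely fail to load NVML.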

By default, the metrics are exposed on port 9445. The address can be changed using the -web.listen-address flag.
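The following sketch shows one way the flag might be used. It only prints the command line rather than starting the exporter, so it is safe to run anywhere. The port 9500 is an arbitrary example, the helper name exporter_command is hypothetical, and both the -flag=value spelling and the /metrics path in the comments are assumptions based on common Go and Prometheus conventions rather than on anything stated here.

```shell
# Address to listen on. ":9500" is an arbitrary example; the default port is 9445.
listen_address=":9500"

# exporter_command ADDR: print the command line that would start the
# exporter listening on ADDR. (Hypothetical helper for illustration.)
exporter_command() {
  printf 'nvidia_gpu_prometheus_exporter -web.listen-address=%s\n' "$1"
}

exporter_command "$listen_address"

# To start the exporter for real and look at its output, something like
# the following should work (the /metrics path is an assumption):
#   nvidia_gpu_prometheus_exporter -web.listen-address=":9500" &
#   curl -s http://localhost:9500/metrics
```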

Running inside a container

There's a Docker image available on Docker Hub at mindprince/nvidia_gpu_prometheus_exporter.

If you are running the exporter inside a container, you will need to pass the following options to give the container access to the NVML library:

-e LD_LIBRARY_PATH=<path-where-nvml-is-present>
--volume <above-path>:<above-path>

And you will need to do one of the following to give it access to the GPU devices:

Run with --privileged
If you are on Docker v17.04.0-ce or above, run with --device-cgroup-rule 'c 195:* mrw'
Run with --device /dev/nvidiactl:/dev/nvidiactl --device /dev/nvidia0:/dev/nvidia0 --device /dev/nvidia1:/dev/nvidia1 <and-so-on-for-all-nvidia-devices> (Docker expects a separate --device flag for each device)
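Putting the library options and the per-device option together, a full invocation might look like the sketch below. It prints the docker run command instead of executing it so that it can be reviewed first. The helper name device_flags and the library directory /usr/lib/nvidia are assumptions for illustration; the port mapping and image tag are the ones used in the nvidia-docker example in this README.

```shell
# Directory on the host that holds libnvidia-ml.so.1.
# This path is an assumption; adjust it for your driver installation.
nvml_dir="/usr/lib/nvidia"

# device_flags DIR: print one "--device PATH:PATH" line for every entry in
# DIR whose name starts with "nvidia". (Hypothetical helper.) Note that this
# matches everything with that prefix, not only nvidiactl and nvidiaN.
device_flags() {
  for dev in "$1"/nvidia*; do
    [ -e "$dev" ] || continue
    printf '%s %s:%s\n' --device "$dev" "$dev"
  done
}

# Print the command rather than running it. $(device_flags /dev) is left
# unquoted on purpose so that each flag and path becomes a separate word.
echo docker run -p 9445:9445 \
  -e LD_LIBRARY_PATH="$nvml_dir" \
  --volume "$nvml_dir:$nvml_dir" \
  $(device_flags /dev) \
  mindprince/nvidia_gpu_prometheus_exporter:0.1
```

On a host without NVIDIA device nodes the printed command simply contains no --device flags, which is a quick hint that the driver is not loaded.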

If you don't want to do the above, you can run it using nvidia-docker.

Running using nvidia-docker

nvidia-docker run -p 9445:9445 -ti mindprince/nvidia_gpu_prometheus_exporter:0.1
