Commit

Implement product feedback to readme (#15261)
* remove mentions of nvidia devs

* Describe how to add extra metrics
iliakur authored Jul 13, 2023
1 parent 07917c8 commit 7e86b86
Showing 1 changed file with 4 additions and 2 deletions.
6 changes: 4 additions & 2 deletions dcgm/README.md
@@ -29,7 +29,7 @@ DCGM_FI_PROF_PCIE_TX_BYTES, counter, Total number of bytes transmit
DCGM_FI_PROF_PCIE_RX_BYTES, counter, Total number of bytes received through PCIe RX (in KB) via NVML.
```

NVIDIA devs also recommend enabling the following default counters and labels:
It is also recommended to enable the following default counters and labels:
- `DCGM_FI_DEV_MEMORY_TEMP`
- `DCGM_FI_DEV_GPU_TEMP`
- `DCGM_FI_DEV_POWER_USAGE`
@@ -43,7 +43,7 @@ NVIDIA devs also recommend enabling the following default counters and labels:
- `DCGM_FI_DEV_BRAND`
- `DCGM_FI_DEV_SERIAL`

They also recommend adding the following non-default fields and labels:
The following non-default fields and labels are also recommended:
```
DCGM_FI_DEV_SLOWDOWN_TEMP, gauge, Slowdown temperature for the device.
DCGM_FI_DEV_POWER_MGMT_LIMIT, gauge, Current power limit for the device.
@@ -107,6 +107,8 @@ instances:
- openmetrics_endpoint: http://localhost:9400/metrics
```

Use the `extra_metrics` configuration field to add metrics that go beyond the ones [we support out of the box][6]. See [here][10] for the full list of metrics that dcgm-exporter can collect. Make sure to [enable these fields in the dcgm-exporter configuration][9] as well.
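
As an illustration of the `extra_metrics` field described above, a minimal sketch of an instance configuration might look like the following. This is a sketch rather than text from the README: it assumes `extra_metrics` accepts a list of metric names, as in other OpenMetrics-based checks, and `DCGM_FI_DEV_FAN_SPEED` is used only as an example field name.

```yaml
instances:
  - openmetrics_endpoint: http://localhost:9400/metrics
    # Assumption: DCGM_FI_DEV_FAN_SPEED is only an illustrative field name;
    # substitute any field that your dcgm-exporter is configured to expose.
    extra_metrics:
      - DCGM_FI_DEV_FAN_SPEED
```

A field listed here is collected only if dcgm-exporter actually exposes it, so the same field would also need a `name, type, description` line in the dcgm-exporter configuration, in the same format as the field lists shown earlier.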

<!-- xxx tab xxx -->
<!-- xxx tab "Docker" xxx -->
