`nvmlDeviceGetMemoryInfo_v2` has been available for a few years now. I propose updating the implementation to v2, so that the results match what `nvidia-smi` and `gpustat` report.
Currently, `device.memory_info()` returns an `nvmlMemory_t` struct with `total`, `free`, and `used`. `nvmlMemory_v2_t` is defined in the existing code but never used. Calling the NVML function with a struct set up for v2 gives a slightly different result: the `used` field no longer includes cached and other non-allocated memory. Additionally, the v2 struct has fields for a struct `version` and for `reserved` memory.
I implemented this change and did some testing, and it works. Importantly, `device.memory_info()?.used` reported 912 MB on our A100s before the change and now shows 7.8 MB with the v2 version, which matches `nvidia-smi` and `gpustat`.