`nvmlDeviceGetMemoryInfo_v2` has been available for a few years now. I propose updating the implementation to v2, so that the results match what `nvidia-smi` and `gpustat` report.
Currently, `device.memory_info()` returns an `nvmlMemory_t` struct with `total`, `free`, and `used`. `nvmlMemory_v2_t` is defined in the existing code but never used. Calling the NVML function with a struct set up for v2 gives a slightly different result: the `used` field no longer includes cached and other non-allocated memory. Additionally, the v2 struct has fields for a struct `version` and for `reserved` memory.
I implemented this change and did some testing, and it works. Importantly, `device.memory_info()?.used` reported 912 MB on our A100s before the change and now shows 7.8 MB with the v2 version, which matches `nvidia-smi` and `gpustat`.