
pcm-memory cannot reflect memory bandwidth accurately when using RDMA (PCIe to memory) #923

Open
jianpanlanyue opened this issue Mar 31, 2025 · 3 comments

Comments

@jianpanlanyue

jianpanlanyue commented Mar 31, 2025

When running ib_write_bw (a standard IB bandwidth testing tool that involves no memory copy) and watching the corresponding memory bandwidth with pcm-memory (latest build, 202502; a minimal monitoring command is sketched after the steps below), I see the following:
nodeA ip: 100.7.36.221, with IB port mlx5_0 (400Gb) on numa0; CPU is Intel Xeon 6740E (Birch Stream, Sierra Forest)
nodeB ip: 100.7.36.223, with IB port mlx5_0 (400Gb) on numa0; CPU is Intel Xeon 6740E (Birch Stream, Sierra Forest)

  1. Let nodeA run 'ib_write_bw' to nodeB:
    nodeB(server): numactl -N 0 -m 0 ib_write_bw -d mlx5_0 -t 128 --run_infinitely
    nodeA(client): numactl -N 0 -m 0 ib_write_bw -d mlx5_0 -t 128 --run_infinitely 100.7.36.223
    Now ib_write_bw shows the IB bandwidth is 44 GB/s, and pcm-memory shows the memory bandwidth as 47 GB/s, which looks right.
  2. Let nodeB run 'ib_write_bw' to nodeA:
    nodeA(server): numactl -N 0 -m 0 ib_write_bw -d mlx5_0 -t 128 --run_infinitely
    nodeB(client): numactl -N 0 -m 0 ib_write_bw -d mlx5_0 -t 128 --run_infinitely 100.7.36.221
    Now ib_write_bw shows the IB bandwidth is also 44 GB/s, but pcm-memory shows the memory bandwidth as less than 1 GB/s, only a few hundred megabytes per second.
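For reference, a minimal way to watch the memory bandwidth while the test runs (a sketch only; it assumes pcm-memory from the 202502 build is on PATH and is run with root privileges) is:

    # on the node receiving the RDMA writes, print per-channel and per-socket
    # memory bandwidth once per second
    sudo pcm-memory 1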

[screenshots attached]

It looks like pcm-memory does not account for the memory bandwidth in scenario 2.
Why? As I understand it, pcm-memory should account for any memory bandwidth, whether it comes from the CPU or from PCIe to memory.

@rdementi
Contributor

rdementi commented Apr 1, 2025

Hi, I think it is possible that in one of the cases Intel DDIO is being used (accesses go through the last level cache): https://www.intel.com/content/www/us/en/developer/articles/technical/ddio-analysis-performance-monitoring.html

You can check this with "pcm-pcie -e" instead of pcm-memory.
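For example (a sketch; assuming pcm-pcie is on PATH and run as root, with the trailing number being the sampling interval in seconds):

    # sample PCIe-originated cache/memory events with the extended event set, once per second
    sudo pcm-pcie -e 1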

@jianpanlanyue
Author

jianpanlanyue commented Apr 3, 2025

Yes, just as you thought, it is caused by DDIO. 'pcm-pcie -e' shows 99% hits for 'ItoM' in case 2. Thanks.
By the way, I noticed that 'pcm-pcie -h' says the write events include PCIeWiLf, PCIeItoM, PCIeNSWr, PCIeNSWrF, ItoM, and RFO, but when I run 'pcm-pcie -e' there are only columns for 'PCIRdCur', 'ItoM', 'ItoMCacheNear', 'UCRdF', 'WiL', 'WCiL', and 'WCiLF', which does not correspond to what 'pcm-pcie -h' shows. Why?
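(For reference, I believe pcm-pcie can also translate these event counts into an approximate byte rate with its -B option, which makes the comparison with the ib_write_bw number easier; a sketch, again assuming root privileges:)

    # -B estimates bandwidth in bytes/sec from the event counts (each event transfers one 64-byte cache line)
    sudo pcm-pcie -e -B 1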

@rdementi
Contributor

rdementi commented Apr 4, 2025

Thanks for checking. 'pcm-pcie -h' shows events not only for your CPU but also for other CPUs supported by PCM.
