Measure the performance of individual cores with OpenMP.

You can use this program to measure the performance of the individual CPU cores of a system. It works by running a pleasingly parallel workload on each core and measuring the time it takes to complete. The workload is a simple loop that finds the sum of all the elements of an array (1 billion 64-bit floating-point numbers by default). The program is written in C++ and uses OpenMP to parallelize the workload. To prevent the operating system from moving threads between cores, the program uses the OpenMP affinity API to pin each thread to a specific core.
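The idea above can be sketched roughly as follows. This is a minimal illustration, not the code in main.cxx: the names `sumValues`, `ThreadResult`, and `measureThreads` are hypothetical, the shared read-only array is an assumption made to keep memory use small, and the `flops` figure is assumed to be elements summed per second (which matches the sample output, where 1e9 elements in about 2.407 s gives about 4.15e+08). Pinning is left to the standard OpenMP environment variables rather than done in code.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical per-thread record; field names are illustrative only.
struct ThreadResult {
  int    thread = -1;   // OpenMP thread number (-1 if the slot was never filled)
  int    place  = -1;   // OpenMP place number (-1 if unbound or OpenMP is off)
  double timeMs = 0.0;  // wall-clock time for one pass over the array
  double flops  = 0.0;  // assumed definition: elements summed per second
  double sum    = 0.0;  // kept so the compiler cannot discard the loop
};

// The workload: sum all elements of the array.
double sumValues(const std::vector<double>& x) {
  double a = 0.0;
  for (double v : x) a += v;
  return a;
}

// Every thread runs the same workload and is timed independently.
std::vector<ThreadResult> measureThreads(std::size_t n, int maxThreads) {
  assert(maxThreads > 0);
  std::vector<double> x(n, 1.0);  // shared, read-only input (an assumption)
  std::vector<ThreadResult> results(maxThreads);
  #pragma omp parallel num_threads(maxThreads)
  {
    int t = 0, p = -1;
#ifdef _OPENMP
    t = omp_get_thread_num();
    p = omp_get_place_num();  // OpenMP 4.5; returns -1 if not bound to a place
#endif
    auto start = std::chrono::steady_clock::now();
    double s   = sumValues(x);
    auto stop  = std::chrono::steady_clock::now();
    double ms  = std::chrono::duration<double, std::milli>(stop - start).count();
    ThreadResult r;
    r.thread = t;
    r.place  = p;
    r.timeMs = ms;
    r.flops  = ms > 0.0 ? double(n) / (ms / 1000.0) : 0.0;
    r.sum    = s;
    results[t] = r;  // each thread writes only its own slot
  }
  return results;
}
```

When built with something like `g++ -O2 -fopenmp`, setting `OMP_PLACES=cores` and `OMP_PROC_BIND=close` (or `spread`) asks the runtime to keep each thread on its own core; without `-fopenmp` the sketch still compiles and measures a single thread. A slot whose `thread` is still -1 means the runtime delivered fewer threads than requested.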

Note

You can just copy main.sh to your system and run it.
For the code, refer to main.cxx.


$ ./main.sh
# OMP_NUM_THREADS=64
# {run=000, thread=000, node=0, core=000, time=2407.0969ms, flops=4.1544e+08}
# {run=000, thread=001, node=0, core=032, time=2407.0779ms, flops=4.1544e+08}
# {run=000, thread=002, node=1, core=001, time=2407.0649ms, flops=4.1544e+08}
# ...

I ran this program on nodes 1, 2, and 4 of our cluster and found no core-specific faults.

Below is the runtime stabilisation plot (I performed 100 runs of summing a billion elements on each core).


