perf-ninja-rs/labs/memory_bound/huge_pages_1 at master · grahamking/perf-ninja-rs


Original C++ lab with docs and maybe video

Rust version is Linux only so far. The C++ original also supports macOS and Windows. Contributions welcome!

Observe the memory bottleneck:

Build the benchmark binary: cargo bench --no-run. It should print the path to the binary.
Confirm we're loading a lot from main memory: perf stat -e cache-references,LLC-loads,LLC-load-misses <binary>. I get over 50% Last Level Cache (L3) misses, meaning those loads had to go to main memory.
Check the TLB: perf stat -e dTLB-loads,dTLB-load-misses <binary>. I see about 12% dTLB load misses (before optimization).
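
Why the TLB number matters: each TLB entry covers one page, so larger pages let the same number of entries cover more memory. Here is a rough sketch of that arithmetic. It assumes 4 KiB base pages and 2 MiB huge pages (the usual x86-64 sizes; check Hugepagesize in /proc/meminfo on your machine) and a hypothetical 256 MiB working set, which is not the lab's actual size.

```rust
/// Default x86-64 page size (assumption: 4 KiB).
const SMALL_PAGE: usize = 4 * 1024;
/// Common x86-64 huge page size (assumption: 2 MiB).
const HUGE_PAGE: usize = 2 * 1024 * 1024;
/// Hypothetical working set, chosen only for illustration.
const WORKING_SET: usize = 256 * 1024 * 1024;

/// Number of pages (and so distinct TLB translations) needed to cover
/// `bytes` of memory with pages of `page_size` bytes, rounding up.
fn pages_needed(bytes: usize, page_size: usize) -> usize {
    (bytes + page_size - 1) / page_size
}
```

With these assumed sizes, pages_needed(WORKING_SET, SMALL_PAGE) is 65536 while pages_needed(WORKING_SET, HUGE_PAGE) is 128, so far fewer translations compete for TLB entries.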

Optimize:

Enable huge pages on Linux (128 pages is a guess, try other numbers): sudo bash -c 'echo 128 > /proc/sys/vm/nr_hugepages'. If you use anonymous mmapped pages, I don't think you need to mount a hugetlbfs filesystem as the docs recommend.
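
To check that the kernel actually reserved the pages, look at the HugePages_Total, HugePages_Free and Hugepagesize fields in /proc/meminfo. A minimal sketch of reading them from Rust follows. The helper names meminfo_field and report_huge_pages are made up for this example and are not part of the lab.

```rust
use std::fs;

/// Extracts the numeric value of a `/proc/meminfo` field such as
/// `HugePages_Total` or `Hugepagesize`. Returns `None` if the field is
/// absent or its value does not parse as a number.
fn meminfo_field(meminfo: &str, key: &str) -> Option<u64> {
    meminfo.lines().find_map(|line| {
        let (name, rest) = line.split_once(':')?;
        if name.trim() != key {
            return None;
        }
        // The value is the first whitespace-separated token; a "kB"
        // unit may follow it and is ignored here.
        rest.split_whitespace().next()?.parse().ok()
    })
}

/// Prints the huge page counters of the running kernel (Linux only).
fn report_huge_pages() -> std::io::Result<()> {
    let meminfo = fs::read_to_string("/proc/meminfo")?;
    for key in ["HugePages_Total", "HugePages_Free", "Hugepagesize"] {
        println!("{}: {:?}", key, meminfo_field(&meminfo, key));
    }
    Ok(())
}
```

If HugePages_Total is lower than the number you wrote to nr_hugepages, the kernel could not find enough contiguous memory; a smaller number or a fresh boot may help.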

I got a ~30% speedup.
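
The anonymous mmap approach mentioned above could look roughly like this. It is a sketch under assumptions, not the lab's reference solution. The function name map_anonymous is hypothetical, the mmap and munmap declarations are written by hand to avoid a dependency (the libc crate provides the same items), and the constant values are those for Linux on x86-64. It falls back to normal pages when no huge pages are available.

```rust
use std::ffi::c_void;
use std::os::raw::{c_int, c_long};
use std::ptr;

// Hand-written bindings; values below are for Linux on x86-64.
extern "C" {
    fn mmap(addr: *mut c_void, len: usize, prot: c_int, flags: c_int,
            fd: c_int, offset: c_long) -> *mut c_void;
    fn munmap(addr: *mut c_void, len: usize) -> c_int;
}

const PROT_READ: c_int = 0x1;
const PROT_WRITE: c_int = 0x2;
const MAP_PRIVATE: c_int = 0x02;
const MAP_ANONYMOUS: c_int = 0x20;
const MAP_HUGETLB: c_int = 0x40000;
/// Assumed huge page size (2 MiB); see Hugepagesize in /proc/meminfo.
const HUGE_PAGE_SIZE: usize = 2 * 1024 * 1024;

/// Maps at least `len` bytes of zeroed anonymous memory, trying huge
/// pages first and falling back to normal pages if that fails.
/// Returns the pointer, the mapped length, and whether huge pages were used.
fn map_anonymous(len: usize) -> Option<(*mut u8, usize, bool)> {
    // Round the huge page request up to a whole number of huge pages.
    let huge_len = (len + HUGE_PAGE_SIZE - 1) / HUGE_PAGE_SIZE * HUGE_PAGE_SIZE;
    let base = MAP_PRIVATE | MAP_ANONYMOUS;
    let attempts = [(huge_len, base | MAP_HUGETLB, true), (len, base, false)];
    for &(map_len, flags, huge) in attempts.iter() {
        let p = unsafe {
            mmap(ptr::null_mut(), map_len, PROT_READ | PROT_WRITE, flags, -1, 0)
        };
        // mmap signals failure with MAP_FAILED, which is (void *) -1.
        if p as usize != usize::MAX {
            return Some((p as *mut u8, map_len, huge));
        }
    }
    None
}
```

Release the mapping with munmap(ptr as *mut c_void, mapped_len), passing the mapped length that was returned rather than the requested one. The returned flag tells you whether the huge page attempt succeeded, which is worth printing before trusting any benchmark numbers.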
