
std.Random.shuffle: optimize for cache utilizing @prefetch input queue #24705


Open: wants to merge 4 commits into master

Conversation

@Olvilock commented Aug 5, 2025

I've been doing a simulation in which one of the steps was shuffling a large table of particles (~10^7 elements). The std.Random.shuffle method turned out to be the major contributor to overall runtime. The main reason is the inherent cache-unfriendliness of the random-access swaps.

The solution that worked for me: precompute the swap indices 32 steps ahead of time, @prefetch them, and put them into a ring buffer. Since the array was cold, prefetching achieved over a 3x speedup.

This pull request presents a hybrid approach that works well on both hot and cold arrays (with the threshold tuned for hot memory).

Benchmark results for the worst case (hot memory) are attached as screenshots (ReleaseFast; CPU: Intel i5-8300H; memory freq. 2400 MHz):
Shuffle 10000000 8-byte
Shuffle 1000000 8-byte
Shuffle 400000 8-byte
Shuffle 150000 8-byte
Shuffle 100000 8-byte
Shuffle 20000 8-byte
Shuffle 5000 8-byte
Shuffle 1000 8-byte

@Rexicon226 (Contributor)

Nice - I wonder if setting the prefetch locality to 0 would improve it further.

@Olvilock (Author) commented Aug 5, 2025

@Rexicon226 Strangely, it did not (on my machine).

@Rexicon226 (Contributor)

Not too surprising. I'm not aware of x86 having a less-temporal prefetch, and I doubt LLVM is smart enough to do anything with the information. Thanks for checking!

@andrewrk andrewrk self-requested a review August 6, 2025 18:42
@Olvilock (Author)

Is there, or might there be, a standard way of querying the cache sizes of the target machine, where applicable? In this pull request I hardcoded a constant to switch between implementations; a function of the (L3) cache size would be better suited as the choice for that constant.

@Olvilock Olvilock changed the title std.Random: optimize for cache utilizing @prefetch input queue std.Random.shuffle: optimize for cache utilizing @prefetch input queue Aug 18, 2025