Benchmark Results on Raptor Lake (E-Core and P-Core!) #955
Mooshua started this conversation in Show and tell
I've used cpusets on an LXC container to run the Primes benchmarks separately on only the efficiency cores (Gracemont) and only the performance cores (Raptor Cove) of the i5-13400.
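For anyone who wants to try a similar split without a container, here is a minimal sketch that pins the current process to a set of cores using Linux CPU affinity. This is a swapped-in technique, not the LXC cpuset setup used for these runs. The helper names and the core numbering (`P_CORES`, `E_CORES`) are assumptions for illustration; check `lscpu` for the real layout on your machine.

```python
import os


def parse_cpu_list(spec):
    """Parse a cpuset-style list such as "0-3,8" into a set of CPU ids."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def pin_current_process(spec):
    """Restrict this process to the requested CPUs that are actually available.

    Returns the set of CPU ids that was applied. Linux only.
    """
    if not hasattr(os, "sched_setaffinity"):
        raise OSError("CPU affinity is not available on this platform")
    wanted = parse_cpu_list(spec)
    target = wanted & os.sched_getaffinity(0)
    if not target:
        raise ValueError("none of the requested CPUs are available: " + spec)
    os.sched_setaffinity(0, target)
    return target


# Hypothetical numbering: logical CPUs 0-11 as P-core threads, 12-15 as E-cores.
P_CORES = "0-11"
E_CORES = "12-15"
```

Calling `pin_current_process(E_CORES)` before starting the benchmark would keep the run on the assumed efficiency cores. Child processes inherit the affinity mask.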
The following machine was used to run the benchmarks
Results
LuaJIT
I ran my LuaJIT prime workload on all 3 machines, as the same underlying benchmark program is configured to run under a wide variety of workloads and configurations. I've included a few in the table below.
Interestingly, the Gracemont core doesn't handle the interpreter's hashtables very well (vm_hash), despite handling the slower (and more cumbersome) interpreted FFI handlers just fine (vm_ffi). Both cores had slightly higher scores with an unroll factor of 8 compared to an unroll factor of 16.
The baseline JIT workloads (jit_slow) are handled pretty well by both the Gracemont and Raptor cores. However, the plain fast-JIT workload (jit) has the 4th-largest performance gap in the suite.
The cache-optimized fast-JIT workloads (jit_16_c...k) are where we see each core start to shine. The Gracemont core performs best when running blocks of 32 KB (jit_16_c16k), and gets to 71% of the Raptor core's performance at a smaller block size of 16 KB (jit_16_c8k). However, it drops off at 48 KB (jit_16_c24k), and performance begins to plummet as more and more L1 evictions occur.
The Raptor cores stay strong throughout all cache-optimized runs, continuing to net wins as the execution blocks grow. This may indicate better L1 eviction performance or an optimization in linear-access prefetching. (Raptor Cove is known to have a significant improvement in prefetch heuristics compared to previous generations.)
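To make the block-size discussion concrete, here is a rough sketch of a cache-blocked (segmented) sieve, assuming the prime workload is a sieve of Eratosthenes. It is not the LuaJIT implementation benchmarked here: the function name and `block_bytes` parameter are made up for illustration, and it uses one byte per candidate rather than whatever packing the real workloads use. Python won't reproduce the cache behaviour; it only shows where the block size enters the algorithm.

```python
from math import isqrt


def segmented_sieve_count(limit, block_bytes=32 * 1024):
    """Count primes <= limit, sieving one block_bytes-sized window at a time."""
    if limit < 2:
        return 0

    # Plain sieve for the base primes up to sqrt(limit).
    root = isqrt(limit)
    base = bytearray([1]) * (root + 1)
    base[0:2] = b"\x00\x00"
    for i in range(2, isqrt(root) + 1):
        if base[i]:
            base[i * i::i] = bytes(len(range(i * i, root + 1, i)))
    primes = [i for i in range(2, root + 1) if base[i]]

    count = 0
    # Each window holds block_bytes flags (one byte per candidate), so the
    # working set per pass is roughly block_bytes plus the base primes.
    for low in range(2, limit + 1, block_bytes):
        high = min(low + block_bytes - 1, limit)
        block = bytearray([1]) * (high - low + 1)
        for p in primes:
            if p * p > high:
                break
            # First multiple of p inside the window, never below p*p.
            start = max(p * p, ((low + p - 1) // p) * p)
            block[start - low::p] = bytes(len(range(start, high + 1, p)))
        count += sum(block)
    return count
```

Changing `block_bytes` (for example 16 KB, 32 KB, 48 KB) changes only the window size, which is the knob the jit_16_c...k variants appear to turn. The idea is that a window that fits in L1 gets crossed off without being evicted between primes.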
Top 8
Below are tables for just the top 8 single-threaded benchmarks for the Raptor cores and Gracemont cores. We can clearly see from the results that the Raptor cores handle the prime workload significantly better, at about 2x the operations/second compared to the Gracemont cores.
Top 8 - Gracemont (Efficiency)
Top 8 - Raptor (Performance)