Concerns with assumptions and limitations of UnixBench #125

meteorfox · 2015-02-20T01:51:41Z

I have several concerns with the current usage of UnixBench. I know that most people seem to run UnixBench 'as is' with whatever defaults are set, but I believe this will not give a proper comparison between systems (e.g. OS versions, compiler versions, libc versions, etc.).

Here's why:

1. Doesn't scale beyond 16 CPUs, at least not by default
- This issue has been known for a while and a patch is available. For some reason, when UnixBench was written, a hard-coded limit (16) was set for the System and Non-Index benchmark suites. Obviously, nowadays we have many more CPUs than that on a single machine.
- This means that running UnixBench on an instance type with 32 CPUs will give approximately the same scores as a 16-CPU instance type, assuming they use the same processors, of course.
2. GCC compiler options
- By default, the UnixBench Makefile uses the flags intended for Solaris; the Linux-specific flags are commented out, and even those assume a Pentium ISA.
- Compiling with different flags is known to have a huge impact on the performance of a program: you can tweak things from loop unrolling to function inlining. More concerning, it has been shown recently that even link order and environment variables can impact performance significantly.
- This is a little trickier to solve, but perhaps one way to mitigate it is to ship statically compiled binaries for specific architectures; that way the benchmark will always run the same number of instructions and be linked the same way.
3. GCC versions!
- Brendan Gregg wrote a great post showing how wildly UnixBench results can vary just by changing GCC versions. I'm just going to link his blog post here; it explains the problem better than I can and was the inspiration for this issue.
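
The pre-built-binary idea in the second point only helps if every system really runs the same bytes. Below is a minimal sketch of one way to check that, assuming a SHA-256 digest is published alongside each binary. The function names and the notion of a published digest are illustrative assumptions, not something UnixBench provides.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_expected_binary(path: Path, expected_digest: str) -> bool:
    """True if the file matches the (hypothetical) published digest."""
    return sha256_of(path) == expected_digest.lower()
```

A harness could refuse to record a score when `is_expected_binary` returns `False`, so results from locally rebuilt binaries are never mixed with results from the reference build.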

What is your take on these issues?

voellm · 2015-02-23T19:40:04Z

These are great points! It looks like we should adjust the benchmark to better fit how the tool is actually used to measure systems (including more CPUs).

This change, however, would affect the default behavior of UnixBench, and the community would need to agree to make it the default.

Any chance you can prototype this change in a separate branch and then share the comparison data on this issue thread?

We can then include the proposed CL in the next community meeting on 3/18. If everyone agrees, we take the change and are done. If we need to escalate, Christos from Stanford and Daniel from MIT decide.

Ivan may also speak at the community meeting with a more codified proposal for how to take changes.

ivansmf · 2015-02-23T21:00:04Z

TL;DR - I think you're right and we should do this. Let's test the workflow on how to change stuff.

I will likely create another issue to track this, but here is what I was thinking:

1. If it is a bug fix, then we can just take it and update the benchmark.
2. If it changes the benchmark in a way that can be characterized as a breaking change, we need a review.
2.1 If the breaking change comes from the benchmark publisher, Stanford and MIT approve or deny it, and everyone has to follow suit.
2.2 If the breaking change comes from a third party (e.g. a community patch), then each partner can decide whether to take the patch on their own test set, and the updated test must change its name to avoid confusion.

What do you think?

meteorfox · 2015-02-23T23:41:17Z

Sounds reasonable. I'll work on a CL in a separate branch.

IMHO, the changes to make UnixBench run on more than 16 CPUs don't really change UnixBench's behavior: anywhere you had < 16 CPUs it will continue to run the same way, and I wouldn't expect any changes in the results.
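
To make that claim concrete, here is a minimal sketch of the capping logic. It is not UnixBench's actual code: the names `copies_to_run` and `OLD_CAP` are hypothetical, and it assumes the multi-copy pass launches one copy per CPU up to the cap. The only number taken from this thread is the hard-coded limit of 16.

```python
from typing import Optional

OLD_CAP = 16  # the hard-coded limit mentioned in this thread


def copies_to_run(num_cpus: int, cap: Optional[int] = OLD_CAP) -> int:
    """Number of parallel benchmark copies for a machine with num_cpus CPUs.

    Assumption: one copy per CPU, clamped to cap. cap=None means no limit.
    """
    if num_cpus < 1:
        raise ValueError("num_cpus must be at least 1")
    if cap is None:
        return num_cpus
    return min(num_cpus, cap)


if __name__ == "__main__":
    # Compare capped and uncapped copy counts for a few machine sizes.
    for cpus in (4, 16, 32):
        print(cpus, copies_to_run(cpus), copies_to_run(cpus, cap=None))
```

Under these assumptions a 32-CPU machine runs 16 copies with the cap in place, the same as a 16-CPU machine, while for any CPU count up to 16 the capped and uncapped values are identical, which is why results on those systems should not move.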

The other suggested change, using pre-compiled binaries for each of the UnixBench benchmarks listed here (https://code.google.com/p/byte-unixbench/source/browse/trunk/UnixBench/Makefile#183), will definitely be considered a "breaking" change.

ivansmf · 2015-02-23T23:45:59Z

I am happy with the flag solution as it is simpler and lets us postpone adding process/overhead.

voellm · 2015-03-12T22:28:27Z

Carlos - Thanks for fixing part 1 with #135

What do you suggest for the other two?

meteorfox · 2015-06-23T18:47:35Z

@voellm For the other two, I opened an issue in the UnixBench repo for my proposal to address these, kdlucas/byte-unixbench#23

voellm added the enhancement label Feb 23, 2015

voellm assigned meteorfox Mar 12, 2015

voellm added this to the P1 milestone Mar 12, 2015

voellm added the P1 label Mar 24, 2015

voellm modified the milestone: P1 Mar 24, 2015

keerockl pushed a commit to keerockl/PerfKitBenchmarker that referenced this issue Jan 11, 2020

Change hhvm_provisioning URLs to GitLab (GoogleCloudPlatform#125)

92da085
