Optionally expand the size of sort.src for tst.sh shell workload #86
As I posted in #85: meta: benchmarking, like statistics, can be manipulated. Comparing numbers generated from different code can be less accurate and even misleading. I made similar comments many years ago here: That said, where the implementation is obviously detrimental to what the code is trying to exercise and test, I am happy to evaluate proposed patches.
It looks like you have copied the contents of sort.src and repeatedly appended that content to the file X number of times. Instead of hard-coding a larger file into the source code, should |
If you're prototyping, please continue. If you would like me to consider the patches, please discuss the questions I raised.
@gstrauss Thanks for your response. I am preparing another patch and collecting the data for the discussion. I hope to have the data in a day or two. I would love to contribute to the benchmark.
Thanks @gstrauss, my ideas:
A: My opinion is no. From my point of view, the file size change is not intended to change the original test; instead, it realigns the test with its original purpose. As we can see, before the change, on a system with dozens of cores, the shell transformation contributes only ~8% of cycles, while after the change, the cycle distribution between the shell transformation and process creation becomes balanced.
A: Since we are realigning with the original test purpose, I would suggest updating the existing test case rather than adding a new case while keeping one that should be updated.
A: Agreed. The patch has been refined per your comments to auto-generate test.src in tst.sh. The profiling data above is based on this patchset; please take a look.
You speak very much from the perspective of a benchmark producing numbers, and have not said anything regarding consumption of those benchmarking numbers and using those numbers for historical comparison. To guide this conversation: I think that you have misunderstood the original test. The original test is self-described as a certain set of operations on a given input, including size and content. You also misunderstood my rhetorical question. Dramatically changing the size of the data set does in fact change the test. You might want to measure a different combination and balance of operations, and that different combination might produce a more useful metric to describe the performance of your modern system, but that is in fact a different test metric than is currently being measured. ... I will think about this more, too, and later this week will try to find time to analyze your patch further.
This is sloppy and insecure: Separately, I won't accept a hard-coded I might (TBD) consider accepting a patch which uses an environment variable defaulting to 1, which preserves current behavior. e.g. |
Thanks, Glenn, for your comments and kind suggestions. I had some family matters to handle these past few days; sorry for the delayed response. I’ve been fully convinced by the statement "Dramatically changing the size of the data set does in fact change the test.". On the other hand, I also agree with you on "a patch which uses an environment variable defaulting to 1, which preserves current behavior. e.g. A good catch for "This is sloppy and insecure: |
I would prefer to be able to run the test with different values for the environment variable, without needing to re-run make install. Creating the input file separately -- not in a pipeline -- may have a different (and potentially more desirable) effect than creating the output while piping to
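For readers following the thread, the approach discussed above (a multiplier taken from an environment variable that defaults to 1, with the input file generated in a separate step rather than inside a pipeline) could be sketched roughly as follows. This is not the patch from this PR: the function name `make_scaled_input` and the variable name `SORT_SRC_MULT` are hypothetical, since the actual names are not shown in the thread.

```shell
#!/bin/sh
# Hypothetical helper (not from the actual patch):
# write the contents of file $1, repeated $2 times, to file $3.
make_scaled_input() {
    src=$1
    mult=$2
    out=$3

    # Accept only a positive decimal integer without leading zeros,
    # so that an arbitrary environment value is never evaluated.
    case $mult in
        ''|*[!0-9]*|0*)
            echo "invalid multiplier: $mult" >&2
            return 1
            ;;
    esac

    # Create (or truncate) the output file, then append the source
    # file to it $mult times.
    : > "$out" || return 1
    i=0
    while [ "$i" -lt "$mult" ]; do
        cat "$src" >> "$out" || return 1
        i=$((i + 1))
    done
}

# Example use (SORT_SRC_MULT is an assumed name; the default of 1
# makes the generated file a plain copy of sort.src):
#   make_scaled_input sort.src "${SORT_SRC_MULT:-1}" sort.input
```

Because the multiplier is read when the script runs, a different value can be tried without reinstalling anything, and with the default of 1 the generated file is byte-identical to the original input.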
Hi @gstrauss, thanks for your suggestions. I made a small change according to your comments:
What do you think of it?
Updated.
I pushed a commit which changed the naming of some variables, and I added some brief documentation. Please review. Note that setting |
…lance cycles distribution.
LGTM. @gstrauss Thank you for the improvement.
Agree.
Thank you for your contribution. I hope that my additions to the |
The data file sort.src used in the Shell Scripts test cases has been carried over for more than a decade. The time distribution between transforming the data file and creating processes, once balanced, has become imbalanced on today's systems with dozens of cores.
Would it be meaningful to enlarge the sort.src file to follow the original design goal of the Shell Scripts tests and make them more scalable on a modern system with many cores?
Looking forward to hearing your advice.