
Optionally expand the size of sort.src for tst.sh shell workload #86

Merged · 5 commits · Mar 10, 2023

Conversation

lipzhu
Contributor

@lipzhu lipzhu commented Feb 22, 2023

The data file sort.src used in the Shell Scripts test cases has been inherited unchanged for more than a decade. The time distribution that was once balanced between transforming the data file and creating processes has become imbalanced on today's systems with dozens of cores.
Would it be meaningful to enlarge sort.src so that the Shell Script tests follow their original design goal and scale better on a modern system with many cores?
Looking forward to hearing your advice.

@gstrauss
Collaborator

gstrauss commented Feb 26, 2023

As I posted in #85:

meta: benchmarking, like statistics, can be manipulated. Comparing numbers generated from different code can be less accurate and even misleading. I made similar comments many years ago here:
#23 (comment)

That said, where the implementation is obviously detrimental to what the code is trying to exercise and test, I am happy to evaluate proposed patches.


tst.sh was written to operate on a small file, and you correctly assessed that "a balanced time distribution between the transformation of the data file and the creation of process became imbalanced in a system with dozens of cores nowadays." If we change the size of the file, aren't we changing the test? Should we consider adding an additional test on a larger input? Should we consider making the test configurable for generating input?

It looks like you have copied the contents of sort.src and repeatedly appended that content to the file X number of times. Instead of hard-coding a larger file into the source code, should tst.sh be modified to repeat the contents of sort.src in a loop and pipe that into sort? Then the script might take a parameter of how many times to repeat sort.src.
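The loop-and-pipe idea might look roughly like this sketch (the function name and parameter are illustrative, not part of the PR):

```shell
#!/bin/sh
# Sketch: repeat the contents of sort.src N times and pipe into sort,
# rather than committing an enlarged copy of the file to the source tree.
repeat_and_sort() {
    n=${1:-1}    # repetition count; defaults to 1, i.e. current behavior
    i=0
    while [ "$i" -lt "$n" ]; do
        cat sort.src
        i=$((i + 1))
    done | sort
}
```

tst.sh could then take the repetition count as a parameter, e.g. `repeat_and_sort "$1" > sort.$$`.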

@gstrauss
Collaborator

If you're prototyping, please continue. If you would like me to consider the patches, please discuss the questions I raised.

@lipzhu
Contributor Author

lipzhu commented Feb 27, 2023

@gstrauss Thanks for your response. I am preparing another patch and collecting data for the discussion. I hope to have the data within a day or two. I would love to contribute to the benchmark.

@lipzhu
Contributor Author

lipzhu commented Feb 28, 2023

Thanks @gstrauss, my ideas:

If we change the size of the file, aren't we changing the test?

A: In my opinion, no. The file size change is not intended to change the original test; rather, it realigns the test with its original purpose. Before the change, on a system with dozens of cores, the shell transformation contributed only ~8% of cycles; after the change, the cycle distribution between the shell transformation and process creation is balanced again.
My profiling shows the user-space/kernel cycle distribution after the change:
[two profiling screenshots: user/kernel cycle distribution before and after the change]

Should we consider adding an additional test on a larger input?

A: Since the change realigns the test with its original purpose, I would suggest updating the existing test case rather than adding a new case while keeping the one that should be updated.

Should we consider making the test configurable for generating input?

A: Agreed. The patch has been refined per your comments to auto-generate the input file in tst.sh; the profiling data above is based on this patchset. Please take a look.

@gstrauss
Collaborator

If we change the size of the file, aren't we changing the test?

A: In my opinion, no. The file size change is not intended to change the original test; rather, it realigns the test with its original purpose.

You speak very much from the perspective of a benchmark producing numbers, and have not said anything regarding consumption of those benchmarking numbers and using those numbers for historical comparison.

To guide this conversation: I think that you have misunderstood the original test. The original test is self-described as a certain set of operations on a given input, including size and content. You also misunderstood my rhetorical question. Dramatically changing the size of the data set does in fact change the test. You might want to measure a different combination and balance of operations, and that different combination might produce a more useful metric to describe the performance of your modern system, but that is in fact a different test metric than is currently being measured.

... I will think about this more, too, and later this week will try to find time to analyze your patch further.

@gstrauss
Collaborator

gstrauss commented Mar 3, 2023

This is sloppy and insecure: `printf "$sort_src_content%.0s\n" $(seq $1) | sort > sort.$$`
If `$sort_src_content` contains `%s` or another format directive, the output is not what you intended.
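The hazard is that data lands in printf's format argument. A minimal demonstration with sample data (the variable content here is assumed for illustration):

```shell
#!/bin/sh
# Data that happens to contain a format directive.
sort_src_content='%s demo'

# Unsafe: data used as the FORMAT string; %s is consumed as a directive
# and, with no corresponding argument, is replaced by an empty string.
unsafe=$(printf "$sort_src_content\n")

# Safe: fixed format, data passed as an ordinary argument.
safe=$(printf '%s\n' "$sort_src_content")
```

Here `$unsafe` comes out as ` demo`, while `$safe` preserves the literal `%s demo`.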

Separately, I won't accept a hard-coded times_of_sort_src=20

I might (TBD) consider accepting a patch which uses an environment variable defaulting to 1, which preserves current behavior, e.g. `${WELL_NAMED_SOMETHING:-1}`. That would allow you to recommend that customers set it in the environment before running UnixBench with a different workload -- using the same code but measuring a different (modified) metric. Whether the numbers are interpreted competently or not is out of my hands.
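The `${VAR:-default}` idiom sketched with the placeholder name from above (`WELL_NAMED_SOMETHING` is explicitly a stand-in, not the final variable):

```shell
#!/bin/sh
# Use the environment value when set; otherwise default to 1,
# preserving the current single-copy behavior.
work_factor=${WELL_NAMED_SOMETHING:-1}
echo "work factor: $work_factor"
```

Running with the variable exported, e.g. `WELL_NAMED_SOMETHING=20 sh tst.sh` (invocation illustrative), would then select the enlarged workload without any code change.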

@lipzhu
Contributor Author

lipzhu commented Mar 7, 2023

Thanks Glenn for your comments and kind suggestions. I had some family matters to handle these past days; sorry for the delayed response. I am fully convinced by the statement "Dramatically changing the size of the data set does in fact change the test." I also agree with your suggestion of "a patch which uses an environment variable defaulting to 1, which preserves current behavior ... which would allow you to recommend for customers to set in the environment before running UnixBench with a different workload -- using the same code but measuring a different (modified) metric". Using the same code while also preserving the current behavior would be an excellent approach. Allow me a day or two to draft a patch for your review.

Good catch on the format-string issue with `printf "$sort_src_content%.0s\n" $(seq $1) | sort > sort.$$`; I will find another way in the new patch. Thank you for noticing such details. Would you prefer a change in the Makefile to generate a new sort.src according to the environment variable, or a command added to tst.sh?

@gstrauss
Collaborator

gstrauss commented Mar 7, 2023

Would you prefer a change in Makefile to generate a new sort.src according to env variable, or add a command in tst.sh?

I would prefer to be able to run the test with different values for the environment variable, without needing to re-run make install.

Creating the input file separately -- not in a pipeline -- may have a different (and potentially more desirable) effect than creating the output while piping to sort. I suggest creating the input file in multi.sh and passing the input file name as an argument to tst.sh

`for i in $(seq $foo); do cat sort.src; done > sort.src-alt`
and `rm sort.src-alt` at the end of multi.sh.
It does mean that the input file is the same for all the tst.sh instances, and that matches current test behavior.
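Put together, the multi.sh side might look like this sketch (`$foo` and the variable name are placeholders; the launch line is illustrative):

```shell
#!/bin/sh
# Sketch of generating a shared enlarged input in multi.sh.
[ -f sort.src ] || printf 'sample\n' > sort.src  # sample input if run outside the repo

foo=${TIMES_OF_SORT_SRC:-1}   # placeholder name; repetition count, default 1

# Build the enlarged input once, shared by every tst.sh instance.
for i in $(seq "$foo"); do cat sort.src; done > sort.src-alt

# Each concurrent instance would then be started as: sh tst.sh sort.src-alt &

rm -f sort.src-alt
```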

@lipzhu
Contributor Author

lipzhu commented Mar 8, 2023

Hi @gstrauss, thanks for your suggestions. I made a small change according to your comments:

  1. Added a check: if the user sets the TIMES_OF_SORT_SRC environment variable to expand the input size, it is honored; otherwise the default behavior is kept.
  2. Moved the external command cat out of the for loop to reduce the contention of loading the external binary.

What do you think?

@gstrauss
Collaborator

gstrauss commented Mar 8, 2023

`combine_sort_src+=" $sort_src"` is not portable to all shells.
You can use `combine_sort_src="$combine_sort_src $sort_src"`
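A quick check of the portable form (strict POSIX shells such as dash reject `+=`; the loop values here are illustrative):

```shell
#!/bin/sh
# Portable accumulation: var="$var word" works in every POSIX shell,
# while var+=... is a bash/ksh/zsh extension.
combine_sort_src=""
for part in alpha beta gamma; do
    combine_sort_src="$combine_sort_src $part"
done
echo "$combine_sort_src"   # note the leading space from the first append
```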

@gstrauss gstrauss changed the title Expand the size of sort.src for Shell workload to keep align it with the original test purpose Optionally expand the size of sort.src for tst.sh shell workload Mar 8, 2023
@lipzhu
Contributor Author

lipzhu commented Mar 8, 2023

`combine_sort_src+=" $sort_src"` is not portable to all shells. You can use `combine_sort_src="$combine_sort_src $sort_src"`

Updated.

@gstrauss
Collaborator

gstrauss commented Mar 9, 2023

I pushed a commit which changed the naming of some variables, and I added some brief documentation. Please review.

Note that setting `MULTI_SH_WORK_FACTOR` to too high a value results in the shell syntax overflowing the command line, and no work being done. On my system `MULTI_SH_WORK_FACTOR=1400` worked, but some slightly higher values failed silently. However, since the user/kernel workload is drastically changed with `MULTI_SH_WORK_FACTOR=1400`, and since your data above suggested `MULTI_SH_WORK_FACTOR=20`, I am not too concerned with this limitation.

@lipzhu
Contributor Author

lipzhu commented Mar 10, 2023

I pushed a commit which changed the naming of some variables, and I added some brief documentation. Please review.

LGTM. @gstrauss Thank you for the improvement.

However, since the user/kernel workload is drastically changed with MULTI_SH_WORK_FACTOR=1400, and since your data above suggested MULTI_SH_WORK_FACTOR=20, I am not too concerned with that limitation.

Agree. MULTI_SH_WORK_FACTOR=1400 or higher should be rarely used.

@gstrauss gstrauss merged commit d130340 into kdlucas:master Mar 10, 2023
@gstrauss
Collaborator

Thank you for your contribution.

I hope that my additions to the USAGE file help you explain to others how to use `export MULTI_SH_WORK_FACTOR=20` and to compare the performance with other systems on which the same `export MULTI_SH_WORK_FACTOR=20` was also used.
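For reference, an invocation under the new variable might look like this (the commented `./Run` test selection is illustrative; consult the USAGE file for the actual test names):

```shell
# Select the enlarged shell workload; numbers are only comparable
# across systems that used the same factor.
export MULTI_SH_WORK_FACTOR=20
# ./Run shell1 shell8    # illustrative UnixBench invocation
```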
