-
Notifications
You must be signed in to change notification settings - Fork 2.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve the random read behavior in StressWorkerBench #18000
Improve the random read behavior in StressWorkerBench #18000
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the work! I added some comments.
dora/stress/shell/src/main/java/alluxio/stress/cli/worker/StressWorkerBench.java
Show resolved
Hide resolved
dora/stress/shell/src/main/java/alluxio/stress/cli/worker/StressWorkerBench.java
Outdated
Show resolved
Hide resolved
dora/stress/shell/src/main/java/alluxio/stress/cli/worker/StressWorkerBench.java
Outdated
Show resolved
Hide resolved
dora/job/server/src/main/java/alluxio/job/plan/stress/StressBenchDefinition.java
Outdated
Show resolved
Hide resolved
a35dda2
to
4cb7ac9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for the work!
@@ -415,6 +396,10 @@ private BenchThread(BenchContext context, int targetFileIndex, FileSystem fs) { | |||
mResult.setParameters(mParameters); | |||
mResult.setBaseParameters(mBaseParameters); | |||
mIsRandomRead = mParameters.mIsRandom; | |||
mRandom = new Random(mParameters.mRandomSeed); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you are using the same random seed for all threads, then wouldn't the offset and length sequence of all threads be the same as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand your motivation now. The threads were reading from one single offset and with the same length throughout the benchmark, and this PR makes them read with different offsets and lengths. Makes sense to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, i used to worry the same seed problem as well, but i think as all threads read enough times it won't be matter.
private final Random mRandom; | ||
private final int mRandomMax; | ||
private final int mRandomMin; | ||
private final int mFileSize; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
using int for file sizes limits the file to be smaller that 2.1GB. Can you either add a check to the cli parser to disallow input of files larger than 2.1GB, or change it to long
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consequently, offset
should be check in the same way as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all changed to long, and StressWorkerBench will make sure the MaxReadLength is within 2.1G when validating the parameters. since the InputStream.read() method only support read length in integer.
dora/stress/shell/src/main/java/alluxio/stress/cli/worker/StressWorkerBench.java
Outdated
Show resolved
Hide resolved
dora/stress/shell/src/main/java/alluxio/stress/cli/worker/StressWorkerBench.java
Outdated
Show resolved
Hide resolved
private final long mRandomMax; | ||
private final long mRandomMin; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
length should be int
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since these two value will evaluated with long type mFileSize
later, i think long would be better
dora/stress/shell/src/main/java/alluxio/stress/cli/worker/StressWorkerBench.java
Outdated
Show resolved
Hide resolved
…ssWorkerBench.java Co-authored-by: Bowen Ding <[email protected]>
…ssWorkerBench.java Co-authored-by: Bowen Ding <[email protected]>
a621637
to
d7c81f9
Compare
dora/stress/shell/src/main/java/alluxio/stress/cli/worker/StressWorkerBench.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
alluxio-bot, merge this please |
* generate random number in range [min, max] (include both min and max). | ||
*/ | ||
private long randomNumInRange(long min, long max) { | ||
return ThreadLocalRandom.current().nextLong(min, max + 1) + min; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note we are using thread local random here, so does the --seed
parameter still take effect?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think we no longer need this param
What changes are proposed in this pull request?
Improve the randomness of StressWorkerBench random read test, now each thread throw dice every time it trying to do a random read.
Why are the changes needed?
In previous StressWorkerBench random read test each thread read from same offset and same length everytime, this cause low randomness.
Does this PR introduce any user facing changes?
no