-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
infohint: add nogpu as a gpudriver_id hint #169
Conversation
Add a way for application to tell yaksa that both in_buf and out_buf are host memory. Signed-off-by: Hui Zhou <[email protected]>
e4fbfd2
to
b8d3a98
Compare
test:yaksa/gpu |
Apply nogpu hint every other test when no device memory is involved. Signed-off-by: Hui Zhou <[email protected]>
The yaksu_atomic_decr returns the value before decrement. Signed-off-by: Hui Zhou <[email protected]>
test:yaksa/gpu RESULT: 3 hr 28min. 1 failure:
|
@pavanbalaji The PR is ready for your review. There is 1 failure in the newly created gpu testing, see #169 (comment), I am not sure it is related to this PR. Maybe you can help to decide. Meanwhile, I'll re-run the test to confirm. |
test:yaksa/gpu ✔️ |
@pavanbalaji Tests are clean. Ready for your review. |
host_only_get_ptr_attr(&unpack_info); | ||
} else { | ||
pack_get_ptr_attr(tbuf_d, dbuf_d + dobj.DTP_buf_offset, &unpack_info); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This patch is forcing all host-only testing to use this new info hint. This will cause us to lose all testing for the cases where this info hint is not provided.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It only adds the info hint to half of the cases, similar to the gpu tests. Refer to line 55 above.
test/pack/pack.c
Outdated
static void host_only_get_ptr_attr(yaksa_info_t * info) | ||
{ | ||
static int count = 0; | ||
if ((++count) % 2 == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this thread-safe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really. But the consequence is just less than perfect half by half testing. The same mechanism is used in test/pack/pack-cuda.c
. We can make it atomic (do we have atomic library for the testing?) but maybe it isn't worth the trouble. For example, maybe some randomness is even desirable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, randomness is not desirable in this case. It's better to make both cases deterministic. If we don't have an atomic library, we could use a lock or decide on the info hint based on the iteration number.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay
The static counter is not thread-safe. Use iteration number to get more consistent split-testing. Signed-off-by: Hui Zhou <[email protected]>
test:yaksa/gpu 2 failures:
|
test:yaksa/gpu |
@pavanbalaji ready for your review again |
How did the failures disappear? Did you simply rerun the tests till they disappeared? That's not a good model. It only means that the errors are intermittent and are being ignored. |
I did record them down. They seem to be intermittent gpu errors. I won't have time to debug them. I'll have to check them later. |
If they are happening with |
Let me open a dummy PR to check that |
I meant in the nightly tests. You don't need a dummy PR. |
Sure. Let me create one. |
@pavanbalaji The cuda memory issue was reproduced in nightly test (https://jenkins-pmrs.cels.anl.gov/view/yaksa/job/yaksa-nightly-gpu/lastCompletedBuild/testReport/) and tracked in issue #170 |
@pavanbalaji Is there any remaining issues with this PR? |
Pull Request Description
The purpose of this PR is add a way to tell yaksa the both
in_buf
andout_buf
is cpu host memory. This is useful when yaksa is configured with GPU support but the application doesn't use gpu memory or the application knows when their buffer is entirely from CPU and thus doesn't need to pay the pointer attribute query overhead.Expected Impact
Fixes #168
Author Checklist
module: short description
and follows good practice