PERF: Assertion on final ack message with RNDV #10529
Conversation
Force-pushed from 23c6fa1 to 5d82ad2
src/tools/perf/lib/ucp_tests.cc
Outdated
    ucs_assert(length == ucx_perf_get_message_size(&m_perf.params));
} else if (my_index == 1) {
    /* Sender may only receive final ack */
    ucs_assert(length == 1);
Can you please remind me why we send an ack with RNDV?
Sure.
When we discussed the ACK message, we decided to reuse the existing communication channel and buffers and simply send the message in the opposite direction (receiver -> sender). So the protocol is not fixed; it matches the test's communication pattern. In this particular case, with ucp_am_bw, it happens to be RNDV.
Can we convert it to ucs_assertv that prints the actual length, the expected length, and my_index?
Done, with a force push since the change is tiny.
Force-pushed from 9f35d0a to abc0782
src/tools/perf/lib/ucp_tests.cc
Outdated
    ucs_assertv(length == expected_length, "length=%zu, expected=%zu,"
                " index=%u", length, expected_length, my_index);
} else if (my_index == 1) {
    /* Sender may only receive final ack */
    ucs_assertv(length == 1, "length=%zu, expected=1, index=%u", length,
Please remove "," from the format strings:
1st assert: "length=%zu expected_length=%zu my_index=%u"
2nd assert: "length=%zu my_index=%u" (no need to print 1, it would already appear in the assert message)
ok
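To illustrate why the literal 1 does not need to be repeated in the format string, here is a minimal self-contained sketch. `format_failure` and `SKETCH_ASSERTV_MSG` are hypothetical stand-ins written for this example, not UCX's `ucs_assertv`; the sketch only assumes that an assertv-style macro prints the stringified condition ahead of the user-supplied details. `my_index` is printed with `%u` here because the diff declares it as `unsigned`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: builds the text an assertv-style macro might print.
// The stringified expression comes first, so constants that appear in the
// condition (such as 1) are already visible without repeating them.
template <typename... Args>
std::string format_failure(const char *expr, const char *fmt, Args... args)
{
    char details[128];
    std::snprintf(details, sizeof(details), fmt, args...);
    return std::string("Assertion `") + expr + "' failed: " + details;
}

// Hypothetical macro: yields an empty string when the condition holds,
// otherwise the failure text. A real assert macro would abort instead.
#define SKETCH_ASSERTV_MSG(_cond, _fmt, ...) \
    ((_cond) ? std::string() : format_failure(#_cond, _fmt, __VA_ARGS__))
```

With a sender-side check such as `SKETCH_ASSERTV_MSG(length == 1, "length=%zu my_index=%u", length, my_index)`, the failure text starts with the condition `length == 1`, so the expected value is visible without an extra `expected=1` field.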
Force-pushed from d64c8c1 to a6a3a2f
src/tools/perf/lib/ucp_tests.cc
Outdated
@@ -275,11 +275,20 @@ class ucp_perf_test_runner {
                       const ucp_am_recv_param_t *rx_params)
    {
        ucs_assert(!(rx_params->recv_attr & UCP_AM_RECV_ATTR_FLAG_DATA));
        ucs_assert(length == ucx_perf_get_message_size(&m_perf.params));
        unsigned my_index = rte_call(&m_perf, group_index);
Can we avoid calling it on the data path?
E.g., we can do a single assert like
(length == expected_length || length == 1)
That call is not expensive, but ok. We can make it static, so that it's evaluated just once:
static unsigned my_index = rte_call(&m_perf, group_index);
why can't we define a single assert?
It's less strict, but not a big deal. Ok, we can also do with a single assert
We can also cache group_index as a member variable, or define two different AM callbacks (one for the sender and one for the receiver).
Yes, caching it as a member variable is the easier option.
I have now implemented it with a single assertion, as Mikhail proposed.
But it seems a single assertion does not really cover all possible errors, for example when the server gets a length of 1 instead of the actual data length.
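The gap described above can be shown with a small standalone sketch. Both functions are hypothetical helpers written for this illustration. The role mapping (index 0 receives the payload, index 1 is the sender that only gets the final ack) is an assumption inferred from the `else if (my_index == 1)` branch in the reviewed hunk.

```cpp
#include <cassert>
#include <cstddef>

// Single combined condition: accepts a 1-byte message on either side.
inline bool length_ok_single(std::size_t length, std::size_t expected_length)
{
    return (length == expected_length) || (length == 1);
}

// Role-aware condition.
// Assumption: index 0 is the payload receiver, any other index is the sender.
inline bool length_ok_by_role(std::size_t length, std::size_t expected_length,
                              unsigned my_index)
{
    if (my_index == 0) {
        return length == expected_length; // receiver must see the full payload
    }
    return length == 1;                   // sender may only see the final ack
}
```

For a 4096-byte test, a receiver that wrongly gets a 1-byte message passes `length_ok_single(1, 4096)` but fails `length_ok_by_role(1, 4096, 0)`, which is the case the combined assertion cannot catch.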
Force-pushed from a6a3a2f to dfd13d8
src/tools/perf/lib/ucp_tests.cc
Outdated
ucs_assert(length == ucx_perf_get_message_size(&m_perf.params));
/* Data length can be either 1 (on sender side when receive a final ack)
 * or expected payload size on the receiver side. */
ucs_assertv((length == 1) ||
also we can change it like this
- ucs_assertv((length == 1) ||
+ ucs_assertv((length == 1) && (rte_call(&m_perf, group_index) == 0) ||
IMO we cannot avoid calling rte_call for every data packet to determine that we are the server and that the payload size should be >1, unless we cache it in the class or provide a different AM callback.
It can be inside the assert, as I mentioned in this comment.
so it is hidden in release build?
yes
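A small sketch of the point confirmed above: an expression placed inside an assertion costs nothing when assertions are compiled out. The two macros and `lookup_index` are hypothetical and model the debug and release flavours side by side. They are not UCX's `ucs_assert`, whose release behaviour is taken here from the thread rather than from its source.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical counter standing in for the cost of a lookup such as rte_call().
static int lookup_calls = 0;

static unsigned lookup_index()
{
    ++lookup_calls;
    return 0;
}

// Debug flavour: evaluates the condition and aborts on failure.
#define SKETCH_ASSERT_DEBUG(_cond) \
    do { \
        if (!(_cond)) { \
            std::abort(); \
        } \
    } while (0)

// Release flavour: discards the condition without evaluating it,
// the way the standard assert() does when NDEBUG is defined.
#define SKETCH_ASSERT_RELEASE(_cond) ((void)0)
```

`SKETCH_ASSERT_RELEASE(lookup_index() == 0)` leaves `lookup_calls` untouched, while the debug flavour increments it once per check, so the lookup only runs on the data path in debug builds.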
If we go this way, then I would just cache this index inside a member var
In the end I implemented it in a slightly different manner: I adjust m_am_rx_length on the sender side and use this existing variable for the assertions. By the way, there was one more place where a similar assertion was failing, so this change fixes both of them.
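The final approach can be sketched roughly as follows. Only the name `m_am_rx_length` comes from the discussion; the class, its constructor, and `rx_length_ok` are hypothetical simplifications, and the role mapping (index 0 receives the payload, index 1 sends it) is an assumption based on the earlier hunks.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified stand-in for the test runner.
class runner_sketch {
public:
    // The expected receive length is decided once, at setup time:
    // the payload receiver expects the full message, the sender expects
    // only the 1-byte final ack.
    runner_sketch(unsigned my_index, std::size_t message_size) :
        m_am_rx_length((my_index == 0) ? message_size : 1)
    {
    }

    // Data-path check: one comparison against the cached value,
    // with no role lookup per message.
    bool rx_length_ok(std::size_t length) const
    {
        return length == m_am_rx_length;
    }

private:
    std::size_t m_am_rx_length;
};
```

This keeps the check as strict as the two role-specific assertions while keeping the role decision off the data path, and every assertion that compares against the cached length picks up the adjustment.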
Force-pushed from dfd13d8 to 478d780
Force-pushed from 478d780 to 7608569
What?
When ucx_perftest -t ucp_am_bw is executed in debug mode, we hit an assertion that verifies that the received RNDV message length equals the configured message size.
Why?
This check fails on the final sync message, which has size 1.
How?
Adjust the assertion depending on the role: the receiver asserts on the payload size, while the sender may receive only the 1-byte final ack.