Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UCP/PROTO: UCX_PROTO_INFO=used option #10395

Open
wants to merge 14 commits into
base: master
Choose a base branch
from
Open
1 change: 1 addition & 0 deletions src/ucp/core/ucp_context.c
Original file line number Diff line number Diff line change
Expand Up @@ -476,6 +476,7 @@ static ucs_config_field_t ucp_context_config_table[] = {
"Enable printing protocols information. The value is interpreted as follows:\n"
" 'y' : Print information for all protocols\n"
" 'n' : Do not print any protocol information\n"
" used : Print information for used protocols\n"
" glob_pattern : Print information for operations matching the glob pattern.\n"
" For example: '*tag*gpu*', '*put*fast*host*'",
ucs_offsetof(ucp_context_config_t, proto_info), UCS_CONFIG_TYPE_STRING},
Expand Down
16 changes: 16 additions & 0 deletions src/ucp/core/ucp_worker.c
Original file line number Diff line number Diff line change
Expand Up @@ -2228,6 +2228,21 @@ static void ucp_worker_keepalive_reset(ucp_worker_h worker)
worker->keepalive.round_count = 0;
}

static void ucp_worker_trace_configs(ucp_worker_h worker)
{
ucp_ep_config_t *ep_config;
ucp_rkey_config_t *rkey_config;

ucs_array_for_each(ep_config, &worker->ep_config) {
ucp_proto_select_trace(worker, &ep_config->proto_select);
}

ucs_carray_for_each(rkey_config, worker->rkey_config,
worker->rkey_config_count) {
ucp_proto_select_trace(worker, &rkey_config->proto_select);
}
}

static void ucp_worker_destroy_configs(ucp_worker_h worker)
{
ucp_ep_config_t *ep_config;
Expand Down Expand Up @@ -2878,6 +2893,7 @@ static void ucp_worker_destroy_eps(ucp_worker_h worker,
void ucp_worker_destroy(ucp_worker_h worker)
{
ucs_debug("destroy worker %p", worker);
ucp_worker_trace_configs(worker);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems we display the data only during working destroy, but what if we have a long running application and we want to show the info without waiting for app exit?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This scenario is not really covered by the initial impl, as it does not seem to be a mainstream use case.
Here are 2 ideas what we can do:

  1. Display all the protocols as before with UCX_PRORO_INFO=y option
  2. Implement periodic timer (with configurable interval, 10s default) in worker using timerfd_create and timerfd_settime, that would trigger ucp_worker_trace_configs.

Personally I believe it's not really worth the effort. But if you think it's useful, can be done maybe in the next PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we print the info after the count reaches a certain threshold?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In short - I think that may not be always reliable, and requires operations on fast-path
Currently the idea is that we support this feature in 2 modes:

  • Debug mode (ENABLE_DEBUG_DATA) - when we get precise statistics. Requires adding increment operation to the fast path.
  • Release mode - do not use counter, but only detect used protocols in the slow path. This way we don't introduce delays to the fast-path.
    Btw, what's your opinion on that? Should have have these 2 modes of operation, or we just enable this feature always?

Basically in release mode the counter is never incremented above 1, so we cannot rely on this.
And in debug mode this would require to introduce one more if branch to the fast-path.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO we need to try making the feature available also in release mode without adding overhead when it is not enabled

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I implemented it with absolutely zero overhead when the feature is disabled.
It required some additions to the logger, hopefully you like it.
If not, we can remove logger event listener and have one extra if branch in the fast-path.
Please let me know

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks to @brminich , I removed this logger stuff, and found the right place for the increment.
Now it's really zero overhead when feature is disabled


UCS_ASYNC_BLOCK(&worker->async);
uct_worker_progress_unregister_safe(worker->uct, &worker->keepalive.cb_id);
Expand Down
3 changes: 3 additions & 0 deletions src/ucp/proto/proto.h
Original file line number Diff line number Diff line change
Expand Up @@ -141,6 +141,9 @@ typedef struct {

/* Map of used lanes */
ucp_lane_map_t lane_map;

/* Selected count */
size_t selected_count;
rakhmets marked this conversation as resolved.
Show resolved Hide resolved
} ucp_proto_query_attr_t;


Expand Down
66 changes: 53 additions & 13 deletions src/ucp/proto/proto_debug.c
Original file line number Diff line number Diff line change
Expand Up @@ -61,6 +61,7 @@ struct ucp_proto_perf_node {

/* Protocol information table */
typedef struct {
char counter_str[32];
char range_str[32];
char desc[UCP_PROTO_DESC_STR_MAX];
char config[UCP_PROTO_CONFIG_STR_MAX];
Expand Down Expand Up @@ -153,42 +154,69 @@ static void ucp_proto_table_row_separator(ucs_string_buffer_t *strb,
}

static int ucp_proto_debug_is_info_enabled(ucp_context_h context,
const char *select_param_str)
const char *select_param_str,
int show_used)
{
const char *proto_info_config = context->config.ext.proto_info;
int bool_value;

if (show_used) {
return !strcmp(proto_info_config, "used");
}

if (ucs_config_sscanf_bool(proto_info_config, &bool_value, NULL)) {
return bool_value;
}

return fnmatch(proto_info_config, select_param_str, FNM_CASEFOLD) == 0;
}

static inline size_t
ucp_proto_select_elem_selected_count(const ucp_proto_select_elem_t *select_elem)
{
#ifdef ENABLE_STATS
const ucp_proto_threshold_elem_t *thresh_elem = select_elem->thresholds;
size_t total_selected_count = 0;

do {
total_selected_count += thresh_elem->proto_config.selected_count;
} while ((thresh_elem++)->max_msg_length < SIZE_MAX);

return total_selected_count;
#else
return 0;
#endif
}

static void
ucp_proto_select_elem_info(ucp_worker_h worker,
ucp_worker_cfg_index_t ep_cfg_index,
ucp_worker_cfg_index_t rkey_cfg_index,
const ucp_proto_select_param_t *select_param,
ucp_proto_select_elem_t *select_elem, int show_all,
ucs_string_buffer_t *strb)
int show_used, ucs_string_buffer_t *strb)
{
UCS_STRING_BUFFER_ONSTACK(ep_cfg_strb, UCP_PROTO_CONFIG_STR_MAX);
UCS_STRING_BUFFER_ONSTACK(sel_param_strb, UCP_PROTO_CONFIG_STR_MAX);
static const char *info_row_fmt = "| %*s | %-*s | %-*s |\n";
static const char *info_row_fmt = "| %s%*s | %-*s | %-*s |\n";
ucp_proto_info_table_t table;
int hdr_col_width[2], col_width[3];
ucp_proto_query_attr_t proto_attr;
ucp_proto_info_row_t *row_elem;
size_t range_start, range_end;
int proto_valid;

if (show_used && (0 == ucp_proto_select_elem_selected_count(select_elem))) {
return;
}

ucp_proto_select_param_dump(worker, ep_cfg_index, rkey_cfg_index,
select_param, ucp_operation_descs, &ep_cfg_strb,
&sel_param_strb);
if (!show_all &&
!ucp_proto_debug_is_info_enabled(
worker->context, ucs_string_buffer_cstr(&sel_param_strb))) {
worker->context, ucs_string_buffer_cstr(&sel_param_strb),
show_used)) {
return;
}

Expand Down Expand Up @@ -219,7 +247,16 @@ ucp_proto_select_elem_info(ucp_worker_h worker,
ucs_memunits_range_str(range_start, range_end, row_elem->range_str,
sizeof(row_elem->range_str));

col_width[0] = ucs_max(col_width[0], strlen(row_elem->range_str));
if (show_used && (proto_attr.selected_count > 0)) {
ucs_snprintf_safe(row_elem->counter_str,
sizeof(row_elem->counter_str), "%zu ",
proto_attr.selected_count);
} else {
row_elem->counter_str[0] = '\0';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe print 0 counter (if it is even possible ho hit this code in this case)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's easily doable, I thought about this, but IMO having zeroes in the output does not improve readability:

+---------------------------------+----------------------------------------------------------------------------------+
| ucp_context_47 intra-node cfg#0 | rendezvous data fetch into host memory from host                                 |
+---------------------------------+---------------------------------+------------------------------------------------+
|    0                          0 | no data fetch                   |                                                |
| 1000                    1..9109 | (?) fragmented copy-in copy-out | rc_mlx5/mlx5_0:1 50% on path0 and 50% on path1 |
|    0                  9110..inf | (?) zero-copy                   | rc_mlx5/mlx5_0:1 50% on path0 and 50% on path1 |
+---------------------------------+---------------------------------+------------------------------------------------+

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

imo it is better, than showing the counter only for some rows, but not others

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, will be done

}

col_width[0] = ucs_max(col_width[0], strlen(row_elem->counter_str) +
strlen(row_elem->range_str));
col_width[1] = ucs_max(col_width[1], strlen(row_elem->desc));
col_width[2] = ucs_max(col_width[2], strlen(row_elem->config));
} while (range_end != SIZE_MAX);
Expand All @@ -241,7 +278,8 @@ ucp_proto_select_elem_info(ucp_worker_h worker,
/* Print contents */
ucp_proto_table_row_separator(strb, col_width, 3);
ucs_array_for_each(row_elem, &table) {
ucs_string_buffer_appendf(strb, info_row_fmt, col_width[0],
ucs_string_buffer_appendf(strb, info_row_fmt, row_elem->counter_str,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the meaning of these numbers will not be obvious to users. Please add the value to a separate column with the name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I followed the existing design - as you can see there are no really named columns, e.g. for message range. The column name is actually just the config name. So if I add a new named column just for selections count - that would look ridiculous IMO

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps, we should add a row with column names: size range / protocol / resources / use count.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about the added value of this change.
I mean, to me it seems this is a debug feature meant to be used mostly by UCX developers who know what to look for. This probably explains why these column names were not introduced so far - there is no much benefit I guess.

On the other hand - it would also increase the output size - and this is exactly the issue I'm trying to solve with this commit.

col_width[0] - strlen(row_elem->counter_str),
row_elem->range_str, col_width[1],
row_elem->desc, col_width[2],
row_elem->config);
Expand All @@ -262,7 +300,7 @@ void ucp_proto_select_info(ucp_worker_h worker,

kh_foreach(proto_select->hash, key.u64, select_elem,
ucp_proto_select_elem_info(worker, ep_cfg_index, rkey_cfg_index,
&key.param, &select_elem, show_all,
&key.param, &select_elem, show_all, 0,
strb);
ucs_string_buffer_appendf(strb, "\n"))
}
Expand Down Expand Up @@ -964,7 +1002,7 @@ ucp_proto_select_write_info(ucp_worker_h worker,
ucp_operation_names, &ep_cfg_strb,
&sel_param_strb);
if (!ucp_proto_debug_is_info_enabled(
worker->context, ucs_string_buffer_cstr(&sel_param_strb))) {
worker->context, ucs_string_buffer_cstr(&sel_param_strb), 0)) {
goto out;
}

Expand Down Expand Up @@ -1029,17 +1067,19 @@ ucp_proto_select_write_info(ucp_worker_h worker,
}

void ucp_proto_select_elem_trace(ucp_worker_h worker,
ucp_worker_cfg_index_t ep_cfg_index,
ucp_worker_cfg_index_t rkey_cfg_index,
const ucp_proto_select_param_t *select_param,
ucp_proto_select_elem_t *select_elem)
ucp_proto_select_elem_t *select_elem,
int show_used)
{
ucs_string_buffer_t strb = UCS_STRING_BUFFER_INITIALIZER;
const ucp_proto_config_t *proto_config = &select_elem->thresholds[0].proto_config;
ucp_worker_cfg_index_t ep_cfg_index = proto_config->ep_cfg_index;
ucp_worker_cfg_index_t rkey_cfg_index = proto_config->rkey_cfg_index;
ucs_string_buffer_t strb = UCS_STRING_BUFFER_INITIALIZER;
char *line;

/* Print human-readable protocol selection table to the log */
ucp_proto_select_elem_info(worker, ep_cfg_index, rkey_cfg_index,
select_param, select_elem, 0, &strb);
select_param, select_elem, 0, show_used, &strb);
ucs_string_buffer_for_each_token(line, &strb, "\n") {
ucs_log_print_compact(line);
}
Expand Down
5 changes: 2 additions & 3 deletions src/ucp/proto/proto_debug.h
Original file line number Diff line number Diff line change
Expand Up @@ -153,10 +153,9 @@ void ucp_proto_perf_node_replace(ucp_proto_perf_node_t **old_perf_node_p,


void ucp_proto_select_elem_trace(ucp_worker_h worker,
ucp_worker_cfg_index_t ep_cfg_index,
ucp_worker_cfg_index_t rkey_cfg_index,
const ucp_proto_select_param_t *select_param,
ucp_proto_select_elem_t *select_elem);
ucp_proto_select_elem_t *select_elem,
int show_used);


void ucp_proto_select_write_info(
Expand Down
31 changes: 28 additions & 3 deletions src/ucp/proto/proto_select.c
Original file line number Diff line number Diff line change
Expand Up @@ -298,6 +298,9 @@ static ucs_status_t ucp_proto_select_elem_add_envelope(
proto_config->rkey_cfg_index = rkey_cfg_index;
proto_config->select_param = *select_param;
proto_config->init_elem = proto;
#ifdef ENABLE_STATS
proto_config->selected_count = 0;
#endif
*last_proto_idx = proto_idx;
}

Expand Down Expand Up @@ -492,8 +495,7 @@ ucp_proto_select_elem_init(ucp_worker_h worker, int internal,
ucp_proto_select_wiface_activate(worker, select_elem, ep_cfg_index);

if (!internal) {
ucp_proto_select_elem_trace(worker, ep_cfg_index, rkey_cfg_index,
&select_param_copy, select_elem);
ucp_proto_select_elem_trace(worker, &select_param_copy, select_elem, 0);
}

status = UCS_OK;
Expand Down Expand Up @@ -580,11 +582,24 @@ void ucp_proto_select_cleanup(ucp_proto_select_t *proto_select)
ucp_proto_select_elem_t select_elem;

kh_foreach_value(proto_select->hash, select_elem,
ucp_proto_select_elem_cleanup(&select_elem)
ucp_proto_select_elem_cleanup(&select_elem)
)
kh_destroy(ucp_proto_select_hash, proto_select->hash);
}

void ucp_proto_select_trace(ucp_worker_h worker,
ucp_proto_select_t *proto_select)
{
#ifdef ENABLE_STATS
ucp_proto_select_elem_t select_elem;
ucp_proto_select_key_t key;

kh_foreach(proto_select->hash, key.u64, select_elem,
ucp_proto_select_elem_trace(worker, &key.param, &select_elem, 1);
)
#endif
}

void ucp_proto_select_add_proto(const ucp_proto_init_params_t *init_params,
size_t cfg_thresh, unsigned cfg_priority,
ucp_proto_perf_t *perf, const void *priv,
Expand Down Expand Up @@ -719,6 +734,11 @@ void ucp_proto_select_short_init(ucp_worker_h worker,
goto out_disable;
}

#ifdef ENABLE_STATS
/* Revert stats counter for probe operation */
--((ucp_proto_threshold_elem_t *)thresh)->proto_config.selected_count;
#endif

ucs_assert(thresh->proto_config.proto != NULL);
if (!ucs_test_all_flags(thresh->proto_config.proto->flags,
proto_flags)) {
Expand Down Expand Up @@ -864,6 +884,11 @@ int ucp_proto_select_elem_query(ucp_worker_h worker,

proto_attr->max_msg_length = ucs_min(proto_attr->max_msg_length,
thresh_elem->max_msg_length);
#ifdef ENABLE_STATS
proto_attr->selected_count = proto_config->selected_count;
#else
proto_attr->selected_count = 0;
#endif

return !(thresh_elem->proto_config.proto->flags & UCP_PROTO_FLAG_INVALID);
}
9 changes: 9 additions & 0 deletions src/ucp/proto/proto_select.h
Original file line number Diff line number Diff line change
Expand Up @@ -121,6 +121,11 @@ typedef struct {

/* Pointer to the corresponding initialization data */
const ucp_proto_init_elem_t *init_elem;

#ifdef ENABLE_STATS
rakhmets marked this conversation as resolved.
Show resolved Hide resolved
/* Usage counter */
size_t selected_count;
#endif
} ucp_proto_config_t;


Expand Down Expand Up @@ -183,6 +188,10 @@ ucs_status_t ucp_proto_select_init(ucp_proto_select_t *proto_select);
void ucp_proto_select_cleanup(ucp_proto_select_t *proto_select);


void ucp_proto_select_trace(ucp_worker_h worker,
ucp_proto_select_t *proto_select);


void ucp_proto_select_add_proto(const ucp_proto_init_params_t *init_params,
size_t cfg_thresh, unsigned cfg_priority,
ucp_proto_perf_t *perf, const void *priv,
Expand Down
7 changes: 6 additions & 1 deletion src/ucp/proto/proto_select.inl
Original file line number Diff line number Diff line change
Expand Up @@ -81,6 +81,7 @@ ucp_proto_select_lookup(ucp_worker_h worker, ucp_proto_select_t *proto_select,
size_t msg_length)
{
const ucp_proto_select_elem_t *select_elem;
const ucp_proto_threshold_elem_t *thresh;
ucp_proto_select_key_t key;
khiter_t khiter;

Expand Down Expand Up @@ -108,7 +109,11 @@ ucp_proto_select_lookup(ucp_worker_h worker, ucp_proto_select_t *proto_select,
proto_select->cache.value = select_elem;
}

return ucp_proto_select_thresholds_search(select_elem, msg_length);
thresh = ucp_proto_select_thresholds_search(select_elem, msg_length);
#ifdef ENABLE_STATS
++((ucp_proto_threshold_elem_t *)thresh)->proto_config.selected_count;
#endif
return thresh;
}

/*
Expand Down
Loading