mpir/nodemap: use fallback collectives during init
In MPIR_nodeid_init use MPIR_Allgather_fallback and MPIR_Bcast_fallback
to avoid the complication of collective algorithm selection.

This causes an issue here because the bcast smp_new algorithm does not yet
have a proper CVAR fallback check. The proper fix needs to add coll_attr to
most communicator checking routines, and will require coll_attr to be
added universally to all collective interfaces, including nonblocking and
persistent collectives. Let's postpone that big change for now.
hzhou committed Aug 13, 2024
1 parent c12140e commit 8786d56
Showing 2 changed files with 6 additions and 5 deletions.
1 change: 1 addition & 0 deletions src/include/mpir_coll.h
@@ -42,6 +42,7 @@
  * are safe to use during init. They are all intra algorithms.
  */
 #define MPIR_Barrier_fallback MPIR_Barrier_intra_dissemination
+#define MPIR_Bcast_fallback MPIR_Bcast_intra_binomial
 #define MPIR_Allgather_fallback MPIR_Allgather_intra_brucks
 #define MPIR_Allgatherv_fallback MPIR_Allgatherv_intra_brucks
 #define MPIR_Allreduce_fallback MPIR_Allreduce_intra_recursive_doubling
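
The fallback macros in this hunk are plain compile-time aliases to one fixed intra-communicator algorithm each. The following is a minimal sketch of that idea in isolation, not MPICH code: every `toy_*` name is hypothetical, and `toy_selected_algo` only stands in for a runtime tunable such as a CVAR. It shows that a call through the alias never consults the runtime selection state.

```c
#include <assert.h>
#include <stddef.h>

/* All toy_* names are hypothetical and exist only for this sketch. */
typedef enum { TOY_ALGO_NONE, TOY_ALGO_BINOMIAL, TOY_ALGO_SMP } toy_algo_t;

/* Runtime setting consulted by the selection path (stands in for a CVAR). */
static toy_algo_t toy_selected_algo = TOY_ALGO_SMP;

/* Records which algorithm ran last, so callers can observe the dispatch. */
static toy_algo_t toy_last_algo = TOY_ALGO_NONE;

static int toy_bcast_intra_binomial(int *buf, size_t count)
{
    assert(buf != NULL || count == 0);
    toy_last_algo = TOY_ALGO_BINOMIAL;
    return 0;
}

static int toy_bcast_intra_smp(int *buf, size_t count)
{
    assert(buf != NULL || count == 0);
    toy_last_algo = TOY_ALGO_SMP;
    return 0;
}

/* Selection path: the algorithm that runs depends on runtime state. */
static int toy_bcast_impl(int *buf, size_t count)
{
    switch (toy_selected_algo) {
        case TOY_ALGO_SMP:
            return toy_bcast_intra_smp(buf, count);
        default:
            return toy_bcast_intra_binomial(buf, count);
    }
}

/* Fallback path: a compile-time alias to one fixed algorithm, so the
 * runtime setting above is never consulted. */
#define toy_bcast_fallback toy_bcast_intra_binomial
```

With `toy_selected_algo` left at `TOY_ALGO_SMP`, calling `toy_bcast_impl` runs the SMP variant, while `toy_bcast_fallback` always runs the binomial variant.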
10 changes: 5 additions & 5 deletions src/util/mpir_nodemap.c
@@ -451,16 +451,16 @@ int MPIR_nodeid_init(void)
                              MPIR_Strerror(errno, strerrbuf, MPIR_STRERROR_BUF_SIZE), errno);
         my_hostname[MAX_HOSTNAME_LEN - 1] = '\0';
 
-        mpi_errno = MPIR_Allgather_impl(MPI_IN_PLACE, MAX_HOSTNAME_LEN, MPI_CHAR,
-                                        allhostnames, MAX_HOSTNAME_LEN, MPI_CHAR,
-                                        node_roots_comm, MPIR_ERR_NONE);
+        mpi_errno = MPIR_Allgather_fallback(MPI_IN_PLACE, MAX_HOSTNAME_LEN, MPI_CHAR,
+                                            allhostnames, MAX_HOSTNAME_LEN, MPI_CHAR,
+                                            node_roots_comm, MPIR_ERR_NONE);
         MPIR_ERR_CHECK(mpi_errno);
     }
 
     MPIR_Comm *node_comm = MPIR_Process.comm_world->node_comm;
     if (node_comm) {
-        mpi_errno = MPIR_Bcast_impl(allhostnames, MAX_HOSTNAME_LEN * MPIR_Process.num_nodes,
-                                    MPI_CHAR, 0, node_comm, MPIR_ERR_NONE);
+        mpi_errno = MPIR_Bcast_fallback(allhostnames, MAX_HOSTNAME_LEN * MPIR_Process.num_nodes,
+                                        MPI_CHAR, 0, node_comm, MPIR_ERR_NONE);
         MPIR_ERR_CHECK(mpi_errno);
     }

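The hunk above gathers one fixed-width, null-terminated hostname per node root into `allhostnames`, then broadcasts `MAX_HOSTNAME_LEN * MPIR_Process.num_nodes` characters within each node. The sketch below models only that flat buffer layout in plain C, without MPI. It assumes that slot `i` holds the hostname contributed by rank `i` of `node_roots_comm`, as in-place allgather semantics imply. `TOY_MAX_HOSTNAME_LEN` and both helper functions are hypothetical, and the real value of `MAX_HOSTNAME_LEN` does not appear in this diff.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical slot width; the real MAX_HOSTNAME_LEN is not shown in the diff. */
#define TOY_MAX_HOSTNAME_LEN 16

/* Copy `name` into slot `rank` of the flat table, truncating if needed and
 * forcing a terminating '\0' in the last byte of the slot. */
static void toy_store_hostname(char *table, int rank, const char *name)
{
    assert(table != NULL && name != NULL && rank >= 0);
    char *slot = table + (size_t) rank * TOY_MAX_HOSTNAME_LEN;
    strncpy(slot, name, TOY_MAX_HOSTNAME_LEN - 1);
    slot[TOY_MAX_HOSTNAME_LEN - 1] = '\0';
}

/* Return a pointer to the hostname stored in slot `rank`. */
static const char *toy_hostname_at(const char *table, int rank)
{
    assert(table != NULL && rank >= 0);
    return table + (size_t) rank * TOY_MAX_HOSTNAME_LEN;
}
```

Because every slot has the same width, a receiver can index any node's hostname directly from its position in the table, and a single broadcast of the whole buffer carries all of them.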