From 0b47caf9e8289bd0f3f7d7d49924ce2ef77fff72 Mon Sep 17 00:00:00 2001
From: David Ozog
Date: Fri, 30 Aug 2024 11:56:43 -0400
Subject: [PATCH 1/2] collectives: clarify src buffer entry requirements

---
 content/shmem_alltoall.tex   | 12 ++++++++----
 content/shmem_broadcast.tex  | 15 +++++++++------
 content/shmem_collect.tex    | 11 +++++++++++
 content/shmem_reductions.tex | 12 ++++++++----
 content/shmem_scan.tex       | 13 ++++++++++---
 5 files changed, 46 insertions(+), 17 deletions(-)

diff --git a/content/shmem_alltoall.tex b/content/shmem_alltoall.tex
index 4e145c26..90440511 100644
--- a/content/shmem_alltoall.tex
+++ b/content/shmem_alltoall.tex
@@ -100,12 +100,16 @@
     If \VAR{team} compares equal to \LibConstRef{SHMEM\_TEAM\_INVALID} or is
     otherwise invalid, the behavior is undefined.
 
-    Before any \ac{PE} calls a \FUNC{shmem\_alltoall} routine,
-    the following conditions must be ensured:
+    Before any \ac{PE} calls a \FUNC{shmem\_alltoall} routine, the following
+    conditions must be ensured, otherwise the behavior is undefined:
     \begin{itemize}
-        \item The \VAR{dest} data object on all \acp{PE} in the team is
-          ready to accept the \FUNC{shmem\_alltoall} data.
+        \item The \dest{} array on all \acp{PE} in the team is ready to
+          accept the result of the operation.
+        \item The \source{} buffer at the local \ac{PE} is ready to be
+          read by any \ac{PE} in the team.
     \end{itemize}
+    The application does not need to synchronize to ensure that the \source{}
+    buffer is ready across all \acp{PE} prior to calling this routine.
 
     Upon return from a \FUNC{shmem\_alltoall} routine, the following is true for
     the local PE:
diff --git a/content/shmem_broadcast.tex b/content/shmem_broadcast.tex
index d67c2fb0..05b67068 100644
--- a/content/shmem_broadcast.tex
+++ b/content/shmem_broadcast.tex
@@ -85,13 +85,16 @@
          the team.
     \end{itemize}
 
-    Before any \ac{PE} calls a broadcast routine, the following
-    conditions must be ensured:
+    Before any \ac{PE} calls a broadcast routine, the following conditions
+    must be ensured, otherwise the behavior is undefined:
     \begin{itemize}
-        \item The \dest{} array on all \acp{PE} participating in the broadcast
-          is ready to accept the broadcast data.
-    \end{itemize}
-    Otherwise, the behavior is undefined.
+        \item The \dest{} array on all \acp{PE} in the team is ready to
+          accept the result of the operation.
+        \item The \source{} buffer at the local root \ac{PE} is ready to be
+          read by any \ac{PE} in the team.
+    \end{itemize}
+    The application does not need to synchronize to ensure that the \source{}
+    buffer is ready across all \acp{PE} prior to calling this routine.
 
     Upon return from a team-based broadcast routine, the following are true for
     the local \ac{PE}:
diff --git a/content/shmem_collect.tex b/content/shmem_collect.tex
index d14d8f17..479c93e2 100644
--- a/content/shmem_collect.tex
+++ b/content/shmem_collect.tex
@@ -88,6 +88,17 @@
     If \VAR{team} compares equal to \LibConstRef{SHMEM\_TEAM\_INVALID} or is
     otherwise invalid, the behavior is undefined.
 
+    Before any \ac{PE} calls a collect routine, the following conditions must
+    be ensured, otherwise the behavior is undefined:
+    \begin{itemize}
+        \item The \dest{} array on all \acp{PE} in the team is ready to
+          accept the result of the operation.
+        \item The \source{} buffer at the local \ac{PE} is ready to be read
+          by any \ac{PE} in the team.
+    \end{itemize}
+    The application does not need to synchronize to ensure that the \source{}
+    buffer is ready across all \acp{PE} prior to calling this routine.
+
     \begin{DeprecateBlock}
     \openshmem \FUNC{collect} and \FUNC{fcollect} routines perform a collective
     operation to concatenate \VAR{nelems}
diff --git a/content/shmem_reductions.tex b/content/shmem_reductions.tex
index 46cb0abe..888a51e1 100644
--- a/content/shmem_reductions.tex
+++ b/content/shmem_reductions.tex
@@ -295,12 +295,16 @@ \subsubsubsection{PROD}
     If \VAR{team} compares equal to \LibConstRef{SHMEM\_TEAM\_INVALID} or is
     otherwise invalid, the behavior is undefined.
 
-    Before any \ac{PE} calls a reduction routine, the following conditions must be ensured:
+    Before any \ac{PE} calls a reduction routine, the following conditions
+    must be ensured, otherwise the behavior is undefined:
     \begin{itemize}
-        \item The \dest{} array on all \acp{PE} participating in the reduction
-          is ready to accept the results of the \OPR{reduction}.
+        \item The \dest{} array on all \acp{PE} in the team is ready to
+          accept the results of the operation.
+        \item The \source{} buffer at the local \ac{PE} is ready to be read by
+          any \ac{PE} in the team.
     \end{itemize}
-    Otherwise, the behavior is undefined.
+    The application does not need to synchronize to ensure that the \source{}
+    buffer is ready across all \acp{PE} prior to calling this routine.
 
     Upon return from a reduction routine, the following are true for the local
     \ac{PE}:
diff --git a/content/shmem_scan.tex b/content/shmem_scan.tex
index 618a51a0..185c52d1 100644
--- a/content/shmem_scan.tex
+++ b/content/shmem_scan.tex
@@ -86,9 +86,16 @@
     \LibConstRef{SHMEM\_TEAM\_INVALID} or is otherwise invalid, the
     behavior is undefined.
 
-    Before any \ac{PE} calls a scan routine, the \dest{} array on all
-    \acp{PE} participating in the operation must be ready to accept the
-    results of the operation. Otherwise, the behavior is undefined.
+    Before any \ac{PE} calls a scan routine, the following conditions must be
+    ensured, otherwise the behavior is undefined:
+    \begin{itemize}
+        \item The \dest{} array on all \acp{PE} in the team is ready to accept
+          the result of the operation.
+        \item The \source{} buffer at the local \ac{PE} is ready to be read by
+          any \ac{PE} in the team.
+    \end{itemize}
+    The application does not need to synchronize to ensure that the \source{}
+    buffer is ready across all \acp{PE} prior to calling this routine.
 
     Upon return from a scan routine, the following are true for the local
     \ac{PE}: the \dest{} array is updated, and the \source{} array

From 8095ea451dfcd3f5f48da7affc668fe8095b87e3 Mon Sep 17 00:00:00 2001
From: David Ozog
Date: Fri, 30 Aug 2024 15:29:21 -0400
Subject: [PATCH 2/2] collectives: "array" instead of source "buffer"

---
 content/shmem_alltoall.tex   | 4 ++--
 content/shmem_broadcast.tex  | 4 ++--
 content/shmem_collect.tex    | 4 ++--
 content/shmem_reductions.tex | 4 ++--
 content/shmem_scan.tex       | 4 ++--
 5 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/content/shmem_alltoall.tex b/content/shmem_alltoall.tex
index 90440511..ba0b43a7 100644
--- a/content/shmem_alltoall.tex
+++ b/content/shmem_alltoall.tex
@@ -105,11 +105,11 @@
     \begin{itemize}
         \item The \dest{} array on all \acp{PE} in the team is ready to
           accept the result of the operation.
-        \item The \source{} buffer at the local \ac{PE} is ready to be
+        \item The \source{} array at the local \ac{PE} is ready to be
           read by any \ac{PE} in the team.
     \end{itemize}
     The application does not need to synchronize to ensure that the \source{}
-    buffer is ready across all \acp{PE} prior to calling this routine.
+    array is ready across all \acp{PE} prior to calling this routine.
 
     Upon return from a \FUNC{shmem\_alltoall} routine, the following is true for
     the local PE:
diff --git a/content/shmem_broadcast.tex b/content/shmem_broadcast.tex
index 05b67068..bd936b5f 100644
--- a/content/shmem_broadcast.tex
+++ b/content/shmem_broadcast.tex
@@ -90,11 +90,11 @@
     \begin{itemize}
         \item The \dest{} array on all \acp{PE} in the team is ready to
          accept the result of the operation.
-        \item The \source{} buffer at the local root \ac{PE} is ready to be
+        \item The \source{} array at the local root \ac{PE} is ready to be
          read by any \ac{PE} in the team.
     \end{itemize}
     The application does not need to synchronize to ensure that the \source{}
-    buffer is ready across all \acp{PE} prior to calling this routine.
+    array is ready across all \acp{PE} prior to calling this routine.
 
     Upon return from a team-based broadcast routine, the following are true for
     the local \ac{PE}:
diff --git a/content/shmem_collect.tex b/content/shmem_collect.tex
index 479c93e2..b7e2d3fa 100644
--- a/content/shmem_collect.tex
+++ b/content/shmem_collect.tex
@@ -93,11 +93,11 @@
     \begin{itemize}
         \item The \dest{} array on all \acp{PE} in the team is ready to
          accept the result of the operation.
-        \item The \source{} buffer at the local \ac{PE} is ready to be read
+        \item The \source{} array at the local \ac{PE} is ready to be read
          by any \ac{PE} in the team.
     \end{itemize}
     The application does not need to synchronize to ensure that the \source{}
-    buffer is ready across all \acp{PE} prior to calling this routine.
+    array is ready across all \acp{PE} prior to calling this routine.
 
     \begin{DeprecateBlock}
     \openshmem \FUNC{collect} and \FUNC{fcollect} routines perform a collective
diff --git a/content/shmem_reductions.tex b/content/shmem_reductions.tex
index 888a51e1..fa48bb3d 100644
--- a/content/shmem_reductions.tex
+++ b/content/shmem_reductions.tex
@@ -300,11 +300,11 @@ \subsubsubsection{PROD}
     \begin{itemize}
         \item The \dest{} array on all \acp{PE} in the team is ready to
          accept the results of the operation.
-        \item The \source{} buffer at the local \ac{PE} is ready to be read by
+        \item The \source{} array at the local \ac{PE} is ready to be read by
          any \ac{PE} in the team.
     \end{itemize}
     The application does not need to synchronize to ensure that the \source{}
-    buffer is ready across all \acp{PE} prior to calling this routine.
+    array is ready across all \acp{PE} prior to calling this routine.
 
     Upon return from a reduction routine, the following are true for the local
     \ac{PE}:
diff --git a/content/shmem_scan.tex b/content/shmem_scan.tex
index 185c52d1..35338a51 100644
--- a/content/shmem_scan.tex
+++ b/content/shmem_scan.tex
@@ -91,11 +91,11 @@
     \begin{itemize}
         \item The \dest{} array on all \acp{PE} in the team is ready to accept
          the result of the operation.
-        \item The \source{} buffer at the local \ac{PE} is ready to be read by
+        \item The \source{} array at the local \ac{PE} is ready to be read by
          any \ac{PE} in the team.
     \end{itemize}
     The application does not need to synchronize to ensure that the \source{}
-    buffer is ready across all \acp{PE} prior to calling this routine.
+    array is ready across all \acp{PE} prior to calling this routine.
 
     Upon return from a scan routine, the following are true for the local
     \ac{PE}: the \dest{} array is updated, and the \source{} array