-
Notifications
You must be signed in to change notification settings - Fork 280
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
typerep/yaksa: always pass info hint for yaksa pack/unpack #5005
Conversation
af8f2f7
to
974d679
Compare
The tests will fail until yaksa is updated with pmodels/yaksa#169 EDIT: updated by #5072 |
759be31
to
5f92b68
Compare
test:mpich/ch4/ofi |
test:mpich/ch4/most |
This is the key to use when MPIR_CVAR_ENABLE_GPU is off or when we know both in_buf and out_buf are host memory.
test:mpich/ch4/most |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of minor comments.
src/mpid/ch4/shm/src/shm_init.c
Outdated
ret = MPIDI_IPC_mpi_init_hook(rank, size, tag_bits); | ||
MPIR_ERR_CHECK(ret); | ||
if (MPIR_CVAR_ENABLE_GPU) { | ||
ret = MPIDI_IPC_mpi_init_hook(rank, size, tag_bits); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will disable XPMEM if there is no GPU, which I don't think should happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I should move it to just where the gpu gets initialized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. Now it is around MPIDI_GPU_mpi_init_hook
.
src/mpl/include/mpl_gpu.h
Outdated
@@ -29,6 +29,7 @@ typedef enum { | |||
typedef struct { | |||
MPL_pointer_type_t type; | |||
MPL_gpu_device_handle_t device; | |||
MPL_gpu_device_attr attr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My only issue is the name of the struct member. Maybe device_attr
or dev_attr
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good.
An unrelated side note: it appears to me that at least cuda does its internal query cache/optimization already. We need measure the overhead of passing info and without passing info to yaksa pack/unpack. Maybe it not necessary to do this opmization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. Renamed to device_attr
.
MPL_STATIC_INLINE_PREFIX int MPIR_gpu_register_host(const void *ptr, size_t size) | ||
{ | ||
if (ENABLE_GPU) { | ||
return MPL_gpu_register_host(ptr, size); | ||
} | ||
return MPI_SUCCESS; | ||
} | ||
|
||
MPL_STATIC_INLINE_PREFIX int MPIR_gpu_unregister_host(const void *ptr) | ||
{ | ||
if (ENABLE_GPU) { | ||
return MPL_gpu_unregister_host(ptr); | ||
} | ||
return MPI_SUCCESS; | ||
} | ||
|
||
MPL_STATIC_INLINE_PREFIX int MPIR_gpu_malloc_host(void **ptr, size_t size) | ||
{ | ||
if (ENABLE_GPU) { | ||
return MPL_gpu_malloc_host(ptr, size); | ||
} else { | ||
*ptr = MPL_malloc(size, MPL_MEM_BUFFER); | ||
return MPI_SUCCESS; | ||
} | ||
} | ||
|
||
MPL_STATIC_INLINE_PREFIX int MPIR_gpu_free_host(void *ptr) | ||
{ | ||
if (ENABLE_GPU) { | ||
return MPL_gpu_free_host(ptr); | ||
} else { | ||
MPL_free(ptr); | ||
return MPI_SUCCESS; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
To avoid multiple query of device pointer attribute, for example, between mpich and yaksa, we need cache or pass the queried attribute in hash or info hint. We need have the attr inside the MPL_pointer_attr_t to be able to do that.
Since we already queried pointer device attribute, we should always pass the info hint to yaksa to avoid additional pointer query inside yaksa. The code is locally restructured to always get info hints from attributes.
The checking of pointer attributes and datatype now is done in the main code.
When MPIR_CVAR_ENABLE_GPU is off, pass nogpu infohint to yaksa_init to prevent it query devices. This commit depends on yaksa PR pmodels#172.
We need consistently check MPIR_CVAR_ENABLE_GPU variable in order to consistently skip GPU access. Typical GPU driver such as CUDA have huge initialization cost. Thus we have to make sure to skip every access in order to skip the init latency.
test:mpich/ch4/ofi |
Pull Request Description
GPU pointer attribute query can be expensive. Since we have already done that in MPICH, we should avoid do it again in yaksa, thus need to pass in the info hint.
This PR depends on a new yaksa feature: pmodels/yaksa#169 and pass in the "nogpu" hint when no device memory is involved in either
inbuf
oroutbuf
. It will create a new yaksa hint when GPU memory is involved.Fixes #5002
Creating yaksa info hint incurs overhead --
malloc
,memcpy
,strcmp
-- so additional optimization may be needed. This is left as TODO.[skip warnings]
Expected Impact
Author Checklist
module: short description
and follows good practice