forked from aws/aws-ofi-nccl
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
When the net-plugin returns regIsGlobal=1 to NCCL (as part of the net-plugin getProperties() API), it signals to NCCL that registered MRs are global, in the sense that they can be used by all communicators. It also signals that the net-plugin has a fast MR cache, so that calling regMr() on the same buffer (same address and size) quickly returns the MR that was previously registered globally for that buffer.

When a user registers a buffer with NCCL through the ncclCommRegister() API and the net-plugin supports regIsGlobal=1, NCCL registers the buffer globally once (on each net device) with the regMr() API. When the net proxy thread starts to execute a communication task on a previously registered user buffer, it calls the net-plugin regMr() to quickly fetch the previously registered global MR from the plugin-managed MR cache.

Since the plugin now has such an MR cache, we can report registrations as global if 1. the MR scope for the libfabric provider is the domain, and 2. the plugin is using one domain per process.

Additionally, we do not report registrations as global for the SENDRECV protocol, because that protocol currently does not correctly handle the truncated send case (send size > recv size), which NCCL may use when regIsGlobal=1.

This reverts commit d9c416f, with additional changes.

Signed-off-by: Amedeo Sapio <[email protected]>
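The behaviour described in the commit message can be illustrated with a minimal C sketch. It is not the plugin's code: every name here (`mr_cache_t`, `mr_cache_reg`, `mr_cache_dereg`, `report_reg_is_global`) is hypothetical, the integer handle stands in for a real provider MR, and the fixed-size linear-scan table is an assumption chosen for brevity. It shows a cache keyed by (address, size) that returns the existing registration on a repeated call, plus the three conditions under which registrations would be reported as global.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: names and layout are illustrative only. */
#define MR_CACHE_CAPACITY 16

typedef struct {
    uintptr_t addr;    /* start address of the registered buffer */
    size_t    size;    /* length of the registered buffer */
    int       refcnt;  /* number of outstanding registrations */
    int       handle;  /* stand-in for a provider MR handle */
    bool      used;    /* slot is occupied */
} mr_entry_t;

typedef struct {
    mr_entry_t entries[MR_CACHE_CAPACITY];
    int        next_handle;    /* next handle value to hand out */
    int        registrations;  /* count of slow-path registrations */
} mr_cache_t;

/* Return the handle for (buf, size). A hit only bumps the refcount;
 * a miss performs a (simulated) registration. Returns -1 if full. */
int mr_cache_reg(mr_cache_t *cache, const void *buf, size_t size)
{
    uintptr_t addr = (uintptr_t)buf;
    mr_entry_t *free_slot = NULL;

    for (int i = 0; i < MR_CACHE_CAPACITY; i++) {
        mr_entry_t *e = &cache->entries[i];
        if (e->used && e->addr == addr && e->size == size) {
            e->refcnt++;          /* fast path: reuse the existing MR */
            return e->handle;
        }
        if (!e->used && free_slot == NULL)
            free_slot = e;
    }
    if (free_slot == NULL)
        return -1;

    /* Slow path: a real plugin would register with the provider here. */
    free_slot->addr   = addr;
    free_slot->size   = size;
    free_slot->refcnt = 1;
    free_slot->handle = cache->next_handle++;
    free_slot->used   = true;
    cache->registrations++;
    return free_slot->handle;
}

/* Drop one reference. Returns the remaining refcount, or -1 if the
 * handle is not in the cache. The slot is freed when it reaches 0. */
int mr_cache_dereg(mr_cache_t *cache, int handle)
{
    for (int i = 0; i < MR_CACHE_CAPACITY; i++) {
        mr_entry_t *e = &cache->entries[i];
        if (e->used && e->handle == handle) {
            if (--e->refcnt == 0)
                e->used = false;
            return e->refcnt;
        }
    }
    return -1;
}

/* Conditions from the commit message: MR scope is the domain, one
 * domain per process, and the protocol is not SENDRECV. */
bool report_reg_is_global(bool mr_scope_is_domain,
                          bool one_domain_per_process,
                          bool is_sendrecv_protocol)
{
    return mr_scope_is_domain && one_domain_per_process &&
           !is_sendrecv_protocol;
}
```

In this sketch, registering the same buffer twice with the same size yields the same handle and only one slow-path registration, which mirrors the fast lookup the proxy thread relies on. A different size on the same address is treated as a separate registration.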
1 parent 61f9007, commit e431f03
Showing 4 changed files with 42 additions and 60 deletions.