forked from microsoft/DeepSpeed
Commit
reduce all-to-all communication volume when both expert and non-expert are tensor-parallel (microsoft#5626)

Example: E + M + D parallel
world_size = 8
model_degree = 2
expert_degree = 4
mp_group = [0, 1], [2, 3], [4, 5], [6, 7]
expert_parallel_group = [0, 2, 4, 6], [1, 3, 5, 7]

Originally, no drop operation was performed before executing the experts, and the two expert-parallel (EP) groups each did a separate all-to-all. Both groups ended up with the complete data, but ranks 0 and 1 obtained exactly the same data, as did ranks 2 and 3, and so on. Therefore, we can drop the data before executing the all-to-all, and then execute an allgather after the all-to-all to obtain the complete data. After executing the experts, the data on ranks 0 and 1 is again exactly the same, so we can drop it, execute the all-to-all, and then execute an allgather to obtain the complete data.

1. non-expert uses TP, expert does not use TP:
   drop -> alltoall -> exe MOE -> alltoall -> allgather
2. both non-expert and expert use TP:
   - original execution order: alltoall -> exe MOE -> allreduce -> alltoall
   - optimized execution order: drop -> alltoall -> allgather -> exe MOE -> drop -> alltoall -> allgather

Signed-off-by: --local <[email protected]>
Co-authored-by: --local <[email protected]>
Co-authored-by: Olatunji Ruwase <[email protected]>
Co-authored-by: Logan Adams <[email protected]>
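The drop -> alltoall -> allgather sequence described in the commit message can be illustrated with a small pure-Python simulation of the 8-rank example. This is only a sketch, not the code changed by this commit: the helper names (`make_inputs`, `all_to_all`, `drop`, `all_gather`), the `TOKENS` chunk size, and the choice to slice each chunk along its token dimension are assumptions made for illustration, and no DeepSpeed or `torch.distributed` calls are used. It checks that dropping before the all-to-all and gathering afterwards reproduces what the original all-to-all delivers, while the all-to-all itself moves half as many tokens.

```python
# Pure-Python simulation of the dispatch side; all helpers are hypothetical.
WORLD_SIZE, TP, EP = 8, 2, 4   # world_size, model_degree, expert_degree
TOKENS = 4                     # tokens per chunk (assumed; must be divisible by TP)

# mp_groups == [[0, 1], [2, 3], [4, 5], [6, 7]]
mp_groups = [list(range(i, i + TP)) for i in range(0, WORLD_SIZE, TP)]
# ep_groups == [[0, 2, 4, 6], [1, 3, 5, 7]]
ep_groups = [list(range(i, WORLD_SIZE, TP)) for i in range(TP)]


def make_inputs():
    """Ranks in the same mp group hold identical data: EP chunks of TOKENS tokens."""
    return {
        r: [[(r // TP, dst, t) for t in range(TOKENS)] for dst in range(EP)]
        for r in range(WORLD_SIZE)
    }


def all_to_all(inputs, groups):
    """Within each group, the rank at position i receives chunk i from every member.

    Also returns the total number of tokens delivered, as a proxy for volume.
    """
    out, volume = {}, 0
    for group in groups:
        for i, r in enumerate(group):
            out[r] = [inputs[src][i] for src in group]
            volume += sum(len(chunk) for chunk in out[r])
    return out, volume


def drop(inputs):
    """Each rank keeps only its own 1/TP slice of every chunk."""
    n = TOKENS // TP
    return {
        r: [chunk[(r % TP) * n:(r % TP + 1) * n] for chunk in chunks]
        for r, chunks in inputs.items()
    }


def all_gather(inputs, groups):
    """Within each mp group, concatenate the members' slices chunk by chunk."""
    out = {}
    for group in groups:
        num_chunks = len(inputs[group[0]])
        gathered = [
            [tok for r in group for tok in inputs[r][c]]
            for c in range(num_chunks)
        ]
        for r in group:
            out[r] = [list(chunk) for chunk in gathered]
    return out


data = make_inputs()

# Original order: every rank sends its full (duplicated) data through all-to-all.
original, original_volume = all_to_all(data, ep_groups)

# Optimized order: drop -> all-to-all -> allgather inside each mp group.
exchanged, optimized_volume = all_to_all(drop(data), ep_groups)
optimized = all_gather(exchanged, mp_groups)

assert optimized == original               # same data reaches every rank
print(original_volume, optimized_volume)   # prints "128 64"
```

The token count covers only the all-to-all step; the allgather adds its own traffic inside each mp group, which this sketch does not measure. The return path after the experts (drop -> alltoall -> allgather) follows the same pattern.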
1 parent 213e2d9, commit f5d6c63
Showing 4 changed files with 45 additions and 21 deletions.