[MoE][Common/PyTorch] Add permutation (#936)
* Add permutation functions
* Add permutation ops
* Remove the dependency on cutlass
* Move permutation.py out of module dir
* Rewrite the unit test and enable skipping if FP8 is unavailable
* Rename exposed C++ API and reorder its parameters + take NVTETensor as inputs
* Use Float8Tensor for FP8 input
* Move dtype to ctx

---------

Signed-off-by: Jiang Shao <[email protected]>
Co-authored-by: Qi Zhang <[email protected]>
Co-authored-by: Phuong Nguyen <[email protected]>
1 parent 47caafb, commit a335374
Showing 11 changed files with 1,394 additions and 1 deletion.
21 changes: 21 additions & 0 deletions
transformer_engine/common/include/transformer_engine/permutation.h
@@ -0,0 +1,21 @@
/*************************************************************************
 * Copyright (c) 2022-2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 *
 * See LICENSE for license information.
 ************************************************************************/

#ifndef TRANSFORMER_ENGINE_PERMUTATION_H_
#define TRANSFORMER_ENGINE_PERMUTATION_H_

#include "transformer_engine.h"

void nvte_permute(const NVTETensor input, NVTETensor output, const NVTETensor sorted_row_id,
                  NVTETensor row_id_map, const NVTETensor prob, NVTETensor prob_grad,
                  const NVTETensor input_fwd, const int num_rows, const int topK,
                  const int num_cols, const int num_out_tokens, cudaStream_t stream = nullptr);

void nvte_unpermute(const NVTETensor input, NVTETensor output, NVTETensor row_id_map,
                    const NVTETensor prob, const int num_rows, const int topK, const int num_cols,
                    cudaStream_t stream = nullptr);

#endif  // TRANSFORMER_ENGINE_PERMUTATION_H_
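
The header above only declares the two entry points, so the following is a minimal CPU sketch of what an MoE permute/unpermute pair typically computes, written to make the roles of `sorted_row_id`, `row_id_map`, `prob`, `num_rows`, `topK`, and `num_cols` easier to picture. It is not the code from this commit. The helper names (`sort_rows_by_expert`, `permute_reference`, `unpermute_reference`), the use of `std::vector` in place of `NVTETensor`, and in particular the assumed `row_id_map` layout (indexed by `token * topK + k`) are illustrative assumptions; the actual kernels may lay out the map differently and also handle `num_out_tokens`, FP8 inputs, and gradients, none of which is modeled here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical CPU reference; none of these helpers exist in Transformer Engine.

// expert_ids holds one expert index per (token, k) slot, flattened as token * topK + k.
// Returns the slot indices ordered by expert; ties keep their original order.
std::vector<int> sort_rows_by_expert(const std::vector<int>& expert_ids) {
  std::vector<int> sorted_row_id(expert_ids.size());
  std::iota(sorted_row_id.begin(), sorted_row_id.end(), 0);
  std::stable_sort(sorted_row_id.begin(), sorted_row_id.end(),
                   [&](int a, int b) { return expert_ids[a] < expert_ids[b]; });
  return sorted_row_id;
}

// Gather: destination row d receives a copy of token sorted_row_id[d] / topK.
// row_id_map[slot] records where each slot landed (assumed layout, see lead-in).
std::vector<float> permute_reference(const std::vector<float>& input,
                                     const std::vector<int>& sorted_row_id,
                                     std::vector<int>& row_id_map,
                                     int num_rows, int topK, int num_cols) {
  assert(input.size() == static_cast<std::size_t>(num_rows) * num_cols);
  assert(sorted_row_id.size() <= static_cast<std::size_t>(num_rows) * topK);
  std::vector<float> output(sorted_row_id.size() * num_cols);
  row_id_map.assign(static_cast<std::size_t>(num_rows) * topK, -1);
  for (int dst = 0; dst < static_cast<int>(sorted_row_id.size()); ++dst) {
    const int slot = sorted_row_id[dst];
    const int token = slot / topK;
    row_id_map[slot] = dst;
    std::copy_n(input.begin() + token * num_cols, num_cols,
                output.begin() + dst * num_cols);
  }
  return output;
}

// Scatter: each token sums its topK permuted rows, weighted by prob[slot].
// Slots whose map entry is -1 are skipped.
std::vector<float> unpermute_reference(const std::vector<float>& permuted,
                                       const std::vector<int>& row_id_map,
                                       const std::vector<float>& prob,
                                       int num_rows, int topK, int num_cols) {
  assert(row_id_map.size() == static_cast<std::size_t>(num_rows) * topK);
  assert(prob.size() == row_id_map.size());
  std::vector<float> output(static_cast<std::size_t>(num_rows) * num_cols, 0.0f);
  for (int token = 0; token < num_rows; ++token) {
    for (int k = 0; k < topK; ++k) {
      const int slot = token * topK + k;
      const int src = row_id_map[slot];
      if (src < 0) continue;
      for (int c = 0; c < num_cols; ++c) {
        output[token * num_cols + c] += prob[slot] * permuted[src * num_cols + c];
      }
    }
  }
  return output;
}
```

In this sketch, permuting and then unpermuting with probabilities that sum to 1 per token reproduces the original rows, which is a convenient sanity check when comparing against a GPU implementation.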