Update FP8 scale-inverse in kernels with FP8 output (#1083)
* Perform scale-inv update in cast-transpose kernels
* Perform scale-inv update in cast and activation kernels
* Perform scale-inv update in LayerNorm and RMSNorm kernels
* Perform scale-inv update after FP8 GEMMs
* Fuse casts and scale-inv updates in linear module
* Fuse casts and scale-inv updates in layernorm-linear module
* Simplify kernel to update FP8 scale-inv
* Fix typos
* Debug amax update in layernorm kernels
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* Debug test failures
* Debug ONNX export
  Use quantization scaling factor in ONNX quantize op.
* Review suggestion from @ptrendx
* Debug mismatched dtypes

---------

Signed-off-by: Tim Moon <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
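The commit message describes folding the scale-inverse update into the kernels that write FP8 output. The idea can be sketched on the host in plain C++, as an illustration only and not the library's kernels. The names `QuantMeta`, `cast_with_scale_inv_update`, and `kFp8Max` are hypothetical. The sketch assumes the scaled value saturates at the E4M3 maximum of 448, and it does no real FP8 rounding.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical per-tensor quantization metadata (not the library's type).
struct QuantMeta {
  float scale;      // multiplier applied before the cast
  float scale_inv;  // reciprocal of scale, used to dequantize later
  float amax;       // largest absolute input value seen by the cast
};

// Assumed saturation bound: the largest finite FP8 E4M3 value.
constexpr float kFp8Max = 448.0f;

// One pass over the data produces the scaled output, records amax, and
// refreshes scale_inv, so no separate scale-inverse update is needed.
inline void cast_with_scale_inv_update(const std::vector<float>& in,
                                       std::vector<float>& out,
                                       QuantMeta& meta) {
  assert(meta.scale > 0.0f);
  out.resize(in.size());
  float amax = 0.0f;
  for (std::size_t i = 0; i < in.size(); ++i) {
    amax = std::max(amax, std::fabs(in[i]));
    const float scaled = in[i] * meta.scale;
    // Saturate instead of rounding to FP8 (simplification).
    out[i] = std::min(std::max(scaled, -kFp8Max), kFp8Max);
  }
  meta.amax = amax;
  meta.scale_inv = 1.0f / meta.scale;  // the fused scale-inverse update
}
```

A consumer can then dequantize with `out[i] * meta.scale_inv` without waiting for a separate step to refresh the reciprocal.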
1 parent 5d5fe81 commit 8e3561b
Showing 34 changed files with 824 additions and 380 deletions.
@@ -0,0 +1,32 @@
/*************************************************************************
 * Copyright (c) 2022-2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 *
 * See LICENSE for license information.
 ************************************************************************/

#include <transformer_engine/transformer_engine.h>

#include "./common.h"
#include "./utils.cuh"

namespace transformer_engine {

namespace {

__global__ void __launch_bounds__(1)
    update_tensor_scale_inv_kernel(const float* __restrict__ scale_ptr,
                                   float* __restrict__ scale_inv_ptr) {
  const float scale = scale_ptr == nullptr ? 1 : *scale_ptr;
  reciprocal<float>(scale_inv_ptr, scale);
}

}  // namespace

void update_tensor_scale_inv(Tensor* t, cudaStream_t stream) {
  if (t->scale_inv.dptr != nullptr) {
    update_tensor_scale_inv_kernel<<<1, 1, 0, stream>>>(
        reinterpret_cast<const float*>(t->scale.dptr), reinterpret_cast<float*>(t->scale_inv.dptr));
  }
}

}  // namespace transformer_engine
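
The behavior of `update_tensor_scale_inv` can be mirrored on the CPU as a small reference, which may help when reasoning about its edge cases. This is a sketch under stated assumptions: `HostTensor` and `update_scale_inv_reference` are hypothetical names that do not appear in the commit, and `reciprocal<float>` is assumed to store `1 / scale` at the destination pointer.

```cpp
#include <cassert>

// Hypothetical host-side stand-in for the two Tensor fields used above.
struct HostTensor {
  const float* scale;  // may be null, in which case a scale of 1 is used
  float* scale_inv;    // may be null, in which case nothing is updated
};

// CPU mirror of the launch wrapper plus the single-thread kernel.
inline void update_scale_inv_reference(HostTensor& t) {
  if (t.scale_inv == nullptr) {
    return;  // same guard as the wrapper: no scale-inverse buffer, no work
  }
  const float scale = (t.scale == nullptr) ? 1.0f : *t.scale;
  *t.scale_inv = 1.0f / scale;  // assumed effect of reciprocal<float>
}

// Example: a scale of 4 yields a scale-inverse of 0.25.
inline void example() {
  float scale = 4.0f;
  float scale_inv = 0.0f;
  HostTensor t{&scale, &scale_inv};
  update_scale_inv_reference(t);
  assert(scale_inv == 0.25f);
}
```

The three cases to keep in mind are a present scale (plain reciprocal), a missing scale (the result is 1), and a missing scale-inverse buffer (no write at all).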