Added offloading support FP8 attention #1131

sanandaraj5597 · 2024-08-23T04:36:54Z

This PR allows offloading the QKV activation tensors for FP8 Attention.

Signed-off-by: Selvaraj Anandaraj <[email protected]>

timmoon10

LGTM.

Can CPU offloading handle receiving None in the tensor list? #1143 adds some cases where the FP8 tensors are not saved.

transformer_engine/pytorch/attention.py

sanandaraj5597 · 2024-09-04T18:08:06Z

Can CPU offloading handle receiving None in the tensor list? #1143 adds some cases where the FP8 tensors are not saved.

Yes, it can handle that.
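
For readers following the question above, here is a minimal sketch of what None-tolerant offloading of a saved tensor list could look like. It is not Transformer Engine's actual implementation: `offload_saved_tensors`, `reload_saved_tensors`, `fake_offload`, `fake_reload`, and `host_store` are hypothetical names, and plain strings stand in for the FP8 tensors so the example runs without a GPU.

```python
from typing import Any, Callable, Dict, List, Optional

def offload_saved_tensors(
    tensors: List[Optional[Any]],
    offload_fn: Callable[[Any], Any],
) -> List[Optional[Any]]:
    """Offload each saved tensor; None entries pass through unchanged."""
    return [None if t is None else offload_fn(t) for t in tensors]

def reload_saved_tensors(
    handles: List[Optional[Any]],
    reload_fn: Callable[[Any], Any],
) -> List[Optional[Any]]:
    """Reload each offloaded tensor; None entries pass through unchanged."""
    return [None if h is None else reload_fn(h) for h in handles]

# Hypothetical stand-in for host (CPU) storage, keyed by an integer handle.
host_store: Dict[int, Any] = {}

def fake_offload(tensor: Any) -> int:
    """Store the object in host_store and return its handle."""
    key = len(host_store)
    host_store[key] = tensor
    return key

def fake_reload(key: int) -> Any:
    """Fetch the object back from host_store by handle."""
    return host_store[key]

# A saved list in which one FP8 tensor was not saved (None), as in the question.
saved = ["q_fp8", None, "v_fp8"]
handles = offload_saved_tensors(saved, fake_offload)
restored = reload_saved_tensors(handles, fake_reload)
```

The check uses `is None` rather than truthiness so that a valid handle such as `0` is not mistaken for a missing entry.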

Co-authored-by: Kirthi Shankar Sivamani <[email protected]> Signed-off-by: Selvaraj Anandaraj <[email protected]>

ksivaman · 2024-09-04T18:10:35Z

/te-ci pytorch

ksivaman

Looks good

cyanguwa

LGTM

Signed-off-by: Kirthi Shankar Sivamani <[email protected]>

Added offloading support FP8 attention

0ef5803

Signed-off-by: Selvaraj Anandaraj <[email protected]>

timmoon10 requested a review from cyanguwa September 4, 2024 17:59

Merge branch 'main' into main

84857bb

timmoon10 approved these changes Sep 4, 2024

ksivaman reviewed Sep 4, 2024

transformer_engine/pytorch/attention.py (outdated, resolved)

Update transformer_engine/pytorch/attention.py

e59e4ba

Co-authored-by: Kirthi Shankar Sivamani <[email protected]> Signed-off-by: Selvaraj Anandaraj <[email protected]>

ksivaman approved these changes Sep 4, 2024

cyanguwa approved these changes Sep 4, 2024

ksivaman added 2 commits September 5, 2024 15:16

Fix

e54c03b

Signed-off-by: Kirthi Shankar Sivamani <[email protected]>

Merge branch 'main' into main

4e249fb

Signed-off-by: Kirthi Shankar Sivamani <[email protected]>

ksivaman merged commit 454e389 into NVIDIA:main Sep 5, 2024
14 checks passed
