TransformerEngine FP8 is slower & more memory intensive than FlashAttention FP16? #3706