Replies: 3 comments
-
I'm following the progress, but I don't see it as production-ready just yet: torch/CUDA does not have FP8 compute capabilities, so this applies to storage only and relies on autocast to FP16 at runtime for processing. And the exact GPUs that are likely to be memory-starved are the same ones that are likely to have either autocast issues (e.g. DirectML) or FP16 precision issues (e.g. NVIDIA 1xxx series).
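To make that storage-vs-compute distinction concrete, here is a minimal sketch (my own illustration, assuming PyTorch >= 2.1 and its `torch.float8_e4m3fn` dtype; the tensor sizes are arbitrary): the weight sits in fp8, but it has to be cast back up to a 16/32-bit dtype right before the matmul, because there is no general fp8 compute path.

```python
import torch

# Minimal sketch, assuming PyTorch >= 2.1 (which adds torch.float8_e4m3fn).
# fp8 is used purely as a *storage* dtype; the matmul still runs in a
# higher-precision compute dtype, mirroring the autocast-to-fp16 behaviour
# described above.
device = "cuda" if torch.cuda.is_available() else "cpu"
compute_dtype = torch.float16 if device == "cuda" else torch.float32

weight = torch.randn(4096, 4096, dtype=compute_dtype, device=device)
weight_fp8 = weight.to(torch.float8_e4m3fn)  # 1 byte/element vs. 2 for fp16

x = torch.randn(1, 4096, dtype=compute_dtype, device=device)

# Compute path: upcast right before the GEMM; only the storage is fp8.
y = torch.nn.functional.linear(x, weight_fp8.to(compute_dtype))
print(y.shape, weight_fp8.element_size(), weight.element_size())
```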
-
Honestly, for memory-starved setups the best answer may come when stable-diffusion.cpp matures to a usable point, since it uses the GGML framework from llama.cpp, which allows really low-bit quantizations (among other cool things).
-
[Edit] Changed the title: FP8 Precision -> fp8 dtype support.
The proposal was merged into their dev branch.
-
In AUTOMATIC1111's WebUI repo, FP8 dtype support has been proposed.
It reduces VRAM usage by almost HALF compared to FP16 with a speed decrease of only 5% or less.
It needs PyTorch 2.1.0 or newer; so far AUTOMATIC1111's WebUI doesn't support 2.1.0, but SD.Next does.
What do you think of FP8? And can it be implemented in SD.Next?
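For reference, a hedged sketch of what the idea looks like in code (my own illustration under stated assumptions, not the WebUI's actual implementation): Linear weights are kept in `torch.float8_e4m3fn` and upcast inside `forward()`, so weight memory is roughly halved while the GEMM still runs in a 16-bit dtype. Requires PyTorch >= 2.1.

```python
import torch

# Hedged sketch of the fp8-storage idea (not the WebUI's actual code):
# keep nn.Linear weights in fp8 and upcast only inside forward(), so weight
# memory is roughly halved while the matmul still runs in a 16-bit dtype.
COMPUTE_DTYPE = torch.float16 if torch.cuda.is_available() else torch.bfloat16

class FP8StorageLinear(torch.nn.Module):
    def __init__(self, linear: torch.nn.Linear):
        super().__init__()
        # Weight lives in fp8 (1 byte/element); bias stays 16-bit, it is tiny.
        self.weight_fp8 = linear.weight.data.to(torch.float8_e4m3fn)
        self.bias = (linear.bias.data.to(COMPUTE_DTYPE)
                     if linear.bias is not None else None)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Per-call upcast: this is where the small runtime cost comes from,
        # paid in exchange for halving the resident weight memory.
        w = self.weight_fp8.to(COMPUTE_DTYPE)
        return torch.nn.functional.linear(x.to(COMPUTE_DTYPE), w, self.bias)

# Usage: wrap an existing layer.
layer = FP8StorageLinear(torch.nn.Linear(768, 768))
out = layer(torch.randn(2, 768))
print(out.dtype, layer.weight_fp8.dtype)
```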