
[Feature] Is there a plan to support BF16 inference? #403

Closed
qingchanghan opened this issue Sep 12, 2023 · 5 comments

@qingchanghan

Motivation

It appears that LMDeploy does not support native BF16 inference, but instead converts it to FP16.

However, this conversion may lead to precision overflow (#385). Is there a plan to support BF16 inference?
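For illustration (not part of the referenced issue), the concern is that FP16's largest finite value is about 65504, while BF16 keeps FP32's exponent range, so a value that is representable in BF16 can overflow to inf once cast to FP16. A minimal PyTorch sketch:

```python
import torch

# FP16's largest finite value is ~65504, while BF16 keeps FP32's exponent
# range, so the same number can be finite in BF16 but overflow in FP16.
x = torch.tensor([70000.0], dtype=torch.float32)

print(x.to(torch.bfloat16))  # tensor([70144.], dtype=torch.bfloat16) -- representable
print(x.to(torch.float16))   # tensor([inf], dtype=torch.float16)     -- overflows to inf
```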

Related resources

No response

Additional context

No response

@lzhangzz lzhangzz self-assigned this Sep 12, 2023
@lzhangzz
Collaborator

lzhangzz commented Sep 12, 2023

Yes, we will bring back BF16 support after a few major refactorings to lower the maintenance cost.

BTW, which model are you testing that has the overflow issue with the f16 type?

@qingchanghan
Author

Thank you! Looking forward to the follow-up progress.

We encountered this problem with an internal inference framework and internal models. These models were trained with BF16 precision.

@lzhangzz
Collaborator

I see.

Have you checked in which layer or operator the overflow arises?
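For reference, one generic way to locate such an overflow (a plain PyTorch sketch, not LMDeploy-specific; the commented-out model loading is hypothetical) is to register forward hooks that report the first module whose output contains inf or NaN when running in FP16:

```python
import torch

def add_overflow_hooks(model: torch.nn.Module):
    """Attach forward hooks to every leaf module and report non-finite outputs."""
    def make_hook(name):
        def hook(module, inputs, output):
            if isinstance(output, torch.Tensor) and not torch.isfinite(output).all():
                print(f"non-finite output in {name} ({module.__class__.__name__})")
        return hook

    for name, module in model.named_modules():
        if next(module.children(), None) is None:  # leaf modules only
            module.register_forward_hook(make_hook(name))

# Hypothetical usage: load the model in FP16 and run the failing prompt.
# model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16)
# add_overflow_hooks(model)
# model(**inputs)
```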

@qingchanghan
Author

qingchanghan commented Sep 12, 2023

Sorry, I didn't check it carefully.

But when I changed the precision to BF16, I got the correct result.

@lvhan028
Collaborator

PR #803
