
[Feature] Is there a plan to support BF16 inference? #403

Closed
qingchanghan opened this issue Sep 12, 2023 · 5 comments

@qingchanghan

Motivation

It appears that LMDeploy does not support native BF16 inference, but instead converts it to FP16.

However, this conversion may lead to precision overflow (#385). Is there a plan to support BF16 inference?
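For illustration (not part of the referenced issue), the concern is that FP16's largest finite value is about 65504, while BF16 keeps FP32's exponent range, so a value that is representable in BF16 can overflow to inf once cast to FP16. A minimal PyTorch sketch:

```python
import torch

# FP16's largest finite value is ~65504, while BF16 keeps FP32's exponent
# range, so the same number can be finite in BF16 but overflow in FP16.
x = torch.tensor([70000.0], dtype=torch.float32)

print(x.to(torch.bfloat16))  # tensor([70144.], dtype=torch.bfloat16) -- representable
print(x.to(torch.float16))   # tensor([inf], dtype=torch.float16)     -- overflows to inf
```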

Related resources

No response

Additional context

No response

@lzhangzz lzhangzz self-assigned this Sep 12, 2023
@lzhangzz
Collaborator

lzhangzz commented Sep 12, 2023

Yes, we will bring back BF16 support after a few major refactorings to lower the maintenance cost.

BTW, which model are you testing that has the overflow issue with the f16 type?

@qingchanghan
Author

Thank you! Looking forward to the follow-up progress.

We encountered this problem with an internal inference framework and internal models. These models were trained with BF16 precision.

@lzhangzz
Collaborator

I see.

Have you checked in which layer or operator the overflow arises?
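For reference, one generic way to locate such an overflow (a plain PyTorch sketch, not LMDeploy-specific; the commented-out model loading is hypothetical) is to register forward hooks that report the first module whose output contains inf or NaN when running in FP16:

```python
import torch

def add_overflow_hooks(model: torch.nn.Module):
    """Attach forward hooks to every leaf module and report non-finite outputs."""
    def make_hook(name):
        def hook(module, inputs, output):
            if isinstance(output, torch.Tensor) and not torch.isfinite(output).all():
                print(f"non-finite output in {name} ({module.__class__.__name__})")
        return hook

    for name, module in model.named_modules():
        if next(module.children(), None) is None:  # leaf modules only
            module.register_forward_hook(make_hook(name))

# Hypothetical usage: load the model in FP16 and run the failing prompt.
# model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16)
# add_overflow_hooks(model)
# model(**inputs)
```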

@qingchanghan
Author

qingchanghan commented Sep 12, 2023

Sorry, I didn't check it carefully.

But when I changed the precision to BF16, I got the correct result.

@lvhan028
Collaborator

PR #803
