Supports W8A8 quantization for more models #2850

AllentDan · 2024-12-03T07:34:44Z

Basically, all the AWQ models support W8A8 quantization too.
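For readers unfamiliar with the term, W8A8 means both weights and activations are stored as 8-bit integers. The sketch below illustrates the idea with plain symmetric int8 quantization in pure Python. It is an assumption-laden illustration, not LMDeploy's implementation: the helper names `quantize_int8` and `w8a8_dot` are hypothetical, and real kernels use calibrated per-channel and per-token scales.

```python
def quantize_int8(values):
    """Symmetric int8 quantization of a list of floats; returns (ints, scale)."""
    max_abs = max(abs(v) for v in values)
    # Map the largest magnitude to 127; fall back to 1.0 for an all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def w8a8_dot(weights, activations):
    """Approximate a dot product using int8 weights and int8 activations."""
    qw, w_scale = quantize_int8(weights)
    qa, a_scale = quantize_int8(activations)
    # Accumulate in integers, then rescale once back to floating point.
    acc = sum(w * a for w, a in zip(qw, qa))
    return acc * w_scale * a_scale


# Made-up example values, chosen only for illustration.
weights = [0.5, -1.0, 0.25, 0.75]
activations = [2.0, 1.0, -4.0, 0.5]

exact = sum(w * a for w, a in zip(weights, activations))
approx = w8a8_dot(weights, activations)
print(f"exact={exact:.4f} approx={approx:.4f}")
```

The quantized result stays close to the floating-point one while the multiply-accumulate runs entirely on small integers, which is the property W8A8 inference relies on.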

lvhan028 · 2024-12-03T12:51:14Z

docs/zh_cn/supported_models/supported_models.md

-|       YI       |  6B - 34B   | LLM  |    Yes    |   Yes   |   Yes   |  No  |  Yes  |
-|    Mistral     |     7B      | LLM  |    Yes    |   Yes   |   Yes   |  No  |  No   |
+|       YI       |  6B - 34B   | LLM  |    Yes    |   Yes   |   Yes   | Yes  |  Yes  |
+|    Mistral     |     7B      | LLM  |    Yes    |   Yes   |   Yes   | Yes  |  Yes  |
 |    Mixtral     | 8x7B, 8x22B | LLM  |    Yes    |   Yes   |   Yes   |  No  |  No   |


Can Mixtral not support it?

The PyTorch engine does not support it yet.

zhulinJulia24 · 2024-12-04T02:47:08Z

docs/en/supported_models/supported_models.md

 |  QWen1.5-MoE   |    A2.7B    | LLM  |    Yes    |   Yes   |   Yes   |  No  |  No   |
-|     QWen2      | 0.5B - 72B  | LLM  |    Yes    |   Yes   |   No    |  No  |  Yes  |


Where is Qwen2.5?

Qwen2.5 shares the same structure as Qwen2.

Qwen2.5 is mentioned in #2849, @zhulinJulia24.
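To illustrate why a separate Qwen2.5 row is not strictly needed, here is a minimal sketch of a support check keyed on the architecture name rather than the model family. It assumes the Hugging Face convention that Qwen2.5 checkpoints declare `Qwen2ForCausalLM` in the `architectures` field of `config.json`. The set `W8A8_SUPPORTED_ARCHS` and the function `supports_w8a8` are hypothetical and do not reflect LMDeploy's actual code or its full supported list.

```python
# Hypothetical support table keyed by architecture name (illustrative only,
# not the real LMDeploy list). The Qwen2 entry is an assumption.
W8A8_SUPPORTED_ARCHS = {"Qwen2ForCausalLM", "MistralForCausalLM"}


def supports_w8a8(config):
    """Return True if any declared architecture is in the support table."""
    archs = config.get("architectures", [])
    return any(arch in W8A8_SUPPORTED_ARCHS for arch in archs)


# Minimal stand-ins for the relevant part of each model's config.json.
qwen2_config = {"architectures": ["Qwen2ForCausalLM"]}
qwen2_5_config = {"architectures": ["Qwen2ForCausalLM"]}  # same structure as Qwen2
mixtral_config = {"architectures": ["MixtralForCausalLM"]}

for name, cfg in [("Qwen2", qwen2_config),
                  ("Qwen2.5", qwen2_5_config),
                  ("Mixtral", mixtral_config)]:
    print(name, supports_w8a8(cfg))
```

Under this assumption, Qwen2 and Qwen2.5 always get the same answer, while Mixtral stays unsupported until its architecture is added.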

AllentDan added 2 commits December 3, 2024 15:05

Supports W8A8 quantization for more models

9627436

update supported models

fd4db5e

lvhan028 requested review from lvhan028 and zhulinJulia24 December 3, 2024 12:43

lvhan028 added the improvement label Dec 3, 2024

lvhan028 reviewed Dec 3, 2024

zhulinJulia24 reviewed Dec 4, 2024

lvhan028 approved these changes Dec 4, 2024

lvhan028 merged commit 69a4306 into InternLM:main Dec 4, 2024
5 checks passed

AllentDan mentioned this pull request Dec 4, 2024

Update pytorch engine w8a8 supported model list #2854

Closed
