Supported Models

Models supported by TurboMind

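| Model              | Size      | FP16/BF16 | KV INT8 | W4A16 |
| ------------------ | --------- | --------- | ------- | ----- |
| Llama              | 7B - 65B  | Yes       | Yes     | Yes   |
| Llama2             | 7B - 70B  | Yes       | Yes     | Yes   |
| InternLM           | 7B - 20B  | Yes       | Yes     | Yes   |
| InternLM2          | 7B - 20B  | Yes       | -       | Yes   |
| InternLM-XComposer | 7B        | Yes       | Yes     | Yes   |
| QWen               | 7B - 72B  | Yes       | Yes     | Yes   |
| QWen-VL            | 7B        | Yes       | Yes     | Yes   |
| Baichuan           | 7B        | Yes       | Yes     | Yes   |
| Baichuan2          | 7B        | Yes       | Yes     | Yes   |
| Code Llama         | 7B - 34B  | Yes       | No      | No    |
| YI                 | 6B - 34B  | Yes       | No      | No    |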

Models supported by PyTorch

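| Model     | Size      | FP16/BF16 | KV INT8 | W8A8 |
| --------- | --------- | --------- | ------- | ---- |
| Llama     | 7B - 65B  | Yes       | No      | Yes  |
| Llama2    | 7B - 70B  | Yes       | No      | Yes  |
| InternLM  | 7B - 20B  | Yes       | No      | Yes  |
| InternLM2 | 7B - 20B  | Yes       | No      | -    |
| Baichuan2 | 7B - 13B  | Yes       | No      | Yes  |
| ChatGLM2  | 6B        | Yes       | No      | No   |
| Falcon    | 7B - 180B | Yes       | No      | No   |
| YI        | 6B - 34B  | Yes       | No      | No   |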
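
For scripting against these tables, the support matrix can be held as plain data and queried. The snippet below is a minimal sketch and not part of LMDeploy: the names `SUPPORT` and `is_supported` are hypothetical, the values are transcribed by hand from the two tables above, and a `-` entry is assumed to mean "not confirmed", so it is treated as unsupported.

```python
# Hypothetical lookup helper; not an LMDeploy API.
# Values are copied from the two tables above. Column order per engine:
#   turbomind: FP16/BF16, KV INT8, W4A16
#   pytorch:   FP16/BF16, KV INT8, W8A8
COLUMNS = {
    "turbomind": ("FP16/BF16", "KV INT8", "W4A16"),
    "pytorch": ("FP16/BF16", "KV INT8", "W8A8"),
}

SUPPORT = {
    "turbomind": {
        "Llama": ("Yes", "Yes", "Yes"),
        "Llama2": ("Yes", "Yes", "Yes"),
        "InternLM": ("Yes", "Yes", "Yes"),
        "InternLM2": ("Yes", "-", "Yes"),
        "InternLM-XComposer": ("Yes", "Yes", "Yes"),
        "QWen": ("Yes", "Yes", "Yes"),
        "QWen-VL": ("Yes", "Yes", "Yes"),
        "Baichuan": ("Yes", "Yes", "Yes"),
        "Baichuan2": ("Yes", "Yes", "Yes"),
        "Code Llama": ("Yes", "No", "No"),
        "YI": ("Yes", "No", "No"),
    },
    "pytorch": {
        "Llama": ("Yes", "No", "Yes"),
        "Llama2": ("Yes", "No", "Yes"),
        "InternLM": ("Yes", "No", "Yes"),
        "InternLM2": ("Yes", "No", "-"),
        "Baichuan2": ("Yes", "No", "Yes"),
        "ChatGLM2": ("Yes", "No", "No"),
        "Falcon": ("Yes", "No", "No"),
        "YI": ("Yes", "No", "No"),
    },
}


def is_supported(engine: str, model: str, feature: str) -> bool:
    """Return True only if the table entry is exactly "Yes".

    Assumption: "-" (status not stated) and "No" both count as unsupported.
    Raises KeyError for an unknown engine or model, and ValueError for a
    feature that is not a column of that engine's table.
    """
    column = COLUMNS[engine].index(feature)
    return SUPPORT[engine][model][column] == "Yes"


if __name__ == "__main__":
    print(is_supported("turbomind", "Llama2", "W4A16"))
    print(is_supported("pytorch", "InternLM2", "W8A8"))
```

Keeping one tuple per model mirrors the table rows, so updating the sketch when the tables change is a one-line edit per model.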