Use the MultiDevices library to quickly run inference across multiple compute devices #732

Open: wants to merge 2 commits into base: main


YIZXIY (Contributor) commented Apr 20, 2023

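This uses the MultiDevices library to quickly call on multiple compute devices (CPU and GPU) for inference on low-spec hardware. 6 GB of VRAM plus 16 GB of RAM is enough to run the int8 model.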
https://github.com/ChaimEvans/ChatGLM_MultiGPUCPU_eval
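
As an illustration of the general idea of spreading a model over a GPU and a CPU under memory budgets, here is a minimal sketch. It does not use or reproduce the MultiDevices API; the function `assign_layers`, the layer sizes, and the budget figures are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
from typing import Dict, List, Tuple


def assign_layers(
    layer_sizes_gb: List[float],
    budgets_gb: List[Tuple[str, float]],
) -> Dict[int, str]:
    """Greedily place consecutive layers on devices, in the order given.

    layer_sizes_gb: estimated size of each layer's weights, in GB.
    budgets_gb: (device name, usable memory in GB), preferred device first.
    Returns a mapping from layer index to device name.
    """
    placement: Dict[int, str] = {}
    device_idx = 0
    remaining = budgets_gb[0][1]
    for i, size in enumerate(layer_sizes_gb):
        # Advance to the next device until this layer fits.
        while size > remaining:
            device_idx += 1
            if device_idx >= len(budgets_gb):
                raise MemoryError(f"layer {i} does not fit on any remaining device")
            remaining = budgets_gb[device_idx][1]
        placement[i] = budgets_gb[device_idx][0]
        remaining -= size
    return placement


# Hypothetical numbers: 28 equal layers of 0.25 GB each (7 GB in total),
# with 5 GB usable on the GPU and 12 GB usable on the CPU.
plan = assign_layers([0.25] * 28, [("cuda:0", 5.0), ("cpu", 12.0)])
gpu_layers = sum(1 for device in plan.values() if device == "cuda:0")
print(gpu_layers, len(plan) - gpu_layers)
```

Under these assumed numbers the first layers fill the GPU budget and the rest fall back to the CPU. A real split would also need headroom for activations and the KV cache, which this sketch ignores.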

Add the MultiDevices library, by @ChaimEvans
yhyu13 commented Apr 23, 2023

How does this differ from GLM_MNN? And how fast is it, in tokens/s?

YIZXIY (Contributor, Author) commented Apr 23, 2023

> How does this differ from GLM_MNN? And how fast is it, in tokens/s?

I haven't used MNN; it is too hard to compile on Windows. This one is easy to deploy.
