
Running two models on Android: OpenCL only takes effect on one of them #2942

Open
hooponn opened this issue Jun 29, 2024 · 2 comments

Comments

hooponn commented Jun 29, 2024

Model A's output is used as model B's input, so the two run serially. Both MNN instances are initialized in the same .cpp file, each with its own independent initialization parameters. TUNING is enabled, and each model is configured with a separate cache.

Configurations and results:
Both A and B use OpenCL: A 20 ms; B 600 ms
A uses OpenCL, B uses CPU: A 20 ms; B 600 ms
A uses CPU, B uses OpenCL: A 600 ms; B 20 ms

It feels as if OpenCL gets bound to whichever model uses it first. That said, the performance of MNN + OpenCL on Android really is impressive.

Initialization code:
int Test::LoadModel(const char *aPath, const char *bPath) {
    // Model A: OpenCL backend, wide tuning, its own cache file.
    BackendConfig aBConfig;
    aBConfig.precision = BackendConfig::PrecisionMode::Precision_High;
    ScheduleConfig aSConfig;
    aSConfig.type = MNN_FORWARD_OPENCL;
    aSConfig.numThread = 4;
    aSConfig.backendConfig = &aBConfig;
    aSConfig.mode = MNN_GPU_TUNING_WIDE | MNN_GPU_MEMORY_BUFFER;
    aRtmgr = std::shared_ptr<Executor::RuntimeManager>(
        Executor::RuntimeManager::createRuntimeManager(aSConfig));
    if (aRtmgr == nullptr) {
        MNN_ERROR("Empty RuntimeManager\n");
        return -1;
    }
    aRtmgr->setCache("/data/data/com.hp.test/mnn/aCache");
    std::vector<std::string> a_inputs = {"xxx"};
    std::vector<std::string> a_outputs = {"xxx"};
    a.reset(Module::load(a_inputs, a_outputs, aPath, aRtmgr));

    // Model B: identical settings, but a separate RuntimeManager and cache file.
    BackendConfig bBConfig;
    bBConfig.precision = BackendConfig::PrecisionMode::Precision_High;
    ScheduleConfig bSConfig;
    bSConfig.type = MNN_FORWARD_OPENCL;
    bSConfig.numThread = 4;
    bSConfig.backendConfig = &bBConfig;
    bSConfig.mode = MNN_GPU_TUNING_WIDE | MNN_GPU_MEMORY_BUFFER;
    bRtmgr = std::shared_ptr<Executor::RuntimeManager>(
        Executor::RuntimeManager::createRuntimeManager(bSConfig));
    if (bRtmgr == nullptr) {
        MNN_ERROR("Empty RuntimeManager\n");
        return -1;
    }
    bRtmgr->setCache("/data/data/com.hp.test/mnn/bCache");
    std::vector<std::string> b_inputs = {"xxx"};
    std::vector<std::string> b_outputs = {"xxx"};
    b.reset(Module::load(b_inputs, b_outputs, bPath, bRtmgr));
    return 0;
}
Inference:
auto outputs = a->onForward({input_var}); aRtmgr->updateCache();

..................................................

auto outputs = b->onForward({input_var}); bRtmgr->updateCache();

I'm not very familiar with MNN, so could someone please tell me whether there is anything wrong with this usage?

Qxinyu (Collaborator) commented Jul 1, 2024

Are you synchronizing when you measure the time? You can call outputs->readMap().
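
For illustration, a minimal timing sketch of what that synchronization could look like (the std::chrono timing and the [0] index are assumptions added here for illustration; input_var and a come from the code above). readMap() blocks until the backend has finished computing the output, so the measured interval covers the whole inference rather than just kernel submission:

    #include <chrono>

    auto begin = std::chrono::steady_clock::now();
    auto outputs = a->onForward({input_var});
    outputs[0]->readMap<float>(); // blocks until the OpenCL work is done
    auto end = std::chrono::steady_clock::now();
    auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(end - begin).count();
    MNN_PRINT("model A: %lld ms\n", (long long)ms);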

hooponn (Author) commented Jul 3, 2024

Are you synchronizing when you measure the time? You can call outputs->readMap().

Thanks for the reply.
The timing measurement should be fine.
I later tried reloading the model before every inference, and then the speed was fine. But loading takes a long time, so the total time ends up even longer than before.
I also noticed that the two models' cache files are exactly identical; I compared the binaries and they match.
Do the two models have to run isolated in separate processes? I'll give that a try next.
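
For reference, one in-process alternative to full process isolation would be giving each model its own Executor. This is only a sketch under assumptions: it uses Executor::newExecutor and ExecutorScope from MNN/expr/Executor.hpp and MNN/expr/ExecutorScope.hpp, and it has not been verified against this issue; Module::load for each model would also have to run inside the matching scope.

    #include <MNN/expr/Executor.hpp>
    #include <MNN/expr/ExecutorScope.hpp>

    BackendConfig config;
    config.precision = BackendConfig::PrecisionMode::Precision_High;
    // One executor per model, so each keeps its own runtime state (assumption).
    auto aExecutor = Executor::newExecutor(MNN_FORWARD_OPENCL, config, 1);
    auto bExecutor = Executor::newExecutor(MNN_FORWARD_OPENCL, config, 1);

    {
        ExecutorScope scope(aExecutor); // model A runs under its own executor
        auto outputs = a->onForward({input_var});
        outputs[0]->readMap<float>();
    }
    {
        ExecutorScope scope(bExecutor); // model B runs under a separate executor
        auto outputs = b->onForward({input_var});
        outputs[0]->readMap<float>();
    }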
