Model A's output is used as model B's input, and the two run serially. Both MNN instances are initialized in the same .cpp file, each with fully independent initialization parameters. Tuning is enabled, and each model is configured with its own cache.
Configurations and results:
A and B both on OpenCL: A 20 ms, B 600 ms
A on OpenCL, B on CPU: A 20 ms, B 600 ms
A on CPU, B on OpenCL: A 600 ms, B 20 ms
It feels as if OpenCL gets bound to whichever model uses it first. That said, MNN + OpenCL performance on Android is really quite good.
Initialization code:

int Test::LoadModel(const char *aPath, const char *bPath) {
    // Backend and schedule config for model A
    BackendConfig aBConfig;
    aBConfig.precision = BackendConfig::PrecisionMode::Precision_High;
    ScheduleConfig aSConfig;
    aSConfig.type = MNN_FORWARD_OPENCL;
    aSConfig.numThread = 4;
    aSConfig.backendConfig = &aBConfig;
    aSConfig.mode = MNN_GPU_TUNING_WIDE | MNN_GPU_MEMORY_BUFFER;
    aRtmgr = std::shared_ptr<Executor::RuntimeManager>(
        Executor::RuntimeManager::createRuntimeManager(aSConfig));
    if (aRtmgr == nullptr) {
        MNN_ERROR("Empty RuntimeManager\n");
        return -1;
    }
    aRtmgr->setCache("/data/data/com.hp.test/mnn/aCache");
    std::vector<std::string> a_inputs = {"xxx"};
    std::vector<std::string> a_outputs = {"xxx"};
    a.reset(Module::load(a_inputs, a_outputs, aPath, aRtmgr));

    // Backend and schedule config for model B
    BackendConfig bBConfig;
    bBConfig.precision = BackendConfig::PrecisionMode::Precision_High;
    ScheduleConfig bSConfig;
    bSConfig.type = MNN_FORWARD_OPENCL;
    bSConfig.numThread = 4;
    bSConfig.backendConfig = &bBConfig;
    bSConfig.mode = MNN_GPU_TUNING_WIDE | MNN_GPU_MEMORY_BUFFER;
    bRtmgr = std::shared_ptr<Executor::RuntimeManager>(
        Executor::RuntimeManager::createRuntimeManager(bSConfig));
    if (bRtmgr == nullptr) {
        MNN_ERROR("Empty RuntimeManager\n");
        return -1;
    }
    bRtmgr->setCache("/data/data/com.hp.test/mnn/bCache");
    std::vector<std::string> b_inputs = {"xxx"};
    std::vector<std::string> b_outputs = {"xxx"};
    b.reset(Module::load(b_inputs, b_outputs, bPath, bRtmgr));
    return 0;
}

Running inference:

auto outputs = a->onForward({input_var});
aRtmgr->updateCache();
...
auto outputs = b->onForward({input_var});
bRtmgr->updateCache();
I'm not very familiar with MNN, so I'd like to ask whether there is something wrong with how this is written.
Did you synchronize when measuring the time? You can call outputs->readMap().
Thanks for the reply. The timing should be fine. I then tried reloading the model before every inference, and that way the speed was normal; but loading takes a long time, so the total time ends up longer than before. I also found that the two models' cache files are exactly identical; I compared the binaries and they match byte for byte. Do the two models have to run in separate, isolated processes? I'll try that next.