Very long latency of clSetKernelArg function on embedded platforms? #7819

Open
doonny opened this issue Nov 28, 2023 · 1 comment

Comments

@doonny

doonny commented Nov 28, 2023

Hi, I am running Vitis acceleration applications on a ZCU102 board, and I found that clSetKernelArg has very long latency. The following picture is from Vitis Analyzer. Performing around ten clSetKernelArg calls can take 3 ms, which is too slow compared to the kernel execution time. Is there any way to shorten the latency of clSetKernelArg?

Is this a problem with the ARM processor, or is it slow because XRT is running in Linux mode?

[Image: Vitis Analyzer screenshot]

@stsoe
Collaborator

stsoe commented Dec 4, 2023

@doonny I haven't heard of clSetKernelArg being an issue before. Can you keep clSetKernelArg out of the critical path? E.g., associate each kernel object with a set of arguments that do not need to change, other than the content of the global buffers. You can have as many kernel objects as you have buffer pools.
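
A minimal sketch of the idea above, in C, might look like the following. Several things here are assumptions and not from this thread: two buffer sets (`NUM_SETS`), a kernel taking two buffer arguments at indices 0 and 1, and the hypothetical helpers `bind_kernel_args` and `run_iteration`. Building with `-DUSE_REAL_OPENCL` pulls in the real `<CL/cl.h>`; without it, small host-only stand-ins with call counters are used so the pattern can be checked without a device.

```c
#include <stddef.h>

#ifdef USE_REAL_OPENCL
#include <CL/cl.h>
#else
/* Host-only stand-ins (assumption): just enough of the OpenCL surface
   to exercise the pattern and count calls without a device. */
typedef int cl_int;
typedef unsigned int cl_uint;
typedef struct stub_kernel *cl_kernel;
typedef struct stub_mem *cl_mem;
typedef struct stub_queue *cl_command_queue;
typedef struct stub_event *cl_event;
#define CL_SUCCESS 0
static int g_set_arg_calls = 0;
static int g_enqueue_calls = 0;
static cl_int clSetKernelArg(cl_kernel k, cl_uint idx, size_t size,
                             const void *value) {
    (void)k; (void)idx; (void)size; (void)value;
    ++g_set_arg_calls;
    return CL_SUCCESS;
}
static cl_int clEnqueueTask(cl_command_queue q, cl_kernel k, cl_uint n,
                            const cl_event *wait_list, cl_event *event) {
    (void)q; (void)k; (void)n; (void)wait_list; (void)event;
    ++g_enqueue_calls;
    return CL_SUCCESS;
}
#endif

#define NUM_SETS 2 /* hypothetical: one kernel object per buffer set */

/* Setup phase: bind each kernel object to its own buffers exactly once. */
static cl_int bind_kernel_args(cl_kernel kernels[NUM_SETS],
                               cl_mem in_bufs[NUM_SETS],
                               cl_mem out_bufs[NUM_SETS]) {
    for (int i = 0; i < NUM_SETS; ++i) {
        cl_int err = clSetKernelArg(kernels[i], 0, sizeof(cl_mem), &in_bufs[i]);
        if (err != CL_SUCCESS) return err;
        err = clSetKernelArg(kernels[i], 1, sizeof(cl_mem), &out_bufs[i]);
        if (err != CL_SUCCESS) return err;
    }
    return CL_SUCCESS;
}

/* Critical path: select the pre-bound kernel object for this iteration.
   No clSetKernelArg call happens here. */
static cl_int run_iteration(cl_command_queue queue,
                            cl_kernel kernels[NUM_SETS], int iter) {
    return clEnqueueTask(queue, kernels[iter % NUM_SETS], 0, NULL, NULL);
}
```

With real OpenCL, each `kernels[i]` would come from its own `clCreateKernel` call on the same program and kernel name, so every object keeps its own argument bindings. Only the buffer contents change between iterations. Whether this removes the latency seen on the ZCU102 is not confirmed in this thread.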
