[ GPU ] split kernel registration from forwarding function in addition_layer_cl and transpose_cl #2810
Conversation
📝 TAOS-CI Version: 1.5.20200925. Thank you for submitting PR #2810. Please follow the 1commit/1PR (one commit per PR) policy to get comments from reviewers quickly. Your PR must pass all verification processes of cibot before a review by the reviewers can start. If you are a new member joining this project, please read the manuals in the documentation folder and wiki page. To monitor the progress of your PR in more detail, visit http://ci.nnstreamer.ai/.
@EunjuYang, 💯 All CI checkers are successfully verified. Thanks.
- This commit updates addition_layer_cl.cpp/h to inherit LayerImplCl.
- This commit implements registerClKernels() of the addition_layer_cl layer.
- This commit updates cl_context.cpp (applying addition_layer_cl's update).

Self evaluation:
Build test: [X]Passed [ ]Failed [ ]Skipped
Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Eunju Yang <[email protected]>
Force-pushed from d325aee to 64f3d0d
@EunjuYang, 💯 All CI checkers are successfully verified. Thanks.
LGTM
 * cl_blas_kernels and there is no specific kernels for this. If there are
 * specific kernels for this, it should be updated to register the kernels.
 */
static bool registerClKernels() { return true; };
Does it register the CL kernels, or does it check whether the CL kernels are already registered? (Please clarify this in the @brief.)
Missed your previous message, sorry for the delayed response! This change intends to handle layer-specific CL kernel registrations. However, both the addition and transpose layers don't actually require any specific kernels, so we're skipping the registration step and simply returning true. I've updated the @brief section accordingly. Please let me know if there's a better approach, and I'd be happy to implement it. Thanks again! 😊
And if a PR is waiting for too long and you think it's ready, please ping people.
- This commit updates transpose_cl.cpp/h to inherit LayerImplCl.
- This commit implements registerClKernels() of the transpose_cl layer.
- This commit updates cl_context.cpp (applying transpose_cl's update).
- This is the last commit to complete nnstreamer#2723.
- This can close nnstreamer#2723.

Self evaluation:
Build test: [X]Passed [ ]Failed [ ]Skipped
Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Eunju Yang <[email protected]>
Force-pushed from 64f3d0d to b32989b