-
Notifications
You must be signed in to change notification settings - Fork 2.6k
[RISCV64] Implement CPU plugin just-in-time emitter for Sqrt operation #30675
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Signed-off-by: Lim, Kuan Xian <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kuanxian1 changes look good! 👍🏼 Thank you! I left only one comment - please take a look 😊
case ov::element::i32: | ||
h->vfcvt_f_x_v(dst, src); | ||
h->vfsqrt_v(dst, dst); | ||
h->vfcvt_x_f_v(dst, dst); | ||
break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For int32
these conversion instructions will be automatically inserted by JIT Eltwise kernel. It means that if RISC-V RVV doesn't provide int32 instruction for square, we can implement only fp32
support in JIT Emitter for this op.
So may I ask you to please leave only fp32
implementation in JIT emitter impl?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for letting me know!
It seems that the problem in JIT Kernel for Eltwise ops - when we load/store values and we need to convert f32
->i32
, we use instruction vfcvt_rtz_x_f_v(...)
- conversion to integer value, which round towards zero. Looks like the such conversion rtz
is needed only for integer division.
I suggest to do the following steps:
- Remove
i32
implementation fromjit_sqrt_emitter
- Replace
vfcvt_rtz_x_f_v(...)
withvfcvt_x_f_v(...)
inload_vector
andstore_vector
methods in JIT Kernel for conversionf32 -> i32
. Please check that now your implementation works correctly fori32
too. - Support execution precision
i32
injit_divide_emitter
- as far as I remember, RISC-V RVV provide specific instruction for integer division. So I believe that support won't be difficult :)
What do you think about this solution? May I ask you to try this solution please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, I will try this solutions and get back to you with the result. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed the changes, test is passing, can you please review?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Left only one comment :)
Signed-off-by: Lim, Kuan Xian <[email protected]>
break; | ||
default: | ||
OV_CPU_JIT_EMITTER_THROW("Unsupported precision"); | ||
} | ||
} | ||
|
||
std::set<std::vector<element::Type>> jit_divide_emitter::get_supported_precisions(const std::shared_ptr<ov::Node>& node) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please add supported precision i32
here?
case ov::element::i32: | ||
h->vfcvt_f_x_v(dst, src); | ||
h->vfsqrt_v(dst, dst); | ||
h->vfcvt_x_f_v(dst, dst); | ||
break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Left only one comment :)
Details:
Testing:
Related Issue:
#30246