Optimize PReLU #144
fb3c973 to 9c55c8e
trident/kernel/cosine_similarity.py
Outdated
output_y_size: tl.int32,
output_x_size: tl.int32,
size_along_dim: tl.constexpr,
There doesn't seem to be any reason for this type to change?
"require_boundary_check": lambda args: args["size_along_dim"] % args["block_size"],
I assumed the operands used here had to be tl.constexpr for this to be resolved at compile time. Do we get the same effect even when they are tl.int32?
Yes, heuristics run on the CPU. tl.constexpr is what affects GPU kernel compilation.
Ah, right, so no change was needed here after all. I have reverted this part.
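The point made in this thread can be illustrated without a GPU. The sketch below is a simplified, hypothetical stand-in for the host-side step, not Triton's actual implementation: the helper `resolve_heuristics` and the sample sizes are assumptions made up for illustration. It shows that the heuristic lambda from the comment is an ordinary Python callable evaluated against the launch arguments, so it behaves the same whether `size_along_dim` is annotated `tl.int32` or `tl.constexpr` in the kernel signature.

```python
# Heuristic copied from the review comment: a plain Python callable.
heuristics = {
    "require_boundary_check": lambda args: args["size_along_dim"] % args["block_size"],
}


def resolve_heuristics(heuristics, launch_args):
    """Hypothetical helper: evaluate each heuristic on the host (CPU)
    against the launch arguments and return the merged arguments."""
    resolved = dict(launch_args)
    for name, fn in heuristics.items():
        resolved[name] = fn(resolved)
    return resolved


# Sample sizes are made up for illustration.
uneven = resolve_heuristics(heuristics, {"size_along_dim": 1000, "block_size": 256})
even = resolve_heuristics(heuristics, {"size_along_dim": 1024, "block_size": 256})

print(uneven["require_boundary_check"])  # 232, truthy: a boundary check is needed
print(even["require_boundary_check"])    # 0, falsy: no boundary check is needed
```

In this sketch the lambda only ever sees ordinary host integers, and only its result would be handed on to the kernel.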
9c55c8e to 96b34bd
LGTM
LGTM
Describe the pull request
Optimized PReLU.
Additional context
After
Before
Checklist