Design of Gather ops #26
Can `tcp.gather` always be written as a `tcp.gather_elements` + `tcp.broadcast`? Here's an example where I think it can, but I'm not sure if this is always possible:
Let's say we have indices of shape `<5x6xi64>`, elements of shape `<7x8x9xf32>` and output of shape `<5x6x8x9xf32>`, with gather index = 0.
Then we could:
- broadcast indices to shape `<5x6x8x9xi64>` (i.e. introduce the `8` and `9` dimensions)
- broadcast elements to `<7x6x8x9xf32>` (i.e. introduce the `6` dimension)
- do a gather elements on these with dimension `0`
I think this will get the same result and we should be able to easily fuse the broadcasts.
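For the record, here is a minimal pure-Python sketch that checks this one shape combination. It is only an illustration: tensors are modelled as dicts from index tuples to values, and `broadcast`, `gather_elements_dim0` and `gather_slices_dim0` are hypothetical helpers written for this check, not the TCP ops themselves. It also assumes `gather_elements` has the usual "output has the shape of the indices" semantics.

```python
from itertools import product

def index_space(shape):
    """Iterate over every index tuple of a tensor with the given shape."""
    return product(*(range(n) for n in shape))

def make_tensor(shape, fn):
    """Toy tensor: a dict mapping each index tuple to fn(*index)."""
    return {idx: fn(*idx) for idx in index_space(shape)}

def broadcast(tensor, out_shape, kept_dims):
    """Broadcast by introducing new dimensions (hypothetical helper).

    kept_dims lists the positions in out_shape that correspond to the
    input's dimensions; every other position is newly introduced.
    """
    return {idx: tensor[tuple(idx[d] for d in kept_dims)]
            for idx in index_space(out_shape)}

def gather_elements_dim0(elements, indices, out_shape):
    """Assumed semantics: out[i, rest...] = elements[indices[i, rest...], rest...]"""
    return {idx: elements[(indices[idx],) + idx[1:]]
            for idx in index_space(out_shape)}

def gather_slices_dim0(elements, indices, out_shape):
    """Reference slice gather: out[i, j, k, l] = elements[indices[i, j], k, l]"""
    return {(i, j, k, l): elements[(indices[(i, j)], k, l)]
            for i, j, k, l in index_space(out_shape)}

# Shapes from the example above: indices 5x6, elements 7x8x9, output 5x6x8x9.
elements = make_tensor((7, 8, 9), lambda a, k, l: 100 * a + 10 * k + l)
indices = make_tensor((5, 6), lambda i, j: (3 * i + j) % 7)
out_shape = (5, 6, 8, 9)

reference = gather_slices_dim0(elements, indices, out_shape)

# Step 1: introduce the 8 and 9 dimensions on the indices.
indices_b = broadcast(indices, (5, 6, 8, 9), kept_dims=(0, 1))
# Step 2: introduce the 6 dimension on the elements.
elements_b = broadcast(elements, (7, 6, 8, 9), kept_dims=(0, 2, 3))
# Step 3: element-wise gather along dimension 0.
via_broadcast = gather_elements_dim0(elements_b, indices_b, out_shape)

assert via_broadcast == reference
```

The check works because the broadcast copies make `elements_b[a, j, k, l]` independent of `j` and `indices_b[i, j, k, l]` independent of `k` and `l`, so the element-wise gather reads `elements[indices[i, j], k, l]`. This covers only gather dimension 0 with these shapes; it does not show that the rewrite is always possible.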
That's very interesting. I had to write a couple of examples to convince myself that it works. My worry was that the [...] Do you see any benefits to doing it this way though, other than the obvious one of reducing an op? Assuming this always works, the tradeoff I see is between having an extra op and having a somewhat complicated lowering for some gather ops (only because it is not immediately obvious how this works).
Folks, let me know your thoughts on this design of Gather ops in TCP.
That's the direct benefit I see. The indirect benefit, IMO, is that broadcast/gather fusions will become more common, which will force the backend to be more resilient in supporting fusions, and this might have beneficial knock-on effects.
Here are a couple of examples I tried (just for the record):
Example 1
Using
Using
Example 2
Using
Using
@sanjoy Updated the doc to use the broadcasting approach you proposed for gathering slices. PTAL. Once we have an implementation of this, we can test it with a variety of gather cases to ensure this approach is correct for all of them (I haven't found any incorrect cases with this so far). In the worst case, we can revert to the original plan of having a separate op. Hence, I kept that op in the doc as an alternative that was considered.
@sjarus As you had requested offline, I added examples of how gather ops from various frameworks get converted to TCP in this doc. PTAL.
This PR proposes a design for gather ops in TCP.