[GSoC 2023] Implementation of the merge lattice data structure and algorithms #17
I think @bharath2438 can answer this better, but we use a …

I see no issues with your plan so far.
To clarify, this is regarding the … I don't see a … Is the proposal that, in the template, we'd have something like the following?

```cpp
template <bool Dense, bool Compressed, ...>
class Dimension
{
public:
    static constexpr bool is_Dense = Dense;
    ...
};

// then in LatticePoint
template <class... Dimension, ...>
class LatticePoint;
```
Hi, let me try answering the questions; sorry for the late reply.

Lattice points are obviously going to be some kind of numerical datatype. We wouldn't like to restrict them to an integer, though, so we'll be using a type alias.

The expression is represented by a constexpr/consteval function (currently all tests use constexpr functions that represent these expressions; I remember @hameerabbasi and I decided it wasn't a good idea to use consteval, but I do not remember why).

For now, this assumes that some higher-level code-generation algorithm will provide us the function itself, along with the levels.

I think these are already taken care of as part of coiteration. Once the coiteration code gets the constexpr function specifying the type of merge, it should be able to merge the levels.
Notes from 03/19/23:

Would a good preliminary PR (before GSoC) be to implement the following?

cc: @bharath2438
@bharath2438 I'm actually a bit confused now. Where is the merge taking place? E.g. would the draft PR in #19 be the right direction?
The constexpr compare function takes in boolean inputs and gives out a boolean using the expression. These booleans are obtained by comparing "iterators" over the levels to their "endpoints". Example: let's say we have 3 levels to be merged — a, b and c. At some point during the iteration, a hasn't reached its end while b and c have reached their respective ends.

The bottom line is: coiteration.hpp outputs an iterator that is able to iterate over the merged dimensions of the input levels. What we need to do is implement a higher-level algorithm that is able to give coiteration the constexpr function, as well as handle things at the "tensor storage format" level (COO, CSR, etc. that have stacked levels).
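A minimal sketch of what such a constexpr compare function could look like. The function name `keep_iterating`, the three-level expression `a * (b + c)`, and the boolean convention are all illustrative assumptions here, not the library's actual API:

```cpp
// Hypothetical sketch: a constexpr function encoding the merge expression
// a * (b + c) over three levels. Each boolean input means "this level's
// iterator has NOT yet reached its end"; multiplication maps to a
// conjunctive merge (AND) and addition to a disjunctive merge (OR).
constexpr bool keep_iterating(bool a, bool b, bool c)
{
    return a && (b || c);
}

// Matching the example above: a has not reached its end, b and c have.
// For a * (b + c), every remaining product is zero, so co-iteration stops.
static_assert(keep_iterating(true, false, false) == false);
static_assert(keep_iterating(true, true, false) == true);
```

Coiteration would evaluate this predicate at each step to decide whether any further indices can still contribute to the output.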
I think the actual meaning is to flip OR to AND. Consider a disjunctive merge (or). The same works for a conjunctive merge (and).
Notes from 3/26/23: Summary direction for Adam's GSoC proposal

The goal of GSoC 2023, for me at least, will be the completion of, or progress towards, the merge lattice and its related data structures and algorithms. This is one of the final steps needed in order to implement the code-generation algorithm.

An intuition for merge lattices in terms of the numpy API is the notion of "broadcasting", e.g. adding two arrays, some of which have axes inserted.

TODO:
Notes augmenting the previous conversation
Some more notes on this.

The situation is that we are trying to take two or more levels and do a conjunctive merge, a disjunctive merge, or a mixture of merge operations with them. The current codebase assumes the levels' indices are already sorted and unique. (Example of an unsorted level: a hash map.) Here is a description of the high-level conjunctive/disjunctive merge algorithm that leverages the coiteration algorithm.

A conjunctive (and) merge algorithm proceeds by advancing the iterators for the levels at the same time to find the common elements of each vector (i.e. common indices can be multiplied together; a non-common index would have been multiplied by "0", so it would not show up in the output anyway). That is, the output format only gets appended to if there are common indices. At each step, we increment the iterator of the level whose index is the minimum. If both levels' indices are the minimum (i.e. the same value), we increment both.

A disjunctive (or) merge algorithm proceeds similarly, but appends to the output at every index from each iterator that we are merging.

Result: this yields the indices of all level formats that we want to use to construct the output.

A note about determining when the iterator is at its end:
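The two merge algorithms above can be sketched over plain sorted index vectors. This is a hedged illustration of the described two-pointer logic, assuming each level exposes its sorted, unique indices as a `std::vector<std::size_t>`; the function names are illustrative, not the library's API:

```cpp
#include <cstddef>
#include <vector>

// Conjunctive (AND) merge: emit only indices common to both levels;
// otherwise advance whichever iterator points at the minimum index
// (a non-match would have been multiplied by zero anyway).
std::vector<std::size_t> conjunctive_merge(const std::vector<std::size_t>& a,
                                           const std::vector<std::size_t>& b)
{
    std::vector<std::size_t> out;
    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        if (a[i] == b[j]) { out.push_back(a[i]); ++i; ++j; }  // common index
        else if (a[i] < b[j]) ++i;  // a's index is the minimum: advance a
        else ++j;                   // b's index is the minimum: advance b
    }
    return out;
}

// Disjunctive (OR) merge: append every index seen by either iterator,
// keeping the output sorted and unique.
std::vector<std::size_t> disjunctive_merge(const std::vector<std::size_t>& a,
                                           const std::vector<std::size_t>& b)
{
    std::vector<std::size_t> out;
    std::size_t i = 0, j = 0;
    while (i < a.size() || j < b.size()) {
        if (j == b.size() || (i < a.size() && a[i] < b[j]))
            out.push_back(a[i++]);
        else if (i == a.size() || b[j] < a[i])
            out.push_back(b[j++]);
        else { out.push_back(a[i]); ++i; ++j; }  // equal: emit once, advance both
    }
    return out;
}
```

For example, merging indices `{1, 3, 5, 7}` and `{3, 4, 5}` conjunctively gives `{3, 5}`, and disjunctively gives `{1, 3, 4, 5, 7}`.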
Augmenting coiteration to deal with unordered levels (i.e. hash maps)

The API that calls coiteration should and will eventually perform the following logic: if the level is sortable, sort it. Otherwise, if it is not sortable, then it must be …

If the unordered level is part of a conjunctive merge, we can iterate through the level that is ordered and use …

Currently, we want logic that implements this depending on whether one of the level formats passed to …
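The conjunctive case described above can be sketched as follows, assuming the unordered level is backed by a hash map from index to value (the function name and signature are illustrative assumptions, not the library's API):

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical sketch: conjunctive merge of one ordered level with one
// unordered (hashed) level. Instead of co-iterating both, we walk the
// ordered level's indices and probe the hash map for each one, so the
// output inherits the ordered level's ordering without sorting the map.
std::vector<std::size_t>
conjunctive_with_hashed(const std::vector<std::size_t>& ordered_indices,
                        const std::unordered_map<std::size_t, double>& hashed)
{
    std::vector<std::size_t> out;
    for (std::size_t idx : ordered_indices)
        if (hashed.count(idx) != 0)  // O(1) expected probe per index
            out.push_back(idx);
    return out;
}
```

For a disjunctive merge this trick does not apply, since indices present only in the hash map must also appear in the output, which is why the sortable/unsortable distinction matters.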
Summary of GSoC 2023

#22, #24, #25, #26, #29, #30 and #31 addressed the main problems described here.

Merge Lattice work

The relevant work left to do now is to work on iterating …, and to work on a …
The last point seems like it will make co-iterating quite complex, because to advance the iterator of the MergeLattice, one would need to possibly recurse up the levels of a Tensor and then recurse back down them when one hits the end of an inner level. For example, say Tensor A comprises levels (a, b, c) stacked. If we iterate over Tensor A, then we start with the beginning of …

It is because of this fact that @bharath2438 and I discussed perhaps implementing a very simple MergeLattice first that assumes the input Tensors have exactly the same dimensions and we are only dealing with a conjunctive merge, e.g. two CSR arrays being multiplied together. This should be a simple generalization of the unit test added in #31. Then we can generalize the MergeLattice implementation to handle additional layers of complexity.

Misc.

cc: @hameerabbasi and @bharath2438 — let me know if you think I missed anything. It would be cool to try implementing the MergeLattice, since the latest PR #31 shows us how to run the API. Then, with a basic implementation of workspaces, couldn't this be relatively usable for basic sparse tensor operations?
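The up-and-down recursion over stacked levels can be illustrated with a CSR-like pair of levels. This is only a sketch of the control flow (a plain nested loop standing in for the recursive iterator), not the library's iterator API; `pos`/`crd` follow the usual compressed-level naming:

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Illustrative sketch: iterating a CSR-like tensor with an outer dense
// "rows" level and an inner compressed level (pos/crd). Exhausting one
// row's slice of crd forces us back "up" to the row level before
// descending into the next row's slice -- the recursion described above.
std::vector<std::pair<std::size_t, std::size_t>>
iterate_csr(const std::vector<std::size_t>& pos,  // size = nrows + 1
            const std::vector<std::size_t>& crd)  // column indices
{
    std::vector<std::pair<std::size_t, std::size_t>> coords;
    for (std::size_t row = 0; row + 1 < pos.size(); ++row)     // outer level
        for (std::size_t p = pos[row]; p < pos[row + 1]; ++p)  // inner level
            coords.emplace_back(row, crd[p]);                  // emit (i, j)
    return coords;
}
```

A MergeLattice iterator over two such tensors would need to keep this per-level position state for each operand and advance them in lockstep, which is where the complexity comes from.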
I wanted to use this GH issue to (i) track the implementation of the merge lattice and (ii) scope a possible preliminary PR before the GSoC due date. This will be iterated upon in conjunction with the GSoC proposal Google Doc.
Summary
I'm opening this issue to start preliminary work on the merge lattice data structure implementation. Following Sections 4–6 (mainly Section 5) in https://dl.acm.org/doi/pdf/10.1145/3133901, there seem to be a few things that need to be implemented:
Some open questions that I am still sorting out
A few open questions I have will be documented here and migrated into the GH issue description when I understand them better, or an answer is given.
Downstream tasks
Optimizations: Once the merge lattice is implemented, tested, and merged, it can also be optimized using the tips in Section 5.2. As we implement the merge lattice, we can keep these in mind.
Code-generation: The code-generation algorithm in Figure 11a looks pretty straightforward once the merge lattice is complete, but I presume complexities may arise as my understanding improves.
Misc.
My understanding from reading the codebase and from our two calls so far is that the level-format representation, properties and functionality have all been more or less implemented, so the next step is finally the code-generation algorithm.
Please let me know if there are any issues you see with this plan.
cc: @hameerabbasi and @bharath2438
Relevant papers: