
graph: use builtin bidirectional graph #1240

Merged
merged 3 commits into flux-framework:master on Jul 11, 2024

Conversation

@trws (Member) commented Jul 7, 2024

problem: walking inward edges is difficult and slow because they're mixed in with the out_edges range (we explicitly add an out-edge to every vertex matching each inward edge, for some reason). This means we have to filter them out during traversals and maintain them manually. They show up almost like another subsystem (the "in" subsystem) in some readers, and aren't generated at all by others.

solution: switch from directed to bidirectional graph type, remove the explicit addition of "in" edges, and fix the graph test to match the actual number of edges.

I was very surprised by how easy this was; I would have sworn I'd tried it and failed at least once before. The main possible downside is that bidirectional graphs have to store every edge twice, and thus should require twice as much storage for edges as directed graphs. Strangely, switching to this didn't increase memory usage at all. It stayed identical, and I don't know why. The even better news is that after removing the explicit back-edges, this actually saved 500MB for a 16,000-node, 64-core graph, and made running the test under heaptrack take a full second less with 5,000,000 fewer allocations (17.5s -> 16.5s, and 75M allocations down to 70M, for just loading the graph and stopping).
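For reference, a minimal sketch of the selector change (the property bundles and type names here are placeholders; Fluxion's actual graph type lives in resource/schema/resource_graph.hpp):

```cpp
#include <boost/graph/adjacency_list.hpp>

// Placeholder property bundles standing in for Fluxion's vertex/edge
// properties.
struct vtx_props { /* ... */ };
struct edg_props { /* ... */ };

// Before: directedS stores only an out-edge list per vertex, so "in"
// edges had to be added explicitly as extra out-edges.
using directed_g = boost::adjacency_list<boost::vecS, boost::vecS,
                                         boost::directedS,
                                         vtx_props, edg_props>;

// After: bidirectionalS also maintains an in-edge list per vertex,
// making boost::in_edges (v, g) available directly.
using bidirectional_g = boost::adjacency_list<boost::vecS, boost::vecS,
                                              boost::bidirectionalS,
                                              vtx_props, edg_props>;
```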

@jameshcorbett (Member) left a comment:

Looks reasonable. Two questions:

  1. I'm guessing the JGF reader is one of the readers that doesn't bother with in-edges?
  2. Are there no logic changes or simplifications that can be done in traversals now that in-edges are gone?

@milroy (Member) commented Jul 8, 2024

I'm impressed it took so few changes to get bidirectionalS to work!

A couple of things to consider: along the lines of @jameshcorbett's question, we should be able to compress the JGFs by removing the "in" edges.

Also, I noticed that the changes in this PR also compile and pass CI with undirectedS. @trws, can you run the tests you described with boost::undirectedS?

@trws (Member, Author) commented Jul 8, 2024

I didn't find where "in" edges were being created in JGF, but I'll check again. If they're there, they should definitely go away.

I could run them for undirected, but that brings me to the answer to @jameshcorbett's question 2. This PR is just about switching the base graph type, to keep it simple. If we keep directional edges, we can distinguish between edges going up and down the containment hierarchy in O(1) time; if we use undirected, we'll be back to O(N) time. I haven't done the optimizations to take advantage of that for upward walks yet, but the performance improvement I got is partly from removing the inward edges we no longer need to consider from the downward walk. There are several other things we can now do because the difference between out_edges and in_edges exists and is constant time. I'm also planning to wrap a small data structure for use in place of OutEdgeList that will let us get edges by type in either direction in O(1) as well; then we can do subsystems efficiently.
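To illustrate the O(1) distinction, a sketch assuming a bidirectionalS adjacency_list (the function and output here are hypothetical, not Fluxion code):

```cpp
#include <boost/graph/adjacency_list.hpp>
#include <iostream>

using g_t = boost::adjacency_list<boost::vecS, boost::vecS,
                                  boost::bidirectionalS>;

// Down-walks and up-walks use disjoint edge ranges; neither direction
// needs to filter the other's edges out of a mixed list.
void show_neighbors (const g_t &g, g_t::vertex_descriptor v)
{
    for (auto [ei, ee] = boost::out_edges (v, g); ei != ee; ++ei)
        std::cout << "down to: " << boost::target (*ei, g) << "\n";
    for (auto [ei, ee] = boost::in_edges (v, g); ei != ee; ++ei)
        std::cout << "up to: " << boost::source (*ei, g) << "\n";
}
```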

@trws (Member, Author) commented Jul 8, 2024

Oh, that's somewhat interesting. JGF never actually creates "in" edges explicitly. It will serialize them if they exist in a graph read from another reader, and it will deserialize them if they're in the json, but it will not create them based on the "containment" edges. I guess it never actually mattered before, because all the JGF-format graphs were created from graphs populated by other readers. There's no actual code to remove, but I'll add a commit to remove the "in" edges from the test files and ignore them if they are found in an existing file.

@trws (Member, Author) commented Jul 8, 2024

Interesting results from undirectedS, not quite what I expected. It came out slightly smaller, another ~100 megabytes taking the total down to 5.2GB RSS, that's more or less expected. The surprise is that the number of allocations and the runtime went back up to be the same as the directedS case. Apparently just getting the up edges out of the out_edges list is enough to get a bit of a win.

@trws (Member, Author) commented Jul 8, 2024

Ok, this should be ready for another pass. The JGF reader now ignores containment edges of the "in" type, correctly resolves parent vertices for the remove test that was relying on the old behavior, finds those parent vertices much more efficiently, and the JGF test files no longer contain either "in" or "draws_power" edges. Despite some of the json files getting changed from single-line to formatted, it's a net win of almost 40,000 lines, which is a bit crazy. The JGF reader needs a rework, but this at least gets things cleaned up for this purpose.

@milroy (Member) left a comment:

Pending Zeke's input on one changed function, this looks good, and it has the side effect of removing >38,000 of the 142,886 lines (!) in the power JGF.

@@ -14,6 +14,7 @@
#include "resource/schema/resource_data.hpp"
#include <boost/config.hpp>
#include <boost/graph/adjacency_list.hpp>
#include <boost/graph/graph_selectors.hpp>
A Member commented:

Fluxion builds and passes the testsuite without graph_selectors.

Comment on lines +1159 to +1240
if (g[*ei].name.contains ("containment")) {  // ei iterates the child's in-edges
    parent_vtx = next_vtx;
    rc = 0;
    break;
A Member commented:

This looks good for containment, but I seem to remember there was a corner case that motivated @zekemorton to implement the check this way. Zeke, can you comment here?

@trws (Member, Author) replied:

FWIW, when the graph was a directedS graph, the approach I'm using here would not have worked at all, since there wouldn't be a way to get in_edges like this, and even if we iterated all vertices to find them, there would have been in-edges from the children as well. As long as all graphs follow the invariant that there's always a containment tree and it's always single-parent, this should be fine. If we have a case where either of those is violated, it would be wrong, but I'd want to know what that case is and why it invalidates the invariant.

This reminds me, by the way: you asked why everything, every component, has to be in the containment hierarchy, and this is one example of why. If we want there to be other non-containment hierarchies (things like networking can't be hierarchies, but they shouldn't be anyway, they should be extra links between things that exist in other hierarchies) we could do that, but it would mean having the hierarchy the thing is resident in be part of the vertex, or perhaps the vertex type, and then having some operations that need to traverse more than one.
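A self-contained sketch of that in-edge parent lookup under the single-parent containment invariant (the property layout and function name are illustrative, not the exact Fluxion code):

```cpp
#include <boost/graph/adjacency_list.hpp>
#include <map>
#include <string>

// Illustrative property/graph types; the real ones live in
// resource/schema/resource_graph.hpp.
struct edg_props {
    std::map<std::string, std::string> name;  // keyed by subsystem
};
using g_t = boost::adjacency_list<boost::vecS, boost::vecS,
                                  boost::bidirectionalS,
                                  boost::no_property, edg_props>;

// Under the invariant, a vertex has at most one "containment" in-edge,
// and its source is the parent.
int find_parent (const g_t &g, g_t::vertex_descriptor v,
                 g_t::vertex_descriptor &parent)
{
    for (auto [ei, ee] = boost::in_edges (v, g); ei != ee; ++ei) {
        if (g[*ei].name.contains ("containment")) {
            parent = boost::source (*ei, g);
            return 0;
        }
    }
    return -1;  // no containment parent (e.g., the cluster root)
}
```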

@milroy (Member) commented Jul 10, 2024

> If we want there to be other non-containment hierarchies (things like networking can't be hierarchies, but they shouldn't be anyway, they should be extra links between things that exist in other hierarchies)

This comes up in the power JGF, where the power subsystem is also a hierarchy. The current implementation can return the parent in the power subsystem instead of the parent in the containment hierarchy, which actually might not be desirable.

I think returning the parent vertex in the containment hierarchy will work well, and it's lower complexity.

@trws (Member, Author) replied:

We could make it capable of doing both with the same complexity if we pass in which edge type to consider, or have it embedded in the vertex type. Maybe something to consider for later.

A Member commented:

> The current implementation...

In case it wasn't clear (I think it was), I meant current flux-sched master, not the change in the PR.

@milroy (Member) commented Jul 10, 2024

Do the changes in this PR allow simplification of the by_outedges metadata?

@milroy (Member) commented Jul 10, 2024

> Interesting results from undirectedS, not quite what I expected. It came out slightly smaller, another ~100 megabytes taking the total down to 5.2GB RSS, that's more or less expected. The surprise is that the number of allocations and the runtime went back up to be the same as the directedS case. Apparently just getting the up edges out of the out_edges list is enough to get a bit of a win.

Interesting result. I was curious to see if undirectedS would result in a significant reduction in RSS. If so, we could have offered it as a build-time option that might be useful for specific graph topologies. It doesn't look like the RSS reduction is worth the increased edge comparison complexity.

codecov bot commented Jul 10, 2024

Codecov Report

Attention: Patch coverage is 92.85714% with 1 line in your changes missing coverage. Please review.

Project coverage is 74.1%. Comparing base (5e54a68) to head (fec2715).
Report is 25 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff            @@
##           master   #1240     +/-   ##
========================================
- Coverage    74.1%   74.1%   -0.1%     
========================================
  Files         103     103             
  Lines       14606   14584     -22     
========================================
- Hits        10828   10810     -18     
+ Misses       3778    3774      -4     
Files Coverage Δ
resource/readers/resource_reader_grug.cpp 83.0% <ø> (-0.3%) ⬇️
resource/readers/resource_reader_hwloc.cpp 65.0% <ø> (+0.1%) ⬆️
resource/readers/resource_reader_rv1exec.cpp 71.1% <ø> (+0.1%) ⬆️
resource/schema/resource_graph.hpp 0.0% <ø> (ø)
resource/readers/resource_reader_jgf.cpp 68.8% <92.8%> (-0.2%) ⬇️

... and 1 file with indirect coverage changes

@trws (Member, Author) commented Jul 10, 2024

I agree with that assessment. Honestly I expected it to be much more significant, but 100MB out of 5.3GB... 🤷

> Do the changes in this PR allow simplification of the by_outedges metadata?

I had to go look. It reduces the amount of work to maintain it slightly, by reducing the out-edges considered during planner updates, but the by_outedges metadata itself is independent of the number of out-edges; it's actually keyed by a weight calculated through the subplan. Something I didn't realize, but that seems really important now that I do, is that the heuristic works like this (comment from the code):

        // Set dynamic traversing order based on the following heuristics:
        //     1. Current-time (jobmeta.now) resource availability
        //     2. Last pruning filter resource type (if additional
        //        pruning filter type was given, that's a good
        //        indication that it is the scarcest resource)

This means that in our current configuration, the heuristic is calculated based on the rack:node plan, I think? Unless it only counts the prune filters that can be relevant at a given level, in which case it would be ALL:gpu at the node level. That seems... not ideal, as does the heuristic. I may have to give that some thought. There is definitely an optimization that can be made there though, since it's currently removing (and deallocating) the map node, modifying the key, then re-inserting it. We can fix that with .extract now.
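The .extract fix would look roughly like this (a sketch; the real by_outedges key and mapped types differ):

```cpp
#include <cstdint>
#include <map>

using out_edge_map = std::multimap<uint64_t, int>;  // illustrative types

// Re-key a map entry without deallocating and reallocating its node:
// C++17 extract() detaches the node handle, the key is mutable through
// the handle, and insert() relinks the same node.
void rekey (out_edge_map &m, out_edge_map::iterator it, uint64_t new_key)
{
    auto node = m.extract (it);       // detach; no deallocation
    node.key () = new_key;            // mutate key through the handle
    m.insert (std::move (node));      // re-link the existing node
}
```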

trws added 3 commits July 11, 2024 09:05
problem: walking inward edges is difficult and slow because they're
mixed in with the `out_edges` range (we explicitly add out-edges to
every vertex matching each inward edge for some reason).  This means we
have to filter them out during traversals, and maintain them manually.
They show up almost like another subsystem (the "in" subsystem) in some
readers, and aren't generated at all by others.

solution: switch from directed to bidirectional graph type, remove the
explicit addition of "in" edges, and fix the graph test to match the
actual number of edges.


In the end, the only required logic change was in the jgf reader, where
it failed to find a parent vertex because it was only checking out edges
of the child. That is fixed in a simplistic way right now, but could be
optimized massively.
problem: these were always unnecessary, and they waste space, loading time, and storing time

solution: remove them
problem: get_parent_vtx checks the string component of the path value to
determine which connected vertex is the parent, rather than checking
which containment vertex has an incoming edge to the current vertex
(there can be only one)

solution: check for an in-edge of the right type
@trws trws added the merge-when-passing label (mergify.io - merge PR automatically once CI passes) Jul 11, 2024
@mergify mergify bot merged commit fc4fcde into flux-framework:master Jul 11, 2024
21 checks passed