`ToGraph` maybe doesn't need so many methods? #151

michaelpj · 2018-11-30T17:02:53Z

I understand that the intention of ToGraph, like Foldable, is that some of the methods can be overridden to provide better performance.

But in practice it seems like the only ones that actually matter are

foldG (special implementation for Fold)
toGraph, toAdjacencyMap, toAdjacencyIntMap (special implementations for those very types or similar ones)

AFAICT the overriding implementations are all equivalent to calling the default implementations on ToGraph with the specialised version of e.g. toAdjacencyMap. For example, LabelledAdjacencyMap just delegates to the AdjacencyMap functions after calling skeleton... which is just its definition of toAdjacencyMap.

So I think we could have a ToGraph class that just had foldG and the toX methods, and then provide some generic implementations of the others as functions with a ToGraph constraint, rather than class methods.

Is this actually a problem? Not enormously, it just looks odd to a reader to have such a profusion of class methods, which will potentially grow indefinitely as more algorithms are added.

(I might just be missing some overrides that are in fact performance-critical, of course!)

The text was updated successfully, but these errors were encountered:

michaelpj · 2018-11-30T17:52:08Z

As an example of how this can lead to some (accidental?) inconsistency, scc isn't on ToGraph, although it's perfectly definable in terms of toAdjacencyMap.

snowleopard · 2018-12-01T00:36:44Z

I'm not a big fan of ToGraph either and I'd be happy to trim it down, so let me list all current methods and give some rational for why they are not just functions with a ToGraph constraint:

foldg is the only interesting method really.
toGraph could in principle be demoted to a function, although it feels natural to define a ToGraph instance by implementing the toGraph method. The Graph data type has a nice implementation id, but I think performance-wise we should be able to simplify the default implementation foldg Empty Vertex Overlay Connect to id using @nobrakal's rewrite rules.
isEmpty: this one is unlikely to ever become a performance bottleneck, but still there are instances with O(1) run time cost, which you can't achieve when going via Graph or AdjacencyMap.
size: I think this one should just be deleted from the ToGraph module altogether, since it only makes sense for algebraic representations.
hasVertex, hasEdge, vertexCount, edgeCount: like isEmpty, these should probably stay, because some instances have O(1) implementations, which you can't achieve when going via Graph or AdjacencyMap. As examples, consider Data.Graph from containers (which I plan to make an instance of ToGraph), or an adjacency matrix that can answer hasEdge queries in O(1).
vertexList, vertexSet, vertexIntSet, preSet, preIntSet, postSet, postIntSet: algebraic representations like Graph can implement these functions more efficiently than by going via an AdjacencyMap (even if they currently don't do that).
edgeList, edgeSet, adjacencyList: again, some instances like Relation give you faster access to various collections of edges.
adjacencyMap, adjacencyIntMap, adjacencyMapTranspose, adjacencyIntMapTranspose: these could probably be removed from ToGraph. They can be efficiently implemented by going via the corresponding to* methods (at least until we change the implementation of AdjacencyMaps).
dfsForest, dfsForestFrom, dfs, reachable, topSort, isAcyclic, isDfsForestOf, isTopSortOf: all of these are currently implemented by going via the AdjacencyMap, but I hope to provide alternative implementations for algebraic graphs, which will be faster on dense graphs.
toAdjacencyMap, toAdjacencyMapTranspose, toAdjacencyIntMap, toAdjacencyIntMapTranspose: these seem to be important, at least for now, when we don't know many interesting algorithms that operate directly on algebraic graph representations. Why do we need transpose versions? Because you can have instances for which computing the transpose is just O(1), for example adjacency maps that store both forward and reverse edge directions, separately.

As an example of how this can lead to some (accidental?) inconsistency, scc isn't on ToGraph, although it's perfectly definable in terms of toAdjacencyMap.

This is not accidental. I just don't know how to make it a method of the ToGraph type class, because it returns a doubly-layered graph. The best we could do is something like this:

scc ::  (ToGraph g, ToGraph h, ToVertex h ~ g) => g -> h

Alas, this doesn't really express the fact that the inner graphs in the scc result are non-empty.

Summary: Here are the methods we should probably remove from ToGraph: size (remove from the module altogether, it was just a mistake), adjacencyMap, adjacencyIntMap, adjacencyMapTranspose, adjacencyIntMapTranspose.

@michaelpj Does this sound reasonable?

michaelpj · 2018-12-02T21:42:21Z

Hm, it seems like many of these do in fact have at least the potential for more efficient implementations.

I would be tempted to split out the "can be folded into a graph" class from the "overridable graph operations class" for clarity, but that's somewhat subjective and based on my confusion on first discovering the class.

i.e. something like

class ToGraph t where
  toGraph :: ...
  foldg :: ...

class ToGraph t => GraphOps t where
   hasVertex :: ...
   ...

You could even drop the ToGraph constraint and just offer default signatures that add it.

This is not accidental. I just don't know how to make it a method of the ToGraph type class, because it returns a doubly-layered graph. The best we could do is something like this:

Couldn't you concretely return an AdjacencyMap (NonEmpty.AdjacencyMap a) in this case? It's not maximally generic, but I'm not sure how much that matters.

snowleopard · 2018-12-03T00:19:16Z

I have a feeling that splitting off ToGraph and GraphOps might bring even more confusion... Let's keep this issue open, perhaps we'll figure out a better approach some day.

Couldn't you concretely return an AdjacencyMap (NonEmpty.AdjacencyMap a) in this case? It's not maximally generic, but I'm not sure how much that matters.

Well, in this case I won't be able to add scc for algebraic graphs that returns Graph (NonEmpty.Graph a), which is annoying. I think there should be some generic way of getting to the NonEmpty version for all our graph implementations.

See #13, #151

This revision addresses a few issues: * Improves the testsuite (#13) by switching to simpler graph `API` type class. * Simplifies the `ToGraph` class, as discussed in #151. * Removes `Algebra.Graph.Fold` which, while cool, appears to be useless in practice. * Removes dependencies on `base-compat` and `base-orphans`, which are no longer critical.

michaelpj mentioned this issue Nov 30, 2018

Higher-kinded ToGraph? #153

Open

snowleopard added enhancement design labels Dec 1, 2018

snowleopard added a commit that referenced this issue May 27, 2019

Drop Fold, simplify ToGraph, refactor testsuite

1fbda0d

See #13, #151

snowleopard mentioned this issue May 27, 2019

Drop Fold, simplify ToGraph, refactor testsuite #205

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`ToGraph` maybe doesn't need so many methods? #151

`ToGraph` maybe doesn't need so many methods? #151

michaelpj commented Nov 30, 2018

michaelpj commented Nov 30, 2018

snowleopard commented Dec 1, 2018 •

edited

Loading

michaelpj commented Dec 2, 2018

snowleopard commented Dec 3, 2018

ToGraph maybe doesn't need so many methods? #151

ToGraph maybe doesn't need so many methods? #151

Comments

michaelpj commented Nov 30, 2018

michaelpj commented Nov 30, 2018

snowleopard commented Dec 1, 2018 • edited Loading

michaelpj commented Dec 2, 2018

snowleopard commented Dec 3, 2018

`ToGraph` maybe doesn't need so many methods? #151

`ToGraph` maybe doesn't need so many methods? #151

snowleopard commented Dec 1, 2018 •

edited

Loading