Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Interest in other graph libraries #6

Open
travitch opened this issue Jul 6, 2018 · 7 comments
Open

Interest in other graph libraries #6

travitch opened this issue Jul 6, 2018 · 7 comments
Assignees

Comments

@travitch
Copy link

travitch commented Jul 6, 2018

Thanks for this - graph library performance is definitely not obvious. Do you have any interest in additional graph libraries to examine? I was a bit frustrated with fgl performance a while ago and put together a library that I thought would address most of the problems I had with it. It might be of interest to you: https://github.com/travitch/haggle (unfortunately, not up on hackage yet).

It has:

  • A few different data representations to accommodate different graph topologies, some of which are very space efficient
  • Graphs with/without edge and node labels
  • Optional mutable graph creation (in IO/ST) via a freeze and thaw interface
@nobrakal
Copy link
Collaborator

nobrakal commented Jul 6, 2018

Do you have any interest in additional graph libraries to examine?

Yes of course, thanks for the link !

The approach is very interesting, I will add benchs for haggle asap :)

@nobrakal nobrakal self-assigned this Jul 6, 2018
@nobrakal
Copy link
Collaborator

nobrakal commented Jul 7, 2018

@travitch I just spent some time trying to add your library. Some questions, to ensure that I don't miss anything:

  • If I want to deal with simple directed graph (like the graphs of FGL), is Data.Graph.Haggle.SimpleBiDigraph the best choice ?
  • If yes, if there a possible NFData instance for MSimpleBiDigraph and SimpleBiDigraph ? Maybe the first can be made by freezing the mutable graph and then use a the one defined we can define for SimpleBeDiGraph ?
  • You don't provide anything like mkGraph, so I wrote:
mkGraph e = let maxV = extractMaxVertex e
        in do
          g <- newSizedMSimpleBiDigraph maxV undefined
          vert <- replicateM maxV (addVertex g)
          mapM_ (\(u,v) -> addEdge g (vert !! u)(vert !! v)) e
          return g

where e is a list of edges (of type [(Int,Int)] and maV is maxV = max $ (\x y -> x ++ y) $ unzip e. The vertices in the graph are made from [0..(maxV-1)].

Is it ok ?

@travitch
Copy link
Author

travitch commented Jul 8, 2018

Thanks for taking a look at it!

  • Which graph is best depends a bit on what you want to test. The most direct analog to FGL is the PatriciaTree, which should be almost the same performance (but with some better strictness properties). SimpleBiDigraph will be more compact in memory and should probably be faster, while still supporting most of the same operations. Digraph might be interesting as well - it is much more compact, but 1) doesn't provide access to predecessors for nodes, and 2) has an expensive test for edge existence.
  • I just added the missing NFData instances for the graph types. I don't think it is really possible to make instances for the mutable graphs, as they contain IO/STRefs. Freezing the graph and then using the NFData instance for the immutable graphs is probably the best bet there.
  • That is a good way to construct graphs. I didn't provide that function, as I ran into a number of use cases where that wasn't quite enough for my needs (e.g., when there are nodes in the graph that don't have any edges to them, the edge list construction is awkward).

@nobrakal
Copy link
Collaborator

nobrakal commented Jul 9, 2018

Thanks for the work :)

I started mine here: https://github.com/haskell-perf/graphs/tree/haggle (no many things are benchmarked, I will dig deeper inside the API soon).

I think the more valuable is SimpleBiDiGraph since it is closer to others libraries but with a different implementation.

I am just stuck with a thing: How can I bench a graph modification, since it is encapsulated inside a PrimMonad ? Sadly, I have no NFData instance available... Do you have any idea ? Ideally, I would fully evaluate the graph creation (to ensure no time is wasted after), then adding my edge, then fully evaluate the result. I don't know if this is possible...

@travitch
Copy link
Author

If you are running in the IO Monad, the modifications to the underlying data structures (which are MVectors and IORefs) happen exactly when you expect. The modifications to IORefs are done using modifyRef', which evaluates its argument to WHNF (since they are all Ints, that is NF). I guess my argument is that there isn't anything else to force, but I don't have a proof that there are no other thunks lying around.

For the get* operations on MSimpleBiDigraph, you could deepseq the returned values, at least.

@nobrakal
Copy link
Collaborator

If we are sure that everything is evaluated, as you are saying, then the NFData instance could simply be const (), it should do the trick... I will make some experiments ^^.

The problem of get* operations is that they come encapsulated in the PrimMonad so the NFData is not simple...

@nobrakal
Copy link
Collaborator

So I have do some work.

Criterion provide a function to benchmark a IO action (here it apply for mutable graphs). The only problem is that it will benchmark the whole action. So I need to benchmark the creation itself with the "action" (like addEdge) then fully evaluate the results.

Fortunately, I have implemented it for others libraries, so we can compare :)

I think my function to build a graph from a list of edges is very poor. Here are the results:

Doing:



Using [("Mesh",3),("Clique",3)] as graphs

edgeCount

Description: Count the edges of the graph

Mesh

1 10 100
Alga 51.52 ns 7.943 μs 163.2 μs
Containers 30.87 ns 388.1 ns 4.460 μs
Fgl 36.41 ns 5.869 μs 202.5 μs
Haggle 914.9 ns 34.31 μs 455.7 μs
Hash-Graph 39.54 ns 5.985 μs 198.2 μs

Clique

1 10 100
Alga 51.51 ns 29.03 μs 6.486 ms
Containers 31.38 ns 1.051 μs 107.4 μs
Fgl 36.31 ns 17.85 μs 10.12 ms
Haggle 904.9 ns 92.23 μs 13.55 ms
Hash-Graph 39.48 ns 19.89 μs 7.820 ms

SUMMARY:

  • Containers was the fastest 6 times

ABSTRACT:
(Based on an average of the ratio between largest benchmarks)

  • Containers was 112.93 times faster than Haggle
  • Alga was 2.48 times faster than Haggle
  • Hash-Graph was 2.05 times faster than Haggle
  • Fgl was 1.84 times faster than Haggle

vertexCount

Description: Count the vertices of the graph

Mesh

1 10 100
Alga 35.03 ns 910.1 ns 14.45 μs
Fgl 25.93 ns 4.446 μs 182.4 μs
Haggle 919.7 ns 33.88 μs 456.4 μs
Hash-Graph 26.36 ns 5.281 μs 192.5 μs

Clique

1 10 100
Alga 34.67 ns 3.247 μs 798.9 μs
Fgl 25.66 ns 12.37 μs 5.190 ms
Haggle 899.5 ns 93.55 μs 13.26 ms
Hash-Graph 25.62 ns 18.58 μs 7.770 ms

SUMMARY:

  • Alga was the fastest 4 times

There was 2 ex-aequo

ABSTRACT:
(Based on an average of the ratio between largest benchmarks)

  • Alga was 21.76 times faster than Haggle
  • Fgl was 2.54 times faster than Haggle
  • Hash-Graph was 1.94 times faster than Haggle

addEdge

Description: Add an edge (not already in the graph)

Mesh

1 10 100
Alga 50.26 ns 747.4 ns 9.611 μs
Fgl 132.0 ns 6.361 μs 213.3 μs
Haggle 1.291 μs 35.80 μs 461.4 μs
Hash-Graph 159.8 ns 7.108 μs 212.8 μs

Clique

1 10 100
Alga 50.43 ns 2.484 μs 257.6 μs
Fgl 133.2 ns 19.71 μs 10.41 ms
Haggle 1.328 μs 95.85 μs 13.15 ms
Hash-Graph 164.3 ns 23.07 μs 8.069 ms

SUMMARY:

  • Alga was the fastest 18 times

ABSTRACT:
(Based on an average of the ratio between largest benchmarks)

  • Alga was 49.48 times faster than Haggle
  • Hash-Graph was 1.91 times faster than Haggle
  • Fgl was 1.73 times faster than Haggle

So I think my function is so poor that it kills the results...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants