Why is faster-greedy-dag worse in terms of dag size? #19
Comments
@TrevorHansen I've been reading it over and it's really neat. It reminds me of how congruence closure keeps track of parent pointers in a traditional implementation.
One possibility is that the order of traversing the egraph matters because it is greedy, and so the algorithm happens to work less well.

Edit: I am guessing this is what is happening! faster-greedy-dag considers strictly fewer programs than greedy-dag does, since it traverses bottom up. In this sense, it is slightly more greedy. This is the first evidence I've ever seen that greedy-dag leaves performance on the table.
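To make the "greedy leaves performance on the table" point concrete, here is a minimal Python sketch. It is a simplified fixpoint greedy extractor written for illustration only, not the code of greedy-dag or faster-greedy-dag, and the e-graph, class names, and costs are made up. It shows a local optimum, not the traversal-order difference between the two extractors: each of `A` and `B` picks its locally cheapest sub-DAG, so the shared class `S` is never used, although sharing it gives a smaller DAG.

```python
# Hypothetical toy e-graph: e-class -> list of (node_cost, child_classes).
EGRAPH = {
    "root": [(1, ["A", "B"])],
    "A": [(1, ["X"]), (1, ["S"])],
    "B": [(1, ["Y"]), (1, ["S"])],
    "X": [(2, [])],
    "Y": [(2, [])],
    "S": [(3, [])],
}


def dag_cost(egraph, choice, root):
    """Sum the cost of every chosen node reachable from root, counting each once."""
    seen, stack, total = set(), [root], 0
    while stack:
        cid = stack.pop()
        if cid in seen:
            continue
        seen.add(cid)
        cost, children = egraph[cid][choice[cid]]
        total += cost
        stack.extend(children)
    return total


def greedy_dag(egraph):
    """Fixpoint greedy: each class keeps the cheapest sub-DAG it has seen so far."""
    best = {}  # class -> (dag cost, {class: chosen node index})
    changed = True
    while changed:
        changed = False
        for cid, nodes in egraph.items():
            for idx, (_, children) in enumerate(nodes):
                if any(ch not in best for ch in children):
                    continue  # some child has no extraction yet
                choice = {}
                for ch in children:
                    choice.update(best[ch][1])  # later children win on conflicts
                if cid in choice:
                    continue  # picking this node would create a cycle
                choice[cid] = idx
                total = dag_cost(egraph, choice, cid)
                if cid not in best or total < best[cid][0]:
                    best[cid] = (total, choice)
                    changed = True
    return best


greedy_total, greedy_choice = greedy_dag(EGRAPH)["root"]
shared_choice = {"root": 0, "A": 1, "B": 1, "S": 0}
print(greedy_total, dag_cost(EGRAPH, shared_choice, "root"))
```

On this toy input the greedy result costs 7, while the hand-written selection that routes both `A` and `B` through `S` costs 6.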
Hi @oflatt, I don't have a good intuition for the extractors yet. My understanding is that we'll get optimal DAG extraction when we're extracting from trees with the greedy-dag algorithms. But for DAGs and cyclic graphs the greedy approaches won't necessarily be optimal. The integer-linear-programming extractor (in #16) is the only one I've seen that will produce optimal DAG extractions given enough time.

I wrote some background about how I understand it in: #16 (comment). I need to make some small examples and experiment with the extractors before I can say any of this with confidence though.

Regarding faster-greedy-dag and greedy-dag returning different results: I guessed that they sometimes get to different non-optimal fixed points, and so give different results. When I look at the output, sometimes one is better, sometimes the other is better:
Thanks for your answer! I think you are right.
I played around with some kind of Dijkstra-inspired implementation a while back: https://github.com/Bastacyclop/egg/blob/dag-extract/src/dag_extract.rs

On small examples it gets better results than the ILP extraction, which is not optimal because it greedily removes cycles (egraphs-good/egg#207, #7), and which I think also sometimes gives up and returns pretty bad approximations. Although Dijkstra is O(n²) / O(n log n), with my problem mapping the explored graph has a node for every member of the powerset of eclasses, so everything turns into a nasty exponential complexity.

I think it would be really cool to push this idea further by adding optimizations like branch and bound, using better data structures in the implementation, digging more into other graph algorithms, and maybe even borrowing some optimizations from ILP solvers.
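For a feel of why exact search blows up, here is a small Python sketch of plain exhaustive enumeration. This is not the Dijkstra-style search from the linked `dag_extract.rs`; it only tries every combination of one node per e-class and keeps the cheapest DAG. The e-graph, names, and costs are hypothetical, and the sketch assumes the e-graph is acyclic, so every selection is a valid extraction.

```python
from itertools import product
from math import prod

# Hypothetical toy e-graph: e-class -> list of (node_cost, child_classes).
EGRAPH = {
    "root": [(1, ["A", "B"])],
    "A": [(1, ["X"]), (1, ["S"])],
    "B": [(1, ["Y"]), (1, ["S"])],
    "X": [(2, [])],
    "Y": [(2, [])],
    "S": [(3, [])],
}


def dag_cost(egraph, choice, root):
    """Sum the cost of every chosen node reachable from root, counting each once."""
    seen, stack = set(), [root]
    while stack:
        cid = stack.pop()
        if cid not in seen:
            seen.add(cid)
            stack.extend(egraph[cid][choice[cid]][1])
    return sum(egraph[cid][choice[cid]][0] for cid in seen)


def optimal_dag(egraph, root):
    """Try every one-node-per-class selection (assumes an acyclic e-graph)."""
    classes = list(egraph)
    best_cost, best_choice = None, None
    for picks in product(*(range(len(egraph[c])) for c in classes)):
        choice = dict(zip(classes, picks))
        cost = dag_cost(egraph, choice, root)
        if best_cost is None or cost < best_cost:
            best_cost, best_choice = cost, choice
    return best_cost, best_choice


# The number of selections is the product of the e-class sizes.
num_selections = prod(len(nodes) for nodes in EGRAPH.values())
best_cost, best_choice = optimal_dag(EGRAPH, "root")
print(num_selections, best_cost)
```

Here there are only 4 selections, but the count multiplies with every e-class that has more than one node, which is where branch and bound or an ILP formulation would have to do the pruning.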
I think it's a good idea to explore heuristic/non-optimal extractors. I suspect that for the problems we care about, even with lots of work we won't be able to get the optimal dag-extractors fast enough to be practical. For me though, that work will come after the optimal dag-cost extraction is improved.

Going back to what I wrote before: I've realised I was confused. I now think that the bottom-up extractors give an optimal extraction for tree-cost, and that the ILP-based extractors (in #30, #16), when they're set with an infinite timeout, give an optimal extraction for dag-cost.

In the short term, I'm focused on getting #16 working properly. It's much faster than #30 already, I'd guess >10x, with a few more options still to speed it up.

@Bastacyclop are the egraphs you'd like dag-cost extractions for in #18? If so, the holdup on merging looks like it's just providing a readme.md. I can do it if you'd like?
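The tree-cost versus dag-cost distinction can be sketched on a toy example. The Python below is an illustrative bottom-up tree-cost extractor, not the repository's implementation; the e-graph, names, and costs are hypothetical. Tree-cost pays for a shared subterm once per use, dag-cost pays for it once in total, so the two objectives can prefer different selections.

```python
import math

# Hypothetical toy e-graph: e-class -> list of (node_cost, child_classes).
EGRAPH = {
    "root": [(1, ["A", "B"])],
    "A": [(1, ["X"]), (1, ["S"])],
    "B": [(1, ["Y"]), (1, ["S"])],
    "X": [(2, [])],
    "Y": [(2, [])],
    "S": [(3, [])],
}


def bottom_up_tree_cost(egraph):
    """Fixpoint: cost(class) = min over nodes of node_cost + sum of child costs."""
    cost = {cid: math.inf for cid in egraph}
    choice = {}
    changed = True
    while changed:
        changed = False
        for cid, nodes in egraph.items():
            for idx, (node_cost, children) in enumerate(nodes):
                total = node_cost + sum(cost[ch] for ch in children)
                if total < cost[cid]:
                    cost[cid], choice[cid] = total, idx
                    changed = True
    return cost, choice


def tree_cost(egraph, choice, cid):
    """Cost of the extracted term as a tree: shared subterms are paid per use."""
    node_cost, children = egraph[cid][choice[cid]]
    return node_cost + sum(tree_cost(egraph, choice, ch) for ch in children)


def dag_cost(egraph, choice, root):
    """Cost of the extracted term as a DAG: each chosen node is paid once."""
    seen, stack = set(), [root]
    while stack:
        cid = stack.pop()
        if cid not in seen:
            seen.add(cid)
            stack.extend(egraph[cid][choice[cid]][1])
    return sum(egraph[cid][choice[cid]][0] for cid in seen)


cost, tree_choice = bottom_up_tree_cost(EGRAPH)
shared_choice = {"root": 0, "A": 1, "B": 1, "S": 0}
print(cost["root"], dag_cost(EGRAPH, tree_choice, "root"))
print(tree_cost(EGRAPH, shared_choice, "root"), dag_cost(EGRAPH, shared_choice, "root"))
```

On this input the tree-optimal selection has tree-cost 7 and dag-cost 7, while the selection that shares `S` has tree-cost 9 but dag-cost 6.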
Yes, at least a sample. I added the README.
Thanks. I took a quick look. In case you're still working on this, the sample flexc egraphs now seem easy for the updated extractors:
So we get optimal extractions from the faster-ilp-cbc-timeout extractor in about 110 ms per egraph.
Does anyone know why faster-greedy-dag is less good than greedy-dag?