Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Line graph of benchmark results #33

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

TrevorHansen
Copy link
Contributor

@TrevorHansen TrevorHansen commented Jan 2, 2024

This produces a graph of the benchmark results, e.g:

dag_cost_improvement

For each extractor it shows the improvement v.s. time to a reference extractor. In this I ran each extractor with a 2000 second timeout, and after that returned the result from faster-greedy-day. Given faster-greedy-dag is also the reference, the timeouts just show time being used, and no improvement.

I've got some ideas about how to make it more informative e.g:

  • Including details of timeouts,
  • Changing it to log scale.

I'm keen to hear what others would find helpful?

@mwillsey
Copy link
Member

I love the idea of better graphics, but I'm afraid I don't really understand what's going on with this plot atm.

@Bastacyclop
Copy link
Contributor

I can image plotting, for every dataset and for their cumulation, a figure with the extractor on the x-axis, and time or cost quantiles on the y-axis.

@Bastacyclop
Copy link
Contributor

I also like the general idea of Trevor's graph for incremental extractors: because you can decide how much time to spend on extraction, there is trade-off between time allowed and the actual cost reduction achieved. Maybe incremental extractors could have a different "tag" or trait to produce these additional plots, where cost can be reported over time.

@TrevorHansen
Copy link
Contributor Author

Thanks @Bastacyclop & @mwillsey for the input.

I've not compared optimisation solvers before, so I don't know how it's commonly done. From an updated comment in the plot.py file, what this graphs shows is:

This assumes an extractor is run on the all the egraph benchmarks at the same time.
So given 500 egraphs, each will receive 1/500th of a second of that first second's
CPU time. Say 10 egraphs finish processing with their extractor with less than
1/500th of a second's CPU time, i.e. they have a runtime of less than 2ms. Then
for the 2nd second of CPU time, each egraph will get 1/490th of a second of CPU time.

Continuing the example, if those 10 egraphs which were processed in the first 1/500th
of a second, improved on the cost versus the reference implementation by 20, then the
graph will plot an improvemement of 20 at 1 second.

At 2 seconds, the improvement will be the sum of the improvements of all the extractors
which finished in less than 1/500th + 1/490th of a second, that is that finished with
a total runtime of less than 4.04ms.

This will continue until the timeout on the extractor is reached.

I don't have any claim that this is the best way to compare extractors graphically. Just some way that makes sense to me.

@mwillsey
Copy link
Member

I think that might be a little too complicated. Perhaps another idea is to just scatter plot time vs cost. Each point is a single extraction of a single egraph, x = time, y = cost, color/shape = extractor. I think your idea of normalizing the cost is a good one and might be needed in this scheme as well. But yes, this seems like a tricky visualization problem.

@TrevorHansen
Copy link
Contributor Author

TrevorHansen commented Jan 27, 2024

I've made this a bit more confusing, but more useful. It now plots cumulative percentage improvement. So 1% means that the arithmetic-average improvement in the DAG-cost across all the benchmarks is 1%.

I tried log-scale for time, but didn't find it as helpful (see below).

Next, I'll do a separate plot which shows the number of timeouts. I think we need a dashboard with a few graphs to show what's happening.

I tried the scatter plot, but found it too crowded to be helpful.

dag_cost_improvement

dag_cost_improvement

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants