Benchmark Results #70

Open
nmichlo opened this issue Nov 2, 2021 · 2 comments

nmichlo commented Nov 2, 2021

First of all, great repository!

Would it be possible to publish benchmark results as part of the repo, rather than requiring users to run the benchmarks themselves?

mbuzdalov (Owner) commented Nov 3, 2021

I have wanted to publish such results, somehow, probably since the beginning of this repo.

However, there are a few problems that make such a thing really difficult.

  1. They must stay up-to-date with the code in the repository, so the benchmarks would have to be run on each commit in a continuous-integration fashion and then published. For technical reasons, I believe such results should not live in the repository itself, but rather be stored and displayed on a companion website, for instance.
  2. Even the smallest usable benchmark set runs for a few days. The next smallest one requires two weeks. The only way to scale this is to use a lot of (groups of identical) dedicated computers, which I cannot currently afford (multiple cores of a single computer do not work well, and non-dedicated computers do not work either). I have spent considerable time implementing incremental benchmarking (e.g. algorithms that show no changes on a small dataset are not recomputed on larger ones; a rough sketch of the idea follows this list), but what has been developed remains highly immature (e.g. who likes a statistical significance threshold of 10⁻¹⁰?).
  3. Even if the previous issues are resolved, the data is still very hard to display, as it is multidimensional (problem type, number of points, dimension, and number of fronts for those problem types where it matters) even for a single algorithm, and there are many algorithms and many flavours of them. So this is a hell of a lot of numbers, tables and plots. You can see here (some parallel speedups) or here (very few plots of a few good algorithms), for instance, how much of a mess it is.
  4. The existing benchmark sets are just a tiny fraction of what should be there. For instance, everyone and their brother wants timings on e.g. the ZDT, DTLZ, WFG, DTLZ-1 benchmarks, and so on. Currently, such things are not even there!
  5. Even assuming all of that is resolved... one cannot really be very confident that the relations between these numbers (of course not the numbers themselves) will hold on different hardware. Yes, most of these relations are quite stable. However, I am aware of cases where switching some things on and off may increase performance by 3x on one machine and decrease it by 1.2x on another (if you are, for some reason, interested in what such things are, check out the branchless-median branch).
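
A minimal sketch of the incremental idea from item 2, in Java. This is not the harness actually used in this repository; the class, interface and constants are invented for illustration, and a crude relative-difference check stands in for a proper statistical test with a strict threshold:

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/**
 * Sketch of incremental benchmarking: re-measure an algorithm on the cheapest
 * problem size first, and if the result is indistinguishable from the stored
 * baseline, skip the (much more expensive) larger sizes and keep the baseline
 * numbers. All names here are hypothetical, not part of this repository.
 */
public final class IncrementalBenchmarkSketch {
    /** Problem sizes, cheapest first. */
    private static final int[] SIZES = {1_000, 10_000, 100_000};
    private static final int REPETITIONS = 11;
    /** Crude stand-in for a statistical significance test with a strict threshold. */
    private static final double RELATIVE_CHANGE_THRESHOLD = 0.05;

    /** Produces a single timed run (in nanoseconds) for a given problem size. */
    public interface TimedRunFactory {
        LongSupplier forSize(int size);
    }

    public static Map<Integer, Double> run(String algorithmName,
                                           Map<Integer, Double> baselineMedians,
                                           TimedRunFactory benchmark) {
        // Start from the baseline; entries are overwritten only for sizes we re-measure.
        Map<Integer, Double> result = new LinkedHashMap<>(baselineMedians);
        for (int size : SIZES) {
            double median = medianNanos(benchmark.forSize(size));
            result.put(size, median);
            Double baseline = baselineMedians.get(size);
            if (baseline != null
                    && Math.abs(median - baseline) / baseline < RELATIVE_CHANGE_THRESHOLD) {
                // No detectable change at this size: do not recompute the larger sizes.
                System.out.printf("%s: no change at n=%d, skipping larger sizes%n",
                        algorithmName, size);
                break;
            }
        }
        return result;
    }

    private static double medianNanos(LongSupplier singleRun) {
        long[] samples = new long[REPETITIONS];
        for (int i = 0; i < REPETITIONS; i++) {
            samples[i] = singleRun.getAsLong();
        }
        Arrays.sort(samples);
        return samples[REPETITIONS / 2];
    }
}
```

The early break is the whole point: the expensive sizes inherit the stored baseline numbers whenever the cheap sizes show no change, which is the part that would save most of the multi-day running time.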

So you may see that it is really difficult. Maybe when I finally get back here, for instance to prepare a journal paper on the multithreaded version of the Jensen-Fortin-myself algorithm, I will also try to sort out some of these issues.

mbuzdalov self-assigned this Nov 3, 2021

nmichlo (Author) commented Nov 3, 2021

That makes sense.

  • Maybe they don't have to be full-blown benchmarks, but partial ones, just to get an idea of the behaviour and runtimes of the different algorithms?
  • You could also version the results and give the date of the run if you do decide to post them without continuous integration.

EDIT: I did not see that some of the results have already been posted under the Releases section.
