Add content on performance benchmarks. #461
Conversation
Please reformat the lines to <80 characters.
How to best link to this from the “Performant” block on the front page?
The content is in |
Please reformat the lines in the markdown to have less than 80 characters. |
Deployment is failing with
The results look reasonable now. Did you regenerate the figure to reflect the table?
I don't really want to dive into this, but I looked at the code and found it a bit oddly dense for the NumPy version. Is the code taken from somewhere? I am surprised, because the nbabel code seemed a bit clearer and more comparable to me at first sight.
To be clear: I like this start, and having N-body seems good! My code comments are mainly because it seemed the code could be better and needs to be more comparable.
But the document around it needs much more thought, in my opinion. As is, I think the N-body examples probably cannot stand alone (they are just too terrible for NumPy; NumPy is not good at this, although it might be decent for large enough N, in which case that may work).
Even then, we need a story around it that goes somewhat beyond "numpy is fast" (or "how fast is numpy").
Pythran/numba are here for the "fill the gap" part of the story: NumPy is fast enough for most things, and it is easy to fill the gap with C/Pythran/numba/cython where needed. It is trivial for Julia, etc. to point out that they are faster than Python/NumPy because it plays to their strengths, so let's try to play to our strengths? But even that feels hard if we give examples 50-100 times slower than C, which is really much worse than typical.
This is meant to be the first of many benchmarks. We chose this one because it was well constructed and seemed straightforward to present. It turned out to be not so great for NumPy, although at the end of the day I think there should be a way to avoid the loops and get better results.
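For reference, one way the loops might be avoided is to compute all pairwise interactions with broadcasting. This is only an illustrative sketch, not the PR's actual benchmark code; the `softening` parameter is an assumption added here to keep the self-distance term well behaved:

```python
import numpy as np

def accelerations(positions, masses, softening=1e-3):
    """Gravitational accelerations for all particles, fully vectorized.

    positions: (n, 3) array; masses: (n,) array.
    """
    # Pairwise displacement vectors r_j - r_i, shape (n, n, 3).
    diff = positions[np.newaxis, :, :] - positions[:, np.newaxis, :]
    # Softened squared distances, shape (n, n).
    dist2 = np.sum(diff**2, axis=-1) + softening**2
    inv_dist3 = dist2 ** -1.5
    np.fill_diagonal(inv_dist3, 0.0)  # no self-interaction
    # a_i = sum_j m_j * (r_j - r_i) / |r_j - r_i|^3
    return np.einsum("j,ij,ijk->ik", masses, inv_dist3, diff)
```

The trade-off is an O(n^2) temporary array, so memory use grows quickly with the number of particles, which may matter for the larger datasets in the table.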
…r than NumPy's operators at the required places.
… code, modified graph, removed unnecessary files.
Hi, presently I have two things in mind:
I am looking forward to your inputs. Thanks!
That should be fine. The runtime is a bit long (up to 180 sec), but that's still doable to reproduce because we don't run these benchmarks a lot. I'd not go any larger than
That should be fine. Sometimes it's even more readable.
You now use the code in
Hi,
Yes, we are now using the same code for both Pythran and Numba. Numba's behavior is quite interesting.
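Sharing one source between the two compilers usually means writing the kernel as plain loops. The sketch below is illustrative only (not the PR's actual kernel): the same function can typically be compiled unchanged by decorating it with Numba's `@njit`, or by adding a Pythran export comment (the signature shown in the comment is a hypothetical example) and compiling the module ahead of time:

```python
import numpy as np

# Plain nested-loop kernel. NumPy executes this slowly in pure Python,
# but the identical source can be compiled as-is, e.g. with numba.njit,
# or via a Pythran comment such as
#   # pythran export accelerations_loop(float64[:,:], float64[:])
# (hypothetical signature for illustration).
def accelerations_loop(positions, masses):
    n = positions.shape[0]
    acc = np.zeros_like(positions)
    for i in range(n):
        for j in range(n):
            if i != j:
                dx = positions[j, 0] - positions[i, 0]
                dy = positions[j, 1] - positions[i, 1]
                dz = positions[j, 2] - positions[i, 2]
                r2 = dx * dx + dy * dy + dz * dz
                w = masses[j] / (r2 * np.sqrt(r2))
                acc[i, 0] += w * dx
                acc[i, 1] += w * dy
                acc[i, 2] += w * dz
    return acc
```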
I think we should move the benchmarking code to a new repo; it doesn't quite fit here or in the main numpy repo. Proposed name:
I think something is off with the normalization. Originally, the formula
Moving the benchmark code elsewhere seems reasonable. Should I open a repo?
Hi, let time(t) =
The following is the output, in case we use
In my opinion, normalizing the results using the formula
What do you all think?
We discussed transferring my repo https://github.com/khushi-411/numpyorg-benchmarks to the NumPy org, though it needs some edits.
Is that okay? Thanks!
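For concreteness, the `time / n_particles` normalization mentioned in the review suggestion can be sketched as below; the example numbers are made up purely for illustration and are not benchmark data:

```python
def normalize(time_seconds, n_particles):
    """Per-particle runtime: divide wall time by the particle count so
    runs on different dataset sizes become comparable."""
    return time_seconds / n_particles

# Hypothetical example (not real benchmark data): a 1024-particle run
# taking 8.0 s normalizes to 8.0 / 1024 s per particle.
per_particle = normalize(8.0, 1024)
```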
content/en/benchmark.md (Outdated)

## Results

Table values represent the normalized time taken in seconds by each algorithm to run on the given datasets for $50$ number of iterations. The raw timing data can be downloaded from <a href = "benchmarks/data/table.csv">here</a>.
Format to 80 columns, specify the normalization used in the table, and remove redundancy:
Suggested change:

Table values represent the normalized time `time / n_particles` taken in
seconds by each algorithm to run on the given datasets for $50$ iterations. The
raw timing data can be downloaded from
<a href="benchmarks/data/table.csv">here</a>.
Sure, I'll make the edits. Thanks!
Ahh, I was confused: you are currently normalizing by
I think that is a more compelling story; perhaps the numerical text at the top of each column could present both the raw and the normalized values, as well as presenting both values in the table.
I created numpy/numpyorg-benchmarks. Please issue a PR to move the code there.
Sorry, I created mattip/numpyorg-benchmarks instead. It seems I do not have the needed permissions to create a repo in the numpy org, and I didn't notice that GitHub moved me to mattip instead. @rgommers could you create the repo?
done!
I'm closing this PR because I shifted all the work to the numpyorg-benchmarks repo. Thanks!
This PR adds content on performance. Follows issue: #370
Rendered Version: https://deploy-preview-461--numpy-preview.netlify.app/
cc: @mattip