Add numba and tweak Python benchmarks #2

Merged
merged 3 commits into JuliaActuary:master on Aug 15, 2021

Conversation

@DimitarVanguelov (Contributor)
Hi, I added numba to the benchmarks, as discussed. I also tweaked the functions so that they return the correct results (same as the Julia version and the original problem).

I'll note that timeit is really clunky to use, and I do not have the time right now to learn it properly. For instance, I'm not sure adding 'µs' to the print statements is appropriate -- feel free to take that out. More importantly though, I did notice that I get somewhat different results when benchmarking in a script vs benchmarking with %timeit in IPython:

```
# timeit.timeit in script
Numpy Vectorized: 18.308 μs
Python Accumulator: 9.287 μs
Numba Accumulator: 0.451 μs
```

```
# %timeit magic in ipython
25 µs ± 145 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)  # Numpy Vectorized
11.8 µs ± 2.68 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)  # Python Accumulator
336 ns ± 4.28 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)  # Numba Accumulator
```

The differences are much more pronounced if I use larger array sizes.
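
Part of the discrepancy may just be reporting: `timeit.timeit` returns the total seconds for all `number` executions, so a per-call µs figure has to be computed explicitly, while `%timeit` already reports per-loop statistics. A minimal sketch of the script-side approach (the accumulator below is a toy stand-in, not the repo's actual function):

```
import timeit

def python_accumulator(q, w):
    # toy stand-in: accumulate survivorship over decrement rates q and w
    total, inforce = 0.0, 1.0
    for qi, wi in zip(q, w):
        total += inforce
        inforce *= (1 - qi) * (1 - wi)
    return total

q = [0.001 * (i + 1) for i in range(10)]
w = [0.05] * 10

n = 100_000
total_seconds = timeit.timeit(lambda: python_accumulator(q, w), number=n)
print(f"Python Accumulator: {total_seconds / n * 1e6:.3f} µs per call")
```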

Speaking of which, I think benchmarking with such small arrays may be misleading. I may be wrong about the mechanics, but calling into numpy incurs some fixed per-call overhead, so a tiny array of length 10, which is obviously not representative of real-world use cases, understates numpy's performance, especially in comparison to looping in pure Python. For instance, here are some benchmarks with randomly generated arrays of length 1200 (12 months × 100 years):

```
# %timeit magic in ipython
66.8 µs ± 6.26 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)  # Numpy Vectorized
1.09 ms ± 4.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)  # Python Accumulator
1.42 µs ± 1.68 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)  # Numba Accumulator
```
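
For readers without the repo open, a "vectorized" variant in this spirit replaces the Python-level loop with whole-array numpy operations; this is only a hypothetical sketch of that style, not the benchmark's exact code:

```
import numpy as np

def numpy_vectorized(q, w):
    # survivorship at the start of each period: a shifted cumulative product
    survival = (1 - q) * (1 - w)
    inforce = np.concatenate(([1.0], np.cumprod(survival)[:-1]))
    return inforce.sum()

rng = np.random.default_rng(0)
q = rng.uniform(0.0, 0.01, size=1200)  # 1200 random decrement rates
w = np.full(1200, 0.05)
print(numpy_vectorized(q, w))
```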

Thus, you should give some thought to expanding the size of the `q` and `w` arrays. Not only is numpy then clearly faster than plain Python, but in certain cases numba is even a bit faster than plain Julia. I recently ran some benchmarks on a similar (discounting) function, and numba was indeed slightly faster. Of course, numba is quite finicky and, in my experience, only works on small/clean enough problems. And there is of course also this.
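
To make the numba comparison concrete: a numba version is typically just the pure Python loop with an `@njit` decorator, compiled on first call. A hypothetical sketch under those assumptions (again, not the PR's exact function):

```
import numpy as np
from numba import njit

@njit
def numba_accumulator(q, w):
    # same element-wise loop as pure Python; numba compiles it to machine code
    total, inforce = 0.0, 1.0
    for i in range(q.shape[0]):
        total += inforce
        inforce *= (1.0 - q[i]) * (1.0 - w[i])
    return total

q = np.random.default_rng(0).uniform(0.0, 0.01, size=1200)
w = np.full(1200, 0.05)

numba_accumulator(q, w)  # first call pays the JIT compilation cost
# benchmark subsequent calls only, e.g. %timeit numba_accumulator(q, w)
```

One timing caveat: `%timeit`'s warm-up runs hide the compilation cost, whereas a one-shot script timing would include it unless the function is called once beforehand.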

Nevertheless, if we're benchmarking performance, we should be as thorough and fair as possible. I hope this is helpful. Feel free to reach out here or on Slack if you'd like to discuss.

@alecloudenback (Member) commented Aug 15, 2021

Thanks, great to have this addition.

When using timeit in a script, I think it reports total seconds rather than per-loop µs, so I removed the μs label.

Here are the results I'm getting, which are similar to what you had but proportionally a bit faster.

I changed the setup primarily to have `q` and `w` that are not numpy arrays, because using `numpy` for the base Python accumulator was 4x slower:

Prior code:
```
13.979937791 # numpy vectorized
2.370869667000001 # python base
```

Your commit (dropping the microseconds):
```
Numpy Vectorized: 14.186
Python Accumulator: 9.876
Numba Accumulator: 0.649
```
The timing of the second one seemed off; going back to plain Python lists for `q` and `w` gives:
```
Numpy Vectorized: 14.261
Base Python Accumulator: 2.314
Numba Accumulator: 0.626
```
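
The 4x gap is consistent with how numpy iteration works: looping over an `ndarray` at the Python level boxes each element into a numpy scalar object, which costs more than iterating a plain list. A small hypothetical reproduction (not the repo's benchmark code):

```
import timeit
import numpy as np

def accumulate(q):
    # identical element-wise loop; only the container type differs
    total, inforce = 0.0, 1.0
    for qi in q:
        total += inforce
        inforce *= 1 - qi
    return total

q_list = [0.001 * (i + 1) for i in range(10)]
q_array = np.array(q_list)

n = 100_000
print("list :", timeit.timeit(lambda: accumulate(q_list), number=n))
print("array:", timeit.timeit(lambda: accumulate(q_array), number=n))
```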
@alecloudenback (Member)

> you should give some thought to expanding the size of the q and w arrays.

That's a fair point - the samples all stemmed from the original submission, which used those values. I opened issue #3 to discuss. Going to merge this to get the Numba benchmarks in.

@alecloudenback merged commit 7334805 into JuliaActuary:master on Aug 15, 2021
@alecloudenback (Member)

Also updated the site; it should be live in a couple of minutes: JuliaActuary/JuliaActuary.org@0ad9163
