-
Notifications
You must be signed in to change notification settings - Fork 43
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
benchmark ideas #240
Comments
I could do a PDE solver (FEM) benchmark. A funny thing about PDEs is that one solving the simplest possible (regular grid with diffusion) is a couple of lines while a general framework can be arbitrarily big. But something that is kinda realistic should be possible in a few hundred lines. |
Very true. Though I wonder if we should just link over to @johnfgibson's https://github.com/johnfgibson/julia-pde-benchmark which is really well done already. Instead of a PDE, we could also do something like the 4th order Runge-Kutta method. https://en.wikipedia.org/wiki/Runge%E2%80%93Kutta_methods#The_Runge%E2%80%93Kutta_method A quick Julia code for using it to simulate the Lorenz equation is: function f(du,u)
du[1] = 10.0*(u[2]-u[1])
du[2] = u[1]*(28.0-u[3]) - u[2]
du[3] = u[1]*u[2] - (8/3)*u[3]
end
function rk4_solve(u,dt,n)
k1 = similar(u); k2 = similar(u)
k3 = similar(u); k4 = similar(u)
tmp = similar(u)
for i in 1:n
f(k1,u)
@. tmp = u + dt*k1/2
f(k2,u)
@. tmp = u + dt*k2/2
f(k3,u)
@. tmp = u + dt*k3
f(k4,u)
@. u = u + (k1 + 2k2 + 2k3 + k4)/6
end
u
end
u = [1.0,0.0,0.0]
dt = 1/2
n = 100
u = rk4_solve(u,dt,n) Some things which are immediately highlighted in this example is:
|
To give some context for the "counting substring problem": I work in bioinformatics, where we often have long sequences of DNA, 100's of thousands to millions of bases: So the challenge is: Given a string 250.000 characters long, tally up each 4-mer and count their frequency, skipping any 4-mer that contains an invalid character. |
So I guess the microbenchmark version of the k-mer counting code would be to fix a specific length like 4-mers and then count the frequency in a long, random fake DNA sequence, and have some simpler specialized code for that. Or maybe an example DNA sequence, although for microbenchmarks we generally haven't had data files, but there's no reason we couldn't. |
We have some data files in this repo https://github.com/JuliaCI/BaseBenchmarks.jl/tree/master/src/problem/data. |
I could have sworn I'd previously opened an issue here with an idea for a benchmark but now I can't find it. There was a discussion of some algorithms that are hard/impossible to do in a vectorized way, and a couple that came up were:
Please post and discuss more ideas here!
The text was updated successfully, but these errors were encountered: