Add benchmarks #967
OK, I think a better option is Air Speed Velocity.
Initially, I would think that we would get a lot of value out of measuring
One tricky thing would be to decide how to store the results (e.g. the metrics collected from past runs). We would want a way to ensure a consistent machine (using GitHub Actions, and/or Docker?). Keep the results in this repo, or keep them in a separate repo, possibly using submodules as they mention?
I love this idea @NickCrews. if you want to try to get something rough and ready set up, i think that would be a really good step forward. if it's looking very valuable, then we will figure out the tricky bits.
Cool, it might be a bit, but I will try to get to this. I'll start with the benchmarks in this repo, as those other packages do. Will try to start out without the stored state of past metrics. Will decide at game time which example data to use; I haven't explored enough yet to see which I like better.
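For reference, a minimal sketch of what a benchmark file kept in this repo could look like under ASV. asv discovers methods by prefix (`time_*`, `peakmem_*`, `track_*`) and calls `setup()` before each one. Everything else here is an assumption for illustration: `make_records`, `block_by_prefix`, and `BlockingSuite` are hypothetical stand-ins, not dedupe code or one of the repo's example datasets.

```python
import random
import string


def make_records(n, seed=0):
    """Build n fake name strings (hypothetical stand-in data)."""
    rng = random.Random(seed)
    return ["".join(rng.choices(string.ascii_lowercase, k=8)) for _ in range(n)]


def block_by_prefix(records, width=2):
    """Toy workload: group records by their first `width` characters."""
    blocks = {}
    for record in records:
        blocks.setdefault(record[:width], []).append(record)
    return blocks


class BlockingSuite:
    """Hypothetical suite; asv calls setup() before each benchmark method."""

    def setup(self):
        self.records = make_records(1000)

    def time_block_by_prefix(self):
        # time_* methods are timed by asv.
        block_by_prefix(self.records)

    def peakmem_block_by_prefix(self):
        # peakmem_* methods report the peak memory of the benchmark process.
        block_by_prefix(self.records)

    def track_block_count(self):
        # track_* methods record whatever number they return,
        # so an accuracy metric could be tracked the same way.
        return len(block_by_prefix(self.records))
```

A `track_*` method is probably where an accuracy number would go, since it records a value across commits without needing any extra tooling.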
@fgregg @fjsj I'm not well versed in entity resolution, any suggestions on what metrics I should use for "accuracy"? Based on the abstract of https://arxiv.org/abs/1509.04238 (didn't read it yet as I thought maybe you'd have pointers) it sounds like the standard measures like F score and precision might need to get tweaked a little bit. I can also do my own research but if you can tell me where to start it would help. Thanks!
i think precision and recall are really still the best ones.
look at canonical.py in tests to see how precision and recall are calculated there.
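As a rough illustration of the idea (this is not the code in canonical.py, just a sketch under the assumption that results and ground truth are both given as clusters of record ids), precision and recall for entity resolution are commonly computed over pairs of records rather than over single records. The names `pairs` and `pairwise_precision_recall` are made up for this example.

```python
from itertools import combinations


def pairs(clusters):
    """Return the set of unordered record pairs that share a cluster."""
    found = set()
    for cluster in clusters:
        for a, b in combinations(sorted(cluster), 2):
            found.add((a, b))
    return found


def pairwise_precision_recall(predicted, truth):
    """Compare predicted clusters against true clusters, pair by pair."""
    predicted_pairs = pairs(predicted)
    true_pairs = pairs(truth)
    true_positives = len(predicted_pairs & true_pairs)
    # Guard against empty sets so the sketch never divides by zero.
    precision = true_positives / len(predicted_pairs) if predicted_pairs else 0.0
    recall = true_positives / len(true_pairs) if true_pairs else 0.0
    return precision, recall


truth = [{1, 2, 3}, {4, 5}]
predicted = [{1, 2}, {3, 4, 5}]
precision, recall = pairwise_precision_recall(predicted, truth)
print(precision, recall)  # 0.5 0.5
```

Here two of the four predicted pairs are real duplicates, and two of the four true duplicate pairs were found, so both numbers come out to 0.5.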
Branched off of #965 (comment).
EDIT: See next comment for using ASV instead of @profile
Place @profile decorators on bottleneck functions using memory_profiler.
List this dependency as an extra, so that most users don't need to install it.
Also, to prevent the overhead of @profile on every run (even when we don't want profiling), wrap it in our own custom decorator that is usually a no-op.
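A minimal sketch of such a wrapper, assuming profiling is switched on with an environment variable. The variable name `DEDUPE_PROFILE` and the decorator name `maybe_profile` are hypothetical; only `memory_profiler.profile` is the real decorator.

```python
import os

# Hypothetical switch: the issue does not say how profiling gets enabled.
PROFILING_ENABLED = os.environ.get("DEDUPE_PROFILE") == "1"


def maybe_profile(func):
    """Apply memory_profiler's @profile only when profiling is enabled."""
    if not PROFILING_ENABLED:
        # Usual case: hand the function back untouched, so there is no overhead.
        return func
    try:
        # Imported lazily so memory_profiler can stay an optional extra.
        from memory_profiler import profile
    except ImportError:
        return func
    return profile(func)


@maybe_profile
def bottleneck(n):
    """Stand-in for a real bottleneck function."""
    return sum(i * i for i in range(n))


print(bottleneck(4))  # 14
```

Because the check happens once at decoration time, the decorated function is the original object when profiling is off, and users without memory_profiler installed are unaffected.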
The next step is probably to make a new branch and apply it to some of the examples?