optimize numerator() #14
Comments
Thanks for looking into this. I agree your version is easier to read, but I could not reproduce the performance gains in my environment. I modified the
This is obviously using the benchmark image, not your images. I also don't see how swapping the loops would make any difference, as it would depend on width being smaller than height, which we can't control and which is not always the case. What Go version and architecture are you using? My tests were done on a MacBook M2, Go 1.22.2.
hi deluan, thanks for your quick reply. i could not answer earlier as there were issues with the github login. first - as requested - your benchmarks on my machine:
i have three patches (lookup.zip), which build on each other, so they have to be applied in order:
benchmark:
compared to current version on github:
so my estimate of a 20% speed improvement was too optimistic.
the 8-fold speed improvement of LookupAll() reflects the fact that my machine has 8 cores, so it scales linearly. even the ocr routines benefit from this improvement. what bothers me is that lookupCol() has 9 (!) input parameters, which is awful. that's all for now. one last thing: when running go test on the lookup package there seems to be an error, even though the test suite is flagged as PASS:
regards, heiko
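To illustrate the per-column parallelism described above, here is a minimal sketch, not the code from the patches. It assumes the search can be split by column and that each column can be scanned independently. The names `match`, `scanColumn`, `scanAllParallel` and the `hit` callback are hypothetical and do not come from the lookup package.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// match is a hypothetical result type, not a type from the lookup package.
type match struct{ x, y int }

// scanColumn is a hypothetical stand-in for the per-column work: it reports
// every y in column x for which hit returns true.
func scanColumn(x, height int, hit func(x, y int) bool) []match {
	var out []match
	for y := 0; y < height; y++ {
		if hit(x, y) {
			out = append(out, match{x, y})
		}
	}
	return out
}

// scanAllParallel hands columns to one worker goroutine per CPU core.
// hit must be safe to call from several goroutines at once.
func scanAllParallel(width, height int, hit func(x, y int) bool) []match {
	perColumn := make([][]match, width) // each worker writes only its own slots
	cols := make(chan int)
	var wg sync.WaitGroup
	for i := 0; i < runtime.NumCPU(); i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for x := range cols {
				perColumn[x] = scanColumn(x, height, hit)
			}
		}()
	}
	for x := 0; x < width; x++ {
		cols <- x
	}
	close(cols)
	wg.Wait()

	// Merge in column order so the result does not depend on scheduling.
	var all []match
	for _, ms := range perColumn {
		all = append(all, ms...)
	}
	return all
}

func main() {
	diagonal := func(x, y int) bool { return x == y }
	fmt.Println(scanAllParallel(4, 4, diagonal))
}
```

Collecting results per column and merging afterwards keeps the workers free of locks. It also bundles the per-column inputs behind one closure, which is one possible way to shorten a long parameter list.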
Thanks and sorry for the late reply. I'll take a look at the patches.
You could use GitHub's own interface to create a PR. Just browse to the file you want to change and edit it: If you use VSCode or IntelliJ, it should be very easy to create PRs from their own UI. Anyway, I'll check the patches and see if I can make lookupCol better (fewer parameters). Thanks!
hi deluan, thanks for taking the time to educate me! please ignore my patch 02, i have something better in mind. my patch makes the code slower when using a small image in a small search window (because of the goroutine/channel overhead). i will report back soon (tm) with a much better approach. regards, heiko
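One common way to sidestep the overhead mentioned above is to stay sequential when the search area is small. This is only a sketch of that idea, not the approach heiko had in mind. The function `forEachColumn` and the constant `minParallelWork` are hypothetical, and the threshold value is a placeholder that would have to be chosen by benchmarking.

```go
package main

import (
	"fmt"
	"sync"
)

// minParallelWork is a made-up threshold (in pixels); it is an assumption
// for illustration and would need tuning with real benchmarks.
const minParallelWork = 1 << 14

// forEachColumn calls visit once for every x in [0, width). Small inputs run
// on the calling goroutine; larger ones are split into contiguous chunks,
// one goroutine per chunk, with no channel involved. It reports whether the
// parallel path was taken. visit must be safe for concurrent use.
func forEachColumn(width, height, workers int, visit func(x int)) bool {
	if workers < 2 || width*height < minParallelWork {
		for x := 0; x < width; x++ {
			visit(x)
		}
		return false
	}
	chunk := (width + workers - 1) / workers // ceiling division
	var wg sync.WaitGroup
	for start := 0; start < width; start += chunk {
		end := start + chunk
		if end > width {
			end = width
		}
		wg.Add(1)
		go func(lo, hi int) {
			defer wg.Done()
			for x := lo; x < hi; x++ {
				visit(x)
			}
		}(start, end)
	}
	wg.Wait()
	return true
}

func main() {
	small := forEachColumn(8, 8, 4, func(int) {})
	large := forEachColumn(512, 512, 4, func(int) {})
	fmt.Println("parallel for 8x8:", small, "parallel for 512x512:", large)
}
```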
hello deluan,
first thank you for this very useful library.
i rewrote the numerator() function. usually one would send a pull request, but i know nothing about git (mercurial user).
so i just paste my version here:
two big changes:
1st: i pulled constant computations out of the loop, as they don't change.
2nd: i swapped the loop order - y outer loop, x inner loop.
this way the loop is a lot more cache friendly
the new code is not only faster but a lot simpler and easier to understand.
on my machine i got a speed improvement of 20%. hope you find this useful!
regards,
dederon
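The two changes described in this post can be sketched as follows. This is an illustration under stated assumptions, not the code that was pasted in the issue: the `gray` type, both function names and their parameters are hypothetical, and pixels are assumed to be stored row by row. Whether the row-first order is measurably faster depends on the actual memory layout and image sizes, so it should be confirmed with a benchmark.

```go
package main

import "fmt"

// gray is a hypothetical row-major grayscale image, not the library's type.
// Pixel (x, y) is stored at pix[y*w+x].
type gray struct {
	w, h int
	pix  []float64
}

// numeratorColMajor walks x in the outer loop and y in the inner loop, so
// consecutive reads are a full row apart, and the index arithmetic is
// recomputed for every pixel.
func numeratorColMajor(img, tpl gray, ox, oy int, imgMean, tplMean float64) float64 {
	sum := 0.0
	for x := 0; x < tpl.w; x++ {
		for y := 0; y < tpl.h; y++ {
			sum += (img.pix[(oy+y)*img.w+ox+x] - imgMean) * (tpl.pix[y*tpl.w+x] - tplMean)
		}
	}
	return sum
}

// numeratorRowMajor walks y in the outer loop and x in the inner loop. The
// row offsets are computed once per row, and the inner loop reads adjacent
// memory.
func numeratorRowMajor(img, tpl gray, ox, oy int, imgMean, tplMean float64) float64 {
	sum := 0.0
	for y := 0; y < tpl.h; y++ {
		start := (oy+y)*img.w + ox
		imgRow := img.pix[start : start+tpl.w]
		tplRow := tpl.pix[y*tpl.w : (y+1)*tpl.w]
		for x, t := range tplRow {
			sum += (imgRow[x] - imgMean) * (t - tplMean)
		}
	}
	return sum
}

func main() {
	img := gray{w: 4, h: 3, pix: []float64{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}}
	tpl := gray{w: 2, h: 2, pix: []float64{6, 7, 10, 11}}
	// Compare the template against the window of img starting at (1, 1).
	fmt.Println(numeratorColMajor(img, tpl, 1, 1, 8.5, 8.5))
	fmt.Println(numeratorRowMajor(img, tpl, 1, 1, 8.5, 8.5))
}
```

Both functions add the same terms in a different order, so with general floating-point data the results may differ in the last bits.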