Drop some unnecessary allocations #230

Status: Open, 3 commits to merge into base branch master

bobsayshilol

These caught my eye so I removed the obvious ones, then measured the performance to check that the changes didn't make it worse. I haven't done a thorough check since the scripts in `test_data` point to files I can't find, so I used the timing tests in `runTests` to compare performance before and after these changes. The results show a decent improvement in some cases.

Tests were done by running `runTests 4000 | grep faster` 3 times and recording the results. Compilers used were gcc 14.2.1 and clang 19.1.7. I haven't checked with MSVC but I assume performance won't have worsened.

| Compiler/method | Before (run 1) | Before (run 2) | Before (run 3) | After (run 1) | After (run 2) | After (run 3) |
| --- | --- | --- | --- | --- | --- | --- |
| gcc/HWA | 5.51 | 5.51 | 5.53 | 5.91 | 5.89 | 5.90 |
| gcc/HW | 6.91 | 6.91 | 6.92 | 7.47 | 7.46 | 7.46 |
| gcc/NWA | 3.49 | 3.51 | 3.50 | 3.51 | 3.49 | 3.52 |
| gcc/NW | 11.53 | 11.54 | 11.53 | 12.40 | 12.38 | 12.36 |
| gcc/SHWA | 32.32 | 32.31 | 32.29 | 42.44 | 42.48 | 42.36 |
| gcc/SHW | 49.49 | 49.58 | 49.55 | 77.01 | 76.91 | 76.98 |
| clang/HWA | 5.70 | 5.71 | 5.72 | 5.91 | 5.93 | 5.94 |
| clang/HW | 7.15 | 7.16 | 7.15 | 7.52 | 7.48 | 7.47 |
| clang/NWA | 3.77 | 3.77 | 3.75 | 3.76 | 3.77 | 3.72 |
| clang/NW | 11.72 | 11.70 | 11.79 | 12.37 | 12.42 | 12.31 |
| clang/SHWA | 37.55 | 37.47 | 37.67 | 41.80 | 41.47 | 41.25 |
| clang/SHW | 60.33 | 60.45 | 60.57 | 69.76 | 70.14 | 69.68 |

See individual commits for more details.

Putting containers on the heap doesn't do anything since they allocate
internally and don't grow on the stack, so it's just additional work
having to call into the allocator.
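
To illustrate the point about heap-allocated containers, here is a minimal sketch. The function names are hypothetical and not taken from this project. It only shows the general pattern: a `std::vector` already keeps its elements in a heap buffer, so allocating the vector object itself on the heap adds a second allocation without changing where the elements live.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical "before" shape: the vector object itself lives on the heap.
// This costs two allocations: one for the vector object, one for its buffer.
int sumViaHeapContainer(int n) {
    auto values = std::make_unique<std::vector<int>>();
    values->reserve(static_cast<std::size_t>(n));
    for (int i = 0; i < n; ++i) {
        values->push_back(i);
    }
    int total = 0;
    for (int v : *values) {
        total += v;
    }
    return total;
}

// Hypothetical "after" shape: the vector is a local variable.
// Only the element buffer is allocated; the small vector object
// (pointer plus size bookkeeping) sits on the stack.
int sumViaLocalContainer(int n) {
    std::vector<int> values;
    values.reserve(static_cast<std::size_t>(n));
    for (int i = 0; i < n; ++i) {
        values.push_back(i);
    }
    int total = 0;
    for (int v : values) {
        total += v;
    }
    return total;
}
```

Both functions compute the same result; the second simply skips one trip into the allocator.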
There are only 64 ints, which is 256 bytes on most platforms.

This improves performance by a small but measurable amount in the
tests.
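
As a rough illustration of keeping a small fixed-size table on the stack, the sketch below uses a `std::array<int, 64>` local instead of a `new int[64]` allocation. The function and constant names are hypothetical and the bucket-counting task is invented for the example; only the table size of 64 ints comes from the commit message.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical table size, matching the 64 ints mentioned above.
constexpr std::size_t kTableSize = 64;

// Counts how many of the 64 buckets are hit by the bytes in `data`.
// The table is a stack local, so no allocator call is made.
int countUsedBuckets(const unsigned char* data, std::size_t length) {
    std::array<int, kTableSize> counts{};  // zero-initialised
    for (std::size_t i = 0; i < length; ++i) {
        counts[data[i] % kTableSize] += 1;
    }
    int used = 0;
    for (int c : counts) {
        if (c != 0) {
            ++used;
        }
    }
    return used;
}
```

At 64 elements the table is small enough that stack space is not a concern on typical platforms, which is the trade-off the commit relies on.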
`alphabet += ...` isn't a trivial operation since it has to check if it
needs to allocate more memory each time it's called. Since the alphabet
can't include any duplicates, its size is bounded, so we can instead
create a fixed-size buffer and build it on the stack, then perform a
single allocation at the end.

Also move the storing of the transformed pointers to the end so that
the compiler can infer that they don't alias with anything.

This gives another small boost in performance.
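
The alphabet change described above can be sketched roughly as follows. This is not the code from the commit: `buildAlphabet` is a hypothetical helper, and it assumes the alphabet is the set of distinct byte values in the input, so at most 256 entries. Characters are collected into a stack buffer and the `std::string` is constructed once at the end.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the distinct bytes of `text` in first-seen order.
std::string buildAlphabet(const char* text, std::size_t length) {
    bool seen[256] = {false};  // one flag per possible byte value
    char buffer[256];          // no duplicates, so 256 entries always suffice
    std::size_t count = 0;

    for (std::size_t i = 0; i < length; ++i) {
        const unsigned char c = static_cast<unsigned char>(text[i]);
        if (!seen[c]) {
            seen[c] = true;
            buffer[count++] = text[i];  // plain store, no capacity check
        }
    }

    // At most one allocation happens here, instead of a possible
    // reallocation check on every `alphabet += ...` call.
    return std::string(buffer, count);
}
```

For example, `buildAlphabet("banana", 6)` yields `"ban"`.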