Remove INLINE Pragma on indices #219
This fixes an interaction with {-# INLINE [1] isInfixOf #-} that made buildTable run once for each scan iteration.
I'd like @nomeata to give this a look, as he has also recently been looking at the fusion framework rules...
This is independent of fusion, right?
But if leaving the inlining decision to GHC yields better performance, then that’s great of course.
@Tarmean, did you run benchmarks to test this?
...I really should have done that before opening the pull request, sorry. I am running into some problems when trying to run the benchmarks, though:
8.0.2:
8.2.2:
@Tarmean Did the benchmark executable fail right away, or did it produce more output than you showed us? I can reproduce neither the segfault nor the GHC panic with a
The executable fails right away. I tried nuking everything GHC-related and reinstalling, but that didn't fix it. The issue does seem to be memory related: the error changes when a max heap size is specified with -M. How much memory are the benchmarks supposed to use?
@Tarmean OK, now that you mention it, I see it too... I have a machine with 32 GiB of RAM and didn't realise how much memory is used shortly after startup (though it seems to be garbage-collected quickly afterwards). In any case, the lowest value I was able to get it working was with
PS: #204 is related
I've frequently seen I strongly suspect that the terrible string matching performance observed in haskell/bytestring#307 (comment) is also related to this issue.
It might be good to report this on GHC's issue tracker. There might be a compiler bug hiding here.
I still think that removing the INLINE pragma should improve performance in most cases, but I'm no longer certain that nested INLINE pragmas are at fault. For some reason, the interaction between the bang pattern on buildTable and the INLINE pragma does seem to prevent buildTable from being floated out when indices is called in a loop. But buildTable also isn't floated out when indices isn't inlined, so that doesn't explain the weird performance regression on -O2. I'll check tomorrow whether this can still be reproduced with profiling builds.
For context, here is a recent reddit discussion about a program that runs faster with optimizations disabled.
This is caused by an interaction between the INLINE pragma on indices and the INLINE [1] pragma on isInfixOf. For some reason this combination keeps buildTable 0 0 (nlen-2) from being floated out. The end result is buildTable being run in each iteration of scan, resulting in some impressively slow searches.

There are other ways to solve this issue, but indices doesn't participate in fusion anyway, and this seems like it'd impact code readability the least.
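
To make the cost concrete, here is a minimal sketch of the two shapes involved, written over String with a Horspool-style skip table rather than the library's own types. All names (buildSkipTable, step, indicesShared, indicesRebuilt) are hypothetical and this is not the package's actual code. It only illustrates why it matters whether the table is bound once per search or recomputed per scan step.

```haskell
import Data.Maybe (fromMaybe)

-- Hypothetical stand-in for buildTable: each needle character except
-- the last maps to its distance from the end of the needle. The
-- rightmost occurrence is listed first so that 'lookup' finds it.
type SkipTable = [(Char, Int)]

buildSkipTable :: String -> SkipTable
buildSkipTable needle = reverse (zip (take (n - 1) needle) [n - 1, n - 2 .. 1])
  where
    n = length needle

-- One scan step: Nothing when fewer than n characters remain,
-- otherwise (did the window match?, how far to advance).
step :: SkipTable -> String -> String -> Maybe (Bool, Int)
step table needle hay
  | length window < n = Nothing
  | window == needle  = Just (True, n)
  | otherwise         = Just (False, fromMaybe n (lookup (last window) table))
  where
    n      = length needle
    window = take n hay

-- Shape A: the table is bound once, outside the loop, so every
-- iteration of 'go' shares it. This is the effect of floating out.
indicesShared :: String -> String -> [Int]
indicesShared needle haystack
  | null needle = []
  | otherwise   = go 0 haystack
  where
    table = buildSkipTable needle
    go i hay = case step table needle hay of
      Nothing         -> []
      Just (True, k)  -> i : go (i + k) (drop k hay)
      Just (False, k) -> go (i + k) (drop k hay)

-- Shape B: the table expression sits inside the loop. Unless the
-- compiler floats it out, it is rebuilt on every scan step.
indicesRebuilt :: String -> String -> [Int]
indicesRebuilt needle haystack
  | null needle = []
  | otherwise   = go 0 haystack
  where
    go i hay = case step (buildSkipTable needle) needle hay of
      Nothing         -> []
      Just (True, k)  -> i : go (i + k) (drop k hay)
      Just (False, k) -> go (i + k) (drop k hay)
```

Both functions return the same non-overlapping offsets, for example [2,6,8] for the needle "ab" in "xxabxxabab". They differ only in how often the table may be built, and whether GHC turns shape B into shape A is the kind of decision that depends on optimisation level and pragmas.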