Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The microest optimization wrt pointer tagging #84

Open
alt-romes opened this issue May 26, 2023 · 4 comments
Open

The microest optimization wrt pointer tagging #84

alt-romes opened this issue May 26, 2023 · 4 comments

Comments

@alt-romes
Copy link

alt-romes commented May 26, 2023

I had a thought recently about something that affects runtime performance, but possibly in such a way so minimal that it might not even be worth adding to the book.

Nonetheless...

GHC tags pointers to evaluated heap-allocated data types with their tag. So, for example, pointers to heap-allocated Maybes will be tagged with +1 if the allocated-value is Nothing and with +2 if it is Just ..., (and with +0 if it is a thunk, in which case we evaluate the thunk which will return a tagged pointer (with either +1 or +2)).

{-# LANGUAGE MagicHash #-}
import Prelude(print)
import GHC.Exts

data Maybe a = Nothing
             |  Just a

{-# NOINLINE f #-}
f x = case x of
        Nothing -> 1#
        Just _    -> 2#

main = print (I# (f (Just 4)))

Then, to pattern match on x in the body of f, we don't need to dereference the pointer to the heap and check the constructor info, for we can simply (when x is evaluated) look at its tag.
The above example compiles to the following cmm code with ghc -dno-typeable-binds -fforce-recomp -ddump-opt-cmm -ddump-to-file X.hs:

f_rib_entry() { //  [R2]
      ...
      cOo: // global
          I64[Sp - 8] = block_cOf_info;
          R1 = _sO4::P64;
          Sp = Sp - 8;
          // Check if tag is 0
          if (R1 & 7 != 0) goto cOf; else goto cOg;
      cOg: // tag==0 -> evaluate thunk
          call (I64[R1])(R1) returns to cOf, args: 8, res: 8, upd: 8;
      cOf: // tag/=0 -> pattern match through constructor tag
          _sO5::P64 = R1;
          _cOl::P64 = _sO5::P64 & 7;
          // Check if tag is 1 or 2
          if (_cOl::P64 != 1) goto cOk; else goto cOj;
      cOk: // Tag is 2, con is Just, return unboxed 2
          R1 = 2;
          Sp = Sp + 8;
          call (P64[Sp])(R1) args: 8, res: 0, upd: 8;
      cOj: // Tag is 1, con is Nothing, return unboxed 1
          R1 = 1;
          Sp = Sp + 8;
          call (P64[Sp])(R1) args: 8, res: 0, upd: 8;
    }
}

The question is what happens when we have more than 7 constructors (the tag can only be in the last 3 bits of the pointer, and 0 means unevaluated data type).
In the case of a datatype with more than 7 constructors, a tag of 7 means that the constructor index is 7 or higher, and therefore we have to dereference the pointer and look at the constructor info. The other tags from 0-6 retain their meaning.

Therefore, the most common constructors of a datatype with >7 constructors are better off being in the first 6 constructors, and the least common should come later. So if you have a datatype of which 80% of the times is constructed with some Con1, it is more performant to have it be one of the first 6.

This could possibly matter in really, REALLY, tight loops 😝, though I would love to see such a case. Or a gigantic code base in which all the useful constructors are defined last.

@doyougnu doyougnu added the idea label May 30, 2023
@doyougnu
Copy link
Collaborator

I actually experimented with this when writing the weigh chapter but I couldn't find any significant difference with an extremely small program. But I do think this effect is observable, we just need to find the right program in the wild, perhaps in pandoc or xmonad.

@hsyl20
Copy link
Collaborator

hsyl20 commented May 30, 2023

You could experiment with GHC Core's Expr type: what happens if you move the Var constructor to the last position? :)

@alt-romes
Copy link
Author

I'd love to see those benchmarks heh

@doyougnu
Copy link
Collaborator

sounds like a good case study

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants