You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I had a thought recently about something that affects runtime performance, but possibly in such a way so minimal that it might not even be worth adding to the book.
Nonetheless...
GHC tags pointers to evaluated heap-allocated data types with their tag. So, for example, pointers to heap-allocated Maybes will be tagged with +1 if the allocated-value is Nothing and with +2 if it is Just ..., (and with +0 if it is a thunk, in which case we evaluate the thunk which will return a tagged pointer (with either +1 or +2)).
{-# LANGUAGE MagicHash #-}
importPrelude(print)
importGHC.ExtsdataMaybea=Nothing
| Justa
{-# NOINLINE f #-}
f x =case x ofNothing->1#Just _ ->2#
main =print (I# (f (Just4)))
Then, to pattern match on x in the body of f, we don't need to dereference the pointer to the heap and check the constructor info, for we can simply (when x is evaluated) look at its tag.
The above example compiles to the following cmm code with ghc -dno-typeable-binds -fforce-recomp -ddump-opt-cmm -ddump-to-file X.hs:
f_rib_entry() { // [R2]
...
cOo: // globalI64[Sp-8] =block_cOf_info;
R1=_sO4::P64;
Sp=Sp-8;
// Check if tag is 0if (R1&7!=0) goto cOf; else goto cOg;
cOg: // tag==0 -> evaluate thunkcall (I64[R1])(R1) returnstocOf, args: 8, res: 8, upd: 8;
cOf: // tag/=0 -> pattern match through constructor tag_sO5::P64=R1;
_cOl::P64=_sO5::P64&7;
// Check if tag is 1 or 2if (_cOl::P64!=1) goto cOk; else goto cOj;
cOk: // Tag is 2, con is Just, return unboxed 2R1=2;
Sp=Sp+8;
call (P64[Sp])(R1) args: 8, res: 0, upd: 8;
cOj: // Tag is 1, con is Nothing, return unboxed 1R1=1;
Sp=Sp+8;
call (P64[Sp])(R1) args: 8, res: 0, upd: 8;
}
}
The question is what happens when we have more than 7 constructors (the tag can only be in the last 3 bits of the pointer, and 0 means unevaluated data type).
In the case of a datatype with more than 7 constructors, a tag of 7 means that the constructor index is 7 or higher, and therefore we have to dereference the pointer and look at the constructor info. The other tags from 0-6 retain their meaning.
Therefore, the most common constructors of a datatype with >7 constructors are better off being in the first 6 constructors, and the least common should come later. So if you have a datatype of which 80% of the times is constructed with some Con1, it is more performant to have it be one of the first 6.
This could possibly matter in really, REALLY, tight loops 😝, though I would love to see such a case. Or a gigantic code base in which all the useful constructors are defined last.
The text was updated successfully, but these errors were encountered:
I actually experimented with this when writing the weigh chapter but I couldn't find any significant difference with an extremely small program. But I do think this effect is observable, we just need to find the right program in the wild, perhaps in pandoc or xmonad.
I had a thought recently about something that affects runtime performance, but possibly in such a way so minimal that it might not even be worth adding to the book.
Nonetheless...
GHC tags pointers to evaluated heap-allocated data types with their tag. So, for example, pointers to heap-allocated Maybes will be tagged with
+1
if the allocated-value isNothing
and with+2
if it isJust ...
, (and with+0
if it is a thunk, in which case we evaluate the thunk which will return a tagged pointer (with either +1 or +2)).Then, to pattern match on
x
in the body off
, we don't need to dereference the pointer to the heap and check the constructor info, for we can simply (when x is evaluated) look at its tag.The above example compiles to the following cmm code with
ghc -dno-typeable-binds -fforce-recomp -ddump-opt-cmm -ddump-to-file X.hs
:The question is what happens when we have more than 7 constructors (the tag can only be in the last 3 bits of the pointer, and 0 means unevaluated data type).
In the case of a datatype with more than 7 constructors, a tag of 7 means that the constructor index is 7 or higher, and therefore we have to dereference the pointer and look at the constructor info. The other tags from 0-6 retain their meaning.
Therefore, the most common constructors of a datatype with >7 constructors are better off being in the first 6 constructors, and the least common should come later. So if you have a datatype of which 80% of the times is constructed with some
Con1
, it is more performant to have it be one of the first 6.This could possibly matter in really, REALLY, tight loops 😝, though I would love to see such a case. Or a gigantic code base in which all the useful constructors are defined last.
The text was updated successfully, but these errors were encountered: