Lambda Lifting Chapter #63

Open · doyougnu opened this issue Nov 29, 2022 · 5 comments

doyougnu (Collaborator):
Tie this into the SAT chapter: #62

and have a discussion on use cases for the function being optimized. That is, answer when one might want to keep arguments on the stack with a lambda lift, and when one might want to SAT to avoid excess stack allocs.
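To make the question concrete, here is a minimal sketch of the two directions (the `sumWith` names are hypothetical, not taken from the chapter): SAT moves a static argument into a closure, while lambda lifting goes the other way and passes every free variable explicitly.

```haskell
-- Plain recursion: the static argument f is re-passed on every
-- call, i.e. carried as an argument rather than captured.
sumWith :: (Int -> Int) -> [Int] -> Int
sumWith _ []       = 0
sumWith f (x : xs) = f x + sumWith f xs

-- After SAT: f is captured once in go's closure; the recursive
-- calls no longer pass it, at the cost of a closure allocation.
sumWithSAT :: (Int -> Int) -> [Int] -> Int
sumWithSAT f = go
  where
    go []       = 0
    go (x : xs) = f x + go xs
```

Lambda lifting is the inverse rewrite: turning `go` back into a top-level function that takes `f` explicitly, trading the closure for per-call argument passing.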

@doyougnu doyougnu added this to the Phase 1 milestone Nov 29, 2022
@doyougnu doyougnu self-assigned this Nov 29, 2022
doyougnu added a commit that referenced this issue Feb 2, 2023
Bodigrim commented Mar 6, 2023

Looking at https://input-output-hk.github.io/hs-opt-handbook.github.io/src/Optimizations/GHC_opt/lambda_lifting.html, I don't quite agree with the summary and heuristics suggested.

> This new program will be much faster because `f` becomes essentially non-allocating.

What is supposed to be the program here? Just a single call to f? It will take roughly the same amount of time.

> Lambda lifting trades heap for stack and is therefore effective for tight, closed, hot loops where fetching from the heap would be slow.

Both heap and stack are in memory and often in CPU cache, so there is little inherent difference in fetching time. That's not the point of lambda lifting.

The original program allocates one (expensive) closure for `g` per call of `f`. The modified program allocates a (cheap) additional `Int` argument, but per each call of `g`, not `f`. So the heuristic is: if there are few outer calls of `f a n` but many inner calls of `g n` (which happens when `n` is large), use the original program, because it allocates some closures in the outer loop but saves on allocations in the inner loop. If, however, there are many outer calls of `f a n` with relatively small `n`, so that there are few inner calls of `g n`, then lambda lifting proves fruitful.
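A hedged sketch of the two shapes being compared (`fClosure` and `fLifted` are illustrative stand-ins, not the handbook's exact code):

```haskell
-- Original shape: g is a local definition capturing a, so one
-- closure for g is allocated per call of fClosure, while the many
-- recursive calls of g pass only the counter.
fClosure :: Int -> Int -> Int
fClosure a n = g n
  where
    g 0 = a
    g k = a + g (k - 1)

-- Lambda-lifted shape: a becomes an explicit argument. No closure
-- is allocated, but a is passed on every recursive call of gLifted.
fLifted :: Int -> Int -> Int
fLifted a n = gLifted a n

gLifted :: Int -> Int -> Int
gLifted a 0 = a
gLifted a k = a + gLifted a (k - 1)
```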

The simplest motivational example for lambda lifting is

```haskell
map f = go
  where
    go []       = []
    go (x : xs) = f x : go xs
```

vs.

```haskell
map f []       = []
map f (x : xs) = f x : map f xs
```

The first form is beneficial if there are few calls of `map` with relatively long lists, and the second is better when there are many calls of `map` with short lists.
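For completeness, a runnable side-by-side version of the two definitions (renamed `mapGo` and `mapPlain` here, purely so they can coexist in one module without clashing with `Prelude.map`):

```haskell
-- SAT'd form: f is free in go and captured in one closure
-- per call of mapGo.
mapGo :: (a -> b) -> [a] -> [b]
mapGo f = go
  where
    go []       = []
    go (x : xs) = f x : go xs

-- Lifted form: f is passed explicitly on every recursive call.
mapPlain :: (a -> b) -> [a] -> [b]
mapPlain _ []       = []
mapPlain f (x : xs) = f x : mapPlain f xs
```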

FWIW, the advent of join points tilted the scales further in the direction of the static argument transformation, letting GHC do lambda lifting on its own. For instance, see the discussion at haskell/bytestring#273 (comment).

doyougnu (Collaborator) commented Mar 7, 2023

Hi Bodigrim, thank you for the review.

> The original program allocates one (expensive) closure for `g` per call of `f`. The modified program allocates a (cheap) additional `Int` argument, but per each call of `g`, not `f`. So the heuristic is: if there are few outer calls of `f a n` but many inner calls of `g n` (which happens when `n` is large), use the original program, because it allocates some closures in the outer loop but saves on allocations in the inner loop. If, however, there are many outer calls of `f a n` with relatively small `n`, so that there are few inner calls of `g n`, then lambda lifting proves fruitful.

Yes, I think this is much better guidance, and I will add this language to the section.

> The simplest motivational example for lambda lifting is ...

Also great, this example should be very straightforward for the audience. I'll adapt the chapter to use this example. My motivation with the previous example was just to have the simplest program to show the transformation. But I do agree that the context of use for the function is an important and missing part.

> Both heap and stack are in memory and often in CPU cache, so there is little inherent difference in fetching time. That's not the point of lambda lifting.

Yes, this is not the point of lambda lifting per se, but I still think it is an important point to make, because learning to read Haskell with a performance eye is a skill in and of itself, and teaching that skill is one of the goals of the book. So my motivation here is not to say "GHC does this optimization that you should be aware of, and here is how it works" but instead to say "here is an optimization that GHC does; it may or may not fire, but you could consider performing it manually in special cases".

> FWIW, the advent of join points tilted the scales further in the direction of the static argument transformation, letting GHC do lambda lifting on its own. For instance, see the discussion at haskell/bytestring#273 (comment).

Perfect! Join points are on the TODO stack, so this will be a nice follow-up chapter. I was thinking of doing SAT next because SAT is not enabled by default, is situational, and is an effective optimization technique.

doyougnu (Collaborator) commented Mar 7, 2023

Also I would like to add your name to the list of contributors as a reviewer. Is that ok with you?

Bodigrim commented Mar 7, 2023

> Also I would like to add your name to the list of contributors as a reviewer. Is that ok with you?

Sure, thanks.

doyougnu (Collaborator):

> I'll adapt the chapter to use this example.

> Also I would like to add your name to the list of contributors as a reviewer.

These are done.
