[DRAFT] Continuation specialization, aka match-in-match #3501

Draft: wants to merge 7 commits into base: main

Gbury
Contributor

@Gbury Gbury commented Jan 23, 2025

This PR adds to flambda the capacity to specialize continuations.

For now, it's meant for testing. Once testing is done, code will be cleaned up, and some parts will be split up and submitted as independent PRs for an easier review.

Context and overview

Following the continuation lifting introduced in #2295, this PR (finally) adds continuation specialization, i.e. given one continuation k with more than one call site, generate one specialized version of k for each call site. The trickiest part of that process is its interaction with continuation lifting, which is needed to ensure that a minimal amount of code is duplicated when specializing a continuation (more on that later).
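As a rough source-level picture of what this is aiming for (a hypothetical example, not taken from the PR; the function names are made up), the join point after the inner conditional plays the role of k, and specializing it per call site lets each copy resolve the outer match statically:

```ocaml
(* Hypothetical source program: the result of the inner conditional flows
   into the outer match through a join point (the continuation k). *)
let classify (b : bool) : string =
  let x = if b then 0 else 1 in   (* two call sites of k: k 0 and k 1 *)
  match x with                    (* handler of k: a switch on its argument *)
  | 0 -> "zero"
  | _ -> "nonzero"

(* Hand-written picture of the intended result once k has been specialized
   per call site: each copy knows its argument, so the switch disappears. *)
let classify_specialized (b : bool) : string =
  if b then "zero" else "nonzero"

let () =
  assert (classify true = classify_specialized true);
  assert (classify false = classify_specialized false)
```

The second function is only an illustration of the expected shape of the simplified term; what the simplifier actually produces depends on the heuristics and budget discussed in this PR.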

Consider an expression of the form:

let cont k x =
  let y = untag_int x in
  let cont k' z =
    ..
  in
  switch y with
  | 0 -> k' ...
  | 1 -> k' ...
in
switch .. with
| 0 -> k 0
| 1 -> k 1

The process for continuation specialization is the following.

  1. When simplifying (downwards) the let_cont for k, we first simplify the body (as usual), and see/record all the uses of k (as usual).
  2. Once the downwards pass on the body has been completed, we start the downwards pass on the handler of k.
  3. After some simplifications, we simplify the let_cont for k' and start with the downwards traversal of the body of the let_cont.
  4. We reach the switch at the end of the handler of k (i.e. the switch on y). At that point we have all the information that we need to know whether specializing k will eliminate the switch (for the match-in-match optimization), so we can decide whether to specialize or not. Note: the criterion for deciding when to specialize is not implemented in this PR and will be done later; for now we always specialize if the lifting budget allows for it.
  5. Let's assume we do decide to specialize k. We then also decide to lift any continuation inside of the handler of k to avoid duplicating those.
  6. We have now finished the downwards pass on the body of the let_cont of k', and we now know that we want to specialize k, and thus also lift k', so instead of simplifying the handler of k', we lift it by storing it in the dacc (this is what Continuation lifting #2295 does).
  7. We now reach the down_to_up for the handler of k. We first pop the lifted continuations out of the dacc and craft a new down_to_up function that will simplify them. We can then start the specialization of k.
  8. We generate the specialized versions of k by copying the original handler (before simplification) for each callsite, and then doing the downwards pass on each of these "new" continuations.
  9. For now, we always completely specialize (i.e. generate exactly one continuation for each callsite), so we adjust the free names and other metadata to completely erase information coming from the downwards traversal of the original handler for k (in the future we might consider specializing only some callsites and keeping a "generic" version, but that will be for later).
  10. During the downwards pass on the specialized versions of k, we must replace any bound continuation (whose name will be fresh as the let_cont will be freshly opened by the downwards traversal) by its lifted version, and compute adequate lifted arguments. This is where we use the notion of Replay_history to relate the freshly bound continuation (and the lifted arguments) with the names from the first downwards traversal on k.
  11. We record the mapping from callsites to the names of the new continuations, as that will be used later (in the data flow analysis and during rebuilding).
  12. During data flow analysis, we rewrite the internal graphs using the mapping created at point 11 (to reflect the term that will be rebuilt).
  13. When rebuilding terms, we rewrite the calls of k to refer to the specialized versions, using the mapping from point 11.
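The core of the process above (one copy of the handler per call site, plus a recorded mapping from call sites to the new continuations) can be sketched with a toy model. None of the types or functions below exist in flambda2; they are assumptions made purely for illustration, and the "downwards pass" is reduced to looking up the switch arm for a known argument:

```ocaml
(* Toy model only: none of these types exist in flambda2. *)
type call_site = { id : int; arg : int }        (* one use of k and its argument *)
type handler = { arms : (int * string) list }   (* handler of k: a switch on its parameter *)
type specialized = { name : string; body : string }

(* One copy of the original handler per call site. Simplifying the copy with
   a known argument resolves the switch to a single arm. The result is the
   mapping from call sites to specialized continuations. *)
let specialize ~cont (h : handler) (sites : call_site list)
    : (int * specialized) list =
  List.map
    (fun site ->
      let name = Printf.sprintf "%s_%d" cont site.id in
      let body = List.assoc site.arg h.arms in
      (site.id, { name; body }))
    sites

(* When rebuilding, a call to k at a given call site is rewritten to call
   the specialized continuation recorded for that call site. *)
let rewrite_call mapping (site : call_site) : string =
  (List.assoc site.id mapping).name

let () =
  let h = { arms = [ (0, "k' a"); (1, "k' b") ] } in
  let sites = [ { id = 0; arg = 0 }; { id = 1; arg = 1 } ] in
  let mapping = specialize ~cont:"k" h sites in
  List.iter (fun site -> print_endline (rewrite_call mapping site)) sites
```

In this sketch the mapping is a plain association list keyed by a call-site identifier; the real implementation also has to deal with lifting, free names and the data flow graphs, which are left out here.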

Replay History

The notion of replay_history is introduced in #3302 (not merged yet) precisely to make it possible to lift a continuation during the first downwards pass on k (the specialized continuation), and then refer to the lifted continuation in the specialized versions of k.

Let's detail a bit more why that is required/necessary. Let's consider the same example as above, when we perform the downwards pass on the specialized versions of k (let's call them k_0 and k_1), what the downwards traversals effectively see/observe is a term that would look like the following:

let cont k x =
  let y = untag_int x in
  let cont k' z =
    ..
  in
  switch y with
  | 0 -> k' ...
  | 1 -> k' ...
in
let cont k_0 x_0 =
  let y_0 = untag_int x_0 in
  let cont k'_0 z_0 =
    ..
  in
  switch y_0 with
  | 0 -> k'_0 ...
  | 1 -> k'_0 ...
in
let cont k_1 x_1 =
  let y_1 = untag_int x_1 in
  let cont k'_1 z_1 =
    ..
  in
  switch y_1 with
  | 0 -> k'_1 ...
  | 1 -> k'_1 ...
in
switch .. with
| 0 -> k_0 0
| 1 -> k_1 1

In particular, because of the automatic renaming done by name abstractions, we have no easy way to relate the lifted continuation k' with its corresponding continuations k'_0 and k'_1, except for the fact that all of those can be identified as the first continuation that is bound in the handler of k. Additionally, when lifting k', we add to it the lifted parameter y', which stands for the variable y that was in scope at its original binding site and thus must be kept available in its handler. This brings another problem: we need to add the adequate argument for this new parameter at every callsite of k', including those in the specialized continuations k_0 and k_1, where its value will be y_0 and y_1 respectively. These can only be related to y (the value of the extra parameter y' in the original handler of k) because they are the first variable that is bound in the handler of k (and similarly for x, the first parameter of k).

Therefore, in order to correctly lift continuations out of the handler of specialized continuations, we need to be able to relate the names of parameters, variables and continuations that are bound in the handler of k between successive downwards traversals. More precisely, given a parameter/variable/continuation bound in the handler of a specialized version of k, we want to know the corresponding name generated during the first downwards pass (the one on the original handler of k).

That's where the notion of Replay_history comes from: basically, it's an ordered list of the successive parameters/variables/continuations that are bound in the handler of a given continuation. By creating and storing that list during the first pass (i.e. the downwards pass on the original handler of k), we can then on successive downwards passes establish a mapping from the names of the current (specialized) pass to the names of the original pass.
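A minimal sketch of that idea, assuming names are plain strings and using made-up function names (this is not the actual Replay_history interface): the first pass records binders in order, and each binder opened during a later pass consumes one recorded entry.

```ocaml
(* Toy model: names are plain strings; this is not the real Replay_history API. *)
type history = string list   (* binders, in the order they were opened *)

(* First pass: record each binder as it is opened (in reverse), then finish. *)
let record (acc : history) (name : string) : history = name :: acc
let finish (acc : history) : history = List.rev acc

(* Replay state: what is left to consume, and the mapping built so far,
   as (current name, original name) pairs. *)
type replay = { remaining : history; mapping : (string * string) list }

let start (h : history) : replay = { remaining = h; mapping = [] }

(* Each binder opened in the current pass consumes one recorded entry. *)
let bind (r : replay) (current : string) : replay =
  match r.remaining with
  | [] -> failwith "replay history exhausted"
  | original :: rest ->
    { remaining = rest; mapping = (current, original) :: r.mapping }

let original_name (r : replay) (current : string) : string option =
  List.assoc_opt current r.mapping

let () =
  (* First pass on the handler of k binds x, then y, then k'. *)
  let h = finish (List.fold_left record [] [ "x"; "y"; "k'" ]) in
  (* Pass on the specialized copy k_0 binds x_0, y_0, k'_0 in the same order. *)
  let r = List.fold_left bind (start h) [ "x_0"; "y_0"; "k'_0" ] in
  assert (original_name r "y_0" = Some "y");
  assert (original_name r "k'_0" = Some "k'")
```

The mapping is only meaningful if both passes open exactly the same sequence of binders, which is why the position in the list is the sole link between a fresh name and its original.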

The replay history tries to be as "secure" as possible (i.e. to guard against establishing a mapping between two downwards passes on different expressions) by raising errors if the order of variables/continuations bound does not match, and by verifying that the corresponding names are renamed versions of one another. This does not protect against every misuse, since in some situations two unrelated variables/continuations can be seen as renamed versions of one another if they have the same name, but it at least offers some measure of safeguard.
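The check and its limitation can be illustrated as follows, under the assumption (made only for this sketch) that a name is a base string plus a stamp and that renaming keeps the base and changes the stamp; the record type, exception and functions are hypothetical:

```ocaml
(* Hypothetical model of names: renaming keeps [base] and changes [stamp]. *)
type name = { base : string; stamp : int }

exception Replay_mismatch of name * name

(* Two names are considered renamed versions of one another if they share
   the same base name. *)
let same_renaming_class (a : name) (b : name) : bool =
  String.equal a.base b.base

let check_pair ~original ~current =
  if not (same_renaming_class original current)
  then raise (Replay_mismatch (original, current))

(* Check a whole replay: same number of binders, and pairwise compatible. *)
let check_replay (originals : name list) (currents : name list) : unit =
  if List.length originals <> List.length currents
  then failwith "different number of binders";
  List.iter2
    (fun original current -> check_pair ~original ~current)
    originals currents

let () =
  let x = { base = "x"; stamp = 1 } and x_0 = { base = "x"; stamp = 7 } in
  let y_0 = { base = "y"; stamp = 8 } in
  check_replay [ x ] [ x_0 ];
  assert (try check_replay [ x ] [ y_0 ]; false with Replay_mismatch _ -> true);
  (* Limitation: two unrelated anonymous continuations pass the check. *)
  let k1 = { base = ""; stamp = 2 } and k2 = { base = ""; stamp = 9 } in
  assert (same_renaming_class k1 k2)
```

The last assertion shows why the check is a safeguard rather than a guarantee: names with identical bases are indistinguishable to it.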

Lifted continuations

In the handler of the specialized continuations, we replace the handler of bound continuations by a call to the lifted continuation, and rely on the usual mechanism to correctly compute the adequate arguments for the lifted parameters. This works seamlessly because the lifted_cont_params structure interacts with the replay_history to correctly map the unique key of lifted params (i.e. the name of the variables in the original downwards pass) to their name in the current pass.
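To make the key-to-name translation concrete, here is a small sketch with string names. The functions and their labels are invented for illustration and do not reflect the real lifted_cont_params interface; the fallback to the original name when no mapping exists is also an assumption of this sketch.

```ocaml
(* Lifted parameters are keyed by the variable names of the original pass.
   To call the lifted continuation from the current pass, each key is
   translated to the name bound in the current pass. *)
let lifted_args ~(lifted_param_keys : string list)
    ~(original_to_current : (string * string) list) : string list =
  List.map
    (fun key ->
      match List.assoc_opt key original_to_current with
      | Some current -> current
      | None -> key (* original pass: the key is the name itself *))
    lifted_param_keys

(* Render a call to the lifted continuation: regular arguments first,
   then the arguments for the lifted parameters. *)
let call_lifted ~lifted_cont ~args ~lifted_param_keys ~original_to_current =
  String.concat " "
    ((lifted_cont :: args)
     @ lifted_args ~lifted_param_keys ~original_to_current)

let () =
  (* In the specialized copy k_0, the original variable y is named y_0. *)
  print_endline
    (call_lifted ~lifted_cont:"k'" ~args:[ "a" ] ~lifted_param_keys:[ "y" ]
       ~original_to_current:[ ("x", "x_0"); ("y", "y_0") ])
```

Here the call in k_0 passes y_0 for the lifted parameter, while a call from the original handler (empty mapping) would pass y.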

Conclusion

General remarks/comments/questions about how everything works are welcome. Detailed remarks about the code are likely to be of limited use until after testing (particularly until there are no bugs remaining), contrary to remarks about the general design and organisation of the feature ^^

I'll ping the relevant persons when the PR is ready for reviews, but anyone should feel free to look at the code (keeping in mind that there's a fair bit of code cleanup to do).

@Gbury Gbury added the flambda2 (Prerequisite for, or part of, flambda2) and match-in-match (prerequisites, or part of, match-in-match) labels Jan 23, 2025
@Gbury Gbury force-pushed the cont_spec branch 2 times, most recently from 03666f7 to e993d77, February 12, 2025 17:00
@Gbury Gbury force-pushed the cont_spec branch 2 times, most recently from a55c03d to 6efd2bf, February 24, 2025 08:21
Currently, any and all variables in scope, including parameters
and extra parameters (coming from CSE and unboxing), are added to the
lifted_cont_params. This is superfluous, since the handlers of lifted
continuations do not contain any reference to these extra params prior
to simplification, and these extra params can be recomputed for the
lifted continuations.

As it stands, this commit alone may lead to regressions, since we
currently do not perform unboxing on the lifted_cont_params. This will
be fixed in a later commit.

In some cases (mostly the upcoming continuation specialization for
match-in-match), we are interested in doing multiple downwards passes on a
single term. In such cases, it might be necessary to relate the names of
variables and continuations from the first pass to those from the
subsequent passes. The predominant use of that for now should be to
correctly handle continuations that have been lifted during the first
pass, and whose calls should be rewritten in subsequent passes
(including their lifted arguments).

This is done by storing the sequence of names generated when opening
name abstractions during the first downwards pass. Note that we only
need to do so for names present in the term before downwards traversal
so we can skip any extra param added by Simplify. On subsequent passes,
each new variable and continuation "consumes" an element of that
sequence, allowing us to establish the correspondence between names. As a
safety measure, we check that corresponding names are in the same
"renaming equivalence class", though note that this is not a guarantee
(for instance, by default all continuations have no names and therefore
will all be in the same "renaming equivalence class").

Additionally, and this is more specific to continuation specialization
but generalizable to other future uses, it is necessary to have some
kind of replayability of inlining decisions. Indeed each inlining
decision will change the sequence of bound names that are opened during
the downwards pass, which would currently break the hypothesis of replay
histories that exactly the same sequence of binders are opened. In the
case of match-in-match this is simple: the handler that we want to
specialize ends with a switch, which means that all calls inside the
handler have been inlined, so the replay history has a boolean to denote
that we want to inline everything while replaying the handler downwards
pass. For other more complex uses, inlining decisions could be stored
alongside the bound variables and continuations, so that they can be
replayed, either as is, or within some notion of compatibility.

Finally, we also need to make it so that recursive continuations are
bound in an order that is stable through renaming, so we change the map
of continuations to an Lmap for recursive continuation bindings.
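The ordering issue can be illustrated with a sketch that assumes continuations are identified by integer stamps and that an Lmap behaves like an association list preserving binding order (both are assumptions of this example, and the stamps are made up):

```ocaml
module IMap = Map.Make (Int)

(* Iterating a key-ordered map visits handlers in stamp order. *)
let handlers_in_map_order (bindings : (int * string) list) : string list =
  let m =
    List.fold_left (fun m (k, h) -> IMap.add k h m) IMap.empty bindings
  in
  List.map snd (IMap.bindings m)

(* Iterating an association list visits handlers in binding order. *)
let handlers_in_binding_order (bindings : (int * string) list) : string list =
  List.map snd bindings

(* The same two recursive continuations, before and after renaming:
   the fresh stamps happen to compare in the opposite order. *)
let original = [ (2, "handler A"); (7, "handler B") ]
let renamed = [ (9, "handler A"); (8, "handler B") ]

let () =
  (* Key order changes with the stamps, so the traversal order differs. *)
  assert (handlers_in_map_order original <> handlers_in_map_order renamed);
  (* Binding order is unaffected by renaming. *)
  assert (handlers_in_binding_order original = handlers_in_binding_order renamed)
```

Since a replay relies on binders being opened in the same sequence on every pass, only the second traversal order is usable for it.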

In the future, this replay mechanism would also be useful to do widening for
recursive continuations (but that's a far away future).

This should have no observable effect, but will simplify the
continuation specialization work, particularly when paired with the
replay histories feature, so that lifted cont params can be correctly
tracked between a first pass and subsequent passes during continuation
specialization.

Interestingly, the blind specialization currently done (since we do not
yet have a heuristic) actually removes the errors on some tests, which
makes the tests fail. We therefore disable specialization on these
tests, and we'll see once we have a proper heuristic whether we
change/promote the tests.

TODO: Revert this before merging the branch/PR

This will allow testing the lifting and continuation specialization
without having to alter cli arguments or env variables