Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Encode default terms as pattern matching #742

Open
adelaett opened this issue Nov 3, 2024 · 3 comments
Open

Encode default terms as pattern matching #742

adelaett opened this issue Nov 3, 2024 · 3 comments
Labels
💬 discussion Further discussion is needed 💡 language Language design ➗ semantics Formal semantics of Catala

Comments

@adelaett
Copy link
Collaborator

adelaett commented Nov 3, 2024

Overview

This proposal suggests a design enhancement for the default calculus, which could improve the compilation process and increase the language's expressiveness.

Motivation

The idea comes from a simple observation: the default term <e1, ... en | ej :- ec> can be viewed as a form of pattern matching: <e1, ... en | ej is pj :- ec>. If ej represents not a boolean condition but a pattern, <| ej is pat :- ec>, we could encode traditional pattern matching as follows:

< <| ej is pat1 :- ec1>, <| ej is pat2 :- ec2> | Nothing :- empty>

This structure would allow variables to be bound within the consequence of a default term. For example:

match x with 
| Some y when y >= 0 => y + 1
| Some y when y < 0 => y - 1
| None => 0
end

can be encoded as

< <| x is Some y when y >= 0 :- <y + 1>>,
  <| x is Some y when y < 0 :- <y - 1>>,
  <| x is None :- <0>>
  | Nothing :- Empty
>

Note that the Nothing is different from the existing false in the boolean language, and of patterns. It is more like an empty pattern

High-Level Syntax

In terms of surface syntax, this change would mean we could add under condition x is Some y, making the variable y accessible in the current context. Here is a more concrete example:

definition z
under condition x is Some y when y >= 0
consequence equals y+1

definition z
under condition x is Some y when y < 0
consequence equals y

definition z
under condition x is None
consequence equals y

Compilation Approach

For the compilation process, we could integrate match ... with structures with default terms, utilizing pattern-matching compilation techniques to optimize the evaluation of these terms such as [maranget-ml2008]. (with modifications related to conflict and nothing patterns)

In particular, to check there is no conflict between too branches, one need only to check the branches where the pattern is the same. This could reduce the number of cases to handle.

@article{maranget-ml2008,
        author =    "Luc Maranget",
        title =     "Compiling Pattern Matching to Good Decision Trees",
        booktitle = "Workshop on the Language ML",
        year =      2008,
        month =     September,
        publisher = "ACM Press"
}
@AltGr
Copy link
Contributor

AltGr commented Nov 4, 2024

Hm, I am not yet totally convinced of the point, but will be happy to discuss it :)

In particular, doesn't the presence of when clauses defeat most of the pattern-matching optimisations ? (I need to re-read Luc's paper!)

Two remarks:

  • x is Some y when y >= 0 exists with the syntax x with pattern Some of y and y >= 0 ; it does not, however, bind the variable within the definition body when used in an under condition clause, which would be extremely useful (I thought there was an issue already open at this subject, but could not find it)
  • I have recently found quite a few cases where it would be useful to have, inside the AST, a expr MATCHES constructor predicate ; for now it has to be encoded as a pettern-matching listing all the cases, there is no other way to deconstruct an enumeration term. An example is the temporary equality check expansion pass that I've written, which due to this limitation of the AST requires two nested full-matches — quadratic in size with the number of variants. (Of course, for this case, allowing real pattern matching on tuples would be an even better solution)

@adelaett
Copy link
Collaborator Author

adelaett commented Nov 4, 2024

The problem of the expr matches constructor is the same as x with pattern Some of y for the condition; You cannot access it inside the definition, hence you need to re-match the same expression.

For the opitmization on pattern matching, i talked with @Nadrieril about pattern matching with for catala, and we arrived at the conclusion that it is an non-trivial interesting problem, especially if we want to compile to ocaml pattern matching among other things.

We mainly focused on two points: the first one on how to express the conflict condition using only pattern matching. It should be possible to do it using either exception-counting with decision trees, with two runs, or using back-tracking automata that backtrack the first found exception.

However it is non-trivial to adapt those to nested catala expressions, as it is not obvious to find the correct ways to flatten back the pattern matching.

Nevertheless, @Nadrieril indicated that it should be possible to encode pattern matching with a lot of conditions if the conditions have a certain form. For example if they are intervals on a total order, it is possible to have a logarithmic decision tree. This could be interesting for compiling more efficiently catala.

--

In the meantime, modifying the compilation of this new version of default terms is not very hard. Instead of writing

let res =
    List.fold (function Empty, v | v, Empty => v | v, v => Conflict) Empty [e1, ..., en]
in
if ej then ec else Empty

We can write:

let res =
    List.fold (function Empty, v | v, Empty => v | v, v => Conflict) Empty [e1, ..., en]
in
match ej with |pj => ec | _ => Empty

This is minor modification given regular pattern matching are already part of the lcalc intermediate representation.

@AltGr
Copy link
Contributor

AltGr commented Nov 6, 2024

One major difference that I can see is that pattern-matching is supposed to stop processing at the first match; here, unless we assume there will be no conflicts, we will need to process all of them anyway. Another is that during a pattern-matching, you can leverage the structure of the matched term to drive the decision tree, and here there isn't in general much structure to rely on ?

@denismerigoux denismerigoux added 💬 discussion Further discussion is needed ➗ semantics Formal semantics of Catala 💡 language Language design labels Nov 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
💬 discussion Further discussion is needed 💡 language Language design ➗ semantics Formal semantics of Catala
Projects
Status: Inactive
Development

No branches or pull requests

3 participants