Pattern matching equality types #36

eduardoleon · 2018-01-27T16:56:05Z

Standard ML has two branching mechanisms: pattern matching and the more traditional if-then-else. The former is generally considered superior for most purposes. For example:

fun foo nil = bar
  | foo (x :: xs) = qux (x, xs)

is regarded as cleaner code than:

fun foo xs = if null xs then bar else qux (hd x, tl x)

However, SML's existing pattern matching mechanism provides no support for equality testing. One cannot write, for example:

fun foo ((p,a) :: (p,b) :: (p,c) :: xs) = SOME whatever
  | foo _ = NONE

Instead, one has to write the much clumsier:

fun foo ((p,a) :: (q,b) :: (r,c) :: xs) =
    if p = q andalso q = r then SOME whatever else NONE
  | foo _ = NONE

Note that q and r serve no purpose other than to be tested for equality with p.

I propose that repeated variables be allowed in patterns, and that they be required to have eqtypes. As an additional benefit, the equality testing function wouldn't need to be a magical primitive anymore. Instead, it can be defined as a library function:

fun op= (x, x) = true
  | op= _ = false

I would also like to propose that the equality testing function be renamed to ==, since it is confusing to use the same token for definitions and equality testing. But that's a debate for another day.

The text was updated successfully, but these errors were encountered:

JohnReppy · 2018-01-27T17:39:26Z

The Successor ML design has an alternative approach to this issue using guarded patterns. You can write something like

(p1, a) :: (p2, a) :: (p3, a) :: xs if (p1 = p2) andalso (p2 = p3)

(I may be getting the exact syntax slightly wrong, but it is described in the SuccessorML definition).

I've got an undergraduate working on adding this mechanism to SML/NJ.

eduardoleon · 2018-01-27T18:23:34Z

What I don't like about guarded patterns is that they may require evaluating arbitrary expressions, which of course might have arbitrary effects, including nontermination.

Pattern matching should be the eliminator for a sum, i.e., the language's semantics should ensure that the distinction between the cases is decidable. Guarded patterns shift the burden of proof onto the programmer.

RobertHarper · 2018-01-27T18:42:38Z

it's true about the guards possibly having effects. but it's also true that implicit equality checks can lead to unpredictable efficiency issues, because you don't know what equality means or how patterns are compiled. the guarded form is at least explicit, and you can see the cost involved. i don't like being too cute about equality --- particularly with abstract types for which equality is not defined, or is provided as part of the abstraction. i prefer guards to non-linear patterns.

On Sat, Jan 27, 2018 at 1:23 PM eduardoleon ***@***.***> wrote: What I don't like about guarded patterns is that they may require evaluating arbitrary expressions, which of course might have arbitrary effects. Pattern matching should be the eliminator for a sum, i.e., the language's semantics should ensure that the distinction between the cases is decidable. — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#36 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/ABdsdZAyyOLOntN5lBKCRgx1ScVQEBPCks5tO2mmgaJpZM4RvW9L> .

eduardoleon · 2018-01-27T19:24:01Z

Out of curiosity: In practice, how often is equality testing for different types? Personally, I only need it with “atomic” types such as integers and reference cells. Strings are probably best manipulated by treating them as streams of characters, using a function:

val uncons : string -> (char * string) option
(* IMO preferable to String.explode *)

And I don't use equality testing with datatypes or abstract types at all. (In fact, couldn't we make datatypes not eqtypes?)

In conclusion, I would be satisfied with a limited version of my original proposal where equality testing in pattern matching is only allowed for integers, words, characters, and reference cells.

JohnReppy · 2018-01-27T21:18:32Z

While equality for abstract and recursive types has its dangers (as Bob noted), equality is useful for more than just integers and reference cells. The bool datatype is an obvious example.

Going back to the original proposal, guards support the desired pattern in a much more general way than non-linear patterns. For example, say my list elements that foo is working on are hash-consed terms. They have a fast equality test, but it is not op=, so non-linear patterns won't work, but guards will.

Also, it is a bit off topic, but the substring type is what you want for efficient manipulation of strings (the Substring.getc function is your uncons.

RobertHarper · 2018-01-27T21:47:09Z

i agree with the suggestion that the ad hoc notion of a type class for equality should be deleted entirely in favor of a modules-based formulation of automatic functor composition described in my "modular type classes" paper. the language should not build in any concept of equality. it's never useful, except for the trivial dirty hack. when teaching introductory programming, i delete equality entirely, and delete if-then-else. for any boolean value, students are required to case analyze on true and false. why? to disabuse them of the misunderstanding that a boolean-valued function is a predicate. it is not. the equal sign is such a function, it is not the equality relation, it's just written that way. writing "if x=nil" is a barbarism. a good way to teach that is to make it not even expressible in the language at all. bob

On Sat, Jan 27, 2018 at 4:18 PM John Reppy ***@***.***> wrote: While equality for abstract and recursive types has its dangers (as Bob noted), equality is useful for more than just integers and reference cells. The bool *datatype* is an obvious example. Going back to the original proposal, guards support the desired pattern in a much more general way than non-linear patterns. For example, say my list elements that foo is working on are hash-consed terms. They have a fast equality test, but it is not op=, so non-linear patterns won't work, but guards will. Also, it is a bit off topic, but the substring type is what you want for efficient manipulation of strings (the Substring.getc function is your uncons. — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#36 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/ABdsdYVrnO6Meu1-ztcPZuB3fnUyGJZyks5tO5KpgaJpZM4RvW9L> .

eduardoleon · 2019-08-05T08:47:47Z

I agree that if-then-else really needs to die, but I do not see how pattern matching on bools is any better. Effectively, it is different syntax for the same thing. You are still computing values of type bool, whose provenance you need to remember. Why not get rid of the bool altogether, and branch on what really matters - whether two given values are equal or not?

Now, regarding why it is useful to branch on whether two values are equal. This is an operation that does not arise so often in purely functional programming. (Although I have found use cases, such as the post that begins this thread.) However, it is absolutely essential in the implementation of certain imperative data structures, such as link-cut trees, in which the “object identity” of every node matters, and we need to test at runtime whether two given pointers (or reference cells) are the same.

ratmice · 2019-08-05T10:06:13Z

Not a much to do with pattern matching in this comment, but the approach to guarding via some predicate that I like has been to use a functor/opaque type returning an option as below.

This has the benefit that the predicate is called on construction via mk, thus you don't have to worry about effects occurring during pattern match. That said it doesn't really work if T contains mutable state, which the predicate conditionalizes.

I'm not sure others would want to work in this style which is a somewhat detrimental to pattern matching, it probably wouldn't end up very pretty in Eduardo's example where there is equality across list elements, but perhaps worth mentioning, as I like both the predictability of not only when, but how many times the predicate is called (on construction, once), of this style of code.

shrug

signature DISCRIM = sig
        type T;
        type discrim_t;
        val mk : T -> discrim_t option;
        val get : discrim_t -> T;
end

functor Discrim(type T_; val predicate : T_ -> bool)
:> DISCRIM where type T = T_
= struct
        type T = T_;
        type discrim_t = T_;
        fun mk v = if predicate v then SOME v else NONE;
        fun get v = v;
end

JohnReppy · 2019-08-05T15:10:32Z

A response to parts of the previous two comments.

While if-then-else is just a derived form, it is both notationally and conceptually convenient. Successor ML goes further and adds an if-then derived form.

As far as using an abstract interface instead of pattern matching, you lose the efficiency benefits of pattern matching, unless you have a very smart compiler. An alternative that gives you similar behavior are the Abstract Value Constructors that Bill Aitken and I proposed back in 1992.

eduardoleon · 2019-08-08T21:51:32Z

Consider the type 'a point defined as follows:

datatype 'a state = Empty | Point of 'a point
withtype 'a point = ('a * 'a state * 'a state * 'a state) ref

The elements of the 4-tuple are:

A user-supplied value
A pointer to the left child node
A pointer to the right child node
A pointer to the parent node

The invariants of this type are:

The left and right children of a node must point back to their parent.
The left and right children of a node must be either distinct of both Empty.
There are no cycles obtained by following downward paths, i.e., left and right pointers.
There are no cycles obtained by following upward paths, i.e., parent pointers.

A node may have an arbitrary number of children. The left and right children are merely (zero, one or) two of them.

An auxiliary tree is a maximal subgraph with the property that, given two nodes m and n in it, there exist paths from m to n and from n to m. Auxiliary trees are manipulated as splay trees. In particular, one must be able to splay a node. In present-day Standard ML, this is done as follows:

datatype 'a parent
  = Root
  | Left of 'a point
  | Right of 'a point
  | Other of 'a point

(* Working around how dumb eqtypes are. *)
fun eq (Empty, Empty) = true
  | eq (Point p, Point q) = p = q
  | eq _ = false

fun parent p =
  case #4 (!p) of
      Empty => Root
    | Point q =>
      if eq (#2 (!q), Point p) then Left q else
      if eq (#3 (!q), Point p) then Right q else Other q

fun left (p, q) =
  case parent q of
      Left r => l3 (p, q, r)    (* implemented somewhere else *)
    | _ => l2 (p, q)            (* implemented somewhere else *)

fun right (p, q) =
  case parent q of
      Right r => r3 (r, q, p)    (* implemented somewhere else *)
    | _ => r2 (q, p)             (* implemented somewhere else *)

fun splay p =
  case parent p of
      Root => Empty
    | Left q => (left (p, q); splay p)
    | Right q => (right (p, q); splay p)
    | Other q => Point q

The helper datatype 'a parent and the helper functions parent, left and right exist solely to work around the lack of expressiveness of Standard ML's patterns. With equality patterns and pattern synonyms, one could instead write something like:

pat Left (p, q) = p as ref (_, _, _, Point (q as ref (_, Point p, _, _)))
pat Right (p, q) = p as ref (_, _, _, Point (q as ref (_, _, Point p, _)))
pat Child (p, q) = p as ref (_, _, _, q)

fun splay (Left (p, Left (q, r)) = (l3 (p, q, r); splay p)
  | splay (Right (p, Right (q, r)) = (r3 (r, q, p); splay p)
  | splay (Left (p, q)) = (l2 (p, q); splay p)
  | splay (Right (p, q)) = (r2 (q, p); splay p)
  | splay (Child (p, q)) = q

And then pattern matching imperative data structures is every single bit as sweet as pattern matching purely functional ones.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Pattern matching equality types #36

Pattern matching equality types #36

eduardoleon commented Jan 27, 2018

JohnReppy commented Jan 27, 2018

eduardoleon commented Jan 27, 2018 •

edited

Loading

RobertHarper commented Jan 27, 2018 via email

eduardoleon commented Jan 27, 2018

JohnReppy commented Jan 27, 2018

RobertHarper commented Jan 27, 2018 via email

eduardoleon commented Aug 5, 2019 •

edited

Loading

ratmice commented Aug 5, 2019

JohnReppy commented Aug 5, 2019

eduardoleon commented Aug 8, 2019 •

edited

Loading

Pattern matching equality types #36

Pattern matching equality types #36

Comments

eduardoleon commented Jan 27, 2018

JohnReppy commented Jan 27, 2018

eduardoleon commented Jan 27, 2018 • edited Loading

RobertHarper commented Jan 27, 2018 via email

eduardoleon commented Jan 27, 2018

JohnReppy commented Jan 27, 2018

RobertHarper commented Jan 27, 2018 via email

eduardoleon commented Aug 5, 2019 • edited Loading

ratmice commented Aug 5, 2019

JohnReppy commented Aug 5, 2019

eduardoleon commented Aug 8, 2019 • edited Loading

eduardoleon commented Jan 27, 2018 •

edited

Loading

eduardoleon commented Aug 5, 2019 •

edited

Loading

eduardoleon commented Aug 8, 2019 •

edited

Loading