SIP-58 - Named Tuples #72

Merged
merged 9 commits into from
Aug 19, 2024
282 changes: 282 additions & 0 deletions content/named-tuples.md
@@ -0,0 +1,282 @@
---
layout: sip
permalink: /sips/named-tuples.html
stage: implementation
status: waiting-for-implementation
presip-thread: https://contributors.scala-lang.org/t/pre-sip-named-tuples/6403/164
title: SIP-NN - Named Tuples
---

**By: Martin Odersky**

## History

| Date | Version |
|---------------|--------------------|
| Jan 13th 2024 | Initial Draft |

## Summary

We propose to add a new form of tuples in which the elements are named.
Named tuples can be types, terms, or patterns. Syntax examples:
```scala
type Person = (name: String, age: Int)
Contributor

What happens if someone writes the following:

type Person = (name: String, name: String)

Is this a syntax error, type error, or something else? The runtime implementation of (String, String) doesn't care about names, but then when someone calls .name the compiler cannot know which tuple entry it is referring to

Contributor Author

It's a type error. It's listed in the Restrictions section of the SIP.

val Bob: Person = (name = "Bob", age = 33)

Bob match
case (name = n, age = 22) => ...
```

We also propose to revive SIP 43 to support patterns with named fields. Named pattern fields for case classes are analogous to named patterns for tuple elements. User-defined named pattern matching is supported since named tuples can be results of extractor methods.

## Motivation

1. Tuples are a convenient lightweight way to return multiple results from a function. But the absence of names obscures their meaning, and makes decomposition with `_1`, `_2` ugly and hard to read. The existing alternative is to define a class instead. This does name fields, but is more heavyweight, both in terms of notation and generated bytecode. Named tuples give the same convenience of definition as regular tuples at far better readability.

1. Named tuples are an almost ideal substrate on which to implement relational algebra and other database oriented operations. They are a good representation of database rows and allow the definition of generic operations such as projections and joins since they can draw on Scala 3’s existing generic machinery for tuples based on match types.

1. Named tuples make named pattern matching trivial to implement. The discussion on SIP 43 showed that without them it’s unclear how to implement named pattern matching at all.

## Proposed solution

The elements of a tuple can now be named. Example:
```scala
type Person = (name: String, age: Int)
val Bob: Person = (name = "Bob", age = 33)

Bob match
  case (name, age) =>
    println(s"$name is $age years old")

val persons: List[Person] = ...
val minors = persons.filter: p =>
  p.age < 18
```
Named bindings in tuples are similar to function parameters and arguments. We use `name: Type` for element types and `name = value` for element values. It is illegal to mix named and unnamed elements in a tuple, or to use the same name for two different elements.

Fields of named tuples can be selected by their name, as in the line `p.age < 18` above.

Example:

~~~ scala
// This is an @main method
@main def foo(x: Int): Unit =
  println(x)
~~~

### Conformance

The order of names in a named tuple matters. For instance, the type `Person` above and the type `(age: Int, name: String)` would be different, incompatible types.

Values of named tuple types can also be defined using regular tuples. For instance:
```scala
val x: Person = ("Laura", 25)

def register(person: Person) = ...
register(person = ("Silvain", 16))
register(("Silvain", 16))
```
This follows since a regular tuple `(T_1, ..., T_n)` is treated as a subtype of a named tuple `(N_1: T_1, ..., N_n: T_n)` with the same element types. On the other hand, named tuples do not conform to unnamed tuples, so the following is an error:

The proposed named tuples are pretty close to shapeless records in principle (just with new syntax at the type and value level to define them). But, the subtyping relationship is the opposite.

E.g. where (FieldType["foo", Int], FieldType["bar", String]) would be a subtype of (Int, String), in this proposal the equivalent to the former (foo: Int, bar: String) would be a supertype of the latter (Int, String).

I think the named value as a refinement of the unnamed value makes a lot more sense. But in the proposal, there are instead special rules to allow the subtyping relationship to be backwards without total insanity.

Please take a look at FieldType and the very successful (ignoring the symbol-vs-string issue) decade-long history of named tuples (i.e. records) in shapeless, before making a decision here.

@milessabin I wonder if you have an opinion about this.

```scala
val x: (String, Int) = Bob // error: type mismatch
```
One can convert a named tuple to an unnamed tuple with the `toTuple` method, so the following works:
```scala
val x: (String, Int) = Bob.toTuple // ok
```
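
The conformance rules above can be checked mechanically. The following is a minimal sketch, assuming a Scala 3 compiler in which named tuples are available as a stable feature (3.7 or later); the names `bob`, `sameTypeOk`, `reorderedOk`, `namedToUnnamedOk` and `plain` are illustrative only. It uses `scala.compiletime.testing.typeChecks`, which reports whether a code snippet type-checks at the call site.

```scala
import scala.compiletime.testing.typeChecks

type Person = (name: String, age: Int)
val bob: Person = (name = "Bob", age = 33)

// Control case: assigning a Person to a Person type-checks.
val sameTypeOk = typeChecks("val p: Person = bob")

// Same names and element types in a different order: a different type.
val reorderedOk = typeChecks("val p: (age: Int, name: String) = bob")

// A named tuple is not accepted where an unnamed tuple is expected ...
val namedToUnnamedOk = typeChecks("val t: (String, Int) = bob")

// ... unless it is converted explicitly with toTuple.
val plain: (String, Int) = bob.toTuple
```

Only the control case and the explicit `toTuple` conversion are expected to be accepted; the other two snippets should be rejected by the type checker.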

_Question:_ Should we define an implicit conversion, either in place of this method or in addition to it?

Note that conformance rules for named tuples are analogous to the rules for named parameters. One can assign parameters by position to a named parameter list.
```scala
def f(param: Int) = ...
f(param = 1) // OK
f(2) // Also OK
```
But one cannot use a name to pass an argument to an unnamed parameter:
```scala
val f: Int => T
f(2) // OK
f(param = 2) // Not OK
```
The rules for tuples are analogous. Unnamed tuples conform to named tuple types, but the opposite does not hold.


### Pattern Matching

When pattern matching on a named tuple, the pattern may be named or unnamed.
If the pattern is named it needs to mention only a subset of the tuple names, and these names can come in any order. So the following are all OK:
```scala
Bob match
case (name, age) => ...

Bob match
case (name = x, age = y) => ...
Contributor

@lihaoyi lihaoyi Feb 22, 2024

Is something like this allowed?

case (name, age = y) =>

Does named tuple pattern matching require fields be in the right order? The section on Pattern Matching with Named Fields in General mentions named pattern matching on case classes does not care about the order of fields, but it's not clear whether that applies to named tuples as well

Should it care about order? Elsewhere, we discuss how one of the defining characteristics of named tuples is that they are ordered. Given that, it seems strange that you can construct different named tuples that are not equivalent to each other e.g. (name: String, age: Int) and (age: Int, name: String), but match the same pattern case (name = x, age = y) => because named pattern matching does not consider ordering

Contributor Author

@odersky odersky Feb 22, 2024

case (name, age = y) is currently not allowed. Named fields do not need to be in the right order.

Not caring about order comes from the named pattern matching part, and behaves the same for case classes and named tuples. So order matters for construction but not for named deconstruction.


Bob match
case (age = x) => ...

Bob match
case (age = x, name = y) => ...
```
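
As a runnable variant of the patterns above, here is a small sketch, again assuming Scala 3.7 or later. The helper `describe` and the value `eve` are made up for illustration. The first case lists the names in a different order than the type does, and the second mentions only a subset of the names.

```scala
type Person = (name: String, age: Int)

def describe(p: Person): String = p match
  // Names may appear in any order; here `age` is matched against a literal.
  case (age = 33, name = n) => s"$n is 33"
  // Only a subset of the names is mentioned; `age` is treated as a wildcard.
  case (name = n) => s"$n, age not inspected"

val bob: Person = (name = "Bob", age = 33)
val eve: Person = (name = "Eve", age = 41)
```

Here `describe(bob)` takes the first case and `describe(eve)` falls through to the second.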

### Expansion

Named tuples are in essence just a convenient syntax for regular tuples. In the internal representation, a named tuple type is represented at compile time as a pair of two tuples. One tuple contains the names as literal constant string types, the other contains the element types. The runtime representation of a named tuple consists of just the element values, whereas the names are forgotten. This is achieved by declaring `NamedTuple` in package `scala` as an opaque type as follows:
```scala
opaque type NamedTuple[N <: Tuple, +V <: Tuple] >: V = V
Contributor

@lihaoyi lihaoyi Jan 14, 2024

Is there any way we can constrain the N type parameter more, given that it can only be literal strings, and not arbitrary types like the V parameter next to it? NamedTuple[(Int, Boolean), (Long, Double)] should be illegal right? As should NamedTuple[("foo", "bar", "qux"), (1, 2)] since the two tuples have to be the same length?

Contributor Author

The compiler checks that all labels are string literals. But we can't encode such a constraint in the type bound.

Contributor

I think the question is: since we are relying on opaque types to represent NamedTuples, what stops the user from creating a badly formed NamedTuple manually or via macros? The Scala 3 compiler protects us from creating bad trees, but it won't in this case, unless the checker special-cases NamedTuple opaque types in its checks.

Member

the consequence would be like passing around Tuple instead of a concrete case

Contributor

It's probably worth listing out in the proposal all the "magic" constraints and checks that we are adding here, since they are not part of the normal type system and would be special-case support for scala.NamedTuple . From what I understand:

  • N <: Tuple must only contain literal strings
  • N <: Tuple and V <: Tuple must have the same length
  • N <: Tuple cannot contain duplicates

Is that all, or are there other constraints I'm missing?

Contributor Author

That should be it.

Contributor Author

In fact, those constraints are not enforced by the compiler. You can form arbitrary NamedTuple types yourself. There's no additional magic. It's just that they won't fit any named tuple value you write if they are ill-formed.

Contributor

I think this would be a good occasion to add a TupleOf[T] (with the semantics TupleOf[T]<: EmptyTuple | T *: TupleOf[T]) to the standard library

While not necessary, it would make the meaning of N clearer (even if it is renamed to something else)

```
For instance, the `Person` type would be represented as the type
```scala
NamedTuple[("name", "age"), (String, Int)]
```
`NamedTuple` is an opaque type alias of its second, value parameter. The first parameter is a tuple of string constant types that determines the names of the elements. Since the type is just an alias of its value part, names are erased at runtime, and named tuples and regular tuples have the same representation.

A `NamedTuple[N, V]` type is publicly known to be a supertype (but not a subtype) of its value parameter `V`, which means that regular tuples can be assigned to named tuples but not _vice versa_.

The `NamedTuple` object contains a number of extension methods for named tuples that mirror the same functions in `Tuple`. Examples are
`apply`, `head`, `tail`, `take`, `drop`, `++`, `map`, or `zip`.
Similar to `Tuple`, the `NamedTuple` object also contains types such as `Elem`, `Head`, `Concat`
that describe the results of these extension methods.

The translation of named tuples to instances of `NamedTuple` is fixed by the specification and therefore known to the programmer. This means that:

- All tuple operations also work with named tuples "out of the box".
- Macro libraries can rely on this expansion.
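
The expansion can be observed from user code. The sketch below assumes Scala 3.7 or later and assumes that the opaque type is reachable as `NamedTuple.NamedTuple`, that is, as a member of the `scala.NamedTuple` object; the value names are illustrative. The `summon` call compiles only if the sugared type and its expanded form are the same type.

```scala
type Person = (name: String, age: Int)
val bob: Person = (name = "Bob", age = 33)

// Evidence that the sugared type and the expanded type coincide.
// Assumption: the opaque type is a member of the scala.NamedTuple object.
val sameType =
  summon[Person =:= NamedTuple.NamedTuple[("name", "age"), (String, Int)]]

// Names exist only at compile time; the value part is an ordinary tuple.
val values: (String, Int) = bob.toTuple

// Selection by name works on the named view of the same value.
val nextAge: Int = bob.age + 1
```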

### The FieldsOf Type

The `NamedTuple` object contains a type definition
```scala
type FieldsOf[T] <: AnyNamedTuple
```
`FieldsOf` is treated specially by the compiler. When `FieldsOf` is applied to
an argument type that is an instance of a case class, the type expands to the named
tuple consisting of all the fields of that case class. Here, fields means: elements of the first parameter section. For instance, assuming
```scala
case class City(zip: Int, name: String, population: Int)
```
then `FieldsOf[City]` is the named tuple
```scala
(zip: Int, name: String, population: Int)
```
The same works for enum cases expanding to case classes.

### Pattern Matching with Named Fields in General

We allow named patterns not just for named tuples but also for case classes. For instance:
Contributor

@lihaoyi lihaoyi Feb 22, 2024

How does this work? Wouldn't we have to replace

def unapply(x: CaseClass): CaseClass

with

def unapply(x: CaseClass): (foo: Bar, qux: Baz, ...)

?

Or is this more generic, and works with anything returned from def unapply as long as the type has a zero-param-list member defined with the correct name?

If it works with any type with the correct members, what is the limit on the members that we use for pattern matching?

  1. Can they have one empty param list?
  2. Can they take implicits/givens?
  3. Can members that take parameters be eta-expanded during pattern matching, to bind a function value to the name?

Contributor Author

Or is this more generic, and works with anything returned from def unapply as long as the type has a zero-param-list member defined with the correct name?

Correct.

Contributor

@lihaoyi lihaoyi Feb 22, 2024

What about members that take implicits? I'm imagining e.g. the Dotty internal types, where just about everything takes a (given Context), even though for most intents and purposes I think of them as fields

What if the return type of unapply is a Java type? Do we accept named pattern matching on zero-parameter "getter" methods because Scala 3 lets you call them without parens? Or only on Java fields? Or do we prohibit named pattern matching on Java types entirely?

Contributor Author

Implicits are handled by the existing rules. To support a named pattern match, an unapply must return one of the following (after handling the get/isDefined cases)

  • A reference to a case class. The fields are the fields in the first parameter list of the case class. (Note that implicits come later)
  • A reference to a named tuple.

Contributor

@lihaoyi lihaoyi Feb 23, 2024

As a point of contrast, Python's structural named pattern matching on class instances works across arbitrary types (link), as long as they have the relevant properties that are being matched against. Python doesn't have unapply to customize pattern matching further, but I feel like we could take the good parts from Python while still preserving the parts we like in Scala

I don't think it'll take any major changes, here's a strawman proposal for how that would look:

  • Allow unapply to return values of any type, not just case classes or named tuples, and then proceed unchanged with looking up the matched fields on the returned value

  • Make unapply optional: given case Foo(x = y) =>, if Foo does not have a def unapply in the companion, we could fall back to a .isInstanceOf[Foo] type test, and then proceed unchanged with looking up the matched fields on the returned value

This has several nice properties:

  • It would allow easy pattern matching in the common case of "check if type is equal and the members are what we expect". Forcing people to define companion object Foo{ def unapply(foo: Foo): Foo = foo } to partake in pattern matching seems silly when the implementation of unapply is literally a no-op

  • For the common case of vanilla case classes and sealed traits, it behaves exactly identical to the semantics of calling unapply and looking up the fields on the returned value

  • It is in fact already what we do for case classes in Scala 2.x! case class pattern matching has not called .unapply for several years now, and instead generates bytecode that does a type test and then looks up the fields directly.

  • In Scala 3 we still call .unapply, but Scala 3's default unapply is the identity function (literally aload_1; areturn on the JVM), which does nothing to the passed value, on which we then look up the fields directly. Skipping the call to unapply when it's not provided matches the existing behavior precisely

  • It has beautiful symmetry with Scala 3's Creator Applications. Creator Applications means a Foo() application without a def apply in the companion object falls back to a new operation on that type, where this would mean a case Foo() pattern match without a def unapply in the companion would fall back to an isInstanceOf operation on that type.

  • It would still provide a smooth way of customizing pattern matching: the default isInstanceOf is great, but if you want something different you can def unapply. This is analogous to Creator Applications, where the default new is great but you can def apply if you want something unusual

  • It would be backwards compatible, since any existing companion objects without a unapply method are currently disallowed. This is again analogous to Creator Applications.

Contributor Author

Allow unapply to return values of any type, not just case classes or named tuples, and then proceed unchanged with looking up the matched fields on the returned value

That's in a sense already true. unapply allows to return instances of any Product type. It then proceeds to select the fields of that product type by the _1, _2 selectors. We have a link between these selectors and actual field names only for case classes, since the _1 selectors are auto-generated for them. For other Product instances, these things can be user defined in arbitrary ways so we don't have that info.

Another, deeper reason why pattern matching does not work out of the box for normal classes is that in Scala the constructor elements of a normal class are retained only if prefixed by val. So there might be nothing to select. E.g.

class Point(x: Int, y: Int)

p match
  case Point(x, y) // does not work, `x` and `y` are not fields of the class.

We could consider relaxing that restriction, so that we also allow pattern matching on all classes that have only val parameters in their first section. But I believe that's out of the scope for this SIP since it has nothing to do with named matching.

Contributor

@lihaoyi lihaoyi Feb 23, 2024

The reason I think it's relevant to named matching is because the idea was to allow something like this

class Point(x: Int, y: Int){
  val z = x + y
}

p match{
  case Point(z = 0) => 
}

This doesn't work with positional pattern matching, because that relies on the _1 _2 _3 selectors, which do not exist for most non-case classes. Named pattern matching relaxes the reliance: we can just go directly to pattern matching on the member named .z directly.

Previously such a thing didn't make sense, but with named pattern matching it does. And we already fetch the .z fields directly when pattern matching case classes in Scala 2, so it's not without precedent

Given that this proposal includes named pattern matching for case classes, I think this is worth discussing. If we only allowed pattern matching on tuples, we could defer the discussion until later. But if we're introducing named pattern matching generically, we should make sure we do so in a way that doesn't close off potential future improvements later

Contributor

The main issue with enabling non-product (fields) pattern matching is the complete loss of exhaustivity indicators.

Contributor Author

@Haoyi I see. That use case makes sense. Today we would write

p match
  case p: Point if p.z == 0 =>

The named pattern notation gets more convenient if there are more named elements like this.

Still, it's a rather big change to the normal pattern matching rules, so I'd prefer we leave that for a possible future time.

Contributor

Sounds good. Happy to leave it for later, could even come in a future SIP.

My main concern for now is just to make sure we don't close off the possibility in future. If the current SIP's named pattern matching on case classes goes through _1 _2, and future named pattern matching goes through .x .y, that could be a bit awkward. Probably solvable with some fallback rules though e.g. "use .x if ._1 doesn't exist"

```scala
city match
  case c @ City(name = "London") => println(c.population)
  case City(name = n, zip = 1026, population = pop) => println(pop)
Contributor

Should we require a trailing * e.g. case City(name = "London", *) => to make it clear that we are only matching a subset of fields?

Is there some way to do named-pattern-matching against zero fields? This is different from case c: City because it's unapply based rather than isInstanceOf-based, and is more convenient than writing case City(_, _, _) => which is the status quo

Contributor

@soronpo soronpo Feb 22, 2024

Is there some way to do named-pattern-matching against zero fields? This is different from case c: City because it's unapply based rather than isInstanceOf-based, and is more convenient than writing case City(_, _, _) => which is the status quo

I also think that case City(*) is useful and would almost always opt for that kind of syntax rather than _ : City. This can be a different SIP, unless we require that indeed we need a trailing * if not all fields are specified.

Contributor Author

I am not fond of explicit * here. The situation for me is analogous to named function arguments. We don't use * there either to make clear that we have given all names and it's OK to use defaults for the rest. For pattern matching, the default is simply _, i.e. wildcard and no binding.

Also, I think almost all named pattern matches will be partial. That's the whole point. If I have a full match for e.g. Person, then I would likely write:

case Person(name, age)

and not

case Person(name = name, age = age)

So in that sense, the fact that it is a named pattern match also indicates that it is partial, and the * would just be annoying and redundant.

Contributor Author

Agreed. We should avoid reference to _1, _2 in the SIP. As far as I can see, that's the case currently.

```

Named constructor patterns are analogous to named tuple patterns. In both cases

- every name must match the name of some field of the selector,
- names can come in any order,
- not all fields of the selector need to be matched.

This revives SIP 43, with a much simpler desugaring than originally proposed.
Named patterns are compatible with extensible pattern matching simply because
`unapply` results can be named tuples.
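
The three rules listed above for named constructor patterns can be exercised with the `City` case class from the `FieldsOf` section. This is a sketch under the assumption of Scala 3.7 or later; the function `summarize` and the sample cities in the usage note are illustrative and not part of the proposal.

```scala
case class City(zip: Int, name: String, population: Int)

def summarize(city: City): String = city match
  // Only `name` is matched; the whole value is bound to `c`.
  case c @ City(name = "London") => s"London: ${c.population}"
  // Names appear in a different order than in the class definition.
  case City(population = pop, zip = 1026) => s"zip 1026: $pop"
  // Any other city: bind just the name.
  case City(name = n) => s"other: $n"
```

For example, `summarize(City(1026, "Lausanne", 140000))` is handled by the second case, since its `zip` field equals `1026` regardless of where `zip` is written in the pattern.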
Contributor

Can we make user defined class instances support named pattern matching without conversion to named tuples? I feel like "pattern matching without allocation" has been a direction Scala has been going into, with name-based pattern matching and Scala's new pattern matching desugaring, and forcing people to allocate a named tuple to support named pattern matching feels out of place

Contributor Author

Unapply could also return some case class, or an option of it. That would work as well for named matching. The only corner case where case classes don't work but tuples do is if there's only one name. See the discussion in SIP 43, which is quite involved.

Contributor

So for normal pattern matching, my understanding is that it's dependent on the ProductN trait. That means that it is not hardcoded to case classes and tuples, but even normal classes/traits can benefit from zero allocation pattern matching as well. Please correct me if I'm wrong

Would normal classes/traits have some trait they can implement to support named pattern matching in the same way? Or is this a known limitation that we are consciously deciding to live with?

Contributor Author

ProductN is no longer a thing in Scala 3. An unapply can return a Product trait. That's typically the argument itself, which means no allocations. If the argument is a case class (the most common case), we can do named pattern matching on it. If not, we don't have the names, so it's not possible.

Contributor

@lihaoyi lihaoyi Jan 14, 2024

I guess my question is: given positional pattern matching has a Product trait, that positional Tuples implement, should named pattern matching also have a NamedProduct trait that NamedTuples implement?

At least superficially, I can imagine

  1. NamedProduct[N <: Tuple] extends Product where N is a tuple of string literals
  2. NamedTuple[N <: Tuple, +V <: Tuple] extends NamedProduct[N]
  3. Define named pattern matching on NamedProduct, so that NamedTuples get it for free due to inheriting from NamedProduct, but user-land normal classes and traits can inherit from NamedProduct and get it as well

I'm not saying we should do this, just that it seems like the natural way of extending the existing Product/Tuple/positional-pattern-matching relationship to NamedTuple/named-pattern-matching. So even if the answer is "no this cannot work" or "maybe it can work but we can always do it later in a follow up" it deserves to be called out explicitly as a design decision

Contributor Author

It's an interesting suggestion. A product match looks for the presence of _1, _2, ...., selectors. If the scrutinee implements NamedProduct , we could then alternatively use the associated names in patterns and selectors instead. But the problem is we can't abstract over it. If we want to have a generic type that reflects names and types, we are back to NamedTuple.

Maybe someone could try that out, see whether it would work? I agree we can do it independently later.



### Restrictions

The following restrictions apply to named tuples and named pattern arguments:

1. Either all elements of a tuple or constructor pattern are named or none are named. It is illegal to mix named and unnamed elements in a tuple. For instance, the following is in error:
```scala
val illFormed1 = ("Bob", age = 33) // error
```
2. Each element name in a named tuple or constructor pattern must be unique. For instance, the following is in error:
```scala
val illFormed2 = (name = "", age = 0, name = true) // error
```
3. Named tuples and case classes can be matched with either named or regular patterns. But regular tuples and other selector types can only be matched with regular tuple patterns. For instance, the following is in error:
```scala
(tuple: Tuple) match
case (age = x) => // error
```

### Syntax Changes

The syntax of Scala is extended as follows to support named tuples and
named constructor arguments:
```
SimpleType ::= ...
| ‘(’ NameAndType {‘,’ NameAndType} ‘)’
NameAndType ::= id ':' Type

SimpleExpr ::= ...
| '(' NamedExprInParens {‘,’ NamedExprInParens} ')'
NamedExprInParens ::= id '=' ExprInParens

Patterns ::= Pattern {‘,’ Pattern}
| NamedPattern {‘,’ NamedPattern}
NamedPattern ::= id '=' Pattern
```

### Compatibility

Named tuple types and expressions are simply desugared to types and trees already known to Scala. The desugaring happens before the checking, so does not influence Tasty generation.

Pattern matching with named fields requires some small additions to Typer and the PatternMatcher phase. It does not change the Tasty format, though.

Backward source compatibility is partially preserved since additions to types and patterns come with new syntax that was not expressible before. When looking at tuple expressions, we have one instance of a source incompatibility:

```scala
var age: Int
(age = 1)
```
This was an assignment in parentheses before, and is a named tuple of arity one now. It is however not idiomatic Scala code, since assignments are not usually enclosed in parentheses. The problem could also be detected and diagnosed fairly straightforwardly: When faced with a unary named tuple, try to interpret it as an assignment, and if that succeeds, issue a migration error.

### Open questions

1. What is the precise set of types and operations we want to add to `NamedTuple`. This could also evolve after this SIP is completed.

2. Should there be an implicit conversion from named tuples to ordinary tuples?

## Alternatives
Contributor

@lihaoyi lihaoyi Jan 14, 2024

An earlier version of this discussion modelled named tuples as tuples whose members were `NamedValue` types, basically an "array of structs" type in contrast to the current proposal's "struct of arrays" approach.

What were the pros and cons that resulted in the current approach being chosen over that one? It seems to me that the "array of structs" approach better models the requirement that the names are all literal strings and that there are the same number of names and values?

Member

@bishabosha bishabosha Jan 14, 2024

There was the problem that each element of the tuple has to be unwrapped explicitly from its label. Sometimes the syntax would do it automatically (e.g. in a pattern extractor), but manual selection needs unwrapping: compare `tup(0).value` and `tup.x`.


### Structural Types

We also considered expanding structural types. Structural types allow abstracting over existing classes, but require reflection or some other library-provided mechanism for element access. By contrast, named tuples have a separate representation as tuples, which can be manipulated directly. Since elements are ordered, traversals can be defined, and this allows the definition of type-generic algorithms over named tuples. Structural types don’t allow such generic algorithms directly. We could define mappings between structural types and named tuples, which could be used to implement such algorithms. These mappings would certainly become simpler if they map to/from named tuples than if they had to map to/from user-defined "HMap"s.

In contrast to named tuples, structural types are unordered and have width subtyping. This comes at the price that no natural element ordering exists, and that one usually needs some kind of dictionary structure for access. We believe that the following advantages of named tuples over structural types outweigh the loss of subtyping flexibility:
Contributor

@lihaoyi lihaoyi Jan 14, 2024

Do structural types truly always need dictionaries? Can we have e.g. a synthetic trait for every structural member, such that

def x: {def foo: Int, def bar: String}

desugars to

package scala.structural
trait WithFoo[T]{ def foo: T}
trait WithBar[T]{ def bar: T}
def x: scala.structural.WithFoo[Int] with scala.structural.WithBar[String]

The `scala.structural` traits could/would be duplicated by various compilation runs, but they would all be the same implementation, so the duplicate classpath entries would be benign since they would all resolve to the same interfaces.

Contributor Author

That has been tried from time to time by us and others. I believe one such paper is: https://dblp.org/rec/conf/oopsla/GilM08.html

But nothing ever materialized. Generally, creating globally visible classes on the fly is a can of worms.

Also, we still need a traversal principle, which named tuples give us, but structural types don't.

Contributor

What's a traversal principle?

Contributor Author

A way to inductively visit all the members of a type at the type level, typically using a match type.


- Better integration since named tuples and normal tuples share the same representation.
- Better efficiency, since no dictionary is needed.
- Natural traversal order allows the formulation of generic algorithms such as projections and joins.
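To make the traversal argument concrete, here is a minimal sketch of a generic operation that lists the fields of a named tuple in order. It assumes a compiler with named tuples enabled, and it assumes that the library offers a `NamedTuple.Names` type and a `toTuple` method as in the prototype; since the exact set of library operations is still an open question, treat those names as illustrative. `Person`, `personFieldNames` and `fieldsOf` are hypothetical.

```scala
import scala.compiletime.constValueTuple

// Sketch only: assumes named tuples are enabled and that `NamedTuple.Names`
// and `toTuple` exist as in the prototype library.
type Person = (name: String, age: Int)

// Field names, recovered from the type alone and in declaration order.
val personFieldNames = constValueTuple[NamedTuple.Names[Person]].toList

// Pair each name with its value; no dictionary lookup is involved,
// because names and values are traversed in the same order.
def fieldsOf(p: Person) = personFieldNames.zip(p.toTuple.toList)

@main def traversalDemo(): Unit =
  val bob: Person = (name = "Bob", age = 33)
  assert(fieldsOf(bob) == List(("name", "Bob"), ("age", 33)))
```

A structural type offers no comparable starting point, because its members have no defined order to iterate over.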

### Conformance

A large part of the Pre-SIP discussion centered around subtyping rules: whether ordinary tuples should subtype named tuples (as in this proposal) or _vice versa_, or maybe no subtyping at all.

Looking at precedent in other languages, it feels like we do want some sort of subtyping for easy convertibility, and possibly an implicit conversion in the other direction.

The discussion established that both forms of subtyping are sound. My personal opinion is that the subtyping of this proposal is both more useful and safer than the one in the other direction. There is also the problem that changing the subtyping direction would be incompatible with the current structure of `Tuple` and `NamedTuple`: for instance, `zip` is already an inline method on `Tuple`, so it could not be overridden in `NamedTuple`. Making this work would require refactoring `Tuple` to use more extension methods, and it is unknown whether this is feasible and whether it can be made binary backwards compatible. I personally will not work on this; if others are willing to make the effort, we can discuss the alternative subtyping as well.
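The practical effect of the chosen direction can be sketched as follows. This is a minimal illustration, assuming a compiler with named tuples enabled and a `toTuple` method as in the prototype library; `Person`, `bob` and `plain` are hypothetical names.

```scala
// Sketch only: assumes named tuples are enabled and `toTuple` exists.
type Person = (name: String, age: Int)

// An unnamed tuple is accepted where the named tuple type is expected.
val bob: Person = ("Bob", 33)

// Going the other way is explicit.
val plain: (String, Int) = bob.toTuple

// val rejected: (String, Int) = bob   // expected not to compile under this proposal

@main def conformanceDemo(): Unit =
  assert(bob.name == "Bob" && bob.age == 33)
  assert(plain == ("Bob", 33))
```

Whether the commented-out line should be allowed through an implicit conversion is exactly the second open question listed above.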
Contributor

@lihaoyi lihaoyi Jan 14, 2024

Can you summarize the discussion here? There were four options (named <: unnamed, unnamed <: named, no subtyping, unnamed =:= named with `_n` names), and someone reading this SIP should be able to understand the pros and cons of each and why one selection was chosen, even if they may not agree with the choice.


### Spread Operator

An idea I was thinking of but did not include in this proposal highlights another potential problem with subtyping. Consider adding a _spread_ operator `*` for tuples and named tuples. If `x` is a tuple, then `f(x*)` is `f` applied to all fields of `x` expanded as individual arguments. Likewise, if `y` is a named tuple, then `f(y*)` is `f` applied to all elements of `y` as named arguments.
Now, if named tuples were subtypes of tuples, this would actually be ambiguous, since widening `y` in `y*` to a regular tuple would yield a different call. But with the subtyping direction we have, this would work fine.
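The ambiguity can be made concrete without the spread operator itself, which is not part of this proposal. The sketch below spells out by hand the two calls that a hypothetical `diff(y*)` could mean; it assumes a compiler with named tuples enabled and a `toTuple` method as in the prototype library, and `diff`, `y`, `asNamed` and `asPositional` are hypothetical names.

```scala
// Sketch only: `diff(y*)` is hypothetical syntax, so both readings are written out by hand.
def diff(a: Int, b: Int): Int = a - b

val y = (b = 1, a = 10)

// Reading 1: spread as named arguments, i.e. diff(b = 1, a = 10).
val asNamed = diff(b = y.b, a = y.a)

// Reading 2: widen to a regular tuple first, then spread positionally, i.e. diff(1, 10).
val (first, second) = y.toTuple
val asPositional = diff(first, second)

@main def spreadDemo(): Unit =
  assert(asNamed == 9)        // 10 - 1
  assert(asPositional == -9)  // 1 - 10
```

Because the two readings give different results, allowing a named tuple to widen silently to a regular tuple would make the meaning of `diff(y*)` depend on the static type at the call site.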

I believe tuple spread is a potentially useful addition that would fit in well with Scala. But it's not immediately relevant to this proposal, so is left out for now.


## Related work

This section should list prior work related to the proposal, notably:

- [Pre-SIP Discussion](https://contributors.scala-lang.org/t/pre-sip-named-tuples/6403)

- [SIP 43 on Pattern Matching with Named Fields](https://github.com/scala/improvement-proposals/pull/44)

- [Experimental Implementation](https://github.com/lampepfl/dotty/pull/19174)

## FAQ