Some thoughts about scoped conformance #1528
kyouko-taiga started this conversation in Language design
@EugeneFlesselle and I have been discussing scoped conformances and their implications for soundness a lot lately. I thought it would be good to report some of the observations we have made so far.
## Coherence
Systems supporting type classes sometimes also require coherence, which essentially prescribes that there can be at most one instance (aka model) of a given type class (aka trait/protocol) for a given type. For example, the following program is illegal in Swift:
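A hypothetical reconstruction of such a program (the original example is not preserved here); it declares two overlapping conformances of `Int` to a `Monoid` protocol:

```swift
protocol Monoid {
  static var identity: Self { get }
  static func combine(_ lhs: Self, _ rhs: Self) -> Self
}

// A first model of `Monoid` for `Int`, using addition.
extension Int: Monoid {
  static var identity: Int { 0 }
  static func combine(_ lhs: Int, _ rhs: Int) -> Int { lhs + rhs }
}

// error: redundant conformance of 'Int' to protocol 'Monoid'
extension Int: Monoid {
  static var identity: Int { 1 }
  static func combine(_ lhs: Int, _ rhs: Int) -> Int { lhs * rhs }
}
```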
The compiler will complain because this program defines multiple overlapping models of `Monoid` for `Int`. (Note: the actual compiler diagnostic talks about redundant conformance, but the underlying problem relates to coherence nonetheless.)

Coherence is convenient because it means that there can never be any ambiguity when one needs to select a model to satisfy the requirement of some generic function. The above program can be written in a system without coherence, but then using the conformance of `Int` to `Monoid`
would require user intervention, e.g., selecting the desired instance explicitly in Scala.

Swift upholds coherence by requiring that models be defined at the top level. That is not sufficient, though, because two different modules `A` and `B` may each define a model of `T` for `P` that are separately unique but will overlap in a module that imports both `A` and `B`. Rust addresses this problem with its infamous "orphan rules".

In Hylo, we said that models can be defined in any scope as long as they do not overlap in that scope. In other words, we made a compromise between the restricted world of Swift and the relaxed world of Scala. While we cannot declare two instances of the
`Monoid` concept at the top level, we can use different scopes. The reason why Hylo accepts such a program (possible bugs in the current implementation aside) is that the second model of `Int` for `Monoid` shadows the one declared at the top level. So while we can define multiple models, just like in Scala, we can still enjoy a form of local coherence to avoid ambiguities.

This idea looks simple on the surface, but things get a little thorny when we examine some edge cases.
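Concretely, the kind of program this scoping rule allows might look as follows (a hypothetical Hylo-like sketch, since the original example is not preserved):

```
trait Monoid { ... }

conformance Int: Monoid { ... }    // top-level model, e.g. addition

fun product_reduction() {
  // A second model, e.g. multiplication. Accepted: within this scope
  // it shadows the top-level model, so lookup remains unambiguous.
  conformance Int: Monoid { ... }
}
```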
## Hidden ambiguities
There exists a dangerous combination of features that we can use to create tricky programs in Hylo, involving equality constraints and quantification. For example, consider the following program:
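A hypothetical reconstruction (Hylo-like syntax; the original example is not preserved):

```
trait P { fun f() }

fun outer<T, U>(_ a: T) {
  conformance T: P { fun f() { ... } }  // first local model
  conformance U: P { fun f() { ... } }  // second local model; fine while T and U are unrelated
  fun inner(_ a: T) where T == U {
    a.f()  // once T == U, both models are viable candidates
  }
}
```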
It should be fine to define the two models in the body of `outer`, since the type system has no reason to believe that `T` and `U` are equivalent. But then `inner` constrains `T` to be equal to `U`, and the two models become viable candidates for the call to `a.f`. Therefore we should actually reject this program. An interesting question is what exactly the cause of the error is.

Note that this problem relates to Rust's overlap rule. Rust won't let us write the following program:
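A hypothetical reconstruction of the kind of program Rust rejects: the two blanket implementations overlap when `T` and `U` are both instantiated as `i64`, and Rust reports the conflict at the `impl`s themselves:

```rust
trait P {
    fn f(&self);
}

struct S<T, U>(T, U);

impl<T> P for S<T, i64> {
    fn f(&self) {}
}

// error[E0119]: conflicting implementations of trait `P` for type `S<i64, i64>`
impl<U> P for S<i64, U> {
    fn f(&self) {}
}

// Rejected even if the concrete instance of the ambiguity is commented out:
// fn main() {
//     S(1i64, 2i64).f();
// }
```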
In both Hylo's and Rust's cases, the two models may overlap if we later end up in a situation where `T` and `U` can be unified. In Hylo, we ended up in such a situation by requiring `T` and `U` to be equal. In Rust, we created a situation where both `T` and `U` could be instantiated as `i64`.

Rust claims that the error is due to the model declaration and will refuse to compile the program even if we comment out the last two lines, that is, even if we don't provide a concrete instance of the ambiguity. Using the same approach in Hylo would mean rejecting
`outer` even if we comment out its last two lines. There are other solutions, though:

1. Require user-defined inequality constraints between `T` and `U` in the signature of `outer`.
2. Reject `inner`, i.e., the declaration that introduces the constraint `T == U`.
3. Reject the call to `a.f`, i.e., where the ambiguity actually causes a problem.

I am opposed to requiring user-defined inequality constraints, but the two other solutions sound acceptable to me. @EugeneFlesselle has pointed out that going with the 3rd option gives us a consistent way to deal with another ambiguous situation:
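A hypothetical reconstruction of that situation (Hylo-like syntax):

```
trait P { fun f() }

fun outer<T, U>(_ t: T) {
  conformance T: P { fun f() { ... } }  // selected for `t.f` when `outer` is checked
  conformance U: P { fun f() { ... } }
  t.f()
}
```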
If we specialize `outer` with `T = Int` and `U = Int`, then the two models are again equally viable for the call to `t.f`. However, it may be reasonable to accept this program nonetheless, because under separate checking we'll have already compiled `outer` so that it unconditionally selects the first model. This approach suggests that a less conservative strategy may be reasonable.

## Conformance escape
Consider the following program:
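A hypothetical sketch (Hylo-like syntax; the original example is not preserved) of a local model escaping through an opaque return type:

```
trait P { fun f() }

fun make() -> some P {
  conformance Int: P { fun f() { ... } }  // model local to `make`
  return 42  // the local model must escape for callers to use the conformance
}
```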
This program should be illegal because it must let the local model escape so that callers of `make` can get a value that conforms to `P`. Note that the problem would not occur if the return type of `make` was `any P`: in that case, the model would have been properly wrapped into an existential at runtime and all would be fine.

There are more intricate cases of conformance escape:
In its current form, this program should probably be illegal because the return value of `outer` is implicitly capturing a local model. Capturing a local model is not necessarily harmful: here, we can still separately compile the lambda that is returned using that model and keep that detail hidden from callers of `outer`. However, the situation will get more complicated if we allow local models to capture local values: if the model captures a local `x`, then we cannot let the lambda escape, as that would violate `x`'s lifetime.

There are two ways we can address this problem: … `P` for `T`.

## Semantic relevance of substitution
This problem is not a consequence of scoped conformances, but it somehow relates to coherence nonetheless. Consider the following program:
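A hypothetical reconstruction (Hylo-like syntax; the original example is not preserved):

```
fun truc<C: Collection>(_ x: C.Element) { ... }

fun machin<T: Collection>(_ e: T.Element) {
  truc(e)  // C = T is the obvious deduction
}

fun bidule<T: Collection, U: Collection>(_ e: T.Element) where T.Element == U.Element {
  truc(e)  // C = T and C = U are both viable, and the choice may matter
}
```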
It may be reasonable to expect that inference be able to figure out the substitution of `C` in the call to `truc` in `machin`'s body. That's because the type of `e` is `T.Element`, and so there's a relatively obvious way to deduce that `C = T`. However, the same isn't necessarily true in `bidule`.

If `T.Element = U.Element`, then inference should in principle be free to substitute one for the other anywhere. In fact, that's necessary to call `truc`, because we need to substitute `C` by either `T.Element` or `U.Element`. Either way, the type checker will then have to confirm that we can pass a `T.Element` to a parameter of type `U.Element`, or vice versa.

The problem is that even if `T.Element` and `U.Element` are equivalent, nothing says that `T` and `U` are too, and so they could have different conformances to `Collection`. Consequently, whether inference picks `T` or `U` may have an impact on the semantics of `truc`, because that function may be using properties of the conformance model it obtained.