Skip to content

Commit

Permalink
More content on indexed codata
Browse files Browse the repository at this point in the history
- Add some items to the bibliography
- Add an example of FSM API protocol
  • Loading branch information
noelwelsh committed Aug 8, 2024
1 parent 54175e0 commit e009829
Show file tree
Hide file tree
Showing 4 changed files with 235 additions and 9 deletions.
53 changes: 53 additions & 0 deletions src/bib/scala-with-cats.bib
Original file line number Diff line number Diff line change
Expand Up @@ -392,3 +392,56 @@ @inproceedings{10.1145/1708016.1708024
location = {Madrid, Spain},
series = {TLDI '10}
}

@article{Roth_2023,
title={Fluent APIs in Functional Languages},
volume={7},
ISSN={2475-1421},
url={http://dx.doi.org/10.1145/3586057},
DOI={10.1145/3586057},
number={OOPSLA1},
journal={Proceedings of the ACM on Programming Languages},
publisher={Association for Computing Machinery (ACM)},
author={Roth, Ori and Gil, Yossi},
year={2023},
month=apr, pages={876–901} }

@inproceedings{10.1145/2951913.2951929,
author = {Thibodeau, David and Cave, Andrew and Pientka, Brigitte},
title = {Indexed codata types},
year = {2016},
isbn = {9781450342193},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/2951913.2951929},
doi = {10.1145/2951913.2951929},
abstract = {Indexed data types allow us to specify and verify many interesting invariants about finite data in a general purpose programming language. In this paper we investigate the dual idea: indexed codata types, which allow us to describe data-dependencies about infinite data structures. Unlike finite data which is defined by constructors, we define infinite data by observations. Dual to pattern matching on indexed data which may refine the type indices, we define copattern matching on indexed codata where type indices guard observations we can make. Our key technical contributions are three-fold: first, we extend Levy's call-by-push value language with support for indexed (co)data and deep (co)pattern matching; second, we provide a clean foundation for dependent (co)pattern matching using equality constraints; third, we describe a small-step semantics using a continuation-based abstract machine, define coverage for indexed (co)patterns, and prove type safety. This is an important step towards building a foundation where (co)data type definitions and dependent types can coexist.},
booktitle = {Proceedings of the 21st ACM SIGPLAN International Conference on Functional Programming},
pages = {351–363},
numpages = {13},
keywords = {Logical frameworks, Functional programming, Dependent types, Coinduction},
location = {Nara, Japan},
series = {ICFP 2016}
}



@article{10.1145/3022670.2951929,
author = {Thibodeau, David and Cave, Andrew and Pientka, Brigitte},
title = {Indexed codata types},
year = {2016},
issue_date = {September 2016},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {51},
number = {9},
issn = {0362-1340},
url = {https://doi.org/10.1145/3022670.2951929},
doi = {10.1145/3022670.2951929},
abstract = {Indexed data types allow us to specify and verify many interesting invariants about finite data in a general purpose programming language. In this paper we investigate the dual idea: indexed codata types, which allow us to describe data-dependencies about infinite data structures. Unlike finite data which is defined by constructors, we define infinite data by observations. Dual to pattern matching on indexed data which may refine the type indices, we define copattern matching on indexed codata where type indices guard observations we can make. Our key technical contributions are three-fold: first, we extend Levy's call-by-push value language with support for indexed (co)data and deep (co)pattern matching; second, we provide a clean foundation for dependent (co)pattern matching using equality constraints; third, we describe a small-step semantics using a continuation-based abstract machine, define coverage for indexed (co)patterns, and prove type safety. This is an important step towards building a foundation where (co)data type definitions and dependent types can coexist.},
journal = {SIGPLAN Not.},
month = {sep},
pages = {351–363},
numpages = {13},
keywords = {Logical frameworks, Functional programming, Dependent types, Coinduction}
}
174 changes: 171 additions & 3 deletions src/pages/indexed-types/codata.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,7 @@
## Indexed Codata

The basic idea of indexed codata is to prevent methods being called unless certain conditions, encoded in types, are met. More precisely, methods are guarded by type equalities that callers must prove they satisfy to call a method. The contextual abstraction features, `given` instances and `using` clauses, are used to implement this in Scala.

We'll start our exploration of indexed codata with a very simple example.
We are going to define a switch that can only be turned on when it is off, and off when it is on.
Since this is codata, we start with an interface.
Expand Down Expand Up @@ -74,13 +76,179 @@ Incorrect uses fail to compile.
SimpleSwitch.on.on
```

The constraint is made of two parts: using clauses, which we learned about in [@sec:type-classes], and the [`A =:= B`][scala.=:=] construction, which is new. `=:=` represents a type equality. If a given instance `A =:= B` exists, then the type `A` is equal to the type `B`. (Note we can write this with the more familiar prefix notation `=:=[A, B]` if we prefer.) We never create these instances; instead the compiler creates them for us. In the `on` method, we are asking the compiler to construct an instance `A =:= Off`, which can only be done if `A` is `Off`. This in turn means we can only call the method when the `Switch` is `Off`. This is the core idea of indexed codata: we reflect states as types, and restrict method calls to a subset of states.
The constraint is made of two parts: using clauses, which we learned about in [@sec:type-classes], and the [`A =:= B`][scala.=:=] construction, which is new. `=:=` represents a type equality. If a given instance `A =:= B` exists, then the type `A` is equal to the type `B`. (Note we can write this with the more familiar prefix notation `=:=[A, B]` if we prefer.) We never create these instances ourselves. Instead the compiler creates them for us. In the `on` method, we are asking the compiler to construct an instance `A =:= Off`, which can only be done if `A` *is* `Off`. This in turn means we can only call the method when the `Switch` is `Off`. This is the core idea of indexed codata: we reflect states as types, and restrict method calls to a subset of states.


#### Exercise: Torque {-}

In Section [@sec:indexed-types:phantom] we saw how we could use phantom types to represent units.
We also ran into a limitation: we had no way to inspect the phantom types and hence make decisions based on them.
Now, with indexed codata, we can do that.

Below if the definition of `Length` we previously used. Your mission is to:

1. implement a type `Force`, parameterized by a phantom type that represents the units of force;
2. implement a type `Torque`, parameterized by a phantom type that represents the units of torque;
3. define types `Newtons` and `NewtonMetres` to represent force in SI units;
4. implement a method `*` on `Force` that accepts a `Length` and returns a `Torque`. It can only be called if the `Force` is in `Newtons` and the `Length` is in `Metres`. In this case the `Torque` is in `NewtonMetres`. (Torque is force times length.)

```scala mdoc:silent
final case class Length[Unit](value: Double) {
def +(that: Length[Unit]): Length[Unit] =
Length[Unit](this.value + that.value)
}
```

<div class="solution">
Defining `Force`, `Torque`, and the unit types is just repeating the pattern we saw in the example code.

```scala mdoc:silent
trait Newtons
trait NewtonMetres

final case class Force[Unit](value: Double)
final case class Torque[Unit](value: Double)
```

To define the `*` method on `Force` we need constraints that `Forces` `Unit` type is `Newtons`, and `Lengths` `Unit` type is `Metres`. These are both type equalities, so we can express them with `=:=`.

```scala mdoc:reset:invisible
trait Metres
trait Feet
trait Newtons
trait NewtonMetres

final case class Length[Unit](value: Double) {
def +(that: Length[Unit]): Length[Unit] =
Length[Unit](this.value + that.value)
}
final case class Torque[Unit](value: Double)
```

```scala mdoc:silent
final case class Force[Unit](value: Double) {
def *[L](length: Length[L])(using Unit =:= Newtons, L =:= Metres): Torque[NewtonMetres] =
Torque(this.value * length.value)
}
```
</div>

### API Protocols

An API protocol defines the order in which methods on codata must be called. In the case of `Switch`, the protocol is we can only call `off` after calling `on` and vice versa. This protocol is illustrated in Figure [@img:indexed-types:switch]. Many common types have similar protocols. For example, files can only be read once they are opened and cannot be read once they have been closed.
An API protocol defines the order in which methods must be called. The protocol in the case of `Switch` is that we can only call `off` after calling `on` and vice versa. This protocol is a simple finite state machine, and illustrated in Figure [@img:indexed-types:switch]. Many common types have similar protocols. For example, files can only be read once they are opened and cannot be read once they have been closed.

![The switch API protocol](src/pages/indexed-types/switch.pdf) {#img:indexed-types:switch}

Indexed codata allows us to enforce API protocols at compile-time.
Indexed codata allows us to enforce API protocols at compile-time. Often these protocols are finite-state machines. We can represent these protocols with a single type parameter that represents the state, as we did with `Switch`. We can also use multiple type parameters if that makes for a more convenient representation.

Let's see an example using multiple type parameters. We're going to build an API that represents a very limited subset of [HTML][html], the language the defines web pages. An example of HTML is below.

```html
<!DOCTYPE html>
<html>
<head><title>Our Amazing Web Page</title></head>
<body>
<h1>This Is Our Amazing Web Page</h1>
<p>Please be in awe of its <strong>amazingness</strong></p>
</body>
</html>
```

In HTML the content of the page is marked up with tags, like `<h1>`, that give it meaning.
For example, `<h1>` means a heading at level one, and `<p>` means a paragraph.
An opening tag is closed by a corresponding closing tag such as `</h1>` for `<h1>` and `</p>` for `<p>`.

There are several rules for valid HTML[^valid-html]. We're going to focus on the following:

1. Within the `html` tag there can only be a `head` and a `body` tag, in that order.
2. Within the `head` tag there must be exactly one `title`, and there can be any other number of allowed tags.
3. Within the `body` there can be any number of allowed tags.

We're going to use a Church-encoded representation for HTML.
As the code is fairly repetitive we will just present all the code and then discuss the important parts.
Here's the implementation.

```scala mdoc:silent
sealed trait StructureState
trait Empty extends StructureState
trait InHead extends StructureState
trait InBody extends StructureState

sealed trait TitleState
trait WithoutTitle extends TitleState
trait WithTitle extends TitleState

final class Html[S <: StructureState, T <: TitleState](
head: List[String],
body: List[String]
) {
// Head tags ------------------------------------------------------------------

def head(using S =:= Empty): Html[InHead, WithoutTitle] =
Html(head, body)

def title(
text: String
)(using S =:= InHead, T =:= WithoutTitle): Html[InHead, WithTitle] =
Html(head :+ s"<title>$text</title>", this.body)

def link(rel: String, href: String)(using S =:= InHead): Html[InHead, T] =
Html(head :+ s"<link rel=\"$rel\" href=\"$href\"/>", body)

// Body tags ------------------------------------------------------------------

def body(using S =:= InHead, T =:= WithTitle): Html[InBody, WithTitle] =
Html(head, body)

def h1(text: String)(using S =:= InBody): Html[InBody, T] =
Html(head, body :+ s"<h1>$text</h1>")

def p(text: String)(using S =:= InBody): Html[InBody, T] =
Html(head, body :+ s"<p>$text</p>")

// Interpreter ----------------------------------------------------------------

override def toString(): String = {
val h = head.mkString(" <head>\n ", "\n ", "\n </head>")
val b = body.mkString(" <body>\n ", "\n ", "\n </body>")

s"\n<html>\n$h\n$b\n</head>"
}
}
object Html {
val empty: Html[Empty, WithoutTitle] = Html(List.empty, List.empty)
}
```

Here's an example in use.

```scala mdoc
Html.empty.head
.title("Our Amazing Webpage")
.body
.h1("Where Amazing Exists")
.p("Right here")
.toString
```

Here's an example of the type system preventing an invalid construction.

```scala mdoc:fail
Html.empty.head
.link("stylesheet", "styles.css")
.body
.h1("This Shouldn't Work")
```



- We start defining elements that go in the `head` of an HTML document. We'll only allow `title` and `link` elements, for simplicity.
- There must be exactly one `title` element, but there can be zero or more `link` elements.
- Once we start defining `body` elements, we cannot


We can also implement more complex protcols, such as those that can be represented by context-free grammars, with more work. Let's see an example.

[html]: https://html.spec.whatwg.org/multipage/

[^valid-html]: The HTML specification allows for very lenient parsing of HTML. For example, if we don't define the `head` tag it will usually be inferred. However we aren't going to allow that kind of leniency in our API.
5 changes: 5 additions & 0 deletions src/pages/indexed-types/conclusions.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,3 +5,8 @@ The earliest reference I've found to phantom types is [@10.1145/331960.331977].

The majority of research on generalized algebraic data types focuses on type checking and inference algorithms, which is not so relevant to the working programmer.
*Pointwise Generalized Algebraic Data Types* [@10.1145/1708016.1708024] is not different in this respect, but it does have a particularly clear breakdown of how GADTs are used in the most common case.


Indexed codata is described in [@10.1145/3022670.2951929].

Fluent APIs. [@Roth_2023]
12 changes: 6 additions & 6 deletions src/pages/indexed-types/phantom-type.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
## Phantom Types
## Phantom Types {#sec:indexed-types:phantom}

Phantom types are a basic building block of indexed types, so we'll start with an example of them. A phantom type is simply a type parameter that doesn't correspond to any value. In the example below, the type parameter `A` is a phantom type, because there is no value of type `A`, while `B` is not because there is a value of that type.

Expand Down Expand Up @@ -29,11 +29,11 @@ final case class Length[Unit](value: Double) {
We'll need to define a few unit types to use this, and some `Lengths` using these units.

```scala mdoc:silent
trait Metre
trait Foot
trait Metres
trait Feet

val threeMetres = Length[Metre](3)
val threeFeetAndRising = Length[Foot](3)
val threeMetres = Length[Metres](3)
val threeFeetAndRising = Length[Feet](3)
```

Now we can add `Lengths` together if they have the same unit.
Expand Down Expand Up @@ -78,7 +78,7 @@ In HTML the content of the page is marked up with tags, like `<h1>`, that give i
For example, `<h1>` means a heading at level one, and `<p>` means a paragraph.
An opening tag is closed by a corresponding closing tag such as `</h1>` for `<h1>` and `</p>` for `<p>`.

There are rules that control where tags are allowed. The complete set of tags, and their associated rules, is very complex. We'll use the following much simplified rules:
There are rules that control where tags are allowed. The complete set of tags, and their associated rules, is very complex. We'll use the following, much simplified, rules:

- the `body` tag can only contain block level tags;
- block level tags are `h1` and `p` and can only contain inline tags; and
Expand Down

0 comments on commit e009829

Please sign in to comment.