Skip to content
This repository has been archived by the owner on Oct 21, 2024. It is now read-only.

Latest commit

 

History

History
1615 lines (1264 loc) · 41.3 KB

basics.org

File metadata and controls

1615 lines (1264 loc) · 41.3 KB

Lean Tutorial

Basics: Functional Programming in Lean

Disclaimer: this document is heavily based on the Software Foundations book by Benjamin C. Pierce, Chris Casinghino, Marco Gaboardi, Michael Greenberg, Cătălin Hriţcu, Vilhelm Sjöberg, Brent Yorgey. Their book is based on the Coq system. We adapted it to Lean, and added Lean specific information. Of course, all mistakes introduced are our fault.

Introduction

The functional programming style brings programming closer to mathematics: If a procedure or method has no side effects, then pretty much all you need to understand about it is how it maps inputs to outputs — that is, you can think of its behavior as just computing a mathematical function. This is one reason for the word “functional” in “functional programming.” This direct connection between programs and simple mathematical objects supports both sound informal reasoning and formal proofs of correctness. The other sense in which functional programming is “functional” is that it emphasizes the use of functions (or methods) as first-class values — i.e., values that can be passed as arguments to other functions, returned as results, stored in data structures, etc. The recognition that functions can be treated as data in this way enables a host of useful idioms, as we will see. Other common features of functional languages include algebraic data types and pattern matching, which make it easy to construct and manipulate rich data structures, and sophisticated polymorphic type systems that support abstraction and code reuse. Lean shares all of these features.

Enumerated Types

One unusual aspect of Lean is that its set of built-in features is extremely small. For example, instead of providing the usual palette of atomic data types (booleans, integers, strings, etc.), Lean offers an extremely powerful mechanism for defining new data types from scratch — so powerful that all these familiar types arise as instances. Naturally, the Lean distribution comes with a standard library providing definitions of booleans, numbers, and many common data structures like lists. But there is nothing magic or primitive about these library definitions: they are ordinary user code. To see how this works, let’s start with a very simple example.

Days of the Week

The following declaration tells Lean that we are defining a new set of data values: a type.

inductive day :=
monday, tuesday, wednesday, thursday, friday, saturday, sunday

The type is called day, and its members are monday, tuesday, etc. The second line of the definition can be read “monday is a day, tuesday is a day, etc.” We say monday, tuesday, … are constructors for the inductive type day. Having defined day, we can write functions that operate on day’s.

import logic

inductive day :=
monday, tuesday, wednesday, thursday, friday, saturday, sunday

-- BEGIN
definition next_weekday (d : day) : day :=
day.cases_on d
  day.tuesday
  day.wednesday
  day.thursday
  day.friday
  day.monday
  day.monday
  day.monday
-- END

One thing to note is that the argument and return types of this function are explicitly declared. Like most functional programming languages, Lean can often work out these types even if they are not given explicitly – i.e., it performs some type inference – but we include them to make reading easier. Moreover, this function was defined using the recursor day.cases_on automatically generated by Lean whenever we define a new inductive datatype. This recursor allows us to perform case-analysis on d. The day.cases_on above can be read as:

import logic

inductive day :=
monday, tuesday, wednesday, thursday, friday, saturday, sunday

-- BEGIN
definition next_weekday (d : day) : day :=
day.cases_on d
  day.tuesday    -- if d is day.monday,    then the result is day.tuesday
  day.wednesday  -- if d is day.tuesday,   then the result is day.wednesday
  day.thursday   -- if d is day.wednesday, then the result is day.thursday
  day.friday     -- if d is day.thursday,  then the result is day.friday
  day.monday     -- if d is day.friday,    then the result is day.monday
  day.monday     -- if d is day.saturday,  then the result is day.monday
  day.monday     -- if d is day.sunday,    then the result is day.monday
-- END

We say d is the major premise for the recursor day.cases_on, and day.tuesday, …, day.monday are the minor premises. We have a minor premise for each constructor.

Lean declarations are organized into “namespaces”. We can avoid most occurrences of prefix day by opening the namespace day using the command open day. Here is an alternative (and more compact) definition.

import logic

inductive day :=
monday, tuesday, wednesday, thursday, friday, saturday, sunday

-- BEGIN
open day

definition next_weekday d :=
day.cases_on d tuesday wednesday thursday friday monday monday monday
-- END

In the “compressed” definition above, we also omitted the argument and return types, and let Lean figure out them.

Having defined a function, we should check that it works on some examples. There are different ways to do this in Lean. First, we can use the command eval to evaluate a compound expression involving next_weekday.

import logic

inductive day :=
monday, tuesday, wednesday, thursday, friday, saturday, sunday

open day

definition next_weekday d :=
day.cases_on d tuesday wednesday thursday friday monday monday monday

-- BEGIN
eval next_weekday friday
--  monday
eval next_weekday (next_weekday saturday)
-- tuesday
-- END

Second, we can record what we expect the result to be in the form of a Lean example:

import logic

inductive day :=
monday, tuesday, wednesday, thursday, friday, saturday, sunday

open day

definition next_weekday d :=
day.cases_on d tuesday wednesday thursday friday monday monday monday

-- BEGIN
example : next_weekday (next_weekday saturday) = tuesday :=
rfl
-- END

This declaration does two things: it makes an assertion (that the second weekday after saturday is tuesday). Having made the assertion, we show it holds by reflexivity rfl. We can justify this step by reflexivity because the left/right hand sides of the equation are identical, after Lean evaluates next_weekday. The symbols = and rfl are defined in the logic module, and is imported using the command import logic.

Booleans

In a similar way, we can define the type boolean of booleans, with members true and false.

inductive boolean :=
true, false

Although we are rolling our own booleans here for the sake of building up everything from scratch, Lean does, of course, provide a default implementation of the booleans in its standard library, together with a multitude of useful functions and lemmas.

Functions over booleans can be defined in the same way as above. We define them in the namespace boolean.

import logic

inductive boolean :=
true, false
-- BEGIN
namespace boolean

definition neg (b : boolean) : boolean :=
cases_on b false true

definition and (b1 b2 : boolean) : boolean :=
cases_on b1 b2 false

definition or (b1 b2 : boolean) : boolean :=
cases_on b1 true b2

end boolean
-- END

In the example above, we could write cases_on instead of boolean.cases_on because we were inside the namespace boolean.

The following four “unit tests” constitute a complete specification – a truth table – for the boolean.or function:

import logic

inductive boolean :=
true, false

namespace boolean

definition neg (b : boolean) : boolean :=
cases_on b false true

definition and (b1 b2 : boolean) : boolean :=
cases_on b1 b2 false

definition or (b1 b2 : boolean) : boolean :=
cases_on b1 true b2

-- BEGIN
example : or true false = true :=
rfl

example : or false false = false :=
rfl

example : or false true = true :=
rfl

example : or true true = true :=
rfl
-- END
end boolean

The expression sorry can be used to fill a hole in an incomplete definition or proof. We’ll use them in the following exercises. In general, your job in the exercises is to replace sorry with real definitions or proofs.

Exercise: nand

Complete the definition of the following function, then make sure that the example assertions below can each be verified by Lean. This function should return true if either or both of its inputs are false.

import logic

inductive boolean :=
true, false

namespace boolean
-- BEGIN
definition nand (b1 b2 : boolean) : boolean :=
/- FILL IN HERE -/ sorry
-- END
end boolean

Remove sorry and fill in each proof with rfl.

import logic

inductive boolean :=
true, false

namespace boolean
definition nand (b1 b2 : boolean) : boolean :=
cases_on b1 (cases_on b2 false true) true

-- BEGIN
example : nand true false = true  :=
/- FILL IN HERE -/ sorry
example : nand false false = true :=
/- FILL IN HERE -/ sorry
example : nand false true = true  :=
/- FILL IN HERE -/ sorry
example : nand true true = false  :=
/- FILL IN HERE -/ sorry
-- END
end boolean

Exercise: and3

Do the same for the and3 function below. This function should return true when all of its inputs are true, and false otherwise.

import logic

inductive boolean :=
true, false

namespace boolean
-- BEGIN
definition and3 (b1 b2 b3 : boolean) : boolean :=
/- FILL IN HERE -/ sorry

example : and3 true true true = true   :=
/- FILL IN HERE -/ sorry
example : and3 false true true = false :=
/- FILL IN HERE -/ sorry
example : and3 true false true = false :=
/- FILL IN HERE -/ sorry
example : and3 true true false = false :=
/- FILL IN HERE -/ sorry
-- END
end boolean

Function Types

The check command causes Lean to print the type of an expression. For example, the type of and true false is boolean.

import logic

inductive boolean :=
true, false

namespace boolean

definition neg (b : boolean) : boolean :=
cases_on b false true

definition and (b1 b2 : boolean) : boolean :=
cases_on b1 b2 false

definition or (b1 b2 : boolean) : boolean :=
cases_on b1 true b2

end boolean

open boolean
-- BEGIN
check true
-- true : boolean
check neg true
-- neg true : boolean
-- END

Functions like neg itself are also data values, just like true and false. Their types are called function types, and they are written with arrows.

import logic

inductive boolean :=
true, false

namespace boolean

definition neg (b : boolean) : boolean :=
cases_on b false true

definition and (b1 b2 : boolean) : boolean :=
cases_on b1 b2 false

definition or (b1 b2 : boolean) : boolean :=
cases_on b1 true b2

end boolean

open boolean
-- BEGIN
check neg
-- negb : boolean → boolean
-- END

The type of neg, written boolean → boolean and pronounced “boolean arrow boolean,” can be read, “Given an input of type boolean, this function produces an output of type boolean.” Similarly, the type of and, written boolean → boolean → boolean, can be read, “Given two inputs, both of type boolean, this function produces an output of type boolean.”

Remark: in the Lean web interface and Emacs mode, we can input the unicode character by typing \r. We can also use -> instead of . In the web interface, the replacement only occurs after we press space after typing \r.

import logic

inductive boolean :=
true, false
-- BEGIN
example : (boolean -> boolean) = (boolean → boolean) :=
rfl
-- END

Not every function must have a name. The keyword fun introduces an anonymous function. (fun x : A, e) is the function which takes an argument x of type A and returns the result e. For example, the function neg above could be also written as

import logic

inductive boolean :=
true, false

-- BEGIN
namespace boolean

check fun b : boolean, boolean.cases_on b false true

end boolean
-- END

We say (fun x : A, e) is a “lambda abstraction”. We can also use the unicode character λ = instead of =fun. We can input this character by typing \fun. In many cases, the type A can be inferred automatically by Lean, and be omitted.

import logic

inductive boolean :=
true, false
-- BEGIN
namespace boolean

check λ b, cases_on b false true

end boolean
-- END

Functions with multiple arguments are very common. We can write (fun x_1 : A_1, fun x_2 : A_2, ..., e) as (fun (x_1 : A_1) (x_2 : A_2) ..., e). Moreover, if x_1 and x_2 have the same type, we can write (fun (x_1 x_2 : A) ..., e). For example, the function and above could be also written as one of the following forms:

import logic

inductive boolean :=
true, false
-- BEGIN
namespace boolean

check fun b1 : boolean, fun b2 : boolean, cases_on b1 b2 false
check fun (b1 b2 : boolean), cases_on b1 b2 false
check λ (b1 b2 : boolean), cases_on b1 b2 false
check λ b1 b2, cases_on b1 b2 false

end boolean
-- END

Numbers

Technical digression: Lean provides a fairly sophisticated “module” system, to aid in organizing large developments. If we enclose a collection of declarations between namespace X and end X markers, then, in the remainder of the file after the end, these definitions will be referred to by names like X.foo instead of just foo. Here, we use this feature to introduce the definition of the type nat in an inner namespace so that it does not shadow the one from the standard library.

namespace playground

end playground

The types we have defined so far are examples of “enumerated types”: their definitions explicitly enumerate a finite set of elements. A more interesting way of defining a type is to give a collection of “inductive rules” describing its elements. For example, we can define the natural numbers as follows:

namespace playground
-- BEGIN
inductive nat :=
O : nat,
S : nat → nat
-- END
end playground

The clauses of this definition can be read: O is a natural number (note that this is the letter “O,” not the numeral “0”). S is a “constructor” that takes a natural number and yields another one — that is, if n is a natural number, then S n is too. Let’s look at this in a little more detail. Every inductively defined set (day, nat, boolean, etc.) is actually a set of expressions. The definition of nat says how expressions in the set nat can be constructed: the expression O belongs to the set nat; if n is an expression belonging to the set nat, then S n is also an expression belonging to the set nat; and expressions formed in these two ways are the only ones belonging to the set nat. The same rules apply for our definitions of day and bool. The annotations we used for their constructors are analogous to the one for the O constructor, and indicate that each of those constructors doesn’t take any arguments. These three conditions are the precise force of the Inductive declaration. They imply that the expression O, the expression S O, the expression S (S O), the expression S (S (S O)), and so on all belong to the set nat, while other expressions like true, and true false, and S (S false) do not. Each inductive declaration has an associated recursor that allow us to define things by recursion on the structure of the inductive type elements. For example, the predecessor function:

import logic

namespace playground
inductive nat :=
O : nat,
S : nat → nat
-- BEGIN
namespace nat
definition pred (n : nat) : nat :=
cases_on n
  O                     -- case n is O
  (fun (n₁ : nat), n₁)  -- case n is S n₁

eval pred (S (S O))

example : pred (S (S O)) = S O :=
rfl

end nat
-- END
end playground

The second branch can be read: “if n has the form S n₁ for some n₁, then return n₁.”

Remark: numeric subscripts can be conveniently inputed by typing \1, \2, .... This feature is available in the Lean web interface and Emacs mode.

Now, we define the function minustwo using two nested cases_on.

import logic

namespace playground
inductive nat :=
O : nat,
S : nat → nat
namespace nat

-- BEGIN
definition minustwo (n : nat) : nat :=
cases_on n
  O                     -- n is O
  (fun n₁, cases_on n₁  -- n is S n₁
    O                   -- n₁ is O
    (fun n₂, n₂))       -- n₁ is S n₂

eval minustwo (S (S (S O)))

example : minustwo (S (S (S O))) = (S O) :=
rfl
-- END
end nat
end playground

Lean provides support for parsing and printing numeric types as ordinary arabic numerals. The “trick” is based on a type called num that has builtin support for parsing and printing using arabic numerals. If we want similar support for other numeric types we must define a coercion to num. The nat type in standard library provides that.

import data.nat
open nat

check succ 2
eval succ 2

In the example above, succ has type nat → nat, and 2 is a num. To make the expression type check, Lean automatically injects a coercion from num to nat. By default, Lean does not display coercions. We can change that by setting the option pp.coercions. Note that, in the following example, we use the standard library nat type.

import data.nat
open nat
-- BEGIN
set_option pp.coercions true
check succ 2

example : succ 2 = succ (of_num 2) :=
rfl

check of_num
-- END

The coercion of_num is just a function from num to nat.

Returning to our nat type, the constructor S has the type nat → nat, just like the functions minustwo and pred:

import logic

namespace playground
inductive nat :=
O : nat,
S : nat → nat
namespace nat
definition pred (n : nat) : nat :=
cases_on n
  O
  (fun (n₁ : nat), n₁)

definition minustwo (n : nat) : nat :=
pred (pred n)

-- BEGIN
check S
check pred
check minustwo
-- END
end nat
end playground

These are all things that can be applied to a number to yield a number. However, there is a fundamental difference: functions like pred and minustwo come with computation rules – e.g., the definition of pred says that pred (S (S O)) can be simplified to S O – while the definition of S has no such behavior attached. Although it is like a function in the sense that it can be applied to an argument, it does not do anything at all! For most function definitions over numbers, pure case analysis is not enough: we also need recursion. For example, to check that a number n is even, we may need to recursively check whether n-1 is odd. We can also write this kind of function using recursors. Lean automatically generates different recursors whenever an inductive datatype is declared. The recursor rec_on is similar to cases_on, but it provides a recursive parameter.

import logic

namespace playground

inductive boolean :=
true, false

namespace boolean
  definition neg (b : boolean) : boolean :=
  cases_on b false true
end boolean

open boolean

inductive nat :=
O : nat,
S : nat → nat

namespace nat
-- BEGIN
definition even (n : nat) : boolean :=
rec_on n
  true
  (fun (n₁ : nat) (r : boolean), neg r)

eval even O
eval even (S O)
eval even (S (S O))
-- END
end nat
end playground

The definition above can be read as: “if n is O, then it is even; if n has the form S n₁ for some n₁, and n₁ is even (this information is stored in r), then n₁ is not even”. We say r is the recursive parameter in the minor premise associated with the constructor S. For non-recursive inductive datatypes, the recursors cases_on and rec_on are identical.

import logic

inductive boolean :=
true, false

-- BEGIN
check boolean.rec_on
check boolean.cases_on
-- END

We can define odd in a similar way, but here is a simpler definition:

import logic

namespace playground

inductive boolean :=
true, false

namespace boolean
  definition neg (b : boolean) : boolean :=
  cases_on b false true
end boolean

open boolean

inductive nat :=
O : nat,
S : nat → nat

namespace nat
definition even (n : nat) : boolean :=
rec_on n
  true
  (fun (n₁ : nat) (r : boolean), neg r)

-- BEGIN
definition odd (n : nat) : boolean :=
neg (even n)

example : odd (S O) = true :=
rfl

example : odd (S (S (S (S O)))) = false :=
rfl
-- END
end nat
end playground

Naturally, we can also define multi-argument functions using recursors.

import logic
namespace playground

inductive nat :=
O : nat,
S : nat → nat

namespace nat
-- BEGIN
definition add (n m : nat) : nat :=
rec_on n
  m
  (fun (n₁ : nat) (r : nat), S r)

-- Adding three to two gives us five, as we would expect.
example : add (S (S (S O))) (S (S O)) = S (S (S (S (S O)))) :=
rfl
-- END
end nat
end playground

The definition above can be read: “if n is O, then the result of the addition is m; if n has the form S n₁ for some n₁ and r contains add n₁ m, then return S r”.

We can use add to define a coercion from num to our nat. The type num is defined as

namespace playground
-- BEGIN
inductive pos_num :=
one  : pos_num,
bit1 : pos_num → pos_num,
bit0 : pos_num → pos_num

inductive num :=
zero  : num,
pos   : pos_num → num
-- END
end playground

It uses binary encoding for compactly representing big numbers: bit1 n encodes 2*n+1, and bit0 n encodes 2*n. As any inductive type, Lean automatically creates the recursors rec_on and cases_on. We use them to define a coercion.

import logic
namespace playground

inductive nat :=
O : nat,
S : nat → nat

namespace nat
definition add (n m : nat) : nat :=
rec_on n
  m
  (fun (n₁ : nat) (r : nat), S r)

-- BEGIN
definition num_to_nat [coercion] (n : num) : nat :=
num.cases_on n
  O
  (fun (p : pos_num), pos_num.rec_on p
    (S O)
    (fun (p₁ : pos_num) (r : nat), S (add r r))
    (fun (p₁ : pos_num) (r : nat), add r r))

example : add 2 2 = 4 :=
rfl

example : add 1 2 = S (S (S O)) :=
rfl

check add 2 1
set_option pp.coercions true
check add 2 1

-- END
end nat
end playground

The annotation [coercion] instructs Lean to use num_to_nat whenever we have a num, but the system expects a nat. In the example above, the function add expects two nat’s, but we are providing two num’s. Thus, Lean automatically inserts num_to_nat.

Now, we define subtraction sub and multiplication mul using add and pred.

import logic
namespace playground

inductive nat :=
O : nat,
S : nat → nat

namespace nat
definition pred (n : nat) : nat :=
cases_on n
  O
  (fun (n₁ : nat), n₁)

definition add (n m : nat) : nat :=
rec_on n
  m
  (fun (n₁ : nat) (r : nat), S r)

-- BEGIN
definition sub (n m : nat) : nat :=
rec_on m
  n
  (fun (n₁ : nat) (r : nat), pred r)

example : sub (S (S (S O))) (S (S O)) = S O :=
rfl

definition mul (n m : nat) : nat :=
rec_on n
  O
  (fun (n₁ : nat) (r : nat), add m r)

example : mul (S (S O)) (S (S (S O))) = (S (S (S (S (S (S O)))))) :=
rfl
-- END
end nat
end playground

Now, we define the exponential function exp using mul.

import logic
namespace playground

inductive nat :=
O : nat,
S : nat → nat

namespace nat
definition add (n m : nat) : nat :=
rec_on n
  m
  (fun (n₁ : nat) (r : nat), S r)

definition mul (n m : nat) : nat :=
rec_on n
  O
  (fun (n₁ : nat) (r : nat), add m r)

-- BEGIN
definition exp (base power : nat) : nat :=
rec_on power
  (S O)
  (fun (n₁ : nat) (r : nat), mul base r)

eval exp (S (S O)) (S (S (S O)))
-- END
end nat
end playground

Exercise factorial

Recall the standard factorial function:

factorial(0)  =  1
factorial(n)  =  n * factorial(n-1)     (if n>0)

Translate this into Lean.

import logic
namespace playground

inductive nat :=
O : nat,
S : nat → nat

namespace nat
definition add (n m : nat) : nat :=
rec_on n
  m
  (fun (n₁ : nat) (r : nat), S r)

definition mul (n m : nat) : nat :=
rec_on n
  O
  (fun (n₁ : nat) (r : nat), add m r)

definition num_to_nat [coercion] (n : num) : nat :=
num.cases_on n
  O
  (fun (p : pos_num), pos_num.rec_on p
    (S O)
    (fun (p₁ : pos_num) (r : nat), S (add r r))
    (fun (p₁ : pos_num) (r : nat), add r r))

-- BEGIN
definition factorial (n : nat) : nat :=
/- FILL IN HERE -/ sorry

example : factorial 3 = 6 :=
/- FILL IN HERE -/ sorry

example : factorial 5 = (mul 10 12) :=
/- FILL IN HERE -/ sorry
-- END
end nat
end playground

We can make numerical expressions a little easier to read and write by introducing “notations” for addition, multiplication, and subtraction.

import logic
namespace playground

inductive nat :=
O : nat,
S : nat → nat

namespace nat
definition pred (n : nat) : nat :=
cases_on n
  O
  (fun (n₁ : nat), n₁)

definition add (n m : nat) : nat :=
rec_on n
  m
  (fun (n₁ : nat) (r : nat), S r)

definition sub (n m : nat) : nat :=
rec_on m
  n
  (fun (n₁ : nat) (r : nat), pred r)

definition mul (n m : nat) : nat :=
rec_on n
  O
  (fun (n₁ : nat) (r : nat), add m r)

definition num_to_nat [coercion] (n : num) : nat :=
num.cases_on n
  O
  (fun (p : pos_num), pos_num.rec_on p
    (S O)
    (fun (p₁ : pos_num) (r : nat), S (add r r))
    (fun (p₁ : pos_num) (r : nat), add r r))

-- BEGIN
notation a + b := add a b
notation a - b := sub a b
notation a * b := mul a b

eval 2 + 3 * 2 - 1

example : 2 + 3 * 2 - 1 = 7 :=
rfl
-- END
end nat
end playground

In the example above, we did not have to provide precedence levels for the new notation because the Lean standard library already assigns precedence for commonly used operators such as +, * and -.

Proof by Reflexivity

Now that we’ve defined a few datatypes and functions, let’s turn to the question of how to state and prove properties of their behavior. Actually, in a sense, we’ve already started doing this: each Example in the previous sections makes a precise claim about the behavior of some function on some particular inputs. The proofs of these claims were always the same: use reflexivity rfl to check that both sides of the === evaluate to identical values. The same sort of “proof by evaluation” can be used to prove more interesting properties as well. For example, the fact that 0 is a “neutral element” for + on the left can be proved just by observing that 0 + n reduces/evaluates to n no matter what n is, a fact that can be read directly off the definition of plus.

import logic
namespace playground

inductive nat :=
O : nat,
S : nat → nat

namespace nat
definition add (n m : nat) : nat :=
rec_on n
  m
  (fun (n₁ : nat) (r : nat), S r)

notation a + b := add a b

definition num_to_nat [coercion] (n : num) : nat :=
num.cases_on n
  O
  (fun (p : pos_num), pos_num.rec_on p
    (S O)
    (fun (p₁ : pos_num) (r : nat), S (add r r))
    (fun (p₁ : pos_num) (r : nat), add r r))

-- BEGIN
theorem add_0_n : ∀ n : nat, 0 + n = n :=
take n : nat, rfl
-- END
end nat
end playground

The symbol ∀ = can be inputed by typing =\all. We can also use the token forall instead of the unicode character =∀ =. The form of this theorem and proof are almost exactly the same as the examples above; there are just a few differences. First, we’ve used the keyword theorem instead of example. We provided a name to identify the theorem. The keywords definition and theorem are almost the same thing in Lean. Secondly, we’ve added the quantifier ∀ n:nat, so that our theorem talks about all natural numbers n. In order to prove theorems of this form, we need to to be able to reason by assuming the existence of an arbitrary natural number n. This is achieved in the proof by take n : nat, .... In effect, we start the proof by saying “OK, suppose n is some arbitrary number.” Eventually, it will become clear that take is just another syntax sugar for lambda abstraction. We could also have used fun or =λ = in the proof above. In Lean, proof checking is type checking. The same procedure used to type check our definitions is used to proof/type check our theorems. Here are other simple theorems.

import logic
namespace playground

inductive nat :=
O : nat,
S : nat → nat

namespace nat
definition add (n m : nat) : nat :=
rec_on n
  m
  (fun (n₁ : nat) (r : nat), S r)

definition mul (n m : nat) : nat :=
rec_on n
  O
  (fun (n₁ : nat) (r : nat), add m r)

notation a + b := add a b
notation a * b := mul a b

definition num_to_nat [coercion] (n : num) : nat :=
num.cases_on n
  O
  (fun (p : pos_num), pos_num.rec_on p
    (S O)
    (fun (p₁ : pos_num) (r : nat), S (add r r))
    (fun (p₁ : pos_num) (r : nat), add r r))

-- BEGIN
theorem add_1_left : ∀ n : nat, 1 + n = S n :=
take n : nat, rfl

theorem mult_0_left : ∀ n : nat, 0 * n = 0 :=
take n : nat, rfl
-- END
end nat
end playground

We have been using rfl as a short hand for reflexivity. That is, a proof for ?a = ?a, but which ?a?. Lean infers the ?a from the context where rfl is used. Sometimes, it is convenient to provide ?a explicitly (e.g., as a form of documentation in a longer proof). In this cases, we can use eq.refl t as a proof for t = t.

import logic
-- BEGIN
check rfl
check eq.refl 1

example : 1 = 1 :=
rfl

example : 1 = 1 :=
eq.refl 1
-- END

Here are the add_1_left and mult_0_left theorems using eq.refl.

import logic
namespace playground

inductive nat :=
O : nat,
S : nat → nat

namespace nat
definition add (n m : nat) : nat :=
rec_on n
  m
  (fun (n₁ : nat) (r : nat), S r)

definition mul (n m : nat) : nat :=
rec_on n
  O
  (fun (n₁ : nat) (r : nat), add m r)

notation a + b := add a b
notation a * b := mul a b

definition num_to_nat [coercion] (n : num) : nat :=
num.cases_on n
  O
  (fun (p : pos_num), pos_num.rec_on p
    (S O)
    (fun (p₁ : pos_num) (r : nat), S (add r r))
    (fun (p₁ : pos_num) (r : nat), add r r))

-- BEGIN
theorem add_1_left : ∀ n : nat, 1 + n = S n :=
take n : nat, eq.refl (1 + n)

theorem mult_0_left : ∀ n : nat, 0 * n = 0 :=
take n : nat, eq.refl (0 * n)
-- END
end nat
end playground

Proof by Substitution

Here is a slightly more interesting theorem:

import logic
namespace playground

inductive nat :=
O : nat,
S : nat → nat

namespace nat
definition add (n m : nat) : nat :=
rec_on n
  m
  (fun (n₁ : nat) (r : nat), S r)

notation a + b := add a b

-- BEGIN
theorem plus_id_example : ∀ n m : nat, n = m → n + n = m + m :=
-- END
take n m : nat, assume H : n = m,
  eq.subst H (eq.refl (n + n))

end nat
end playground

Instead of making a completely universal claim about all numbers n and m, this theorem talks about a more specialized property that only holds when n = m. The arrow symbol is pronounced “implies.” As before, we need to be able to reason by assuming the existence of some numbers n and m. We also need to assume the hypothesis n = m. We can use lambda abstraction for doing all three. To make proofs look like more text book proofs, Lean provides yet another syntax sugar for lambda abstraction: assume. Since n and m are arbitrary numbers, we can’t just use evaluation to prove this theorem. Instead, we prove it by observing that, if we are assuming n = m, then we can replace n with m in the right hand side of the equality n + n = n + n. Moreover, the equality n + n = n + n can be justified by reflexivity. Lean provides a function eq.subst H1 H2, given (H1 : n = m), the expression (eq.subst H1 H2) replaces n with m in H2. Now, we provide different proofs for the theorem plus_id_example using eq.subst.

import logic
namespace playground

inductive nat :=
O : nat,
S : nat → nat

namespace nat
definition add (n m : nat) : nat :=
rec_on n
  m
  (fun (n₁ : nat) (r : nat), S r)

notation a + b := add a b

-- BEGIN
theorem plus_id_example : ∀ n m : nat, n = m → n + n = m + m :=
take n m : nat, assume H : n = m,
  eq.subst H (eq.refl (n + n))

-- We can avoid the eq prefix by opening the namespace eq
open eq
theorem plus_id_example_2 : ∀ n m : nat, n = m → n + n = m + m :=
take n m : nat, assume H : n = m,
  subst H (refl (n + n))

-- We can use rfl instead of refl
theorem plus_id_example_3 : ∀ n m : nat, n = m → n + n = m + m :=
take n m : nat, assume H : n = m,
  subst H rfl

-- We can use λ instead of take and assume
theorem plus_id_example_4 : ∀ n m : nat, n = m → n + n = m + m :=
λ (n m : nat) (H : n = m),
  subst H rfl

-- We can omit types
theorem plus_id_example_5 : ∀ n m : nat, n = m → n + n = m + m :=
λ n m H, subst H rfl

-- END
end nat
end playground

The functions eq.subst is extensively used in the standard library. To make it more convenient to use, the notation H₁ ▸ H₂ can be used as syntax sugar for eq.subst H₁ H₂. This notation is defined in the namespace eq.ops. The character is inputed by typing \t.

import logic
namespace playground

inductive nat :=
O : nat,
S : nat → nat

namespace nat
definition add (n m : nat) : nat :=
rec_on n
  m
  (fun (n₁ : nat) (r : nat), S r)

notation a + b := add a b

-- BEGIN
open eq.ops
theorem plus_id_example : ∀ n m : nat, n = m → n + n = m + m :=
λ n m H, H ▸ rfl
-- END
end nat
end playground

Exercise add_id_exercise

Remove sorry and fill in the proof.

import logic
namespace playground

inductive nat :=
O : nat,
S : nat → nat

namespace nat
definition add (n m : nat) : nat :=
rec_on n
  m
  (fun (n₁ : nat) (r : nat), S r)

notation a + b := add a b

open eq eq.ops
-- BEGIN
theorem plus_id_exercise : ∀ n m o : nat, n = m → m = o → n + m = m + o :=
/- FILL IN HERE -/ sorry
-- END

end nat
end playground

As we’ve seen in earlier examples, the sorry expression tells Lean that we want to skip trying to prove this theorem and just accept it as a given. This can be useful for developing longer proofs, since we can state subsidiary facts that we believe will be useful for making some larger argument, use sorry to accept them on faith for the moment, and continue thinking about the larger argument until we are sure it makes sense; then we can go back and fill in the proofs we skipped. Be careful, though: every time you say sorry you are leaving a door open for total nonsense to enter Lean rigorous and formally checked world! Note that, Lean produces warning messages whenever we use sorry and/or import a module that contains sorry.

We can also use the eq.subst with a previously proved theorem instead of a hypothesis from the context.

import logic
namespace playground

inductive nat :=
O : nat,
S : nat → nat

namespace nat
definition add (n m : nat) : nat :=
rec_on n
  m
  (fun (n₁ : nat) (r : nat), S r)

definition mul (n m : nat) : nat :=
rec_on n
  O
  (fun (n₁ : nat) (r : nat), add m r)

notation a + b := add a b
notation a * b := mul a b

definition num_to_nat [coercion] (n : num) : nat :=
num.cases_on n
  O
  (fun (p : pos_num), pos_num.rec_on p
    (S O)
    (fun (p₁ : pos_num) (r : nat), S (add r r))
    (fun (p₁ : pos_num) (r : nat), add r r))

theorem add_0_n : ∀ n : nat, 0 + n = n :=
take n : nat, rfl

open eq eq.ops

-- BEGIN
theorem mult_0_plus : ∀ n m : nat, (0 + n) * m = n * m :=
take n m,
  subst (add_0_n n) rfl
-- END
end nat
end playground

In the example above, the n in (add_0_n n) can be inferred automatically by Lean. We can use _ to indicate arguments that should be inferred automatically by Lean. That is, we can write (add_0_n _) instead of (add_0_n n). In this example, it is straightforward how to fill/synthesize _, but this will not always be the case. To avoid the proliferation of terms such as (f _ _ _), Lean provides the alternative syntax !f. Later, we explain in detail the semantics of the operator !.

import logic
namespace playground

inductive nat :=
O : nat,
S : nat → nat

namespace nat
definition add (n m : nat) : nat :=
rec_on n
  m
  (fun (n₁ : nat) (r : nat), S r)

definition mul (n m : nat) : nat :=
rec_on n
  O
  (fun (n₁ : nat) (r : nat), add m r)

notation a + b := add a b
notation a * b := mul a b

definition num_to_nat [coercion] (n : num) : nat :=
num.cases_on n
  O
  (fun (p : pos_num), pos_num.rec_on p
    (S O)
    (fun (p₁ : pos_num) (r : nat), S (add r r))
    (fun (p₁ : pos_num) (r : nat), add r r))

theorem add_0_n : ∀ n : nat, 0 + n = n :=
take n : nat, rfl

open eq eq.ops

-- BEGIN
theorem mult_0_plus : ∀ n m : nat, (0 + n) * m = n * m :=
take n m,
  subst !add_0_n rfl

-- Here is another proof using the notation ▸
theorem mult_0_plus_2 : ∀ n m : nat, (0 + n) * m = n * m :=
take n m, !add_0_n ▸ rfl
-- END
end nat
end playground

Exercise mult_S_1

import logic
namespace playground

inductive nat :=
O : nat,
S : nat → nat

namespace nat
definition add (n m : nat) : nat :=
rec_on n
  m
  (fun (n₁ : nat) (r : nat), S r)

definition mul (n m : nat) : nat :=
rec_on n
  O
  (fun (n₁ : nat) (r : nat), add m r)

notation a + b := add a b
notation a * b := mul a b

definition num_to_nat [coercion] (n : num) : nat :=
num.cases_on n
  O
  (fun (p : pos_num), pos_num.rec_on p
    (S O)
    (fun (p₁ : pos_num) (r : nat), S (add r r))
    (fun (p₁ : pos_num) (r : nat), add r r))

theorem add_1_left : ∀ n : nat, 1 + n = S n :=
take n : nat, rfl

open eq eq.ops
-- BEGIN
theorem mult_S_1 : ∀ n m : nat, m = S n → m * (1 + n) = m * m :=
/- FILL IN HERE -/ sorry
-- END
end nat
end playground

Proof by Case Analysis

Of course, not everything can be proved by simple calculation: In general, unknown, hypothetical values (arbitrary numbers, booleans, lists, etc.) can block the evaluation. Here is a simple example

import logic

inductive boolean :=
true, false

namespace boolean
definition neg (b : boolean) : boolean :=
cases_on b false true

-- BEGIN
theorem neg_involutive : ∀ b : boolean, neg (neg b) = b :=
/- rfl -/ -- fails
sorry
-- END
end boolean

The reason for the failure is that neg is defined using cases_on on b. But here, b is the unknown boolean b. Thus, the expression (neg b) cannot be evaluated/reduced. We say it is stuck.

The same way we use cases_on to define functions, we can use it to prove theorems by case-analysis.

import logic

inductive boolean :=
true, false

namespace boolean
definition neg (b : boolean) : boolean :=
cases_on b false true

-- BEGIN
theorem neg_involutive : ∀ b : boolean, neg (neg b) = b :=
take b : boolean,
  cases_on b
    rfl   -- proof for the case b is true
    rfl   -- proof for the case b is false
-- END
end boolean

The proof above can be read as: “take a boolean b, it can be true or false, in both cases if we evaluate the left/right hand sides, we obtain the same value”.

If we replace the rfl’s by _ in the proof above, Lean will say the placeholders _’s cannot be synthesized, and will show what needs to be proved:

⊢ neg (neg false) = false

and

⊢ neg (neg true) = true

For readers using the Lean Emacs mode, they can simply hover over the incomplete proof to obtain information about missing parts. Lean also reports the type of every sorry used in an input file.