-
Notifications
You must be signed in to change notification settings - Fork 356
Porting wiki
The current state of play:
-
mathport
more or less works.- Lots of warts, but it gives an output that is much easier to clean up than to translate from scratch.
- Please report errors in the output of
mathport
. We would like to make it better, but some things are hard!
- many tactics have been ported to mathlib4 (e.g.
ring
,linarith
,norm_num
) but many are still missing. - Every file has been ported.
- We need lots of people helping on the port!
This document explains how to get involved with the porting process.
- Decide the file you wish to port; you may wish to use the port status page.
- Run the
start_port.sh
fromMathlib4
master
; this copies the file frommathlib3port
, runs various automated fixes and sets up a branch for you to work on. - Run
lake exe cache get
andlake build <library-name>
and review the file - Publish the new branch and open a PR in Github (this automates tracking for porting status)
- Initially tag the branch with
mathlib-port
andWIP
- If you take it as far as you easily can, remove
WIP
and tag withhelp wanted
- Initially tag the branch with
- Perform the various fixes noted below, pushing commits back to Github often
- When you are done, remove any
help-wanted
orWIP
labels, and tag the PR withawaiting-review
. Done is defined as:- Builds and lints cleanly using CI automation (note that some lints can be automated via
fix-lints.py
- Names follow Mathlib4 conventions
- Non-trival changes documented via
--porting note: <note>
- Documentation updated to address name changes (note that some documentation changes can be automated via
scripts/fix-comments.py
)
- Builds and lints cleanly using CI automation (note that some lints can be automated via
- Address any review comments
- 🎉
Note there is a video showing the whole process on a very simple file. It is slightly out of date; the instructions below should be regarded as canonical.
- If you're a mathlib maintainer, please remember to regularly check #queue4 for mathlib4 PRs which are ready for review/merge.
- If you do not already have permissions to the
mathlib4
repo, ask for it on Zulip, and include your Github handle. - The first task is to decide what you want to port.
- A bot regularly posts about progress on various targets. You can also find some somewhat-out-of-date graphs here:
- Clicking on one of the links will show you a graph of all the files leading up to that target.
- Green nodes have already been ported.
- Dark blue nodes have an open PR; you can click on the node to open the PR.
- Light blue nodes are ready to be ported.
- If you are an experienced porter, you are encouraged to prioritise helping out with any open PRs that might be "stuck".
- For everyone, the light blue nodes that appear on these graphs are the next highest priorities.
- Before starting to port such a node, check the port status page to check there are no comments about this file that should be considered first.
- You can also use the port status page to find files that need porting
(whether or not they are needed for any of the near-term targets).
This page shows the same results as you'd get from running
scripts/port_status.py
in the mathmathlib
directory.
- Once you've selected a file
X/Y/Z.lean
to work on, you can get started. - Make sure you are in the root directory of the mathlib4 repository, and run
./scripts/start_port.sh Mathlib/X/Y/Z.lean
. This will automatically do the following:- Creates a new branch
port/X.Y.Z
. - Downloads the corresponding file from the
mathlib3port
repository. - Commits it as-is. This allows reviewers to see the diff between mathport and the manually ported version, making it easy to check aligns and similar statements.
- Adds the new file as import to
Mathlib.lean
. - Replaces all occurences of
import Mathbin.
withimport Mathlib.
, running effectively
find Mathlib -name "*.lean" | env LC_ALL=C sort | sed 's/\.lean//;s,/,.,g;s/^/import /' > Mathlib.lean
- Creates a second commit with these changes.
- Note that on OS X, the built-in version of
sed
does not work with thestart_port.sh
script. The fix is to install the GNU version (e.g. viabrew install gsed
) and ensure it – rather than the OS version – is the default forsed
(e.g. by following the path instructions provided bybrew
).
- Creates a new branch
- Already at this point you should open a PR on the mathlib4 repository. The title should be
feat: port X
, and you can label it asmathlib-port
, and alsoWIP
. We open the PR early so others know you are working on the file. - If you want to record free-form comments about a file, please add them to the port comments.
- Perform a bunch of mechanical updates on the file (i.e. things that in principle
mathport
could do, but doesn't yet), see below. - Try to get everything compiling again.
- A helpful workflow is to go from red error to red error by pressing F8 in VScode, ignoring orange linting errors; then, once all or most red errors are gone, run
scripts/fix-lints.py
on the file to fix many of the remaining orange errors automatically. - Leave
-- Porting note: ...
liberally: anytime you leave something out that was in mathlib3, there should be a porting note. - Use
-- Porting note: TODO ...
for things that must be fixed before the PR can be merged. - Fix documentation changes via
scripts/fix-comments.py
- A helpful workflow is to go from red error to red error by pressing F8 in VScode, ignoring orange linting errors; then, once all or most red errors are gone, run
- It's sometimes useful to have the original mathlib file open in a separate window, so you can typecheck, discover where instances are coming from, etc. Some potential obstacles include:
- Discovering that an import
Y.lean
which you thought was sufficiently well ported already in fact isn't. (Either patch it up, or go back to step 1 withX = Y
.) - Finding that a tactic
T
used in a proof is still missing. (Usually for now you'll just work around it by using simpler tactics, but ambitious porters are invited to give up on portingX
, and instead portT
! Also check the open PR's—tacticT
might be effectively ported, and just awaiting review.) - Encountering a surprising change in behavior in either a ported tactic or Lean core. (Ask on Zulip, create an issue, or hesitantly work around it.)
- Make sure imports are accurate (even down to casing) - Mathport is not super great at these.
- Discovering that an import
- If you get stuck on something, please add the label
help-wanted
, but also post on Zulip asking for assistance. - If everything works, please remove any
help-wanted
orWIP
labels, and add the labelawaiting-review
.
Implicit in this proposed scenario there is no "flag day" for all of mathlib
: we just work our way up from the bottom.
In the optimistic scenario, the problems in step 10 become less and less frequent, and mathport
gets some tweaks that reduce the effort required in step 8, and it becomes possible to do multiple files, or whole directories or areas in one go!
For any file that isn't already a "complete port" but has been manually partly ported, we take "starting over" as the default route.
This might involve:
- move the existing
X.lean
toX2.lean
- take
X.lean
from mathlib3port, and get it working asX.lean
-
import X.lean
inX2.lean
- delete everything in
X2.lean
that by now has already been defined inX.lean
- find homes for or discard anything that remains (judgement required here!)
- get everything downstream working again
You can also port from Lean 3 core, as well as from mathlib3. If you find something is missing that was in Lean 3 core, please put it in the Mathlib/Init/
directory. There is a repository lean3port
which contains the output of mathport
running on Lean 3 core, analogous to mathlib3port
.
- change
import Mathbin
toimport Mathlib
- Recommendation: temporarily add the line
set_option autoImplicit false -- **TODO** delete this later
to the top of the file; this stops Lean from incorrectly assuming that certain incorrectly ported terms are just new variables which the user chose not to declare. - Add explicit imports to resolve missing dependencies where transitive imports have changed (often by identifying what the
mathlib
equivalent resolves to & what themathlib4
equivalent is) - If a file imports another file from the
tactic
ormeta
folder: look at the status of which tactics that still need to be ported in this list. Tactics, attributes and commands that are not on this list should be working in Lean 4. If the tactics mentioned in the imports are already ported, you can import the corresponding mathlib4 files (the file name might have changed a bit). If not, check the open PR's on mathlib4 with the "meta" label—sometimes tactics are effectively ported and functional, but are not yet merged as they await review. If you find it there, feel free to merge that tactic's feature branch into your branch, and your PR will become dependent on that branch. (Note that the tactic porting tracking issue is not always up to date, so it's worth taking a look just in case.) If that fails too, you can replace the'No'
on https://github.com/leanprover-community/mathlib/wiki/mathlib4-port-status by'Waiting on tactic X'
or something. - when mathport suggests that you add
ₓ
in the name of an#align
statement, figure out why it wants that and leave a porting note. Common reasons are:- a change in the order of universe variables, typeclass arguments or implicit arguments
- some other defeq change, due to a change deep down in core or std.
- Review the names assigned by
mathport
for adherence to naming conventions (see Aligning Names below) and fix and align names manually. - Ensure the names in documentation are consistent with the updated names. Once you are done with
#align
s,./scripts/fix-comments.py
can fix many of these automatically, so save this step to late in the porting process. - Missing attributes:
- For
refl
:import Mathlib.Tactic.Relation.Rfl @[refl] theorem foo : 1 = 1 := rfl
-
trans
: for now use import Mathlib.Mathport.Attributes. -
protect_proj
: addprotected
before each field - If you run into other attributes, hopefully you will find them noted as needing porting in
Mathlib/Mathport/Syntax.lean
(which you can freely import if needed, I guess), or they should already be implemented somewhere, and searching forregister.*attr
should help you locate it.
- For
- See also the Lean 4 documentation about differences with Lean 3.
- When declaring a structure, if a field needs to be put on a new line, then each fields need to be on a new line, or at least the first after the
with
keyword must be on a new line. If you hit the error "expected '}'", it's probably this. - For
LawfulMonad
bind_pure_comp_eq_map
is nowbind_pure_comp
and there are more laws that need to be proved.LawfulMonad.mk'
defaults many of these. As an example translation:goes toinstance : LawfulMonad (PFun α) where bind_pure_comp_eq_map β γ f x := funext fun a => Part.bind_some_eq_map _ _ id_map β f := by funext a <;> dsimp [Functor.map, PFun.map] <;> cases f a <;> rfl pure_bind β γ x f := funext fun a => Part.bind_some.{u_1, u_2} _ (f x) bind_assoc β γ δ f g k := funext fun a => (f a).bind_assoc (fun b => g b a) fun b => k b a
(Note how implicit arguments could be omitted in Lean 4)instance : LawfulMonad (PFun α) := LawfulMonad.mk' (bind_pure_comp := fun f x => funext fun a => Part.bind_some_eq_map _ _) (id_map := fun f => by funext a ; dsimp [Functor.map, PFun.map] ; cases f a; rfl) (pure_bind := fun x f => funext fun a => Part.bind_some _ (f x)) (bind_assoc := fun f g k => funext fun a => (f a).bind_assoc (fun b => g b a) fun b => k b a)
- The command
include
andomit
are gone in Lean4, you can safely delete the corresponding lines. Lean should be able to do the right thing by default. - Lean 4 names instances differently from Lean 3, in ways that can lead to exponential explosions in instance names. It is best to look up the instance name in Lean 3 and name instances explicitly. If you come across a reference to an instance name in another file where the instance is not explicitly named, rather than change the sensible name to
instLongNameAnotherLongNameStillAnotherLongName
, it is best to name the upstream instance in the other file explicitly in the PR. - The simplifier behaves differently, in ways that are still not fully documented. Some changes that might be needed:
- add the lemma
iff_self
to a failingsimp only [...]
, especially if the goal state looks like this might be relevant (P ↔ P
): the Lean 3 simplifier did simp onlys a bit differently and that simplification was included by default. Zulip discussion, Lean 4 PR which would change this - a "non-confluent" simp is not guaranteed to behave the same way between Lean 3 and Lean 4. You can squeeze the Lean 3
simp
to asimp only
and copy it over, or you can identify the different path taken in Lean 4 by squeezing both simps and then omit the bad lemma which starts this path withsimp [..., -bad_lemma]
. Zulip discussion - If you get an error message containing "the following variables have been introduced by the implicit lambda feature", it may be that the Lean 3 code contained a proof term with a lambda introducing an implicit argument and this confused mathport. For instance you could have in Lean 3
example := ∀ {n : ℕ}, ... := λ n, bar
which becomes in Lean 4example := ∀ {n : Nat}, ... := bar
. See the section about implicit lambdas in the Lean 4 manual.
- add the lemma
- If you get a
simp can prove this
lint, check ifby simp
can close the original goal and consider removing thesimp
attribute to address the lint. Put porting comments to indicate the change. If the original proof isrfl
, check ifby dsimp
can close the goal. If not, do not remove thesimp
, instead add anolint simpNF
(and again, document the decision via porting notes). In some cases, the path tosimp can prove this
is very complicated or nonsensical - in such cases ask for guidance on the Zulip. Templates for porting notes:- "-- @[simp] -- Porting note: simp can prove this" -- if proof is not rfl
- "-- @[simp] -- Porting note: dsimp can prove this" -- if proof is rfl
- "@[simp, nolint simpNF] -- Porting note: dsimp cannot prove this" -- if proof is rfl
- If you get
simpNF
lint that the left hand side can be simplified, it is often best to create a newsimp
lemma with the simplified LHS and remove thesimp
attribute from the original lemma. -
The notationThis change is not recommended, because it makes review more difficult and it can be done automatically later.fun x y => z
should be replaced byfun x y ↦ z
. -
bit0
andbit1
yield deprecation warnings. You should still port lemmas that involve them, but useset_option linter.deprecated false
within a section orset_option linter.deprecated false in
before a single result. -
nthLe
has been deprecated forget
but has a significant API change. If it's easy and targeted to refactor toget
, add a porting note, and leave adeprecated
version of the old lemma. If not, useset_option linter.deprecated false
as forbit0
/bit1
and add a porting note to refactor post-port - Long strings that include an explicit
\n
can have the explicit\n
removed, and instead be broken up by actual newlines as necessary to satisfy the linter. (Single newlines in strings are ignored.)-
\n\n
should likewise be replaced with two actual newlines, as this signals a real line break in the generated documentation.
-
- Sometimes a "failed to show termination" error in a recursive definition can be resolved just by specifying the type on the right hand side of a match (
=>
).
See: https://leanprover.zulipchat.com/#narrow/stream/287929-mathlib4/topic/CoeFun/near/317644350
have : ↑(RelIso.toRelEmbedding e).toEmbedding = FunLike.coe e := rfl
can often be used to simplify.
When you get coersion problems of the form r ^ ↑n : ℝ
instead of r ^ n : ℝ
(for n : ℕ
), you can temporarily add the following at the top of the file:
local macro_rules | `($x ^ $y) => `(HPow.hPow $x $y)
(see zulip discussion)
You can fix many linting errors automatically by running scripts/fix-lints.py
on the file.
Run lints by adding #lint
at the bottom of the file. Remove the #lint directive before checking in.
If the file you are importing don't give you access to the #lint
command then you need to add
import Std.Tactic.Lint.Frontend
import Std.Tactic.Lint.Misc
Note that we do not expect new documentation to be added during the porting process. Writing #lint
at the bottom of a file will complain about missing documentation, but this can be ignored. The documentation linter has been disabled in CI, for purposes of the port.
Adding documentation will be done separately, so that it can be done with a focus on writing good documentation.
In Lean 4, notation can also be documented. This can usually be done using the @[inherit_doc]
attribute, which will copy the documentation from another declaration. This attribute takes a declaration name as an optional parameter in case Lean doesn't figure out which declaration should give its documentation to the notation.
In case @[inherit_doc]
is not sufficient, please leave a -- Porting note: TODO add documentation
.
In Mathlib.Mathport.Rename
there are a bunch of commands of the form #align lean3_name lean4Name
, but you can place them anywhere. These instruct mathport to adjust its naming on a case-by-case basis, and we need to be using it a lot more than we are. Note that mathport#187 removes the \_x
's in exchange for now quite often running into severe type errors and having to stub stuff out. Basically, it will now assume that any name clashes are intentional and will use the clashed name even if it's not type correct (instead of making up \_x
names), but that means that some definitions that actually do need to be different, like CoeT
or Stream
, need to be manually marked for realignment.
But even beyond this, whenever mathport gets a name wrong you want to use an #align
directive to change the name to the desired thing, even if you have already fixed the definition, because that influences all future references of that definition.
If Lean complains it does not know about the #align
command then you need to import Mathlib.Mathport.Rename
.
Note that #align
does not automatically include namespaces.
(note: These are work in progress)
We use the following conventions
- Terms of
Prop
s (e.g. proofs, theorem names) are insnake_case
. -
Prop
s andType
s (orSort
) (inductive types, structures, classes) are inUpperCamelCase
. - Functions are named the same way as their return values (e.g. a function of type
A → B → C
is named as though it is a term of typeC
). - All other terms of
Type
s (basically anything else) are inlowerCamelCase
. - When something named with
UpperCamelCase
is part of something named withsnake_case
, it is referenced inlowerCamelCase
. - Acronyms like
LE
are written upper-/lowercase as a group, depending on what the first character would be. - Rules 1-6 apply to fields of a structure or constructors of an inductive type in the same way.
There are some rare counter-exceptions to preserve local naming symmetry: e.g., we use Ne
rather than NE
to follow the example of Eq
; outParam
has a Sort
output but is not UpperCamelCase
; etc. Any such counter-exceptions should be discussed on Zulip.
Note that if a theorem contains a lowerCamelCase
component with multiple words as in rule 4, mathport will usually get the name wrong and it has to be #align
ed by hand.
-- follows rule 2
structure OneHom (M : Type _) (N : Type _) [One M] [One N] where
toFun : M → N -- follows rule 4 via rule 3 and rule 7
map_one' : toFun 1 = 1 -- follows rule 1 via rule 7
-- follows rule 2 via rule 3
class CoeIsOneHom [One M] [One N] : Prop where
coe_one : (↑(1 : M) : N) = 1 -- follows rule 1 via rule 6
-- follows rule 1 via rule 3
theorem map_one [OneHomClass F M N] (f : F) : f 1 = 1 := sorry
-- follows rules 1 and 5
theorem MonoidHom.toOneHom_injective [MulOneClass M] [MulOneClass N] :
Function.Injective (MonoidHom.toOneHom : (M →* N) → OneHom M N) := sorry
-- manual align is needed due to `lowerCamelCase` with several words inside `snake_case`
#align monoid_hom.to_one_hom_injective MonoidHom.toOneHom_injective
-- follows rule 2
class HPow (α : Type u) (β : Type v) (γ : Type w) where
hPow : α → β → γ -- follows rule 3 via rule 6; note that rule 5 does not apply
-- follows rules 2 and 6
class LT (α : Type u) where
lt : α → α → Prop -- follows rule 4 and 6
-- follows rules 1 and 6
theorem gt_iff_lt [LT α] {a b : α} : a > b ↔ b < a := sorry
-- follows rule 2; `Ne` is an exception to rule 6
class NeZero : Prop := sorry
-- follows rules 1 and 5
theorem neZero_iff {R : Type _} [Zero R] {n : R} : NeZero n ↔ n ≠ 0 := sorry
-- manual align is needed due to `lowerCamelCase` with several words inside `snake_case`
#align ne_zero_iff neZero_iff
There is a difference between irreducible_def
and @[irreducible] def
, namely that irreducible_def
applies to both the kernel and the elaborator, whereas @[irreducible] def
only applies to the elaborator. Moreover, irreducible_def foo
creates a definitional lemma for unfolding called foo_def
, so one should use rw [foo_def]
instead of rw [foo]
or unfold foo
. If you are in a situation where you see Wrapper.value✝ wrapped✝
somewhere in your goal, it is likely the result of an irreducible_def
which has been unfolded improperly (i.e., not using foo_def
).
Where Mathlib3 used has_foo
Mathlib 4 will use Foo
.
Examples
- Lean 3:
has_mul
, Lean 4:Mul
Where Mathlib 3 used is_foo
, Mathlib 4 will use IsFoo
Note, however that Core LawfulFoo
names follow a different convention (which may be changed in the future).
- Lean 3:
is_lawful_monad
, Lean 4:LawfulMonad
See the discussion starting here: https://leanprover.zulipchat.com/#narrow/stream/287929-mathlib4/topic/ENnReal/near/312549431
- Categories are suffixed with
Cat
when a name clash would otherwise occur, e.g.GroupCat
.
In Lean 4 coercions are elaborated directly, so given
variable (f : α ≃ β) (a : α)
the application f a
is syntactically equal to f.toFun a
.
UPD: It seems that we have some magic that prevents Lean from unfolding past FunLike.coe
, so for types with FunLike
instance we have a Lean3-style behavior (thus we can have generic lemmas like map_mul
). If you define a CoeFun
instance instead, then it unfolds to toFun
.
Similarly
variable (g : α → Prop) (x : Subtype g)
the coercion (x : α)
is elaborated as x.val
.
There are several different typeclasses for coercions, the most common ones being Coe
, CoeFun
and CoeTail
.
Usually, mathlib3's coe
corresponds to Coe
, coe_fn
to CoeFun
and coeTC
to CoeTail
. In most cases, replacing coe
with Coe.coe
(or one of the other CoeFoo
classes) will work.
This means that we can get rid of lemmas that used to do the simplification f.to_fun x
to ⇑f x
as these lemmas are now syntactic tautologies.
This lemma for Subtype
is not needed anymore in mathlib4:
theorem val_eq_coe {x : Subtype p} : x.1 = ↑x :=
rfl
Another consequence is that certain instances become unnecessary:
instance coe_trans [Zero M] [Coe R S] [CoeTail S M] [h : NeZero (r : M)] : NeZero ((r : S) : M) :=
⟨h.out⟩
If the coercion function appears explicitly in a theorem, then it needs to be replaced by the function that is used in the coercion. So for example
theorem cast_injective : injective (coe : ℕ → R) := sorry
becomes
theorem cast_injective : injective (Nat.cast : ℕ → R) := sorry
To define coercions, there exists now the coe
attribute to automatically create the ↑
delaborator and the norm_cast
attribute.
A complete definition of a coercion might look like this:
@[coe] def cast [AddGroupWithOne R] : ℤ → R := AddGroupWithOne.intCast
instance [AddGroupWithOne R] : CoeTail ℤ R where coe := cast
-
change e
defined inMathlib.Tactic.Basic
as an alias forshow e
, but theat
form andchange .. with ..
are both defined without implementation in lean 4 core. These are just missing an implementation (and they should either be PR'd to lean 4 or moved to mathlib4).
In the meantime, sometimes dsimp
works, sometimes an equivalent rw
, sometimes a revert
/change
/intro
, sometimes change … = _
.
For change <foo> at h
, replace h: <foo> := h
works.
-
to_additive
attribute doesn't correctly translate some names. Just like in Lean 3, it has issues with flipping variable order inpow -> smul
andzpow -> zsmul
. In such cases, use@[to_additive my_additive_theorem]
before the multiplicative theorem. - When applied to a structure:
to_additive
doesn't automatically translate projections to extended structures other than the first, in case that the extended structures have a field in common. In this case, manually add these tags for all projections other than the first. Example:attribute [to_additive AddCommGroup.toAddCommMonoid] CommGroup.toCommMonoid
. You can easily see which projections are translated withattribute [to_additive? ...] ...
. -
to_additive
might not reliably deal with numerals or other operations on fixed types, likeNat
. In this case, make a Zulip thread tagging @fpvandoorn. As a workaround, you can first manually state the additive version, and then the multiplicative version tagged with@[to_additive]
. - Docstrings appearing as strings after
to_additive
can be shortened to satisfy the linter by removing any explicit\n
's that appear and inserting actual newlines as necessary, as single newlines are ignored in strings.\n\n
should be replaced with two newlines.
Lean 4 has automatic generation for implicit types, so that the following code is valid without explicitly mentioning α
or β
:
variable (f : α ≃ β)
When porting this may cause errors due to mathlib3port not aligning names correctly, so it can help to turn auto-implicits off with
set_option autoImplicit false
#synth
is useful for discovering if and where instances are available (e.g. as #synth Semiring Nat
)
All mathlib maintainers are encouraged to regularly view the #queue4, and to
- ask for changes (please change
awaiting-review
toawaiting-author
), -
bors merge
, or bors d+