Skip to content
Tomer Libal edited this page Jun 29, 2017 · 12 revisions

Two Phase Type Checker

It was decided to try and apply the two phases directly in one of the following two ways:

  1. Apply it per toplevel let definition - here we apply it separately on each toplevel let and let rec definition in TcTerm.fs:tc_maybe_toplevel_term.
  2. Apply it per module

Below we experiment with the first option.

Applying two phases per top level let definition

The idea here is to call once in lax mode. This will infer types but will not discharge constraints. Then we call the type checker again on the resulted definition (with inferred types). Right now we apply it on the cases dealing with top level let and letrec definitions.

Problems

  • The first problem was encountered when trying to deal with universes in top level letrecs. The approach above showed that universe variables are not being opened. Indeed, the current implementation just avoided opening top level letrecs and instead trusted constraints about recursive calls. Since universes were added afterwards, they were ignored as well. The fix was to open only the universes in this case.

This is done in Syntax.Subst.fs in the opening and closing of letrecs.

  • The above introduced another problem (in fact two but the second - about explicit vs implicit universes -has disappeared when I switched to the guido_tactics branch, as recommended by Nik). In order to prevent polymorphic recursions, in TcTerm.mk_letrec_env, we just ignore the universe before pushing the letrec into the env. The problem is that when we try to recursively apply it, the universe is instantiated in the application but there is no universe in the definition obtained from the env. We get therefore that there is a mismatch between the number of universes between the definition and the term we apply. One solution for that is to prevent instantiation on application of letrecs (Syntax.Util.close_univs_and_mk_letbinding). The problem then is that the normalizer needs to be changed to ignore the universes as well. A better solution seems to be to keep the instantiations as well as adding the the universes when building the letrec_env. Then we might allow a polymorphic recursion and we prevent this by checking, on all recursive (mutually recursive all have the same universes so they are included implicitly) calls, that they are applied to exactly the same list of universes names. The list of all letrecs can be obtained from the environment.

Notes

  • In order to use the ocaml debugger, the dependency on Pervasive must be remove in Parser.Dep. Apparently its Lax checking takes too long, at least on my machine.

Old thread

Summary

The relationship between discharging guards and type checking seems pretty complex. At different points in type checking, guards are generated. In some places, the type checking algorithm is asking to discharge guards and is waiting for results. Some of these requests target also SMT. Some others also loop between SMT and unification until a fix point is achieved.

Work should be done in understanding why some cases require SMT while others dont.

Discussions with people on the team hinted that the first phase will do unification only while the second phase will do SMT solving. From the description of the task on the roadmap, it seems that the first phase will deal with universes, types, implicit arguments and effect labels while the second will deal with WPs.

Clearly, from looking at the code, the two are mixed together. For example, inferring implicits applies both unification and SMT in a loop.

Conclusion

Unification is completely separated from SMT. A constraint is given to unification and the result to SMT and vice versa but there is no backtracking between SMT and unification. It means that the unification algorithm can be implemented as a separate entity. The benefits will be a dedicated unification algorithm for the F* language which is based on well-established papers and algorithms used in other dependent typed languages.

In order to efficiently separate the two phases, one needs first to understand if SMT is required to the constraints which are planned for the first phase. I.e. inferring implicit arguments really requires SMT?

Some subtype relations require SMT as well. Can they be dealt with by a stronger unification algorithm? Should they be deferred to the second phase?

Aseem's approach was to run it first in Lax mode in order for the SMT call to always succeed and then run it again with SMT turned on.

An attempt at approximating what discharging guards is doing

A more technical but partial description can be found here.

The type checking code generates and discharge guards using the Rel file. When trying to discharge guards, unification is always applied first in order to compute a list of unresolved constraints. If SMT is supported, unsolved constraints are then being forwarded to Z3.

There is a loop calling unification, then SMT, then unification, ....

Guards are conjunctions and implications of problems which are either typing problems or computation problems.

During unification, some problems are being deferred and when the non-deferred ones are all solved, unification is applied recursively. There is a mechanism checking if we have reached a fixed point when applying unification (using a counter which is incremented every time a substitution was applied, i.e. a problem was changed).

According to a remark, deferred can contain non-patterns and some subtype constraints containing variables.

Each problem is prioritized according to its rank since we always want to solve simple problems first (rigid-rigid, etc.).

There are two types of constraints - computation and typing.

There are 2 types of typing constraints - equality and subtyping.

Each guard can be defined to be trivial or non trivial, what does it mean? It seems that non-trivial guards have additional logical constraints which are used as rewriting rules.

The unification algorithms has options for solving patterns and non-patterns, equalities and subtyping relations.

A flex-flex subtype relation (even pattern) is always deferred if possible, if not, it is considered as equality and there is a loss of generality. (commented in code).

First or higher-order flex-rigid constraints where the rigid is a constant are always treated as equality. There is a remark that in case of arrows delaying destroys arity. This is also a loss of generality using the example from the above comment (not commented in code).

An attempt in understanding the relationship between type checking and discharging guards

Discharging guards is done using different methods during type checking. Some also expect SMT involvement if allowed.

discharge_guard (with SMT)

This method applies once unification and then SMT and is called when

  • check_inductive_well_typeness
  • tc_abs
  • tc_eqn
  • check_top_level_let_rec
  • building a let rec environment

resolve_implicits

This method loop on unification, SMT, unification, .. but only on resolving implicits

  • tc_eqn
  • check_top_level_let
  • check_top+level_let_rec
  • build_let_rec_env

solve_deferred_constraints

Is applying only unification

  • tc_abs
  • check_application_args
  • check_top_level_let
  • check_top_level_let_rec
  • tc_tot_or_gtot_term
  • universe_of_aux

force_trivial_guard

Tries both solve_deferred_constraints and resolve_implicits from above and occurs in many places. It asserts that the guard is solvable.

Clone this wiki locally