Skip to content
rebinsilva edited this page Apr 20, 2020 · 16 revisions

Menhir is ocaml-based LR(1) parser generator with a lot of interesting features such as parametrized production rules, nice syntactic additions over the mainstream yacc syntax and really good error reporting of parser conflicts. Being based on ocamlyacc the syntax is pretty close to fsyacc.

The main motivation for this major change are :

  • having a simpler parser (a code easier to dive in and modify), closer to the expected bnf of FStar
  • having better maintenance tools (i.e for analysing shift/reduce, reduce/reduce and end-of-stream conflicts)
  • having better error reporting of syntactic errors (not done yet...)

Modification of FStar organisation

Until now, the main parser was written in fsyacc and sed-compiled into a ocamlyacc-compatible file which was used for the ocaml based FStar. The introduction of menhir swap the situation : the official parser is written in a menhir compatible file format and then preprocessed to ocamlyacc and fsyacc.

Impact on FStar users

If you are not modifying the parser, there is currently no change to your workflow. You only need menhir (version 20161115 or later) if you start modifying the parser. In that case do not forget to regenerate the ocamlyacc and fsyacc parsers by calling make -C src/ocaml-output parser before pushing your changes. To recapitulate we now have 3 different parsers :

  • src/parser/parse.mly is the original menhir parser
  • src/parser/ml/parse.mly is the preprocessed-from-menhir ocamlyacc parser
  • src/parser/parse.fsy is the sed-processed-from-ocamlyacc fsyacc parser

and the rules to (re)generate them are in src/ocaml-output/Makefile.

Modification in the parser

A lot of small syntax fixes have been incorporated in this new parser.

Multi-binders

A long-standing request for FStar : having the possibility of writing let f (#a b #c : Type) = ... instead of the more cumbersome let f (#a : Type) (b : Type) (#c : Type) =...

In the case of a multibinder with a refinement (x y : t { phi }) no variable is bound in phi (which is different from the one-binder case).

So (x y : t { phi }) is elaborated to (x:(z:t { phi })) (x:(z:t { phi })) with z fresh, whereas (x : t { phi }) is elaborated to (x:(x:t { phi }))

val can take parameters

val is now syntactically closer to let and accepts binders before the type ascription e.g. val f : (x:int) -> (b:bool) -> unit can be written val f (x:int) (b:bool) : unit

There is still a restriction that these parameters are binders and not patterns so you still can't write val g (Some x: option int) : unit whereas this syntax is valid for let.

Definition of infix operators

The natural syntax val ( op ) : ... and let ( op ) x y = ... is now valid in order to declare and define infix operators.

Generalisation of wildcard _ in bindings

The wildcard _ can now be used in most binding constructs for example fun _ -> ....

There is currently one specific restriction in let-multibinders : in that case all the binders must be named (so let f (x _ : int) = 42 is forbidden).

Pattern-refinement

Fstar now accept refinements annotations on arbitrary patterns such as

let h (Some (x,y):(option (int * int)){x + y = 3}) = x

This annotation is desugared to the following code

let f (Some (x,y):(z:(option (int * int)){ match z with | Some (x,y) -> x + y = 3 | _ -> False})) = x

Precedence of <: modified

The ascription operator <: is now tighter than before for example

(fun x -> x <: int)

is now parsed as fun x -> (x <: int)

Note that in the previous parser, the rule for ascription was ( term <: type ) and that the parens are no longer necessary.

Other changes

There are also some regressions/syntactic fixes which might break existing code :

  • (| t |) is no longer accepted as a variant of ( t ) in particular the correct syntax for defining a dependent pair type is (x:a & b(x)) (if the parens are necessary)
  • The focusing let has been slightly modified : let ~> rec bidule = ... has been replaced by let rec ~> bidule = ...
  • The stratified specific = qualifier is not available anymore on binders i.e. the deprecated fun (=x:t) -> ... must be changed to fun ($x:t) -> ...
  • It is now possible to put parens around implicit parameters in types e.g. val u : (#a:Type) -> (x:a) -> Tot unit
Clone this wiki locally