Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove polymorphic variants using module namespacing #487

Closed

Conversation

jstolarek
Copy link
Contributor

@jstolarek jstolarek commented Feb 19, 2019

A while back I started an email thread about warnings generated by the compiler when several datatypes share the same constructor. In such cases compiler uses type-directed disambiguation and for the most time it guesses wrong. Ideas to solve the problem included:

  1. using type annotations
  2. putting datatype definitions into modules
  3. prefixing constructor names so that they are different

I wasn't happy with (1) because it meant awkward clutter. Sam pointed out that (3) is a poor-man's namespacing. So I decided to experiment with (2) to see how it works out. I placed pattern and patternode datatypes in a Pattern module like this:

module Pattern = struct
  type node =
  | Any
  | Nil
  | Cons     of t * t
  | List     of t list
  | Variant  of name * t option
  | Effect   of name * t list * t
  | Negative of name list
  | Record   of (name * t) list * t option
  | Tuple    of t list
  | Constant of constant
  | Variable of binder
  | As       of binder * t
  | HasType  of t * datatype'
   [@@deriving show]
  and t = node with_pos
   [@@deriving show]
end

and then changed all the code accordingly. Thoughts, questions and observations:

  1. This approach adds some clutter to the code. Most pattern matches are now prefixed with let open Pattern in, unless we only match on one or two constructors, in which case I used name quantification. Even with let open quantification is sometimes required because OCaml is not very good at type inference:
let open Pattern in
match p with
| Pattern.Any -> ()
| Nil -> ()
| _ -> ()

Here if Any isn't prefixed the compiler complains about type-directed disambiguation. An alternative solution is to put Nil branch first, because Nil does not appear in other datatypes (yet):

let open Pattern in
match p with
| Nil -> ()
| Any -> ()
| _ -> ()

This generally works, but it might force us to order branches in an unintuitive order. Another annoying part was the parser, where every use of constructor has to be prefixed.

Is all that clutter acceptable? Perhaps a shorter module name (Pat?) would be better?

  1. Pattern.t is now a pattern with position, while Pattern.node is just a pattern. Is that intuitive? Now that I wrote this I am having thoughts that a more intuitive naming would be to have Pattern.t as the pattern datatype with all the constructors and Pattern.with_pos as pattern with position.

  2. In Sugartypes there is a long series of definitions of mutually recursive datatypes that include phrase, binding, and all the accompanying datatype definitions. Within that group two constructors in cp_phrasenode (Select, Offer) are repeated in other datatypes. I am leaning strongly towards prefixing all constructors in cp_phrasenode with CP. This would mean that the whole recursive group can remain as is, without the need of putting all the datatype definitions into mutually recursive modules.

  3. Unrelated to this PR, I think some datatypes would perhaps be better kept as polymorphic variants. I am in particular thinking about freedom. The reason is its interaction with Types.Vars.flavour and subtype casting from freedom to flavour (e.g. in free_bound_type_vars). While a direct cast can be easily replaced with a function (a small runtime expense), we also seem to be using subtyping on data structures. From my understanding of code this would require rebuilding the whole data structure just to perform a cast on elemnents stored inside, which is potentially a large runtime expense.

Any thoughts and feedback welcome.

Pattern.t is now Pattern.with_pos, Pattern.node is now Pattern.t
This will allow to avoid name clashes after converting phrasenode
datatype to an ordinary variant datatype (instead of current polymorphic
variant implementation)
@frank-emrich
Copy link
Contributor

Here if Any isn't prefixed the compiler complains about type-directed disambiguation. An alternative solution is to put Nil branch first, because Nil does not appear in other datatypes (yet)

I think this is a good argument to settle the debate on type-directed disambiguation by just accepting that we use it and disable the corresponding warning.

I am leaning strongly towards prefixing all constructors in `cp_phrasenode` with `CP`.

This sounds like a reasonable thing to do and much better than creating mutually recursive modules.

Is all that clutter acceptable? Perhaps a shorter module name (Pat?) would be better?

I would rather have some clutter than trying to be clever about the ordering of patterns. Once we allow type-directed disambiguation, the local open and occasional prefexing of constructors with module names should be fine IMHO.

@jstolarek
Copy link
Contributor Author

jstolarek commented Feb 20, 2019

I think this is a good argument to settle the debate on type-directed disambiguation by just accepting that we use it and disable the corresponding warning.

We can't disable the warnings because they are right. Like I said, with type-directed disambiguation the compiler picks one of the datatypes in scope, and it usually picks the wrong one.

I pushed the refactoring further and redefined almost all sugartypes as ordinary variants. Few minor definitions remain and I am working on them. I'd say that the amount of extra clutter is reasonable and acceptable.

Also, put all the datatype-related datatype definitions into a module
@jamescheney
Copy link
Contributor

This looks like progress to me. How close does it get us to being able to reconsider using visitors and alphalib (#255)?

@jstolarek
Copy link
Contributor Author

Note to self and @rudihorn: lens_opeartors.ml says

let from_links v =
   match v with
   | `Minus -> Minus
   | `FloatMinus -> Minus
   | `Not -> Not
   | `Name name -> Name name

But `Not is not introduced anywhere in Links source code, and thus the corresponding Not operator seems dead.

@jstolarek
Copy link
Contributor Author

@rudihorn, I got stuck on refactoring unary operators in Sugartypes due to the way the code interacts with lenses stuff. The problem is with Lens_operators module. Previously it pulled operator tags out of thin air, but now it needs to rely on definitions in Sugartypes, which creates a cycle between Sugartypes, Types and Lens_operators. Sugartypes has to rely on Types. Since Types now relies on lenses stuff it seems that all lenses definitions should go from Types to Sugartypes. This looks messy. You are more familiar with the structure of lens code - what are your thoughts? Would you be able to take a look at the code or perhaps sit down with me tomorrow before the Links meeting? The code is currently on js-polymorphic-variants-unary-ops-stuck-on-lenses branch - changed up to a point where the compilation trips on the lenses stuff.

@jstolarek
Copy link
Contributor Author

@rudihorn, you can ignore the above question now. I've found a good way to break the module cycle.

@jstolarek
Copy link
Contributor Author

This PR is ready for review. Things of interest:

  • new module CommonTypes contains simple types used across the compiler + some helper functions. See whether you like what you see there because you'll be encountering and using it a lot.
  • Sugartypes.freedom remains as a polymorphic variant due to it's interaction with Types.Vars.flavour.
  • the interesting changes are in Sugartypes.ml. Operators have been moved to Operators.ml due to interaction with lenses.
  • all the remaining changes are fallout.

If there are no objections I'd like to merge on Monday.

Copy link
Contributor

@frank-emrich frank-emrich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To stay in today's metaphor: You're doing god's work, Janek!

| `Table (_x, _x_i1, _x_i2) ->
let _x_i1 = o#datatype _x_i1 in Mu (_x, _x_i1)
| Forall (_x, _x_i1) ->
let _x = _x in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason for keeping this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely no. Good catch. Fixed.

end

(* Convenient aliases for constructing values *)
let resAny = Restriction.Any
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're essentially introducing aliases for the more generically named variants like Any, by introducing linAny, resAny. In my opinion, this just shows that those constructors should be renamed. Effectively, only the aliases you define here are used in the rest of the code, so we could just rename the underlying constructors. What do the others think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only the aliases you define here are used in the rest of the code

To be more precise: only aliases are used for constructing values. For deconstructing them (pattern matching) we still use constructors, for these types usually with explicit qualification. As for renaming the constructors to ResAny, LinAny, etc. - because I understand that's what you're proposing - I wonder what others think. Sam criticized this as poor man's namespaces, so I went with modules. Perhaps a middle ground here is shortening the names of modules to Lin, DecLin, Res, PrKind? We could then remove the aliases because spelling out the module name wouldn't be that painful.

Rant: I wish OCaml had pattern synonyms.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I quite like the prefix, it is informative. But I wouldn't be opposed to the good ol' single letter prefix on constructor names either.

@dhil
Copy link
Member

dhil commented Feb 21, 2019

If there are no objections I'd like to merge on Monday.

I am slightly concerned about rebasing the proper-modules branch on top of this. Although, I suppose I'll just leave that task to @frank-emrich...

core/commonTypes.ml Outdated Show resolved Hide resolved
core/commonTypes.ml Outdated Show resolved Hide resolved
core/commonTypes.ml Outdated Show resolved Hide resolved
core/commonTypes.ml Outdated Show resolved Hide resolved
core/commonTypes.ml Outdated Show resolved Hide resolved
core/commonTypes.ml Outdated Show resolved Hide resolved
core/commonTypes.ml Outdated Show resolved Hide resolved
core/commonTypes.ml Outdated Show resolved Hide resolved
core/commonTypes.ml Outdated Show resolved Hide resolved
core/commonTypes.ml Outdated Show resolved Hide resolved
@jstolarek
Copy link
Contributor Author

@dhil: thanks for feedback, all fixed now.

@jstolarek jstolarek requested a review from dhil February 22, 2019 09:05
@jstolarek jstolarek changed the title Remove polymorphic variants using module namespacing - an experiment Remove polymorphic variants using module namespacing Feb 22, 2019
@jstolarek jstolarek closed this in 1964e58 Feb 25, 2019
@jstolarek jstolarek deleted the js-polymorphic-variants-with-module-namespaces branch February 26, 2019 10:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants