Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean up some code smells in D.S.{Promote,Single}.Defun #424

Merged
merged 1 commit into from
Nov 25, 2019

Conversation

RyanGlScott
Copy link
Collaborator

While looking at the defunctionalization code recently, I discovered several smelly aspects of the code. This patch does not fix all such code smells (I plan to tackle more of them while tackling #378), but this does pick some of the lower-hanging fruit. Here are the highlights:

  1. The reverse function is used an awful lot in D.S.Promote.Defun.defunctionalize. It's used here, when the inner loop (go) is first invoked:

    other_decs <- go (num_args - 1) (reverse m_arg_tvbs) m_res_kind

    And it's also used here, inside of go itself:

    arg_params = -- Implements part (2)(ii) from
    -- Note [Defunctionalization and dependent quantification]
    map (map_tvb_kind (substType tvb_to_type_map)) $
    reverse m_args

    This makes my head spin. Moreover, there other some parts of go that use m_args instead of using reverse m_args, which causes some odd-looking code to be generated (e.g., generating both data BlahSym2 x y :: z ~> Type and type instance Apply (BlahSym2 y x) z = BlahSym3 y x z). None of this is technially wrong, but I find it hard to follow.

    Luckily, we can get away without needing either use of reverse. Instead of processing a single list of TyVarBndrs in reverse order inside of go, we can track two lists of TyVarBndrs. To see how this works, imagine we are to generate these defunctionalization symbols:

    data BlahSym0 :: x ~> y ~> z ~> Type
    data BlahSym1    x :: y ~> z ~> Type
    data BlahSym2    x    y :: z ~> Type

    We can accomplish this by "cycling" through the TyVarBndrs with two lists: one for the arguments and another for the result kind. Or, in code:

    []     [x, y, z] -- BlahSym0
    [x]       [y, z] -- BlahSym1
    [x, y]       [z] -- BlahSym2

    Notice that at each iteration of go, we take the first TyVarBndr from the second list and append it to the end of the first. This makes go run in quadratic time, but this is not a new precedent, since go was quadratic before due to invoking reverse in each iteration. Given the choice between these two quadratic-time designs, I prefer the new one. I've applied this refactor to the go functions in D.S.Promote.Defun.defunctionalize and D.S.Single.Defun.singDefuns, and left some comments explaining what is going on.

  2. The inner loop of D.S.Promote.Defun.defunctionalize works by starting from n - 1 (where n is the number of arguments) and iterating until 0 is reached. On the flip side, the inner loop of D.S.Single.Defun.singDefuns works by starting from 0 and iterating until n - 1 is reached. For the sake of consistency, I have adopted the singDefuns convention (start from 0 and count up) for both pieces of code.

  3. The inner loops of D.S.Promote.Defun.defunctionalize and D.S.Single.Defun.singDefuns are monadic, but they don't need to be. The only monadic things they do are generate unique names, but this can be done more easily outside of the inner loops themselves. This patch refactors the code to do just that, making the inner loop code pure.

A consequence of (2) is that D.S.Promote.Defun.defunctionalize now generates defunctionalization symbols in the opposite order that it used to. This causes many, many test cases to have different expected output, making the patch appear larger than it actually is.

While looking at the defunctionalization code recently, I discovered
several smelly aspects of the code. This patch does not fix all such
code smells (I plan to tackle more of them while tackling #378), but
this does pick some of the lower-hanging fruit. Here are the
highlights:

1. The `reverse` function is used an awful lot in
   `D.S.Promote.Defun.defunctionalize`. It's used here, when the
   inner loop (`go`) is first invoked:

   https://github.com/goldfirere/singletons/blob/2ec18435b8c57afcfcec0b7da872621b1179d45f/src/Data/Singletons/Promote/Defun.hs#L299

   And it's also used here, inside of `go` itself:

   https://github.com/goldfirere/singletons/blob/2ec18435b8c57afcfcec0b7da872621b1179d45f/src/Data/Singletons/Promote/Defun.hs#L237-L240

   This makes my head spin. Moreover, there other some parts of `go`
   that use `m_args` instead of using `reverse m_args`, which
   causes some odd-looking code to be generated (e.g., generating
   both `data BlahSym2 x y :: z ~> Type` _and_
   `type instance Apply (BlahSym2 y x) z = BlahSym3 y x z`). None of
   this is technially wrong, but I find it hard to follow.

   Luckily, we can get away without needing either use of `reverse`.
   Instead of processing a single list of `TyVarBndr`s in reverse
   order inside of `go`, we can track two lists of `TyVarBndr`s.
   To how this works, imagine we are to generate these
   defunctionalization symbols:

   ```hs
   data BlahSym0 :: x ~> y ~> z ~> Type
   data BlahSym1    x :: y ~> z ~> Type
   data BlahSym2    x    y :: z ~> Type
   ```

   We can accomplish this by "cycling" through the `TyVarBndr`s with
   two lists: one for the arguments and another for the result kind.
   Or, in code:

   ```hs
   []     [x, y, z] -- BlahSym0
   [x]       [y, z] -- BlahSym1
   [x, y]       [z] -- BlahSym2
   ```

   Notice that at each iteration of `go`, we take the first
   `TyVarBndr` from the second list and append it to the end of the
   first. This makes `go` run in quadratic time, but this is not a
   new precedent, since `go` was quadratic before due to invoking
   `reverse` in each iteration. Given the choice between these two
   quadratic-time designs, I prefer the new one. I've applied this
   refactor to the `go` functions in
   `D.S.Promote.Defun.defunctionalize` and
   `D.S.Single.Defun.singDefuns`, and left some comments explaining
   what is going on.
2. The inner loop of `D.S.Promote.Defun.defunctionalize` works by
   starting from `n - 1` (where `n` is the number of arguments) and
   iterating until `0` is reached. On the flip side, the inner loop
   of `D.S.Single.Defun.singDefuns` works by starting from `0` and
   iterating until `n - 1` is reached. For the sake of consistency,
   I have adopted the `singDefuns` convention (start from `0` and
   count up) for both pieces of code.
3. The inner loops of `D.S.Promote.Defun.defunctionalize` and
   `D.S.Single.Defun.singDefuns` are monadic, but they don't need to
   be. The only monadic things they do are generate unique names, but
   this can be done more easily outside of the inner loops
   themselves. This patch refactors the code to do just that, making
   the inner loop code pure.

A consequence of (2) is that `D.S.Promote.Defun.defunctionalize` now
generates defunctionalization symbols in the opposite order that it
used to. This causes many, many test cases to have different expected
output, making the patch appear larger than it actually is.
@goldfirere
Copy link
Owner

If you're happy, I am too. I imagine the quadratic running time of the old implementation was unintentional (and perhaps unnoticed). But it's not worth losing sleep over.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants