
Refactored rational-quadratic spline transforms to run faster #44

Open · wants to merge 1 commit into base: master
Conversation

@vsimkus vsimkus commented Jun 10, 2021

Hi,

I've noticed that the methods in rational_quadratic.py can be easily refactored to make them run ~25% faster.

The main change in unconstrained_rational_quadratic_spline is to avoid masked select, which can be quite inefficient with dense masks since it must gather all the unmasked elements into a new tensor. To do a masked insert into a preallocated zero tensor, it is generally cheaper to multiply the input tensor by the mask and add the result to the target tensor, which is what this PR does.
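A hypothetical illustration of the trade-off described above (not the exact code from this PR): both variants produce the computed values at the "inside" elements and zeros elsewhere, but the second avoids the gather/scatter round trip.

```python
import torch

torch.manual_seed(0)
x = torch.randn(1000)
mask = x.abs() < 1.0  # dense boolean mask

# Original style: gather the masked elements, compute, scatter them back.
out_masked = torch.zeros_like(x)
out_masked[mask] = torch.tanh(x[mask])

# This PR's style: compute densely over the whole tensor, then zero out the
# masked-off elements by multiplying with the mask.
out_dense = torch.tanh(x) * mask

assert torch.allclose(out_masked, out_dense)
```

With a dense mask, the gather and scatter each touch most of the tensor anyway, so the masked-select variant pays for index bookkeeping without saving much compute.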

I've also made a couple of changes in rational_quadratic_spline to how the widths, heights, cumwidths, and cumheights tensors are computed. The refactored implementation removes some redundant operations from the original implementation.
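For context, a hedged sketch of the cumulative-width computation in a rational-quadratic spline (variable names follow the NSF reference code; the exact refactor in this PR may differ). The point is that one cumsum plus one pad yields the bin edges, so any scaling or shifting into the spline domain can be fused into a single expression rather than several separate passes over the tensor.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
unnormalized_widths = torch.randn(4, 8)  # (batch, num_bins)
num_bins = unnormalized_widths.shape[-1]
min_bin_width = 1e-3

# Normalize to bin widths that sum to 1, with a minimum width per bin.
widths = F.softmax(unnormalized_widths, dim=-1)
widths = min_bin_width + (1 - min_bin_width * num_bins) * widths

# Bin edges in [0, 1]: a single cumsum, padded with a leading zero.
cumwidths = F.pad(torch.cumsum(widths, dim=-1), pad=(1, 0), value=0.0)
```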

The rational-quadratic spline flow as used in the NSF paper runs about 25% faster with these changes. I think some further improvements could be achieved if searchsorted were replaced with torch.searchsorted, or run with the custom CUDA kernel as described in #19, but I haven't touched it since it would affect the other spline flows too.
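A sketch of what that replacement could look like (variable names are illustrative, not taken from the PR): the hand-rolled bin search counts how many edges each input has passed, which matches torch.searchsorted with right=True.

```python
import torch

torch.manual_seed(0)
cumwidths = torch.linspace(0.0, 1.0, 9).expand(4, 9).contiguous()  # sorted bin edges per row
inputs = torch.rand(4, 1)

# Hand-rolled search: count the edges each input has passed.
idx_original = (inputs >= cumwidths).sum(dim=-1) - 1

# Built-in alternative; right=True mirrors the ">=" comparison above.
idx_builtin = torch.searchsorted(cumwidths, inputs, right=True).squeeze(-1) - 1

assert torch.equal(idx_original, idx_builtin)
```

The built-in does a binary search per element instead of comparing against every edge, which should matter more as the number of bins grows.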

I suppose the other spline flow methods can be refactored in a similar way. If you'd prefer I can make the necessary changes to them too in this PR.

Best,
Vaidotas

```python
else:
    raise RuntimeError("{} tails are not implemented.".format(tails))

if torch.any(inside_interval_mask):
```
vsimkus (Author) commented:
Also, I removed this check, since I expect it to evaluate to true most of the time: you would normally expect at least some inputs to fall inside the domain. Let me know if you'd prefer it added back.

vsimkus (Author) commented:

I see this was added in #25. With the new implementation there won't be any crashes with all-tails inputs either. (The computations will essentially be wasted in that case, but I don't suppose we're expecting many calls with all-tails inputs.)
