Does Zuko allow exporting to ONNX? #45
-
Hello again,

Is it possible to export Zuko flows to ONNX? If so, do you have any example? If it is not readily possible, do you have any idea of how much effort it would take? I would be interested in trying that out.

Best,
Replies: 3 comments 13 replies
-
Hello @CaioDaumann,

I have never tried, but I think it should be possible. Looking at https://pytorch.org/tutorials/beginner/onnx/export_simple_model_to_onnx_tutorial.html, you will probably need to wrap the flow as a "pure function", that is, something that takes tensors as input and returns tensors as output. This is not the case of the `Flow` objects, which take a tensor as input and return a `Distribution`. A very thin wrapper module that takes both $c$ and $x$ as input and returns $\log p(x | c)$ is probably enough.

Also, I don't know how ONNX handles randomness, but sampling from the flow requires first sampling from the base distribution, and then transforming it. It might be easier to only export the transform to ONNX.

In any case, if you succeed, consider contributing a tutorial to the repo!
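For reference, a minimal sketch of such a wrapper (the flow configuration, class name, and shapes below are illustrative assumptions, not from this discussion):

```python
import torch
import zuko


class LogProbWrapper(torch.nn.Module):
    """Thin wrapper exposing the flow as a pure function (c, x) -> log p(x | c)."""

    def __init__(self, flow: zuko.flows.Flow):
        super().__init__()
        self.flow = flow

    def forward(self, c: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
        # Conditioning on c returns a Distribution; log_prob maps tensors to tensors.
        return self.flow(c).log_prob(x)


# Hypothetical usage: a 2-feature flow conditioned on 3 context features.
flow = zuko.flows.NSF(features=2, context=3, transforms=3, hidden_features=(64, 64))
wrapper = LogProbWrapper(flow)
log_p = wrapper(torch.randn(8, 3), torch.randn(8, 2))  # shape (8,)
```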
-
Hi @francois-rozet, @dpbigler,

Getting back to this, I spent some time trying to export a Zuko model to ONNX and I came up with something like this:

```python
import torch
import torch.utils.data as data
import zuko
import numpy as np
import onnxruntime as ort


class WrappedNSF(torch.nn.Module):
    def __init__(self):
        super(WrappedNSF, self).__init__()
        self.flow = zuko.flows.NSF(features=2, transforms=3, hidden_features=(64, 64))

    def forward(self, x):
        result = self.flow().transform(x)
        return result


def two_moons(n: int, sigma: float = 1e-1):
    theta = 2 * torch.pi * torch.rand(n)
    label = (theta > torch.pi).float()

    x = torch.stack((
        torch.cos(theta) + label - 1 / 2,
        torch.sin(theta) + label / 2 - 1 / 4,
    ), axis=-1)

    return torch.normal(x, sigma), label


samples, labels = two_moons(16384)
samples_tensor = samples.clone().detach()

trainset = data.TensorDataset(samples, labels)
trainloader = data.DataLoader(trainset, batch_size=64, shuffle=True)

model = WrappedNSF()
model.eval()

dummy_input = torch.randn(1, 2)
output = model(dummy_input)
print("Sample output:", output)

try:
    torch.onnx.export(
        model,                      # Wrapped model instance
        dummy_input,                # Model input (or a tuple for multiple inputs)
        "wrapped_flow_model.onnx",  # Output ONNX file path
        export_params=True,         # Store the trained parameter weights inside the model file
        opset_version=17,           # ONNX version to export the model to
        do_constant_folding=True,   # Optimization: constant folding
        input_names=['input'],      # Model's input names
        output_names=['output'],    # Model's output names
        dynamic_axes={'input': {0: 'batch_size'},    # Variable length axes
                      'output': {0: 'batch_size'}},
    )
    print("Model exported successfully.")
except Exception as e:
    print("Failed to export model:", str(e))
```

But this returns the following error:
Any ideas here, or should I open an issue in PyTorch as the error message suggests?
-
Hi @francois-rozet,

Coming back to this, I implemented a custom search-sorted function that is ONNX-friendly, and now I can convert the "custom" NSF model to ONNX. The current implementation is as follows:

```python
import torch
from torch import nn, Tensor, LongTensor
import math
from math import pi
from torch.distributions import Transform, constraints
from typing import Any, Callable, Dict, Iterable, List, Sequence, Tuple, Union
import torch.nn.functional as F
from zuko.flows import MAF
def broadcast(*tensors: Tensor, ignore: Union[int, Sequence[int]] = 0) -> List[Tensor]:
    r"""Broadcasts tensors together.

    The term broadcasting describes how PyTorch treats tensors with different shapes
    during arithmetic operations. In short, if possible, dimensions that have
    different sizes are expanded (without making copies) to be compatible.

    Arguments:
        tensors: The tensors to broadcast.
        ignore: The number(s) of dimensions not to broadcast.

    Returns:
        The broadcasted tensors.

    Example:
        >>> x = torch.rand(3, 1, 2)
        >>> y = torch.rand(4, 5)
        >>> x, y = broadcast(x, y, ignore=1)
        >>> x.shape
        torch.Size([3, 4, 2])
        >>> y.shape
        torch.Size([3, 4, 5])
    """

    if isinstance(ignore, int):
        ignore = [ignore] * len(tensors)

    dims = [t.dim() - i for t, i in zip(tensors, ignore)]
    common = torch.broadcast_shapes(*(t.shape[:i] for t, i in zip(tensors, dims)))

    return [torch.broadcast_to(t, common + t.shape[i:]) for t, i in zip(tensors, dims)]
def onnx_friendly_searchsorted(seq: Tensor, value: Tensor) -> LongTensor:
    """Custom implementation replacing torch.searchsorted, which is not ONNX "friendly"
    (torch.searchsorted(seq, value).squeeze(dim=-1)).

    Compatible with ONNX and reproduces the exact behavior and output shapes.

    Args:
        seq (Tensor): Sorted tensor of shape (..., S).
        value (Tensor): Tensor containing values to insert, of shape (..., 1).

    Returns:
        LongTensor: Indices of shape (...), matching torch.searchsorted(seq, value).squeeze(dim=-1).
    """

    # Ensure tensors are contiguous
    seq = seq.contiguous()
    value = value.contiguous()

    # Use torch.sum to count the number of elements in seq less than value.
    # The comparison seq < value results in a boolean tensor of shape (..., S);
    # summing over the last dimension (-1) gives indices of shape (...).
    indices = torch.sum(seq < value, dim=-1)

    return indices
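
# Note (added remark, not part of the original snippet): for a sorted `seq`, counting
# the entries strictly smaller than `value` equals the left insertion index, so this
# matches torch.searchsorted(seq, value).squeeze(dim=-1) with the default right=False:
#   seq = torch.tensor([[0., 1., 2., 3.]]); value = torch.tensor([[1.5]])
#   torch.sum(seq < value, dim=-1)                  # tensor([2])
#   torch.searchsorted(seq, value).squeeze(dim=-1)  # tensor([2])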
class MonotonicRQSTransform_(Transform):
    r"""Creates a monotonic rational-quadratic spline (RQS) transformation.

    References:
        | Neural Spline Flows (Durkan et al., 2019)
        | https://arxiv.org/abs/1906.04032

    Arguments:
        widths: The unconstrained bin widths, with shape :math:`(*, K)`.
        heights: The unconstrained bin heights, with shape :math:`(*, K)`.
        derivatives: The unconstrained knot derivatives, with shape :math:`(*, K - 1)`.
        bound: The spline's (co)domain bound :math:`B`.
        slope: The minimum slope of the transformation.
    """

    domain = constraints.real
    codomain = constraints.real
    bijective = True
    sign = +1

    def __init__(
        self,
        widths: Tensor,
        heights: Tensor,
        derivatives: Tensor,
        bound: float = 5.0,
        slope: float = 1e-4,
        **kwargs,
    ):
        super().__init__(**kwargs)

        widths = widths / (1 + abs(2 * widths / math.log(slope)))
        heights = heights / (1 + abs(2 * heights / math.log(slope)))
        derivatives = derivatives / (1 + abs(derivatives / math.log(slope)))

        widths = F.pad(F.softmax(widths, dim=-1), (1, 0), value=0)
        heights = F.pad(F.softmax(heights, dim=-1), (1, 0), value=0)
        derivatives = F.pad(derivatives, (1, 1), value=0)

        self.horizontal = bound * (2 * torch.cumsum(widths, dim=-1) - 1)
        self.vertical = bound * (2 * torch.cumsum(heights, dim=-1) - 1)
        self.derivatives = torch.exp(derivatives)

    def __repr__(self) -> str:
        return f"{self.__class__.__name__}(bins={self.bins})"

    @property
    def bins(self) -> int:
        return self.horizontal.shape[-1] - 1

    def bin(self, k: LongTensor) -> Tuple[Tensor, ...]:
        mask = torch.logical_and(0 <= k, k < self.bins)

        k = k % self.bins
        k0_k1 = torch.stack((k, k + 1))

        k0_k1, hs, vs, ds = broadcast(
            k0_k1[..., None],
            self.horizontal,
            self.vertical,
            self.derivatives,
            ignore=1,
        )

        x0, x1 = hs.gather(-1, k0_k1).squeeze(dim=-1)
        y0, y1 = vs.gather(-1, k0_k1).squeeze(dim=-1)
        d0, d1 = ds.gather(-1, k0_k1).squeeze(dim=-1)

        s = (y1 - y0) / (x1 - x0)

        return mask, x0, x1, y0, y1, d0, d1, s
    @staticmethod
    def searchsorted(seq: Tensor, value: Tensor) -> LongTensor:
        seq, value = broadcast(seq, value.unsqueeze(dim=-1), ignore=1)
        seq = seq.contiguous()

        # Uses a non-torch implementation of searchsorted that enables export to ONNX
        return onnx_friendly_searchsorted(seq, value)

    def _call(self, x: Tensor) -> Tensor:
        k = self.searchsorted(self.horizontal, x) - 1
        mask, x0, x1, y0, y1, d0, d1, s = self.bin(k)

        z = mask * (x - x0) / (x1 - x0)
        y = y0 + (y1 - y0) * (s * z**2 + d0 * z * (1 - z)) / (s + (d0 + d1 - 2 * s) * z * (1 - z))

        return torch.where(mask, y, x)

    def _inverse(self, y: Tensor) -> Tensor:
        k = self.searchsorted(self.vertical, y) - 1
        mask, x0, x1, y0, y1, d0, d1, s = self.bin(k)

        y_ = mask * (y - y0)

        a = (y1 - y0) * (s - d0) + y_ * (d0 + d1 - 2 * s)
        b = (y1 - y0) * d0 - y_ * (d0 + d1 - 2 * s)
        c = -s * y_

        z = 2 * c / (-b - (b**2 - 4 * a * c).sqrt())
        x = x0 + z * (x1 - x0)

        return torch.where(mask, x, y)

    def log_abs_det_jacobian(self, x: Tensor, y: Tensor) -> Tensor:
        _, ladj = self.call_and_ladj(x)
        return ladj

    def call_and_ladj(self, x: Tensor) -> Tuple[Tensor, Tensor]:
        k = self.searchsorted(self.horizontal, x) - 1
        mask, x0, x1, y0, y1, d0, d1, s = self.bin(k)

        z = mask * (x - x0) / (x1 - x0)
        y = y0 + (y1 - y0) * (s * z**2 + d0 * z * (1 - z)) / (s + (d0 + d1 - 2 * s) * z * (1 - z))

        jacobian = (
            s**2
            * (2 * s * z * (1 - z) + d0 * (1 - z) ** 2 + d1 * z**2)
            / (s + (d0 + d1 - 2 * s) * z * (1 - z)) ** 2
        )

        return torch.where(mask, y, x), mask * jacobian.log()
class NSF_(MAF):
    r"""Creates a neural spline flow (NSF) with monotonic rational-quadratic spline
    transformations.

    By default, transformations are fully autoregressive. Coupling transformations
    can be obtained by setting :py:`passes=2`.

    Warning:
        Spline transformations are defined over the domain :math:`[-5, 5]`. Any feature
        outside of this domain is not transformed. It is recommended to standardize
        features (zero mean, unit variance) before training.

    See also:
        :class:`zuko.transforms.MonotonicRQSTransform`

    References:
        | Neural Spline Flows (Durkan et al., 2019)
        | https://arxiv.org/abs/1906.04032

    Arguments:
        features: The number of features.
        context: The number of context features.
        bins: The number of bins :math:`K`.
        kwargs: Keyword arguments passed to :class:`zuko.flows.autoregressive.MAF`.
    """

    def __init__(
        self,
        features: int,
        context: int = 0,
        bins: int = 8,
        **kwargs,
    ):
        super().__init__(
            features=features,
            context=context,
            univariate=MonotonicRQSTransform_,
            shapes=[(bins,), (bins,), (bins - 1,)],
            **kwargs,
        )
```

And I wrapped the model as I did before, and it works. Here are some performance comparisons between the Zuko model and the "custom" ONNX-friendly NSF:

Can I have your opinion on this? I can happily produce more validations/tests and write a more detailed tutorial about exporting it to ONNX. Let me know if I can help with this.