
🐛 [Bug] Torch-TensorRT silently converts a boolean mask into an integer index containing zeros and ones #2024

Closed
airalcorn2 opened this issue Jun 16, 2023 · 3 comments
Labels: bug (Something isn't working), component: converters (Issues re: Specific op converters), No Activity

airalcorn2 commented Jun 16, 2023

Bug Description

When a PyTorch model calculates a mask using something like this:

mask = vals < mask_val

and uses it to index a Tensor like this:

vals[mask]

the Torch-TensorRT model effectively does this:

vals[mask.long()]
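To see why this matters, here is a minimal plain-PyTorch sketch (the values are made up for demonstration) contrasting boolean indexing with indexing by the cast mask:

```python
import torch

vals = torch.tensor([0.9, 0.1, 0.8, 0.2])
mask = vals < 0.5  # tensor([False, True, False, True])

# Boolean indexing selects the elements where the mask is True.
bool_indexed = vals[mask]          # tensor([0.1000, 0.2000])

# Integer indexing with the cast mask instead gathers at indices [0, 1, 0, 1].
long_indexed = vals[mask.long()]   # tensor([0.9000, 0.1000, 0.9000, 0.1000])
```

The two results differ in both shape and content, which is exactly the divergence described above.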

Replacing the mask indexing operation with torch.masked_select, e.g.:

torch.masked_select(vals, mask)

makes the Torch-TensorRT model work correctly.
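In eager PyTorch the two forms are equivalent, which is why the workaround is safe (a quick check with made-up values):

```python
import torch

vals = torch.tensor([0.9, 0.1, 0.8, 0.2])
mask = vals < 0.5

# torch.masked_select returns the same 1-D tensor as boolean indexing.
same = torch.equal(torch.masked_select(vals, mask), vals[mask])
print(same)  # True
```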

While the PyTorch model and its Torch-TensorRT counterpart happen to produce Tensors of different shapes here, a common use case for this kind of masking involves placing the masked values into a separate, fixed-size Tensor. In that case, the outputs of the PyTorch model and the Torch-TensorRT model will differ wildly, but because the mask casting happens silently, the discrepancy is difficult to debug. As far as I can tell, this behavior isn't mentioned in the documentation, and the only related material I could find when searching for this issue was this GitHub discussion.
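A minimal sketch of the fixed-size-buffer use case (the buffer size and values here are hypothetical, not from the original report): with the silent cast, the "selected" tensor has as many elements as the mask itself, so everything written into the buffer differs.

```python
import torch

vals = torch.tensor([0.9, 0.1, 0.8, 0.2])
mask = vals < 0.5

# Correct eager behavior: two values survive the mask.
selected = vals[mask]                  # tensor([0.1000, 0.2000])
buffer = torch.zeros(4)
buffer[: selected.numel()] = selected  # [0.1, 0.2, 0.0, 0.0]

# With the silent bool->long cast, four values are gathered instead,
# so the same downstream buffer ends up completely different.
bad_selected = vals[mask.long()]       # tensor([0.9, 0.1, 0.9, 0.1])
bad_buffer = torch.zeros(4)
bad_buffer[: bad_selected.numel()] = bad_selected
```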

To Reproduce

import torch
import torch_tensorrt

from torch import nn


class Masker(nn.Module):
    def __init__(self, use_masked_select, mask_val):
        super().__init__()
        self.use_masked_select = use_masked_select
        self.mask_val = mask_val

    def forward(self, vals):
        mask = vals < self.mask_val
        if self.use_masked_select:
            return torch.masked_select(vals, mask)
        else:
            return vals[mask]


torch_tensorrt.logging.set_reportable_log_level(torch_tensorrt.logging.Level.Error)

device = torch.device("cuda:0")
vals = torch.rand(20).to(device)
inputs = [torch_tensorrt.Input(vals.shape)]
mask_val = 0.5
for use_masked_select in [False, True]:
    model = Masker(use_masked_select, mask_val).to(device)
    trt_model = torch_tensorrt.compile(model, inputs=inputs)
    with torch.no_grad():
        pt_out = model(vals)
        trt_out = trt_model(vals)

        print(f"use_masked_select: {use_masked_select}")
        print(f"pt_out.shape: {pt_out.shape}")
        print(f"trt_out.shape: {trt_out.shape}")
        if use_masked_select:
            print(f"(pt_out == trt_out).all(): {(pt_out == trt_out).all()}")
        else:
            # Reproduce the buggy behavior: index with the boolean mask
            # cast to 0/1 integer indices, as Torch-TensorRT effectively does.
            mask_0_1s = (vals < mask_val).long()
            pt_out_0_1s = pt_out[mask_0_1s]
            print(f"(pt_out_0_1s == trt_out).all(): {(pt_out_0_1s == trt_out).all()}\n")

Expected behavior

Either behave the same as boolean mask indexing does in PyTorch, or raise a warning or error when a mask indexing operation is detected and instruct the user to use torch.masked_select instead.

Environment

Build information about Torch-TensorRT can be found by turning on debug messages

  • Torch-TensorRT Version (e.g. 1.0.0): 1.4.0
  • PyTorch Version (e.g. 1.0): 2.0.1+cu117
  • CPU Architecture: i7-12800H
  • OS (e.g., Linux): Linux
  • How you installed PyTorch (conda, pip, libtorch, source): pip
  • Build command you used (if compiling from source):
  • Are you using local sources or building from archives:
  • Python version: 3.10.10
  • CUDA version: 11.7
  • GPU models and configuration: GeForce RTX 3080 Ti
  • Any other relevant information:

Additional context

@airalcorn2 airalcorn2 added the bug Something isn't working label Jun 16, 2023
@narendasan narendasan added the component: converters Issues re: Specific op converters label Jun 20, 2023
@narendasan (Collaborator) commented:

We will have to look at the specific operation decomposition here, but there are some limitations around boolean inputs in TensorRT: in certain places we cast to int so that the input will be accepted.

@github-actions commented:

This issue has not seen activity for 90 days. Remove the stale label or add a comment, or this will be closed in 10 days.

@airalcorn2 (Author) commented:

I no longer experience this bug when using PyTorch 2.0.1, Torch-TensorRT 1.4.0, and CUDA 12.2, and compiling the model with:

trt_model = torch.compile(model, backend="torch_tensorrt")
