
Assertion bound >= 0 failed in TensorRT 8.6.1 when running build_serialized_network on NVIDIA Tesla V100 #3639

Closed

elch10 opened this issue Jan 29, 2024 · 30 comments

Labels: internal-bug-tracked (Tracked internally, will be fixed in a future release.), triaged (Issue has been triaged by maintainers)

elch10 commented Jan 29, 2024

Description

I am trying to convert a small modification of the VITS model (https://github.com/jaywalnut310/vits), but I get an error when running builder.build_serialized_network:

[01/29/2024-13:52:34] [TRT] [I] Graph optimization time: 0.629615 seconds.
[01/29/2024-13:52:34] [TRT] [W] BuilderFlag::kENABLE_TACTIC_HEURISTIC has been ignored in this builder run. This feature is only supported on Ampere and beyond.
[01/29/2024-13:52:34] [TRT] [V] Building graph using backend strategy 0
[01/29/2024-13:52:34] [TRT] [I] Timing cache disabled. Turning it on will improve builder speed.
[01/29/2024-13:52:34] [TRT] [V] Constructing optimization profile number 0 [1/1].
[01/29/2024-13:52:34] [TRT] [E] 2: Assertion bound >= 0 failed. 
[01/29/2024-13:52:34] [TRT] [E] 2: [shapeContext.cpp::checkVolume::2923] Error Code 2: Internal Error (Assertion bound >= 0 failed. )

Environment

TensorRT Version: 8.6.1

NVIDIA GPU: NVIDIA Tesla V100

NVIDIA Driver Version: 450.216.04

CUDA Version: 11.6

CUDNN Version: 8.9

Operating System: Ubuntu 22.04.3 inside Docker Container

Python Version (if applicable): 3.11

PyTorch Version (if applicable): 1.13.1

Steps To Reproduce

Have you tried the latest release?: yes

Can this model run on other frameworks? For example run ONNX model with ONNXRuntime (polygraphy run <model.onnx> --onnxrt): yes

elch10 commented Jan 29, 2024

What does that error mean? How can I debug this?

zerollzeng (Collaborator) commented:

Does it work with onnxruntime? You can check quickly with polygraphy run model.onnx --onnxrt. If yes, could you please provide a reproduction? Thanks!

zerollzeng self-assigned this Jan 30, 2024
zerollzeng added the triaged (Issue has been triaged by maintainers) label Jan 30, 2024

elch10 commented Jan 30, 2024

Of course, it works with onnxruntime and Polygraphy. Polygraphy output:

[I] RUNNING | Command: /home/user/conda/envs/ekerimov-convert/bin/polygraphy run onnx_500k/generator.onnx --onnxrt
[I] onnxrt-runner-N0-01/29/24-15:45:19  | Activating and starting inference
[I] Creating ONNX-Runtime Inference Session with providers: ['CPUExecutionProvider']
[W] Input tensor: text_emb [shape=BoundedShape(['batch_axis', 'text_axis', 192], min=None, max=None)] | Will generate data of shape: [1, 1, 192].
    If this is incorrect, please provide a custom data loader.
[W] Input tensor: q_labels [shape=BoundedShape(['batch_axis', 'text_axis', 5], min=None, max=None)] | Will generate data of shape: [1, 1, 5].
    If this is incorrect, please provide a custom data loader.
[W] Input tensor: bert_emb [shape=BoundedShape(['batch_axis', 'token_axis', 768], min=None, max=None)] | Will generate data of shape: [1, 1, 768].
    If this is incorrect, please provide a custom data loader.
[W] Input tensor: speaker_ids [shape=BoundedShape(['batch_axis'], min=None, max=None)] | Will generate data of shape: [1].
    If this is incorrect, please provide a custom data loader.
[W] Input tensor: length_scale [shape=BoundedShape(['batch_axis', 'text_axis'], min=None, max=None)] | Will generate data of shape: [1, 1].
    If this is incorrect, please provide a custom data loader.
[W] Input tensor: noise_scale [shape=BoundedShape(['batch_axis'], min=None, max=None)] | Will generate data of shape: [1].
    If this is incorrect, please provide a custom data loader.
[W] Input tensor: noise_scale_w [shape=BoundedShape(['batch_axis'], min=None, max=None)] | Will generate data of shape: [1].
    If this is incorrect, please provide a custom data loader.
[I] onnxrt-runner-N0-01/29/24-15:45:19 
    ---- Inference Input(s) ----
    {text_emb [dtype=float32, shape=(1, 1, 192)],
     q_labels [dtype=int64, shape=(1, 1, 5)],
     bert_emb [dtype=float32, shape=(1, 1, 768)],
     speaker_ids [dtype=int64, shape=(1,)],
     length_scale [dtype=float32, shape=(1, 1)],
     noise_scale [dtype=float32, shape=(1,)],
     noise_scale_w [dtype=float32, shape=(1,)]}
[I] onnxrt-runner-N0-01/29/24-15:45:19 
    ---- Inference Output(s) ----
    {wav [dtype=float32, shape=(1, 1, 1024)],
     attn [dtype=float32, shape=(1, 4, 1)]}
[I] onnxrt-runner-N0-01/29/24-15:45:19  | Completed 1 iteration(s) in 67.58 ms | Average inference time: 67.58 ms.
[I] PASSED | Runtime: 3.193s | Command: /home/user/conda/envs/ekerimov-convert/bin/polygraphy run onnx_500k/generator.onnx --onnxrt

zerollzeng (Collaborator) commented:

Could you please provide a reproduction? Thanks!

zerollzeng (Collaborator) commented:

It would be great if you could try TRT 9.2/9.3 first.

elch10 commented Feb 2, 2024

Is there a Python wheel for TRT 9.2/9.3, or do I need trtexec?

zerollzeng (Collaborator) commented:

The Python wheel should be shipped with the tar package.

elch10 commented Feb 7, 2024

I couldn't find a wheel in the tar package from the current repo. I did find wheels in the archives at https://developer.nvidia.com/nvidia-tensorrt-8x-download, but the latest version there is also 8.6.1.

I uploaded the ONNX model to reproduce: https://drive.google.com/file/d/1nlXTliLV9M7_Z1xiQnUXYP_p8UqbEUBk/view?usp=sharing

elch10 commented Feb 7, 2024

And use this code:

# %%
import tensorrt as trt
import onnx

logger = trt.Logger(trt.Logger.VERBOSE)
builder = trt.Builder(logger)
# An explicit-batch network is required for ONNX models with dynamic shapes
network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)


# %%
success = parser.parse_from_file('generator.onnx')
for idx in range(parser.num_errors):
    err = parser.get_error(idx)
    print(err)

if not success:
    exit(0)

# %%
config = builder.create_builder_config()
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1024 * 1024 * 1024)
config.flags |= 1 << int(trt.BuilderFlag.DEBUG)
config.clear_flag(trt.BuilderFlag.TF32)



MIN_TIME_AXIS = 1
MAX_TIME_AXIS = 400

MIN_TIME_AXIS_BERT = 1
MAX_TIME_AXIS_BERT = 50

# test input
TEST_TIME_AXIS = 400
TEST_TIME_AXIS_BERT = 50

TEXT_EMB_SIZE = 192
N_Q_FEATURES = 5
BERT_EMB_DIM = 768


dynamic_shape_config = [
    {"input": "text_emb", "min": (1, MIN_TIME_AXIS, TEXT_EMB_SIZE), "opt": (1, MAX_TIME_AXIS, TEXT_EMB_SIZE), "max": (1, MAX_TIME_AXIS, TEXT_EMB_SIZE)},
    {"input": "q_labels", "min": (1, MIN_TIME_AXIS, N_Q_FEATURES), "opt": (1, MAX_TIME_AXIS, N_Q_FEATURES), "max": (1, MAX_TIME_AXIS, N_Q_FEATURES)},
    {"input": "bert_emb", "min": (1, MIN_TIME_AXIS_BERT, BERT_EMB_DIM), "opt": (1, MAX_TIME_AXIS_BERT, BERT_EMB_DIM), "max": (1, MAX_TIME_AXIS_BERT, BERT_EMB_DIM)},
    {"input": 'speaker_ids', "min": (1,), "opt": (1,), "max": (1,)},
    {"input": 'noise_scale', "min": (1,), "opt": (1,), "max": (1,)},
    {"input": 'noise_scale_w', "min": (1,), "opt": (1,), "max": (1,)},
    {"input": 'length_scale', "min": (1, MIN_TIME_AXIS,), "opt": (1, MAX_TIME_AXIS,), "max": (1, MAX_TIME_AXIS,)},
]

profile = builder.create_optimization_profile()
for s in dynamic_shape_config:
    profile.set_shape(**s)

config.add_optimization_profile(profile)
# config.builder_optimization_level = 0


# Fails here with "Assertion bound >= 0" on TRT 8.6.1
ser_engine = builder.build_serialized_network(network, config)
with open('generator.trt', 'wb') as f:
    f.write(ser_engine)

elch10 commented Feb 7, 2024

I found that the error is due to this line: https://github.com/jaywalnut310/vits/blob/main/models.py#L517, or rather because of attn.squeeze(). Since squeeze doesn't convert (see #2846), I used just attn = attn[:, 0] and then the matmul.
TRT raises the error because of attn[:, 0]: if I comment out this line and everything after it, the conversion works fine.
The shape of attn is (batch_size, 1, t_1, t_2).
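
For context, here is a minimal standalone sketch of the pattern described above (tensor names and sizes are assumptions for illustration, not taken from the model):

import torch

# attn has shape (batch_size, 1, t_1, t_2); attn.squeeze(1) does not convert
# (see issue #2846), so dimension 1 is indexed out instead.
attn = torch.rand(1, 1, 4, 7)      # (batch_size, 1, t_1, t_2)
x = torch.rand(1, 7, 192)          # tensor that attn is multiplied with

attn_2d = attn[:, 0]               # (batch_size, t_1, t_2), same values as squeeze(1)
out = torch.matmul(attn_2d, x)     # (batch_size, t_1, 192)

# Per the comment above, it is this indexing + matmul that triggers the
# "Assertion bound >= 0" error during TensorRT engine building.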

elch10 commented Feb 8, 2024

I also tried trtexec versions 8.6 and 7.x, and the same error occurs.

zerollzeng (Collaborator) commented:

Test with TRT 9.2:

[02/19/2024-08:07:17] [W] [TRT] /dp/flows.7/Reshape_14: IShuffleLayer with zeroIsPlaceHolder=true has reshape dimension at position 0 that might or might not be zero. TensorRT resolves it at runtime, but this may cause excessive memory consumption and is usually a sign of a bug in the network.
[02/19/2024-08:07:17] [W] [TRT] /dp/flows.7/Reshape_16: IShuffleLayer with zeroIsPlaceHolder=true has reshape dimension at position 0 that might or might not be zero. TensorRT resolves it at runtime, but this may cause excessive memory consumption and is usually a sign of a bug in the network.
[02/19/2024-08:07:17] [W] [TRT] /dp/flows.7/Reshape_20: IShuffleLayer with zeroIsPlaceHolder=true has reshape dimension at position 0 that might or might not be zero. TensorRT resolves it at runtime, but this may cause excessive memory consumption and is usually a sign of a bug in the network.
[02/19/2024-08:07:17] [W] [TRT] /dp/flows.7/Reshape_22: IShuffleLayer with zeroIsPlaceHolder=true has reshape dimension at position 0 that might or might not be zero. TensorRT resolves it at runtime, but this may cause excessive memory consumption and is usually a sign of a bug in the network.
[02/19/2024-08:07:17] [W] [TRT] /dp/flows.7/Reshape_24: IShuffleLayer with zeroIsPlaceHolder=true has reshape dimension at position 0 that might or might not be zero. TensorRT resolves it at runtime, but this may cause excessive memory consumption and is usually a sign of a bug in the network.
[02/19/2024-08:07:17] [W] [TRT] /dp/flows.7/Reshape_26: IShuffleLayer with zeroIsPlaceHolder=true has reshape dimension at position 0 that might or might not be zero. TensorRT resolves it at runtime, but this may cause excessive memory consumption and is usually a sign of a bug in the network.
[02/19/2024-08:07:17] [W] [TRT] /dp/flows.5/Reshape_14: IShuffleLayer with zeroIsPlaceHolder=true has reshape dimension at position 0 that might or might not be zero. TensorRT resolves it at runtime, but this may cause excessive memory consumption and is usually a sign of a bug in the network.
[02/19/2024-08:07:17] [W] [TRT] /dp/flows.5/Reshape_16: IShuffleLayer with zeroIsPlaceHolder=true has reshape dimension at position 0 that might or might not be zero. TensorRT resolves it at runtime, but this may cause excessive memory consumption and is usually a sign of a bug in the network.
[02/19/2024-08:07:17] [W] [TRT] /dp/flows.5/Reshape_20: IShuffleLayer with zeroIsPlaceHolder=true has reshape dimension at position 0 that might or might not be zero. TensorRT resolves it at runtime, but this may cause excessive memory consumption and is usually a sign of a bug in the network.
[02/19/2024-08:07:17] [W] [TRT] /dp/flows.5/Reshape_22: IShuffleLayer with zeroIsPlaceHolder=true has reshape dimension at position 0 that might or might not be zero. TensorRT resolves it at runtime, but this may cause excessive memory consumption and is usually a sign of a bug in the network.
[02/19/2024-08:07:17] [W] [TRT] /dp/flows.5/Reshape_24: IShuffleLayer with zeroIsPlaceHolder=true has reshape dimension at position 0 that might or might not be zero. TensorRT resolves it at runtime, but this may cause excessive memory consumption and is usually a sign of a bug in the network.
[02/19/2024-08:07:17] [W] [TRT] /dp/flows.5/Reshape_26: IShuffleLayer with zeroIsPlaceHolder=true has reshape dimension at position 0 that might or might not be zero. TensorRT resolves it at runtime, but this may cause excessive memory consumption and is usually a sign of a bug in the network.
[02/19/2024-08:07:17] [W] [TRT] /dp/flows.3/Reshape_14: IShuffleLayer with zeroIsPlaceHolder=true has reshape dimension at position 0 that might or might not be zero. TensorRT resolves it at runtime, but this may cause excessive memory consumption and is usually a sign of a bug in the network.
[02/19/2024-08:07:17] [W] [TRT] /dp/flows.3/Reshape_16: IShuffleLayer with zeroIsPlaceHolder=true has reshape dimension at position 0 that might or might not be zero. TensorRT resolves it at runtime, but this may cause excessive memory consumption and is usually a sign of a bug in the network.
[02/19/2024-08:07:17] [W] [TRT] /dp/flows.3/Reshape_20: IShuffleLayer with zeroIsPlaceHolder=true has reshape dimension at position 0 that might or might not be zero. TensorRT resolves it at runtime, but this may cause excessive memory consumption and is usually a sign of a bug in the network.
[02/19/2024-08:07:17] [W] [TRT] /dp/flows.3/Reshape_22: IShuffleLayer with zeroIsPlaceHolder=true has reshape dimension at position 0 that might or might not be zero. TensorRT resolves it at runtime, but this may cause excessive memory consumption and is usually a sign of a bug in the network.
[02/19/2024-08:07:17] [W] [TRT] /dp/flows.3/Reshape_24: IShuffleLayer with zeroIsPlaceHolder=true has reshape dimension at position 0 that might or might not be zero. TensorRT resolves it at runtime, but this may cause excessive memory consumption and is usually a sign of a bug in the network.
[02/19/2024-08:07:17] [W] [TRT] /dp/flows.3/Reshape_26: IShuffleLayer with zeroIsPlaceHolder=true has reshape dimension at position 0 that might or might not be zero. TensorRT resolves it at runtime, but this may cause excessive memory consumption and is usually a sign of a bug in the network.
[02/19/2024-08:07:17] [E] Error[4]: [fillNode.cpp::symbolicExecute::109] Error Code 4: Internal Error (/dp/RandomNormalLike: an IFillLayer can compute a shape tensor only for FillOperation::kLINSPACE.)
[02/19/2024-08:07:17] [E] Engine could not be created from network
[02/19/2024-08:07:17] [E] Building engine failed
[02/19/2024-08:07:17] [E] Failed to create engine from model or file.
[02/19/2024-08:07:17] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v9200] # trtexec --onnx=generator.onnx

Looks like we hit a known limitation. What are the real input shapes?

elch10 commented Feb 19, 2024

I ran with this command:
/usr/src/tensorrt/bin/trtexec --onnx=generator.onnx --minShapes=text_emb:1x1x192,q_labels:1x1x5,bert_emb:1x1x768,speaker_ids:1,noise_scale:1,noise_scale_w:1,length_scale:1x1 --optShapes=text_emb:1x400x192,q_labels:1x400x5,bert_emb:1x50x768,speaker_ids:1,noise_scale:1,noise_scale_w:1,length_scale:1x400 --maxShapes=text_emb:1x400x192,q_labels:1x400x5,bert_emb:1x50x768,speaker_ids:1,noise_scale:1,noise_scale_w:1,length_scale:1x400 --workspace=30000

elch10 commented Feb 19, 2024

I saw something somewhere about RandomNormalLike, but as I remember the solution was just to update TensorRT.

elch10 commented Feb 27, 2024

Any updates?
I've encountered a similar issue using TRT 9.2.0.5. It's also about the StochasticDurationPredictor module (https://github.com/jaywalnut310/vits/blob/main/models.py#L17), as in your output with RandomNormalLike:

[02/27/2024-12:21:40] [W] [TRT] /dp/flows.3/Reshape_26: IShuffleLayer with zeroIsPlaceHolder=true has reshape dimension at position 0 that might or might not be zero. TensorRT resolves it at runtime, but this may cause excessive memory consumption and is usually a sign of a bug in the network.
[02/27/2024-12:21:41] [E] Error[4]: [fillNode.cpp::symbolicExecute::112] Error Code 4: Internal Error (/dp/flows.7/Range: An IFillLayer that computes a shape tensor can have at most one input, and the input must be the first input.)
[02/27/2024-12:21:41] [E] Engine could not be created from network
[02/27/2024-12:21:41] [E] Building engine failed
[02/27/2024-12:21:41] [E] Failed to create engine from model or file.
[02/27/2024-12:21:41] [E] Engine set up failed

zerollzeng (Collaborator) commented:

Filed internal bug 4535894 for this.

zerollzeng added the internal-bug-tracked (Tracked internally, will be fixed in a future release.) label Feb 28, 2024
ArchRobison commented:

Just an aside: I noticed the network is using what TensorRT calls "zero as placeholder", which indicates the original ONNX file is not setting the attribute "allowzero=1" for Reshape.

When "allowzero=1" is not present, ONNX treats a 0 in a reshape dimension not as a dimension, but as a placeholder for the corresponding input dimension. With dynamic shapes this is almost never what the author intended, and tends to break networks.

Attached is a zip file with a Python script that I sometimes use to repair networks where the author did not intend 0 to be a placeholder.

allowzero.zip
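
For illustration only, this is not the attached allowzero.zip script but a minimal sketch of the idea it describes, assuming the 0s in the model's reshape shapes were meant as literal dimensions and the model's opset supports allowzero (opset 14+):

import onnx

model = onnx.load("generator.onnx")
for node in model.graph.node:
    if node.op_type == "Reshape":
        # Set allowzero=1 so a 0 in the shape means "dimension of length zero"
        # rather than "copy the corresponding input dimension".
        kept = [a for a in node.attribute if a.name != "allowzero"]
        del node.attribute[:]
        node.attribute.extend(kept)
        node.attribute.append(onnx.helper.make_attribute("allowzero", 1))
onnx.save(model, "generator_allowzero.onnx")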

elch10 commented Mar 5, 2024

It doesn't help; I got the same error:

[03/05/2024-10:41:25] [TRT] [I] Graph optimization time: 0.513168 seconds.
[03/05/2024-10:41:25] [TRT] [W] BuilderFlag::kENABLE_TACTIC_HEURISTIC has been ignored in this builder run. This feature is only supported on Ampere and beyond.
[03/05/2024-10:41:25] [TRT] [V] Building graph using backend strategy 0
[03/05/2024-10:41:25] [TRT] [I] Timing cache disabled. Turning it on will improve builder speed.
[03/05/2024-10:41:25] [TRT] [V] Constructing optimization profile number 0 [1/1].
[03/05/2024-10:41:25] [TRT] [E] 2: Assertion bound >= 0 failed. 
[03/05/2024-10:41:25] [TRT] [E] 2: [shapeContext.cpp::checkVolume::2923] Error Code 2: Internal Error (Assertion bound >= 0 failed. )

elch10 commented Mar 5, 2024

Maybe this will help:
If I split this module into two modules at this line https://github.com/jaywalnut310/vits/blob/main/models.py#L515,
i.e. the first module has the code from https://github.com/jaywalnut310/vits/blob/main/models.py#L501-L514
and the second from https://github.com/jaywalnut310/vits/blob/main/models.py#L515-L522,
then both modules convert without any errors, and I can run them one after the other sequentially.
But when the two modules are inside one big module, the above error occurs.

ArchRobison commented:

There is an error in TensorRT that affects attempts to use IFillLayer with mode kRANDOM_UNIFORM or kRANDOM_NORMAL to construct a shape tensor. The mistake in TensorRT was that one part of the logic incorrectly claimed "I can deliver a shape tensor" and the other part later said "That's not allowed."

The FillLayers are coming from layers /RandomNormalLike and "/dp/RandomNormalLike". The first one's output has variable dimensions, which knocks it out from consideration as a shape tensor, so I think it's /dp/RandomNormalLike_output_0 that is triggering the bug.

The following hack might work. When the output from an IConvolutionLayer is used as a shape tensor, TensorRT deals with it correctly, even though the layer says "I can't deliver a shape tensor". The hack is to feed the output from the IFillLayer through a dummy 1x1 IConvolutionLayer that is just an identity operation, i.e. its weights are an identity matrix. The convolution stops TensorRT from asking the IFillLayer to deliver a shape tensor, so TensorRT should be able to deal with it. A complication is that IConvolutionLayer needs 4D input, so you'll need to add some reshaping to compensate.

So at the TensorRT level, the replacement for the IFillLayer looks something like:

IFillLayer --> IShuffleLayer --> IConvolutionLayer --> IShuffleLayer -->

where the first IShuffleLayer does a 3D to 4D reshape and the second IShuffleLayer does a 4D to 3D reshape. E.g., first shuffle can reshape from [1,2,1] to [1,1,2,1] and second shuffle can reshape the other direction. The convolution sees a channel-dimension of length 1, so the identity matrix is just a 1x1 matrix containing 1.

Of course what I've described is at the TensorRT level. You're probably more interested in an ONNX-level description. At the ONNX level, the hack looks like replacing RandomNormalLike /dp/RandomNormalLike with:

RandomNormalLike --> Reshape --> Conv --> Reshape -->
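
For readers working directly with the TensorRT network definition, a minimal sketch of the wrapping described above (the helper name, the (1, 2, T) output shape, and the float32 dtype are assumptions for illustration):

import numpy as np
import tensorrt as trt

def wrap_fill_output_with_identity_conv(network, fill_output):
    # fill_output: 3D ITensor, e.g. shape (1, 2, T), produced by the IFillLayer.
    # 3D -> 4D reshape so IConvolutionLayer accepts it: (1, 2, T) -> (1, 1, 2, T)
    to_4d = network.add_shuffle(fill_output)
    to_4d.reshape_dims = (1, 1, 2, -1)

    # 1x1 convolution whose single weight is 1.0, i.e. an identity operation.
    # Its only purpose is to stop TensorRT from asking the fill layer for a shape tensor.
    identity = trt.Weights(np.ones((1, 1, 1, 1), dtype=np.float32))
    conv = network.add_convolution_nd(to_4d.get_output(0), num_output_maps=1,
                                      kernel_shape=(1, 1), kernel=identity)

    # 4D -> 3D reshape back to the original layout: (1, 1, 2, T) -> (1, 2, T)
    to_3d = network.add_shuffle(conv.get_output(0))
    to_3d.reshape_dims = (1, 2, -1)
    return to_3d.get_output(0)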

elch10 commented Mar 26, 2024

The RandomNormalLike came from here: https://github.com/jaywalnut310/vits/blob/main/models.py#L90
I replaced that line with

      z = torch.randn(x.size(0), 2, x.size(2)).to(device=x.device, dtype=x.dtype) # (b, 2, t)
      z = z.unsqueeze(1) # (b, 1, 2, t)
      z = F.conv2d(z, z.new_ones(1, 1, 1, 1)) # identity
      z = z[:, 0] # (b, 2, t)

      z = z * noise_scale

And it seems to work. Will such a fix be added inside TensorRT?

I'm testing now; if other errors occur, I'll let you know.

zerollzeng (Collaborator) commented:

Hi, this issue cannot be fixed in the short term and is still being tracked internally. To unblock you, we prepared a WAR (workaround); could you please try it on your side?

WAR:

  1. Upgrade to TRT 10.0.
  2. Add a Cast operation converting FP32 to INT64 before the /Clip operation, as shown in the figure attached to the original comment (omitted here) and sketched below.
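
A minimal sketch of how such a Cast could be inserted with onnx-graphsurgeon (the node name "/Clip" is taken from the error logs later in this thread; file names are assumptions):

import numpy as np
import onnx
import onnx_graphsurgeon as gs

graph = gs.import_onnx(onnx.load("generator.onnx"))

# Locate the Clip node (name assumed to be "/Clip", as in the logs)
clip = next(n for n in graph.nodes if n.op == "Clip" and n.name == "/Clip")

# Insert a Cast (FP32 -> INT64) on Clip's first input
src = clip.inputs[0]
casted = gs.Variable(name=src.name + "_casted", dtype=np.int64)
graph.nodes.append(gs.Node(op="Cast", name=src.name + "_cast_to_i64",
                           attrs={"to": onnx.TensorProto.INT64},
                           inputs=[src], outputs=[casted]))
clip.inputs[0] = casted

graph.cleanup().toposort()
onnx.save(gs.export_onnx(graph), "generator_war.onnx")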

ttyio commented Jul 2, 2024

Closing since there is a WAR. Thanks, all!

ttyio closed this as completed Jul 2, 2024
jingzhaoo commented:

I ran into the same issue today. @zerollzeng, thanks a lot for the WAR. Any instructions on how to add that Cast operation? Should I make some changes to the original model? I am eager to try it out.

clumsyroot commented Oct 10, 2024

[TRT] [E] [shapeContext.cpp::checkVolume::3570] Error Code 2: Internal Error (Assertion bound >= 0 failed. )

Same issue +1 (VITS), looking forward to some progress. When I tried to solve it with the WAR, I encountered the following errors:

[10/10/2024-12:42:46] [TRT] [W] IElementWiseLayer with inputs /ReduceSum_output_0_casted and ONNXTRT_Broadcast_12929_output: first input has type Int64 but second input has type Float.
[10/10/2024-12:42:46] [TRT] [E] ITensor::getDimensions: Error Code 4: API Usage Error ((Unnamed Layer* 13578) [ElementWise]: IElementWiseLayer with MAX operation has incompatible input types Int64 and Float type.)
[10/10/2024-12:42:46] [TRT] [E] ModelImporter.cpp:949: While parsing node number 5873 [Clip -> "/Clip_output_0"]:
[10/10/2024-12:42:46] [TRT] [E] ModelImporter.cpp:950: --- Begin node ---
input: "/ReduceSum_output_0_casted"
input: "/Cast_output_0"
input: ""
output: "/Clip_output_0"
name: "/Clip"
op_type: "Clip"

[10/10/2024-12:42:46] [TRT] [E] ModelImporter.cpp:951: --- End node ---
[10/10/2024-12:42:46] [TRT] [E] ModelImporter.cpp:954: ERROR: ModelImporter.cpp:195 In function parseNode:
[6] Invalid Node - /Clip
ITensor::getDimensions: Error Code 4: API Usage Error ((Unnamed Layer* 13578) [ElementWise]: IElementWiseLayer with MAX operation has incompatible input types Int64 and Float type.)
In node 5873 with name: /Clip and operator: Clip (parseNode): INVALID_NODE: Invalid Node - /Clip
ITensor::getDimensions: Error Code 4: API Usage Error ((Unnamed Layer* 13578) [ElementWise]: IElementWiseLayer with MAX operation has incompatible input types Int64 and Float type.)

Any advice? Thanks a lot. @zerollzeng

clumsyroot commented:

OK, after some debugging, I found that the error was caused by the following line of code:
https://github.com/jaywalnut310/vits/blob/2e561ba58618d021b5b8323d3765880f7e0ecfdb/models.py#L512
In my use case I don't need batch inference, so I commented out this line and set the mask to all ones, which solved the problem. Perhaps you can try other ways to achieve what this line does. It's worth mentioning that I'm still curious why this error occurs in TensorRT. 🤔
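
A minimal standalone sketch of the substitution described above (tensor names and shapes are assumptions, not copied from models.py; an all-ones mask is only equivalent when there is no padding, i.e. batch size 1):

import torch

# Stand-in tensors so the snippet runs on its own (names and shapes assumed)
y_lengths = torch.tensor([37])    # predicted output length for a single item
x_mask = torch.ones(1, 1, 10)     # encoder-side mask, used here only for dtype/device

# Replacement for the linked line: an all-ones (batch, 1, y_max_length) mask
# instead of one derived from per-item lengths.
y_mask = torch.ones(y_lengths.size(0), 1, int(y_lengths.max()),
                    dtype=x_mask.dtype, device=x_mask.device)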

jingzhaoo commented:

I am still blocked by this issue and wonder if anyone can help me out. I do need batch inference to achieve better performance. Regarding the WAR mentioned above, I can see the Cast node is already there (see below) and I got the same error when converting ONNX to TensorRT. Any suggestion would be highly appreciated.

(screenshot of the existing Cast node omitted)

jingzhaoo commented:

@zerollzeng Is it possible to reopen this ticket so that we can take a closer look at it? Thanks.

jingzhaoo commented:

I added a Cast node as suggested in the WAR, as shown below, and then encountered the same error in the Clip node. Appreciate your help, @zerollzeng.

(screenshot of the added Cast node omitted)

jingzhaoo commented Nov 26, 2024

I resolved the Invalid Node - /Clip error after adding the extra Cast operator. The following Clip operator has three inputs; after casting the "input" input to int64, we also need to cast the "min" and "max" inputs to int64. However, I still ran into the original "Assertion bound >= 0" error during TensorRT conversion. So the WAR actually does not work. I would appreciate some more help on this issue!
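
Extending the earlier onnx-graphsurgeon sketch, the min/max inputs of the Clip node would need the same cast (file and node names are assumptions; as the comment above notes, this removes the Clip error but not the underlying assertion):

import numpy as np
import onnx
import onnx_graphsurgeon as gs

graph = gs.import_onnx(onnx.load("generator_war.onnx"))   # model with the first Cast already added
clip = next(n for n in graph.nodes if n.op == "Clip" and n.name == "/Clip")

# Cast the optional min/max inputs (positions 1 and 2) to INT64 as well,
# so every Clip input has the same dtype.
for i, inp in enumerate(list(clip.inputs)[1:], start=1):
    if inp is None or not inp.name:   # empty optional input, e.g. a missing max
        continue
    casted = gs.Variable(name=inp.name + "_i64", dtype=np.int64)
    graph.nodes.append(gs.Node(op="Cast", attrs={"to": onnx.TensorProto.INT64},
                               inputs=[inp], outputs=[casted]))
    clip.inputs[i] = casted

graph.cleanup().toposort()
onnx.save(gs.export_onnx(graph), "generator_war_clip.onnx")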
