
DEIM: TensorRT engine export with dynamic batches #55

Open
mirza298 opened this issue Mar 21, 2025 · 1 comment

@mirza298

Has anyone successfully exported DEIM to TensorRT or ONNX with dynamic batch sizes? While export_onnx.py and trtexec work for exporting the model, I get an error related to the model architecture ('/model/decoder/GatherElements') during batch inference with both the ONNX and TensorRT engine files. I used the following trtexec command for the export:
trtexec --onnx=model.onnx --saveEngine=model.trt --minShapes=images:1x3x640x640,orig_target_sizes:1x2 --optShapes=images:1x3x640x640,orig_target_sizes:1x2 --maxShapes=images:32x3x640x640,orig_target_sizes:32x2 --fp16

My input shapes are correct (e.g., for a batch size of 2: images: torch.Size([2, 3, 640, 640]), orig_target_sizes: torch.Size([2, 2])).

This is the error with ONNX Runtime:
2025-03-21 09:46:45.304331171 [E:onnxruntime:, sequential_executor.cc:516 ExecuteKernel] Non-zero status code returned while running GatherElements node. Name:'/model/decoder/GatherElements' Status Message: GatherElements op: 'indices' shape should have values within bounds of 'data' shape. Invalid value in indices shape is: 2

This is the error with the TensorRT engine:
[03/21/2025-09:16:43] [TRT] [E] IExecutionContext::executeV2: Error Code 7: Internal Error (/model/decoder/GatherElements: The extent of dimension 0 of indices must be less than or equal to the extent of data. Condition '<' violated: 2 >= 1. Instruction: CHECK_LESS 2 1.)
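Both messages report the same constraint: for GatherElements, every extent of `indices` must fit within the corresponding extent of `data`, and here the data tensor's batch extent appears to have been frozen at 1 during export while the indices arrive with batch 2. A minimal NumPy sketch of that bound check (the `gather_elements` helper is hypothetical; `np.take_along_axis` is NumPy's analogue of the op):

```python
import numpy as np

def gather_elements(data, indices, axis=0):
    # NumPy model of ONNX GatherElements, including the shape rule the
    # runtimes enforce: same rank, and every non-axis extent of `indices`
    # must fit within the corresponding extent of `data`.
    if data.ndim != indices.ndim:
        raise ValueError("data and indices must have the same rank")
    for d in range(data.ndim):
        if d != axis and indices.shape[d] > data.shape[d]:
            raise ValueError(
                f"dim {d}: indices extent {indices.shape[d]} "
                f"exceeds data extent {data.shape[d]}"
            )
    return np.take_along_axis(data, indices, axis=axis)

# Works when the batch extents agree:
data = np.arange(12).reshape(2, 6)        # batch of 2
idx = np.zeros((2, 3), dtype=np.int64)    # gather along axis 1
print(gather_elements(data, idx, axis=1).shape)  # (2, 3)

# Fails the way the issue does when the graph froze batch=1:
frozen = data[:1]                         # data with batch extent 1
try:
    gather_elements(frozen, idx, axis=1)
except ValueError as e:
    print(e)  # dim 0: indices extent 2 exceeds data extent 1
```

This mirrors the TensorRT message "Condition '<' violated: 2 >= 1": batch 1 stuck in the graph, batch 2 fed at runtime.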

@PINTO0309

PINTO0309 commented Mar 22, 2025

  • My Fork
  • For dynamic model custom
  • ONNX files
  • DEIM dynamic batch (dynamic height, dynamic width) - ONNX
  • Inference test - [5, 3, 480, 640]
    • CUDA
      sit4onnx -if deim_hgnetv2_s_wholebody28_ft_1250query_n_batch.onnx -oep cuda -fs 5 3 480 640
      
      INFO: file: deim_hgnetv2_s_wholebody28_ft_1250query_n_batch.onnx
      INFO: providers: ['CUDAExecutionProvider', 'CPUExecutionProvider']
      INFO: input_name.1: input_bgr shape: [5, 3, 480, 640] dtype: float32
      INFO: test_loop_count: 10
      INFO: total elapsed time:  558.7770938873291 ms
      INFO: avg elapsed time per pred:  55.87770938873291 ms
      INFO: output_name.1: label_xyxy_score shape: [5, 1250, 6] dtype: float32
      
    • TensorRT
      sit4onnx -if deim_hgnetv2_s_wholebody28_ft_1250query_n_batch.onnx -oep tensorrt -fs 5 3 480 640
      
      2025-03-22 15:36:32.511025557 [W:onnxruntime:Default, tensorrt_execution_provider.h:86 log] [2025-03-22 06:36:32 WARNING] ModelImporter.cpp:787: Make sure output /model/decoder/decoder/lqe_layers.2/TopK_output_1 has Int64 binding.
      2025-03-22 15:36:32.580633082 [W:onnxruntime:Default, tensorrt_execution_provider.h:86 log] [2025-03-22 06:36:32 WARNING] ModelImporter.cpp:787: Make sure output /model/decoder/decoder/lqe_layers.2/TopK_output_1 has Int64 binding.
      INFO: file: deim_hgnetv2_s_wholebody28_ft_1250query_n_batch.onnx
      INFO: providers: ['TensorrtExecutionProvider', 'CPUExecutionProvider']
      INFO: input_name.1: input_bgr shape: [5, 3, 480, 640] dtype: float32
      INFO: test_loop_count: 10
      INFO: total elapsed time:  154.9851894378662 ms
      INFO: avg elapsed time per pred:  15.498518943786621 ms
      INFO: output_name.1: label_xyxy_score shape: [5, 1250, 6] dtype: float32
      
  • My playground

Good luck.
