
Cuda Runtime (out of memory) failure of TensorRT 10.3.0 when running trtexec on GPU RTX4060/jetson/etc #4258

Open
zargooshifar opened this issue Nov 22, 2024 · 1 comment

zargooshifar commented Nov 22, 2024

Description

I'm trying to convert a YOLOv8-seg model to a TensorRT engine, using DeepStream-Yolo-Seg to export the model to ONNX.
After running trtexec on the exported ONNX file, I get these errors:

[11/22/2024-13:40:08] [I] Finished parsing network model. Parse time: 0.0704936
[11/22/2024-13:40:08] [I] [TRT] Local timing cache in use. Profiling results in this builder pass will not be stored.
[11/22/2024-13:40:47] [E] Error[1]: [defaultAllocator.cpp::allocate::31] Error Code 1: Cuda Runtime (out of memory)
[11/22/2024-13:40:47] [W] [TRT] Requested amount of GPU memory (15485030400 bytes) could not be allocated. There may not be enough free memory for allocation to succeed.
[11/22/2024-13:40:47] [E] Error[9]: Error Code: 9: Skipping tactic 0x0000000000000000 due to exception [tunable_graph.cpp:create:117] autotuning: User allocator error allocating 15485030400-byte buffer
[11/22/2024-13:40:47] [E] Error[10]: IBuilder::buildSerializedNetwork: Error Code 10: Internal Error (Could not find any implementation for node {ForeignNode[/1/Constant_36_output_0.../1/Slice_4]}.)
[11/22/2024-13:40:47] [E] Engine could not be created from network
[11/22/2024-13:40:47] [E] Building engine failed
[11/22/2024-13:40:47] [E] Failed to create engine from model or file.
[11/22/2024-13:40:47] [E] Engine set up failed

With TensorRT 10.0.0.6-1+cuda11.8 the engine can be created, but with anything newer it fails.

Environment

TensorRT Version: 10.3.0
NVIDIA GPU: RTX4060
NVIDIA Driver Version: 565.57.01
CUDA Version: 12.6
CUDNN Version: 9.5.1.17-1

Operating System: Ubuntu 22.04.5 LTS
Python Version (if applicable): 3.10.12-1~22.04.7
PyTorch Version (if applicable): 2.5.1

Steps To Reproduce

 python export_yoloV8_seg.py --weights yolov8s-seg.pt
/usr/src/tensorrt/bin/trtexec --onnx=yolov8s-seg.onnx
lix19937 commented

This may be a bug. You can try modifying maxWorkspaceSize or memoryPoolLimit, but neither a larger nor a smaller value may solve the problem.
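For reference, a minimal sketch of how the builder's workspace pool can be capped from the trtexec command line in TensorRT 10 (the `--memPoolSize` flag replaces the deprecated `--workspace` option; the 2048 MiB value below is an arbitrary example, not a recommended setting):

```shell
# Cap the builder's workspace memory pool at 2048 MiB.
# The value is an example only -- adjust it to the free GPU memory on your device.
/usr/src/tensorrt/bin/trtexec \
    --onnx=yolov8s-seg.onnx \
    --memPoolSize=workspace:2048
```

Note that if the failing tactic genuinely requires ~15 GB (as the log above reports), lowering the pool limit will only make TensorRT skip that tactic; whether a fallback implementation exists for the `ForeignNode` subgraph depends on the model and TensorRT version.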
