FP16 model results from TensorRT 10.0 are incorrect when running on GPU T4 #4022
Comments
Try to use the following cmd:
polygraphy run xxxx.onnx --trt --onnxrt --fp16 \
--trt-outputs mark all \
--onnx-outputs mark all
log_netg.txt
Hi, sorry to bother you, but is there any update on the solution? @lix19937
@yflv-yanxia Sorry for the late reply; this is from my build log.
You can check whether or not your Conv is immediately followed by a BN op (a quick sketch follows below).
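For anyone following along, here is a minimal sketch (not from the original posters) of one way to check Conv/BN adjacency in the ONNX graph; it assumes the onnx Python package is installed and uses a placeholder model path:

```python
# Sketch: report whether each Conv node's output feeds a BatchNormalization node.
# "model.onnx" is a placeholder path.
import onnx

model = onnx.load("model.onnx")
nodes = model.graph.node

# Map each tensor name to the op types of the nodes that consume it.
consumers = {}
for node in nodes:
    for inp in node.input:
        consumers.setdefault(inp, []).append(node.op_type)

for node in nodes:
    if node.op_type == "Conv":
        followers = [op for out in node.output for op in consumers.get(out, [])]
        print(f"{node.name}: followed by BN = {'BatchNormalization' in followers}")
```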
I also encountered this problem: I set layer_precision and layer_output_type to kFLOAT for all layers that can be set under FP16 mode, but some inference results are still wrong (the outputs are all ones). Is there any way to disable the insertion of reformat (to FP16) layers under FP16 mode? Thanks! @lix19937
If you can make your input data type fp16 (in the preprocess phase, convert the image data from fp32 to fp16), the reformat layer at the input will not be inserted.
@lix19937 Thanks for your reply.
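As a rough illustration of the fp16-input suggestion above, here is a minimal preprocessing sketch; the array and its NHWC shape are placeholders taken from the repro command further down:

```python
import numpy as np

# Placeholder image tensor in the NHWC layout from the repro command.
img = np.random.rand(1, 1920, 1920, 3).astype(np.float32)

# Convert in the preprocess phase so the engine receives fp16 directly,
# instead of fp32 data that TensorRT must reformat to fp16.
img_fp16 = np.ascontiguousarray(img.astype(np.float16))
```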
Description
Using TensorRT 10's trtexec to convert an ONNX model to a TensorRT engine, the FP32 engine's results are correct, but the FP16 engine's results are incorrect. I have set almost all layers to FP32 using
trtexec --precisionConstraints=obey --builderOptimizationLevel=5 --layerPrecisions="/Transpose":fp32,"/intro_/Conv":fp32,"/intro_down/Conv":fp32,.......
but the results are still incorrect. Could you help me solve this problem?
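For reference, a minimal sketch of the equivalent builder-side setup with the TensorRT Python API; the model path is a placeholder, the FP32 layer set is abbreviated to the names mentioned above, and the shape profile mirrors the trtexec command in Steps To Reproduce:

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(0)  # explicit batch is the default in TRT 10
parser = trt.OnnxParser(network, logger)
with open("model.onnx", "rb") as f:  # placeholder path
    if not parser.parse(f.read()):
        raise RuntimeError(parser.get_error(0))

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)
config.set_flag(trt.BuilderFlag.OBEY_PRECISION_CONSTRAINTS)  # --precisionConstraints=obey
config.builder_optimization_level = 5                        # --builderOptimizationLevel=5

# Dynamic shapes from the repro command.
profile = builder.create_optimization_profile()
profile.set_shape("input", (1, 128, 128, 3), (1, 1920, 1920, 3), (1, 3072, 3072, 3))
config.add_optimization_profile(profile)

# Pin the listed layers to FP32; extend this set with your full --layerPrecisions list.
fp32_layers = {"/Transpose", "/intro_/Conv", "/intro_down/Conv"}
for i in range(network.num_layers):
    layer = network.get_layer(i)
    if layer.name in fp32_layers:
        layer.precision = trt.float32
        for j in range(layer.num_outputs):
            layer.set_output_type(j, trt.float32)

engine_bytes = builder.build_serialized_network(network, config)
```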
Environment
TensorRT Version: TensorRT 10.0.1
NVIDIA GPU: Tesla T4
NVIDIA Driver Version: 450.36.06
CUDA Version: 11.0
CUDNN Version: 8.0.0
Operating System:
ONNX opset: 17
Relevant Files
ONNX model link: https://drive.google.com/file/d/14zuubyXVVN-mOJ2b64jPc128dj4VRU_C/view?usp=sharing
Steps To Reproduce
trtexec --onnx=$pr_nolog_model_path --fp16 --device=0 --minShapes=input:1x128x128x3 --optShapes=input:1x1920x1920x3 --maxShapes=input:1x3072x3072x3 --saveEngine=ysDeblur_cc75_t4_fp16_small_dyn.trtmodel --layerPrecisions="/Transpose":fp32,"/intro_/Conv":fp32,"/intro_down/Conv":fp32,(so many) --precisionConstraints=obey --builderOptimizationLevel=5
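To compare TensorRT FP16 against ONNX Runtime per layer, as suggested earlier in the thread, here is a minimal sketch using Polygraphy's Python API (assumes polygraphy and onnxruntime are installed; the model path is a placeholder, and the dynamic input dims may need an explicit data loader or profile):

```python
from polygraphy.backend.onnxrt import OnnxrtRunner, SessionFromOnnx
from polygraphy.backend.trt import (CreateConfig, EngineFromNetwork,
                                    NetworkFromOnnxPath, TrtRunner)
from polygraphy.comparator import Comparator

# Build an FP16 TensorRT engine and an ONNX Runtime session from the same model.
build_engine = EngineFromNetwork(NetworkFromOnnxPath("model.onnx"),
                                 config=CreateConfig(fp16=True))
runners = [TrtRunner(build_engine), OnnxrtRunner(SessionFromOnnx("model.onnx"))]

results = Comparator.run(runners)    # runs both backends on the same generated input
Comparator.compare_accuracy(results) # reports per-output mismatches
```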