Maybe you can run it with the Python runtime, then debug it following https://nvidia.github.io/TensorRT-LLM/reference/troubleshooting.html#debug-on-e2e-models
System Info
GPU-A100
TensorRT-LLM: 0.15.0
Who can help?
@ncomly-nvidia @byshiue
Information
Tasks
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
Reproduction
Run the steps mentioned under https://nvidia.github.io/TensorRT-LLM/reference/troubleshooting.html#debug-on-e2e-models
to debug the output from the following model - https://github.com/huggingface/transformers/blob/5d7739f15a6e50de416977fe2cc9cb516d67edda/src/transformers/models/mistral/modeling_mistral.py#L1015
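The troubleshooting guide's approach boils down to dumping intermediate tensors from both a trusted reference run (e.g. the HuggingFace model) and the TensorRT-LLM engine, then finding where they diverge. Below is a minimal, hedged sketch of that comparison step; it is not TensorRT-LLM API, and it assumes you have already saved the per-layer activations from both runs as numpy arrays keyed by a (hypothetical) layer name.

```python
import numpy as np

def compare_activations(ref, test, eps=1e-12):
    """Compare per-layer activations from two runs.

    ref / test: dicts mapping layer name -> np.ndarray
    (an assumed dump format, not a TensorRT-LLM structure).
    Returns a list of (name, max_abs_err, cosine_sim),
    sorted with the most divergent layers first.
    """
    report = []
    for name in ref:
        a = ref[name].astype(np.float32).ravel()
        b = test[name].astype(np.float32).ravel()
        # Max absolute error pinpoints hard numerical blow-ups.
        max_err = float(np.max(np.abs(a - b)))
        # Cosine similarity tolerates scale differences (e.g. fp16 rounding).
        cos = float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps))
        report.append((name, max_err, cos))
    report.sort(key=lambda r: r[1], reverse=True)
    return report
```

The first layer in the sorted report with a large error is usually the place to start looking (e.g. a mis-converted weight, a wrong rotary-embedding parameter, or a precision issue in that op).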
Expected behavior
easy debugging of new models
Actual behavior
no debugging setup available
Additional notes
I need to debug the cause of the accuracy drop of the Biomistral model (https://github.com/huggingface/transformers/blob/5d7739f15a6e50de416977fe2cc9cb516d67edda/src/transformers/models/mistral/modeling_mistral.py#L1015) when run on a TensorRT-LLM engine compared to vLLM.
Looking at the instructions in https://nvidia.github.io/TensorRT-LLM/reference/troubleshooting.html#debug-on-e2e-models, I was wondering if there is any model closely related to mine that I can use to troubleshoot. (I hope a nearly identical model exists, because I'm able to run inference with TensorRT-LLM; it's just that the accuracy seems low.)
Can you please help me?
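Since inference itself works and only accuracy differs between the two runtimes, one cheap first check (before full tensor dumping) is to greedy-decode the same prompt on both vLLM and TensorRT-LLM and locate the first token where they disagree. This is a generic sketch, assuming you have already collected the token-id sequences from each runtime:

```python
def first_divergence(tokens_a, tokens_b):
    """Return the index of the first differing token between two
    greedy-decoded sequences, or None if the common prefix matches."""
    for i, (a, b) in enumerate(zip(tokens_a, tokens_b)):
        if a != b:
            return i
    return None
```

An early divergence (within the first few tokens) usually points at a checkpoint-conversion or configuration problem, while a late divergence is more consistent with accumulated low-precision rounding.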