Update on the development branch #1234
kaiyux announced in Announcements
Hi,
The TensorRT-LLM team is pleased to announce that we are pushing an update to the development branch (and the Triton backend) on March 5, 2024.
This update includes:
- `transformers` Gemma implementation #1147
- `LLM()` API to accept engines built by the `trtllm-build` command
- `examples/mixtral/README.md`: Mixtral - no run.py file #1181
- `head_size` when importing Gemma model from Hugging Face Hub, thanks for the contribution from @mfuntowicz in "Specify the head_size from the config when importing Gemma from Hugging Face." #1148
- `docs/source/performance.md`
- `benchmarks/cpp/README.md`
Thanks,
The TensorRT-LLM Engineering Team