Update on the development branch #2563
kaiyux announced in Announcements
Hi,
The TensorRT-LLM team is pleased to announce that we pushed an update to the development branch (and the Triton backend) on Dec 11, 2024.
This update includes:
- LLM API [...]
- [...] examples/recurrentgemma/README.md.
- [...] examples/qwen/README.md.
- Added allottedTimeMs to the C++ Request class to support per-request timeout.
- Removed --use_embedding_sharing from convert checkpoints scripts.
- [...] nvcr.io/nvidia/pytorch:24.11-py3.
- [...] nvcr.io/nvidia/tritonserver:24.11-py3.

Thanks,
The TensorRT-LLM Engineering Team