tensorrt-llm 0.12.0.dev2024073000, triton 2.46.0 #52

Open. yorickvP wants to merge 35 commits into main.

yorickvP (Contributor) commented Aug 8, 2024

Breaking changes:

  • `enable_trt_overlap` no longer has any effect
  • `max_queue_size` is now required; 0 is a good default
  • `max_seq_len` is required until next week, after which it will default to the model's max input size
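
For anyone updating an existing config, the three breaking changes above could be applied roughly like this. It is only a sketch: it assumes the settings live in a plain Python dict, and `migrate_config` is a hypothetical helper name, not something this PR adds.

```python
def migrate_config(old: dict) -> dict:
    """Hypothetical helper: adapt an older config dict to the breaking changes above.

    Assumes the config is a flat dict keyed by option name (an assumption,
    not the repo's actual format).
    """
    new = dict(old)  # copy so the caller's dict is left untouched

    # enable_trt_overlap no longer has any effect, so drop it if present.
    new.pop("enable_trt_overlap", None)

    # max_queue_size is now required; 0 is the suggested default.
    new.setdefault("max_queue_size", 0)

    # max_seq_len has no default yet, so refuse to guess a value.
    if "max_seq_len" not in new:
        raise ValueError("max_seq_len must be set explicitly for now")

    return new
```

Raising on a missing `max_seq_len` rather than picking a number keeps the sketch from silently choosing a sequence length that does not match the model.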

Other changes:

  • removed BLS (Triton Business Logic Scripting)

yorickvP and others added 30 commits July 19, 2024 13:54

  • Remove the tensorrt_llm python script, since it confuses `maybe_download_tarball_with_pget`
  • hopefully fixes `ValueError: Invalid pattern: '**' can only be an entire path component`
  • adds TRTLLM_BUILDER_VARIANT=h100 to work around "It looks like a copy of this model version already exists on Replicate"

yorickvP requested a review from joehoover August 8, 2024 14:04