Support 1-d array data in profile exporter #28
Merged
nv-hwoo
commented
Aug 6, 2024
matthewkotila
approved these changes
Aug 7, 2024
matthewkotila
added a commit
that referenced
this pull request
Aug 9, 2024
* support array of data in profile exporter
* add some tests
* run formatting
* fix pre-commit
* remove duplicate argparser arguments
* Fix Triton C API mode missing infer requested output datatype bug
---------
Co-authored-by: Matthew Kotila
matthewkotila
added a commit
that referenced
this pull request
Aug 9, 2024
* Add tensorrtllm_engine option to service-kind and update testing (#700) (#762)
  * Add tensorrtllm_engine option to service-kind and update testing
  * Add output format check for tensorrtllm_engine
  Co-authored-by: Elias Bermudez
* Support input payload generation for tensorrtllm engine (#767)
* Add functionality for async requests and output retrieval with Triton C API (#25)
* Support 1-d array data in profile exporter (#28)
  * support array of data in profile exporter
  * add some tests
  * run formatting
  * fix pre-commit
  * remove duplicate argparser arguments
  * Fix Triton C API mode missing infer requested output datatype bug
  ---------
  Co-authored-by: Matthew Kotila
* Support profile data parsing for tensorrtllm engine service kind (#33)
  * support parsing tensorrtllm engine profile response
  * add test
  * refactor the test
  * update types and names
  * fix pre-commit
  * run PA with triton c api
  * more clean up on the tests
  * fix codeql
  * address feedback
* Add functionality to continue benchmarking in Triton C API mode if server logging support is disabled (#34)
---------
Co-authored-by: Hyunjae Woo
Co-authored-by: Elias Bermudez
lkomali
pushed a commit
that referenced
this pull request
Aug 15, 2024
* Add tensorrtllm_engine option to service-kind and update testing (#700) (#762)
  * Add tensorrtllm_engine option to service-kind and update testing
  * Add output format check for tensorrtllm_engine
  Co-authored-by: Elias Bermudez
* Support input payload generation for tensorrtllm engine (#767)
* Add functionality for async requests and output retrieval with Triton C API (#25)
* Support 1-d array data in profile exporter (#28)
  * support array of data in profile exporter
  * add some tests
  * run formatting
  * fix pre-commit
  * remove duplicate argparser arguments
  * Fix Triton C API mode missing infer requested output datatype bug
  ---------
  Co-authored-by: Matthew Kotila
* Support profile data parsing for tensorrtllm engine service kind (#33)
  * support parsing tensorrtllm engine profile response
  * add test
  * refactor the test
  * update types and names
  * fix pre-commit
  * run PA with triton c api
  * more clean up on the tests
  * fix codeql
  * address feedback
* Add functionality to continue benchmarking in Triton C API mode if server logging support is disabled (#34)
---------
Co-authored-by: Hyunjae Woo
Co-authored-by: Elias Bermudez
Example:
Given the following input to the trtllm engine:
PA produced the following before the change:
After the fix, PA returns the full array of token ids (the same applies to every other field in the profile export JSON file):
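As a rough illustration of the scalar-versus-array difference described in this example, here is a minimal Python sketch. It is not the PA implementation, which is not shown in this conversation. The function names, the `output_ids` field name, the little-endian INT32 encoding, the sample token ids, and the idea that a non-array export keeps only the first value are all assumptions made for illustration.

```python
import json
import struct
from typing import List

INT32_SIZE = 4  # bytes per value, assuming an INT32 tensor


def decode_first_int32(raw: bytes) -> int:
    """Hypothetical scalar-only decoding: read just the first INT32 value."""
    return struct.unpack_from("<i", raw, 0)[0]


def decode_int32_array(raw: bytes) -> List[int]:
    """Decode every complete INT32 value in the buffer as a 1-d array."""
    count = len(raw) // INT32_SIZE
    return list(struct.unpack(f"<{count}i", raw[: count * INT32_SIZE]))


# Made-up token ids standing in for a model response buffer.
token_ids = [101, 2023, 2003, 102]
raw = struct.pack(f"<{len(token_ids)}i", *token_ids)

# "output_ids" is a hypothetical field name for the exported JSON entry.
scalar_entry = json.dumps({"output_ids": decode_first_int32(raw)})
array_entry = json.dumps({"output_ids": decode_int32_array(raw)})

print(scalar_entry)
print(array_entry)
```

The point of the sketch is only that exporting a 1-d tensor requires walking the whole buffer rather than reading a single element; the real exporter may handle datatypes and field layout differently.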