Support 1-d array data in profile exporter #28

nv-hwoo · 2024-08-06T00:34:33Z

Support 1-d array data in the profile exporter so that when an input or output holds more than one value (an array of values), the exporter captures all of the data rather than just the first element. This is an incremental step towards supporting full N-dimensional tensors in the profile exporter. We don't need N-dimensional support yet because we are mostly interested in text strings and token ids, which are 1-d arrays. N-dimensional tensor support would also require more changes to PA: response outputs don't store the shape of the tensors returned from the model, so that information would have to be retrieved from somewhere else.
Support other data types such as uint and fp.

Example:

When the trtllm engine is given the following input:

{
    "token_ids": [1, 2, 3, 4, 5, 6, 7, 8],
    "input_length": [8],
    "request_output_len": [10],
    ...
}

PA produced the following before the change:

{
    "request_inputs": {
        "token_ids": 1,  // missing remaining values
        "input_length": 8,
        "request_output_len": 10,
    },
    ...
}

After the fix, we get the full array of token ids (the same applies to any other field in the profile export JSON file):

{
    "request_inputs": {
        "token_ids": [1, 2, 3, 4, 5, 6, 7, 8],
        "input_length": 8,
        "request_output_len": 10,
    },
    ...
}
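Since a field in `request_inputs` can now be either a scalar (`input_length`) or a 1-d array (`token_ids`), code that reads the profile export may want to normalize both shapes. Below is a minimal sketch of one way to do that. The helpers `as_list` and `normalize_request_inputs` are hypothetical names made up for illustration and are not part of genai-perf or PA; the sample record is the example above with the `...` placeholder and trailing commas removed so that it parses as JSON.

```python
import json
from typing import Any, Dict, List


def as_list(value: Any) -> List[Any]:
    """Return a list for both shapes: copy a list, wrap a scalar."""
    return list(value) if isinstance(value, list) else [value]


def normalize_request_inputs(record: Dict[str, Any]) -> Dict[str, List[Any]]:
    """Map every field under "request_inputs" to a list of values."""
    inputs = record.get("request_inputs", {})
    return {name: as_list(value) for name, value in inputs.items()}


# Sample record modeled on the post-fix example above (hypothetical data).
record = json.loads(
    """
    {
        "request_inputs": {
            "token_ids": [1, 2, 3, 4, 5, 6, 7, 8],
            "input_length": 8,
            "request_output_len": 10
        }
    }
    """
)

normalized = normalize_request_inputs(record)
```

With this, `normalized["token_ids"]` and `normalized["input_length"]` are both lists, so downstream code does not need to branch on the value type.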

genai-perf/genai_perf/parser.py

* support array of data in profile exporter
* add some tests
* run formatting
* fix pre-commit
* remove duplicate argparser arguments
* Fix Triton C API mode missing infer requested output datatype bug

---------

Co-authored-by: Matthew Kotila <[email protected]>

* Add tensorrtllm_engine option to service-kind and update testing (#700) (#762)
  * Add tensorrtllm_engine option to service-kind and update testing
  * Add output format check for tensorrtllm_engine

  Co-authored-by: Elias Bermudez <[email protected]>
* Support input payload generation for tensorrtllm engine (#767)
* Add functionality for async requests and output retrieval with Triton C API (#25)
* Support 1-d array data in profile exporter (#28)
  * support array of data in profile exporter
  * add some tests
  * run formatting
  * fix pre-commit
  * remove duplicate argparser arguments
  * Fix Triton C API mode missing infer requested output datatype bug

  ---------

  Co-authored-by: Matthew Kotila <[email protected]>
* Support profile data parsing for tensorrtllm engine service kind (#33)
  * support parsing tensorrtllm engine profile response
  * add test
  * refactor the test
  * update types and names
  * fix pre-commit
  * run PA with triton c api
  * more clean up on the tests
  * fix codeql
  * address feedback
* Add functionality to continue benchmarking in Triton C API mode if server logging support is disabled (#34)

---------

Co-authored-by: Hyunjae Woo <[email protected]>
Co-authored-by: Elias Bermudez <[email protected]>

nv-hwoo requested review from debermudez and matthewkotila August 6, 2024 00:34

nv-hwoo force-pushed the hwoo-profile-export-array branch from 549fece to fa7fdd2 Compare August 6, 2024 19:37

matthewkotila force-pushed the tensorrtllm-engine branch from f7dd9e4 to 70b12b8 Compare August 6, 2024 20:51

nv-hwoo force-pushed the hwoo-profile-export-array branch from fa7fdd2 to c0de3e8 Compare August 6, 2024 22:23

nv-hwoo commented Aug 6, 2024


genai-perf/genai_perf/parser.py (resolved)

matthewkotila force-pushed the tensorrtllm-engine branch from 70b12b8 to 456b5c7 Compare August 7, 2024 17:57

nv-hwoo added 5 commits August 7, 2024 11:01

support array of data in profile exporter

a06804f

add some tests

5357813

run formatting

15ca5ca

fix pre-commit

f6dbd93

remove duplicate argparser arguments

c7b1642

nv-hwoo force-pushed the hwoo-profile-export-array branch from faa23f4 to c7b1642 Compare August 7, 2024 18:01

nv-hwoo temporarily deployed to GITLAB August 7, 2024 18:02 — with GitHub Actions Inactive

matthewkotila temporarily deployed to GITLAB August 7, 2024 18:52 — with GitHub Actions Inactive

matthewkotila force-pushed the hwoo-profile-export-array branch from b32067d to f787972 Compare August 7, 2024 19:03

matthewkotila temporarily deployed to GITLAB August 7, 2024 19:03 — with GitHub Actions Inactive

matthewkotila force-pushed the hwoo-profile-export-array branch from f787972 to 9dcc8b7 Compare August 7, 2024 19:13

matthewkotila temporarily deployed to GITLAB August 7, 2024 19:13 — with GitHub Actions Inactive

matthewkotila force-pushed the hwoo-profile-export-array branch from 9dcc8b7 to faf762a Compare August 7, 2024 21:48

matthewkotila temporarily deployed to GITLAB August 7, 2024 21:48 — with GitHub Actions Inactive

matthewkotila force-pushed the hwoo-profile-export-array branch from faf762a to 5df151a Compare August 7, 2024 22:03

matthewkotila temporarily deployed to GITLAB August 7, 2024 22:03 — with GitHub Actions Inactive

matthewkotila force-pushed the hwoo-profile-export-array branch from 5df151a to 684958f Compare August 7, 2024 22:17

matthewkotila temporarily deployed to GITLAB August 7, 2024 22:17 — with GitHub Actions Inactive

matthewkotila approved these changes Aug 7, 2024


Fix Triton C API mode missing infer requested output datatype bug

79e3d0e

matthewkotila force-pushed the hwoo-profile-export-array branch from 684958f to 79e3d0e Compare August 7, 2024 23:23

matthewkotila temporarily deployed to GITLAB August 7, 2024 23:24 — with GitHub Actions Inactive

nv-hwoo merged commit e258b28 into tensorrtllm-engine Aug 8, 2024
7 checks passed

nv-hwoo deleted the hwoo-profile-export-array branch August 8, 2024 16:38
