Tracing of GPU zeCommandListAppend<..> submitted kernels, copies, barriers... with use of an additional event #310

Open
jfedorov opened this issue Jul 19, 2024 · 2 comments

jfedorov commented Jul 19, 2024

Summary

Introduce a Level Zero Core or Tools API that attaches a timestamp-enabled event (in addition to any event already set) to a GPU task being submitted to a command list.

Details

Motivation

  1. Dynamically enabling tracing of GPU tasks (kernels, memory copies, barriers, ...) via Start/Stop has become a critical requirement for long-running workloads (e.g. AI workloads based on PyTorch).
  2. This requirement also demands that the tracing implementation have minimal overhead and disturb the traced application as little as possible.

Level Zero and the Loader have already added APIs that make Start/Stop possible. But the current Start/Stop implementation in tracing tools (e.g. PTI) incurs significant overhead, as there is no easy way to add an event with the Timestamp property or to change an existing event into one.
To meet requirement (2), let's add an API that attaches a profiling event (an event with Timestamp enabled) on the fly, prior to zeCommandListAppendLaunchKernel (zeCommandListAppendMemoryCopy, etc.).

Proposed API

New Functions

From the brainstorm, this API could be something like:

zeCommandListProfileNextAppend(
    ze_command_list_handle_t cmdList,
    ze_event_handle_t eventHandle
); 
Parameter      Description
cmdList        [in] handle of the command list
eventHandle    [out] event with the Timestamp property

The usage flow would be like this:

Call zeCommandListProfileNextAppend(cmdList, event) prior to the zeCommandListAppend..(cmdList,..) call that submits the task to be profiled. The caller of these two APIs should take additional precautions when two threads might submit to the same command list at around the same moment. It is up to the API user to handle the situation where the event from zeCommandListProfileNextAppend might be erroneously associated with a zeCommandListAppend... call from another thread.
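One way a tool could take that precaution is to hold a per-command-list lock across both calls. The sketch below is a toy model only, not Level Zero code: `MockCmdList`, `profileNextAppend`, `appendTask` and `profiledAppend` are hypothetical stand-ins for the command list, the proposed zeCommandListProfileNextAppend, a zeCommandListAppend... call, and a tool-side wrapper. The "event attaches to the next append only" behaviour is an assumption taken from the proposal.

```cpp
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a command list; NOT a Level Zero type.
struct MockCmdList {
    std::mutex guard;       // owned by the tool/user, not by the driver
    int pendingEvent = -1;  // event armed for the next append, -1 if none
    std::vector<std::pair<std::string, int>> appended;  // (task, profiling event)
};

// Models the proposed zeCommandListProfileNextAppend: arm an event.
void profileNextAppend(MockCmdList& cl, int event) {
    cl.pendingEvent = event;
}

// Models a zeCommandListAppend... call: consumes the armed event, if any.
void appendTask(MockCmdList& cl, const std::string& task) {
    cl.appended.emplace_back(task, cl.pendingEvent);
    cl.pendingEvent = -1;  // assumed one-shot: only the next append is profiled
}

// Arm and append under one lock so that an append from another thread
// cannot slip in between the two calls and pick up the wrong event.
void profiledAppend(MockCmdList& cl, int event, const std::string& task) {
    std::lock_guard<std::mutex> lock(cl.guard);
    profileNextAppend(cl, event);
    appendTask(cl, task);
}
```

In this model every thread that submits to a shared command list would go through `profiledAppend` (or take `guard` itself), so the arm-then-append pair behaves as a single step.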

The event passed into zeCommandListProfileNextAppend is to be created by the user and should come from an event pool created with ZE_EVENT_POOL_FLAG_KERNEL_TIMESTAMP.
The timing data from the event would be available upon completion of the task on the device and should be retrieved with zeEventQueryKernelTimestamp(event, &timestamp);
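Once the timestamp has been queried, a tool still has to turn the raw ticks into a duration. The sketch below shows one way to do that, assuming the start/end pair comes from the `global` or `context` member of `ze_kernel_timestamp_result_t`. `KernelTimestampData` is a local mirror of `ze_kernel_timestamp_data_t` so that the snippet compiles without the Level Zero headers, and `elapsedNs`, `validBits` and `nsPerTick` are illustrative names. How `nsPerTick` is derived from `ze_device_properties_t::timerResolution` depends on the properties struct version, so treat that conversion as an assumption to verify.

```cpp
#include <cstdint>

// Local mirror of ze_kernel_timestamp_data_t (kernelStart, kernelEnd),
// declared here only so the sketch is self-contained.
struct KernelTimestampData {
    uint64_t kernelStart;
    uint64_t kernelEnd;
};

// Converts a start/end tick pair into nanoseconds.
// validBits: number of meaningful bits in the counter (assumed to come from
//            ze_device_properties_t::kernelTimestampValidBits).
// nsPerTick: duration of one tick in nanoseconds (assumed to be derived from
//            ze_device_properties_t::timerResolution).
double elapsedNs(const KernelTimestampData& ts, unsigned validBits, double nsPerTick) {
    const uint64_t mask = (validBits >= 64) ? ~0ULL : ((1ULL << validBits) - 1);
    // Modular subtraction tolerates a single counter wrap between start and end.
    const uint64_t ticks = (ts.kernelEnd - ts.kernelStart) & mask;
    return static_cast<double>(ticks) * nsPerTick;
}
```

For example, `elapsedNs(KernelTimestampData{100, 350}, 32, 2.0)` covers 250 ticks at 2 ns each. The mask matters when the counter is narrower than 64 bits and wraps while the task is running.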

@jfedorov jfedorov changed the title Tracing of GPU zeCommandListAppend<..> submitted kernels, copies, barriers with use of additional event Tracing of GPU zeCommandListAppend<..> submitted kernels, copies, barriers... with use of an additional event Jul 19, 2024
@MichalMrozek

Let me summarize my understanding:

  • all thread safety is outside of the Level Zero driver; the user of this extension needs to guard access to cmdList from multiple threads
  • this operation adds an event to the subsequent append operation
  • this event would only be used for profiling the next command; it cannot be used for completion / dependency resolution
  • the event is provided by the caller and must have profiling capabilities

In the driver we would need to allow profiling in such conditions and apply profiling capture:
for fs1 this would mean we utilize one more post-sync operation;
for pre-fs1 this would mean we would need to add additional profiling calls around the next call.

The benefit of this API is that we do not need to introduce any other new APIs.

@jfedorov

Yes. You summarized correctly.
Just to make sure:
This API is to be used for any CommandList - not just for Immediate.
For example, this API could be also used to "instrument" commandLists passed to zeCommandQueueExecuteCommandLists
