CR-1202587: Fixing xbutil validate with ml_timeline on client #8277

Draft
wants to merge 13 commits into base: master
jvillarre
Collaborator

Problem solved by the commit

When an application uses multiple hardware contexts and turns on ml_timeline, it hits an error when creating the second hw_context. The ml_timeline plugin was holding on to the shared pointer for the first hw_context and never releasing it.

This change resolves the issue by no longer caching the hw_context in the plugin. The hw_context is instead created on the fly when it is needed.
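To illustrate the ownership problem, here is a minimal sketch that assumes the hw_context behaves like any object managed by `std::shared_ptr`. `HwContext`, `CachingPlugin`, `OnTheFlyPlugin`, and the `g_live` counter are hypothetical stand-ins for illustration, not the actual XRT types or the code in this PR.

```cpp
#include <memory>

// Hypothetical stand-in for a hardware context; NOT the XRT type.
int g_live = 0;  // number of contexts currently alive

struct HwContext {
  HwContext()  { ++g_live; }
  ~HwContext() { --g_live; }
};

// Pattern before the fix (assumed): the plugin stores a shared_ptr copy,
// so the context stays alive after the application drops its own handle.
struct CachingPlugin {
  std::shared_ptr<HwContext> cached;
  void on_context_created(const std::shared_ptr<HwContext>& ctx) { cached = ctx; }
};

// Pattern after the fix (assumed): nothing is held between calls; a
// short-lived context is created only for the duration of the flush.
struct OnTheFlyPlugin {
  void flush() {
    auto tmp = std::make_shared<HwContext>();  // created on the fly
    // ... read back timeline data using tmp ...
  }  // tmp is released here
};

// Returns how many contexts are still alive after the application
// releases its handle while a caching plugin is attached.
int live_after_release_with_cache() {
  CachingPlugin plugin;
  auto ctx = std::make_shared<HwContext>();
  plugin.on_context_created(ctx);
  ctx.reset();    // application is done with the context
  return g_live;  // the cached copy still keeps it alive
}

// Same scenario with the on-the-fly plugin.
int live_after_release_on_the_fly() {
  OnTheFlyPlugin plugin;
  auto ctx = std::make_shared<HwContext>();
  plugin.flush();
  ctx.reset();
  return g_live;  // nothing is left holding a context
}
```

In this sketch the caching variant still reports one live context after the application has released its handle, which is the kind of lingering reference that can block creation of a second context. The on-the-fly variant reports zero.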

Bug / issue (if any) fixed, which PR introduced the bug, how it was discovered

The bug is a side effect of driver changes to the debug buffer write-back mechanism, and it only appears when ml_timeline is turned on together with other debug features. It was discovered through regression testing.

How problem was solved, alternative solutions (if any) and why they were rejected

This is a temporary solution. The ml_timeline plugin cached the hw_context because it needs one when AIE profiling and/or AIE debug are enabled together with ml_timeline. In that case we have to enforce an order in which ml_timeline flushes its data before profiling or debug can fetch values. Since those can ask for data at any time during execution, we could not rely on a hook from the user side to give us a live hw_context at the moment we need to flush.

We will have to revisit this with a different solution to resolve that use case in the future.
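The ordering constraint described above can be sketched as a small guard that runs the timeline flush before the first fetch by any other consumer. This is only an illustration under assumptions: `FlushOrdering`, its callbacks, and the log strings are hypothetical and not part of XRT, and the sketch deliberately leaves out the hard part, which is obtaining a live hw_context at flush time.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical coordinator; names are illustrative, not XRT APIs.
class FlushOrdering {
public:
  explicit FlushOrdering(std::function<void()> flush_timeline)
    : flush_timeline_(std::move(flush_timeline)) {}

  // Called by a profiling or debug consumer at any point during execution.
  // The timeline flush is forced to run once, before the first fetch.
  template <typename Fetch>
  auto fetch(Fetch&& do_fetch) {
    if (!flushed_) {
      flush_timeline_();
      flushed_ = true;
    }
    return do_fetch();
  }

private:
  std::function<void()> flush_timeline_;
  bool flushed_ = false;
};

// Records the order in which the flush and the two fetches run.
std::vector<std::string> run_ordering_demo() {
  std::vector<std::string> log;
  FlushOrdering order([&log] { log.push_back("ml_timeline flush"); });
  order.fetch([&log] { log.push_back("aie_profile fetch"); return 0; });
  order.fetch([&log] { log.push_back("aie_debug fetch"); return 0; });
  return log;
}
```

The guard makes the flush happen lazily, at whatever point a consumer first asks for data, which is why the real plugin needs access to a usable hw_context at an arbitrary time rather than at a user-driven hook.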

Risks (if any) associated with the changes in the commit

High risk to the ml_timeline feature as this changes the behavior of flushing the data from the device.

What has been tested and how, request additional testing if necessary

Testing in progress on the original failing test case and other designs.

Documentation impact (if any)

No documentation impact.

…t was preventing ml timeline from working when the user application used multiple hardware contexts.

Signed-off-by: Jason Villarreal <[email protected]>
…tation information in the special case of ml_timeline

Signed-off-by: Jason Villarreal <[email protected]>
@jvillarre jvillarre marked this pull request as draft July 9, 2024 23:21