Fix hipBone segfault when running with rocprof #41

Open
dmcdougall wants to merge 2 commits into main

dmcdougall
Collaborator

Running hipbone with rocprof segfaults.

The code assumed that the output from popen is a single line. fgets reads only a single line (up to \n). When the popen (shell) output is clobbered by roctracer output, the fgets call sees the first line, which is the roctracer output, and NOT the output from the pipe, which arrives on a later line. It is not clear to me why roctracer output from the parent shell is caught in the output of the child shell. I suppose there is only one stdout, so when two things write to it the ordering is indeterminate. This might explain why you don't see it on other OSs; perhaps the way shells are handled takes more "time" on a RHEL system for a reason I don't know, and that's why we see this behaviour.

There are multiple ways to skin this cat. I picked one that pulls in an external software dependency. The other solutions involved string parsing in C++, which is its own can of worms.
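For reference, the string-parsing alternative mentioned above (not the approach taken in this PR) could look roughly like the sketch below. It reads every line from the pipe instead of only the first, then picks one. The helper names `readAllLines` and `lastNonEmptyLine` are hypothetical and do not appear in hipBone, and the rule "the wanted value is the last non-empty line" is an assumption made for illustration; the real selection rule depends on the command being run.

```cpp
#include <cstdio>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not from the hipBone source): run `cmd` through popen
// and return every line it writes to stdout, with trailing newlines removed.
std::vector<std::string> readAllLines(const std::string& cmd) {
  FILE* pipe = popen(cmd.c_str(), "r");
  if (pipe == nullptr) throw std::runtime_error("popen failed: " + cmd);

  std::vector<std::string> lines;
  std::string current;
  char buf[256];
  // fgets stops at '\n' or when buf is full, so keep appending to `current`
  // until the newline arrives; this also handles lines longer than buf.
  while (fgets(buf, sizeof(buf), pipe) != nullptr) {
    current += buf;
    if (!current.empty() && current.back() == '\n') {
      current.pop_back();
      lines.push_back(current);
      current.clear();
    }
  }
  if (!current.empty()) lines.push_back(current);  // final line without '\n'
  pclose(pipe);
  return lines;
}

// Assumption for illustration: extra (e.g. tracer) output comes first and the
// value we want is the last non-empty line. Returns "" if there is none.
std::string lastNonEmptyLine(const std::vector<std::string>& lines) {
  for (auto it = lines.rbegin(); it != lines.rend(); ++it) {
    if (!it->empty()) return *it;
  }
  return "";
}
```

A caller would replace the single fgets call with something like `lastNonEmptyLine(readAllLines(cmd))` and check for an empty result before parsing it, which avoids handing an unexpected first line to the parser.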

@dmcdougall dmcdougall requested a review from noelchalmers March 21, 2024 20:35