Race Conditions and Deadlock in qarray Test With Thread Sanitizer #240

Closed
insertinterestingnamehere opened this issue Feb 12, 2024 · 2 comments · Fixed by #242
Labels
bug medium priority tsan Thread Sanitizer Errors
Comments

@insertinterestingnamehere
Collaborator

The qarray test hangs with thread sanitizer (x86-64, nemesis, clang17, no topology detection).

Prior to hanging though, it also emits various thread sanitizer errors:

atomic write:

qthread_incr(arg->donecount, 1);

non-atomic read of same variable:
while (donecount < maxsheps) {

In the test itself, non-atomic read:

if (count != ELEMENT_COUNT) {

atomic write:
qthread_incr(&count, 1);

Similar, a non-atomic read racing with a non-atomic write:

if (elem[j] != 1) {

memset(arg, 1, sizeof(bigobj));

Similar, again a non-atomic read racing with a non-atomic write:

double elem = *(double *)qarray_elem_nomigrate(a, i);

*(double *)arg = 1.0;

Somewhat similar:
Non-atomic write:

count = 0;

Atomic write:
qthread_incr(&count, 1);

Non-atomic write:

FREE(a, sizeof(qarray));

Non-atomic read:
void *ptr = qarray_elem_nomigrate(arg->a, count + inpage_offset);

Note the read occurs inside the called function qarray_elem_nomigrate at:
return a->base_ptr + ((segment_num * a->segment_bytes) + ((index - segment_num * a->segment_size) * a->unit_size));

Non-atomic write:

struct qarray_func_wrapper_args qfwa = { { NULL }, a, NULL, &donecount, startat, stopat };

Atomic write to same address:
qthread_incr(arg->donecount, 1);
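
For illustration, here is a minimal sketch of the pattern behind the donecount reports above: workers bump a completion counter atomically while the waiting thread polls it. It does not use the qthreads API. C11 atomics and pthreads stand in for qthread_incr and the qthreads scheduler, and every name in it (worker_arg, run_workers, MAX_WORKERS, slot) is hypothetical. The point it tries to show is that the polling side also needs an atomic load, and that a release increment paired with an acquire load makes the workers' plain writes visible before the waiter reads them.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

enum { MAX_WORKERS = 64 }; /* hypothetical upper bound for this sketch */

/* Hypothetical stand-in for the wrapper arguments in the report. */
typedef struct {
    atomic_int *donecount; /* shared completion counter */
    int *slot;             /* per-worker result, written with a plain store */
} worker_arg;

static void *worker(void *p) {
    worker_arg *arg = (worker_arg *)p;
    *arg->slot = 1; /* plain write, published by the release increment below */
    atomic_fetch_add_explicit(arg->donecount, 1, memory_order_release);
    return NULL;
}

/* Starts nworkers threads, waits on the counter, and returns the sum of
 * the slots, or -1 on bad input or if a thread could not be created. */
int run_workers(int nworkers) {
    pthread_t tids[MAX_WORKERS];
    worker_arg args[MAX_WORKERS];
    int slots[MAX_WORKERS] = {0};
    atomic_int donecount;
    int started = 0, sum = 0;

    if (nworkers < 0 || nworkers > MAX_WORKERS) return -1;
    atomic_init(&donecount, 0);

    for (; started < nworkers; started++) {
        args[started].donecount = &donecount;
        args[started].slot = &slots[started];
        if (pthread_create(&tids[started], NULL, worker, &args[started]) != 0)
            break;
    }

    /* The read side is atomic too: a plain `while (donecount < n)` here
     * is the kind of access the sanitizer flags. */
    while (atomic_load_explicit(&donecount, memory_order_acquire) < started)
        sched_yield();

    /* Safe to read: the acquire load above synchronizes with every
     * worker's release increment. */
    for (int i = 0; i < started; i++) sum += slots[i];
    for (int i = 0; i < started; i++) pthread_join(tids[i], NULL);
    return started == nworkers ? sum : -1;
}
```

Calling run_workers(4) should return 4 once all four workers have signalled. The same ordering argument applies to freeing shared memory: it is only safe after the waiter has observed every worker's final increment.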

@insertinterestingnamehere
Collaborator Author

Debugged some more. It's not actually a deadlock. That particular test just hits the thread sanitizer performance penalty really hard. Reducing the problem size gets it down to a reasonable runtime.
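
One possible way to shrink the problem size only for sanitizer builds is sketched below; this is an assumption, not necessarily how the test was actually changed. It relies on Clang's __has_feature(thread_sanitizer) check and GCC's __SANITIZE_THREAD__ macro. The UNDER_TSAN macro, the two element counts, and both helper functions are hypothetical names chosen for the example.

```c
#include <stddef.h>

/* Clang exposes __has_feature(thread_sanitizer); GCC defines
 * __SANITIZE_THREAD__. The checks are nested so compilers that lack
 * __has_feature never evaluate it. */
#if defined(__has_feature)
#  if __has_feature(thread_sanitizer)
#    define UNDER_TSAN 1
#  endif
#endif
#if !defined(UNDER_TSAN) && defined(__SANITIZE_THREAD__)
#  define UNDER_TSAN 1
#endif
#ifndef UNDER_TSAN
#  define UNDER_TSAN 0
#endif

/* Hypothetical sizes; the real test's values are not given in this issue. */
enum { FULL_ELEMENT_COUNT = 10000, TSAN_ELEMENT_COUNT = 100 };

/* Reports whether this translation unit was built with Thread Sanitizer. */
int built_with_tsan(void) { return UNDER_TSAN; }

/* Picks a smaller problem size when the sanitizer's slowdown applies. */
size_t test_element_count(void) {
    return UNDER_TSAN ? (size_t)TSAN_ELEMENT_COUNT : (size_t)FULL_ELEMENT_COUNT;
}
```

A test could then size its array with test_element_count() instead of a fixed constant, keeping full coverage in normal builds.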

@insertinterestingnamehere
Collaborator Author

Closing in favor of #303
