-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix racing condition in BaseQueryRewriteContext #17124
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Mingshi Liu <[email protected]>
Signed-off-by: Mingshi Liu <[email protected]>
@dbwiddis @msfroh @bugmakerrrrrr ready for review, thank you!!! This change should go to 3.0 version, so I added to changelog in 3.0 |
❌ Gradle check result for 30333a7: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
asyncActionsRef.updateAndGet(list -> { | ||
List<BiConsumer<Client, ActionListener<?>>> newList = new ArrayList<>(list); | ||
newList.add(asyncAction); | ||
return newList; | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to copy the list here; this gives us no value over a CopyOnWriteArrayList
which we wouldn't need the atomic reference for. Can't we simply just update the list in-place but wrapped in the atomic reference for thread safety? (CC @msfroh to confirm this suggestion):
asyncActionsRef.updateAndGet(list -> { | |
List<BiConsumer<Client, ActionListener<?>>> newList = new ArrayList<>(list); | |
newList.add(asyncAction); | |
return newList; | |
}); | |
asyncActionsRef.updateAndGet(list -> { | |
list.add(asyncAction); | |
return list; | |
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried, if not returning new list, I am having test failure, so it still cause racing condition during registration.
BaseQueryRewriteContextTests > testRacingConditionFixed FAILED
java.lang.AssertionError: expected:<5000> but was:<4874>
at org.junit.Assert.fail(Assert.java:89)
at org.junit.Assert.failNotEquals(Assert.java:835)
at org.junit.Assert.assertEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:633)
at org.opensearch.index.query.BaseQueryRewriteContextTests.testRacingConditionFixed(BaseQueryRewriteContextTests.java:167)
BaseQueryRewriteContextTests > testConcurrentRegistrationAndExecution FAILED
java.lang.AssertionError: expected:<10000> but was:<7802>
at org.junit.Assert.fail(Assert.java:89)
at org.junit.Assert.failNotEquals(Assert.java:835)
at org.junit.Assert.assertEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:633)
at org.opensearch.index.query.BaseQueryRewriteContextTests.testConcurrentRegistrationAndExecution(BaseQueryRewriteContextTests.java:108)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... if we need to copy the whole list anyway, I think I like @dbwiddis's suggestion of using CopyOnWriteArrayList
over my AtomicReference
suggestion.
server/src/test/java/org/opensearch/index/query/BaseQueryRewriteContextTests.java
Show resolved
Hide resolved
IMO, I think there is no racing condition. Before rewriting again, we should have executed all registered async actions and notifies the listener, so no new actions will be registered during |
Description
Fixing #17103
to avoid racing in registering and execution:
use AtomicReference<List<BiConsumer<Client, ActionListener<?>>>> asyncActionsRef,
while registerAsyncAction, use updateAndGet and return newList,
while executeAsyncActions, use getAndSet and return new List before execution.
added unit tests to help verify the fix.
Related Issues
[[Resolves #[Issue number to be closed when this PR is merged]
](https://github.com/opensearch-project/OpenSearch/issues/17103)](https://github.com/opensearch-project/OpenSearch/issues/17103)
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.