Fix Huge Number of Watches in ZooKeeper #17482
Open
+950
−42
Fixes #6647
Description
This PR builds on #6683 and #9172 and aims to reduce the number of ZooKeeper watches.
Fixed Huge Number of Watches in ZooKeeper
Previously, Announcer.java caused all child nodes under the parent path to be watched by the ZooKeeper ensemble, which produced an unnecessarily large number of ZooKeeper watches.
The new NodeAnnouncer.java class, which is heavily modelled on Announcer.java, aims to address this issue by announcing a single node within a ZooKeeper ensemble. By eliminating the watches on child nodes, this approach significantly reduces the total number of watches in ZooKeeper. Tests conducted on the production server also indicate a decrease in watch counts resulting from this change.
The original Announcer.java still provides better performance for segment announcements, and hence will be retained in the codebase.
Documentation
Refactoring
Extracted the Announceable class from Announcer.java into Announceable.java.
Introduced ZKPathsUtils.java to abstract the retrieval of the ZooKeeper path and ZooKeeper node.
Release note
Improved: ZooKeeper no longer spins up an unnecessarily large number of watches when running realtime tasks.
Key changed/added classes in this PR
Announcer.java
NodeAnnouncer.java
Announceable.java
AnnouncerModule.java
ZKPathsUtils.java
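As a rough illustration of why watching a single node instead of all children lowers the watch count, the toy model below counts watches under two strategies. It does not talk to ZooKeeper or use Curator. The class name WatchCountModel, the /announcements path, and the assumption that every process announces one node under the same parent are all hypothetical and chosen only for this sketch. The real Announcer.java and NodeAnnouncer.java may register watches differently.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model only: counts watches per strategy; it never contacts ZooKeeper. */
final class WatchCountModel {

    /** Hypothetical stand-in for a server-side watch table: path -> watcher ids. */
    private final Map<String, Set<String>> watchesByPath = new HashMap<>();

    void addWatch(String path, String watcherId) {
        watchesByPath.computeIfAbsent(path, p -> new HashSet<>()).add(watcherId);
    }

    int totalWatches() {
        return watchesByPath.values().stream().mapToInt(Set::size).sum();
    }

    /**
     * Assumed children-cache behaviour: every process watches the parent path
     * and every child node under it, including nodes announced by other processes.
     */
    static int childrenCacheWatches(int processes) {
        WatchCountModel zk = new WatchCountModel();
        String parent = "/announcements"; // hypothetical path
        for (int watcher = 0; watcher < processes; watcher++) {
            String id = "process-" + watcher;
            zk.addWatch(parent, id);
            for (int child = 0; child < processes; child++) {
                zk.addWatch(parent + "/node-" + child, id);
            }
        }
        return zk.totalWatches();
    }

    /** Assumed single-node behaviour: every process watches only the node it announced. */
    static int singleNodeWatches(int processes) {
        WatchCountModel zk = new WatchCountModel();
        String parent = "/announcements"; // hypothetical path
        for (int p = 0; p < processes; p++) {
            zk.addWatch(parent + "/node-" + p, "process-" + p);
        }
        return zk.totalWatches();
    }

    public static void main(String[] args) {
        int processes = 100;
        System.out.println("children-cache style: " + childrenCacheWatches(processes));
        System.out.println("single-node style:    " + singleNodeWatches(processes));
    }
}
```

Under these assumptions the children-cache style grows quadratically with the number of announcing processes, while the single-node style grows linearly.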
Further Action Items
I plan to create issues for the following follow-up actions after this PR:
Deprecated Code
The PathChildrenCache and NodeCache classes have been deprecated since Curator 5.0.0. We can look into replacing them with CuratorCache. CuratorCache requires ZooKeeper 3.6+, and we are currently using ZooKeeper 3.8.4.
Concurrency Flow
Add documentation of the concurrency control flow for NodeAnnouncer.
This PR has: