Avoid infinite and unnecessary loop when CN RPC processors are killed/interrupted by OS #12584

Merged: 2 commits, May 24, 2024

Collaborator

@Pengzna Pengzna commented May 24, 2024

We found that in some scenarios, after stop-confignode.sh is executed, the ConfigNode does not exit immediately. Instead it hangs for about 90 seconds and prints more than 3 million "Unexpected interruption during waiting for configNode leader ready." warning logs.

The cause lies in the code modified by this PR: the ConfigNode RPC thread sleeps repeatedly inside a while loop. When the ConfigNode process is killed, the OS keeps trying to interrupt the sleeping thread, so the thread is interrupted again and again and prints a warning on every pass through the loop.

So we checked the ConfigNode for every thread that sleeps inside a while loop. If such a thread can only ever be interrupted during shutdown, it must break out of the while loop when it is interrupted. This avoids both the endless log output and the delay in stop-confignode.
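
A minimal sketch of the pattern described above, not the actual IoTDB code: the class `LeaderWaitSketch`, the `leaderReady` flag, the `warnings` counter, and `waitForLeaderReady` are hypothetical names used only for illustration. It assumes a polling loop that sleeps between checks and shows the loop exiting on the first interrupt instead of logging and sleeping again.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class LeaderWaitSketch {
    // Hypothetical stand-in for "the ConfigNode leader is ready".
    static final AtomicBoolean leaderReady = new AtomicBoolean(false);
    // Counts how many times the warning would have been logged.
    static final AtomicInteger warnings = new AtomicInteger();

    /**
     * Polls until the leader is ready.
     * Returns true if it became ready, false if the wait was abandoned
     * because the thread was interrupted (assumed to mean shutdown).
     */
    static boolean waitForLeaderReady(long pollMillis) {
        while (!leaderReady.get()) {
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag so callers can still observe it.
                Thread.currentThread().interrupt();
                // Log once, then leave the loop instead of retrying the sleep.
                warnings.incrementAndGet();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> waitForLeaderReady(50));
        worker.start();
        worker.interrupt();   // simulate the interrupt delivered at shutdown
        worker.join(2000);
        System.out.println("alive=" + worker.isAlive() + " warnings=" + warnings.get());
    }
}
```

Returning from the loop is only appropriate under the assumption stated in the description, namely that the thread is never interrupted for any reason other than shutdown. A thread that can be interrupted for other reasons would need to decide whether to resume waiting.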

For details, see https://jira.infra.timecho.com:8443/browse/TIMECHODB-750

Contributor

@OneSizeFitsQuorum OneSizeFitsQuorum left a comment

LGTM

Collaborator

@liyuheng55555 liyuheng55555 left a comment

LGTM. The detailed description greatly helps in understanding the issue and the solution!

Contributor

@CRZbulabula CRZbulabula left a comment

LGTM!!!

@CRZbulabula CRZbulabula merged commit fad6553 into apache:master May 24, 2024
56 of 57 checks passed
SzyWilliam pushed a commit to SzyWilliam/iotdb that referenced this pull request Nov 26, 2024