Avoid infinite and unnecessary loop when CN RPC processors are killed/interrupted by OS #12584

Pengzna · 2024-05-24T04:06:45Z

We found that in some scenarios, running stop-confignode.sh does not stop the ConfigNode immediately. Instead, the process hangs for about 90 seconds and prints more than 3 million "Unexpected interruption during waiting for configNode leader ready." warning logs.

The cause lies in the code modified by this PR: the ConfigNode RPC thread keeps sleeping inside a while loop. When the ConfigNode process is killed, the OS repeatedly tries to interrupt the sleeping thread, so the thread is interrupted over and over and prints a warning log on each pass through the loop.

So we checked the ConfigNode for all threads that sleep inside a while loop. If such a thread can only be interrupted during shutdown over its whole life cycle, it needs to break out of the while loop when it is interrupted. This avoids the endless log printing and the delay in stop-confignode.

For detailed info, see https://jira.infra.timecho.com:8443/browse/TIMECHODB-750
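
The idea described above can be illustrated with a minimal sketch. This is not the code changed by this PR: the class `LeaderWaitSketch`, the flag `leaderReady`, and the method `waitForLeaderReady` are hypothetical names chosen for illustration, and it assumes the waiting thread is only ever interrupted during shutdown. On interruption the loop logs once, restores the interrupt flag, and returns instead of going back to sleep.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class LeaderWaitSketch {
  // Hypothetical stand-in for "the ConfigNode leader is ready".
  static final AtomicBoolean leaderReady = new AtomicBoolean(false);

  /**
   * Polls until the leader is ready.
   * Returns true if the leader became ready, false if the wait was interrupted.
   */
  static boolean waitForLeaderReady(long pollMillis) {
    while (!leaderReady.get()) {
      try {
        Thread.sleep(pollMillis);
      } catch (InterruptedException e) {
        // Assumption: this thread is only interrupted during shutdown,
        // so log once, keep the interrupt flag set for callers, and leave the loop.
        System.err.println("Unexpected interruption during waiting for configNode leader ready.");
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) throws InterruptedException {
    Thread worker = new Thread(() -> waitForLeaderReady(50));
    worker.start();
    worker.interrupt();   // simulate the interruption seen at shutdown
    worker.join(2000);    // the worker should exit promptly instead of looping
    System.out.println("worker alive: " + worker.isAlive());
  }
}
```

Returning from the loop is only appropriate for threads that have no reason to be interrupted other than shutdown. A thread that can be interrupted for other reasons would need different handling.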

OneSizeFitsQuorum

LGTM

liyuheng55555

LGTM. The detailed description greatly helps to understand the issue and solution!

CRZbulabula

LGTM!!!

Pengzna added 2 commits May 24, 2024 12:04

avoid infinite loop when CN RPC processors are killed by OS

ad964e5

fix

d0dffb6

OneSizeFitsQuorum approved these changes May 24, 2024

liyuheng55555 approved these changes May 24, 2024

CRZbulabula approved these changes May 24, 2024

CRZbulabula merged commit fad6553 into apache:master May 24, 2024
56 of 57 checks passed

SzyWilliam pushed a commit to SzyWilliam/iotdb that referenced this pull request Nov 26, 2024

Avoid infinite and unecessary loop when CN RPC processors are killed/…

20c7be2

…interrupted by OS (apache#12584)
