[DocDB] FATAL occurred on one of alive nodes during kernel shutdown nemesis #26178

qvad opened this issue Feb 25, 2025 · 0 comments
Labels
area/docdb YugabyteDB core features kind/bug This issue is a bug priority/medium Medium priority issue

qvad commented Feb 25, 2025

Jira Link: DB-15513

Description

This FATAL occurred during the kernel shutdown nemesis test: we halted one node, and a different node that was still alive hit a FATAL about 3 minutes later.

A node that is still alive should not crash in this case.

2025-02-24 07:17:56,131:DEBUG: Acting KernelShutdownNemesis nemesis
...
2025-02-24 07:17:56,143: INFO: start step Linux kernel stop(54f1d9cf-31e7-4f8b-a5a8-edd04d376aeb) instances=172_151_31_75 max_timeout=900 max_num_nodes=1 name=Linux kernel stop
2025-02-24 07:17:56,144: INFO: Sending halt ['i-01bfb2fa030434ab8']
2025-02-24 07:17:56,145:DEBUG: > 172.151.43.204 > nohup sudo  /tmp/strobe 1000 100 60 >& /tmp/strobe.log & 2>&1
2025-02-24 07:17:56,146:DEBUG: > 172.151.31.75 > sudo halt -f 2>&1
2025-02-24 07:17:56,451:DEBUG: < 172.151.43.204 < 
F20250224 07:21:15 ../../src/yb/tserver/remote_bootstrap_service.cc:779] Check failed: _s.ok() Bad status: Timed out (yb/rpc/outbound_call.cc:647): Unable to refresh Log Anchor session 7573ad2bc2a942868dfb857683a53b8e-80e661b9f2f24388bc7203f687f64905-3583.942s: KeepLogAnchorAlive RPC (request call id 7707127) to 172.151.45.35:9100 timed out after 5.000s
    @     0xaaaade2ff55c  google::LogMessage::SendToLog()
    @     0xaaaade300400  google::LogMessage::Flush()
    @     0xaaaade300a9c  google::LogMessageFatal::~LogMessageFatal()
    @     0xaaaadf9f4f8c  yb::tserver::RemoteBootstrapServiceImpl::EndExpiredSessions()
    @     0xaaaadfee2658  yb::Thread::SuperviseThread()
    @     0xffffb38878b8  start_thread
    @     0xffffb38e3afc  thread_start
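
The trace shows the FATAL coming from a status check in `RemoteBootstrapServiceImpl::EndExpiredSessions()` after a `KeepLogAnchorAlive` RPC timed out. As a rough illustration of the behavior requested above (a failed refresh should end the affected session, not the process), here is a minimal standalone sketch. It is not YugabyteDB code: `Status`, `AnchorSession`, and `RefreshSessionsTolerant` are hypothetical names invented for this example, and the real session bookkeeping is likely more involved.

```cpp
#include <functional>
#include <iostream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a status type; this is NOT yb::Status.
struct Status {
  bool ok;
  std::string message;
};

// Hypothetical session record. `refresh` stands in for the keepalive RPC
// to the remote peer and may fail, for example with a timeout.
struct AnchorSession {
  std::string id;
  std::function<Status()> refresh;
};

// One pass of a periodic maintenance loop (assumed shape, for illustration).
// A session whose refresh fails is logged and dropped; the process keeps
// running. Returns the ids of the dropped sessions.
std::vector<std::string> RefreshSessionsTolerant(
    std::vector<AnchorSession>& sessions) {
  std::vector<std::string> dropped;
  std::vector<AnchorSession> alive;
  for (AnchorSession& s : sessions) {
    const Status st = s.refresh();
    if (st.ok) {
      alive.push_back(std::move(s));
    } else {
      // Log a warning instead of aborting on the bad status.
      std::cerr << "WARNING: dropping session " << s.id << ": "
                << st.message << "\n";
      dropped.push_back(s.id);
    }
  }
  sessions = std::move(alive);
  return dropped;
}
```

With one healthy session and one whose refresh times out, the call returns only the id of the failing session and leaves the healthy one in place. Whether dropping the session is the right recovery here (as opposed to retrying) is a design question for the maintainers.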

Issue Type

kind/bug

Warning: Please confirm that this issue does not contain any sensitive information

  • I confirm this issue does not contain any sensitive information.
@qvad qvad added area/docdb YugabyteDB core features status/awaiting-triage Issue awaiting triage labels Feb 25, 2025
@yugabyte-ci yugabyte-ci added kind/bug This issue is a bug priority/medium Medium priority issue and removed status/awaiting-triage Issue awaiting triage labels Feb 25, 2025