broken leader re-election after killing most of cluster nodes #78

Closed

xmonader opened this issue Apr 14, 2018 · 21 comments

Comments

@xmonader

xmonader commented Apr 14, 2018

I was randomly killing nodes of a 3-node cluster to verify an issue with leader re-election, checking the status using:

➜  wspace cat clusterstatus.sh 
syncobj_admin -conn 127.0.0.1:6000 -status
syncobj_admin -conn 127.0.0.1:6001 -status
syncobj_admin -conn 127.0.0.1:6002 -status

➜  wspace bash clusterstatus.sh | egrep 'leader:|self:'
leader: localhost:6000
self: localhost:6000
leader: localhost:6000
self: localhost:6001
leader: localhost:6000
self: localhost:6002
➜  wspace bash clusterstatus.sh | egrep 'leself:|leader:'  
leader: localhost:6000
self: localhost:6000
leader: localhost:6000
self: localhost:6001
leader: localhost:6000
self: localhost:6002
➜  wspace bash clusterstatus.sh | egrep 'self:|leader:'
leader: localhost:6000
self: localhost:6000
leader: localhost:6000
self: localhost:6001
leader: localhost:6000
self: localhost:6002
➜  wspace bash clusterstatus.sh | egrep 'self:|leader:'
leader: localhost:6001
self: localhost:6001
leader: localhost:6001
self: localhost:6002

Here the leader should have been set to 6001, but node 6000 reported None instead:

➜  wspace bash clusterstatus.sh | egrep 'self:|leader:'
leader: None
self: localhost:6000
leader: localhost:6001
self: localhost:6001
leader: localhost:6001
self: localhost:6002
➜  wspace bash clusterstatus.sh | egrep 'self:|leader:'
leader: localhost:6001
self: localhost:6000
leader: localhost:6001
self: localhost:6001
leader: localhost:6001
self: localhost:6002
➜  wspace bash clusterstatus.sh | egrep 'self:|leader:'
leader: localhost:6002
self: localhost:6000
leader: localhost:6002
self: localhost:6001
leader: localhost:6002
self: localhost:6002
➜  wspace bash clusterstatus.sh | egrep 'self:|leader:'
leader: localhost:6001
self: localhost:6000
leader: localhost:6001
self: localhost:6001

And I reached this very interesting state where node 6000 reports 6001 as its leader, but that leader isn't even active:

➜  wspace bash clusterstatus.sh | egrep 'self:|leader:'
leader: localhost:6001
self: localhost:6000

It was fixed after launching 6002:

➜  wspace bash clusterstatus.sh | egrep 'self:|leader:'
leader: localhost:6000
self: localhost:6000
leader: localhost:6000
self: localhost:6002

@bakwc
Owner

bakwc commented Apr 14, 2018

Currently a node's leader value resets only when the node receives messages from another node. It's not a bug: Raft requires more than half of the cluster to be alive to elect a new leader, so in your scenario this is normal behaviour.
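
For reference, a minimal sketch (not from the thread) of one node of the 3-node cluster shown above, assuming pysyncobj's SyncObj API; the ports match the status output:

# node.py -- start three copies: python node.py 0 / 1 / 2
import sys
import time
from pysyncobj import SyncObj

ADDRS = ['localhost:6000', 'localhost:6001', 'localhost:6002']
self_addr = ADDRS[int(sys.argv[1])]
partners = [a for a in ADDRS if a != self_addr]

node = SyncObj(self_addr, partners)

while True:
    # _getLeader() reports the same leader field that syncobj_admin -status prints;
    # with only one of the three nodes alive it stays None, as explained above.
    print('self:', self_addr, 'leader:', node._getLeader())
    time.sleep(2.0)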

@xmonader
Author

@bakwc I tested with 4 nodes, 3 of them alive, and the re-election didn't happen.

@despiegk

I saw this behaviour too, but in my case
I was testing with 4 nodes, 3 were active, and I killed the leader.
There was no re-election.
I cannot reproduce this every time, though.

kristofs-MacBook-Pro:recordchain kristofdespiegeleer$ syncobj_admin -conn 127.0.0.1:6001 -status -pass 1233
commit_idx: 1966
enabled_code_version: 0
last_applied: 1966
leader: localhost:6000
leader_commit_idx: 1966
log_len: 972
match_idx_count: 3
match_idx_server_localhost:6000: 0
match_idx_server_localhost:6002: 0
match_idx_server_localhost:6003: 0
next_node_idx_count: 3
next_node_idx_server_localhost:6000: 2
next_node_idx_server_localhost:6002: 2
next_node_idx_server_localhost:6003: 2
partner_node_status_server_localhost:6000: 2
partner_node_status_server_localhost:6002: 2
partner_node_status_server_localhost:6003: 2
partner_nodes_count: 3
raft_term: 2
readonly_nodes_count: 0
revision: 1899fe752bde334787dbfa54bb51bbd9fcf2826c
self: localhost:6001
self_code_version: 0
state: 0
unknown_connections_count: 1
uptime: 88
version: 0.3.3

Now I kill the leader:

kristofs-MacBook-Pro:recordchain kristofdespiegeleer$ syncobj_admin -conn 127.0.0.1:6001 -status -pass 1233
commit_idx: 3306
enabled_code_version: 0
last_applied: 3306
leader: None
leader_commit_idx: 3306
log_len: 316
match_idx_count: 3
match_idx_server_localhost:6000: 0
match_idx_server_localhost:6002: 0
match_idx_server_localhost:6003: 0
next_node_idx_count: 3
next_node_idx_server_localhost:6000: 2
next_node_idx_server_localhost:6002: 2
next_node_idx_server_localhost:6003: 2
partner_node_status_server_localhost:6000: 0
partner_node_status_server_localhost:6002: 0
partner_node_status_server_localhost:6003: 0
partner_nodes_count: 3
raft_term: 3
readonly_nodes_count: 0
revision: 1899fe752bde334787dbfa54bb51bbd9fcf2826c
self: localhost:6001
self_code_version: 0
state: 1
unknown_connections_count: 1
uptime: 163
version: 0.3.3

I am running the servers in tmux;
the 3 non-leaders are still running, but all of them now get errors when setting data.

Code where we test:
https://github.com/rivine/recordchain/edit/master/JumpScale9RecordChain/servers/raft/README.md

Restarting the leader leaves everything in limbo.

@bakwc
Owner

bakwc commented Apr 14, 2018

Thanks for the report! I'll try to reproduce it. How long does it take to get into this situation? Does it reproduce only when you have a password-protected cluster?

@despiegk

I'll try it without a password; I'll do it now.

@despiegk

Yes, indeed, that seems to be the issue; without a password I cannot reproduce it.

@bakwc
Owner

bakwc commented Apr 15, 2018

tested with 4 nodes and 3 of the nodes live and the re-election didn't happen

@xmonader, did you use a password? You created a 4-node cluster, killed only one node (the leader), and no new leader was elected?

@bakwc
Owner

bakwc commented Apr 15, 2018

Please try increasing the following config options; set them to:

raftMinTimeout = 1.0
raftMaxTimeout = 3.0
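
For reference, a minimal sketch of where these options go, assuming pysyncobj's SyncObjConf (the cluster addresses are illustrative):

from pysyncobj import SyncObj, SyncObjConf

conf = SyncObjConf(
    raftMinTimeout=1.0,  # lower bound of the randomized election timeout, in seconds
    raftMaxTimeout=3.0,  # upper bound of the randomized election timeout, in seconds
)
node = SyncObj('localhost:6000', ['localhost:6001', 'localhost:6002'], conf=conf)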

@despiegk

Sorry, I was away for a while; I will try it.

@bakwc
Owner

bakwc commented May 28, 2018

Do you use Python 2 or Python 3?

@despiegk

despiegk commented May 30, 2018 via email

@sakh251

sakh251 commented Mar 23, 2020

Any update on this?
I have this issue too

@bakwc
Owner

bakwc commented Mar 24, 2020

Could you please provide more details? What are your reproduction steps? What is the cluster size? How many nodes were alive?

@sakh251

sakh251 commented Mar 24, 2020

My cluster size is 4.
First I created a cluster and dynamically added 3 other nodes. I monitored syncObj._SyncObj__connectedNodes and saw 4 nodes, with the first node as leader.
When I kill other nodes the cluster keeps working: the killed node stays in syncObj._SyncObj__otherNodes but is removed from syncObj._SyncObj__connectedNodes.
But if I kill the leader, different nodes show different sets in syncObj._SyncObj__connectedNodes and none of them becomes leader.
I killed only the leader.
I also used: raftMinTimeout = 1.0, raftMaxTimeout = 3.0
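
For context, a rough sketch (assuming pysyncobj's dynamic membership API; the addresses are illustrative) of how a cluster like this is grown and shrunk at runtime:

from pysyncobj import SyncObj, SyncObjConf

conf = SyncObjConf(dynamicMembershipChange=True,  # required for runtime add/remove
                   raftMinTimeout=1.0,
                   raftMaxTimeout=3.0)

# The first node starts alone; the others are added later.
node = SyncObj('localhost:6000', [], conf=conf)

# Register a new member on the existing cluster (the equivalent admin command
# would be something like: syncobj_admin -conn localhost:6000 -add localhost:6001)
node.addNodeToCluster('localhost:6001')

# ... and remove it again:
node.removeNodeFromCluster('localhost:6001')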

@sakh251

sakh251 commented Mar 24, 2020

Also, when I removed a node from the cluster, it remained in otherNodes on the non-leader nodes. That may also be related to this.

@bakwc
Owner

bakwc commented Mar 24, 2020

Thanks for the report, I'll check it. What script did you use for testing? Could you post it somewhere (pastebin.com)? Does it always reproduce, or only from time to time?
Your steps were:

  1. Start a single-node cluster
  2. Add 3 more nodes dynamically
  3. Kill the first node
  4. No new leader was elected

Right?

@bakwc
Owner

bakwc commented Mar 24, 2020

Checked multiple times, can't reproduce. Could you please provide detailed step-by-step instructions for your actions?

@sakh251

sakh251 commented Mar 24, 2020

Actually, I did what you explained.
I can share my code with you. I am working on dynamic cluster extension: I have 10 ready-to-join nodes and, for example, a target of 4 nodes. The cluster should expand itself to reach 4; after that I will change the target to 8, and later reduce it to 3. So I need to be able to add and remove nodes.
To do this the cluster should remain available under different failures. https://pastebin.com/svvG3eHK is my (very dirty) test code. I use a port scanner to find other nodes. I also have some problems removing and adding nodes.

@sakh251

sakh251 commented Mar 24, 2020

Also, if you want, I can share my screen in a call.

@bakwc
Owner

bakwc commented Mar 24, 2020

When adding nodes you need to specify all current cluster nodes manually. I added #112 to add auto-discovery.
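
A minimal sketch of what "specify all current cluster nodes" means for a joining node, assuming pysyncobj and illustrative addresses:

from pysyncobj import SyncObj, SyncObjConf

conf = SyncObjConf(dynamicMembershipChange=True)

# The joining node must be constructed with the full list of nodes that are
# already in the cluster, not an empty or partial partner list.
current_members = ['localhost:6000', 'localhost:6001', 'localhost:6002']
new_node = SyncObj('localhost:6003', current_members, conf=conf)

# Then an existing member registers the newcomer, e.g. via addNodeToCluster()
# or the syncobj_admin -add option.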

@sakh251

sakh251 commented Mar 24, 2020

Thank you,
It works for me

@bakwc bakwc closed this as completed Mar 24, 2020