Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

learner can became leader after (request_leadership/yield_leadership) #553

Open
UnamedRus opened this issue Nov 14, 2024 · 3 comments
Open

Comments

@UnamedRus
Copy link

UnamedRus commented Nov 14, 2024

We observe behavior, that if we run request_leadership on learner, it can actually became leader without change of it's "learner" status

peer 1, DC ID 0, 172.25.0.2:9234, voting member, 1
peer 2, DC ID 0, 172.25.0.3:9234, voting member, 1
peer 3, DC ID 0, 172.25.0.4:9234, voting member, 1
peer 4, DC ID 0, 172.26.0.2:9234, learner, 1
peer 5, DC ID 0, 172.26.0.3:9234, learner, 1
peer 6, DC ID 0, 172.26.0.4:9234, learner, 1
my id: 5, leader: 5, term: 2

From code comments:

* Learner will not initiate or participate in leader election.

    /**
     * (Read-only)
     * `true` if this server is a learner. Will not participate
     * leader election.
     */  

From readonly(learner) documentation
https://github.com/eBay/NuRaft/blob/master/docs/readonly_member.md

Both read-only members and zero-priority members do not initiate leader election, and never be a leader.

  1. Does it mean that learner, only affect behavior around elections, and not straight nomination of node as leader?
  2. If so, should documentation for readonly_member, corrected that in some cases, it can became leader?
@greensky00
Copy link
Contributor

Leaners should never be a leader in any circumstance. They neither initiate an election nor vote for others. Can you please share the full log of peer 5 so that I can see what happened during the leader election?

@greensky00
Copy link
Contributor

Ah, you executed request_leadership on learner. If it worked, then it is a bug; request_leadership should be denied on learners. Let me check if that's the case.

@UnamedRus
Copy link
Author

2024.11.14 15:44:47.624300 [ 34 ] {} <Debug> KeeperTCPHandler: Receive four letter command rqld
2024.11.14 15:44:47.624451 [ 34 ] {} <Information> RaftInstance: sent leadership request to leader 1
2024.11.14 15:44:47.624903 [ 55 ] {} <Information> RaftInstance: 0x7f55f8e87898 connected to 172.25.0.2:9234 (as a client)
2024.11.14 15:44:47.625424 [ 48 ] {} <Debug> RaftInstance: Receive a custom_notification_response message from peer 1 with Result=1, Term=1, NextIndex=39
2024.11.14 15:44:47.625833 [ 36 ] {} <Trace> KeeperDispatcher: Processing requests batch, size: 1, bytes: 0
2024.11.14 15:44:47.625837 [ 31 ] {} <Debug> KeeperTCPHandler: Received close event with xid 2147483647 for session id #19
2024.11.14 15:44:47.626203 [ 48 ] {} <Debug> RaftInstance: Receive a append_entries_request message from 1 with LastLogIndex=38, LastLogTerm 1, EntriesLength=0, CommitIndex=38 and Term=1
2024.11.14 15:44:47.626248 [ 48 ] {} <Debug> RaftInstance: Response back a append_entries_response message to 1 with Accepted=1, Term=1, NextIndex=39
2024.11.14 15:44:47.633617 [ 54 ] {} <Debug> RaftInstance: Receive a custom_notification_request message from 1 with LastLogIndex=38, LastLogTerm 1, EntriesLength=1, CommitIndex=38 and Term=1
2024.11.14 15:44:47.633642 [ 54 ] {} <Information> RaftInstance: [LEADERSHIP TAKEOVER] got request
2024.11.14 15:44:47.633768 [ 54 ] {} <Debug> IDisk: Copying from LocalStateFileDisk (path: /var/lib/clickhouse/coordination) state to LocalStateFileDisk (path: /var/lib/clickhouse/coordination) state-OLD.
2024.11.14 15:44:47.638704 [ 54 ] {} <Information> RaftInstance: [VOTE INIT] my id 5, my role candidate, term 2, log idx 38, log term 1, priority (target 1 / mine 1)
2024.11.14 15:44:47.638804 [ 54 ] {} <Debug> RaftInstance: send request_vote_request to server 3 with term 2
2024.11.14 15:44:47.638846 [ 54 ] {} <Debug> RaftInstance: send request_vote_request to server 2 with term 2
2024.11.14 15:44:47.638855 [ 54 ] {} <Debug> RaftInstance: send request_vote_request to server 1 with term 2
2024.11.14 15:44:47.638970 [ 54 ] {} <Debug> RaftInstance: Response back a custom_notification_response message to 1 with Accepted=1, Term=1, NextIndex=39
2024.11.14 15:44:47.639454 [ 54 ] {} <Information> RaftInstance: 0x7f55f8e87c18 connected to 172.25.0.3:9234 (as a client)
2024.11.14 15:44:47.639524 [ 55 ] {} <Information> RaftInstance: 0x7f55f8e87f98 connected to 172.25.0.4:9234 (as a client)
2024.11.14 15:44:47.646386 [ 56 ] {} <Debug> RaftInstance: Receive a request_vote_response message from peer 1 with Result=1, Term=2, NextIndex=39
2024.11.14 15:44:47.646413 [ 56 ] {} <Information> RaftInstance: [VOTE RESP] peer 1 (O), resp term 2, my role candidate, granted 2, responded 2, num voting members 3, quorum 2
2024.11.14 15:44:47.646417 [ 56 ] {} <Information> RaftInstance: Server is elected as leader for term 2
2024.11.14 15:44:47.646440 [ 56 ] {} <Information> RaftInstance: number of pending commit elements: 0
2024.11.14 15:44:47.646447 [ 56 ] {} <Information> RaftInstance: state machine commit index 38, precommit index 38, last log index 38      
2024.11.14 15:44:47.646455 [ 56 ] {} <Trace> RaftInstance: peer 4, interval: 500
2024.11.14 15:44:47.646465 [ 56 ] {} <Trace> RaftInstance: peer 3, interval: 500
2024.11.14 15:44:47.646472 [ 56 ] {} <Trace> RaftInstance: peer 2, interval: 500
2024.11.14 15:44:47.646476 [ 56 ] {} <Trace> RaftInstance: peer 6, interval: 500
2024.11.14 15:44:47.646481 [ 56 ] {} <Trace> RaftInstance: peer 1, interval: 500
2024.11.14 15:44:47.646672 [ 56 ] {} <Information> RaftInstance: [BECOME LEADER] appended new config at 39
2024.11.14 15:44:47.646702 [ 56 ] {} <Warning> RaftInstance: connection to peer 4 is not active long time: 152728 ms, force re-connect     
2024.11.14 15:44:47.646717 [ 56 ] {} <Debug> RaftInstance: reset RPC client for peer 4
2024.11.14 15:44:47.646931 [ 56 ] {} <Debug> RaftInstance: append_entries for 4 with LastLogIndex=39, LastLogTerm=2, EntriesLength=0, CommitIndex=38, Term=2, peer_last_sent_idx 0
2024.11.14 15:44:47.646977 [ 56 ] {} <Warning> RaftInstance: connection to peer 3 is not active long time: 152728 ms, force re-connect     
2024.11.14 15:44:47.646989 [ 56 ] {} <Debug> RaftInstance: reset RPC client for peer 3
2024.11.14 15:44:47.647011 [ 56 ] {} <Debug> RaftInstance: append_entries for 3 with LastLogIndex=39, LastLogTerm=2, EntriesLength=0, CommitIndex=38, Term=2, peer_last_sent_idx 0
2024.11.14 15:44:47.647068 [ 56 ] {} <Warning> RaftInstance: connection to peer 2 is not active long time: 152728 ms, force re-connect     
2024.11.14 15:44:47.647078 [ 56 ] {} <Debug> RaftInstance: reset RPC client for peer 2
2024.11.14 15:44:47.647121 [ 56 ] {} <Debug> RaftInstance: append_entries for 2 with LastLogIndex=39, LastLogTerm=2, EntriesLength=0, CommitIndex=38, Term=2, peer_last_sent_idx 0
2024.11.14 15:44:47.647148 [ 56 ] {} <Warning> RaftInstance: connection to peer 6 is not active long time: 152728 ms, force re-connect     
2024.11.14 15:44:47.647158 [ 56 ] {} <Debug> RaftInstance: reset RPC client for peer 6
2024.11.14 15:44:47.647274 [ 56 ] {} <Debug> RaftInstance: append_entries for 6 with LastLogIndex=39, LastLogTerm=2, EntriesLength=0, CommitIndex=38, Term=2, peer_last_sent_idx 0
2024.11.14 15:44:47.647332 [ 56 ] {} <Debug> RaftInstance: append_entries for 1 with LastLogIndex=38, LastLogTerm=1, EntriesLength=1, CommitIndex=38, Term=2, peer_last_sent_idx 0
2024.11.14 15:44:47.647424 [ 56 ] {} <Information> RaftInstance:   === LEADER (term 2) ===
2024.11.14 15:44:47.649645 [ 59 ] {} <Information> RaftInstance: 0x7f55f8e88318 connected to 172.25.0.4:9234 (as a client)
2024.11.14 15:44:47.649664 [ 54 ] {} <Warning> RaftInstance: [EDGE CASE] got stale RPC response from 3: current 0x7f55f8e88318 (8), from parameter 0x7f55f8e87f98 (3). will ignore this response
2024.11.14 15:44:47.649650 [ 46 ] {} <Information> RaftInstance: 0x7f55ce0a0018 connected to 172.26.0.2:9234 (as a client)
2024.11.14 15:44:47.649759 [ 59 ] {} <Debug> RaftInstance: Receive a append_entries_response message from peer 1 with Result=1, Term=2, NextIndex=40
2024.11.14 15:44:47.649773 [ 59 ] {} <Trace> RaftInstance: peer 1, prev matched idx: 0, new matched idx: 39
2024.11.14 15:44:47.649787 [ 59 ] {} <Debug> RaftInstance: Server 4 is busy, skip the request
2024.11.14 15:44:47.649793 [ 59 ] {} <Debug> RaftInstance: Server 3 is busy, skip the request
2024.11.14 15:44:47.649797 [ 59 ] {} <Debug> RaftInstance: Server 2 is busy, skip the request
2024.11.14 15:44:47.649801 [ 59 ] {} <Debug> RaftInstance: Server 6 is busy, skip the request
2024.11.14 15:44:47.649808 [ 59 ] {} <Debug> RaftInstance: append_entries for 1 with LastLogIndex=39, LastLogTerm=2, EntriesLength=0, CommitIndex=39, Term=2, peer_last_sent_idx 39
2024.11.14 15:44:47.649861 [ 59 ] {} <Trace> RaftInstance: leadership transfer is disabled
2024.11.14 15:44:47.649887 [ 60 ] {} <Debug> RaftInstance: commit upto 39, current idx 38
2024.11.14 15:44:47.649894 [ 60 ] {} <Trace> RaftInstance: commit upto 39, current idx 39
2024.11.14 15:44:47.649904 [ 60 ] {} <Information> RaftInstance: config at index 39 is committed, prev config log idx 1
2024.11.14 15:44:47.649945 [ 60 ] {} <Information> RaftInstance: new config log idx 39, prev log idx 1, cur config log idx 1, prev log idx 0
2024.11.14 15:44:47.649953 [ 60 ] {} <Debug> RaftInstance: system is reconfigured to have 6 servers, last config index: 1, this config index: 39
2024.11.14 15:44:47.650001 [ 60 ] {} <Information> RaftInstance: new configuration: log idx 39, prev log idx 1
peer 1, DC ID 0, 172.25.0.2:9234, voting member, 1
peer 2, DC ID 0, 172.25.0.3:9234, voting member, 1
peer 3, DC ID 0, 172.25.0.4:9234, voting member, 1
peer 4, DC ID 0, 172.26.0.2:9234, learner, 1
peer 5, DC ID 0, 172.26.0.3:9234, learner, 1
peer 6, DC ID 0, 172.26.0.4:9234, learner, 1
my id: 5, leader: 5, term: 2
2024.11.14 15:44:47.650072 [ 60 ] {} <Debug> RaftInstance: DONE: commit upto 39, current idx 39

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants