Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

client: print more info when leader disconnect #7907

Open
wants to merge 9 commits into
base: master
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
42 changes: 29 additions & 13 deletions client/pd_service_discovery.go
Original file line number Diff line number Diff line change
Expand Up @@ -991,30 +991,46 @@
log.Info("[pd] update member urls", zap.Strings("old-urls", oldURLs), zap.Strings("new-urls", urls))
}

func (c *pdServiceDiscovery) switchLeader(url string) (bool, error) {
// switchLeader switches the leader of the PD cluster.
// Note: For current implementation, when initializing the client, the connection to leader should be established.
// Otherwise, the initialization will fail.
func (c *pdServiceDiscovery) switchLeader(url string) (change bool, err error) {
oldLeader := c.getLeaderServiceClient()
if url == oldLeader.GetURL() && oldLeader.GetClientConn() != nil {
return false, nil
}

newConn, err := c.GetOrCreateGRPCConn(url)
var newConn *grpc.ClientConn
newConn, err = c.GetOrCreateGRPCConn(url)
// If gRPC connect is created successfully or leader is new, still saves.
if url != oldLeader.GetURL() || newConn != nil {
if url != oldLeader.GetURL() {
change = true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might fail to connect to the leader?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but the leader does change and it will affect the follower proxy

Comment on lines +1006 to +1007
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check and introducing change seem unnecessary due to line 999.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the client continues to fail to establish a connection with the leader, there will be situations where oldURL == url and err != nil``. So it's not always change = true` after line 999

log.Info("[pd] switch leader", zap.String("new-leader", url), zap.String("old-leader", oldLeader.GetURL()))
}
if err == nil {
change = true
log.Info("[pd] successfully connected to leader", zap.String("leader", url))
} else {
log.Warn("[pd] failed to connect leader", zap.String("leader", url), errs.ZapError(err))
}
if change {
// Set PD leader and Global TSO Allocator (which is also the PD leader)
// Note: Even if the connection is not established, the leader is still updated.
// The reason is that the leader has transferred to another PD server, and the client should
// update the member information. Meanwhile, the client must know the newest leader for follower proxy.
leaderClient := newPDServiceClient(url, url, newConn, true)
Copy link
Member

@HuSharp HuSharp Mar 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does newConn will be nill? Do we need to change if change to if change || newConn != nil ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if the connection is not established, the leader is still updated.
The reason is that the leader has transferred to another PD server, and the client should update the member information. Meanwhile, the client must know the newest leader for follower proxy.

c.leader.Store(leaderClient)
}
// Run callbacks
if c.tsoGlobalAllocLeaderUpdatedCb != nil {
if err := c.tsoGlobalAllocLeaderUpdatedCb(url); err != nil {
return true, err
// Run callbacks
if c.tsoGlobalAllocLeaderUpdatedCb != nil {
if err := c.tsoGlobalAllocLeaderUpdatedCb(url); err != nil {
return change, err

Check warning on line 1026 in client/pd_service_discovery.go

View check run for this annotation

Codecov / codecov/patch

client/pd_service_discovery.go#L1026

Added line #L1026 was not covered by tests
}
}
for _, cb := range c.leaderSwitchedCbs {
cb()
}
}
for _, cb := range c.leaderSwitchedCbs {
cb()
}
log.Info("[pd] switch leader", zap.String("new-leader", url), zap.String("old-leader", oldLeader.GetURL()))
return true, err
return
}

func (c *pdServiceDiscovery) updateFollowers(members []*pdpb.Member, leaderID uint64, leaderURL string) (changed bool) {
Expand Down
Loading