Pipeline + NoCache 情况下，可能出现 Can't truncate logs before _applied_id=22, last_log_kept=0 #421

CkTD · 2023-09-13T09:59:17Z

场景:
1. leader 给某个 peer 安装快照 [last_included_index=22]，完成后 _next_index = 23
2. 然后leader 先后发送两个 append entries rpc ，分别复制log [23, 28] 和 [29,32]

如果网络原因导致第二个 append entries rpc 先到达。在 NoCache 情况下，会失败。 replicator 把 _next_index 重置并取消掉第一个 rpc。下面的这个检查又进入到了 truncate log 的分枝。_next_index--，也就是设置为 22。但实际上peer上仅仅是缺少了日志，应该是第一个分支。

braft/src/braft/replicator.cpp

Lines 442 to 463 in 59c40e5

    
           // prev_log_index and prev_log_term doesn't match 
        
           r->_reset_next_index(); 
        
           if (response->last_log_index() + 1 < r->_next_index) { 
        
               BRAFT_VLOG << "Group " << r->_options.group_id 
        
                          << " last_log_index at peer=" << r->_options.peer_id  
        
                          << " is " << response->last_log_index(); 
        
               // The peer contains less logs than leader 
        
               r->_next_index = response->last_log_index() + 1; 
        
           } else {   
        
               // The peer contains logs from old term which should be truncated, 
        
               // decrease _last_log_at_peer by one to test the right index to keep 
        
               if (BAIDU_LIKELY(r->_next_index > 1)) { 
        
                   BRAFT_VLOG << "Group " << r->_options.group_id  
        
                              << " log_index=" << r->_next_index << " mismatch"; 
        
                   --r->_next_index; 
        
               } else { 
        
                   LOG(ERROR) << "Group " << r->_options.group_id  
        
                              << " peer=" << r->_options.peer_id 
        
                              << " declares that log at index=0 doesn't match," 
        
                                 " which is not supposed to happen"; 
        
               } 
        
           }

后续 append entries 的时候， prev_index = 21 ， peer 上是没有的，仍然会失败。然后不停地减小 _next_index 重试，直到变成0触发了 peer 上的 truncate suffix。

ehds · 2023-09-14T02:24:59Z

取消 rpc 为什么进入下面的分支？应该是判断 if (cntl->Failed())分支，所以 next_index 不会 --?

braft/src/braft/replicator.cpp

Lines 400 to 416 in 59c40e5

    
           if (cntl->Failed()) { 
        
               ss << " fail, sleep."; 
        
               BRAFT_VLOG << ss.str(); 
        
               // TODO: Should it be VLOG? 
        
               LOG_IF(WARNING, (r->_consecutive_error_times++) % 10 == 0) 
        
                               << "Group " << r->_options.group_id 
        
                               << " fail to issue RPC to " << r->_options.peer_id 
        
                               << " _consecutive_error_times=" << r->_consecutive_error_times 
        
                               << ", " << cntl->ErrorText(); 
        
               // If the follower crashes, any RPC to the follower fails immediately, 
        
               // so we need to block the follower for a while instead of looping until 
        
               // it comes back or be removed 
        
               // dummy_id is unlock in block 
        
               r->_reset_next_index(); 
        
               return r->_block(start_time_us, cntl->ErrorCode()); 
        
           }

CkTD · 2023-09-14T09:29:54Z

取消 rpc 为什么进入下面的分支？应该是判断 if (cntl->Failed())分支，所以 next_index 不会 --?

braft/src/braft/replicator.cpp

Lines 400 to 416 in 59c40e5

if (cntl->Failed()) {

ss << " fail, sleep.";

BRAFT_VLOG << ss.str();

// TODO: Should it be VLOG?

LOG_IF(WARNING, (r->_consecutive_error_times++) % 10 == 0)

<< "Group " << r->_options.group_id

<< " fail to issue RPC to " << r->_options.peer_id

<< " _consecutive_error_times=" << r->_consecutive_error_times

<< ", " << cntl->ErrorText();

// If the follower crashes, any RPC to the follower fails immediately,

// so we need to block the follower for a while instead of looping until

// it comes back or be removed

// dummy_id is unlock in block

r->_reset_next_index();

return r->_block(start_time_us, cntl->ErrorCode());

}

被取消的第一个rpc应该是进入了if (cntl->Failed())。

感觉出问题的是先到达的第二个 rpc， cntl->Failed() 没有失败，但是由于 log gap存在，Cache 又未开启，所以 response->success() 是 false， response->last_log_index() + 1 又等于 r->_next_index ，就进入了 truncate log 分枝。

ehds · 2023-09-15T09:25:27Z

取消 rpc 为什么进入下面的分支？应该是判断 if (cntl->Failed())分支，所以 next_index 不会 --?

braft/src/braft/replicator.cpp

Lines 400 to 416 in 59c40e5

if (cntl->Failed()) {

ss << " fail, sleep.";

BRAFT_VLOG << ss.str();

// TODO: Should it be VLOG?

LOG_IF(WARNING, (r->_consecutive_error_times++) % 10 == 0)

<< "Group " << r->_options.group_id

<< " fail to issue RPC to " << r->_options.peer_id

<< " _consecutive_error_times=" << r->_consecutive_error_times

<< ", " << cntl->ErrorText();

// If the follower crashes, any RPC to the follower fails immediately,

// so we need to block the follower for a while instead of looping until

// it comes back or be removed

// dummy_id is unlock in block

r->_reset_next_index();

return r->_block(start_time_us, cntl->ErrorCode());

}

被取消的第一个rpc应该是进入了if (cntl->Failed())。

感觉出问题的是先到达的第二个 rpc， cntl->Failed() 没有失败，但是由于 log gap存在，Cache 又未开启，所以 response->success() 是 false， response->last_log_index() + 1 又等于 r->_next_index ，就进入了 truncate log 分枝。

确实会存在这个问题，能否提个PR fix 这个问题呢 👍 ？

CkTD · 2023-09-20T03:06:15Z

取消 rpc 为什么进入下面的分支？应该是判断 if (cntl->Failed())分支，所以 next_index 不会 --?

braft/src/braft/replicator.cpp

Lines 400 to 416 in 59c40e5

if (cntl->Failed()) {

ss << " fail, sleep.";

BRAFT_VLOG << ss.str();

// TODO: Should it be VLOG?

LOG_IF(WARNING, (r->_consecutive_error_times++) % 10 == 0)

<< "Group " << r->_options.group_id

<< " fail to issue RPC to " << r->_options.peer_id

<< " _consecutive_error_times=" << r->_consecutive_error_times

<< ", " << cntl->ErrorText();

// If the follower crashes, any RPC to the follower fails immediately,

// so we need to block the follower for a while instead of looping until

// it comes back or be removed

// dummy_id is unlock in block

r->_reset_next_index();

return r->_block(start_time_us, cntl->ErrorCode());

}

被取消的第一个rpc应该是进入了if (cntl->Failed())。
感觉出问题的是先到达的第二个 rpc， cntl->Failed() 没有失败，但是由于 log gap存在，Cache 又未开启，所以 response->success() 是 false， response->last_log_index() + 1 又等于 r->_next_index ，就进入了 truncate log 分枝。

确实会存在这个问题，能否提个PR fix 这个问题呢 👍 ？

这里还不能直接把条件改成 response->last_log_index() + 1 == r->_next_index。这个条件成立，也有可能是非乱序，但确实是有冲突情况，需要 truncate log。还得想个办法区分这两种情况，没想到很好的方法来 fix。

ehds · 2023-09-20T06:24:59Z

--- a/src/braft/replicator.cpp
+++ b/src/braft/replicator.cpp
@@ -450,9 +450,11 @@ void Replicator::_on_rpc_returned(ReplicatorId id, brpc::Controller* cntl,
             // The peer contains logs from old term which should be truncated,
             // decrease _last_log_at_peer by one to test the right index to keep
             if (BAIDU_LIKELY(r->_next_index > 1)) {
-                BRAFT_VLOG << "Group " << r->_options.group_id 
-                           << " log_index=" << r->_next_index << " mismatch";
+              BRAFT_VLOG << "Group " << r->_options.group_id
+                         << " log_index=" << r->_next_index << " mismatch";
+              if (request->prev_log_index() == r->_next_index - 1) {
                 --r->_next_index;
+              }
             } else {
                 LOG(ERROR) << "Group " << r->_options.group_id 
                            << " peer=" << r->_options.peer_id

这样可行吗？把相等判断放到下面，只有当 request->prev_log_index() == r->_next_index - 1 即 AppendEntries 请求的 prev_log_index 是当前 _flying_append_entries 中最小的还未被确认的，才能去缩小 _next_index.

首先最小的请求一定会被处理，如果是 cntl.Failed , 恢复最初始的状态继续执行，如果是不匹配则向下执行：

如果是 leader 安装快照的场景，这个条件就不会被触发，因为最小的 prev_log_index 一定能匹配 term。

其他情况也不会有影响，因为当前最小的请求的 term 不匹配，那么之后的请求也一定不匹配。

This was referenced Apr 7, 2024

decrease _next_index when when the request with the minimum log #447

Open

[replicator] decrease _next_index when when the request with the minimum log ehds/braft#1

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Pipeline + NoCache 情况下，可能出现 Can't truncate logs before _applied_id=22, last_log_kept=0 #421

Pipeline + NoCache 情况下，可能出现 Can't truncate logs before _applied_id=22, last_log_kept=0 #421

CkTD commented Sep 13, 2023

ehds commented Sep 14, 2023 •

edited

Loading

CkTD commented Sep 14, 2023 •

edited

Loading

ehds commented Sep 15, 2023

CkTD commented Sep 20, 2023

ehds commented Sep 20, 2023 •

edited

Loading

Pipeline + NoCache 情况下，可能出现 Can't truncate logs before _applied_id=22, last_log_kept=0 #421

Pipeline + NoCache 情况下，可能出现 Can't truncate logs before _applied_id=22, last_log_kept=0 #421

Comments

CkTD commented Sep 13, 2023

ehds commented Sep 14, 2023 • edited Loading

CkTD commented Sep 14, 2023 • edited Loading

ehds commented Sep 15, 2023

CkTD commented Sep 20, 2023

ehds commented Sep 20, 2023 • edited Loading

ehds commented Sep 14, 2023 •

edited

Loading

CkTD commented Sep 14, 2023 •

edited

Loading

ehds commented Sep 20, 2023 •

edited

Loading