Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consensus node error while recovering from crash - f+1 commits and still changed view #780

Open
vncoelho opened this issue Dec 21, 2022 · 9 comments

Comments

@vncoelho
Copy link
Member

Bug found while testing dBFT 3.0 PR.

However, it is suspected that this can also happen on current version:

neo> [12:58:47.717] OnStart
[12:58:49.169] Initialize: height=123019 view=0 index=3 role=PrimaryP1
[12:58:49.180] Sending RecoveryRequest: height=123019 view=0 nc=0 nf=0
[12:58:50.182] Sending PrepareRequest: height=123019 view=0 Id=0
[12:58:51.123] OnRecoveryMessageReceived: height=123019 view=0 index=0 pId=0
[12:58:51.135] OnCommitReceived: height=123019 view=0 index=0 nc=0 nf=0
[12:58:51.139] OnCommitReceived: height=123019 view=0 index=2 nc=0 nf=0
[12:58:51.141] Recovery finished: (valid/total) ChgView: 0/0 PrepReq: 0/0 PrepResp: 0/2 PreCommits: 3/3 Commits: 2/2
[12:58:51.143] OnCommitReceived: height=123019 view=0 index=0 nc=0 nf=0
[12:58:51.145] OnCommitReceived: height=123019 view=0 index=2 nc=0 nf=0
[12:58:56.538] Sending ChangeView: height=123019 view=0 nv=1 nc=0 nf=0 reason=Timeout
[12:58:57.005] OnRecoveryMessageReceived: height=123019 view=1 index=1 pId=0
[12:58:57.015] OnChangeViewReceived: height=123019 view=0 index=1 nv=1 reason=Timeout
[12:58:57.018] OnChangeViewReceived: height=123019 view=0 index=2 nv=1 reason=Timeout
[12:58:57.022] View changed: view=1 primary=03bbc4a365248c965200d7957e2575e14e932a04f7e43f834c689bbdcb726c44c3
[12:58:57.024] Initialize: height=123019 view=1 index=3 role=Backup
@vncoelho
Copy link
Member Author

@ZhangTao1596, I am not sure this is just for dBFT 3.0. But also for dBFT 2.0.

@vncoelho
Copy link
Member Author

vncoelho commented Dec 21, 2022

In fact, it is even described in the current code. I think we wrote that some long time ago.

" // A possible attack can happen if the last node to commit is malicious and either sends change view after his
// commit to stall nodes in a higher view, or if he refuses to send recovery messages. In addition, if a node
// asking change views loses network or crashes and comes back when nodes are committed in more than one higher
// numbered view, it is possible for the node accepting recovery to commit in any of the higher views, thus
// potentially splitting nodes among views and stalling the network."

@vncoelho
Copy link
Member Author

It also looks like this logic is not correct anymore:

        public bool NotAcceptingPayloadsDueToViewChanging => ViewChanging && !MoreThanFNodesCommittedOrLost;
        public bool MoreThanFNodesCommittedOrLost => (CountCommitted + CountFailed) > F;

@shargon
Copy link
Member

shargon commented Dec 23, 2022

@vncoelho Could you add a test in order to ensure that v2 is affected?

@vncoelho
Copy link
Member Author

I could, but all test structure was removed when dbft was migrated to a Module, from neo core to here.

We need to add that previous infrastructure for akka testing here.

@vncoelho
Copy link
Member Author

neo-project/neo@3d64080

@vncoelho
Copy link
Member Author

We need to mock everything here as we used to do before, @shargon.

Do you have a good direction for starting it to work? I could them add all tests. (ping @erikzhang)

image

@ZhangTao1596
Copy link
Collaborator

I think this problem has always exists in master branch.
Primary node will create different PrepareReqeust after restart if it hasn't committed, So that it will refuse previous PrepareResponse in RecoveryMessage.

@vncoelho
Copy link
Member Author

vncoelho commented Jan 3, 2023

@ZhangTao1596, this point you mentioned is another problem related to the PrepareRequest, which should also be fixed in the future. Perhaps it is good to open another issue explaining that one in order that we do not forget this possibility.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants