Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleanup old checkpoints from finalization state #1083

Merged
merged 3 commits into from
May 9, 2019

Conversation

frolosofsky
Copy link
Member

@frolosofsky frolosofsky commented May 8, 2019

This fix significantly reduces individual finalization state size.

In a test, I run 1 proposer with 4 finalizers with configuration epoch_length=2.
After processing ~9k blocks and ~4.5k epochs, finalization state size was 767599.
After applying this PR, it has been reduced to 196002. More finalizers we have,
more memory we save, because checkpoints map holds votes from every finalizer.

Relates to #1073.

@frolosofsky frolosofsky added this to the 0.2 milestone May 8, 2019
@frolosofsky frolosofsky requested review from kostyantyn, Gnappuraz and a team May 8, 2019 12:29
@frolosofsky frolosofsky self-assigned this May 8, 2019
@frolosofsky frolosofsky changed the title Clenup old checkpoints from finalization state Cleanup old checkpoints from finalization state May 8, 2019
@scravy
Copy link
Member

scravy commented May 8, 2019

The lint step failed in a weird way:

Screenshot 2019-05-08 14 31 50

I have a feeling this has something to do with the scripted diff merged in #1072 as the messages refers to a52d88e

@cornelius Can you have a look?

@scravy
Copy link
Member

scravy commented May 8, 2019

Concept ACK 0bc5e57

@scravy scravy requested a review from cornelius May 8, 2019 12:36
Copy link
Member

@kostyantyn kostyantyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ConceptACK 0bc5e57

I see it as a great quick win! This checkpoint design grows n^2, and with the low effort, we did it linearly. I have in mind to completely delete checkpoints as I don't see a need for them if we can look back for the previous finalization state, but that requires a significantly larger change/refactoring, and after this PR, it won't give us a drastic improvement so it can wait.

@Gnappuraz
Copy link
Member

Gnappuraz commented May 8, 2019

I'm looking at finalizationstate.cpp:924
GetCheckpoint(vote.m_target_epoch).m_vote_set.insert(vote.m_validator_address);
and checking if actually we always pass by the ValidateVote function, if not, is an easy way to crash the node with an old vote.

As a bare minimum I think we should change anyway the log messages in finalizationstate.cpp:322 "%s: target_epoch=%d is in the future.\n", __func__, and following since is not necessarily an epoch in the future anymore.

@cornelius
Copy link
Member

The weird lint failure should be fixed by #1084 now.

@cornelius cornelius removed their request for review May 8, 2019 15:59
@Ruteri
Copy link
Member

Ruteri commented May 9, 2019

Tested 0bc5e57 and while it does improve overall memory consumption, it does not resolve the syncing issue #1073.

@frolosofsky
Copy link
Member Author

I'm looking at finalizationstate.cpp:924
GetCheckpoint(vote.m_target_epoch).m_vote_set.insert(vote.m_validator_address);
and check if actually we always pass by the ValidateVote function, if not, is an easy way to crash the node with an old vote.

As a bare minimum I think we should change anyway the log messages in finalizationstate.cpp:322 "%s: target_epoch=%d is in the future.\n", __func__, and following since is not necessarily an epoch in the future anymore.

Fixed/checked in 8fe4a24.

This fix significantly reduces individual finalization state size.

In a test, I run 1 proposer with 4 finalizers with configuration epoch_length=2.
After processing ~9k blocks and ~4.5k epochs, finalization state size was 767599.
After applying this PR, it has been reduced to 196002. More finalizers we have,
more memory we save, because checkpoints map holds votes from every finalizer.

Signed-off-by: Stanislav Frolov <[email protected]>
Signed-off-by: Stanislav Frolov <[email protected]>
Copy link
Member

@Gnappuraz Gnappuraz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

except my last comment, utACK d751b58

Copy link
Member

@Ruteri Ruteri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ConceptACK d751b58
Tested resource usage - significantly lowers memory usage.

Signed-off-by: Stanislav Frolov <[email protected]>
@frolosofsky frolosofsky merged commit 44cd02a into dtr-org:master May 9, 2019
@frolosofsky frolosofsky deleted the fix-checkpoints branch May 9, 2019 13:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants