Out of order call to state_machine::create_snapshot() when manually triggering a snapshot #478
Comments
Hi @adotsch, Snapshot creation is done by a separate background commit thread according to the state machine's current commit index, so the snapshot index can lag behind the last commit index of the log store. The state machine is always catching up to Raft's commit index. Nonetheless, the current behavior is obviously a bug. If there is a more recent snapshot that was created manually, automatic creation should skip any log index smaller than that. I will provide a fix for it soon.
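The skip rule described in the comment above can be sketched as a monotonic guard on the snapshot index. This is only a minimal illustration under stated assumptions, not NuRaft's actual fix: `SnapshotGate`, `try_begin`, and `demo_manual_then_auto` are hypothetical names that do not exist in the library.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical helper, not part of NuRaft: remembers the index of the
// newest snapshot and rejects any request at or below that index.
class SnapshotGate {
public:
    // Returns true if a snapshot at `candidate_idx` should be created and
    // records it as the newest one. Returns false for stale requests.
    bool try_begin(std::uint64_t candidate_idx) {
        std::lock_guard<std::mutex> guard(lock_);
        if (candidate_idx <= last_snapshot_idx_) {
            return false; // a snapshot at this index or newer already exists
        }
        last_snapshot_idx_ = candidate_idx;
        return true;
    }

    std::uint64_t last_snapshot_idx() const {
        std::lock_guard<std::mutex> guard(lock_);
        return last_snapshot_idx_;
    }

private:
    mutable std::mutex lock_;
    std::uint64_t last_snapshot_idx_ = 0;
};

// Usage: a manual snapshot at index 100 turns a later automatic
// attempt at index 90 into a no-op.
inline void demo_manual_then_auto() {
    SnapshotGate gate;
    assert(gate.try_begin(100));  // manual snapshot is accepted
    assert(!gate.try_begin(90));  // lagging automatic snapshot is skipped
}
```

Both the manual and the automatic path would have to go through the same guard for the ordering to hold, which is why the check and the update happen under one lock.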
Correct me if I am wrong, but I think you just added an extra check to ignore manual create_snapshot() calls when inconvenient, i.e. create_snapshot() will work when we are lucky, but won't be reliable.
@adotsch If what you want is to schedule snapshot creation at the next earliest available log index, then I'd rather add a new API, for instance
Please let me know if that works for your case.
Thanks for your response!
@adotsch if I understood your problem correctly, it should be similar to what I noticed:
I think a valid fix is to take @greensky00 WDYT?
@antonio2368 But I think it will be better to add an option to The option can define the behavior, such as whether to acquire the lock and what to do if the lock is already acquired. The default is the same as the current behavior (best effort).
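One way to picture such an option is a policy value that decides what happens when a snapshot is already in progress. The following is a rough sketch only: the thread does not spell out the real option, so `SnapshotLockPolicy`, `SnapshotScheduler`, and their members are hypothetical names and not NuRaft's API.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical policy names, assumed for illustration.
enum class SnapshotLockPolicy {
    BestEffort, // skip the request if another snapshot is in progress
    Wait        // block until the in-progress snapshot has finished
};

// Hypothetical scheduler that serializes snapshot creation.
// Not exception-safe: if `do_snapshot` throws, the flag stays set.
class SnapshotScheduler {
public:
    // Runs `do_snapshot` according to `policy`.
    // Returns true if the snapshot ran, false if it was skipped.
    template <typename Fn>
    bool create_snapshot(SnapshotLockPolicy policy, Fn&& do_snapshot) {
        {
            std::unique_lock<std::mutex> guard(lock_);
            if (in_progress_) {
                if (policy == SnapshotLockPolicy::BestEffort) {
                    return false; // current default behavior: give up
                }
                cv_.wait(guard, [this] { return !in_progress_; });
            }
            in_progress_ = true;
        }
        do_snapshot(); // runs without holding the mutex
        {
            std::lock_guard<std::mutex> guard(lock_);
            in_progress_ = false;
        }
        cv_.notify_one();
        return true;
    }

private:
    std::mutex lock_;
    std::condition_variable cv_;
    bool in_progress_ = false;
};

// Usage: a best-effort request issued while another snapshot is running
// is skipped instead of queued.
inline void demo_best_effort_skip() {
    SnapshotScheduler sched;
    bool inner_ran = true;
    sched.create_snapshot(SnapshotLockPolicy::BestEffort, [&] {
        inner_ran = sched.create_snapshot(SnapshotLockPolicy::BestEffort, [] {});
    });
    assert(!inner_ran);
}
```

Keeping best effort as the default preserves existing behavior, while callers that need a guaranteed snapshot could opt into waiting.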
Hi,
I only create manual snapshots in my application, by calling raft_server::create_snapshot().
I have seen state_machine::create_snapshot() called with a snapshot index that is smaller(!) than the last committed index, i.e. when the state machine was already ahead of the point it should have created a snapshot for.
I guess this is a bug.
I have my workarounds but it would be nice to have this fixed so that I can clean up my code.
Regards,
Andras