Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Union-based merge of PCD conflicts #1342

Closed
artwyman opened this issue Dec 7, 2023 · 0 comments · Fixed by #1360
Closed

Union-based merge of PCD conflicts #1342

artwyman opened this issue Dec 7, 2023 · 0 comments · Fixed by #1360
Assignees
Labels

Comments

@artwyman
Copy link
Member

artwyman commented Dec 7, 2023

This was split out of #1271.

Despite out best efforts to minimize them, our E2EE syncing system is still subject to conflicting changes, which can result in lost PCDs. Generalized perfect syncing is hard. We have a multi-phase plan in Notion which would take months and still not be perfect. However, it is reasonable to assume that losing PCDs is the worst form of bad behavior on a conflict, and we can build a simpler merge algorithm which avoids that in common cases.

Something akin to a set union (keeping PCDs from both versions) will have this property by optimizing for maintaining "add new PCD" changes, but have less ideal results for other operations. This is explicitly a 2-way merge algorithm (local version and new remote version) with less power and complexity than a 3-way merge algorithm (which would require the old remote base version).
Some thoughts on the types of operations we support, and the outcome:

  • Add PCD: always preserved by union merge
  • Remove PCD: might reappear again in case of conflict
  • Replace PCD: we can't guarantee a correct outcome here. We can pick one, or refuse to merge. This mostly happens for ticket reissuance.
  • Change folder order: we can't guarantee a deterministic outcome here, but it should be "close enough"
  • Move/rename: Not currently supported

Part 1: Merge algorithm This should be separate from passport-client in order to be testable. It should also be reusable for PCD import, which has similar merging needs. It needs to make a best-effort to preserve all PCDs, while detecting unmergable situations (e.g. 2 versions of the same PCD by ID), and avoiding illegal states (e.g. 2 different semaphore identities).

Part 2: Sync integration The revision arguments to upload/download allow us to detect when conflicts happen and respond to them by performing a merge. Implementing this fully requires:

  • Add revision argument to upload to detect conflicts. If detected, trigger a download.
  • Use revision argument to download (already present) to detect conflicts. On conflict and trigger merge.
  • Decide how to respond to merge failures and whether to report to user, or give them a choice.
@artwyman artwyman self-assigned this Dec 7, 2023
github-merge-queue bot pushed a commit that referenced this issue Dec 13, 2023
This is part of addressing #1342. This is the app-framework part of
adding merge to E2EE sync. The merge algorithm itself is stubbed-out for
separate implementation. There are two major cases covered here:
background sync and password change. The behavior is summarized below.
This conflict-resolution model is based on a pattern I used at Dropbox,
so I hope it's clear to people who haven't seen it before.

### Background sync behavior

- Pre-existing logic lets the server revision to be used to detect
whether this client has seen the latest state on the server.
- Pre-existing logic lets the storage hash be used to detect whether
this client's state has changed from the most recent server state.
- New upload behavior: Send the previous known revision to the server,
which will abort rather than clobbering a new revision.
- Client responds by triggering an extra download in order to merge,
after which another upload will be attempted.
- New download behavior: When a new revision is downloaded, recalculate
the hash of data from AppState, and compare it to the hash of the
**previous** server revision to see if this client has made any changes.
  - If so, trigger a merge and keep the merged state.
- The download can trigger a merge on its own if it's the first point
that a conflict is discovered. The case of upload aborting and
triggering download is simply the common case in a quiescent system,
because uploads are attempted eagerly after changes.

### Password change behavior

- Password change (changeBlobKey) behaves like an upload, because it
uploads the latest data using the new encryption key.
- The known revision is added to the changeBlobKey request to detect
conflicts and abort, same as an upload.
  - This requires some new react hooks to fetch this info from AppState.
- When an abort happens, the user sees a message asking them to wait for
sync to complete.
  - An extra download is triggered to help this happen.
- Making that possible required adding `useSyncE2EEStorage()` to some
more
- Also added found #1351 in the process of testing, and added tweak to
the behavior of background tab behavior to trigger an extra download and
merge.

### Testing

I tested the background sync and password change cases manually against
my dev server. I don't think there's a way to write automated tests for
them.

I didn't test the non-standard password change cases (where a password
is set for an existing account) because I don't know how to set up those
cases. I'm hopeful that the relevant logic is all identical and will
just work. Maybe our QA can set up those cases to be sure.

### Notes

If this PR is pushed without a real merge algorithm, it ends up always
preferring the remote copy, such that whichever change hits the server
first wins. That's quite close to the current behavior (given the eager
nature of uploads), though not identical because the current behavior
doesn't ensure a consistent decision.
github-merge-queue bot pushed a commit that referenced this issue Dec 18, 2023
[Updated after force push] Adds merge functionality to sync, filling in
the stub added in a previous diff. Major pieces included:

- The E2EESync implementation uses the underlying merge of PCDs and
subscriptions, and adds some stats accounting I hope to be useful for
logging.
- There are TODOs for some pieces I think might want to be better,
including explaining the goals and assumptions of the merge algorithm.
- The FeedsSubscritpionManager merge is implemented and tested as per
the previous revision of this PR.
- The app-level changes around subscriptions motivated by my wanting to
make sure it's okay to replace the FeedsSubscriptionManager with a new
one at merge time. Different parts of the app have different assumptions
about this. I've done my best to make it consistent.

This deserves more testing, but I did some manual testing creating
conflicts with both PCDs and subscriptions, and I saw the expected
results in both data and stats. Those expectations are met both in the
positive (no objects are ever lost) and the negative (removed objects
come back if involved in a merge.

Fixes #641
Closes #1342
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant