-
Notifications
You must be signed in to change notification settings - Fork 71
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make sync work across VV upgrades #1913
Conversation
Code Coverage Results:
|
The test
This appears unrelated to what I've changed; it doesn't look like version vectors were involved. |
yes, that test is flaky. It seems to happen when the replicator goes to Idle before all revisions get synced. |
"when a doc is upgraded in memory to a VectorRecord" -- I noticed that method "upgradeVersioning" is deleted. When actually a doc is upgraded in memory? |
Reviewing the code changes from looking at the diff here is hard for me to understand all of upgrading and handling both types of revisions. I'm still working on it. As of now, I have a few questions in general:
Do I understand here that if the doc didn't get upgraded to vv, the vv record will be generated when read (but will not be saved)?
Do I understand correctly that when pulling a new revision that only contains the rev tree, the doc will be upgraded to the version vector (with the old rev tree kept) at that point including the new doc? |
Any time it's opened by the current version of LiteCore; the DocumentFactory is now VectorDocumentFactory. |
Yes.
The doc gets saved as a VectorRecord in the new format, but no version vector is generated since the new revision has a 'legacy' revid. |
The transition from tree-based revids to version vectors is tricky for replication, and the prior approach (adding a fake 'legacy' version when creating the vector) didn't turn out to work. Instead, when a doc is upgraded in memory to a VectorRecord it keeps its tree-based/legacy revids. When the doc is saved it creates a version vector for the local Revision but also remembers the prior legacy revid. Remote Revisions keep their tree-based IDs until VVs are pulled from or pushed to those remotes. Revision histories may now include one or more legacy revids at the end. These are what allow the recipient of a revision (an active or passive puller) to match up an incoming version vector with a local legacy revid. There's still a question of how long we need to keep the legacy revid of the document. It only needs to be used once per remote, but in a P2P scenario you don't know if you might have to sync with some new peer that hasn't upgraded that doc yet. In any case the revid isn't very big (~24 bytes) so it may not be worth removing.
Some constants added earlier for VV upgrading and dealing with legacy revids, which have been superseded by newer techniques, were nuked: - kDocGetUpgraded / kUpgrade - kLegacyRevSourceID - VectorRecord::_versioning
c4doc_getRevisionHistory requires doc be loaded with kDocGetAll. VectorDoc correctly threw an exception on this. Also, all this is doing is checking the current rev, so no need to get the history.
94b5082
to
6b624ca
Compare
Please review the last commit, which is a change I had to make to get a unit test pass. I think the code in the test was wrong. |
CBL-5616: Update SG version in docker-compose.yml (#2121) CBL-6239: Fixed c4doc_getRevisionBody in a 2.x db (#2136) 92fc5cc Make sync work across VV upgrades (#1913) abe9759 Merge in tag 3.2.0 8009302 Statically link to libstdc++ (#2134) CBL-6169: DataFile::documentKeys() is not thread-safe (#2132) CBL-6189: Annotation of thread-safety, filling in the missing ones after the first pass. (#2129) 471e5ec Allow a Timer to destruct itself during its callback (#2124) b090363 Fixed C4CollectionSpec hash and equality (#2122) cba409e LogEncoder cleanup (#2130) cf53ccd Rewritten query translator (#1966) 9f75fab Updated Fleece now that its PR is merged d6c2091 Update Android NDK even more 47cc9ad Update Jenkins configuration to support C++20 1652943 C++20: use std::ranges, contains() methods, std::size() f27f8dd Build with C++20
The transition from tree-based revids to version vectors is tricky for replication, and the prior approach (adding a fake 'legacy' version when creating the vector) didn't turn out to work.
Instead, now:
As a result, revision histories now consist of a VV and/or legacy revids. The version vector, if present, always comes first. The legacy IDs at the end are what allow the recipient of a revision (an active or passive puller) to match up an incoming version vector with a local legacy revid.
(It's also possible for a rev history sent by the replicator to consist only of legacy IDs, even after the upgrade. This happens if a doc was updated locally before the app upgrade but not pushed until after. It can also happen on a pull for the same reason.)
There's an open question of how long we need to keep the legacy revid of the document. It only needs to be used once per remote, but in a P2P scenario you never know if you might have to sync with some new peer that hasn't upgraded that doc yet. In any case the revid isn't very big (~24 bytes) and it only occurs in pre-existing docs, so it may not be worth removing.