Feature: Syncing & conflict resolution for peer-to-peer and offline-first applications #8
Huge fan of having another player enter the offline-first database space, since every current one has pretty big caveats:
- Realm: https://github.com/realm/realm-js
- PouchDB/CouchDB: https://github.com/pouchdb/pouchdb
- RxDB: https://github.com/pubkey/rxdb
- Gun: https://github.com/amark/gun
Thank you @bkniffler for your first comment from outside our small core development team! Very nice summary, thank you for the care you've put in :) I'll study the resources you've noted down very soon. So far, I've been reading up on what's out there, and also looking at the data model of Memex (which'll be the first consumer of the lib) and the types of modifications it needs. Here's my thought process so far:
Also, the idea is that even though this lib will easily integrate with Storex, I want to design it in such a way that it's not directly coupled to Storex, so applications using other database abstractions can benefit from it as well. If you're interested in offering your thoughts once in a while, or shooting at my ideas with a flamethrower, it would be much appreciated :) (Either in this thread, or via Slack/e-mail.) Currently I'm thinking about whether I'm missing any big factors in designing this thing, and how to solve the whole causality puzzle of enabling last-write-wins independent of the device you were on when you made the edit.
From @bluesun on Slack:
I'll get back to you once I have more ideas. Thank you Samir for your input! And let's keep the conversation going here; I'll notify you through direct channels when urgent questions come up.
Reading the article, it reminds me a lot of http://archagon.net/blog/2018/03/24/data-laced-with-history/. Because the Memex data model is quite simple so far, I think we can take some shortcuts to make the implementation much simpler. The only complication I'm still struggling with is the ordering of operations and accounting for clock inaccuracies.
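For the ordering question, one commonly used building block (not something decided on in this thread) is a logical clock that doesn't trust wall-clock time. Below is a minimal Lamport-style sketch in TypeScript; all names are invented for the example.

```typescript
// Minimal Lamport-style logical clock: orders operations across devices
// without relying on accurate wall-clock time. All names are illustrative.
interface StampedOperation {
    deviceId: string
    logicalTime: number // Lamport counter
    wallTime: number    // kept only as a tie-breaking / debugging hint
    operation: string
}

class LamportClock {
    private counter = 0

    // Called before recording a local operation
    tick(): number {
        return ++this.counter
    }

    // Called when receiving an operation from another device
    receive(remoteLogicalTime: number): number {
        this.counter = Math.max(this.counter, remoteLogicalTime) + 1
        return this.counter
    }
}

// Total order: logical time first, device ID as a deterministic tie-breaker
function compareOperations(a: StampedOperation, b: StampedOperation): number {
    return a.logicalTime - b.logicalTime || a.deviceId.localeCompare(b.deviceId)
}
```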
Hi Vincent,
thanks a lot for the link. I am currently in low-energy mode and will have to wait before reading and digesting that article.
Actually, operational transform IS giving a very clever answer to the clock issues. AFAIK the time ordering is NOT relevant to the main algorithm. Timestamps are used only:
— for the ordering on a single device OR
— INSIDE a specific conflict resolution between two actions on different devices. Here one has to use a heuristic.
Actually, it only makes sense to go that route if the robustness of the offline-capable sync is key to your success. My guess today is that it is better to use OT with crude heuristics from the beginning, because you can improve the conflict heuristics over the years. But this is only a guess.
Bye
Samir
After thinking about this some more, I've decided on an architecture to plan out. Since our data model is quite simple, and we're not yet dealing with unstructured data transforming in unexpected ways (like strings), OT and CRDTs would be overkill for the core architecture. Instead, I've decided that at its core, it should automatically sync operations that do not generate conflicts, while allowing you to decide what to do with more complex conflicts, which it should be able to foresee automatically. This is done by declaring the ways you're going to interact with the data in advance, so they can be statically analyzed:
https://github.com/WorldBrain/storex-pattern-modules/blob/master/ts/index.test.ts
With the help of those, and uniqueness constraints, we can detect potential conflicts through static analysis. One case where it'll detect a potential unique constraint violation is when having ordered children with parentId + position being unique, such as in our feature allowing you to organize pages into ordered lists. Here an OT-like algorithm could be a good fit: instead of just capturing a series of lower-level operations that modify the position field, we could define higher-level operations such as insertBefore, moveBefore, etc. (but named better) that we could do some OT magic with.
Other conflicts that'll arise, and which should be automatically detected, are:
- Modifying a field on two different devices (here you could specify a 'last in time should win' strategy)
- Modifying a field on a deleted object (in which case you could specify that the modification should be discarded)
- Adding a child to a deleted object (like an item to a list; you'd probably discard the change)
(A rough sketch of what such declarations could look like follows below.)
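Not part of the actual design yet, but to make the above concrete, here's a hypothetical TypeScript sketch of higher-level ordered-list operations and per-conflict strategies; every name in it is invented for illustration and none comes from Storex itself.

```typescript
// Hypothetical higher-level ordered-list operations an OT-like step could
// reason about, instead of raw writes to a `position` field.
type ListOperation =
    | { type: 'insertBefore'; listId: string; pageId: string; beforePageId: string | null }
    | { type: 'moveBefore'; listId: string; pageId: string; beforePageId: string | null }
    | { type: 'removeEntry'; listId: string; pageId: string }

// Hypothetical declaration of per-conflict resolution strategies
interface ConflictStrategies {
    concurrentFieldWrite: 'last-write-wins' | 'manual'
    writeToDeletedObject: 'discard' | 'recreate' | 'manual'
    childOfDeletedObject: 'discard' | 'manual'
}

const defaultStrategies: ConflictStrategies = {
    concurrentFieldWrite: 'last-write-wins',
    writeToDeletedObject: 'discard',
    childOfDeletedObject: 'discard',
}
```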
For creation operations not to generate conflicts, there needs to be another strategy for generating IDs than auto-incremented integers. This is to prevent the following scenario:
- Both devices have a widget collection containing one object with ID 1
- Device A creates a new object, with auto-incremented ID 2
- Device B creates a new object, also with auto-incremented ID 2
- ID 2 now means different things on different devices
Here I was thinking about the following strategies (sketched below):
- Assign a unique device ID to every device and override the default ID generation to produce IDs like <deviceId>-<autoIncrementedInteger> (lots of moving parts, like device ID communication and propagation)
- Use UUIDs (which might be too space-inefficient, but are always unique and simple to implement)
- Assign each device a unique range of the 64-bit number space (thinking about it: too many problems, and probably not a lot of space won in most real-world scenarios)
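To illustrate the first two strategies above (purely as a sketch; the function names are not from Storex):

```typescript
// Sketch of the first two ID strategies; names invented for illustration.
import { randomUUID } from 'crypto'

// Strategy 1: <deviceId>-<autoIncrementedInteger>
function createDeviceScopedIdGenerator(deviceId: string) {
    let counter = 0
    return () => `${deviceId}-${++counter}`
}

// Strategy 2: plain UUIDs (simple, always unique, but larger to store and index)
function generateUuidId(): string {
    return randomUUID()
}

const nextId = createDeviceScopedIdGenerator('device-a')
console.log(nextId()) // "device-a-1"
console.log(nextId()) // "device-a-2"
```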
The changes will be synchronized through a shared database. This database will hold change records that store the time at which each change occurred and when it was uploaded. Also, every device will record its last sync time. On every sync, changes below the minimal sync time shared between all devices will be archived to serve as a backup (by one of the devices) and removed from the main sync log.
When a client syncs the log with the shared database, it'll locally process the newly arrived changes. For each newly arrived change, it'll look up relevant changes in its own log (writes to the same field, to fields matching unique constraints, etc.) and replay the changes after conflict resolution. (A rough sketch of this log and sync pass follows after the list below.)
This shared database will initially be implemented on Firestore (#10), which also allows live sync. However, it'll be implemented in such a way that it can easily be ported to a self-hosted server. This means that:
- We'll have to be able to define that 'something needs to happen once an object is added to a collection', decoupled from how that happens. On Firestore we'd deploy a Firebase Function. On a self-hosted server, we might need to intercept all write operations and execute the code on each write. On a cloud infrastructure, we might need to publish a write through a Redis Pub/Sub channel to signal a pool of workers to execute the code.
- Access rights need to be declared in a database-agnostic way, translating to Firestore rules on Firestore, but enforced directly on the API server in a cloud-provider setting.
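A rough sketch of what the shared sync log and a single sync pass could look like, based on the description above; the record shape and function names are assumptions made for this example, not the real schema.

```typescript
// Assumed shape of a sync-log record and the shared log it lives in.
interface SyncLogEntry {
    deviceId: string
    createdOn: number   // when the change happened on the device
    sharedOn: number    // when it was uploaded to the shared database
    collection: string
    operation: 'create' | 'update' | 'delete'
    pk: string
    data?: { [field: string]: any }
}

interface SharedSyncLog {
    writeEntries(entries: SyncLogEntry[]): Promise<void>
    getEntriesSince(time: number): Promise<SyncLogEntry[]>
    archiveEntriesBefore(time: number): Promise<SyncLogEntry[]>
}

// Skeleton of a single sync pass for one device
async function syncOnce(options: {
    sharedLog: SharedSyncLog
    localEntries: SyncLogEntry[]
    lastSyncTime: number
    minimalSharedSyncTime: number // lowest last sync time across all devices
}): Promise<SyncLogEntry[]> {
    await options.sharedLog.writeEntries(options.localEntries)
    const incoming = await options.sharedLog.getEntriesSince(options.lastSyncTime)
    // ...detect conflicts against the local log and replay after resolution...
    await options.sharedLog.archiveEntriesBefore(options.minimalSharedSyncTime)
    return incoming
}
```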
I've already added support for middleware to the Storex core (yet to be published) that allows you to intercept all operations you do on the database. The unified access control (#6) is still TBD.
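Since the Storex middleware API mentioned above isn't published yet, the following is only a guess at its general shape, with hypothetical type names: a middleware that sees every storage operation and could, for instance, record writes for a sync log.

```typescript
// Guessed shape of a storage middleware; not the actual Storex API.
type StorageOperation = any[]

interface StorageMiddlewareContext {
    operation: StorageOperation
    next: { process(context: { operation: StorageOperation }): Promise<any> }
}

interface StorageMiddleware {
    process(context: StorageMiddlewareContext): Promise<any>
}

// Example middleware: record every write operation before passing it on
class SyncLoggingMiddleware implements StorageMiddleware {
    public loggedOperations: StorageOperation[] = []

    async process({ operation, next }: StorageMiddlewareContext) {
        const [operationName] = operation
        if (['createObject', 'updateObjects', 'deleteObjects'].includes(operationName)) {
            this.loggedOperations.push(operation)
        }
        return next.process({ operation })
    }
}
```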
Work started here:
This description is pretty good!
For the OT/CRDT part I will need a few days to think about it.
On the issue of IDs here are some remarks:
— Depending on the way you store your information, you could give UUIDs only to live components that participate in the communication (e.g. a process). The live components can then use incremental integers as secondary IDs. Please do not use a device or browser ID; narrow it as much as possible to avoid conflicts (a user might work on two tabs at the same time, meaning he could enter data manually in one tab while an automated routine of WorldBrain runs in another tab).
— Delay the sending of information with a meaningful threshold, to aggregate operations before sending them and lower the number of requests. (A rough sketch of such batching follows after this message.)
So far for now
Samir
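To make the batching suggestion concrete: a minimal sketch of buffering outgoing sync entries and flushing them in one request once a size or time threshold is hit. The class and option names are made up for this example.

```typescript
// Buffer outgoing entries; flush when the batch is big enough or a short
// delay has passed, so many operations become one request.
class OperationBatcher<T> {
    private buffer: T[] = []
    private timer: ReturnType<typeof setTimeout> | null = null

    constructor(
        private send: (batch: T[]) => Promise<void>,
        private options = { maxBatchSize: 50, maxDelayMs: 2000 },
    ) {}

    enqueue(entry: T) {
        this.buffer.push(entry)
        if (this.buffer.length >= this.options.maxBatchSize) {
            void this.flush()
        } else if (!this.timer) {
            this.timer = setTimeout(() => void this.flush(), this.options.maxDelayMs)
        }
    }

    async flush() {
        if (this.timer) {
            clearTimeout(this.timer)
            this.timer = null
        }
        if (!this.buffer.length) {
            return
        }
        const batch = this.buffer
        this.buffer = []
        await this.send(batch)
    }
}
```

The thresholds trade latency against request count and would need tuning per transport.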
Good ideas! It would be pretty easy to implement a middleware for a write lock (I have to think about the possibility of deadlocks, but none come to mind quickly).
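As an illustration of that write-lock idea (not an actual implementation from this repo), a promise-based mutex could serialize writes, for example inside such a middleware; all names are illustrative.

```typescript
// A simple promise-based mutex that serializes write operations so a sync
// pass and local writes can't interleave.
class WriteLock {
    private tail: Promise<void> = Promise.resolve()

    // Runs `task` once all previously scheduled tasks have finished.
    async withLock<T>(task: () => Promise<T>): Promise<T> {
        const run = this.tail.then(task)
        // Keep the chain alive even if `task` rejects, so later writers still run.
        this.tail = run.then(() => undefined, () => undefined)
        return run
    }
}

// Usage: wrap each write so only one executes at a time
const writeLock = new WriteLock()
async function guardedWrite(executeWrite: () => Promise<void>) {
    return writeLock.withLock(executeWrite)
}
```

One deadlock to watch for: a task that itself awaits guardedWrite while already holding the lock would wait on itself forever.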
For situations where we want to host data on multiple sources, we need to be able to sync database changes around. This involves:
Point 3) will be the most difficult and must be handled on a per-case basis. Sometimes we can let the last change win, sometimes the user needs to intervene, and sometimes we can come up with fitting application-specific algorithms. This functionality should be modular enough to allow for all these cases. Additionally, it should allow for different mechanisms of storing and broadcasting the changes.
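One way the modular conflict handling described above could be expressed: a resolver interface with built-in strategies (last-write-wins, ask the user) plus room for application-specific ones. All names here are hypothetical, not part of the repo.

```typescript
// Hypothetical pluggable conflict resolution.
interface Change {
    collection: string
    pk: string
    field: string
    value: any
    timestamp: number
}

interface Conflict {
    local: Change
    remote: Change
}

type Resolution =
    | { type: 'pick'; change: Change }
    | { type: 'custom'; changes: Change[] }
    | { type: 'ask-user'; conflict: Conflict }

interface ConflictResolver {
    resolve(conflict: Conflict): Resolution
}

const lastWriteWins: ConflictResolver = {
    resolve: ({ local, remote }) => ({
        type: 'pick',
        change: remote.timestamp >= local.timestamp ? remote : local,
    }),
}

const askUser: ConflictResolver = {
    resolve: conflict => ({ type: 'ask-user', conflict }),
}

// Applications could register resolvers per collection or per field
const resolvers: { [collection: string]: ConflictResolver } = {
    pages: lastWriteWins,
    annotations: askUser,
}
```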