Collaboration v2.0 #169

Open
jayaddison opened this issue Sep 14, 2020 · 1 comment
Labels
enhancement New feature or request

Comments

@jayaddison
Member

jayaddison commented Sep 14, 2020

Use case: A group of people want to collaborate on meal planning and ingredient shopping, and they may have intermittently unreliable data connections.

RecipeRadar should provide a software architecture that allows any person in the group to use the application and update the meal plan and shopping list, including when they are offline. Frequent version upgrades of the application software are possible and can happen at any time.

The RecipeRadar application supports database schema versioning via DexieJS.

Example use case: while one person is offline and meal planning, others may be online and updating the shopping list. The next time they are online together, the set of changes that the offline user has made should be merged into the group's session. It's possible that a software upgrade occurred while the user was offline, so the offline user may have been using 'v1' while some/all of the rest of the group was using 'v2'.

Collaborative editing is reasonably well understood and researched for plain text documents. It is increasingly possible to support more advanced datatypes, but those datatypes require careful planning, precision and developer testing, and domain-specific implementation problems can lead to loss of data integrity and cross-version compatibility. Using structured datatypes could therefore pose challenges for iterative software development with many participants.

Based on the state of the art above, for collaboration 2.0 in RecipeRadar, the underlying document structure will be a simple text document.

Adding a meal or adding a shopping list entry will translate into editing a text-based list of items, similar to (or perhaps a subset of) Markdown.

Different entity types -- which may match up with key database tables -- will be divided into section headings. For example, there may be a section heading for Meals, which could contain URLs for recipes and the calendar times at which they are planned.

An individual entity will typically be represented as a single line of text, and may have nested properties.
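As a rough illustration of the layout described above, the sketch below shows a hypothetical shared document and a helper that groups its lines by section heading. The `# ` heading syntax, the section names, the example URL and the `splitSections` helper are all assumptions made for illustration; the proposal does not fix a concrete format, and nested properties are ignored here.

```typescript
// Hypothetical shared document: one heading per entity type, one line per entity.
const exampleDocument = [
  "# Meals",
  "- 2020-09-15 18:00 https://example.org/recipes/tomato-soup",
  "",
  "# Shopping list",
  "- [ ] tomato",
  "- [x] onion",
].join("\n");

// Group non-empty lines under the most recent heading.
// Lines that appear before any heading are skipped in this sketch.
function splitSections(text: string): { [heading: string]: string[] } {
  const sections: { [heading: string]: string[] } = {};
  let currentLines: string[] | null = null;
  for (const rawLine of text.split("\n")) {
    const line = rawLine.trim();
    if (line.length === 0) {
      continue;
    }
    if (line.indexOf("# ") === 0) {
      currentLines = [];
      sections[line.slice(2).trim()] = currentLines;
    } else if (currentLines !== null) {
      currentLines.push(line);
    }
  }
  return sections;
}
```

Calling `splitSections(exampleDocument)` yields one array of lines per heading, which is the unit a per-entity-type parser would then consume.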

For each entity type, the application's code will contain timestamp-versioned parsers and generators.

A parser is responsible for transforming a line of the shared document's text into a set of local database updates -- a simple example might involve a one-to-one mapping between a shopping list item and a single database record in a shopping-basket table.
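A minimal sketch of such a parser, assuming shopping list entries are written as Markdown-style task lines. The `BasketRecord` shape and the `parseBasketLine` name are hypothetical and stand in for whatever the real shopping-basket table uses.

```typescript
// Hypothetical record shape for one row of a shopping-basket table.
interface BasketRecord {
  product: string;
  purchased: boolean;
}

// Parse a single task line such as "- [ ] tomato" into one record.
// Returns null for lines that this parser version does not understand.
function parseBasketLine(line: string): BasketRecord | null {
  const match = /^- \[([ xX])\] (.+)$/.exec(line.trim());
  if (match === null) {
    return null;
  }
  return { product: match[2].trim(), purchased: match[1] !== " " };
}
```

Returning `null` for unrecognised lines, rather than throwing, is one way to let an older parser tolerate text written by a newer client.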

A generator is responsible for applying text edits to a document based on a database record that has been updated. For example, when a user takes a photograph of a tomato at a local market, the application might search for a Markdown line containing the text `- [ ] tomato` and update it to `- [x] tomato`. Ideally the generator should emit a single-character diff in this case.
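The tomato example could be sketched roughly as follows. The `TextEdit` shape, `generatePurchaseEdit` and `applyEdit` are hypothetical names; a real implementation would presumably hand the edit to whichever collaborative-editing library is chosen rather than splice strings directly.

```typescript
// A minimal edit: replace `length` characters at `offset` with `text`.
interface TextEdit {
  offset: number;
  length: number;
  text: string;
}

// Find the task line for `product` and emit an edit that touches only the
// checkbox character. Returns null if no line matches or nothing changes.
function generatePurchaseEdit(doc: string, product: string, purchased: boolean): TextEdit | null {
  let offset = 0;
  for (const line of doc.split("\n")) {
    const match = /^- \[([ x])\] (.+)$/.exec(line);
    if (match !== null && match[2] === product) {
      const mark = purchased ? "x" : " ";
      if (match[1] === mark) {
        return null; // document already reflects the database state
      }
      // "- [" is three characters long, so the checkbox sits at index 3.
      return { offset: offset + 3, length: 1, text: mark };
    }
    offset += line.length + 1; // +1 for the newline separator
  }
  return null;
}

// Apply an edit to a plain string (stand-in for the shared document).
function applyEdit(doc: string, edit: TextEdit): string {
  return doc.slice(0, edit.offset) + edit.text + doc.slice(edit.offset + edit.length);
}
```

Because the edit replaces exactly one character, concurrent edits to other parts of the same line (for example, a peer renaming the product) have a better chance of merging cleanly.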

When the application is notified about an update to the document from a peer, it will parse the lines of text that have been added, modified and removed, and it will use a parser to update the local database state to reflect the new understanding of those sections of the document.
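One naive way to picture this step, assuming the application can see the document text before and after the peer's update. The `Basket` object is an in-memory stand-in for the local database, and `diffLines` and `applyPeerUpdate` are hypothetical helpers; the set-based diff treats a modified line as one removal plus one addition and does not handle duplicate lines.

```typescript
interface LineChanges {
  added: string[];
  removed: string[];
}

// Naive set-based line diff between two versions of the document.
function diffLines(before: string, after: string): LineChanges {
  const oldLines = before.split("\n");
  const newLines = after.split("\n");
  return {
    added: newLines.filter((l) => oldLines.indexOf(l) === -1),
    removed: oldLines.filter((l) => newLines.indexOf(l) === -1),
  };
}

// In-memory stand-in for a shopping-basket table: product -> purchased.
type Basket = { [product: string]: boolean };

// Update local state so that it reflects the peer's version of the document.
function applyPeerUpdate(basket: Basket, before: string, after: string): void {
  const changes = diffLines(before, after);
  const pattern = /^- \[([ x])\] (.+)$/;
  // Process removals first so that a modified line ends up re-added.
  for (const line of changes.removed) {
    const m = pattern.exec(line);
    if (m !== null) {
      delete basket[m[2]];
    }
  }
  for (const line of changes.added) {
    const m = pattern.exec(line);
    if (m !== null) {
      basket[m[2]] = m[1] === "x";
    }
  }
}
```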

Parsers should typically use the latest-available-locally version since the user would appreciate the best understanding of the state of the collaborative work.

Generators should typically use the oldest-mutually-group-available version since compatibility is a core tenet of collaboration, and a group that fails to co-operate effectively will tend to lose participation.
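The two selection rules might look like the sketch below. It assumes versions are comparable integers (date stamps such as `20200914`) and interprets "oldest-mutually-group-available" as the newest local generator that is no newer than the oldest application version in the group; both the representation and that interpretation are assumptions, as are the function names.

```typescript
// Parsers: use the newest version available locally.
function chooseParserVersion(localParsers: number[]): number | null {
  if (localParsers.length === 0) {
    return null;
  }
  return localParsers.reduce((a, b) => Math.max(a, b));
}

// Generators: use the newest local version that every client in the group
// should be able to read, i.e. one not newer than the oldest app version.
function chooseGeneratorVersion(localGenerators: number[], groupAppVersions: number[]): number | null {
  const oldestInGroup = groupAppVersions.reduce((a, b) => Math.min(a, b), Infinity);
  const usable = localGenerators.filter((v) => v <= oldestInGroup);
  if (usable.length === 0) {
    return null; // no generator is old enough for the whole group
  }
  return usable.reduce((a, b) => Math.max(a, b));
}
```

For example, with local generators `[20200101, 20200601, 20200914]` and a group running application versions `[20200914, 20200701]`, the generator chosen would be `20200601`.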

Parsing and generation operations may occur frequently during interactive editing sessions, and may require performance and resource monitoring and optimization.

To support compatibility considerations, a special metadata heading section in the document should include per-client application version details. This will allow each client to determine whether anyone has a newer version of the application (translation: it's time to check for an update) and whether anyone in the group is using older application versions (translation: it is worth being cautious about the types of edits made to the document to ensure that they are compatible).
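A small sketch of how a client might read that metadata section and derive the two signals. The line format (`- <client-id> version=<number>`), the `ClientInfo` shape and the function names are hypothetical; the proposal only says that per-client version details should be present.

```typescript
interface ClientInfo {
  clientId: string;
  appVersion: number;
}

// Parse a hypothetical metadata line such as "- alice-phone version=20200914".
function parseClientLine(line: string): ClientInfo | null {
  const m = /^- (\S+) version=(\d+)$/.exec(line.trim());
  if (m === null) {
    return null;
  }
  return { clientId: m[1], appVersion: parseInt(m[2], 10) };
}

interface CompatibilityStatus {
  shouldCheckForUpdate: boolean; // someone in the group is newer than us
  shouldEditCautiously: boolean; // someone in the group is older than us
}

function compatibilityStatus(localVersion: number, peers: ClientInfo[]): CompatibilityStatus {
  return {
    shouldCheckForUpdate: peers.some((p) => p.appVersion > localVersion),
    shouldEditCautiously: peers.some((p) => p.appVersion < localVersion),
  };
}
```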

Each application should regularly write a high-water-mark timestamp to its version details in the document so that long-disconnected clients can be identified and potentially removed if the group agrees that they are unlikely to return.
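Identifying long-disconnected clients could be as simple as the following sketch, assuming high-water-marks are stored as milliseconds since the epoch. The `ClientMark` shape, `findStaleClients` and the age threshold are hypothetical, and the decision to actually remove a client is left to the group.

```typescript
interface ClientMark {
  clientId: string;
  highWaterMark: number; // milliseconds since the epoch (assumed)
}

// List clients whose last high-water-mark is older than `maxAgeMs`.
function findStaleClients(marks: ClientMark[], now: number, maxAgeMs: number): string[] {
  return marks
    .filter((m) => now - m.highWaterMark > maxAgeMs)
    .map((m) => m.clientId);
}
```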

When an application discovers that a more recent version of itself is available, it should download the latest available version. It should then perform the following steps:

  • Identify the maximum high-water-mark from the list of other people in the session that it was locally aware of before receiving the new update
  • Identify all of the changes that have been made by the user locally between that timestamp and the current timestamp
  • Determine whether an attempt to merge is sensible -- if the number of conflicts is overwhelming, then it may be worth discarding all but the high-priority changes
  • Assuming a merge is sensible, upgrade the local database schema
  • Apply edits from the rest of the group to the local database
  • Use the most up-to-date generator to produce edits to the document
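The first three steps above can be sketched as a planning function; the schema upgrade, applying group edits and running the generator are left out. The `LocalChange` and `MergePlan` shapes, the `planMerge` name, and the idea of passing in a precomputed conflict count with a threshold are assumptions for illustration.

```typescript
interface LocalChange {
  timestamp: number;
  highPriority: boolean;
}

interface MergePlan {
  since: number; // maximum high-water-mark known before the update
  toMerge: LocalChange[];
  discarded: LocalChange[];
}

function planMerge(
  peerHighWaterMarks: number[],
  localChanges: LocalChange[],
  now: number,
  conflictCount: number,
  maxConflicts: number
): MergePlan {
  // Step 1: the latest point at which the group was known to be in sync.
  const since = peerHighWaterMarks.reduce((a, b) => Math.max(a, b), 0);
  // Step 2: local changes made between that point and the current time.
  const candidates = localChanges.filter((c) => c.timestamp > since && c.timestamp <= now);
  // Step 3: with too many conflicts, keep only the high-priority changes.
  if (conflictCount <= maxConflicts) {
    return { since: since, toMerge: candidates, discarded: [] };
  }
  return {
    since: since,
    toMerge: candidates.filter((c) => c.highPriority),
    discarded: candidates.filter((c) => !c.highPriority),
  };
}
```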


@jayaddison jayaddison added the enhancement New feature or request label Sep 14, 2020
@jayaddison
Member Author

I'm having second thoughts about whether CRDT-based editing over a single line-based text document is the correct way forward here.

Perhaps one text document per database table is better, somewhat akin to CSV files?

How reliable is CRDT editing of CSV content?
