Streams do not support atomic operations #73

Open
0xjjpa opened this issue Sep 11, 2021 · 8 comments
Labels
bug (Something isn't working), dontclose, help wanted (Extra attention is needed)

Comments


0xjjpa commented Sep 11, 2021

Describe the bug
When a stream receives constant read/write* operations with the same key, a race condition can occur when a write relies on the last read. In short, the node does not provide a way to lock the stream per read. As a result, the latest write to the stream is considered the latest commit, and it becomes the state that any Ceramic client sees when calling load. This can leave the stream with data that does not reflect every update. The problem becomes more likely as the stream grows in size (e.g. 25 KB).

* By read/write we refer to executing a load operation on our node, followed by an update operation in <100 ms

To Reproduce
Steps to reproduce the behavior:

  1. Create a stream as a JSON map, ideally about 25 KB in size.
  2. Create an async script that maps over [1...100] and, for each item, a) loads the stream, b) updates the content, and c) updates the stream.
  3. Observe that the stream's latest state is random on every run instead of sequential.
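
The steps above can be simulated without a Ceramic node. The following is a minimal sketch that assumes an in-memory stand-in for a stream: `FakeStream`, `addEntry`, `runConcurrent`, `runSequential`, and the 5 ms delays are all hypothetical and are not part of any Ceramic API. It only illustrates how overlapping load/update pairs overwrite each other.

```typescript
// In-memory stand-in for a stream. This is NOT the Ceramic API.
type Content = Record<string, number>;

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

class FakeStream {
  private content: Content = {};

  // Simulated read: returns a snapshot after a short delay.
  async load(): Promise<Content> {
    await sleep(5);
    return { ...this.content };
  }

  // Simulated write: whichever write lands last replaces the whole state.
  async update(next: Content): Promise<void> {
    await sleep(5);
    this.content = next;
  }

  size(): number {
    return Object.keys(this.content).length;
  }
}

// One read-modify-write cycle, mirroring steps a), b), and c).
async function addEntry(stream: FakeStream, i: number): Promise<void> {
  const snapshot = await stream.load();
  const mutated = { ...snapshot, [`k${i}`]: i };
  await stream.update(mutated);
}

// All cycles start at once, so most of them read a stale snapshot.
async function runConcurrent(n: number): Promise<number> {
  const stream = new FakeStream();
  await Promise.all(
    Array.from({ length: n }, (_, i) => addEntry(stream, i + 1))
  );
  return stream.size();
}

// Each cycle waits for the previous one, so no entry is lost.
async function runSequential(n: number): Promise<number> {
  const stream = new FakeStream();
  for (let i = 1; i <= n; i++) {
    await addEntry(stream, i);
  }
  return stream.size();
}
```

In this simulation `runSequential(n)` keeps all `n` entries, while `runConcurrent(n)` ends with fewer because later writes are built from snapshots that miss earlier writes.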

Expected behavior
Ideally, we could load a stream together with a lock identifier, so that the stream cannot be written by an update that does not carry the identifier from its read. E.g.

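        // Proposed API sketch: `lockId` and the second argument to `update` do not exist today.
        import { TileDocument } from "@ceramicnetwork/stream-tile";

        // `client` is an authenticated Ceramic client created elsewhere.
        const doc = await TileDocument.create(client, null, {
          deterministic: true,
          tags: ["hopr-dashboard"],
        });

        const mutatedDoc = Object.assign({}, doc.content, {
          "foo": "bar",
        });

        // The write is accepted only if it carries the lock obtained by the read.
        await doc.update(mutatedDoc, doc.lockId);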

This behavior should be the default for all Ceramic clients and invisible to users. If the lock identifier is not provided, the Ceramic node should return a 428 HTTP status code. To avoid infinite locking, streams should have a 60-second lock timeout, an option that could be configured per stream.
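
To make the proposal concrete, here is a minimal in-memory sketch of the lock semantics described above. Everything in it is hypothetical: `LockingStream`, its methods, and the injected clock are illustration only and do not exist in Ceramic. The issue only specifies 428 for a missing identifier, so returning 409 for a stale or unknown identifier is an assumption made here.

```typescript
// Hypothetical sketch of the proposed lock semantics. None of these names exist in Ceramic.
type Content = Record<string, unknown>;

interface Lock {
  id: string;
  expiresAt: number;
}

class LockingStream {
  private content: Content = {};
  private lock: Lock | null = null;
  private counter = 0;

  constructor(
    private readonly timeoutMs: number = 60_000, // 60-second default from the proposal
    private readonly now: () => number = () => Date.now()
  ) {}

  // Read the content and try to take the lock.
  // lockId is null when another unexpired lock is still held.
  load(): { content: Content; lockId: string | null } {
    const t = this.now();
    if (this.lock !== null && this.lock.expiresAt > t) {
      return { content: { ...this.content }, lockId: null };
    }
    this.counter += 1;
    this.lock = { id: `lock-${this.counter}`, expiresAt: t + this.timeoutMs };
    return { content: { ...this.content }, lockId: this.lock.id };
  }

  // Returns an HTTP-like status code:
  //   428 when no lock identifier is supplied (from the proposal),
  //   409 when the identifier is unknown or expired (assumption),
  //   200 when the write is applied and the lock is released.
  update(next: Content, lockId?: string): number {
    if (lockId === undefined) return 428;
    const t = this.now();
    if (this.lock === null || this.lock.id !== lockId || this.lock.expiresAt <= t) {
      return 409;
    }
    this.content = { ...next };
    this.lock = null;
    return 200;
  }
}
```

A caller whose `load` returns a null `lockId` would wait and retry. Note that this only models a single node and does not cover writes that originate on different nodes.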

Screenshots
Using Documint, you can see that the commits of k2t6wyfsu4pg1bz6houhqzlpljag79xcs6r04s1ihyapjhwzg8fl3e7fwrb1pg can have negative deltas in the content of the next state, where they should only have positive deltas (i.e. only additions to the content). In other words, the stream should only increase in size, not decrease.
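
The same check can be done programmatically once the successive states of a stream are available as plain objects. The sketch below assumes the states have already been fetched in commit order. `removedKeys` and `findNegativeDeltas` are hypothetical helpers and are not part of Documint or Ceramic.

```typescript
// A stream state as a plain JSON map (assumption: top-level keys are the records).
type State = Record<string, unknown>;

// Keys present in the previous state but missing from the next one.
function removedKeys(prev: State, next: State): string[] {
  return Object.keys(prev).filter((key) => !(key in next));
}

// Report every commit whose state dropped keys relative to its predecessor.
function findNegativeDeltas(
  states: State[]
): Array<{ index: number; removed: string[] }> {
  const findings: Array<{ index: number; removed: string[] }> = [];
  for (let i = 1; i < states.length; i++) {
    const removed = removedKeys(states[i - 1], states[i]);
    if (removed.length > 0) {
      findings.push({ index: i, removed });
    }
  }
  return findings;
}
```

For an append-only stream such as the one described here, any non-empty result indicates that a write was built on a stale read.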

Ceramic versions
We saw this issue in the Clay node during the days of Sept 3rd and 10th of 2021.

Machine, OS, browser information (please complete the following information):
The client is loaded with key-did-provider-ed25519 and writes from a serverless Vercel instance running Node.js 14. The code is executed here.

Additional context
We are using a specific stream (k2t6wyfsu4pg1bz6houhqzlpljag79xcs6r04s1ihyapjhwzg8fl3e7fwrb1pg) as a "database" for keeping records of all other streams in the network. This stream is written by this endpoint, triggered on the server side with a server-side key. As a result, when the conditions for the requests are met, this read-modify-write call can complete within <100 ms. The goal was to use this stream as an "indexer" of sorts for our Dashboard, to keep track of all the streams our users were creating (with the same key).

Without this ability, streams are limited to being written by a single user with a single key. That is good enough for some cases, but it is limiting for cases with multiple users and a single key managed by a server. Indexing is one of the hardest problems in the decentralized ecosystem, and it would be great for Ceramic nodes to provide a solution by ensuring that streams support atomic operations.

0xjjpa added the bug and help wanted labels Sep 11, 2021
Contributor

stbrody commented Sep 13, 2021

This is a known issue. Updates to Ceramic streams are based on read-modify-write behavior and have no special synchronization between the read and the write. That makes them ill-suited for concurrent updates. Most of the time this isn't an issue, because streams generally have a single controller and the single controller will not be making multiple conflicting updates to the same stream at the same time. For some use cases, however, like the one you describe, this can be a problem. Locking at the node level isn't really a general purpose solution as it doesn't solve the case of conflicting writes to the same stream that originate on different nodes. Our long term plan for this is to add streamtypes with CRDT-based update semantics, so that simultaneous writes can be automatically merged without overriding each other. In the meantime, applications should take care not to create multiple simultaneous updates to a single stream.
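
One way an application might follow this advice within a single process is to funnel every read-modify-write cycle through a queue, so that the next cycle starts only after the previous one finishes. This is a minimal sketch: `SerialQueue` and `demo` are hypothetical, and the `load`/`update` functions inside `demo` are in-memory stand-ins rather than Ceramic calls. It does not help when writes come from several processes, serverless instances, or nodes.

```typescript
// Runs async tasks one at a time, in the order they were submitted.
class SerialQueue {
  private tail: Promise<unknown> = Promise.resolve();

  run<T>(task: () => Promise<T>): Promise<T> {
    const result = this.tail.then(() => task());
    // Keep the chain alive even if a task fails.
    this.tail = result.catch(() => undefined);
    return result;
  }
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Demo with an in-memory stand-in for a stream (NOT the Ceramic API).
async function demo(n: number): Promise<number> {
  let content: Record<string, number> = {};

  const load = async (): Promise<Record<string, number>> => {
    await sleep(2);
    return { ...content };
  };
  const update = async (next: Record<string, number>): Promise<void> => {
    await sleep(2);
    content = next;
  };

  const queue = new SerialQueue();
  const cycles = Array.from({ length: n }, (_, i) =>
    // Each cycle is submitted immediately but executed after the previous one.
    queue.run(async () => {
      const snapshot = await load();
      await update({ ...snapshot, [`k${i + 1}`]: i + 1 });
    })
  );
  await Promise.all(cycles);
  return Object.keys(content).length;
}
```

Because each cycle reads only after the previous write has landed, `demo(n)` ends with all `n` entries even though the cycles are submitted at once.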

Author

0xjjpa commented Sep 14, 2021

@stbrody Thanks for replying! I briefly mentioned it on Discord, but it's worth sharing here: considering Ceramic is still quite new tech, it's normal for developers to try to use it as a one-size-fits-all solution and run into unexpected but known issues like this one. It's probably worth having a "Best practices" section for new devs picking up Ceramic.

Probably "bug" is not a fair denomination for this issue, but rather a specific feature currently not implemented and within the roadmap, so will remove that. Also, if you have an existing issue in GitHub or a similar project planning tool that we can link this or externals can search in for this in the future, I can chip in there rather than creating isolating noise.

0xjjpa changed the title from "BUG: Streams do not support atomic operations" to "Streams do not support atomic operations" Sep 14, 2021
Contributor

stbrody commented Sep 14, 2021

> Also, if you have an existing issue in GitHub or a similar project planning tool that we can link this or externals can search in for this in the future, I can chip in there rather than creating isolating noise.

This is the first time this issue has been captured in our GitHub, so it's fine to leave this open so others can see this ticket.

Contributor

stbrody commented Sep 14, 2021

FYI, here's a WIP PR for updates to the docs that explain this behavior: ceramicnetwork/docs#141

Author

0xjjpa commented Dec 20, 2021

Just a brief update here @stbrody, I had a quick chat with @oed today involving stream indexing. As Ceramic usage seems to keep growing, I imagine more users would try to bake their own indexing solutions, and be tempted to do the same as I did (i.e. using a separate stream to index other streams), and stumble upon this issue.

Might I suggest adding a "Known problems" section under "Advanced" to showcase these particular oddities that aren't really issues or bugs with the protocol, but a mismatch between the expectations of the tool and its goals?

Contributor

stbrody commented Dec 21, 2021

Created ceramicnetwork/docs#199 for this suggestion, thanks @jjperezaguinaga!


Gutyn commented Sep 22, 2022

@stbrody Thanks for Ceramic. I am testing Ceramic with one of the public nodes, https://ceramic-clay.3boxlabs.com, and it seems that update is not working: no matter what I do, the update is just ignored. I have a DID attached to Ceramic, and creating and loading documents works fine. Please help me figure out if I'm missing something. Thank you

Contributor

stbrody commented Sep 22, 2022

@Gutyn can you please open a post on http://forum.ceramic.network? Would be good to include the code for how you're creating and authenticating the client and how you're issuing the write, as well as the error message you are seeing.

4 participants