CRUD Notifications #41

Open
CxRes opened this issue Dec 13, 2021 · 3 comments

Comments

@CxRes
Member

CxRes commented Dec 13, 2021

CRUD notifications would be the most basic use of the SOLID notification functionality. This represents not a use case, but a couple of families of use cases, as we shall see below.

SOLID, even before being a linked data store, is a data store. In this sense, it is a bit like an open standards version of Dropbox or Google Drive (plug in your favourite cloud storage). I can conceive of the following use cases:

  1. As a personal data store: where the user accesses their own data through a client. The most common case here is some equivalent of a file-explorer/finder for SOLID, like the one provided by every cloud storage provider. But I am also thinking in terms of concept stores and knowledge managers that provide a more nuanced/sophisticated view on the stored data and also allow users to selectively make their data public.
  2. As a user data store for a third-party application: where the application (which does more than just manage user data) needs to store and update information based on the user's interaction with it.

In both cases, especially with clients that are always connected and save data in the background (as opposed to saving explicitly when the user hits save), one expects a lot of changes to be made quickly, all of which need to be notified to every connected client (I am not even going into OT/CRDTs, which is a whole other can of worms that I have little knowledge about). Whilst the NSS implementation was clearly under-defined, I find that the choice of activity streams with its large JSON payload might be erring in the opposite direction. Not only will this be less performant at scale (i.e., with multiple changes being notified to multiple clients simultaneously), clients also have to be that much more sophisticated to handle the messages.

Since I am looking at SOLID through the lens of a data store for the purpose of this use case, it might be that my view is myopic; the reasonable expectation being that most non-trivial applications will inevitably need some linked data functionality and hence the payload needs to be more complex. But I still think there is a middle ground between complexity and performance that needs to be found.

@acoburn
Member

acoburn commented Dec 16, 2021

It is not entirely clear to me what is meant by "CRUD notifications" that is not already captured by the use of Activity Streams 2.0 in the current draft proposal. Create, Update and Delete are already defined operations there. Other event types can be used as implementation-specific extensions.
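
To make the mapping concrete, here is a minimal sketch of how the three write operations could be expressed as Activity Streams 2.0 activities. The `Create`, `Update` and `Delete` types and the `https://www.w3.org/ns/activitystreams` context come from the Activity Streams 2.0 vocabulary; the helper name `make_notification`, the example resource URL and the exact set of fields are assumptions for illustration and are not taken from the draft proposal.

```python
import uuid
from datetime import datetime, timezone

# Context IRI of the Activity Streams 2.0 vocabulary.
AS2_CONTEXT = "https://www.w3.org/ns/activitystreams"

# Activity types that correspond to the C, U and D of CRUD.
ACTIVITY_TYPES = {"Create", "Update", "Delete"}


def make_notification(activity_type: str, resource_url: str) -> dict:
    """Build a hypothetical notification for a change to `resource_url`."""
    if activity_type not in ACTIVITY_TYPES:
        raise ValueError(f"unsupported activity type: {activity_type}")
    return {
        "@context": AS2_CONTEXT,
        "id": f"urn:uuid:{uuid.uuid4()}",  # unique id for this message
        "type": activity_type,
        "object": resource_url,  # the resource that changed
        "published": datetime.now(timezone.utc).isoformat(),
    }


# Hypothetical resource URL, for illustration only.
notification = make_notification("Update", "https://pod.example/notes/todo.ttl")
```

A "Read" event is deliberately rejected in this sketch; it would fall under the implementation-specific extensions mentioned above.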

> I find that the choice of activity streams with its large JSON payload

I disagree with this assessment. While JSON is not as compact as a binary payload (e.g., Thrift or Protocol Buffers), the minimal JSON-LD structure described by the notifications protocol will generally fall in the 0.5 KB range. Messaging systems such as Kafka perform exceedingly well with messages in the 1 KB range. Related: network packets are usually in the 1.0-1.5 KB range. Ideally, a single message fits inside a single TCP segment, and that is exactly what this proposal anticipates, even considering any overhead that the messaging technology adds.
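
The size claim is easy to check for any given payload. The sketch below serializes a hypothetical notification and compares its byte length against 1500 bytes, a common Ethernet MTU. The sample message is an assumption (a real message from the draft protocol may carry more fields), and the comparison ignores IP/TCP and transport-protocol headers.

```python
import json

# Hypothetical minimal notification; field values are illustrative only.
sample = {
    "@context": "https://www.w3.org/ns/activitystreams",
    "id": "urn:uuid:3f1c2b1e-5d7a-4c1e-9a55-0d2f6f1a9b10",
    "type": "Update",
    "object": "https://pod.example/notes/todo.ttl",
    "published": "2021-12-16T12:00:00Z",
}

# Compact separators drop the optional whitespace after "," and ":".
encoded = json.dumps(sample, separators=(",", ":")).encode("utf-8")
size_bytes = len(encoded)

# 1500 bytes is a common Ethernet MTU; protocol headers are not counted here.
COMMON_MTU_BYTES = 1500
fits_in_one_packet = size_bytes <= COMMON_MTU_BYTES

print(f"{size_bytes} bytes, fits in one packet: {fits_in_one_packet}")
```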

> Not only will this be less performant at scale [...] clients have to be that much more sophisticated to handle the messages as well

Please refer to the rate description under Notification Features, where clients can indicate the rate at which notifications are delivered. For instance, one client may want notifications no more frequently than once a minute; another may want every notification as it arrives. Also, please provide evidence that demonstrates that this model is "less performant at scale" compared to some other model. Discussing scalability without data is not productive.
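
As an illustration of what a delivery rate could mean in practice, the sketch below forwards at most one notification per interval and keeps only the most recent pending one. This is an assumption about one possible behaviour, not the mechanism defined under Notification Features; the class name `RateLimitedFeed` and its methods are hypothetical, and the clock is passed in explicitly to keep the example deterministic.

```python
from typing import Callable, Optional


class RateLimitedFeed:
    """Deliver at most one notification per `min_interval` seconds.

    Notifications that arrive too early replace any pending one, so the
    subscriber eventually sees only the latest state.
    """

    def __init__(self, min_interval: float, deliver: Callable[[dict], None]):
        self.min_interval = min_interval
        self.deliver = deliver
        self._last_sent: Optional[float] = None
        self._pending: Optional[dict] = None

    def publish(self, notification: dict, now: float) -> None:
        # Remember the newest notification, then try to send it.
        self._pending = notification
        self.flush(now)

    def flush(self, now: float) -> None:
        # Send the pending notification if the interval has elapsed.
        if self._pending is None:
            return
        if self._last_sent is None or now - self._last_sent >= self.min_interval:
            self.deliver(self._pending)
            self._last_sent = now
            self._pending = None


received = []
feed = RateLimitedFeed(min_interval=60, deliver=received.append)
feed.publish({"n": 1}, now=0)   # delivered immediately
feed.publish({"n": 2}, now=10)  # held back, too soon
feed.publish({"n": 3}, now=20)  # replaces the pending notification
feed.flush(now=60)              # interval elapsed, latest one is delivered
```

Setting `min_interval` to 0 gives the "every notification as it arrives" behaviour.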

> the reasonable expectations being that most non-trivial applications might inevitably need some linked data functionality and hence the payload needs to be more complex. But, I still think there is a middle ground between complexity and performance that needs to be found.

Hence the use of JSON-LD. An RDF-aware client can use the semantics of JSON-LD if it wants, but a JSON-only client will also be able to consume messages, given that the format will be well-defined with a particular @context.
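
A JSON-only client along these lines needs nothing beyond a JSON parser. The sketch below assumes a hypothetical message shape in which `type` is a single string and `object` is either a URL string or an object with an `id`; the function name `handle_notification` and the returned action strings are illustrative only.

```python
import json


def handle_notification(raw: str) -> str:
    """Decide what to do with a notification, treating it as plain JSON."""
    message = json.loads(raw)
    activity_type = message.get("type")
    target = message.get("object")
    # The object may be embedded; fall back to its "id" in that case.
    if isinstance(target, dict):
        target = target.get("id")

    if activity_type == "Create":
        return f"fetch new resource {target}"
    if activity_type == "Update":
        return f"refresh cached copy of {target}"
    if activity_type == "Delete":
        return f"evict {target} from cache"
    # Unknown or extension types are ignored rather than treated as errors.
    return "ignore"


raw_message = json.dumps({
    "@context": "https://www.w3.org/ns/activitystreams",
    "type": "Delete",
    "object": {"id": "https://pod.example/notes/todo.ttl"},
})
action = handle_notification(raw_message)
```

The `@context` is never inspected here; an RDF-aware client would instead hand the same bytes to a JSON-LD processor.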

@CxRes
Member Author

CxRes commented Dec 16, 2021

In general, I am using CRUD as a catch-all for the two scenarios documented above. The documented use cases do not include these yet. I am not contesting that Activity Streams 2.0 captures these scenarios (though there are situations where there are ambiguities, as shown by various questions on Gitter). That a use case is already covered by the implementation is no reason not to record it, which is my foremost purpose in opening this issue.

Perhaps I should have split the use case and the discussions that arise from it into separate comments for the sake of clarity, which I am doing here.

As for "R", it is conceivable that, say, a server informs clients how many, or even which, clients are currently retrieving the same data.

@CxRes
Member Author

CxRes commented Dec 16, 2021

@acoburn To address the points you raise:

> I disagree with this assessment. While JSON is not as compact as a binary payload...

I am not making an assessment, but rather enquiring whether this is really the best model (while also noting the large jump from NSS). Hence my statement: "...activity streams with its large JSON payload might be erring in the opposite direction", instead of "is erring...".

> Also, please provide evidence that demonstrates that this model is "less performant at scale" compared to some other model. Discussing scalability without data is not productive.

I actually had two models in mind:

  1. NSS implementation: which as I said is clearly inadequate.
  2. HTTP style headers and response (which are also standard and maybe even more flexible, with the client negotiating the type of payload).

As for providing data, that seems like an onerous and unnecessary demand. I am coming at this from the perspective of a client developer or an end user, not a server implementer. However, moving from a few bytes (as in NSS) to kilobytes would have obvious performance implications.

> Please refer to the rate description under Notification Features

I have concerns about the rate feature and wanted to discuss this separately (perhaps we can move this part to Gitter). A long time back, I posted on Gitter in the Solid specifications channel that, while we cannot preserve REST in a notification system, we should try to remain as close to the REST model as possible, which was warmly received. Perhaps you might disagree with this assessment. I fear that rate is just the kind of feature that moves us away from that model.

> Hence the use of JSON-LD.

OK! That makes sense!!!
