The problem is that not everyone uses RethinkDB for reactivity, & even those who do are more or less limited to single-table subscriptions. This makes sense, since it can get pretty expensive to simulate a join & subscribe to it. Apollo offers something that's in the experimental phase, but it's, uh, not robust. Here are the blueprints for how to make something that can match (or exceed) RethinkDB performance while allowing for cross-table subs.
Problem:
A client calls the getTop5Posts(userId: 'user123') query, which returns documents of type Post with ids A, B, C, D, E.
A second client calls the upvote(postId: 'F') mutation. How do we know who to send this update to? Some folks care about just that document. Other folks don't care about that document yet, but supposing that the upvote gives F more votes than E, we should replace E with F. Doing this naively results in n db hits, where n is the number of channels that include at least 1 post.
Solution:
We have a topic lookup table keyed by query. Each query entry is keyed by mutation, and each mutation entry holds a factory function for what I call "bump functions".
At the end of the resolve method, before returning the array of docs, that socketId does a few things:
See if the channel getTop5Posts/user123 exists. If not, create the bump function for it: topicLookupTable[query][mutation](5, 'F'). Store this bump function on the channel getTop5Posts/user123.
Subscribe to getTop5Posts/user123.
The magic of the bump function is that it contains really inexpensive logic (in this case, mutatedDoc.votes > minVotes). Without it, we'd have to re-run each original database query to determine if F replaced E. This is critical because every time upvote gets called, we're gonna have to run through every channel with the getTop5Posts topic. A single Float64 comparison should be cheap enough that JS will work at scale. SocketCluster already contains a message bus, but to attach a bump function to each channel, we'll need a key/value store like Redis to hold that channel's bumpFnVars.
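The lookup table and bump functions described above could be sketched roughly like this. Everything here is an assumption for illustration: the shape of bumpFnVars (ids plus minVotes), the helper names registerChannel and channelsToNotify, and the in-memory Map standing in for Redis are not from any existing library.

```javascript
// Hypothetical lookup table: query name -> mutation name -> bump function factory.
const topicLookupTable = {
  getTop5Posts: {
    // bumpFnVars is plain data, so it could be serialized into a key/value store.
    // The bump function fires if the doc is already on the channel or beats the minimum.
    upvote: (bumpFnVars) => (mutatedDoc) =>
      bumpFnVars.ids.includes(mutatedDoc.id) ||
      mutatedDoc.votes > bumpFnVars.minVotes,
  },
};

// In-memory stand-in for the per-channel store (e.g. Redis) of bumpFnVars.
const channelVars = new Map();

// Called at the end of the query's resolve method, before returning docs.
function registerChannel(query, channel, docs) {
  if (!channelVars.has(channel)) {
    channelVars.set(channel, {
      query,
      ids: docs.map((d) => d.id),
      minVotes: Math.min(...docs.map((d) => d.votes)),
    });
  }
}

// Called after a mutation: returns the channels that should receive mutatedDoc.
// No database hit is needed, only one cheap check per channel on this topic.
function channelsToNotify(query, mutation, mutatedDoc) {
  const factory = topicLookupTable[query][mutation];
  const hits = [];
  for (const [channel, vars] of channelVars) {
    if (vars.query === query && factory(vars)(mutatedDoc)) hits.push(channel);
  }
  return hits;
}
```

In a real setup the stored bumpFnVars would also need refreshing after a bump (F replacing E changes both ids and minVotes); that step is left out of the sketch.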
For the next example, let's try a form of CmRDT. Say we have hell world and we want to correct it. We send: updateContent(changes: {id: 'A', pos: 4, val: 'o'}) to make it hello world. Since it's a CmRDT, we'll never have the full state, rather just a transform. That means our mutation will have to adjust the db with just this info. Then, we forward the operational transform on to the client & trust that the client knows how to apply it. Since the updateContent mutation can never change the docs that are returned by getTop5Posts, our bumpFn is easy:
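A minimal sketch of what such a bump function might look like, assuming the channel's bumpFnVars holds the ids currently on the channel. The names updateContentBumpFactory and applyChange are hypothetical, and applyChange only illustrates the insert-at-position change from the example, not a full CmRDT.

```javascript
// Hypothetical factory for the getTop5Posts/updateContent pair.
// The set of docs can't change, so the only question is whether the
// changed doc is already on this channel. If so, forward the change as-is.
const updateContentBumpFactory = (bumpFnVars) => (changes) =>
  bumpFnVars.ids.includes(changes.id) ? changes : undefined;

// Client-side application of the forwarded change: insert val at pos.
function applyChange(content, { pos, val }) {
  return content.slice(0, pos) + val + content.slice(pos);
}
```

For the example above, applyChange('hell world', { pos: 4, val: 'o' }) gives 'hello world'.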
For super fine-grained performance tweaking, we could consider establishing a discrete channel just for that field: content/content123, but that would be very application-specific & could result in a net performance loss.
A fringe benefit of all of this is that we don't necessarily need a websocket between the client and the server. For example, I can take the return values of the bump functions and store them away in a key/value store under the JWT. Then, when the client long-polls for updates, I just send the array of changes. That means in 1 network request, they get a whole bunch of fresh new info without having to request it from each individual query.
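The long-polling idea could look something like the sketch below. It assumes the raw JWT string is used as the key and uses an in-memory Map in place of a real key/value store; queueChange and drainChanges are hypothetical names.

```javascript
// In-memory stand-in for a key/value store; keys are assumed to be the client's JWT.
const pendingChanges = new Map();

// Called with each bump function return value destined for this client.
function queueChange(jwt, change) {
  const queue = pendingChanges.get(jwt) ?? [];
  queue.push(change);
  pendingChanges.set(jwt, queue);
}

// Called when the client long-polls: returns everything queued and clears the queue.
function drainChanges(jwt) {
  const queue = pendingChanges.get(jwt) ?? [];
  pendingChanges.delete(jwt);
  return queue;
}
```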
Additional thought:
Suppose each query can take 2 additional args:
ids: The list of IDs that we currently have on the client
lastUpdatedAt: The max of all updatedAt in the list of IDs
With these 2 things, we can greatly reduce the network payload. For example, I subscribe to team members. Then I unsubscribe, then I subscribe again like: teamMembers(teamId: 'team123', ids: ['A', 'B', 'C'], lastUpdatedAt: Yesterday)
Now, I run the query. When it resolves from the DB, I get something like this:
First, we intersect the result with the ids. On the left side (in the result, but not on the client), we have D. On the right side (on the client, but no longer in the result), we have C. In the intersection, we have A and B. Within that intersection, we see that A hasn't been updated for a week, so we exclude it. B has been updated since we last saw it, so we need to include it. So, we return a result like:
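A rough sketch of that diffing step. The response shape (added, updated, removedIds), the function name diffResult, and the use of numeric timestamps for updatedAt are all assumptions for illustration.

```javascript
// dbDocs: the fresh query result; ids: the ids the client already holds;
// lastUpdatedAt: the max updatedAt the client has seen (a numeric timestamp).
function diffResult(dbDocs, ids, lastUpdatedAt) {
  const known = new Set(ids);
  const current = new Set(dbDocs.map((d) => d.id));
  return {
    // Left side: in the result but not yet on the client (D).
    added: dbDocs.filter((d) => !known.has(d.id)),
    // Intersection, keeping only docs changed since the client last saw them (B, not A).
    updated: dbDocs.filter(
      (d) => known.has(d.id) && d.updatedAt > lastUpdatedAt
    ),
    // Right side: on the client but no longer in the result (C).
    removedIds: ids.filter((id) => !current.has(id)),
  };
}
```

With this shape, the full documents for A and C never travel over the wire: A is unchanged, and C only needs its id sent so the client can drop it.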