Feature Request: Stronger Writing Guarantee on ReplicaSet #98
Comments
I'd say we can make this a configurable env var, since it depends on the DB architecture and how many nodes you have. Good point @Magnitus-, thanks for the heads up.
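For illustration, a minimal sketch of what the env-var approach could look like (the variable name `MONGO_WRITE_CONCERN`, the fallback URI, and the `connectDb` helper are assumptions for this sketch, not existing code in the repo):

```ts
import mongoose from 'mongoose';

// Assumed env var name; accepts either 'majority' or a numeric node count, e.g. MONGO_WRITE_CONCERN=2.
const raw = process.env.MONGO_WRITE_CONCERN ?? 'majority';
const w: number | 'majority' = /^\d+$/.test(raw) ? Number(raw) : 'majority';

export async function connectDb(): Promise<void> {
  await mongoose.connect(process.env.MONGO_URL ?? 'mongodb://localhost:27017/app', {
    // Passed through to the MongoDB driver; applies to writes unless overridden per model or operation.
    writeConcern: { w },
  });
}
```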
I believe it would still work with a single node: I vaguely recall using majority writes in local development with one node in the past (1/1 is still a majority, after all). Where it would fail is if you made a hard assumption about the number of nodes and set a write concern of 2, for example. The main case I can think of against it is if you have >50ms latency in your replica set from multi-data-center replication with heavy traffic (though frankly, if your replication cannot keep up with your write throughput, you'll be in a lot of trouble when your primary crashes). Either way, it makes sense to make it configurable so that people can adapt to unforeseen use cases.
Yeah, my concern was having arbiters, which may influence the majority vote (not sure how they behave, to be frank). If they vote, we may want full consensus, and someone may want just eventual consistency by keeping writes on the primary only, so a configuration var seems to give enough flexibility. Thanks for the valuable clarifications, I appreciate it.
Not sure if this is the right forum to discuss this, but there are a couple of different concepts at play here.

Unless things have changed a lot since a couple of years back, an arbiter is strictly for voting (in the case of a network partition, it decides who should be the primary). It does not participate in any data operation (read, write, etc.), so I do not think it would be involved at all in the write concern.

A majority write concern doesn't by itself guarantee strong read consistency. All a majority write concern does is give the client a guarantee that when the write call returns, a majority of writeable members will have acknowledged the write (and if that is not possible, an error will be returned). Whether what you read is strongly consistent depends on your read concern (I see now that it got more complex over time): https://docs.mongodb.com/manual/reference/read-concern/

For example, assume the following scenarios where a writer has a majority write concern and a separate reader reads right after the primary got the write, but before the secondaries did (and then, to make it really fun, assume the primary crashes before the write is propagated to the replicas, so the write will be lost and the writer will receive a failure notice):
If you want strong consistency (both in your acknowledged writes and the values you read), you need a write concern of majority and a read concern of majority. MongoDB gives you A LOT of granular control over the kind of consistency you want (with the resulting performance trade-offs). Honestly, I think many of its detractors in that department fail to appreciate how much you can tweak its behavior to do what you want.
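As a rough sketch of that combination with the Node.js driver (the URI, database, and collection names below are placeholders, not anything from this project):

```ts
import { MongoClient } from 'mongodb';

// Placeholder connection string for a three-member replica set.
const client = new MongoClient('mongodb://host1,host2,host3/?replicaSet=rs0');

async function demo(): Promise<void> {
  await client.connect();
  const coll = client.db('app').collection('docs', {
    // Writes are acknowledged only once a majority of data-bearing members have them...
    writeConcern: { w: 'majority' },
    // ...and reads only return data a majority has acknowledged, so an acknowledged
    // write cannot later be rolled back out from under a reader.
    readConcern: { level: 'majority' },
  });

  await coll.insertOne({ kind: 'example', at: new Date() });
  console.log(await coll.findOne({ kind: 'example' }));
  await client.close();
}

demo().catch(console.error);
```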
Shouldn't the replication log avoid this case? Even if a crash happens, eventually any non-replicated changes will be propagated when replication resumes where it stopped. I think this is a general concept in all leader/follower DBs, not only a Mongo replica set?
There is an oplog on the primary (technically, all data nodes have an oplog, but the primary's is the one of interest here) which the secondaries use to sync up (basically, as long as a secondary is not so far behind that the operations it is missing have fallen out of the primary's oplog, syncing up is not too expensive). However, the primary can accept a write (and put it in its oplog), but then crash before the write is propagated to the secondaries.

Unless things have changed since I took my certification a couple of years back, what happens then is that when the former primary comes back up and rejoins the cluster (as a secondary, if another node was elected in the interim), any operations in its oplog that are not in the new primary's oplog are "rolled back". They are still there in the background and retrievable, though that requires manual intervention at that point (otherwise, the cluster will just ignore them). Much simpler than the above is to always acknowledge with a majority write concern, so that your acknowledged writes never end up stuck in the rollback of some former primary that you have to manually restore.
The write guarantee on a replica set is not as robust as it could be. If the primary crashes before writes are propagated, the client will get a success acknowledgement, but the write won't be there.
Detailed Description
I recommend changing the write concern to "majority". For this particular case, the gain in guarantee greatly outweighs the slight loss in latency.
See: https://mongoosejs.com/docs/guide.html#writeConcern
Possible Implementation
src/models/Dictionary.ts:
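A minimal sketch of what the change could look like, following the mongoose guide linked above (the schema fields shown are placeholders, not the project's actual Dictionary schema; only the `writeConcern` option is the point here):

```ts
// src/models/Dictionary.ts (sketch)
import mongoose, { Schema } from 'mongoose';

const DictionarySchema = new Schema(
  {
    name: { type: String, required: true },
  },
  {
    // Per the mongoose guide: every write issued through this model is acknowledged
    // only after a majority of replica set members have it (j: true additionally
    // waits for the on-disk journal; wtimeout bounds how long the client waits).
    writeConcern: { w: 'majority', j: true, wtimeout: 5000 },
  },
);

export default mongoose.model('Dictionary', DictionarySchema);
```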