Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nhse d34 nhskv.i33 token #35

Open
wants to merge 37 commits into
base: nhse-develop-3.4
Choose a base branch
from

Conversation

martinsumner
Copy link

Add support for a token manager and token session process to act as a requestor for tokens, and warp riak_client calls within a session.

This is then used to be provide a stronger conditional PUT API to allow for check-and-set operations that will behave reliably when the cluster is healthy, as well as in simple failure scenarios and operational changes. The CAS will fall-back to eventual consistency in extreme failure scenarios.

Start a token manager on each node.  The token manager will on request register a session against a token, and remove the registration when the session terminates.  If another request is received for the same token, it will be queued until the previous session releases the token (or the queued session dies).

There is a session server that can be started to run a session (which request a token from the local token_manager on startup), and will allow for the session to be used by making calls to riak_client functions.  Each call will renew the session.  After token timeout, the session will terminate.
Allow, by configuration, for the conditional check to be strengthened to use the new token-based mechanism
Means that if a topology change occurs without node failure, sibling protection is still maintained.
Also catch erpc errors when initial connecting to node, and have back-off and retry.  This smooths handling of the case where a node is suddenly killed.
To make it simpler and clearer to remove session token
If the strong conditional check is at the API - becomes much harder to make the check identically and reliably in the HTTP API.

Webmachine has lots of potential exit points outside of our control - so it is hard to reliably release a session on exit.
Discussion of the reliability of process monitoring, including the monitoring of processes on remote nodes, led to a new design which simplifies the message passing required by relying instead on monitoring.

A request should now be refused if there exists an currently active grant, and the preflist has changed since that grant was activated.  In all other situations grants should be made (either immediately or after being queued awaiting a release).
With aim to improve clarity
The naming has been changed  following review to try and clarify meaning.

The downstream_check was previously sync - but two token_managers could attempt to check in each other at the same time (for different tokens) - resulting in deadlock.

These messages are now async not sync.
priv/riak_kv.schema Outdated Show resolved Hide resolved
The token_manager will no longer monitor remote upstream sessions.  Add a backup GC process, whereby if a request is blocked, there is a check of the existence in the remote manager of an association with the session.

The only thing that must be reliable, is therefore the monitoring of local sessions by managers.

Before using a local session, an association check is now made, that the local token manager has an active association for that session.
Very hard to test the GC process - hard to create a PID which doesn't trigger down on die.

There was a discrepancy in sloppy_quorum between PB/WM APIs for the conditional check.  This is resolved.
Once a HEAD has been converted to a GET, disqualify it from the find_bestobject calculation.

As GETs occur after HEADs in the GET_FSM, it should by default be the case when allow_mult=true that VCget >= VChead.

This is not the case with lww=true, wher eobjects may regress in VC terms.  However as lww=true, it is destined to prefer the GET which came after the HEAD.

The code as-is could cause a HEAD to continuously require re-fetching, and lead to requests timing out.  This is demonstrated in the riak_test associated with the conditional PUT token implementation
@martinsumner
Copy link
Author

Looking at the results of a full performance test, the overhead of performing basic_consensus for a proportion of PUTs is not large

Throughput Comparison - CAS

Base automatically changed from nhse-d34-otp26 to nhse-develop-3.4 September 10, 2024 11:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants