Feature/atomic chaining #159

tassm · 2022-11-11T06:25:10Z

📝 Atomic chaining of limiters!

Hi all, I have had a go at implementing atomic chaining of limiters and am keen for feedback or suggestions on how it could be improved.

Added ATOMIC_REQUESTS as a UnionBehaviour as suggested by @thrawn01
Added a flag on the RateLimitReq proto to avoid changing all the function signatures (not sure this was the best idea)
Modified the algorithms to support checking limits without modifying the actual limit values
Added an integration test and another missing test

💎 Closes #45 💎

Baliedge

Thanks for this contribution! Looking forward to your feedback on the review.

Baliedge · 2022-11-11T14:42:07Z

algorithms.go

@@ -251,13 +265,22 @@ func tokenBucketNewItem(ctx context.Context, s Store, c Cache, r *RateLimitReq)
 		ResetTime: expire,
 	}

+	// if we are doing a check during atomic chaining then remaining number is actually the limit


I would like to see a separation of concerns around checking a bucket vs. updating the bucket, because there are now two places where the bucket's counters are updated as a shared responsibility. Can we have the update logic pulled out and called by gubernator's behavior logic and not by tokenBucket/leakyBucket?

Yeah, that's a good point. I will move the logic to persist hits into a new function.

Update: I've decoupled the update logic from the algorithms into a new function. It is invoked from peer_client.go after the "check" performed by the algorithm, in the case where the request is not part of an atomic chain. Hopefully this is the kind of separation you were thinking of. I have also extracted the cache and store loading logic into a common function.
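To make the separation discussed here concrete, the following is a minimal sketch and not the code from this PR. The `bucket` type and its `check` and `apply` methods are hypothetical names, and expiry and refill are left out on purpose. It only shows a read-only check that leaves the counter alone and a separate step that persists the hits.

```go
package main

import "fmt"

// bucket is a hypothetical, simplified token bucket. It keeps only the
// fields needed to show the check/apply split (no expiry, no refill).
type bucket struct {
	limit     int64
	remaining int64
}

// check reports whether hits would fit in the bucket.
// It never modifies the bucket.
func (b *bucket) check(hits int64) bool {
	return hits <= b.remaining
}

// apply persists hits against the bucket.
// The caller is expected to have called check first.
func (b *bucket) apply(hits int64) {
	b.remaining -= hits
}

func main() {
	b := &bucket{limit: 10, remaining: 3}

	// A failed check leaves the counter untouched.
	fmt.Println(b.check(5), b.remaining) // false 3

	// The caller decides when to persist.
	if b.check(2) {
		b.apply(2)
	}
	fmt.Println(b.remaining) // 1
}
```

With this shape the caller, rather than the algorithm, owns the decision to update the counters, which is the single responsibility asked for in the review.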

Baliedge · 2022-11-11T14:44:41Z

gubernator.go

@@ -219,16 +219,43 @@ func (s *V1Instance) GetRateLimits(ctx context.Context, r *GetRateLimitsReq) (re
 		Responses: make([]*RateLimitResp, len(r.Requests)),
 	}

+	// get the UnionBehaviour
+	var atomicRequests = HasUnionBehavior(r.GetBehavior(), UnionBehavior_ATOMIC_REQUESTS)


nit: Combine these two lines?

Baliedge · 2022-11-11T14:45:55Z

proto/gubernator.proto

@@ -127,9 +129,21 @@ enum Behavior {
  // least 2 instances of Gubernator.
  MULTI_REGION = 16;

+


nit: Excess whitespace

Baliedge · 2022-11-11T14:46:00Z

proto/gubernator.proto

@@ -47,6 +47,8 @@ service V1 {
 // Must specify at least one Request
 message GetRateLimitsReq {
  repeated RateLimitReq requests = 1;
+  UnionBehavior Behavior = 2;
+


nit: Excess whitespace

thrawn01

I am SO SORRY this review took so long for me to look at. I didn't even realize this PR was here until I looked at the repo this morning.

This is a great change, but in order for it to actually be atomic we will need to make a few more changes. See my other comment in V1Instance.GetRateLimits()

Again, I apologize for the tardy review.

I'm thinking of something like this in V1Instance.GetRateLimits()

if HasUnionBehavior(r.GetUnionBehavior(), UnionBehavior_FORWARDED_UNION) {
	// TODO: Perform an atomic check of all the rate limits here
}

// perform a check first if the request is for union behaviour
if HasUnionBehavior(r.GetUnionBehavior(), UnionBehavior_ATOMIC_REQUESTS) {
	// TODO: Collect all the unique keys for each ratelimit and combine them to create our union unique key
	// this will be used to identify which instance in the cluster will handle the atomic check
	key := "combination of all keys"
	// TODO: Make `GetUnionPeer()` which will retry if a peer is leaving the cluster at the time we request a lookup.
	peer, err := s.GetUnionPeer(ctx, key)
	if err != nil {
		// TODO: Handle error
	}

	// TODO: Mark the forwarded union as forwarded, so we don't perform the same check on the remote instance
	r.UnionBehavior = UnionBehavior_FORWARDED_UNION

	// TODO: Forward and wait for a response
	return &resp, nil
}

Thoughts?
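One way the "combination of all keys" step could look is sketched below. This is an illustration under assumptions, not Gubernator code: `unionKey` and `pickPeer` are hypothetical helpers, the `|` separator assumes keys never contain that character, and plain modulo hashing stands in for whatever peer picker the cluster already uses. The point is only that the key must be deterministic, so every instance forwards the same union to the same owner.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
	"strings"
)

// unionKey builds one deterministic key from the individual rate limit keys.
// Duplicates are dropped and the keys are sorted, so the order of the
// requests does not change the result.
func unionKey(keys []string) string {
	seen := make(map[string]struct{}, len(keys))
	uniq := make([]string, 0, len(keys))
	for _, k := range keys {
		if _, ok := seen[k]; !ok {
			seen[k] = struct{}{}
			uniq = append(uniq, k)
		}
	}
	sort.Strings(uniq)
	return strings.Join(uniq, "|")
}

// pickPeer maps a key onto one of the peers. Modulo hashing is used only
// for illustration. peers must not be empty.
func pickPeer(key string, peers []string) string {
	h := fnv.New32a()
	h.Write([]byte(key))
	return peers[h.Sum32()%uint32(len(peers))]
}

func main() {
	peers := []string{"10.0.0.1:81", "10.0.0.2:81", "10.0.0.3:81"}

	a := unionKey([]string{"per_user_bob", "per_account_42"})
	b := unionKey([]string{"per_account_42", "per_user_bob", "per_user_bob"})

	fmt.Println(a == b)                                   // true
	fmt.Println(pickPeer(a, peers) == pickPeer(b, peers)) // true
}
```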

thrawn01 · 2023-02-22T15:31:24Z

proto/gubernator.proto

@@ -47,6 +47,7 @@ service V1 {
 // Must specify at least one Request
 message GetRateLimitsReq {
  repeated RateLimitReq requests = 1;
+  UnionBehavior Behavior = 2;


This should probably be UnionBehavior UnionBehavior = 2;

thrawn01 · 2023-02-22T15:47:01Z

gubernator.go

+			}
+		}
+	}
+	// if atomic check is valid then run as we normally would


There is a possible race condition where we check the limits, but by the time we are ready to apply the limits some other instance has incremented the limits we are interested in.

One solution would be to have the union of rate limits all be owned by the same instance such that if we see the union behavior flag set, then we send the complete list of rate limits to a single owning instance which can then atomically check all the ratelimits in the list (by locking the entire cache).

Does that make sense?
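A rough sketch of the single-owner idea follows. It assumes all limits in the union live in one in-memory map on one instance. The `owner` and `hit` types and the `applyAll` method are hypothetical and are not part of Gubernator. Checking and applying happen under the same lock, so no other request can change the counters between the two steps.

```go
package main

import (
	"fmt"
	"sync"
)

// hit is one rate limit request inside a union.
type hit struct {
	key  string
	hits int64
}

// owner is a hypothetical single instance that holds every limit in a union.
type owner struct {
	mu        sync.Mutex
	remaining map[string]int64
}

// applyAll checks every request and applies them only if all of them fit.
// Both steps run under one lock. An unknown key is treated as exhausted.
func (o *owner) applyAll(reqs []hit) bool {
	o.mu.Lock()
	defer o.mu.Unlock()

	// Sum hits per key so duplicate keys in one batch are checked together.
	need := make(map[string]int64, len(reqs))
	for _, r := range reqs {
		need[r.key] += r.hits
	}
	for k, n := range need {
		if n > o.remaining[k] {
			return false // nothing has been modified yet
		}
	}
	for k, n := range need {
		o.remaining[k] -= n
	}
	return true
}

func main() {
	o := &owner{remaining: map[string]int64{"per_user": 5, "per_account": 1}}
	reqs := []hit{{"per_user", 1}, {"per_account", 1}}

	fmt.Println(o.applyAll(reqs)) // true
	fmt.Println(o.applyAll(reqs)) // false, per_account is exhausted
	fmt.Println(o.remaining["per_user"]) // 4, the failed batch changed nothing
}
```

The cost of this design is that one lock serializes every union handled by the instance, which is the trade-off implied by "locking the entire cache".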

thrawn01 · 2023-08-17T02:10:16Z

I'm so sorry this code change hasn't worked out. This would be a very nice feature to have, but implementing it in a way that is safe from race conditions is going to be a challenge.

The way forward is to either
A. Create a union of the rate limits such that one instance owns the union (thus all the ratelimits) and can be worked on in a batch
B. Add a new Peer call which reserves each limit on the owning instances and if any fail to reserve then all rate limits can be reverted.
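Option B can be sketched as a reserve-then-revert loop. This is a single-process illustration with hypothetical names (`limiter`, `reserve`, `revert`, `reserveAll`). In a real cluster each call would be a request to the instance that owns the limit, and failure handling for those network calls is not shown.

```go
package main

import (
	"fmt"
	"sync"
)

// limiter stands in for one rate limit held by its owning instance.
type limiter struct {
	mu        sync.Mutex
	remaining int64
}

// reserve takes hits from the limiter if they fit and reports success.
func (l *limiter) reserve(hits int64) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if hits > l.remaining {
		return false
	}
	l.remaining -= hits
	return true
}

// revert gives back hits taken by an earlier successful reserve.
func (l *limiter) revert(hits int64) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.remaining += hits
}

// reserveAll reserves hits on every limiter in order. On the first failure
// it reverts the limiters that were already reserved and returns false.
func reserveAll(limiters []*limiter, hits int64) bool {
	for i, l := range limiters {
		if !l.reserve(hits) {
			for _, done := range limiters[:i] {
				done.revert(hits)
			}
			return false
		}
	}
	return true
}

func main() {
	a := &limiter{remaining: 2}
	b := &limiter{remaining: 0}

	fmt.Println(reserveAll([]*limiter{a, b}, 1)) // false
	fmt.Println(a.remaining)                     // 2, the reservation on a was reverted
}
```

Unlike option A, other callers can briefly observe the reduced count on `a` between the reserve and the revert, so a concurrent request may be rejected even though the chain ultimately fails.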

I'm about to make some backward incompatible changes to the peer clients and the worker pool which WILL make implementing this feature easier, but will change so much that this PR as written would need to be re-done.

I want to keep this code around in the branch as reference, but I will close this PR until we have a good direction forward.

I hate to throw away good code, but in this instance, I think we must at least for now.

tassm · 2023-08-17T11:10:03Z

Hey @thrawn01, no worries. Unfortunately I've been focused elsewhere the last few months and haven't managed to revisit this. I agree it is a more complex challenge than I originally thought. Looking forward to seeing your changes and maybe I can help you revisit this feature in the future.

tassm added 6 commits October 24, 2022 21:39

🎨 fix typo

587724c

💎 update protos for atomic chaining

33a322e

💎 atomic chaining of limiters

fd8af64

♻️ request proto contains atomic check

c8bab72

✅ atomic chaining integration test

2e3f5e4

✅ add missing test for nil requests

e35a4fd

tassm requested a review from Baliedge as a code owner November 11, 2022 06:25

Baliedge requested a review from thrawn01 November 11, 2022 14:36

Baliedge suggested changes Nov 11, 2022

View reviewed changes

♻️ refactor algorithms update logic + cleanup

f73f997

tassm requested review from Baliedge and removed request for thrawn01 November 13, 2022 22:44

thrawn01 suggested changes Feb 22, 2023

View reviewed changes

thrawn01 self-assigned this Feb 22, 2023

thrawn01 added the enhancement New feature or request label Feb 22, 2023

thrawn01 closed this Aug 17, 2023
