We've known for a while that basing cache selection solely on GeoIP is problematic. The most significant issues:
No way for the load on a cache to feed back into the selection process. This creates a thundering herd problem: a large number of clients knocks over a single cache, then the next cache on the list, then the next...
Doesn't take into account file locality and the cost of making a new copy from the origin. A nearby cache without the file will be preferred 100% of the time over a slightly more distant cache with the file.
Doesn't balance across multiple caches within a region.
I propose a new weighted round robin algorithm. When matchmaking between a client and a server, we:
Select the top 6 servers based on GeoIP.
Assign a weight of 1.0 to each server.
Modify the weight based on GeoIP distance. If the server and client are at the same location, the weight remains 1.0; otherwise, it is decreased by half for every 200 miles of distance (meaning a server 800 miles away has a weight of 0.0625).
Multiply the weight by 2.0 if the server is known to have the object. Multiply by 0.5 if the server is known to not have the object. Do nothing if we don't know whether the object is cached.
Multiply the weight by a load multiplier. If the load is unknown, it is considered to be 1.0. If the load is between 0.0 and 10.0, the load multiplier is 1.0. Above 10.0, the load multiplier decreases by half for every 4 units of load. So, a load of 18.0 multiplies the weight by 0.25, while a load of 8.0 multiplies it by 1.0.
Create ranges [0, weight_1), [weight_1, weight_1 + weight_2), ... for servers 1 through 6. Select a random number in the range [0, sum(weights)). The server whose range contains the number is sorted to the top.
Repeat step (6), excluding already-selected servers, to select the second- and third-place servers.
Respond to the client with the top 3 servers.
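The matchmaking steps above can be sketched roughly as follows. This is a hypothetical illustration, not an existing implementation; the function names, the `(server_id, weight)` representation, and the `has_object`/`load` inputs are assumptions made for the sketch:

```python
import random

# Constants from the proposal.
DISTANCE_HALVING_MILES = 200.0
OBJECT_PRESENT_MULT = 2.0
OBJECT_ABSENT_MULT = 0.5
LOAD_THRESHOLD = 10.0
LOAD_HALVING_UNITS = 4.0


def server_weight(distance_miles, has_object, load):
    """Compute the weight for one candidate server.

    has_object: True, False, or None (unknown).
    load: a float, or None if unknown (treated as 1.0).
    """
    # GeoIP component: halve the weight for every 200 miles of distance.
    weight = 0.5 ** (distance_miles / DISTANCE_HALVING_MILES)

    # Object-locality component: 2x if present, 0.5x if absent,
    # unchanged if unknown.
    if has_object is True:
        weight *= OBJECT_PRESENT_MULT
    elif has_object is False:
        weight *= OBJECT_ABSENT_MULT

    # Load component: no penalty up to the threshold, then halve
    # for every 4 units of load above it.
    if load is None:
        load = 1.0
    if load > LOAD_THRESHOLD:
        weight *= 0.5 ** ((load - LOAD_THRESHOLD) / LOAD_HALVING_UNITS)
    return weight


def select_top3(candidates):
    """Weighted sampling without replacement: pick 3 of the candidates.

    candidates: list of (server_id, weight) pairs.
    """
    pool = list(candidates)
    chosen = []
    for _ in range(3):
        total = sum(w for _, w in pool)
        r = random.uniform(0.0, total)
        cumulative = 0.0
        for i, (sid, w) in enumerate(pool):
            cumulative += w
            if r < cumulative:
                chosen.append(sid)
                pool.pop(i)
                break
        else:
            # Float edge case: r landed exactly on total; take the last one.
            chosen.append(pool.pop()[0])
    return chosen
```

Removing each winner from the pool before re-drawing is what makes the repeated step (6) yield three distinct servers.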
Constants in this proposal are:
Number of servers under consideration: 6
Halving distance for the GeoIP weight: 200 miles
Multiplier for knowing whether an object is present: 2
Threshold where the load modifier kicks in: 10.0
Halving interval for load: 4.0
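With these constants, the combined weight of one candidate reduces to a single expression. A minimal sketch, assuming distance is in miles, an unknown load counts as 1.0, and `object_multiplier` collapses the three locality cases into one argument:

```python
def combined_weight(distance_miles, object_multiplier, load):
    """object_multiplier is 2.0 (object present), 0.5 (absent), or 1.0 (unknown)."""
    geo = 0.5 ** (distance_miles / 200.0)                 # halve per 200 miles
    load_mult = 0.5 ** (max(0.0, load - 10.0) / 4.0)      # halve per 4 units above 10
    return geo * object_multiplier * load_mult

# A server 400 miles away that has the object, at load 14.0:
# 0.25 (geo) * 2.0 (object) * 0.5 (load) = 0.25
print(combined_weight(400, 2.0, 14.0))
```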
The reasons I like this approach are:
Takes into account GeoIP, load, and object locality.
Stochastic -- we randomly distribute clients instead of putting the load deterministically on one server.
In the absence of load and object-locality data, it behaves similarly to today's algorithm.