Rate limiting algorithms

https://www.figma.com/blog/an-alternative-approach-to-rate-limiting/

### [Token bucket](https://www.figma.com/blog/an-alternative-approach-to-rate-limiting/#token-bucket)
Whenever a new request arrives from a user, the rate limiter would have to do a number of things to track usage. It would fetch the hash from Redis and refill the available tokens based on a chosen refill rate and the time of the user’s last request. Then, it would update the hash with the current request’s timestamp and the new available token count. When the available token count drops to zero, the rate limiter knows the user has exceeded the limit.
![image](https://user-images.githubusercontent.com/25551568/215150180-8805c773-342e-4728-9c8c-672594d32c9e.png)
Our token bucket implementation could achieve atomicity if each process were to fetch a [Redis lock](https://redis.io/topics/distlock) for the duration of its Redis operations. This, however, would come at the expense of slowing down concurrent requests from the same user and introducing another layer of complexity. Alternatively, we could make the token bucket’s Redis operations atomic [via Lua scripting](https://gist.github.com/ptarjan/e38f45f2dfe601419ca3af937fff574d#request-rate-limiter). For simplicity, however, I decided to avoid the unnecessary complications of adding another language to our codebase.
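
The refill-then-spend steps described above can be sketched as follows. This is a minimal illustration, not the blog's implementation: an in-memory dict stands in for the Redis hash, and the names `TokenBucketLimiter`, `Bucket`, `capacity`, and `refill_per_sec` are hypothetical. It is single-process, so it sidesteps the atomicity problem rather than solving it.

```python
import time
from dataclasses import dataclass


@dataclass
class Bucket:
    tokens: float        # tokens currently available to the user
    last_request: float  # timestamp of the user's last request


class TokenBucketLimiter:
    """In-memory token bucket; the dict is a stand-in for a Redis hash."""

    def __init__(self, capacity, refill_per_sec):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.store = {}  # user_id -> Bucket (assumption: replaces Redis)

    def allow(self, user_id, now=None):
        now = time.time() if now is None else now
        bucket = self.store.get(user_id)
        if bucket is None:
            # First request from this user: start with a full bucket.
            bucket = Bucket(tokens=self.capacity, last_request=now)

        # Refill based on the time elapsed since the last request.
        elapsed = max(0.0, now - bucket.last_request)
        tokens = min(self.capacity, bucket.tokens + elapsed * self.refill_per_sec)

        # Spend one token if any is available.
        allowed = tokens >= 1
        if allowed:
            tokens -= 1

        # Write back the new token count and the current timestamp.
        self.store[user_id] = Bucket(tokens=tokens, last_request=now)
        return allowed
```

Passing `now` explicitly keeps the sketch deterministic, for example `TokenBucketLimiter(capacity=5, refill_per_sec=1.0).allow("user-1", now=0.0)`. With a real Redis backend, the read, refill, and write-back would need to run atomically, which is the concern the paragraph above raises.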

### [Fixed window counters](https://www.figma.com/blog/an-alternative-approach-to-rate-limiting/#fixed-window-counters)
When incrementing the request count for a given timestamp, we would compare its value to our rate limit to decide whether to reject the request. We would also tell Redis to expire the key when the current minute passed to ensure that stale counters didn’t stick around forever.
![image](https://user-images.githubusercontent.com/25551568/215151941-91542c00-c9b2-42f0-a8c0-2659359646a0.png)
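
As a rough sketch of the increment-then-compare flow, assuming an in-memory dict in place of Redis: each counter is keyed by user and window start, and carries an expiry time that mimics the Redis key expiry. The class name `FixedWindowLimiter` and its parameters are hypothetical.

```python
import time


class FixedWindowLimiter:
    """In-memory fixed window counter; the dict is a stand-in for Redis keys."""

    def __init__(self, limit, window_secs=60):
        self.limit = limit
        self.window_secs = window_secs
        # (user_id, window_start) -> (count, expires_at)
        self.counters = {}

    def allow(self, user_id, now=None):
        now = time.time() if now is None else now

        # Drop counters whose window has passed (mimics Redis key expiry).
        self.counters = {
            key: value for key, value in self.counters.items() if value[1] > now
        }

        # Identify the fixed window that the current request falls into.
        window_start = int(now // self.window_secs) * self.window_secs
        key = (user_id, window_start)

        # Increment first, then compare the new value to the limit.
        count, _ = self.counters.get(key, (0, 0))
        count += 1
        self.counters[key] = (count, window_start + self.window_secs)
        return count <= self.limit
```

With `limit=5`, the sixth call in the same minute returns `False`, and the first call in the next minute returns `True` because it lands in a fresh counter.
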
Although the fixed window approach offers a straightforward mental model, it can sometimes let through twice the number of allowed requests per minute.
![image](https://user-images.githubusercontent.com/25551568/215152035-30d098cc-edc5-4d58-b678-a569c044486e.png)
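
To make the boundary problem concrete, the sketch below counts a burst of requests two ways: per fixed one-minute window, and within the busiest 60-second span. The timestamps and the helper names `fixed_window_counts` and `max_in_any_window` are made up for illustration, using a limit of 5 requests per minute.

```python
def fixed_window_counts(timestamps, window_secs=60):
    """Count requests per fixed window, keyed by window index."""
    counts = {}
    for t in timestamps:
        window = int(t // window_secs)
        counts[window] = counts.get(window, 0) + 1
    return counts


def max_in_any_window(timestamps, window_secs=60):
    """Largest number of requests inside any span shorter than window_secs."""
    ordered = sorted(timestamps)
    best = 0
    start = 0
    for end, t in enumerate(ordered):
        # Advance the left edge until the span fits inside one window.
        while t - ordered[start] >= window_secs:
            start += 1
        best = max(best, end - start + 1)
    return best


LIMIT = 5

# Five requests just before the minute boundary, five just after it.
burst = [59.0, 59.1, 59.2, 59.3, 59.4, 60.0, 60.1, 60.2, 60.3, 60.4]

per_window = fixed_window_counts(burst)
within_limit = all(count <= LIMIT for count in per_window.values())
peak = max_in_any_window(burst)
```

Here `within_limit` is `True` because each fixed window sees only 5 requests, yet `peak` is 10, which is twice the limit inside a span of less than two seconds.
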
We could avoid this issue by adding another rate limit with a smaller threshold and a shorter enforcement window (e.g. 2 requests per second in addition to 5 requests per minute), but this would overcomplicate the rate limit. Arguably, it would also impose too severe a restriction on how often the user could make requests.

### [Sliding window counters](https://www.figma.com/blog/an-alternative-approach-to-rate-limiting/#sliding-window-counters)
![image](https://user-images.githubusercontent.com/25551568/215152210-9d970775-d43d-4100-ace5-b77213679681.png)
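
This section carries no text here, so the following is only one plausible reading of the heading, not the blog's implementation: keep a counter per small sub-window (one second by default) and sum the counters that fall inside the last full rate-limit window. The class name, the in-memory dict, and the `bucket_secs` parameter are all assumptions.

```python
import time


class SlidingWindowCounterLimiter:
    """In-memory sliding window built from small per-bucket counters."""

    def __init__(self, limit, window_secs=60, bucket_secs=1):
        self.limit = limit
        self.window_secs = window_secs
        self.bucket_secs = bucket_secs
        self.buckets = {}  # user_id -> {bucket_start: count}

    def allow(self, user_id, now=None):
        now = time.time() if now is None else now
        user_buckets = self.buckets.setdefault(user_id, {})

        # Discard buckets that lie entirely outside the sliding window.
        cutoff = now - self.window_secs
        stale = [s for s in user_buckets if s + self.bucket_secs <= cutoff]
        for start in stale:
            del user_buckets[start]

        # Reject if the remaining buckets already add up to the limit.
        if sum(user_buckets.values()) >= self.limit:
            return False

        # Otherwise count this request in the bucket for the current moment.
        bucket_start = int(now // self.bucket_secs) * self.bucket_secs
        user_buckets[bucket_start] = user_buckets.get(bucket_start, 0) + 1
        return True
```

Under this sketch, five requests just before a minute boundary block further requests just after it, so the doubling seen with fixed windows does not occur. The price is one counter per sub-window per active user, and precision limited to the sub-window size.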



Finally, we had to reflect on how to respond to users who exceeded the rate limit. Traditionally, web applications respond to requests from users who surpass the rate limit with an HTTP response code of 429. Our rate limiter initially did so as well. But in the case of Figma’s spam attack, our attackers saw the response code change from 200 to 429 and simply created new accounts to circumvent the rate limiting on their blocked accounts. In response, we implemented a [shadow ban](https://en.wikipedia.org/wiki/Shadow_banning): on the surface, the attackers continued to receive a 200 HTTP response code, but behind the scenes we simply stopped sending document invitations after they exceeded the rate limit.
