API Rate Limiting
Rate limiting is a strategy for restricting access to APIs. It caps the number of API calls that a client can make within a given timeframe. This helps defend the API against abuse, whether from unintentional overuse or malicious scripts.
Rate limits are typically applied by tracking the client’s IP address, API key, or access token. As API developers, we can choose to respond in several different ways when a client reaches the limit:
- Queueing the request until the remaining time period has elapsed.
- Allowing the request immediately but charging extra for this request.
- Rejecting the request with HTTP 429 Too Many Requests, which is the most common approach.
Sliding Log Algorithm
Sliding Log rate limiting involves tracking a timestamped log entry for each consumer request. These entries are usually stored in a hash set or table that is sorted by time. Entries with timestamps older than the time window are discarded. When a new request comes in, we count the entries remaining in the log to determine the request rate. If the request would exceed the threshold rate, then it is held.
The advantage of this algorithm is that it does not suffer from the boundary conditions of fixed windows. The rate limit will be enforced precisely and, because the sliding log is tracked for each consumer, you don’t have the rush effect that challenges fixed windows. However, it can be very expensive to store an unlimited number of logs for every request. It’s also expensive to compute because each request requires calculating a summation over the consumer’s prior requests, potentially across a cluster of servers. As a result, it does not scale well to handle large bursts of traffic or denial of service attacks.
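To make the steps above concrete, here is a minimal sketch of a sliding log for a single consumer. The class name SlidingLogLimiter and the method tryAcquire are illustrative and not taken from any library. The current time is passed in as a parameter so that the behaviour is easy to follow and test, and a rejected request is simply not logged, which is one possible design choice.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative sliding-log limiter for one consumer (hypothetical names).
class SlidingLogLimiter {
    private final int maxRequests;   // allowed requests per window
    private final long windowMillis; // window length in milliseconds
    private final Deque<Long> log = new ArrayDeque<>(); // timestamps, oldest first

    SlidingLogLimiter(int maxRequests, long windowMillis) {
        this.maxRequests = maxRequests;
        this.windowMillis = windowMillis;
    }

    // Returns true and logs the request if it fits within the limit.
    synchronized boolean tryAcquire(long nowMillis) {
        // Discard entries that have slid out of the window.
        while (!log.isEmpty() && nowMillis - log.peekFirst() >= windowMillis) {
            log.pollFirst();
        }
        if (log.size() >= maxRequests) {
            return false; // limit reached; the rejected request is not logged
        }
        log.addLast(nowMillis);
        return true;
    }
}
```

With a limit of 2 requests per 60 seconds, calls at 0 s and 1 s are accepted, a call at 2 s is refused, and a call at 60 s is accepted again because the entry logged at 0 s has left the window.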
Please refer to the Understanding Rate Limiting Algorithms blog where the Sliding Log and other algorithms have been explained in detail.
Building a Spring Boot Application with an API Rate Limiter
Create a new Spring Boot application from Spring Initializr with a dependency on the Spring Web module.
Unzip the downloaded project and import it into your IDE. We are going to implement a simple calculator REST API that supports operations such as add and subtract.
Let’s ensure that the APIs above are up and running as expected. You can use cURL or Postman to make an API call.
Now that we have APIs ready to consume, next let’s introduce some subscription plans with rate limits. Let’s assume that we have the following subscription plans for our clients:
- Free Subscription allows 2 requests per 60 seconds.
- Basic Subscription allows 10 requests per 60 seconds.
- Professional Subscription allows 20 requests per 60 seconds.
Each API client gets a unique API key that they must send along with each request. This helps us identify the client and the subscription plan linked to it.
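One way to model these plans is an enum that carries the request limit for each tier. The sketch below is only an illustration: the article does not say how an API key maps to a plan, so the key prefixes "PX-" and "BX-" and the fromApiKey method are assumptions made for this example.

```java
// Subscription plans with their per-window request limits.
enum SubscriptionPlan {
    FREE(2),
    BASIC(10),
    PROFESSIONAL(20);

    // All plans in this article share a 60-second window.
    static final int WINDOW_SECONDS = 60;

    private final int maxRequests;

    SubscriptionPlan(int maxRequests) {
        this.maxRequests = maxRequests;
    }

    int getMaxRequests() {
        return maxRequests;
    }

    // Hypothetical convention: the key prefix encodes the plan.
    // Unknown or missing keys fall back to the free plan.
    static SubscriptionPlan fromApiKey(String apiKey) {
        if (apiKey != null && apiKey.startsWith("PX-")) {
            return PROFESSIONAL;
        }
        if (apiKey != null && apiKey.startsWith("BX-")) {
            return BASIC;
        }
        return FREE;
    }
}
```

In a real system, the plan would more likely be looked up from a database or an identity provider than derived from the key itself.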
Next, we create a subscription service that stores a reference for each API client in memory.
Let’s understand the implementation. The API client sends an API key with the X-Subscription-Key request header. We use the SubscriptionService to get the user reference for the API key, then use that reference’s methods to check whether the request is allowed.
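A rough sketch of such an in-memory service is shown below. ClientState is a placeholder for whatever per-client object the application keeps, and resolveUser and the planLimit lookup function are hypothetical names chosen for this example. In a Spring Boot application the class would typically be registered as a bean (for instance with @Service), which is left out here so the snippet stays plain Java.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.ToIntFunction;

// Placeholder for the per-client state; rate-limit bookkeeping is omitted here.
class ClientState {
    final String apiKey;
    final int maxRequests;

    ClientState(String apiKey, int maxRequests) {
        this.apiKey = apiKey;
        this.maxRequests = maxRequests;
    }
}

// Keeps one ClientState per API key in memory (state is lost on restart).
class SubscriptionService {
    private final Map<String, ClientState> clients = new ConcurrentHashMap<>();
    private final ToIntFunction<String> planLimit; // hypothetical: API key -> request limit

    SubscriptionService(ToIntFunction<String> planLimit) {
        this.planLimit = planLimit;
    }

    // Returns the existing state for the key, creating it on first use.
    ClientState resolveUser(String apiKey) {
        return clients.computeIfAbsent(
                apiKey, key -> new ClientState(key, planLimit.applyAsInt(key)));
    }
}
```

ConcurrentHashMap.computeIfAbsent creates the entry atomically, so two concurrent requests with the same key end up sharing one state object.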
In order to enhance the client experience of the API, we will add the following additional response headers to send information about the rate limit.
- X-Rate-Limit-Remaining: number of requests remaining in the current time window.
- X-Rate-Limit-Retry-After-Seconds: remaining time in seconds until the client can send another request.
We can call the UserRequestData methods getRemainingRequests and getRequestWaitTime to get the count of remaining requests and the time remaining until the next request is allowed, respectively. The implementation of this class is short and straightforward to follow.
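As an illustration of what these two methods might compute, here is a sketch of a UserRequestData class backed by a sliding log. Only the class and method names come from the text above. The constructor, the recordRequest helper, the nowMillis parameter and the choice to round the wait time up to whole seconds are assumptions for this example and may differ from the actual source code.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Per-client sliding log (signatures are assumptions, not the article's code).
class UserRequestData {
    private final int maxRequests;
    private final long windowMillis;
    private final Deque<Long> log = new ArrayDeque<>(); // timestamps, oldest first

    UserRequestData(int maxRequests, long windowMillis) {
        if (maxRequests <= 0) {
            throw new IllegalArgumentException("maxRequests must be positive");
        }
        this.maxRequests = maxRequests;
        this.windowMillis = windowMillis;
    }

    // Hypothetical helper: logs an accepted request.
    synchronized void recordRequest(long nowMillis) {
        log.addLast(nowMillis);
    }

    // Requests still available in the window ending at nowMillis.
    synchronized int getRemainingRequests(long nowMillis) {
        evictExpired(nowMillis);
        return Math.max(0, maxRequests - log.size());
    }

    // Seconds until the oldest entry expires, or 0 if a request is allowed now.
    synchronized long getRequestWaitTime(long nowMillis) {
        evictExpired(nowMillis);
        if (log.size() < maxRequests) {
            return 0;
        }
        long waitMillis = log.peekFirst() + windowMillis - nowMillis;
        return (waitMillis + 999) / 1000; // round up to whole seconds
    }

    private void evictExpired(long nowMillis) {
        while (!log.isEmpty() && nowMillis - log.peekFirst() >= windowMillis) {
            log.pollFirst();
        }
    }
}
```

For a free-plan client (2 requests per 60 seconds) with requests logged at 0 s and 10 s, a check at 10 s reports no remaining requests and a wait of 50 seconds, which is when the first entry leaves the window.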
Here is the implementation of the Interceptor to validate the request with rate limiter to see whether we accept or reject the request.
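The decision the interceptor has to make can be sketched independently of Spring, as below. This is not the article's RateLimitInterceptor: RateLimitCheck and RateLimitResult are hypothetical names, a single limit is applied to every key for brevity, and answering a missing key with 400 is an assumption. In a Spring HandlerInterceptor, the same logic would run inside preHandle, reading X-Subscription-Key from the request, copying the status and headers onto the response, and returning false to stop a rejected request.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Framework-neutral outcome: HTTP status plus headers for the response.
class RateLimitResult {
    final int status;
    final Map<String, String> headers = new LinkedHashMap<>();

    RateLimitResult(int status) {
        this.status = status;
    }
}

// Hypothetical check that an interceptor could delegate to.
class RateLimitCheck {
    private final int maxRequests;
    private final long windowMillis;
    private final Map<String, Deque<Long>> logs = new ConcurrentHashMap<>();

    RateLimitCheck(int maxRequests, long windowMillis) {
        if (maxRequests <= 0) {
            throw new IllegalArgumentException("maxRequests must be positive");
        }
        this.maxRequests = maxRequests;
        this.windowMillis = windowMillis;
    }

    RateLimitResult check(String apiKey, long nowMillis) {
        if (apiKey == null || apiKey.isEmpty()) {
            return new RateLimitResult(400); // assumption: missing key is a bad request
        }
        Deque<Long> log = logs.computeIfAbsent(apiKey, k -> new ArrayDeque<>());
        synchronized (log) {
            // Drop entries that are older than the window.
            while (!log.isEmpty() && nowMillis - log.peekFirst() >= windowMillis) {
                log.pollFirst();
            }
            if (log.size() < maxRequests) {
                log.addLast(nowMillis);
                RateLimitResult ok = new RateLimitResult(200);
                ok.headers.put("X-Rate-Limit-Remaining",
                        String.valueOf(maxRequests - log.size()));
                return ok;
            }
            // Limit reached: report when the oldest entry will expire.
            long waitSeconds = (log.peekFirst() + windowMillis - nowMillis + 999) / 1000;
            RateLimitResult rejected = new RateLimitResult(429);
            rejected.headers.put("X-Rate-Limit-Retry-After-Seconds",
                    String.valueOf(waitSeconds));
            return rejected;
        }
    }
}
```

Keeping the decision separate from the servlet types makes it possible to unit test the limiter without starting the web layer.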
Finally, let’s add the interceptor to Spring Boot’s InterceptorRegistry so that the RateLimitInterceptor intercepts each request to our calculator API endpoints.
Let’s invoke the calculator API to see the behaviour.
The client has to send the API key in the HTTP header; otherwise, the interceptor will not process the request. Let’s add the API key to the header and make the call.
You can see that once the API key is added to the header, the API responds to our request and adds a response header showing how many requests remain for the API key.
Let’s make 2 more calls. We should then see that we have exhausted the rate limit for the free plan and that the API returns 429 as the response.
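A registration along these lines is usually done in a WebMvcConfigurer. The fragment below is a configuration sketch that needs the Spring Web dependency on the classpath. The class name AppConfig and the path pattern /api/v1/calculator/** are placeholders, since the article does not state the actual endpoint paths. Injecting by the HandlerInterceptor type assumes the rate limit interceptor is the only bean of that type.

```java
import org.springframework.context.annotation.Configuration;
import org.springframework.web.servlet.HandlerInterceptor;
import org.springframework.web.servlet.config.annotation.InterceptorRegistry;
import org.springframework.web.servlet.config.annotation.WebMvcConfigurer;

@Configuration
class AppConfig implements WebMvcConfigurer {

    // Assumed to be the rate limit interceptor, registered as a Spring bean.
    private final HandlerInterceptor rateLimitInterceptor;

    AppConfig(HandlerInterceptor rateLimitInterceptor) {
        this.rateLimitInterceptor = rateLimitInterceptor;
    }

    @Override
    public void addInterceptors(InterceptorRegistry registry) {
        // Placeholder pattern: apply the interceptor only to calculator endpoints.
        registry.addInterceptor(rateLimitInterceptor)
                .addPathPatterns("/api/v1/calculator/**");
    }
}
```

Restricting the path pattern keeps the rate limiter away from endpoints that should not be throttled, such as health checks.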
It looks like we have successfully implemented the rate limiter using the Sliding Log algorithm. We can keep adding endpoints and the interceptor would apply the rate limit for each request.
As usual, the source code for the above Spring Boot implementation is available over on GitHub.