How You Can Implement Rate Limiting in Your System

Dipesh Ghimire
Published: September 6, 2025

Imagine your API suddenly receives thousands of requests and the whole system slows down. The problem is that some users (or bots) have started hammering your API with tons of requests, say, a bot spamming the “search gigs” endpoint 1,000 times a second. Left unchecked, this can:

Crash your server: Too many requests overload your backend, slowing it down or bringing it to its knees.

Ruin user experience: Legit users (like a writer looking for a Mars mission blog gig) get sluggish responses or errors.

Cost you money: If you’re on a cloud provider like AWS, every request racks up costs.

Open security risks: Spammers or attackers could exploit your API to scrape data or disrupt service.

This is where rate limiting comes in: it acts as a bouncer for your API. It caps the request rate, for example allowing only X requests per second, so your system keeps running smoothly and no single client can drain its resources.

Rate Limiting:

In simple terms, rate limiting caps how many requests a user (or IP, or API key) can send to your API in a given time frame. For example, you might say, “Each user gets 100 requests per minute to the gig search endpoint.” If they go over, they get a polite “429 Too Many Requests” error until the timer resets.

How Can Spring Boot Do Rate Limiting?

In a Spring Boot application, rate limiting is usually built on one of a few well-known algorithms:

1. Token Bucket

The token bucket algorithm controls access by maintaining a bucket filled with tokens. Each request consumes a token, and once the bucket is empty, requests are denied until tokens are replenished. Tokens are added back at a fixed rate, up to the bucket’s capacity.
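The idea can be sketched in plain Java. This is a minimal illustration rather than how any particular library implements it; the class name `TokenBucket`, its constructor parameters, and the injectable nanosecond clock are all assumptions made for the example.

```java
import java.util.function.LongSupplier;

/** Minimal token bucket sketch: starts full and refills continuously at a fixed rate. */
class TokenBucket {
    private final long capacity;
    private final double refillPerNano;   // tokens added per nanosecond
    private final LongSupplier clock;     // nanosecond clock, injectable so it can be tested
    private double tokens;
    private long lastRefill;

    TokenBucket(long capacity, long refillTokens, long refillPeriodNanos, LongSupplier clock) {
        this.capacity = capacity;
        this.refillPerNano = (double) refillTokens / refillPeriodNanos;
        this.clock = clock;
        this.tokens = capacity;           // the bucket starts full
        this.lastRefill = clock.getAsLong();
    }

    /** Takes one token if available; returns false when the bucket is empty. */
    synchronized boolean tryConsume() {
        long now = clock.getAsLong();
        // Add the tokens earned since the last call, never exceeding capacity.
        tokens = Math.min(capacity, tokens + (now - lastRefill) * refillPerNano);
        lastRefill = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

In real code the clock would typically be `System::nanoTime`, for example `new TokenBucket(100, 100, 60_000_000_000L, System::nanoTime)` for roughly 100 requests per minute.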

2. Fixed Window Counter

It divides time into fixed windows (e.g., one minute) and counts the number of requests in each window. Once the limit is reached, all further requests in that window are rejected.
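A fixed window counter is small enough to sketch in a few lines of plain Java. The class name `FixedWindowCounter`, its parameters, and the injectable millisecond clock are hypothetical and chosen only for this illustration.

```java
import java.util.function.LongSupplier;

/** Minimal fixed window counter sketch: at most `limit` requests per window. */
class FixedWindowCounter {
    private final int limit;
    private final long windowMillis;
    private final LongSupplier clock;   // millisecond clock, injectable so it can be tested
    private long currentWindow = -1;    // index of the window being counted
    private int count;

    FixedWindowCounter(int limit, long windowMillis, LongSupplier clock) {
        this.limit = limit;
        this.windowMillis = windowMillis;
        this.clock = clock;
    }

    /** Returns true if the request fits in the current window, false otherwise. */
    synchronized boolean tryAcquire() {
        long window = clock.getAsLong() / windowMillis;
        if (window != currentWindow) {
            // A new window has started, so the counter resets.
            currentWindow = window;
            count = 0;
        }
        if (count < limit) {
            count++;
            return true;
        }
        return false;
    }
}
```

One consequence of this design is that a client can send a full quota at the very end of one window and another full quota at the start of the next, so short bursts of up to twice the limit are possible around a window boundary.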

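3. Sliding Window Log

Unlike fixed window counters, which track requests in predefined intervals, the sliding window log records each request’s timestamp and continuously removes expired entries. A new request is allowed only if the number of timestamps still inside the window is below the limit.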
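A sliding window log can be sketched with a queue of timestamps. As with any illustration of this kind, the class name `SlidingWindowLog`, its parameters, and the injectable millisecond clock are assumptions, not part of any library.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.LongSupplier;

/** Minimal sliding window log sketch: at most `limit` requests in any window of `windowMillis`. */
class SlidingWindowLog {
    private final int limit;
    private final long windowMillis;
    private final LongSupplier clock;   // millisecond clock, injectable so it can be tested
    private final Deque<Long> timestamps = new ArrayDeque<>();

    SlidingWindowLog(int limit, long windowMillis, LongSupplier clock) {
        this.limit = limit;
        this.windowMillis = windowMillis;
        this.clock = clock;
    }

    /** Records the request and returns true if it fits in the window, false otherwise. */
    synchronized boolean tryAcquire() {
        long now = clock.getAsLong();
        // Drop timestamps that have fallen out of the window.
        while (!timestamps.isEmpty() && now - timestamps.peekFirst() >= windowMillis) {
            timestamps.pollFirst();
        }
        if (timestamps.size() < limit) {
            timestamps.addLast(now);
            return true;
        }
        return false;
    }
}
```

The trade-off is memory: the log keeps one timestamp per accepted request, so it costs more per client than a single counter.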

Actual code implementation of Rate Limiting:

Bucket4j for In-Memory Rate Limiting (most popular):

Bucket4j is one of the commonly used Java rate limiting libraries, and it is built on the token bucket algorithm. Think of it as a container that holds X tokens. Each API call takes a token. If the bucket is empty, you’re blocked until it refills (e.g., every minute). It’s flexible and well suited to production.

Add Dependencies:

This dependency provides the classes needed for token-based request tracking.

//rate limiter
implementation("com.bucket4j:bucket4j_jdk8-core:8.10.1")

Implementing rate limiting with a filter:

A filter intercepts each request before it reaches the controller, which makes it a practical place for rate limiting. It checks whether any request quota is left and decides whether to permit or deny the request.

package com.bytespacenepal.common.config.rate_limiting;   // use your own package

import io.github.bucket4j.Bucket;
import jakarta.servlet.FilterChain;
import jakarta.servlet.ServletException;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import lombok.RequiredArgsConstructor;
import org.springframework.http.HttpStatus;
import org.springframework.lang.NonNull;
import org.springframework.stereotype.Component;
import org.springframework.web.filter.OncePerRequestFilter;

import java.io.IOException;

@Component
@RequiredArgsConstructor
public class RateLimitingFilter extends OncePerRequestFilter {

    // Injected through the Lombok-generated constructor; a Bucket bean must exist in the context.
    private final Bucket bucket;

    @Override
    protected void doFilterInternal(@NonNull HttpServletRequest request,
                                    @NonNull HttpServletResponse response,
                                    @NonNull FilterChain filterChain) throws IOException, ServletException {
        if (bucket.tryConsume(1)) {
            // A token was available: let the request continue to the controller.
            filterChain.doFilter(request, response);
        } else {
            // The bucket is empty: reject with 429 Too Many Requests.
            response.setStatus(HttpStatus.TOO_MANY_REQUESTS.value());
        }
    }
}
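The filter expects a `Bucket` bean to be injected, which is not shown above. One way to provide it is sketched below, assuming the builder API of Bucket4j 8.x and a limit of 100 requests per minute; the class name `RateLimitConfig` and the numbers are illustrative and do not come from the original project.

```java
package com.bytespacenepal.common.config.rate_limiting;   // use your own package

import io.github.bucket4j.Bandwidth;
import io.github.bucket4j.Bucket;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

import java.time.Duration;

@Configuration
public class RateLimitConfig {   // hypothetical class name

    @Bean
    public Bucket bucket() {
        // Assumed limit: a capacity of 100 tokens, refilled gradually at 100 tokens per minute.
        Bandwidth limit = Bandwidth.builder()
                .capacity(100)
                .refillGreedy(100, Duration.ofMinutes(1))
                .build();
        return Bucket.builder()
                .addLimit(limit)
                .build();
    }
}
```

Because this is a single bean, every client draws from the same 100 tokens. Per-user or per-IP limits would need one bucket per key, for example kept in a `ConcurrentHashMap` keyed by IP address or API key.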
