
How to Implement API Rate Limiting: Token Bucket and Leaky Bucket Algorithms
Exposing APIs to the public web makes them targets for abuse. Without protections, malicious scripts can spam login endpoints, scrapers can exhaust database resources, and unexpected traffic spikes can crash your servers.
To protect system resources, you must configure Rate Limiting.
In this guide, we will analyze two widely used rate-limiting algorithms, Token Bucket and Leaky Bucket, compare how they handle traffic bursts, and implement a distributed Sliding Window Counter using Redis.
1. The Token Bucket Algorithm (Allows Bursts)
The Token Bucket algorithm uses a virtual bucket containing tokens. The bucket has a maximum capacity, and it is continuously refilled with tokens at a constant rate (e.g., adding 5 tokens per second).
When a client request arrives:
- The system checks if the bucket has at least one token.
- If yes, the token is removed, and the request executes.
- If no, the request is rejected with a 429 Too Many Requests status.
Advantage: Handles Burst Traffic
If the bucket is full (e.g., holding 100 tokens), a user can suddenly execute 100 requests in a split second. The bucket empties immediately, but all requests succeed. This is useful for web APIs where page load cycles require loading multiple assets simultaneously.
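The refill-and-consume behavior described above can be sketched in a few lines. This is a minimal single-process illustration, not a production limiter: the class name TokenBucket, the method tryConsume, and the optional nowMs parameter (an injectable clock, used here to keep the example deterministic) are all hypothetical names chosen for this sketch. Tokens are refilled lazily on each call rather than by a background timer.

```typescript
// Minimal in-memory token bucket for a single process (illustrative sketch).
class TokenBucket {
  private readonly capacity: number;        // maximum tokens the bucket can hold
  private readonly refillPerSecond: number; // tokens added per second
  private tokens: number;                   // current token count (may be fractional)
  private lastRefillMs: number;             // time of the last refill calculation

  constructor(capacity: number, refillPerSecond: number, nowMs: number = Date.now()) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity; // start full so an initial burst is allowed
    this.lastRefillMs = nowMs;
  }

  // Returns true if the request may proceed, false if it should receive a 429.
  tryConsume(nowMs: number = Date.now()): boolean {
    // Add the tokens accrued since the last call, capped at capacity.
    const elapsedSeconds = (nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSeconds * this.refillPerSecond,
    );
    this.lastRefillMs = nowMs;

    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

With a capacity of 100 and a refill rate of 5 per second, a full bucket admits 100 back-to-back calls to tryConsume and rejects the 101st; one second later, 5 more calls succeed.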
2. The Leaky Bucket Algorithm (Smooth Execution)
The Leaky Bucket algorithm is shaped like a funnel. Incoming requests are poured into the bucket. The bucket leaks requests at a constant, fixed rate (e.g., executing exactly 2 requests per second).
- If the rate of incoming requests is faster than the leak rate, the water level rises (requests queue up).
- If the bucket overflows (the queue is full), incoming requests are rejected immediately.
Advantage: Smooths Traffic Spikes
Unlike the Token Bucket, the Leaky Bucket does not pass bursts through to the backend. Excess requests wait in the queue (or are rejected once it is full), so your database and downstream services receive a flat, steady flow of requests.
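As a rough illustration of the queue-and-leak behavior, the sketch below models the bucket as a bounded FIFO queue. The names LeakyBucket, offer, and tick are hypothetical. The leak rate is not stored in the class: the sketch assumes an external scheduler calls tick at a fixed interval (for example, every 500 ms for 2 requests per second), and each tick executes exactly one queued request.

```typescript
// A queued request is modeled as a callback to run when it leaks out.
type Task = () => void;

// Minimal in-memory leaky bucket (illustrative sketch, single process).
class LeakyBucket {
  private readonly queue: Task[] = [];
  private readonly capacity: number; // maximum number of queued requests

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  // Pour a request into the bucket. Returns false on overflow,
  // in which case the caller should reject the request immediately.
  offer(task: Task): boolean {
    if (this.queue.length >= this.capacity) return false;
    this.queue.push(task);
    return true;
  }

  // Leak one request: run the oldest queued task, if any.
  // Assumed to be called by a timer at the desired constant rate.
  tick(): boolean {
    const task = this.queue.shift();
    if (task === undefined) return false;
    task();
    return true;
  }

  // Current "water level" of the bucket.
  get size(): number {
    return this.queue.length;
  }
}
```

Because execution is driven only by tick, the downstream rate never exceeds the scheduler's rate, no matter how quickly offer is called.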
Implementing a Sliding Window Counter with Redis
While token and leaky buckets are elegant, implementing them inside distributed server clusters requires complex synchronization.
A simpler distributed approach is the Sliding Window Counter. The variant shown here, often called a sliding window log, stores one timestamp per request in a Redis Sorted Set (ZSET), which keeps the count exact at the cost of more memory.
The Sliding Window Logic
- The client IP acts as the key.
- Every request appends the current timestamp to the Sorted Set.
- The system deletes all timestamps older than the active time window.
- The system counts the remaining elements in the set. If the count exceeds the threshold, the request is blocked.
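Before moving to Redis, the four steps above can be tried out in a single process. The sketch below is an in-memory stand-in, assuming a plain Map in place of the Sorted Set; the class name SlidingWindowLimiter, the method allow, and the injectable nowMs clock are hypothetical names for this illustration. Like the steps above, it records every request, including ones that end up blocked.

```typescript
// In-memory illustration of the sliding window steps (single process only).
class SlidingWindowLimiter {
  private readonly log = new Map<string, number[]>(); // client key -> timestamps
  private readonly limit: number;    // maximum requests per window
  private readonly windowMs: number; // window length in milliseconds

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed, false if it should be blocked.
  allow(clientKey: string, nowMs: number = Date.now()): boolean {
    const cutoff = nowMs - this.windowMs;
    // Drop timestamps that have fallen out of the active window.
    const recent = (this.log.get(clientKey) ?? []).filter((t) => t > cutoff);
    // Record the current request.
    recent.push(nowMs);
    this.log.set(clientKey, recent);
    // Block once the count exceeds the threshold.
    return recent.length <= this.limit;
  }
}
```

With a limit of 3 and a 1000 ms window, three requests from one client inside the window are allowed and a fourth is blocked; once the earlier timestamps age out, the client is allowed again. Other client keys are tracked independently.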
Code Implementation (Node.js & Redis)
import { randomUUID } from 'node:crypto';
import Redis from 'ioredis';

const redis = new Redis();
const LIMIT = 50; // Allow at most 50 requests per window
const WINDOW_MS = 60000; // 60-second time window

export async function checkRateLimit(ip: string): Promise<boolean> {
  const key = `rate:${ip}`;
  const now = Date.now();
  const clearBefore = now - WINDOW_MS;
  // MULTI/EXEC runs the queued commands atomically as one transaction
  const result = await redis
    .multi()
    // Score is the timestamp; the random suffix keeps members unique so two
    // requests in the same millisecond are both counted
    .zadd(key, now, `${now}:${randomUUID()}`)
    // Remove all timestamps outside the window
    .zremrangebyscore(key, 0, clearBefore)
    // Count the remaining requests in the set (blocked requests are recorded too)
    .zcard(key)
    // Set an expiration on the key so idle clients do not hold memory
    .expire(key, Math.ceil(WINDOW_MS / 1000))
    .exec();
  // No result means the transaction did not run; fail closed
  if (!result) return false;
  // zcard is the third command, so its [error, value] pair is at index 2
  const [err, requestCount] = result[2];
  if (err) return false;
  // Allow the request only while the count is within the limit
  return (requestCount as number) <= LIMIT;
}
Algorithms Comparison
| Metric | Token Bucket | Leaky Bucket | Sliding Window |
| --- | --- | --- | --- |
| Handles Burst Traffic? | Yes (Up to bucket capacity) | No (Converts to steady rate) | Yes (Up to the limit within any window) |
| Queueing Overhead | None (Fails fast) | High (Requires queue buffers) | None (Fails fast) |
| Memory Consumption | Low (Stores counter value) | High (Stores queued requests) | High (Stores timestamps list) |
| Best Used For | Web APIs with asset peaks | Upstream database protection | Accurate User API quotas |
Conclusion
Rate limiting is essential for running reliable public APIs. Choose the Token Bucket algorithm to support burst traffic from web browsers. Select the Leaky Bucket algorithm when you need to smooth traffic spikes and protect downstream databases from sudden load. For distributed container environments, implement the Sliding Window Counter using Redis Sorted Sets so that every node enforces the same per-client limit from shared state.