I’ve been thinking a lot about rate limiting lately, especially after watching one of our production APIs struggle during a sudden traffic spike. What separates a resilient API from one that collapses under pressure? Often, it’s the quality of its rate limiting system. Today, I want to show you how to build a robust, distributed rate limiting system that can handle real-world traffic patterns while maintaining performance.
Let me walk you through building this system step by step.
Rate limiting isn’t just about stopping abuse—it’s about creating predictable, reliable APIs. Without proper rate limiting, a single enthusiastic user or a misconfigured client can bring down your entire service. But how do you build something that’s both effective and performant?
Here are the basic TypeScript interfaces that define our rate limiting contract:
interface RateLimitResult {
  allowed: boolean;
  remaining: number;
  resetTime: number;
}

interface RateLimiter {
  check(key: string): Promise<RateLimitResult>;
}
The token bucket algorithm has become my favorite approach because it handles bursts gracefully. Imagine you have a bucket that holds tokens. Each request consumes one token, and tokens refill at a steady rate. This means users can make several requests quickly if they have tokens saved up, then must wait for refills.
Here’s how I implement the core token bucket logic:
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private capacity: number,
    private refillRate: number // tokens added per millisecond
  ) {
    this.tokens = capacity;
    this.lastRefill = Date.now();
  }

  consume(tokens: number = 1): boolean {
    this.refill();
    if (this.tokens >= tokens) {
      this.tokens -= tokens;
      return true;
    }
    return false;
  }

  private refill(): void {
    const now = Date.now();
    const timePassed = now - this.lastRefill;
    // Accumulate fractional tokens; flooring here would silently discard
    // refill credit whenever checks arrive faster than one whole token
    // per interval, starving the bucket
    this.tokens = Math.min(
      this.capacity,
      this.tokens + timePassed * this.refillRate
    );
    this.lastRefill = now;
  }
}
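To make the units concrete, here's a quick sketch of the bucket in use: ten tokens of burst capacity, refilling at five tokens per second (the rate is expressed per millisecond to match Date.now()):

// 10-token burst capacity, refilling at 5 tokens/second (0.005 per ms)
const bucket = new TokenBucket(10, 5 / 1000);

if (bucket.consume()) {
  // proceed with the request
} else {
  // over the limit: reject, ideally with a 429
}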
But what happens when you have multiple server instances? This is where Redis becomes essential. Redis provides a shared state that all your servers can access, making distributed rate limiting possible.
Here’s my Redis-based storage implementation:
import Redis from 'ioredis';

class RedisRateLimiter {
  private redis: Redis;

  constructor(redisUrl: string) {
    this.redis = new Redis(redisUrl);
  }

  async check(
    key: string,
    windowMs: number,
    maxRequests: number
  ): Promise<RateLimitResult> {
    const now = Date.now();
    const pipeline = this.redis.pipeline();

    // Drop entries that have aged out of the sliding window
    pipeline.zremrangebyscore(key, 0, now - windowMs);
    // Record this request; the random suffix keeps members unique even
    // when two requests land in the same millisecond
    pipeline.zadd(key, now, `${now}-${Math.random()}`);
    // Count everything left in the window (includes this request)
    pipeline.zcard(key);
    pipeline.expire(key, Math.ceil(windowMs / 1000));

    const results = await pipeline.exec();
    if (!results || results[2][0]) {
      throw new Error('Redis rate limit check failed');
    }
    const requestCount = results[2][1] as number;

    return {
      allowed: requestCount <= maxRequests,
      remaining: Math.max(0, maxRequests - requestCount),
      resetTime: now + windowMs
    };
  }
}
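One wiring detail: the Express middleware below expects the single-argument RateLimiter interface from the start of the post, so I close over a fixed window and limit with a small adapter. The values here are placeholders:

// Bind a window and limit so RedisRateLimiter satisfies RateLimiter
const makeLimiter = (
  redis: RedisRateLimiter,
  windowMs: number,
  maxRequests: number
): RateLimiter => ({
  check: (key: string) => redis.check(key, windowMs, maxRequests)
});

const limiter = makeLimiter(
  new RedisRateLimiter('redis://localhost:6379'),
  60000, // one-minute window (placeholder)
  1000   // 1000 requests per window (placeholder)
);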
Notice how I use a Redis pipeline? It batches all four commands into a single round trip, which matters under load. One caveat: a plain pipeline is not atomic. Commands from other clients can interleave between ours, so two simultaneous checks can both read a count just under the limit and both get through. If you need strict atomicity, you want MULTI/EXEC or the Lua script we'll get to shortly. Have you ever considered what happens when multiple requests check the rate limit simultaneously?
Now let’s wrap this in an Express middleware that’s both flexible and performant:
import { Request, Response, NextFunction } from 'express';

const createRateLimitMiddleware = (
  limiter: RateLimiter,
  limit: number,
  keyGenerator: (req: Request) => string
) => {
  return async (req: Request, res: Response, next: NextFunction) => {
    const key = keyGenerator(req);

    let result: RateLimitResult;
    try {
      result = await limiter.check(key);
    } catch (err) {
      // Fail open: a Redis outage shouldn't take the API down with it
      return next();
    }

    res.set('X-RateLimit-Limit', limit.toString());
    res.set('X-RateLimit-Remaining', result.remaining.toString());
    res.set('X-RateLimit-Reset', result.resetTime.toString());

    if (!result.allowed) {
      const retryAfter = Math.ceil((result.resetTime - Date.now()) / 1000);
      res.set('Retry-After', retryAfter.toString());
      return res.status(429).json({ error: 'Rate limit exceeded', retryAfter });
    }

    next();
  };
};
What I love about this approach is its flexibility. You can rate limit by IP address, user ID, API key, or any other identifier. The key generator function lets you customize this based on your needs.
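For instance, mounting a per-IP limit is a couple of lines. This sketch assumes an Express app instance and the limiter adapter from above:

// Per-IP limiting, using the adapter built earlier (app is your Express app)
app.use(createRateLimitMiddleware(
  limiter,
  1000,
  (req) => `ratelimit:ip:${req.ip}`
));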
But we can do better. Let's move the logic into a Redis Lua script. Scripts execute atomically, so the cleanup, count, and insert can no longer interleave with other clients, and because we only record a request after it passes the check, rejected callers no longer burn slots in the window:
const rateLimitScript = `
  local key = KEYS[1]
  local now = tonumber(ARGV[1])
  local window = tonumber(ARGV[2])
  local limit = tonumber(ARGV[3])
  local member = ARGV[4]

  -- Evict entries older than the window, then count what's left
  redis.call('ZREMRANGEBYSCORE', key, 0, now - window)
  local count = redis.call('ZCARD', key)

  local allowed = 0
  if count < limit then
    -- Only record allowed requests, so rejections don't extend the wait
    redis.call('ZADD', key, now, member)
    redis.call('EXPIRE', key, math.ceil(window / 1000))
    count = count + 1
    allowed = 1
  end

  return {allowed, math.max(limit - count, 0)}
`;
This script runs entirely within Redis as a single atomic step, cutting network overhead to one round trip and closing the check-then-add race for good. The performance difference is noticeable, especially under high load. One subtlety: the unique member arrives as ARGV[4] from the client, because Redis deliberately seeds Lua's math.random deterministically (scripts must be reproducible for replication), so generating randomness inside the script is unsafe.
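Loading and invoking the script with ioredis is straightforward through defineCommand, which registers it as a custom command and handles EVALSHA caching for you. A sketch, assuming a standalone ioredis client named redis; the rateLimit command name and the cast are my own choices:

// Register the Lua script as a custom command; ioredis caches it via EVALSHA
redis.defineCommand('rateLimit', { numberOfKeys: 1, lua: rateLimitScript });

const checkWithScript = async (
  key: string,
  windowMs: number,
  limit: number
): Promise<RateLimitResult> => {
  const now = Date.now();
  // Unique member generated client-side (see the note above about math.random)
  const member = `${now}-${Math.random()}`;
  const [allowed, remaining] = (await (redis as any).rateLimit(
    key, now, windowMs, limit, member
  )) as [number, number];
  return { allowed: allowed === 1, remaining, resetTime: now + windowMs };
};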
Here’s how I integrate monitoring to keep track of how the rate limiter is performing:
import promClient from 'prom-client';

const metrics = {
  requests: new promClient.Counter({
    name: 'rate_limit_requests_total',
    help: 'Total rate limit requests',
    // Watch the cardinality: label by key class (route, tier), never by
    // raw per-user keys, or this metric will explode
    labelNames: ['key', 'allowed']
  }),
  latency: new promClient.Histogram({
    name: 'rate_limit_check_duration_seconds',
    help: 'Rate limit check duration'
  })
};
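Wiring these into the check path takes only a few lines. Here's a sketch using prom-client's startTimer helper, assuming the limiter from earlier; the 'api' label value is a placeholder:

// Instrumented wrapper around the rate limit check
const instrumentedCheck = async (key: string): Promise<RateLimitResult> => {
  const endTimer = metrics.latency.startTimer();
  try {
    const result = await limiter.check(key);
    // Label by a coarse key class ('api' here), not the raw user key
    metrics.requests.inc({ key: 'api', allowed: String(result.allowed) });
    return result;
  } finally {
    endTimer(); // records the elapsed time in seconds
  }
};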
Monitoring helps you understand your traffic patterns and adjust limits accordingly. Are your limits too strict? Too lenient? The data will tell you.
One challenge I’ve faced is handling different rate limits for different user tiers. Here’s how I solved it:
type UserTier = 'free' | 'premium' | 'enterprise';

const getUserTier = (req: Request): UserTier => {
  // Implementation depends on your auth system; this assumes an earlier
  // middleware has attached a user object to the request
  return (req as any).user?.tier || 'free';
};

const tierLimits: Record<UserTier, { windowMs: number; maxRequests: number }> = {
  free: { windowMs: 60000, maxRequests: 100 },
  premium: { windowMs: 60000, maxRequests: 1000 },
  enterprise: { windowMs: 60000, maxRequests: 10000 }
};
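Putting the pieces together, a tier-aware middleware resolves the user's limits on every request before checking. A sketch built on the RedisRateLimiter from earlier:

// Tier-aware middleware: look up the tier's limits per request, then check
const tieredRateLimit = (redis: RedisRateLimiter) =>
  async (req: Request, res: Response, next: NextFunction) => {
    const tier = getUserTier(req);
    const { windowMs, maxRequests } = tierLimits[tier];
    const result = await redis.check(
      `ratelimit:${tier}:${req.ip}`,
      windowMs,
      maxRequests
    );
    if (!result.allowed) {
      return res.status(429).json({ error: 'Rate limit exceeded' });
    }
    next();
  };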
This approach lets you provide better service to paying customers while still protecting your API from abuse.
The system I’ve built handles millions of requests daily across multiple data centers. It’s proven resilient during traffic spikes and has prevented several potential outages. Most importantly, it provides clear feedback to users when limits are exceeded, helping them adjust their usage patterns.
Building a great rate limiting system is about balancing protection with usability. Too restrictive, and you frustrate legitimate users. Too lenient, and you risk service instability. The approach I’ve shown you strikes that balance while maintaining high performance.
What challenges have you faced with rate limiting? I’d love to hear about your experiences and solutions. If you found this helpful, please share it with others who might benefit, and let me know your thoughts in the comments below.