It happened during a product launch. Our API suddenly crawled to a stop. An unexpected surge of requests, some legitimate, some not, overwhelmed our system. The immediate scramble to add controls taught me a vital lesson: waiting until you’re under attack to build defenses is a terrible plan. Since then, building robust rate limiting has been a personal mission. I’m writing this to save you that same frantic scramble. Let’s build something that protects your application, ensures fair use, and keeps things running smoothly for everyone.
Think of your API as a popular bakery. Without limits, one customer could buy every pastry at once, leaving others with nothing. Rate limiting is the polite sign saying “two per customer.” But how do you enforce this fairly across hundreds of customers and multiple cashiers (server instances)? A simple in-memory counter won’t work if you have more than one cashier. This is where Redis shines.
I often get asked: why Redis specifically for this job? It’s simple. Redis is incredibly fast, and it provides atomic operations. This means when two servers try to update the request count for the same user at the exact same moment, the operation happens cleanly without a race condition. Plus, Redis can automatically expire old data, which is perfect for cleaning up request logs.
There are several ways to count requests. The simplest is the fixed window. Imagine a one-minute window. You count requests in that minute. At the next minute, you reset the counter to zero. The problem is obvious, right? What if a user makes 59 requests at the end of one window and 59 more at the start of the next? That’s 118 requests in two minutes, which could violate our intended limit of 100 per minute.
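For reference, a fixed window needs only INCR and EXPIRE. The sketch below injects the Redis client so the logic is easy to test; the factory shape and names (`makeFixedWindowLimiter`) are illustrative, not from any library.

```javascript
// Fixed-window sketch. `redis` is any client with incr/expire (e.g. ioredis).
function makeFixedWindowLimiter(redis, { windowSize = 60, maxRequests = 100 } = {}) {
  return async function allow(userId, nowMs = Date.now()) {
    // One counter per user per clock window, e.g. rate_limit:1.2.3.4:28600042
    const windowId = Math.floor(nowMs / (windowSize * 1000));
    const key = `rate_limit:${userId}:${windowId}`;
    const count = await redis.incr(key);
    if (count === 1) {
      // First hit in this window: start the expiry clock
      await redis.expire(key, windowSize);
    }
    return count <= maxRequests;
  };
}
```

Note how the window boundary comes from the clock, not from the user's activity; that is exactly what produces the double-burst problem described above.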
So, we use something better for production: the sliding window. This method looks at your activity over a rolling period. If the limit is 100 per minute, it checks how many requests you’ve made in the last 60 seconds from right now, not from a fixed clock minute. This is far more accurate and fair. But implementing it requires careful tracking of timestamps.
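To make the idea concrete before we bring in Redis, here is the sliding-window check as a plain function over an in-memory array of timestamps. This is illustrative only; it works on a single server, which is exactly the limitation Redis solves.

```javascript
// Sliding-window check over an array of millisecond timestamps.
function slidingWindowAllow(timestamps, nowMs, { windowSize = 60, maxRequests = 100 } = {}) {
  const windowStart = nowMs - windowSize * 1000;
  // Keep only requests made within the last `windowSize` seconds
  const recent = timestamps.filter((t) => t > windowStart);
  const allowed = recent.length < maxRequests;
  if (allowed) recent.push(nowMs);
  return { allowed, timestamps: recent };
}
```

The Redis version we build next does exactly this, but with a sorted set shared by every server instance.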
Let’s set up our environment. We’ll use Express.js for the server and ioredis, a robust Redis client for Node.js. First, connect to Redis in a way that’s reusable across your app.
// redisClient.js — one shared, reusable connection for the whole app
const Redis = require('ioredis');

const redis = new Redis({
  host: process.env.REDIS_HOST || 'localhost',
  port: 6379,
  // Back off on reconnect attempts: 50ms, 100ms, ... capped at 2 seconds
  retryStrategy: (times) => Math.min(times * 50, 2000)
});

module.exports = redis;
Now, for the core logic. We will implement a sliding window counter using a sorted set in Redis. Each request adds a member with a timestamp as its score. To check if a user is over the limit, we count the members within the last window. Here’s a basic middleware function.
// slidingWindowMiddleware.js
const redis = require('./redisClient');

async function rateLimitSlidingWindow(req, res, next) {
  const userId = req.ip; // Use IP for simplicity; use a user ID if authenticated
  const windowSize = 60; // seconds
  const maxRequests = 100;
  const now = Date.now();
  const windowStart = now - windowSize * 1000;
  const key = `rate_limit:${userId}`;
  // Unique member so two requests in the same millisecond both count
  const member = `${now}-${Math.random()}`;

  // Add the current request's timestamp to the sorted set
  await redis.zadd(key, now, member);
  // Remove all timestamps older than our window
  await redis.zremrangebyscore(key, 0, windowStart);
  // Count the requests within the window, including this one
  const requestCount = await redis.zcard(key);

  if (requestCount > maxRequests) {
    // Don't let the blocked request itself count against the user
    await redis.zrem(key, member);
    // Calculate when the oldest request will fall out of the window
    const oldest = await redis.zrange(key, 0, 0, 'WITHSCORES');
    const resetTime = parseInt(oldest[1], 10) + windowSize * 1000;
    res.setHeader('X-RateLimit-Reset', resetTime);
    return res.status(429).json({ error: 'Too Many Requests' });
  }

  // Set helpful headers for the client
  res.setHeader('X-RateLimit-Limit', maxRequests);
  res.setHeader('X-RateLimit-Remaining', Math.max(0, maxRequests - requestCount));
  // Let idle keys expire automatically from Redis
  await redis.expire(key, windowSize + 1);
  next();
}

module.exports = rateLimitSlidingWindow;
This works, but in a high-traffic app, making four or more Redis round trips per request adds up. Can we make it more efficient? Yes, with Lua scripting. Redis lets you run a script on the server atomically, reducing network chatter and guaranteeing no other commands interfere. This is crucial for accuracy under load.
Here is the same logic, condensed into a single, atomic Lua script.
-- rateLimit.lua
local key = KEYS[1]
local now = tonumber(ARGV[1])     -- current time in milliseconds
local window = tonumber(ARGV[2])  -- window size in seconds
local max = tonumber(ARGV[3])

-- Remove requests that have fallen out of the window
redis.call('ZREMRANGEBYSCORE', key, 0, now - (window * 1000))

-- Count what remains
local current = redis.call('ZCARD', key)
if current >= max then
  return {0, current} -- Block request
end

-- Record this request and refresh the key's expiry.
-- (Note: using `now` as the member means two requests in the same
-- millisecond collapse into one; pass a unique member via ARGV if that matters.)
redis.call('ZADD', key, now, now)
redis.call('EXPIRE', key, window)
return {1, current + 1} -- Allow request
You would call this script from your Node.js code, passing the user key, current timestamp, window size, and limit. This single call does everything. This is the kind of optimization that moves code from “it works” to “production-ready.”
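With ioredis, the convenient way to do this is `defineCommand`, which registers the script once and handles EVALSHA caching for you. The wrapper below is a sketch; `registerRateLimiter` and the `{allowed, count}` reply shape are my own names, not ioredis APIs.

```javascript
// Registers the Lua script as a custom command and returns an async checker.
// `redis` is an ioredis instance; `luaScript` is the rateLimit.lua source.
function registerRateLimiter(redis, luaScript) {
  redis.defineCommand('slidingWindowLimit', {
    numberOfKeys: 1, // KEYS[1] = the rate_limit:* key
    lua: luaScript,
  });
  return async function check(key, windowSize, maxRequests, nowMs = Date.now()) {
    // The script returns {allow_flag, count}
    const [allowed, count] = await redis.slidingWindowLimit(
      key, nowMs, windowSize, maxRequests
    );
    return { allowed: allowed === 1, count };
  };
}
```

Read the script with `fs.readFileSync('rateLimit.lua', 'utf8')` at startup, register once, and call `check` from your middleware.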
A good rate limiter is also flexible. You might want different limits for different API endpoints or user tiers. You don’t want to block search requests as aggressively as login attempts. How would you design for that? The key is in the key we use. Instead of just req.ip, we can create a composite key like rate_limit:${userId}:${req.path}:${req.method}. This gives you granular control.
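As a sketch, that might look like the helper below. The `ROUTE_LIMITS` table, `limitFor` name, and the specific numbers are all hypothetical choices for illustration.

```javascript
// Per-route limits, keyed by "METHOD path". Values are illustrative.
const ROUTE_LIMITS = {
  'POST /login': { windowSize: 300, maxRequests: 5 },  // aggressive: protect auth
  'GET /search': { windowSize: 60, maxRequests: 300 }, // generous: cheap reads
};
const DEFAULT_LIMIT = { windowSize: 60, maxRequests: 100 };

function limitFor(req) {
  const { windowSize, maxRequests } =
    ROUTE_LIMITS[`${req.method} ${req.path}`] || DEFAULT_LIMIT;
  // Composite key: per user (or IP), per path, per method
  const who = req.userId || req.ip;
  return { key: `rate_limit:${who}:${req.path}:${req.method}`, windowSize, maxRequests };
}
```

The middleware then asks `limitFor(req)` for the key and limits instead of hard-coding them.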
Another critical aspect is what to do when the limit is reached. A simple 429 error is standard, but consider adding a Retry-After header. This tells the client—be it a browser, mobile app, or another service—how long to wait before trying again. It’s a small touch that makes your API more developer-friendly and can prevent clients from hammering your endpoint uselessly.
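Computing the value is straightforward once you know the oldest timestamp still in the window. The helper below is a sketch with hypothetical names:

```javascript
// Seconds until the oldest request in the window falls out of it.
function retryAfterSeconds(oldestTimestampMs, windowSize, nowMs = Date.now()) {
  const resetMs = oldestTimestampMs + windowSize * 1000;
  // Never tell the client to wait less than one second
  return Math.max(1, Math.ceil((resetMs - nowMs) / 1000));
}

// In the 429 branch of the middleware:
// res.setHeader('Retry-After', retryAfterSeconds(oldestTimestamp, windowSize));
```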
Finally, we must talk about monitoring. You need to know when your limits are being hit and by whom. Consider logging rate limit events to a separate system. You can also use Redis’s own metrics or increment a separate counter every time a 429 is sent. This data helps you answer important questions: Are the limits too strict? Is there a coordinated attack? Is a particular integration broken and stuck in a loop?
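One lightweight option is to count 429s in Redis itself, bucketed by route and hour. The sketch below assumes an ioredis-style client; the `metrics:429:*` key scheme and function name are my own invention.

```javascript
// Increment a per-route, per-hour counter every time we send a 429.
async function recordRateLimitHit(redis, route, nowMs = Date.now()) {
  const hourBucket = new Date(nowMs).toISOString().slice(0, 13); // e.g. "2024-05-01T12"
  const key = `metrics:429:${route}:${hourBucket}`;
  await redis.incr(key);
  await redis.expire(key, 7 * 24 * 3600); // keep a week of history
  return key;
}
```

A dashboard or cron job can then scan the `metrics:429:*` keys to spot spikes by route and time of day.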
Building this well upfront saves immense trouble later. It protects your resources, improves the experience for good-faith users, and gives you control and visibility over your traffic. Start with the sliding window counter, make it atomic with Lua, add flexible keys for different rules, and always monitor the results. Your future self, enjoying a stable API during the next big traffic spike, will thank you.
Did you find this walk-through helpful? What part of your system feels most vulnerable right now? Share your thoughts or questions in the comments below—let’s build more resilient systems together. If this guide saved you future headaches, please like and share it with other developers in your circle.