I’ve been thinking a lot about distributed rate limiting lately. In today’s world of microservices and cloud-native applications, a simple local rate limiter just doesn’t cut it anymore. When you have multiple instances of your service running, each needs to respect the same global limits. That’s where Redis comes in - it gives us a shared state that all our instances can trust.
Why does this matter? Imagine a user making requests to different instances of your API. Without a distributed approach, each instance might allow the maximum number of requests, effectively multiplying the limit. This defeats the purpose of rate limiting entirely.
Let me show you how we can build something better. We’ll use Redis as our coordination layer, Node.js for the runtime, and TypeScript for type safety. The combination gives us both performance and reliability.
Here’s a basic setup for our first limiter, a fixed-window counter (a true token bucket refills gradually; this simpler counter just resets once per window):
import Redis from 'ioredis';

// Assumes a Redis instance reachable with default settings; configure as needed
const redis = new Redis();

const checkRateLimit = async (userId: string): Promise<boolean> => {
  const key = `rate_limit:${userId}`;
  const windowMs = 60000; // 1 minute
  const maxRequests = 100;

  // The whole check runs inside one Lua script, so Redis executes it atomically.
  // EXPIRE is set only when the key is first created; calling it on every request
  // would keep pushing the window forward and the counter would never reset.
  const result = await redis.eval(
    `local current = redis.call('incr', KEYS[1])
     if current == 1 then
       redis.call('expire', KEYS[1], ARGV[2])
     end
     if current > tonumber(ARGV[1]) then
       return 0
     end
     return 1`,
    1, key, maxRequests, Math.ceil(windowMs / 1000)
  );
  return result === 1;
};
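Using it is a one-liner; here’s a quick sketch (the user id is just an example):

if (!(await checkRateLimit('user-123'))) {
  // Reject, queue, or retry - whatever fits your use case
  console.log('Rate limit exceeded for user-123');
}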
But wait - why the Lua script instead of plain GET and INCR calls from Node? Because Redis executes the whole script atomically. If we read the counter in our application, checked it, and then incremented it, two requests arriving at exactly the same time could both see 99 and both get through. The script closes that race, which is exactly why Lua scripting (or MULTI/EXEC transactions) is essential here.
The sliding window algorithm offers more precision than fixed windows. Instead of resetting the counter at fixed intervals, it looks at the actual request pattern. Here’s how we might implement it:
interface RateLimitResult {
  allowed: boolean;
  remaining: number;
  resetTime: number; // epoch ms by which the window is guaranteed to have reset
}

const slidingWindowCheck = async (ip: string): Promise<RateLimitResult> => {
  const key = `sliding:${ip}`;
  const now = Date.now();
  const windowMs = 60000;
  const maxRequests = 100;

  // Remove timestamps that have aged out of the window
  await redis.zremrangebyscore(key, 0, now - windowMs);

  // Count requests still inside the window
  const currentCount = await redis.zcard(key);
  if (currentCount >= maxRequests) {
    return { allowed: false, remaining: 0, resetTime: now + windowMs };
  }

  // Record this request; the random suffix keeps same-millisecond members unique
  await redis.zadd(key, now, `${now}-${Math.random()}`);
  await redis.expire(key, Math.ceil(windowMs / 1000));

  // Caveat: these four Redis calls are not atomic. Under heavy concurrency a few
  // extra requests can slip through between ZCARD and ZADD; move the sequence into
  // a Lua script (as above) if you need a strict guarantee.
  return { allowed: true, remaining: maxRequests - currentCount - 1, resetTime: now + windowMs };
};
Have you considered what happens when Redis becomes unavailable? We need fallback strategies. One approach is to use a local in-memory rate limiter as a backup, though this sacrifices perfect consistency.
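Here’s a minimal sketch of what that in-memory backup could look like. The LocalFallbackLimiter class is my own illustration, not a library; since it is per-process, the effective global limit becomes the configured limit times the number of instances while Redis is down.

// Illustrative, not a library: per-process only, so while Redis is down the
// effective global limit becomes (limit x number of instances)
class LocalFallbackLimiter {
  private hits = new Map<string, number[]>();

  check(key: string, windowMs: number, maxRequests: number): boolean {
    const now = Date.now();
    // Keep only timestamps still inside the window
    const recent = (this.hits.get(key) ?? []).filter(t => t > now - windowMs);
    if (recent.length >= maxRequests) {
      this.hits.set(key, recent);
      return false;
    }
    recent.push(now);
    this.hits.set(key, recent);
    return true;
  }
}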
Monitoring is crucial. We should track metrics like rejection rates, Redis latency, and request patterns. This helps us tune our limits and catch issues early.
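As a sketch, here’s what that instrumentation might look like with the prom-client library (the metric names are illustrative, not a convention):

import client from 'prom-client';

// Metric names here are illustrative - pick whatever fits your conventions
const rejections = new client.Counter({
  name: 'rate_limit_rejections_total',
  help: 'Requests rejected by the rate limiter',
});
const redisLatency = new client.Histogram({
  name: 'rate_limit_redis_duration_seconds',
  help: 'Latency of Redis calls made by the rate limiter',
});

const timedCheck = async (ip: string): Promise<RateLimitResult> => {
  const stopTimer = redisLatency.startTimer();
  try {
    const result = await slidingWindowCheck(ip);
    if (!result.allowed) rejections.inc();
    return result;
  } finally {
    stopTimer(); // records the elapsed time in the histogram
  }
};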
Here’s how we might wrap this in Express middleware:
import { Request, Response, NextFunction } from 'express';

interface RateLimitConfig {
  prefix: string;
  maxRequests: number;
  windowMs: number;
}

// Assumed: a rateLimiter object wrapping one of the check functions above
declare const rateLimiter: {
  check(key: string, config: RateLimitConfig): Promise<RateLimitResult>;
};

const rateLimitMiddleware = (config: RateLimitConfig) => {
  return async (req: Request, res: Response, next: NextFunction) => {
    const key = `${config.prefix}:${req.ip}`;
    try {
      const result = await rateLimiter.check(key, config);
      res.set({
        'X-RateLimit-Limit': config.maxRequests.toString(),
        'X-RateLimit-Remaining': result.remaining.toString(),
        'X-RateLimit-Reset': Math.ceil(result.resetTime / 1000).toString()
      });
      if (!result.allowed) {
        return res.status(429).json({ error: 'Too many requests' });
      }
      next();
    } catch (error) {
      // Redis unreachable: fail open here, or swap in the local fallback above
      next();
    }
  };
};
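Wiring it into an app is then a one-liner per route group (the limits here are placeholder values):

import express from 'express';

const app = express();

// Placeholder limits - tune these to your own traffic
app.use('/api', rateLimitMiddleware({ prefix: 'api', maxRequests: 100, windowMs: 60000 }));

app.listen(3000);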
What about different rate limiting strategies? Sometimes you want to limit per user, sometimes per IP, and sometimes per API key. The beauty of this approach is that we can easily adjust the key generation logic.
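One way to make that pluggable is a small key-extractor type. The three extractors below are sketches; the X-API-Key header and the req.user shape are assumptions about your setup, not a standard:

type KeyExtractor = (req: Request) => string;

const byIp: KeyExtractor = (req) => `ip:${req.ip}`;
// Assumes an upstream auth middleware has attached req.user
const byUser: KeyExtractor = (req) => `user:${(req as any).user?.id ?? 'anonymous'}`;
// Assumes clients send their key in an X-API-Key header
const byApiKey: KeyExtractor = (req) => `key:${req.get('X-API-Key') ?? 'missing'}`;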
Testing distributed rate limiting requires careful planning. We need to simulate multiple concurrent requests and different failure scenarios. Docker Compose makes it easy to spin up a test environment with Redis and our application.
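A quick smoke test, for example, can fire a burst of concurrent checks and count how many pass - with the non-atomic sliding window above, a few extra may slip through, which is exactly the kind of thing this surfaces:

const burstTest = async () => {
  // Fire 150 checks at once against the same key
  const results = await Promise.all(
    Array.from({ length: 150 }, () => slidingWindowCheck('test-ip'))
  );
  const allowed = results.filter(r => r.allowed).length;
  console.log(`allowed ${allowed} of 150`); // expect roughly 100
};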
Performance optimization is always important. We can use Redis pipelining to reduce round trips, and consider using Redis clusters for very high throughput scenarios.
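With ioredis, for instance, the sliding window’s cleanup and count can share a single round trip:

const pipelinedCount = async (key: string, now: number, windowMs: number): Promise<number> => {
  // Both commands are sent together: one round trip instead of two
  const results = await redis
    .pipeline()
    .zremrangebyscore(key, 0, now - windowMs)
    .zcard(key)
    .exec();
  // exec() yields one [error, reply] pair per queued command
  const [, count] = results![1];
  return count as number;
};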
Remember that rate limiting isn’t just about preventing abuse. It’s also about ensuring fair usage and protecting your system from being overwhelmed. The right limits depend on your specific use case and traffic patterns.
I’d love to hear your thoughts on this approach. Have you implemented distributed rate limiting in your projects? What challenges did you face? Share your experiences in the comments below, and don’t forget to like and share if you found this useful!