Build a Distributed Rate Limiting System with Redis, Bull Queue, and Express.js

I’ve been thinking about how to protect APIs from being overwhelmed by too many requests. It’s a challenge that comes up when applications grow and start handling more traffic. Today, I’ll show you how to build a distributed rate limiter using Redis, Bull Queue, and Express.js - a solution that scales across multiple servers while maintaining accuracy.

Why did I choose this stack? Redis gives us fast in-memory storage with atomic operations, essential for counting requests in real-time. Bull Queue handles delayed processing when limits are exceeded, and Express.js provides the middleware structure to integrate everything smoothly. Together, they solve the distributed coordination problem that single-server solutions can’t handle.

Getting Started

Let’s set up our project. First, create the basic structure:

mkdir distributed-rate-limiter && cd distributed-rate-limiter
npm init -y
npm install express bull ioredis dotenv
npm install -D typescript ts-node @types/express @types/node

Now configure TypeScript:

// tsconfig.json
{
  "compilerOptions": {
    "target": "ES2020",
    "module": "commonjs",
    "outDir": "./dist",
    "rootDir": "./src",
    "strict": true,
    "esModuleInterop": true
  }
}

Redis Connection

A reliable Redis connection is crucial. Here’s how I handle it:

// src/config/redis.ts
import Redis, { RedisOptions } from 'ioredis';
import 'dotenv/config';

// Export the connection options separately so Bull can create its own
// clients with the same settings later on.
export const redisOptions: RedisOptions = {
  host: process.env.REDIS_HOST || '127.0.0.1',
  port: Number(process.env.REDIS_PORT) || 6379,
  maxRetriesPerRequest: 3
};

const redis = new Redis(redisOptions);

redis.on('error', (err) => {
  console.error('Redis error:', err);
});

export default redis;
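
Before going further, it helps to confirm the connection actually works. Here's a quick sanity check you could run at startup; the file name is a hypothetical helper, not part of the config module:

// src/verify-redis.ts (hypothetical helper)
import redis from './config/redis';

async function verifyRedis() {
  // PING resolves to "PONG" when the connection is healthy
  const pong = await redis.ping();
  console.log('Redis connection verified:', pong);
}

verifyRedis().catch((err) => {
  console.error('Could not reach Redis:', err);
  process.exit(1);
});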

Implementing Token Bucket Algorithm

The token bucket allows bursts while maintaining average rates. Here’s my implementation:

// src/algorithms/TokenBucket.ts
import redis from '../config/redis';

export default class TokenBucket {
  constructor(
    private key: string,          // Redis key that identifies this bucket
    private capacity: number,     // maximum number of tokens (burst size)
    private refillRate: number    // tokens added back per second
  ) {}

  async consume(tokens = 1): Promise<boolean> {
    const now = Date.now();
    // Refill and consume run atomically inside Redis as a single Lua script,
    // so concurrent requests from different servers cannot race on the counter.
    const script = `
      local key = KEYS[1]
      local capacity = tonumber(ARGV[1])
      local refillRate = tonumber(ARGV[2])
      local now = tonumber(ARGV[3])
      local tokens = tonumber(ARGV[4])
      
      local data = redis.call('HMGET', key, 'lastRefill', 'tokens')
      local lastRefill = data[1] and tonumber(data[1]) or now
      local currentTokens = data[2] and tonumber(data[2]) or capacity
      
      local timePassed = math.max(now - lastRefill, 0)
      local refillAmount = math.floor(timePassed * refillRate / 1000)
      currentTokens = math.min(currentTokens + refillAmount, capacity)
      
      if currentTokens >= tokens then
        currentTokens = currentTokens - tokens
        lastRefill = now
        redis.call('HMSET', key, 'lastRefill', lastRefill, 'tokens', currentTokens)
        redis.call('EXPIRE', key, math.ceil(capacity / refillRate) * 2)
        return 1
      end
      return 0
    `;
    
    const result = await redis.eval(
      script, 1, this.key, 
      this.capacity, this.refillRate, now, tokens
    );
    
    return result === 1;
  }
}
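
Used on its own, the bucket looks like this. The key name and limits below are purely illustrative:

// Hypothetical usage: a bucket holding up to 10 tokens, refilled at 5 tokens/second
import TokenBucket from './algorithms/TokenBucket';

const bucket = new TokenBucket('rate:demo', 10, 5);

async function handleRequest() {
  const allowed = await bucket.consume();   // take one token if available
  console.log(allowed ? 'request allowed' : 'rate limit exceeded');
}

handleRequest();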

What happens when requests exceed the limit? That’s where Bull Queue comes in.

Handling Overflow with Bull

When limits are exceeded, we queue requests for later processing:

// src/services/QueueService.ts
import Bull from 'bull';
import { redisOptions } from '../config/redis';

const rateLimitQueue = new Bull('rate-limit-queue', {
  redis: redisOptions,
  limiter: { max: 1000, duration: 5000 }
});

rateLimitQueue.process(async (job) => {
  console.log('Processing queued request:', job.data);
  // Add your request processing logic here
});

export const addToQueue = async (requestData: any) => {
  await rateLimitQueue.add(requestData, {
    attempts: 3,
    backoff: { type: 'exponential', delay: 1000 }
  });
};
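
It also helps to watch the queue's lifecycle events so that jobs which exhaust their retry attempts don't fail silently. A small sketch that could live in the same QueueService module:

// Optional observability hooks (sketch): Bull emits these events on every queue
rateLimitQueue.on('completed', (job) => {
  console.log(`Queued request ${job.id} processed`);
});

rateLimitQueue.on('failed', (job, err) => {
  console.error(`Queued request ${job.id} failed after all attempts:`, err.message);
});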

Express Middleware Integration

Now let’s connect everything with Express middleware:

// src/middleware/rateLimiter.ts
import { Request, Response, NextFunction } from 'express';
import TokenBucket from '../algorithms/TokenBucket';
import { addToQueue } from '../services/QueueService';

export const tokenBucketLimiter = (opts: {
  key: string,
  capacity: number,
  refillRate: number
}) => {
  const bucket = new TokenBucket(opts.key, opts.capacity, opts.refillRate);
  
  return async (req: Request, res: Response, next: NextFunction) => {
    try {
      const allowed = await bucket.consume();

      if (allowed) {
        return next();
      }

      await addToQueue({ path: req.path, method: req.method });
      res.status(429).json({
        error: 'Too many requests',
        message: 'Your request has been queued for processing'
      });
    } catch (err) {
      // A Redis or queue failure should degrade gracefully, not crash the request
      res.status(503).json({ error: 'Service unavailable' });
    }
  };
};
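
To see it in action, here's a minimal way to wire the middleware into an app. The route, key, and limits are illustrative; note that a fixed key means the limit is shared across all clients, so in practice you might derive the key from an API key or client IP instead:

// src/server.ts (illustrative wiring)
import express from 'express';
import { tokenBucketLimiter } from './middleware/rateLimiter';

const app = express();

// Global bucket for this route: bursts of up to 20 requests, refilled at 5 tokens/second
app.use(
  '/api/protected',
  tokenBucketLimiter({ key: 'rate:api:protected', capacity: 20, refillRate: 5 })
);

app.get('/api/protected', (_req, res) => {
  res.json({ ok: true });
});

app.listen(3000, () => console.log('API listening on port 3000'));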

Testing and Optimization

How do we know our system works under pressure? I use Artillery for load testing:

# load-test.yml
config:
  target: "http://localhost:3000"
  phases:
    - duration: 60
      arrivalRate: 50
      name: "Warm up"
    - duration: 120
      arrivalRate: 200
      name: "Peak load"
scenarios:
  - flow:
      - get:
          url: "/api/protected"

Monitor Redis memory usage during tests with:

redis-cli info memory

Production Considerations

When deploying, remember to:

  • Use Redis clusters for high availability
  • Set appropriate TTLs on rate limit keys
  • Monitor queue backlog with the Bull dashboard (see the sketch after this list)
  • Implement circuit breakers for Redis failures
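
For the dashboard point, here's a minimal sketch using the @bull-board packages; this assumes @bull-board/api and @bull-board/express are installed and that rateLimitQueue is exported from QueueService:

// Sketch: mounting the Bull dashboard on the Express app
import express from 'express';
import { createBullBoard } from '@bull-board/api';
import { BullAdapter } from '@bull-board/api/bullAdapter';
import { ExpressAdapter } from '@bull-board/express';
import { rateLimitQueue } from './services/QueueService'; // assumes the queue is exported

const app = express();
const serverAdapter = new ExpressAdapter();
serverAdapter.setBasePath('/admin/queues');

createBullBoard({
  queues: [new BullAdapter(rateLimitQueue)],
  serverAdapter
});

app.use('/admin/queues', serverAdapter.getRouter());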

What happens during Redis outages? I recommend adding a fallback to local rate limiting using a library like express-rate-limit, though this sacrifices some accuracy.
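
One way to sketch that fallback is a wrapper that uses the distributed limiter while Redis reports a healthy connection and drops back to express-rate-limit's in-memory counters otherwise. The package is assumed to be installed, and the limits below are illustrative:

// src/middleware/fallbackLimiter.ts (illustrative sketch)
import rateLimit from 'express-rate-limit';
import { Request, Response, NextFunction, RequestHandler } from 'express';
import redis from '../config/redis';

// Local, in-memory limiter used only while Redis is unreachable
const localLimiter = rateLimit({ windowMs: 60000, max: 100 });

export const withRedisFallback = (distributedLimiter: RequestHandler): RequestHandler =>
  (req: Request, res: Response, next: NextFunction) => {
    // ioredis exposes connection state on `status`; anything other than 'ready'
    // means the distributed counters cannot be trusted right now
    if (redis.status === 'ready') {
      return distributedLimiter(req, res, next);
    }
    return localLimiter(req, res, next);
  };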

Final Thoughts

Building this system taught me that distributed coordination requires careful planning. The token bucket approach provides flexibility, while Bull ensures no request gets completely dropped during traffic spikes.

If you found this useful, please share it with others who might benefit! I’d love to hear about your experiences with rate limiting - what challenges have you faced? Leave a comment below with your thoughts or questions.



