
How to Build a Distributed Rate Limiter with Redis and Node.js: Implementation Guide

Learn to build a scalable distributed rate limiter using Redis and Node.js. Covers the token bucket and sliding window algorithms, Express middleware, and production optimization strategies.


Recently, while scaling an API service, I faced unexpected traffic spikes threatening system stability. Requests overwhelmed our infrastructure, causing cascading failures. That experience motivated me to implement robust rate limiting - not just locally, but across distributed nodes. Why Redis? Its atomic operations and speed make it ideal for coordinating request counts across instances. Let’s build this together.

First, we establish our foundation. Create the project directory and install essentials:

mkdir distributed-rate-limiter
cd distributed-rate-limiter
npm init -y
npm install redis express
npm install -D typescript @types/node @types/express

Our core interface defines rate limiting behavior:

// src/types/index.ts
export interface RateLimitResult {
  allowed: boolean;
  remaining: number;
  resetTime: Date;
}
export interface IRateLimiter {
  checkLimit(identifier: string): Promise<RateLimitResult>;
}
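
The fail-open example later in this guide references a shared config object. Here is a possible shape, with field names I've chosen to match that snippet:

// src/types/index.ts (continued)
export interface RateLimitConfig {
  maxRequests: number; // requests allowed per window
  windowMs: number;    // window length in milliseconds
}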

For the Token Bucket algorithm, we use Redis Lua scripts for atomic operations. This handles token refills and consumption in a single round-trip:

// src/rate-limiters/token-bucket.ts
const refillScript = `
  local key = KEYS[1]
  local capacity = tonumber(ARGV[1])
  local refill_tokens = tonumber(ARGV[2])  -- tokens added per interval
  local interval = tonumber(ARGV[3])       -- refill interval in milliseconds
  local now = tonumber(ARGV[4])            -- current time in milliseconds

  -- Load bucket state; a missing key behaves like a full bucket
  local bucket = redis.call('HMGET', key, 'tokens', 'last_refill')
  local current_tokens = tonumber(bucket[1]) or capacity
  local last_refill = tonumber(bucket[2]) or now

  -- Refill in whole tokens, advancing last_refill only by the time actually
  -- converted into tokens, so fractional progress survives between calls
  local elapsed = math.max(0, now - last_refill)
  local tokens_to_add = math.floor(elapsed * refill_tokens / interval)
  if tokens_to_add > 0 then
    current_tokens = math.min(capacity, current_tokens + tokens_to_add)
    last_refill = last_refill + math.floor(tokens_to_add * interval / refill_tokens)
  end

  -- Consume one token if available
  local allowed = current_tokens >= 1
  if allowed then
    current_tokens = current_tokens - 1
  end
  redis.call('HSET', key, 'tokens', current_tokens, 'last_refill', last_refill)

  -- Keep the key alive long enough for a full refill, then let Redis clean up
  redis.call('EXPIRE', key, math.ceil(interval * capacity / refill_tokens / 1000))
  return {allowed and 1 or 0, current_tokens}
`;
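
To connect the script to our IRateLimiter interface, a thin wrapper class can run it through EVAL. Here is a minimal sketch assuming node-redis v4; the class name, key prefix, and constructor parameters are my own choices:

// src/rate-limiters/token-bucket.ts (continued)
import { createClient } from 'redis';
import { IRateLimiter, RateLimitResult } from '../types';

export class TokenBucketLimiter implements IRateLimiter {
  constructor(
    private redis: ReturnType<typeof createClient>,
    private capacity: number,     // bucket size
    private refillTokens: number, // tokens added per interval
    private intervalMs: number    // refill interval in milliseconds
  ) {}

  async checkLimit(identifier: string): Promise<RateLimitResult> {
    const now = Date.now();
    // EVAL executes the whole script atomically on the Redis server
    const [allowed, remaining] = (await this.redis.eval(refillScript, {
      keys: [`ratelimit:${identifier}`],
      arguments: [
        String(this.capacity),
        String(this.refillTokens),
        String(this.intervalMs),
        String(now),
      ],
    })) as [number, number];

    return {
      allowed: allowed === 1,
      remaining,
      resetTime: new Date(now + this.intervalMs), // approximation: next refill tick
    };
  }
}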

The sliding window approach offers greater precision. We track timestamps of recent requests:

// src/rate-limiters/sliding-window.ts
const windowScript = `
  local key = KEYS[1]
  local now = tonumber(ARGV[1])          -- current time in milliseconds
  local windowMs = tonumber(ARGV[2])
  local maxRequests = tonumber(ARGV[3])
  local member = ARGV[4]                 -- unique per request, so concurrent
                                         -- hits in the same millisecond all count

  -- Drop timestamps that have slid out of the window
  redis.call('ZREMRANGEBYSCORE', key, 0, now - windowMs)
  local currentCount = redis.call('ZCARD', key)

  if currentCount < maxRequests then
    redis.call('ZADD', key, now, member)
    redis.call('EXPIRE', key, math.ceil(windowMs / 1000))
    return {1, maxRequests - currentCount - 1}
  end
  return {0, 0}
`;
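
The caller supplies the unique member the script expects. A sketch in the same spirit, again assuming node-redis v4, with names of my own choosing:

// src/rate-limiters/sliding-window.ts (continued)
import { randomUUID } from 'node:crypto';
import { createClient } from 'redis';
import { IRateLimiter, RateLimitResult } from '../types';

export class SlidingWindowLimiter implements IRateLimiter {
  constructor(
    private redis: ReturnType<typeof createClient>,
    private maxRequests: number,
    private windowMs: number
  ) {}

  async checkLimit(identifier: string): Promise<RateLimitResult> {
    const now = Date.now();
    const [allowed, remaining] = (await this.redis.eval(windowScript, {
      keys: [`ratelimit:sw:${identifier}`],
      arguments: [
        String(now),
        String(this.windowMs),
        String(this.maxRequests),
        randomUUID(), // unique member so same-millisecond requests all count
      ],
    })) as [number, number];

    return {
      allowed: allowed === 1,
      remaining,
      resetTime: new Date(now + this.windowMs),
    };
  }
}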

For Express integration, middleware becomes crucial. How might we handle varying limits per endpoint? Here’s a flexible solution:

// src/middleware/rate-limit-middleware.ts
import { Request, Response, NextFunction } from 'express';
import { IRateLimiter } from '../types';

export const rateLimitMiddleware = (
  limiter: IRateLimiter,
  identifierExtractor: (req: Request) => string
) => {
  return async (req: Request, res: Response, next: NextFunction) => {
    const id = identifierExtractor(req);
    const result = await limiter.checkLimit(id);
    
    if (!result.allowed) {
      return res.status(429).json({
        error: 'Too many requests',
        retryAfter: result.resetTime.toISOString()
      });
    }
    
    res.set('X-RateLimit-Remaining', result.remaining.toString());
    res.set('X-RateLimit-Reset', result.resetTime.toISOString());
    next();
  };
};
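
To make the per-endpoint question concrete, here is one way the pieces might be wired together: a generous budget for reads and a tighter one for writes. Route paths and limits are illustrative, not prescriptive:

// src/app.ts
import express, { Request } from 'express';
import { createClient } from 'redis';
import { rateLimitMiddleware } from './middleware/rate-limit-middleware';
import { TokenBucketLimiter } from './rate-limiters/token-bucket';

async function main() {
  const redis = createClient({ url: 'redis://localhost:6379' });
  await redis.connect();

  const app = express();
  const byIp = (req: Request) => req.ip ?? 'unknown';

  // 100 requests/minute for reads, 10/minute for writes
  const readLimiter = new TokenBucketLimiter(redis, 100, 100, 60_000);
  const writeLimiter = new TokenBucketLimiter(redis, 10, 10, 60_000);

  app.get('/api/resource', rateLimitMiddleware(readLimiter, byIp), (_req, res) => {
    res.json({ ok: true });
  });
  app.post('/api/resource', rateLimitMiddleware(writeLimiter, byIp), (_req, res) => {
    res.status(201).json({ ok: true });
  });

  app.listen(3000);
}

main().catch(console.error);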

Handling Redis failures requires careful decisions. Should we block traffic or fail open? This fallback maintains functionality during outages:

// Error handling in checkLimit method
try {
  // Redis operations
} catch (error) {
  console.error('Redis failure:', error);
  return {
    allowed: true, // Fail open
    remaining: config.maxRequests,
    resetTime: new Date(Date.now() + config.windowMs)
  };
}

Performance optimization matters at scale. Consider these techniques:

  1. Pipeline batch operations
  2. Use Redis clusters for sharding
  3. Local caches with short TTLs (see the sketch after this list)
  4. Connection pooling
  5. Monitor memory usage patterns
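
For technique 3, a tiny in-process cache can absorb repeat offenders without a Redis round-trip. A hypothetical sketch (names and TTL are illustrative; it trades a little accuracy for latency):

// src/cache/deny-cache.ts
const denyCache = new Map<string, number>(); // identifier -> expiry (epoch ms)
const CACHE_TTL_MS = 200; // short, so lifted limits recover quickly

export function isLocallyDenied(id: string): boolean {
  const expiry = denyCache.get(id);
  if (expiry === undefined) return false;
  if (Date.now() > expiry) {
    denyCache.delete(id); // lazily evict stale entries
    return false;
  }
  return true;
}

export function cacheDenial(id: string): void {
  denyCache.set(id, Date.now() + CACHE_TTL_MS);
}

Check isLocallyDenied before calling Redis and call cacheDenial whenever a request is rejected; a 200 ms TTL keeps staleness negligible at the cost of slightly delayed recovery.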

Testing validates our implementation. Use Artillery for load testing:

# load-test.yml
config:
  target: "http://localhost:3000"
  phases:
    - duration: 60
      arrivalRate: 100
scenarios:
  - flow:
      - get:
          url: "/api/resource"
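
Run it with npx artillery run load-test.yml; once the configured limit engages, you should see 429 responses climb in the report.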

In production deployment:

  • Set appropriate Redis memory policies
  • Enable persistence based on tolerance
  • Monitor Redis metrics closely
  • Implement circuit breakers
  • Establish alert thresholds

What surprises developers most? The trade-off between the token bucket’s burst tolerance and the sliding window’s precision. Both have valid use cases - choose based on how tolerant your system is of brief spikes.

I encourage testing different configurations under simulated loads. You’ll discover nuances in behavior that documentation can’t capture. What happens when requests arrive in microbursts? How does geographic latency affect distributed coordination?

This implementation balances accuracy with performance. While not perfect, it provides a strong foundation you can extend. The Lua scripts ensure atomicity, while Redis expiration handles cleanup automatically. For most applications, this strikes the right balance.

Found this useful? Implement it in your next project and share your experience! If you improved the approach or found edge cases, comment below - let’s learn together. Like this guide if it saved you research time, and share it with your team.



