How to Build a Scalable Query Router for Sharded PostgreSQL with Node.js

Learn how to scale your database with a smart query router using Node.js, TypeScript, and Drizzle ORM. No rewrite required.

I’ve been thinking about a problem that keeps many developers up at night. Your application is growing fast. The database is groaning under the weight of millions of users. You know you need to split the data, to shard it, but the thought of rewriting every query and managing multiple connections feels overwhelming. What if you could keep writing your database logic as if you had one database, while the system intelligently routed your requests behind the scenes? That’s exactly what we’re going to build today.

Let’s create a system that sits between your application code and your databases. It will look at each request, figure out which shard holds the relevant data, and send the query there. You write clean code; it handles the messy distribution. This approach is powerful because it scales with your user base without forcing a complete rewrite of your data layer.

First, we need to set the stage. We’ll use Node.js with TypeScript for type safety and Drizzle ORM, a lightweight and performant tool that gives us great control over our SQL. We’ll simulate multiple databases using Docker. Here’s a basic docker-compose.yml to spin up three PostgreSQL instances.

version: '3.8'
services:
  shard1:
    image: postgres:16-alpine
    environment:
      POSTGRES_DB: app_shard1
      POSTGRES_USER: app_user
      POSTGRES_PASSWORD: secure_password
    ports: ["5432:5432"]
    volumes: [shard1_data:/var/lib/postgresql/data]

  shard2:
    image: postgres:16-alpine
    environment:
      POSTGRES_DB: app_shard2
      POSTGRES_USER: app_user
      POSTGRES_PASSWORD: secure_password
    ports: ["5433:5432"]
    volumes: [shard2_data:/var/lib/postgresql/data]

  shard3:
    image: postgres:16-alpine
    environment:
      POSTGRES_DB: app_shard3
      POSTGRES_USER: app_user
      POSTGRES_PASSWORD: secure_password
    ports: ["5434:5432"]
    volumes: [shard3_data:/var/lib/postgresql/data]

volumes:
  shard1_data:
  shard2_data:
  shard3_data:

Run docker-compose up -d to start them. Now, how do we decide which user’s data goes to shard1, shard2, or shard3? We need a consistent rule.

The most common method is to use a shard key, like a user ID. We’ll use a technique called consistent hashing. Imagine a circle. We place each database shard at a point on this circle. We then hash the user ID (e.g., hash(userId)) to also get a point on the same circle. We walk clockwise around the circle from that point until we hit a database shard. That’s the shard for that user. This method is brilliant because adding or removing a shard only requires moving a fraction of the data.

Let’s implement a simple version. We’ll need a hashing function. We can use Node’s crypto module.

// src/db/consistent-hash.ts
import crypto from 'crypto';

export class ConsistentHasher {
    // Maps each position on the ring to the shard that owns it
    private ring: Map<number, string> = new Map();
    private sortedKeys: number[] = [];

    // Each shard is placed at many "virtual node" positions so keys spread
    // evenly and adding or removing a shard moves only a small slice of data
    addShard(shardName: string, virtualNodes: number = 100) {
        for (let i = 0; i < virtualNodes; i++) {
            const key = this.hash(`${shardName}-${i}`);
            this.ring.set(key, shardName);
        }
        this.sortedKeys = Array.from(this.ring.keys()).sort((a, b) => a - b);
    }

    getShard(key: string): string {
        if (this.sortedKeys.length === 0) {
            throw new Error('No shards available');
        }
        const hash = this.hash(key);
        // Walk clockwise: the first ring position at or past the key's hash wins
        for (const ringKey of this.sortedKeys) {
            if (ringKey >= hash) {
                return this.ring.get(ringKey)!;
            }
        }
        // Wrapped past the end of the ring, so fall back to the first position
        return this.ring.get(this.sortedKeys[0])!;
    }

    // MD5 is fine here: we need an even distribution, not cryptographic strength
    private hash(str: string): number {
        return parseInt(crypto.createHash('md5').update(str).digest('hex').substring(0, 8), 16);
    }
}

// Usage
const hasher = new ConsistentHasher();
hasher.addShard('shard1');
hasher.addShard('shard2');
hasher.addShard('shard3');

const userId = 'user_12345';
const assignedShard = hasher.getShard(userId);
console.log(`User ${userId} belongs to ${assignedShard}`);

With our hashing strategy ready, we need a manager to hold the configuration for all our shards and give us a database connection for a given key. This is the core of our router.

We’ll define a simple schema with Drizzle. Let’s say we have a users table and an orders table. The user ID will be our shard key.

// src/db/schema.ts
import { pgTable, serial, varchar, integer, timestamp } from 'drizzle-orm/pg-core';

export const users = pgTable('users', {
    id: serial('id').primaryKey(),
    email: varchar('email', { length: 255 }).notNull().unique(),
    shardKey: varchar('shard_key', { length: 255 }).notNull(), // The user's ID used for hashing
});

export const orders = pgTable('orders', {
    id: serial('id').primaryKey(),
    userId: integer('user_id').notNull().references(() => users.id),
    amount: integer('amount').notNull(),
    createdAt: timestamp('created_at').defaultNow(),
});

Now, the Shard Manager. Its job is to know about all our database instances and provide a configured Drizzle client for the correct one.

// src/db/shard-manager.ts
import { drizzle } from 'drizzle-orm/node-postgres';
import { Pool } from 'pg';
import { ConsistentHasher } from './consistent-hash';
import * as schema from './schema';

export class ShardManager {
    private hasher: ConsistentHasher;
    private shardConfigs: Map<string, string>; // shard name -> connection string
    private clients: Map<string, ReturnType<typeof drizzle>> = new Map();

    constructor(shardConfigs: Record<string, string>) {
        this.hasher = new ConsistentHasher();
        this.shardConfigs = new Map(Object.entries(shardConfigs));

        for (const shardName of this.shardConfigs.keys()) {
            this.hasher.addShard(shardName);
            this.connectToShard(shardName);
        }
    }

    private connectToShard(shardName: string) {
        const connectionString = this.shardConfigs.get(shardName);
        if (!connectionString) return;

        const pool = new Pool({ connectionString });
        const db = drizzle(pool, { schema });
        this.clients.set(shardName, db);
        console.log(`Connected to ${shardName}`);
    }

    getDbForShardKey(shardKey: string) {
        const shardName = this.hasher.getShard(shardKey);
        const db = this.clients.get(shardName);
        if (!db) {
            throw new Error(`No database client found for shard: ${shardName}`);
        }
        return { db, shardName };
    }

    getAllShards() {
        return Array.from(this.clients.entries());
    }
}

We initialize it with our database connection strings. Notice a problem yet? Every shard has the same schema, but the data inside is different. We need to run our migrations on all shards when we set this up.
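To make this concrete, here is a sketch of wiring the manager up and applying migrations to every shard. The connection strings mirror the docker-compose file above; the ./drizzle folder and drizzle-kit workflow are my assumptions, so adjust them to match your project.

// Sketch: initialize the manager, then migrate every shard in turn.
// Assumes migration files were generated into ./drizzle (e.g. by drizzle-kit).
import { migrate } from 'drizzle-orm/node-postgres/migrator';
import { ShardManager } from './db/shard-manager';

const shardManager = new ShardManager({
    shard1: 'postgresql://app_user:secure_password@localhost:5432/app_shard1',
    shard2: 'postgresql://app_user:secure_password@localhost:5433/app_shard2',
    shard3: 'postgresql://app_user:secure_password@localhost:5434/app_shard3',
});

for (const [shardName, db] of shardManager.getAllShards()) {
    console.log(`Migrating ${shardName}...`);
    await migrate(db, { migrationsFolder: './drizzle' });
}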

The real magic happens in the Query Router. This is the class your application code will call. You give it a shard key and a query function, and it handles the rest.

// src/router/query-router.ts
import { eq } from 'drizzle-orm';
import { ShardManager } from '../db/shard-manager';
import * as schema from '../db/schema';

export class QueryRouter {
    constructor(private shardManager: ShardManager) {}

    async executeForShard<T>(shardKey: string, queryFn: (db: any) => Promise<T>): Promise<T> {
        const { db } = this.shardManager.getDbForShardKey(shardKey);
        return await queryFn(db);
    }

    // Example: Insert a user
    async createUser(userData: { email: string; shardKey: string }) {
        return this.executeForShard(userData.shardKey, async (db) => {
            const result = await db.insert(schema.users).values(userData).returning();
            return result[0];
        });
    }

    // Example: Get a user's orders
    async getOrdersForUser(userShardKey: string, userId: number) {
        return this.executeForShard(userShardKey, async (db) => {
            return await db.select().from(schema.orders).where(eq(schema.orders.userId, userId));
        });
    }
}

Let’s see it in action. Your API endpoint for creating a user becomes very clean.

// In your API route handler
const router = new QueryRouter(shardManager);

app.post('/users', async (req, res) => {
    const { email } = req.body;
    const shardKey = `user_${generateId()}`; // generateId() is a stand-in for your ID generator; this value becomes your sharding key

    try {
        const newUser = await router.createUser({ email, shardKey });
        res.json(newUser);
    } catch (error) {
        res.status(500).json({ error: 'Failed to create user' });
    }
});

But what happens when you need data that spans multiple shards, such as a system-wide report on total orders? These are cross-shard queries, and they are inherently more complex and slower. Our router needs a strategy for them.

One approach is the scatter-gather pattern. The router fans the query out to all shards, waits for the results, and then combines them.

// In the QueryRouter class (also add: import { sum } from 'drizzle-orm';)
async executeAcrossAllShards<T>(queryFn: (db: any, shardName: string) => Promise<T>): Promise<T[]> {
    const allShards = this.shardManager.getAllShards();
    const promises = allShards.map(([shardName, db]) => queryFn(db, shardName));
    return Promise.all(promises);
}

async getTotalRevenue(): Promise<number> {
    const results = await this.executeAcrossAllShards(async (db) => {
        const aggResult = await db.select({ total: sum(schema.orders.amount) }).from(schema.orders);
        // Postgres returns aggregates as strings, so coerce before adding
        return Number(aggResult[0]?.total ?? 0);
    });
    return results.reduce((acc, shardTotal) => acc + shardTotal, 0);
}

You must be careful with this. Cross-shard operations don’t support transactions in the traditional sense. If you need to move a user’s data from one shard to another during a rebalance, you have to write careful logic to copy, verify, and then delete, handling potential failures at each step.
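As a rough illustration, a single-user move might look like the sketch below. Every name here is illustrative rather than part of the router we built, and a real rebalance also has to pause or dual-write traffic for that user while the move runs.

// Sketch: copy-verify-delete for moving one user's rows between shards.
// There is no cross-shard transaction, so every step must tolerate a retry.
async function moveUserData(sourceDb: any, targetDb: any, userId: number) {
    // 1. Copy: read from the source shard, write to the target shard.
    const [user] = await sourceDb.select().from(schema.users).where(eq(schema.users.id, userId));
    if (!user) throw new Error('User not found on source shard');
    await targetDb.insert(schema.users).values(user).onConflictDoNothing();

    const orders = await sourceDb.select().from(schema.orders).where(eq(schema.orders.userId, userId));
    if (orders.length > 0) {
        await targetDb.insert(schema.orders).values(orders).onConflictDoNothing();
    }

    // 2. Verify: confirm the target has everything before touching the source.
    const copied = await targetDb.select().from(schema.orders).where(eq(schema.orders.userId, userId));
    if (copied.length !== orders.length) {
        throw new Error('Copy verification failed; leaving source data in place');
    }

    // 3. Delete: only now remove the rows from the source shard.
    await sourceDb.delete(schema.orders).where(eq(schema.orders.userId, userId));
    await sourceDb.delete(schema.users).where(eq(schema.users.id, userId));
}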

How do you know if a shard goes down? We need health checks. A simple ping to each database at regular intervals can keep our manager informed. If a shard fails, the consistent hasher can be updated to temporarily remove it from the ring, directing traffic to the remaining healthy shards.

// Simple health check (requires: import { sql } from 'drizzle-orm';)
async checkShardHealth(shardName: string): Promise<boolean> {
    // Look the client up by name; hashing a dummy key would ping whichever
    // shard that key happens to map to, not the one we were asked about
    const entry = this.shardManager.getAllShards().find(([name]) => name === shardName);
    if (!entry) return false;
    const [, db] = entry;
    try {
        await db.execute(sql`SELECT 1`); // Lightweight liveness probe
        return true;
    } catch {
        return false;
    }
}
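The ConsistentHasher we wrote earlier only knows how to add shards, so taking a failed shard out of rotation needs a removeShard counterpart. Here is one possible sketch, not part of the original class:

// Sketch: drop a shard's virtual nodes from the ring so new lookups
// route to the remaining healthy shards.
removeShard(shardName: string) {
    for (const [key, name] of this.ring) {
        if (name === shardName) {
            this.ring.delete(key);
        }
    }
    this.sortedKeys = Array.from(this.ring.keys()).sort((a, b) => a - b);
}

Keep in mind this only redirects new lookups; data already sitting on the failed shard stays unreachable until that shard recovers or is restored elsewhere.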

Building this router teaches you the core concepts of distributed data: picking a good shard key, routing reads and writes, and handling the trade-offs. The system we built is a foundation. In production, you’d add connection pooling per shard, more sophisticated monitoring, and perhaps integrate with a service discovery system.

The beauty is in the separation of concerns. Your business logic stays simple. The router handles distribution. When you need to add a fourth shard because you’ve hit another million users, you update the shard manager configuration, run migrations on the new database, and the consistent hasher will start sending some of the new users there. Existing user data stays put until you run a planned rebalancing job.

This approach isn’t just for massive tech companies. It’s a pattern you can understand and implement early, saving yourself from a painful “big rewrite” later. It gives you the confidence that your data layer can grow.

What would you add to this router first? Better monitoring, or a dashboard to see the data distribution? Think about the specific queries your app runs most often. Could they be served from a single shard?

I hope this guide helps you think differently about scaling your database layer. It’s a challenging but solvable problem. If you found this walkthrough useful, please share it with a colleague who might be facing a scaling challenge. Have you tried sharding before? What was your biggest hurdle? Let me know in the comments below.


Learn to build scalable real-time analytics with WebSockets, Redis Streams & TypeScript. Complete guide with live dashboards, error handling & deployment.