I’ve been building microservices for years, but it wasn’t until I faced massive order spikes during a client’s Black Friday event that I truly appreciated event-driven architecture. When our system nearly buckled under pressure, I knew we needed a better approach. That’s how I discovered the power trio of NestJS, Redis Streams, and Bull Queue. Let me show you how to build systems that handle traffic like a champ.
First, let’s set up our foundation. I prefer NestJS because it gives me TypeScript’s safety without sacrificing Node.js agility. Here’s how I bootstrap a new project:
nest new order-service
cd order-service
npm install @nestjs/bull bull ioredis date-fns
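Before any of the snippets below will run, Bull needs its Redis connection wired into the root module. Here's a minimal sketch of the wiring I'm assuming for the rest of this post (host and port are placeholders):
// app.module.ts
import { Module } from '@nestjs/common';
import { BullModule } from '@nestjs/bull';

@Module({
  imports: [
    // One shared Redis connection for every Bull queue in the app
    BullModule.forRoot({
      redis: { host: 'localhost', port: 6379 },
    }),
    BullModule.registerQueue({ name: 'payment-queue' }),
  ],
})
export class AppModule {}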
Now, why Redis Streams? Unlike fire-and-forget Pub/Sub, Streams persist events, and paired with consumer groups they give you at-least-once delivery. When an order comes in, we push it to a stream:
// order.service.ts
import { v4 as uuidv4 } from 'uuid';

async createOrder(orderData: Record<string, unknown>) {
  const event = {
    id: uuidv4(),
    type: 'ORDER_CREATED',
    aggregateId: `order-${Date.now()}`,
    data: orderData,
    metadata: { timestamp: new Date().toISOString() },
  };

  // '*' tells Redis to auto-generate the stream entry ID
  await this.redisClient.xadd(
    'order-events',
    '*',
    'event', JSON.stringify(event),
  );
}
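The redisClient in that snippet is a plain ioredis instance. There are several ways to provide it in NestJS; this sketch uses a custom provider (the REDIS_CLIENT token is my own naming, not a framework convention):
// redis.provider.ts
import Redis from 'ioredis';

export const redisProvider = {
  provide: 'REDIS_CLIENT',
  useFactory: () => new Redis({ host: 'localhost', port: 6379 }),
};

// ...and in the consuming service:
// constructor(@Inject('REDIS_CLIENT') private readonly redisClient: Redis) {}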
But what happens when downstream services fail? That’s where Bull Queue shines. I configure workers to process stream events:
// payment.processor.ts
import { Process, Processor } from '@nestjs/bull';
import { Job } from 'bull';

@Processor('payment-queue')
export class PaymentProcessor {
  constructor(private readonly paymentGateway: PaymentGateway) {}

  @Process()
  async handleJob(job: Job) {
    const order = job.data;
    // No manual moveToCompleted/moveToFailed here: Bull marks the job
    // completed when this handler resolves, and failed (with retries,
    // per the queue's backoff settings) when it throws.
    await this.paymentGateway.charge(order);
  }
}
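For completeness, here's the enqueue side. The service method is hypothetical, but @InjectQueue is the standard @nestjs/bull way to get a queue handle:
// Enqueue side (hypothetical service method)
import { InjectQueue } from '@nestjs/bull';
import { Queue } from 'bull';

constructor(
  @InjectQueue('payment-queue') private readonly paymentQueue: Queue,
) {}

async requestPayment(order: { id: string; total: number }) {
  // Per-job options here would override the queue-level defaults
  await this.paymentQueue.add(order, { attempts: 3 });
}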
Ever wondered how to prevent event storms from overwhelming your system? I use consumer groups with backpressure control. This snippet processes events without overloading workers:
// redis.consumer.ts
async processEvents() {
  while (true) {
    // COUNT caps the batch (backpressure); BLOCK waits up to 2s for new events
    const results = await this.redisClient.xreadgroup(
      'GROUP', 'order-group', 'worker1',
      'COUNT', 10, 'BLOCK', 2000,
      'STREAMS', 'order-events', '>',
    );
    if (!results) continue;

    // Reply shape: [[streamName, [[entryId, fields], ...]], ...]
    const [, entries] = results[0];
    for (const [entryId, fields] of entries) {
      await this.queue.add('process-order', JSON.parse(fields[1]));
      // ACK only after the job is safely handed off to Bull
      await this.redisClient.xack('order-events', 'order-group', entryId);
    }
  }
}
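One gotcha worth calling out: XREADGROUP throws a NOGROUP error unless the consumer group already exists, so I create it idempotently on startup. A sketch (MKSTREAM also creates the stream itself if it's missing):
// redis.consumer.ts (startup hook)
async onModuleInit() {
  try {
    await this.redisClient.xgroup(
      'CREATE', 'order-events', 'order-group', '$', 'MKSTREAM',
    );
  } catch (err) {
    // BUSYGROUP just means the group already exists; anything else is real
    if (!String(err).includes('BUSYGROUP')) throw err;
  }
}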
For monitoring, I instrument everything. This bull-board integration (via the @bull-board/api and @bull-board/express packages) gives me real-time visibility:
// main.ts
import { createBullBoard } from '@bull-board/api';
import { BullAdapter } from '@bull-board/api/bullAdapter';
import { ExpressAdapter } from '@bull-board/express';

const app = await NestFactory.create(AppModule);
const serverAdapter = new ExpressAdapter();
serverAdapter.setBasePath('/queues');
// 'queue' is the Bull instance (e.g. app.get(getQueueToken('order-queue')))
createBullBoard({ queues: [new BullAdapter(queue)], serverAdapter });
app.use('/queues', serverAdapter.getRouter());
When things go wrong (and they will!), I implement dead-letter queues. Bull has no built-in DLQ setting, so I pair retry policy and failure retention in the config with a failed-job listener (shown after it) that does the quarantining:
// queue.config.ts
BullModule.registerQueue({
  name: 'order-queue',
  defaultJobOptions: {
    attempts: 3,
    backoff: { type: 'exponential', delay: 1000 },
    removeOnFail: 1000, // keep the last 1000 failed jobs around for analysis
  },
});
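The quarantine itself is a listener that copies exhausted jobs into a separate queue. A sketch, assuming an order-dlq queue registered alongside the one above:
// order-dlq.listener.ts
import { InjectQueue, OnQueueFailed, Processor } from '@nestjs/bull';
import { Job, Queue } from 'bull';

@Processor('order-queue')
export class OrderDlqListener {
  constructor(@InjectQueue('order-dlq') private readonly dlq: Queue) {}

  @OnQueueFailed()
  async onFailed(job: Job, error: Error) {
    // Only quarantine once Bull has exhausted the configured retries
    if (job.attemptsMade >= (job.opts.attempts ?? 1)) {
      await this.dlq.add({
        original: job.data,
        reason: error.message,
        failedAt: new Date().toISOString(),
      });
    }
  }
}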
Performance tip: Always partition your streams. I shard by customer ID to maintain ordering while scaling:
// Modulo pins each customer to one shard, preserving per-customer ordering
const partitionKey = `order-stream-${customerId % 10}`;
await redis.xadd(partitionKey, '*', 'event', eventData);
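That modulo assumes numeric customer IDs. If yours are UUIDs or other strings, hash them first so shard assignment stays stable; a quick sketch:
import { createHash } from 'crypto';

// Derive a stable shard (0-9) from a string customer ID
function shardFor(customerId: string, shards = 10): number {
  const digest = createHash('md5').update(customerId).digest();
  return digest.readUInt32BE(0) % shards;
}

const partitionKey = `order-stream-${shardFor(customerId)}`;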
Testing is non-negotiable. I use this pattern to validate event flows:
// order.e2e.spec.ts
it('should process failed payments', async () => {
  jest
    .spyOn(paymentService, 'charge')
    .mockRejectedValueOnce(new Error('Declined'));

  await publishTestEvent('ORDER_CREATED', testOrder);
  await waitForQueueDrain();

  const failedJobs = await queue.getFailed(); // Bull's API for fetching failures
  expect(failedJobs[0].failedReason).toBe('Declined');
});
In production, I always set memory limits and enable persistence. This redis.conf snippet keeps Redis from eating the whole box and, just as important for queues, from evicting job and stream data:
maxmemory 4gb
# Eviction policies like allkeys-lfu would silently delete jobs and
# stream entries; noeviction fails writes loudly at the cap instead.
maxmemory-policy noeviction
appendonly yes
Notice how I haven’t mentioned databases yet? That’s intentional. In true event-driven fashion, Redis becomes my source of truth during peak loads, with periodic snapshots to PostgreSQL.
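The snapshot itself can be as simple as a cron job that pages through the stream and hands events to your persistence layer. One way I'd sketch it, assuming @nestjs/schedule and a placeholder OrderRepository (not a real API); the '(' exclusive-range syntax needs Redis 6.2+:
// snapshot.service.ts (sketch)
import { Injectable } from '@nestjs/common';
import { Cron, CronExpression } from '@nestjs/schedule';
import Redis from 'ioredis';

@Injectable()
export class SnapshotService {
  private lastId = '-'; // '-' = start of stream on the first run

  constructor(
    private readonly redisClient: Redis,
    private readonly orders: OrderRepository, // placeholder persistence layer
  ) {}

  @Cron(CronExpression.EVERY_MINUTE)
  async snapshot() {
    // Page through everything newer than the last persisted entry
    const start = this.lastId === '-' ? '-' : `(${this.lastId}`;
    const entries = await this.redisClient.xrange(
      'order-events', start, '+', 'COUNT', 500,
    );
    for (const [id, fields] of entries) {
      await this.orders.save(JSON.parse(fields[1])); // fields = ['event', json]
      this.lastId = id;
    }
  }
}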
Did you catch the beauty of this setup? Events flow through streams, queues handle the heavy lifting, and failures get quarantined. The system self-heals during outages: when services restart, they pick up right where they left off.
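That pick-up-where-you-left-off behavior comes from the pending entries list: events a consumer read but never acknowledged stay assigned to it. Surviving workers can steal a dead worker's backlog; a sketch using XAUTOCLAIM (Redis 6.2+, exposed by recent ioredis versions):
// Reclaim entries that have sat unacked for 60s+ on any crashed consumer
const [, reclaimed] = await this.redisClient.xautoclaim(
  'order-events', 'order-group', 'worker1', 60000, '0-0',
);
for (const [entryId, fields] of reclaimed) {
  await this.queue.add('process-order', JSON.parse(fields[1]));
  await this.redisClient.xack('order-events', 'order-group', entryId);
}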
I’ve deployed this pattern for e-commerce clients processing 10,000+ orders/minute. The key is starting simple: implement the core event stream first, add queues for slow operations, then layer in monitoring. Avoid over-engineering early.
What surprised me most? How easily teams adopt this model. Developers love tracing events through the system instead of debugging distributed monoliths.
If this resonates with your challenges, try implementing just the Redis Streams layer first. Measure your throughput before and after. I’d love to hear about your results in the comments! Found this useful? Share it with someone battling microservice complexity. Let’s build more resilient systems together.