Why Event-Driven Microservices?
Last month, I faced a critical production outage when our payment service crashed during peak traffic. Synchronous API calls created cascading failures across our system. That experience drove me to rebuild our architecture using event-driven principles. Why should services crash because their neighbors are down? Event-driven patterns allow independent scaling, fault tolerance, and true decoupling.
Redis Streams became our backbone for this transformation. Unlike traditional queues, it offers persistent message storage with consumer groups, ordered event processing, and automatic load balancing. Have you considered how your systems would behave if one component failed for an hour? With Redis Streams, events wait patiently until services recover.
Building the Foundation
We’ll construct an e-commerce platform with three microservices:
- Order Service: Creates purchase orders
- Inventory Service: Manages product stock
- Notification Service: Sends customer alerts
Each service runs in its own NestJS application, sharing core types via a monorepo. Here’s our project structure:
event-driven-microservices/
├── packages/
│ ├── order-service/
│ ├── inventory-service/
│ ├── notification-service/
│ └── shared/ # Core event definitions
├── docker-compose.yml
└── package.json
Initialize the monorepo with this package.json:
{
  "private": true,
  "workspaces": ["packages/*"],
  "scripts": {
    "dev": "concurrently \"npm run dev:order\" \"npm run dev:inventory\" \"npm run dev:notification\""
  }
}
Run all services with one command: npm run dev.
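The docker-compose.yml from the project structure only needs to provide Redis for local development. A minimal sketch (the image tag, port mapping, and persistence settings are assumptions, not from the article):

```yaml
version: '3.8'
services:
  redis:
    image: redis:7-alpine
    ports:
      - '6379:6379'
    # Append-only persistence so stream entries survive container restarts
    command: redis-server --appendonly yes
    volumes:
      - redis-data:/data
volumes:
  redis-data:
```

Start it with docker compose up -d before running the services.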
Defining Event Contracts
Clear event schemas prevent integration nightmares. We define them in the shared package:
// shared/src/events/order.events.ts
export interface OrderCreatedEvent {
  type: 'order.created';
  data: {
    orderId: string;
    items: { productId: string; quantity: number }[];
  };
}
When designing events, ask yourself: Could a new developer understand this without documentation? Explicit typing and descriptive property names are crucial.
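The inventory service later publishes a stock-low event, so it deserves a contract in the same style. A sketch (the file name, event type string, and field names are our assumptions for illustration):

```typescript
// shared/src/events/inventory.events.ts (hypothetical file)
export interface StockLowEvent {
  type: 'inventory.stock_low';
  data: {
    productId: string;
    remainingQuantity: number; // units left when the alert fired
  };
}
```

Keeping every event type in the shared package gives each service compile-time checks on both sides of the stream.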
Redis Streams Implementation
Let’s create our Redis client wrapper. It handles connections and message serialization; we’ll layer error recovery on top shortly:
// shared/src/redis-stream.client.ts
import { Redis } from 'ioredis';

export class RedisStreamClient {
  private readonly client: Redis;

  constructor() {
    // Fall back to a local instance when REDIS_URL is not set
    this.client = new Redis(process.env.REDIS_URL ?? 'redis://localhost:6379');
  }

  async publish(stream: string, event: object): Promise<string | null> {
    return this.client.xadd(stream, '*', 'event', JSON.stringify(event));
  }

  async consumeGroup(
    stream: string,
    group: string,
    consumer: string,
    batchSize = 10
  ): Promise<any[]> {
    // '>' requests only messages never delivered to this group before
    const messages = await this.client.xreadgroup(
      'GROUP', group, consumer,
      'COUNT', batchSize,
      'STREAMS', stream, '>'
    );
    return messages?.[0]?.[1] || [];
  }
}
Notice the consumeGroup method uses Redis consumer groups for parallel processing. Multiple instances of a service can pull events without duplication.
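One detail the wrapper glosses over: XREADGROUP fails unless the consumer group already exists, so each service should create its group idempotently on startup. A sketch of such a helper (the ensureGroup name and the minimal client type are our assumptions; XGROUP CREATE with MKSTREAM is real Redis behavior):

```typescript
// Minimal shape of the Redis client methods we need here
type XGroupClient = {
  xgroup: (...args: (string | number)[]) => Promise<unknown>;
};

// Create the consumer group at startup. XGROUP CREATE throws BUSYGROUP
// when the group already exists, which is safe to swallow; MKSTREAM also
// creates the stream itself if it does not exist yet.
export async function ensureGroup(
  client: XGroupClient,
  stream: string,
  group: string
): Promise<void> {
  try {
    await client.xgroup('CREATE', stream, group, '$', 'MKSTREAM');
  } catch (err) {
    if (!String((err as Error).message).includes('BUSYGROUP')) {
      throw err;
    }
  }
}
```

Call it once per service before the first consumeGroup loop starts.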
Order Service: Event Production
When an order is placed, we publish an event:
// order-service/src/orders/orders.controller.ts
import { v4 as uuidv4 } from 'uuid';

@Controller('orders')
export class OrdersController {
  constructor(private readonly streamClient: RedisStreamClient) {}

  @Post()
  async createOrder(@Body() orderData: CreateOrderDto) {
    const event: OrderCreatedEvent = {
      type: 'order.created',
      data: { orderId: uuidv4(), items: orderData.items }
    };

    await this.streamClient.publish('order-events', event);
    return { status: 'Order processing started' };
  }
}
What happens if Redis is unreachable? We’ll implement retry logic shortly.
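One common approach is a small retry wrapper with exponential backoff around publish calls. This is a sketch, not the article's exact implementation; the publishWithRetry name and the default delay values are assumptions:

```typescript
// Retry an async operation with exponential backoff: 100ms, 200ms, 400ms, ...
// Rethrows the last error once all attempts are exhausted.
export async function publishWithRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 100
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts - 1) {
        await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** attempt));
      }
    }
  }
  throw lastError;
}
```

The controller would then call publishWithRetry(() => this.streamClient.publish('order-events', event)) instead of publishing directly.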
Inventory Service: Event Consumption
Services subscribe to relevant events:
// inventory-service/src/consumers/order.consumer.ts
@Injectable()
export class OrderConsumer {
  @OnEvent('order.created')
  async handleOrderCreated(event: OrderCreatedEvent) {
    for (const item of event.data.items) {
      const stockAvailable = await this.checkStock(item.productId);
      if (!stockAvailable) {
        await this.publishStockLowEvent(item.productId);
      }
    }
  }

  private async checkStock(productId: string): Promise<boolean> {
    // Database check implementation
  }
}
This pattern keeps services focused. The inventory service doesn’t care about payment processing or user notifications.
Error Handling and Dead Letters
Failures are inevitable. Redis consumer groups track unacknowledged messages:
async processEvents() {
  const messages = await this.streamClient.consumeGroup(
    'order-events',
    'inventory-group',
    'inventory-service'
  );

  for (const [id, fields] of messages) {
    // ioredis returns fields as a flat [key, value, ...] array
    const raw = fields[fields.indexOf('event') + 1];
    try {
      const event = JSON.parse(raw);
      await this.handleEvent(event);
      await this.client.xack('order-events', 'inventory-group', id);
    } catch (error) {
      await this.moveToDeadLetter('order-events', id, raw, error);
    }
  }
}
Dead letter queues store failed events for analysis. In our implementation, we log these to a dedicated Redis stream with error metadata.
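A sketch of that dead-letter move (the :dead-letter stream naming convention, the function signature, and the field names are our assumptions; only the XADD call itself is standard Redis):

```typescript
// Minimal shape of the Redis client method we need here
type XAddClient = {
  xadd: (...args: string[]) => Promise<string | null>;
};

// Re-publish a failed message to a dedicated dead-letter stream with
// enough metadata to debug it later: the original message id, the raw
// payload, the error message, and a timestamp.
export async function moveToDeadLetter(
  client: XAddClient,
  sourceStream: string,
  messageId: string,
  rawEvent: string,
  error: Error
): Promise<string | null> {
  return client.xadd(
    `${sourceStream}:dead-letter`, '*',
    'sourceId', messageId,
    'event', rawEvent,
    'error', error.message,
    'failedAt', new Date().toISOString()
  );
}
```

A separate tooling script can then XRANGE the dead-letter stream to inspect or replay failures.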
Monitoring Event Flows
Distributed tracing is essential. We propagate a correlationId across events:
const event = {
  type: 'order.created',
  correlationId: request.headers['x-correlation-id'],
  // ... other data
};
Integrate with OpenTelemetry to visualize event paths through services. I prefer Jaeger for tracing visualization; its timeline view shows exactly where bottlenecks occur.
Testing Strategies
Mock Redis streams in unit tests:
// order.service.spec.ts
const mockStreamClient = {
  publish: jest.fn().mockResolvedValue('message-id')
};

beforeEach(() => {
  jest.clearAllMocks();
});

it('should publish event on order creation', async () => {
  await ordersService.createOrder(testOrder);
  expect(mockStreamClient.publish).toHaveBeenCalledWith(
    'order-events',
    expect.objectContaining({ type: 'order.created' })
  );
});
For integration testing, use TestContainers to spin up real Redis instances. This catches serialization and networking issues early.
Performance Tuning
Scale consumers horizontally:
# Launch multiple inventory service instances
npm run dev:inventory -- --port=3001
npm run dev:inventory -- --port=3002
Redis consumer groups automatically balance events across instances. Monitor consumer lag with XINFO GROUPS order-events; if lag grows, add more consumers.
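XINFO GROUPS returns each group as a flat key/value array, so extracting the lag counter (available since Redis 7) takes a few lines. A small helper sketch; the groupLag name is ours:

```typescript
// One entry of an XINFO GROUPS reply looks like:
// ['name', 'inventory-group', 'consumers', 2, 'pending', 5, ..., 'lag', 12]
// Walk the key/value pairs and return the lag counter, or 0 if absent.
export function groupLag(entry: (string | number)[]): number {
  for (let i = 0; i < entry.length - 1; i += 2) {
    if (entry[i] === 'lag') {
      return Number(entry[i + 1]);
    }
  }
  return 0;
}
```

Feed this into your metrics pipeline and alert when lag stays above a threshold.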
Final Thoughts
Transitioning to event-driven architecture reduced our critical path failures by 83%. Services scale independently, and new features integrate without complex coordination.
Try implementing a simple event flow this week. Start with one producer and consumer. Notice how much simpler error recovery becomes when events persist during outages. What failure scenarios could this prevent in your systems?
If this approach resonates with you, share your implementation challenges in the comments. For hands-on developers: Clone our GitHub example repo and experiment with the complete codebase. Like this article if it helped you see microservices in a new light!