I’ve been building microservices for years, and I still remember the headaches of coordinating distributed systems. That’s what led me to event-driven architecture: it gives each service real autonomy while keeping the whole system in sync. Let me show you how to build a robust system that handles real business complexity without becoming a distributed monolith.
Have you ever wondered how modern applications handle complex workflows across multiple services without losing data consistency?
Let’s start with our foundation. We’ll use NestJS because it provides excellent structure for microservices, RabbitMQ for reliable messaging, and MongoDB for flexible data storage. The key insight? Events become the communication backbone between services.
Here’s how we define our core events:
// shared/events/order.events.ts

// Shape of a line item, matching what the order service stores and the tests send
export interface OrderItem {
  productId: string;
  quantity: number;
  price: number;
}

export class OrderCreatedEvent {
  constructor(
    public readonly orderId: string,
    public readonly userId: string,
    public readonly items: OrderItem[],
    public readonly totalAmount: number,
    public readonly correlationId: string
  ) {}
}

export class PaymentProcessedEvent {
  constructor(
    public readonly orderId: string,
    public readonly paymentId: string,
    public readonly status: 'SUCCESS' | 'FAILED',
    public readonly correlationId: string
  ) {}
}
Notice the correlationId? That’s our secret weapon for tracking related events across services.
Setting up our Order Service reveals some interesting patterns:
// order-service/src/orders/orders.controller.ts
@Controller('orders')
export class OrdersController {
  constructor(private readonly ordersService: OrdersService) {}

  @Post()
  async create(@Body() createOrderDto: CreateOrderDto) {
    const order = await this.ordersService.create(createOrderDto);
    // Emit event immediately after creation
    await this.ordersService.publishOrderCreated(order);
    return order;
  }
}
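The controller calls publishOrderCreated, which I haven’t shown. Here’s a minimal sketch of what it might look like, using the same EventBus abstraction the payment service uses below; the import path and the small OrderLike helper type are illustrative assumptions, not fixed API:

// order-service/src/orders/orders.service.ts (sketch of the publish path only; create() is omitted)
import { Injectable } from '@nestjs/common';
import { EventBus } from '@nestjs/cqrs';
import { randomUUID } from 'crypto';
import { OrderCreatedEvent, OrderItem } from '../../../shared/events/order.events';

// Minimal shape we need from the saved order document (assumption for this sketch)
interface OrderLike {
  _id: string;
  userId: string;
  items: OrderItem[];
  totalAmount: number;
}

@Injectable()
export class OrdersService {
  constructor(private readonly eventBus: EventBus) {}

  async publishOrderCreated(order: OrderLike): Promise<void> {
    // Mint the correlation ID once, at the start of the flow; every
    // downstream event (payment, inventory, notification) echoes it back.
    await this.eventBus.publish(
      new OrderCreatedEvent(
        order._id,
        order.userId,
        order.items,
        order.totalAmount,
        randomUUID()
      )
    );
  }
}

That single correlation ID, generated at the start of the flow, is what makes the cross-service tracing described later possible.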
But what happens when a payment fails after inventory is reserved? This is where the Saga pattern saves us.
The Saga pattern manages distributed transactions by breaking them into smaller, reversible steps. Each service handles its part, and if something fails, we trigger compensating actions.
Here’s our payment service handling both success and failure scenarios:
// payment-service/src/payments/payments.service.ts
@Injectable()
export class PaymentsService {
  constructor(
    @InjectModel(Payment.name) private paymentModel: Model<PaymentDocument>,
    private readonly eventBus: EventBus
  ) {}

  // correlationId comes from the triggering OrderCreatedEvent
  async processPayment(orderId: string, amount: number, correlationId: string): Promise<void> {
    const payment = new this.paymentModel({
      orderId,
      amount,
      status: 'PROCESSING'
    });

    try {
      // Simulate payment processing
      const result = await this.mockPaymentGateway(amount);

      payment.status = result.success ? 'COMPLETED' : 'FAILED';
      await payment.save();

      // Emit the appropriate event, echoing the correlation ID from the order flow
      await this.eventBus.publish(
        new PaymentProcessedEvent(
          orderId,
          payment.id,
          result.success ? 'SUCCESS' : 'FAILED',
          correlationId
        )
      );
    } catch (error) {
      payment.status = 'FAILED';
      await payment.save();

      await this.eventBus.publish(
        new PaymentProcessedEvent(orderId, payment.id, 'FAILED', correlationId)
      );
    }
  }

  // Stand-in for a real payment gateway call
  private async mockPaymentGateway(amount: number): Promise<{ success: boolean }> {
    return { success: amount > 0 };
  }
}
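When the FAILED event arrives, the inventory service runs its compensating step and releases the stock it reserved. Here’s a rough sketch of such a listener; the InventoryService, its releaseReservation method, the 'payment.processed' routing pattern, and the import paths are all my own names for illustration:

// inventory-service/src/inventory/inventory.listener.ts (sketch)
import { Controller } from '@nestjs/common';
import { Ctx, EventPattern, Payload, RmqContext } from '@nestjs/microservices';
import { PaymentProcessedEvent } from '../shared/events/order.events';
import { InventoryService } from './inventory.service';

@Controller()
export class InventoryListener {
  constructor(private readonly inventoryService: InventoryService) {}

  @EventPattern('payment.processed')
  async onPaymentProcessed(
    @Payload() event: PaymentProcessedEvent,
    @Ctx() context: RmqContext
  ) {
    if (event.status === 'FAILED') {
      // Compensating action: undo the reservation made when the order was created
      await this.inventoryService.releaseReservation(event.orderId, event.correlationId);
    }
    // Acknowledge only after the work is done (requires noAck: false, shown below)
    context.getChannelRef().ack(context.getMessage());
  }
}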
Careful RabbitMQ configuration keeps us from losing messages: durable queues survive broker restarts, and explicit consumer acknowledgments make sure nothing is dropped mid-processing.
// shared/rabbitmq/rabbitmq.module.ts
@Module({
  imports: [
    ConfigModule,
    ClientsModule.registerAsync([
      {
        name: 'EVENT_BUS',
        useFactory: (configService: ConfigService) => ({
          transport: Transport.RMQ,
          options: {
            urls: [configService.get('RABBITMQ_URL')],
            queue: 'events',
            queueOptions: {
              durable: true,
            },
          },
        }),
        inject: [ConfigService],
      },
    ]),
  ],
  // Re-export ClientsModule so importing modules can inject the 'EVENT_BUS' client
  exports: [ClientsModule],
})
export class RabbitMQModule {}
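Durability alone only protects the queue itself. To avoid losing a message when a consumer crashes mid-handler, each consuming service also runs with manual acknowledgments. Here’s a minimal bootstrap sketch for the payment service, assuming the standard NestJS RMQ transport; the prefetch value is just an example:

// payment-service/src/main.ts (sketch)
import { NestFactory } from '@nestjs/core';
import { Transport, RmqOptions } from '@nestjs/microservices';
import { AppModule } from './app.module';

async function bootstrap() {
  const app = await NestFactory.createMicroservice<RmqOptions>(AppModule, {
    transport: Transport.RMQ,
    options: {
      urls: [process.env.RABBITMQ_URL ?? 'amqp://localhost:5672'],
      queue: 'events',
      queueOptions: { durable: true }, // matches the producer-side declaration
      noAck: false,                    // handlers must ack explicitly, as in the compensation listener above
      prefetchCount: 10,               // cap in-flight messages per consumer
    },
  });
  await app.listen();
}
bootstrap();

With noAck set to false, a message whose handler throws is redelivered instead of silently vanishing.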
How do we ensure our services can find each other and stay healthy?
Service discovery and health checks are crucial. Here’s a simple approach:
// order-service/src/health/health.controller.ts
@Controller('health')
export class HealthController {
  constructor(
    private health: HealthCheckService,
    private mongoose: MongooseHealthIndicator,
    private microservice: MicroserviceHealthIndicator
  ) {}

  @Get()
  @HealthCheck()
  check() {
    return this.health.check([
      () => this.mongoose.pingCheck('mongodb'),
      // Generic microservice ping for broker reachability; swap in a dedicated
      // AMQP check if your setup needs a deeper probe
      () =>
        this.microservice.pingCheck('rabbitmq', {
          transport: Transport.RMQ,
          options: { urls: [process.env.RABBITMQ_URL ?? 'amqp://localhost:5672'] },
        }),
    ]);
  }
}
Error handling deserves special attention. We implement retry mechanisms with exponential backoff:
// shared/utils/retry.util.ts
export async function withRetry<T>(
  operation: () => Promise<T>,
  maxRetries = 3
): Promise<T> {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return await operation();
    } catch (error) {
      if (attempt === maxRetries) throw error;
      const delay = Math.pow(2, attempt) * 1000;
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
  throw new Error('Max retries exceeded');
}
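A quick usage sketch: flakyCall stands in for whatever transient operation you’re protecting, such as a broker publish or an outbound HTTP request (the import path is illustrative):

// retry usage sketch (illustrative)
import { withRetry } from './shared/utils/retry.util';

async function flakyCall(): Promise<string> {
  // Fails roughly half the time to simulate a transient error
  if (Math.random() < 0.5) throw new Error('transient failure');
  return 'ok';
}

// Tries up to three times, waiting 2s and then 4s between attempts
withRetry(flakyCall, 3).then(result => console.log(result));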
Testing our event-driven system requires a different approach:
// order-service/test/orders.e2e-spec.ts
describe('Orders (e2e)', () => {
  it('should create order and emit event', async () => {
    const createOrderDto = {
      userId: 'user-123',
      items: [{ productId: 'prod-1', quantity: 2, price: 25 }],
      totalAmount: 50
    };

    const response = await request(app.getHttpServer())
      .post('/orders')
      .send(createOrderDto)
      .expect(201);

    // Verify an OrderCreatedEvent for this order was published
    expect(eventBus.publish).toHaveBeenCalledWith(
      expect.objectContaining({
        orderId: response.body._id,
        userId: 'user-123',
        correlationId: expect.any(String)
      })
    );
  });
});
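That assertion relies on the real EventBus being swapped for a spy during setup. Here’s a minimal sketch of that wiring, assuming the usual AppModule location and the @nestjs/cqrs EventBus used above:

// order-service/test/orders.e2e-spec.ts (setup sketch)
import { Test } from '@nestjs/testing';
import { INestApplication } from '@nestjs/common';
import { EventBus } from '@nestjs/cqrs';
import * as request from 'supertest';
import { AppModule } from '../src/app.module';

let app: INestApplication;
let eventBus: { publish: jest.Mock };

beforeAll(async () => {
  eventBus = { publish: jest.fn() };

  const moduleRef = await Test.createTestingModule({ imports: [AppModule] })
    .overrideProvider(EventBus)
    .useValue(eventBus) // replace the real bus with a spy we can assert against
    .compile();

  app = moduleRef.createNestApplication();
  await app.init();
});

afterAll(async () => {
  await app.close();
});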
Deployment brings its own challenges. Docker Compose helps us manage multiple services:
# docker-compose.yml
version: '3.8'

services:
  order-service:
    build: ./order-service
    environment:
      - RABBITMQ_URL=amqp://rabbitmq:5672
      - MONGODB_URI=mongodb://mongodb:27017/orders

  rabbitmq:
    image: rabbitmq:3-management
    ports:
      - "5672:5672"
      - "15672:15672"

  mongodb:
    image: mongo:5.0
    ports:
      - "27017:27017"
Can you see how each piece fits together? The events flow between services, each handling its specific responsibility while maintaining overall system consistency.
The real power emerges when we need to add new services. Want to send notifications? Just create a notification service that listens for order events. Need analytics? Add another listener. The existing services don’t need to change.
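To make that concrete, here’s a rough sketch of such a notification listener, assuming an RMQ consumer bootstrapped like the payment service earlier; the NotificationsService, its sendOrderConfirmation method, and the import paths are all illustrative names:

// notification-service/src/notifications/notifications.listener.ts (sketch)
import { Controller } from '@nestjs/common';
import { Ctx, EventPattern, Payload, RmqContext } from '@nestjs/microservices';
import { OrderCreatedEvent } from '../shared/events/order.events';
import { NotificationsService } from './notifications.service';

@Controller()
export class NotificationsListener {
  constructor(private readonly notifications: NotificationsService) {}

  @EventPattern('order.created')
  async onOrderCreated(@Payload() event: OrderCreatedEvent, @Ctx() context: RmqContext) {
    // A brand-new consumer of an existing event; the order service is untouched
    await this.notifications.sendOrderConfirmation(event.userId, event.orderId);
    context.getChannelRef().ack(context.getMessage());
  }
}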
I’ve found that monitoring is non-negotiable in production. We track event throughput, processing times, and error rates. When something goes wrong, the correlation IDs let us trace the entire flow across services.
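The tracing part can be as simple as making sure every log line carries the correlation ID. A tiny illustrative example of how a handler might log it (the logger context and message format are arbitrary):

// payment-service/src/payments/payments.logging-example.ts (illustrative only)
import { Logger } from '@nestjs/common';

const logger = new Logger('PaymentsService');

export function logPaymentResult(correlationId: string, orderId: string, status: string): void {
  // Grepping logs across services for this correlationId reconstructs the whole saga
  logger.log(`[${correlationId}] payment for order ${orderId} finished with status ${status}`);
}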
Building event-driven microservices requires shifting your mindset from direct service calls to event-based collaboration. It’s more work upfront, but the flexibility and resilience pay dividends as your system grows.
What challenges have you faced with microservices communication? I’d love to hear about your experiences and solutions. If this approach resonates with you, please share your thoughts in the comments below - let’s learn from each other’s journeys in distributed systems.