I’ve been thinking about performance a lot lately. Not just the kind you see in charts and graphs, but the real, tangible performance a user experiences when they click a button. We spend so much time optimizing front-end code, but what about the black box of the server? How do we truly know what’s happening between the request and the response? This question led me down a path, and I want to share what I built: a way to see inside the server, for every single request.
The answer lies in a powerful but often overlooked web standard: the Server-Timing API. It lets your server send detailed timing information directly to the browser. You can see it in your browser’s developer tools, just like you see JavaScript execution times. But to make it truly useful for a production system, we need to combine it with something that can store and analyze that data over time. That’s where Prometheus comes in.
Think about it. One tool gives you instant, per-request insight for debugging. The other collects metrics across all users to show trends and spot problems. Together, they form a complete picture of your backend’s health. I decided to build this monitoring layer for an Express.js application, and the process was enlightening.
Let’s start with the core concept. The server sends a special HTTP header. It looks something like this: Server-Timing: database;dur=47.2, cache;dur=1.5, total;dur=48.7. Each part is a metric—a name, a duration in milliseconds, and an optional description. The browser receives this and displays it in the Network tab. Suddenly, you’re not guessing why an API call is slow; you can see the database took 47ms and the cache was fast at 1.5ms.
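As a side note, the browser exposes these entries to client-side JavaScript as well through the Resource Timing API, not only in DevTools; for cross-origin requests the server must also send a Timing-Allow-Origin header. A minimal sketch of reading them:

// Client-side: log the Server-Timing entries the browser recorded for each resource.
for (const entry of performance.getEntriesByType('resource') as PerformanceResourceTiming[]) {
  for (const metric of entry.serverTiming) {
    console.log(`${entry.name} -> ${metric.name}: ${metric.duration}ms`);
  }
}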
So, how do we generate these timings in Node.js? We need a utility to track them. I created a simple class to handle the start and stop of various operations. It uses performance.now() for high-resolution timing.
import { performance } from 'node:perf_hooks'; // global in recent Node versions, but the explicit import works everywhere

export class ServerTiming {
  // Each named metric stores a start timestamp and, once finished, an end timestamp.
  private metrics = new Map<string, { start: number; end?: number }>();

  start(name: string) {
    this.metrics.set(name, { start: performance.now() });
  }

  end(name: string) {
    const metric = this.metrics.get(name);
    if (metric) metric.end = performance.now();
  }

  // Duration in milliseconds of a completed metric, or undefined if it never finished.
  getDuration(name: string): number | undefined {
    const metric = this.metrics.get(name);
    if (!metric || metric.end === undefined) return undefined;
    return metric.end - metric.start;
  }

  // Builds the header value, e.g. "database;dur=47.2, cache;dur=1.5".
  toHeaderString(): string {
    const entries: string[] = [];
    this.metrics.forEach((metric, name) => {
      if (metric.end !== undefined) {
        const duration = (metric.end - metric.start).toFixed(1);
        entries.push(`${name};dur=${duration}`);
      }
    });
    return entries.join(', ');
  }
}
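Used on its own, the tracker is just a start/end pair per operation (the operation name and output below are illustrative):

const timing = new ServerTiming();

timing.start('database');
// ... run the query ...
timing.end('database');

console.log(timing.toHeaderString()); // e.g. "database;dur=47.2"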
This class is the foundation. But how do we attach it to every request in Express? The answer is middleware. We can create a piece of middleware that adds a new ServerTiming instance to the request object. Then, anywhere in our route handlers, we can call req.serverTiming.start('database') and req.serverTiming.end('database').
The middleware has one more job: making sure the header actually goes out with the response. We can do this by wrapping the res.send method; since Express’s res.json delegates to res.send internally, JSON responses are covered as well. Just before the response goes out, we stop the overall ‘total’ timer and set the Server-Timing header.
export const serverTimingMiddleware = (req, res, next) => {
  const timing = new ServerTiming();
  // Attach the tracker so route handlers can call req.serverTiming.start()/end().
  req.serverTiming = timing;
  timing.start('total');

  // Wrap res.send so the header is written just before the response goes out.
  // Express's res.json calls res.send internally, so JSON responses are covered too.
  const originalSend = res.send;
  res.send = function (body) {
    timing.end('total');
    if (!res.headersSent) {
      res.setHeader('Server-Timing', timing.toHeaderString());
    }
    return originalSend.call(this, body);
  };

  next();
};
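To make the tracker available on every request, register the middleware before the routes (a quick sketch; the import path is whatever your file layout dictates):

import express from 'express';
import { serverTimingMiddleware } from './server-timing-middleware'; // hypothetical path

const app = express();

// Must come before the route handlers so req.serverTiming exists inside them.
app.use(serverTimingMiddleware);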
Now, in any route, we can time our operations. Imagine a route that fetches user data. The code becomes wonderfully transparent.
app.get('/api/user/:id', async (req, res) => {
  const timing = req.serverTiming;

  // Check the cache first.
  timing.start('cache');
  let user = cache.get(req.params.id);
  timing.end('cache');

  if (!user) {
    // Cache miss: fall back to the database and repopulate the cache.
    timing.start('database');
    user = await db.users.find(req.params.id);
    timing.end('database');

    timing.start('cache_set');
    cache.set(req.params.id, user);
    timing.end('cache_set');
  }

  // Time the serialization step as well.
  timing.start('serialize');
  const response = serializeUser(user);
  timing.end('serialize');

  res.json(response);
});
With this in place, every /api/user/:id request will send back a detailed breakdown. You can open your browser’s DevTools, click on the network request, and see exactly where the time was spent. It’s incredibly powerful for debugging specific, slow requests. But what about the bigger picture? What if you want to know the average database query time over the last hour, or get an alert if the cache miss rate spikes?
This is where we move from debugging to monitoring. We need to collect these metrics and store them. Prometheus is a perfect tool for this job. It’s a time-series database designed for metrics. We can use the prom-client library to expose our metrics on a special endpoint, which Prometheus will “scrape” or collect at regular intervals.
First, we define what we want to measure. We’ll create metrics for request duration, database query time, and cache performance. Histograms are ideal for capturing timing distributions, and a Counter works well for tallying events such as cache hits and misses.
import client from 'prom-client';

const register = new client.Registry();

// Overall request latency, labelled by route and HTTP method.
const httpRequestDuration = new client.Histogram({
  name: 'http_request_duration_ms',
  help: 'Request duration in ms',
  labelNames: ['route', 'method'],
  buckets: [10, 50, 100, 200, 500, 1000, 2000]
});

// Database query latency, labelled by the operation being performed.
const dbQueryDuration = new client.Histogram({
  name: 'db_query_duration_ms',
  help: 'Database query duration in ms',
  labelNames: ['operation'],
  buckets: [1, 5, 10, 20, 50, 100]
});

// Cache hits and misses, so the miss rate can be charted over time.
const cacheEvents = new client.Counter({
  name: 'cache_events_total',
  help: 'Cache lookups by result (hit or miss)',
  labelNames: ['result']
});

register.registerMetric(httpRequestDuration);
register.registerMetric(dbQueryDuration);
register.registerMetric(cacheEvents);
Now, we need to connect our Server-Timing measurements to these Prometheus metrics. We can create another middleware, or extend our existing one, to observe the request as it finishes and record its duration to the Prometheus histogram.
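Here is a minimal sketch of that idea; requestMetricsMiddleware is just an illustrative name, and it assumes the httpRequestDuration histogram from the previous snippet is in scope:

// Sketch: record every request's total duration into the Prometheus histogram.
// Assumes serverTimingMiddleware is registered first, so the 'total' timer has
// already been stopped (inside the wrapped res.send) by the time 'finish' fires.
export const requestMetricsMiddleware = (req, res, next) => {
  res.on('finish', () => {
    const totalMs = req.serverTiming?.getDuration('total');
    if (totalMs !== undefined) {
      httpRequestDuration.observe(
        { route: req.route?.path ?? req.path, method: req.method },
        totalMs
      );
    }
  });
  next();
};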
More importantly, within our route handlers, we can now record our specific operations. After we get the duration from our ServerTiming object, we can also push that data to Prometheus.
// Inside our user route, after the database call
const dbTime = timing.getDuration('database');
// Guard against the timer never having been started (e.g. on a cache hit).
if (dbTime !== undefined) {
  dbQueryDuration.observe({ operation: 'find_user' }, dbTime);
}
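The cache counter defined earlier can be fed the same way. A sketch of where it might sit, right after the cache lookup in the same route:

// Inside our user route, immediately after the cache lookup
cacheEvents.inc({ result: user ? 'hit' : 'miss' });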
The beauty of this dual approach is separation of concerns. The Server-Timing header is for the immediate, human-readable debug log of a single request. The Prometheus metric is the aggregated, machine-readable data for long-term storage and analysis. We set up an endpoint, usually /metrics, that exposes all this collected data in a format Prometheus understands.
// Expose all collected metrics for Prometheus to scrape at regular intervals.
app.get('/metrics', async (req, res) => {
  res.set('Content-Type', register.contentType);
  res.end(await register.metrics());
});
Finally, we need a way to visualize this data. That’s the last piece of the puzzle. We use Grafana, a powerful dashboarding tool that connects directly to Prometheus. We can build a dashboard that shows, in real-time, the average response time for our API, the 95th percentile of database latency, the rate of cache hits versus misses, and more. We can set up alerts in Grafana to notify us if any of these metrics cross a dangerous threshold.
The result is a full-cycle monitoring system. A user reports a slow page? You can ask for the Server-Timing header from their browser’s network tab. You see a spike in database latency on your Grafana dashboard? You can drill down and see if it’s related to a specific route or a new deployment. The two tools answer different but complementary questions: “What just happened to my request?” and “What is happening to all requests?”
Building this changed how I think about backend development. It moves performance from being an abstract concern to something visible and measurable. It empowers developers to find and fix bottlenecks with precision. Have you ever wondered what the true cost of a new external API dependency is? Or if your database indexing is actually working? This system gives you the data to answer those questions.
Implementing this doesn’t require a massive overhaul. You can start with the Server-Timing middleware in a single, critical route. Add Prometheus metrics for one operation you’re worried about. The incremental nature of it is a huge advantage. You get immediate value from each piece you build.
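For example, Express lets you attach middleware to a single route rather than the whole application, which is a low-risk way to begin (a sketch; the route is illustrative):

// Start with one critical endpoint instead of app.use().
app.get('/api/user/:id', serverTimingMiddleware, async (req, res) => {
  // ... existing handler body ...
});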
I encourage you to try adding just the Server-Timing header to one of your API endpoints. The first time you see that breakdown in your DevTools, it feels like turning on the lights in a dark room. From there, the path to full production monitoring is clear and incredibly rewarding. It transforms your application from a silent machine into one that tells you exactly how it’s feeling.
If you found this walkthrough helpful, please share it with a colleague who’s passionate about performance. Have you implemented a similar system? What challenges did you face? Let me know in the comments—I’d love to hear about your experiences and compare notes.