There is a particular kind of confidence that comes from looking at a dashboard and knowing — not guessing, not hoping — that the system is healthy. That deployments are faster this quarter. That the error rate dropped after last week’s fix. That the change you shipped actually moved the number you cared about.
This is what measurement gives you. Not certainty, but evidence. And evidence is what separates engineering from superstition.
Why it matters
Without measurement, engineering decisions become folklore. “We think the API is slow” is not the same as “p99 latency crossed 800ms after the last deploy.” The first is a feeling. The second is something you can act on.
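The p99 claim above is easy to compute directly. Here is a minimal sketch using the nearest-rank percentile method over a window of request timings; the sample latencies are made up for illustration:

```python
# Sketch: computing p99 latency from a window of request timings.
# The latency values below are illustrative, not real measurements.

def percentile(samples, p):
    """Return the p-th percentile (0-100) of samples, nearest-rank method."""
    ordered = sorted(samples)
    # Nearest-rank index: ceil(p/100 * n) - 1, clamped to a valid index
    k = max(0, min(len(ordered) - 1, -(-p * len(ordered) // 100) - 1))
    return ordered[k]

latencies_ms = [120, 95, 140, 870, 110, 105, 98, 910, 130, 125]
p99 = percentile(latencies_ms, 99)
print(p99)  # the worst-case tail, not the comfortable average
```

The point of p99 over a mean is exactly the point of the paragraph above: the average here looks fine, while the tail tells you some users are waiting close to a second.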
Teams that do not measure tend to optimize for the wrong things. They chase architectural purity while users churn. They rewrite systems that were fast enough and ignore systems that are actually on fire. Measurement gives you triage: it tells you where to look and when to stop.
What to measure
Not everything. The trap is building dashboards for every metric your infrastructure can emit, then never looking at any of them. Instead, pick a small set of signals that tell you whether you are delivering value.
A good starting point is the four DORA metrics:
- Deployment frequency — how often you ship to production
- Lead time for changes — how long from commit to deploy
- Change failure rate — what percentage of deploys cause incidents
- Time to restore — how quickly you recover when things break
These four numbers, tracked honestly over time, tell you more about your engineering health than any architecture diagram. They measure outcomes, not activity.
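All four DORA metrics fall out of one record per deploy. The sketch below assumes a hypothetical record shape (`commit_at`, `deployed_at`, `caused_incident`, `restore_minutes`); these field names are illustrative, not a standard schema:

```python
from datetime import datetime

# Hypothetical deploy records over a one-week window.
# Field names are illustrative assumptions, not a standard schema.
deploys = [
    {"commit_at": datetime(2024, 5, 1, 9), "deployed_at": datetime(2024, 5, 1, 14),
     "caused_incident": False, "restore_minutes": 0},
    {"commit_at": datetime(2024, 5, 2, 10), "deployed_at": datetime(2024, 5, 3, 11),
     "caused_incident": True, "restore_minutes": 45},
    {"commit_at": datetime(2024, 5, 4, 8), "deployed_at": datetime(2024, 5, 4, 9),
     "caused_incident": False, "restore_minutes": 0},
]
window_days = 7

# Deployment frequency: deploys per day over the window
deploy_frequency = len(deploys) / window_days

# Lead time for changes: mean hours from commit to deploy
lead_hours = [(d["deployed_at"] - d["commit_at"]).total_seconds() / 3600
              for d in deploys]
mean_lead_time_hours = sum(lead_hours) / len(lead_hours)

# Change failure rate: share of deploys that caused an incident
failures = [d for d in deploys if d["caused_incident"]]
change_failure_rate = len(failures) / len(deploys)

# Time to restore: mean minutes to recover, over failed deploys only
time_to_restore_minutes = sum(d["restore_minutes"] for d in failures) / len(failures)
```

Nothing here requires special tooling: if your deploy pipeline logs a timestamp and your incident tracker links back to a deploy, you can compute all four numbers with a script like this.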
Beyond DORA, measure what matters to your users. Page load time. Conversion rate. Time to first meaningful interaction. The specific metrics depend on your product, but the principle is the same: measure the outcome, not the effort.
The vanity metric trap
Some metrics feel productive but measure nothing useful:
- Lines of code — more code is not better code
- Story points completed — a measure of estimation accuracy, not value delivered
- Hours worked — effort is not output
- Number of PRs merged — activity is not progress
These metrics are dangerous because they create incentives that diverge from what actually matters. When you measure lines of code, people write verbose code. When you measure story points, people inflate estimates. The metric becomes the target, and once it is a target it stops being a useful measure.
A good metric is one where improving the number genuinely means improving the thing you care about. If you can game the metric without improving the outcome, it is the wrong metric.
How to do it without drowning
Pick three to five metrics. Put them on a dashboard that your team actually looks at — during standups, in weekly reviews, wherever decisions get made. Review trends, not snapshots. A single data point means nothing; a trend over weeks tells a story.
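One cheap way to review trends instead of snapshots is a trailing moving average, which smooths day-to-day noise enough that the direction becomes visible. A minimal sketch, with made-up weekly numbers:

```python
# Sketch: smoothing a noisy weekly metric with a trailing moving average
# so the trend, not the latest data point, drives the discussion.
# The weekly values below are made up for illustration.

def moving_average(values, window):
    """Trailing moving average; the first window-1 points are skipped."""
    return [sum(values[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(values))]

weekly_failure_rate = [0.10, 0.14, 0.09, 0.12, 0.20, 0.22, 0.25]
trend = moving_average(weekly_failure_rate, window=3)
```

A single week at 0.09 would look like good news; the smoothed series makes it clear the failure rate has been climbing for a month, which is the story worth discussing in the weekly review.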
Set alerts for the metrics that indicate something is broken, not for the ones that indicate something is merely interesting. Alert fatigue is real, and it is the fastest way to make your measurement system useless.
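One common guard against alert fatigue is to require the metric to stay past its "broken" threshold for several consecutive readings before paging anyone, so a transient spike stays merely interesting. A minimal sketch; the threshold and readings are illustrative assumptions:

```python
# Sketch: fire an alert only when a metric exceeds its threshold for
# several consecutive readings, so one-off spikes do not page anyone.
# The 800ms threshold and sample readings are illustrative assumptions.

def should_alert(readings, threshold, consecutive=3):
    """True only if the last `consecutive` readings all exceed threshold."""
    if len(readings) < consecutive:
        return False
    return all(r > threshold for r in readings[-consecutive:])

p99_ms = [240, 260, 910, 250, 880, 905, 930]
print(should_alert(p99_ms, threshold=800))  # sustained breach, so alert
```

The single 910 ms spike earlier in the series would not have paged anyone; three bad readings in a row do. Most monitoring systems express the same idea as a "for this long" condition on an alert rule.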
Revisit your metrics quarterly. What you need to measure changes as your product and team evolve. A metric that was critical six months ago might be irrelevant now. Kill it and replace it with one that matters.
The goal is not a perfect measurement system. The goal is to make decisions with evidence instead of intuition. Start small, be honest about what the numbers say, and let the data change your mind.