Choosing What to Measure

Purpose

Choosing What to Measure is a model for selecting the small number of metrics that should sit at the top of an organisation's cascade. It answers a question that most leadership teams either skip or get wrong: Which metrics, if we got them right, would tell us whether the strategy is actually working?

Most organisations measure too much, or measure the wrong things, or measure the right things badly. Dashboards proliferate. Every function adds its own indicators. Executive reviews become exercises in reading hundreds of numbers and seeing nothing. Or worse, the leadership team agrees on metrics that feel right at the time and then quietly stops referring to them, because the metrics were not the ones that actually move when the strategy works.

This is not a problem of analytics. It is a problem of strategy. Choosing metrics is the act of deciding what success looks like, in numbers, in advance. The conversation that produces good metrics is not a measurement conversation. It is a strategic conversation in disguise, and for most leadership teams, it is the most honest one they will have all year.

The Alignment Stack starts with metrics for a reason. Without the right metrics at the top, the cascade has nothing to cascade from. Goals become activity statements. Teams fight over priorities that nobody can adjudicate. Trade-offs get made politically rather than strategically. A team that has genuinely agreed on its metrics has done half the work of alignment. Everything else flows from that agreement.

This framework is the upstream foundation for the Alignment Stack. It describes how to do the work that the Alignment Stack assumes has already been done.


Why This Conversation Is Hard

If choosing metrics were easy, every organisation would have done it well. Most have not. The reasons it is hard are not analytical. They are political and structural.

The conversation surfaces disagreement. Most executive teams operate on a polite consensus that no specific metric exposes. Choosing metrics breaks the consensus. The marketing leader who has been measuring impressions and the sales leader who has been measuring closed deals are now, for the first time, being asked to agree on a single number that represents whether the company is winning. That number will favour one of them over the other, and the disagreement that surfaces in the metrics conversation is the disagreement they have been avoiding.

The conversation makes accountability concrete. A vague mandate, like grow the business, delight customers, or build the best team, leaves the leader free to claim success in whatever direction the work happened to go. A specific metric removes that freedom. Most leaders, asked privately, prefer the vague mandate, even when they know the specific metric is more honest.

The conversation reveals which leaders cannot articulate what they do. A leader whose function genuinely produces value can name a metric that captures it. A leader whose function has drifted into activity-for-its-own-sake cannot. The metrics conversation is the thing that exposes the drift, which is uncomfortable for everyone involved.

The conversation requires giving up control. A metric is a public commitment. The leader who agrees to be measured against a number gives up the right to redefine success after the fact. Many leaders will resist this without admitting they are resisting it, by proposing metrics that are vague enough to be unfalsifiable, or by adding so many metrics that no single one can hold them to account.

The work of choosing metrics is, in large part, the work of pushing past these resistances. The framework that follows is the practical discipline. The hard part is whether the leadership team is willing to sit through the conversation the discipline requires.


What a Top-of-Cascade Metric Has to Do

Not every metric belongs at the top of the cascade. Most metrics, including good ones, belong further down. The discipline is in distinguishing the small set of metrics that earn the top tier from the much larger set that supports them.

A top-of-cascade metric earns its place when three things are true.

One. Movement on the metric would change a strategic decision. This is the sharpest test. If the number went up by twenty percent, would the executive team do anything differently? If it went down by twenty percent, would they? A metric that fails this test is not measuring strategy. It is being reported. The dashboards of most organisations are dominated by reported metrics. The cascade should not be.

Two. Every senior leader can plausibly affect it through the work of their function. This is the alignment test. A metric that sits at the top has to be something the executive team owns collectively. Revenue passes this test for most companies, because every function contributes to revenue through different means. Engineering velocity does not, because the marketing leader has no useful relationship to it. A top-of-cascade metric is the one place where the executive team's work converges.

Three. The metric cannot be moved by gaming activity alone. This is the integrity test. A metric that can be moved by doing more activity rather than producing more value will, in time, be moved that way. A target for calls answered in under thirty seconds can be hit by hanging up on callers faster, freeing agents to pick up the next call. Tickets closed can be moved by closing tickets prematurely. A serious top-of-cascade metric resists this kind of gaming because it measures something real, usually an outcome that the customer or the market validates rather than one the organisation declares for itself.

A metric that fails any of these tests does not belong at the top. It might still belong somewhere in the cascade, as a leading indicator, an operational measure or a team-level diagnostic. Most metrics belong somewhere. The mistake is putting them at the top.
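The three tests can be written down as a screening structure. A minimal sketch in Python, on the understanding that each test is a judgment the leadership team makes in conversation, not something a system computes, and that the field names are illustrative:

```python
from dataclasses import dataclass

@dataclass
class CandidateMetric:
    name: str
    changes_strategic_decisions: bool  # test one: would a 20% move change what we do?
    whole_team_can_affect: bool        # test two: does every function plausibly move it?
    resists_gaming: bool               # test three: is activity alone unable to move it?

def top_of_cascade(candidates: list[CandidateMetric]) -> list[str]:
    """Keep only the candidates that pass all three tests.
    Everything else still belongs somewhere in the cascade,
    just not at the top."""
    return [c.name for c in candidates
            if c.changes_strategic_decisions
            and c.whole_team_can_affect
            and c.resists_gaming]
```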


Outcomes, Not Activity

The most common failure in metrics is measuring activity rather than outcomes.

Activity metrics describe what the organisation did. Calls made, features shipped, content published, meetings held, decks produced, hires made. Outcome metrics describe what changed because of what the organisation did. Customers retained, revenue earned, problems solved, capability built.

The trap is that activity metrics are easier. They sit closer to the work, the data is cleaner, the team has more direct control over the numbers. So they proliferate. A function ends up with twenty activity metrics and zero outcome metrics, and the team cannot tell whether any of the activity is producing the result it was supposed to produce.

The diagnostic question is the one Eric Ries made famous in a slightly different form. If this number went up but nothing else changed, would we be happy? If the answer is no, the metric is measuring activity. If we shipped twice as many features but customer retention stayed flat and revenue stayed flat, would we be happy? No. So features shipped is not the metric. The metric is what features shipped is supposed to produce, which is something downstream, like engagement, retention or revenue. Measure that.

There are cases where activity is, legitimately, the metric. A regulated industry that has to demonstrate due diligence may have to count activities for compliance reasons. A function in its first year of existence may need activity targets to establish a baseline. These are valid exceptions, but they are far rarer than the volume of activity metrics most organisations carry would suggest, and even in these cases the activity metric should never be the one the function uses to claim success.


Leading and Lagging

A serious metrics framework names both leading and lagging indicators, and connects them.

Lagging indicators are the outcomes the organisation ultimately cares about. Revenue, retention, market share, NPS. They tell the leadership team what already happened. The trouble with lagging indicators alone is that, by the time they move, the work that produced the movement is months in the past. The team cannot course-correct against a lagging indicator. They can only learn from it, after the fact.

Leading indicators are the earlier signals that predict the lagging outcome. Pipeline coverage predicts revenue. Activation rate predicts retention. Engagement among new users predicts six-month churn. Leading indicators give the leadership team the chance to steer rather than only report.

The discipline is in finding the leading indicators that are genuinely predictive, rather than the ones that feel predictive. Most organisations have a long list of metrics they call leading indicators, and only some of them actually lead. The ones that lead are the ones for which a credible mechanism exists, not just a correlation, but a story for why movement on the leading indicator should produce movement on the lagging indicator three months later. Without that mechanism, the leading indicator is decorative.
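Whether a candidate actually leads can be pressure-tested against history before the mechanism conversation. A minimal sketch using pandas, assuming weekly observations of both series exist in a DataFrame; the column names and the twenty-six-week horizon are illustrative assumptions:

```python
import pandas as pd

def lagged_correlation(df: pd.DataFrame, leading: str, lagging: str,
                       max_lag_weeks: int = 26) -> pd.Series:
    """Correlate a candidate leading indicator with future values of
    the lagging outcome. A genuine leading indicator should correlate
    more strongly at some positive lag than at lag zero."""
    return pd.Series({
        lag: df[leading].corr(df[lagging].shift(-lag))
        for lag in range(0, max_lag_weeks + 1)
    })

# Illustrative usage, with hypothetical column names:
# corr = lagged_correlation(weekly, "activation_rate", "retention")
# print(corr.idxmax(), corr.max())  # the lag at which the signal peaks
```

A lagged correlation is the weakest acceptable evidence, not a substitute for the mechanism. But a candidate that shows no lagged relationship at all can usually be retired without waiting for the story.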

A working framework names two or three lagging indicators that define what success looks like and four or five leading indicators that the executive team uses to know whether they are on track. The leading indicators are reviewed weekly. The lagging indicators are reviewed quarterly. Both feed into the cascade.


The Small-Number Discipline

The discipline most violated, and most consequential, is the discipline of keeping the number of top-tier metrics small.

A leadership team that cannot recite its top metrics from memory does not, in any practical sense, have top metrics. The metrics exist on a slide. They do not exist in the working consciousness of the team that is supposed to be using them. When trade-offs come up, the team falls back on intuition and politics, because nobody can hold twenty metrics in mind while making a decision.

The right number is between three and seven. Three is tight. Seven is the upper bound at which the team can still trade off between metrics in real time. Beyond seven, the metrics list has stopped being a strategic instrument and started being a reporting checklist.

The pressure on the small-number discipline is constant. Every function will lobby for its metric to be on the top tier. Every external stakeholder, including investors, board members and regulators, will have one they want included. Every quarter, someone will propose adding a metric to capture a new priority. The accretion is invisible from inside any single decision and ruinous over time. A leadership team that started with five metrics and has eighteen by year three has not become more sophisticated. It has lost the discipline.

The defence is to treat the addition of any metric as requiring the removal of another. Five metrics, period. If something new earns its place, something old has to go. The conversation that produces this trade-off is the same kind of strategic conversation that produced the original five. The act of removing a metric is, itself, a strategic act, because it is a public statement that this thing no longer matters as much as it used to.
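The one-in-one-out rule is simple enough to encode. A toy sketch, with hypothetical metric names, that makes the trade explicit rather than leaving it to accretion:

```python
class TopTier:
    """A fixed-capacity top tier: the only way in is to name what goes out."""

    def __init__(self, metrics: list[str], cap: int = 5):
        if len(metrics) > cap:
            raise ValueError(f"top tier holds at most {cap} metrics")
        self.metrics = list(metrics)

    def swap(self, add: str, remove: str) -> None:
        """Adding a metric is a trade, not an append."""
        if remove not in self.metrics:
            raise ValueError(f"{remove!r} is not in the top tier")
        self.metrics.remove(remove)
        self.metrics.append(add)

tier = TopTier(["net_revenue_retention", "new_arr", "activation_rate",
                "gross_margin", "nps"])
tier.swap(add="expansion_revenue", remove="nps")  # a strategic act, made explicit
```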


Organisational Health vs Operational Metrics

A common confusion that wrecks metric frameworks is the conflation of organisational health metrics with operational metrics.

Organisational health metrics are the small number that sit at the top of the cascade. They define what success looks like for the company or the function as a whole. They are shared across the executive team. There are three to seven of them.

Operational metrics are the much larger set that teams use to manage their day-to-day work. They are local. They are diagnostic. There are dozens or hundreds of them across an organisation. A support team's first-response time, an engineering team's deployment frequency, a marketing team's campaign click-through rate. These are operational metrics. They are essential at the team level. They have no place at the top of the cascade.

The conflation happens when teams, wanting visibility for their work, lobby for their operational metrics to be reviewed at the executive level. Or when an executive, wanting to feel close to the work, asks to see metrics that should stay at the team level. The result is the rocks-in-jar problem. Every team's local metric ends up at the executive review. The executive review becomes a tour through hundreds of numbers. The signal that should have come from the small number of organisational health metrics is drowned out by the noise of operational metrics that nobody at the executive level can act on.

The discipline is to enforce the separation. Organisational health metrics are what the executive team reviews. Operational metrics are what teams use to run their work. The cascade connects them, with operational metrics rolling up into the goals that serve the organisational health metrics, but the layers are not the same and should not be reviewed in the same forum.
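The separation, and the rollup that connects the layers, can be represented directly. A minimal sketch, with hypothetical metric names, that flags operational metrics tracing to no organisational health metric, which is the rollup failure described under Common Failure Modes below:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Metric:
    name: str
    layer: str                         # "health" or "operational"
    rolls_up_to: Optional[str] = None  # the health metric this one serves

METRICS = [
    Metric("net_revenue_retention", "health"),
    Metric("first_response_time", "operational", "net_revenue_retention"),
    Metric("deployment_frequency", "operational"),  # orphan: no rollup named
]

def orphans(metrics: list[Metric]) -> list[str]:
    """Operational metrics that trace to no organisational health metric.
    Each one is either mis-filed or measuring something the organisation
    has not said it values."""
    health = {m.name for m in metrics if m.layer == "health"}
    return [m.name for m in metrics
            if m.layer == "operational" and m.rolls_up_to not in health]

print(orphans(METRICS))  # ['deployment_frequency']
```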


Goodhart's Law and the Limits of Measurement

The most famous warning in the metrics literature is Goodhart's Law: when a measure becomes a target, it ceases to be a good measure. The phenomenon is real and it is the reason why the third earned-place test (cannot be moved by gaming) matters.

A metric, once it has consequences attached, will be optimised. The optimisation will sometimes produce more of the underlying value the metric was supposed to capture. It will, often, produce more of the metric without the underlying value. Call centres asked to reduce average handle time learn to hang up on customers who would have benefited from a longer call. Sales teams asked to hit quarterly quotas learn to time deals for end-of-quarter even when the customer would have closed earlier. Engineering teams asked for deployment frequency learn to split changes into trivial deployments to inflate the count.

The defence against gaming is not to abandon metrics. It is to choose metrics that resist gaming and to pair them with balancing metrics that catch the gaming when it happens.

A metric resists gaming when it measures an outcome the organisation cannot fake unilaterally. Customer retention is hard to game because the customer has to come back. Revenue is hard to game except through accounting tricks that have separate consequences. Internal activity metrics are easier to game, because the organisation controls both the activity and the count.

Balancing metrics catch the gaming. If the call centre is being measured on average handle time, the balancing metric is customer satisfaction or first-call resolution. If both move in the wrong direction together, handle time drops and satisfaction drops with it, the gaming is visible. A serious metric framework pairs each gameable metric with a balancing metric, and the balance is part of how the metric is read.
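Reading the pair together is mechanical enough to sketch. An illustrative example for the call-centre case, assuming handle time is better when lower and satisfaction better when higher; the numbers are hypothetical:

```python
def read_with_balance(handle_time: tuple[float, float],
                      satisfaction: tuple[float, float]) -> str:
    """Read a gameable metric alongside its balancing metric.
    Each argument is a (previous, current) pair. The suspect pattern
    is the primary metric 'improving' while the balance deteriorates."""
    handle_improved = handle_time[1] < handle_time[0]
    satisfaction_fell = satisfaction[1] < satisfaction[0]
    if handle_improved and satisfaction_fell:
        return "suspect: the number moved without the underlying value"
    return "ok"

# Illustrative numbers: handle time fell from 6.0 to 4.1 minutes while
# CSAT fell from 4.4 to 3.8, the pattern the balancing metric exists to catch.
print(read_with_balance((6.0, 4.1), (4.4, 3.8)))  # suspect
```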


The Functional Leader's Hard Case

Choosing metrics is hardest for functions whose work is genuinely difficult to measure. Design, legal, HR, internal communications, operations, finance, security. These functions often act as if their work is unmeasurable, on the reasoning that it is qualitative or that the outcomes are slow or that the value is in what does not happen rather than what does.

Some of this is honest difficulty. The legal team that prevents three lawsuits this year cannot easily measure the lawsuits that did not happen. The security team that prevents a breach is, in the best case, invisible. The operations team that keeps the office running well produces no positive signal, only the absence of complaint.

Some of it is convenient evasion. A function that cannot be measured cannot be held to account. A leader whose work cannot be assessed in numbers retains discretion over how their success is described.

The honest functional leader pushes through this. The work produces outcomes. The outcomes can be measured, even when the measurement is harder than for sales or engineering. Design produces decisions that customers respond to in measurable ways. Legal produces a risk profile that can be captured in incidents avoided, contracts closed, time-to-decision on legal review. HR produces a pipeline, a retention rate, an internal mobility figure, a manager-effectiveness score. The metrics will not look like revenue. They will measure something real.

The functional leader who cannot name a metric that captures the value their function produces has a problem. Either the function is not producing distinctive value, or the leader has not done the work of articulating what that value is. Both problems are worth surfacing. Both will be surfaced by the act of trying to choose a metric.


How to Run the Conversation

The mechanics of choosing metrics matter less than the conversation that produces them. A working approach has three moves.

One. Ask each leader to write down their proposed top-tier metrics independently, before any discussion. The variance across the answers is diagnostic. If five executives produce five overlapping lists, the team has more shared understanding than they realised. If they produce five non-overlapping lists, the team has been operating without a shared definition of success, and the metrics conversation is going to surface every strategic disagreement that has been suppressed. Either outcome is useful, and a rough way to put a number on the overlap is sketched after the three moves.

Two. Discuss the variance, not the metrics. The first conversation is not about which metrics to pick. It is about why the lists are different. The marketing leader who proposed brand metrics and the sales leader who proposed pipeline metrics are not disagreeing about metrics. They are disagreeing about whether the company's growth comes from demand generation or sales execution. That disagreement is the conversation worth having. The metrics will follow.

Three. Choose deliberately, then commit publicly. The chosen metrics go on the wall. They get reviewed at every executive forum. They appear at the top of the planning document. The choice is treated as durable. Not unchangeable, but durable. Changing the metrics is itself a strategic act and requires the same kind of conversation that produced them in the first place.
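As referenced in the first move, the overlap across the independently written lists can be put on a rough scale before the discussion starts. An illustrative sketch using pairwise Jaccard similarity; the leaders, names and lists are hypothetical:

```python
from itertools import combinations

def list_overlap(proposals: dict[str, set[str]]) -> float:
    """Mean pairwise Jaccard overlap across leaders' proposed lists.
    Near 1.0 suggests shared understanding; near 0.0 suggests the
    suppressed disagreement the conversation needs to surface."""
    pairs = list(combinations(proposals.values(), 2))
    return sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)

proposals = {
    "ceo":   {"net_revenue_retention", "new_arr", "nps"},
    "sales": {"new_arr", "pipeline_coverage", "win_rate"},
    "cmo":   {"brand_awareness", "mql_volume", "nps"},
}
print(round(list_overlap(proposals), 2))
```

The number is not the point; the conversation about why the lists differ is. But a team that scores near zero should expect the second move to take longer than the agenda allows.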

The conversation that follows this template is uncomfortable in the right ways. It exposes disagreement that needed to be exposed. It commits the team to a definition of success that they will be held to. It produces a small set of numbers that everyone can recite, that genuinely change decisions and that serve as the foundation for everything else the cascade will do.


Common Failure Modes

The metrics list nobody can recite. The team agreed on metrics in a planning offsite and has not looked at them since. The metrics live in a deck. They do not live in the working consciousness of the team. Fix: Put the metrics in front of the team at every executive forum. If they are not central enough to be reviewed weekly or monthly, they are not the right metrics.

The metric that became a slogan. A metric was chosen because it was inspirational, not because it was diagnostic. Customer happiness. Engineering excellence. The number, when reported, produces no decisions, because nobody quite knows what would change if it moved. Fix: Apply the strategic decision test. If the number moved by twenty percent, what would change? If the answer is unclear, the metric does not belong at the top.

The metric that everyone games. The metric had consequences attached without a balancing metric. The team is now optimising for the metric in ways that produce the number without producing the underlying value. Fix: Pair the metric with a balancing metric and review them together. If the balance breaks, the gaming is visible.

The metrics list that grew to twenty. The team added metrics over time without removing any. The discipline has eroded. Fix: Reduce the list. Five metrics, or seven. Anything that does not earn one of those slots is moved to the operational layer or removed.

The functional metric that does not roll up. A function chose its own metric and never connected it to the organisational health metrics at the top of the cascade. The function reports against its own metric and feels successful. The organisation continues to underperform. Fix: Every function-level metric must trace back to an organisational health metric. If it cannot, the function is measuring something the organisation does not value.

The metric chosen because it is easy to measure. The team picked a metric the analytics platform already produced, rather than the metric that captured what the strategy required. The metric is precise and irrelevant. Fix: Choose the right metric first. Build the measurement after. If the right metric is hard to measure, the work of measuring it is the work, and it is worth doing.


Signs Choosing What to Measure Is Working

  • The leadership team can recite the top-tier metrics from memory
  • Every senior leader can articulate how their function affects each top-tier metric
  • Executive forum discussions reference the metrics as the basis for trade-offs
  • Disagreements between leaders are framed in terms of which metrics matter more, not which leader has more political capital
  • New initiatives are evaluated against the question: which metric does this move?
  • The list has been pruned at least once in the last year
  • Leading indicators are reviewed weekly; lagging indicators are reviewed quarterly
  • Functional leaders have named outcome metrics for their own functions, even when those functions are hard to measure
  • The team has had at least one uncomfortable strategic conversation that the metrics surfaced

Signs Choosing What to Measure Is Broken

  • Top-tier metrics exist on a slide but are not referenced in working conversations
  • The list of top-tier metrics has more than seven entries
  • Some top-tier metrics measure activity rather than outcomes
  • Functional leaders cannot articulate how their work affects the company's metrics
  • The same metrics have been in place for three years without revision
  • Conversations about priorities are political rather than metrics-based
  • Operational metrics are reviewed at the executive level
  • Leading indicators have been chosen but the mechanism by which they predict lagging outcomes has never been articulated
  • Metrics with consequences attached are moving in the right direction while the underlying outcomes are not improving
  • The team has never had the uncomfortable strategic conversation that good metrics produce

Choosing What to Measure and the Alignment Stack

This framework is upstream of the Alignment Stack. The Alignment Stack describes how metrics cascade through goals, teams and projects. It assumes the metrics at the top of the cascade are the right ones. This framework is how you choose them.

An organisation that has done this work well has a small number of metrics that the executive team can recite, that genuinely reflect strategy and that resist gaming. The cascade flows from those metrics. Goals are written to move them. Teams own them. Projects trace back to them.

An organisation that has skipped this work has a cascade that flows from metrics that were never properly chosen. The cascade may be technically well-built. It will not produce strategic discipline, because the foundation it rests on is weak. The Alignment Stack cannot fix bad top-tier metrics. It can only propagate them more efficiently.

Do this work first. The rest of the pillar depends on it.


Summary

Choosing What to Measure is the upstream discipline that makes strategic alignment possible. It answers the question of what success looks like, in numbers, in advance.

The discipline is in selecting the small number of metrics that earn the top of the cascade. Metrics that change strategic decisions when they move, that every senior leader can affect through their function's work and that resist gaming. Most metrics fail one of these tests. The work is in distinguishing the few that pass from the many that do not.

The conversation that produces good metrics is harder than the analytics it appears to be. It is a strategic conversation in disguise. It surfaces disagreement that has been suppressed, commits leaders to definitions of success they would have preferred to leave vague, and exposes functions that have drifted into activity for its own sake. Most leadership teams resist this conversation. The leaders who push through it produce a foundation that everything else in the cascade can rest on.

Done well, the result is a small number of metrics that the executive team can recite, that genuinely move when the strategy works and that hold the organisation to a definition of success that is both honest and shared.


About the Author

Michael O'Connor

Founder of Clarity Forge. 30+ years in technology leadership at Microsoft, GoTo and multiple startups. Passionate about building tools that bring clarity to how organisations align, execute, grow and engage.