MTTR is a lagging indicator

Mean time to respond is the metric SOCs report by default and struggle to steer by. It is an average over incidents that already happened, dominated by queue time, and blind to the question that should keep a security leader up at night: of all the alerts we closed quickly, how many deserved more than the look they got?

A fast wrong answer improves your MTTR. That is the whole problem with MTTR.

What the average is hiding

MTTR blends two very different quantities: how long alerts sit waiting, and how long the actual work takes once someone picks it up. In most queues, waiting dwarfs working. So the metric mostly measures staffing against volume, and teams under pressure learn the shortcut it quietly rewards: close things faster by looking at them less. The metric improves as the practice degrades. An average that can be gamed by shallowness is not a safety metric. It is a throughput metric wearing a safety costume.
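The split described above can be made concrete with a minimal sketch. The `Alert` record, its three timestamps, and the sample values are all hypothetical, invented here for illustration; real ticketing systems name and store these fields differently.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass
class Alert:
    # Hypothetical fields, not taken from any particular SIEM or ticketing tool.
    created: datetime    # alert raised
    picked_up: datetime  # someone starts working it
    closed: datetime     # verdict recorded


def mttr_breakdown(alerts):
    """Return (mean total, mean queue, mean work), all in minutes."""
    queue = [(a.picked_up - a.created).total_seconds() / 60 for a in alerts]
    work = [(a.closed - a.picked_up).total_seconds() / 60 for a in alerts]
    total = [q + w for q, w in zip(queue, work)]
    return mean(total), mean(queue), mean(work)


# Made-up example: two alerts that waited a long time and were worked briefly.
t0 = datetime(2024, 1, 1, 9, 0)
alerts = [
    Alert(t0, t0 + timedelta(minutes=110), t0 + timedelta(minutes=120)),
    Alert(t0, t0 + timedelta(minutes=170), t0 + timedelta(minutes=180)),
]
total, queue, work = mttr_breakdown(alerts)
print(f"MTTR {total:.0f} min = queue {queue:.0f} + work {work:.0f}")
```

With these invented timestamps the mean is 150 minutes, of which 140 is queue and 10 is work. Halving the work time would move MTTR only from 150 to 145, which is why the number says so little about the quality of the look each alert got.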

It is also, strictly, a lagging one. By the time MTTR moves, the quarter is over and the incident reviews are written. Nothing in the number tells you where the next miss is forming.

The regime change

When agents do the first pass on every alert, queue time collapses toward zero, and MTTR with it. That feels like victory, and partly is. But it also means MTTR stops discriminating between a healthy operation and a sloppy one, because the thing it mostly measured, the waiting, is gone. The interesting questions move elsewhere, and they need different instruments.

Investigation depth. What share of alerts got a full investigation: context pulled, indicators enriched, a reasoned conclusion on the record. In the manual world this was sampled at best, a QA review of maybe one ticket in fifty. Now it is measurable on every alert, and it is the first thing to watch.

Escalation precision. Of the alerts routed to a human, how many genuinely needed one? Falling precision means the escalation thresholds are too sensitive and the queue is refilling with noise. Rising precision while the sampled re-review of auto-closed alerts stays clean is what earned autonomy looks like.

Gate latency. How long approved-action requests wait at the human gate. This is the new queue time, it is visible, and it is yours to staff. A containment that waited four hours for a click is a process finding, not a technology one.

Contested-call rate. How often the verdict spread comes back wide, with per-engine attribution so vendor signature churn does not masquerade as drift. Trending wider on a source or alert type is an early sign something changed: the telemetry, the threat, or the engines themselves.

Independent ground truth. The check on all four above. Sampled human re-review of auto-closed alerts, purple-team injects, retro hunts over the closed pile. The dials are computed by the system being graded; this is the audit that keeps them honest.
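As a rough sketch of how the four self-reported dials could be computed from per-alert records: the `Record` fields below are assumptions made for this example, not a schema from any product, and per-engine attribution for the contested-call rate is left out to keep the sketch short.

```python
from dataclasses import dataclass
from statistics import median
from typing import Optional


@dataclass
class Record:
    # All field names are hypothetical, chosen for this illustration.
    full_investigation: bool        # context pulled, indicators enriched, conclusion recorded
    escalated: bool                 # routed to a human
    needed_human: Optional[bool]    # reviewer's judgment; None if never escalated
    gate_wait_min: Optional[float]  # minutes an action request waited for approval
    contested: bool                 # verdict spread came back wide


def dials(records):
    """Compute the four dials over a non-empty list of records."""
    if not records:
        raise ValueError("no records to score")
    n = len(records)
    escalated = [r for r in records if r.escalated]
    waits = [r.gate_wait_min for r in records if r.gate_wait_min is not None]
    return {
        "investigation_depth": sum(r.full_investigation for r in records) / n,
        "escalation_precision": (
            sum(bool(r.needed_human) for r in escalated) / len(escalated)
            if escalated else None
        ),
        "median_gate_wait_min": median(waits) if waits else None,
        "contested_call_rate": sum(r.contested for r in records) / n,
    }


# Made-up sample of four alerts.
records = [
    Record(True, False, None, None, False),
    Record(True, True, True, 12.0, True),
    Record(True, True, False, None, False),
    Record(False, False, None, None, False),
]
print(dials(records))
```

On this invented sample, depth is 0.75, escalation precision is 0.5, the median gate wait is 12 minutes, and the contested-call rate is 0.25. The median is used for gate latency so that one long wait does not hide behind an average, which is a design choice of this sketch rather than anything the dials require.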

Earlier-moving, with an honest asterisk

These dials move earlier than MTTR. Depth tells you coverage is real. Precision tells you the routing is honest. Gate latency tells you where human process is the bottleneck. Contested-call rate tells you where the ground is shifting. A team steering by them finds out about degradation while it is still a tuning exercise, not a postmortem.

The asterisk: Goodhart's law does not exempt replacement metrics. Depth can become activity theater, because a beautifully formatted wrong conclusion still scores as a full investigation. Precision can be improved by simply escalating less. That is exactly why the fifth instrument is on the list. Self-reported dials need an external audit, and a team that adopts the dials without the sampled re-review has traded one gameable number for four.
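One way the sampled re-review could work, sketched under assumptions: alerts are drawn uniformly at random from the auto-closed pile, human reviewers count how many were wrongly closed, and a Wilson score upper bound is used to say how bad the true miss rate could plausibly be. The function names and the choice of the Wilson bound are illustrative, not a prescribed method.

```python
import random
from math import sqrt


def draw_review_sample(auto_closed_ids, k, seed=None):
    """Pick up to k auto-closed alert IDs uniformly at random for human re-review."""
    ids = list(auto_closed_ids)
    rng = random.Random(seed)
    return rng.sample(ids, min(k, len(ids)))


def wilson_upper(misses, n, z=1.96):
    """Upper end of the Wilson score interval for a miss rate (z=1.96 is about 95%)."""
    if n <= 0:
        raise ValueError("need at least one reviewed alert")
    p = misses / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center + half


# Hypothetical example: 5,000 auto-closed alerts, 100 sent for re-review.
sample = draw_review_sample(range(1, 5001), 100, seed=7)
misses_found = 0  # assumed outcome of the human review, for illustration only
print(f"reviewed {len(sample)}, upper bound on miss rate "
      f"{wilson_upper(misses_found, len(sample)):.3f}")
```

Even a clean sample of 100 only bounds the miss rate at about 3.7 percent under these assumptions, which is a useful reminder that "the re-review stayed clean" is a statement about sample size as much as about quality.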

And alert fatigue, the thing MTTR never measured at all, finally gets a denominator. Fatigue was always a depth problem: too many alerts each deserving minutes, each getting seconds. Track depth per alert and escalation precision, and you can see fatigue ending, not just feel it.

Keep reporting it anyway

None of this means you delete the MTTR slide. Boards expect it, frameworks ask for it, and it still catches gross failures. Keep reporting it; just stop steering by it. The dials are depth, precision, latency at the gate, contested-call rate, and the audit that keeps them honest. MTTR is the rear-view mirror.

Soarcery records the spread and every gate decision on the investigation trail. Request a demo and ask to see a contested verdict and the gate decision it produced.

Depth on every alert. On the record.