The metric most chatbots optimise for

Walk into a vendor demo for almost any hotel chatbot and the first thing they’ll show you is the deflection rate. “We deflect 87% of inbound messages.” “Our deflection rate is 92%.” “Industry-leading deflection.”

Deflection sounds good. It means the bot handled the message without escalating to a human. Less work for the team. Faster reply for the guest. Win-win.

Except deflection optimises for the wrong thing.

What deflection actually rewards

A bot that says “I don’t know, please contact the front desk” technically deflects: it didn’t escalate. A bot that confidently makes up an answer also deflects. A bot that gives the wrong answer deflects, until the guest follows up angry, and then it counts as two interactions instead of one.

If you measure deflection, the bot is incentivised to appear helpful. Whether it actually was helpful is somebody else’s metric to worry about — usually nobody’s.
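To make the blind spot concrete, here is a minimal Python sketch of how a deflection rate might be counted. The Conversation record and its field names are hypothetical, invented for illustration; they are not any vendor's schema. The point is only that the calculation never looks at whether the guest was helped.

```python
from dataclasses import dataclass


@dataclass
class Conversation:
    # Hypothetical record; both field names are assumptions for illustration.
    escalated: bool         # did a human take over?
    guest_was_helped: bool  # ground truth the deflection metric never reads


def deflection_rate(conversations: list[Conversation]) -> float:
    """Share of conversations the bot handled without a human."""
    if not conversations:
        return 0.0
    deflected = sum(1 for c in conversations if not c.escalated)
    return deflected / len(conversations)


sample = [
    Conversation(escalated=False, guest_was_helped=True),   # correct answer
    Conversation(escalated=False, guest_was_helped=False),  # "please contact the front desk"
    Conversation(escalated=False, guest_was_helped=False),  # confidently made-up answer
    Conversation(escalated=True, guest_was_helped=True),    # honest handoff to a human
]

print(deflection_rate(sample))  # 0.75
```

Three of the four conversations count as deflected, but only one of those three actually helped the guest. The one honest handoff is the only thing that lowers the score.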

What we measure instead

At Innquire we have three metrics that matter, in order of importance:

1. Resolution rate, measured 24 hours later

Did the guest get what they were asking for, where “asking for” includes the implicit request behind the explicit one? “Where’s the closest pharmacy?” isn’t really a question about pharmacies; it’s a question about whether they can fix their headache before bed. Did they get the headache fixed?

We measure this with a follow-up score from the guest the next day. It’s slow. It’s expensive. It tells you what actually happened.
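As a rough sketch of the arithmetic, the snippet below computes a resolution rate from next-day follow-ups. It assumes the follow-up score is reduced to a yes/no answer and that guests who never reply are left out of the denominator. Both are simplifying assumptions for illustration, and the FollowUp record is hypothetical rather than a description of our pipeline.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FollowUp:
    # Hypothetical record; field names are assumptions for illustration.
    conversation_id: str
    resolved: Optional[bool]  # None means the guest never answered the follow-up


def resolution_rate(follow_ups: list[FollowUp]) -> Optional[float]:
    """Share of answered follow-ups where the guest said their need was met."""
    answered = [f for f in follow_ups if f.resolved is not None]
    if not answered:
        return None  # nothing to measure yet
    return sum(f.resolved for f in answered) / len(answered)


sample_follow_ups = [
    FollowUp("c1", True),
    FollowUp("c2", True),
    FollowUp("c3", False),
    FollowUp("c4", None),  # no reply, excluded from the rate
]

print(resolution_rate(sample_follow_ups))
```

Here two of the three guests who answered said the problem was solved. How to treat non-responders is a design choice worth making explicitly, since counting them as failures or ignoring them gives very different numbers.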

2. Honest-deflection rate

When the AI doesn’t know, does it admit it and route the question to a human, or does it confabulate? We track every message that gets escalated and look at whether the escalation was honest (“I’m not sure about this, let me get someone”) or forced (a human had to step in because the AI gave a wrong answer).

We’d rather see a 70% honest-deflection rate than a 95% deflection rate that hides silent failures. Most of our customers, after a month, agree.
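One way to turn the honest-versus-forced distinction into a number is sketched below. It reads the rate as the share of escalations that were honest, which is an assumption about the exact formula. The Escalation record and the EscalationKind labels are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EscalationKind(Enum):
    HONEST = "honest"  # the AI said it wasn't sure and handed off
    FORCED = "forced"  # a human had to step in after a wrong answer


@dataclass
class Escalation:
    # Hypothetical record; field names are assumptions for illustration.
    conversation_id: str
    kind: EscalationKind


def honest_deflection_rate(escalations: list[Escalation]) -> Optional[float]:
    """Share of escalations where the AI admitted it didn't know."""
    if not escalations:
        return None  # no escalations, nothing to classify
    honest = sum(1 for e in escalations if e.kind is EscalationKind.HONEST)
    return honest / len(escalations)


sample_escalations = (
    [Escalation(f"h{i}", EscalationKind.HONEST) for i in range(7)]
    + [Escalation(f"f{i}", EscalationKind.FORCED) for i in range(3)]
)

print(honest_deflection_rate(sample_escalations))  # 0.7
```

Labelling each escalation as honest or forced is the hard part and usually needs a human reviewer or at least a transcript check. The arithmetic afterwards is trivial.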

3. Median resolution time

How long, end to end, did the guest wait? This includes both the AI’s response time and any human handoff time. It is the most operationally legible of the three: your team can see it and improve it.
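A minimal sketch of the calculation, assuming each conversation is logged as a pair of timestamps: the guest's first message and the moment the request was resolved, whether by the AI or by a human after handoff. The function name and the input shape are assumptions for illustration.

```python
from datetime import datetime, timedelta
from statistics import median


def median_resolution_time(spans: list[tuple[datetime, datetime]]) -> timedelta:
    """Median wait across (first guest message, resolved) timestamp pairs."""
    if not spans:
        raise ValueError("need at least one conversation")
    seconds = [(end - start).total_seconds() for start, end in spans]
    return timedelta(seconds=median(seconds))


start = datetime(2024, 1, 1, 22, 0)
sample_spans = [
    (start, start + timedelta(minutes=2)),   # AI answered directly
    (start, start + timedelta(minutes=5)),   # AI answered after a clarifying question
    (start, start + timedelta(minutes=45)),  # handed off to a human
]

print(median_resolution_time(sample_spans))  # 0:05:00
```

Using the median rather than the mean keeps one slow handoff from dominating the figure. The 45-minute case above would pull a mean up to more than 17 minutes.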

Why this matters

The hotel chatbot category has earned its reputation for being mostly worthless, and the deflection metric is part of why. If we want guests and hoteliers to trust AI-based conversation, we have to be honest about what it can and can’t do, and we have to measure the right thing so we don’t drift toward the easy lie.

The boring conclusion

Optimising for deflection is optimising for “the AI didn’t visibly fail.” Optimising for resolution is optimising for “the guest got what they wanted.” These are not the same thing, and the difference matters.