Hard-to-measure Quality | Customers, Etc.
How to approach quality for things that are difficult to measure
Quick promo: the FullStory support team is going global! Open positions:
If you get excited about extra-queue-rricular work—senior support team members at FullStory spend 40% of their time outside of the queue—and all the other things I write about in this newsletter, I hope you’ll consider joining us. Email me with questions.
Let’s imagine you’ve done a good job implementing Service Level Indicators (SLIs), Objectives (SLOs), and Agreements (SLAs) with your team as one avenue for improving the quality of service that you offer to customers. Perhaps you’ve even matured to the point where you’re comfortable baking some of these SLAs into customer contracts. Well done!
You’re sitting down for another brainstorming session with your team about how to improve the quality of customer support.
“What if we make support better?” someone asks.
“What do you mean, better?” you retort. “We already made our responses faster and have the SLAs to back it up. What do you have in mind?”
“You know, better? Like, customers like our responses more. Not just faster. Better. So what if they get a response in 60 seconds—what if it’s the wrong response?”
That sounds interesting, but how do you measure it?
Knowing what to measure
When we looked at service level indicators last week, one of the key takeaways was that if you want to improve quality, you have to find something to measure. You can’t have a goal—a service level objective—without knowing what you’re going to measure. You need the indicator (the SLI) before the objective (the SLO).
With response times, you have a clear set of data. Sure, you still have to decide precisely how you want to define response times—do you use first reply? all replies? business hours only?—but at least the data is there if you have the tools to do the math. And that’s the nice thing about service level indicators. Because they’re easily (after a bit of effort) defined and measurable, they can be agreed upon. You can define them as service level agreements (SLAs) and bake them into contracts.
Harder-to-measure quality
How do you measure if a response is good, not just fast?
A lot of people are going to immediately jump to customer satisfaction (CSAT) survey scores as a way to measure quality. Which, fine, as a proxy for quality, it’s a decent measurement and you probably want CSAT in your toolbox1. CSAT data comes from the customer—they’re ultimately the one who decides whether a support interaction is good or bad. But CSAT isn’t just a function of the quality of the support interaction. It’s also a function of response time, product satisfaction, whether the customer ate a donut before filling out the survey2, etc.
CSAT can be useful as a proxy for quality, but it doesn’t make as straightforward a service level indicator as response time does. Put another way, you’re not going to be able to put CSAT into contracts3. “Company will maintain customer satisfaction scores of 4.7 or higher on all customer support interactions” is not something you’d expect to see in a contract.
So where does that leave us?
Separating measurability from behavior
With easy-to-measure indicators, the behaviors to move the indicator are closely aligned with the way it’s measured. If you want a faster response time, you keep a closer eye on the inbox and you respond faster. Simple enough. The behavior is tightly coupled with the service level indicator.
Hard-to-measure qualities may not have a tight coupling between behavior and the proposed indicator. If your proxy indicator for support ticket quality is CSAT, you can’t just say “provide better responses so the score goes up.” I mean you can (and if your responses are bad, I suppose you probably should), but more realistically this leaves you open to gaming the metric—it’s easier to find tricks to make the score go up than to do the hard work to improve something that’s hard to measure.
If you want to improve overall quality, which is harder to measure, you’re going to have to spend some serious time thinking about what goes into a quality response. This is a lot of work! But it’s good work. It’s meaningful work. It will help you to align on what’s important and communicate clearly to the team what you expect from them when they engage with customers.
Promote behaviors that matter
Defining what goes into a quality customer support response is only the first step. You also have to promote behaviors that drive quality responses. Often these behaviors aren’t immediately obvious, and it’s easy to land on behaviors that seem like they would lead to the desired change but aren’t actually impactful. Avoiding that trap often means getting creative.
One thing we did differently at FullStory was implement peer review of support tickets. We still did manager-led review, but peer review put each agent in the shoes of a manager, making them responsible for evaluating their peers on key questions related to support quality4. When you’re responsible for evaluating your colleagues, you pay a lot more attention to the qualities that go into a good support response. You want to be fair, so you intrinsically put in the effort.
We found this approach to be incredibly effective at driving the behaviors that produce higher-quality responses. Granted, the improvement was hard to measure—CSAT scores were already very high—but we heard qualitatively, from conversations with customers and peers across the business, that they valued the work we did on the support team. I wrote about how we arrived at this process in the post Beware of Zombie Values.
Building brand quality
Your customers won’t notice the work you do behind the scenes, but they will notice its outcomes. Customers can tell the difference between a quick-but-not-so-helpful response and a response where the support team really took time to listen and understand their problem.
Spending time on the harder-to-measure aspects of quality can help to improve your team’s reputation. In business, we call this your brand. Next week, we’ll look at “contract” quality (the world of easier-to-measure SLIs, SLOs, and SLAs) alongside “brand” quality (the world of harder-to-measure aspects of a service).
1. FullStory uses Stella Connect. I wrote about them in this post.
2. Does eating the donut push the CSAT score up or down? There’s probably a curve where within the first two minutes of eating the donut, CSAT scores sharply increase, but after like five minutes when the donut is just sitting in the customer’s belly and they’re already regretting their decision—and how!—they take their stomach ache out on the poor support agent on the other end of the CSAT survey.
3. I’ve seen contract language that tries to define resolution and customer satisfaction based on various levels of severity. For certain sizes of companies and industries, that might make sense.
4. It helps to have a team with a solid foundation of psychological safety. It’s worth getting there so you can do this kind of work.