Case Study: 83% Conversational IVR Performance Improvement for a Leading Home Warranty Company
One of our passions is taking a problem apart, looking at it from multiple angles, then figuring out how to solve it well. Last year, an opportunity came up to do this with a home warranty company client who was ready to launch a new IVR application. The project was nearing launch, but they had concerns that the approach to date would not deliver the right outcome. Sixty days after I was brought in to advise, we exceeded their original goal by 10% through listening to their concerns, getting the right people involved, and following the optimization playbook I share below.
The launch situation
Two weeks post-launch, the client was ready to pull the plug on the project. The existing call center metrics showed the old system performed better than the new system. They had just spent significant budget and resources on the automation effort, and yet agents were busier than ever taking calls coming out of the new application. This was blowing up call center staffing plans and costing more than the old application.
With that context, I identified three core issues that needed to be addressed before we could work on optimizing performance:
The performance goal had not been clearly defined, agreed on, and communicated
The application had been designed based on functional requirements, not performance requirements
The application had not been instrumented properly to capture the metrics needed to understand the performance
Step 1: Planning and Agreement
I started with collaborative discussions to define the performance goal. We needed to align on these points:
The primary metric we would use to judge performance and how we would define it
How to classify which calls would and wouldn’t be included in the calculation
How to reconcile existing client metrics with those for the new application
As we worked through those details, it came to light that the client considered resolution of the primary intent, ‘open a service request’, to be the most important metric. The incumbent application resolved 20% of service request calls. The goal of the new application was to increase savings through automation by half, which meant raising resolution by ten percentage points, from 20% to 30%. At that point, the new application was resolving only 18% of relevant calls.
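The metric definition above can be sketched in a few lines. This is a minimal illustration, not the client's actual implementation: the `Call` record and its `relevant` and `resolved` flags are hypothetical names standing in for whatever classification rules the team agreed on.

```python
from dataclasses import dataclass


@dataclass
class Call:
    call_id: str
    relevant: bool   # hypothetical flag: call counts toward the metric
    resolved: bool   # hypothetical flag: request completed without an agent


def resolution_rate(calls):
    """Share of relevant calls that the application resolved on its own."""
    relevant = [c for c in calls if c.relevant]
    if not relevant:
        return 0.0
    return sum(c.resolved for c in relevant) / len(relevant)


INCUMBENT_RATE = 0.20             # from the case study
GOAL_RATE = INCUMBENT_RATE * 1.5  # "increase by half", roughly 0.30

# Invented sample calls, only to show how exclusions affect the denominator.
sample = [
    Call("a", relevant=True, resolved=True),
    Call("b", relevant=True, resolved=False),
    Call("c", relevant=False, resolved=False),  # excluded from the calculation
]
print(f"sample resolution: {resolution_rate(sample):.0%}, goal: {GOAL_RATE:.0%}")
```

Keeping the inclusion rule in one place makes it easier to reconcile the new application's numbers with the existing call center metrics, because both sides can check which calls are in the denominator.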
Once we established the goal of 30% resolution, I specified the data points to capture in the new application. Engineering scheduled the work, and within a couple of weeks we had data. In parallel, I assessed the task flows and interface to identify potential trouble spots based on heuristics and secondary data. That analysis suggested the application was over-featured, and the extra features were cluttering the use case that mattered most.
Step 2: Analysis and Issue Identification
As the data became available, I analyzed the end-to-end journeys of successful and unsuccessful calls. The analysis consisted of stitching together data points from call start, authentication, intent capture, task flow steps, and call wrap-up. I also coordinated a listening review of more than one hundred calls that met certain criteria, mostly failures during or immediately after intent capture. This step is necessary because call transcripts omit important audio information about why some calls fail. The technology ignores meaningful conversational components such as pauses and side speech, and it misidentifies meaningless audio such as background noise as intentional verbal input.
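As a rough sketch of the stitching step, the logged data points can be grouped by call and ordered in time, which makes it easy to see where each call first went wrong. The event tuple layout `(call_id, timestamp, stage, ok)` and the stage names are assumptions for illustration; the real application's logging format is not described here.

```python
from collections import defaultdict


def stitch_journeys(events):
    """Group (call_id, timestamp, stage, ok) events into ordered per-call journeys."""
    journeys = defaultdict(list)
    for call_id, timestamp, stage, ok in sorted(events, key=lambda e: (e[0], e[1])):
        journeys[call_id].append((stage, ok))
    return dict(journeys)


def first_failure(journey):
    """Return the first stage that failed, or None if every stage succeeded."""
    for stage, ok in journey:
        if not ok:
            return stage
    return None


# Invented events, deliberately out of order as raw logs often are.
events = [
    ("call-2", 2, "authentication", True),
    ("call-1", 1, "call_start", True),
    ("call-2", 1, "call_start", True),
    ("call-1", 2, "authentication", True),
    ("call-2", 3, "intent_capture", False),
    ("call-1", 3, "intent_capture", True),
]

journeys = stitch_journeys(events)
failures = {call_id: first_failure(j) for call_id, j in journeys.items()}
print(failures)
```

A table like `failures` is also a convenient way to pick which recordings to listen to, for example every call whose first failure is at intent capture.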
The data plus listening analyses showed several issues:
Callers seemed frustrated by the need for upfront authentication
The application failed to capture intent for a high percentage of all callers
Even when the primary intent capture (open a service request) did happen, callers expressed frustration with the number of steps needed to verify and complete the intent
At this point, we considered that all three major flow areas (authentication, intent capture, and intent fulfillment) might need treatment. However, I ran calculations on intent completion and determined that improving post-capture fulfillment would not get us much closer to the goal. The math showed that the true need for optimization was in the authentication and intent capture steps at the beginning of the application. The completion path could remain as it was, as long as we increased the number of calls getting into it.
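The reasoning behind that conclusion can be illustrated with a simple funnel model, in which resolution is the share of calls that reach fulfillment multiplied by the share of those that complete it. The 18% and 30% figures come from this case study; the 75% completion rate is a hypothetical value chosen only to show the shape of the calculation, not a measured number.

```python
CURRENT_RESOLUTION = 0.18   # from the case study
GOAL_RESOLUTION = 0.30      # from the case study
ASSUMED_COMPLETION = 0.75   # hypothetical: share of captured calls that finish


def resolution(reach_rate, completion_rate):
    """Resolution = share of calls reaching fulfillment x share completing it."""
    return reach_rate * completion_rate


# Implied share of calls currently getting through authentication and capture.
current_reach = CURRENT_RESOLUTION / ASSUMED_COMPLETION

# Ceiling if only fulfillment were fixed: every captured call completes.
best_case_fulfillment_only = resolution(current_reach, 1.0)

# Reach needed to hit the goal while leaving the completion path untouched.
required_reach = GOAL_RESOLUTION / ASSUMED_COMPLETION

print(f"current reach: {current_reach:.0%}")
print(f"ceiling from perfect fulfillment alone: {best_case_fulfillment_only:.0%}")
print(f"reach required to meet goal: {required_reach:.0%}")
```

Under this assumed completion rate, even a perfect fulfillment flow would top out below the 30% goal, while raising the share of calls that get into the flow could reach it without touching fulfillment at all.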
Step 3: Diagnosis and Treatment
Examining additional data and call recordings around the authentication and intent capture parts of the application yielded two insights.
First, callers did not expect the interaction offered by the application. After the generic greeting, they went through a barrage of questions before they heard any information about what the system could do for them.
Second, when it was time for them to indicate what they were calling about, they were confused about how best to answer due to multiple instructions and an open-ended question. The confusion was often expressed as hesitation, partial responses, dysfluent responses, and stammers. These in turn caused the system to go into error states, adding to the negative experience.
As is often the case when user confusion appears, the performance problem appeared to stem from the framing of the interaction and how input was requested. To address this, I redesigned most of the opening sequence to increase user confidence by providing clarity and predictability about what the system did and how the user could use it. I made use of the psychological principle of priming, which gives people subtle cues about what is coming up. The central change was the intent capture prompt. I reduced its length by about 60% and changed its construction to "Why" + "Action" + "Method", replacing the confusing instructions and unnecessary examples.
To measure the effectiveness of the changes, we created two alternate versions of the redesigned capture prompt and tested all three in production against the original design. To the client’s delight, one of the alternates outperformed the others by over 10 percentage points! (Funnily enough, the winner was not the one I thought it would be.) All three alternatives outperformed the original intent capture prompt.
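A comparison like this can be sketched as follows. The counts are invented for illustration and are not the client's data, and the two-proportion z-test is one common way to check that a gap is larger than chance; the case study does not say which statistical check, if any, was used.

```python
import math


def two_proportion_z(success_a, n_a, success_b, n_b):
    """Two-sided z-test for the difference between two success rates."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return z, p_value


# Hypothetical (successful captures, calls) per prompt variant.
variants = {
    "original": (500, 1000),
    "redesign": (560, 1000),
    "alternate_a": (580, 1000),
    "alternate_b": (690, 1000),
}

rates = {name: s / n for name, (s, n) in variants.items()}
ranked = sorted(rates, key=rates.get, reverse=True)
winner, runner_up = ranked[0], ranked[1]
gap_points = (rates[winner] - rates[runner_up]) * 100
z, p_value = two_proportion_z(*variants[winner], *variants[runner_up])

print(f"{winner} leads {runner_up} by {gap_points:.0f} points (p = {p_value:.2g})")
```

Splitting production traffic evenly across variants, as assumed here, keeps the comparison simple; uneven splits still work with the same function because it takes each variant's own call count.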
In addition to that change, we implemented several other streamlining and clarifying prompt changes before and after intent capture. The greeting was made more specific and the collection of follow-up information was clarified. Together with the winning capture prompt, the design changes increased the traffic getting to the intent handling flow. The increase led to a 15-percentage-point jump in resolution, from 18% to 33%. Success!
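The headline figures can be checked with a little arithmetic. The three rates below are the ones reported in this case study; the helper function names are just for illustration. The relative change appears consistent with the 83% improvement in the title, and the final rate sits 10% above the 30% goal.

```python
BEFORE = 0.18   # resolution of the new application at launch
AFTER = 0.33    # resolution after optimization
GOAL = 0.30     # agreed resolution goal


def percentage_point_change(before, after):
    """Absolute change between two rates, in percentage points."""
    return (after - before) * 100


def relative_change(before, after):
    """Change relative to the starting rate, as a fraction."""
    return (after - before) / before


points = percentage_point_change(BEFORE, AFTER)
improvement = relative_change(BEFORE, AFTER)
over_goal = relative_change(GOAL, AFTER)

print(f"jump: {points:.0f} points")
print(f"relative improvement: {improvement:.0%}")
print(f"above goal by: {over_goal:.0%}")
```

Keeping percentage points and relative percentages separate avoids a common source of confusion when reporting results to a client.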
Lessons and follow-ups
Optimization is a vital set of skills and actions for conversational systems. Even the best teams doing great work run into unexpected issues upon launch. Optimization enables you to respond to them quickly and successfully.
It’s critical to use strategic planning discussions to establish clear and measurable goals for conversational systems and have a baseline for future comparison.
Team leaders need to educate the client about the importance of goal definition and optimization expectations early in the project.
The goals should drive every design and build decision. Any design focus not on the primary goals needs to be assessed for its potential impact.
Capture of the data needed to calculate goal metrics must be built into the design and code, and tested before launch.
Teams that build conversational systems need an optimization playbook to use in achieving the right outcomes.
Looking to dive deeper with your team on optimizing voice, chat, or multimodal performance? CCAI offers a 3.5-hour workshop for teams looking to get more from conversational AI.