You may have heard some of your peers say that their enterprise tried to improve personalization but didn’t see enough positive ROI to proceed. This might even be your own employer’s experience.
Here are a few common mistakes that can kill early personalization attempts and some tips on how to avoid them.
Not Testing an Experience First
Humans are weird. We’re full of cognitive biases that we’re mostly unaware of, yet they are highly measurable. Psychologists Daniel Kahneman and Amos Tversky famously designed experiments that helped catalog many of them, and in doing so laid the groundwork for the discipline of behavioral economics. (You can read about them in Kahneman’s book Thinking, Fast and Slow.)
Because humans are weird, you cannot predict how they will respond to a personalization. Even the personalizations that you’re convinced should be improvements must be supported by a statistically verifiable test.
Here’s a recent example: We were asked to help a client improve conversion rates. High on our list of things to test was the utility of an auto-advancing carousel on the brand’s homepage. Much research, including a study conducted by the Nielsen Norman Group, maintained that auto-advancing carousels are terrible at engagement and conversion.
Had we followed our instincts, we would have recommended that all segments receive an alternative to the carousel: a grid showing the offerings in a static set of clickable headings. Instead, we conducted an A/B test, where a random half of homepage visitors saw the control experience (the carousel) and the other half saw the grid.
To our astonishment, the grid produced no improvement in conversion over the carousel, a result that held at a confidence level above 95%. We were humbled!
But we learned something about that brand’s prospects and customers.
Moral of the story: Test every experience before it is pushed to a personalization.
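One common way to check whether a difference between two experiences is statistically verifiable is a two-proportion z-test. The sketch below is a minimal illustration, not the method used in the carousel test described above. The function name and all visitor and conversion counts are hypothetical.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the conversion-rate difference B - A."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both experiences convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: control (carousel) vs. test (grid).
z, p = two_proportion_z_test(conv_a=300, n_a=10_000, conv_b=310, n_b=10_000)
print(f"z = {z:.2f}, p = {p:.2f}")
if p < 0.05:
    print("The difference is significant at the 95% confidence level.")
else:
    print("No statistically verifiable difference between the experiences.")
```

With these made-up counts the grid converts slightly better in raw terms, yet the p-value is far above 0.05, so the apparent lift could easily be noise.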
Prioritizing the Wrong Experiences
What if we conducted the carousel test on a carousel that was not on an important or highly trafficked page? Our reasoning would have been sound for testing because of the evidence suggesting this feature kills conversion. (If you need more evidence and are looking for a good laugh, visit Should I Use a Carousel?) But, as the saying goes, would the juice be worth the squeeze?
If you’re testing on a page or user journey that gets little traffic, you can produce dramatic improvements on conversion as a percentage, but the impact on annual sales would still be low.
Moral of the story: Create a way to prioritize all your potential experiments based on the estimated gain if the test is successful.
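A simple way to prioritize is to score each potential experiment by the annual revenue it would add if it succeeded. The sketch below assumes a deliberately crude model (visitors times baseline conversion times expected relative lift times average order value). The class name, the field names, and every figure are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ExperimentIdea:
    name: str
    annual_visitors: int
    baseline_conversion: float      # e.g. 0.02 means 2% of visitors convert
    expected_relative_lift: float   # e.g. 0.05 means a 5% relative improvement
    average_order_value: float

    def estimated_annual_gain(self) -> float:
        """Extra annual revenue if the test wins with the expected lift."""
        extra_orders = (
            self.annual_visitors
            * self.baseline_conversion
            * self.expected_relative_lift
        )
        return extra_orders * self.average_order_value


ideas = [
    ExperimentIdea("Low-traffic landing page redesign", 20_000, 0.02, 0.50, 80.0),
    ExperimentIdea("Homepage carousel vs. grid", 2_000_000, 0.02, 0.05, 80.0),
]

# Highest estimated gain first.
ranked = sorted(ideas, key=lambda idea: idea.estimated_annual_gain(), reverse=True)
for idea in ranked:
    print(f"{idea.name}: ${idea.estimated_annual_gain():,.0f}")
```

In this made-up comparison, a dramatic 50% lift on the low-traffic page is worth a tenth of a modest 5% lift on the homepage.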
Stopping a Test Too Soon
There’s a fundamental phenomenon regarding the math behind testing you need to understand when you construct, run, and analyze one. It’s the concept of “regression to the mean.”
Regression to the mean can convince an experimenter that a test destined to fail is actually a success. You might check your numbers on a given day, see that your conversions have improved impressively in the test experience, and call your test a success before reaching statistical certainty (e.g., a 95% confidence level that the improvement seen in the sample reflects reality).
So you push to personalization, and lo! You don’t see the improvements you saw in the experiment. It’s even possible the personalized experience actually harmed performance. All because of regression to the mean.
The size of your sample is the problem. When you first turn on an A/B test, you have small sample sizes for both the control group (Group A) and the test group (Group B). These will grow as the experiment proceeds, but in the meantime, you can be the victim of the Law of Small Numbers.
Imagine you’re flipping a coin to prove that there is an equal chance of getting heads or tails. You flip the coin four times and get these results: heads, heads, tails, heads. Or even heads, heads, heads, heads! Lopsided runs like these are a natural artifact of chance: each flip is independent, with an equal chance of coming up heads or tails, so short sequences often look unbalanced.
With such small numbers of flips, these results aren’t representative of reality. In fact, the Law of Small Numbers is defined this way on iResearch.com: The incorrect belief held by experts and laypeople alike that small samples ought to resemble the population from which they are drawn.
Although this is true of large samples, it isn’t for small ones.
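The coin example can be made concrete with exact binomial arithmetic. The sketch below computes how likely a fair coin is to produce a lopsided result, defined here (as an assumption for illustration) as at least 75% heads or at least 75% tails. The function name and the 75% cutoff are hypothetical choices.

```python
from math import comb


def prob_lopsided(n_flips, share=0.75):
    """Exact probability that a fair coin shows at least `share` heads or tails."""
    total_outcomes = 2 ** n_flips
    lopsided_outcomes = sum(
        comb(n_flips, heads)
        for heads in range(n_flips + 1)
        if heads >= share * n_flips or heads <= (1 - share) * n_flips
    )
    return lopsided_outcomes / total_outcomes


for n in (4, 40, 400):
    print(f"{n} flips: {prob_lopsided(n):.6f}")
```

With four flips, a lopsided result is more likely than not (10 of the 16 possible sequences, or 0.625). The probability shrinks rapidly as the number of flips grows, which is why large samples resemble the population and small ones often do not.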
Moral of the story: Try not to read anything into the lift, or lack of lift, reported before the test arrives at statistical certainty. The reported gains and losses will swing back and forth as the numbers accumulate, and they settle into a stable result only once the sample is large enough.
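One way to see the cost of stopping early is to simulate tests in which both groups receive the identical experience, so any declared "win" is a false alarm. The sketch below compares checking for significance at every interim look against checking only once at the end. All parameters (300 experiments, 20 looks, 100 visitors per group per look, a 10% conversion rate) and the function names are assumptions chosen for illustration.

```python
import math
import random


def p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value from a two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        return 1.0  # No variation observed yet, so no evidence of a difference.
    z = (conv_b / n_b - conv_a / n_a) / std_err
    return math.erfc(abs(z) / math.sqrt(2))


def simulate(n_experiments=300, n_checks=20, visitors_per_check=100,
             rate=0.10, alpha=0.05, seed=1):
    """Return (false-alarm rate when peeking, false-alarm rate at the end only)."""
    rng = random.Random(seed)
    peeking_alarms = 0
    final_alarms = 0
    for _ in range(n_experiments):
        conv_a = conv_b = visitors = 0
        alarm_at_any_look = False
        for _ in range(n_checks):
            # Both groups share the same true conversion rate.
            conv_a += sum(rng.random() < rate for _ in range(visitors_per_check))
            conv_b += sum(rng.random() < rate for _ in range(visitors_per_check))
            visitors += visitors_per_check
            p = p_value(conv_a, visitors, conv_b, visitors)
            if p < alpha:
                alarm_at_any_look = True
        peeking_alarms += alarm_at_any_look
        final_alarms += p < alpha  # p now holds the result of the last look.
    return peeking_alarms / n_experiments, final_alarms / n_experiments


peeking_rate, fixed_rate = simulate()
print(f"False alarms when stopping at the first significant look: {peeking_rate:.1%}")
print(f"False alarms when judging only at the planned end: {fixed_rate:.1%}")
```

Because the final look is one of the looks, peeking can never produce fewer false alarms than waiting, and in practice it produces considerably more. Deciding the sample size in advance and judging the test only once it is reached avoids this inflation.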