Measuring the Business Impact of Your A/B Test
In previous blog posts, I have talked about setting up A/B tests, running concurrent A/B tests, and what p-values mean. This final post will help you understand the difference between your test and control groups and measure the impact of the test, using confidence intervals and effect size.
The first blog post in this series put a strong emphasis on creating a good hypothesis. A good hypothesis not only makes sample sizes easier to calculate, it also makes results easier to interpret, which is where confidence intervals and effect size come into play. A confidence interval gives the range in which the true mean rate of each group, test and control, is likely to lie. These ranges are directly related to how your sample size is calculated and the alpha you choose.
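As a quick illustration, here is a minimal sketch of a confidence interval for a read rate using the standard normal approximation. The group size and read count are hypothetical, and z = 1.96 corresponds to the common alpha of 0.05 (a 95% interval):

```python
import math

def proportion_ci(successes, total, z=1.96):
    """Normal-approximation confidence interval for a rate.

    z = 1.96 gives a 95% interval (alpha = 0.05).
    """
    p = successes / total
    margin = z * math.sqrt(p * (1 - p) / total)
    return p - margin, p + margin

# Hypothetical test group: 300 reads out of 1500 sends (a 20% read rate)
low, high = proportion_ci(300, 1500)
print(f"95% CI for the read rate: {low:.3f} to {high:.3f}")
```

If the intervals for the test and control groups overlap heavily, the observed difference in rates may not be meaningful.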
The effect size gives the magnitude of the change. Therefore, when a hypothesis states how much change we want to see, it is directly referencing the effect size. The effect size ultimately helps you decide whether or not the test was successful, and whether you should move forward with pushing the test to your entire list. In short, the larger the effect size, the more impactful the results. There are many different calculations for effect size, but a common measure for a test with a binary outcome such as "read" or "did not read" is the odds ratio.
The formula for the odds ratio is the odds of the event in the test group divided by the odds of the event in the control group:

odds ratio = (test reads / test non-reads) / (control reads / control non-reads)
So the calculation for the odds ratio of a subject line test based on number of reads would be:
(the number of reads of the test / the number of reads of the control) / (the number of non-reads of the test / the number of non-reads of the control)
If the odds ratio equals one, the test and control perform the same. If the odds ratio is greater than one, the test group is x times more likely to read the email than the control group. If the odds ratio is less than one, the result is not directly interpretable, except to say that the test performed worse. To get an interpretable result when the odds ratio is less than one, reverse the test and control positions in the equation, putting the control group numbers in the numerators and the test group numbers in the denominators. The odds ratio then reads: the control group is x times more likely to read the email than the test group.
As an example, say you performed a subject line test. The results showed that 300 subscribers read the test subject line and 1,200 subscribers did not. For the control group, 200 subscribers read the control subject line and 1,300 subscribers did not. Thus the odds ratio is:

(300 / 1200) / (200 / 1300) = 0.25 / 0.154 ≈ 1.6

This means the test email is 1.6 times more likely to be read than the control email.
As you become more versed in testing and the statistics behind it, you will be able to create better hypotheses that not only address business needs but also account for confidence intervals and effect size, making it a breeze to run tests, interpret them, and turn the results into action. It is always important to remember that testing never ends! The more you test, the more you will learn about what does and does not improve your email campaigns.