Is My Subject Line Test Statistically Significant? P-Values and What They Actually Mean
Are these results statistically significant? This is a question any marketer, scientist, or social scientist who wants to understand the impact of their findings should ask. Whether or not your results are statistically significant determines whether you act on the analysis or conclude that you cannot tell the difference between your test and control groups.
To determine significance, some researchers look solely at the p-value, which estimates how likely it is that a difference at least as large as the one observed would occur by chance if there were no real difference. Significance thresholds are generally set at .05 or .01, and a p-value below the threshold is used to reject the null hypothesis. The null hypothesis, which you stated at the beginning of your test, says that the two approaches being tested will perform the same.
An example of a null hypothesis for a subject line test would state that the test subject line will perform the same as the control subject line. In setting up the test, a marketer would select a significance threshold of .05, which helps determine how large the sample size needs to be in order to say the results from the test and control groups are different, thus rejecting the null hypothesis.
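To make the sample-size step concrete, here is a minimal sketch in pure standard-library Python using the usual normal-approximation formula for a two-proportion test. The 20% baseline open rate, the 2-point lift, and the 80% power figure are hypothetical illustration values, not numbers from any real test:

```python
# Sketch: per-group sample size for a two-proportion test,
# via the standard normal-approximation formula.
import math
from statistics import NormalDist

def sample_size(p1, p2, alpha=0.05, power=0.80):
    """Per-group sample size to detect a difference between rates p1 and p2."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided test at the alpha threshold
    z_beta = z.inv_cdf(power)            # desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = ((z_alpha + z_beta) ** 2) * variance / (p1 - p2) ** 2
    return math.ceil(n)

# Detecting a lift from a 20% to a 22% open rate requires thousands of
# recipients per group -- small lifts demand large samples.
print(sample_size(0.20, 0.22))
```

Note how quickly the required sample grows as the detectable lift shrinks: halving the lift roughly quadruples the sample size, because the difference enters the formula squared.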
What does all this mean? As it turns out, very little on its own. Ronald Fisher, the statistician who introduced p-values, cautioned that they are not the be-all and end-all of testing. Rather, they are a starting point that keeps your test on the right track: they help confirm your sample sizes are adequate, so that when you compare rates you can say the result of your test group and the result of your control group are genuinely different. As a general rule of thumb, a p-value of .05 means that, if the null hypothesis were true, a difference this large would show up by chance only about 1 in 20 times.
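Here is a minimal sketch of the calculation behind that p-value: a two-sided, two-proportion z-test in pure standard-library Python. The open and send counts are made-up numbers for illustration:

```python
# Sketch: two-sided, two-proportion z-test on hypothetical open rates.
import math
from statistics import NormalDist

def two_proportion_p_value(opens_a, sends_a, opens_b, sends_b):
    """Return the two-sided p-value for the difference in open rates."""
    p_a = opens_a / sends_a
    p_b = opens_b / sends_b
    # Under the null hypothesis both groups share one pooled open rate.
    p_pool = (opens_a + opens_b) / (sends_a + sends_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / sends_a + 1 / sends_b))
    z = (p_a - p_b) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Test: 1,100 opens of 5,000 sends; control: 1,000 opens of 5,000 sends.
p = two_proportion_p_value(1100, 5000, 1000, 5000)
print(f"p = {p:.4f}")  # below .05, so the null hypothesis is rejected
```

Because the test uses the absolute value of z, the p-value alone reports only that a difference exists, not which group won or by how much, which is exactly the limitation discussed below.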
What p-values and statistical significance can tell you:
- There is a difference between the test and the control.
What p-values and statistical significance can’t tell you:
- Which performed better, the test or control
- How large the difference is between the test and control
- Whether the difference between the test and control is large enough to be important
- Any results about the magnitude of your test
To understand differences such as magnitude and relative importance, you need to look at effect size and confidence intervals, which I'll cover in my next post.