I have the results of a split test that compared conversion rates across four different scenarios. How do I determine whether the differences in conversion rate are statistically significant?

The suggestion to use the posterior distribution is completely sound. Estimate the posterior using your favorite technique (bayesglm from the arm package in R works well), then sample from it and compute the measure of interest for each sample. That gives you a good posterior estimate of the quantity your decision actually rests on. Also, the question you should be asking is not which option is the best; you should be asking which of the options has a high probability of being nearly as good as the best. These two questions have very different properties, and the latter is much better suited to what you want.
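Here is a minimal sketch of that procedure in R. The data frame, the counts, and the 1% "near the best" margin are all made up for illustration; sim from the arm package draws from the approximate posterior of the fitted coefficients:

    library(arm)

    # Hypothetical aggregated results: conversions and visitors per scenario.
    ab <- data.frame(
      scenario    = factor(c("A", "B", "C", "D")),
      conversions = c(120, 145, 138, 160),
      visitors    = c(1000, 1000, 1000, 1000)
    )

    # Bayesian logistic regression; dropping the intercept makes each
    # coefficient the log-odds of conversion for one scenario.
    fit <- bayesglm(cbind(conversions, visitors - conversions) ~ scenario - 1,
                    family = binomial, data = ab)

    # Draw from the approximate posterior and map log-odds to rates.
    draws <- plogis(coef(sim(fit, n.sims = 10000)))  # 10000 x 4 matrix

    # Posterior probability that each scenario is within 1 percentage
    # point of the best scenario in the same draw.
    colMeans(draws >= apply(draws, 1, max) - 0.01)

Any scenario with a high probability here is a defensible pick, even if it is not the nominal winner.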
The standard approach is one-way ANOVA, which gives you an F-statistic for the joint null of equal means. However, you are often interested in pairwise comparisons, which is a multiple comparisons problem; the simplest adjustment method is Bonferroni's correction, which tends to be very conservative (see the first sketch below for both steps on binary conversion data, where a chi-squared test of equal proportions plays the role of the F-test).

That is the standard advice, but it is rather silly. A/B testing is not a classical testing problem, it is a decision problem, a point woefully missing from most writing on A/B testing. The most natural framework for decision making is Bayesian. A Bayesian analysis of A/B/C/D/etc. testing would be hierarchical, shrinking each estimate towards a common value, with the amount of shrinkage depending on how precisely each individual effect is estimated and on the variance of the distribution of effects (a crude version is in the second sketch below). Once you have the posterior distribution, you can use it intuitively and not worry about artificial testing issues. The multiple comparisons problem disappears.
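Here is what the classical recipe looks like in R, on the same hypothetical counts as above. With binary outcomes, prop.test gives the joint chi-squared test of equal proportions (the analogue of the ANOVA F-test), and pairwise.prop.test applies Bonferroni's correction to all pairwise comparisons:

    x <- c(120, 145, 138, 160)      # conversions per scenario (hypothetical)
    n <- c(1000, 1000, 1000, 1000)  # visitors per scenario

    # Joint null: all four conversion rates are equal.
    prop.test(x, n)

    # All six pairwise comparisons, Bonferroni-adjusted.
    pairwise.prop.test(x, n, p.adjust.method = "bonferroni")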
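And here is a deliberately crude sketch of the shrinkage idea. A full hierarchical analysis would estimate the distribution of effects and the individual effects jointly (e.g. with a hierarchical logistic model); the empirical-Bayes Beta-Binomial version below just fits a Beta prior to the observed rates by the method of moments and then updates it with each scenario's counts, which is enough to show the shrinkage and the posterior-based decision:

    x <- c(120, 145, 138, 160)      # conversions per scenario (hypothetical)
    n <- c(1000, 1000, 1000, 1000)  # visitors per scenario
    p <- x / n

    # Method-of-moments fit of a Beta(a, b) prior to the observed rates.
    # (Crude: var(p) also contains binomial noise, so shrinkage is understated.)
    m <- mean(p)
    k <- m * (1 - m) / var(p) - 1   # implied prior "sample size" a + b
    a <- m * k
    b <- (1 - m) * k

    # Shrunken estimates: each observed rate is pulled towards the common
    # mean m, more strongly for scenarios with fewer visitors.
    (a + x) / (a + b + n)

    # Each scenario's posterior is Beta(a + x, b + n - x); sampling from it
    # gives, e.g., the probability that each scenario is the best.
    draws <- sapply(seq_along(x), function(i) rbeta(1e4, a + x[i], b + n[i] - x[i]))
    colMeans(draws == apply(draws, 1, max))

With equal sample sizes the shrinkage is uniform, but the same code handles unequal traffic, and no multiplicity adjustment is needed to read off the posterior probabilities.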