Aug 26, 2024

Understanding the A/B Testing Process and Best Practices for CRO

Most B2B websites are primarily used for lead generation, which demands an effective conversion rate optimization (CRO) strategy and A/B testing. But it can be difficult to achieve a significant lift in conversions from A/B testing. This is because only one in eight A/B tests yields significant results. For this reason, marketers need to be cautious about changing on-page elements based on early or faulty conclusions.

Only 1 in 8 A/B tests produces significant results.

To ensure your CRO decisions are grounded in authentic insights, read on. This blog will refresh your expertise in the A/B testing process and best practices. (If you’re looking for specific A/B testing strategies to improve your website conversions, check out our recent blog: A/B Testing Strategies That Drive Real Results in CRO.)

CRO and A/B Testing, Defined

Conversion rate optimization is the process of improving the user experience to increase the percentage of visitors who take a desired action on your website. Part of CRO is auditing your website and analyzing that data to better understand your website functionality and users’ behaviors—what drives them to take an action and what drives them away. Another part of CRO is developing theories around those insights, and then testing those theories to improve site performance.

A/B or split testing is a scientific method for comparing version A against version B to see which performs better. In an A/B test, one element is tested at a time. The two variations are served up randomly to visitors and traffic is split equally between the two. This is an ongoing process in which elements are tested independently, iteratively and sequentially.
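As a rough illustration (not tied to any particular testing tool), a 50/50 split is often implemented by deterministically bucketing each visitor so the same person always sees the same version while traffic divides roughly in half. The visitor ID and experiment name below are hypothetical placeholders.

```python
import hashlib

def assign_variant(visitor_id: str, experiment: str = "cta-text") -> str:
    """Deterministically bucket a visitor into variant A or B.

    Hashing the visitor ID with the experiment name gives each visitor a
    stable assignment (they always see the same version) while overall
    traffic splits roughly 50/50.
    """
    digest = hashlib.sha256(f"{experiment}:{visitor_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100  # a number from 0 to 99
    return "A" if bucket < 50 else "B"

# The same visitor lands in the same bucket on every request.
print(assign_variant("visitor-12345"))
print(assign_variant("visitor-12345"))  # same result as above
```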

For example, you might A/B test two identical pages on which the only variation is the CTA text. On one, the text might be “Learn More” and on the other, it might be “Download the Whitepaper.” Once you’ve declared “Download the Whitepaper” the winner, you might test it against “Download the Whitepaper Now.” And so on. Then you might move on to test the color of the CTA button and then the headline font, implementing each winner, developing a new hypothesis and retesting.

Multivariate testing uses the same process as A/B testing, except that it tests multiple variations at once to see how the different combinations interact with one another and perform as a whole. For example, you might test the CTA copy, text size and button color together.

Although it’s thought to save time and effort, a multivariate test depends on first testing each element individually. Moreover, a multivariate test requires a large audience to reach statistical significance. Depending on the number of variants you’re testing and the size of your audience, a test may need to run for one to two months.
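To see why multivariate tests demand so much more traffic, consider how quickly the number of page versions grows. This short sketch, using hypothetical variations for the three elements above, enumerates the full-factorial combinations that traffic would have to be split across.

```python
from itertools import product

# Hypothetical variations for three page elements.
cta_copy     = ["Learn More", "Download the Whitepaper"]
text_size    = ["14px", "16px", "18px"]
button_color = ["blue", "green"]

combinations = list(product(cta_copy, text_size, button_color))
print(len(combinations))  # 2 * 3 * 2 = 12 page versions to split traffic across
for combo in combinations:
    print(combo)
```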

A/B Testing Process

An effective A/B testing strategy uses the following process:

  1. Set goals and formulate a hypothesis: Clearly define the objective of each test. This means knowing what you’re trying to improve. Develop a hypothesis that states the expected outcome for each test.
  2. Implement two versions: Create two versions of the test element on the landing page, ensuring all other elements remain the same to isolate the impact of the new element.
  3. Determine your statistical parameters: Your parameters help you set up your test and calculate your results, and include:
    • Detectable effect is the minimum change you would like to be able to detect. For example, a 20% improvement in conversions.
    • Significance level is how confident you need to be that the results are real and not random. Significance level depends on your goal and the consequence of making a change based on an error. In business and marketing, significance level is typically 90% to 95%.
    • Your p-value threshold, against which you’ll measure your test results, aligns with your significance level: a 95% significance level corresponds to a threshold of 0.05, and 90% corresponds to 0.1.
    • Statistical power, the probability that your test detects a real effect of the assumed size, helps determine the reliability and validity of test results and is typically set at 80%.
    • Sample size is calculated using the above parameters to determine how much traffic you need to reach a meaningful outcome. Arriving at this number involves some statistical math; you can use an online sample size calculator like the one offered by Optimizely, or see the minimal sketch after this list.
  4. Randomization: Split the incoming traffic randomly between the two versions (A and B).
  5. Monitoring: Track the performance of both versions over a set period of time, or until the required number of users have interacted with each version, whichever comes later. Yes, later.
  6. Validation: False positives and testing biases happen. To safeguard against them, validate your test: confirm you reached the correct sample size, that the test ran for its full duration, and that external factors that could bias your results are accounted for. Run these checks before declaring your results statistically significant.
  7. Analysis: Compare your results against the p-value threshold to determine statistical significance and analyze other metrics, such as engagement and/or conversion rates, to determine which version performed better. Analysis is another complex statistical process that can be done efficiently and accurately with help from an online A/B test statistical significance calculator like this one offered by VWO.
  8. Iterate (and reiterate): Many marketers implement the winning version on their web page and stop testing. But continuous improvement is key to effective testing. Instead, create a new, better variation of the winner and run another A/B test. Then do it again. And again.
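For reference, here is a minimal sketch of the sample size math mentioned in step 3, using the standard normal-approximation formula for comparing two conversion rates. The baseline rate, lift and function name are illustrative assumptions; an online calculator such as Optimizely’s takes the same inputs.

```python
from statistics import NormalDist
import math

def sample_size_per_variant(baseline_rate: float,
                            min_detectable_lift: float,
                            significance: float = 0.95,
                            power: float = 0.80) -> int:
    """Approximate visitors needed per variant for a two-proportion test.

    baseline_rate:       current conversion rate, e.g. 0.05 for 5%
    min_detectable_lift: relative lift to detect, e.g. 0.20 for 20%
    significance:        confidence level (1 - alpha), e.g. 0.95
    power:               probability of detecting a real effect, e.g. 0.80
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + min_detectable_lift)
    alpha = 1 - significance
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2
    return math.ceil(n)

# Example: 5% baseline conversion rate, 20% minimum detectable lift,
# 95% significance, 80% power.
print(sample_size_per_variant(0.05, 0.20))  # ≈ 8,155 visitors per variant
```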

If you stop A/B testing an element after declaring a winner, you're doing it wrong.

A/B Testing Best Practices

While simpler than multivariate testing, A/B testing isn’t a foolproof process. The following best practices can help you achieve meaningful and reliable A/B test results.

Run an A/A validation test first

Before running the A/B test, verify your test is correctly set up by testing two identical versions of the page. This step ensures that any observed differences are due to the changes being tested, not due to flaws in the experimental design. A/A test results that show no significant difference between the identical versions indicate a reliable setup. If a significant difference is observed, you may have issues with the randomization or measurement tools that need to be addressed before running your A/B test.
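If you want to sanity-check a setup without live traffic, one option is to simulate an A/A test: give both arms the same true conversion rate, run many trials, and confirm that “significant” results appear only about as often as your significance threshold allows. The sketch below is a simplified simulation using a pooled two-proportion z-test; the conversion rate and traffic numbers are hypothetical.

```python
import random
from statistics import NormalDist

def two_proportion_p_value(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-sided p-value from a pooled two-proportion z-test."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = (pooled * (1 - pooled) * (1 / n_a + 1 / n_b)) ** 0.5
    z = (p_a - p_b) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Simulate an A/A test: both "variants" share the same true 5% conversion rate.
random.seed(42)
alpha, runs, false_positives = 0.05, 500, 0
for _ in range(runs):
    n = 2000  # visitors per arm
    conv_a = sum(random.random() < 0.05 for _ in range(n))
    conv_b = sum(random.random() < 0.05 for _ in range(n))
    if two_proportion_p_value(conv_a, n, conv_b, n) < alpha:
        false_positives += 1

# With a healthy setup, roughly 5% of A/A runs look "significant" by chance.
print(false_positives / runs)
```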

Prioritize tests based on potential impact

Focus your testing efforts on the changes you expect to have the most significant positive effect. This might mean focusing on improvements that align to your business goals and strategies, those that are likely to yield the best return, or those with the highest potential to positively impact your metrics.

Run tests for an appropriate duration

Often, when early results support a hypothesis, marketers make the mistake of declaring a winner prematurely. This mistake, a form of confirmation bias, can result in a false positive. An A/B test should run for a minimum of two weeks but can take as long as six weeks to reach a statistically significant result. The amount of time an A/B test will take depends on many factors, including the following (a rough way to turn these numbers into a duration estimate is sketched after the list):

  • Traffic volume: The more visitors your site gets, the faster you’ll reach the required sample size and get a clear answer.
  • Current conversion rate: The lower your existing conversion rate, the more visitors you’ll need to detect the same relative improvement, so the test will take longer.
  • Expected change: Bigger changes (i.e., a larger expected lift in conversion rate) are easier and faster to detect. For example, a 10% increase in conversions is easier to see than a 1% increase.
  • User behavior variability: If user actions on your site are consistent, it’ll be easier to notice a difference. If behaviors are varied, it’ll take more data to determine if observed differences are real or random.
  • Significance level: The higher the level of certainty you want in your results (e.g., 95% vs. 90%), the more data and time you’ll need to declare a difference.
  • Statistical power: A higher statistical power requires a larger sample size to detect the same effect.
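Putting a few of these factors together, a rough duration estimate is simply the required sample size per variant divided by the daily traffic each variant receives. The sketch below assumes the sample size helper from the earlier example and a hypothetical 1,000 visitors per day.

```python
import math

def estimated_test_duration_days(required_per_variant: int,
                                 daily_visitors: int,
                                 variants: int = 2) -> int:
    """Rough number of days needed to reach the required sample size,
    assuming traffic is split evenly across all variants."""
    visitors_per_variant_per_day = daily_visitors / variants
    return math.ceil(required_per_variant / visitors_per_variant_per_day)

# Example: ~8,155 visitors needed per variant (from the earlier sample size
# sketch) and a hypothetical 1,000 visitors per day split across two versions.
print(estimated_test_duration_days(8155, 1000))  # ≈ 17 days, i.e. two to three weeks
```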

Consider external events

Try to avoid testing during holidays, major campaigns, seasonal industry occurrences and other major events that can sway your results. During these periods, your traffic isn’t representative of your typical audience, so any test results you get will be relevant only to that sample of visitors.

Test again

If you stop A/B testing an element after declaring a winner, you’re doing it wrong. A/B tests can result in false positives. User behaviors change. Even if you ran several iterations of an A/B test before declaring a solid winner, don’t neglect to go back and test it again later to be sure it’s still driving the best results.

Use A/B testing tools

A/B testing tools can automate the process of traffic splitting, data collection and analysis. As mentioned earlier, A/B testing tools can perform the complex calculations needed to determine your required sample size or evaluate the significance of your results. Some of the best A/B testing tools, according to Gartner, include VWO, Adobe Target, AB Tasty and Oracle Maxymiser.

Partner With A/B Testing Experts

A/B testing for CRO is a nuanced and iterative process. It requires extensive knowledge, strict oversight and complex computations. Even then, it’s still easy to get it wrong. If you’re struggling to achieve higher conversion rates based on A/B test results, we can help. Contact Elevation Marketing to learn how our A/B testing experts can boost your CRO.

