A/B Testing: Checking Your Marketing Ideas for Validity

Chukwuebuka Justus
5 min read · Jan 4, 2021

A/B testing is the practice of testing different variations of a hypothesis or an idea to learn their effect and decide which one to adopt.

A/B Testing. Image by Optimizely

The ideas in this post were inspired by the CXL Institute mini degree in Conversion Rate Optimization.

A/B testing is a very powerful way of validating an idea, but it comes with several challenges, including:

• Not having enough data

• Not adopting the proper statistics

• Knowing what to test first

• Knowing when to stop the test

• Knowing the proper way to determine the winner

• Determining how long the test should run.

There are three parts to the A/B testing exercise:

1. Planning the Test

2. Running the Test

3. Measuring and reporting the outcome of the Test

Planning Stage

A key question here is “do you have enough data to conduct an A/B test?”

To answer this question, we employ the ROAR model:

R — Risk

O — Optimization

A — Automation

R — Re-think

If you don’t have at least 1,000 conversions per month, you may not be able to run an A/B test. You could still run one, but you would most probably end up with a false winner because you have too little data.

Statistical power is the likelihood that an experiment will detect an effect when there is an effect to be detected. It depends on a few factors:

· Sample size

· Effect size

· Significance level.
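To see how these three factors interact, here is a minimal sketch that approximates the power of a two-sided test comparing two conversion rates. It assumes a normal approximation, and the function name and the example numbers are illustrative rather than taken from any particular tool.

```python
from statistics import NormalDist


def power_two_proportions(p_control, p_variant, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test.

    p_control, p_variant: true conversion rates (the gap is the effect size)
    n_per_group: visitors in each variation (sample size)
    alpha: significance level (0.05 corresponds to 95% confidence)
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Standard error of the difference between the two observed rates
    se = ((p_control * (1 - p_control) + p_variant * (1 - p_variant)) / n_per_group) ** 0.5
    effect = abs(p_variant - p_control)
    # Chance that the observed difference lands beyond either critical value
    return z.cdf(effect / se - z_crit) + z.cdf(-effect / se - z_crit)


# Hypothetical scenario: 4% baseline, 15% relative uplift (4.6%)
for n in (2_000, 10_000, 20_000):
    print(n, round(power_two_proportions(0.04, 0.046, n), 2))
```

Raising the sample size, the effect size, or alpha each pushes the power up, which is why low-traffic sites can only detect large uplifts.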

If you have up to 10,000 conversions a month, you can run many A/B tests at the same time. With this volume you can start 4 A/B tests per week, that is, about 200 conversions per test, and you will need a full team to run them.

With 1,000 conversions per month, the challenger variation has to beat the control by 15%. This means that if the control gets 100 conversions per month, the challenger needs to get 115 conversions per month.

If you have up to 10,000 conversions per month, the challenger needs only a 5% uplift to be declared the winner.

To calculate the sample size needed for an A/B test, you can use a calculator such as the one at AB Test Guide: www.abtestguide.com/testsize. It tells you how much data you need for a conclusive result in your tests.

Without the need for complicated calculations, you only need to provide your website conversion rate and the number of unique visitors per month.

For instance, if your conversion rate is 4% and you have 10,000 unique visitors per week, this translates to 400 transactions per week and about 1,600 per month, so you can run a good test.

Once you fill out the two fields, it will calculate how long you need to run the test to detect a 15% uplift in conversion.
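The kind of calculation such a tool performs can be approximated in a few lines. The sketch below is not the AB Test Guide formula (which may differ); it assumes a standard two-sided two-proportion z-test at 95% confidence and 80% power. The function names and the weekly traffic figure are illustrative assumptions.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variant(baseline_rate, relative_uplift, alpha=0.05, power=0.80):
    """Visitors needed in EACH variation to detect the given relative uplift."""
    z = NormalDist()
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_uplift)
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_beta = z.inv_cdf(power)           # desired statistical power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def weeks_needed(n_per_variant, weekly_visitors, variants=2):
    """Whole weeks of traffic required when visitors are split across variants."""
    return ceil(n_per_variant * variants / weekly_visitors)


# 4% conversion rate and 15% uplift as in the text; traffic is an assumed figure
n = sample_size_per_variant(baseline_rate=0.04, relative_uplift=0.15)
print("visitors per variant:", n)
print("weeks to run:", weeks_needed(n, weekly_visitors=10_000))
```

Rounding the duration up to whole weeks keeps every weekday equally represented in both variations.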

Significance Level

Test at a 90–95% confidence level (a significance level of 0.10 to 0.05); otherwise you may declare a winner when there is actually no winner, which is a false positive.

Significance Level calculator.
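As a rough illustration of what a significance calculator does, the sketch below runs a pooled two-proportion z-test and compares the p-value with alpha = 0.05 (the 95% level). The visitor and conversion counts are hypothetical, and real tools may use a different test.

```python
from statistics import NormalDist


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for the difference between two conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variations convert equally
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = (pooled * (1 - pooled) * (1 / n_a + 1 / n_b)) ** 0.5
    z = (p_b - p_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical results: control 400/10,000 vs. challenger 460/10,000
p_value = two_proportion_p_value(400, 10_000, 460, 10_000)
alpha = 0.05
print("p-value:", round(p_value, 4))
print("significant at 95%:", p_value < alpha)
```

If the p-value is above alpha, the observed uplift could plausibly be noise, and declaring a winner would risk a false positive.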

Selecting Your A/B Test KPIs

Another key challenge is selecting what to test and how to measure outcomes.

We will be looking at when to pick which KPIs.

KPIs (key performance indicators) are things that show progress. Most of the time we refer to them as key metrics. A KPI could be changing customer behavior, increasing transaction frequency, generating more clicks or leads, or any other goal we measure our test against.

· Clicks: Clicks can be measured as a metric, but in many cases they are not a very clear metric, especially if you are concerned about revenue.

· Behavior: A shift in behavior can be optimized for, especially if you are looking to raise the conversion rate.

· Transaction: This is important if you are optimizing to grow your business. Under transactions you could measure Revenue Per User. Another KPI we could optimize for is Customer Lifetime Value.

Note: Most A/B testing tools only work with binary values (1/0). This means each visitor either hits the goal or does not.

Goal Metric: This is the metric you want to raise. It can differ between organizations. Some would aim for value per transaction, higher sales of a particular product, more subscriptions for a product, etc.

It is important to have a complementary set of metrics so they don’t conflict with each other. This is where the Overall Evaluation Criterion (OEC) comes in, to harmonize the metrics so they reinforce each other.

· Selecting your OEC can be a challenging task, as you will need to talk to most of the teams to learn which criterion will be most beneficial to all of them.

· The OEC should be based on short-term goals that indicate a long-term goal and are hard to manipulate.

· Think about Customer Lifetime Value, not just the revenue.

· Look out for success indicators, not vanity metrics. Measure what matters.

Hypothesis Setting

A simple framework helps align people on what you are testing and cuts down on discussion meetings:

1. The Problem

2. Proposed Solution

3. Proposed Outcome

If I apply this (Psychology), then this behavior change (Data) will happen, among this group (Data), because of this reason (Psychology).

Example of a Hypothesis.

Self-Efficacy:

Self-efficacy affects every aspect of human behavior.

By determining the beliefs a person holds regarding his or her power to affect situations, it strongly affects both the power the person actually has to face challenges competently and the choices the person is most likely to make.

Self-Efficacy testing image. By wikispaces.psu.edu

Self-Efficacy Example:

If we boost self-efficacy (Psychology), the conversion rate will increase (Data) among new users (Data), because it reduces their fear of potential obstacles (Psychology).

Application:

If we show a happy 65+ person filling in this form (UX), the conversion rate will increase (Data) among new users (Data), because it reduces their fear of potential obstacles (Psychology).

By showing an unlikely person (someone aged 65+) performing a task perceived to be difficult, we lead people to believe the task is easier than they expected, so they agree to perform it too.

Another way to apply self-efficacy is to show a picture of an elderly person, saying they completed the form in about five minutes.

Prioritizing Your A/B Tests

Well-known prioritization models such as PIE and ICE help us organize our tests in the proper order for more credible results.

PIE:

P = Potential

I = Importance

E = Ease

ICE:

I = Impact

C = Confidence

E = Effort
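A prioritization model like this can be turned into a simple ranking. The sketch below scores ideas with the three PIE factors, assuming each factor is rated from 1 to 10 and the three ratings are averaged. That scoring convention, the class and function names, and the example ideas are all illustrative assumptions, not something prescribed in this post.

```python
from dataclasses import dataclass


@dataclass
class Idea:
    """One candidate A/B test with hypothetical 1-10 ratings per PIE factor."""
    name: str
    potential: int
    importance: int
    ease: int

    @property
    def pie_score(self) -> float:
        # Assumed convention: the PIE score is the plain average of the three ratings
        return (self.potential + self.importance + self.ease) / 3


def prioritize(ideas):
    """Return ideas ordered from highest to lowest PIE score."""
    return sorted(ideas, key=lambda idea: idea.pie_score, reverse=True)


# Hypothetical backlog of test ideas
backlog = [
    Idea("New hero image", potential=5, importance=6, ease=8),
    Idea("Shorter signup form", potential=8, importance=7, ease=9),
    Idea("Checkout redesign", potential=9, importance=9, ease=3),
]

for idea in prioritize(backlog):
    print(f"{idea.pie_score:.1f}  {idea.name}")
```

The same structure works for ICE by swapping in its three factors, keeping in mind that a high Effort rating should lower an idea's priority rather than raise it.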

Hypothesis × Location: you need a well-researched hypothesis applied in the proper location.

There is more to A/B testing than I could possibly discuss in a single blog post, but if you would like a more comprehensive understanding of A/B testing, visit CXL Institute for their mini degree in Conversion Rate Optimization.
