I. Experimental Design:

A/B testing design
- Sample Size
- Test Length
- Applications for recommendation algorithms
- Ability to implement an A/B test; Python/R skills if handed the result of an A
- Communication around A/B testing is vital
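
As one illustration of the kind of Python analysis these bullets point to, the sketch below runs a two-sided two-proportion z-test on conversion counts. It is a minimal sketch, not a prescribed method: the function name `two_proportion_z_test`, the choice of a pooled z-test, and the counts in the usage lines are all assumptions made for illustration.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates (pooled variance).

    conv_a, conv_b: number of converted users in control (A) and experiment (B)
    n_a, n_b:       number of users in each group
    Returns (z statistic, two-sided p-value).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis of no difference
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts, invented for this example only
z, p_value = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=130, n_b=1000)
print(f"z = {z:.3f}, p = {p_value:.4f}")
```

A pooled z-test is only one reasonable choice here; it assumes large samples and independent users, which may not hold for every experiment.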

1 → A/B testing design

What can be tested?

Both visible and invisible changes can be tested with A/B testing.

Examples of invisible changes include page load time and different recommendation algorithms.

A popular example is Amazon’s A/B test, which showed that every 100 ms increase in page load time decreased sales by 1%.

Examples of visible changes include new additions to the UI, changes in design and layout, and headline messages.

What can’t be tested?

New experiences are not well suited to A/B tests.

This is because a new experience can trigger change aversion, where users dislike the change and prefer to stick to the old version, or a novelty effect, where users are excited and want to try everything out.

In both cases, defining a baseline for comparison and deciding the duration of the test are difficult.

How can we choose the metrics?

Metric selection needs to consider both sensitivity and robustness.

Sensitivity means that a metric should pick up the changes we care about, and robustness means that it shouldn’t move much in response to irrelevant effects.

For example, a “mean” is usually sensitive to outliers and therefore not robust, while a “median” is robust but not sensitive to changes that affect only a small group of users.
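
The contrast between the two metrics can be seen on a toy sample. The sketch below uses made-up per-user values (all numbers are hypothetical) and Python's standard `statistics` module: first a single outlier is introduced, then a small group of users changes behavior.

```python
from statistics import mean, median

# Hypothetical per-user values (e.g. revenue per user); invented for illustration
baseline = [10, 11, 11, 12, 12, 12, 13, 13, 14]

# Robustness check: one user becomes an extreme outlier
with_outlier = baseline[:-1] + [500]

# Sensitivity check: only the last two users (a small group) spend 5 more each
small_group_shift = baseline[:-2] + [v + 5 for v in baseline[-2:]]

for name, sample in [
    ("baseline", baseline),
    ("with_outlier", with_outlier),
    ("small_group_shift", small_group_shift),
]:
    print(f"{name:18s} mean={mean(sample):7.2f} median={median(sample):5.1f}")
```

In this toy data the mean jumps when the outlier appears while the median stays put, and the median also stays put when the small group shifts even though the mean moves. Real metrics will rarely separate this cleanly.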

To account for both sensitivity and robustness in metric selection, we can apply filtering and segmentation when creating the control and experiment samples.

Filtering and segmentation can be based on user demographics (e.g. age, gender), the language of the platform, internet browser, device type (e.g. iOS or Android), cohort, etc.
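
As a rough sketch of filtering and segmentation, the example below filters a small list of user records by platform language and then computes a conversion rate per device type and test group. The record fields (`group`, `device`, `language`, `converted`), the helper names `filter_users` and `segment_conversion`, and the data are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical user records; field names and values are assumptions
users = [
    {"id": 1, "group": "control", "device": "iOS", "language": "en", "converted": 1},
    {"id": 2, "group": "control", "device": "iOS", "language": "en", "converted": 0},
    {"id": 3, "group": "control", "device": "Android", "language": "en", "converted": 0},
    {"id": 4, "group": "control", "device": "Android", "language": "de", "converted": 1},
    {"id": 5, "group": "experiment", "device": "iOS", "language": "en", "converted": 1},
    {"id": 6, "group": "experiment", "device": "iOS", "language": "en", "converted": 1},
    {"id": 7, "group": "experiment", "device": "Android", "language": "en", "converted": 0},
    {"id": 8, "group": "experiment", "device": "Android", "language": "de", "converted": 0},
]


def filter_users(records, **criteria):
    """Keep only records whose fields match every given criterion."""
    return [r for r in records if all(r.get(k) == v for k, v in criteria.items())]


def segment_conversion(records, segment_key):
    """Conversion rate per (segment value, test group) pair."""
    buckets = defaultdict(list)
    for r in records:
        buckets[(r[segment_key], r["group"])].append(r["converted"])
    return {key: mean(values) for key, values in buckets.items()}


# Filter to English-language users, then segment by device type
english_users = filter_users(users, language="en")
rates = segment_conversion(english_users, "device")
for (device, group), rate in sorted(rates.items()):
    print(f"{device:8s} {group:11s} conversion={rate:.2f}")
```

Splitting into many segments shrinks each sample, so segment-level differences on data this small would not be meaningful in practice.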