Experiment Design

Written by Rashan
Updated this week

Types of Geo-Experiments

  1. Hold-out Tests

    • Reduce or completely stop spending in test regions

    • Measures baseline contribution of channel

    • Best for validating channel incrementality

    • Compare decrease in spend to decrease in revenue

  2. Scale-up Tests

    • Increase spending in test regions

    • Measures potential for growth

    • Best for testing saturation points

    • Compare increase in spend to increase in revenue

  3. New Channel Tests

    • Test new channel in specific regions

    • Measures incremental impact of new activity

    • Best for validating expansion plans

    • Compare increase in spend to increase in revenue
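
All three test types end in the same comparison: the change in spend against the change in revenue. A minimal sketch of that ratio in Python, assuming the revenue change has already been estimated against the control regions. The function name and all figures are hypothetical and only illustrate the arithmetic:

```python
def incremental_roas(spend_change: float, revenue_change: float) -> float:
    """Return incremental revenue per unit of incremental spend.

    Both arguments are signed changes in the test regions relative to the
    control-based expectation: negative for a hold-out, positive for a
    scale-up or new channel test.
    """
    if spend_change == 0:
        raise ValueError("spend_change must be non-zero")
    return revenue_change / spend_change


# Hold-out: spend cut by 2,000 and revenue fell by 9,000 (hypothetical figures).
holdout = incremental_roas(-2_000, -9_000)

# Scale-up: spend raised by 5,000 and revenue rose by 15,000 (hypothetical figures).
scale_up = incremental_roas(5_000, 15_000)

print(f"Hold-out iROAS: {holdout}")
print(f"Scale-up iROAS: {scale_up}")
```

Because both changes carry the same sign in a hold-out, the ratio stays positive and can be read the same way as in a scale-up.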

Design Parameters

  • Duration: Typically at least 14 to 21 days

  • Budget: Determined by the expected lift and ROI

  • Geography Selection: Algorithm selects regions to create comparable test/control groups

  • Expected Lift: Minimum detectable effect needed for statistical significance

Channel-Specific Considerations

When testing upper-funnel activities (e.g., Meta Awareness, YouTube), consider the delayed effect:

Example Scenario:

  • Test Duration: 3 weeks

  • Channel’s Known Lag Effect: 2 weeks

  • Analysis Approach:

    1. First Analysis: At the end of the 3-week test period

    2. Final Analysis: At 5 weeks (3 weeks test + 2 weeks lag)

    This ensures we capture the full impact, including delayed conversions.

This is particularly important for:

  • Brand awareness campaigns

  • Video advertising

  • Content marketing

  • Other upper-funnel activities with known lag effects
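
The two analysis points in the example scenario can be worked out from the test start date. The sketch below is illustrative only: the function name and the start date are assumptions, while the 3-week duration and 2-week lag come from the scenario above.

```python
from datetime import date, timedelta


def analysis_dates(start: date, test_weeks: int, lag_weeks: int) -> tuple[date, date]:
    """Return (first_analysis, final_analysis) dates for a geo-experiment.

    The first analysis falls at the end of the test period; the final
    analysis waits for the channel's lag window to pass as well.
    """
    first_analysis = start + timedelta(weeks=test_weeks)
    final_analysis = first_analysis + timedelta(weeks=lag_weeks)
    return first_analysis, final_analysis


# Hypothetical start date; 3-week test with a 2-week lag as in the scenario.
first, final = analysis_dates(date(2024, 1, 1), test_weeks=3, lag_weeks=2)
print(f"First analysis: {first.isoformat()}")
print(f"Final analysis: {final.isoformat()}")
```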

Understanding Expected Lift

The expected lift shown in the experiment design represents:

  • The minimum change needed to validate the input ROI assumption

  • NOT a prediction of actual results

  • A threshold for statistical significance

  • Calculated based on:

    • Input ROI/ROAS

    • Historical performance

    • Geographic variance

    • Test duration

Example:

  • Input ROAS = 10

  • Expected Lift = -5%

This means: to validate a ROAS of 10, we need to see at least a 5% reduction in revenue when reducing spend. If we see less impact, it indicates the actual ROAS is lower than 10.

When designing the experiment, the initial iROAS or CPIC (cost per incremental conversion) estimate we input sets the lower bound of what the test can detect. For example, if we input an iROAS of 10 and the experiment requires a minimum detectable lift of $10,000 to be statistically significant, the required investment is $1,000 ($10,000 ÷ 10). That $1,000 would theoretically return 10× its value, providing the minimum detectable lift. However, if we overestimated the iROAS and the actual value is only 2, the same $1,000 would generate only $2,000 of lift, well below the $10,000 detection threshold, and the experiment would fail to produce a statistically significant result.
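
The arithmetic in this example can be reproduced in a few lines. This is a sketch of the relationship described above, not the tool's actual calculation. The function names are hypothetical, and the figures are the ones used in the paragraph.

```python
def required_spend(min_detectable_lift: float, assumed_iroas: float) -> float:
    """Spend needed so that spend * iROAS reaches the minimum detectable lift."""
    if assumed_iroas <= 0:
        raise ValueError("assumed_iroas must be positive")
    return min_detectable_lift / assumed_iroas


def generated_lift(spend: float, actual_iroas: float) -> float:
    """Incremental revenue the spend produces at the channel's true iROAS."""
    return spend * actual_iroas


min_detectable_lift = 10_000  # lift needed for statistical significance
spend = required_spend(min_detectable_lift, assumed_iroas=10)

# If the iROAS assumption holds, the lift just reaches the threshold.
lift_if_correct = generated_lift(spend, actual_iroas=10)

# If the true iROAS is only 2, the same spend falls short of the threshold.
lift_if_overestimated = generated_lift(spend, actual_iroas=2)

print(f"Required spend: {spend}")
print(f"Detectable at iROAS 10: {lift_if_correct >= min_detectable_lift}")
print(f"Detectable at iROAS 2: {lift_if_overestimated >= min_detectable_lift}")
```

A conservative (lower) iROAS input raises the required spend, but it makes the test more likely to reach the detection threshold if the channel underperforms.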
