💡 Significant and insignificant results should be treated the same. Set clear thresholds and follow them, no matter what the numbers say along the way.
ℹ️ Our fix: We now define both sample size and test duration up front. If a test ends without reaching significance, we log the result, review it, and decide what to do next. But we do not keep pushing.
What we do instead: guardrails, not guesswork
We stopped letting the data tempt us into bad habits. Instead, we built a process that removes the guesswork.
Here’s what that looks like now:
- Every test starts with a sample size calculation. We use our baseline conversion rate and expected minimum effect to estimate how many sessions we’ll need.
- We commit to a test duration before launching. This helps ensure we aren’t reacting emotionally to interim results.
- If we reach the threshold and the result is still insignificant, we stop. We don’t extend the test just to chase a number.
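As a rough illustration of those three steps, here is a minimal Python sketch. It uses the standard normal-approximation formula for comparing two conversion rates, which is one common choice and not necessarily the calculation we or any particular tool run. The function names (`sessions_per_variant`, `planned_duration_days`, `decide`), the 5% significance level, and the 80% power are assumptions made for the example.

```python
import math
from statistics import NormalDist


def sessions_per_variant(baseline_rate, min_relative_lift, alpha=0.05, power=0.80):
    """Approximate sessions needed per variant (two-sided two-proportion z-test).

    alpha and power are assumed defaults, not values from the article.
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + min_relative_lift)  # rate we want to be able to detect
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_power) ** 2 * variance / (p2 - p1) ** 2)


def planned_duration_days(per_variant, daily_sessions, variants=2):
    """Whole days needed to collect the full sample across all variants."""
    return math.ceil(per_variant * variants / daily_sessions)


def decide(sessions_seen, per_variant, p_value, alpha=0.05):
    """Hypothetical guardrail: only judge the test once the planned sample is in."""
    if sessions_seen < per_variant:
        return "keep running"
    if p_value < alpha:
        return "significant: review and act"
    return "not significant: stop and log"


if __name__ == "__main__":
    n = sessions_per_variant(baseline_rate=0.05, min_relative_lift=0.10)
    print("sessions per variant:", n)
    print("planned days at 4,000 sessions/day:", planned_duration_days(n, 4000))
    print(decide(sessions_seen=n, per_variant=n, p_value=0.21))
```

With a 5% baseline and a 10% relative lift, this formula asks for roughly 31,000 sessions per variant. The key design choice is in `decide`: once the planned sample is reached, a non-significant result ends the test instead of extending it.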
💡 It’s easier to trust your results when you’re not constantly moving the finish line.
ℹ️ Umbraco Engage includes built-in support for test guardrails. You can define sample sizes, set durations, and avoid early stopping errors with clear test rules and thresholds.
Summary: Letting your test run too short or too long will cost you
If your A/B test ends before it reaches enough users, the result is unreliable. If you keep it running past the planned threshold in the hope of reaching significance, you raise the odds of mistaking random noise for a real effect.
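To make the cost of extending a test concrete, the following small simulation runs A/A tests, where both variants share the same true conversion rate, so every "significant" result is a false positive. It compares stopping at the planned sample with extending non-significant tests in batches and re-checking each time. All numbers (10% conversion rate, 1,000 planned sessions per arm, five extensions of 200 sessions) and the names `p_value` and `simulate` are assumptions chosen for illustration.

```python
import math
import random
from statistics import NormalDist


def p_value(conv_a, conv_b, n):
    """Two-sided p-value of a pooled two-proportion z-test, n sessions per arm."""
    pooled = (conv_a + conv_b) / (2 * n)
    se = math.sqrt(2 * pooled * (1 - pooled) / n)
    if se == 0:
        return 1.0
    z = (conv_b - conv_a) / n / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulate(experiments=300, planned_n=1000, extensions=5, batch=200,
             rate=0.10, alpha=0.05, seed=1):
    """Return (false positive rate at planned n, rate when extending)."""
    rng = random.Random(seed)

    def conversions(sessions):
        return sum(rng.random() < rate for _ in range(sessions))

    fixed_hits = 0
    extended_hits = 0
    for _ in range(experiments):
        n = planned_n
        conv_a, conv_b = conversions(n), conversions(n)
        significant = p_value(conv_a, conv_b, n) < alpha
        fixed_hits += significant  # verdict if we stop as planned

        # "Chasing a number": keep adding sessions until significance appears.
        for _ in range(extensions):
            if significant:
                break
            conv_a += conversions(batch)
            conv_b += conversions(batch)
            n += batch
            significant = p_value(conv_a, conv_b, n) < alpha
        extended_hits += significant

    return fixed_hits / experiments, extended_hits / experiments


if __name__ == "__main__":
    fixed, extended = simulate()
    print(f"false positives at planned sample: {fixed:.1%}")
    print(f"false positives when extending:    {extended:.1%}")
```

Because every extended test gets extra chances to cross the threshold, the second rate can never be lower than the first, and in practice it tends to drift above the nominal significance level.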