It can be difficult to understand the benefits of running different statistical tests. Whether you're planning an end-to-end experiment or importing a dataset into an Experiment Results chart, this article clarifies how to access a T-test in Amplitude Experiment and how the test is computed.
A T-test compares the means of two populations of data to determine whether the difference between them is statistically significant. Amplitude uses Welch's T-test, which comes with a few assumptions about your dataset:
- The Central Limit Theorem applies to the metric.
- The two populations may have different variances.
- You don't run the T-test until you've reached the sample size specified by the sample size calculator.
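To make the computation concrete, here is a minimal sketch of the textbook Welch's T-test, not Amplitude's actual implementation. The function name `welch_t_test` and the sample data are hypothetical. The p-value uses a normal approximation, which is an assumption that is reasonable only for large samples.

```python
import math
from statistics import NormalDist, mean, variance


def welch_t_test(control, treatment):
    """Return (t statistic, degrees of freedom, approximate two-sided p-value)."""
    n1, n2 = len(control), len(treatment)
    m1, m2 = mean(control), mean(treatment)
    # Sample variances (n - 1 denominator); they are NOT pooled, because
    # Welch's test allows the two populations to have different variances.
    v1, v2 = variance(control), variance(treatment)

    se_sq = v1 / n1 + v2 / n2  # squared standard error of the difference
    t_stat = (m2 - m1) / math.sqrt(se_sq)

    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = se_sq ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))

    # Assumption: samples are large, so the t distribution is close to normal.
    p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    return t_stat, df, p_value


t_stat, df, p_value = welch_t_test([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(f"t = {t_stat:.3f}, df = {df:.2f}, p ~ {p_value:.3f}")
```

With samples as small as the ones in this toy call, the normal approximation is too optimistic; an exact p-value would come from the t distribution with `df` degrees of freedom.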
T-tests can be either two-sided (looking for any change in the metric, in either direction) or one-sided (looking for an increase or a decrease, but not both). A two-sided test does not state up front which direction of change you're looking for, while a one-sided test does. (If you select Increase, the upper confidence interval bound is positive infinity; for Decrease, the lower confidence interval bound is negative infinity.)
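The effect of the direction setting on the confidence interval can be sketched as follows. This is an illustration under stated assumptions rather than Amplitude's code: the function name `confidence_interval` is hypothetical, and normal critical values are used in place of t critical values, which assumes a large sample.

```python
import math
from statistics import NormalDist


def confidence_interval(diff, std_err, direction="any", alpha=0.05):
    """Confidence interval for a difference in means, by test direction."""
    z = NormalDist().inv_cdf
    if direction == "any":
        # Two-sided: split alpha across both tails.
        margin = z(1 - alpha / 2) * std_err
        return diff - margin, diff + margin

    # One-sided: all of alpha goes into a single tail.
    margin = z(1 - alpha) * std_err
    if direction == "increase":
        return diff - margin, math.inf
    if direction == "decrease":
        return -math.inf, diff + margin
    raise ValueError("direction must be 'any', 'increase', or 'decrease'")


for direction in ("any", "increase", "decrease"):
    print(direction, confidence_interval(2.0, 1.0, direction))
```

Because a one-sided test spends its entire error budget on one tail, its finite bound sits closer to the observed difference than the matching bound of a two-sided interval.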
NOTE: If you have yet to run your experiment or your sample size is large enough, you should use sequential testing instead of running a T-test. Read more about the difference in testing options in this blog.
Analyze your data with the T-test
You can access the T-test from either the Plan tab or the Analyze tab in Amplitude Experiment.
From the Plan tab, select Increase or Decrease if you want to run a one-sided test, or select Any to run a two-sided test.
Alternatively, follow these steps to analyze your experiment’s results with a T-test:
- From your Experiment Results chart, click the direction of the T-test to change it (Increase, Decrease, or Any).
- Next, click Statistical Settings. Click T-test to modify the Test Type.
- Enter the number of users needed under Samples Per Variant Needed.
NOTE: The T-test controls the false positive and false negative rates only at a sample size computed in advance. Analyzing your data before reaching that sample size threshold will increase your error rates. See this article for more explanation of how peeking can compromise your experiment.
If you're unsure of the sample size to enter in Samples Per Variant, use Amplitude's sample size calculator. To learn more, see our Help Center article on planning experiments with the help of the sample size calculator.
- Lastly, click Apply to change the statistical settings to T-test.
The Analyze tab will now show the T-test results of your experimental data. Please note that sequential testing is the default test type. If you change the test type to a T-test, Amplitude Experiment will not save that change when you refresh or leave the page. Read more about interpreting test results in our Help Center article on Amplitude's Experiment Results chart.
Manage sample size needed for the T-test
You'll need to reach a minimum sample size before you run a T-test. The Analyze tab will warn you if your data set is too small.
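To see why the minimum sample size depends on your chosen error rates, consider the standard textbook approximation for comparing two means. This is a sketch, not the formula Amplitude's sample size calculator uses. The function name `samples_per_variant` and its parameters are hypothetical, and it assumes both variants share the same standard deviation `sigma`.

```python
import math
from statistics import NormalDist


def samples_per_variant(sigma, mde, alpha=0.05, power=0.8):
    """Approximate users needed per variant for a two-sided test.

    sigma: assumed standard deviation of the metric in each variant
    mde:   minimum detectable effect, as an absolute difference in means
    alpha: false positive rate; power: 1 - false negative rate
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)
    z_power = z(power)
    n = 2 * (z_alpha + z_power) ** 2 * sigma ** 2 / mde ** 2
    return math.ceil(n)


print(samples_per_variant(sigma=1.0, mde=0.1))  # 1570
```

Halving the MDE roughly quadruples the required sample size, because the MDE enters the formula squared.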
You can find more information on your sample size requirements in the Cumulative Exposure graph and its corresponding table. The graph shows a constant, dotted line named Sample Size Target, which is the total number of users needed per variant. The table next to the graph highlights the Exposure Remaining, which is the number of users each variant still needs. This information not only confirms the number of users needed before running the T-test, but also provides an estimate of the time you'll need to complete the experiment before using a T-test to interpret your results.
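The arithmetic behind Exposure Remaining and the time estimate can be sketched as below. All names and numbers are hypothetical, and the estimate assumes each variant keeps receiving a steady number of new users per day.

```python
import math


def exposure_remaining(sample_size_target, exposed_by_variant):
    """Users each variant still needs before reaching the target."""
    return {
        variant: max(sample_size_target - exposed, 0)
        for variant, exposed in exposed_by_variant.items()
    }


def days_to_target(remaining_by_variant, daily_users_per_variant):
    """Days until the slowest variant reaches the target, at a steady rate."""
    return max(
        math.ceil(remaining / daily_users_per_variant)
        for remaining in remaining_by_variant.values()
    )


remaining = exposure_remaining(1570, {"control": 1000, "treatment": 1600})
print(remaining)                       # {'control': 570, 'treatment': 0}
print(days_to_target(remaining, 100))  # 6
```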
Unfortunately, reaching the needed sample size does not guarantee your results will be statistically significant. For example, if your lift is smaller than the MDE, then your results often will not be statistically significant.
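A rough power calculation shows why a lift below the MDE often goes undetected. This sketch uses a normal approximation and ignores the negligible chance of significance in the wrong direction. The function name `approximate_power` and the numbers are hypothetical: an experiment with 1,570 users per variant, sized for an MDE of 0.1 on a metric with standard deviation 1.

```python
import math
from statistics import NormalDist


def approximate_power(true_lift, sigma, n_per_variant, alpha=0.05):
    """Approximate chance that a two-sided test reaches significance."""
    std_err = sigma * math.sqrt(2 / n_per_variant)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(abs(true_lift) / std_err - z_crit)


at_mde = approximate_power(true_lift=0.10, sigma=1.0, n_per_variant=1570)
below_mde = approximate_power(true_lift=0.05, sigma=1.0, n_per_variant=1570)
print(f"lift equal to MDE: {at_mde:.2f}")
print(f"lift at half the MDE: {below_mde:.2f}")
```

Under these assumptions, a lift equal to the MDE is detected roughly 80% of the time, while a lift of half that size is detected less than a third of the time, even though the sample size target was met.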