More Information
Worked Example
When designing a trial to assess the effectiveness of a new therapy for severe sepsis and septic shock, how many patients are required in the treatment (new therapy) and control (standard therapy) groups? The clinicians measure the effectiveness of the therapies using mean arterial pressure and wish to detect a difference of at least 14 mmHg between the two groups. The standard deviation in each group is 20 mmHg, i.e., the variance is 400 mmHg². To detect a difference of this magnitude that is significant with 95% confidence and a power of 80%, the clinicians will require 33 patients in each group.
Formula
This calculator uses the following formula for the sample size n:
$$n = \frac{2\sigma^2 \left(Z_{\alpha/2} + Z_{\beta}\right)^2}{d^2},$$
where $Z_{\alpha/2}$ is the critical value of the Normal distribution at $\alpha/2$ (e.g. for a confidence level of 95%, $\alpha$ is 0.05 and the critical value is 1.96), $Z_{\beta}$ is the critical value of the Normal distribution at $\beta$ (e.g. for a power of 80%, $\beta$ is 0.2 and the critical value is 0.84), $\sigma^2$ is the population variance, and $d$ is the difference you would like to detect.
Note: This is a standard formula based on the normal distribution. References can be found in many texts, for example the Estimation of Sample Size and Power for Comparing Two Means section in Rosner, B., (2015). Fundamentals of Biostatistics. 8th ed. USA: Cengage Learning.
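As an illustration, the formula can be sketched in a few lines of Python. This is a minimal sketch, not the calculator's actual code: the function name `sample_size_two_means` and its parameter names are assumptions made for this example, and the critical values come from the standard library's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def sample_size_two_means(confidence, power, difference, variance):
    """Per-group sample size for detecting a difference between two means.

    Hypothetical helper that applies the normal-approximation formula
    n = (Z_{alpha/2} + Z_beta)^2 * 2 * sigma^2 / d^2 and rounds up.
    """
    alpha = 1.0 - confidence
    z_alpha_2 = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.96 for 95%
    z_beta = NormalDist().inv_cdf(power)                 # about 0.84 for 80%
    n = (z_alpha_2 + z_beta) ** 2 * 2.0 * variance / difference ** 2
    return math.ceil(n)  # round up to a whole number of patients


# Worked example: 95% confidence, 80% power, d = 14 mmHg, variance = 400 mmHg^2
print(sample_size_two_means(0.95, 0.80, 14, 400))  # 33
```

With the rounded critical values 1.96 and 0.84 the formula gives exactly 32. The unrounded values give about 32.04, which rounds up to the 33 patients per group quoted in the worked example.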
Discussion
The above sample size calculator provides you with the recommended number of samples required to detect a difference between two means. By changing the four inputs (the confidence level, power, difference and population variance) in the Alternative Scenarios, you can see how each input is related to the sample size and what would happen if you didn’t use the recommended sample size.
For some further information, see our blog post on The Importance and Effect of Sample Size.
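To see how each input drives the sample size, the sketch below varies one input at a time around the worked example. The helper `sample_size_two_means`, the scenario names, and the alternative values (99% confidence, 90% power, a 7 mmHg difference, a variance of 900) are all assumptions chosen for illustration. They are not taken from the calculator.

```python
import math
from statistics import NormalDist


def sample_size_two_means(confidence, power, difference, variance):
    """Hypothetical helper: per-group n from the normal-approximation formula."""
    z_alpha_2 = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil((z_alpha_2 + z_beta) ** 2 * 2.0 * variance / difference ** 2)


# Baseline inputs from the worked example
baseline = dict(confidence=0.95, power=0.80, difference=14, variance=400)

# Each scenario changes exactly one input (illustrative values only)
scenarios = {
    "baseline": {},
    "higher confidence (99%)": {"confidence": 0.99},
    "higher power (90%)": {"power": 0.90},
    "smaller difference (7 mmHg)": {"difference": 7},
    "larger variance (900)": {"variance": 900},
}

results = {
    name: sample_size_two_means(**{**baseline, **change})
    for name, change in scenarios.items()
}

for name, n in results.items():
    print(f"{name}: {n} per group")
```

Each of these changes increases the required sample size relative to the baseline: demanding more confidence or power, looking for a smaller difference, or facing a more variable population all require more patients.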
Definitions
Confidence level
This reflects the confidence with which you would like to detect a significant difference between the two means. If your confidence level is 95%, then this means you have a 5% probability of incorrectly detecting a significant difference when one does not exist, i.e., a false positive result (otherwise known as type I error).
Power
The power is the probability of detecting a significant difference when one exists. If your power is 80%, then this means that you have a 20% probability of failing to detect a significant difference when one does exist, i.e., a false negative result (otherwise known as type II error).
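Rearranging the sample size formula for the power shows roughly what happens when a trial uses fewer or more patients than recommended. The sketch below makes that rearrangement explicit. The function name `approximate_power` is an assumption for this example, and the calculation ignores the very small chance of a significant result in the wrong direction.

```python
import math
from statistics import NormalDist


def approximate_power(n_per_group, confidence, difference, variance):
    """Approximate power of a two-sided z-test with n_per_group in each arm.

    Hypothetical helper: solves the sample size formula for Z_beta and
    converts it back to a probability.
    """
    z_alpha_2 = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    # Standard error of the difference between the two group means
    standard_error = math.sqrt(2.0 * variance / n_per_group)
    z_beta = abs(difference) / standard_error - z_alpha_2
    return NormalDist().cdf(z_beta)


# Worked-example inputs with several candidate group sizes
for n in (20, 32, 33, 50):
    print(n, round(approximate_power(n, 0.95, 14, 400), 3))
```

Under these assumptions, 32 patients per group falls just short of 80% power and 33 per group just exceeds it, which agrees with the worked example.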
Hypothesised difference
This is the difference that you would like to detect. Given a difference $d$, your null hypothesis is:
$$H_0: \mu_2 - \mu_1 < d$$
and your alternative hypothesis is:
$$H_1: \mu_2 - \mu_1 \geq d$$
where $\mu_1$ and $\mu_2$ are the means of your two groups. You require a large enough sample size in order to detect a significant difference of $d$ if one exists.
Population variance
This is calculated as:
$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2,$$
where
$$\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$$
and gives you an indication of how variable the population is. When performing significance tests, the sample variance provides an estimate of the population variance for inclusion in the formula.
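The short sketch below computes the population variance directly from its definition and compares it with the sample variance, which divides by N − 1 instead of N. The readings are hypothetical values invented for illustration, not data from any trial, and `population_variance` is a name chosen for this example.

```python
from statistics import pvariance, variance

# Hypothetical mean arterial pressure readings in mmHg (illustrative only)
readings = [60, 80, 80, 80, 90, 90, 110, 130]


def population_variance(values):
    """Population variance: the mean squared deviation from the mean."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


print(population_variance(readings))  # 400.0, i.e. a standard deviation of 20 mmHg
print(pvariance(readings))            # standard library equivalent (divides by N)
print(variance(readings))             # sample variance (divides by N - 1)
```

When only a sample is available, the sample variance is the value you would normally enter into the sample size formula as the estimate of the population variance.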
Sample size
This is the minimum sample size for each group to detect whether the stated difference exists between the two means (with the required confidence level and power). Note that if some people choose not to respond, they cannot be included in your sample, so if non-response is a possibility, your sample size will have to be increased accordingly. In general, the higher the response rate the better the estimate, as non-response will often lead to biases in your estimate.
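One common rule of thumb for allowing for non-response is to divide the required sample size by the response rate you expect. The text above does not prescribe a particular adjustment, so treat the sketch below as an assumption. The function name `inflate_for_nonresponse` and the 80% response rate are illustrative only.

```python
import math


def inflate_for_nonresponse(n_required, expected_response_rate):
    """Number of people to recruit per group so that, on average,
    n_required of them respond (hypothetical rule-of-thumb helper)."""
    if not 0.0 < expected_response_rate <= 1.0:
        raise ValueError("expected_response_rate must be in (0, 1]")
    return math.ceil(n_required / expected_response_rate)


# 33 patients needed per group, assuming 80% of those recruited respond
print(inflate_for_nonresponse(33, 0.8))  # 42
```

This only protects the sample size. It does not remove any bias that arises if non-responders differ systematically from responders.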