More Information
Worked Example
A retailer has monitored a random sample of 500 customers who viewed their website on a certain day and recorded that 380 of them purchased an item. They therefore estimate that 76% of all customers who view their website go on to purchase at least one item. Given that their website has on average 10,000 views per day and they have estimated this proportion from a random sample, the retailer would also like to know how reliable this estimate is. The 95% confidence interval for this proportion is between 72.35% and 79.65%. If they had in fact monitored half the number of customers, this interval would widen to between 70.77% and 81.23%.
Formula
This calculator uses the following formula for the confidence interval, ci:
ci = p ± Zα/2 × √[(p(1-p)/n) × FPC],
where:
FPC = (N-n)/(N-1),
Zα/2 is the critical value of the Normal distribution at α/2 (e.g. for a confidence level of 95%, α is 0.05 and the critical value is 1.96), p is the sample proportion, n is the sample size and N is the population size. Note that a Finite Population Correction (FPC) has been applied to the confidence interval formula.
The following reference explains how the FPC is used to adjust a variance estimate when sampling without replacement (see pages 141-142).
Daniel WW (1999). Biostatistics: A Foundation for Analysis in the Health Sciences. 7th edition. New York: John Wiley & Sons.
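As a rough illustration of the formula, the following Python sketch computes the interval using only the standard library. The function name proportion_ci and its argument names are our own choices for this example and are not part of the calculator. It uses the exact Normal critical value rather than the rounded 1.96, and with the inputs from the worked example it should give approximately the same interval of 72.35% to 79.65%.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(p, n, N, confidence=0.95):
    """Confidence interval for a proportion with a finite population correction.

    p: sample proportion, n: sample size, N: population size.
    (Hypothetical helper written for this illustration.)
    """
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value, about 1.96 for 95%
    fpc = (N - n) / (N - 1)                  # finite population correction
    margin = z * sqrt(p * (1 - p) / n * fpc)
    return p - margin, p + margin


# Inputs taken from the worked example: 380 purchasers out of 500 sampled,
# with 10,000 daily views treated as the population size.
lower, upper = proportion_ci(p=380 / 500, n=500, N=10_000)
print(f"{lower:.2%} to {upper:.2%}")
```

Passing n=250 instead gives the wider interval quoted for the smaller sample.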
Discussion
Calculating a confidence interval provides you with an indication of how reliable your sample proportion is (the wider the interval, the greater the uncertainty associated with your estimate).
By changing the three inputs (the sample proportion, confidence level and sample size) in the Alternative Scenarios, you can see how each input is related to the confidence interval. The larger your sample size, the more certain you can be that the estimate reflects the population, so the narrower the confidence interval. However, the relationship is not linear, e.g., doubling the sample size does not halve the width of the confidence interval. When the sample is small relative to the population, the width is roughly proportional to 1/√n, so you would need about four times the sample size to halve it.
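To make the non-linear relationship concrete, here is a small Python sketch that prints the width of the 95% interval for several sample sizes, keeping the proportion (76%) and population size (10,000) from the worked example. The helper ci_width is a name we have made up for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def ci_width(p, n, N, confidence=0.95):
    """Full width (upper minus lower) of the interval. Illustrative helper only."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    fpc = (N - n) / (N - 1)  # finite population correction
    return 2 * z * sqrt(p * (1 - p) / n * fpc)


# Each step doubles the sample size; the width shrinks, but by less than half.
for n in (250, 500, 1000, 2000):
    print(f"n = {n:>4}: width = {ci_width(0.76, n, 10_000):.2%}")
```

Because of the finite population correction, the width shrinks slightly faster than 1/√n as the sample becomes a larger share of the population.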
Definitions
Sample proportion
The sample proportion is your ‘best guess’ for what the true population proportion is given your sample of data.
Confidence level
The confidence level describes how often intervals calculated in this way contain the true population proportion. If the survey were repeated many times and a 95% confidence interval calculated each time, you would expect the true value to lie within these intervals on 95% of occasions. The higher the confidence level, the more certain you can be that the interval contains the population proportion, but the wider the interval becomes.
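The repeated-survey interpretation can be checked with a simple simulation. The Python sketch below is an illustration under assumed values, not something the calculator does: it builds a made-up population of 10,000 customers of whom 7,600 purchase, repeatedly draws 500 without replacement, and counts how often the 95% interval contains the true proportion. The result should come out close to 95%, although it will vary a little with the random seed.

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(1)  # fixed seed so the run is repeatable

# Assumed population for the simulation: 7,600 purchasers among 10,000 customers.
N, purchasers, n = 10_000, 7_600, 500
true_p = purchasers / N
population = [1] * purchasers + [0] * (N - purchasers)

z = NormalDist().inv_cdf(0.975)  # critical value for a 95% confidence level
fpc = (N - n) / (N - 1)          # finite population correction

repeats = 2_000
covered = 0
for _ in range(repeats):
    sample = random.sample(population, n)  # sampling without replacement
    p = sum(sample) / n
    margin = z * sqrt(p * (1 - p) / n * fpc)
    if p - margin <= true_p <= p + margin:
        covered += 1

coverage = covered / repeats
print(f"{coverage:.1%} of the intervals contained the true proportion")
```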
Sample size
This is the total number of samples randomly drawn from your population. The larger the sample size, the more certain you can be that the estimate reflects the population. Choosing a sample size is an important aspect when designing your study or survey. For some further information, see our blog post on The Importance and Effect of Sample Size, and for guidance on how to choose your sample size for estimating a population proportion, see our sample size calculator.
Population size
This is the total number of distinct individuals in your population. In this formula we use a finite population correction to account for sampling from populations that are small. If your population is large but you don’t know how large, you can conservatively use 100,000. The confidence interval doesn’t change much for populations larger than 100,000.
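The effect of the population size can be seen by holding the sample fixed and varying N. The Python sketch below is an illustration only: margin_of_error is a hypothetical helper, and the sample proportion of 76% and sample size of 500 are borrowed from the worked example. It prints the finite population correction and the resulting margin of error for several population sizes.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(p, n, N, confidence=0.95):
    """Half-width of the confidence interval. Illustrative helper only."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    fpc = (N - n) / (N - 1)  # finite population correction
    return z * sqrt(p * (1 - p) / n * fpc)


p, n = 0.76, 500
margins = {}
for N in (1_000, 10_000, 100_000, 1_000_000):
    margins[N] = margin_of_error(p, n, N)
    fpc = (N - n) / (N - 1)
    print(f"N = {N:>9,}: FPC = {fpc:.4f}, margin = {margins[N]:.2%}")
```

For small populations the correction noticeably narrows the interval, while beyond about 100,000 the correction is so close to 1 that the margin barely moves.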