Population Proportion – Sample Size

Use this calculator to determine the appropriate sample size for estimating the proportion of your population that possesses a particular property (e.g., they like your product, they own a car, or they can speak a second language) to within a specified margin of error. If you intend to ask more than one question, use the largest sample size across all questions. Note that if the questions do not all have exactly two valid answers (e.g., yes or no) but include one or more additional responses (e.g., “don’t know”), you will need a different sample size calculator.

Calculator

Margin of error (%): 5% is a common choice.

The margin of error is the level of precision you require. This is the range around your estimate in which the true proportion is expected to lie, and it should be expressed in percentage points (e.g., ±2%).

A lower margin of error requires a larger sample size.

Confidence level (%): typical choices are 90%, 95%, or 99%.

The confidence level specifies the amount of uncertainty associated with your estimate. This is the chance that the interval defined by the margin of error will contain the true proportion.

A higher confidence level requires a larger sample size.

Population size: if you don't know, use 100,000.

How many people are there in the population from which you are sampling? The sample size doesn't change much for populations larger than 100,000.

Sample proportion (%): if you're not sure, leave this as 50%.

What do you expect the sample proportion to be? This can often be determined by using the results from a previous survey, or by running a small pilot study.

Alternative Scenarios

With a sample size of ___, ___, ___
Your margin of error would be 9.79%, 3.08%, 0.93%

With a margin of error of ___%, ___%, ___%
Your sample size would be 8763, 2345, 383

With a confidence level of ___%, ___%, ___%
Your sample size would be 270, 383, 660

With a population size of ___, ___, ___
Your sample size would be 80, 278, 370

With a sample proportion of ___%, ___%, ___%
Your sample size would be 139, 288, 246

More Information

Worked Example

If a retailer would like to estimate the proportion of their customers who bought an item after viewing their website on a certain day with a 95% confidence level and 5% margin of error, how many customers do they have to monitor? Given that their website has on average 10,000 views per day and they are uncertain of their current conversion rate (so the conservative sample proportion of 50% is used), they would need to sample 370 customers. If, however, they know from previous studies that they would expect a conversion rate of 5%, then a sample size of 73 would be sufficient.
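The two figures in this example can be checked by hand against the formula given in the Formula section. The working below is a sketch that takes the critical value as 1.96 for a 95% confidence level, uses N = 10,000 as the population size, and assumes the result is rounded up to the next whole customer (the rounding rule is an assumption, not something the page states).

```latex
\begin{aligned}
p = 0.5:&\quad X = \frac{1.96^2 \times 0.5 \times 0.5}{0.05^2} = 384.16,
  &\quad n &= \frac{10000 \times 384.16}{384.16 + 10000 - 1} \approx 369.98 \;\Rightarrow\; 370 \\[6pt]
p = 0.05:&\quad X = \frac{1.96^2 \times 0.05 \times 0.95}{0.05^2} \approx 72.99,
  &\quad n &= \frac{10000 \times 72.99}{72.99 + 10000 - 1} \approx 72.47 \;\Rightarrow\; 73
\end{aligned}
```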

Formula

This calculator uses the following formula for the sample size n:

$$n = \frac{N X}{X + N - 1},$$

where

$$X = \frac{Z_{\alpha/2}^2 \, p(1-p)}{\mathrm{MOE}^2},$$

and $Z_{\alpha/2}$ is the critical value of the Normal distribution at $\alpha/2$ (e.g., for a confidence level of 95%, $\alpha$ is 0.05 and the critical value is 1.96), MOE is the margin of error, $p$ is the sample proportion, and $N$ is the population size. Note that a Finite Population Correction (FPC) has been applied to the sample size formula.
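The formula above can be sketched in a few lines of Python. This is an illustrative implementation, not the calculator's own code: the function name `sample_size` and its argument names are hypothetical, percentages are passed as fractions (0.05 for 5%), and rounding the result up to a whole number is an assumption that happens to reproduce the figures in the worked example.

```python
import math
from statistics import NormalDist


def sample_size(margin_of_error, confidence_level, population_size, proportion=0.5):
    """Sample size for estimating a proportion, with finite population correction.

    margin_of_error, confidence_level and proportion are fractions (0.05 = 5%).
    """
    alpha = 1.0 - confidence_level
    # Two-sided critical value Z_{alpha/2} of the standard Normal distribution.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # X: the sample size that would be needed for a very large population.
    x = z**2 * proportion * (1.0 - proportion) / margin_of_error**2
    # Apply the finite population correction.
    n = population_size * x / (x + population_size - 1)
    # Assumption: round up to the next whole respondent.
    return math.ceil(n)


# Worked example: 10,000 views per day, 95% confidence, 5% margin of error.
print(sample_size(0.05, 0.95, 10_000))        # conversion rate unknown -> 370
print(sample_size(0.05, 0.95, 10_000, 0.05))  # expected rate of 5%     -> 73
```

The critical value is computed from the confidence level rather than hard-coded, so it is 1.959964 rather than the rounded 1.96; the difference does not change either result here.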

The following reference explains how the FPC is used to adjust a variance estimate when sampling without replacement (see pages 141-142).

Daniel WW (1999). Biostatistics: A Foundation for Analysis in the Health Sciences. 7th edition. New York: John Wiley & Sons.

Discussion

The above sample size calculator provides you with the recommended number of samples required to estimate the true population proportion with the required margin of error and confidence level.

You can use the Alternative Scenarios to see how changing each of the four inputs (the margin of error, confidence level, population size and sample proportion) affects the sample size. By watching what happens to the alternative scenarios you can see how each input is related to the sample size and what would happen if you didn’t use the recommended sample size. The larger the sample size, the more certain you can be that the estimates reflect the population, so the narrower the confidence interval. However, the relationship is not linear: doubling the sample size does not halve the width of the confidence interval. For a large population it narrows the interval by a factor of about 1/√2, or roughly 30%.
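To see the non-linear relationship in numbers, the sample size formula can be solved for the margin of error, giving MOE = Z × sqrt(p(1 − p)/n × (N − n)/(N − 1)). The sketch below applies that rearrangement; the function name `margin_of_error` and the sample sizes in the loop are illustrative choices, not values taken from the calculator.

```python
import math
from statistics import NormalDist


def margin_of_error(sample_size, confidence_level, population_size, proportion=0.5):
    """Margin of error (as a fraction) achieved by a given sample size."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence_level) / 2.0)
    # Finite population correction factor (N - n) / (N - 1).
    fpc = (population_size - sample_size) / (population_size - 1)
    variance = proportion * (1.0 - proportion) / sample_size * fpc
    return z * math.sqrt(variance)


# Each doubling of n shrinks the margin of error by roughly 1/sqrt(2), not 1/2.
for n in (400, 800, 1600):
    moe = margin_of_error(n, 0.95, 100_000)
    print(f"n = {n:5d}: margin of error = {moe:.2%}")
```

Halving the margin of error therefore takes roughly four times the sample size, which is why tight margins become expensive quickly.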

For some further information, see our blog post on The Importance and Effect of Sample Size.

Definitions

Margin of error

The margin of error is the level of precision you require. This is the plus or minus number that is often reported with an estimated proportion; it is half the width of the confidence interval. The estimate plus or minus the margin of error gives the range in which the true population proportion is estimated to be, and it is often expressed in percentage points (e.g., ±2%). Note that the actual precision achieved after you collect your data will be more or less than this target amount, because it will be based on the proportion estimated from the data and not your expected sample proportion.

Confidence level

The confidence level is the probability that the interval defined by the estimate plus or minus the margin of error contains the true proportion. For example, with a 95% confidence level, if the study was repeated and the range calculated each time, you would expect the true value to lie within these ranges on 95% of occasions. The higher the confidence level, the more certain you can be that the interval contains the true proportion.
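In the sample size formula, the confidence level enters through the critical value of the Normal distribution. A minimal sketch of that conversion, using Python's standard library (the helper name `critical_value` is hypothetical):

```python
from statistics import NormalDist


def critical_value(confidence_level):
    """Two-sided critical value Z_{alpha/2} for a confidence level given as a fraction."""
    alpha = 1.0 - confidence_level
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {critical_value(level):.3f}")
# 90% confidence: z = 1.645
# 95% confidence: z = 1.960
# 99% confidence: z = 2.576
```

Because the critical value is squared in the formula, moving from 95% to 99% confidence raises the required sample size by a factor of about (2.576 / 1.960)², or roughly 1.7, before the finite population correction is applied.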

Population size

This is the total number of distinct individuals in your population. In this formula we use a finite population correction to account for sampling from populations that are small. If your population is large but you don’t know how large, you can conservatively use 100,000. The sample size doesn’t change much for populations larger than 100,000.
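The reason the sample size levels off can be seen by dividing the numerator and denominator of the sample size formula by N. The rearrangement below uses the same symbols as the Formula section (n for the sample size, N for the population size, and X for the uncorrected sample size).

```latex
n = \frac{N X}{X + N - 1}
  = \frac{X}{1 + \dfrac{X - 1}{N}}
  \;\longrightarrow\; X \quad \text{as } N \to \infty .
```

For a 95% confidence level, a 5% margin of error and a 50% sample proportion, X is about 384.16, so the sample size can never exceed 385 however large the population is (assuming results are rounded up), and it has already reached 383 at a population of 100,000.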

Sample proportion

The sample proportion is what you expect the results to be. This can often be determined by using the results from a previous survey, or by running a small pilot study. If you are unsure, use 50%, which is conservative and gives the largest sample size. Note that this sample size calculation uses the Normal approximation to the Binomial distribution. If the sample proportion is close to 0 or 1, then this approximation is not valid and you need to consider an alternative sample size calculation method.
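The claim that 50% gives the largest sample size follows from the way the sample proportion enters the formula: it appears only through the product p(1 − p), and the sample size grows as that product grows. A short derivation:

```latex
\frac{d}{dp}\,\bigl[p(1-p)\bigr] = 1 - 2p = 0
\;\Longrightarrow\; p = \tfrac{1}{2},
\qquad
\max_{0 \le p \le 1} p(1-p) = \tfrac{1}{4}.
```

Any other expected proportion gives a smaller product and therefore a smaller sample size. At p = 0.05, for instance, p(1 − p) = 0.0475, which is why the worked example needs only 73 customers rather than 370.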

Sample size

This is the minimum sample size you need to estimate the true population proportion with the required margin of error and confidence level. Note that if some people choose not to respond, they cannot be included in your sample, so if non-response is a possibility, the number of people you approach will have to be increased accordingly. In general, the higher the response rate the better the estimate, as non-response will often lead to biases in your estimate.
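One common way to allow for non-response is to divide the required sample size by the response rate you expect. The sketch below assumes that simple adjustment; the function name `invitations_needed` and the 50% response rate are hypothetical, and the calculator itself does not perform this step.

```python
import math


def invitations_needed(required_responses, expected_response_rate):
    """Number of people to approach so that enough of them are expected to respond."""
    if not 0.0 < expected_response_rate <= 1.0:
        raise ValueError("expected_response_rate must be in the interval (0, 1]")
    # Round up: a fractional person cannot be invited.
    return math.ceil(required_responses / expected_response_rate)


# 370 completed responses needed, with an assumed response rate of 50%.
print(invitations_needed(370, 0.5))  # 740
```

Inflating the number of invitations only protects the margin of error. It does not remove any bias that arises if the people who respond differ from those who do not.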