If you interviewed every single beneficiary, your dataset would describe the beneficiary population exactly. If you calculated the percentage of beneficiaries in the bottom two wealth quintiles from that dataset, the result would be the ‘true’ percentage of the beneficiary population in the bottom two quintiles. Because a sample includes only some of the beneficiaries, the percentage calculated from the sample will usually differ somewhat from the true percentage. Increasing the sample size tends to shrink this difference.
You can think of the ‘true’ beneficiary population percentage as being within a range around the sample percentage. If your sample is 100 respondents, the range will be around +/- 10%. For example, if 55% of your sample is in the bottom two national quintiles and your sample is 100 respondents, then the ‘true’ figure will probably be between 45% and 65%. If the sample is 1000 respondents, the range will be around +/- 3%, assuming respondents are selected at random. So if your sample has 1000 respondents and 55% of the sample is in the bottom two quintiles, you could be confident that the true percentage is between 52% and 58%. The bigger your sample, the more precise it is and the smaller the range. The range shrinks more slowly than the sample grows, though: ten times as many respondents gives a range only about one third as wide.
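The relationship between sample size and range can be checked with a few lines of Python. This is a minimal sketch, not a method taken from this website: it assumes a simple random sample and uses the common normal approximation for a proportion. The function name `margin_of_error` and the value `z = 1.96` (the usual multiplier for a 95% range) are assumptions introduced for illustration. Surveys that select respondents in clusters will generally have wider ranges than this sketch suggests.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate half-width of the 95% range for a sample proportion.

    p: proportion observed in the sample (0.55 means 55%)
    n: number of respondents
    z: multiplier for the confidence level (1.96 is the usual 95% value)
    Assumes simple random sampling and the normal approximation.
    """
    return z * math.sqrt(p * (1 - p) / n)

# Compare a few sample sizes for a sample percentage of 55%.
for n in (100, 400, 1000):
    moe = margin_of_error(0.55, n)
    low, high = 0.55 - moe, 0.55 + moe
    print(f"n={n}: +/- {moe * 100:.1f} points, range {low * 100:.1f}% to {high * 100:.1f}%")
```

Because the sample size sits under a square root, quadrupling the sample only halves the range.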
The most critical factor in determining the sample size is how large a range you are willing to accept. Note that the minimum sample sizes given on this website will give you a range of about +/- 10%. If you would like a smaller range, so you can be more confident in your results, you’ll have to increase the sample size.
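To see roughly how many respondents a given range requires, the same approximation can be turned around. The sketch below is illustrative only and makes several assumptions that do not come from this website: simple random sampling, a 95% confidence level (`z = 1.96`), and a default of `p = 0.5`, which is the most cautious choice because it gives the widest range. The function name `required_sample_size` is hypothetical.

```python
import math

def required_sample_size(margin, p=0.5, z=1.96):
    """Approximate number of respondents needed for a given range.

    margin: acceptable half-width of the range (0.10 means +/- 10 points)
    p: expected proportion; 0.5 is the most cautious assumption
    z: multiplier for the confidence level (1.96 is the usual 95% value)
    Assumes simple random sampling and the normal approximation.
    """
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

# Compare the sample needed for progressively narrower ranges.
for margin in (0.10, 0.05, 0.03):
    n = required_sample_size(margin)
    print(f"+/- {margin * 100:.0f} points needs about {n} respondents")
```

Under these assumptions, a range of about +/- 10% needs roughly 100 respondents, and halving the range requires about four times as many.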
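The ‘range’ is often referred to as the ‘95% confidence interval’, because you can be about 95% sure that the true percentage is within that range. More precisely, if you repeated the survey many times with new samples, about 95% of the ranges calculated this way would contain the true percentage.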
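One way to build intuition for the ‘95%’ is to simulate many surveys of a population whose true percentage is known, and count how often the calculated range contains it. The sketch below is a toy simulation under stated assumptions: a true percentage of 55%, 100 respondents per survey drawn at random, and the normal-approximation range. The names `covers_truth` and `coverage` are hypothetical, and the fixed seed is there only to make the run repeatable.

```python
import math
import random

def covers_truth(true_p, n, rng, z=1.96):
    """Simulate one survey of n respondents; report whether its range contains true_p."""
    # Each respondent is in the bottom two quintiles with probability true_p.
    hits = sum(rng.random() < true_p for _ in range(n))
    p_hat = hits / n
    moe = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - moe <= true_p <= p_hat + moe

def coverage(true_p=0.55, n=100, trials=2000, seed=0):
    """Share of simulated surveys whose range contains the true percentage."""
    rng = random.Random(seed)  # fixed seed so the simulation is repeatable
    return sum(covers_truth(true_p, n, rng) for _ in range(trials)) / trials

print(f"Ranges containing the true percentage: {coverage():.1%}")
```

The share should come out close to 95%, though not exactly, because the formula is an approximation and the simulation itself is random.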
Summary
- WE KNOW: Percentage from the sample
- WE WANT: Percentage from the population
- A sample gives you a range within which the population percentage probably falls (the range is called a 95% confidence interval)