Principles of Sampling

We have prepared a sample design for you and created an Excel tool that will help you determine the sample size you need, which facilities/sites to collect data from, and how many beneficiaries to interview at each of those facilities/sites. The toolkit includes a short ‘how-to’ guide to using the tool.

Random selection of beneficiaries is essential to getting a good dataset. This means you cannot simply choose beneficiaries who are in easy and convenient places or who are from the best facilities/sites. Every beneficiary of the facility/site should have an equal chance of being chosen (except in the case of over-sampling for a specific sub-group, although this is generally not recommended).
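As an illustration of equal-chance selection, here is a minimal Python sketch that draws beneficiaries at random from a facility register. The function name `select_beneficiaries`, the register of IDs, and the seed are all hypothetical and are not part of the Excel tool. The sketch only shows the idea of a simple random draw without replacement.

```python
import random


def select_beneficiaries(beneficiary_ids, n, seed=None):
    """Draw n beneficiaries at random, without replacement.

    Every beneficiary on the list has the same chance of being chosen.
    """
    ids = list(beneficiary_ids)
    if n > len(ids):
        raise ValueError("Cannot select more beneficiaries than are on the list")
    # A fixed seed makes the draw reproducible, which helps when documenting it.
    rng = random.Random(seed)
    return rng.sample(ids, n)


# Hypothetical register of 200 beneficiaries at one facility/site.
register = [f"B{i:03d}" for i in range(1, 201)]
selected = select_beneficiaries(register, 10, seed=42)
print(selected)
```

Drawing from the full register, rather than from whoever is easiest to reach on the day, is what keeps the selection free of convenience bias.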

To get the perfect dataset, you would interview every single beneficiary in your program. This is too expensive and time-consuming, so instead we interview a sample of beneficiaries. The downside of sampling is that it reduces accuracy. Determining the sample that you need to use will be a matter of balancing feasibility and affordability against accuracy.

To make things simpler, we have provided below a minimum sample size that will give you an acceptable level of precision. If you can make the sample bigger than the minimum, precision will improve. Please note that any sample smaller than the minimum will give results that are too imprecise to rely on.

What is precision in sampling?

If you were to interview every single beneficiary, your dataset would be completely precise. If you calculated the percentage of beneficiaries who are in the bottom two wealth quintiles from that dataset, the result would be the ‘true’ percentage of the beneficiary population in the bottom two quintiles. Because a sample includes only some of the beneficiaries, the result from the sample will usually differ somewhat from the true percentage. Increasing the sample size makes this difference smaller on average.

You can think of the ‘true’ beneficiary population percentage as being within a range around the sample percentage. If your sample is 100 respondents, the range will be around +/- 10%. For example, if 55% of your sample is in the bottom two national quintiles and your sample is 100 respondents, then the ‘true’ figure will probably be between 45% and 65%. If the sample is 1000 respondents, the range will be around +/- 3%, because the range shrinks with the square root of the sample size. If your sample has 1000 respondents and 55% of the sample was in the bottom two quintiles, you could be confident that the true percentage would be between 52% and 58%. The bigger your sample, the smaller the range and the more precise your result. (These figures assume simple random selection of respondents.)
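The range around a sample percentage can be approximated with the standard formula for a proportion. The Python sketch below assumes simple random sampling and a 95% confidence level (z = 1.96). The function names and the `design_effect` parameter are illustrative assumptions, and the toolkit's own sample design may produce somewhat different ranges.

```python
import math

Z_95 = 1.96  # standard normal value for a 95% confidence level


def margin_of_error(p, n, design_effect=1.0):
    """Approximate half-width of the 95% range for a sample proportion p.

    design_effect=1.0 corresponds to simple random sampling (an assumption).
    """
    return Z_95 * math.sqrt(design_effect * p * (1 - p) / n)


def confidence_interval(p, n, design_effect=1.0):
    """Return the (low, high) range, clipped to stay between 0 and 1."""
    moe = margin_of_error(p, n, design_effect)
    return max(0.0, p - moe), min(1.0, p + moe)


# 55% of the sample is in the bottom two quintiles, at two sample sizes.
for n in (100, 1000):
    low, high = confidence_interval(0.55, n)
    print(f"n={n}: {low:.1%} to {high:.1%}")
```

Because the sample size sits under a square root, a sample ten times larger narrows the range by a factor of about three, not ten.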

The most critical factor in determining the sample size is how large a range you are willing to accept. Note that the minimum sample sizes given on this website will give you a range of about +/- 10%. If you would like a smaller range, so you can be more confident in your results, you’ll have to increase the sample size.
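To see how the acceptable range drives the sample size, the following Python sketch inverts the standard margin-of-error formula for a proportion. It assumes simple random sampling, a 95% confidence level, and the most conservative case of p = 0.5. The function name and its parameters are hypothetical, and the result is not a substitute for the minimum sample sizes given by the Excel tool.

```python
import math


def required_sample_size(margin, p=0.5, z=1.96, design_effect=1.0):
    """Approximate sample size needed for a given +/- range on a proportion.

    margin is a fraction (0.10 means +/- 10%). p=0.5 gives the largest,
    safest sample size; design_effect=1.0 assumes simple random sampling.
    """
    if not 0 < margin < 1:
        raise ValueError("margin must be between 0 and 1")
    n = design_effect * (z ** 2) * p * (1 - p) / margin ** 2
    return math.ceil(n)  # round up so the target range is met


for margin in (0.10, 0.05, 0.03):
    print(f"+/- {margin:.0%}: about {required_sample_size(margin)} respondents")
```

Halving the range requires roughly four times as many respondents, which is why a tighter range quickly becomes expensive.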

The ‘range’ is often referred to as the ‘95% confidence interval’, because ranges calculated in this way contain the true percentage about 95% of the time.

Summary

WE KNOW: Percentage from the sample
WE WANT: Percentage from the population
A sample gives you a range within which the population percentage probably falls (the range is called a 95% confidence interval).