Sampling is the key to survey research. No matter how well a study is done in other ways, if the sample has not been properly drawn, the results cannot be regarded as correct. Though this chapter may be more difficult than the others, it is perhaps the most important chapter in this book. It applies mainly to surveys, but is also important for planning other types of research.

The first concept you need to understand is the difference between a population and a sample.

To make a sample, you first need a population. In non-technical language, **population** means "the number of people living in an area." This meaning of population is also used in survey research, but this is only one of many possible definitions of population. The word **universe** is sometimes used in survey research, and means exactly the same in this context as population.

The **unit of population** is whatever you are counting: there can be a population of people, a population of households, a population of events, institutions, transactions, and so forth. Anything you can count can be a population unit. But if you can't get information from it, and you can't measure it in some way, it's not a unit of population that is suitable for survey research.

For a survey, various limits (geographical and otherwise) can be placed on a population. Some populations that could be covered by surveys are...

- All people living in Cambodia.
- All people aged 18 and over.
- All households in Hanoi.
- All schools in Australia.
- All instances of tuning in to a radio station in the last seven days.

...and so on. If you can express it in a phrase beginning "All," and you can count it, it's a population of some kind. The commonest kind of population used in survey research uses the formula:

- All people aged X years and over, who live in area Y.

The "X years and over" criterion usually rules out children below a certain age, both because of the difficulties involved in interviewing them and because many research questions don't apply to them.

Even though some populations can't be questioned directly, they're still populations. For example, schools can't fill in questionnaires, but somebody can do so on behalf of each school. The distinction is important when finding the answers to questions like "What proportion of schools in Western Samoa have libraries?" You need only one questionnaire from each school - not one from each teacher, or one from each student.

Often, the population you end up surveying is not the population you really wanted, because some part of the population cannot be surveyed. For example, if you want to survey opinions among the whole population of an area, and choose to do the survey by telephoning people at home, the population you actually survey will be people with a telephone in their home. If the people with no telephone have different opinions, you will not discover this.

As long as the surveyed population is a high proportion of the wanted population, the results obtained should also be true for the larger population. For example, if 90% of homes have a telephone, the 10% without a phone would have to be very different, for the survey's results not to be true for the whole population.

A sampling frame can be one of two things: either a list of all members of a population, or a method of selecting any member of the population. The term **general population** refers to everybody in a particular geographical area. Common sampling frames for the general population are electoral rolls, street directories, telephone directories, and customer lists from utilities which are used by almost all households: water, electricity, sewerage, and so on.

It is best to use the list that is most accurate, most complete, and most up to date. This differs from country to country: in some countries, the best lists are of households; in others, of people. For most surveys, a list of households (especially if it is in street order) is more useful than a list of people. Another commonly used sampling frame (which I do not recommend for sampling people) is a map.

A sample is a part of the population from which it was drawn. Survey research is based on sampling, which involves getting information from only some members of the population.

If information is obtained from the whole population, it's not a sample, but a census. Some surveys, based on very small populations (such as all members of an organization) in fact are censuses and not sample surveys. When you do a census, the techniques given in this book still apply, but there is no sampling error - as long as the whole group participates in the census.

Samples can be drawn in several different ways, such as probability samples, quota samples, purposive samples, and volunteer samples.

Probability samples

Sometimes known as random samples, probability samples are the most accurate of all. It is only with a probability sample that it's possible to accurately estimate how different the sample is from the whole population. With a probability sample, every member of the population has an equal (or known) chance of being included in the sample. In most professional surveys, each member of the population has the same chance of being included in the sample, but sometimes certain types of people are deliberately over-represented in the sample. Results are then adjusted to compensate for the sample imbalance.

With a probability sample, the first step is usually to try to find a sampling frame: a list of all members of the population. Using this list, individuals or households are numbered, and some numbers are chosen at random to determine who is surveyed. If no population list is available, other methods are used to ensure that every population member has an equal (or other known) chance of inclusion in the survey.

Quota samples

In the early days of survey research, quota sampling was very common. No population list is used, but a quota, usually based on census data, is drawn up.

For example, suppose the general population is being surveyed, and 50% of them are known to be male, and half of each sex is aged over 40. If each interviewer had to obtain 20 interviews, she or he would be told to interview 10 men and 10 women, 5 of each aged under 40, and 5 of each aged 40-plus. It is usually the interviewers who decide where they find the respondents. In this case, age and sex are referred to as **control variables**.

A problem with quota samples is that some respondents are easier to find than others. The interviewer in the previous example may have quickly found 10 women, and 5 men over 40, but may then have taken a lot of time finding men under 40. If too many control variables are used, interviewers will waste a lot of time trying to find respondents to fit particular categories. For example, if quotas had been specified in terms of occupation and household size, as well as age and sex, an interviewer might be asked to find "2 male butchers aged 40 to 44, living in households of 8 or more people".

It's important with quota sampling to use appropriate control variables. Are some people in a category more likely to take part in the survey than others? And are those same people also likely to give different answers from those in another category? If so, that category should be a control variable.

For example, if women are more willing than men to be surveyed (which is generally true) and if the two sexes' patterns of answers are expected to be quite different, then the quota design should obtain balanced numbers from each sex. In fact, sex and age group are the two commonest control variables in quota surveys, but occasionally a different variable can be the most relevant. If you're planning a quota sample, you can't assume that by getting the right proportion in each age group for each sex, everything else will be OK.

Pure quota samples are little used these days, except for surveys done in public places, but sometimes partial quota sampling can be useful. A common example is when choosing one respondent from a household. The probability method begins by finding out how many people live in the household, then selecting an interviewee purely at random. There are practical problems with this approach (explained later in this chapter), so when a household has been randomly selected, quota sampling is often used to choose the person to be interviewed.

Volunteer samples

Samples of volunteers should generally be treated with suspicion. However, as all survey research involves some element of volunteering, there is no fixed line between a volunteer sample and a probability sample. The main difference between a pure volunteer sample and a probability sample of volunteers is that in the former case, volunteers make all the effort; no sampling frame is used.

The main source of problems with volunteer samples is the proportion who volunteer. If too few of the population volunteer for the survey, you must wonder what was so special about them. There is usually no way of finding out how those who volunteered are different from those who didn't. But if the whole population volunteer to take part in the survey, there's no problem.

In some circumstances, volunteer samples can be useful. For example, the Australian Broadcasting Corporation used to survey panels of listeners to its two serious radio networks, Radio National and Classic FM. To recruit people for the panels, the networks broadcast advertisements, asking those interested to contact the ABC.

To check whether these volunteers were representative of all listeners to the networks, we carried out random surveys of listeners to these networks, and compared the answers of panel members and randomly selected listeners. We found that the volunteers were representative in most ways, but had slightly higher education levels, were a little younger than other listeners, and listened to the station much more often.

As only about 5% of the population listen regularly to these networks, random surveys are very expensive to conduct (20 households must be contacted to find each listener), so using volunteer samples saves a lot of money. Not all people who volunteer for panels need be accepted. A quota system can be used, to ensure that various parts of the population are accurately represented.

When people who know nothing about sampling organize surveys, they often have a large number of questionnaires printed, and offer one to everybody who's interested. Amateur researchers often seem to feel that if the number of questionnaires returned is large enough, the lack of a sample design isn't important. Certainly, you will get some results, but you will have no way of knowing how representative the respondents are of the population. You may not even know what the population *is,* with this method. The less effort that goes into distributing questionnaires to particular individuals and convincing them that participation is worthwhile, the more likely it is that those who complete and return questionnaires will be a very small (and probably not typical) section of the population.

About the only way in which a volunteer sample can produce accurate results (without being checked against a probability sample), is if a high proportion of the population voluntarily returns questionnaires. I've known this to work a few times, usually in country areas with a small population, where up to about 50% of all households have returned questionnaires. Even so, if all the effort is left to the respondents, there's no certainty that somebody who wants to distort the results has not filled in hundreds of questionnaires.

The same problems apply to drawing conclusions from unsolicited mail and phone calls. For example, politicians sometimes make claims like "My mail is running five to one in favour of the stand I made last week." There are many reasons why the letters sent in may not be representative of the population. The same applies to letters sent to broadcasting organizations: all these tell you is the opinions of the letter-writers. It is only when the majority of listeners write letters that the opinions expressed in these letters *might* be representative.

Purposive samples

A purposive sample is one in which a surveyor tries to create a representative sample without sampling at random.

One of the commonest uses of purposive sampling is in selecting a group of geographical areas to represent a larger area. For example, door-to-door interviewing can become extremely expensive in rural areas with a low population density. In a country such as Cambodia, it is not feasible to do a door-to-door survey covering the whole country. Though areas could be picked purely at random, if the budget was small and only a small number of towns and cities could be included, you might choose these in a purposive way, perhaps ensuring that different types of town were included. However, there are better ways to do this - for example...

Maximum variation samples

A maximum variation sample (sometimes called a maximum diversity sample) is a special kind of purposive sample. Normally, a purposive sample is not representative, and does not claim to be. A maximum variation sample aims to be more representative than a random sample (which, despite what many people think, is not always the most representative, especially when the sample size is small).

Instead of seeking representativeness through equal probability, it's sought by including a wide range of extremes. This is an extension of the statistical principle of **regression towards the mean** - in other words, if a group of people is (on average) extreme in some way, it will contain some people who themselves are average. So if you sought a "minimum variation" sample by only trying to cover the types of people who you thought were average, you'd be likely to miss out on a number of different groups which might make up quite a high proportion of the population. But by seeking maximum variation, average people are automatically included.

When you are selecting a multi-stage sample (explained in more detail below) the first stage might be to draw a sample of districts in the whole country. If this number is less than about 30, it's likely that the sample will be unrepresentative in some ways. Two solutions to this are stratification (also explained below) and maximum-variation sampling. For both of these, some local knowledge is needed.

With maximum-variation sampling, you try to include all the extremes in the population. This method is normally used to choose no more than about 30 units. For example, in a small village, you might decide to interview 10 people. If this was a radio audience survey, you could ask to interview

- the oldest person in the village who listens to radio
- the oldest who does not listen to radio
- the youngest who listens to radio
- a man who listens to radio all day
- a woman who listens to radio all day
- somebody who has never listened to radio in his or her life
- the person with the most radios (a repairman, perhaps)
- the person with the biggest aerial
- a person who is thought to be completely average in all ways

...and so on. The principle is that if you deliberately try to interview a very different selection of people, their aggregate answers will be close to the average. The method sounds odd, but works well in places where a random sample cannot be drawn. And of course it only works when information about the different kinds of sample unit (e.g. a person) is widely known.

Map-based sampling

When you are planning a door to door survey, it is tempting to use a map as the basis for sampling. To get 100 starting points for clusters, all you need to do is throw 100 darts at the map.

This method, if properly done, gives every unit of area on the map an equal chance of being surveyed. This would be valid only if your unit of measurement was a unit of land area — for example, if you were estimating the distribution of a plant species. If you are surveying people or households, this equal-area method will over-represent farmers and people living on large properties. People living in high-density urban areas will be greatly under-represented. Even within a small urban area, large differences in density can exist.

Slightly better, but still badly flawed, is a method used in the 1980s by a Sydney research company. This was based on a street directory, and gave every street an equal chance of being surveyed. The trouble was that streets ranged in size from the Pacific Highway (with thousands of addresses on it) to cul-de-sacs with only one or two dwellings. This would not have caused a problem if there were no consistent difference between long streets and short streets. However, in Sydney, long streets tend to have many blocks of flats, while short streets tend to have single-unit houses. The people living on long streets tend to be poorer and more transient than others, and include fewer families with children.

Found samples

Perhaps you have a list of names and addresses of some of your audience, collected for a marketing purpose. This is known as a **found sample** or **convenience sample**. It's tempting to survey these people, because it seems so easy. But avoid it! You have no way of knowing how representative such a sample is. You can certainly get a result, but you won't know to what extent that result is true of people who were not included in the sample.

Snowball samples

If you're researching a rare population, sometimes the only feasible way to find its members is by asking others. First of all, you somehow have to find a few members of the population - by any method you can. That is the first round.

You now ask each of these first-round members if they know of any others. These names form the second round.

Then you go to each of those second-round people, and ask them for more names.

Keep repeating the process, for several more rounds. The important thing is knowing when to stop. For each round, keep a count of the number of names you get, and also the number of new names - people you haven't heard about before. Calculate the number of new names as a percentage of the total number of names. For example, if one round gives you 50 names, but 20 are for people who were mentioned in earlier rounds, the percentage of new names for that round is 60%. You'll probably find that the percentage of new names rises at first, then drops sharply. When you start hearing about the same people over and over again, it's time to stop - perhaps when the percentage of new names drops to around 10%. This is often at the fourth or fifth round.
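The stopping rule above is easy to track with a small calculation at the end of each round. A minimal Python sketch (the helper name and the example names are my own, chosen to match the figures in the text):

```python
def new_name_percentage(names_this_round, previously_seen):
    """Percentage of this round's names that have not appeared in any earlier round."""
    new = [n for n in names_this_round if n not in previously_seen]
    return 100 * len(new) / len(names_this_round)

# The example from the text: a round yields 50 names, 20 of them already known.
seen = {f"person{i}" for i in range(20)}         # names collected in earlier rounds
this_round = [f"person{i}" for i in range(50)]   # 50 names; person0..person19 repeat
print(new_name_percentage(this_round, seen))     # 60.0 - well above a ~10% stopping point
```

Running this after each round and stopping when the figure falls to around 10% automates the judgment described above.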

You now have something close to a list of the whole population (and many of them will know that you're planning some research). Using that list, you can now draw a systematic sample, as detailed in section 9 of this chapter.

Snowball sampling requires a lot of work when the population is large, because you need to draw up an almost-complete list of the population. So this method works best when a population is very small. But if the population is small enough to list every member without a huge amount of work, you could do a census, rather than a sample: in other words, contact all of them.

Snowball sampling works well when members of a population know the other members. For example, if you are studying people who speak a minority language, or who share some disability, there's a good chance that most of them know of each other. The biggest problem with snowball sampling is that isolated people, who are not known to other members of the population, will not be included in your study, because you'll never find out about them. In the case of minorities, sometimes the more successful members will blend into the ruling culture, feeling no need to communicate with other members of that minority. So if you survey only the ones who know each other, you may get a false impression. A partial solution to this is to begin with a telephone directory or other population list. If people in that population have some distinctive family names, you can find them in the directory, and contact those people for the first phase of the snowball.

Stratification

The simplest type of sampling involves drawing one sample from the whole survey area. If the coverage area of a radio station is a large town and its surrounding countryside, there may be a population list that covers the whole area - an electoral roll, perhaps. If you want to select (say) 40 random addresses as starting points for a door-to-door cluster survey, you could simply pick 40 addresses from the population list.

This is simple, but there's a slight danger that all 40 addresses may be in the same part of the coverage area. This happened to me once, when I planned a survey in Timaru in New Zealand. We drew 20 addresses as starting points, then plotted them on a map. By an unlucky fluke, they were all in one quarter of the town. I considered ignoring that sample, and selecting another 20 addresses. But what if the same imbalance occurred again?

The solution was to stratify the sample. Using census data from small areas, Timaru was divided into four quarters, with almost exactly equal populations. We then selected 5 addresses in each quarter. This way, we were certain that the clusters would be spread evenly across the town.

Stratification is easy to do, and you should use it whenever possible. But for it to be possible, you need to have (a) census data about smaller parts of the whole survey area, and (b) some way of selecting the sample within each small area. For example, if you were using a telephone directory as a sampling frame, each residential listing might show the suburb where that number was. (It doesn't matter if the person mentioned in the listing still lives there - you use a telephone directory as a list of addresses, not people.) In this case, you'd need census data on the number of households in each suburb, to be able to use stratification effectively.

The principle of stratification is simply that, if an area has X% of the population, it should also have X% of the interviews.

Here's an example of a stratified sample design for the Amhara Region of Ethiopia, from a survey I organized there in 2000. The Amhara Region is divided into 11 zones. For each zone, you find out the population, work out what percentage it is of the total, then make sure that the number of clusters is as close as possible to those proportions. Produce a table laid out like this. You begin with the numbers shown in bold type, and calculate the rest.

| Zone | Population (1994), '000s | % of total population | Clusters: exact | Clusters: rounded | % of clusters |
|---|---|---|---|---|---|
|  | **A** | **B** | **C** | **D** | **E** |
| North Gondar | **2 089** | 15.1 | 6.80 | 7 | 15.6 |
| South Gondar | **1 769** | 12.8 | 5.76 | 6 | 13.3 |
| North Wello | **1 260** | 9.1 | 4.10 | 4 | 8.9 |
| South Wello | **2 124** | 15.4 | 6.93 | 7 | 15.6 |
| North Shewa | **1 561** | 11.3 | 5.08 | 5 | 11.1 |
| East Gojam | **1 700** | 12.3 | 5.54 | 6 | 13.3 |
| West Gojam | **1 779** | 12.9 | 5.80 | 6 | 13.3 |
| Wag Himera | **276** | 2.0 | 0.90 | 1 | 2.2 |
| Agew Awi | **717** | 5.2 | 2.34 | 2 | 4.4 |
| Oromiya | **463** | 3.3 | 1.48 | 1 | 2.2 |
| Bahir Dar | **96** | 0.7 | 0.32 | 0 | 0 |
| Total | **13 835** | 100.0 | **45** | **45** | 100.0 |

Calculate the table as follows. For each zone:

Figure in column B = figure in column A **/** total of column A **x** 100.

E.g. column B for North Gondar = 2089 / 13835 x 100 = 15.1

Column C = B **/** 100 **x** the planned total number of clusters (here, 45)

Column D = column C, rounded to the nearest whole number

Column E = D **/** total of column D **x** 100

As North Gondar has 15.1% of the population, it should also have 15.1% of the clusters. But 15.1% of 45 clusters is 6.8, and you can't have 0.8 of a cluster. So the number of clusters in North Gondar is rounded up to 7.

This process is repeated for each of the other zones. Sometimes, because of the rounding, the total number of clusters comes out 1 more or less than the total you planned for. To fix this, you can change the final number of clusters, adding or subtracting 1. Another solution is to cheat a little, by rounding one zone's exact number of clusters in the wrong direction: in the above table you could round 1.48 up to 2, or 5.54 down to 5, with very little effect on the accuracy of the proportions. When you round a figure in the wrong direction, choose the figure/s whose decimal part is closest to .5

You could also add a column F: the difference between B and E. The maximum difference depends on the number of clusters, but should usually be less than 2%. If any difference is larger than 2%, you may need to have more clusters, with fewer interviews in each.
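Columns B-D can also be computed mechanically. A minimal Python sketch using the zone populations from the table, with plain nearest-integer rounding; note that, exactly as described above, the rounded total can come out one cluster over or under the planned 45, and then needs a manual adjustment:

```python
populations = {   # '000s, 1994 census figures from the table above
    "North Gondar": 2089, "South Gondar": 1769, "North Wello": 1260,
    "South Wello": 2124, "North Shewa": 1561, "East Gojam": 1700,
    "West Gojam": 1779, "Wag Himera": 276, "Agew Awi": 717,
    "Oromiya": 463, "Bahir Dar": 96,
}
planned = 45                              # planned total number of clusters
total_pop = sum(populations.values())

allocation = {}
for zone, pop in populations.items():
    exact = pop / total_pop * planned     # column C: exact share of clusters
    allocation[zone] = round(exact)       # column D: nearest whole number
print(allocation["North Gondar"], allocation["Bahir Dar"], sum(allocation.values()))
# 7 0 46 - one over the planned 45, so one zone must be rounded the other way
```

(Working from the unrounded proportions, Oromiya's exact figure is 1.51 rather than 1.48, so it rounds up to 2 and the total becomes 46; rounding it back down restores the planned 45.)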

In the above table, there's a problem with the Bahir Dar zone. The population there was only 96,000, so this zone needed 0.32 of a cluster. This is rounded down to 0, so there are no clusters in that zone, and therefore no interviews in that zone. However, this is a serious problem, because Bahir Dar is the main city of the Amhara province. What can be done about this?

There are three solutions:

(1) If an area would have no interviews, combine it with an adjoining area. As Bahir Dar is inside West Gojam, these two zones could be combined. The exact number of clusters would be 6.12 (5.80 + 0.32), which in this case still rounds down to 6. However, there's a chance that you *still* wouldn't get an address in Bahir Dar.

(2) Round the 0.32 upwards instead of downwards, and include one cluster in Bahir Dar. The total number of clusters would then be 46 - but people living in Bahir Dar would be over-represented in the survey: 2.2% of the clusters (1 in 46), but 0.7% of the population.

(3) Change the cluster size in Bahir Dar. Instead of using clusters of 8 households (as elsewhere), you could do a single cluster of 4 households. So now there would be 45.5 clusters, and Bahir Dar, with 0.7% of the population, would have 1.1% of the interviews. That's close enough. However, having different cluster sizes often confuses the interviewers, so you'd need two slightly different sets of interviewer instructions.

Multi-stage sampling

With door-to-door surveys, sampling is done in several steps. Often, the first step is stratification. For example, census data can be used to select which districts in the survey area will be included. In the second step, random sampling could be used, but each district might need to be treated separately, depending on the information available there. This would decide which households would be surveyed. The third step would involve sampling individuals within households, perhaps using quota sampling.

The concept of randomness

Before we discuss random sampling, you need to be clear about the exact meaning of "random." In common speech, it means "anything will do", but the meaning used in statistics is much more precise: a person is chosen at random from a population when every member of that population has the same chance of being sampled. If some people have a higher chance than others, the selection is not random. To maximize accuracy, surveys conducted on scientific principles always use random samples.

Imagine a complete list of the population, with one line for every member: for example, a list of 1500 members of an organization, numbered from 1 up to 1500. Suppose you want to survey 100 of them. To draw a simple random sample, choose 100 different random numbers, between 1 and 1500. Any member whose number is chosen will be surveyed. If the same number comes up twice, the second occurrence is ignored, as nobody will be surveyed more than once. So if the method for selecting random numbers can produce the same number twice, about 110 selections will need to be made to get 100 people.
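Drawing such a simple random sample is a one-line operation in most programming languages. A sketch in Python, where `random.sample` guarantees distinct selections, so no duplicate numbers need to be discarded:

```python
import random

members = list(range(1, 1501))        # the numbered membership list, 1..1500
chosen = random.sample(members, 100)  # 100 distinct members, all equally likely
print(len(chosen), len(set(chosen)))  # 100 100 - no member selected twice
```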

Another type of random sampling, called **systematic sampling**, is more commonly used. This ensures that no number will come up twice. No matter how many thousands of people you will interview, you need only one random number for systematic sampling.

In the above example, you are surveying 1 member in 15. Think of the population as divided into 100 groups, each with 15 people. You need to choose one person from each group, so you choose a random number between 1 and 15. Let's say this number is 7. You then choose the 7th person in each group. If the members were numbered 1-15 in the first group, 16-30 in the second, 31-45 in the third, and so on, you'd interview people with numbers 7, 22, and 37 - adding 15 each time. Exactly 100 members would be chosen for the survey, and their numbers would be evenly spread through the membership list.
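The same systematic selection can be sketched in Python, assuming the membership list is in hand; a single random start between 1 and 15 determines the whole sample:

```python
import random

population = list(range(1, 1501))    # membership numbers 1..1500
interval = 15                        # 1500 members / 100 interviews
start = random.randint(1, interval)  # the one random number needed, e.g. 7
sample = population[start - 1::interval]
print(len(sample))                   # 100, evenly spread through the list
```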

Sources of random numbers

The commonest source of random numbers in most countries is the serial numbers on banknotes. There can be no bias in using the last few digits of the first banknote you happen to pull out of your pocket, because there should be an equal chance of drawing each possible combination of final digits. Other sources of unpredictable large numbers (from which you can use the last few digits) include lottery results, public transport tickets, even stock market indexes.

You can also cheat. With systematic sampling, only one random number is needed. Just ask somebody to state a number, between 1 and the upper limit. Though statisticians would frown, it will be almost impossible for this to bias the results, if you are using a systematic sample.

Principles of random sampling

The essential principle of random sampling is that everybody in the population to be surveyed should have an equal chance of being questioned. If you do a survey, and everybody had an equal chance of inclusion, you're in a position to estimate the accuracy of your results.

Every survey has sampling variation. If you survey 100 people, and get a certain result, this result will be slightly different than if you had surveyed another group of 100 people. This is like tossing coins: if you toss a coin 100 times, you know that there should be 50 heads and 50 tails. But the chances are quite strong (92 in 100, to be exact) that you won't get exactly 50 heads and 50 tails. However, the chances of getting 0 heads and 100 tails are practically nonexistent.

Using statistical techniques, it's possible to work out the exact chances of every possible combination of heads and tails. For example, there are 680 chances in 1000 that you'll get between 45 and 55 heads in 100 throws. (If you doubt this, find 100 coins, throw them 1000 times, and see the result for yourself!)
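You don't actually need the 100 coins: the exact chances come from the binomial distribution. A sketch using Python's `math.comb`:

```python
from math import comb

def prob_heads(k, n=100):
    """Exact probability of k heads in n fair coin tosses."""
    return comb(n, k) / 2**n

print(round(1 - prob_heads(50), 2))  # 0.92 - chance of NOT getting exactly 50 heads
print(prob_heads(0))                 # about 8e-31: all tails is practically nonexistent
```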

In the same way, even though you know the results from a survey are not exactly accurate, they are probably pretty close — but only if every member of the surveyed population had an equal chance of being included in the survey.

To estimate how much sampling error there is likely to be in a survey result, use the following table. "Standard error" means (roughly) the typical difference between the figure obtained from the sample and the true population figure.

Table of standard errors

(n = sample size: the number of interviews)

| % of sample giving this answer | n = 100 | n = 200 | n = 400 | n = 800 |
|---|---|---|---|---|
| 5% or 95% | 2.2% | 1.6% | 1.1% | 0.8% |
| 10% or 90% | 3.0% | 2.1% | 1.5% | 1.1% |
| 15% or 85% | 3.6% | 2.5% | 1.8% | 1.3% |
| 20% or 80% | 4.0% | 2.8% | 2.0% | 1.4% |
| 30% or 70% | 4.6% | 3.3% | 2.3% | 1.6% |
| 40% or 60% | 4.9% | 3.5% | 2.4% | 1.7% |
| 50% | 5.0% | 3.5% | 2.5% | 1.8% |
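The figures in the table all come from the usual formula for the standard error of a percentage, SE = √(p × q / n). A sketch that reproduces the 20/80 row:

```python
from math import sqrt

def standard_error(p, n):
    """Standard error, in percentage points, for a result of p% from n interviews."""
    return sqrt(p * (100 - p) / n)

for n in (100, 200, 400, 800):
    print(f"n={n}: {standard_error(20, n):.1f}%")  # 4.0%, 2.8%, 2.0%, 1.4%
```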

When using the above table, think of each question as having two possible answers. Although a question may have more than two answers (e.g. age groups of under 25, 25 to 44, and 45 or over), the number can always be reduced to two, conceptually. For example, suppose 20% of a sample is in the 25 to 44 group. Therefore, the other 80% is in the "not 25 to 44" age group. The margin of error on this 20/80 split is 4%, so the true population figure is likely to be anywhere between 16% and 24%. There is one chance in three that it will be outside this range, and 1 chance in 20 that it will be outside twice this range: i.e. less than 12% or more than 28%.

If all that sounds too difficult, just assume that the margin of error is 5%, on any result. For example, if a survey finds that 25% of the population listen to your station, it's likely that the true figure will be somewhere between 20% and 30%. (Likely - but not certain - because there's a small chance that the true figure could be less than 20% or more than 30%. A well-known saying among statisticians is "statistics means never having to say you're certain.")

Always remember that the above table shows only sampling error, which is fairly predictable. There could also be other, unpredictable, sources of error.

Note in the above table that the margin of error for 400 interviews is always half that for 100. This means that to halve the error in a survey, you must quadruple the sample size. So unless you have a huge budget, you must learn to tolerate sampling error.

There are two main ways to choose a sample size: you can either calculate it from a formula, or use a rough "rule of thumb."

The formula for calculating the sample size needed for a survey question is:

n = p x q / SE²

where:

n is the sample size: the number of people interviewed.

p is the percentage answering Yes to the question.

q is the percentage not answering Yes to the question.

SE is the standard error as shown in the table above.

An example

You guess that maybe a quarter of all people listen to your station, so **p** is 25%, and **q** is 75%. You want the figure to be correct within 3%. If you do find a figure of 25% who listen, you want to make sure the true figure is between 22% and 28%. So to calculate the required sample size:

n = 25 x 75 / (3 x 3)

= 208
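The same calculation as a Python sketch (the function name is mine):

```python
def sample_size(p, se):
    """Sample size needed for an expected percentage p (0-100) and a
    tolerable standard error se, both in percentage points."""
    q = 100 - p
    return p * q / se ** 2

print(round(sample_size(25, 3)))  # 208: the example above
print(round(sample_size(25, 1)))  # 1875: tightening to 1% is expensive
```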

This formula (which I have over-simplified slightly) is useful in working out how big a sample you need for a given survey. But to calculate the sample size you first have to know roughly how many people will answer Yes to the question, and also decide how large a standard error you can tolerate. For beginners, this is not simple. Another problem is that samples calculated in this way can be horrifyingly large. For example, if you changed the tolerance from 3% to 1% in the above example, you'd have to interview 1875 people. Yet another problem is that every question in a survey may require a different sample size.

In an ideal world, you'd calculate the sample size for a survey as shown above, and cost would never be a problem. However, as most surveys are done to a budget, your starting point in practice may not be how much error you can tolerate, but rather how little error you can get for a given cost.

To do this, you need to divide the cost of the survey into two parts:

- a fixed part, whose cost is not proportional to sample size, and
- a variable part, for which the cost is so much per member of the sample.

Once you have allocated a proportion of the total budget to the fixed cost, and estimated the cost of getting back each completed questionnaire, you can calculate the affordable sample size.
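A minimal sketch of that budget arithmetic (the function name and all figures are hypothetical):

```python
def affordable_sample_size(total_budget, fixed_cost, cost_per_interview):
    """Number of completed interviews the variable part of the budget
    can pay for, after the fixed costs are set aside."""
    return int((total_budget - fixed_cost) / cost_per_interview)

# Hypothetical: $10,000 budget, $4,000 fixed costs, $12 per interview
print(affordable_sample_size(10_000, 4_000, 12))  # 500
```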

But what if you don't know the survey cost, and have to recommend a sample size? This is where the rule-of-thumb is useful.

For the majority of surveys, the sample size is between 200 and 2000. A sample below 200 is useful only if you have a very low budget, and little or no information on what proportion of the population engages in the activity of most interest to you — or if the entire population is not much larger than that. A sample size over 2000 is probably a waste of time and money, unless there are subgroups of the population that must be studied in detail.

If you don't vitally need such large numbers, and have more funds than you need, don't spend the surplus on increasing the sample size beyond the normal level. Instead, spend it on improving the quality of the work: more interviewer training, more detailed supervision, more verification, and more pre-testing. Better still, do two surveys: a small one first, to get some idea of the data, then a larger one. With the experience you gain on the first survey, the second one will be of higher quality.

The sample size also depends on how much you know about the subject in question. If you have no information at all on a subject, a sample of only 100 can be quite useful, though its standard error is large.

Rule of thumb

Are you confused about which sample size to choose? Try my rule of thumb:

| Condition | Recommended sample |
| --- | --- |
| No previous experience at doing surveys. No existing survey data. | 100 to 200 |
| Some previous experience, or some previous data. Want to divide sample into sets of 2 groups (e.g. young/old, male/female). | 200 to 400 |
| Have previous experience and previous data. Want to divide sample into sets of up to 4 groups. Want to compare with previous survey data. | 400 to 600 |

A common misconception

Consider this question: if a survey in a town with 10,000 people needs a sample of 400 for a given level of accuracy, what sample size would you need for the same level of accuracy in the whole country, with a population of 10,000,000? (That's 1000 times the population of the town.)

Did you guess 400,000? Most people do. The correct answer is 400.4 - you might as well call it 400.

The formula I gave above isn't quite complete. The full version has what's called the **finite population correction** (or FPC) added, so the full formula is:

n = p x q / SE² x (N-n)/N

where N is the population. Unless the sample size is more than about 5% of the population, the *(N-n)/N* bit (the FPC) makes almost no difference to the required sample size.
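Because n appears on both sides, the corrected formula is usually rearranged as n = n₀ / (1 + n₀/N), where n₀ is the uncorrected sample size. A Python sketch shows why the town and the country need almost the same sample (the figures are illustrative: a 20/80 question with a 2% standard error):

```python
def sample_size_fpc(p, se, N):
    """Sample size with the finite population correction (FPC) applied,
    for expected percentage p, standard error se, and population N."""
    n0 = p * (100 - p) / se ** 2   # uncorrected sample size
    return n0 / (1 + n0 / N)       # FPC matters only when n0 is a sizeable fraction of N

print(round(sample_size_fpc(20, 2, 10_000)))      # town of 10,000: 385
print(round(sample_size_fpc(20, 2, 10_000_000)))  # country of 10,000,000: 400
```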

Is that too technical? Think of it another way. Imagine that you have a bowl of soup. You don't know what flavour it is. So you stir the soup in the bowl, take a spoonful, and sip it. The bowl of soup is the population, and the spoonful is the sample. As long as the bowl is well-stirred (so that each spoonful is a random sample), the size of the bowl is irrelevant. If the bowl was twice the size, you wouldn't need to take two spoonfuls to assess the flavour: one spoonful would still be fine. This is equally true for human populations.

Though random sampling is the ideal, sometimes it's not possible. In some countries, census information is either not available, or so many years out of date that it's useless. Even when good census data exists, there may be no maps showing the boundaries of the areas to which the data applies. And even when there exist both good census data and related maps, there may be no sampling frames.

The good news (from a sampling point of view) is that these conditions usually apply in very poor and undeveloped countries with large rural populations. In my experience, there's not a wide range of variation in these populations. This is a difficult thing to prove, but I suspect that the more developed a country, the more differences there are between its citizens. All this is a way of saying that where random sampling is not possible, perhaps it's not so necessary.

The best solution I can think of is to use maximum-variation sampling, described briefly in section 3 of this chapter.

Maximum-variation samples are normally drawn in several stages, so they are multi-stage samples. The first stage is to decide which parts of the population area will be surveyed. For example, if a survey is to represent a whole province, and it's not feasible to survey every part of the province, you must decide which parts of the province will be included. Let's assume that these parts are called counties, and you will need to select some of these.

Maximum variation sampling works like this:

**Stage 1**

1. Think of all the ways in which the counties differ from the province as a whole - especially ways relevant to the subject of the survey. If the survey is about FM radio, and some areas are hilly, reception may be poorer there. If the survey is about malaria, and some counties have large swamps with a lot of mosquitoes, that will be a factor. If the survey will be related to wealth or education levels (as many surveys are), try to find out which counties have the richest and best-educated people, and which have the poorest and least-educated. Try to think of about 10 factors that are relevant to the survey.

2. Then try to gather objective data about these factors. Failing that, try to find experts on the topics, or people who have travelled around the whole province. Using this information, for each factor make a list of the counties which have a high level of the factor (e.g. lots of mountains, lots of swamps, or wealthy) and counties which have a low level (e.g. all flat, no swamps, or poor).

3. The counties mentioned most often in these lists of extremes should be included in the survey. Mark these counties on a map of the province. Has any large and well-populated area been omitted? If so, add another county that is as far as possible from all the others mentioned.
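The counting in step 3 can be sketched in Python; the factors and county names here are invented for illustration:

```python
from collections import Counter

# For each factor, the counties at the two extremes (step 2's lists)
extremes = {
    "terrain":   ["North", "East"],   # most mountainous / flattest
    "swamps":    ["South", "East"],   # most swampy / least swampy
    "wealth":    ["North", "West"],   # richest / poorest
    "education": ["North", "South"],  # best / least educated
}

# Count how often each county appears across all the extreme lists,
# then keep the most-mentioned ones for the survey.
mentions = Counter(c for counties in extremes.values() for c in counties)
chosen = [county for county, _ in mentions.most_common(3)]
print(chosen)  # "North" leads with three mentions
```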

**Stage 2**

When the counties (or whatever they are called) have been chosen, the next stage is to work out where in each county the cluster should be located. Continue the maximum-variation principle by applying the same approach within each county as in stage 1. If a county was chosen for its swampiness and flatness, choose the flattest and swampiest area in the county. If it was chosen for its mountains and wealth, choose a wealthy mountainous area.

To find out where these areas are, you will probably need to travel to that county and speak to local officials. Sometimes you then find that there are local population lists - such as lists of all houses in the area. In that case, you might be able to use random sampling for the final stage. If there are no population lists that you can use, the surveyed households will have to be chosen by block listing, aerial photographs, or radial sampling - see section 2 (below) for details of these methods.

Maximum variation sampling can produce samples that are as representative as random samples. The only problem is that you can never be sure of this.