Know Your Audience: chapter 3, part B
Writing questions

5. Question wording

Of all parts of survey research, it is the wording of questions that is least a science and most an art. Here are some principles for question wording, divided into two sections: What To Do, and What Not To Do.

What to do

1 Keep questions short and simple.

I suggest a 25-word limit for a survey question. In a spoken survey, this limit should include any multiple-choice answers that form part of the question. If you can’t fit a question into 25 words, try to split it into two smaller questions. Avoid long or difficult words. Bear in mind that the respondent has to remember the whole question to be able to answer it properly.

Even with mail and self-completion questionnaires, if a question is too long (more than about 2 lines, excluding multiple-choice answers), some respondents will not read it all thoroughly.
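
If your questionnaire is stored as a text file, this 25-word rule is easy to check mechanically. Here is a minimal sketch in Python; the whitespace split is only a rough word count, and the example questions are ones used later in this chapter:

# Flag survey questions that exceed the suggested 25-word limit.
# Splitting on whitespace is a rough but adequate word count here.
WORD_LIMIT = 25

questions = [
    "Have you listened to 5MG in the last week?",
    "How many radios are there at your home? Don't count any that "
    "aren't working, and don't count two-way or walkie-talkie radios, "
    "but do count car radios, and radios built in to other equipment "
    "like cassette players.",
]

for q in questions:
    words = len(q.split())
    if words > WORD_LIMIT:
        print(f"{words} words - consider splitting: {q[:40]}...")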

2 Always encourage multiple answers for questions beginning "Why"

People do things for many reasons. If you ask "Why did you watch that TV program?" different respondents will mention quite different kinds of reasons, and any one respondent may have several.

Any open-ended question that asks for reasons will probably produce almost as many reasons as there are respondents. Most people have several reasons for doing whatever they do. So any person could probably answer a question beginning "Why" with ten completely different answers, all of them true.

If a respondent is asked to give only one answer to a "Why" question, it will be the answer he or she thought of first. With such questions, the interviewer must try to get all the reasons that apply. After a respondent gives each reason, the interviewer should ask "Do you have any more reasons?" and allow the respondent a little time to think of more reasons.

3 Beware of the implied "always"

"Should police carry guns?" This question is ambiguous: does it mean "Should all police always carry guns?" or "Should some police sometimes carry guns?" — or something in between? Make it specific, so that everybody can answer the same question, not what they guess it might be asking.

4 Beware of implied regularity

"Do you ever listen to 5MG?" is not the same as "Have you ever listened to 5MG?" The first implies a regularity that may not exist for many people. The second version is more specific. Other suitable versions include "When did you last listen to 5MG?" (which may not produce a very accurate answer, if it was not recently), and "Have you listened to 5MG in the last week?" (Ask about the last week, and some people will answer for the last two weeks.)

5 Habits are not always the same as behaviour

"Do you listen to 5MG every day?" may be answered Yes by the same person who answers No to "Did you listen to 5MG yesterday?" Most people have a mental picture of their habits, which may differ quite sharply from their actual behaviour. They see themselves as often doing things that in practice they rarely do.

6 Ask precise questions

Avoid vague terms, and those that have different meanings to different people. Don’t ask "Are you a listener to 5MG?", or you’ll get answers like "Well, I listen now and then, but I’m not really a 5MG listener." Other common words to be wary of are "local" and "community": if these are used in a question, the exact geographical scope should be made clear.

Answers can’t be precise if questions are not precise. For example, a respondent can interpret "Do you have a TV set?" in several different ways: does a set that isn’t working count? A set owned by the landlord? A set kept at a holiday house?

So spell out exactly what you mean.

To get highly accurate answers to questions about behaviour, you need to specify as much detail as respondents can stand. Even a simple question on the number of radios in a household can produce answers that depend very much on exact wording. Compare these four sets of wording:

1 "How many radios are there in your household?"

2 "How many radios are owned by people in your household?"

3 "How many working radios are there in your household, including car radios?"

4 "How many radios are there at your home? Don’t count any that aren’t working, and don’t count two-way or walkie-talkie radios, but do count car radios, and radios built in to other equipment like cassette players."

Version 4 makes it clear exactly what is wanted, but is so detailed that some people wouldn’t listen to it all — and it still isn’t complete. To avoid the possibility of some radios being overlooked, the interviewer could add to version 4:

5 "Think of each room in the house in turn, and whether there are any radios in it. Make sure you don’t miss any that are moved around a lot. Now think of any radios you have outside the house, in vehicles or outbuildings. How many of those are there?"

Version 5 should produce a very accurate count, but could take several minutes. It’s also over the 25-word limit, and should be divided into several questions, asking about each room in turn. Many people would give the same answer to Version 5 as to Version 1, but others would report a much lower number of radios if asked only the first version of the question.

For most purposes, Version 3 will be fine, as long as you know that it produces a slight under-counting of radios.

7 When asking about radio, define "listening" explicitly

Be careful with any questions about listening to radio. Compared with most other types of behaviour, listening is less of an on/off activity. Partial listening is very common. Thus the exact meaning of "listening" needs to be defined within the question. Many diary surveys define listening as "being in the same room as a radio that is switched on." This is partly because it’s easier for respondents to remember being in a room than particular times when they might have been paying attention, but also because the audiences produced by this loose definition of "listening" are much larger - and larger audiences bring more advertising revenue.

In Australia, radio diary surveys, using the "in the same room" definition, show that the average person spends about 3 hours a day listening to radio. When a government time-use survey found that radio listening averaged only 10 minutes a day, the radio industry thought there was a huge mistake. It turned out that 10 minutes a day was the time people spend listening to radio and doing nothing else. When time spent listening to radio while doing something else was included, the average was in fact about 3 hours.

8 Always try to include points of comparison

To find out about your own station, you also need to ask about other stations. To measure a response to a program, that program must be compared with others. ("So 59% like the program? Is that high or low?") Therefore, try to build up a context for the main question; without comparisons, survey results have little meaning.

What to do: summary

The main principle of writing questionnaires is to try to see an organizational problem from a respondent’s point of view - to make a link between the world of the audience member and the world of the media publisher. If there seems to be any conflict here, remember that it’s the audience who will be answering the questions, so the audience’s view of the world should predominate in a questionnaire.

What not to do in questionnaires

1 Avoid questions beginning "Why don’t"

These are even more difficult than questions beginning "Why". Here it is important to distinguish between an internal question (what the organization wants to know) and a survey question (what each respondent is asked).

Your internal question may be "Why don’t more people listen to our marvellous programs?" If this is converted directly into a questionnaire question such as "Why don’t you listen to more programs on Radio Rhubarb?" many respondents won’t know how to answer. Some will make the first excuse that comes into their head. Some will say "I’m too busy".

But that’s not a useful answer. After all, everybody has 24 hours in their day; it is a matter of priorities. So you need to form some theories, to guess at reasons why more people don’t listen. With a little thought, you can probably break these reasons down into a set of logical possibilities. The question could be re-worded:

"Here are some reasons why people don’t listen to more programs on Radio Rhubarb. Please tell me which, if any, of these reasons apply to you."

[] Not knowing that Radio Rhubarb exists.
[] Not knowing how to find Radio Rhubarb on the dial.
[] Not having access to a radio.
[] Not liking the Radio Rhubarb programs you have heard.

... and so on. You could even distinguish between strong and weak reasons, instead of just ticking one box when a respondent says a reason applies. And in case you forgot to list some important reasons, the question could conclude with an open-ended section: "Are there any other reasons I haven’t mentioned? If so, what are they?"

2 Avoid industry jargon.

Don’t assume that respondents share your knowledge of your industry. For example, many radio terms are not well understood. These include "live" (as in live broadcast), "call sign", "regional", and "network". In countries where the 24-hour clock (e.g. showing 1pm as 1300) is not widely used, many people think 1700 hours is 7pm.

In 1996 I did a survey asking the general population what they thought "multimedia" meant. The correct answer would have included a reference to CD-ROMs, but the most common answer was "organizations that own several types of media, such as radio, TV, and newspapers."

3 Never ask two questions in one.

Combining two questions to save space or time will cause more problems than it solves. Whenever a question contains the words "and" or "or", examine it carefully to make sure it really is one question and not two. Sometimes it can be difficult to realize that what you see as one question can be interpreted by respondents as two questions. You’ll know that this has happened when you expect one answer, but get two.

Take this example from a questionnaire for tourists in the Blue Mountains, near Sydney:

Did you buy your food and drink from the hotel where you stayed?

[] Yes
[] No

People who bought drink but not food from the hotel may have answered Yes, on the ground that either food or drink would qualify. Some who had a drink in the bar may have answered No, on the ground that the question seemed to be asking about meals, rather than bar purchases. Also, the question does not make it clear whether it referred to all food and drink bought, or only some. Because of this confusion, the results would have been worthless.

4 Never use double negatives

For example, questions beginning "Don’t you think X should not ..." Many people will answer Yes when they should have said No, and vice versa. Double negatives are particularly bad when you are asking a group of questions using the same scale - e.g. agree/neutral/disagree. For example, you may present the statement "FM99 should not have commercials before 6am" and invite people to agree or disagree. Among those who disagree, you’ll probably find that some are disagreeing with the statement, while others are disagreeing with the idea of the pre-6am commercials.

5 Don’t expect memory feats

Memory feats include asking people exactly what they did a week ago. Sometimes you have no alternative - but don’t expect accurate answers. For many, memory has a telescoping effect, by which two months seems like one, a year ago seems like six months, and so on.

There are two problems with asking people to remember something from a long time ago: telescoping and forgetting. Some researchers have found that within a time range of 2 to 4 weeks, these two effects roughly cancel each other out - but don’t count on this.

6 Avoid questions beginning "If"

If you ask hypothetical questions - such as "What station would you listen to at 9 a.m. if Eugene Shurple was no longer on 5MG?" — you will get hypothetical replies, such as "It would depend who replaced him." Similarly, if you have a new program in mind, and describe it in a survey, and get a favourable response, don’t be too surprised if it turns out to be unpopular. See below, under Program Testing, for a better way.

7 Avoid tongue-twisters.

If interviewers will have to read the questions aloud, the questionnaire writer should also read each question aloud (quickly) before finalizing the wording.

8 Avoid ambiguity

Sometimes it’s hard to realize that a question you intend to have one meaning can be understood to have quite a different meaning. For example, an ABC survey a few years ago, at a time when industrial action had caused occasional news blackouts, asked "Which channel’s news do you have most confidence in?" Later, we realized that "confidence" is ambiguous — we’d taken it to mean credibility, but some respondents assumed it referred to regularity.

In fact, "confidence" almost comes under the heading of Vague Term, To Be Avoided. A better wording would have been "If you saw differing reports of the same event, in news bulletins on channels 2, 7, 9, and 10, which channel would you believe most?"

9 Avoid leading questions

Leading questions are those that make it clear by their wording that one answer is preferred. An example appeared in an advertisement in major Australian newspapers, opposing the UN convention on eliminating discrimination against women. One question in this pseudo-questionnaire was "Do you want Soviet-style laws on women’s rights imposed on Australia?" Who would dare answer Yes to a question including words like "Soviet-style" and "imposed"?

This is an obvious example of a leading question, but others are more subtle. In fact, there’s no clear boundary between leading and informing. Two 1984 surveys asked about uranium mining. One asked "Should uranium be mined, or left in the ground?" and the other asked "Are you in favour of the mining of uranium for peaceful purposes?"

The second version found a much higher percentage of people in favour of uranium mining, presumably due to the inclusion of "for peaceful purposes?" (And of course, it was sheer coincidence that the second version was sponsored by a pro-mining group.)

So whenever a question includes an explanatory statement that is not strictly part of the question, it’s probable that any information presented in the statement will affect the answers. Try to avoid such statements. By telling the survey sample something that the general public may not know, the answers from the sample may no longer represent the opinions of the whole public.

10 Avoid easy escapes

By an "easy escape" I mean an alternative answer seemingly so obvious that many respondents will accept it without thinking. This applies mainly to multiple choice questions. If you ask a difficult question, which requires some thought, and offer an alternative that seems to cover all the others, many people will choose that one.

For example: "Do you think there should be more of X than there is now, or less, or is the present amount about right?" No matter what X is, you can be sure that approximately 50% of respondents will say the present amount is about right. And if suddenly there is more of X, and a second survey is done: surprise, surprise! You will still find 50% saying the present amount is "about right." Changing "about right" to "exactly right" can partly solve this problem.

Try not to offer people a choice of answers which includes "it depends," or words to that effect. This is such an easy choice that many respondents will choose that answer, without considering the other possibilities.

11 Avoid ranking

Sometimes a questionnaire will include a question like this:

"Please indicate how much you like these programs by writing 1 beside the program you like most, 2 beside the program you like next most, and so on, down to 8.

__ State and Laws
__ Cheo
__ The Eugene Shurple Show
__ Famous Moments in Farming
__ Weekly Drainage Review
__ News
__ Weather report
__ Singing Talkback

Respondents hate this type of question. They keep changing their minds, they can’t decide, and they become very frustrated. In spoken questionnaires, they ask the interviewer for help. In written questionnaires, answers to this type of question are often unreadable, because of all the crossings-out. Even people who can rank the first few and the last few items often don’t have any opinion about the ones in the middle.

Equally bad is when respondents are given say 100 points, and asked to spread these points between programs (or other items) depending on how much they favour each. Many respondents can’t add up!

A simpler solution is to use a multiple-answer format, limiting the number of answers. For example:

"I’ll read out the names of 8 programs. Please tell me up to 3 that you like the most."

[] State and Laws
[] Cheo
etc.

The results are much the same as for ranking, but this method is easier for everybody: interviewers, respondents, and survey analysts.

12 Avoid asking about "today"

Answers will depend on the time of the interview. Somebody interviewed at 9pm, and asked "What day did you last watch TV?" is more likely to answer "today" than somebody interviewed in the morning. In such situations, it is better to ask about "today or yesterday."

Sensitivity of question types to exact wording

Questions can be of two broad types:

(1) Questions about behaviour or observable facts.

(2) Questions about attitudes and thoughts.

Questions of the first type, such as "What age are you?" or "Which radio stations have you listened to in the last seven days?" are not much affected by changes of wording. Thus "What is your age in years?" and "What age are you?" and "How old are you at the moment?" would produce almost identical answers.

But for questions about attitudes, a seemingly insignificant change in wording can cause a very large difference in answers — as in the uranium mining example a few pages back. The more complex the concept, the more the responses will vary due to minor changes in wording. However, for simple concepts (e.g. radio and TV programs) and simple attitudes (e.g. approval) the wording used doesn’t affect responses much.

In the 1980s, I managed a weekly TV appreciation survey in South Australia. For several years we had people rate each program on a 5-point scale:

1 = like it very much indeed
2 = like it a lot
3 = neither like nor dislike it
4 = dislike it
5 = hate it

I set up a series of tests, using different sets of wording for half the questionnaires distributed each week. Two new scales were created: one based on perceived quality, one on behaviour.

Perceived quality

1 = Excellent
2 = Very good
3 = Good
4 = Fair
5 = Poor

Behaviour

1 = I try never to miss this program
2 = I usually try to see this program
3 = I don’t care if I see it or not
4 = I avoid this program
5 = I switch off when this program comes on.

The three scales - of liking (the original scale), of perceived quality, and of behaviour - produced almost identical results, across about 70 programs. Correlations between the scales were very high indeed: around 0.85.
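
If each program’s mean score on each scale is recorded, the comparison is easy to reproduce. A minimal sketch in Python 3.10 or later (the mean scores below are invented for illustration, not our survey results):

import statistics

# Mean rating per program on each wording of the 5-point scale.
# Five hypothetical programs; lower numbers mean better ratings.
liking    = [1.8, 2.4, 3.1, 2.0, 4.2]
quality   = [1.9, 2.5, 3.0, 2.2, 4.0]
behaviour = [2.0, 2.6, 3.2, 2.1, 4.1]

# statistics.correlation computes Pearson's r (Python 3.10+).
print(statistics.correlation(liking, quality))
print(statistics.correlation(liking, behaviour))
print(statistics.correlation(quality, behaviour))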

When a 5-point scale is offered, respondents seem to forget about the actual wording, after answering the first few questions in the set.

The simplest way to ask questions about attitudes is probably to phrase the questions as statements, and ask respondents to react to the statement, using this 5-point Likert scale:

1 = strongly agree
2 = mildly agree
3 = neither agree nor disagree
4 = mildly disagree
5 = strongly disagree.

This scale has been widely used, and found to work well in many languages.

A danger in asking questions about attitudes is that people who have no attitude on an issue sometimes feel pressure to state one, simply because they are being asked. As surveys are meant to produce results that are representative of the whole population, if attitudes are created in respondents, they will no longer represent the population. Therefore, interviewers should make it clear to respondents that not having an opinion on an issue is a highly acceptable answer.

I also suggest a liberal use of open-ended questions and opportunities for comment, when opinions are being sought. When specific statements are made, and respondents are asked to agree or disagree, there’s a danger that these statements may not reflect the exact opinion that anybody holds. Inviting respondents to give their opinions on related issues helps to assess whether this is a real problem.

6. Sets of questions

Some types of question are usually found in linked sets. Here are a few sets commonly used in audience research: item lists, diaries, and program testing.

6.1 Item lists

An item list is a set of questions (you could call it a "multiple-question question") with a single introduction, then a long list of items. Respondents are asked to answer each item on the same scale. Here’s an example, from a spoken questionnaire.

"Now I’ll read out a list of programs. For each program, please tell me if you like it a lot, or like it a little, or don’t like it at all, or haven’t heard it."

Like a lot = 1 / Like a little = 2 / Don’t like = 3 / Haven’t heard = 4

"First, Ockham’s Razor. Do you like it a lot, or like it a little, or don’t you like it at all, or haven’t you heard it? __

"Next, Cheo. How much do you like that, or haven’t you heard it?" __

"Now, Farming Arts" ___

Note how, in a spoken interview, the full wording is given the first few times, then gradually abbreviated, as the respondent begins to remember the possible categories. If the respondent hesitates at a later question, the interviewer repeats the full wording.

Here’s the same question, as it might appear on a written questionnaire.

Please give your opinion of each program, by ticking one box in each line:

                  Like a lot   Like a little   Don’t like   Haven’t heard
Ockham’s Razor       [ ]            [ ]            [ ]            [ ]
Cheo                 [ ]            [ ]            [ ]            [ ]
Farming Arts         [ ]            [ ]            [ ]            [ ]

This matrix layout is successful only if you give very clear instructions, such as the above: "...ticking one box in each line." Even so, some respondents will ignore the instructions, and tick no boxes on some lines.

Here’s an alternative item-list layout for a written questionnaire. Don’t make the boxes too small, or you won’t be able to read the numbers that some respondents write in. Allow at least 6mm (18 points) vertical spacing between lines. This method also has the advantage of faster and more accurate computer data entry than the above layout.

Please give your opinion of each program, by writing one of these numbers in the box on the left:

1 if you like it a lot
2 if you like it a little
3 if you don’t like it
4 if you haven’t heard it

[ ] Ockham’s Razor
[ ] Cheo
[ ] Farming Arts

In a spoken questionnaire, you can easily have up to 20 or 30 items in a list of this type. If you go too much beyond this, and the topics aren’t of interest to many people, some respondents become bored, and pay less attention to giving accurate answers.

In a written questionnaire, you can include many more items. 70 is no problem, and I’ve used more than 120 items (though with no other questions on the questionnaire). With large numbers of items, it helps to divide them into groups. If the items were programs, for example you could group them by day of the week, saying something like "Now I’ll ask you about some programs that are broadcast on Tuesdays".

6.2 Diaries

One of the first things that radio and television stations like to know about their audiences is how the audience size varies by time of day across the week. They like to see a table showing the estimated audience for each station, for every quarter-hour in a typical week.

In most advanced countries (and some developing countries) television audiences are measured by meters attached to TV sets, but this is not feasible with radios, which are much more portable. Thus the normal way of collecting information about audiences at different times is to use a diary. Respondents can either fill in the diary themselves, or an interviewer can question them, and fill it in for them.

Whichever method is chosen, the respondents must remember which stations they have listened to, at which times. Therefore, respondents must be aware of the time of day. In societies where few people have clocks or watches, diary surveys need to ask "which programs have you listened to?" instead of "at what time did you listen?" In societies where most people are conscious of the time of day, a diary survey will produce more accurate results.

In countries with a high literacy rate, the easiest way to do a diary survey is to distribute blank diaries to households, leave them for respondents to fill in (normally for a week), and return to collect the completed questionnaires. There can be either one diary per household, or one for each person in the household. If most households have only one TV set or one radio, one diary is left for the whole household. In countries where most households have more than one TV set or radio, one diary is left for each person — except young children.

It is possible to collect details of radio listening and television viewing using the same diary, but we have found that in such cases, 20% to 50% less radio listening is recorded than for diaries which ask only about radio. It seems that some people forget about their radio listening when asked about their television viewing.

As well as asking about the times when people listen to radio and watch television, diaries can also record the times when respondents are asleep, and where they are at each time (at home, at work, elsewhere, or travelling).

In an area with no more than about 10 stations, a diary is usually laid out in the form of a table. There is one line for each time period, and one column for each station. The time-periods are normally quarter-hours, but can also be half-hours or hours. There is one chart for each day of the week. If the time-period is a quarter-hour, each day’s chart will probably take up two pages. If the period is half an hour or an hour, the daily chart will fill one page.

To show which stations they were listening to (or watching), the respondent simply puts a checkmark in the box for that station and that time period, as in this example:

RADIO LISTENING DIARY FOR TUESDAY

Time        Radio 1   Radio 2   Radio 3   FM99   Short-wave
0600-0630     [x]       [ ]       [ ]      [ ]      [ ]
0630-0700     [x]       [ ]       [ ]      [ ]      [ ]
0700-0730     [x]       [ ]       [ ]      [ ]      [ ]
0730-0800     [ ]       [ ]       [ ]      [ ]      [ ]
0800-0830     [ ]       [ ]       [ ]      [x]      [ ]

To help respondents understand how to fill in the diaries, the interviewer probably needs to demonstrate to at least one person in each household how the diary should be filled in. Respondents should be shown how to answer if they listen for only part of a time period: usually we tell them to tick the box if they spend at least half the time listening, i.e. 15 minutes or more in a half-hour time period.

It is important to tell respondents to enter their actual listening in the diary. It can be tempting to collect a whole week’s information in one interview, by asking people what stations they usually listen to, at which times. We have found that this "usual listening" approach produces wrong information, because most people believe their habits are more regular than they really are.

The above example is a diary for a single respondent. In TV surveys, sometimes there is one diary for each TV set. It is kept near the TV set, and filled in whenever somebody is watching.
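
Once completed diaries are keyed, the station-by-time-period table that stations want (described at the start of this section) falls out of a simple tally. A minimal sketch, with invented records, one per ticked box:

from collections import defaultdict

# One record per ticked diary box: (respondent, day, time slot, station).
# The records and sample size below are invented for illustration.
records = [
    (1, "Tue", "0600-0630", "Radio 1"),
    (1, "Tue", "0800-0830", "FM99"),
    (2, "Tue", "0600-0630", "Radio 1"),
    (2, "Tue", "0630-0700", "FM99"),
]
SAMPLE_SIZE = 2  # respondents who returned usable diaries

listeners = defaultdict(set)  # (day, slot, station) -> respondents
for respondent, day, slot, station in records:
    listeners[(day, slot, station)].add(respondent)

for (day, slot, station), resps in sorted(listeners.items()):
    share = 100 * len(resps) / SAMPLE_SIZE
    print(f"{day} {slot} {station}: {share:.0f}% of sample listening")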

If program schedules are available early enough, a diary that prints the names of the programs as well as the times can produce more accurate results. However, this can be a lot of work to organize, and if the programs are not broadcast at the scheduled times (or some are changed after the diaries have been printed) the results can be less accurate, not more so. If the program names do not match the times, most people will tick the programs, not the actual times.

If the survey is being done by telephone, or the literacy level is low, it is best to have interviewers fill in the diaries. With this method, there will be fewer problems interpreting the completed diaries, but some information will be lost. We have found that most people cannot remember their behaviour to the nearest quarter-hour more than two days back. Therefore, when interviewers fill in the diaries in our Australian telephone surveys, they ask only about "today" and "yesterday". And of course the information for "today" can only extend to the time of interview - which is why we do these interviews in the evening.

When collecting diary data in an interview, the questions go like this:

"Please think back to yesterday morning. What time did you wake up? ...
"And before [that time] but after midnight, did you listen to the radio at all? ...

[If so] What station did you listen to? ...
When did you start listening to it? ...
When did you stop listening to it? ...
Going back to [time when woke up], did you listen to the radio at all in the next few hours? ...

[If so] What station did you listen to? ...
When did you start listening to it? ...
When did you stop listening to it? ...
And after [that time] did you listen to radio again yesterday? ...

[If so] What station did you listen to? ...
When did you start listening to it? ...
When did you stop listening to it? ... "

The interviewer helps the respondent to remember what he or she did "yesterday", dividing the day into time zones of about 3 hours. For each time zone, the sequence of questions is the same as above: whether the respondent listened to radio at all, which station, when they started listening, and when they stopped.

While they gather this information, the interviewers fill in the diaries.

For this method to work well, the interviewers must be thoroughly trained in the sequence of questions. We have found that accuracy is improved if interviewers also ask respondents where they were at each time "today" and "yesterday": at home, at work, or somewhere else. If respondents think back, mentally retracing their movements, they often remember radio listening which they would otherwise have forgotten.

6.3 Program testing

A common purpose for audience research is to find out how to improve a radio or TV program, by interviewing people who listen to or watch the station. Stations with small audiences usually have less money than stations with large audiences, so cannot afford surveys with large samples. Yet the smaller the audience a station has, the more expensive it is to survey.

If you want to interview 100 listeners to a station, and everybody listens to that station, only 100 people need be contacted to get 100 interviews. But when only 10% of people listen to a station, and 100 listeners are to be interviewed, 1000 people must be contacted.
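
The arithmetic generalizes: contacts needed = interviews wanted / proportion who listen. As a sketch, using the figures above:

import math

# Contacts needed to achieve a target number of listener interviews,
# given the proportion of the population who listen to the station.
def contacts_needed(target_interviews, listening_rate):
    return math.ceil(target_interviews / listening_rate)

print(contacts_needed(100, 1.00))  # everybody listens -> 100 contacts
print(contacts_needed(100, 0.10))  # 10% listen -> 1000 contacts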

Stations with small audiences can partly solve this problem by using a program-testing approach to research. This approach is ideally suited to telephone surveys, but is also feasible for other spoken surveys. You can record brief extracts from a program, and have the interviewer play these extracts back to respondents. Then everybody can give an opinion.

There are some objections to such an approach. First, can a brief extract give the flavour of a program to a non-listener? How would the extract be chosen? Is it relevant to get an opinion from people who are never going to listen to the station?

These are valid objections, though they can be countered. Some programs are more suitable for extracts than are others, but usually an extract can be chosen in such a way as to illustrate several points of interest about the program. As for getting opinions from non-listeners to the station, this group can if necessary be filtered out, and not asked these questions.

These extracts should be short. If an extract lasts longer than a minute or so, it becomes harder for the respondent to offer an opinion, as there are so many aspects about which an opinion might be given. Respondents should be told, before the extract, roughly how long it will last.

To avoid relying too much on memory, it is good to present extracts in pairs, like this:

"Now I’d like to play you two extracts from radio programs. I’d like your opinions of them. Here’s the first now. It will take about half a minute."

[Play extract A first if respondent number is odd, B first if even.]

After playing the extracts, the interviewer can ask which was preferred, and the reasons for preferring that extract, as well as evaluations of the components of each extract — e.g. the music, the announcer’s voice, and so on.
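
The odd/even instruction above is a simple form of counterbalancing: alternating the order cancels out any advantage an extract gains from being heard first. As a sketch (the respondent numbers are invented):

# Alternate the order of paired extracts by respondent number,
# so neither extract A nor extract B is always heard first.
def play_order(respondent_number):
    return ("A", "B") if respondent_number % 2 == 1 else ("B", "A")

for n in (101, 102, 103):  # invented respondent numbers
    print(n, play_order(n))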

This technique is especially useful for testing new programs. Rather than ask a hypothetical question along the lines of "Would you listen to a program that had this, this, and that?", the respondent can be asked to listen to one or more extracts, and to compare these with extracts from either an existing program or another candidate for a new program.