Audience Dialogue

Know Your Audience: chapter 15, part B
Measuring internet audiences

2. Uses of internet research (continued)

As with other types of audience research, the research questions about a web site will be suggested by the site's goals. Common site goals include

  1. Maximizing the use of the web site
  2. Saving money
  3. Making money
  4. Appearing up-to-date by having a web site
  5. Creating a favourable attitude to your organization
  6. Helping your audience communicate with you.

Some of these goals are easier to evaluate than others. To judge whether use of the site is being maximized - whether use is defined as hits or as visitors - the log files are the main tool.

To judge whether a web site is saving money or making money is not a question for researchers, but for accountants.

As for appearing up to date by having a web site: if the site is there, the goal is achieved.

Goal 5 calls for a comparison between people who have visited the web site and those who haven't, so it can't be evaluated on the web itself; other research methods are needed.

That leaves goal 6: helping with communication. This includes the study of site usability and effectiveness. So even though it's only one goal of six, it's a very broad area, and a lot of work can be done. Many web sites are just plain awful - seemingly designed by mad programmers or graphic artists to impress corporate clients (never mind the poor users). I'm talking about pages that

... and the list goes on and on. There's plenty to do. But not all of it can be done using web-based questionnaires.

Usability and effectiveness testing

This is a sub-set (but a big one) of Goal 6 above: improving communication on web sites. It's more a cognitive task, and is usually done one respondent at a time, in a laboratory setting. Typically, a small sample is gathered of people in the web site's target audience. The sample size is often as small as 10: many web sites are so bad that even such a tiny sample is enough to demonstrate their problems. Respondents are usually recruited through an advertisement placed on the web site being tested, along the lines of: "If you are interested in taking part in research about this site, and you live in Zanzibar, please click here to contact us."

The respondents, once recruited, sit at a specially prepared computer, in a laboratory-like room with the interviewer/tester nearby. The respondents are given tasks to do, such as finding a particular piece of information on the web site of interest. They are asked to think aloud while doing this, explaining their actions and expressing their feelings. It's also useful to find out any assumptions they are making, though these are more difficult to discover. Everything they say is recorded on tape - either audio or video. In the most elaborate laboratories, multiple video cameras are used: one focused on the screen, one on the respondent's face, and one on his or her hands.

Rules are laid down to determine what happens when a respondent gets stuck, and can go no further. The tester shouldn't help too soon, but if the tester doesn't help at all, the respondent may walk out in disgust. Usability testing is often a very stressful experience, causing strong emotions. Respondents feel angry with themselves (or the site) when they can't find what should be an obvious piece of information.

Special software is used to monitor the actions taken by the respondent, including every key pressed. This can be used to replay the computer session, while the tape with comments is being replayed. Putting all this information together, the researchers determine the problem points in the site, and how to solve them. The most common type of problem, particularly in large sites (over 100 pages) is not being able to find particular pieces of information.

What this type of laboratory testing doesn't usually find is the wide variety of problems users have at home, using their own computers and modems: pages not displaying properly, slow loading times, and the like.

Another weakness of this type of usability testing (though not an inherent one) is its focus on a single site. Even if there's nothing actually wrong with a web site, users may prefer a competing site, because they find it easier to find information on, know it better, or simply prefer its style. Therefore any comprehensive study of a web site should include a comparison - from a user's point of view - with its competitors. And who those competitors are should be decided by the users, not the client. As made clear by the "uses and gratifications" school of audience research, the motivations of a visitor to a web site may be completely different to those the site owner had in mind.

A simple example of online usability testing is monitoring users' success in finding what they're looking for on a site. If your site has a search engine, with a space where users can type in whatever they're looking for, you can have a short questionnaire on the page that appears with the search results. It simply asks:

Did you find what you were looking for?
[] Yes [] No
Comment _________________________

In the Comment space, the unsuccessful searchers often type what they were looking for - and sometimes rude words about the inability of the search engine to find something that they know is on the site. If the words they searched for are recorded, as well as the answers to the above question, you can collect information on unsuccessful searches, find out which are the commonest, and change the site or the searching software, so that searches gradually become more successful.
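To show how little is involved, here's a rough sketch in Python of the tallying step. It assumes - purely for illustration - that each search has been logged as one line in a tab-separated file: the search terms, the yes/no answer, then the comment.

from collections import Counter

# Tally the commonest unsuccessful searches from a tab-separated log.
# Assumed (hypothetical) line format: search terms <TAB> yes/no <TAB> comment
failed = Counter()
with open("search_log.txt", encoding="utf-8") as log:
    for line in log:
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 2:
            continue                   # skip malformed lines
        terms = parts[0].strip().lower()
        found = parts[1].strip().lower()
        if found == "no":
            failed[terms] += 1

# The ten searches that most often ended in failure
for terms, count in failed.most_common(10):
    print(f"{count:5d}  {terms}")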

3. Internet research methods

There are three main aspects of internet research: audience measurement (which can often be done by observation, without asking respondents), quantitative research (e.g. surveys), and qualitative research (largely open-ended).

Audience measurement

The most basic measure of audience size on the worldwide web is the number of people who access each web page. There are several ways of estimating this, but all have accuracy problems. The main methods are log file analysis, panels, and analysis of ISP caches.

Log file analysis

When you visit a web site, its computer (known as a server) and your computer exchange details. The web site computer receives a request: "the computer with IP number 123.245.101.242 wants you to send the file called index.html"

Every time a server receives a request for a file, the details are noted in a log file. The requesting computer, as well as supplying its IP number, can also send other details, such as the type and version of the browser being used, the operating system, and the referring page (the page whose link was clicked to make the request).

For each request, the log file also records the date and time, the name of the file requested, and whether or not the file was sent successfully.

If an HTML file has a number of associated images, these are sent with the main file. Every time a file is sent, this is called a hit. So if one web page has 20 images, this will be 21 hits. It doesn't mean that 21 people downloaded the page: only that 21 files were sent.

Every time a main page file (usually ending with .htm or .html) is sent, this is called an impact, or page view, or page impression. When comparing the popularity of web pages, the number of page views makes a lot more sense than the number of hits.
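As a concrete illustration, here's a minimal Python sketch that counts hits and page views from a server log. It assumes the common log format, in which the requested file name is the seventh space-separated field; real log analysis software copes with many formats and oddities that this sketch ignores.

# Count hits and page views in a server log. Assumed common log format, e.g.
# 123.245.101.242 - - [01/Mar/2003:11:03:00 +0000] "GET /index.html HTTP/1.0" 200 4123
hits = 0
page_views = 0
with open("access.log", encoding="utf-8") as log:
    for line in log:
        fields = line.split()
        if len(fields) < 7:
            continue
        path = fields[6]               # the file that was requested
        hits += 1                      # every file sent is a hit
        if path.endswith((".htm", ".html")) or path.endswith("/"):
            page_views += 1            # main page files only

print(f"{hits} hits, {page_views} page views")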

Software is available for web servers which summarizes the number of hits and page views for every page on a site. This software can tell you, for example, which pages are most and least visited, at what times and on what days the site is busiest, which sites visitors arrived from, and which browsers they were using.

Examples of server log analysis software are Faststats and Webtrends. There are many others, too.

Server log analysis software can tell you something about the use of your web site. The main thing it can't tell you (unless you use cookies) is how many different people have accessed each page. You can't assume that a particular IP number means a particular computer, because many users are connected to the internet through ISPs. An ISP might have a thousand subscribers, but only a hundred modems. A user dialling the ISP is allocated the first unused modem - and that modem's IP number is the one the server sees. So there's no way of knowing whether modem number 64 is the same person that was using that number five minutes earlier.

The solution is to use cookies: small files sent to a user's computer by a server.

When a user who has visited a site before returns to it, the server can check whether the user's computer has a cookie for that site. If so, the server knows it's a previous user returning. If not, maybe that user hasn't visited the site before - or maybe the user has their browser set to reject cookies.

But even when a cookie is received, the server can't be sure that it's the same person. It's the computer that receives the cookie, not the user - and many computers have more than one user.

If, every time a user connected to a web page, the server found out, and registered another page impression, this would produce an accurate count of the users of that page. But in fact, a server never finds out about many visitors to its pages: this is because of caching.

Caching

So what is caching? (It's pronounced "cashing", by the way.) To understand this, you need to know the path a web page (i.e. the file containing that page) follows between the server and a user. Most users are connected through ISPs, and ISPs don't want to spend more money than necessary on connecting to a distant server if another of their users has recently called up that page. So most ISPs use proxy caching. Every page requested by one of their users is stored on their hard disks, in case another of their users asks for it. The second user will then receive the page more quickly, and the ISP won't have to pay the cost of retrieving that page from a distant server. The dialogue between computers goes like this:

User's computer to ISP: "Please send page X."
ISP's computer to its own hard disk:
"Do you have a copy of page X?
If so, what's its date and time?"
If the ISP's hard drive doesn't have the page,
the ISP's computer asks the server to send it.
The server sends it, recording this on its log.
If the hard drive does have the page,
the ISP's computer asks the server:
"Do you have a more recent version of page X
than the one we have here?"
If the answer is yes, the more recent page is sent to
the ISP, who forwards it to the user.
If there's no more recent version, the user gets
the ISP's stored version.
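That dialogue is essentially HTTP's conditional request mechanism. Here's a minimal Python sketch of the final step - a cache asking the server whether a newer version exists - with the URL and date as placeholders.

import urllib.request, urllib.error

def revalidate(url, cached_last_modified):
    # Ask the server: "Do you have a more recent version of this page
    # than the one we have here?" (an If-Modified-Since request)
    req = urllib.request.Request(url)
    req.add_header("If-Modified-Since", cached_last_modified)
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.read()        # newer version: use (and re-cache) it
    except urllib.error.HTTPError as err:
        if err.code == 304:           # "304 Not Modified": the stored copy is current
            return None
        raise

page = revalidate("http://www.example.com/",
                  "Sat, 01 Mar 2003 11:03:00 GMT")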

Local caching

There's another complication still. Not only ISPs have caches: so do users - even if they don't know it.

Most browsers (e.g. Netscape and Internet Explorer, which between them account for over 90% of Web usage) store on each user's local disk every web page which the user accesses during a session (each time the browser is started up). As long as the user's browser cache hasn't run out of disk space, the ISP won't be asked for that page; it will be looked up on the user's hard disk.

Users can choose how they want this to be done. They have three choices -

1. not caching: looking up the page on the server every time, or

2. caching on a per-session basis: the page will be found on their hard disk as long as they haven't exited from their browser since the page was last accessed, or

3. always caching: recovering the page from their hard disk, even if they originally accessed it years ago.

However, some pages refuse to be cached. When the HTML code for a page contains a "no cache" instruction, no browser (as far as I know) will recover it from the user's hard disk; every browser will go back to the original server every time. This is important for web-based news services - if a news story on a web page is updated, the user will probably prefer to see the latest version.

Because of all this caching, the number of requests a server receives for a web page will be an underestimate - unless the page has a "no cache" instruction. When the number of accesses from cache is taken into account, the true number of accesses to a page can often be several times as much as the server log shows. The more a page is accessed, the more likely it is that it will be cached by ISPs.

One solution to the measurement problem caused by caching is to record on the users' own computers all the pages accessed. Using special software, this can be done automatically - but a panel of users must be found who are willing to accept such software, and form a representative sample of all users.

Panels with special software

A research company finds a few thousand people who are willing to co-operate in regular research. Panel members download special software to their computers. This software collects details of all the web sites visited by that computer, in the same way that a server log does: the URL (web address) visited, the time and date of access, and perhaps other data collected from the user. Every time a user connects to the internet, the special software contacts the research company, and sends details of all sites visited the previous time that user was connected.

In theory, this method should be perfect - as long as the sample of users is fully representative. This is probably not difficult to achieve, for people who use the internet from home. But a lot of Web use is done from workplaces. Large organizations often have high-speed lines, and during business hours, a high proportion of Web access comes through such organizations. But these organizations generally do not agree to allow their computers to download this panel software.

So it seems that the panel method greatly underestimates the web sites used by people at work (such as news sites) and overestimates sites used at home (e.g. entertainment).

The special software could also measure activities that a log file can't capture, such as pages retrieved from a cache, use of the browser's Back button, and how long each page stayed on the screen.

I don't know of any such software that is publicly available; internet measurement companies seem to write their own. If you have a popular web site, you can buy information from these companies (such as Media Metrix, Nielsen Netratings, and Taylor Nelson Sofres). But if you don't have a popular web site, they won't have the information, because the panels they use are a tiny proportion of the population.

The largest randomly selected panel I know of is a US one with 65,000 members. Though that's a big sample, there are around 100 million internet users in the US, so the panel includes only one in 1,500 of them.

If the real audience of a web site is about the same each week, figures reported from a panel shouldn't vary wildly from one week to the next. A rule of thumb is that the Law of Large Numbers begins to apply with samples of about 30. If fewer than 30 panellists visit a site in the average week, the recorded figures are likely to fluctuate wildly, and will therefore be useless. If a panel contains one 1,500th of the population, the real number of visitors to a page will need to be about 1,500 times 30, or 45,000 a week - or whatever the reporting period is (but the internet is moving so fast that reporting periods need to be fairly short). And even in the US, most web pages receive nowhere near 45,000 visitors a week. So even such an enormous panel can produce reasonably accurate visitor counts only for very popular pages.
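The arithmetic is simple enough to set out directly. This little Python sketch just restates the rule of thumb, with all the figures taken from the paragraph above:

# Rule of thumb: panel estimates only stabilize once about 30 panellists
# visit a site per reporting period (the Law of Large Numbers).
panel_size = 65_000          # members in the US panel
population = 100_000_000     # approximate number of US internet users
min_panellists = 30          # minimum panel visitors per week

sampling_fraction = panel_size / population
min_real_audience = min_panellists / sampling_fraction

# Prints about 1,538 and about 46,154; the text rounds the fraction
# to 1 in 1,500, giving the figure of 45,000 a week.
print(f"One panellist represents about {1 / sampling_fraction:,.0f} users")
print(f"Minimum weekly audience for stable figures: about {min_real_audience:,.0f}")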

Measuring ISP caching

Another way to measure the popularity of web sites is to make use of the ISP caches. As most users access the internet through ISPs, and all ISPs seem to use proxy caching, analysis of the ISPs' own logs should provide an accurate reflection of web site use - as long as the sample of ISPs is fully representative.

There are a few problems with this method:

An example of this approach is the Australian organization Sinewave Interactive, also known as Hitwise. Its web site (at www.hitwise.com.au) includes an explanation of how it measures the popularity of web pages.

The above notes on measuring internet use apply only to the Worldwide Web, not to other internet protocols such as email. Email traffic is much easier to measure: there are no problems with caching, proxies, etc.

If a publisher encourages its audience to send email messages, incoming email can be measured quite easily. This can either be done manually, by sorting and counting incoming emails, or (on a larger scale) using software specially designed for handling email: this can be extremely expensive.

Measuring outward email is easy, too. Here I'm thinking of a single message, sent to many recipients - e.g. a newsletter sent by email.

If a mailing list is kept on a spreadsheet or database file, the number of recipients can be noted directly. What is not so easy is measuring how many of those recipients actually read the email. Many email programs issue read receipts, and if a message cannot be delivered, the recipient's mail server will usually inform the sender. But some email programs don't send read receipts - so a message may appear to have arrived, even though the recipient's address no longer exists. Also, the fact that a remote server has received an email message is no guarantee that the recipient will read it.

One of the best ways to estimate readership of an email newsletter is to send out a message asking readers to reconfirm their subscription. This will give a minimum number of keen readers, but will miss out others who may value the newsletter, but not reconfirm because they are on holiday, having temporary computer problems, etc.

And of course, people often forward email messages to friends and colleagues - so there are likely to be more readers than the subscription database shows.

A useful fact for a web site's owner is how long visitors spend looking at each page. The easiest way to measure this is with software on a user's own computer. This software records the time when each page is downloaded, and from that it's possible to work out how long the user could have spent looking at each page. For example, if one page is downloaded at 11:03:00 and the next page at 11:04:00, the user could have spent up to 1 minute looking at the first page. That's the maximum possible: when a page stays on the screen for 10 minutes or longer, the user may well have left the computer unattended - so measures of time spent must be regarded as maxima.
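Here's a minimal Python sketch of that calculation, for one user's session. The data layout (a list of time and page pairs) is invented for illustration:

from datetime import datetime

# One user's session: when each page was downloaded (hypothetical data)
session = [
    ("11:03:00", "/index.html"),
    ("11:04:00", "/programs.html"),
    ("11:14:30", "/contact.html"),
]

times = [datetime.strptime(t, "%H:%M:%S") for t, _ in session]
for i in range(len(session) - 1):
    page = session[i][1]
    seconds = (times[i + 1] - times[i]).total_seconds()
    # A maximum only: the user may have left the computer unattended
    print(f"{page}: up to {seconds:.0f} seconds on screen")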

Email surveys

Electronic mail is now widely used in western countries, and the number of users is rapidly growing. Right now (early 2003) over 60% of adults in North America, northern Europe, and Australia have email access, with continuing growth of around 5% a year.

Email is very cheap to use - after the equipment has been paid for. To send an email message to any number of recipients costs no more than a short telephone call. Email surveys therefore have tiny distribution costs: there are no expenses for printing questionnaires, interviewing, postage, freight, or data entry. With the high level of coverage and enormous cost savings, it's tempting to use email for surveys. However, there are several practical problems.

ASCII email

The principle of ASCII (plain text) email questionnaires is that users are expected to echo the questionnaire back, adding answers. (With most email software, a reply to a message will automatically include the message being answered. Each line of the answered message usually begins with a > symbol. This is called echoing.) For example, a questionnaire might include this question:

Q3. Are you
[ ] male, or
[ ] female?
Please put an X in one of the above boxes.

The user is expected to enter the answer thus:

> Q3. Are you
> [ ] male, or
> [x] female?
> Please put an X in one of the above boxes.

To interpret this result, the analysis software will look first for a line beginning > Q3, then for a line in which the first letter is x (or X). If a female respondent puts an F in the box instead of an X, some software won't recognize this. And of course, not all respondents put their X inside the box - which is not really a box at all, but a space between two brackets, thus [ ]. There are many different ways in which people can - and do - answer such questions. Though a human can easily work out the intended answer, for software it's much more difficult.
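Here's a rough Python sketch of that interpretation step, for the Q3 example above. It handles only the tidy case - an X or x inside the brackets - and, as just noted, real replies are far messier:

import re

def parse_q3(reply_lines):
    """Find the echoed Q3 block and return the option marked with an X."""
    in_q3 = False
    for line in reply_lines:
        body = line.lstrip("> ").rstrip()     # strip the echo marks
        if body.startswith("Q3"):
            in_q3 = True
        elif in_q3:
            if re.match(r"Q\d", body):        # reached the next question
                break
            marked = re.match(r"\[\s*[xX]\s*\]\s*(\w+)", body)
            if marked:
                return marked.group(1)        # e.g. "female"
    return None                               # no recognizable answer

answer = parse_q3(["> Q3. Are you",
                   "> [ ] male, or",
                   "> [x] female?"])
print(answer)    # female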

Also, for this to work, users must have their email software set to echo messages. Though most people do this (even if they don't know it), some don't. Some don't know how to change the settings to make echoing possible, and some email software doesn't allow echoing.

Because of these problems, the only situation where I recommend using ASCII email surveys is when there are very few questions (say 5 at most), sample sizes are small (say 100 maximum), and most questions are open-ended. A human can then combine the responses into one file by cutting and pasting. This, of course, is data entry - but in a messier form than usual. And with this process, it's very easy to make mistakes.

HTML email

The alternative to ASCII email is HTML email. This is growing in popularity, though some people with older email software can't use it, others could use it but don't want to, others don't know how to set their email software to use it, and many users don't realize it exists.

HTML email has far fewer restrictions than ASCII email. Most elements that can be displayed on a web page can also be included in HTML email - including questionnaires, created with HTML forms. To complete a questionnaire, a respondent with HTML email simply reads the form, fills it in, and returns it by pressing the REPLY button.

Here's how the above example might look with an HTML email questionnaire:

Q3. Which sex are you? 
(Please click one button.)
o male
o female

When the user clicks one of the round buttons, a dot will appear in its centre. When the questionnaire is returned, the result for that question could be found in a string of characters such as this:

ID=163&Q1=yes&Q2=""&Q3=male&Q4=never

Each email questionnaire returned arrives back at the survey office as a separate message: one line of data in a format similar to the above. For analysis, all replies are combined into a single data file. Though every reply should be individually checked to ensure that it is in the correct format, less seems to go wrong with HTML email than with ASCII email. Sometimes there are data transmission errors, which cause a respondent's data to be changed. When this happens, it can mess up the combined file, causing nonsensical results.
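Combining those one-line replies into a data file is straightforward. Here's a minimal Python sketch using the standard library's query-string parser; the field names match the example string above, and everything else (file names, layout) is assumed:

import csv
from urllib.parse import parse_qs

# One line per returned questionnaire, in the format shown above
replies = ['ID=163&Q1=yes&Q2=""&Q3=male&Q4=never',
           'ID=164&Q1=no&Q2=""&Q3=female&Q4=often']

fields = ["ID", "Q1", "Q2", "Q3", "Q4"]
with open("survey_data.csv", "w", newline="") as out:
    writer = csv.DictWriter(out, fieldnames=fields)
    writer.writeheader()
    for line in replies:
        parsed = parse_qs(line, keep_blank_values=True)
        # parse_qs returns lists, e.g. {"Q3": ["male"]}; take the first value
        writer.writerow({f: parsed.get(f, [""])[0] for f in fields})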

Another problem is writing a questionnaire in HTML email format. People normally write HTML email using a word processor, such as Microsoft Word. As far as I know, such word processors cannot produce HTML forms, nor am I aware of any software designed specifically for this purpose.

There are programs designed for creating web pages, which work like word processors, but produce HTML: for example, Filemaker Home Page, Dreamweaver, and Microsoft Front Page. Most of these can create forms - at least to some extent. Others, such as Netscape Composer, cannot create forms. These form pages can be sent as HTML attachments - except by Front Page, which can create a form only on a web site.

I've tried to find a figure for the percentage of email users whose software can read HTML email, but without success. My guess is that it's about 30% at the moment, but it should grow rapidly over the next few years.

Email attachments

One way around the messiness of ASCII email questionnaires is to send a questionnaire as an attachment. This can be done in several ways: the attachment can be a word processor document, a spreadsheet, a computer program, or an HTML document.

Word processor attachments

A questionnaire can be a word processing document, in a format that as many people as possible can read. The most widely available format is RTF (rich text format). Almost all post-1995 computers can read RTF files.

Though the questionnaire in word-processor format will look more attractive than in plain ASCII format, there's still no guarantee of consistency about how and where respondents put their answers - unless you create a word processing form, with fields that can be filled in.

Another alternative that can create forms with fields for users to fill in is the Acrobat PDF format. Most internet users in businesses now have software that can read PDF files. This has the advantage that respondents can enter their answers only in the form fields.

Special software attachments

An attachment can be a piece of software, which launches a questionnaire: this is data entry software, with a particular questionnaire built in.

The advantage of this method is that (as long as the user has the patience to wait while the program downloads) the software can be very sophisticated, including elaborate skipping and checking. But there are more disadvantages than advantages: many people are suspicious of an executable attachment (fearing viruses), the program must suit each respondent's type of computer and operating system, and the download can be large and slow.

A software attachment would work best with a panel, in which the members have some established degree of trust with the survey organization, and the organization has staff who can help less technical members install the software (e.g. by phone), and the software need be downloaded only once.

HTML attachments

In this case, the attachment is an HTML form, which can be opened with the respondent's browser software. Almost every computer with email access will also have an HTML browser (usually Netscape or Internet Explorer). Some people who work in large organizations have external email at their workplace, but no Web access. However, they often have intranet access, and therefore have a browser. Thus they can complete a questionnaire sent as an HTML attachment, even though they could not see a Web-based questionnaire.

As far as respondents are concerned, an HTML email attachment arrives in the form of email, but when they click on it, it works exactly like a Web-based survey; see the section on Web Surveys (below) for details. Some email programs seem to automatically display an HTML attachment, without the user having to click on it.

This is my preferred method of doing surveys by email. It doesn't depend on respondents having particular operating systems or email software, and the HTML files are small, so questionnaires download quickly.

The main disadvantage is that the attachment must be opened. Respondents who don't click on the attachment symbol will not see the questionnaire. Another disadvantage is that some people are suspicious of email attachments, and will not open them.

The main body of the email message should be a letter, explaining the purpose of the questionnaire and urging respondents to fill it in. Such covering letters are essential with mail surveys, and they are equally necessary with email surveys.

Here's an example of such a message:

To: <<subscriber list>>

From: Bertha Bluggs <dddd@geranium.com.nu>

Subject: Survey about Geranium web site

Dear Geranium user,

As a subscriber to our newsletter, you probably have some opinions about our web site. We are seeking reactions from our site's visitors about some possible ways of improving it - so we'd be very pleased if you could take a few minutes to give your opinions, using the attached questionnaire.

To see the questionnaire, just click on the icon below, and the questionnaire will open in your browser program (e.g. Netscape or Internet Explorer). When you have finished answering the questions, please click on the SUBMIT button at the end.

If you don't want to complete this questionnaire - or any questionnaire from us - please reply to this email with a brief explanatory comment. Then we shan't send you a reminder.

But we do urge you to do the survey, even if you don't often use our web site. We greatly value all feedback.

Yours sincerely,

Bertha Bluggs

Definitive Data Development Director

Geranium questionnaire

Some comments on that message...

Web surveys

Though more people use email than use the worldwide web, the Web is more standardized than email, so for respondents with Web access, these surveys can seem easier to do. At any given time, thousands of Web surveys are being done - but most of these pay no attention to sampling, and will produce answers which can't be trusted. For a web survey to produce usable results, a useful population must be defined, and a high response rate must be achieved.

A web survey can be produced in several ways, all equivalent to producing a web site.

1. Manual programming.

The questionnaire (or form as it is called in HTML) can be programmed manually. This usually involves the questionnaire designer writing a paper questionnaire, then turning it over to a programmer, who will convert it into code so that it will be displayed as a web page. This code is usually written in HTML, sometimes with Javascript added, sometimes with other computer languages. Because the result is a computer program - and computer programs never work properly without revision - it will need to be thoroughly tested before being put on the Web.

If automatic skipping and value-checking are included, particular attention needs to be paid to checking these - unlike a face-to-face survey, there is no interviewer to fix the mistakes. Also, the web version needs to be tested on a wide variety of new and old computers, and operating systems, and browser versions. If Javascript sections are included, these need to be tested even more thoroughly.

2. Web composition software

An easier method is to use a web page composition program to produce the questionnaire. This is much easier than writing in HTML or Javascript; it is similar to using a word processing or desktop publishing program. However, some flexibility is lost, and some complex types of layout are not possible. Programs that can do this include NVu, Filemaker Home Page, Microsoft Front Page, Adobe Go Live, and Dreamweaver. Apart from NVu (which is free), software in this category costs several hundred dollars.

Whether a questionnaire is written directly in HTML (and/or other languages), or produced with web composition software, it then needs to be uploaded to a web site so that respondents can fill it in. The next challenge, after a questionnaire has been completed, is to get the data back in a usable form. That's not so easy: more software is required. This is discussed in more detail below.

3. Simple web survey software

The next step up is to use a program designed specifically for web surveys. These programs typically handle the whole process: creating the questionnaire, placing it on the web, and collecting the answers.

Some examples of these programs are
Survey Solutions (www.perseus.com),
WWW Survey Assistant (www.mohsho.com/s_ware),
Surveypro (www.surveypro.com), and
Powertab (www.powerknowledge.com) [not working June 2006].

All are relatively cheap (around several hundred dollars) but in various ways limited in their capabilities. WWW Survey Assistant is free for noncommercial users. Powertab is particularly easy to use, but is available only for Macintosh. Free demonstrations are available for all of these, by visiting their web sites. All the programs in this section and the next include a relatively easy way of getting the data back to the survey organizer.

4. Heavy-duty internet survey software

If you want really powerful software, you need to pay 1,000 dollars or more. For this you get features such as:

This type of software is much more difficult to use than the programs listed above; it's designed for regular use by market research companies, and this is reflected in its price and complexity. Some examples are
Senecio MaCati e-poll (www.senecio.com),
SurveySaid (www.surveysaid.com), and
Websurv (www.websurv.com).

I've tried several of these (though not in detail), and they seem to work well - but they require detailed computer knowledge. If you're having trouble understanding this chapter, you probably won't be able to use this type of software without getting expert help.

5. Have somebody else do it

Instead of running the whole survey yourself, you can place the questionnaire on your own web site, but have another organization receive the answers and analyse the results for you. This requires much less Internet knowledge. Some organizations that offer such a service are
Sysurvey (www.sysurvey.com),
Zoomerang (www.zoomerang.com),
Dubidu (www.dubidu.dk), and
Free Online Surveys (www.freeonlinesurveys.com).

Many of these have free versions, but with limitations on the sample size - often about 50 respondents - not enough for a real survey, but enough to try out the system.

This type of service is best suited to the situation where you know exactly what you want to ask, and simply want to divide your website users into several categories, depending on their answers to a set of multiple-response questions. For example, you can find out their demographic details: their age groups, sexes, occupations, where they live, and so on. If you're interested in trying out internet surveys, and aren't sure where to start, I suggest beginning with this option.

Finding respondents

If a questionnaire is on a web site, how can respondents find it, to fill it in? There are several possibilities here, including links, email/web surveys, and pop-ups.

Links

The easiest method is to put a link on your home page, saying something like "Take part in our user survey." When the web was new, way back in 1996, and web surveys were rarer still, this approach worked quite well: 50% to 65% of people who looked at our page urging them to take part in the survey sent in a completed questionnaire. More recent researchers have reported much lower response rates, and I expect these to decline further as the novelty wears off.

Email with web link

A combination email/web survey has also worked well for me. With this system, you first collect the email addresses of potential respondents. One way to do this is the link method described above: putting a link on a web page saying "Take part in our survey". When users click on the link, instead of being taken to the questionnaire, they are asked to send in their email address. The survey organizer, on receiving this inquiry, answers with an email giving the web address of the questionnaire.

You might be wondering: why use such an indirect method? Wouldn't it be easier to send them straight to the questionnaire? The reason is that with this method, you know the email addresses of the potential respondents, and you can chase them up when they don't complete the questionnaire. This produces a much higher response rate - as long as you send enough reminders. For this to work, respondents must put their email address on the questionnaire they return; it's not possible to automatically detect an email address from a web questionnaire.

The email message that tells would-be respondents where on the Web to find the questionnaire can also give them a password. If everybody has a different password, this will usually prevent them from submitting the questionnaire twice, and deter non-eligible people who stumble on the site from completing the questionnaire. Unfortunately, people lose their passwords, get them wrong, and don't trust them, so using a password will cut the response rate. A better method is to create a separate web address for each respondent, redirecting the respondent to the real questionnaire page. That way, people don't know they have a password.
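Here's a minimal Python sketch of that better method: generating a separate, unguessable address for each respondent. The address format and the email addresses are inventions for illustration:

import secrets

# Hypothetical list of potential respondents' email addresses
respondents = ["ayesha@example.nu", "kemal@example.nu"]

tokens = {}
for email in respondents:
    token = secrets.token_urlsafe(8)    # an unguessable code, e.g. "qZ3vXk1xQ2E"
    tokens[token] = email
    # Each respondent gets a personal address that redirects to the real
    # questionnaire page; they never see a "password" as such.
    print(f"{email}: http://www.example.nu/s/{token}")

# Later, when a completed questionnaire arrives via /s/<token>, look up
# tokens[token] to know who has responded - and whom to remind.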

Pop-ups

A development which proved very effective for a year or two was the pop-up questionnaire. Somebody looks up a web page, and as they do so (or as they leave the site) a new window pops up on their computer screen. "Would you like to fill in our questionnaire?" it asks. At first, most people said Yes, and completed the questionnaire. An example of this can be seen at www.customersat.com.

One advantage of the pop-up system is that it doesn't need to pop up for all users. For a very popular web site, the pop-up could be programmed to appear only for every 100th visitor. To make sure nobody is asked to do the questionnaire twice, a cookie can be used: i.e. a small file is sent to the user's computer. Next time that user visits that web site, the site checks for the presence of that cookie. If it is found, it means that this user (actually, this computer, though perhaps a different person) has already been asked to do the questionnaire. Some software designed for web surveys now includes the ability to produce pop-up questionnaires.
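Here's a rough sketch of that show-it-to-every-100th-visitor logic with a cookie check, written as server-side Python. The cookie name and the 1-in-100 rate come from the example above; everything else is assumed:

import random
from http.cookies import SimpleCookie

ALREADY_ASKED = "survey_asked"     # hypothetical cookie name

def popup_cookie_header(cookie_header):
    """Return a Set-Cookie header if this visitor should get the pop-up,
    or None if they shouldn't."""
    cookie = SimpleCookie(cookie_header or "")
    if ALREADY_ASKED in cookie:
        return None                # this computer was already asked
    if random.random() >= 0.01:    # only about every 100th visitor
        return None
    reply = SimpleCookie()
    reply[ALREADY_ASKED] = "yes"   # remember not to ask this computer again
    return reply.output(header="Set-Cookie:")

print(popup_cookie_header(""))     # usually None; occasionally a Set-Cookie header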

The main disadvantage of approaching web site visitors with a pop-up questionnaire is exactly the same disadvantage as interviewing in a public place: you can't follow up the visitors who didn't complete the questionnaire. Even if you can achieve a relatively high one-off response rate of 30%, you still know almost nothing about the other 70% of visitors.

Since about 2002, when the novelty of pop-up windows rapidly wore off, the problem with pop-ups has been that many people assume they are advertising, and close them without reading them. To improve the response rate, you can (a) pop the questionnaire up when visitors leave your site rather than when they arrive, and (b) warn them that a pop-up may appear.

Combining several methods

A common way to do internet surveys is to combine an email notification with a Web-based questionnaire. For example, people who have already indicated interest in a survey are sent an email message like this:

Dear FM99 listener,

You told us recently that you are interested in giving your opinions about recent FM99 programs. Our questionnaire is now ready. It's on the Worldwide Web at
http://www.fm99.org.nu/survey.html

I'd be very grateful if you could fill it in within the next day or two. If you have any problems, or want your name removed from our mailing list, please email our help desk:
kemal@fm99.org.nu or telephone Kemal at 09 181 262

Thanking you in advance
Eugene Shurple
Station Manager

Most recipients will have email software that shows the above web address and email address underlined, usually in blue. Clicking on the first blue line will bring up the questionnaire on the web site. Note how each internet address in the above message begins on a new line - if an address wraps onto a second line, clicking on it sometimes doesn't work. Note too the concern for respondents' privacy - allowing them to have their name removed from your list. (But make sure you really do remove the name - especially if you later send a reminder! Usually only a few percent of people want their name removed, but they get extremely angry if you promise to remove it, then don't.)

From the respondent's point of view, this method (email leading to a web site) is much the same as an HTML attachment. If the questionnaire is in multiple parts, and each part is checked after it is completed, this method is preferable to multiple HTML attachments - with those, some people will be sure to complete them out of sequence.

Getting answers back

This is the difficult part. The Worldwide Web wasn't designed for two-way communication. Though it's quite easy to create a web site, it can be surprisingly hard to collect questionnaire responses. Partly this is because of the danger of hacking: webmasters want to protect their computers from malicious hackers.

Getting information back into a host computer usually involves CGI: the Common Gateway Interface. This means that, for every survey, a CGI program must be written, often in a computer language called Perl. It's something that has to be done by a computer programmer, with the co-operation of the ISP (the owner of the host computer).
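To give the flavour of what such a program does, here's a minimal CGI sketch - in Python rather than Perl - that simply appends each submitted questionnaire to a data file. The script name, file name, and fields are assumptions:

#!/usr/bin/env python3
# survey.cgi - receive one completed questionnaire and store it (a sketch)
import os, sys
from urllib.parse import parse_qs

length = int(os.environ.get("CONTENT_LENGTH", 0))
form = parse_qs(sys.stdin.read(length))       # e.g. {"Q3": ["male"], ...}

# One line per respondent, in the ID=...&Q1=... format shown earlier
row = "&".join(f"{name}={values[0]}" for name, values in sorted(form.items()))
with open("responses.txt", "a", encoding="utf-8") as data_file:
    data_file.write(row + "\n")

print("Content-Type: text/html")
print()
print("<p>Thank you - your answers have been recorded.</p>")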

There are several easier ways to get information back, which don't involve CGI programming. The free programs Formmail (see www.formmail.to) and CGIemail are widely used. They work by sending the completed questionnaires back to you as emails. The completed questionnaires arrive one at a time in your email inbox. You must then combine them into a single file for analysis, and convert that file into a format which normal survey analysis software will read. Software exists which will do this: it tends to be either incredibly expensive (because it's designed for huge companies to manage email from their millions of customers) or free (because a program which does only this is very simple to write). With all software which sends completed questionnaires as email, the survey organizer will know the sender's email address. This process is not anonymous, and some email programs bring up a rather scary message when a respondent goes to send an email questionnaire. This will deter some people from submitting their answers.

Another easy method of getting completed data back involves Front Page extensions. If the Microsoft program Front Page is used to create a questionnaire, it offers an option of collecting the data from completed questionnaires in a single computer file - which can then be analysed by standard survey software. I have used Front Page several times, but it has some annoying problems that make it more difficult to analyse the data. For example, if a respondent presses the Enter key to start a new line in an open-ended answer, Front Page records this as a new respondent. Therefore you need to study the data files in detail before processing them with survey analysis software.
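Here's a small Python sketch of that pre-processing step. It assumes (as in the earlier examples) that each genuine record starts with ID=, and rejoins any line that doesn't:

# Repair a data file in which pressing Enter inside an open-ended answer
# has split one respondent's record across several lines.
records = []
with open("frontpage_data.txt", encoding="utf-8") as raw:
    for line in raw:
        line = line.rstrip("\n")
        if line.startswith("ID=") or not records:
            records.append(line)            # a genuine new respondent
        else:
            records[-1] += " " + line       # rejoin the broken record

with open("repaired_data.txt", "w", encoding="utf-8") as fixed:
    fixed.write("\n".join(records) + "\n")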

Whether you get questionnaires back as email or in a file, you will need survey analysis software. The package programs listed above handle analysis as well as questionnaire creation, so it may be easier to buy one of those.

Analysis and reporting

When you have all your responses in a single computer file, you can analyse them - either with the software used to analyse all other surveys (SPSS and the like) or using the analysis options included in some of the web survey software.

One advantage of using the special-purpose web survey software is that it often presents the results as a web page. Standard analysis software can't do this (yet). So if you'd like the world - or some password-protected corner of the world - to be easily able to read your survey results, the special web survey software will be best.