Audience Dialogue

Using Excel for survey analysis (3)
Analysing numeric variables

The best way to handle numeric variables with Excel is the Data Analysis Toolpak. It comes on the standard CD with Excel version 97 and above (maybe even Excel 95), but is often not installed. To check whether it's installed on your computer, look in the Tools menu in Excel - not the more obvious Data menu. If you see Data Analysis (probably at the bottom) the Toolpak is ready to use. If you don't see Data Analysis, you'll have to find the CD and do a custom installation. When you're ready, click on Data Analysis, and you'll see a window like this:

The two main tools to use are Descriptive statistics (which gives you averages, standard deviations, etc) and Histogram (which gives you a frequency distribution - though not necessarily a histogram).

When you choose Descriptive Statistics, you see a window like this:

You need to pick an input range. If you have entered one questionnaire per row, the input range will be a column - so click on Columns, enter the column number (e.g. you have to type in D:D for column D) and click the Labels in first row box if that applies. In the lower section, labelled Output options, choose Summary statistics. I recommend keeping the default option of New Workshop Ply.  Whatever "Ply" means, it puts the results in a new window.

If any cell in the chosen column (apart from the label in row 1) contains a character that's not a number or a blank space, it won't work. So if you've entered, say, hyphens to mean "not answered" you'll have to delete them all. If all is well, you get a table like this:

B2

Mean

1.9875

Standard Error

0.14506

Median

1

Mode

1

Standard Deviation

1.297454

Sample Variance

1.683386

Kurtosis

-0.11934

Skewness

1.093655

Range

4

Minimum

1

Maximum

5

Sum

159

Count

80

You can analyse several columns at a time, such as columns A to D, by entering A:D (or whatever) in the Input Range box in the Descriptive Statistics window (see above).

To get frequency distributions, you need to choose Histogram. But before you do that, you must set up a bin range. This is just a list of the top ends of the frequency groups. Suppose your questionnaire asked the postcodes of people's addresses, which might vary from 5000 to 5650. If you don't want to see every individual postcode, with maybe 1 respondent living in that area, you could group the postcodes in sets of 100 - e.g. 5000-5100, 5101-5200, and so on. So the figures for your bin ranges will be 5100, 5200, and so on up to 5700.

If you want to see how many respondents lived in each individual postcode, you'd need to set up a bin range for each one: 5001, 5002, 5003, and so on.

Enter the bin range numbers in an unused part of the spreadsheet. The foot of the column you're analysing might seem like a good choice, but unless you're very careful, this can cause nasty errors later. So put the bins in an unused column way out on the right, like this (cells M1 to M8 in this example; M1 is the heading).

postcode bins

5100

5200

5300

5400

5500

5600

5700

Now you can specify the Input Range (the column label will do, entering column B as B:B) and the Bin Range. If both of these have labels in row 1, tick the Labels box. Finally, choose some Output options.

 The results come up in a new worksheet:

postcode bins

Frequency

Cumulative %

postcode
bins

Frequency

Cumulative %

5100

13

3.07%

5500

98

23.11%

5200

2

3.54%

5700

95

45.52%

5300

79

22.17%

5300

79

64.15%

5400

75

39.86%

5400

75

81.84%

5500

98

62.97%

5600

62

96.46%

5600

62

77.59%

5100

13

99.53%

5700

95

100.00%

5200

2

100.00%

More

0

100.00%

More

0

100.00%

This is actually two tables, presenting the same data in different order. The one on the left is created by ticking Cumulative percentage in the Output options. The table on the right is created by ticking the Pareto (Sorted histogram) box. You don't usually need both tables.

The chart (the actual histogram) is usually a bit of a mess, but you can edit the components to make it look more presentable. (Hint: click the right-hand mouse button in various places on the chart to see menus of what's possible right there.)

Next » Analysing nominal variables (slow to load - lots of screenshots)