SPSS 9 - 13

 This is a version of Appendix 2 of the book “Applied Regression Analysis: A guide for students and researchers.”  It is intended to be read with the book, but should work as a standalone guide to SPSS.  More versions will be added, for different programs and versions of SPSS, when we perceive that there is demand, or when SPSS changes.


In this section we will describe how to use SPSS™ version 9.0 for Windows™ to carry out some of the procedures we have described in the book. The regression procedure in SPSS has changed only a little since the Windows version was released, so you should be able to follow everything if you are using any version from 5.0 onwards. Of course, we don’t know what will happen in the future, but we have no reason to believe that the procedure will change dramatically.

Linear Regression

Main Dialog Box

To find the main regression dialog box, click on the Analyze menu item (labelled Statistics in older versions of SPSS). Then click Regression, and then Linear . . . You should see something like Figure 1.

Figure 1: Linear regression dialog box in SPSS 

A: The list of variables in your dataset. Depending on how SPSS is set up, you might see the variable labels here, rather than the variable names.

B: Choose some additional statistics to go in the output. See the section below.

C: Choose what plots you would like drawn. See below.

D: Choose what information you would like saved in your data file. See below.

E: Some additional options. See below.

F: OK.  Press this when you have finished, to run the analysis. 

G: Reset all values back to their defaults. Useful if you want to run a completely different analysis.

H: Cancel and ignore any changes that have been made.

I: Help. Get some help.

J: These buttons move variables between the variable list on the left (A), and the independent and dependent boxes (N and K). 

K: The dependent variable. Use the button (J) to put your dependent variable into this box.

L: Next block. This is used when carrying out hierarchical regression (see Chapter 2) to add variables in blocks.

M: The variable selection technique. Choices are enter, stepwise, remove, backward, forward. See Chapter 2.

N: The list of independent variables.

Statistics Dialog Box

Figure 2 shows the dialog box that appears when you click the Statistics button.

Figure 2: The statistics dialog box

A: Estimates. Tick this box to get the parameter estimates (the constant and slope coefficient for the variables). It is ticked by default.

B: Confidence intervals. Tick this box to get confidence intervals around the parameter estimates. These are calculated from the standard errors.

C: Casewise diagnostics of the residuals. This will print out details of any cases whose residuals are further from the mean than the distance you specify in D. See Chapter 4.

D: Tells SPSS which outliers to print out, by setting how many standard deviations from the mean a residual must be before the case is listed.

E: Model fit. Tick this box to get R, R², adjusted R² and the ANOVA table. It is ticked by default.

F: R² change. Tick this box to get a value, and a significance test, for the change in R² when additional blocks are entered. This is used when carrying out hierarchical regression (see Chapter 2).

G: Descriptive statistics for all of your variables: means, standard deviations, etc.

H: Tick this box to get the Tolerance and VIF (Chapter 6).

I: Continue. Click here when finished.

J: Cancel. Go back and ignore all changes.

K: Help. Get additional help.
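Several of the quantities requested in this dialog box are simple functions of R². As a rough illustration outside SPSS, the Python sketch below computes adjusted R², the F statistic for an R² change, and the tolerance and VIF using the standard textbook formulas. The function names and the example numbers are hypothetical and chosen only to show the arithmetic; they are not SPSS output.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n cases and k independent variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def r2_change_f(r2_reduced, r2_full, n, k_full, k_added):
    """F statistic for the change in R-squared when k_added variables are
    entered, giving a model with k_full independent variables in total."""
    df_resid = n - k_full - 1
    r2_change = r2_full - r2_reduced
    return (r2_change / k_added) / ((1 - r2_full) / df_resid)


def tolerance_and_vif(r2_j):
    """Tolerance and VIF for one independent variable, where r2_j is the
    R-squared from regressing it on the other independent variables."""
    tolerance = 1 - r2_j
    return tolerance, 1 / tolerance


# Hypothetical values, chosen only to show the arithmetic.
print(adjusted_r2(r2=0.50, n=21, k=2))
print(r2_change_f(r2_reduced=0.40, r2_full=0.50, n=21, k_full=2, k_added=1))
print(tolerance_and_vif(r2_j=0.75))
```

The F statistic has k_added and n - k_full - 1 degrees of freedom; SPSS reports the corresponding significance level alongside the R² change.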

Plots Dialog Box

Figure 3: Plots dialog box

A: The list of possible variables to plot. (See chapter 4 for descriptions). 

B: Residual plots, to check for normality. 

C: Next button, to add more plots.

Save Dialog Box

This dialog box allows you to calculate and save a range of variables in your dataset.

Figure 4: The save dialog box.

A: Predicted values (see Chapters 1, 2 and 4). An adjusted predicted value is the predicted value for a case when the analysis is carried out without that case.

B: Distance statistics. See Chapter 4 for descriptions.

C: Residuals. See Chapter 4 for descriptions.

D: Influence statistics (see Chapter 4).
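The idea behind the adjusted predicted value can be sketched in a few lines of Python. This is only an illustration for a single independent variable, using made-up data and hypothetical function names; it is not how SPSS computes the values internally.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def adjusted_predicted(xs, ys):
    """Predicted value for each case from a model fitted without that case."""
    result = []
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(rest_x, rest_y)
        result.append(intercept + slope * xs[i])
    return result


# Hypothetical data: the last case lies well off the line through the others.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 20.0]
intercept, slope = fit_line(x, y)
predicted = [intercept + slope * xi for xi in x]
adjusted = adjusted_predicted(x, y)
for xi, p, a in zip(x, predicted, adjusted):
    print(xi, round(p, 2), round(a, 2))
```

A large gap between the ordinary and the adjusted predicted value for a case suggests that the case is pulling the regression line towards itself.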

Options Dialog Box

Figure 5: Options Dialog Box

A: Alter the entry and exit criteria for stepwise methods. 

B: Determine what to do with missing data.

Computing and Recoding Variables

There are two ways in SPSS of calculating new variables from old. When you are dealing with continuous data, the most convenient method is the compute dialog. When dealing with categorical data, for example to create dummy variables (see Chapter 3), the most convenient method is the recode dialog.

The Compute Dialog

When you want to carry out some sort of transformation on a scale, for example a non-linear transformation (see Chapter 6) or creating a moderator (see Chapter 7), you should use the compute variable dialog. This is found in the Data Window, by clicking the Transform menu, followed by Compute.

Figure 6: Compute variable dialog box

A: Target variable. This is the name of the new variable to be created.

B: Variable list. The names of the variables in the data file.

C: Numeric expression. Here we put the calculation that will be used to create the new variable.

D: Some mathematical expressions to put in the numeric expression. 

E: Functions to add in the numeric expression, for example log (see chapter 6). 

Some examples of the compute box: 

To create a new variable called bxa, which is the product of books and attend:

·In the target variable box, write bxa

·In the numeric expression write books * attend

·Press OK. 

To create a new variable called hassles3, which is the cube of hassles:

·In the target variable box, write hassles3

·In the numeric expression write hassles ** 3

·Press OK. 

To create a new variable called logtime, which is the log of a variable called time: 

·In the target variable box, write logtime

·In the numeric expression write log(time)

·Press OK.
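For readers who want to check the arithmetic of these three examples outside SPSS, here is a minimal Python sketch. The data values are made up, and the natural logarithm is assumed for logtime; if you need a different base, change the call accordingly.

```python
import math

# Hypothetical cases; each dictionary is one row of the data file.
cases = [
    {"books": 2, "attend": 10, "hassles": 3, "time": 1.0},
    {"books": 0, "attend": 14, "hassles": 5, "time": 20.0},
]

for case in cases:
    case["bxa"] = case["books"] * case["attend"]   # product of books and attend
    case["hassles3"] = case["hassles"] ** 3        # cube of hassles
    case["logtime"] = math.log(case["time"])       # natural log (assumed base)

for case in cases:
    print(case["bxa"], case["hassles3"], round(case["logtime"], 3))
```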

The Recode Dialog Box

The recode dialog box is used when we want to manipulate categorical variables. This is most commonly done when we want to take a categorical variable with more than two levels, and turn it into a series of dummy coded variables, to represent the categories (see Chapter 3).

Imagine we had a categorical variable, called group, which had three possible values. The value 0 indicates that the person is in the control group, 1 indicates they are in group 1 and 2 indicates they are in group 2. We want to turn this into two dummy coded variables that represent membership of group 1 and group 2. (We do not need a third variable to represent membership of the control group; see Chapter 3 for an explanation of why.) We will call the two new dummy variables group_1, which will be equal to 1 if the person is in group 1, and otherwise 0, and group_2, which will be equal to 1 if the person is in group 2, and otherwise 0. Table 1 shows the possible values of the three variables.

Table 1: Possible values of group, group_1 and group_2

group    group_1    group_2
0        0          0
1        1          0
2        0          1

The recode process has two steps. In the first step we tell SPSS what name we would like the new variable to have. In the second step we tell SPSS what we would like the values in the new variable to be. To recode variables, first select the Transform menu, then the Recode item, and then Into Different Variables . . .

Step 1: Name new variables. The dialog box to do this is shown in Figure 7.

Figure 7

A: The variable list, with which we are becoming familiar. 

B: The button to move variables to and from the variable list.

C: The list of old variables, linked to their new variables.

D: The list of new variables. Type the name of the new variable here, and then click the Change button (F). In our example we would first type group_1 here.

E: Use this button to select the next dialog box, where we tell SPSS what values are to change.

F: The Change button adds the name of the variable that we have typed into D to the output variable list, in C.

G: The OK button. You will not be able to press this until you have pressed button E, and set old and new values.

Step 2: Tell SPSS the old and new values. The dialog box to do this is shown in Figure 8.

Figure 8

A: Here we say the value of the variable we want to recode. 

B: Here we say the value we want to have in the new variable. 

C: When we have filled in both A and B, click the Add button to add the recode to the list. 

D: Here we have the list of transformations that are to take place. In our example, when we are coding the variable group_1, this should say:

0 --> 0
1 --> 1
2 --> 0

When we are recoding the variable group_2, this should say:

0 --> 0
1 --> 0
2 --> 1

Note that we need to do two “runs” through this procedure to create both of the dummy variables.

E: Continue. Click this when you have finished.
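The two recode runs can be sketched outside SPSS as follows. This Python fragment is only an illustration of the old-value to new-value mappings for the group example; the helper name recode and the sample data are hypothetical.

```python
def recode(values, mapping):
    """Apply an old-value -> new-value mapping, as in one recode run."""
    return [mapping[v] for v in values]


# Hypothetical group codes: 0 = control, 1 = group 1, 2 = group 2.
group = [0, 1, 2, 1, 0, 2]

# One run per dummy variable, mirroring the two passes through the dialog.
group_1 = recode(group, {0: 0, 1: 1, 2: 0})
group_2 = recode(group, {0: 0, 1: 0, 2: 1})

for g, g1, g2 in zip(group, group_1, group_2):
    print(g, g1, g2)
```

People in the control group score 0 on both new variables, which is why no third dummy variable is needed.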

Logistic Regression

The logistic regression procedure is more sophisticated than the linear regression procedure, and automates some of the tasks that we had to do to prepare data for categorical independent variables (see Chapter 3) and interactions (see Chapter 7).
To select the logistic regression dialog, choose the Analyze menu (Statistics in some versions of SPSS). Then choose the Regression item, and then Binary Logistic (just Logistic in some versions).

The Logistic Regression Dialog Box

Figure 9: The Logistic Regression Dialog Box

A: The variable list. 

B: The button to move variables to and from the variable list.

C: This is an interaction button. It allows us to specify an interaction effect from this dialog box without going through the steps we describe in Chapter 7. To use this, select the two variables you are interested in, and rather than using the usual button to move them, use this button.

D: The dependent variable. This must have only two values.

E: The covariates list. This is the word that SPSS uses for independent variables in logistic regression.

F: The entry method. See Chapter 2 for a description of these methods. (And we repeat everything we said about stepwise methods for linear regression models.)

G: Here we can specify that variables are categorical. This saves us going through the procedure described in Chapter 3, to create dummy variables. See below.

H: The save button. This allows us to save some information back to the dataset. See below.

I: Options. See below.

The Categorical Variable Dialog Box

Figure 10

A: The variable list. 

B: The list of categorical variables. If you have any categorical independent variables, you can add them to this list. This saves going through the procedure that we described in Chapter 3, to create dummy variables.

C: Here the type of coding is specified. These were described in Chapter 3. First you select the reference category (first or last), and then the type of coding you would like to use. The two main choices are Indicator coding, and what SPSS calls Simple coding, which we described as Dummy coding in Chapter 3.
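To see what the choice of reference category does, the following Python sketch builds 0/1 indicator columns with either the first or the last category as the reference. It is an illustration only: the function name is hypothetical, and it assumes that categories are ordered by their numeric value.

```python
def indicator_code(values, reference="last"):
    """Return one 0/1 column per non-reference category.

    The reference category (the lowest or highest value, depending on
    `reference`) gets no column and scores 0 on every column.
    """
    categories = sorted(set(values))
    if reference == "first":
        ref = categories[0]
    elif reference == "last":
        ref = categories[-1]
    else:
        raise ValueError("reference must be 'first' or 'last'")
    kept = [c for c in categories if c != ref]
    return {c: [1 if v == c else 0 for v in values] for c in kept}


# Hypothetical group codes: 0 = control, 1 = group 1, 2 = group 2.
group = [0, 1, 2, 1]
print(indicator_code(group, reference="first"))
print(indicator_code(group, reference="last"))
```

With the group example, choosing the first category as the reference makes the control group (coded 0) the comparison group, so check which option is selected before interpreting the coefficients.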

Save Dialog Box

The options in this box are very similar to the options that were available in the save dialog box in Linear Regression. Most of the diagnostic checks that are available in logistic regression are very similar to the checks that were available in linear regression, which we dealt with in Chapter 4. We did not consider them specifically in terms of logistic regression, though, and the reader is directed towards Menard (1995).

Figure 11

A: Predicted values. These are in terms of either probability of group membership, or predicted group membership, based on a cut-off of 0.5.

B: Influence statistics. These are the equivalent of the influence statistics we encountered for linear regression, in Chapter 4.

C: Residuals. Again these are similar to the residuals we encountered in Chapter 4.
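The two kinds of predicted value can be illustrated with a short Python sketch. It uses the standard logistic function to turn a regression equation into a probability, and then applies a cut-off to give predicted group membership. The coefficients and function names are hypothetical.

```python
import math


def predicted_probability(constant, coefficients, values):
    """Predicted probability from a logistic regression equation."""
    logit = constant + sum(b * x for b, x in zip(coefficients, values))
    return 1 / (1 + math.exp(-logit))


def predicted_group(probability, cutoff=0.5):
    """Predicted group membership: 1 if the probability exceeds the cut-off."""
    return 1 if probability > cutoff else 0


# Hypothetical equation: logit = -2.0 + 0.5 * books + 0.1 * attend
for books, attend in [(0, 5), (2, 10), (4, 20)]:
    p = predicted_probability(-2.0, [0.5, 0.1], [books, attend])
    print(books, attend, round(p, 3), predicted_group(p))
```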

Options Dialog Box

Figure 12

A: A series of diagnostic charts and indices are presented within this section. Most of them are beyond the scope of this book.

B: The probabilities for stepwise entry and removal.

C: The 95% CI for B. This is calculated using the SE of B, as in linear regression.

D: The classification cut-off. This is used to determine predicted group membership. If the probability of a person being in a group is greater than the cut-off (0.5 unless you change it), the analysis predicts that they will be in that group.

E: The maximum number of iterations that are allowed. On some occasions you may find that the logistic regression does not converge, and therefore you may need to increase this value.
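As a rough guide to how a confidence interval is built from B and its standard error, the Python sketch below computes a normal-theory interval (B plus or minus about 1.96 standard errors for 95%). The use of the normal distribution is an assumption made for this illustration, as are the function name and the example values; check your output to see whether the limits are reported for B or for its exponent.

```python
import math
from statistics import NormalDist


def wald_interval(b, se, level=0.95):
    """Normal-theory confidence interval for a coefficient b with standard error se."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return b - z * se, b + z * se


# Hypothetical coefficient and standard error.
lower, upper = wald_interval(b=0.80, se=0.25)
print(round(lower, 3), round(upper, 3))

# The same limits on the odds-ratio scale.
print(round(math.exp(lower), 3), round(math.exp(upper), 3))
```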

© Jeremy Miles and Mark Shevlin, 2000