Friday, January 26, 2007

Canonical Correlation

Catherine Day sent an email to the psych-postgrads list, which said:
Sorry to bombard you with yet another statistical problem but I'm desperate! I have 2 sets of latent variables (1 set measuring taste preference and another set measuring personality dimensions).
I'm looking at the relationship between taste preference and personality and understand that canonical correlation is the appropriate analysis. The problem is nobody at Sheffield Hallam has performed one before and we do not have the add-on package for SPSS that does this.
Does anyone know how to do this type of analysis or can point me in the direction of some training?
I replied:

Canonical correlation is one possibility. It's kind of atheoretical though - that is, if you know what the latent variables are, you probably shouldn't use it. You might want to use structural equation modelling or partial least squares analysis instead (although both of those are pretty fiddly, and you shouldn't use them unless you really, really have to).

Just to check: are you modelling at the item level? If you are, I'd consider summing to the scales (maybe factor analysing first) and then doing regression.
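To make "sum to the scales, then do regression" concrete, here is a minimal sketch in Python (not SPSS). The item data, the scale names (extraversion, sweet_pref) and the helper functions are all made up for illustration; it only shows the shape of the idea for a single predictor and a single outcome.

```python
from statistics import mean

def scale_scores(item_rows):
    """Sum each respondent's item responses into one scale score."""
    return [sum(row) for row in item_rows]

def simple_regression(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical data: one row per respondent, one column per item
extraversion_items = [[4, 5, 3], [2, 1, 2], [5, 4, 4], [3, 3, 2], [1, 2, 1]]
sweet_items = [[5, 4], [2, 3], [4, 5], [3, 3], [1, 2]]

extraversion = scale_scores(extraversion_items)  # predictor scale
sweet_pref = scale_scores(sweet_items)           # outcome scale

slope, intercept = simple_regression(extraversion, sweet_pref)
print(f"sweet_pref = {intercept:.2f} + {slope:.2f} * extraversion")
```

With several personality scales you'd want multiple regression rather than this one-predictor version, but the summing step is the same.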

There's a book on canonical correlation, in the Sage Little Green Books series. I think that Tabachnick and Fidell's book 'Using multivariate statistics' covers it as well.

Finally, you have got the add-on, you just don't realise it. It's an SPSS syntax file, called 'canonical correlation.sps', and you'll find it in the SPSS folder (c:\program files\SPSS, if you're using Windows with a locally installed version).

To use it, you type syntax (into the syntax editor) like this, changing the path to wherever the file is on your machine:

INCLUDE 'c:\Program Files\SPSS11\Canonical correlation.sps'.
CANCORR set1 = y1, y2, y3/
set2 = x1, x2/.

Where x1 and x2 are predictors, and y1, y2 and y3 are outcomes (just keep going until you've got them all). (I think I've got my x and y the right way around - I don't have SPSS here, so I'm guessing a bit.)
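If you want to see what a canonical correlation actually is, here is a rough sketch in plain Python. It is not what the SPSS macro does internally, and the data are invented; it just uses the textbook result that the squared canonical correlations are the eigenvalues of S11^-1 S12 S22^-1 S21 (S11 and S22 being the covariance matrices within each set, S12 and S21 the covariances between them), and finds only the largest one, by power iteration. The function names are my own.

```python
from math import sqrt

def cov_matrix(a_cols, b_cols):
    """Cross-covariance matrix between two lists of variables (columns)."""
    n = len(a_cols[0])
    def centred(col):
        m = sum(col) / n
        return [v - m for v in col]
    a = [centred(c) for c in a_cols]
    b = [centred(c) for c in b_cols]
    return [[sum(u * v for u, v in zip(ca, cb)) / (n - 1) for cb in b]
            for ca in a]

def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def first_canonical_correlation(set1, set2, iterations=500):
    """Largest canonical correlation between two sets of variables."""
    p, q = len(set1), len(set2)
    a = solve(cov_matrix(set1, set1), cov_matrix(set1, set2))  # S11^-1 S12
    b = solve(cov_matrix(set2, set2), cov_matrix(set2, set1))  # S22^-1 S21
    m = [[sum(a[i][k] * b[k][j] for k in range(q)) for j in range(p)]
         for i in range(p)]
    # Power iteration: the dominant eigenvalue of m is the squared correlation
    v = [1.0 / sqrt(p)] * p
    eig = 0.0
    for _ in range(iterations):
        w = [sum(m[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sqrt(sum(x * x for x in w))
        if norm == 0:
            return 0.0
        v = [x / norm for x in w]
        eig = norm
    return sqrt(min(eig, 1.0))  # clamp tiny rounding overshoot above 1

# Hypothetical data: 8 respondents, two predictors and three outcomes
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [3, 1, 4, 1, 5, 9, 2, 6]
y1 = [2, 3, 3, 5, 4, 7, 6, 8]
y2 = [5, 3, 6, 2, 7, 8, 3, 6]
y3 = [1, 0, 1, 1, 0, 1, 0, 0]

rho = first_canonical_correlation([y1, y2, y3], [x1, x2])
print(f"first canonical correlation: {rho:.3f}")
```

With one variable in each set this reduces to the absolute value of the ordinary Pearson correlation, which is a handy sanity check. For real work you'd use the SPSS macro (or a proper linear algebra library), which also gives you the remaining canonical correlations and the significance tests.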

Sunday, January 07, 2007

SEM on the cheap

Helen M sent a message to the psych-postgrads list, because she didn't have the AMOS package for SEM. And it ain't cheap.

I sent the following reply:

First, there is a program called Mx, which is free, and available from http://www.vcu.edu/mx/ - it was designed for twin studies, but can do any kind of structural equation modelling. It has a path diagram input, but it needs tweaking in syntax, which is a little fiddly - until you get used to it.

Second, there is a program called R, which is also free - it's available from www.r-project.org. R can read an SPSS file, and can do everything you'll ever want to do. R comes in packages, and one of the packages is called sem (note - lower case); it's written by John Fox, and can do SEM (upper case).
It's syntax based, but it's pretty easy - you write y <- x to regress y on x, and y <-> x to correlate y with x. R will download the sem package itself, if you ask it nicely, but the sem home page is here: http://socserv.mcmaster.ca/jfox/Misc/sem/index.html . Also, if you use R, you can make life easier for yourself with a program called JGR, pronounced JaGuaR - which you can find here: http://rosuda.org/JGR/ (JGR stands for Java GUI for R (hmmm... I'll explain an acronym with three more abbreviations)).

Third, precisely for people in your situation, you can 'rent' a copy
of LISREL, which will do SEM for you - LISREL has something of a bad
reputation, because it was the first package out there, and so people
remember that, but it's much easier nowadays. It costs $65 feeble
American dollars, which works out at about £33 nowadays - or for a
year, it's $100. http://estore.e-academy.com/index.cfm?loc=estore/soft_browse/soft_display_product&ID_Product=525

Fourth, if you've got access to Stata, there's a free Stata add-on
called gllamm (generalized linear latent and mixed models), which is
available from www.gllamm.org. It can do SEM, but it doesn't seem that
straightforward to me (which means I tried for 5 minutes and gave up).

Fifth, if you're doing something very straightforward, and you're not
bad at Excel, you can do SEM using Excel - although if you want to do
anything more complicated than a one factor CFA, you're in trouble.
And even if you only want to do that, you're in a bit of trouble.
I'm only saying it 'cos I wrote a paper that tells you how to do it.
And no one ever does it. Not even me.

Also, you're going to want some help when you make trivial mistakes
which will drive you insane to spot (comma in the wrong place - things
like that). You can email this list, which might help, or the
psych-methods list (on jiscmail as well), or semnet - although on
semnet you will immediately get bogged down with arguments about
arcane details of your models. If you use sem, you can ask the r-users
list (you'll find stuff on the R web site about that - you get an
awful lot of email on it though).

I'm assuming you've got a book, so I won't go into that now.