Learner
Posts: 1

# Factor analysis + multiple regression

Hi!

I need to solve the following question:

Do anthropometric and fitness variables predict the difference between being in the top quartile of improvement in jumping height between year 1 and year 6 or not being in the top quartile?

Reduce the number of anthropometric and fitness variables before doing the analysis.

I have been given a dataset with different variables (see attachment).

I was wondering which steps I need to take to solve this problem. I was thinking of running a factor analysis first, followed by a multiple regression, but I am not sure if this is correct or if I need to insert more steps.

Thanks in advance if anyone could help me!

I am using version 7.1 of SAS Enterprise Guide.

Posts: 3,288

## Re: Factor analysis + multiple regression

Lore wrote:

Hi!

I need to solve the following question:

Do anthropometric and fitness variables predict the difference between being in the top quartile of improvement in jumping height between year 1 and year 6 or not being in the top quartile?

Reduce the number of anthropometric and fitness variables before doing the analysis.

I consider all variable reduction strategies to be flawed in the presence of correlated x-variables. They will be misled by the correlation between the x-variables, often to the point where you get the wrong sign on an estimated slope, and the variances of the estimates will likely be HUGE.
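
To see how strongly the x-variables are correlated in a given dataset, something like the sketch below could be used. It is only an illustration: the attachment is not shown here, so the dataset name `work.jump` and every variable name (`jump_y1`, `jump_y6`, `height`, `weight`, `bmi`, `sprint_30m`, `sit_ups`) are assumptions to be replaced with the real ones.

```sas
/* All dataset and variable names are hypothetical placeholders. */
data work.jump_gain;
   set work.jump;
   jump_gain = jump_y6 - jump_y1;   /* improvement from year 1 to year 6 */
run;

/* Pairwise correlations among the x-variables */
proc corr data=work.jump_gain;
   var height weight bmi sprint_30m sit_ups;
run;

/* VIF requests variance inflation factors; COLLIN requests collinearity diagnostics */
proc reg data=work.jump_gain;
   model jump_gain = height weight bmi sprint_30m sit_ups / vif collin;
run;
quit;
```

Large variance inflation factors indicate that the variances of the slope estimates are being inflated by correlation among the x-variables.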

I was wondering which steps I need to take to solve this problem. I was thinking of running a factor analysis first, followed by a multiple regression, but I am not sure if this is correct or if I need to insert more steps.

Factor analysis will choose variables (or, actually, create new factors from your existing variables) without regard to the response variable. Because it does not use the response variable at all, you could get factors that are poor predictors. So that's two reasons why I consider this approach (in this situation) to be fatally flawed: the correlation problem explained above, and the fact that the factors are built without looking at the response.

So what do you do? You use a method that is less sensitive to the correlations between the x-variables, and finds variables (or factors) that are good predictors. What method is that? It's called Partial Least Squares and in SAS it is PROC PLS. Of course, this requires a change of mind-set that allows you to keep all of your x-variables in the model (that's how PLS works) and so the whole idea (and effort associated with carrying out the idea) of reducing the number of variables goes away.
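
As a rough sketch of what that could look like, and not a definitive analysis: the code below builds a top-quartile indicator with PROC RANK and then fits PROC PLS with leave-one-out cross validation. The dataset name `work.jump` and all variable names are assumptions, since the attachment is not shown. Note also that PROC PLS treats the 0/1 indicator as a numeric response.

```sas
/* All dataset and variable names are hypothetical placeholders. */
data work.jump2;
   set work.jump;
   jump_gain = jump_y6 - jump_y1;   /* improvement from year 1 to year 6 */
run;

/* GROUPS=4 assigns quartile ranks 0 to 3; rank 3 is the top quartile */
proc rank data=work.jump2 out=work.jump_q groups=4;
   var jump_gain;
   ranks gain_q;
run;

data work.jump_q;
   set work.jump_q;
   /* 1 = top quartile, 0 = otherwise; keep missing values missing */
   if missing(gain_q) then top_quartile = .;
   else top_quartile = (gain_q = 3);
run;

/* CV=ONE is leave-one-out cross validation; CVTEST tests whether
   fewer factors predict about as well as the minimum-PRESS model */
proc pls data=work.jump_q method=pls cv=one cvtest(seed=12345);
   model top_quartile = height weight bmi sprint_30m sit_ups;
run;
```

All of the x-variables stay in the MODEL statement; the cross validation is only used to decide how many PLS factors to extract.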

--
Paige Miller