ksmielitz
Quartz | Level 8

Okay, so I ran a proc factor and it turns out I have 2 factors...fine, great.

 

I then ran the proc score to create scores so that I could use them in my regression statements...one of the factors I could actually handle separately, but my SES variable has 2 continuous variables (hhincome, ageofmom), 1 binary (banked), and 1 categorical (highest)...and I need to use the SES factor as a predictor of my DVs.

 

So the code that works (including the proc factor that determined I had 2 factors, not just the one I thought):

 

proc factor rotate=varimax ev scree min=1;
var welfare ssi foodstamps banked ageofmom hhincome highest;
run;

proc factor data=nlsyproject outstat=Factout
		method=prin rotate=varimax score;
	var  banked ageofmom hhincome highest;
title 'SESFactor Scoring';
run;

proc score data=nlsyproject score=factout out=Fscore;
	var banked ageofmom hhincome highest;
	run;

proc factor data=nlsyproject outstat=Factout
		method=prin rotate=varimax score;
	var welfare foodstamps ssi;
title 'GovtAssist Scoring';
run;

proc score data=nlsyproject score=factout out=Fscore;
	var welfare foodstamps ssi;
	run;

So, how do I use 'SESFactor' as a predictor in my regression statements?

I need to have some way to store the result and label it so I have some conciseness.

 

The GovtAssist factor I figure I can create a 0-3 scale for range of use 0=no assistance through 3=all types and then keep plugging along, but I have NO idea how to handle SES if not for this factor scoring thing that I can't seem to save the resulting score for use in the model.
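The 0-3 scale described above could be built with a short DATA step. This is only a sketch: it assumes welfare, foodstamps, and ssi are each coded 0/1 (the thread does not state the coding), and the data set name nlsy_scaled and the variable name govtassist are made up for illustration.

```sas
/* Sketch only: assumes welfare, foodstamps and ssi are 0/1 indicators. */
data nlsy_scaled;                     /* hypothetical output data set name */
    set nlsyproject;
    /* 0 = no assistance ... 3 = all three types.                        */
    /* The + operator leaves govtassist missing if any input is missing. */
    govtassist = welfare + foodstamps + ssi;
run;

/* Check that the new variable only takes the values 0 through 3. */
proc freq data=nlsy_scaled;
    tables govtassist / missing;
run;
```

If the three variables are coded some other way (for example 1/2), they would need recoding before being added together.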

 

Thanks in advance for any help.

Accepted Solutions
rayIII
SAS Employee

Proc Score creates the factor columns for you, but it is up to you to provide nicer names like SES. As mentioned earlier, a single run of FACTOR and SCORE should suffice. 

 

Here's a really simple example. You will need to modify it to use your own variables, factor names, and MODEL statement. My code is just for illustration.

 

proc factor data=sashelp.iris outstat=Factout nfactors=2
	method=prin rotate=varimax score;
    var petallength petalwidth sepallength sepalwidth;
run;


proc score data=sashelp.iris score=factout out=Fscore (rename=(factor1=someNewNameLikeSES factor2=someOtherNewName);
	var petallength petalwidth sepallength sepalwidth;
run;

 
proc reg data = Fscore ;
	model someNewNameLikeSES = someOtherNewName; 
run; 

stat_sas
Ammonite | Level 13

Hi,

 

Why are you running two separate factor analyses? Can't it be done in a single run?

ksmielitz
Quartz | Level 8

@stat_sas Yes, it can be done in one factor analysis, but that confused me even more. I need to ensure that I have two separate predictors defined (govtassist and SES). I don't really care how they're titled, but I can't figure out how to run the variables that construct my SES factor as one unit. Is this possible? I figured out a way to make govtassist by redefining my variable. But I don't know how to combine 3 continuous and 1 binary variable and label the resulting factor...to then use that factor as a predictor in my regressions/other statistical tests.

 

I know that hhincome, ageofmom, banked, and highest construct the SES variable but if I run:

 

proc reg;

model behavior=hhincome ageofmom banked highest;

run;

 

is it recognizing those 4 variables as my SES factor? And then when I add in male (to control for gender) black hispanic (compared to whites) will my results be different if I have a defined SES factor than if I have those 4 variables separately listed? 

 

Ack!

 

Thanks in advance for any additional thoughts and/or feedback.

K8

stat_sas
Ammonite | Level 13

Hi,

 

 

What do you mean by

 

"how to combine 3 continuous and 1 binary variable and label the resulting factor"?

 

Are hhincome ageofmom banked highest uncorrelated?

 

ksmielitz
Quartz | Level 8

@stat_sas You'll have to forgive me because I may not be explaining this correctly.

 

Yes, hhincome ageofmom banked and highest are correlated.

 

My concern is whether or not I have to combine them into one specifically defined factor variable to represent all of them in the regression.

 

So, I could run 

 

proc reg;

model behavior= hhincome ageofmom banked highest

 

OR (if possible)

 

proc reg;

model behavior= SES (which encompasses the 4 listed above).

 

Is there a way to do that or am I just over-complicating this?

SteveDenham
Jade | Level 19

Take a look at PROC PLS--an excellent way of modeling multiple responses based on multiple predictive variables.  Just a different way of trying to solve a similar problem.

 

Steve Denham
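
For reference, a PROC PLS call on the variables named in this thread might look roughly like the sketch below. The choice of nfac=2 is arbitrary and only for illustration, and highest is entered as a plain numeric predictor here, as it is in the PROC REG examples above; neither choice comes from the original posts.

```sas
/* Sketch only: partial least squares with the thread's variables.      */
/* nfac=2 is an arbitrary illustrative choice, not a recommendation.    */
proc pls data=nlsyproject method=pls nfac=2;
    model behavior = hhincome ageofmom banked highest
                     welfare foodstamps ssi;
run;
```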

ksmielitz
Quartz | Level 8

I'll take a look, @SteveDenham. Thanks!

ksmielitz
Quartz | Level 8

@rayIII Okay, I'm feeling good about this, but it won't let me rename them. When I type in "rename" like you did, it doesn't turn blue.

 

And am I reading your last section of code correctly that I would set my factors equal to one another in the regression statement? Why would I do that?

 

Thanks, K8

ksmielitz
Quartz | Level 8

@rayIII

 

I GOT IT!! I GOT IT!! I GOT IT!!!!

 

I used the factors to predict my DV and it worked!!!!

 

THANK YOU!!!

rayIII
SAS Employee

Glad you got it! Sorry, I left off a parenthesis. It should have been:

 

proc score data=sashelp.iris score=factout out=Fscore (rename=(factor1=someNewNameLikeSES factor2=someOtherNewName));
var petallength petalwidth sepallength sepalwidth;
run;

 

Also, my proc reg call was just an example to show how to use scored data in a regression analysis. You would use your own Model statement like:

 

model behavior =someNewNameLikeSES someOtherNewName; 
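
Putting the pieces together for the variables in the original question, a single run might look like the sketch below. It reuses the data set and variable names from the thread (nlsyproject, behavior, and the seven input variables). The new names SES and GovtAssist are illustrative, and the assumption that Factor1 is the SES factor and Factor2 the assistance factor is only a guess: check the rotated factor pattern in the PROC FACTOR output before choosing which name goes with which factor.

```sas
/* Sketch only: one factor analysis on all seven variables, two factors kept. */
proc factor data=nlsyproject outstat=Factout nfactors=2
        method=prin rotate=varimax score;
    var welfare ssi foodstamps banked ageofmom hhincome highest;
run;

/* Score every observation and rename the generic Factor1/Factor2 columns. */
/* Which factor is "SES" is an assumption; confirm it from the loadings.   */
proc score data=nlsyproject score=Factout
        out=Fscore (rename=(Factor1=SES Factor2=GovtAssist));
    var welfare ssi foodstamps banked ageofmom hhincome highest;
run;

/* Use the saved factor scores as predictors of the dependent variable. */
proc reg data=Fscore;
    model behavior = SES GovtAssist;
run;
quit;
```

Because OUT= keeps all of the original columns alongside the scores, controls such as male, black, and hispanic can be added to the same MODEL statement.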

stat_sas
Ammonite | Level 13

Hi,

 

If the variables are highly correlated, then the first step after running the factor analysis is to look at the eigenvalues, which represent
the variation explained by each factor. As an example, if the first eigenvalue explains 80% of the variation, then the first factor
score would be sufficient for subsequent analysis, and it can also be used as a representative of the four variables.

 

On the other hand, if the top two eigenvalues together explain most of the variation in the original variables, then you have to
use two factor scores in the regression analysis. Lastly, running separate factor analyses may produce factor scores that are
correlated with each other, which can introduce multicollinearity and destabilize the parameter estimates.
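
One way to check the last point is to save the scores from the two separate analyses under different names and correlate them. The sketch below assumes one factor is kept from each analysis. The data set names SesStat, GovStat, and Step1 and the score names SES and GovtAssist are made up for illustration. ROTATE= is left out because rotating a single factor has no effect.

```sas
/* Sketch only: score each variable group separately, then correlate. */
proc factor data=nlsyproject outstat=SesStat nfactors=1 method=prin score;
    var banked ageofmom hhincome highest;
run;

proc score data=nlsyproject score=SesStat out=Step1 (rename=(Factor1=SES));
    var banked ageofmom hhincome highest;
run;

proc factor data=nlsyproject outstat=GovStat nfactors=1 method=prin score;
    var welfare foodstamps ssi;
run;

/* Score the output of the first PROC SCORE so both scores end up together. */
proc score data=Step1 score=GovStat out=Fscore (rename=(Factor1=GovtAssist));
    var welfare foodstamps ssi;
run;

/* A large correlation here is the warning sign described above. */
proc corr data=Fscore;
    var SES GovtAssist;
run;
```

The eigenvalue table that PROC FACTOR prints for each run shows the proportion and cumulative proportion of variation explained, which is what the first two paragraphs refer to.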
