Predicting Values of Principal Components obtained from PROC PRINCOMP

New Contributor
Posts: 3

Hi there! I hope you can help me out with my problem. This is for my project in school.

 

After conducting a survey, I performed Principal Component Analysis on the variables (survey questions) to reduce their count. I used PROC PRINCOMP to obtain the principal components. Before I can use the principal components I chose to retain in logistic regression, I need to predict their values first. I tried using PROC SCORE but somehow I could not make it work.

 

It worked with STATA when I used the command "predict <principal components>, score". I wonder if there is a similar statement or function in SAS to produce the same output?

 

Thank you very much!

 

This is my code so far, for your reference.

 

/*----------for PCA---------*/

PROC PRINCOMP DATA=Data OUT=prin;
 VAR v1 v2 v3 v4 v5 v6 v7 v8 v9 v10;
RUN;

 

/*----this part does not work (for predicting)-----*/

PROC SCORE DATA=Data SCORE=print OUT=newdata PREDICT;
VAR prin1 prin2;
RUN;

Super User
Posts: 18,549

Re: Predicting Values of Principal Components obtained from PROC PRINCOMP

I think you need the OUTSTAT= data set, not the OUT= data set?

 

See the example in the documentation. It uses PROC FACTOR, but the method should be similar.

 

http://documentation.sas.com/?docsetId=statug&docsetVersion=14.2&docsetTarget=statug_score_examples0...

New Contributor
Posts: 3

Re: Predicting Values of Principal Components obtained from PROC PRINCOMP

Hi Reeza! Thanks for your response!

 

I followed the example from the link you gave me. This is my code:

 

PROC FACTOR DATA=Data OUTSTAT=fact METHOD=PRIN EIGENVECTORS SCORE;
VAR v1 v2 v3 v4 v5 v6 v7 v8 v9 v10;
RUN;

 

PROC SCORE DATA=DATA SCORE=fact OUT=newdata PREDICT;
VAR v1 v2 v3 v4 v5 v6 v7 v8 v9 v10;
RUN;

 

However, what I got was the predicted values of the factors, not the principal components. Do you have any other suggestions?

Trusted Advisor
Posts: 1,670

Re: Predicting Values of Principal Components obtained from PROC PRINCOMP

Using PROC PRINCOMP:

 

Principal component scores (is that what you are referring to?) are in the OUT= data set

 

Principal component loadings (vectors) are in the OUTSTAT= data set
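
A minimal sketch of that distinction, not the original poster's code. It assumes `sashelp.heart` as a stand-in for the survey data, and the data set names `pc_scores` and `pc_stats` are hypothetical. The `N=2` option keeps only the first two components.

```sas
/* Assumption: sashelp.heart stands in for the survey data. */
proc princomp data=sashelp.heart
              n=2                 /* keep only the first two components      */
              out=pc_scores       /* input rows plus Prin1 and Prin2 scores  */
              outstat=pc_stats;   /* means, std devs, eigenvalues, vectors   */
   var ageatstart height weight diastolic systolic;
run;

/* Scores: one value per observation and component */
proc print data=pc_scores(obs=5);
   var Prin1 Prin2;
run;

/* Eigenvectors: the _TYPE_='SCORE' rows, one column per original variable */
proc print data=pc_stats;
   where _type_ = 'SCORE';
run;
```

If the scores are all that is needed, the `Prin1` and `Prin2` columns in the OUT= data set can be used directly in a later procedure.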

 

Doing PCA to get scores for a logistic regression is not something I would recommend. The PCA scores are computed ignoring the dependent variable, so there is no reason to expect them to be good predictors. I would recommend Partial Least Squares regression (PROC PLS) in your case: the dimensions/scores it uses are chosen to be predictive of your Y variable(s), a property that PCA cannot claim. A logistic version of PLS has also been developed; see https://cedric.cnam.fr/fichiers/RC906.pdf
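
A rough sketch of what a PROC PLS call could look like, under several assumptions: `sashelp.heart` stands in for the survey data, the binary outcome is recoded to 0/1 because PROC PLS works with numeric responses, and the names `heart01`, `dead`, `pls_out`, `xscr`, and `p_dead` are hypothetical. This is ordinary PLS on a 0/1 response, not the logistic PLS described in the linked paper.

```sas
/* Assumption: sashelp.heart stands in for the survey data. */
data heart01;
   set sashelp.heart;
   dead = (status = 'Dead');   /* 1 = Dead, 0 = Alive */
run;

/* Extract two PLS factors chosen to be predictive of the response */
proc pls data=heart01 method=pls nfac=2;
   model dead = ageatstart height weight diastolic systolic;
   /* xscr1, xscr2 = X-scores per observation; p_dead = predicted response */
   output out=pls_out xscore=xscr predicted=p_dead;
run;
```

The number of factors (`NFAC=2`) is arbitrary here; in practice it would usually be chosen by cross validation.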

New Contributor
Posts: 3

Re: Predicting Values of Principal Components obtained from PROC PRINCOMP

Hi PaigeMiller! Thank you for your response! And thanks for your input regarding the different method. However, I am still unable to predict the values for the principal components (the ones in the eigenvector table) I chose to retain.

Trusted Advisor
Posts: 1,670

Re: Predicting Values of Principal Components obtained from PROC PRINCOMP

We are probably using different terminology to describe the value you want, so I do not understand what you mean by predicted values from PCA.

 

Principal components analysis produces scores in each dimension (one for each observation), and it produces loadings (also called eigenvectors) in each dimension, one for each original variable.
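
To make the relationship between the two concrete, here is the score formula as a sketch. It assumes the PROC PRINCOMP default of analyzing the correlation matrix, so each variable is centered and scaled before being weighted. The symbols are introduced here for illustration: x_{ij} is the value of variable j for observation i, \bar{x}_j and s_j are that variable's mean and standard deviation, e_{jk} is the entry for variable j in eigenvector k, and p is the number of variables.

```latex
\[
  \mathrm{Prin}_{ik} \;=\; \sum_{j=1}^{p} e_{jk}\,\frac{x_{ij} - \bar{x}_j}{s_j}
\]
```

Each observation gets one such value per retained component, which is what the scores are.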

 

So can you tell me in your own words what this predicted value is that you are looking for? Predicted value of WHAT? (For example, in an ordinary least squares regression, you can obtain predicted values for the y-variables, and you can also obtain estimates of the slope and intercept. But since there are no y-variables in PCA, I am still not sure what is being predicted.)

Super User
Posts: 18,549

Re: Predicting Values of Principal Components obtained from PROC PRINCOMP

You have the principal components from PROC FACTOR. 

 

You have the new data scored with the principal components from PROC SCORE. 

 

Take the time to spell out EXACTLY what you want because whatever you're trying to accomplish is unclear. 

 

 

Trusted Advisor
Posts: 1,670

Re: Predicting Values of Principal Components obtained from PROC PRINCOMP

Agreeing with @Reeza: we don't understand what you are trying to do, what numbers you are trying to compute, or what the phrase "predicted values" means in the context of principal components analysis. Much more detail about what you want to do is critical here.

Trusted Advisor
Posts: 1,221

Re: Predicting Values of Principal Components obtained from PROC PRINCOMP

Hi,

 

To get the predicted components, you need the eigenvectors from PROC PRINCOMP (the OUTSTAT= data set), which are then used by PROC SCORE. Please try this.

 

PROC PRINCOMP DATA=sashelp.heart OUTSTAT=eigenvector out=pc;
VAR ageatstart height weight diastolic systolic;
RUN;

/*----score the data with the eigenvectors saved in OUTSTAT=-----*/

PROC SCORE DATA=sashelp.heart SCORE=eigenvector OUT=newdata PREDICT;
VAR ageatstart height weight diastolic systolic;
RUN;
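
One way to check the result, sketched under the assumption that both steps above ran and produced `pc` (from OUT=) and `newdata` (from PROC SCORE): compare the first two component columns in the two data sets. The tolerance value is an arbitrary choice to allow for floating-point differences.

```sas
/* Compare the scores written by PROC PRINCOMP (pc) with the ones
   computed by PROC SCORE (newdata); both should contain Prin1, Prin2. */
proc compare base=pc compare=newdata
             method=absolute criterion=1e-8;   /* assumed tolerance */
   var Prin1 Prin2;
run;
```

If the comparison reports no unequal values, the two routes give the same scores, and either data set can feed the later logistic regression.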
