marianaalcos
Quartz | Level 8

Hi there,

 

I'm new to SAS and I'm having a hard time figuring out simple tasks, such as the one below. Any help would be very much appreciated!

 

I have a data set with 15 variables plus a response, to which I applied PCA. I then took the first 4 PCs and fit a linear regression model on them. I'm struggling to transform the PC coefficients from the regression model (the "betas" from linear regression) into coefficients for the original variables, i.e., I need to multiply the matrix of the first 4 eigenvectors (a 15x4 matrix) by the column vector (4x1) of "betas".

Here's my attempt:

 

*running PCA;

proc princomp data=HMW.CRIME prefix=PC plots(only)=(scree)
		out=WORK.Princomp_scores outstat=WORK.Princomp_stats;
	var M So Ed Po1 Po2 LF M_F Pop NW U1 U2 Wealth Ineq Prob Time;
	title 'Principal Components Analysis';
run;
title;
		
*Build linear regression model with the first 4 principal components;

proc reg data=WORK.PRINCOMP_SCORES alpha=0.05 plots(only)=(diagnostics 
		observedbypredicted) outest=estimates;
	model Crime=PC1 PC2 PC3 PC4 /;
	output out=WORK.Reg_stats p=p_;
	title 'Regression Model';
	title2 'Using the first 4 PCs';
	run;
quit;
title;

DATA estimates (KEEP = PC1 PC2 PC3 PC4); *getting the betas from regression with 4PCs;
 SET estimates;
RUN;

DATA eigenvectors (DROP = _TYPE_ _NAME_);
 SET Princomp_stats (WHERE= (_TYPE_="SCORE")); *getting eigenvectors;
RUN;

From the above, I was able to "isolate" the betas and the complete eigenvector matrix by creating two new datasets, but I don't know how to proceed from here.

I thought about merging the two datasets (estimates and eigenvectors) and then performing the multiplication, but I wonder if it can be simpler than this.

 

In R, for instance, I'd just do: 

newbetas <- pca$rotation[,1:4] %*% pc_betas

 

Thanks!

 

 

 

UPDATE___________________

 

I was able to solve this using IML as follows:

proc iml; /* use this procedure for matrix operations */
/* read the eigenvectors: one row per PC, one column per original variable */
use work.eigenvectors;
read all var _ALL_ into eigen_matrix[colname=varNames]; /* varNames receives the column names */
close work.eigenvectors;
eigen_matrix = eigen_matrix[1:4, ]; /* keep only the PC1-PC4 rows (4x15) */
print eigen_matrix;
/* read the regression betas for PC1-PC4 (1x4 row vector) */
use work.estimates;
read all var _ALL_ into betas;
print betas;
close work.estimates;
/* (15x4) * (4x1) = 15x1; note the transpose operator ` */
alphas = eigen_matrix` * betas`;
print alphas[rowname=varNames];
quit;

 

Hope this helps someone in the future since it took me a while!

8 REPLIES
Reeza
Super User

Look at PROC SCORE, one of the examples in the documentation has exactly the code you're looking for in this case.

 



marianaalcos
Quartz | Level 8

Hi Reeza,

 

It seems to me that PROC SCORE computes linear combinations of the variables.

That isn't what I'm looking for.

 

I want to multiply the output of the piece of code below (a data set with 4 rows and 15 columns, i.e. the transpose of the 15x4 eigenvector matrix)

DATA eigenvectors (DROP = _TYPE_ _NAME_);
 SET Princomp_stats (WHERE= (_TYPE_="SCORE" & _NAME_ in 
 ('PC1','PC2','PC3', 'PC4'))); *getting eigenvectors;
RUN;

 

by my regression estimates coming from the piece of code below (it's currently a row vector, so I also need to transpose it).

DATA estimates (KEEP = PC1 PC2 PC3 PC4); *getting the betas from regression with 4PCs;
 SET Reg_out;
RUN;

Is there an easy way of doing this?

Unfortunately, Google is not helping much =/

Reeza
Super User

Matrix multiplication is a series of linear combinations. Can you upload a small Excel workbook that shows a fully worked example of what you need? The next easiest approach is to use a data step and a temporary array.
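The DATA step route can be sketched roughly as follows. It assumes work.Princomp_stats and work.estimates exist as created in the posts above; the data set names loadings and alphas and the variables b1-b4 and alpha are made up for this illustration. For brevity it uses PROC TRANSPOSE and a plain sum rather than a temporary array.

```sas
/* Put the first 4 eigenvectors into columns PC1-PC4, one row per original variable */
proc transpose data=work.Princomp_stats(where=(_TYPE_="SCORE" and
                    _NAME_ in ('PC1','PC2','PC3','PC4')))
               out=work.loadings;
   id _NAME_;   /* the PC names become the column names */
   var M So Ed Po1 Po2 LF M_F Pop NW U1 U2 Wealth Ineq Prob Time;
run;

/* One linear combination per original variable: eigenvector entries times betas */
data work.alphas;
   /* read the single row of betas once; values read by SET are retained */
   if _n_ = 1 then set work.estimates(rename=(PC1=b1 PC2=b2 PC3=b3 PC4=b4));
   set work.loadings;
   alpha = PC1*b1 + PC2*b2 + PC3*b3 + PC4*b4;
   keep _NAME_ alpha;   /* _NAME_ holds the original variable name */
run;
```

Each row of work.alphas would then hold one original variable and its coefficient on the standardized scale.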

Reeza
Super User

Please ignore my previous response. 

 

You're trying to back-transform your regression coefficients to interpret them? I don’t think that makes sense.

 

Instead you should probably be using PLS and I’m going to page @PGStats to chime in because this is beyond me and he’s much better at these questions. 

marianaalcos
Quartz | Level 8
Hi Reeza,

Yes, I'm trying to transform the PC coefficients into coefficients for the original variables. I appreciate your efforts, but I was able to do it using PROC IML!

I'll mark the topic as solved and post the solution above.
Thank you so much!
PGStats
Opal | Level 21

Make sure you check your final results. In my understanding the eigenvectors are applied to the standardized variables to get the PCA scores (unless you use the COV option, which is not advisable unless all your variables share the same scale). So I think you would need to do some rescaling to express your regression coefficients in terms of the original variables.
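
The rescaling described above can be written out explicitly. This is a sketch under the assumption that PROC PRINCOMP was run on the correlation matrix (the default, without the COV or STD option), so the scores are built from standardized variables. The symbols are introduced here for illustration and do not appear in the thread: x̄_j and s_j are the mean and standard deviation of variable j, v_jk is entry j of eigenvector k, and β_0, β_k are the intercept and PC coefficients from the regression.

```latex
\begin{aligned}
z_j &= \frac{x_j - \bar{x}_j}{s_j}, \qquad
\mathrm{PC}_k = \sum_{j=1}^{15} v_{jk}\, z_j, \\
\hat{y} &= \beta_0 + \sum_{k=1}^{4} \beta_k\, \mathrm{PC}_k
         = \beta_0 + \sum_{j=1}^{15} a_j z_j,
\qquad a_j = \sum_{k=1}^{4} v_{jk}\, \beta_k, \\
\hat{y} &= \Bigl(\beta_0 - \sum_{j=1}^{15} \frac{a_j\, \bar{x}_j}{s_j}\Bigr)
         + \sum_{j=1}^{15} \frac{a_j}{s_j}\, x_j .
\end{aligned}
```

So the coefficient of original variable j is a_j / s_j, and the intercept shifts by the sum of a_j x̄_j / s_j.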

PG
marianaalcos
Quartz | Level 8

Hi PG,

 

Yes, the coefficients obtained with my code are for the standardized data.

Afterwards I had to convert them back to get the coefficients on the original scale.

But after learning PROC IML I basically just ported some code I had in R to do that, and it worked just fine 🙂
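
The back-conversion code itself was not posted. A minimal sketch of what such a step might look like in PROC IML is given below: divide each standardized coefficient by the variable's standard deviation and adjust the intercept by the means. It assumes the default correlation-based PCA (so the OUTSTAT= data set has MEAN and STD rows) and that the unmodified OUTEST= data set from PROC REG, including its Intercept variable, is available as work.Reg_out (the name used in an earlier reply; this is an assumption). The matrix names vars, pcNames, xbar, s, V, V4, a, b, b0 and b0_orig are illustrative.

```sas
proc iml;
   vars    = {'M' 'So' 'Ed' 'Po1' 'Po2' 'LF' 'M_F' 'Pop' 'NW'
              'U1' 'U2' 'Wealth' 'Ineq' 'Prob' 'Time'};
   pcNames = {'PC1' 'PC2' 'PC3' 'PC4'};

   /* means, standard deviations and eigenvectors from the OUTSTAT= data set */
   use work.Princomp_stats;
   read all var vars where(_TYPE_="MEAN")  into xbar;   /* 1 x 15 */
   read all var vars where(_TYPE_="STD")   into s;      /* 1 x 15 */
   read all var vars where(_TYPE_="SCORE") into V;      /* one row per PC */
   close work.Princomp_stats;
   V4 = V[1:4, ];                                       /* rows PC1-PC4, 4 x 15 */

   /* intercept and PC betas from the (assumed) unmodified OUTEST= data set */
   use work.Reg_out;
   read all var {'Intercept'} into b0;
   read all var pcNames into beta;                      /* 1 x 4 */
   close work.Reg_out;

   a       = V4` * beta`;             /* 15 x 1, standardized-scale coefficients */
   b       = a / s`;                  /* 15 x 1, original-scale coefficients */
   b0_orig = b0 - sum(b # xbar`);     /* intercept on the original scale */
   print b0_orig, b[rowname=vars];
quit;
```

A quick check is to score the original data with b0_orig and b and compare against the p_ predictions saved by PROC REG; the two should agree up to rounding.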

 

Thanks anyways!

