_maldini_
Barite | Level 11

Is there a way to output the correlation coefficients from a correlation matrix produced by PROC CORR and use them in a calculation to create a new variable?

 

My goal is to assess multicollinearity using the following calculation:

new_var = (correlation coefficient)^2 * 100, i.e. the percentage of information shared by the two variables

 

Thanks.

 

4 REPLIES
ballardw
Super User

I suggest that you post your PROC CORR code and say which coefficient(s) you want.

ODS OUTPUT is the most likely approach.

Here is an example you should be able to run. Each PROC CORR option creates different output tables. The ODS table names are listed in the Details section of the PROC CORR documentation, which is where you find the table to request, such as PearsonCorr below. The part after the = is the name of the data set to create, assuming the rest of your syntax is correct.

proc corr data=sashelp.class
   pearson spearman ;
   ods output pearsoncorr=pcoef 
              SpearmanCorr=scoef
   ;
   var height weight;
run;
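
As a rough sketch of using that output data set in the calculation from the question: this assumes PCOEF holds one numeric column per VAR variable, named after that variable (Height and Weight here). The names pcoef_pct, pct_height and pct_weight are made up for illustration.

```sas
/* Create PCOEF as in the example above (Pearson only, to keep it short) */
proc corr data=sashelp.class pearson;
   ods output pearsoncorr=pcoef;
   var height weight;
run;

/* Assumption: PCOEF has numeric columns Height and Weight holding the
   coefficients. Square each one and multiply by 100. */
data pcoef_pct;
   set pcoef;
   array r{*} height weight;            /* coefficient columns            */
   array pct{*} pct_height pct_weight;  /* hypothetical new columns       */
   do i = 1 to dim(r);
      pct{i} = r{i}**2 * 100;           /* percent shared, as in the question */
   end;
   drop i;
run;
```

For more variables, extend both the VAR statement and the two array lists in the same order.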
  
_maldini_
Barite | Level 11

Yikes... Working with this example, after adding age and taking the output data set pcoef, I want to create a new variable that goes through each column, squares the Pearson correlation coefficient, and multiplies it by 100.

How would I do that for a matrix of variables?

This may be too complicated. I'm also trying to assess multicollinearity a different way. I just wanted to see how this approach might work.

 

ballardw
Super User

If you only want the correlation between specific variables, you can reduce the PROC CORR results by using the VAR and WITH statements, such as:

proc corr data=sashelp.class
   pearson spearman kendall hoeffding;
   ods output pearsoncorr=pcoef 
              SpearmanCorr=scoef
   ;
   var height  ;
   with weight;
run;
  

The output above is not a square "matrix": it holds only the correlation of each WITH variable against each VAR variable.

 

You still have not shared your PROC CORR code or said which variables you are specifically concerned about.

Tom
Super User

Seems trivial. Use PROC CORR to generate a correlation data set. Transpose it to make it easier to work with. Generate your new variable.

proc corr data=sashelp.class outp=CORR;
   var height weight age;
run;

proc transpose data=corr(rename=(_name_=left)) name=right out=TALL(rename=(col1=CORR));
 where _type_='CORR';
 by left notsorted;
run;

data want;
  set tall;
  new_var =  corr**2 * 100;
run;

proc print;
run;

Each pair shows up twice (once in each order), along with each variable paired with itself. You can figure out how to add some logic to eliminate the duplicates.
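
One possible way to drop the duplicates, as a sketch only: keep a row only when LEFT sorts before RIGHT, which keeps each pair once and also drops the diagonal. The data set name want_unique is made up for illustration, and the first two steps just repeat the code above so the example runs on its own.

```sas
/* Rebuild the tall data set (same steps as above, with NOPRINT added) */
proc corr data=sashelp.class outp=corr noprint;
   var height weight age;
run;

proc transpose data=corr(rename=(_name_=left)) name=right
               out=tall(rename=(col1=corr));
   where _type_='CORR';
   by left notsorted;
run;

/* Keep one row per unordered pair: LEFT must sort strictly before RIGHT.
   UPCASE guards against differences in the case of the variable names. */
data want_unique;
   set tall;
   if upcase(left) < upcase(right);
   new_var = corr**2 * 100;
run;

proc print data=want_unique;
run;
```

With three variables this leaves three rows, one per pair. Using <= instead of < would keep the diagonal rows too.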
