Is there a way to output the correlation coefficients from a correlation matrix produced by PROC CORR and use them in a calculation to create a new variable?
My goal is to assess multicollinearity using the following calculation:
new_var = (correlation coefficient)^2 x 100 = % of information shared by the 2 variables
Thanks.
Suggest that you provide the Proc Corr code and which coefficient(s) you want.
ODS OUTPUT is the most likely approach.
Here is an example you should be able to run. Note that each option creates a different table; the table names are listed in the DETAILS section of the Proc Corr documentation, which is where you determine which table, such as PearsonCorr below, to request. The part after the = is the name of the data set to create, assuming your remaining syntax is correct.
proc corr data=sashelp.class pearson spearman;
  ods output pearsoncorr=pcoef SpearmanCorr=scoef;
  var height weight;
run;
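Once the coefficients are in a data set, they can be used in a calculation like any other variable. The following is only a minimal sketch, assuming the PearsonCorr table stores each coefficient in a column named after the corresponding VAR variable (with the p-values in separate P-prefixed columns). The data set name shared and the pct_ variable names are made up for illustration.

```sas
proc corr data=sashelp.class pearson;
  ods output pearsoncorr=pcoef;
  var height weight age;
run;

data shared;                                  /* hypothetical output name */
  set pcoef;
  /* assumption: coefficient columns carry the VAR variable names */
  array r{*} height weight age;
  array pct{*} pct_height pct_weight pct_age; /* hypothetical new variables */
  do i = 1 to dim(r);
    pct{i} = r{i}**2 * 100;                   /* squared correlation as a percentage */
  end;
  drop i;
run;

proc print data=shared;
run;
```

Each row of shared corresponds to one variable of the matrix, so pct_weight on the height row is the percentage shared between height and weight.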
Yikes... Working with this example, adding age, and taking the output data set pcoef, I would want to create a new variable that goes through each column, squares the Pearson correlation coefficient, and then multiplies it by 100.
How would I do that for a matrix of variables?
This may be too complicated. I'm also trying to assess multicollinearity a different way. I just wanted to see how this approach might work.
If you only want the correlation between specific variables you can reduce the proc corr results by using Var and With statements.
Such as
proc corr data=sashelp.class pearson spearman kendall hoeffding;
  ods output pearsoncorr=pcoef SpearmanCorr=scoef;
  var height;
  with weight;
run;
The output above does not have a "matrix".
You still have not shared your Proc Corr or even which variables you are specifically concerned about.
Seems trivial. Use PROC CORR to generate a correlation dataset. Transpose it to make it easier to work with. Generate your new variable.
proc corr data=sashelp.class outp=CORR;
var height weight age;
run;
proc transpose data=corr(rename=(_name_=left)) name=right out=TALL(rename=(col1=CORR));
where _type_='CORR';
by left notsorted;
run;
data want;
set tall;
new_var = corr**2 * 100;
run;
proc print;
run;
You can figure out how to add some logic to eliminate the duplicates.
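One possible way to do that, offered as a sketch rather than the only approach: after the transpose, each pair appears twice (once in each order), and every variable is also paired with itself. Comparing the two name columns keeps one row per pair and drops the diagonal at the same time. This assumes the TALL data set and its LEFT, RIGHT and CORR columns exactly as created above. The UPCASE calls are there only so the comparison does not depend on the case of the variable names.

```sas
data want;
  set tall;
  /* keep one row per unordered pair; this also drops left = right (r = 1) */
  if upcase(left) < upcase(right);
  new_var = corr**2 * 100;
run;

proc print data=want;
run;
```

With the three variables height, weight and age, this leaves three rows, one for each distinct pair.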