Proc CORR creates m*m correlation matrix ... how to convert it into pair-wise dataset?!
Say here are ten variables, CORR creates 10*10 correlation matrix, need to convert into 10*10=100 row (var1 var2 corr12)..
See the article, "Use a bar chart to visualize pairwise correlations."
The focus of that article is to create a bar chart of the pairwise correlations, but you can use the same code to create a list.
The SAS/IML language has similarities with R, but if you are new to SAS, here is an explicit example, modified from the article:
ods exclude all;
proc corr data=sashelp.Heart; /* pairwise correlation */
var _NUMERIC_;
ods output PearsonCorr = Corr; /* write correlations, p-values, and sample sizes to data set */
run;
ods exclude none;
proc iml;
use Corr;
read all var "Variable" into ColNames; /* get names of variables */
read all var (ColNames) into mCorr; /* matrix of correlations */
close Corr;
numCols = ncol(mCorr); /* number of variables */
numPairs = numCols*(numCols-1) / 2;
length = 2*nleng(ColNames) + 5; /* max length of new ID variable */
PairNames = j(NumPairs, 1, BlankStr(length));
i = 1;
do row= 2 to numCols; /* construct the pairwise names */
do col = 1 to row-1;
PairNames[i] = strip(ColNames[col]) + " vs. " + strip(ColNames[row]);
i = i + 1;
end;
end;
lowerIdx = loc(row(mCorr) > col(mCorr)); /* indices of lower-triangular elements */
Corr = mCorr[ lowerIdx ];
create CorrPairs var {"PairNames" "Corr"};
append;
close;
QUIT;
proc print data=CorrPairs; run;
There way to do in R. How to do in SAS?! https://stackoverflow.com/questions/33781147/r-converting-correlation-matrix-to-pairwise
See the article, "Use a bar chart to visualize pairwise correlations."
The focus of that article is to create a bar chart of the pairwise correlations, but you can use the same code to create a list.
The SAS/IML language has similarities with R, but if you are new to SAS, here is an explicit example, modified from the article:
ods exclude all;
proc corr data=sashelp.Heart; /* pairwise correlation */
var _NUMERIC_;
ods output PearsonCorr = Corr; /* write correlations, p-values, and sample sizes to data set */
run;
ods exclude none;
proc iml;
use Corr;
read all var "Variable" into ColNames; /* get names of variables */
read all var (ColNames) into mCorr; /* matrix of correlations */
close Corr;
numCols = ncol(mCorr); /* number of variables */
numPairs = numCols*(numCols-1) / 2;
length = 2*nleng(ColNames) + 5; /* max length of new ID variable */
PairNames = j(NumPairs, 1, BlankStr(length));
i = 1;
do row= 2 to numCols; /* construct the pairwise names */
do col = 1 to row-1;
PairNames[i] = strip(ColNames[col]) + " vs. " + strip(ColNames[row]);
i = i + 1;
end;
end;
lowerIdx = loc(row(mCorr) > col(mCorr)); /* indices of lower-triangular elements */
Corr = mCorr[ lowerIdx ];
create CorrPairs var {"PairNames" "Corr"};
append;
close;
QUIT;
proc print data=CorrPairs; run;
Why not just use normal SAS code?
Generate the correlations.
Transpose.
proc corr data=sashelp.heart out=corr noprint;
var _numeric_;
run;
proc transpose
data=corr(rename=(_name_=LEFT))
out=pairs(rename=(_name_=RIGHT) drop=_label_ )
;
by left notsorted;
where _type_='CORR';
id _type_;
run;
Results:
Obs LEFT RIGHT CORR 1 AgeCHDdiag AgeCHDdiag 1.00000 2 AgeCHDdiag AgeAtStart 0.55091 3 AgeCHDdiag Height -0.21042 4 AgeCHDdiag Weight -0.13741 5 AgeCHDdiag Diastolic -0.03510 6 AgeCHDdiag Systolic 0.06924 7 AgeCHDdiag MRW 0.00642 8 AgeCHDdiag Smoking -0.28357 9 AgeCHDdiag AgeAtDeath 0.74811 10 AgeCHDdiag Cholesterol 0.00363 11 AgeAtStart AgeCHDdiag 0.55091 12 AgeAtStart AgeAtStart 1.00000 13 AgeAtStart Height -0.13173 14 AgeAtStart Weight 0.09352 15 AgeAtStart Diastolic 0.27540 16 AgeAtStart Systolic 0.37938 17 AgeAtStart MRW 0.20451 18 AgeAtStart Smoking -0.16743 19 AgeAtStart AgeAtDeath 0.68860 20 AgeAtStart Cholesterol 0.27341 21 Height AgeCHDdiag -0.21042 ...
You could eliminate the self correlations and the duplicate entries by just filtering to the observations where LEFT < RIGHT.
data have;
set sashelp.Heart;
run;
ods exclude all;
proc corr data=have; /* pairwise correlation */
var _NUMERIC_;
ods output PearsonCorr = Corr; /* write correlations, p-values, and sample sizes to data set */
run;
ods exclude none;
proc sort data=corr(drop=Label) out=corr2;
by variable;
run;
proc sql noprint;
select variable into : vname separated by ' ' from corr2;
quit;
proc transpose data=corr2 out=want;
by variable;
var &vname.;
run;
Works great. Thanks.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.