Environment: SAS Studio 9.4
PROC CORR data=have;
var Y logY;
with X_001-X_100;
run;
In working on a dataset with hundreds of variables, is there a way to sort the Pearson correlation coefficients with the response variable, 'y', in decreasing order through PROC CORR? Instead of sorting the output as X_001, X_002, ..., X_100 and their corresponding Pearson correlation coefficients, sort by the magnitude of coefficients.
Several web searches have shown syntax such as 'DESCENDING' or 'RANK' but does not seem to work.
All pairs of variable correlations with correlation, count and p-value in descending order (spearman)
Could not run in WPS due to record limit(no limit with R).
This is code from the oto_voodoo macro(you can just extract it).
see
https://www.dropbox.com/s/po3ahepe7r7dnm7/oto_voodoo.sas?dl=0
HAVE ( sashelp.iris)
====================
p to 40 obs from sashelp.iris total obs=150
bs SPECIES SEPALLENGTH SEPALWIDTH PETALLENGTH PETALWIDTH
1 Setosa 50 33 14 2
2 Setosa 46 34 14 3
3 Setosa 46 36 10 2
4 Setosa 51 33 17 5
5 Setosa 55 35 13 2
...
WANT
====
Variable Correlations (Spearman)
Correlated Correlation Number Spearman
Variable With Coef of Obs P
PETALWIDTH PETALLENGTH 0.93767 150 <.0001
PETALLENGTH SEPALLENGTH 0.88190 150 <.0001
PETALWIDTH SEPALLENGTH 0.83429 150 <.0001
PETALLENGTH SEPALWIDTH 0.30964 150 <.0001
PETALWIDTH SEPALWIDTH 0.28903 150 <.0001
SEPALWIDTH SEPALLENGTH 0.16678 150 <.0001
WORKING CODE
============
%_vdo_cor(
lib=sashelp
,mem=iris
);
FULL SOLUTION
=============
%macro _vdo_cor(
lib=&libname
,mem=&data
);
data _vcor0th/view=_vcor0th;
set %str(&lib).%str(&mem) (keep=_numeric_);
_rec=_n_;
if _n_=1 then _rec=.;
run;
ods exclude all;
ods output spearmancorr=_vvcor1st;
proc corr data=_vcor0th (keep=_numeric_) spearman;
var _numeric_;
with _numeric_;
run;
ods select all;
proc sql noprint;select count(*) into :_vv_num separated by ' ' from _vvcor1st;quit;
%put &=_vv_num;
data _vvcor2nd;
keep var wth n val p_rec;
set _vvcor1st(drop=label );
array num[*] _numeric_;
do _i_=1 to &_vv_num;
if num[_i_] ne . then do;
var=variable;
wth=vname(num[_i_]);
n=num[_i_+ %eval(2 * &_vv_num)];
val=abs(num[_i_]);
if (_i_ < _n_
and not (var='_REC' or wth = '_REC')) then output;
end;
end;
run;
proc sort data=_vvcor2nd out=vv_corsrt;
by descending val;
run;
title "Variable Correlations (Spearman)";
proc print data=vv_corsrt(obs=100) noobs width=min label split='#';
label
var = "Variable"
wth = "Correlated#With"
val = "Correlation#Coef"
n = "Number of Obs"
p_rec = "Spearman P";
var var wth val n p_rec;
run;
ods select all;
%mend _vdo_cor;
%_vdo_cor(
lib=sashelp
,mem=iris
);
I don't think there is any option that will generate the desired results directly from the proc. You'll have to output the correlations and then manipulate the data in some way. The code below is one example that may or may not be what you want, based on my interpretation of your question.
data have;
array x_{100} x_001 - x_100;
do i = 1 to 50;
y = ranuni(4) * 100;
logy = log(y);
do j = 1 to 100;
x_(j) = ranuni(4) * 100;
end;
output;
end;
run;
ods output PearsonCorr = corrs;
proc corr data = have nosimple;
var y logy;
with x_:;
run;
ods output close;
proc sql;
SELECT * FROM corrs
ORDER BY py descending;
quit;
The following requires some hard coding, but I think will provide what you want:
ods output PearsonCorr=need; proc corr data=sashelp.cars nosimple noprob; var _numeric_; run; data need (drop=label nmsrp--nlength); set need; run; proc transpose data=need out=want; by variable notsorted; run; data want (drop=_:); length variables $67; set want (rename=(col1=corr)); array vnames(2) $32. variable _name_; call sortc(of vnames(*)); if vnames(1) ne vnames(2); variables=catx(' X ',vnames(1),vnames(2)); abscorr=abs(corr); run; proc sort data=want out=want (drop=abscorr variable) nodupkey; by descending abscorr variables; run;
Art, CEO, AnalystFinder.com
You might also be interested in how to build a correlations heat map -- should work fine in SAS Studio. The blog post provides a macro you can use, and the data might be useful for your other purposes.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.