The following is a correlation matrix of x1-x6. I want to sort the variables as follows:
First, set x1 as the first variable.
Second, find the variable whose correlation coefficient with x1 is the largest (excluding x1 itself), and set it as the second variable, call it x(2).
Third, find the variable, among those not yet placed, whose correlation coefficient with x(2) is the largest, and set it as the third variable, call it x(3).
.....
.....
Finally, all variables are sorted in this sequence: each variable is the one with the largest correlation coefficient with the variable placed just before it.
I can sort x1-x6 by hand. But if I have 100 variables, how can I sort them in SAS?
Thanks.
NAME | x1 | x2 | x3 | x4 | x5 | x6 |
x1 | 1 | 0.795283 | 0.648228 | 0.702434 | 0.410562 | 0.67573 |
x2 | 0.795283 | 1 | 0.785185 | 0.852621 | 0.517509 | 0.8671 |
x3 | 0.648228 | 0.785185 | 1 | 0.711466 | 0.50897 | 0.695645 |
x4 | 0.702434 | 0.852621 | 0.711466 | 1 | 0.457522 | 0.757802 |
x5 | 0.410562 | 0.517509 | 0.50897 | 0.457522 | 1 | 0.485006 |
x6 | 0.67573 | 0.8671 | 0.695645 | 0.757802 | 0.485006 | 1 |
How about:
data class;
  input x1-x4;
  cards;
1 2 4 6
2 3 5 6
3 5 7 8
6 8 9 2
3 4 6 2
2 1 4 8
2 5 7 9
;
run;

%let x1=x1;

proc corr data=class outp=x(where=(upcase(_name_)="%upcase(&x1)"));
  var x1-x4;
run;

data xx(keep=name value);
  set x(keep=x:);
  array _x{*} x: ;
  do i=1 to dim(_x);
    name=vname(_x{i});
    value=_x{i};
    output;
  end;
run;

proc sort data=xx;
  by descending value;
run;

proc sql;
  select name into : list separated by ' ' from xx;
quit;

proc corr data=class outp=want;
  var &list;
run;
Ksharp
Thanks. Greatly appreciated.
But it doesn't seem to be what I want. Just look at the data I presented above.
We will set x1 and x2 as the first two variables for sure. For the third one, your program will pick x4. But the variable that has the maximal correlation coefficient with x2 is x6, not x4.
Any suggestions?
Is the following what you are trying to do?:
data have;
input NAME $ x1-x6;
cards;
x1 1 0.795283 0.648228 0.702434 0.410562 0.67573
x2 0.795283 1 0.785185 0.852621 0.517509 0.8671
x3 0.648228 0.785185 1 0.711466 0.50897 0.695645
x4 0.702434 0.852621 0.711466 1 0.457522 0.757802
x5 0.410562 0.517509 0.50897 0.457522 1 0.485006
x6 0.67573 0.8671 0.695645 0.757802 0.485006 1
;
data want (keep=v: corr);
  set have (rename=(name=var1));
  array corrs(*) x1-x6;
  do i=1 to 6;
    if i ne _n_ then do;
      var2=catt('x',i);
      corr=corrs(i);
      output;
    end;
  end;
run;
proc sort data=want;
  by var1 descending corr;
run;
Thanks.
It's very close to what I want.
I'll set x1 as the first. x2 has the maximal correlation coefficient with x1, so x2 will be the second.
x6 has the maximal correlation coefficient with x2, so x6 will be the third.
x2 has the maximal correlation coefficient with x6, but x2 is already in the sequence (second). So we
will choose x4, which has the second-largest correlation coefficient with x6, as the fourth one in the sequence.
................
and so on.
Sounds like you are looking for output like that produced by the PROC CORR statement option BEST=. Check the documentation.
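For reference, a minimal sketch of what the BEST= option does, using the small CLASS sample data that appears in the code replies of this thread. BEST=n limits the printed output to the n largest correlations for each variable, ordered from highest to lowest absolute value. The choice of n=3 is only an illustration. Note that this ranks correlations separately for each variable; it does not by itself build the chained sequence described in the question.

```sas
/* Small sample data set (same values as the CLASS data in this thread) */
data class;
  input x1-x4;
  cards;
1 2 4 6
2 3 5 6
3 5 7 8
6 8 9 2
3 4 6 2
2 1 4 8
2 5 7 9
;
run;

/* BEST=3: for each variable, print only its 3 largest correlations,
   ordered from highest to lowest absolute value */
proc corr data=class best=3;
  var x1-x4;
run;
```

Each row of the printed table lists the variables most correlated with the row variable; the row variable itself (correlation 1) normally heads its own list, so the first useful neighbor is the second entry.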
Oh, I didn't realize your problem was so complicated.
See whether the following code is what you need.
data class;
  input x1-x4;
  cards;
1 2 4 6
2 3 5 6
3 5 7 8
6 8 9 2
3 4 6 2
2 1 4 8
2 5 7 9
;
run;

/* Starting variable of the chain; set this to x1 to start from x1 */
%let x1=x2;

/* Correlation matrix, one row per variable */
proc corr data=class outp=x(where=(_name_ is not missing)) noprint;
  var x: ;
run;

/* One row per ordered pair: path = "from to", value = correlation */
data xx(keep=path value);
  set x;
  length path $ 40;
  array _x{*} x: ;
  do i=1 to dim(_x);
    if _x{i} ne 1 then do;
      path=catx(' ',_name_,vname(_x{i}));
      value=_x{i};
      output;
    end;
  end;
  if _n_ eq 1 then call symputx('num',dim(_x));
run;

data _null_;
  if 0 then set xx;
  declare hash ha(hashexp:10,dataset:'xx');
  declare hiter hi('ha');
  ha.definekey('path');
  ha.definedata('path','value');
  ha.definedone();
  length list $ 4000;
  list="&x1";
  do i=1 to &num-1;
    max=.;
    do while(hi.next()=0);
      /* Consider pairs that start at the last chosen variable and end at
         a variable not yet in the list. FINDW matches whole words, so
         x1 is not mistaken for part of x10 when there are many variables. */
      if (strip(scan(list,-1,' ')) eq strip(scan(path,1,' ')))
         and not findw(list,strip(scan(path,-1,' ')),' ') then do;
        if value gt max then do;
          temp=strip(scan(path,-1,' '));
          max=value;
        end;
      end;
    end;
    list=catx(' ',list,temp);
  end;
  call symputx('list',list);
run;

%put &num &list;

/* Correlation matrix with the variables in the chained order */
proc corr data=class outp=want(where=(_name_ is not missing)) noprint;
  var &list;
run;
Ksharp
Tian.Kong
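For anyone who already has the correlation matrix as a data set rather than the raw data, here is a minimal alternative sketch of the same chaining rule. It assumes the matrix from the question is stored as HAVE with its rows in the same order as its columns (x1 to x6), and it hard-codes the dimension 6; both are assumptions for illustration, as are the output names ORDER, POSITION, VARIABLE and CORR_WITH_PREVIOUS. For 100 variables the array bounds and loop limits would need to change accordingly.

```sas
/* Correlation matrix from the question, one row per variable */
data have;
  input name $ x1-x6;
  cards;
x1 1 0.795283 0.648228 0.702434 0.410562 0.67573
x2 0.795283 1 0.785185 0.852621 0.517509 0.8671
x3 0.648228 0.785185 1 0.711466 0.50897 0.695645
x4 0.702434 0.852621 0.711466 1 0.457522 0.757802
x5 0.410562 0.517509 0.50897 0.457522 1 0.485006
x6 0.67573 0.8671 0.695645 0.757802 0.485006 1
;
run;

/* Greedy chain: start at x1, then repeatedly pick the unused variable
   most correlated with the most recently chosen one */
data order (keep=position variable corr_with_previous);
  array x{6} x1-x6;
  array r{6,6} _temporary_;    /* full correlation matrix          */
  array used{6} _temporary_;   /* 1 once a variable has been placed */
  length variable $ 32;

  /* Load all six rows of HAVE into the matrix */
  do i = 1 to 6;
    set have;
    do j = 1 to 6;
      r{i,j} = x{j};
    end;
  end;

  /* Assumed starting point: x1, which is row 1 of the matrix */
  current = 1;
  used{current} = 1;
  position = 1;
  variable = vname(x{current});
  corr_with_previous = .;
  output;

  do position = 2 to 6;
    best = .;
    pick = .;
    do j = 1 to 6;
      /* A missing BEST compares lower than any number in SAS */
      if used{j} ne 1 and r{current,j} > best then do;
        best = r{current,j};
        pick = j;
      end;
    end;
    used{pick} = 1;
    variable = vname(x{pick});
    corr_with_previous = best;
    output;
    current = pick;
  end;
  stop;
run;

proc print data=order noobs;
run;
```

Worked by hand from the matrix in the question, the chain is x1, x2, x6, x4, x3, x5, which agrees with the first four positions described earlier in the thread.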