MikeTurner
Calcite | Level 5

The following is a correlation matrix for x1-x6. I want to sort the variables as follows:

First, set x1 as the first variable.

Second, find the variable whose correlation coefficient with x1 is the largest (excluding x1 itself), and set it as the second variable, call it x(2);

Third, find the variable whose correlation coefficient with x(2) is the largest, and set it as the third variable, call it x(3);

.....

.....

Finally, all variables are sorted in this sequence: each variable is the one with the maximal correlation coefficient with the variable before it.

I can sort x1-x6 by hand. But if I have 100 variables, how can I sort them in SAS?

Thanks.

NAME  x1        x2        x3        x4        x5        x6
x1    1         0.795283  0.648228  0.702434  0.410562  0.67573
x2    0.795283  1         0.785185  0.852621  0.517509  0.8671
x3    0.648228  0.785185  1         0.711466  0.50897   0.695645
x4    0.702434  0.852621  0.711466  1         0.457522  0.757802
x5    0.410562  0.517509  0.50897   0.457522  1         0.485006
x6    0.67573   0.8671    0.695645  0.757802  0.485006  1

Accepted Solutions
Ksharp
Super User

Oh, I didn't realize your problem was so complicated.

See whether the following code does what you need.

data class;
input x1-x4;
cards;
1 2 4 6
2 3 5 6
3 5 7 8
6 8 9 2
3 4 6 2
2 1 4 8
2 5 7 9
;
run;



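/* Starting variable for the chain; use x1 to start from x1 as in the question */
%let x1=x2;


/* Keep only the correlation rows; _NAME_ is blank on the MEAN, STD and N rows */
proc corr data=class outp=x(where=(_name_ is not missing)) noprint;
 var  x: ;
run;

/* One row per ordered pair of distinct variables: path = "from to" */
data xx(keep=path value);
 set x;
 length path $ 65;
 array _x{*} x: ;
 do i=1 to dim(_x);
   /* skip the diagonal by name, so an off-diagonal correlation of exactly 1 is kept */
   if upcase(_name_) ne upcase(vname(_x{i})) then do;
    path=catx(' ',_name_,vname(_x{i})) ;
    value=_x{i};
   output;
   end;
 end;
 if _n_ eq 1 then call symputx('num',dim(_x));
run;


data _null_;
if 0 then set xx;   /* defines PATH and VALUE for the hash object */
 declare hash ha(hashexp:10,dataset:'xx');
 declare hiter hi('ha');
  ha.definekey('path');
  ha.definedata('path','value');
  ha.definedone();
 length list $ 4000 temp $ 32;
 list="&x1";  
 do i=1 to &num-1;
  max=.;
  /* scan every pair and keep the best unused partner of the last variable in LIST */
  rc=hi.first();
  do while(rc=0); 
   /* FINDW matches whole words, so x1 is not mistaken for part of x10 */
   if (strip(scan(list,-1,' ')) eq strip(scan(path,1,' ')) ) and not findw(list,strip(scan(path,-1,' ')),' ')  then do;
     if value gt max then do;temp=strip(scan(path,-1,' '));max=value;end;
   end;
   rc=hi.next();
  end;
  list=catx(' ',list,temp); 
end;
 call symputx('list',list);
run;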

%put &num &list;

proc corr data=class outp=want(where=(_name_ is not missing)) noprint;
 var  &list ;
run;





Ksharp

Tian.Kong


6 REPLIES
Ksharp
Super User

How about:

data class;
input x1-x4;
cards;
1 2 4 6
2 3 5 6
3 5 7 8
6 8 9 2
3 4 6 2
2 1 4 8
2 5 7 9
;
run;



%let x1=x1;


proc corr data=class outp=x(where=(upcase(_name_)="%upcase(&x1)"));
 var x1-x4 ;
run;
data xx(keep=name value);
 set x(keep=x:);
 array _x{*} x: ;
 do i=1 to dim(_x);
   name=vname(_x{i});
   value=_x{i};
   output;
 end;
run;
proc sort data=xx;by descending value;run;
proc sql;
 select name into : list separated by ' ' from xx;
quit;

proc corr data=class outp=want;
 var &list ;
run;


Ksharp

MikeTurner
Calcite | Level 5

Thanks. Greatly appreciated.

But it doesn't seem to be what I want. Just look at the data I posted above.

x1 and x2 will be the first two variables for sure. For the third one, your program gives x4. But the variable that has the maximal correlation coefficient with x2 is x6, not x4.

Any suggestions?


art297
Opal | Level 21

Is the following what you are trying to do?

data have;
  input NAME $ x1-x6;
  cards;
x1 1 0.795283 0.648228 0.702434 0.410562 0.67573
x2 0.795283 1 0.785185 0.852621 0.517509 0.8671
x3 0.648228 0.785185 1 0.711466 0.50897 0.695645
x4 0.702434 0.852621 0.711466 1 0.457522 0.757802
x5 0.410562 0.517509 0.50897 0.457522 1 0.485006
x6 0.67573 0.8671 0.695645 0.757802 0.485006 1
;

data want (keep=v: corr);
  set have (rename=(name=var1));
  array corrs(*) x1-x6;
  do i=1 to 6;
    if i ne _n_ then do;
      var2=catt('x',i);
      corr=corrs(i);
      output;
    end;
  end;
run;

proc sort data=want;
  by var1 descending corr;
run;

MikeTurner
Calcite | Level 5

Thanks.

It's very close to what I want.

I'll set x1 as the 1st. x2 has the maximal correlation coefficient with x1, so x2 will be the second.

x6 has the maximal correlation coefficient with x2, so x6 will be the third.

x2 has the maximal correlation coefficient with x6, but x2 is already in the sequence (2nd). So we will choose x4, which has the second-largest correlation coefficient with x6, as the fourth one in the sequence.

................

and so on..............
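
The walk-through above amounts to a greedy chain: start at x1, then repeatedly move to the not-yet-placed variable with the largest correlation to the variable just placed. Below is a minimal sketch of that rule in a DATA step, applied to the matrix posted in the question. It is an illustration under stated assumptions, not the method from any reply: the rows of HAVE are assumed to be in the same order as the columns x1-x6, the dimension 6 is hard-coded, and the names r, used, cur, pick and best are made up for the sketch.

```sas
/* Correlation matrix as posted in the question */
data have;
  input name $ x1-x6;
  cards;
x1 1 0.795283 0.648228 0.702434 0.410562 0.67573
x2 0.795283 1 0.785185 0.852621 0.517509 0.8671
x3 0.648228 0.785185 1 0.711466 0.50897 0.695645
x4 0.702434 0.852621 0.711466 1 0.457522 0.757802
x5 0.410562 0.517509 0.50897 0.457522 1 0.485006
x6 0.67573 0.8671 0.695645 0.757802 0.485006 1
;
run;

data _null_;
  array x{6} x1-x6;
  array r{6,6} _temporary_;    /* whole matrix held in memory */
  array used{6} _temporary_;   /* set to 1 once a variable is placed */
  length list $ 200;

  /* load all six rows (assumes row order matches column order) */
  do i = 1 to 6;
    set have;
    do j = 1 to 6;
      r{i,j} = x{j};
    end;
  end;

  cur = 1;                     /* start the chain at x1 */
  used{cur} = 1;
  list = vname(x{cur});

  do step = 2 to 6;
    best = .;
    pick = .;
    /* best unused partner of the variable placed last */
    do j = 1 to 6;
      if used{j} ne 1 and r{cur,j} > best then do;
        best = r{cur,j};
        pick = j;
      end;
    end;
    used{pick} = 1;
    list = catx(' ', list, vname(x{pick}));
    cur = pick;
  end;

  call symputx('list', list);  /* macro variable &list for later use */
  put list=;
  stop;
run;
```

For the posted matrix the chain is x1 x2 x6 x4 x3 x5, which agrees with the first four positions worked out by hand above. For 100 variables the array bounds and loop limits would need to change from 6 to 100.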


data_null__
Jade | Level 19

Sounds like you are looking for output like that produced by the PROC CORR statement option BEST=.  Check the documentation.
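
A minimal sketch of that suggestion, assuming the CLASS sample data from Ksharp's reply (the choice of BEST=3 is arbitrary). BEST=n makes PROC CORR print, for each variable, its n largest correlations ordered from highest to lowest in absolute value.

```sas
/* Sample data copied from Ksharp's reply */
data class;
  input x1-x4;
  cards;
1 2 4 6
2 3 5 6
3 5 7 8
6 8 9 2
3 4 6 2
2 1 4 8
2 5 7 9
;
run;

/* For each variable, list its three strongest correlations in descending order */
proc corr data=class best=3;
  var x1-x4;
run;
```

Note that this ranks partners separately for each variable. It does not by itself build the chained sequence described in the question, where a variable already placed is skipped.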

