Solved: Re: Sorting SAS Table with variable name as output

deega · Posted 07-12-2016 12:15 AM

I have the follwing table

ID	a1	s2	d3	f4	g5
a 1	0	2.56217	3.36783	3.53842	3.00214
s 2	2.56217	0	2.87413	4.1333	3.60389
d 3	3.36783	2.87413	0	2.46764	3.39893
f 4	3.53842	4.1333	2.46764	0	2.97952
g 5	3.00214	3.60389	3.39893	2.97952	0

I want to sort this table rowwise(or may be columnwise) but instead of the observation value I need the variable name as output, something like below

ID	id1	id2	id3	id4	id5
a 1	a1	s2	g5	d3	f4
s 2	s2	a1	d3	g5	f4
d 3	d3	f4	s2	a1	g5
f 4	f4	d3	g5	a1	s2
g 5	g5	f4	a1	d3	s2

Some more details:

Considering only the first row of the table

ID	a1	s2	d3	f4	g5
a 1	0	2.56217	3.36783	3.53842	3.00214

Sorting It alonwith variable names

ID	a1	s2	d3	f4	g5
a 1	0	2.56217	3.36783	3.53842	3.00214

Resulting into

ID	a1	s2	g5	d3	f4
a 1	0	2.56217	3.00214	3.36783	3.53842

Now taking only the variable names

ID
a 1	a1	s2	g5	d3	f4

Doing it for all the rows is my desired output

Ksharp · Posted 07-12-2016 01:07 AM

data have;
infile cards truncover expandtabs;
input ID $	a1	s2	d3	f4	g5;
cards;
a1	0	2.56217	3.36783	3.53842	3.00214
s2	2.56217	0	2.87413	4.1333	3.60389
d3	3.36783	2.87413	0	2.46764	3.39893
f4	3.53842	4.1333	2.46764	0	2.97952
g5	3.00214	3.60389	3.39893	2.97952	0
;
run;


proc transpose data=have out=temp;
by id notsorted;
run;
proc sort data=temp;
by id col1;
run;
proc transpose data=temp out=want(drop=_:) prefix=id;
by id;
var _name_;
run;

View solution in original post

ChrisNZ · Posted 07-12-2016 12:59 AM

that's one weird puzzle.

why is d3 sometimes followed by f4 and sometimes by g5?

High-Performance SAS Coding - Third Edition

Ksharp · Posted 07-12-2016 01:07 AM

data have;
infile cards truncover expandtabs;
input ID $	a1	s2	d3	f4	g5;
cards;
a1	0	2.56217	3.36783	3.53842	3.00214
s2	2.56217	0	2.87413	4.1333	3.60389
d3	3.36783	2.87413	0	2.46764	3.39893
f4	3.53842	4.1333	2.46764	0	2.97952
g5	3.00214	3.60389	3.39893	2.97952	0
;
run;


proc transpose data=have out=temp;
by id notsorted;
run;
proc sort data=temp;
by id col1;
run;
proc transpose data=temp out=want(drop=_:) prefix=id;
by id;
var _name_;
run;

deega · Posted 07-12-2016 01:10 AM

Wow ! Its done in just few lines and I was thinking about a loop. Thanks a lot !

Ksharp · Posted 07-12-2016 01:19 AM

Here is IML code , if you like it . It would be faster than PROC TRANSPOSE.

data have;
infile cards truncover expandtabs;
input ID $	a1	s2	d3	f4	g5;
cards;
a1	0	2.56217	3.36783	3.53842	3.00214
s2	2.56217	0	2.87413	4.1333	3.60389
d3	3.36783	2.87413	0	2.46764	3.39893
f4	3.53842	4.1333	2.46764	0	2.97952
g5	3.00214	3.60389	3.39893	2.97952	0
;
run;


proc iml;
use have;
read all var _num_ into x[c=vnames r=id];
close;

want=j(nrow(x),ncol(x),blankstr(nleng(vnames)));
do i=1 to nrow(x);
 temp=t(x[i,]);
 call sortndx(ndx,temp,1); 
 want[i,]=t(vnames[ndx]);
end;

names='id1':'id'+char(ncol(x));
create want from want[c=names r=id];
append from want[r=id];
close;
quit;

deega · Posted 07-12-2016 02:00 AM

Actually I have big data of around 200,0000 rows. This step will be in between the whole logic and the imput would be a square matrix of 200,0000 rows & columns. So what do you suggest in this suggestion. Transpose or IML ? Also could you advice some tutorial/ documentation which can help me in solving such things on my own ...?

Ksharp · Posted 07-12-2016 02:16 AM

I would recommend to use IML, if you could make memory bigger by set option -memsize 20G in sasconfv9 file.

If you don't have enough memory , try the following code could give you more fast .

data have;
infile cards truncover expandtabs;
input ID $	a1	s2	d3	f4	g5;
cards;
a1	0	2.56217	3.36783	3.53842	3.00214
s2	2.56217	0	2.87413	4.1333	3.60389
d3	3.36783	2.87413	0	2.46764	3.39893
f4	3.53842	4.1333	2.46764	0	2.97952
g5	3.00214	3.60389	3.39893	2.97952	0
;
run;


data want;
 set have;
 array temp{5};
 array have{5} a1	s2	d3	f4	g5;
 array id_{5} $ 32;
 do i=1 to dim(have);
  temp{i}=have{i};
 end;
 call sortn(of temp{*});
 do i=1 to dim(have);
  id_{i}=vname(have{whichn(temp{i},of have{*})});
 end;
 drop i temp: a1	s2	d3	f4	g5;
run;

Ksharp · Posted 07-12-2016 08:46 PM

Hi, If there are duplicated value in an obs, my data step might not work for you .

Ksharp · Posted 07-12-2016 09:31 PM

Actually IML can handle BIG data .

data have;
infile cards truncover expandtabs;
input ID $	a1	s2	d3	f4	g5;
cards;
a1	0	2.56217	3.36783	3.53842	3.00214
s2	2.56217	0	2.87413	4.1333	3.60389
d3	3.36783	2.87413	0	2.46764	3.39893
f4	3.53842	4.1333	2.46764	0	2.97952
g5	3.00214	3.60389	3.39893	2.97952	0
a11	0	2.56217	3.36783	3.53842	3.00214
s12	2.56217	0	2.87413	4.1333	3.60389
d13	3.36783	2.87413	0	2.46764	3.39893
f14	3.53842	4.1333	2.46764	0	2.97952
g15	3.00214	3.60389	3.39893	2.97952	0
a21	0	2.56217	3.36783	3.53842	3.00214
s22	2.56217	0	2.87413	4.1333	3.60389
d23	3.36783	2.87413	0	2.46764	3.39893
f24	3.53842	4.1333	2.46764	0	2.97952
g25	3.00214	3.60389	3.39893	2.97952	0
;
run;



proc iml;
want=j(1,5,blankstr(32));
id=blankstr(32);
use have;
create want from want[c={ a1 s2 d3 f4 g5} r=id];

setin have;
setout want;

do data;
 read next var _num_ into x[c=vnames r=id];
 call sortndx(ndx,x`,1); 
 want=t(vnames[ndx]);
 
 append from want[r=id];
end;

close have want;
quit;

deega · Posted 07-12-2016 09:37 PM

Thankyou Xia Keshan. It was a great help. However, it will take good amount of time in testing these codes on actual data, currently I am just trying on Dummy data. I dont know how feasible it will be on actual data, if required I will make clusters to reduce the data. I will inform you which one finally works. I have used proc transpose but IML is new to me. I tried it once and the output was not coming in the form of SAS Table so I did not try the same. Is it possible to save the output from IML in SAS table ?

Ksharp · Posted 07-12-2016 10:27 PM

Yes. That is what I am trying to do. Use the following code to save output into a SAS Table WANT .

create want from want;
append from want;
close;

Ksharp · Posted 07-12-2016 11:37 PM

HaHa, I found the following code could handle duplicate value as long as there are no missing value .

data have;
infile cards truncover expandtabs;
input ID $	a1	s2	d3	f4	g5;
cards;
a1	0	2.56217	3.36783	3.53842	3.00214
s2	2.56217	0	2.87413	4.1333	3.60389
d3	3.36783	2.87413	0	2.46764	3.39893
f4	3.53842	4.1333	2.46764	0	2.97952
g5	3	    3	3.39893	2.97952	0
;
run;


data want;
 set have;
 array have{5} a1	s2	d3	f4	g5;
 array id_{5} $ 32;
 do i=1 to dim(have);
  min=min(of have{*});
  idx=whichn(min,of have{*});
  id_{i}=vname(have{idx});
  have{idx}=.;
 end;
 drop idx min i a1 s2 d3 f4 g5;
run;

deega · Posted 07-21-2016 10:59 PM

Hi Xia,
Thanks for your support.
I am trying this program on big data but now the problem is while defining array its not possible to write the column names like as you did in the following statement
array have{5} a1 s2 d3 f4 g5;
Now, in first test phase, columns are 100 and that too not in any particular order, could you suggest anything ??

Ksharp · Posted 07-21-2016 11:48 PM

Make a macro variable to hold these variables name .
OR you could try my IML code .



data have;
infile cards truncover expandtabs;
input ID $	a1	s2	d3	f4	g5;
cards;
a1	0	2.56217	3.36783	3.53842	3.00214
s2	2.56217	0	2.87413	4.1333	3.60389
d3	3.36783	2.87413	0	2.46764	3.39893
f4	3.53842	4.1333	2.46764	0	2.97952
g5	3	    3	3.39893	2.97952	0
;
run;
proc transpose data=have(obs=0) out=temp;
run;
proc sql noprint;
 select count(*) into : n from temp;
 select _name_ into : list separated by ' ' from temp;
quit;

data want;
 set have;
 array have{&n} &list ;
 array id_{&n} $ 32;
 do i=1 to dim(have);
  min=min(of have{*});
  idx=whichn(min,of have{*});
  id_{i}=vname(have{idx});
  have{idx}=.;
 end;
 keep id id_: ;
run;

ChrisNZ · Posted 07-21-2016 11:54 PM

The double-dash syntax allows you to list all the variables in a table in the order they are stored.

array have{*} a1 -- g5;


data t; 
  a1=2; b2=3;z5=4; g5=1;
run;
data tt; 
  set t ;
  array have{*} a1 -- g5;
  put have[3]=;
run;

z5=4

High-Performance SAS Coding - Third Edition

ID	id1	id2	id3	id4	id5
a 1	a1	s2	g5	d3	f4
s 2	s2	a1	d3	g5	f4
d 3	d3	f4	s2	a1	g5
f 4	f4	d3	g5	a1	s2
g 5	g5	f4	a1	d3	s2

ID	id1	id2	id3	id4	id5
a 1	a1	s2	g5	d3	f4
s 2	s2	a1	d3	g5	f4
d 3	d3	f4	s2	a1	g5
f 4	f4	d3	g5	a1	s2
g 5	g5	f4	a1	d3	s2

Classroom Training Available!

ID	id1	id2	id3	id4	id5
a 1	a1	s2	g5	d3	f4
s 2	s2	a1	d3	g5	f4
d 3	d3	f4	s2	a1	g5
f 4	f4	d3	g5	a1	s2
g 5	g5	f4	a1	d3	s2