I have the follwing table
ID | a1 | s2 | d3 | f4 | g5 |
a 1 | 0 | 2.56217 | 3.36783 | 3.53842 | 3.00214 |
s 2 | 2.56217 | 0 | 2.87413 | 4.1333 | 3.60389 |
d 3 | 3.36783 | 2.87413 | 0 | 2.46764 | 3.39893 |
f 4 | 3.53842 | 4.1333 | 2.46764 | 0 | 2.97952 |
g 5 | 3.00214 | 3.60389 | 3.39893 | 2.97952 | 0 |
I want to sort this table rowwise(or may be columnwise) but instead of the observation value I need the variable name as output, something like below
ID | id1 | id2 | id3 | id4 | id5 |
a 1 | a1 | s2 | g5 | d3 | f4 |
s 2 | s2 | a1 | d3 | g5 | f4 |
d 3 | d3 | f4 | s2 | a1 | g5 |
f 4 | f4 | d3 | g5 | a1 | s2 |
g 5 | g5 | f4 | a1 | d3 | s2 |
Some more details:
Considering only the first row of the table
ID | a1 | s2 | d3 | f4 | g5 |
a 1 | 0 | 2.56217 | 3.36783 | 3.53842 | 3.00214 |
Sorting It alonwith variable names
ID | a1 | s2 | d3 | f4 | g5 |
a 1 | 0 | 2.56217 | 3.36783 | 3.53842 | 3.00214 |
Resulting into
ID | a1 | s2 | g5 | d3 | f4 |
a 1 | 0 | 2.56217 | 3.00214 | 3.36783 | 3.53842 |
Now taking only the variable names
ID | |||||
a 1 | a1 | s2 | g5 | d3 | f4 |
Doing it for all the rows is my desired output
data have;
infile cards truncover expandtabs;
input ID $ a1 s2 d3 f4 g5;
cards;
a1 0 2.56217 3.36783 3.53842 3.00214
s2 2.56217 0 2.87413 4.1333 3.60389
d3 3.36783 2.87413 0 2.46764 3.39893
f4 3.53842 4.1333 2.46764 0 2.97952
g5 3.00214 3.60389 3.39893 2.97952 0
;
run;
proc transpose data=have out=temp;
by id notsorted;
run;
proc sort data=temp;
by id col1;
run;
proc transpose data=temp out=want(drop=_:) prefix=id;
by id;
var _name_;
run;
that's one weird puzzle.
why is d3 sometimes followed by f4 and sometimes by g5?
data have;
infile cards truncover expandtabs;
input ID $ a1 s2 d3 f4 g5;
cards;
a1 0 2.56217 3.36783 3.53842 3.00214
s2 2.56217 0 2.87413 4.1333 3.60389
d3 3.36783 2.87413 0 2.46764 3.39893
f4 3.53842 4.1333 2.46764 0 2.97952
g5 3.00214 3.60389 3.39893 2.97952 0
;
run;
proc transpose data=have out=temp;
by id notsorted;
run;
proc sort data=temp;
by id col1;
run;
proc transpose data=temp out=want(drop=_:) prefix=id;
by id;
var _name_;
run;
Wow ! Its done in just few lines and I was thinking about a loop. Thanks a lot !
Here is IML code , if you like it . It would be faster than PROC TRANSPOSE.
data have;
infile cards truncover expandtabs;
input ID $ a1 s2 d3 f4 g5;
cards;
a1 0 2.56217 3.36783 3.53842 3.00214
s2 2.56217 0 2.87413 4.1333 3.60389
d3 3.36783 2.87413 0 2.46764 3.39893
f4 3.53842 4.1333 2.46764 0 2.97952
g5 3.00214 3.60389 3.39893 2.97952 0
;
run;
proc iml;
use have;
read all var _num_ into x[c=vnames r=id];
close;
want=j(nrow(x),ncol(x),blankstr(nleng(vnames)));
do i=1 to nrow(x);
temp=t(x[i,]);
call sortndx(ndx,temp,1);
want[i,]=t(vnames[ndx]);
end;
names='id1':'id'+char(ncol(x));
create want from want[c=names r=id];
append from want[r=id];
close;
quit;
Actually I have big data of around 200,0000 rows. This step will be in between the whole logic and the imput would be a square matrix of 200,0000 rows & columns. So what do you suggest in this suggestion. Transpose or IML ? Also could you advice some tutorial/ documentation which can help me in solving such things on my own ...?
I would recommend to use IML, if you could make memory bigger by set option -memsize 20G in sasconfv9 file.
If you don't have enough memory , try the following code could give you more fast .
data have;
infile cards truncover expandtabs;
input ID $ a1 s2 d3 f4 g5;
cards;
a1 0 2.56217 3.36783 3.53842 3.00214
s2 2.56217 0 2.87413 4.1333 3.60389
d3 3.36783 2.87413 0 2.46764 3.39893
f4 3.53842 4.1333 2.46764 0 2.97952
g5 3.00214 3.60389 3.39893 2.97952 0
;
run;
data want;
set have;
array temp{5};
array have{5} a1 s2 d3 f4 g5;
array id_{5} $ 32;
do i=1 to dim(have);
temp{i}=have{i};
end;
call sortn(of temp{*});
do i=1 to dim(have);
id_{i}=vname(have{whichn(temp{i},of have{*})});
end;
drop i temp: a1 s2 d3 f4 g5;
run;
Hi, If there are duplicated value in an obs, my data step might not work for you .
Actually IML can handle BIG data .
data have;
infile cards truncover expandtabs;
input ID $ a1 s2 d3 f4 g5;
cards;
a1 0 2.56217 3.36783 3.53842 3.00214
s2 2.56217 0 2.87413 4.1333 3.60389
d3 3.36783 2.87413 0 2.46764 3.39893
f4 3.53842 4.1333 2.46764 0 2.97952
g5 3.00214 3.60389 3.39893 2.97952 0
a11 0 2.56217 3.36783 3.53842 3.00214
s12 2.56217 0 2.87413 4.1333 3.60389
d13 3.36783 2.87413 0 2.46764 3.39893
f14 3.53842 4.1333 2.46764 0 2.97952
g15 3.00214 3.60389 3.39893 2.97952 0
a21 0 2.56217 3.36783 3.53842 3.00214
s22 2.56217 0 2.87413 4.1333 3.60389
d23 3.36783 2.87413 0 2.46764 3.39893
f24 3.53842 4.1333 2.46764 0 2.97952
g25 3.00214 3.60389 3.39893 2.97952 0
;
run;
proc iml;
want=j(1,5,blankstr(32));
id=blankstr(32);
use have;
create want from want[c={ a1 s2 d3 f4 g5} r=id];
setin have;
setout want;
do data;
read next var _num_ into x[c=vnames r=id];
call sortndx(ndx,x`,1);
want=t(vnames[ndx]);
append from want[r=id];
end;
close have want;
quit;
Thankyou Xia Keshan. It was a great help. However, it will take good amount of time in testing these codes on actual data, currently I am just trying on Dummy data. I dont know how feasible it will be on actual data, if required I will make clusters to reduce the data. I will inform you which one finally works. I have used proc transpose but IML is new to me. I tried it once and the output was not coming in the form of SAS Table so I did not try the same. Is it possible to save the output from IML in SAS table ?
Yes. That is what I am trying to do. Use the following code to save output into a SAS Table WANT .
create want from want;
append from want;
close;
HaHa, I found the following code could handle duplicate value as long as there are no missing value .
data have;
infile cards truncover expandtabs;
input ID $ a1 s2 d3 f4 g5;
cards;
a1 0 2.56217 3.36783 3.53842 3.00214
s2 2.56217 0 2.87413 4.1333 3.60389
d3 3.36783 2.87413 0 2.46764 3.39893
f4 3.53842 4.1333 2.46764 0 2.97952
g5 3 3 3.39893 2.97952 0
;
run;
data want;
set have;
array have{5} a1 s2 d3 f4 g5;
array id_{5} $ 32;
do i=1 to dim(have);
min=min(of have{*});
idx=whichn(min,of have{*});
id_{i}=vname(have{idx});
have{idx}=.;
end;
drop idx min i a1 s2 d3 f4 g5;
run;
Make a macro variable to hold these variables name . OR you could try my IML code . data have; infile cards truncover expandtabs; input ID $ a1 s2 d3 f4 g5; cards; a1 0 2.56217 3.36783 3.53842 3.00214 s2 2.56217 0 2.87413 4.1333 3.60389 d3 3.36783 2.87413 0 2.46764 3.39893 f4 3.53842 4.1333 2.46764 0 2.97952 g5 3 3 3.39893 2.97952 0 ; run; proc transpose data=have(obs=0) out=temp; run; proc sql noprint; select count(*) into : n from temp; select _name_ into : list separated by ' ' from temp; quit; data want; set have; array have{&n} &list ; array id_{&n} $ 32; do i=1 to dim(have); min=min(of have{*}); idx=whichn(min,of have{*}); id_{i}=vname(have{idx}); have{idx}=.; end; keep id id_: ; run;
The double-dash syntax allows you to list all the variables in a table in the order they are stored.
array have{*} a1 -- g5;
data t;
a1=2; b2=3;z5=4; g5=1;
run;
data tt;
set t ;
array have{*} a1 -- g5;
put have[3]=;
run;
z5=4
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.