Col1 | Col2 |
1 | 1,5,4,3,2,6 |
2 | 21,3,2,1,5,15,17,3 |
3 | 1,2,3 |
4 | 4,3,2 |
Dear Friends,
I have dataset like above, I would like to convert it like below. Can someone suggest me the easy way. Thanks in advance.
Col1 | Col2 |
1 | 1,2,3,4,5,6 |
2 | 1,2,3,3,5,15,17,21 |
3 | 1,2,3 |
4 | 2,3,4 |
Or use the "RESOLVE trick" (just for fun):
data want(drop=_ i);
set have;
_=col2;
col2=' ';
do i=1 to countw(_,',');
col2=catx(',',col2,input(resolve(cats('%sysfunc(ordinal(',i,',',_,'))')),32.));
end;
run;
Normalize the data first.
data tall;
set have;
length index value 8;
do index=1 to max(1,countw(col2,','));
value=input(scan(col2,index,','),32.);
output;
end;
run;
Then normal sorting works (add NODUPKEY option to PROC SORT statement to eliminate duplicates).
proc sort data=tall ;
by col1 value;
run;
If for some reason you need to recreate that comma delimited list run another data step.
data want;
do until (last.col1);
set tall;
by col1;
length new_col2 $200;
new_col2=catx(',',new_col2,value);
end;
drop index value ;
run;
You could try combing "infile trick" and missing=" " option:
data have;
input Col1 Col2 :$ 200.;
cards;
1 1,5,4,3,2,6
2 21,3,2,1,5,15,17,3
3 1,2,3
4 4,3,2
;
run;
proc print data = have;
run;
/* create a fake temp file */
filename f temp;
data _null_;
file f;
put;
run;
options missing=" "; /* for CATX() to replace missing dot (.) with a space (" ") */
%let WN=20; /* max number of possible values on a list */
data want;
set have;
infile f truncover dlm=",";
input @1 @;
_infile_ = Col2;
input @1 (m1-m&WN.) (:) @@;
array mm m1-m&WN.;
put mm[*]=;
call sortn(of m:);
Col2 = catx(",",of m:);
keep Col1 Col2;
run;
filename f clear;
options missing=.;
proc print data = want;
run;
Bart
Or use the "RESOLVE trick" (just for fun):
data want(drop=_ i);
set have;
_=col2;
col2=' ';
do i=1 to countw(_,',');
col2=catx(',',col2,input(resolve(cats('%sysfunc(ordinal(',i,',',_,'))')),32.));
end;
run;
If you know the range of number you could use an array to collect and catx() regenerate the string. Make sure to set the MISSING option properly.
data have;
input col1 col2 :$50. ;
cards;
1 1,5,4,3,2,6
2 21,3,2,1,5,15,17,3
3 1,2,3
4 4,3,2
5 1,2,junk,3
;
options missing=' ';
data want;
set have;
array x[100] ;
do index=1 to countw(col2,',');
value=input(scan(col2,index,','),??32.);
if value in (1:100) then x[value]=value;
else put 'WARNING: Invalid number at ' index= col1= col2=;
end;
col2=catx(',',of x:);
drop index value x:;
run;
options missing='.';
Log
11 options missing=' '; 12 data want; 13 set have; 14 array x[100] ; 15 do index=1 to countw(col2,','); 16 value=input(scan(col2,index,','),??32.); 17 if value in (1:100) then x[value]=value; 18 else put 'WARNING: Invalid number at ' index= col1= col2=; 19 end; 20 col2=catx(',',of x:); 21 drop index value x:; 22 run; WARNING: Invalid number at index=3 col1=5 col2=1,2,junk,3 NOTE: There were 5 observations read from the data set WORK.HAVE. NOTE: The data set WORK.WANT has 5 observations and 2 variables. NOTE: DATA statement used (Total process time): real time 0.00 seconds cpu time 0.00 seconds 23 options missing='.';
Results
You can transfer the components of the cell into a hash object with the "ordered" attribute. Then iterate through them and concatenate each of them:
data have;
input col1 col2 :$50. ;
cards;
1 1,5,4,3,2,6,3.5
2 21,3,2,1,5,15,17,3
3 1,2,3
4 4,3,2
5 1,2,junk,3
;
data want (drop=i _:);
set have;
if _n_=1 then do;
declare hash h (ordered:'A',multidata:'Y');
h.definekey('_x');
h.definedata('_x');
h.definedone();
declare hiter hi ('h');
end;
do i=1 to countw(col2,',');
_x=input(scan(col2,i,','),best12.);
if _x^=. then h.add();
end;
length new_col $50;
do while (hi.next()=0);
new_col=catx(',',new_col,_x);
end;
h.clear();
run;
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.