BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
SukumarBalusamy
Calcite | Level 5
Col1Col2
11,5,4,3,2,6
221,3,2,1,5,15,17,3
31,2,3
44,3,2

 Dear Friends,

I have dataset like above, I would like to convert it like below. Can someone suggest me the easy way. Thanks in advance.

Col1 Col2 
11,2,3,4,5,6
21,2,3,3,5,15,17,21
31,2,3
42,3,4
1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Or use the "RESOLVE trick" (just for fun):

data want(drop=_ i);
set have;
_=col2;
col2=' ';
do i=1 to countw(_,',');
  col2=catx(',',col2,input(resolve(cats('%sysfunc(ordinal(',i,',',_,'))')),32.));
end;
run;

View solution in original post

9 REPLIES 9
Tom
Super User Tom
Super User

Normalize the data first.  

data tall;
  set have;
  length index value 8;
  do index=1 to max(1,countw(col2,','));
     value=input(scan(col2,index,','),32.);
      output;
  end;
run;

Tom_0-1752245353575.png

 

Then normal sorting works  (add NODUPKEY option to PROC SORT statement to eliminate duplicates).

proc sort data=tall ;
  by col1 value;
run;

If for some reason you need to recreate that comma delimited list run another data step.

data want;
  do until (last.col1);
    set tall;
    by col1;
    length new_col2 $200;
    new_col2=catx(',',new_col2,value);
  end;
  drop index value ;
run;

Tom_1-1752245383997.png

 

SukumarBalusamy
Calcite | Level 5
Thank you very much for the code
yabwon
Amethyst | Level 16

You could try combing "infile trick" and missing=" " option:

data have;
input Col1 Col2 :$ 200.;
cards;
1	1,5,4,3,2,6
2	21,3,2,1,5,15,17,3
3	1,2,3
4	4,3,2
;
run;
proc print data = have;
run;


/* create a fake temp file */
filename f temp;
data _null_;
  file f;
  put;
run;
options missing=" "; /* for CATX() to replace missing dot (.) with a space (" ") */

%let WN=20; /* max number of possible values on a list */

data want;
  set have; 
  infile f truncover dlm=",";
  input @1 @;
  _infile_ = Col2;
  input @1 (m1-m&WN.) (:) @@;

  array mm m1-m&WN.;
  put mm[*]=;
  call sortn(of m:);

  Col2 = catx(",",of m:);

  keep Col1 Col2;
run;
filename f clear;
options missing=.;

proc print data = want;
run;

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



FreelanceReinh
Jade | Level 19

Or use the "RESOLVE trick" (just for fun):

data want(drop=_ i);
set have;
_=col2;
col2=' ';
do i=1 to countw(_,',');
  col2=catx(',',col2,input(resolve(cats('%sysfunc(ordinal(',i,',',_,'))')),32.));
end;
run;
SukumarBalusamy
Calcite | Level 5
Thank you so much for code this worked, I like this solution very much.
Tom
Super User Tom
Super User

If you know the range of number you could use an array to collect and catx() regenerate the string. Make sure to set the MISSING option properly.

data have;
  input col1 col2 :$50. ;
cards;
1 1,5,4,3,2,6
2 21,3,2,1,5,15,17,3
3 1,2,3
4 4,3,2
5 1,2,junk,3
;

options missing=' ';
data want;
  set have;
  array x[100] ;
  do index=1 to countw(col2,',');
    value=input(scan(col2,index,','),??32.);
    if value in (1:100) then x[value]=value;
    else put 'WARNING: Invalid number at ' index= col1= col2=;
  end;
  col2=catx(',',of x:);
  drop index value x:;
run;
options missing='.';

Log

11   options missing=' ';
12   data want;
13     set have;
14     array x[100] ;
15     do index=1 to countw(col2,',');
16       value=input(scan(col2,index,','),??32.);
17       if value in (1:100) then x[value]=value;
18       else put 'WARNING: Invalid number at ' index= col1= col2=;
19     end;
20     col2=catx(',',of x:);
21     drop index value x:;
22   run;

WARNING: Invalid number at index=3 col1=5 col2=1,2,junk,3
NOTE: There were 5 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.WANT has 5 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds


23   options missing='.';

Results

Tom_0-1752245003304.png

 

 

mkeintz
PROC Star

You can transfer the components of the cell into a hash object with the "ordered" attribute.  Then iterate through them and concatenate each of them:

 

data have;
  input col1 col2 :$50. ;
cards;
1 1,5,4,3,2,6,3.5
2 21,3,2,1,5,15,17,3
3 1,2,3
4 4,3,2
5 1,2,junk,3
;

data want (drop=i _:);
  set have;
  if _n_=1 then do;
    declare hash h (ordered:'A',multidata:'Y');
      h.definekey('_x');
      h.definedata('_x');
      h.definedone();
    declare hiter hi ('h');
  end;

  do i=1 to countw(col2,',');
    _x=input(scan(col2,i,','),best12.);
    if _x^=. then h.add();
  end;
  length new_col $50;
  do while (hi.next()=0);
    new_col=catx(',',new_col,_x);
  end;
  h.clear();
run;
  
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
SukumarBalusamy
Calcite | Level 5
Thank you very much for generalized code
SukumarBalusamy
Calcite | Level 5
Thank you very much for your code

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 1186 views
  • 14 likes
  • 5 in conversation