BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
AviS
Fluorite | Level 6

I have a SAS dataset with 3 columns that could have one or more values separated by comma. I would have to add a new column with distinct values in 3 columns separated by comma in a new field "Final.

 

Have:

IDY2017Y2018Y2019
aa2, a3a2,a3a4
ba2 a4
ca1, a2a2,a3a4,a3

 

Want:

IDY2017Y2018Y2019Final
aa2, a3a2,a3a4a2, a3, a4
ba2 a4a2, a4
ca1, a2a2,a3a4,a3a1, a2, a3, a4

 

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20
data have;
infile cards expandtabs truncover;
input (ID	Y2017	Y2018	Y2019) (:$10.);
cards;
a	a2,a3	a2,a3	a4
b	a2	 .	a4 .
c	a1,a2	a2,a3	a4,a3
;

data want;
 set have;
 length t1 t2 final $1000;
 array y Y2017	Y2018	Y2019;
 t1=catx(',',of y(*));
 do _n_=1 to countw(t1,', ');
  t2=scan(t1,_n_,',');
  if not index(final,strip(t2)) then final=catx(',',final,t2);
 end;
 drop t:;
run;

View solution in original post

3 REPLIES 3
ballardw
Super User

Did you do something to force the multiple values per cell?

If so you may find it easier to back before that step was done and start over.

 

Also, is the final order at all important? Will you be doing something where you need to treat a1,a2,a3 the same as a3,a1,a2?

 

 

novinosrin
Tourmaline | Level 20
data have;
infile cards expandtabs truncover;
input (ID	Y2017	Y2018	Y2019) (:$10.);
cards;
a	a2,a3	a2,a3	a4
b	a2	 .	a4 .
c	a1,a2	a2,a3	a4,a3
;

data want;
 set have;
 length t1 t2 final $1000;
 array y Y2017	Y2018	Y2019;
 t1=catx(',',of y(*));
 do _n_=1 to countw(t1,', ');
  t2=scan(t1,_n_,',');
  if not index(final,strip(t2)) then final=catx(',',final,t2);
 end;
 drop t:;
run;
LeonidBatkhan
Lapis Lazuli | Level 10

Hi AviS,

Here is a possible solution to your problem:

data A;
   length ID $2 Y2017-Y2019 $10;
   input ID 1-2 Y2017 4-10 Y2018 13-18 Y2019 20-25;
   datalines;
a	a2, a3	a2,a3	 a4
b	a2	 	          a4
c	a1, a2	a2,a3	 a4,a3
;

data B (keep=ID VAL);
   set A;
   length ALL $200 VAL $10;
   ALL = catx(',',Y2017,Y2018,Y2019);
   do i=1 to countw(ALL,',');
      VAL = strip(scan(ALL,i,','));
      output;
   end;
run;

proc sort data=B out=C nodupkey;
   by ID VAL;
run;

data D (drop=VAL);
   merge A C;
   by ID;
   length FINAL $200;
   retain FINAL;
   if first.ID then FINAL = '';
   FINAL = catx(', ', FINAL,VAL);
   if last.ID then output;
run;

Run this code to make sure it works for you.

Is that what you want?

Best,

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 864 views
  • 2 likes
  • 4 in conversation