BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
AviS
Fluorite | Level 6

I have a SAS dataset with 3 columns that could have one or more values separated by comma. I would have to add a new column with distinct values in 3 columns separated by comma in a new field "Final.

 

Have:

IDY2017Y2018Y2019
aa2, a3a2,a3a4
ba2 a4
ca1, a2a2,a3a4,a3

 

Want:

IDY2017Y2018Y2019Final
aa2, a3a2,a3a4a2, a3, a4
ba2 a4a2, a4
ca1, a2a2,a3a4,a3a1, a2, a3, a4

 

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20
data have;
infile cards expandtabs truncover;
input (ID	Y2017	Y2018	Y2019) (:$10.);
cards;
a	a2,a3	a2,a3	a4
b	a2	 .	a4 .
c	a1,a2	a2,a3	a4,a3
;

data want;
 set have;
 length t1 t2 final $1000;
 array y Y2017	Y2018	Y2019;
 t1=catx(',',of y(*));
 do _n_=1 to countw(t1,', ');
  t2=scan(t1,_n_,',');
  if not index(final,strip(t2)) then final=catx(',',final,t2);
 end;
 drop t:;
run;

View solution in original post

3 REPLIES 3
ballardw
Super User

Did you do something to force the multiple values per cell?

If so you may find it easier to back before that step was done and start over.

 

Also, is the final order at all important? Will you be doing something where you need to treat a1,a2,a3 the same as a3,a1,a2?

 

 

novinosrin
Tourmaline | Level 20
data have;
infile cards expandtabs truncover;
input (ID	Y2017	Y2018	Y2019) (:$10.);
cards;
a	a2,a3	a2,a3	a4
b	a2	 .	a4 .
c	a1,a2	a2,a3	a4,a3
;

data want;
 set have;
 length t1 t2 final $1000;
 array y Y2017	Y2018	Y2019;
 t1=catx(',',of y(*));
 do _n_=1 to countw(t1,', ');
  t2=scan(t1,_n_,',');
  if not index(final,strip(t2)) then final=catx(',',final,t2);
 end;
 drop t:;
run;
LeonidBatkhan
Lapis Lazuli | Level 10

Hi AviS,

Here is a possible solution to your problem:

data A;
   length ID $2 Y2017-Y2019 $10;
   input ID 1-2 Y2017 4-10 Y2018 13-18 Y2019 20-25;
   datalines;
a	a2, a3	a2,a3	 a4
b	a2	 	          a4
c	a1, a2	a2,a3	 a4,a3
;

data B (keep=ID VAL);
   set A;
   length ALL $200 VAL $10;
   ALL = catx(',',Y2017,Y2018,Y2019);
   do i=1 to countw(ALL,',');
      VAL = strip(scan(ALL,i,','));
      output;
   end;
run;

proc sort data=B out=C nodupkey;
   by ID VAL;
run;

data D (drop=VAL);
   merge A C;
   by ID;
   length FINAL $200;
   retain FINAL;
   if first.ID then FINAL = '';
   FINAL = catx(', ', FINAL,VAL);
   if last.ID then output;
run;

Run this code to make sure it works for you.

Is that what you want?

Best,

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1084 views
  • 2 likes
  • 4 in conversation