want:
column1
1
5,11,12,13
2
1
actual data for column1:
column1
1,1
5,11,12,13,5,11,12
2,2,2
1,1,1,1
Basic data modelling principles say that you shouldn't keep multiple values in the same column.
Your post is obviously test data, but what is the real-life example?
My approach would be to scan each column value and output every element to a new row. Then you can remove duplicates using PROC SORT with NODUPRECS or NODUPKEY, or SQL with SELECT DISTINCT.
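A minimal sketch of that idea using the SELECT DISTINCT route, assuming the test data from the question. The names HAVE, LONG, DISTINCT_VALUES, ROW_ID and ELEMENT are made up for illustration, and ROW_ID is added only so that values from different rows stay separate:

```sas
/* Test data from the question (hypothetical dataset name HAVE) */
data have;
  infile datalines truncover;
  input column $256.;
  datalines;
1,1
5,11,12,13,5,11,12
2,2,2
1,1,1,1
;
run;

/* One output row per comma-separated element */
data long;
  set have;
  length element $ 20;
  row_id = _n_;                      /* remembers which input row the element came from */
  do k = 1 to countw(column, ",");
    element = scan(column, k, ",");
    output;
  end;
  keep row_id element;
run;

/* Drop repeated elements within each input row */
proc sql;
  create table distinct_values as
  select distinct row_id, element
  from long
  order by row_id, element;
quit;
```

ELEMENT is a character variable here, so the ORDER BY sorts it as text rather than numerically. If the original order of first appearance matters, keep the loop index K as well and sort on that instead.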
Try this:
data have;
infile cards4 dlm = '0a0d'x;
input column : $ 256.;
cards4;
1,1
5,11,12,13,5,11,12
2,2,2
1,1,1,1
;;;;
run;
proc print;
run;
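data want;
  length element $ 20;
  /* Hash keyed on element; ordered:"A" returns keys in ascending order. */
  /* Keys are character, so they sort as text: "11" sorts before "5".    */
  declare hash H(ordered:"A");
  H.defineKey("element");
  H.defineDone();
  declare hiter I("H");
  do until(eof);
    set have end=eof;
    H.clear();  /* start each row with an empty hash */
    if " " ne column then
      do k=1 to countw(column,",");
        element = scan(column,k,",");
        rc = H.add();  /* nonzero rc: key already present, duplicate is skipped */
      end;
    /* rebuild column from the distinct keys */
    call missing(column);
    rc = I.first();
    do while (rc = 0);
      column=catx(",",column,element);
      rc = I.next();
    end;
    output;
  end;
  stop;
  keep column;
run;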
proc print;
run;
Bart
One more ("classic") approach:
data have;
infile cards4 dlm = '0a0d'x;
input column : $ 256.;
cards4;
1,1
5,11,12,13,5,11,12
2,2,2
1,1,1,1
;;;;
run;
proc print;
run;
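/* Step 1: one row per element, tagged with the source row id */
data want0;
  set have end=eof;
  id+1;
  if " " ne column then
    do k=1 to countw(column,",");
      element = scan(column,k,",");
      output;
    end;
  keep id element;
run;
/* Step 2: drop duplicate elements within each id (elements sort as text) */
proc sort data = want0 nodupkey;
  by id element;
run;
/* Step 3: concatenate the remaining elements back into one value per id */
data want0;
  set want0;
  by id;
  length column $ 256;
  retain column;
  if first.ID then column="";
  column=catx(",",column,element);
  if last.ID then output;
  keep column;
run;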
proc print;
run;
Bart
Try this
data have;
infile datalines4;
input column : $ 20.;
datalines4;
1,1
5,11,12,13,5,11,12
2,2,2
1,1,1,1
;;;;
data want(keep=column newstring);
set have;
newstring=scan(column, 1, ',');
do i=2 to countw(column,',');
word=scan(column, i, ',');
found=findw(newstring, word, ',', 'it'); /* whole-item match, so "1" is not found inside "11" */
if found=0 then newstring=catx(',', newstring, word);
end;
run;
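One thing to check with this kind of lookup is whether the search matches substrings or whole items. FIND looks for a substring anywhere in the string, while FINDW only matches a complete delimited word. A small sketch with literal values (ITEMS and WORD are illustrative names, not part of the data above):

```sas
/* Illustrative only: compares substring search with whole-word search */
data _null_;
  items = "5,11,12";
  word  = "1";
  substring_pos = find(items, word);        /* greater than 0: "1" occurs inside "11" */
  word_pos      = findw(items, word, ",");  /* 0: no comma-delimited item equals "1" */
  put substring_pos= word_pos=;
run;
```

With a substring search, a value such as 1 arriving after 11 would be treated as already present and dropped, which does not happen with the sample rows but can happen with other data.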
data have;
infile datalines4;
input column : $ 20.;
datalines4;
1,1
5,11,12,13,5,11,12
2,2,2
1,1,1,1
;;;;
data want;
set have;
array x{999} $ 80;
do i=1 to countw(column,',');
temp=scan(column,i,',');
if temp not in x then x{i}=temp;
end;
want=catx(',',of x{*});
keep column want;
run;