Good day Experts,
A
I created a sample dataset "Have." All the name are seperated with comma in the 'Name' column, however the names in the rows are varous. I would like to list all the nonDuplicated Name shown the 'Want' result. Please let me know how to procede. Thank you.
data Have; length Name $100 ; infile datalines delimiter='/'; input Name; datalines; John,Marry,Jacob, ,/ ,Ana,Marry,/ Jacob,John,/ Will,Tom,Micheal/ ; data Want; length NameNoDup $100 ; infile datalines delimiter='/'; input NameNoDup; datalines; John/ Marry/ Jacob/ Ana/ Will/ Tom/ Micheal ;
data Have;
length Name $100 ;
infile datalines delimiter='/';
input Name;
datalines;
John,Marry,Jacob, ,/
,Ana,Marry,/
Jacob,John,/
Will,Tom,Micheal/
;
data want;
set have;
do i=1 to countw(name,',');
newname=scan(name,i,',');
output;
end;
run;
proc sql;
create table want2 as select distinct newname
from want where not missing(newname);
run;
This raises the question, why are you creating data sets in such an unusual format? You would make your life much easier by creating a data set with one name per line, rather than many names per line.
data Have;
length Name $100 ;
infile datalines delimiter='/';
input Name;
datalines;
John,Marry,Jacob, ,/
,Ana,Marry,/
Jacob,John,/
Will,Tom,Micheal/
;
data want;
set have;
do i=1 to countw(name,',');
newname=scan(name,i,',');
output;
end;
run;
proc sql;
create table want2 as select distinct newname
from want where not missing(newname);
run;
This raises the question, why are you creating data sets in such an unusual format? You would make your life much easier by creating a data set with one name per line, rather than many names per line.
@ybz12003 wrote:
Good day Experts,
A
I created a sample dataset "Have." All the name are seperated with comma in the 'Name' column, however the names in the rows are varous. I would like to list all the nonDuplicated Name shown the 'Want' result. Please let me know how to procede. Thank you.
data Have; length Name $100 ; infile datalines delimiter='/'; input Name; datalines; John,Marry,Jacob, ,/ ,Ana,Marry,/ Jacob,John,/ Will,Tom,Micheal/ ; data Want; length NameNoDup $100 ; infile datalines delimiter='/'; input NameNoDup; datalines; John/ Marry/ Jacob/ Ana/ Will/ Tom/ Micheal ;
If the order isn't particularly critical
data need; set have; do i=1 to countw(name); onename = scan(name,i); if not missing(name) then output; end; drop i name; run; proc sql; create table want as select distinct onename as name from need ; quit;
Default output is likely to be alphabetical by name.
data Have;
length Name $100 ;
infile datalines delimiter=',/' ;
input Name @@;
datalines;
John,Marry,Jacob, ,/
,Ana,Marry,/
Jacob,John,/
Will,Tom,Micheal/
;
proc sort data=have nodupkey;
where not missing(compress(trim(name), , 's'));
by name;
run;
proc print data=have;
run;
Read them directly into individual values?
data Have;
length Name $100 ;
infile datalines delimiter='/';
input Name;
datalines;
John,Marry,Jacob, ,/
,Ana,Marry,/
Jacob,John,/
Will,Tom,Micheal/
;
data _null_;
if _n_=1 then do;
dcl hash h();
h.definekey('newname');
h.definedone();
end;
set have end=z;
do _n_=1 to countw(name,',');
newname=scan(name,_n_,',');
if newname>' ' then rc=h.add();
end;
if z;
h.output(dataset:'want');
run;
proc print noobs;run;
Hi all, I tried all your coding with my actual dataset. For some reason, PaigeMiller's code works better. Thank you so much for all of your time and effort.
Catch the best of SAS Innovate 2025 — anytime, anywhere. Stream powerful keynotes, real-world demos, and game-changing insights from the world’s leading data and AI minds.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.