BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ybz12003
Rhodochrosite | Level 12

Good day Experts,

A

I created a sample dataset "Have."  All the name are seperated with comma in the 'Name' column, however the names in the rows are varous.   I would like to list all the nonDuplicated Name shown the 'Want' result.  Please let me know how to procede.    Thank you.

data Have;  
	length Name $100 ; 
	infile datalines delimiter='/'; 
	input Name;  
	datalines;                     
	John,Marry,Jacob, ,/
	,Ana,Marry,/
	Jacob,John,/
	Will,Tom,Micheal/
;   

data Want;  
	length NameNoDup $100 ; 
	infile datalines delimiter='/'; 
	input NameNoDup;  
	datalines;                     
	John/
	Marry/
	Jacob/
	Ana/
	Will/
	Tom/
	Micheal
;   
1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26
data Have;  
	length Name $100 ; 
	infile datalines delimiter='/'; 
	input Name;  
	datalines;                     
	John,Marry,Jacob, ,/
	,Ana,Marry,/
	Jacob,John,/
	Will,Tom,Micheal/
;   

data want;
    set have;
    do i=1 to countw(name,',');
        newname=scan(name,i,',');
        output;
    end;
run;
proc sql;
    create table want2 as select distinct newname
    from want where not missing(newname);
run;

This raises the question, why are you creating data sets in such an unusual format? You would make your life much easier by creating a data set with one name per line, rather than many names per line. 

--
Paige Miller

View solution in original post

5 REPLIES 5
PaigeMiller
Diamond | Level 26
data Have;  
	length Name $100 ; 
	infile datalines delimiter='/'; 
	input Name;  
	datalines;                     
	John,Marry,Jacob, ,/
	,Ana,Marry,/
	Jacob,John,/
	Will,Tom,Micheal/
;   

data want;
    set have;
    do i=1 to countw(name,',');
        newname=scan(name,i,',');
        output;
    end;
run;
proc sql;
    create table want2 as select distinct newname
    from want where not missing(newname);
run;

This raises the question, why are you creating data sets in such an unusual format? You would make your life much easier by creating a data set with one name per line, rather than many names per line. 

--
Paige Miller
ballardw
Super User

@ybz12003 wrote:

Good day Experts,

A

I created a sample dataset "Have."  All the name are seperated with comma in the 'Name' column, however the names in the rows are varous.   I would like to list all the nonDuplicated Name shown the 'Want' result.  Please let me know how to procede.    Thank you.

data Have;  
	length Name $100 ; 
	infile datalines delimiter='/'; 
	input Name;  
	datalines;                     
	John,Marry,Jacob, ,/
	,Ana,Marry,/
	Jacob,John,/
	Will,Tom,Micheal/
;   

data Want;  
	length NameNoDup $100 ; 
	infile datalines delimiter='/'; 
	input NameNoDup;  
	datalines;                     
	John/
	Marry/
	Jacob/
	Ana/
	Will/
	Tom/
	Micheal
;   

 

If the order isn't particularly critical

data need;
   set have;
   do i=1 to countw(name);
      onename = scan(name,i);
      if not missing(name) then output;
   end;
   drop i name;
run;

proc sql;
   create table want as
   select distinct onename as name
   from need
   ;
quit;

Default output is likely to be alphabetical by name.

Reeza
Super User
data Have;  
	length Name $100 ; 
	infile datalines delimiter=',/' ; 
	input Name @@;  
	datalines;                     
	John,Marry,Jacob, ,/
	,Ana,Marry,/
	Jacob,John,/
	Will,Tom,Micheal/
;  

proc sort data=have nodupkey;
where not missing(compress(trim(name), , 's'));
by name;
run;

proc print data=have;
run;

Read them directly into individual values?

novinosrin
Tourmaline | Level 20
data Have;  
	length Name $100 ; 
	infile datalines delimiter='/'; 
	input Name;  
	datalines;                     
	John,Marry,Jacob, ,/
	,Ana,Marry,/
	Jacob,John,/
	Will,Tom,Micheal/
;   

data _null_;
 if _n_=1 then do;
  dcl hash h();
  h.definekey('newname');
  h.definedone();
 end;
 set have end=z;
 do _n_=1 to countw(name,',');
   newname=scan(name,_n_,',');
   if newname>' ' then rc=h.add();
 end;
 if z;
 h.output(dataset:'want');
run;

proc print noobs;run;
ybz12003
Rhodochrosite | Level 12

Hi all, I tried all your coding with my actual dataset.  For some reason, PaigeMiller's code works better.  Thank you so much for all of your time and effort.

sas-innovate-white.png

Missed SAS Innovate in Orlando?

Catch the best of SAS Innovate 2025 — anytime, anywhere. Stream powerful keynotes, real-world demos, and game-changing insights from the world’s leading data and AI minds.

 

Register now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1139 views
  • 3 likes
  • 5 in conversation