Combining the strings of unique items through a Data Step into one string per ID?

Reply
New Contributor PSL
New Contributor
Posts: 4

Combining the strings of unique items through a Data Step into one string per ID?

Hi,

 

I am trying to get the below list to show with each ID only once with all multiple rows of strings combined

 

From:

 

codestring
1a
1b
2e
3g
3h
4i
5k
5l
5m

 

The code I am using to get the initial data is

 

data work.blah;  
   infile datalines missover; 
   input ID $ string $;  
   datalines; 
1 a
1 b
2 e
3 g
3 h
4 i
5 k
5 l
5 m
;                         

 


To:

codelist
1a, b
2e
3g, h
4i
5k, l, m

 

The code I am using is

 

data strings;
length list $100.;
set work.blah end=eof;
retain list;
if n=1 then list=string;
else list = cats(list,", ",string);
if eof then output;
run;

 

Unfortunately the result I am getting is

listIDstringn
,a,b,e,g,h,i,k,l,m5m .

 

My questions are:

1) How can I get the unique IDs, and not just the last one (5)

2) How can I get the first string to start and not the ", "

3) How can I drop the 'string' and 'n' columns

 

 

For reference I am using SAS EG 6.1 M1

 

 

Super User
Posts: 23,700

Re: Combining the strings of unique items through a Data Step into one string per ID?

There’s two methods here with a fully worked example. 

 

https://gist.github.com/statgeek/d583cfa992bf56da51d435165b07e96a

New Contributor PSL
New Contributor
Posts: 4

Re: Combining the strings of unique items through a Data Step into one string per ID?

Thanks for the answer, could you explain the difference between the two solutions? Would there be any reason to choose the Data Step over the Proc SQL or vice versa?
Super User
Posts: 23,700

Re: Combining the strings of unique items through a Data Step into one string per ID?

If you have really big data and want only one pass with the data, the data step is a faster option. Or if you're calculating multiple things in the same data step you may want that approach. 

 


@PSL wrote:
Thanks for the answer, could you explain the difference between the two solutions? Would there be any reason to choose the Data Step over the Proc SQL or vice versa?

 

New Contributor PSL
New Contributor
Posts: 4

Re: Combining the strings of unique items through a Data Step into one string per ID?

Understood, thank you very much
PROC Star
Posts: 1,781

Re: Combining the strings of unique items through a Data Step into one string per ID?

data work.blah;  
   infile datalines missover; 
   input ID $ string $;  
   datalines; 
1 a
1 b
2 e
3 g
3 h
4 i
5 k
5 l
5 m
;   

data want;
set blah;
by id;
length list $50;
retain list ;
if first.id then list=string;
else list=catx(',',list,string);
if last.id;
drop string;
run;
New Contributor PSL
New Contributor
Posts: 4

Re: Combining the strings of unique items through a Data Step into one string per ID?

Posted in reply to novinosrin

Thanks, that worked perfect

Ask a Question
Discussion stats
  • 6 replies
  • 78 views
  • 0 likes
  • 3 in conversation