Collapsing and concatenating rows in data set

Accepted Solution Solved
Reply
Contributor
Posts: 40
Accepted Solution

Collapsing and concatenating rows in data set

 

Hello!

I am hoping someone can help me.  I have the following data set below (this is just a sample of the dataset) and need the second dataset shown below.  Note that Count can range from 1 to 6 in my dataset.  Essentially if I capture the rows where count is the maximum for each value of ID, I will have what I need.  Any ideas on how to do that?

 

My SAS code:

 

data output;
set input;
if count ne 1 then AllDepts=catx(",",AllDepts,Dept);
else AllDepts=Dept;
retain AllDepts;
run;

 

Dataset I have:    
Count ID Name Dept
1 123 Person 1 Dept A
2 123 Person 1 Dept C
3 123 Person 1 Dept E
1 456 Person 2 Dept B
2 456 Person 2 Dept C
3 456 Person 2 Dept F
4 456 Person 2 Dept Q
1 567 Person 3 Dept B
       
Dataset I need:    
Count ID Name AllDepts
3 123 Person 1 Dept A, Dept C, Dept E
4 456 Person 2 Dept B, Dept C, Dept F, Dept Q
1 567 Person 3 Dept B
       
Dataset I am getting:  
Count ID Name AllDepts
1 123 Person 1 Dept A
2 123 Person 1 Dept A, Dept C
3 123 Person 1 Dept A, Dept C, Dept E
1 456 Person 2 Dept B
2 456 Person 2 Dept B, Dept C
3 456 Person 2 Dept B, Dept C, Dept F
4 456 Person 2 Dept B, Dept C, Dept F, Dept Q
1 567 Person 3 Dept B

 

Thanks,

KelseyB

 


Accepted Solutions
Solution
‎11-10-2015 09:24 AM
Trusted Advisor
Posts: 1,116

Re: Collapsing and concatenating rows in data set

Isn't it necessary to declare AllDepts as a character variable (assuming that it is not contained in the input dataset)? If I run either of your data steps, my SAS complains about "invalid numeric data" and makes AllDepts a numeric variable.

 

My solution would be:

proc sort data=input;
by ID count;
run;

data output;
do until(last.ID);
  set input;
  length AllDepts $46; /* to be adapted depending on max. length of Dept */
  by ID;
  AllDepts=catx(', ', AllDepts, Dept);
end;
drop Dept;
run;

This applies the "DOW loop" technique (cf. http://www2.sas.com/proceedings/sugi28/099-28.pdf). The purpose of the unusual position of the (declarative) LENGTH statement is just to obtain the desired column order.

View solution in original post


All Replies
Super User
Posts: 19,193

Re: Collapsing and concatenating rows in data set

Use BY group processing.

proc sort data=input;
by ID count;
run;

data output;
set input;
BY ID;

retain AllDepts;
if not first.ID then AllDepts=catx(",",AllDepts,Dept); 
else AllDepts=Dept;

if last.ID then output;

run;
Contributor
Posts: 40

Re: Collapsing and concatenating rows in data set

Reeza, 

Thank you for your help with this.  Unfortunately, I got an empty dataset when I ran this.  It may have been something I did because I dummied my variables to send in the code, however.  I got a solution from another contributor, but I appreciate your quick response.

 

Thanks again,

KelseyB

Solution
‎11-10-2015 09:24 AM
Trusted Advisor
Posts: 1,116

Re: Collapsing and concatenating rows in data set

Isn't it necessary to declare AllDepts as a character variable (assuming that it is not contained in the input dataset)? If I run either of your data steps, my SAS complains about "invalid numeric data" and makes AllDepts a numeric variable.

 

My solution would be:

proc sort data=input;
by ID count;
run;

data output;
do until(last.ID);
  set input;
  length AllDepts $46; /* to be adapted depending on max. length of Dept */
  by ID;
  AllDepts=catx(', ', AllDepts, Dept);
end;
drop Dept;
run;

This applies the "DOW loop" technique (cf. http://www2.sas.com/proceedings/sugi28/099-28.pdf). The purpose of the unusual position of the (declarative) LENGTH statement is just to obtain the desired column order.

Contributor
Posts: 40

Re: Collapsing and concatenating rows in data set

FreelanceReinhard,

You are correct.  I accidentally left out the length statement for AllDepts when I copied my code.  It was actually in my program though so I didn't get an error.  

This works perfectly!  Thank you so much for your help.

 

Best regards,

KelseyB

Respected Advisor
Posts: 4,138

Re: Collapsing and concatenating rows in data set

Something like below should do:

data have;
  infile datalines dlm=',' dsd truncover;
  input (Count ID Name Dept) ($);
  datalines;
1,123,Person 1,Dept A
2,123,Person 1,Dept C
3,123,Person 1,Dept E
1,456,Person 2,Dept B
2,456,Person 2,Dept C
3,456,Person 2,Dept F
4,456,Person 2,Dept Q
1,567,Person 3,Dept B
;
run;

data want(keep=Count ID Name AllDepts);
  set have;
  by id count;
  length AllDepts $100;
  retain AllDepts;
  AllDepts=catx(', ',AllDepts,Dept);
  if last.id then
    do;
      output;
      call missing(AllDepts);
    end;
run;
Contributor
Posts: 40

Re: Collapsing and concatenating rows in data set

Thanks Patrick.  This works perfectly.  I had already accepted another respondents' answer, but I greatly appreciate this.  Two ways to do something is better than noneSmiley Happy

 

Best Regards,

KelseyB

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 405 views
  • 2 likes
  • 4 in conversation