Solved
Contributor
Posts: 40

# Collapsing and concatenating rows in data set

Hello!

I am hoping someone can help me.  I have the following data set below (this is just a sample of the dataset) and need the second dataset shown below.  Note that Count can range from 1 to 6 in my dataset.  Essentially if I capture the rows where count is the maximum for each value of ID, I will have what I need.  Any ideas on how to do that?

My SAS code:

data output;
set input;
if count ne 1 then AllDepts=catx(",",AllDepts,Dept);
else AllDepts=Dept;
retain AllDepts;
run;

 Dataset I have: Count ID Name Dept 1 123 Person 1 Dept A 2 123 Person 1 Dept C 3 123 Person 1 Dept E 1 456 Person 2 Dept B 2 456 Person 2 Dept C 3 456 Person 2 Dept F 4 456 Person 2 Dept Q 1 567 Person 3 Dept B Dataset I need: Count ID Name AllDepts 3 123 Person 1 Dept A, Dept C, Dept E 4 456 Person 2 Dept B, Dept C, Dept F, Dept Q 1 567 Person 3 Dept B Dataset I am getting: Count ID Name AllDepts 1 123 Person 1 Dept A 2 123 Person 1 Dept A, Dept C 3 123 Person 1 Dept A, Dept C, Dept E 1 456 Person 2 Dept B 2 456 Person 2 Dept B, Dept C 3 456 Person 2 Dept B, Dept C, Dept F 4 456 Person 2 Dept B, Dept C, Dept F, Dept Q 1 567 Person 3 Dept B

Thanks,

KelseyB

Accepted Solutions
Solution
‎11-10-2015 09:24 AM
Posts: 1,242

## Re: Collapsing and concatenating rows in data set

Isn't it necessary to declare AllDepts as a character variable (assuming that it is not contained in the input dataset)? If I run either of your data steps, my SAS complains about "invalid numeric data" and makes AllDepts a numeric variable.

My solution would be:

``````proc sort data=input;
by ID count;
run;

data output;
do until(last.ID);
set input;
length AllDepts \$46; /* to be adapted depending on max. length of Dept */
by ID;
AllDepts=catx(', ', AllDepts, Dept);
end;
drop Dept;
run;
``````

This applies the "DOW loop" technique (cf. http://www2.sas.com/proceedings/sugi28/099-28.pdf). The purpose of the unusual position of the (declarative) LENGTH statement is just to obtain the desired column order.

All Replies
Super User
Posts: 23,663

## Re: Collapsing and concatenating rows in data set

Use BY group processing.

``````proc sort data=input;
by ID count;
run;

data output;
set input;
BY ID;

retain AllDepts;
if not first.ID then AllDepts=catx(",",AllDepts,Dept);
else AllDepts=Dept;

if last.ID then output;

run;``````
Contributor
Posts: 40

## Re: Collapsing and concatenating rows in data set

Reeza,

Thank you for your help with this.  Unfortunately, I got an empty dataset when I ran this.  It may have been something I did because I dummied my variables to send in the code, however.  I got a solution from another contributor, but I appreciate your quick response.

Thanks again,

KelseyB

Solution
‎11-10-2015 09:24 AM
Posts: 1,242

## Re: Collapsing and concatenating rows in data set

Isn't it necessary to declare AllDepts as a character variable (assuming that it is not contained in the input dataset)? If I run either of your data steps, my SAS complains about "invalid numeric data" and makes AllDepts a numeric variable.

My solution would be:

``````proc sort data=input;
by ID count;
run;

data output;
do until(last.ID);
set input;
length AllDepts \$46; /* to be adapted depending on max. length of Dept */
by ID;
AllDepts=catx(', ', AllDepts, Dept);
end;
drop Dept;
run;
``````

This applies the "DOW loop" technique (cf. http://www2.sas.com/proceedings/sugi28/099-28.pdf). The purpose of the unusual position of the (declarative) LENGTH statement is just to obtain the desired column order.

Contributor
Posts: 40

## Re: Collapsing and concatenating rows in data set

FreelanceReinhard,

You are correct.  I accidentally left out the length statement for AllDepts when I copied my code.  It was actually in my program though so I didn't get an error.

This works perfectly!  Thank you so much for your help.

Best regards,

KelseyB

Posts: 4,736

## Re: Collapsing and concatenating rows in data set

Something like below should do:

``````data have;
infile datalines dlm=',' dsd truncover;
input (Count ID Name Dept) (\$);
datalines;
1,123,Person 1,Dept A
2,123,Person 1,Dept C
3,123,Person 1,Dept E
1,456,Person 2,Dept B
2,456,Person 2,Dept C
3,456,Person 2,Dept F
4,456,Person 2,Dept Q
1,567,Person 3,Dept B
;
run;

data want(keep=Count ID Name AllDepts);
set have;
by id count;
length AllDepts \$100;
retain AllDepts;
AllDepts=catx(', ',AllDepts,Dept);
if last.id then
do;
output;
call missing(AllDepts);
end;
run;``````
Contributor
Posts: 40

## Re: Collapsing and concatenating rows in data set

Thanks Patrick.  This works perfectly.  I had already accepted another respondents' answer, but I greatly appreciate this.  Two ways to do something is better than none

Best Regards,

KelseyB

🔒 This topic is solved and locked.