Hello!
I am hoping someone can help me. I have the following data set below (this is just a sample of the dataset) and need the second dataset shown below. Note that Count can range from 1 to 6 in my dataset. Essentially if I capture the rows where count is the maximum for each value of ID, I will have what I need. Any ideas on how to do that?
My SAS code:
data output;
set input;
if count ne 1 then AllDepts=catx(",",AllDepts,Dept);
else AllDepts=Dept;
retain AllDepts;
run;
Dataset I have: | |||
Count | ID | Name | Dept |
1 | 123 | Person 1 | Dept A |
2 | 123 | Person 1 | Dept C |
3 | 123 | Person 1 | Dept E |
1 | 456 | Person 2 | Dept B |
2 | 456 | Person 2 | Dept C |
3 | 456 | Person 2 | Dept F |
4 | 456 | Person 2 | Dept Q |
1 | 567 | Person 3 | Dept B |
Dataset I need: | |||
Count | ID | Name | AllDepts |
3 | 123 | Person 1 | Dept A, Dept C, Dept E |
4 | 456 | Person 2 | Dept B, Dept C, Dept F, Dept Q |
1 | 567 | Person 3 | Dept B |
Dataset I am getting: | |||
Count | ID | Name | AllDepts |
1 | 123 | Person 1 | Dept A |
2 | 123 | Person 1 | Dept A, Dept C |
3 | 123 | Person 1 | Dept A, Dept C, Dept E |
1 | 456 | Person 2 | Dept B |
2 | 456 | Person 2 | Dept B, Dept C |
3 | 456 | Person 2 | Dept B, Dept C, Dept F |
4 | 456 | Person 2 | Dept B, Dept C, Dept F, Dept Q |
1 | 567 | Person 3 | Dept B |
Thanks,
KelseyB
Isn't it necessary to declare AllDepts as a character variable (assuming that it is not contained in the input dataset)? If I run either of your data steps, my SAS complains about "invalid numeric data" and makes AllDepts a numeric variable.
My solution would be:
proc sort data=input;
by ID count;
run;
data output;
do until(last.ID);
set input;
length AllDepts $46; /* to be adapted depending on max. length of Dept */
by ID;
AllDepts=catx(', ', AllDepts, Dept);
end;
drop Dept;
run;
This applies the "DOW loop" technique (cf. http://www2.sas.com/proceedings/sugi28/099-28.pdf). The purpose of the unusual position of the (declarative) LENGTH statement is just to obtain the desired column order.
Use BY group processing.
proc sort data=input;
by ID count;
run;
data output;
set input;
BY ID;
retain AllDepts;
if not first.ID then AllDepts=catx(",",AllDepts,Dept);
else AllDepts=Dept;
if last.ID then output;
run;
Reeza,
Thank you for your help with this. Unfortunately, I got an empty dataset when I ran this. It may have been something I did because I dummied my variables to send in the code, however. I got a solution from another contributor, but I appreciate your quick response.
Thanks again,
KelseyB
Isn't it necessary to declare AllDepts as a character variable (assuming that it is not contained in the input dataset)? If I run either of your data steps, my SAS complains about "invalid numeric data" and makes AllDepts a numeric variable.
My solution would be:
proc sort data=input;
by ID count;
run;
data output;
do until(last.ID);
set input;
length AllDepts $46; /* to be adapted depending on max. length of Dept */
by ID;
AllDepts=catx(', ', AllDepts, Dept);
end;
drop Dept;
run;
This applies the "DOW loop" technique (cf. http://www2.sas.com/proceedings/sugi28/099-28.pdf). The purpose of the unusual position of the (declarative) LENGTH statement is just to obtain the desired column order.
FreelanceReinhard,
You are correct. I accidentally left out the length statement for AllDepts when I copied my code. It was actually in my program though so I didn't get an error.
This works perfectly! Thank you so much for your help.
Best regards,
KelseyB
Something like below should do:
data have;
infile datalines dlm=',' dsd truncover;
input (Count ID Name Dept) ($);
datalines;
1,123,Person 1,Dept A
2,123,Person 1,Dept C
3,123,Person 1,Dept E
1,456,Person 2,Dept B
2,456,Person 2,Dept C
3,456,Person 2,Dept F
4,456,Person 2,Dept Q
1,567,Person 3,Dept B
;
run;
data want(keep=Count ID Name AllDepts);
set have;
by id count;
length AllDepts $100;
retain AllDepts;
AllDepts=catx(', ',AllDepts,Dept);
if last.id then
do;
output;
call missing(AllDepts);
end;
run;
Thanks Patrick. This works perfectly. I had already accepted another respondents' answer, but I greatly appreciate this. Two ways to do something is better than none:)
Best Regards,
KelseyB
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.