BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
KelseyB
Fluorite | Level 6

 

Hello!

I am hoping someone can help me.  I have the following data set below (this is just a sample of the dataset) and need the second dataset shown below.  Note that Count can range from 1 to 6 in my dataset.  Essentially if I capture the rows where count is the maximum for each value of ID, I will have what I need.  Any ideas on how to do that?

 

My SAS code:

 

data output;
set input;
if count ne 1 then AllDepts=catx(",",AllDepts,Dept);
else AllDepts=Dept;
retain AllDepts;
run;

 

Dataset I have:    
Count ID Name Dept
1 123 Person 1 Dept A
2 123 Person 1 Dept C
3 123 Person 1 Dept E
1 456 Person 2 Dept B
2 456 Person 2 Dept C
3 456 Person 2 Dept F
4 456 Person 2 Dept Q
1 567 Person 3 Dept B
       
Dataset I need:    
Count ID Name AllDepts
3 123 Person 1 Dept A, Dept C, Dept E
4 456 Person 2 Dept B, Dept C, Dept F, Dept Q
1 567 Person 3 Dept B
       
Dataset I am getting:  
Count ID Name AllDepts
1 123 Person 1 Dept A
2 123 Person 1 Dept A, Dept C
3 123 Person 1 Dept A, Dept C, Dept E
1 456 Person 2 Dept B
2 456 Person 2 Dept B, Dept C
3 456 Person 2 Dept B, Dept C, Dept F
4 456 Person 2 Dept B, Dept C, Dept F, Dept Q
1 567 Person 3 Dept B

 

Thanks,

KelseyB

 

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Isn't it necessary to declare AllDepts as a character variable (assuming that it is not contained in the input dataset)? If I run either of your data steps, my SAS complains about "invalid numeric data" and makes AllDepts a numeric variable.

 

My solution would be:

proc sort data=input;
by ID count;
run;

data output;
do until(last.ID);
  set input;
  length AllDepts $46; /* to be adapted depending on max. length of Dept */
  by ID;
  AllDepts=catx(', ', AllDepts, Dept);
end;
drop Dept;
run;

This applies the "DOW loop" technique (cf. http://www2.sas.com/proceedings/sugi28/099-28.pdf). The purpose of the unusual position of the (declarative) LENGTH statement is just to obtain the desired column order.

View solution in original post

6 REPLIES 6
Reeza
Super User

Use BY group processing.

proc sort data=input;
by ID count;
run;

data output;
set input;
BY ID;

retain AllDepts;
if not first.ID then AllDepts=catx(",",AllDepts,Dept); 
else AllDepts=Dept;

if last.ID then output;

run;
KelseyB
Fluorite | Level 6

Reeza, 

Thank you for your help with this.  Unfortunately, I got an empty dataset when I ran this.  It may have been something I did because I dummied my variables to send in the code, however.  I got a solution from another contributor, but I appreciate your quick response.

 

Thanks again,

KelseyB

FreelanceReinh
Jade | Level 19

Isn't it necessary to declare AllDepts as a character variable (assuming that it is not contained in the input dataset)? If I run either of your data steps, my SAS complains about "invalid numeric data" and makes AllDepts a numeric variable.

 

My solution would be:

proc sort data=input;
by ID count;
run;

data output;
do until(last.ID);
  set input;
  length AllDepts $46; /* to be adapted depending on max. length of Dept */
  by ID;
  AllDepts=catx(', ', AllDepts, Dept);
end;
drop Dept;
run;

This applies the "DOW loop" technique (cf. http://www2.sas.com/proceedings/sugi28/099-28.pdf). The purpose of the unusual position of the (declarative) LENGTH statement is just to obtain the desired column order.

KelseyB
Fluorite | Level 6

FreelanceReinhard,

You are correct.  I accidentally left out the length statement for AllDepts when I copied my code.  It was actually in my program though so I didn't get an error.  

This works perfectly!  Thank you so much for your help.

 

Best regards,

KelseyB

Patrick
Opal | Level 21

Something like below should do:

data have;
  infile datalines dlm=',' dsd truncover;
  input (Count ID Name Dept) ($);
  datalines;
1,123,Person 1,Dept A
2,123,Person 1,Dept C
3,123,Person 1,Dept E
1,456,Person 2,Dept B
2,456,Person 2,Dept C
3,456,Person 2,Dept F
4,456,Person 2,Dept Q
1,567,Person 3,Dept B
;
run;

data want(keep=Count ID Name AllDepts);
  set have;
  by id count;
  length AllDepts $100;
  retain AllDepts;
  AllDepts=catx(', ',AllDepts,Dept);
  if last.id then
    do;
      output;
      call missing(AllDepts);
    end;
run;
KelseyB
Fluorite | Level 6

Thanks Patrick.  This works perfectly.  I had already accepted another respondents' answer, but I greatly appreciate this.  Two ways to do something is better than none:)

 

Best Regards,

KelseyB

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1212 views
  • 2 likes
  • 4 in conversation