BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
KelseyB
Fluorite | Level 6

 

Hello!

I am hoping someone can help me.  I have the following data set below (this is just a sample of the dataset) and need the second dataset shown below.  Note that Count can range from 1 to 6 in my dataset.  Essentially if I capture the rows where count is the maximum for each value of ID, I will have what I need.  Any ideas on how to do that?

 

My SAS code:

 

data output;
set input;
if count ne 1 then AllDepts=catx(",",AllDepts,Dept);
else AllDepts=Dept;
retain AllDepts;
run;

 

Dataset I have:    
Count ID Name Dept
1 123 Person 1 Dept A
2 123 Person 1 Dept C
3 123 Person 1 Dept E
1 456 Person 2 Dept B
2 456 Person 2 Dept C
3 456 Person 2 Dept F
4 456 Person 2 Dept Q
1 567 Person 3 Dept B
       
Dataset I need:    
Count ID Name AllDepts
3 123 Person 1 Dept A, Dept C, Dept E
4 456 Person 2 Dept B, Dept C, Dept F, Dept Q
1 567 Person 3 Dept B
       
Dataset I am getting:  
Count ID Name AllDepts
1 123 Person 1 Dept A
2 123 Person 1 Dept A, Dept C
3 123 Person 1 Dept A, Dept C, Dept E
1 456 Person 2 Dept B
2 456 Person 2 Dept B, Dept C
3 456 Person 2 Dept B, Dept C, Dept F
4 456 Person 2 Dept B, Dept C, Dept F, Dept Q
1 567 Person 3 Dept B

 

Thanks,

KelseyB

 

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Isn't it necessary to declare AllDepts as a character variable (assuming that it is not contained in the input dataset)? If I run either of your data steps, my SAS complains about "invalid numeric data" and makes AllDepts a numeric variable.

 

My solution would be:

proc sort data=input;
by ID count;
run;

data output;
do until(last.ID);
  set input;
  length AllDepts $46; /* to be adapted depending on max. length of Dept */
  by ID;
  AllDepts=catx(', ', AllDepts, Dept);
end;
drop Dept;
run;

This applies the "DOW loop" technique (cf. http://www2.sas.com/proceedings/sugi28/099-28.pdf). The purpose of the unusual position of the (declarative) LENGTH statement is just to obtain the desired column order.

View solution in original post

6 REPLIES 6
Reeza
Super User

Use BY group processing.

proc sort data=input;
by ID count;
run;

data output;
set input;
BY ID;

retain AllDepts;
if not first.ID then AllDepts=catx(",",AllDepts,Dept); 
else AllDepts=Dept;

if last.ID then output;

run;
KelseyB
Fluorite | Level 6

Reeza, 

Thank you for your help with this.  Unfortunately, I got an empty dataset when I ran this.  It may have been something I did because I dummied my variables to send in the code, however.  I got a solution from another contributor, but I appreciate your quick response.

 

Thanks again,

KelseyB

FreelanceReinh
Jade | Level 19

Isn't it necessary to declare AllDepts as a character variable (assuming that it is not contained in the input dataset)? If I run either of your data steps, my SAS complains about "invalid numeric data" and makes AllDepts a numeric variable.

 

My solution would be:

proc sort data=input;
by ID count;
run;

data output;
do until(last.ID);
  set input;
  length AllDepts $46; /* to be adapted depending on max. length of Dept */
  by ID;
  AllDepts=catx(', ', AllDepts, Dept);
end;
drop Dept;
run;

This applies the "DOW loop" technique (cf. http://www2.sas.com/proceedings/sugi28/099-28.pdf). The purpose of the unusual position of the (declarative) LENGTH statement is just to obtain the desired column order.

KelseyB
Fluorite | Level 6

FreelanceReinhard,

You are correct.  I accidentally left out the length statement for AllDepts when I copied my code.  It was actually in my program though so I didn't get an error.  

This works perfectly!  Thank you so much for your help.

 

Best regards,

KelseyB

Patrick
Opal | Level 21

Something like below should do:

data have;
  infile datalines dlm=',' dsd truncover;
  input (Count ID Name Dept) ($);
  datalines;
1,123,Person 1,Dept A
2,123,Person 1,Dept C
3,123,Person 1,Dept E
1,456,Person 2,Dept B
2,456,Person 2,Dept C
3,456,Person 2,Dept F
4,456,Person 2,Dept Q
1,567,Person 3,Dept B
;
run;

data want(keep=Count ID Name AllDepts);
  set have;
  by id count;
  length AllDepts $100;
  retain AllDepts;
  AllDepts=catx(', ',AllDepts,Dept);
  if last.id then
    do;
      output;
      call missing(AllDepts);
    end;
run;
KelseyB
Fluorite | Level 6

Thanks Patrick.  This works perfectly.  I had already accepted another respondents' answer, but I greatly appreciate this.  Two ways to do something is better than none:)

 

Best Regards,

KelseyB

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 857 views
  • 2 likes
  • 4 in conversation