Count by ID

Hi all,

 

I have a large SAS dataset with many columns, two of which are of interest: ID (unique) and Subject.

I am trying to count how many times each Subject value occurs for each ID (see the example below).

 

This is what I have:

ID  Subject
1   High
1   Med
1   Med
2   Low
2   Low
3   High
3   Low
3   Med
3   Med

 

This is what I need:

 

ID  High  Med  Low
1   1     2    0
2   0     0    2
3   1     2    1

 

Any suggestions on how I can achieve this in a single data step?

 

I have tried to do something like this but can't get it to work efficiently:

 

data need;
set have;
by ID notsorted;
if first.ID and subject='High' then High=0;
High+1;
run;

 

Any help would be greatly appreciated.

 

Cheers,

Pete


Accepted Solutions
Re: Count by ID

Posted in reply to PetePatel

The suggestion you already received will get you the result you want.  However, since you specifically asked for a DATA step solution in a single step, here is how you would go about that:

 

data want;
   set have;
   by id;
   if first.id then do;
      high = 0;
      med = 0;
      low = 0;
   end;
   if subject='High' then high + 1;
   else if subject='Med' then med + 1;
   else if subject='Low' then low + 1;
   drop subject;
   if last.id;
run;
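
One caveat with the step above: a plain BY statement expects the input to be in ID order. The following is a minimal sketch, assuming HAVE is not already sorted; HAVE_SORTED is a hypothetical dataset name introduced only for illustration. If the rows for each ID are already grouped together but the groups are not in ascending order, writing "by id notsorted;" in the DATA step should make the sort unnecessary.

```sas
/* Assumption: HAVE is not already in ID order.            */
/* HAVE_SORTED is a hypothetical name used for this sketch. */
proc sort data=have out=have_sorted;
   by id;
run;

data want;
   set have_sorted;
   by id;
   /* Reset the counters at the start of each ID group. */
   if first.id then do;
      high = 0;
      med = 0;
      low = 0;
   end;
   /* Sum statements retain their values across rows. */
   if subject = 'High' then high + 1;
   else if subject = 'Med' then med + 1;
   else if subject = 'Low' then low + 1;
   drop subject;
   /* Keep only the last row of each ID, which holds the totals. */
   if last.id;
run;
```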



All Replies

Re: Count by ID

Posted in reply to PetePatel
data have;
   input ID Subject $;
   cards;
1 High
1 Med
1 Med
2 Low
2 Low
3 High
3 Low
3 Med
3 Med
;

proc freq data=have;
   by id;
   tables subject / out=_have(drop=percent);
run;

proc transpose data=_have out=want(drop=_label_);
   by id;
   var count;
   id subject;
run;
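
Note that PROC TRANSPOSE leaves a missing value wherever an ID has no rows for a given subject (for example, Low for ID 1), whereas the requested output shows 0. A minimal sketch of one way to fill those in, assuming the transposed columns are named High, Med and Low as in the sample data; WANT_ZERO is a hypothetical dataset name:

```sas
/* Assumption: WANT comes from the PROC TRANSPOSE step above and */
/* contains the columns High, Med and Low.                       */
/* WANT_ZERO is a hypothetical name used for this sketch.        */
data want_zero;
   set want(drop=_name_);          /* _NAME_ is added by PROC TRANSPOSE */
   array counts {*} High Med Low;
   do i = 1 to dim(counts);
      /* Replace missing counts with 0. */
      if missing(counts{i}) then counts{i} = 0;
   end;
   drop i;
run;
```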
