DATA Step, Macro, Functions and more

Counting the number of occurrences for a grouped variable

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 13
Accepted Solution

Counting the number of occurrences for a grouped variable

Hello,

I have data arranged like this and I'm trying to count the number of non-zero occurrences for each ID value.  I'm trying to do this without using a PROC step and to only use a DATA step.  Please forgive me if this question has been solved elsewhere - I couldn't find this in the community.

 

ID                   A        B

1000               0        1

1000               0        1

1000               1        0

1111               1        0

1111               1        0

1111               1        0

1111               0        1

1111               0        1

2222               0        1

2222               0        1

2222               1        0

2222               1        0

2222               0        1

 

The final result set should look this:

ID                   A        B

1000               1        2

1111               3        2

2222               2        3

 

Your help is greatly appreciated - thank you in advance!

~E


Accepted Solutions
Solution
‎06-15-2017 02:45 PM
Super Contributor
Posts: 275

Re: Counting the number of occurrences for a grouped variable

Method one:
proc sql;

create table want as
select distinct(ID),sum(A) as A, sum(B) as B from have group by ID;
quit;

 

Method two:

proc summary data=have;
class id;
var A B;
ways 1;
output out=want sum(A B)=A B;
run;

 

Method three:

data want;
set have;
by id;
retain sumA sumB;
if first.id then call missing(sumA,sumB);
sumA+A; sumB+B;
if last.id then output;
run;

View solution in original post


All Replies
Super User
Posts: 5,430

Re: Counting the number of occurrences for a grouped variable

Why not a PROC if that's easier/better?
Data never sleeps
Occasional Contributor
Posts: 13

Re: Counting the number of occurrences for a grouped variable

It needs to be a data step because the results will later be combined into another data step.  There are other variables that will need to be in the dataset, I was just trying to keep it simple showing only those three generic ones.  I don't really want to use a PROC and unsure if it's possible to use the results of a PROC step into another DATA step.

 

Thanks,

E

Trusted Advisor
Posts: 1,923

Re: Counting the number of occurrences for a grouped variable


esvenson wrote:

It needs to be a data step because the results will later be combined into another data step. 


Well, that's not the way SAS works. You can run a PROC, have the results in a SAS data set, and then that data set combined with other data in a data step.

 

<blockquote>There are other variables that will need to be in the dataset</blockquote>

 

Yep. PROCs work here too. See my paragraph above. There's nothing you have said that forces you to use DATA step only.

Occasional Contributor
Posts: 13

Re: Counting the number of occurrences for a grouped variable

Posted in reply to PaigeMiller

Ok, good to know.  Do you have an idea on how to do it?

 

Thanks for your help,

E

Trusted Advisor
Posts: 1,923

Re: Counting the number of occurrences for a grouped variable

You could certainly do it in a data step if it was sorted first ... but that requires PROC SORT ... I guess you can't do this without a PROC.

Occasional Contributor
Posts: 13

Re: Counting the number of occurrences for a grouped variable

Posted in reply to PaigeMiller

I have no problem using a PROC SORT first - I only meant that the final result would be unusable for what I later need as a PROC MEANS or PROC FREQ.  Hope this clarifies.

 

E

Solution
‎06-15-2017 02:45 PM
Super Contributor
Posts: 275

Re: Counting the number of occurrences for a grouped variable

Method one:
proc sql;

create table want as
select distinct(ID),sum(A) as A, sum(B) as B from have group by ID;
quit;

 

Method two:

proc summary data=have;
class id;
var A B;
ways 1;
output out=want sum(A B)=A B;
run;

 

Method three:

data want;
set have;
by id;
retain sumA sumB;
if first.id then call missing(sumA,sumB);
sumA+A; sumB+B;
if last.id then output;
run;

Occasional Contributor
Posts: 13

Re: Counting the number of occurrences for a grouped variable

Dear slchen,
Thank you so much for taking the time to provide different ways of doing this. I tried all three and they all worked great. Thanks again.

Have a great day,
~E
Frequent Contributor
Posts: 136

Re: Counting the number of occurrences for a grouped variable

You can use RETAIN statement and FIRST. and LAST. , Something like this....

 

DATA WANT;
INFILE DATALINES;
INPUT ID A B;
DATALINES;
1000 0 1
1000 0 1
1000 1 0
1111 1 0
1111 1 0
1111 1 0
1111 0 1
1111 0 1
2222 0 1
2222 0 1
2222 1 0
2222 1 0
2222 0 1
;
RUN;
PROC SORT DATA=WANT;
BY ID;
RUN;

DATA WANT1(DROP=A B);
SET WANT;
RETAIN Count_A Count_B;
BY ID ;
IF FIRST.ID THEN DO;count_A=A;Count_B=B;END;
ELSE DO;Count_A=Count_A+A;Count_B=Count_B+B;END;
IF LAST.ID THEN OUTPUT;
RUN;
Thanks,
Suryakiran
Super User
Posts: 5,510

Re: Counting the number of occurrences for a grouped variable

Just in case this opens new avenues of thinking for you, here is one way to use a PROC to create a SAS data set:

 

proc summary data=have nway;

class ID;

var A B;

output out=want (drop=_type_ _freq_) sum=;

run;

 

The WANT data set should match what you are seeking here, with one issue to consider.  You asked for a count of the non-zero values.  This is actually something a little bit different ... the sum of all the values.

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 10 replies
  • 181 views
  • 1 like
  • 6 in conversation