Fluorite | Level 6

## Counting the number of occurrences for a grouped variable

Hello,

I have data arranged like this and I'm trying to count the number of non-zero occurrences for each ID value.  I'm trying to do this without using a PROC step and to only use a DATA step.  Please forgive me if this question has been solved elsewhere - I couldn't find this in the community.

ID                   A        B

1000               0        1

1000               0        1

1000               1        0

1111               1        0

1111               1        0

1111               1        0

1111               0        1

1111               0        1

2222               0        1

2222               0        1

2222               1        0

2222               1        0

2222               0        1

The final result set should look this:

ID                   A        B

1000               1        2

1111               3        2

2222               2        3

~E

1 ACCEPTED SOLUTION

Accepted Solutions
Lapis Lazuli | Level 10

## Re: Counting the number of occurrences for a grouped variable

Method one:
proc sql;

create table want as
select distinct(ID),sum(A) as A, sum(B) as B from have group by ID;
quit;

Method two:

proc summary data=have;
class id;
var A B;
ways 1;
output out=want sum(A B)=A B;
run;

Method three:

data want;
set have;
by id;
retain sumA sumB;
if first.id then call missing(sumA,sumB);
sumA+A; sumB+B;
if last.id then output;
run;

10 REPLIES 10
Tourmaline | Level 20

## Re: Counting the number of occurrences for a grouped variable

Why not a PROC if that's easier/better?
Data never sleeps
Fluorite | Level 6

## Re: Counting the number of occurrences for a grouped variable

It needs to be a data step because the results will later be combined into another data step.  There are other variables that will need to be in the dataset, I was just trying to keep it simple showing only those three generic ones.  I don't really want to use a PROC and unsure if it's possible to use the results of a PROC step into another DATA step.

Thanks,

E

Diamond | Level 26

## Re: Counting the number of occurrences for a grouped variable

@esvenson wrote:

It needs to be a data step because the results will later be combined into another data step.

Well, that's not the way SAS works. You can run a PROC, have the results in a SAS data set, and then that data set combined with other data in a data step.

<blockquote>There are other variables that will need to be in the dataset</blockquote>

Yep. PROCs work here too. See my paragraph above. There's nothing you have said that forces you to use DATA step only.

--
Paige Miller
Fluorite | Level 6

## Re: Counting the number of occurrences for a grouped variable

Ok, good to know.  Do you have an idea on how to do it?

E

Diamond | Level 26

## Re: Counting the number of occurrences for a grouped variable

You could certainly do it in a data step if it was sorted first ... but that requires PROC SORT ... I guess you can't do this without a PROC.

--
Paige Miller
Fluorite | Level 6

## Re: Counting the number of occurrences for a grouped variable

I have no problem using a PROC SORT first - I only meant that the final result would be unusable for what I later need as a PROC MEANS or PROC FREQ.  Hope this clarifies.

E

Lapis Lazuli | Level 10

## Re: Counting the number of occurrences for a grouped variable

Method one:
proc sql;

create table want as
select distinct(ID),sum(A) as A, sum(B) as B from have group by ID;
quit;

Method two:

proc summary data=have;
class id;
var A B;
ways 1;
output out=want sum(A B)=A B;
run;

Method three:

data want;
set have;
by id;
retain sumA sumB;
if first.id then call missing(sumA,sumB);
sumA+A; sumB+B;
if last.id then output;
run;

Fluorite | Level 6

## Re: Counting the number of occurrences for a grouped variable

Dear slchen,
Thank you so much for taking the time to provide different ways of doing this. I tried all three and they all worked great. Thanks again.

Have a great day,
~E
Meteorite | Level 14

## Re: Counting the number of occurrences for a grouped variable

You can use RETAIN statement and FIRST. and LAST. , Something like this....

``````DATA WANT;
INFILE DATALINES;
INPUT ID A B;
DATALINES;
1000 0 1
1000 0 1
1000 1 0
1111 1 0
1111 1 0
1111 1 0
1111 0 1
1111 0 1
2222 0 1
2222 0 1
2222 1 0
2222 1 0
2222 0 1
;
RUN;
PROC SORT DATA=WANT;
BY ID;
RUN;

DATA WANT1(DROP=A B);
SET WANT;
RETAIN Count_A Count_B;
BY ID ;
IF FIRST.ID THEN DO;count_A=A;Count_B=B;END;
ELSE DO;Count_A=Count_A+A;Count_B=Count_B+B;END;
IF LAST.ID THEN OUTPUT;
RUN;``````
Thanks,
Suryakiran
PROC Star

## Re: Counting the number of occurrences for a grouped variable

Just in case this opens new avenues of thinking for you, here is one way to use a PROC to create a SAS data set:

proc summary data=have nway;

class ID;

var A B;

output out=want (drop=_type_ _freq_) sum=;

run;

The WANT data set should match what you are seeking here, with one issue to consider.  You asked for a count of the non-zero values.  This is actually something a little bit different ... the sum of all the values.

Discussion stats
• 10 replies
• 18566 views
• 2 likes
• 6 in conversation