BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
pkopersk
Fluorite | Level 6

Hello,

 

I am struggling with achieving something like this:

VAR1VAR2COUNT_BY_GROUP
A102
A202
B2431
C43
C213
C43

 

Where the COUNT_BY_GROUP variable is the total number of elements of that group.

 

Here is the code that I'm currently using:

data want;
set have;
by VAR1;
retain COUNT_BY_GROUP;
if first.VAR1 then COUNT_BY_GROUP = 0;
COUNT_BY_GROUP +1;
run;

But it gives me a cumulative count. I want to save the total count of elements in a group.

 

Thank you in advance.

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Hi @pkopersk   Your approach is correct, however you need an extra pass of the by group to assign the computed cumulative count to each record of the by group. So, Basically I have tweaked your code to read the BY group again by using a technique called interleave aka sorted append and conditionally compute the cumulative count and assign appropriately.

 


data have;
input VAR1 $	VAR2;*	COUNT_BY_GROUP;
cards;
A	10	2
A	20	2
B	243	1
C	4	3
C	21	3
C	4	3
;

data want;
 set have(in=a) have(in=b);
 by VAR1;
 retain COUNT_BY_GROUP;
 if a then do;
  if first.VAR1 then COUNT_BY_GROUP = 0;
  COUNT_BY_GROUP +1;
 end;
 if b;
run;

The genius @mkeintz  taught me this among many others he taught me personally over the last couple of years. Have fun learning!

 

View solution in original post

3 REPLIES 3
PaigeMiller
Diamond | Level 26
proc freq data=have;
    tables var1/noprint out=_a_;
run;
data want;
    merge have _a_;
    by var1;
run;

PROC FREQ is where you should start when you have a problem that involves counting a number of observations in a group. It's a very good thing to learn, and a far superior way to go than writing your own data step code each time you want to perform counting.

--
Paige Miller
ed_sas_member
Meteorite | Level 14

Hi @pkopersk 

Here is another approach to the one proposed by @PaigeMiller :

proc sql;
	create table want as
	select VAR1, VAR2, n(VAR1) as COUNT_BY_GROUP
	from have
	group by VAR1;
quit;
novinosrin
Tourmaline | Level 20

Hi @pkopersk   Your approach is correct, however you need an extra pass of the by group to assign the computed cumulative count to each record of the by group. So, Basically I have tweaked your code to read the BY group again by using a technique called interleave aka sorted append and conditionally compute the cumulative count and assign appropriately.

 


data have;
input VAR1 $	VAR2;*	COUNT_BY_GROUP;
cards;
A	10	2
A	20	2
B	243	1
C	4	3
C	21	3
C	4	3
;

data want;
 set have(in=a) have(in=b);
 by VAR1;
 retain COUNT_BY_GROUP;
 if a then do;
  if first.VAR1 then COUNT_BY_GROUP = 0;
  COUNT_BY_GROUP +1;
 end;
 if b;
run;

The genius @mkeintz  taught me this among many others he taught me personally over the last couple of years. Have fun learning!

 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1887 views
  • 0 likes
  • 4 in conversation