BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
pkopersk
Fluorite | Level 6

Hello,

 

I am struggling with achieving something like this:

VAR1VAR2COUNT_BY_GROUP
A102
A202
B2431
C43
C213
C43

 

Where the COUNT_BY_GROUP variable is the total number of elements of that group.

 

Here is the code that I'm currently using:

data want;
set have;
by VAR1;
retain COUNT_BY_GROUP;
if first.VAR1 then COUNT_BY_GROUP = 0;
COUNT_BY_GROUP +1;
run;

But it gives me a cumulative count. I want to save the total count of elements in a group.

 

Thank you in advance.

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Hi @pkopersk   Your approach is correct, however you need an extra pass of the by group to assign the computed cumulative count to each record of the by group. So, Basically I have tweaked your code to read the BY group again by using a technique called interleave aka sorted append and conditionally compute the cumulative count and assign appropriately.

 


data have;
input VAR1 $	VAR2;*	COUNT_BY_GROUP;
cards;
A	10	2
A	20	2
B	243	1
C	4	3
C	21	3
C	4	3
;

data want;
 set have(in=a) have(in=b);
 by VAR1;
 retain COUNT_BY_GROUP;
 if a then do;
  if first.VAR1 then COUNT_BY_GROUP = 0;
  COUNT_BY_GROUP +1;
 end;
 if b;
run;

The genius @mkeintz  taught me this among many others he taught me personally over the last couple of years. Have fun learning!

 

View solution in original post

3 REPLIES 3
PaigeMiller
Diamond | Level 26
proc freq data=have;
    tables var1/noprint out=_a_;
run;
data want;
    merge have _a_;
    by var1;
run;

PROC FREQ is where you should start when you have a problem that involves counting a number of observations in a group. It's a very good thing to learn, and a far superior way to go than writing your own data step code each time you want to perform counting.

--
Paige Miller
ed_sas_member
Meteorite | Level 14

Hi @pkopersk 

Here is another approach to the one proposed by @PaigeMiller :

proc sql;
	create table want as
	select VAR1, VAR2, n(VAR1) as COUNT_BY_GROUP
	from have
	group by VAR1;
quit;
novinosrin
Tourmaline | Level 20

Hi @pkopersk   Your approach is correct, however you need an extra pass of the by group to assign the computed cumulative count to each record of the by group. So, Basically I have tweaked your code to read the BY group again by using a technique called interleave aka sorted append and conditionally compute the cumulative count and assign appropriately.

 


data have;
input VAR1 $	VAR2;*	COUNT_BY_GROUP;
cards;
A	10	2
A	20	2
B	243	1
C	4	3
C	21	3
C	4	3
;

data want;
 set have(in=a) have(in=b);
 by VAR1;
 retain COUNT_BY_GROUP;
 if a then do;
  if first.VAR1 then COUNT_BY_GROUP = 0;
  COUNT_BY_GROUP +1;
 end;
 if b;
run;

The genius @mkeintz  taught me this among many others he taught me personally over the last couple of years. Have fun learning!

 

sas-innovate-white.png

Special offer for SAS Communities members

Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.

 

View the full agenda.

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1599 views
  • 0 likes
  • 4 in conversation