BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
psnorrod
Obsidian | Level 7

Greetings!

 

I want to count the number of events by the following variables: year, sex, and age_group; and can drop the state variable after events are counted. The code below will count the number of events, however, it does not separate counts by age_group. So far it combines some age_groups for each year.

 

I've attached two data sets with "sample1" representing my data and "sample2" representing my desired results.

Finally, the data is all events so there are no "zeros".

 

proc sort data = one;
by year state;
run;

 

data two;
set one;
by year state ;
if first.state then count = 0;
count + 1;
if last.state;
keep year count Sex age_group;
run;

 

Thank you and your help is much appreciated!

 

-Paul

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Your output Sample2 does not have anything for 2003 with Sex=2 and Age_group=1 but the input data does. So I am going to guess that your 4th line of Sample 2 is supposed to be 2003 Sex=2 and Age_group=1 instead of a second row with Sex=2 and Age_group=2;

If my belief is correct then one way to get the output you are requesting is

 

proc summary data= sample1 nway;
   class year sex age_group;
   output out=work.summary (drop=_type_ rename=(_freq_=count));
run;

Though the order will vary a bit from your Sample2.

Or

proc freq data= tmp2.sample1 noprint;
   tables year*sex*age_group /out=work.freq (drop=percent) ;
run;

Simple counting belongs to Proc Freq /Summary depending on what you want. Proc Sql also does counts

proc sql;
   create table work.sql as
   select year, sex, age_group, count(*) as count
   from sample1
   group by year, sex, age_group
   ;
quit;

 

View solution in original post

4 REPLIES 4
ballardw
Super User

Your output Sample2 does not have anything for 2003 with Sex=2 and Age_group=1 but the input data does. So I am going to guess that your 4th line of Sample 2 is supposed to be 2003 Sex=2 and Age_group=1 instead of a second row with Sex=2 and Age_group=2;

If my belief is correct then one way to get the output you are requesting is

 

proc summary data= sample1 nway;
   class year sex age_group;
   output out=work.summary (drop=_type_ rename=(_freq_=count));
run;

Though the order will vary a bit from your Sample2.

Or

proc freq data= tmp2.sample1 noprint;
   tables year*sex*age_group /out=work.freq (drop=percent) ;
run;

Simple counting belongs to Proc Freq /Summary depending on what you want. Proc Sql also does counts

proc sql;
   create table work.sql as
   select year, sex, age_group, count(*) as count
   from sample1
   group by year, sex, age_group
   ;
quit;

 

psnorrod
Obsidian | Level 7

Thank you ballardw!

 

Unfortunately in my attempt at brevity, I was not very clear. "Sample2" is just an example and does not include 2003. My full data set has over 190k events. I need to create a final data file, like sample2 to merge with my denominator data. Ultimately, I'm going to use proc genmod to obtain crude and adjusted rates for each year. I want to use the sex and age_group variables for the adjusted rates. I hope!

ballardw
Super User

@psnorrod wrote:

Thank you ballardw!

 

Unfortunately in my attempt at brevity, I was not very clear. "Sample2" is just an example and does not include 2003. My full data set has over 190k events. I need to create a final data file, like sample2 to merge with my denominator data. Ultimately, I'm going to use proc genmod to obtain crude and adjusted rates for each year. I want to use the sex and age_group variables for the adjusted rates. I hope!


In which case you may need to make sure that you have combinations with a 0 count to be complete. If a model is using those variables and one year does not have a combination of values that appear in other years then you may have unexpected results.

psnorrod
Obsidian | Level 7

Yes. Thank you @ballardw

There are years with out the event based on inclusion criteria, however, I cannot think of another method to assign the zero value other than creating a separate data set with state fips, year, sex and age-group then somehow merging or using a conditional statement to keep the zeros.

 

 

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 2244 views
  • 1 like
  • 2 in conversation