BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
MillerEL
Obsidian | Level 7

Hello,

 

I have a question about aggregating frequencies. In short, each observation has a family ID and a marker that says if the observation is for an adult or a child. (example data below). NOTE: the real dataset included 1000s of families. 

 

FamID adlt chld
001      1     0
001      0     1
001      0     1
002      1     0
002      1     0
002      0     1
002      0     1
002      0     1
002      0     1
003      1     0
003      0     1
003      0     1
004      1     0 
004      0     1
004      0     1

 

I need to determine how many families have 1 child, or 2 children, etc...

The output should look like the following:

 

#Child   #families

1            0

2            3

3            0

4            1

 

Can anyone offer advise on how to approach this? I tried the following without success:

 

PROC SUMMARY data=in;
class FamID;
var chld adlt;
output out=agg sum=chld adlt;
run;

 

 

PROC TABULATE data=in;
class FamID;
var chld;
table chld;
run;

 

I'm sure that I'm overthinking this, but I'm stuck. Any help would be appreciated. 

Thank you. 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

Many ways to skin this cat ... I tend to favor PROC FREQ for counting.  Start by adding NWAY to the end of your PROC SUMMARY statement.  Then continue:

 

proc freq data=AGG;

tables chld;

run;

 

The CHLD column is the number of children, and the FREQUENCY column is the number of families having that number of children.

View solution in original post

4 REPLIES 4
Reeza
Super User

Your approach is correct, but the data for proc tabulate should be the output from the PROC SUMMARY not the original data set. 

 

The logic is essentially:

 

1. Find the number of kids per family - add child numbers using PROC SUMMARY

2. Summarize data from Step 1 using PROC FREQ or PROC TABULATE. Note that you don't want the CLASS variable in the PROC TABULATE because you're counting overall, not by FAMILY ID.

 

 

MillerEL
Obsidian | Level 7

Perfect. Thank you!

Astounding
PROC Star

Many ways to skin this cat ... I tend to favor PROC FREQ for counting.  Start by adding NWAY to the end of your PROC SUMMARY statement.  Then continue:

 

proc freq data=AGG;

tables chld;

run;

 

The CHLD column is the number of children, and the FREQUENCY column is the number of families having that number of children.

MillerEL
Obsidian | Level 7
Perfect! Thank you!

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1596 views
  • 6 likes
  • 3 in conversation