I am trying to collapse my data from the individual person level to the state-year level and get the sum for one variable and the mean for another. How do I do this in SAS? See below for a simplified example of what I am trying to do:
What I have:
Year |
State |
Dog Owner Dummy |
Dogs Owned |
2019 |
MA |
0 |
0 |
2019 |
MA |
1 |
1 |
2019 |
CA |
1 |
2 |
2019 |
CA |
1 |
1 |
2020 |
MA |
1 |
2 |
2020 |
MA |
0 |
0 |
2020 |
CA |
0 |
0 |
2020 |
CA |
0 |
0 |
What I want:
Year |
State |
Dog Owner Count |
Avg # Dogs Owned |
2019 |
MA |
1 |
0.5 |
2019 |
CA |
2 |
1.5 |
2020 |
MA |
1 |
1 |
2020 |
CA |
0 |
0 |
Look at PROC SUMMARY (or its alias PROC MEANS).
A relevant example can be found at Simple Proc Summary Example in SAS, from SASNRD
Look at PROC SUMMARY (or its alias PROC MEANS).
A relevant example can be found at Simple Proc Summary Example in SAS, from SASNRD
Ahh! Thank you. This is helpful. However, what if I have more than one variable. Say I have two variables that I want to get the mean for and one I want to get the total sum for? How do I specify which action I want to take for which variables?
@raivester wrote:
Ahh! Thank you. This is helpful. However, what if I have more than one variable. Say I have two variables that I want to get the mean for and one I want to get the total sum for? How do I specify which action I want to take for which variables?
Proc means / summary will provide many statistics for many variables.
Proc summary data= have;
class somegrouping variables;
var <list your variables to request stats here>;
output out=want sum(<list variables that you want summed>) = (list of variable names for output)
mean(<list of variabls you want the mean for>) = (list of variable names for output)
std (<list of variables you want the std for>) = ...
run;
If you don't want to type a bunch of output variable names you can use the / autoname option after requesting the variables. The statistic will be added a suffix to the variable name.
Example:
Proc summary data=sashelp.class nway; class sex; var age height weight; output out=summaryexample sum (age height)= mean(weight)= median(age height)= std= /autoname; run;
Note that as the STD statistic did not include any variables that all the variables on the VAR statement used.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.