BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
raivester
Quartz | Level 8

I am trying to collapse my data from the individual person level to the state-year level and get the sum for one variable and the mean for another. How do I do this in SAS? See below for a simplified example of what I am trying to do:

 

What I have:

Year

State

Dog Owner Dummy

Dogs Owned

2019

MA

0

0

2019

MA

1

1

2019

CA

1

2

2019

CA

1

1

2020

MA

1

2

2020

MA

0

0

2020

CA

0

0

2020

CA

0

0

 

What I want:

Year

State

Dog Owner Count

Avg # Dogs Owned

2019

MA

1

0.5

2019

CA

2

1.5

2020

MA

1

1

2020

CA

0

0

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

Look at PROC SUMMARY (or its alias PROC MEANS).

 

A relevant example can be found at Simple Proc Summary Example in SAS, from SASNRD 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

3 REPLIES 3
mkeintz
PROC Star

Look at PROC SUMMARY (or its alias PROC MEANS).

 

A relevant example can be found at Simple Proc Summary Example in SAS, from SASNRD 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
raivester
Quartz | Level 8

Ahh! Thank you. This is helpful. However, what if I have more than one variable. Say I have two variables that I want to get the mean for and one I want to get the total sum for? How do I specify which action I want to take for which variables?

ballardw
Super User

@raivester wrote:

Ahh! Thank you. This is helpful. However, what if I have more than one variable. Say I have two variables that I want to get the mean for and one I want to get the total sum for? How do I specify which action I want to take for which variables?


Proc means / summary will provide many statistics for many variables.

 

Proc summary data= have;

   class  somegrouping variables;

  var <list your variables to request stats here>;

  output out=want     sum(<list variables that you want summed>) =   (list of variable names for output)

                                mean(<list of variabls you want the mean for>) = (list of variable names for output)

                               std (<list of variables you want the std for>) = ...

run;

 

If you don't want to type a bunch of output variable names you can use the / autoname option after requesting the variables. The statistic will be added a suffix to the variable name.

Example:

Proc summary data=sashelp.class nway;
   class sex;
   var age height weight;
   output out=summaryexample
       sum (age height)=
       mean(weight)=
       median(age height)=
       std= /autoname;
run;

Note that as the STD statistic did not include any variables that all the variables on the VAR statement used.