BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
raivester
Quartz | Level 8

I am trying to collapse my data from the individual person level to the state-year level and get the sum for one variable and the mean for another. How do I do this in SAS? See below for a simplified example of what I am trying to do:

 

What I have:

Year

State

Dog Owner Dummy

Dogs Owned

2019

MA

0

0

2019

MA

1

1

2019

CA

1

2

2019

CA

1

1

2020

MA

1

2

2020

MA

0

0

2020

CA

0

0

2020

CA

0

0

 

What I want:

Year

State

Dog Owner Count

Avg # Dogs Owned

2019

MA

1

0.5

2019

CA

2

1.5

2020

MA

1

1

2020

CA

0

0

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

Look at PROC SUMMARY (or its alias PROC MEANS).

 

A relevant example can be found at Simple Proc Summary Example in SAS, from SASNRD 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

3 REPLIES 3
mkeintz
PROC Star

Look at PROC SUMMARY (or its alias PROC MEANS).

 

A relevant example can be found at Simple Proc Summary Example in SAS, from SASNRD 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
raivester
Quartz | Level 8

Ahh! Thank you. This is helpful. However, what if I have more than one variable. Say I have two variables that I want to get the mean for and one I want to get the total sum for? How do I specify which action I want to take for which variables?

ballardw
Super User

@raivester wrote:

Ahh! Thank you. This is helpful. However, what if I have more than one variable. Say I have two variables that I want to get the mean for and one I want to get the total sum for? How do I specify which action I want to take for which variables?


Proc means / summary will provide many statistics for many variables.

 

Proc summary data= have;

   class  somegrouping variables;

  var <list your variables to request stats here>;

  output out=want     sum(<list variables that you want summed>) =   (list of variable names for output)

                                mean(<list of variabls you want the mean for>) = (list of variable names for output)

                               std (<list of variables you want the std for>) = ...

run;

 

If you don't want to type a bunch of output variable names you can use the / autoname option after requesting the variables. The statistic will be added a suffix to the variable name.

Example:

Proc summary data=sashelp.class nway;
   class sex;
   var age height weight;
   output out=summaryexample
       sum (age height)=
       mean(weight)=
       median(age height)=
       std= /autoname;
run;

Note that as the STD statistic did not include any variables that all the variables on the VAR statement used.

 

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 3 replies
  • 867 views
  • 2 likes
  • 3 in conversation