BookmarkSubscribeRSS Feed
cawalsh
Calcite | Level 5

I am trying to run descriptive statistics for, let's say, a state instead of on the entire country. How do I run descriptive statistics for my filtered state?

Also, if I filter my data, how can I save my filtered data as a new table? Thank you!

3 REPLIES 3
VDD
Ammonite | Level 13 VDD
Ammonite | Level 13

create a table with you subset of data then use a output out=myresults; option in your stats

Patrick
Opal | Level 21

Most procedures will allow you to subset data directly within the Proc using a WHERE clause. Alternatively apply the where clause as data set option:

Proc Something data=mydata(where=(<condition>));

 

You you want to use all the data but get separate stats by some classification variable then many proc's offer a CLASS keyword for this OR use by group processing (requires you to first have the source data pre-sorted). 

Proc Something data=...

  class <classification var>;

  ....

 

Proc Sort data=mydata out=mydata_sorted;

  by <vars>;

run;

Proc Something data=mydata_sorted;

  by <vars>;

  ...

 

 

ballardw
Super User

@cawalsh wrote:

I am trying to run descriptive statistics for, let's say, a state instead of on the entire country. How do I run descriptive statistics for my filtered state?

Also, if I filter my data, how can I save my filtered data as a new table? Thank you!


Think carefully about why you are saving a filtered data set. If you are thinking of creating one data set for each state that is generally a poor idea except possibly for a report on each state.

SAS has a very important concept called BY group processing. That means that each level of a variable or combination of variables can be processed for the values as a group using BY syntax.

 

If you split data up then to do the same thing on each set you have to modify or duplicate code to point to each dataset, possibly creating yet another output data set. Eventually creating lots of harder than necessary code to maintain and update.

AND you would need to know the combinations of variables that have values as some things work poorly with empty data sets.

 

Example of BY group processing with two variables:

proc sort data=sashelp.class out=work.class;
   by sex age;
run;

proc means data=work.class;
   by sex age;
   var height weight;
run;

Creates summary statistics of n, mean, min, max and standard deviation of height and weight for all the levels of Sex and Age in the data set. I didn't need to know the 11 combinations of the two variables, creating 11 separate data sets or code to summarize 11 different sets.

If I only want summaries of males 14 or older :

proc sort data=sashelp.class out=work.class;
   by sex age;
run;

proc means data=work.class;
   by sex age;
   var height weight;
run;

BY group processing requires that a data set be sorted by the by variables referenced on the BY statement. Options in sorting such as DESCENDING if used in sorting also need to be used in the BY statement as otherwise the procedure will assume ascending sort.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 3 replies
  • 1411 views
  • 0 likes
  • 4 in conversation