BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
BoboTheFool
Calcite | Level 5

So, probably a simple question.  I need to find the mean of a subset of a subset of a variable (3 variables total).   So lets say I am trying to find the mean of NFC team scores, first by division, and then by team in the division.  I have a scores variable, a variable, division, that includes NFC EAST, NFC WEST, NFC SOUTH, and NFC NORTH.  Then a third variable, team,  that contains all of the teams in the conference.  I can get the mean of the scores of the division, but when I try to go to that deeper level of teams in the division, i cannot do it.  Here is the code I would use to get the mean by division:

 

proc sort data = football;

by division;

run;

 

proc means data = football;

var scores;

by division;

run;

 

this works just fine.

 

when I try to go by team,

 

it gives me an error because I haven't sorted by team, which then messes up the sort my division.

 

any suggestions?

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Use the CLASS statement instead of the BY statement and you can get summary for any of the combinations you want.

proc summary data=football ;
   class division team ;
   var score ;
   output out=want mean=mean_score ;
run;

You will get overall mean, mean for each level of division, mean for each level of team and also mean of each team within division.

The _TYPE_ variable will tell you which of the class variables are contributing to the result record.

_TYPE_=0  will be the overall mean. 

_TYPE_=1 will be the TEAM means.

_TYPE_=2 will be the DIVISION means.

_TYPE_=3 will the the TEAM within DIVISION means (which for NFL teams will be the same as TEAM means since each team is in one and only one DIVISION).

 

View solution in original post

4 REPLIES 4
PGStats
Opal | Level 21

Do you want:

 

proc sort data = football;
by division team;
run;
 
proc means data = football;
var scores;
by division team;
run;

?

PG
Tom
Super User Tom
Super User

Use the CLASS statement instead of the BY statement and you can get summary for any of the combinations you want.

proc summary data=football ;
   class division team ;
   var score ;
   output out=want mean=mean_score ;
run;

You will get overall mean, mean for each level of division, mean for each level of team and also mean of each team within division.

The _TYPE_ variable will tell you which of the class variables are contributing to the result record.

_TYPE_=0  will be the overall mean. 

_TYPE_=1 will be the TEAM means.

_TYPE_=2 will be the DIVISION means.

_TYPE_=3 will the the TEAM within DIVISION means (which for NFL teams will be the same as TEAM means since each team is in one and only one DIVISION).

 

BoboTheFool
Calcite | Level 5

Tom,

 

Using the class statement was dead on.  I racked my brain for a few hours trying to figure out how to code the problem correctly.  Thanks for your help!

 

Adam

Reeza
Super User

In addition to @Tom solution, if you want control over the levels look at TYPEs and WAYS statements in PROC MEANS.

They allow you to control the different combinations of class variables, super useful and under utilized. 

 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1189 views
  • 3 likes
  • 4 in conversation