Sorting, subsetting
09-20-2016 09:39 PM

So, probably a simple question. I need to find the mean of a variable within a subset of a subset (3 variables total). Let's say I am trying to find the mean of NFC team scores, first by division, and then by team within each division. I have a scores variable; a division variable that takes the values NFC EAST, NFC WEST, NFC SOUTH, and NFC NORTH; and a third variable, team, that contains all of the teams in the conference. I can get the mean of the scores by division, but when I try to go to the deeper level of teams within a division, I cannot do it. Here is the code I would use to get the mean by division:

proc sort data = football;

by division;

run;

proc means data = football;

var scores;

by division;

run;

This works just fine.

When I try to go by team, it gives me an error because I haven't sorted by team, and sorting by team then messes up the sort by division.

Any suggestions?

Accepted Solutions

Solution

09-20-2016
11:27 PM


Posted in reply to BoboTheFool

09-20-2016 10:57 PM

Use the CLASS statement instead of the BY statement and you can get summaries for any of the combinations you want. Unlike BY, CLASS does not require the data to be sorted.

```
proc summary data=football ;
class division team ;
var scores ;
output out=want mean=mean_score ;
run;
```

You will get the overall mean, the mean for each level of division, the mean for each level of team, and the mean for each team within division.

The _TYPE_ variable tells you which of the class variables contribute to each output record.

_TYPE_=0 will be the overall mean.

_TYPE_=1 will be the TEAM means.

_TYPE_=2 will be the DIVISION means.

_TYPE_=3 will be the TEAM within DIVISION means (which for NFL teams will be the same as the TEAM means, since each team is in one and only one DIVISION).
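
If only some of those rows are needed, one option is to filter the output data set on _TYPE_, or to add the NWAY option so that only the highest-level combination is written. The following is a minimal sketch, not tested against the poster's data. It assumes the analysis variable is named scores, as in the original question, and the data set names division_means and team_means are made up for illustration.

```
/* Full output: all four _TYPE_ levels (0, 1, 2, 3). */
proc summary data=football ;
class division team ;
var scores ;
output out=want mean=mean_score ;
run;

/* Keep only the DIVISION means (_TYPE_=2). */
data division_means ;
set want ;
where _type_ = 2 ;
run;

/* NWAY writes only the rows that use every class variable, */
/* which here is TEAM within DIVISION (_TYPE_=3).           */
proc summary data=football nway ;
class division team ;
var scores ;
output out=team_means mean=mean_score ;
run;
```

Filtering on _TYPE_ keeps one pass over the data and lets you pull several levels out of the same output data set, while NWAY is shorter when only the most detailed level matters.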

All Replies


Posted in reply to BoboTheFool

09-20-2016 10:53 PM

Do you want:

```
proc sort data = football;
by division team;
run;
proc means data = football;
var scores;
by division team;
run;
```

?

PG


09-20-2016 11:27 PM

Tom,

Using the class statement was dead on. I racked my brain for a few hours trying to figure out how to code the problem correctly. Thanks for your help!

Adam


Posted in reply to BoboTheFool

09-20-2016 11:05 PM

In addition to @Tom's solution, if you want control over the levels, look at the TYPES and WAYS statements in PROC MEANS.

They let you control which combinations of class variables are produced. They are super useful and underutilized.
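
As a rough sketch of what those two statements look like for this problem (assuming the analysis variable is named scores, as in the original question, and not run against the poster's data):

```
/* TYPES: request only the division means and the */
/* team within division means.                    */
proc means data=football mean ;
class division team ;
types division division*team ;
var scores ;
run;

/* WAYS: request every combination of exactly 2 class */
/* variables, which here is just division*team.       */
proc means data=football mean ;
class division team ;
ways 2 ;
var scores ;
run;
```

TYPES names the combinations explicitly, while WAYS selects them by how many class variables they involve, so WAYS tends to be handier when there are many class variables.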