BrentFulton
Calcite | Level 5

Hi,

I am using SAS 9.2 and would like to calculate the mean of variable x by the variable id. For some reason, I am creating an extra observation with the code below, so would appreciate knowing why and how to fix it.

Thanks,

Brent Fulton

UC Berkeley

data1: N=1000, with 300 unique ids. id is numeric and has no missing values. x has no missing values.

data2: N=301, where one observation has a missing id (that is, id = .).

My problem is that data2 should have 300 observations.

proc means data=data1 noprint;
  class id;
  var x;
  output out=data2 mean=x_mean;
run;

Accepted Solutions
ieva
Pyrite | Level 9

You should add the NWAY option:

proc means data=data1 noprint nway;
  class id;
  var x;
  output out=data2 mean=x_mean;
run;

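The extra observation is the overall mean across all of your observations, which is why its id is missing. You can see this in the variable _TYPE_, which identifies the level of summarization: with one CLASS variable, _TYPE_=0 is the overall row and _TYPE_=1 marks the by-id rows. NWAY keeps only the rows at the highest _TYPE_ level.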
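
To check this on your own system, here is a minimal sketch that runs the step without NWAY on a small made-up data set and prints the result. The names demo and demo_out and the data values are assumptions for illustration only, not anything from your data.

```sas
/* Hypothetical toy data, for illustration only */
data demo;
  input id x @@;
  datalines;
1 10 1 20 2 20 2 30
;
run;

/* No NWAY: the output should contain the overall row (_TYPE_=0, id missing) */
/* plus one row per id (_TYPE_=1)                                            */
proc means data=demo noprint;
  class id;
  var x;
  output out=demo_out mean=x_mean;
run;

proc print data=demo_out;
run;
```

If you would rather keep PROC MEANS as it is, a WHERE= data set option on the output data set, such as out=demo_out(where=(_type_=1)), should leave the same rows as NWAY does in this one-CLASS-variable case.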


Doc_Duke
Rhodochrosite | Level 12

ieva's approach would get rid of the grand mean, but the missing is still a valid value. To exclude it, use a WHERE= data set option on the DATA= option of PROC MEANS:

proc means data=data1(WHERE=(id ^= .)) noprint nway;
  class id;
  var x;
  output out=data2 mean=x_mean;
run;

BrentFulton
Calcite | Level 5

This was a helpful answer. My data doesn't have a missing id, but I'm likely to have a dataset that does in the future.

MikeZdeb
Rhodochrosite | Level 12

Hi ... the default behavior of PROC MEANS (and SUMMARY) is to ignore missing values for variables in a CLASS statement (both numeric and character). So, a WHERE data set option is not needed to leave out observations with missing IDs. You have to be proactive and add the MISSING option to have missing CLASS values included ...

data x;
  input id x @@;
  datalines;
1 10 1 20 2 20 2 30 . 40 . 50
;
run;

proc means data=x noprint nway;
  class id;
  var x;
  output out=y mean=x_mean;
run;

proc means data=x noprint nway missing;
  class id;
  var x;
  output out=z mean=x_mean;
run;

DATA SET Y: NO MISSING OPTION

id    _TYPE_    _FREQ_    x_mean
 1       1         2        15
 2       1         2        25

DATA SET Z: MISSING OPTION ADDED

id    _TYPE_    _FREQ_    x_mean
 .       1         2        45
 1       1         2        15
 2       1         2        25

P.S. If the goal is to produce a data set, why not just use PROC SUMMARY, where NOPRINT is the default ...

proc summary data=x nway;
  class id;
  var x;
  output out=y mean=x_mean;
run;
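
If only id and x_mean are needed downstream, one possible follow-up is to drop the automatic _TYPE_ and _FREQ_ variables with a DROP= data set option on OUT=. This is a sketch rather than something from the posts above. The output name y_slim is made up, and the toy data step is repeated only so that the example runs on its own.

```sas
/* Same toy data as in the example above, repeated so this runs standalone */
data x;
  input id x @@;
  datalines;
1 10 1 20 2 20 2 30 . 40 . 50
;
run;

/* y_slim is a hypothetical name; DROP= removes the automatic variables */
/* so that only id and x_mean remain in the output data set             */
proc summary data=x nway;
  class id;
  var x;
  output out=y_slim(drop=_type_ _freq_) mean=x_mean;
run;
```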

