BrentFulton
Calcite | Level 5

Hi,

I am using SAS 9.2 and would like to calculate the mean of variable x by the variable id. For some reason, I am creating an extra observation with the code below, so would appreciate knowing why and how to fix it.

Thanks,

Brent Fulton

UC Berkeley

data1: N=1000, with 300 unique ids. id is numeric and never missing. x is never missing.

data2: N=301, including one observation where id is missing (.).

My problem is that data2 should have 300 observations, one per id.

proc means data=data1 noprint;
  class id;
  var x;
  output out=data2 mean=x_mean;
run;

ACCEPTED SOLUTION
ieva
Pyrite | Level 9

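You should add the NWAY option:

proc means data=data1 noprint nway;
  class id;
  var x;
  output out=data2 mean=x_mean;
run;

The extra observation is the overall mean across all of your observations, which is why its id is missing. You can see this in the _TYPE_ variable of the output data set, which identifies the level of each summary row: with one CLASS variable, _TYPE_=0 is the overall row and _TYPE_=1 marks the per-id rows.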
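To see the _TYPE_ values for yourself, here is a minimal sketch that leaves NWAY off. The data set names demo and demo_stats and the sample values are made up for illustration; they are not from the original data.

```sas
/* Hypothetical sample data: two ids, two observations each */
data demo;
  input id x @@;
  datalines;
1 10 1 20 2 20 2 30
;

/* No NWAY: the output holds the overall row (_TYPE_=0)
   plus one row per id (_TYPE_=1) */
proc means data=demo noprint;
  class id;
  var x;
  output out=demo_stats mean=x_mean;
run;

/* Inspect id, _TYPE_, _FREQ_ and x_mean for every row */
proc print data=demo_stats noobs;
run;
```

The printed data set should have three rows: one with _TYPE_=0 and a missing id (the overall mean), and two with _TYPE_=1. NWAY keeps only the rows with the highest _TYPE_ value. With a single CLASS variable, filtering the output with a WHERE= data set option, for example out=demo_stats(where=(_type_=1)), should give the same rows.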

Doc_Duke
Rhodochrosite | Level 12

ieva's approach gets rid of the grand mean, but a missing id would still be treated as a valid value. To exclude it, use a WHERE= data set option on the input data set of PROC MEANS:

proc means data=data1(where=(id ^= .)) noprint nway;
  class id;
  var x;
  output out=data2 mean=x_mean;
run;

BrentFulton
Calcite | Level 5

This was a helpful answer. My data doesn't have a missing id, but I'm likely to have a dataset that does in the future.

MikeZdeb
Rhodochrosite | Level 12

Hi ... the default behavior of PROC MEANS (and SUMMARY) is to exclude observations with missing values for variables in a CLASS statement (both numeric and character). So, a WHERE data set option is not needed to leave out observations with missing IDs. To have missing CLASS values included, you have to ask for them explicitly with the MISSING option ...

data x;
  input id x @@;
  datalines;
1 10 1 20 2 20 2 30 . 40 . 50
;
run;

proc means data=x noprint nway;
  class id;
  var x;
  output out=y mean=x_mean;
run;

proc means data=x noprint nway missing;
  class id;
  var x;
  output out=z mean=x_mean;
run;

DATA SET Y: NO MISSING OPTION

id    _TYPE_    _FREQ_    x_mean
1       1         2        15
2       1         2        25

DATA SET Z: MISSING OPTION ADDED

id    _TYPE_    _FREQ_    x_mean
.       1         2        45
1       1         2        15
2       1         2        25
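The same default also matters for the overall row when NWAY is left off. The sketch below repeats the sample data under a hypothetical name (demo_miss) and writes two hypothetical output data sets (all_default, all_missing) so the _TYPE_=0 rows can be compared side by side.

```sas
/* Same sample values as above, under a made-up data set name */
data demo_miss;
  input id x @@;
  datalines;
1 10 1 20 2 20 2 30 . 40 . 50
;

/* Default: observations with a missing id are excluded from the analysis */
proc means data=demo_miss noprint;
  class id;
  var x;
  output out=all_default mean=x_mean;
run;

/* MISSING: a missing id is treated as its own class level */
proc means data=demo_miss noprint missing;
  class id;
  var x;
  output out=all_missing mean=x_mean;
run;

proc print data=all_default noobs;
run;

proc print data=all_missing noobs;
run;
```

Compare _FREQ_ on the _TYPE_=0 row of each data set. As I understand the default, it should count only the 4 observations with a nonmissing id in all_default and all 6 observations in all_missing, so the overall mean differs between the two.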

P.S. If the goal is only to produce a data set, why not just use PROC SUMMARY, where NOPRINT is the default ...

proc summary data=x nway;
  class id;
  var x;
  output out=y mean=x_mean;
run;

