SAS Procedures

Help using Base SAS procedures
BrentFulton
Calcite | Level 5

Hi,

I am using SAS 9.2 and would like to calculate the mean of variable x by the variable id. For some reason, the code below creates an extra observation, and I would appreciate knowing why and how to fix it.

Thanks,

Brent Fulton

UC Berkeley

data1: N=1000, with 300 unique ids. id is numeric and none are missing. No x is missing.

data2: N=301, where one id is missing (that is, id = .).

My problem is that data2 should have 300 observations.

proc means data=data1 noprint;
   class id;
   var x;
   output out=data2 mean=x_mean;
run;

Accepted Solution
ieva
Pyrite | Level 9

You should add nway option:

proc means data=data1 noprint nway;
   class id;
   var x;
   output out=data2 mean=x_mean;
run;

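The extra observation is the mean over all of your observations (the grand mean). You can see this in the variable _TYPE_, which identifies the level of each summary row: with one CLASS variable, _TYPE_=0 is the overall row and _TYPE_=1 marks the per-id rows.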
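
To see the _TYPE_ levels for yourself, here is a minimal sketch on toy data. The data set names demo, all_levels, and by_id are made up for illustration and are not from the original question.

```sas
/* Hypothetical toy data: two ids, two observations each */
data demo;
   input id x @@;
   datalines;
1 10 1 20 2 20 2 30
;
run;

/* No NWAY: the output holds the overall row (_TYPE_=0, id missing) */
/* plus one row per id (_TYPE_=1)                                   */
proc means data=demo noprint;
   class id;
   var x;
   output out=all_levels mean=x_mean;
run;

/* Keeping only _TYPE_=1 gives the same rows that NWAY would keep here */
data by_id;
   set all_levels;
   where _type_ = 1;
run;

proc print data=all_levels;
run;
```

In all_levels, the row with a missing id is the summary over all four observations, not a group of records whose id is missing. That is why it disappears once NWAY is added.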

Doc_Duke
Rhodochrosite | Level 12

ieva's approach would get rid of the grand mean, but a missing id is still a valid value. To exclude it, use a WHERE= data set option on the input data set in PROC MEANS:

proc means data=data1(WHERE=(id ^= .)) noprint nway;
   class id;
   var x;
   output out=data2 mean=x_mean;
run;

BrentFulton
Calcite | Level 5

This was a helpful answer. My data doesn't have a missing id, but I'm likely to have a dataset that does in the future.

MikeZdeb
Rhodochrosite | Level 12

Hi ... the default behavior of PROC MEANS (and SUMMARY) is to ignore missing values for variables in a CLASS statement (both numeric and character).  So, a WHERE data set option is not needed to leave out observations with missing IDs.  You have to be proactive to have missing CLASS values included ...

data x;
   input id x @@;
   datalines;
1 10 1 20 2 20 2 30 . 40 . 50
;
run;

proc means data=x noprint nway;
   class id;
   var x;
   output out=y mean=x_mean;
run;

proc means data=x noprint nway missing;
   class id;
   var x;
   output out=z mean=x_mean;
run;

DATA SET Y: NO MISSING OPTION

id    _TYPE_    _FREQ_    x_mean
1       1         2        15
2       1         2        25

DATA SET Z: MISSING OPTION ADDED

id    _TYPE_    _FREQ_    x_mean
.       1         2        45
1       1         2        15
2       1         2        25

P.S. If the objective is to produce a data set, why not just use PROC SUMMARY, where NOPRINT is the default ...

proc summary data=x nway;
   class id;
   var x;
   output out=y mean=x_mean;
run;
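
As a quick sanity check that the output really has one row per non-missing id, something like the following sketch could be run against the data sets x and y from the example above. The column aliases n_ids and n_rows are just illustrative names.

```sas
proc sql;
   /* COUNT(DISTINCT id) skips missing values, so this counts non-missing ids */
   select count(distinct id) as n_ids
      from x;

   /* Number of rows written by PROC SUMMARY with NWAY */
   select count(*) as n_rows
      from y;
quit;
```

If the two numbers agree, no overall row or missing-id row has slipped into the output.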

