CactusPete
Calcite | Level 5

I am working with NHANES blood pressure laboratory data, which includes multiple systolic and diastolic blood pressure readings (systolic = BPXSY1, BPXSY2, and BPXSY3). I am trying to find the average of these readings, but my current code simply copies the first reading and displays that as the average (BPXSY_avg). Currently BPXSY_avg = BPXSY1, but it should be (BPXSY1 + BPXSY2 + BPXSY3) / 3. The code I am trying is below, and an image of my results is attached:

/*Calculate the mean of BPXSY1, BPXSY2, and BPXSY3*/
proc means data=COHORT3;
var BPXSY1 BPXSY2 BPXSY3;
output out=MEANS mean=BPXSY_avg;
run;

/*Display the mean value*/
proc print data=MEANS;
var BPXSY_avg;
run; 

nhanesQ.jpg

Accepted Solutions
mkeintz
PROC Star

So you want the mean across columns (variables), not mean across rows (observations).  That's one mean per obs, right?

 

Then

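/* HAVE is a placeholder for your input data set (COHORT3 in the question) */
data want;
  set have;
  /* One mean per observation; MEAN() skips missing readings */
  BPX_mean=mean(BPXSY1, BPXSY2, BPXSY3);
run;
proc print data=want;
run;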
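
To illustrate the difference between the MEAN function and the written-out formula, here is a minimal sketch on made-up data (the two rows and the names avg_fn, avg_arith, and n_readings are hypothetical, not from NHANES). It assumes some participants may be missing a reading:

```sas
/* Hypothetical toy data: three readings per participant, one reading missing */
data have;
  input id BPXSY1 BPXSY2 BPXSY3;
  datalines;
1 120 124 122
2 130 . 134
;
run;

data want;
  set have;
  /* MEAN() averages only the nonmissing arguments */
  avg_fn = mean(BPXSY1, BPXSY2, BPXSY3);
  /* Plain arithmetic returns missing if any reading is missing */
  avg_arith = (BPXSY1 + BPXSY2 + BPXSY3) / 3;
  /* N() counts how many readings were actually recorded */
  n_readings = n(BPXSY1, BPXSY2, BPXSY3);
run;

proc print data=want;
run;
```

For the second row, avg_fn is the average of the two available readings, while avg_arith is missing. Which behavior is appropriate depends on how you want to treat participants with fewer than three readings.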


5 REPLIES
Astounding
PROC Star

Since you want to capture the means of three variables, you need to supply three names.  You only supply one name here:

output out=MEANS mean=BPXSY_avg;

Therefore, BPXSY_avg becomes the mean of the first variable in the VAR statement.  Supply three names after MEAN=, and SAS will save a mean for each of the three variables.
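
As a sketch of that fix, the step below lists one output name per VAR variable. The names BPXSY1_avg, BPXSY2_avg, and BPXSY3_avg are illustrative choices, not anything from the original post:

```sas
/* One output name per variable on the VAR statement, in the same order */
proc means data=COHORT3 noprint;
  var BPXSY1 BPXSY2 BPXSY3;
  output out=MEANS mean=BPXSY1_avg BPXSY2_avg BPXSY3_avg;
run;

/* MEANS holds one row with the three column means */
proc print data=MEANS;
run;
```

Note that this still gives three separate means computed down each column, not one average per participant.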

 

Of course, if your intended result is something else, we can discuss what you want and how to obtain it.  From your post, it sounds like you need to add together the three variables before using PROC MEANS, and use that total variable in the VAR statement.

CactusPete
Calcite | Level 5
Ok, that makes sense. Yes, I am looking to add the three variables together and finding the average of those 3. How would I go about this? I apologize as I am very new to SAS.
PaigeMiller
Diamond | Level 26

PROCs, such as PROC MEANS, generally do their arithmetic down the columns; that is why you get the mean of all the BPXSY1 values. If you want the mean across the three columns (not down the columns), you can do this in a DATA step, as shown by @mkeintz.

--
Paige Miller
ballardw
Super User

@CactusPete wrote:
Ok, that makes sense. Yes, I am looking to add the three variables together and finding the average of those 3. How would I go about this? I apologize as I am very new to SAS.

The "easy" way to name output statistics is to let SAS name them for you. Run this code and look at the output data set.

proc means data=COHORT3;
   var BPXSY1 BPXSY2 BPXSY3;
   output out=MEANS mean= max= min= std= /autoname;
run;

The example code requests four summary statistics for every variable on the VAR statement. The AUTONAME option asks SAS to name each output statistic using the variable name as the base and the statistic as a suffix (for example, BPXSY1_Mean).

Or you can use lists if you want different statistics for different variables. Place the variable name(s) in parentheses after the requested statistic, followed by = and a matching list of output variable names.

proc means data=COHORT3;
var BPXSY1 BPXSY2 BPXSY3;
output out=MEANS mean(BPXSY1) =BPXSY_avg
       max( BPXSY2 BPXSY3) = sy2max sy3max;
run;

But for (BPXSY1 + BPXSY2 + BPXSY3)/3, for example, you need to create that variable for each observation in a DATA step before running PROC MEANS.
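
A minimal sketch of that two-step approach follows. The data set name COHORT3_avg is hypothetical, and MEAN() is used instead of the written-out formula so that a missing reading does not make the whole average missing:

```sas
/* Step 1: compute one average per observation */
data COHORT3_avg;
  set COHORT3;
  BPXSY_avg = mean(BPXSY1, BPXSY2, BPXSY3);
run;

/* Step 2: summarize the per-observation averages down the column */
proc means data=COHORT3_avg n mean std min max;
  var BPXSY_avg;
run;
```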

 

Second, related to actual use of the data: NHANES is a complex sample data source. If you want your results to match anything reported nationally, you need to use the weights in the data set AND account for the complex sample design. That typically means using PROC SURVEYMEANS or PROC SURVEYFREQ with the sample design variables as well as the weight.
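
As a rough sketch of what that could look like, the step below assumes COHORT3 still carries the standard NHANES design variables SDMVSTRA (stratum), SDMVPSU (PSU), and the two-year exam weight WTMEC2YR. Those names are assumptions about your data set; check the documentation for your survey cycle to confirm the right weight.

```sas
/* Per-observation average first (COHORT3_avg is a hypothetical name) */
data COHORT3_avg;
  set COHORT3;
  BPXSY_avg = mean(BPXSY1, BPXSY2, BPXSY3);
run;

/* Design-based mean and standard error */
proc surveymeans data=COHORT3_avg mean stderr;
  strata SDMVSTRA;   /* assumed stratum variable */
  cluster SDMVPSU;   /* assumed PSU variable */
  weight WTMEC2YR;   /* assumed exam weight */
  var BPXSY_avg;
run;
```

If COHORT3 is a subset of the full NHANES file, the usual recommendation is to run the procedure on the full file and identify the subgroup with a DOMAIN statement, so that the variance estimates reflect the whole design.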
