leaning2sas
Calcite | Level 5

Hello!

I am a basic user and have been trying to get the geometric mean of a variable of interest (the concentration of a chemical in urine), stratified by year and age. Because the chemical is measured in urine, it must be adjusted for creatinine. The variable is not normally distributed, so I log-transformed it. I used PROC GLIMMIX because the data are non-normal, adjusted the model for creatinine to obtain predicted means, and then used those to get the geometric means. I am not sure whether my code is right, or even whether I am using the right approach. I would appreciate any guidance.

 

proc glimmix data = NH.demo0118_phth_fv2;
class timephth;
model mbp = timephth creatinine_phth/solution; * URXUCR=urinary_creatinine;
output out=NH.ph_ad pred=p_mbp_mean;
run;


proc surveymeans data = NH.ph_ad allgeo;
domain age timephth;
var p_mbp_mean;
title "Geometric mean of XX years of MBP";
ODS output domaingeomeans = NH.mbp_gm;
run;

Accepted Solutions
Rick_SAS
SAS Super FREQ

In case you wonder why Steve's suggestion works, recall that the geometric mean for a set of positive numbers is equivalent to computing the arithmetic mean of the log-transformed data and then using the exponential function to back-transform the result. See https://blogs.sas.com/content/iml/2019/09/30/what-is-a-geometric-mean.html
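The equivalence can be checked on a small example. The sketch below is illustrative only: the data set `toy` and the variables `conc` and `log_conc` are made-up names, and the five values are invented rather than taken from the thread.

```sas
/* Toy data: five hypothetical positive values (not from the thread) */
data toy;
   input conc @@;
   log_conc = log(conc);      /* natural log of each value */
   datalines;
1 2 4 8 16
;
run;

/* Arithmetic mean of the logs, then back-transform with EXP */
proc sql;
   select mean(log_conc)      as mean_log,
          exp(mean(log_conc)) as geo_mean
   from toy;
quit;
```

The product of the five values is 1024, whose fifth root is 4, so `geo_mean` should come out as 4 up to floating-point rounding.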

 

SteveDenham
Jade | Level 19

If you want an adjusted geometric mean, I would suggest the following code:

 

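proc glimmix data=NH.demo0118_phth_fv2;
   ods output lsmeans=geomeans;   /* save the least squares means to a data set */
   class timephth;
   /* value = response on its original (unlogged) scale; DIST=LOGNORMAL models log(value) */
   model value=timephth creatinine_phth/dist=lognormal;
   lsmeans timephth/cl;           /* adjusted means on the log scale, with confidence limits */
run;

data geomeans2;
   set geomeans;
   mean=exp(estimate);            /* back-transform to the geometric mean */
run;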

This will give you the geometric mean for each level of timephth, evaluated at the ARITHMETIC mean of creatinine_phth. You can also get confidence bounds by exponentiating the variables lower and upper in the geomeans dataset.
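A minimal sketch of that back-transformation, run on simulated data so it can be tried without the original file. The data set `sim`, the simulation coefficients, and the output names `geo_mean`, `geo_lower`, and `geo_upper` are assumptions made for illustration; only the model and the `estimate`, `lower`, and `upper` columns follow the code above.

```sas
/* Simulated stand-in data; all coefficients here are arbitrary */
data sim;
   call streaminit(1);
   do timephth = 1 to 3;
      do i = 1 to 50;
         creatinine_phth = 50 + 100*rand("uniform");
         value = exp(1 + 0.2*timephth + 0.005*creatinine_phth
                     + 0.5*rand("normal"));
         output;
      end;
   end;
   drop i;
run;

/* Same model as above: LS-means are reported on the log scale */
proc glimmix data=sim;
   ods output lsmeans=geomeans;
   class timephth;
   model value = timephth creatinine_phth / dist=lognormal;
   lsmeans timephth / cl;
run;

/* Exponentiate the estimate and both confidence limits */
data geomeans_ci;
   set geomeans;
   geo_mean  = exp(estimate);
   geo_lower = exp(lower);
   geo_upper = exp(upper);
run;

proc print data=geomeans_ci noobs;
   var timephth geo_mean geo_lower geo_upper;
run;
```

Because the limits are computed on the log scale and then exponentiated, the resulting interval is generally not symmetric around the geometric mean.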

 

SteveDenham

 

leaning2sas
Calcite | Level 5
I appreciate the explanation. The link was very helpful in understanding geometric mean since it is not used often in data analysis.
leaning2sas
Calcite | Level 5

Thank you very much! I appreciate your assistance with the code. I was confused while writing the code, but now I understand. 

leaning2sas
Calcite | Level 5

Just a follow-up question: since I'm using a survey dataset, can the same code be used? Also, how do I stratify by age? For example, I want the creatinine-adjusted geometric mean for age groups 6-12 and 13-18. I really appreciate any help you can provide. Thanks in advance!

SteveDenham
Jade | Level 19

If the data come from a survey, you may want to change to one of the appropriate survey procedures such as SURVEYMEANS or SURVEYREG. SURVEYMEANS can provide geometric means directly, while SURVEYREG would require transformation/back-transformation. Shifting to the SURVEY procedures requires you to have knowledge of the complex survey design used to collect the data.
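As a rough sketch of the two routes, the example below uses simulated data in place of the real survey file. The design variables `strata_id`, `psu_id`, and `samp_wt` are hypothetical placeholders, as are the data set names and simulation coefficients; the actual strata, cluster, and weight variables must come from the survey's documentation.

```sas
/* Simulated stand-in for a complex survey; all names are hypothetical */
data sim_svy;
   call streaminit(2024);
   do strata_id = 1 to 4;
      do psu_id = 1 to 2;
         do i = 1 to 25;
            samp_wt         = 100 + 900*rand("uniform");
            timephth        = 1 + mod(i, 3);
            creatinine_phth = 50 + 100*rand("uniform");
            value     = exp(1 + 0.2*timephth + 0.005*creatinine_phth
                            + 0.5*rand("normal"));
            log_value = log(value);   /* transform for SURVEYREG */
            output;
         end;
      end;
   end;
   drop i;
run;

/* Route 1: unadjusted geometric means by domain from SURVEYMEANS */
proc surveymeans data=sim_svy allgeo;
   strata  strata_id;
   cluster psu_id;
   weight  samp_wt;
   domain  timephth;
   var     value;
run;

/* Route 2: creatinine-adjusted means of log(value) from SURVEYREG */
proc surveyreg data=sim_svy;
   strata  strata_id;
   cluster psu_id;
   weight  samp_wt;
   class   timephth;
   model   log_value = timephth creatinine_phth / solution;
   lsmeans timephth / cl;
   ods output lsmeans=svy_lsmeans;
run;

/* Back-transform the log-scale LS-means and limits */
data svy_geomeans;
   set svy_lsmeans;
   geo_mean  = exp(estimate);
   geo_lower = exp(lower);
   geo_upper = exp(upper);
run;
```

Route 1 does not adjust for creatinine; Route 2 does, at the cost of the explicit log transform and back-transform.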

 

There are several ways to convert continuous variables such as age to categorical variables in a DATA step, but beware of the loss of power and increase in standard error when doing so. There is a lot of discussion in the literature about the dangers of categorizing continuous variables. From the short example you present, you would have to convince me that the difference between a 12-year-old and a 13-year-old (different categories) is markedly greater than that between a 12-year-old and a 6-year-old (same category).
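For the mechanics only, here is one possible DATA step that builds the two age groups mentioned in the question. The data set `ages`, the variable `age_group`, and the six test ages are invented for illustration, and the caution above about categorizing still applies.

```sas
/* Hypothetical ages chosen to hit each boundary of the two groups */
data ages;
   input age @@;
   length age_group $5;
   if      6  <= age <= 12 then age_group = "6-12";
   else if 13 <= age <= 18 then age_group = "13-18";
   else age_group = " ";      /* outside both ranges: left missing */
   datalines;
5 6 12 13 18 19
;
run;

/* Check the assignment, including the missing category */
proc freq data=ages;
   tables age_group / missing;
run;
```

With these six test values, each of the three categories (6-12, 13-18, and missing) should contain two observations. A variable built this way could then appear in a CLASS or DOMAIN statement.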

 

SteveDenham
