MVP
Calcite | Level 5

Hello,

 

I have a dataset that requires a log transformation because the data are skewed. However, the log transformation turns some of my values negative, which prevents me from obtaining a geometric mean. Those values are still important to my analysis, so I was advised to use a log(x+1) transformation.

 

My question is: will PROC SURVEYMEANS with the ALLGEO option take the "+1" of the log(x+1) transformation into account when it calculates the geometric mean, or does that have to be handled in the SAS code?

 

Any advice is greatly appreciated.

 

Thanks,

 

Mark

2 REPLIES
Rick_SAS
SAS Super FREQ

I don't think you want the geometric mean of the log-transformed values. You want the arithmetic mean.

The reason is that the arithmetic mean of the log-transformed data is equal to the logarithm of the geometric mean of the original data.

In symbols, if y_i = log(x_i), then

mean(y) = (1/n) Sum y_i = (1/n) Sum log(x_i) = log( (Prod x_i)^(1/n) ) = log( GeoMean(x) )

or

exp( mean(y) ) = GeoMean(x)
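
A quick numeric check of that identity, written as a small Python sketch rather than SAS and using made-up values (nothing here comes from the poster's data). Note that some of the logged values are negative, which is no obstacle to taking their arithmetic mean:

```python
import math

def geometric_mean(values):
    """Direct definition: n-th root of the product (needs positive values)."""
    return math.prod(values) ** (1.0 / len(values))

def exp_mean_log(values):
    """Back-transform the arithmetic mean of the logged values."""
    logs = [math.log(v) for v in values]
    return math.exp(sum(logs) / len(logs))

# Made-up illustrative data; log(0.5) is negative.
x = [0.5, 2.0, 8.0, 4.0]

# The two numbers printed should agree up to floating-point rounding.
print(geometric_mean(x), exp_mean_log(x))
```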

 

 

When you add 1 to the data, you are changing the reference value for the measurement. There is not always an easy way to interpret statistics on the log(x+1) scale in terms of the original measurements. This does not invalidate the transformation; it just means that the results are harder to interpret.
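
To make the interpretation issue concrete, here is a small Python sketch (illustrative only, with made-up values and hypothetical function names). It assumes the usual back-transform for log(x+1), namely exp(mean) - 1. The result is the geometric mean of (x+1), minus 1, which is not the geometric mean of x:

```python
import math

def geometric_mean(values):
    """exp of the arithmetic mean of log(x); needs positive values."""
    logs = [math.log(v) for v in values]
    return math.exp(sum(logs) / len(logs))

def back_transformed_log1p_mean(values):
    """exp(mean(log(x + 1))) - 1, computed with log1p/expm1 for accuracy."""
    logs = [math.log1p(v) for v in values]
    return math.expm1(sum(logs) / len(logs))

# Made-up illustrative data.
x = [0.5, 2.0, 8.0, 4.0]

gm_x = geometric_mean(x)                          # geometric mean of x
gm_shifted = geometric_mean([v + 1 for v in x])   # geometric mean of x + 1
from_log1p = back_transformed_log1p_mean(x)       # equals gm_shifted - 1

print(gm_x, gm_shifted - 1, from_log1p)
```

For these values the back-transformed log(x+1) statistic is noticeably larger than the geometric mean of x, so it should be reported as its own quantity rather than labeled "the geometric mean".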

 

MVP
Calcite | Level 5

Hi Rick,

 

Thank you for your response.

 

I should have clarified. Yes, once the data are transformed, I will take the arithmetic mean using PROC SURVEYMEANS, and then calculate the geometric mean from that.

 

From a practical standpoint, I was just wondering whether the SAS code differs at all between a log and a log(x+1) transformation, since most guidance out there describes the sequence: log transformation -> arithmetic mean -> back-calculation of the geometric mean.

 

I am using this guidance... https://support.sas.com/rnd/app/stat/examples/SurveyGeoMean/new_example/stat_webex.pdf
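
For what it's worth, the log -> arithmetic mean -> back-transform sequence can be sketched so that the shift is a single parameter. This is a Python illustration, not SAS, and it is not the method in the linked guidance: the function name, the `shift` parameter, and the data and weights are all hypothetical, and it only computes a weighted point estimate (no strata, clusters, or design-based standard errors as PROC SURVEYMEANS would provide). Under those assumptions, the only places the "+1" appears are the transform and the back-transform:

```python
import math

def weighted_back_transformed_mean(values, weights, shift=0.0):
    """Weighted mean of log(x + shift), back-transformed as exp(mean) - shift.

    shift=0.0 gives a weighted geometric mean of x;
    shift=1.0 is the log(x+1) variant.
    """
    total_weight = sum(weights)
    mean_log = sum(
        w * math.log(v + shift) for v, w in zip(values, weights)
    ) / total_weight
    return math.exp(mean_log) - shift

# Made-up values (including a zero) and made-up sampling weights.
x = [0.0, 2.0, 8.0, 4.0]
w = [10.0, 20.0, 5.0, 15.0]

# With shift=1.0 the zero is handled; with shift=0.0 math.log(0.0) would
# raise ValueError, which is why the plain log cannot be used on this data.
print(weighted_back_transformed_mean(x, w, shift=1.0))
```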

 
