Statistical Procedures

Programming the statistical procedures from SAS
MVP
Calcite | Level 5

Hello,

 

I have a dataset that requires a log transformation due to skewed data. However, the log transformation turns some of my values negative, which prevents me from obtaining a geometric mean of the transformed values. Those values are still important to my analysis, so I was advised to use a log(x+1) transformation.

 

My question is: will using PROC SURVEYMEANS with ALLGEO to calculate the geometric mean take into account the "+1" in the log(x+1) transformation, or does that have to be factored into the SAS code?

 

Any advice is greatly appreciated.

 

Thanks,

 

Mark

2 REPLIES
Rick_SAS
SAS Super FREQ

I don't think you want the geometric mean of the log-transformed values. You want the arithmetic mean.

The reason is that the logarithm of the geometric mean of the original data is equal to the arithmetic mean of the log-transformed data.

In symbols, if y_i = log(x_i), then

mean(y) = (1/n) Sum y_i = (1/n) Sum log(x_i) = log( (Prod x_i)^(1/n) ) = log( GeoMean(x) )

or

exp( mean(y) ) = GeoMean(x)
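
As a quick numeric check of this identity, here is a minimal sketch in Python (used only as a language-neutral calculator, not as a substitute for the SAS procedure). The data values are made up for illustration.

```python
import math

# Made-up positive values, purely for illustration.
x = [1.0, 2.0, 4.0, 8.0]

# Geometric mean computed directly: n-th root of the product.
geo_direct = math.prod(x) ** (1.0 / len(x))

# Geometric mean via the log scale: arithmetic mean of log(x), then exponentiate.
y = [math.log(v) for v in x]
mean_y = sum(y) / len(y)
geo_via_logs = math.exp(mean_y)

print(geo_direct, geo_via_logs)  # the two values agree up to rounding
```

Note that the arithmetic mean on the log scale is well defined even when some log values are negative (that is, when some x are between 0 and 1).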

 

 

When you add 1 to the data, you are changing the reference value for the measurement: exp( mean( log(x+1) ) ) is the geometric mean of x+1, not of x. There is not always an easy way to interpret statistics on the log(x+1) scale in terms of the original measurements. This does not invalidate the transformation; it just means that the results are harder to interpret.
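
To make the change of reference value concrete, here is a small sketch in Python with made-up data that include a zero. It assumes the common convention of subtracting 1 after exponentiating; whether that back-transformed number is the right summary for a given analysis is a judgment call.

```python
import math

# Made-up data including a zero, where log(x) is undefined but log(x+1) is fine.
x = [0.0, 1.0, 3.0, 7.0]

# Arithmetic mean on the log(x+1) scale.
y = [math.log1p(v) for v in x]           # log1p(v) computes log(1 + v)
mean_y = sum(y) / len(y)

geo_of_shifted = math.exp(mean_y)        # this is GeoMean(x + 1)
back_transformed = math.expm1(mean_y)    # GeoMean(x + 1) - 1

print(geo_of_shifted, back_transformed)
```

The back-transformed value is GeoMean(x+1) - 1, which in general is not the geometric mean of x. Here the geometric mean of x itself would be 0 because of the zero observation, while the shifted statistic is about 1.83.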

 

MVP
Calcite | Level 5

Hi Rick,

 

Thank you for your response.

 

I should have clarified. Yes, once the data are transformed, I will take the arithmetic mean using PROC SURVEYMEANS and then calculate the geometric mean from that.

 

From a practical standpoint, I was just wondering whether there is any difference in the SAS code between a log and a log(x+1) transformation, since most guidance out there uses log transformation -> arithmetic mean -> calculation of geometric mean.

 

I am using this guidance... https://support.sas.com/rnd/app/stat/examples/SurveyGeoMean/new_example/stat_webex.pdf
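
The workflow log transformation -> arithmetic mean -> back-transformation can be sketched as follows, to show where a "+1" would enter. This is a Python illustration with made-up values and weights, not the code from the linked guidance. The function name `geometric_mean_via_logs` and its `shift` parameter are hypothetical, and the sketch only computes a weighted point estimate: it ignores strata, clusters, and the design-based standard errors that PROC SURVEYMEANS provides.

```python
import math

def geometric_mean_via_logs(values, weights, shift=0.0):
    """Weighted mean on the log(x + shift) scale, back-transformed.

    shift=0.0 gives the usual weighted geometric mean of x;
    shift=1.0 gives GeoMean(x + 1) - 1. (Hypothetical helper.)
    """
    total_weight = sum(weights)
    mean_log = sum(w * math.log(v + shift)
                   for v, w in zip(values, weights)) / total_weight
    # The same shift must be undone by hand after exponentiating.
    return math.exp(mean_log) - shift

# Made-up observations and survey weights.
values = [0.5, 2.0, 8.0]
weights = [1.0, 2.0, 1.0]

gm_plain = geometric_mean_via_logs(values, weights)             # log path
gm_shift = geometric_mean_via_logs(values, weights, shift=1.0)  # log(x+1) path

print(gm_plain, gm_shift)
```

In this sketch the only differences between the two paths are the shift applied before taking logs and the matching subtraction after exponentiating. The averaging step itself knows nothing about the shift, so both adjustments have to be made explicitly.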

 
