03-21-2017 02:41 AM
I have a dataset that requires a log transformation due to skewed data. However, performing a log transformation changes some of my values to negative values, which do not allow me to obtain a geometric mean. Those values are still of importance to my analysis, so I was advised to use a log +1 transformation.
My question is: will using PROC SURVEYMEANS with the ALLGEO option to calculate the geometric mean take into account the "+1" in the log +1 transformation, or does that have to be factored into the SAS code?
Any advice is greatly appreciated.
03-21-2017 07:58 AM
I don't think you want the geometric mean of the log-transformed values. You want the arithmetic mean.
The reason is that the arithmetic mean of the log-transformed data is equal to the logarithm of the geometric mean of the original data.
In symbols, if y_i = log(x_i), then
mean(y) = (1/n) Sum y_i = (1/n) Sum log(x_i) = log( (Prod x_i)^(1/n) ) = log( GeoMean(x) )
exp( mean(y) ) = GeoMean(x)
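As a quick numerical check of that identity, here is a minimal sketch in Python rather than SAS, using made-up values. The function names and the data are only for illustration. Note that one of the logs is negative, and the arithmetic mean of the logs still back-transforms to the geometric mean.

```python
import math

def geo_mean(xs):
    """Geometric mean computed directly as (Prod x_i)^(1/n)."""
    return math.prod(xs) ** (1.0 / len(xs))

def geo_mean_via_logs(xs):
    """exp() of the arithmetic mean of y_i = log(x_i)."""
    ys = [math.log(x) for x in xs]
    return math.exp(sum(ys) / len(ys))

# Made-up positive values (an assumption, not data from this thread).
# log(0.5) is negative, which is fine for an arithmetic mean.
x = [0.5, 2.0, 8.0, 32.0]

direct = geo_mean(x)
via_logs = geo_mean_via_logs(x)
print(direct, via_logs)  # the two values agree up to rounding error
```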
When you add 1 to the data, you are changing the reference value for the measurement. There is not always an easy way to interpret statistics on the log(x+1) scale in terms of the original measurements. This fact does not invalidate the transformation; it just means that the results are harder to interpret.
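To make the effect of the "+1" concrete, here is a small sketch in Python rather than SAS, again with made-up values and hypothetical function names. Back-transforming the mean of log(x+1) gives the geometric mean of (x+1), so the 1 has to be subtracted by hand afterwards, and even then the result is generally not the geometric mean of x.

```python
import math

def geo_mean(xs):
    """Geometric mean via exp() of the arithmetic mean of the logs."""
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

def back_transformed_log1p_mean(xs):
    """exp(mean(log(x + 1))) - 1: undo the log, then undo the +1 shift."""
    ys = [math.log(x + 1.0) for x in xs]
    return math.exp(sum(ys) / len(ys)) - 1.0

# Made-up positive values (an assumption, not data from this thread).
x = [0.5, 2.0, 8.0, 32.0]

plain = geo_mean(x)                        # geometric mean of x
shifted = back_transformed_log1p_mean(x)   # GeoMean(x + 1) - 1
print(plain, shifted)  # different numbers for this data
```

In other words, the shifted statistic answers a slightly different question than the geometric mean of the original measurements, which is why it is harder to interpret.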
03-21-2017 10:04 AM
Thank you for your response.
I should have clarified. Yes, once the data is transformed, I will be taking the arithmetic mean using PROC SURVEYMEANS. I will then calculate the geometric mean from that point.
From a practical standpoint, I was just wondering whether the SAS code differs between a log and a log+1 transformation, since most guidance out there uses the sequence log transformation -> arithmetic mean -> calculation of the geometric mean.
I am using this guidance... https://support.sas.com/rnd/app/stat/examples/SurveyGeoMean/new_example/stat_webex.pdf