## Logarithmic Scale?

I have the data below and know that the values were found using a logarithmic scale. How do I incorporate that into my t-test? Would I have to use a log statement somewhere? If so, where? If not, why not?

Thank you

```sas
data work.compare;
    length group $1;
    input group value;
    cards;
A 2.7
A 0.0
A 3.1
A 4.6
B 6.2
B 5.2
B 3.5
B 3.6
B 5.9
;
run;

ods graphics on;

/* Two-sample t-test of value by group, with box plots */
proc ttest data=work.compare plots=box;
    class group;
    var value;
run;

ods graphics off;
```

## Re: Logarithmic Scale?

Hi,

For some data, it is reasonable to assume that the logarithms are normally distributed. Taking logarithms often stabilizes the group variances. Compare the group variances on the original scale (assumed logarithmic) and on the back-transformed scale (assumed linear):

```sas
data compare;
    length group $1;
    input group Value;
    expValue = exp(value);
    label Value="Log Value" expValue = "Linear Value";
    cards;
A 2.7
A 0.0
A 3.1
A 4.6
B 6.2
B 5.2
B 3.5
B 3.6
B 5.9
;

ods listing exclude ConfLimits;

proc ttest data=compare plots=none;
    class group;
    var value expValue;
run;
```

```
The TTEST Procedure

Variable:  Value  (Log Value)

group          N        Mean     Std Dev     Std Err     Minimum     Maximum
A              4      2.6000      1.9166      0.9583           0      4.6000
B              5      4.8800      1.2677      0.5669      3.5000      6.2000
Diff (1-2)           -2.2800      1.5788      1.0591

Method           Variances        DF    t Value    Pr > |t|
Pooled           Equal             7      -2.15      0.0683
Satterthwaite    Unequal      5.0074      -2.05      0.0958

Equality of Variances

Method      Num DF    Den DF    F Value    Pr > F
Folded F         3         4       2.29    0.4414

Variable:  expValue  (Linear Value)

group          N        Mean     Std Dev     Std Err     Minimum     Maximum
A              4     34.3905     44.2774     22.1387      1.0000     99.4843
B              5       221.8       203.4     90.9600     33.1155       492.7
Diff (1-2)            -187.4       156.5       105.0

Method           Variances        DF    t Value    Pr > |t|
Pooled           Equal             7      -1.79      0.1174
Satterthwaite    Unequal       4.467      -2.00      0.1085

Equality of Variances

Method      Num DF    Den DF    F Value    Pr > F
Folded F         4         3      21.10    0.0311
```

Variances are indeed more stable on the (assumed) log scale. It is thus better justified to run parametric tests (such as the t-test) on that scale. No special option required.

PG

## Re: Logarithmic Scale?

I'm not sure why a logarithmic scale was used. Log transformation is usually applied to skewed data. With such a small sample size, it would be hard to validate the results statistically. Also, the interpretation of a t-test on log data is not the same as on the original data.

## Re: Logarithmic Scale?

stat@sas:  Although the log transform is used empirically on skewed data (the variance stabilization that PGStats mentioned), in some areas it is the standard approach to work with certain types of variables.  It may be that the data are known from other work to follow a lognormal distribution.

Bailey:  If the data are indeed log scaled, stat@sas' comment on interpretation deserves attention.  The test statistics are valid, and so are the means and S.D.s (on the log scale).  Where you run into challenges is that the difference of the means may not be interpretable in the original units.
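One way to read the log-scale difference is as a ratio: exponentiating a difference of log means gives a ratio of geometric means. For the output earlier in the thread, exp(-2.28) is roughly 0.10, so the geometric mean of group A is about a tenth of that of group B. The sketch below assumes the values are natural logs (as the earlier `exp()` back-transformation does) and uses the `DIST=LOGNORMAL` option of PROC TTEST on the back-transformed values, which tests the ratio of geometric means directly. The dataset and variable names are illustrative.

```sas
/* Rebuild the thread's data; expValue assumes Value is a natural log */
data compare;
    input group $ value @@;
    expValue = exp(value);
    datalines;
A 2.7 A 0.0 A 3.1 A 4.6
B 6.2 B 5.2 B 3.5 B 3.6 B 5.9
;
run;

/* Lognormal analysis on the linear scale:
   reports geometric means and tests their ratio (A/B) against 1 */
proc ttest data=compare dist=lognormal plots=none;
    class group;
    var expValue;
run;
```

The analysis requires strictly positive values, which holds here because the smallest back-transformed value is exp(0) = 1.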

## Re: Logarithmic Scale?

Agreed - I was concerned about the interpretation of results based on original vs log transformed data.

## Re: Logarithmic Scale?

Exponentiate the results for interpretation...

## Re: Logarithmic Scale?

What do you mean by "Exponentiate the results for interpretation" please?

## Re: Logarithmic Scale?

slg, that's the problem:  If you take a mean of logged data (as Bailey did in the t-test) and exponentiate that mean, it is NOT an estimate of the mean of the non-logged data. You end up with the geometric mean, which is much more difficult to interpret.
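In symbols, for positive values x_1, ..., x_n on the original scale (and assuming natural logs), exponentiating the mean of the logs gives the geometric mean, which can never exceed the arithmetic mean:

```latex
\exp\!\left(\frac{1}{n}\sum_{i=1}^{n}\ln x_i\right)
  = \left(\prod_{i=1}^{n} x_i\right)^{1/n}
  \le \frac{1}{n}\sum_{i=1}^{n} x_i
```

For group A in the output earlier in the thread, exp(2.6) is about 13.5, while the mean of the back-transformed values is 34.39.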

## Re: Logarithmic Scale?

I don't believe I stated that exponentiation produces the mean of the non-logged data. Assuming that the log transformation is warranted because of extreme skew or outliers, the mean of the non-logged data will not give useful information (i.e., it is not a typical value of the distribution). Log transformation "normalizes" the distribution, but the log-mean is not a meaningful value in terms of the original unit of measurement, so it makes sense to take the anti-log to bring the mean back to that unit. The new value (which, as you correctly point out, is the geometric mean) is a "normalized mean" and is, of course, different from the original mean. But assuming that the outliers in the original distribution are the result of anomalies, it is a better representation of the central location. Why is that difficult?

## Re: Logarithmic Scale?

In some fields, such as solution chemistry, it is quite natural to take the logarithms of measurements (e.g. concentrations) and build linear models on them. But at some point, predictions must be back-transformed to their original scale and true means are required. Many estimators of the true mean have been proposed for the back-transformation problem. My favorite is Duan's Smearing Estimator, which is fairly simple and robust.
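To see why a plain anti-log does not give a true mean, consider the textbook lognormal case. This is an assumption for illustration; nothing in the thread confirms that these data are lognormal. If ln X is normal with mean μ and variance σ², then:

```latex
\operatorname{median}(X) = e^{\mu},
\qquad
\mathbb{E}[X] = e^{\mu + \sigma^{2}/2}
```

Under that assumption, exponentiating the log-scale mean targets the median, and falls short of the true mean by a factor of exp(σ²/2).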

Log transformation is however not a good method for dealing with the skewing effect of outliers. You will be much better off with more robust location estimators such as the median or the Winsorized mean and scale estimators such as MAD, Qn or Sn.
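As a rough illustration of those robust estimators, PROC UNIVARIATE can report several of them. The sketch below assumes the thread's values are natural logs and works on the back-transformed scale. `WINSORIZED=1` (replace one observation in each tail) is an arbitrary choice for such small groups, and `ROBUSTSCALE` requests a table that includes MAD, Sn and Qn. The median appears in the default output.

```sas
/* Rebuild the thread's data; expValue assumes Value is a natural log */
data compare;
    input group $ value @@;
    expValue = exp(value);
    datalines;
A 2.7 A 0.0 A 3.1 A 4.6
B 6.2 B 5.2 B 3.5 B 3.6 B 5.9
;
run;

/* Robust location (median, Winsorized mean) and scale (MAD, Sn, Qn) by group */
proc univariate data=compare winsorized=1 robustscale;
    class group;
    var expValue;
run;
```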

PG

## Re: Logarithmic Scale?

What looks like a smearing estimator is presented in the PROC GLIMMIX documentation where DIST=LOGNORMAL is discussed (although it is NOT the nonparametric bootstrapped value in the original paper).  A big tip of the hat to you, PG, for giving me a name for what we have been doing, according to the stub definition on Wikipedia.

Steve Denham

## Re: Logarithmic Scale?

Hi, I don't think smearing estimates are available in SAS procedures. It is however easy to get them from the residuals.

Reference: Duan, N., 1983. Smearing estimate: a nonparametric retransformation method. J. Am. Stat. Assoc. 78, 605–610.
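A minimal sketch of getting a smearing estimate from the residuals, under stated assumptions: the values are natural logs, the model is a one-way fit with group as the only effect, and the residuals are pooled across both groups. The smearing factor is the average of the exponentiated residuals, and each back-transformed group mean is multiplied by it. The names `fit`, `smear` and `smear_factor` are illustrative.

```sas
/* Thread data on the (assumed natural) log scale */
data compare;
    input group $ value @@;
    datalines;
A 2.7 A 0.0 A 3.1 A 4.6
B 6.2 B 5.2 B 3.5 B 3.6 B 5.9
;
run;

/* Fit group means on the log scale and keep the residuals */
proc glm data=compare noprint;
    class group;
    model value = group;
    output out=fit predicted=pred residual=resid;
run;
quit;

proc sql;
    /* Smearing factor: mean of exp(residual) over all observations */
    create table smear as
        select mean(exp(resid)) as smear_factor
        from fit;

    /* Naive anti-log (geometric mean) versus smearing estimate per group */
    select f.group,
           exp(mean(f.value))                  as geometric_mean,
           exp(mean(f.value)) * s.smear_factor as smearing_estimate
    from fit as f, smear as s
    group by f.group, s.smear_factor;
quit;
```

Pooling the residuals relies on the two groups having similar log-scale variability. With group-specific residuals the estimate would simply reproduce each group's arithmetic mean on the linear scale.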

PG

## Re: Logarithmic Scale?

Raise the log base to the power of the result. For example, in your data the first case (A) = 2.7. Assuming that 2.7 is a log to base 10, exponentiation gives 10^2.7, or 501.2. If it is a log to base e, then 2.7183^2.7 = 14.88, and so on. Hope this helps.
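In SAS, the same arithmetic can be sketched with the `**` operator and the `EXP` function. Which one applies depends on the base of the original logs, which the thread does not state. The variable names are illustrative.

```sas
data _null_;
    logValue = 2.7;
    base10 = 10 ** logValue;   /* anti-log if the value is a log to base 10 */
    baseE  = exp(logValue);    /* anti-log if the value is a natural log */
    put base10= 8.1 baseE= 8.2;
run;
```

The log should show roughly 501.2 and 14.88, matching the figures above.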