Contributor
Posts: 38

# When should I not transform a variable?

Hi everyone,

In regression modelling (logistic regression and linear regression), when is it best not to transform a variable? In other words:

- If a variable is normally distributed but has a very large range, should it still be transformed?

- Should binary variables be transformed? If not, why?

- What other reasons are there not to transform a variable?

Thanks

Paul

PROC Star
Posts: 7,471

## Re: When should I not transform a variable?

I am not a statistician, so I am responding based only on experience (and the stats I learned getting a PhD in Educational Psychology), and to ensure that I see the responses from those with more expertise (hi @Rick_SAS).

In my experience, the principal reason for doing any transformation is that you assume, or theory suggests, that the variable comes from a factor that has something other than a normal distribution.

I know from insurance claims that this holds for binary variables, since the frequency of insurance claims (a binary variable: have or don't have a claim) is one such distribution.

Art, CEO, AnalystFinder.com

New Contributor
Posts: 3

## Re: When should I not transform a variable?

Thanks, but surely by transforming a binary variable, you will completely ruin your chances of making any meaningful interpretation of it.
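To make the binary-variable point concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the data are simulated, `simple_ols` is a hypothetical helper, and `log(x + 1)` is just one arbitrary choice of transformation. A 0/1 variable only takes two values, so any transformation simply relabels them; the fitted values should come out the same and only the scale of the coefficient changes, which makes it harder to read without improving the fit.

```python
import math
import random


def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


random.seed(0)

# Simulated data (not from this thread): a binary predictor and a noisy outcome.
x = [random.randint(0, 1) for _ in range(200)]
y = [2.0 + 1.5 * xi + random.gauss(0, 1) for xi in x]

# "Transform" the binary variable: log(x + 1) maps 0 -> 0 and 1 -> log(2).
x_log = [math.log(xi + 1) for xi in x]

a_raw, b_raw = simple_ols(x, y)
a_log, b_log = simple_ols(x_log, y)

# Fitted values from each model, evaluated on its own version of the predictor.
fitted_raw = [a_raw + b_raw * xi for xi in x]
fitted_log = [a_log + b_log * xi for xi in x_log]
max_diff = max(abs(p - q) for p, q in zip(fitted_raw, fitted_log))

print(f"slope on raw x:      {b_raw:.4f}")
print(f"slope on log(x + 1): {b_log:.4f}")
print(f"largest difference in fitted values: {max_diff:.2e}")
```

On the raw 0/1 coding the slope is the difference in mean outcome between the two groups; after the transformation the same difference is spread over a gap of log(2) instead of 1, so the slope is rescaled while the predictions stay put.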

Is the presence of outliers a good enough reason to warrant a transformation? Some variables are normally distributed but have outliers. In this case, will it still be necessary to transform the variable?

Thanks

Posts: 1,915

## Re: When should I not transform a variable?

fbgeoff wrote:

Is the presence of outliers a good enough reason to warrant a transformation? Some variables are normally distributed but have outliers. In this case, will it still be necessary to transform the variable?