SAS Programming

Deekshana · Posted 01-08-2019 11:21 AM

Hi, I am new to SAS and statistics.

What is the best approach for transforming a non-normal distribution (with positive, negative, and zero values) to a normal one?

I came across a few internet sites that recommend a log transformation after adding a constant, but some say this is not a good approach.

I have attached a file (T test file) on which I want to run a t-test on the "overall difference" variable to see if my template worked better. Here the post-test score is the score after the template, whereas the pre-test score is the score before giving the template (i.e., instructions).

The data is skewed, and when I ran a normality check (Shapiro-Wilk) I got a p-value less than 0.05.

So based on the Shapiro-Wilk result and my histogram of the "overall score difference" variable, I concluded that my variable is not normally distributed, so I can't run a t-test, and I decided to do a normalizing transformation. But my difference variable has negative, positive, and zero values.

Thanks in advance for your help. Please ask me questions if I am not clear, as this is my first post.

Reeza · Posted 01-08-2019 11:38 AM

Instead, can you use a non-parametric option?

ballardw · Posted 01-08-2019 11:48 AM

With a smallish sample, 31 records, and paired observations, prescore and postscore, I would be strongly tempted to look at a non-parametric test such as Wilcoxon.

Deekshana · Posted 01-08-2019 12:06 PM

Hi ballardw, I remember some basic stats which say that if n<30 we use Wilcoxon, since the t-test applies to observations when n>30.

But my sample has n>30, so should I use Wilcoxon, or else transform the actual data and then do a t-test?

ballardw · Posted 01-08-2019 12:54 PM

@Deekshana wrote:

Hi ballardw, I remember some basic stats which say that if n<30 we use Wilcoxon, since the t-test applies to observations when n>30.

But my sample has n>30, so should I use Wilcoxon, or else transform the actual data and then do a t-test?

Before going to the complexity of transforming the data, I would tend to run both a Wilcoxon and a TTest with the PAIRED option and see how the results look.
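To make the "run both and compare" idea concrete outside of SAS, here is a minimal Python sketch (standard library only) of the two calculations on paired scores: the paired t statistic and the Wilcoxon signed-rank statistic with a normal-approximation p-value. The function names and the choice to drop zero differences and skip tie and continuity corrections are assumptions made for illustration; a statistics package such as SAS may handle zeros and ties differently and will report exact or corrected p-values.

```python
import math
from statistics import NormalDist, mean, stdev


def paired_t_statistic(pre, post):
    """Return (t, df) for a paired t-test on the differences post - pre."""
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


def wilcoxon_signed_rank(pre, post):
    """Return (W_plus, W_minus, two-sided p) via the normal approximation.

    Assumptions of this sketch: zero differences are dropped, tied absolute
    differences share their average rank, and no tie or continuity
    correction is applied to the variance.
    """
    diffs = [b - a for a, b in zip(pre, post) if b != a]
    n = len(diffs)
    if n == 0:
        raise ValueError("all differences are zero")

    # Rank the absolute differences, averaging ranks within ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)

    # Under the null hypothesis each rank sum has this mean and variance.
    mu = n * (n + 1) / 4
    sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (min(w_plus, w_minus) - mu) / sigma
    p_two_sided = min(1.0, 2 * NormalDist().cdf(z))
    return w_plus, w_minus, p_two_sided


if __name__ == "__main__":
    # Hypothetical scores, NOT the data attached to this thread.
    pre = [10, 12, 9, 14, 11, 13, 8, 15]
    post = [12, 15, 9, 13, 16, 17, 14, 22]
    print(paired_t_statistic(pre, post))
    print(wilcoxon_signed_rank(pre, post))
```

Note that the signed-rank test works directly on differences that are negative, positive, or zero, so no transformation is needed before running it.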

Also, those guidelines about thirty are general in nature. A Wilcoxon test is just less efficient if the data is actually normal, but it will work on much larger data sets. The advantage is that the only requirement is that the numeric data values have some actual meaning, such as a measurement.

The >30 rule reflects that when you treat multiple groups of more than 30 records, the means tend to behave more or less normally. So you likely don't need to do a transform at all, which is why I suggest trying both and comparing results.
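The point about means behaving more or less normally can be illustrated with a small simulation. The sketch below is an assumption-laden example, not an analysis of the attached file: it draws samples of size 31 from a shifted exponential distribution (chosen here only because it is clearly skewed and produces negative, zero-adjacent, and positive values), then compares the skewness of the raw values with the skewness of the sample means.

```python
import random
from statistics import mean, pstdev


def skewness(values):
    """Population skewness: the average cubed z-score."""
    m = mean(values)
    s = pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


def simulate_mean_skewness(n=31, replicates=2000, seed=1):
    """Return (skewness of raw values, skewness of the sample means).

    Each replicate draws n values from Exponential(1) shifted to have
    mean zero, an assumed stand-in for a skewed difference variable.
    """
    rng = random.Random(seed)
    raw, means = [], []
    for _ in range(replicates):
        sample = [rng.expovariate(1.0) - 1.0 for _ in range(n)]
        raw.extend(sample)
        means.append(mean(sample))
    return skewness(raw), skewness(means)


if __name__ == "__main__":
    raw_skew, mean_skew = simulate_mean_skewness()
    print(f"skewness of raw values:   {raw_skew:.2f}")
    print(f"skewness of sample means: {mean_skew:.2f}")
```

In this setup the sample means are far less skewed than the individual values, which is the behavior the t-test relies on. How well that carries over to a real data set depends on how heavy-tailed the actual differences are, so it is still worth comparing the t-test with the Wilcoxon result.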

Which is the best approach to transform non normal data(+Ve,-Ve,0 values) distribution to normal