Solved: t-test and non-normally distributed data

haoduonge · Posted 05-07-2022 04:08 PM

Hi all,

I have a very basic analysis (t-test) and need your comments on it.

One assumption for a t-test is "the dependent variable should be normally distributed for each category of the independent variable". But also, "it is quite 'robust' to violations of normality, meaning that this assumption can be a little violated and still provide valid results".

Therefore, I have decided to use the rank-sum test (non-parametric) only when both groups (of the independent variable) are not normally distributed. However, when one group has a small sample size N, that group is mostly normally distributed.

For example, with two groups of 100, both are not normally distributed.

But with one group of 180 and one group of 20, the group of 20 is mostly normally distributed in all variables examined.

I really appreciate your advice on this basic issue.

Thanks!

Hao

https://statistics.laerd.com/stata-tutorials/independent-t-test-using-stata.php

PaigeMiller · Posted 05-08-2022 06:48 AM

Using the Central Limit Theorem, as your N increases, the distribution of the mean approaches a normal distribution. So, in my work, we usually have large N values, and so I don't worry about it. I go ahead and use the t-test.

Naturally, if your N is low (let's say < 50), you might want to use a non-parametric test such as those found in PROC NPAR1WAY or PROC UNIVARIATE.
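The Central Limit Theorem point above can be illustrated with a small simulation. This is only a sketch in Python rather than SAS, and its specifics are assumptions not taken from this thread: the exponential distribution as the skewed population, the sample sizes 5, 30 and 200, the 4000 replicates, and the helper names `skewness` and `skew_of_sample_means`. It estimates how skewed the distribution of the sample mean is at each N; a value near 0 means close to symmetric.

```python
import random
import statistics


def skewness(values):
    """Sample skewness: the average cubed z-score (near 0 for symmetric data)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)


def skew_of_sample_means(n, reps=4000, seed=1):
    """Skewness of the mean of n exponential draws, estimated over `reps` samples."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]
    return skewness(means)


# Assumed sample sizes, chosen only to show the trend as N grows.
results = {n: skew_of_sample_means(n) for n in (5, 30, 200)}
for n, g in results.items():
    print(f"n={n:>3}  skewness of the sample mean = {g:.2f}")
```

The raw exponential data stay strongly skewed at every N; what becomes more symmetric is the distribution of the mean, which is the quantity the t-test relies on.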

--
Paige Miller

PaigeMiller · Posted 05-07-2022 05:43 PM

Many people, including me, will not click on links to "unknown" web sites. I suggest that you copy and paste whatever you want us to see into a message here.

Also, it's not clear to me what your question is, and it doesn't really appear that you have actually asked a question.

--
Paige Miller

haoduonge · Posted 05-07-2022 06:06 PM

You don't need to click on that link; it is just the basic knowledge about the assumption (normal distribution) for a t-test.

Even though the t-test requires the assumption of normally distributed data, it is quite "robust" to violations of the normality assumption.

My question is:

Do you use a non-parametric test (e.g., rank-sum test) whenever the assumption of normality is violated, or are you lenient about it? If so, at what level?

Hao

Ksharp · Posted 05-08-2022 05:03 AM

"Do you use a non-parametric test (e.g., rank-sum test) whenever the assumption of normality is violated"

Yes. I would also try the Wilcoxon test and compare the two results to see if they agree. If not, I would rather trust the Wilcoxon test.

Or maybe @Rick_SAS would like to comment.
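The run-both-and-compare approach suggested above can be sketched as follows. This is a Python illustration under stated assumptions, not SAS output. The group sizes 180 and 20 come from the original question. The lognormal data, the shift of 0.5 on the log scale, and the function names `welch_t` and `rank_sum` are made up for the example. Both p-values use a normal approximation, and the rank-sum test has no continuity correction, so the numbers will not exactly match a SAS procedure. In SAS, the same comparison would typically be run with PROC TTEST and with PROC NPAR1WAY using the WILCOXON option.

```python
import math
import random
from statistics import NormalDist, fmean, variance


def welch_t(a, b):
    """Welch t statistic with a two-sided p-value from the normal approximation
    (an assumption made to stay within the standard library)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    t = (fmean(a) - fmean(b)) / se
    return t, 2 * (1 - NormalDist().cdf(abs(t)))


def rank_sum(a, b):
    """Wilcoxon rank-sum z statistic for group a, using average ranks for ties
    and a tie-corrected variance; two-sided p-value, no continuity correction."""
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    n1, n2, n = len(a), len(b), len(a) + len(b)
    w = 0.0          # sum of the ranks that belong to group a
    tie_term = 0.0   # sum of t**3 - t over groups of tied values
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        ties = j - i
        tie_term += ties ** 3 - ties
        w += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    mean_w = n1 * (n + 1) / 2
    var_w = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (w - mean_w) / math.sqrt(var_w)
    return z, 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical skewed data with the group sizes from the question.
rng = random.Random(2022)
big = [rng.lognormvariate(0.0, 1.0) for _ in range(180)]
small = [rng.lognormvariate(0.5, 1.0) for _ in range(20)]

t, p_t = welch_t(big, small)
z, p_w = rank_sum(big, small)
print(f"Welch t  = {t:.3f}, approximate p = {p_t:.4f}")
print(f"Wilcoxon z = {z:.3f}, approximate p = {p_w:.4f}")
```

If the two p-values lead to the same conclusion, the choice of test matters little for that variable; if they disagree, that is the case where the advice above is to rely on the Wilcoxon result.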
