lheer
Calcite | Level 5

Hi everyone,

 

I want to run goodness-of-fit tests for several variables, but for every variable the three tests (KS, AD and C-vM) show the same p-values: 0.010, 0.005 and 0.005, respectively. This would mean that none of the variables is normally distributed. However, judging by the Q-Q plots and histograms, some of the variables do look normally distributed.

These look like default p-values, since they are identical for every variable.

This is what I did:

 

proc univariate data=<library.dataset>;
var <variables>;
histogram /normal;
qqplot;
run;

 

Can anyone help me?

 

Thank you!

1 ACCEPTED SOLUTION

Ksharp
Super User

@Rick_SAS has written many blog posts about these GOF tests, especially for cases where the data set is not big enough or contains some integer values.

These tests (KS, AD and C-vM) cannot always be trusted.

I would go by the Q-Q plot.

PaigeMiller
Diamond | Level 26

Show us what you are seeing.

--
Paige Miller
lheer
Calcite | Level 5

I see these p-values for every single variable, but with different D, W-Sq and A-Sq values. [Attachment: Capture.PNG]

ballardw
Super User

@lheer wrote:

I see these p-values for every single variable, but with different D, W-Sq and A-Sq values. [Attachment: Capture.PNG]


Send the output to a data set and you can examine the actual p-values, which are almost certainly minuscule. The tables the procedure displays use a threshold value instead of attempting to fit a value like 0.00000000004583 into a 6-column display.
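
A minimal sketch of that suggestion, assuming the tests come from the HISTOGRAM statement's NORMAL option as in the original post. SASHELP.HEART and the two variable names are stand-ins for your own library.dataset and variables, and WORK.GOF is an arbitrary name for the captured table. Whether the stored p-value is more precise than the printed threshold is worth checking in the resulting data set rather than assuming.

```sas
/* Sketch only: SASHELP.HEART, Cholesterol and Systolic are stand-ins
   for your own data set and variables; WORK.GOF is an arbitrary name. */

/* Capture the goodness-of-fit table produced by HISTOGRAM / NORMAL */
ods output GoodnessOfFit=work.gof;

proc univariate data=sashelp.heart;
   var Cholesterol Systolic;
   histogram Cholesterol Systolic / normal;
run;

/* List the columns that the captured table actually contains */
proc contents data=work.gof varnum;
run;

/* Print the stored statistics and p-values for each variable and test */
proc print data=work.gof noobs;
run;
```

Running PROC CONTENTS first shows which columns hold the statistic, the p-value and any inequality sign, so nothing about the table layout has to be guessed.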

Rick_SAS
SAS Super FREQ

I'm going to guess that you are testing a large data set? As Ksharp suggests, you might want to read the article "Goodness-of-fit tests: A cautionary tale for large and small samples".

 

Anyway, a more important question is WHY you want to test many variables for normality. What are you trying to accomplish? Why does a lack of normality bother you?
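
To see the large-sample effect for yourself, a small simulation along the following lines may help. This is only a sketch and is not taken from the article: the sample size, seed, mean, standard deviation and the rounding to integers are all arbitrary assumptions, chosen to mimic a large table of integer-valued measurements.

```sas
/* Sketch only: every number here (seed, n, mean, sd) is an assumption. */
data work.sim;
   call streaminit(12345);                    /* fixed seed for repeatability */
   do i = 1 to 100000;                        /* a "large" sample */
      x = round(rand('normal', 100, 15));     /* normal draws rounded to integers */
      output;
   end;
   drop i;
run;

proc univariate data=work.sim;
   var x;
   histogram x / normal;                      /* fitted normal curve plus GOF tests */
   qqplot x / normal(mu=est sigma=est);       /* Q-Q plot with a reference line */
run;
```

With a sample this large, one would expect the tests to report very small p-values even though the histogram and Q-Q plot look close to normal, because the tests pick up the small departures introduced by rounding. Reducing the loop to a few dozen observations lets you compare how the same tests behave on a small sample.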

Ksharp
Super User

@Rick_SAS You are right. The OP must have a big table.

PaigeMiller
Diamond | Level 26

@lheer wrote:

... but for every variable the three tests (KS, AD and C-vM) show similar p-values 0.010, 0.005, 0.005 respectively


This is NOT what the SAS output is showing. It does not show a value of 0.010 or 0.005 or 0.005 respectively.

 

It shows a value of <0.010 and <0.005 and <0.005, and these are not default values, these are calculations, rounded to some meaningful threshold.

--
Paige Miller