Hi everyone,
I want to run some goodness-of-fit tests for several variables, but for every variable the three tests (KS, AD and C-vM) show similar p-values 0.010, 0.005, 0.005 respectively. This would mean that none of the variables is normally distributed. However, judging by the Q-Q plots and histograms, some of the variables do look approximately normal.
It looks like these p-values are default p-values, since for every variable these p-values are the same.
This is what I did:
proc univariate data=<library.dataset>;
var <variables>;
histogram /normal;
qqplot;
run;
Can anyone help me?
Thank you!
@Rick_SAS has written many blog posts about these GOF tests, especially for cases where the data set is not big enough or contains integer values.
These tests (KS, AD and C-vM) cannot always be trusted.
I would go by the Q-Q plot.
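If it helps, here is a minimal sketch of a Q-Q plot with a reference line for the fitted normal distribution, which makes departures from normality easier to judge by eye. The data set sashelp.class and the variables height and weight are only stand-ins for illustration; substitute your own.

```sas
/* Sketch only: sashelp.class, height and weight are placeholder names */
proc univariate data=sashelp.class;
   var height weight;
   /* MU=EST and SIGMA=EST draw a reference line using the sample estimates */
   qqplot / normal(mu=est sigma=est);
run;
```

Points that stay close to the line suggest the normal model is reasonable; systematic curvature or stair-steps (from integer-valued data) suggest it is not.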
Show us what you are seeing.
I see these p-values for every single variable, but with different D-, W-Sq- and A-Sq-values.
@lheer wrote:
I see these p-values for every single variable, but with different D-, W-Sq- and A-Sq-values.
Send the output to a data set and you can examine the, almost certainly minuscule, actual p-values. The tables the procedure reports use a threshold value instead of attempting to fit a value like 0.00000000004583 into a 6-column display.
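A rough sketch of how that could look, assuming the goodness-of-fit table produced by HISTOGRAM / NORMAL has the ODS name GoodnessOfFit and a numeric column called pValue (run ODS TRACE ON or PROC CONTENTS to confirm both on your installation). The data set sashelp.class and the output name work.gof are placeholders. Whether the stored value is more precise than the displayed bound is something to check in the result rather than take for granted.

```sas
/* Assumption: the ODS table created by HISTOGRAM / NORMAL is named GoodnessOfFit */
ods output GoodnessOfFit=work.gof;

proc univariate data=sashelp.class;   /* placeholder data set */
   var height weight;                 /* placeholder variables */
   histogram / normal;
run;

/* List the columns that were actually captured */
proc contents data=work.gof;
run;

/* Assumption: the p-value column is named pValue; BEST12. shows the stored number */
proc print data=work.gof;
   format pValue best12.;
run;
```

If the table also carries a sign column (for example a "<" next to the p-value), then the stored number is a bound and not an exact p-value.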
I'm going to guess that you are testing large data? As KSharb suggests, you might want to read the article "Goodness-of-fit tests: A cautionary tale for large and small samples".
Anyway, a more important question is WHY you want to test many variables for normality. What are you trying to accomplish? Why does a lack of normality bother you?
@Rick_SAS You are right. The OP must have a big table.
@lheer wrote:
... but for every variable the three tests (KS, AD and C-vM) show similar p-values 0.010, 0.005, 0.005 respectively
This is NOT what the SAS output is showing. It does not show a value of 0.010 or 0.005 or 0.005 respectively.
It shows a value of <0.010 and <0.005 and <0.005, and these are not default values, these are calculations, rounded to some meaningful threshold.