Is there a function I can call directly that will calculate the p-value given the K-S D statistic and the number of entries in the two distributions? I need the function that NPAR1WAY uses behind the scenes. The problem is that I can't give NPAR1WAY what it needs (unweighted data; mine is weighted), so I use NPAR1WAY to get the D statistic and need to calculate the p-value myself. I could scale up the reported p-value, except that for my weighted data NPAR1WAY always reports p<0.0001, i.e. it doesn't return a value I can scale!
I need something like: myPvalue = KS_pValue(dStatistic,n1,n2);
Ranges for n1,n2 in my case are: 1000 < n1 < 100,000, 50 < n2 < 2000.
You can get the actual p-value in the output data set. The value you are seeing in the printed output is formatted with the PVALUE format.
data Arthritis;
   input Treatment $ Response Freq @@;
   datalines;
Active 5 5 Active 4 11 Active 3 5 Active 2 1 Active 1 5
Placebo 5 2 Placebo 4 4 Placebo 3 7 Placebo 2 7 Placebo 1 12
;
run;
proc npar1way data=Arthritis KS;
   class Treatment;
   var Response;
   freq Freq;
   output out=ks;   /* unformatted statistics and p-values */
run;
proc print data=ks;
run;

I looked in the output data set; all the p-values are 0. The problem is that my weights are huge, as high as 200,000. But the true p-value, calculated properly, should not be 0.
Might EXACT help? I fiddled with EXACT, MC, MAXTIME, etc., but it just sat there for long periods of time and I had to kill it. I still have to try N= as well.
Anyway, if I could just access the function NPAR1WAY uses I'd be fine...
The SAS help can take you to the details of the KS test, or just check out the web page:
SAS/STAT(R) 9.2 User's Guide, Second Edition
Newer versions have the same information.
With the sample sizes you have, even the smallest difference will be significant.
Yes, I saw the equations; I was hoping a function exists to do the work. I understand that in practice no one sums from zero to infinity: only a few terms are kept and special corrections are applied. So I'd much rather find a working function written by experts than have to research this on my own.
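For reference, the truncated series mentioned above can be sketched in a DATA step. This is only an illustration of the standard asymptotic two-sample Kolmogorov formula, p = 2 * sum over k of (-1)^(k-1) * exp(-2 k^2 z^2) with z = D * sqrt(n1*n2/(n1+n2)); it is not claimed to be what NPAR1WAY does internally. The input values, the data set name ks_p, the 0.2 cutoff for small z, and the 100-term cap are all assumptions made for the example.

```sas
/* Sketch: asymptotic two-sample K-S p-value from D, n1, n2.       */
/* d, n1, n2 below are hypothetical values, not taken from a run.  */
data ks_p;
   d  = 0.22;
   n1 = 5000;
   n2 = 500;

   /* Scaled statistic z = D * sqrt(n1*n2 / (n1+n2)) */
   z = d * sqrt(n1 * n2 / (n1 + n2));

   if z < 0.2 then p = 1;   /* series converges slowly here; p is 1 to many digits */
   else do;
      p   = 0;
      sgn = 1;
      /* Alternating series; the terms shrink very quickly for moderate z */
      do k = 1 to 100;
         expo = -2 * k * k * z * z;
         if expo < -700 then leave;   /* exp() would underflow to 0 */
         p   = p + sgn * exp(expo);
         sgn = -sgn;
      end;
      p = 2 * p;
   end;

   /* Log of the leading term: still usable when p itself underflows to 0 */
   log_p_lead = log(2) - 2 * z * z;

   put z= p= log_p_lead=;
run;
```

For the very large scaled statistics that huge weights produce, every term underflows and p comes out as exactly 0 in double precision, so working with log_p_lead is probably the more practical quantity in that regime.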
Part of your message suggests you are concerned about getting p<0.0001 instead of the actual (small) value, say p = 1.67*10^(-8). You can get a more exact printout by storing the relevant statistics with an ODS OUTPUT statement and then printing the stored data set. Here is a simple example (without frequencies) where the default printout gives <.0001. The last column of the KS2 data set has two rows: the first row is D and the second row is p (in scientific notation). The variable is called nValue2 (for some strange reason).
data a;
   do group = 1 to 2;
      do rep = 1 to 10000;
         y = group*.1 + rannor(1);
         output;
      end;
   end;
run;
proc npar1way data=a edf ks;
   class group;
   var y;
   ods output KolSmir2Stats=ks2;
run;
proc print data=ks2;
run;
I tried that previously; it just printed exactly 0 for all the p-values (see the columns of zeros below). Perhaps my weights are so large that it's hopeless to get a non-zero p-value that I can correct for the weights. It seems I'll just have to calculate the p-value myself.
I sure would like to find a function in SAS that does the calculation...
| | | | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Male | (<20) | ApoB | .002233328 | 6.2103 | 0.22041 | 0 | .000002242 | 17.335 | 0.22293 | 6.2813 | 0 |
| 2 | Male | (20-29) | ApoB | .002473620 | 7.3677 | 0.12183 | 0 | .000003057 | 27.119 | 0.12511 | 7.5660 | 0 |
| 3 | Male | (30-39) | ApoB | .005193976 | 15.9733 | 0.16763 | 0 | .000011119 | 105.162 | 0.16785 | 15.9943 | 0 |
| 4 | Male | (40-49) | ApoB | .003007115 | 9.7514 | 0.07357 | 0 | .000002399 | 25.230 | 0.09987 | 13.2380 | 0 |
| 5 | Male | (50-59) | ApoB | .004811441 | 14.8999 | 0.09631 | 0 | .000006126 | 58.747 | 0.12704 | 19.6538 | 0 |
| 6 | Male | (60-69) | ApoB | .007293936 | 19.9366 | 0.13297 | 0 | .000009974 | 74.515 | 0.19752 | 29.6162 | 0 |
| 7 | Male | (70-79) | ApoB | .005602209 | 11.4631 | 0.09621 | 0 | .000004495 | 18.821 | 0.15906 | 18.9521 | 0 |
| 8 | Male | (>=80) | ApoB | .005729162 | 7.4794 | 0.09921 | 0 | .000011073 | 18.872 | 0.13626 | 10.2727 | 0 |
| 9 | Female | (<20) | ApoB | .002512096 | 6.9934 | 0.22836 | 0 | .000002821 | 21.866 | 0.22836 | 6.9934 | 0 |
| 10 | Female | (20-29) | ApoB | .002397185 | 7.4033 | 0.09843 | 0 | .000002358 | 22.488 | 0.09896 | 7.4432 | 0 |
| 11 | Female | (30-39) | ApoB | .005747304 | 17.9301 | 0.16782 | 0 | .000014166 | 137.876 | 0.18721 | 20.0008 | 0 |
| 12 | Female | (40-49) | ApoB | .004009143 | 12.3046 | 0.08878 | 0 | .000007887 | 74.293 | 0.10704 | 14.8363 | 0 |
| 13 | Female | (50-59) | ApoB | .005250789 | 17.1014 | 0.10402 | 0 | .000006173 | 65.481 | 0.15556 | 25.5739 | 0 |
| 14 | Female | (60-69) | ApoB | .005870556 | 16.3198 | 0.10203 | 0 | .000011481 | 88.728 | 0.15890 | 25.4182 | 0 |
| 15 | Female | (70-79) | ApoB | .004183428 | 9.3058 | 0.07472 | 0 | .000003132 | 15.496 | 0.14135 | 17.6047 | 0 |
| 16 | Female | (>=80) | ApoB | .008706090 | 14.2755 | 0.16290 | 0 | .000015396 | 41.394 | 0.19818 | 17.3680 | 0 |
Not my area. But any formula other than the exact one will be an approximation. I doubt that the p-value scales linearly with n, so there would be no simple upscaling.
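A rough way to see the non-linear scaling is to keep only the leading term of the asymptotic Kolmogorov series, which is reasonable when the scaled statistic is large. The symbols n_e (effective sample size), c (a hypothetical common factor by which the weights inflate both sample sizes) and p_w (the p-value computed from the inflated sizes) are introduced here purely for illustration.

```latex
\begin{align*}
p &\approx 2\exp\!\left(-2 D^{2} n_e\right),
  \qquad n_e = \frac{n_1 n_2}{n_1 + n_2} \\
\ln p &\approx \ln 2 - 2 D^{2} n_e \\
\text{if } n_1 \to c\,n_1,\ n_2 \to c\,n_2:\quad
\ln\frac{p_w}{2} &\approx c\,\ln\frac{p}{2}
\quad\Longrightarrow\quad
p \approx 2\left(\frac{p_w}{2}\right)^{1/c}
\end{align*}
```

Under these assumptions it is ln p, not p itself, that is roughly linear in the sample size, so any correction for the weights would have to be applied on the log scale.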