Calculating p-value for two-sample Kolmogorov-Smir...

08-25-2014 01:27 PM

Is there a function I can call directly to calculate the p-value from the K-S D statistic and the number of observations in each of the two samples? I need the function that NPAR1WAY uses behind the scenes. The problem is that I can't give NPAR1WAY what it needs: it expects unweighted data and mine is weighted. So I use NPAR1WAY to get the D statistic and need to calculate the p-value myself. I could rescale the reported p-value, except that for my weighted data NPAR1WAY always reports p<0.0001, i.e. it doesn't return a value I can scale!

I need something like: myPvalue = KS_pValue(dStatistic,n1,n2);

Ranges for n1,n2 in my case are: 1000 < n1 < 100,000, 50 < n2 < 2000.

08-25-2014 01:42 PM

You can get the actual p-value from the output data set. The value you are seeing in the listing is formatted with the PVALUE format.

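data arthritis;   /* data set name assumed; the DATA statement was missing from the post */
   input Treatment $ Response Freq @@;
   datalines;
Active 5 5 Active 4 11 Active 3 5 Active 2 1 Active 1 5
Placebo 5 2 Placebo 4 4 Placebo 3 7 Placebo 2 7 Placebo 1 12
;
run;

proc npar1way data=arthritis edf;   /* PROC statement was also missing; EDF requests the K-S test */
   class Treatment;
   var Response;
   freq Freq;
   output out=ks;
run;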

08-25-2014 02:44 PM

I looked in the output data set; all the p-values are 0. The problem is that my weights are huge, as high as 200,000. But the true p-value, calculated properly, should not be 0.

Might EXACT help? I fiddled with EXACT, MC, MAXTIME, etc., but it just sat there for long periods and I had to kill it. I still have to try N= as well.

Anyway, if I could just access the function NPAR1WAY uses I'd be fine...

08-25-2014 03:26 PM

The SAS help can take you to the details of the K-S test. Or just check out the web page

SAS/STAT(R) 9.2 User's Guide, Second Edition

New versions have the same information.

With the sample sizes you have, even the smallest difference will be significant.
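
For reference, the standard asymptotic approximation for the two-sample K-S test can be written as below. This is the textbook form; whether NPAR1WAY applies exactly this expression, or adds corrections, should be checked against the documentation for your SAS version. Here $D$ is the observed statistic and $n_1$, $n_2$ are the two sample sizes.

$$
z = D\,\sqrt{\frac{n_1 n_2}{n_1 + n_2}},
\qquad
\Pr(D_{n_1,n_2} > D) \;\approx\; 2\sum_{i=1}^{\infty} (-1)^{\,i-1}\, e^{-2 i^{2} z^{2}}
$$

Because the terms shrink like $e^{-2 i^2 z^2}$, the first term $2e^{-2z^2}$ dominates once $z$ is larger than about 1, which is why a large $z$ drives the p-value toward zero so quickly.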

08-25-2014 03:32 PM

Yes, I saw the equations; I was hoping a function exists to do the work. I understand that in practice no one sums from zero to infinity: only a few terms are kept and special corrections are applied. So I'd much rather find a working function written by experts than have to research this on my own.

08-25-2014 03:50 PM

Part of your message suggests you are concerned about getting p<0.0001 instead of the actual small value, say p = 1.67*10^(-8). You can get a more precise printout by storing the relevant statistics with an ODS OUTPUT statement and then printing the stored data set. Here is a simple example (without frequencies) where the printout gives <.0001. The last column of the KS2 data set has two rows: the first row is D and the second row is p (in scientific notation). The variable is called nValue2 (for some strange reason).

data a;
   do group = 1 to 2;
      do rep = 1 to 10000;
         y = group*.1 + rannor(1);
         output;
      end;
   end;
run;

proc npar1way data=a edf ks;
   class group;
   var y;
   ods output KolSmir2Stats=ks2;
run;

proc print data=ks2;
run;

08-25-2014 04:37 PM

I tried that previously; it just printed exactly 0 for all the p-values (see the columns of zeros below). Perhaps my weights are so large that it's hopeless to get a non-zero p-value that I can correct for the weights. It seems I'll just have to calculate the p-value myself.

I sure would like to find a function in SAS that does the calculation...

| | | | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Male | (<20) | ApoB | .002233328 | 6.2103 | 0.22041 | 0 | .000002242 | 17.335 | 0.22293 | 6.2813 | 0 |
| 2 | Male | (20-29) | ApoB | .002473620 | 7.3677 | 0.12183 | 0 | .000003057 | 27.119 | 0.12511 | 7.5660 | 0 |
| 3 | Male | (30-39) | ApoB | .005193976 | 15.9733 | 0.16763 | 0 | .000011119 | 105.162 | 0.16785 | 15.9943 | 0 |
| 4 | Male | (40-49) | ApoB | .003007115 | 9.7514 | 0.07357 | 0 | .000002399 | 25.230 | 0.09987 | 13.2380 | 0 |
| 5 | Male | (50-59) | ApoB | .004811441 | 14.8999 | 0.09631 | 0 | .000006126 | 58.747 | 0.12704 | 19.6538 | 0 |
| 6 | Male | (60-69) | ApoB | .007293936 | 19.9366 | 0.13297 | 0 | .000009974 | 74.515 | 0.19752 | 29.6162 | 0 |
| 7 | Male | (70-79) | ApoB | .005602209 | 11.4631 | 0.09621 | 0 | .000004495 | 18.821 | 0.15906 | 18.9521 | 0 |
| 8 | Male | (>=80) | ApoB | .005729162 | 7.4794 | 0.09921 | 0 | .000011073 | 18.872 | 0.13626 | 10.2727 | 0 |
| 9 | Female | (<20) | ApoB | .002512096 | 6.9934 | 0.22836 | 0 | .000002821 | 21.866 | 0.22836 | 6.9934 | 0 |
| 10 | Female | (20-29) | ApoB | .002397185 | 7.4033 | 0.09843 | 0 | .000002358 | 22.488 | 0.09896 | 7.4432 | 0 |
| 11 | Female | (30-39) | ApoB | .005747304 | 17.9301 | 0.16782 | 0 | .000014166 | 137.876 | 0.18721 | 20.0008 | 0 |
| 12 | Female | (40-49) | ApoB | .004009143 | 12.3046 | 0.08878 | 0 | .000007887 | 74.293 | 0.10704 | 14.8363 | 0 |
| 13 | Female | (50-59) | ApoB | .005250789 | 17.1014 | 0.10402 | 0 | .000006173 | 65.481 | 0.15556 | 25.5739 | 0 |
| 14 | Female | (60-69) | ApoB | .005870556 | 16.3198 | 0.10203 | 0 | .000011481 | 88.728 | 0.15890 | 25.4182 | 0 |
| 15 | Female | (70-79) | ApoB | .004183428 | 9.3058 | 0.07472 | 0 | .000003132 | 15.496 | 0.14135 | 17.6047 | 0 |
| 16 | Female | (>=80) | ApoB | .008706090 | 14.2755 | 0.16290 | 0 | .000015396 | 41.394 | 0.19818 | 17.3680 | 0 |

08-25-2014 04:55 PM

Not my area. But any formula other than the exact one will be an approximation. I doubt that the p-value scales linearly with n, so there would be no simple upscaling.
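
To illustrate the point about scaling, here is a minimal DATA step sketch of the usual asymptotic two-sample K-S approximation, p ≈ 2 Σ (-1)^(i-1) exp(-2 i² z²) with z = D·sqrt(n1·n2/(n1+n2)). It is not the routine NPAR1WAY calls internally, and it is not an exact p-value. The values of d, n1, n2 and the scale factors are made up for illustration, and the data set name ks_pvalue is hypothetical.

```sas
/* Sketch only: asymptotic two-sample K-S p-value from D, n1 and n2.   */
/* d, n1, n2 and the scale factors are illustrative assumptions,       */
/* not values taken from the thread.                                   */
data ks_pvalue;
   d = 0.05;                          /* assumed K-S D statistic       */
   do scale = 1, 10, 100;             /* inflate both sample sizes     */
      n1 = 5000 * scale;
      n2 = 500  * scale;
      z  = d * sqrt(n1 * n2 / (n1 + n2));
      p    = 0;
      sign = 1;
      do i = 1 to 100;                /* truncated alternating series  */
         term = 2 * sign * exp(-2 * i**2 * z**2);
         p = p + term;
         if abs(term) < 1e-17 then leave;   /* later terms negligible  */
         sign = -sign;
      end;
      p = min(max(p, 0), 1);          /* clamp rounding excursions     */
      output;
   end;
   keep d n1 n2 z p;
   format p e12.;                     /* print p in scientific notation */
run;

proc print data=ks_pvalue;
run;
```

With D held fixed, multiplying both sample sizes by 10 multiplies z² by 10, so p shrinks roughly like exp(-2z²) rather than in proportion to n. That is consistent with the doubt above about simple upscaling: a p-value computed from inflated (weighted) counts cannot be rescaled linearly, but the same formula can be evaluated directly with whatever n1 and n2 are appropriate. The series converges slowly when z is very small, so the 100-term cap is an assumption that suits moderate to large z.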