Running the KS test for Poisson distributed data

06-15-2017 03:54 AM

Hello SAS users,

I have to run the Kolmogorov-Smirnov (KS) test on Poisson-distributed data by quantifying the distance between the empirical distribution function of the loss data set and the cumulative Poisson distribution function; in my case, the Poisson distribution is the reference parametric distribution.

Searching the internet, so far I have found only the special case of the KS test that compares the empirical distribution against the Normal distribution.

What about the case in which I need to compare the empirical distribution function against the Poisson distribution?

Thanks all in advance for your help!

06-15-2017 09:31 AM

The KS test is a nonparametric test, which means it does not matter what kind of distribution your variable conforms to; you can always use the KS test.

06-16-2017 05:34 AM

Thanks for your answer @Ksharp!

Do you mean I could simply use the *UNIVARIATE* procedure to implement the KS test?

In particular, could I use:

proc npar1way edf data=dataset; class x; var y; exact ks; run;

where *y* is the observed data and *x* is a vector of simulated data drawn from a Poisson distribution?

Thanks!

06-16-2017 08:39 AM

I totally agree with @Rick_SAS, and I remember that Rick has written a blog post about this question.

Search for "Poisson" on Rick's blog and you will find it, or @Rick_SAS could point you to the URL.

Back to your question: yes, you can do this, but you need to change the data structure.

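| x | y |
|---|---|
| 7 | 2 |
| 5 | 6 |
| ... | ... |

-->

| name | value |
|------|-------|
| x | 7 |
| x | 5 |
| ... | ... |
| y | 2 |
| y | 6 |
| ... | ... |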

After that, run the KS test:

proc npar1way edf data=dataset; class name; var value; exact ks; run;
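To make the restructuring concrete, here is a minimal sketch in Python (not SAS, just so the logic is easy to run and check). It stacks the two columns into name/value rows and then computes the two-sample KS distance D, the largest gap between the two empirical distribution functions. The helper names `wide_to_long` and `ks_two_sample` are made up for illustration, and only the first two data rows come from the example above; the rest are invented. It does not attempt to reproduce the exact p-value.

```python
from bisect import bisect_right

def wide_to_long(rows):
    """Stack (x, y) pairs into (name, value) rows: all x rows, then all y rows."""
    long_rows = [("x", x) for x, _ in rows]
    long_rows += [("y", y) for _, y in rows]
    return long_rows

def ks_two_sample(long_rows, group_a="x", group_b="y"):
    """Two-sample KS distance: max |F_a(t) - F_b(t)| over all observed values t."""
    a = sorted(v for name, v in long_rows if name == group_a)
    b = sorted(v for name, v in long_rows if name == group_b)
    points = sorted(set(a) | set(b))
    # bisect_right(sample, t) counts the observations <= t
    return max(
        abs(bisect_right(a, t) / len(a) - bisect_right(b, t) / len(b))
        for t in points
    )

# First two rows are from the post; the last two are invented for illustration.
wide = [(7, 2), (5, 6), (3, 4), (6, 1)]
long_rows = wide_to_long(wide)
d = ks_two_sample(long_rows)
print(long_rows)
print("D =", d)
```

Keep in mind that this compares the observed data with one simulated sample, not with the theoretical Poisson cdf, so the result changes with every new simulated x.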

06-16-2017 05:50 AM

Can you explain WHY you have to run a KS test for Poisson data?

PROC UNIVARIATE is not appropriate for discrete (count) data.

If you are trying to fit Poisson data, you can use PROC GENMOD, which provides goodness-of-fit statistics.
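As a rough illustration of what those goodness-of-fit statistics measure, here is a minimal sketch in Python (not SAS) for the simplest case: an intercept-only Poisson model, where the fitted mean is just the sample mean. It computes the Pearson chi-square and the deviance together with their value/DF ratios. The function name `poisson_gof` and the data are invented for illustration, and the sketch assumes the sample mean is positive.

```python
import math

def poisson_gof(counts):
    """Pearson chi-square and deviance for an intercept-only Poisson fit.

    Assumes the sample mean is positive. The fitted mean of every
    observation is the sample mean, and DF = n - 1.
    """
    n = len(counts)
    mean = sum(counts) / n
    pearson = sum((y - mean) ** 2 / mean for y in counts)
    # y * log(y / mean) is taken to be 0 when y == 0
    deviance = 2 * sum(
        (y * math.log(y / mean) if y > 0 else 0.0) - (y - mean)
        for y in counts
    )
    df = n - 1
    return {
        "pearson": pearson,
        "deviance": deviance,
        "df": df,
        "pearson_per_df": pearson / df,
        "deviance_per_df": deviance / df,
    }

counts = [2, 6, 4, 1, 3, 5, 2, 1]  # invented data
result = poisson_gof(counts)
print(result)
```

Ratios near 1 are consistent with a Poisson fit; ratios well above 1 suggest overdispersion.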

If you want a graphical representation of the fit (similar to a quantile-quantile plot) you can create a "Poissonness plot", although for small data sets it might not be very enlightening.
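For reference, the coordinates of such a plot are easy to compute. If n_k of the N observations equal k, then for Poisson data ln(k! n_k / N) is approximately -lambda + k ln(lambda), so plotting that quantity against k should give a roughly straight line with slope ln(lambda). A minimal sketch in Python (not SAS), with invented data and hypothetical helper names `poissonness_points` and `ols_slope`:

```python
import math
from collections import Counter

def poissonness_points(counts):
    """Return (k, ln(k! * n_k / N)) for every observed value k."""
    n = len(counts)
    freq = Counter(counts)
    return [
        (k, math.log(math.factorial(k) * freq[k] / n))
        for k in sorted(freq)
    ]

def ols_slope(points):
    """Ordinary least-squares slope of y on x for a list of (x, y) pairs."""
    mx = sum(x for x, _ in points) / len(points)
    my = sum(y for _, y in points) / len(points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x, _ in points)
    return sxy / sxx

counts = [0, 1, 1, 2, 3, 1, 2, 0, 4, 2, 1, 3]  # invented data
points = poissonness_points(counts)
slope = ols_slope(points)
print(points)
print("exp(slope), a rough estimate of lambda:", math.exp(slope))
```

Marked curvature in the points, rather than a straight line, is the visual sign that the Poisson model does not fit.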

06-16-2017 06:09 AM

Hi @Rick_SAS, and thanks for your answer.

I have to compute the KS statistic and the associated p-value, comparing the theoretical Poisson distribution with the observed data.

I agree with you about the fact it does not make sense, but it is a request for reporting the validation results of internal models; the aim is to quantify a distance between the empirical distribution function and the cdf of the Poisson distribution.

Using *PROC GENMOD*, as you suggested above, I did not get the KS statistic or its p-value.

Could you suggest some way to run the KS test?

Thanks!

06-16-2017 06:25 AM

Quantopic wrote:

> I agree with you about the fact it does not make sense, but it is a request for reporting the validation results of internal models; the aim is to quantify a distance between the empirical distribution function and the cdf of the Poisson distribution.

When something does not make sense, you should point that out to your supervisor.

I am familiar with a recent paper that shows how to compute KS statistics for discrete distributions, but the computation is much more difficult than for continuous data. If your company licenses SAS/IML and you are an experienced SAS/IML programmer with knowledge of numerical analysis, you might be able to implement the procedure in a few days or weeks. If you are less experienced or don't have SAS/IML, it will take longer. And to what purpose? The KS test is not more powerful than other GOF tests that are already provided.

Talk to your supervisor and explain that KS tests for discrete distributions are still at the research stage and have not made their way into SAS procedures. To implement it yourself would require advanced IML programming.
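To show where the difficulty lies: the distance itself is simple to compute, because the empirical distribution function and the Poisson cdf are both step functions that jump only at the integers. What is hard is the p-value, since the usual KS tables assume a continuous reference distribution. The sketch below is in Python (not SAS or SAS/IML) and is not the method from the paper mentioned above. It swaps in a plain parametric bootstrap: estimate lambda by the sample mean, simulate Poisson samples of the same size, re-estimate lambda in each, and count how often the simulated distance is at least as large as the observed one. All function names and the data are invented for illustration, and the simple Poisson sampler is only suitable for small lambda.

```python
import math
import random

def poisson_cdf_table(lam, kmax):
    """Return [F(0), ..., F(kmax)] for a Poisson(lam) distribution."""
    cdf, pmf, total = [], math.exp(-lam), 0.0
    for k in range(kmax + 1):
        total += pmf
        cdf.append(total)
        pmf *= lam / (k + 1)  # pmf(k+1) = pmf(k) * lam / (k+1)
    return cdf

def ks_distance(counts, lam):
    """Largest gap between the empirical cdf and the Poisson(lam) cdf.

    Both functions jump only at integers, so checking k = 0..max(counts)
    is enough; beyond the maximum the gap can only shrink.
    """
    n, kmax = len(counts), max(counts)
    freq = [0] * (kmax + 1)
    for y in counts:
        freq[y] += 1
    cdf = poisson_cdf_table(lam, kmax)
    d, cum = 0.0, 0
    for k in range(kmax + 1):
        cum += freq[k]
        d = max(d, abs(cum / n - cdf[k]))
    return d

def poisson_draw(lam, rng):
    """One Poisson(lam) draw by Knuth's multiplication method (small lam only)."""
    limit, k, p = math.exp(-lam), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def ks_poisson_mc(counts, n_sim=2000, seed=1):
    """Observed KS distance and a Monte Carlo (parametric bootstrap) p-value."""
    rng = random.Random(seed)
    n = len(counts)
    lam = sum(counts) / n
    d_obs = ks_distance(counts, lam)
    exceed = 0
    for _ in range(n_sim):
        sim = [poisson_draw(lam, rng) for _ in range(n)]
        # lambda is re-estimated for each simulated sample, as for the real data
        if ks_distance(sim, sum(sim) / n) >= d_obs:
            exceed += 1
    return d_obs, (exceed + 1) / (n_sim + 1)

counts = [2, 6, 4, 1, 3, 5, 2, 1, 0, 3]  # invented data
d_obs, p_value = ks_poisson_mc(counts)
print("D =", d_obs, "Monte Carlo p-value =", p_value)
```

The p-value from this sketch depends on the seed and the number of simulations, so it should be read as an approximation rather than an exact test.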