Programming the statistical procedures from SAS

Running the KS test for Poisson distributed data

Contributor
Posts: 21

Running the KS test for Poisson distributed data

Hello SAS users,

I need to run the Kolmogorov-Smirnov test on Poisson-distributed data, quantifying the distance between the empirical distribution function of a loss data set and the cumulative Poisson distribution function. In my case, the Poisson distribution is the reference parametric distribution.

Searching online, I have so far found only the special case of the KS test that compares the empirical distribution against the Normal distribution.

How do I handle the case where I need to compare the empirical distribution function against the Poisson distribution?

Thanks in advance for your help!

Super User
Posts: 9,775

Re: Running the KS test for Poisson distributed data

The KS test is a nonparametric test, which means it does not matter what kind of distribution your variable follows; you can always use the KS test.

Contributor
Posts: 21

Re: Running the KS test for Poisson distributed data

Thanks for your answer @Ksharp!

Do you mean I could simply use the UNIVARIATE procedure to implement the KS test?

Specifically, could I use:

proc npar1way edf data = dataset;
    class x;
    var y;
    exact ks;
run;

 

where y is the observed data and x is a vector of simulated data drawn from a Poisson distribution?

Thanks!

Super User
Posts: 9,775

Re: Running the KS test for Poisson distributed data

I totally agree with @Rick_SAS, and I remember that Rick has written a blog post about this question.

Search for "Poisson" on Rick's blog and you will find it, or @Rick_SAS could point you to the URL.

Back to your question: yes, you can do this, but you need to change the data structure.

 

x  y
7  2
5  6
...

-->

name  value
x     7
x     5
...
y     2
y     6
...

 

After that, run the KS test:

 

proc npar1way edf data = dataset;
    class name;
    var value;
    exact ks;
run;
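
The restructuring described above could be done with a DATA step along these lines. This is only a sketch: the wide data set name `wide` and the simulated values are hypothetical and are there only so that the example runs on its own. The long data set is written to `dataset`, so the PROC NPAR1WAY step above can read it unchanged.

```sas
/* Hypothetical wide data set: x = simulated Poisson draws, y = observed counts.
   Both columns are generated here only so that the sketch runs on its own. */
data wide;
   call streaminit(12345);
   do i = 1 to 50;
      x = rand('POISSON', 4);
      y = rand('POISSON', 4);
      output;
   end;
   drop i;
run;

/* Stack the two columns: each input row yields one 'x' row and one 'y' row */
data dataset;
   set wide;
   length name $ 1;
   name = 'x'; value = x; output;
   name = 'y'; value = y; output;
   keep name value;
run;
```

The rows come out interleaved (x, y, x, y, ...) rather than all x followed by all y, which makes no difference to PROC NPAR1WAY because the CLASS variable identifies the groups.
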
SAS Super FREQ
Posts: 3,547

Re: Running the KS test for Poisson distributed data

Can you explain WHY you have to run a KS test for Poisson data?

PROC UNIVARIATE is not appropriate for discrete (count) data.

If you are trying to fit Poisson data, you can use PROC GENMOD, which provides goodness-of-fit statistics.

If you want a graphical representation of the fit (similar to a quantile-quantile plot), you can create a "Poissonness plot", although for small data sets it might not be very enlightening.
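
As a rough illustration of both suggestions, the sketch below fits an intercept-only Poisson model with PROC GENMOD and then builds a Poissonness plot by hand. The data set `counts`, the variable `y`, and the simulated values are assumptions made only so that the example is self-contained; the plot uses the usual ordinate log(k! * n_k / N), where n_k is the number of observations equal to k and N is the sample size.

```sas
/* Hypothetical count data, simulated only so that the sketch runs on its own */
data counts;
   call streaminit(2024);
   do i = 1 to 200;
      y = rand('POISSON', 3);
      output;
   end;
   keep y;
run;

/* Intercept-only Poisson model: the intercept estimates log(lambda), and the
   goodness-of-fit table reports the deviance and the Pearson chi-square */
proc genmod data=counts;
   model y = / dist=poisson link=log;
run;

/* Frequency of each observed count k: COUNT = n_k, PERCENT = 100 * n_k / N */
proc freq data=counts noprint;
   tables y / out=freqout;
run;

/* Poissonness plot ordinate: log(k! * n_k / N), using lgamma(k + 1) = log(k!) */
data poissonness;
   set freqout;
   phi = log(percent / 100) + lgamma(y + 1);
run;

/* Points plus a fitted line; Poisson data should look roughly linear */
proc sgplot data=poissonness;
   reg x=y y=phi;
   xaxis label="k";
   yaxis label="log(k! * n_k / N)";
run;
```

If the counts really are Poisson with mean lambda, the points should fall near a straight line with slope log(lambda) and intercept -lambda.
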

Contributor
Posts: 21

Re: Running the KS test for Poisson distributed data

Hi @Rick_SAS, and thanks for your answer.

I have to compute the KS statistic and the corresponding p-value, comparing the theoretical Poisson distribution with the observed data.

 

I agree with you about the fact it does not make sense, but it is a request for reporting the validation results of internal models; the aim is to quantify a distance between the empirical distribution function and the cdf of the Poisson distribution.

 

Using PROC GENMOD, as you suggested above, I did not get the KS statistic or its p-value.

Could you suggest a way to run the KS test?

Thanks!

SAS Super FREQ
Posts: 3,547

Re: Running the KS test for Poisson distributed data


Quantopic wrote:

 

I agree with you about the fact it does not make sense, but it is a request for reporting the validation results of internal models; the aim is to quantify a distance between the empirical distribution function and the cdf of the Poisson distribution.

 

 


When something does not make sense, you should point that out to your supervisor.

 

I am familiar with a recent paper that shows how to compute KS statistics for discrete distributions, but the computation is much more difficult than for continuous data. If your company licenses SAS/IML and you are an experienced SAS/IML programmer with knowledge of numerical analysis, you might be able to implement the procedure in a few days or weeks. If you are less experienced or don't have SAS/IML, it will take longer. And to what purpose? The KS test is not more powerful than other GOF tests that are already provided.

 

Talk to your supervisor and explain that KS tests for discrete distributions are still at the research stage and have not made their way into SAS procedures.  To implement it yourself would require advanced IML programming.
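
If only the distance itself is needed for the report, it can be computed without IML. The sketch below is a minimal illustration, not the method from the paper mentioned above: it assumes a hypothetical data set `counts` with a count variable `y`, estimates lambda by the sample mean, and takes the largest absolute difference between the empirical distribution function and the Poisson cdf over every integer from 0 to the largest observed count. It does not produce a p-value, because the classical KS null distribution assumes a continuous, fully specified reference distribution.

```sas
/* Hypothetical count data, simulated only so that the sketch runs on its own */
data counts;
   call streaminit(2024);
   do i = 1 to 200;
      y = rand('POISSON', 3);
      output;
   end;
   keep y;
run;

/* Sample size, sample mean (used as the estimate of lambda), largest count */
proc means data=counts noprint;
   var y;
   output out=stats n=n mean=lambda max=kmax;
run;

/* n_k for each observed count k */
proc freq data=counts noprint;
   tables y / out=freqout(keep=y count);
run;

/* Every integer 0..kmax, so that unobserved counts are evaluated as well */
data grid;
   set stats(keep=n lambda kmax);
   do y = 0 to kmax;
      output;
   end;
run;

/* Empirical cdf, Poisson cdf, and the running maximum of their difference */
data ks;
   merge grid freqout;
   by y;
   retain D 0;
   cum + coalesce(count, 0);        /* cumulative frequency; missing n_k counts as 0 */
   ecdf = cum / n;
   F = cdf('POISSON', y, lambda);
   D = max(D, abs(ecdf - F));
run;

/* The last row holds the maximum over the whole grid */
data ks_stat;
   set ks end=last;
   if last;
   keep n lambda D;
run;

proc print data=ks_stat noobs;
run;
```

Both functions are step functions that jump only at integers, and beyond the largest observed count the empirical cdf equals 1 while the Poisson cdf keeps rising toward 1, so checking the integers 0 through kmax is enough to find the supremum.
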
