I have more than 100 variables and use PROC NPAR1WAY to estimate the KS statistic.
My aim is to test each variable against the class variable:
proc npar1way edf data=mydata noprint;
class source;
var &val;
output out=stat ks;
run;
Below is an example of the output for one variable. My question is: do I need to use the KS or the D statistic?
Kolmogorov-Smirnov Two-Sample Test (Asymptotic)
KS  | 0.038621 | D        | 0.1039
KSa | 1.07377  | Pr > KSa | 0.1991
To compare two distributions, use D. In the graphic on the Wikipedia page for the Kolmogorov-Smirnov test, D is the vertical length of the black arrow: the maximum vertical difference between the two empirical distribution functions. That's the one you want.
The documentation explains the difference: https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.4/statug/statug_npar1way_details24.htm
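For reference, here is a rough sketch of how the three numbers relate, as I read the documentation linked above. The notation is mine, not SAS output: F_i is the empirical distribution function of class i, F is the pooled empirical distribution function, n_i is the size of class i, and n = n_1 + n_2. The last line is only what follows from these definitions when there are exactly two classes.

```latex
\begin{aligned}
D &= \max_j \,\bigl| F_1(x_j) - F_2(x_j) \bigr| \\
\mathrm{KS} &= \max_j \sqrt{\frac{1}{n} \sum_i n_i \bigl( F_i(x_j) - F(x_j) \bigr)^2} \\
\mathrm{KSa} &= \mathrm{KS}\,\sqrt{n} \\
\mathrm{KS} &= D \,\frac{\sqrt{n_1 n_2}}{n} \qquad \text{(two classes only)}
\end{aligned}
```

As a sanity check on the output above, KSa / KS = 1.07377 / 0.038621 is about 27.8, which would put n at roughly 773 observations. So KS is D rescaled by the class sizes, and KSa is the version the asymptotic p-value is based on, while D is the quantity that reads directly as "the largest gap between the two distributions".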