SidBajpai
Calcite | Level 5

How can I run a two-sample Kolmogorov-Smirnov (KS) test in SAS/IML for the null hypothesis that the two samples were drawn from the same continuous distribution?

1 ACCEPTED SOLUTION

Rick_SAS
SAS Super FREQ

Probably the easiest way is to leverage PROC NPAR1WAY, like this:

proc iml;
/* the data */

x = T(1:10);
y = {2,2,3,5,5,6,6,8,9,10};

/* concatenate data and add ID variable; write to data set */
ID = repeat(1, nrow(x)) // repeat(2, nrow(y));   /* group 2 must match the length of y */
z = x // y;

create KSData var {ID z};
append;
close KSData;
quit;

proc npar1way data=KSData edf;
   class ID;
   var z;
run;

If you need to stay inside PROC IML, you can use the SUBMIT and ENDSUBMIT statements to call PROC NPAR1WAY and then read in whatever statistics you need from the procedure output. For details and an example, see the video "Calling SAS procedures from the SAS/IML language" on The DO Loop blog.
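
For reference, here is a minimal sketch of the SUBMIT/ENDSUBMIT approach using the same example data. The output data set name KSOut is just a name chosen for illustration. Because the sketch reads every numeric variable and captures the names with COLNAME=, it does not assume any particular statistic names in the output data set.

```sas
proc iml;
/* same example data as above */
x = T(1:10);
y = {2,2,3,5,5,6,6,8,9,10};

/* stack the samples and label each observation with its group */
ID = repeat(1, nrow(x)) // repeat(2, nrow(y));
z = x // y;
create KSData var {ID z};
append;
close KSData;

/* run the procedure without leaving the IML session */
submit;
   proc npar1way data=KSData plots=none noprint;
      class ID;
      var z;
      output out=KSOut edf;   /* KSOut is an illustrative name */
   run;
endsubmit;

/* read every numeric statistic; COLNAME= captures the variable names */
use KSOut;
read all var _NUM_ into stats[colname=statNames];
close KSOut;
print stats[colname=statNames];
quit;
```

The printed column headers show which statistics the procedure wrote, so you can pick out the KS statistic and its p-value from there.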

SidBajpai
Calcite | Level 5

Thanks Dr Wicklin,

This is really helpful.

However, suppose that in the code above x is an n x m matrix and y is a p x 1 vector, where n, m, and p are generally all different and can change each time I run the code.

How would you suggest I modify the IML code above to get the p-value of the KS test for each column of x against the column vector y?

Thanks

Rick_SAS
SAS Super FREQ

It's almost the same: you just need to replicate y once for each column of x and run PROC NPAR1WAY on x1-xm. For example:


proc iml;
/* the data */
x = T(1:10)|| T(0:9) || T(2:11);
y = {2,5,3,8,7,6,6,9};

/* concatenate data and add ID variable; write to data set */
z = x // repeat(y, 1, ncol(x));
ID = repeat(1, nrow(x)) // repeat(2, nrow(y));

Q = ID || z;
varNames = "ID" || ("x1":("x"+strip(char(ncol(x)))));
create KSData from Q[c=varNames];
append from Q;
close KSData;

/* submit; */

proc npar1way data=KSData plots=none noprint;
   class ID;
   var x:;
   output out=KSOUT edf;
run;

/* endsubmit; */

Now read the p-values, which are in the KSOUT data set.
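
A minimal sketch of that last step, assuming the PROC NPAR1WAY step above ran outside PROC IML (that is, with SUBMIT and ENDSUBMIT left commented out, so the PROC statement ended the IML session). It also assumes that the OUTPUT data set stores the analysis variable name in _VAR_ and the asymptotic KS p-value in P_KSA. Those variable names are as I recall them from the NPAR1WAY documentation, so run PROC CONTENTS on KSOUT to confirm them for your release.

```sas
proc iml;
use KSOut;
/* ASSUMPTION: _VAR_ holds the analysis variable name (x1, x2, ...) */
read all var {_VAR_} into varName;
/* ASSUMPTION: P_KSA holds the asymptotic KS p-value */
read all var {P_KSA} into pValue;
close KSOut;

/* one row per column of x, labeled by variable name */
print pValue[rowname=varName];
quit;
```

If you uncomment SUBMIT and ENDSUBMIT instead, drop the PROC IML line and place the USE/READ/CLOSE statements directly after ENDSUBMIT.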

SidBajpai
Calcite | Level 5

That works,

Thanks

