JackHP
Fluorite | Level 6

Hello,

 

When I run the code below, SAS keeps running and never produces any output, even after waiting more than 10 minutes. My dataset has fewer than 200 observations, so I can't imagine that processing power is the issue.

 

 

proc NPAR1WAY data=mvrx19.master wilcoxon correct=no;
title "compare PSS Medians by FI status";
class FI_1;
var PSS4NS_1 PSS10NS_1;
where psurvey1 ne '.'d;
exact wilcoxon;
run;

 

I tried taking out "exact wilcoxon", which fixed the issue, but I'm wondering why I couldn't get output with it included. I would like to have the option of generating exact p-values.

 

Thank you!

 

I am running SAS University Edition with 6 GB of RAM dedicated to the virtual machine.

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hello @JackHP,

 

I wouldn't rule out that the computation of the two exact tests takes a long time. (There's a reason why the MAXTIME= option of the EXACT statement exists; see Computation Options.) To estimate the time needed I would restrict the input dataset using a suitable (extended) WHERE condition or an OBS= dataset option or by drawing random samples so as to use only, say, 10, 20, 40, ... observations and see how the run times increase. Of course, both FI_1 values must be represented in each of these smaller datasets.
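
A minimal sketch of such a timing run, under some assumptions: the WORK dataset names, the sample size of 10 per group, the seed and the 300-second limit are all hypothetical, and the WHERE condition reflects a guess that the original filter was meant to exclude missing psurvey1 values. Sampling within each FI_1 level keeps both groups represented.

```sas
/* Sort by the class variable so SURVEYSELECT can stratify on it.   */
/* Assumption: the original WHERE was meant to drop missing values. */
proc sort data=mvrx19.master out=work.sorted;
  by FI_1;
  where not missing(psurvey1);
run;

/* Draw up to 10 observations per FI_1 group (hypothetical size).   */
/* SELECTALL takes the whole group if it has fewer than 10 rows.    */
proc surveyselect data=work.sorted out=work.sub method=srs
                  n=10 seed=2024 selectall;
  strata FI_1;
run;

/* FULLSTIMER writes detailed timing information to the log. */
options fullstimer;

proc npar1way data=work.sub wilcoxon correct=no;
  class FI_1;
  var PSS4NS_1 PSS10NS_1;
  /* MAXTIME= is in seconds; the exact computation stops at the limit. */
  exact wilcoxon / maxtime=300;
run;
```

Rerunning this with n=20, n=40 and so on, and noting the real time in the log each time, should give a rough idea of how quickly the exact computation grows.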

 

Then, if the extrapolated run time for the full dataset was unacceptable, I would use the MC option of the EXACT statement (after a slash, see link above) and, as long as the run times are tolerable, add the N= option (MC is then redundant) with increasing values n>10000 (the default). You'll observe that the confidence limits of the Monte Carlo estimates of the exact p-values tend to get narrower as n increases. (As random variables they fluctuate even between runs of the same code, unless you use the SEED= option.) Finally, the Monte Carlo estimate with the largest tolerable n value might be the best estimate of the exact p-value that you can reasonably get.
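
For illustration, a Monte Carlo version of the original step might look like the sketch below. The values n=100000 and seed=12345 are arbitrary choices, and the WHERE condition again assumes that the original filter was meant to exclude missing psurvey1 values.

```sas
proc npar1way data=mvrx19.master wilcoxon correct=no;
  title "compare PSS Medians by FI status";
  class FI_1;
  var PSS4NS_1 PSS10NS_1;
  /* Assumption: keep only observations with a nonmissing psurvey1. */
  where not missing(psurvey1);
  /* N= requests Monte Carlo estimation (MC is implied).            */
  /* SEED= makes the estimates reproducible between runs.           */
  exact wilcoxon / n=100000 seed=12345;
run;
```

Increasing N= step by step while the run time stays tolerable should show the confidence limits around the estimated p-values tightening.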


JackHP
Fluorite | Level 6

Hi @FreelanceReinh ,

 

Thank you for your detailed response and suggestions. I will try restricting my sample to get an idea of the processing time for smaller samples and then extrapolate the run time for the whole sample - great idea! I appreciate your explanation and the links to these resources; they are very helpful!

 
