I am trying to accomplish the following in a simulation study in IML: Some of the samples drawn are bad samples that generate an error and abort the program. How can I detect the possibility of an error, return to the beginning of the loop, and start the next iteration? Thanks.
It sounds like you want to subtract a loop counter. Maybe you are thinking of something like this:
/* fill array with 10 values from truncated normal. */
proc iml;
call randseed(123);
x = j(1, 10);
do i = 1 to 10;
   call randgen(y, "Normal");
   if y < 0 then
      i = i - 1; /* back up counter (?!) */
   else
      x[i] = y;
end;
print x;
Personally, I don't like that programming construct because it looks like the DO loop is going to execute 10 times, but in reality it will probably execute more. Also, I don't like messing with the value of a looping variable in a DO loop. It makes me nervous.
An alternative is to replace the DO loop with a DO WHILE loop. Here it is clear that the loop will continue until 10 successes are achieved. The loop counter is incremented only after a successful value has been generated.
/* alternate (better?) approach. Use DO WHILE loop. */
proc iml;
call randseed(123);
x = j(1, 10);
i = 1;
do while(i <= 10);
   call randgen(y, "Normal");
   if y > 0 then do;
      x[i] = y;  /* store the value in the current slot first */
      i = i + 1; /* only increment counter if condition satisfied */
   end;
end;
print x;
I have a feeling that we're going to wind up asking for lots more details, but, the very simple and direct answer to your question is
IF THEN/ELSE
ought to do what you are asking.
Hi Paige, please see my message below.
It's hard to give precise advice without an example of the PROC IML call and the error-generating process.
What is causing the error? What constitutes a "bad" sample? It seems like you should be able to have a nested IF condition within the DO loop that checks for the presence of the underlying source of error and, when it is true, skips ahead to the next value of the DO counter.
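A minimal sketch of the kind of check described above, assuming a "bad" sample is one in which some treatment level never occurs. The probability vector, the sample sizes, and the counter nSkipped are made up for illustration and are not taken from the original program. The sketch uses an IF-THEN/ELSE block so that a bad sample is simply counted and nothing else is done with it:

```sas
/* sketch only: skip a sample when a treatment level is missing */
proc iml;
call randseed(123);
NumSamples = 10;               /* hypothetical number of samples        */
N = 20;                        /* hypothetical observations per sample  */
p = {0.05 0.05 0.90};          /* hypothetical fixed probabilities      */
t = j(N, 1);
nSkipped = 0;
do SampleID = 1 to NumSamples;
   call randgen(t, "Table", p);
   /* UNIQUE returns the distinct values, so fewer than 3 columns
      means at least one of the levels 1, 2, 3 did not occur */
   if ncol(unique(t)) < 3 then
      nSkipped = nSkipped + 1; /* bad sample: count it and move on */
   else do;
      /* ... dummy coding and the rest of the work would go here ... */
   end;
end;
print nSkipped;
quit;
```

With this layout the loop always runs NumSamples times, so the number of usable samples is NumSamples minus nSkipped.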
Hi Ryan, please see my message below.
Here is the code to generate the data. The problem arises during the data-generation step for the multinomial logistic regression. The code snippet below generates the categorical outcome from the multinomial logistic regression model. This outcome is used as the treatment status.
** generate treatment status; call RANDGEN(t, "TABLE", p);
However, when simulating 1000 samples, some samples usually end up with categories that have no observations, which causes an error during the dummy-coding step. I am trying to find a way to skip the samples that stall the program and begin the next cycle. If I end up with 950 good samples out of 1000, that is fine. But for now, the error sometimes occurs as early as the 10th sample, and the program cannot get beyond it.
%let S=1211;
%let NumSamples = 100;
%let N=1200;
%let beta11 = log(3);
%let beta12 = log(4);
%let beta13 = 0;
%let beta21 = log(9/10);
%let beta22 = log(10/9);
%let beta23 = 0;
%let alpha0 = 0;
%let alpha1 = 0.2;
%let alpha2 = 0.4;
%let alphaX = 0.2;
%let alphaX3 = 0.1;

** simulate data;
proc iml;
** assign variable names and allocate space for the data and parameters;
varNamesData={SampleID x x3 t t1 t2 y};
TempSimData = J(&N, NCOL(varNamesData));
x = J(&N, 1);
t = J(&N, 1);
p = J(&N, 3);
t1 = J(&N, 1, 0);
t2 = J(&N, 1, 0);
t3 = J(&N, 1, 0);
y = J(&N,1);
epsilon = J(&N, 1);
TempSimData = J(&N, NCOL(varNamesData));
create SimData from TempSimData[c=varNamesData];

** simulation loop;
do SampleID = 1 to &NumSamples;
   call RANDSEED(0);
   call RANDGEN(x, "NORMAL", 1, 1);
   ** calculate the qubic term;
   x3 = x##3;
   beta01 = -(&beta11 * 1 + &beta21 * 4);
   beta02 = -(&beta12 * 1 + &beta22 * 4);
   ** define linear equations;
   eta13 = beta01 + &beta11 * x + &beta21 * x3; *T=1 vs T=3;
   eta23 = beta02 + &beta12 * x + &beta22 * x3; *T=2 vs T=3;
   *eta33 = 0 + 0 * x + 0*x3;; *T=3 vs T=3;
   ** find actual probabilities for subjects to be in each treatment level;
   pi1 = exp(eta13) / (exp(eta13) + exp(eta23) + 1);
   pi2 = exp(eta23) / (exp(eta13) + exp(eta23) + 1);
   pi3 = 1 / (exp(eta13) + exp(eta23) + 1);
   ** fill the probability matrix from pi1, pi2, and pi3;
   p[,1] = pi1;
   p[,2] = pi2;
   p[,3] = pi3;
   ** generate treatment status;
   call RANDGEN(t, "TABLE", p);
   idx1 = LOC(t=1);
   idx2 = LOC(t=2);
   idx3 = LOC(t=3);
   * create dummy variables for treatment levels;
   if NCOL(idx1)>0 then t1[idx1]=1; else print "No observations in level 1";
   if NCOL(idx2)>0 then t2[idx2]=1; else print "No observations in level 2";
   if NCOL(idx3)>0 then t3[idx3]=1; else print "No observations in level 3";
   ** generate residuals;
   call RANDGEN(epsilon, "NORMAL", 0, .5);
   ** generate y;
   y = &alpha0 + &alpha1*t1 + &alpha2*t2 + &alphaX*x + &alphaX3*x3 + epsilon;
   ** create a temporary simulated data for each simulation loop;
   TempSimData[,1] = SampleID;
   TempSimData[,2] = x;
   TempSimData[,3] = x3;
   TempSimData[,4] = t;
   TempSimData[,5] = t1;
   TempSimData[,6] = t2;
   TempSimData[,7] = y;
   setout SimData;
   append from TempSimData;
end;
close SimData;
quit;
Thank you Rick, DO WHILE did the trick!
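For anyone reading this thread later, here is a rough sketch of how the DO WHILE idea might carry over to the treatment-generation step. It is not the code that was actually used. It assumes a sample is good when all three treatment levels occur. The fixed probability vector, the small sizes, and the names MaxTries, nTries, and nGood are hypothetical. The sample counter advances only after a good sample, and a cap on the number of attempts keeps the loop from running forever if good samples are very rare:

```sas
/* sketch only: keep drawing until NumSamples good samples are found */
proc iml;
call randseed(123);
NumSamples = 5;                /* hypothetical: good samples wanted      */
N = 20;                        /* hypothetical: observations per sample  */
MaxTries = 1000;               /* hypothetical cap on total attempts     */
p = {0.05 0.05 0.90};          /* hypothetical fixed probabilities       */
t = j(N, 1);
SampleID = 1;
nTries = 0;
do while(SampleID <= NumSamples & nTries < MaxTries);
   nTries = nTries + 1;
   call randgen(t, "Table", p);
   if ncol(unique(t)) = 3 then do;   /* all three levels are present */
      t1 = (t = 1);                  /* dummy variables, rebuilt for every sample */
      t2 = (t = 2);
      /* ... generate y and append the sample to the data set here ... */
      SampleID = SampleID + 1;       /* count only the good samples */
   end;
end;
nGood = SampleID - 1;
print nGood nTries;
quit;
```

Building t1 and t2 from the comparisons (t = 1) and (t = 2) inside the loop also means that no dummy values are left over from a previous sample.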