Hi,
I am trying to simulate to simulate a longitudinal design in SAS IML for continuous and binary outcomes. Could anyone help with
1) Providing a more efficient and shorter way to write the following continuous simulation in SAS/IML. I tried to find an alternative way (less coding) to do this loop using RANDGEN or RAND for both continuous and binary variable. On page 227 of Simulating data with sas (Rick Wiclkin), there is an example but I couldn't adapt it.
proc iml;
call randseed(1);
time = T( do(1, 10, 1));
std=1;
N =5;
grandmean=20;
whz = j(nrow(time), N,.);
whz[1,]=0.2; /*initialize the first instance*/
ei = j(1,N);
do i = 2 to nrow(time);
call randgen(ei, "Normal", 0, std);
whz[i,] = grandmean + whz[i-1,] + ei;
end;
print time whz;
quit;
2) Providing an example for the binary one. I couldn't get my code to work
proc iml;
call randseed(1);
time = T( do(1, 10, 1));
N =5;
pa = j(nrow(time), N,.);
pa[1,]=0; /*initialize the first instance*/
do i = 2 to nrow(time);
call randgen(pa, "bernouilli", logistic( 2*[i-1,]));
end;
print time pa;
/*This did not work*/
Thank you
OK. Thanks for the additional details. Your IML program already performs these computations, so my responses are
> 1) Providing a more efficient and shorter way to write the following continuous simulation in SAS/IML
The only random variable is the error term, and all errors are independent. You can't avoid the loop b/c it's a time series, but I'd suggest incorporating the grandmean into the call to RANDGEN and using
proc iml;
call randseed(1);
time = T( do(1, 10, 1));
std=1;
N =5;
grandmean=20;
whz = j(nrow(time), N,.);
e = whz;
call randgen(e, "Normal", grandmean, std); /* generate all N(mu, sigma) */
whz[1,]=0.2; /*initialize the first instance*/
do i = 2 to nrow(time);
whz[i,] = whz[i-1,] + e[i-1,];
end;
print time whz;
> 2) Providing an example for the binary one. I couldn't get my code to work
Your probability depends only on the previous value of arpa, so the call to RANDGEN has to go inside the loop. Otherwise, the same ideas apply:
arpa = j(nrow(time), N, 0);
a = j(1, N, 0);
do i = 2 to nrow(time);
call randgen(a, "Bernoulli", logistic(2*arpa[i-1, ]));
arpa[i,] = a;
end;
print arpa;
You haven't described what you want (1) to do. The page you quote shows an ANOVA example, but this code looks like a time series in which each step increments by 20 (plus random amount) from the previous time step.
For (2), it looks like you have forgotten 'pa' in the argument to the LOGISTIC function, but again I don't know what you are trying to accomplish.:
call randgen(pa, "bernouilli", logistic( 2*pa[i-1,])); /* ??? */
Thank you Rick,
I wanted to be able to simulate future whz[t] and pa[t] as a function of past whz[t-1] and pa[t-1], respectively. In other words, I wanted to get data similar to this data step simulation but I'd like to do it in SAS IML.
%let start=1;
%let endpoint=10;
%let grandmean=20;
%let std=1;
data simregwide;
array arwhz {&start:&endpoint} whz&start-whz&endpoint;
array arpa {&start:&endpoint} pa&start-pa&endpoint;
call streaminit(1234);
do id=1 to 5;
do t = &start to &endpoint;
if t =1 then
do;
arwhz[1]=0.2;
arpa[1]=0;
end;
else
do;
arwhz[t]=rand("normal", &grandmean + arwhz[t-1], &std);
arpa[t]=rand("bernouilli", logistic(2*arpa[t-1]));
end;
end;
output;
end;
keep id whz1-whz10 pa1-pa10;
run;
Data simreglong ;
retain id t whz pa;
set simregwide;
array arwhz {&start:&endpoint} whz&start-whz&endpoint;
array arpa {&start:&endpoint} pa&start-pa&endpoint;
do t = &start to &endpoint;
whz = arwhz[t];
pa = arpa[t];
output;
end;
keep id t whz pa;
run;
Thank you
OK. Thanks for the additional details. Your IML program already performs these computations, so my responses are
> 1) Providing a more efficient and shorter way to write the following continuous simulation in SAS/IML
The only random variable is the error term, and all errors are independent. You can't avoid the loop b/c it's a time series, but I'd suggest incorporating the grandmean into the call to RANDGEN and using
proc iml;
call randseed(1);
time = T( do(1, 10, 1));
std=1;
N =5;
grandmean=20;
whz = j(nrow(time), N,.);
e = whz;
call randgen(e, "Normal", grandmean, std); /* generate all N(mu, sigma) */
whz[1,]=0.2; /*initialize the first instance*/
do i = 2 to nrow(time);
whz[i,] = whz[i-1,] + e[i-1,];
end;
print time whz;
> 2) Providing an example for the binary one. I couldn't get my code to work
Your probability depends only on the previous value of arpa, so the call to RANDGEN has to go inside the loop. Otherwise, the same ideas apply:
arpa = j(nrow(time), N, 0);
a = j(1, N, 0);
do i = 2 to nrow(time);
call randgen(a, "Bernoulli", logistic(2*arpa[i-1, ]));
arpa[i,] = a;
end;
print arpa;
Thank you so much, Rick. This worked. I have a follow-up question on how to summarize the results by time points but I'll ask in a different post for clarity
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.
Find more tutorials on the SAS Users YouTube channel.