BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
SimRock
Obsidian | Level 7

Hi,

I am trying to simulate to simulate a longitudinal design in SAS IML for continuous and binary outcomes. Could anyone help with

1) Providing a more efficient and shorter way to write the following continuous simulation in SAS/IML. I tried to find an alternative way (less coding) to do this loop using RANDGEN or RAND for both continuous and binary variable. On page 227 of Simulating data with sas (Rick Wiclkin), there is an example but I couldn't adapt it.

 

proc iml;
		call randseed(1);
		time = T( do(1, 10, 1));
		std=1;
		N =5;
		grandmean=20;
		whz = j(nrow(time), N,.);
		whz[1,]=0.2; /*initialize the first instance*/

		ei = j(1,N);
		do i = 2 to nrow(time);
			call randgen(ei, "Normal", 0, std);
			whz[i,] =  grandmean + whz[i-1,] + ei;
		end;
		print time whz;
quit;

2) Providing an example for the binary one. I couldn't get my code to work

 

 

proc iml;
		call randseed(1);
		time = T( do(1, 10, 1));
		N =5;
		pa = j(nrow(time), N,.);
		pa[1,]=0; /*initialize the first instance*/

		do i = 2 to nrow(time);
			call randgen(pa, "bernouilli", logistic( 2*[i-1,]));
		end;
		print time pa;
/*This did not work*/

 

 

Thank you

1 ACCEPTED SOLUTION

Accepted Solutions
Rick_SAS
SAS Super FREQ

OK. Thanks for the additional details. Your IML program already performs these computations, so my responses are

> 1) Providing a more efficient and shorter way to write the following continuous simulation in SAS/IML

 

The only random variable is the error term, and all errors are independent. You can't avoid the loop b/c it's a time series, but I'd suggest incorporating the grandmean into the call to RANDGEN and using

 

proc iml;
call randseed(1);
time = T( do(1, 10, 1));
std=1;
N =5;
grandmean=20;
whz = j(nrow(time), N,.);
e = whz;
call randgen(e, "Normal", grandmean, std); /* generate all N(mu, sigma) */

whz[1,]=0.2; /*initialize the first instance*/
do i = 2 to nrow(time);
   whz[i,] =  whz[i-1,] + e[i-1,];
end;
print time whz;

> 2) Providing an example for the binary one. I couldn't get my code to work

 

Your probability depends only on the previous value of arpa, so the call to RANDGEN has to go inside the loop. Otherwise, the same ideas apply:

 

arpa = j(nrow(time), N, 0);
a = j(1, N, 0);
do i = 2 to nrow(time);
   call randgen(a, "Bernoulli", logistic(2*arpa[i-1, ]));
   arpa[i,] =  a;
end;
print arpa;

View solution in original post

4 REPLIES 4
Rick_SAS
SAS Super FREQ

You haven't described what you want (1) to do. The page you quote shows an ANOVA example, but this code looks like a time series in which each step increments by 20 (plus random amount) from the previous time step. 

 

For (2), it looks like you have forgotten 'pa' in the argument to the LOGISTIC function, but again I don't know what you are trying to accomplish.:  

call randgen(pa, "bernouilli", logistic( 2*pa[i-1,]));  /* ??? */

 

 

SimRock
Obsidian | Level 7

Thank you Rick,

I wanted to be able to simulate future whz[t] and pa[t] as a function of past whz[t-1]  and pa[t-1], respectively. In other words, I wanted to get data similar to this data step simulation but I'd like to do it in SAS IML.

%let start=1;
%let endpoint=10;
%let grandmean=20;
%let std=1;
data simregwide;
      array arwhz	{&start:&endpoint} whz&start-whz&endpoint;
	  array arpa	{&start:&endpoint} pa&start-pa&endpoint;
	  
		call streaminit(1234);
		do id=1 to 5;
			do t = &start to &endpoint;
			  if t =1 then
					do;
					   arwhz[1]=0.2;
					   arpa[1]=0;
					end;

				else
					do;
					    arwhz[t]=rand("normal", &grandmean + arwhz[t-1], &std);
			  			arpa[t]=rand("bernouilli", logistic(2*arpa[t-1]));
			        end;
			
			end;
    		output;
		end;
		keep id whz1-whz10 pa1-pa10;
run;

Data simreglong ;
retain id t whz pa;
	set simregwide;
	  array arwhz	{&start:&endpoint} whz&start-whz&endpoint;
	  array arpa	{&start:&endpoint} pa&start-pa&endpoint;

	do t = &start to &endpoint;
			whz =	arwhz[t];
			pa = 	arpa[t];
	output;
	end;

	keep 	id t whz pa;
run;

 Thank you

Rick_SAS
SAS Super FREQ

OK. Thanks for the additional details. Your IML program already performs these computations, so my responses are

> 1) Providing a more efficient and shorter way to write the following continuous simulation in SAS/IML

 

The only random variable is the error term, and all errors are independent. You can't avoid the loop b/c it's a time series, but I'd suggest incorporating the grandmean into the call to RANDGEN and using

 

proc iml;
call randseed(1);
time = T( do(1, 10, 1));
std=1;
N =5;
grandmean=20;
whz = j(nrow(time), N,.);
e = whz;
call randgen(e, "Normal", grandmean, std); /* generate all N(mu, sigma) */

whz[1,]=0.2; /*initialize the first instance*/
do i = 2 to nrow(time);
   whz[i,] =  whz[i-1,] + e[i-1,];
end;
print time whz;

> 2) Providing an example for the binary one. I couldn't get my code to work

 

Your probability depends only on the previous value of arpa, so the call to RANDGEN has to go inside the loop. Otherwise, the same ideas apply:

 

arpa = j(nrow(time), N, 0);
a = j(1, N, 0);
do i = 2 to nrow(time);
   call randgen(a, "Bernoulli", logistic(2*arpa[i-1, ]));
   arpa[i,] =  a;
end;
print arpa;
SimRock
Obsidian | Level 7

Thank you so much, Rick. This worked. I have a follow-up question on how to summarize the results by time points but I'll ask in a different post for clarity

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

Multiple Linear Regression in SAS

Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.

Find more tutorials on the SAS Users YouTube channel.

From The DO Loop
Want more? Visit our blog for more articles like these.
Discussion stats
  • 4 replies
  • 1480 views
  • 0 likes
  • 2 in conversation