thewan
Quartz | Level 8

This is an example dataset. I am running an interim analysis with 3 interim looks and 1 final look. SimulID identifies the simulation; there are 3 simulations here.

 

In all 3 simulations, the first interim happens when there are a total of 50 events. This is simple and straightforward.

 

The second interim covers observations with t up to newtime, where newtime is the time at which the 50th event occurs plus 17. This value varies from simulation to simulation. The problem is that SAS uses the value from the last simulation for the other two simulations as well. Is there a way to have SAS replace the values of newtime and newtime2 for each value of SimulID?

 

/* sigma is scale parameter; use sigma=1/lambda for a rate parameter */
%macro RandExp(sigma);
((&sigma) * rand("Exponential"))
%mend;

/************ GENERATE TTE DATA *****************/

data Events(keep= SimulID PatientID t Event arm);
	do SimulID=1 to 3;					/*Number of Simulations*/
		call streaminit(1);
		HazardRate = 0.05; 				/* rate at which subject experiences event */
		CensorRate = 0.001; 				/* rate at which subject drops out */
		EndTime = 365; 					/* end of study period */
		do PatientID = 1 to 100;
			tEvent = %RandExp(1/HazardRate);
			c = %RandExp(1/CensorRate);
			t = min(tEvent, c, EndTime);
			Event = (tEvent <= c & tEvent <= EndTime);	/* event is observed only if it precedes dropout and study end */
			arm=1;
		output;
		end;
		do PatientID = 101 to 200;
			tEvent = %RandExp(1/(HazardRate-0.01));
			c = %RandExp(1/CensorRate);
			t = min(tEvent, c, EndTime);
			Event = (tEvent <= c & tEvent <= EndTime);	/* event is observed only if it precedes dropout and study end */
			arm=2;
		output;
		end;
	end;
run;





/************ INTERIM ANALYSIS *****************/
proc sort data=Events;
	by SimulID t;
run;

data Events;
	set Events;
	by SimulID;
	if first.SimulID then Total=0; 				*Set Total by Simulation ID;
	Total + Event; 						*Count only the events;
run;


/*Interim 1*/
data Events;
	set Events;
	if Total <= 50 then do;					*Interim 1 happens when events<=50;
		L_EVT1 = 1;
		Y1 = t;
		if event = 0 then L_EVT1 = 0;
		interim=1;
	end;
	if interim=. then interim=5;				*Used to make calculations easier;
run;



/*Sort by interim since the next datastep uses interim as a by variable*/
proc sort data=Events;
	by interim;
run;



data Events;
	set Events;
	by interim;
	if last.interim and interim=1 then do;
		newtime=t+17;
		call symputx('newtime',newtime,'G');	*The next interim happens at time at last.interim=1 + 17;
	end;
run;




/*Interim 2*/
data Events;
	set Events;
	if total>50 and t LE symgetn('newtime') then do;
		L_EVT2 = 1;
		Y2 = t;
		if event = 0 then L_EVT2 = 0;
		interim=2;
	end;
run;

proc sort data=Events;
	by interim;
run;

data Events;
	set Events;
	by interim;
	if last.interim and interim=2 then do;
		newtime=t+17;
		call symputx('newtime2',newtime,'G');	*The next interim happens at time at last.interim=2 + 17;
	end;
run;




/*Interim 3*/
data Events;
set Events;
	if t>symgetn('newtime') and t LE symgetn('newtime2') then do;
		L_EVT3 = 1;
		Y3 = t;
		if event = 0 then L_EVT3 = 0;
		interim=3;
	end;
run;

proc sort data=Events;
	by interim;
run;




/*Last Look*/
data Events;
	set Events;
	if t>symgetn('newtime2') then do;
		L_EVT4 = 1;
		Y4 = t;
		if event = 0 then L_EVT4 = 0;
		interim=4;
	end;
run;



proc sort data=Events;
	by SimulID interim;
run;
jimbarbour
Meteorite | Level 14

I know nothing of simulation.  I am an old programmer, looking at things from a data perspective.  If I'm so wide of the mark to be of no help, have yourself a good chuckle, and perhaps someone more knowledgeable will come along.

 

If you want values by simulation, and then at interim points within each simulation, I'm wondering whether this sort "by interim;"

/*Sort by interim since the next datastep uses interim as a by variable*/
proc sort data=Events;
	by interim;
run;

shouldn't be "by SimulID interim;".

 

If I want to maintain separate groups of information, I usually use that grouping as the first variable in a sort.

 

Second, if each simulation should have different values of newtime (or newtime2), have you considered a macro array? Instead of Newtime and Newtime2, you might use NewtimeA and NewtimeB, but you would have NewtimeA1, NewtimeA2, NewtimeA3, and the same for NewtimeB. The simulation number (1, 2, or 3) could be used to control which occurrence of the macro array is used, and then you would have distinct values for each simulation.

 

Macro arrays (you probably already know this, but for the sake of being thorough) are typically referenced in the form &&Var&i. For example, NewtimeA would be referenced as &&NewtimeA&simulation, where the value of &simulation corresponds to the simulation number. NewtimeB would follow a similar pattern. In this way you would have distinct values for NewtimeA and NewtimeB by simulation.
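To make the &&Var&i pattern concrete, here is a minimal sketch. The cutoffs data set, its values, and the ShowCutoffs macro are made up purely for illustration; only the NewtimeA naming comes from the suggestion above.

```sas
/* Hypothetical cutoff table: one row per simulation (illustrative values) */
data cutoffs;
	input SimulID newtime;
	datalines;
1 20.5
2 23.1
3 19.8
;
run;

/* Create NewtimeA1, NewtimeA2, NewtimeA3: one macro variable per simulation */
data _null_;
	set cutoffs;
	call symputx(cats('NewtimeA', SimulID), newtime, 'G');
run;

%macro ShowCutoffs;
	%do Simulation = 1 %to 3;
		/* &&NewtimeA&Simulation first resolves to &NewtimeA1, then to its value */
		%put NOTE: Simulation &Simulation uses cutoff &&NewtimeA&Simulation;
	%end;
%mend ShowCutoffs;
%ShowCutoffs
```

The double ampersand delays resolution by one pass, so the suffix is filled in before the full macro variable name is looked up.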

 

You'd probably have to put all your data steps after the first data step into a macro and in that macro have a structure something like this:

%macro Simulation_Control;
    %do Simulation = 1 %TO 3;
        /* ... Data steps go here, each modified to do only 1 simulation at a time
           and to use the proper &&NewtimeA&Simulation and &&NewtimeB&Simulation values ... */
    %end;
%mend Simulation_Control;

These are just some thoughts from someone thoroughly ill-informed.

 

No doubt others will come up with better ideas than this.

 

Best of luck,

 

Jim

thewan
Quartz | Level 8

Thank you for the quick reply, and for the suggestion to sort by SimulID and interim in the PROC SORT steps. That cleaned up my results!

 

I did look into macro arrays before. I didn't pursue them because the number of macro variables would grow ridiculously large with many simulations. If I'm running 1000 simulations, that would slow my program. I'm kind of bummed that macro arrays seem to be the only solution to this.
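For what it's worth, one way to sidestep macro variables entirely, which may scale better to many simulations, is to keep the cutoffs in a data set and merge them back by SimulID. This is only a sketch on made-up toy data; the data set names toy, cutoffs, and toy2 are hypothetical, and it assumes the data are sorted by SimulID and t with a running event count in Total.

```sas
/* Hypothetical toy data, sorted by SimulID and t, with a running event count */
data toy;
	input SimulID t Total;
	datalines;
1 5 50
1 12 51
1 30 52
2 4 50
2 9 51
2 40 52
;
run;

/* One row per simulation: time of the last row with Total <= 50, plus 17 */
data cutoffs(keep=SimulID newtime);
	set toy;
	where Total <= 50;
	by SimulID;
	if last.SimulID then do;
		newtime = t + 17;
		output;
	end;
run;

/* Attach each simulation's own cutoff to its rows and flag interim 2 */
data toy2;
	merge toy cutoffs;
	by SimulID;
	if Total > 50 and t <= newtime then interim = 2;
run;
```

Because the cutoff travels with the data as an ordinary variable, no symputx or symgetn call is needed, and the same steps work unchanged for any number of simulations.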

 

I updated my program to reflect your suggestions.

 

dm 'log;clear;output;clear; '; 							/*Clear the log and output windows*/
dm 'odsresults; clear';								/*Clear the results window*/
PROC DATASETS LIB=work NOlist MEMTYPE=data kill; 		/*Delete all data sets in WORK*/
quit;

/* sigma is scale parameter; use sigma=1/lambda for a rate parameter */
%macro RandExp(sigma);
((&sigma) * rand("Exponential"))
%mend;



/************ GENERATE TTE DATA *****************/

data Events(keep= SimulID PatientID t Event arm);
	do SimulID=1 to 3;									/*Number of Simulations*/
		call streaminit(1);
		HazardRate = 0.05; 								/* rate at which subject experiences event */
		CensorRate = 0.001; 							/* rate at which subject drops out */
		EndTime = 365; 									/* end of study period */
		do PatientID = 1 to 100;
			tEvent = %RandExp(1/HazardRate);
			c = %RandExp(1/CensorRate);
			t = min(tEvent, c, EndTime);
			Event = (tEvent <= c & tEvent <= EndTime);	/* event is observed only if it precedes dropout and study end */
			arm=1;
		output;
		end;
		do PatientID = 101 to 200;
			tEvent = %RandExp(1/(HazardRate-0.01));
			c = %RandExp(1/CensorRate);
			t = min(tEvent, c, EndTime);
			Event = (tEvent <= c & tEvent <= EndTime);	/* event is observed only if it precedes dropout and study end */
			arm=2;
		output;
		end;
	end;
run;

proc sort data=Events;
	by SimulID t;
run;

data Events;
	set Events;
	by SimulID;
	if first.SimulID then Total=0; 				*Set Total by Simulation ID;
	Total + Event; 								*Count only the events;
run;


/*Interim 1*/
data Events;
	set Events;
	if Total <= 50 then do;						*Interim 1 happens when events<=50;
		L_EVT1 = 1;
		Y1 = t;
		if event = 0 then L_EVT1 = 0;
		interim=1;
	end;
	if interim=. then interim=5;				*Used to make calculations easier;
run;


/*Sort by interim since the next datastep uses interim as a by variable*/
proc sort data=Events;
	by SimulID interim;
run;



%macro sim;
/************ INTERIM ANALYSIS *****************/

%do SimulID=1 %to 3;

		data Events;
			set Events;
			by SimulID interim;
			if last.interim and interim=1 then do;
				newtime=t+17;
				call symputx('newtime',newtime,'G');	*The next interim happens at time at last.interim=1 + 17;
			end;
		run;


		/*Interim 2*/
		data Events;
			set Events;
			if total>50 and t LE symgetn('newtime') then do;
				L_EVT2 = 1;
				Y2 = t;
				if event = 0 then L_EVT2 = 0;
				interim=2;
			end;
		run;

		proc sort data=Events;
			by SimulID interim;
		run;

		data Events;
			set Events;
			by SimulID interim;
			if last.interim and interim=2 then do;
				newtime=t+17;
				call symputx('newtime2',newtime,'G');	*The next interim happens at time at last.interim=2 + 17;
			end;
		run;


		/*Interim 3*/
		data Events;
		set Events;
			if t>symgetn('newtime') and t LE symgetn('newtime2') then do;
				L_EVT3 = 1;
				Y3 = t;
				if event = 0 then L_EVT3 = 0;
				interim=3;
			end;
		run;

		proc sort data=Events;
			by SimulID interim;
		run;


		/*Last Look*/
		data Events;
			set Events;
			if t>symgetn('newtime2') then do;
				L_EVT4 = 1;
				Y4 = t;
				if event = 0 then L_EVT4 = 0;
				interim=4;
			end;
		run;



		proc sort data=Events;
			by SimulID interim;
		run;

%end;

%mend sim;

%sim;

 

 

jimbarbour
Meteorite | Level 14

I looked at your modified code.

 

Since you are doing 

%do SimulID=1 %to 3;

I think that if you add a WHERE statement to the first DATA step in your macro, the macro variables NewTime and NewTime2 should be unique by simulation.

 

The code would look something like this:

		data Events;
			set Events;
			by SimulID interim;
			where SimulID = &SimulID;
			if last.interim and interim=1 then do;
				newtime=t+17;
				call symputx('newtime',newtime,'G');	*The next interim happens at time at last.interim=1 + 17;
			end;
		run;

That might avoid having to use a macro array.  You could try experimenting with that.
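One thing to watch when experimenting: a WHERE statement in a step that reads and rewrites Events (data Events; set Events;) leaves only the selected simulation's rows in Events, so later loop iterations would find nothing to read. A hedged variant is to compute the cutoff in a data _null_ step, which reads Events without replacing it. The toy data and the macro name sim_cutoffs below are hypothetical.

```sas
/* Hypothetical toy version of Events: interim=1 marks the first-look rows */
data Events;
	input SimulID t interim;
	datalines;
1 5 1
1 12 1
1 30 5
2 4 1
2 9 1
2 40 5
;
run;

%macro sim_cutoffs;
	%do SimulID = 1 %to 2;
		/* data _null_ reads Events but does not overwrite it */
		data _null_;
			set Events end=lastobs;
			where SimulID = &SimulID and interim = 1;
			retain maxt;
			maxt = max(maxt, t);	/* latest first-look time seen so far */
			if lastobs then call symputx('newtime', maxt + 17, 'G');
		run;
		%put NOTE: Simulation &SimulID newtime is &newtime;
	%end;
%mend sim_cutoffs;
%sim_cutoffs
```

Here newtime is overwritten on each pass, so any step that uses it would need to run inside the same loop iteration and be restricted to the same SimulID.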

 

Jim

thewan
Quartz | Level 8

Thank you, this sounds like a good approach!

I ended up using the CATT function to build the macro variable name, so each macro variable is stored under the relevant simulation ID, and it's working.

 

call symputx(catt('newtime',SimulID),newtime,'G');
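For anyone reading later, the stored values can be looked up in the same way with SYMGETN. Below is a minimal sketch on made-up toy data. The prefix cutA is an assumption chosen for illustration: with the prefix newtime, simulation 2 would produce a macro variable named newtime2, which collides with the name used for the second cutoff earlier in the thread.

```sas
/* Hypothetical toy data: two simulations with a running event count */
data toy;
	input SimulID t Total;
	datalines;
1 5 50
1 12 51
1 30 52
2 4 50
2 9 51
2 40 52
;
run;

/* Step 1: store one cutoff per simulation (time at Total=50, plus 17) */
data _null_;
	set toy;
	where Total = 50;
	call symputx(catt('cutA', SimulID), t + 17, 'G');
run;

/* Step 2: each row looks up the cutoff that matches its own SimulID */
data toy2;
	set toy;
	cutoff = symgetn(catt('cutA', SimulID));
	if Total > 50 and t <= cutoff then interim = 2;
run;
```

CATT converts the numeric SimulID to text and strips blanks, so the names come out as cutA1, cutA2, and so on.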
