I am using SAS 9.4 and attempting to adapt Rick Wicklin's data-driven simulation procedure as described at The DO Loop, September 27, 2017. The two adaptations I intend to make are (1) generating multivariate data (here, 2 DVs for simplicity) for 2 groups given a single study condition and (2) doing the same thing for each of several study conditions that involve, for example, different sample sizes or different correlations. The advantage of generating data for all of these conditions with one program is, of course, that I do not have to conduct many separate simulations when sample size and so forth change.

The first adaptation is below and is not causing any problems. However, for the second adaptation, I am having difficulty reading from each of several lines of the data set that contains the study conditions. As you see under the Two Condition header, I am using a Scan Loop procedure that works well for creating the data set containing the groups but, as you can see under the Two Condition Multivariate header, does not create multivariate outcomes (with this problem occurring in Proc IML). I assume that I am not reading in the vectors properly in Proc IML, and I do not know how to proceed. I am not sure whether Scan Loop can work with PROC IML and I am just not implementing it correctly, or whether some different method needs to be used.

I have 3 sections of code below. The first generates 2-group multivariate data using values from one condition that are contained in a data set. Again, there are no problems here, but I am showing this syntax to give you a better idea of the nature of the simulation. The second section includes the Scan Loop procedure but applies it only to the part of the simulation where groups are assigned (i.e., no multivariate data).
I do this to show that the Scan Loop procedure does exactly what I want it to do: create data for each of several conditions (just 2 conditions here) and obtain a single data set having raw data simulated many times under each of the study conditions. The third section attempts to apply the Scan Loop procedure to obtain the multivariate outcome data but fails.

Any help or pointers you could give would be greatly appreciated!

Thanks,
Keenan

Well, here goes . . .

/*************************************************
For one Condition;
*************************************************/

/* Data file that contains the parameters */
data cormat;
   input var1 covar1 covar2 var2 Meanres1 Meanres2 N NumSamples Cond ESY1 ESY2;
   datalines;
1 .5 .5 1 0 0 50 10 1 .5 .5
;

/* Assigns macro variable names and obtains values of parameters from Data set Cormat: */
data _null_;
   set cormat;
   CALL SYMPUT('N',N);
   CALL SYMPUT('Numsamples',Numsamples);
   CALL SYMPUT ('Cond', Cond);
   CALL SYMPUT ('Mean1', Mean1);
   CALL SYMPUT ('Mean2', Mean2);
   CALL SYMPUT ('ESY1', ESY1);
   CALL SYMPUT ('ESY2', ESY2);
run;

/* Creates data set with 2 groups with Group N = N/2 and reps = Numsamples, retains study parameters; */
Data Group;
   Cond=&Cond;
   do Reps = 1 to &Numsamples;
      do ID = 1 to &N;
         If ID le &N/2 then T = 0;
         else T=1;
         output;
      end;
   end;
run;

Data Group;
   Set Group;
   If T = 1 then do;   /* Assigns treatment means */
      PredY1 = &ESY1;
      PredY2 = &ESY2;
   end;
   Else if T = 0 then do;
      PredY1 = 0;      /*Assigns control means */
      PredY2 = 0;
   end;
   Output;
run;

/* Obtaining residuals from multivariate normal distribution using values from cormat */
proc iml;
use cormat (Keep = var1 covar1 covar2 var2);
read all var _NUM_ into vector;
Cov = shape (vector, 2, 2);
close cormat;

use cormat (Keep = Meanres1 Meanres2);
read all var _NUM_ into Mean;
close cormat;

use cormat (keep = N Numsamples);
read all var {N} into N;
read all var {Numsamples} into Numsamples;
close cormat;

R = RandNormal(N*Numsamples, Mean, Cov);
Reps =
   colvec(repeat(T(1:Numsamples), 1, N));   /* 1,1,1,...,2,2,2,...,3,3,3,... */
Z = Reps || R;
create MVN from Z[c={"Reps" "r1" "r2"}];
append from Z;
close MVN;
quit;

/* Merging data set having predicted values with data set having residuals */
/* Creating Observed Y scores as predicted plus residual */
Data all;
   merge group mvn;
   Y1 = PredY1 + r1;
   Y2 = PredY2 + r2;
run;

/*************************************************************************************
For Two Conditions;
Scan Loop works for Group data set
*************************************************************************************/

/* Data file that contains the parameters, with each line having different study conditions */
data cormat;
   input var1 covar1 covar2 var2 Meanres1 Meanres2 N NumSamples Cond ESY1 ESY2;
   datalines /* 2 conditions now included with N differing across conditions */;
1 .5 .5 1 0 0 50 10 1 .5 .5
1 .5 .5 1 0 0 30 10 2 .5 .5
;

/* Macro to SCAN through cormat data file */
%MACRO SCANLOOP(cormat,Field1,Field2,Field3,Field4,Field5,Field6,Field7,Field8,Field9,Field10,Field11);

/* First provide the number of records in cormat */
DATA _NULL_;
   IF 0 THEN SET &cormat NOBS=X;
   CALL SYMPUT('RECCOUNT',X);
   STOP;
RUN;

/* loop from one to number of records */
%DO I=1 %TO &RECCOUNT;

/* Advance to the Ith record */
DATA _NULL_;
   SET &cormat (FIRSTOBS=&I);
   /* store the variables of interest in macro variables */
   CALL SYMPUT('Var1',&Field1);
   CALL SYMPUT('Covar1',&Field2);
   CALL SYMPUT('Covar2',&Field3);
   CALL SYMPUT('Var2',&Field4);
   CALL SYMPUT('Meanres1',&Field5);
   CALL SYMPUT('Meanres2',&Field6);
   CALL SYMPUT('N',&Field7);
   CALL SYMPUT('NumSamples',&Field8);
   CALL SYMPUT('Cond',&Field9);
   CALL SYMPUT('ESY1',&Field10);
   CALL SYMPUT('ESY2',&Field11);
   STOP;
RUN;

/* Creates data set with 2 groups with Group N = N/2 and reps = Numsamples, retains study parameters; */
/* Does this for each study condition */
Data Group;
   Cond=&Cond;
   do Reps = 1 to &Numsamples;
      do ID = 1 to &N;
         If ID le &N/2 then T = 0;
         else T=1;
         output;
      end;
   end;
run;

Data Group;
   Set Group;
   If T = 1 then do;
      PredY1 = &ESY1;
      PredY2 = &ESY2;
   end;
   Else if T = 0 then do;
      PredY1 = 0;
      PredY2 = 0;
   end;
   Output;
run;

/* Proc datasets appends data sets from each study condition */
PROC DATASETS;
   APPEND BASE=ALLDATA DATA=Group;
RUN;
QUIT;

%END;
%MEND SCANLOOP;

/* Call SCANLOOP macro */
%SCANLOOP(cormat,var1,covar1,covar2,var2,Meanres1,Meanres2,N,NumSamples,Cond,ESY1,ESY2);
RUN;

/*************************************************************************************
For Two Conditions Multivariate Data:
Problem occurs in Proc IML part
*************************************************************************************/

/* Data file that contains the parameters, with each line having different study conditions */
data cormat;
   input var1 covar1 covar2 var2 Meanres1 Meanres2 N NumSamples Cond ESY1 ESY2;
   datalines /* 2 conditions now included with N differing across conditions */;
1 .5 .5 1 0 0 50 10 1 .5 .5
1 .5 .5 1 0 0 30 10 2 .5 .5
;

/* Macro to SCAN through cormat data file */
%MACRO SCANLOOP(cormat,Field1,Field2,Field3,Field4,Field5,Field6,Field7,Field8,Field9,Field10,Field11);

/* First provide the number of records in cormat */
DATA _NULL_;
   IF 0 THEN SET &cormat NOBS=X;
   CALL SYMPUT('RECCOUNT',X);
   STOP;
RUN;

/* loop from one to number of records */
%DO I=1 %TO &RECCOUNT;

/* Advance to the Ith record */
DATA _NULL_;
   SET &cormat (FIRSTOBS=&I);
   /* store the variables of interest in macro variables */
   CALL SYMPUT('Var1',&Field1);
   CALL SYMPUT('Covar1',&Field2);
   CALL SYMPUT('Covar2',&Field3);
   CALL SYMPUT('Var2',&Field4);
   CALL SYMPUT('Meanres1',&Field5);
   CALL SYMPUT('Meanres2',&Field6);
   CALL SYMPUT('N',&Field7);
   CALL SYMPUT('NumSamples',&Field8);
   CALL SYMPUT('Cond',&Field9);
   CALL SYMPUT('ESY1',&Field10);
   CALL SYMPUT('ESY2',&Field11);
   STOP;
RUN;

/* Creates data set with 2 groups with Group N = N/2 and reps = Numsamples, retains study parameters; */
/* Does this for each study condition */
Data Group;
   Cond=&Cond;
   do Reps = 1 to
      &Numsamples;
      do ID = 1 to &N;
         If ID le &N/2 then T = 0;
         else T=1;
         output;
      end;
   end;
run;

Data Group;
   Set Group;
   If T = 1 then do;
      PredY1 = &ESY1;
      PredY2 = &ESY2;
   end;
   Else if T = 0 then do;
      PredY1 = 0;
      PredY2 = 0;
   end;
   Output;
run;

/* Proc data sets appends data sets from each study condition */
PROC DATASETS;
   APPEND BASE=ALLDATA DATA=Group;
RUN;
QUIT;

/* The procedure fails from this point on */
/* Obtaining residuals from multivariate normal distribution using values from cormat */
proc iml;
use cormat (Keep = var1 covar1 covar2 var2};
/* I am not reading these variables, as well as others below, in properly;
   I also tried placing an & before each variable but that also failed */
read all var _NUM_ into vector;
Cov = shape (vector, 2, 2);
close cormat;

use cormat (Keep = Meanres1 Meanres2);
read all var _NUM_ into Mean;
close cormat;

use cormat (keep = N Numsamples);
read all var {N} into N;
read all var {Numsamples} into Numsamples;
close cormat;

R = RandNormal(N*Numsamples, Mean, Cov);
Reps = Fieldvec(repeat(T(1:Numsamples), 1, N));   /* 1,1,1,...,2,2,2,...,3,3,3,... */
Z = Reps || R;
create MVN from Z[c={"Reps" "r1" "r2"}];
append from Z;
close MVN;
quit;

%END;
%MEND SCANLOOP;

/* Call SCANLOOP macro */
%SCANLOOP(cormat,var1,covar1,covar2,var2,Meanres1,Meanres2,N,NumSamples,Cond,ESY1,ESY2);
RUN;

Below are some of the error messages I receive when I run the syntax above.

ERROR 23-7: Invalid value for the KEEP option.
ERROR: Invalid value for the KEEP option.
ERROR: Some options for file WORK.CORMAT were not processed because of errors or warnings noted above.
 statement : USE at line 290 column 1
ERROR: No data set is currently open for input.
 statement : READ at line 290 column 1
ERROR: (execution) Matrix has not been set to a value.
 operation : SHAPE at line 290 column 1
 operands  : vector, *LIT1001, *LIT1002
vector      0 row       0 col     (type ?, size 0)
*LIT1001    1 row       1 col     (numeric)
         2
*LIT1002    1 row       1 col     (numeric)
         2