IML data simulation Do Loop Fail

Occasional Contributor
Posts: 7

Hi Folks,

 

I'm very new to PROC IML but am attempting to simulate a dataset based on a preexisting correlation matrix. I found a nice macro for this (corr2data) at http://www.ats.ucla.edu/stat/sas/macros/corr2data_demo.htm. I've simplified this slightly for my use, as provided below. 

 

This works well for me, but I need to adapt it to simulate hundreds of datasets from the same correlation matrix and assign a sample ID to each simulated dataset. I used a %Do loop to achieve this, and in combination with a DATA step this works, but I'm failing at making a sampleID variable to append to each row of the output dataset (e.g., so that all rows generated in the first iteration are labeled "1" and all rows generated in the 13th iteration are labeled "13").

 

Here's what I've got so far. Help much appreciated. 

The macro call: %SIMUDATA(simu.Fakenorm, simu.Rmat, 1051); 

 

The macro:

%macro SIMUDATA(outdata, corrmat, n);
%Do index=1 %to 20 %by 1;

proc iml;
use &corrmat;
read all var _num_ into C;
rn = nrow(C);
cn = ncol(C);
p = root(C);
dim = nrow(C);
myvar = rannor(J(&n, dim, 0));
do i = 1 to dim;
myvar[, i] = myvar[,i]-(sum(myvar[,i])/&n);
end;
XX = (t(myvar)*myvar)/(&n-1);
U = root(inv(XX));
Y = myvar*T(U);
T = Y*p;
* S=J(dim, 1, &index);   /* my attempt to create a vector of nrow length labeled with the index number */
* V=insert(T,S,0,35);    /* my attempt to append that vector to the simulated dataset */
create &outdata from V;
append from V;
quit;

data simu.cumu;
set simu.cumu &outdata;
run;

%end;
%mend;


All Replies
SAS Super FREQ
Posts: 3,391

Re: IML data simulation Do Loop Fail

This macro is unnecessary. The macro merely generates random multivariate normal samples, which you can do directly in SAS/IML by using the RANDNORMAL function. See the article "Sampling from the multivariate normal distribution."

 

To produce many samples from the same correlation matrix, see the article "How to generate multiple samples from the multivariate normal distribution in SAS."

 

Since you say you are new to SAS/IML, here is the program that incorporates what you've asked for, but PLEASE read the article for background:

data corrmat;
input c1 c2 c3;
datalines;
3 2 1
2 4 0
1 0 2
;

proc iml;
call randseed(4321);               
/* specify population mean and covariance */
Mean = {0, 0, 0};
use corrmat;
read all var _num_ into Cov;
close corrmat;

N = 5;                 /* sample size */
NumSamples = 10;       /* number of samples/replicates */  
 
X = RandNormal(N*NumSamples, Mean, Cov);
ID = colvec(repeat(T(1:NumSamples), 1, N)); /* 1,1,1,...,2,2,2,...,3,3,3,... */
Z = ID || X;
create MVN from Z[c={"ID" "x1" "x2" "x3"}];
append from Z;
close MVN;
quit;

When you analyze these samples, be sure to use the BY statement in procedures, and do not write a macro loop, as detailed in the article "Simulation in SAS: The slow way or the BY way."
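
As a rough illustration of the BY-group approach, the sketch below computes the correlation matrix of every simulated sample in a single PROC CORR call. It assumes the MVN data set created above (variables ID, x1, x2, x3); the output data set name CorrOut is a placeholder chosen for this example.

```sas
/* Sketch only: assumes work.MVN from the program above, already ordered by ID. */
proc corr data=MVN noprint outp=CorrOut;  /* OUTP= writes Pearson statistics to a data set */
   by ID;                                  /* one analysis per simulated sample */
   var x1 x2 x3;
run;
```

CorrOut then holds one block of statistics per value of ID, so no macro loop is needed to process the replicates.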

Occasional Contributor
Posts: 7

Re: IML data simulation Do Loop Fail

Hi Rick, 

 

Thanks for your reply!

 

One thing I'm not following in your script is how the original correlation matrix is preserved in the generated sample. The sample correlation matrix I'm generating from has 34 variables, and I need the correlation pattern in those variables preserved in the generated data.

 

 

SAS Super FREQ
Posts: 3,391

Re: IML data simulation Do Loop Fail

I showed you a 3-variable example. The only change you need to make is to give the population mean (the zero vector) the same length as the number of variables in your data. You might also want to capture the original names of the variables and reuse them in the output data set.

 

proc iml;
call randseed(4321);               
/* specify population mean and covariance */
use corrmat;
read all var _num_ into Cov[c=varNames]; /* save var names */
close corrmat;
Mean = j(nrow(Cov),1,0); /* zero vector */

N = 5;                 /* sample size */
NumSamples = 10;       /* number of samples/replicates */  
 
X = RandNormal(N*NumSamples, Mean, Cov);
ID = colvec(repeat(T(1:NumSamples), 1, N)); /* 1,1,1,...,2,2,2,...,3,3,3,... */
Z = ID || X;
varNames = "ID" || varNames; /* concatenate "ID" to var names */
create MVN from Z[c=varNames];
append from Z;
close MVN;
quit;
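
One way to see that the correlation pattern carries over is to draw a single large sample and compare its sample covariance with the input matrix. This is a minimal sketch, assuming the same corrmat data set as above; the sample size of 100000 and the names SampleCov and MaxDiff are arbitrary choices for illustration. Because RandNormal draws from a population with the given covariance, the match should be close but not exact.

```sas
proc iml;
call randseed(4321);
use corrmat;
read all var _num_ into Cov;
close corrmat;
Mean = j(nrow(Cov), 1, 0);              /* zero vector */
X = RandNormal(100000, Mean, Cov);      /* one large sample */
SampleCov = cov(X);                     /* sample covariance matrix */
MaxDiff = max(abs(SampleCov - Cov));    /* largest elementwise discrepancy */
print SampleCov, MaxDiff;
quit;
```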

Solution
‎05-06-2016 01:17 PM
Occasional Contributor
Posts: 7

Re: IML data simulation Do Loop Fail

Much appreciated!
