Posted 06-02-2021 05:42 PM
(442 views)

Hello,

I am new to PROC IML and am trying to simulate multivariate data to run through statistical models and compute power. Below is very simplified code similar to what I am doing. The middle of the program, where I simulate one variable from a normal distribution here, would be more complicated and would simulate multivariate data with a specified correlation. The part I cannot figure out is how to incorporate the DO loops into IML and get the same type of output I would get from the code below (i.e., columns for run, n, trt, plot, and y) so that I can run my statistical model by run and n and compute power. Any help would be greatly appreciated.

Also, I am able to simulate multivariate normal data without IML. I found code to simulate multivariate binomial data both with IML and without. For a multivariate beta distribution I found IML code that uses copulas. However, I also need to simulate multivariate count data (both Poisson and negative binomial). I have not found any code for this and am not sure whether copulas can be used for these two distributions. If anyone knows of sample code for multivariate Poisson or negative binomial data, that would be very helpful. I am running SAS 9.4 and SAS/IML 15.2.

Thank you. Deb

    data test;
       do run=1 to 1000;
          do n=2 to 50 by 2;
             do trt=1 to 4;
                do plot=1 to n;
                   y=rand("Normal",0,1);
                   output;
                end;
             end;
          end;
       end;
    run;

3 REPLIES

That's a lot of questions. Most of them are addressed in Chapters 11 and 12 of *Simulating Data with SAS*.

The Poisson and negative binomial models are two examples of a generalized linear model. See the article "Simulate many samples from a logistic regression model," which shows how to simulate a logistic model. You can get other generalized linear models by modifying the statement

call randgen(y, "Bernoulli", mu); /* 4. simulate binary response */

to instead sample Y from the Poisson or negative binomial distribution.
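As a minimal sketch of that substitution, the following SAS/IML program simulates one explanatory variable, forms the mean through a log link, and draws Poisson and negative binomial responses. The sample size, the coefficients, and the size parameter `k` are made-up values for illustration, and the program assumes that RANDGEN accepts a vector of parameters for these distributions in the same way it does for the Bernoulli call above.

```sas
proc iml;
call randseed(12345);

N = 100;                          /* sample size (assumed value) */
x = j(N, 1);
call randgen(x, "Normal");        /* one explanatory variable, N(0,1) */

eta = 0.5 + 0.8*x;                /* linear predictor; coefficients are hypothetical */
mu  = exp(eta);                   /* log link: mean count for each observation */

/* Poisson response with observation-specific means */
yPois = j(N, 1);
call randgen(yPois, "Poisson", mu);

/* Negative binomial response with the same means.
   RAND/RANDGEN "NegBinomial" uses (p, k) and has mean k*(1-p)/p,
   so p = k/(k + mu) gives E[y] = mu. k = 2 is an assumed value. */
k = 2;
p = k / (k + mu);
yNB = j(N, 1);
call randgen(yNB, "NegBinomial", p, k);

meanMu   = mean(mu);
meanPois = mean(yPois);
meanNB   = mean(yNB);
print meanMu meanPois meanNB;     /* the three means should be roughly similar */
quit;
```

This produces univariate count responses from a regression model. It does not by itself induce correlation between several count variables.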

I've written about how to compute a power curve by using the DATA step. To use PROC IML, you can study the example at the end of the article "Use simulation to estimate the power of a statistical test."

You can then add the loops. You can study the code in "Estimate a power curve in parallel in SAS Viya," but that article uses parallel computations in SAS Viya. Nevertheless, you can use the general framework of the program in PROC IML in SAS 9, but compute the curve by using serial computations.

Here is one specific suggestion: Put the 'n' loop on the outside. The inner loops are then responsible for generating 1000 samples of your data for a given sample size.

I hope this helps. If you have specific questions, start a new thread in which you ask ONE question. Include the IML code that you have written so far.

For tips on efficient simulation in SAS, read the paper (or watch the video) "Ten Tips for Simulating Data with SAS," especially the section on avoiding macro loops and PROC APPEND: "Simulation in SAS: The slow way or the BY way."

The SAS/IML language supports the iterative DO statement. The syntax is the same as for the DATA step. There is no need to use macro loops.

As I said, the article "Simulate many samples from a logistic regression model" provides simulation code similar to what you want to do. You can open the data set for output, run the loops, and output (APPEND) the simulated data for each sample.
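A rough sketch of that open/loop/append pattern, applied to the layout in the original question, might look like the following. It puts the `n` loop on the outside and generates all 1000 runs for a given `n` in one vectorized step. The matrix and variable names (`M`, `runCol`, `trtCol`, `plotCol`, the data set `Sim`) are placeholders chosen for this illustration, and the univariate normal draw stands in for whatever multivariate generator is eventually used.

```sas
proc iml;
call randseed(54321);
NumRuns  = 1000;                  /* samples per sample size */
NumTrt   = 4;                     /* number of treatments */
varNames = {"run" "n" "trt" "plot" "y"};

M = j(1, 5, .);                   /* template so CREATE sees 5 numeric columns */
create Sim from M[colname=varNames];

do n = 2 to 50 by 2;              /* sample size on the outside */
   nRows   = NumRuns * NumTrt * n;
   /* COLVEC reads a matrix in row-major order, which gives the
      same nesting as the DATA step loops: run, then trt, then plot */
   runCol  = colvec( repeat(T(1:NumRuns), 1, NumTrt*n) );
   trtCol  = colvec( repeat(T(1:NumTrt), NumRuns, n) );
   plotCol = colvec( repeat(1:n, NumRuns*NumTrt, 1) );

   y = j(nRows, 1);
   call randgen(y, "Normal", 0, 1);   /* replace with the real generator */

   M = runCol || j(nRows, 1, n) || trtCol || plotCol || y;
   append from M;                 /* write this block of rows to Sim */
end;

close Sim;
quit;
```

The resulting data set is sorted by `n` and then `run`, so a procedure can use `BY n run;` directly. To process it with `BY run n;` as in the original DATA step, sort it first.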

My suggestion: Write a module called RandMVBeta that takes the following input parameters, which are the design parameters for your simulation study:

- runs (number of samples)
- n (sample size)
- diff (difference between means in population)
- rho (correlation of MVN data in population)

The function should return a (runs*n) x 2 matrix of random variates from your correlated bivariate beta distribution. You can then call that function in a loop that runs over the design parameters.
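One way such a module could be sketched is with a Gaussian copula: draw correlated normal variates, map them to uniforms with the normal CDF, and then map the uniforms to beta variates with the beta quantile function. How `diff` translates into beta parameters is not specified in this thread, so the parameterization below (means 0.5 and 0.5 + `diff`, with a common precision `phi`) is purely an assumption for illustration.

```sas
proc iml;
start RandMVBeta(runs, n, diff, rho);
   /* ASSUMPTIONS (not from the thread): marginal means are 0.5 and
      0.5 + diff, and both marginals share the precision phi, so that
      shape parameters are a = mu*phi and b = (1-mu)*phi */
   phi = 10;
   mu1 = 0.5;
   mu2 = 0.5 + diff;

   Sigma = (1 || rho) // (rho || 1);        /* correlation of the MVN data */
   Z = RandNormal(runs*n, {0 0}, Sigma);    /* correlated standard normals */
   U = cdf("Normal", Z);                    /* correlated uniforms */

   B = j(runs*n, 2, .);
   B[,1] = quantile("Beta", U[,1], mu1*phi, (1-mu1)*phi);
   B[,2] = quantile("Beta", U[,2], mu2*phi, (1-mu2)*phi);
   return( B );
finish;

call randseed(1);
X = RandMVBeta(1000, 20, 0.1, 0.6);         /* example call with made-up values */
dims = nrow(X) || ncol(X);
colMeans = mean(X);
print dims, colMeans;
quit;
```

Note that `rho` is the correlation of the underlying normal variates. The correlation between the two beta variates will generally differ somewhat from `rho`.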

You then have two choices:

1. For each call, you output (append) the random values, along with the values of the parameters. At the end of your program, you have written a SAS data set that you can analyze using PROC CORR, PROC MEANS, etc.

2. For each sample, use PROC IML to compute the results by calling the CORR function, the MEAN function, etc. You would then only write (append) the statistics for each sample, not the simulated values.
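The second choice can be sketched as follows. To keep the example short, it uses a stand-in analysis that is not from this thread: a one-sample t test of a zero mean on normal data whose true mean is `delta`. Each column of `Y` is one sample, the test statistic is computed for all samples at once, and only the estimated power for each `n` is written to a data set. All names and parameter values are hypothetical.

```sas
proc iml;
call randseed(2021);
NumRuns = 1000;                   /* samples per sample size */
delta   = 0.5;                    /* true mean under the alternative (assumed) */
alpha   = 0.05;
sizes   = do(2, 50, 2);           /* row vector of sample sizes */
power   = j(ncol(sizes), 1, .);

do i = 1 to ncol(sizes);
   n = sizes[i];
   Y = j(n, NumRuns);             /* each column is one simulated sample */
   call randgen(Y, "Normal", delta, 1);

   /* MEAN and STD operate on columns, so t is a 1 x NumRuns row vector */
   t      = mean(Y) / (std(Y) / sqrt(n));
   crit   = quantile("T", 1 - alpha/2, n - 1);
   reject = (abs(t) > crit);      /* 1 if H0 is rejected for that sample */
   power[i] = reject[:];          /* proportion of rejections */
end;

Result = T(sizes) || power;
create PowerCurve from Result[colname={"n" "power"}];
append from Result;
close PowerCurve;
quit;
```

Because the simulated values are never written to disk, this version writes 25 rows instead of millions. The trade-off is that the analysis must be something you can compute in SAS/IML.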

The second option is better for speed and efficiency, but which one you choose depends on your proficiency as a SAS/IML programmer.
