☑ This topic is solved.
Levi_M
Fluorite | Level 6

Good Morning, 

I am trying to create a synthetic bivariate data set in which the outcome is influenced by specified correlations, with roughly 10+ variables. I am using the %RanMBin macro, but I keep getting violations. Does anyone have experience with this who could help?

 

Thank you, 

Spaxxs 

5 REPLIES
AMSAS
SAS Super FREQ

@Levi_M You will have better success getting a reply if you provide more information.

Here's a link to a SAS Note on RANMBIN:
Sample 66969: Generate multivariate binary data with specified means and correlation matrix 

  • What happens in the SAS log? 
  • What errors do you get?
  • Does it work on a small example?

The more information you can provide, the better we can assist.

sbxkoenk
SAS Super FREQ

Hello,

 

I had never heard about %RanMBIN.

@Rick_SAS : you know that one?

 

But I know about this (related) blog from @Rick_SAS :

Tips to simulate binary and categorical variables
By Rick Wicklin on The DO Loop November 2, 2020
https://blogs.sas.com/content/iml/2020/11/02/simulate-binary-and-categorical-variables.html

 

Cheers,

Koen

Levi_M
Fluorite | Level 6

Thank you for the feedback. I have attached the 4 items that I am using for the dataset generation.

[Attached image: spaxxs_0-1656514950095.png]

 

Hopefully this helps?

Rick_SAS
SAS Super FREQ

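Chapter 9 of Wicklin (2013) explains why some sets of correlations are not feasible. "Feasible" means that the combination of means and correlations that you specified is possible under the distributions that the macro uses. For binary variables, the means restrict the range of correlations that can be attained, so not every valid correlation matrix is feasible.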

 

The macro you are using supports a limited number of correlation structures: compound symmetric, AR(1), and banded. It is possible that your parameters are not feasible for those structured correlation matrices but are feasible for more general correlations. The programs in Wicklin (2013) implement the Emrich-Piedmonte algorithm, which enables you to fit arbitrary correlation structures (but you still have to specify feasible parameters). However, RANMBIN requires only Base SAS, whereas the methods in Wicklin (2013) require a SAS/IML license.
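As a rough illustration of what "feasible" means here, the following Base SAS sketch checks each pair of binary variables against the standard pairwise bounds on the correlation of two Bernoulli variables with means p_i and p_j. The means and target correlations below are made-up values for illustration only, and the data set name `bounds` is arbitrary. This is not the check that the macro itself performs. The pairwise bounds are necessary but not sufficient, so passing this check does not guarantee that the macro will accept the parameters.

```sas
/* Sketch: pairwise feasibility bounds for correlated binary variables.   */
/* The means (p) and target correlations (r) are hypothetical values;     */
/* replace them with your own.                                            */
data bounds;
   array p[3]   _temporary_ (0.2 0.5 0.8);
   array r[3,3] _temporary_ (1   0.3 0.6
                             0.3 1   0.4
                             0.6 0.4 1  );
   do i = 1 to dim(p) - 1;
      do j = i + 1 to dim(p);
         qi = 1 - p[i];
         qj = 1 - p[j];
         /* smallest and largest correlation attainable for these two means */
         lower = max(-sqrt(p[i]*p[j] / (qi*qj)), -sqrt(qi*qj / (p[i]*p[j])));
         upper = min( sqrt(p[i]*qj / (qi*p[j])),  sqrt(qi*p[j] / (p[i]*qj)));
         rho = r[i,j];
         feasible = (lower <= rho <= upper);   /* 1 = inside the bounds */
         output;
      end;
   end;
   keep i j rho lower upper feasible;
run;

proc print data=bounds noobs;
run;
```

With these example values, the pair (1,3) has means 0.2 and 0.8, so the largest attainable correlation is 0.25 and the requested 0.6 is flagged as infeasible. Pairs whose means are very different allow only a narrow range of correlations, which is a common reason for violations when many variables are involved.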

Ksharp
Super User

How about this one? It uses a genetic algorithm.

 

data corr;
infile cards expandtabs;
input x1-x4 ;
cards;
1           0.452159638 0.220107738 0.412390423
0.452159638 1	        0.080503668 0.366678316
0.220107738 0.080503668	1           0.0022723
0.412390423 0.366678316	0.0022723   1
;


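proc iml;
/* read the target correlation matrix */
use corr;
read all var _num_ into corr[c=vname];
close;

/* objective: sum of squared differences between the correlation matrix
   of the candidate data and the target correlation matrix */
start function(x) global(ncol,corr);
 temp=corr(shape(x,0,ncol));
 sse=ssq(temp-corr) ;
 return (sse);
finish;

nobs=1000;          /* number of observations to generate */
ncol=ncol(corr);    /* number of variables */
size=nobs#ncol;     /* length of the solution vector */

/* lower bound 0 and upper bound 1 for every element */
bounds=j(2,size,0);
bounds[2,]=1 ;    

id=gasetup(2,size,123456789);    /* encoding 2: fixed-length integer vector; last argument is the seed */
call gasetobj(id,0,"function");  /* 0: minimize the user module */
call gasetsel(id,10,1,.95);      /* 10 elite members, dual tournament, best-player-wins probability 0.95 */
call gainit(id,10000,bounds);    /* initial population of 10000 members */

/* run the genetic algorithm for a fixed number of generations */
niter =  200 ;
do i = 1 to niter;
 call garegen(id);
 call gagetval(value, id);
end;
/* retrieve the best member and its objective value */
call gagetmem(mem, value, id, 1);

/* reshape the solution vector into an nobs x ncol data matrix */
want=shape(mem,0,ncol);

create want from want[c=vname];
append from want;
close;

print value[l = "Min value (closer to zero is better)"] ;
call gaend(id);
quit;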


proc corr data=want pearson;
var _numeric_;
run;

[Attached image: Ksharp_0-1656592056667.png]

 
