BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Xusheng
Obsidian | Level 7

Hi, I used simulation to generate mixed distribution returns from state 1 & 2, where N =5000, P(state 1) = 70%, P(state 2) = 30%. Then I separate return 1 and return 2, and filling the missing value as 0. Now the question is how to create the average value scalar for each return? I mean create this number for future use. I tried transpose since the average number does not have the same dimension with my variables. However, after I count for the number of each return, R1 has 3400 whil R2 has 1600. Should these be 3500 for R1 and 1500 for R2? I would appreciate any comments and suggestion. Following is my sample and code:

/* sample from a mixture distribution */
%let N = 5000;
data States(drop=i);
call streaminit(12345);
array prob [2] _temporary_ (0.7 0.3);
do i = 1 to &N;
   type = rand("Table", of prob[*]);
   if type=1 then      Return = rand("Normal", 0, 3); /* state 1 */
   else                Return = rand("Normal", 5, 10); /* steta 2 */             
   output;
end;
run;
data work.States;
	set work.States;
if type = 1 then R1 = Return + 0;
else R2 = Return + 0;
if R1 =. then R1 =0;
if R2 =. then R2 =0;
Run;
proc transpose 
data=work.States;
run;
data want;
set data1;
array cnt col1-col5000;
array cnts cols1-cols5000;
do over cnt;
if cnt in (0,.) then cnts=1;
else cnts=.;
allcnts=nmiss(of cols1-cols5000);
end;
drop col: cols:;
run;

Here is a short example for the result:

N     type     return      R1       R2

1        1        0.58       0.58      0

2        2        0.36         0       0.36

3        1       -0.74      -0.74      0

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

I am definitely NOT an expert regarding the fairly new RAND function. However, that said, based on the UNIFORM distribution and the seed you used, 3416 of the observations you created had a random number generated that was less than .7.

 

That is apparently what the table option is selecting.

 

Your multiple calls to the function also likely interferred with a standard expectation. Try the following and look at the output. I use proc means to get the group counts but, at the same time, it (I think) does the calculations you want:

 

%let N = 5000;
data States(drop=i);
  call streaminit(12345);
  array prob [2] _temporary_ (0.7 0.3);
  do i = 1 to &N;
    type = rand("Table", of prob[*]);
    output;
  end;
run;
data States;
  set States;
  call streaminit(12345);
  if type=1 then      Return = rand("Normal", 0, 3); /* state 1 */
  else                Return = rand("Normal", 5, 10); /* steta 2 */             
run;
data states;
  set states;
  call streaminit(12345);
  rnum = rand("Uniform");
run;
proc sort data=states;
  by rnum;
run;

proc means data=states;
  var return;
  class type;
run;

Art, CEO, AnalystFinder.com

 

View solution in original post

6 REPLIES 6
art297
Opal | Level 21

If you want exact proportional assignment you might be better off using proc surveyselect. There is probably a more direct route to get what you want, but the following will work:

 

%let N = 5000;
data States;
  do i = 1 to &N;
   output;
  end;
run;

proc surveyselect data=states groups=10 seed=12345 out=states noprint;
run;

data States(drop=i);
  set States;
   if groupid lt 8 then type=1;
   else type=2;
   if type=1 then Return = rand("Normal", 0,  3);/* state 1 */
   else           Return = rand("Normal", 5, 10);/* state 2 */             
   output;
run;

Art, CEO, AnalystFinder.com

 

Xusheng
Obsidian | Level 7

Hi art297, thank you for your contribution. I got exactly 3500 returns from state 1 and rest from state 2.

I still have two quesitons with your code.

1. Is there any possible reason why my code generate 3400 R1 and 1600 R2?

2. After grouping and creating the returns, how to get the mean, variance, standard deviation, densitiy, etc for return from state1 and from state 2 separately?

 

art297
Opal | Level 21

I am definitely NOT an expert regarding the fairly new RAND function. However, that said, based on the UNIFORM distribution and the seed you used, 3416 of the observations you created had a random number generated that was less than .7.

 

That is apparently what the table option is selecting.

 

Your multiple calls to the function also likely interferred with a standard expectation. Try the following and look at the output. I use proc means to get the group counts but, at the same time, it (I think) does the calculations you want:

 

%let N = 5000;
data States(drop=i);
  call streaminit(12345);
  array prob [2] _temporary_ (0.7 0.3);
  do i = 1 to &N;
    type = rand("Table", of prob[*]);
    output;
  end;
run;
data States;
  set States;
  call streaminit(12345);
  if type=1 then      Return = rand("Normal", 0, 3); /* state 1 */
  else                Return = rand("Normal", 5, 10); /* steta 2 */             
run;
data states;
  set states;
  call streaminit(12345);
  rnum = rand("Uniform");
run;
proc sort data=states;
  by rnum;
run;

proc means data=states;
  var return;
  class type;
run;

Art, CEO, AnalystFinder.com

 

Xusheng
Obsidian | Level 7

Thank you art297, I've tried your code and it works. 

Since I would like to use the exact 70% and 30%, I will apply your first method and then use excel to get the parameters I need.

I will consider the second method when I come back to SAS and continue with following steps. Thank you.

ilikesas
Barite | Level 11

Hi Xusheng,

 

I don't know if this approach is too crude, but why don't you just generate the 5000 i's, and if i is less than or equal to 3500 then type =1 and else type=2, and then you do all the subsequent simulation steps like art297 showed?

Xusheng
Obsidian | Level 7

Thank you ilikesas, I did what you suggest -- create two loop one is from 1 to 3500 and another is from 3501 to 5000. And it works. Thank you so much.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 1062 views
  • 1 like
  • 3 in conversation