turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- Base SAS Programming
- /
- Variable average with missing value

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

04-01-2017 08:40 AM

Hi, I used simulation to generate mixed distribution returns from state 1 & 2, where N =5000, P(state 1) = 70%, P(state 2) = 30%. Then I separate return 1 and return 2, and filling the missing value as 0. Now the question is how to create the average value scalar for each return? I mean create this number for future use. I tried transpose since the average number does not have the same dimension with my variables. However, after I count for the number of each return, R1 has 3400 whil R2 has 1600. Should these be 3500 for R1 and 1500 for R2? I would appreciate any comments and suggestion. Following is my sample and code:

```
/* sample from a mixture distribution */
%let N = 5000;
data States(drop=i);
call streaminit(12345);
array prob [2] _temporary_ (0.7 0.3);
do i = 1 to &N;
type = rand("Table", of prob[*]);
if type=1 then Return = rand("Normal", 0, 3); /* state 1 */
else Return = rand("Normal", 5, 10); /* steta 2 */
output;
end;
run;
data work.States;
set work.States;
if type = 1 then R1 = Return + 0;
else R2 = Return + 0;
if R1 =. then R1 =0;
if R2 =. then R2 =0;
Run;
proc transpose
data=work.States;
run;
data want;
set data1;
array cnt col1-col5000;
array cnts cols1-cols5000;
do over cnt;
if cnt in (0,.) then cnts=1;
else cnts=.;
allcnts=nmiss(of cols1-cols5000);
end;
drop col: cols:;
run;
```

Here is a short example for the result:

N type return R1 R2

1 1 0.58 0.58 0

2 2 0.36 0 0.36

3 1 -0.74 -0.74 0

Accepted Solutions

Solution

04-01-2017
04:03 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Xusheng

04-01-2017 03:17 PM

I am definitely NOT an expert regarding the fairly new RAND function. However, that said, based on the UNIFORM distribution and the seed you used, 3416 of the observations you created had a random number generated that was less than .7.

That is apparently what the table option is selecting.

Your multiple calls to the function also likely interferred with a standard expectation. Try the following and look at the output. I use proc means to get the group counts but, at the same time, it (I think) does the calculations you want:

%let N = 5000; data States(drop=i); call streaminit(12345); array prob [2] _temporary_ (0.7 0.3); do i = 1 to &N; type = rand("Table", of prob[*]); output; end; run; data States; set States; call streaminit(12345); if type=1 then Return = rand("Normal", 0, 3); /* state 1 */ else Return = rand("Normal", 5, 10); /* steta 2 */ run; data states; set states; call streaminit(12345); rnum = rand("Uniform"); run; proc sort data=states; by rnum; run; proc means data=states; var return; class type; run;

Art, CEO, AnalystFinder.com

All Replies

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Xusheng

04-01-2017 12:01 PM

If you want exact proportional assignment you might be better off using proc surveyselect. There is probably a more direct route to get what you want, but the following will work:

%let N = 5000; data States; do i = 1 to &N; output; end; run; proc surveyselect data=states groups=10 seed=12345 out=states noprint; run; data States(drop=i); set States; if groupid lt 8 then type=1; else type=2; if type=1 then Return = rand("Normal", 0, 3);/* state 1 */ else Return = rand("Normal", 5, 10);/* state 2 */ output; run;

Art, CEO, AnalystFinder.com

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to art297

04-01-2017 01:33 PM

Hi art297, thank you for your contribution. I got exactly 3500 returns from state 1 and rest from state 2.

I still have two quesitons with your code.

1. Is there any possible reason why my code generate 3400 R1 and 1600 R2?

2. After grouping and creating the returns, how to get the mean, variance, standard deviation, densitiy, etc for return from state1 and from state 2 separately?

Solution

04-01-2017
04:03 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Xusheng

04-01-2017 03:17 PM

I am definitely NOT an expert regarding the fairly new RAND function. However, that said, based on the UNIFORM distribution and the seed you used, 3416 of the observations you created had a random number generated that was less than .7.

That is apparently what the table option is selecting.

Your multiple calls to the function also likely interferred with a standard expectation. Try the following and look at the output. I use proc means to get the group counts but, at the same time, it (I think) does the calculations you want:

%let N = 5000; data States(drop=i); call streaminit(12345); array prob [2] _temporary_ (0.7 0.3); do i = 1 to &N; type = rand("Table", of prob[*]); output; end; run; data States; set States; call streaminit(12345); if type=1 then Return = rand("Normal", 0, 3); /* state 1 */ else Return = rand("Normal", 5, 10); /* steta 2 */ run; data states; set states; call streaminit(12345); rnum = rand("Uniform"); run; proc sort data=states; by rnum; run; proc means data=states; var return; class type; run;

Art, CEO, AnalystFinder.com

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to art297

04-01-2017 04:03 PM

Thank you art297, I've tried your code and it works.

Since I would like to use the exact 70% and 30%, I will apply your first method and then use excel to get the parameters I need.

I will consider the second method when I come back to SAS and continue with following steps. Thank you.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Xusheng

04-01-2017 07:08 PM

Hi Xusheng,

I don't know if this approach is too crude, but why don't you just generate the 5000 i's, and if i is less than or equal to 3500 then type =1 and else type=2, and then you do all the subsequent simulation steps like art297 showed?

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to ilikesas

04-04-2017 12:22 PM

Thank you ilikesas, I did what you suggest -- create two loop one is from 1 to 3500 and another is from 3501 to 5000. And it works. Thank you so much.