
## Simulating Data from Standard Normal Distribution

First, I create a 100-observation data set of the variable x with a DO...END loop. Then I wrap it in another DO loop of 600 iterations to generate 600 samples of 100 observations each. Finally, I use a PROC MEANS step to calculate the sample mean of x for each sample. My goal is to get 600 sample means (each from a sample of 100 observations) and then find the mean of those sample means.

Here is my code:

```
data sample(keep=X);
   call streaminit(123);
   do j=1 to 600;
      do i=1 to 100;
         X = rand("Normal");
         output;
      end;
   end;
run;

proc means data=sample;
run;
```

I get a total of 60,000 observations but want 600 sample means. The PROC MEANS step gives the mean of all 60,000 observations.

1 ACCEPTED SOLUTION


## Re: Simulating Data from Standard Normal Distribution

@ddpatel wrote:
I get a total of 60,000 observations but want 600 sample means. The PROC MEANS step gives the mean of all 60,000 observations.

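Your DATA step keeps only X, so the sample identifier j is dropped and PROC MEANS has nothing to group by. Keep j in the data set and use it as a CLASS variable to get one mean for each of the 600 values of j: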

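```
data sample;                /* no KEEP= option, so j stays in the data set */
   call streaminit(123);
   do j=1 to 600;           /* sample ID */
      do i=1 to 100;        /* observation within the sample */
         X = rand("Normal");
         output;
      end;
   end;
run;

proc summary data=sample nway;
   class j;                 /* one group per sample */
   var x;
   output out=want mean=;   /* one row per j; the mean is stored in x */
run;
```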
--
Paige Miller
3 REPLIES

## Re: Simulating Data from Standard Normal Distribution

Would anyone be willing to explain to me why the "mean=" at the end is necessary? I had to do basically this on an assignment. Without the mean=, it gave me 3000 results instead of 600. I tried searching the SAS documentation for an explanation and couldn't find anything useful...


## Re: Simulating Data from Standard Normal Distribution

The OUTPUT statement requests the statistics that you want to save in the output data set. In this case, you are requesting the mean statistic for each j, where j=1..600.
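
The 3,000 rows can be reproduced with a small sketch. It assumes the SAMPLE data set from the accepted solution, and the data set name all_stats is made up for illustration. When the OUTPUT statement has no statistic keyword, the procedure writes its default statistics (N, MIN, MAX, MEAN, and STD) as separate observations, identified by the _STAT_ variable, so 600 groups yield 600 × 5 = 3,000 rows.

```sas
/* Sketch only: ALL_STATS is a hypothetical data set name.        */
/* Assumes SAMPLE (with j and X) was created as shown earlier.    */
proc summary data=sample nway;
   class j;
   var x;
   output out=all_stats;    /* no statistic keyword: default statistics */
run;

/* Count the rows for each statistic stored in _STAT_ */
proc freq data=all_stats;
   tables _stat_;
run;
```

Adding mean= to the OUTPUT statement restricts the output to the mean alone, which gives one row per value of j.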

You might be more familiar with PROC MEANS than PROC SUMMARY. If so, here is an equivalent way to produce the 600 sample means:

```
proc means data=sample noprint;
   by j;                    /* SAMPLE is already sorted by j */
   var x;
   output out=want mean=;   /* one row per j; the mean is stored in x */
run;
```
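
To finish the original goal (the mean of the 600 sample means), one possible follow-up is to summarize the output data set. This is a sketch that assumes WANT was created by one of the steps above, so that x holds one sample mean per value of j.

```sas
/* Sketch: summarize the 600 sample means stored in WANT.         */
/* Assumes x in WANT is the per-sample mean written by mean=.     */
proc means data=want n mean std;
   var x;                   /* skip j and the automatic _TYPE_ and _FREQ_ */
run;
```

Because each sample has 100 standard normal observations, the mean of the sample means should be close to 0 and their standard deviation close to 1/sqrt(100) = 0.1.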