Random numbers, Sample size - then calculate mean...

10-09-2017 07:49 PM

Generate 625 samples of size 961 random numbers from U(1, 9). For each of these 625 samples calculate the mean.

a) Find the simulated probability that the mean is between 3 and 4.

b) Find the mean of the means.

c) Find the standard deviation of the means.

d) Draw the histogram of the means.

I believe the code below is a good template for what I need to do above. However, right now it shows the simulated probability that the mean is between 11 and 12. Can someone break down the code below for me? I have a general idea of what it's doing but am still a little lost. I understand the PROC FREQ and PROC UNIVARIATE steps, but I am having trouble with everything above them. Some explanation would help. Also, how would I change the simulated probability to show between 2 and 4? Thank you!

```
data a;
   meanx=0;
   do j=1 to 225;
      sumx=0;
      do i = 1 to 625;
         u=rand ("Uniform");
         x=10+(22-10)*u;
         sumx=sumx+x;
      end;
      meanx=sumx/625;
      output;
   end;
run;

proc freq;
   tables meanx;
run;

proc univariate;
   var meanx;
   histogram meanx;
run;
```

All Replies

Posted in reply to jsjoden

10-09-2017 07:56 PM - edited 10-09-2017 07:57 PM

You're creating your random variables incorrectly. Review how to create a random variable from a Uniform Distribution.

If you want to understand your code, add comments as you code.

Posted in reply to jsjoden

10-10-2017 07:56 AM

Calling @Rick_SAS

Posted in reply to jsjoden

10-10-2017 08:03 AM

See Tip 6 on pp 6-8 of Wicklin (2015) "Ten Tips for Simulating Data with SAS." The example in the paper is the same as your example except that the paper uses random uniform variates in (0,1). You can use x = 1 + 8*rand("uniform") to get random variates in the range (1,9).

This method is called Monte Carlo simulation of the sampling distribution of the sample mean. It is important that you use BY-group processing for efficiency, as explained in the article "Simulation in SAS: The slow way or the BY way."

Posted in reply to jsjoden

10-10-2017 08:05 AM

The simulated probability that the mean is between 3 and 4 is simply the count of sample means between 3 and 4, divided by the total number of samples in the simulation.

To generate random numbers between 1 and 9 that are uniform, generate a uniform RV (which is between 0 and 1, the default) and then expand the range to 1 to 9 by multiplying by a constant and then adding an offset.
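
As a worked version of that scaling, assuming $U$ denotes a standard uniform variate on (0,1) and $a$, $b$ are the endpoints of the target interval (this notation is introduced here only for illustration):

```latex
\begin{aligned}
X &= a + (b - a)\,U \sim U(a, b), \qquad U \sim U(0, 1) \\
a = 1,\ b = 9: \quad X &= 1 + (9 - 1)\,U = 1 + 8\,U
\end{aligned}
```

By the same pattern, the line `x=10+(22-10)*u` in the original code produces values in (10, 22) rather than (1, 9).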

Also, your loops seem to be incorrect. I don't see the number 961 anywhere, and it seems to me that the number 625 is used in the wrong loop.

--

Paige Miller

Posted in reply to jsjoden

10-10-2017 08:12 AM

The following code can generate random numbers between 1 and 9.

```
data x;
call streaminit(12345678);
do i = 1 to 625;
u=ceil(rand ("Uniform")*9);
output;
end;
run;
proc freq data=x;
table u;
run;
```

Posted in reply to Ksharp

10-10-2017 08:20 AM - edited 10-10-2017 08:20 AM

@Ksharp this generates random INTEGERS between 1 and 9; it does not generate continuous uniform random numbers between 1 and 9.

--

Paige Miller

Posted in reply to PaigeMiller

10-10-2017 08:26 AM

Oops. My bad. I should clear my eyes before posting.

Posted in reply to Ksharp

10-10-2017 08:23 AM

@Ksharp : I guess we need the OP to clarify whether the random numbers are from the continuous or the discrete uniform distribution. The OP said "random numbers from U(1, 9)," which usually means the continuous uniform distribution. If integers, then the correct phrase is "random uniform integers in the range 1-9."
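
To make the distinction concrete, here is a minimal sketch that draws both kinds of values side by side. The data set and variable names (`Compare`, `xCont`, `xInt`), the seed, and the count of 1000 draws are hypothetical choices for illustration; the two expressions are the ones already suggested in this thread.

```sas
/* Sketch: continuous U(1,9) values versus uniform integers 1-9.
   Names, seed, and number of draws are illustrative assumptions. */
data Compare;
   call streaminit(12345);
   do i = 1 to 1000;
      xCont = 1 + 8*rand("Uniform");     /* continuous value in (1,9) */
      xInt  = ceil(9*rand("Uniform"));   /* integer in {1,...,9}      */
      output;
   end;
run;

/* both variables should average close to 5, but only xInt is restricted to whole numbers */
proc means data=Compare n min max mean;
   var xCont xInt;
run;
```

Running PROC FREQ on `xInt` should show only nine distinct values, whereas `xCont` takes essentially a different value on every draw.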

Posted in reply to Rick_SAS

10-11-2017 05:05 PM

Rick,

Thank you for pointing me in the right direction. It was right on point. My last question is how I would go about finding the probability that the mean is between, let's say, 3 and 4. Also, how do I output the standard deviation in proc means? I know I have to set it equal to something, but I am confused as to what. Also I am confused as of how means= Samplemean in proc means. It seemed to appear out of nowhere. Here is my code.

```
%let N = 961;            /* sample size */
%let NumSamples = 625;   /* number of samples */
data Sim;
   call streaminit(123);
   do SampleID = 1 to &NumSamples;   /* ID variable for each sample */
      do i = 1 to &N;
         x = 1+8*rand("Uniform");    /* 1 to 9 */
         output;
      end;
   end;
   *output;
run;

proc means ;
   by SampleID;
   var x;
   output out=OutStats3 mean=SampleMean; /* I need to find standard deviation as well here. What do I put std = ? */
run;

ods select Moments Histogram;
proc univariate data=OutStats3;
   label SampleMean = "Sample Mean of U(1,9) Data";
   var SampleMean;
   histogram SampleMean / normal ; /* overlay normal fit */
run;
```

Solution

10-16-2017
08:37 PM

Posted in reply to jsjoden

10-12-2017 09:21 AM

First, remember that you should use the NOPRINT option on the PROC MEANS statement, as explained in the article "Turn off ODS when running simulations in SAS." That is Tip 7 in my "Ten Tips" paper, so you might want to re-read that paper.

*> How do I find the probability that the mean lies between 3 and 4?*

You would use a DATA step to create an indicator variable for the event "mean is between 3 and 4," then use PROC FREQ to count. An example is in Tip 10 of my paper. However, you are only using 625 Monte Carlo samples and a large sample size, so all of the sample means are greater than 4 for this simulation. Therefore all you can conclude is that the probability is less than 1/625.
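
A rough check of why no sample mean falls in [3,4], using the standard formulas for the mean and variance of a continuous uniform distribution and treating the sample mean as approximately normal (the symbols $\bar{X}$, $\mu$, $\sigma$, and $n$ are notation introduced here, with $n = 961$):

```latex
\begin{aligned}
\mu &= \frac{1 + 9}{2} = 5, \qquad
\sigma^2 = \frac{(9 - 1)^2}{12} = \frac{16}{3} \approx 5.333 \\
\operatorname{SD}(\bar{X}) &= \frac{\sigma}{\sqrt{n}} = \frac{\sqrt{16/3}}{\sqrt{961}} \approx \frac{2.309}{31} \approx 0.0745 \\
z &= \frac{4 - \mu}{\operatorname{SD}(\bar{X})} \approx \frac{4 - 5}{0.0745} \approx -13.4
\end{aligned}
```

A value of 4 is more than 13 standard errors below the expected mean of 5, so the event is far too rare to show up in 625 samples. The same numbers suggest what to expect from the simulation for the mean of the means (about 5) and the standard deviation of the means (about 0.0745).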

*> how do I output the standard deviation in proc means?*

On the OUTPUT statement use STD=SampleStd;

*> I am confused as of how means= Samplemean in proc means*

The keyword MEAN= specifies the statistic that you want to output. The value to the right of the equal sign (Samplemean) specifies the name of the variable in the output data set that will contain that statistic.

```
/* mean and standard deviation of each sample; NOPRINT suppresses the printed tables */
proc means data=Sim noprint;
   by SampleID;
   var x;
   output out=OutStats3 mean=SampleMean std=SampleStd;
run;

/* P( Sample mean in [3,4] ) = 0 (less than 1/&NumSamples) */
data PValue34;
   set OutStats3;
   mean34 = (3 <= SampleMean <= 4);   /* indicator: 1 if the sample mean is in [3,4], else 0 */
run;

proc freq data=PValue34;
   tables mean34;
run;

ods select Moments Histogram;
proc univariate data=OutStats3;
   label SampleMean = "Sample Mean of U(1,9) Data";
   var SampleMean SampleStd;
   histogram SampleMean SampleStd / normal ; /* overlay normal fit */
run;
```
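
As a compact alternative view of parts (a) through (c) of the original question, the following sketch summarizes the indicator and the sample means in one step. It assumes the `PValue34` data set created above (which carries `SampleMean` along from `OutStats3`); the choice of statistics is only one reasonable option.

```sas
/* Sketch, assuming PValue34 exists as created above.
   MEAN of mean34     = simulated probability that the mean is in [3,4]  (part a)
   MEAN of SampleMean = mean of the means                                (part b)
   STD  of SampleMean = standard deviation of the means                  (part c) */
proc means data=PValue34 n mean std;
   var mean34 SampleMean;
run;
```

The mean of a 0/1 indicator is the proportion of samples for which the event occurred, which is why it serves as the simulated probability.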

If you intend to do many more simulations, you might want to invest in the book Simulating Data with SAS.

Posted in reply to Rick_SAS

10-16-2017 08:39 PM

Thanks Rick! Everything is clarified and straightforward!

Just a quick question: if I wanted to use the exponential distribution instead of the uniform, would this be correct?

x = 1+8*rand("exponential")/lambda;

Thanks again

Posted in reply to jsjoden

10-17-2017 05:35 AM

Sometimes the exponential family is parameterized by using a scale parameter, and sometimes by using a rate parameter.

If E ~ Exp(1), then

- The random variable sigma*E is exponential with scale parameter sigma.

- The random variable E/lambda is exponential with rate parameter lambda.

The lower bound of an exponential r.v. is 0, so adding 1 would translate the threshold to 1.

I do not think you need the 8. If you want a truncated exponential distribution, you would use an IF statement to accept or reject the random values so that only those in [1,8] are kept, such as

x = 1 + rand("expo")/lambda;

if x <= 8;
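
For context, here is a minimal sketch of how those two lines might sit inside a simulation loop. The rate value `0.5`, the data set name `ExpoTrunc`, the seed, and the 1000 attempted draws are all hypothetical choices for illustration. The accept step is written as IF-THEN with OUTPUT because, as far as I understand the DATA step, a subsetting IF inside a loop with no input data set would end the step at the first rejected value.

```sas
%let lambda = 0.5;   /* hypothetical rate parameter, chosen only for illustration */

data ExpoTrunc;
   call streaminit(123);
   do i = 1 to 1000;                           /* number of attempted draws */
      x = 1 + rand("Exponential")/&lambda;     /* threshold 1, rate &lambda */
      if x <= 8 then output;                   /* keep only values in [1,8] */
   end;
run;
```

Because rejected values are discarded, the resulting data set will usually contain fewer than 1000 observations; if an exact sample size is needed, the loop would have to keep drawing until enough values are accepted.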