Random numbers , Sample size - then calculate means and Standard Deviation.

Occasional Contributor
Posts: 15

 Generate 625 samples of size 961 random numbers from U(1, 9). For each of these 625 samples calculate the mean.
a) Find the simulated probability that the mean is between 3 and 4.
b) Find the mean of the means.
c) Find the standard deviation of the means.
d) Draw the histogram of the means.

 

I believe the code below is a good template for what I need to do. However, right now it shows the simulated probability that the mean is between 11 and 12. Can someone break down the code for me? I have a general idea of what it's doing, but I'm still a little lost. I understand the PROC FREQ and PROC UNIVARIATE steps, but I'm having trouble with everything above them. Also, how would I change the simulated probability to show between 2 and 4? Thank you!

 

data a;
   meanx=0;
   do j=1 to 225;
      sumx=0;
      do i = 1 to 625;
         u=rand("Uniform");
         x=10+(22-10)*u;
         sumx=sumx+x;
      end;
      meanx=sumx/625;
      output;
   end;
run;

 

proc freq;
tables meanx;
run;

 

proc univariate;
var meanx;
histogram meanx;
run;


All Replies
Super User
Posts: 23,339

You're creating your random variables incorrectly. Review how to create a random variable from a Uniform Distribution. 

 

If you want to understand your code, add comments as you code. 

Super User
Posts: 10,695

Calling @Rick_SAS

SAS Super FREQ
Posts: 4,175

See Tip 6 on pp 6-8 of Wicklin (2015) "Ten Tips for Simulating Data with SAS."  The example in the paper is the same as your example except that the paper uses random uniform variates in (0,1).  You can use x = 1 + 8*rand("uniform") to get random variates in the range (1,9).

 

This method is called Monte Carlo simulation of the sampling distribution of the sample mean. It is important that you use BY-group processing for efficiency, as explained in the article "Simulation in SAS: The slow way or the BY way."

Respected Advisor
Posts: 2,836

The probability that a number is between 3 and 4 is simply the count of numbers between 3 and 4, divided by the total number of values in the entire simulation.

 

To generate random numbers between 1 and 9 that are uniform, generate a uniform RV (which is between 0 and 1, the default) and then expand the range to 1 to 9 by multiplying by a constant and then adding an offset.
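The scaling described above can be sketched as follows. This is only an illustration: the macro variables a and b, the data set name Unif, the seed, and the count of 1000 values are assumptions chosen for the example, not part of the original question.

```sas
/* Sketch: rescale U(0,1) variates to U(a,b). Here a=1 and b=9. */
%let a = 1;
%let b = 9;

data Unif;
   call streaminit(123);            /* arbitrary seed, for reproducibility */
   do i = 1 to 1000;                /* 1000 is an arbitrary illustration size */
      u = rand("Uniform");          /* u is in (0,1) */
      x = &a + (&b - &a)*u;         /* multiply by (b-a), then add the offset a */
      output;
   end;
run;

/* Quick check: min should be near 1, max near 9, mean near 5 */
proc means data=Unif min max mean;
   var x;
run;
```

With a=1 and b=9 the assignment reduces to x = 1 + 8*u, which is the expression suggested earlier in the thread.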

 

Also, your loops seem to be incorrect: I don't see the number 961 anywhere, and it seems to me that the number 625 is used in the wrong loop.

--
Paige Miller
Super User
Posts: 10,695

The following code can generate random numbers between 1 and 9.

 

data x;
   call streaminit(12345678);
   do i = 1 to 625;
      u=ceil(rand("Uniform")*9);
      output;
   end;
run;
proc freq data=x;
   table u;
run;
Respected Advisor
Posts: 2,836

@Ksharp this generates random INTEGERS between 1 and 9, it does not generate uniform random numbers between 1 and 9
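To make the distinction concrete, here is a small sketch that builds both kinds of values from the same uniform variate. The data set name Compare and the variable names xInt and xCont are made up for this illustration.

```sas
/* Sketch: discrete vs. continuous uniform values from the same U(0,1) draw */
data Compare;
   call streaminit(12345678);
   do i = 1 to 625;
      u = rand("Uniform");
      xInt  = ceil(9*u);      /* integers 1, 2, ..., 9 (discrete uniform) */
      xCont = 1 + 8*u;        /* real values in (1,9) (continuous uniform) */
      output;
   end;
run;

/* xInt takes only nine distinct values, each with frequency near 1/9 */
proc freq data=Compare;
   tables xInt;
run;

/* xCont takes (almost surely) 625 distinct values between 1 and 9 */
proc means data=Compare min max mean;
   var xCont;
run;
```

Both variables have an expected value of 5, so the sample means alone will not reveal which distribution was used; the frequency table does.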

--
Paige Miller
Super User
Posts: 10,695

Posted in reply to PaigeMiller

Oops. My bad. I should clear my eyes before posting.

SAS Super FREQ
Posts: 4,175

@Ksharp : I guess we need the OP to clarify whether the random numbers are from the continuous or the discrete uniform distribution. The OP said "random numbers from U(1, 9)," which usually means the continuous uniform distribution. If integers, then the correct phrase is "random uniform integers in the range 1-9."

Occasional Contributor
Posts: 15

Rick,

 

Thank you for pointing me in the right direction. It was right on point. My last question is: how would I go about finding the probability that the mean is between, let's say, 3 and 4? Also, how do I output the standard deviation in PROC MEANS? I know I have to set it equal to something, but I am confused about what. I am also confused about how mean=SampleMean works in PROC MEANS, as it seemed to appear out of nowhere. Here is my code.

 

%let N = 961; /* sample size */
%let NumSamples = 625; /* number of samples */
data Sim;
   call streaminit(123);
   do SampleID = 1 to &NumSamples; /* ID variable for each sample */
      do i = 1 to &N;
         x = 1+8*rand("Uniform"); /* 1 to 9 */
         output;
      end;
   end;
   *output;
run;

 


proc means ;
by SampleID;
var x;
output out=OutStats3 mean=SampleMean;      /* I need to find standard deviation as well here. What do I put std = ? */
run;

 


ods select Moments Histogram;
proc univariate data=OutStats3;
label SampleMean = "Sample Mean of U(1,9) Data";
var SampleMean;
histogram SampleMean / normal ; /* overlay normal fit */
run;

Solution
‎10-16-2017 08:37 PM
SAS Super FREQ
Posts: 4,175

First, remember that you should use the NOPRINT option on the PROC MEANS statement, as explained in the article "Turn off ODS when running simulations in SAS." That is Tip 7 in my "Ten Tips" paper, so you might want to re-read that paper.

 

> How do I find the probability that the mean lies between 3 and 4?

You would use a DATA step to create an indicator variable for the event "mean is between 3 and 4," then use PROC FREQ to count. An example is in Tip 10 of my paper. However, you are only using 625 Monte Carlo samples and a large sample size, so all of the sample means are greater than 4 for this simulation. Therefore all you can conclude is that the probability is less than 1/625.
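The claim above can be checked with a short calculation. This is a sketch that assumes the U(1, 9) variates are independent, so that the standard deviation of the sample mean is the population standard deviation divided by the square root of the sample size (961):

```latex
\begin{align*}
E[X] &= \frac{1+9}{2} = 5, \qquad
\operatorname{Var}(X) = \frac{(9-1)^2}{12} = \frac{16}{3}, \\
\operatorname{SD}(\bar{X}) &= \sqrt{\frac{16/3}{961}} = \frac{2.3094}{31} \approx 0.0745, \\
z &= \frac{4 - 5}{0.0745} \approx -13.4 .
\end{align*}
```

So the sample means should cluster tightly around 5 (roughly between 4.78 and 5.22 for three standard deviations of the mean), and a sample mean as small as 4 would sit about 13 standard deviations below the center. That is why none of the 625 samples lands in [3, 4]. The same numbers give a rough expectation for parts (b) and (c): a mean of the means near 5 and a standard deviation of the means near 0.0745.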

 

> how do I output the standard deviation in proc means? 

On the OUTPUT statement use STD=SampleStd;

 

> I am confused about how mean=SampleMean works in PROC MEANS

The keyword MEAN= specifies the statistic that you want to output. The value to the right of the equal sign (Samplemean) specifies the name of the variable in the output data set that will contain that statistic. 

 

proc means data=Sim noprint;
   by SampleID;
   var x;
   output out=OutStats3 mean=SampleMean std=SampleStd;
run;

/* P( Sample mean in [3,4] ) = 0  (less than 1/&NumSamples) */
data PValue34;
   set OutStats3;
   mean34 = (3 <= SampleMean <= 4);   /* indicator: 1 if the sample mean is in [3,4], else 0 */
run;
proc freq data=PValue34;
   tables mean34;
run;

ods select Moments Histogram;
proc univariate data=OutStats3;
   label SampleMean = "Sample Mean of U(1,9) Data";
   var SampleMean SampleStd;
   histogram SampleMean SampleStd / normal;   /* overlay normal fit */
run;

If you intend to do many more simulations, you might want to invest in the book Simulating Data with SAS.

Occasional Contributor
Posts: 15

Thanks Rick! Everything is clear and straightforward!

Just a quick question: if I wanted to use the exponential distribution instead of the uniform, would this be correct?

x = 1+8*rand("exponential")/lambda;



Thanks again
SAS Super FREQ
Posts: 4,175

Sometimes the exponential distribution is parameterized by using a scale parameter, and sometimes by using a rate parameter.

If E ~ Exp(1), then 

- The random variable sigma*E is exponential with scale parameter sigma.

- The random variable E/lambda is exponential with rate parameter lambda.

The lower bound of an exponential r.v. is 0, so adding 1 would translate the threshold to 1.
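The two parameterizations can be compared in a short sketch. The values sigma = 2 and lambda = 0.5, the data set name Expo, and the variable names are assumptions picked for illustration; they are chosen so that sigma = 1/lambda.

```sas
/* Sketch: scale vs. rate parameterization of the exponential distribution */
%let sigma  = 2;                    /* hypothetical scale parameter */
%let lambda = 0.5;                  /* hypothetical rate parameter; 1/lambda = sigma */

data Expo;
   call streaminit(123);
   do i = 1 to 10000;
      e = rand("Exponential");      /* E ~ Exp(1) */
      xScale = &sigma * e;          /* exponential with scale sigma */
      xRate  = e / &lambda;         /* exponential with rate lambda */
      xShift = 1 + e / &lambda;     /* same, with the threshold moved from 0 to 1 */
      output;
   end;
run;

/* xScale and xRate are identical here because sigma = 1/lambda;       */
/* their mean and standard deviation should both be near 2.            */
/* xShift has the same standard deviation but a mean near 3 (min > 1). */
proc means data=Expo min mean std;
   var xScale xRate xShift;
run;
```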

 

I think you do not need the 8. If you want a truncated exponential distribution, you would use an IF-THEN statement to accept/reject the random values in [1,8], such as

x = 1 + rand("expo")/lambda;

if x <= 8;
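Placed in a full DATA step, the accept/reject idea might look like the sketch below. The rate lambda = 0.5, the data set name TruncExpo, and the count of 1000 attempts are hypothetical. The sketch uses IF-THEN with OUTPUT rather than a subsetting IF, because a subsetting IF inside the DO loop would abandon the rest of the loop at the first rejected value.

```sas
/* Sketch: shifted exponential, keeping only values that do not exceed 8 */
%let lambda = 0.5;                  /* hypothetical rate parameter */

data TruncExpo;
   call streaminit(123);
   do i = 1 to 1000;                /* 1000 attempts; some will be rejected */
      x = 1 + rand("Exponential") / &lambda;
      if x <= 8 then output;        /* accept x in [1,8]; otherwise discard it */
   end;
run;

/* The number of accepted values (N) will usually be a little below 1000 */
proc means data=TruncExpo n min max mean;
   var x;
run;
```

Because rejected draws are discarded, the output data set holds fewer than 1000 observations. If an exact sample size is required, one option is to keep drawing in a DO WHILE loop until enough values have been accepted.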

 

 
