SAS Programming

How do I generate a random number between 0 and 1 with a normal distribution?

Ani7 · Posted 01-16-2019 09:44 AM

I am trying to randomize the numbers in my dataset for a few different variables. The macro that I am using to do this is currently:

%macro RandBetween(min, max);
   (&min + ((1+&max-&min)*abs(rand("normal"))))
%mend;

I think my understanding of a normal distribution isn't correct since %RandBetween(0,1) seems to be returning numbers greater than 1 as well. This obviously will return an incorrect 'percentage' variable since I want that to have a maximum value of 1 (100%). Consequently, other variables that are generated like this:

avg_opioid = %RandBetween(0,&max_avg_opioid.);

also seem to have values greater than their max (presumably because the rand("normal") function is returning numbers greater than 1). I have also tried rand("normal",0.5,0.5) but that doesn't seem to help either. At this point, I think my understanding of the normal distribution may be skewed.

How do I go about limiting the return of rand("normal") to a min and max of 0 and 1 respectively?

Kurt_Bremser · Posted 01-16-2019 09:49 AM

After the macro is resolved, you get

(0 + ((1+1-0)*abs(rand("normal"))))

so the random term is scaled by 2 instead of 1, and the results spread over twice the intended range.

Change your macro to this:

%macro RandBetween(min, max);
   (&min + ((&max-&min)*abs(rand("normal"))))
%mend;
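
A quick check, offered as a sketch rather than as part of the original reply: abs(rand("normal")) is a half-normal variate with no upper bound, so even the corrected macro can return values above &max. The seed, the number of draws, and the variable names below are arbitrary choices for illustration.

```sas
data _null_;
  call streaminit(2019);          /* arbitrary seed, for reproducibility */
  n = 100000;                     /* arbitrary number of draws */
  do i = 1 to n;
    x = abs(rand('normal'));      /* half-normal: nonnegative, unbounded above */
    over + (x > 1);               /* sum statement counts draws above 1 */
  end;
  share = over / n;
  put 'Share of draws above 1: ' share percent8.1;
run;
```

In theory P(|Z| > 1) is about 0.317 for a standard normal Z, so roughly a third of the draws should land above 1.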

Ani7 · Posted 01-16-2019 09:53 AM

I tried this and it still returns numbers greater than 1. Looking at the documentation of the rand function here (http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/a001466748.htm#a002505417), it looks like the rand('NORMAL') function can return values greater than 1.

PeterClemmensen · Posted 01-16-2019 10:10 AM

Are you sure you want to create a 'percentage variable' using the normal distribution? An N(0,1) distribution is not restricted to values between 0 and 1; it is a normal distribution with mean 0 and variance 1.

If you do not actually need the normal distribution, then simply do this to get a value between 0 and 1:

data _null_;
x=rand('uniform');
put x;
run;
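
The same idea can be folded back into the original macro. This is a minimal sketch, assuming a flat (uniform) spread between the bounds is acceptable; the macro name RandBetweenU, the seed, and the data set name are hypothetical and do not come from the thread.

```sas
/* Hypothetical uniform variant of the RandBetween macro */
%macro RandBetweenU(min, max);
   (&min + ((&max - &min) * rand("uniform")))
%mend;

data test;
  call streaminit(2019);          /* arbitrary seed, for reproducibility */
  do i = 1 to 1000;
    pct = %RandBetweenU(0, 1);    /* stays between 0 and 1 */
    output;
  end;
run;
```

Because rand("uniform") stays between 0 and 1, the result stays between &min and &max, but the values are spread evenly rather than clustered around a mean.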

Ani7 · Posted 01-16-2019 10:15 AM

Unfortunately, for the purposes of what I am trying to do with the data, I need it to be a normal distribution.

Kurt_Bremser · Posted 01-16-2019 10:21 AM

The "problem" with the normal distribution is that it does not have defined lower/upper bounds. Theoretically, any value is possible, only with diminishing probability the farther away from the mean it is.

The more data points you have, the higher the probability of exceeding any wanted minimum/maximum.

So, when running this:

data test;
do i = 1 to 1000;
  x1 = rand('normal',.5,.1);
  output;
end;
run;

I stayed well between 0 and 1, but with 10 million iterations I exceeded the bounds.

Are you sure you don't want uniform distribution?
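
To put rough numbers on this, the tail probability can be computed directly. The sketch below assumes the same N(0.5, 0.1) parameters as the example above; the variable names and the sample sizes in the loop are illustrative.

```sas
data _null_;
  /* P(X < 0) for X ~ N(0.5, 0.1); by symmetry P(X > 1) is the same */
  p_tail = cdf('normal', 0, 0.5, 0.1);
  p_out  = 2 * p_tail;            /* probability of landing outside [0, 1] */
  do n = 1000, 100000, 10000000;
    expected = n * p_out;         /* expected number of out-of-range draws */
    put n= comma12. expected= best8.;
  end;
run;
```

Both bounds sit 5 standard deviations from the mean, so the probability of a draw outside [0, 1] is roughly 5.7e-7. That is far below one expected violation in 1,000 draws, but about 5 or 6 in 10 million.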

Ani7 · Posted 01-16-2019 10:28 AM

Awesome, this was exactly what I needed. I understand that the asymptotic nature of the bell curve doesn't allow for bounds, but that is okay since 1 or 2 observations over 100% out of 100,000 is reasonable. Thanks!
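
If even a few out-of-range values are a problem, one option not proposed in the thread is rejection sampling: redraw until the value falls inside the bounds. This is a sketch under that assumption, reusing the N(0.5, 0.1) parameters from the example above; the seed and sample size are arbitrary.

```sas
data test;
  call streaminit(2019);          /* arbitrary seed, for reproducibility */
  do i = 1 to 100000;
    /* redraw until the value lands inside [0, 1] */
    do until (0 <= x1 <= 1);
      x1 = rand('normal', .5, .1);
    end;
    output;
  end;
run;
```

Strictly speaking the result is a truncated normal rather than a normal distribution. With bounds this far from the mean the difference is negligible, but it grows as the standard deviation gets larger relative to the range.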
