Help using Base SAS procedures

How to I generate random numbers using an increasing linear probability distribution?

Accepted Solution Solved
Reply
Contributor
Posts: 50
Accepted Solution

How to I generate random numbers using an increasing linear probability distribution?

I'm trying to generate two sets of 5,000 random numbers.  I have successfully generated the first set, which is a uniform distribution of integers from 0 to 120. For the second set, I would like to sample from a function with a linear (monotonic) increase in probability over that interval.  So, the probability of randomly generating the number 0 is near 0 and the probability of randomly generating the number 120 is greater than all other numbers.


I apologize if this seems basic but I'm having a hard time phrasing my issue and so have been unable to find any technical support on this issue.  If anyone has any references for me to review, I'd greatly appreciate it. My current thinking is that I somehow need to transform the rand("uniform") distribution to increase linearly, but I can't seem to figure out. I'd prefer if I could adjust the slope of the line.


I've pasted some code below that shows how I've been trying to code this.


data Random_numbers;

call streaminit(123); /* set random number seed */

do i = 1 to 5000;

   uniform = int(rand("uniform") *120);

   increasing = int((rand("uniform") * SOME TRANSFORMATION HERE? )*120);

   output;

end;


PROC FREQ DATA = Random_numbers;

  TABLES uniform increasing ;

RUN;



Thanks for your help!


Accepted Solutions
Solution
‎04-08-2015 12:29 PM
Contributor
Posts: 52

Re: How to I generate random numbers using an increasing linear probability distribution?

A proposed solution.

In the following,
  yRnd1 = random variable with a uniform probability in the, say, 1-100 range.
  yRnd2 = random variable with a probability distribution linearly increasing from 1 to, say, 100.
  yRnd2 = random variable with a probability distribution quadratically increasing from 1 to, say, 100.

/******************************/
/**** sample distributions ****/
/******************************/
data t_a;
  do _N_ = 1 to 100000;
     xRnd  = ranuni(3);
     yRd1 = ceil(100*xRnd);
     yRd2 = ceil(100*(xRnd**(1/2)));
     yRd3 = ceil(100*(xRnd**(1/3)));
     output;
  end;
run;

/********************************************************/
/*** wanting to set 3 graphs with the same maximum wt ***/
/********************************************************/
proc sql;
  create table t_1 as
  select a.yRd1, round((a.cnts/b.zMax),0.001) as wt
  from   (select yRd1, sum(1) as cnts from t_a group by yRd1) a,
         (select max(cnts) as zMax
          from
           (select yRd1, sum(1) as cnts from t_a group by yRd1)) b
  order by a.yRd1;

  create table t_2 as
  select a.yRd2, round((a.cnts/b.zMax),0.001) as wt
  from   (select yRd2, sum(1) as cnts from t_a group by yRd2) a,
         (select max(cnts) as zMax
          from
           (select yRd2, sum(1) as cnts from t_a group by yRd2)) b
  order by a.yRd2;

  create table t_3 as
  select a.yRd3, round((a.cnts/b.zMax),0.001) as wt
  from   (select yRd3, sum(1) as cnts from t_a group by yRd3) a,
         (select max(cnts) as zMax
          from
          (select yRd3, sum(1) as cnts from t_a group by yRd3)) b
  order by a.yRd3;

quit;

/******************************/
/*** plotting for, say, t_2 ***/
/******************************/
proc sgplot data=t_2;
   scatter x=yRd2 y=wt;
run;

View solution in original post


All Replies
Super User
Posts: 11,343

Re: How to I generate random numbers using an increasing linear probability distribution?

You may be looking for the Rand('TABLE',p1,p2,...) function where p1=probability of selecting 1, p2= probability of selecting 2 and so forth. Admittedly that's a somewhat long statement to code but without more description of what kind of shape you might want...

Contributor
Posts: 50

Re: How to I generate random numbers using an increasing linear probability distribution?

I'd prefer something that was more easily manipulated and changed (i.e. in an equation rather than a tabled form) if anyone has any other ideas.

However, this will work for now and I appreciate the quick help! 

SAS Employee
Posts: 340

Re: How to I generate random numbers using an increasing linear probability distribution?

The transformation you need here is the inverse CDF.

inverse-cdf | Why?

Contributor
Posts: 50

Re: How to I generate random numbers using an increasing linear probability distribution?

Posted in reply to gergely_batho

Thanks for the response.  I looked at the link you recommended.  I also reviewed Paper 236-25 (Pseudo-Random Numbers: Out of Uniform http://www2.sas.com/proceedings/sugi25/25/po/25p236.pdf), which discusses this method.

I'm just having trouble figuring out how to code it.  Do I use the CDF function?

So if I generate a uniform distribution of numbers from 0 to 1 my probability density function is (pdf) is 1.  Does this mean my cumulative density function (CDF) is X? 

SAS Employee
Posts: 340

Re: How to I generate random numbers using an increasing linear probability distribution?

No: your PDF is a function with the following properties: f(0)=0, f(120)=A, it is linear between 0 and 120 and 0 otherwise (outside of [0,120] interval).

You will get the CDF by intergarting this function.  A is a fixed parameter, you need to calculate is so, that the area under the curve of PDF (=the area of the triangle) is 1.

Solution
‎04-08-2015 12:29 PM
Contributor
Posts: 52

Re: How to I generate random numbers using an increasing linear probability distribution?

A proposed solution.

In the following,
  yRnd1 = random variable with a uniform probability in the, say, 1-100 range.
  yRnd2 = random variable with a probability distribution linearly increasing from 1 to, say, 100.
  yRnd2 = random variable with a probability distribution quadratically increasing from 1 to, say, 100.

/******************************/
/**** sample distributions ****/
/******************************/
data t_a;
  do _N_ = 1 to 100000;
     xRnd  = ranuni(3);
     yRd1 = ceil(100*xRnd);
     yRd2 = ceil(100*(xRnd**(1/2)));
     yRd3 = ceil(100*(xRnd**(1/3)));
     output;
  end;
run;

/********************************************************/
/*** wanting to set 3 graphs with the same maximum wt ***/
/********************************************************/
proc sql;
  create table t_1 as
  select a.yRd1, round((a.cnts/b.zMax),0.001) as wt
  from   (select yRd1, sum(1) as cnts from t_a group by yRd1) a,
         (select max(cnts) as zMax
          from
           (select yRd1, sum(1) as cnts from t_a group by yRd1)) b
  order by a.yRd1;

  create table t_2 as
  select a.yRd2, round((a.cnts/b.zMax),0.001) as wt
  from   (select yRd2, sum(1) as cnts from t_a group by yRd2) a,
         (select max(cnts) as zMax
          from
           (select yRd2, sum(1) as cnts from t_a group by yRd2)) b
  order by a.yRd2;

  create table t_3 as
  select a.yRd3, round((a.cnts/b.zMax),0.001) as wt
  from   (select yRd3, sum(1) as cnts from t_a group by yRd3) a,
         (select max(cnts) as zMax
          from
          (select yRd3, sum(1) as cnts from t_a group by yRd3)) b
  order by a.yRd3;

quit;

/******************************/
/*** plotting for, say, t_2 ***/
/******************************/
proc sgplot data=t_2;
   scatter x=yRd2 y=wt;
run;

SAS Super FREQ
Posts: 3,755

Re: How to I generate random numbers using an increasing linear probability distribution?

You want to use the inverse CDF method for generating random numbers.  Sometimes you can use the QUANTILE function to help solve the inverse problem (see the example of the folded normal distribution), but other times you need to solve for the root of the cumulative distribution: F(x) = u, where u ~U(0,1).  In SAS/IML, you can use the FROOT function to solve for numerical roots.

If your CDF is given explicitly by a formula or by empirical quantiles, you can use linear interpolation.  See this blog post.

http://blogs.sas.com/content/iml/2014/06/18/distribution-from-quantiles.html

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 829 views
  • 6 likes
  • 5 in conversation