- How to I generate random numbers using an increasi...

04-07-2015 01:34 PM

I'm trying to generate two sets of 5,000 random numbers. I have successfully generated the first set, which is a uniform distribution of integers from 0 to 120. For the second set, I would like to sample from a function with a linear (monotonic) increase in probability over that interval. So, the probability of randomly generating the number 0 is near 0 and the probability of randomly generating the number 120 is greater than all other numbers.

I apologize if this seems basic, but I'm having a hard time phrasing my issue and so have been unable to find any technical support for it. If anyone has any references for me to review, I'd greatly appreciate it. My current thinking is that I somehow need to transform the rand("uniform") distribution so that it increases linearly, but I can't seem to figure out how. I'd prefer to be able to adjust the slope of the line.

I've pasted some code below that shows how I've been trying to code this.

data Random_numbers;
   call streaminit(123); /* set random number seed */
   do i = 1 to 5000;
      uniform = int(rand("uniform") *120);
      increasing = int((rand("uniform") * SOME TRANSFORMATION HERE? )*120);
      output;
   end;
run;

PROC FREQ DATA = Random_numbers;
   TABLES uniform increasing ;
RUN;

Thanks for your help!

Accepted Solutions


Posted in reply to mconover

04-08-2015 12:29 PM

A proposed solution.

In the following,

yRd1 = random variable with a uniform probability over, say, the 1-100 range.

yRd2 = random variable with a probability distribution linearly increasing from 1 to, say, 100.

yRd3 = random variable with a probability distribution quadratically increasing from 1 to, say, 100.

/******************************/
/**** sample distributions ****/
/******************************/
data t_a;
   do _N_ = 1 to 100000;
      xRnd = ranuni(3);
      yRd1 = ceil(100*xRnd);
      yRd2 = ceil(100*(xRnd**(1/2)));
      yRd3 = ceil(100*(xRnd**(1/3)));
      output;
   end;
run;

/********************************************************/
/*** wanting to set 3 graphs with the same maximum wt ***/
/********************************************************/
proc sql;
   create table t_1 as
   select a.yRd1, round((a.cnts/b.zMax),0.001) as wt
   from (select yRd1, sum(1) as cnts from t_a group by yRd1) a,
        (select max(cnts) as zMax
         from
         (select yRd1, sum(1) as cnts from t_a group by yRd1)) b
   order by a.yRd1;

   create table t_2 as
   select a.yRd2, round((a.cnts/b.zMax),0.001) as wt
   from (select yRd2, sum(1) as cnts from t_a group by yRd2) a,
        (select max(cnts) as zMax
         from
         (select yRd2, sum(1) as cnts from t_a group by yRd2)) b
   order by a.yRd2;

   create table t_3 as
   select a.yRd3, round((a.cnts/b.zMax),0.001) as wt
   from (select yRd3, sum(1) as cnts from t_a group by yRd3) a,
        (select max(cnts) as zMax
         from
         (select yRd3, sum(1) as cnts from t_a group by yRd3)) b
   order by a.yRd3;
quit;

/******************************/
/*** plotting for, say, t_2 ***/
/******************************/
proc sgplot data=t_2;
   scatter x=yRd2 y=wt;
run;

All Replies


Posted in reply to mconover

04-07-2015 01:56 PM

You may be looking for the Rand('TABLE',p1,p2,...) function where p1=probability of selecting 1, p2= probability of selecting 2 and so forth. Admittedly that's a somewhat long statement to code but without more description of what kind of shape you might want...
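A minimal sketch of the RAND('TABLE') approach for the 0 to 120 case in the question, assuming the probability of each integer should be proportional to the integer itself. The data set name `Table_sample`, the array `p`, and the helper variables `k` and `total` are illustrative choices, not something from the posts above. Building the probabilities in a loop avoids typing 121 arguments by hand.

```sas
data Table_sample;
   call streaminit(123);                 /* set random number seed */
   array p[121];                         /* p[k] = probability of the value k-1 */
   total = 120*121/2;                    /* 0 + 1 + ... + 120 = 7260 */
   do k = 1 to 121;
      p[k] = (k - 1) / total;            /* linear in the value, so P(0) = 0 */
   end;
   do i = 1 to 5000;
      /* RAND("TABLE") returns an index 1..121; shift it to 0..120.      */
      /* MIN guards against index 122, which can be returned when the    */
      /* probabilities sum to slightly less than 1 because of rounding.  */
      increasing = min(rand("table", of p[*]), 121) - 1;
      output;
   end;
   keep i increasing;
run;
```

Changing the formula inside the first loop (for example, adding a constant to every weight before normalizing) changes the shape without touching the sampling loop.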


Posted in reply to ballardw

04-07-2015 02:26 PM

I'd prefer something that was more easily manipulated and changed (i.e. in an equation rather than a tabled form) if anyone has any other ideas.

However, this will work for now and I appreciate the quick help!


Posted in reply to mconover

04-07-2015 04:58 PM

The transformation you need here is the inverse CDF.


Posted in reply to gergely_batho

04-07-2015 10:58 PM

Thanks for the response. I looked at the link you recommended. I also reviewed Paper 236-25 (Pseudo-Random Numbers: Out of Uniform http://www2.sas.com/proceedings/sugi25/25/po/25p236.pdf), which discusses this method.

I'm just having trouble figuring out how to code it. Do I use the CDF function?

So if I generate a uniform distribution of numbers from 0 to 1, my probability density function (PDF) is 1. Does this mean my cumulative distribution function (CDF) is x?


Posted in reply to mconover

04-08-2015 02:36 AM

No: your PDF is a function with the following properties: f(0)=0, f(120)=A, it is linear between 0 and 120 and 0 otherwise (outside of [0,120] interval).

You get the CDF by integrating this function. A is a fixed parameter; you need to choose it so that the area under the PDF curve (the area of the triangle) is 1.
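Working through that recipe for the interval in the question gives the following. This is a sketch that treats the variable as continuous on [0, 120] and writes u for a Uniform(0,1) draw; the symbol u is introduced here for illustration.

```latex
\begin{aligned}
f(x) &= \frac{A\,x}{120}, \quad 0 \le x \le 120, \\
\int_0^{120} f(x)\,dx &= \tfrac{1}{2}\cdot 120 \cdot A = 60A = 1
  \;\Rightarrow\; A = \tfrac{1}{60}, \\
F(x) &= \int_0^{x} \frac{t}{7200}\,dt = \frac{x^2}{14400}
  = \left(\frac{x}{120}\right)^{2}, \\
F(x) = u &\;\Rightarrow\; x = 120\sqrt{u}.
\end{aligned}
```

So, under these assumptions, taking the square root of a uniform draw and scaling it by 120 yields a value whose density increases linearly from 0.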



Posted in reply to mconover

06-10-2015 05:13 PM

You want to use the inverse CDF method for generating random numbers. Sometimes you can use the QUANTILE function to help solve the inverse problem (see the example of the folded normal distribution), but other times you need to solve for the root of the cumulative distribution: F(x) = u, where u ~U(0,1). In SAS/IML, you can use the FROOT function to solve for numerical roots.
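For the linear density discussed in this thread, F(x) = u can be solved in closed form, so a plain DATA step is enough and no numerical root finder is needed. The sketch below assumes a density on [0, L] of the form (1 - s)/L + 2*s*x/L**2, where `s` is a hypothetical slope knob between 0 (flat, uniform) and 1 (density rises from zero). With t = x/L the CDF is s*t**2 + (1 - s)*t, and inverting it is a quadratic in t. The names `L`, `s`, `u`, and `t` are illustrative and not from the posts above.

```sas
data Random_numbers;
   call streaminit(123);                /* set random number seed */
   L = 120;                             /* upper end of the interval */
   s = 1;                               /* assumed slope knob: 0 = uniform, 1 = steepest */
   do i = 1 to 5000;
      u = rand("uniform");              /* u ~ U(0,1) */
      if s = 0 then t = u;              /* flat density: the CDF is t itself */
      else t = (-(1 - s) + sqrt((1 - s)**2 + 4*s*u)) / (2*s);  /* root of s*t**2 + (1-s)*t = u */
      increasing = floor((L + 1) * t);  /* discretize to integers 0..120 */
      if increasing > L then increasing = L;   /* defensive cap at the top value */
      output;
   end;
   drop L s u t;
run;
```

With s = 1 the formula reduces to t = sqrt(u), the same square-root transform used in the accepted solution. Values of s between 0 and 1 give a gentler slope.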

If your CDF is given explicitly by a formula or by empirical quantiles, you can use linear interpolation. See this blog post.

http://blogs.sas.com/content/iml/2014/06/18/distribution-from-quantiles.html