turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

10-15-2012 10:54 AM

*Dear All,*

*I would like to know whether it is possible using SAS/IML to **generate random numbers from zero-modified Negative Binomial.*for example :

The zero-modi ed version is de fine as follows

P(X = k) = p if k = 0 and K Gamma(r+k)/Gamma(r)k! ( 1+beta )^-r ( beta/(1+ beta) )^k , otherwise

where K is de ned as 1-p,

r; usual parameters and p the new parameter

*Thanks in advance,*Hermawan Budyanto

Accepted Solutions

Solution

11-12-2012
09:44 AM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to babaJB

11-12-2012 09:44 AM

You've got the right idea, but there are a few mistakes in your program. (1) You shouldn't be using U any more, (2) The R function c() is concatenation, not addition, so you want to use the IML vertical concatentation operator (//)

(3) You want Y0 = Y[ loc(Y>0) ]

There is also the matter that IML will issue an error if you use an empty subscript, so it is best to make sure that LOC returns a nonempty matrix.

Here's one approach. To make it easier to follow, I encapsulated the random variate generation into it's own function:

proc iml;

start rpois(N, T); /* helper function */

x = j(N,1);

call randgen ( x, 'POISSON', T);

return ( x );

finish;

start RandZeroTruncatedPoisson( N, /* number of observation */

lambda); /* parameter of Poisson Distribution */

r = N;

do while ( r > 0);

Y = rpois(r,lambda);

idx = loc(Y>0);

if ncol(idx)>0 then do;

Y0 = Y0 // Y[idx];

r = N-nrow(Y0);

end;

end;

return(y0);

finish;

/* test the function */

call randseed(1);

x = RandZeroTruncatedPoisson(10,1);

print x;

All Replies

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to babaJB

10-16-2012 09:36 AM

If I understand correctly, then I think you can get what you want using the standard RAND fuction. Something like:

c=rand( 'BERNOULLI' , 1-ip) * rand( 'NEGBINOMIAL' , p, k);

where ip is the zero inflation probability.

Ian.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to IanWakeling

11-07-2012 02:42 AM

In such other program ( I. E Stata) there is a package/procedure to run Zero modified or Zero Truncated. I just wondering how to generate a sample from negative binomial which reject 0 values in SAS. There must be a way to do using RAND function, I just don't know how.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to babaJB

11-07-2012 11:01 AM

Since you posted this to the IML forum, I assume you want an IML solution.

See this article for an overview of how to simulate efficiently in IML by using the RANDGEN routine: Efficient Sampling - The DO Loop

See this article for how to simulate from a mixture distribution: Generate a random sample from a mixture distribution - The DO Loop

(Notice that your distribution is a mixture of a Bernoulli and a Negative Binomial distrib.)

It sounds like you want to wrap Ian's idea into an IML module. If so, try this (untested):

proc iml;

start RandZeroModNegBin( N, /* number of observations to simulate */

ip, /* inflated probability of zero */

p, /* success probability for Neg Binom */

k ); /* number of trials for Neg Binom */

x = j(N,1); /* allocate vector */

call randgen(x, "Bernoulli", 1-ip);

b = x; /* vector of 0/1 */

call randgen(x, "NegBinomial", p, k);

return ( b#x ); /* 0 with prob=ip, NegBin(p,k) with prob=(1-ip) */

finish;

/* test the function */

x = RandZeroModNegBin(1000, 0.2, 0.5, 5);

/* to see how we did, plot the empirical distribution */

create Plot var{x}; append; close;

quit;

proc freq data=Plot;

tables x / plot=freqplot;

run;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Rick_SAS

11-07-2012 11:35 AM

Right now I am working on SAS base using SAS IML. Actually, I am using "Loss Model Data and Decision" written by Klugman et all as reference for the distribution. The author's isn't mention that the distribution is mixture between Bernoulli and Negative Binomial. I am just thinking using Inversion Method, but is quite difficult to counter the problem comes from inversion function of Gamma function.

What about to generate data from Zero truncated Negative Binomial or Extended Truncated Negative Binomial? Generating count data for which the value zero cannot occur and for which the conditional means are not equal to the conditional variances. The data exhibit over dispersion.

In my case, I want to generate the count data of the length of hospital stay. Absolutely,The Length of hospital stay is recorded as a minimum of at least one day. When I am trying to assign zero in ip " variable of inflated probabilty of zero" in the code, the 0 data is still existed because the simulation from negative binomial distribution still produce the 0's data.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to babaJB

11-07-2012 11:49 AM

As you say, using the inversion method would be hard because there isn't a simple inverse CDF.

The other distributions are similar. They are all built by combining elementary distributions, perhaps with some IF/THEN logic. For each distribution, write down the formula and then translate it into vecotr notation, using the previous links and examples. If you get stuck, post the IML statements that show your attempt (and a formula or link to the PDF or CDF) and someone can help you get unstuck.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Rick_SAS

11-08-2012 09:46 AM

I found the algorithm to create 1000 sample from a Poisson distribution with mean T, after rejecting all zeros ( link :R help archive: RE:

n<-1000; T<-3.5

Y<-rpois(n,T); Y0<-Y[Y>0]; r<-(n - length(Y0)) while(r>0){

Y<-rpois(r,T); Y0<-c(Y0,Y[Y>0]); r<-(n - length(Y0)) }

Then I'm trying to write in SAS IML Languange. The code is below :

ods graphics on;

proc iml;

start RandZeroTruncatedPoisson ( N, /* number of observation */

lambda); /* parameter of Poisson Distribution */

Y = j(N,1,.);

call randgen ( U, 'POISSON', lambda);

Y0= loc(Y>0);

r = nrow(U)-nrow(Y0);

Y=j(r,1);

do while ( r > 0);

call randgen ( Y,'POISSON',lambda);

Yx=loc(Y>0);

Y0=Y0+Yx;

r= N-nrow(Y0);

end;

return(y);

finish;

/* test the function */

x = RandZeroTruncatedPoisson(1000,1007.38);

create Plot var{x}; append; close;

quit;

proc freq data=Plot;

tables x / plot=freqplot;

run;

This is the log report for the result :

IML Ready

302 start RandZeroTruncatedPoisson ( N, /* number of observation */

303 lambda);

303! /* parameter of Poisson Distribution */

304 Y = j(N,1,.);

305 call randgen ( U, 'POISSON', lambda);

306 Y0= loc(Y>0);

307 r = nrow(U)-nrow(Y0);

308 Y=j(r,1);

309 do while ( r > 0);

310 call randgen ( Y,'POISSON',lambda);

311 Yx=loc(Y>0);

312 Y0=Y0+Yx;

313 r= N-nrow(Y0);

314 end;

315 return(y);

316 finish;

NOTE: Module RANDZEROTRUNCATEDPOISSON defined.

317 /* test the function */

318 x = RandZeroTruncatedPoisson(1000,1007.38);

ERROR: (execution) Matrix has not been set to a value.

operation : + at line 312 column 14

operands : Y0, Yx

Y0 0 row 0 col (type ?, size 0)

Yx 1 row 1 col (numeric)

1

statement : ASSIGN at line 312 column 9

traceback : module RANDZEROTRUNCATEDPOISSON at line 312 column 9

NOTE: Paused in module RANDZEROTRUNCATEDPOISSON.

319 /* to see how we did, plot the empirical distribution */

320 create Plot var{x};

I believe that Yx=loc(Y>0); Y0=j[Y0,Yx]; isn't the right way to express R code for Y0<-c(Y0,Y[Y>0]). That R code is assignment statement to set up vector Y0 consisting prior Y0 and the new Y which the values in Y is greater than zero. Does Anybody know how to fix it ?

Solution

11-12-2012
09:44 AM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to babaJB

11-12-2012 09:44 AM

You've got the right idea, but there are a few mistakes in your program. (1) You shouldn't be using U any more, (2) The R function c() is concatenation, not addition, so you want to use the IML vertical concatentation operator (//)

(3) You want Y0 = Y[ loc(Y>0) ]

There is also the matter that IML will issue an error if you use an empty subscript, so it is best to make sure that LOC returns a nonempty matrix.

Here's one approach. To make it easier to follow, I encapsulated the random variate generation into it's own function:

proc iml;

start rpois(N, T); /* helper function */

x = j(N,1);

call randgen ( x, 'POISSON', T);

return ( x );

finish;

start RandZeroTruncatedPoisson( N, /* number of observation */

lambda); /* parameter of Poisson Distribution */

r = N;

do while ( r > 0);

Y = rpois(r,lambda);

idx = loc(Y>0);

if ncol(idx)>0 then do;

Y0 = Y0 // Y[idx];

r = N-nrow(Y0);

end;

end;

return(y0);

finish;

/* test the function */

call randseed(1);

x = RandZeroTruncatedPoisson(10,1);

print x;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Rick_SAS

11-25-2012 12:09 PM

Many Thanks ianWickellin especially Rick Wicklin. Simulation of the above can also be applied in the negative binomial distribution. I tried to generate 10 samples of data by the number of 1000 observations of each sample, statistics descriptive of the results of 10 samples are close to the descriptive statistics of the sample exists. Even there are not literature about this algorithm, I believe this is the one way to generate data from zero truncated distribution.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Rick_SAS

11-25-2012 08:43 PM

For more discussion of this topic and related issues, see

Efficient acceptance-rejection simulation - The DO Loop

and

Efficient acceptance-rejection simulation: Part II - The DO Loop