Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

🔒 This topic is **solved** and **locked**.
Need further help from the community? Please
sign in and ask a **new** question.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Posted 10-15-2012 10:54 AM
(2668 views)

*Dear All,*

*I would like to know whether it is possible using SAS/IML to **generate random numbers from zero-modified Negative Binomial.*for example :

The zero-modi ed version is de fine as follows

P(X = k) = p if k = 0 and K Gamma(r+k)/Gamma(r)k! ( 1+beta )^-r ( beta/(1+ beta) )^k , otherwise

where K is de ned as 1-p,

r; usual parameters and p the new parameter

*Thanks in advance,*Hermawan Budyanto

1 ACCEPTED SOLUTION

Accepted Solutions

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

You've got the right idea, but there are a few mistakes in your program. (1) You shouldn't be using U any more, (2) The R function c() is concatenation, not addition, so you want to use the IML vertical concatentation operator (//)

(3) You want Y0 = Y[ loc(Y>0) ]

There is also the matter that IML will issue an error if you use an empty subscript, so it is best to make sure that LOC returns a nonempty matrix.

Here's one approach. To make it easier to follow, I encapsulated the random variate generation into it's own function:

proc iml;

start rpois(N, T); /* helper function */

x = j(N,1);

call randgen ( x, 'POISSON', T);

return ( x );

finish;

start RandZeroTruncatedPoisson( N, /* number of observation */

lambda); /* parameter of Poisson Distribution */

r = N;

do while ( r > 0);

Y = rpois(r,lambda);

idx = loc(Y>0);

if ncol(idx)>0 then do;

Y0 = Y0 // Y[idx];

r = N-nrow(Y0);

end;

end;

return(y0);

finish;

/* test the function */

call randseed(1);

x = RandZeroTruncatedPoisson(10,1);

print x;

9 REPLIES 9

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

If I understand correctly, then I think you can get what you want using the standard RAND fuction. Something like:

c=rand( 'BERNOULLI' , 1-ip) * rand( 'NEGBINOMIAL' , p, k);

where ip is the zero inflation probability.

Ian.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Since you posted this to the IML forum, I assume you want an IML solution.

See this article for an overview of how to simulate efficiently in IML by using the RANDGEN routine: Efficient Sampling - The DO Loop

See this article for how to simulate from a mixture distribution: Generate a random sample from a mixture distribution - The DO Loop

(Notice that your distribution is a mixture of a Bernoulli and a Negative Binomial distrib.)

It sounds like you want to wrap Ian's idea into an IML module. If so, try this (untested):

proc iml;

start RandZeroModNegBin( N, /* number of observations to simulate */

ip, /* inflated probability of zero */

p, /* success probability for Neg Binom */

k ); /* number of trials for Neg Binom */

x = j(N,1); /* allocate vector */

call randgen(x, "Bernoulli", 1-ip);

b = x; /* vector of 0/1 */

call randgen(x, "NegBinomial", p, k);

return ( b#x ); /* 0 with prob=ip, NegBin(p,k) with prob=(1-ip) */

finish;

/* test the function */

x = RandZeroModNegBin(1000, 0.2, 0.5, 5);

/* to see how we did, plot the empirical distribution */

create Plot var{x}; append; close;

quit;

proc freq data=Plot;

tables x / plot=freqplot;

run;

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Right now I am working on SAS base using SAS IML. Actually, I am using "Loss Model Data and Decision" written by Klugman et all as reference for the distribution. The author's isn't mention that the distribution is mixture between Bernoulli and Negative Binomial. I am just thinking using Inversion Method, but is quite difficult to counter the problem comes from inversion function of Gamma function.

What about to generate data from Zero truncated Negative Binomial or Extended Truncated Negative Binomial? Generating count data for which the value zero cannot occur and for which the conditional means are not equal to the conditional variances. The data exhibit over dispersion.

In my case, I want to generate the count data of the length of hospital stay. Absolutely,The Length of hospital stay is recorded as a minimum of at least one day. When I am trying to assign zero in ip " variable of inflated probabilty of zero" in the code, the 0 data is still existed because the simulation from negative binomial distribution still produce the 0's data.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

As you say, using the inversion method would be hard because there isn't a simple inverse CDF.

The other distributions are similar. They are all built by combining elementary distributions, perhaps with some IF/THEN logic. For each distribution, write down the formula and then translate it into vecotr notation, using the previous links and examples. If you get stuck, post the IML statements that show your attempt (and a formula or link to the PDF or CDF) and someone can help you get unstuck.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

I found the algorithm to create 1000 sample from a Poisson distribution with mean T, after rejecting all zeros ( link :R help archive: RE:

n<-1000; T<-3.5

Y<-rpois(n,T); Y0<-Y[Y>0]; r<-(n - length(Y0)) while(r>0){

Y<-rpois(r,T); Y0<-c(Y0,Y[Y>0]); r<-(n - length(Y0)) }

Then I'm trying to write in SAS IML Languange. The code is below :

ods graphics on;

proc iml;

start RandZeroTruncatedPoisson ( N, /* number of observation */

lambda); /* parameter of Poisson Distribution */

Y = j(N,1,.);

call randgen ( U, 'POISSON', lambda);

Y0= loc(Y>0);

r = nrow(U)-nrow(Y0);

Y=j(r,1);

do while ( r > 0);

call randgen ( Y,'POISSON',lambda);

Yx=loc(Y>0);

Y0=Y0+Yx;

r= N-nrow(Y0);

end;

return(y);

finish;

/* test the function */

x = RandZeroTruncatedPoisson(1000,1007.38);

create Plot var{x}; append; close;

quit;

proc freq data=Plot;

tables x / plot=freqplot;

run;

This is the log report for the result :

IML Ready

302 start RandZeroTruncatedPoisson ( N, /* number of observation */

303 lambda);

303! /* parameter of Poisson Distribution */

304 Y = j(N,1,.);

305 call randgen ( U, 'POISSON', lambda);

306 Y0= loc(Y>0);

307 r = nrow(U)-nrow(Y0);

308 Y=j(r,1);

309 do while ( r > 0);

310 call randgen ( Y,'POISSON',lambda);

311 Yx=loc(Y>0);

312 Y0=Y0+Yx;

313 r= N-nrow(Y0);

314 end;

315 return(y);

316 finish;

NOTE: Module RANDZEROTRUNCATEDPOISSON defined.

317 /* test the function */

318 x = RandZeroTruncatedPoisson(1000,1007.38);

ERROR: (execution) Matrix has not been set to a value.

operation : + at line 312 column 14

operands : Y0, Yx

Y0 0 row 0 col (type ?, size 0)

Yx 1 row 1 col (numeric)

1

statement : ASSIGN at line 312 column 9

traceback : module RANDZEROTRUNCATEDPOISSON at line 312 column 9

NOTE: Paused in module RANDZEROTRUNCATEDPOISSON.

319 /* to see how we did, plot the empirical distribution */

320 create Plot var{x};

I believe that Yx=loc(Y>0); Y0=j[Y0,Yx]; isn't the right way to express R code for Y0<-c(Y0,Y[Y>0]). That R code is assignment statement to set up vector Y0 consisting prior Y0 and the new Y which the values in Y is greater than zero. Does Anybody know how to fix it ?

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

You've got the right idea, but there are a few mistakes in your program. (1) You shouldn't be using U any more, (2) The R function c() is concatenation, not addition, so you want to use the IML vertical concatentation operator (//)

(3) You want Y0 = Y[ loc(Y>0) ]

There is also the matter that IML will issue an error if you use an empty subscript, so it is best to make sure that LOC returns a nonempty matrix.

Here's one approach. To make it easier to follow, I encapsulated the random variate generation into it's own function:

proc iml;

start rpois(N, T); /* helper function */

x = j(N,1);

call randgen ( x, 'POISSON', T);

return ( x );

finish;

start RandZeroTruncatedPoisson( N, /* number of observation */

lambda); /* parameter of Poisson Distribution */

r = N;

do while ( r > 0);

Y = rpois(r,lambda);

idx = loc(Y>0);

if ncol(idx)>0 then do;

Y0 = Y0 // Y[idx];

r = N-nrow(Y0);

end;

end;

return(y0);

finish;

/* test the function */

call randseed(1);

x = RandZeroTruncatedPoisson(10,1);

print x;

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

For more discussion of this topic and related issues, see

Efficient acceptance-rejection simulation - The DO Loop

and

Efficient acceptance-rejection simulation: Part II - The DO Loop

Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.

**If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website. **

Multiple Linear Regression in SAS

Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.

Find more tutorials on the SAS Users YouTube channel.