DomLimo
Fluorite | Level 6

Hi SAS Community,

I am using PROC FMM (running under SAS V9.4_M5 and SAS/STAT V 14.3) to explore factors affecting counts of hospital admissions (and other similar events) for children in a large birth cohort. As the vast majority of children in the study population do not attend hospital or experience the other events, I have lots of zero counts in the data.

I have found very useful tips in the SAS documentation for running zero-inflated Poisson and negative binomial regression models and for assessing how well those models estimate the count profiles within the data (e.g. usage note 43522 and this SAS Global Forum 2008 paper).

 

However, to extend the analysis to hurdle models (e.g. by using the %NLEstimate macro or generating plots as demonstrated in the 2008 paper), I need to pass parameters to the SAS PDF function for the truncated Poisson and truncated negative binomial distributions. I have run hurdle models in PROC FMM using these distributions, but I haven't been able to assess their fit or generate estimates for different covariate profiles as I can for other distributions.

 

To date, I have not been able to locate the correct syntax or other documentation to generate pdf estimates for truncpoisson or truncnegbin distributions. Can anyone on this forum point me in the right direction to find this documentation?

Many thanks in advance

Reeza
Super User
I'm not aware of any such functions existing, but I've moved your question to the statistical forum and hopefully someone else can help you out.
FreelanceReinh
Jade | Level 19

Hi @DomLimo (and welcome to the SAS Support Communities! :-))

 

So, you are missing the truncated Poisson and truncated negative binomial distributions in the list of probability distributions available in the PDF function? In this case you can create your own functions using PROC FCMP.

 

Example:

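proc fcmp outlib=work.funcs.prob;
/* Zero-truncated Poisson: P(N=n | N>0) with mean parameter m */
function pdf_trpoi(n,m);
  if n>0 then return(pdf('poisson',n,m)/(1-exp(-m)));
  return(0);
endsub;

/* Zero-truncated negative binomial: P(M=m | M>0); p**n is P(M=0) */
function pdf_trnegb(m,p,n);
  if m>0 then return(pdf('negb',m,p,n)/(1-p**n));
  return(0);
endsub;
run;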

options cmplib=work.funcs;

After submitting the above code, functions pdf_trpoi and pdf_trnegb are available which compute probabilities for the zero-truncated Poisson and zero-truncated negative binomial distribution, respectively. The arguments of these functions are the same as those of the PDF function for the corresponding non-truncated distributions.

 

Example:

%let lambda=2.5;

data trpoi;
do x=0 to 10;
  p=pdf('poisson',x,&lambda);
  tp=pdf_trpoi(x,&lambda);
  output;
end;
label p="Poisson(&lambda)"
      tp="zero-truncated Poisson(&lambda)";
run;

proc sgplot data=trpoi;
xaxis values=(0 to 10);
yaxis label='Probability';
scatter x=x y=p;
scatter x=x y=tp;
run;
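As a quick sanity check, the zero-truncated probabilities should sum to 1. The step below is only a sketch: it assumes pdf_trpoi has been compiled by the PROC FCMP step above and that OPTIONS CMPLIB=work.funcs is in effect. The upper limit of 100 is an arbitrary cutoff chosen because the Poisson tail beyond it is negligible for lambda=2.5.

```sas
/* Sketch: verify that the zero-truncated Poisson pmf sums to (about) 1. */
/* Assumes pdf_trpoi exists in work.funcs and CMPLIB points to it.       */
data _null_;
  lambda = 2.5;
  total = 0;
  do x = 0 to 100;   /* arbitrary cutoff; tail mass beyond 100 is negligible here */
    total = total + pdf_trpoi(x, lambda);
  end;
  put total=;        /* should equal 1 up to floating-point rounding */
run;
```

The same loop with pdf_trnegb in place of pdf_trpoi gives the corresponding check for the zero-truncated negative binomial.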

Given that you're working with fairly specialized models (which I am not familiar with), whereas the above calculations are comparatively elementary, I'm not sure whether this answers your question.

Rick_SAS
SAS Super FREQ

FreelanceReinhard's response is a good solution, provided that the OP is interested only in left truncation at n=0. Otherwise, the normalization constant needs to be adjusted to reflect the range of the truncation.

For an example and discussion of the truncated normal distribution, see "Implement the truncated normal distribution in SAS," which uses PROC IML instead of PROC FCMP.

If you wonder where those denominators came from in FreelanceReinhard's solution, they are the total probability that remains after the truncation (here, the probability of a nonzero count), which is used to renormalize the distribution. I would have written them explicitly as

 

wt_TP = 1-pdf('poisson',0,m);  /* P(count > 0) for the Poisson */

and
wt_TNB = 1-pdf('negb',0,p,n); /* P(count > 0) for the negative binomial */
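
To illustrate how the normalization constant changes for other truncation points, here is a minimal sketch for a Poisson distribution that is left-truncated at an arbitrary integer k (that is, conditioned on n>k). The function name pdf_ltpoi and its third argument k are hypothetical and are not part of the solution above. For k=0 the denominator reduces to 1-exp(-m), which matches the zero-truncated case.

```sas
proc fcmp outlib=work.funcs.prob;
/* Hypothetical helper: Poisson(m) conditioned on n > k.            */
/* The weight 1 - CDF(k) is the probability retained by truncation. */
function pdf_ltpoi(n, m, k);
  if n > k then return(pdf('poisson', n, m) / (1 - cdf('poisson', k, m)));
  return(0);
endsub;
run;

options cmplib=work.funcs;
```

A right-truncated or doubly truncated version would follow the same pattern, with the denominator replaced by the probability of the retained range.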

DomLimo
Fluorite | Level 6

Thanks all for this prompt assistance. Thanks @Rick_SAS for the clarification of the denominators in @FreelanceReinh's solution. As I'm only interested in truncation at n=0 for the time being, this solution meets my immediate requirements.

This has been a very positive first foray into the SAS Community forum.

Rick_SAS
SAS Super FREQ

We are glad that you got a solution and found the forum helpful and friendly.

 

Sometimes the "experts" disagree over the choice of the "accepted" solution. I suggest you select FreelanceReinhard's answer, not mine, as the solution to your question. He provided a correct set of formulas and a working program.

DomLimo
Fluorite | Level 6

Hi @FreelanceReinh 

Thank you for this response - it's very helpful indeed. (and thanks for the welcome!)

 

I have been using SAS (as a "user", not programmer or developer) for decades (since SAS 6.2 or just before then) and have always been confident that it's possible to do whatever I need to do with data in SAS. However I have never used PROC FCMP and you've shown me how I can create my own functions and store them for later use...that's fantastic!

 

It will take me a couple of days to test the proposed solution, but it looks very promising.

 

Thanks very much indeed! (and thanks @Reeza for leading me to the right community forum for my question).

 

 

 
