Solved: Re: Zero-truncated Negative Binomial Regression

amyip · Posted 08-14-2024 03:38 PM

I'm trying ZTNB for the first time (SAS EG 8.2) to model cumulative days' supply of prescription meds based on individual-level characteristics (e.g. binary long-term care, sex, age group), where all subjects have at least one day's supply, ie, no zeroes. I found the following code from:

https://stats.oarc.ucla.edu/sas/dae/zero-truncated-negative-binomial/

https://jbhender.github.io/Stats506/F18/GP/Group6.html

proc nlmixed data=ZTNB;

log_mu = intercept + b_long_term_care*long_term_care + b_female*female + b_agegr1*agegr1;
mu = exp(log_mu);
het = 1/alpha;
ll = lgamma(CumDays+het) - lgamma(CumDays + 1) - lgamma(het) - het*log(1+alpha*mu)
+ CumDays*log(alpha*mu) - CumDays*log(1+alpha*mu) - log(1 - (1 + alpha * mu)**-het);
model CumDays ~ general(ll);
run;

Running this code, I get the following in the log:

NOTE: To assign starting values to parameters, use the PARMS statement. The default starting value of 1.0 is in effect for all parameters.

ERROR: QUANEW Optimization cannot be completed.

NOTE: Optimization routine cannot improve the function value.

NOTE:The SAS System stopped processing this step because of errors.

Any guidance on how to run this procedure will be appreciated.

StatDave · Posted 08-16-2024 05:22 PM

Just plug your model into the code shown in the Note I referred to. I assume that your data is individual level data (not aggregated data) and that each of the predictors in your model are numeric. In NLMIXED, you can start without the PARMS statement to set initial parameter values, but if you run into fitting problems then you might need to include the PARMS statement to set initial values for one or more of your model parameters.

      proc nlmixed data=ZTNB;
         y=CumDays;
         mean = exp(intercept + b_long_term_care*long_term_care + b_female*female + b_agegr1*agegr1);
         ll = lgamma(y+1/k) - lgamma(1/k) - lgamma(y+1) + y*log(k*mean) -
              (y+(1/k))*log(1+k*mean) - log(1-(1+k*mean)**(-1/k));
         model y ~ general(ll);
         run;

Or just use PROC FMM:

      proc fmm data=ZTNB;
         model y = long_term_care female agegr1 / dist=truncnegbin;
         run;

View solution in original post

StatDave · Posted 08-14-2024 06:58 PM

See this note that shows how to fit the zero truncated Poisson and negative binomial models using PROC FMM and PROC NLMIXED. As shown there, it's easiest to do with FMM.

amyip · Posted 08-16-2024 12:58 PM

Thanks for taking the time to reply. Are you able to troubleshoot how to run my SAS program? I already set up the regression using proc nlmixed, using sample code from the two references I listed.

StatDave · Posted 08-16-2024 05:22 PM

Just plug your model into the code shown in the Note I referred to. I assume that your data is individual level data (not aggregated data) and that each of the predictors in your model are numeric. In NLMIXED, you can start without the PARMS statement to set initial parameter values, but if you run into fitting problems then you might need to include the PARMS statement to set initial values for one or more of your model parameters.

      proc nlmixed data=ZTNB;
         y=CumDays;
         mean = exp(intercept + b_long_term_care*long_term_care + b_female*female + b_agegr1*agegr1);
         ll = lgamma(y+1/k) - lgamma(1/k) - lgamma(y+1) + y*log(k*mean) -
              (y+(1/k))*log(1+k*mean) - log(1-(1+k*mean)**(-1/k));
         model y ~ general(ll);
         run;

Or just use PROC FMM:

      proc fmm data=ZTNB;
         model y = long_term_care female agegr1 / dist=truncnegbin;
         run;

amyip · Posted 08-19-2024 07:42 PM

Thank you -- it worked! Can the estimates be transformed to facilitate interpretation (our target audience would be clinicians, not statisticians), e.g. exponentiation or taking the natural logarithm?

StatDave · Posted 08-20-2024 11:11 AM

Just add a PREDICT statement to get predicted values in the OUT= data set for each of the observations in the input data set.

predict mean out=preds;

If you have one or more particular settings of the predictor(s) that you want a predicted value for, you can use one (or more) ESTIMATE statements. For example, to get the predicted mean for MENT=30 in the ZTNB example using the Long data in the second half of the note I referred to earlier:

estimate 'pred for ment=30' exp(b0+b1*30);

amyip · Posted 08-22-2024 10:35 AM

My question may not have been clear. Running the model yields an estimate for the covariate of interest, long-term care, of 0.2757 (95% CI 0.2254, 0.3261), p<0.0001, which is difficult to interpret. Would it be appropriate to report the corresponding e^β of 1.32 (95% CI 1.26, 1.39) as a pseudo relative risk, such that those in long-term care were prescribed on average 1.32X the total days' supply of medication as their community-dwelling counterparts?

amyip · Posted 08-29-2024 11:06 AM

In follow-up to proc nlmixed vs. proc fmm, I was able to run my model with both procedures, but they yielded nonidentical (albeit similar) results. Is one more accurate than the other? From this note:

Although compared to proc nlmixed, proc fmm provides a much easier way to specify a zero-truncated poisson regression, it has much more limited postestimation options. For example, estimate and predict statements are not available with proc fmm. Additionally, proc fmm like most other SAS procs, uses the last group within a categorical variable as the reference group.

I included dummy variables for my multilevel categorical predictors, omitting the referent, so I'm not sure the difference in estimates btw nlmixed and fmm can be explained by fmm using the last group as the referent.

Registration is open